linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Nandita Dukkipati	a262f0cdf1	Proportional Rate Reduction for TCP. This patch implements Proportional Rate Reduction (PRR) for TCP. PRR is an algorithm that determines TCP's sending rate in fast recovery. PRR avoids excessive window reductions and aims for the actual congestion window size at the end of recovery to be as close as possible to the window determined by the congestion control algorithm. PRR also improves accuracy of the amount of data sent during loss recovery. The patch implements the recommended flavor of PRR called PRR-SSRB (Proportional rate reduction with slow start reduction bound) and replaces the existing rate halving algorithm. PRR improves upon the existing Linux fast recovery under a number of conditions including: 1) burst losses where the losses implicitly reduce the amount of outstanding data (pipe) below the ssthresh value selected by the congestion control algorithm and, 2) losses near the end of short flows where application runs out of data to send. As an example, with the existing rate halving implementation a single loss event can cause a connection carrying short Web transactions to go into the slow start mode after the recovery. This is because during recovery Linux pulls the congestion window down to packets_in_flight+1 on every ACK. A short Web response often runs out of new data to send and its pipe reduces to zero by the end of recovery when all its packets are drained from the network. Subsequent HTTP responses using the same connection will have to slow start to raise cwnd to ssthresh. PRR on the other hand aims for the cwnd to be as close as possible to ssthresh by the end of recovery. A description of PRR and a discussion of its performance can be found at the following links: - IETF Draft: http://tools.ietf.org/html/draft-mathis-tcpm-proportional-rate-reduction-01 - IETF Slides: http://www.ietf.org/proceedings/80/slides/tcpm-6.pdf http://tools.ietf.org/agenda/81/slides/tcpm-2.pdf - Paper to appear in Internet Measurements Conference (IMC) 2011: Improving TCP Loss Recovery Nandita Dukkipati, Matt Mathis, Yuchung Cheng Signed-off-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-24 19:40:40 -07:00
Ian Campbell	aff65da0f1	net: ipv4: convert to SKB frag APIs Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-24 17:52:11 -07:00
David S. Miller	823dcd2506	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net	2011-08-20 10:39:12 -07:00
Jiri Pirko	b81693d914	net: remove ndo_set_multicast_list callback Remove no longer used operation. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-17 20:22:03 -07:00
Tom Herbert	bdeab99191	rps: Add flag to skb to indicate rxhash is based on L4 tuple The l4_rxhash flag was added to the skb structure to indicate that the rxhash value was computed over the 4 tuple for the packet which includes the port information in the encapsulated transport packet. This is used by the stack to preserve the rxhash value in __skb_rx_tunnel. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-17 20:06:03 -07:00
Eric Dumazet	33d480ce6d	net: cleanup some rcu_dereference_raw RCU api had been completed and rcu_access_pointer() or rcu_dereference_protected() are better than generic rcu_dereference_raw() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-12 02:55:28 -07:00
Julian Anastasov	97a8041020	ipv4: some rt_iif -> rt_route_iif conversions As rt_iif represents input device even for packets coming from loopback with output route, it is not an unique key specific to input routes. Now rt_route_iif has such role, it was fl.iif in 2.6.38, so better to change the checks at some places to save CPU cycles and to restore 2.6.38 semantics. compare_keys: - input routes: only rt_route_iif matters, rt_iif is same - output routes: only rt_oif matters, rt_iif is not used for matching in __ip_route_output_key - now we are back to 2.6.38 state ip_route_input_common: - matching rt_route_iif implies input route - compared to 2.6.38 we eliminated one rth->fl.oif check because it was not needed even for 2.6.38 compare_hash_inputs: Only the change here is not an optimization, it has effect only for output routes. I assume I'm restoring the original intention to ignore oif, it was using fl.iif - now we are back to 2.6.38 state Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-11 05:58:59 -07:00
Mike Waychison	f0e3d0689d	tcp: initialize variable ecn_ok in syncookies path Using a gcc 4.4.3, warnings are emitted for a possibly uninitialized use of ecn_ok. This can happen if cookie_check_timestamp() returns due to not having seen a timestamp. Defaulting to ecn off seems like a reasonable thing to do in this case, so initialized ecn_ok to false. Signed-off-by: Mike Waychison <mikew@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-10 21:59:57 -07:00
David S. Miller	19fd61785a	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net	2011-08-07 23:20:26 -07:00
Julian Anastasov	d52fbfc9e5	ipv4: use dst with ref during bcast/mcast loopback Make sure skb dst has reference when moving to another context. Currently, I don't see protocols that can hit it when sending broadcasts/multicasts to loopback using noref dsts, so it is just a precaution. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-07 22:52:32 -07:00
Julian Anastasov	47670b767b	ipv4: route non-local sources for raw socket The raw sockets can provide source address for routing but their privileges are not considered. We can provide non-local source address, make sure the FLOWI_FLAG_ANYSRC flag is set if socket has privileges for this, i.e. based on hdrincl (IP_HDRINCL) and transparent flags. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-07 22:52:32 -07:00
Julian Anastasov	797fd3913a	netfilter: TCP and raw fix for ip_route_me_harder TCP in some cases uses different global (raw) socket to send RST and ACK. The transparent flag is not set there. Currently, it is a problem for rerouting after the previous change. Fix it by simplifying the checks in ip_route_me_harder and use FLOWI_FLAG_ANYSRC even for sockets. It looks safe because the initial routing allowed this source address to be used and now we just have to make sure the packet is rerouted. As a side effect this also allows rerouting for normal raw sockets that use spoofed source addresses which was not possible even before we eliminated the ip_route_input call. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-07 22:52:32 -07:00
Daniel Baluta	dd23198e58	ipv4: Fix ip_getsockopt for IP_PKTOPTIONS IP_PKTOPTIONS is broken for 32-bit applications running in COMPAT mode on 64-bit kernels. This happens because msghdr's msg_flags field is always set to zero. When running in COMPAT mode this should be set to MSG_CMSG_COMPAT instead. Signed-off-by: Tiberiu Szocs-Mihai <tszocs@ixiacom.com> Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-07 22:31:07 -07:00
Julian Anastasov	d547f727df	ipv4: fix the reusing of routing cache entries compare_keys and ip_route_input_common rely on rt_oif for distinguishing of input and output routes with same keys values. But sometimes the input route has also same hash chain (keyed by iif != 0) with the output routes (keyed by orig_oif=0). Problem visible if running with small number of rhash_entries. Fix them to use rt_route_iif instead. By this way input route can not be returned to users that request output route. The patch fixes the ip_rt_bug errors that were reported in ip_local_out context, mostly for 255.255.255.255 destinations. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-07 22:20:20 -07:00
David S. Miller	6e5714eaf7	net: Compute protocol sequence numbers and fragment IDs using MD5. Computers have become a lot faster since we compromised on the partial MD4 hash which we use currently for performance reasons. MD5 is a much safer choice, and is inline with both RFC1948 and other ISS generators (OpenBSD, Solaris, etc.) Furthermore, only having 24-bits of the sequence number be truly unpredictable is a very serious limitation. So the periodic regeneration and 8-bit counter have been removed. We compute and use a full 32-bit sequence number. For ipv6, DCCP was found to use a 32-bit truncated initial sequence number (it needs 43-bits) and that is fixed here as well. Reported-by: Dan Kaminsky <dan@doxpara.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-06 18:33:19 -07:00
Eric Dumazet	f2c31e32b3	net: fix NULL dereferences in check_peer_redir() Gergely Kalman reported crashes in check_peer_redir(). It appears commit `f39925dbde` (ipv4: Cache learned redirect information in inetpeer.) added a race, leading to possible NULL ptr dereference. Since we can now change dst neighbour, we should make sure a reader can safely use a neighbour. Add RCU protection to dst neighbour, and make sure check_peer_redir() can be called safely by different cpus in parallel. As neighbours are already freed after one RCU grace period, this patch should not add typical RCU penalty (cache cold effects) Many thanks to Gergely for providing a pretty report pointing to the bug. Reported-by: Gergely Kalman <synapse@hippy.csoma.elte.hu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-03 03:34:12 -07:00
Stephen Hemminger	a9b3cd7f32	rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER When assigning a NULL value to an RCU protected pointer, no barrier is needed. The rcu_assign_pointer, used to handle that but will soon change to not handle the special case. Convert all rcu_assign_pointer of NULL value. //smpl @@ expression P; @@ - rcu_assign_pointer(P, NULL) + RCU_INIT_POINTER(P, NULL) // </smpl> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-02 04:29:23 -07:00
Julia Lawall	a1889c0d20	net: adjust array index Convert array index from the loop bound to the loop index. A simplified version of the semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression e1,e2,ar; @@ for(e1 = 0; e1 < e2; e1++) { <... ar[ - e2 + e1 ] ...> } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-01 02:27:21 -07:00
Linus Torvalds	d5eab9152a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (32 commits) tg3: Remove 5719 jumbo frames and TSO blocks tg3: Break larger frags into 4k chunks for 5719 tg3: Add tx BD budgeting code tg3: Consolidate code that calls tg3_tx_set_bd() tg3: Add partial fragment unmapping code tg3: Generalize tg3_skb_error_unmap() tg3: Remove short DMA check for 1st fragment tg3: Simplify tx bd assignments tg3: Reintroduce tg3_tx_ring_info ASIX: Use only 11 bits of header for data size ASIX: Simplify condition in rx_fixup() Fix cdc-phonet build bonding: reduce noise during init bonding: fix string comparison errors net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared net: add IFF_SKB_TX_SHARED flag to priv_flags net: sock_sendmsg_nosec() is static forcedeth: fix vlans gianfar: fix bug caused by `87c288c6e9` gro: Only reset frag0 when skb can be pulled ...	2011-07-28 05:58:19 -07:00
Arun Sharma	60063497a9	atomic: use <linux/atomic.h> This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:47 -07:00
Zoltan Kiss	b76d0789c9	IPv4: Send gratuitous ARP for secondary IP addresses also If a device event generates gratuitous ARP messages, only primary address is used for sending. This patch iterates through the whole list. Tested with 2 IP addresses configuration on bonding interface. Signed-off-by: Zoltan Kiss <schaman@sch.bme.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-25 16:16:00 -07:00
xeb@mail.ru	559fafb94a	gre: fix improper error handling Fix improper protocol err_handler, current implementation is fully unapplicable and may cause kernel crash due to double kfree_skb. Signed-off-by: Dmitry Kozlov <xeb@mail.ru> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-23 20:06:00 -07:00
Julian Anastasov	b0fe4a3184	ipv4: use RT_TOS after some rt_tos conversions rt_tos was changed to iph->tos but it must be filtered by RT_TOS Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-23 20:05:31 -07:00
David S. Miller	415b3334a2	icmp: Fix regression in nexthop resolution during replies. icmp_route_lookup() uses the wrong flow parameters if the reverse session route lookup isn't used. So do not commit to the re-decoded flow until we actually make a final decision to use a real route saved in 'rt2'. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-22 06:22:10 -07:00
Bill Sommerfeld	d9be4f7a6f	ipv4: Constrain UFO fragment sizes to multiples of 8 bytes Because the ip fragment offset field counts 8-byte chunks, ip fragments other than the last must contain a multiple of 8 bytes of payload. ip_ufo_append_data wasn't respecting this constraint and, depending on the MTU and ip option sizes, could create malformed non-final fragments. Google-Bug-Id: 5009328 Signed-off-by: Bill Sommerfeld <wsommerfeld@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-21 21:31:41 -07:00

1 2 3 4 5 ...

4364 Commits