Commit Graph

227551 Commits

Author SHA1 Message Date
Florian Westphal 2f46e07995 netfilter: ebtables: make broute table work again
broute table init hook sets up the "br_should_route_hook" pointer,
which then gets called from br_input.

commit a386f99025
(bridge: add proper RCU annotation to should_route_hook)
introduced a typedef, and then changed this to:

br_should_route_hook_t *rhook;
[..]
rhook = rcu_dereference(br_should_route_hook);
if (*rhook(skb))

problem is that "br_should_route_hook" contains the address of the function,
so calling *rhook() results in kernel panic.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-11 23:55:51 +01:00
Stephen Hemminger 13ee6ac579 netfilter: fix race in conntrack between dump_table and destroy
The netlink interface to dump the connection tracking table has a race
when entries are deleted at the same time. A customer reported a crash
and the backtrace showed thatctnetlink_dump_table was running while a
conntrack entry was being destroyed.
(see https://bugzilla.vyatta.com/show_bug.cgi?id=6402).

According to RCU documentation, when using hlist_nulls the reader
must handle the case of seeing a deleted entry and not proceed
further down the linked list.  The old code would continue
which caused the scan to walk into the free list.

This patch uses locking (rather than RCU) for this operation which
is guaranteed safe, and no longer requires getting reference while
doing dump operation.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-11 23:54:42 +01:00
Eric Dumazet 83723d6071 netfilter: x_tables: dont block BH while reading counters
Using "iptables -L" with a lot of rules have a too big BH latency.
Jesper mentioned ~6 ms and worried of frame drops.

Switch to a per_cpu seqlock scheme, so that taking a snapshot of
counters doesnt need to block BH (for this cpu, but also other cpus).

This adds two increments on seqlock sequence per ipt_do_table() call,
its a reasonable cost for allowing "iptables -L" not block BH
processing.

Reported-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-10 20:11:38 +01:00
Alexander Duyck 45b9f509b7 ixgbe: update ntuple filter configuration
This change fixes several issues found in ntuple filtering while I was
doing the ATR refactor.

Specifically I updated the masks to work correctly with the latest version
of ethtool, I cleaned up the exception handling and added detailed error
output when a filter is rejected, and corrected several bits that were set
incorrectly in ixgbe_type.h.

The previous version of this patch included a printk that was left over from
me fixing the filter setup.  This patch does not include that printk.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:12 -08:00
Alexander Duyck 69830529b2 ixgbe: further flow director performance optimizations
This change adds a compressed input type for atr signature hash
computation.  It also drops the use of the set functions when setting up
the ATR input since we can then directly setup the hash input as two dwords
that can be stored and passed as registers.

With these changes the cost of computing the has is low enough that we can
perform a hash computation on each TCP SYN flagged packet allowing us to
drop the number of flow director misses considerably in tests such as
netperf TCP_CRR.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:11 -08:00
Alexander Duyck 905e4a4163 ixgbe: cleanup flow director hash computation to improve performance
This change cleans up the layout of the flow director data, and the
algorithm used to calculate the hash resulting in a 35x / 3500% performance
increase versus the old flow director hash computation.  The overall effect
is only a 1% increase in transactions per second though due to the fact
that only 1 packet in 20 are actually hashed upon.

TCP_RR before:
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       60.00    23059.27
16384  87380

TCP_RR after:
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       60.00    23239.98
16384  87380

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:11 -08:00
Yi Zou 2d39d576fa ixgbe: make sure per Rx queue is disabled before unmapping the receive buffer
When disable the Rx logic globally, we would also want to disable the per Rx
queue receive logic by per queue Rx control register RXDCTL so no more DMA is
happening from the packet buffer to the receive buffer associated with the Rx
ring, before we start unmapping Rx ring receive buffer. The hardware may take
max of 100us before the corresponding Rx queue is really disabled. Added
ixgbe_disable_rx_queue() for this purpose.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:10 -08:00
Dirk Brandewie 5377a4160b e1000: Add support for the CE4100 reference platform
This patch adds support for the gigabit phys present on the CE4100 reference
platforms.

Signed-off-by:  Dirk Brandewie <dirk.j.brandewie@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:10 -08:00
Bruce Allan 77996d1d4c e1000e: add custom set_d[0|3]_lplu_state function pointer for 82574
82574 needs to configure Low Power Link Up (or LPLU) differently than
the other parts in the 8257x family supported by the driver.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:09 -08:00
Bruce Allan 31dbe5b4ac e1000e: power off PHY after reset when interface is down
Some Phys supported by the driver do not remain powered off across a reset
of the device when the interface is down, e.g. on 82571, but not on 82574.
This patch powers down (only when WoL is disabled) the PHY after a reset if
the interface is down and the ethtool diagnostics are not currently running.

The ethtool diagnostic function required a minor re-factor as a result, and
the e1000_[get|put]_hw_control() functions are renamed since they are no
longer static to netdev.c as they are needed by the ethtool diagnostics.
A couple minor whitespace issues were cleaned up, too.

Reported-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:09 -08:00
Bruce Allan fe46f58fa6 e1000e: use either_crc_le() rather than re-write it
For the 82579 jumbo frame workaround, there is no need to re-write the CRC
calculation functionality already found in the kernel's ether_crc_le().

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:08 -08:00
Bruce Allan e0dc4f1254 e1000e: properly bounds-check string functions
Use string functions with bounds checking rather than their non-bounds
checking counterparts, and do not hard code these boundaries.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:08 -08:00
Bruce Allan 482fed85e6 e1000e: convert calls of ops.[read|write]_reg to e1e_[r|w]phy
Cleans up the code a bit by using the driver-specific e1e_rphy and
e1e_wphy macros instead of the full function pointer variants.  Fix
a couple whitespace issue with two already existing calls to e1e_wphy.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:07 -08:00
Bruce Allan dd93f95e92 e1000e: cleanup variables set but not used
The ICR register is clear on read and we don't care what the returned value
is when resetting the hardware so the icr variable(s) can be removed.  We
should not ignore the return from e1000_lv_jumbo_workaround_ich8lan() and
from e1000_get_phy_id_82571() (dump a debug message when it fails and when
an unknown Phy id is returned).

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:44:06 -08:00
Jesse Gross 0363466866 net offloading: Convert checksums to use centrally computed features.
In order to compute the features for other offloads (primarily
scatter/gather), we need to first check the ability of the NIC to
offload the checksum for the packet.  Since we have already computed
this, we can directly use the result instead of figuring it out
again.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:35:35 -08:00
Jesse Gross 02932ce9e2 net offloading: Convert skb_need_linearize() to use precomputed features.
This switches skb_need_linearize() to use the features that have
been centrally computed.  In doing so, this fixes a problem where
scatter/gather should not be used because the card does not support
checksum offloading on that type of packet.  On device registration
we only check that some form of checksum offloading is available if
scatter/gatther is enabled but we must also check at transmission
time.  Examples of this include IPv6 or vlan packets on a NIC that
only supports IPv4 offloading.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:35:35 -08:00
Jesse Gross 91ecb63c07 net offloading: Convert dev_gso_segment() to use precomputed features.
This switches dev_gso_segment() to use the device features computed
by the centralized routine.  In doing so, it fixes a problem where
it would always use dev->features, instead of those appropriate
to the number of vlan tags if any are present.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:35:34 -08:00
Jesse Gross fc741216db net offloading: Pass features into netif_needs_gso().
Now that there is a single function that can compute the device
features relevant to a packet, we don't want to run it for each
offload.  This converts netif_needs_gso() to take the features
of the device, rather than computing them itself.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:35:34 -08:00
Jesse Gross f01a5236bd net offloading: Generalize netif_get_vlan_features().
netif_get_vlan_features() is currently only used by netif_needs_gso(),
so it only concerns itself with GSO features.  However, several other
places also should take into account the contents of the packet when
deciding whether to offload to hardware.  This generalizes the function
to return features about all of the various forms of offloading.  Since
offloads tend to be linked together, this avoids duplicating the logic
in each location (i.e. the scatter/gather code also needs the checksum
logic).

Suggested-by: Michał Mirosław <mirqus@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:35:33 -08:00
Jesse Gross 9497a0518e net offloading: Accept NETIF_F_HW_CSUM for all protocols.
We currently only have software fallback for one type of checksum: the
TCP/UDP one's complement.  This means that a protocol that uses hardware
offloading for a different type of checksum (FCoE, SCTP) must directly
check the device's features and do the right thing ahead of time.  By
the time we get to dev_can_checksum(), we're only deciding whether to
apply the one algorithm in software or hardware.  NETIF_F_HW_CSUM has the
same capabilities as the software version, so we should always use it if
present.  The primary advantage of this is multiply tagged vlans can use
hardware checksumming.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 23:35:33 -08:00
françois romieu eee3a96c63 r8169: delay phy init until device opens.
It workarounds the 60s firmware load failure timeout for the
non-modular case.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 18:15:08 -08:00
Randy Dunlap 697d0e338c net: fix kernel-doc warning in core/filter.c
Fix new kernel-doc notation warning in net/core/filter.c:

Warning(net/core/filter.c:172): No description found for parameter 'fentry'
Warning(net/core/filter.c:172): Excess function parameter 'filter' description in 'sk_run_filter'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 16:26:51 -08:00
Randy Dunlap 928c41e7a1 net/sock.h: make some fields private to fix kernel-doc warning(s)
Fix new kernel-doc notation warning in sock.h by annotating skc_dontcopy_*
as private fields.

Warning(include/net/sock.h:163): No description found for parameter 'skc_dontcopy_end[0]'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 16:26:51 -08:00
Jan Engelhardt 0ab03c2b14 netlink: test for all flags of the NLM_F_DUMP composite
Due to NLM_F_DUMP is composed of two bits, NLM_F_ROOT | NLM_F_MATCH,
when doing "if (x & NLM_F_DUMP)", it tests for _either_ of the bits
being set. Because NLM_F_MATCH's value overlaps with NLM_F_EXCL,
non-dump requests with NLM_F_EXCL set are mistaken as dump requests.

Substitute the condition to test for _all_ bits being set.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 16:25:03 -08:00
Rafael J. Wysocki dba5a68ae1 forcedeth: Do not use legacy PCI power management
The forcedeth driver uses the legacy PCI power management, so it has
to do PCI-specific things in its ->suspend() and ->resume() callbacks
and some of them are not done correctly.

Convert forcedeth to the new PCI power management framework and make
it let the PCI subsystem take care of all the PCI-specific aspects of
device handling during system power transitions.

Tested with nVidia Corporation MCP55 Ethernet (rev a2).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09 16:20:29 -08:00