Commit Graph

376256 Commits

Author SHA1 Message Date
Or Gerlitz c418253f12 net/mlx4_core: Keep VF assigned MAC in the PF admin table
MAC addresses assigned by the PF to VFs were not kept in the PF driver
admin table. As a result, displaying the VF MACs from the PF interface
to user space showed zero address where in fact the VF got non-zero
address from the PF, fix that.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:58:24 -07:00
Or Gerlitz ef96f7d46a net/mlx4_en: Handle unassigned VF MAC address correctly
When a VF sense they didn't get MAC address, use random one. This will
address the case of administrator not assigning MAC to the VF through
the PF OS APIs and keep udev happy.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:58:24 -07:00
Jack Morgenstein 5efe5355f2 net/mlx4_core: Return -EPROBE_DEFER when a VF is probed before PF is sufficiently initialized
In the PF initialization, SRIOV is enabled before the PF is fully initialized.
This allows the kernel to probe the newly-exposed VFs before the PF is ready
to handle them (nested probes).

Have the probe method return the -EPROBE_DEFER value in this situation (instead
of the VF probe method retrying its initialization in a loop, and returning -EIO
on failure). When -EPROBE_DEFER is returned by the VF probe method, the kernel
itself will retry the probe after a suitable delay.

Based upon a suggestion by Ben Hutchings <bhutchings@solarflare.com>

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:58:24 -07:00
Sagi Grimberg a1c6693a50 net/mlx4_en: Fix adaptive moderation cq update
When turning on adaptive_rx under adaptive moderation, the CQ's moderation
count wasn't updated according to rx_frames which resulted in too many
interrupts and bandwidth drop.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:58:24 -07:00
Eric Dumazet 01cb71d2d4 net_sched: restore "overhead xxx" handling
commit 56b765b79 ("htb: improved accuracy at high rates")
broke the "overhead xxx" handling, as well as the "linklayer atm"
attribute.

tc class add ... htb rate X ceil Y linklayer atm overhead 10

This patch restores the "overhead xxx" handling, for htb, tbf
and act_police

The "linklayer atm" thing needs a separate fix.

Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vimalkumar <j.vimal@gmail.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-02 22:22:35 -07:00
Eric Dumazet c87a124a5d net: force a reload of first item in hlist_nulls_for_each_entry_rcu
Roman Gushchin discovered that udp4_lib_lookup2() was not reloading
first item in the rcu protected list, in case the loop was restarted.

This produced soft lockups as in https://lkml.org/lkml/2013/4/16/37

rcu_dereference(X)/ACCESS_ONCE(X) seem to not work as intended if X is
ptr->field :

In some cases, gcc caches the value or ptr->field in a register.

Use a barrier() to disallow such caching, as documented in
Documentation/atomic_ops.txt line 114

Thanks a lot to Roman for providing analysis and numerous patches.

Diagnosed-by: Roman Gushchin <klamm@yandex-team.ru>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru>
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-02 20:53:59 -07:00
Haiyang Zhang c802db1164 hyperv: Fix vlan_proto setting in netvsc_recv_callback()
Since the recent addition of 8021AD, we need to set the new field vlan_proto in
sk_buff. Otherwise, it will trigger BUG() call in vlan_proto_idx().

This patch fixes the problem.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:33:40 -07:00
Jiri Pirko 3f3e7ce4ff team: fix port list dump for big number of ports
In case the port list dump does not fit into one skb currently the
dump would start over again. Fix this by continue from the last dumped
port.

Introduced by commit d90f889e9c (team: handle sending port list in the
same way option list is sent)

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:31:52 -07:00
Jiri Pirko 6d7581e62f list: introduce list_first_entry_or_null
non-rcu variant of list_first_or_null_rcu

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:31:52 -07:00
Paul Moore e4e8536f65 selinux: fix the labeled xfrm/IPsec reference count handling
The SELinux labeled IPsec code was improperly handling its reference
counting, dropping a reference on a delete operation instead of on a
free/release operation.

Reported-by: Ondrej Moris <omoris@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:30:07 -07:00
Paul Moore e4c1721642 xfrm: force a garbage collection after deleting a policy
In some cases after deleting a policy from the SPD the policy would
remain in the dst/flow/route cache for an extended period of time
which caused problems for SELinux as its dynamic network access
controls key off of the number of XFRM policy and state entries.
This patch corrects this problem by forcing a XFRM garbage collection
whenever a policy is sucessfully removed.

Reported-by: Ondrej Moris <omoris@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:30:07 -07:00
Pravin B Shelar 1e2bd517c1 udp6: Fix udp fragmentation for tunnel traffic.
udp6 over GRE tunnel does not work after to GRE tso changes. GRE
tso handler passes inner packet but keeps track of outer header
start in SKB_GSO_CB(skb)->mac_offset.  udp6 fragment need to
take care of outer header, which start at the mac_offset, while
adding fragment header.
This bug is introduced by commit 68c3316311 (GRE: Add TCP
segmentation offload for GRE).

Reported-by: Dmitry Kravkov <dkravkov@gmail.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 17:06:07 -07:00
Jay Vosburgh b190a50875 net/core: dev_mc_sync_multiple calls wrong helper
The dev_mc_sync_multiple function is currently calling
__hw_addr_sync, and not __hw_addr_sync_multiple.  This will result in
addresses only being synced to the first device from the set.

	Corrected by calling the _multiple variant.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Reviewed-by: Vlad Yasevich <vyasevic@redhat.com>
Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 16:56:56 -07:00
Jay Vosburgh 29ca2f8fcc net/core: __hw_addr_sync_one / _multiple broken
Currently, __hw_addr_sync_one is called in a loop by
__hw_addr_sync_multiple to sync each of a "from" device's hw addresses
to a "to" device.  __hw_addr_sync_one calls __hw_addr_add_ex to attempt
to add each address.  __hw_addr_add_ex is called with global=false, and
sync=true.

	__hw_addr_add_ex checks to see if the new address matches an
address already on the list.  If so, it tests global and sync.  In this
case, sync=true, and it then checks if the address is already synced,
and if so, returns 0.

	This 0 return causes __hw_addr_sync_one to increment the sync_cnt
and refcount for the "from" list's address entry, even though the address
is already synced and has a reference and sync_cnt.  This will cause
the sync_cnt and refcount to increment without bound every time an
addresses is added to the "from" device and synced to the "to" device.

	The fix here has two parts:

	First, when __hw_addr_add_ex finds the address already exists
and is synced, return -EEXIST instead of 0.

	Second, __hw_addr_sync_one checks the error return for -EEXIST,
and if so, it (a) does not add a refcount/sync_cnt, and (b) returns 0
itself so that __hw_addr_sync_multiple will not return an error.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Reviewed-by: Vlad Yasevich <vyasevic@redhat.com>
Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 16:56:56 -07:00
Jay Vosburgh 60ba834c2f net/core: __hw_addr_unsync_one "from" address not marked synced
When an address is added to a subordinate interface (the "to"
list), the address entry in the "from" list is not marked "synced" as
the entry added to the "to" list is.

	When performing the unsync operation (e.g., dev_mc_unsync),
__hw_addr_unsync_one calls __hw_addr_del_entry with the "synced"
parameter set to true for the case when the address reference is being
released from the "from" list.  This causes a test inside to fail,
with the result being that the reference count on the "from" address
is not properly decremeted and the address on the "from" list will
never be freed.

	Correct this by having __hw_addr_unsync_one call the
__hw_addr_del_entry function with the "sync" flag set to false for the
"remove from the from list" case.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Reviewed-by: Vlad Yasevich <vyasevic@redhat.com>
Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 16:56:56 -07:00
Jay Vosburgh 9747ba6636 net/core: __hw_addr_create_ex does not initialize sync_cnt
The sync_cnt field is not being initialized, which can result
in arbitrary values in the field.  Fixed by initializing it to zero.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Reviewed-by: Vlad Yasevich <vyasevic@redhat.com>
Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 16:56:56 -07:00
Nicolas Dichtel fda3f402f4 snmp6: remove IPSTATS_MIB_CSUMERRORS
This stat is not relevant in IPv6, there is no checksum in IPv6 header.
Just leave a comment to explain the hole.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 16:26:49 -07:00
David S. Miller fb68e2f438 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
Please pull this batch of fixes intended for the 3.10 stream...

Regarding the NFC bits, Samuel says:

"This is the first batch of NFC fixes for 3.10, and it contains:

- 3 fixes for the NFC MEI support:
        * We now depend on the correct Kconfig symbol.
        * We register an MEI event callback whenever we enable an NFC device,
          otherwise we fail to read anything after an enable/disable cycle.
        * We only disable an MEI device from its disable mey_phy_ops,
          preventing useless consecutive disable calls.

- An NFC Makefile cleanup, as I forgot to remove a commented out line when
  moving the LLCP code to the NFC top level directory."

As for the mac80211 bits, Johannes says:

"This time I have a fix from Stanislaw for a stupid mistake I made in the
auth/assoc timeout changes, a fix from Felix for 64-bit traffic counters
and one from Helmut for address mask handling in mac80211. I also have a
few fixes myself for four different crashes reported by a few people."

And Johannes says this about the iwlwifi bit:

"This fixes a brown paper-bag bug that we really should've caught in
review. More details in the changelog for the fix."

On top of that...

Arend van Spriel and Hante Meuleman cooperate to send a series of AP
and P2P mode fixes for brcmfmac.

Gabor Juhos corrects a register offset for AR9550, avoiding a bus
error.

Dan Carpenter provides a fixup to some dmesg output in the atmel
driver.

And, finally...

Felix Fietkau not only gives us a trio of small AR934x fixes, but
also refactors the ath9k aggregation session start/stop handling
(using the generic mac80211 support) in order to avoid a deadlock.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31 02:38:19 -07:00
Somnath Kotur 01e5b2c455 be2net: Fix crash on 2nd invocation of PCI AER/EEH error_detected hook
During a PCI EEH/AER error recovery flow, if the device did not successfully
restart, the error_detected() hook may be called a second time with a
"perm_failure" state. This patch skips over driver cleanup for the second
invocation of the callback.

Also, Lancer error recovery code is fixed-up to handle these changes.

Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Somnath kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-30 16:58:06 -07:00
Somnath Kotur e38b170695 be2net: Mark checksum fail for IP fragmented packets
HW does not compute L4 checksum for IP Fragmented packets.

Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-30 16:58:06 -07:00
David S. Miller 73ce00d4d6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter/IPVS fixes for 3.10-rc3,
they are:

* fix xt_addrtype with IPv6, from Florian Westphal. This required
  a new hook for IPv6 functions in the netfilter core to avoid
  hard dependencies with the ipv6 subsystem when this match is
  only used for IPv4.

* fix connection reuse case in IPVS. Currently, if an reused
  connection are directed to the same server. If that server is
  down, those connection would fail. Therefore, clear the
  connection and choose a new server among the available ones.

* fix possible non-nul terminated string sent to user-space if
  ipt_ULOG is used as the default netfilter logging stub, from
  Chen Gang.

* fix mark logging of IPv6 packets in xt_LOG, from Michal Kubecek.
  This bug has been there since 2.6.26.

* Fix breakage ip_vs_sh due to incorrect structure layout for
  RCU, from Jan Beulich.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-30 16:38:38 -07:00
Jan Beulich a70b9641e6 ipvs: ip_vs_sh: fix build
kfree_rcu() requires offsetof(..., rcu_head) < 4096, which can
get violated with a sufficiently high CONFIG_IP_VS_SH_TAB_BITS.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-29 17:50:39 +02:00
John W. Linville 50cc1cab4c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-05-29 10:59:35 -04:00
Michal Kubeček d660164d79 netfilter: xt_LOG: fix mark logging for IPv6 packets
In dump_ipv6_packet(), the "recurse" parameter is zero only if
dumping contents of a packet embedded into an ICMPv6 error
message. Therefore we want to log packet mark if recurse is
non-zero, not when it is zero.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-29 12:29:18 +02:00
Jason Wang 8e6d91ae09 tuntap: forbid changing mq flag for persistent device
We currently allow changing the mq flag (IFF_MULTI_QUEUE) for a persistent
device. This will result a mismatch between the number the queues in netdev and
tuntap. This is because we only allocate a 1q netdevice when IFF_MULTI_QUEUE was
not specified, so when we set the IFF_MULTI_QUEUE and try to attach more queues
later, netif_set_real_num_tx_queues() may fail which result a single queue
netdevice with multiple sockets attached.

Solve this by disallowing changing the mq flag for persistent device.

Bug was introduced by commit edfb6a148c
(tuntap: reduce memory using of queues).

Reported-by: Sriram Narasimhan <sriram.narasimhan@hp.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-29 00:21:32 -07:00