Commit Graph

165 Commits

Author SHA1 Message Date
Toshiaki Makita f2808d226f bridge: Prepare for forwarding another bridge group addresses
If a bridge is an 802.1ad bridge, it must forward another bridge group
addresses (the Nearest Customer Bridge group addresses).
(For details, see IEEE 802.1Q-2011 8.6.3.)

As user might not want group_fwd_mask to be modified by enabling 802.1ad,
introduce a new mask, group_fwd_mask_required, which indicates addresses
the bridge wants to forward. This will be set by enabling 802.1ad.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:22:53 -07:00
Toshiaki Makita 8580e2117c bridge: Prepare for 802.1ad vlan filtering support
This enables a bridge to have vlan protocol informantion and allows vlan
tag manipulation (retrieve, insert and remove tags) according to the vlan
protocol.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:22:53 -07:00
Linus Lüssing 2cd4143192 bridge: memorize and export selected IGMP/MLD querier port
Adding bridge support to the batman-adv multicast optimization requires
batman-adv knowing about the existence of bridged-in IGMP/MLD queriers
to be able to reliably serve any multicast listener behind this same
bridge.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:50:47 -07:00
Linus Lüssing 07f8ac4a1e bridge: add export of multicast database adjacent to net_dev
With this new, exported function br_multicast_list_adjacent(net_dev) a
list of IPv4/6 addresses is returned. This list contains all multicast
addresses sensed by the bridge multicast snooping feature on all bridge
ports of the bridge interface of net_dev, excluding addresses from the
specified net_device itself.

Adding bridge support to the batman-adv multicast optimization requires
batman-adv knowing about the existence of bridged-in multicast
listeners to be able to reliably serve them with multicast packets.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:50:47 -07:00
Linus Lüssing dc4eb53a99 bridge: adhere to querier election mechanism specified by RFCs
MLDv1 (RFC2710 section 6), MLDv2 (RFC3810 section 7.6.2), IGMPv2
(RFC2236 section 3) and IGMPv3 (RFC3376 section 6.6.2) specify that the
querier with lowest source address shall become the selected
querier.

So far the bridge stopped its querier as soon as it heard another
querier regardless of its source address. This results in the "wrong"
querier potentially becoming the active querier or a potential,
unnecessary querying delay.

With this patch the bridge memorizes the source address of the currently
selected querier and ignores queries from queriers with a higher source
address than the currently selected one. This slight optimization is
supposed to make it more RFC compliant (but is rather uncritical and
therefore probably not necessary to be queued for stable kernels).

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:50:47 -07:00
Linus Lüssing 90010b36eb bridge: rename struct bridge_mcast_query/querier
The current naming of these two structs is very random, in that
reversing their naming would not make any semantical difference.

This patch tries to make the naming less confusing by giving them a more
specific, distinguishable naming.

This is also useful for the upcoming patches reintroducing the
"struct bridge_mcast_querier" but for storing information about the
selected querier (no matter if our own or a foreign querier).

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 23:50:46 -07:00
David S. Miller c99f7abf0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	include/net/inetpeer.h
	net/ipv6/output_core.c

Changes in net were fixing bugs in code removed in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-03 23:32:12 -07:00
Toshiaki Makita e0d7968ab6 bridge: Prevent insertion of FDB entry with disallowed vlan
br_handle_local_finish() is allowing us to insert an FDB entry with
disallowed vlan. For example, when port 1 and 2 are communicating in
vlan 10, and even if vlan 10 is disallowed on port 3, port 3 can
interfere with their communication by spoofed src mac address with
vlan id 10.

Note: Even if it is judged that a frame should not be learned, it should
not be dropped because it is destined for not forwarding layer but higher
layer. See IEEE 802.1Q-2011 8.13.10.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02 13:38:23 -07:00
Cong Wang b1282726d5 bridge: make br_device_notifier static
Merge net/bridge/br_notify.c into net/bridge/br.c,
since it has only br_device_event() and br.c is small.

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 15:33:47 -04:00
Alexei Starovoitov d4f0e0958d net: bridge: fix build
fix build when BRIDGE_VLAN_FILTERING is not set

Fixes: 2796d0c648 ("bridge: Automatically manage port promiscuous mode")

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-18 20:09:50 -04:00
Vlad Yasevich 2796d0c648 bridge: Automatically manage port promiscuous mode.
There exist configurations where the administrator or another management
entity has the foreknowledge of all the mac addresses of end systems
that are being bridged together.

In these environments, the administrator can statically configure known
addresses in the bridge FDB and disable flooding and learning on ports.
This makes it possible to turn off promiscuous mode on the interfaces
connected to the bridge.

Here is why disabling flooding and learning allows us to control
promiscuity:
 Consider port X.  All traffic coming into this port from outside the
bridge (ingress) will be either forwarded through other ports of the
bridge (egress) or dropped.  Forwarding (egress) is defined by FDB
entries and by flooding in the event that no FDB entry exists.
In the event that flooding is disabled, only FDB entries define
the egress.  Once learning is disabled, only static FDB entries
provided by a management entity define the egress.  If we provide
information from these static FDBs to the ingress port X, then we'll
be able to accept all traffic that can be successfully forwarded and
drop all the other traffic sooner without spending CPU cycles to
process it.
 Another way to define the above is as following equations:
    ingress = egress + drop
 expanding egress
    ingress = static FDB + learned FDB + flooding + drop
 disabling flooding and learning we a left with
    ingress = static FDB + drop

By adding addresses from the static FDB entries to the MAC address
filter of an ingress port X, we fully define what the bridge can
process without dropping and can thus turn off promiscuous mode,
thus dropping packets sooner.

There have been suggestions that we may want to allow learning
and update the filters with learned addresses as well.  This
would require mac-level authentication similar to 802.1x to
prevent attacks against the hw filters as they are limited
resource.

Additionally, if the user places the bridge device in promiscuous mode,
all ports are placed in promiscuous mode regardless of the changes
to flooding and learning.

Since the above functionality depends on full static configuration,
we have also require that vlan filtering be enabled to take
advantage of this.  The reason is that the bridge has to be
able to receive and process VLAN-tagged frames and the there
are only 2 ways to accomplish this right now: promiscuous mode
or vlan filtering.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:33 -04:00
Vlad Yasevich f3a6ddf152 bridge: Introduce BR_PROMISC flag
Introduce a BR_PROMISC per-port flag that will help us track if the
current port is supposed to be in promiscuous mode or not.  For now,
always start in promiscuous mode.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:33 -04:00
Vlad Yasevich 8db24af71b bridge: Add functionality to sync static fdb entries to hw
Add code that allows static fdb entires to be synced to the
hw list for a specified port.  This will be used later to
program ports that can function in non-promiscuous mode.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:33 -04:00
Vlad Yasevich e028e4b8dc bridge: Keep track of ports capable of automatic discovery.
By default, ports on the bridge are capable of automatic
discovery of nodes located behind the port.  This is accomplished
via flooding of unknown traffic (BR_FLOOD) and learning the
mac addresses from these packets (BR_LEARNING).
If the above functionality is disabled by turning off these
flags, the port requires static configuration in the form
of static FDB entries to function properly.

This patch adds functionality to keep track of all ports
capable of automatic discovery.  This will later be used
to control promiscuity settings.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:06:33 -04:00
Eric W. Biederman a8779ec1c5 netpoll: Remove gfp parameter from __netpoll_setup
The gfp parameter was added in:
commit 47be03a28c
Author: Amerigo Wang <amwang@redhat.com>
Date:   Fri Aug 10 01:24:37 2012 +0000

    netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup()

    slave_enable_netpoll() and __netpoll_setup() may be called
    with read_lock() held, so should use GFP_ATOMIC to allocate
    memory. Eric suggested to pass gfp flags to __netpoll_setup().

    Cc: Eric Dumazet <eric.dumazet@gmail.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Cong Wang <amwang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

The reason for the gfp parameter was removed in:
commit c4cdef9b71
Author: dingtianhong <dingtianhong@huawei.com>
Date:   Tue Jul 23 15:25:27 2013 +0800

    bonding: don't call slave_xxx_netpoll under spinlocks

    The slave_xxx_netpoll will call synchronize_rcu_bh(),
    so the function may schedule and sleep, it should't be
    called under spinlocks.

    bond_netpoll_setup() and bond_netpoll_cleanup() are always
    protected by rtnl lock, it is no need to take the read lock,
    as the slave list couldn't be changed outside rtnl lock.

    Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
    Cc: Jay Vosburgh <fubar@us.ibm.com>
    Cc: Andy Gospodarek <andy@greyhouse.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Nothing else that calls __netpoll_setup or ndo_netpoll_setup
requires a gfp paramter, so remove the gfp parameter from both
of these functions making the code clearer.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-29 17:58:37 -04:00
Joe Perches e5a727f663 bridge: Use ether_addr_copy and ETH_ALEN
Convert the more obvious uses of memcpy to ether_addr_copy.

There are still uses of memcpy that could be converted but
these addresses are __aligned(2).

Convert a couple uses of 6 in gr_private.h to ETH_ALEN.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 19:16:43 -05:00
Toshiaki Makita 424bb9c97c bridge: Properly check if local fdb entry can be deleted when deleting vlan
Vlan codes unconditionally delete local fdb entries.
We should consider the possibility that other ports have the same
address and vlan.

Example of problematic case:
  ip link set eth0 address 12:34:56:78:90:ab
  ip link set eth1 address aa:bb:cc:dd:ee:ff
  brctl addif br0 eth0
  brctl addif br0 eth1 # br0 will have mac address 12:34:56:78:90:ab
  bridge vlan add dev eth0 vid 10
  bridge vlan add dev eth1 vid 10
  bridge vlan add dev br0 vid 10 self
We will have fdb entry such that f->dst == eth0, f->vlan_id == 10 and
f->addr == 12:34:56:78:90:ab at this time.
Next, delete eth0 vlan 10.
  bridge vlan del dev eth0 vid 10
In this case, we still need the entry for br0, but it will be deleted.

Note that br0 needs the entry even though its mac address is not set
manually. To delete the entry with proper condition checking,
fdb_delete_local() is suitable to use.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-10 14:34:34 -08:00
Toshiaki Makita 2b292fb4a5 bridge: Fix the way to check if a local fdb entry can be deleted
We should take into account the followings when deleting a local fdb
entry.

- nbp_vlan_find() can be used only when vid != 0 to check if an entry is
  deletable, because a fdb entry with vid 0 can exist at any time while
  nbp_vlan_find() always return false with vid 0.

  Example of problematic case:
    ip link set eth0 address 12:34:56:78:90:ab
    ip link set eth1 address 12:34:56:78:90:ab
    brctl addif br0 eth0
    brctl addif br0 eth1
    ip link set eth0 address aa:bb:cc:dd:ee:ff
  Then, the fdb entry 12:34:56:78:90:ab will be deleted even though the
  bridge port eth1 still has that address.

- The port to which the bridge device is attached might needs a local entry
  if its mac address is set manually.

  Example of problematic case:
    ip link set eth0 address 12:34:56:78:90:ab
    brctl addif br0 eth0
    ip link set br0 address 12:34:56:78:90:ab
    ip link set eth0 address aa:bb:cc:dd:ee:ff
  Then, the fdb still must have the entry 12:34:56:78:90:ab, but it will be
  deleted.

We can use br->dev->addr_assign_type to check if the address is manually
set or not, but I propose another approach.

Since we delete and insert local entries whenever changing mac address
of the bridge device, we can change dst of the entry to NULL regardless of
addr_assign_type when deleting an entry associated with a certain port,
and if it is found to be unnecessary later, then delete it.
That is, if changing mac address of a port, the entry might be changed
to its dst being NULL first, but is eventually deleted when recalculating
and changing bridge id.

This approach is especially useful when we want to share the code with
deleting vlan in which the bridge device might want such an entry regardless
of addr_assign_type, and makes things easy because we don't have to consider
if mac address of the bridge device will be changed or not at the time we
delete a local entry of a port, which means fdb code will not be bothered
even if the bridge id calculating logic is changed in the future.

Also, this change reduces inconsistent state, where frames whose dst is the
mac address of the bridge, can't reach the bridge because of premature fdb
entry deletion. This change reduces the possibility that the bridge device
replies unreachable mac address to arp requests, which could occur during
the short window between calling del_nbp() and br_stp_recalculate_bridge_id()
in br_del_if(). This will effective after br_fdb_delete_by_port() starts to
use the same code by following patch.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-10 14:34:33 -08:00
Toshiaki Makita a5642ab474 bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr
br_fdb_changeaddr() assumes that there is at most one local entry per port
per vlan. It used to be true, but since commit 36fd2b63e3 ("bridge: allow
creating/deleting fdb entries via netlink"), it has not been so.
Therefore, the function might fail to search a correct previous address
to be deleted and delete an arbitrary local entry if user has added local
entries manually.

Example of problematic case:
  ip link set eth0 address ee:ff:12:34:56:78
  brctl addif br0 eth0
  bridge fdb add 12:34:56:78:90:ab dev eth0 master
  ip link set eth0 address aa:bb:cc:dd:ee:ff
Then, the address 12:34:56:78:90:ab might be deleted instead of
ee:ff:12:34:56:78, the original mac address of eth0.

Address this issue by introducing a new flag, added_by_user, to struct
net_bridge_fdb_entry.

Note that br_fdb_delete_by_port() has to set added_by_user to 0 in cases
like:
  ip link set eth0 address 12:34:56:78:90:ab
  ip link set eth1 address aa:bb:cc:dd:ee:ff
  brctl addif br0 eth0
  bridge fdb add aa:bb:cc:dd:ee:ff dev eth0 master
  brctl addif br0 eth1
  brctl delif br0 eth0
In this case, kernel should delete the user-added entry aa:bb:cc:dd:ee:ff,
but it also should have been added by "brctl addif br0 eth1" originally,
so we don't delete it and treat it a new kernel-created entry.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-10 14:34:33 -08:00
WANG Cong b86f81cca9 bridge: move br_net_exit() to br.c
And it can become static.

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13 23:42:39 -08:00
Li RongQing 8f84985fec net: unify the pcpu_tstats and br_cpu_netstats as one
They are same, so unify them as one, pcpu_sw_netstats.

Define pcpu_sw_netstat in netdevice.h, remove pcpu_tstats
from if_tunnel and remove br_cpu_netstats from br_private.h

Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-04 20:10:24 -05:00
tanxiaojun 31a5b837c2 bridge: add space before '(/{', after ',', etc.
Spaces required before the open parenthesis '(', before the open
brace '{', after that ',' and around that '?/:'.

Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:27:26 -05:00
Jiri Pirko 859828c0ea br: fix use of ->rx_handler_data in code executed on non-rx_handler path
br_stp_rcv() is reached by non-rx_handler path. That means there is no
guarantee that dev is bridge port and therefore simple NULL check of
->rx_handler_data is not enough. There is need to check if dev is really
bridge port and since only rcu read lock is held here, do it by checking
->rx_handler pointer.

Note that synchronize_net() in netdev_rx_handler_unregister() ensures
this approach as valid.

Introduced originally by:
commit f350a0a873
  "bridge: use rx_handler_data pointer to store net_bridge_port pointer"

Fixed but not in the best way by:
commit b5ed54e94d
  "bridge: fix RCU races with bridge port"

Reintroduced by:
commit 716ec052d2
  "bridge: fix NULL pointer deref of br_port_get_rcu"

Please apply to stable trees as well. Thanks.

RH bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1025770

Reported-by: Laine Stump <laine@redhat.com>
Debugged-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06 15:41:40 -05:00
David S. Miller 394efd19d5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be.h
	drivers/net/netconsole.c
	net/bridge/br_private.h

Three mostly trivial conflicts.

The net/bridge/br_private.h conflict was a function signature (argument
addition) change overlapping with the extern removals from Joe Perches.

In drivers/net/netconsole.c we had one change adjusting a printk message
whilst another changed "printk(KERN_INFO" into "pr_info(".

Lastly, the emulex change was a new inline function addition overlapping
with Joe Perches's extern removals.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 13:48:30 -05:00
Vlad Yasevich 06499098a0 bridge: pass correct vlan id to multicast code
Currently multicast code attempts to extrace the vlan id from
the skb even when vlan filtering is disabled.  This can lead
to mdb entries being created with the wrong vlan id.
Pass the already extracted vlan id to the multicast
filtering code to make the correct id is used in
creation as well as lookup.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29 17:40:08 -04:00