When using layer2 or layer2+3 hash, only the 5th byte of the MAC
addresses is used.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The documentation for the tip tree is really in quite a similar
spirit to the netdev-FAQ. Move the netdev-FAQ to the process docs
as well.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Convert the "should I use new or old comment formatting" to cover
all formatting. This makes the question itself shorter.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add the most important case to the question about "where are we
in the cycle" - the case of net-next being closed.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
I think double back ticks are more correct. Add where they are missing.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
These days we often ask for selftests so let's update our
testing requirements.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
We have to tell people to stop reposting to often lately,
or not to repost while the discussion is ongoing.
Document this.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Make the question shorter and adjust the start of the answer accordingly.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The semantics of "Under review" have shifted. Reword the question
about it a bit and focus it on the response time.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Cut down the length of the question so it renders better in docs.
Mention that Message-ID can be used to search patchwork.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Use the sphinx Warning box to make the net-next being closed
stand out more.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
We want people to mark their patches with net and net-next in the subject.
Many miss doing that. Move the FAQ section which points that out up, and
place it after the section which enumerates the trees, that seems like
a pretty logical place for it. Since the two sections are together we
can remove a little bit (not too much) of the repetition.
v2: also remove the text for non-git setups, we want people to use git.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Most people use (or should use) lore at this point.
Replace the pointers to older archiving systems.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
It should be familiar and beneficial to expose devlink instance
lock to the drivers. This way drivers can block devlink from
calling them during critical sections without breakneck locking.
Add port helpers, port splitting callbacks will be the first
target.
Use 'devl_' prefix for "explicitly locked" API. Initial RFC used
'__devlink' but that's too much typing.
devl_lock_is_held() is not defined without lockdep, which is
the same behavior as lockdep_is_held() itself.
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Back when tcp_tso_autosize() and TCP pacing were introduced,
our focus was really to reduce burst sizes for long distance
flows.
The simple heuristic of using sk_pacing_rate/1024 has worked
well, but can lead to too small packets for hosts in the same
rack/cluster, when thousands of flows compete for the bottleneck.
Neal Cardwell had the idea of making the TSO burst size
a function of both sk_pacing_rate and tcp_min_rtt()
Indeed, for local flows, sending bigger bursts is better
to reduce cpu costs, as occasional losses can be repaired
quite fast.
This patch is based on Neal Cardwell implementation
done more than two years ago.
bbr is adjusting max_pacing_rate based on measured bandwidth,
while cubic would over estimate max_pacing_rate.
/proc/sys/net/ipv4/tcp_tso_rtt_log can be used to tune or disable
this new feature, in logarithmic steps.
Tested:
100Gbit NIC, two hosts in the same rack, 4K MTU.
600 flows rate-limited to 20000000 bytes per second.
Before patch: (TSO sizes would be limited to 20000000/1024/4096 -> 4 segments per TSO)
~# echo 0 >/proc/sys/net/ipv4/tcp_tso_rtt_log
~# nstat -n;perf stat ./super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000;nstat|egrep "TcpInSegs|TcpOutSegs|TcpRetransSegs|Delivered"
96005
Performance counter stats for './super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000':
65,945.29 msec task-clock # 2.845 CPUs utilized
1,314,632 context-switches # 19935.279 M/sec
5,292 cpu-migrations # 80.249 M/sec
940,641 page-faults # 14264.023 M/sec
201,117,030,926 cycles # 3049769.216 GHz (83.45%)
17,699,435,405 stalled-cycles-frontend # 8.80% frontend cycles idle (83.48%)
136,584,015,071 stalled-cycles-backend # 67.91% backend cycles idle (83.44%)
53,809,530,436 instructions # 0.27 insn per cycle
# 2.54 stalled cycles per insn (83.36%)
9,062,315,523 branches # 137422329.563 M/sec (83.22%)
153,008,621 branch-misses # 1.69% of all branches (83.32%)
23.182970846 seconds time elapsed
TcpInSegs 15648792 0.0
TcpOutSegs 58659110 0.0 # Average of 3.7 4K segments per TSO packet
TcpExtTCPDelivered 58654791 0.0
TcpExtTCPDeliveredCE 19 0.0
After patch:
~# echo 9 >/proc/sys/net/ipv4/tcp_tso_rtt_log
~# nstat -n;perf stat ./super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000;nstat|egrep "TcpInSegs|TcpOutSegs|TcpRetransSegs|Delivered"
96046
Performance counter stats for './super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000':
48,982.58 msec task-clock # 2.104 CPUs utilized
186,014 context-switches # 3797.599 M/sec
3,109 cpu-migrations # 63.472 M/sec
941,180 page-faults # 19214.814 M/sec
153,459,763,868 cycles # 3132982.807 GHz (83.56%)
12,069,861,356 stalled-cycles-frontend # 7.87% frontend cycles idle (83.32%)
120,485,917,953 stalled-cycles-backend # 78.51% backend cycles idle (83.24%)
36,803,672,106 instructions # 0.24 insn per cycle
# 3.27 stalled cycles per insn (83.18%)
5,947,266,275 branches # 121417383.427 M/sec (83.64%)
87,984,616 branch-misses # 1.48% of all branches (83.43%)
23.281200256 seconds time elapsed
TcpInSegs 1434706 0.0
TcpOutSegs 58883378 0.0 # Average of 41 4K segments per TSO packet
TcpExtTCPDelivered 58878971 0.0
TcpExtTCPDeliveredCE 9664 0.0
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20220309015757.2532973-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since commit
baebdf48c3 ("net: dev: Makes sure netif_rx() can be invoked in any context.")
the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.
Use netif_rx().
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen reported the following warning messages from smc-sysctl.rst
Documentation/networking/smc-sysctl.rst:3: WARNING: Title overline
too short.
Documentation/networking/smc-sysctl.rst: WARNING: document isn't
included in any toctree
Fix the title overline and add smc-sysctl entry into
Documentation/networking/index.rst
Fixes: 12bbb0d163 ("net/smc: add sysctl for autocorking")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220303113527.62047-1-dust.li@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the new stats API, kernel config parameter, and stats structure
information to the page_pool documentation.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This add a new sysctl: net.smc.autocorking_size
We can dynamically change the behaviour of autocorking
by change the value of autocorking_size.
Setting to 0 disables autocorking in SMC
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After change "net: dsa: tag_8021q: replace the SVL bridging with
VLAN-unaware IVL bridging", tag_8021q enforces two different pvids on a
port, depending on whether it is standalone or in a VLAN-unaware bridge.
Up until now, there was a single pvid, represented by
dsa_tag_8021q_rx_vid(), and that was used as the VLAN for VLAN-unaware
virtual link rules, regardless of whether the port was bridged or
standalone.
To keep VLAN-unaware virtual links working, we need to follow whether
the port is in a bridge or not, and update the VLAN ID from those rules.
In fact we can't fully do that. Depending on whether the switch is
VLAN-aware or not, we can accept Virtual Link rules with just the MAC
DA, or with a MAC DA and a VID. So we already deny changes to the VLAN
awareness of the switch. But the VLAN awareness may also change as a
result of joining or leaving a bridge.
One might say we could just allow the following: a port may leave a
VLAN-unaware bridge while it has VLAN-unaware VL (tc-flower) rules, and
the driver will update those with the new tag_8021q pvid for standalone
mode, but the driver won't accept joining a bridge at all while VL rules
were installed in standalone mode. This is sort of a compromise made
because leaving a bridge is an operation that cannot be vetoed.
But this sort of setup change is not fully supported, either: as
mentioned, VLAN filtering changes can also be triggered by leaving a
bridge, therefore, the existing veto we have in place for turning VLAN
filtering off with VLAN-aware VL rules active still isn't fully
effective.
I really don't know how to deal with this in a way that produces
predictable behavior for user space. Since at the moment, keeping this
feature fully functional on constellation changes (not changing the
tag_8021q port pvid when joining a bridge) is blocking progress for the
DSA FDB isolation, I'd rather document it as a (potentially temporary)
limitation and go on without it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>