kernel

mirror of https://github.com/ukui/kernel.git synced 2026-03-09 10:07:04 -07:00

Author	SHA1	Message	Date
Geliang Tang	49fa1919d6	mptcp: reset subflow when MP_FAIL doesn't respond This patch adds a new msk->flags bit MPTCP_FAIL_NO_RESPONSE, then reuses sk_timer to trigger a check if we have not received a response from the peer after sending MP_FAIL. If the peer doesn't respond properly, reset the subflow. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-27 10:45:54 +01:00
Geliang Tang	9c81be0dbc	mptcp: add MP_FAIL response support This patch adds a new struct member mp_fail_response_expect in struct mptcp_subflow_context to support MP_FAIL response. In the single subflow with checksum error and contiguous data special case, a MP_FAIL is sent in response to another MP_FAIL. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-27 10:45:54 +01:00
Geliang Tang	4293248c67	mptcp: add data lock for sk timers mptcp_data_lock() needs to be held when manipulating the msk retransmit_timer or the sk sk_timer. This patch adds the data lock for the both timers. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-27 10:45:54 +01:00
Geliang Tang	bcf3cf93f6	mptcp: use mptcp_stop_timer Use the helper mptcp_stop_timer() instead of using sk_stop_timer() to stop icsk_retransmit_timer directly. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-27 10:45:53 +01:00
Geliang Tang	b6e074e171	selftests: mptcp: add infinite map testcase Add the single subflow test case for MP_FAIL, to test the infinite mapping case. Use the test_linkfail value to make 128KB test files. Add a new function reset_with_fail(), in it use 'iptables' and 'tc action pedit' rules to produce the bit flips to trigger the checksum failures. Set validate_checksum to enable checksums for the MP_FAIL tests without passing the '-C' argument. Set check_invert flag to enable the invert bytes check for the output data in check_transfer(). Instead of the file mismatch error, this test prints out the inverted bytes. Add a new function pedit_action_pkts() to get the numbers of the packets edited by the tc pedit actions. Print this numbers to the output. Also add the needed kernel configures in the selftests config file. Suggested-by: Davide Caratti <dcaratti@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-27 10:45:53 +01:00
Marcel Ziswiler	b1190d5175	net: stmmac: dwmac-imx: comment spelling fix Fix spelling in comment. Fixes: `94abdad697` ("net: ethernet: dwmac: add ethernet glue logic for NXP imx8 chip") Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Link: https://lore.kernel.org/r/20220425154856.169499-1-marcel@ziswiler.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-26 17:28:31 -07:00
Bjorn Helgaas	e39f63fe0d	net: remove comments that mention obsolete __SLOW_DOWN_IO The only remaining definitions of __SLOW_DOWN_IO (for alpha and ia64) do nothing, and the only mentions in networking are in comments. Remove these mentions. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-26 17:09:24 -07:00
Bjorn Helgaas	dac173db11	net: wan: atp: remove unused eeprom_delay() atp.h is included only by atp.c, which does not use eeprom_delay(). Remove the unused definition. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-26 17:09:23 -07:00
Jakub Kicinski	c706b2b5ed	net: tls: fix async vs NIC crypto offload When NIC takes care of crypto (or the record has already been decrypted) we forget to update darg->async. ->async is supposed to mean whether record is async capable on input and whether record has been queued for async crypto on output. Reported-by: Gal Pressman <gal@nvidia.com> Fixes: `3547a1f9d9` ("tls: rx: use async as an in-out argument") Tested-by: Gal Pressman <gal@nvidia.com> Link: https://lore.kernel.org/r/20220425233309.344858-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-26 17:08:49 -07:00
Russell King (Oracle)	fae4630840	net: dsa: mt753x: fix pcs conversion regression Daniel Golle reports that the conversion of mt753x to phylink PCS caused an oops as below. The problem is with the placement of the PCS initialisation, which occurs after mt7531_setup() has been called. However, burited in this function is a call to setup the CPU port, which requires the PCS structure to be already setup. Fix this by changing the initialisation order. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 Mem abort info: ESR = 0x96000005 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x05: level 1 translation fault Data abort info: ISV = 0, ISS = 0x00000005 CM = 0, WnR = 0 user pgtable: 4k pages, 39-bit VAs, pgdp=0000000046057000 [0000000000000020] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 Internal error: Oops: 96000005 [#1] SMP Modules linked in: CPU: 0 PID: 32 Comm: kworker/u4:1 Tainted: G S 5.18.0-rc3-next-20220422+ #0 Hardware name: Bananapi BPI-R64 (DT) Workqueue: events_unbound deferred_probe_work_func pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : mt7531_cpu_port_config+0xcc/0x1b0 lr : mt7531_cpu_port_config+0xc0/0x1b0 sp : ffffffc008d5b980 x29: ffffffc008d5b990 x28: ffffff80060562c8 x27: 00000000f805633b x26: ffffff80001a8880 x25: 00000000000009c4 x24: 0000000000000016 x23: ffffff8005eb6470 x22: 0000000000003600 x21: ffffff8006948080 x20: 0000000000000000 x19: 0000000000000006 x18: 0000000000000000 x17: 0000000000000001 x16: 0000000000000001 x15: 02963607fcee069e x14: 0000000000000000 x13: 0000000000000030 x12: 0101010101010101 x11: ffffffc037302000 x10: 0000000000000870 x9 : ffffffc008d5b800 x8 : ffffff800028f950 x7 : 0000000000000001 x6 : 00000000662b3000 x5 : 00000000000002f0 x4 : 0000000000000000 x3 : ffffff800028f080 x2 : 0000000000000000 x1 : ffffff800028f080 x0 : 0000000000000000 Call trace: mt7531_cpu_port_config+0xcc/0x1b0 mt753x_cpu_port_enable+0x24/0x1f0 mt7531_setup+0x49c/0x5c0 mt753x_setup+0x20/0x31c dsa_register_switch+0x8bc/0x1020 mt7530_probe+0x118/0x200 mdio_probe+0x30/0x64 really_probe.part.0+0x98/0x280 __driver_probe_device+0x94/0x140 driver_probe_device+0x40/0x114 __device_attach_driver+0xb0/0x10c bus_for_each_drv+0x64/0xa0 __device_attach+0xa8/0x16c device_initial_probe+0x10/0x20 bus_probe_device+0x94/0x9c deferred_probe_work_func+0x80/0xb4 process_one_work+0x200/0x3a0 worker_thread+0x260/0x4c0 kthread+0xd4/0xe0 ret_from_fork+0x10/0x20 Code: 9409e911 937b7e60 8b0002a0 f9405800 (f9401005) ---[ end trace 0000000000000000 ]--- Reported-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Daniel Golle <daniel@makrotopia.org> Fixes: `cbd1f243bc` ("net: dsa: mt7530: partially convert to phylink_pcs") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/E1nj6FW-007WZB-5Y@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-26 17:08:31 -07:00
Eric Dumazet	68822bdf76	net: generalize skb freeing deferral to per-cpu lists Logic added in commit `f35f821935` ("tcp: defer skb freeing after socket lock is released") helped bulk TCP flows to move the cost of skbs frees outside of critical section where socket lock was held. But for RPC traffic, or hosts with RFS enabled, the solution is far from being ideal. For RPC traffic, recvmsg() has to return to user space right after skb payload has been consumed, meaning that BH handler has no chance to pick the skb before recvmsg() thread. This issue is more visible with BIG TCP, as more RPC fit one skb. For RFS, even if BH handler picks the skbs, they are still picked from the cpu on which user thread is running. Ideally, it is better to free the skbs (and associated page frags) on the cpu that originally allocated them. This patch removes the per socket anchor (sk->defer_list) and instead uses a per-cpu list, which will hold more skbs per round. This new per-cpu list is drained at the end of net_action_rx(), after incoming packets have been processed, to lower latencies. In normal conditions, skbs are added to the per-cpu list with no further action. In the (unlikely) cases where the cpu does not run net_action_rx() handler fast enough, we use an IPI to raise NET_RX_SOFTIRQ on the remote cpu. Also, we do not bother draining the per-cpu list from dev_cpu_dead() This is because skbs in this list have no requirement on how fast they should be freed. Note that we can add in the future a small per-cpu cache if we see any contention on sd->defer_lock. Tested on a pair of hosts with 100Gbit NIC, RFS enabled, and /proc/sys/net/ipv4/tcp_rmem[2] tuned to 16MB to work around page recycling strategy used by NIC driver (its page pool capacity being too small compared to number of skbs/pages held in sockets receive queues) Note that this tuning was only done to demonstrate worse conditions for skb freeing for this particular test. These conditions can happen in more general production workload. 10 runs of one TCP_STREAM flow Before: Average throughput: 49685 Mbit. Kernel profiles on cpu running user thread recvmsg() show high cost for skb freeing related functions () 57.81% [kernel] [k] copy_user_enhanced_fast_string () 12.87% [kernel] [k] skb_release_data () 4.25% [kernel] [k] __free_one_page () 3.57% [kernel] [k] __list_del_entry_valid 1.85% [kernel] [k] __netif_receive_skb_core 1.60% [kernel] [k] __skb_datagram_iter () 1.59% [kernel] [k] free_unref_page_commit () 1.16% [kernel] [k] __slab_free 1.16% [kernel] [k] _copy_to_iter () 1.01% [kernel] [k] kfree () 0.88% [kernel] [k] free_unref_page 0.57% [kernel] [k] ip6_rcv_core 0.55% [kernel] [k] ip6t_do_table 0.54% [kernel] [k] flush_smp_call_function_queue () 0.54% [kernel] [k] free_pcppages_bulk 0.51% [kernel] [k] llist_reverse_order 0.38% [kernel] [k] process_backlog () 0.38% [kernel] [k] free_pcp_prepare 0.37% [kernel] [k] tcp_recvmsg_locked () 0.37% [kernel] [k] __list_add_valid 0.34% [kernel] [k] sock_rfree 0.34% [kernel] [k] _raw_spin_lock_irq () 0.33% [kernel] [k] __page_cache_release 0.33% [kernel] [k] tcp_v6_rcv () 0.33% [kernel] [k] __put_page () 0.29% [kernel] [k] __mod_zone_page_state 0.27% [kernel] [k] _raw_spin_lock After patch: Average throughput: 73076 Mbit. Kernel profiles on cpu running user thread recvmsg() looks better: 81.35% [kernel] [k] copy_user_enhanced_fast_string 1.95% [kernel] [k] _copy_to_iter 1.95% [kernel] [k] __skb_datagram_iter 1.27% [kernel] [k] __netif_receive_skb_core 1.03% [kernel] [k] ip6t_do_table 0.60% [kernel] [k] sock_rfree 0.50% [kernel] [k] tcp_v6_rcv 0.47% [kernel] [k] ip6_rcv_core 0.45% [kernel] [k] read_tsc 0.44% [kernel] [k] _raw_spin_lock_irqsave 0.37% [kernel] [k] _raw_spin_lock 0.37% [kernel] [k] native_irq_return_iret 0.33% [kernel] [k] __inet6_lookup_established 0.31% [kernel] [k] ip6_protocol_deliver_rcu 0.29% [kernel] [k] tcp_rcv_established 0.29% [kernel] [k] llist_reverse_order v2: kdoc issue (kernel bots) do not defer if (alloc_cpu == smp_processor_id()) (Paolo) replace the sk_buff_head with a single-linked list (Jakub) add a READ_ONCE()/WRITE_ONCE() for the lockless read of sd->defer_list Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20220422201237.416238-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-26 17:05:59 -07:00
Ethan Yang	561215482c	net: usb: qmi_wwan: add support for Sierra Wireless EM7590 add support for Sierra Wireless EM7590 0xc081 composition. Signed-off-by: Ethan Yang <etyang@sierrawireless.com> Acked-by: Bjørn Mork <bjorn@mork.no> Link: https://lore.kernel.org/r/20220425054028.5444-1-etyang@sierrawireless.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-04-26 11:32:35 +02:00
Hangbin Liu	dfed913e8b	net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO Currently, the kernel drops GSO VLAN tagged packet if it's created with socket(AF_PACKET, SOCK_RAW, 0) plus virtio_net_hdr. The reason is AF_PACKET doesn't adjust the skb network header if there is a VLAN tag. Then after virtio_net_hdr_set_proto() called, the skb->protocol will be set to ETH_P_IP/IPv6. And in later inet/ipv6_gso_segment() the skb is dropped as network header position is invalid. Let's handle VLAN packets by adjusting network header position in packet_parse_headers(). The adjustment is safe and does not affect the later xmit as tap device also did that. In packet_snd(), packet_parse_headers() need to be moved before calling virtio_net_hdr_set_proto(), so we can set correct skb->protocol and network header first. There is no need to update tpacket_snd() as it calls packet_parse_headers() in tpacket_fill_skb(), which is already before calling virtio_net_hdr_* functions. skb->no_fcs setting is also moved upper to make all skb settings together and keep consistency with function packet_sendmsg_spkt(). Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20220425014502.985464-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-04-26 11:07:45 +02:00
Arun Ramadoss	de6dd626d7	net: dsa: ksz: added the generic port_stp_state_set function The ksz8795 and ksz9477 uses the same algorithm for the port_stp_state_set function except the register address is different. So moved the algorithm to the ksz_common.c and used the dev_ops for register read and write. This function can also used for the lan937x part. Hence making it generic for all the parts. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220424112831.11504-1-arun.ramadoss@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-04-26 10:19:34 +02:00
Arun Ramadoss	fb0a43f5bd	net: phy: LAN937x: add interrupt support for link detection Added the config_intr and handle_interrupt for the LAN937x phy which is same as the LAN87xx phy. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220423154727.29052-1-arun.ramadoss@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-04-26 10:08:59 +02:00
Tetsuo Handa	cc271ab866	wwan_hwsim: Avoid flush_scheduled_work() usage Flushing system-wide workqueues is dangerous and will be forbidden. Replace system_wq with local wwan_wq. While we are at it, make err_clean_devs: label of wwan_hwsim_init() behave like wwan_hwsim_exit(), for it is theoretically possible to call wwan_hwsim_debugfs_devcreate_write()/wwan_hwsim_debugfs_devdestroy_write() by the moment wwan_hwsim_init_devs() returns. Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Link: https://lore.kernel.org/r/7390d51f-60e2-3cee-5277-b819a55ceabe@I-love.SAKURA.ne.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-25 12:10:24 -07:00
Marcin Wojtas	df1cc21152	net: dsa: remove unused headers Reduce a number of included headers to a necessary minimum. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-25 12:08:40 +01:00
Yajun Deng	b0e653b2a0	arp: fix unused variable warnning when CONFIG_PROC_FS=n net/ipv4/arp.c:1412:36: warning: unused variable 'arp_seq_ops' [-Wunused-const-variable] Add #ifdef CONFIG_PROC_FS for 'arp_seq_ops'. Fixes: `e968b1b3e9` ("arp: Remove #ifdef CONFIG_PROC_FS") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-25 11:50:18 +01:00
Alex Elder	c5794097b2	net: ipa: compute proper aggregation limit The aggregation byte limit for an endpoint is currently computed based on the endpoint's receive buffer size. However, some bytes at the front of each receive buffer are reserved on the assumption that--as with SKBs--it might be useful to insert data (such as headers) before what lands in the buffer. The aggregation byte limit currently doesn't take into account that reserved space, and as a result, aggregation could require space past that which is available in the buffer. Fix this by reducing the size used to compute the aggregation byte limit by the NET_SKB_PAD offset reserved for each receive buffer. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-25 11:27:56 +01:00
Dan Carpenter	a00e41bf2f	net: ethernet: mtk_eth_soc: add check for allocation failure Check if the kzalloc() failed. Fixes: `804775dfc2` ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-25 11:26:30 +01:00
Yang Yingliang	60d78e9fce	ethernet: broadcom/sb1250-mac: remove BUG_ON in sbmac_probe() Replace the BUG_ON() with returning error code to handle the fault more gracefully. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-25 11:24:41 +01:00
Haowen Bai	985e254c73	net: mscc: ocelot: Remove useless code payload only memset but no use at all, so we drop them. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-25 11:19:16 +01:00
David S. Miller	5e927a9f4b	Merge branch 'mlxsw-line-card-model' Ido Schimmel says: ==================== mlxsw: extend line card model by devices and info Jiri says: This patchset is extending the line card model by three items: 1) line card devices 2) line card info 3) line card device info First three patches are introducing the necessary changes in devlink core. Then, all three extensions are implemented in mlxsw alongside with selftest. Examples: $ devlink lc show pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 state active type 16x100G supported_types: 16x100G devices: device 0 device 1 device 2 device 3 $ devlink lc info pci/0000:01:00.0 lc 8 pci/0000:01:00.0: lc 8 versions: fixed: hw.revision 0 running: ini.version 4 devices: device 0 versions: running: fw 19.2010.1310 device 1 versions: running: fw 19.2010.1310 device 2 versions: running: fw 19.2010.1310 device 3 versions: running: fw 19.2010.1310 Note that device FW flashing is going to be implemented in the follow-up patchset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-25 10:42:29 +01:00
Jiri Pirko	002defd576	selftests: mlxsw: Check device info on activated line card Once line card is activated, check the device FW version is exposed. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-25 10:42:29 +01:00
Jiri Pirko	e932b4bdbd	mlxsw: core_linecards: Expose device FW version over device info Extend MDDQ to obtain FW version of line card device and implement device_info_get() op to fill up the info with that. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-25 10:42:29 +01:00

1 2 3 4 5 ...

1090025 Commits