Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:
 "Highlights:

   1) Steady transitioning of the BPF instructure to a generic spot so
      all kernel subsystems can make use of it, from Alexei Starovoitov.

   2) SFC driver supports busy polling, from Alexandre Rames.

   3) Take advantage of hash table in UDP multicast delivery, from David
      Held.

   4) Lighten locking, in particular by getting rid of the LRU lists, in
      inet frag handling.  From Florian Westphal.

   5) Add support for various RFC6458 control messages in SCTP, from
      Geir Ola Vaagland.

   6) Allow to filter bridge forwarding database dumps by device, from
      Jamal Hadi Salim.

   7) virtio-net also now supports busy polling, from Jason Wang.

   8) Some low level optimization tweaks in pktgen from Jesper Dangaard
      Brouer.

   9) Add support for ipv6 address generation modes, so that userland
      can have some input into the process.  From Jiri Pirko.

  10) Consolidate common TCP connection request code in ipv4 and ipv6,
      from Octavian Purdila.

  11) New ARP packet logger in netfilter, from Pablo Neira Ayuso.

  12) Generic resizable RCU hash table, with intial users in netlink and
      nftables.  From Thomas Graf.

  13) Maintain a name assignment type so that userspace can see where a
      network device name came from (enumerated by kernel, assigned
      explicitly by userspace, etc.) From Tom Gundersen.

  14) Automatic flow label generation on transmit in ipv6, from Tom
      Herbert.

  15) New packet timestamping facilities from Willem de Bruijn, meant to
      assist in measuring latencies going into/out-of the packet
      scheduler, latency from TCP data transmission to ACK, etc"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits)
  cxgb4 : Disable recursive mailbox commands when enabling vi
  net: reduce USB network driver config options.
  tg3: Modify tg3_tso_bug() to handle multiple TX rings
  amd-xgbe: Perform phy connect/disconnect at dev open/stop
  amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask
  net: sun4i-emac: fix memory leak on bad packet
  sctp: fix possible seqlock seadlock in sctp_packet_transmit()
  Revert "net: phy: Set the driver when registering an MDIO bus device"
  cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine
  team: Simplify return path of team_newlink
  bridge: Update outdated comment on promiscuous mode
  net-timestamp: ACK timestamp for bytestreams
  net-timestamp: TCP timestamping
  net-timestamp: SCHED timestamp on entering packet scheduler
  net-timestamp: add key to disambiguate concurrent datagrams
  net-timestamp: move timestamp flags out of sk_flags
  net-timestamp: extend SCM_TIMESTAMPING ancillary data struct
  cxgb4i : Move stray CPL definitions to cxgb4 driver
  tcp: reduce spurious retransmits due to transient SACK reneging
  qlcnic: Initialize dcbnl_ops before register_netdev
  ...
This commit is contained in:
Linus Torvalds
2014-08-06 09:38:14 -07:00
1159 changed files with 71116 additions and 31018 deletions
+11
View File
@@ -1,3 +1,14 @@
What: /sys/class/net/<iface>/name_assign_type
Date: July 2014
KernelVersion: 3.17
Contact: netdev@vger.kernel.org
Description:
Indicates the name assignment type. Possible values are:
1: enumerated by the kernel, possibly in an unpredictable way
2: predictably named by the kernel
3: named by userspace
4: renamed
What: /sys/class/net/<iface>/addr_assign_type
Date: July 2010
KernelVersion: 3.2
@@ -8,10 +8,16 @@ Required properties:
- SerDes integration registers (1/2)
- SerDes integration registers (2/2)
Optional properties:
- amd,speed-set: Speed capabilities of the device
0 - 1GbE and 10GbE (default)
1 - 2.5GbE and 10GbE
Example:
xgbe_phy@e1240800 {
compatible = "amd,xgbe-phy-seattle-v1a", "ethernet-phy-ieee802.3-c45";
reg = <0 0xe1240800 0 0x00400>,
<0 0xe1250000 0 0x00060>,
<0 0xe1250080 0 0x00004>;
amd,speed-set = <0>;
};
@@ -8,16 +8,21 @@ Required properties:
- interrupt-parent: Should be the phandle for the interrupt controller
that services interrupts for this device
- interrupts: Should contain the amd-xgbe interrupt
- clocks: Should be the DMA clock for the amd-xgbe device (used for
calculating the correct Rx interrupt watchdog timer value on a DMA
channel for coalescing)
- clock-names: Should be the name of the DMA clock, "dma_clk"
- clocks:
- DMA clock for the amd-xgbe device (used for calculating the
correct Rx interrupt watchdog timer value on a DMA channel
for coalescing)
- PTP clock for the amd-xgbe device
- clock-names: Should be the names of the clocks
- "dma_clk" for the DMA clock
- "ptp_clk" for the PTP clock
- phy-handle: See ethernet.txt file in the same directory
- phy-mode: See ethernet.txt file in the same directory
Optional properties:
- mac-address: mac address to be assigned to the device. Can be overridden
by UEFI.
- dma-coherent: Present if dma operations are coherent
Example:
xgbe@e0700000 {
@@ -26,8 +31,8 @@ Example:
<0 0xe0780000 0 0x80000>;
interrupt-parent = <&gic>;
interrupts = <0 325 4>;
clocks = <&xgbe_clk>;
clock-names = "dma_clk";
clocks = <&xgbe_dma_clk>, <&xgbe_ptp_clk>;
clock-names = "dma_clk", "ptp_clk";
phy-handle = <&phy>;
phy-mode = "xgmii";
mac-address = [ 02 a1 a2 a3 a4 a5 ];
@@ -4,7 +4,8 @@ Required properties:
- compatible: should be one of "brcm,systemport-v1.00" or "brcm,systemport"
- reg: address and length of the register set for the device.
- interrupts: interrupts for the device, first cell must be for the the rx
interrupts, and the second cell should be for the transmit queues
interrupts, and the second cell should be for the transmit queues. An
optional third interrupt cell for Wake-on-LAN can be specified
- local-mac-address: Ethernet MAC address (48 bits) of this adapter
- phy-mode: Should be a string describing the PHY interface to the
Ethernet switch/PHY, see Documentation/devicetree/bindings/net/ethernet.txt
@@ -1,8 +1,8 @@
TI SoC Davinci MDIO Controller Device Tree Bindings
TI SoC Davinci/Keystone2 MDIO Controller Device Tree Bindings
---------------------------------------------------
Required properties:
- compatible : Should be "ti,davinci_mdio"
- compatible : Should be "ti,davinci_mdio" or "ti,keystone_mdio"
- reg : physical base address and size of the davinci mdio
registers map
- bus_freq : Mdio Bus frequency
@@ -19,7 +19,7 @@ file.
Examples:
mdio: davinci_mdio@4A101000 {
compatible = "ti,cpsw";
compatible = "ti,davinci_mdio";
reg = <0x4A101000 0x1000>;
bus_freq = <1000000>;
};
@@ -27,7 +27,7 @@ Examples:
(or)
mdio: davinci_mdio@4A101000 {
compatible = "ti,cpsw";
compatible = "ti,davinci_mdio";
ti,hwmods = "davinci_mdio";
bus_freq = <1000000>;
};
@@ -0,0 +1,29 @@
*CC2520 IEEE 802.15.4 Compatible Radio*
Required properties:
- compatible: should be "ti,cc2520"
- spi-max-frequency: maximal bus speed (8000000), should be set to 4000000 depends
sync or async operation mode
- reg: the chipselect index
- pinctrl-0: pin control group to be used for this controller.
- pinctrl-names: must contain a "default" entry.
- fifo-gpio: GPIO spec for the FIFO pin
- fifop-gpio: GPIO spec for the FIFOP pin
- sfd-gpio: GPIO spec for the SFD pin
- cca-gpio: GPIO spec for the CCA pin
- vreg-gpio: GPIO spec for the VREG pin
- reset-gpio: GPIO spec for the RESET pin
Example:
cc2520@0 {
compatible = "ti,cc2520";
reg = <0>;
spi-max-frequency = <4000000>;
pinctrl-names = "default";
pinctrl-0 = <&cc2520_cape_pins>;
fifo-gpio = <&gpio1 18 0>;
fifop-gpio = <&gpio1 19 0>;
sfd-gpio = <&gpio1 13 0>;
cca-gpio = <&gpio1 16 0>;
vreg-gpio = <&gpio0 31 0>;
reset-gpio = <&gpio1 12 0>;
};
@@ -0,0 +1,61 @@
* Marvell Armada 375 Ethernet Controller (PPv2)
Required properties:
- compatible: should be "marvell,armada-375-pp2"
- reg: addresses and length of the register sets for the device.
Must contain the following register sets:
- common controller registers
- LMS registers
In addition, at least one port register set is required.
- clocks: a pointer to the reference clocks for this device, consequently:
- main controller clock
- GOP clock
- clock-names: names of used clocks, must be "pp_clk" and "gop_clk".
The ethernet ports are represented by subnodes. At least one port is
required.
Required properties (port):
- interrupts: interrupt for the port
- port-id: should be '0' or '1' for ethernet ports, and '2' for the
loopback port
- phy-mode: See ethernet.txt file in the same directory
Optional properties (port):
- marvell,loopback: port is loopback mode
- phy: a phandle to a phy node defining the PHY address (as the reg
property, a single integer). Note: if this property isn't present,
then fixed link is assumed, and the 'fixed-link' property is
mandatory.
Example:
ethernet@f0000 {
compatible = "marvell,armada-375-pp2";
reg = <0xf0000 0xa000>,
<0xc0000 0x3060>,
<0xc4000 0x100>,
<0xc5000 0x100>;
clocks = <&gateclk 3>, <&gateclk 19>;
clock-names = "pp_clk", "gop_clk";
status = "okay";
eth0: eth0@c4000 {
interrupts = <GIC_SPI 37 IRQ_TYPE_LEVEL_HIGH>;
port-id = <0>;
status = "okay";
phy = <&phy0>;
phy-mode = "gmii";
};
eth1: eth1@c5000 {
interrupts = <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>;
port-id = <1>;
status = "okay";
phy = <&phy3>;
phy-mode = "gmii";
};
};
@@ -0,0 +1,33 @@
* STMicroelectronics SAS. ST21NFCB NFC Controller
Required properties:
- compatible: Should be "st,st21nfcb_i2c".
- clock-frequency: I²C work frequency.
- reg: address on the bus
- interrupt-parent: phandle for the interrupt gpio controller
- interrupts: GPIO interrupt to which the chip is connected
- reset-gpios: Output GPIO pin used to reset the ST21NFCB
Optional SoC Specific Properties:
- pinctrl-names: Contains only one value - "default".
- pintctrl-0: Specifies the pin control groups used for this controller.
Example (for ARM-based BeagleBoard xM with ST21NFCB on I2C2):
&i2c2 {
status = "okay";
st21nfcb: st21nfcb@8 {
compatible = "st,st21nfcb_i2c";
reg = <0x08>;
clock-frequency = <400000>;
interrupt-parent = <&gpio5>;
interrupts = <2 IRQ_TYPE_LEVEL_LOW>;
reset-gpios = <&gpio5 29 GPIO_ACTIVE_HIGH>;
};
};
@@ -9,6 +9,7 @@ Required properties:
"renesas,ether-r8a7779" if the device is a part of R8A7779 SoC.
"renesas,ether-r8a7790" if the device is a part of R8A7790 SoC.
"renesas,ether-r8a7791" if the device is a part of R8A7791 SoC.
"renesas,ether-r8a7794" if the device is a part of R8A7794 SoC.
"renesas,ether-r7s72100" if the device is a part of R7S72100 SoC.
- reg: offset and length of (1) the E-DMAC/feLic register block (required),
(2) the TSU register block (optional).
@@ -25,6 +25,10 @@ Required properties:
- snps,force_sf_dma_mode Force DMA to use the Store and Forward
mode for both tx and rx. This flag is
ignored if force_thresh_dma_mode is set.
- snps,multicast-filter-bins: Number of multicast filter hash bins
supported by this device instance
- snps,perfect-filter-entries: Number of perfect filter entries supported
by this device instance
Optional properties:
- resets: Should contain a phandle to the STMMAC reset signal, if any
@@ -47,6 +51,8 @@ Examples:
mac-address = [000000000000]; /* Filled in by U-Boot */
max-frame-size = <3800>;
phy-mode = "gmii";
snps,multicast-filter-bins = <256>;
snps,perfect-filter-entries = <128>;
clocks = <&clock>;
clock-names = "stmmaceth">;
};
@@ -0,0 +1,41 @@
Broadcom BCM43xx Fullmac wireless SDIO devices
This node provides properties for controlling the Broadcom wireless device. The
node is expected to be specified as a child node to the SDIO controller that
connects the device to the system.
Required properties:
- compatible : Should be "brcm,bcm4329-fmac".
Optional properties:
- brcm,drive-strength : drive strength used for SDIO pins on device in mA
(default = 6).
- interrupt-parent : the phandle for the interrupt controller to which the
device interrupts are connected.
- interrupts : specifies attributes for the out-of-band interrupt (host-wake).
When not specified the device will use in-band SDIO interrupts.
- interrupt-names : name of the out-of-band interrupt, which must be set
to "host-wake".
Example:
mmc3: mmc@01c12000 {
#address-cells = <1>;
#size-cells = <0>;
pinctrl-names = "default";
pinctrl-0 = <&mmc3_pins_a>;
vmmc-supply = <&reg_vmmc3>;
bus-width = <4>;
non-removable;
status = "okay";
brcmf: bcrmf@1 {
reg = <1>;
compatible = "brcm,bcm4329-fmac";
interrupt-parent = <&pio>;
interrupts = <10 8>; /* PH10 / EINT10 */
interrupt-names = "host-wake";
};
};
+16 -15
View File
@@ -542,10 +542,10 @@ mode
XOR policy: Transmit based on the selected transmit
hash policy. The default policy is a simple [(source
MAC address XOR'd with destination MAC address) modulo
slave count]. Alternate transmit policies may be
selected via the xmit_hash_policy option, described
below.
MAC address XOR'd with destination MAC address XOR
packet type ID) modulo slave count]. Alternate transmit
policies may be selected via the xmit_hash_policy option,
described below.
This mode provides load balancing and fault tolerance.
@@ -801,10 +801,11 @@ xmit_hash_policy
layer2
Uses XOR of hardware MAC addresses to generate the
hash. The formula is
Uses XOR of hardware MAC addresses and packet type ID
field to generate the hash. The formula is
(source MAC XOR destination MAC) modulo slave count
hash = source MAC XOR destination MAC XOR packet type ID
slave number = hash modulo slave count
This algorithm will place all traffic to a particular
network peer on the same slave.
@@ -819,7 +820,7 @@ xmit_hash_policy
Uses XOR of hardware MAC addresses and IP addresses to
generate the hash. The formula is
hash = source MAC XOR destination MAC
hash = source MAC XOR destination MAC XOR packet type ID
hash = hash XOR source IP XOR destination IP
hash = hash XOR (hash RSHIFT 16)
hash = hash XOR (hash RSHIFT 8)
@@ -2301,13 +2302,13 @@ broadcast: Like active-backup, there is not much advantage to this
bandwidth.
Additionally, the linux bonding 802.3ad implementation
distributes traffic by peer (using an XOR of MAC addresses),
so in a "gatewayed" configuration, all outgoing traffic will
generally use the same device. Incoming traffic may also end
up on a single device, but that is dependent upon the
balancing policy of the peer's 8023.ad implementation. In a
"local" configuration, traffic will be distributed across the
devices in the bond.
distributes traffic by peer (using an XOR of MAC addresses
and packet type ID), so in a "gatewayed" configuration, all
outgoing traffic will generally use the same device. Incoming
traffic may also end up on a single device, but that is
dependent upon the balancing policy of the peer's 8023.ad
implementation. In a "local" configuration, traffic will be
distributed across the devices in the bond.
Finally, the 802.3ad mode mandates the use of the MII monitor,
therefore, the ARP monitor is not available in this mode.
+6 -6
View File
@@ -586,12 +586,12 @@ team driver's classifier for its load-balancing mode, netfilter's xt_bpf
extension, PTP dissector/classifier, and much more. They are all internally
converted by the kernel into the new instruction set representation and run
in the eBPF interpreter. For in-kernel handlers, this all works transparently
by using sk_unattached_filter_create() for setting up the filter, resp.
sk_unattached_filter_destroy() for destroying it. The macro
SK_RUN_FILTER(filter, ctx) transparently invokes eBPF interpreter or JITed
code to run the filter. 'filter' is a pointer to struct sk_filter that we
got from sk_unattached_filter_create(), and 'ctx' the given context (e.g.
skb pointer). All constraints and restrictions from sk_chk_filter() apply
by using bpf_prog_create() for setting up the filter, resp.
bpf_prog_destroy() for destroying it. The macro
BPF_PROG_RUN(filter, ctx) transparently invokes eBPF interpreter or JITed
code to run the filter. 'filter' is a pointer to struct bpf_prog that we
got from bpf_prog_create(), and 'ctx' the given context (e.g.
skb pointer). All constraints and restrictions from bpf_check_classic() apply
before a conversion to the new layout is being done behind the scenes!
Currently, the classic BPF format is being used for JITing on most of the
+5 -2
View File
@@ -69,8 +69,11 @@ Additional Configurations
FCoE
----
Fiber Channel over Ethernet (FCoE) hardware offload is not currently
supported.
The driver supports Fiber Channel over Ethernet (FCoE) and Data Center
Bridging (DCB) functionality. Configuring DCB and FCoE is outside the scope
of this driver doc. Refer to http://www.open-fcoe.org/ for FCoE project
information and http://www.open-lldp.org/ or email list
e1000-eedc@lists.sourceforge.net for DCB information.
MAC and VLAN anti-spoofing feature
----------------------------------
+26 -12
View File
@@ -101,19 +101,17 @@ ipfrag_high_thresh - INTEGER
Maximum memory used to reassemble IP fragments. When
ipfrag_high_thresh bytes of memory is allocated for this purpose,
the fragment handler will toss packets until ipfrag_low_thresh
is reached.
is reached. This also serves as a maximum limit to namespaces
different from the initial one.
ipfrag_low_thresh - INTEGER
See ipfrag_high_thresh
Maximum memory used to reassemble IP fragments before the kernel
begins to remove incomplete fragment queues to free up resources.
The kernel still accepts new fragments for defragmentation.
ipfrag_time - INTEGER
Time in seconds to keep an IP fragment in memory.
ipfrag_secret_interval - INTEGER
Regeneration interval (in seconds) of the hash secret (or lifetime
for the hash secret) for IP fragments.
Default: 600
ipfrag_max_dist - INTEGER
ipfrag_max_dist is a non-negative integer value which defines the
maximum "disorder" which is allowed among fragments which share a
@@ -1132,6 +1130,15 @@ flowlabel_consistency - BOOLEAN
FALSE: disabled
Default: TRUE
auto_flowlabels - BOOLEAN
Automatically generate flow labels based based on a flow hash
of the packet. This allows intermediate devices, such as routers,
to idenfify packet flows for mechanisms like Equal Cost Multipath
Routing (see RFC 6438).
TRUE: enabled
FALSE: disabled
Default: false
anycast_src_echo_reply - BOOLEAN
Controls the use of anycast addresses as source addresses for ICMPv6
echo reply
@@ -1153,11 +1160,6 @@ ip6frag_low_thresh - INTEGER
ip6frag_time - INTEGER
Time in seconds to keep an IPv6 fragment in memory.
ip6frag_secret_interval - INTEGER
Regeneration interval (in seconds) of the hash secret (or lifetime
for the hash secret) for IPv6 fragments.
Default: 600
conf/default/*:
Change the interface-specific default settings.
@@ -1210,6 +1212,18 @@ accept_ra_defrtr - BOOLEAN
Functional default: enabled if accept_ra is enabled.
disabled if accept_ra is disabled.
accept_ra_from_local - BOOLEAN
Accept RA with source-address that is found on local machine
if the RA is otherwise proper and able to be accepted.
Default is to NOT accept these as it may be an un-intended
network loop.
Functional default:
enabled if accept_ra_from_local is enabled
on a specific interface.
disabled if accept_ra_from_local is disabled
on a specific interface.
accept_ra_pinfo - BOOLEAN
Learn Prefix Information in Router Advertisement.
+5 -11
View File
@@ -1008,14 +1008,9 @@ hardware timestamps to be used. Note: you may need to enable the generation
of hardware timestamps with SIOCSHWTSTAMP (see related information from
Documentation/networking/timestamping.txt).
PACKET_TIMESTAMP accepts the same integer bit field as
SO_TIMESTAMPING. However, only the SOF_TIMESTAMPING_SYS_HARDWARE
and SOF_TIMESTAMPING_RAW_HARDWARE values are recognized by
PACKET_TIMESTAMP. SOF_TIMESTAMPING_SYS_HARDWARE takes precedence over
SOF_TIMESTAMPING_RAW_HARDWARE if both bits are set.
PACKET_TIMESTAMP accepts the same integer bit field as SO_TIMESTAMPING:
int req = 0;
req |= SOF_TIMESTAMPING_SYS_HARDWARE;
int req = SOF_TIMESTAMPING_RAW_HARDWARE;
setsockopt(fd, SOL_PACKET, PACKET_TIMESTAMP, (void *) &req, sizeof(req))
For the mmap(2)ed ring buffers, such timestamps are stored in the
@@ -1023,14 +1018,13 @@ tpacket{,2,3}_hdr structure's tp_sec and tp_{n,u}sec members. To determine
what kind of timestamp has been reported, the tp_status field is binary |'ed
with the following possible bits ...
TP_STATUS_TS_SYS_HARDWARE
TP_STATUS_TS_RAW_HARDWARE
TP_STATUS_TS_SOFTWARE
... that are equivalent to its SOF_TIMESTAMPING_* counterparts. For the
RX_RING, if none of those 3 are set (i.e. PACKET_TIMESTAMP is not set),
then this means that a software fallback was invoked *within* PF_PACKET's
processing code (less precise).
RX_RING, if neither is set (i.e. PACKET_TIMESTAMP is not set), then a
software fallback was invoked *within* PF_PACKET's processing code (less
precise).
Getting timestamps for the TX_RING works as follows: i) fill the ring frames,
ii) call sendto() e.g. in blocking mode, iii) wait for status of relevant
+17 -1
View File
@@ -272,6 +272,8 @@ Writing a PHY driver
txtsamp: Requests a transmit timestamp at the PHY level for a 'skb'
set_wol: Enable Wake-on-LAN at the PHY level
get_wol: Get the Wake-on-LAN status at the PHY level
read_mmd_indirect: Read PHY MMD indirect register
write_mmd_indirect: Write PHY MMD indirect register
Of these, only config_aneg and read_status are required to be
assigned by the driver code. The rest are optional. Also, it is
@@ -284,7 +286,21 @@ Writing a PHY driver
Feel free to look at the Marvell, Cicada, and Davicom drivers in
drivers/net/phy/ for examples (the lxt and qsemi drivers have
not been tested as of this writing)
not been tested as of this writing).
The PHY's MMD register accesses are handled by the PAL framework
by default, but can be overridden by a specific PHY driver if
required. This could be the case if a PHY was released for
manufacturing before the MMD PHY register definitions were
standardized by the IEEE. Most modern PHYs will be able to use
the generic PAL framework for accessing the PHY's MMD registers.
An example of such usage is for Energy Efficient Ethernet support,
implemented in the PAL. This support uses the PAL to access MMD
registers for EEE query and configuration if the PHY supports
the IEEE standard access mechanisms, or can use the PHY's specific
access interfaces if overridden by the specific PHY driver. See
the Micrel driver in drivers/net/phy/ for an example of how this
can be implemented.
Board Fixups
+28
View File
@@ -24,6 +24,34 @@ For monitoring and control pktgen creates:
/proc/net/pktgen/ethX
Tuning NIC for max performance
==============================
The default NIC setting are (likely) not tuned for pktgen's artificial
overload type of benchmarking, as this could hurt the normal use-case.
Specifically increasing the TX ring buffer in the NIC:
# ethtool -G ethX tx 1024
A larger TX ring can improve pktgen's performance, while it can hurt
in the general case, 1) because the TX ring buffer might get larger
than the CPUs L1/L2 cache, 2) because it allow more queueing in the
NIC HW layer (which is bad for bufferbloat).
One should be careful to conclude, that packets/descriptors in the HW
TX ring cause delay. Drivers usually delay cleaning up the
ring-buffers (for various performance reasons), thus packets stalling
the TX ring, might just be waiting for cleanup.
This cleanup issues is specifically the case, for the driver ixgbe
(Intel 82599 chip). This driver (ixgbe) combine TX+RX ring cleanups,
and the cleanup interval is affected by the ethtool --coalesce setting
of parameter "rx-usecs".
For ixgbe use e.g "30" resulting in approx 33K interrupts/sec (1/30*10^6):
# ethtool -C ethX rx-usecs 30
Viewing threads
===============
/proc/net/pktgen/kpktgend_0
+7 -9
View File
@@ -40,7 +40,7 @@ the set bits correspond to data that is available, then the control
message will not be generated:
SOF_TIMESTAMPING_SOFTWARE: report systime if available
SOF_TIMESTAMPING_SYS_HARDWARE: report hwtimetrans if available
SOF_TIMESTAMPING_SYS_HARDWARE: report hwtimetrans if available (deprecated)
SOF_TIMESTAMPING_RAW_HARDWARE: report hwtimeraw if available
It is worth noting that timestamps may be collected for reasons other
@@ -88,13 +88,12 @@ hwtimeraw is the original hardware time stamp. Filled in if
SOF_TIMESTAMPING_RAW_HARDWARE is set. No assumptions about its
relation to system time should be made.
hwtimetrans is the hardware time stamp transformed so that it
corresponds as good as possible to system time. This correlation is
not perfect; as a consequence, sorting packets received via different
NICs by their hwtimetrans may differ from the order in which they were
received. hwtimetrans may be non-monotonic even for the same NIC.
Filled in if SOF_TIMESTAMPING_SYS_HARDWARE is set. Requires support
by the network device and will be empty without that support.
hwtimetrans is always zero. This field is deprecated. It used to hold
hw timestamps converted to system time. Instead, expose the hardware
clock device on the NIC directly as a HW PTP clock source, to allow
time conversion in userspace and optionally synchronize system time
with a userspace PTP stack such as linuxptp. For the PTP clock API,
see Documentation/ptp/ptp.txt.
SIOCSHWTSTAMP, SIOCGHWTSTAMP:
@@ -185,7 +184,6 @@ struct skb_shared_hwtstamps {
* since arbitrary point in time
*/
ktime_t hwtstamp;
ktime_t syststamp; /* hwtstamp transformed to system time base */
};
Time stamps for outgoing packets are to be generated as follows:
@@ -76,7 +76,6 @@ static void usage(const char *error)
" SOF_TIMESTAMPING_RX_HARDWARE - hardware time stamping of incoming packets\n"
" SOF_TIMESTAMPING_RX_SOFTWARE - software fallback for incoming packets\n"
" SOF_TIMESTAMPING_SOFTWARE - request reporting of software time stamps\n"
" SOF_TIMESTAMPING_SYS_HARDWARE - request reporting of transformed HW time stamps\n"
" SOF_TIMESTAMPING_RAW_HARDWARE - request reporting of raw HW time stamps\n"
" SIOCGSTAMP - check last socket time stamp\n"
" SIOCGSTAMPNS - more accurate socket time stamp\n");
@@ -202,9 +201,7 @@ static void printpacket(struct msghdr *msg, int res,
(long)stamp->tv_sec,
(long)stamp->tv_nsec);
stamp++;
printf("HW transformed %ld.%09ld ",
(long)stamp->tv_sec,
(long)stamp->tv_nsec);
/* skip deprecated HW transformed */
stamp++;
printf("HW raw %ld.%09ld",
(long)stamp->tv_sec,
@@ -361,8 +358,6 @@ int main(int argc, char **argv)
so_timestamping_flags |= SOF_TIMESTAMPING_RX_SOFTWARE;
else if (!strcasecmp(argv[i], "SOF_TIMESTAMPING_SOFTWARE"))
so_timestamping_flags |= SOF_TIMESTAMPING_SOFTWARE;
else if (!strcasecmp(argv[i], "SOF_TIMESTAMPING_SYS_HARDWARE"))
so_timestamping_flags |= SOF_TIMESTAMPING_SYS_HARDWARE;
else if (!strcasecmp(argv[i], "SOF_TIMESTAMPING_RAW_HARDWARE"))
so_timestamping_flags |= SOF_TIMESTAMPING_RAW_HARDWARE;
else

Some files were not shown because too many files have changed in this diff Show More