Commit Graph

34251 Commits

Author SHA1 Message Date
David S. Miller
ea3d1cc285 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull to get the thermal netlink multicast group name fix, otherwise
the assertion added in net-next to netlink to detect that kind of bug
makes systems unbootable for some folks.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-22 12:53:09 -04:00
Florian Fainelli
c3a07134e6 mv643xx_eth: convert to use the Marvell Orion MDIO driver
This patch converts the Marvell MV643XX ethernet driver to use the
Marvell Orion MDIO driver. As a result, PowerPC and ARM platforms
registering the Marvell MV643XX ethernet driver are also updated to
register a Marvell Orion MDIO driver. This driver voluntarily overlaps
with the Marvell Ethernet shared registers because it will use a subset
of this shared register (shared_base + 0x4 to shared_base + 0x84). The
Ethernet driver is also updated to look up for a PHY device using the
Orion MDIO bus driver.

For ARM and PowerPC we register a single instance of the "mvmdio" driver
in the system like it used to be done with the use of the "shared_smi"
platform_data cookie on ARM.

Note that it is safe to register the mvmdio driver only for the "ge00"
instance of the driver because this "ge00" interface is guaranteed to
always be explicitely registered by consumers of
arch/arm/plat-orion/common.c and other instances (ge01, ge10 and ge11)
were all pointing their shared_smi to ge00. For PowerPC the in-tree
Device Tree Source files mention only one MV643XX ethernet MAC instance
so the MDIO bus driver is registered only when id == 0.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-22 10:25:15 -04:00
Rusty Russell
9d0ca6ed6f virtio: remove obsolete virtqueue_get_queue_index()
You can access it directly now, since 3.8: v3.7-rc1-13-g06ca287
'virtio: move queue_index and num_free fields into core struct
virtqueue.'

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-22 10:23:34 -04:00
Daniel Borkmann
79617801ea filter: bpf_jit_comp: refactor and unify BPF JIT image dump output
If bpf_jit_enable > 1, then we dump the emitted JIT compiled image
after creation. Currently, only SPARC and PowerPC has similar output
as in the reference implementation on x86_64. Make a small helper
function in order to reduce duplicated code and make the dump output
uniform across architectures x86_64, SPARC, PPC, ARM (e.g. on ARM
flen, pass and proglen are currently not shown, but would be
interesting to know as well), also for future BPF JIT implementations
on other archs.

Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Eric Dumazet <eric.dumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 17:25:56 -04:00
Yuchung Cheng
e33099f96d tcp: implement RFC5682 F-RTO
This patch implements F-RTO (foward RTO recovery):

When the first retransmission after timeout is acknowledged, F-RTO
sends new data instead of old data. If the next ACK acknowledges
some never-retransmitted data, then the timeout was spurious and the
congestion state is reverted.  Otherwise if the next ACK selectively
acknowledges the new data, then the timeout was genuine and the
loss recovery continues. This idea applies to recurring timeouts
as well. While F-RTO sends different data during timeout recovery,
it does not (and should not) change the congestion control.

The implementaion follows the three steps of SACK enhanced algorithm
(section 3) in RFC5682. Step 1 is in tcp_enter_loss(). Step 2 and
3 are in tcp_process_loss().  The basic version is not supported
because SACK enhanced version also works for non-SACK connections.

The new implementation is functionally in parity with the old F-RTO
implementation except the one case where it increases undo events:
In addition to the RFC algorithm, a spurious timeout may be detected
without sending data in step 2, as long as the SACK confirms not
all the original data are dropped. When this happens, the sender
will undo the cwnd and perhaps enter fast recovery instead. This
additional check increases the F-RTO undo events by 5x compared
to the prior implementation on Google Web servers, since the sender
often does not have new data to send for HTTP.

Note F-RTO may detect spurious timeout before Eifel with timestamps
does so.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 11:47:51 -04:00
Yuchung Cheng
9b44190dc1 tcp: refactor F-RTO
The patch series refactor the F-RTO feature (RFC4138/5682).

This is to simplify the loss recovery processing. Existing F-RTO
was developed during the experimental stage (RFC4138) and has
many experimental features.  It takes a separate code path from
the traditional timeout processing by overloading CA_Disorder
instead of using CA_Loss state. This complicates CA_Disorder state
handling because it's also used for handling dubious ACKs and undos.
While the algorithm in the RFC does not change the congestion control,
the implementation intercepts congestion control in various places
(e.g., frto_cwnd in tcp_ack()).

The new code implements newer F-RTO RFC5682 using CA_Loss processing
path.  F-RTO becomes a small extension in the timeout processing
and interfaces with congestion control and Eifel undo modules.
It lets congestion control (module) determines how many to send
independently.  F-RTO only chooses what to send in order to detect
spurious retranmission. If timeout is found spurious it invokes
existing Eifel undo algorithms like DSACK or TCP timestamp based
detection.

The first patch removes all F-RTO code except the sysctl_tcp_frto is
left for the new implementation.  Since CA_EVENT_FRTO is removed, TCP
westwood now computes ssthresh on regular timeout CA_EVENT_LOSS event.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-21 11:47:50 -04:00
Masatake YAMATO
73214f5d9f thermal: shorten too long mcast group name
The original name is too long.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 17:56:58 -04:00
John W. Linville
5470b462c3 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-03-20 15:24:57 -04:00
Jesper Derehag
2b5faa4c55 connector: Added coredumping event to the process connector
Process connector can now also detect coredumping events.

Main aim of patch is get notified at start of coredumping, instead of
having to wait for it to finish and then being notified through EXIT
event.

Could be used for instance by process-managers that want to get
notified as soon as possible about process failures, and not
necessarily beeing notified after coredump, which could be in the
order of minutes depending on size of coredump, piping and so on.

Signed-off-by: Jesper Derehag <jderehag@hotmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 13:23:21 -04:00
Daniel Borkmann
3e5289d5e3 filter: add ANC_PAY_OFFSET instruction for loading payload start offset
It is very useful to do dynamic truncation of packets. In particular,
we're interested to push the necessary header bytes to the user space and
cut off user payload that should probably not be transferred for some reasons
(e.g. privacy, speed, or others). With the ancillary extension PAY_OFFSET,
we can load it into the accumulator, and return it. E.g. in bpfc syntax ...

        ld #poff        ; { 0x20, 0, 0, 0xfffff034 },
        ret a           ; { 0x16, 0, 0, 0x00000000 },

... as a filter will accomplish this without having to do a big hackery in
a BPF filter itself. Follow-up JIT implementations are welcome.

Thanks to Eric Dumazet for suggesting and discussing this during the
Netfilter Workshop in Copenhagen.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 13:15:45 -04:00
Daniel Borkmann
f77668dc25 net: flow_dissector: add __skb_get_poff to get a start offset to payload
__skb_get_poff() returns the offset to the payload as far as it could
be dissected. The main user is currently BPF, so that we can dynamically
truncate packets without needing to push actual payload to the user
space and instead can analyze headers only.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 13:15:45 -04:00
David S. Miller
61816596d1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull in the 'net' tree to get Daniel Borkmann's flow dissector
infrastructure change.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:46:26 -04:00
Tom Parkin
44046a593e udp: add encap_destroy callback
Users of udp encapsulation currently have an encap_rcv callback which they can
use to hook into the udp receive path.

In situations where a encapsulation user allocates resources associated with a
udp encap socket, it may be convenient to be able to also hook the proto
.destroy operation.  For example, if an encap user holds a reference to the
udp socket, the destroy hook might be used to relinquish this reference.

This patch adds a socket destroy hook into udp, which is set and enabled
in the same way as the existing encap_rcv hook.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20 12:10:38 -04:00
Linus Torvalds
7b1b3fd74e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix ARM BPF JIT handling of negative 'k' values, from Chen Gang.

 2) Insufficient space reserved for bridge netlink values, fix from
    Stephen Hemminger.

 3) Some dst_neigh_lookup*() callers don't interpret error pointer
    correctly, fix from Zhouyi Zhou.

 4) Fix transport match in SCTP active_path loops, from Xugeng Zhang.

 5) Fix qeth driver handling of multi-order SKB frags, from Frank
    Blaschka.

 6) fec driver is missing napi_disable() call, resulting in crashes on
    unload, from Georg Hofmann.

 7) Don't try to handle PMTU events on a listening socket, fix from Eric
    Dumazet.

 8) Fix timestamp location calculations in IP option processing, from
    David Ward.

 9) FIB_TABLE_HASHSZ setting is not controlled by the correct kconfig
    tests, from Denis V Lunev.

10) Fix TX descriptor push handling in SFC driver, from Ben Hutchings.

11) Fix isdn/hisax and tulip/de4x5 kconfig dependencies, from Arnd
    Bergmann.

12) bnx2x statistics don't handle 4GB rollover correctly, fix from
    Maciej Żenczykowski.

13) Openvswitch bug fixes for vport del/new error reporting, missing
    genlmsg_end() call in netlink processing, and mis-parsing of
    LLC/SNAP ethernet types.  From Rich Lane.

14) SKB pfmemalloc state should only be propagated from the head page of
    a compound page, fix from Pavel Emelyanov.

15) Fix link handling in tg3 driver for 5715 chips when autonegotation
    is disabled.  From Nithin Sujir.

16) Fix inverted test of cpdma_check_free_tx_desc return value in
    davinci_emac driver, from Mugunthan V N.

17) vlan_depth is incorrectly calculated in skb_network_protocol(), from
    Li RongQing.

18) Fix probing of Gobi 1K devices in qmi_wwan driver, and fix NCM
    device mode backwards compat in cdc_ncm driver.  From Bjørn Mork.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  inet: limit length of fragment queue hash table bucket lists
  qeth: Fix scatter-gather regression
  qeth: Fix invalid router settings handling
  qeth: delay feature trace
  tcp: dont handle MTU reduction on LISTEN socket
  bnx2x: fix occasional statistics off-by-4GB error
  vhost/net: fix heads usage of ubuf_info
  bridge: Add support for setting BR_ROOT_BLOCK flag.
  bnx2x: add missing napi deletion in error path
  drivers: net: ethernet: ti: davinci_emac: fix usage of cpdma_check_free_tx_desc()
  ethernet/tulip: DE4x5 needs VIRT_TO_BUS
  isdn: hisax: netjet requires VIRT_TO_BUS
  net: cdc_ncm, cdc_mbim: allow user to prefer NCM for backwards compatibility
  rtnetlink: Mask the rta_type when range checking
  Revert "ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally"
  Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug
  smsc75xx: configuration help incorrectly mentions smsc95xx
  net: fec: fix missing napi_disable call
  net: fec: restart the FEC when PHY speed changes
  skb: Propagate pfmemalloc on skb from head page only
  ...
2013-03-19 13:20:51 -07:00
Linus Torvalds
35f8c769aa Merge tag 'for-linus-20130318' of git://git.infradead.org/linux-mtd
Pull MTD fixes from David Woodhouse:
 "This fixes a couple of problems.  Firstly, some people are actually
  still using old small-page flash and we broke it by removing the ready
  check.

  Secondly.  fix the handling of partitions on Broadcom 47xx devices.
  Recent changes had made it misdetect the location of the NVRAM and
  scribble over the bootloader when it tried to update the variables
  there.  With predictably sad results."

* tag 'for-linus-20130318' of git://git.infradead.org/linux-mtd:
  mtd: nand: reintroduce NAND_NO_READRDY as NAND_NEED_READRDY
  mtd: bcm47xxpart: look for NVRAM at the end of device
  Revert "mtd: bcm47xxpart: improve probing of nvram partition"
2013-03-18 08:27:41 -07:00
John W. Linville
49c87cd1ea Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
Conflicts:
	net/nfc/llcp/llcp.c
2013-03-18 09:39:21 -04:00
David Rientjes
6c4d3bc99b perf,x86: fix link failure for non-Intel configs
Commit 1d9d8639c0 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") introduces a link failure since
perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL:

	arch/x86/power/built-in.o: In function `restore_processor_state':
	(.text+0x45c): undefined reference to `perf_restore_debug_store'

Fix it by defining the dummy function appropriately.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-17 15:59:15 -07:00
Christoph Paasch
1a2c6181c4 tcp: Remove TCPCT
TCPCT uses option-number 253, reserved for experimental use and should
not be used in production environments.
Further, TCPCT does not fully implement RFC 6013.

As a nice side-effect, removing TCPCT increases TCP's performance for
very short flows:

Doing an apache-benchmark with -c 100 -n 100000, sending HTTP-requests
for files of 1KB size.

before this patch:
	average (among 7 runs) of 20845.5 Requests/Second
after:
	average (among 7 runs) of 21403.6 Requests/Second

Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17 14:35:13 -04:00
Cong Wang
a362db3d6c net: fix some typos in netif features
Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17 14:25:42 -04:00
David S. Miller
86feff3f3e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Conflicts:
	net/openvswitch/vport-internal_dev.c

Jesse Gross says:

====================
A couple of minor enhancements for net-next/3.10.  The largest is an
extension to allow variable length metadata to be passed to userspace
with packets.

There is a merge conflict in net/openvswitch/vport-internal_dev.c:
A existing commit modifies internal_dev_mac_addr() and a new commit
deletes it.  The new one is correct, so you can just remove that function.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17 12:58:47 -04:00
Bjørn Mork
1e8bbe6cd0 net: cdc_ncm, cdc_mbim: allow user to prefer NCM for backwards compatibility
commit bd329e1 ("net: cdc_ncm: do not bind to NCM compatible MBIM devices")
introduced a new policy, preferring MBIM for dual NCM/MBIM functions if
the cdc_mbim driver was enabled.  This caused a regression for users
wanting to use NCM.

Devices implementing NCM backwards compatibility according to section
3.2 of the MBIM v1.0 specification allow either NCM or MBIM on a single
USB function, using different altsettings.  The cdc_ncm and cdc_mbim
drivers will both probe such functions, and must agree on a common
policy for selecting either MBIM or NCM.  Until now, this policy has
been set at build time based on CONFIG_USB_NET_CDC_MBIM.

Use a module parameter to set the system policy at runtime, allowing the
user to prefer NCM on systems with the cdc_mbim driver.

Cc: Greg Suarez <gsuarez@smithmicro.com>
Cc: Alexey Orishko <alexey.orishko@stericsson.com>
Reported-by: Geir Haatveit <nospam@haatveit.nu>
Reported-by: Tommi Kyntola <kynde@ts.ray.fi>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=54791
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17 11:59:03 -04:00
Linus Torvalds
de1893f640 Merge tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes
Pull MFD fixes from Samuel Ortiz:
 "This is the first batch of MFD fixes for 3.9.

  With this one we have:

   - An ab8500 build failure fix.
   - An ab8500 device tree parsing fix.
   - A fix for twl4030_madc remove routine to work properly (when
     built-in).
   - A fix for properly registering palmas interrupt handler.
   - A fix for omap-usb init routine to actually write into the
     hostconfig register.
   - A couple of warning fixes for ab8500-gpadc and tps65912"

* tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
  mfd: twl4030-madc: Remove __exit_p annotation
  mfd: ab8500: Kill "reg" property from binding
  mfd: ab8500-gpadc: Complain if we fail to enable vtvout LDO
  mfd: wm831x: Don't forward declare enum wm831x_auxadc
  mfd: twl4030-audio: Fix argument type for twl4030_audio_disable_resource()
  mfd: tps65912: Declare and use tps65912_irq_exit()
  mfd: palmas: Provide irq flags through DT/platform data
  mfd: Make AB8500_CORE select POWER_SUPPLY to fix build error
  mfd: omap-usb-host: Actually update hostconfig
2013-03-15 17:34:01 -07:00
Stephane Eranian
1d9d8639c0 perf,x86: fix kernel crash with PEBS/BTS after suspend/resume
This patch fixes a kernel crash when using precise sampling (PEBS)
after a suspend/resume. Turns out the CPU notifier code is not invoked
on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly
by the kernel and keeps it power-on/resume value of 0 causing any PEBS
measurement to crash when running on CPU0.

The workaround is to add a hook in the actual resume code to restore
the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0,
the DS_AREA will be restored twice but this is harmless.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-15 09:26:35 -07:00
Fernando Luis Vazquez Cao
764444f5a3 net: clean leftover of COMPAT_NET_DEV_OPS removal
COMPAT_NET_DEV_OPS was removed a while back and with it the definition of
netdev_resync_ops() went away. Let's finish the clean-up.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-15 08:30:46 -04:00
nikolay@redhat.com
8a7fbfab4b netxen: write IP address to firmware when using bonding
This patch allows LRO aggregation on bonded devices that contain an
NX3031 device. It also adds a for_each_netdev_in_bond_rcu(bond, slave)
macro which executes for each slave that has bond as master.

V3: After testing and discussing this with Rajesh, I decided to keep the
    vlan ip cache and just rename it to ip_cache since it will store bond
    ip addresses too. A new master flag has been added to the ip cache to
    denote that the address has been added because of a master device.
    I've taken care of the enslave/release cases by checking for various
    combinations of events and flags (e.g. netxen has a master, it's a
    bond master and it's not marked as a slave means it is being enslaved
    and is dev_open()ed in bond_enslave).
    I've changed netxen_free_ip_list() to have a "master" parameter which
    causes all IP addresses marked as master to be deleted (used when a
    netxen is being released). I've made the patch use the new upper
    device API as well. The following cases were tested:
    - bond -> netxen
    - vlan -> netxen
    - vlan -> bond -> netxen

V2: Remove local ip caching, retrieve addresses dynamically and
    restore them if necessary.

Note: Tested with NX3031 adapter.

Tested-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: Andy Gospodarek <agospoda@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-15 08:22:20 -04:00