Commit Graph

25091 Commits

Author SHA1 Message Date
Jason Wang
10a8d94a95 virtio_net: introduce VIRTIO_NET_HDR_F_DATA_VALID
There's no need for the guest to validate the checksum if it have been
validated by host nics. So this patch introduces a new flag -
VIRTIO_NET_HDR_F_DATA_VALID which is used to bypass the checksum
examing in guest. The backend (tap/macvtap) may set this flag when
met skbs with CHECKSUM_UNNECESSARY to save cpu utilization.

No feature negotiation is needed as old driver just ignore this flag.

Iperf shows 12%-30% performance improvement for UDP traffic. For TCP,
when gro is on no difference as it produces skb with partial
checksum. But when gro is disabled, 20% or even higher improvement
could be measured by netperf.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11 15:57:47 -07:00
Greg Rose
c7ac8679be rtnetlink: Compute and store minimum ifinfo dump size
The message size allocated for rtnl ifinfo dumps was limited to
a single page.  This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-06-09 20:38:07 -07:00
Jerry Chu
9ad7c049f0 tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open side
This patch lowers the default initRTO from 3secs to 1sec per
RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet
has been retransmitted, AND the TCP timestamp option is not on.

It also adds support to take RTT sample during 3WHS on the passive
open side, just like its active open counterpart, and uses it, if
valid, to seed the initRTO for the data transmission phase.

The patch also resets ssthresh to its initial default at the
beginning of the data transmission phase, and reduces cwnd to 1 if
there has been MORE THAN ONE retransmission during 3WHS per RFC5681.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 17:05:30 -07:00
Alexander Duyck
bff55273f9 v2 ethtool: remove support for ETHTOOL_GRXNTUPLE
This change is meant to remove all support for displaying an ntuple as
strings via ETHTOOL_GRXNTUPLE.  The reason for this change is due to the
fact that multiple issues have been found including:
 - Multiple buffer overruns for strings being displayed.
 - Incorrect filters displayed, cleared filters with ring of -2 are displayed
 - Setting get_rx_ntuple displays no rules if defined.
 - Endianess wrong on displayed values.
 - Hard limit of 1024 filters makes display functionality extremely limited

The only driver that had supported this interface was ixgbe.  Since it no
longer uses the interface and due to the issues mentioned above I am
submitting this patch to remove it.

v2:
Updated based on comments from Ben Hutchings
 - Left ETH_SS_NTUPLE_FILTERS in code but commented on it being deprecated
 - Removed ethtool_rx_ntuple_list and ethtool_rx_ntuple_flow_spec_container
 - Left ETHTOOL_GRXNTUPLE but commented it as deprecated

Also cleaned up set_rx_ntuple since there is no flow spec container to
maintain we can drop all the code for the alloc and free of it and just
return ops->set_rx_ntuple().
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 16:45:31 -07:00
John W. Linville
c0c33addcb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-06-08 13:44:21 -04:00
John W. Linville
41bfce8ede Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-06-07 14:07:11 -04:00
Alexey Dobriyan
a6b7a40786 net: remove interrupt.h inclusion from netdevice.h
* remove interrupt.g inclusion from netdevice.h -- not needed
* fixup fallout, add interrupt.h and hardirq.h back where needed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 22:55:11 -07:00
Linus Torvalds
0e833d8cfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits)
  tg3: Fix tg3_skb_error_unmap()
  net: tracepoint of net_dev_xmit sees freed skb and causes panic
  drivers/net/can/flexcan.c: add missing clk_put
  net: dm9000: Get the chip in a known good state before enabling interrupts
  drivers/net/davinci_emac.c: add missing clk_put
  af-packet: Add flag to distinguish VID 0 from no-vlan.
  caif: Fix race when conditionally taking rtnl lock
  usbnet/cdc_ncm: add missing .reset_resume hook
  vlan: fix typo in vlan_dev_hard_start_xmit()
  net/ipv4: Check for mistakenly passed in non-IPv4 address
  iwl4965: correctly validate temperature value
  bluetooth l2cap: fix locking in l2cap_global_chan_by_psm
  ath9k: fix two more bugs in tx power
  cfg80211: don't drop p2p probe responses
  Revert "net: fix section mismatches"
  drivers/net/usb/catc.c: Fix potential deadlock in catc_ctrl_run()
  sctp: stop pending timers and purge queues when peer restart asoc
  drivers/net: ks8842 Fix crash on received packet when in PIO mode.
  ip_options_compile: properly handle unaligned pointer
  iwlagn: fix incorrect PCI subsystem id for 6150 devices
  ...
2011-06-04 23:16:00 +09:00
Linus Torvalds
4f1ba49efa Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  block: Use hlist_entry() for io_context.cic_list.first
  cfq-iosched: Remove bogus check in queue_fail path
  xen/blkback: potential null dereference in error handling
  xen/blkback: don't call vbd_size() if bd_disk is NULL
  block: blkdev_get() should access ->bd_disk only after success
  CFQ: Fix typo and remove unnecessary semicolon
  block: remove unwanted semicolons
  Revert "block: Remove extra discard_alignment from hd_struct."
  nbd: adjust 'max_part' according to part_shift
  nbd: limit module parameters to a sane value
  nbd: pass MSG_* flags to kernel_recvmsg()
  block: improve the bio_add_page() and bio_add_pc_page() descriptions
2011-06-04 08:11:26 +09:00
Linus Torvalds
55db4c64ed Revert "tty: make receive_buf() return the amout of bytes received"
This reverts commit b1c43f82c5.

It was broken in so many ways, and results in random odd pty issues.

It re-introduced the buggy schedule_work() in flush_to_ldisc() that can
cause endless work-loops (see commit a5660b41af: "tty: fix endless
work loop when the buffer fills up").

It also used an "unsigned int" return value fo the ->receive_buf()
function, but then made multiple functions return a negative error code,
and didn't actually check for the error in the caller.

And it didn't actually work at all.  BenH bisected down odd tty behavior
to it:
  "It looks like the patch is causing some major malfunctions of the X
   server for me, possibly related to PTYs.  For example, cat'ing a
   large file in a gnome terminal hangs the kernel for -minutes- in a
   loop of what looks like flush_to_ldisc/workqueue code, (some ftrace
   data in the quoted bits further down).

   ...

   Some more data: It -looks- like what happens is that the
   flush_to_ldisc work queue entry constantly re-queues itself (because
   the PTY is full ?) and the workqueue thread will basically loop
   forver calling it without ever scheduling, thus starving the consumer
   process that could have emptied the PTY."

which is pretty much exactly the problem we fixed in a5660b41af.

Milton Miller pointed out the 'unsigned int' issue.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Milton Miller <miltonm@bga.com>
Cc: Stefan Bigler <stefan.bigler@keymile.com>
Cc: Toby Gray <toby.gray@realvnc.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-04 06:33:24 +09:00
Rafał Miłecki
27f18dc2da bcma: read SPROM and extract MAC from it
In case of BCMA cards SPROM is located in the ChipCommon core, it is
not mapped as separated host window. So far we have met only SPROMs rev
8.
SPROM layout seems to be the same as for SSB buses, so we decided to
share SPROM struct and some defines.
For now we extract MAC address only, this can be improved of course.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-03 15:01:07 -04:00
Arend van Spriel
10f8113ecb lib: cordic: add library module providing cordic angle calculation
The brcm80211 driver in the staging tree has a cordic function to
determine cosine and sine for a given angle. Feedback received from
John Linville suggested that these kind of functions should be made
available to others as a library function in the kernel tree. The
b43 driver also has a cordic angle calculation implemented.

Cc: linux-kernel@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Dan Carpenter <error27@gmail.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Reviewed-by: Roland Vossen <rvossen@broadcom.com>
Reviewed-by: Henry Ptasinski <henryp@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-03 15:01:07 -04:00
Arend van Spriel
7150962d63 lib: crc8: add new library module providing crc8 algorithm
The brcm80211 driver in staging tree uses a crc8 function. Based on
feedback from John Linville to move this to lib directory, the linux
source has been searched. Although there is currently only one other
kernel driver using this algorithm (ie. drivers/ssb) we are providing
this as a library function for others to use.

Cc: linux-kernel@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: Dan Carpenter <error27@gmail.com>
Cc: George Spelvin <linux@horizon.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Reviewed-by: Henry Ptasinski <henryp@broadcom.com>
Reviewed-by: Roland Vossen <rvossen@broadcom.com>
Reviewed-by: "Franky (Zhenhui) Lin" <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-03 15:01:06 -04:00
John W. Linville
7b29dc21ea Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into for-davem 2011-06-03 14:31:50 -04:00
Ben Greear
a3bcc23e89 af-packet: Add flag to distinguish VID 0 from no-vlan.
Currently, user-space cannot determine if a 0 tcp_vlan_tci
means there is no VLAN tag or the VLAN ID was zero.

Add flag to make this explicit.  User-space can check for
TP_STATUS_VLAN_VALID || tp_vlan_tci > 0, which will be backwards
compatible. Older could would have just checked for tp_vlan_tci,
so it will work no worse than before.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-01 21:18:03 -07:00
Linus Torvalds
f0f52a9463 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  intel-iommu: Fix off-by-one in RMRR setup
  intel-iommu: Add domain check in domain_remove_one_dev_info
  intel-iommu: Remove Host Bridge devices from identity mapping
  intel-iommu: Use coherent DMA mask when requested
  intel-iommu: Dont cache iova above 32bit
  intel-iommu: Speed up processing of the identity_mapping function
  intel-iommu: Check for identity mapping candidate using system dma mask
  intel-iommu: Only unlink device domains from iommu
  intel-iommu: Enable super page (2MiB, 1GiB, etc.) support
  intel-iommu: Flush unmaps at domain_exit
  intel-iommu: Remove obsolete comment from detect_intel_iommu
  intel-iommu: fix VT-d PMR disable for TXT on S3 resume
2011-06-02 05:48:50 +09:00
Rafał Miłecki
9d75ef0f8f bcma: host pci: implement block R/W operations
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-01 15:12:28 -04:00
Rafał Miłecki
1bdcd095e3 bcma: add IRQ number and pointer to DMA dev
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-01 15:10:58 -04:00
Eliad Peller
333ba73252 cfg80211: don't drop p2p probe responses
Commit 0a35d36 ("cfg80211: Use capability info to detect mesh beacons")
assumed that probe response with both ESS and IBSS bits cleared
means that the frame was sent by a mesh sta.

However, these capabilities are also being used in the p2p_find phase,
and the mesh-validation broke it.

Rename the WLAN_CAPABILITY_IS_MBSS macro, and verify that mesh ies
exist before assuming this frame was sent by a mesh sta.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-01 14:34:01 -04:00
Linus Torvalds
3f303103b8 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6:
  mtd: fix physmap.h warnings
2011-06-01 21:47:39 +09:00
Youquan Song
6dd9a7c737 intel-iommu: Enable super page (2MiB, 1GiB, etc.) support
There are no externally-visible changes with this. In the loop in the
internal __domain_mapping() function, we simply detect if we are mapping:
  - size >= 2MiB, and
  - virtual address aligned to 2MiB, and
  - physical address aligned to 2MiB, and
  - on hardware that supports superpages.

(and likewise for larger superpages).

We automatically use a superpage for such mappings. We never have to
worry about *breaking* superpages, since we trust that we will always
*unmap* the same range that was mapped. So all we need to do is ensure
that dma_pte_clear_range() will also cope with superpages.

Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
it can return a PTE at the appropriate level rather than always
extending the page tables all the way down to level 1. Again, this is
simplified by the fact that we should never encounter existing small
pages when we're creating a mapping; any old mapping that used the same
virtual range will have been entirely removed and its obsolete page
tables freed.

Provide an 'intel_iommu=sp_off' argument on the command line as a
chicken bit. Not that it should ever be required.

==

The original commit seen in the iommu-2.6.git was Youquan's
implementation (and completion) of my own half-baked code which I'd
typed into an email. Followed by half a dozen subsequent 'fixes'.

I've taken the unusual step of rewriting history and collapsing the
original commits in order to keep the main history simpler, and make
life easier for the people who are going to have to backport this to
older kernels. And also so I can give it a more coherent commit comment
which (hopefully) gives a better explanation of what's going on.

The original sequence of commits leading to identical code was:

Youquan Song (3):
      intel-iommu: super page support
      intel-iommu: Fix superpage alignment calculation error
      intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()

David Woodhouse (4):
      intel-iommu: Precalculate superpage support for dmar_domain
      intel-iommu: Fix hardware_largepage_caps()
      intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
      intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages

Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:26:35 +01:00
Randy Dunlap
63da029015 mtd: fix physmap.h warnings
Fix build warnings in physmap.h:

include/linux/mtd/physmap.h:25: warning: 'struct platform_device' declared inside parameter list
include/linux/mtd/physmap.h:25: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/mtd/physmap.h:26: warning: 'struct platform_device' declared inside parameter list
include/linux/mtd/physmap.h:27: warning: 'struct platform_device' declared inside parameter list

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 11:36:49 +01:00
Namhyung Kim
ea9d6553b3 block: remove unwanted semicolons
Since those defined functions require additional semicolon
from the caller, they could cause potential syntax errors
when used in if-else statements.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-31 13:45:53 +02:00
Jens Axboe
a1706ac4c0 Revert "block: Remove extra discard_alignment from hd_struct."
It was not a good idea to start dereferencing disk->queue from
the fs sysfs strategy for displaying discard alignment. We ran
into first a NULL pointer deref, and after fixing that we sometimes
see unvalid disk->queue pointer values.

Since discard is the only one of the bunch actually looking into
the queue, just revert the change.

This reverts commit 23ceb5b771.

Conflicts:
	fs/partitions/check.c
2011-05-30 07:42:51 +02:00
Michael S. Tsirkin
7ab358c23c virtio: add api for delayed callbacks
Add an API that tells the other side that callbacks
should be delayed until a lot of work has been done.
Implement using the new event_idx feature.

Note: it might seem advantageous to let the drivers
ask for a callback after a specific capacity has
been reached. However, as a single head can
free many entries in the descriptor table,
we don't really have a clue about capacity
until get_buf is called. The API is the simplest
to implement at the moment, we'll see what kind of
hints drivers can pass when there's more than one
user of the feature.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-30 11:14:16 +09:30