This patch relaxes the default SCSI DMA alignment from 512 bytes to 4
bytes. I remember from previous discussions that usb and firewire have
sector size alignment requirements, so I upped their alignments in the
respective slave allocs.
The reason for doing this is so that we don't get such a huge amount of
copy overhead in bio_copy_user() for udev. (basically all inquiries it
issues can now be directly mapped).
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Third rendition of FireWire OHCI 1.0 Isochronous Receive support, using a
zer-copy method similar to OHCI 1.1 which puts the IR data payload directly
into the userspace buffer. The zero-copy implementation eliminates the
video artifacts, audio popping, and buffer underrun problems seen with
version 1 of this patch, as well as fixing a regression in OHCI 1.1 support
introduced by version 2 of this patch.
Successfully tested in OHCI 1.1 mode on the following chipsets:
- NEC uPD72847 (rev 01), OHCI 1.1 (PCI)
- Ti XIO2200(A) (rev 01), OHCI 1.1 (PCIe)
- Ti TSB41AB2 (rev 01), OHCI 1.1 (PCI on SB Audigy)
- Apple UniNorth 2 (rev 81), OHCI 1.1 (PowerBook G4 onboard)
Successfully tested in OHCI 1.0 mode on the following chipsets:
- Agere FW323 (rev 06), OHCI 1.0 (Mac Mini onboard)
- Agere FW323 (rev 06), OHCI 1.0 (PCI)
- Via VT6306 (rev 46), OHCI 1.0 (PCI)
- NEC OrangeLink (rev 01), OHCI 1.0 (PCI)
- NEC uPD72847 (rev 01), OHCI 1.1 (PCI)
- Ti XIO2200(A) (rev 01), OHCI 1.1 (PCIe)
The bulk of testing was done in an x86_64 system, but was also successfully
sanity-tested on other systems, including a PPC(32) PowerBook G4 and an i686
EPIA M10k. Crude benchmarking (watching top during capture) puts the cpu
utilization during capture on the EPIA's 1GHz Via C3 processor around 13%,
which is down from 30% with the v1 code.
Some implementation details:
To maintain the same userspace API as dual-buffer mode, we set up two
descriptors for every incoming packet. The first is an INPUT_MORE descriptor,
pointing to a buffer large enough to hold just the packet's iso headers,
immediately followed by an INPUT_LAST descriptor, pointing to a chunk of the
userspace buffer big enough for the packet's data payload. With this setup,
each incoming packet fills in these two descriptors in a manner that very
closely emulates dual-buffer receive, to the point where the bulk of the
handle_ir_* code is now identical between the two (and probably primed for
some restructuring to share code between them).
The only caveat I have at the moment is that neither of my OHCI 1.0 Via
VT6307-based FireWire controllers work particularly well with this code
for reasons I have yet to figure out.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Since patch "fw-sbp2: use an own workqueue (fix system responsiveness)"
increased parallelism between fw-sbp2 and fw-core, it was possible that
fw-sbp2 didn't release the SCSI device when the FireWire device was
disconnected.
This happened if sbp2_update() ran during sbp2_login(), because a bus
reset occurred during sbp2_login(). The sbp2_login() work would [try
to] reschedule itself because it failed due to the bus reset, and it
would _not_ drop its reference on the target. However, sbp2_update()
would schedule sbp2_login() too before sbp2_login() rescheduled itself
and hence sbp2_update() would take an additional reference. And then
we would have one reference too many.
The fix is to _always_ drop the reference when leaving the sbp2_login()
work. If the sbp2_login() work reschedules itself, it takes a
reference, but only if it wasn't already rescheduled by sbp2_update().
Ditto in the sbp2_reconnect() work.
The resulting code is actually simpler than before: We _always_ take
a reference when successfully scheduling work. And we _always_ drop
a reference when leaving a workqueue job. No exceptions.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The ohci_enable() function shared between pci_probe and pci_resume
takes a host endian config rom, but ohci->config_rom is __be32. This
sets up the config rom in the wrong endian on little endian machine,
specifically, BusOptions will be initialized to a 0 max receive size.
This patch changes the way we reuse the config rom so that we avoid
this problem.
Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
New warning since commit ab88ca488b,
"firewire: fw-ohci: missing dma_unmap_single":
drivers/firewire/fw-ohci.c: In function 'at_context_transmit':
drivers/firewire/fw-ohci.c:609: warning: 'payload_bus' may be used
uninitialized in this function
Access to payload_bus is conditional on packet->payload_length > 0,
and that won't change while in at_context_queue_packet.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
because there seems to be more time needed to implement this.
Also, change related error return values to more appropriate ones.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This duplicates the read cycle timer feature of raw1394 (added in Linux
2.6.21) in firewire-core's userspace ABI. The argument to the ioctl is
reordered though to ensure 32/64 bit compatibility.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Check NodeID.nodeNumber as per OHCI 1.1 clause 7.2.3.2. See also IEEE
1394a table 5B-1.
Also, demote the "node ID not valid" message from error to notification
as it is not an error condition.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
at_context_queue_packet() didn't clean up in an early exit path.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
It seems unlikely, but access to self_id_cpu[0] could at least in theory
be deferred until after the loop over self_id_cpu[1..n] or even after
the subsequent reg_read. Enforce the desired order by a read barrier.
Also prevent the reg_read from being reordered relative to the for loop.
This isn't necessary if the loop's conditional printk counts as an
implicit barrier, but better make it explicit.
(self_id_cpu[] is a coherent DMA buffer.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Firewire-sbp2 did very uncooperative things in the kernel's shared
workqueue: Sleeping until reception of management status from the
target for up to 2 seconds, and performing SCSI inquiry and all of the
setup of SCSI command set drivers via scsi_add_device. If there were
transient or permanent error conditions, this caused long blockage of
the kernel's events process, noticeable e.g. by blocked keyboard input.
We now allocate a workqueue process exclusive to fw-sbp2. As a side
effect, this also increases parallelism of fw-sbp2's login and reconnect
work versus fw-core's device discovery and device update work which is
performed in the shared workqueue.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
On rare occasions, the ability to set one of the workaround flags at
runtime may save the day.
People who experience I/O errors with firewire-sbp2 while the old sbp2
driver worked for them should try workarounds=1 and report to the devel
mailinglist whether that improves things. Firewire-sbp2 defaults to the
SCSI stack's maximum transfer size per command, while sbp2 limits them
to 128 kBytes. Flag 1 accomplishes just that.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
On IOMMU-less noncoherent architectures, orb->callback will memcpy the
whole SCSI command buffer for READ-like SCSI commands. It is therefore
friendlier to enable IRQs before the call, like before patch "Add
ref-counting for sbp2 orbs".
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Høgsberg <krh@redhat.com>
Sparse warned about it although it was apparently harmless:
drivers/firewire/fw-cdev.c:624:23: warning: symbol 'interrupt' shadows an earlier one
include/asm/hw_irq.h:29:13: originally declared here
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This changes the uevent buffer functions to use a struct instead of a
long list of parameters. It does no longer require the caller to do the
proper buffer termination and size accounting, which is currently wrong
in some places. It fixes a known bug where parts of the uevent
environment are overwritten because of wrong index calculations.
Many thanks to Mathieu Desnoyers for finding bugs and improving the
error handling.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>