Fix a possible though highly unlikely deadlock:
Thread A: Thread B:
- acquire mmap_sem - dv1394_ioctl/read/write()
- dv1394_mmap() - acquire video->mtx
- acquire video->mtx - copy_to/from_user(), possible page fault:
acquire mmap_sem
The simplest fix is to use mutex_trylock() instead of mutex_lock() in
dv1394_mmap(). This changes the behavior under contention in a way
which is visible to userspace clients. However, my guess is that no
clients exist which use mmap vs. ioctl/read/write on the dv1394
character device file interface in concurrent threads.
Reported-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Regression in 2.6.28-rc1: When I added the new state_mutex which
prevents corruption of raw1394's internal state when accessed by
multithreaded client applications, the following possible though
highly unlikely deadlock slipped in:
Thread A: Thread B:
- acquire mmap_sem - raw1394_write() or raw1394_ioctl()
- raw1394_mmap() - acquire state_mutex
- acquire state_mutex - copy_to/from_user(), possible page fault:
acquire mmap_sem
The simplest fix is to use mutex_trylock() instead of mutex_lock() in
raw1394_mmap(). This changes the behavior under contention in a way
which is visible to userspace clients. However, since multithreaded
access was entirely buggy before state_mutex was added and libraw1394's
documentation advised application programmers to use a handle only in a
single thread, this change in behaviour should not be an issue in
practice at all.
Since we have to use mutex_trylock() in raw1394_mmap() regardless
whether /dev/raw1394 was opened with O_NONBLOCK or not, we now use
mutex_trylock() unconditionally everywhere for state_mutex, just to have
consistent behavior.
Reported-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
1: There is a small race between queue_delayed_work() and its
corresponding kref_get(). Do the kref_get first, and _put it again
if the queue_delayed_work() failed, so there is no chance of the
kref going to zero while the work is scheduled.
2: An SBP2_LOGOUT_REQUEST could be sent out with a login_id full of
garbage. Initialize it to an invalid value so we can tell if we
ever got a valid login_id.
3: The node ID and generation may have changed but the new values may
not yet have been recorded in lu and tgt when the final logout is
attempted. Use the latest values from the device in
sbp2_release_target().
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This optimizes firewire-sbp2's device probe for the case that the local
node and the SBP-2 node were discovered at the same time. In this case,
fw-core's bus management work and fw-sbp2's login and SCSI probe work
are scheduled in parallel (in the globally shared workqueue and in
fw-sbp2's workqueue, respectively). The bus reset from fw-core may then
disturb and extremely delay the login and SCSI probe because the latter
fails with several command timeouts and retries and has to be retried
from scratch.
We avoid this particular situation of sbp2_login() and fw_card_bm_work()
running in parallel by delaying the first sbp2_login() a little bit.
This is meant to be a short-term fix for
https://bugzilla.redhat.com/show_bug.cgi?id=466679. In the long run,
the SCSI probe, i.e. fw-sbp2's call of __scsi_add_device(), should be
parallelized with sbp2_reconnect().
Problem reported and fix tested and confirmed by Alex Kanavin.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The transmit and receive context dma memory was not being freed on
module removal. Neither was the config rom memory. Fix that.
The ab->next assignment is pure paranoia.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
With the bus_resets patch applied, it is easy to see this memory leak
by repeatedly resetting the firewire bus while running slabtop in
another window. Just watch kmalloc-32 grow and grow...
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The "color" is used during the topology building after a bus reset,
hovever in "struct fw_node"s it is stored in a u8, but in struct fw_card
it is stored in an int. When the value wraps in one struct, but not
the other, disaster strikes.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10922.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Reported by Jay Fenlason: ioctl() did not return as intended
- the size of data read into ioctl_send_request,
- the number of datagrams enqueued by ioctl_queue_iso.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
queuecommand() looked at the remote and local node IDs before it read
the bus generation. The corresponding race with sbp2_reconnect updating
these data was probably impossible to happen though because the current
code blocks the SCSI layer during reconnection. However, better safe
than sorry, especially if someone later improves the code to not block
the SCSI layer.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
1. We don't need to round the SBP-2 segment size limit down to a
multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
ensure quadlet alignment (0xffff -> 0xfffc).
2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
and the block IO layer about the restriction. This way we can
remove the size checks and segment splitting in the queuecommand
path.
This assumes that no other code in the firewire stack uses
dma_map_sg() with conflicting requirements. It furthermore assumes
that the controller device's platform actually allows us to set the
segment size to our liking. Assert the latter with a BUG_ON().
3. Also use blk_queue_max_segment_size() to tell the block IO layer
about it. It cannot know it because our scsi_add_host() does not
point to the FireWire controller's device.
Thanks to Grant Grundler and FUJITA Tomonori for advice.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Share code between fw_send_request + wait_for_completion callers.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Addendum:
Removes an unnecessary struct and an ununsed retry loop.
Calls it fw_run_transaction() instead of fw_send_request_sync().
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Høgsberg <krh@redhat.com>
There are situations when nodes vanish from the bus and come back in
quickly thereafter:
- When certain bus-powered hubs are plugged in,
- when certain disk enclosures are switched from self-power to bus
power or vice versa and break the daisy chain during the transition,
- when the user plugs a cable out and quickly plugs it back in, e.g.
to reorder a daisy chain (works on Mac OS X if done quickly enough),
- when certain hubs temporarily malfunction during high bus traffic.
The ieee1394 driver's nodemgr already contained a function to set
vanished nodes aside into "limbo"; i.e. they wouldn't actually be
deleted right away. (In fact, only unloading the driver or writing into
an obscure sysfs attribute would delete them eventually.) If nodes
reappeared later, they would be resurrected out of limbo.
Moving nodes into and out of limbo was accompanied with calling the
.suspend() and .resume() driver methods of the drivers which were bound
to a respective node's unit directories. Not only is this somewhat
strange due to the intended use of these driver methods for power
management, also the sbp2 driver in particular does not implement
.suspend() and .resume(). Hence sbp2 would be disconnected from devices
in situations as listed above.
We now:
- leave drivers bound when nodes go into limbo,
- call the drivers' .update() when nodes come out of limbo,
- automatically delete in-limbo nodes 3 seconds after the last
bus reset and bus rescan.
- Because of the automatic removal, the now obsolete bus attribute
/sys/bus/ieee1394/destroy_node is removed.
This especially lets sbp2 survive brief disconnections. You can for
example yank a disk's cable and plug it back in while reading the
respective disk with dd, but dd will happily continue as if nothing
happened.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Remove useless pointer type casts.
Remove unnecessary hi->host indirection where only host is used.
Remove an unnecessary WARN_ON.
Change a few names.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
init->channel and v.buffer are unsigned and tests for < 0 therefore
always false. gcc knows this and eliminates the code, but anyway...
Reported by Roel Kluin.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Application programs should use a libraw1394 handle only in a single
thread. The raw1394 driver was apparently relying on this, because it
did nothing to protect its fi->state variable from corruption due to
concurrent accesses.
We now serialize the fi->state accesses. This affects the write() path.
We re-use the state_mutex which was introduced to protect fi->iso_state
accesses in the ioctl() path. These paths and accesses are independent
of each other, hence separate mutexes could be used. But I don't see
much benefit in that.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Refactor the ioctl dispatcher in order to move a fraction of it out of
the section which is serialized by fi->state_mutex. This is not so much
about performance but more about self-documentation: The mutex_lock()/
mutex_unlock() calls are now closer to the data accesses which the mutex
protects, i.e. to the iso_state switch.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This removes the last usage of the Big Kernel Lock from the ieee1394
stack, i.e. from raw1394's (unlocked_)ioctl and compat_ioctl.
The ioctl()s don't need to take the BKL, but they need to be serialized
per struct file *. In particular, accesses to ->iso_state need to be
serial. We simply use a blocking mutex for this purpose because
libraw1394 does not use O_NONBLOCK. In practice, there is no lock
contention anyway because most if not all libraw1394 clients use a
libraw1394 handle only in a single thread.
mmap() also accesses ->iso_state. Until now this was unprotected
against concurrent changes by ioctls. Fix this bug while we are at it.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
1. We don't need to round the SBP-2 segment size limit down to a
multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
ensure quadlet alignment (0xffff -> 0xfffc).
2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
and the block IO layer about the restriction. This way we can
remove the size checks and segment splitting in the queuecommand
path.
This assumes that no other code in the ieee1394 stack uses
dma_map_sg() with conflicting requirements. It furthermore assumes
that the controller device's platform actually allows us to set the
segment size to our liking. Assert the latter with a BUG_ON().
3. Also use blk_queue_max_segment_size() to tell the block IO layer
about it. It cannot know it because our scsi_add_host() does not
point to the FireWire controller's device.
We can also uniformly use dma_map_sg() for the single segment case just
like for the multi segment case, to further simplify the code.
Also clean up how the page table is converted to big endian.
Thanks to Grant Grundler and FUJITA Tomonori for advice.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Two dma_sync_single_for_cpu() were called in the wrong place.
Luckily they were merely for DMA_TO_DEVICE, hence nobody noticed.
Also reorder the matching dma_sync_single_for_device() a little bit
so that they reside in the same functions as their counterparts.
This also avoids syncing the s/g table for requests which don't use it.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>