* 'stable/ttm.pci-api.v5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
ttm: Include the 'struct dev' when using the DMA API.
nouveau/ttm/PCIe: Use dma_addr if TTM has set it.
radeon/ttm/PCIe: Use dma_addr if TTM has set it.
ttm: Expand (*populate) to support an array of DMA addresses.
ttm: Utilize the DMA API for pages that have TTM_PAGE_FLAG_DMA32 set.
ttm: Introduce a placeholder for DMA (bus) addresses.
Nouveau doesn't have enough information at ttm_backend_func.bind() time
to implement things like tiled GART, or to keep a buffer at a constant
address in the GPU virtual address space no matter where in physical
memory it's placed.
To resolve this, nouveau will handle binding of all buffers to the GPU
itself from the move_notify() hook. This commit ensures it's called
for all buffer moves.
Talked to Dave about the impact on radeon, which uses move_notify, it
doesn't look like anything should break there.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Thomas Hellstrom <thomas@shipmail.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This makes the accounting when using 'debug_dma_dump_mappings()'
and CONFIG_DMA_API_DEBUG=y be assigned to the correct device
instead of 'fallback'.
No functional change - just cosmetic.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We pass in the array of ttm pages to be populated in the GART/MM
of the card (or AGP). Patch titled: "ttm: Utilize the DMA API for
pages that have TTM_PAGE_FLAG_DMA32 set." uses the DMA API to make
those pages have a proper DMA addresses (in the situation where
page_to_phys or virt_to_phys do not give use the DMA (bus) address).
Since we are using the DMA API on those pages, we should pass in the
DMA address to this function so it can save it in its proper fields
(later patches use it).
[v2: Added reviewed-by tag]
Reviewed-by: Thomas Hellstrom <thellstrom@shipmail.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>
For pages that have the TTM_PAGE_FLAG_DMA32 flag set we
use the DMA API. We save the bus address in our array which we
use to program the GART (see "radeon/ttm/PCIe: Use dma_addr if TTM
has set it." and "nouveau/ttm/PCIe: Use dma_addr if TTM has set it.").
The reason behind using the DMA API is that under Xen we would
end up programming the GART with the bounce buffer (SWIOTLB)
DMA address instead of the physical DMA address of the TTM page.
The reason being that alloc_page with GFP_DMA32 does not allocate
pages under the the 4GB mark when running under Xen hypervisor.
Under baremetal this means we do the DMA API call earlier instead
of when we program the GART.
For details please refer to:
https://lkml.org/lkml/2011/1/7/251
[v2: Fixed indentation, revised desc, added Reviewed-by]
Reviewed-by: Thomas Hellstrom <thomas@shipmail.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>
This is right now limited to only non-pool constructs.
[v2: Fixed indentation issues, add review-by tag]
Reviewed-by: Thomas Hellstrom <thomas@shipmail.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (390 commits)
drm/radeon/kms: disable underscan by default
drm/radeon/kms: only enable hdmi features if the monitor supports audio
drm: Restore the old_fb upon modeset failure
drm/nouveau: fix hwmon device binding
radeon: consolidate asic-specific function decls for pre-r600
vga_switcheroo: comparing too few characters in strncmp()
drm/radeon/kms: add NI pci ids
drm/radeon/kms: don't enable pcie gen2 on NI yet
drm/radeon/kms: add radeon_asic struct for NI asics
drm/radeon/kms/ni: load default sclk/mclk/vddc at pm init
drm/radeon/kms: add ucode loader for NI
drm/radeon/kms: add support for DCE5 display LUTs
drm/radeon/kms: add ni_reg.h
drm/radeon/kms: add bo blit support for NI
drm/radeon/kms: always use writeback/events for fences on NI
drm/radeon/kms: adjust default clock/vddc tracking for pm on DCE5
drm/radeon/kms: add backend map workaround for barts
drm/radeon/kms: fill gpu init for NI asics
drm/radeon/kms: add disabled vbios accessor for NI asics
drm/radeon/kms: handle NI thermal controller
...
Make ttm_bo::ttm_bo_device_release call cancel_delayed_work_sync()
instead of calling cancel_delayed_work() followed by
flush_scheduled_work().
This is to prepare for the deprecation and removal of
flush_scheduled_work().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc:: Thomas Hellstrom <thellstrom@vmware.com>
Cc:: Dave Airlie <airlied@redhat.com>
Drivers using their own implementation of io_mem_reserve/io_mem_free are
likely to store the tracking information for the map in mem.mm_node, so
it can't be freed while still mapped.
Signed-off-by: Ben Skeggs<bskeggs@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch attempts to fix up shortcomings with the current calling
sequences.
1) There's a fastpath where no locking occurs and only io_mem_reserved is
called to obtain needed info for mapping. The fastpath is set per
memory type manager.
2) If the fastpath is disabled, io_mem_reserve and io_mem_free will be exactly
balanced and not called recursively for the same struct ttm_mem_reg.
3) Optionally the driver can choose to enable a per memory type manager LRU
eviction mechanism that, when io_mem_reserve returns -EAGAIN will attempt
to kill user-space mappings of memory in that manager to free up needed
resources
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Rather than having the driver supply the validation sequence, leave that
responsibility to TTM. This saves some confusion and a function argument.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Drastically reduce the number of spin lock / unlock operations by performing
unreserving and fencing under global locks.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jerome Glisse <j.glisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The bo lock used only to protect the bo sync object members, and since it
is a per bo lock, fencing a buffer list will see a lot of locks and unlocks.
Replace it with a per-device lock that protects the sync object members on
*all* bos. Reading and setting these members will always be very quick, so
the risc of heavy lock contention is microscopic. Note that waiting for
sync objects will always take place outside of this lock.
The bo device fence lock will eventually be replaced with a seqlock /
rcu mechanism so we can determine that a bo is idle under a
rcu / read seqlock.
However this change will allow us to batch fencing and unreserving of
buffers with a minimal amount of locking.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add an aid for the driver to detect deadlocks on multi-bo reservations
Update documentation.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Avoid the ttm_bo_unreserve() spinlocks by calling
ttm_eu_backoff_reservation_locked under the lru spinlock.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Makes it possible to reserve a list of buffer objects with a single
spin lock / unlock if there is no contention.
Should improve cpu usage on SMP kernels.
v2: Initialize private list members on reserve and don't call
ttm_bo_list_ref_sub() with zero put_count.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
A process suspended waiting for a higher sequence or no sequence to unreserve,
a bo may be beaten to the reservation by a process with a lower sequence.
In that case the first process should give up trying to reserve and
return -EAGAIN. In order for that to happen, we must wake waiting processes
when we change sequence, so that they have a chance to detect the new
sequence.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Call destroy() on _all_ ttm_bo_init() failures, and make sure that
behavior is documented in the function description.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This breaks vmwgfx non-root EGL clients and is a remnant from the
TTM user-space interface. This test should be done in the driver.
Replace the remaining placement test with a BUG_ON, since triggering
it is a driver bug.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The sync object may disappear as soon as we release the bo::lock, so
take a reference on it while we use it.
One option would be to call sync_object_flush() before releasing the bo::lock,
but that would put an atomic requirement on that function.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The driver (for example vmwgfx) may want to silently deal with the
error itself.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Since we're doing this outside of a spinlock to provide the necessary
barriers, add an explicit barrier.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Replace with BUG_ON(). These error messages remained from the time
when TTM was initialized from user-space. Nowadays hitting one of those
is really a kernel bug.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>