Merge branches 'tracing/ftrace', 'tracing/fastboot', 'tracing/nmisafe' and 'tracing/urgent' into tracing/core

Ingo Molnar
2008-11-08 09:34:35 +01:00
45 changed files with 724 additions and 352 deletions
+72 -97
@@ -8,7 +8,7 @@ Copyright 2008 Red Hat Inc.
Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
John Kacur, and David Teigland.
Written for: 2.6.27-rc1
Written for: 2.6.28-rc2
Introduction
------------
@@ -50,26 +50,26 @@ of ftrace. Here is a list of some of the key files:
Note: all time values are in microseconds.
current_tracer : This is used to set or display the current tracer
current_tracer: This is used to set or display the current tracer
that is configured.
available_tracers : This holds the different types of tracers that
available_tracers: This holds the different types of tracers that
have been compiled into the kernel. The tracers
listed here can be configured by echoing their name
into current_tracer.
tracing_enabled : This sets or displays whether the current_tracer
tracing_enabled: This sets or displays whether the current_tracer
is activated and tracing or not. Echo 0 into this
file to disable the tracer or 1 to enable it.
trace : This file holds the output of the trace in a human readable
trace: This file holds the output of the trace in a human readable
format (described below).
latency_trace : This file shows the same trace but the information
latency_trace: This file shows the same trace but the information
is organized more to display possible latencies
in the system (described below).
trace_pipe : The output is the same as the "trace" file but this
trace_pipe: The output is the same as the "trace" file but this
file is meant to be streamed with live tracing.
Reads from this file will block until new data
is retrieved. Unlike the "trace" and "latency_trace"
@@ -82,11 +82,11 @@ of ftrace. Here is a list of some of the key files:
tracer is not adding more data, they will display
the same information every time they are read.
iter_ctrl : This file lets the user control the amount of data
iter_ctrl: This file lets the user control the amount of data
that is displayed in one of the above output
files.
trace_max_latency : Some of the tracers record the max latency.
trace_max_latency: Some of the tracers record the max latency.
For example, the time interrupts are disabled.
This time is saved in this file. The max trace
will also be stored, and displayed by either
@@ -94,29 +94,26 @@ of ftrace. Here is a list of some of the key files:
only be recorded if the latency is greater than
the value in this file. (in microseconds)
trace_entries : This sets or displays the number of trace
entries each CPU buffer can hold. The tracer buffers
are the same size for each CPU. The displayed number
is the size of the CPU buffer and not total size. The
trace_entries: This sets or displays the number of bytes each CPU
buffer can hold. The tracer buffers are the same size
for each CPU. The displayed number is the size of the
CPU buffer and not total size of all buffers. The
trace buffers are allocated in pages (blocks of memory
that the kernel uses for allocation, usually 4 KB in size).
Since each entry is smaller than a page, if the last
allocated page has room for more entries than were
requested, the rest of the page is used to allocate
entries.
If the last page allocated has room for more bytes
than requested, the rest of the page will be used,
making the actual allocation bigger than requested.
(Note, the size may not be a multiple of the page size due
to buffer management overhead.)
This can only be updated when the current_tracer
is set to "none".
is set to "nop".
NOTE: It is planned on changing the allocated buffers
from being the number of possible CPUS to
the number of online CPUS.
tracing_cpumask : This is a mask that lets the user only trace
tracing_cpumask: This is a mask that lets the user only trace
on specified CPUS. The format is a hex string
representing the CPUS.
set_ftrace_filter : When dynamic ftrace is configured in (see the
set_ftrace_filter: When dynamic ftrace is configured in (see the
section below "dynamic ftrace"), the code is dynamically
modified (code text rewrite) to disable calling of the
function profiler (mcount). This lets tracing be configured
@@ -130,14 +127,11 @@ of ftrace. Here is a list of some of the key files:
be traced. If a function exists in both set_ftrace_filter
and set_ftrace_notrace, the function will _not_ be traced.
available_filter_functions : When a function is encountered the first
time by the dynamic tracer, it is recorded and
later the call is converted into a nop. This file
lists the functions that have been recorded
by the dynamic tracer and these functions can
be used to set the ftrace filter by the above
"set_ftrace_filter" file. (See the section "dynamic ftrace"
below for more details).
available_filter_functions: This lists the functions that ftrace
has processed and can trace. These are the function
names that you can pass to "set_ftrace_filter" or
"set_ftrace_notrace". (See the section "dynamic ftrace"
below for more details.)
The Tracers
@@ -145,7 +139,7 @@ The Tracers
Here is the list of current tracers that may be configured.
ftrace - function tracer that uses mcount to trace all functions.
function - function tracer that uses mcount to trace all functions.
sched_switch - traces the context switches between tasks.
@@ -166,8 +160,8 @@ Here is the list of current tracers that may be configured.
the highest priority task to get scheduled after
it has been woken up.
none - This is not a tracer. To remove all tracers from tracing
simply echo "none" into current_tracer.
nop - This is not a tracer. To remove all tracers from tracing
simply echo "nop" into current_tracer.
Examples of using the tracer
@@ -182,7 +176,7 @@ Output format:
Here is an example of the output format of the file "trace"
--------
# tracer: ftrace
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
@@ -192,7 +186,7 @@ Here is an example of the output format of the file "trace"
--------
A header is printed with the tracer name that is represented by the trace.
In this case the tracer is "ftrace". Then a header showing the format. Task
In this case the tracer is "function". Then a header showing the format. Task
name "bash", the task PID "4251", the CPU that it was running on
"01", the timestamp in <secs>.<usecs> format, the function name that was
traced "path_put" and the parent function that called this function
@@ -1003,22 +997,20 @@ is the stack for the hard interrupt. This hides the fact that NEED_RESCHED
has been set. We do not see the 'N' until we switch back to the task's
assigned stack.
ftrace
------
function
--------
ftrace is not only the name of the tracing infrastructure, but it
is also a name of one of the tracers. The tracer is the function
tracer. Enabling the function tracer can be done from the
debug file system. Make sure the ftrace_enabled is set otherwise
this tracer is a nop.
This tracer is the function tracer. Enabling the function tracer
can be done from the debug file system. Make sure the ftrace_enabled is
set; otherwise this tracer is a nop.
# sysctl kernel.ftrace_enabled=1
# echo ftrace > /debug/tracing/current_tracer
# echo function > /debug/tracing/current_tracer
# echo 1 > /debug/tracing/tracing_enabled
# usleep 1
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/trace
# tracer: ftrace
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
@@ -1040,10 +1032,10 @@ this tracer is a nop.
[...]
Note: ftrace uses ring buffers to store the above entries. The newest data
may overwrite the oldest data. Sometimes using echo to stop the trace
is not sufficient because the tracing could have overwritten the data
that you wanted to record. For this reason, it is sometimes better to
Note: function tracer uses ring buffers to store the above entries.
The newest data may overwrite the oldest data. Sometimes using echo to
stop the trace is not sufficient because the tracing could have overwritten
the data that you wanted to record. For this reason, it is sometimes better to
disable tracing directly from a program. This allows you to stop the
tracing at the point that you hit the part that you are interested in.
To disable the tracing directly from a C program, something like following
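A minimal sketch of such a program, assuming the /debug/tracing/tracing_enabled control file used throughout this document. The helper name stop_tracing is hypothetical, and the path is a parameter so the helper can also be pointed at an ordinary file for testing.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write "0" to the given control file to stop
 * tracing. Returns 0 on success, -1 on failure. */
static int stop_tracing(const char *tracing_enabled_path)
{
	int fd = open(tracing_enabled_path, O_WRONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = write(fd, "0", 1);
	close(fd);
	return n == 1 ? 0 : -1;
}
```

A program would call stop_tracing("/debug/tracing/tracing_enabled") at the point of interest. Opening the file once at startup and keeping the descriptor around would shave the open() cost off that critical moment.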
@@ -1077,18 +1069,31 @@ every kernel function, produced by the -pg switch in gcc), starts
of pointing to a simple return. (Enabling FTRACE will include the
-pg switch in the compiling of the kernel.)
When dynamic ftrace is initialized, it calls kstop_machine to make
the machine act like a uniprocessor so that it can freely modify code
without worrying about other processors executing that same code. At
initialization, the mcount calls are changed to call a "record_ip"
function. After this, the first time a kernel function is called,
it has the calling address saved in a hash table.
At compile time every C file object is run through the
recordmcount.pl script (located in the scripts directory). This
script will process the C object using objdump to find all the
locations in the .text section that call mcount. (Note, only
the .text section is processed, since processing other sections
like .init.text may cause races due to those sections being freed).
Later on the ftraced kernel thread is awoken and will again call
kstop_machine if new functions have been recorded. The ftraced thread
will change all calls to mcount to "nop". Just calling mcount
and having mcount return has shown a 10% overhead. By converting
it to a nop, there is no measurable overhead to the system.
A new section called "__mcount_loc" is created that holds references
to all the mcount call sites in the .text section. This section is
compiled back into the original object. The final linker will add
all these references into a single table.
On boot up, before SMP is initialized, the dynamic ftrace code
scans this table and updates all the locations into nops. It also
records the locations, which are added to the available_filter_functions
list. Modules are processed as they are loaded and before they are
executed. When a module is unloaded, it also removes its functions from
the ftrace function list. This is automatic in the module unload
code, and the module author does not need to worry about it.
When tracing is enabled, kstop_machine is called to prevent races
with the CPUS executing code being modified (which can cause the
CPU to do undesirable things), and the nops are patched back
to calls. But this time, they do not call mcount (which is just
a function stub). They now call into the ftrace infrastructure.
One special side-effect to the recording of the functions being
traced is that we can now selectively choose which functions we
@@ -1251,36 +1256,6 @@ Produces:
We can see that there's no more lock or preempt tracing.
ftraced
-------
As mentioned above, when dynamic ftrace is configured in, a kernel
thread wakes up once a second and checks to see if there are mcount
calls that need to be converted into nops. If there are not any, then
it simply goes back to sleep. But if there are some, it will call
kstop_machine to convert the calls to nops.
There may be a case in which you do not want this added latency.
Perhaps you are doing some audio recording and this activity might
cause skips in the playback. There is an interface to disable
and enable the "ftraced" kernel thread.
# echo 0 > /debug/tracing/ftraced_enabled
This will disable the calling of kstop_machine to update the
mcount calls to nops. Remember that there is a large overhead
to calling mcount. Without this kernel thread, that overhead will
exist.
If there are recorded calls to mcount, any write to the ftraced_enabled
file will cause the kstop_machine to run. This means that a
user can manually perform the updates when they want to by simply
echoing a '0' into the ftraced_enabled file.
The updates are also done at the beginning of enabling a tracer
that uses ftrace function recording.
trace_pipe
----------
@@ -1289,14 +1264,14 @@ on the tracing is different. Every read from trace_pipe is consumed.
This means that subsequent reads will be different. The trace
is live.
# echo ftrace > /debug/tracing/current_tracer
# echo function > /debug/tracing/current_tracer
# cat /debug/tracing/trace_pipe > /tmp/trace.out &
[1] 4153
# echo 1 > /debug/tracing/tracing_enabled
# usleep 1
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/trace
# tracer: ftrace
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
@@ -1317,7 +1292,7 @@ is live.
Note, reading the trace_pipe file will block until more input is added.
By changing the tracer, trace_pipe will issue an EOF. We needed
to set the ftrace tracer _before_ cating the trace_pipe file.
to set the function tracer _before_ we "cat" the trace_pipe file.
trace entries
@@ -1334,10 +1309,10 @@ number of entries.
65620
Note, to modify this, you must have tracing completely disabled. To do that,
echo "none" into the current_tracer. If the current_tracer is not set
to "none", an EINVAL error will be returned.
echo "nop" into the current_tracer. If the current_tracer is not set
to "nop", an EINVAL error will be returned.
# echo none > /debug/tracing/current_tracer
# echo nop > /debug/tracing/current_tracer
# echo 100000 > /debug/tracing/trace_entries
# cat /debug/tracing/trace_entries
100045
+82
@@ -0,0 +1,82 @@
The io_mapping functions in linux/io-mapping.h provide an abstraction for
efficiently mapping small regions of an I/O device to the CPU. The initial
usage is to support the large graphics aperture on 32-bit processors where
ioremap_wc cannot be used to statically map the entire aperture to the CPU
as it would consume too much of the kernel address space.
A mapping object is created during driver initialization using
struct io_mapping *io_mapping_create_wc(unsigned long base,
unsigned long size)
'base' is the bus address of the region to be made
mappable, while 'size' indicates how large a mapping region to
enable. Both are in bytes.
This _wc variant provides a mapping which may only be used
with io_mapping_map_atomic_wc or io_mapping_map_wc.
With this mapping object, individual pages can be mapped either atomically
or not, depending on the necessary scheduling environment. Of course, atomic
maps are more efficient:
void *io_mapping_map_atomic_wc(struct io_mapping *mapping,
unsigned long offset)
'offset' is the offset within the defined mapping region.
Accessing addresses beyond the region specified in the
creation function yields undefined results. Using an offset
which is not page aligned yields an undefined result. The
return value points to a single page in CPU address space.
This _wc variant returns a write-combining map to the
page and may only be used with mappings created by
io_mapping_create_wc.
Note that the task may not sleep while holding this page
mapped.
void io_mapping_unmap_atomic(void *vaddr)
'vaddr' must be the value returned by the last
io_mapping_map_atomic_wc call. This unmaps the specified
page and allows the task to sleep once again.
If you need to sleep while holding the page mapped, you can use the
non-atomic variants, although they may be significantly slower.
void *io_mapping_map_wc(struct io_mapping *mapping,
unsigned long offset)
This works like io_mapping_map_atomic_wc except it allows
the task to sleep while holding the page mapped.
void io_mapping_unmap(void *vaddr)
This works like io_mapping_unmap_atomic, except it is used
for pages mapped with io_mapping_map_wc.
At driver close time, the io_mapping object must be freed:
void io_mapping_free(struct io_mapping *mapping)
Current Implementation:
The initial implementation of these functions uses existing mapping
mechanisms and so provides only an abstraction layer and no new
functionality.
On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole
range, creating a permanent kernel-visible mapping to the resource. The
map_atomic and map functions add the requested offset to the base of the
virtual address returned by ioremap_wc.
On 32-bit processors with HIGHMEM defined, io_mapping_map_atomic_wc uses
kmap_atomic_pfn to map the specified page in an atomic fashion;
kmap_atomic_pfn isn't really supposed to be used with device pages, but it
provides an efficient mapping for this usage.
On 32-bit processors without HIGHMEM defined, io_mapping_map_atomic_wc and
io_mapping_map_wc both use ioremap_wc, a terribly inefficient function which
performs an IPI to inform all processors about the new mapping. This results
in a significant performance penalty.
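The 64-bit case described above amounts to pointer arithmetic on one permanent mapping. The following is a minimal userspace model of that idea, not the kernel implementation: model_io_mapping, model_create, model_map and model_free are hypothetical names, a heap buffer stands in for the region ioremap_wc would return, and the 4096-byte page size is an assumption.

```c
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the model */

/* Hypothetical stand-in for struct io_mapping: one permanent mapping. */
struct model_io_mapping {
	char *base;		/* plays the role of the ioremap_wc() result */
	unsigned long size;	/* size of the mappable region in bytes */
};

static struct model_io_mapping *model_create(unsigned long size)
{
	struct model_io_mapping *m = malloc(sizeof(*m));

	if (!m)
		return NULL;
	m->base = calloc(1, size);
	if (!m->base) {
		free(m);
		return NULL;
	}
	m->size = size;
	return m;
}

/* Map one page: just base + offset. The documented API leaves unaligned
 * or out-of-range offsets undefined; the model returns NULL instead. */
static void *model_map(struct model_io_mapping *m, unsigned long offset)
{
	if (offset % MODEL_PAGE_SIZE || offset >= m->size)
		return NULL;
	return m->base + offset;
}

static void model_free(struct model_io_mapping *m)
{
	free(m->base);
	free(m);
}
```

In this model there is nothing to undo on unmap, which is why the 64-bit path can be cheap: the cost is paid once at creation time.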
-5
@@ -1,11 +1,6 @@
#ifndef _ASM_ARM_FTRACE
#define _ASM_ARM_FTRACE
#ifndef __ASSEMBLY__
static inline void ftrace_nmi_enter(void) { }
static inline void ftrace_nmi_exit(void) { }
#endif
#ifdef CONFIG_FUNCTION_TRACER
#define MCOUNT_ADDR ((long)(mcount))
#define MCOUNT_INSN_SIZE 4 /* sizeof mcount call */
-5
@@ -1,11 +1,6 @@
#ifndef _ASM_POWERPC_FTRACE
#define _ASM_POWERPC_FTRACE
#ifndef __ASSEMBLY__
static inline void ftrace_nmi_enter(void) { }
static inline void ftrace_nmi_exit(void) { }
#endif
#ifdef CONFIG_FUNCTION_TRACER
#define MCOUNT_ADDR ((long)(_mcount))
#define MCOUNT_INSN_SIZE 4 /* sizeof mcount call */
-5
@@ -1,11 +1,6 @@
#ifndef __ASM_SH_FTRACE_H
#define __ASM_SH_FTRACE_H
#ifndef __ASSEMBLY__
static inline void ftrace_nmi_enter(void) { }
static inline void ftrace_nmi_exit(void) { }
#endif
#ifndef __ASSEMBLY__
extern void mcount(void);
#endif
-5
@@ -1,11 +1,6 @@
#ifndef _ASM_SPARC64_FTRACE
#define _ASM_SPARC64_FTRACE
#ifndef __ASSEMBLY__
static inline void ftrace_nmi_enter(void) { }
static inline void ftrace_nmi_exit(void) { }
#endif
#ifdef CONFIG_MCOUNT
#define MCOUNT_ADDR ((long)(_mcount))
#define MCOUNT_INSN_SIZE 4 /* sizeof mcount call */
+4
@@ -1895,6 +1895,10 @@ config SYSVIPC_COMPAT
endmenu
config HAVE_ATOMIC_IOMAP
def_bool y
depends on X86_32
source "net/Kconfig"
source "drivers/Kconfig"
+4
@@ -9,6 +9,10 @@
extern int fixmaps_set;
extern pte_t *kmap_pte;
extern pgprot_t kmap_prot;
extern pte_t *pkmap_page_table;
void __native_set_fixmap(enum fixed_addresses idx, pte_t pte);
void native_set_fixmap(enum fixed_addresses idx,
unsigned long phys, pgprot_t flags);
-4
@@ -28,10 +28,8 @@ extern unsigned long __FIXADDR_TOP;
#include <asm/acpi.h>
#include <asm/apicdef.h>
#include <asm/page.h>
#ifdef CONFIG_HIGHMEM
#include <linux/threads.h>
#include <asm/kmap_types.h>
#endif
/*
* Here we define all the compile-time 'special' virtual
@@ -75,10 +73,8 @@ enum fixed_addresses {
#ifdef CONFIG_X86_CYCLONE_TIMER
FIX_CYCLONE_TIMER, /*cyclone timer register*/
#endif
#ifdef CONFIG_HIGHMEM
FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */
FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
#endif
#ifdef CONFIG_PCI_MMCONFIG
FIX_PCIE_MCFG,
#endif
-16
@@ -17,23 +17,7 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
*/
return addr - 1;
}
#ifdef CONFIG_DYNAMIC_FTRACE
extern void ftrace_nmi_enter(void);
extern void ftrace_nmi_exit(void);
#else
static inline void ftrace_nmi_enter(void) { }
static inline void ftrace_nmi_exit(void) { }
#endif
#endif /* __ASSEMBLY__ */
#else /* CONFIG_FUNCTION_TRACER */
#ifndef __ASSEMBLY__
static inline void ftrace_nmi_enter(void) { }
static inline void ftrace_nmi_exit(void) { }
#endif
#endif /* CONFIG_FUNCTION_TRACER */
#endif /* _ASM_X86_FTRACE_H */
+1 -4
@@ -25,14 +25,11 @@
#include <asm/kmap_types.h>
#include <asm/tlbflush.h>
#include <asm/paravirt.h>
#include <asm/fixmap.h>
/* declarations for highmem.c */
extern unsigned long highstart_pfn, highend_pfn;
extern pte_t *kmap_pte;
extern pgprot_t kmap_prot;
extern pte_t *pkmap_page_table;
/*
* Right now we initialize only a single pte table. It can be extended
* easily, subsequent pte tables have to be allocated in one physical
+1 -1
@@ -1,7 +1,7 @@
obj-y := init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \
pat.o pgtable.o gup.o
obj-$(CONFIG_X86_32) += pgtable_32.o
obj-$(CONFIG_X86_32) += pgtable_32.o iomap_32.o
obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
obj-$(CONFIG_X86_PTDUMP) += dump_pagetables.o
+1 -2
@@ -334,7 +334,6 @@ int devmem_is_allowed(unsigned long pagenr)
return 0;
}
#ifdef CONFIG_HIGHMEM
pte_t *kmap_pte;
pgprot_t kmap_prot;
@@ -357,6 +356,7 @@ static void __init kmap_init(void)
kmap_prot = PAGE_KERNEL;
}
#ifdef CONFIG_HIGHMEM
static void __init permanent_kmaps_init(pgd_t *pgd_base)
{
unsigned long vaddr;
@@ -436,7 +436,6 @@ static void __init set_highmem_pages_init(void)
#endif /* !CONFIG_NUMA */
#else
# define kmap_init() do { } while (0)
# define permanent_kmaps_init(pgd_base) do { } while (0)
# define set_highmem_pages_init() do { } while (0)
#endif /* CONFIG_HIGHMEM */
+59
@@ -0,0 +1,59 @@
/*
* Copyright © 2008 Ingo Molnar
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
*/
#include <asm/iomap.h>
#include <linux/module.h>
/* Map 'pfn' using fixed map 'type' and protections 'prot'
*/
void *
iomap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot)
{
enum fixed_addresses idx;
unsigned long vaddr;
pagefault_disable();
idx = type + KM_TYPE_NR*smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
set_pte(kmap_pte-idx, pfn_pte(pfn, prot));
arch_flush_lazy_mmu_mode();
return (void*) vaddr;
}
EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn);
void
iounmap_atomic(void *kvaddr, enum km_type type)
{
unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK;
enum fixed_addresses idx = type + KM_TYPE_NR*smp_processor_id();
/*
* Force other mappings to Oops if they try to access this pte
* without first remapping it. Keeping stale mappings around is also a
* bad idea, in case the page changes cacheability attributes or becomes
* a protected page in a hypervisor.
*/
if (vaddr == __fix_to_virt(FIX_KMAP_BEGIN+idx))
kpte_clear_flush(kmap_pte-idx, vaddr);
arch_flush_lazy_mmu_mode();
pagefault_enable();
}
EXPORT_SYMBOL_GPL(iounmap_atomic);
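Both functions above pick the fixmap slot with the same expression, type + KM_TYPE_NR*smp_processor_id(), so each CPU owns a private run of KM_TYPE_NR slots. A small userspace sketch of that indexing follows; the constants and the name model_kmap_slot are hypothetical and chosen only for illustration, since the kernel defines its own KM_TYPE_NR and NR_CPUS.

```c
/* Assumed values for illustration only; the kernel defines its own. */
#define MODEL_KM_TYPE_NR 4
#define MODEL_NR_CPUS    2

/* Slot index within the per-CPU kmap fixmap range, following the
 * expression used in iomap_atomic_prot_pfn(): type + KM_TYPE_NR * cpu. */
static int model_kmap_slot(int type, int cpu)
{
	return type + MODEL_KM_TYPE_NR * cpu;
}
```

Because the type is always smaller than the per-CPU stride, two CPUs can never land on the same slot, which is what lets the mapping be set up without cross-CPU synchronization.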
+2 -1
@@ -3,13 +3,14 @@
# Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
ccflags-y := -Iinclude/drm
i915-y := i915_drv.o i915_dma.o i915_irq.o i915_mem.o i915_opregion.o \
i915-y := i915_drv.o i915_dma.o i915_irq.o i915_mem.o \
i915_suspend.o \
i915_gem.o \
i915_gem_debug.o \
i915_gem_proc.o \
i915_gem_tiling.o
i915-$(CONFIG_ACPI) += i915_opregion.o
i915-$(CONFIG_COMPAT) += i915_ioc32.o
obj-$(CONFIG_DRM_I915) += i915.o
+1
@@ -960,6 +960,7 @@ struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF(DRM_I915_GEM_SW_FINISH, i915_gem_sw_finish_ioctl, 0),
DRM_IOCTL_DEF(DRM_I915_GEM_SET_TILING, i915_gem_set_tiling, 0),
DRM_IOCTL_DEF(DRM_I915_GEM_GET_TILING, i915_gem_get_tiling, 0),
DRM_IOCTL_DEF(DRM_I915_GEM_GET_APERTURE, i915_gem_get_aperture_ioctl, 0),
};
int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
+12
@@ -31,6 +31,7 @@
#define _I915_DRV_H_
#include "i915_reg.h"
#include <linux/io-mapping.h>
/* General customization:
*/
@@ -246,6 +247,8 @@ typedef struct drm_i915_private {
struct {
struct drm_mm gtt_space;
struct io_mapping *gtt_mapping;
/**
* List of objects currently involved in rendering from the
* ringbuffer.
@@ -502,6 +505,8 @@ int i915_gem_set_tiling(struct drm_device *dev, void *data,
struct drm_file *file_priv);
int i915_gem_get_tiling(struct drm_device *dev, void *data,
struct drm_file *file_priv);
int i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv);
void i915_gem_load(struct drm_device *dev);
int i915_gem_proc_init(struct drm_minor *minor);
void i915_gem_proc_cleanup(struct drm_minor *minor);
@@ -539,11 +544,18 @@ extern int i915_restore_state(struct drm_device *dev);
extern int i915_save_state(struct drm_device *dev);
extern int i915_restore_state(struct drm_device *dev);
#ifdef CONFIG_ACPI
/* i915_opregion.c */
extern int intel_opregion_init(struct drm_device *dev);
extern void intel_opregion_free(struct drm_device *dev);
extern void opregion_asle_intr(struct drm_device *dev);
extern void opregion_enable_asle(struct drm_device *dev);
#else
static inline int intel_opregion_init(struct drm_device *dev) { return 0; }
static inline void intel_opregion_free(struct drm_device *dev) { return; }
static inline void opregion_asle_intr(struct drm_device *dev) { return; }
static inline void opregion_enable_asle(struct drm_device *dev) { return; }
#endif
/**
* Lock test for when it's just for synchronization of ring access.
+102 -94
@@ -79,6 +79,28 @@ i915_gem_init_ioctl(struct drm_device *dev, void *data,
return 0;
}
int
i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
{
drm_i915_private_t *dev_priv = dev->dev_private;
struct drm_i915_gem_get_aperture *args = data;
struct drm_i915_gem_object *obj_priv;
if (!(dev->driver->driver_features & DRIVER_GEM))
return -ENODEV;
args->aper_size = dev->gtt_total;
args->aper_available_size = args->aper_size;
list_for_each_entry(obj_priv, &dev_priv->mm.active_list, list) {
if (obj_priv->pin_count > 0)
args->aper_available_size -= obj_priv->obj->size;
}
return 0;
}
/**
* Creates a new mm object and returns a handle to it.
@@ -171,35 +193,50 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
return 0;
}
/*
* Try to write quickly with an atomic kmap. Return true on success.
*
* If this fails (which includes a partial write), we'll redo the whole
* thing with the slow version.
*
* This is a workaround for the low performance of iounmap (approximate
* 10% cpu cost on normal 3D workloads). kmap_atomic on HIGHMEM kernels
* happens to let us map card memory without taking IPIs. When the vmap
* rework lands we should be able to dump this hack.
/* This is the fast write path which cannot handle
* page faults in the source data
*/
static inline int fast_user_write(unsigned long pfn, char __user *user_data,
int l, int o)
{
#ifdef CONFIG_HIGHMEM
unsigned long unwritten;
char *vaddr_atomic;
vaddr_atomic = kmap_atomic_pfn(pfn, KM_USER0);
#if WATCH_PWRITE
DRM_INFO("pwrite i %d o %d l %d pfn %ld vaddr %p\n",
i, o, l, pfn, vaddr_atomic);
#endif
unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + o, user_data, l);
kunmap_atomic(vaddr_atomic, KM_USER0);
return !unwritten;
#else
static inline int
fast_user_write(struct io_mapping *mapping,
loff_t page_base, int page_offset,
char __user *user_data,
int length)
{
char *vaddr_atomic;
unsigned long unwritten;
vaddr_atomic = io_mapping_map_atomic_wc(mapping, page_base);
unwritten = __copy_from_user_inatomic_nocache(vaddr_atomic + page_offset,
user_data, length);
io_mapping_unmap_atomic(vaddr_atomic);
if (unwritten)
return -EFAULT;
return 0;
}
/* Here's the write path which can sleep for
* page faults
*/
static inline int
slow_user_write(struct io_mapping *mapping,
loff_t page_base, int page_offset,
char __user *user_data,
int length)
{
char __iomem *vaddr;
unsigned long unwritten;
vaddr = io_mapping_map_wc(mapping, page_base);
if (vaddr == NULL)
return -EFAULT;
unwritten = __copy_from_user(vaddr + page_offset,
user_data, length);
io_mapping_unmap(vaddr);
if (unwritten)
return -EFAULT;
return 0;
#endif
}
static int
@@ -208,10 +245,12 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj,
struct drm_file *file_priv)
{
struct drm_i915_gem_object *obj_priv = obj->driver_private;
drm_i915_private_t *dev_priv = dev->dev_private;
ssize_t remain;
loff_t offset;
loff_t offset, page_base;
char __user *user_data;
int ret = 0;
int page_offset, page_length;
int ret;
user_data = (char __user *) (uintptr_t) args->data_ptr;
remain = args->size;
@@ -235,57 +274,37 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct drm_gem_object *obj,
obj_priv->dirty = 1;
while (remain > 0) {
unsigned long pfn;
int i, o, l;
/* Operation in this page
*
* i = page number
* o = offset within page
* l = bytes to copy
* page_base = page offset within aperture
* page_offset = offset within page
* page_length = bytes to copy for this page
*/
i = offset >> PAGE_SHIFT;
o = offset & (PAGE_SIZE-1);
l = remain;
if ((o + l) > PAGE_SIZE)
l = PAGE_SIZE - o;
page_base = (offset & ~(PAGE_SIZE-1));
page_offset = offset & (PAGE_SIZE-1);
page_length = remain;
if ((page_offset + remain) > PAGE_SIZE)
page_length = PAGE_SIZE - page_offset;
pfn = (dev->agp->base >> PAGE_SHIFT) + i;
ret = fast_user_write (dev_priv->mm.gtt_mapping, page_base,
page_offset, user_data, page_length);
if (!fast_user_write(pfn, user_data, l, o)) {
unsigned long unwritten;
char __iomem *vaddr;
vaddr = ioremap_wc(pfn << PAGE_SHIFT, PAGE_SIZE);
#if WATCH_PWRITE
DRM_INFO("pwrite slow i %d o %d l %d "
"pfn %ld vaddr %p\n",
i, o, l, pfn, vaddr);
#endif
if (vaddr == NULL) {
ret = -EFAULT;
/* If we get a fault while copying data, then (presumably) our
* source page isn't available. In this case, use the
* non-atomic function
*/
if (ret) {
ret = slow_user_write (dev_priv->mm.gtt_mapping,
page_base, page_offset,
user_data, page_length);
if (ret)
goto fail;
}
unwritten = __copy_from_user(vaddr + o, user_data, l);
#if WATCH_PWRITE
DRM_INFO("unwritten %ld\n", unwritten);
#endif
iounmap(vaddr);
if (unwritten) {
ret = -EFAULT;
goto fail;
}
}
remain -= l;
user_data += l;
offset += l;
remain -= page_length;
user_data += page_length;
offset += page_length;
}
#if WATCH_PWRITE && 1
i915_gem_clflush_object(obj);
i915_gem_dump_object(obj, args->offset + args->size, __func__, ~0);
i915_gem_clflush_object(obj);
#endif
fail:
i915_gem_object_unpin(obj);
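The page_base / page_offset / page_length arithmetic in the loop above can be tried in isolation. This is a userspace sketch under stated assumptions, not driver code: model_chunk and model_split are hypothetical names, and the 4096-byte page size is assumed.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size */

/* One per-page piece of a larger write. */
struct model_chunk {
	unsigned long page_base;	/* page-aligned offset within the aperture */
	unsigned long page_offset;	/* offset within that page */
	unsigned long page_length;	/* bytes to copy for this page */
};

/* Split [offset, offset + remain) into per-page chunks, mirroring the
 * arithmetic in the pwrite loop. Stores at most max_chunks entries in
 * out and returns the number of chunks the range needs. */
static size_t model_split(unsigned long offset, unsigned long remain,
			  struct model_chunk *out, size_t max_chunks)
{
	size_t n = 0;

	while (remain > 0) {
		struct model_chunk c;

		c.page_base = offset & ~(MODEL_PAGE_SIZE - 1);
		c.page_offset = offset & (MODEL_PAGE_SIZE - 1);
		c.page_length = remain;
		if (c.page_offset + remain > MODEL_PAGE_SIZE)
			c.page_length = MODEL_PAGE_SIZE - c.page_offset;

		if (n < max_chunks)
			out[n] = c;
		n++;

		remain -= c.page_length;
		offset += c.page_length;
	}
	return n;
}
```

For example, a 5000-byte write starting at offset 4000 splits into three chunks of 96, 4096 and 808 bytes, each of which touches exactly one page and so fits a single atomic mapping.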
@@ -1503,12 +1522,12 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj,
struct drm_i915_gem_exec_object *entry)
{
struct drm_device *dev = obj->dev;
drm_i915_private_t *dev_priv = dev->dev_private;
struct drm_i915_gem_relocation_entry reloc;
struct drm_i915_gem_relocation_entry __user *relocs;
struct drm_i915_gem_object *obj_priv = obj->driver_private;
int i, ret;
uint32_t last_reloc_offset = -1;
void __iomem *reloc_page = NULL;
void __iomem *reloc_page;
/* Choose the GTT offset for our buffer and put it there. */
ret = i915_gem_object_pin(obj, (uint32_t) entry->alignment);
@@ -1631,26 +1650,11 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj,
* perform.
*/
reloc_offset = obj_priv->gtt_offset + reloc.offset;
if (reloc_page == NULL ||
(last_reloc_offset & ~(PAGE_SIZE - 1)) !=
(reloc_offset & ~(PAGE_SIZE - 1))) {
if (reloc_page != NULL)
iounmap(reloc_page);
reloc_page = ioremap_wc(dev->agp->base +
(reloc_offset &
~(PAGE_SIZE - 1)),
PAGE_SIZE);
last_reloc_offset = reloc_offset;
if (reloc_page == NULL) {
drm_gem_object_unreference(target_obj);
i915_gem_object_unpin(obj);
return -ENOMEM;
}
}
reloc_page = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
(reloc_offset &
~(PAGE_SIZE - 1)));
reloc_entry = (uint32_t __iomem *)(reloc_page +
(reloc_offset & (PAGE_SIZE - 1)));
(reloc_offset & (PAGE_SIZE - 1)));
reloc_val = target_obj_priv->gtt_offset + reloc.delta;
#if WATCH_BUF
@@ -1659,6 +1663,7 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj,
readl(reloc_entry), reloc_val);
#endif
writel(reloc_val, reloc_entry);
io_mapping_unmap_atomic(reloc_page);
/* Write the updated presumed offset for this entry back out
* to the user.
@@ -1674,9 +1679,6 @@ i915_gem_object_pin_and_relocate(struct drm_gem_object *obj,
drm_gem_object_unreference(target_obj);
}
if (reloc_page != NULL)
iounmap(reloc_page);
#if WATCH_BUF
if (0)
i915_gem_dump_object(obj, 128, __func__, ~0);
@@ -2518,6 +2520,10 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
if (ret != 0)
return ret;
dev_priv->mm.gtt_mapping = io_mapping_create_wc(dev->agp->base,
dev->agp->agp_info.aper_size
* 1024 * 1024);
mutex_lock(&dev->struct_mutex);
BUG_ON(!list_empty(&dev_priv->mm.active_list));
BUG_ON(!list_empty(&dev_priv->mm.flushing_list));
@@ -2535,11 +2541,13 @@ int
i915_gem_leavevt_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
{
drm_i915_private_t *dev_priv = dev->dev_private;
int ret;
ret = i915_gem_idle(dev);
drm_irq_uninstall(dev);
io_mapping_free(dev_priv->mm.gtt_mapping);
return ret;
}
+8 -7
@@ -653,15 +653,16 @@ static void radeon_cp_init_ring_buffer(struct drm_device * dev,
RADEON_WRITE(RADEON_SCRATCH_UMSK, 0x7);
/* Turn on bus mastering */
if (((dev_priv->flags & RADEON_FAMILY_MASK) == CHIP_RS400) ||
((dev_priv->flags & RADEON_FAMILY_MASK) == CHIP_RS690) ||
if (((dev_priv->flags & RADEON_FAMILY_MASK) == CHIP_RS690) ||
((dev_priv->flags & RADEON_FAMILY_MASK) == CHIP_RS740)) {
/* rs400, rs690/rs740 */
tmp = RADEON_READ(RADEON_BUS_CNTL) & ~RS400_BUS_MASTER_DIS;
/* rs600/rs690/rs740 */
tmp = RADEON_READ(RADEON_BUS_CNTL) & ~RS600_BUS_MASTER_DIS;
RADEON_WRITE(RADEON_BUS_CNTL, tmp);
} else if (!(((dev_priv->flags & RADEON_FAMILY_MASK) == CHIP_RV380) ||
((dev_priv->flags & RADEON_FAMILY_MASK) >= CHIP_R423))) {
/* r1xx, r2xx, r300, r(v)350, r420/r481, rs480 */
} else if (((dev_priv->flags & RADEON_FAMILY_MASK) <= CHIP_RV350) ||
((dev_priv->flags & RADEON_FAMILY_MASK) == CHIP_R420) ||
((dev_priv->flags & RADEON_FAMILY_MASK) == CHIP_RS400) ||
((dev_priv->flags & RADEON_FAMILY_MASK) == CHIP_RS480)) {
/* r1xx, r2xx, r300, r(v)350, r420/r481, rs400/rs480 */
tmp = RADEON_READ(RADEON_BUS_CNTL) & ~RADEON_BUS_MASTER_DIS;
RADEON_WRITE(RADEON_BUS_CNTL, tmp);
} /* PCIE cards appears to not need this */
+6 -6
@@ -447,12 +447,12 @@ extern int r300_do_cp_cmdbuf(struct drm_device *dev,
* handling, not bus mastering itself.
*/
#define RADEON_BUS_CNTL 0x0030
/* r1xx, r2xx, r300, r(v)350, r420/r481, rs480 */
/* r1xx, r2xx, r300, r(v)350, r420/r481, rs400/rs480 */
# define RADEON_BUS_MASTER_DIS (1 << 6)
/* rs400, rs690/rs740 */
# define RS400_BUS_MASTER_DIS (1 << 14)
# define RS400_MSI_REARM (1 << 20)
/* see RS480_MSI_REARM in AIC_CNTL for rs480 */
/* rs600/rs690/rs740 */
# define RS600_BUS_MASTER_DIS (1 << 14)
# define RS600_MSI_REARM (1 << 20)
/* see RS400_MSI_REARM in AIC_CNTL for rs480 */
#define RADEON_BUS_CNTL1 0x0034
# define RADEON_PMI_BM_DIS (1 << 2)
@@ -937,7 +937,7 @@ extern int r300_do_cp_cmdbuf(struct drm_device *dev,
#define RADEON_AIC_CNTL 0x01d0
# define RADEON_PCIGART_TRANSLATE_EN (1 << 0)
# define RS480_MSI_REARM (1 << 3)
# define RS400_MSI_REARM (1 << 3)
#define RADEON_AIC_STAT 0x01d4
#define RADEON_AIC_PT_BASE 0x01d8
#define RADEON_AIC_LO_ADDR 0x01dc

Some files were not shown because too many files have changed in this diff.