Merge branch 'linus' into x86/apic-cleanups

Conflicts:
	arch/x86/include/asm/io_apic.h

Merge reason: Resolve the conflict, update to a more recent -rc base

Signed-off-by: Ingo Molnar <mingo@elte.hu>
This commit is contained in:
Ingo Molnar
2011-01-07 14:14:15 +01:00
1847 changed files with 32544 additions and 18048 deletions
-2
View File
@@ -2365,8 +2365,6 @@ E: acme@redhat.com
W: http://oops.ghostprotocols.net:81/blog/
P: 1024D/9224DF01 D5DF E3BB E3C8 BCBB F8AD 841A B6AB 4681 9224 DF01
D: IPX, LLC, DCCP, cyc2x, wl3501_cs, net/ hacks
S: R. Brasílio Itiberê, 4270/1010 - Água Verde
S: 80240-060 - Curitiba - Paraná
S: Brazil
N: Karsten Merker
+83
View File
@@ -0,0 +1,83 @@
What: /sys/bus/rbd/
Date: November 2010
Contact: Yehuda Sadeh <yehuda@hq.newdream.net>,
Sage Weil <sage@newdream.net>
Description:
Being used for adding and removing rbd block devices.
Usage: <mon ip addr> <options> <pool name> <rbd image name> [snap name]
$ echo "192.168.0.1 name=admin rbd foo" > /sys/bus/rbd/add
The snapshot name can be "-" or omitted to map the image read/write. A <dev-id>
will be assigned for any registered block device. If snapshot is used, it will
be mapped read-only.
Removal of a device:
$ echo <dev-id> > /sys/bus/rbd/remove
Entries under /sys/bus/rbd/devices/<dev-id>/
--------------------------------------------
client_id
The ceph unique client id that was assigned for this specific session.
major
The block device major number.
name
The name of the rbd image.
pool
The pool where this rbd image resides. The pool-name pair is unique
per rados system.
size
The size (in bytes) of the mapped block device.
refresh
Writing to this file will reread the image header data and set
all relevant datastructures accordingly.
current_snap
The current snapshot for which the device is mapped.
create_snap
Create a snapshot:
$ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_create
rollback_snap
Rolls back data to the specified snapshot. This goes over the entire
list of rados blocks and sends a rollback command to each.
$ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_rollback
snap_*
A directory per each snapshot
Entries under /sys/bus/rbd/devices/<dev-id>/snap_<snap-name>
-------------------------------------------------------------
id
The rados internal snapshot id assigned for this snapshot
size
The size of the image when this snapshot was taken.
@@ -47,6 +47,20 @@ Date: January 2007
KernelVersion: 2.6.20
Contact: "Corentin Chary" <corentincj@iksaif.net>
Description:
Control the bluetooth device. 1 means on, 0 means off.
Control the wlan device. 1 means on, 0 means off.
This may control the led, the device or both.
Users: Lapsus
What: /sys/devices/platform/asus_laptop/wimax
Date: October 2010
KernelVersion: 2.6.37
Contact: "Corentin Chary" <corentincj@iksaif.net>
Description:
Control the wimax device. 1 means on, 0 means off.
What: /sys/devices/platform/asus_laptop/wwan
Date: October 2010
KernelVersion: 2.6.37
Contact: "Corentin Chary" <corentincj@iksaif.net>
Description:
Control the wwan (3G) device. 1 means on, 0 means off.
@@ -0,0 +1,10 @@
What: /sys/devices/platform/eeepc-wmi/cpufv
Date: Oct 2010
KernelVersion: 2.6.37
Contact: "Corentin Chary" <corentincj@iksaif.net>
Description:
Change CPU clock configuration (write-only).
There are three available clock configuration:
* 0 -> Super Performance Mode
* 1 -> High Performance Mode
* 2 -> Power Saving Mode
-4
View File
@@ -79,10 +79,6 @@
</sect2>
</sect1>
</chapter>
<chapter id="clk">
<title>Clock Framework Extensions</title>
!Iinclude/linux/sh_clk.h
</chapter>
<chapter id="mach">
<title>Machine Specific Interfaces</title>
<sect1 id="dreamcast">
+3 -3
View File
@@ -16,7 +16,7 @@
</orgname>
<address>
<email>hjk@linutronix.de</email>
<email>hjk@hansjkoch.de</email>
</address>
</affiliation>
</author>
@@ -114,7 +114,7 @@ GPL version 2.
<para>If you know of any translations for this document, or you are
interested in translating it, please email me
<email>hjk@linutronix.de</email>.
<email>hjk@hansjkoch.de</email>.
</para>
</sect1>
@@ -171,7 +171,7 @@ interested in translating it, please email me
<title>Feedback</title>
<para>Find something wrong with this document? (Or perhaps something
right?) I would love to hear from you. Please email me at
<email>hjk@linutronix.de</email>.</para>
<email>hjk@hansjkoch.de</email>.</para>
</sect1>
</chapter>
+128 -16
View File
@@ -1,18 +1,22 @@
CONFIG_RCU_TRACE debugfs Files and Formats
The rcutree implementation of RCU provides debugfs trace output that
summarizes counters and state. This information is useful for debugging
RCU itself, and can sometimes also help to debug abuses of RCU.
The following sections describe the debugfs files and formats.
The rcutree and rcutiny implementations of RCU provide debugfs trace
output that summarizes counters and state. This information is useful for
debugging RCU itself, and can sometimes also help to debug abuses of RCU.
The following sections describe the debugfs files and formats, first
for rcutree and next for rcutiny.
Hierarchical RCU debugfs Files and Formats
CONFIG_TREE_RCU and CONFIG_TREE_PREEMPT_RCU debugfs Files and Formats
This implementation of RCU provides three debugfs files under the
These implementations of RCU provides five debugfs files under the
top-level directory RCU: rcu/rcudata (which displays fields in struct
rcu_data), rcu/rcugp (which displays grace-period counters), and
rcu/rcuhier (which displays the struct rcu_node hierarchy).
rcu_data), rcu/rcudata.csv (which is a .csv spreadsheet version of
rcu/rcudata), rcu/rcugp (which displays grace-period counters),
rcu/rcuhier (which displays the struct rcu_node hierarchy), and
rcu/rcu_pending (which displays counts of the reasons that the
rcu_pending() function decided that there was core RCU work to do).
The output of "cat rcu/rcudata" looks as follows:
@@ -130,7 +134,8 @@ o "ci" is the number of RCU callbacks that have been invoked for
been registered in absence of CPU-hotplug activity.
o "co" is the number of RCU callbacks that have been orphaned due to
this CPU going offline.
this CPU going offline. These orphaned callbacks have been moved
to an arbitrarily chosen online CPU.
o "ca" is the number of RCU callbacks that have been adopted due to
other CPUs going offline. Note that ci+co-ca+ql is the number of
@@ -168,12 +173,12 @@ o "gpnum" is the number of grace periods that have started. It is
The output of "cat rcu/rcuhier" looks as follows, with very long lines:
c=6902 g=6903 s=2 jfq=3 j=72c7 nfqs=13142/nfqsng=0(13142) fqlh=6 oqlen=0
c=6902 g=6903 s=2 jfq=3 j=72c7 nfqs=13142/nfqsng=0(13142) fqlh=6
1/1 .>. 0:127 ^0
3/3 .>. 0:35 ^0 0/0 .>. 36:71 ^1 0/0 .>. 72:107 ^2 0/0 .>. 108:127 ^3
3/3f .>. 0:5 ^0 2/3 .>. 6:11 ^1 0/0 .>. 12:17 ^2 0/0 .>. 18:23 ^3 0/0 .>. 24:29 ^4 0/0 .>. 30:35 ^5 0/0 .>. 36:41 ^0 0/0 .>. 42:47 ^1 0/0 .>. 48:53 ^2 0/0 .>. 54:59 ^3 0/0 .>. 60:65 ^4 0/0 .>. 66:71 ^5 0/0 .>. 72:77 ^0 0/0 .>. 78:83 ^1 0/0 .>. 84:89 ^2 0/0 .>. 90:95 ^3 0/0 .>. 96:101 ^4 0/0 .>. 102:107 ^5 0/0 .>. 108:113 ^0 0/0 .>. 114:119 ^1 0/0 .>. 120:125 ^2 0/0 .>. 126:127 ^3
rcu_bh:
c=-226 g=-226 s=1 jfq=-5701 j=72c7 nfqs=88/nfqsng=0(88) fqlh=0 oqlen=0
c=-226 g=-226 s=1 jfq=-5701 j=72c7 nfqs=88/nfqsng=0(88) fqlh=0
0/1 .>. 0:127 ^0
0/3 .>. 0:35 ^0 0/0 .>. 36:71 ^1 0/0 .>. 72:107 ^2 0/0 .>. 108:127 ^3
0/3f .>. 0:5 ^0 0/3 .>. 6:11 ^1 0/0 .>. 12:17 ^2 0/0 .>. 18:23 ^3 0/0 .>. 24:29 ^4 0/0 .>. 30:35 ^5 0/0 .>. 36:41 ^0 0/0 .>. 42:47 ^1 0/0 .>. 48:53 ^2 0/0 .>. 54:59 ^3 0/0 .>. 60:65 ^4 0/0 .>. 66:71 ^5 0/0 .>. 72:77 ^0 0/0 .>. 78:83 ^1 0/0 .>. 84:89 ^2 0/0 .>. 90:95 ^3 0/0 .>. 96:101 ^4 0/0 .>. 102:107 ^5 0/0 .>. 108:113 ^0 0/0 .>. 114:119 ^1 0/0 .>. 120:125 ^2 0/0 .>. 126:127 ^3
@@ -212,11 +217,6 @@ o "fqlh" is the number of calls to force_quiescent_state() that
exited immediately (without even being counted in nfqs above)
due to contention on ->fqslock.
o "oqlen" is the number of callbacks on the "orphan" callback
list. RCU callbacks are placed on this list by CPUs going
offline, and are "adopted" either by the CPU helping the outgoing
CPU or by the next rcu_barrier*() call, whichever comes first.
o Each element of the form "1/1 0:127 ^0" represents one struct
rcu_node. Each line represents one level of the hierarchy, from
root to leaves. It is best to think of the rcu_data structures
@@ -326,3 +326,115 @@ o "nn" is the number of times that this CPU needed nothing. Alert
readers will note that the rcu "nn" number for a given CPU very
closely matches the rcu_bh "np" number for that same CPU. This
is due to short-circuit evaluation in rcu_pending().
CONFIG_TINY_RCU and CONFIG_TINY_PREEMPT_RCU debugfs Files and Formats
These implementations of RCU provides a single debugfs file under the
top-level directory RCU, namely rcu/rcudata, which displays fields in
rcu_bh_ctrlblk, rcu_sched_ctrlblk and, for CONFIG_TINY_PREEMPT_RCU,
rcu_preempt_ctrlblk.
The output of "cat rcu/rcudata" is as follows:
rcu_preempt: qlen=24 gp=1097669 g197/p197/c197 tasks=...
ttb=. btg=no ntb=184 neb=0 nnb=183 j=01f7 bt=0274
normal balk: nt=1097669 gt=0 bt=371 b=0 ny=25073378 nos=0
exp balk: bt=0 nos=0
rcu_sched: qlen: 0
rcu_bh: qlen: 0
This is split into rcu_preempt, rcu_sched, and rcu_bh sections, with the
rcu_preempt section appearing only in CONFIG_TINY_PREEMPT_RCU builds.
The last three lines of the rcu_preempt section appear only in
CONFIG_RCU_BOOST kernel builds. The fields are as follows:
o "qlen" is the number of RCU callbacks currently waiting either
for an RCU grace period or waiting to be invoked. This is the
only field present for rcu_sched and rcu_bh, due to the
short-circuiting of grace period in those two cases.
o "gp" is the number of grace periods that have completed.
o "g197/p197/c197" displays the grace-period state, with the
"g" number being the number of grace periods that have started
(mod 256), the "p" number being the number of grace periods
that the CPU has responded to (also mod 256), and the "c"
number being the number of grace periods that have completed
(once again mode 256).
Why have both "gp" and "g"? Because the data flowing into
"gp" is only present in a CONFIG_RCU_TRACE kernel.
o "tasks" is a set of bits. The first bit is "T" if there are
currently tasks that have recently blocked within an RCU
read-side critical section, the second bit is "N" if any of the
aforementioned tasks are blocking the current RCU grace period,
and the third bit is "E" if any of the aforementioned tasks are
blocking the current expedited grace period. Each bit is "."
if the corresponding condition does not hold.
o "ttb" is a single bit. It is "B" if any of the blocked tasks
need to be priority boosted and "." otherwise.
o "btg" indicates whether boosting has been carried out during
the current grace period, with "exp" indicating that boosting
is in progress for an expedited grace period, "no" indicating
that boosting has not yet started for a normal grace period,
"begun" indicating that boosting has bebug for a normal grace
period, and "done" indicating that boosting has completed for
a normal grace period.
o "ntb" is the total number of tasks subjected to RCU priority boosting
periods since boot.
o "neb" is the number of expedited grace periods that have had
to resort to RCU priority boosting since boot.
o "nnb" is the number of normal grace periods that have had
to resort to RCU priority boosting since boot.
o "j" is the low-order 12 bits of the jiffies counter in hexadecimal.
o "bt" is the low-order 12 bits of the value that the jiffies counter
will have at the next time that boosting is scheduled to begin.
o In the line beginning with "normal balk", the fields are as follows:
o "nt" is the number of times that the system balked from
boosting because there were no blocked tasks to boost.
Note that the system will balk from boosting even if the
grace period is overdue when the currently running task
is looping within an RCU read-side critical section.
There is no point in boosting in this case, because
boosting a running task won't make it run any faster.
o "gt" is the number of times that the system balked
from boosting because, although there were blocked tasks,
none of them were preventing the current grace period
from completing.
o "bt" is the number of times that the system balked
from boosting because boosting was already in progress.
o "b" is the number of times that the system balked from
boosting because boosting had already completed for
the grace period in question.
o "ny" is the number of times that the system balked from
boosting because it was not yet time to start boosting
the grace period in question.
o "nos" is the number of times that the system balked from
boosting for inexplicable ("not otherwise specified")
reasons. This can actually happen due to races involving
increments of the jiffies counter.
o In the line beginning with "exp balk", the fields are as follows:
o "bt" is the number of times that the system balked from
boosting because there were no blocked tasks to boost.
o "nos" is the number of times that the system balked from
boosting for inexplicable ("not otherwise specified")
reasons.
+1
View File
@@ -516,6 +516,7 @@ int main(int argc, char *argv[])
default:
fprintf(stderr, "Unknown nla_type %d\n",
na->nla_type);
case TASKSTATS_TYPE_NULL:
break;
}
na = (struct nlattr *) (GENLMSG_DATA(&msg) + len);
+22 -9
View File
@@ -154,7 +154,7 @@ The stages that a patch goes through are, generally:
inclusion, it should be accepted by a relevant subsystem maintainer -
though this acceptance is not a guarantee that the patch will make it
all the way to the mainline. The patch will show up in the maintainer's
subsystem tree and into the staging trees (described below). When the
subsystem tree and into the -next trees (described below). When the
process works, this step leads to more extensive review of the patch and
the discovery of any problems resulting from the integration of this
patch with work being done by others.
@@ -236,7 +236,7 @@ finding the right maintainer. Sending patches directly to Linus is not
normally the right way to go.
2.4: STAGING TREES
2.4: NEXT TREES
The chain of subsystem trees guides the flow of patches into the kernel,
but it also raises an interesting question: what if somebody wants to look
@@ -250,7 +250,7 @@ changes land in the mainline kernel. One could pull changes from all of
the interesting subsystem trees, but that would be a big and error-prone
job.
The answer comes in the form of staging trees, where subsystem trees are
The answer comes in the form of -next trees, where subsystem trees are
collected for testing and review. The older of these trees, maintained by
Andrew Morton, is called "-mm" (for memory management, which is how it got
started). The -mm tree integrates patches from a long list of subsystem
@@ -275,7 +275,7 @@ directory at:
Use of the MMOTM tree is likely to be a frustrating experience, though;
there is a definite chance that it will not even compile.
The other staging tree, started more recently, is linux-next, maintained by
The other -next tree, started more recently, is linux-next, maintained by
Stephen Rothwell. The linux-next tree is, by design, a snapshot of what
the mainline is expected to look like after the next merge window closes.
Linux-next trees are announced on the linux-kernel and linux-next mailing
@@ -303,12 +303,25 @@ volatility of linux-next tends to make it a difficult development target.
See http://lwn.net/Articles/289013/ for more information on this topic, and
stay tuned; much is still in flux where linux-next is involved.
Besides the mmotm and linux-next trees, the kernel source tree now contains
the drivers/staging/ directory and many sub-directories for drivers or
filesystems that are on their way to being added to the kernel tree
proper, but they remain in drivers/staging/ while they still need more
work.
2.4.1: STAGING TREES
The kernel source tree now contains the drivers/staging/ directory, where
many sub-directories for drivers or filesystems that are on their way to
being added to the kernel tree live. They remain in drivers/staging while
they still need more work; once complete, they can be moved into the
kernel proper. This is a way to keep track of drivers that aren't
up to Linux kernel coding or quality standards, but people may want to use
them and track development.
Greg Kroah-Hartman currently (as of 2.6.36) maintains the staging tree.
Drivers that still need work are sent to him, with each driver having
its own subdirectory in drivers/staging/. Along with the driver source
files, a TODO file should be present in the directory as well. The TODO
file lists the pending work that the driver needs for acceptance into
the kernel proper, as well as a list of people that should be Cc'd for any
patches to the driver. Staging drivers that don't currently build should
have their config entries depend upon CONFIG_BROKEN. Once they can
be successfully built without outside patches, CONFIG_BROKEN can be removed.
2.5: TOOLS
+26
View File
@@ -62,6 +62,10 @@ aic7*reg_print.c*
aic7*seq.h*
aicasm
aicdb.h*
altivec1.c
altivec2.c
altivec4.c
altivec8.c
asm-offsets.h
asm_offsets.h
autoconf.h*
@@ -76,6 +80,7 @@ btfixupprep
build
bvmlinux
bzImage*
capflags.c
classlist.h*
comp*.log
compile.h*
@@ -94,6 +99,7 @@ devlist.h*
docproc
elf2ecoff
elfconfig.h*
evergreen_reg_safe.h
fixdep
flask.h
fore200e_mkfirm
@@ -108,9 +114,16 @@ genksyms
*_gray256.c
ihex2fw
ikconfig.h*
inat-tables.c
initramfs_data.cpio
initramfs_data.cpio.gz
initramfs_list
int16.c
int1.c
int2.c
int32.c
int4.c
int8.c
kallsyms
kconfig
keywords.c
@@ -140,6 +153,7 @@ mkprep
mktables
mktree
modpost
modules.builtin
modules.order
modversions.h*
ncscope.*
@@ -153,14 +167,23 @@ pca200e.bin
pca200e_ecd.bin2
piggy.gz
piggyback
piggy.S
pnmtologo
ppc_defs.h*
pss_boot.h
qconf
r100_reg_safe.h
r200_reg_safe.h
r300_reg_safe.h
r420_reg_safe.h
r600_reg_safe.h
raid6altivec*.c
raid6int*.c
raid6tables.c
relocs
rn50_reg_safe.h
rs600_reg_safe.h
rv515_reg_safe.h
series
setup
setup.bin
@@ -169,6 +192,7 @@ sImage
sm_tbl*
split-include
syscalltab.h
tables.c
tags
tftpboot.img
timeconst.h
@@ -190,6 +214,7 @@ vmlinux
vmlinux-*
vmlinux.aout
vmlinux.lds
voffset.h
vsyscall.lds
vsyscall_32.lds
wanxlfw.inc
@@ -200,3 +225,4 @@ wakeup.elf
wakeup.lds
zImage*
zconf.hash.c
zoffset.h
-129
View File
@@ -1,129 +0,0 @@
Device Interfaces
Introduction
~~~~~~~~~~~~
Device interfaces are the logical interfaces of device classes that correlate
directly to userspace interfaces, like device nodes.
Each device class may have multiple interfaces through which you can
access the same device. An input device may support the mouse interface,
the 'evdev' interface, and the touchscreen interface. A SCSI disk would
support the disk interface, the SCSI generic interface, and possibly a raw
device interface.
Device interfaces are registered with the class they belong to. As devices
are added to the class, they are added to each interface registered with
the class. The interface is responsible for determining whether the device
supports the interface or not.
Programming Interface
~~~~~~~~~~~~~~~~~~~~~
struct device_interface {
char * name;
rwlock_t lock;
u32 devnum;
struct device_class * devclass;
struct list_head node;
struct driver_dir_entry dir;
int (*add_device)(struct device *);
int (*add_device)(struct intf_data *);
};
int interface_register(struct device_interface *);
void interface_unregister(struct device_interface *);
An interface must specify the device class it belongs to. It is added
to that class's list of interfaces on registration.
Interfaces can be added to a device class at any time. Whenever it is
added, each device in the class is passed to the interface's
add_device callback. When an interface is removed, each device is
removed from the interface.
Devices
~~~~~~~
Once a device is added to a device class, it is added to each
interface that is registered with the device class. The class
is expected to place a class-specific data structure in
struct device::class_data. The interface can use that (along with
other fields of struct device) to determine whether or not the driver
and/or device support that particular interface.
Data
~~~~
struct intf_data {
struct list_head node;
struct device_interface * intf;
struct device * dev;
u32 intf_num;
};
int interface_add_data(struct interface_data *);
The interface is responsible for allocating and initializing a struct
intf_data and calling interface_add_data() to add it to the device's list
of interfaces it belongs to. This list will be iterated over when the device
is removed from the class (instead of all possible interfaces for a class).
This structure should probably be embedded in whatever per-device data
structure the interface is allocating anyway.
Devices are enumerated within the interface. This happens in interface_add_data()
and the enumerated value is stored in the struct intf_data for that device.
sysfs
~~~~~
Each interface is given a directory in the directory of the device
class it belongs to:
Interfaces get a directory in the class's directory as well:
class/
`-- input
|-- devices
|-- drivers
|-- mouse
`-- evdev
When a device is added to the interface, a symlink is created that points
to the device's directory in the physical hierarchy:
class/
`-- input
|-- devices
| `-- 1 -> ../../../root/pci0/00:1f.0/usb_bus/00:1f.2-1:0/
|-- drivers
| `-- usb:usb_mouse -> ../../../bus/drivers/usb_mouse/
|-- mouse
| `-- 1 -> ../../../root/pci0/00:1f.0/usb_bus/00:1f.2-1:0/
`-- evdev
`-- 1 -> ../../../root/pci0/00:1f.0/usb_bus/00:1f.2-1:0/
Future Plans
~~~~~~~~~~~~
A device interface is correlated directly with a userspace interface
for a device, specifically a device node. For instance, a SCSI disk
exposes at least two interfaces to userspace: the standard SCSI disk
interface and the SCSI generic interface. It might also export a raw
device interface.
Many interfaces have a major number associated with them and each
device gets a minor number. Or, multiple interfaces might share one
major number, and each will receive a range of minor numbers (like in
the case of input devices).
These major and minor numbers could be stored in the interface
structure. Major and minor allocations could happen when the interface
is registered with the class, or via a helper function.
+4 -4
View File
@@ -196,7 +196,7 @@ csrow3.
The representation of the above is reflected in the directory tree
in EDAC's sysfs interface. Starting in directory
/sys/devices/system/edac/mc each memory controller will be represented
by its own 'mcX' directory, where 'X" is the index of the MC.
by its own 'mcX' directory, where 'X' is the index of the MC.
..../edac/mc/
@@ -207,7 +207,7 @@ by its own 'mcX' directory, where 'X" is the index of the MC.
....
Under each 'mcX' directory each 'csrowX' is again represented by a
'csrowX', where 'X" is the csrow index:
'csrowX', where 'X' is the csrow index:
.../mc/mc0/
@@ -232,7 +232,7 @@ EDAC control and attribute files.
In 'mcX' directories are EDAC control and attribute files for
this 'X" instance of the memory controllers:
this 'X' instance of the memory controllers:
Counter reset control file:
@@ -343,7 +343,7 @@ Sdram memory scrubbing rate:
'csrowX' DIRECTORIES
In the 'csrowX' directories are EDAC control and attribute files for
this 'X" instance of csrow:
this 'X' instance of csrow:
Total Uncorrectable Errors count attribute file:
+25 -7
View File
@@ -4,33 +4,41 @@ please mail me.
Geert Uytterhoeven <geert@linux-m68k.org>
00-INDEX
- this file
- this file.
arkfb.txt
- info on the fbdev driver for ARK Logic chips.
aty128fb.txt
- info on the ATI Rage128 frame buffer driver.
cirrusfb.txt
- info on the driver for Cirrus Logic chipsets.
cmap_xfbdev.txt
- an introduction to fbdev's cmap structures.
deferred_io.txt
- an introduction to deferred IO.
efifb.txt
- info on the EFI platform driver for Intel based Apple computers.
ep93xx-fb.txt
- info on the driver for EP93xx LCD controller.
fbcon.txt
- intro to and usage guide for the framebuffer console (fbcon).
framebuffer.txt
- introduction to frame buffer devices.
imacfb.txt
- info on the generic EFI platform driver for Intel based Macs.
gxfb.txt
- info on the framebuffer driver for AMD Geode GX2 based processors.
intel810.txt
- documentation for the Intel 810/815 framebuffer driver.
intelfb.txt
- docs for Intel 830M/845G/852GM/855GM/865G/915G/945G fb driver.
internals.txt
- quick overview of frame buffer device internals.
lxfb.txt
- info on the framebuffer driver for AMD Geode LX based processors.
matroxfb.txt
- info on the Matrox framebuffer driver for Alpha, Intel and PPC.
metronomefb.txt
- info on the driver for the Metronome display controller.
modedb.txt
- info on the video mode database.
matroxfb.txt
- info on the Matrox frame buffer driver.
pvr2fb.txt
- info on the PowerVR 2 frame buffer driver.
pxafb.txt
@@ -39,13 +47,23 @@ s3fb.txt
- info on the fbdev driver for S3 Trio/Virge chips.
sa1100fb.txt
- information about the driver for the SA-1100 LCD controller.
sh7760fb.txt
- info on the SH7760/SH7763 integrated LCDC Framebuffer driver.
sisfb.txt
- info on the framebuffer device driver for various SiS chips.
sstfb.txt
- info on the frame buffer driver for 3dfx' Voodoo Graphics boards.
tgafb.txt
- info on the TGA (DECChip 21030) frame buffer driver
- info on the TGA (DECChip 21030) frame buffer driver.
tridentfb.txt
info on the framebuffer driver for some Trident chip based cards.
uvesafb.txt
- info on the userspace VESA (VBE2+ compliant) frame buffer device.
vesafb.txt
- info on the VESA frame buffer device
- info on the VESA frame buffer device.
viafb.modes
- list of modes for VIA Integration Graphic Chip.
viafb.txt
- info on the VIA Integration Graphic Chip console framebuffer driver.
vt8623fb.txt
- info on the fb driver for the graphics core in VIA VT8623 chipsets.
+104 -111
View File
@@ -18,7 +18,6 @@ prototypes:
char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen);
locking rules:
none have BKL
dcache_lock rename_lock ->d_lock may block
d_revalidate: no no no yes
d_hash no no no yes
@@ -42,18 +41,23 @@ ata *);
int (*rename) (struct inode *, struct dentry *,
struct inode *, struct dentry *);
int (*readlink) (struct dentry *, char __user *,int);
int (*follow_link) (struct dentry *, struct nameidata *);
void * (*follow_link) (struct dentry *, struct nameidata *);
void (*put_link) (struct dentry *, struct nameidata *, void *);
void (*truncate) (struct inode *);
int (*permission) (struct inode *, int, struct nameidata *);
int (*check_acl)(struct inode *, int);
int (*setattr) (struct dentry *, struct iattr *);
int (*getattr) (struct vfsmount *, struct dentry *, struct kstat *);
int (*setxattr) (struct dentry *, const char *,const void *,size_t,int);
ssize_t (*getxattr) (struct dentry *, const char *, void *, size_t);
ssize_t (*listxattr) (struct dentry *, char *, size_t);
int (*removexattr) (struct dentry *, const char *);
void (*truncate_range)(struct inode *, loff_t, loff_t);
long (*fallocate)(struct inode *inode, int mode, loff_t offset, loff_t len);
int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start, u64 len);
locking rules:
all may block, none have BKL
all may block
i_mutex(inode)
lookup: yes
create: yes
@@ -66,19 +70,24 @@ rmdir: yes (both) (see below)
rename: yes (all) (see below)
readlink: no
follow_link: no
put_link: no
truncate: yes (see below)
setattr: yes
permission: no
check_acl: no
getattr: no
setxattr: yes
getxattr: no
listxattr: no
removexattr: yes
truncate_range: yes
fallocate: no
fiemap: no
Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_mutex on
victim.
cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem.
->truncate() is never called directly - it's a callback, not a
method. It's called by vmtruncate() - library function normally used by
method. It's called by vmtruncate() - deprecated library function used by
->setattr(). Locking information above applies to that call (i.e. is
inherited from ->setattr() - vmtruncate() is used when ATTR_SIZE had been
passed).
@@ -91,7 +100,7 @@ prototypes:
struct inode *(*alloc_inode)(struct super_block *sb);
void (*destroy_inode)(struct inode *);
void (*dirty_inode) (struct inode *);
int (*write_inode) (struct inode *, int);
int (*write_inode) (struct inode *, struct writeback_control *wbc);
int (*drop_inode) (struct inode *);
void (*evict_inode) (struct inode *);
void (*put_super) (struct super_block *);
@@ -105,10 +114,10 @@ prototypes:
int (*show_options)(struct seq_file *, struct vfsmount *);
ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
locking rules:
All may block [not true, see below]
None have BKL
s_umount
alloc_inode:
destroy_inode:
@@ -127,6 +136,7 @@ umount_begin: no
show_options: no (namespace_sem)
quota_read: no (see below)
quota_write: no (see below)
bdev_try_to_free_page: no (see below)
->statfs() has s_umount (shared) when called by ustat(2) (native or
compat), but that's an accident of bad API; s_umount is used to pin
@@ -139,19 +149,25 @@ be the only ones operating on the quota file by the quota code (via
dqio_sem) (unless an admin really wants to screw up something and
writes to quota files with quotas on). For other details about locking
see also dquot_operations section.
->bdev_try_to_free_page is called from the ->releasepage handler of
the block device inode. See there for more details.
--------------------------- file_system_type ---------------------------
prototypes:
int (*get_sb) (struct file_system_type *, int,
const char *, void *, struct vfsmount *);
struct dentry *(*mount) (struct file_system_type *, int,
const char *, void *);
void (*kill_sb) (struct super_block *);
locking rules:
may block BKL
get_sb yes no
kill_sb yes no
may block
get_sb yes
mount yes
kill_sb yes
->get_sb() returns error or 0 with locked superblock attached to the vfsmount
(exclusive on ->s_umount).
->mount() returns ERR_PTR or the root dentry.
->kill_sb() takes a write-locked superblock, does all shutdown work on it,
unlocks and drops the reference.
@@ -173,28 +189,38 @@ prototypes:
sector_t (*bmap)(struct address_space *, sector_t);
int (*invalidatepage) (struct page *, unsigned long);
int (*releasepage) (struct page *, int);
void (*freepage)(struct page *);
int (*direct_IO)(int, struct kiocb *, const struct iovec *iov,
loff_t offset, unsigned long nr_segs);
int (*launder_page) (struct page *);
int (*get_xip_mem)(struct address_space *, pgoff_t, int, void **,
unsigned long *);
int (*migratepage)(struct address_space *, struct page *, struct page *);
int (*launder_page)(struct page *);
int (*is_partially_uptodate)(struct page *, read_descriptor_t *, unsigned long);
int (*error_remove_page)(struct address_space *, struct page *);
locking rules:
All except set_page_dirty may block
All except set_page_dirty and freepage may block
BKL PageLocked(page) i_mutex
writepage: no yes, unlocks (see below)
readpage: no yes, unlocks
sync_page: no maybe
writepages: no
set_page_dirty no no
readpages: no
write_begin: no locks the page yes
write_end: no yes, unlocks yes
perform_write: no n/a yes
bmap: no
invalidatepage: no yes
releasepage: no yes
direct_IO: no
launder_page: no yes
PageLocked(page) i_mutex
writepage: yes, unlocks (see below)
readpage: yes, unlocks
sync_page: maybe
writepages:
set_page_dirty no
readpages:
write_begin: locks the page yes
write_end: yes, unlocks yes
bmap:
invalidatepage: yes
releasepage: yes
freepage: yes
direct_IO:
get_xip_mem: maybe
migratepage: yes (both)
launder_page: yes
is_partially_uptodate: yes
error_remove_page: yes
->write_begin(), ->write_end(), ->sync_page() and ->readpage()
may be called from the request handler (/dev/loop).
@@ -274,9 +300,8 @@ under spinlock (it cannot block) and is sometimes called with the page
not locked.
->bmap() is currently used by legacy ioctl() (FIBMAP) provided by some
filesystems and by the swapper. The latter will eventually go away. All
instances do not actually need the BKL. Please, keep it that way and don't
breed new callers.
filesystems and by the swapper. The latter will eventually go away. Please,
keep it that way and don't breed new callers.
->invalidatepage() is called when the filesystem must attempt to drop
some or all of the buffers from the page when it is being truncated. It
@@ -288,53 +313,46 @@ buffers from the page in preparation for freeing it. It returns zero to
indicate that the buffers are (or may be) freeable. If ->releasepage is zero,
the kernel assumes that the fs has no private interest in the buffers.
->freepage() is called when the kernel is done dropping the page
from the page cache.
->launder_page() may be called prior to releasing a page if
it is still found to be dirty. It returns zero if the page was successfully
cleaned, or an error value if not. Note that in order to prevent the page
getting mapped back in and redirtied, it needs to be kept locked
across the entire operation.
Note: currently almost all instances of address_space methods are
using BKL for internal serialization and that's one of the worst sources
of contention. Normally they are calling library functions (in fs/buffer.c)
and pass foo_get_block() as a callback (on local block-based filesystems,
indeed). BKL is not needed for library stuff and is usually taken by
foo_get_block(). It's an overkill, since block bitmaps can be protected by
internal fs locking and real critical areas are much smaller than the areas
filesystems protect now.
----------------------- file_lock_operations ------------------------------
prototypes:
void (*fl_insert)(struct file_lock *); /* lock insertion callback */
void (*fl_remove)(struct file_lock *); /* lock removal callback */
void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
void (*fl_release_private)(struct file_lock *);
locking rules:
BKL may block
fl_insert: yes no
fl_remove: yes no
fl_copy_lock: yes no
fl_release_private: yes yes
file_lock_lock may block
fl_copy_lock: yes no
fl_release_private: maybe no
----------------------- lock_manager_operations ---------------------------
prototypes:
int (*fl_compare_owner)(struct file_lock *, struct file_lock *);
void (*fl_notify)(struct file_lock *); /* unblock callback */
int (*fl_grant)(struct file_lock *, struct file_lock *, int);
void (*fl_release_private)(struct file_lock *);
void (*fl_break)(struct file_lock *); /* break_lease callback */
int (*fl_mylease)(struct file_lock *, struct file_lock *);
int (*fl_change)(struct file_lock **, int);
locking rules:
BKL may block
fl_compare_owner: yes no
fl_notify: yes no
fl_release_private: yes yes
fl_break: yes no
file_lock_lock may block
fl_compare_owner: yes no
fl_notify: yes no
fl_grant: no no
fl_release_private: maybe no
fl_break: yes no
fl_mylease: yes no
fl_change yes no
Currently only NFSD and NLM provide instances of this class. None of the
them block. If you have out-of-tree instances - please, show up. Locking
in that area will change.
--------------------------- buffer_head -----------------------------------
prototypes:
void (*b_end_io)(struct buffer_head *bh, int uptodate);
@@ -359,17 +377,17 @@ prototypes:
void (*swap_slot_free_notify) (struct block_device *, unsigned long);
locking rules:
BKL bd_mutex
open: no yes
release: no yes
ioctl: no no
compat_ioctl: no no
direct_access: no no
media_changed: no no
unlock_native_capacity: no no
revalidate_disk: no no
getgeo: no no
swap_slot_free_notify: no no (see below)
bd_mutex
open: yes
release: yes
ioctl: no
compat_ioctl: no
direct_access: no
media_changed: no
unlock_native_capacity: no
revalidate_disk: no
getgeo: no
swap_slot_free_notify: no (see below)
media_changed, unlock_native_capacity and revalidate_disk are called only from
check_disk_change().
@@ -408,34 +426,21 @@ prototypes:
unsigned long (*get_unmapped_area)(struct file *, unsigned long,
unsigned long, unsigned long, unsigned long);
int (*check_flags)(int);
int (*flock) (struct file *, int, struct file_lock *);
ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *,
size_t, unsigned int);
ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *,
size_t, unsigned int);
int (*setlease)(struct file *, long, struct file_lock **);
};
locking rules:
All may block.
BKL
llseek: no (see below)
read: no
aio_read: no
write: no
aio_write: no
readdir: no
poll: no
unlocked_ioctl: no
compat_ioctl: no
mmap: no
open: no
flush: no
release: no
fsync: no (see below)
aio_fsync: no
fasync: no
lock: yes
readv: no
writev: no
sendfile: no
sendpage: no
get_unmapped_area: no
check_flags: no
All may block except for ->setlease.
No VFS locks held on entry except for ->fsync and ->setlease.
->fsync() has i_mutex on inode.
->setlease has the file_list_lock held and must not sleep.
->llseek() locking has moved from llseek to the individual llseek
implementations. If your fs is not using generic_file_llseek, you
@@ -445,17 +450,10 @@ mutex or just to use i_size_read() instead.
Note: this does not protect the file->f_pos against concurrent modifications
since this is something the userspace has to take care about.
Note: ext2_release() was *the* source of contention on fs-intensive
loads and dropping BKL on ->release() helps to get rid of that (we still
grab BKL for cases when we close a file that had been opened r/w, but that
can and should be done using the internal locking with smaller critical areas).
Current worst offender is ext2_get_block()...
->fasync() is called without BKL protection, and is responsible for
maintaining the FASYNC bit in filp->f_flags. Most instances call
fasync_helper(), which does that maintenance, so it's not normally
something one needs to worry about. Return values > 0 will be mapped to
zero in the VFS layer.
->fasync() is responsible for maintaining the FASYNC bit in filp->f_flags.
Most instances call fasync_helper(), which does that maintenance, so it's
not normally something one needs to worry about. Return values > 0 will be
mapped to zero in the VFS layer.
->readdir() and ->ioctl() on directories must be changed. Ideally we would
move ->readdir() to inode_operations and use a separate method for directory
@@ -466,8 +464,6 @@ components. And there are other reasons why the current interface is a mess...
->read on directories probably must go away - we should just enforce -EISDIR
in sys_read() and friends.
->fsync() has i_mutex on inode.
--------------------------- dquot_operations -------------------------------
prototypes:
int (*write_dquot) (struct dquot *);
@@ -502,12 +498,12 @@ prototypes:
int (*access)(struct vm_area_struct *, unsigned long, void*, int, int);
locking rules:
BKL mmap_sem PageLocked(page)
open: no yes
close: no yes
fault: no yes can return with page locked
page_mkwrite: no yes can return with page locked
access: no yes
mmap_sem PageLocked(page)
open: yes
close: yes
fault: yes can return with page locked
page_mkwrite: yes can return with page locked
access: yes
->fault() is called when a previously not present pte is about
to be faulted in. The filesystem must find and return the page associated
@@ -534,6 +530,3 @@ VM_IO | VM_PFNMAP VMAs.
(if you break something or notice that it is broken and do not fix it yourself
- at least put it here)
ipc/shm.c::shm_delete() - may need BKL.
->read() and ->write() in many drivers are (probably) missing BKL.
@@ -89,7 +89,7 @@ static ssize_t childless_storeme_write(struct childless *childless,
char *p = (char *) page;
tmp = simple_strtoul(p, &p, 10);
if (!p || (*p && (*p != '\n')))
if ((*p != '\0') && (*p != '\n'))
return -EINVAL;
if (tmp > INT_MAX)
+11 -5
View File
@@ -534,6 +534,7 @@ struct address_space_operations {
sector_t (*bmap)(struct address_space *, sector_t);
int (*invalidatepage) (struct page *, unsigned long);
int (*releasepage) (struct page *, int);
void (*freepage)(struct page *);
ssize_t (*direct_IO)(int, struct kiocb *, const struct iovec *iov,
loff_t offset, unsigned long nr_segs);
struct page* (*get_xip_page)(struct address_space *, sector_t,
@@ -660,11 +661,10 @@ struct address_space_operations {
releasepage: releasepage is called on PagePrivate pages to indicate
that the page should be freed if possible. ->releasepage
should remove any private data from the page and clear the
PagePrivate flag. It may also remove the page from the
address_space. If this fails for some reason, it may indicate
failure with a 0 return value.
This is used in two distinct though related cases. The first
is when the VM finds a clean page with no active users and
PagePrivate flag. If releasepage() fails for some reason, it must
indicate failure with a 0 return value.
releasepage() is used in two distinct though related cases. The
first is when the VM finds a clean page with no active users and
wants to make it a free page. If ->releasepage succeeds, the
page will be removed from the address_space and become free.
@@ -679,6 +679,12 @@ struct address_space_operations {
need to ensure this. Possibly it can clear the PageUptodate
bit if it cannot free private data yet.
freepage: freepage is called once the page is no longer visible in
the page cache in order to allow the cleanup of any private
data. Since it may be called by the memory reclaimer, it
should not assume that the original address_space mapping still
exists, and it should not block.
direct_IO: called by the generic read/write routines to perform
direct_IO - that is IO requests which bypass the page cache
and transfer data directly between the storage and the
+10
View File
@@ -617,6 +617,16 @@ and have the following read/write attributes:
is configured as an output, this value may be written;
any nonzero value is treated as high.
If the pin can be configured as interrupt-generating interrupt
and if it has been configured to generate interrupts (see the
description of "edge"), you can poll(2) on that file and
poll(2) will return whenever the interrupt was triggered. If
you use poll(2), set the events POLLPRI and POLLERR. If you
use select(2), set the file descriptor in exceptfds. After
poll(2) returns, either lseek(2) to the beginning of the sysfs
file and read the new value or close the file and re-open it
to read the value.
"edge" ... reads as either "none", "rising", "falling", or
"both". Write these strings to select the signal edge(s)
that will make poll(2) on the "value" file return.
+1 -1
View File
@@ -11,7 +11,7 @@ Authors:
Mark M. Hoffman <mhoffman@lightlink.com>
Ported to 2.6 by Eric J. Bowersox <ericb@aspsys.com>
Adapted to 2.6.20 by Carsten Emde <ce@osadl.org>
Modified for mainline integration by Hans J. Koch <hjk@linutronix.de>
Modified for mainline integration by Hans J. Koch <hjk@hansjkoch.de>
Module Parameters
-----------------
+1 -1
View File
@@ -8,7 +8,7 @@ Supported chips:
Datasheet: http://pdfserv.maxim-ic.com/en/ds/MAX6650-MAX6651.pdf
Authors:
Hans J. Koch <hjk@linutronix.de>
Hans J. Koch <hjk@hansjkoch.de>
John Morris <john.morris@spirentcom.com>
Claus Gindhart <claus.gindhart@kontron.com>
+2 -25
View File
@@ -537,7 +537,7 @@
Notes: Further information in
http://www.oreilly.com/catalog/linuxdrive2/
* Title: "Linux Device Drivers, 3nd Edition"
* Title: "Linux Device Drivers, 3rd Edition"
Authors: Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman
Publisher: O'Reilly & Associates.
Date: 2005.
@@ -592,14 +592,6 @@
Pages: 600.
ISBN: 0-13-101908-2
* Title: "The Design and Implementation of the 4.4 BSD UNIX
Operating System"
Author: Marshall Kirk McKusick, Keith Bostic, Michael J. Karels,
John S. Quarterman.
Publisher: Addison-Wesley.
Date: 1996.
ISBN: 0-201-54979-4
* Title: "Programming for the real world - POSIX.4"
Author: Bill O. Gallmeister.
Publisher: O'Reilly & Associates, Inc..
@@ -610,28 +602,13 @@
POSIX. Good reference.
* Title: "UNIX Systems for Modern Architectures: Symmetric
Multiprocesssing and Caching for Kernel Programmers"
Multiprocessing and Caching for Kernel Programmers"
Author: Curt Schimmel.
Publisher: Addison Wesley.
Date: June, 1994.
Pages: 432.
ISBN: 0-201-63338-8
* Title: "The Design and Implementation of the 4.3 BSD UNIX
Operating System"
Author: Samuel J. Leffler, Marshall Kirk McKusick, Michael J.
Karels, John S. Quarterman.
Publisher: Addison-Wesley.
Date: 1989 (reprinted with corrections on October, 1990).
ISBN: 0-201-06196-1
* Title: "The Design of the UNIX Operating System"
Author: Maurice J. Bach.
Publisher: Prentice Hall.
Date: 1986.
Pages: 471.
ISBN: 0-13-201757-1
MISCELLANEOUS:
* Name: linux/Documentation

Some files were not shown because too many files have changed in this diff Show More