Mirror of https://github.com/linux-apfs/linux-apfs.git (synced 2026-05-01 15:00:59 -07:00)
Merge branch 'linux-2.6' into for-linus
@@ -20,6 +20,7 @@
|
|||||||
# Top-level generic files
|
# Top-level generic files
|
||||||
#
|
#
|
||||||
tags
|
tags
|
||||||
|
TAGS
|
||||||
vmlinux*
|
vmlinux*
|
||||||
System.map
|
System.map
|
||||||
Module.symvers
|
Module.symvers
|
||||||
|
|||||||
@@ -45,7 +45,7 @@ S: Longford, Ireland
|
|||||||
S: Sydney, Australia
|
S: Sydney, Australia
|
||||||
|
|
||||||
N: Tigran A. Aivazian
|
N: Tigran A. Aivazian
|
||||||
E: tigran@veritas.com
|
E: tigran@aivazian.fsnet.co.uk
|
||||||
W: http://www.moses.uklinux.net/patches
|
W: http://www.moses.uklinux.net/patches
|
||||||
D: BFS filesystem
|
D: BFS filesystem
|
||||||
D: Intel IA32 CPU microcode update support
|
D: Intel IA32 CPU microcode update support
|
||||||
@@ -2598,6 +2598,9 @@ S: Ucitelska 1576
|
|||||||
S: Prague 8
|
S: Prague 8
|
||||||
S: 182 00 Czech Republic
|
S: 182 00 Czech Republic
|
||||||
|
|
||||||
|
N: Rick Payne
|
||||||
|
D: RFC2385 Support for TCP
|
||||||
|
|
||||||
N: Barak A. Pearlmutter
|
N: Barak A. Pearlmutter
|
||||||
E: bap@cs.unm.edu
|
E: bap@cs.unm.edu
|
||||||
W: http://www.cs.unm.edu/~bap/
|
W: http://www.cs.unm.edu/~bap/
|
||||||
@@ -3511,14 +3514,12 @@ D: The Linux Support Team Erlangen
|
|||||||
|
|
||||||
N: David Weinehall
|
N: David Weinehall
|
||||||
E: tao@acc.umu.se
|
E: tao@acc.umu.se
|
||||||
|
P: 1024D/DC47CA16 7ACE 0FB0 7A74 F994 9B36 E1D1 D14E 8526 DC47 CA16
|
||||||
W: http://www.acc.umu.se/~tao/
|
W: http://www.acc.umu.se/~tao/
|
||||||
W: http://www.acc.umu.se/~mcalinux/
|
D: v2.0 kernel maintainer
|
||||||
D: Fixes for the NE/2-driver
|
D: Fixes for the NE/2-driver
|
||||||
D: Miscellaneous MCA-support
|
D: Miscellaneous MCA-support
|
||||||
D: Cleanup of the Config-files
|
D: Cleanup of the Config-files
|
||||||
S: Axtorpsvagen 40:20
|
|
||||||
S: S-903 37 UMEA
|
|
||||||
S: Sweden
|
|
||||||
|
|
||||||
N: Matt Welsh
|
N: Matt Welsh
|
||||||
E: mdw@metalab.unc.edu
|
E: mdw@metalab.unc.edu
|
||||||
|
|||||||
@@ -21,7 +21,7 @@ Description:
|
|||||||
these states.
|
these states.
|
||||||
|
|
||||||
What: /sys/power/disk
|
What: /sys/power/disk
|
||||||
Date: August 2006
|
Date: September 2006
|
||||||
Contact: Rafael J. Wysocki <rjw@sisk.pl>
|
Contact: Rafael J. Wysocki <rjw@sisk.pl>
|
||||||
Description:
|
Description:
|
||||||
The /sys/power/disk file controls the operating mode of the
|
The /sys/power/disk file controls the operating mode of the
|
||||||
@@ -39,6 +39,19 @@ Description:
|
|||||||
'reboot' - the memory image will be saved by the kernel and
|
'reboot' - the memory image will be saved by the kernel and
|
||||||
the system will be rebooted.
|
the system will be rebooted.
|
||||||
|
|
||||||
|
Additionally, /sys/power/disk can be used to turn on one of the
|
||||||
|
two testing modes of the suspend-to-disk mechanism: 'testproc'
|
||||||
|
or 'test'. If the suspend-to-disk mechanism is in the
|
||||||
|
'testproc' mode, writing 'disk' to /sys/power/state will cause
|
||||||
|
the kernel to disable nonboot CPUs and freeze tasks, wait for 5
|
||||||
|
seconds, unfreeze tasks and enable nonboot CPUs. If it is in
|
||||||
|
the 'test' mode, writing 'disk' to /sys/power/state will cause
|
||||||
|
the kernel to disable nonboot CPUs and freeze tasks, shrink
|
||||||
|
memory, suspend devices, wait for 5 seconds, resume devices,
|
||||||
|
unfreeze tasks and enable nonboot CPUs. Then, we are able to
|
||||||
|
look in the log messages and work out, for example, which code
|
||||||
|
is being slow and which device drivers are misbehaving.
|
||||||
|
|
||||||
The suspend-to-disk method may be chosen by writing to this
|
The suspend-to-disk method may be chosen by writing to this
|
||||||
file one of the accepted strings:
|
file one of the accepted strings:
|
||||||
|
|
||||||
@@ -46,6 +59,8 @@ Description:
|
|||||||
'platform'
|
'platform'
|
||||||
'shutdown'
|
'shutdown'
|
||||||
'reboot'
|
'reboot'
|
||||||
|
'testproc'
|
||||||
|
'test'
|
||||||
|
|
||||||
It will only change to 'firmware' or 'platform' if the system
|
It will only change to 'firmware' or 'platform' if the system
|
||||||
supports that.
|
supports that.
|
||||||
|
|||||||
@@ -201,7 +201,7 @@ udev
|
|||||||
----
|
----
|
||||||
udev is a userspace application for populating /dev dynamically with
|
udev is a userspace application for populating /dev dynamically with
|
||||||
only entries for devices actually present. udev replaces the basic
|
only entries for devices actually present. udev replaces the basic
|
||||||
functionality of devfs, while allowing persistant device naming for
|
functionality of devfs, while allowing persistent device naming for
|
||||||
devices.
|
devices.
|
||||||
|
|
||||||
FUSE
|
FUSE
|
||||||
|
|||||||
@@ -489,7 +489,7 @@ size is the size of the area (must be multiples of PAGE_SIZE).
|
|||||||
flags can be or'd together and are
|
flags can be or'd together and are
|
||||||
|
|
||||||
DMA_MEMORY_MAP - request that the memory returned from
|
DMA_MEMORY_MAP - request that the memory returned from
|
||||||
dma_alloc_coherent() be directly writeable.
|
dma_alloc_coherent() be directly writable.
|
||||||
|
|
||||||
DMA_MEMORY_IO - request that the memory returned from
|
DMA_MEMORY_IO - request that the memory returned from
|
||||||
dma_alloc_coherent() be addressable using read/write/memcpy_toio etc.
|
dma_alloc_coherent() be addressable using read/write/memcpy_toio etc.
|
||||||
|
|||||||
@@ -110,7 +110,7 @@ lock.
|
|||||||
|
|
||||||
Once the DMA transfer is finished (or timed out) you should disable
|
Once the DMA transfer is finished (or timed out) you should disable
|
||||||
the channel again. You should also check get_dma_residue() to make
|
the channel again. You should also check get_dma_residue() to make
|
||||||
sure that all data has been transfered.
|
sure that all data has been transferred.
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
|
|
||||||
|
|||||||
@@ -9,7 +9,7 @@
|
|||||||
DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \
|
DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \
|
||||||
kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
|
kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
|
||||||
procfs-guide.xml writing_usb_driver.xml \
|
procfs-guide.xml writing_usb_driver.xml \
|
||||||
kernel-api.xml journal-api.xml lsm.xml usb.xml \
|
kernel-api.xml filesystems.xml lsm.xml usb.xml \
|
||||||
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
|
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
|
||||||
genericirq.xml
|
genericirq.xml
|
||||||
|
|
||||||
|
|||||||
@@ -2,9 +2,106 @@
|
|||||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||||
|
|
||||||
<book id="LinuxJBDAPI">
|
<book id="Linux-filesystems-API">
|
||||||
<bookinfo>
|
<bookinfo>
|
||||||
|
<title>Linux Filesystems API</title>
|
||||||
|
|
||||||
|
<legalnotice>
|
||||||
|
<para>
|
||||||
|
This documentation is free software; you can redistribute
|
||||||
|
it and/or modify it under the terms of the GNU General Public
|
||||||
|
License as published by the Free Software Foundation; either
|
||||||
|
version 2 of the License, or (at your option) any later
|
||||||
|
version.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
This program is distributed in the hope that it will be
|
||||||
|
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||||
|
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||||
|
See the GNU General Public License for more details.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
You should have received a copy of the GNU General Public
|
||||||
|
License along with this program; if not, write to the Free
|
||||||
|
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||||
|
MA 02111-1307 USA
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
For more details see the file COPYING in the source
|
||||||
|
distribution of Linux.
|
||||||
|
</para>
|
||||||
|
</legalnotice>
|
||||||
|
</bookinfo>
|
||||||
|
|
||||||
|
<toc></toc>
|
||||||
|
|
||||||
|
<chapter id="vfs">
|
||||||
|
<title>The Linux VFS</title>
|
||||||
|
<sect1><title>The Filesystem types</title>
|
||||||
|
!Iinclude/linux/fs.h
|
||||||
|
</sect1>
|
||||||
|
<sect1><title>The Directory Cache</title>
|
||||||
|
!Efs/dcache.c
|
||||||
|
!Iinclude/linux/dcache.h
|
||||||
|
</sect1>
|
||||||
|
<sect1><title>Inode Handling</title>
|
||||||
|
!Efs/inode.c
|
||||||
|
!Efs/bad_inode.c
|
||||||
|
</sect1>
|
||||||
|
<sect1><title>Registration and Superblocks</title>
|
||||||
|
!Efs/super.c
|
||||||
|
</sect1>
|
||||||
|
<sect1><title>File Locks</title>
|
||||||
|
!Efs/locks.c
|
||||||
|
!Ifs/locks.c
|
||||||
|
</sect1>
|
||||||
|
<sect1><title>Other Functions</title>
|
||||||
|
!Efs/mpage.c
|
||||||
|
!Efs/namei.c
|
||||||
|
!Efs/buffer.c
|
||||||
|
!Efs/bio.c
|
||||||
|
!Efs/seq_file.c
|
||||||
|
!Efs/filesystems.c
|
||||||
|
!Efs/fs-writeback.c
|
||||||
|
!Efs/block_dev.c
|
||||||
|
</sect1>
|
||||||
|
</chapter>
|
||||||
|
|
||||||
|
<chapter id="proc">
|
||||||
|
<title>The proc filesystem</title>
|
||||||
|
|
||||||
|
<sect1><title>sysctl interface</title>
|
||||||
|
!Ekernel/sysctl.c
|
||||||
|
</sect1>
|
||||||
|
|
||||||
|
<sect1><title>proc filesystem interface</title>
|
||||||
|
!Ifs/proc/base.c
|
||||||
|
</sect1>
|
||||||
|
</chapter>
|
||||||
|
|
||||||
|
<chapter id="sysfs">
|
||||||
|
<title>The Filesystem for Exporting Kernel Objects</title>
|
||||||
|
!Efs/sysfs/file.c
|
||||||
|
!Efs/sysfs/symlink.c
|
||||||
|
!Efs/sysfs/bin.c
|
||||||
|
</chapter>
|
||||||
|
|
||||||
|
<chapter id="debugfs">
|
||||||
|
<title>The debugfs filesystem</title>
|
||||||
|
|
||||||
|
<sect1><title>debugfs interface</title>
|
||||||
|
!Efs/debugfs/inode.c
|
||||||
|
!Efs/debugfs/file.c
|
||||||
|
</sect1>
|
||||||
|
</chapter>
|
||||||
|
|
||||||
|
<chapter id="LinuxJDBAPI">
|
||||||
|
<chapterinfo>
|
||||||
<title>The Linux Journalling API</title>
|
<title>The Linux Journalling API</title>
|
||||||
|
|
||||||
<authorgroup>
|
<authorgroup>
|
||||||
<author>
|
<author>
|
||||||
<firstname>Roger</firstname>
|
<firstname>Roger</firstname>
|
||||||
@@ -14,9 +111,9 @@
|
|||||||
<email>rgammans@computer-surgery.co.uk</email>
|
<email>rgammans@computer-surgery.co.uk</email>
|
||||||
</address>
|
</address>
|
||||||
</affiliation>
|
</affiliation>
|
||||||
</author>
|
</author>
|
||||||
</authorgroup>
|
</authorgroup>
|
||||||
|
|
||||||
<authorgroup>
|
<authorgroup>
|
||||||
<author>
|
<author>
|
||||||
<firstname>Stephen</firstname>
|
<firstname>Stephen</firstname>
|
||||||
@@ -33,50 +130,21 @@
|
|||||||
<year>2002</year>
|
<year>2002</year>
|
||||||
<holder>Roger Gammans</holder>
|
<holder>Roger Gammans</holder>
|
||||||
</copyright>
|
</copyright>
|
||||||
|
</chapterinfo>
|
||||||
|
|
||||||
<legalnotice>
|
<title>The Linux Journalling API</title>
|
||||||
<para>
|
|
||||||
This documentation is free software; you can redistribute
|
|
||||||
it and/or modify it under the terms of the GNU General Public
|
|
||||||
License as published by the Free Software Foundation; either
|
|
||||||
version 2 of the License, or (at your option) any later
|
|
||||||
version.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
This program is distributed in the hope that it will be
|
|
||||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
|
||||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
|
||||||
See the GNU General Public License for more details.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
You should have received a copy of the GNU General Public
|
|
||||||
License along with this program; if not, write to the Free
|
|
||||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
|
||||||
MA 02111-1307 USA
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
For more details see the file COPYING in the source
|
|
||||||
distribution of Linux.
|
|
||||||
</para>
|
|
||||||
</legalnotice>
|
|
||||||
</bookinfo>
|
|
||||||
|
|
||||||
<toc></toc>
|
<sect1>
|
||||||
|
|
||||||
<chapter id="Overview">
|
|
||||||
<title>Overview</title>
|
<title>Overview</title>
|
||||||
<sect1>
|
<sect2>
|
||||||
<title>Details</title>
|
<title>Details</title>
|
||||||
<para>
|
<para>
|
||||||
The journalling layer is easy to use. You need to
|
The journalling layer is easy to use. You need to
|
||||||
first of all create a journal_t data structure. There are
|
first of all create a journal_t data structure. There are
|
||||||
two calls to do this dependent on how you decide to allocate the physical
|
two calls to do this dependent on how you decide to allocate the physical
|
||||||
media on which the journal resides. The journal_init_inode() call
|
media on which the journal resides. The journal_init_inode() call
|
||||||
is for journals stored in filesystem inodes, or the journal_init_dev()
|
is for journals stored in filesystem inodes, or the journal_init_dev()
|
||||||
call can be use for journal stored on a raw device (in a continuous range
|
call can be use for journal stored on a raw device (in a continuous range
|
||||||
of blocks). A journal_t is a typedef for a struct pointer, so when
|
of blocks). A journal_t is a typedef for a struct pointer, so when
|
||||||
you are finally finished make sure you call journal_destroy() on it
|
you are finally finished make sure you call journal_destroy() on it
|
||||||
to free up any used kernel memory.
|
to free up any used kernel memory.
|
||||||
@@ -91,27 +159,26 @@ need to call journal_create().
|
|||||||
<para>
|
<para>
|
||||||
Most of the time however your journal file will already have been created, but
|
Most of the time however your journal file will already have been created, but
|
||||||
before you load it you must call journal_wipe() to empty the journal file.
|
before you load it you must call journal_wipe() to empty the journal file.
|
||||||
Hang on, you say , what if the filesystem wasn't cleanly umount()'d . Well, it is the
|
Hang on, you say , what if the filesystem wasn't cleanly umount()'d . Well, it is the
|
||||||
job of the client file system to detect this and skip the call to journal_wipe().
|
job of the client file system to detect this and skip the call to journal_wipe().
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
In either case the next call should be to journal_load() which prepares the
|
In either case the next call should be to journal_load() which prepares the
|
||||||
journal file for use. Note that journal_wipe(..,0) calls journal_skip_recovery()
|
journal file for use. Note that journal_wipe(..,0) calls journal_skip_recovery()
|
||||||
for you if it detects any outstanding transactions in the journal and similarly
|
for you if it detects any outstanding transactions in the journal and similarly
|
||||||
journal_load() will call journal_recover() if necessary.
|
journal_load() will call journal_recover() if necessary.
|
||||||
I would advise reading fs/ext3/super.c for examples on this stage.
|
I would advise reading fs/ext3/super.c for examples on this stage.
|
||||||
[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly
|
[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly
|
||||||
complicate the API. Or isn't a good idea for the journal layer to hide
|
complicate the API. Or isn't a good idea for the journal layer to hide
|
||||||
dirty mounts from the client fs]
|
dirty mounts from the client fs]
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Now you can go ahead and start modifying the underlying
|
Now you can go ahead and start modifying the underlying
|
||||||
filesystem. Almost.
|
filesystem. Almost.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
|
|
||||||
You still need to actually journal your filesystem changes, this
|
You still need to actually journal your filesystem changes, this
|
||||||
@@ -138,10 +205,10 @@ individual buffers (blocks). Before you start to modify a buffer you
|
|||||||
need to call journal_get_{create,write,undo}_access() as appropriate,
|
need to call journal_get_{create,write,undo}_access() as appropriate,
|
||||||
this allows the journalling layer to copy the unmodified data if it
|
this allows the journalling layer to copy the unmodified data if it
|
||||||
needs to. After all the buffer may be part of a previously uncommitted
|
needs to. After all the buffer may be part of a previously uncommitted
|
||||||
transaction.
|
transaction.
|
||||||
At this point you are at last ready to modify a buffer, and once
|
At this point you are at last ready to modify a buffer, and once
|
||||||
you are have done so you need to call journal_dirty_{meta,}data().
|
you are have done so you need to call journal_dirty_{meta,}data().
|
||||||
Or if you've asked for access to a buffer you now know is now longer
|
Or if you've asked for access to a buffer you now know is now longer
|
||||||
required to be pushed back on the device you can call journal_forget()
|
required to be pushed back on the device you can call journal_forget()
|
||||||
in much the same way as you might have used bforget() in the past.
|
in much the same way as you might have used bforget() in the past.
|
||||||
</para>
|
</para>
|
||||||
@@ -156,7 +223,6 @@ Then at umount time , in your put_super() (2.4) or write_super() (2.5)
|
|||||||
you can then call journal_destroy() to clean up your in-core journal object.
|
you can then call journal_destroy() to clean up your in-core journal object.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Unfortunately there a couple of ways the journal layer can cause a deadlock.
|
Unfortunately there a couple of ways the journal layer can cause a deadlock.
|
||||||
The first thing to note is that each task can only have
|
The first thing to note is that each task can only have
|
||||||
@@ -164,19 +230,19 @@ a single outstanding transaction at any one time, remember nothing
|
|||||||
commits until the outermost journal_stop(). This means
|
commits until the outermost journal_stop(). This means
|
||||||
you must complete the transaction at the end of each file/inode/address
|
you must complete the transaction at the end of each file/inode/address
|
||||||
etc. operation you perform, so that the journalling system isn't re-entered
|
etc. operation you perform, so that the journalling system isn't re-entered
|
||||||
on another journal. Since transactions can't be nested/batched
|
on another journal. Since transactions can't be nested/batched
|
||||||
across differing journals, and another filesystem other than
|
across differing journals, and another filesystem other than
|
||||||
yours (say ext3) may be modified in a later syscall.
|
yours (say ext3) may be modified in a later syscall.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
The second case to bear in mind is that journal_start() can
|
The second case to bear in mind is that journal_start() can
|
||||||
block if there isn't enough space in the journal for your transaction
|
block if there isn't enough space in the journal for your transaction
|
||||||
(based on the passed nblocks param) - when it blocks it merely(!) needs to
|
(based on the passed nblocks param) - when it blocks it merely(!) needs to
|
||||||
wait for transactions to complete and be committed from other tasks,
|
wait for transactions to complete and be committed from other tasks,
|
||||||
so essentially we are waiting for journal_stop(). So to avoid
|
so essentially we are waiting for journal_stop(). So to avoid
|
||||||
deadlocks you must treat journal_start/stop() as if they
|
deadlocks you must treat journal_start/stop() as if they
|
||||||
were semaphores and include them in your semaphore ordering rules to prevent
|
were semaphores and include them in your semaphore ordering rules to prevent
|
||||||
deadlocks. Note that journal_extend() has similar blocking behaviour to
|
deadlocks. Note that journal_extend() has similar blocking behaviour to
|
||||||
journal_start() so you can deadlock here just as easily as on journal_start().
|
journal_start() so you can deadlock here just as easily as on journal_start().
|
||||||
</para>
|
</para>
|
||||||
@@ -184,7 +250,7 @@ journal_start() so you can deadlock here just as easily as on journal_start().
|
|||||||
<para>
|
<para>
|
||||||
Try to reserve the right number of blocks the first time. ;-). This will
|
Try to reserve the right number of blocks the first time. ;-). This will
|
||||||
be the maximum number of blocks you are going to touch in this transaction.
|
be the maximum number of blocks you are going to touch in this transaction.
|
||||||
I advise having a look at at least ext3_jbd.h to see the basis on which
|
I advise having a look at at least ext3_jbd.h to see the basis on which
|
||||||
ext3 uses to make these decisions.
|
ext3 uses to make these decisions.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
@@ -193,13 +259,13 @@ Another wriggle to watch out for is your on-disk block allocation strategy.
|
|||||||
why? Because, if you undo a delete, you need to ensure you haven't reused any
|
why? Because, if you undo a delete, you need to ensure you haven't reused any
|
||||||
of the freed blocks in a later transaction. One simple way of doing this
|
of the freed blocks in a later transaction. One simple way of doing this
|
||||||
is make sure any blocks you allocate only have checkpointed transactions
|
is make sure any blocks you allocate only have checkpointed transactions
|
||||||
listed against them. Ext3 does this in ext3_test_allocatable().
|
listed against them. Ext3 does this in ext3_test_allocatable().
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Lock is also providing through journal_{un,}lock_updates(),
|
Lock is also providing through journal_{un,}lock_updates(),
|
||||||
ext3 uses this when it wants a window with a clean and stable fs for a moment.
|
ext3 uses this when it wants a window with a clean and stable fs for a moment.
|
||||||
eg.
|
eg.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
@@ -230,19 +296,19 @@ extend it like this:-
|
|||||||
struct journal_callback for_jbd;
|
struct journal_callback for_jbd;
|
||||||
// Stuff for myfs allocated together.
|
// Stuff for myfs allocated together.
|
||||||
myfs_inode* i_commited;
|
myfs_inode* i_commited;
|
||||||
|
|
||||||
}
|
}
|
||||||
</programlisting>
|
</programlisting>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
this would be useful if you needed to know when data was committed to a
|
this would be useful if you needed to know when data was committed to a
|
||||||
particular inode.
|
particular inode.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
</sect1>
|
</sect2>
|
||||||
|
|
||||||
<sect1>
|
<sect2>
|
||||||
<title>Summary</title>
|
<title>Summary</title>
|
||||||
<para>
|
<para>
|
||||||
Using the journal is a matter of wrapping the different context changes,
|
Using the journal is a matter of wrapping the different context changes,
|
||||||
being each mount, each modification (transaction) and each changed buffer
|
being each mount, each modification (transaction) and each changed buffer
|
||||||
@@ -260,15 +326,15 @@ an example.
|
|||||||
if (clean) journal_wipe();
|
if (clean) journal_wipe();
|
||||||
journal_load();
|
journal_load();
|
||||||
|
|
||||||
foreach(transaction) { /*transactions must be
|
foreach(transaction) { /*transactions must be
|
||||||
completed before
|
completed before
|
||||||
a syscall returns to
|
a syscall returns to
|
||||||
userspace*/
|
userspace*/
|
||||||
|
|
||||||
handle_t * xct=journal_start(my_jnrl);
|
handle_t * xct=journal_start(my_jnrl);
|
||||||
foreach(bh) {
|
foreach(bh) {
|
||||||
journal_get_{create,write,undo}_access(xact,bh);
|
journal_get_{create,write,undo}_access(xact,bh);
|
||||||
if ( myfs_modify(bh) ) { /* returns true
|
if ( myfs_modify(bh) ) { /* returns true
|
||||||
if makes changes */
|
if makes changes */
|
||||||
journal_dirty_{meta,}data(xact,bh);
|
journal_dirty_{meta,}data(xact,bh);
|
||||||
} else {
|
} else {
|
||||||
@@ -279,55 +345,57 @@ an example.
|
|||||||
}
|
}
|
||||||
journal_destroy(my_jrnl);
|
journal_destroy(my_jrnl);
|
||||||
</programlisting>
|
</programlisting>
|
||||||
</sect1>
|
</sect2>
|
||||||
|
|
||||||
</chapter>
|
</sect1>
|
||||||
|
|
||||||
<chapter id="adt">
|
<sect1>
|
||||||
<title>Data Types</title>
|
<title>Data Types</title>
|
||||||
<para>
|
<para>
|
||||||
The journalling layer uses typedefs to 'hide' the concrete definitions
|
The journalling layer uses typedefs to 'hide' the concrete definitions
|
||||||
of the structures used. As a client of the JBD layer you can
|
of the structures used. As a client of the JBD layer you can
|
||||||
just rely on the using the pointer as a magic cookie of some sort.
|
just rely on the using the pointer as a magic cookie of some sort.
|
||||||
|
|
||||||
Obviously the hiding is not enforced as this is 'C'.
|
|
||||||
</para>
|
|
||||||
<sect1><title>Structures</title>
|
|
||||||
!Iinclude/linux/jbd.h
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="calls">
|
Obviously the hiding is not enforced as this is 'C'.
|
||||||
|
</para>
|
||||||
|
<sect2><title>Structures</title>
|
||||||
|
!Iinclude/linux/jbd.h
|
||||||
|
</sect2>
|
||||||
|
</sect1>
|
||||||
|
|
||||||
|
<sect1>
|
||||||
<title>Functions</title>
|
<title>Functions</title>
|
||||||
<para>
|
<para>
|
||||||
The functions here are split into two groups those that
|
The functions here are split into two groups those that
|
||||||
affect a journal as a whole, and those which are used to
|
affect a journal as a whole, and those which are used to
|
||||||
manage transactions
|
manage transactions
|
||||||
</para>
|
</para>
|
||||||
<sect1><title>Journal Level</title>
|
<sect2><title>Journal Level</title>
|
||||||
!Efs/jbd/journal.c
|
!Efs/jbd/journal.c
|
||||||
!Ifs/jbd/recovery.c
|
!Ifs/jbd/recovery.c
|
||||||
</sect1>
|
</sect2>
|
||||||
<sect1><title>Transasction Level</title>
|
<sect2><title>Transasction Level</title>
|
||||||
!Efs/jbd/transaction.c
|
!Efs/jbd/transaction.c
|
||||||
</sect1>
|
</sect2>
|
||||||
</chapter>
|
</sect1>
|
||||||
<chapter>
|
<sect1>
|
||||||
<title>See also</title>
|
<title>See also</title>
|
||||||
<para>
|
<para>
|
||||||
<citation>
|
<citation>
|
||||||
<ulink url="ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/journal-design.ps.gz">
|
<ulink url="ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/journal-design.ps.gz">
|
||||||
Journaling the Linux ext2fs Filesystem,LinuxExpo 98, Stephen Tweedie
|
Journaling the Linux ext2fs Filesystem, LinuxExpo 98, Stephen Tweedie
|
||||||
</ulink>
|
</ulink>
|
||||||
</citation>
|
</citation>
|
||||||
</para>
|
</para>
|
||||||
<para>
|
<para>
|
||||||
<citation>
|
<citation>
|
||||||
<ulink url="http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html">
|
<ulink url="http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html">
|
||||||
Ext3 Journalling FileSystem , OLS 2000, Dr. Stephen Tweedie
|
Ext3 Journalling FileSystem, OLS 2000, Dr. Stephen Tweedie
|
||||||
</ulink>
|
</ulink>
|
||||||
</citation>
|
</citation>
|
||||||
</para>
|
</para>
|
||||||
</chapter>
|
</sect1>
|
||||||
|
|
||||||
|
</chapter>
|
||||||
|
|
||||||
</book>
|
</book>
|
||||||
@@ -182,66 +182,6 @@ X!Ilib/string.c
|
|||||||
</sect1>
|
</sect1>
|
||||||
</chapter>
|
</chapter>
|
||||||
|
|
||||||
<chapter id="vfs">
|
|
||||||
<title>The Linux VFS</title>
|
|
||||||
<sect1><title>The Filesystem types</title>
|
|
||||||
!Iinclude/linux/fs.h
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>The Directory Cache</title>
|
|
||||||
!Efs/dcache.c
|
|
||||||
!Iinclude/linux/dcache.h
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>Inode Handling</title>
|
|
||||||
!Efs/inode.c
|
|
||||||
!Efs/bad_inode.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>Registration and Superblocks</title>
|
|
||||||
!Efs/super.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>File Locks</title>
|
|
||||||
!Efs/locks.c
|
|
||||||
!Ifs/locks.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>Other Functions</title>
|
|
||||||
!Efs/mpage.c
|
|
||||||
!Efs/namei.c
|
|
||||||
!Efs/buffer.c
|
|
||||||
!Efs/bio.c
|
|
||||||
!Efs/seq_file.c
|
|
||||||
!Efs/filesystems.c
|
|
||||||
!Efs/fs-writeback.c
|
|
||||||
!Efs/block_dev.c
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="proc">
|
|
||||||
<title>The proc filesystem</title>
|
|
||||||
|
|
||||||
<sect1><title>sysctl interface</title>
|
|
||||||
!Ekernel/sysctl.c
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1><title>proc filesystem interface</title>
|
|
||||||
!Ifs/proc/base.c
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="sysfs">
|
|
||||||
<title>The Filesystem for Exporting Kernel Objects</title>
|
|
||||||
!Efs/sysfs/file.c
|
|
||||||
!Efs/sysfs/symlink.c
|
|
||||||
!Efs/sysfs/bin.c
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="debugfs">
|
|
||||||
<title>The debugfs filesystem</title>
|
|
||||||
|
|
||||||
<sect1><title>debugfs interface</title>
|
|
||||||
!Efs/debugfs/inode.c
|
|
||||||
!Efs/debugfs/file.c
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="relayfs">
|
<chapter id="relayfs">
|
||||||
<title>relay interface support</title>
|
<title>relay interface support</title>
|
||||||
|
|
||||||
|
|||||||
@@ -345,8 +345,7 @@ static inline void skel_delete (struct usb_skel *dev)
|
|||||||
usb_buffer_free (dev->udev, dev->bulk_out_size,
|
usb_buffer_free (dev->udev, dev->bulk_out_size,
|
||||||
dev->bulk_out_buffer,
|
dev->bulk_out_buffer,
|
||||||
dev->write_urb->transfer_dma);
|
dev->write_urb->transfer_dma);
|
||||||
if (dev->write_urb != NULL)
|
usb_free_urb (dev->write_urb);
|
||||||
usb_free_urb (dev->write_urb);
|
|
||||||
kfree (dev);
|
kfree (dev);
|
||||||
}
|
}
|
||||||
</programlisting>
|
</programlisting>
|
||||||
|
|||||||
@@ -395,6 +395,26 @@ bugme-janitor mailing list (every change in the bugzilla is mailed here)
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Managing bug reports
|
||||||
|
--------------------
|
||||||
|
|
||||||
|
One of the best ways to put into practice your hacking skills is by fixing
|
||||||
|
bugs reported by other people. Not only you will help to make the kernel
|
||||||
|
more stable, you'll learn to fix real world problems and you will improve
|
||||||
|
your skills, and other developers will be aware of your presence. Fixing
|
||||||
|
bugs is one of the best ways to get merits among other developers, because
|
||||||
|
not many people like wasting time fixing other people's bugs.
|
||||||
|
|
||||||
|
To work in the already reported bug reports, go to http://bugzilla.kernel.org.
|
||||||
|
If you want to be advised of the future bug reports, you can subscribe to the
|
||||||
|
bugme-new mailing list (only new bug reports are mailed here) or to the
|
||||||
|
bugme-janitor mailing list (every change in the bugzilla is mailed here)
|
||||||
|
|
||||||
|
http://lists.osdl.org/mailman/listinfo/bugme-new
|
||||||
|
http://lists.osdl.org/mailman/listinfo/bugme-janitors
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Mailing lists
|
Mailing lists
|
||||||
-------------
|
-------------
|
||||||
|
|
||||||
|
|||||||
@@ -219,7 +219,7 @@ into the field vector of each element contained in a second argument.
|
|||||||
Note that the pre-assigned IOAPIC dev->irq is valid only if the device
|
Note that the pre-assigned IOAPIC dev->irq is valid only if the device
|
||||||
operates in PIN-IRQ assertion mode. In MSI-X mode, any attempt at
|
operates in PIN-IRQ assertion mode. In MSI-X mode, any attempt at
|
||||||
using dev->irq by the device driver to request for interrupt service
|
using dev->irq by the device driver to request for interrupt service
|
||||||
may result unpredictabe behavior.
|
may result in unpredictable behavior.
|
||||||
|
|
||||||
For each MSI-X vector granted, a device driver is responsible for calling
|
For each MSI-X vector granted, a device driver is responsible for calling
|
||||||
other functions like request_irq(), enable_irq(), etc. to enable
|
other functions like request_irq(), enable_irq(), etc. to enable
|
||||||
@@ -470,7 +470,68 @@ LOC: 324553 325068
|
|||||||
ERR: 0
|
ERR: 0
|
||||||
MIS: 0
|
MIS: 0
|
||||||
|
|
||||||
6. FAQ
|
6. MSI quirks
|
||||||
|
|
||||||
|
Several PCI chipsets or devices are known to not support MSI.
|
||||||
|
The PCI stack provides 3 possible levels of MSI disabling:
|
||||||
|
* on a single device
|
||||||
|
* on all devices behind a specific bridge
|
||||||
|
* globally
|
||||||
|
|
||||||
|
6.1. Disabling MSI on a single device
|
||||||
|
|
||||||
|
Under some circumstances, it might be required to disable MSI on a
|
||||||
|
single device. It may be achieved by either not calling pci_enable_msi()
|
||||||
|
at all, or setting the pci_dev->no_msi flag beforehand (most of the time
|
||||||
|
in a quirk).
|
||||||
|
|
||||||
|
6.2. Disabling MSI below a bridge
|
||||||
|
|
||||||
|
The vast majority of MSI quirks are required by PCI bridges not
|
||||||
|
being able to route MSI between busses. In this case, MSI have to be
|
||||||
|
disabled on all devices behind this bridge. This is achieved by setting
|
||||||
|
the PCI_BUS_FLAGS_NO_MSI flag in the pci_bus->bus_flags of the bridge
|
||||||
|
subordinate bus. There is no need to set the same flag on bridges that
|
||||||
|
are below the broken bridge. When pci_enable_msi() is called to enable
|
||||||
|
MSI on a device, pci_msi_supported() takes care of checking the NO_MSI
|
||||||
|
flag in all parent busses of the device.
|
||||||
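The parent-bus check described above can be sketched in user-space C. This is only a simplified model: struct sketch_bus, SKETCH_BUS_FLAGS_NO_MSI and sketch_msi_allowed() are hypothetical names invented for illustration, not the kernel's pci_bus or pci_msi_supported(). It shows why setting the flag once on the broken bridge's subordinate bus is enough to cover every bus below it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PCI_BUS_FLAGS_NO_MSI. */
#define SKETCH_BUS_FLAGS_NO_MSI 0x1u

/* Hypothetical, reduced model of a bus: only the fields needed here. */
struct sketch_bus {
	struct sketch_bus *parent;   /* NULL at the root bus */
	unsigned int bus_flags;
};

/*
 * Walk from the device's bus up to the root. MSI is allowed only if
 * no bus on the way carries the NO_MSI flag.
 */
static bool sketch_msi_allowed(const struct sketch_bus *bus)
{
	for (; bus != NULL; bus = bus->parent)
		if (bus->bus_flags & SKETCH_BUS_FLAGS_NO_MSI)
			return false;
	return true;
}
```

Because the walk visits every ancestor, a bus below the flagged one inherits the restriction without carrying the flag itself.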
|
|
||||||
|
Some bridges actually support dynamic MSI support enabling/disabling
|
||||||
|
by changing some bits in their PCI configuration space (especially
|
||||||
|
the Hypertransport chipsets such as the nVidia nForce and Serverworks
|
||||||
|
HT2000). It may then be required to update the NO_MSI flag on the
|
||||||
|
corresponding devices in the sysfs hierarchy. To enable MSI support
|
||||||
|
on device "0000:00:0e", do:
|
||||||
|
|
||||||
|
echo 1 > /sys/bus/pci/devices/0000:00:0e/msi_bus
|
||||||
|
|
||||||
|
To disable MSI support, echo 0 instead of 1. Note that it should be
|
||||||
|
used with caution since changing this value might break interrupts.
|
||||||
|
|
||||||
|
6.3. Disabling MSI globally
|
||||||
|
|
||||||
|
Some extreme cases may require disabling MSI globally on the system.
|
||||||
|
For now, the only known case is a Serverworks PCI-X chipset (MSI are
|
||||||
|
not supported on several busses that are not all connected to the
|
||||||
|
chipset in the Linux PCI hierarchy). In the vast majority of other
|
||||||
|
cases, disabling only behind a specific bridge is enough.
|
||||||
|
|
||||||
|
For debugging purposes, the user may also pass pci=nomsi on the kernel
|
||||||
|
command-line to explicitly disable MSI globally. But, once the appro-
|
||||||
|
priate quirks are added to the kernel, this option should not be
|
||||||
|
required anymore.
|
||||||
|
|
||||||
|
6.4. Finding why MSI cannot be enabled on a device
|
||||||
|
|
||||||
|
Assuming that MSI are not enabled on a device, you should look at
|
||||||
|
dmesg to find messages that quirks may output when disabling MSI
|
||||||
|
on some devices, some bridges or even globally.
|
||||||
|
Then, lspci -t gives the list of bridges above a device. Reading
|
||||||
|
/sys/bus/pci/devices/0000:00:0e/msi_bus will tell you whether MSI
|
||||||
|
are enabled (1) or disabled (0). If 0 is found in a single bridge
|
||||||
|
msi_bus file above the device, MSI cannot be enabled.
|
||||||
|
|
||||||
|
7. FAQ
|
||||||
|
|
||||||
Q1. Are there any limitations on using the MSI?
|
Q1. Are there any limitations on using the MSI?
|
||||||
|
|
||||||
|
|||||||
@@ -49,7 +49,7 @@ __u64 stime, utime;
|
|||||||
}
|
}
|
||||||
|
|
||||||
/* Maximum size of response requested or message sent */
|
/* Maximum size of response requested or message sent */
|
||||||
#define MAX_MSG_SIZE 256
|
#define MAX_MSG_SIZE 1024
|
||||||
/* Maximum number of cpus expected to be specified in a cpumask */
|
/* Maximum number of cpus expected to be specified in a cpumask */
|
||||||
#define MAX_CPUS 32
|
#define MAX_CPUS 32
|
||||||
/* Maximum length of pathname to log file */
|
/* Maximum length of pathname to log file */
|
||||||
|
|||||||
@@ -96,9 +96,9 @@ a) TASKSTATS_TYPE_AGGR_PID/TGID : attribute containing no payload but indicates
|
|||||||
a pid/tgid will be followed by some stats.
|
a pid/tgid will be followed by some stats.
|
||||||
|
|
||||||
b) TASKSTATS_TYPE_PID/TGID: attribute whose payload is the pid/tgid whose stats
|
b) TASKSTATS_TYPE_PID/TGID: attribute whose payload is the pid/tgid whose stats
|
||||||
is being returned.
|
are being returned.
|
||||||
|
|
||||||
c) TASKSTATS_TYPE_STATS: attribute with a struct taskstsats as payload. The
|
c) TASKSTATS_TYPE_STATS: attribute with a struct taskstats as payload. The
|
||||||
same structure is used for both per-pid and per-tgid stats.
|
same structure is used for both per-pid and per-tgid stats.
|
||||||
|
|
||||||
3. New message sent by kernel whenever a task exits. The payload consists of a
|
3. New message sent by kernel whenever a task exits. The payload consists of a
|
||||||
@@ -122,12 +122,12 @@ of atomicity).
|
|||||||
|
|
||||||
However, maintaining per-process, in addition to per-task stats, within the
|
However, maintaining per-process, in addition to per-task stats, within the
|
||||||
kernel has space and time overheads. To address this, the taskstats code
|
kernel has space and time overheads. To address this, the taskstats code
|
||||||
accumalates each exiting task's statistics into a process-wide data structure.
|
accumulates each exiting task's statistics into a process-wide data structure.
|
||||||
When the last task of a process exits, the process level data accumalated also
|
When the last task of a process exits, the process level data accumulated also
|
||||||
gets sent to userspace (along with the per-task data).
|
gets sent to userspace (along with the per-task data).
|
||||||
|
|
||||||
When a user queries to get per-tgid data, the sum of all other live threads in
|
When a user queries to get per-tgid data, the sum of all other live threads in
|
||||||
the group is added up and added to the accumalated total for previously exited
|
the group is added up and added to the accumulated total for previously exited
|
||||||
threads of the same thread group.
|
threads of the same thread group.
|
||||||
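The per-tgid query described above amounts to a simple sum, sketched here in user-space C. The struct and function names (sketch_stats, sketch_add, sketch_tgid_total) and the two fields are hypothetical, chosen for illustration; the real struct taskstats has many more fields and the kernel code is organized differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stats record used only for this sketch. */
struct sketch_stats {
	uint64_t cpu_run_ns;
	uint64_t blkio_count;
};

/* Add every field of src into dst. */
static void sketch_add(struct sketch_stats *dst, const struct sketch_stats *src)
{
	dst->cpu_run_ns += src->cpu_run_ns;
	dst->blkio_count += src->blkio_count;
}

/*
 * Per-tgid total = total already accumulated from exited threads
 *                + sum over the threads that are still alive.
 */
static struct sketch_stats sketch_tgid_total(const struct sketch_stats *exited_total,
					     const struct sketch_stats *live,
					     size_t nlive)
{
	struct sketch_stats total = *exited_total;

	for (size_t i = 0; i < nlive; i++)
		sketch_add(&total, &live[i]);
	return total;
}
```

Keeping only one accumulated record per process for exited threads is what keeps the space overhead small: nothing has to be retained per exited thread.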
|
|
||||||
Extending taskstats
|
Extending taskstats
|
||||||
|
|||||||
@@ -183,7 +183,7 @@ it, the pci dma mapping routines and associated data structures have now been
|
|||||||
modified to accomplish a direct page -> bus translation, without requiring
|
modified to accomplish a direct page -> bus translation, without requiring
|
||||||
a virtual address mapping (unlike the earlier scheme of virtual address
|
a virtual address mapping (unlike the earlier scheme of virtual address
|
||||||
-> bus translation). So this works uniformly for high-memory pages (which
|
-> bus translation). So this works uniformly for high-memory pages (which
|
||||||
do not have a correponding kernel virtual address space mapping) and
|
do not have a corresponding kernel virtual address space mapping) and
|
||||||
low-memory pages.
|
low-memory pages.
|
||||||
|
|
||||||
Note: Please refer to DMA-mapping.txt for a discussion on PCI high mem DMA
|
Note: Please refer to DMA-mapping.txt for a discussion on PCI high mem DMA
|
||||||
@@ -391,7 +391,7 @@ forced such requests to be broken up into small chunks before being passed
|
|||||||
on to the generic block layer, only to be merged by the i/o scheduler
|
on to the generic block layer, only to be merged by the i/o scheduler
|
||||||
when the underlying device was capable of handling the i/o in one shot.
|
when the underlying device was capable of handling the i/o in one shot.
|
||||||
Also, using the buffer head as an i/o structure for i/os that didn't originate
|
Also, using the buffer head as an i/o structure for i/os that didn't originate
|
||||||
from the buffer cache unecessarily added to the weight of the descriptors
|
from the buffer cache unnecessarily added to the weight of the descriptors
|
||||||
which were generated for each such chunk.
|
which were generated for each such chunk.
|
||||||
|
|
||||||
The following were some of the goals and expectations considered in the
|
The following were some of the goals and expectations considered in the
|
||||||
@@ -403,14 +403,14 @@ i. Should be appropriate as a descriptor for both raw and buffered i/o -
|
|||||||
for raw i/o.
|
for raw i/o.
|
||||||
ii. Ability to represent high-memory buffers (which do not have a virtual
|
ii. Ability to represent high-memory buffers (which do not have a virtual
|
||||||
address mapping in kernel address space).
|
address mapping in kernel address space).
|
||||||
iii.Ability to represent large i/os w/o unecessarily breaking them up (i.e
|
iii.Ability to represent large i/os w/o unnecessarily breaking them up (i.e
|
||||||
greater than PAGE_SIZE chunks in one shot)
|
greater than PAGE_SIZE chunks in one shot)
|
||||||
iv. At the same time, ability to retain independent identity of i/os from
|
iv. At the same time, ability to retain independent identity of i/os from
|
||||||
different sources or i/o units requiring individual completion (e.g. for
|
different sources or i/o units requiring individual completion (e.g. for
|
||||||
latency reasons)
|
latency reasons)
|
||||||
v. Ability to represent an i/o involving multiple physical memory segments
|
v. Ability to represent an i/o involving multiple physical memory segments
|
||||||
(including non-page aligned page fragments, as specified via readv/writev)
|
(including non-page aligned page fragments, as specified via readv/writev)
|
||||||
without unecessarily breaking it up, if the underlying device is capable of
|
without unnecessarily breaking it up, if the underlying device is capable of
|
||||||
handling it.
|
handling it.
|
||||||
vi. Preferably should be based on a memory descriptor structure that can be
|
vi. Preferably should be based on a memory descriptor structure that can be
|
||||||
passed around different types of subsystems or layers, maybe even
|
passed around different types of subsystems or layers, maybe even
|
||||||
@@ -1013,7 +1013,7 @@ Characteristics:
|
|||||||
i. Binary tree
|
i. Binary tree
|
||||||
AS and deadline i/o schedulers use red black binary trees for disk position
|
AS and deadline i/o schedulers use red black binary trees for disk position
|
||||||
sorting and searching, and a fifo linked list for time-based searching. This
|
sorting and searching, and a fifo linked list for time-based searching. This
|
||||||
gives good scalability and good availablility of information. Requests are
|
gives good scalability and good availability of information. Requests are
|
||||||
almost always dispatched in disk sort order, so a cache is kept of the next
|
almost always dispatched in disk sort order, so a cache is kept of the next
|
||||||
request in sort order to prevent binary tree lookups.
|
request in sort order to prevent binary tree lookups.
|
||||||
|
|
||||||
|
|||||||
@@ -1,7 +1,7 @@
|
|||||||
|
|
||||||
The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 plattforms.
|
The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 platforms.
|
||||||
|
|
||||||
This works better than on other plattforms, because the FSB of the CPU
|
This works better than on other platforms, because the FSB of the CPU
|
||||||
can be controlled independently from the PCI/AGP clock.
|
can be controlled independently from the PCI/AGP clock.
|
||||||
|
|
||||||
The module has two options:
|
The module has two options:
|
||||||
|
|||||||
@@ -46,7 +46,7 @@ maxcpus=n Restrict boot time cpus to n. Say if you have 4 cpus, using
|
|||||||
maxcpus=2 will only boot 2. You can choose to bring the
|
maxcpus=2 will only boot 2. You can choose to bring the
|
||||||
other cpus later online, read FAQ's for more info.
|
other cpus later online, read FAQ's for more info.
|
||||||
|
|
||||||
additional_cpus*=n Use this to limit hotpluggable cpus. This option sets
|
additional_cpus=n (*) Use this to limit hotpluggable cpus. This option sets
|
||||||
cpu_possible_map = cpu_present_map + additional_cpus
|
cpu_possible_map = cpu_present_map + additional_cpus
|
||||||
|
|
||||||
(*) Option valid only for following architectures
|
(*) Option valid only for following architectures
|
||||||
@@ -54,8 +54,8 @@ additional_cpus*=n Use this to limit hotpluggable cpus. This option sets
|
|||||||
|
|
||||||
ia64 and x86_64 use the number of disabled local apics in ACPI tables MADT
|
ia64 and x86_64 use the number of disabled local apics in ACPI tables MADT
|
||||||
to determine the number of potentially hot-pluggable cpus. The implementation
|
to determine the number of potentially hot-pluggable cpus. The implementation
|
||||||
should only rely on this to count the #of cpus, but *MUST* not rely on the
|
should only rely on this to count the # of cpus, but *MUST* not rely on the
|
||||||
apicid values in those tables for disabled apics. In the event BIOS doesnt
|
apicid values in those tables for disabled apics. In the event BIOS doesn't
|
||||||
mark such hot-pluggable cpus as disabled entries, one could use this
|
mark such hot-pluggable cpus as disabled entries, one could use this
|
||||||
parameter "additional_cpus=x" to represent those cpus in the cpu_possible_map.
|
parameter "additional_cpus=x" to represent those cpus in the cpu_possible_map.
|
||||||
|
|
||||||
@@ -101,15 +101,15 @@ cpu_possible_map/for_each_possible_cpu() to iterate.
|
|||||||
|
|
||||||
Never use anything other than cpumask_t to represent bitmap of CPUs.
|
Never use anything other than cpumask_t to represent bitmap of CPUs.
|
||||||
|
|
||||||
#include <linux/cpumask.h>
|
#include <linux/cpumask.h>
|
||||||
|
|
||||||
for_each_possible_cpu - Iterate over cpu_possible_map
|
for_each_possible_cpu - Iterate over cpu_possible_map
|
||||||
for_each_online_cpu - Iterate over cpu_online_map
|
for_each_online_cpu - Iterate over cpu_online_map
|
||||||
for_each_present_cpu - Iterate over cpu_present_map
|
for_each_present_cpu - Iterate over cpu_present_map
|
||||||
for_each_cpu_mask(x,mask) - Iterate over some random collection of cpu mask.
|
for_each_cpu_mask(x,mask) - Iterate over some random collection of cpu mask.
|
||||||
|
|
||||||
#include <linux/cpu.h>
|
#include <linux/cpu.h>
|
||||||
lock_cpu_hotplug() and unlock_cpu_hotplug():
|
lock_cpu_hotplug() and unlock_cpu_hotplug():
|
||||||
|
|
||||||
The above calls are used to inhibit cpu hotplug operations. While holding the
|
The above calls are used to inhibit cpu hotplug operations. While holding the
|
||||||
cpucontrol mutex, cpu_online_map will not change. If you merely need to avoid
|
cpucontrol mutex, cpu_online_map will not change. If you merely need to avoid
|
||||||
@@ -120,7 +120,7 @@ will work as long as stop_machine_run() is used to take a cpu down.
|
|||||||
|
|
||||||
CPU Hotplug - Frequently Asked Questions.
|
CPU Hotplug - Frequently Asked Questions.
|
||||||
|
|
||||||
Q: How to i enable my kernel to support CPU hotplug?
|
Q: How to enable my kernel to support CPU hotplug?
|
||||||
A: When doing make defconfig, Enable CPU hotplug support
|
A: When doing make defconfig, Enable CPU hotplug support
|
||||||
|
|
||||||
"Processor type and Features" -> Support for Hotpluggable CPUs
|
"Processor type and Features" -> Support for Hotpluggable CPUs
|
||||||
@@ -141,39 +141,39 @@ A: You should now notice an entry in sysfs.
|
|||||||
Check if sysfs is mounted, using the "mount" command. You should notice
|
Check if sysfs is mounted, using the "mount" command. You should notice
|
||||||
an entry as shown below in the output.
|
an entry as shown below in the output.
|
||||||
|
|
||||||
....
|
....
|
||||||
none on /sys type sysfs (rw)
|
none on /sys type sysfs (rw)
|
||||||
....
|
....
|
||||||
|
|
||||||
if this is not mounted, do the following.
|
If this is not mounted, do the following.
|
||||||
|
|
||||||
#mkdir /sysfs
|
#mkdir /sysfs
|
||||||
#mount -t sysfs sys /sys
|
#mount -t sysfs sys /sys
|
||||||
|
|
||||||
now you should see entries for all present cpu, the following is an example
|
Now you should see entries for all present cpu, the following is an example
|
||||||
in a 8-way system.
|
in a 8-way system.
|
||||||
|
|
||||||
#pwd
|
#pwd
|
||||||
#/sys/devices/system/cpu
|
#/sys/devices/system/cpu
|
||||||
#ls -l
|
#ls -l
|
||||||
total 0
|
total 0
|
||||||
drwxr-xr-x 10 root root 0 Sep 19 07:44 .
|
drwxr-xr-x 10 root root 0 Sep 19 07:44 .
|
||||||
drwxr-xr-x 13 root root 0 Sep 19 07:45 ..
|
drwxr-xr-x 13 root root 0 Sep 19 07:45 ..
|
||||||
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu0
|
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu0
|
||||||
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu1
|
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu1
|
||||||
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu2
|
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu2
|
||||||
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu3
|
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu3
|
||||||
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu4
|
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu4
|
||||||
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu5
|
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu5
|
||||||
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu6
|
drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu6
|
||||||
drwxr-xr-x 3 root root 0 Sep 19 07:48 cpu7
|
drwxr-xr-x 3 root root 0 Sep 19 07:48 cpu7
|
||||||
|
|
||||||
Under each directory you would find an "online" file which is the control
|
Under each directory you would find an "online" file which is the control
|
||||||
file to logically online/offline a processor.
|
file to logically online/offline a processor.
|
||||||
|
|
||||||
Q: Does hot-add/hot-remove refer to physical add/remove of cpus?
|
Q: Does hot-add/hot-remove refer to physical add/remove of cpus?
|
||||||
A: The usage of hot-add/remove may not be very consistently used in the code.
|
A: The usage of hot-add/remove may not be very consistently used in the code.
|
||||||
CONFIG_CPU_HOTPLUG enables logical online/offline capability in the kernel.
|
CONFIG_HOTPLUG_CPU enables logical online/offline capability in the kernel.
|
||||||
To support physical addition/removal, one would need some BIOS hooks and
|
To support physical addition/removal, one would need some BIOS hooks and
|
||||||
the platform should have something like an attention button in PCI hotplug.
|
the platform should have something like an attention button in PCI hotplug.
|
||||||
CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for physical add/remove of CPUs.
|
CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for physical add/remove of CPUs.
|
||||||
@@ -181,17 +181,17 @@ CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for physical add/remove of CPUs.
|
|||||||
Q: How do i logically offline a CPU?
|
Q: How do i logically offline a CPU?
|
||||||
A: Do the following.
|
A: Do the following.
|
||||||
|
|
||||||
#echo 0 > /sys/devices/system/cpu/cpuX/online
|
#echo 0 > /sys/devices/system/cpu/cpuX/online
|
||||||
|
|
||||||
once the logical offline is successful, check
|
Once the logical offline is successful, check
|
||||||
|
|
||||||
#cat /proc/interrupts
|
#cat /proc/interrupts
|
||||||
|
|
||||||
you should now not see the CPU that you removed. Also online file will report
|
You should now not see the CPU that you removed. Also online file will report
|
||||||
the state as 0 when a cpu if offline and 1 when its online.
|
the state as 0 when a cpu if offline and 1 when its online.
|
||||||
|
|
||||||
#To display the current cpu state.
|
#To display the current cpu state.
|
||||||
#cat /sys/devices/system/cpu/cpuX/online
|
#cat /sys/devices/system/cpu/cpuX/online
|
||||||
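The same check can be done from a program instead of the shell. The following is a minimal user-space sketch, assuming the "online" file contains a single 0 or 1 as described above; sketch_cpu_online() is a hypothetical helper name, and the caller supplies the full path (for example /sys/devices/system/cpu/cpuX/online with X filled in).

```c
#include <stdio.h>

/*
 * Read a sysfs-style "online" file.
 * Returns 1 if the cpu is online, 0 if it is offline, and -1 if the
 * file cannot be opened or does not contain 0 or 1.
 */
static int sketch_cpu_online(const char *path)
{
	FILE *f = fopen(path, "r");
	int state = -1;

	if (f == NULL)
		return -1;
	if (fscanf(f, "%d", &state) != 1 || (state != 0 && state != 1))
		state = -1;
	fclose(f);
	return state;
}
```

A return value of -1 should be treated as "unknown" rather than "offline", since the file may be missing for reasons unrelated to the cpu's state.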
|
|
||||||
Q: Why cant i remove CPU0 on some systems?
|
Q: Why cant i remove CPU0 on some systems?
|
||||||
A: Some architectures may have some special dependency on a certain CPU.
|
A: Some architectures may have some special dependency on a certain CPU.
|
||||||
@@ -234,8 +234,8 @@ Q: If i have some kernel code that needs to be aware of CPU arrival and
|
|||||||
departure, how to i arrange for proper notification?
|
departure, how to i arrange for proper notification?
|
||||||
A: This is what you would need in your kernel code to receive notifications.
|
A: This is what you would need in your kernel code to receive notifications.
|
||||||
|
|
||||||
#include <linux/cpu.h>
|
#include <linux/cpu.h>
|
||||||
static int __cpuinit foobar_cpu_callback(struct notifier_block *nfb,
|
static int __cpuinit foobar_cpu_callback(struct notifier_block *nfb,
|
||||||
unsigned long action, void *hcpu)
|
unsigned long action, void *hcpu)
|
||||||
{
|
{
|
||||||
unsigned int cpu = (unsigned long)hcpu;
|
unsigned int cpu = (unsigned long)hcpu;
|
||||||
@@ -279,10 +279,10 @@ Q: I don't see my action being called for all CPUs already up and running?
|
|||||||
A: Yes, CPU notifiers are called only when new CPUs are on-lined or offlined.
|
A: Yes, CPU notifiers are called only when new CPUs are on-lined or offlined.
|
||||||
If you need to perform some action for each cpu already in the system, then
|
If you need to perform some action for each cpu already in the system, then
|
||||||
|
|
||||||
for_each_online_cpu(i) {
|
for_each_online_cpu(i) {
|
||||||
foobar_cpu_callback(&foobar_cpu_notifier, CPU_UP_PREPARE, i);
|
foobar_cpu_callback(&foobar_cpu_notifier, CPU_UP_PREPARE, i);
|
||||||
foobar_cpu_callback(&foobar-cpu_notifier, CPU_ONLINE, i);
|
foobar_cpu_callback(&foobar_cpu_notifier, CPU_ONLINE, i);
|
||||||
}
|
}
|
||||||
|
|
||||||
Q: If i would like to develop cpu hotplug support for a new architecture,
|
Q: If i would like to develop cpu hotplug support for a new architecture,
|
||||||
what do i need at a minimum?
|
what do i need at a minimum?
|
||||||
@@ -307,38 +307,38 @@ Q: I need to ensure that a particular cpu is not removed when there is some
|
|||||||
work specific to this cpu is in progress.
|
work specific to this cpu is in progress.
|
||||||
A: First switch the current thread context to preferred cpu
|
A: First switch the current thread context to preferred cpu
|
||||||
|
|
||||||
int my_func_on_cpu(int cpu)
|
int my_func_on_cpu(int cpu)
|
||||||
{
|
{
|
||||||
cpumask_t saved_mask, new_mask = CPU_MASK_NONE;
|
cpumask_t saved_mask, new_mask = CPU_MASK_NONE;
|
||||||
int curr_cpu, err = 0;
|
int curr_cpu, err = 0;
|
||||||
|
|
||||||
saved_mask = current->cpus_allowed;
|
saved_mask = current->cpus_allowed;
|
||||||
cpu_set(cpu, new_mask);
|
cpu_set(cpu, new_mask);
|
||||||
err = set_cpus_allowed(current, new_mask);
|
err = set_cpus_allowed(current, new_mask);
|
||||||
|
|
||||||
if (err)
|
if (err)
|
||||||
return err;
|
return err;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* If we got scheduled out just after the return from
|
* If we got scheduled out just after the return from
|
||||||
* set_cpus_allowed() before running the work, this ensures
|
* set_cpus_allowed() before running the work, this ensures
|
||||||
* we stay locked.
|
* we stay locked.
|
||||||
*/
|
*/
|
||||||
curr_cpu = get_cpu();
|
curr_cpu = get_cpu();
|
||||||
|
|
||||||
if (curr_cpu != cpu) {
|
if (curr_cpu != cpu) {
|
||||||
err = -EAGAIN;
|
err = -EAGAIN;
|
||||||
goto ret;
|
goto ret;
|
||||||
} else {
|
} else {
|
||||||
/*
|
/*
|
||||||
* Do work : But cant sleep, since get_cpu() disables preempt
|
* Do work : But cant sleep, since get_cpu() disables preempt
|
||||||
*/
|
*/
|
||||||
}
|
}
|
||||||
ret:
|
ret:
|
||||||
put_cpu();
|
put_cpu();
|
||||||
set_cpus_allowed(current, saved_mask);
|
set_cpus_allowed(current, saved_mask);
|
||||||
return err;
|
return err;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
Q: How do we determine how many CPUs are available for hotplug.
|
Q: How do we determine how many CPUs are available for hotplug.
|
||||||
|
|||||||
@@ -92,7 +92,7 @@ Your cooperation is appreciated.
|
|||||||
7 = /dev/full Returns ENOSPC on write
|
7 = /dev/full Returns ENOSPC on write
|
||||||
8 = /dev/random Nondeterministic random number gen.
|
8 = /dev/random Nondeterministic random number gen.
|
||||||
9 = /dev/urandom Faster, less secure random number gen.
|
9 = /dev/urandom Faster, less secure random number gen.
|
||||||
10 = /dev/aio Asyncronous I/O notification interface
|
10 = /dev/aio Asynchronous I/O notification interface
|
||||||
11 = /dev/kmsg Writes to this come out as printk's
|
11 = /dev/kmsg Writes to this come out as printk's
|
||||||
1 block RAM disk
|
1 block RAM disk
|
||||||
0 = /dev/ram0 First RAM disk
|
0 = /dev/ram0 First RAM disk
|
||||||
@@ -1093,7 +1093,7 @@ Your cooperation is appreciated.
|
|||||||
|
|
||||||
55 char DSP56001 digital signal processor
|
55 char DSP56001 digital signal processor
|
||||||
0 = /dev/dsp56k First DSP56001
|
0 = /dev/dsp56k First DSP56001
|
||||||
55 block Mylex DAC960 PCI RAID controller; eigth controller
|
55 block Mylex DAC960 PCI RAID controller; eighth controller
|
||||||
0 = /dev/rd/c7d0 First disk, whole disk
|
0 = /dev/rd/c7d0 First disk, whole disk
|
||||||
8 = /dev/rd/c7d1 Second disk, whole disk
|
8 = /dev/rd/c7d1 Second disk, whole disk
|
||||||
...
|
...
|
||||||
@@ -1456,7 +1456,7 @@ Your cooperation is appreciated.
|
|||||||
1 = /dev/cum1 Callout device for ttyM1
|
1 = /dev/cum1 Callout device for ttyM1
|
||||||
...
|
...
|
||||||
|
|
||||||
79 block Compaq Intelligent Drive Array, eigth controller
|
79 block Compaq Intelligent Drive Array, eighth controller
|
||||||
0 = /dev/ida/c7d0 First logical drive whole disk
|
0 = /dev/ida/c7d0 First logical drive whole disk
|
||||||
16 = /dev/ida/c7d1 Second logical drive whole disk
|
16 = /dev/ida/c7d1 Second logical drive whole disk
|
||||||
...
|
...
|
||||||
@@ -1900,7 +1900,7 @@ Your cooperation is appreciated.
|
|||||||
1 = /dev/av1 Second A/V card
|
1 = /dev/av1 Second A/V card
|
||||||
...
|
...
|
||||||
|
|
||||||
111 block Compaq Next Generation Drive Array, eigth controller
|
111 block Compaq Next Generation Drive Array, eighth controller
|
||||||
0 = /dev/cciss/c7d0 First logical drive, whole disk
|
0 = /dev/cciss/c7d0 First logical drive, whole disk
|
||||||
16 = /dev/cciss/c7d1 Second logical drive, whole disk
|
16 = /dev/cciss/c7d1 Second logical drive, whole disk
|
||||||
...
|
...
|
||||||
|
|||||||
@@ -1,99 +1,131 @@
|
|||||||
Platform Devices and Drivers
|
Platform Devices and Drivers
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
See <linux/platform_device.h> for the driver model interface to the
|
||||||
|
platform bus: platform_device, and platform_driver. This pseudo-bus
|
||||||
|
is used to connect devices on busses with minimal infrastructure,
|
||||||
|
like those used to integrate peripherals on many system-on-chip
|
||||||
|
processors, or some "legacy" PC interconnects; as opposed to large
|
||||||
|
formally specified ones like PCI or USB.
|
||||||
|
|
||||||
|
|
||||||
Platform devices
|
Platform devices
|
||||||
~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~
|
||||||
Platform devices are devices that typically appear as autonomous
|
Platform devices are devices that typically appear as autonomous
|
||||||
entities in the system. This includes legacy port-based devices and
|
entities in the system. This includes legacy port-based devices and
|
||||||
host bridges to peripheral buses.
|
host bridges to peripheral buses, and most controllers integrated
|
||||||
|
into system-on-chip platforms. What they usually have in common
|
||||||
|
is direct addressing from a CPU bus. Rarely, a platform_device will
|
||||||
|
be connected through a segment of some other kind of bus; but its
|
||||||
|
registers will still be directly addressable.
|
||||||
|
|
||||||
|
Platform devices are given a name, used in driver binding, and a
|
||||||
|
list of resources such as addresses and IRQs.
|
||||||
|
|
||||||
|
struct platform_device {
|
||||||
|
const char *name;
|
||||||
|
u32 id;
|
||||||
|
struct device dev;
|
||||||
|
u32 num_resources;
|
||||||
|
struct resource *resource;
|
||||||
|
};
|
||||||
|
|
||||||
|
|
||||||
Platform drivers
|
Platform drivers
|
||||||
~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~
|
||||||
Drivers for platform devices are typically very simple and
|
Platform drivers follow the standard driver model convention, where
|
||||||
unstructured. Either the device was present at a particular I/O port
|
discovery/enumeration is handled outside the drivers, and drivers
|
||||||
and the driver was loaded, or it was not. There was no possibility
|
provide probe() and remove() methods. They support power management
|
||||||
of hotplugging or alternative discovery besides probing at a specific
|
and shutdown notifications using the standard conventions.
|
||||||
I/O address and expecting a specific response.
|
|
||||||
|
struct platform_driver {
|
||||||
|
int (*probe)(struct platform_device *);
|
||||||
|
int (*remove)(struct platform_device *);
|
||||||
|
void (*shutdown)(struct platform_device *);
|
||||||
|
int (*suspend)(struct platform_device *, pm_message_t state);
|
||||||
|
int (*suspend_late)(struct platform_device *, pm_message_t state);
|
||||||
|
int (*resume_early)(struct platform_device *);
|
||||||
|
int (*resume)(struct platform_device *);
|
||||||
|
struct device_driver driver;
|
||||||
|
};
|
||||||
|
|
||||||
|
Note that probe() should generally verify that the specified device hardware
|
||||||
|
actually exists; sometimes platform setup code can't be sure. The probing
|
||||||
|
can use device resources, including clocks, and device platform_data.
|
||||||
|
|
||||||
|
Platform drivers register themselves the normal way:
|
||||||
|
|
||||||
|
int platform_driver_register(struct platform_driver *drv);
|
||||||
|
|
||||||
|
Or, in common situations where the device is known not to be hot-pluggable,
|
||||||
|
the probe() routine can live in an init section to reduce the driver's
|
||||||
|
runtime memory footprint:
|
||||||
|
|
||||||
|
int platform_driver_probe(struct platform_driver *drv,
|
||||||
|
int (*probe)(struct platform_device *))
|
||||||
|
|
||||||
|
|
||||||
Other Architectures, Modern Firmware, and new Platforms
|
Device Enumeration
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~
|
||||||
These devices are not always at the legacy I/O ports. This is true on
|
As a rule, platform specific (and often board-specific) setup code will
|
||||||
other architectures and on some modern architectures. In most cases,
|
register platform devices:
|
||||||
the drivers are modified to discover the devices at other well-known
|
|
||||||
ports for the given platform. However, the firmware in these systems
|
int platform_device_register(struct platform_device *pdev);
|
||||||
does usually know where exactly these devices reside, and in some
|
|
||||||
cases, it's the only way of discovering them.
|
int platform_add_devices(struct platform_device **pdevs, int ndev);
|
||||||
|
|
||||||
|
The general rule is to register only those devices that actually exist,
|
||||||
|
but in some cases extra devices might be registered. For example, a kernel
|
||||||
|
might be configured to work with an external network adapter that might not
|
||||||
|
be populated on all boards, or likewise to work with an integrated controller
|
||||||
|
that some boards might not hook up to any peripherals.
|
||||||
|
|
||||||
|
In some cases, boot firmware will export tables describing the devices
|
||||||
|
that are populated on a given board. Without such tables, often the
|
||||||
|
only way for system setup code to set up the correct devices is to build
|
||||||
|
a kernel for a specific target board. Such board-specific kernels are
|
||||||
|
common with embedded and custom systems development.
|
||||||
|
|
||||||
|
In many cases, the memory and IRQ resources associated with the platform
|
||||||
|
device are not enough to let the device's driver work. Board setup code
|
||||||
|
will often provide additional information using the device's platform_data
|
||||||
|
field.
|
||||||
|
|
||||||
|
Embedded systems frequently need one or more clocks for platform devices,
|
||||||
|
which are normally kept off until they're actively needed (to save power).
|
||||||
|
System setup also associates those clocks with the device, so that
|
||||||
|
calls to clk_get(&pdev->dev, clock_name) return them as needed.
|
||||||
|
|
||||||
|
|
||||||
The Platform Bus
|
Device Naming and Driver Binding
|
||||||
~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
A platform bus has been created to deal with these issues. First and
|
The platform_device.dev.bus_id is the canonical name for the devices.
|
||||||
foremost, it groups all the legacy devices under a common bus, and
|
It's built from two components:
|
||||||
gives them a common parent if they don't already have one.
|
|
||||||
|
|
||||||
But, besides the organizational benefits, the platform bus can also
|
* platform_device.name ... which is also used for driver matching.
|
||||||
accommodate firmware-based enumeration.
|
|
||||||
|
|
||||||
|
* platform_device.id ... the device instance number, or else "-1"
|
||||||
|
to indicate there's only one.
|
||||||
|
|
||||||
Device Discovery
|
These are catenated, so name/id "serial"/0 indicates bus_id "serial.0", and
|
||||||
~~~~~~~~~~~~~~~~
|
"serial"/3 indicates bus_id "serial.3"; both would use the platform_driver
|
||||||
The platform bus has no concept of probing for devices. Devices
|
named "serial". While "my_rtc"/-1 would be bus_id "my_rtc" (no instance id)
|
||||||
discovery is left up to either the legacy drivers or the
|
and use the platform_driver called "my_rtc".
|
||||||
firmware. These entities are expected to notify the platform of
|
|
||||||
devices that it discovers via the bus's add() callback:
|
|
||||||
|
|
||||||
platform_bus.add(parent,bus_id).
|
Driver binding is performed automatically by the driver core, invoking
|
||||||
|
driver probe() after finding a match between device and driver. If the
|
||||||
|
probe() succeeds, the driver and device are bound as usual. There are
|
||||||
|
three different ways to find such a match:
|
||||||
|
|
||||||
|
- Whenever a device is registered, the drivers for that bus are
|
||||||
|
checked for matches. Platform devices should be registered very
|
||||||
|
early during system boot.
|
||||||
|
|
||||||
Bus IDs
|
- When a driver is registered using platform_driver_register(), all
|
||||||
~~~~~~~
|
unbound devices on that bus are checked for matches. Drivers
|
||||||
Bus IDs are the canonical names for the devices. There is no globally
|
usually register later during booting, or by module loading.
|
||||||
standard addressing mechanism for legacy devices. In the IA-32 world,
|
|
||||||
we have PnP IDs to use, as well as the legacy I/O ports. However,
|
|
||||||
neither tell what the device really is or have any meaning on other
|
|
||||||
platforms.
|
|
||||||
|
|
||||||
Since both PnP IDs and the legacy I/O ports (and other standard I/O
|
- Registering a driver using platform_driver_probe() works just like
|
||||||
ports for specific devices) have a 1:1 mapping, we map the
|
using platform_driver_register(), except that the driver won't
|
||||||
platform-specific name or identifier to a generic name (at least
|
be probed later if another device registers. (Which is OK, since
|
||||||
within the scope of the kernel).
|
this interface is only for use with non-hotpluggable devices.)
|
||||||
|
|
||||||
For example, a serial driver might find a device at I/O 0x3f8. The
|
|
||||||
ACPI firmware might also discover a device with PnP ID (_HID)
|
|
||||||
PNP0501. Both correspond to the same device and should be mapped to the
|
|
||||||
canonical name 'serial'.
|
|
||||||
|
|
||||||
The bus_id field should be a concatenation of the canonical name and
|
|
||||||
the instance of that type of device. For example, the device at I/O
|
|
||||||
port 0x3f8 should have a bus_id of "serial0". This places the
|
|
||||||
responsibility of enumerating devices of a particular type up to the
|
|
||||||
discovery mechanism. But, they are the entity that should know best
|
|
||||||
(as opposed to the platform bus driver).
|
|
||||||
|
|
||||||
|
|
||||||
Drivers
|
|
||||||
~~~~~~~
|
|
||||||
Drivers for platform devices should have a name that is the same as
|
|
||||||
the canonical name of the devices they support. This allows the
|
|
||||||
platform bus driver to do simple matching with the basic data
|
|
||||||
structures to determine if a driver supports a certain device.
|
|
||||||
|
|
||||||
For example, a legacy serial driver should have a name of 'serial' and
|
|
||||||
register itself with the platform bus.
|
|
||||||
|
|
||||||
|
|
||||||
Driver Binding
|
|
||||||
~~~~~~~~~~~~~~
|
|
||||||
Legacy drivers assume they are bound to the device once they start up
|
|
||||||
and probe an I/O port. Divorcing them from this will be a difficult
|
|
||||||
process. However, that shouldn't prevent us from implementing
|
|
||||||
firmware-based enumeration.
|
|
||||||
|
|
||||||
The firmware should notify the platform bus about devices before the
|
|
||||||
legacy drivers have had a chance to load. Once the drivers are loaded,
|
|
||||||
the driver model core will attempt to bind the driver to any
|
|
||||||
previously-discovered devices. Once that has happened, it will be free
|
|
||||||
to discover any other devices it pleases.
|
|
||||||
|
|
||||||
|
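The naming rule from "Device Naming and Driver Binding" above can be modelled
in a few lines of ordinary userspace C. This is only an illustrative sketch,
not kernel code: struct pdev_model, model_bus_id() and model_match() are
hypothetical names invented for this example, and the real driver core does
this work internally. It assumes only what the text states: the bus_id is
the name, followed by "." and the instance number unless the id is -1, and a
driver matches a device when the driver's name equals the device's name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for a platform device's name/id pair. */
struct pdev_model {
	const char *name;	/* also used for driver matching */
	int id;			/* instance number, or -1 if there is only one */
};

/*
 * Build the bus_id string into buf: "name.id", or just "name" when the
 * id is -1. Returns what snprintf() returns.
 */
static int model_bus_id(const struct pdev_model *pdev, char *buf, size_t len)
{
	if (pdev->id == -1)
		return snprintf(buf, len, "%s", pdev->name);
	return snprintf(buf, len, "%s.%d", pdev->name, pdev->id);
}

/* Name-based matching: nonzero when the driver name equals the device name. */
static int model_match(const struct pdev_model *pdev, const char *driver_name)
{
	return strcmp(pdev->name, driver_name) == 0;
}
```

With this model, "serial"/0 and "serial"/3 yield "serial.0" and "serial.3"
and both match a driver named "serial", while "my_rtc"/-1 yields "my_rtc"
and matches only a driver named "my_rtc".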
|||||||
@@ -92,7 +92,7 @@ struct device represents a single device. It mainly contains metadata
|
|||||||
describing the relationship the device has to other entities.
|
describing the relationship the device has to other entities.
|
||||||
|
|
||||||
|
|
||||||
- Embedd a struct device in the bus-specific device type.
|
- Embed a struct device in the bus-specific device type.
|
||||||
|
|
||||||
|
|
||||||
struct pci_dev {
|
struct pci_dev {
|
||||||
|
|||||||
Some files were not shown because too many files have changed in this diff