Merge branch 'linux-2.6' into for-linus

2026-05-01 15:00:59 -07:00 · 2006-12-04 15:59:07 +11:00
parent 19a79859e1 2b5f6dcce5
commit 79acbb3ff2
2661 changed files with 91557 additions and 34728 deletions
@@ -20,6 +20,7 @@
 # Top-level generic files
 #
 tags
 TAGS
 vmlinux*
 System.map
 Module.symvers
@@ -45,7 +45,7 @@ S: Longford, Ireland
 S: Sydney, Australia
 N: Tigran A. Aivazian
-E: tigran@veritas.com
+E: tigran@aivazian.fsnet.co.uk
 W: http://www.moses.uklinux.net/patches
 D: BFS filesystem
 D: Intel IA32 CPU microcode update support
@@ -2598,6 +2598,9 @@ S: Ucitelska 1576
 S: Prague 8
 S: 182 00 Czech Republic
 N: Rick Payne
 D: RFC2385 Support for TCP
 N: Barak A. Pearlmutter
 E: bap@cs.unm.edu
 W: http://www.cs.unm.edu/~bap/
@@ -3511,14 +3514,12 @@ D: The Linux Support Team Erlangen
 N: David Weinehall
 E: tao@acc.umu.se
 P: 1024D/DC47CA16 7ACE 0FB0 7A74 F994 9B36  E1D1 D14E 8526 DC47 CA16
 W: http://www.acc.umu.se/~tao/
-W: http://www.acc.umu.se/~mcalinux/
+D: v2.0 kernel maintainer
 D: Fixes for the NE/2-driver
 D: Miscellaneous MCA-support
 D: Cleanup of the Config-files
 S: Axtorpsvagen 40:20
 S: S-903 37  UMEA
 S: Sweden
 N: Matt Welsh
 E: mdw@metalab.unc.edu
@@ -21,7 +21,7 @@ Description:
 		these states.
 What:		/sys/power/disk
-Date:		August 2006
+Date:		September 2006
 Contact:	Rafael J. Wysocki <rjw@sisk.pl>
 Description:
 		The /sys/power/disk file controls the operating mode of the
@@ -39,6 +39,19 @@ Description:
 		'reboot' - the memory image will be saved by the kernel and
 		the system will be rebooted.
 		Additionally, /sys/power/disk can be used to turn on one of the
 		two testing modes of the suspend-to-disk mechanism: 'testproc'
 		or 'test'.  If the suspend-to-disk mechanism is in the
 		'testproc' mode, writing 'disk' to /sys/power/state will cause
 		the kernel to disable nonboot CPUs and freeze tasks, wait for 5
 		seconds, unfreeze tasks and enable nonboot CPUs.  If it is in
 		the 'test' mode, writing 'disk' to /sys/power/state will cause
 		the kernel to disable nonboot CPUs and freeze tasks, shrink
 		memory, suspend devices, wait for 5 seconds, resume devices,
 		unfreeze tasks and enable nonboot CPUs.  Then, we are able to
 		look in the log messages and work out, for example, which code
 		is being slow and which device drivers are misbehaving.
 		The suspend-to-disk method may be chosen by writing to this
 		file one of the accepted strings:
@@ -46,6 +59,8 @@ Description:
 		'platform'
 		'shutdown'
 		'reboot'
 		'testproc'
 		'test'
 		It will only change to 'firmware' or 'platform' if the system
 		supports that.
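
The two testing modes described above can be driven from a shell. The following is a minimal sketch, not part of the patch: the function name and the SYSFS_ROOT override are hypothetical and exist only so the steps can be tried against a scratch directory. It assumes the kernel accepts 'testproc' and 'test' in /sys/power/disk as documented above.

```shell
# Sketch: run one suspend-to-disk test cycle in the given mode.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
suspend_test_cycle() {
    mode=$1
    root=${SYSFS_ROOT:-/sys}
    case "$mode" in
        testproc|test) ;;
        *) echo "usage: suspend_test_cycle testproc|test" >&2; return 1 ;;
    esac
    # Select the testing mode first, then trigger the cycle.
    echo "$mode" > "$root/power/disk" || return 1
    echo disk > "$root/power/state"
}
```

On a real system this would be run as root, for example `suspend_test_cycle testproc`, followed by a look at the kernel log messages to see which step was slow.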
@@ -201,7 +201,7 @@ udev
 ----
 udev is a userspace application for populating /dev dynamically with
 only entries for devices actually present.  udev replaces the basic
-functionality of devfs, while allowing persistant device naming for
+functionality of devfs, while allowing persistent device naming for
 devices.
 FUSE
@@ -489,7 +489,7 @@ size is the size of the area (must be multiples of PAGE_SIZE).
 flags can be or'd together and are
 DMA_MEMORY_MAP - request that the memory returned from
-dma_alloc_coherent() be directly writeable.
+dma_alloc_coherent() be directly writable.
 DMA_MEMORY_IO - request that the memory returned from
 dma_alloc_coherent() be addressable using read/write/memcpy_toio etc.
@@ -110,7 +110,7 @@ lock.
 Once the DMA transfer is finished (or timed out) you should disable
 the channel again. You should also check get_dma_residue() to make
-sure that all data has been transfered.
+sure that all data has been transferred.
 Example:
@@ -9,7 +9,7 @@
 DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \
 	    kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
 	    procfs-guide.xml writing_usb_driver.xml \
-	    kernel-api.xml journal-api.xml lsm.xml usb.xml \
+	    kernel-api.xml filesystems.xml lsm.xml usb.xml \
 	    gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
 	    genericirq.xml
@@ -2,9 +2,106 @@
 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
 	"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
-<book id="LinuxJBDAPI">
+<book id="Linux-filesystems-API">
 <bookinfo>
  <title>Linux Filesystems API</title>
  <legalnotice>
   <para>
     This documentation is free software; you can redistribute
     it and/or modify it under the terms of the GNU General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later
     version.
   </para>
   <para>
     This program is distributed in the hope that it will be
     useful, but WITHOUT ANY WARRANTY; without even the implied
     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     See the GNU General Public License for more details.
   </para>
   <para>
     You should have received a copy of the GNU General Public
     License along with this program; if not, write to the Free
     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
     MA 02111-1307 USA
   </para>
   <para>
     For more details see the file COPYING in the source
     distribution of Linux.
   </para>
  </legalnotice>
 </bookinfo>
 <toc></toc>
  <chapter id="vfs">
     <title>The Linux VFS</title>
     <sect1><title>The Filesystem types</title>
 !Iinclude/linux/fs.h
     </sect1>
     <sect1><title>The Directory Cache</title>
 !Efs/dcache.c
 !Iinclude/linux/dcache.h
     </sect1>
     <sect1><title>Inode Handling</title>
 !Efs/inode.c
 !Efs/bad_inode.c
     </sect1>
     <sect1><title>Registration and Superblocks</title>
 !Efs/super.c
     </sect1>
     <sect1><title>File Locks</title>
 !Efs/locks.c
 !Ifs/locks.c
     </sect1>
     <sect1><title>Other Functions</title>
 !Efs/mpage.c
 !Efs/namei.c
 !Efs/buffer.c
 !Efs/bio.c
 !Efs/seq_file.c
 !Efs/filesystems.c
 !Efs/fs-writeback.c
 !Efs/block_dev.c
     </sect1>
  </chapter>
  <chapter id="proc">
     <title>The proc filesystem</title>
     <sect1><title>sysctl interface</title>
 !Ekernel/sysctl.c
     </sect1>
     <sect1><title>proc filesystem interface</title>
 !Ifs/proc/base.c
     </sect1>
  </chapter>
  <chapter id="sysfs">
     <title>The Filesystem for Exporting Kernel Objects</title>
 !Efs/sysfs/file.c
 !Efs/sysfs/symlink.c
 !Efs/sysfs/bin.c
  </chapter>
  <chapter id="debugfs">
     <title>The debugfs filesystem</title>
     <sect1><title>debugfs interface</title>
 !Efs/debugfs/inode.c
 !Efs/debugfs/file.c
     </sect1>
  </chapter>
  <chapter id="LinuxJDBAPI">
  <chapterinfo>
  <title>The Linux Journalling API</title>
  <authorgroup>
  <author>
     <firstname>Roger</firstname>
@@ -14,9 +111,9 @@
      <email>rgammans@computer-surgery.co.uk</email>
     </address>
    </affiliation>
-     </author> 
+     </author>
  </authorgroup>
-  
+
  <authorgroup>
   <author>
    <firstname>Stephen</firstname>
@@ -33,50 +130,21 @@
   <year>2002</year>
   <holder>Roger Gammans</holder>
  </copyright>
  </chapterinfo>
-<legalnotice>
+  <title>The Linux Journalling API</title>
   <para>
     This documentation is free software; you can redistribute
     it and/or modify it under the terms of the GNU General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later
     version.
   </para>
   <para>
     This program is distributed in the hope that it will be
     useful, but WITHOUT ANY WARRANTY; without even the implied
     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     See the GNU General Public License for more details.
   </para>
   <para>
     You should have received a copy of the GNU General Public
     License along with this program; if not, write to the Free
     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
     MA 02111-1307 USA
   </para>
   <para>
     For more details see the file COPYING in the source
     distribution of Linux.
   </para>
  </legalnotice>
 </bookinfo>
-<toc></toc>
+    <sect1>
  <chapter id="Overview">
     <title>Overview</title>
-  <sect1>
+    <sect2>
     <title>Details</title>
 <para>
-The journalling layer is  easy to use. You need to 
+The journalling layer is  easy to use. You need to
 first of all create a journal_t data structure. There are
 two calls to do this dependent on how you decide to allocate the physical
-media on which the journal resides. The journal_init_inode() call 
+media on which the journal resides. The journal_init_inode() call
 is for journals stored in filesystem inodes, or the journal_init_dev()
-call can be use for journal stored on a raw device (in a continuous range 
+call can be use for journal stored on a raw device (in a continuous range
 of blocks). A journal_t is a typedef for a struct pointer, so when
 you are finally finished make sure you call journal_destroy() on it
 to free up any used kernel memory.
@@ -91,27 +159,26 @@ need to call journal_create().
 <para>
 Most of the time however your journal file will already have been created, but
 before you load it you must call journal_wipe() to empty the journal file.
-Hang on, you say , what if the filesystem wasn't cleanly umount()'d . Well, it is the 
+Hang on, you say , what if the filesystem wasn't cleanly umount()'d . Well, it is the
 job of the client file system to detect this and skip the call to journal_wipe().
 </para>
 <para>
 In either case the next call should be to journal_load() which prepares the
-journal file for use. Note that journal_wipe(..,0) calls journal_skip_recovery() 
+journal file for use. Note that journal_wipe(..,0) calls journal_skip_recovery()
 for you if it detects any outstanding transactions in the journal and similarly
 journal_load() will call journal_recover() if necessary.
 I would advise reading fs/ext3/super.c for examples on this stage.
-[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly 
+[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly
-complicate the API. Or isn't a good idea for the journal layer to hide 
+complicate the API. Or isn't a good idea for the journal layer to hide
 dirty mounts from the client fs]
 </para>
 <para>
-Now you can go ahead and start modifying the underlying 
+Now you can go ahead and start modifying the underlying
 filesystem. Almost.
 </para>
 <para>
 You still need to actually journal your filesystem changes, this
@@ -138,10 +205,10 @@ individual buffers (blocks). Before you start to modify a buffer you
 need to call journal_get_{create,write,undo}_access() as appropriate,
 this allows the journalling layer to copy the unmodified data if it
 needs to. After all the buffer may be part of a previously uncommitted
-transaction. 
+transaction.
 At this point you are at last ready to modify a buffer, and once
 you have done so you need to call journal_dirty_{meta,}data().
-Or if you've asked for access to a buffer you now know is now longer 
+Or if you've asked for access to a buffer you now know is now longer
 required to be pushed back on the device you can call journal_forget()
 in much the same way as you might have used bforget() in the past.
 </para>
@@ -156,7 +223,6 @@ Then at umount time , in your put_super() (2.4) or write_super() (2.5)
 you can then call journal_destroy() to clean up your in-core journal object.
 </para>
 <para>
 Unfortunately there are a couple of ways the journal layer can cause a deadlock.
 The first thing to note is that each task can only have
@@ -164,19 +230,19 @@ a single outstanding transaction at any one time, remember nothing
 commits until the outermost journal_stop(). This means
 you must complete the transaction at the end of each file/inode/address
 etc. operation you perform, so that the journalling system isn't re-entered
-on another journal. Since transactions can't be nested/batched 
+on another journal. Since transactions can't be nested/batched
 across differing journals, and another filesystem other than
 yours (say ext3) may be modified in a later syscall.
 </para>
 <para>
-The second case to bear in mind is that journal_start() can 
+The second case to bear in mind is that journal_start() can
-block if there isn't enough space in the journal for your transaction 
+block if there isn't enough space in the journal for your transaction
 (based on the passed nblocks param) - when it blocks it merely(!) needs to
-wait for transactions to complete and be committed from other tasks, 
+wait for transactions to complete and be committed from other tasks,
-so essentially we are waiting for journal_stop(). So to avoid 
+so essentially we are waiting for journal_stop(). So to avoid
 deadlocks you must treat journal_start/stop() as if they
-were semaphores and include them in your semaphore ordering rules to prevent 
+were semaphores and include them in your semaphore ordering rules to prevent
 deadlocks. Note that journal_extend() has similar blocking behaviour to
 journal_start() so you can deadlock here just as easily as on journal_start().
 </para>
@@ -184,7 +250,7 @@ journal_start() so you can deadlock here just as easily as on journal_start().
 <para>
 Try to reserve the right number of blocks the first time. ;-). This will
 be the maximum number of blocks you are going to touch in this transaction.
-I advise having a look at at least ext3_jbd.h to see the basis on which 
+I advise having a look at at least ext3_jbd.h to see the basis on which
 ext3 uses to make these decisions.
 </para>
@@ -193,13 +259,13 @@ Another wriggle to watch out for is your on-disk block allocation strategy.
 why? Because, if you undo a delete, you need to ensure you haven't reused any
 of the freed blocks in a later transaction. One simple way of doing this
 is make sure any blocks you allocate only have checkpointed transactions
-listed against them. Ext3 does this in ext3_test_allocatable(). 
+listed against them. Ext3 does this in ext3_test_allocatable().
 </para>
 <para>
 Locking is also provided through journal_{un,}lock_updates(),
 ext3 uses this when it wants a window with a clean and stable fs for a moment.
-eg. 
+eg.
 </para>
 <programlisting>
@@ -230,19 +296,19 @@ extend it like this:-
 		struct journal_callback for_jbd;
 		// Stuff for myfs allocated together.
 		myfs_inode*    i_commited;
-	
+
 	}
 </programlisting>
 <para>
-this would be useful if you needed to know when data was committed to a 
+this would be useful if you needed to know when data was committed to a
 particular inode.
 </para>
-</sect1>
+    </sect2>
-<sect1>
+    <sect2>
-<title>Summary</title>
+     <title>Summary</title>
 <para>
 Using the journal is a matter of wrapping the different context changes,
 being each mount, each modification (transaction) and each changed buffer
@@ -260,15 +326,15 @@ an example.
  if (clean) journal_wipe();
  journal_load();
-   foreach(transaction) { /*transactions must be 
+   foreach(transaction) { /*transactions must be
                            completed before
-                            a syscall returns to 
+                            a syscall returns to
                            userspace*/
         handle_t * xact=journal_start(my_jrnl);
          foreach(bh) {
                journal_get_{create,write,undo}_access(xact,bh);
-                if ( myfs_modify(bh) ) { /* returns true 
+                if ( myfs_modify(bh) ) { /* returns true
                                        if makes changes */
                           journal_dirty_{meta,}data(xact,bh);
                } else {
@@ -279,55 +345,57 @@ an example.
   }
   journal_destroy(my_jrnl);
 </programlisting>
-</sect1>
+    </sect2>
-</chapter>
+    </sect1>
-  <chapter id="adt">
+    <sect1>
     <title>Data Types</title>
-     <para>	
+     <para>
 	The journalling layer uses typedefs to 'hide' the concrete definitions
 	of the structures used. As a client of the JBD layer you can
 	just rely on the using the pointer as a magic cookie  of some sort.
 	Obviously the hiding is not enforced as this is 'C'.
 	</para>
 	<sect1><title>Structures</title>
 !Iinclude/linux/jbd.h
 	</sect1>
 </chapter>
-  <chapter id="calls">
+	Obviously the hiding is not enforced as this is 'C'.
     </para>
 	<sect2><title>Structures</title>
 !Iinclude/linux/jbd.h
 	</sect2>
    </sect1>
    <sect1>
     <title>Functions</title>
-     <para>	
+     <para>
 	The functions here are split into two groups those that
 	affect a journal as a whole, and those which are used to
 	manage transactions
-</para>
+     </para>
-	<sect1><title>Journal Level</title>
+	<sect2><title>Journal Level</title>
 !Efs/jbd/journal.c
 !Ifs/jbd/recovery.c
-	</sect1>
+	</sect2>
-	<sect1><title>Transasction Level</title>
+	<sect2><title>Transasction Level</title>
-!Efs/jbd/transaction.c	
+!Efs/jbd/transaction.c
-	</sect1>
+	</sect2>
-</chapter>
+    </sect1>
-<chapter>
+    <sect1>
     <title>See also</title>
 	<para>
-	<citation>
+	  <citation>
 	   <ulink url="ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/journal-design.ps.gz">
-	   	Journaling the Linux ext2fs Filesystem,LinuxExpo 98, Stephen Tweedie
+	   	Journaling the Linux ext2fs Filesystem, LinuxExpo 98, Stephen Tweedie
 	   </ulink>
-	   </citation>
+	  </citation>
-	   </para>
+	</para>
-	   <para>
+	<para>
 	   <citation>
 	   <ulink url="http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html">
-	   	Ext3 Journalling FileSystem , OLS 2000, Dr. Stephen Tweedie
+	   	Ext3 Journalling FileSystem, OLS 2000, Dr. Stephen Tweedie
 	   </ulink>
 	   </citation>
-	   </para>
+	</para>
-</chapter>
+    </sect1>
  </chapter>
 </book>
@@ -182,66 +182,6 @@ X!Ilib/string.c
     </sect1>
  </chapter>
  <chapter id="vfs">
     <title>The Linux VFS</title>
     <sect1><title>The Filesystem types</title>
 !Iinclude/linux/fs.h
     </sect1>
     <sect1><title>The Directory Cache</title>
 !Efs/dcache.c
 !Iinclude/linux/dcache.h
     </sect1>
     <sect1><title>Inode Handling</title>
 !Efs/inode.c
 !Efs/bad_inode.c
     </sect1>
     <sect1><title>Registration and Superblocks</title>
 !Efs/super.c
     </sect1>
     <sect1><title>File Locks</title>
 !Efs/locks.c
 !Ifs/locks.c
     </sect1>
     <sect1><title>Other Functions</title>
 !Efs/mpage.c
 !Efs/namei.c
 !Efs/buffer.c
 !Efs/bio.c
 !Efs/seq_file.c
 !Efs/filesystems.c
 !Efs/fs-writeback.c
 !Efs/block_dev.c
     </sect1>
  </chapter>
  <chapter id="proc">
     <title>The proc filesystem</title>
     <sect1><title>sysctl interface</title>
 !Ekernel/sysctl.c
     </sect1>
     <sect1><title>proc filesystem interface</title>
 !Ifs/proc/base.c
     </sect1>
  </chapter>
  <chapter id="sysfs">
     <title>The Filesystem for Exporting Kernel Objects</title>
 !Efs/sysfs/file.c
 !Efs/sysfs/symlink.c
 !Efs/sysfs/bin.c
  </chapter>
  <chapter id="debugfs">
     <title>The debugfs filesystem</title>
     <sect1><title>debugfs interface</title>
 !Efs/debugfs/inode.c
 !Efs/debugfs/file.c
     </sect1>
  </chapter>
  <chapter id="relayfs">
     <title>relay interface support</title>
@@ -345,8 +345,7 @@ static inline void skel_delete (struct usb_skel *dev)
        usb_buffer_free (dev->udev, dev->bulk_out_size,
            dev->bulk_out_buffer,
            dev->write_urb->transfer_dma);
-    if (dev->write_urb != NULL)
-        usb_free_urb (dev->write_urb);
+    usb_free_urb (dev->write_urb);
    kfree (dev);
 }
  </programlisting>
@@ -395,6 +395,26 @@ bugme-janitor mailing list (every change in the bugzilla is mailed here)
 Managing bug reports
 --------------------
 One of the best ways to put into practice your hacking skills is by fixing
 bugs reported by other people. Not only will you help to make the kernel
 more stable, you'll learn to fix real world problems and you will improve
 your skills, and other developers will be aware of your presence. Fixing
 bugs is one of the best ways to get merits among other developers, because
 not many people like wasting time fixing other people's bugs.
 To work on already reported bugs, go to http://bugzilla.kernel.org.
 If you want to be advised of the future bug reports, you can subscribe to the
 bugme-new mailing list (only new bug reports are mailed here) or to the
 bugme-janitor mailing list (every change in the bugzilla is mailed here)
 	http://lists.osdl.org/mailman/listinfo/bugme-new
 	http://lists.osdl.org/mailman/listinfo/bugme-janitors
 Mailing lists
 -------------
@@ -219,7 +219,7 @@ into the field vector of each element contained in a second argument.
 Note that the pre-assigned IOAPIC dev->irq is valid only if the device
 operates in PIN-IRQ assertion mode. In MSI-X mode, any attempt at
 using dev->irq by the device driver to request for interrupt service
-may result unpredictabe behavior.
+may result in unpredictable behavior.
 For each MSI-X vector granted, a device driver is responsible for calling
 other functions like request_irq(), enable_irq(), etc. to enable
@@ -470,7 +470,68 @@ LOC:     324553     325068
 ERR:          0
 MIS:          0
-6. FAQ
+6. MSI quirks
 Several PCI chipsets or devices are known to not support MSI.
 The PCI stack provides 3 possible levels of MSI disabling:
 * on a single device
 * on all devices behind a specific bridge
 * globally
 6.1. Disabling MSI on a single device
 Under some circumstances, it might be required to disable MSI on a
 single device. This may be achieved by either not calling pci_enable_msi()
 at all, or by setting the pci_dev->no_msi flag beforehand (most of the time
 in a quirk).
 6.2. Disabling MSI below a bridge
 The vast majority of MSI quirks are required by PCI bridges not
 being able to route MSI between busses. In this case, MSI have to be
 disabled on all devices behind this bridge. This is achieved by setting
 the PCI_BUS_FLAGS_NO_MSI flag in the pci_bus->bus_flags of the bridge
 subordinate bus. There is no need to set the same flag on bridges that
 are below the broken bridge. When pci_enable_msi() is called to enable
 MSI on a device, pci_msi_supported() takes care of checking the NO_MSI
 flag in all parent busses of the device.
 Some bridges actually support dynamic MSI support enabling/disabling
 by changing some bits in their PCI configuration space (especially
 the Hypertransport chipsets such as the nVidia nForce and Serverworks
 HT2000). It may then be required to update the NO_MSI flag on the
 corresponding devices in the sysfs hierarchy. To enable MSI support
 on device "0000:00:0e", do:
 	echo 1 > /sys/bus/pci/devices/0000:00:0e/msi_bus
 To disable MSI support, echo 0 instead of 1. Note that it should be
 used with caution since changing this value might break interrupts.
 6.3. Disabling MSI globally
 Some extreme cases may require disabling MSI globally on the system.
 For now, the only known case is a Serverworks PCI-X chipset (MSI are
 not supported on several busses that are not all connected to the
 chipset in the Linux PCI hierarchy). In the vast majority of other
 cases, disabling only behind a specific bridge is enough.
 For debugging purposes, the user may also pass pci=nomsi on the kernel
 command-line to explicitly disable MSI globally. But, once the
 appropriate quirks are added to the kernel, this option should not be
 required anymore.
 6.4. Finding why MSI cannot be enabled on a device
 Assuming that MSI are not enabled on a device, you should look at
 dmesg to find messages that quirks may output when disabling MSI
 on some devices, some bridges or even globally.
 Then, lspci -t gives the list of bridges above a device. Reading
 /sys/bus/pci/devices/0000:00:0e/msi_bus will tell you whether MSI
 are enabled (1) or disabled (0). If 0 is found in a single bridge
 msi_bus file above the device, MSI cannot be enabled.
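
The check in 6.4 can be sketched as a small shell helper. This is an illustration only, not part of the patch: the function name and the PCI_DEVICES override are hypothetical, and the bridge addresses are assumed to be read off "lspci -t" by hand and passed in as arguments.

```shell
# Sketch: report the first bridge above a device whose msi_bus reads 0.
# Arguments are the PCI addresses of the bridges above the device.
# PCI_DEVICES is a hypothetical override for dry runs; it defaults to
# the sysfs directory named in the text.
msi_blocked_by() {
    devdir=${PCI_DEVICES:-/sys/bus/pci/devices}
    for bridge in "$@"; do
        f="$devdir/$bridge/msi_bus"
        [ -r "$f" ] || continue      # no readable msi_bus file here; skip
        if [ "$(cat "$f")" = "0" ]; then
            echo "$bridge"           # MSI is disabled behind this bridge
            return 0
        fi
    done
    return 1                         # no listed bridge reported 0
}
```

For example, `msi_blocked_by 0000:00:0e` prints the address and returns success if that bridge disables MSI, and prints nothing and returns failure otherwise.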
 7. FAQ
 Q1. Are there any limitations on using the MSI?
@@ -49,7 +49,7 @@ __u64 stime, utime;
 	}
 /* Maximum size of response requested or message sent */
-#define MAX_MSG_SIZE	256
+#define MAX_MSG_SIZE	1024
 /* Maximum number of cpus expected to be specified in a cpumask */
 #define MAX_CPUS	32
 /* Maximum length of pathname to log file */
@@ -96,9 +96,9 @@ a) TASKSTATS_TYPE_AGGR_PID/TGID : attribute containing no payload but indicates
 a pid/tgid will be followed by some stats.
 b) TASKSTATS_TYPE_PID/TGID: attribute whose payload is the pid/tgid whose stats
-is being returned.
+are being returned.
-c) TASKSTATS_TYPE_STATS: attribute with a struct taskstsats as payload. The
+c) TASKSTATS_TYPE_STATS: attribute with a struct taskstats as payload. The
 same structure is used for both per-pid and per-tgid stats.
 3. New message sent by kernel whenever a task exits. The payload consists of a
@@ -122,12 +122,12 @@ of atomicity).
 However, maintaining per-process, in addition to per-task stats, within the
 kernel has space and time overheads. To address this, the taskstats code
-accumalates each exiting task's statistics into a process-wide data structure.
+accumulates each exiting task's statistics into a process-wide data structure.
-When the last task of a process exits, the process level data accumalated also
+When the last task of a process exits, the process level data accumulated also
 gets sent to userspace (along with the per-task data).
 When a user queries to get per-tgid data, the sum of all other live threads in
-the group is added up and added to the accumalated total for previously exited
+the group is added up and added to the accumulated total for previously exited
 threads of the same thread group.
 Extending taskstats
@@ -183,7 +183,7 @@ it, the pci dma mapping routines and associated data structures have now been
 modified to accomplish a direct page -> bus translation, without requiring
 a virtual address mapping (unlike the earlier scheme of virtual address
 -> bus translation). So this works uniformly for high-memory pages (which
-do not have a correponding kernel virtual address space mapping) and
+do not have a corresponding kernel virtual address space mapping) and
 low-memory pages.
 Note: Please refer to DMA-mapping.txt for a discussion on PCI high mem DMA
@@ -391,7 +391,7 @@ forced such requests to be broken up into small chunks before being passed
 on to the generic block layer, only to be merged by the i/o scheduler
 when the underlying device was capable of handling the i/o in one shot.
 Also, using the buffer head as an i/o structure for i/os that didn't originate
-from the buffer cache unecessarily added to the weight of the descriptors
+from the buffer cache unnecessarily added to the weight of the descriptors
 which were generated for each such chunk.
 The following were some of the goals and expectations considered in the
@@ -403,14 +403,14 @@ i.  Should be appropriate as a descriptor for both raw and buffered i/o  -
    for raw i/o.
 ii. Ability to represent high-memory buffers (which do not have a virtual
    address mapping in kernel address space).
-iii.Ability to represent large i/os w/o unecessarily breaking them up (i.e
+iii.Ability to represent large i/os w/o unnecessarily breaking them up (i.e
    greater than PAGE_SIZE chunks in one shot)
 iv. At the same time, ability to retain independent identity of i/os from
    different sources or i/o units requiring individual completion (e.g. for
    latency reasons)
 v.  Ability to represent an i/o involving multiple physical memory segments
    (including non-page aligned page fragments, as specified via readv/writev)
-    without unecessarily breaking it up, if the underlying device is capable of
+    without unnecessarily breaking it up, if the underlying device is capable of
    handling it.
 vi. Preferably should be based on a memory descriptor structure that can be
    passed around different types of subsystems or layers, maybe even
@@ -1013,7 +1013,7 @@ Characteristics:
 i. Binary tree
 AS and deadline i/o schedulers use red black binary trees for disk position
 sorting and searching, and a fifo linked list for time-based searching. This
-gives good scalability and good availablility of information. Requests are
+gives good scalability and good availability of information. Requests are
 almost always dispatched in disk sort order, so a cache is kept of the next
 request in sort order to prevent binary tree lookups.
@@ -1,7 +1,7 @@
-The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 plattforms.
+The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 platforms.
-This works better than on other plattforms, because the FSB of the CPU
+This works better than on other platforms, because the FSB of the CPU
 can be controlled independently from the PCI/AGP clock.
 The module has two options:
@@ -46,7 +46,7 @@ maxcpus=n    Restrict boot time cpus to n. Say if you have 4 cpus, using
             maxcpus=2 will only boot 2. You can choose to bring the
             other cpus later online, read FAQ's for more info.
-additional_cpus*=n	Use this to limit hotpluggable cpus. This option sets
+additional_cpus=n (*)	Use this to limit hotpluggable cpus. This option sets
  			cpu_possible_map = cpu_present_map + additional_cpus
 (*) Option valid only for following architectures
@@ -54,8 +54,8 @@ additional_cpus*=n	Use this to limit hotpluggable cpus. This option sets
 ia64 and x86_64 use the number of disabled local apics in ACPI tables MADT
 to determine the number of potentially hot-pluggable cpus. The implementation
-should only rely on this to count the #of cpus, but *MUST* not rely on the
+should only rely on this to count the # of cpus, but *MUST* not rely on the
-apicid values in those tables for disabled apics. In the event BIOS doesnt
+apicid values in those tables for disabled apics. In the event BIOS doesn't
 mark such hot-pluggable cpus as disabled entries, one could use this
 parameter "additional_cpus=x" to represent those cpus in the cpu_possible_map.
@@ -101,15 +101,15 @@ cpu_possible_map/for_each_possible_cpu() to iterate.
 Never use anything other than cpumask_t to represent bitmap of CPUs.
-#include <linux/cpumask.h>
+	#include <linux/cpumask.h>
-for_each_possible_cpu     - Iterate over cpu_possible_map
+	for_each_possible_cpu     - Iterate over cpu_possible_map
-for_each_online_cpu       - Iterate over cpu_online_map
+	for_each_online_cpu       - Iterate over cpu_online_map
-for_each_present_cpu      - Iterate over cpu_present_map
+	for_each_present_cpu      - Iterate over cpu_present_map
-for_each_cpu_mask(x,mask) - Iterate over some random collection of cpu mask.
+	for_each_cpu_mask(x,mask) - Iterate over some random collection of cpu mask.
-#include <linux/cpu.h>
+	#include <linux/cpu.h>
-lock_cpu_hotplug() and unlock_cpu_hotplug():
+	lock_cpu_hotplug() and unlock_cpu_hotplug():
 The above calls are used to inhibit cpu hotplug operations. While holding the
 cpucontrol mutex, cpu_online_map will not change. If you merely need to avoid
@@ -120,7 +120,7 @@ will work as long as stop_machine_run() is used to take a cpu down.
 CPU Hotplug - Frequently Asked Questions.
-Q: How to i enable my kernel to support CPU hotplug?
+Q: How to enable my kernel to support CPU hotplug?
 A: When doing make defconfig, Enable CPU hotplug support
   "Processor type and Features" -> Support for Hotpluggable CPUs
@@ -141,39 +141,39 @@ A: You should now notice an entry in sysfs.
 Check if sysfs is mounted, using the "mount" command. You should notice
 an entry as shown below in the output.
-....
+	....
-none on /sys type sysfs (rw)
+	none on /sys type sysfs (rw)
-....
+	....
-if this is not mounted, do the following.
+If this is not mounted, do the following.
-#mkdir /sysfs
+	 #mkdir /sysfs
-#mount -t sysfs sys /sys
+	#mount -t sysfs sys /sys
-now you should see entries for all present cpu, the following is an example
+Now you should see entries for all present cpu, the following is an example
 in a 8-way system.
-#pwd
+	#pwd
-#/sys/devices/system/cpu
+	#/sys/devices/system/cpu
-#ls -l
+	#ls -l
-total 0
+	total 0
-drwxr-xr-x  10 root root 0 Sep 19 07:44 .
+	drwxr-xr-x  10 root root 0 Sep 19 07:44 .
-drwxr-xr-x  13 root root 0 Sep 19 07:45 ..
+	drwxr-xr-x  13 root root 0 Sep 19 07:45 ..
-drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu0
+	drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu0
-drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu1
+	drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu1
-drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu2
+	drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu2
-drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu3
+	drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu3
-drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu4
+	drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu4
-drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu5
+	drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu5
-drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu6
+	drwxr-xr-x   3 root root 0 Sep 19 07:44 cpu6
-drwxr-xr-x   3 root root 0 Sep 19 07:48 cpu7
+	drwxr-xr-x   3 root root 0 Sep 19 07:48 cpu7
 Under each directory you would find an "online" file which is the control
 file to logically online/offline a processor.
 Q: Does hot-add/hot-remove refer to physical add/remove of cpus?
 A: The terms hot-add/remove may not be used very consistently in the code.
-CONFIG_CPU_HOTPLUG enables logical online/offline capability in the kernel.
+CONFIG_HOTPLUG_CPU enables logical online/offline capability in the kernel.
 To support physical addition/removal, one would need some BIOS hooks and
 the platform should have something like an attention button in PCI hotplug.
 CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for physical add/remove of CPUs.
@@ -181,17 +181,17 @@ CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for physical add/remove of CPUs.
 Q: How do i logically offline a CPU?
 A: Do the following.
-#echo 0 > /sys/devices/system/cpu/cpuX/online
+	#echo 0 > /sys/devices/system/cpu/cpuX/online
-once the logical offline is successful, check
+Once the logical offline is successful, check
-#cat /proc/interrupts
+	#cat /proc/interrupts
-you should now not see the CPU that you removed. Also online file will report
+You should no longer see the CPU that you removed. Also, the online file
 will report the state as 0 when a cpu is offline and 1 when it is online.
-#To display the current cpu state.
+	#To display the current cpu state.
-#cat /sys/devices/system/cpu/cpuX/online
+	#cat /sys/devices/system/cpu/cpuX/online
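The same check can be made from a program. The following is a minimal user-space sketch, not kernel code: the helper name read_cpu_online() is hypothetical, and it only assumes that the "online" file holds a single 0 or 1 as described above.

```c
#include <stdio.h>

/*
 * Read a sysfs-style "online" control file.
 * Returns 1 if the cpu is online, 0 if it is offline, and -1 if the
 * file cannot be opened or does not hold 0 or 1.
 * (Hypothetical helper for illustration; not part of any kernel API.)
 */
int read_cpu_online(const char *path)
{
	FILE *f = fopen(path, "r");
	int state;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &state) != 1)
		state = -1;
	fclose(f);
	if (state != 0 && state != 1)
		return -1;
	return state;
}
```

A caller would pass a path such as /sys/devices/system/cpu/cpuX/online with X replaced by a cpu number. A return of -1 can simply mean that the cpu has no "online" file on that system.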
 Q: Why cant i remove CPU0 on some systems?
 A: Some architectures may have some special dependency on a certain CPU.
@@ -234,8 +234,8 @@ Q: If i have some kernel code that needs to be aware of CPU arrival and
   departure, how do i arrange for proper notification?
 A: This is what you would need in your kernel code to receive notifications.
-    #include <linux/cpu.h>
+	#include <linux/cpu.h>
-    static int __cpuinit foobar_cpu_callback(struct notifier_block *nfb,
+	static int __cpuinit foobar_cpu_callback(struct notifier_block *nfb,
 					    unsigned long action, void *hcpu)
 	{
 		unsigned int cpu = (unsigned long)hcpu;
@@ -279,10 +279,10 @@ Q: I don't see my action being called for all CPUs already up and running?
 A: Yes, CPU notifiers are called only when new CPUs are on-lined or offlined.
   If you need to perform some action for each cpu already in the system, then
-  for_each_online_cpu(i) {
+	for_each_online_cpu(i) {
 		foobar_cpu_callback(&foobar_cpu_notifier, CPU_UP_PREPARE, i);
-		foobar_cpu_callback(&foobar-cpu_notifier, CPU_ONLINE, i);
+		foobar_cpu_callback(&foobar_cpu_notifier, CPU_ONLINE, i);
-  }
+	}
 Q: If i would like to develop cpu hotplug support for a new architecture,
   what do i need at a minimum?
@@ -307,38 +307,38 @@ Q: I need to ensure that a particular cpu is not removed when there is some
   work specific to this cpu is in progress.
 A: First switch the current thread context to preferred cpu
-   int my_func_on_cpu(int cpu)
+	int my_func_on_cpu(int cpu)
-   {
+	{
-       cpumask_t saved_mask, new_mask = CPU_MASK_NONE;
+		cpumask_t saved_mask, new_mask = CPU_MASK_NONE;
-       int curr_cpu, err = 0;
+		int curr_cpu, err = 0;
-       saved_mask = current->cpus_allowed;
+		saved_mask = current->cpus_allowed;
-       cpu_set(cpu, new_mask);
+		cpu_set(cpu, new_mask);
-       err = set_cpus_allowed(current, new_mask);
+		err = set_cpus_allowed(current, new_mask);
-       if (err)
+		if (err)
-           return err;
+			return err;
-       /*
+		/*
-        * If we got scheduled out just after the return from
+		 * If we got scheduled out just after the return from
-        * set_cpus_allowed() before running the work, this ensures
+		 * set_cpus_allowed() before running the work, this ensures
-        * we stay locked.
+		 * we stay locked.
-        */
+		 */
-       curr_cpu = get_cpu();
+		curr_cpu = get_cpu();
-       if (curr_cpu != cpu) {
+		if (curr_cpu != cpu) {
-	   err = -EAGAIN;
+			err = -EAGAIN;
-           goto ret;
+			goto ret;
-       } else {
+		} else {
-       	   /*
+			/*
-	    * Do work : But cant sleep, since get_cpu() disables preempt
+			 * Do work : But can't sleep, since get_cpu() disables preempt
-	    */
+			 */
-       }
+		}
-    ret:
+	ret:
-    	put_cpu();
+		put_cpu();
-	set_cpus_allowed(current, saved_mask);
+		set_cpus_allowed(current, saved_mask);
-	return err;
+		return err;
-    }
+	}
 Q: How do we determine how many CPUs are available for hotplug?
@@ -92,7 +92,7 @@ Your cooperation is appreciated.
 		  7 = /dev/full		Returns ENOSPC on write
 		  8 = /dev/random	Nondeterministic random number gen.
 		  9 = /dev/urandom	Faster, less secure random number gen.
-		 10 = /dev/aio		Asyncronous I/O notification interface
+		 10 = /dev/aio		Asynchronous I/O notification interface
 		 11 = /dev/kmsg		Writes to this come out as printk's
  1 block	RAM disk
 		  0 = /dev/ram0		First RAM disk
@@ -1093,7 +1093,7 @@ Your cooperation is appreciated.
 55 char	DSP56001 digital signal processor
 		  0 = /dev/dsp56k	First DSP56001
- 55 block	Mylex DAC960 PCI RAID controller; eigth controller
+ 55 block	Mylex DAC960 PCI RAID controller; eighth controller
 		  0 = /dev/rd/c7d0	First disk, whole disk
 		  8 = /dev/rd/c7d1	Second disk, whole disk
 		    ...
@@ -1456,7 +1456,7 @@ Your cooperation is appreciated.
 		  1 = /dev/cum1		Callout device for ttyM1
 		    ...
- 79 block	Compaq Intelligent Drive Array, eigth controller
+ 79 block	Compaq Intelligent Drive Array, eighth controller
 		  0 = /dev/ida/c7d0	First logical drive whole disk
 		 16 = /dev/ida/c7d1	Second logical drive whole disk
 		    ...
@@ -1900,7 +1900,7 @@ Your cooperation is appreciated.
 		  1 = /dev/av1		Second A/V card
 		    ...
-111 block	Compaq Next Generation Drive Array, eigth controller
+111 block	Compaq Next Generation Drive Array, eighth controller
 		  0 = /dev/cciss/c7d0	First logical drive, whole disk
 		 16 = /dev/cciss/c7d1	Second logical drive, whole disk
 		    ...
@@ -1,99 +1,131 @@
 Platform Devices and Drivers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 See <linux/platform_device.h> for the driver model interface to the
 platform bus:  platform_device, and platform_driver.  This pseudo-bus
 is used to connect devices on busses with minimal infrastructure,
 like those used to integrate peripherals on many system-on-chip
 processors, or some "legacy" PC interconnects; as opposed to large
 formally specified ones like PCI or USB.
 Platform devices
 ~~~~~~~~~~~~~~~~
 Platform devices are devices that typically appear as autonomous
 entities in the system. This includes legacy port-based devices and
-host bridges to peripheral buses. 
+host bridges to peripheral buses, and most controllers integrated
 into system-on-chip platforms.  What they usually have in common
 is direct addressing from a CPU bus.  Rarely, a platform_device will
 be connected through a segment of some other kind of bus; but its
 registers will still be directly addressable.
 Platform devices are given a name, used in driver binding, and a
 list of resources such as addresses and IRQs.
 struct platform_device {
 	const char	*name;
 	u32		id;
 	struct device	dev;
 	u32		num_resources;
 	struct resource	*resource;
 };
 Platform drivers
 ~~~~~~~~~~~~~~~~
-Drivers for platform devices are typically very simple and
+Platform drivers follow the standard driver model convention, where
-unstructured. Either the device was present at a particular I/O port
+discovery/enumeration is handled outside the drivers, and drivers
-and the driver was loaded, or it was not. There was no possibility
+provide probe() and remove() methods.  They support power management
-of hotplugging or alternative discovery besides probing at a specific
+and shutdown notifications using the standard conventions.
-I/O address and expecting a specific response.
+
 struct platform_driver {
 	int (*probe)(struct platform_device *);
 	int (*remove)(struct platform_device *);
 	void (*shutdown)(struct platform_device *);
 	int (*suspend)(struct platform_device *, pm_message_t state);
 	int (*suspend_late)(struct platform_device *, pm_message_t state);
 	int (*resume_early)(struct platform_device *);
 	int (*resume)(struct platform_device *);
 	struct device_driver driver;
 };
 Note that probe() should in general verify that the specified device hardware
 actually exists; sometimes platform setup code can't be sure.  The probing
 can use device resources, including clocks, and device platform_data.
 Platform drivers register themselves the normal way:
 	int platform_driver_register(struct platform_driver *drv);
 Or, in common situations where the device is known not to be hot-pluggable,
 the probe() routine can live in an init section to reduce the driver's
 runtime memory footprint:
 	int platform_driver_probe(struct platform_driver *drv,
 			  int (*probe)(struct platform_device *));
-Other Architectures, Modern Firmware, and new Platforms
+Device Enumeration
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~
-These devices are not always at the legacy I/O ports. This is true on
+As a rule, platform specific (and often board-specific) setup code will
-other architectures and on some modern architectures. In most cases,
+register platform devices:
-the drivers are modified to discover the devices at other well-known
+
-ports for the given platform. However, the firmware in these systems
+	int platform_device_register(struct platform_device *pdev);
-does usually know where exactly these devices reside, and in some
+
-cases, it's the only way of discovering them. 
+	int platform_add_devices(struct platform_device **pdevs, int ndev);
 The general rule is to register only those devices that actually exist,
 but in some cases extra devices might be registered.  For example, a kernel
 might be configured to work with an external network adapter that might not
 be populated on all boards, or likewise to work with an integrated controller
 that some boards might not hook up to any peripherals.
 In some cases, boot firmware will export tables describing the devices
 that are populated on a given board.   Without such tables, often the
 only way for system setup code to set up the correct devices is to build
 a kernel for a specific target board.  Such board-specific kernels are
 common with embedded and custom systems development.
 In many cases, the memory and IRQ resources associated with the platform
 device are not enough to let the device's driver work.  Board setup code
 will often provide additional information using the device's platform_data
 field.
 Embedded systems frequently need one or more clocks for platform devices,
 which are normally kept off until they're actively needed (to save power).
 System setup also associates those clocks with the device, so that
 calls to clk_get(&pdev->dev, clock_name) return them as needed.
-The Platform Bus
+Device Naming and Driver Binding
-~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-A platform bus has been created to deal with these issues. First and
+The platform_device.dev.bus_id is the canonical name for the devices.
-foremost, it groups all the legacy devices under a common bus, and
+It's built from two components:
 gives them a common parent if they don't already have one. 
-But, besides the organizational benefits, the platform bus can also
+    * platform_device.name ... which is also used for driver matching.
 accommodate firmware-based enumeration. 
    * platform_device.id ... the device instance number, or else "-1"
      to indicate there's only one.
-Device Discovery
+These are catenated, so name/id "serial"/0 indicates bus_id "serial.0", and
-~~~~~~~~~~~~~~~~
+"serial"/3 indicates bus_id "serial.3"; both would use the platform_driver
-The platform bus has no concept of probing for devices. Devices
+named "serial".  While "my_rtc"/-1 would be bus_id "my_rtc" (no instance id)
-discovery is left up to either the legacy drivers or the
+and use the platform_driver called "my_rtc".
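The naming rule above can be sketched in plain C. This is a user-space illustration only: make_bus_id() is a hypothetical helper, and the BUS_ID_SIZE value used here is an assumption made for the sketch rather than something stated in this document.

```c
#include <stdio.h>

#define BUS_ID_SIZE 20	/* assumed buffer size for this sketch */

/*
 * Build a bus_id from the two components described above:
 * "name.id" for a numbered instance, or just "name" when id is -1
 * (meaning there is only one such device).
 * Hypothetical helper; not the driver core's own code.
 */
void make_bus_id(char *buf, size_t len, const char *name, int id)
{
	if (id == -1)
		snprintf(buf, len, "%s", name);
	else
		snprintf(buf, len, "%s.%d", name, id);
}
```

With this helper, "serial"/0 yields "serial.0", "serial"/3 yields "serial.3", and "my_rtc"/-1 yields "my_rtc", matching the examples in the text.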
 firmware. These entities are expected to notify the platform of
 devices that it discovers via the bus's add() callback:
-	platform_bus.add(parent,bus_id).
+Driver binding is performed automatically by the driver core, invoking
 driver probe() after finding a match between device and driver.  If the
 probe() succeeds, the driver and device are bound as usual.  There are
 three different ways to find such a match:
    - Whenever a device is registered, the drivers for that bus are
      checked for matches.  Platform devices should be registered very
      early during system boot.
-Bus IDs
+    - When a driver is registered using platform_driver_register(), all
-~~~~~~~
+      unbound devices on that bus are checked for matches.  Drivers
-Bus IDs are the canonical names for the devices. There is no globally
+      usually register later during booting, or by module loading.
 standard addressing mechanism for legacy devices. In the IA-32 world,
 we have Pnp IDs to use, as well as the legacy I/O ports. However,
 neither tell what the device really is or have any meaning on other
 platforms. 
-Since both PnP IDs and the legacy I/O ports (and other standard I/O
+    - Registering a driver using platform_driver_probe() works just like
-ports for specific devices) have a 1:1 mapping, we map the
+      using platform_driver_register(), except that the driver won't
-platform-specific name or identifier to a generic name (at least
+      be probed later if another device registers.  (Which is OK, since
-within the scope of the kernel).
+      this interface is only for use with non-hotpluggable devices.)
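The driver-registration case can be modelled with a small user-space toy. All toy_* names below are hypothetical and exist only for this sketch; the real matching is done by the driver core, and the model assumes nothing beyond what the text says: a driver matches a device when their names are equal, and the two are bound only if probe() succeeds.

```c
#include <stddef.h>
#include <string.h>

struct toy_device;

/* Toy stand-in for a platform driver: a name plus an optional probe(). */
struct toy_driver {
	const char *name;
	int (*probe)(struct toy_device *dev);	/* returns 0 on success */
};

/* Toy stand-in for a platform device. */
struct toy_device {
	const char *name;			/* e.g. "serial" */
	int id;					/* instance number, or -1 */
	const struct toy_driver *driver;	/* NULL while unbound */
};

/* A device and a driver match when their names are equal. */
int toy_match(const struct toy_device *dev, const struct toy_driver *drv)
{
	return strcmp(dev->name, drv->name) == 0;
}

/* Example probe() that always fails, to show a match that does not bind. */
int toy_probe_reject(struct toy_device *dev)
{
	(void)dev;
	return -1;
}

/*
 * Model of registering a driver: every unbound device is checked for a
 * match, probe() is invoked on each match, and the pair is bound only
 * if probe() succeeds (a NULL probe counts as success in this toy).
 * Returns the number of devices bound.
 */
size_t toy_driver_register(const struct toy_driver *drv,
			   struct toy_device *devs, size_t ndevs)
{
	size_t i, bound = 0;

	for (i = 0; i < ndevs; i++) {
		if (devs[i].driver || !toy_match(&devs[i], drv))
			continue;
		if (drv->probe && drv->probe(&devs[i]) != 0)
			continue;	/* probe failed: device stays unbound */
		devs[i].driver = drv;
		bound++;
	}
	return bound;
}
```

Registering a "serial" driver against devices "serial"/0, "serial"/3 and "my_rtc"/-1 binds the first two and leaves "my_rtc" unbound. Registering the same driver again binds nothing, since only unbound devices are checked.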
 For example, a serial driver might find a device at I/O 0x3f8. The
 ACPI firmware might also discover a device with PnP ID (_HID)
 PNP0501. Both correspond to the same device and should be mapped to the
 canonical name 'serial'. 
 The bus_id field should be a concatenation of the canonical name and
 the instance of that type of device. For example, the device at I/O
 port 0x3f8 should have a bus_id of "serial0". This places the
 responsibility of enumerating devices of a particular type up to the
 discovery mechanism. But, they are the entity that should know best
 (as opposed to the platform bus driver).
 Drivers 
 ~~~~~~~
 Drivers for platform devices should have a name that is the same as
 the canonical name of the devices they support. This allows the
 platform bus driver to do simple matching with the basic data
 structures to determine if a driver supports a certain device. 
 For example, a legacy serial driver should have a name of 'serial' and
 register itself with the platform bus. 
 Driver Binding
 ~~~~~~~~~~~~~~
 Legacy drivers assume they are bound to the device once they start up
 and probe an I/O port. Divorcing them from this will be a difficult
 process. However, that shouldn't prevent us from implementing
 firmware-based enumeration. 
 The firmware should notify the platform bus about devices before the
 legacy drivers have had a chance to load. Once the drivers are loaded,
 they driver model core will attempt to bind the driver to any
 previously-discovered devices. Once that has happened, it will be free
 to discover any other devices it pleases.
@@ -92,7 +92,7 @@ struct device represents a single device. It mainly contains metadata
 describing the relationship the device has to other entities. 
- Embedd a struct device in the bus-specific device type. 
+- Embed a struct device in the bus-specific device type. 
 struct pci_dev {