Merge branch 'master' into gfs2

2026-01-06 10:13:00 -08:00 · 2006-09-28 08:29:59 -04:00
parent 3f1a9aaeff a77c64c1a6
commit 185a257f2f
1941 changed files with 87478 additions and 33031 deletions
--- a/Documentation/ABI/obsolete/devfs
+++ b/Documentation/ABI/obsolete/devfs
@@ -1,13 +1,12 @@
 What:		devfs
-Date:		July 2005
+Date:		July 2005 (scheduled), finally removed in kernel v2.6.18
 Contact:	Greg Kroah-Hartman <gregkh@suse.de>
 Description:
 	devfs has been unmaintained for a number of years, has unfixable
 	races, contains a naming policy within the kernel that is
 	against the LSB, and can be replaced by using udev.
-	The files fs/devfs/*, include/linux/devfs_fs*.h will be removed,
+	The files fs/devfs/*, include/linux/devfs_fs*.h were removed,
 	along with the the assorted devfs function calls throughout the
 	kernel tree.

 Users:
-
--- a/Documentation/ABI/testing/sysfs-power
+++ b/Documentation/ABI/testing/sysfs-power
@@ -0,0 +1,88 @@
+What:		/sys/power/
+Date:		August 2006
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power directory will contain files that will
+		provide a unified interface to the power management
+		subsystem.
+
+What:		/sys/power/state
+Date:		August 2006
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/state file controls the system power state.
+		Reading from this file returns what states are supported,
+		which is hard-coded to 'standby' (Power-On Suspend), 'mem'
+		(Suspend-to-RAM), and 'disk' (Suspend-to-Disk).
+
+		Writing to this file one of these strings causes the system to
+		transition into that state. Please see the file
+		Documentation/power/states.txt for a description of each of
+		these states.
+
+What:		/sys/power/disk
+Date:		August 2006
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/disk file controls the operating mode of the
+		suspend-to-disk mechanism.  Reading from this file returns
+		the name of the method by which the system will be put to
+		sleep on the next suspend.  There are four methods supported:
+		'firmware' - means that the memory image will be saved to disk
+		by some firmware, in which case we also assume that the
+		firmware will handle the system suspend.
+		'platform' - the memory image will be saved by the kernel and
+		the system will be put to sleep by the platform driver (e.g.
+		ACPI or other PM registers).
+		'shutdown' - the memory image will be saved by the kernel and
+		the system will be powered off.
+		'reboot' - the memory image will be saved by the kernel and
+		the system will be rebooted.
+
+		The suspend-to-disk method may be chosen by writing to this
+		file one of the accepted strings:
+
+		'firmware'
+		'platform'
+		'shutdown'
+		'reboot'
+
+		It will only change to 'firmware' or 'platform' if the system
+		supports that.
+
+What:		/sys/power/image_size
+Date:		August 2006
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/image_size file controls the size of the image
+		created by the suspend-to-disk mechanism.  It can be written a
+		string representing a non-negative integer that will be used
+		as an upper limit of the image size, in bytes.  The kernel's
+		suspend-to-disk code will do its best to ensure the image size
+		will not exceed this number.  However, if it turns out to be
+		impossible, the kernel will try to suspend anyway using the
+		smallest image possible.  In particular, if "0" is written to
+		this file, the suspend image will be as small as possible.
+
+		Reading from this file will display the current image size
+		limit, which is set to 500 MB by default.
+
+What:		/sys/power/pm_trace
+Date:		August 2006
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/pm_trace file controls the code which saves the
+		last PM event point in the RTC across reboots, so that you can
+		debug a machine that just hangs during suspend (or more
+		commonly, during resume).  Namely, the RTC is only used to save
+		the last PM event point if this file contains '1'.  Initially
+		it contains '0' which may be changed to '1' by writing a
+		string representing a nonzero integer into it.
+
+		To use this debugging feature you should attempt to suspend
+		the machine, then reboot it and run
+
+		dmesg -s 1000000 | grep 'hash matches'
+
+		CAUTION: Using it will cause your machine's real-time (CMOS)
+		clock to be set to a random invalid time after a resume.
--- a/Documentation/DocBook/usb.tmpl
+++ b/Documentation/DocBook/usb.tmpl
@@ -43,59 +43,52 @@

    <para>A Universal Serial Bus (USB) is used to connect a host,
    such as a PC or workstation, to a number of peripheral
-    devices.  USB uses a tree structure, with the host at the
+    devices.  USB uses a tree structure, with the host as the
    root (the system's master), hubs as interior nodes, and
-    peripheral devices as leaves (and slaves).
+    peripherals as leaves (and slaves).
    Modern PCs support several such trees of USB devices, usually
    one USB 2.0 tree (480 Mbit/sec each) with
    a few USB 1.1 trees (12 Mbit/sec each) that are used when you
    connect a USB 1.1 device directly to the machine's "root hub".
    </para>

-    <para>That master/slave asymmetry was designed in part for
-    ease of use.  It is not physically possible to assemble
-    (legal) USB cables incorrectly:  all upstream "to-the-host"
-    connectors are the rectangular type, matching the sockets on
-    root hubs, and the downstream type are the squarish type
-    (or they are built in to the peripheral).
-    Software doesn't need to deal with distributed autoconfiguration
-    since the pre-designated master node manages all that.
-    At the electrical level, bus protocol overhead is reduced by
-    eliminating arbitration and moving scheduling into host software.
+    <para>That master/slave asymmetry was designed-in for a number of
+    reasons, one being ease of use.  It is not physically possible to
+    assemble (legal) USB cables incorrectly:  all upstream "to the host"
+    connectors are the rectangular type (matching the sockets on
+    root hubs), and all downstream connectors are the squarish type
+    (or they are built into the peripheral).
+    Also, the host software doesn't need to deal with distributed
+    auto-configuration since the pre-designated master node manages all that.
+    And finally, at the electrical level, bus protocol overhead is reduced by
+    eliminating arbitration and moving scheduling into the host software.
    </para>

-    <para>USB 1.0 was announced in January 1996, and was revised
+    <para>USB 1.0 was announced in January 1996 and was revised
    as USB 1.1 (with improvements in hub specification and
    support for interrupt-out transfers) in September 1998.
-    USB 2.0 was released in April 2000, including high speed
-    transfers and transaction translating hubs (used for USB 1.1
+    USB 2.0 was released in April 2000, adding high-speed
+    transfers and transaction-translating hubs (used for USB 1.1
    and 1.0 backward compatibility).
    </para>

-    <para>USB support was added to Linux early in the 2.2 kernel series
-    shortly before the 2.3 development forked off.  Updates
-    from 2.3 were regularly folded back into 2.2 releases, bringing
-    new features such as <filename>/sbin/hotplug</filename> support,
-    more drivers, and more robustness.
-    The 2.5 kernel series continued such improvements, and also
-    worked on USB 2.0 support,
-    higher performance,
-    better consistency between host controller drivers,
-    API simplification (to make bugs less likely),
-    and providing internal "kerneldoc" documentation.
+    <para>Kernel developers added USB support to Linux early in the 2.2 kernel
+    series, shortly before 2.3 development forked.  Updates from 2.3 were
+    regularly folded back into 2.2 releases, which improved reliability and
+    brought <filename>/sbin/hotplug</filename> support as well more drivers.
+    Such improvements were continued in the 2.5 kernel series, where they added
+    USB 2.0 support, improved performance, and made the host controller drivers
+    (HCDs) more consistent.  They also simplified the API (to make bugs less
+    likely) and added internal "kerneldoc" documentation.
    </para>

    <para>Linux can run inside USB devices as well as on
    the hosts that control the devices.
-    Because the Linux 2.x USB support evolved to support mass market
-    platforms such as Apple Macintosh or PC-compatible systems,
-    it didn't address design concerns for those types of USB systems.
-    So it can't be used inside mass-market PDAs, or other peripherals.
-    USB device drivers running inside those Linux peripherals
+    But USB device drivers running inside those peripherals
    don't do the same things as the ones running inside hosts,
-    and so they've been given a different name:
-    they're called <emphasis>gadget drivers</emphasis>.
-    This document does not present gadget drivers.
+    so they've been given a different name:
+    <emphasis>gadget drivers</emphasis>.
+    This document does not cover gadget drivers.
    </para>

    </chapter>
@@ -103,17 +96,14 @@
 <chapter id="host">
    <title>USB Host-Side API Model</title>

-    <para>Within the kernel,
-    host-side drivers for USB devices talk to the "usbcore" APIs.
-    There are two types of public "usbcore" APIs, targetted at two different
-    layers of USB driver.  Those are
-    <emphasis>general purpose</emphasis> drivers, exposed through
-    driver frameworks such as block, character, or network devices;
-    and drivers that are <emphasis>part of the core</emphasis>,
-    which are involved in managing a USB bus.
-    Such core drivers include the <emphasis>hub</emphasis> driver,
-    which manages trees of USB devices, and several different kinds
-    of <emphasis>host controller driver (HCD)</emphasis>,
+    <para>Host-side drivers for USB devices talk to the "usbcore" APIs.
+    There are two.  One is intended for
+    <emphasis>general-purpose</emphasis> drivers (exposed through
+    driver frameworks), and the other is for drivers that are
+    <emphasis>part of the core</emphasis>.
+    Such core drivers include the <emphasis>hub</emphasis> driver
+    (which manages trees of USB devices) and several different kinds
+    of <emphasis>host controller drivers</emphasis>,
    which control individual busses.
    </para>

@@ -122,21 +112,21 @@
     
    <itemizedlist>

-	<listitem><para>USB supports four kinds of data transfer
-	(control, bulk, interrupt, and isochronous).  Two transfer
-	types use bandwidth as it's available (control and bulk),
-	while the other two types of transfer (interrupt and isochronous)
+	<listitem><para>USB supports four kinds of data transfers
+	(control, bulk, interrupt, and isochronous).  Two of them (control
+	and bulk) use bandwidth as it's available,
+	while the other two (interrupt and isochronous)
 	are scheduled to provide guaranteed bandwidth.
 	</para></listitem>

 	<listitem><para>The device description model includes one or more
 	"configurations" per device, only one of which is active at a time.
-	Devices that are capable of high speed operation must also support
-	full speed configurations, along with a way to ask about the
-	"other speed" configurations that might be used.
+	Devices that are capable of high-speed operation must also support
+	full-speed configurations, along with a way to ask about the
+	"other speed" configurations which might be used.
 	</para></listitem>

-	<listitem><para>Configurations have one or more "interface", each
+	<listitem><para>Configurations have one or more "interfaces", each
 	of which may have "alternate settings".  Interfaces may be
 	standardized by USB "Class" specifications, or may be specific to
 	a vendor or device.</para>
@@ -162,7 +152,7 @@
 	</para></listitem>

 	<listitem><para>The Linux USB API supports synchronous calls for
-	control and bulk messaging.
+	control and bulk messages.
 	It also supports asynchnous calls for all kinds of data transfer,
 	using request structures called "URBs" (USB Request Blocks).
 	</para></listitem>
@@ -463,14 +453,25 @@
 	    file in your Linux kernel sources.
 	    </para>

-	    <para>Otherwise the main use for this file from programs
-	    is to poll() it to get notifications of usb devices
-	    as they're plugged or unplugged.
-	    To see what changed, you'd need to read the file and
-	    compare "before" and "after" contents, scan the filesystem,
-	    or see its hotplug event.
-	    </para>
+	    <para>This file, in combination with the poll() system call, can
+	    also be used to detect when devices are added or removed:
+<programlisting>int fd;
+struct pollfd pfd;

+fd = open("/proc/bus/usb/devices", O_RDONLY);
+pfd = { fd, POLLIN, 0 };
+for (;;) {
+	/* The first time through, this call will return immediately. */
+	poll(&amp;pfd, 1, -1);
+
+	/* To see what's changed, compare the file's previous and current
+	   contents or scan the filesystem.  (Scanning is more precise.) */
+}</programlisting>
+	    Note that this behavior is intended to be used for informational
+	    and debug purposes.  It would be more appropriate to use programs
+	    such as udev or HAL to initialize a device or start a user-mode
+	    helper program, for instance.
+	    </para>
 	</sect1>

 	<sect1>
--- a/Documentation/HOWTO
+++ b/Documentation/HOWTO
@@ -358,7 +358,8 @@ Here is a list of some of the different kernel trees available:
  quilt trees:
    - USB, PCI, Driver Core, and I2C, Greg Kroah-Hartman <gregkh@suse.de>
 	kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/
-
+    - x86-64, partly i386, Andi Kleen <ak@suse.de>
+        ftp.firstfloor.org:/pub/ak/x86_64/quilt/

 Bug Reporting
 -------------
--- a/Documentation/devices.txt
+++ b/Documentation/devices.txt
@@ -2543,6 +2543,9 @@ Your cooperation is appreciated.
 		 64 = /dev/usb/rio500	Diamond Rio 500
 		 65 = /dev/usb/usblcd	USBLCD Interface (info@usblcd.de)
 		 66 = /dev/usb/cpad0	Synaptics cPad (mouse/LCD)
+		 67 = /dev/usb/adutux0	1st Ontrak ADU device
+		    ...
+		 76 = /dev/usb/adutux10	10th Ontrak ADU device
 		 96 = /dev/usb/hiddev0	1st USB HID device
 		    ...
 		111 = /dev/usb/hiddev15	16th USB HID device
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -6,6 +6,21 @@ be removed from this file.

 ---------------------------

+What:	/sys/devices/.../power/state
+	dev->power.power_state
+	dpm_runtime_{suspend,resume)()
+When:	July 2007
+Why:	Broken design for runtime control over driver power states, confusing
+	driver-internal runtime power management with:  mechanisms to support
+	system-wide sleep state transitions; event codes that distinguish
+	different phases of swsusp "sleep" transitions; and userspace policy
+	inputs.  This framework was never widely used, and most attempts to
+	use it were broken.  Drivers should instead be exposing domain-specific
+	interfaces either to kernel or to userspace.
+Who:	Pavel Machek <pavel@suse.cz>
+
+---------------------------
+
 What:	RAW driver (CONFIG_RAW_DRIVER)
 When:	December 2005
 Why:	declared obsolete since kernel 2.6.3
@@ -55,6 +70,18 @@ Who:	Mauro Carvalho Chehab <mchehab@brturbo.com.br>

 ---------------------------

+What:	sys_sysctl
+When:	January 2007
+Why:	The same information is available through /proc/sys and that is the
+	interface user space prefers to use. And there do not appear to be
+	any existing user in user space of sys_sysctl.  The additional
+	maintenance overhead of keeping a set of binary names gets
+	in the way of doing a good job of maintaining this interface.
+
+Who:	Eric Biederman <ebiederm@xmission.com>
+
+---------------------------
+
 What:	PCMCIA control ioctl (needed for pcmcia-cs [cardmgr, cardctl])
 When:	November 2005
 Files:	drivers/pcmcia/: pcmcia_ioctl.c
@@ -202,14 +229,6 @@ Who:	Nick Piggin <npiggin@suse.de>

 ---------------------------

-What:	Support for the MIPS EV96100 evaluation board
-When:	September 2006
-Why:	Does no longer build since at least November 15, 2003, apparently
-	no userbase left.
-Who:	Ralf Baechle <ralf@linux-mips.org>
-
---------------------------
-
 What:	Support for the Momentum / PMC-Sierra Jaguar ATX evaluation board
 When:	September 2006
 Why:	Does no longer build since quite some time, and was never popular,
@@ -294,3 +313,24 @@ Why:	The frame diverter is included in most distribution kernels, but is
 	It is not clear if anyone is still using it.
 Who:	Stephen Hemminger <shemminger@osdl.org>

+---------------------------
+
+
+What:	PHYSDEVPATH, PHYSDEVBUS, PHYSDEVDRIVER in the uevent environment
+When:	Oktober 2008
+Why:	The stacking of class devices makes these values misleading and
+	inconsistent.
+	Class devices should not carry any of these properties, and bus
+	devices have SUBSYTEM and DRIVER as a replacement.
+Who:	Kay Sievers <kay.sievers@suse.de>
+
+---------------------------
+
+What:	i2c-isa
+When:	December 2006
+Why:	i2c-isa is a non-sense and doesn't fit in the device driver
+	model. Drivers relying on it are better implemented as platform
+	drivers.
+Who:	Jean Delvare <khali@linux-fr.org>
+
+---------------------------
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1124,11 +1124,15 @@ debugging information is displayed on console.
 NMI switch that most IA32 servers have fires unknown NMI up, for example.
 If a system hangs up, try pressing the NMI switch.

-[NOTE]
-   This function and oprofile share a NMI callback. Therefore this function
-   cannot be enabled when oprofile is activated.
-   And NMI watchdog will be disabled when the value in this file is set to
-   non-zero.
+nmi_watchdog
+------------
+
+Enables/Disables the NMI watchdog on x86 systems.  When the value is non-zero
+the NMI watchdog is enabled and will continuously test all online cpus to
+determine whether or not they are still functioning properly.
+
+Because the NMI watchdog shares registers with oprofile, by disabling the NMI
+watchdog, oprofile may have more registers to utilize.


 2.4 /proc/sys/vm - The virtual memory subsystem
--- a/Documentation/i2c/busses/i2c-viapro
+++ b/Documentation/i2c/busses/i2c-viapro
@@ -7,9 +7,12 @@ Supported adapters:
  * VIA Technologies, Inc. VT82C686A/B
    Datasheet: Sometimes available at the VIA website

-  * VIA Technologies, Inc. VT8231, VT8233, VT8233A, VT8235, VT8237R
+  * VIA Technologies, Inc. VT8231, VT8233, VT8233A
    Datasheet: available on request from VIA

+  * VIA Technologies, Inc. VT8235, VT8237R, VT8237A, VT8251
+    Datasheet: available on request and under NDA from VIA
+
 Authors:
 	Kyösti Mälkki <kmalkki@cc.hut.fi>,
 	Mark D. Studebaker <mdsxyz123@yahoo.com>,
@@ -39,6 +42,8 @@ Your lspci -n listing must show one of these :
 device 1106:8235   (VT8231 function 4)
 device 1106:3177   (VT8235)
 device 1106:3227   (VT8237R)
+ device 1106:3337   (VT8237A)
+ device 1106:3287   (VT8251)

 If none of these show up, you should look in the BIOS for settings like
 enable ACPI / SMBus or even USB.
--- a/Documentation/i2c/i2c-stub
+++ b/Documentation/i2c/i2c-stub
@@ -6,9 +6,12 @@ This module is a very simple fake I2C/SMBus driver.  It implements four
 types of SMBus commands: write quick, (r/w) byte, (r/w) byte data, and
 (r/w) word data.

+You need to provide a chip address as a module parameter when loading
+this driver, which will then only react to SMBus commands to this address.
+
 No hardware is needed nor associated with this module.  It will accept write
-quick commands to all addresses; it will respond to the other commands (also
-to all addresses) by reading from or writing to an array in memory.  It will
+quick commands to one address; it will respond to the other commands (also
+to one address) by reading from or writing to an array in memory.  It will
 also spam the kernel logs for every command it handles.

 A pointer register with auto-increment is implemented for all byte
@@ -21,6 +24,11 @@ The typical use-case is like this:
 	3. load the target sensors chip driver module
 	4. observe its behavior in the kernel log

+PARAMETERS:
+
+int chip_addr:
+	The SMBus address to emulate a chip at.
+
 CAVEATS:

 There are independent arrays for byte/data and word/data commands.  Depending
@@ -33,6 +41,9 @@ If the hardware for your driver has banked registers (e.g. Winbond sensors
 chips) this module will not work well - although it could be extended to
 support that pretty easily.

+Only one chip address is supported - although this module could be
+extended to support more.
+
 If you spam it hard enough, printk can be lossy.  This module really wants
 something like relayfs.

--- a/Documentation/kbuild/makefiles.txt
+++ b/Documentation/kbuild/makefiles.txt
@@ -421,6 +421,11 @@ more details, with real examples.
 	The second argument is optional, and if supplied will be used
 	if first argument is not supported.

+    as-instr
+	as-instr checks if the assembler reports a specific instruction
+	and then outputs either option1 or option2
+	C escapes are supported in the test instruction
+
    cc-option
 	cc-option is used to check if $(CC) supports a given option, and not
 	supported to use an optional second option.
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -573,8 +573,6 @@ running once the system is up.
 	gscd=		[HW,CD]
 			Format: <io>

-	gt96100eth=	[NET] MIPS GT96100 Advanced Communication Controller
-
 	gus=		[HW,OSS]
 			Format: <io>,<irq>,<dma>,<dma16>

@@ -1240,7 +1238,11 @@ running once the system is up.
 				bootloader. This is currently used on
 				IXP2000 systems where the bus has to be
 				configured a certain way for adjunct CPUs.
-
+		noearly		[X86] Don't do any early type 1 scanning.
+				This might help on some broken boards which
+				machine check when some devices' config space
+				is read. But various workarounds are disabled
+				and some IOMMU drivers will not work.
 	pcmv=		[HW,PCMCIA] BadgePAD 4

 	pd.		[PARIDE]
@@ -1363,6 +1365,14 @@ running once the system is up.

 	reserve=	[KNL,BUGS] Force the kernel to ignore some iomem area

+	reservetop=	[IA-32]
+			Format: nn[KMG]
+			Reserves a hole at the top of the kernel virtual
+			address space.
+
+	reset_devices	[KNL] Force drivers to reset the underlying device
+			during initialization.
+
 	resume=		[SWSUSP]
 			Specify the partition device for software suspend

--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -192,6 +192,17 @@ or, for backwards compatibility, the option value.  E.g.,
 arp_interval

 	Specifies the ARP link monitoring frequency in milliseconds.
+
+	The ARP monitor works by periodically checking the slave
+	devices to determine whether they have sent or received
+	traffic recently (the precise criteria depends upon the
+	bonding mode, and the state of the slave).  Regular traffic is
+	generated via ARP probes issued for the addresses specified by
+	the arp_ip_target option.
+
+	This behavior can be modified by the arp_validate option,
+	below.
+
 	If ARP monitoring is used in an etherchannel compatible mode
 	(modes 0 and 2), the switch should be configured in a mode
 	that evenly distributes packets across all links. If the
@@ -213,6 +224,54 @@ arp_ip_target
 	maximum number of targets that can be specified is 16.  The
 	default value is no IP addresses.

+arp_validate
+
+	Specifies whether or not ARP probes and replies should be
+	validated in the active-backup mode.  This causes the ARP
+	monitor to examine the incoming ARP requests and replies, and
+	only consider a slave to be up if it is receiving the
+	appropriate ARP traffic.
+
+	Possible values are:
+
+	none or 0
+
+		No validation is performed.  This is the default.
+
+	active or 1
+
+		Validation is performed only for the active slave.
+
+	backup or 2
+
+		Validation is performed only for backup slaves.
+
+	all or 3
+
+		Validation is performed for all slaves.
+
+	For the active slave, the validation checks ARP replies to
+	confirm that they were generated by an arp_ip_target.  Since
+	backup slaves do not typically receive these replies, the
+	validation performed for backup slaves is on the ARP request
+	sent out via the active slave.  It is possible that some
+	switch or network configurations may result in situations
+	wherein the backup slaves do not receive the ARP requests; in
+	such a situation, validation of backup slaves must be
+	disabled.
+
+	This option is useful in network configurations in which
+	multiple bonding hosts are concurrently issuing ARPs to one or
+	more targets beyond a common switch.  Should the link between
+	the switch and target fail (but not the switch itself), the
+	probe traffic generated by the multiple bonding instances will
+	fool the standard ARP monitor into considering the links as
+	still up.  Use of the arp_validate option can resolve this, as
+	the ARP monitor will only consider ARP requests and replies
+	associated with its own instance of bonding.
+
+	This option was added in bonding version 3.1.0.
+
 downdelay

 	Specifies the time, in milliseconds, to wait before disabling
--- a/Documentation/networking/dccp.txt
+++ b/Documentation/networking/dccp.txt
@@ -1,7 +1,6 @@
 DCCP protocol
 ============

-Last updated: 10 November 2005

 Contents
 ========
@@ -42,8 +41,11 @@ Socket options
 DCCP_SOCKOPT_PACKET_SIZE is used for CCID3 to set default packet size for
 calculations.

-DCCP_SOCKOPT_SERVICE sets the service. This is compulsory as per the
-specification. If you don't set it you will get EPROTO.
+DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of
+service codes (RFC 4340, sec. 8.1.2); if this socket option is not set,
+the socket will fall back to 0 (which means that no meaningful service code
+is present). Connecting sockets set at most one service option; for
+listening sockets, multiple service codes can be specified.

 Notes
 =====
--- a/Documentation/nommu-mmap.txt
+++ b/Documentation/nommu-mmap.txt
@@ -116,6 +116,9 @@ FURTHER NOTES ON NO-MMU MMAP
 (*) A list of all the mappings on the system is visible through /proc/maps in
     no-MMU mode.

+ (*) A list of all the mappings in use by a process is visible through
+     /proc/<pid>/maps in no-MMU mode.
+
 (*) Supplying MAP_FIXED or a requesting a particular mapping address will
     result in an error.

@@ -125,6 +128,49 @@ FURTHER NOTES ON NO-MMU MMAP
     error will result if they don't. This is most likely to be encountered
     with character device files, pipes, fifos and sockets.

+
+==========================
+INTERPROCESS SHARED MEMORY
+==========================
+
+Both SYSV IPC SHM shared memory and POSIX shared memory is supported in NOMMU
+mode.  The former through the usual mechanism, the latter through files created
+on ramfs or tmpfs mounts.
+
+
+=======
+FUTEXES
+=======
+
+Futexes are supported in NOMMU mode if the arch supports them.  An error will
+be given if an address passed to the futex system call lies outside the
+mappings made by a process or if the mapping in which the address lies does not
+support futexes (such as an I/O chardev mapping).
+
+
+=============
+NO-MMU MREMAP
+=============
+
+The mremap() function is partially supported.  It may change the size of a
+mapping, and may move it[*] if MREMAP_MAYMOVE is specified and if the new size
+of the mapping exceeds the size of the slab object currently occupied by the
+memory to which the mapping refers, or if a smaller slab object could be used.
+
+MREMAP_FIXED is not supported, though it is ignored if there's no change of
+address and the object does not need to be moved.
+
+Shared mappings may not be moved.  Shareable mappings may not be moved either,
+even if they are not currently shared.
+
+The mremap() function must be given an exact match for base address and size of
+a previously mapped object.  It may not be used to create holes in existing
+mappings, move parts of existing mappings or resize parts of mappings.  It must
+act on a complete mapping.
+
+[*] Not currently supported.
+
+
 ============================================
 PROVIDING SHAREABLE CHARACTER DEVICE SUPPORT
 ============================================
--- a/Documentation/pcieaer-howto.txt
+++ b/Documentation/pcieaer-howto.txt
@@ -0,0 +1,253 @@
+   The PCI Express Advanced Error Reporting Driver Guide HOWTO
+		T. Long Nguyen	<tom.l.nguyen@intel.com>
+		Yanmin Zhang	<yanmin.zhang@intel.com>
+				07/29/2006
+
+
+1. Overview
+
+1.1 About this guide
+
+This guide describes the basics of the PCI Express Advanced Error
+Reporting (AER) driver and provides information on how to use it, as
+well as how to enable the drivers of endpoint devices to conform with
+PCI Express AER driver.
+
+1.2 Copyright © Intel Corporation 2006.
+
+1.3 What is the PCI Express AER Driver?
+
+PCI Express error signaling can occur on the PCI Express link itself
+or on behalf of transactions initiated on the link. PCI Express
+defines two error reporting paradigms: the baseline capability and
+the Advanced Error Reporting capability. The baseline capability is
+required of all PCI Express components providing a minimum defined
+set of error reporting requirements. Advanced Error Reporting
+capability is implemented with a PCI Express advanced error reporting
+extended capability structure providing more robust error reporting.
+
+The PCI Express AER driver provides the infrastructure to support PCI
+Express Advanced Error Reporting capability. The PCI Express AER
+driver provides three basic functions:
+
+-	Gathers the comprehensive error information if errors occurred.
+-	Reports error to the users.
+-	Performs error recovery actions.
+
+AER driver only attaches root ports which support PCI-Express AER
+capability.
+
+
+2. User Guide
+
+2.1 Include the PCI Express AER Root Driver into the Linux Kernel
+
+The PCI Express AER Root driver is a Root Port service driver attached
+to the PCI Express Port Bus driver. If a user wants to use it, the driver
+has to be compiled. Option CONFIG_PCIEAER supports this capability. It
+depends on CONFIG_PCIEPORTBUS, so pls. set CONFIG_PCIEPORTBUS=y and
+CONFIG_PCIEAER = y.
+
+2.2 Load PCI Express AER Root Driver
+There is a case where a system has AER support in BIOS. Enabling the AER
+Root driver and having AER support in BIOS may result unpredictable
+behavior. To avoid this conflict, a successful load of the AER Root driver
+requires ACPI _OSC support in the BIOS to allow the AER Root driver to
+request for native control of AER. See the PCI FW 3.0 Specification for
+details regarding OSC usage. Currently, lots of firmwares don't provide
+_OSC support while they use PCI Express. To support such firmwares,
+forceload, a parameter of type bool, could enable AER to continue to
+be initiated although firmwares have no _OSC support. To enable the
+walkaround, pls. add aerdriver.forceload=y to kernel boot parameter line
+when booting kernel. Note that forceload=n by default.
+
+2.3 AER error output
+When a PCI-E AER error is captured, an error message will be outputed to
+console. If it's a correctable error, it is outputed as a warning.
+Otherwise, it is printed as an error. So users could choose different
+log level to filter out correctable error messages.
+
+Below shows an example.
+------ PCI-Express Device Error -----+
+Error Severity          : Uncorrected (Fatal)
+PCIE Bus Error type     : Transaction Layer
+Unsupported Request     : First
+Requester ID            : 0500
+VendorID=8086h, DeviceID=0329h, Bus=05h, Device=00h, Function=00h
+TLB Header:
+04000001 00200a03 05010000 00050100
+
+In the example, 'Requester ID' means the ID of the device who sends
+the error message to root port. Pls. refer to pci express specs for
+other fields.
+
+
+3. Developer Guide
+
+To enable AER aware support requires a software driver to configure
+the AER capability structure within its device and to provide callbacks.
+
+To support AER better, developers need understand how AER does work
+firstly.
+
+PCI Express errors are classified into two types: correctable errors
+and uncorrectable errors. This classification is based on the impacts
+of those errors, which may result in degraded performance or function
+failure.
+
+Correctable errors pose no impacts on the functionality of the
+interface. The PCI Express protocol can recover without any software
+intervention or any loss of data. These errors are detected and
+corrected by hardware. Unlike correctable errors, uncorrectable
+errors impact functionality of the interface. Uncorrectable errors
+can cause a particular transaction or a particular PCI Express link
+to be unreliable. Depending on those error conditions, uncorrectable
+errors are further classified into non-fatal errors and fatal errors.
+Non-fatal errors cause the particular transaction to be unreliable,
+but the PCI Express link itself is fully functional. Fatal errors, on
+the other hand, cause the link to be unreliable.
+
+When AER is enabled, a PCI Express device will automatically send an
+error message to the PCIE root port above it when the device captures
+an error. The Root Port, upon receiving an error reporting message,
+internally processes and logs the error message in its PCI Express
+capability structure. Error information being logged includes storing
+the error reporting agent's requestor ID into the Error Source
+Identification Registers and setting the error bits of the Root Error
+Status Register accordingly. If AER error reporting is enabled in Root
+Error Command Register, the Root Port generates an interrupt if an
+error is detected.
+
+Note that the errors as described above are related to the PCI Express
+hierarchy and links. These errors do not include any device specific
+errors because device specific errors will still get sent directly to
+the device driver.
+
+3.1 Configure the AER capability structure
+
+AER aware drivers of PCI Express component need change the device
+control registers to enable AER. They also could change AER registers,
+including mask and severity registers. Helper function
+pci_enable_pcie_error_reporting could be used to enable AER. See
+section 3.3.
+
+3.2. Provide callbacks
+
+3.2.1 callback reset_link to reset pci express link
+
+This callback is used to reset the pci express physical link when a
+fatal error happens. The root port aer service driver provides a
+default reset_link function, but different upstream ports might
+have different specifications to reset pci express link, so all
+upstream ports should provide their own reset_link functions.
+
+In struct pcie_port_service_driver, a new pointer, reset_link, is
+added.
+
+pci_ers_result_t (*reset_link) (struct pci_dev *dev);
+
+Section 3.2.2.2 provides more detailed info on when to call
+reset_link.
+
+3.2.2 PCI error-recovery callbacks
+
+The PCI Express AER Root driver uses error callbacks to coordinate
+with downstream device drivers associated with a hierarchy in question
+when performing error recovery actions.
+
+Data struct pci_driver has a pointer, err_handler, to point to
+pci_error_handlers who consists of a couple of callback function
+pointers. AER driver follows the rules defined in
+pci-error-recovery.txt except pci express specific parts (e.g.
+reset_link). Pls. refer to pci-error-recovery.txt for detailed
+definitions of the callbacks.
+
+Below sections specify when to call the error callback functions.
+
+3.2.2.1 Correctable errors
+
+Correctable errors pose no impacts on the functionality of
+the interface. The PCI Express protocol can recover without any
+software intervention or any loss of data. These errors do not
+require any recovery actions. The AER driver clears the device's
+correctable error status register accordingly and logs these errors.
+
+3.2.2.2 Non-correctable (non-fatal and fatal) errors
+
+If an error message indicates a non-fatal error, performing link reset
+at upstream is not required. The AER driver calls error_detected(dev,
+pci_channel_io_normal) to all drivers associated within a hierarchy in
+question. for example,
+EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort.
+If Upstream port A captures an AER error, the hierarchy consists of
+Downstream port B and EndPoint.
+
+A driver may return PCI_ERS_RESULT_CAN_RECOVER,
+PCI_ERS_RESULT_DISCONNECT, or PCI_ERS_RESULT_NEED_RESET, depending on
+whether it can recover or the AER driver calls mmio_enabled as next.
+
+If an error message indicates a fatal error, kernel will broadcast
+error_detected(dev, pci_channel_io_frozen) to all drivers within
+a hierarchy in question. Then, performing link reset at upstream is
+necessary. As different kinds of devices might use different approaches
+to reset link, AER port service driver is required to provide the
+function to reset link. Firstly, kernel looks for if the upstream
+component has an aer driver. If it has, kernel uses the reset_link
+callback of the aer driver. If the upstream component has no aer driver
+and the port is downstream port, we will use the aer driver of the
+root port who reports the AER error. As for upstream ports,
+they should provide their own aer service drivers with reset_link
+function. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER and
+reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
+to mmio_enabled.
+
+3.3 helper functions
+
+3.3.1 int pci_find_aer_capability(struct pci_dev *dev);
+pci_find_aer_capability locates the PCI Express AER capability
+in the device configuration space. If the device doesn't support
+PCI-Express AER, the function returns 0.
+
+3.3.2 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
+pci_enable_pcie_error_reporting enables the device to send error
+messages to root port when an error is detected. Note that devices
+don't enable the error reporting by default, so device drivers need
+call this function to enable it.
+
+3.3.3 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+pci_disable_pcie_error_reporting disables the device to send error
+messages to root port when an error is detected.
+
+3.3.4 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+pci_cleanup_aer_uncorrect_error_status cleanups the uncorrectable
+error status register.
+
+3.4 Frequent Asked Questions
+
+Q: What happens if a PCI Express device driver does not provide an
+error recovery handler (pci_driver->err_handler is equal to NULL)?
+
+A: The devices attached with the driver won't be recovered. If the
+error is fatal, kernel will print out warning messages. Please refer
+to section 3 for more information.
+
+Q: What happens if an upstream port service driver does not provide
+callback reset_link?
+
+A: Fatal error recovery will fail if the errors are reported by the
+upstream ports who are attached by the service driver.
+
+Q: How does this infrastructure deal with driver that is not PCI
+Express aware?
+
+A: This infrastructure calls the error callback functions of the
+driver when an error happens. But if the driver is not aware of
+PCI Express, the device might not report its own errors to root
+port.
+
+Q: What modifications will that driver need to make it compatible
+with the PCI Express AER Root driver?
+
+A: It could call the helper functions to enable AER in devices and
+cleanup uncorrectable status register. Pls. refer to section 3.3.
+
--- a/Documentation/power/devices.txt
+++ b/Documentation/power/devices.txt
--- a/Documentation/power/interface.txt
+++ b/Documentation/power/interface.txt
@@ -52,3 +52,18 @@ suspend image will be as small as possible.

 Reading from this file will display the current image size limit, which
 is set to 500 MB by default.
+
+/sys/power/pm_trace controls the code which saves the last PM event point in
+the RTC across reboots, so that you can debug a machine that just hangs
+during suspend (or more commonly, during resume).  Namely, the RTC is only
+used to save the last PM event point if this file contains '1'.  Initially it
+contains '0' which may be changed to '1' by writing a string representing a
+nonzero integer into it.
+
+To use this debugging feature you should attempt to suspend the machine, then
+reboot it and run
+
+	dmesg -s 1000000 | grep 'hash matches'
+
+CAUTION: Using it will cause your machine's real-time (CMOS) clock to be
+set to a random invalid time after a resume.
--- a/Documentation/sh/new-machine.txt
+++ b/Documentation/sh/new-machine.txt
@@ -41,11 +41,6 @@ Board-specific code:
        |
 	.. more boards here ...

-It should also be noted that each board is required to have some certain
-headers. At the time of this writing, io.h is the only thing that needs
-to be provided for each board, and can generally just reference generic
-functions (with the exception of isa_port2addr).
-
 Next, for companion chips:
 .
 `-- arch
@@ -104,12 +99,13 @@ and then populate that with sub-directories for each member of the family.
 Both the Solution Engine and the hp6xx boards are an example of this.

 After you have setup your new arch/sh/boards/ directory, remember that you
-also must add a directory in include/asm-sh for headers localized to this
-board. In order to interoperate seamlessly with the build system, it's best
-to have this directory the same as the arch/sh/boards/ directory name,
-though if your board is again part of a family, the build system has ways
-of dealing with this, and you can feel free to name the directory after
-the family member itself.
+should also add a directory in include/asm-sh for headers localized to this
+board (if there are going to be more than one). In order to interoperate
+seamlessly with the build system, it's best to have this directory the same
+as the arch/sh/boards/ directory name, though if your board is again part of
+a family, the build system has ways of dealing with this (via incdir-y
+overloading), and you can feel free to name the directory after the family
+member itself.

 There are a few things that each board is required to have, both in the
 arch/sh/boards and the include/asm-sh/ heirarchy. In order to better
@@ -122,6 +118,7 @@ might look something like:
 * arch/sh/boards/vapor/setup.c - Setup code for imaginary board
 */
 #include <linux/init.h>
+#include <asm/rtc.h> /* for board_time_init() */

 const char *get_system_type(void)
 {
@@ -152,79 +149,57 @@ int __init platform_setup(void)
 }

 Our new imaginary board will also have to tie into the machvec in order for it
-to be of any use. Currently the machvec is slowly on its way out, but is still
-required for the time being. As such, let us take a look at what needs to be
-done for the machvec assignment.
+to be of any use.

 machvec functions fall into a number of categories:

 - I/O functions to IO memory (inb etc) and PCI/main memory (readb etc).
- - I/O remapping functions (ioremap etc)
- - some initialisation functions
- - a 'heartbeat' function
- - some miscellaneous flags
+ - I/O mapping functions (ioport_map, ioport_unmap, etc).
+ - a 'heartbeat' function.
+ - PCI and IRQ initialization routines.
+ - Consistent allocators (for boards that need special allocators,
+   particularly for allocating out of some board-specific SRAM for DMA
+   handles).

-The tree can be built in two ways:
- - as a fully generic build. All drivers are linked in, and all functions
-   go through the machvec
- - as a machine specific build. In this case only the required drivers
-   will be linked in, and some macros may be redefined to not go through
-   the machvec where performance is important (in particular IO functions).
+There are machvec functions added and removed over time, so always be sure to
+consult include/asm-sh/machvec.h for the current state of the machvec.

-There are three ways in which IO can be performed:
- - none at all. This is really only useful for the 'unknown' machine type,
-   which us designed to run on a machine about which we know nothing, and
-   so all all IO instructions do nothing.
- - fully custom. In this case all IO functions go to a machine specific
-   set of functions which can do what they like
- - a generic set of functions. These will cope with most situations,
-   and rely on a single function, mv_port2addr, which is called through the
-   machine vector, and converts an IO address into a memory address, which
-   can be read from/written to directly.
+The kernel will automatically wrap in generic routines for undefined function
+pointers in the machvec at boot time, as machvec functions are referenced
+unconditionally throughout most of the tree. Some boards have incredibly
+sparse machvecs (such as the dreamcast and sh03), whereas others must define
+virtually everything (rts7751r2d).

-Thus adding a new machine involves the following steps (I will assume I am
-adding a machine called vapor):
+Adding a new machine is relatively trivial (using vapor as an example):

- - add a new file include/asm-sh/vapor/io.h which contains prototypes for
+If the board-specific definitions are quite minimalistic, as is the case for
+the vast majority of boards, simply having a single board-specific header is
+sufficient.
+
+ - add a new file include/asm-sh/vapor.h which contains prototypes for
   any machine specific IO functions prefixed with the machine name, for
   example vapor_inb. These will be needed when filling out the machine
   vector.

-   This is the minimum that is required, however there are ample
-   opportunities to optimise this. In particular, by making the prototypes
-   inline function definitions, it is possible to inline the function when
-   building machine specific versions. Note that the machine vector
-   functions will still be needed, so that a module built for a generic
-   setup can be loaded.
+   Note that these prototypes are generated automatically by setting
+   __IO_PREFIX to something sensible. A typical example would be:

- - add a new file arch/sh/boards/vapor/mach.c. This contains the definition
-   of the machine vector. When building the machine specific version, this
-   will be the real machine vector (via an alias), while in the generic
-   version is used to initialise the machine vector, and then freed, by
-   making it initdata. This should be defined as:
+	#define __IO_PREFIX vapor
+   	#include <asm/io_generic.h>

-     struct sh_machine_vector mv_vapor __initmv = {
-       .mv_name = "vapor",
-     }
-     ALIAS_MV(vapor)
+   somewhere in the board-specific header. Any boards being ported that still
+   have a legacy io.h should remove it entirely and switch to the new model.

- - finally add a file arch/sh/boards/vapor/io.c, which contains
-   definitions of the machine specific io functions.
+ - Add machine vector definitions to the board's setup.c. At a bare minimum,
+   this must be defined as something like:

-A note about initialisation functions. Three initialisation functions are
-provided in the machine vector:
- - mv_arch_init - called very early on from setup_arch
- - mv_init_irq - called from init_IRQ, after the generic SH interrupt
-   initialisation
- - mv_init_pci - currently not used
+	struct sh_machine_vector mv_vapor __initmv = {
+		.mv_name = "vapor",
+	};
+	ALIAS_MV(vapor)

-Any other remaining functions which need to be called at start up can be
-added to the list using the __initcalls macro (or module_init if the code
-can be built as a module). Many generic drivers probe to see if the device
-they are targeting is present, however this may not always be appropriate,
-so a flag can be added to the machine vector which will be set on those
-machines which have the hardware in question, reducing the probe to a
-single conditional.
+ - finally add a file arch/sh/boards/vapor/io.c, which contains definitions of
+   the machine specific io functions (if there are enough to warrant it).

 3. Hooking into the Build System
 ================================
@@ -303,4 +278,3 @@ which will in turn copy the defconfig for this board, run it through
 oldconfig (prompting you for any new options since the time of creation),
 and start you on your way to having a functional kernel for your new
 board.
-
--- a/Documentation/sh/register-banks.txt
+++ b/Documentation/sh/register-banks.txt
@@ -0,0 +1,33 @@
+	Notes on register bank usage in the kernel
+	==========================================
+
+Introduction
+------------
+
+The SH-3 and SH-4 CPU families traditionally include a single partial register
+bank (selected by SR.RB, only r0 ... r7 are banked), whereas other families
+may have more full-featured banking or simply no such capabilities at all.
+
+SR.RB banking
+-------------
+
+In the case of this type of banking, banked registers are mapped directly to
+r0 ... r7 if SR.RB is set to the bank we are interested in, otherwise ldc/stc
+can still be used to reference the banked registers (as r0_bank ... r7_bank)
+when in the context of another bank. The developer must keep the SR.RB value
+in mind when writing code that utilizes these banked registers, for obvious
+reasons. Userspace is also not able to poke at the bank1 values, so these can
+be used rather effectively as scratch registers by the kernel.
+
+Presently the kernel uses several of these registers.
+
+	- r0_bank, r1_bank (referenced as k0 and k1, used for scratch
+	  registers when doing exception handling).
+	- r2_bank (used to track the EXPEVT/INTEVT code)
+		- Used by do_IRQ() and friends for doing irq mapping based off
+		  of the interrupt exception vector jump table offset
+	- r6_bank (global interrupt mask)
+		- The SR.IMASK interrupt handler makes use of this to set the
+		  interrupt priority level (used by local_irq_enable())
+	- r7_bank (current)
+
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -29,6 +29,7 @@ Currently, these files are in /proc/sys/vm:
 - drop-caches
 - zone_reclaim_mode
 - min_unmapped_ratio
+- min_slab_ratio
 - panic_on_oom

 ==============================================================
@@ -138,7 +139,6 @@ This is value ORed together of
 1	= Zone reclaim on
 2	= Zone reclaim writes dirty pages out
 4	= Zone reclaim swaps pages
-8	= Also do a global slab reclaim pass

 zone_reclaim_mode is set during bootup to 1 if it is determined that pages
 from remote zones will cause a measurable performance reduction. The
@@ -162,18 +162,13 @@ Allowing regular swap effectively restricts allocations to the local
 node unless explicitly overridden by memory policies or cpuset
 configurations.

-It may be advisable to allow slab reclaim if the system makes heavy
-use of files and builds up large slab caches. However, the slab
-shrink operation is global, may take a long time and free slabs
-in all nodes of the system.
-
 =============================================================

 min_unmapped_ratio:

 This is available only on NUMA kernels.

-A percentage of the file backed pages in each zone.  Zone reclaim will only
+A percentage of the total pages in each zone.  Zone reclaim will only
 occur if more than this percentage of pages are file backed and unmapped.
 This is to insure that a minimal amount of local pages is still available for
 file I/O even if the node is overallocated.
@@ -182,6 +177,24 @@ The default is 1 percent.

 =============================================================

+min_slab_ratio:
+
+This is available only on NUMA kernels.
+
+A percentage of the total pages in each zone.  On Zone reclaim
+(fallback from the local zone occurs) slabs will be reclaimed if more
+than this percentage of pages in a zone are reclaimable slab pages.
+This insures that the slab growth stays under control even in NUMA
+systems that rarely perform global reclaim.
+
+The default is 5 percent.
+
+Note that slab reclaim is triggered in a per zone / node fashion.
+The process of reclaiming slab memory is currently not node specific
+and may not be fast.
+
+=============================================================
+
 panic_on_oom

 This enables or disables panic on out-of-memory feature.  If this is set to 1,
--- a/Show More
+++ b/Show More