Merge branch 'master' into for-linus

Conflicts:
	include/linux/percpu.h
	mm/percpu.c
Pekka Enberg
2010-10-24 19:57:05 +03:00
4957 changed files with 300129 additions and 179763 deletions
+4 -4
@@ -3554,12 +3554,12 @@ E: cvance@nai.com
D: portions of the Linux Security Module (LSM) framework and security modules
N: Petr Vandrovec
E: vandrove@vc.cvut.cz
E: petr@vandrovec.name
D: Small contributions to ncpfs
D: Matrox framebuffer driver
S: Chudenicka 8
S: 10200 Prague 10, Hostivar
S: Czech Republic
S: 21513 Conradia Ct
S: Cupertino, CA 95014
S: USA
N: Thibaut Varene
E: T-Bone@parisc-linux.org
+99
@@ -0,0 +1,99 @@
What: /sys/class/ata_...
Date: August 2008
Contact: Gwendal Grignou <gwendal@google.com>
Description:
Provide a place in sysfs for storing the ATA topology of the system. This allows
retrieving various information about ATA objects.
Files under /sys/class/ata_port
-------------------------------
For each port, a directory ataX is created where X is the ata_port_id of
the port. The device parent is the ata host device.
idle_irq (read)
Number of IRQs received by the port while idle [some ATA HBAs only].
nr_pmp_links (read)
If a SATA Port Multiplier (PM) is connected, the number of links behind it.
Files under /sys/class/ata_link
-------------------------------
Behind each port, there is an ata_link. If there is a SATA PM in the
topology, 15 ata_link objects are created.
If a link is behind a port, the directory name is linkX, where X is
ata_port_id of the port.
If a link is behind a PM, its name is linkX.Y where X is ata_port_id
of the parent port and Y the PM port.
hw_sata_spd_limit
Maximum speed supported by the connected SATA device.
sata_spd_limit
Maximum speed imposed by libata.
sata_spd
Current speed of the link [1.5, 3 Gbps, ...].
Files under /sys/class/ata_device
---------------------------------
Behind each link, up to two ATA devices are created.
The name of the directory is devX[.Y].Z where:
- X is ata_port_id of the port where the device is connected,
- Y the port of the PM if any, and
- Z the device id: for PATA, there are usually 2 devices [0,1],
only 1 for SATA.
class
Device class. Can be "ata" for disk, "atapi" for packet device,
"pmp" for PM, or "none" if no device was found behind the link.
dma_mode
Transfer modes supported by the device when in DMA mode.
Mostly used by PATA devices.
pio_mode
Transfer modes supported by the device when in PIO mode.
Mostly used by PATA devices.
xfer_mode
Current transfer mode.
id
Cached result of IDENTIFY command, as described in ATA8 7.16 and 7.17.
Only valid if the device is not a PM.
gscr
Cached result of the dump of PM GSCR register.
Valid registers are:
0: SATA_PMP_GSCR_PROD_ID,
1: SATA_PMP_GSCR_REV,
2: SATA_PMP_GSCR_PORT_INFO,
32: SATA_PMP_GSCR_ERROR,
33: SATA_PMP_GSCR_ERROR_EN,
64: SATA_PMP_GSCR_FEAT,
96: SATA_PMP_GSCR_FEAT_EN,
130: SATA_PMP_GSCR_SII_GPIO
Only valid if the device is a PM.
spdn_cnt
Number of times libata decided to lower the speed of the link due to errors.
ering
Formatted output of the error ring of the device.
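
The devX[.Y].Z directory naming described above can be decoded mechanically. The following is a minimal user-space sketch, not part of libata; the names `AtaDeviceName` and `parse_ata_device_name` are hypothetical and chosen only for illustration.

```python
import re
from typing import NamedTuple, Optional


class AtaDeviceName(NamedTuple):
    port_id: int             # X: ata_port_id of the port
    pmp_port: Optional[int]  # Y: port of the PM, None when no PM is involved
    device_id: int           # Z: device id behind the link


# devX.Z without a PM, devX.Y.Z when the device sits behind a PM port.
_DEV_RE = re.compile(r"^dev(\d+)(?:\.(\d+))?\.(\d+)$")


def parse_ata_device_name(name: str) -> AtaDeviceName:
    """Split a /sys/class/ata_device directory name into its components."""
    match = _DEV_RE.match(name)
    if match is None:
        raise ValueError("not an ata_device directory name: %r" % name)
    port, pmp, dev = match.groups()
    return AtaDeviceName(int(port), None if pmp is None else int(pmp), int(dev))
```

For example, `parse_ata_device_name("dev3.2.0")` describes device 0 behind PM port 2 of port 3, while `"dev1.0"` describes device 0 directly behind port 1.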
@@ -77,3 +77,91 @@ Description:
devices this attribute is set to "enabled" by bus type code or
device drivers and in that case it should be safe to leave the
default value.
What: /sys/devices/.../power/wakeup_count
Date: September 2010
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/wakeup_count attribute contains the number
of signaled wakeup events associated with the device. This
attribute is read-only. If the device is not enabled to wake up
the system from sleep states, this attribute is empty.
What: /sys/devices/.../power/wakeup_active_count
Date: September 2010
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/wakeup_active_count attribute contains the
number of times the processing of wakeup events associated with
the device was completed (at the kernel level). This attribute
is read-only. If the device is not enabled to wake up the
system from sleep states, this attribute is empty.
What: /sys/devices/.../power/wakeup_hit_count
Date: September 2010
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/wakeup_hit_count attribute contains the
number of times the processing of a wakeup event associated with
the device might prevent the system from entering a sleep state.
This attribute is read-only. If the device is not enabled to
wake up the system from sleep states, this attribute is empty.
What: /sys/devices/.../power/wakeup_active
Date: September 2010
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/wakeup_active attribute contains either 1,
or 0, depending on whether or not a wakeup event associated with
the device is being processed (1). This attribute is read-only.
If the device is not enabled to wake up the system from sleep
states, this attribute is empty.
What: /sys/devices/.../power/wakeup_total_time_ms
Date: September 2010
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/wakeup_total_time_ms attribute contains
the total time of processing wakeup events associated with the
device, in milliseconds. This attribute is read-only. If the
device is not enabled to wake up the system from sleep states,
this attribute is empty.
What: /sys/devices/.../power/wakeup_max_time_ms
Date: September 2010
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/wakeup_max_time_ms attribute contains
the maximum time of processing a single wakeup event associated
with the device, in milliseconds. This attribute is read-only.
If the device is not enabled to wake up the system from sleep
states, this attribute is empty.
What: /sys/devices/.../power/wakeup_last_time_ms
Date: September 2010
Contact: Rafael J. Wysocki <rjw@sisk.pl>
Description:
The /sys/devices/.../power/wakeup_last_time_ms attribute contains
the value of the monotonic clock corresponding to the time of
signaling the last wakeup event associated with the device, in
milliseconds. This attribute is read-only. If the device is
not enabled to wake up the system from sleep states, this
attribute is empty.
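
All of the wakeup attributes above follow the same convention: a read-only number, or an empty value when the device is not enabled to wake up the system. A minimal user-space sketch of a reader, assuming each attribute holds a single decimal integer; the helper names `parse_wakeup_attr` and `read_wakeup_stats` are hypothetical, and the directory passed in is whatever power/ directory the caller has located.

```python
from pathlib import Path
from typing import Dict, Optional

# Attribute names as listed in the entries above.
WAKEUP_ATTRS = (
    "wakeup_count",
    "wakeup_active_count",
    "wakeup_hit_count",
    "wakeup_active",
    "wakeup_total_time_ms",
    "wakeup_max_time_ms",
    "wakeup_last_time_ms",
)


def parse_wakeup_attr(text: str) -> Optional[int]:
    """Return the numeric value, or None for an empty attribute
    (device not enabled to wake up the system)."""
    text = text.strip()
    return int(text) if text else None


def read_wakeup_stats(power_dir) -> Dict[str, Optional[int]]:
    """Read every wakeup attribute present in a device's power/ directory.
    Attributes that do not exist at all are simply left out of the result."""
    stats = {}
    for name in WAKEUP_ATTRS:
        path = Path(power_dir) / name
        if path.is_file():
            stats[name] = parse_wakeup_attr(path.read_text())
    return stats
```

A caller would pass the power/ directory of one device under /sys/devices/... and treat a `None` value as "wakeup not enabled" rather than as zero events.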
What: /sys/devices/.../power/autosuspend_delay_ms
Date: September 2010
Contact: Alan Stern <stern@rowland.harvard.edu>
Description:
The /sys/devices/.../power/autosuspend_delay_ms attribute
contains the autosuspend delay value (in milliseconds). Some
drivers do not want their device to suspend as soon as it
becomes idle at run time; they want the device to remain
inactive for a certain minimum period of time first. That
period is called the autosuspend delay. Negative values will
prevent the device from being suspended at run time (similar
to writing "on" to the power/control attribute). Values >=
1000 will cause the autosuspend timer expiration to be rounded
up to the nearest second.
Not all drivers support this attribute. If it isn't supported,
attempts to read or write it will yield I/O errors.
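
The value ranges described above can be summarised in a small model. This is a simplified sketch that takes the text literally and works in milliseconds; the kernel itself works in jiffies, and the function name `autosuspend_expiry_ms` and its `idle_since_ms` parameter are hypothetical.

```python
from typing import Optional


def autosuspend_expiry_ms(idle_since_ms: int, delay_ms: int) -> Optional[int]:
    """Model when the autosuspend timer would expire.

    Returns None when the delay is negative, i.e. the device is never
    suspended at run time.
    """
    if delay_ms < 0:
        return None  # negative delay: runtime suspend is prevented
    expires = idle_since_ms + delay_ms
    if delay_ms >= 1000:
        # Delays of one second or more: round the expiration up to the
        # next whole second (ceiling division on integers).
        expires = -(-expires // 1000) * 1000
    return expires
```

With this model, a device idle since t = 10250 ms and a delay of 500 ms expires at 10750 ms, whereas a delay of 2000 ms expires at 13000 ms instead of 12250 ms.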
+12
@@ -0,0 +1,12 @@
What: /sys/module/pch_phub/drivers/.../pch_mac
Date: August 2010
KernelVersion: 2.6.35
Contact: masa-korg@dsn.okisemi.com
Description: Write/read GbE MAC address.
What: /sys/module/pch_phub/drivers/.../pch_firmware
Date: August 2010
KernelVersion: 2.6.35
Contact: masa-korg@dsn.okisemi.com
Description: Write/read Option ROM data.
+29
@@ -99,9 +99,38 @@ Description:
dmesg -s 1000000 | grep 'hash matches'
If you do not get any matches (or they appear to be false
positives), it is possible that the last PM event point
referred to a device created by a loadable kernel module. In
this case cat /sys/power/pm_trace_dev_match (see below) after
your system is started up and the kernel modules are loaded.
CAUTION: Using it will cause your machine's real-time (CMOS)
clock to be set to a random invalid time after a resume.
What: /sys/power/pm_trace_dev_match
Date: October 2010
Contact: James Hogan <james@albanarts.com>
Description:
The /sys/power/pm_trace_dev_match file contains the name of the
device associated with the last PM event point saved in the RTC
across reboots when pm_trace has been used. More precisely it
contains the list of current devices (including those
registered by loadable kernel modules since boot) which match
the device hash in the RTC at boot, with a newline after each
one.
The advantage of this file over the hash matches printed to the
kernel log (see /sys/power/pm_trace), is that it includes
devices created after boot by loadable kernel modules.
Due to the small hash size necessary to fit in the RTC, it is
possible that more than one device matches the hash, in which
case further investigation is required to determine which
device is causing the problem. Note that genuine RTC clock
values (such as when pm_trace has not been used), can still
match a device and output its name here.
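
Because the file holds one device name per line and may list more than one device, a script consuming it has to handle zero, one, or several matches. A minimal sketch, assuming only the newline-separated format described above; the function names and the device names used in the usage note are hypothetical.

```python
from typing import List


def parse_dev_match(text: str) -> List[str]:
    """Split the contents of pm_trace_dev_match into device names."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def describe_matches(names: List[str]) -> str:
    """Summarise what the match list tells us about the failing device."""
    if not names:
        return "no current device matches the saved hash"
    if len(names) == 1:
        return "single candidate: " + names[0]
    # Several devices share the small hash: further investigation needed.
    return "ambiguous, %d candidates: %s" % (len(names), ", ".join(names))
```

For instance, contents of `"usb1\nusb2\n"` yield two candidates. Even a single candidate is only a hint, since a genuine RTC value can match a device by coincidence.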
What: /sys/power/pm_async
Date: January 2009
Contact: Rafael J. Wysocki <rjw@sisk.pl>
+495
@@ -0,0 +1,495 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE set PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
<set>
<setinfo>
<title>The 802.11 subsystems &ndash; for kernel developers</title>
<subtitle>
Explaining wireless 802.11 networking in the Linux kernel
</subtitle>
<copyright>
<year>2007-2009</year>
<holder>Johannes Berg</holder>
</copyright>
<authorgroup>
<author>
<firstname>Johannes</firstname>
<surname>Berg</surname>
<affiliation>
<address><email>johannes@sipsolutions.net</email></address>
</affiliation>
</author>
</authorgroup>
<legalnotice>
<para>
This documentation is free software; you can redistribute
it and/or modify it under the terms of the GNU General Public
License version 2 as published by the Free Software Foundation.
</para>
<para>
This documentation is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
</para>
<para>
You should have received a copy of the GNU General Public
License along with this documentation; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA 02111-1307 USA
</para>
<para>
For more details see the file COPYING in the source
distribution of Linux.
</para>
</legalnotice>
<abstract>
<para>
These books attempt to give a description of the
various subsystems that play a role in 802.11 wireless
networking in Linux. Since these books are for kernel
developers they attempt to document the structures
and functions used in the kernel as well as giving a
higher-level overview.
</para>
<para>
The reader is expected to be familiar with the 802.11
standard as published by the IEEE in 802.11-2007 (or
possibly later versions). References to this standard
will be given as "802.11-2007 8.1.5".
</para>
</abstract>
</setinfo>
<book id="cfg80211-developers-guide">
<bookinfo>
<title>The cfg80211 subsystem</title>
<abstract>
!Pinclude/net/cfg80211.h Introduction
</abstract>
</bookinfo>
<chapter>
<title>Device registration</title>
!Pinclude/net/cfg80211.h Device registration
!Finclude/net/cfg80211.h ieee80211_band
!Finclude/net/cfg80211.h ieee80211_channel_flags
!Finclude/net/cfg80211.h ieee80211_channel
!Finclude/net/cfg80211.h ieee80211_rate_flags
!Finclude/net/cfg80211.h ieee80211_rate
!Finclude/net/cfg80211.h ieee80211_sta_ht_cap
!Finclude/net/cfg80211.h ieee80211_supported_band
!Finclude/net/cfg80211.h cfg80211_signal_type
!Finclude/net/cfg80211.h wiphy_params_flags
!Finclude/net/cfg80211.h wiphy_flags
!Finclude/net/cfg80211.h wiphy
!Finclude/net/cfg80211.h wireless_dev
!Finclude/net/cfg80211.h wiphy_new
!Finclude/net/cfg80211.h wiphy_register
!Finclude/net/cfg80211.h wiphy_unregister
!Finclude/net/cfg80211.h wiphy_free
!Finclude/net/cfg80211.h wiphy_name
!Finclude/net/cfg80211.h wiphy_dev
!Finclude/net/cfg80211.h wiphy_priv
!Finclude/net/cfg80211.h priv_to_wiphy
!Finclude/net/cfg80211.h set_wiphy_dev
!Finclude/net/cfg80211.h wdev_priv
</chapter>
<chapter>
<title>Actions and configuration</title>
!Pinclude/net/cfg80211.h Actions and configuration
!Finclude/net/cfg80211.h cfg80211_ops
!Finclude/net/cfg80211.h vif_params
!Finclude/net/cfg80211.h key_params
!Finclude/net/cfg80211.h survey_info_flags
!Finclude/net/cfg80211.h survey_info
!Finclude/net/cfg80211.h beacon_parameters
!Finclude/net/cfg80211.h plink_actions
!Finclude/net/cfg80211.h station_parameters
!Finclude/net/cfg80211.h station_info_flags
!Finclude/net/cfg80211.h rate_info_flags
!Finclude/net/cfg80211.h rate_info
!Finclude/net/cfg80211.h station_info
!Finclude/net/cfg80211.h monitor_flags
!Finclude/net/cfg80211.h mpath_info_flags
!Finclude/net/cfg80211.h mpath_info
!Finclude/net/cfg80211.h bss_parameters
!Finclude/net/cfg80211.h ieee80211_txq_params
!Finclude/net/cfg80211.h cfg80211_crypto_settings
!Finclude/net/cfg80211.h cfg80211_auth_request
!Finclude/net/cfg80211.h cfg80211_assoc_request
!Finclude/net/cfg80211.h cfg80211_deauth_request
!Finclude/net/cfg80211.h cfg80211_disassoc_request
!Finclude/net/cfg80211.h cfg80211_ibss_params
!Finclude/net/cfg80211.h cfg80211_connect_params
!Finclude/net/cfg80211.h cfg80211_pmksa
!Finclude/net/cfg80211.h cfg80211_send_rx_auth
!Finclude/net/cfg80211.h cfg80211_send_auth_timeout
!Finclude/net/cfg80211.h __cfg80211_auth_canceled
!Finclude/net/cfg80211.h cfg80211_send_rx_assoc
!Finclude/net/cfg80211.h cfg80211_send_assoc_timeout
!Finclude/net/cfg80211.h cfg80211_send_deauth
!Finclude/net/cfg80211.h __cfg80211_send_deauth
!Finclude/net/cfg80211.h cfg80211_send_disassoc
!Finclude/net/cfg80211.h __cfg80211_send_disassoc
!Finclude/net/cfg80211.h cfg80211_ibss_joined
!Finclude/net/cfg80211.h cfg80211_connect_result
!Finclude/net/cfg80211.h cfg80211_roamed
!Finclude/net/cfg80211.h cfg80211_disconnected
!Finclude/net/cfg80211.h cfg80211_ready_on_channel
!Finclude/net/cfg80211.h cfg80211_remain_on_channel_expired
!Finclude/net/cfg80211.h cfg80211_new_sta
!Finclude/net/cfg80211.h cfg80211_rx_mgmt
!Finclude/net/cfg80211.h cfg80211_mgmt_tx_status
!Finclude/net/cfg80211.h cfg80211_cqm_rssi_notify
!Finclude/net/cfg80211.h cfg80211_michael_mic_failure
</chapter>
<chapter>
<title>Scanning and BSS list handling</title>
!Pinclude/net/cfg80211.h Scanning and BSS list handling
!Finclude/net/cfg80211.h cfg80211_ssid
!Finclude/net/cfg80211.h cfg80211_scan_request
!Finclude/net/cfg80211.h cfg80211_scan_done
!Finclude/net/cfg80211.h cfg80211_bss
!Finclude/net/cfg80211.h cfg80211_inform_bss_frame
!Finclude/net/cfg80211.h cfg80211_inform_bss
!Finclude/net/cfg80211.h cfg80211_unlink_bss
!Finclude/net/cfg80211.h cfg80211_find_ie
!Finclude/net/cfg80211.h ieee80211_bss_get_ie
</chapter>
<chapter>
<title>Utility functions</title>
!Pinclude/net/cfg80211.h Utility functions
!Finclude/net/cfg80211.h ieee80211_channel_to_frequency
!Finclude/net/cfg80211.h ieee80211_frequency_to_channel
!Finclude/net/cfg80211.h ieee80211_get_channel
!Finclude/net/cfg80211.h ieee80211_get_response_rate
!Finclude/net/cfg80211.h ieee80211_hdrlen
!Finclude/net/cfg80211.h ieee80211_get_hdrlen_from_skb
!Finclude/net/cfg80211.h ieee80211_radiotap_iterator
</chapter>
<chapter>
<title>Data path helpers</title>
!Pinclude/net/cfg80211.h Data path helpers
!Finclude/net/cfg80211.h ieee80211_data_to_8023
!Finclude/net/cfg80211.h ieee80211_data_from_8023
!Finclude/net/cfg80211.h ieee80211_amsdu_to_8023s
!Finclude/net/cfg80211.h cfg80211_classify8021d
</chapter>
<chapter>
<title>Regulatory enforcement infrastructure</title>
!Pinclude/net/cfg80211.h Regulatory enforcement infrastructure
!Finclude/net/cfg80211.h regulatory_hint
!Finclude/net/cfg80211.h wiphy_apply_custom_regulatory
!Finclude/net/cfg80211.h freq_reg_info
</chapter>
<chapter>
<title>RFkill integration</title>
!Pinclude/net/cfg80211.h RFkill integration
!Finclude/net/cfg80211.h wiphy_rfkill_set_hw_state
!Finclude/net/cfg80211.h wiphy_rfkill_start_polling
!Finclude/net/cfg80211.h wiphy_rfkill_stop_polling
</chapter>
<chapter>
<title>Test mode</title>
!Pinclude/net/cfg80211.h Test mode
!Finclude/net/cfg80211.h cfg80211_testmode_alloc_reply_skb
!Finclude/net/cfg80211.h cfg80211_testmode_reply
!Finclude/net/cfg80211.h cfg80211_testmode_alloc_event_skb
!Finclude/net/cfg80211.h cfg80211_testmode_event
</chapter>
</book>
<book id="mac80211-developers-guide">
<bookinfo>
<title>The mac80211 subsystem</title>
<abstract>
!Pinclude/net/mac80211.h Introduction
!Pinclude/net/mac80211.h Warning
</abstract>
</bookinfo>
<toc></toc>
<!--
Generally, this document shall be ordered by increasing complexity.
It is important to note that readers should be able to read only
the first few sections to get a working driver and only advanced
usage should require reading the full document.
-->
<part>
<title>The basic mac80211 driver interface</title>
<partintro>
<para>
You should read and understand the information contained
within this part of the book while implementing a driver.
In some chapters, advanced usage is noted, that may be
skipped at first.
</para>
<para>
This part of the book only covers station and monitor mode
functionality, additional information required to implement
the other modes is covered in the second part of the book.
</para>
</partintro>
<chapter id="basics">
<title>Basic hardware handling</title>
<para>TBD</para>
<para>
This chapter shall contain information on getting a hw
struct allocated and registered with mac80211.
</para>
<para>
Since it is required to allocate rates/modes before registering
a hw struct, this chapter shall also contain information on setting
up the rate/mode structs.
</para>
<para>
Additionally, some discussion about the callbacks and
the general programming model should be in here, including
the definition of ieee80211_ops which will be referred to
a lot.
</para>
<para>
Finally, a discussion of hardware capabilities should be done
with references to other parts of the book.
</para>
<!-- intentionally multiple !F lines to get proper order -->
!Finclude/net/mac80211.h ieee80211_hw
!Finclude/net/mac80211.h ieee80211_hw_flags
!Finclude/net/mac80211.h SET_IEEE80211_DEV
!Finclude/net/mac80211.h SET_IEEE80211_PERM_ADDR
!Finclude/net/mac80211.h ieee80211_ops
!Finclude/net/mac80211.h ieee80211_alloc_hw
!Finclude/net/mac80211.h ieee80211_register_hw
!Finclude/net/mac80211.h ieee80211_get_tx_led_name
!Finclude/net/mac80211.h ieee80211_get_rx_led_name
!Finclude/net/mac80211.h ieee80211_get_assoc_led_name
!Finclude/net/mac80211.h ieee80211_get_radio_led_name
!Finclude/net/mac80211.h ieee80211_unregister_hw
!Finclude/net/mac80211.h ieee80211_free_hw
</chapter>
<chapter id="phy-handling">
<title>PHY configuration</title>
<para>TBD</para>
<para>
This chapter should describe PHY handling including
start/stop callbacks and the various structures used.
</para>
!Finclude/net/mac80211.h ieee80211_conf
!Finclude/net/mac80211.h ieee80211_conf_flags
</chapter>
<chapter id="iface-handling">
<title>Virtual interfaces</title>
<para>TBD</para>
<para>
This chapter should describe virtual interface basics
that are relevant to the driver (VLANs, MGMT etc are not.)
It should explain the use of the add_iface/remove_iface
callbacks as well as the interface configuration callbacks.
</para>
<para>Things related to AP mode should be discussed there.</para>
<para>
Things related to supporting multiple interfaces should be
in the appropriate chapter, a BIG FAT note should be here about
this though and the recommendation to allow only a single
interface in STA mode at first!
</para>
!Finclude/net/mac80211.h ieee80211_vif
</chapter>
<chapter id="rx-tx">
<title>Receive and transmit processing</title>
<sect1>
<title>what should be here</title>
<para>TBD</para>
<para>
This should describe the receive and transmit
paths in mac80211/the drivers as well as
transmit status handling.
</para>
</sect1>
<sect1>
<title>Frame format</title>
!Pinclude/net/mac80211.h Frame format
</sect1>
<sect1>
<title>Packet alignment</title>
!Pnet/mac80211/rx.c Packet alignment
</sect1>
<sect1>
<title>Calling into mac80211 from interrupts</title>
!Pinclude/net/mac80211.h Calling mac80211 from interrupts
</sect1>
<sect1>
<title>functions/definitions</title>
!Finclude/net/mac80211.h ieee80211_rx_status
!Finclude/net/mac80211.h mac80211_rx_flags
!Finclude/net/mac80211.h ieee80211_tx_info
!Finclude/net/mac80211.h ieee80211_rx
!Finclude/net/mac80211.h ieee80211_rx_irqsafe
!Finclude/net/mac80211.h ieee80211_tx_status
!Finclude/net/mac80211.h ieee80211_tx_status_irqsafe
!Finclude/net/mac80211.h ieee80211_rts_get
!Finclude/net/mac80211.h ieee80211_rts_duration
!Finclude/net/mac80211.h ieee80211_ctstoself_get
!Finclude/net/mac80211.h ieee80211_ctstoself_duration
!Finclude/net/mac80211.h ieee80211_generic_frame_duration
!Finclude/net/mac80211.h ieee80211_wake_queue
!Finclude/net/mac80211.h ieee80211_stop_queue
!Finclude/net/mac80211.h ieee80211_wake_queues
!Finclude/net/mac80211.h ieee80211_stop_queues
</sect1>
</chapter>
<chapter id="filters">
<title>Frame filtering</title>
!Pinclude/net/mac80211.h Frame filtering
!Finclude/net/mac80211.h ieee80211_filter_flags
</chapter>
</part>
<part id="advanced">
<title>Advanced driver interface</title>
<partintro>
<para>
Information contained within this part of the book is
of interest only for advanced interaction of mac80211
with drivers to exploit more hardware capabilities and
improve performance.
</para>
</partintro>
<chapter id="hardware-crypto-offload">
<title>Hardware crypto acceleration</title>
!Pinclude/net/mac80211.h Hardware crypto acceleration
<!-- intentionally multiple !F lines to get proper order -->
!Finclude/net/mac80211.h set_key_cmd
!Finclude/net/mac80211.h ieee80211_key_conf
!Finclude/net/mac80211.h ieee80211_key_flags
</chapter>
<chapter id="powersave">
<title>Powersave support</title>
!Pinclude/net/mac80211.h Powersave support
</chapter>
<chapter id="beacon-filter">
<title>Beacon filter support</title>
!Pinclude/net/mac80211.h Beacon filter support
!Finclude/net/mac80211.h ieee80211_beacon_loss
</chapter>
<chapter id="qos">
<title>Multiple queues and QoS support</title>
<para>TBD</para>
!Finclude/net/mac80211.h ieee80211_tx_queue_params
</chapter>
<chapter id="AP">
<title>Access point mode support</title>
<para>TBD</para>
<para>Some parts of the if_conf should be discussed here instead</para>
<para>
Insert notes about VLAN interfaces with hw crypto here or
in the hw crypto chapter.
</para>
!Finclude/net/mac80211.h ieee80211_get_buffered_bc
!Finclude/net/mac80211.h ieee80211_beacon_get
</chapter>
<chapter id="multi-iface">
<title>Supporting multiple virtual interfaces</title>
<para>TBD</para>
<para>
Note: WDS with identical MAC address should almost always be OK
</para>
<para>
Insert notes about having multiple virtual interfaces with
different MAC addresses here, note which configurations are
supported by mac80211, add notes about supporting hw crypto
with it.
</para>
</chapter>
<chapter id="hardware-scan-offload">
<title>Hardware scan offload</title>
<para>TBD</para>
!Finclude/net/mac80211.h ieee80211_scan_completed
</chapter>
</part>
<part id="rate-control">
<title>Rate control interface</title>
<partintro>
<para>TBD</para>
<para>
This part of the book describes the rate control algorithm
interface and how it relates to mac80211 and drivers.
</para>
</partintro>
<chapter id="dummy">
<title>dummy chapter</title>
<para>TBD</para>
</chapter>
</part>
<part id="internal">
<title>Internals</title>
<partintro>
<para>TBD</para>
<para>
This part of the book describes mac80211 internals.
</para>
</partintro>
<chapter id="key-handling">
<title>Key handling</title>
<sect1>
<title>Key handling basics</title>
!Pnet/mac80211/key.c Key handling basics
</sect1>
<sect1>
<title>MORE TBD</title>
<para>TBD</para>
</sect1>
</chapter>
<chapter id="rx-processing">
<title>Receive processing</title>
<para>TBD</para>
</chapter>
<chapter id="tx-processing">
<title>Transmit processing</title>
<para>TBD</para>
</chapter>
<chapter id="sta-info">
<title>Station info handling</title>
<sect1>
<title>Programming information</title>
!Fnet/mac80211/sta_info.h sta_info
!Fnet/mac80211/sta_info.h ieee80211_sta_info_flags
</sect1>
<sect1>
<title>STA information lifetime rules</title>
!Pnet/mac80211/sta_info.c STA information lifetime rules
</sect1>
</chapter>
<chapter id="synchronisation">
<title>Synchronisation</title>
<para>TBD</para>
<para>Locking, lots of RCU</para>
</chapter>
</part>
</book>
</set>
+1 -1
@@ -12,7 +12,7 @@ DOCBOOKS := z8530book.xml mcabook.xml device-drivers.xml \
kernel-api.xml filesystems.xml lsm.xml usb.xml kgdb.xml \
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
mac80211.xml debugobjects.xml sh.xml regulator.xml \
80211.xml debugobjects.xml sh.xml regulator.xml \
alsa-driver-api.xml writing-an-alsa-driver.xml \
tracepoint.xml media.xml drm.xml
+1
@@ -136,6 +136,7 @@
#ifdef CONFIG_COMPAT
.compat_ioctl = i915_compat_ioctl,
#endif
.llseek = noop_llseek,
},
.pci_driver = {
.name = DRIVER_NAME,
+52 -32
@@ -28,7 +28,7 @@
</authorgroup>
<copyright>
<year>2005-2006</year>
<year>2005-2010</year>
<holder>Thomas Gleixner</holder>
</copyright>
<copyright>
@@ -100,6 +100,10 @@
<listitem><para>Edge type</para></listitem>
<listitem><para>Simple type</para></listitem>
</itemizedlist>
During the implementation we identified another type:
<itemizedlist>
<listitem><para>Fast EOI type</para></listitem>
</itemizedlist>
In the SMP world of the __do_IRQ() super-handler another type
was identified:
<itemizedlist>
@@ -153,6 +157,7 @@
is still available. This leads to a kind of duality for the time
being. Over time the new model should be used in more and more
architectures, as it enables smaller and cleaner IRQ subsystems.
It has been deprecated for three years now and is about to be removed.
</para>
</chapter>
<chapter id="bugs">
@@ -217,6 +222,7 @@
<itemizedlist>
<listitem><para>handle_level_irq</para></listitem>
<listitem><para>handle_edge_irq</para></listitem>
<listitem><para>handle_fasteoi_irq</para></listitem>
<listitem><para>handle_simple_irq</para></listitem>
<listitem><para>handle_percpu_irq</para></listitem>
</itemizedlist>
@@ -233,33 +239,33 @@
are used by the default flow implementations.
The following helper functions are implemented (simplified excerpt):
<programlisting>
default_enable(irq)
default_enable(struct irq_data *data)
{
desc->chip->unmask(irq);
desc->chip->irq_unmask(data);
}
default_disable(irq)
default_disable(struct irq_data *data)
{
if (!delay_disable(irq))
desc->chip->mask(irq);
if (!delay_disable(data))
desc->chip->irq_mask(data);
}
default_ack(irq)
default_ack(struct irq_data *data)
{
chip->ack(irq);
chip->irq_ack(data);
}
default_mask_ack(irq)
default_mask_ack(struct irq_data *data)
{
if (chip->mask_ack) {
chip->mask_ack(irq);
if (chip->irq_mask_ack) {
chip->irq_mask_ack(data);
} else {
chip->mask(irq);
chip->ack(irq);
chip->irq_mask(data);
chip->irq_ack(data);
}
}
noop(irq)
noop(struct irq_data *data)
{
}
@@ -278,12 +284,27 @@ noop(irq)
<para>
The following control flow is implemented (simplified excerpt):
<programlisting>
desc->chip->start();
desc->chip->irq_mask();
handle_IRQ_event(desc->action);
desc->chip->end();
desc->chip->irq_unmask();
</programlisting>
</para>
</sect3>
</sect3>
<sect3 id="Default_FASTEOI_IRQ_flow_handler">
<title>Default Fast EOI IRQ flow handler</title>
<para>
handle_fasteoi_irq provides a generic implementation
for interrupts which only need an EOI at the end of
the handler.
</para>
<para>
The following control flow is implemented (simplified excerpt):
<programlisting>
handle_IRQ_event(desc->action);
desc->chip->irq_eoi();
</programlisting>
</para>
</sect3>
<sect3 id="Default_Edge_IRQ_flow_handler">
<title>Default Edge IRQ flow handler</title>
<para>
@@ -294,20 +315,19 @@ desc->chip->end();
The following control flow is implemented (simplified excerpt):
<programlisting>
if (desc->status &amp; running) {
desc->chip->hold();
desc->chip->irq_mask();
desc->status |= pending | masked;
return;
}
desc->chip->start();
desc->chip->irq_ack();
desc->status |= running;
do {
if (desc->status &amp; masked)
desc->chip->enable();
desc->chip->irq_unmask();
desc->status &amp;= ~pending;
handle_IRQ_event(desc->action);
} while (desc->status &amp; pending);
desc->status &amp;= ~running;
desc->chip->end();
</programlisting>
</para>
</sect3>
@@ -342,9 +362,9 @@ handle_IRQ_event(desc->action);
<para>
The following control flow is implemented (simplified excerpt):
<programlisting>
desc->chip->start();
handle_IRQ_event(desc->action);
desc->chip->end();
if (desc->chip->irq_eoi)
desc->chip->irq_eoi();
</programlisting>
</para>
</sect3>
@@ -375,8 +395,7 @@ desc->chip->end();
mechanism. (It's necessary to enable CONFIG_HARDIRQS_SW_RESEND when
you want to use the delayed interrupt disable feature and your
hardware is not capable of retriggering an interrupt.)
The delayed interrupt disable can be runtime enabled, per interrupt,
by setting the IRQ_DELAYED_DISABLE flag in the irq_desc status field.
The delayed interrupt disable is not configurable.
</para>
</sect2>
</sect1>
@@ -387,13 +406,13 @@ desc->chip->end();
contains all the direct chip relevant functions, which
can be utilized by the irq flow implementations.
<itemizedlist>
<listitem><para>ack()</para></listitem>
<listitem><para>mask_ack() - Optional, recommended for performance</para></listitem>
<listitem><para>mask()</para></listitem>
<listitem><para>unmask()</para></listitem>
<listitem><para>retrigger() - Optional</para></listitem>
<listitem><para>set_type() - Optional</para></listitem>
<listitem><para>set_wake() - Optional</para></listitem>
<listitem><para>irq_ack()</para></listitem>
<listitem><para>irq_mask_ack() - Optional, recommended for performance</para></listitem>
<listitem><para>irq_mask()</para></listitem>
<listitem><para>irq_unmask()</para></listitem>
<listitem><para>irq_retrigger() - Optional</para></listitem>
<listitem><para>irq_set_type() - Optional</para></listitem>
<listitem><para>irq_set_wake() - Optional</para></listitem>
</itemizedlist>
These primitives are strictly intended to mean what they say: ack means
ACK, masking means masking of an IRQ line, etc. It is up to the flow
@@ -458,6 +477,7 @@ desc->chip->end();
<para>
This chapter contains the autogenerated documentation of the internal functions.
</para>
!Ikernel/irq/irqdesc.c
!Ikernel/irq/handle.c
!Ikernel/irq/chip.c
</chapter>
+2 -1
@@ -257,7 +257,8 @@ X!Earch/x86/kernel/mca_32.c
!Iblock/blk-sysfs.c
!Eblock/blk-settings.c
!Eblock/blk-exec.c
!Eblock/blk-barrier.c
!Eblock/blk-flush.c
!Eblock/blk-lib.c
!Eblock/blk-tag.c
!Iblock/blk-tag.c
!Eblock/blk-integrity.c
+4 -10
@@ -1645,7 +1645,9 @@ the amount of locking which needs to be done.
all the readers who were traversing the list when we deleted the
element are finished. We use <function>call_rcu()</function> to
register a callback which will actually destroy the object once
the readers are finished.
all pre-existing readers are finished. Alternatively,
<function>synchronize_rcu()</function> may be used to block until
all pre-existing readers are finished.
</para>
<para>
But how does Read Copy Update know when the readers are
@@ -1714,7 +1716,7 @@ the amount of locking which needs to be done.
- object_put(obj);
+ list_del_rcu(&amp;obj-&gt;list);
cache_num--;
+ call_rcu(&amp;obj-&gt;rcu, cache_delete_rcu, obj);
+ call_rcu(&amp;obj-&gt;rcu, cache_delete_rcu);
}
/* Must be holding cache_lock */
@@ -1725,14 +1727,6 @@ the amount of locking which needs to be done.
if (++cache_num > MAX_CACHE_SIZE) {
struct object *i, *outcast = NULL;
list_for_each_entry(i, &amp;cache, list) {
@@ -85,6 +94,7 @@
obj-&gt;popularity = 0;
atomic_set(&amp;obj-&gt;refcnt, 1); /* The cache holds a reference */
spin_lock_init(&amp;obj-&gt;lock);
+ INIT_RCU_HEAD(&amp;obj-&gt;rcu);
spin_lock_irqsave(&amp;cache_lock, flags);
__cache_add(obj);
@@ -104,12 +114,11 @@
struct object *cache_find(int id)
{
-337
@@ -1,337 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
<book id="mac80211-developers-guide">
<bookinfo>
<title>The mac80211 subsystem for kernel developers</title>
<authorgroup>
<author>
<firstname>Johannes</firstname>
<surname>Berg</surname>
<affiliation>
<address><email>johannes@sipsolutions.net</email></address>
</affiliation>
</author>
</authorgroup>
<copyright>
<year>2007-2009</year>
<holder>Johannes Berg</holder>
</copyright>
<legalnotice>
<para>
This documentation is free software; you can redistribute
it and/or modify it under the terms of the GNU General Public
License version 2 as published by the Free Software Foundation.
</para>
<para>
This documentation is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
</para>
<para>
You should have received a copy of the GNU General Public
License along with this documentation; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA 02111-1307 USA
</para>
<para>
For more details see the file COPYING in the source
distribution of Linux.
</para>
</legalnotice>
<abstract>
!Pinclude/net/mac80211.h Introduction
!Pinclude/net/mac80211.h Warning
</abstract>
</bookinfo>
<toc></toc>
<!--
Generally, this document shall be ordered by increasing complexity.
It is important to note that readers should be able to read only
the first few sections to get a working driver and only advanced
usage should require reading the full document.
-->
<part>
<title>The basic mac80211 driver interface</title>
<partintro>
<para>
You should read and understand the information contained
within this part of the book while implementing a driver.
In some chapters, advanced usage is noted, that may be
skipped at first.
</para>
<para>
This part of the book only covers station and monitor mode
functionality, additional information required to implement
the other modes is covered in the second part of the book.
</para>
</partintro>
<chapter id="basics">
<title>Basic hardware handling</title>
<para>TBD</para>
<para>
This chapter shall contain information on getting a hw
struct allocated and registered with mac80211.
</para>
<para>
Since it is required to allocate rates/modes before registering
a hw struct, this chapter shall also contain information on setting
up the rate/mode structs.
</para>
<para>
Additionally, some discussion about the callbacks and
the general programming model should be in here, including
the definition of ieee80211_ops which will be referred to
a lot.
</para>
<para>
Finally, a discussion of hardware capabilities should be done
with references to other parts of the book.
</para>
<!-- intentionally multiple !F lines to get proper order -->
!Finclude/net/mac80211.h ieee80211_hw
!Finclude/net/mac80211.h ieee80211_hw_flags
!Finclude/net/mac80211.h SET_IEEE80211_DEV
!Finclude/net/mac80211.h SET_IEEE80211_PERM_ADDR
!Finclude/net/mac80211.h ieee80211_ops
!Finclude/net/mac80211.h ieee80211_alloc_hw
!Finclude/net/mac80211.h ieee80211_register_hw
!Finclude/net/mac80211.h ieee80211_get_tx_led_name
!Finclude/net/mac80211.h ieee80211_get_rx_led_name
!Finclude/net/mac80211.h ieee80211_get_assoc_led_name
!Finclude/net/mac80211.h ieee80211_get_radio_led_name
!Finclude/net/mac80211.h ieee80211_unregister_hw
!Finclude/net/mac80211.h ieee80211_free_hw
</chapter>
<chapter id="phy-handling">
<title>PHY configuration</title>
<para>TBD</para>
<para>
This chapter should describe PHY handling including
start/stop callbacks and the various structures used.
</para>
!Finclude/net/mac80211.h ieee80211_conf
!Finclude/net/mac80211.h ieee80211_conf_flags
</chapter>
<chapter id="iface-handling">
<title>Virtual interfaces</title>
<para>TBD</para>
<para>
This chapter should describe virtual interface basics
that are relevant to the driver (VLANs, MGMT etc are not.)
It should explain the use of the add_iface/remove_iface
callbacks as well as the interface configuration callbacks.
</para>
<para>Things related to AP mode should be discussed there.</para>
<para>
Things related to supporting multiple interfaces should be
in the appropriate chapter, a BIG FAT note should be here about
this though and the recommendation to allow only a single
interface in STA mode at first!
</para>
!Finclude/net/mac80211.h ieee80211_vif
</chapter>
<chapter id="rx-tx">
<title>Receive and transmit processing</title>
<sect1>
<title>what should be here</title>
<para>TBD</para>
<para>
This should describe the receive and transmit
paths in mac80211/the drivers as well as
transmit status handling.
</para>
</sect1>
<sect1>
<title>Frame format</title>
!Pinclude/net/mac80211.h Frame format
</sect1>
<sect1>
<title>Packet alignment</title>
!Pnet/mac80211/rx.c Packet alignment
</sect1>
<sect1>
<title>Calling into mac80211 from interrupts</title>
!Pinclude/net/mac80211.h Calling mac80211 from interrupts
</sect1>
<sect1>
<title>functions/definitions</title>
!Finclude/net/mac80211.h ieee80211_rx_status
!Finclude/net/mac80211.h mac80211_rx_flags
!Finclude/net/mac80211.h ieee80211_tx_info
!Finclude/net/mac80211.h ieee80211_rx
!Finclude/net/mac80211.h ieee80211_rx_irqsafe
!Finclude/net/mac80211.h ieee80211_tx_status
!Finclude/net/mac80211.h ieee80211_tx_status_irqsafe
!Finclude/net/mac80211.h ieee80211_rts_get
!Finclude/net/mac80211.h ieee80211_rts_duration
!Finclude/net/mac80211.h ieee80211_ctstoself_get
!Finclude/net/mac80211.h ieee80211_ctstoself_duration
!Finclude/net/mac80211.h ieee80211_generic_frame_duration
!Finclude/net/mac80211.h ieee80211_wake_queue
!Finclude/net/mac80211.h ieee80211_stop_queue
!Finclude/net/mac80211.h ieee80211_wake_queues
!Finclude/net/mac80211.h ieee80211_stop_queues
</sect1>
</chapter>
<chapter id="filters">
<title>Frame filtering</title>
!Pinclude/net/mac80211.h Frame filtering
!Finclude/net/mac80211.h ieee80211_filter_flags
</chapter>
</part>
<part id="advanced">
<title>Advanced driver interface</title>
<partintro>
<para>
Information contained within this part of the book is
of interest only for advanced interaction of mac80211
with drivers to exploit more hardware capabilities and
improve performance.
</para>
</partintro>
<chapter id="hardware-crypto-offload">
<title>Hardware crypto acceleration</title>
!Pinclude/net/mac80211.h Hardware crypto acceleration
<!-- intentionally multiple !F lines to get proper order -->
!Finclude/net/mac80211.h set_key_cmd
!Finclude/net/mac80211.h ieee80211_key_conf
!Finclude/net/mac80211.h ieee80211_key_alg
!Finclude/net/mac80211.h ieee80211_key_flags
</chapter>
<chapter id="powersave">
<title>Powersave support</title>
!Pinclude/net/mac80211.h Powersave support
</chapter>
<chapter id="beacon-filter">
<title>Beacon filter support</title>
!Pinclude/net/mac80211.h Beacon filter support
!Finclude/net/mac80211.h ieee80211_beacon_loss
</chapter>
<chapter id="qos">
<title>Multiple queues and QoS support</title>
<para>TBD</para>
!Finclude/net/mac80211.h ieee80211_tx_queue_params
</chapter>
<chapter id="AP">
<title>Access point mode support</title>
<para>TBD</para>
<para>Some parts of the if_conf should be discussed here instead</para>
<para>
Insert notes about VLAN interfaces with hw crypto here or
in the hw crypto chapter.
</para>
!Finclude/net/mac80211.h ieee80211_get_buffered_bc
!Finclude/net/mac80211.h ieee80211_beacon_get
</chapter>
<chapter id="multi-iface">
<title>Supporting multiple virtual interfaces</title>
<para>TBD</para>
<para>
Note: WDS with identical MAC address should almost always be OK
</para>
<para>
Insert notes about having multiple virtual interfaces with
different MAC addresses here, note which configurations are
supported by mac80211, add notes about supporting hw crypto
with it.
</para>
</chapter>
<chapter id="hardware-scan-offload">
<title>Hardware scan offload</title>
<para>TBD</para>
!Finclude/net/mac80211.h ieee80211_scan_completed
</chapter>
</part>
<part id="rate-control">
<title>Rate control interface</title>
<partintro>
<para>TBD</para>
<para>
This part of the book describes the rate control algorithm
interface and how it relates to mac80211 and drivers.
</para>
</partintro>
<chapter id="dummy">
<title>dummy chapter</title>
<para>TBD</para>
</chapter>
</part>
<part id="internal">
<title>Internals</title>
<partintro>
<para>TBD</para>
<para>
This part of the book describes mac80211 internals.
</para>
</partintro>
<chapter id="key-handling">
<title>Key handling</title>
<sect1>
<title>Key handling basics</title>
!Pnet/mac80211/key.c Key handling basics
</sect1>
<sect1>
<title>MORE TBD</title>
<para>TBD</para>
</sect1>
</chapter>
<chapter id="rx-processing">
<title>Receive processing</title>
<para>TBD</para>
</chapter>
<chapter id="tx-processing">
<title>Transmit processing</title>
<para>TBD</para>
</chapter>
<chapter id="sta-info">
<title>Station info handling</title>
<sect1>
<title>Programming information</title>
!Fnet/mac80211/sta_info.h sta_info
!Fnet/mac80211/sta_info.h ieee80211_sta_info_flags
</sect1>
<sect1>
<title>STA information lifetime rules</title>
!Pnet/mac80211/sta_info.c STA information lifetime rules
</sect1>
</chapter>
<chapter id="synchronisation">
<title>Synchronisation</title>
<para>TBD</para>
<para>Locking, lots of RCU</para>
</chapter>
</part>
</book>
+38 -6
@@ -218,13 +218,22 @@ over a rather long period of time, but improvements are always welcome!
include:
a. Keeping a count of the number of data-structure elements
used by the RCU-protected data structure, including those
waiting for a grace period to elapse. Enforce a limit
on this number, stalling updates as needed to allow
previously deferred frees to complete.
used by the RCU-protected data structure, including
those waiting for a grace period to elapse. Enforce a
limit on this number, stalling updates as needed to allow
previously deferred frees to complete. Alternatively,
limit only the number awaiting deferred free rather than
the total number of elements.
Alternatively, limit only the number awaiting deferred
free rather than the total number of elements.
One way to stall the updates is to acquire the update-side
mutex. (Don't try this with a spinlock -- other CPUs
spinning on the lock could prevent the grace period
from ever ending.) Another way to stall the updates
is for the updates to use a wrapper function around
the memory allocator, so that this wrapper function
simulates OOM when there is too much memory awaiting an
RCU grace period. There are of course many other
variations on this theme.
b. Limiting update rate. For example, if updates occur only
once per hour, then no explicit rate limiting is required,
@@ -365,3 +374,26 @@ over a rather long period of time, but improvements are always welcome!
and the compiler to freely reorder code into and out of RCU
read-side critical sections. It is the responsibility of the
RCU update-side primitives to deal with this.
17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and
the __rcu sparse checks to validate your RCU code. These
can help find problems as follows:
CONFIG_PROVE_RCU: check that accesses to RCU-protected data
structures are carried out under the proper RCU
read-side critical section, while holding the right
combination of locks, or whatever other conditions
are appropriate.
CONFIG_DEBUG_OBJECTS_RCU_HEAD: check that you don't pass the
same object to call_rcu() (or friends) before an RCU
grace period has elapsed since the last time that you
passed that same object to call_rcu() (or friends).
__rcu sparse checks: tag the pointer to the RCU-protected data
structure with __rcu, and sparse will warn you if you
access that pointer without the services of one of the
variants of rcu_dereference().
These debugging aids can help you find problems that are
otherwise extremely difficult to spot.
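As a rough illustration of the kind of bug that CONFIG_DEBUG_OBJECTS_RCU_HEAD
is described as catching, here is a minimal userspace sketch. It is not the
kernel implementation: struct model_rcu_head, model_call_rcu() and
model_grace_period_end() are hypothetical stand-ins invented for this
example, and the grace period is simulated by an explicit function call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct rcu_head plus debug state. */
struct model_rcu_head {
	void (*func)(struct model_rcu_head *head);
	bool queued;	/* true from model_call_rcu() until the callback runs */
};

/*
 * Queue a callback. Returns 0 on success, or -1 if the same head is
 * still waiting for a grace period, which is the misuse being checked.
 */
static int model_call_rcu(struct model_rcu_head *head,
			  void (*func)(struct model_rcu_head *head))
{
	if (head->queued)
		return -1;	/* passed twice before a grace period elapsed */
	head->func = func;
	head->queued = true;
	return 0;
}

/* Simulate the end of a grace period for one head: invoke its callback. */
static void model_grace_period_end(struct model_rcu_head *head)
{
	if (!head->queued)
		return;
	head->queued = false;
	head->func(head);
}

/* Example callback that only counts how often it was invoked. */
static int model_invoked;

static void model_count_cb(struct model_rcu_head *head)
{
	(void)head;
	model_invoked++;
}
```

In this model, a second model_call_rcu() on the same head is rejected until
model_grace_period_end() has run the first callback, after which the head
may be queued again.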
+18
@@ -80,6 +80,24 @@ o A CPU looping with bottom halves disabled. This condition can
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
without invoking schedule().
o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
happen to preempt a low-priority task in the middle of an RCU
read-side critical section. This is especially damaging if
that low-priority task is not permitted to run on any other CPU,
in which case the next RCU grace period can never complete, which
will eventually cause the system to run out of memory and hang.
While the system is in the process of running itself out of
memory, you might see stall-warning messages.
o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
is running at a higher priority than the RCU softirq threads.
This will prevent RCU callbacks from ever being invoked,
and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent
RCU grace periods from ever completing. Either way, the
system will eventually run out of memory and hang. In the
CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
messages.
o A bug in the RCU implementation.
o A hardware failure. This is quite unlikely, but has occurred
+12 -1
@@ -125,6 +125,17 @@ o "b" is the batch limit for this CPU. If more than this number
of RCU callbacks is ready to invoke, then the remainder will
be deferred.
o "ci" is the number of RCU callbacks that have been invoked for
this CPU. Note that ci+ql is the number of callbacks that have
been registered in the absence of CPU-hotplug activity.
o "co" is the number of RCU callbacks that have been orphaned due to
this CPU going offline.
o "ca" is the number of RCU callbacks that have been adopted due to
other CPUs going offline. Note that ci+co-ca+ql is the number of
RCU callbacks registered on this CPU.
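The accounting above can be worked through with a small sketch. The struct
and function names below are hypothetical and exist only for this example;
"ql" is assumed to be the number of callbacks currently queued on the CPU,
as the formula implies.

```c
/* Hypothetical container for the per-CPU counters named in the text. */
struct model_rcudata {
	unsigned long ci;	/* callbacks invoked on this CPU */
	unsigned long co;	/* callbacks orphaned when this CPU went offline */
	unsigned long ca;	/* callbacks adopted from other offline CPUs */
	unsigned long ql;	/* callbacks currently queued (assumed meaning) */
};

/* Callbacks registered on this CPU: ci + co - ca + ql. */
static unsigned long model_registered(const struct model_rcudata *d)
{
	return d->ci + d->co - d->ca + d->ql;
}
```

For example, with ci=100, co=5, ca=3 and ql=10 the result is 112. With no
CPU-hotplug activity (co=0, ca=0) it reduces to ci+ql, here 110.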
There is also an rcu/rcudata.csv file with the same information in
comma-separated-variable spreadsheet format.
@@ -180,7 +191,7 @@ o "s" is the "signaled" state that drives force_quiescent_state()'s
o "jfq" is the number of jiffies remaining for this grace period
before force_quiescent_state() is invoked to help push things
along. Note that CPUs in dyntick-idle mode thoughout the grace
along. Note that CPUs in dyntick-idle mode throughout the grace
	period will not report on their own, but rather must be checked by
some other CPU via force_quiescent_state().
+2
@@ -6,6 +6,8 @@ Interrupts
- ARM Interrupt subsystem documentation
IXP2000
- Release Notes for Linux on Intel's IXP2000 Network Processor
msm
- MSM specific documentation
Netwinder
- Netwinder specific documentation
Porting
+176
@@ -0,0 +1,176 @@
This document provides an overview of the msm_gpiomux interface, which
is used to provide gpio pin multiplexing and configuration on mach-msm
targets.
History
=======
The first-generation API for gpio configuration & multiplexing on msm
is the function gpio_tlmm_config(). This function has a few notable
shortcomings, which led to its deprecation and replacement by gpiomux:
The 'disable' parameter: Setting the second parameter to
gpio_tlmm_config to GPIO_CFG_DISABLE tells the peripheral
processor in charge of the subsystem to perform a look-up into a
low-power table and apply the low-power/sleep setting for the pin.
As the msm family evolved this became problematic. Not all pins
have sleep settings, not all peripheral processors will accept requests
to apply said sleep settings, and not all msm targets have their gpio
subsystems managed by a peripheral processor. In order to get consistent
behavior on all targets, drivers are forced to ignore this parameter,
rendering it useless.
The 'direction' flag: for all mux-settings other than raw-gpio (0),
the output-enable bit of a gpio is hard-wired to a known
input (usually VDD or ground). For those settings, the direction flag
is meaningless at best, and deceptive at worst. In addition, using the
direction flag to change output-enable (OE) directly can cause trouble in
gpiolib, which has no visibility into gpio direction changes made
in this way. Direction control in gpio mode should be made through gpiolib.
Key Features of gpiomux
=======================
- A consistent interface across all generations of msm. Drivers can expect
the same results on every target.
- gpiomux plays nicely with gpiolib. Functions that should belong to gpiolib
are left to gpiolib and not duplicated here. gpiomux is written with the
intent that gpio_chips will call gpiomux reference-counting methods
from their request() and free() hooks, providing full integration.
- Tabular configuration. Instead of having to call gpio_tlmm_config
hundreds of times, gpio configuration is placed in a single table.
- Per-gpio sleep. Each gpio is individually reference counted, allowing only
those lines which are in use to be put in high-power states.
- 0 means 'do nothing': all flags are designed so that the default memset-zero
equates to a sensible default of 'no configuration', preventing users
from having to provide hundreds of 'no-op' configs for unused or
unwanted lines.
Usage
=====
To use gpiomux, provide configuration information for relevant gpio lines
in the msm_gpiomux_configs table. Since a 0 equates to "unconfigured",
only those lines to be managed by gpiomux need to be specified. Here
is a completely fictional example:
struct msm_gpiomux_config msm_gpiomux_configs[GPIOMUX_NGPIOS] = {
[12] = {
.active = GPIOMUX_VALID | GPIOMUX_DRV_8MA | GPIOMUX_FUNC_1,
.suspended = GPIOMUX_VALID | GPIOMUX_PULL_DOWN,
},
[34] = {
.suspended = GPIOMUX_VALID | GPIOMUX_PULL_DOWN,
},
};
To indicate that a gpio is in use, call msm_gpiomux_get() to increase
its reference count. To decrease the reference count, call msm_gpiomux_put().
The effect of this configuration is as follows:
When the system boots, gpios 12 and 34 will be initialized with their
'suspended' configurations. All other gpios, which were left unconfigured,
will not be touched.
When msm_gpiomux_get() is called on gpio 12 to raise its reference count
above 0, its active configuration will be applied. Since no other gpio
line has a valid active configuration, msm_gpiomux_get() will have no
effect on any other line.
When msm_gpiomux_put() is called on gpio 12 or 34 to drop their reference
count to 0, their suspended configurations will be applied.
Since no other gpio line has a valid suspended configuration, no other
gpio line will be affected by msm_gpiomux_put().  Since gpio 34 has no valid
active configuration, this is effectively a no-op for gpio 34 as well,
with one small caveat, see the section "About Output-Enable Settings".
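The reference-counting behavior described above can be modeled in a few
lines of userspace C. This is only a sketch under stated assumptions: the
model_* names, the flag values and the "applied" field are hypothetical and
do not reflect the real gpiomux encodings or implementation.

```c
#include <stdint.h>

#define MODEL_NGPIOS	64
/* Hypothetical flag values; the real GPIOMUX_* encodings may differ. */
#define MODEL_VALID	(1u << 15)
#define MODEL_PULL_DOWN	(1u << 0)
#define MODEL_FUNC_1	(1u << 4)

struct model_gpiomux {
	uint16_t active;	/* configuration while in use */
	uint16_t suspended;	/* configuration while unused */
	unsigned int ref;	/* reference count */
	uint16_t applied;	/* last configuration "written to hardware" */
};

static struct model_gpiomux model_cfg[MODEL_NGPIOS];

/* Apply a configuration only if its valid bit is set. */
static void model_apply(struct model_gpiomux *g, uint16_t cfg)
{
	if (cfg & MODEL_VALID)
		g->applied = cfg;
}

/* Boot: every configured line starts in its suspended configuration. */
static void model_init(void)
{
	unsigned int i;

	for (i = 0; i < MODEL_NGPIOS; i++) {
		model_cfg[i].ref = 0;
		model_cfg[i].applied = 0;
		model_apply(&model_cfg[i], model_cfg[i].suspended);
	}
}

/* Raising the count above 0 applies the active configuration. */
static void model_get(unsigned int gpio)
{
	if (gpio >= MODEL_NGPIOS)
		return;
	if (model_cfg[gpio].ref++ == 0)
		model_apply(&model_cfg[gpio], model_cfg[gpio].active);
}

/* Dropping the count to 0 applies the suspended configuration. */
static void model_put(unsigned int gpio)
{
	if (gpio >= MODEL_NGPIOS || model_cfg[gpio].ref == 0)
		return;
	if (--model_cfg[gpio].ref == 0)
		model_apply(&model_cfg[gpio], model_cfg[gpio].suspended);
}
```

In this model a line whose active field lacks the valid bit keeps its
suspended configuration across get and put, which mirrors the behavior
described for gpio 34.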
All of the GPIOMUX_VALID flags may seem like unnecessary overhead, but
they address some important issues. As unused entries (all those
except 12 and 34) are zero-filled, gpiomux needs a way to distinguish
the used fields from the unused. In addition, the all-zero pattern
is a valid configuration! Therefore, gpiomux defines an additional bit
which is used to indicate when a field is used. This has the pleasant
side-effect of allowing calls to msm_gpiomux_write to use '0' to indicate
that a value should not be changed:
msm_gpiomux_write(0, GPIOMUX_VALID, 0);
replaces the active configuration of gpio 0 with an all-zero configuration,
but leaves the suspended configuration as it was.
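The role of the valid bit in such a write can be sketched as follows. This
is a simplified userspace model, not the real function: model_write() takes
a pointer to a hypothetical struct model_line rather than a gpio number, and
MODEL_VALID is an assumed bit position.

```c
#include <stdint.h>

/* Hypothetical valid bit; the real GPIOMUX_VALID encoding may differ. */
#define MODEL_VALID	(1u << 15)

struct model_line {
	uint16_t active;
	uint16_t suspended;
};

/*
 * Update only those fields whose new value carries the valid bit.
 * A plain 0 therefore means "leave this field unchanged".
 */
static void model_write(struct model_line *line,
			uint16_t active, uint16_t suspended)
{
	if (active & MODEL_VALID)
		line->active = active;
	if (suspended & MODEL_VALID)
		line->suspended = suspended;
}
```

Calling model_write(&line, MODEL_VALID, 0) replaces the active field with an
otherwise all-zero configuration and leaves the suspended field untouched.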
Static Configurations
=====================
To install a static configuration, which is applied at boot and does
not change after that, install a configuration with a suspended component
but no active component, as in the previous example:
[34] = {
.suspended = GPIOMUX_VALID | GPIOMUX_PULL_DOWN,
},
The suspended setting is applied during boot, and the lack of any valid
active setting prevents any other setting from being applied at runtime.
If other subsystems attempting to access the line is a concern, one could
*really* anchor the configuration down by calling msm_gpiomux_get on the
line at initialization to move the line into active mode. With the line
held, it will never be re-suspended, and with no valid active configuration,
no new configurations will be applied.
But then, if having other subsystems grabbing for the line is truly a concern,
it should be reserved with gpio_request instead, which carries an implicit
msm_gpiomux_get.
gpiomux and gpiolib
===================
It is expected that msm gpio_chips will call msm_gpiomux_get() and
msm_gpiomux_put() from their request and free hooks, like this fictional
example:
static int request(struct gpio_chip *chip, unsigned offset)
{
return msm_gpiomux_get(chip->base + offset);
}
static void free(struct gpio_chip *chip, unsigned offset)
{
msm_gpiomux_put(chip->base + offset);
}
...somewhere in a gpio_chip declaration...
.request = request,
.free = free,
This provides important functionality:
- It guarantees that a gpio line will have its 'active' config applied
when the line is requested, and will not be suspended while the line
remains requested; and
- It guarantees that gpio-direction settings from gpiolib behave sensibly.
See "About Output-Enable Settings."
This mechanism allows for "auto-request" of gpiomux lines via gpiolib
when it is suitable. Drivers wishing more exact control are, of course,
free to also use msm_gpiomux_set and msm_gpiomux_get.
About Output-Enable Settings
============================
Some msm targets do not have the ability to query the current gpio
configuration setting. This means that changes made to the output-enable
(OE) bit by gpiolib cannot be consistently detected and preserved by gpiomux.
Therefore, when gpiomux applies a configuration setting, any direction
settings which may have been applied by gpiolib are lost and the default
input settings are re-applied.
For this reason, drivers should not assume that gpio direction settings
continue to hold if they free and then re-request a gpio. This seems like
common sense - after all, anybody could have obtained the line in the
meantime - but it needs saying.
This also means that calls to msm_gpiomux_write will reset the OE bit,
which means that if the gpio line is held by a client of gpiolib and
msm_gpiomux_write is called, the direction setting has been lost and
gpiolib's internal state has been broken.
Release gpio lines before reconfiguring them.
+2 -2
@@ -1,7 +1,5 @@
00-INDEX
- This file
barrier.txt
- I/O Barriers
biodoc.txt
- Notes on the Generic Block Layer Rewrite in Linux 2.5
capability.txt
@@ -16,3 +14,5 @@ stat.txt
- Block layer statistics in /sys/block/<dev>/stat
switching-sched.txt
- Switching I/O schedulers at runtime
writeback_cache_control.txt
- Control of volatile write back caches
-261
@@ -1,261 +0,0 @@
I/O Barriers
============
Tejun Heo <htejun@gmail.com>, July 22 2005
I/O barrier requests are used to guarantee ordering around the barrier
requests. Unless you're crazy enough to use disk drives for
implementing synchronization constructs (wow, sounds interesting...),
the ordering is meaningful only for write requests for things like
journal checkpoints. All requests queued before a barrier request
must be finished (made it to the physical medium) before the barrier
request is started, and all requests queued after the barrier request
must be started only after the barrier request is finished (again,
made it to the physical medium).
In other words, I/O barrier requests have the following two properties.
1. Request ordering
Requests cannot pass the barrier request. Preceding requests are
processed before the barrier and following requests after.
Depending on what features a drive supports, this can be done in one
of the following three ways.
i. For devices which have queue depth greater than 1 (TCQ devices) and
support ordered tags, block layer can just issue the barrier as an
ordered request and the lower level driver, controller and drive
itself are responsible for making sure that the ordering constraint is
met. Most modern SCSI controllers/drives should support this.
NOTE: SCSI ordered tag isn't currently used due to limitation in the
SCSI midlayer, see the following random notes section.
ii. For devices which have queue depth greater than 1 but don't
support ordered tags, block layer ensures that the requests preceding
a barrier request finishes before issuing the barrier request. Also,
it defers requests following the barrier until the barrier request is
finished. Older SCSI controllers/drives and SATA drives fall in this
category.
iii. Devices which have queue depth of 1. This is a degenerate case
of ii. Just keeping issue order suffices. Ancient SCSI
controllers/drives and IDE drives are in this category.
2. Forced flushing to physical medium
Again, if you're not gonna do synchronization with disk drives (dang,
it sounds even more appealing now!), the reason you use I/O barriers
is mainly to protect filesystem integrity when power failure or some
other events abruptly stop the drive from operating and possibly make
the drive lose data in its cache. So, I/O barriers need to guarantee
that requests actually get written to non-volatile medium in order.
There are four cases,
i. No write-back cache. Keeping requests ordered is enough.
ii. Write-back cache but no flush operation. There's no way to
guarantee physical-medium commit order. Devices of this kind can't do
I/O barriers.
iii. Write-back cache and flush operation but no FUA (forced unit
access). We need two cache flushes - before and after the barrier
request.
iv. Write-back cache, flush operation and FUA. We still need one
flush to make sure requests preceding a barrier are written to medium,
but post-barrier flush can be avoided by using FUA write on the
barrier itself.
How to support barrier requests in drivers
------------------------------------------
All barrier handling is done inside block layer proper. All low level
drivers have to do is implement a prepare_flush_fn and use one of
the following two functions to indicate what barrier type they support
and how to prepare flush requests. Note that the term 'ordered' is
used to indicate the whole sequence of performing barrier requests
including draining and flushing.
typedef void (prepare_flush_fn)(struct request_queue *q, struct request *rq);
int blk_queue_ordered(struct request_queue *q, unsigned ordered,
prepare_flush_fn *prepare_flush_fn);
@q : the queue in question
@ordered : the ordered mode the driver/device supports
@prepare_flush_fn : this function should prepare @rq such that it
flushes cache to physical medium when executed
For example, SCSI disk driver's prepare_flush_fn looks like the
following.
static void sd_prepare_flush(struct request_queue *q, struct request *rq)
{
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->cmd_type = REQ_TYPE_BLOCK_PC;
rq->timeout = SD_TIMEOUT;
rq->cmd[0] = SYNCHRONIZE_CACHE;
rq->cmd_len = 10;
}
The following seven ordered modes are supported. The following table
shows which mode should be used depending on what features a
device/driver supports. In the leftmost column of table,
QUEUE_ORDERED_ prefix is omitted from the mode names to save space.
The table is followed by description of each mode. Note that in the
descriptions of QUEUE_ORDERED_DRAIN*, '=>' is used whereas '->' is
used for QUEUE_ORDERED_TAG* descriptions. '=>' indicates that the
preceding step must be complete before proceeding to the next step.
'->' indicates that the next step can start as soon as the previous
step is issued.
                write-back cache    ordered tag    flush    FUA
-----------------------------------------------------------------------
NONE            yes/no              N/A            no       N/A
DRAIN           no                  no             N/A      N/A
DRAIN_FLUSH     yes                 no             yes      no
DRAIN_FUA       yes                 no             yes      yes
TAG             no                  yes            N/A      N/A
TAG_FLUSH       yes                 yes            yes      no
TAG_FUA         yes                 yes            yes      yes
QUEUE_ORDERED_NONE
I/O barriers are not needed and/or supported.
Sequence: N/A
QUEUE_ORDERED_DRAIN
Requests are ordered by draining the request queue and cache
flushing isn't needed.
Sequence: drain => barrier
QUEUE_ORDERED_DRAIN_FLUSH
Requests are ordered by draining the request queue and both
pre-barrier and post-barrier cache flushings are needed.
Sequence: drain => preflush => barrier => postflush
QUEUE_ORDERED_DRAIN_FUA
Requests are ordered by draining the request queue and
pre-barrier cache flushing is needed. By using FUA on barrier
request, post-barrier flushing can be skipped.
Sequence: drain => preflush => barrier
QUEUE_ORDERED_TAG
Requests are ordered by ordered tag and cache flushing isn't
needed.
Sequence: barrier
QUEUE_ORDERED_TAG_FLUSH
Requests are ordered by ordered tag and both pre-barrier and
post-barrier cache flushings are needed.
Sequence: preflush -> barrier -> postflush
QUEUE_ORDERED_TAG_FUA
Requests are ordered by ordered tag and pre-barrier cache
flushing is needed. By using FUA on barrier request,
post-barrier flushing can be skipped.
Sequence: preflush -> barrier
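The table and mode descriptions above amount to a small decision procedure.
The following sketch restates it as code. It is illustrative only: the
MODEL_* enumerators and both functions are hypothetical names that mirror
the QUEUE_ORDERED_* modes, and the mapping for feature combinations not
listed in the table is an assumption.

```c
#include <stdbool.h>

/* Hypothetical mirror of the seven QUEUE_ORDERED_* modes. */
enum model_ordered_mode {
	MODEL_NONE,
	MODEL_DRAIN,
	MODEL_DRAIN_FLUSH,
	MODEL_DRAIN_FUA,
	MODEL_TAG,
	MODEL_TAG_FLUSH,
	MODEL_TAG_FUA,
};

/* Pick a mode from device features, following the table row by row. */
static enum model_ordered_mode model_pick_mode(bool wb_cache, bool ordered_tag,
					       bool flush, bool fua)
{
	if (!wb_cache)		/* no cache: ordering alone is enough */
		return ordered_tag ? MODEL_TAG : MODEL_DRAIN;
	if (!flush)		/* cache without flush: barriers unsupported */
		return MODEL_NONE;
	if (fua)		/* FUA write replaces the post-barrier flush */
		return ordered_tag ? MODEL_TAG_FUA : MODEL_DRAIN_FUA;
	return ordered_tag ? MODEL_TAG_FLUSH : MODEL_DRAIN_FLUSH;
}

/* Sequence strings as given in the mode descriptions. */
static const char *model_sequence(enum model_ordered_mode mode)
{
	switch (mode) {
	case MODEL_DRAIN:	return "drain => barrier";
	case MODEL_DRAIN_FLUSH:	return "drain => preflush => barrier => postflush";
	case MODEL_DRAIN_FUA:	return "drain => preflush => barrier";
	case MODEL_TAG:		return "barrier";
	case MODEL_TAG_FLUSH:	return "preflush -> barrier -> postflush";
	case MODEL_TAG_FUA:	return "preflush -> barrier";
	default:		return "N/A";
	}
}
```

For instance, a device with a write-back cache and flush support, but with
neither ordered tags nor FUA, maps to the DRAIN_FLUSH row in this sketch.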
Random notes/caveats
--------------------
* SCSI layer currently can't use TAG ordering even if the drive,
controller and driver support it. The problem is that SCSI midlayer
request dispatch function is not atomic. It releases the queue lock and
switches to the SCSI host lock during issue, so it is possible, and in
time likely, for requests to change their relative positions. Once
this problem is solved, TAG ordering can be enabled.
* Currently, no matter which ordered mode is used, there can be only
one barrier request in progress. All I/O barriers are held off by
block layer until the previous I/O barrier is complete. This doesn't
make any difference for DRAIN ordered devices, but, for TAG ordered
devices with very high command latency, passing multiple I/O barriers
to low level *might* be helpful if they are very frequent. Well, this
certainly is a non-issue. I'm writing this just to make clear that no
two I/O barriers are ever passed to the low-level driver at the same time.
* Completion order. Requests in ordered sequence are issued in order
but not required to finish in order. Barrier implementation can
handle out-of-order completion of ordered sequence. IOW, the requests
MUST be processed in order but the hardware/software completion paths
are allowed to reorder completion notifications - eg. current SCSI
midlayer doesn't preserve completion order during error handling.
* Requeueing order. Low-level drivers are free to requeue any request
after they removed it from the request queue with
blkdev_dequeue_request(). As barrier sequence should be kept in order
when requeued, generic elevator code takes care of putting requests in
order around barrier. See blk_ordered_req_seq() and
ELEVATOR_INSERT_REQUEUE handling in __elv_add_request() for details.
Note that block drivers must not requeue preceding requests while
completing latter requests in an ordered sequence. Currently, no
error checking is done against this.
* Error handling. Currently, block layer will report error to upper
layer if any of requests in an ordered sequence fails. Unfortunately,
this doesn't seem to be enough. Look at the following request flow.
QUEUE_ORDERED_TAG_FLUSH is in use.
 [0] [1] [2] [3] [pre] [barrier] [post] < [4] [5] [6] ... >
                                          still in elevator
Let's say request [2], [3] are write requests to update file system
metadata (journal or whatever) and [barrier] is used to mark that
those updates are valid. Consider the following sequence.
i. Requests [0] ~ [post] leave the request queue and enter
the low-level driver.
ii. After a while, unfortunately, something goes wrong and the
drive fails [2]. Note that any of [0], [1] and [3] could have
completed by this time, but [pre] couldn't have been finished
as the drive must process it in order and it failed before
processing that command.
iii. Error handling kicks in and determines that the error is
unrecoverable and fails [2], and resumes operation.
iv. [pre] [barrier] [post] gets processed.
v. *BOOM* power fails
The problem here is that the barrier request is *supposed* to indicate
that filesystem update requests [2] and [3] made it safely to the
physical medium and, if the machine crashes after the barrier is
written, filesystem recovery code can depend on that. Sadly, that
isn't true in this case anymore. IOW, the success of an I/O barrier
should also be dependent on success of some of the preceding requests,
where only upper layer (filesystem) knows what 'some' is.
This can be solved by implementing a way to tell the block layer which
requests affect the success of the following barrier request and
making lower level drivers resume operation on error only after
the block layer tells them to do so.
As the probability of this happening is very low and the drive would have
to be faulty, implementing the fix is probably overkill. But, still,
it's there.
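
The dependency rule proposed above can be sketched in plain userspace C.
This is only an illustration of the idea, not block layer code:
struct model_req, its fields and model_barrier_valid() are hypothetical
names invented for this sketch, and the "barrier_depends" marking stands in
for whatever interface a filesystem would use to name the requests a
barrier covers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-request record for this sketch; not a kernel structure. */
struct model_req {
	bool completed_ok;	/* request made it to the physical medium */
	bool barrier_depends;	/* upper layer says the barrier covers it */
};

/*
 * A barrier is only reported as successful if the barrier write itself
 * succeeded and every preceding request the upper layer marked as a
 * dependency also succeeded.
 */
bool model_barrier_valid(const struct model_req *preceding, size_t nr,
			 bool barrier_ok)
{
	size_t i;

	if (!barrier_ok)
		return false;

	for (i = 0; i < nr; i++) {
		if (preceding[i].barrier_depends && !preceding[i].completed_ok)
			return false;
	}
	return true;
}
```

In the scenario above, [2] and [3] would be marked as dependencies, so the
failure of [2] would fail the barrier even though [pre] [barrier] [post]
were processed without error.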
* In previous drafts of the barrier implementation, there was a fallback
mechanism such that, if FUA or ordered TAG fails, a less fancy ordered
mode can be selected and the failed barrier request is retried
automatically. The rationale for this feature was that, as FUA is
pretty new in the ATA world and ordered tag was never used widely, there
could be devices which claim to support those features but choke when
actually given such requests.
This was removed for two reasons: 1. it's overkill; 2. it's
impossible to implement properly when TAG ordering is used, as low
level drivers resume after an error automatically. If it's ever
needed, adding it back and modifying low level drivers accordingly
shouldn't be difficult.
@@ -0,0 +1,86 @@
Explicit volatile write back cache control
==========================================
Introduction
------------
Many storage devices, especially in the consumer market, come with volatile
write back caches. That means the devices signal I/O completion to the
operating system before the data has actually reached non-volatile storage. This
behavior obviously speeds up various workloads, but it means the operating
system needs to force data out to the non-volatile storage when it performs
a data integrity operation like fsync, sync or an unmount.
The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device. These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.
Explicit cache flushes
----------------------
The REQ_FLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started. This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts. In addition, the REQ_FLUSH flag can be
set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O. It is recommended to use
the blkdev_issue_flush() helper for a pure cache flush.
Force Unit Access
-----------------
The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.
Implementation details for filesystems
--------------------------------------
Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to
worry about whether the underlying devices need any explicit cache flushing
or how Force Unit Access is implemented. The REQ_FLUSH and REQ_FUA flags
may both be set on a single bio.
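
The semantics of the two flags can be illustrated with a small userspace
model of a disk with a volatile write back cache. This is a sketch under
stated assumptions, not kernel code: the MODEL_* flag values, struct
model_disk and the model_*() helpers are all hypothetical and exist only
for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model; not the kernel definitions. */
#define MODEL_REQ_FLUSH	(1u << 0)
#define MODEL_REQ_FUA	(1u << 1)

#define MODEL_NR_BLOCKS	8

struct model_disk {
	int media[MODEL_NR_BLOCKS];	/* non-volatile storage */
	int cache[MODEL_NR_BLOCKS];	/* volatile write back cache */
	bool dirty[MODEL_NR_BLOCKS];	/* cached data not yet on media */
};

/* Write every dirty cache entry out to non-volatile storage. */
void model_flush(struct model_disk *d)
{
	size_t i;

	for (i = 0; i < MODEL_NR_BLOCKS; i++) {
		if (d->dirty[i]) {
			d->media[i] = d->cache[i];
			d->dirty[i] = false;
		}
	}
}

/* Submit a write; has_data == false models an otherwise empty bio. */
void model_submit_write(struct model_disk *d, unsigned int flags,
			bool has_data, size_t blk, int val)
{
	/* REQ_FLUSH: flush the cache before the actual I/O starts. */
	if (flags & MODEL_REQ_FLUSH)
		model_flush(d);

	if (!has_data)
		return;

	d->cache[blk] = val;
	if (flags & MODEL_REQ_FUA) {
		/* REQ_FUA: this write is on media before completion. */
		d->media[blk] = val;
		d->dirty[blk] = false;
	} else {
		d->dirty[blk] = true;
	}
}

/* Power loss discards whatever was only in the volatile cache. */
void model_power_loss(struct model_disk *d)
{
	size_t i;

	for (i = 0; i < MODEL_NR_BLOCKS; i++)
		d->dirty[i] = false;
}
```

Note that in this model a write carrying only the flush flag makes earlier
writes durable but not its own data; setting both flags on one write covers
both the earlier writes and the write itself.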
Implementation details for make_request_fn based block drivers
--------------------------------------------------------------
These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface. For remapping drivers the REQ_FUA
bit needs to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_FLUSH bit set. For real device
drivers that do not have a volatile cache the REQ_FLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_FLUSH requests without
data can be completed successfully without doing any work. Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.
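
The rule for remapping drivers can be sketched as follows. This is a
userspace illustration only: the striping layout, the MODEL_* names,
struct model_member_op and model_remap() are assumptions made for this
example, and the sketch further assumes a data bio never crosses a chunk
boundary.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model; not the kernel definitions. */
#define MODEL_REQ_FLUSH	(1u << 0)
#define MODEL_REQ_FUA	(1u << 1)

#define MODEL_NR_MEMBERS 3

/* What one underlying device is asked to do for a given bio. */
struct model_member_op {
	bool flush;	/* receives a cache flush */
	bool write;	/* receives the data */
	bool fua;	/* data write carries the FUA bit */
};

/*
 * Remap one bio onto a striped set of members. chunk_sectors must be
 * non-zero. A flush is global (sent to every member), while FUA is
 * propagated only to the member that receives the data.
 */
void model_remap(unsigned int flags, bool has_data, unsigned long sector,
		 unsigned long chunk_sectors,
		 struct model_member_op ops[MODEL_NR_MEMBERS])
{
	size_t i;

	for (i = 0; i < MODEL_NR_MEMBERS; i++) {
		ops[i].flush = (flags & MODEL_REQ_FLUSH) != 0;
		ops[i].write = false;
		ops[i].fua = false;
	}

	if (has_data) {
		size_t target = (sector / chunk_sectors) % MODEL_NR_MEMBERS;

		ops[target].write = true;
		ops[target].fua = (flags & MODEL_REQ_FUA) != 0;
	}
}
```

The flush has to reach every member because previously completed writes
may sit in the cache of any of them, not just the one the current bio
maps to.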
Implementation details for request_fn based block drivers
---------------------------------------------------------
For devices that do not support volatile write caches no driver support
is required: the block layer completes empty REQ_FLUSH requests before
entering the driver and strips off the REQ_FLUSH and REQ_FUA bits from
requests that have a payload. For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing:
blk_queue_flush(sdkp->disk->queue, REQ_FLUSH);
and handle empty REQ_FLUSH requests in its prep_fn/request_fn. Note that
REQ_FLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_FLUSH request followed by the actual write by the block
layer. For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using:
blk_queue_flush(sdkp->disk->queue, REQ_FLUSH | REQ_FUA);
and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn. If the FUA bit is not natively supported the block
layer turns it into an empty REQ_FLUSH request after the actual write.
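
The decomposition described in this section can be summarised with a small
userspace model. It is a sketch of the rules as stated above, not the block
layer implementation: the MODEL_* names, enum model_step and
model_sequence() are hypothetical, and the model assumes FUA support is
only meaningful on a queue that also advertises flush support.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model; not the kernel definitions. */
#define MODEL_REQ_FLUSH	(1u << 0)
#define MODEL_REQ_FUA	(1u << 1)

/* What the driver's prep_fn/request_fn gets to see. */
enum model_step {
	MODEL_STEP_FLUSH,	/* empty flush request */
	MODEL_STEP_WRITE,	/* plain write */
	MODEL_STEP_WRITE_FUA,	/* write with the FUA bit set */
};

/*
 * queue_flags: what the driver passed when registering its capabilities.
 * req_flags:   what the filesystem set on the request.
 * Returns the number of steps stored in out[] (at most 3).
 */
size_t model_sequence(unsigned int queue_flags, unsigned int req_flags,
		      bool has_data, enum model_step out[3])
{
	bool can_flush = (queue_flags & MODEL_REQ_FLUSH) != 0;
	bool can_fua = can_flush && (queue_flags & MODEL_REQ_FUA) != 0;
	bool want_fua = (req_flags & MODEL_REQ_FUA) != 0;
	size_t n = 0;

	if (!can_flush) {
		/* No volatile cache: flags are stripped, empty flushes vanish. */
		if (has_data)
			out[n++] = MODEL_STEP_WRITE;
		return n;
	}

	/* A flush with payload becomes an empty flush plus the write. */
	if (req_flags & MODEL_REQ_FLUSH)
		out[n++] = MODEL_STEP_FLUSH;

	if (has_data) {
		out[n++] = (want_fua && can_fua) ? MODEL_STEP_WRITE_FUA
						 : MODEL_STEP_WRITE;
		/* FUA without native support: flush after the write. */
		if (want_fua && !can_fua)
			out[n++] = MODEL_STEP_FLUSH;
	}
	return n;
}
```

For example, a write with both flags set on a queue that only advertises
flush support yields flush, write, flush in this model, while the same
request on a queue that also advertises FUA yields flush followed by a FUA
write.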
