Merge branch 'linus' into x86/cpu

This commit is contained in:
Ingo Molnar
2009-06-07 12:22:15 +02:00
2856 changed files with 85612 additions and 66978 deletions
+3 -3
View File
@@ -1,4 +1,4 @@
What: /debug/pktcdvd/pktcdvd[0-7]
What: /sys/kernel/debug/pktcdvd/pktcdvd[0-7]
Date: Oct. 2006
KernelVersion: 2.6.20
Contact: Thomas Maier <balagi@justmail.de>
@@ -10,10 +10,10 @@ debugfs interface
The pktcdvd module (packet writing driver) creates
these files in debugfs:
/debug/pktcdvd/pktcdvd[0-7]/
/sys/kernel/debug/pktcdvd/pktcdvd[0-7]/
info (0444) Lots of driver statistics and infos.
Example:
-------
cat /debug/pktcdvd/pktcdvd0/info
cat /sys/kernel/debug/pktcdvd/pktcdvd0/info
@@ -69,9 +69,13 @@ Description:
gpe1F: 0 invalid
gpe_all: 1192
sci: 1194
sci_not: 0
sci - The total number of times the ACPI SCI
has claimed an interrupt.
sci - The number of times the ACPI SCI
has been called and claimed an interrupt.
sci_not - The number of times the ACPI SCI
has been called and NOT claimed an interrupt.
gpe_all - count of SCI caused by GPEs.
+479
View File
@@ -0,0 +1,479 @@
What: /sys/kernel/slab
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The /sys/kernel/slab directory contains a snapshot of the
internal state of the SLUB allocator for each cache. Certain
files may be modified to change the behavior of the cache (and
any cache it aliases, if any).
Users: kernel memory tuning tools
What: /sys/kernel/slab/cache/aliases
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The aliases file is read-only and specifies how many caches
have merged into this cache.
What: /sys/kernel/slab/cache/align
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The align file is read-only and specifies the cache's object
alignment in bytes.
What: /sys/kernel/slab/cache/alloc_calls
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The alloc_calls file is read-only and lists the kernel code
locations from which allocations for this cache were performed.
The alloc_calls file only contains information if debugging is
enabled for that cache (see Documentation/vm/slub.txt).
What: /sys/kernel/slab/cache/alloc_fastpath
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The alloc_fastpath file is read-only and specifies how many
objects have been allocated using the fast path.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/alloc_from_partial
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The alloc_from_partial file is read-only and specifies how
many times a cpu slab has been full and it has been refilled
by using a slab from the list of partially used slabs.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/alloc_refill
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The alloc_refill file is read-only and specifies how many
times the per-cpu freelist was empty but there were objects
available as the result of remote cpu frees.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/alloc_slab
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The alloc_slab file is read-only and specifies how many times
a new slab had to be allocated from the page allocator.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/alloc_slowpath
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The alloc_slowpath file is read-only and specifies how many
objects have been allocated using the slow path because of a
refill or allocation from a partial or new slab.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/cache_dma
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The cache_dma file is read-only and specifies whether objects
are from ZONE_DMA.
Available when CONFIG_ZONE_DMA is enabled.
What: /sys/kernel/slab/cache/cpu_slabs
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The cpu_slabs file is read-only and displays how many cpu slabs
are active and their NUMA locality.
What: /sys/kernel/slab/cache/cpuslab_flush
Date: April 2009
KernelVersion: 2.6.31
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file cpuslab_flush is read-only and specifies how many
times a cache's cpu slabs have been flushed as the result of
destroying or shrinking a cache, a cpu going offline, or as
the result of forcing an allocation from a certain node.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/ctor
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The ctor file is read-only and specifies the cache's object
constructor function, which is invoked for each object when a
new slab is allocated.
What: /sys/kernel/slab/cache/deactivate_empty
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file deactivate_empty is read-only and specifies how many
times an empty cpu slab was deactivated.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/deactivate_full
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file deactivate_full is read-only and specifies how many
times a full cpu slab was deactivated.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/deactivate_remote_frees
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file deactivate_remote_frees is read-only and specifies how
many times a cpu slab has been deactivated and contained free
objects that were freed remotely.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/deactivate_to_head
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file deactivate_to_head is read-only and specifies how
many times a partial cpu slab was deactivated and added to the
head of its node's partial list.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/deactivate_to_tail
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file deactivate_to_tail is read-only and specifies how
many times a partial cpu slab was deactivated and added to the
tail of its node's partial list.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/destroy_by_rcu
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The destroy_by_rcu file is read-only and specifies whether
slabs (not objects) are freed by rcu.
What: /sys/kernel/slab/cache/free_add_partial
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file free_add_partial is read-only and specifies how many
times an object has been freed in a full slab so that it had to
added to its node's partial list.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_calls
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The free_calls file is read-only and lists the locations of
object frees if slab debugging is enabled (see
Documentation/vm/slub.txt).
What: /sys/kernel/slab/cache/free_fastpath
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The free_fastpath file is read-only and specifies how many
objects have been freed using the fast path because it was an
object from the cpu slab.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_frozen
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The free_frozen file is read-only and specifies how many
objects have been freed to a frozen slab (i.e. a remote cpu
slab).
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_remove_partial
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file free_remove_partial is read-only and specifies how
many times an object has been freed to a now-empty slab so
that it had to be removed from its node's partial list.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_slab
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The free_slab file is read-only and specifies how many times an
empty slab has been freed back to the page allocator.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/free_slowpath
Date: February 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The free_slowpath file is read-only and specifies how many
objects have been freed using the slow path (i.e. to a full or
partial slab).
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/hwcache_align
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The hwcache_align file is read-only and specifies whether
objects are aligned on cachelines.
What: /sys/kernel/slab/cache/min_partial
Date: February 2009
KernelVersion: 2.6.30
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
David Rientjes <rientjes@google.com>
Description:
The min_partial file specifies how many empty slabs shall
remain on a node's partial list to avoid the overhead of
allocating new slabs. Such slabs may be reclaimed by utilizing
the shrink file.
What: /sys/kernel/slab/cache/object_size
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The object_size file is read-only and specifies the cache's
object size.
What: /sys/kernel/slab/cache/objects
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The objects file is read-only and displays how many objects are
active and from which nodes they are from.
What: /sys/kernel/slab/cache/objects_partial
Date: April 2008
KernelVersion: 2.6.26
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The objects_partial file is read-only and displays how many
objects are on partial slabs and from which nodes they are
from.
What: /sys/kernel/slab/cache/objs_per_slab
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file objs_per_slab is read-only and specifies how many
objects may be allocated from a single slab of the order
specified in /sys/kernel/slab/cache/order.
What: /sys/kernel/slab/cache/order
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The order file specifies the page order at which new slabs are
allocated. It is writable and can be changed to increase the
number of objects per slab. If a slab cannot be allocated
because of fragmentation, SLUB will retry with the minimum order
possible depending on its characteristics.
What: /sys/kernel/slab/cache/order_fallback
Date: April 2008
KernelVersion: 2.6.26
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file order_fallback is read-only and specifies how many
times an allocation of a new slab has not been possible at the
cache's order and instead fallen back to its minimum possible
order.
Available when CONFIG_SLUB_STATS is enabled.
What: /sys/kernel/slab/cache/partial
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The partial file is read-only and displays how long many
partial slabs there are and how long each node's list is.
What: /sys/kernel/slab/cache/poison
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The poison file specifies whether objects should be poisoned
when a new slab is allocated.
What: /sys/kernel/slab/cache/reclaim_account
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The reclaim_account file specifies whether the cache's objects
are reclaimable (and grouped by their mobility).
What: /sys/kernel/slab/cache/red_zone
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The red_zone file specifies whether the cache's objects are red
zoned.
What: /sys/kernel/slab/cache/remote_node_defrag_ratio
Date: January 2008
KernelVersion: 2.6.25
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The file remote_node_defrag_ratio specifies the percentage of
times SLUB will attempt to refill the cpu slab with a partial
slab from a remote node as opposed to allocating a new slab on
the local node. This reduces the amount of wasted memory over
the entire system but can be expensive.
Available when CONFIG_NUMA is enabled.
What: /sys/kernel/slab/cache/sanity_checks
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The sanity_checks file specifies whether expensive checks
should be performed on free and, at minimum, enables double free
checks. Caches that enable sanity_checks cannot be merged with
caches that do not.
What: /sys/kernel/slab/cache/shrink
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The shrink file is written when memory should be reclaimed from
a cache. Empty partial slabs are freed and the partial list is
sorted so the slabs with the fewest available objects are used
first.
What: /sys/kernel/slab/cache/slab_size
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The slab_size file is read-only and specifies the object size
with metadata (debugging information and alignment) in bytes.
What: /sys/kernel/slab/cache/slabs
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The slabs file is read-only and displays how long many slabs
there are (both cpu and partial) and from which nodes they are
from.
What: /sys/kernel/slab/cache/store_user
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The store_user file specifies whether the location of
allocation or free should be tracked for a cache.
What: /sys/kernel/slab/cache/total_objects
Date: April 2008
KernelVersion: 2.6.26
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The total_objects file is read-only and displays how many total
objects a cache has and from which nodes they are from.
What: /sys/kernel/slab/cache/trace
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
The trace file specifies whether object allocations and frees
should be traced.
What: /sys/kernel/slab/cache/validate
Date: May 2007
KernelVersion: 2.6.22
Contact: Pekka Enberg <penberg@cs.helsinki.fi>,
Christoph Lameter <cl@linux-foundation.org>
Description:
Writing to the validate file causes SLUB to traverse all of its
cache's objects and check the validity of metadata.
+11 -5
View File
@@ -31,7 +31,7 @@ PS_METHOD = $(prefer-db2x)
###
# The targets that may be used.
PHONY += xmldocs sgmldocs psdocs pdfdocs htmldocs mandocs installmandocs
PHONY += xmldocs sgmldocs psdocs pdfdocs htmldocs mandocs installmandocs cleandocs
BOOKS := $(addprefix $(obj)/,$(DOCBOOKS))
xmldocs: $(BOOKS)
@@ -143,7 +143,8 @@ quiet_cmd_db2pdf = PDF $@
$(call cmd,db2pdf)
main_idx = Documentation/DocBook/index.html
index = index.html
main_idx = Documentation/DocBook/$(index)
build_main_index = rm -rf $(main_idx) && \
echo '<h1>Linux Kernel HTML Documentation</h1>' >> $(main_idx) && \
echo '<h2>Kernel Version: $(KERNELVERSION)</h2>' >> $(main_idx) && \
@@ -213,11 +214,12 @@ silent_gen_xml = :
dochelp:
@echo ' Linux kernel internal documentation in different formats:'
@echo ' htmldocs - HTML'
@echo ' installmandocs - install man pages generated by mandocs'
@echo ' mandocs - man pages'
@echo ' pdfdocs - PDF'
@echo ' psdocs - Postscript'
@echo ' xmldocs - XML DocBook'
@echo ' mandocs - man pages'
@echo ' installmandocs - install man pages generated by mandocs'
@echo ' cleandocs - clean all generated DocBook files'
###
# Temporary files left by various tools
@@ -231,10 +233,14 @@ clean-files := $(DOCBOOKS) \
$(patsubst %.xml, %.pdf, $(DOCBOOKS)) \
$(patsubst %.xml, %.html, $(DOCBOOKS)) \
$(patsubst %.xml, %.9, $(DOCBOOKS)) \
$(C-procfs-example)
$(C-procfs-example) $(index)
clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS)) man
cleandocs:
$(Q)rm -f $(call objectify, $(clean-files))
$(Q)rm -rf $(call objectify, $(clean-dirs))
# Declare the contents of the .PHONY variable as phony. We keep that
# information in a variable se we can use it in if_changed and friends.
+5 -1
View File
@@ -190,16 +190,20 @@ X!Ekernel/module.c
!Edrivers/pci/pci.c
!Edrivers/pci/pci-driver.c
!Edrivers/pci/remove.c
!Edrivers/pci/pci-acpi.c
!Edrivers/pci/search.c
!Edrivers/pci/msi.c
!Edrivers/pci/bus.c
!Edrivers/pci/access.c
!Edrivers/pci/irq.c
!Edrivers/pci/htirq.c
<!-- FIXME: Removed for now since no structured comments in source
X!Edrivers/pci/hotplug.c
-->
!Edrivers/pci/probe.c
!Edrivers/pci/slot.c
!Edrivers/pci/rom.c
!Edrivers/pci/iov.c
!Idrivers/pci/pci-sysfs.c
</sect1>
<sect1><title>PCI Hotplug Support Library</title>
!Edrivers/pci/hotplug/pci_hotplug_core.c
+1 -1
View File
@@ -281,7 +281,7 @@
seriously wrong while debugging, it will most often be the case
that you want to enable gdb to be verbose about its target
communications. You do this prior to issuing the <constant>target
remote</constant> command by typing in: <constant>set remote debug 1</constant>
remote</constant> command by typing in: <constant>set debug remote 1</constant>
</para>
</chapter>
<chapter id="KGDBTestSuite">
+6 -13
View File
@@ -1040,23 +1040,21 @@ Front merges are handled by the binary trees in AS and deadline schedulers.
iii. Plugging the queue to batch requests in anticipation of opportunities for
merge/sort optimizations
This is just the same as in 2.4 so far, though per-device unplugging
support is anticipated for 2.5. Also with a priority-based i/o scheduler,
such decisions could be based on request priorities.
Plugging is an approach that the current i/o scheduling algorithm resorts to so
that it collects up enough requests in the queue to be able to take
advantage of the sorting/merging logic in the elevator. If the
queue is empty when a request comes in, then it plugs the request queue
(sort of like plugging the bottom of a vessel to get fluid to build up)
(sort of like plugging the bath tub of a vessel to get fluid to build up)
till it fills up with a few more requests, before starting to service
the requests. This provides an opportunity to merge/sort the requests before
passing them down to the device. There are various conditions when the queue is
unplugged (to open up the flow again), either through a scheduled task or
could be on demand. For example wait_on_buffer sets the unplugging going
(by running tq_disk) so the read gets satisfied soon. So in the read case,
the queue gets explicitly unplugged as part of waiting for completion,
in fact all queues get unplugged as a side-effect.
through sync_buffer() running blk_run_address_space(mapping). Or the caller
can do it explicity through blk_unplug(bdev). So in the read case,
the queue gets explicitly unplugged as part of waiting for completion on that
buffer. For page driven IO, the address space ->sync_page() takes care of
doing the blk_run_address_space().
Aside:
This is kind of controversial territory, as it's not clear if plugging is
@@ -1067,11 +1065,6 @@ Aside:
multi-page bios being queued in one shot, we may not need to wait to merge
a big request from the broken up pieces coming by.
Per-queue granularity unplugging (still a Todo) may help reduce some of the
concerns with just a single tq_disk flush approach. Something like
blk_kick_queue() to unplug a specific queue (right away ?)
or optionally, all queues, is in the plan.
4.4 I/O contexts
I/O contexts provide a dynamically allocated per process data area. They may
be used in I/O schedulers, and in the block layer (could be used for IO statis,
+18
View File
@@ -30,3 +30,21 @@ The above steps create a new group g1 and move the current shell
process (bash) into it. CPU time consumed by this bash and its children
can be obtained from g1/cpuacct.usage and the same is accumulated in
/cgroups/cpuacct.usage also.
cpuacct.stat file lists a few statistics which further divide the
CPU time obtained by the cgroup into user and system times. Currently
the following statistics are supported:
user: Time spent by tasks of the cgroup in user mode.
system: Time spent by tasks of the cgroup in kernel mode.
user and system are in USER_HZ unit.
cpuacct controller uses percpu_counter interface to collect user and
system times. This has two side effects:
- It is theoretically possible to see wrong values for user and system times.
This is because percpu_counter_read() on 32bit systems isn't safe
against concurrent writes.
- It is possible to see slightly outdated values for user and system times
due to the batch processing nature of percpu_counter.
+30 -21
View File
@@ -6,15 +6,14 @@ used here with the memory controller that is used in hardware.
Salient features
a. Enable control of both RSS (mapped) and Page Cache (unmapped) pages
a. Enable control of Anonymous, Page Cache (mapped and unmapped) and
Swap Cache memory pages.
b. The infrastructure allows easy addition of other types of memory to control
c. Provides *zero overhead* for non memory controller users
d. Provides a double LRU: global memory pressure causes reclaim from the
global LRU; a cgroup on hitting a limit, reclaims from the per
cgroup LRU
NOTE: Swap Cache (unmapped) is not accounted now.
Benefits and Purpose of the memory controller
The memory controller isolates the memory behaviour of a group of tasks
@@ -290,34 +289,44 @@ will be charged as a new owner of it.
moved to the parent. If you want to avoid that, force_empty will be useful.
5.2 stat file
memory.stat file includes following statistics (now)
cache - # of pages from page-cache and shmem.
rss - # of pages from anonymous memory.
pgpgin - # of event of charging
pgpgout - # of event of uncharging
active_anon - # of pages on active lru of anon, shmem.
inactive_anon - # of pages on active lru of anon, shmem
active_file - # of pages on active lru of file-cache
inactive_file - # of pages on inactive lru of file cache
unevictable - # of pages cannot be reclaimed.(mlocked etc)
Below is depend on CONFIG_DEBUG_VM.
inactive_ratio - VM internal parameter. (see mm/page_alloc.c)
recent_rotated_anon - VM internal parameter. (see mm/vmscan.c)
recent_rotated_file - VM internal parameter. (see mm/vmscan.c)
recent_scanned_anon - VM internal parameter. (see mm/vmscan.c)
recent_scanned_file - VM internal parameter. (see mm/vmscan.c)
memory.stat file includes following statistics
Memo:
cache - # of bytes of page cache memory.
rss - # of bytes of anonymous and swap cache memory.
pgpgin - # of pages paged in (equivalent to # of charging events).
pgpgout - # of pages paged out (equivalent to # of uncharging events).
active_anon - # of bytes of anonymous and swap cache memory on active
lru list.
inactive_anon - # of bytes of anonymous memory and swap cache memory on
inactive lru list.
active_file - # of bytes of file-backed memory on active lru list.
inactive_file - # of bytes of file-backed memory on inactive lru list.
unevictable - # of bytes of memory that cannot be reclaimed (mlocked etc).
The following additional stats are dependent on CONFIG_DEBUG_VM.
inactive_ratio - VM internal parameter. (see mm/page_alloc.c)
recent_rotated_anon - VM internal parameter. (see mm/vmscan.c)
recent_rotated_file - VM internal parameter. (see mm/vmscan.c)
recent_scanned_anon - VM internal parameter. (see mm/vmscan.c)
recent_scanned_file - VM internal parameter. (see mm/vmscan.c)
Memo:
recent_rotated means recent frequency of lru rotation.
recent_scanned means recent # of scans to lru.
showing for better debug please see the code for meanings.
Note:
Only anonymous and swap cache memory is listed as part of 'rss' stat.
This should not be confused with the true 'resident set size' or the
amount of physical memory used by the cgroup. Per-cgroup rss
accounting is not done yet.
5.3 swappiness
Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only.
Following cgroup's swapiness can't be changed.
Following cgroups' swapiness can't be changed.
- root cgroup (uses /proc/sys/vm/swappiness).
- a cgroup which uses hierarchy and it has child cgroup.
- a cgroup which uses hierarchy and not the root of hierarchy.
+21 -6
View File
@@ -47,13 +47,18 @@ to work with it.
2. Basic accounting routines
a. void res_counter_init(struct res_counter *rc)
a. void res_counter_init(struct res_counter *rc,
struct res_counter *rc_parent)
Initializes the resource counter. As usual, should be the first
routine called for a new counter.
b. int res_counter_charge[_locked]
(struct res_counter *rc, unsigned long val)
The struct res_counter *parent can be used to define a hierarchical
child -> parent relationship directly in the res_counter structure,
NULL can be used to define no relationship.
c. int res_counter_charge(struct res_counter *rc, unsigned long val,
struct res_counter **limit_fail_at)
When a resource is about to be allocated it has to be accounted
with the appropriate resource counter (controller should determine
@@ -67,15 +72,25 @@ to work with it.
* if the charging is performed first, then it should be uncharged
on error path (if the one is called).
c. void res_counter_uncharge[_locked]
If the charging fails and a hierarchical dependency exists, the
limit_fail_at parameter is set to the particular res_counter element
where the charging failed.
d. int res_counter_charge_locked
(struct res_counter *rc, unsigned long val)
The same as res_counter_charge(), but it must not acquire/release the
res_counter->lock internally (it must be called with res_counter->lock
held).
e. void res_counter_uncharge[_locked]
(struct res_counter *rc, unsigned long val)
When a resource is released (freed) it should be de-accounted
from the resource counter it was accounted to. This is called
"uncharging".
The _locked routines imply that the res_counter->lock is taken.
The _locked routines imply that the res_counter->lock is taken.
2.1 Other accounting routines
+59
View File
@@ -169,3 +169,62 @@ three different ways to find such a match:
be probed later if another device registers. (Which is OK, since
this interface is only for use with non-hotpluggable devices.)
Early Platform Devices and Drivers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The early platform interfaces provide platform data to platform device
drivers early on during the system boot. The code is built on top of the
early_param() command line parsing and can be executed very early on.
Example: "earlyprintk" class early serial console in 6 steps
1. Registering early platform device data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The architecture code registers platform device data using the function
early_platform_add_devices(). In the case of early serial console this
should be hardware configuration for the serial port. Devices registered
at this point will later on be matched against early platform drivers.
2. Parsing kernel command line
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The architecture code calls parse_early_param() to parse the kernel
command line. This will execute all matching early_param() callbacks.
User specified early platform devices will be registered at this point.
For the early serial console case the user can specify port on the
kernel command line as "earlyprintk=serial.0" where "earlyprintk" is
the class string, "serial" is the name of the platfrom driver and
0 is the platform device id. If the id is -1 then the dot and the
id can be omitted.
3. Installing early platform drivers belonging to a certain class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The architecture code may optionally force registration of all early
platform drivers belonging to a certain class using the function
early_platform_driver_register_all(). User specified devices from
step 2 have priority over these. This step is omitted by the serial
driver example since the early serial driver code should be disabled
unless the user has specified port on the kernel command line.
4. Early platform driver registration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compiled-in platform drivers making use of early_platform_init() are
automatically registered during step 2 or 3. The serial driver example
should use early_platform_init("earlyprintk", &platform_driver).
5. Probing of early platform drivers belonging to a certain class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The architecture code calls early_platform_driver_probe() to match
registered early platform devices associated with a certain class with
registered early platform drivers. Matched devices will get probed().
This step can be executed at any point during the early boot. As soon
as possible may be good for the serial port case.
6. Inside the early platform driver probe()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The driver code needs to take special care during early boot, especially
when it comes to memory allocation and interrupt registration. The code
in the probe() function can use is_early_platform_device() to check if
it is called at early platform device or at the regular platform device
time. The early serial driver performs register_console() at this point.
For further information, see <linux/platform_device.h>.
@@ -428,3 +428,12 @@ Why: In 2.6.27, the semantics of /sys/bus/pci/slots was redefined to
After a reasonable transition period, we will remove the legacy
fakephp interface.
Who: Alex Chiang <achiang@hp.com>
---------------------------
What: i2c-voodoo3 driver
When: October 2009
Why: Superseded by tdfxfb. I2C/DDC support used to live in a separate
driver but this caused driver conflicts.
Who: Jean Delvare <khali@linux-fr.org>
Krzysztof Helt <krzysztof.h1@wp.pl>
+16 -8
View File
@@ -512,16 +512,24 @@ locking rules:
BKL mmap_sem PageLocked(page)
open: no yes
close: no yes
fault: no yes
page_mkwrite: no yes no
fault: no yes can return with page locked
page_mkwrite: no yes can return with page locked
access: no yes
->page_mkwrite() is called when a previously read-only page is
about to become writeable. The file system is responsible for
protecting against truncate races. Once appropriate action has been
taking to lock out truncate, the page range should be verified to be
within i_size. The page mapping should also be checked that it is not
NULL.
->fault() is called when a previously not present pte is about
to be faulted in. The filesystem must find and return the page associated
with the passed in "pgoff" in the vm_fault structure. If it is possible that
the page may be truncated and/or invalidated, then the filesystem must lock
the page, then ensure it is not already truncated (the page lock will block
subsequent truncate), and then return with VM_FAULT_LOCKED, and the page
locked. The VM will unlock the page.
->page_mkwrite() is called when a previously read-only pte is
about to become writeable. The filesystem again must ensure that there are
no truncate/invalidate races, and then return with the page locked. If
the page has been truncated, the filesystem should not look up a new page
like the ->fault() handler, but simply return with VM_FAULT_NOPAGE, which
will cause the VM to retry the fault.
->access() is called when get_user_pages() fails in
acces_process_vm(), typically used to debug a process through
@@ -407,7 +407,7 @@ A NOTE ON SECURITY
==================
CacheFiles makes use of the split security in the task_struct. It allocates
its own task_security structure, and redirects current->act_as to point to it
its own task_security structure, and redirects current->cred to point to it
when it acts on behalf of another process, in that process's context.
The reason it does this is that it calls vfs_mkdir() and suchlike rather than
@@ -429,9 +429,9 @@ This means it may lose signals or ptrace events for example, and affects what
the process looks like in /proc.
So CacheFiles makes use of a logical split in the security between the
objective security (task->sec) and the subjective security (task->act_as). The
objective security holds the intrinsic security properties of a process and is
never overridden. This is what appears in /proc, and is what is used when a
objective security (task->real_cred) and the subjective security (task->cred).
The objective security holds the intrinsic security properties of a process and
is never overridden. This is what appears in /proc, and is what is used when a
process is the target of an operation by some other process (SIGKILL for
example).
@@ -56,9 +56,10 @@ workloads and can fully utilize the bandwidth to the servers when doing bulk
data transfers.
POHMELFS clients operate with a working set of servers and are capable of balancing read-only
operations (like lookups or directory listings) between them.
operations (like lookups or directory listings) between them according to IO priorities.
Administrators can add or remove servers from the set at run-time via special commands (described
in Documentation/pohmelfs/info.txt file). Writes are replicated to all servers.
in Documentation/pohmelfs/info.txt file). Writes are replicated to all servers, which are connected
with write permission turned on. IO priority and permissions can be changed in run-time.
POHMELFS is capable of full data channel encryption and/or strong crypto hashing.
One can select any kernel supported cipher, encryption mode, hash type and operation mode
+17 -4
View File
@@ -1,6 +1,8 @@
POHMELFS usage information.
Mount options:
Mount options.
All but index, number of crypto threads and maximum IO size can changed via remount.
idx=%u
Each mountpoint is associated with a special index via this option.
Administrator can add or remove servers from the given index, so all mounts,
@@ -52,16 +54,27 @@ mcache_timeout=%u
Usage examples.
Add (or remove if it already exists) server server1.net:1025 into the working set with index $idx
Add server server1.net:1025 into the working set with index $idx
with appropriate hash algorithm and key file and cipher algorithm, mode and key file:
$cfg -a server1.net -p 1025 -i $idx -K $hash_key -k $cipher_key
$cfg A add -a server1.net -p 1025 -i $idx -K $hash_key -k $cipher_key
Mount filesystem with given index $idx to /mnt mountpoint.
Client will connect to all servers specified in the working set via previous command:
mount -t pohmel -o idx=$idx q /mnt
One can add or remove servers from working set after mounting too.
Change permissions to read-only (-I 1 option, '-I 2' - write-only, 3 - rw):
$cfg A modify -a server1.net -p 1025 -i $idx -I 1
Change IO priority to 123 (node with the highest priority gets read requests).
$cfg A modify -a server1.net -p 1025 -i $idx -P 123
One can check currect status of all connections in the mountstats file:
# cat /proc/$PID/mountstats
...
device none mounted on /mnt with fstype pohmel
idx addr(:port) socket_type protocol active priority permissions
0 server1.net:1026 1 6 1 250 1
0 server2.net:1025 1 6 1 123 3
Server installation.
+1 -1
View File
@@ -133,4 +133,4 @@ RAM/SWAP in 10240 inodes and it is only accessible by root.
Author:
Christoph Rohland <cr@sap.com>, 1.12.01
Updated:
Hugh Dickins <hugh@veritas.com>, 4 June 2007
Hugh Dickins, 4 June 2007
+1 -2
View File
@@ -277,8 +277,7 @@ or bottom half).
unfreeze_fs: called when VFS is unlocking a filesystem and making it writable
again.
statfs: called when the VFS needs to get filesystem statistics. This
is called with the kernel lock held
statfs: called when the VFS needs to get filesystem statistics.
remount_fs: called when the filesystem is remounted. This is called
with the kernel lock held
+6
View File
@@ -150,6 +150,11 @@ fan[1-*]_min Fan minimum value
Unit: revolution/min (RPM)
RW
fan[1-*]_max Fan maximum value
Unit: revolution/min (RPM)
Only rarely supported by the hardware.
RW
fan[1-*]_input Fan input value.
Unit: revolution/min (RPM)
RO
@@ -390,6 +395,7 @@ OR
in[0-*]_min_alarm
in[0-*]_max_alarm
fan[1-*]_min_alarm
fan[1-*]_max_alarm
temp[1-*]_min_alarm
temp[1-*]_max_alarm
temp[1-*]_crit_alarm
+45
View File
@@ -24,6 +24,49 @@ Partitions and P_Keys
The P_Key for any interface is given by the "pkey" file, and the
main interface for a subinterface is in "parent."
Datagram vs Connected modes
The IPoIB driver supports two modes of operation: datagram and
connected. The mode is set and read through an interface's
/sys/class/net/<intf name>/mode file.
In datagram mode, the IB UD (Unreliable Datagram) transport is used
and so the interface MTU has is equal to the IB L2 MTU minus the
IPoIB encapsulation header (4 bytes). For example, in a typical IB
fabric with a 2K MTU, the IPoIB MTU will be 2048 - 4 = 2044 bytes.
In connected mode, the IB RC (Reliable Connected) transport is used.
Connected mode is to takes advantage of the connected nature of the
IB transport and allows an MTU up to the maximal IP packet size of
64K, which reduces the number of IP packets needed for handling
large UDP datagrams, TCP segments, etc and increases the performance
for large messages.
In connected mode, the interface's UD QP is still used for multicast
and communication with peers that don't support connected mode. In
this case, RX emulation of ICMP PMTU packets is used to cause the
networking stack to use the smaller UD MTU for these neighbours.
Stateless offloads
If the IB HW supports IPoIB stateless offloads, IPoIB advertises
TCP/IP checksum and/or Large Send (LSO) offloading capability to the
network stack.
Large Receive (LRO) offloading is also implemented and may be turned
on/off using ethtool calls. Currently LRO is supported only for
checksum offload capable devices.
Stateless offloads are supported only in datagram mode.
Interrupt moderation
If the underlying IB device supports CQ event moderation, one can
use ethtool to set interrupt mitigation parameters and thus reduce
the overhead incurred by handling interrupts. The main code path of
IPoIB doesn't use events for TX completion signaling so only RX
moderation is supported.
Debugging Information
By compiling the IPoIB driver with CONFIG_INFINIBAND_IPOIB_DEBUG set
@@ -55,3 +98,5 @@ References
http://ietf.org/rfc/rfc4391.txt
IP over InfiniBand (IPoIB) Architecture (RFC 4392)
http://ietf.org/rfc/rfc4392.txt
IP over InfiniBand: Connected Mode (RFC 4755)
http://ietf.org/rfc/rfc4755.txt

Some files were not shown because too many files have changed in this diff Show More