Commit Graph

17224 Commits

Author SHA1 Message Date
Christoph Hellwig
05da080482 efs: new export ops
Trivial switch over to the new generic helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:20 -07:00
Christoph Hellwig
2596110a39 exportfs: add new methods
Add the guts for the new filesystem API to exportfs.

There's now a fh_to_dentry method that returns a dentry for the object looked
for given a filehandle fragment, and a fh_to_parent operation that returns the
dentry for the encoded parent directory in case the file handle contains it.

There are default implementations for these methods that only take a callback
for an nfs-enhanced iget variant and implement the rest of the semantics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:19 -07:00
Christoph Hellwig
6e91ea2bb0 exportfs: add fid type
This patchset is a medium scale rewrite of the export operations interface.
The goal is to make the interface less complex, and easier to understand from
the filesystem side, aswell as preparing generic support for exporting of
64bit inode numbers.

This touches all nfs exporting filesystems, and I've done testing on all of
the filesystems I have here locally (xfs, ext2, ext3, reiserfs, jfs)

This patch:

Add a structured fid type so that we don't have to pass an array of u32 values
around everywhere.  It's a union of possible layouts.

As a start there's only the u32 array and the traditional 32bit inode format,
but there will be more in one of my next patchset when I start to document the
various filehandle formats we have in lowlevel filesystems better.

Also add an enum that gives the various filehandle types human- readable
names.

Note: Some people might think the struct containing an anonymous union is
ugly, but I didn't want to pass around a raw union type.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:19 -07:00
Bernhard Walle
00bf4098be kexec: add BSS to resource tree
Add the BSS to the resource tree just as kernel text and kernel data are in
the resource tree.  The main reason behind this is to avoid crashkernel
reservation in that area.

While it's not strictly necessary to have the BSS in the resource tree (the
actual collision detection is done in the reserve_bootmem() function before),
the usage of the BSS resource should be presented to the user in /proc/iomem
just as Kernel data and Kernel code.

Note: The patch currently is only implemented for x86 and ia64 (because
efi_initialize_iomem_resources() has the same signature on i386 and ia64).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:19 -07:00
Keshavamurthy, Anil S
358dd8ac53 intel-iommu: fix for IOMMU early crash
pci_dev's->sysdata is highly overloaded and currently IOMMU is broken due
to IOMMU code depending on this field.

This patch introduces new field in pci_dev's dev.archdata struct to hold
IOMMU specific per device IOMMU private data.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:19 -07:00
Keshavamurthy, Anil S
3460a6d9ce Intel IOMMU: DMAR fault handling support
MSI interrupt handler registrations and fault handling support for Intel-IOMMU
hadrware.

This patch enables the MSI interrupts for the DMA remapping units and in the
interrupt handler read the fault cause and outputs the same on to the console.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:19 -07:00
Keshavamurthy, Anil S
ba39592764 Intel IOMMU: Intel IOMMU driver
Actual intel IOMMU driver.  Hardware spec can be found at:
http://www.intel.com/technology/virtualization

This driver sets X86_64 'dma_ops', so hook into standard DMA APIs.  In this
way, PCI driver will get virtual DMA address.  This change is transparent to
PCI drivers.

[akpm@linux-foundation.org: remove unneeded cast]
[akpm@linux-foundation.org: build fix]
[bunk@stusta.de: fix duplicate CONFIG_DMAR Makefile line]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S
a9c55b3ba8 Intel IOMMU: clflush_cache_range now takes size param
Introduce the size param for clflush_cache_range().

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S
994a65e25d Intel IOMMU: PCI generic helper function
When devices are under a p2p bridge, upstream transactions get replaced by the
device id of the bridge as it owns the PCIE transaction.  Hence its necessary
to setup translations on behalf of the bridge as well.  Due to this limitation
all devices under a p2p share the same domain in a DMAR.

We just cache the type of device, if its a native PCIe device
or not for later use.

[akpm@linux-foundation.org: BUG_ON -> WARN_ON+recover]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S
10e5247f40 Intel IOMMU: DMAR detection and parsing logic
This patch supports the upcomming Intel IOMMU hardware a.k.a.  Intel(R)
Virtualization Technology for Directed I/O Architecture and the hardware spec
for the same can be found here
http://www.intel.com/technology/virtualization/index.htm

FAQ! (questions from akpm, answers from ak)

> So...  what's all this code for?
>
> I assume that the intent here is to speed things up under Xen, etc?

Yes in some cases, but not this code.  That would be the Xen version of this
code that could potentially assign whole devices to guests.  I expect this to
be only useful in some special cases though because most hardware is not
virtualizable and you typically want an own instance for each guest.

Ok at some point KVM might implement this too; i likely would use this code
for this.

> Do we
> have any benchmark results to help us to decide whether a merge would be
> justified?

The main advantage for doing it in the normal kernel is not performance, but
more safety.  Broken devices won't be able to corrupt memory by doing random
DMA.

Unfortunately that doesn't work for graphics yet, for that need user space
interfaces for the X server are needed.

There are some potential performance benefits too:

- When you have a device that cannot address the complete address range an
  IOMMU can remap its memory instead of bounce buffering.  Remapping is likely
  cheaper than copying.

- The IOMMU can merge sg lists into a single virtual block.  This could
  potentially speed up SG IO when the device is slow walking SG lists.  [I
  long ago benchmarked 5% on some block benchmark with an old MPT Fusion; but
  it probably depends a lot on the HBA]

And you get better driver debugging because unexpected memory accesses from
the devices will cause a trappable event.

>
> Does it slow anything down?

It adds more overhead to each IO so yes.

This patch:

Add support for early detection and parsing of DMAR's (DMA Remapping) reported
to OS via ACPI tables.

DMA remapping(DMAR) devices support enables independent address translations
for Direct Memory Access(DMA) from Devices.  These DMA remapping devices are
reported via ACPI tables and includes pci device scope covered by these DMA
remapping device.

For detailed info on the specification of "Intel(R) Virtualization Technology
for Directed I/O Architecture" please see
http://www.intel.com/technology/virtualization/index.htm

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Jan Kara
89910cccb8 ext2: avoid rec_len overflow with 64KB block size
With 64KB blocksize, a directory entry can have size 64KB which does not
fit into 16 bits we have for entry length.  So we store 0xffff instead and
convert the value when read from / written to disk.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Serge E. Hallyn
b68680e473 capabilities: clean up file capability reading
Simplify the vfs_cap_data structure.

Also fix get_file_caps which was declaring
__le32 v1caps[XATTR_CAPS_SZ] on the stack, but
XATTR_CAPS_SZ is already * sizeof(__le32).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Andrew Morgan <morgan@kernel.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Yasunori Goto
b9049e2344 memory hotplug: make kmem_cache_node for SLUB on memory online avoid panic
Fix a panic due to access NULL pointer of kmem_cache_node at discard_slab()
after memory online.

When memory online is called, kmem_cache_nodes are created for all SLUBs
for new node whose memory are available.

slab_mem_going_online_callback() is called to make kmem_cache_node() in
callback of memory online event.  If it (or other callbacks) fails, then
slab_mem_offline_callback() is called for rollback.

In memory offline, slab_mem_going_offline_callback() is called to shrink
all slub cache, then slab_mem_offline_callback() is called later.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: locking fix]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:17 -07:00
Yasunori Goto
7b78d335ac memory hotplug: rearrange memory hotplug notifier
Current memory notifier has some defects yet.  (Fortunately, nothing uses
it.) This patch is to fix and rearrange for them.

  - Add information of start_pfn, nr_pages, and node id if node status is
    changes from/to memoryless node for callback functions.
    Callbacks can't do anything without those information.
  - Add notification going-online status.
    It is necessary for creating per node structure before the node's
    pages are available.
  - Move GOING_OFFLINE status notification after page isolation.
    It is good place for return memory like cache for callback,
    because returned page is not used again.
  - Make CANCEL events for rollingback when error occurs.
  - Delete MEM_MAPPING_INVALID notification. It will be not used.
  - Fix compile error of (un)register_memory_notifier().

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:17 -07:00
Rusty Russell
214541d1f3 add WEAK() for creating weak asm labels
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:17 -07:00
Rusty Russell
e5371ac566 update boot spec to 2.07
Updates for version 2.07 of the boot protocol.  This includes:

load_flags.KEEP_SEGMENTS- flag to request/inhibit segment reloads
hardware_subarch	- what subarchitecture we're booting under
hardware_subarch_data	- per-architecture data

The intention of these changes is to make booting a paravirtualized
kernel work via the normal Linux boot protocol.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:17 -07:00
Linus Torvalds
efea90a454 Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
  Blackfin arch: update boards files
  Blackfin arch: dma add some API and cleanup bf54x DMA definition
  Blackfin arch: cleanup and promote the general purpose timers api to a core blackfin component
  Blackfin arch: add a cheesy install target
  Blackfin arch: add functions for converting between sclks and usecs
  Blackfin arch: add assembly function for doing 64bit unsigned division
  Blackfin arch: -mno-fdpic works
  Blackfin arch: use "char bfin_board_name[]" rather than "char *bfin_board_name" per discussion on lkml as the former uses less storage
  Blackfin arch: Fixing Bug: balance calls to get_task_mm with corresponding mmput calls
  Blackfin serial driver Kconfig: depend on DMA not being enabled rather than a specific DMA size
  Blackfin arch: Fix bug: missing CHIPID register field definition of BF54x
  Blackfin arch: Fix up /proc/cpuinfo so it is like everyone else
  Blackfin arch: Optimization - no need to make additional math here
  Blackfin arch: force irq_flags into the .data section
  Blackfin arch BF548 defconfig: enable watchdog by default
  Blackfin arch: add new processor ADSP-BF52x arch/mach support
2007-10-21 09:57:55 -07:00
Bryan Wu
452af71f36 Blackfin arch: dma add some API and cleanup bf54x DMA definition
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
2007-10-22 00:02:14 +08:00
Mike Frysinger
780431e397 Blackfin arch: cleanup and promote the general purpose timers api to a core blackfin component
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
2007-10-21 23:37:54 +08:00
Mike Frysinger
2f6cf7bfc6 Blackfin arch: add functions for converting between sclks and usecs
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
2007-10-21 22:59:49 +08:00
Mike Frysinger
066954a389 Blackfin arch: use "char bfin_board_name[]" rather than "char *bfin_board_name" per discussion on lkml as the former uses less storage
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
2007-10-21 22:36:06 +08:00
Bryan Wu
1e5b24431b Blackfin arch: Fix bug: missing CHIPID register field definition of BF54x
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
2007-10-21 16:58:49 +08:00
Michael Hennerich
590031450a Blackfin arch: add new processor ADSP-BF52x arch/mach support
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
2007-10-21 16:54:27 +08:00
Al Viro
74c3cbe33b [PATCH] audit: watching subtrees
New kind of audit rule predicates: "object is visible in given subtree".
The part that can be sanely implemented, that is.  Limitations:
	* if you have hardlink from outside of tree, you'd better watch
it too (or just watch the object itself, obviously)
	* if you mount something under a watched tree, tell audit
that new chunk should be added to watched subtrees
	* if you umount something in a watched tree and it's still mounted
elsewhere, you will get matches on events happening there.  New command
tells audit to recalculate the trees, trimming such sources of false
positives.

Note that it's _not_ about path - if something mounted in several places
(multiple mount, bindings, different namespaces, etc.), the match does
_not_ depend on which one we are using for access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-10-21 02:37:45 -04:00
Al Viro
455434d450 [PATCH] new helper - inotify_evict_watch()
Kicks the watch out without dropping it.  Called under ->inotify_mutex

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-10-21 02:37:38 -04:00