* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (34 commits)
nfsd race fixes: jfs
nfsd race fixes: reiserfs
nfsd race fixes: ext4
nfsd race fixes: ext3
nfsd race fixes: ext2
nfsd/create race fixes, infrastructure
filesystem notification: create fs/notify to contain all fs notification
fs/block_dev.c: __read_mostly improvement and sb_is_blkdev_sb utilization
kill ->dir_notify()
filp_cachep can be static in fs/file_table.c
fix f_count description in Documentation/filesystems/files.txt
make INIT_FS use the __RW_LOCK_UNLOCKED initialization
take init_fs to saner place
kill vfs_permission
pass a struct path * to may_open
kill walk_init_root
remove incorrect comment in inode_permission
expand some comments (d_path / seq_path)
correct wrong function name of d_put in kernel document and source comment
fix switch_names() breakage in short-to-short case
...
struct dentry is one of the most critical structures in the kernel. So it's
sad to see it going neglected.
With CONFIG_PROFILING turned on (which is probably the common case at least
for distros and kernel developers), sizeof(struct dcache) == 208 here
(64-bit). This gives 19 objects per slab.
I packed d_mounted into a hole, and took another 4 bytes off the inline
name length to take the padding out from the end of the structure. This
shinks it to 200 bytes. I could have gone the other way and increased the
length to 40, but I'm aiming for a magic number, read on...
I then got rid of the d_cookie pointer. This shrinks it to 192 bytes. Rant:
why was this ever a good idea? The cookie system should increase its hash
size or use a tree or something if lookups are a problem. Also the "fast
dcookie lookups" in oprofile should be moved into the dcookie code -- how
can oprofile possibly care about the dcookie_mutex? It gets dropped after
get_dcookie() returns so it can't be providing any sort of protection.
At 192 bytes, 21 objects fit into a 4K page, saving about 3MB on my system
with ~140 000 entries allocated. 192 is also a multiple of 64, so we get
nice cacheline alignment on 64 and 32 byte line systems -- any given dentry
will now require 3 cachelines to touch all fields wheras previously it
would require 4.
I know the inline name size was chosen quite carefully, however with the
reduction in cacheline footprint, it should actually be just about as fast
to do a name lookup for a 36 character name as it was before the patch (and
faster for other sizes). The memory footprint savings for names which are
<= 32 or > 36 bytes long should more than make up for the memory cost for
33-36 byte names.
Performance is a feature...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'irq-fixes-for-linus-4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sparseirq: move __weak symbols into separate compilation unit
sparseirq: work around __weak alias bug
sparseirq: fix hang with !SPARSE_IRQ
sparseirq: set lock_class for legacy irq when sparse_irq is selected
sparseirq: work around compiler optimizing away __weak functions
sparseirq: fix desc->lock init
sparseirq: do not printk when migrating IRQ descriptors
sparseirq: remove duplicated arch_early_irq_init()
irq: simplify for_each_irq_desc() usage
proc: remove ifdef CONFIG_SPARSE_IRQ from stat.c
irq: for_each_irq_desc() move to irqnr.h
hrtimer: remove #include <linux/irq.h>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: struct device - replace bus_id with dev_name()
lguest: move the initial guest page table creation code to the host
kvm-s390: implement config_changed for virtio on s390
virtio_console: support console resizing
virtio: add PCI device release() function
virtio_blk: fix type warning
virtio: block: dynamic maximum segments
virtio: set max_segment_size and max_sectors to infinite.
virtio: avoid implicit use of Linux page size in balloon interface
virtio: hand virtio ring alignment as argument to vring_new_virtqueue
virtio: use KVM_S390_VIRTIO_RING_ALIGN instead of relying on pagesize
virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize
virtio: Don't use PAGE_SIZE for vring alignment in virtio_pci.
virtio: rename 'pagesize' arg to vring_init/vring_size
virtio: Don't use PAGE_SIZE in virtio_pci.c
virtio: struct device - replace bus_id with dev_name(), dev_set_name()
virtio-pci queue allocation not page-aligned
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (407 commits)
[ARM] pxafb: add support for overlay1 and overlay2 as framebuffer devices
[ARM] pxafb: cleanup of the timing checking code
[ARM] pxafb: cleanup of the color format manipulation code
[ARM] pxafb: add palette format support for LCCR4_PAL_FOR_3
[ARM] pxafb: add support for FBIOPAN_DISPLAY by dma braching
[ARM] pxafb: allow pxafb_set_par() to start from arbitrary yoffset
[ARM] pxafb: allow video memory size to be configurable
[ARM] pxa: add document on the MFP design and how to use it
[ARM] sa1100_wdt: don't assume CLOCK_TICK_RATE to be a constant
[ARM] rtc-sa1100: don't assume CLOCK_TICK_RATE to be a constant
[ARM] pxa/tavorevb: update board support (smartpanel LCD + keypad)
[ARM] pxa: Update eseries defconfig
[ARM] 5352/1: add w90p910-plat config file
[ARM] s3c: S3C options should depend on PLAT_S3C
[ARM] mv78xx0: implement GPIO and GPIO interrupt support
[ARM] Kirkwood: implement GPIO and GPIO interrupt support
[ARM] Orion: share GPIO IRQ handling code
[ARM] Orion: share GPIO handling code
[ARM] s3c: define __io using the typesafe version
[ARM] S3C64XX: Ensure CPU_V6 is selected
...
* 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
oprofile: select RING_BUFFER
ring_buffer: adding EXPORT_SYMBOLs
oprofile: fix lost sample counter
oprofile: remove nr_available_slots()
oprofile: port to the new ring_buffer
ring_buffer: add remaining cpu functions to ring_buffer.h
oprofile: moving cpu_buffer_reset() to cpu_buffer.h
oprofile: adding cpu_buffer_entries()
oprofile: adding cpu_buffer_write_commit()
oprofile: adding cpu buffer r/w access functions
ftrace: remove unused function arg in trace_iterator_increment()
ring_buffer: update description for ring_buffer_alloc()
oprofile: set values to default when creating oprofilefs
oprofile: implement switch/case in buffer_sync.c
x86/oprofile: cleanup IBS init/exit functions in op_model_amd.c
x86/oprofile: reordering IBS code in op_model_amd.c
oprofile: fix typo
oprofile: whitspace changes only
oprofile: update comment for oprofile_add_sample()
oprofile: comment cleanup
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (98 commits)
sparc: move select of ARCH_SUPPORTS_MSI
sparc: drop SUN_IO
sparc: unify sections.h
sparc: use .data.init_task section for init_thread_union
sparc: fix array overrun check in of_device_64.c
sparc: unify module.c
sparc64: prepare module_64.c for unification
sparc64: use bit neutral Elf symbols
sparc: unify module.h
sparc: introduce CONFIG_BITS
sparc: fix hardirq.h removal fallout
sparc64: do not export pus_fs_struct
sparc: use sparc64 version of scatterlist.h
sparc: Commonize memcmp assembler.
sparc: Unify strlen assembler.
sparc: Add asm/asm.h
sparc: Kill memcmp_32.S code which has been ifdef'd out for centuries.
sparc: replace for_each_cpu_mask_nr with for_each_cpu
sparc: fix sparse warnings in irq_32.c
sparc: add include guards to kernel.h
...
* 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block: (43 commits)
bio: get rid of bio_vec clearing
bounce: don't rely on a zeroed bio_vec list
cciss: simplify parameters to deregister_disk function
cfq-iosched: fix race between exiting queue and exiting task
loop: Do not call loop_unplug for not configured loop device.
loop: Flush possible running bios when loop device is released.
alpha: remove dead BIO_VMERGE_BOUNDARY
Get rid of CONFIG_LSF
block: make blk_softirq_init() static
block: use min_not_zero in blk_queue_stack_limits
block: add one-hit cache for disk partition lookup
cfq-iosched: remove limit of dispatch depth of max 4 times quantum
nbd: tell the block layer that it is not a rotational device
block: get rid of elevator_t typedef
aio: make the lookup_ioctx() lockless
bio: add support for inlining a number of bio_vecs inside the bio
bio: allow individual slabs in the bio_set
bio: move the slab pointer inside the bio_set
bio: only mempool back the largest bio_vec slab cache
block: don't use plugging on SSD devices
...
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, sparseirq: clean up Kconfig entry
x86: turn CONFIG_SPARSE_IRQ off by default
sparseirq: fix numa_migrate_irq_desc dependency and comments
sparseirq: add kernel-doc notation for new member in irq_desc, -v2
locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP
sparseirq, xen: make sure irq_desc is allocated for interrupts
sparseirq: fix !SMP building, #2
x86, sparseirq: move irq_desc according to smp_affinity, v7
proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ
sparse irqs: add irqnr.h to the user headers list
sparse irqs: handle !GENIRQ platforms
sparseirq: fix !SMP && !PCI_MSI && !HT_IRQ build
sparseirq: fix Alpha build failure
sparseirq: fix typo in !CONFIG_IO_APIC case
x86, MSI: pass irq_cfg and irq_desc
x86: MSI start irq numbering from nr_irqs_gsi
x86: use NR_IRQS_LEGACY
sparse irq_desc[] array: core kernel and x86 changes
genirq: record IRQ_LEVEL in irq_desc[]
irq.h: remove padding from irq_desc on 64bits
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimers: fix warning in kernel/hrtimer.c
x86: make sure we really have an hpet mapping before using it
x86: enable HPET on Fujitsu u9200
linux/timex.h: cleanup for userspace
posix-timers: simplify de_thread()->exit_itimers() path
posix-timers: check ->it_signal instead of ->it_pid to validate the timer
posix-timers: use "struct pid*" instead of "struct task_struct*"
nohz: suppress needless timer reprogramming
clocksource, acpi_pm.c: put acpi_pm_read_slow() under CONFIG_PCI
nohz: no softirq pending warnings for offline cpus
hrtimer: removing all ur callback modes, fix
hrtimer: removing all ur callback modes, fix hotplug
hrtimer: removing all ur callback modes
x86: correct link to HPET timer specification
rtc-cmos: export second NVRAM bank
Fixed up conflicts in sound/drivers/pcsp/pcsp.c and sound/core/hrtimer.c
manually.
* 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
stacktrace: provide save_stack_trace_tsk() weak alias
rcu: provide RCU options on non-preempt architectures too
printk: fix discarding message when recursion_bug
futex: clean up futex_(un)lock_pi fault handling
"Tree RCU": scalable classic RCU implementation
futex: rename field in futex_q to clarify single waiter semantics
x86/swiotlb: add default swiotlb_arch_range_needs_mapping
x86/swiotlb: add default phys<->bus conversion
x86: unify pci iommu setup and allow swiotlb to compile for 32 bit
x86: add swiotlb allocation functions
swiotlb: consolidate swiotlb info message printing
swiotlb: support bouncing of HighMem pages
swiotlb: factor out copy to/from device
swiotlb: add arch hook to force mapping
swiotlb: allow architectures to override phys<->bus<->phys conversions
swiotlb: add comment where we handle the overflow of a dma mask on 32 bit
rcu: fix rcutorture behavior during reboot
resources: skip sanity check of busy resources
swiotlb: move some definitions to header
swiotlb: allow architectures to override swiotlb pool allocation
...
Fix up trivial conflicts in
arch/x86/kernel/Makefile
arch/x86/mm/init_32.c
include/linux/hardirq.h
as per Ingo's suggestions.
This patch moves the initial guest page table creation code to the host,
so the launcher keeps working with PAE enabled configs.
Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
PXA27x and later processors support overlay1 and overlay2 on-top of the
base framebuffer (although under-neath the base is also possible). They
support palette and no-palette RGB formats, as well as YUV formats (only
available on overlay2). These overlays have dedicated DMA channels and
behave in a similar way as a framebuffer.
This heavily simplified and re-structured work is based on the original
pxafb_overlay.c (which is pending for mainline merge for a long time).
The major problems with this pxafb_overlay.c are (if you are interested
in the history):
1. heavily redundant (the control logics for overlay1 and overlay2 are
actually identical except for some small operations, which are now
abstracted into a 'pxafb_layer_ops' structure)
2. a lot of useless and un-tested code (two workarounds which are now
fixed on mature silicons)
3. cursorfb is actually useless, hardware cursor should not be used
this way, and the code was actually un-tested for a long time.
The code in this patch should be self-explanatory, I tried to add minimum
comments. As said, this is basically simplified, there are several things
still on the pending list:
1. palette mode is un-supported and un-tested (although re-using the
palette code of the base framebuffer is actually very easy now with
previous clean-up patches)
2. fb_pan_display for overlay(s) is un-supported
3. the base framebuffer can actually be abstracted by 'pxafb_layer' as
well, which will help further re-use of the code and keep a better
and consistent structure. (This is the reason I named it 'pxafb_layer'
instead of 'pxafb_overlay' or something alike)
See Documentation/fb/pxafb.txt for additional usage information.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Cc: Rodolfo Giometti <giometti@linux.it>
Signed-off-by: Eric Miao <ycmiao@ycmiao-hp520.(none)>
1. introduce var_to_depth() to calculate the color depth including the
transparency bit
2. the conversion from 'fb_var_screeninfo' to LCCR3 BPP bits can be re-
used by overlays (in OVLxC1), thus an individual pxafb_var_to_bpp()
has been separated out.
3. pxafb_setmode() should really set the color bitfields correctly at
begining, introduce a pxafb_set_pixfmt() for this
4. allow user apps to specify color formats within fb_var_screeninfo,
and checking of this in pxafb_check_var() has been simplified as
below:
a) pxafb_var_to_bpp() should pass - which means a basically correct
bits_per_pixel and color depth setting
b) the RGBT bitfields are then forced into supported values by
pxafb_set_pixfmt()
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Eric Miao <ycmiao@ycmiao-hp520.(none)>
Add the palette format support for LCCR4_PAL_FOR_3, and fix the
issue of LCCR4 being never assigned.
Also remove the useless pxafb_set_truecolor().
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Eric Miao <ycmiao@ycmiao-hp520.(none)>
dma branching is enabled by extending the current setup_frame_dma()
function to allow a 2nd set of frame/palette dma descriptors to be
used.
As a result, pxafb_dma_buff.dma_desc[], pxafb_dma_buff.pal_desc[]
and pxafb_info.fdadr[] are doubled.
This allows maximum re-use of the current dma setup code, although
the pxafb_info.fdadr[xx] for FBRx register values looks a bit odd.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Eric Miao <ycmiao@ycmiao-hp520.(none)>
The amount of video memory size is decided according to the following
order:
1. <xres> x <yres> x <bits_per_pixel> by default, which is the backward
compatible way
2. size specified in platform data
3. size specified in module parameter 'options' string or specified in
kernel boot command line (see updated Documentation/fb/pxafb.txt)
And now since the memory is allocated from system memory, the pxafb_mmap
can be removed and the default fb_mmap() should be working all right.
Also, since we now have introduced the 'struct pxafb_dma_buff' for DMA
descriptors and palettes, the allocation can be separated cleanly.
NOTE: the LCD DMA actually supports chained transfer (i.e. page-based
transfers), to simplify the logic and keep the performance (with less
TLB misses when accessing from memory mapped user space), the memory
is allocated by alloc_pages_*() to ensures it's physical contiguous.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Eric Miao <ycmiao@ycmiao-hp520.(none)>
As Nicolas and Russell pointed out, CLOCK_TICK_RATE is no more
a constant on PXA when multiple processors and platforms are
selected, change TIMER_FREQ in rtc-sa1100.c into a variable.
Since the code to decide the clock tick rate is re-used from
timer.c, introduce a common get_clock_tick_rate() for this.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Acked-by: Nicolas Pitre <nico@marvell.com>
The mm->ioctx_list is currently protected by a reader-writer lock,
so we always grab that lock on the read side for doing ioctx
lookups. As the workload is extremely reader biased, turn this into
an rcu hlist so we can make lookup_ioctx() lockless. Get rid of
the rwlock and use a spinlock for providing update side exclusion.
There's usually only 1 entry on this list, so it doesn't make sense
to look into fancier data structures.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>