kernel

mirror of https://github.com/ukui/kernel.git synced 2026-03-09 10:07:04 -07:00

Author	SHA1	Message	Date
Kees Cook	b66c598401	exec: do not leave bprm->interp on stack If a series of scripts are executed, each triggering module loading via unprintable bytes in the script header, kernel stack contents can leak into the command line. Normally execution of binfmt_script and binfmt_misc happens recursively. However, when modules are enabled, and unprintable bytes exist in the bprm->buf, execution will restart after attempting to load matching binfmt modules. Unfortunately, the logic in binfmt_script and binfmt_misc does not expect to get restarted. They leave bprm->interp pointing to their local stack. This means on restart bprm->interp is left pointing into unused stack memory which can then be copied into the userspace argv areas. After additional study, it seems that both recursion and restart remains the desirable way to handle exec with scripts, misc, and modules. As such, we need to protect the changes to interp. This changes the logic to require allocation for any changes to the bprm->interp. To avoid adding a new kmalloc to every exec, the default value is left as-is. Only when passing through binfmt_script or binfmt_misc does an allocation take place. For a proof of concept, see DoTest.sh from: http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/ Signed-off-by: Kees Cook <keescook@chromium.org> Cc: halfdog <me@halfdog.net> Cc: P J P <ppandit@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-20 17:40:19 -08:00
Linus Torvalds	787314c35f	Merge tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "A few new features this merge-window. The most important one is probably, that dma-debug now warns if a dma-handle is not checked with dma_mapping_error by the device driver. This requires minor changes to some architectures which make use of dma-debug. Most of these changes have the respective Acks by the Arch-Maintainers. Besides that there are updates to the AMD IOMMU driver for refactor the IOMMU-Groups support and to make sure it does not trigger a hardware erratum. The OMAP changes (for which I pulled in a branch from Tony Lindgren's tree) have a conflict in linux-next with the arm-soc tree. The conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is deleted in the arm-soc tree. It is safe to delete the file too so solve the conflict. Similar changes are done in the arm-soc tree in the common clock framework migration. A missing hunk from the patch in the IOMMU tree will be submitted as a seperate patch when the merge-window is closed." * tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits) ARM: dma-mapping: support debug_dma_mapping_error ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks iommu/omap: Adapt to runtime pm iommu/omap: Migrate to hwmod framework iommu/omap: Keep mmu enabled when requested iommu/omap: Remove redundant clock handling on ISR iommu/amd: Remove obsolete comment iommu/amd: Don't use 512GB pages iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch iommu/tegra: gart: Move bus_set_iommu after probe for multi arch iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all tile: dma_debug: add debug_dma_mapping_error support sh: dma_debug: add debug_dma_mapping_error support powerpc: dma_debug: add debug_dma_mapping_error support mips: dma_debug: add debug_dma_mapping_error support microblaze: dma-mapping: support debug_dma_mapping_error ia64: dma_debug: add debug_dma_mapping_error support c6x: dma_debug: add debug_dma_mapping_error support ARM64: dma_debug: add debug_dma_mapping_error support intel-iommu: Prevent devices with RMRRs from being placed into SI Domain ...	2012-12-20 10:07:25 -08:00
Linus Torvalds	b7dfde956d	Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio update from Rusty Russell: "Some nice cleanups, and even a patch my wife did as a "live" demo for Latinoware 2012. There's a slightly non-trivial merge in virtio-net, as we cleaned up the virtio add_buf interface while DaveM accepted the mq virtio-net patches." * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits) virtio_console: Add support for remoteproc serial virtio_console: Merge struct buffer_token into struct port_buffer virtio: add drv_to_virtio to make code clearly virtio: use dev_to_virtio wrapper in virtio virtio-mmio: Fix irq parsing in command line parameter virtio_console: Free buffers from out-queue upon close virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>( virtio_console: Use kmalloc instead of kzalloc virtio_console: Free buffer if splice fails virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0 virtio: make virtqueue_add_buf() returning 0 on success, not capacity. virtio: console: don't rely on virtqueue_add_buf() returning capacity. virtio_net: don't rely on virtqueue_add_buf() returning capacity. virtio-net: remove unused skb_vnet_hdr->num_sg field virtio-net: correct capacity math on ring full virtio: move queue_index and num_free fields into core struct virtqueue. ...	2012-12-20 08:37:05 -08:00
Linus Torvalds	1ffab3d413	Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "This is a batch of fixes for arm-soc platforms, most of it is for OMAP but there are others too (i.MX, Tegra, ep93xx). Fixes warnings, some broken platforms and drivers, etc. A bit all over the map really." There was some concern about commit `68136b10` ("RM: sunxi: Change device tree naming scheme for sunxi"), but Tony says: "Looks like that's trivial to fix as needed, no need to rebuild the branch to fix that AFAIK. The fix can be done once Olof is available online again. Linus, I suggest that you go ahead and pull this if there are no other issues with this branch." * tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (32 commits) ARM: sunxi: Change device tree naming scheme for sunxi ARM: ux500: fix missing include ARM: u300: delete custom pin hog code ARM: davinci: fix build break due to missing include ARM: exynos: Fix warning due to missing 'inline' in stub ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices ARM i.MX51 clock: Fix regression since enabling MIPI/HSP clocks ARM: dts: mx27: Fix the AIPI bus for FEC ARM: OMAP2+: common: remove use of vram ARM: OMAP3/4: cpuidle: fix sparse and checkpatch warnings ARM: OMAP4: clock data: DPLLs are missing bypass clocks in their parent lists ARM: OMAP4: clock data: div_iva_hs_clk is a power-of-two divider ARM: OMAP4: Fix EMU clock domain always on ARM: OMAP4460: Workaround ABE DPLL failing to turn-on ARM: OMAP4: Enhance support for DPLLs with 4X multiplier ARM: OMAP4: Add function table for non-M4X dplls ARM: OMAP4: Update timer clock aliases ARM: OMAP: Move plat/omap-serial.h to include/linux/platform_data/serial-omap.h ARM: dts: Add build target for omap4-panda-a4 ARM: dts: OMAP2420: Correct H4 board memory size ...	2012-12-20 07:21:54 -08:00
Maarten Lankhorst	ada65c7405	dma-buf: remove fallback for !CONFIG_DMA_SHARED_BUFFER Documentation says that code requiring dma-buf should add it to select, so inline fallbacks are not going to be used. A link error will make it obvious what went wrong, instead of silently doing nothing at runtime. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Rob Clark <rob.clark@linaro.org> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>	2012-12-20 12:05:06 +05:30
Linus Torvalds	9eb127cc04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Really fix tuntap SKB use after free bug, from Eric Dumazet. 2) Adjust SKB data pointer to point past the transport header before calling icmpv6_notify() so that the headers are in the state which that function expects. From Duan Jiong. 3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason Wang. 4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov. 5) Don't destroy mutex after freeing up device private in mac802154, fix also from Konstantin Khlebnikov. 6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch. 7) SCTP HMAC kconfig rework, from Neil Horman. 8) Fix SCTP jprobes function signature, otherwise things explode, from Daniel Borkmann. 9) Fix typo in ipv6-offload Makefile variable reference, from Simon Arlott. 10) Don't fail USBNET open just because remote wakeup isn't supported, from Oliver Neukum. 11) be2net driver bug fixes from Sathya Perla. 12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David Woodhouse. 13) Fix MTU changing regression in 8139cp driver, from John Greene. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits) solos-pci: ensure all TX packets are aligned to 4 bytes solos-pci: add firmware upgrade support for new models solos-pci: remove superfluous debug output solos-pci: add GPIO support for newer versions on Geos board 8139cp: Prevent dev_close/cp_interrupt race on MTU change net: qmi_wwan: add ZTE MF880 drivers/net: Use of_match_ptr() macro in smsc911x.c drivers/net: Use of_match_ptr() macro in smc91x.c ipv6: addrconf.c: remove unnecessary "if" bridge: Correctly encode addresses when dumping mdb entries bridge: Do not unregister all PF_BRIDGE rtnl operations use generic usbnet_manage_power() usbnet: generic manage_power() usbnet: handle PM failure gracefully ksz884x: fix receive polling race condition qlcnic: update driver version qlcnic: fix unused variable warnings net: fec: forbid FEC_PTP on SoCs that do not support be2net: fix wrong frag_idx reported by RX CQ be2net: fix be_close() to ensure all events are ack'ed ...	2012-12-19 20:29:15 -08:00
Linus Torvalds	e32795503d	Merge tags 'dt-for-linus', 'gpio-for-linus' and 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6 Pull devicetree, gpio and spi bugfixes from Grant Likely: "Device tree v3.8 bug fix: - Fixes an undefined struct device build error and a missing symbol export. GPIO device driver bug fixes: - gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG - gpio/ich: Add missing spinlock init SPI device driver bug fixes: - Most of this is bug fixes to the core code and the sh-hspi and s3c64xx device drivers. - There is also a patch here to add DT support to the Atmel driver. This one should have been in the first round, but I missed it. It's a low risk change contained within a single driver and the Atmel maintainer has requested it." * tag 'dt-for-linus' of git://git.secretlab.ca/git/linux-2.6: of: define struct device in of_platform.h if !OF_DEVICE and !OF_ADDRESS of: Fix export of of_find_matching_node_and_match() * tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6: gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG gpio/ich: Add missing spinlock init * tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6: spi/sh-hspi: fix return value check in hspi_probe(). spi: fix tegra SPI binding examples spi/atmel: add DT support of/spi: Fix SPI module loading by using proper "spi:" modalias prefixes. spi: Change FIFO flush operation and spi channel off spi: Keep chipselect assertion during one message	2012-12-19 20:26:16 -08:00
Linus Torvalds	ca2a88f56a	Merge tag 'for-linus-20121219' of git://git.infradead.org/linux-mtd Pull MTD updates from David Woodhouse: - Various cleanups especially in NAND tests - Add support for NAND flash on BCMA bus - DT support for sh_flctl and denali NAND drivers - Kill obsolete/superceded drivers (fortunet, nomadik_nand) - Fix JFFS2 locking bug in ENOMEM failure path - New SPI flash chips, as usual - Support writing in 'reliable mode' for DiskOnChip G4 - Debugfs support in nandsim * tag 'for-linus-20121219' of git://git.infradead.org/linux-mtd: (96 commits) mtd: nand: typo in nand_id_has_period() comments mtd: nand/gpio: use io{read,write}*_rep accessors mtd: block2mtd: throttle writes by calling balance_dirty_pages_ratelimited. mtd: nand: gpmi: reset BCH earlier, too, to avoid NAND startup problems mtd: nand/docg4: fix and improve read of factory bbt mtd: nand/docg4: reserve bb marker area in ecclayout mtd: nand/docg4: add support for writing in reliable mode mtd: mxc_nand: reorder part_probes to let cmdline override other sources mtd: mxc_nand: fix unbalanced clk_disable() in error path mtd: nandsim: Introduce debugfs infrastructure mtd: physmap_of: error checking to prevent a NULL pointer dereference mtg: docg3: potential divide by zero in doc_write_oob() mtd: bcm47xxnflash: writing support mtd: tests/read: initialize buffer for whole next page mtd: at91: atmel_nand: return bit flips for the PMECC read_page() mtd: fix recovery after failed write-buffer operation in cfi_cmdset_0002.c mtd: nand: onfi need to be probed in 8 bits mode mtd: nand: add NAND_BUSWIDTH_AUTO to autodetect bus width mtd: nand: print flash size during detection mted: nand_wait_ready timeout fix ...	2012-12-19 12:47:41 -08:00
Oliver Neukum	2dd7c8cf29	usbnet: generic manage_power() Centralise common code for manage_power() in usbnet by making a generic simple implementation Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-19 12:46:40 -08:00
Oliver Neukum	a1c088e01b	usbnet: handle PM failure gracefully If a device fails to do remote wakeup, this is no reason to abort an open totally. This patch just continues without runtime PM. Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-19 12:46:40 -08:00
Linus Torvalds	74779e2226	Merge tag 'for-3.8-rc1' of git://gitorious.org/linux-pwm/linux-pwm Pull pwm changes from Thierry Reding: "A new driver has been added for the SPEAr platform and the TWL4030/6030 driver has been replaced by two drivers that control the regular PWMs and the PWM driven LEDs provided by the chips. The vt8500, tiecap, tiehrpwm, i.MX, LPC32xx and Samsung drivers have all been improved and the device tree bindings now support the PWM signal polarity." Fix up trivial conflicts due to __devinit/exit removal. * tag 'for-3.8-rc1' of git://gitorious.org/linux-pwm/linux-pwm: (21 commits) pwm: samsung: add missing s3c->pwm_id assignment pwm: lpc32xx: Set the chip base for dynamic allocation pwm: lpc32xx: Properly disable the clock on device removal pwm: lpc32xx: Fix the PWM polarity pwm: i.MX: eliminate build warning pwm: Export of_pwm_xlate_with_flags() pwm: Remove pwm-twl6030 driver pwm: New driver to support PWM driven LEDs on TWL4030/6030 series of PMICs pwm: New driver to support PWMs on TWL4030/6030 series of PMICs pwm: pwm-tiehrpwm: pinctrl support pwm: tiehrpwm: Add device-tree binding pwm: pwm-tiehrpwm: Adding TBCLK gating support. pwm: pwm-tiecap: pinctrl support pwm: tiecap: Add device-tree binding pwm: Add TI PWM subsystem driver pwm: Device tree support for PWM polarity pwm: vt8500: Ensure PWM clock is enabled during pwm_config pwm: vt8500: Fix build error pwm: spear: Staticize spear_pwm_config() pwm: Add SPEAr PWM chip driver support ...	2012-12-19 08:19:07 -08:00
Jonas Gorski	ab28698d33	of: define struct device in of_platform.h if !OF_DEVICE and !OF_ADDRESS Fixes the following warning: include/linux/of_platform.h:106:13: warning: 'struct device' declared inside parameter list [enabled by default] include/linux/of_platform.h:106:13: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] Signed-off-by: Jonas Gorski <jogo@openwrt.org> Signed-off-by: Rob Herring <robherring2@gmail.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2012-12-19 16:15:17 +00:00
Linus Torvalds	7a684c452e	Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module update from Rusty Russell: "Nothing all that exciting; a new module-from-fd syscall for those who want to verify the source of the module (ChromeOS) and/or use standard IMA on it or other security hooks." * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: MODSIGN: Fix kbuild output when using default extra_certificates MODSIGN: Avoid using .incbin in C source modules: don't hand 0 to vmalloc. module: Remove a extra null character at the top of module->strtab. ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants ASN.1: Define indefinite length marker constant moduleparam: use __UNIQUE_ID() __UNIQUE_ID() MODSIGN: Add modules_sign make target powerpc: add finit_module syscall. ima: support new kernel module syscall add finit_module syscall to asm-generic ARM: add finit_module syscall to ARM security: introduce kernel_module_from_file hook module: add flags arg to sys_finit_module() module: add syscall to load module from fd	2012-12-19 07:55:08 -08:00
Linus Torvalds	7f2de8171d	Merge tag 'byteswap-for-linus-20121219' of git://git.infradead.org/users/dwmw2/byteswap Pull preparatory gcc intrisics bswap patch from David Woodhouse: "This single patch is effectively a no-op for now. It enables architectures to opt in to using GCC's __builtin_bswapXX() intrinsics for byteswapping, and if we merge this now then the architecture maintainers can enable it for their arch during the next cycle without dependency issues. It's worth making it a par-arch opt-in, because although in theory the compiler should never do worse than hand-coded assembler (and of course it also ought to do a lot better on platforms like Atom and PowerPC which have load-and-swap or store-and-swap instructions), that isn't always the case. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46453 for example." * tag 'byteswap-for-linus-20121219' of git://git.infradead.org/users/dwmw2/byteswap: byteorder: allow arch to opt to use GCC intrinsics for byteswapping	2012-12-19 07:52:48 -08:00
Linus Torvalds	59771079c1	blk: avoid divide-by-zero with zero discard granularity Commit `8dd2cb7e88` ("block: discard granularity might not be power of 2") changed a couple of 'binary and' operations into modulus operations. Which turned the harmless case of a zero discard_granularity into a possible divide-by-zero. The code also had a much more subtle bug: it was doing the modulus of a value in bytes using 'sector_t'. That was always conceptually wrong, but didn't actually matter back when the code assumed a power-of-two granularity: we only looked at the low bits anyway. But with potentially arbitrary sector numbers, using a 'sector_t' to express bytes is very very wrong: depending on configuration it limits the starting offset of the device to just 32 bits, and any overflow would result in a wrong value if the modulus wasn't a power-of-two. So re-write the code to not only protect against the divide-by-zero, but to do the starting sector arithmetic in sectors, and using the proper types. [ For any mathematicians out there: it also looks monumentally stupid to do the 'modulo granularity' operation twice, never mind having a "+ granularity" in the second modulus op. But that's the easiest way to avoid negative values or overflow, and it is how the original code was done. ] Reported-by: Ingo Molnar <mingo@kernel.org> Reported-by: Doug Anderson <dianders@chromium.org> Cc: Neil Brown <neilb@suse.de> Cc: Shaohua Li <shli@fusionio.com> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-19 07:18:35 -08:00
Linus Torvalds	752451f01c	Merge branch 'i2c-embedded/for-next' of git://git.pengutronix.de/git/wsa/linux Pull i2c-embedded changes from Wolfram Sang: - CBUS driver (an I2C variant) - continued rework of the omap driver - s3c2410 gets lots of fixes and gains pinctrl support - at91 gains DMA support - the GPIO muxer gains devicetree probing - typical fixes and additions all over * 'i2c-embedded/for-next' of git://git.pengutronix.de/git/wsa/linux: (45 commits) i2c: omap: Remove the OMAP_I2C_FLAG_RESET_REGS_POSTIDLE flag i2c: at91: add dma support i2c: at91: change struct members indentation i2c: at91: fix compilation warning i2c: mxs: Do not disable the I2C SMBus quick mode i2c: mxs: Handle i2c DMA failure properly i2c: s3c2410: Remove recently introduced performance overheads i2c: ocores: Move grlib set/get functions into #ifdef CONFIG_OF block i2c: s3c2410: Add fix for i2c suspend/resume i2c: s3c2410: Fix code to free gpios i2c: i2c-cbus-gpio: introduce driver i2c: ocores: Add support for the GRLIB port of the controller and use function pointers for getreg and setreg functions i2c: ocores: Add irq support for sparc i2c: omap: Move the remove constraint ARM: dts: cfa10049: Add the i2c muxer buses to the CFA-10049 i2c: s3c2410: do not special case HDMIPHY stuck bus detection i2c: s3c2410: use exponential back off while polling for bus idle i2c: s3c2410: do not generate STOP for QUIRK_HDMIPHY i2c: s3c2410: grab adapter lock while changing i2c clock i2c: s3c2410: Add support for pinctrl ...	2012-12-18 16:51:10 -08:00
Linus Torvalds	673ab8783b	Merge branch 'akpm' (more patches from Andrew) Merge patches from Andrew Morton: "Most of the rest of MM, plus a few dribs and drabs. I still have quite a few irritating patches left around: ones with dubious testing results, lack of review, ones which should have gone via maintainer trees but the maintainers are slack, etc. I need to be more activist in getting these things wrapped up outside the merge window, but they're such a PITA." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (48 commits) mm/vmscan.c: avoid possible deadlock caused by too_many_isolated() vmscan: comment too_many_isolated() mm/kmemleak.c: remove obsolete simple_strtoul mm/memory_hotplug.c: improve comments mm/hugetlb: create hugetlb cgroup file in hugetlb_init mm/mprotect.c: coding-style cleanups Documentation: ABI: /sys/devices/system/node/ slub: drop mutex before deleting sysfs entry memcg: add comments clarifying aspects of cache attribute propagation kmem: add slab-specific documentation about the kmem controller slub: slub-specific propagation changes slab: propagate tunable values memcg: aggregate memcg cache values in slabinfo memcg/sl[au]b: shrink dead caches memcg/sl[au]b: track all the memcg children of a kmem_cache memcg: destroy memcg caches sl[au]b: allocate objects from memcg cache sl[au]b: always get the cache from its page in kmem_cache_free() memcg: skip memcg kmem allocations in specified code regions memcg: infrastructure to match an allocation to the right cache ...	2012-12-18 15:08:12 -08:00
Jianguo Wu	7179e7bf45	mm/hugetlb: create hugetlb cgroup file in hugetlb_init Build kernel with CONFIG_HUGETLBFS=y,CONFIG_HUGETLB_PAGE=y and CONFIG_CGROUP_HUGETLB=y, then specify hugepagesz=xx boot option, system will fail to boot. This failure is caused by following code path: setup_hugepagesz hugetlb_add_hstate hugetlb_cgroup_file_init cgroup_add_cftypes kzalloc <--slab is not available yet For this path, slab is not available yet, so memory allocated will be failed, and cause WARN_ON() in hugetlb_cgroup_file_init(). So I move hugetlb_cgroup_file_init() into hugetlb_init(). [akpm@linux-foundation.org: tweak coding-style, remove pointless __init on inlined function] [akpm@linux-foundation.org: fix warning] Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:15 -08:00
Glauber Costa	ebe945c276	memcg: add comments clarifying aspects of cache attribute propagation This patch clarifies two aspects of cache attribute propagation. First, the expected context for the for_each_memcg_cache macro in memcontrol.h. The usages already in the codebase are safe. In mm/slub.c, it is trivially safe because the lock is acquired right before the loop. In mm/slab.c, it is less so: the lock is acquired by an outer function a few steps back in the stack, so a VM_BUG_ON() is added to make sure it is indeed safe. A comment is also added to detail why we are returning the value of the parent cache and ignoring the children's when we propagate the attributes. Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:15 -08:00
Glauber Costa	107dab5c92	slub: slub-specific propagation changes SLUB allows us to tune a particular cache behavior with sysfs-based tunables. When creating a new memcg cache copy, we'd like to preserve any tunables the parent cache already had. This can be done by tapping into the store attribute function provided by the allocator. We of course don't need to mess with read-only fields. Since the attributes can have multiple types and are stored internally by sysfs, the best strategy is to issue a ->show() in the root cache, and then ->store() in the memcg cache. The drawback of that, is that sysfs can allocate up to a page in buffering for show(), that we are likely not to need, but also can't guarantee. To avoid always allocating a page for that, we can update the caches at store time with the maximum attribute size ever stored to the root cache. We will then get a buffer big enough to hold it. The corolary to this, is that if no stores happened, nothing will be propagated. It can also happen that a root cache has its tunables updated during normal system operation. In this case, we will propagate the change to all caches that are already active. [akpm@linux-foundation.org: tweak code to avoid __maybe_unused] Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michal Hocko <mhocko@suse.cz> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:14 -08:00
Glauber Costa	943a451a87	slab: propagate tunable values SLAB allows us to tune a particular cache behavior with tunables. When creating a new memcg cache copy, we'd like to preserve any tunables the parent cache already had. This could be done by an explicit call to do_tune_cpucache() after the cache is created. But this is not very convenient now that the caches are created from common code, since this function is SLAB-specific. Another method of doing that is taking advantage of the fact that do_tune_cpucache() is always called from enable_cpucache(), which is called at cache initialization. We can just preset the values, and then things work as expected. It can also happen that a root cache has its tunables updated during normal system operation. In this case, we will propagate the change to all caches that are already active. This change will require us to move the assignment of root_cache in memcg_params a bit earlier. We need this to be already set - which memcg_kmem_register_cache will do - when we reach __kmem_cache_create() Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michal Hocko <mhocko@suse.cz> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:14 -08:00
Glauber Costa	749c54151a	memcg: aggregate memcg cache values in slabinfo When we create caches in memcgs, we need to display their usage information somewhere. We'll adopt a scheme similar to /proc/meminfo, with aggregate totals shown in the global file, and per-group information stored in the group itself. For the time being, only reads are allowed in the per-group cache. Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michal Hocko <mhocko@suse.cz> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:14 -08:00
Glauber Costa	7cf2798240	memcg/sl[au]b: track all the memcg children of a kmem_cache This enables us to remove all the children of a kmem_cache being destroyed, if for example the kernel module it's being used in gets unloaded. Otherwise, the children will still point to the destroyed parent. Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michal Hocko <mhocko@suse.cz> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:14 -08:00
Glauber Costa	1f458cbf12	memcg: destroy memcg caches Implement destruction of memcg caches. Right now, only caches where our reference counter is the last remaining are deleted. If there are any other reference counters around, we just leave the caches lying around until they go away. When that happens, a destruction function is called from the cache code. Caches are only destroyed in process context, so we queue them up for later processing in the general case. Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michal Hocko <mhocko@suse.cz> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:14 -08:00
Glauber Costa	d79923fad9	sl[au]b: allocate objects from memcg cache We are able to match a cache allocation to a particular memcg. If the task doesn't change groups during the allocation itself - a rare event, this will give us a good picture about who is the first group to touch a cache page. This patch uses the now available infrastructure by calling memcg_kmem_get_cache() before all the cache allocations. Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michal Hocko <mhocko@suse.cz> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:14 -08:00

1 2 3 4 5 ...

33151 Commits