If a timer fires after kvm_inject_pending_timer_irqs() but before
local_irq_disable() the code will enter guest mode and only inject such
timer interrupt the next time an unrelated event causes an exit.
It would be simpler if the timer->pending irq conversion could be done
with IRQ's disabled, so that the above problem cannot happen.
For now introduce a new vcpu requests bit to cancel guest entry.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Some of the requirement rules are now more relaxed. Also correct a
contradiction in the previous update
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes the function reserve_bootmem_node() from void to int,
returning -ENOMEM if the allocation fails.
This fixes a build problem on x86 with CONFIG_KEXEC=y and
CONFIG_NEED_MULTIPLE_NODES=y
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp/intel: cleanup some serious whitespace badness
[AGP] intel_agp: Add support for Intel 4 series chipsets
[AGP] intel_agp: extra stolen mem size available for IGD_GM chipset
agp: more boolean conversions.
drivers/char/agp - use bool
agp: two-stage page destruction issue
agp/via: fixup pci ids
We've introduced extra need of compat layer for ip_tunnel_prl{}
for PRL (Potential Router List) management. Though compat_ioctl
is still missing in ipv4/ipv6, let's make the interface more
straight-forward and eliminate extra need for nasty compat layer
anyway since the interface is new for 2.6.26.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This file is only included where it makes sense now, so there's no need
for the CONFIG_ARCH_SUPPORTS_AOUT conditional -- and that conditional is
bad, because we want to export <linux/a.out.h> to userspace.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: fixup write combine comment in pci_mmap_resource
x86: PAT export resource_wc in pci sysfs
x86, pci-dma.c: don't always add __GFP_NORETRY to gfp
suspend-vs-iommu: prevent suspend if we could not resume
x86: pci-dma.c: use __GFP_NO_OOM instead of __GFP_NORETRY
pci, x86: add workaround for bug in ASUS A7V600 BIOS (rev 1005)
PCI: use dev_to_node in pci_call_probe
PCI: Correct last two HP entries in the bfsort whitelist
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
tcp: Revert 'process defer accept as established' changes.
ipv6: Fix duplicate initialization of rawv6_prot.destroy
bnx2x: Updating the Maintainer
net: Eliminate flush_scheduled_work() calls while RTNL is held.
drivers/net/r6040.c: correct bad use of round_jiffies()
fec_mpc52xx: MPC52xx_MESSAGES_DEFAULT: 2nd NETIF_MSG_IFDOWN => IFUP
ipg: fix receivemode IPG_RM_RECEIVEMULTICAST{,HASH} in ipg_nic_set_multicast_list()
netfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()
netfilter: Make nflog quiet when no one listen in userspace.
ipv6: Fail with appropriate error code when setting not-applicable sockopt.
ipv6: Check IPV6_MULTICAST_LOOP option value.
ipv6: Check the hop limit setting in ancillary data.
ipv6 route: Fix route lifetime in netlink message.
ipv6 mcast: Check address family of gf_group in getsockopt(MS_FILTER).
dccp: Bug in initial acknowledgment number assignment
dccp ccid-3: X truncated due to type conversion
dccp ccid-3: TFRC reverse-lookup Bug-Fix
dccp ccid-2: Bug-Fix - Ack Vectors need to be ignored on request sockets
dccp: Fix sparse warnings
dccp ccid-3: Bug-Fix - Zero RTT is possible
We need this at least for huge page detection for now, because powerpc
needs the vm_area_struct to be able to determine whether a virtual address
is referring to a huge page (its pmd_huge() doesn't work).
It might also come in handy for some of the other users.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the forward-declaration of struct mm_struct a little way up
proc_fs.h. This fixes a bunch of "'struct mm_struct' declared inside
parameter list" warnings with CONFIG_PROC_FS=n
Signed-off-by: Ben Nizette <bn@niasdigital.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts two changesets, ec3c0982a2
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0adb
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").
This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.
Ilpo Järvinen first spotted the locking problems. The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.
Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock. The normal ordering is parent listening
socket --> child socket, but this code path would require the
reverse lock ordering.
Next is a problem noticed by Vitaliy Gusev, he noted:
----------------------------------------
>--- a/net/ipv4/tcp_timer.c
>+++ b/net/ipv4/tcp_timer.c
>@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
> goto death;
> }
>
>+ if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
>+ tcp_send_active_reset(sk, GFP_ATOMIC);
>+ goto death;
Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------
Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:
----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------
So revert this thing for now.
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'core/iter-div' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
always_inline timespec_add_ns
add an inlined version of iter_div_u64_rem
common implementation of iterative div/mod
timespec_add_ns is used from the x86-64 vdso, which cannot call out to
other kernel code. Make sure that timespec_add_ns is always inlined
(and only uses always_inlined functions) to make sure there are no
unexpected calls.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
iter_div_u64_rem is used in the x86-64 vdso, which cannot call other
kernel code. For this case, provide the always_inlined version,
__iter_div_u64_rem.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We have a few instances of the open-coded iterative div/mod loop, used
when we don't expcet the dividend to be much bigger than the divisor.
Unfortunately modern gcc's have the tendency to strength "reduce" this
into a full mod operation, which isn't necessarily any faster, and
even if it were, doesn't exist if gcc implements it in libgcc.
The workaround is to put a dummy asm statement in the loop to prevent
gcc from performing the transformation.
This patch creates a single implementation of this loop, and uses it
to replace the open-coded versions I know about.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>