We can now handle the snapshot cases under RCU, as well as the
non-snapshot case when we don't need to queue up a lease renewal
allow LOOKUP_RCU walks to proceed under those conditions.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Under rcuwalk, we need to take extra care when dereferencing d_parent.
We want to do that once and pass a pointer to dentry_lease_is_valid.
Also, we must ensure that that function can handle the case where we're
racing with d_release. Check whether "di" is NULL under the d_lock, and
just return 0 if so.
Finally, we still need to kick off a renewal job if the lease is getting
close to expiration. If that's the case, then just drop out of rcuwalk
mode since that could block.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
To check for a valid dentry lease, we need to get at the
ceph_dentry_info. Under rcuwalk though, we may end up with a dentry that
is on its way to destruction. Since we need to take the d_lock in
dentry_lease_is_valid already, we can just ensure that we clear the
d_fsinfo pointer out under the same lock before destroying it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
This patch adds codes that decode pool namespace information in
cap message and request reply. Pool namespace is saved in i_layout,
it will be passed to libceph when doing read/write.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Add pool namesapce pointer to struct ceph_file_layout and struct
ceph_object_locator. Pool namespace is used by when mapping object
to PG, it's also used when composing OSD request.
The namespace pointer in struct ceph_file_layout is RCU protected.
So libceph can read namespace without taking lock.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
[idryomov@gmail.com: ceph_oloc_destroy(), misc minor changes]
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The data structure is for storing namesapce string. It allows namespace
string to be shared between cephfs inodes with same layout. This data
structure can also be referenced by OSD request.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Define new ceph_file_layout structure and rename old ceph_file_layout
to ceph_file_layout_legacy. This is preparation for adding namespace
to ceph_file_layout structure.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Add ceph_start_encoding() and ceph_start_decoding(), the equivalent of
ENCODE_START and DECODE_START in the userspace ceph code.
This is based on a patch from Mike Christie <michaelc@cs.wisc.edu>.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
An on-stack oid in ceph_ioctl_get_dataloc() is not initialized,
resulting in a WARN and a NULL pointer dereference later on. We will
have more of these on-stack in the future, so fix it with a convenience
macro.
Fixes: d30291b985 ("libceph: variable-sized ceph_object_id")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Pull ceph fix from Ilya Dryomov:
"A fix for a long-standing bug in the incremental osdmap handling code
that caused misdirected requests, tagged for stable"
The tag is signed with a brand new key - Sage is on vacation and I
didn't anticipate this"
* tag 'ceph-for-4.7-rc8' of git://github.com/ceph/ceph-client:
libceph: apply new_state before new_up_client on incrementals
Pull networking fixes from David Miller:
1) Fix memory leak in nftables, from Liping Zhang.
2) Need to check result of vlan_insert_tag() in batman-adv otherwise we
risk NULL skb derefs, from Sven Eckelmann.
3) Check for dev_alloc_skb() failures in cfg80211, from Gregory
Greenman.
4) Handle properly when we have ppp_unregister_channel() happening in
parallel with ppp_connect_channel(), from WANG Cong.
5) Fix DCCP deadlock, from Eric Dumazet.
6) Bail out properly in UDP if sk_filter() truncates the packet to be
smaller than even the space that the protocol headers need. From
Michal Kubecek.
7) Similarly for rose, dccp, and sctp, from Willem de Bruijn.
8) Make TCP challenge ACKs less predictable, from Eric Dumazet.
9) Fix infinite loop in bgmac_dma_tx_add() from Florian Fainelli.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
packet: propagate sock_cmsg_send() error
net/mlx5e: Fix del vxlan port command buffer memset
packet: fix second argument of sock_tx_timestamp()
net: switchdev: change ageing_time type to clock_t
Update maintainer for EHEA driver.
net/mlx4_en: Add resilience in low memory systems
net/mlx4_en: Move filters cleanup to a proper location
sctp: load transport header after sk_filter
net/sched/sch_htb: clamp xstats tokens to fit into 32-bit int
net: cavium: liquidio: Avoid dma_unmap_single on uninitialized ndata
net: nb8800: Fix SKB leak in nb8800_receive()
et131x: Fix logical vs bitwise check in et131x_tx_timeout()
vlan: use a valid default mtu value for vlan over macsec
net: bgmac: Fix infinite loop in bgmac_dma_tx_add()
mlxsw: spectrum: Prevent invalid ingress buffer mapping
mlxsw: spectrum: Prevent overwrite of DCB capability fields
mlxsw: spectrum: Don't emit errors when PFC is disabled
mlxsw: spectrum: Indicate support for autonegotiation
mlxsw: spectrum: Force link training according to admin state
r8152: add MODULE_VERSION
...
Pull overlayfs fixes from Miklos Szeredi:
"This contains a fix for a potential crash/corruption issue and another
where the suid/sgid bits weren't cleared on write"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: verify upper dentry in ovl_remove_and_whiteout()
ovl: Copy up underlying inode's ->i_mode to overlay inode
ovl: handle ATTR_KILL*
Merge misc fixes from Andrew Morton:
"Five fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
pps: do not crash when failed to register
tools/vm/slabinfo: fix an unintentional printf
testing/radix-tree: fix a macro expansion bug
radix-tree: fix radix_tree_iter_retry() for tagged iterators.
mm: memcontrol: fix cgroup creation failure after many small jobs