kernel

mirror of https://github.com/ukui/kernel.git synced 2026-03-09 10:07:04 -07:00

Author	SHA1	Message	Date
Ilya Dryomov	33e9c8dbfb	libceph: advertise support for NEW_OSDOP_ENCODING and SERVER_LUMINOUS All four SERVER_LUMINOUS feature bits are implemented, switch it on! NEW_OSDOP_ENCODING doesn't mean much for the client (it signifies support for MOSDOp v6) but needs to be enabled in order to get the latest (currently v25) pg_pool_t. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Sage Weil <sage@redhat.com>	2017-07-07 17:26:24 +02:00
Ilya Dryomov	0bb05da2ec	libceph: osd_state is 32 bits wide in luminous Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:19 +02:00
Ilya Dryomov	6f428df47d	libceph: pg_upmap[_items] infrastructure pg_temp and pg_upmap encodings are the same (PG -> array of osds), except for the incremental remove: it's an empty mapping in new_pg_temp for pg_temp and a separate old_pg_upmap set for pg_upmap. (This isn't to allow for empty pg_upmap mappings -- apparently, pg_temp just wasn't looked at as an example for pg_upmap encoding.) Reuse __decode_pg_temp() for decoding pg_upmap and new_pg_upmap. __decode_pg_temp() stores into pg_temp union member, but since pg_upmap union member is identical, reading through pg_upmap later is OK. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:18 +02:00
Ilya Dryomov	278b1d709c	libceph: ceph_decode_skip_* helpers Some of these won't be as efficient as they could be (e.g. ceph_decode_skip_set(... 32 ...) could advance by len * sizeof(u32) once instead of advancing by sizeof(u32) len times), but that's fine and not worth a bunch of extra macro code. Replace skip_name_map() with ceph_decode_skip_map as an example. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:18 +02:00
Ilya Dryomov	a02a946dfe	libceph: respect RADOS_BACKOFF backoffs Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:17 +02:00
Ilya Dryomov	76f827a7b1	libceph: make DEFINE_RB_* helpers more general Initially for ceph_pg_mapping, ceph_spg_mapping and ceph_hobject_id, compared with ceph_pg_compare(), ceph_spg_compare() and hoid_compare() respectively. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:17 +02:00
Ilya Dryomov	df28152d53	libceph: avoid unnecessary pi lookups in calc_target() Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:17 +02:00
Ilya Dryomov	04c7d789e2	libceph: make sure need_resend targets reflect latest map Otherwise we may miss events like PG splits, pool deletions, etc when we get multiple incremental maps at once. Because check_pool_dne() can now be fed an unlinked request, finish_request() needed to be taught to handle unlinked requests. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:16 +02:00
Ilya Dryomov	7de030d6b1	libceph: resend on PG splits if OSD has RESEND_ON_SPLIT Note that ceph_osd_request_target fields are updated regardless of RESEND_ON_SPLIT. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:16 +02:00
Ilya Dryomov	8cb441c054	libceph: MOSDOp v8 encoding (actual spgid + full hash) Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:15 +02:00
Ilya Dryomov	98ad5ebd15	libceph: ceph_connection_operations::reencode_message() method Give upper layers a chance to reencode the message after the connection is negotiated and ->peer_features is set. OSD client will use this to support both luminous and pre-luminous OSDs (in a single cluster): the former need MOSDOp v8; the latter will continue to be sent MOSDOp v4. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:15 +02:00
Ilya Dryomov	dc98ff7230	libceph: introduce ceph_spg, ceph_pg_to_primary_shard() Store both raw pgid and actual spgid in ceph_osd_request_target. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:15 +02:00
Ilya Dryomov	dc93e0e283	libceph: fold [l]req->last_force_resend into ceph_osd_request_target Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:15 +02:00
Ilya Dryomov	220abf5aa7	libceph: support SERVER_JEWEL feature bits Only MON_STATEFUL_SUB, really. MON_ROUTE_OSDMAP and OSDSUBOP_NO_SNAPCONTEXT are irrelevant. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:15 +02:00
Ilya Dryomov	2d7522e0bd	libceph: advertise support for OSD_POOLRESEND The code has been in place since commit `63244fa123` ("libceph: introduce ceph_osd_request_target, calc_target()"), and, with the ceph_{oloc,oid}_copy() issue fixed in the previous commit, is now in working order. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:14 +02:00
Ilya Dryomov	f179d3ba8c	libceph: new features macros Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:14 +02:00
Ilya Dryomov	dcbbd97ccb	libceph: remove ceph_sanitize_features() workaround Reflects ceph.git commit ff1959282826ae6acd7134e1b1ede74ffd1cc04a. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-07-07 17:25:14 +02:00
Ilya Dryomov	6f4dbd149d	libceph: use kbasename() and kill ceph_file_part() Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>	2017-05-23 20:32:10 +02:00
Alexander Graf	f775ff7d89	ceph: fix file open flags on ppc64 The file open flags (O_foo) are platform specific and should never go out to an interface that is not local to the system. Unfortunately these flags have leaked out onto the wire in the cephfs implementation. That lead to bogus flags getting transmitted on ppc64. This patch converts the kernel view of flags to the ceph view of file open flags. Fixes: `124e68e74` ("ceph: file operations") Signed-off-by: Alexander Graf <agraf@suse.de> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-05-04 09:19:24 +02:00
Ilya Dryomov	14bb211d32	rbd: support updating the lock cookie without releasing the lock As we no longer release the lock before potentially raising BLACKLISTED in rbd_reregister_watch(), the "either locked or blacklisted" assert in rbd_queue_workfn() needs to go: we can be both locked and blacklisted at that point now. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>	2017-05-04 09:19:23 +02:00
Jeff Layton	58eb7932ae	libceph: add an epoch_barrier field to struct ceph_osd_client Cephfs can get cap update requests that contain a new epoch barrier in them. When that happens we want to pause all OSD traffic until the right map epoch arrives. Add an epoch_barrier field to ceph_osd_client that is protected by the osdc->lock rwsem. When the barrier is set, and the current OSD map epoch is below that, pause the request target when submitting the request or when revisiting it. Add a way for upper layers (cephfs) to update the epoch_barrier as well. If we get a new map, compare the new epoch against the barrier before kicking requests and request another map if the map epoch is still lower than the one we want. If we get a map with a full pool, or at quota condition, then set the barrier to the current epoch value. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-05-04 09:19:21 +02:00
Jeff Layton	a1f4020aab	libceph: allow requests to return immediately on full conditions if caller wishes Usually, when the osd map is flagged as full or the pool is at quota, write requests just hang. This is not what we want for cephfs, where it would be better to simply report -ENOSPC back to userland instead of stalling. If the caller knows that it will want an immediate error return instead of blocking on a full or at-quota error condition then allow it to set a flag to request that behavior. Set that flag in ceph_osdc_new_request (since ceph.ko is the only caller), and on any other write request from ceph.ko. A later patch will deal with requests that were submitted before the new map showing the full condition came in. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-05-04 09:19:21 +02:00
Jeff Layton	aa26d662b9	libceph: remove req->r_replay_version Nothing uses this anymore with the removal of the ack vs. commit code. Remove the field and just encode zeroes into place in the request encoding. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-05-04 09:19:20 +02:00
Yan, Zheng	79162547b7	ceph: make seeky readdir more efficient Current cephfs client uses string to indicate start position of readdir. The string is last entry of previous readdir reply. This approach does not work for seeky readdir because we can not easily convert the new postion to a string. For seeky readdir, mds needs to return dentries from the beginning. Client keeps retrying if the reply does not contain the dentry it wants. In current version of ceph, mds sorts CDentry in its cache in hash order. Client also uses dentry hash to compose dir postion. For seeky readdir, if client passes the hash part of dir postion to mds. mds can avoid replying useless dentries. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-05-04 09:19:20 +02:00
Yan, Zheng	76201b6354	ceph: allow connecting to mds whose rank >= mdsmap::m_max_mds mdsmap::m_max_mds is the expected count of active mds. It's not the max rank of active mds. User can decrease mdsmap::m_max_mds, but does not stop mds whose rank >= mdsmap::m_max_mds. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-05-04 09:19:20 +02:00

1 2 3 4 5 ...

376 Commits