linux-apfs

mirror of https://github.com/linux-apfs/linux-apfs.git synced 2026-05-01 15:00:59 -07:00

Author	SHA1	Message	Date
Fabian Frederick	f3ae1b97be	fs/ceph: replace pr_warning by pr_warn Update the last pr_warning callsites in fs branch Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Sage Weil <sage@inktank.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-06 16:08:06 -07:00
Ilya Dryomov	37b52fe608	ceph: fix dout() compile warnings in ceph_filemap_fault() PAGE_CACHE_SIZE is unsigned long on all architectures, however size_t is either unsigned int or unsigned long. Rather than change format strings, cast PAGE_CACHE_SIZE to size_t to be in line with dout()s in ceph_page_mkwrite(). Cc: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2014-01-28 09:57:06 -08:00
Li Wang	183028052b	ceph fscache: Uncaching no data page from fscache in readpage() Currently, if a new page is allocated into fscache in readpage() but no data is read into it because of an error encountered while reading from the OSDs, the slot in fscache is not uncached. This patch fixes this. Signed-off-by: Li Wang <liwang@ubuntukylin.com> Reviewed-by: Milosz Tanski <milosz@adfin.com>	2013-12-31 20:32:03 +02:00
Yan, Zheng	61f6881621	ceph: check caps in filemap_fault and page_mkwrite Adds cap check to the page fault handler. The check prevents page fault handler from adding new page to the page cache while Fcb caps are being revoked. This solves Fc revoking hang in multiple clients mmap IO workload. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-12-31 20:32:00 +02:00
Li Wang	f36132a75a	ceph: Clean up if error occurred in finish_read() Clean up if error occurred rather than going through normal process Signed-off-by: Li Wang <liwang@ubuntukylin.com> Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-13 09:13:28 -08:00
Li Wang	56f91aad69	ceph: Avoid data inconsistency due to d-cache aliasing in readpage() If the length of data to be read in readpage() is exactly PAGE_CACHE_SIZE, the original code does not flush d-cache for data consistency after finishing reading. This patch fixes this. Signed-off-by: Li Wang <liwang@ubuntukylin.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-12-13 09:11:38 -08:00
Li Wang	ff638b7df5	ceph: allocate non-zero page to fscache in readpage() ceph_osdc_readpages() returns the number of bytes read. Currently the code only allocates full-zero pages into fscache; this patch fixes this. Signed-off-by: Li Wang <liwang@ubuntukylin.com> Reviewed-by: Milosz Tanski <milosz@adfin.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-11-23 11:01:07 -08:00
Milosz Tanski	d4d3aa38d6	ceph: page still marked private_2 The previous patch allowed us to clean up most of the issues with pages marked as private_2 when calling ceph_readpages. However, there seems to be a case in the error case clean up in start read that still triggers this from time to time. I've only seen this a couple of times. BUG: Bad page state in process petabucket pfn:335b82 page:ffffea000cd6e080 count:0 mapcount:0 mapping: (null) index:0x0 page flags: 0x200000000001000(private_2) Call Trace: [<ffffffff81563442>] dump_stack+0x46/0x58 [<ffffffff8112c7f7>] bad_page+0xc7/0x120 [<ffffffff8112cd9e>] free_pages_prepare+0x10e/0x120 [<ffffffff8112e580>] free_hot_cold_page+0x40/0x160 [<ffffffff81132427>] __put_single_page+0x27/0x30 [<ffffffff81132d95>] put_page+0x25/0x40 [<ffffffffa02cb409>] ceph_readpages+0x2e9/0x6f0 [ceph] [<ffffffff811313cf>] __do_page_cache_readahead+0x1af/0x260 Signed-off-by: Milosz Tanski <milosz@adfin.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-09-06 16:50:12 +00:00
Milosz Tanski	76be778b3a	ceph: clean PgPrivate2 on returning from readpages In some cases the ceph readpages code bails without filling all the pages already marked by fscache. When we return to the readahead code this causes a BUG. Signed-off-by: Milosz Tanski <milosz@adfin.com>	2013-09-06 16:50:11 +00:00
Milosz Tanski	99ccbd229c	ceph: use fscache as a local persistent cache Adding support for fscache to the Ceph filesystem. This would bring it on par with some of the other network filesystems in Linux (like NFS, AFS, etc...) In order to mount the filesystem with fscache the 'fsc' mount option must be passed. Signed-off-by: Milosz Tanski <milosz@adfin.com> Signed-off-by: Sage Weil <sage@inktank.com>	2013-09-06 16:50:11 +00:00
Sha Zhengju	7d6e1f5461	ceph: use vfs __set_page_dirty_nobuffers interface instead of doing it inside filesystem Following we will begin to add memcg dirty page accounting around __set_page_dirty_{buffers,nobuffers} in vfs layer, so we'd better use vfs interface to avoid exporting those details to filesystems. Since vfs set_page_dirty() should be called under page lock, here we don't need elaborate codes to handle racy anymore, and two WARN_ON() are added to detect such exceptions. Thanks very much for Sage and Yan Zheng's coaching! I tested it in a two server's ceph environment that one is client and the other is mds/osd/mon, and run the following fsx test from xfstests: ./fsx 1MB -N 50000 -p 10000 -l 1048576 ./fsx 10MB -N 50000 -p 10000 -l 10485760 ./fsx 100MB -N 50000 -p 10000 -l 104857600 The fsx does lots of mmap-read/mmap-write/truncate operations and the tests completed successfully without triggering any of WARN_ON. Signed-off-by: Sha Zhengju <handai.szj@taobao.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-08-27 16:29:44 -07:00
Milosz Tanski	b150f5c1c7	ceph: cleanup the logic in ceph_invalidatepage The invalidatepage code bails if it encounters a non-zero page offset. The current logic that does this is non-obvious, with multiple if statements. This should be logically and functionally equivalent. Signed-off-by: Milosz Tanski <milosz@adfin.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-08-15 11:12:02 -07:00
Sage Weil	ee3e542fec	Merge remote-tracking branch 'linus/master' into testing	2013-08-15 11:11:45 -07:00
Milosz Tanski	fe2a801b50	ceph: Remove bogus check in invalidatepage The early bug checks are moot because the VMA layer ensures those things. 1. It will not call invalidatepage unless PagePrivate (or PagePrivate2) are set 2. It will not call invalidatepage without taking a PageLock first. 3. Guarantees that the inode page is mapped. Signed-off-by: Milosz Tanski <milosz@adfin.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-08-09 17:55:58 -07:00
Linus Torvalds	9a5889ae1c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "There is some follow-on RBD cleanup after the last window's code drop, a series from Yan fixing multi-mds behavior in cephfs, and then a sprinkling of bug fixes all around. Some warnings, sleeping while atomic, a null dereference, and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits) libceph: fix invalid unsigned->signed conversion for timespec encoding libceph: call r_unsafe_callback when unsafe reply is received ceph: fix race between cap issue and revoke ceph: fix cap revoke race ceph: fix pending vmtruncate race ceph: avoid accessing invalid memory libceph: Fix NULL pointer dereference in auth client code ceph: Reconstruct the func ceph_reserve_caps. ceph: Free mdsc if alloc mdsc->mdsmap failed. ceph: remove sb_start/end_write in ceph_aio_write. ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL. ceph: fix sleeping function called from invalid context. ceph: move inode to proper flushing list when auth MDS changes rbd: fix a couple warnings ceph: clear migrate seq when MDS restarts ceph: check migrate seq before changing auth cap ceph: fix race between page writeback and truncate ceph: reset iov_len when discarding cap release messages ceph: fix cap release race libceph: fix truncate size calculation ...	2013-07-09 12:39:10 -07:00
majianpeng	c62988ec09	ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-07-03 15:32:52 -07:00
Yan, Zheng	fc2744aa12	ceph: fix race between page writeback and truncate The client can receive a truncate request from the MDS at any time, so the page writeback code needs to get i_size, truncate_seq and truncate_size atomically. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-07-03 15:32:47 -07:00
Lukas Czerner	569d39fc3e	ceph: use ->invalidatepage() length argument ->invalidatepage() aop now accepts range to invalidate so we can make use of it in ceph_invalidatepage(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Acked-by: Sage Weil <sage@inktank.com> Cc: ceph-devel@vger.kernel.org	2013-05-21 23:58:48 -04:00
Lukas Czerner	d47992f86b	mm: change invalidatepage prototype to accept length Currently there is no way to truncate partial page where the end truncate point is not at the end of the page. This is because it was not needed and the functionality was enough for file system truncate operation to work properly. However more file systems now support punch hole feature and it can benefit from mm supporting truncating page just up to the certain point. Specifically, with this functionality truncate_inode_pages_range() can be changed so it supports truncating partial page at the end of the range (currently it will BUG_ON() if 'end' is not at the end of the page). This commit changes the invalidatepage() address space operation prototype to accept range to be invalidated and update all the instances for it. We also change the block_invalidatepage() in the same way and actually make use of the new length argument implementing range invalidation. Actual file system implementations will follow except the file systems where the changes are really simple and should not change the behaviour in any way. Implementation for truncate_page_range() which will be able to accept page unaligned ranges will follow as well. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com>	2013-05-21 23:17:23 -04:00
Alex Elder	406e2c9f92	libceph: kill off osd data write_request parameters In the incremental move toward supporting distinct data items in an osd request some of the functions had "write_request" parameters to indicate, basically, whether the data belonged to in_data or the out_data. Now that we maintain the data fields in the op structure there is no need to indicate the direction, so get rid of the "write_request" parameters. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:58 -07:00
Yan, Zheng	1ac0fc8adf	ceph: fix race between writepages and truncate ceph_writepages_start() reads inode->i_size in two places. It can get different values between successive reads, because truncate can change inode->i_size at any time. The race can lead to a mismatch between the data length of the osd request and the pages marked as writeback. When the osd request finishes, it clears writeback pages according to its data length, so some pages can be left in writeback state forever. The fix is to read inode->i_size only once, save its value to a local variable and use the local variable when i_size is needed. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Alex Elder <elder@inktank.com>	2013-05-01 21:18:55 -07:00
Alex Elder	a4ce40a9a7	libceph: combine initializing and setting osd data This ends up being a rather large patch but what it's doing is somewhat straightforward. Basically, this is replacing two calls with one. The first of the two calls is initializing a struct ceph_osd_data with data (either a page array, a page list, or a bio list); the second is setting an osd request op so it associates that data with one of the op's parameters. In place of those two will be a single function that initializes the op directly. That means we sort of fan out a set of the needed functions: - extent ops with pages data - extent ops with pagelist data - extent ops with bio list data and - class ops with page data for receiving a response We also have to define another one, but it's only used internally: - class ops with pagelist data for request parameters Note that we still haven't gotten rid of the osd request's r_data_in and r_data_out fields. All the osd ops refer to them for their data. For now, these data fields are pointers assigned to the appropriate r_data_* field when these new functions are called. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:23 -07:00
Alex Elder	c99d2d4abb	libceph: specify osd op by index in request An osd request now holds all of its source op structures, and every place that initializes one of these is in fact initializing one of the entries in the osd request's array. So rather than supplying the address of the op to initialize, have the caller specify the osd request and an indication of which op it would like to initialize. This better hides the details of the op structure (and facilitates moving the data pointers they use). Since osd_req_op_init() is a common routine, and it's not used outside the osd client code, give it static scope. Also make it return the address of the specified op (so all the other init routines don't have to repeat that code). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:15 -07:00
Alex Elder	8c042b0df9	libceph: add data pointers in osd op structures An extent type osd operation currently implies that there will be corresponding data supplied in the data portion of the request (for write) or response (for read) message. Similarly, an osd class method operation implies a data item will be supplied to receive the response data from the operation. Add a ceph_osd_data pointer to each of those structures, and assign it to point to either the incoming or the outgoing data structure in the osd message. The data is not always available when an op is initially set up, so add two new functions to allow setting them after the op has been initialized. Begin to make use of the data item pointer available in the osd operation rather than the request data in or out structure in places where it's convenient. Add some assertions to verify pointers are always set the way they're expected to be. This is a sort of stepping stone toward really moving the data into the osd request ops, to allow for some validation before making that jump. This is the first in a series of patches that resolve: http://tracker.ceph.com/issues/4657 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:14 -07:00
Alex Elder	79528734f3	libceph: keep source rather than message osd op array An osd request keeps a pointer to the osd operations (ops) array that it builds in its request message. In order to allow each op in the array to have its own distinct data, we will need to keep track of each op's data, and that information does not go over the wire. As long as we're tracking the data we might as well just track the entire (source) op definition for each of the ops. And if we're doing that, we'll have no more need to keep a pointer to the wire-encoded version. This patch makes the array of source ops be kept with the osd request structure, and uses that instead of the version encoded in the message in places where that was previously used. The array will be embedded in the request structure, and the maximum number of ops we ever actually use is currently 2. So reduce CEPH_OSD_MAX_OP to 2 to reduce the size of the structure. The result of doing this sort of ripples back up, and as a result various function parameters and local variables become unnecessary. Make r_num_ops be unsigned, and move the definition of struct ceph_osd_req_op earlier to ensure it's defined where needed. It does not yet add per-op data, that's coming soon. This resolves: http://tracker.ceph.com/issues/4656 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:18:12 -07:00
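
The fix described in commit 1ac0fc8adf (read inode->i_size once, keep the value in a local variable, and use that copy everywhere) can be illustrated with a small user-space sketch. This is not the kernel code from that commit: struct fake_inode, struct wb_plan, plan_writeback() and PAGE_SZ are hypothetical names invented for illustration, and the sketch assumes a page-aligned offset.

```c
#include <stdatomic.h>

#define PAGE_SZ 4096UL /* hypothetical page size used only by this sketch */

/* Hypothetical stand-in for an inode whose size a truncate may change
 * concurrently; not the kernel's struct inode. */
struct fake_inode {
    _Atomic unsigned long i_size;
};

/* Everything derived from the size comes from one snapshot, so the
 * request length and the page count can never disagree. */
struct wb_plan {
    unsigned long size_snapshot; /* the single value read from i_size */
    unsigned long request_len;   /* bytes the write request would carry */
    unsigned long nr_pages;      /* pages that would be marked writeback */
};

static struct wb_plan plan_writeback(struct fake_inode *inode,
                                     unsigned long offset)
{
    struct wb_plan plan;

    /* Read i_size exactly once; later decisions use the local copy. */
    unsigned long size = atomic_load(&inode->i_size);

    plan.size_snapshot = size;
    plan.request_len = size > offset ? size - offset : 0;
    plan.nr_pages = (plan.request_len + PAGE_SZ - 1) / PAGE_SZ;
    return plan;
}
```

Because request_len and nr_pages are both computed from the same local value, a truncate that changes i_size after the load cannot make the two disagree, which is the mismatch the commit message describes.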

95 Commits