kernel

mirror of https://github.com/ukui/kernel.git synced 2026-03-09 10:07:04 -07:00

Author	SHA1	Message	Date
Linus Torvalds	6b3a707736	Merge branch 'page-refs' (page ref overflow) Merge page ref overflow branch. Jann Horn reported that he can overflow the page ref count with sufficient memory (and a filesystem that is intentionally extremely slow). Admittedly it's not exactly easy. To have more than four billion references to a page requires a minimum of 32GB of kernel memory just for the pointers to the pages, much less any metadata to keep track of those pointers. Jann needed a total of 140GB of memory and a specially crafted filesystem that leaves all reads pending (in order to not ever free the page references and just keep adding more). Still, we have a fairly straightforward way to limit the two obvious user-controllable sources of page references: direct-IO like page references gotten through get_user_pages(), and the splice pipe page duplication. So let's just do that. * branch page-refs: fs: prevent page refcount overflow in pipe_buf_get mm: prevent get_user_pages() from overflowing page refcount mm: add 'try_get_page()' helper function mm: make page ref count overflow check tighter and more explicit	2019-04-14 15:09:40 -07:00
Matthew Wilcox	15fab63e1e	fs: prevent page refcount overflow in pipe_buf_get Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers converted to handle a failure. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Matthew Wilcox <willy@infradead.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-04-14 10:00:04 -07:00
Linus Torvalds	dfee9c257b	Merge tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: "Scalability and performance improvements, as well as minor bug fixes and cleanups" * tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits) fuse: cache readdir calls if filesystem opts out of opendir fuse: support clients that don't implement 'opendir' fuse: lift bad inode checks into callers fuse: multiplex cached/direct_io file operations fuse add copy_file_range to direct io fops fuse: use iov_iter based generic splice helpers fuse: Switch to using async direct IO for FOPEN_DIRECT_IO fuse: use atomic64_t for khctr fuse: clean up aborted fuse: Protect ff->reserved_req via corresponding fi->lock fuse: Protect fi->nlookup with fi->lock fuse: Introduce fi->lock to protect write related fields fuse: Convert fc->attr_version into atomic64_t fuse: Add fuse_inode argument to fuse_prepare_release() fuse: Verify userspace asks to requeue interrupt that we really sent fuse: Do some refactoring in fuse_dev_do_write() fuse: Wake up req->waitq of only if not background fuse: Optimize request_end() by not taking fiq->waitq.lock fuse: Kill fasync only if interrupt is queued in queue_interrupt() fuse: Remove stale comment in end_requests() ...	2019-03-12 14:46:26 -07:00
Nikolay Borisov	b5420237ec	mm: refactor readahead defines in mm.h All users of VM_MAX_READAHEAD actually convert it to kbytes and then to pages. Define the macro explicitly as (SZ_128K / PAGE_SIZE). This simplifies the expression in every filesystem. Also rename the macro to VM_READAHEAD_PAGES to properly convey its meaning. Finally remove unused VM_MIN_READAHEAD [akpm@linux-foundation.org: fix fs/io_uring.c, per Stephen] Link: http://lkml.kernel.org/r/20181221144053.24318-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: David Howells <dhowells@redhat.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: David Sterba <dsterba@suse.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-03-12 10:04:01 -07:00
Chad Austin	fabf7e0262	fuse: cache readdir calls if filesystem opts out of opendir If a filesystem returns ENOSYS from opendir and thus opts out of opendir and releasedir requests, it almost certainly would also like readdir results cached. Default open_flags to FOPEN_KEEP_CACHE and FOPEN_CACHE_DIR in that case. With this patch, I've measured recursive directory enumeration across large FUSE mounts to be faster than native mounts. Signed-off-by: Chad Austin <chadaustin@fb.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:15 +01:00
Chad Austin	d9a9ea94f7	fuse: support clients that don't implement 'opendir' Allow filesystems to return ENOSYS from opendir, preventing the kernel from sending opendir and releasedir messages in the future. This avoids userspace transitions when filesystems don't need to keep track of state per directory handle. A new capability flag, FUSE_NO_OPENDIR_SUPPORT, parallels FUSE_NO_OPEN_SUPPORT, indicating the new semantics for returning ENOSYS from opendir. Signed-off-by: Chad Austin <chadaustin@fb.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:15 +01:00
Miklos Szeredi	2f7b6f5bed	fuse: lift bad inode checks into callers Bad inode checks were done in various places; move them into fuse_file_{read|write}_iter(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:15 +01:00
Miklos Szeredi	55752a3aba	fuse: multiplex cached/direct_io file operations This is cleanup, as well as allowing switching between I/O modes while the file is open in the future. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:15 +01:00
Miklos Szeredi	d4136d6075	fuse add copy_file_range to direct io fops Nothing preventing copy_file_range to work on files opened with FOPEN_DIRECT_IO. Fixes: `88bc7d5097` ("fuse: add support for copy_file_range()") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:14 +01:00
Miklos Szeredi	3c3db095b6	fuse: use iov_iter based generic splice helpers The default splice implementation is grossly inefficient and the iter based ones work just fine, so use those instead. I've measured an 8x speedup for splice write (with len = 128k). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:14 +01:00
Martin Raiber	23c94e1cdc	fuse: Switch to using async direct IO for FOPEN_DIRECT_IO Switch to using the async direct IO code path in fuse_direct_read_iter() and fuse_direct_write_iter(). This is especially important in connection with loop devices with direct IO enabled as loop assumes async direct io is actually async. Signed-off-by: Martin Raiber <martin@urbackup.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:14 +01:00
Miklos Szeredi	75126f5504	fuse: use atomic64_t for khctr ...to get rid of one more fc->lock use. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:14 +01:00
Miklos Szeredi	eb98e3bdf3	fuse: clean up aborted The only caller that needs fc->aborted set is fuse_conn_abort_write(). Setting fc->aborted is now racy (fuse_abort_conn() may already be in progress or finished) but there's no reason to care. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:14 +01:00
Kirill Tkhai	6b675738ce	fuse: Protect ff->reserved_req via corresponding fi->lock This is rather natural action after previous patches, and it just decreases load of fc->lock. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:14 +01:00
Kirill Tkhai	c9d8f5f069	fuse: Protect fi->nlookup with fi->lock This continues previous patch and introduces the same protection for nlookup field. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:14 +01:00
Kirill Tkhai	f15ecfef05	fuse: Introduce fi->lock to protect write related fields To minimize contention of fc->lock, this patch introduces a new spinlock for protection fuse_inode metadata: fuse_inode: writectr writepages write_files queued_writes attr_version inode: i_size i_nlink i_mtime i_ctime Also, it protects the fields changed in fuse_change_attributes_common() (too many to list). Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:14 +01:00
Kirill Tkhai	4510d86fbb	fuse: Convert fc->attr_version into atomic64_t This patch makes fc->attr_version of atomic64_t type, so fc->lock won't be needed to read or modify it anymore. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:13 +01:00
Kirill Tkhai	ebf84d0c72	fuse: Add fuse_inode argument to fuse_prepare_release() Here is preparation for next patches, which introduce new fi->lock for protection of ff->write_entry linked into fi->write_files. This patch just passes new argument to the function. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:13 +01:00
Kirill Tkhai	b782911b52	fuse: Verify userspace asks to requeue interrupt that we really sent When queue_interrupt() is called from fuse_dev_do_write(), the call comes directly from userspace. Userspace may pass any request id, even that of a request we have not interrupted (or even of a background request). This patch adds a sanity check to make the kernel safe against that. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:13 +01:00
Kirill Tkhai	7407a10de5	fuse: Do some refactoring in fuse_dev_do_write() This is needed for next patch. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:13 +01:00
Kirill Tkhai	5e0fed717a	fuse: Wake up req->waitq of only if not background Currently, we wait on req->waitq only in the request_wait_answer() function, and it is never used for background requests. Since wake_up() is not a lightweight macro but expands to a real function call that performs locking operations and costs CPU cycles, let's avoid calling it in the case where we definitely know it is useless. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:13 +01:00
Kirill Tkhai	217316a601	fuse: Optimize request_end() by not taking fiq->waitq.lock We take global fiq->waitq.lock every time, when we are in this function, but interrupted requests are just small subset of all requests. This patch optimizes request_end() and makes it to take the lock when it's really needed. queue_interrupt() needs small change for that. After req is linked to interrupt list, we do smp_mb() and check for FR_FINISHED again. In case of FR_FINISHED bit has appeared, we remove req and leave the function: Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:13 +01:00
Kirill Tkhai	8da6e91832	fuse: Kill fasync only if interrupt is queued in queue_interrupt() We should send the signal only if the interrupt is really queued. Not a real problem, but this makes the code clearer and more intuitive. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:13 +01:00
Kirill Tkhai	340617508d	fuse: Remove stale comment in end_requests() Function end_requests() does not take fc->lock. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:12 +01:00
Kirill Tkhai	c5de16cca2	fuse: Replace page without copying in fuse_writepage_in_flight() It looks like we can optimize page replacement and avoid copying by simple updating the request's page. [SzM: swap with new request's tmp page to avoid use after free.] Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2019-02-13 13:15:12 +01:00
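
The 'page-refs' merge at the top of this log (6b3a707736) describes capping user-controllable page references so that the page ref count cannot overflow, and notes that four billion references need at least 32GB for the pointers alone. The following is a minimal userspace sketch of that idea, not the kernel's code: `struct fake_page`, `fake_try_get_page()` and `min_pointer_bytes()` are hypothetical names invented for illustration, and the 8-byte pointer size and the `INT_MAX` cutoff are assumptions.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a page; the real struct page is far more complex. */
struct fake_page {
    int refcount; /* signed, so a wrapped count would show up as <= 0 */
};

/*
 * Take a reference only if it is safe to do so.
 * Returns false when the count is already non-positive (freed or wrapped),
 * or when one more increment would overflow the signed counter.
 * The caller must handle the failure instead of assuming success.
 */
static bool fake_try_get_page(struct fake_page *page)
{
    if (page->refcount <= 0 || page->refcount == INT_MAX)
        return false;
    page->refcount++;
    return true;
}

/* Bytes needed just to store one pointer per reference (assumes 8-byte pointers). */
static uint64_t min_pointer_bytes(uint64_t refs)
{
    return refs * UINT64_C(8);
}
```

With these assumptions, `min_pointer_bytes(1ULL << 32)` is 32 GiB, which matches the figure quoted in the merge message. The boolean return mirrors the direction of the listed commits, where `pipe_buf_get()` was changed to report whether it succeeded in raising the refcount.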


941 Commits