License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX license identifier should be applied
to a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was in a */uapi/* path, it was "GPL-2.0 WITH
Linux-syscall-note"; otherwise it was "GPL-2.0". The results were:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         270
GPL-2.0+ WITH Linux-syscall-note                        169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
LGPL-2.1+ WITH Linux-syscall-note                        15
GPL-1.0+ WITH Linux-syscall-note                         14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
LGPL-2.0+ WITH Linux-syscall-note                         4
LGPL-2.1 WITH Linux-syscall-note                          3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 15:07:57 +01:00
// SPDX-License-Identifier: GPL-2.0

2005-04-16 15:20:36 -07:00
#include <linux/linkage.h>
#include <linux/errno.h>

#include <asm/unistd.h>

2018-04-05 11:53:03 +02:00
#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
/* Architectures may override COND_SYSCALL and COND_SYSCALL_COMPAT */
#include <asm/syscall_wrapper.h>
#endif /* CONFIG_ARCH_HAS_SYSCALL_WRAPPER */

2007-10-16 23:29:25 -07:00
/* we can't #include <linux/syscalls.h> here,
   but tell gcc to not warn with -Wmissing-prototypes */
asmlinkage long sys_ni_syscall(void);

2005-04-16 15:20:36 -07:00
/*
 * Non-implemented system calls get redirected here.
 */
asmlinkage long sys_ni_syscall(void)
{
	return -ENOSYS;
}

2018-04-05 11:53:03 +02:00
#ifndef COND_SYSCALL
2018-03-04 19:06:35 +01:00
#define COND_SYSCALL(name) cond_syscall(sys_##name)
2018-04-05 11:53:03 +02:00
#endif /* COND_SYSCALL */

#ifndef COND_SYSCALL_COMPAT
2018-03-04 19:06:35 +01:00
#define COND_SYSCALL_COMPAT(name) cond_syscall(compat_sys_##name)
2018-04-05 11:53:03 +02:00
#endif /* COND_SYSCALL_COMPAT */
2018-03-04 19:06:35 +01:00

2018-03-06 19:53:01 +01:00
/*
 * This list is kept in the same order as include/uapi/asm-generic/unistd.h.
 * Architecture specific entries go below, followed by deprecated or obsolete
 * system calls.
 */

2018-03-04 19:06:35 +01:00
COND_SYSCALL(io_setup);
COND_SYSCALL_COMPAT(io_setup);
COND_SYSCALL(io_destroy);
COND_SYSCALL(io_submit);
COND_SYSCALL_COMPAT(io_submit);
COND_SYSCALL(io_cancel);
2019-01-07 00:33:08 +01:00
COND_SYSCALL(io_getevents_time32);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(io_getevents);
2019-01-07 00:33:08 +01:00
COND_SYSCALL(io_pgetevents_time32);
aio: implement io_pgetevents
This is the io_getevents equivalent of ppoll/pselect. It allows signals
and aio completions to be mixed properly (especially with IOCB_CMD_POLL)
and atomically executes the following sequence:

	sigset_t origmask;

	pthread_sigmask(SIG_SETMASK, &sigmask, &origmask);
	ret = io_getevents(ctx, min_nr, nr, events, timeout);
	pthread_sigmask(SIG_SETMASK, &origmask, NULL);

Note that unlike many other signal related calls we do not pass a sigmask
size, as that would get us to 7 arguments, which aren't easily supported
by the syscall infrastructure. It seems a lot less painful to just add a
new syscall variant in the unlikely case we're going to increase the
sigset size.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-02 19:51:00 +02:00
COND_SYSCALL(io_pgetevents);
2019-01-07 00:33:08 +01:00
COND_SYSCALL_COMPAT(io_pgetevents_time32);
2018-05-02 19:51:00 +02:00
COND_SYSCALL_COMPAT(io_pgetevents);
Add io_uring IO interface
The submission queue (SQ) and completion queue (CQ) rings are shared
between the application and the kernel. This eliminates the need to
copy data back and forth to submit and complete IO.
IO submissions use the io_uring_sqe data structure, and completions
are generated in the form of io_uring_cqe data structures. The SQ
ring is an index into the io_uring_sqe array, which makes it possible
to submit a batch of IOs without them being contiguous in the ring.
The CQ ring is always contiguous, as completion events are inherently
unordered, and hence any io_uring_cqe entry can point back to an
arbitrary submission.
Two new system calls are added for this:
io_uring_setup(entries, params)
	Sets up an io_uring instance for doing async IO. On success,
	returns a file descriptor that the application can mmap to
	gain access to the SQ ring, CQ ring, and io_uring_sqes.

io_uring_enter(fd, to_submit, min_complete, flags, sigset, sigsetsize)
	Initiates IO against the rings mapped to this fd, or waits for
	them to complete, or both. The behavior is controlled by the
	parameters passed in. If 'to_submit' is non-zero, then we'll
	try and submit new IO. If IORING_ENTER_GETEVENTS is set, the
	kernel will wait for 'min_complete' events, if they aren't
	already available. It's valid to set IORING_ENTER_GETEVENTS
	and 'min_complete' == 0 at the same time, this allows the
	kernel to return already completed events without waiting
	for them. This is useful only for polling, as for IRQ
	driven IO, the application can just check the CQ ring
	without entering the kernel.

With this setup, it's possible to do async IO with a single system
call. Future developments will enable polled IO with this interface,
and polled submission as well. The latter will enable an application
to do IO without doing ANY system calls at all.
For IRQ driven IO, an application only needs to enter the kernel for
completions if it wants to wait for them to occur.
Each io_uring is backed by a workqueue, to support buffered async IO
as well. We will only punt to an async context if the command would
need to wait for IO on the device side. Any data that can be accessed
directly in the page cache is done inline. This avoids the slowness
issue of usual threadpools, since cached data is accessed as quickly
as a sync interface.
Sample application: http://git.kernel.dk/cgit/fio/plain/t/io_uring.c
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-07 10:46:33 -07:00
COND_SYSCALL(io_uring_setup);
COND_SYSCALL(io_uring_enter);
io_uring: add support for pre-mapped user IO buffers
If we have fixed user buffers, we can map them into the kernel when we
setup the io_uring. That avoids the need to do get_user_pages() for
each and every IO.
To utilize this feature, the application must call io_uring_register()
after having setup an io_uring instance, passing in
IORING_REGISTER_BUFFERS as the opcode. The argument must be a pointer to
an iovec array, and the nr_args should contain how many iovecs the
application wishes to map.
If successful, these buffers are now mapped into the kernel, eligible
for IO. To use these fixed buffers, the application must use the
IORING_OP_READ_FIXED and IORING_OP_WRITE_FIXED opcodes, and then
set sqe->index to the desired buffer index. sqe->addr..sqe->addr+sqe->len
must point to somewhere inside the indexed buffer.
The application may register buffers throughout the lifetime of the
io_uring instance. It can call io_uring_register() with
IORING_UNREGISTER_BUFFERS as the opcode to unregister the current set of
buffers, and then register a new set. The application need not
unregister buffers explicitly before shutting down the io_uring
instance.
It's perfectly valid to setup a larger buffer, and then sometimes only
use parts of it for an IO. As long as the range is within the originally
mapped region, it will work just fine.
For now, buffers must not be file backed. If file backed buffers are
passed in, the registration will fail with -1/EOPNOTSUPP. This
restriction may be relaxed in the future.
RLIMIT_MEMLOCK is used to check how much memory we can pin. A somewhat
arbitrary 1G per buffer size is also imposed.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-09 09:16:05 -07:00
COND_SYSCALL(io_uring_register);

2018-03-06 19:53:01 +01:00
/* fs/xattr.c */

/* fs/dcache.c */

/* fs/cookies.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(lookup_dcookie);
COND_SYSCALL_COMPAT(lookup_dcookie);

2018-03-06 19:53:01 +01:00
/* fs/eventfd.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(eventfd2);

2018-03-06 19:53:01 +01:00
/* fs/eventfd.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(epoll_create1);
COND_SYSCALL(epoll_ctl);
COND_SYSCALL(epoll_pwait);
COND_SYSCALL_COMPAT(epoll_pwait);

2018-03-06 19:53:01 +01:00
/* fs/fcntl.c */

/* fs/inotify_user.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(inotify_init1);
COND_SYSCALL(inotify_add_watch);
COND_SYSCALL(inotify_rm_watch);

2018-03-06 19:53:01 +01:00
/* fs/ioctl.c */

/* fs/ioprio.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(ioprio_set);
COND_SYSCALL(ioprio_get);

2018-03-06 19:53:01 +01:00
/* fs/locks.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(flock);

2018-03-06 19:53:01 +01:00
/* fs/namei.c */

/* fs/namespace.c */

/* fs/nfsctl.c */

/* fs/open.c */

/* fs/pipe.c */

/* fs/quota.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(quotactl);

2018-03-06 19:53:01 +01:00
/* fs/readdir.c */

/* fs/read_write.c */

/* fs/sendfile.c */

/* fs/select.c */

/* fs/signalfd.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(signalfd4);
COND_SYSCALL_COMPAT(signalfd4);

2018-03-06 19:53:01 +01:00
/* fs/splice.c */

/* fs/stat.c */

/* fs/sync.c */

/* fs/timerfd.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(timerfd_create);
COND_SYSCALL(timerfd_settime);
2019-01-07 00:33:08 +01:00
COND_SYSCALL(timerfd_settime32);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(timerfd_gettime);
2019-01-07 00:33:08 +01:00
COND_SYSCALL(timerfd_gettime32);

2018-03-06 19:53:01 +01:00
/* fs/utimes.c */

/* kernel/acct.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(acct);

2018-03-06 19:53:01 +01:00
/* kernel/capability.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(capget);
COND_SYSCALL(capset);

2018-03-06 19:53:01 +01:00
/* kernel/exec_domain.c */

/* kernel/exit.c */

/* kernel/fork.c */
2019-06-21 01:26:35 +02:00
/* __ARCH_WANT_SYS_CLONE3 */
COND_SYSCALL(clone3);

2018-03-06 19:53:01 +01:00
/* kernel/futex.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(futex);
2019-01-07 00:33:08 +01:00
COND_SYSCALL(futex_time32);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(set_robust_list);
COND_SYSCALL_COMPAT(set_robust_list);
COND_SYSCALL(get_robust_list);
COND_SYSCALL_COMPAT(get_robust_list);

2018-03-06 19:53:01 +01:00
/* kernel/hrtimer.c */

/* kernel/itimer.c */

/* kernel/kexec.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(kexec_load);
COND_SYSCALL_COMPAT(kexec_load);

2018-03-06 19:53:01 +01:00
/* kernel/module.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(init_module);
COND_SYSCALL(delete_module);

2018-03-06 19:53:01 +01:00
/* kernel/posix-timers.c */

/* kernel/printk.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(syslog);

2018-03-06 19:53:01 +01:00
/* kernel/ptrace.c */

/* kernel/sched/core.c */

/* kernel/sys.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(setregid);
COND_SYSCALL(setgid);
COND_SYSCALL(setreuid);
COND_SYSCALL(setuid);
COND_SYSCALL(setresuid);
COND_SYSCALL(getresuid);
COND_SYSCALL(setresgid);
COND_SYSCALL(getresgid);
COND_SYSCALL(setfsuid);
COND_SYSCALL(setfsgid);
COND_SYSCALL(setgroups);
COND_SYSCALL(getgroups);

2018-03-06 19:53:01 +01:00
/* kernel/time.c */

/* kernel/timer.c */

/* ipc/mqueue.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(mq_open);
COND_SYSCALL_COMPAT(mq_open);
COND_SYSCALL(mq_unlink);
COND_SYSCALL(mq_timedsend);
2019-01-07 00:33:08 +01:00
COND_SYSCALL(mq_timedsend_time32);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(mq_timedreceive);
2019-01-07 00:33:08 +01:00
COND_SYSCALL(mq_timedreceive_time32);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(mq_notify);
COND_SYSCALL_COMPAT(mq_notify);
COND_SYSCALL(mq_getsetattr);
COND_SYSCALL_COMPAT(mq_getsetattr);

2018-03-06 19:53:01 +01:00
/* ipc/msg.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(msgget);
ipc: rename old-style shmctl/semctl/msgctl syscalls
The behavior of these system calls is slightly different between
architectures, as determined by the CONFIG_ARCH_WANT_IPC_PARSE_VERSION
symbol. Most architectures that implement the split IPC syscalls don't set
that symbol and only get the modern version, but alpha, arm, microblaze,
mips-n32, mips-n64 and xtensa expect the caller to pass the IPC_64 flag.
For the architectures that so far only implement sys_ipc(), i.e. m68k,
mips-o32, powerpc, s390, sh, sparc, and x86-32, we want the new behavior
when adding the split syscalls, so we need to distinguish between the
two groups of architectures.
The method I picked for this distinction is to have a separate system call
entry point: sys_old_*ctl() now uses ipc_parse_version, while sys_*ctl()
does not. The system call tables of the five architectures are changed
accordingly.
As an additional benefit, we no longer need the configuration specific
definition for ipc_parse_version(), it always does the same thing now,
but simply won't get called on architectures with the modern interface.
A small downside is that on architectures that do set
ARCH_WANT_IPC_PARSE_VERSION, we now have an extra set of entry points
that are never called. They only add a few bytes of bloat, so it seems
better to keep them compared to adding yet another Kconfig symbol.
I considered adding new syscall numbers for the IPC_64 variants for
consistency, but decided against that for now.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-12-31 22:22:40 +01:00
COND_SYSCALL(old_msgctl);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(msgctl);
COND_SYSCALL_COMPAT(msgctl);
2019-02-28 15:22:53 +01:00
COND_SYSCALL_COMPAT(old_msgctl);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(msgrcv);
COND_SYSCALL_COMPAT(msgrcv);
COND_SYSCALL(msgsnd);
COND_SYSCALL_COMPAT(msgsnd);

2018-03-06 19:53:01 +01:00
/* ipc/sem.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(semget);
2018-12-31 22:22:40 +01:00
COND_SYSCALL(old_semctl);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(semctl);
COND_SYSCALL_COMPAT(semctl);
2019-02-28 15:22:53 +01:00
COND_SYSCALL_COMPAT(old_semctl);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(semtimedop);
2019-01-07 00:33:08 +01:00
COND_SYSCALL(semtimedop_time32);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(semop);

2018-03-06 19:53:01 +01:00
/* ipc/shm.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(shmget);
2018-12-31 22:22:40 +01:00
COND_SYSCALL(old_shmctl);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(shmctl);
COND_SYSCALL_COMPAT(shmctl);
2019-02-28 15:22:53 +01:00
COND_SYSCALL_COMPAT(old_shmctl);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(shmat);
COND_SYSCALL_COMPAT(shmat);
COND_SYSCALL(shmdt);

2018-03-06 19:53:01 +01:00
/* net/socket.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(socket);
COND_SYSCALL(socketpair);
COND_SYSCALL(bind);
COND_SYSCALL(listen);
COND_SYSCALL(accept);
COND_SYSCALL(connect);
COND_SYSCALL(getsockname);
COND_SYSCALL(getpeername);
COND_SYSCALL(setsockopt);
COND_SYSCALL_COMPAT(setsockopt);
COND_SYSCALL(getsockopt);
COND_SYSCALL_COMPAT(getsockopt);
COND_SYSCALL(sendto);
COND_SYSCALL(shutdown);
COND_SYSCALL(recvfrom);
COND_SYSCALL_COMPAT(recvfrom);
COND_SYSCALL(sendmsg);
COND_SYSCALL_COMPAT(sendmsg);
COND_SYSCALL(recvmsg);
COND_SYSCALL_COMPAT(recvmsg);

2018-03-06 19:53:01 +01:00
/* mm/filemap.c */

/* mm/nommu.c, also with MMU */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(mremap);

2018-03-06 19:53:01 +01:00
/* security/keys/keyctl.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(add_key);
COND_SYSCALL(request_key);
COND_SYSCALL(keyctl);
COND_SYSCALL_COMPAT(keyctl);

2018-03-06 19:53:01 +01:00
/* arch/example/kernel/sys_example.c */

/* mm/fadvise.c */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(fadvise64_64);

2018-03-06 19:53:01 +01:00
/* mm/, CONFIG_MMU only */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(swapon);
COND_SYSCALL(swapoff);
COND_SYSCALL(mprotect);
COND_SYSCALL(msync);
COND_SYSCALL(mlock);
COND_SYSCALL(munlock);
COND_SYSCALL(mlockall);
COND_SYSCALL(munlockall);
COND_SYSCALL(mincore);
COND_SYSCALL(madvise);
COND_SYSCALL(remap_file_pages);
COND_SYSCALL(mbind);
COND_SYSCALL_COMPAT(mbind);
COND_SYSCALL(get_mempolicy);
COND_SYSCALL_COMPAT(get_mempolicy);
COND_SYSCALL(set_mempolicy);
COND_SYSCALL_COMPAT(set_mempolicy);
COND_SYSCALL(migrate_pages);
COND_SYSCALL_COMPAT(migrate_pages);
COND_SYSCALL(move_pages);
COND_SYSCALL_COMPAT(move_pages);
2018-03-06 19:53:01 +01:00

2018-03-04 19:06:35 +01:00
COND_SYSCALL(perf_event_open);
COND_SYSCALL(accept4);
COND_SYSCALL(recvmmsg);
y2038: socket: Add compat_sys_recvmmsg_time64
recvmmsg() takes two arguments to pointers of structures that differ
between 32-bit and 64-bit architectures: mmsghdr and timespec.
For y2038 compatibility, we are changing the native system call from
timespec to __kernel_timespec with a 64-bit time_t (in another patch),
and use the existing compat system call on both 32-bit and 64-bit
architectures for compatibility with traditional 32-bit user space.
As we now have two variants of recvmmsg() for 32-bit tasks that are both
different from the variant that we use on 64-bit tasks, this means we
also require two compat system calls!
The solution I picked is to flip things around: The existing
compat_sys_recvmmsg() call gets moved from net/compat.c into net/socket.c
and now handles the case for old user space on all architectures that
have set CONFIG_COMPAT_32BIT_TIME. A new compat_sys_recvmmsg_time64()
call gets added in the old place for 64-bit architectures only, this
one handles the case of a compat mmsghdr structure combined with
__kernel_timespec.
In the indirect sys_socketcall(), we now need to call either
do_sys_recvmmsg() or __compat_sys_recvmmsg(), depending on what kind of
architecture we are on. For compat_sys_socketcall(), no such change is
needed, we always call __compat_sys_recvmmsg().
I decided to not add a new SYS_RECVMMSG_TIME64 socketcall: Any libc
implementation for 64-bit time_t will need significant changes including
an updated asm/unistd.h, and it seems better to consistently use the
separate syscalls in that configuration, leaving the socketcall only for
backward compatibility with 32-bit time_t based libc.
The naming is asymmetric for the moment, so both existing syscall
entry points keep their names, while the new ones are recvmmsg_time32
and compat_recvmmsg_time64 respectively. I expect that we will rename
the compat syscalls later as we start using generated syscall tables
everywhere and add these entry points.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-04-18 13:43:52 +02:00
COND_SYSCALL(recvmmsg_time32);
2019-01-07 00:33:08 +01:00
COND_SYSCALL_COMPAT(recvmmsg_time32);
y2038: socket: Add compat_sys_recvmmsg_time64
recvmmsg() takes two arguments to pointers of structures that differ
between 32-bit and 64-bit architectures: mmsghdr and timespec.
For y2038 compatbility, we are changing the native system call from
timespec to __kernel_timespec with a 64-bit time_t (in another patch),
and use the existing compat system call on both 32-bit and 64-bit
architectures for compatibility with traditional 32-bit user space.
As we now have two variants of recvmmsg() for 32-bit tasks that are both
different from the variant that we use on 64-bit tasks, this means we
also require two compat system calls!
The solution I picked is to flip things around: The existing
compat_sys_recvmmsg() call gets moved from net/compat.c into net/socket.c
and now handles the case for old user space on all architectures that
have set CONFIG_COMPAT_32BIT_TIME. A new compat_sys_recvmmsg_time64()
call gets added in the old place for 64-bit architectures only, this
one handles the case of a compat mmsghdr structure combined with
__kernel_timespec.
In the indirect sys_socketcall(), we now need to call either
do_sys_recvmmsg() or __compat_sys_recvmmsg(), depending on what kind of
architecture we are on. For compat_sys_socketcall(), no such change is
needed, we always call __compat_sys_recvmmsg().
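The choice described above can be sketched as a small model. Everything here is hypothetical: the enum labels only name the two handlers mentioned in the text, and the single arch_is_64bit flag is a simplifying assumption, since the exact condition in the kernel may involve additional configuration options.

```c
#include <stdbool.h>

/* Hypothetical labels for the two handlers named in the text. */
enum demo_recvmmsg_handler {
	DEMO_DO_SYS_RECVMMSG,      /* native timespec layout */
	DEMO_COMPAT_SYS_RECVMMSG,  /* 32-bit time_t layout */
};

/*
 * Model of the indirect sys_socketcall() path.
 * Assumption: a 64-bit kernel uses the native handler, while a 32-bit
 * kernel serves old 32-bit time_t user space through the compat handler.
 */
enum demo_recvmmsg_handler demo_socketcall_recvmmsg(bool arch_is_64bit)
{
	return arch_is_64bit ? DEMO_DO_SYS_RECVMMSG
			     : DEMO_COMPAT_SYS_RECVMMSG;
}

/* Model of compat_sys_socketcall(): no switch is needed here. */
enum demo_recvmmsg_handler demo_compat_socketcall_recvmmsg(void)
{
	return DEMO_COMPAT_SYS_RECVMMSG;
}
```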
I decided to not add a new SYS_RECVMMSG_TIME64 socketcall: Any libc
implementation for 64-bit time_t will need significant changes including
an updated asm/unistd.h, and it seems better to consistently use the
separate syscalls in that configuration, leaving the socketcall only for
backward compatibility with 32-bit time_t based libc.
The naming is asymmetric for the moment, so both existing syscall
entry points keep their names, while the new ones are recvmmsg_time32
and compat_recvmmsg_time64 respectively. I expect that we will rename
the compat syscalls later as we start using generated syscall tables
everywhere and add these entry points.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-04-18 13:43:52 +02:00
COND_SYSCALL_COMPAT(recvmmsg_time64);
2018-03-06 19:53:01 +01:00

/*
 * Architecture specific syscalls: see further below
 */

/* fanotify */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(fanotify_init);
COND_SYSCALL(fanotify_mark);
2018-03-06 19:53:01 +01:00

/* open by handle */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(name_to_handle_at);
COND_SYSCALL(open_by_handle_at);
COND_SYSCALL_COMPAT(open_by_handle_at);
2018-03-06 19:53:01 +01:00

2018-03-04 19:06:35 +01:00
COND_SYSCALL(sendmmsg);
COND_SYSCALL_COMPAT(sendmmsg);
COND_SYSCALL(process_vm_readv);
COND_SYSCALL_COMPAT(process_vm_readv);
COND_SYSCALL(process_vm_writev);
COND_SYSCALL_COMPAT(process_vm_writev);
2018-03-06 19:53:01 +01:00

/* compare kernel pointers */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(kcmp);
2018-03-06 19:53:01 +01:00

2018-03-04 19:06:35 +01:00
COND_SYSCALL(finit_module);
2018-03-06 19:53:01 +01:00

/* operate on Secure Computing state */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(seccomp);
2018-03-06 19:53:01 +01:00

2018-03-04 19:06:35 +01:00
COND_SYSCALL(memfd_create);
2018-03-06 19:53:01 +01:00

/* access BPF programs and maps */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(bpf);
2018-03-06 19:53:01 +01:00

/* execveat */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(execveat);
2018-03-06 19:53:01 +01:00

2018-03-04 19:06:35 +01:00
COND_SYSCALL(userfaultfd);
2018-03-06 19:53:01 +01:00

/* membarrier */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(membarrier);
2018-03-06 19:53:01 +01:00

2018-03-04 19:06:35 +01:00
COND_SYSCALL(mlock2);
2018-03-06 19:53:01 +01:00

2018-03-04 19:06:35 +01:00
COND_SYSCALL(copy_file_range);
2018-03-06 19:53:01 +01:00

/* memory protection keys */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(pkey_mprotect);
COND_SYSCALL(pkey_alloc);
COND_SYSCALL(pkey_free);
2018-03-06 19:53:01 +01:00


/*
 * Architecture specific weak syscall entries.
 */

/* pciconfig: alpha, arm, arm64, ia64, sparc */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(pciconfig_read);
COND_SYSCALL(pciconfig_write);
COND_SYSCALL(pciconfig_iobase);
2018-03-06 19:53:01 +01:00

/* sys_socketcall: arm, mips, x86, ... */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(socketcall);
COND_SYSCALL_COMPAT(socketcall);
2018-03-06 19:53:01 +01:00

/* compat syscalls for arm64, x86, ... */
2018-03-04 19:06:35 +01:00
COND_SYSCALL_COMPAT(fanotify_mark);
2018-03-06 19:53:01 +01:00

/* x86 */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(vm86old);
COND_SYSCALL(modify_ldt);
COND_SYSCALL_COMPAT(quotactl32);
COND_SYSCALL(vm86);
COND_SYSCALL(kexec_file_load);
2018-03-06 19:53:01 +01:00

/* s390 */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(s390_pci_mmio_read);
COND_SYSCALL(s390_pci_mmio_write);
2019-01-16 14:15:20 +01:00
COND_SYSCALL(s390_ipc);
2018-03-04 19:06:35 +01:00
COND_SYSCALL_COMPAT(s390_ipc);
2018-03-06 19:53:01 +01:00

/* powerpc */
2018-05-02 23:20:48 +10:00
COND_SYSCALL(rtas);
2018-03-04 19:06:35 +01:00
COND_SYSCALL(spu_run);
COND_SYSCALL(spu_create);
COND_SYSCALL(subpage_prot);
2018-03-06 19:53:01 +01:00


/*
 * Deprecated system calls which are still defined in
 * include/uapi/asm-generic/unistd.h and wanted by >= 1 arch
 */

/* __ARCH_WANT_SYSCALL_NO_FLAGS */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(epoll_create);
COND_SYSCALL(inotify_init);
COND_SYSCALL(eventfd);
COND_SYSCALL(signalfd);
COND_SYSCALL_COMPAT(signalfd);
2018-03-06 19:53:01 +01:00

/* __ARCH_WANT_SYSCALL_OFF_T */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(fadvise64);
2018-03-06 19:53:01 +01:00

/* __ARCH_WANT_SYSCALL_DEPRECATED */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(epoll_wait);
COND_SYSCALL(recv);
COND_SYSCALL_COMPAT(recv);
COND_SYSCALL(send);
COND_SYSCALL(bdflush);
COND_SYSCALL(uselib);
2018-03-06 19:53:01 +01:00

2019-07-15 11:46:10 +02:00
/* optional: time32 */
COND_SYSCALL(time32);
COND_SYSCALL(stime32);
COND_SYSCALL(utime32);
COND_SYSCALL(adjtimex_time32);
COND_SYSCALL(sched_rr_get_interval_time32);
COND_SYSCALL(nanosleep_time32);
COND_SYSCALL(rt_sigtimedwait_time32);
COND_SYSCALL_COMPAT(rt_sigtimedwait_time32);
COND_SYSCALL(timer_settime32);
COND_SYSCALL(timer_gettime32);
COND_SYSCALL(clock_settime32);
COND_SYSCALL(clock_gettime32);
COND_SYSCALL(clock_getres_time32);
COND_SYSCALL(clock_nanosleep_time32);
COND_SYSCALL(utimes_time32);
COND_SYSCALL(futimesat_time32);
COND_SYSCALL(pselect6_time32);
COND_SYSCALL_COMPAT(pselect6_time32);
COND_SYSCALL(ppoll_time32);
COND_SYSCALL_COMPAT(ppoll_time32);
COND_SYSCALL(utimensat_time32);
COND_SYSCALL(clock_adjtime32);
2018-03-06 19:53:01 +01:00

/*
 * The syscalls below are not found in include/uapi/asm-generic/unistd.h
 */

/* obsolete: SGETMASK_SYSCALL */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(sgetmask);
COND_SYSCALL(ssetmask);
2018-03-06 19:53:01 +01:00

/* obsolete: SYSFS_SYSCALL */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(sysfs);
2018-03-06 19:53:01 +01:00

/* obsolete: __ARCH_WANT_SYS_IPC */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(ipc);
COND_SYSCALL_COMPAT(ipc);
2018-03-06 19:53:01 +01:00

/* obsolete: UID16 */
2018-03-04 19:06:35 +01:00
COND_SYSCALL(chown16);
COND_SYSCALL(fchown16);
COND_SYSCALL(getegid16);
COND_SYSCALL(geteuid16);
COND_SYSCALL(getgid16);
COND_SYSCALL(getgroups16);
COND_SYSCALL(getresgid16);
COND_SYSCALL(getresuid16);
COND_SYSCALL(getuid16);
COND_SYSCALL(lchown16);
COND_SYSCALL(setfsgid16);
COND_SYSCALL(setfsuid16);
COND_SYSCALL(setgid16);
COND_SYSCALL(setgroups16);
COND_SYSCALL(setregid16);
COND_SYSCALL(setresgid16);
COND_SYSCALL(setresuid16);
COND_SYSCALL(setreuid16);
COND_SYSCALL(setuid16);
rseq: Introduce restartable sequences system call
Expose a new system call allowing each thread to register one userspace
memory area to be used as an ABI between kernel and user-space for two
purposes: user-space restartable sequences and quick access to read the
current CPU number value from user-space.
* Restartable sequences (per-cpu atomics)
Restartable sequences allow user-space to perform update operations on
per-cpu data without requiring heavy-weight atomic operations.
The restartable critical sections (percpu atomics) work has been started
by Paul Turner and Andrew Hunter. It lets the kernel handle restart of
critical sections. [1] [2] The re-implementation proposed here brings a
few simplifications to the ABI which facilitates porting to other
architectures and speeds up the user-space fast path.
Here are benchmarks of various rseq use-cases.
Test hardware:
arm32: ARMv7 Processor rev 4 (v7l) "Cubietruck", 2-core
x86-64: Intel E5-2630 v3@2.40GHz, 16-core, hyperthreading
The following benchmarks were all performed on a single thread.
* Per-CPU statistic counter increment
          getcpu+atomic (ns/op)    rseq (ns/op)    speedup
arm32:                    344.0            31.4       11.0
x86-64:                    15.3             2.0        7.7
* LTTng-UST: write event 32-bit header, 32-bit payload into tracer
per-cpu buffer
          getcpu+atomic (ns/op)    rseq (ns/op)    speedup
arm32:                   2502.0          2250.0        1.1
x86-64:                   117.4            98.0        1.2
* liburcu percpu: lock-unlock pair, dereference, read/compare word
          getcpu+atomic (ns/op)    rseq (ns/op)    speedup
arm32:                    751.0           128.5        5.8
x86-64:                    53.4            28.6        1.9
* jemalloc memory allocator adapted to use rseq
Using rseq with per-cpu memory pools in jemalloc at Facebook (based on
rseq 2016 implementation):
The production workload response-time has 1-2% gain avg. latency, and
the P99 overall latency drops by 2-3%.
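The speedup column in the benchmark tables above is the getcpu+atomic time divided by the rseq time, rounded to one decimal place. A minimal sketch of that arithmetic follows; the helper names demo_speedup() and demo_round1() are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Speedup = baseline time per operation / improved time per operation. */
double demo_speedup(double baseline_ns, double rseq_ns)
{
	assert(baseline_ns > 0.0 && rseq_ns > 0.0);
	return baseline_ns / rseq_ns;
}

/* Round a positive value to one decimal place, as the tables do. */
double demo_round1(double x)
{
	return (double)(long)(x * 10.0 + 0.5) / 10.0;
}
```

For example, demo_round1(demo_speedup(344.0, 31.4)) reproduces the arm32 figure for the per-CPU counter increment, and the same call with 751.0 and 128.5 reproduces the arm32 liburcu figure.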
* Reading the current CPU number
Speeding up reading the current CPU number on which the caller thread is
running is done by keeping the current CPU number up to date within the
cpu_id field of the memory area registered by the thread. This is done
by making scheduler preemption set the TIF_NOTIFY_RESUME flag on the
current thread. Upon return to user-space, a notify-resume handler
updates the current CPU value within the registered user-space memory
area. User-space can then read the current CPU number directly from
memory.
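The update flow described above can be modelled in ordinary user-space C. This is only a simulation under stated assumptions: the structs and the demo_* functions are hypothetical, the notify_resume field merely stands in for TIF_NOTIFY_RESUME, and nothing here calls the real rseq system call.

```c
#include <stdint.h>

/* Hypothetical model of the memory area a thread registers. */
struct demo_rseq_area {
	volatile uint32_t cpu_id;  /* written by the "kernel", read by the thread */
};

/* Hypothetical model of the kernel's per-thread state. */
struct demo_thread {
	struct demo_rseq_area *area;  /* registered user-space area */
	int notify_resume;            /* stands in for TIF_NOTIFY_RESUME */
	uint32_t running_on;          /* CPU the scheduler placed the thread on */
};

/* Scheduler preemption: only set the flag, defer the store. */
void demo_preempt(struct demo_thread *t, uint32_t new_cpu)
{
	t->running_on = new_cpu;
	t->notify_resume = 1;
}

/* Return to user space: the notify-resume handler refreshes cpu_id. */
void demo_return_to_user(struct demo_thread *t)
{
	if (t->notify_resume) {
		t->area->cpu_id = t->running_on;
		t->notify_resume = 0;
	}
}

/* User-space fast path: a plain memory load, no system call. */
uint32_t demo_read_cpu(const struct demo_rseq_area *area)
{
	return area->cpu_id;
}
```

The point of the design is visible in demo_read_cpu(): the cost of keeping the value fresh is paid on the preemption path, so the read itself is a single load.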
Keeping the current cpu id in a memory area shared between kernel and
user-space is an improvement over current mechanisms available to read
the current CPU number, which has the following benefits over
alternative approaches:
- 35x speedup on ARM vs system call through glibc
- 20x speedup on x86 compared to calling glibc, which calls vdso
executing a "lsl" instruction,
- 14x speedup on x86 compared to inlined "lsl" instruction,
- Unlike vdso approaches, this cpu_id value can be read from an inline
assembly, which makes it a useful building block for restartable
sequences.
- The approach of reading the cpu id through memory mapping shared
between kernel and user-space is portable (e.g. ARM), which is not the
case for the lsl-based x86 vdso.
On x86, yet another possible approach would be to use the gs segment
selector to point to user-space per-cpu data. This approach performs
similarly to the cpu id cache, but it has two disadvantages: it is
not portable, and it is incompatible with existing applications already
using the gs segment selector for other purposes.
Benchmarking various approaches for reading the current CPU number:
ARMv7 Processor rev 4 (v7l)
Machine model: Cubietruck
- Baseline (empty loop): 8.4 ns
- Read CPU from rseq cpu_id: 16.7 ns
- Read CPU from rseq cpu_id (lazy register): 19.8 ns
- glibc 2.19-0ubuntu6.6 getcpu: 301.8 ns
- getcpu system call: 234.9 ns
x86-64 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz:
- Baseline (empty loop): 0.8 ns
- Read CPU from rseq cpu_id: 0.8 ns
- Read CPU from rseq cpu_id (lazy register): 0.8 ns
- Read using gs segment selector: 0.8 ns
- "lsl" inline assembly: 13.0 ns
- glibc 2.19-0ubuntu6 getcpu: 16.6 ns
- getcpu system call: 53.9 ns
- Speed (benchmark taken on v8 of patchset)
Running 10 runs of hackbench -l 100000 seems to indicate, contrary to
expectations, that enabling CONFIG_RSEQ slightly accelerates the
scheduler:
Configuration: 2 sockets * 8-core Intel(R) Xeon(R) CPU E5-2630 v3 @
2.40GHz (directly on hardware, hyperthreading disabled in BIOS, energy
saving disabled in BIOS, turboboost disabled in BIOS, cpuidle.off=1
kernel parameter), with a Linux v4.6 defconfig+localyesconfig,
restartable sequences series applied.
* CONFIG_RSEQ=n
avg.: 41.37 s
std.dev.: 0.36 s
* CONFIG_RSEQ=y
avg.: 40.46 s
std.dev.: 0.33 s
- Size
On x86-64, between CONFIG_RSEQ=n/y, the text size increase of vmlinux is
567 bytes, and the data size increase of vmlinux is 5696 bytes.
[1] https://lwn.net/Articles/650333/
[2] http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1695/original/LPC%20-%20PerCpu%20Atomics.pdf
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Chris Lameter <cl@linux.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Andrew Hunter <ahh@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Maurer <bmaurer@fb.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-api@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com
Link: http://lkml.kernel.org/r/20150624222609.6116.86035.stgit@kitami.mtv.corp.google.com
Link: https://lkml.kernel.org/r/20180602124408.8430-3-mathieu.desnoyers@efficios.com
2018-06-02 08:43:54 -04:00

/* restartable sequence */
COND_SYSCALL(rseq);
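The entries in this file rely on weak symbols: a syscall that is not built into the kernel resolves to a "not implemented" fallback. The following is a sketch of that idea only, assuming GCC or Clang on an ELF target. DEMO_COND_SYSCALL and demo_sys_ni_syscall() are hypothetical names, and this is not the kernel macro's actual definition.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's "not implemented" handler. */
long demo_sys_ni_syscall(void)
{
	return -ENOSYS;
}

/*
 * Demo version of a COND_SYSCALL()-style macro: declare demo_sys_<name>
 * as a weak alias of the fallback. A strong definition of the same
 * symbol in another object file would take precedence at link time.
 */
#define DEMO_COND_SYSCALL(name) \
	long demo_sys_##name(void) \
		__attribute__((weak, alias("demo_sys_ni_syscall")))

DEMO_COND_SYSCALL(rseq);
DEMO_COND_SYSCALL(bpf);
```

With no strong definition linked in, calling demo_sys_rseq() lands in the fallback and yields -ENOSYS, which is the behaviour user space sees for a syscall that was configured out.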