binder has used write-side mmap_sem semaphore to release memory
mapped at address space of the process. However, right lock to
release pages is down_read, not down_write because page table lock
already protects the race for parallel freeing.
Please do not use mmap_sem write-side lock which is well known
contented lock.
Cc: Todd Kjos <tkjos@google.com>
Cc: Martijn Coenen <maco@android.com>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes sparse issues reported by the kbuild test robot running
on https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
char-misc-testing: bde4a19fc0 ("binder: use userspace pointer as base
of buffer space")
Error output (drivers/android/binder_alloc_selftest.c):
sparse: warning: incorrect type in assignment (different address spaces)
sparse: expected void *page_addr
sparse: got void [noderef] <asn:1> *user_data
sparse: error: subtraction of different types can't work
Fixed by adding necessary "__user" tags.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that alloc->buffer points to the userspace vm_area
rename buffer->data to buffer->user_data and rename
local pointers that hold user addresses. Also use the
"__user" tag to annotate all user pointers so sparse
can flag cases where user pointer vaues are copied to
kernel pointers. Refactor code to use offsets instead
of user pointers.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove user_buffer_offset since there is no kernel
buffer pointer anymore.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove the kernel's vm_area and the code that maps
buffer pages into it.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Refactor the functions to validate and fixup struct
binder_buffer pointer objects to avoid using vm_area
pointers. Instead copy to/from kernel space using
binder_alloc_copy_to_buffer() and
binder_alloc_copy_from_buffer(). The following
functions were refactored:
refactor binder_validate_ptr()
binder_validate_fixup()
binder_fixup_parent()
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When creating or tearing down a transaction, the binder driver
examines objects in the buffer and takes appropriate action.
To do this without needing to dereference pointers into the
buffer, the local copies of the objects are needed. This patch
introduces a function to validate and copy binder objects
from the buffer to a local structure.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid vm_area when copying to or from binder buffers.
Instead, new copy functions are added that copy from
kernel space to binder buffer space. These use
kmap_atomic() and kunmap_atomic() to create temporary
mappings and then memcpy() is used to copy within
that page.
Also, kmap_atomic() / kunmap_atomic() use the appropriate
cache flushing to support VIVT cache architectures.
Allow binder to build if CPU_CACHE_VIVT is defined.
Several uses of the new functions are added here. More
to follow in subsequent patches.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The binder driver uses a vm_area to map the per-process
binder buffer space. For 32-bit android devices, this is
now taking too much vmalloc space. This patch removes
the use of vm_area when copying the transaction data
from the sender to the buffer space. Instead of using
copy_from_user() for multi-page copies, it now uses
binder_alloc_copy_user_to_buffer() which uses kmap()
and kunmap() to map each page, and uses copy_from_user()
for copying to that page.
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binderfs should not have a separate device_initcall(). When a kernel is
compiled with CONFIG_ANDROID_BINDERFS register the filesystem alongside
CONFIG_ANDROID_IPC. This use-case is especially sensible when users specify
CONFIG_ANDROID_IPC=y, CONFIG_ANDROID_BINDERFS=y and
ANDROID_BINDER_DEVICES="".
When CONFIG_ANDROID_BINDERFS=n then this always succeeds so there's no
regression potential for legacy workloads.
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We currently adhere to the reserved devices limit when creating new
binderfs devices in binderfs instances not located in the inital ipc
namespace. But it is still possible to rob the host instances of their 4
reserved devices by creating the maximum allowed number of devices in a
single binderfs instance located in a non-initial ipc namespace and then
mounting 4 separate binderfs instances in non-initial ipc namespaces. That
happens because the limit is currently not respected for the creation of
the initial binder-control device node. Block this nonsense by performing
the same check in binderfs_binder_ctl_create() that we perform in
binderfs_binder_device_create().
Fixes: 36bdf3cae0 ("binderfs: reserve devices for initial mount")
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To allow servers to verify client identity, allow a node
flag to be set that causes the sender's security context
to be delivered with the transaction. The BR_TRANSACTION
command is extended in BR_TRANSACTION_SEC_CTX to
contain a pointer to the security context string.
Signed-off-by: Todd Kjos <tkjos@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The binderfs_binder_ctl_create() call is a no-op on subsequent calls and
the first call is done before we unlock the suberblock. Hence, there is no
need to take inode_lock() in there. Let's remove it.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Al pointed out that first calling kill_litter_super() before cleaning up
info is more correct since destroying info doesn't depend on the state of
the dentries and inodes. That the opposite remains true is not guaranteed.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- switch from d_alloc_name() + d_lookup() to lookup_one_len():
Instead of using d_alloc_name() and then doing a d_lookup() with the
allocated dentry to find whether a device with the name we're trying to
create already exists switch to using lookup_one_len(). The latter will
either return the existing dentry or a new one.
- switch from kmalloc() + strscpy() to kmemdup():
Use a more idiomatic way to copy the name for the new dentry that
userspace gave us.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Al pointed out that on binderfs_fill_super() error
deactivate_locked_super() will call binderfs_kill_super() so all of the
freeing and putting we currently do in binderfs_fill_super() is unnecessary
and buggy. Let's simply return errors and let binderfs_fill_super() take
care of cleaning up on error.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- make binderfs control dentry immutable:
We don't allow to unlink it since it is crucial for binderfs to be
useable but if we allow to rename it we make the unlink trivial to
bypass. So prevent renaming too and simply treat the control dentry as
immutable.
- add is_binderfs_control_device() helper:
Take the opportunity and turn the check for the control dentry into a
separate helper is_binderfs_control_device() since it's now used in two
places.
- simplify binderfs_rename():
Instead of hand-rolling our custom version of simple_rename() just dumb
the whole function down to first check whether we're trying to rename the
control dentry. If we do EPERM the caller and if not call simple_rename().
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix to return a negative error code -ENOMEM from the new_inode() and
d_make_root() error handling cases instead of 0, as done elsewhere in
this function.
Fixes: 849d540ddf ("binderfs: implement "max" mount option")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>