Commit Graph

224 Commits

Author SHA1 Message Date
Linus Torvalds
30c999937f Scheduler changes for v6.1:
- Debuggability:
 
      - Change most occurances of BUG_ON() to WARN_ON_ONCE()
 
      - Reorganize & fix TASK_ state comparisons, turn it into a bitmap
 
      - Update/fix misc scheduler debugging facilities
 
  - Load-balancing & regular scheduling:
 
      - Improve the behavior of the scheduler in presence of lot of
        SCHED_IDLE tasks - in particular they should not impact other
        scheduling classes.
 
      - Optimize task load tracking, cleanups & fixes
 
      - Clean up & simplify misc load-balancing code
 
  - Freezer:
 
      - Rewrite the core freezer to behave better wrt thawing and be simpler
        in general, by replacing PF_FROZEN with TASK_FROZEN & fixing/adjusting
        all the fallout.
 
  - Deadline scheduler:
 
      - Fix the DL capacity-aware code
 
      - Factor out dl_task_is_earliest_deadline() & replenish_dl_new_period()
 
      - Relax/optimize locking in task_non_contending()
 
  - Cleanups:
 
      - Factor out the update_current_exec_runtime() helper
 
      - Various cleanups, simplifications
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmM/01cRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1geZA/+PB4KC1T9aVxzaTHI36R03YgJYZmIdtxw
 wTf02MixePmz+gQCbepJbempGOh5ST28aOcI0xhdYOql5B63MaUBBMlB0HvGUyDG
 IU3zETqLMRtAbnSTdQFv8m++ECUtZYp8/x1FCel4WO7ya4ETkRu1NRfCoUepEhpZ
 aVAlae9LH3NBaF9t7s0PT2lTjf3pIzMFRkddJ0ywJhbFR3VnWat05fAK+J6fGY8+
 LS54coefNlJD4oDh5TY8uniL1j5SmWmmwbk9Cdj7bLU5P3dFSS0/+5FJNHJPVGDE
 srGT7wstRUcDrN0CnZo48VIUBiApJCCDqTfJYi9wNYd0NAHvwY6MIJJgEIY8mKsI
 L/qH26H81Wt+ezSZ/5JIlGlZ/LIeNaa6OO/fbWEYABBQogvvx3nxsRNUYKSQzumH
 CnSBasBjLnjWyLlK4qARM9cI7NFSEK6NUigrEx/7h8JFu/8T4DlSy6LsF1HUyKgq
 4+FJLAqG6cL0tcwB/fHYd0oRESN8dStnQhGxSojgufwLc7dlFULvCYF5JM/dX+/V
 IKwbOfIOeOn6ViMtSOXAEGdII+IQ2/ZFPwr+8Z5JC7NzvTVL6xlu/3JXkLZR3L7o
 yaXTSaz06h1vil7Z+GRf7RHc+wUeGkEpXh5vnarGZKXivhFdWsBdROIJANK+xR0i
 TeSLCxQxXlU=
 =KjMD
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Debuggability:

   - Change most occurances of BUG_ON() to WARN_ON_ONCE()

   - Reorganize & fix TASK_ state comparisons, turn it into a bitmap

   - Update/fix misc scheduler debugging facilities

  Load-balancing & regular scheduling:

   - Improve the behavior of the scheduler in presence of lot of
     SCHED_IDLE tasks - in particular they should not impact other
     scheduling classes.

   - Optimize task load tracking, cleanups & fixes

   - Clean up & simplify misc load-balancing code

  Freezer:

   - Rewrite the core freezer to behave better wrt thawing and be
     simpler in general, by replacing PF_FROZEN with TASK_FROZEN &
     fixing/adjusting all the fallout.

  Deadline scheduler:

   - Fix the DL capacity-aware code

   - Factor out dl_task_is_earliest_deadline() &
     replenish_dl_new_period()

   - Relax/optimize locking in task_non_contending()

  Cleanups:

   - Factor out the update_current_exec_runtime() helper

   - Various cleanups, simplifications"

* tag 'sched-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  sched: Fix more TASK_state comparisons
  sched: Fix TASK_state comparisons
  sched/fair: Move call to list_last_entry() in detach_tasks
  sched/fair: Cleanup loop_max and loop_break
  sched/fair: Make sure to try to detach at least one movable task
  sched: Show PF_flag holes
  freezer,sched: Rewrite core freezer logic
  sched: Widen TAKS_state literals
  sched/wait: Add wait_event_state()
  sched/completion: Add wait_for_completion_state()
  sched: Add TASK_ANY for wait_task_inactive()
  sched: Change wait_task_inactive()s match_state
  freezer,umh: Clean up freezer/initrd interaction
  freezer: Have {,un}lock_system_sleep() save/restore flags
  sched: Rename task_running() to task_on_cpu()
  sched/fair: Cleanup for SIS_PROP
  sched/fair: Default to false in test_idle_cores()
  sched/fair: Remove useless check in select_idle_core()
  sched/fair: Avoid double search on same cpu
  sched/fair: Remove redundant check in select_idle_smt()
  ...
2022-10-10 09:10:28 -07:00
Peter Zijlstra
f5d39b0208 freezer,sched: Rewrite core freezer logic
Rewrite the core freezer to behave better wrt thawing and be simpler
in general.

By replacing PF_FROZEN with TASK_FROZEN, a special block state, it is
ensured frozen tasks stay frozen until thawed and don't randomly wake
up early, as is currently possible.

As such, it does away with PF_FROZEN and PF_FREEZER_SKIP, freeing up
two PF_flags (yay!).

Specifically; the current scheme works a little like:

	freezer_do_not_count();
	schedule();
	freezer_count();

And either the task is blocked, or it lands in try_to_freezer()
through freezer_count(). Now, when it is blocked, the freezer
considers it frozen and continues.

However, on thawing, once pm_freezing is cleared, freezer_count()
stops working, and any random/spurious wakeup will let a task run
before its time.

That is, thawing tries to thaw things in explicit order; kernel
threads and workqueues before doing bringing SMP back before userspace
etc.. However due to the above mentioned races it is entirely possible
for userspace tasks to thaw (by accident) before SMP is back.

This can be a fatal problem in asymmetric ISA architectures (eg ARMv9)
where the userspace task requires a special CPU to run.

As said; replace this with a special task state TASK_FROZEN and add
the following state transitions:

	TASK_FREEZABLE	-> TASK_FROZEN
	__TASK_STOPPED	-> TASK_FROZEN
	__TASK_TRACED	-> TASK_FROZEN

The new TASK_FREEZABLE can be set on any state part of TASK_NORMAL
(IOW. TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE) -- any such state
is already required to deal with spurious wakeups and the freezer
causes one such when thawing the task (since the original state is
lost).

The special __TASK_{STOPPED,TRACED} states *can* be restored since
their canonical state is in ->jobctl.

With this, frozen tasks need an explicit TASK_FROZEN wakeup and are
free of undue (early / spurious) wakeups.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114649.055452969@infradead.org
2022-09-07 21:53:50 +02:00
Carlos Llamas
a0e44c64b6 binder: fix UAF of ref->proc caused by race condition
A transaction of type BINDER_TYPE_WEAK_HANDLE can fail to increment the
reference for a node. In this case, the target proc normally releases
the failed reference upon close as expected. However, if the target is
dying in parallel the call will race with binder_deferred_release(), so
the target could have released all of its references by now leaving the
cleanup of the new failed reference unhandled.

The transaction then ends and the target proc gets released making the
ref->proc now a dangling pointer. Later on, ref->node is closed and we
attempt to take spin_lock(&ref->proc->inner_lock), which leads to the
use-after-free bug reported below. Let's fix this by cleaning up the
failed reference on the spot instead of relying on the target to do so.

  ==================================================================
  BUG: KASAN: use-after-free in _raw_spin_lock+0xa8/0x150
  Write of size 4 at addr ffff5ca207094238 by task kworker/1:0/590

  CPU: 1 PID: 590 Comm: kworker/1:0 Not tainted 5.19.0-rc8 #10
  Hardware name: linux,dummy-virt (DT)
  Workqueue: events binder_deferred_func
  Call trace:
   dump_backtrace.part.0+0x1d0/0x1e0
   show_stack+0x18/0x70
   dump_stack_lvl+0x68/0x84
   print_report+0x2e4/0x61c
   kasan_report+0xa4/0x110
   kasan_check_range+0xfc/0x1a4
   __kasan_check_write+0x3c/0x50
   _raw_spin_lock+0xa8/0x150
   binder_deferred_func+0x5e0/0x9b0
   process_one_work+0x38c/0x5f0
   worker_thread+0x9c/0x694
   kthread+0x188/0x190
   ret_from_fork+0x10/0x20

Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Cc: stable <stable@kernel.org> # 4.14+
Link: https://lore.kernel.org/r/20220801182511.3371447-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-18 16:30:09 +02:00
Carlos Llamas
b7e241bbff binder: fix redefinition of seq_file attributes
The patchset in [1] exported some definitions to binder_internal.h in
order to make the debugfs entries such as 'stats' and 'transaction_log'
available in a binderfs instance. However, the DEFINE_SHOW_ATTRIBUTE
macro expands into a static function/variable pair, which in turn get
redefined each time a source file includes this internal header.

This problem was made evident after a report from the kernel test robot
<lkp@intel.com> where several W=1 build warnings are seen in downstream
kernels. See the following example:

  include/../drivers/android/binder_internal.h:111:23: warning: 'binder_stats_fops' defined but not used [-Wunused-const-variable=]
     111 | DEFINE_SHOW_ATTRIBUTE(binder_stats);
         |                       ^~~~~~~~~~~~
  include/linux/seq_file.h:174:37: note: in definition of macro 'DEFINE_SHOW_ATTRIBUTE'
     174 | static const struct file_operations __name ## _fops = {                 \
         |                                     ^~~~~~

This patch fixes the above issues by moving back the definitions into
binder.c and instead creates an array of the debugfs entries which is
more convenient to share with binderfs and iterate through.

  [1] https://lore.kernel.org/all/20190903161655.107408-1-hridya@google.com/

Fixes: 0e13e452da ("binder: Add stats, state and transactions files")
Fixes: 03e2e07e38 ("binder: Make transaction_log available in binderfs")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220701182041.2134313-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-08 15:44:19 +02:00
Li Li
9864bb4801 Binder: add TF_UPDATE_TXN to replace outdated txn
When the target process is busy, incoming oneway transactions are
queued in the async_todo list. If the clients continue sending extra
oneway transactions while the target process is frozen, this queue can
become too large to accommodate new transactions. That's why binder
driver introduced ONEWAY_SPAM_DETECTION to detect this situation. It's
helpful to debug the async binder buffer exhausting issue, but the
issue itself isn't solved directly.

In real cases applications are designed to send oneway transactions
repeatedly, delivering updated inforamtion to the target process.
Typical examples are Wi-Fi signal strength and some real time sensor
data. Even if the apps might only care about the lastet information,
all outdated oneway transactions are still accumulated there until the
frozen process is thawed later. For this kind of situations, there's
no existing method to skip those outdated transactions and deliver the
latest one only.

This patch introduces a new transaction flag TF_UPDATE_TXN. To use it,
use apps can set this new flag along with TF_ONE_WAY. When such an
oneway transaction is to be queued into the async_todo list of a frozen
process, binder driver will check if any previous pending transactions
can be superseded by comparing their code, flags and target node. If
such an outdated pending transaction is found, the latest transaction
will supersede that outdated one. This effectively prevents the async
binder buffer running out and saves unnecessary binder read workloads.

Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Li Li <dualli@google.com>
Link: https://lore.kernel.org/r/20220526220018.3334775-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-27 16:16:30 +02:00
Linus Torvalds
6684cf4290 fix for breakage in #work.fd this window
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYpz/IQAKCRBZ7Krx/gZQ
 666mAPwKOC/voemjl2m+RpSruxAbdlRvKei3IY8YxLfyv+rmUQD9HKLJJtQX2VRF
 QTFmQ3p7kx30ydwSbyY8Kpw3VMCDxgs=
 =1ZKm
 -----END PGP SIGNATURE-----

Merge tag 'pull-work.fd-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull file descriptor fix from Al Viro:
 "Fix for breakage in #work.fd this window"

* tag 'pull-work.fd-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix the breakage in close_fd_get_file() calling conventions change
2022-06-05 17:14:03 -07:00
Al Viro
40a1926022 fix the breakage in close_fd_get_file() calling conventions change
It used to grab an extra reference to struct file rather than
just transferring to caller the one it had removed from descriptor
table.  New variant doesn't, and callers need to be adjusted.

Reported-and-tested-by: syzbot+47dd250f527cb7bebf24@syzkaller.appspotmail.com
Fixes: 6319194ec5 ("Unify the primitives for file descriptor closing")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-05 15:03:03 -04:00
Linus Torvalds
dbe0ee4661 Descriptor handling cleanups
-----BEGIN PGP SIGNATURE-----
 
 iHQEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYpwEZAAKCRBZ7Krx/gZQ
 691uAP0QnwO0lOYOa41MfQ6QnzPbiYcffqtUuTJBWyfs8+WnugD2NNGQP7Zjtin9
 q0wcv2KA6yY7qgu7RCCbxU/7J0YdCg==
 =ywuh
 -----END PGP SIGNATURE-----

Merge tag 'pull-18-rc1-work.fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull file descriptor updates from Al Viro.

 - Descriptor handling cleanups

* tag 'pull-18-rc1-work.fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Unify the primitives for file descriptor closing
  fs: remove fget_many and fput_many interface
  io_uring_enter(): don't leave f.flags uninitialized
2022-06-04 18:52:00 -07:00
Schspa Shi
aed86f8add binder: fix atomic sleep when get extended error
binder_inner_proc_lock(thread->proc) is a spin lock, copy_to_user can't
be called with in this lock.

Copy it as a local variable to fix it.

Fixes: bd32889e84 ("binder: add BINDER_GET_EXTENDED_ERROR ioctl")
Reported-by: syzbot+46fff6434a7f968ecb39@syzkaller.appspotmail.com
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Schspa Shi <schspa@gmail.com>
Link: https://lore.kernel.org/r/20220518011754.49348-1-schspa@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-19 18:41:33 +02:00
Carlos Llamas
dafa5e9ab8 binder: fix potential UAF of target_{proc,thread}
Commit 9474be34a7 ("binder: add failed transaction logging info")
dereferences target_{proc,thread} after they have been potentially
freed by binder_proc_dec_tmpref() and binder_thread_dec_tmpref().

This patch delays the release of the two references after their last
usage. Fixes the following two errors reported by smatch:

  drivers/android/binder.c:3562 binder_transaction() error: dereferencing freed memory 'target_proc'
  drivers/android/binder.c:3563 binder_transaction() error: dereferencing freed memory 'target_thread'

Fixes: 9474be34a7 ("binder: add failed transaction logging info")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220517185817.598872-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-19 18:41:26 +02:00
Carlos Llamas
da4864962d binder: fix printk format for commands
Make sure we use unsigned format specifier %u for binder commands as
most of them are encoded above INT_MAX. This prevents negative values
when logging them as in the following case:

[  211.895781] binder: 8668:8668 BR_REPLY 258949 0:0, cmd -2143260157 size 0-0 ptr 0000006e766a8000-0000006e766a8000

Acked-by: Todd Kjos <tkjos@google.com>
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220509231901.3852573-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-19 18:41:11 +02:00
Al Viro
6319194ec5 Unify the primitives for file descriptor closing
Currently we have 3 primitives for removing an opened file from descriptor
table - pick_file(), __close_fd_get_file() and close_fd_get_file().  Their
calling conventions are rather odd and there's a code duplication for no
good reason.  They can be unified -

1) have __range_close() cap max_fd in the very beginning; that way
we don't need separate way for pick_file() to report being past the end
of descriptor table.

2) make {__,}close_fd_get_file() return file (or NULL) directly, rather
than returning it via struct file ** argument.  Don't bother with
(bogus) return value - nobody wants that -ENOENT.

3) make pick_file() return NULL on unopened descriptor - the only caller
that used to care about the distinction between descriptor past the end
of descriptor table and finding NULL in descriptor table doesn't give
a damn after (1).

4) lift ->files_lock out of pick_file()

That actually simplifies the callers, as well as the primitives themselves.
Code duplication is also gone...

Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-14 18:49:01 -04:00
Carlos Llamas
a15dac8b22 binder: additional transaction error logs
Log readable and specific error messages whenever a transaction failure
happens. This will ensure better context is given to regular users about
these unique error cases, without having to decode a cryptic log.

Acked-by: Todd Kjos <tkjos@google.com>
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220429235644.697372-6-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-09 15:43:24 +02:00
Carlos Llamas
06a3494ef6 binder: convert logging macros into functions
Converting binder_debug() and binder_user_error() macros into functions
reduces the overall object size by 16936 bytes when cross-compiled with
aarch64-linux-gnu-gcc 11.2.0:

$ size drivers/android/binder.o.{old,new}
   text	   data	    bss	    dec	    hex	filename
  77935	   6168	  20264	 104367	  197af	drivers/android/binder.o.old
  65551	   1616	  20264	  87431	  15587	drivers/android/binder.o.new

This is particularly beneficial to functions binder_transaction() and
binder_thread_write() which repeatedly use these macros and are both
part of the critical path for all binder transactions.

$ nm --size vmlinux.{old,new} |grep ' binder_transaction$'
0000000000002f60 t binder_transaction
0000000000002358 t binder_transaction

$ nm --size vmlinux.{old,new} |grep binder_thread_write
0000000000001c54 t binder_thread_write
00000000000014a8 t binder_thread_write

Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220429235644.697372-5-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-09 15:43:24 +02:00
Carlos Llamas
bd32889e84 binder: add BINDER_GET_EXTENDED_ERROR ioctl
Provide a userspace mechanism to pull precise error information upon
failed operations. Extending the current error codes returned by the
interfaces allows userspace to better determine the course of action.
This could be for instance, retrying a failed transaction at a later
point and thus offloading the error handling from the driver.

Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220429235644.697372-3-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-09 15:43:24 +02:00
Carlos Llamas
9474be34a7 binder: add failed transaction logging info
Make sure we log relevant information about failed transactions such as
the target proc/thread, call type and transaction id. These details are
particularly important when debugging userspace issues.

Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220429235644.697372-2-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-09 15:43:24 +02:00
Greg Kroah-Hartman
35a7609639 Linux 5.18-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmJu9FYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGAyEH/16xtJSpLmLwrQzG
 o+4ToQxSQ+/9UHyu0RTEvHg2THm9/8emtIuYyc/5FgdoWctcSa3AaDcveWmuWmkS
 KYcdhfJsaEqjNHS3OPYXN84fmo9Hel7263shu5+IYmP/sN0DfQp6UWTryX1q4B3Q
 4Pdutkuq63Uwd8nBZ5LXQBumaBrmkkuMgWEdT4+6FOo1mPzwdIGBxCuz1UsNNl5k
 chLWxkQfe2eqgWbYJrgCQfrVdORXVtoU2fGilZUNrHRVGkkldXkkz5clJfapyZD3
 odmZCEbrE4GPKgZwCmDERMfD1hzhZDtYKiHfOQ506szH5ykJjPBcOjHed7dA60eB
 J3+wdek=
 =39Ca
 -----END PGP SIGNATURE-----

Merge 5.18-rc5 into char-misc-next

We need the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-02 13:49:24 +02:00
Alessandro Astone
ef38de9217 binder: Gracefully handle BINDER_TYPE_FDA objects with num_fds=0
Some android userspace is sending BINDER_TYPE_FDA objects with
num_fds=0. Like the previous patch, this is reproducible when
playing a video.

Before commit 09184ae9b5 BINDER_TYPE_FDA objects with num_fds=0
were 'correctly handled', as in no fixup was performed.

After commit 09184ae9b5 we aggregate fixup and skip regions in
binder_ptr_fixup structs and distinguish between the two by using
the skip_size field: if it's 0, then it's a fixup, otherwise skip.
When processing BINDER_TYPE_FDA objects with num_fds=0 we add a
skip region of skip_size=0, and this causes issues because now
binder_do_deferred_txn_copies will think this was a fixup region.

To address that, return early from binder_translate_fd_array to
avoid adding an empty skip region.

Fixes: 09184ae9b5 ("binder: defer copies of pre-patched txn data")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Alessandro Astone <ales.astone@gmail.com>
Link: https://lore.kernel.org/r/20220415120015.52684-1-ales.astone@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:22:51 +02:00
Alessandro Astone
2d1746e3fd binder: Address corner cases in deferred copy and fixup
When handling BINDER_TYPE_FDA object we are pushing a parent fixup
with a certain skip_size but no scatter-gather copy object, since
the copy is handled standalone.
If BINDER_TYPE_FDA is the last children the scatter-gather copy
loop will never stop to skip it, thus we are left with an item in
the parent fixup list. This will trigger the BUG_ON().

This is reproducible in android when playing a video.
We receive a transaction that looks like this:
    obj[0] BINDER_TYPE_PTR, parent
    obj[1] BINDER_TYPE_PTR, child
    obj[2] BINDER_TYPE_PTR, child
    obj[3] BINDER_TYPE_FDA, child

Fixes: 09184ae9b5 ("binder: defer copies of pre-patched txn data")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Alessandro Astone <ales.astone@gmail.com>
Link: https://lore.kernel.org/r/20220415120015.52684-2-ales.astone@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:22:48 +02:00
Carlos Llamas
a8a570c6d0 binder: hold fd_install until allocating fds first
Al noted in [1] that fd_install can't be undone, so it must come last in
the fd translation sequence, only after we've successfully reserved all
descriptors and copied them into the transaction buffer.

This patch takes Al's proposed fix in [2] and makes a few tweaks to fold
the traversal of t->fd_fixups during release.

[1] https://lore.kernel.org/driverdev-devel/YHnJwRvUhaK3IM0l@zeniv-ca.linux.org.uk
[2] https://lore.kernel.org/driverdev-devel/YHo6Ln9VI1T7RmLK@zeniv-ca.linux.org.uk

Cc: Christian Brauner <christian.brauner@ubuntu.com>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220325232454.2210817-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 17:21:49 +02:00
Ajith P V
e80ca2e932 binder: use proper cacheflush header file
binder.c uses <asm/cacheflush.h> instead of <linux/cacheflush.h>.
Hence change cacheflush header file to proper one.

This change also avoid warning from checkpatch that shown below:
WARNING: Use #include <linux/cacheflush.h> instead of <asm/cacheflush.h>

Signed-off-by: Ajith P V <ajithpv.linux@gmail.com>
Link: https://lore.kernel.org/r/20211215132018.31522-1-ajithpv.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-21 10:11:27 +01:00
Greg Kroah-Hartman
af40d16042 Merge v5.15-rc5 into char-misc-next
We need the fixes in here as well, and also resolve some merge conflicts
in:
	drivers/misc/eeprom/at25.c

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-13 10:17:10 +01:00
Eric Biggers
a880b28a71 binder: use wake_up_pollfree()
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up
all exclusive waiters.  Yet, POLLFREE *must* wake up all waiters.  epoll
and aio poll are fortunately not affected by this, but it's very
fragile.  Thus, the new function wake_up_pollfree() has been introduced.

Convert binder to use wake_up_pollfree().

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: f5cb779ba1 ("ANDROID: binder: remove waitqueue when thread exits.")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-12-09 10:49:56 -08:00
Arnd Bergmann
9a0a930fe2 binder: fix pointer cast warning
binder_uintptr_t is not the same as uintptr_t, so converting it into a
pointer requires a second cast:

drivers/android/binder.c: In function 'binder_translate_fd_array':
drivers/android/binder.c:2511:28: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
 2511 |         sender_ufda_base = (void __user *)sender_uparent->buffer + fda->parent_offset;
      |                            ^

Fixes: 656e01f3ab ("binder: read pre-translated fds from sender buffer")
Acked-by: Todd Kjos <tkjos@google.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211207122448.1185769-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08 17:30:03 +01:00
Todd Kjos
09184ae9b5 binder: defer copies of pre-patched txn data
BINDER_TYPE_PTR objects point to memory areas in the
source process to be copied into the target buffer
as part of a transaction. This implements a scatter-
gather model where non-contiguous memory in a source
process is "gathered" into a contiguous region in
the target buffer.

The data can include pointers that must be fixed up
to correctly point to the copied data. To avoid making
source process pointers visible to the target process,
this patch defers the copy until the fixups are known
and then copies and fixeups are done together.

There is a special case of BINDER_TYPE_FDA which applies
the fixup later in the target process context. In this
case the user data is skipped (so no untranslated fds
become visible to the target).

Reviewed-by: Martijn Coenen <maco@android.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211130185152.437403-5-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03 14:29:39 +01:00
Todd Kjos
656e01f3ab binder: read pre-translated fds from sender buffer
This patch is to prepare for an up coming patch where we read
pre-translated fds from the sender buffer and translate them before
copying them to the target.  It does not change run time.

The patch adds two new parameters to binder_translate_fd_array() to
hold the sender buffer and sender buffer parent.  These parameters let
us call copy_from_user() directly from the sender instead of using
binder_alloc_copy_from_buffer() to copy from the target.  Also the patch
adds some new alignment checks.  Previously the alignment checks would
have been done in a different place, but this lets us print more
useful error messages.

Reviewed-by: Martijn Coenen <maco@android.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211130185152.437403-4-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03 14:29:39 +01:00
Todd Kjos
6d98eb95b4 binder: avoid potential data leakage when copying txn
Transactions are copied from the sender to the target
first and objects like BINDER_TYPE_PTR and BINDER_TYPE_FDA
are then fixed up. This means there is a short period where
the sender's version of these objects are visible to the
target prior to the fixups.

Instead of copying all of the data first, copy data only
after any needed fixups have been applied.

Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Reviewed-by: Martijn Coenen <maco@android.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211130185152.437403-3-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03 14:29:39 +01:00
Todd Kjos
fe6b186924 binder: fix handling of error during copy
If a memory copy function fails to copy the whole buffer,
a positive integar with the remaining bytes is returned.
In binder_translate_fd_array() this can result in an fd being
skipped due to the failed copy, but the loop continues
processing fds since the early return condition expects a
negative integer on error.

Fix by returning "ret > 0 ? -EINVAL : ret" to handle this case.

Fixes: bb4a2e48d5 ("binder: return errors from buffer copy functions")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211130185152.437403-2-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03 14:29:39 +01:00
Ajith P V
690cfa20d0 binder: remove repeat word from comment
binder.c file comment produce warning with checkpatch as below:
WARNING: Possible repeated word: 'for'
Remove the repeated word from the comment avoid this warning.

Signed-off-by: Ajith P V <ajithpv.linux@gmail.com>
Link: https://lore.kernel.org/r/20211125122218.6767-1-ajithpv.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-03 14:28:51 +01:00
Todd Kjos
c21a80ca06 binder: fix test regression due to sender_euid change
This is a partial revert of commit
29bc22ac5e ("binder: use euid from cred instead of using task").
Setting sender_euid using proc->cred caused some Android system test
regressions that need further investigation. It is a partial
reversion because subsequent patches rely on proc->cred.

Fixes: 29bc22ac5e ("binder: use euid from cred instead of using task")
Cc: stable@vger.kernel.org # 4.4+
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Change-Id: I9b1769a3510fed250bb21859ef8beebabe034c66
Link: https://lore.kernel.org/r/20211112180720.2858135-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17 16:08:52 +01:00
Linus Torvalds
5c904c66ed Char/Misc driver update for 5.16-rc1
Here is the big set of char and misc and other tiny driver subsystem
 updates for 5.16-rc1.
 
 Loads of things in here, all of which have been in linux-next for a
 while with no reported problems (except for one called out below.)
 
 Included are:
 	- habanana labs driver updates, including dma_buf usage,
 	  reviewed and acked by the dma_buf maintainers
 	- iio driver update (going through this tree not staging as they
 	  really do not belong going through that tree anymore)
 	- counter driver updates
 	- hwmon driver updates that the counter drivers needed, acked by
 	  the hwmon maintainer
 	- xillybus driver updates
 	- binder driver updates
 	- extcon driver updates
 	- dma_buf module namespaces added (will cause a build error in
 	  arm64 for allmodconfig, but that change is on its way through
 	  the drm tree)
 	- lkdtm driver updates
 	- pvpanic driver updates
 	- phy driver updates
 	- virt acrn and nitr_enclaves driver updates
 	- smaller char and misc driver updates
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPX2A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymUUgCbB4EKysgLuXYdjUalZDx+vvZO4k0AniS14O4k
 F+2dVSZ5WX6wumUzCaA6
 =bXQM
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char and misc and other tiny driver subsystem
  updates for 5.16-rc1.

  Loads of things in here, all of which have been in linux-next for a
  while with no reported problems (except for one called out below.)

  Included are:

   - habanana labs driver updates, including dma_buf usage, reviewed and
     acked by the dma_buf maintainers

   - iio driver update (going through this tree not staging as they
     really do not belong going through that tree anymore)

   - counter driver updates

   - hwmon driver updates that the counter drivers needed, acked by the
     hwmon maintainer

   - xillybus driver updates

   - binder driver updates

   - extcon driver updates

   - dma_buf module namespaces added (will cause a build error in arm64
     for allmodconfig, but that change is on its way through the drm
     tree)

   - lkdtm driver updates

   - pvpanic driver updates

   - phy driver updates

   - virt acrn and nitr_enclaves driver updates

   - smaller char and misc driver updates"

* tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (386 commits)
  comedi: dt9812: fix DMA buffers on stack
  comedi: ni_usb6501: fix NULL-deref in command paths
  arm64: errata: Enable TRBE workaround for write to out-of-range address
  arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  coresight: trbe: Work around write to out of range
  coresight: trbe: Make sure we have enough space
  coresight: trbe: Add a helper to determine the minimum buffer size
  coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  coresight: trbe: Add infrastructure for Errata handling
  coresight: trbe: Allow driver to choose a different alignment
  coresight: trbe: Decouple buffer base from the hardware base
  coresight: trbe: Add a helper to pad a given buffer area
  coresight: trbe: Add a helper to calculate the trace generated
  coresight: trbe: Defer the probe on offline CPUs
  coresight: trbe: Fix incorrect access of the sink specific data
  coresight: etm4x: Add ETM PID for Kryo-5XX
  coresight: trbe: Prohibit trace before disabling TRBE
  coresight: trbe: End the AUX handle on truncation
  coresight: trbe: Do not truncate buffer on IRQ
  coresight: trbe: Fix handling of spurious interrupts
  ...
2021-11-04 08:21:47 -07:00
Linus Torvalds
cdab10bf32 selinux/stable-5.16 PR 20211101
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmGANbAUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNaMBAAg+9gZr0F7xiafu8JFZqZfx/AQdJ2
 G2cn3le+/tXGZmF8m/+82lOaR6LeQLatgSDJNSkXWkKr0nRwseQJDbtRfvYJdn0t
 Ax05/Fmz6OGxQ2wgRYgaFiSrKpE5p3NhDtiLFVdkCJaQNe/8DZOc7NhBl6EjZf3x
 ubhl2hUiJ4AmiXGwcYhr4uKgP4nhW8OM1/OkskVi+bBMmLA8KTY9kslmIDP5E3BW
 29W4qhqeLNQupY5dGMEMVcyxY9ZUWpO39q4uOaQVZrUGE7xABkj/jhnxT5gFTSlI
 pu8VhsYXm9KuRVveIsv0L5SZfadwoM9YAl7ki1wD3W5rHqOAte3rBTm6VmNlQwfU
 MqxP65Jiyxudxet5Be3/dCRH/+MDQuwBxivgmZXbeVxor2SeznVb0GDaEUC5FSHu
 CJIgWtQzsPJMxgAEGXN4F3QGP0htTTJni56GUPOsrf4TIBW02TT+oLTLFRIokQQL
 INNOfwVSRXElnCsvxsHR4oB+JZ9pJyBaAmeupcQ6jmcKiWlbLj4s+W0U0pM5h91v
 hmMpz7KMxrX6gVL4gB2Jj4aN3r5YRbq26NBu6D+wdwwBTeTTocaHSpAqkv4buClf
 uNk3cG8Hkp8TTg9cM8jYgpxMyzKH/AI/Uw3VhEa1xCiq2Ck3DgfnZvnvcRRaZevU
 FPgmwgqePJXGi60=
 =sb8J
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Add LSM/SELinux/Smack controls and auditing for io-uring.

   As usual, the individual commit descriptions have more detail, but we
   were basically missing two things which we're adding here:

      + establishment of a proper audit context so that auditing of
        io-uring ops works similarly to how it does for syscalls (with
        some io-uring additions because io-uring ops are *not* syscalls)

      + additional LSM hooks to enable access control points for some of
        the more unusual io-uring features, e.g. credential overrides.

   The additional audit callouts and LSM hooks were done in conjunction
   with the io-uring folks, based on conversations and RFC patches
   earlier in the year.

 - Fixup the binder credential handling so that the proper credentials
   are used in the LSM hooks; the commit description and the code
   comment which is removed in these patches are helpful to understand
   the background and why this is the proper fix.

 - Enable SELinux genfscon policy support for securityfs, allowing
   improved SELinux filesystem labeling for other subsystems which make
   use of securityfs, e.g. IMA.

* tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  security: Return xattr name from security_dentry_init_security()
  selinux: fix a sock regression in selinux_ip_postroute_compat()
  binder: use cred instead of task for getsecid
  binder: use cred instead of task for selinux checks
  binder: use euid from cred instead of using task
  LSM: Avoid warnings about potentially unused hook variables
  selinux: fix all of the W=1 build warnings
  selinux: make better use of the nf_hook_state passed to the NF hooks
  selinux: fix race condition when computing ocontext SIDs
  selinux: remove unneeded ipv6 hook wrappers
  selinux: remove the SELinux lockdown implementation
  selinux: enable genfscon labeling for securityfs
  Smack: Brutalist io_uring support
  selinux: add support for the io_uring access controls
  lsm,io_uring: add LSM hooks to io_uring
  io_uring: convert io_uring to the secure anon inode interface
  fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure()
  audit: add filtering for io_uring records
  audit,io_uring,io-wq: add some basic audit support to io_uring
  audit: prepare audit_context for use in calling contexts beyond syscalls
2021-11-01 21:06:18 -07:00
Todd Kjos
32e9f56a96 binder: don't detect sender/target during buffer cleanup
When freeing txn buffers, binder_transaction_buffer_release()
attempts to detect whether the current context is the target by
comparing current->group_leader to proc->tsk. This is an unreliable
test. Instead explicitly pass an 'is_failure' boolean.

Detecting the sender was being used as a way to tell if the
transaction failed to be sent.  When cleaning up after
failing to send a transaction, there is no need to close
the fds associated with a BINDER_TYPE_FDA object. Now
'is_failure' can be used to accurately detect this case.

Fixes: 44d8047f1d ("binder: use standard functions to allocate fds")
Cc: stable <stable@vger.kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211015233811.3532235-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-19 09:38:55 +02:00
Todd Kjos
4d5b553974 binder: use cred instead of task for getsecid
Use the 'struct cred' saved at binder_open() to lookup
the security ID via security_cred_getsecid(). This
ensures that the security context that opened binder
is the one used to generate the secctx.

Cc: stable@vger.kernel.org # 5.4+
Fixes: ec74136ded ("binder: create node flag to request sender's security context")
Signed-off-by: Todd Kjos <tkjos@google.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-14 20:48:43 -04:00
Todd Kjos
52f8869337 binder: use cred instead of task for selinux checks
Since binder was integrated with selinux, it has passed
'struct task_struct' associated with the binder_proc
to represent the source and target of transactions.
The conversion of task to SID was then done in the hook
implementations. It turns out that there are race conditions
which can result in an incorrect security context being used.

Fix by using the 'struct cred' saved during binder_open and pass
it to the selinux subsystem.

Cc: stable@vger.kernel.org # 5.14 (need backport for earlier stables)
Fixes: 79af73079d ("Add security hooks to binder and implement the hooks for SELinux.")
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-14 20:48:04 -04:00
Todd Kjos
29bc22ac5e binder: use euid from cred instead of using task
Save the 'struct cred' associated with a binder process
at initial open to avoid potential race conditions
when converting to an euid.

Set a transaction's sender_euid from the 'struct cred'
saved at binder_open() instead of looking up the euid
from the binder proc's 'struct task'. This ensures
the euid is associated with the security context that
of the task that opened binder.

Cc: stable@vger.kernel.org # 4.4+
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Signed-off-by: Todd Kjos <tkjos@google.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Jann Horn <jannh@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-14 17:19:40 -04:00
Todd Kjos
5fdb55c1ac binder: make sure fd closes complete
During BC_FREE_BUFFER processing, the BINDER_TYPE_FDA object
cleanup may close 1 or more fds. The close operations are
completed using the task work mechanism -- which means the thread
needs to return to userspace or the file object may never be
dereferenced -- which can lead to hung processes.

Force the binder thread back to userspace if an fd is closed during
BC_FREE_BUFFER handling.

Fixes: 80cd795630 ("binder: fix use-after-free due to ksys_close() during fdget()")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Martijn Coenen <maco@android.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20210830195146.587206-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14 09:02:13 +02:00
Li Li
b564171ade binder: fix freeze race
Currently cgroup freezer is used to freeze the application threads, and
BINDER_FREEZE is used to freeze the corresponding binder interface.
There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any
existing transactions to drain out before actually freezing the binder
interface.

But freezing an app requires 2 steps, freezing the binder interface with
ioctl(BINDER_FREEZE) and then freezing the application main threads with
cgroupfs. This is not an atomic operation. The following race issue
might happen.

1) Binder interface is frozen by ioctl(BINDER_FREEZE);
2) Main thread A initiates a new sync binder transaction to process B;
3) Main thread A is frozen by "echo 1 > cgroup.freeze";
4) The response from process B reaches the frozen thread, which will
unexpectedly fail.

This patch provides a mechanism to check if there's any new pending
transaction happening between ioctl(BINDER_FREEZE) and freezing the
main thread. If there's any, the main thread freezing operation can
be rolled back to finish the pending transaction.

Furthermore, the response might reach the binder driver before the
rollback actually happens. That will still cause failed transaction.

As the other process doesn't wait for another response of the response,
the response transaction failure can be fixed by treating the response
transaction like an oneway/async one, allowing it to reach the frozen
thread. And it will be consumed when the thread gets unfrozen later.

NOTE: This patch reuses the existing definition of struct
binder_frozen_status_info but expands the bit assignments of __u32
member sync_recv.

To ensure backward compatibility, bit 0 of sync_recv still indicates
there's an outstanding sync binder transaction. This patch adds new
information to bit 1 of sync_recv, indicating the binder transaction
happens exactly when there's a race.

If an existing userspace app runs on a new kernel, a sync binder call
will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still
return the expected value (true). The app just doesn't check bit 1
intentionally so it doesn't have the ability to tell if there's a race.
This behavior is aligned with what happens on an old kernel which
doesn't set bit 1 at all.

A new userspace app can 1) check bit 0 to know if there's a sync binder
transaction happened when being frozen - same as before; and 2) check
bit 1 to know if that sync binder transaction happened exactly when
there's a race - a new information for rollback decision.

the same time, confirmed the pending transactions succeeded.

Fixes: 432ff1e916 ("binder: BINDER_FREEZE ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Li Li <dualli@google.com>
Test: stress test with apps being frozen and initiating binder calls at
Link: https://lore.kernel.org/r/20210910164210.2282716-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14 08:46:08 +02:00
Ramji Jiyani
1ae14df56c binder: Add invalid handle info in user error log
In the case of a failed transaction, only the thread and process id are
logged. Add the handle info for the reference to the target node in user
error log to aid debugging.

Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
Link: https://lore.kernel.org/r/20210802220446.1938347-1-ramjiyani@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-03 16:29:25 +02:00
Luca Stefani
ced081a436 binder: Return EFAULT if we fail BINDER_ENABLE_ONEWAY_SPAM_DETECTION
All the other ioctl paths return EFAULT in case the
copy_from_user/copy_to_user call fails, make oneway spam detection
follow the same paradigm.

Fixes: a7dc1e6f99 ("binder: tell userspace to dump current backtrace when detected oneway spamming")
Acked-by: Todd Kjos <tkjos@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
Link: https://lore.kernel.org/r/20210506193726.45118-1-luca.stefani.ge1@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-13 20:35:26 +02:00
Linus Torvalds
f1c921fb70 selinux/stable-5.13 PR 20210426
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmCHM2sUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNfCg/9GmoCyCh+ZRj5RGQ6M+yJas1+yyJQ
 uEfTNde54yfATUTaaWYnZG59yqzM3I2uaV11U7tqg8ajiFPxJKqbs5R9jl3lnSjH
 0Dg22nXPSCOTKcU0x/DeLoKRr+M9jO1K/nQ8NEZvYX4nC/OgtCvJqb/oEQZIKAk5
 2a7OEmNNQyFGd274p9dELaDHxN9UIaJ2PzQFXtq7ROHgBXQO4ONb2ajOf6mDSFQb
 vP/CDHwaH+pcE28w44oRy0/YBkO1SrdqoFQchg5yFagM5tQRLGkXK4OFSs5KHi5Q
 YMtmaOzMPIv1e5eaC1HuuMJYA4pPb30T9hFHP7tmBVZfmZaFaDeUs+BhMm98WTiS
 o0iTP7tfs36/poOR1Q0/sB06uvF9hUAAX1ZuE95YySifbXU9hsUc9b0uQSwCdg9P
 /J9rcdHLTpWqjw9n02mezWmAvo5U8ZvbDs+0xPIwI+3RTUP5t6mp+Hd5Tc7bPTq1
 0rpWXx+FQoSytFap5qiUSiwBp+HF6HQnNIXB0Muf6wctChoTjvo7TwoxH//z4kEm
 +SddhOCNkB7VC/X7hOxhl0F/rdHuXvb1AFIWjpTLJH2CR1PvMtF+sGey+uPT6hKZ
 /gvhmQGjFdph99eGlfVbCNvx1pM61O25IscaYD1T2wGImw+z7dX4WkG3WoOdDSkR
 bRjrBkcHh0gLhWk=
 =HTEy
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Add support for measuring the SELinux state and policy capabilities
   using IMA.

 - A handful of SELinux/NFS patches to compare the SELinux state of one
   mount with a set of mount options. Olga goes into more detail in the
   patch descriptions, but this is important as it allows more
   flexibility when using NFS and SELinux context mounts.

 - Properly differentiate between the subjective and objective LSM
   credentials; including support for the SELinux and Smack. My clumsy
   attempt at a proper fix for AppArmor didn't quite pass muster so John
   is working on a proper AppArmor patch, in the meantime this set of
   patches shouldn't change the behavior of AppArmor in any way. This
   change explains the bulk of the diffstat beyond security/.

 - Fix a problem where we were not properly terminating the permission
   list for two SELinux object classes.

* tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: add proper NULL termination to the secclass_map permissions
  smack: differentiate between subjective and objective task credentials
  selinux: clarify task subjective and objective credentials
  lsm: separate security_task_getsecid() into subjective and objective variants
  nfs: account for selinux security context when deciding to share superblock
  nfs: remove unneeded null check in nfs_fill_super()
  lsm,selinux: add new hook to compare new mount to an existing mount
  selinux: fix misspellings using codespell tool
  selinux: fix misspellings using codespell tool
  selinux: measure state and policy capabilities
  selinux: Allow context mounts for unpriviliged overlayfs
2021-04-27 13:42:11 -07:00
Hang Lu
a7dc1e6f99 binder: tell userspace to dump current backtrace when detected oneway spamming
When async binder buffer got exhausted, some normal oneway transactions
will also be discarded and may cause system or application failures. By
that time, the binder debug information we dump may not be relevant to
the root cause. And this issue is difficult to debug if without the
backtrace of the thread sending spam.

This change will send BR_ONEWAY_SPAM_SUSPECT to userspace when oneway
spamming is detected, request to dump current backtrace. Oneway spamming
will be reported only once when exceeding the threshold (target process
dips below 80% of its oneway space, and current process is responsible for
either more than 50 transactions, or more than 50% of the oneway space).
And the detection will restart when the async buffer has returned to a
healthy state.

Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Hang Lu <hangl@codeaurora.org>
Link: https://lore.kernel.org/r/1617961246-4502-3-git-send-email-hangl@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-10 10:52:04 +02:00
Hang Lu
0051691574 binder: fix the missing BR_FROZEN_REPLY in binder_return_strings
Add BR_FROZEN_REPLY in binder_return_strings to support stat function.

Fixes: ae28c1be1e ("binder: BINDER_GET_FROZEN_INFO ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Hang Lu <hangl@codeaurora.org>
Link: https://lore.kernel.org/r/1617961246-4502-2-git-send-email-hangl@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-10 10:52:04 +02:00
Marco Ballesio
ae28c1be1e binder: BINDER_GET_FROZEN_INFO ioctl
User space needs to know if binder transactions occurred to frozen
processes. Introduce a new BINDER_GET_FROZEN ioctl and keep track of
transactions occurring to frozen proceses.

Signed-off-by: Marco Ballesio <balejs@google.com>
Signed-off-by: Li Li <dualli@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20210316011630.1121213-4-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 08:26:31 +01:00
Marco Ballesio
95c16f9d9c binder: use EINTR for interrupted wait for work
when interrupted by a signal, binder_wait_for_work currently returns
-ERESTARTSYS. This error code isn't propagated to user space, but a way
to handle interruption due to signals must be provided to code using
this API.

Replace this instance of -ERESTARTSYS with -EINTR, which is propagated
to user space.

binder_wait_for_work

Signed-off-by: Marco Ballesio <balejs@google.com>
Signed-off-by: Li Li <dualli@google.com>
Test: built, booted, interrupted a worker thread within
Acked-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20210316011630.1121213-3-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 08:26:31 +01:00
Marco Ballesio
432ff1e916 binder: BINDER_FREEZE ioctl
Frozen tasks can't process binder transactions, so a way is required to
inform transmitting ends of communication failures due to the frozen
state of their receiving counterparts. Additionally, races are possible
between transitions to frozen state and binder transactions enqueued to
a specific process.

Implement BINDER_FREEZE ioctl for user space to inform the binder driver
about the intention to freeze or unfreeze a process. When the ioctl is
called, block the caller until any pending binder transactions toward
the target process are flushed. Return an error to transactions to
processes marked as frozen.

Co-developed-by: Todd Kjos <tkjos@google.com>
Acked-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Marco Ballesio <balejs@google.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Li Li <dualli@google.com>
Link: https://lore.kernel.org/r/20210316011630.1121213-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24 08:26:31 +01:00
Paul Moore
4ebd7651bf lsm: separate security_task_getsecid() into subjective and objective variants
Of the three LSMs that implement the security_task_getsecid() LSM
hook, all three LSMs provide the task's objective security
credentials.  This turns out to be unfortunate as most of the hook's
callers seem to expect the task's subjective credentials, although
a small handful of callers do correctly expect the objective
credentials.

This patch is the first step towards fixing the problem: it splits
the existing security_task_getsecid() hook into two variants, one
for the subjective creds, one for the objective creds.

  void security_task_getsecid_subj(struct task_struct *p,
				   u32 *secid);
  void security_task_getsecid_obj(struct task_struct *p,
				  u32 *secid);

While this patch does fix all of the callers to use the correct
variant, in order to keep this patch focused on the callers and to
ease review, the LSMs continue to use the same implementation for
both hooks.  The net effect is that this patch should not change
the behavior of the kernel in any way, it will be up to the latter
LSM specific patches in this series to change the hook
implementations and return the correct credentials.

Acked-by: Mimi Zohar <zohar@linux.ibm.com> (IMA)
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22 15:23:32 -04:00
Linus Torvalds
faf145d6f3 Merge branch 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull execve updates from Eric Biederman:
 "This set of changes ultimately fixes the interaction of posix file
  lock and exec. Fundamentally most of the change is just moving where
  unshare_files is called during exec, and tweaking the users of
  files_struct so that the count of files_struct is not unnecessarily
  played with.

  Along the way fcheck and related helpers were renamed to more
  accurately reflect what they do.

  There were also many other small changes that fell out, as this is the
  first time in a long time much of this code has been touched.

  Benchmarks haven't turned up any practical issues but Al Viro has
  observed a possibility for a lot of pounding on task_lock. So I have
  some changes in progress to convert put_files_struct to always rcu
  free files_struct. That wasn't ready for the merge window so that will
  have to wait until next time"

* 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
  exec: Move io_uring_task_cancel after the point of no return
  coredump: Document coredump code exclusively used by cell spufs
  file: Remove get_files_struct
  file: Rename __close_fd_get_file close_fd_get_file
  file: Replace ksys_close with close_fd
  file: Rename __close_fd to close_fd and remove the files parameter
  file: Merge __alloc_fd into alloc_fd
  file: In f_dupfd read RLIMIT_NOFILE once.
  file: Merge __fd_install into fd_install
  proc/fd: In fdinfo seq_show don't use get_files_struct
  bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu
  proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu
  file: Implement task_lookup_next_fd_rcu
  kcmp: In get_file_raw_ptr use task_lookup_fd_rcu
  proc/fd: In tid_fd_mode use task_lookup_fd_rcu
  file: Implement task_lookup_fd_rcu
  file: Rename fcheck lookup_fd_rcu
  file: Replace fcheck_files with files_lookup_fd_rcu
  file: Factor files_lookup_fd_locked out of fcheck_files
  file: Rename __fcheck_files to files_lookup_fd_raw
  ...
2020-12-15 19:29:43 -08:00
Eric W. Biederman
9fe83c43e7 file: Rename __close_fd_get_file close_fd_get_file
The function close_fd_get_file is explicitly a variant of
__close_fd[1].  Now that __close_fd has been renamed close_fd, rename
close_fd_get_file to be consistent with close_fd.

When __alloc_fd, __close_fd and __fd_install were introduced the
double underscore indicated that the function took a struct
files_struct parameter.  The function __close_fd_get_file never has so
the naming has always been inconsistent.  This just cleans things up
so there are not any lingering mentions or references __close_fd left
in the code.

[1] 80cd795630 ("binder: fix use-after-free due to ksys_close() during fdget()")
Link: https://lkml.kernel.org/r/20201120231441.29911-23-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-12-10 12:42:59 -06:00
Todd Kjos
0f966cba95 binder: add flag to clear buffer on txn complete
Add a per-transaction flag to indicate that the buffer
must be cleared when the transaction is complete to
prevent copies of sensitive data from being preserved
in memory.

Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20201120233743.3617529-1-tkjos@google.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09 15:41:21 +01:00