summaryrefslogtreecommitdiff
path: root/kernel/futex.c (follow)
Commit message (Collapse)AuthorAge
* Merge remote-tracking branch 'common/android-4.4-p' into ↵Michael Bestas2021-09-16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lineage-18.1-caf-msm8998 # By Thomas Gleixner (11) and others # Via Greg Kroah-Hartman * google/common/android-4.4-p: Linux 4.4.283 Revert "floppy: reintroduce O_NDELAY fix" fbmem: add margin check to fb_check_caps() vt_kdsetmode: extend console locking vringh: Use wiov->used to check for read/write desc order virtio: Improve vq->broken access to avoid any compiler optimization net: marvell: fix MVNETA_TX_IN_PRGRS bit number e1000e: Fix the max snoop/no-snoop latency for 10M USB: serial: option: add new VID/PID to support Fibocom FG150 Revert "USB: serial: ch341: fix character loss at high transfer rates" can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters Linux 4.4.282 mmc: dw_mmc: Fix occasional hang after tuning on eMMC ASoC: intel: atom: Fix breakage for PCM buffer address setup ipack: tpci200: fix many double free issues in tpci200_pci_probe ALSA: hda - fix the 'Capture Switch' value change notifications mmc: dw_mmc: Fix hang on data CRC error mmc: dw_mmc: call the dw_mci_prep_stop_abort() by default mmc: dw_mmc: Wait for data transfer after response errors. net: qlcnic: add missed unlock in qlcnic_83xx_flash_read32 net: 6pack: fix slab-out-of-bounds in decode_data dccp: add do-while-0 stubs for dccp_pr_debug macros Bluetooth: hidp: use correct wait queue when removing ctrl_wait scsi: core: Avoid printing an error if target_alloc() returns -ENXIO scsi: megaraid_mm: Fix end of loop tests for list_for_each_entry() dmaengine: of-dma: router_xlate to return -EPROBE_DEFER if controller is not yet available ARM: dts: am43x-epos-evm: Reduce i2c0 bus speed for tps65218 dmaengine: usb-dmac: Fix PM reference leak in usb_dmac_probe() KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl (CVE-2021-3653) vmlinux.lds.h: Handle clang's module.{c,d}tor sections PCI/MSI: Enforce MSI[X] entry updates to be visible PCI/MSI: Enforce that MSI-X table entry is masked for update PCI/MSI: Mask all unused MSI-X entries PCI/MSI: Protect msi_desc::masked for multi-MSI PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown() PCI/MSI: Correct misleading comments PCI/MSI: Do not set invalid bits in MSI mask PCI/MSI: Enable and mask MSI-X early x86/tools: Fix objdump version check again xen/events: Fix race in set_evtchn_to_irq net: Fix memory leak in ieee802154_raw_deliver i2c: dev: zero out array used for i2c reads from userspace ASoC: intel: atom: Fix reference to PCM buffer address ANDROID: xt_quota2: set usersize in xt_match registration object ANDROID: xt_quota2: clear quota2_log message before sending ANDROID: xt_quota2: remove trailing junk which might have a digit in it UPSTREAM: netfilter: x_tables: fix pointer leaks to userspace Linux 4.4.281 ovl: prevent private clone if bind mount is not allowed net: xilinx_emaclite: Do not print real IOMEM pointer USB:ehci:fix Kunpeng920 ehci hardware problem pipe: increase minimum default pipe size to 2 pages net/qla3xxx: fix schedule while atomic in ql_wait_for_drvr_lock and ql_adapter_reset alpha: Send stop IPI to send to online CPUs reiserfs: check directory items on read from disk reiserfs: add check for root_inode in reiserfs_fill_super pcmcia: i82092: fix a null pointer dereference bug MIPS: Malta: Do not byte-swap accesses to the CBUS UART serial: 8250: Mask out floating 16/32-bit bus bits media: rtl28xxu: fix zero-length control request scripts/tracing: fix the bug that can't parse raw_trace_func USB: serial: ftdi_sio: add device ID for Auto-M3 OP-COM v2 USB: serial: ch341: fix character loss at high transfer rates USB: serial: option: add Telit FD980 composition 0x1056 Bluetooth: defer cleanup of resources in hci_unregister_dev() net: vxge: fix use-after-free in vxge_device_unregister net: pegasus: fix uninit-value in get_interrupt_interval bnx2x: fix an error code in bnx2x_nic_load() mips: Fix non-POSIX regexp net: natsemi: Fix missing pci_disable_device() in probe and remove media: videobuf2-core: dequeue if start_streaming fails scsi: sr: Return correct event when media event code is 3 ALSA: seq: Fix racy deletion of subscriber Linux 4.4.280 rcu: Update documentation of rcu_read_unlock() futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock() futex: Avoid freeing an active timer futex: Handle transient "ownerless" rtmutex state correctly rtmutex: Make wait_lock irq safe futex: Futex_unlock_pi() determinism futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock() futex: Pull rt_mutex_futex_unlock() out from under hb->lock futex,rt_mutex: Introduce rt_mutex_init_waiter() futex: Cleanup refcounting futex: Rename free_pi_state() to put_pi_state() Linux 4.4.279 can: raw: raw_setsockopt(): fix raw_rcv panic for sock UAF Revert "Bluetooth: Shutdown controller after workqueues are flushed or cancelled" net: Fix zero-copy head len calculation. r8152: Fix potential PM refcount imbalance regulator: rt5033: Fix n_voltages settings for BUCK and LDO btrfs: mark compressed range uptodate only if all bio succeed Conflicts: net/bluetooth/hci_core.c net/netfilter/xt_quota2.c Change-Id: I66e2384c8cc40448a7bff34bb935c74e6103e924
| * futex: Avoid freeing an active timerThomas Gleixner2021-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 97181f9bd57405b879403763284537e27d46963d ] Alexander reported a hrtimer debug_object splat: ODEBUG: free active (active state 0) object type: hrtimer hint: hrtimer_wakeup (kernel/time/hrtimer.c:1423) debug_object_free (lib/debugobjects.c:603) destroy_hrtimer_on_stack (kernel/time/hrtimer.c:427) futex_lock_pi (kernel/futex.c:2740) do_futex (kernel/futex.c:3399) SyS_futex (kernel/futex.c:3447 kernel/futex.c:3415) do_syscall_64 (arch/x86/entry/common.c:284) entry_SYSCALL64_slow_path (arch/x86/entry/entry_64.S:249) Which was caused by commit: cfafcd117da0 ("futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()") ... losing the hrtimer_cancel() in the shuffle. Where previously the hrtimer_cancel() was done by rt_mutex_slowlock() we now need to do it manually. Reported-by: Alexander Levin <alexander.levin@verizon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: cfafcd117da0 ("futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704101802370.2906@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Joe Korty <joe.korty@concurrent-rt.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Handle transient "ownerless" rtmutex state correctlyMike Galbraith2021-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 9f5d1c336a10c0d24e83e40b4c1b9539f7dba627 ] Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner(). This is one possible chain of events leading to this: Task Prio Operation T1 120 lock(F) T2 120 lock(F) -> blocks (top waiter) T3 50 (RT) lock(F) -> boosts T1 and blocks (new top waiter) XX timeout/ -> wakes T2 signal T1 50 unlock(F) -> wakes T3 (rtmutex->owner == NULL, waiter bit is set) T2 120 cleanup -> try_to_take_mutex() fails because T3 is the top waiter and the lower priority T2 cannot steal the lock. -> fixup_pi_state_owner() sees newowner == NULL -> BUG_ON() The comment states that this is invalid and rt_mutex_real_owner() must return a non NULL owner when the trylock failed, but in case of a queued and woken up waiter rt_mutex_real_owner() == NULL is a valid transient state. The higher priority waiter has simply not yet managed to take over the rtmutex. The BUG_ON() is therefore wrong and this is just another retry condition in fixup_pi_state_owner(). Drop the locks, so that T3 can make progress, and then try the fixup again. Gratian provided a great analysis, traces and a reproducer. The analysis is to the point, but it confused the hell out of that tglx dude who had to page in all the futex horrors again. Condensed version is above. [ tglx: Wrote comment and changelog ] Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") Reported-by: Gratian Crisan <gratian.crisan@ni.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87a6w6x7bb.fsf@ni.com Link: https://lore.kernel.org/r/87sg9pkvf7.fsf@nanos.tec.linutronix.de Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Joe Korty <joe.korty@concurrent-rt.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Futex_unlock_pi() determinismPeter Zijlstra2021-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit bebe5b514345f09be2c15e414d076b02ecb9cce8 ] The problem with returning -EAGAIN when the waiter state mismatches is that it becomes very hard to proof a bounded execution time on the operation. And seeing that this is a RT operation, this is somewhat important. While in practise; given the previous patch; it will be very unlikely to ever really take more than one or two rounds, proving so becomes rather hard. However, now that modifying wait_list is done while holding both hb->lock and wait_lock, the scenario can be avoided entirely by acquiring wait_lock while still holding hb-lock. Doing a hand-over, without leaving a hole. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104152.112378812@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Joe Korty <joe.korty@concurrent-rt.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()Peter Zijlstra2021-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit cfafcd117da0216520568c195cb2f6cd1980c4bb ] By changing futex_lock_pi() to use rt_mutex_*_proxy_lock() all wait_list modifications are done under both hb->lock and wait_lock. This closes the obvious interleave pattern between futex_lock_pi() and futex_unlock_pi(), but not entirely so. See below: Before: futex_lock_pi() futex_unlock_pi() unlock hb->lock lock hb->lock unlock hb->lock lock rt_mutex->wait_lock unlock rt_mutex_wait_lock -EAGAIN lock rt_mutex->wait_lock list_add unlock rt_mutex->wait_lock schedule() lock rt_mutex->wait_lock list_del unlock rt_mutex->wait_lock <idem> -EAGAIN lock hb->lock After: futex_lock_pi() futex_unlock_pi() lock hb->lock lock rt_mutex->wait_lock list_add unlock rt_mutex->wait_lock unlock hb->lock schedule() lock hb->lock unlock hb->lock lock hb->lock lock rt_mutex->wait_lock list_del unlock rt_mutex->wait_lock lock rt_mutex->wait_lock unlock rt_mutex_wait_lock -EAGAIN unlock hb->lock It does however solve the earlier starvation/live-lock scenario which got introduced with the -EAGAIN since unlike the before scenario; where the -EAGAIN happens while futex_unlock_pi() doesn't hold any locks; in the after scenario it happens while futex_unlock_pi() actually holds a lock, and then it is serialized on that lock. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104152.062785528@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Joe Korty <joe.korty@concurrent-rt.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Pull rt_mutex_futex_unlock() out from under hb->lockPeter Zijlstra2021-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 16ffa12d742534d4ff73e8b3a4e81c1de39196f0 ] There's a number of 'interesting' problems, all caused by holding hb->lock while doing the rt_mutex_unlock() equivalient. Notably: - a PI inversion on hb->lock; and, - a SCHED_DEADLINE crash because of pointer instability. The previous changes: - changed the locking rules to cover {uval,pi_state} with wait_lock. - allow to do rt_mutex_futex_unlock() without dropping wait_lock; which in turn allows to rely on wait_lock atomicity completely. - simplified the waiter conundrum. It's now sufficient to hold rtmutex::wait_lock and a reference on the pi_state to protect the state consistency, so hb->lock can be dropped before calling rt_mutex_futex_unlock(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.900002056@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Joe Korty <joe.korty@concurrent-rt.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex,rt_mutex: Introduce rt_mutex_init_waiter()Peter Zijlstra2021-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 50809358dd7199aa7ce232f6877dd09ec30ef374 ] Since there's already two copies of this code, introduce a helper now before adding a third one. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.950039479@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Joe Korty <joe.korty@concurrent-rt.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Cleanup refcountingPeter Zijlstra2021-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit bf92cf3a5100f5a0d5f9834787b130159397cb22 ] Add a put_pit_state() as counterpart for get_pi_state() so the refcounting becomes consistent. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.801778516@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Joe Korty <joe.korty@concurrent-rt.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Rename free_pi_state() to put_pi_state()Thomas Gleixner2021-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 29e9ee5d48c35d6cf8afe09bdf03f77125c9ac11 ] free_pi_state() is confusing as it is in fact only freeing/caching the pi state when the last reference is gone. Rename it to put_pi_state() which reflects better what it is doing. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <darren@dvhart.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Bhuvanesh_Surachari@mentor.com Cc: Andy Lowe <Andy_Lowe@mentor.com> Link: http://lkml.kernel.org/r/20151219200607.259636467@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Joe Korty <joe.korty@concurrent-rt.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge branch 'android-4.4-p' of ↵Michael Bestas2021-04-19
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://android.googlesource.com/kernel/common into lineage-18.1-caf-msm8998 This brings LA.UM.9.2.r1-02700-SDMxx0.0 up to date with https://android.googlesource.com/kernel/common/ android-4.4-p at commit: f5978a07daf67 Merge 4.4.267 into android-4.4-p Conflicts: arch/alpha/include/asm/Kbuild drivers/mmc/core/mmc.c drivers/usb/gadget/configfs.c Change-Id: I978d923e97c18f284edbd32c0c19ac70002f7d83
| * futex: fix dead code in attach_to_pi_owner()Thomas Gleixner2021-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch comes directly from an origin patch (commit 91509e84949fc97e7424521c32a9e227746e0b85) in v4.9. And it is part of a full patch which was originally back-ported to v4.14 as commit e6e00df182908f34360c3c9f2d13cc719362e9c0 The handle_exit_race() function is defined in commit 9c3f39860367 ("futex: Cure exit race"), which never returns -EBUSY. This results in a small piece of dead code in the attach_to_pi_owner() function: int ret = handle_exit_race(uaddr, uval, p); /* Never return -EBUSY */ ... if (ret == -EBUSY) *exiting = p; /* dead code */ The return value -EBUSY is added to handle_exit_race() in upsteam commit ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting"). This commit was incorporated into v4.9.255, before the function handle_exit_race() was introduced, whitout Modify handle_exit_race(). To fix dead code, extract the change of handle_exit_race() from commit ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting"), re-incorporated. Lee writes: This commit takes the remaining functional snippet of: ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting") ... and is the correct fix for this issue. Fixes: 9c3f39860367 ("futex: Cure exit race") Cc: stable@vger.kernel.org # v4.9.258 Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Reviewed-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Cure exit raceThomas Gleixner2021-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit da791a667536bf8322042e38ca85d55a78d3c273 upstream. This patch comes directly from an origin patch (commit 9c3f3986036760c48a92f04b36774aa9f63673f80) in v4.9. Stefan reported, that the glibc tst-robustpi4 test case fails occasionally. That case creates the following race between sys_exit() and sys_futex_lock_pi(): CPU0 CPU1 sys_exit() sys_futex() do_exit() futex_lock_pi() exit_signals(tsk) No waiters: tsk->flags |= PF_EXITING; *uaddr == 0x00000PID mm_release(tsk) Set waiter bit exit_robust_list(tsk) { *uaddr = 0x80000PID; Set owner died attach_to_pi_owner() { *uaddr = 0xC0000000; tsk = get_task(PID); } if (!tsk->flags & PF_EXITING) { ... attach(); tsk->flags |= PF_EXITPIDONE; } else { if (!(tsk->flags & PF_EXITPIDONE)) return -EAGAIN; return -ESRCH; <--- FAIL } ESRCH is returned all the way to user space, which triggers the glibc test case assert. Returning ESRCH unconditionally is wrong here because the user space value has been changed by the exiting task to 0xC0000000, i.e. the FUTEX_OWNER_DIED bit is set and the futex PID value has been cleared. This is a valid state and the kernel has to handle it, i.e. taking the futex. Cure it by rereading the user space value when PF_EXITING and PF_EXITPIDONE is set in the task which 'owns' the futex. If the value has changed, let the kernel retry the operation, which includes all regular sanity checks and correctly handles the FUTEX_OWNER_DIED case. If it hasn't changed, then return ESRCH as there is no way to distinguish this case from malfunctioning user space. This happens when the exiting task did not have a robust list, the robust list was corrupted or the user space value in the futex was simply bogus. Reported-by: Stefan Liebler <stli@linux.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Sasha Levin <sashal@kernel.org> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=200467 Link: https://lkml.kernel.org/r/20181210152311.986181245@linutronix.de Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [Lee: Required to satisfy functional dependency from futex back-port. Re-add the missing handle_exit_race() parts from: 3d4775df0a89 ("futex: Replace PF_EXITPIDONE with a state")] Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Change locking rulesPeter Zijlstra2021-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 734009e96d1983ad739e5b656e03430b3660c913 upstream. This patch comes directly from an origin patch (commit dc3f2ff11740159080f2e8e359ae0ab57c8e74b6) in v4.9. Currently futex-pi relies on hb->lock to serialize everything. But hb->lock creates another set of problems, especially priority inversions on RT where hb->lock becomes a rt_mutex itself. The rt_mutex::wait_lock is the most obvious protection for keeping the futex user space value and the kernel internal pi_state in sync. Rework and document the locking so rt_mutex::wait_lock is held accross all operations which modify the user space value and the pi state. This allows to invoke rt_mutex_unlock() (including deboost) without holding hb->lock as a next step. Nothing yet relies on the new locking rules. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.751993333@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [Lee: Back-ported in support of a previous futex back-port attempt] Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge branch 'android-4.4-p' of ↵Michael Bestas2021-03-17
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://android.googlesource.com/kernel/common into lineage-18.1-caf-msm8998 This brings LA.UM.9.2.r1-02700-SDMxx0.0 up to date with https://android.googlesource.com/kernel/common/ android-4.4-p at commit: 58bc8e0469d08 Merge 4.4.261 into android-4.4-p Conflicts: drivers/block/zram/zram_drv.c drivers/block/zram/zram_drv.h mm/zsmalloc.c Change-Id: I451bffa685eaaea04938bc6d0b8e3f4bb0f869e9
| * futex: fix spin_lock() / spin_unlock_irq() imbalanceThomas Schoebel-Theuer2021-03-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch and problem analysis is specific for 4.4 LTS, due to incomplete backporting of other fixes. Later LTS series have different backports. The following is obviously incorrect: static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this, struct futex_hash_bucket *hb) { [...] raw_spin_lock(&pi_state->pi_mutex.wait_lock); [...] raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock); [...] } The 4.4-specific fix should probably go in the direction of b4abf91047c, making everything irq-safe. Probably, backporting of b4abf91047c to 4.4 LTS could thus be another good idea. However, this might involve some more 4.4-specific work and require thorough testing: > git log --oneline v4.4..b4abf91047c -- kernel/futex.c kernel/locking/rtmutex.c | wc -l 10 So this patch is just an obvious quickfix for now. Hint: the lock order is documented in 4.9.y and later. A similar documenting is missing in 4.4.y. Please somebody either backport also, or write a new description, if there would be some differences I cannot easily see at the moment. Without reliable docs, inspection of the locking correctness may become a pain. Signed-off-by: Thomas Schoebel-Theuer <tst@1und1.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Lee Jones <lee.jones@linaro.org> Fixes: 394fc4981426 ("futex: Rework inconsistent rt_mutex/futex_q state") Fixes: 6510e4a2d04f ("futex,rt_mutex: Provide futex specific rt_mutex API") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: fix irq self-deadlock and satisfy assertionThomas Schoebel-Theuer2021-03-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch and problem analysis is specific for 4.4 LTS, due to incomplete backporting of other fixes. Later LTS series have different backports. Since v4.4.257 when CONFIG_PROVE_LOCKING=y the following triggers right after reboot of our pre-life systems which equal our production setup: Mar 03 11:27:33 icpu-test-bap10 kernel: ================================= Mar 03 11:27:33 icpu-test-bap10 kernel: [ INFO: inconsistent lock state ] Mar 03 11:27:33 icpu-test-bap10 kernel: 4.4.259-rc1-grsec+ #730 Not tainted Mar 03 11:27:33 icpu-test-bap10 kernel: --------------------------------- Mar 03 11:27:33 icpu-test-bap10 kernel: inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. Mar 03 11:27:33 icpu-test-bap10 kernel: apache2-ssl/9310 [HC0[0]:SC0[0]:HE1:SE1] takes: Mar 03 11:27:33 icpu-test-bap10 kernel: (&p->pi_lock){?.-.-.}, at: [<ffffffff810abb68>] pi_state_update_owner+0x51/0xd7 Mar 03 11:27:33 icpu-test-bap10 kernel: {IN-HARDIRQ-W} state was registered at: Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81088c4a>] __lock_acquire+0x3a7/0xe4a Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81089b01>] lock_acquire+0x18d/0x1bc Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff8170151c>] _raw_spin_lock_irqsave+0x3e/0x50 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810719a5>] try_to_wake_up+0x2c/0x210 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81071bf3>] default_wake_function+0xd/0xf Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81083588>] autoremove_wake_function+0x11/0x35 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810830b2>] __wake_up_common+0x48/0x7c Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff8108311a>] __wake_up+0x34/0x46 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff814c2a23>] megasas_complete_int_cmd+0x31/0x33 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff814c60a0>] megasas_complete_cmd+0x570/0x57b Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff814d05bc>] complete_cmd_fusion+0x23e/0x33d Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff814d0768>] megasas_isr_fusion+0x67/0x74 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81091ae5>] handle_irq_event_percpu+0x134/0x311 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81091cf5>] handle_irq_event+0x33/0x51 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810948b9>] handle_edge_irq+0xa3/0xc2 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81005f7b>] handle_irq+0xf9/0x101 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81005700>] do_IRQ+0x80/0xf5 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81702228>] ret_from_intr+0x0/0x20 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff8100cab0>] arch_cpu_idle+0xa/0xc Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81083a5a>] default_idle_call+0x1e/0x20 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81083b9d>] cpu_startup_entry+0x141/0x22f Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff816fb853>] rest_init+0x135/0x13b Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81d5ce99>] start_kernel+0x3fa/0x40a Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81d5c2af>] x86_64_start_reservations+0x2a/0x2c Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81d5c3d0>] x86_64_start_kernel+0x11f/0x12c Mar 03 11:27:33 icpu-test-bap10 kernel: irq event stamp: 1457 Mar 03 11:27:33 icpu-test-bap10 kernel: hardirqs last enabled at (1457): [<ffffffff81042a69>] get_user_pages_fast+0xeb/0x14f Mar 03 11:27:33 icpu-test-bap10 kernel: hardirqs last disabled at (1456): [<ffffffff810429dd>] get_user_pages_fast+0x5f/0x14f Mar 03 11:27:33 icpu-test-bap10 kernel: softirqs last enabled at (1446): [<ffffffff815e127d>] release_sock+0x142/0x14d Mar 03 11:27:33 icpu-test-bap10 kernel: softirqs last disabled at (1444): [<ffffffff815e116f>] release_sock+0x34/0x14d Mar 03 11:27:33 icpu-test-bap10 kernel: other info that might help us debug this: Mar 03 11:27:33 icpu-test-bap10 kernel: Possible unsafe locking scenario: Mar 03 11:27:33 icpu-test-bap10 kernel: CPU0 Mar 03 11:27:33 icpu-test-bap10 kernel: ---- Mar 03 11:27:33 icpu-test-bap10 kernel: lock(&p->pi_lock); Mar 03 11:27:33 icpu-test-bap10 kernel: <Interrupt> Mar 03 11:27:33 icpu-test-bap10 kernel: lock(&p->pi_lock); Mar 03 11:27:33 icpu-test-bap10 kernel: *** DEADLOCK *** Mar 03 11:27:33 icpu-test-bap10 kernel: 2 locks held by apache2-ssl/9310: Mar 03 11:27:33 icpu-test-bap10 kernel: #0: (&(&(__futex_data.queues)[i].lock)->rlock){+.+...}, at: [<ffffffff810ae4e6>] do Mar 03 11:27:33 icpu-test-bap10 kernel: #1: (&lock->wait_lock){+.+...}, at: [<ffffffff810ae53a>] do_futex+0x639/0x809 Mar 03 11:27:33 icpu-test-bap10 kernel: stack backtrace: Mar 03 11:27:33 icpu-test-bap10 kernel: CPU: 13 PID: 9310 UID: 99 Comm: apache2-ssl Not tainted 4.4.259-rc1-grsec+ #730 Mar 03 11:27:33 icpu-test-bap10 kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.11.0 11/02/2019 Mar 03 11:27:33 icpu-test-bap10 kernel: 0000000000000000 ffff883fb79bfc00 ffffffff816f8fc2 ffff883ffa66d300 Mar 03 11:27:33 icpu-test-bap10 kernel: ffffffff8eaa71f0 ffff883fb79bfc50 ffffffff81088484 0000000000000000 Mar 03 11:27:33 icpu-test-bap10 kernel: 0000000000000001 0000000000000001 0000000000000002 ffff883ffa66db58 Mar 03 11:27:33 icpu-test-bap10 kernel: Call Trace: Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff816f8fc2>] dump_stack+0x94/0xca Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81088484>] print_usage_bug+0x1bc/0x1d1 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81087d76>] ? check_usage_forwards+0x98/0x98 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810885a5>] mark_lock+0x10c/0x203 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81088cb9>] __lock_acquire+0x416/0xe4a Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810abb68>] ? pi_state_update_owner+0x51/0xd7 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81089b01>] lock_acquire+0x18d/0x1bc Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81089b01>] ? lock_acquire+0x18d/0x1bc Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810abb68>] ? pi_state_update_owner+0x51/0xd7 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81700d12>] _raw_spin_lock+0x2a/0x39 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810abb68>] ? pi_state_update_owner+0x51/0xd7 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810abb68>] pi_state_update_owner+0x51/0xd7 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810ae5af>] do_futex+0x6ae/0x809 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff810ae83d>] SyS_futex+0x133/0x143 Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff8100158a>] ? syscall_trace_enter_phase2+0x1a2/0x1bb Mar 03 11:27:33 icpu-test-bap10 kernel: [<ffffffff81701848>] tracesys_phase2+0x90/0x95 Bisecting detects 47e452fcf2f in the above specific scenario using apache-ssl, but apparently the missing *_irq() was introduced in 34c8e1c2c02. However, just reverting the old _irq() variants to a similar status than before 34c8e1c2c02, or using _irqsave() / _irqrestore() as some other backports are doing in various places, would not really help. The fundamental problem is the following violation of the assertion lockdep_assert_held(&pi_state->pi_mutex.wait_lock) in pi_state_update_owner(): Mar 03 12:50:03 icpu-test-bap10 kernel: ------------[ cut here ]------------ Mar 03 12:50:03 icpu-test-bap10 kernel: WARNING: CPU: 37 PID: 8488 at kernel/futex.c:844 pi_state_update_owner+0x3d/0xd7() Mar 03 12:50:03 icpu-test-bap10 kernel: Modules linked in: xt_time xt_connlimit xt_connmark xt_NFLOG xt_limit xt_hashlimit veth ip_set_bitmap_port xt_DSCP xt_multiport ip_set_hash_ip xt_owner xt_set ip_set_hash_net xt_state xt_conntrack nf_conntrack_ftp mars lz4_decompress lz4_compress ipmi_devintf x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul hed ipmi_si ipmi_msghandler processor crc32c_intel ehci_pci ehci_hcd usbcore i40e usb_common Mar 03 12:50:03 icpu-test-bap10 kernel: CPU: 37 PID: 8488 UID: 99 Comm: apache2-ssl Not tainted 4.4.259-rc1-grsec+ #737 Mar 03 12:50:03 icpu-test-bap10 kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.11.0 11/02/2019 Mar 03 12:50:03 icpu-test-bap10 kernel: 0000000000000000 ffff883f863f7c70 ffffffff816f9002 0000000000000000 Mar 03 12:50:03 icpu-test-bap10 kernel: 0000000000000009 ffff883f863f7ca8 ffffffff8104cda2 ffffffff810abac7 Mar 03 12:50:03 icpu-test-bap10 kernel: ffff883ffbfe5e80 0000000000000000 ffff883f82ed4bc0 00007fc01c9bf000 Mar 03 12:50:03 icpu-test-bap10 kernel: Call Trace: Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff816f9002>] dump_stack+0x94/0xca Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff8104cda2>] warn_slowpath_common+0x94/0xad Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810abac7>] ? pi_state_update_owner+0x3d/0xd7 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff8104ce5f>] warn_slowpath_null+0x15/0x17 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810abac7>] pi_state_update_owner+0x3d/0xd7 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810abea8>] free_pi_state+0x2d/0x73 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810abf0b>] unqueue_me_pi+0x1d/0x31 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810ad735>] futex_lock_pi+0x27a/0x2e8 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff81088bca>] ? __lock_acquire+0x327/0xe4a Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810ae6a9>] do_futex+0x784/0x809 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810cfa9a>] ? seccomp_phase1+0xde/0x1e7 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810a4503>] ? current_kernel_time64+0xb/0x31 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810d23c3>] ? current_kernel_time+0xb/0xf Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff810ae861>] SyS_futex+0x133/0x143 Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff8100158a>] ? syscall_trace_enter_phase2+0x1a2/0x1bb Mar 03 12:50:03 icpu-test-bap10 kernel: [<ffffffff81701888>] tracesys_phase2+0x90/0x95 Mar 03 12:50:03 icpu-test-bap10 kernel: ---[ end trace 968f95a458dea951 ]--- In order to both (1) prevent the self-deadlock, and (2) to satisfy the assertion at pi_state_update_owner(), some locking with irq disable is needed, at least in the specific call stack. Interestingly, there existed a suchalike locking just before f08a4af5ccb. This is just a quick hotfix, resurrecting some previous locks at the old places, but now using ->wait_lock in place of the previous ->pi_lock (which was in place before f08a4af5ccb). The ->pi_lock is now also taken, by the new code which had been introduced in 34c8e1c2c02. When this patch is applied, both the above splats are no longer triggering at my prelife machines. Without this patch, I cannot ensure stable production at 1&1 Ionos. Hint for further work: I have not yet tested other call paths, since I am under time pressure for security reasons. Hint for further hardening of 4.4.y and probably some more LTS series: Probably some more systematic testing with CONFIG_PROVE_LOCKING (and probably some more options) should be invested in order to make the 4.4 LTS series really "stable" again. Signed-off-by: Thomas Schoebel-Theuer <tst@1und1.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Lee Jones <lee.jones@linaro.org> Fixes: f08a4af5ccb2 ("futex: Use pi_state_update_owner() in put_pi_state()") Fixes: 34c8e1c2c025 ("futex: Provide and use pi_state_update_owner()") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Ensure the correct return value from futex_lock_pi()Thomas Gleixner2021-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 12bb3f7f1b03d5913b3f9d4236a488aa7774dfe9 upstream. In case that futex_lock_pi() was aborted by a signal or a timeout and the task returned without acquiring the rtmutex, but is the designated owner of the futex due to a concurrent futex_unlock_pi() fixup_owner() is invoked to establish consistent state. In that case it invokes fixup_pi_state_owner() which in turn tries to acquire the rtmutex again. If that succeeds then it does not propagate this success to fixup_owner() and futex_lock_pi() returns -EINTR or -ETIMEOUT despite having the futex locked. Return success from fixup_pi_state_owner() in all cases where the current task owns the rtmutex and therefore the futex and propagate it correctly through fixup_owner(). Fixup the other callsite which does not expect a positive return value. Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> [Sharan: Backported patch for kernel 4.4.y. Also folded in is a part of the cleanup patch d7c5ed73b19c("futex: Remove needless goto's")] Signed-off-by: Sharan Turlapati <sturlapati@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Fix OWNER_DEAD fixupPeter Zijlstra2021-03-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a97cb0e7b3f4c6297fd857055ae8e895f402f501 upstream. Both Geert and DaveJ reported that the recent futex commit: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") introduced a problem with setting OWNER_DEAD. We set the bit on an uninitialized variable and then entirely optimize it away as a dead-store. Move the setting of the bit to where it is more useful. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") Link: http://lkml.kernel.org/r/20180122103947.GD2228@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Reviewed-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge branch 'android-4.4-p' of ↵Michael Bestas2021-02-28
|\| | | | | | | | | | | | | | | | | | | https://android.googlesource.com/kernel/common into lineage-18.1-caf-msm8998 This brings LA.UM.9.2.r1-02500-SDMxx0.0 up to date with https://android.googlesource.com/kernel/common/ android-4.4-p at commit: 4fd124d1546d8 Merge 4.4.258 into android-4.4-p Change-Id: Idbae7489bc1d831a378dd60993f46139e5e28c4c
| * futex: Handle faults correctly for PI futexesLee Jones2021-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Thomas Gleixner <tglx@linutronix.de> fixup_pi_state_owner() tries to ensure that the state of the rtmutex, pi_state and the user space value related to the PI futex are consistent before returning to user space. In case that the user space value update faults and the fault cannot be resolved by faulting the page in via fault_in_user_writeable() the function returns with -EFAULT and leaves the rtmutex and pi_state owner state inconsistent. A subsequent futex_unlock_pi() operates on the inconsistent pi_state and releases the rtmutex despite not owning it which can corrupt the RB tree of the rtmutex and cause a subsequent kernel stack use after free. It was suggested to loop forever in fixup_pi_state_owner() if the fault cannot be resolved, but that results in runaway tasks which is especially undesired when the problem happens due to a programming error and not due to malice. As the user space value cannot be fixed up, the proper solution is to make the rtmutex and the pi_state consistent so both have the same owner. This leaves the user space value out of sync. Any subsequent operation on the futex will fail because the 10th rule of PI futexes (pi_state owner and user space value are consistent) has been violated. As a consequence this removes the inept attempts of 'fixing' the situation in case that the current task owns the rtmutex when returning with an unresolvable fault by unlocking the rtmutex which left pi_state::owner and rtmutex::owner out of sync in a different and only slightly less dangerous way. Fixes: 1b7558e457ed ("futexes: fix fault handling in futex_lock_pi") Reported-by: gzobqq@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Simplify fixup_pi_state_owner()Lee Jones2021-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Thomas Gleixner <tglx@linutronix.de> [ Upstream commit f2dac39d93987f7de1e20b3988c8685523247ae2 ] Too many gotos already and an upcoming fix would make it even more unreadable. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Use pi_state_update_owner() in put_pi_state()Lee Jones2021-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Thomas Gleixner <tglx@linutronix.de> [ Upstream commit 6ccc84f917d33312eb2846bd7b567639f585ad6d ] No point in open coding it. This way it gains the extra sanity checks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * rtmutex: Remove unused argument from rt_mutex_proxy_unlock()Lee Jones2021-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Thomas Gleixner <tglx@linutronix.de> [ Upstream commit 2156ac1934166d6deb6cd0f6ffc4c1076ec63697 ] Nothing uses the argument. Remove it as preparation to use pi_state_update_owner(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Provide and use pi_state_update_owner()Lee Jones2021-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Thomas Gleixner <tglx@linutronix.de> [ Upstream commit c5cade200ab9a2a3be9e7f32a752c8d86b502ec7 ] Updating pi_state::owner is done at several places with the same code. Provide a function for it and use that at the obvious places. This is also a preparation for a bug fix to avoid yet another copy of the same code or alternatively introducing a completely unpenetratable mess of gotos. Originally-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Replace pointless printk in fixup_owner()Lee Jones2021-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Thomas Gleixner <tglx@linutronix.de> [ Upstream commit 04b79c55201f02ffd675e1231d731365e335c307 ] If that unexpected case of inconsistent arguments ever happens then the futex state is left completely inconsistent and the printk is not really helpful. Replace it with a warning and make the state consistent. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Avoid violating the 10th rule of futexLee Jones2021-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Peter Zijlstra <peterz@infradead.org> commit c1e2f0eaf015fb7076d51a339011f2383e6dd389 upstream. Julia reported futex state corruption in the following scenario: waiter waker stealer (prio > waiter) futex(WAIT_REQUEUE_PI, uaddr, uaddr2, timeout=[N ms]) futex_wait_requeue_pi() futex_wait_queue_me() freezable_schedule() <scheduled out> futex(LOCK_PI, uaddr2) futex(CMP_REQUEUE_PI, uaddr, uaddr2, 1, 0) /* requeues waiter to uaddr2 */ futex(UNLOCK_PI, uaddr2) wake_futex_pi() cmp_futex_value_locked(uaddr2, waiter) wake_up_q() <woken by waker> <hrtimer_wakeup() fires, clears sleeper->task> futex(LOCK_PI, uaddr2) __rt_mutex_start_proxy_lock() try_to_take_rt_mutex() /* steals lock */ rt_mutex_set_owner(lock, stealer) <preempted> <scheduled in> rt_mutex_wait_proxy_lock() __rt_mutex_slowlock() try_to_take_rt_mutex() /* fails, lock held by stealer */ if (timeout && !timeout->task) return -ETIMEDOUT; fixup_owner() /* lock wasn't acquired, so, fixup_pi_state_owner skipped */ return -ETIMEDOUT; /* At this point, we've returned -ETIMEDOUT to userspace, but the * futex word shows waiter to be the owner, and the pi_mutex has * stealer as the owner */ futex_lock(LOCK_PI, uaddr2) -> bails with EDEADLK, futex word says we're owner. And suggested that what commit: 73d786bd043e ("futex: Rework inconsistent rt_mutex/futex_q state") removes from fixup_owner() looks to be just what is needed. And indeed it is -- I completely missed that requeue_pi could also result in this case. So we need to restore that, except that subsequent patches, like commit: 16ffa12d7425 ("futex: Pull rt_mutex_futex_unlock() out from under hb->lock") changed all the locking rules. Even without that, the sequence: - if (rt_mutex_futex_trylock(&q->pi_state->pi_mutex)) { - locked = 1; - goto out; - } - raw_spin_lock_irq(&q->pi_state->pi_mutex.wait_lock); - owner = rt_mutex_owner(&q->pi_state->pi_mutex); - if (!owner) - owner = rt_mutex_next_owner(&q->pi_state->pi_mutex); - raw_spin_unlock_irq(&q->pi_state->pi_mutex.wait_lock); - ret = fixup_pi_state_owner(uaddr, q, owner); already suggests there were races; otherwise we'd never have to look at next_owner. So instead of doing 3 consecutive wait_lock sections with who knows what races, we do it all in a single section. Additionally, the usage of pi_state->owner in fixup_owner() was only safe because only the rt_mutex owner would modify it, which this additional case wrecks. Luckily the values can only change away and not to the value we're testing, this means we can do a speculative test and double check once we have the wait_lock. Fixes: 73d786bd043e ("futex: Rework inconsistent rt_mutex/futex_q state") Reported-by: Julia Cartwright <julia@ni.com> Reported-by: Gratian Crisan <gratian.crisan@ni.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Julia Cartwright <julia@ni.com> Tested-by: Gratian Crisan <gratian.crisan@ni.com> Cc: Darren Hart <dvhart@infradead.org> Link: https://lkml.kernel.org/r/20171208124939.7livp7no2ov65rrc@hirez.programming.kicks-ass.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [Lee: Back-ported to solve a dependency] Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Rework inconsistent rt_mutex/futex_q stateLee Jones2021-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Peter Zijlstra <peterz@infradead.org> [Upstream commit 73d786bd043ebc855f349c81ea805f6b11cbf2aa ] There is a weird state in the futex_unlock_pi() path when it interleaves with a concurrent futex_lock_pi() at the point where it drops hb->lock. In this case, it can happen that the rt_mutex wait_list and the futex_q disagree on pending waiters, in particular rt_mutex will find no pending waiters where futex_q thinks there are. In this case the rt_mutex unlock code cannot assign an owner. The futex side fixup code has to cleanup the inconsistencies with quite a bunch of interesting corner cases. Simplify all this by changing wake_futex_pi() to return -EAGAIN when this situation occurs. This then gives the futex_lock_pi() code the opportunity to continue and the retried futex_unlock_pi() will now observe a coherent state. The only problem is that this breaks RT timeliness guarantees. That is, consider the following scenario: T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1) CPU0 T1 lock_pi() queue_me() <- Waiter is visible preemption T2 unlock_pi() loops with -EAGAIN forever Which is undesirable for PI primitives. Future patches will rectify this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.850383690@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [Lee: Back-ported to solve a dependency] Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex,rt_mutex: Provide futex specific rt_mutex APILee Jones2021-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Peter Zijlstra <peterz@infradead.org> [ Upstream commit 5293c2efda37775346885c7e924d4ef7018ea60b ] Part of what makes futex_unlock_pi() intricate is that rt_mutex_futex_unlock() -> rt_mutex_slowunlock() can drop rt_mutex::wait_lock. This means it cannot rely on the atomicy of wait_lock, which would be preferred in order to not rely on hb->lock so much. The reason rt_mutex_slowunlock() needs to drop wait_lock is because it can race with the rt_mutex fastpath, however futexes have their own fast path. Since futexes already have a bunch of separate rt_mutex accessors, complete that set and implement a rt_mutex variant without fastpath for them. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.702962446@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [Lee: Back-ported to solve a dependency] Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge branch 'android-4.4-p' of ↵Michael Bestas2021-02-07
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://android.googlesource.com/kernel/common into lineage-18.1-caf-msm8998 This brings LA.UM.9.2.r1-02000-SDMxx0.0 up to date with https://android.googlesource.com/kernel/common/ android-4.4-p at commit: 0566f6529a7b8 Merge 4.4.255 into android-4.4-p Conflicts: drivers/scsi/ufs/ufshcd.c drivers/usb/gadget/function/f_accessory.c drivers/usb/gadget/function/f_uac2.c net/core/skbuff.c Change-Id: I327c7f3793e872609f33f2a8e70eba7b580d70f3
| * futex: Prevent exit livelockThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3ef240eaff36b8119ac9e2ea17cbf41179c930ba upstream. Oleg provided the following test case: int main(void) { struct sched_param sp = {}; sp.sched_priority = 2; assert(sched_setscheduler(0, SCHED_FIFO, &sp) == 0); int lock = vfork(); if (!lock) { sp.sched_priority = 1; assert(sched_setscheduler(0, SCHED_FIFO, &sp) == 0); _exit(0); } syscall(__NR_futex, &lock, FUTEX_LOCK_PI, 0,0,0); return 0; } This creates an unkillable RT process spinning in futex_lock_pi() on a UP machine or if the process is affine to a single CPU. The reason is: parent child set FIFO prio 2 vfork() -> set FIFO prio 1 implies wait_for_child() sched_setscheduler(...) exit() do_exit() .... mm_release() tsk->futex_state = FUTEX_STATE_EXITING; exit_futex(); (NOOP in this case) complete() --> wakes parent sys_futex() loop infinite because tsk->futex_state == FUTEX_STATE_EXITING The same problem can happen just by regular preemption as well: task holds futex ... do_exit() tsk->futex_state = FUTEX_STATE_EXITING; --> preemption (unrelated wakeup of some other higher prio task, e.g. timer) switch_to(other_task) return to user sys_futex() loop infinite as above Just for the fun of it the futex exit cleanup could trigger the wakeup itself before the task sets its futex state to DEAD. To cure this, the handling of the exiting owner is changed so: - A refcount is held on the task - The task pointer is stored in a caller visible location - The caller drops all locks (hash bucket, mmap_sem) and blocks on task::futex_exit_mutex. When the mutex is acquired then the exiting task has completed the cleanup and the state is consistent and can be reevaluated. This is not a pretty solution, but there is no choice other than returning an error code to user space, which would break the state consistency guarantee and open another can of problems including regressions. For stable backports the preparatory commits ac31c7ff8624 .. ba31c1a48538 are required as well, but for anything older than 5.3.y the backports are going to be provided when this hits mainline as the other dependencies for those kernels are definitely not stable material. Fixes: 778e9a9c3e71 ("pi-futex: fix exit races and locking problems") Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Stable Team <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20191106224557.041676471@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Provide distinct return value when owner is exitingThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ac31c7ff8624409ba3c4901df9237a616c187a5d upstream. attach_to_pi_owner() returns -EAGAIN for various cases: - Owner task is exiting - Futex value has changed The caller drops the held locks (hash bucket, mmap_sem) and retries the operation. In case of the owner task exiting this can result in a live lock. As a preparatory step for seperating those cases, provide a distinct return value (EBUSY) for the owner exiting case. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.935606117@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Add mutex around futex exitThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3f186d974826847a07bc7964d79ec4eded475ad9 upstream. The mutex will be used in subsequent changes to replace the busy looping of a waiter when the futex owner is currently executing the exit cleanup to prevent a potential live lock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.845798895@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Provide state handling for exec() as wellThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit af8cbda2cfcaa5515d61ec500498d46e9a8247e2 upstream. exec() attempts to handle potentially held futexes gracefully by running the futex exit handling code like exit() does. The current implementation has no protection against concurrent incoming waiters. The reason is that the futex state cannot be set to FUTEX_STATE_DEAD after the cleanup because the task struct is still active and just about to execute the new binary. While its arguably buggy when a task holds a futex over exec(), for consistency sake the state handling can at least cover the actual futex exit cleanup section. This provides state consistency protection accross the cleanup. As the futex state of the task becomes FUTEX_STATE_OK after the cleanup has been finished, this cannot prevent subsequent attempts to attach to the task in case that the cleanup was not successfull in mopping up all leftovers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.753355618@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Sanitize exit state handlingThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4a8e991b91aca9e20705d434677ac013974e0e30 upstream. Instead of having a smp_mb() and an empty lock/unlock of task::pi_lock move the state setting into to the lock section. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.645603214@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Mark the begin of futex exit explicitlyThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 18f694385c4fd77a09851fd301236746ca83f3cb upstream. Instead of relying on PF_EXITING use an explicit state for the futex exit and set it in the futex exit function. This moves the smp barrier and the lock/unlock serialization into the futex code. As with the DEAD state this is restricted to the exit path as exec continues to use the same task struct. This allows to simplify that logic in a next step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.539409004@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Set task::futex_state to DEAD right after handling futex exitThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit f24f22435dcc11389acc87e5586239c1819d217c upstream. Setting task::futex_state in do_exit() is rather arbitrarily placed for no reason. Move it into the futex code. Note, this is only done for the exit cleanup as the exec cleanup cannot set the state to FUTEX_STATE_DEAD because the task struct is still in active use. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.439511191@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Split futex_mm_release() for exit/execThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 150d71584b12809144b8145b817e83b81158ae5f upstream. To allow separate handling of the futex exit state in the futex exit code for exit and exec, split futex_mm_release() into two functions and invoke them from the corresponding exit/exec_mm_release() callsites. Preparatory only, no functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.332094221@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Replace PF_EXITPIDONE with a stateThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3d4775df0a89240f671861c6ab6e8d59af8e9e41 upstream. The futex exit handling relies on PF_ flags. That's suboptimal as it requires a smp_mb() and an ugly lock/unlock of the exiting tasks pi_lock in the middle of do_exit() to enforce the observability of PF_EXITING in the futex code. Add a futex_state member to task_struct and convert the PF_EXITPIDONE logic over to the new state. The PF_EXITING dependency will be cleaned up in a later step. This prepares for handling various futex exit issues later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.149449274@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Move futex exit handling into futex codeThomas Gleixner2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ba31c1a48538992316cc71ce94fa9cd3e7b427c0 upstream. The futex exit handling is #ifdeffed into mm_release() which is not pretty to begin with. But upcoming changes to address futex exit races need to add more functionality to this exit code. Split it out into a function, move it into futex code and make the various futex exit functions static. Preparatory only and no functional change. Folded build fix from Borislav. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.049705556@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * y2038: futex: Move compat implementation into futex.cArnd Bergmann2021-02-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 04e7712f4460585e5eed5b853fd8b82a9943958f upstream. We are going to share the compat_sys_futex() handler between 64-bit architectures and 32-bit architectures that need to deal with both 32-bit and 64-bit time_t, and this is easier if both entry points are in the same file. In fact, most other system call handlers do the same thing these days, so let's follow the trend here and merge all of futex_compat.c into futex.c. In the process, a few minor changes have to be done to make sure everything still makes sense: handle_futex_death() and futex_cmpxchg_enabled() become local symbol, and the compat version of the fetch_robust_entry() function gets renamed to compat_fetch_robust_entry() to avoid a symbol clash. This is intended as a purely cosmetic patch, no behavior should change. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [Lee: Back-ported to satisfy a build dependency] Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge branch 'android-4.4-p' of ↵Michael Bestas2020-05-14
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://android.googlesource.com/kernel/common into lineage-17.1-caf-msm8998 This brings LA.UM.8.4.r1-05400-8x98.0 up to date with https://android.googlesource.com/kernel/common/ android-4.4-p at commit: 96b09cba55905 UPSTREAM: net: socket: set sock->sk to NULL after calling proto_ops::release() Conflicts: drivers/scsi/ufs/ufshcd.c drivers/usb/gadget/composite.c drivers/usb/gadget/function/f_fs.c Change-Id: I3e79c0d20e3eb3246a50c9a1e815cdf030a4232e
| * futex: futex_wake_op, do not fail on invalid opJiri Slaby2020-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e78c38f6bdd900b2ad9ac9df8eff58b745dc5b3c upstream. In commit 30d6e0a4190d ("futex: Remove duplicated code and fix undefined behaviour"), I let FUTEX_WAKE_OP to fail on invalid op. Namely when op should be considered as shift and the shift is out of range (< 0 or > 31). But strace's test suite does this madness: futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee); futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xbadfaced); futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xffffffff); When I pick the first 0xa0caffee, it decodes as: 0x80000000 & 0xa0caffee: oparg is shift 0x70000000 & 0xa0caffee: op is FUTEX_OP_OR 0x0f000000 & 0xa0caffee: cmp is FUTEX_OP_CMP_EQ 0x00fff000 & 0xa0caffee: oparg is sign-extended 0xcaf = -849 0x00000fff & 0xa0caffee: cmparg is sign-extended 0xfee = -18 That means the op tries to do this: (futex |= (1 << (-849))) == -18 which is completely bogus. The new check of op in the code is: if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28)) { if (oparg < 0 || oparg > 31) return -EINVAL; oparg = 1 << oparg; } which results obviously in the "Invalid argument" errno: FAIL: futex =========== futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee) = -1: Invalid argument futex.test: failed test: ../futex failed with code 1 So let us soften the failure to print only a (ratelimited) message, crop the value and continue as if it were right. When userspace keeps up, we can switch this to return -EINVAL again. [v2] Do not return 0 immediatelly, proceed with the cropped value. Fixes: 30d6e0a4190d ("futex: Remove duplicated code and fix undefined behaviour") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <dvhart@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge branch 'android-4.4-p' of ↵Michael Bestas2020-04-14
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://android.googlesource.com/kernel/common into lineage-17.1-caf-msm8998 This brings LA.UM.8.4.r1-05300-8x98.0 up to date with https://android.googlesource.com/kernel/common/ android-4.4-p at commit: f9991115f0793 Merge 4.4.219 into android-4.4-p Conflicts: drivers/clk/qcom/clk-rcg2.c drivers/scsi/sd.c drivers/usb/gadget/function/f_fs.c drivers/usb/gadget/function/u_serial.c Change-Id: Ifed3db0ddda828c1697e57e9f73c1b73354bebf7
| * futex: Unbreak futex hashingThomas Gleixner2020-04-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 8d67743653dce5a0e7aa500fcccb237cde7ad88e upstream. The recent futex inode life time fix changed the ordering of the futex key union struct members, but forgot to adjust the hash function accordingly, As a result the hashing omits the leading 64bit and even hashes beyond the futex key causing a bad hash distribution which led to a ~100% performance regression. Hand in the futex key pointer instead of a random struct member and make the size calculation based of the struct offset. Fixes: 8019ad13ef7f ("futex: Fix inode life-time issue") Reported-by: Rong Chen <rong.a.chen@intel.com> Decoded-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Rong Chen <rong.a.chen@intel.com> Link: https://lkml.kernel.org/r/87h7yy90ve.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * futex: Fix inode life-time issuePeter Zijlstra2020-04-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 8019ad13ef7f64be44d4f892af9c840179009254 upstream. As reported by Jann, ihold() does not in fact guarantee inode persistence. And instead of making it so, replace the usage of inode pointers with a per boot, machine wide, unique inode identifier. This sequence number is global, but shared (file backed) futexes are rare enough that this should not become a performance issue. Reported-by: Jann Horn <jannh@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge android-4.4.183 (94fd428) into msm-4.4Srinivasarao P2019-06-24
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * refs/heads/tmp-94fd428 Linux 4.4.183 Abort file_remove_privs() for non-reg. files coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping Revert "crypto: crypto4xx - properly set IV after de- and encrypt" scsi: libsas: delete sas port if expander discover failed scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route() net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list ia64: fix build errors by exporting paddr_to_nid() configfs: Fix use-after-free when accessing sd->s_dentry i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE() gpio: fix gpio-adp5588 build errors perf/ring_buffer: Add ordering to rb->nest increment perf/ring_buffer: Fix exposing a temporarily decreased data_head x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor mISDN: make sure device name is NUL terminated sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg neigh: fix use-after-free read in pneigh_get_next lapb: fixed leak of control-blocks. ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero be2net: Fix number of Rx queues used for flow hashing ax25: fix inconsistent lock state in ax25_destroy_timer USB: serial: option: add Telit 0x1260 and 0x1261 compositions USB: serial: option: add support for Simcom SIM7500/SIM7600 RNDIS mode USB: serial: pl2303: add Allied Telesis VT-Kit3 USB: usb-storage: Add new ID to ums-realtek USB: Fix chipmunk-like voice when using Logitech C270 for recording audio. drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define() drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an invalid read KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION KVM: x86/pmu: do not mask the value that is written to fixed PMUs usbnet: ipheth: fix racing condition scsi: bnx2fc: fix incorrect cast to u64 on shift operation scsi: lpfc: add check for loss of ndlp when sending RRQ Drivers: misc: fix out-of-bounds access in function param_set_kgdbts_var ASoC: cs42xx8: Add regcache mask dirty cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css() bcache: fix stack corruption by PRECEDING_KEY() i2c: acorn: fix i2c warning ptrace: restore smp_rmb() in __ptrace_may_access() signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO fs/ocfs2: fix race in ocfs2_dentry_attach_lock() mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk ALSA: seq: Cover unsubscribe_port() in list_mutex Revert "Bluetooth: Align minimum encryption key size for LE and BR/EDR connections" futex: Fix futex lock the wrong page ARM: exynos: Fix undefined instruction during Exynos5422 resume pwm: Fix deadlock warning when removing PWM device ARM: dts: exynos: Always enable necessary APIO_1V8 and ABB_1V8 regulators on Arndale Octa pwm: tiehrpwm: Update shadow register for disabling PWMs dmaengine: idma64: Use actual device for DMA transfers gpio: gpio-omap: add check for off wake capable gpios PCI: xilinx: Check for __get_free_pages() failure video: imsttfb: fix potential NULL pointer dereferences video: hgafb: fix potential NULL pointer dereference PCI: rcar: Fix a potential NULL pointer dereference PCI: rpadlpar: Fix leaked device_node references in add/remove paths ARM: dts: imx6qdl: Specify IMX6QDL_CLK_IPG as "ipg" clock to SDMA ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ipg" clock to SDMA ARM: dts: imx6sx: Specify IMX6SX_CLK_IPG as "ahb" clock to SDMA clk: rockchip: Turn on "aclk_dmac1" for suspend on rk3288 soc: mediatek: pwrap: Zero initialize rdata in pwrap_init_cipher platform/chrome: cros_ec_proto: check for NULL transfer function x86/PCI: Fix PCI IRQ routing table memory leak nfsd: allow fh_want_write to be called twice fuse: retrieve: cap requested size to negotiated max_write nvmem: core: fix read buffer in place ALSA: hda - Register irq handler after the chip initialization iommu/vt-d: Set intel_iommu_gfx_mapped correctly f2fs: fix to do sanity check on valid block count of segment f2fs: fix to avoid panic in do_recover_data() ntp: Allow TAI-UTC offset to be set to zero drm/bridge: adv7511: Fix low refresh rate selection perf/x86/intel: Allow PEBS multi-entry in watermark mode mfd: twl6040: Fix device init errors for ACCCTL register mfd: intel-lpss: Set the device in reset state when init kernel/sys.c: prctl: fix false positive in validate_prctl_map() mm/cma_debug.c: fix the break condition in cma_maxchunk_get() mm/cma.c: fix crash on CMA allocation if bitmap allocation fails hugetlbfs: on restore reserve error path retain subpool reservation ipc: prevent lockup on alloc_msg and free_msg sysctl: return -EINVAL if val violates minmax fs/fat/file.c: issue flush after the writeback of FAT ANDROID: kernel: cgroup: cpuset: Clear cpus_requested for empty buf ANDROID: kernel: cgroup: cpuset: Add missing allocation of cpus_requested in alloc_trial_cpuset Change-Id: I5b33449bd21ec21d91b1030d53df3658a305bded Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
| * futex: Fix futex lock the wrong pageZhangXiaoxu2019-06-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The upstram commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in get_futex_key()") use variable 'page' as the page head, when merge it to stable branch, the variable `page_head` is page head. In the stable branch, the variable `page` not means the page head, when lock the page head, we should lock 'page_head', rather than 'page'. It maybe lead a hung task problem. Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge android-4.4.178 (7af10f2) into msm-4.4Srinivasarao P2019-04-05
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * refs/heads/tmp-7af10f2 Linux 4.4.178 stm class: Hide STM-specific options if STM is disabled coresight: removing bind/unbind options from sysfs arm64: support keyctl() system call in 32-bit mode Revert "USB: core: only clean up what we allocated" xhci: Fix port resume done detection for SS ports with LPM enabled KVM: Reject device ioctls from processes other than the VM's creator x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y perf intel-pt: Fix TSC slip gpio: adnp: Fix testing wrong value in adnp_gpio_direction_input fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links Disable kgdboc failed by echo space to /sys/module/kgdboc/parameters/kgdboc USB: serial: option: add Olicard 600 USB: serial: option: set driver_info for SIM5218 and compatibles USB: serial: mos7720: fix mos_parport refcount imbalance on error path USB: serial: ftdi_sio: add additional NovaTech products USB: serial: cp210x: add new device id serial: sh-sci: Fix setting SCSCR_TIE while transferring data serial: max310x: Fix to avoid potential NULL pointer dereference staging: vt6655: Fix interrupt race condition on device start up. staging: vt6655: Remove vif check from vnt_interrupt tty: atmel_serial: fix a potential NULL pointer dereference scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices scsi: zfcp: fix rport unblock if deleted SCSI devices on Scsi_Host scsi: sd: Fix a race between closing an sd device and sd I/O ALSA: pcm: Don't suspend stream in unrecoverable PCM state ALSA: pcm: Fix possible OOB access in PCM oss plugins ALSA: seq: oss: Fix Spectre v1 vulnerability ALSA: rawmidi: Fix potential Spectre v1 vulnerability ALSA: compress: add support for 32bit calls in a 64bit kernel ARM: imx6q: cpuidle: fix bug that CPU might not wake up at expected time btrfs: raid56: properly unmap parity page in finish_parity_scrub() btrfs: remove WARN_ON in log_dir_items mac8390: Fix mmio access size probe sctp: get sctphdr by offset in sctp_compute_cksum vxlan: Don't call gro_cells_destroy() before device is unregistered tcp: do not use ipv6 header for ipv4 flow packets: Always register packet sk in the same order Add hlist_add_tail_rcu() (Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net) net: rose: fix a possible stack overflow net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S dccp: do not use ipv6 header for ipv4 flow stmmac: copy unicast mac address to MAC registers cfg80211: size various nl80211 messages correctly mmc: mmc: fix switch timeout issue caused by jiffies precision arm64: kconfig: drop CONFIG_RTC_LIB dependency video: fbdev: Set pixclock = 0 in goldfishfb cpu/hotplug: Handle unbalanced hotplug enable/disable usb: gadget: rndis: free response queue during REMOTE_NDIS_RESET_MSG usb: gadget: configfs: add mutex lock before unregister gadget ipv6: fix endianness error in icmpv6_err stm class: Fix stm device initialization order stm class: Do not leak the chrdev in error path PM / Hibernate: Call flush_icache_range() on pages restored in-place arm64: kernel: Include _AC definition in page.h perf/ring_buffer: Refuse to begin AUX transaction after rb->aux_mmap_count drops mac80211: fix "warning: ‘target_metric’ may be used uninitialized" arm64/kernel: fix incorrect EL0 check in inv_entry macro ARM: 8510/1: rework ARM_CPU_SUSPEND dependencies staging: goldfish: audio: fix compiliation on arm staging: ion: Set minimum carveout heap allocation order to PAGE_SHIFT staging: ashmem: Add missing include staging: ashmem: Avoid deadlock with mmap/shrink asm-generic: Fix local variable shadow in __set_fixmap_offset coresight: etm4x: Check every parameter used by dma_xx_coherent. coresight: "DEVICE_ATTR_RO" should defined as static. stm class: Fix a race in unlinking stm class: Fix unbalanced module/device refcounting stm class: Guard output assignment against concurrency stm class: Fix unlocking braino in the error path stm class: Support devices with multiple instances stm class: Prevent user-controllable allocations stm class: Fix link list locking stm class: Fix locking in unbinding policy path coresight: remove csdev's link from topology coresight: release reference taken by 'bus_find_device()' coresight: coresight_unregister() function cleanup coresight: fixing lockdep error writeback: initialize inode members that track writeback history Revert "mmc: block: don't use parameter prefix if built as module" net: diag: support v4mapped sockets in inet_diag_find_one_icsk() perf: Synchronously free aux pages in case of allocation failure arm64: hide __efistub_ aliases from kallsyms hid-sensor-hub.c: fix wrong do_div() usage vmstat: make vmstat_updater deferrable again and shut down on idle android: unconditionally remove callbacks in sync_fence_free() ARM: 8494/1: mm: Enable PXN when running non-LPAE kernel on LPAE processor ARM: 8458/1: bL_switcher: add GIC dependency efi: stub: define DISABLE_BRANCH_PROFILING for all architectures arm64: fix COMPAT_SHMLBA definition for large pages mmc: block: Allow more than 8 partitions per card sched/fair: Fix new task's load avg removed from source CPU in wake_up_new_task() Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt ath10k: avoid possible string overflow rtc: Fix overflow when converting time64_t to rtc_time USB: core: only clean up what we allocated lib/int_sqrt: optimize small argument serial: sprd: clear timeout interrupt only rather than all interrupts usb: renesas_usbhs: gadget: fix unused-but-set-variable warning arm64: traps: disable irq in die() Hang/soft lockup in d_invalidate with simultaneous calls serial: sprd: adjust TIMEOUT to a big value tcp/dccp: drop SYN packets if accept queue is full usb: gadget: Add the gserial port checking in gs_start_tx() usb: gadget: composite: fix dereference after null check coverify warning kbuild: setlocalversion: print error to STDERR extcon: usb-gpio: Don't miss event during suspend/resume mm/rmap: replace BUG_ON(anon_vma->degree) with VM_WARN_ON mmc: core: fix using wrong io voltage if mmc_select_hs200 fails arm64: mm: Add trace_irqflags annotations to do_debug_exception() usb: dwc3: gadget: Fix suspend/resume during device mode mmc: core: shut up "voltage-ranges unspecified" pr_info() mmc: sanitize 'bus width' in debug output mmc: make MAN_BKOPS_EN message a debug mmc: debugfs: Add a restriction to mmc debugfs clock setting mmc: pwrseq_simple: Make reset-gpios optional to match doc ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec ALSA: hda - Record the current power state before suspend/resume calls locking/lockdep: Add debug_locks check in __lock_downgrade() media: v4l2-ctrls.c/uvc: zero v4l2_event mmc: tmio_mmc_core: don't claim spurious interrupts ext4: brelse all indirect buffer in ext4_ind_remove_space() ext4: fix data corruption caused by unaligned direct AIO ext4: fix NULL pointer dereference while journal is aborted futex: Ensure that futex address is aligned in handle_futex_death() MIPS: Fix kernel crash for R6 in jump label branch function mips: loongson64: lemote-2f: Add IRQF_NO_SUSPEND to "cascade" irqaction. udf: Fix crash on IO error during truncate drm/vmwgfx: Don't double-free the mode stored in par->set_mode mmc: pxamci: fix enum type confusion ANDROID: drop CONFIG_INPUT_KEYCHORD from cuttlefish and ranchu UPSTREAM: virt_wifi: Remove REGULATORY_WIPHY_SELF_MANAGED UPSTREAM: net: socket: set sock->sk to NULL after calling proto_ops::release() f2fs: set pin_file under CAP_SYS_ADMIN f2fs: fix to avoid deadlock in f2fs_read_inline_dir() f2fs: fix to adapt small inline xattr space in __find_inline_xattr() f2fs: fix to do sanity check with inode.i_inline_xattr_size f2fs: give some messages for inline_xattr_size f2fs: don't trigger read IO for beyond EOF page f2fs: fix to add refcount once page is tagged PG_private f2fs: remove wrong comment in f2fs_invalidate_page() f2fs: fix to use kvfree instead of kzfree f2fs: print more parameters in trace_f2fs_map_blocks f2fs: trace f2fs_ioc_shutdown f2fs: fix to avoid deadlock of atomic file operations f2fs: fix to dirty inode for i_mode recovery f2fs: give random value to i_generation f2fs: no need to take page lock in readdir f2fs: fix to update iostat correctly in IPU path f2fs: fix encrypted page memory leak f2fs: make fault injection covering __submit_flush_wait() f2fs: fix to retry fill_super only if recovery failed f2fs: silence VM_WARN_ON_ONCE in mempool_alloc f2fs: correct spelling mistake f2fs: fix wrong #endif f2fs: don't clear CP_QUOTA_NEED_FSCK_FLAG f2fs: don't allow negative ->write_io_size_bits f2fs: fix to check inline_xattr_size boundary correctly Revert "f2fs: fix to avoid deadlock of atomic file operations" Revert "f2fs: fix to check inline_xattr_size boundary correctly" f2fs: do not use mutex lock in atomic context f2fs: fix potential data inconsistence of checkpoint f2fs: fix to avoid deadlock of atomic file operations f2fs: fix to check inline_xattr_size boundary correctly f2fs: jump to label 'free_node_inode' when failing from d_make_root() f2fs: fix to document inline_xattr_size option f2fs: fix to data block override node segment by mistake f2fs: fix typos in code comments f2fs: sync filesystem after roll-forward recovery fs: export evict_inodes f2fs: flush quota blocks after turnning it off f2fs: avoid null pointer exception in dcc_info f2fs: don't wake up too frequently, if there is lots of IOs f2fs: try to keep CP_TRIMMED_FLAG after successful umount f2fs: add quick mode of checkpoint=disable for QA f2fs: run discard jobs when put_super f2fs: fix to set sbi dirty correctly f2fs: UBSAN: set boolean value iostat_enable correctly f2fs: add brackets for macros f2fs: check if file namelen exceeds max value f2fs: fix to trigger fsck if dirent.name_len is zero f2fs: no need to check return value of debugfs_create functions f2fs: export FS_NOCOW_FL flag to user f2fs: check inject_rate validity during configuring f2fs: remove set but not used variable 'err' f2fs: fix compile warnings: 'struct *' declared inside parameter list f2fs: change error code to -ENOMEM from -EINVAL Conflicts: arch/arm/Kconfig arch/arm64/kernel/traps.c drivers/hwtracing/coresight/coresight-etm4x.c drivers/hwtracing/coresight/coresight-tmc.c drivers/hwtracing/stm/Kconfig drivers/hwtracing/stm/core.c drivers/mmc/core/mmc.c drivers/usb/gadget/function/u_serial.c kernel/events/ring_buffer.c net/wireless/nl80211.c sound/core/compress_offload.c Change-Id: I33783dbd0a25d678d6c61204f9e67690e57bed8f Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
| * futex: Ensure that futex address is aligned in handle_futex_death()Chen Jie2019-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 5a07168d8d89b00fe1760120714378175b3ef992 upstream. The futex code requires that the user space addresses of futexes are 32bit aligned. sys_futex() checks this in futex_get_keys() but the robust list code has no alignment check in place. As a consequence the kernel crashes on architectures with strict alignment requirements in handle_futex_death() when trying to cmpxchg() on an unaligned futex address which was retrieved from the robust list. [ tglx: Rewrote changelog, proper sizeof() based alignement check and add comment ] Fixes: 0771dfefc9e5 ("[PATCH] lightweight robust futexes: core") Signed-off-by: Chen Jie <chenjie6@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <dvhart@infradead.org> Cc: <peterz@infradead.org> Cc: <zengweilin@huawei.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1552621478-119787-1-git-send-email-chenjie6@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge android-4.4.177 (0c3b8c4) into msm-4.4Srinivasarao P2019-03-25
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * refs/heads/tmp-0c3b8c4 Linux 4.4.177 KVM: X86: Fix residual mmio emulation request to userspace KVM: nVMX: Ignore limit checks on VMX instructions using flat segments KVM: nVMX: Sign extend displacements of VMX instr's mem operands drm/radeon/evergreen_cs: fix missing break in switch statement media: uvcvideo: Avoid NULL pointer dereference at the end of streaming rcu: Do RCU GP kthread self-wakeup from softirq and interrupt PM / wakeup: Rework wakeup source timer cancellation nfsd: fix wrong check in write_v4_end_grace() nfsd: fix memory corruption caused by readdir NFS: Don't recoalesce on error in nfs_pageio_complete_mirror() NFS: Fix an I/O request leakage in nfs_do_recoalesce md: Fix failed allocation of md_register_thread perf intel-pt: Fix overlap calculation for padding perf auxtrace: Define auxtrace record alignment perf intel-pt: Fix CYC timestamp calculation after OVF NFS41: pop some layoutget errors to application dm: fix to_sector() for 32bit ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify powerpc/83xx: Also save/restore SPRG4-7 during suspend powerpc/powernv: Make opal log only readable by root powerpc/wii: properly disable use of BATs when requested. powerpc/32: Clear on-stack exception marker upon exception return jbd2: fix compile warning when using JBUFFER_TRACE jbd2: clear dirty flag when revoking a buffer from an older transaction serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup() serial: 8250_pci: Fix number of ports for ACCES serial cards perf bench: Copy kernel files needed to build mem{cpy,set} x86_64 benchmarks i2c: tegra: fix maximum transfer size parport_pc: fix find_superio io compare code, should use equal test. intel_th: Don't reference unassigned outputs kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv mm/vmalloc: fix size check for remap_vmalloc_range_partial() dmaengine: usb-dmac: Make DMAC system sleep callbacks explicit clk: ingenic: Fix round_rate misbehaving with non-integer dividers ext2: Fix underflow in ext2_max_size() ext4: fix crash during online resizing cpufreq: pxa2xx: remove incorrect __init annotation cpufreq: tegra124: add missing of_node_put() crypto: pcbc - remove bogus memcpy()s with src == dest Btrfs: fix corruption reading shared and compressed extents after hole punching btrfs: ensure that a DUP or RAID1 block group has exactly two stripes m68k: Add -ffreestanding to CFLAGS scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock scsi: virtio_scsi: don't send sc payload with tmfs s390/virtio: handle find on invalid queue gracefully clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR regulator: s2mpa01: Fix step values for some LDOs regulator: s2mps11: Fix steps for buck7, buck8 and LDO35 ACPI / device_sysfs: Avoid OF modalias creation for removed device tracing: Do not free iter->trace in fail path of tracing_open_pipe() CIFS: Fix read after write for files with read caching crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling stm class: Prevent division by zero tmpfs: fix uninitialized return value in shmem_link net: set static variable an initial value in atl2_probe() mac80211_hwsim: propagate genlmsg_reply return code phonet: fix building with clang ARC: uacces: remove lp_start, lp_end from clobber list tmpfs: fix link accounting when a tmpfile is linked in arm64: Relax GIC version check during early boot ASoC: topology: free created components in tplg load error net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe() pinctrl: meson: meson8b: fix the sdxc_a data 1..3 pins net: systemport: Fix reception of BPDUs scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task assoc_array: Fix shortcut creation ARM: 8824/1: fix a migrating irq bug when hotplug cpu Input: st-keyscan - fix potential zalloc NULL dereference i2c: cadence: Fix the hold bit setting Input: matrix_keypad - use flush_delayed_work() ARM: OMAP2+: Variable "reg" in function omap4_dsi_mux_pads() could be uninitialized s390/dasd: fix using offset into zero size array error gpu: ipu-v3: Fix CSI offsets for imx53 gpu: ipu-v3: Fix i.MX51 CSI control registers offset crypto: ahash - fix another early termination in hash walk crypto: caam - fixed handling of sg list stm class: Fix an endless loop in channel allocation ASoC: fsl_esai: fix register setting issue in RIGHT_J mode 9p/net: fix memory leak in p9_client_create 9p: use inode->i_lock to protect i_size_write() under 32-bit media: videobuf2-v4l2: drop WARN_ON in vb2_warn_zero_bytesused() It's wrong to add len to sector_nr in raid10 reshape twice fs/9p: use fscache mutex rather than spinlock ALSA: bebob: use more identical mod_alias for Saffire Pro 10 I/O against Liquid Saffire 56 tcp/dccp: remove reqsk_put() from inet_child_forget() gro_cells: make sure device is up in gro_cells_receive() net/hsr: fix possible crash in add_timer() vxlan: Fix GRO cells race condition between receive and link delete vxlan: test dev->flags & IFF_UP before calling gro_cells_receive() ipvlan: disallow userns cap_net_admin to change global mode/flags missing barriers in some of unix_sock ->addr and ->path accesses net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 mdio_bus: Fix use-after-free on device_register fails net/x25: fix a race in x25_bind() net/mlx4_core: Fix qp mtt size calculation net/mlx4_core: Fix reset flow when in command polling mode tcp: handle inet_csk_reqsk_queue_add() failures route: set the deleted fnhe fnhe_daddr to 0 in ip_del_fnhe to fix a race ravb: Decrease TxFIFO depth of Q3 and Q2 to one pptp: dst_release sk_dst_cache in pptp_sock_destruct net/x25: reset state in x25_connect() net/x25: fix use-after-free in x25_device_event() net: sit: fix UBSAN Undefined behaviour in check_6rd net: hsr: fix memory leak in hsr_dev_finalize() l2tp: fix infoleak in l2tp_ip6_recvmsg() KEYS: restrict /proc/keys by credentials at open time netfilter: nf_conntrack_tcp: Fix stack out of bounds when parsing TCP options netfilter: nfnetlink_acct: validate NFACCT_FILTER parameters netfilter: nfnetlink_log: just returns error for unknown command netfilter: x_tables: enforce nul-terminated table name from getsockopt GET_ENTRIES udplite: call proper backlog handlers ARM: dts: exynos: Do not ignore real-world fuse values for thermal zone 0 on Exynos5420 Revert "x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls" ARM: dts: exynos: Add minimal clkout parameters to Exynos3250 PMU futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock() iscsi_ibft: Fix missing break in switch statement Input: elan_i2c - add id for touchpad found in Lenovo s21e-20 Input: wacom_serial4 - add support for Wacom ArtPad II tablet MIPS: Remove function size check in get_frame_info() perf symbols: Filter out hidden symbols from labels s390/qeth: fix use-after-free in error path dmaengine: dmatest: Abort test in case of mapping error dmaengine: at_xdmac: Fix wrongfull report of a channel as in use irqchip/mmp: Only touch the PJ4 IRQ & FIQ bits on enable/disable ARM: pxa: ssp: unneeded to free devm_ allocated data autofs: fix error return in autofs_fill_super() autofs: drop dentry reference only when it is never used fs/drop_caches.c: avoid softlockups in drop_pagecache_sb() mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone x86_64: increase stack size for KASAN_EXTRA x86/kexec: Don't setup EFI info if EFI runtime is not enabled cifs: fix computation for MAX_SMB2_HDR_SIZE platform/x86: Fix unmet dependency warning for SAMSUNG_Q10 scsi: libfc: free skb when receiving invalid flogi resp nfs: Fix NULL pointer dereference of dev_name gpio: vf610: Mask all GPIO interrupts net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup() net: hns: Fix wrong read accesses via Clause 45 MDIO protocol net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case xtensa: SMP: limit number of possible CPUs by NR_CPUS xtensa: SMP: mark each possible CPU as present xtensa: smp_lx200_defconfig: fix vectors clash xtensa: SMP: fix secondary CPU initialization xtensa: SMP: fix ccount_timer_shutdown iommu/amd: Fix IOMMU page flush when detach device from a domain ipvs: Fix signed integer overflow when setsockopt timeout IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM perf tools: Handle TOPOLOGY headers with no CPU vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel media: uvcvideo: Fix 'type' check leading to overflow ip6mr: Do not call __IP6_INC_STATS() from preemptible context net: dsa: mv88e6xxx: Fix u64 statistics netlabel: fix out-of-bounds memory accesses hugetlbfs: fix races and page leaks during migration MIPS: irq: Allocate accurate order pages for irq stack applicom: Fix potential Spectre v1 vulnerabilities x86/CPU/AMD: Set the CPB bit unconditionally on F17h net: phy: Micrel KSZ8061: link failure after cable connect net: avoid use IPCB in cipso_v4_error net: Add __icmp_send helper. xen-netback: fix occasional leak of grant ref mappings under memory pressure net: nfc: Fix NULL dereference on nfc_llcp_build_tlv fails bnxt_en: Drop oversize TX packets to prevent errors. team: Free BPF filter when unregistering netdev sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79 net-sysfs: Fix mem leak in netdev_register_kobject staging: lustre: fix buffer overflow of string buffer isdn: isdn_tty: fix build warning of strncpy ncpfs: fix build warning of strncpy sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names cpufreq: Use struct kobj_attribute instead of struct global_attr USB: serial: ftdi_sio: add ID for Hjelmslund Electronics USB485 USB: serial: cp210x: add ID for Ingenico 3070 USB: serial: option: add Telit ME910 ECM composition x86/uaccess: Don't leak the AC flag into __put_user() value evaluation mm: enforce min addr even if capable() in expand_downwards() mmc: spi: Fix card detection during probe powerpc: Always initialize input array when calling epapr_hypercall() KVM: arm/arm64: Fix MMIO emulation data handling arm/arm64: KVM: Feed initialized memory to MMIO accesses KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1 cfg80211: extend range deviation for DMG mac80211: don't initiate TDLS connection if station is not associated to AP ibmveth: Do not process frames after calling napi_reschedule net: altera_tse: fix connect_local_phy error path scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state() serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling mac80211: fix miscounting of ttl-dropped frames ARC: fix __ffs return value to avoid build warnings ASoC: imx-audmux: change snprintf to scnprintf for possible overflow ASoC: dapm: change snprintf to scnprintf for possible overflow usb: gadget: Potential NULL dereference on allocation error usb: dwc3: gadget: Fix the uninitialized link_state when udc starts thermal: int340x_thermal: Fix a NULL vs IS_ERR() check ALSA: compress: prevent potential divide by zero bugs ASoC: Intel: Haswell/Broadwell: fix setting for .dynamic field drm/msm: Unblock writer if reader closes file scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached libceph: handle an empty authorize reply Revert "bridge: do not add port to router list when receives query with source 0.0.0.0" ARCv2: Enable unaligned access in early ASM code net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach() team: avoid complex list operations in team_nl_cmd_options_set() net/packet: fix 4gb buffer limit due to overflow check batman-adv: fix uninit-value in batadv_interface_tx() KEYS: always initialize keyring_index_key::desc_len KEYS: user: Align the payload buffer RDMA/srp: Rework SCSI device reset handling isdn: avm: Fix string plus integer warning from Clang leds: lp5523: fix a missing check of return value of lp55xx_read atm: he: fix sign-extension overflow on large shift isdn: i4l: isdn_tty: Fix some concurrency double-free bugs MIPS: jazz: fix 64bit build scsi: isci: initialize shost fully before calling scsi_add_host() scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_param MIPS: ath79: Enable OF serial ports in the default config net: hns: Fix use after free identified by SLUB debug mfd: mc13xxx: Fix a missing check of a register-read failure mfd: wm5110: Add missing ASRC rate register mfd: qcom_rpm: write fw_version to CTRL_REG mfd: ab8500-core: Return zero in get_register_interruptible() mfd: db8500-prcmu: Fix some section annotations mfd: twl-core: Fix section annotations on {,un}protect_pm_master mfd: ti_am335x_tscadc: Use PLATFORM_DEVID_AUTO while registering mfd cells KEYS: allow reaching the keys quotas exactly numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES ceph: avoid repeatedly adding inode to mdsc->snap_flush_list Revert "ANDROID: arm: process: Add display of memory around registers when displaying regs." ANDROID: mnt: Propagate remount correctly ANDROID: cuttlefish_defconfig: Add support for AC97 audio ANDROID: overlayfs: override_creds=off option bypass creator_cred FROMGIT: binder: create node flag to request sender's security context Conflicts: arch/arm/kernel/irq.c drivers/media/v4l2-core/videobuf2-v4l2.c sound/core/compress_offload.c Change-Id: I998f8d53b0c5b8a7102816034452b1779a3b69a3 Signed-off-by: Srinivasarao P <spathi@codeaurora.org>