summaryrefslogtreecommitdiff
path: root/include (unfollow)
Commit message (Collapse)Author
2024-10-17Merge remote-tracking branch 'msm8998/lineage-20' into lineage-20Raghuram Subramani
Change-Id: I126075a330f305c85f8fe1b8c9d408f368be95d1
2024-03-07string: uninline memcpy_and_padGuenter Roeck
commit 5c4e0a21fae877a7ef89be6dcc6263ec672372b8 upstream. When building m68k:allmodconfig, recent versions of gcc generate the following error if the length of UTS_RELEASE is less than 8 bytes. In function 'memcpy_and_pad', inlined from 'nvmet_execute_disc_identify' at drivers/nvme/target/discovery.c:268:2: arch/m68k/include/asm/string.h:72:25: error: '__builtin_memcpy' reading 8 bytes from a region of size 7 Discussions around the problem suggest that this only happens if an architecture does not provide strlen(), if -ffreestanding is provided as compiler option, and if CONFIG_FORTIFY_SOURCE=n. All of this is the case for m68k. The exact reasons are unknown, but seem to be related to the ability of the compiler to evaluate the return value of strlen() and the resulting execution flow in memcpy_and_pad(). It would be possible to work around the problem by using sizeof(UTS_RELEASE) instead of strlen(UTS_RELEASE), but that would only postpone the problem until the function is called in a similar way. Uninline memcpy_and_pad() instead to solve the problem for good. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Andy Shevchenko <andriy.shevchenko@intel.com> Change-Id: I21516b6de0b5f3d8af30ebbbfcac2d4a495658ac Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Alexander Grund <theflamefire89@gmail.com>
2024-03-07string.h: un-fortify memcpy_and_padMartin Wilck
commit 1359798f9d4082eb04575efdd19512fbd9c28464 upstream. The way I'd implemented the new helper memcpy_and_pad with __FORTIFY_INLINE caused compiler warnings for certain kernel configurations. This helper is only used in a single place at this time, and thus doesn't benefit much from fortification. So simplify the code by dropping fortification support for now. Fixes: 01f33c336e2d "string.h: add memcpy_and_pad()" Change-Id: I8bb1ec4490e27d450ba2042074d6f228b102462a Signed-off-by: Martin Wilck <mwilck@suse.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alexander Grund <theflamefire89@gmail.com>
2024-03-07BACKPORT: string.h: add memcpy_and_pad()Martin Wilck
commit 01f33c336e2d298ea5d4ce5d6e5bcd12865cc30f upstream. This helper function is useful for the nvme subsystem, and maybe others. Note: the warnings reported by the kbuild test robot for this patch are actually generated by the use of CONFIG_PROFILE_ALL_BRANCHES together with __FORTIFY_INLINE. Change-Id: I5f7e1e9143ce9df88af0afd02aef971d5172bd3e Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> [AG: Backported to 4.4] Signed-off-by: Alexander Grund <theflamefire89@gmail.com>
2024-02-19input: Drop INPUT_PROP_NO_DUMMY_RELEASE bitHan Wang
* INPUT_PROP_NO_DUMMY_RELEASE definition in this kernel collides with INPUT_PROP_ACCELEROMETER definition in bionic and upstream kernel. As a result, Android recognizes normal input devices like accelerometers and causes strange behaviors. There are no references to this bit in userspace and it is not in 4.9+ kernels, so let's drop this CAF jank. Change-Id: Id9b4ec8d31470e663f533249c4bc4b9e94fd38be
2024-01-10CHROMIUM: remove Android's cgroup generic permissions checksDmitry Torokhov
The implementation is utterly broken, resulting in all processes being allows to move tasks between sets (as long as they have access to the "tasks" attribute), and upstream is heading towards checking only capability anyway, so let's get rid of this code. BUG=b:31790445,chromium:647994 TEST=Boot android container, examine logcat Change-Id: I2f780a5992c34e52a8f2d0b3557fc9d490da2779 Signed-off-by: Dmitry Torokhov <dtor@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/394967 Reviewed-by: Ricky Zhou <rickyz@chromium.org> Reviewed-by: John Stultz <john.stultz@linaro.org>
2024-01-10bpf: add BPF_J{LT,LE,SLT,SLE} instructionsDaniel Borkmann
Currently, eBPF only understands BPF_JGT (>), BPF_JGE (>=), BPF_JSGT (s>), BPF_JSGE (s>=) instructions, this means that particularly *JLT/*JLE counterparts involving immediates need to be rewritten from e.g. X < [IMM] by swapping arguments into [IMM] > X, meaning the immediate first is required to be loaded into a register Y := [IMM], such that then we can compare with Y > X. Note that the destination operand is always required to be a register. This has the downside of having unnecessarily increased register pressure, meaning complex program would need to spill other registers temporarily to stack in order to obtain an unused register for the [IMM]. Loading to registers will thus also affect state pruning since we need to account for that register use and potentially those registers that had to be spilled/filled again. As a consequence slightly more stack space might have been used due to spilling, and BPF programs are a bit longer due to extra code involving the register load and potentially required spill/fills. Thus, add BPF_JLT (<), BPF_JLE (<=), BPF_JSLT (s<), BPF_JSLE (s<=) counterparts to the eBPF instruction set. Modifying LLVM to remove the NegateCC() workaround in a PoC patch at [1] and allowing it to also emit the new instructions resulted in cilium's BPF programs that are injected into the fast-path to have a reduced program length in the range of 2-3% (e.g. accumulated main and tail call sections from one of the object file reduced from 4864 to 4729 insns), reduced complexity in the range of 10-30% (e.g. accumulated sections reduced in one of the cases from 116432 to 88428 insns), and reduced stack usage in the range of 1-5% (e.g. accumulated sections from one of the object files reduced from 824 to 784b). The modification for LLVM will be incorporated in a backwards compatible way. Plan is for LLVM to have i) a target specific option to offer a possibility to explicitly enable the extension by the user (as we have with -m target specific extensions today for various CPU insns), and ii) have the kernel checked for presence of the extensions and enable them transparently when the user is selecting more aggressive options such as -march=native in a bpf target context. (Other frontends generating BPF byte code, e.g. ply can probe the kernel directly for its code generation.) [1] https://github.com/borkmann/llvm/tree/bpf-insns Change-Id: Ic56500aaeaf5f3ebdfda094ad6ef4666c82e18c5 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-10bpf: free up BPF_JMP | BPF_CALL | BPF_X opcodeAlexei Starovoitov
free up BPF_JMP | BPF_CALL | BPF_X opcode to be used by actual indirect call by register and use kernel internal opcode to mark call instruction into bpf_tail_call() helper. Change-Id: I1a45b8e3c13848c9689ce288d4862935ede97fa7 Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-10bpf: remove stubs for cBPF from arch codeDaniel Borkmann
Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make that a single __weak function in the core that can be overridden similarly to the eBPF one. Also remove stale pr_err() mentions of bpf_jit_compile. Change-Id: Iac221c09e9ae0879acdd7064d710c4f7cb8f478d Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-06{nl,mac}80211: add rssi to mesh candidatesBob Copeland
When peering is in userspace, some implementations may want to control which peers are accepted based on RSSI in addition to the information elements being sent today. Add signal level so that info is available to clients. Change-Id: Iae29de6adcb51c2dff58c0ea17e74cd988949991 Signed-off-by: Bob Copeland <bobcopeland@fb.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Backported: Alexander Grund
2024-01-02thread_info: Remove superflous struct declsAlexander Grund
Those should have been moved to <linux/restart_block.h> by 264c551c4c77c9645a1c5a03735a71ed37348bc4 ("UPSTREAM: thread_info: factor out restart_block") and are now superflous. Change-Id: Ic1c48c05ee5a2d759eb8210677c72ecad0e9a78c
2024-01-02USB: core: Prevent nested device-reset callsAlan Stern
commit 9c6d778800b921bde3bff3cff5003d1650f942d1 upstream. Automatic kernel fuzzing revealed a recursive locking violation in usb-storage: ============================================ WARNING: possible recursive locking detected 5.18.0 #3 Not tainted -------------------------------------------- kworker/1:3/1205 is trying to acquire lock: ffff888018638db8 (&us_interface_key[i]){+.+.}-{3:3}, at: usb_stor_pre_reset+0x35/0x40 drivers/usb/storage/usb.c:230 but task is already holding lock: ffff888018638db8 (&us_interface_key[i]){+.+.}-{3:3}, at: usb_stor_pre_reset+0x35/0x40 drivers/usb/storage/usb.c:230 ... stack backtrace: CPU: 1 PID: 1205 Comm: kworker/1:3 Not tainted 5.18.0 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Workqueue: usb_hub_wq hub_event Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_deadlock_bug kernel/locking/lockdep.c:2988 [inline] check_deadlock kernel/locking/lockdep.c:3031 [inline] validate_chain kernel/locking/lockdep.c:3816 [inline] __lock_acquire.cold+0x152/0x3ca kernel/locking/lockdep.c:5053 lock_acquire kernel/locking/lockdep.c:5665 [inline] lock_acquire+0x1ab/0x520 kernel/locking/lockdep.c:5630 __mutex_lock_common kernel/locking/mutex.c:603 [inline] __mutex_lock+0x14f/0x1610 kernel/locking/mutex.c:747 usb_stor_pre_reset+0x35/0x40 drivers/usb/storage/usb.c:230 usb_reset_device+0x37d/0x9a0 drivers/usb/core/hub.c:6109 r871xu_dev_remove+0x21a/0x270 drivers/staging/rtl8712/usb_intf.c:622 usb_unbind_interface+0x1bd/0x890 drivers/usb/core/driver.c:458 device_remove drivers/base/dd.c:545 [inline] device_remove+0x11f/0x170 drivers/base/dd.c:537 __device_release_driver drivers/base/dd.c:1222 [inline] device_release_driver_internal+0x1a7/0x2f0 drivers/base/dd.c:1248 usb_driver_release_interface+0x102/0x180 drivers/usb/core/driver.c:627 usb_forced_unbind_intf+0x4d/0xa0 drivers/usb/core/driver.c:1118 usb_reset_device+0x39b/0x9a0 drivers/usb/core/hub.c:6114 This turned out not to be an error in usb-storage but rather a nested device reset attempt. That is, as the rtl8712 driver was being unbound from a composite device in preparation for an unrelated USB reset (that driver does not have pre_reset or post_reset callbacks), its ->remove routine called usb_reset_device() -- thus nesting one reset call within another. Performing a reset as part of disconnect processing is a questionable practice at best. However, the bug report points out that the USB core does not have any protection against nested resets. Adding a reset_in_progress flag and testing it will prevent such errors in the future. Link: https://lore.kernel.org/all/CAB7eexKUpvX-JNiLzhXBDWgfg2T9e9_0Tw4HQ6keN==voRbP0g@mail.gmail.com/ Cc: stable@vger.kernel.org Reported-and-tested-by: Rondreis <linhaoguo86@gmail.com> Change-Id: I0812c3b2aec376fffddb3e03f3351f66ff76bcc6 Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/YwkflDxvg0KWqyZK@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-02flow: fix object-size-mismatch warning in flowi{4,6}_to_flowi_common()Tetsuo Handa
Commit 3df98d79215ace13 ("lsm,selinux: pass flowi_common instead of flowi to the LSM hooks") introduced flowi{4,6}_to_flowi_common() functions which cause UBSAN warning when building with LLVM 11.0.1 on Ubuntu 21.04. ================================================================================ UBSAN: object-size-mismatch in ./include/net/flow.h:197:33 member access within address ffffc9000109fbd8 with insufficient space for an object of type 'struct flowi' CPU: 2 PID: 7410 Comm: systemd-resolve Not tainted 5.14.0 #51 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020 Call Trace: dump_stack_lvl+0x103/0x171 ubsan_type_mismatch_common+0x1de/0x390 __ubsan_handle_type_mismatch_v1+0x41/0x50 udp_sendmsg+0xda2/0x1300 ? ip_skb_dst_mtu+0x1f0/0x1f0 ? sock_rps_record_flow+0xe/0x200 ? inet_send_prepare+0x2d/0x90 sock_sendmsg+0x49/0x80 ____sys_sendmsg+0x269/0x370 __sys_sendmsg+0x15e/0x1d0 ? syscall_enter_from_user_mode+0xf0/0x1b0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f7081a50497 Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 RSP: 002b:00007ffc153870f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f7081a50497 RDX: 0000000000000000 RSI: 00007ffc15387140 RDI: 000000000000000c RBP: 00007ffc15387140 R08: 0000563f29a5e4fc R09: 000000000000cd28 R10: 0000563f29a68a30 R11: 0000000000000246 R12: 000000000000000c R13: 0000000000000001 R14: 0000563f29a68a30 R15: 0000563f29a5e50c ================================================================================ I don't think we need to call flowi{4,6}_to_flowi() from these functions because the first member of "struct flowi4" and "struct flowi6" is struct flowi_common __fl_common; while the first member of "struct flowi" is union { struct flowi_common __fl_common; struct flowi4 ip4; struct flowi6 ip6; struct flowidn dn; } u; which should point to the same address without access to "struct flowi". Change-Id: Id4ba65a8029dabf06e424fb6bcbc40a4dbd086f5 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02lsm,selinux: pass flowi_common instead of flowi to the LSM hooksPaul Moore
As pointed out by Herbert in a recent related patch, the LSM hooks do not have the necessary address family information to use the flowi struct safely. As none of the LSMs currently use any of the protocol specific flowi information, replace the flowi pointers with pointers to the address family independent flowi_common struct. Reported-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Change-Id: Ic0f16cf514773f473705d48c787527f910943f1a
2023-11-09Revert "ALSA: rawmidi: Fix racy buffer resize under concurrent accesses"Alexander Grund
This reverts commit 718eede1eeb602531e09191d3107eb849bbe64eb. Remove the remaining parts of that commit as we use `realloc_mutex` instead to protect the buffer and `buffer_ref` is effectively unused. Change-Id: If0cf319ca5ab097751bc5e6753f61bd626d9e601
2023-11-09HID: core: Provide new max_buffer_size attribute to over-ride the defaultLee Jones
commit b1a37ed00d7908a991c1d0f18a8cba3c2aa99bdc upstream. Presently, when a report is processed, its proposed size, provided by the user of the API (as Report Size * Report Count) is compared against the subsystem default HID_MAX_BUFFER_SIZE (16k). However, some low-level HID drivers allocate a reduced amount of memory to their buffers (e.g. UHID only allocates UHID_DATA_MAX (4k) buffers), rending this check inadequate in some cases. In these circumstances, if the received report ends up being smaller than the proposed report size, the remainder of the buffer is zeroed. That is, the space between sizeof(csize) (size of the current report) and the rsize (size proposed i.e. Report Size * Report Count), which can be handled up to HID_MAX_BUFFER_SIZE (16k). Meaning that memset() shoots straight past the end of the buffer boundary and starts zeroing out in-use values, often resulting in calamity. This patch introduces a new variable into 'struct hid_ll_driver' where individual low-level drivers can over-ride the default maximum value of HID_MAX_BUFFER_SIZE (16k) with something more sympathetic to the interface. Change-Id: I851ac2340e107f57aded660540218c693e0e73f4 Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> [Lee: Backported to v4.14.y] Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ulrich Hecht <uli+cip@fpond.eu>
2023-11-09tty: use new tty_insert_flip_string_and_push_buffer() in pty_write()Jiri Slaby
commit a501ab75e7624d133a5a3c7ec010687c8b961d23 upstream. There is a race in pty_write(). pty_write() can be called in parallel with e.g. ioctl(TIOCSTI) or ioctl(TCXONC) which also inserts chars to the buffer. Provided, tty_flip_buffer_push() in pty_write() is called outside the lock, it can commit inconsistent tail. This can lead to out of bounds writes and other issues. See the Link below. To fix this, we have to introduce a new helper called tty_insert_flip_string_and_push_buffer(). It does both tty_insert_flip_string() and tty_flip_buffer_commit() under the port lock. It also calls queue_work(), but outside the lock. See 71a174b39f10 (pty: do tty_flip_buffer_push without port->lock in pty_write) for the reasons. Keep the helper internal-only (in drivers' tty.h). It is not intended to be used widely. Link: https://seclists.org/oss-sec/2022/q2/155 Fixes: 71a174b39f10 (pty: do tty_flip_buffer_push without port->lock in pty_write) Change-Id: I1f08439cc9047ee56df0681c3dfc5cd18f4b5a37
2023-11-02net: add ndo to setup/query xdp prog in adapter rxBrenden Blanco
Add one new netdev op for drivers implementing the BPF_PROG_TYPE_XDP filter. The single op is used for both setup/query of the xdp program, modelled after ndo_setup_tc. Change-Id: Ie46dec0b47e417e97d5fed19d7f2b143eab4ea73 Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-02netdev: introduce ndo_set_rx_headroomPaolo Abeni
This method allows the controlling device (i.e. the bridge) to specify additional headroom to be allocated for skb head on frame reception. Change-Id: Ic4938a247408a87538a11c45e3fbc2031f4ac832 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-02ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH}Keller, Jacob E
Ethernet drivers implementing both {GS}RXFH and {GS}CHANNELS ethtool ops incorrectly allow SCHANNELS when it would conflict with the settings from SRXFH. This occurs because it is not possible for drivers to understand whether their Rx flow indirection table has been configured or is in the default state. In addition, drivers currently behave in various ways when increasing the number of Rx channels. Some drivers will always destroy the Rx flow indirection table when this occurs, whether it has been set by the user or not. Other drivers will attempt to preserve the table even if the user has never modified it from the default driver settings. Neither of these situation is desirable because it leads to unexpected behavior or loss of user configuration. The correct behavior is to simply return -EINVAL when SCHANNELS would conflict with the current Rx flow table settings. However, it should only do so if the current settings were modified by the user. If we required that the new settings never conflict with the current (default) Rx flow settings, we would force users to first reduce their Rx flow settings and then reduce the number of Rx channels. This patch proposes a solution implemented in net/core/ethtool.c which ensures that all drivers behave correctly. It checks whether the RXFH table has been configured to non-default settings, and stores this information in a private netdev flag. When the number of channels is requested to change, it first ensures that the current Rx flow table is not going to assign flows to now disabled channels. Change-Id: I3c7ecdeed66e20a423fd46a0ebc937020e951105 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-02net: constify netif_is_* helpers net_device paramJiri Pirko
As suggested by Eric, these helpers should have const dev param. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Change-Id: I320e01f441a2b266680cdaefc5eb659f329143fb Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-02net: add netif_is_lag_port helperJiri Pirko
Some code does not mind if a device is bond slave or team port and treats them the same, as generic LAG ports. Change-Id: I9f1940897da8079deaec8d59bfba5e8a2c91c230 Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-02net: add netif_is_lag_master helperJiri Pirko
Some code does not mind if the master is bond or team and treats them the same, as generic LAG. Change-Id: I97803f67bdaf42a2cbf3eb6adf417c1aeebb5be1 Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-02net: add netif_is_team_port helperJiri Pirko
Similar to other helpers, caller can use this to find out if device is team port. Change-Id: Id39d3c7a188de21768a4f18190e25f27e5846b28 Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-02net: add netif_is_team_master helperJiri Pirko
Similar to other helpers, caller can use this to find out if device is team master. Change-Id: I8301442073b4af1e19a1ee207fa5055a3ab5c908 Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-01libfs: support RENAME_NOREPLACE in simple_rename()Miklos Szeredi
This is trivial to do: - add flags argument to simple_rename() - check if flags doesn't have any other than RENAME_NOREPLACE - assign simple_rename() to .rename2 instead of .rename Filesystems converted: hugetlbfs, ramfs, bpf. Debugfs uses simple_rename() to implement debugfs_rename(), which is for debugfs instances to rename files internally, not for userspace filesystem access. For this case pass zero flags to simple_rename(). Change-Id: I1a46ece3b40b05c9f18fd13b98062d2a959b76a0 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexei Starovoitov <ast@kernel.org>
2023-10-27writeback: Fix cgroup_path() return value checkLuK1337
cgroup_path() returns length, not char *. Change-Id: I8bdfcc0fc58789aa23f730866f27fbb932b24be1
2023-07-27softirq, sched: reduce softirq conflicts with RTJohn Dias
joshuous: Adapted to work with CAF's "softirq: defer softirq processing to ksoftirqd if CPU is busy with RT" commit. ajaivasudeve: adapted for the commit "softirq: Don't defer all softirq during RT task" We're finding audio glitches caused by audio-producing RT tasks that are either interrupted to handle softirq's or that are scheduled onto cpu's that are handling softirq's. In a previous patch, we attempted to catch many cases of the latter problem, but it's clear that we are still losing significant numbers of races in some apps. This patch attempts to address both problems: 1. It prohibits handling softirq's when interrupting an RT task, by delaying the softirq to the ksoftirqd thread. 2. It attempts to reduce the most common windows in which we lose the race between scheduling an RT task on a remote core and starting to handle softirq's on that core. We still lose some races, but we lose significantly fewer. (And we don't want to introduce any heavyweight forms of synchronization on these paths.) Bug: 64912585 Change-Id: Ida89a903be0f1965552dd0e84e67ef1d3158c7d8 Signed-off-by: John Dias <joaodias@google.com> Signed-off-by: joshuous <joshuous@gmail.com> Signed-off-by: ajaivasudeve <ajaivasudeve@gmail.com>
2023-07-26BACKPORT: ANDROID: Add hold functionality to schedtune CPU boostChris Redpath
* Render - added back missing header When tasks come and go from a runqueue quickly, this can lead to boost being applied and removed quickly which sometimes means we cannot raise the CPU frequency again when we need to (due to the rate limit on frequency updates). This has proved to be a particular issue for RT tasks and alternative methods have been used in the past to work around it. This is an attempt to solve the issue for all task classes and cpufreq governors by introducing a generic mechanism in schedtune to retain the max boost level from task enqueue for a minimum period - defined here as 50ms. This timeout was determined experimentally and is not configurable. A sched_feat guards the application of this to tasks - in the default configuration, task boosting only applied to tasks which have RT policy. Change SCHEDTUNE_BOOST_HOLD_ALL to true to apply it to all tasks regardless of class. It works like so: Every task enqueue (in an allowed class) stores a cpu-local timestamp. If the task is not a member of an allowed class (all or RT depending upon feature selection), the timestamp is not updated. The boost group will stay active regardless of tasks present until 50ms beyond the last timestamp stored. We also store the timestamp of the active boost group to avoid unneccesarily revisiting the boost groups when checking CPU boost level. If the timestamp is more than 50ms in the past when we check boost then we re-evaluate the boost groups for that CPU, taking into account the timestamps associated with each group. Idea based on rt-boost-retention patches from Joel. Change-Id: I52cc2d2e82d1c5aa03550378c8836764f41630c1 Suggested-by: Joel Fernandes <joelaf@google.com> Reviewed-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> Signed-off-by: RenderBroken <zkennedy87@gmail.com>
2023-07-26sched/fair: remove energy_diff tracepoint in preparation to re-factoringPatrick Bellasi
The format of the energy_diff tracepoint is going to be changed by the following energ_diff refactoring patches. Let's remove it now to start from a clean slate. Change-Id: Id4f537ed60d90a7ddcca0a29a49944bfacb85c8c Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
2023-07-16FROMLIST: cpufreq: Make iowait boost a policy optionJoel Fernandes
Make iowait boost a cpufreq policy option and enable it for intel_pstate cpufreq driver. Governors like schedutil can use it to determine if boosting for tasks that wake up with p->in_iowait set is needed. Bug: 38010527 Link: https://lkml.org/lkml/2017/5/19/43 Change-Id: Icf59e75fbe731dc67abb28fb837f7bb0cd5ec6cc Signed-off-by: Joel Fernandes <joelaf@google.com>
2023-07-16sched/walt: Re-add code to allow WALT to functionEthan Chen
Change-Id: Ieb1067c5e276f872ed4c722b7d1fabecbdad87e7
2023-07-16sched: cpufreq: Limit governor updates to WALT changes aloneVikram Mulukutla
It's not necessary to keep reporting load to the governor if it doesn't change in a window. Limit updates to when we expect load changes - after window rollover and when we send updates related to intercluster migrations. [beykerykt]: Adapt for HMP Change-Id: I3232d40f3d54b0b81cfafdcdb99b534df79327bf Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2023-07-16UPSTREAM: mm/ksm: Remove reuse_ksm_page()Peter Xu
Remove the function as the last reference has gone away with the do_wp_page() changes. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: Ie4da88791abe9407157566854b2db9b94c0c962f
2023-07-16UPSTREAM: mm: reuse only-pte-mapped KSM page in do_wp_page()Kirill Tkhai
Add an optimization for KSM pages almost in the same way that we have for ordinary anonymous pages. If there is a write fault in a page, which is mapped to an only pte, and it is not related to swap cache; the page may be reused without copying its content. [ Note that we do not consider PageSwapCache() pages at least for now, since we don't want to complicate __get_ksm_page(), which has nice optimization based on this (for the migration case). Currenly it is spinning on PageSwapCache() pages, waiting for when they have unfreezed counters (i.e., for the migration finish). But we don't want to make it also spinning on swap cache pages, which we try to reuse, since there is not a very high probability to reuse them. So, for now we do not consider PageSwapCache() pages at all. ] So in reuse_ksm_page() we check for 1) PageSwapCache() and 2) page_stable_node(), to skip a page, which KSM is currently trying to link to stable tree. Then we do page_ref_freeze() to prohibit KSM to merge one more page into the page, we are reusing. After that, nobody can refer to the reusing page: KSM skips !PageSwapCache() pages with zero refcount; and the protection against of all other participants is the same as for reused ordinary anon pages pte lock, page lock and mmap_sem. [akpm@linux-foundation.org: replace BUG_ON()s with WARN_ON()s] Link: http://lkml.kernel.org/r/154471491016.31352.1168978849911555609.stgit@localhost.localdomain Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Cc: Rik van Riel <riel@surriel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: If32387b1f7c36f0e12fcbb0926bf1b67886ec594
2023-07-16mm: introduce arg_lock to protect arg_start|end andYang Shi
env_start|end in mm_struct mmap_sem is on the hot path of kernel, and it very contended, but it is abused too. It is used to protect arg_start|end and evn_start|end when reading /proc/$PID/cmdline and /proc/$PID/environ, but it doesn't make sense since those proc files just expect to read 4 values atomically and not related to VM, they could be set to arbitrary values by C/R. And, the mmap_sem contention may cause unexpected issue like below: INFO: task ps:14018 blocked for more than 120 seconds. Tainted: G E 4.9.79-009.ali3000.alios7.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ps D 0 14018 1 0x00000004 Call Trace: schedule+0x36/0x80 rwsem_down_read_failed+0xf0/0x150 call_rwsem_down_read_failed+0x18/0x30 down_read+0x20/0x40 proc_pid_cmdline_read+0xd9/0x4e0 __vfs_read+0x37/0x150 vfs_read+0x96/0x130 SyS_read+0x55/0xc0 entry_SYSCALL_64_fastpath+0x1a/0xc5 Both Alexey Dobriyan and Michal Hocko suggested to use dedicated lock for them to mitigate the abuse of mmap_sem. So, introduce a new spinlock in mm_struct to protect the concurrent access to arg_start|end, env_start|end and others, as well as replace write map_sem to read to protect the race condition between prctl and sys_brk which might break check_data_rlimit(), and makes prctl more friendly to other VM operations. This patch just eliminates the abuse of mmap_sem, but it can't resolve the above hung task warning completely since the later access_remote_vm() call needs acquire mmap_sem. The mmap_sem scalability issue will be solved in the future. Change-Id: Ifa8f001ee2fc4f0ce60c18e771cebcf8a1f0943e [yang.shi@linux.alibaba.com: add comment about mmap_sem and arg_lock] Link: http://lkml.kernel.org/r/1524077799-80690-1-git-send-email-yang.shi@linux.alibaba.com Link: http://lkml.kernel.org/r/1523730291-109696-1-git-send-email-yang.shi@linux.alibaba.com Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Git-commit: 88aa7cc688d48ddd84558b41d5905a0db9535c4b Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
2023-07-16cpuset: Restore tasks affinity while moving across cpusetsPavankumar Kondeti
When tasks move across cpusets, the current affinity settings are lost. Cache the task affinity and restore it during cpuset migration. The restoring happens only when the cached affinity is subset of the current cpuset settings. Change-Id: I6c2ec1d5e3d994e176926d94b9e0cc92418020cc Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2023-07-16genirq: Introduce effective affinity maskThomas Gleixner
There is currently no way to evaluate the effective affinity mask of a given interrupt. Many irq chips allow only a single target CPU or a subset of CPUs in the affinity mask. Updating the mask at the time of setting the affinity to the subset would be counterproductive because information for cpu hotplug about assigned interrupt affinities gets lost. On CPU hotplug it's also pointless to force migrate an interrupt, which is not targeted at the CPU effectively. But currently the information is not available. Provide a seperate mask to be updated by the irq_chip->irq_set_affinity() implementations. Implement the read only proc files so the user can see the effective mask as well w/o trying to deduce it from /proc/interrupts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20170619235446.247834245@linutronix.de Change-Id: Ibeec0031edb532d52cb411286f785aec160d6139
2023-07-16kernel: time: Add delay after cpu_relax() in tight loopsPrasad Sodagudi
Tight loops of spin_lock_irqsave() and spin_unlock_irqrestore() in timer and hrtimer are causing scheduling delays. Add delay of few nano seconds after cpu_relax in the timer/hrtimer tight loops. Change-Id: Iaa0ab92da93f7b245b1d922b6edca2bebdc0fbce Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
2023-05-22Revert "kernel: Only expose su when daemon is running"lineage-19.1Georg Veichtlbauer
This patch is no longer necessary because we no longer ship su add-ons, which is this patch initially designed for. Now it causes another issue which breaks custom root solution such as Magisk, as Magisk switches worker tmpfs dir to RO instead of RW for safety reasons and happens to satisfy MS_RDONLY check for su file, resulting in su file totally inaccessible. This reverts commit 08ff8a2e58eb226015fa68d577121137a7e0953f. Change-Id: If25a9ef7e64c79412948f4619e08faaedb18aa13
2022-10-28BACKPORT: ANDROID: ftrace: fix function type mismatchesSami Tolvanen
This change fixes indirect call mismatches with function and function graph tracing, which trip Control-Flow Integrity (CFI) checking. Bug: 79510107 Bug: 67506682 Change-Id: I5de08c113fb970ffefedce93c58e0161f22c7ca2 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit c2f9bce9fee8e31e0500c501076f73db7791d8e9) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28BACKPORT: mm: fix filler function type mismatchSami Tolvanen
Bug: 67506682 Change-Id: I6f615164ccd86b407540ada9bbcb39d910395db9 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 4fd840d1743308b2ef470534523009dd99b3ce2b) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28BACKPORT: mm: fix drain_local_pages function typeSami Tolvanen
Bug: 67506682 Change-Id: I6ca80f521c880589efe45dc467d494051daae015 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 97d5fd27f7af6b36d775f56a1c78a026686aa407) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28BACKPORT: module: Do not paper over type mismatches in module_param_call()Kees Cook
The module_param_call() macro was explicitly casting the .set and .get function prototypes away. This can lead to hard-to-find type mismatches. Now that all the function prototypes have been fixed tree-wide, we can drop these casts, and use named initializers too. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jessica Yu <jeyu@kernel.org> Bug: 67506682 Change-Id: I439c8b4b9f0108ac357267bbc396a63baec2b242 (cherry picked from commit ece1996a21eeb344b49200e627c6660111009c10) Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit cb214f0c4c105ceab5456556425ecec74e2ac3c1) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28BACKPORT: treewide: Fix function prototypes for module_param_call()Kees Cook
Several function prototypes for the set/get functions defined by module_param_call() have a slightly wrong argument types. This fixes those in an effort to clean up the calls when running under type-enforced compiler instrumentation for CFI. This is the result of running the following semantic patch: @match_module_param_call_function@ declarer name module_param_call; identifier _name, _set_func, _get_func; expression _arg, _mode; @@ module_param_call(_name, _set_func, _get_func, _arg, _mode); @fix_set_prototype depends on match_module_param_call_function@ identifier match_module_param_call_function._set_func; identifier _val, _param; type _val_type, _param_type; @@ int _set_func( -_val_type _val +const char * _val , -_param_type _param +const struct kernel_param * _param ) { ... } @fix_get_prototype depends on match_module_param_call_function@ identifier match_module_param_call_function._get_func; identifier _val, _param; type _val_type, _param_type; @@ int _get_func( -_val_type _val +char * _val , -_param_type _param +const struct kernel_param * _param ) { ... } Two additional by-hand changes are included for places where the above Coccinelle script didn't notice them: drivers/platform/x86/thinkpad_acpi.c fs/lockd/svc.c Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jessica Yu <jeyu@kernel.org> Bug: 67506682 Change-Id: I2c9c0ee8ed28065e63270a52c155e5e7d2791295 (cherry picked from commit e4dca7b7aa08b22893c45485d222b5807c1375ae) Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 24da2c84bd7dcdf2b56fa8d3b2f833656ee60a01) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28BACKPORT: module: Prepare to convert all module_param_call() prototypesKees Cook
After actually converting all module_param_call() function prototypes, we no longer need to do a tricky sizeof(func(thing)) type-check. Remove it. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jessica Yu <jeyu@kernel.org> Bug: 67506682 Change-Id: Ie20dbd09634c7cbef499c81bf2dbfd762ad0058a (cherry picked from commit b2f270e8747387335d80428c576118e7d87f69cc) Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 38cbecf2ae55478a2046ef4fc93b9d9147fbb170) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28BACKPORT: kbuild: fix --gc-sectionsSami Tolvanen
Adds KEEP() to __ex_table, __param, and __bug_table. Bug: 67506682 Change-Id: I44ce1a541ac61b18c9ef5eb4749122f39ca7c755 Reported-by: Channagoud Kadabi <ckadabi@quicinc.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 3a3a0844ac38928ce19ed45b38532f8f16422470) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28BACKPORT: FROMLIST: kbuild: fix LD_DEAD_CODE_DATA_ELIMINATIONSami Tolvanen
Don't remove .head.text or .exitcall.exit when linking with --gc-sections, and include .init.text.* in .init.text and .init.rodata.* in .init.rodata. Bug: 62093296 Bug: 67506682 Change-Id: Ia0f9e735d04c2322dcc8bcfc94241f0551b149c4 (am from https://patchwork.kernel.org/patch/10085773/) Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> (cherry picked from commit 4d10cc8e3fd07f7ff77623a1551dad3ead787d26) Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28BACKPORT: kbuild: allow archs to select link dead code/data eliminationNicholas Piggin
Introduce LD_DEAD_CODE_DATA_ELIMINATION option for architectures to select to build with -ffunction-sections, -fdata-sections, and link with --gc-sections. It requires some work (documented) to ensure all unreferenced entrypoints are live, and requires toolchain and build verification, so it is made a per-arch option for now. On a random powerpc64le build, this yelds a significant size saving, it boots and runs fine, but there is a lot I haven't tested as yet, so these savings may be reduced if there are bugs in the link. text data bss dec filename 11169741 1180744 1923176 14273661 vmlinux 10445269 1004127 1919707 13369103 vmlinux.dce ~700K text, ~170K data, 6% removed from kernel image size. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.com> (cherry-pick from b67067f1176df6ee727450546b58704e4b588563) Change-Id: I81b63489605bc2f146498d0bb0e1cc5b7adab8a0 Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28BACKPORT: arch: wire-up pidfd_open()Christian Brauner
This wires up the pidfd_open() syscall into all arches at once. Signed-off-by: Christian Brauner <christian@brauner.io> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jann Horn <jannh@google.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-api@vger.kernel.org Cc: linux-alpha@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-ia64@vger.kernel.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-mips@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org (cherry picked from commit 7615d9e1780e26e0178c93c55b73309a5dc093d7) Conflicts: arch/alpha/kernel/syscalls/syscall.tbl arch/arm/tools/syscall.tbl arch/ia64/kernel/syscalls/syscall.tbl arch/m68k/kernel/syscalls/syscall.tbl arch/microblaze/kernel/syscalls/syscall.tbl arch/mips/kernel/syscalls/syscall_n32.tbl arch/mips/kernel/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_o32.tbl arch/parisc/kernel/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl arch/sh/kernel/syscalls/syscall.tbl arch/sparc/kernel/syscalls/syscall.tbl arch/xtensa/kernel/syscalls/syscall.tbl arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_64.tbl (1. Skipped syscall.tbl modifications for missing architectures. 2. Removed __ia32_sys_pidfd_open in arch/x86/entry/syscalls/syscall_32.tbl. 3. Replaced __x64_sys_pidfd_open with sys_pidfd_open in arch/x86/entry/syscalls/syscall_64.tbl.) Bug: 135608568 Test: test program using syscall(__NR_sys_pidfd_open,..) and poll() Change-Id: I294aa33dea5ed2662e077340281d7aa0452f7471 Signed-off-by: Suren Baghdasaryan <surenb@google.com>