summaryrefslogtreecommitdiff
path: root/init (unfollow)
Commit message (Collapse)Author
2022-10-28BACKPORT: kbuild: allow archs to select link dead code/data eliminationNicholas Piggin
Introduce LD_DEAD_CODE_DATA_ELIMINATION option for architectures to select to build with -ffunction-sections, -fdata-sections, and link with --gc-sections. It requires some work (documented) to ensure all unreferenced entrypoints are live, and requires toolchain and build verification, so it is made a per-arch option for now. On a random powerpc64le build, this yelds a significant size saving, it boots and runs fine, but there is a lot I haven't tested as yet, so these savings may be reduced if there are bugs in the link. text data bss dec filename 11169741 1180744 1923176 14273661 vmlinux 10445269 1004127 1919707 13369103 vmlinux.dce ~700K text, ~170K data, 6% removed from kernel image size. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.com> (cherry-pick from b67067f1176df6ee727450546b58704e4b588563) Change-Id: I81b63489605bc2f146498d0bb0e1cc5b7adab8a0 Signed-off-by: Dan Aloni <daloni@magicleap.com> Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-10-28UPSTREAM: Make anon_inodes unconditionalDavid Howells
Make the anon_inodes facility unconditional so that it can be used by core VFS code. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit dadd2299ab61fc2b55b95b7b3a8f674cdd3b69c9) Bug: 135608568 Test: test program using syscall(__NR_sys_pidfd_open,..) and poll() Change-Id: I2f97bda4f360d8d05bbb603de839717b3d8067ae Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2022-07-27init: Remove CGROUP_SCHEDTUNE duplicated definitionDavide Garberi
* This has been added twice who knows when by CAF, most likely during an important merge Signed-off-by: Davide Garberi <dade.garberi@gmail.com>
2022-04-19bpf: Add kconfig knob for disabling unpriv bpf by defaultDaniel Borkmann
commit 08389d888287c3823f80b0216766b71e17f0aba5 upstream. Add a kconfig knob which allows for unprivileged bpf to be disabled by default. If set, the knob sets /proc/sys/kernel/unprivileged_bpf_disabled to value of 2. This still allows a transition of 2 -> {0,1} through an admin. Similarly, this also still keeps 1 -> {1} behavior intact, so that once set to permanently disabled, it cannot be undone aside from a reboot. We've also added extra2 with max of 2 for the procfs handler, so that an admin still has a chance to toggle between 0 <-> 2. Either way, as an additional alternative, applications can make use of CAP_BPF that we added a while ago. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/74ec548079189e4e4dffaeb42b8987bb3c852eee.1620765074.git.daniel@iogearbox.net [fllinden@amazon.com: backported to 4.9] Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-19UPSTREAM: cgroup: move CONFIG_SOCK_CGROUP_DATA to init/KconfigArnd Bergmann
We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is not right when CONFIG_NET is disabled and there is no socket interface: warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET) I don't know what the correct solution for this is, but simply removing the dependency on NET from SOCK_CGROUP_DATA by moving it out of the 'if NET' section avoids the warning and does not produce other build errors. Fixes: 483c4933ea09 ("cgroup: Fix CGROUP_BPF config") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net> Fixes: Change-Id: Ib41ef78fba02eb9e592558ddbf06f9ec0aa337b6 ("UPSTREAM: cgroup: Fix CGROUP_BPF config") (cherry picked from commit 73b351473547e543e9c8166dd67fd99c64c15b0b) Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>
2022-04-19UPSTREAM: cgroup: Fix CGROUP_BPF configAndy Lutomirski
Cherry-pick from commit 483c4933ea09b7aa625b9d64af286fc22ec7e419 CGROUP_BPF depended on SOCK_CGROUP_DATA which can't be manually enabled, making it rather challenging to turn CGROUP_BPF on. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Bug: 30950746 Change-Id: Ib41ef78fba02eb9e592558ddbf06f9ec0aa337b6 Signed-off-by: Chatur27 <jasonbright2709@gmail.com>
2022-04-19bpf: introduce BPF_JIT_ALWAYS_ON configAlexei Starovoitov
[ upstream commit 290af86629b25ffd1ed6232c4e9107da031705cb ] The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chatur27 <jasonbright2709@gmail.com>
2022-04-19UPSTREAM: cgroup: add support for eBPF programsDaniel Mack
Cherry-pick from commit 3007098494bec614fb55dee7bc0410bb7db5ad18 This patch adds two sets of eBPF program pointers to struct cgroup. One for such that are directly pinned to a cgroup, and one for such that are effective for it. To illustrate the logic behind that, assume the following example cgroup hierarchy. A - B - C \ D - E If only B has a program attached, it will be effective for B, C, D and E. If D then attaches a program itself, that will be effective for both D and E, and the program in B will only affect B and C. Only one program of a given type is effective for a cgroup. Attaching and detaching programs will be done through the bpf(2) syscall. For now, ingress and egress inet socket filtering are the only supported use-cases. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Bug: 30950746 Change-Id: I3df35d8d3b1261503f9b5bcd90b18c9358f1ac28 Signed-off-by: Chatur27 <jasonbright2709@gmail.com>
2022-04-19Revert "bpf: introduce BPF_JIT_ALWAYS_ON config"Anay Wadhera
This reverts commit 28c486744e6de4d882a1d853aa63d99fcba4b7a6. Change-Id: I0e65abce457093a6c05d971b750228d99d5de131
2021-06-10pid: take a reference when initializing `cad_pid`Mark Rutland
commit 0711f0d7050b9e07c44bc159bbc64ac0a1022c7f upstream. During boot, kernel_init_freeable() initializes `cad_pid` to the init task's struct pid. Later on, we may change `cad_pid` via a sysctl, and when this happens proc_do_cad_pid() will increment the refcount on the new pid via get_pid(), and will decrement the refcount on the old pid via put_pid(). As we never called get_pid() when we initialized `cad_pid`, we decrement a reference we never incremented, can therefore free the init task's struct pid early. As there can be dangling references to the struct pid, we can later encounter a use-after-free (e.g. when delivering signals). This was spotted when fuzzing v5.13-rc3 with Syzkaller, but seems to have been around since the conversion of `cad_pid` to struct pid in commit 9ec52099e4b8 ("[PATCH] replace cad_pid by a struct pid") from the pre-KASAN stone age of v2.6.19. Fix this by getting a reference to the init task's struct pid when we assign it to `cad_pid`. Full KASAN splat below. ================================================================== BUG: KASAN: use-after-free in ns_of_pid include/linux/pid.h:153 [inline] BUG: KASAN: use-after-free in task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509 Read of size 4 at addr ffff23794dda0004 by task syz-executor.0/273 CPU: 1 PID: 273 Comm: syz-executor.0 Not tainted 5.12.0-00001-g9aef892b2d15 #1 Hardware name: linux,dummy-virt (DT) Call trace: ns_of_pid include/linux/pid.h:153 [inline] task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509 do_notify_parent+0x308/0xe60 kernel/signal.c:1950 exit_notify kernel/exit.c:682 [inline] do_exit+0x2334/0x2bd0 kernel/exit.c:845 do_group_exit+0x108/0x2c8 kernel/exit.c:922 get_signal+0x4e4/0x2a88 kernel/signal.c:2781 do_signal arch/arm64/kernel/signal.c:882 [inline] do_notify_resume+0x300/0x970 arch/arm64/kernel/signal.c:936 work_pending+0xc/0x2dc Allocated by task 0: slab_post_alloc_hook+0x50/0x5c0 mm/slab.h:516 slab_alloc_node mm/slub.c:2907 [inline] slab_alloc mm/slub.c:2915 [inline] kmem_cache_alloc+0x1f4/0x4c0 mm/slub.c:2920 alloc_pid+0xdc/0xc00 kernel/pid.c:180 copy_process+0x2794/0x5e18 kernel/fork.c:2129 kernel_clone+0x194/0x13c8 kernel/fork.c:2500 kernel_thread+0xd4/0x110 kernel/fork.c:2552 rest_init+0x44/0x4a0 init/main.c:687 arch_call_rest_init+0x1c/0x28 start_kernel+0x520/0x554 init/main.c:1064 0x0 Freed by task 270: slab_free_hook mm/slub.c:1562 [inline] slab_free_freelist_hook+0x98/0x260 mm/slub.c:1600 slab_free mm/slub.c:3161 [inline] kmem_cache_free+0x224/0x8e0 mm/slub.c:3177 put_pid.part.4+0xe0/0x1a8 kernel/pid.c:114 put_pid+0x30/0x48 kernel/pid.c:109 proc_do_cad_pid+0x190/0x1b0 kernel/sysctl.c:1401 proc_sys_call_handler+0x338/0x4b0 fs/proc/proc_sysctl.c:591 proc_sys_write+0x34/0x48 fs/proc/proc_sysctl.c:617 call_write_iter include/linux/fs.h:1977 [inline] new_sync_write+0x3ac/0x510 fs/read_write.c:518 vfs_write fs/read_write.c:605 [inline] vfs_write+0x9c4/0x1018 fs/read_write.c:585 ksys_write+0x124/0x240 fs/read_write.c:658 __do_sys_write fs/read_write.c:670 [inline] __se_sys_write fs/read_write.c:667 [inline] __arm64_sys_write+0x78/0xb0 fs/read_write.c:667 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline] invoke_syscall arch/arm64/kernel/syscall.c:49 [inline] el0_svc_common.constprop.1+0x16c/0x388 arch/arm64/kernel/syscall.c:129 do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:168 el0_svc+0x28/0x38 arch/arm64/kernel/entry-common.c:416 el0_sync_handler+0x134/0x180 arch/arm64/kernel/entry-common.c:432 el0_sync+0x154/0x180 arch/arm64/kernel/entry.S:701 The buggy address belongs to the object at ffff23794dda0000 which belongs to the cache pid of size 224 The buggy address is located 4 bytes inside of 224-byte region [ffff23794dda0000, ffff23794dda00e0) The buggy address belongs to the page: page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4dda0 head:(____ptrval____) order:1 compound_mapcount:0 flags: 0x3fffc0000010200(slab|head) raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff23794d40d080 raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff23794dd9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff23794dd9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff23794dda0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff23794dda0080: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff23794dda0100: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 ================================================================== Link: https://lkml.kernel.org/r/20210524172230.38715-1-mark.rutland@arm.com Fixes: 9ec52099e4b8678a ("[PATCH] replace cad_pid by a struct pid") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Christian Brauner <christian@brauner.io> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-10init/Kconfig: make COMPILE_TEST depend on HAS_IOMEMMasahiro Yamada
commit ea29b20a828511de3348334e529a3d046a180416 upstream. I read the commit log of the following two: - bc083a64b6c0 ("init/Kconfig: make COMPILE_TEST depend on !UML") - 334ef6ed06fa ("init/Kconfig: make COMPILE_TEST depend on !S390") Both are talking about HAS_IOMEM dependency missing in many drivers. So, 'depends on HAS_IOMEM' seems the direct, sensible solution to me. This does not change the behavior of UML. UML still cannot enable COMPILE_TEST because it does not provide HAS_IOMEM. The current dependency for S390 is too strong. Under the condition of CONFIG_PCI=y, S390 provides HAS_IOMEM, hence can enable COMPILE_TEST. I also removed the meaningless 'default n'. Link: https://lkml.kernel.org/r/20210224140809.1067582-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KP Singh <kpsingh@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Quentin Perret <qperret@google.com> Cc: Valentin Schneider <valentin.schneider@arm.com> Cc: "Enrico Weigelt, metux IT consult" <lkml@metux.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-10init/Kconfig: make COMPILE_TEST depend on !S390Heiko Carstens
commit 334ef6ed06fa1a54e35296b77b693bcf6d63ee9e upstream. While allmodconfig and allyesconfig build for s390 there are also various bots running compile tests with randconfig, where PCI is disabled. This reveals that a lot of drivers should actually depend on HAS_IOMEM. Adding this to each device driver would be a never ending story, therefore just disable COMPILE_TEST for s390. The reasoning is more or less the same as described in commit bc083a64b6c0 ("init/Kconfig: make COMPILE_TEST depend on !UML"). Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-10init/Kconfig: make COMPILE_TEST depend on !UMLRichard Weinberger
commit bc083a64b6c035135c0f80718f9e9192cc0867c6 upstream. UML is a bit special since it does not have iomem nor dma. That means a lot of drivers will not build if they miss a dependency on HAS_IOMEM. s390 used to have the same issues but since it gained PCI support UML is the only stranger. We are tired of patching dozens of new drivers after every merge window just to un-break allmod/yesconfig UML builds. One could argue that a decent driver has to know on what it depends and therefore a missing HAS_IOMEM dependency is a clear driver bug. But the dependency not obvious and not everyone does UML builds with COMPILE_TEST enabled when developing a device driver. A possible solution to make these builds succeed on UML would be providing stub functions for ioremap() and friends which fail upon runtime. Another one is simply disabling COMPILE_TEST for UML. Since it is the least hassle and does not force use to fake iomem support let's do the latter. Link: http://lkml.kernel.org/r/1466152995-28367-1-git-send-email-richard@nod.at Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-05UPSTREAM: locking/atomic, kref: Add KREF_INIT()Peter Zijlstra
Since we need to change the implementation, stop exposing internals. Provide KREF_INIT() to allow static initialization of struct kref. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Bug: 173789633 Change-Id: I5fa30eb173b15bcb5d291ccbb61294f28ba27ea9 (cherry picked from commit 1e24edca0557dba6486d39d3c24c288475432bcf) Signed-off-by: Giuliano Procida <gprocida@google.com>
2020-11-10printk: reduce LOG_BUF_SHIFT range for H8300John Ogness
[ Upstream commit 550c10d28d21bd82a8bb48debbb27e6ed53262f6 ] The .bss section for the h8300 is relatively small. A value of CONFIG_LOG_BUF_SHIFT that is larger than 19 will create a static printk ringbuffer that is too large. Limit the range appropriately for the H8300. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200812073122.25412-1-john.ogness@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20x86: Fix early boot crash on gcc-10, third tryBorislav Petkov
commit a9a3ed1eff3601b63aea4fb462d8b3b92c7c1e7e upstream. ... or the odyssey of trying to disable the stack protector for the function which generates the stack canary value. The whole story started with Sergei reporting a boot crash with a kernel built with gcc-10: Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013 Call Trace: dump_stack panic ? start_secondary __stack_chk_fail start_secondary secondary_startup_64 -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary This happens because gcc-10 tail-call optimizes the last function call in start_secondary() - cpu_startup_entry() - and thus emits a stack canary check which fails because the canary value changes after the boot_init_stack_canary() call. To fix that, the initial attempt was to mark the one function which generates the stack canary with: __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused) however, using the optimize attribute doesn't work cumulatively as the attribute does not add to but rather replaces previously supplied optimization options - roughly all -fxxx options. The key one among them being -fno-omit-frame-pointer and thus leading to not present frame pointer - frame pointer which the kernel needs. The next attempt to prevent compilers from tail-call optimizing the last function call cpu_startup_entry(), shy of carving out start_secondary() into a separate compilation unit and building it with -fno-stack-protector, was to add an empty asm(""). This current solution was short and sweet, and reportedly, is supported by both compilers but we didn't get very far this time: future (LTO?) optimization passes could potentially eliminate this, which leads us to the third attempt: having an actual memory barrier there which the compiler cannot ignore or move around etc. That should hold for a long time, but hey we said that about the other two solutions too so... Reported-by: Sergei Trofimovich <slyfox@gentoo.org> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Kalle Valo <kvalo@codeaurora.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-20Stop the ad-hoc games with -Wno-maybe-initializedLinus Torvalds
commit 78a5255ffb6a1af189a83e493d916ba1c54d8c75 upstream. We have some rather random rules about when we accept the "maybe-initialized" warnings, and when we don't. For example, we consider it unreliable for gcc versions < 4.9, but also if -O3 is enabled, or if optimizing for size. And then various kernel config options disabled it, because they know that they trigger that warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES). And now gcc-10 seems to be introducing a lot of those warnings too, so it falls under the same heading as 4.9 did. At the same time, we have a very straightforward way to _enable_ that warning when wanted: use "W=2" to enable more warnings. So stop playing these ad-hoc games, and just disable that warning by default, with the known and straight-forward "if you want to work on the extra compiler warnings, use W=123". Would it be great to have code that is always so obvious that it never confuses the compiler whether a variable is used initialized or not? Yes, it would. In a perfect world, the compilers would be smarter, and our source code would be simpler. That's currently not the world we live in, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-20kbuild: compute false-positive -Wmaybe-uninitialized cases in KconfigMasahiro Yamada
commit b303c6df80c9f8f13785aa83a0471fca7e38b24d upstream. Since -Wmaybe-uninitialized was introduced by GCC 4.7, we have patched various false positives: - commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized when building with -Os") turned off this option for -Os. - commit 815eb71e7149 ("Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES") turned off this option for CONFIG_PROFILE_ALL_BRANCHES - commit a76bcf557ef4 ("Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"") turned off this option for GCC < 4.9 Arnd provided more explanation in https://lkml.org/lkml/2017/3/14/903 I think this looks better by shifting the logic from Makefile to Kconfig. Link: https://github.com/ClangBuiltLinux/linux/issues/350 Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21sched/core: Add try_get_task_stack() and put_task_stack()Andy Lutomirski
commit c6c314a613cd7d03fb97713e0d642b493de42e69 upstream. There are a few places in the kernel that access stack memory belonging to a different task. Before we can start freeing task stacks before the task_struct is freed, we need a way for those code paths to pin the stack. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jann Horn <jann@thejh.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/17a434f50ad3d77000104f21666575e10a9c1fbd.1474003868.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21sched/core: Allow putting thread_info into task_structAndy Lutomirski
commit c65eacbe290b8141554c71b2c94489e73ade8c8d upstream. If an arch opts in by setting CONFIG_THREAD_INFO_IN_TASK_STRUCT, then thread_info is defined as a single 'u32 flags' and is the first entry of task_struct. thread_info::task is removed (it serves no purpose if thread_info is embedded in task_struct), and thread_info::cpu gets its own slot in task_struct. This is heavily based on a patch written by Linus. Originally-from: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jann Horn <jann@thejh.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a0898196f0476195ca02713691a5037a14f2aac5.1473801993.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-02ANDROID: sched: Disallow WALT with CFS bandwidth controlQuentin Perret
The WALT time accounting breaks when CFS tasks are throttled by the CPU bandwidth control mechanism of the CPU cgroups controller. This can result in a negative cumulative_runnable_avg, which can then lead to a kernel panic, and the device crashing. Although the right fix would be add support for throttled CFS tasks to WALT, the common kernel is now in stable maintenance mode and will not get new features which could cause issues for partners downstream. To work around the issue, make the CFS_BANDWIDTH Kconfig option depend on SCHED_WALT=n, hence preventing these two things from being enabled simultaneously. This should not be an issue for most partners (nobody had noticed the breakage for years), and those who do need the better fix can apply it in their device kernel. Bug: 139071966 Bug: 120440300 Change-Id: Ieb3c367ae7893ac93fb5b38c1580dc59151aacce Signed-off-by: Quentin Perret <quentin.perret@arm.com>
2019-09-02ANDROID: sched: Disallow WALT with CFS bandwidth controlQuentin Perret
The WALT time accounting breaks when CFS tasks are throttled by the CPU bandwidth control mechanism of the CPU cgroups controller. This can result in a negative cumulative_runnable_avg, which can then lead to a kernel panic, and the device crashing. Although the right fix would be add support for throttled CFS tasks to WALT, the common kernel is now in stable maintenance mode and will not get new features which could cause issues for partners downstream. To work around the issue, make the CFS_BANDWIDTH Kconfig option depend on SCHED_WALT=n, hence preventing these two things from being enabled simultaneously. This should not be an issue for most partners (nobody had noticed the breakage for years), and those who do need the better fix can apply it in their device kernel. Bug: 139071966 Bug: 120440300 Change-Id: Ieb3c367ae7893ac93fb5b38c1580dc59151aacce Signed-off-by: Quentin Perret <quentin.perret@arm.com>
2019-05-16init: initialize jump labels before command line option parsingDan Williams
[ Upstream commit 6041186a32585fc7a1d0f6cfe2f138b05fdc3c82 ] When a module option, or core kernel argument, toggles a static-key it requires jump labels to be initialized early. While x86, PowerPC, and ARM64 arrange for jump_label_init() to be called before parse_args(), ARM does not. Kernel command line: rdinit=/sbin/init page_alloc.shuffle=1 panic=-1 console=ttyAMA0,115200 page_alloc.shuffle=1 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at ./include/linux/jump_label.h:303 page_alloc_shuffle+0x12c/0x1ac static_key_enable(): static key 'page_alloc_shuffle_key+0x0/0x4' used before call to jump_label_init() Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.1.0-rc4-next-20190410-00003-g3367c36ce744 #1 Hardware name: ARM Integrator/CP (Device Tree) [<c0011c68>] (unwind_backtrace) from [<c000ec48>] (show_stack+0x10/0x18) [<c000ec48>] (show_stack) from [<c07e9710>] (dump_stack+0x18/0x24) [<c07e9710>] (dump_stack) from [<c001bb1c>] (__warn+0xe0/0x108) [<c001bb1c>] (__warn) from [<c001bb88>] (warn_slowpath_fmt+0x44/0x6c) [<c001bb88>] (warn_slowpath_fmt) from [<c0b0c4a8>] (page_alloc_shuffle+0x12c/0x1ac) [<c0b0c4a8>] (page_alloc_shuffle) from [<c0b0c550>] (shuffle_store+0x28/0x48) [<c0b0c550>] (shuffle_store) from [<c003e6a0>] (parse_args+0x1f4/0x350) [<c003e6a0>] (parse_args) from [<c0ac3c00>] (start_kernel+0x1c0/0x488) Move the fallback call to jump_label_init() to occur before parse_args(). The redundant calls to jump_label_init() in other archs are left intact in case they have static key toggling use cases that are even earlier than option parsing. Link: http://lkml.kernel.org/r/155544804466.1032396.13418949511615676665.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Guenter Roeck <groeck@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Russell King <rmk@armlinux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-09-06printk: Make the console flush configurable in hotplug pathMohammed Khajapasha
The thread which initiates the hot plug can get scheduled out, while trying to acquire the console lock, thus increasing the hot plug latency. This option allows to selectively disable the console flush and in turn reduce the hot plug latency. Change-Id: I42507804d321b29b7761146a6c175d959bf79925 Signed-off-by: Mohammed Khajapasha <mkhaja@codeaurora.org> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org> Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
2018-06-06Kbuild: change CC_OPTIMIZE_FOR_SIZE definitionArnd Bergmann
commit 877417e6ffb9578e8580abf76a71e15732473456 upstream. CC_OPTIMIZE_FOR_SIZE disables the often useful -Wmaybe-unused warning, because that causes a ridiculous amount of false positives when combined with -Os. This means a lot of warnings don't show up in testing by the developers that should see them with an 'allmodconfig' kernel that has CC_OPTIMIZE_FOR_SIZE enabled, but only later in randconfig builds that don't. This changes the Kconfig logic around CC_OPTIMIZE_FOR_SIZE to make it a 'choice' statement defaulting to CC_OPTIMIZE_FOR_PERFORMANCE that gets added for this purpose. The allmodconfig and allyesconfig kernels now default to -O2 with the maybe-unused warning enabled. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michal Marek <mmarek@suse.com> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-16init/main: Put kernel end place_markerVivek Kumar
Put kernel end place_marker for all targets. This saves the kernel end time for targets which enable MSM_BOOT_TIME_MARKER. Change-Id: Iad635e971bdd341328d40681b7acf8a6f43f288d Signed-off-by: Vivek Kumar <vivekuma@codeaurora.org>
2018-02-03bpf: introduce BPF_JIT_ALWAYS_ON configAlexei Starovoitov
[ upstream commit 290af86629b25ffd1ed6232c4e9107da031705cb ] The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-05kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZEHugh Dickins
Kaiser only needs to map one page of the stack; and kernel/fork.c did not build on powerpc (no __PAGE_KERNEL). It's all cleaner if linux/kaiser.h provides kaiser_map_thread_stack() and kaiser_unmap_thread_stack() wrappers around asm/kaiser.h's kaiser_add_mapping() and kaiser_remove_mapping(). And use linux/kaiser.h in init/main.c to avoid the #ifdefs there. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-05KAISER: Kernel Address IsolationRichard Fellner
This patch introduces our implementation of KAISER (Kernel Address Isolation to have Side-channels Efficiently Removed), a kernel isolation technique to close hardware side channels on kernel address information. More information about the patch can be found on: https://github.com/IAIK/KAISER From: Richard Fellner <richard.fellner@student.tugraz.at> From: Daniel Gruss <daniel.gruss@iaik.tugraz.at> X-Subject: [RFC, PATCH] x86_64: KAISER - do not map kernel in user mode Date: Thu, 4 May 2017 14:26:50 +0200 Link: http://marc.info/?l=linux-kernel&m=149390087310405&w=2 Kaiser-4.10-SHA1: c4b1831d44c6144d3762ccc72f0c4e71a0c713e5 To: <linux-kernel@vger.kernel.org> To: <kernel-hardening@lists.openwall.com> Cc: <clementine.maurice@iaik.tugraz.at> Cc: <moritz.lipp@iaik.tugraz.at> Cc: Michael Schwarz <michael.schwarz@iaik.tugraz.at> Cc: Richard Fellner <richard.fellner@student.tugraz.at> Cc: Ingo Molnar <mingo@kernel.org> Cc: <kirill.shutemov@linux.intel.com> Cc: <anders.fogh@gdata-adan.de> After several recent works [1,2,3] KASLR on x86_64 was basically considered dead by many researchers. We have been working on an efficient but effective fix for this problem and found that not mapping the kernel space when running in user mode is the solution to this problem [4] (the corresponding paper [5] will be presented at ESSoS17). With this RFC patch we allow anybody to configure their kernel with the flag CONFIG_KAISER to add our defense mechanism. If there are any questions we would love to answer them. We also appreciate any comments! Cheers, Daniel (+ the KAISER team from Graz University of Technology) [1] http://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf [2] https://www.blackhat.com/docs/us-16/materials/us-16-Fogh-Using-Undocumented-CPU-Behaviour-To-See-Into-Kernel-Mode-And-Break-KASLR-In-The-Process.pdf [3] https://www.blackhat.com/docs/us-16/materials/us-16-Jang-Breaking-Kernel-Address-Space-Layout-Randomization-KASLR-With-Intel-TSX.pdf [4] https://github.com/IAIK/KAISER [5] https://gruss.cc/files/kaiser.pdf [patch based also on https://raw.githubusercontent.com/IAIK/KAISER/master/KAISER/0001-KAISER-Kernel-Address-Isolation.patch] Signed-off-by: Richard Fellner <richard.fellner@student.tugraz.at> Signed-off-by: Moritz Lipp <moritz.lipp@iaik.tugraz.at> Signed-off-by: Daniel Gruss <daniel.gruss@iaik.tugraz.at> Signed-off-by: Michael Schwarz <michael.schwarz@iaik.tugraz.at> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-05ANDROID: initramfs: call free_initrd() when skipping initNick Bray
Memory allocated for initrd would not be reclaimed if initializing ramfs was skipped. Bug: 69901741 Test: "grep MemTotal /proc/meminfo" increases by a few MB on an Android device with a/b boot. Change-Id: Ifbe094d303ed12cfd6de6aa004a8a19137a2f58a Signed-off-by: Nick Bray <ncbray@google.com>
2017-08-09UPSTREAM: sched/core: Add try_get_task_stack() and put_task_stack()Andy Lutomirski
There are a few places in the kernel that access stack memory belonging to a different task. Before we can start freeing task stacks before the task_struct is freed, we need a way for those code paths to pin the stack. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jann Horn <jann@thejh.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/17a434f50ad3d77000104f21666575e10a9c1fbd.1474003868.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Bug: 38331309 Change-Id: I414853e9b72ecb0967d5e1cbfc77b4929bf3f4f5 (cherry picked from commit c6c314a613cd7d03fb97713e0d642b493de42e69) Signed-off-by: Zubin Mithra <zsm@google.com>
2017-08-09UPSTREAM: sched/core: Allow putting thread_info into task_structAndy Lutomirski
If an arch opts in by setting CONFIG_THREAD_INFO_IN_TASK_STRUCT, then thread_info is defined as a single 'u32 flags' and is the first entry of task_struct. thread_info::task is removed (it serves no purpose if thread_info is embedded in task_struct), and thread_info::cpu gets its own slot in task_struct. This is heavily based on a patch written by Linus. Originally-from: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jann Horn <jann@thejh.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a0898196f0476195ca02713691a5037a14f2aac5.1473801993.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Bug: 38331309 Change-Id: I25e5a830f2ada5e74fa93661e97e5e701b1b70d2 (cherry picked from commit c65eacbe290b8141554c71b2c94489e73ade8c8d) Signed-off-by: Zubin Mithra <zsm@google.com>
2017-08-09UPSTREAM: Clarify naming of thread info/stack allocatorsLinus Torvalds
We've had the thread info allocated together with the thread stack for most architectures for a long time (since the thread_info was split off from the task struct), but that is about to change. But the patches that move the thread info to be off-stack (and a part of the task struct instead) made it clear how confused the allocator and freeing functions are. Because the common case was that we share an allocation with the thread stack and the thread_info, the two pointers were identical. That identity then meant that we would have things like ti = alloc_thread_info_node(tsk, node); ... tsk->stack = ti; which certainly _worked_ (since stack and thread_info have the same value), but is rather confusing: why are we assigning a thread_info to the stack? And if we move the thread_info away, the "confusing" code just gets to be entirely bogus. So remove all this confusion, and make it clear that we are doing the stack allocation by renaming and clarifying the function names to be about the stack. The fact that the thread_info then shares the allocation is an implementation detail, and not really about the allocation itself. This is a pure renaming and type fix: we pass in the same pointer, it's just that we clarify what the pointer means. The ia64 code that actually only has one single allocation (for all of task_struct, thread_info and kernel thread stack) now looks a bit odd, but since "tsk->stack" is actually not even used there, that oddity doesn't matter. It would be a separate thing to clean that up, I intentionally left the ia64 changes as a pure brute-force renaming and type change. Acked-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Bug: 38331309 Change-Id: I870b5476fc900c9145134f9dd3ed18a32a490162 (cherry picked from commit b235beea9e996a4d36fed6cfef4801a3e7d7a9a5) Signed-off-by: Zubin Mithra <zsm@google.com>
2017-04-26ANDROID: Update init/do_mounts_dm.c to the latest ChromiumOS version.David Zeuthen
This is needed for AVB integration work. Bug: 31796270 Test: Manually tested (other arch). Change-Id: I32fd37c1578c6414e3e6ff277d16ad94df7886b8 Signed-off-by: David Zeuthen <zeuthen@google.com> Git-commit: 6a6a7657c231e947233c43ae0522bbd4edf0139e Git-repo: https://android.googlesource.com/kernel/common/ Signed-off-by: Runmin Wang <runminw@codeaurora.org>
2017-04-25ANDROID: Update init/do_mounts_dm.c to the latest ChromiumOS version.David Zeuthen
This is needed for AVB integration work. Bug: 31796270 Test: Manually tested (other arch). Change-Id: I32fd37c1578c6414e3e6ff277d16ad94df7886b8 Signed-off-by: David Zeuthen <zeuthen@google.com>
2017-01-16ANDROID: sched/walt: fix build failure if FAIR_GROUP_SCHED=nAmit Pundir
Fix SCHED_WALT dependency on FAIR_GROUP_SCHED otherwise we run into following build failure: CC kernel/sched/walt.o kernel/sched/walt.c: In function 'walt_inc_cfs_cumulative_runnable_avg': kernel/sched/walt.c:148:8: error: 'struct cfs_rq' has no member named 'cumulative_runnable_avg' cfs_rq->cumulative_runnable_avg += p->ravg.demand; ^ kernel/sched/walt.c: In function 'walt_dec_cfs_cumulative_runnable_avg': kernel/sched/walt.c:154:8: error: 'struct cfs_rq' has no member named 'cumulative_runnable_avg' cfs_rq->cumulative_runnable_avg -= p->ravg.demand; ^ Reported-at: https://bugs.linaro.org/show_bug.cgi?id=2793 Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2017-01-02ANDROID: sched/walt: fix build failure if FAIR_GROUP_SCHED=nAmit Pundir
Fix SCHED_WALT dependency on FAIR_GROUP_SCHED otherwise we run into following build failure: CC kernel/sched/walt.o kernel/sched/walt.c: In function 'walt_inc_cfs_cumulative_runnable_avg': kernel/sched/walt.c:148:8: error: 'struct cfs_rq' has no member named 'cumulative_runnable_avg' cfs_rq->cumulative_runnable_avg += p->ravg.demand; ^ kernel/sched/walt.c: In function 'walt_dec_cfs_cumulative_runnable_avg': kernel/sched/walt.c:154:8: error: 'struct cfs_rq' has no member named 'cumulative_runnable_avg' cfs_rq->cumulative_runnable_avg -= p->ravg.demand; ^ Reported-at: https://bugs.linaro.org/show_bug.cgi?id=2793 Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
2016-10-28Revert "init: do_mounts: Add a dummy definition for dm_table_put"Vikram Mulukutla
We no longer require dm_table_put. Upstream fixed the dm_mounts driver to remove this symbol. Change-Id: I4ba1043965d25ec444a833283392ac2394c845f3 Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-10-28ANDROID: dm: Rebase on top of 4.1Badhri Jagan Sridharan
1. "dm: optimize use SRCU and RCU" removes the use of dm_table_put. 2. "dm: remove request-based logic from make_request_fn wrapper" necessitates calling dm_setup_md_queue or else the request_queue's make_request_fn pointer ends being unset. [ 7.711600] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP [ 7.717519] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 4.1.15-02273-gb057d16-dirty #33 [ 7.726559] Hardware name: HiKey Development Board (DT) [ 7.731779] task: ffffffc005f8acc0 ti: ffffffc005f8c000 task.ti: ffffffc005f8c000 [ 7.739257] PC is at 0x0 [ 7.741787] LR is at generic_make_request+0x8c/0x108 .... [ 9.082931] Call trace: [ 9.085372] [< (null)>] (null) [ 9.090074] [<ffffffc0003f4ac0>] submit_bio+0x98/0x1e0 [ 9.095212] [<ffffffc0001e2618>] _submit_bh+0x120/0x1f0 [ 9.096165] cfg80211: Calling CRDA to update world regulatory domain [ 9.106781] [<ffffffc0001e5450>] __bread_gfp+0x94/0x114 [ 9.112004] [<ffffffc00024a748>] ext4_fill_super+0x18c/0x2d64 [ 9.117750] [<ffffffc0001b275c>] mount_bdev+0x194/0x1c0 [ 9.122973] [<ffffffc0002450dc>] ext4_mount+0x14/0x1c [ 9.128021] [<ffffffc0001b29a0>] mount_fs+0x3c/0x194 [ 9.132985] [<ffffffc0001d059c>] vfs_kern_mount+0x4c/0x134 [ 9.138467] [<ffffffc0001d2168>] do_mount+0x204/0xbbc [ 9.143514] [<ffffffc0001d2ec4>] SyS_mount+0x94/0xe8 [ 9.148479] [<ffffffc000c54074>] mount_block_root+0x120/0x24c [ 9.154222] [<ffffffc000c543e8>] mount_root+0x110/0x12c [ 9.159443] [<ffffffc000c54574>] prepare_namespace+0x170/0x1b8 [ 9.165273] [<ffffffc000c53d98>] kernel_init_freeable+0x23c/0x260 [ 9.171365] [<ffffffc0009b1748>] kernel_init+0x10/0x118 [ 9.176589] Code: bad PC value [ 9.179807] ---[ end trace 75e1bc52ba364d13 ]--- Bug: 27175947 Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com> Change-Id: I952d86fd1475f0825f9be1386e3497b36127abd0 Git-commit: 5a77db78398a7247ab6bb6ff9720bae19d1f9ff4 Git-repo: https://android.googlesource.com/kernel/common Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-10-28CHROMIUM: dm: boot time specification of dm=Will Drewry
This is a wrap-up of three patches pending upstream approval. I'm bundling them because they are interdependent, and it'll be easier to drop it on rebase later. 1. dm: allow a dm-fs-style device to be shared via dm-ioctl Integrates feedback from Alisdair, Mike, and Kiyoshi. Two main changes occur here: - One function is added which allows for a programmatically created mapped device to be inserted into the dm-ioctl hash table. This binds the device to a name and, optional, uuid which is needed by udev and allows for userspace management of the mapped device. - dm_table_complete() was extended to handle all of the final functional changes required for the table to be operational once called. 2. init: boot to device-mapper targets without an initr* Add a dm= kernel parameter modeled after the md= parameter from do_mounts_md. It allows for device-mapper targets to be configured at boot time for use early in the boot process (as the root device or otherwise). It also replaces /dev/XXX calls with major:minor opportunistically. The format is dm="name uuid ro,table line 1,table line 2,...". The parser expects the comma to be safe to use as a newline substitute but, otherwise, uses the normal separator of space. Some attempt has been made to make it forgiving of additional spaces (using skip_spaces()). A mapped device created during boot will be assigned a minor of 0 and may be access via /dev/dm-0. An example dm-linear root with no uuid may look like: root=/dev/dm-0 dm="lroot none ro, 0 4096 linear /dev/ubdb 0, 4096 4096 linear /dv/ubdc 0" Once udev is started, /dev/dm-0 will become /dev/mapper/lroot. Older upstream threads: http://marc.info/?l=dm-devel&m=127429492521964&w=2 http://marc.info/?l=dm-devel&m=127429499422096&w=2 http://marc.info/?l=dm-devel&m=127429493922000&w=2 Latest upstream threads: https://patchwork.kernel.org/patch/104859/ https://patchwork.kernel.org/patch/104860/ https://patchwork.kernel.org/patch/104861/ Bug: 27175947 Signed-off-by: Will Drewry <wad@chromium.org> Review URL: http://codereview.chromium.org/2020011 Change-Id: I92bd53432a11241228d2e5ac89a3b20d19b05a31 Git-commit: 96b0434c25be8c73822a679be35b29ca5263baea Git-repo: https://android.googlesource.com/kernel/common Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-10-28init: do_mounts: Add a dummy definition for dm_table_putVikram Mulukutla
To temporarily allow compilation of an upcoming dm change, add a dummy dm_table_put definition. Change-Id: Iceca2eb6daa55f0acb936eafe1d59f65f7cfcd55 Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-10-12sched: Add Kconfig option DEFAULT_USE_ENERGY_AWARE to set ENERGY_AWARE ↵John Stultz
feature flag The ENERGY_AWARE sched feature flag cannot be set unless CONFIG_SCHED_DEBUG is enabled. So this patch allows the flag to default to true at build time if the config is set. Change-Id: I8835a571fdb7a8f8ee6a54af1e11a69f3b5ce8e6 Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-10-10sched/tune: add initial support for CGroups based boostingPatrick Bellasi
To support task performance boosting, the usage of a single knob has the advantage to be a simple solution, both from the implementation and the usability standpoint. However, on a real system it can be difficult to identify a single value for the knob which fits the needs of multiple different tasks. For example, some kernel threads and/or user-space background services should be better managed the "standard" way while we still want to be able to boost the performance of specific workloads. In order to improve the flexibility of the task boosting mechanism this patch is the first of a small series which extends the previous implementation to introduce a "per task group" support. This first patch introduces just the basic CGroups support, a new "schedtune" CGroups controller is added which allows to configure different boost value for different groups of tasks. To keep the implementation simple but still effective for a boosting strategy, the new controller: 1. allows only a two layer hierarchy 2. supports only a limited number of boost groups A two layer hierarchy allows to place each task either: a) in the root control group thus being subject to a system-wide boosting value b) in a child of the root group thus being subject to the specific boost value defined by that "boost group" The limited number of "boost groups" supported is mainly motivated by the observation that in a real system it could be useful to have only few classes of tasks which deserve different treatment. For example, background vs foreground or interactive vs low-priority. As an additional benefit, a limited number of boost groups allows also to have a simpler implementation especially for the code required to compute the boost value for CPUs which have runnable tasks belonging to different boost groups. Change-Id: I1304e33a8440bfdad9c8bcf8129ff390216f2e32 cc: Tejun Heo <tj@kernel.org> cc: Li Zefan <lizefan@huawei.com> cc: Johannes Weiner <hannes@cmpxchg.org> cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Git-commit: 13001f47c9a610705219700af4636386b647e231 Git-repo: https://android.googlesource.com/kernel/common Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-10-09init: Move stack canary initialization after setup_archLaura Abbott
Stack canary initialization involves getting a random number. Getting this random number may involve accessing caches or other architectural specific features which are not available until after the architecture is setup. Move the stack canary initialization later to accommodate this. Change-Id: I00b564a2c3172229a44339c061fa380c17fe7d8e Signed-off-by: Laura Abbott <lauraa@codeaurora.org> [ohaugan@codeaurora.org: Fix trivial merge conflict] Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-10-05sched/tune: add sysctl interface to define a boost valuePatrick Bellasi
The current (CFS) scheduler implementation does not allow "to boost" tasks performance by running them at a higher OPP compared to the minimum required to meet their workload demands. To support tasks performance boosting the scheduler should provide a "knob" which allows to tune how much the system is going to be optimised for energy efficiency vs performance. This patch is the first of a series which provides a simple interface to define a tuning knob. One system-wide "boost" tunable is exposed via: /proc/sys/kernel/sched_cfs_boost which can be configured in the range [0..100], to define a percentage where: - 0% boost requires to operate in "standard" mode by scheduling tasks at the minimum capacities required by the workload demand - 100% boost requires to push at maximum the task performances, "regardless" of the incurred energy consumption A boost value in between these two boundaries is used to bias the power/performance trade-off, the higher the boost value the more the scheduler is biased toward performance boosting instead of energy efficiency. Change-Id: I59a41725e2d8f9238a61dfb0c909071b53560fc0 cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Git-commit: 63c8fad2b06805ef88f1220551289f0a3c3529f1 Git-repo: https://source.codeaurora.org/quic/la/kernel/msm-4.4 Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
2016-09-24core_ctrl: Move core control into kernelOlav Haugan
Move core control from out-of-tree module into the kernel proper. Core control monitors load on CPUs and controls how many CPUs are available for the system to use at any point in time. This can help save power. Core control can be configured through sysfs interface. Change-Id: Ia78e701468ea3828195c2a15c9cf9fafd099804a Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-09-23UPSTREAM: mm/init: Add 'rodata=off' boot cmdline parameter to disable ↵Kees Cook
read-only kernel mappings It may be useful to debug writes to the readonly sections of memory, so provide a cmdline "rodata=off" to allow for this. This can be expanded in the future to support "log" and "write" modes, but that will need to be architecture-specific. This also makes KDB software breakpoints more usable, as read-only mappings can now be disabled on any kernel. Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Brown <david.brown@linaro.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Emese Revfy <re.emese@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathias Krause <minipli@googlemail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: PaX Team <pageexec@freemail.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-hardening@lists.openwall.com Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1455748879-21872-3-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Bug: 31660652 Change-Id: I67b818ca390afdd42ab1c27cb4f8ac64bbdb3b65 (cherry picked from commit d2aa1acad22f1bdd0cfa67b3861800e392254454) Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2016-09-21sched: Add Kconfig option DEFAULT_USE_ENERGY_AWARE to set ENERGY_AWARE ↵John Stultz
feature flag The ENERGY_AWARE sched feature flag cannot be set unless CONFIG_SCHED_DEBUG is enabled. So this patch allows the flag to default to true at build time if the config is set. Change-Id: I8835a571fdb7a8f8ee6a54af1e11a69f3b5ce8e6 Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-09-14sched: Introduce Window Assisted Load Tracking (WALT)Srivatsa Vaddagiri
use a window based view of time in order to track task demand and CPU utilization in the scheduler. Window Assisted Load Tracking (WALT) implementation credits: Srivatsa Vaddagiri, Steve Muckle, Syed Rameez Mustafa, Joonwoo Park, Pavan Kumar Kondeti, Olav Haugan 2016-03-06: Integration with EAS/refactoring by Vikram Mulukutla and Todd Kjos Change-Id: I21408236836625d4e7d7de1843d20ed5ff36c708 Includes fixes for issues: eas/walt: Use walt_ktime_clock() instead of ktime_get_ns() to avoid a race resulting in watchdog resets BUG: 29353986 Change-Id: Ic1820e22a136f7c7ebd6f42e15f14d470f6bbbdb Handle walt accounting anomoly during resume During resume, there is a corner case where on wakeup, a task's prev_runnable_sum can go negative. This is a workaround that fixes the condition and warns (instead of crashing). BUG: 29464099 Change-Id: I173e7874324b31a3584435530281708145773508 Signed-off-by: Todd Kjos <tkjos@google.com> Signed-off-by: Srinath Sridharan <srinathsr@google.com> Signed-off-by: Juri Lelli <juri.lelli@arm.com> [jstultz: fwdported to 4.4] Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-09-14FIXUP: sched: fix build for non-SMP targetPatrick Bellasi
Currently the build for a single-core (e.g. user-mode) Linux is broken and this configuration is required (at least) to run some network tests. The main issues for the current code support on single-core systems are: 1. {se,rq}::sched_avg is not available nor maintained for !SMP systems This means that load and utilisation signals are NOT available in single core systems. All the EAS code depends on these signals. 2. sched_group_energy is also SMP dependant. Again this means that all the EAS setup and preparation code (energyn model initialization) has to be properly guarded/disabled for !SMP systems. 3. SchedFreq depends on utilization signal, which is not available on !SMP systems. 4. SchedTune is useless on unicore systems if SchedFreq is not available. 5. WALT machinery is not required on single-core systems. This patch addresses all these issues by enforcing some constraints for single-core systems: a) WALT, SchedTune and SchedTune are now dependant on SMP b) The default governor for !SMP systems is INTERACTIVE c) The energy model initialisation/build functions are d) Other minor code re-arrangements and CONFIG_SMP guarding to enable single core builds. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>