summaryrefslogtreecommitdiff
path: root/arch/s390/include (unfollow)
Commit message (Collapse)Author
2022-04-19soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPFCraig Gallek
Expose socket options for setting a classic or extended BPF program for use when selecting sockets in an SO_REUSEPORT group. These options can be used on the first socket to belong to a group before bind or on any socket in the group after bind. This change includes refactoring of the existing sk_filter code to allow reuse of the existing BPF filter validation checks. Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chatur27 <jasonbright2709@gmail.com> Change-Id: I4433dfd5d865bb69e8d3cab55cc68325829e8b49
2021-12-08hugetlbfs: flush TLBs correctly after huge_pmd_unshareNadav Amit
commit a4a118f2eead1d6c49e00765de89878288d4b890 upstream. When __unmap_hugepage_range() calls to huge_pmd_unshare() succeed, a TLB flush is missing. This TLB flush must be performed before releasing the i_mmap_rwsem, in order to prevent an unshared PMDs page from being released and reused before the TLB flush took place. Arguably, a comprehensive solution would use mmu_gather interface to batch the TLB flushes and the PMDs page release, however it is not an easy solution: (1) try_to_unmap_one() and try_to_migrate_one() also call huge_pmd_unshare() and they cannot use the mmu_gather interface; and (2) deferring the release of the page reference for the PMDs page until after i_mmap_rwsem is dropeed can confuse huge_pmd_unshare() into thinking PMDs are shared when they are not. Fix __unmap_hugepage_range() by adding the missing TLB flush, and forcing a flush when unshare is successful. Fixes: 24669e58477e ("hugetlb: use mmu_gather instead of a temporary linked list for accumulating pages)" # 3.6 Signed-off-by: Nadav Amit <namit@vmware.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-28s390/ftrace: fix ftrace_update_ftrace_func implementationVasily Gorbik
commit f8c2602733c953ed7a16e060640b8e96f9d94b9b upstream. s390 enforces DYNAMIC_FTRACE if FUNCTION_TRACER is selected. At the same time implementation of ftrace_caller is not compliant with HAVE_DYNAMIC_FTRACE since it doesn't provide implementation of ftrace_update_ftrace_func() and calls ftrace_trace_function() directly. The subtle difference is that during ftrace code patching ftrace replaces function tracer via ftrace_update_ftrace_func() and activates it back afterwards. Unexpected direct calls to ftrace_trace_function() during ftrace code patching leads to nullptr-dereferences when tracing is activated for one of functions which are used during code patching. Those function currently are: copy_from_kernel_nofault() copy_from_kernel_nofault_allowed() preempt_count_sub() [with debug_defconfig] preempt_count_add() [with debug_defconfig] Corresponding KASAN report: BUG: KASAN: nullptr-dereference in function_trace_call+0x316/0x3b0 Read of size 4 at addr 0000000000001e08 by task migration/0/15 CPU: 0 PID: 15 Comm: migration/0 Tainted: G B 5.13.0-41423-g08316af3644d Hardware name: IBM 3906 M04 704 (LPAR) Stopper: multi_cpu_stop+0x0/0x3e0 <- stop_machine_cpuslocked+0x1e4/0x218 Call Trace: [<0000000001f77caa>] show_stack+0x16a/0x1d0 [<0000000001f8de42>] dump_stack+0x15a/0x1b0 [<0000000001f81d56>] print_address_description.constprop.0+0x66/0x2e0 [<000000000082b0ca>] kasan_report+0x152/0x1c0 [<00000000004cfd8e>] function_trace_call+0x316/0x3b0 [<0000000001fb7082>] ftrace_caller+0x7a/0x7e [<00000000006bb3e6>] copy_from_kernel_nofault_allowed+0x6/0x10 [<00000000006bb42e>] copy_from_kernel_nofault+0x3e/0xd0 [<000000000014605c>] ftrace_make_call+0xb4/0x1f8 [<000000000047a1b4>] ftrace_replace_code+0x134/0x1d8 [<000000000047a6e0>] ftrace_modify_all_code+0x120/0x1d0 [<000000000047a7ec>] __ftrace_modify_code+0x5c/0x78 [<000000000042395c>] multi_cpu_stop+0x224/0x3e0 [<0000000000423212>] cpu_stopper_thread+0x33a/0x5a0 [<0000000000243ff2>] smpboot_thread_fn+0x302/0x708 [<00000000002329ea>] kthread+0x342/0x408 [<00000000001066b2>] __ret_from_fork+0x92/0xf0 [<0000000001fb57fa>] ret_from_fork+0xa/0x30 The buggy address belongs to the page: page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1 flags: 0x1ffff00000001000(reserved|node=0|zone=0|lastcpupid=0x1ffff) raw: 1ffff00000001000 0000040000000048 0000040000000048 0000000000000000 raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: 0000000000001d00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 0000000000001d80: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 >0000000000001e00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 ^ 0000000000001e80: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 0000000000001f00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 ================================================================== To fix that introduce ftrace_func callback to be called from ftrace_caller and update it in ftrace_update_ftrace_func(). Fixes: 4cc9bed034d1 ("[S390] cleanup ftrace backend functions") Cc: stable@vger.kernel.org Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-12s390: don't trace preemption in percpu macrosSven Schnelle
[ Upstream commit 1196f12a2c960951d02262af25af0bb1775ebcc2 ] Since commit a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables") the lockdep code itself uses percpu variables. This leads to recursions because the percpu macros are calling preempt_enable() which might call trace_preempt_on(). Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-22KVM: s390: reduce number of IO pins to 1Christian Borntraeger
[ Upstream commit 774911290c589e98e3638e73b24b0a4d4530e97c ] The current number of KVM_IRQCHIP_NUM_PINS results in an order 3 allocation (32kb) for each guest start/restart. This can result in OOM killer activity even with free swap when the memory is fragmented enough: kernel: qemu-system-s39 invoked oom-killer: gfp_mask=0x440dc0(GFP_KERNEL_ACCOUNT|__GFP_COMP|__GFP_ZERO), order=3, oom_score_adj=0 kernel: CPU: 1 PID: 357274 Comm: qemu-system-s39 Kdump: loaded Not tainted 5.4.0-29-generic #33-Ubuntu kernel: Hardware name: IBM 8562 T02 Z06 (LPAR) kernel: Call Trace: kernel: ([<00000001f848fe2a>] show_stack+0x7a/0xc0) kernel: [<00000001f8d3437a>] dump_stack+0x8a/0xc0 kernel: [<00000001f8687032>] dump_header+0x62/0x258 kernel: [<00000001f8686122>] oom_kill_process+0x172/0x180 kernel: [<00000001f8686abe>] out_of_memory+0xee/0x580 kernel: [<00000001f86e66b8>] __alloc_pages_slowpath+0xd18/0xe90 kernel: [<00000001f86e6ad4>] __alloc_pages_nodemask+0x2a4/0x320 kernel: [<00000001f86b1ab4>] kmalloc_order+0x34/0xb0 kernel: [<00000001f86b1b62>] kmalloc_order_trace+0x32/0xe0 kernel: [<00000001f84bb806>] kvm_set_irq_routing+0xa6/0x2e0 kernel: [<00000001f84c99a4>] kvm_arch_vm_ioctl+0x544/0x9e0 kernel: [<00000001f84b8936>] kvm_vm_ioctl+0x396/0x760 kernel: [<00000001f875df66>] do_vfs_ioctl+0x376/0x690 kernel: [<00000001f875e304>] ksys_ioctl+0x84/0xb0 kernel: [<00000001f875e39a>] __s390x_sys_ioctl+0x2a/0x40 kernel: [<00000001f8d55424>] system_call+0xd8/0x2c8 As far as I can tell s390x does not use the iopins as we bail our for anything other than KVM_IRQ_ROUTING_S390_ADAPTER and the chip/pin is only used for KVM_IRQ_ROUTING_IRQCHIP. So let us use a small number to reduce the memory footprint. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200617083620.5409-1-borntraeger@de.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-29s390: fix syscall_get_error for compat processesDmitry V. Levin
commit b3583fca5fb654af2cfc1c08259abb9728272538 upstream. If both the tracer and the tracee are compat processes, and gprs[2] is assigned a value by __poke_user_compat, then the higher 32 bits of gprs[2] are cleared, IS_ERR_VALUE() always returns false, and syscall_get_error() always returns 0. Fix the implementation by sign-extending the value for compat processes the same way as x86 implementation does. The bug was exposed to user space by commit 201766a20e30f ("ptrace: add PTRACE_GET_SYSCALL_INFO request") and detected by strace test suite. This change fixes strace syscall tampering on s390. Link: https://lkml.kernel.org/r/20200602180051.GA2427@altlinux.org Fixes: 753c4dd6a2fa2 ("[S390] ptrace changes") Cc: Elvira Khabirova <lineprinter@altlinux.org> Cc: stable@vger.kernel.org # v2.6.28+ Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-01uapi: export all headers under uapi directoriesNicolas Dichtel
Regularly, when a new header is created in include/uapi/, the developer forgets to add it in the corresponding Kbuild file. This error is usually detected after the release is out. In fact, all headers under uapi directories should be exported, thus it's useless to have an exhaustive list. After this patch, the following files, which were not exported, are now exported (with make headers_install_all): asm-arc/kvm_para.h asm-arc/ucontext.h asm-blackfin/shmparam.h asm-blackfin/ucontext.h asm-c6x/shmparam.h asm-c6x/ucontext.h asm-cris/kvm_para.h asm-h8300/shmparam.h asm-h8300/ucontext.h asm-hexagon/shmparam.h asm-m32r/kvm_para.h asm-m68k/kvm_para.h asm-m68k/shmparam.h asm-metag/kvm_para.h asm-metag/shmparam.h asm-metag/ucontext.h asm-mips/hwcap.h asm-mips/reg.h asm-mips/ucontext.h asm-nios2/kvm_para.h asm-nios2/ucontext.h asm-openrisc/shmparam.h asm-parisc/kvm_para.h asm-powerpc/perf_regs.h asm-sh/kvm_para.h asm-sh/ucontext.h asm-tile/shmparam.h asm-unicore32/shmparam.h asm-unicore32/ucontext.h asm-x86/hwcap2.h asm-xtensa/kvm_para.h drm/armada_drm.h drm/etnaviv_drm.h drm/vgem_drm.h linux/aspeed-lpc-ctrl.h linux/auto_dev-ioctl.h linux/bcache.h linux/btrfs_tree.h linux/can/vxcan.h linux/cifs/cifs_mount.h linux/coresight-stm.h linux/cryptouser.h linux/fsmap.h linux/genwqe/genwqe_card.h linux/hash_info.h linux/kcm.h linux/kcov.h linux/kfd_ioctl.h linux/lightnvm.h linux/module.h linux/nbd-netlink.h linux/nilfs2_api.h linux/nilfs2_ondisk.h linux/nsfs.h linux/pr.h linux/qrtr.h linux/rpmsg.h linux/sched/types.h linux/sed-opal.h linux/smc.h linux/smc_diag.h linux/stm.h linux/switchtec_ioctl.h linux/vfio_ccw.h linux/wil6210_uapi.h rdma/bnxt_re-abi.h Note that I have removed from this list the files which are generated in every exported directories (like .install or .install.cmd). Thanks to Julien Floret <julien.floret@6wind.com> for the tip to get all subdirs with a pure makefile command. For the record, note that exported files for asm directories are a mix of files listed by: - include/uapi/asm-generic/Kbuild.asm; - arch/<arch>/include/uapi/asm/Kbuild; - arch/<arch>/include/asm/Kbuild. Change-Id: I132df74f736b8f35f77390eaa12804e74ef536ee Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Mark Salter <msalter@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Git-Commit: fcc8487d477a3452a1d0ccbdd4c5e0e1e3cb8bed Git-Repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git [bharad@codeaurora.org: resolve trivial merge conflicts] Signed-off-by: Naitik Bharadiya <bharad@codeaurora.org> [schikk@codeaurora.org: resolve trivial merge conflicts] Signed-off-by: Swetha Chikkaboraiah <schikk@codeaurora.org>
2020-02-28s390/time: Fix clk type in get_tod_clockNathan Chancellor
commit 0f8a206df7c920150d2aa45574fba0ab7ff6be4f upstream. Clang warns: In file included from ../arch/s390/boot/startup.c:3: In file included from ../include/linux/elf.h:5: In file included from ../arch/s390/include/asm/elf.h:132: In file included from ../include/linux/compat.h:10: In file included from ../include/linux/time.h:74: In file included from ../include/linux/time32.h:13: In file included from ../include/linux/timex.h:65: ../arch/s390/include/asm/timex.h:160:20: warning: passing 'unsigned char [16]' to parameter of type 'char *' converts between pointers to integer types with different sign [-Wpointer-sign] get_tod_clock_ext(clk); ^~~ ../arch/s390/include/asm/timex.h:149:44: note: passing argument to parameter 'clk' here static inline void get_tod_clock_ext(char *clk) ^ Change clk's type to just be char so that it matches what happens in get_tod_clock_ext. Fixes: 57b28f66316d ("[S390] s390_hypfs: Add new attributes") Link: https://github.com/ClangBuiltLinux/linux/issues/861 Link: http://lkml.kernel.org/r/20200208140858.47970-1-natechancellor@gmail.com Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-21s390: fix stfle zero paddingHeiko Carstens
commit 4f18d869ffd056c7858f3d617c71345cf19be008 upstream. The stfle inline assembly returns the number of double words written (condition code 0) or the double words it would have written (condition code 3), if the memory array it got as parameter would have been large enough. The current stfle implementation assumes that the array is always large enough and clears those parts of the array that have not been written to with a subsequent memset call. If however the array is not large enough memset will get a negative length parameter, which means that memset clears memory until it gets an exception and the kernel crashes. To fix this simply limit the maximum length. Move also the inline assembly to an extra function to avoid clobbering of register 0, which might happen because of the added min_t invocation together with code instrumentation. The bug was introduced with commit 14375bc4eb8d ("[S390] cleanup facility list handling") but was rather harmless, since it would only write to a rather large array. It became a potential problem with commit 3ab121ab1866 ("[S390] kernel: Add z/VM LGR detection"). Since then it writes to an array with only four double words, while some machines already deliver three double words. As soon as machines have a facility bit within the fifth double a crash on IPL would happen. Fixes: 14375bc4eb8d ("[S390] cleanup facility list handling") Cc: <stable@vger.kernel.org> # v2.6.37+ Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05s390/qdio: reset old sbal_state flagsJulian Wiedmann
commit 64e03ff72623b8c2ea89ca3cb660094e019ed4ae upstream. When allocating a new AOB fails, handle_outbound() is still capable of transmitting the selected buffer (just without async completion). But if a previous transfer on this queue slot used async completion, its sbal_state flags field is still set to QDIO_OUTBUF_STATE_FLAG_PENDING. So when the upper layer driver sees this stale flag, it expects an async completion that never happens. Fix this by unconditionally clearing the flags field. Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks") Cc: <stable@vger.kernel.org> #v3.2+ Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-06perf: fix invalid bit in diagnostic entryThomas Richter
[ Upstream commit 3c0a83b14ea71fef5ccc93a3bd2de5f892be3194 ] The s390 CPU measurement facility sampling mode supports basic entries and diagnostic entries. Each entry has a valid bit to indicate the status of the entry as valid or invalid. This bit is bit 31 in the diagnostic entry, but the bit mask definition refers to bit 30. Fix this by making the reserved field one bit larger. Fixes: 7e75fc3ff4cf ("s390/cpum_sf: Add raw data sampling to support the diagnostic-sampling function") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-06s390/cpum_sf: Add data entry sizes to sampling trailer entryThomas Richter
[ Upstream commit 77715b7ddb446bd39a06f3376e85f4bb95b29bb8 ] The CPU Measurement sampling facility creates a trailer entry for each Sample-Data-Block of stored samples. The trailer entry contains the sizes (in bytes) of the stored sampling types: - basic-sampling data entry size - diagnostic-sampling data entry size Both sizes are 2 bytes long. This patch changes the trailer entry definition to reflect this. Fixes: fcc77f507333 ("s390/cpum_sf: Atomically reset trailer entry fields of sample-data-blocks") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30s390/ftrace: use expoline for indirect branchesMartin Schwidefsky
commit 23a4d7fd34856da8218c4cfc23dba7a6ec0a423a upstream. The return from the ftrace_stub, _mcount, ftrace_caller and return_to_handler functions is done with "br %r14" and "br %r1". These are indirect branches as well and need to use execute trampolines for CONFIG_EXPOLINE=y. The ftrace_caller function is a special case as it returns to the start of a function and may only use %r0 and %r1. For a pre z10 machine the standard execute trampoline uses a LARL + EX to do this, but this requires *two* registers in the range %r1..%r15. To get around this the 'br %r1' located in the lowcore is used, then the EX instruction does not need an address register. But the lowcore trick may only be used for pre z14 machines, with noexec=on the mapping for the first page may not contain instructions. The solution for that is an ALTERNATIVE in the expoline THUNK generated by 'GEN_BR_THUNK %r1' to switch to EXRL, this relies on the fact that a machine that supports noexec=on has EXRL as well. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26s390: extend expoline to BC instructionsMartin Schwidefsky
[ Upstream commit 6deaa3bbca804b2a3627fd685f75de64da7be535 ] The BPF JIT uses a 'b <disp>(%r<x>)' instruction in the definition of the sk_load_word and sk_load_half functions. Add support for branch-on-condition instructions contained in the thunk code of an expoline. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26s390: move expoline assembler macros to a headerMartin Schwidefsky
[ Upstream commit 6dd85fbb87d1d6b87a3b1f02ca28d7b2abd2e7ba ] To be able to use the expoline branches in different assembler files move the associated macros from entry.S to a new header nospec-insn.h. While we are at it make the macros a bit nicer to use. Cc: stable@vger.kernel.org # 4.16 Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26s390: add assembler macros for CPU alternativesMartin Schwidefsky
[ Upstream commit fba9eb7946251d6e420df3bdf7bc45195be7be9a ] Add a header with macros usable in assembler files to emit alternative code sequences. It works analog to the alternatives for inline assmeblies in C files, with the same restrictions and capabilities. The syntax is ALTERNATIVE "<default instructions sequence>", \ "<alternative instructions sequence>", \ "<features-bit>" and ALTERNATIVE_2 "<default instructions sequence>", \ "<alternative instructions sqeuence #1>", \ "<feature-bit #1>", "<alternative instructions sqeuence #2>", \ "<feature-bit #2>" Reviewed-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-26futex: Remove duplicated code and fix undefined behaviourJiri Slaby
commit 30d6e0a4190d37740e9447e4e4815f06992dd8c3 upstream. There is code duplicated over all architecture's headers for futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr, and comparison of the result. Remove this duplication and leave up to the arches only the needed assembly which is now in arch_futex_atomic_op_inuser. This effectively distributes the Will Deacon's arm64 fix for undefined behaviour reported by UBSAN to all architectures. The fix was done in commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump. And as suggested by Thomas, check for negative oparg too, because it was also reported to cause undefined behaviour report. Note that s390 removed access_ok check in d12a29703 ("s390/uaccess: remove pointless access_ok() checks") as access_ok there returns true. We introduce it back to the helper for the sake of simplicity (it gets optimized away anyway). Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390] Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64] Cc: linux-mips@linux-mips.org Cc: Rich Felker <dalias@libc.org> Cc: linux-ia64@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: peterz@infradead.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: sparclinux@vger.kernel.org Cc: Jonas Bonn <jonas@southpole.se> Cc: linux-s390@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-hexagon@vger.kernel.org Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-snps-arc@lists.infradead.org Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-xtensa@linux-xtensa.org Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Stafford Horne <shorne@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Richard Henderson <rth@twiddle.net> Cc: Chris Zankel <chris@zankel.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-parisc@vger.kernel.org Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-alpha@vger.kernel.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: correct nospec auto detection init orderMartin Schwidefsky
[ Upstream commit 6a3d1e81a434fc311f224b8be77258bafc18ccc6 ] With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via early_initcall is done *after* the early_param functions. This overwrites any settings done with the nobp/no_spectre_v2/spectre_v2 parameters. The code patching for the kernel is done after the evaluation of the early parameters but before the early_initcall is done. The end result is a kernel image that is patched correctly but the kernel modules are not. Make sure that the nospec auto detection function is called before the early parameters are evaluated and before the code patching is done. Fixes: 6e179d64126b ("s390: add automatic detection of the spectre defense") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: add automatic detection of the spectre defenseMartin Schwidefsky
[ Upstream commit 6e179d64126b909f0b288fa63cdbf07c531e9b1d ] Automatically decide between nobp vs. expolines if the spectre_v2=auto kernel parameter is specified or CONFIG_EXPOLINE_AUTO=y is set. The decision made at boot time due to CONFIG_EXPOLINE_AUTO=y being set can be overruled with the nobp, nospec and spectre_v2 kernel parameters. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: introduce execute-trampolines for branchesMartin Schwidefsky
[ Upstream commit f19fbd5ed642dc31c809596412dab1ed56f2f156 ] Add CONFIG_EXPOLINE to enable the use of the new -mindirect-branch= and -mfunction_return= compiler options to create a kernel fortified against the specte v2 attack. With CONFIG_EXPOLINE=y all indirect branches will be issued with an execute type instruction. For z10 or newer the EXRL instruction will be used, for older machines the EX instruction. The typical indirect call basr %r14,%r1 is replaced with a PC relative call to a new thunk brasl %r14,__s390x_indirect_jump_r1 The thunk contains the EXRL/EX instruction to the indirect branch __s390x_indirect_jump_r1: exrl 0,0f j . 0: br %r1 The detour via the execute type instruction has a performance impact. To get rid of the detour the new kernel parameter "nospectre_v2" and "spectre_v2=[on,off,auto]" can be used. If the parameter is specified the kernel and module code will be patched at runtime. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: run user space and KVM guests with modified branch predictionMartin Schwidefsky
[ Upstream commit 6b73044b2b0081ee3dd1cd6eaab7dee552601efb ] Define TIF_ISOLATE_BP and TIF_ISOLATE_BP_GUEST and add the necessary plumbing in entry.S to be able to run user space and KVM guests with limited branch prediction. To switch a user space process to limited branch prediction the s390_isolate_bp() function has to be call, and to run a vCPU of a KVM guest associated with the current task with limited branch prediction call s390_isolate_bp_guest(). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: add options to change branch prediction behaviour for the kernelMartin Schwidefsky
[ Upstream commit d768bd892fc8f066cd3aa000eb1867bcf32db0ee ] Add the PPA instruction to the system entry and exit path to switch the kernel to a different branch prediction behaviour. The instructions are added via CPU alternatives and can be disabled with the "nospec" or the "nobp=0" kernel parameter. If the default behaviour selected with CONFIG_KERNEL_NOBP is set to "n" then the "nobp=1" parameter can be used to enable the changed kernel branch prediction. Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390/alternative: use a copy of the facility bit maskMartin Schwidefsky
[ Upstream commit cf1489984641369611556bf00c48f945c77bcf02 ] To be able to switch off specific CPU alternatives with kernel parameters make a copy of the facility bit mask provided by STFLE and use the copy for the decision to apply an alternative. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: add optimized array_index_mask_nospecMartin Schwidefsky
[ Upstream commit e2dd833389cc4069a96b57bdd24227b5f52288f5 ] Add an optimized version of the array_index_mask_nospec function for s390 based on a compare and a subtract with borrow. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29KVM: s390: wire up bpb featureChristian Borntraeger
[ Upstream commit 35b3fde6203b932b2b1a5b53b3d8808abc9c4f60 ] The new firmware interfaces for branch prediction behaviour changes are transparently available for the guest. Nevertheless, there is new state attached that should be migrated and properly resetted. Provide a mechanism for handling reset, migration and VSIE. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [Changed capability number to 152. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: enable CPU alternatives unconditionallyHeiko Carstens
[ Upstream commit 049a2c2d486e8cc82c5cd79fa479c5b105b109e9 ] Remove the CPU_ALTERNATIVES config option and enable the code unconditionally. The config option was only added to avoid a conflict with the named saved segment support. Since that code is gone there is no reason to keep the CPU_ALTERNATIVES config option. Just enable it unconditionally to also reduce the number of config options and make it less likely that something breaks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29s390: introduce CPU alternativesVasily Gorbik
[ Upstream commit 686140a1a9c41d85a4212a1c26d671139b76404b ] Implement CPU alternatives, which allows to optionally patch newer instructions at runtime, based on CPU facilities availability. A new kernel boot parameter "noaltinstr" disables patching. Current implementation is derived from x86 alternatives. Although ideal instructions padding (when altinstr is longer then oldinstr) is added at compile time, and no oldinstr nops optimization has to be done at runtime. Also couple of compile time sanity checks are done: 1. oldinstr and altinstr must be <= 254 bytes long, 2. oldinstr and altinstr must not have an odd length. alternative(oldinstr, altinstr, facility); alternative_2(oldinstr, altinstr1, facility1, altinstr2, facility2); Both compile time and runtime padding consists of either 6/4/2 bytes nop or a jump (brcl) + 2 bytes nop filler if padding is longer then 6 bytes. .altinstructions and .altinstr_replacement sections are part of __init_begin : __init_end region and are freed after initialization. Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-16s390: always save and restore all registers on context switchHeiko Carstens
commit fbbd7f1a51965b50dd12924841da0d478f3da71b upstream. The switch_to() macro has an optimization to avoid saving and restoring register contents that aren't needed for kernel threads. There is however the possibility that a kernel thread execve's a user space program. In such a case the execve'd process can partially see the contents of the previous process, which shouldn't be allowed. To avoid this, simply always save and restore register contents on context switch. Cc: <stable@vger.kernel.org> # v2.6.37+ Fixes: fdb6d070effba ("switch_to: dont restore/save access & fpu regs for kernel threads") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-16Revert "s390/kbuild: enable modversions for symbols exported from asm"Sasha Levin
This reverts commit cabab3f9f5ca077535080b3252e6168935b914af. Not needed for < 4.9. Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-09s390/pci: do not require AIS facilityChristian Borntraeger
[ Upstream commit 48070c73058be6de9c0d754d441ed7092dfc8f12 ] As of today QEMU does not provide the AIS facility to its guest. This prevents Linux guests from using PCI devices as the ais facility is checked during init. As this is just a performance optimization, we can move the ais check into the code where we need it (calling the SIC instruction). This is used at initialization and on interrupt. Both places do not require any serialization, so we can simply skip the instruction. Since we will now get all interrupts, we can also avoid the 2nd scan. As we can have multiple interrupts in parallel we might trigger spurious irqs more often for the non-AIS case but the core code can handle that. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-09s390/runtime instrumentation: simplify task exit handlingHeiko Carstens
commit 8d9047f8b967ce6181fd824ae922978e1b055cc0 upstream. Free data structures required for runtime instrumentation from arch_release_task_struct(). This allows to simplify the code a bit, and also makes the semantics a bit easier: arch_release_task_struct() is never called from the task that is being removed. In addition this allows to get rid of exit_thread() in a later patch. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30s390/kbuild: enable modversions for symbols exported from asmHeiko Carstens
[ Upstream commit cabab3f9f5ca077535080b3252e6168935b914af ] s390 version of commit 334bb7738764 ("x86/kbuild: enable modversions for symbols exported from asm") so we get also rid of all these warnings: WARNING: EXPORT symbol "_mcount" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "memcpy" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "memmove" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "memset" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "save_fpu_regs" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "sie64a" [vmlinux] version generation failed, symbol will not be versioned. WARNING: EXPORT symbol "sie_exit" [vmlinux] version generation failed, symbol will not be versioned. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30s390: fix transactional execution control register handlingHeiko Carstens
commit a1c5befc1c24eb9c1ee83f711e0f21ee79cbb556 upstream. Dan Horák reported the following crash related to transactional execution: User process fault: interruption code 0013 ilc:3 in libpthread-2.26.so[3ff93c00000+1b000] CPU: 2 PID: 1 Comm: /init Not tainted 4.13.4-300.fc27.s390x #1 Hardware name: IBM 2827 H43 400 (z/VM 6.4.0) task: 00000000fafc8000 task.stack: 00000000fafc4000 User PSW : 0705200180000000 000003ff93c14e70 R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3 User GPRS: 0000000000000077 000003ff00000000 000003ff93144d48 000003ff93144d5e 0000000000000000 0000000000000002 0000000000000000 000003ff00000000 0000000000000000 0000000000000418 0000000000000000 000003ffcc9fe770 000003ff93d28f50 000003ff9310acf0 000003ff92b0319a 000003ffcc9fe6d0 User Code: 000003ff93c14e62: 60e0b030 std %f14,48(%r11) 000003ff93c14e66: 60f0b038 std %f15,56(%r11) #000003ff93c14e6a: e5600000ff0e tbegin 0,65294 >000003ff93c14e70: a7740006 brc 7,3ff93c14e7c 000003ff93c14e74: a7080000 lhi %r0,0 000003ff93c14e78: a7f40023 brc 15,3ff93c14ebe 000003ff93c14e7c: b2220000 ipm %r0 000003ff93c14e80: 8800001c srl %r0,28 There are several bugs with control register handling with respect to transactional execution: - on task switch update_per_regs() is only called if the next task has an mm (is not a kernel thread). This however is incorrect. This breaks e.g. for user mode helper handling, where the kernel creates a kernel thread and then execve's a user space program. Control register contents related to transactional execution won't be updated on execve. If the previous task ran with transactional execution disabled then the new task will also run with transactional execution disabled, which is incorrect. Therefore call update_per_regs() unconditionally within switch_to(). - on startup the transactional execution facility is not enabled for the idle thread. This is not really a bug, but an inconsistency to other facilities. Therefore enable the facility if it is available. - on fork the new thread's per_flags field is not cleared. This means that a child process inherits the PER_FLAG_NO_TE flag. This flag can be set with a ptrace request to disable transactional execution for the current process. It should not be inherited by new child processes in order to be consistent with the handling of all other PER related debugging options. Therefore clear the per_flags field in copy_thread_tls(). Reported-and-tested-by: Dan Horák <dan@danny.cz> Fixes: d35339a42dd1 ("s390: add support for transactional memory") Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27s390/syscalls: Fix out of bounds arguments accessJiri Olsa
commit c46fc0424ced3fb71208e72bd597d91b9169a781 upstream. Zorro reported following crash while having enabled syscall tracing (CONFIG_FTRACE_SYSCALLS): Unable to handle kernel pointer dereference at virtual ... Oops: 0011 [#1] SMP DEBUG_PAGEALLOC SNIP Call Trace: ([<000000000024d79c>] ftrace_syscall_enter+0xec/0x1d8) [<00000000001099c6>] do_syscall_trace_enter+0x236/0x2f8 [<0000000000730f1c>] sysc_tracesys+0x1a/0x32 [<000003fffcf946a2>] 0x3fffcf946a2 INFO: lockdep is turned off. Last Breaking-Event-Address: [<000000000022dd44>] rb_event_data+0x34/0x40 ---[ end trace 8c795f86b1b3f7b9 ]--- The crash happens in syscall_get_arguments function for syscalls with zero arguments, that will try to access first argument (args[0]) in event entry, but it's not allocated. Bail out of there are no arguments. Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-21s390: reduce ELF_ET_DYN_BASEKees Cook
commit a73dc5370e153ac63718d850bddf0c9aa9d871e6 upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). For s390 the position could be 0x10000, but that is needlessly close to the NULL address. Link: http://lkml.kernel.org/r/1498154792-49952-5-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-05s390/ctl_reg: make __ctl_load a full memory barrierHeiko Carstens
[ Upstream commit e991c24d68b8c0ba297eeb7af80b1e398e98c33f ] We have quite a lot of code that depends on the order of the __ctl_load inline assemby and subsequent memory accesses, like e.g. disabling lowcore protection and the writing to lowcore. Since the __ctl_load macro does not have memory barrier semantics, nor any other dependencies the compiler is, theoretically, free to shuffle code around. Or in other words: storing to lowcore could happen before lowcore protection is disabled. In order to avoid this class of potential bugs simply add a full memory barrier to the __ctl_load macro. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-27s390/mm: fix CMMA vs KSM vs othersChristian Borntraeger
commit a8f60d1fadf7b8b54449fcc9d6b15248917478ba upstream. On heavy paging with KSM I see guest data corruption. Turns out that KSM will add pages to its tree, where the mapping return true for pte_unused (or might become as such later). KSM will unmap such pages and reinstantiate with different attributes (e.g. write protected or special, e.g. in replace_page or write_protect_page)). This uncovered a bug in our pagetable handling: We must remove the unused flag as soon as an entry becomes present again. Signed-of-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12s390/uaccess: get_user() should zero on failure (again)Heiko Carstens
commit d09c5373e8e4eaaa09233552cbf75dc4c4f21203 upstream. Commit fd2d2b191fe7 ("s390: get_user() should zero on failure") intended to fix s390's get_user() implementation which did not zero the target operand if the read from user space faulted. Unfortunately the patch has no effect: the corresponding inline assembly specifies that the operand is only written to ("=") and the previous value is discarded. Therefore the compiler is free to and actually does omit the zero initialization. To fix this simply change the contraint modifier to "+", so the compiler cannot omit the initialization anymore. Fixes: c9ca78415ac1 ("s390/uaccess: provide inline variants of get_user/put_user") Fixes: fd2d2b191fe7 ("s390: get_user() should zero on failure") Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15s390: TASK_SIZE for kernel threadsMartin Schwidefsky
commit fb94a687d96c570d46332a4a890f1dcb7310e643 upstream. Return a sensible value if TASK_SIZE if called from a kernel thread. This gets us around an issue with copy_mount_options that does a magic size calculation "TASK_SIZE - (unsigned long)data" while in a kernel thread and data pointing to kernel space. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28s390/mm: fix gmap tlb flush issuesDavid Hildenbrand
commit f045402984404ddc11016358411e445192919047 upstream. __tlb_flush_asce() should never be used if multiple asce belong to a mm. As this function changes mm logic determining if local or global tlb flushes will be neded, we might end up flushing only the gmap asce on all CPUs and a follow up mm asce flushes will only flush on the local CPU, although that asce ran on multiple CPUs. The missing tlb flushes will provoke strange faults in user space and even low address protections in user space, crashing the kernel. Fixes: 1b948d6caec4 ("s390/mm,tlb: optimize TLB flushing for zEC12") Cc: stable@vger.kernel.org # 3.15+ Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-24s390: get_user() should zero on failureAl Viro
commit fd2d2b191fe75825c4c7a6f12f3fef35aaed7dd7 upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-15s390/pci_dma: fix DMA table corruption with > 4 TB main memoryGerald Schaefer
[ Upstream commit 69eea95c48857c9dfcac120d6acea43027627b28 ] DMA addresses returned from map_page() are calculated by using an iommu bitmap plus a start_dma offset. The size of this bitmap is based on the main memory size. If we have more than (4 TB - start_dma) main memory, the DMA address calculation will also produce addresses > 4 TB. Such addresses cannot be inserted in the 3-level DMA page table, instead the entries modulo 4 TB will be overwritten. Fix this by restricting the iommu bitmap size to (4 TB - start_dma). Also set zdev->end_dma to the actual end address of the usable range, instead of the theoretical maximum as reported by the hardware, which fixes a sanity check in dma_map() and also the IOMMU API domain geometry aperture calculation. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-26Revert "Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4"Trilok Soni
This reverts commit 9d6fd2c3e9fcfb ("Merge remote-tracking branch 'msm-4.4/tmp-510d0a3f' into msm-4.4"), because it breaks the dump parsing tools due to kernel can be loaded anywhere in the memory now and not fixed at linear mapping. Change-Id: Id416f0a249d803442847d09ac47781147b0d0ee6 Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-07-27s390: fix test_fp_ctl inline assembly contraintsMartin Schwidefsky
commit bcf4dd5f9ee096bd1510f838dd4750c35df4e38b upstream. The test_fp_ctl function is used to test if a given value is a valid floating-point control. The inline assembly in test_fp_ctl uses an incorrect constraint for the 'orig_fpc' variable. If the compiler chooses the same register for 'fpc' and 'orig_fpc' the test_fp_ctl() function always returns true. This allows user space to trigger kernel oopses with invalid floating-point control values on the signal stack. This problem has been introduced with git commit 4725c86055f5bbdcdf "s390: fix save and restore of the floating-point-control register" Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-18s390/mm: fix asce_bits handling with dynamic pagetable levelsGerald Schaefer
commit 723cacbd9dc79582e562c123a0bacf8bfc69e72a upstream. There is a race with multi-threaded applications between context switch and pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a pagetable upgrade on another thread in crst_table_upgrade() could already have set new asce_bits, but not yet the new mm->pgd. This would result in a corrupt user_asce in switch_mm(), and eventually in a kernel panic from a translation exception. Fix this by storing the complete asce instead of just the asce_bits, which can then be read atomically from switch_mm(), so that it either sees the old value or the new value, but no mixture. Both cases are OK. Having the old value would result in a page fault on access to the higher level memory, but the fault handler would see the new mm->pgd, if it was a valid access after the mmap on the other thread has completed. So as worst-case scenario we would have a page fault loop for the racing thread until the next time slice. Also remove dead code and simplify the upgrade/downgrade path, there are no upgrades from 2 levels, and only downgrades from 3 levels for compat tasks. There are also no concurrent upgrades, because the mmap_sem is held with down_write() in do_mmap, so the flush and table checks during upgrade can be removed. Reported-by: Michael Munday <munday@ca.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04s390/pci: add extra padding to function measurement blockSebastian Ott
commit 9d89d9e61d361f3adb75e1aebe4bb367faf16cfa upstream. Newer machines might use a different (larger) format for function measurement blocks. To ensure that we comply with the alignment requirement on these machines and prevent memory corruption (when firmware writes more data than we expect) add 16 padding bytes at the end of the fmb. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12s390/pci: enforce fmb page boundary ruleSebastian Ott
commit 80c544ded25ac14d7cc3e555abb8ed2c2da99b84 upstream. The function measurement block must not cross a page boundary. Ensure that by raising the alignment requirement to the smallest power of 2 larger than the size of the fmb. Fixes: d0b088531 ("s390/pci: performance statistics and debug infrastructure") Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-22Revert "net, lib: kill arch_fast_hash library bits"Rohit Vaswani
This reverts commit 0cb6c969ed9de43687abdfc63714b6fe4385d2fc.
2016-03-16s390/mm: four page table levels vs. forkMartin Schwidefsky
commit 3446c13b268af86391d06611327006b059b8bab1 upstream. The fork of a process with four page table levels is broken since git commit 6252d702c5311ce9 "[S390] dynamic page tables." All new mm contexts are created with three page table levels and an asce limit of 4TB. If the parent has four levels dup_mmap will add vmas to the new context which are outside of the asce limit. The subsequent call to copy_page_range will walk the three level page table structure of the new process with non-zero pgd and pud indexes. This leads to memory clobbers as the pgd_index *and* the pud_index is added to the mm->pgd pointer without a pgd_deref in between. The init_new_context() function is selecting the number of page table levels for a new context. The function is used by mm_init() which in turn is called by dup_mm() and mm_alloc(). These two are used by fork() and exec(). The init_new_context() function can distinguish the two cases by looking at mm->context.asce_limit, for fork() the mm struct has been copied and the number of page table levels may not change. For exec() the mm_alloc() function set the new mm structure to zero, in this case a three-level page table is created as the temporary stack space is located at STACK_TOP_MAX = 4TB. This fixes CVE-2016-2143. Reported-by: Marcin Kościelnicki <koriakin@0x04.net> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-03s390/fpu: signals vs. floating point control registerMartin Schwidefsky
commit 1b17cb796f5d40ffa239c6926385abd83a77a49b upstream. git commit 904818e2f229f3d94ec95f6932a6358c81e73d78 "s390/kernel: introduce fpu-internal.h with fpu helper functions" introduced the fpregs_store / fp_regs_load helper. These function fail to save and restore the floating pointer control registers. The effect is that the FPC is not correctly handled on signal delivery and signal return. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>