path: root/ipc/shm.c
Commit message (Author, Age)
* Merge remote-tracking branch 'common/android-4.4-p' into lineage-18.1-caf-msm8998 (Michael Bestas, 2021-12-27)
|\
    * common/android-4.4-p: Linux 4.4.296 xen/netback: don't queue unlimited number of packages xen/console: harden hvc_xen against event channel storms xen/netfront: harden netfront against event channel storms xen/blkfront: harden blkfront against event channel storms Input: touchscreen - avoid bitwise vs logical OR warning ARM: 8805/2: remove unneeded naked function usage net: lan78xx: Avoid unnecessary self assignment net: systemport: Add global locking for descriptor lifecycle timekeeping: Really make sure wall_to_monotonic isn't positive USB: serial: option: add Telit FN990 compositions PCI/MSI: Clear PCI_MSIX_FLAGS_MASKALL on error USB: gadget: bRequestType is a bitfield, not a enum igbvf: fix double free in `igbvf_probe` soc/tegra: fuse: Fix bitwise vs. logical OR warning nfsd: fix use-after-free due to delegation race dm btree remove: fix use after free in rebalance_children() recordmcount.pl: look for jgnop instruction as well as bcrl on s390 mac80211: send ADDBA requests using the tid/queue of the aggregation session hwmon: (dell-smm) Fix warning on /proc/i8k creation error net: netlink: af_netlink: Prevent empty skb by adding a check on len.
i2c: rk3x: Handle a spurious start completion interrupt flag parisc/agp: Annotate parisc agp init functions with __init nfc: fix segfault in nfc_genl_dump_devices_done FROMGIT: USB: gadget: bRequestType is a bitfield, not a enum Linux 4.4.295 irqchip: nvic: Fix offset for Interrupt Priority Offsets irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALL iio: accel: kxcjk-1013: Fix possible memory leak in probe and remove iio: itg3200: Call iio_trigger_notify_done() on error iio: ltr501: Don't return error code in trigger handler iio: mma8452: Fix trigger reference couting iio: stk3310: Don't return error code in interrupt handler usb: core: config: fix validation of wMaxPacketValue entries USB: gadget: zero allocate endpoint 0 buffers USB: gadget: detect too-big endpoint 0 requests net/qla3xxx: fix an error code in ql_adapter_up() net, neigh: clear whole pneigh_entry at alloc time net: fec: only clear interrupt of handling queue in fec_enet_rx_queue() net: altera: set a couple error code in probe() net: cdc_ncm: Allow for dwNtbOutMaxSize to be unset or zero block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2) tracefs: Set all files to the same group ownership as the mount option signalfd: use wake_up_pollfree() binder: use wake_up_pollfree() wait: add wake_up_pollfree() libata: add horkage for ASMedia 1092 can: pch_can: pch_can_rx_normal: fix use after free tracefs: Have new files inherit the ownership of their parent ALSA: pcm: oss: Handle missing errors in snd_pcm_oss_change_params*() ALSA: pcm: oss: Limit the period size to 16MB ALSA: pcm: oss: Fix negative period/buffer sizes ALSA: ctl: Fix copy of updated id with element read/write mm: bdi: initialize bdi_min_ratio when bdi is unregistered nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done can: sja1000: fix use after free in ems_pcmcia_add_card() HID: check for valid USB device for many HID drivers HID: wacom: fix problems when device is not a valid USB device HID: add USB_HID 
dependancy on some USB HID drivers HID: add USB_HID dependancy to hid-chicony HID: add USB_HID dependancy to hid-prodikeys HID: add hid_is_usb() function to make it simpler for USB detection HID: introduce hid_is_using_ll_driver UPSTREAM: USB: gadget: zero allocate endpoint 0 buffers UPSTREAM: USB: gadget: detect too-big endpoint 0 requests Linux 4.4.294 serial: pl011: Add ACPI SBSA UART match id tty: serial: msm_serial: Deactivate RX DMA for polling support vgacon: Propagate console boot parameters before calling `vc_resize' parisc: Fix "make install" on newer debian releases siphash: use _unaligned version by default net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings() natsemi: xtensa: fix section mismatch warnings fget: check that the fd still exists after getting a ref to it fs: add fget_many() and fput_many() sata_fsl: fix warning in remove_proc_entry when rmmod sata_fsl sata_fsl: fix UAF in sata_fsl_port_stop when rmmod sata_fsl kprobes: Limit max data_size of the kretprobe instances net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock() net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound scsi: iscsi: Unblock session then wake up error handler s390/setup: avoid using memblock_enforce_memory_limit platform/x86: thinkpad_acpi: Fix WWAN device disabled issue after S3 deep net: return correct error code hugetlb: take PMD sharing into account when flushing tlb/caches tty: hvc: replace BUG_ON() with negative return value xen/netfront: don't trust the backend response data blindly xen/netfront: disentangle tx_skb_freelist xen/netfront: don't read data from request on the ring page xen/netfront: read response from backend only once xen/blkfront: don't trust the backend response data blindly xen/blkfront: don't take local copy of a request from the ring page xen/blkfront: read response from backend only once xen: sync include/xen/interface/io/ring.h with Xen's newest version shm: 
extend forced shm destroy to support objects from several IPC nses fuse: release pipe buf after last use fuse: fix page stealing NFC: add NCI_UNREG flag to eliminate the race proc/vmcore: fix clearing user buffer by properly using clear_user() hugetlbfs: flush TLBs correctly after huge_pmd_unshare tracing: Check pid filtering when creating events tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows scsi: mpt3sas: Fix kernel panic during drive powercycle test ARM: socfpga: Fix crash with CONFIG_FORTIRY_SOURCE NFSv42: Don't fail clone() unless the OP_CLONE operation failed net: ieee802154: handle iftypes as u32 ASoC: topology: Add missing rwsem around snd_ctl_remove() calls ARM: dts: BCM5301X: Add interrupt properties to GPIO node xen: detect uninitialized xenbus in xenbus_init xen: don't continue xenstore initialization in case of errors staging: rtl8192e: Fix use after free in _rtl92e_pci_disconnect() ALSA: ctxfi: Fix out-of-range access binder: fix test regression due to sender_euid change usb: hub: Fix locking issues with address0_mutex usb: hub: Fix usb enumeration issue due to address0 race USB: serial: option: add Fibocom FM101-GL variants USB: serial: option: add Telit LE910S1 0x9200 composition staging: ion: Prevent incorrect reference counting behavour Change-Id: Iadf9f213915d2a02b27ceb3b2144eac827ade329
| * shm: extend forced shm destroy to support objects from several IPC nses (Alexander Mikhalitsyn, 2021-12-08)
    commit 85b6d24646e4125c591639841169baa98a2da503 upstream.

    Currently, the exit_shm() function is not designed to work properly when task->sysvshm.shm_clist holds shm objects from different IPC namespaces.

    This is a real pain when sysctl kernel.shm_rmid_forced = 1, because it leads to a use-after-free (a reproducer exists).

    This is an attempt to fix the problem by extending the exit_shm mechanism to handle destruction of shm objects from several IPC namespaces.

    To achieve that we do several things:
    1. Add a (non-refcounted) namespace pointer to struct shmid_kernel.
    2. During new shm object creation (newseg()/shmget syscall), initialize this pointer with the current task's IPC ns.
    3. Fully rework exit_shm() so that it traverses all shp's in task->sysvshm.shm_clist and takes the IPC namespace not from the current task, as before, but from the shp object itself, then calls shm_destroy(shp, ns).

    Note: We need to be really careful here because, as noted in (1), our pointer to the IPC ns is not refcounted. To be on the safe side we use the special helper get_ipc_ns_not_zero(), which takes a reference on the IPC ns only if the IPC ns is not in the "state of destruction".

    Q/A

    Q: Why can we access shp->ns memory using a non-refcounted pointer?
    A: Because the shp object lifetime is always shorter than the IPC namespace lifetime, so if we get the shp object from task->sysvshm.shm_clist while holding task_lock(task), nobody can steal our namespace.

    Q: Does this patch change semantics of unshare/setns/clone syscalls?
    A: No. It just fixes a previously uncovered case where a process may leave an IPC namespace without getting its task->sysvshm.shm_clist list cleaned up.

    Link: https://lkml.kernel.org/r/67bb03e5-f79c-1815-e2bf-949c67047418@colorfullife.com
    Link: https://lkml.kernel.org/r/20211109151501.4921-1-manfred@colorfullife.com
    Fixes: ab602f79915 ("shm: make exit_shm work proportional to task activity")
    Co-developed-by: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
    Cc: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Greg KH <gregkh@linuxfoundation.org>
    Cc: Andrei Vagin <avagin@gmail.com>
    Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
    Cc: Vasily Averin <vvs@virtuozzo.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
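The three steps and the "take a reference only if non-zero" rule described in this commit message can be modelled in plain userspace C. This is a minimal sketch, not the kernel code: every `model_*` name is hypothetical, the refcount is a plain int rather than an atomic, all locking (such as the task_lock(task) mentioned above) is omitted, and every listed segment is "destroyed" unconditionally, whereas the real exit_shm() applies further conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IPC namespace (not the kernel struct). */
struct model_ns {
    int refcount;   /* 0 models a namespace in the "state of destruction" */
    int destroyed;  /* how many segments were destroyed through this ns */
};

/* Hypothetical stand-in for a shm segment (not struct shmid_kernel). */
struct model_shp {
    struct model_ns *ns;     /* step 1: non-refcounted back-pointer */
    struct model_shp *next;  /* next entry on the task's list */
};

/* Take a reference only if the count has not already dropped to zero. */
static bool model_get_ns_not_zero(struct model_ns *ns)
{
    if (ns == NULL || ns->refcount == 0)
        return false;
    ns->refcount++;
    return true;
}

static void model_put_ns(struct model_ns *ns)
{
    assert(ns->refcount > 0);
    ns->refcount--;
}

/* Step 2: creation records the creator's namespace in the segment. */
static void model_newseg(struct model_shp *shp, struct model_ns *creator_ns,
                         struct model_shp **clist)
{
    shp->ns = creator_ns;
    shp->next = *clist;
    *clist = shp;
}

/*
 * Step 3: walk the task's list and use each segment's own namespace,
 * not the task's current one. Returns the number of segments destroyed.
 */
static int model_exit_shm(struct model_shp **clist)
{
    int destroyed = 0;

    while (*clist != NULL) {
        struct model_shp *shp = *clist;

        *clist = shp->next;  /* unlink from the task's list */
        shp->next = NULL;

        /* Namespace is going away: leave the segment to its teardown. */
        if (!model_get_ns_not_zero(shp->ns))
            continue;

        shp->ns->destroyed++;  /* models shm_destroy(shp, ns) */
        destroyed++;
        model_put_ns(shp->ns);
    }
    return destroyed;
}
```

With two segments created in a live namespace and one in a namespace whose count is already zero, `model_exit_shm()` in this model empties the list, destroys only the two live-namespace segments, and leaves the live namespace's count unchanged.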
* | Merge android-4.4.135 (c9d74f2) into msm-4.4 (Srinivasarao P, 2018-06-27)
|\|
    * refs/heads/tmp-c9d74f2 Linux 4.4.135 Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU" Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU" Linux 4.4.134 s390/ftrace: use expoline for indirect branches kdb: make "mdr" command repeat Bluetooth: btusb: Add device ID for RTL8822BE ASoC: samsung: i2s: Ensure the RCLK rate is properly determined regulator: of: Add a missing 'of_node_put()' in an error handling path of 'of_regulator_match()' scsi: lpfc: Fix frequency of Release WQE CQEs scsi: lpfc: Fix soft lockup in lpfc worker thread during LIP testing scsi: lpfc: Fix issue_lip if link is disabled netlabel: If PF_INET6, check sk_buff ip header version selftests/net: fixes psock_fanout eBPF test case perf report: Fix memory corruption in --branch-history mode --branch-history perf tests: Use
arch__compare_symbol_names to compare symbols x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified drm/rockchip: Respect page offset for PRIME mmap calls MIPS: Octeon: Fix logging messages with spurious periods after newlines audit: return on memory error to avoid null pointer dereference crypto: sunxi-ss - Add MODULE_ALIAS to sun4i-ss clk: samsung: exynos3250: Fix PLL rates clk: samsung: exynos5250: Fix PLL rates clk: samsung: exynos5433: Fix PLL rates clk: samsung: exynos5260: Fix PLL rates clk: samsung: s3c2410: Fix PLL rates media: cx25821: prevent out-of-bounds read on array card udf: Provide saner default for invalid uid / gid PCI: Add function 1 DMA alias quirk for Marvell 88SE9220 serial: arc_uart: Fix out-of-bounds access through DT alias serial: fsl_lpuart: Fix out-of-bounds access through DT alias serial: imx: Fix out-of-bounds access through serial port index serial: mxs-auart: Fix out-of-bounds access through serial port index serial: samsung: Fix out-of-bounds access through serial port index serial: xuartps: Fix out-of-bounds access through DT alias rtc: tx4939: avoid unintended sign extension on a 24 bit shift staging: rtl8192u: return -ENOMEM on failed allocation of priv->oldaddr hwrng: stm32 - add reset during probe enic: enable rq before updating rq descriptors clk: rockchip: Prevent calculating mmc phase if clock rate is zero media: em28xx: USB bulk packet size fix dmaengine: pl330: fix a race condition in case of threaded irqs media: s3c-camif: fix out-of-bounds array access media: cx23885: Set subdev host data to clk_freq pointer media: cx23885: Override 888 ImpactVCBe crystal frequency ALSA: vmaster: Propagate slave error x86/devicetree: Fix device IRQ settings in DT x86/devicetree: Initialize device tree before using it usb: gadget: composite: fix incorrect handling of OS desc requests usb: gadget: udc: change comparison to bitshift when dealing with a mask gfs2: Fix fallocate chunk size cdrom: do not call 
check_disk_change() inside cdrom_open() hwmon: (pmbus/adm1275) Accept negative page register values hwmon: (pmbus/max8688) Accept negative page register values perf/core: Fix perf_output_read_group() ASoC: topology: create TLV data for dapm widgets powerpc: Add missing prototype for arch_irq_work_raise() usb: gadget: ffs: Execute copy_to_user() with USER_DS set usb: gadget: ffs: Let setup() return USB_GADGET_DELAYED_STATUS usb: dwc2: Fix interval type issue ipmi_ssif: Fix kernel panic at msg_done_handler PCI: Restore config space on runtime resume despite being unbound MIPS: ath79: Fix AR724X_PLL_REG_PCIE_CONFIG offset xhci: zero usb device slot_id member when disabling and freeing a xhci slot KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use i2c: mv64xxx: Apply errata delay only in standard mode ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c ACPICA: Events: add a return on failure from acpi_hw_register_read bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set zorro: Set up z->dev.dma_mask for the DMA API clk: Don't show the incorrect clock phase cpufreq: cppc_cpufreq: Fix cppc_cpufreq_init() failure path usb: dwc3: Update DWC_usb31 GTXFIFOSIZ reg fields arm: dts: socfpga: fix GIC PPI warning virtio-net: Fix operstate for virtio when no VIRTIO_NET_F_STATUS ima: Fallback to the builtin hash algorithm ima: Fix Kconfig to select TPM 2.0 CRB interface ath10k: Fix kernel panic while using worker (ath10k_sta_rc_update_wk) net/mlx5: Protect from command bit overflow selftests: Print the test we're running to /dev/kmsg tools/thermal: tmon: fix for segfault powerpc/perf: Fix kernel address leak via sampling registers powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer rtc: hctosys: Ensure system time doesn't overflow time_t hwmon: (nct6775) Fix writing pwmX_mode parisc/pci: Switch LBA PCI bus from Hard Fail to Soft Fail mode m68k: set dma and coherent masks for platform FEC ethernets powerpc/mpic: Check 
if cpu_possible() in mpic_physmask() ACPI: acpi_pad: Fix memory leak in power saving threads xen/acpi: off by one in read_acpi_id() btrfs: fix lockdep splat in btrfs_alloc_subvolume_writers Btrfs: fix copy_items() return value when logging an inode btrfs: tests/qgroup: Fix wrong tree backref level Bluetooth: btusb: Add USB ID 7392:a611 for Edimax EW-7611ULB net: bgmac: Fix endian access in bgmac_dma_tx_ring_free() rtc: snvs: Fix usage of snvs_rtc_enable sparc64: Make atomic_xchg() an inline function rather than a macro. fscache: Fix hanging wait on page discarded by writeback KVM: VMX: raise internal error for exception during invalid protected mode state sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning ocfs2/dlm: don't handle migrate lockres if already in shutdown btrfs: Fix possible softlock on single core machines Btrfs: fix NULL pointer dereference in log_dir_items Btrfs: bail out on error during replay_dir_deletes mm: fix races between address_space dereference and free in page_evicatable mm/ksm: fix interaction with THP dp83640: Ensure against premature access to PHY registers after reset scsi: aacraid: Insure command thread is not recursively stopped cpufreq: CPPC: Initialize shared perf capabilities of CPUs Force log to disk before reading the AGF during a fstrim sr: get/drop reference to device in revalidate and check_events swap: divide-by-zero when zero length swap file on ssd fs/proc/proc_sysctl.c: fix potential page fault while unregistering sysctl table x86/pgtable: Don't set huge PUD/PMD on non-leaf entries sh: fix debug trap failure to process signals before return to user net: mvneta: fix enable of all initialized RXQs net: Fix untag for vlan packets without ethernet header mm/kmemleak.c: wait for scan completion before disabling free llc: properly handle dev_queue_xmit() return value net-usb: add qmi_wwan if on lte modem wistron neweb d18q1 net/usb/qmi_wwan.c: Add USB id for lt4120 modem net: qmi_wwan: add BroadMobi BM806U 2020:2033 
ARM: 8748/1: mm: Define vdso_start, vdso_end as array batman-adv: fix packet loss for broadcasted DHCP packets to a server batman-adv: fix multicast-via-unicast transmission with AP isolation selftests: ftrace: Add a testcase for probepoint selftests: ftrace: Add a testcase for string type with kprobe_event selftests: ftrace: Add probe event argument syntax testcase mm/mempolicy.c: avoid use uninitialized preferred_node RDMA/ucma: Correct option size check using optlen perf/cgroup: Fix child event counting bug vti4: Don't override MTU passed on link creation via IFLA_MTU vti4: Don't count header length twice on tunnel setup batman-adv: fix header size check in batadv_dbg_arp() net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off sunvnet: does not support GSO for sctp ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu workqueue: use put_device() instead of kfree() bnxt_en: Check valid VNIC ID in bnxt_hwrm_vnic_set_tpa(). netfilter: ebtables: fix erroneous reject of last rule USB: OHCI: Fix NULL dereference in HCDs using HCD_LOCAL_MEM xen: xenbus: use put_device() instead of kfree() fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper(). 
scsi: sd: Keep disk read-only when re-reading partition scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers e1000e: allocate ring descriptors with dma_zalloc_coherent e1000e: Fix check_for_link return value with autoneg off watchdog: f71808e_wdt: Fix magic close handling KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable Btrfs: send, fix issuing write op when processing hole in no data mode xen/pirq: fix error path cleanup when binding MSIs net/tcp/illinois: replace broken algorithm reference link gianfar: Fix Rx byte accounting for ndev stats sit: fix IFLA_MTU ignored on NEWLINK bcache: fix kcrashes with fio in RAID5 backend dev dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3 virtio-gpu: fix ioctl and expose the fixed status to userspace. r8152: fix tx packets accounting clocksource/drivers/fsl_ftm_timer: Fix error return checking nvme-pci: Fix nvme queue cleanup if IRQ setup fails netfilter: ebtables: convert BUG_ONs to WARN_ONs batman-adv: invalidate checksum on fragment reassembly batman-adv: fix packet checksum in receive path md/raid1: fix NULL pointer dereference media: dmxdev: fix error code for invalid ioctls x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs regulatory: add NUL to request alpha2 smsc75xx: fix smsc75xx_set_features() ARM: OMAP: Fix dmtimer init for omap1 s390/cio: clear timer when terminating driver I/O s390/cio: fix return code after missing interrupt powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE md: raid5: avoid string overflow warning locking/xchg/alpha: Add unconditional memory barrier to cmpxchg() usb: musb: fix enumeration after resume drm/exynos: fix comparison to 
bitshift when dealing with a mask md raid10: fix NULL deference in handle_write_completed() mac80211: round IEEE80211_TX_STATUS_HEADROOM up to multiple of 4 NFC: llcp: Limit size of SDP URI ARM: OMAP1: clock: Fix debugfs_create_*() usage ARM: OMAP3: Fix prm wake interrupt for resume ARM: OMAP2+: timer: fix a kmemleak caused in omap_get_timer_dt scsi: qla4xxx: skip error recovery in case of register disconnect. scsi: aacraid: fix shutdown crash when init fails scsi: storvsc: Increase cmd_per_lun for higher speed devices selftests: memfd: add config fragment for fuse usb: dwc2: Fix dwc2_hsotg_core_init_disconnected() usb: gadget: fsl_udc_core: fix ep valid checks usb: gadget: f_uac2: fix bFirstInterface in composite gadget ARC: Fix malformed ARC_EMUL_UNALIGNED default scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion() scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo() scsi: sym53c8xx_2: iterator underflow in sym_getsync() scsi: bnx2fc: Fix check in SCSI completion handler for timed out request scsi: ufs: Enable quirk to ignore sending WRITE_SAME command irqchip/gic-v3: Change pr_debug message to pr_devel locking/qspinlock: Ensure node->count is updated before initialising node tools/libbpf: handle issues with bpf ELF objects containing .eh_frames bcache: return attach error when no cache set exist bcache: fix for data collapse after re-attaching an attached device bcache: fix for allocator and register thread race bcache: properly set task state in bch_writeback_thread() cifs: silence compiler warnings showing up with gcc-8.0.0 proc: fix /proc/*/map_files lookup arm64: spinlock: Fix theoretical trylock() A-B-A with LSE atomics RDS: IB: Fix null pointer issue xen/grant-table: Use put_page instead of free_page xen-netfront: Fix race between device setup and open MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASS bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y ACPI: processor_perflib: Do not send _PPC 
change notification if not ready firmware: dmi_scan: Fix handling of empty DMI strings x86/power: Fix swsusp_arch_resume prototype IB/ipoib: Fix for potential no-carrier state mm: pin address_space before dereferencing it while isolating an LRU page asm-generic: provide generic_pmdp_establish() mm/mempolicy: add nodes_empty check in SYSC_migrate_pages mm/mempolicy: fix the check of nodemask from user ocfs2: return error when we attempt to access a dirty bh in jbd2 ocfs2/acl: use 'ip_xattr_sem' to protect getting extended attribute ocfs2: return -EROFS to mount.ocfs2 if inode block is invalid ntb_transport: Fix bug with max_mw_size parameter RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure powerpc/numa: Ensure nodes initialized for hotplug powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path HID: roccat: prevent an out of bounds read in kovaplus_profile_activated() scsi: fas216: fix sense buffer initialization Btrfs: fix scrub to repair raid6 corruption btrfs: Fix out of bounds access in btrfs_search_slot Btrfs: set plug for fsync ipmi/powernv: Fix error return code in ipmi_powernv_probe() mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl() kconfig: Fix expr_free() E_NOT leak kconfig: Fix automatic menu creation mem leak kconfig: Don't leak main menus during parsing watchdog: sp5100_tco: Fix watchdog disable bit nfs: Do not convert nfs_idmap_cache_timeout to jiffies dm thin: fix documentation relative to low water mark threshold tools lib traceevent: Fix get_field_str() for dynamic strings perf callchain: Fix attr.sample_max_stack setting tools lib traceevent: Simplify pointer print logic and fix %pF PCI: Add function 1 DMA alias quirk for Marvell 9128 tracing/hrtimer: Fix tracing bugs by taking all clock bases and modes into account kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl ASoC: au1x: Fix timeout tests in au1xac97c_ac97_read() ALSA: hda - Use 
IS_REACHABLE() for dependency on input NFSv4: always set NFS_LOCK_LOST when a lock is lost. firewire-ohci: work around oversized DMA reads on JMicron controllers do d_instantiate/unlock_new_inode combinations safely xfs: remove racy hasattr check from attr ops kernel/signal.c: avoid undefined behaviour in kill_something_info kernel/sys.c: fix potential Spectre v1 issue kasan: fix memory hotplug during boot ipc/shm: fix shmat() nil address after round-down when remapping Revert "ipc/shm: Fix shmat mmap nil-page protection" xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent libata: blacklist Micron 500IT SSD with MU01 firmware libata: Blacklist some Sandisk SSDs for NCQ mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register ALSA: timer: Fix pause event notification aio: fix io_destroy(2) vs. lookup_ioctx() race affs_lookup(): close a race with affs_remove_link() KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable" MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs MIPS: ptrace: Expose FIR register through FP regset UPSTREAM: sched/fair: Consider RT/IRQ pressure in capacity_spare_wake Conflicts: drivers/media/dvb-core/dmxdev.c drivers/scsi/sd.c drivers/scsi/ufs/ufshcd.c drivers/usb/gadget/function/f_fs.c fs/ecryptfs/inode.c Change-Id: I15751ed8c82ec65ba7eedcb0d385b9f803c333f7 Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
| * ipc/shm: fix shmat() nil address after round-down when remapping (Davidlohr Bueso, 2018-05-30)
    commit 8f89c007b6dec16a1793cb88de88fcc02117bbbc upstream.

    shmat()'s SHM_REMAP option forbids passing a nil address; this is in fact the very first thing we check for. Andrea reported that for SHM_RND|SHM_REMAP cases we can end up bypassing the initial addr check, but we need to check again if the address was rounded down to nil. As of this patch, such cases will return -EINVAL.

    Link: http://lkml.kernel.org/r/20180503204934.kk63josdu6u53fbd@linux-n805
    Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
    Reported-by: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Joe Lawrence <joe.lawrence@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
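The address rule this commit describes can be illustrated with a small userspace model. It is a sketch under stated assumptions rather than the kernel's do_shmat(): the `MODEL_*` flag values and the `model_shmat_addr()` helper are hypothetical, and the alignment granule (SHMLBA in the kernel) is passed in as a parameter assumed to be a power of two. A plain SHM_RND request that rounds down to nil is accepted here, in line with the revert logged directly below.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values; they only stand in for SHM_RND and SHM_REMAP. */
#define MODEL_SHM_RND   0x1
#define MODEL_SHM_REMAP 0x2

/*
 * Validate a requested attach address.
 * Returns 0 and fills *out_addr / *out_fixed on success, -EINVAL otherwise.
 * *out_fixed == 0 means "no address requested, let the kernel choose".
 */
static int model_shmat_addr(unsigned long addr, int shmflg,
                            unsigned long shmlba,
                            unsigned long *out_addr, int *out_fixed)
{
    /* The rounding mask below only works for a power-of-two granule. */
    assert(shmlba != 0 && (shmlba & (shmlba - 1)) == 0);

    if (addr == 0) {
        /* First check: SHM_REMAP forbids a nil address. */
        if (shmflg & MODEL_SHM_REMAP)
            return -EINVAL;
        *out_addr = 0;
        *out_fixed = 0;
        return 0;
    }

    if (addr & (shmlba - 1)) {
        /* A misaligned address is only acceptable with SHM_RND. */
        if (!(shmflg & MODEL_SHM_RND))
            return -EINVAL;

        addr &= ~(shmlba - 1);  /* round down to the granule */

        /* Second check: the round-down must not yield nil when remapping. */
        if (addr == 0 && (shmflg & MODEL_SHM_REMAP))
            return -EINVAL;
    }

    *out_addr = addr;
    *out_fixed = 1;
    return 0;
}
```

In this model, a request for address 1 with both flags set and a 4096-byte granule is rejected with -EINVAL, while the same request with only the rounding flag succeeds with a fixed mapping at address 0.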
| * Revert "ipc/shm: Fix shmat mmap nil-page protection" (Davidlohr Bueso, 2018-05-30)
    commit a73ab244f0dad8fffb3291b905f73e2d3eaa7c00 upstream.

    Patch series "ipc/shm: shmat() fixes around nil-page".

    These patches fix two issues reported[1] a while back by Joe and Andrea around how shmat(2) behaves with nil-page.

    The first reverts a commit that was based on the incorrect belief that mapping nil-page (address=0) was a no-no with MAP_FIXED. This is not the case, with the exception of SHM_REMAP, which is addressed in the second patch.

    I chose two patches because it is easier to backport and it explicitly reverts bogus behaviour. Both patches ought to be in -stable and ltp testcases need to be updated (the added testcase around the CVE can be modified to just test for SHM_RND|SHM_REMAP).

    [1] lkml.kernel.org/r/20180430172152.nfa564pvgpk3ut7p@linux-n805

    This patch (of 2):

    Commit 95e91b831f87 ("ipc/shm: Fix shmat mmap nil-page protection") worked on the idea that we should not be mapping as root addr=0 and MAP_FIXED. However, it was reported that this scenario is in fact valid, which makes the patch both bogus and a userspace regression. For example, X11's libint10.so relies on shmat(1, SHM_RND) for lowmem initialization[1].

    [1] https://cgit.freedesktop.org/xorg/xserver/tree/hw/xfree86/os-support/linux/int10/linux.c#n347

    Link: http://lkml.kernel.org/r/20180503203243.15045-2-dave@stgolabs.net
    Fixes: 95e91b831f87 ("ipc/shm: Fix shmat mmap nil-page protection")
    Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
    Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
    Reported-by: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge android-4.4.129 (b1c4836) into msm-4.4 (Srinivasarao P, 2018-04-24)
|\|
    * refs/heads/tmp-b1c4836 Linux 4.4.129 writeback: safer lock nesting fanotify: fix logic of events on child ext4: bugfix for mmaped pages in mpage_release_unused_pages() mm/filemap.c: fix NULL pointer in page_cache_tree_insert() mm: allow GFP_{FS,IO} for page_cache_read page cache allocation autofs: mount point create should honour passed in mode Don't leak MNT_INTERNAL away from internal mounts rpc_pipefs: fix double-dput() hypfs_kill_super(): deal with failed allocations jffs2_kill_sb(): deal with failed allocations powerpc/lib: Fix off-by-one in alternate feature patching powerpc/eeh: Fix enabling bridge MMIO windows MIPS: memset.S: Fix clobber of v1 in last_fixup MIPS: memset.S: Fix return of __clear_user from Lpartial_fixup MIPS: memset.S: EVA & fault support for small_memset MIPS: uaccess: Add micromips clobbers to bzero invocation HID: hidraw: Fix crash on HIDIOCGFEATURE with a destroyed device ALSA: hda - New VIA controller suppor no-snoop path ALSA: rawmidi: Fix missing input substream checks in compat ioctls ALSA: line6: Use correct endpoint type for midi output ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea() ext4: fix crashes in dioread_nolock mode drm/radeon: Fix PCIe lane width calculation ext4: don't allow r/w mounts if metadata blocks overlap the superblock vfio/pci: Virtualize Maximum Read Request Size vfio/pci: Virtualize Maximum Payload Size vfio-pci: Virtualize PCIe & AF FLR ALSA: pcm: Fix endless loop for XRUN recovery in OSS emulation ALSA: pcm: Fix mutex unbalance in OSS
emulation ioctls ALSA: pcm: Return -EBUSY for OSS ioctls changing busy streams ALSA: pcm: Avoid potential races between OSS ioctls and read/write ALSA: pcm: Use ERESTARTSYS instead of EINTR in OSS emulation ALSA: oss: consolidate kmalloc/memset 0 call to kzalloc watchdog: f71808e_wdt: Fix WD_EN register read thermal: imx: Fix race condition in imx_thermal_probe() clk: bcm2835: De-assert/assert PLL reset signal when appropriate clk: mvebu: armada-38x: add support for missing clocks clk: mvebu: armada-38x: add support for 1866MHz variants mmc: jz4740: Fix race condition in IRQ mask update iommu/vt-d: Fix a potential memory leak um: Use POSIX ucontext_t instead of struct ucontext dmaengine: at_xdmac: fix rare residue corruption IB/srp: Fix completion vector assignment algorithm IB/srp: Fix srp_abort() ALSA: pcm: Fix UAF at PCM release via PCM timer access RDMA/ucma: Don't allow setting RDMA_OPTION_IB_PATH without an RDMA device ext4: fail ext4_iget for root directory if unallocated ext4: don't update checksum of new initialized bitmaps jbd2: if the journal is aborted then don't allow update of the log tail random: use a tighter cap in credit_entropy_bits_safe() thunderbolt: Resume control channel after hibernation image is created ASoC: ssm2602: Replace reg_default_raw with reg_default HID: core: Fix size as type u32 HID: Fix hid_report_len usage powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops powerpc/64: Fix smp_wmb barrier definition use use lwsync consistently powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write() HID: i2c-hid: fix size check and type usage usb: dwc3: pci: Properly cleanup resource USB:fix USB3 devices behind USB3 hubs not resuming at hibernate thaw ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status() ACPI / video: Add quirk to force acpi-video backlight on Samsung 670Z5E regmap: Fix reversed bounds check in regmap_raw_write() xen-netfront: 
Fix hang on device removal ARM: dts: at91: sama5d4: fix pinctrl compatible string ARM: dts: at91: at91sam9g25: fix mux-mask pinctrl property usb: musb: gadget: misplaced out of bounds check mm, slab: reschedule cache_reap() on the same CPU ipc/shm: fix use-after-free of shm file via remap_file_pages() resource: fix integer overflow at reallocation fs/reiserfs/journal.c: add missing resierfs_warning() arg ubi: Reject MLC NAND ubi: Fix error for write access ubi: fastmap: Don't flush fastmap work on detach ubifs: Check ubifs_wbuf_sync() return code tty: make n_tty_read() always abort if hangup is in progress x86/hweight: Don't clobber %rdi x86/hweight: Get rid of the special calling convention lan78xx: Correctly indicate invalid OTP slip: Check if rstate is initialized before uncompressing cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN hwmon: (ina2xx) Fix access to uninitialized mutex rtl8187: Fix NULL pointer dereference in priv->conf_mutex getname_kernel() needs to make sure that ->name != ->iname in long case s390/ipl: ensure loadparm valid flag is set s390/qdio: don't merge ERROR output buffers s390/qdio: don't retry EQBS after CCQ 96 block/loop: fix deadlock after loop_set_status Revert "perf tests: Decompress kernel module before objdump" radeon: hide pointless #warning when compile testing perf intel-pt: Fix timestamp following overflow perf intel-pt: Fix error recovery from missing TIP packet perf intel-pt: Fix sync_switch perf intel-pt: Fix overlap detection to identify consecutive buffers correctly parisc: Fix out of array access in match_pci_device() media: v4l2-compat-ioctl32: don't oops on overlay f2fs: check cap_resource only for data blocks Revert "f2fs: introduce f2fs_set_page_dirty_nobuffer" f2fs: clear PageError on writepage UPSTREAM: timer: Export destroy_hrtimer_on_stack() BACKPORT: dm verity: add 'check_at_most_once' option to only validate hashes once f2fs: call unlock_new_inode() before d_instantiate() f2fs: refactor read path to 
allow multiple postprocessing steps fscrypt: allow synchronous bio decryption Change-Id: I45f4ac10734d92023b53118d83dcd6c83974a283 Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
| * ipc/shm: fix use-after-free of shm file via remap_file_pages()  Eric Biggers  2018-04-24

    commit 3f05317d9889ab75c7190dcd39491d2a97921984 upstream.

    syzbot reported a use-after-free of shm_file_data(file)->file->f_op in shm_get_unmapped_area(), called via sys_remap_file_pages().

    Unfortunately it couldn't generate a reproducer, but I found a bug which I think caused it. When remap_file_pages() is passed a full System V shared memory segment, the memory is first unmapped, then a new map is created using the ->vm_file. Between these steps, the shm ID can be removed and reused for a new shm segment. But, shm_mmap() only checks whether the ID is currently valid before calling the underlying file's ->mmap(); it doesn't check whether it was reused. Thus it can use the wrong underlying file, one that was already freed.

    Fix this by making the "outer" shm file (the one that gets put in ->vm_file) hold a reference to the real shm file, and by making __shm_open() require that the file associated with the shm ID matches the one associated with the "outer" file.

    Taking the reference to the real shm file is needed to fully solve the problem, since otherwise sfd->file could point to a freed file, which then could be reallocated for the reused shm ID, causing the wrong shm segment to be mapped (and without the required permission checks).

    Commit 1ac0b6dec656 ("ipc/shm: handle removed segments gracefully in shm_mmap()") almost fixed this bug, but it didn't go far enough because it didn't consider the case where the shm ID is reused.

    The following program usually reproduces this bug:

        #include <stdlib.h>
        #include <sys/shm.h>
        #include <sys/syscall.h>
        #include <unistd.h>

        int main()
        {
            int is_parent = (fork() != 0);
            srand(getpid());
            for (;;) {
                int id = shmget(0xF00F, 4096, IPC_CREAT|0700);
                if (is_parent) {
                    void *addr = shmat(id, NULL, 0);
                    usleep(rand() % 50);
                    while (!syscall(__NR_remap_file_pages, addr, 4096, 0, 0, 0));
                } else {
                    usleep(rand() % 50);
                    shmctl(id, IPC_RMID, NULL);
                }
            }
        }

    It causes the following NULL pointer dereference due to a 'struct file' being used while it's being freed. (I couldn't actually get a KASAN use-after-free splat like in the syzbot report. But I think it's possible with this bug; it would just take a more extraordinary race...)

        BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
        PGD 0 P4D 0
        Oops: 0000 [#1] SMP NOPTI
        CPU: 9 PID: 258 Comm: syz_ipc Not tainted 4.16.0-05140-gf8cf2f16a7c95 #189
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
        RIP: 0010:d_inode include/linux/dcache.h:519 [inline]
        RIP: 0010:touch_atime+0x25/0xd0 fs/inode.c:1724
        [...]
        Call Trace:
         file_accessed include/linux/fs.h:2063 [inline]
         shmem_mmap+0x25/0x40 mm/shmem.c:2149
         call_mmap include/linux/fs.h:1789 [inline]
         shm_mmap+0x34/0x80 ipc/shm.c:465
         call_mmap include/linux/fs.h:1789 [inline]
         mmap_region+0x309/0x5b0 mm/mmap.c:1712
         do_mmap+0x294/0x4a0 mm/mmap.c:1483
         do_mmap_pgoff include/linux/mm.h:2235 [inline]
         SYSC_remap_file_pages mm/mmap.c:2853 [inline]
         SyS_remap_file_pages+0x232/0x310 mm/mmap.c:2769
         do_syscall_64+0x64/0x1a0 arch/x86/entry/common.c:287
         entry_SYSCALL_64_after_hwframe+0x42/0xb7

    [ebiggers@google.com: add comment]
    Link: http://lkml.kernel.org/r/20180410192850.235835-1-ebiggers3@gmail.com
    Link: http://lkml.kernel.org/r/20180409043039.28915-1-ebiggers3@gmail.com
    Reported-by: syzbot+d11f321e7f1923157eac80aa990b446596f46439@syzkaller.appspotmail.com
    Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation")
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Acked-by: Davidlohr Bueso <dbueso@suse.de>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: "Eric W . Biederman" <ebiederm@xmission.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
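
    The ID-reuse check described in the commit message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct backing_file`, `struct segment`, `struct outer_file`, `id_table`, `outer_attach()` and `outer_reopen()` are hypothetical names invented here, not the kernel's definitions, and all locking is omitted.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel structures; not the real definitions. */
struct backing_file {
    int refcount;               /* number of holders of this file */
};

struct segment {                /* what the ID table currently maps an ID to */
    struct backing_file *file;
};

struct outer_file {             /* per-attach data, like the "outer" shm file */
    int id;                     /* shm ID seen at attach time */
    struct backing_file *file;  /* pinned reference to the real file */
};

#define MAX_IDS 8
static struct segment *id_table[MAX_IDS];

/* Attach: remember the ID and take a reference on the backing file. */
int outer_attach(struct outer_file *of, int id)
{
    if (id < 0 || id >= MAX_IDS || id_table[id] == NULL)
        return -EINVAL;
    of->id = id;
    of->file = id_table[id]->file;
    of->file->refcount++;       /* keeps the file alive while attached */
    return 0;
}

/*
 * Re-open (for example when a mapping is recreated): succeed only if the
 * ID still exists AND still refers to the file pinned at attach time.
 */
int outer_reopen(const struct outer_file *of)
{
    struct segment *seg = id_table[of->id];

    if (seg == NULL)
        return -EIDRM;          /* ID was removed */
    if (seg->file != of->file)
        return -EINVAL;         /* ID was reused for another segment */
    return 0;
}
```

    In this model, checking only `seg != NULL` corresponds to the earlier partial fix; the pointer comparison is what catches a reused ID, and the reference taken in `outer_attach()` is what makes that comparison safe.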
* | Merge tag v4.4.55 into branch 'msm-4.4'  Blagovest Kolenichev  2017-03-23
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | refs/heads/tmp-28ec98b: Linux 4.4.55 ext4: don't BUG when truncating encrypted inodes on the orphan list dm: flush queued bios when process blocks to avoid deadlock nfit, libnvdimm: fix interleave set cookie calculation s390/kdump: Use "LINUX" ELF note name instead of "CORE" KVM: s390: Fix guest migration for huge guests resulting in panic mvsas: fix misleading indentation serial: samsung: Continue to work if DMA request fails USB: serial: io_ti: fix information leak in completion handler USB: serial: io_ti: fix NULL-deref in interrupt callback USB: iowarrior: fix NULL-deref in write USB: iowarrior: fix NULL-deref at probe USB: serial: omninet: fix reference leaks at open USB: serial: safe_serial: fix information leak in completion handler usb: host: xhci-plat: Fix timeout on removal of hot pluggable xhci controllers usb: host: xhci-dbg: HCIVERSION should be a binary number usb: gadget: function: f_fs: pass companion descriptor along usb: dwc3: gadget: make Set Endpoint Configuration macros safe usb: gadget: dummy_hcd: 
clear usb_gadget region before registration powerpc: Emulation support for load/store instructions on LE tracing: Add #undef to fix compile error MIPS: Netlogic: Fix CP0_EBASE redefinition warnings MIPS: DEC: Avoid la pseudo-instruction in delay slots mm: memcontrol: avoid unused function warning cpmac: remove hopeless #warning MIPS: ralink: Remove unused rt*_wdt_reset functions MIPS: ralink: Cosmetic change to prom_init(). mtd: pmcmsp: use kstrndup instead of kmalloc+strncpy MIPS: Update lemote2f_defconfig for CPU_FREQ_STAT change MIPS: ip22: Fix ip28 build for modern gcc MIPS: Update ip27_defconfig for SCSI_DH change MIPS: ip27: Disable qlge driver in defconfig MIPS: Update defconfigs for NF_CT_PROTO_DCCP/UDPLITE change crypto: improve gcc optimization flags for serpent and wp512 USB: serial: digi_acceleport: fix OOB-event processing USB: serial: digi_acceleport: fix OOB data sanity check Linux 4.4.54 drivers: hv: Turn off write permission on the hypercall page fat: fix using uninitialized fields of fat_inode/fsinfo_inode libceph: use BUG() instead of BUG_ON(1) drm/i915/dsi: Do not clear DPOUNIT_CLOCK_GATE_DISABLE from vlv_init_display_clock_gating fakelb: fix schedule while atomic drm/atomic: fix an error code in mode_fixup() drm/ttm: Make sure BOs being swapped out are cacheable drm/edid: Add EDID_QUIRK_FORCE_8BPC quirk for Rotel RSX-1058 drm/ast: Fix AST2400 POST failure without BMC FW or VBIOS drm/ast: Call open_key before enable_mmio in POST code drm/ast: Fix test for VGA enabled drm/amdgpu: add more cases to DCE11 possible crtc mask setup mac80211: flush delayed work when entering suspend xtensa: move parse_tag_fdt out of #ifdef CONFIG_BLK_DEV_INITRD pwm: pca9685: Fix period change with same duty cycle nlm: Ensure callback code also checks that the files match target: Fix NULL dereference during LUN lookup + active I/O shutdown ceph: remove req from unsafe list when unregistering it ktest: Fix child exit code processing IB/srp: Fix race conditions related 
to task management IB/srp: Avoid that duplicate responses trigger a kernel bug IB/IPoIB: Add destination address when re-queue packet IB/ipoib: Fix deadlock between rmmod and set_mode mnt: Tuck mounts under others instead of creating shadow/side mounts. net: mvpp2: fix DMA address calculation in mvpp2_txq_inc_put() s390: use correct input data address for setup_randomness s390: make setup_randomness work s390: TASK_SIZE for kernel threads s390/dcssblk: fix device size calculation in dcssblk_direct_access() s390/qdio: clear DSCI prior to scanning multiple input queues Bluetooth: Add another AR3012 04ca:3018 device KVM: VMX: use correct vmcs_read/write for guest segment selector/base KVM: s390: Disable dirty log retrieval for UCONTROL guests serial: 8250_pci: Add MKS Tenta SCOM-0800 and SCOM-0801 cards tty: n_hdlc: get rid of racy n_hdlc.tbuf TTY: n_hdlc, fix lockdep false positive Linux 4.4.53 scsi: lpfc: Correct WQ creation for pagesize MIPS: IP22: Fix build error due to binutils 2.25 uselessnes. MIPS: IP22: Reformat inline assembler code to modern standards. powerpc/xmon: Fix data-breakpoint dmaengine: ipu: Make sure the interrupt routine checks all interrupts. 
bcma: use (get|put)_device when probing/removing device driver md linear: fix a race between linear_add() and linear_congested() rtc: sun6i: Switch to the external oscillator rtc: sun6i: Add some locking NFSv4: fix getacl ERANGE for some ACL buffer sizes NFSv4: fix getacl head length estimation NFSv4: Fix memory and state leak in _nfs4_open_and_get_state nfsd: special case truncates some more nfsd: minor nfsd_setattr cleanup rtlwifi: rtl8192c-common: Fix "BUG: KASAN: rtlwifi: Fix alignment issues gfs2: Add missing rcu locking for glock lookup rdma_cm: fail iwarp accepts w/o connection params RDMA/core: Fix incorrect structure packing for booleans Drivers: hv: util: Backup: Fix a rescind processing issue Drivers: hv: util: Fcopy: Fix a rescind processing issue Drivers: hv: util: kvp: Fix a rescind processing issue hv: init percpu_list in hv_synic_alloc() hv: allocate synic pages for all present CPUs usb: gadget: udc: fsl: Add missing complete function. usb: host: xhci: plat: check hcc_params after add hcd usb: musb: da8xx: Remove CPPI 3.0 quirk and methods w1: ds2490: USB transfer buffers need to be DMAable w1: don't leak refcount on slave attach failure in w1_attach_slave_device() can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer iio: pressure: mpl3115: do not rely on structure field ordering iio: pressure: mpl115: do not rely on structure field ordering arm/arm64: KVM: Enforce unconditional flush to PoC when mapping to stage-2 fuse: add missing FR_FORCE crypto: testmgr - Pad aes_ccm_enc_tv_template vector ath9k: use correct OTP register offsets for the AR9340 and AR9550 ath9k: fix race condition in enabling/disabling IRQs ath5k: drop bogus warning on drv_set_key with unsupported cipher target: Fix multi-session dynamic se_node_acl double free OOPs target: Obtain se_node_acl->acl_kref during get_initiator_node_acl samples/seccomp: fix 64-bit comparison macros ext4: return EROFS if device is r/o and journal replay is needed ext4: preserve the needs_recovery 
flag when the journal is aborted ext4: fix inline data error paths ext4: fix data corruption in data=journal mode ext4: trim allocation requests to group size ext4: do not polute the extents cache while shifting extents ext4: Include forgotten start block on fallocate insert range loop: fix LO_FLAGS_PARTSCAN hang block/loop: fix race between I/O and set_status jbd2: don't leak modified metadata buffers on an aborted journal Fix: Disable sys_membarrier when nohz_full is enabled sd: get disk reference in sd_check_events() scsi: use 'scsi_device_from_queue()' for scsi_dh scsi: aacraid: Reorder Adapter status check scsi: storvsc: properly set residual data length on errors scsi: storvsc: properly handle SRB_ERROR when sense message is present scsi: storvsc: use tagged SRB requests if supported by the device dm stats: fix a leaked s->histogram_boundaries array dm cache: fix corruption seen when using cache > 2TB ipc/shm: Fix shmat mmap nil-page protection mm: do not access page->mapping directly on page_endio mm: vmpressure: fix sending wrong events on underflow mm/page_alloc: fix nodes for reclaim in fast path iommu/vt-d: Tylersburg isoch identity map check is done too late. 
iommu/vt-d: Fix some macros that are incorrectly specified in intel-iommu regulator: Fix regulator_summary for deviceless consumers staging: rtl: fix possible NULL pointer dereference ALSA: hda - Fix micmute hotkey problem for a lenovo AIO machine ALSA: hda - Add subwoofer support for Dell Inspiron 17 7000 Gaming ALSA: seq: Fix link corruption by event error handling ALSA: ctxfi: Fallback DMA mask to 32bit ALSA: timer: Reject user params with too small ticks ALSA: hda - fix Lewisburg audio issue ALSA: hda/realtek - Cannot adjust speaker's volume on a Dell AIO ARM: dts: at91: Enable DMA on sama5d2_xplained console ARM: dts: at91: Enable DMA on sama5d4_xplained console ARM: at91: define LPDDR types media: fix dm1105.c build error uvcvideo: Fix a wrong macro am437x-vpfe: always assign bpp variable MIPS: Handle microMIPS jumps in the same way as MIPS32/MIPS64 jumps MIPS: Calculate microMIPS ra properly when unwinding the stack MIPS: Fix is_jump_ins() handling of 16b microMIPS instructions MIPS: Fix get_frame_info() handling of microMIPS function size MIPS: Prevent unaligned accesses during stack unwinding MIPS: Clear ISA bit correctly in get_frame_info() MIPS: Lantiq: Keep ethernet enabled during boot MIPS: OCTEON: Fix copy_from_user fault handling for large buffers MIPS: BCM47XX: Fix button inversion for Asus WL-500W MIPS: Fix special case in 64 bit IP checksumming. 
samples: move mic/mpssd example code from Documentation Linux 4.4.52 kvm: vmx: ensure VMCS is current while enabling PML Revert "usb: chipidea: imx: enable CI_HDRC_SET_NON_ZERO_TTHA" rtlwifi: rtl_usb: Fix for URB leaking when doing ifconfig up/down block: fix double-free in the failure path of cgwb_bdi_init() goldfish: Sanitize the broken interrupt handler x86/platform/goldfish: Prevent unconditional loading USB: serial: ark3116: fix register-accessor error handling USB: serial: opticon: fix CTS retrieval at open USB: serial: spcp8x5: fix modem-status handling USB: serial: ftdi_sio: fix line-status over-reporting USB: serial: ftdi_sio: fix extreme low-latency setting USB: serial: ftdi_sio: fix modem-status error handling USB: serial: cp210x: add new IDs for GE Bx50v3 boards USB: serial: mos7840: fix another NULL-deref at open tty: serial: msm: Fix module autoload net: socket: fix recvmmsg not returning error from sock_error ip: fix IP_CHECKSUM handling irda: Fix lockdep annotations in hashbin_delete(). 
dccp: fix freeing skb too early for IPV6_RECVPKTINFO packet: Do not call fanout_release from atomic contexts packet: fix races in fanout_add() net/llc: avoid BUG_ON() in skb_orphan() blk-mq: really fix plug list flushing for nomerge queues rtc: interface: ignore expired timers when enqueuing new timers rtlwifi: rtl_usb: Fix missing entry in USB driver's private data Linux 4.4.51 mmc: core: fix multi-bit bus width without high-speed mode bcache: Make gc wakeup sane, remove set_task_state() ntb_transport: Pick an unused queue NTB: ntb_transport: fix debugfs_remove_recursive printk: use rcuidle console tracepoint ARM: 8658/1: uaccess: fix zeroing of 64-bit get_user() futex: Move futex_init() to core_initcall drm/dp/mst: fix kernel oops when turning off secondary monitor drm/radeon: Use mode h/vdisplay fields to hide out of bounds HW cursor Input: elan_i2c - add ELAN0605 to the ACPI table Fix missing sanity check in /dev/sg scsi: don't BUG_ON() empty DMA transfers fuse: fix use after free issue in fuse_dev_do_read() siano: make it work again with CONFIG_VMAP_STACK vfs: fix uninitialized flags in splice_to_pipe() Linux 4.4.50 l2tp: do not use udp_ioctl() ping: fix a null pointer dereference packet: round up linear to header len net: introduce device min_header_len sit: fix a double free on error path sctp: avoid BUG_ON on sctp_wait_for_sndbuf mlx4: Invoke softirqs after napi_reschedule macvtap: read vnet_hdr_size once tun: read vnet_hdr_sz once tcp: avoid infinite loop in tcp_splice_read() ipv6: tcp: add a missing tcp_v6_restore_cb() ip6_gre: fix ip6gre_err() invalid reads netlabel: out of bound access in cipso_v4_validate() ipv4: keep skb->dst around in presence of IP options net: use a work queue to defer net_disable_timestamp() work tcp: fix 0 divide in __tcp_select_window() ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim() ipv6: fix ip6_tnl_parse_tlv_enc_lim() can: Fix kernel panic at security_sock_rcv_skb Conflicts: drivers/scsi/sd.c 
drivers/usb/gadget/function/f_fs.c drivers/usb/host/xhci-plat.c CRs-Fixed: 2023471 Change-Id: I396051a8de30271af77b3890d4b19787faa1c31e Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
| * ipc/shm: Fix shmat mmap nil-page protection  Davidlohr Bueso  2017-03-12

    commit 95e91b831f87ac8e1f8ed50c14d709089b4e01b8 upstream.

    The issue is described here, with a nice testcase:

        https://bugzilla.kernel.org/show_bug.cgi?id=192931

    The problem is that shmat() calls do_mmap_pgoff() with MAP_FIXED, and the address rounded down to 0. For the regular mmap case, the protection mentioned above is that the kernel gets to generate the address -- arch_get_unmapped_area() will always check for MAP_FIXED and return that address. So by the time we do security_mmap_addr(0) things get funky for shmat().

    The testcase itself shows that while a regular user crashes, root will not have a problem attaching a nil-page. There are two possible fixes to this. The first, and which this patch does, is to simply allow root to crash as well -- this is also regular mmap behavior, ie when hacking up the testcase and adding mmap(... |MAP_FIXED). While this approach is the safer option, the second alternative is to ignore SHM_RND if the rounded address is 0, thus only having MAP_SHARED flags. This makes the behavior of shmat() identical to the mmap() case. The downside of this is obviously user visible, but does make sense in that it maintains semantics after the round-down wrt 0 address and mmap.

    Passes shm related ltp tests.

    Link: http://lkml.kernel.org/r/1486050195-18629-1-git-send-email-dave@stgolabs.net
    Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
    Reported-by: Gareth Evans <gareth.evans@contextis.co.uk>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
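
    The round-down that this commit message refers to is plain mask arithmetic, sketched below. The alignment value `EXAMPLE_SHMLBA` and both helper names are assumptions made for illustration; the real SHMLBA is architecture-specific, and this is not the kernel's code.

```c
#include <stdint.h>

/* Assumed alignment for illustration; the real SHMLBA depends on the arch. */
#define EXAMPLE_SHMLBA ((uintptr_t)4096)

/* Round addr down to a multiple of EXAMPLE_SHMLBA, as SHM_RND requests. */
uintptr_t round_down_shmlba(uintptr_t addr)
{
    return addr & ~(EXAMPLE_SHMLBA - 1);
}

/* Returns 1 when a nonzero address hint lands on page zero after rounding. */
int rounds_to_nil_page(uintptr_t addr)
{
    return addr != 0 && round_down_shmlba(addr) == 0;
}
```

    Any nonzero hint below the alignment rounds to address 0, which is how a fixed mapping request for the nil page can arise from an address that did not look like 0 to the caller.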
* | net: initialize variables to avoid UML compilation failure  Jeevan Shriram  2016-03-23

    While compiling for usermode linux for x86 architecture, observed compilation issues with probable usage of uninitialized variables. This change initializes the variables.

    Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org>
* ipc/shm: handle removed segments gracefully in shm_mmap()  Kirill A. Shutemov  2016-02-25

    commit 1ac0b6dec656f3f78d1c3dd216fad84cb4d0a01e upstream.

    remap_file_pages(2) emulation can reach file which represents removed IPC ID as long as a memory segment is mapped. It breaks expectations of IPC subsystem.

    Test case (rewritten to be more human readable, originally autogenerated by syzkaller[1]):

        #define _GNU_SOURCE
        #include <stdlib.h>
        #include <sys/ipc.h>
        #include <sys/mman.h>
        #include <sys/shm.h>

        #define PAGE_SIZE 4096

        int main()
        {
            int id;
            void *p;

            id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
            p = shmat(id, NULL, 0);
            shmctl(id, IPC_RMID, NULL);
            remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);

            return 0;
        }

    The patch changes shm_mmap() and code around shm_lock() to propagate locking error back to caller of shm_mmap().

    [1] http://github.com/google/syzkaller

    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Initialize msg/shm IPC objects before doing ipc_addid()  Linus Torvalds  2015-09-30

    As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before having initialized the IPC object state. Yes, we initialize the IPC object in a locked state, but with all the lockless RCU lookup work, that IPC object lock no longer means that the state cannot be seen.

    We already did this for the IPC semaphore code (see commit e8577d1f0329: "ipc/sem.c: fully initialize sem_array before making it visible") but we clearly forgot about msg and shm.

    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: Davidlohr Bueso <dbueso@suse.de>
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
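
    The ordering rule in this commit message (finish initializing an object before lockless readers can find it) can be sketched in user space with C11 atomics. This is an analogy only: `struct ipc_obj`, `published`, `publish()` and `reader_size()` are hypothetical names, and the kernel uses its own IDR and RCU primitives rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object; the fields are placeholders for real IPC state. */
struct ipc_obj {
    int perm;
    long size;
};

/* Lockless readers find the object only through this pointer. */
static _Atomic(struct ipc_obj *) published;

/* Fill in every field first, then make the object visible. */
void publish(struct ipc_obj *obj, int perm, long size)
{
    obj->perm = perm;
    obj->size = size;
    /* Release store: the field writes above are ordered before this store. */
    atomic_store_explicit(&published, obj, memory_order_release);
}

/* Reader: returns the size, or -1 if nothing has been published yet. */
long reader_size(void)
{
    struct ipc_obj *obj =
        atomic_load_explicit(&published, memory_order_acquire);

    return obj ? obj->size : -1;
}
```

    Storing the pointer before filling in the fields would be the analogue of calling ipc_addid() too early: a reader could then observe the object with uninitialized state.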
* ipc: convert invalid scenarios to use WARN_ON  Davidlohr Bueso  2015-09-10

    Considering Linus' past rants about the (ab)use of BUG in the kernel, I took a look at how we deal with such calls in ipc. Given that any errors or corruption in ipc code are most likely contained within the set of processes participating in the broken mechanisms, there aren't really many strong fatal system failure scenarios that would require a BUG call. Also, if something is seriously wrong, ipc might not be the place for such a BUG either.

    1. For example, recently, a customer hit one of these BUG_ONs in shm after failing shm_lock(). A busted ID imho does not merit a BUG_ON, and WARN would have been better.

    2. MSG_COPY functionality of posix msgrcv(2) for checkpoint/restore. I don't see how we can hit this anyway -- at least it should be IS_ERR. The 'copy' arg from do_msgrcv is always set by calling prepare_copy() first and foremost. We could also probably drop this check altogether. Either way, it does not merit a BUG_ON.

    3. No ->fault() callback for the fs getting the corresponding page -- seems selfish to make the system unusable.

    Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: use private shmem or hugetlbfs inodes for shm segments.  Stephen Smalley  2015-08-07

    The shm implementation internally uses shmem or hugetlbfs inodes for shm segments. As these inodes are never directly exposed to userspace and only accessed through the shm operations which are already hooked by security modules, mark the inodes with the S_PRIVATE flag so that inode security initialization and permission checking is skipped.

    This was motivated by the following lockdep warning:

        ======================================================
        [ INFO: possible circular locking dependency detected ]
        4.2.0-0.rc3.git0.1.fc24.x86_64+debug #1 Tainted: G W
        -------------------------------------------------------
        httpd/1597 is trying to acquire lock:
         (&ids->rwsem){+++++.}, at: shm_close+0x34/0x130

        but task is already holding lock:
         (&mm->mmap_sem){++++++}, at: SyS_shmdt+0x4b/0x180

        which lock already depends on the new lock.

        the existing dependency chain (in reverse order) is:

        -> #3 (&mm->mmap_sem){++++++}:
             lock_acquire+0xc7/0x270
             __might_fault+0x7a/0xa0
             filldir+0x9e/0x130
             xfs_dir2_block_getdents.isra.12+0x198/0x1c0 [xfs]
             xfs_readdir+0x1b4/0x330 [xfs]
             xfs_file_readdir+0x2b/0x30 [xfs]
             iterate_dir+0x97/0x130
             SyS_getdents+0x91/0x120
             entry_SYSCALL_64_fastpath+0x12/0x76

        -> #2 (&xfs_dir_ilock_class){++++.+}:
             lock_acquire+0xc7/0x270
             down_read_nested+0x57/0xa0
             xfs_ilock+0x167/0x350 [xfs]
             xfs_ilock_attr_map_shared+0x38/0x50 [xfs]
             xfs_attr_get+0xbd/0x190 [xfs]
             xfs_xattr_get+0x3d/0x70 [xfs]
             generic_getxattr+0x4f/0x70
             inode_doinit_with_dentry+0x162/0x670
             sb_finish_set_opts+0xd9/0x230
             selinux_set_mnt_opts+0x35c/0x660
             superblock_doinit+0x77/0xf0
             delayed_superblock_init+0x10/0x20
             iterate_supers+0xb3/0x110
             selinux_complete_init+0x2f/0x40
             security_load_policy+0x103/0x600
             sel_write_load+0xc1/0x750
             __vfs_write+0x37/0x100
             vfs_write+0xa9/0x1a0
             SyS_write+0x58/0xd0
             entry_SYSCALL_64_fastpath+0x12/0x76
        ...

    Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
    Reported-by: Morten Stevens <mstevens@fedoraproject.org>
    Acked-by: Hugh Dickins <hughd@google.com>
    Acked-by: Paul Moore <paul@paul-moore.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Eric Paris <eparis@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: rename ipc_obtain_object  Davidlohr Bueso  2015-06-30

    ... to ipc_obtain_object_idr, which is more meaningful and makes the code slightly easier to follow.

    Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc,shm: move BUG_ON check into shm_lock  Davidlohr Bueso  2015-06-30

    Upon every shm_lock call, we BUG_ON if an error was returned, indicating racing either in idr or in shm_destroy. Move this logic into the locking.

    [akpm@linux-foundation.org: simplify code]
    Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs  Linus Torvalds  2015-04-26

    Pull fourth vfs update from Al Viro:
     "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
      RCU pathwalk breakage when running into a symlink overmounting something
      fix I_DIO_WAKEUP definition
      direct-io: only inc/dec inode->i_dio_count for file systems
      fs/9p: fix readdir()
      VFS: assorted d_backing_inode() annotations
      VFS: fs/inode.c helpers: d_inode() annotations
      VFS: fs/cachefiles: d_backing_inode() annotations
      VFS: fs library helpers: d_inode() annotations
      VFS: assorted weird filesystems: d_inode() annotations
      VFS: normal filesystems (and lustre): d_inode() annotations
      VFS: security/: d_inode() annotations
      VFS: security/: d_backing_inode() annotations
      VFS: net/: d_inode() annotations
      VFS: net/unix: d_backing_inode() annotations
      VFS: kernel/: d_inode() annotations
      VFS: audit: d_backing_inode() annotations
      VFS: Fix up some ->d_inode accesses in the chelsio driver
      VFS: Cachefiles should perform fs modifications on the top layer only
      VFS: AF_UNIX sockets should call mknod on the top layer only
| * VFS: assorted weird filesystems: d_inode() annotations  David Howells  2015-04-15

    Signed-off-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | ipc: remove use of seq_printf return value  Joe Perches  2015-04-15

    The seq_printf return value, because it's frequently misused, will eventually be converted to void.

    See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to seq_has_overflowed() and make public")

    Signed-off-by: Joe Perches <joe@perches.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* shmdt: use i_size_read() instead of ->i_size  Dave Hansen  2014-12-13

    Andrew Morton noted

        http://lkml.kernel.org/r/20141104142027.a7a0d010772d84560b445f59@linux-foundation.org

    that the shmdt uses inode->i_size outside of i_mutex being held. There is one more case in shm.c in shm_destroy(). This converts both users over to use i_size_read().

    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc/shm.c: fix overly aggressive shmdt() when calls span multiple segments  Dave Hansen  2014-12-13

    This is a highly-contrived scenario. But, a single shmdt() call can be induced into unmapping memory from multiple shm segments. Example code is here:

        http://www.sr71.net/~dave/intel/shmfun.c

    The fix is pretty simple: Record the 'struct file' for the first VMA we encounter and then stick to it. Decline to unmap anything not from the same file and thus the same segment.

    I found this by inspection and the odds of anyone hitting this in practice are pretty darn small.

    Lightly tested, but it's a pretty small patch.

    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
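
    The "record the first file and stick to it" rule from this commit message can be sketched as follows. This models only the same-file test; `struct mapping` and `detach_same_file()` are hypothetical names, and the address-range and offset checks that a real shmdt() also performs are left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a VMA: which file backs it, and whether it is mapped. */
struct mapping {
    const void *file;
    int mapped;
};

/*
 * Starting at index `first`, unmap every mapping backed by the same file as
 * maps[first] and decline to touch the rest. Returns the number unmapped.
 */
size_t detach_same_file(struct mapping *maps, size_t n, size_t first)
{
    const void *file;
    size_t i, unmapped = 0;

    if (first >= n)
        return 0;

    file = maps[first].file;    /* remember the first file we encounter */
    for (i = first; i < n; i++) {
        if (maps[i].file != file)
            continue;           /* different segment: leave it alone */
        if (maps[i].mapped) {
            maps[i].mapped = 0;
            unmapped++;
        }
    }
    return unmapped;
}
```

    With mappings backed by files A, A, B, A, a detach starting at the first entry clears the three A mappings and leaves B mapped.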
* ipc/shm: kill the historical/wrong mm->start_stack check  Oleg Nesterov  2014-10-14

    do_shmat() is the only user of ->start_stack (proc just reports its value), and this check looks ugly and wrong.

    The reason for this check is not clear at all, and it wrongly assumes that the stack can only grow down.

    But the main problem is that in general mm->start_stack has nothing to do with stack_vma->vm_start. Not only can the application switch to another stack and even unmap this area, but setup_arg_pages() also expands the stack without updating mm->start_stack during exec(). This means that in the likely case "addr > start_stack - size - PAGE_SIZE * 5" is simply impossible after find_vma_intersection() == F, or the stack can't grow anyway because of RLIMIT_STACK.

    Many thanks to Hugh for his explanations.

    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Hugh Dickins <hughd@google.com>
    Cc: Cyrill Gorcunov <gorcunov@gmail.com>
    Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* shm: allow exit_shm in parallel if only marking orphans  Jack Miller  2014-08-08

    If shm_rmid_force (the default state) is not set then the shmids are only marked as orphaned, which does not require any add, delete, or locking of the tree structure.

    Separate the sysctl on and off cases, and only obtain the read lock. The newly added list head can be deleted under the read lock because we are only called with current and will only change the shmids allocated by this task and not manipulate the list.

    This commit assumes that up_read includes a sufficient memory barrier for the writes to be seen by others that later obtain a write lock.

    Signed-off-by: Milton Miller <miltonm@bga.com>
    Signed-off-by: Jack Miller <millerjo@us.ibm.com>
    Cc: Davidlohr Bueso <davidlohr@hp.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: Anton Blanchard <anton@samba.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* shm: make exit_shm work proportional to task activity  Jack Miller  2014-08-08

    This is a small set of patches our team has had kicking around for a few versions internally that fixes tasks getting hung on exit_shm when there are many threads hammering it at once.

    Anton wrote a simple test to cause the issue:

        http://ozlabs.org/~anton/junkcode/bust_shm_exit.c

    Before applying this patchset, this test code will cause either hanging tracebacks or pthread out of memory errors.

    After this patchset, it will still produce output like:

        root@somehost:~# ./bust_shm_exit 1024 160
        ...
        INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 116, t=2111 jiffies, g=241, c=240, q=7113)
        INFO: Stall ended before state dump start
        ...

    But the task will continue to run along happily, so we consider this an improvement over hanging, even if it's a bit noisy.

    This patch (of 3):

    exit_shm obtains the ipc_ns shm rwsem for write and holds it while it walks every shared memory segment in the namespace. Thus the amount of work is related to the number of shm segments in the namespace, not the number of segments that might need to be cleaned.

    In addition, this occurs after the task has been notified the thread has exited, so the number of tasks waiting for the ns shm rwsem can grow without bound until memory is exhausted.

    Add a list to the task struct of all shmids allocated by this task. Init the list head in copy_process. Use the ns->rwsem for locking. Add segments after id is added, remove before removing from id.

    On unshare of NEW_IPCNS orphan any ids as if the task had exited, similar to handling of semaphore undo.

    I chose a define for the init sequence since it's a simple list init, otherwise it would require a function call to avoid include loops between the semaphore code and the task struct. Converting the list_del to list_del_init for the unshare cases would remove the exit followed by init, but I left it to blow up if not inited.

    Signed-off-by: Milton Miller <miltonm@bga.com>
    Signed-off-by: Jack Miller <millerjo@us.ibm.com>
    Cc: Davidlohr Bueso <davidlohr@hp.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: Anton Blanchard <anton@samba.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc/shm.c: check for integer overflow during shmget.  Manfred Spraul  2014-06-06

    SHMMAX is the upper limit for the size of a shared memory segment, counted in bytes. The actual allocation is that size, rounded up to the next full page.

    Add a check that prevents the creation of segments where the rounded up size causes an integer overflow.

    Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
    Acked-by: Davidlohr Bueso <davidlohr@hp.com>
    Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc/shm.c: check for overflows of shm_tot  Manfred Spraul  2014-06-06

    shm_tot counts the total number of pages used by shm segments.

    If SHMALL is ULONG_MAX (or nearly ULONG_MAX), then the number can overflow. Subsequent calls to shmctl(,SHM_INFO,) would return wrong values for shm_tot.

    The patch adds a detection for overflows.

    Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
    Acked-by: Davidlohr Bueso <davidlohr@hp.com>
    Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc/shm.c: check for ulong overflows in shmat  Manfred Spraul  2014-06-06

    The increase of SHMMAX/SHMALL is a 4 patch series.

    The change itself is trivial, the only problems are integer overflows. The overflows are not new, but if we make huge values the default, then the code should be free from overflows.

    SHMMAX:

    - shmem_file_setup places a hard limit on the segment size: MAX_LFS_FILESIZE.
      On 32-bit, the limit is > 1 TB, i.e. 4 GB-1 byte segments are possible. Rounded up to full pages the actual allocated size is 0. --> must be fixed, patch 3

    - shmat:
      - find_vma_intersection does not handle overflows properly. --> must be fixed, patch 1
      - the rest is fine, do_mmap_pgoff limits mappings to TASK_SIZE and checks for overflows (i.e.: map 2 GB, starting from addr=2.5GB fails).

    SHMALL:

    - after creating 8192 segments size (1L<<63)-1, shm_tot overflows and returns 0. --> must be fixed, patch 2.

    Userspace:

    - Obviously, there could be overflows in userspace. There is nothing we can do, only use values smaller than ULONG_MAX. I ended with "ULONG_MAX - 1L<<24":
      - TASK_SIZE cannot be used because it is the size of the current task. Could be 4G if it's a 32-bit task on a 64-bit kernel.
      - The maximum size is not standardized across archs: I found TASK_MAX_SIZE, TASK_SIZE_MAX and TASK_SIZE_64.
      - Just in case some arch revives a 4G/4G split, nearly ULONG_MAX is a valid segment size.
      - Using "0" as a magic value for infinity is even worse, because right now 0 means 0, i.e. fail all allocations.

    This patch (of 4):

    find_vma_intersection() does not work as intended if addr+size overflows. The patch adds a manual check before the call to find_vma_intersection.

    Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
    Acked-by: Davidlohr Bueso <davidlohr@hp.com>
    Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
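To see why an interval-overlap test stops working when addr+size wraps, consider the following user-space sketch. Both helpers are hypothetical models: `toy_intersects()` is a naive overlap test standing in for the lookup, and `toy_range_is_sane()` is the kind of manual check the shmat patch describes.

```c
#include <limits.h>

/* Naive overlap test of [start, end) against [vm_start, vm_end). */
static int toy_intersects(unsigned long start, unsigned long end,
                          unsigned long vm_start, unsigned long vm_end)
{
    return start < vm_end && end > vm_start;
}

/*
 * Manual pre-check: returns 1 when addr + size does not wrap around,
 * 0 when it does. Unsigned overflow is well defined in C, so the
 * comparison is safe to evaluate.
 */
static int toy_range_is_sane(unsigned long addr, unsigned long size)
{
    return addr + size >= addr;
}
```

If the request starts near the top of the address space, the wrapped end address is tiny, so the naive test reports "no overlap" even for a mapping at the very same start address. Rejecting the range before the lookup avoids that false negative.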
* ipc, kernel: clear whitespace  Paul McQuade  2014-06-06

    trailing whitespace

    Signed-off-by: Paul McQuade <paulmcquad@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc, kernel: use Linux headers  Paul McQuade  2014-06-06

    Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
    Use #include <linux/types.h> instead of <asm/types.h>

    Signed-off-by: Paul McQuade <paulmcquad@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: constify ipc_ops  Mathias Krause  2014-06-06

    There is no need to recreate the very same ipc_ops structure on every kernel entry for msgget/semget/shmget. Just declare it static and be done with it. While at it, constify it as we don't modify the structure at runtime.

    Found in the PaX patch, written by the PaX Team.

    Signed-off-by: Mathias Krause <minipli@googlemail.com>
    Cc: PaX Team <pageexec@freemail.hu>
    Cc: Davidlohr Bueso <davidlohr@hp.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
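The static-and-const pattern from the ipc_ops commit looks roughly like this in miniature. The struct layout and every `toy_` name are invented for the sketch; the real ipc_ops has different members.

```c
/* Hypothetical ops table; the member names are made up for the sketch. */
struct toy_ipc_ops {
    int (*getnew)(int key);
    int (*associate)(int id);
};

static int toy_newseg(int key)   { return key + 1; }
static int toy_associate(int id) { return id; }

/*
 * Declaring the table static const means it is built once at compile
 * time and can live in read-only data, instead of being filled in on
 * the stack at every call.
 */
static int toy_shmget(int key)
{
    static const struct toy_ipc_ops shm_ops = {
        .getnew    = toy_newseg,
        .associate = toy_associate,
    };

    return shm_ops.getnew(key);
}
```

Because the table is const, an accidental write to one of its function pointers becomes a compile-time error.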
* ipc: standardize code comments  Davidlohr Bueso  2014-01-27

    IPC commenting style is all over the place, *especially* in util.c. This patch orders things a bit.

    Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
    Cc: Aswin Chandramouleeswaran <aswin@hp.com>
    Cc: Rik van Riel <riel@redhat.com>
    Acked-by: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: whitespace cleanup  Manfred Spraul  2014-01-27

    The ipc code does not adhere to the typical Linux coding style. This patch fixes lots of simple whitespace errors.

    - mostly autogenerated by
        scripts/checkpatch.pl -f --fix \
            --types=pointer_location,spacing,space_before_tab
    - one manual fixup (keep structure members tab-aligned)
    - removal of additional space_before_tab that were not found by --fix

    Tested with some of my msg and sem test apps.

    Andrew: Could you include it in -mm and move it towards Linus' tree?

    Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
    Suggested-by: Li Bin <huawei.libin@huawei.com>
    Cc: Joe Perches <joe@perches.com>
    Acked-by: Rafael Aquini <aquini@redhat.com>
    Cc: Davidlohr Bueso <davidlohr@hp.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: introduce ipc_valid_object() helper to sort out IPC_RMID races  Rafael Aquini  2014-01-27

    After the locking semantics for the SysV IPC API got improved, a couple of IPC_RMID race windows were opened because we ended up dropping the 'kern_ipc_perm.deleted' check performed way down in ipc_lock(). The spotted races got sorted out by re-introducing the old test within the racy critical sections.

    This patch introduces ipc_valid_object() to consolidate the way we cope with IPC_RMID races by using the same abstraction across the API implementation.

    Signed-off-by: Rafael Aquini <aquini@redhat.com>
    Acked-by: Rik van Riel <riel@redhat.com>
    Acked-by: Greg Thelen <gthelen@google.com>
    Reviewed-by: Davidlohr Bueso <davidlohr@hp.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc,shm: correct error return value in shmctl (SHM_UNLOCK)  Jesper Nilsson  2013-11-21

    Commit 2caacaa82a51 ("ipc,shm: shorten critical region for shmctl") restructured the ipc shm code to shorten the critical region, but introduced a path where the return value could be -EPERM, even if the operation actually was performed.

    Before the commit, the err return value was reset by the return value from security_shm_shmctl() after the if (!ns_capable(...)) statement.

    Now, we still exit the if statement with err set to -EPERM, and in the case of SHM_UNLOCK, it is not reset at all, and used as the return value from shmctl.

    To fix this, we only set err when errors occur, leaving the fallthrough case alone.

    Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
    Cc: Davidlohr Bueso <davidlohr@hp.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Michel Lespinasse <walken@google.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: <stable@vger.kernel.org> [3.12.x]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
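The "only set err when errors occur" rule from the SHM_UNLOCK fix can be illustrated with a stripped-down control-flow model. Everything here is hypothetical: `toy_shm_unlock()` and its flag parameters stand in for the capability and ownership checks, and the model covers only the shape of the error handling.

```c
#include <errno.h>

/*
 * Model of the fixed control flow: err starts at 0 and is assigned
 * only on a failing path, so an unprivileged caller that passes the
 * ownership check falls through with err still 0.
 *
 * has_cap   - caller holds the privileged capability (assumed flag)
 * is_owner  - caller owns the segment (assumed flag)
 * performed - set to 1 when the operation was actually carried out
 */
static int toy_shm_unlock(int has_cap, int is_owner, int *performed)
{
    int err = 0;

    *performed = 0;
    if (!has_cap) {
        if (!is_owner) {
            err = -EPERM;   /* assigned only where the error occurs */
            goto out;
        }
        /* fallthrough: err is deliberately left alone */
    }

    *performed = 1;         /* the unlock itself would happen here */
out:
    return err;
}
```

In the buggy shape, err would be preloaded with -EPERM before the inner check, so the fallthrough path would perform the operation and still report failure.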
* ipc,shm: fix shm_file deletion races  Greg Thelen  2013-11-21

    When IPC_RMID races with other shm operations there's potential for use-after-free of the shm object's associated file (shm_file).

    Here's the race before this patch:

        TASK 1                     TASK 2
        ------                     ------
        shm_rmid()
          ipc_lock_object()
                                   shmctl()
                                   shp = shm_obtain_object_check()

          shm_destroy()
            shm_unlock()
            fput(shp->shm_file)
                                   ipc_lock_object()
                                   shmem_lock(shp->shm_file)
                                   <OOPS>

    The oops is caused because shm_destroy() calls fput() after dropping the ipc_lock. fput() clears the file's f_inode, f_path.dentry, and f_path.mnt, which causes various NULL pointer references in task 2. I reliably see the oops in task 2 if with shmlock, shmu

    This patch fixes the races by:
    1) set shm_file=NULL in shm_destroy() while holding ipc_object_lock().
    2) modify at risk operations to check shm_file while holding ipc_object_lock().

    Example workloads, which each trigger oops...

    Workload 1:
        while true; do
          id=$(shmget 1 4096)
          shm_rmid $id &
          shmlock $id &
          wait
        done

    The oops stack shows accessing NULL f_inode due to racing fput:
        _raw_spin_lock
        shmem_lock
        SyS_shmctl

    Workload 2:
        while true; do
          id=$(shmget 1 4096)
          shmat $id 4096 &
          shm_rmid $id &
          wait
        done

    The oops stack is similar to workload 1 due to NULL f_inode:
        touch_atime
        shmem_mmap
        shm_mmap
        mmap_region
        do_mmap_pgoff
        do_shmat
        SyS_shmat

    Workload 3:
        while true; do
          id=$(shmget 1 4096)
          shmlock $id
          shm_rmid $id &
          shmunlock $id &
          wait
        done

    The oops stack shows the second fput tripping on a NULL f_inode. The first fput() completed via shm_destroy(), but a racing thread did a get_file() and queued this fput():
        locks_remove_flock
        __fput
        ____fput
        task_work_run
        do_notify_resume
        int_signal

    Fixes: c2c737a0461e ("ipc,shm: shorten critical region for shmat")
    Fixes: 2caacaa82a51 ("ipc,shm: shorten critical region for shmctl")
    Signed-off-by: Greg Thelen <gthelen@google.com>
    Cc: Davidlohr Bueso <davidlohr@hp.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Cc: <stable@vger.kernel.org> # 3.10.17+ 3.11.6+
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: fix race with LSMs  Davidlohr Bueso  2013-09-24

    Currently, IPC mechanisms do security and auditing related checks under RCU. However, since security modules can free the security structure, for example, through selinux_[sem,msg_queue,shm]_free_security(), we can race if the structure is freed before other tasks are done with it, creating a use-after-free condition. Manfred illustrates this nicely, for instance with shared mem and selinux:

        -> do_shmat calls rcu_read_lock()
        -> do_shmat calls shm_object_check().
           Checks that the object is still valid - but doesn't acquire any locks.
           Then it returns.
        -> do_shmat calls security_shm_shmat (e.g. selinux_shm_shmat)
        -> selinux_shm_shmat calls ipc_has_perm()
        -> ipc_has_perm accesses ipc_perms->security

        shm_close()
        -> shm_close acquires rw_mutex & shm_lock
        -> shm_close calls shm_destroy
        -> shm_destroy calls security_shm_free (e.g. selinux_shm_free_security)
        -> selinux_shm_free_security calls ipc_free_security(&shp->shm_perm)
        -> ipc_free_security calls kfree(ipc_perms->security)

    This patch delays the freeing of the security structures until after all RCU readers are done. Furthermore it aligns the security life cycle with that of the rest of IPC - freeing them based on the reference counter. For situations where we need not free security, the current behavior is kept. Linus states:

        "... the old behavior was suspect for another reason too: having the security blob go away from under a user sounds like it could cause various other problems anyway, so I think the old code was at least _prone_ to bugs even if it didn't have catastrophic behavior."

    I have tested this patch with IPC testcases from LTP on both my quad-core laptop and on a 64 core NUMA server. In both cases selinux is enabled, and tests pass for both voluntary and forced preemption models. While the mentioned races are theoretical (at least no one has reported them), I wanted to make sure that this new logic doesn't break anything we weren't aware of.

    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
    Acked-by: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc, shm: drop shm_lock_check  Davidlohr Bueso  2013-09-11

    This function was replaced by the lockless shm_obtain_object_check(), and no longer has any users.

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Cc: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc, shm: guard against non-existant vma in shmdt(2)  Davidlohr Bueso  2013-09-11

    When !CONFIG_MMU there's a chance we can dereference a NULL pointer when the VM area isn't found - check the return value of find_vma().

    Also, remove the redundant -EINVAL return: retval is set to the proper return code and *only* changed to 0 when we actually unmap the segments.

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Cc: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: rename ids->rw_mutex  Davidlohr Bueso  2013-09-11

    Since in some situations the lock can be shared for readers, we shouldn't be calling it a mutex; rename it to rwsem.

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc,shm: shorten critical region for shmat  Davidlohr Bueso  2013-09-11

    Similar to other system calls, acquire the kern_ipc_perm lock after doing the initial permission and security checks.

    [sasha.levin@oracle.com: dont leave do_shmat with rcu lock held]
    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc,shm: cleanup do_shmat pasta  Davidlohr Bueso  2013-09-11

    Clean up some of the messy do_shmat() spaghetti code, getting rid of out_free and out_put_dentry labels. This makes shortening the critical region of this function in the next patch a little easier to do and read.

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc,shm: shorten critical region for shmctl  Davidlohr Bueso  2013-09-11

    With the *_INFO, *_STAT, IPC_RMID and IPC_SET commands already optimized, deal with the remaining SHM_LOCK and SHM_UNLOCK commands. Take the shm_perm lock after doing the initial auditing and security checks. The rest of the logic remains unchanged.

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc,shm: make shmctl_nolock lockless  Davidlohr Bueso  2013-09-11

    While the INFO cmd doesn't take the ipc lock, the STAT commands do acquire it unnecessarily. We can do the permissions and security checks only holding the rcu lock.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc,shm: introduce shmctl_nolock  Davidlohr Bueso  2013-09-11

    Similar to semctl and msgctl, when calling shmctl, the *_INFO and *_STAT commands can be performed without acquiring the ipc object.

    Add a shmctl_nolock() function and move the logic of *_INFO and *_STAT out of shmctl(). Since we are just moving functionality, this change still takes the lock and it will be properly lockless in the next patch.

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc,shm: shorten critical region in shmctl_down  Davidlohr Bueso  2013-09-11

    Instead of holding the ipc lock for the entire function, use the ipcctl_pre_down_nolock and only acquire the lock for specific commands: RMID and SET.

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc,shm: introduce lockless functions to obtain the ipc object  Davidlohr Bueso  2013-09-11

    This is the third and final patchset that deals with reducing the amount of contention we impose on the ipc lock (kern_ipc_perm.lock). These changes mostly deal with shared memory, previous work has already been done for semaphores and message queues:

        http://lkml.org/lkml/2013/3/20/546 (sems)
        http://lkml.org/lkml/2013/5/15/584 (mqueues)

    With these patches applied, a custom shm microbenchmark stressing shmctl doing IPC_STAT with 4 threads a million times, reduces the execution time by 50%. A similar run, this time with IPC_SET, reduces the execution time from 3 mins and 35 secs to 27 seconds.

    Patches 1-8: replaces blindly taking the ipc lock for a smarter combination of rcu and ipc_obtain_object, only acquiring the spinlock when updating.

    Patch 9: renames the ids rw_mutex to rwsem, which is what it already was.

    Patch 10: is a trivial mqueue leftover cleanup

    Patch 11: adds a brief lock scheme description, requested by Andrew.

    This patch:

    Add shm_obtain_object() and shm_obtain_object_check(), which will allow us to get the ipc object without acquiring the lock. Just as with other forms of ipc, these functions are basically wrappers around ipc_obtain_object*().

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Manfred Spraul <manfred@colorfullife.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: move locking out of ipcctl_pre_down_nolock  Davidlohr Bueso  2013-07-09

    This function currently acquires both the rw_mutex and the rcu lock on successful lookups, leaving the callers to explicitly unlock them, creating another two level locking situation.

    Make the callers (including those that still use ipcctl_pre_down()) explicitly lock and unlock the rwsem and rcu lock.

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: close open coded spin lock calls  Davidlohr Bueso  2013-07-09

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc: move rcu lock out of ipc_addid  Davidlohr Bueso  2013-07-09

    This patchset continues the work that began in the sysv ipc semaphore scaling series, see

        https://lkml.org/lkml/2013/3/20/546

    Just like semaphores used to be, sysv shared memory and msg queues also abuse the ipc lock, unnecessarily holding it for operations such as permission and security checks.

    This patchset mostly deals with mqueues, and while shared mem can be done in a very similar way, I want to get these patches out in the open first. It also does some pending cleanups, mostly focused on the two level locking we have in ipc code, taking care of ipc_addid() and ipcctl_pre_down_nolock() - yes there are still functions that need to be updated as well.

    This patch:

    Make all callers explicitly take and release the RCU read lock.

    This addresses the two level locking seen in newary(), newseg() and newqueue(). For the last two, explicitly unlock the ipc object and the rcu lock, instead of calling the custom shm_unlock and msg_unlock functions. The next patch will deal with the open coded locking for ->perm.lock

    Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ipc/shmc.c: eliminate ugly 80-col tricks  Andrew Morton  2013-07-09

    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>