summaryrefslogtreecommitdiff
path: root/drivers/vhost (follow)
Commit message (Collapse)AuthorAge
* Merge 4.4.283 into android-4.4-pGreg Kroah-Hartman2021-09-03
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes in 4.4.283 can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters Revert "USB: serial: ch341: fix character loss at high transfer rates" USB: serial: option: add new VID/PID to support Fibocom FG150 e1000e: Fix the max snoop/no-snoop latency for 10M net: marvell: fix MVNETA_TX_IN_PRGRS bit number virtio: Improve vq->broken access to avoid any compiler optimization vringh: Use wiov->used to check for read/write desc order vt_kdsetmode: extend console locking fbmem: add margin check to fb_check_caps() Revert "floppy: reintroduce O_NDELAY fix" Linux 4.4.283 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I327e81b91a74a7dff9e1cfb71a7d833ff5f034ff
| * vringh: Use wiov->used to check for read/write desc orderNeeraj Upadhyay2021-09-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit e74cfa91f42c50f7f649b0eca46aa049754ccdbd ] As __vringh_iov() traverses a descriptor chain, it populates each descriptor entry into either read or write vring iov and increments that iov's ->used member. So, as we iterate over a descriptor chain, at any point, (riov/wriov)->used value gives the number of descriptor enteries available, which are to be read or written by the device. As all read iovs must precede the write iovs, wiov->used should be zero when we are traversing a read descriptor. Current code checks for wiov->i, to figure out whether any previous entry in the current descriptor chain was a write descriptor. However, iov->i is only incremented, when these vring iovs are consumed, at a later point, and remain 0 in __vringh_iov(). So, correct the check for read and write descriptor order, to use wiov->used. Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org> Link: https://lore.kernel.org/r/1624591502-4827-1-git-send-email-neeraju@codeaurora.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* | Merge 4.4.251 into android-4.4-pGreg Kroah-Hartman2021-01-12
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes in 4.4.251 kbuild: don't hardcode depmod path workqueue: Kick a worker based on the actual activation of delayed works lib/genalloc: fix the overflow when size is too big depmod: handle the case of /sbin/depmod without /sbin in PATH atm: idt77252: call pci_disable_device() on error path ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst() net: hns: fix return value check in __lb_other_process() net: hdlc_ppp: Fix issues when mod_timer is called while timer is running CDC-NCM: remove "connected" log message vhost_net: fix ubuf refcount incorrectly when sendmsg fails net: sched: prevent invalid Scell_log shift count virtio_net: Fix recursive call to cpus_read_lock() ethernet: ucc_geth: fix use-after-free in ucc_geth_remove() video: hyperv_fb: Fix the mmap() regression for v5.4.y and older usb: gadget: enable super speed plus USB: cdc-acm: blacklist another IR Droid device usb: chipidea: ci_hdrc_imx: add missing put_device() call in usbmisc_get_init_data() USB: xhci: fix U1/U2 handling for hardware with XHCI_INTEL_HOST quirk set usb: uas: Add PNY USB Portable SSD to unusual_uas USB: serial: iuu_phoenix: fix DMA from stack USB: serial: option: add LongSung M5710 module support USB: yurex: fix control-URB timeout handling USB: usblp: fix DMA to stack ALSA: usb-audio: Fix UBSAN warnings for MIDI jacks usb: gadget: select CONFIG_CRC32 usb: gadget: f_uac2: reset wMaxPacketSize usb: gadget: function: printer: Fix a memory leak for interface descriptor USB: gadget: legacy: fix return error code in acm_ms_bind() usb: gadget: Fix spinlock lockup on usb_function_deactivate usb: gadget: configfs: Preserve function ordering after bind failure USB: serial: keyspan_pda: remove unused variable x86/mm: Fix leak of pmd ptlock ALSA: hda/conexant: add a new hda codec CX11970 Revert "device property: Keep secondary firmware node secondary by type" netfilter: ipset: fix shift-out-of-bounds in htable_bits() netfilter: xt_RATEEST: reject non-null terminated string from userspace x86/mtrr: Correct the range check before performing MTRR type lookups Linux 4.4.251 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ic1f58f780b773ada8027b13b863c26e9d3e97e17
| * vhost_net: fix ubuf refcount incorrectly when sendmsg failsYunjian Wang2021-01-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 01e31bea7e622f1890c274f4aaaaf8bccd296aa5 ] Currently the vhost_zerocopy_callback() maybe be called to decrease the refcount when sendmsg fails in tun. The error handling in vhost handle_tx_zerocopy() will try to decrease the same refcount again. This is wrong. To fix this issue, we only call vhost_net_ubuf_put() when vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS. Fixes: bab632d69ee4 ("vhost: vhost TX zero-copy support") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/1609207308-20544-1-git-send-email-wangyunjian@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge 4.4.242 into android-4.4-pGreg Kroah-Hartman2020-11-10
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes in 4.4.242 SUNRPC: ECONNREFUSED should cause a rebind. scripts/setlocalversion: make git describe output more reliable powerpc/powernv/opal-dump : Use IRQ_HANDLED instead of numbers in interrupt handler efivarfs: Replace invalid slashes with exclamation marks in dentries. ravb: Fix bit fields checking in ravb_hwtstamp_get() tipc: fix memory leak caused by tipc_buf_append() mtd: lpddr: Fix bad logic in print_drs_error ata: sata_rcar: Fix DMA boundary mask fscrypt: return -EXDEV for incompatible rename or link into encrypted dir f2fs crypto: avoid unneeded memory allocation in ->readdir powerpc/powernv/smp: Fix spurious DBG() warning sparc64: remove mm_cpumask clearing to fix kthread_use_mm race f2fs: fix to check segment boundary during SIT page readahead um: change sigio_spinlock to a mutex xfs: fix realtime bitmap/summary file truncation when growing rt volume video: fbdev: pvr2fb: initialize variables ath10k: fix VHT NSS calculation when STBC is enabled mmc: via-sdmmc: Fix data race bug printk: reduce LOG_BUF_SHIFT range for H8300 kgdb: Make "kgdbcon" work properly with "kgdb_earlycon" USB: adutux: fix debugging drivers/net/wan/hdlc_fr: Correctly handle special skb->protocol values power: supply: test_power: add missing newlines when printing parameters by sysfs md/bitmap: md_bitmap_get_counter returns wrong blocks clk: ti: clockdomain: fix static checker warning net: 9p: initialize sun_server.sun_path to have addr's value only when addr is valid drivers: watchdog: rdc321x_wdt: Fix race condition bugs ext4: Detect already used quota file early gfs2: add validation checks for size of superblock memory: emif: Remove bogus debugfs error handling ARM: dts: s5pv210: move PMU node out of clock controller ARM: dts: s5pv210: remove dedicated 'audio-subsystem' node md/raid5: fix oops during stripe resizing leds: bcm6328, bcm6358: use devres LED registering function NFS: fix nfs_path in case of a rename retry ACPI / extlog: Check for RDMSR failure ACPI: video: use ACPI backlight for HP 635 Notebook acpi-cpufreq: Honor _PSD table setting on new AMD CPUs w1: mxc_w1: Fix timeout resolution problem leading to bus error scsi: mptfusion: Fix null pointer dereferences in mptscsih_remove() btrfs: reschedule if necessary when logging directory items vt: keyboard, simplify vt_kdgkbsent vt: keyboard, extend func_buf_lock to readers dmaengine: dma-jz4780: Fix race in jz4780_dma_tx_status iio:gyro:itg3200: Fix timestamp alignment and prevent data leak. powerpc/powernv/elog: Fix race while processing OPAL error log event. ubifs: dent: Fix some potential memory leaks while iterating entries ubi: check kthread_should_stop() after the setting of task state ia64: fix build error with !COREDUMP ceph: promote to unsigned long long before shifting libceph: clear con->out_msg on Policy::stateful_server faults 9P: Cast to loff_t before multiplying ring-buffer: Return 0 on success from ring_buffer_resize() vringh: fix __vringh_iov() when riov and wiov are different tty: make FONTX ioctl use the tty pointer they were actually passed arm64: berlin: Select DW_APB_TIMER_OF cachefiles: Handle readpage error correctly hil/parisc: Disable HIL driver when it gets stuck ARM: samsung: fix PM debug build with DEBUG_LL but !MMU ARM: s3c24xx: fix missing system reset device property: Keep secondary firmware node secondary by type device property: Don't clear secondary pointer for shared primary firmware node staging: comedi: cb_pcidas: Allow 2-channel commands for AO subdevice xen/events: don't use chip_data for legacy IRQs tipc: fix use-after-free in tipc_bcast_get_mode gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP gianfar: Account for Tx PTP timestamp in the skb headroom Fonts: Replace discarded const qualifier ALSA: usb-audio: Add implicit feedback quirk for Qu-16 ftrace: Fix recursion check for NMI test ftrace: Handle tracing when switching between context ARM: dts: sun4i-a10: fix cpu_alert temperature x86/kexec: Use up-to-dated screen_info copy to fill boot params of: Fix reserved-memory overlap detection scsi: core: Don't start concurrent async scan on same host vsock: use ns_capable_noaudit() on socket create vt: Disable KD_FONT_OP_COPY fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent serial: 8250_mtk: Fix uart_get_baud_rate warning serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init USB: serial: cyberjack: fix write-URB completion race USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231 USB: serial: option: add Telit FN980 composition 0x1055 USB: Add NO_LPM quirk for Kingston flash drive ARC: stack unwinding: avoid indefinite looping Revert "ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE" Linux 4.4.242 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I9c6408c403a91272e8255725dc8de294e522dc90
| * vringh: fix __vringh_iov() when riov and wiov are differentStefano Garzarella2020-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 5745bcfbbf89b158416075374254d3c013488f21 upstream. If riov and wiov are both defined and they point to different objects, only riov is initialized. If the wiov is not initialized by the caller, the function fails returning -EINVAL and printing "Readable desc 0x... after writable" error message. This issue happens when descriptors have both readable and writable buffers (eg. virtio-blk devices has virtio_blk_outhdr in the readable buffer and status as last byte of writable buffer) and we call __vringh_iov() to get both type of buffers in two different iovecs. Let's replace the 'else if' clause with 'if' to initialize both riov and wiov if they are not NULL. As checkpatch pointed out, we also avoid crashing the kernel when riov and wiov are both NULL, replacing BUG() with WARN_ON() and returning -EINVAL. Fixes: f87d0fbb5798 ("vringh: host-side implementation of virtio rings.") Cc: stable@vger.kernel.org Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201008204256.162292-1-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge 4.4.218 into android-4.4-pGreg Kroah-Hartman2020-04-02
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes in 4.4.218 spi: qup: call spi_qup_pm_resume_runtime before suspending powerpc: Include .BTF section ARM: dts: dra7: Add "dma-ranges" property to PCIe RC DT nodes spi/zynqmp: remove entry that causes a cs glitch drm/exynos: dsi: propagate error value and silence meaningless warning drm/exynos: dsi: fix workaround for the legacy clock name altera-stapl: altera_get_note: prevent write beyond end of 'key' USB: Disable LPM on WD19's Realtek Hub usb: quirks: add NO_LPM quirk for RTL8153 based ethernet adapters USB: serial: option: add ME910G1 ECM composition 0x110b usb: host: xhci-plat: add a shutdown USB: serial: pl2303: add device-id for HP LD381 ALSA: line6: Fix endless MIDI read loop ALSA: seq: virmidi: Fix running status after receiving sysex ALSA: seq: oss: Fix running status after receiving sysex ALSA: pcm: oss: Avoid plugin buffer overflow ALSA: pcm: oss: Remove WARNING from snd_pcm_plug_alloc() checks staging: rtl8188eu: Add device id for MERCUSYS MW150US v2 staging/speakup: fix get_word non-space look-ahead intel_th: Fix user-visible error codes rtc: max8907: add missing select REGMAP_IRQ memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event mm: slub: be more careful about the double cmpxchg of freelist mm, slub: prevent kmalloc_node crashes and memory leaks x86/mm: split vmalloc_sync_all() USB: cdc-acm: fix close_delay and closing_wait units in TIOCSSERIAL USB: cdc-acm: fix rounding error in TIOCSSERIAL kbuild: Disable -Wpointer-to-enum-cast futex: Fix inode life-time issue futex: Unbreak futex hashing ALSA: hda/realtek: Fix pop noise on ALC225 arm64: smp: fix smp_send_stop() behaviour Revert "drm/dp_mst: Skip validating ports during destruction, just ref" hsr: fix general protection fault in hsr_addr_is_self() net: dsa: Fix duplicate frames flooded by learning net_sched: cls_route: remove the right filter from hashtable net_sched: keep alloc_hash updated after hash allocation NFC: fdp: Fix a signedness bug in fdp_nci_send_patch() slcan: not call free_netdev before rtnl_unlock in slcan_open vxlan: check return value of gro_cells_init() hsr: use rcu_read_lock() in hsr_get_node_{list/status}() hsr: add restart routine into hsr_get_node_list() hsr: set .netnsok flag vhost: Check docket sk_family instead of call getname IB/ipoib: Do not warn if IPoIB debugfs doesn't exist uapi glibc compat: fix outer guard of net device flags enum KVM: VMX: Do not allow reexecute_instruction() when skipping MMIO instr drivers/hwspinlock: use correct radix tree API net: ipv4: don't let PMTU updates increase route MTU cpupower: avoid multiple definition with gcc -fno-common dt-bindings: net: FMan erratum A050385 scsi: ipr: Fix softlockup when rescanning devices in petitboot mac80211: Do not send mesh HWMP PREQ if HWMP is disabled sxgbe: Fix off by one in samsung driver strncpy size arg i2c: hix5hd2: add missed clk_disable_unprepare in remove perf probe: Do not depend on dwfl_module_addrsym() scripts/dtc: Remove redundant YYLOC global declaration scsi: sd: Fix optimal I/O size for devices that change reported values mac80211: mark station unauthorized before key removal genirq: Fix reference leaks on irq affinity notifiers vti[6]: fix packet tx through bpf_redirect() in XinY cases xfrm: fix uctx len check in verify_sec_ctx_len xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquire xfrm: policy: Fix doulbe free in xfrm_policy_timer vti6: Fix memory leak of skb if input policy check fails tools: Let O= makes handle a relative path with -C option USB: serial: option: add support for ASKEY WWHC050 USB: serial: option: add BroadMobi BM806U USB: serial: option: add Wistron Neweb D19Q1 USB: cdc-acm: restore capability check order USB: serial: io_edgeport: fix slab-out-of-bounds read in edge_interrupt_callback usb: musb: fix crash with highmen PIO and usbmon media: flexcop-usb: fix endpoint sanity check media: usbtv: fix control-message timeouts staging: rtl8188eu: Add ASUS USB-N10 Nano B1 to device table staging: wlan-ng: fix use-after-free Read in hfa384x_usbin_callback libfs: fix infoleak in simple_attr_read() media: ov519: add missing endpoint sanity checks media: dib0700: fix rc endpoint lookup media: stv06xx: add missing descriptor sanity checks media: xirlink_cit: add missing descriptor sanity checks vt: selection, introduce vc_is_sel vt: ioctl, switch VT_IS_IN_USE and VT_BUSY to inlines vt: switch vt_dont_switch to bool vt: vt_ioctl: remove unnecessary console allocation checks vt: vt_ioctl: fix VT_DISALLOCATE freeing in-use virtual console locking/atomic, kref: Add kref_read() vt: vt_ioctl: fix use-after-free in vt_in_use() bpf: Explicitly memset the bpf_attr structure net: ks8851-ml: Fix IO operations, again perf map: Fix off by one in strncpy() size argument Linux 4.4.218 Change-Id: I8de6cf91805269943a4c08f8b08e6a0b8539c08e Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
| * vhost: Check docket sk_family instead of call getnameEugenio Pérez2020-04-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 42d84c8490f9f0931786f1623191fcab397c3d64 ] Doing so, we save one call to get data we already have in the struct. Also, since there is no guarantee that getname use sockaddr_ll parameter beyond its size, we add a little bit of security here. It should do not do beyond MAX_ADDR_LEN, but syzbot found that ax25_getname writes more (72 bytes, the size of full_sockaddr_ax25, versus 20 + 32 bytes of sockaddr_ll + MAX_ADDR_LEN in syzbot repro). Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server") Reported-by: syzbot+f2a62d07a5198c819c7b@syzkaller.appspotmail.com Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* | commit e82b9b0727ff ("vhost: introduce vhost_exceeds_weight()")Cody Schuffelen2019-11-13
| | | | | | | | | | | | | | | | | | Complete backport of vhost_exceeds_weight to android-4.4-p Test: Compiles Bug: 143972019 Change-Id: Ifafc3321332d9033be75123e761d7da7d8586e4c Signed-off-by: Cody Schuffelen <schuffelen@google.com>
* | Merge 4.4.193 into android-4.4-pGreg Kroah-Hartman2019-09-16
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes in 4.4.193 ALSA: hda - Fix potential endless loop at applying quirks ALSA: hda/realtek - Fix overridden device-specific initialization xfrm: clean up xfrm protocol checks vhost/test: fix build for vhost test scripts/decode_stacktrace: match basepath using shell prefix operator, not regex clk: s2mps11: Add used attribute to s2mps11_dt_match x86, boot: Remove multiple copy of static function sanitize_boot_params() af_packet: tone down the Tx-ring unsupported spew. vhost: make sure log_num < in_num Linux 4.4.193 Change-Id: If2283bf8bc29f3deaf1c047c8ec9e502fbdf0521 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
| * vhost: make sure log_num < in_numyongduan2019-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 060423bfdee3f8bc6e2c1bac97de24d5415e2bc4 upstream. The code assumes log_num < in_num everywhere, and that is true as long as in_num is incremented by descriptor iov count, and log_num by 1. However this breaks if there's a zero sized descriptor. As a result, if a malicious guest creates a vring desc with desc.len = 0, it may cause the host kernel to crash by overflowing the log array. This bug can be triggered during the VM migration. There's no need to log when desc.len = 0, so just don't increment log_num in this case. Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server") Cc: stable@vger.kernel.org Reviewed-by: Lidong Chen <lidongchen@tencent.com> Signed-off-by: ruippan <ruippan@tencent.com> Signed-off-by: yongduan <yongduan@tencent.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * vhost/test: fix build for vhost testTiwei Bie2019-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 264b563b8675771834419057cbe076c1a41fb666 upstream. Since vhost_exceeds_weight() was introduced, callers need to specify the packet weight and byte weight in vhost_dev_init(). Note that, the packet weight isn't counted in this patch to keep the original behavior unchanged. Fixes: e82b9b0727ff ("vhost: introduce vhost_exceeds_weight()") Cc: stable@vger.kernel.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge 4.4.191 into android-4.4-pGreg Kroah-Hartman2019-09-06
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes in 4.4.191 HID: Add 044f:b320 ThrustMaster, Inc. 2 in 1 DT MIPS: kernel: only use i8253 clocksource with periodic clockevent netfilter: ebtables: fix a memory leak bug in compat bonding: Force slave speed check after link state recovery for 802.3ad can: dev: call netif_carrier_off() in register_candev() st21nfca_connectivity_event_received: null check the allocation st_nci_hci_connectivity_event_received: null check the allocation ASoC: ti: davinci-mcasp: Correct slot_width posed constraint net: usb: qmi_wwan: Add the BroadMobi BM818 card isdn: mISDN: hfcsusb: Fix possible null-pointer dereferences in start_isoc_chain() isdn: hfcsusb: Fix mISDN driver crash caused by transfer buffer on the stack perf bench numa: Fix cpu0 binding can: sja1000: force the string buffer NULL-terminated can: peak_usb: force the string buffer NULL-terminated NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim() net: cxgb3_main: Fix a resource leak in a error path in 'init_one()' net: hisilicon: make hip04_tx_reclaim non-reentrant net: hisilicon: fix hip04-xmit never return TX_BUSY net: hisilicon: Fix dma_map_single failed on arm64 libata: add SG safety checks in SFF pio transfers selftests: kvm: Adding config fragments HID: wacom: correct misreported EKR ring values Revert "dm bufio: fix deadlock with loop device" userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386 x86/apic: Handle missing global clockevent gracefully x86/boot: Save fields explicitly, zero out everything else x86/boot: Fix boot regression caused by bootparam sanitizing dm btree: fix order of block initialization in btree_split_beneath dm space map metadata: fix missing store of apply_bops() return value dm table: fix invalid memory accesses with too high sector number cgroup: Disable IRQs while holding css_set_lock GFS2: don't set rgrp gl_object until it's inserted into rgrp tree net: arc_emac: fix koops caused by sk_buff free vhost-net: set packet weight of tx polling to 2 * vq size vhost_net: use packet weight for rx handler, too vhost_net: introduce vhost_exceeds_weight() vhost: introduce vhost_exceeds_weight() vhost_net: fix possible infinite loop vhost: scsi: add weight support siphash: add cryptographically secure PRF siphash: implement HalfSipHash1-3 for hash tables inet: switch IP ID generator to siphash netfilter: ctnetlink: don't use conntrack/expect object addresses as id netfilter: conntrack: Use consistent ct id hash calculation Revert "perf test 6: Fix missing kvm module load for s390" x86/pm: Introduce quirk framework to save/restore extra MSR registers around suspend/resume x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h scsi: ufs: Fix NULL pointer dereference in ufshcd_config_vreg_hpm() dmaengine: ste_dma40: fix unneeded variable warning usb: gadget: composite: Clear "suspended" on reset/disconnect usb: host: fotg2: restart hcd after port reset tools: hv: fix KVP and VSS daemons exit code watchdog: bcm2835_wdt: Fix module autoload tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue ALSA: usb-audio: Fix a stack buffer overflow bug in check_input_term ALSA: usb-audio: Fix an OOB bug in parse_audio_mixer_unit tcp: make sure EPOLLOUT wont be missed ALSA: seq: Fix potential concurrent access to the deleted pool KVM: x86: Don't update RIP or do single-step on faulting emulation x86/apic: Do not initialize LDR and DFR for bigsmp x86/apic: Include the LDR when clearing out APIC registers usb-storage: Add new JMS567 revision to unusual_devs USB: cdc-wdm: fix race between write and disconnect due to flag abuse usb: host: ohci: fix a race condition between shutdown and irq USB: storage: ums-realtek: Update module parameter description for auto_delink_en USB: storage: ums-realtek: Whitelist auto-delink support ptrace,x86: Make user_64bit_mode() available to 32-bit builds uprobes/x86: Fix detection of 32-bit user mode mmc: sdhci-of-at91: add quirk for broken HS200 mmc: core: Fix init of SD cards reporting an invalid VDD range stm class: Fix a double free of stm_source_device VMCI: Release resource if the work is already queued Revert "cfg80211: fix processing world regdomain when non modular" mac80211: fix possible sta leak x86/ptrace: fix up botched merge of spectrev1 fix Linux 4.4.191 Change-Id: Ic9a2554d2ba45f9c17478f1dfb5115e1a3bc3bd7 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
| * vhost: scsi: add weight supportJason Wang2019-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c1ea02f15ab5efb3e93fc3144d895410bf79fcf2 upstream. This patch will check the weight and exit the loop if we exceeds the weight. This is useful for preventing scsi kthread from hogging cpu which is guest triggerable. This addresses CVE-2019-3900. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Fixes: 057cbf49a1f0 ("tcm_vhost: Initial merge for vhost level target fabric driver") Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [bwh: Backported to 4.4: - Drop changes in vhost_scsi_ctl_handle_vq() - Adjust context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
| * vhost_net: fix possible infinite loopJason Wang2019-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e2412c07f8f3040593dfb88207865a3cd58680c0 upstream. When the rx buffer is too small for a packet, we will discard the vq descriptor and retry it for the next packet: while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk, &busyloop_intr))) { ... /* On overrun, truncate and discard */ if (unlikely(headcount > UIO_MAXIOV)) { iov_iter_init(&msg.msg_iter, READ, vq->iov, 1, 1); err = sock->ops->recvmsg(sock, &msg, 1, MSG_DONTWAIT | MSG_TRUNC); pr_debug("Discarded rx packet: len %zd\n", sock_len); continue; } ... } This makes it possible to trigger a infinite while..continue loop through the co-opreation of two VMs like: 1) Malicious VM1 allocate 1 byte rx buffer and try to slow down the vhost process as much as possible e.g using indirect descriptors or other. 2) Malicious VM2 generate packets to VM1 as fast as possible Fixing this by checking against weight at the end of RX and TX loop. This also eliminate other similar cases when: - userspace is consuming the packets in the meanwhile - theoretical TOCTOU attack if guest moving avail index back and forth to hit the continue after vhost find guest just add new buffers This addresses CVE-2019-3900. Fixes: d8316f3991d20 ("vhost: fix total length when packets are too short") Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server") Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [bwh: Backported to 4.4: - Both Tx modes are handled in one loop in handle_tx() - Adjust context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
| * vhost: introduce vhost_exceeds_weight()Jason Wang2019-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e82b9b0727ff6d665fff2d326162b460dded554d upstream. We used to have vhost_exceeds_weight() for vhost-net to: - prevent vhost kthread from hogging the cpu - balance the time spent between TX and RX This function could be useful for vsock and scsi as well. So move it to vhost.c. Device must specify a weight which counts the number of requests, or it can also specific a byte_weight which counts the number of bytes that has been processed. Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [bwh: Backported to 4.4: - Drop changes to vhost_vsock - In vhost_net, both Tx modes are handled in one loop in handle_tx() - Adjust context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
| * vhost_net: introduce vhost_exceeds_weight()Jason Wang2019-09-06
| | | | | | | | | | | | | | | | | | | | commit 272f35cba53d088085e5952fd81d7a133ab90789 upstream. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
| * vhost_net: use packet weight for rx handler, tooPaolo Abeni2019-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit db688c24eada63b1efe6d0d7d835e5c3bdd71fd3 upstream. Similar to commit a2ac99905f1e ("vhost-net: set packet weight of tx polling to 2 * vq size"), we need a packet-based limit for handler_rx, too - elsewhere, under rx flood with small packets, tx can be delayed for a very long time, even without busypolling. The pkt limit applied to handle_rx must be the same applied by handle_tx, or we will get unfair scheduling between rx and tx. Tying such limit to the queue length makes it less effective for large queue length values and can introduce large process scheduler latencies, so a constant valued is used - likewise the existing bytes limit. The selected limit has been validated with PVP[1] performance test with different queue sizes: queue size 256 512 1024 baseline 366 354 362 weight 128 715 723 670 weight 256 740 745 733 weight 512 600 460 583 weight 1024 423 427 418 A packet weight of 256 gives peek performances in under all the tested scenarios. No measurable regression in unidirectional performance tests has been detected. [1] https://developers.redhat.com/blog/2017/06/05/measuring-and-comparing-open-vswitch-performance/ Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
| * vhost-net: set packet weight of tx polling to 2 * vq sizehaibinzhang(张海斌)2019-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a2ac99905f1ea8b15997a6ec39af69aa28a3653b upstream. handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy polling udp packets with small length(e.g. 1byte udp payload), because setting VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet length. Ping-Latencies shown below were tested between two Virtual Machines using netperf (UDP_STREAM, len=1), and then another machine pinged the client: vq size=256 Packet-Weight Ping-Latencies(millisecond) min avg max Origin 3.319 18.489 57.303 64 1.643 2.021 2.552 128 1.825 2.600 3.224 256 1.997 2.710 4.295 512 1.860 3.171 4.631 1024 2.002 4.173 9.056 2048 2.257 5.650 9.688 4096 2.093 8.508 15.943 vq size=512 Packet-Weight Ping-Latencies(millisecond) min avg max Origin 6.537 29.177 66.245 64 2.798 3.614 4.403 128 2.861 3.820 4.775 256 3.008 4.018 4.807 512 3.254 4.523 5.824 1024 3.079 5.335 7.747 2048 3.944 8.201 12.762 4096 4.158 11.057 19.985 Seems pretty consistent, a small dip at 2 VQ sizes. Ring size is a hint from device about a burst size it can tolerate. Based on benchmarks, set the weight to 2 * vq size. To evaluate this change, another tests were done using netperf(RR, TX) between two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was tweaked through qemu. Results shown below does not show obvious changes. vq size=256 TCP_RR vq size=512 TCP_RR size/sessions/+thu%/+normalize% size/sessions/+thu%/+normalize% 1/ 1/ -7%/ -2% 1/ 1/ 0%/ -2% 1/ 4/ +1%/ 0% 1/ 4/ +1%/ 0% 1/ 8/ +1%/ -2% 1/ 8/ 0%/ +1% 64/ 1/ -6%/ 0% 64/ 1/ +7%/ +3% 64/ 4/ 0%/ +2% 64/ 4/ -1%/ +1% 64/ 8/ 0%/ 0% 64/ 8/ -1%/ -2% 256/ 1/ -3%/ -4% 256/ 1/ -4%/ -2% 256/ 4/ +3%/ +4% 256/ 4/ +1%/ +2% 256/ 8/ +2%/ 0% 256/ 8/ +1%/ -1% vq size=256 UDP_RR vq size=512 UDP_RR size/sessions/+thu%/+normalize% size/sessions/+thu%/+normalize% 1/ 1/ -5%/ +1% 1/ 1/ -3%/ -2% 1/ 4/ +4%/ +1% 1/ 4/ -2%/ +2% 1/ 8/ -1%/ -1% 1/ 8/ -1%/ 0% 64/ 1/ -2%/ -3% 64/ 1/ +1%/ +1% 64/ 4/ -5%/ -1% 64/ 4/ +2%/ 0% 64/ 8/ 0%/ -1% 64/ 8/ -2%/ +1% 256/ 1/ +7%/ +1% 256/ 1/ -7%/ 0% 256/ 4/ +1%/ +1% 256/ 4/ -3%/ -4% 256/ 8/ +2%/ +2% 256/ 8/ +1%/ +1% vq size=256 TCP_STREAM vq size=512 TCP_STREAM size/sessions/+thu%/+normalize% size/sessions/+thu%/+normalize% 64/ 1/ 0%/ -3% 64/ 1/ 0%/ 0% 64/ 4/ +3%/ -1% 64/ 4/ -2%/ +4% 64/ 8/ +9%/ -4% 64/ 8/ -1%/ +2% 256/ 1/ +1%/ -4% 256/ 1/ +1%/ +1% 256/ 4/ -1%/ -1% 256/ 4/ -3%/ 0% 256/ 8/ +7%/ +5% 256/ 8/ -3%/ 0% 512/ 1/ +1%/ 0% 512/ 1/ -1%/ -1% 512/ 4/ +1%/ -1% 512/ 4/ 0%/ 0% 512/ 8/ +7%/ -5% 512/ 8/ +6%/ -1% 1024/ 1/ 0%/ -1% 1024/ 1/ 0%/ +1% 1024/ 4/ +3%/ 0% 1024/ 4/ +1%/ 0% 1024/ 8/ +8%/ +5% 1024/ 8/ -1%/ 0% 2048/ 1/ +2%/ +2% 2048/ 1/ -1%/ 0% 2048/ 4/ +1%/ 0% 2048/ 4/ 0%/ -1% 2048/ 8/ -2%/ 0% 2048/ 8/ 5%/ -1% 4096/ 1/ -2%/ 0% 4096/ 1/ -2%/ 0% 4096/ 4/ +2%/ 0% 4096/ 4/ 0%/ 0% 4096/ 8/ +9%/ -2% 4096/ 8/ -5%/ -1% Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Haibin Zhang <haibinzhang@tencent.com> Signed-off-by: Yunfang Tai <yunfangtai@tencent.com> Signed-off-by: Lidong Chen <lidongchen@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
* | Merge 4.4.187 into android-4.4-pGreg Kroah-Hartman2019-08-04
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes in 4.4.187 MIPS: ath79: fix ar933x uart parity mode MIPS: fix build on non-linux hosts dmaengine: imx-sdma: fix use-after-free on probe error path ath10k: Do not send probe response template for mesh ath9k: Check for errors when reading SREV register ath6kl: add some bounds checking ath: DFS JP domain W56 fixed pulse type 3 RADAR detection batman-adv: fix for leaked TVLV handler. media: dvb: usb: fix use after free in dvb_usb_device_exit crypto: talitos - fix skcipher failure due to wrong output IV media: marvell-ccic: fix DMA s/g desc number calculation media: vpss: fix a potential NULL pointer dereference net: stmmac: dwmac1000: Clear unused address entries signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig af_key: fix leaks in key_pol_get_resp and dump_sp. xfrm: Fix xfrm sel prefix length validation media: staging: media: davinci_vpfe: - Fix for memory leak if decoder initialization fails. net: phy: Check against net_device being NULL tua6100: Avoid build warnings. locking/lockdep: Fix merging of hlocks with non-zero references media: wl128x: Fix some error handling in fm_v4l2_init_video_device() cpupower : frequency-set -r option misses the last cpu in related cpu list net: fec: Do not use netdev messages too early net: axienet: Fix race condition causing TX hang s390/qdio: handle PENDING state for QEBSM devices perf test 6: Fix missing kvm module load for s390 gpio: omap: fix lack of irqstatus_raw0 for OMAP4 gpio: omap: ensure irq is enabled before wakeup regmap: fix bulk writes on paged registers bpf: silence warning messages in core rcu: Force inlining of rcu_read_lock() xfrm: fix sa selector validation perf evsel: Make perf_evsel__name() accept a NULL argument vhost_net: disable zerocopy by default EDAC/sysfs: Fix memory leak when creating a csrow object media: i2c: fix warning same module names ntp: Limit TAI-UTC offset timer_list: Guard procfs specific code acpi/arm64: ignore 5.1 FADTs that are reported as 5.0 media: coda: fix mpeg2 sequence number handling media: coda: increment sequence offset for the last returned frame mt7601u: do not schedule rx_tasklet when the device has been disconnected x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c mt7601u: fix possible memory leak when the device is disconnected ath10k: fix PCIE device wake up failed rslib: Fix decoding of shortened codes rslib: Fix handling of of caller provided syndrome ixgbe: Check DDM existence in transceiver before access EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush() Bluetooth: hci_bcsp: Fix memory leak in rx_skb Bluetooth: 6lowpan: search for destination address in all peers Bluetooth: Check state in l2cap_disconnect_rsp Bluetooth: validate BLE connection interval updates crypto: ghash - fix unaligned memory access in ghash_setkey() crypto: arm64/sha1-ce - correct digest for empty data in finup crypto: arm64/sha2-ce - correct digest for empty data in finup Input: gtco - bounds check collection indent level regulator: s2mps11: Fix buck7 and buck8 wrong voltages tracing/snapshot: Resize spare buffer if size changed NFSv4: Handle the special Linux file open access mode lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE ALSA: seq: Break too long mutex context in the write loop media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom() media: coda: Remove unbalanced and unneeded mutex unlock KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed drm/nouveau/i2c: Enable i2c pads & busses during preinit padata: use smp_mb in padata_reorder to avoid orphaned padata jobs 9p/virtio: Add cleanup path in p9_virtio_init PCI: Do not poll for PME if the device is in D3cold take floppy compat ioctls to sodding floppy.c floppy: fix div-by-zero in setup_format_params floppy: fix out-of-bounds read in next_valid_format floppy: fix invalid pointer dereference in drive_name floppy: fix out-of-bounds read in copy_buffer coda: pass the host file in vma->vm_file on mmap gpu: ipu-v3: ipu-ic: Fix saturation bit offset in TPMEM parisc: Fix kernel panic due invalid values in IAOQ0 or IAOQ1 powerpc/32s: fix suspend/resume when IBATs 4-7 are used powerpc/watchpoint: Restore NV GPRs while returning from exception eCryptfs: fix a couple type promotion bugs intel_th: msu: Fix single mode with disabled IOMMU Bluetooth: Add SMP workaround Microsoft Surface Precision Mouse bug usb: Handle USB3 remote wakeup for LPM enabled devices correctly dm bufio: fix deadlock with loop device bnx2x: Prevent load reordering in tx completion processing caif-hsi: fix possible deadlock in cfhsi_exit_module() ipv4: don't set IPv6 only flags to IPv4 addresses net: bcmgenet: use promisc for unsupported filters net: neigh: fix multiple neigh timer scheduling nfc: fix potential illegal memory access sky2: Disable MSI on ASUS P6T netrom: fix a memory leak in nr_rx_frame() netrom: hold sock when setting skb->destructor tcp: Reset bytes_acked and bytes_received when disconnecting bonding: validate ip header before check IPPROTO_IGMP net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query net: bridge: stp: don't cache eth dest pointer before skb pull elevator: fix truncation of icq_cache_name NFSv4: Fix open create exclusive when the server reboots nfsd: increase DRC cache limit nfsd: give out fewer session slots as limit approaches nfsd: fix performance-limiting session calculation nfsd: Fix overflow causing non-working mounts on 1 TB machines drm/panel: simple: Fix panel_simple_dsi_probe usb: core: hub: Disable hub-initiated U1/U2 tty: max310x: Fix invalid baudrate divisors calculator pinctrl: rockchip: fix leaked of_node references tty: serial: cpm_uart - fix init when SMC is relocated memstick: Fix error cleanup path of memstick_init tty/serial: digicolor: Fix digicolor-usart already registered warning tty: serial: msm_serial: avoid system lockup condition drm/virtio: Add memory barriers for capset cache. phy: renesas: rcar-gen2: Fix memory leak at error paths usb: gadget: Zero ffs_io_data powerpc/pci/of: Fix OF flags parsing for 64bit BARs PCI: sysfs: Ignore lockdep for remove attribute iio: iio-utils: Fix possible incorrect mask calculation recordmcount: Fix spurious mcount entries on powerpc mfd: core: Set fwnode for created devices mfd: arizona: Fix undefined behavior um: Silence lockdep complaint about mmap_sem powerpc/4xx/uic: clear pending interrupt after irq type/pol change serial: sh-sci: Fix TX DMA buffer flushing and workqueue races kallsyms: exclude kasan local symbols on s390 perf test mmap-thread-lookup: Initialize variable to suppress memory sanitizer warning f2fs: avoid out-of-range memory access mailbox: handle failed named mailbox channel request powerpc/eeh: Handle hugepages in ioremap space sh: prevent warnings when using iounmap mm/kmemleak.c: fix check for softirq context 9p: pass the correct prototype to read_cache_page mm/mmu_notifier: use hlist_add_head_rcu() locking/lockdep: Fix lock used or unused stats error locking/lockdep: Hide unused 'class' variable usb: wusbcore: fix unbalanced get/put cluster_id usb: pci-quirks: Correct AMD PLL quirk detection x86/sysfb_efi: Add quirks for some devices with swapped width and height x86/speculation/mds: Apply more accurate check on hypervisor platform hpet: Fix division by zero in hpet_time_div() ALSA: line6: Fix wrong altsetting for LINE6_PODHD500_1 ALSA: hda - Add a conexant codec entry to let mute led work powerpc/tm: Fix oops on sigreturn on systems without TM access: avoid the RCU grace period for the temporary subjective credentials vmstat: Remove BUG_ON from vmstat_update mm, vmstat: make quiet_vmstat lighter ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt tcp: reset sk_send_head in tcp_write_queue_purge ISDN: hfcsusb: checking idx of ep configuration media: cpia2_usb: first wake up, then free in disconnect media: radio-raremono: change devm_k*alloc to k*alloc Bluetooth: hci_uart: check for missing tty operations sched/fair: Don't free p->numa_faults with concurrent readers drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl ceph: hold i_ceph_lock when removing caps for freeing inode Linux 4.4.187 Change-Id: I6086b23376cdf9f6a905f727fb07175a7ebdd356 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
| * vhost_net: disable zerocopy by defaultJason Wang2019-08-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 098eadce3c622c07b328d0a43dda379b38cf7c5e ] Vhost_net was known to suffer from HOL[1] issues which is not easy to fix. Several downstream disable the feature by default. What's more, the datapath was split and datacopy path got the support of batching and XDP support recently which makes it faster than zerocopy part for small packets transmission. It looks to me that disable zerocopy by default is more appropriate. It cold be enabled by default again in the future if we fix the above issues. [1] https://patchwork.kernel.org/patch/3787671/ Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* | UPSTREAM: vhost/vsock: fix reset orphans race with close timeoutStefan Hajnoczi2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit c38f57da428b033f2721b611d84b1f40bde674a8 ] If a local process has closed a connected socket and hasn't received a RST packet yet, then the socket remains in the table until a timeout expires. When a vhost_vsock instance is released with the timeout still pending, the socket is never freed because vhost_vsock has already set the SOCK_DONE flag. Check if the close timer is pending and let it close the socket. This prevents the race which can leak sockets. Reported-by: Maximilian Riemensberger <riemensberger@cadami.net> Cc: Graham Whaley <graham.whaley@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 06ec6679fe12cacafce68ab7b509586482a2ae1b) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: I6b4564aebecc3938b789e56499a3f9b8bbbeb2a1
* | UPSTREAM: vhost/vsock: fix use-after-free in network stack callersStefan Hajnoczi2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 834e772c8db0c6a275d75315d90aba4ebbb1e249 ] If the network stack calls .send_pkt()/.cancel_pkt() during .release(), a struct vhost_vsock use-after-free is possible. This occurs because .release() does not wait for other CPUs to stop using struct vhost_vsock. Switch to an RCU-enabled hashtable (indexed by guest CID) so that .release() can wait for other CPUs by calling synchronize_rcu(). This also eliminates vhost_vsock_lock acquisition in the data path so it could have a positive effect on performance. This is CVE-2018-14625 "kernel: use-after-free Read in vhost_transport_send_pkt". Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+bd391451452fb0b93039@syzkaller.appspotmail.com Reported-by: syzbot+e3e074963495f92a89ed@syzkaller.appspotmail.com Reported-by: syzbot+d5a0a170c5069658b141@syzkaller.appspotmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 569fc4ffb5de8f12fe01759f0b85098b7b9bba8e) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: I87f1e0fbe3fc01ccc18924085e33220373856e29
* | UPSTREAM: vhost: correctly check the iova range when waking virtqueueJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 2d66f997f0545c8f7fc5cf0b49af1decb35170e7 ] We don't wakeup the virtqueue if the first byte of pending iova range is the last byte of the range we just got updated. This will lead a virtqueue to wait for IOTLB updating forever. Fixing by correct the check and wake up the virtqueue in this case. Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API") Reported-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit a21a39a9c37b8f629633d22a29cab69bbce38261) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: I2cd64a1d7e195a508a3385e3afe4ecf22632770f
* | UPSTREAM: vhost: synchronize IOTLB message with dev cleanupJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 1b15ad683ab42a203f98b67045b40720e99d0e9a ] DaeRyong Jeong reports a race between vhost_dev_cleanup() and vhost_process_iotlb_msg(): Thread interleaving: CPU0 (vhost_process_iotlb_msg) CPU1 (vhost_dev_cleanup) (In the case of both VHOST_IOTLB_UPDATE and VHOST_IOTLB_INVALIDATE) ===== ===== vhost_umem_clean(dev->iotlb); if (!dev->iotlb) { ret = -EFAULT; break; } dev->iotlb = NULL; The reason is we don't synchronize between them, fixing by protecting vhost_process_iotlb_msg() with dev mutex. Reported-by: DaeRyong Jeong <threeearcat@gmail.com> Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit f833209e15bd6cf066e731463308f0058736a74b) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: I5f372cf9e2d0cf0002ebe01d692e33c3ff52f5d2
* | UPSTREAM: vhost: fix info leak due to uninitialized memoryMichael S. Tsirkin2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 670ae9caaca467ea1bfd325cb2a5c98ba87f94ad upstream. struct vhost_msg within struct vhost_msg_node is copied to userspace. Unfortunately it turns out on 64 bit systems vhost_msg has padding after type which gcc doesn't initialize, leaking 4 uninitialized bytes to userspace. This padding also unfortunately means 32 bit users of this interface are broken on a 64 bit kernel which will need to be fixed separately. Fixes: CVE-2018-1118 Cc: stable@vger.kernel.org Reported-by: Kevin Easton <kevin@guarana.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reported-by: syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 9681c3bdb098f6c87a0422b6b63912c1b90ad197) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: Ie5a29c3946792ae0f20e04015ba28c89fd90becb
* | UPSTREAM: vhost: fix vhost_vq_access_ok() log checkStefan Hajnoczi2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d14d2b78090c7de0557362b26a4ca591aa6a9faa ] Commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ("vhost: validate log when IOTLB is enabled") introduced a regression. The logic was originally: if (vq->iotlb) return 1; return A && B; After the patch the short-circuit logic for A was inverted: if (A || vq->iotlb) return A; return B; This patch fixes the regression by rewriting the checks in the obvious way, no longer returning A when vq->iotlb is non-NULL (which is hard to understand). Reported-by: syzbot+65a84dde0214b0387ccd@syzkaller.appspotmail.com Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 72de9891b5f46f1f98e7e6243c47076a4b4daa3c) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: I63938aaa9f1cf44e4eb1a18693f8c7963eff927e
* | UPSTREAM: vhost: validate log when IOTLB is enabledJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ] Vq log_base is the userspace address of bitmap which has nothing to do with IOTLB. So it needs to be validated unconditionally otherwise we may try use 0 as log_base which may lead to pin pages that will lead unexpected result (e.g trigger BUG_ON() in set_bit_to_user()). Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 992dbb1d5a653c49a7f2159f75d9cda0d8248f1f) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: I64ef687ec8f55800ce300d0a0eac5473853bf6fd
* | UPSTREAM: vhost_net: add missing lock nesting notationJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit aaa3149bbee9ba9b4e6f0bd6e3e7d191edeae942 ] We try to hold TX virtqueue mutex in vhost_net_rx_peek_head_len() after RX virtqueue mutex is held in handle_rx(). This requires an appropriate lock nesting notation to calm down deadlock detector. Fixes: 0308813724606 ("vhost_net: basic polling support") Reported-by: syzbot+7f073540b1384a614e09@syzkaller.appspotmail.com Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 1cb81756b7c3813f7d65d3c6bb1133cb53b69775) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: Ia72badca8f5f92095cfad91051fe47f76e6b03b7
* | UPSTREAM: vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()Jason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e9cb4239134c860e5f92c75bf5321bd377bb505b upstream. We used to call mutex_lock() in vhost_dev_lock_vqs() which tries to hold mutexes of all virtqueues. This may confuse lockdep to report a possible deadlock because of trying to hold locks belong to same class. Switch to use mutex_lock_nested() to avoid false positive. Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit bd3ccdc6f922c6b7db4b7075d1b6596ddb986a98) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: I20829e721706c10ef1696de0d78dd99110e40a6b
* | UPSTREAM: vhost/vsock: fix uninitialized vhost_vsock->guest_cidStefan Hajnoczi2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a72b69dc083a931422cc8a5e33841aff7d5312f2 upstream. The vhost_vsock->guest_cid field is uninitialized when /dev/vhost-vsock is opened until the VHOST_VSOCK_SET_GUEST_CID ioctl is called. kvmalloc(..., GFP_KERNEL | __GFP_RETRY_MAYFAIL) does not zero memory. All other vhost_vsock fields are initialized explicitly so just initialize this field too. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Daniel Verkamp <dverkamp@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 258d8549b55e2cd5d73d9ff3d18b39aeeab08e1f) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: I230006942ceb015f6549449160205cbb501b8fa6
* | UPSTREAM: vhost_net: correctly check tx avail during rx busy pollingJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 8b949bef9172ca69d918e93509a4ecb03d0355e0 ] We check tx avail through vhost_enable_notify() in the past which is wrong since it only checks whether or not guest has filled more available buffer since last avail idx synchronization which was just done by vhost_vq_avail_empty() before. What we really want is checking pending buffers in the avail ring. Fix this by calling vhost_vq_avail_empty() instead. This issue could be noticed by doing netperf TCP_RR benchmark as client from guest (but not host). With this fix, TCP_RR from guest to localhost restores from 1375.91 trans per sec to 55235.28 trans per sec on my laptop (Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz). Fixes: 030881372460 ("vhost_net: basic polling support") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit f5755c0e870056dd35c95a0b5c0a038cdb4382ee) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: Ia61d229fb5c82e64fd1c30aff06a9f938d20e467
* | UPSTREAM: vhost-vsock: add pkt cancel capabilityPeng Tao2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 16320f363ae128d9b9c70e60f00f2a572f57c23d ] To allow canceling all packets of a connection. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 482b3f92aea249b031ba0bc24df73f3c48f1f5c2) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: I1a06010bbcbcf03e1557d053b773cd6a2c1c0b02
* | UPSTREAM: vhost: fix initialization for vq->is_leHalil Pasic2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit cda8bba0f99d25d2061c531113c14fa41effc3ae upstream. Currently, under certain circumstances vhost_init_is_le does just a part of the initialization job, and depends on vhost_reset_is_le being called too. For this reason vhost_vq_init_access used to call vhost_reset_is_le when vq->private_data is NULL. This is not only counter intuitive, but also real a problem because it breaks vhost_net. The bug was introduced to vhost_net with commit 2751c9882b94 ("vhost: cross-endian support for legacy devices"). The symptom is corruption of the vq's used.idx field (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost shutdown on a vq with pending descriptors. Let us make sure the outcome of vhost_init_is_le never depend on the state it is actually supposed to initialize, and fix virtio_net by removing the reset from vhost_vq_init_access. With the above, there is no reason for vhost_reset_is_le to do just half of the job. Let us make vhost_reset_is_le reinitialize is_le. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reported-by: Michael A. Tebolt <miket@us.ibm.com> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixes: commit 2751c9882b94 ("vhost: cross-endian support for legacy devices") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Michael A. Tebolt <miket@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 1594edd9ea0d75ef106bffc23c2b07b509f3301c) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: I5436f3410dc32fe059c904c8001fbe05feae756d
* | UPSTREAM: vhost/vsock: handle vhost_vq_init_access() errorStefan Hajnoczi2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 0516ffd88fa0d006ee80389ce14a9ca5ae45e845 ] Propagate the error when vhost_vq_init_access() fails and set vq->private_data to NULL. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit ae36f6a65af6f4eaca01cc5b68d8ecb266dbcc17) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: I1e062ae003b92d000967921014a31bfc574632d8
* | UPSTREAM: vsock: lookup and setup guest_cid inside vhost_vsock_lockGao feng2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 6c083c2b8a0a110cad936bc0a2c089f0d8115175 ] Multi vsocks may setup the same cid at the same time. Signed-off-by: Gao feng <omarapazanadi@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 2d5a1b31799efcd37a57fe4d2f492d8dc2a0a334) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: Ic2fa852a2080a4df1cb3b823c02d65a354c15b34
* | UPSTREAM: vhost-vsock: fix orphan connection resetPeng Tao2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | local_addr.svm_cid is host cid. We should check guest cid instead, which is remote_addr.svm_cid. Otherwise we end up resetting all connections to all guests. Cc: stable@vger.kernel.org [4.8+] Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit c4587631c7bad47c045e081d1553cd73a23be59a) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: Ib9bf9e1f0befa71f4ebc0ff0cdcef7115a41b110
* | UPSTREAM: vhost/vsock: fix vhost virtio_vsock_pkt use-after-freeStefan Hajnoczi2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stash the packet length in a local variable before handing over ownership of the packet to virtio_transport_recv_pkt() or virtio_transport_free_pkt(). This patch solves the use-after-free since pkt is no longer guaranteed to be alive. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 3fda5d6e580193fa005014355b3a61498f1b3ae0) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: I2a6a8b2eb1b647645ff7c76a37f61dce3b0fab9f
* | UPSTREAM: VSOCK: Use kvfree()Wei Yongjun2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | Use kvfree() instead of open-coding it. Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit b226acab2f6aaa45c2af27279b63f622b23a44bd) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: Ie270449a727baf74ee547137928e166e772b62ad
* | BACKPORT: vhost: split out vringh KconfigMichael S. Tsirkin2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vringh is pulled in by caif and mic, but the other vhost config does not need to be there. In particular, it makes no sense to have vhost net/scsi/sock under caif/mic. Create a separate Kconfig file and put vringh bits there. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 4d93824561057d54712066544609dfc7453b210f) [astrachan: Backported around no mic driver on 4.4] Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: I6e630228bf91a8569bc665e9ad5d399eca8f7384
* | UPSTREAM: vhost: drop vringh dependencyMichael S. Tsirkin2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | vringh isn't used by vhost net or scsi - it's used by CAIF only at the moment. Drop the dependency. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 6190efb08c16dcd68c64b096a28f47ab33f017d7) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: I2373bf9a5cd21d350af3aca957df811c6aaeae63
* | UPSTREAM: vhost: detect 32 bit integer wrap aroundMichael S. Tsirkin2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | Detect and fail early if long wrap around is triggered. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit ec33d031a14b3c5dd516627139c9550350dbba3e) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: Id71c1ea0355ce3e403bb5865dc3056d197fe218b
* | UPSTREAM: VSOCK: Add Makefile and KconfigAsias He2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | Enable virtio-vsock and vhost-vsock. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 304ba62fd4e670c1a5784585da0fac9f7309ef6c) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: I0b0f08cd28a94516903dbf3452e5999375e0f85a
* | UPSTREAM: VSOCK: Introduce vhost_vsock.koAsias He2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | VM sockets vhost transport implementation. This driver runs on the host. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 433fc58e6bf2c8bd97e57153ed28e64fd78207b8) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: Id90d852ffd498a7d89075cddb6d8ed0b9af5e69f
* | UPSTREAM: vhost: new device IOTLB APIJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch tries to implement an device IOTLB for vhost. This could be used with userspace(qemu) implementation of DMA remapping to emulate an IOMMU for the guest. The idea is simple, cache the translation in a software device IOTLB (which is implemented as an interval tree) in vhost and use vhost_net file descriptor for reporting IOTLB miss and IOTLB update/invalidation. When vhost meets an IOTLB miss, the fault address, size and access can be read from the file. After userspace finishes the translation, it writes the translated address to the vhost_net file to update the device IOTLB. When device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM all vq addresses set by ioctl are treated as iova instead of virtual address and the accessing can only be done through IOTLB instead of direct userspace memory access. Before each round or vq processing, all vq metadata is prefetched in device IOTLB to make sure no translation fault happens during vq processing. In most cases, virtqueues are contiguous even in virtual address space. The IOTLB translation for virtqueue itself may make it a little slower. We might add fast path cache on top of this patch. Signed-off-by: Jason Wang <jasowang@redhat.com> [mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ] [mst: fix build warnings ] Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [ weiyj.lk: missing unlock on error ] Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> (cherry picked from commit 6b1e6cc7855b09a0a9bfa1d9f30172ba366f161c) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + vsock adb tunnel Signed-off-by: Cody Schuffelen <schuffelen@google.com> Change-Id: I10e4e64d6bc9b36a0d9b444c2319e290921c63c6
* | BACKPORT: vhost: convert pre sorted vhost memory array to interval treeAlistair Strachan2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current pre-sorted memory region array has some limitations for future device IOTLB conversion: 1) need extra work for adding and removing a single region, and it's expected to be slow because of sorting or memory re-allocation. 2) need extra work of removing a large range which may intersect several regions with different size. 3) need trick for a replacement policy like LRU To overcome the above shortcomings, this patch convert it to interval tree which can easily address the above issue with almost no extra work. The patch could be used for: - Extend the current API and only let the userspace to send diffs of memory table. - Simplify Device IOTLB implementation. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit a9709d6874d55130663567577a9b05c35138cc6b) [astrachan: Backported around stable backport 711df71 ("vhost_net: stop device during reset owner")] Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: I51c7c7229908a5ce1f082a80eeda5a01e85c2234
* | UPSTREAM: vhost: introduce vhost memory accessorsJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces vhost memory accessors which were just wrappers for userspace address access helpers. This is a requirement for vhost device iotlb implementation which will add iotlb translations in those accessors. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit bfe2bc512884d0b1c5297a15350f940ca80e439b) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: Ia67c171384109e646f55027a71dd8df9c6b9c61a
* | UPSTREAM: vhost_net: stop polling socket during rx processingJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't stop rx polling socket during rx processing, this will lead unnecessary wakeups from under layer net devices (E.g sock_def_readable() form tun). Rx will be slowed down in this way. This patch avoids this by stop polling socket during rx processing. A small drawback is that this introduces some overheads in light load case because of the extra start/stop polling, but single netperf TCP_RR does not notice any change. In a super heavy load case, e.g using pktgen to inject packet to guest, we get about ~8.8% improvement on pps: before: ~1240000 pkt/s after: ~1350000 pkt/s Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 8241a1e466cd56e6c10472cac9c1ad4e54bc65db) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: Ia51c61a6c1976b6f6406342f369ffc365d964078
* | UPSTREAM: vhost: lockless enqueuingJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We use spinlock to synchronize the work list now which may cause unnecessary contentions. So this patch switch to use llist to remove this contention. Pktgen tests shows about 5% improvement: Before: ~1300000 pps After: ~1370000 pps Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 04b96e5528ca97199b429810fe963185a67dd40e) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: Icf032cf2010eaedd92dc5906d454274709b4a9b4
* | UPSTREAM: vhost: simplify work flushingJason Wang2019-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to implement the work flushing through tracking queued seq, done seq, and the number of flushing. This patch simplify this by just implement work flushing through another kind of vhost work with completion. This will be used by lockless enqueuing patch. Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 7235acdb1144460d9f520f0d931f3cbb79eb244c) Bug: 121166534 Test: Ran cuttlefish with android-4.4 + VSOCKETS, VMWARE_VMCI_VSOCKETS Signed-off-by: Alistair Strachan <astrachan@google.com> Change-Id: Id3903050706dd734a0217c56ad8ca99b2b22471e