| Commit message (Collapse) | Author | Age |
| |\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* refs/heads/tmp-aab9adb
Linux 4.4.179
kernel/sysctl.c: fix out-of-bounds access when setting file-max
Revert "locking/lockdep: Add debug_locks check in __lock_downgrade()"
ALSA: info: Fix racy addition/deletion of nodes
mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
device_cgroup: fix RCU imbalance in error case
sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
Revert "kbuild: use -Oz instead of -Os when using clang"
mac80211: do not call driver wake_tx_queue op during reconfig
kprobes: Fix error check when reusing optimized probes
kprobes: Mark ftrace mcount handler functions nokprobe
x86/kprobes: Verify stack frame on kretprobe
arm64: futex: Restore oldval initialization to work around buggy compilers
crypto: x86/poly1305 - fix overflow during partial reduction
ALSA: core: Fix card races between register and disconnect
staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf
staging: comedi: ni_usb6501: Fix use of uninitialized mutex
staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
staging: comedi: vmk80xx: Fix use of uninitialized semaphore
io: accel: kxcjk1013: restore the range after resume.
iio: adc: at91: disable adc channel interrupt in timeout case
iio: ad_sigma_delta: select channel when reading register
iio/gyro/bmg160: Use millidegrees for temperature scale
KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
tpm/tpm_i2c_atmel: Return -E2BIG when the transfer is incomplete
modpost: file2alias: check prototype of handler
modpost: file2alias: go back to simple devtable lookup
crypto: crypto4xx - properly set IV after de- and encrypt
ipv4: ensure rcu_read_lock() in ipv4_link_failure()
ipv4: recompile ip options in ipv4_link_failure
tcp: tcp_grow_window() needs to respect tcp_space()
net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv
net: bridge: multicast: use rcu to access port list from br_multicast_start_querier
net: atm: Fix potential Spectre v1 vulnerabilities
bonding: fix event handling for stacked bonds
appletalk: Fix compile regression
ovl: fix uid/gid when creating over whiteout
tpm/tpm_crb: Avoid unaligned reads in crb_recv()
include/linux/swap.h: use offsetof() instead of custom __swapoffset macro
lib/div64.c: off by one in shift
appletalk: Fix use-after-free in atalk_proc_exit
ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t
iommu/dmar: Fix buffer overflow during PCI bus notification
crypto: sha512/arm - fix crash bug in Thumb2 build
crypto: sha256/arm - fix crash bug in Thumb2 build
cifs: fallback to older infolevels on findfirst queryinfo retry
ACPI / SBS: Fix GPE storm on recent MacBookPro's
ARM: samsung: Limit SAMSUNG_PM_CHECK config option to non-Exynos platforms
serial: uartps: console_setup() can't be placed to init section
f2fs: fix to do sanity check with current segment number
9p locks: add mount option for lock retry interval
9p: do not trust pdu content for stat item size
rsi: improve kernel thread handling to fix kernel panic
ext4: prohibit fstrim in norecovery mode
fix incorrect error code mapping for OBJECTID_NOT_FOUND
x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error
iommu/vt-d: Check capability before disabling protected memory
x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors
x86/hpet: Prevent potential NULL pointer dereference
perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test()
perf tests: Fix a memory leak of cpu_map object in the openat_syscall_event_on_all_cpus test
perf evsel: Free evsel->counts in perf_evsel__exit()
perf top: Fix error handling in cmd_top()
tools/power turbostat: return the exit status of a command
thermal/int340x_thermal: fix mode setting
thermal/int340x_thermal: Add additional UUIDs
ALSA: opl3: fix mismatch between snd_opl3_drum_switch definition and declaration
mmc: davinci: remove extraneous __init annotation
IB/mlx4: Fix race condition between catas error reset and aliasguid flows
ALSA: sb8: add a check for request_region
ALSA: echoaudio: add a check for ioremap_nocache
ext4: report real fs size after failed resize
ext4: add missing brelse() in add_new_gdb_meta_bg()
perf/core: Restore mmap record type correctly
PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller
xtensa: fix return_address
sched/fair: Do not re-read ->h_load_next during hierarchical load calculation
xen: Prevent buffer overflow in privcmd ioctl
arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value
ARM: dts: at91: Fix typo in ISC_D0 on PC9
genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
block: do not leak memory in bio_copy_user_iov()
ASoC: fsl_esai: fix channel swap issue when stream starts
include/linux/bitrev.h: fix constant bitrev
ALSA: seq: Fix OOB-reads from strlcpy
ip6_tunnel: Match to ARPHRD_TUNNEL6 for dev type
net: ethtool: not call vzalloc for zero sized memory request
netns: provide pure entropy for net_hash_mix()
tcp: Ensure DCTCP reacts to losses
sctp: initialize _pad of sockaddr_in before copying to user memory
qmi_wwan: add Olicard 600
openvswitch: fix flow actions reallocation
net: rds: force to destroy connection if t_sock is NULL in rds_tcp_kill_sock().
ipv6: sit: reset ip header pointer in ipip6_rcv
ipv6: Fix dangling pointer when ipv6 fragment
tty: ldisc: add sysctl to prevent autoloading of ldiscs
tty: mark Siemens R3964 line discipline as BROKEN
lib/string.c: implement a basic bcmp
x86/vdso: Drop implicit common-page-size linker flag
x86: vdso: Use $LD instead of $CC to link
x86/build: Specify elf_i386 linker emulation explicitly for i386 objects
kbuild: clang: choose GCC_TOOLCHAIN_DIR not on LD
binfmt_elf: switch to new creds when switching to new mm
drm/dp/mst: Configure no_stop_bit correctly for remote i2c xfers
dmaengine: tegra: avoid overflow of byte tracking
x86/build: Mark per-CPU symbols as absolute explicitly for LLD
wlcore: Fix memory leak in case wl12xx_fetch_firmware failure
regulator: act8865: Fix act8600_sudcdc_voltage_ranges setting
media: s5p-jpeg: Check for fmt_ver_flag when doing fmt enumeration
netfilter: physdev: relax br_netfilter dependency
dmaengine: imx-dma: fix warning comparison of distinct pointer types
hpet: Fix missing '=' character in the __setup() code of hpet_mmap_enable
soc/tegra: fuse: Fix illegal free of IO base address
hwrng: virtio - Avoid repeated init of completion
media: mt9m111: set initial frame size other than 0x0
tty: increase the default flip buffer limit to 2*640K
ARM: avoid Cortex-A9 livelock on tight dmb loops
mt7601u: bump supported EEPROM version
soc: qcom: gsbi: Fix error handling in gsbi_probe()
ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe
cdrom: Fix race condition in cdrom_sysctl_register
fbdev: fbmem: fix memory access if logo is bigger than the screen
bcache: improve sysfs_strtoul_clamp()
bcache: fix input overflow to sequential_cutoff
bcache: fix input overflow to cache set sysfs file io_error_halflife
ALSA: PCM: check if ops are defined before suspending PCM
ARM: 8833/1: Ensure that NEON code always compiles with Clang
kprobes: Prohibit probing on bsearch()
leds: lp55xx: fix null deref on firmware load failure
media: mx2_emmaprp: Correct return type for mem2mem buffer helpers
media: s5p-g2d: Correct return type for mem2mem buffer helpers
media: s5p-jpeg: Correct return type for mem2mem buffer helpers
media: sh_veu: Correct return type for mem2mem buffer helpers
SoC: imx-sgtl5000: add missing put_device()
perf test: Fix failure of 'evsel-tp-sched' test on s390
scsi: megaraid_sas: return error when create DMA pool failed
IB/mlx4: Increase the timeout for CM cache
e1000e: Fix -Wformat-truncation warnings
mmc: omap: fix the maximum timeout setting
ARM: 8840/1: use a raw_spinlock_t in unwind
coresight: etm4x: Add support to enable ETMv4.2
scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
usb: chipidea: Grab the (legacy) USB PHY by phandle first
tools lib traceevent: Fix buffer overflow in arg_eval
fs: fix guard_bio_eod to check for real EOD errors
cifs: Fix NULL pointer dereference of devname
dm thin: add sanity checks to thin-pool and external snapshot creation
cifs: use correct format characters
fs/file.c: initialize init_files.resize_wait
f2fs: do not use mutex lock in atomic context
ocfs2: fix a panic problem caused by o2cb_ctl
mm/slab.c: kmemleak no scan alien caches
mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512!
mm/page_ext.c: fix an imbalance with kmemleak
mm/cma.c: cma_declare_contiguous: correct err handling
enic: fix build warning without CONFIG_CPUMASK_OFFSTACK
sysctl: handle overflow for file-max
gpio: gpio-omap: fix level interrupt idling
tracing: kdb: Fix ftdump to not sleep
h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux-
CIFS: fix POSIX lock leak and invalid ptr deref
tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped
Bluetooth: Fix decrementing reference count twice in releasing socket
i2c: core-smbus: prevent stack corruption on read I2C_BLOCK_DATA
mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
tty/serial: atmel: Add is_half_duplex helper
lib/int_sqrt: optimize initial value compute
ext4: cleanup bh release code in ext4_ind_remove_space()
arm64: debug: Ensure debug handlers check triggering exception level
arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
Make arm64 serial port config compatible with crosvm
Fix merge issue with 4.4.178
Fix merge issue with 4.4.177
ANDROID: cuttlefish_defconfig: Enable CONFIG_OVERLAY_FS
Change-Id: I0d6e7b00f0198867803d5fe305ce13e205cc7518
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
|
| | |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Changes in 4.4.179
arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
arm64: debug: Ensure debug handlers check triggering exception level
ext4: cleanup bh release code in ext4_ind_remove_space()
lib/int_sqrt: optimize initial value compute
tty/serial: atmel: Add is_half_duplex helper
mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
i2c: core-smbus: prevent stack corruption on read I2C_BLOCK_DATA
Bluetooth: Fix decrementing reference count twice in releasing socket
tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped
CIFS: fix POSIX lock leak and invalid ptr deref
h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux-
tracing: kdb: Fix ftdump to not sleep
gpio: gpio-omap: fix level interrupt idling
sysctl: handle overflow for file-max
enic: fix build warning without CONFIG_CPUMASK_OFFSTACK
mm/cma.c: cma_declare_contiguous: correct err handling
mm/page_ext.c: fix an imbalance with kmemleak
mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512!
mm/slab.c: kmemleak no scan alien caches
ocfs2: fix a panic problem caused by o2cb_ctl
f2fs: do not use mutex lock in atomic context
fs/file.c: initialize init_files.resize_wait
cifs: use correct format characters
dm thin: add sanity checks to thin-pool and external snapshot creation
cifs: Fix NULL pointer dereference of devname
fs: fix guard_bio_eod to check for real EOD errors
tools lib traceevent: Fix buffer overflow in arg_eval
usb: chipidea: Grab the (legacy) USB PHY by phandle first
scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
coresight: etm4x: Add support to enable ETMv4.2
ARM: 8840/1: use a raw_spinlock_t in unwind
mmc: omap: fix the maximum timeout setting
e1000e: Fix -Wformat-truncation warnings
IB/mlx4: Increase the timeout for CM cache
scsi: megaraid_sas: return error when create DMA pool failed
perf test: Fix failure of 'evsel-tp-sched' test on s390
SoC: imx-sgtl5000: add missing put_device()
media: sh_veu: Correct return type for mem2mem buffer helpers
media: s5p-jpeg: Correct return type for mem2mem buffer helpers
media: s5p-g2d: Correct return type for mem2mem buffer helpers
media: mx2_emmaprp: Correct return type for mem2mem buffer helpers
leds: lp55xx: fix null deref on firmware load failure
kprobes: Prohibit probing on bsearch()
ARM: 8833/1: Ensure that NEON code always compiles with Clang
ALSA: PCM: check if ops are defined before suspending PCM
bcache: fix input overflow to cache set sysfs file io_error_halflife
bcache: fix input overflow to sequential_cutoff
bcache: improve sysfs_strtoul_clamp()
fbdev: fbmem: fix memory access if logo is bigger than the screen
cdrom: Fix race condition in cdrom_sysctl_register
ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe
soc: qcom: gsbi: Fix error handling in gsbi_probe()
mt7601u: bump supported EEPROM version
ARM: avoid Cortex-A9 livelock on tight dmb loops
tty: increase the default flip buffer limit to 2*640K
media: mt9m111: set initial frame size other than 0x0
hwrng: virtio - Avoid repeated init of completion
soc/tegra: fuse: Fix illegal free of IO base address
hpet: Fix missing '=' character in the __setup() code of hpet_mmap_enable
dmaengine: imx-dma: fix warning comparison of distinct pointer types
netfilter: physdev: relax br_netfilter dependency
media: s5p-jpeg: Check for fmt_ver_flag when doing fmt enumeration
regulator: act8865: Fix act8600_sudcdc_voltage_ranges setting
wlcore: Fix memory leak in case wl12xx_fetch_firmware failure
x86/build: Mark per-CPU symbols as absolute explicitly for LLD
dmaengine: tegra: avoid overflow of byte tracking
drm/dp/mst: Configure no_stop_bit correctly for remote i2c xfers
binfmt_elf: switch to new creds when switching to new mm
kbuild: clang: choose GCC_TOOLCHAIN_DIR not on LD
x86/build: Specify elf_i386 linker emulation explicitly for i386 objects
x86: vdso: Use $LD instead of $CC to link
x86/vdso: Drop implicit common-page-size linker flag
lib/string.c: implement a basic bcmp
tty: mark Siemens R3964 line discipline as BROKEN
tty: ldisc: add sysctl to prevent autoloading of ldiscs
ipv6: Fix dangling pointer when ipv6 fragment
ipv6: sit: reset ip header pointer in ipip6_rcv
net: rds: force to destroy connection if t_sock is NULL in rds_tcp_kill_sock().
openvswitch: fix flow actions reallocation
qmi_wwan: add Olicard 600
sctp: initialize _pad of sockaddr_in before copying to user memory
tcp: Ensure DCTCP reacts to losses
netns: provide pure entropy for net_hash_mix()
net: ethtool: not call vzalloc for zero sized memory request
ip6_tunnel: Match to ARPHRD_TUNNEL6 for dev type
ALSA: seq: Fix OOB-reads from strlcpy
include/linux/bitrev.h: fix constant bitrev
ASoC: fsl_esai: fix channel swap issue when stream starts
block: do not leak memory in bio_copy_user_iov()
genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
ARM: dts: at91: Fix typo in ISC_D0 on PC9
arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value
xen: Prevent buffer overflow in privcmd ioctl
sched/fair: Do not re-read ->h_load_next during hierarchical load calculation
xtensa: fix return_address
PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller
perf/core: Restore mmap record type correctly
ext4: add missing brelse() in add_new_gdb_meta_bg()
ext4: report real fs size after failed resize
ALSA: echoaudio: add a check for ioremap_nocache
ALSA: sb8: add a check for request_region
IB/mlx4: Fix race condition between catas error reset and aliasguid flows
mmc: davinci: remove extraneous __init annotation
ALSA: opl3: fix mismatch between snd_opl3_drum_switch definition and declaration
thermal/int340x_thermal: Add additional UUIDs
thermal/int340x_thermal: fix mode setting
tools/power turbostat: return the exit status of a command
perf top: Fix error handling in cmd_top()
perf evsel: Free evsel->counts in perf_evsel__exit()
perf tests: Fix a memory leak of cpu_map object in the openat_syscall_event_on_all_cpus test
perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test()
x86/hpet: Prevent potential NULL pointer dereference
x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors
iommu/vt-d: Check capability before disabling protected memory
x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error
fix incorrect error code mapping for OBJECTID_NOT_FOUND
ext4: prohibit fstrim in norecovery mode
rsi: improve kernel thread handling to fix kernel panic
9p: do not trust pdu content for stat item size
9p locks: add mount option for lock retry interval
f2fs: fix to do sanity check with current segment number
serial: uartps: console_setup() can't be placed to init section
ARM: samsung: Limit SAMSUNG_PM_CHECK config option to non-Exynos platforms
ACPI / SBS: Fix GPE storm on recent MacBookPro's
cifs: fallback to older infolevels on findfirst queryinfo retry
crypto: sha256/arm - fix crash bug in Thumb2 build
crypto: sha512/arm - fix crash bug in Thumb2 build
iommu/dmar: Fix buffer overflow during PCI bus notification
ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t
appletalk: Fix use-after-free in atalk_proc_exit
lib/div64.c: off by one in shift
include/linux/swap.h: use offsetof() instead of custom __swapoffset macro
tpm/tpm_crb: Avoid unaligned reads in crb_recv()
ovl: fix uid/gid when creating over whiteout
appletalk: Fix compile regression
bonding: fix event handling for stacked bonds
net: atm: Fix potential Spectre v1 vulnerabilities
net: bridge: multicast: use rcu to access port list from br_multicast_start_querier
net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv
tcp: tcp_grow_window() needs to respect tcp_space()
ipv4: recompile ip options in ipv4_link_failure
ipv4: ensure rcu_read_lock() in ipv4_link_failure()
crypto: crypto4xx - properly set IV after de- and encrypt
modpost: file2alias: go back to simple devtable lookup
modpost: file2alias: check prototype of handler
tpm/tpm_i2c_atmel: Return -E2BIG when the transfer is incomplete
KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
iio/gyro/bmg160: Use millidegrees for temperature scale
iio: ad_sigma_delta: select channel when reading register
iio: adc: at91: disable adc channel interrupt in timeout case
io: accel: kxcjk1013: restore the range after resume.
staging: comedi: vmk80xx: Fix use of uninitialized semaphore
staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
staging: comedi: ni_usb6501: Fix use of uninitialized mutex
staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf
ALSA: core: Fix card races between register and disconnect
crypto: x86/poly1305 - fix overflow during partial reduction
arm64: futex: Restore oldval initialization to work around buggy compilers
x86/kprobes: Verify stack frame on kretprobe
kprobes: Mark ftrace mcount handler functions nokprobe
kprobes: Fix error check when reusing optimized probes
mac80211: do not call driver wake_tx_queue op during reconfig
Revert "kbuild: use -Oz instead of -Os when using clang"
sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
device_cgroup: fix RCU imbalance in error case
mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
ALSA: info: Fix racy addition/deletion of nodes
Revert "locking/lockdep: Add debug_locks check in __lock_downgrade()"
kernel/sysctl.c: fix out-of-bounds access when setting file-max
Linux 4.4.179
Change-Id: Ib81a248d73ba7504649be93bd6882b290e548882
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
[ Upstream commit a4046c06be50a4f01d435aa7fe57514818e6cc82 ]
Use offsetof() to calculate offset of a field to take advantage of
compiler built-in version when possible, and avoid UBSAN warning when
compiling with Clang:
UBSAN: Undefined behaviour in mm/swapfile.c:3010:38
member access within null pointer of type 'union swap_header'
CPU: 6 PID: 1833 Comm: swapon Tainted: G S 4.19.23 #43
Call trace:
dump_backtrace+0x0/0x194
show_stack+0x20/0x2c
__dump_stack+0x20/0x28
dump_stack+0x70/0x94
ubsan_epilogue+0x14/0x44
ubsan_type_mismatch_common+0xf4/0xfc
__ubsan_handle_type_mismatch_v1+0x34/0x54
__se_sys_swapon+0x654/0x1084
__arm64_sys_swapon+0x1c/0x24
el0_svc_common+0xa8/0x150
el0_svc_compat_handler+0x2c/0x38
el0_svc_compat+0x8/0x18
Link: http://lkml.kernel.org/r/20190312081902.223764-1-pihsun@chromium.org
Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
| |\| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* refs/heads/tmp-5e24b4e
Linux 4.4.153
ovl: warn instead of error if d_type is not supported
ovl: Do d_type check only if work dir creation was successful
ovl: Ensure upper filesystem supports d_type
x86/mm: Fix use-after-free of ldt_struct
x86/mm/pat: Fix L1TF stable backport for CPA
ANDROID: x86_64_cuttlefish_defconfig: Enable lz4 compression for zram
UPSTREAM: drivers/block/zram/zram_drv.c: fix bug storing backing_dev
BACKPORT: zram: introduce zram memory tracking
BACKPORT: zram: record accessed second
BACKPORT: zram: mark incompressible page as ZRAM_HUGE
UPSTREAM: zram: correct flag name of ZRAM_ACCESS
UPSTREAM: zram: Delete gendisk before cleaning up the request queue
UPSTREAM: drivers/block/zram/zram_drv.c: make zram_page_end_io() static
BACKPORT: zram: set BDI_CAP_STABLE_WRITES once
UPSTREAM: zram: fix null dereference of handle
UPSTREAM: zram: add config and doc file for writeback feature
BACKPORT: zram: read page from backing device
BACKPORT: zram: write incompressible pages to backing device
BACKPORT: zram: identify asynchronous IO's return value
BACKPORT: zram: add free space management in backing device
UPSTREAM: zram: add interface to specif backing device
UPSTREAM: zram: rename zram_decompress_page to __zram_bvec_read
UPSTREAM: zram: inline zram_compress
UPSTREAM: zram: clean up duplicated codes in __zram_bvec_write
Linux 4.4.152
reiserfs: fix broken xattr handling (heap corruption, bad retval)
i2c: imx: Fix race condition in dma read
PCI: pciehp: Fix use-after-free on unplug
PCI: Skip MPS logic for Virtual Functions (VFs)
PCI: hotplug: Don't leak pci_slot on registration failure
parisc: Remove unnecessary barriers from spinlock.h
bridge: Propagate vlan add failure to user
packet: refine ring v3 block size test to hold one frame
netfilter: conntrack: dccp: treat SYNC/SYNCACK as invalid if no prior state
xfrm_user: prevent leaking 2 bytes of kernel memory
parisc: Remove ordered stores from syscall.S
ext4: fix spectre gadget in ext4_mb_regular_allocator()
KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumer
staging: android: ion: check for kref overflow
tcp: identify cryptic messages as TCP seq # bugs
net: qca_spi: Fix log level if probe fails
net: qca_spi: Make sure the QCA7000 reset is triggered
net: qca_spi: Avoid packet drop during initial sync
net: usb: rtl8150: demote allmulti message to dev_dbg()
net/ethernet/freescale/fman: fix cross-build error
drm/nouveau/gem: off by one bugs in nouveau_gem_pushbuf_reloc_apply()
tcp: remove DELAYED ACK events in DCTCP
qlogic: check kstrtoul() for errors
packet: reset network header if packet shorter than ll reserved space
ixgbe: Be more careful when modifying MAC filters
ARM: dts: am3517.dtsi: Disable reference to OMAP3 OTG controller
ARM: 8780/1: ftrace: Only set kernel memory back to read-only after boot
perf llvm-utils: Remove bashism from kernel include fetch script
bnxt_en: Fix for system hang if request_irq fails
drm/armada: fix colorkey mode property
ieee802154: fakelb: switch from BUG_ON() to WARN_ON() on problem
ieee802154: at86rf230: use __func__ macro for debug messages
ieee802154: at86rf230: switch from BUG_ON() to WARN_ON() on problem
ARM: pxa: irq: fix handling of ICMR registers in suspend/resume
netfilter: x_tables: set module owner for icmp(6) matches
smsc75xx: Add workaround for gigabit link up hardware errata.
kasan: fix shadow_size calculation error in kasan_module_alloc
tracing: Use __printf markup to silence compiler
ARM: imx_v4_v5_defconfig: Select ULPI support
ARM: imx_v6_v7_defconfig: Select ULPI support
HID: wacom: Correct touch maximum XY of 2nd-gen Intuos
m68k: fix "bad page state" oops on ColdFire boot
bnx2x: Fix receiving tx-timeout in error or recovery state.
drm/exynos: decon5433: Fix WINCONx reset value
drm/exynos: decon5433: Fix per-plane global alpha for XRGB modes
drm/exynos: gsc: Fix support for NV16/61, YUV420/YVU420 and YUV422 modes
md/raid10: fix that replacement cannot complete recovery after reassemble
dmaengine: k3dma: Off by one in k3_of_dma_simple_xlate()
ARM: dts: da850: Fix interrups property for gpio
selftests/x86/sigreturn/64: Fix spurious failures on AMD CPUs
perf report powerpc: Fix crash if callchain is empty
perf test session topology: Fix test on s390
usb: xhci: increase CRS timeout value
ARM: dts: am437x: make edt-ft5x06 a wakeup source
brcmfmac: stop watchdog before detach and free everything
cxgb4: when disabling dcb set txq dcb priority to 0
Smack: Mark inode instant in smack_task_to_inode
ipv6: mcast: fix unsolicited report interval after receiving querys
locking/lockdep: Do not record IRQ state within lockdep code
net: davinci_emac: match the mdio device against its compatible if possible
ARC: Enable machine_desc->init_per_cpu for !CONFIG_SMP
net: propagate dev_get_valid_name return code
net: hamradio: use eth_broadcast_addr
enic: initialize enic->rfs_h.lock in enic_probe
qed: Add sanity check for SIMD fastpath handler.
arm64: make secondary_start_kernel() notrace
scsi: xen-scsifront: add error handling for xenbus_printf
usb: gadget: dwc2: fix memory leak in gadget_init()
usb: gadget: composite: fix delayed_status race condition when set_interface
usb: dwc2: fix isoc split in transfer with no data
ARM: dts: Cygnus: Fix I2C controller interrupt type
selftests: sync: add config fragment for testing sync framework
selftests: zram: return Kselftest Skip code for skipped tests
selftests: user: return Kselftest Skip code for skipped tests
selftests: static_keys: return Kselftest Skip code for skipped tests
selftests: pstore: return Kselftest Skip code for skipped tests
netfilter: ipv6: nf_defrag: reduce struct net memory waste
ARC: Explicitly add -mmedium-calls to CFLAGS
ANDROID: x86_64_cuttlefish_defconfig: Enable zram and zstd
BACKPORT: crypto: zstd - Add zstd support
UPSTREAM: zram: add zstd to the supported algorithms list
UPSTREAM: lib: Add zstd modules
UPSTREAM: lib: Add xxhash module
UPSTREAM: zram: rework copy of compressor name in comp_algorithm_store()
UPSTREAM: zram: constify attribute_group structures.
UPSTREAM: zram: count same page write as page_stored
UPSTREAM: zram: reduce load operation in page_same_filled
UPSTREAM: zram: use zram_free_page instead of open-coded
UPSTREAM: zram: introduce zram data accessor
UPSTREAM: zram: remove zram_meta structure
UPSTREAM: zram: use zram_slot_lock instead of raw bit_spin_lock op
BACKPORT: zram: partial IO refactoring
BACKPORT: zram: handle multiple pages attached bio's bvec
UPSTREAM: zram: fix operator precedence to get offset
BACKPORT: zram: extend zero pages to same element pages
BACKPORT: zram: remove waitqueue for IO done
UPSTREAM: zram: remove obsolete sysfs attrs
UPSTREAM: zram: support BDI_CAP_STABLE_WRITES
UPSTREAM: zram: revalidate disk under init_lock
BACKPORT: mm: support anonymous stable page
UPSTREAM: zram: use __GFP_MOVABLE for memory allocation
UPSTREAM: zram: drop gfp_t from zcomp_strm_alloc()
UPSTREAM: zram: add more compression algorithms
UPSTREAM: zram: delete custom lzo/lz4
UPSTREAM: zram: cosmetic: cleanup documentation
UPSTREAM: zram: use crypto api to check alg availability
BACKPORT: zram: switch to crypto compress API
UPSTREAM: zram: rename zstrm find-release functions
UPSTREAM: zram: introduce per-device debug_stat sysfs node
UPSTREAM: zram: remove max_comp_streams internals
UPSTREAM: zram: user per-cpu compression streams
BACKPORT: zsmalloc: require GFP in zs_malloc()
UPSTREAM: zram/zcomp: do not zero out zcomp private pages
UPSTREAM: zram: pass gfp from zcomp frontend to backend
UPSTREAM: socket: close race condition between sock_close() and sockfs_setattr()
ANDROID: Refresh x86_64_cuttlefish_defconfig
Linux 4.4.151
isdn: Disable IIOCDBGVAR
Bluetooth: avoid killing an already killed socket
x86/mm: Simplify p[g4um]d_page() macros
serial: 8250_dw: always set baud rate in dw8250_set_termios
ACPI / PM: save NVS memory for ASUS 1025C laptop
ACPI: save NVS memory for Lenovo G50-45
USB: option: add support for DW5821e
USB: serial: sierra: fix potential deadlock at close
ALSA: vxpocket: Fix invalid endian conversions
ALSA: memalloc: Don't exceed over the requested size
ALSA: hda: Correct Asrock B85M-ITX power_save blacklist entry
ALSA: cs5535audio: Fix invalid endian conversion
ALSA: virmidi: Fix too long output trigger loop
ALSA: vx222: Fix invalid endian conversions
ALSA: hda - Turn CX8200 into D3 as well upon reboot
ALSA: hda - Sleep for 10ms after entering D3 on Conexant codecs
net_sched: fix NULL pointer dereference when delete tcindex filter
vsock: split dwork to avoid reinitializations
net_sched: Fix missing res info when create new tc_index filter
llc: use refcount_inc_not_zero() for llc_sap_find()
l2tp: use sk_dst_check() to avoid race on sk->sk_dst_cache
dccp: fix undefined behavior with 'cwnd' shift in ccid2_cwnd_restart()
Conflicts:
drivers/block/zram/zram_drv.c
drivers/staging/android/ion/ion.c
include/linux/swap.h
mm/zsmalloc.c
Change-Id: I1c437ac5133503a939d06d51ec778b65371df6d1
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
|
| | |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
During developemnt for zram-swap asynchronous writeback, I found strange
corruption of compressed page, resulting in:
Modules linked in: zram(E)
CPU: 3 PID: 1520 Comm: zramd-1 Tainted: G E 4.8.0-mm1-00320-ge0d4894c9c38-dirty #3274
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff88007620b840 task.stack: ffff880078090000
RIP: set_freeobj.part.43+0x1c/0x1f
RSP: 0018:ffff880078093ca8 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff880076798d88 RCX: ffffffff81c408c8
RDX: 0000000000000018 RSI: 0000000000000000 RDI: 0000000000000246
RBP: ffff880078093cb0 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88005bc43030 R11: 0000000000001df3 R12: ffff880076798d88
R13: 000000000005bc43 R14: ffff88007819d1b8 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff88007e380000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc934048f20 CR3: 0000000077b01000 CR4: 00000000000406e0
Call Trace:
obj_malloc+0x22b/0x260
zs_malloc+0x1e4/0x580
zram_bvec_rw+0x4cd/0x830 [zram]
page_requests_rw+0x9c/0x130 [zram]
zram_thread+0xe6/0x173 [zram]
kthread+0xca/0xe0
ret_from_fork+0x25/0x30
With investigation, it reveals currently stable page doesn't support
anonymous page. IOW, reuse_swap_page can reuse the page without waiting
writeback completion so it can overwrite page zram is compressing.
Unfortunately, zram has used per-cpu stream feature from v4.7.
It aims for increasing cache hit ratio of scratch buffer for
compressing. Downside of that approach is that zram should ask
memory space for compressed page in per-cpu context which requires
stricted gfp flag which could be failed. If so, it retries to
allocate memory space out of per-cpu context so it could get memory
this time and compress the data again, copies it to the memory space.
In this scenario, zram assumes the data should never be changed
but it is not true unless stable page supports. So, If the data is
changed under us, zram can make buffer overrun because second
compression size could be bigger than one we got in previous trial
and blindly, copy bigger size object to smaller buffer which is
buffer overrun. The overrun breaks zsmalloc free object chaining
so system goes crash like above.
I think below is same problem.
https://bugzilla.suse.com/show_bug.cgi?id=997574
Unfortunately, reuse_swap_page should be atomic so that we cannot wait on
writeback in there so the approach in this patch is simply return false if
we found it needs stable page. Although it increases memory footprint
temporarily, it happens rarely and it should be reclaimed easily althoug
it happened. Also, It would be better than waiting of IO completion,
which is critial path for application latency.
Fixes: da9556a2367c ("zram: user per-cpu compression streams")
Link: http://lkml.kernel.org/r/20161120233015.GA14113@bbox
Link: http://lkml.kernel.org/r/1482366980-3782-2-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Hyeoncheol Lee <cheol.lee@lge.com>
Cc: <yjay.kim@lge.com>
Cc: Sangseok Lee <sangseok.lee@lge.com>
Cc: <stable@vger.kernel.org> [4.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit f05714293a591038304ddae7cb0dd747bb3786cc)
Signed-off-by: Peter Kalauskas <peskal@google.com>
Bug: 112488418
Change-Id: I0fa5012aff9daf614b2d1d04f35b86ff7043ff21
|
| |\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* remotes/origin/tmp-2f0de51:
Linux 4.4.38
esp6: Fix integrity verification when ESN are used
esp4: Fix integrity verification when ESN are used
ipv4: Set skb->protocol properly for local output
ipv6: Set skb->protocol properly for local output
Don't feed anything but regular iovec's to blk_rq_map_user_iov
constify iov_iter_count() and iter_is_iovec()
sparc64: fix compile warning section mismatch in find_node()
sparc64: Fix find_node warning if numa node cannot be found
sparc32: Fix inverted invalid_frame_pointer checks on sigreturns
net: ping: check minimum size on ICMP header length
net: avoid signed overflows for SO_{SND|RCV}BUFFORCE
geneve: avoid use-after-free of skb->data
sh_eth: remove unchecked interrupts for RZ/A1
net: bcmgenet: Utilize correct struct device for all DMA operations
packet: fix race condition in packet_set_ring
net/dccp: fix use-after-free in dccp_invalid_packet
netlink: Do not schedule work from sk_destruct
netlink: Call cb->done from a worker thread
net/sched: pedit: make sure that offset is valid
net, sched: respect rcu grace period on cls destruction
net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change
l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
rtnetlink: fix FDB size computation
af_unix: conditionally use freezable blocking calls in read
net: sky2: Fix shutdown crash
ip6_tunnel: disable caching when the traffic class is inherited
net: check dead netns for peernet2id_alloc()
virtio-net: add a missing synchronize_net()
Linux 4.4.37
arm64: suspend: Reconfigure PSTATE after resume from idle
arm64: mm: Set PSTATE.PAN from the cpu_enable_pan() call
arm64: cpufeature: Schedule enable() calls instead of calling them via IPI
pwm: Fix device reference leak
mwifiex: printk() overflow with 32-byte SSIDs
PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)
PCI: Export pcie_find_root_port
rcu: Fix soft lockup for rcu_nocb_kthread
ALSA: pcm : Call kill_fasync() in stream lock
x86/traps: Ignore high word of regs->cs in early_fixup_exception()
kasan: update kasan_global for gcc 7
zram: fix unbalanced idr management at hot removal
ARC: Don't use "+l" inline asm constraint
Linux 4.4.36
scsi: mpt3sas: Unblock device after controller reset
flow_dissect: call init_default_flow_dissectors() earlier
mei: fix return value on disconnection
mei: me: fix place for kaby point device ids.
mei: me: disable driver on SPT SPS firmware
drm/radeon: Ensure vblank interrupt is enabled on DPMS transition to on
mpi: Fix NULL ptr dereference in mpi_powm() [ver #3]
parisc: Also flush data TLB in flush_icache_page_asm
parisc: Fix race in pci-dma.c
parisc: Fix races in parisc_setup_cache_timing()
NFSv4.x: hide array-bounds warning
apparmor: fix change_hat not finding hat after policy replacement
cfg80211: limit scan results cache size
tile: avoid using clocksource_cyc2ns with absolute cycle count
scsi: mpt3sas: Fix secure erase premature termination
Fix USB CB/CBI storage devices with CONFIG_VMAP_STACK=y
USB: serial: ftdi_sio: add support for TI CC3200 LaunchPad
USB: serial: cp210x: add ID for the Zone DPMX
usb: chipidea: move the lock initialization to core file
KVM: x86: check for pic and ioapic presence before use
KVM: x86: drop error recovery in em_jmp_far and em_ret_far
iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions
iommu/vt-d: Fix PASID table allocation
sched: tune: Fix lacking spinlock initialization
UPSTREAM: trace: Update documentation for mono, mono_raw and boot clock
UPSTREAM: trace: Add an option for boot clock as trace clock
UPSTREAM: timekeeping: Add a fast and NMI safe boot clock
ANDROID: goldfish_pipe: fix allmodconfig build
ANDROID: goldfish: goldfish_pipe: fix locking errors
ANDROID: video: goldfishfb: fix platform_no_drv_owner.cocci warnings
ANDROID: goldfish_pipe: fix call_kern.cocci warnings
arm64: rename ranchu defconfig to ranchu64
ANDROID: arch: x86: disable pic for Android toolchain
ANDROID: goldfish_pipe: An implementation of more parallel pipe
ANDROID: goldfish_pipe: bugfixes and performance improvements.
ANDROID: goldfish: Add goldfish sync driver
ANDROID: goldfish: add ranchu defconfigs
ANDROID: goldfish_audio: Clear audio read buffer status after each read
ANDROID: goldfish_events: no extra EV_SYN; register goldfish
ANDROID: goldfish_fb: Set pixclock = 0
ANDROID: goldfish: Enable ACPI-based enumeration for goldfish audio
ANDROID: goldfish: Enable ACPI-based enumeration for goldfish framebuffer
ANDROID: video: goldfishfb: add devicetree bindings
BACKPORT: staging: goldfish: audio: fix compiliation on arm
BACKPORT: Input: goldfish_events - enable ACPI-based enumeration for goldfish events
BACKPORT: goldfish: Enable ACPI-based enumeration for goldfish battery
BACKPORT: drivers: tty: goldfish: Add device tree bindings
BACKPORT: tty: goldfish: support platform_device with id -1
BACKPORT: Input: goldfish_events - add devicetree bindings
BACKPORT: power: goldfish_battery: add devicetree bindings
BACKPORT: staging: goldfish: audio: add devicetree bindings
ANDROID: usb: gadget: function: cleanup: Add blank line after declaration
cpufreq: sched: Fix kernel crash on accessing sysfs file
usb: gadget: f_mtp: simplify ptp NULL pointer check
cgroup: replace unified-hierarchy.txt with a proper cgroup v2 documentation
cgroup: rename Documentation/cgroups/ to Documentation/cgroup-legacy/
cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type
writeback: initialize inode members that track writeback history
mm: page_alloc: generalize the dirty balance reserve
block: fix module reference leak on put_disk() call for cgroups throttle
Linux 4.4.35
netfilter: nft_dynset: fix element timeout for HZ != 1000
IB/cm: Mark stale CM id's whenever the mad agent was unregistered
IB/uverbs: Fix leak of XRC target QPs
IB/core: Avoid unsigned int overflow in sg_alloc_table
IB/mlx5: Fix fatal error dispatching
IB/mlx5: Use cache line size to select CQE stride
IB/mlx4: Fix create CQ error flow
IB/mlx4: Check gid_index return value
PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails
PM / sleep: fix device reference leak in test_suspend
uwb: fix device reference leaks
mfd: core: Fix device reference leak in mfd_clone_cell
iwlwifi: pcie: fix SPLC structure parsing
rtc: omap: Fix selecting external osc
clk: mmp: mmp2: fix return value check in mmp2_clk_init()
clk: mmp: pxa168: fix return value check in pxa168_clk_init()
clk: mmp: pxa910: fix return value check in pxa910_clk_init()
drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)
crypto: caam - do not register AES-XTS mode on LP units
ext4: sanity check the block and cluster size at mount time
kbuild: Steal gcc's pie from the very beginning
x86/kexec: add -fno-PIE
scripts/has-stack-protector: add -fno-PIE
kbuild: add -fno-PIE
i2c: mux: fix up dependencies
can: bcm: fix warning in bcm_connect/proc_register
mfd: intel-lpss: Do not put device in reset state on suspend
fuse: fix fuse_write_end() if zero bytes were copied
KVM: Disable irq while unregistering user notifier
KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr
x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems
Linux 4.4.34
sparc64: Delete now unused user copy fixup functions.
sparc64: Delete now unused user copy assembler helpers.
sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.
sparc64: Convert NG2copy_{from,to}_user to accurate exception reporting.
sparc64: Convert NGcopy_{from,to}_user to accurate exception reporting.
sparc64: Convert NG4copy_{from,to}_user to accurate exception reporting.
sparc64: Convert U1copy_{from,to}_user to accurate exception reporting.
sparc64: Convert GENcopy_{from,to}_user to accurate exception reporting.
sparc64: Convert copy_in_user to accurate exception reporting.
sparc64: Prepare to move to more saner user copy exception handling.
sparc64: Delete __ret_efault.
sparc64: Handle extremely large kernel TLB range flushes more gracefully.
sparc64: Fix illegal relative branches in hypervisor patched TLB cross-call code.
sparc64: Fix instruction count in comment for __hypervisor_flush_tlb_pending.
sparc64: Fix illegal relative branches in hypervisor patched TLB code.
sparc64: Handle extremely large kernel TSB range flushes sanely.
sparc: Handle negative offsets in arch_jump_label_transform
sparc64 mm: Fix base TSB sizing when hugetlb pages are used
sparc: serial: sunhv: fix a double lock bug
sparc: Don't leak context bits into thread->fault_address
tty: Prevent ldisc drivers from re-using stale tty fields
tcp: take care of truncations done by sk_filter()
ipv4: use new_gw for redirect neigh lookup
net: __skb_flow_dissect() must cap its return value
sock: fix sendmmsg for partial sendmsg
fib_trie: Correct /proc/net/route off by one error
sctp: assign assoc_id earlier in __sctp_connect
ipv6: dccp: add missing bind_conflict to dccp_ipv6_mapped
ipv6: dccp: fix out of bound access in dccp_v6_err()
dccp: fix out of bound access in dccp_v4_err()
dccp: do not send reset to already closed sockets
tcp: fix potential memory corruption
ip6_tunnel: Clear IP6CB in ip6tunnel_xmit()
bgmac: stop clearing DMA receive control register right after it is set
net: mangle zero checksum in skb_checksum_help()
net: clear sk_err_soft in sk_clone_lock()
dctcp: avoid bogus doubling of cwnd after loss
ARM: 8485/1: cpuidle: remove cpu parameter from the cpuidle_ops suspend hook
Linux 4.4.33
netfilter: fix namespace handling in nf_log_proc_dostring
btrfs: qgroup: Prevent qgroup->reserved from going subzero
mmc: mxs: Initialize the spinlock prior to using it
ASoC: sun4i-codec: return error code instead of NULL when create_card fails
ACPI / APEI: Fix incorrect return value of ghes_proc()
i40e: fix call of ndo_dflt_bridge_getlink()
hwrng: core - Don't use a stack buffer in add_early_randomness()
lib/genalloc.c: start search from start of chunk
mei: bus: fix received data size check in NFC fixup
iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path
iommu/amd: Free domain id when free a domain of struct dma_ops_domain
tty/serial: at91: fix hardware handshake on Atmel platforms
dmaengine: at_xdmac: fix spurious flag status for mem2mem transfers
drm/i915: Respect alternate_ddc_pin for all DDI ports
KVM: MIPS: Precalculate MMIO load resume PC
scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk
scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
iio: orientation: hid-sensor-rotation: Add PM function (fix non working driver)
iio: hid-sensors: Increase the precision of scale to fix wrong reading interpretation.
clk: qoriq: Don't allow CPU clocks higher than starting value
toshiba-wmi: Fix loading the driver on non Toshiba laptops
drbd: Fix kernel_sendmsg() usage - potential NULL deref
usb: gadget: u_ether: remove interrupt throttling
USB: cdc-acm: fix TIOCMIWAIT
staging: nvec: remove managed resource from PS2 driver
Revert "staging: nvec: ps2: change serio type to passthrough"
drivers: staging: nvec: remove bogus reset command for PS/2 interface
staging: iio: ad5933: avoid uninitialized variable in error case
pinctrl: cherryview: Prevent possible interrupt storm on resume
pinctrl: cherryview: Serialize register access in suspend/resume
ARC: timer: rtc: implement read loop in "C" vs. inline asm
s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
coredump: fix unfreezable coredumping task
swapfile: fix memory corruption via malformed swapfile
dib0700: fix nec repeat handling
ASoC: cs4270: fix DAPM stream name mismatch
ALSA: info: Limit the proc text input size
ALSA: info: Return error for invalid read/write
arm64: Enable KPROBES/HIBERNATION/CORESIGHT in defconfig
arm64: kvm: allows kvm cpu hotplug
arm64: KVM: Register CPU notifiers when the kernel runs at HYP
arm64: KVM: Skip HYP setup when already running in HYP
arm64: hyp/kvm: Make hyp-stub reject kvm_call_hyp()
arm64: hyp/kvm: Make hyp-stub extensible
arm64: kvm: Move lr save/restore from do_el2_call into EL1
arm64: kvm: deal with kernel symbols outside of linear mapping
arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
ANDROID: video: adf: Avoid directly referencing user pointers
ANDROID: usb: gadget: audio_source: fix comparison of distinct pointer types
android: binder: support for file-descriptor arrays.
android: binder: support for scatter-gather.
android: binder: add extra size to allocator.
android: binder: refactor binder_transact()
android: binder: support multiple /dev instances.
android: binder: deal with contexts in debugfs.
android: binder: support multiple context managers.
android: binder: split flat_binder_object.
disable aio support in recommended configuration
Linux 4.4.32
scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
drm/radeon: fix DP mode validation
drm/radeon/dp: add back special handling for NUTMEG
drm/amdgpu: fix DP mode validation
drm/amdgpu/dp: add back special handling for NUTMEG
KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
Revert KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
of: silence warnings due to max() usage
packet: on direct_xmit, limit tso and csum to supported devices
sctp: validate chunk len before actually using it
net sched filters: fix notification of filter delete with proper handle
udp: fix IP_CHECKSUM handling
net: sctp, forbid negative length
ipv4: use the right lock for ping_group_range
ipv4: disable BH in set_ping_group_range()
net: add recursion limit to GRO
rtnetlink: Add rtnexthop offload flag to compare mask
bridge: multicast: restore perm router ports on multicast enable
net: pktgen: remove rcu locking in pktgen_change_name()
ipv6: correctly add local routes when lo goes up
ip6_tunnel: fix ip6_tnl_lookup
ipv6: tcp: restore IP6CB for pktoptions skbs
netlink: do not enter direct reclaim from netlink_dump()
packet: call fanout_release, while UNREGISTERING a netdev
net: Add netdev all_adj_list refcnt propagation to fix panic
net/sched: act_vlan: Push skb->data to mac_header prior calling skb_vlan_*() functions
net: pktgen: fix pkt_size
net: fec: set mac address unconditionally
tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()
tcp: fix a compile error in DBGUNDO()
tcp: fix wrong checksum calculation on MTU probing
net: avoid sk_forward_alloc overflows
tcp: fix overflow in __tcp_retransmit_skb()
arm64/kvm: fix build issue on kvm debug
arm64: ptdump: Indicate whether memory should be faulting
arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
arm64: Drop alloc function from create_mapping
arm64: allow vmalloc regions to be set with set_memory_*
arm64: kernel: implement ACPI parking protocol
arm64: mm: create new fine-grained mappings at boot
arm64: ensure _stext and _etext are page-aligned
arm64: mm: allow passing a pgdir to alloc_init_*
arm64: mm: allocate pagetables anywhere
arm64: mm: use fixmap when creating page tables
arm64: mm: add functions to walk tables in fixmap
arm64: mm: add __{pud,pgd}_populate
arm64: mm: avoid redundant __pa(__va(x))
Linux 4.4.31
HID: usbhid: add ATEN CS962 to list of quirky devices
ubi: fastmap: Fix add_vol() return value test in ubi_attach_fastmap()
kvm: x86: Check memopp before dereference (CVE-2016-8630)
tty: vt, fix bogus division in csi_J
usb: dwc3: Fix size used in dma_free_coherent()
pwm: Unexport children before chip removal
UBI: fastmap: scrub PEB when bitflips are detected in a free PEB EC header
Disable "frame-address" warning
smc91x: avoid self-comparison warning
cgroup: avoid false positive gcc-6 warning
drm/exynos: fix error handling in exynos_drm_subdrv_open
mm/cma: silence warnings due to max() usage
ARM: 8584/1: floppy: avoid gcc-6 warning
powerpc/ptrace: Fix out of bounds array access warning
x86/xen: fix upper bound of pmd loop in xen_cleanhighmap()
perf build: Fix traceevent plugins build race
drm/dp/mst: Check peer device type before attempting EDID read
drm/radeon: drop register readback in cayman_cp_int_cntl_setup
drm/radeon/si_dpm: workaround for SI kickers
drm/radeon/si_dpm: Limit clocks on HD86xx part
Revert "drm/radeon: fix DP link training issue with second 4K monitor"
mmc: dw_mmc-pltfm: fix the potential NULL pointer dereference
scsi: arcmsr: Send SYNCHRONIZE_CACHE command to firmware
scsi: scsi_debug: Fix memory leak if LBP enabled and module is unloaded
scsi: megaraid_sas: Fix data integrity failure for JBOD (passthrough) devices
mac80211: discard multicast and 4-addr A-MSDUs
firewire: net: fix fragmented datagram_size off-by-one
firewire: net: guard against rx buffer overflows
Input: i8042 - add XMG C504 to keyboard reset table
dm mirror: fix read error on recovery after default leg failure
virtio: console: Unlock vqs while freeing buffers
virtio_ring: Make interrupt suppression spec compliant
parisc: Ensure consistent state when switching to kernel stack at syscall entry
ovl: fsync after copy-up
KVM: MIPS: Make ERET handle ERL before EXL
KVM: x86: fix wbinvd_dirty_mask use-after-free
dm: free io_barrier after blk_cleanup_queue call
USB: serial: cp210x: fix tiocmget error handling
tty: limit terminal size to 4M chars
xhci: add restart quirk for Intel Wildcatpoint PCH
hv: do not lose pending heartbeat vmbus packets
vt: clear selection before resizing
Fix potential infoleak in older kernels
GenWQE: Fix bad page access during abort of resource allocation
usb: increase ohci watchdog delay to 275 msec
xhci: use default USB_RESUME_TIMEOUT when resuming ports.
USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7
USB: serial: fix potential NULL-dereference at probe
usb: gadget: function: u_ether: don't starve tx request queue
mei: txe: don't clean an unprocessed interrupt cause.
ubifs: Fix regression in ubifs_readdir()
ubifs: Abort readdir upon error
btrfs: fix races on root_log_ctx lists
ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct
ANDROID: binder: Add strong ref checks
ALSA: hda - Fix headset mic detection problem for two Dell laptops
ALSA: hda - Adding a new group of pin cfg into ALC295 pin quirk table
ALSA: hda - allow 40 bit DMA mask for NVidia devices
ALSA: hda - Raise AZX_DCAPS_RIRB_DELAY handling into top drivers
ALSA: hda - Merge RIRB_PRE_DELAY into CTX_WORKAROUND caps
ALSA: usb-audio: Add quirk for Syntek STK1160
KEYS: Fix short sprintf buffer in /proc/keys show function
mm: memcontrol: do not recurse in direct reclaim
mm/list_lru.c: avoid error-path NULL pointer deref
libxfs: clean up _calc_dquots_per_chunk
h8300: fix syscall restarting
drm/dp/mst: Clear port->pdt when tearing down the i2c adapter
i2c: core: fix NULL pointer dereference under race condition
i2c: xgene: Avoid dma_buffer overrun
arm64:cpufeature ARM64_NCAPS is the indicator of last feature
arm64: hibernate: Refuse to hibernate if the boot cpu is offline
PM / sleep: Add support for read-only sysfs attributes
arm64: kernel: Add support for hibernate/suspend-to-disk
arm64: mm: add functions to walk page tables by PA
arm64: mm: move pte_* macros
PM / Hibernate: Call flush_icache_range() on pages restored in-place
arm64: Add new asm macro copy_page
arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
arm64: kernel: Include _AC definition in page.h
arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
arm64: kernel: Rework finisher callback out of __cpu_suspend_enter()
arm64: Cleanup SCTLR flags
arm64: Fold proc-macros.S into assembler.h
arm/arm64: KVM: Add hook for C-based stage2 init
arm/arm64: KVM: Detect vGIC presence at runtime
arm64: KVM: Add support for 16-bit VMID
arm: KVM: Make kvm_arm.h friendly to assembly code
arm/arm64: KVM: Remove unreferenced S2_PGD_ORDER
arm64: KVM: debug: Remove spurious inline attributes
ARM: KVM: Cleanup exception injection
arm64: KVM: Remove weak attributes
arm64: KVM: Cleanup asm-offset.c
arm64: KVM: Turn system register numbers to an enum
arm64: KVM: VHE: Patch out use of HVC
arm64: Add ARM64_HAS_VIRT_HOST_EXTN feature
arm/arm64: Add new is_kernel_in_hyp_mode predicate
arm64: KVM: Move away from the assembly version of the world switch
arm64: KVM: Map the kernel RO section into HYP
arm64: KVM: Add compatibility aliases
arm64: KVM: Implement vgic-v3 save/restore
arm64: KVM: Add panic handling
arm64: KVM: HYP mode entry points
arm64: KVM: Implement TLB handling
arm64: KVM: Implement fpsimd save/restore
arm64: KVM: Implement the core world switch
arm64: KVM: Add patchable function selector
arm64: KVM: Implement guest entry
arm64: KVM: Implement debug save/restore
arm64: KVM: Implement 32bit system register save/restore
arm64: KVM: Implement system register save/restore
arm64: KVM: Implement timer save/restore
arm64: KVM: Implement vgic-v2 save/restore
arm64: KVM: Add a HYP-specific header file
KVM: arm/arm64: vgic-v3: Make the LR indexing macro public
arm64: Add macros to read/write system registers
Linux 4.4.30
Revert "fix minor infoleak in get_user_ex()"
Revert "x86/mm: Expand the exception table logic to allow new handling options"
Linux 4.4.29
ARM: pxa: pxa_cplds: fix interrupt handling
powerpc/nvram: Fix an incorrect partition merge
mpt3sas: Don't spam logs if logging level is 0
perf symbols: Fixup symbol sizes before picking best ones
perf symbols: Check symbol_conf.allow_aliases for kallsyms loading too
perf hists browser: Fix event group display
clk: divider: Fix clk_divider_round_rate() to use clk_readl()
clk: qoriq: fix a register offset error
s390/con3270: fix insufficient space padding
s390/con3270: fix use of uninitialised data
s390/cio: fix accidental interrupt enabling during resume
x86/mm: Expand the exception table logic to allow new handling options
dmaengine: ipu: remove bogus NO_IRQ reference
power: bq24257: Fix use of uninitialized pointer bq->charger
staging: r8188eu: Fix scheduling while atomic splat
ASoC: dapm: Fix kcontrol creation for output driver widget
ASoC: dapm: Fix value setting for _ENUM_DOUBLE MUX's second channel
ASoC: dapm: Fix possible uninitialized variable in snd_soc_dapm_get_volsw()
ASoC: topology: Fix error return code in soc_tplg_dapm_widget_create()
hwrng: omap - Only fail if pm_runtime_get_sync returns < 0
crypto: arm/ghash-ce - add missing async import/export
crypto: gcm - Fix IV buffer size in crypto_gcm_setkey
mwifiex: correct aid value during tdls setup
spi: spi-fsl-dspi: Drop extra spi_master_put in device remove function
ARM: clk-imx35: fix name for ckil clk
uio: fix dmem_region_start computation
genirq/generic_chip: Add irq_unmap callback
perf stat: Fix interval output values
powerpc/eeh: Null check uses of eeh_pe_bus_get
tunnels: Remove encapsulation offloads on decap.
tunnels: Don't apply GRO to multiple layers of encapsulation.
ipip: Properly mark ipip GRO packets as encapsulated.
posix_acl: Clear SGID bit when setting file permissions
brcmfmac: avoid potential stack overflow in brcmf_cfg80211_start_ap()
mm/hugetlb: fix memory offline with hugepage size > memory block size
drm/i915: Unalias obj->phys_handle and obj->userptr
drm/i915: Account for TSEG size when determining 865G stolen base
Revert "drm/i915: Check live status before reading edid"
drm/i915/gen9: fix the WaWmMemoryReadLatency implementation
xenbus: don't look up transaction IDs for ordinary writes
drm/vmwgfx: Limit the user-space command buffer size
drm/radeon: change vblank_time's calculation method to reduce computational error.
drm/radeon/si/dpm: fix phase shedding setup
drm/radeon: narrow asic_init for virtualization
drm/amdgpu: change vblank_time's calculation method to reduce computational error.
drm/amdgpu/dce11: add missing drm_mode_config_cleanup call
drm/amdgpu/dce11: disable hpd on local panels
drm/amdgpu/dce8: disable hpd on local panels
drm/amdgpu/dce10: disable hpd on local panels
drm/amdgpu: fix IB alignment for UVD
drm/prime: Pass the right module owner through to dma_buf_export()
Linux 4.4.28
target: Don't override EXTENDED_COPY xcopy_pt_cmd SCSI status code
target: Make EXTENDED_COPY 0xe4 failure return COPY TARGET DEVICE NOT REACHABLE
target: Re-add missing SCF_ACK_KREF assignment in v4.1.y
ubifs: Fix xattr_names length in exit paths
jbd2: fix incorrect unlock on j_list_lock
ext4: do not advertise encryption support when disabled
mmc: rtsx_usb_sdmmc: Handle runtime PM while changing the led
mmc: rtsx_usb_sdmmc: Avoid keeping the device runtime resumed when unused
mmc: core: Annotate cmd_hdr as __le32
powerpc/mm: Prevent unlikely crash in copro_calculate_slb()
ceph: fix error handling in ceph_read_iter
arm64: kernel: Init MDCR_EL2 even in the absence of a PMU
arm64: percpu: rewrite ll/sc loops in assembly
memstick: rtsx_usb_ms: Manage runtime PM when accessing the device
memstick: rtsx_usb_ms: Runtime resume the device when polling for cards
isofs: Do not return EACCES for unknown filesystems
irqchip/gic-v3-its: Fix entry size mask for GITS_BASER
s390/mm: fix gmap tlb flush issues
Using BUG_ON() as an assert() is _never_ acceptable
mm: filemap: fix mapping->nrpages double accounting in fuse
mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
acpi, nfit: check for the correct event code in notifications
net/mlx4_core: Allow resetting VF admin mac to zero
bnx2x: Prevent false warning for lack of FC NPIV
PKCS#7: Don't require SpcSpOpusInfo in Authenticode pkcs7 signatures
hpsa: correct skipping masked peripherals
sd: Fix rw_max for devices that report an optimal xfer size
irqchip/gicv3: Handle loop timeout proper
kvm: x86: memset whole irq_eoi
x86/e820: Don't merge consecutive E820_PRAM ranges
blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL
Fix regression which breaks DFS mounting
Cleanup missing frees on some ioctls
Do not send SMB3 SET_INFO request if nothing is changing
SMB3: GUIDs should be constructed as random but valid uuids
Set previous session id correctly on SMB3 reconnect
Display number of credits available
Clarify locking of cifs file and tcon structures and make more granular
fs/cifs: keep guid when assigning fid to fileinfo
cifs: Limit the overall credit acquired
fs/super.c: fix race between freeze_super() and thaw_super()
arc: don't leak bits of kernel stack into coredump
lightnvm: ensure that nvm_dev_ops can be used without CONFIG_NVM
ipc/sem.c: fix complex_count vs. simple op race
mm: filemap: don't plant shadow entries without radix tree node
metag: Only define atomic_dec_if_positive conditionally
scsi: Fix use-after-free
NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic
NFSv4: Open state recovery must account for file permission changes
NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalid
NFSv4: Don't report revoked delegations as valid in nfs_have_delegation()
sunrpc: fix write space race causing stalls
Input: elantech - add Fujitsu Lifebook E556 to force crc_enabled
Input: elantech - force needed quirks on Fujitsu H760
Input: i8042 - skip selftest on ASUS laptops
lib: add "on"/"off" support to kstrtobool
lib: update single-char callers of strtobool()
lib: move strtobool() to kstrtobool()
MIPS: ptrace: Fix regs_return_value for kernel context
MIPS: Fix -mabi=64 build of vdso.lds
ALSA: hda - Fix a failure of micmute led when having multi adcs
cx231xx: fix GPIOs for Pixelview SBTVD hybrid
cx231xx: don't return error on success
mb86a20s: fix demod settings
mb86a20s: fix the locking logic
ovl: copy_up_xattr(): use strnlen
ovl: Fix info leak in ovl_lookup_temp()
fbdev/efifb: Fix 16 color palette entry calculation
scsi: zfcp: spin_lock_irqsave() is not nestable
zfcp: trace full payload of all SAN records (req,resp,iels)
zfcp: fix payload trace length for SAN request&response
zfcp: fix D_ID field with actual value on tracing SAN responses
zfcp: restore tracing of handle for port and LUN with HBA records
zfcp: trace on request for open and close of WKA port
zfcp: restore: Dont use 0 to indicate invalid LUN in rec trace
zfcp: retain trace level for SCSI and HBA FSF response records
zfcp: close window with unblocked rport during rport gone
zfcp: fix ELS/GS request&response length for hardware data router
zfcp: fix fc_host port_type with NPIV
ubi: Deal with interrupted erasures in WL
powerpc/pseries: Fix stack corruption in htpe code
powerpc/64: Fix incorrect return value from __copy_tofrom_user
powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data()
powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag()
powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear()
powerpc/vdso64: Use double word compare on pointers
dm crypt: fix crash on exit
dm mpath: check if path's request_queue is dying in activate_path()
dm: return correct error code in dm_resume()'s retry loop
dm: mark request_queue dead before destroying the DM device
perf intel-pt: Fix MTC timestamp calculation for large MTC periods
perf intel-pt: Fix estimated timestamps for cycle-accurate mode
perf intel-pt: Fix snapshot overlap detection decoder errors
pstore/ram: Use memcpy_fromio() to save old buffer
pstore/ram: Use memcpy_toio instead of memcpy
pstore/core: drop cmpxchg based updates
pstore/ramoops: fixup driver removal
parisc: Increase initial kernel mapping size
parisc: Fix kernel memory layout regarding position of __gp
parisc: Increase KERNEL_INITIAL_SIZE for 32-bit SMP kernels
cpufreq: intel_pstate: Fix unsafe HWP MSR access
platform: don't return 0 from platform_get_irq[_byname]() on error
PCI: Mark Atheros AR9580 to avoid bus reset
mmc: sdhci: cast unsigned int to unsigned long long to avoid unexpeted error
mmc: block: don't use CMD23 with very old MMC cards
rtlwifi: Fix missing country code for Great Britain
PM / devfreq: event: remove duplicate devfreq_event_get_drvdata()
clk: imx6: initialize GPU clocks
regulator: tps65910: Work around silicon erratum SWCZ010
mei: me: add kaby point device ids
gpio: mpc8xxx: Correct irq handler function
cgroup: Change from CAP_SYS_NICE to CAP_SYS_RESOURCE for cgroup migration permissions
UPSTREAM: cpu/hotplug: Handle unbalanced hotplug enable/disable
UPSTREAM: arm64: kaslr: fix breakage with CONFIG_MODVERSIONS=y
UPSTREAM: arm64: kaslr: keep modules close to the kernel when DYNAMIC_FTRACE=y
cgroup: Remove leftover instances of allow_attach
BACKPORT: lib: harden strncpy_from_user
CHROMIUM: cgroups: relax permissions on moving tasks between cgroups
CHROMIUM: remove Android's cgroup generic permissions checks
Linux 4.4.27
cfq: fix starvation of asynchronous writes
vfs: move permission checking into notify_change() for utimes(NULL)
dlm: free workqueues after the connections
crypto: vmx - Fix memory corruption caused by p8_ghash
crypto: ghash-generic - move common definitions to a new header file
ext4: release bh in make_indexed_dir
ext4: allow DAX writeback for hole punch
ext4: fix memory leak in ext4_insert_range()
ext4: reinforce check of i_dtime when clearing high fields of uid and gid
ext4: enforce online defrag restriction for encrypted files
scsi: ibmvfc: Fix I/O hang when port is not mapped
scsi: arcmsr: Simplify user_len checking
scsi: arcmsr: Buffer overflow in arcmsr_iop_message_xfer()
async_pq_val: fix DMA memory leak
reiserfs: switch to generic_{get,set,remove}xattr()
reiserfs: Unlock superblock before calling reiserfs_quota_on_mount()
ASoC: Intel: Atom: add a missing star in a memcpy call
brcmfmac: fix memory leak in brcmf_fill_bss_param
i40e: avoid NULL pointer dereference and recursive errors on early PCI error
fuse: fix killing s[ug]id in setattr
fuse: invalidate dir dentry after chmod
fuse: listxattr: verify xattr list
drivers: base: dma-mapping: page align the size when unmap_kernel_range
btrfs: assign error values to the correct bio structs
serial: 8250_dw: Check the data->pclk when get apb_pclk
arm64: Use PoU cache instr for I/D coherency
arm64: mm: add code to safely replace TTBR1_EL1
arm64: mm: place __cpu_setup in .text
arm64: add function to install the idmap
arm64: unmap idmap earlier
arm64: unify idmap removal
arm64: mm: place empty_zero_page in bss
arm64: head.S: use memset to clear BSS
arm64: mm: specialise pagetable allocators
arm64: mm: remove pointless PAGE_MASKing
asm-generic: Fix local variable shadow in __set_fixmap_offset
arm64: mm: fold alternatives into .init
ARM: 8511/1: ARM64: kernel: PSCI: move PSCI idle management code to drivers/firmware
ARM: 8481/2: drivers: psci: replace psci firmware calls
ARM: 8480/2: arm64: add implementation for arm-smccc
ARM: 8479/2: add implementation for arm-smccc
ARM: 8478/2: arm/arm64: add arm-smccc
ARM: 8510/1: rework ARM_CPU_SUSPEND dependencies
ARM: 8458/1: bL_switcher: add GIC dependency
Linux 4.4.26
mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
x86/build: Build compressed x86 kernels as PIE
arm64: Remove stack duplicating code from jprobes
arm64: kprobes: Add KASAN instrumentation around stack accesses
arm64: kprobes: Cleanup jprobe_return
arm64: kprobes: Fix overflow when saving stack
arm64: kprobes: WARN if attempting to step with PSTATE.D=1
kprobes: Add arm64 case in kprobe example module
arm64: Add kernel return probes support (kretprobes)
arm64: Add trampoline code for kretprobes
arm64: kprobes instruction simulation support
arm64: Treat all entry code as non-kprobe-able
arm64: Blacklist non-kprobe-able symbol
arm64: Kprobes with single stepping support
arm64: add conditional instruction simulation support
arm64: Add more test functions to insn.c
arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature
Linux 4.4.25
tpm_crb: fix crb_req_canceled behavior
tpm: fix a race condition in tpm2_unseal_trusted()
ima: use file_dentry()
ARM: cpuidle: Fix error return code
ARM: dts: MSM8064 remove flags from SPMI/MPP IRQs
ARM: dts: mvebu: armada-390: add missing compatibility string and bracket
x86/dumpstack: Fix x86_32 kernel_stack_pointer() previous stack access
x86/irq: Prevent force migration of irqs which are not in the vector domain
x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation
KVM: PPC: BookE: Fix a sanity check
KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
mfd: wm8350-i2c: Make sure the i2c regmap functions are compiled
mfd: 88pm80x: Double shifting bug in suspend/resume
mfd: atmel-hlcdc: Do not sleep in atomic context
mfd: rtsx_usb: Avoid setting ucr->current_sg.status
ALSA: usb-line6: use the same declaration as definition in header for MIDI manufacturer ID
ALSA: usb-audio: Extend DragonFly dB scale quirk to cover other variants
ALSA: ali5451: Fix out-of-bound position reporting
timekeeping: Fix __ktime_get_fast_ns() regression
time: Add cycles to nanoseconds translation
mm: Fix build for hardened usercopy
ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct
ANDROID: binder: Add strong ref checks
UPSTREAM: staging/android/ion : fix a race condition in the ion driver
ANDROID: android-base: CONFIG_HARDENED_USERCOPY=y
UPSTREAM: fs/proc/kcore.c: Add bounce buffer for ktext data
UPSTREAM: fs/proc/kcore.c: Make bounce buffer global for read
BACKPORT: arm64: Correctly bounds check virt_addr_valid
Fix a build breakage in IO latency hist code.
UPSTREAM: efi: include asm/early_ioremap.h not asm/efi.h to get early_memremap
UPSTREAM: ia64: split off early_ioremap() declarations into asm/early_ioremap.h
FROMLIST: arm64: Enable CONFIG_ARM64_SW_TTBR0_PAN
FROMLIST: arm64: xen: Enable user access before a privcmd hvc call
FROMLIST: arm64: Handle faults caused by inadvertent user access with PAN enabled
FROMLIST: arm64: Disable TTBR0_EL1 during normal kernel execution
FROMLIST: arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1
FROMLIST: arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro
FROMLIST: arm64: Factor out PAN enabling/disabling into separate uaccess_* macros
UPSTREAM: arm64: Handle el1 synchronous instruction aborts cleanly
UPSTREAM: arm64: include alternative handling in dcache_by_line_op
UPSTREAM: arm64: fix "dc cvau" cache operation on errata-affected core
UPSTREAM: Revert "arm64: alternatives: add enable parameter to conditional asm macros"
UPSTREAM: arm64: Add new asm macro copy_page
UPSTREAM: arm64: kill ESR_LNX_EXEC
UPSTREAM: arm64: add macro to extract ESR_ELx.EC
UPSTREAM: arm64: mm: mark fault_info table const
UPSTREAM: arm64: fix dump_instr when PAN and UAO are in use
BACKPORT: arm64: Fold proc-macros.S into assembler.h
UPSTREAM: arm64: choose memstart_addr based on minimum sparsemem section alignment
UPSTREAM: arm64/mm: ensure memstart_addr remains sufficiently aligned
UPSTREAM: arm64/kernel: fix incorrect EL0 check in inv_entry macro
UPSTREAM: arm64: Add macros to read/write system registers
UPSTREAM: arm64/efi: refactor EFI init and runtime code for reuse by 32-bit ARM
UPSTREAM: arm64/efi: split off EFI init and runtime code for reuse by 32-bit ARM
UPSTREAM: arm64/efi: mark UEFI reserved regions as MEMBLOCK_NOMAP
BACKPORT: arm64: only consider memblocks with NOMAP cleared for linear mapping
UPSTREAM: mm/memblock: add MEMBLOCK_NOMAP attribute to memblock memory table
ANDROID: dm: android-verity: Remove fec_header location constraint
BACKPORT: audit: consistently record PIDs with task_tgid_nr()
android-base.cfg: Enable kernel ASLR
UPSTREAM: vmlinux.lds.h: allow arch specific handling of ro_after_init data section
UPSTREAM: arm64: spinlock: fix spin_unlock_wait for LSE atomics
UPSTREAM: arm64: avoid TLB conflict with CONFIG_RANDOMIZE_BASE
UPSTREAM: arm64: Only select ARM64_MODULE_PLTS if MODULES=y
sched: Add Kconfig option DEFAULT_USE_ENERGY_AWARE to set ENERGY_AWARE feature flag
sched/fair: remove printk while schedule is in progress
ANDROID: fs: FS tracepoints to track IO.
sched/walt: Drop arch-specific timer access
ANDROID: fiq_debugger: Pass task parameter to unwind_frame()
eas/sched/fair: Fixing comments in find_best_target.
input: keyreset: switch to orderly_reboot
UPSTREAM: tun: fix transmit timestamp support
UPSTREAM: arch/arm/include/asm/pgtable-3level.h: add pmd_mkclean for THP
net: inet: diag: expose the socket mark to privileged processes.
net: diag: make udp_diag_destroy work for mapped addresses.
net: diag: support SOCK_DESTROY for UDP sockets
net: diag: allow socket bytecode filters to match socket marks
net: diag: slightly refactor the inet_diag_bc_audit error checks.
net: diag: Add support to filter on device index
UPSTREAM: brcmfmac: avoid potential stack overflow in brcmf_cfg80211_start_ap()
Linux 4.4.24
ALSA: hda - Add the top speaker pin config for HP Spectre x360
ALSA: hda - Fix headset mic detection problem for several Dell laptops
ACPICA: acpi_get_sleep_type_data: Reduce warnings
ALSA: hda - Adding one more ALC255 pin definition for headset problem
Revert "usbtmc: convert to devm_kzalloc"
USB: serial: cp210x: Add ID for a Juniper console
Staging: fbtft: Fix bug in fbtft-core
usb: misc: legousbtower: Fix NULL pointer deference
USB: serial: cp210x: fix hardware flow-control disable
dm log writes: fix bug with too large bios
clk: xgene: Add missing parenthesis when clearing divider value
aio: mark AIO pseudo-fs noexec
batman-adv: remove unused callback from batadv_algo_ops struct
IB/mlx4: Use correct subnet-prefix in QP1 mads under SR-IOV
IB/mlx4: Fix code indentation in QP1 MAD flow
IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV
IB/ipoib: Don't allow MC joins during light MC flush
IB/core: Fix use after free in send_leave function
IB/ipoib: Fix memory corruption in ipoib cm mode connect flow
KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write
dmaengine: at_xdmac: fix to pass correct device identity to free_irq()
kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd
ASoC: omap-mcpdm: Fix irq resource handling
sysctl: handle error writing UINT_MAX to u32 fields
powerpc/prom: Fix sub-processor option passed to ibm, client-architecture-support
brcmsmac: Initialize power in brcms_c_stf_ss_algo_channel_get()
brcmsmac: Free packet if dma_mapping_error() fails in dma_rxfill
brcmfmac: Fix glob_skb leak in brcmf_sdiod_recv_chain
ASoC: Intel: Skylake: Fix error return code in skl_probe()
pNFS/flexfiles: Fix layoutcommit after a commit to DS
pNFS/files: Fix layoutcommit after a commit to DS
NFS: Don't drop CB requests with invalid principals
svc: Avoid garbage replies when pc_func() returns rpc_drop_reply
dmaengine: at_xdmac: fix debug string
fnic: pci_dma_mapping_error() doesn't return an error code
avr32: off by one in at32_init_pio()
ath9k: Fix programming of minCCA power threshold
gspca: avoid unused variable warnings
em28xx-i2c: rt_mutex_trylock() returns zero on failure
NFC: fdp: Detect errors from fdp_nci_create_conn()
iwlmvm: mvm: set correct state in smart-fifo configuration
tile: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
pstore: drop file opened reference count
blk-mq: actually hook up defer list when running requests
hwrng: omap - Fix assumption that runtime_get_sync will always succeed
ARM: sa1111: fix pcmcia suspend/resume
ARM: shmobile: fix regulator quirk for Gen2
ARM: sa1100: clear reset status prior to reboot
ARM: sa1100: fix 3.6864MHz clock
ARM: sa1100: register clocks early
ARM: sun5i: Fix typo in trip point temperature
regulator: qcom_smd: Fix voltage ranges for pm8x41
regulator: qcom_spmi: Update mvs1/mvs2 switches on pm8941
regulator: qcom_spmi: Add support for get_mode/set_mode on switches
regulator: qcom_spmi: Add support for S4 supply on pm8941
tpm: fix byte-order for the value read by tpm2_get_tpm_pt
printk: fix parsing of "brl=" option
MIPS: uprobes: fix use of uninitialised variable
MIPS: Malta: Fix IOCU disable switch read for MIPS64
MIPS: fix uretprobe implementation
MIPS: uprobes: remove incorrect set_orig_insn
arm64: debug: avoid resetting stepping state machine when TIF_SINGLESTEP
ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7
irqchip/gicv3: Silence noisy DEBUG_PER_CPU_MAPS warning
gpio: sa1100: fix irq probing for ucb1x00
usb: gadget: fsl_qe_udc: signedness bug in qe_get_frame()
ceph: fix race during filling readdir cache
iwlwifi: mvm: don't use ret when not initialised
iwlwifi: pcie: fix access to scratch buffer
spi: sh-msiof: Avoid invalid clock generator parameters
hwmon: (adt7411) set bit 3 in CFG1 register
nvmem: Declare nvmem_cell_read() consistently
ipvs: fix bind to link-local mcast IPv6 address in backup
tools/vm/slabinfo: fix an unintentional printf
mmc: pxamci: fix potential oops
drivers/perf: arm_pmu: Fix leak in error path
pinctrl: Flag strict is a field in struct pinmux_ops
pinctrl: uniphier: fix .pin_dbg_show() callback
i40e: avoid null pointer dereference
perf/core: Fix pmu::filter_match for SW-led groups
iwlwifi: mvm: fix a few firmware capability checks
usb: musb: fix DMA for host mode
usb: musb: Fix DMA desired mode for Mentor DMA engine
ARM: 8617/1: dma: fix dma_max_pfn()
ARM: 8616/1: dt: Respect property size when parsing CPUs
drm/radeon/si/dpm: add workaround for for Jet parts
drm/nouveau/fifo/nv04: avoid ramht race against cookie insertion
x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID
x86/init: Fix cr4_init_shadow() on CR4-less machines
can: dev: fix deadlock reported after bus-off
mm,ksm: fix endless looping in allocating memory when ksm enable
mtd: nand: davinci: Reinitialize the HW ECC engine in 4bit hwctl
cpuset: handle race between CPU hotplug and cpuset_hotplug_work
usercopy: fold builtin_const check into inline function
Linux 4.4.23
hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common()
qxl: check for kmap failures
power: supply: max17042_battery: fix model download bug.
power_supply: tps65217-charger: fix missing platform_set_drvdata()
PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
PM / hibernate: Restore processor state before using per-CPU variables
MIPS: paravirt: Fix undefined reference to smp_bootstrap
MIPS: Add a missing ".set pop" in an early commit
MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)
MIPS: Remove compact branch policy Kconfig entries
MIPS: vDSO: Fix Malta EVA mapping to vDSO page structs
MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
MIPS: Fix pre-r6 emulation FPU initialisation
i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
i2c-eg20t: fix race between i2c init and interrupt enable
btrfs: ensure that file descriptor used with subvol ioctls is a dir
nl80211: validate number of probe response CSA counters
can: flexcan: fix resume function
mm: delete unnecessary and unsafe init_tlb_ubc()
tracing: Move mutex to protect against resetting of seq data
fix memory leaks in tracing_buffers_splice_read()
power: reset: hisi-reboot: Unmap region obtained by of_iomap
mtd: pmcmsp-flash: Allocating too much in init_msp_flash()
mtd: maps: sa1100-flash: potential NULL dereference
fix fault_in_multipages_...() on architectures with no-op access_ok()
fanotify: fix list corruption in fanotify_get_response()
fsnotify: add a way to stop queueing events on group shutdown
xfs: prevent dropping ioend completions during buftarg wait
autofs: use dentry flags to block walks during expire
autofs races
pwm: Mark all devices as "might sleep"
bridge: re-introduce 'fix parsing of MLDv2 reports'
net: smc91x: fix SMC accesses
Revert "phy: IRQ cannot be shared"
net: dsa: bcm_sf2: Fix race condition while unmasking interrupts
net/mlx5: Added missing check of msg length in verifying its signature
tipc: fix NULL pointer dereference in shutdown()
net/irda: handle iriap_register_lsap() allocation failure
vti: flush x-netns xfrm cache when vti interface is removed
af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock'
Revert "af_unix: Fix splice-bind deadlock"
bonding: Fix bonding crash
megaraid: fix null pointer check in megasas_detach_one().
nouveau: fix nv40_perfctr_next() cleanup regression
Staging: iio: adc: fix indent on break statement
iwlegacy: avoid warning about missing braces
ath9k: fix misleading indentation
am437x-vfpe: fix typo in vpfe_get_app_input_index
Add braces to avoid "ambiguous ‘else’" compiler warnings
net: caif: fix misleading indentation
Makefile: Mute warning for __builtin_return_address(>0) for tracing only
Disable "frame-address" warning
Disable "maybe-uninitialized" warning globally
gcov: disable -Wmaybe-uninitialized warning
Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES
kbuild: forbid kernel directory to contain spaces and colons
tools: Support relative directory path for 'O='
Makefile: revert "Makefile: Document ability to make file.lst and file.S" partially
kbuild: Do not run modules_install and install in paralel
ocfs2: fix start offset to ocfs2_zero_range_for_truncate()
ocfs2/dlm: fix race between convert and migration
crypto: echainiv - Replace chaining with multiplication
crypto: skcipher - Fix blkcipher walk OOM crash
crypto: arm/aes-ctr - fix NULL dereference in tail processing
crypto: arm64/aes-ctr - fix NULL dereference in tail processing
tcp: properly scale window in tcp_v[46]_reqsk_send_ack()
tcp: fix use after free in tcp_xmit_retransmit_queue()
tcp: cwnd does not increase in TCP YeAH
ipv6: release dst in ping_v6_sendmsg
ipv4: panic in leaf_walk_rcu due to stale node pointer
reiserfs: fix "new_insert_key may be used uninitialized ..."
Fix build warning in kernel/cpuset.c
include/linux/kernel.h: change abs() macro so it uses consistent return type
Linux 4.4.22
openrisc: fix the fix of copy_from_user()
avr32: fix 'undefined reference to `___copy_from_user'
ia64: copy_from_user() should zero the destination on access_ok() failure
genirq/msi: Fix broken debug output
ppc32: fix copy_from_user()
sparc32: fix copy_from_user()
mn10300: copy_from_user() should zero on access_ok() failure...
nios2: copy_from_user() should zero the tail of destination
openrisc: fix copy_from_user()
parisc: fix copy_from_user()
metag: copy_from_user() should zero the destination on access_ok() failure
alpha: fix copy_from_user()
asm-generic: make copy_from_user() zero the destination properly
mips: copy_from_user() must zero the destination on access_ok() failure
hexagon: fix strncpy_from_user() error return
sh: fix copy_from_user()
score: fix copy_from_user() and friends
blackfin: fix copy_from_user()
cris: buggered copy_from_user/copy_to_user/clear_user
frv: fix clear_user()
asm-generic: make get_user() clear the destination on errors
ARC: uaccess: get_user to zero out dest in cause of fault
s390: get_user() should zero on failure
score: fix __get_user/get_user
nios2: fix __get_user()
sh64: failing __get_user() should zero
m32r: fix __get_user()
mn10300: failing __get_user() and get_user() should zero
fix minor infoleak in get_user_ex()
microblaze: fix copy_from_user()
avr32: fix copy_from_user()
microblaze: fix __get_user()
fix iov_iter_fault_in_readable()
irqchip/atmel-aic: Fix potential deadlock in ->xlate()
genirq: Provide irq_gc_{lock_irqsave,unlock_irqrestore}() helpers
drm: Only use compat ioctl for addfb2 on X86/IA64
drm: atmel-hlcdc: Fix vertical scaling
net: simplify napi_synchronize() to avoid warnings
kconfig: tinyconfig: provide whole choice blocks to avoid warnings
soc: qcom/spm: shut up uninitialized variable warning
pinctrl: at91-pio4: use %pr format string for resource
mmc: dw_mmc: use resource_size_t to store physical address
drm/i915: Avoid pointer arithmetic in calculating plane surface offset
mpssd: fix buffer overflow warning
gma500: remove annoying deprecation warning
ipv6: addrconf: fix dev refcont leak when DAD failed
sched/core: Fix a race between try_to_wake_up() and a woken up task
Revert "wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel"
ath9k: fix using sta->drv_priv before initializing it
md-cluster: make md-cluster also can work when compiled into kernel
xhci: fix null pointer dereference in stop command timeout function
fuse: direct-io: don't dirty ITER_BVEC pages
Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns
crypto: cryptd - initialize child shash_desc on import
arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb()
pinctrl: sunxi: fix uart1 CTS/RTS pins at PG on A23/A33
pinctrl: pistachio: fix mfio pll_lock pinmux
dm crypt: fix error with too large bios
dm log writes: move IO accounting earlier to fix error path
dm log writes: fix check of kthread_run() return value
bus: arm-ccn: Fix XP watchpoint settings bitmask
bus: arm-ccn: Do not attempt to configure XPs for cycle counter
bus: arm-ccn: Fix PMU handling of MN
ARM: dts: STiH407-family: Provide interconnect clock for consumption in ST SDHCI
ARM: dts: overo: fix gpmc nand on boards with ethernet
ARM: dts: overo: fix gpmc nand cs0 range
ARM: dts: imx6qdl: Fix SPDIF regression
ARM: OMAP3: hwmod data: Add sysc information for DSI
ARM: kirkwood: ib62x0: fix size of u-boot environment partition
ARM: imx6: add missing BM_CLPCR_BYPASS_PMIC_READY setting for imx6sx
ARM: imx6: add missing BM_CLPCR_BYP_MMDC_CH0_LPM_HS setting for imx6ul
ARM: AM43XX: hwmod: Fix RSTST register offset for pruss
cpuset: make sure new tasks conform to the current config of the cpuset
net: thunderx: Fix OOPs with ethtool --register-dump
USB: change bInterval default to 10 ms
ARM: dts: STiH410: Handle interconnect clock required by EHCI/OHCI (USB)
usb: chipidea: udc: fix NULL ptr dereference in isr_setup_status_phase
usb: renesas_usbhs: fix clearing the {BRDY,BEMP}STS condition
USB: serial: simple: add support for another Infineon flashloader
serial: 8250: added acces i/o products quad and octal serial cards
serial: 8250_mid: fix divide error bug if baud rate is 0
iio: ensure ret is initialized to zero before entering do loop
iio:core: fix IIO_VAL_FRACTIONAL sign handling
iio: accel: kxsd9: Fix scaling bug
iio: fix pressure data output unit in hid-sensor-attributes
iio: accel: bmc150: reset chip at init time
iio: adc: at91: unbreak channel adc channel 3
iio: ad799x: Fix buffered capture for ad7991/ad7995/ad7999
iio: adc: ti_am335x_adc: Increase timeout value waiting for ADC sample
iio: adc: ti_am335x_adc: Protect FIFO1 from concurrent access
iio: adc: rockchip_saradc: reset saradc controller before programming it
iio: proximity: as3935: set up buffer timestamps for non-zero values
iio: accel: kxsd9: Fix raw read return
kvm-arm: Unmap shadow pagetables properly
x86/AMD: Apply erratum 665 on machines without a BIOS fix
x86/paravirt: Do not trace _paravirt_ident_*() functions
ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS
IB/uverbs: Fix race between uverbs_close and remove_one
dm flakey: fix reads to be issued if drop_writes configured
audit: fix exe_file access in audit_exe_compare
mm: introduce get_task_exe_file
kexec: fix double-free when failing to relocate the purgatory
NFSv4.1: Fix the CREATE_SESSION slot number accounting
pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised
nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock
NFSv4.x: Fix a refcount leak in nfs_callback_up_net
pNFS: The client must not do I/O to the DS if it's lease has expired
kernfs: don't depend on d_find_any_alias() when generating notifications
powerpc/mm: Don't alias user region to other regions below PAGE_OFFSET
powerpc/powernv : Drop reference added by kset_find_obj()
powerpc/tm: do not use r13 for tabort_syscall
tipc: move linearization of buffers to generic code
lightnvm: put bio before return
fscrypto: require write access to mount to set encryption policy
Revert "KVM: x86: fix missed hardware breakpoints"
MIPS: KVM: Check for pfn noslot case
clocksource/drivers/sun4i: Clear interrupts after stopping timer in probe function
fscrypto: add authorization check for setting encryption policy
ext4: use __GFP_NOFAIL in ext4_free_blocks()
Conflicts:
arch/arm/kernel/devtree.c
arch/arm64/Kconfig
arch/arm64/kernel/arm64ksyms.c
arch/arm64/kernel/psci.c
arch/arm64/mm/fault.c
drivers/android/binder.c
drivers/usb/host/xhci-hub.c
fs/ext4/readpage.c
include/linux/mmc/core.h
include/linux/mmzone.h
mm/memcontrol.c
net/core/filter.c
net/netlink/af_netlink.c
net/netlink/af_netlink.h
Change-Id: I99fe7a0914e83e284b11b33185b71448a8999d1f
Signed-off-by: Runmin Wang <runminw@codeaurora.org>
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
|
| | |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The dirty balance reserve that dirty throttling has to consider is
merely memory not available to userspace allocations. There is nothing
writeback-specific about it. Generalize the name so that it's reusable
outside of that context.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit a8d0143730d7b42c9fe6d1435d92ecce6863a62a)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
commit 21f54ddae449f4bdd9f1498124901d67202243d9 upstream.
That just generally kills the machine, and makes debugging only much
harder, since the traces may long be gone.
Debugging by assert() is a disease. Don't do it. If you can continue,
you're much better off doing so with a live machine where you have a
much higher chance that the report actually makes it to the system logs,
rather than result in a machine that is just completely dead.
The only valid situation for BUG_ON() is when continuing is not an
option, because there is massive corruption. But if you are just
verifying that something is true, you warn about your broken assumptions
(preferably just once), and limp on.
Fixes: 22f2ac51b6d6 ("mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()")
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
replace_page_cache_page()
commit 22f2ac51b6d643666f4db093f13144f773ff3f3a upstream.
Antonio reports the following crash when using fuse under memory pressure:
kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
invalid opcode: 0000 [#1] SMP
Modules linked in: all of them
CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu
Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
RIP: shadow_lru_isolate+0x181/0x190
Call Trace:
__list_lru_walk_one.isra.3+0x8f/0x130
list_lru_walk_one+0x23/0x30
scan_shadow_nodes+0x34/0x50
shrink_slab.part.40+0x1ed/0x3d0
shrink_zone+0x2ca/0x2e0
kswapd+0x51e/0x990
kthread+0xd8/0xf0
ret_from_fork+0x3f/0x70
which corresponds to the following sanity check in the shadow node
tracking:
BUG_ON(node->count & RADIX_TREE_COUNT_MASK);
The workingset code tracks radix tree nodes that exclusively contain
shadow entries of evicted pages in them, and this (somewhat obscure)
line checks whether there are real pages left that would interfere with
reclaim of the radix tree node under memory pressure.
While discussing ways how fuse might sneak pages into the radix tree
past the workingset code, Miklos pointed to replace_page_cache_page(),
and indeed there is a problem there: it properly accounts for the old
page being removed - __delete_from_page_cache() does that - but then
does a raw raw radix_tree_insert(), not accounting for the replacement
page. Eventually the page count bits in node->count underflow while
leaving the node incorrectly linked to the shadow node LRU.
To address this, make sure replace_page_cache_page() uses the tracked
page insertion code, page_cache_tree_insert(). This fixes the page
accounting and makes sure page-containing nodes are properly unlinked
from the shadow node LRU again.
Also, make the sanity checks a bit less obscure by using the helpers for
checking the number of pages and shadows in a radix tree node.
[mhocko@suse.com: backport for 4.4]
Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Antonio SJ Musumeci <trapexit@spawn.link>
Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add support to receive a static ratio from userspace to
divide the swap pages between ZRAM and disk based swap
devices. The existing infrastructure allows to keep
same priority for multiple swap devices, which results
in round robin distribution of pages. With this patch,
the ratio can be defined.
CRs-fixed: 968416
Change-Id: I54f54489db84cabb206569dd62d61a8a7a898991
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
|
| |/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are couple of issues with swapcache usage when ZRAM is used
as swap device.
1) Kernel does a swap readahead which can be around 6 to 8 pages
depending on total ram, which is not required for zram since
accesses are fast.
2) Kernel delays the freeing up of swapcache expecting a later hit,
which again is useless in the case of zram.
3) This is not related to swapcache, but zram usage itself.
As mentioned in (2) kernel delays freeing of swapcache, but along with
that it delays zram compressed page free also. i.e. there can be 2 copies,
though one is compressed.
This patch addresses these issues using two new flags
QUEUE_FLAG_FAST and SWP_FAST, to indicate that accesses to the device
will be fast and cheap, and instructs the swap layer to free up
swap space agressively, and not to do read ahead.
Change-Id: I5d2d5176a5f9420300bb2f843f6ecbdb25ea80e4
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zswap_get_swap_cache_page and read_swap_cache_async have pretty much the
same code with only significant difference in return value and usage of
swap_readpage.
I a helper __read_swap_cache_async() with the common code. Behavior
change: now zswap_get_swap_cache_page will use radix_tree_maybe_preload
instead radix_tree_preload. Looks like, this wasn't changed only by the
reason of code duplication.
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mem_cgroup structure is defined in mm/memcontrol.c currently which means
that the code outside of this file has to use external API even for
trivial access stuff.
This patch exports mm_struct with its dependencies and makes some of the
exported functions inlines. This even helps to reduce the code size a bit
(make defconfig + CONFIG_MEMCG=y)
text data bss dec hex filename
12355346 1823792 1089536 15268674 e8fb42 vmlinux.before
12354970 1823792 1089536 15268298 e8f9ca vmlinux.after
This is not much (370B) but better than nothing.
We also save a function call in some hot paths like callers of
mem_cgroup_count_vm_event which is used for accounting.
The patch doesn't introduce any functional changes.
[vdavykov@parallels.com: inline memcg_kmem_is_active]
[vdavykov@parallels.com: do not expose type outside of CONFIG_MEMCG]
[akpm@linux-foundation.org: memcontrol.h needs eventfd.h for eventfd_ctx]
[akpm@linux-foundation.org: export mem_cgroup_from_task() to modules]
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to know per-process workingset size for smart memory management
on userland and we use swap(ex, zram) heavily to maximize memory
efficiency so workingset includes swap as well as RSS.
On such system, if there are lots of shared anonymous pages, it's really
hard to figure out exactly how many each process consumes memory(ie, rss
+ wap) if the system has lots of shared anonymous memory(e.g, android).
This patch introduces SwapPss field on /proc/<pid>/smaps so we can get
more exact workingset size per process.
Bongkyu tested it. Result is below.
1. 50M used swap
SwapTotal: 461976 kB
SwapFree: 411192 kB
$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
48236
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
141184
2. 240M used swap
SwapTotal: 461976 kB
SwapFree: 216808 kB
$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
230315
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
1387744
[akpm@linux-foundation.org: simplify kunmap_atomic() call]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Bongkyu Kim <bongkyu.kim@lge.com>
Tested-by: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we have two different ways to signal an I/O error on a BIO:
(1) by clearing the BIO_UPTODATE flag
(2) by returning a Linux errno value to the bi_end_io callback
The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario. Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.
So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Stop abusing struct page functionality and the swap end_io handler, and
instead add a modified version of the blk-lib.c bio_batch helpers.
Also move the block I/O code into swap.c as they are directly tied into
each other.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Ming Lin <mlin@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"deactivate_page" was created for file invalidation so it has too
specific logic for file-backed pages. So, let's change the name of the
function and date to a file-specific one and yield the generic name.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Wang, Yalin <Yalin.Wang@sonymobile.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
| |
The body of this function was removed by commit 0a31bc97c80c ("mm:
memcontrol: rewrite uncharge API").
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
swp_entry_t being defined in include/linux/swap.h instead of
include/linux/mm_types.h causes cyclic include dependency later when
include/linux/page_cgroup.h is included from writeback path. Move the
definition to include/linux/mm_types.h.
While at it, reformat the comment above it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a memcg with even just moderate cache pressure, success rates for
transparent huge page allocations drop to zero, wasting a lot of effort
that the allocator puts into assembling these pages.
The reason for this is that the memcg reclaim code was never designed for
higher-order charges. It reclaims in small batches until there is room
for at least one page. Huge page charges only succeed when these batches
add up over a series of huge faults, which is unlikely under any
significant load involving order-0 allocations in the group.
Remove that loop on the memcg side in favor of passing the actual reclaim
goal to direct reclaim, which is already set up and optimized to meet
higher-order goals efficiently.
This brings memcg's THP policy in line with the system policy: if the
allocator painstakingly assembles a hugepage, memcg will at least make an
honest effort to charge it. As a result, transparent hugepage allocation
rates amid cache activity are drastically improved:
vanilla patched
pgalloc 4717530.80 ( +0.00%) 4451376.40 ( -5.64%)
pgfault 491370.60 ( +0.00%) 225477.40 ( -54.11%)
pgmajfault 2.00 ( +0.00%) 1.80 ( -6.67%)
thp_fault_alloc 0.00 ( +0.00%) 531.60 (+100.00%)
thp_fault_fallback 749.00 ( +0.00%) 217.40 ( -70.88%)
[ Note: this may in turn increase memory consumption from internal
fragmentation, which is an inherent risk of transparent hugepages.
Some setups may have to adjust the memcg limits accordingly to
accomodate this - or, if the machine is already packed to capacity,
disable the transparent huge page feature. ]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Dave Hansen <dave@sr71.net>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The deprecation warnings for the scan_unevictable interface triggers by
scripts doing `sysctl -a | grep something else'. This is annoying and not
helpful.
The interface has been defunct since 264e56d8247e ("mm: disable user
interface to manually rescue unevictable pages"), which was in 2011, and
there haven't been any reports of usecases for it, only reports that the
deprecation warnings are annying. It's unlikely that anybody is using
this interface specifically at this point, so remove it.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The memcg uncharging code that is involved towards the end of a page's
lifetime - truncation, reclaim, swapout, migration - is impressively
complicated and fragile.
Because anonymous and file pages were always charged before they had their
page->mapping established, uncharges had to happen when the page type
could still be known from the context; as in unmap for anonymous, page
cache removal for file and shmem pages, and swap cache truncation for swap
pages. However, these operations happen well before the page is actually
freed, and so a lot of synchronization is necessary:
- Charging, uncharging, page migration, and charge migration all need
to take a per-page bit spinlock as they could race with uncharging.
- Swap cache truncation happens during both swap-in and swap-out, and
possibly repeatedly before the page is actually freed. This means
that the memcg swapout code is called from many contexts that make
no sense and it has to figure out the direction from page state to
make sure memory and memory+swap are always correctly charged.
- On page migration, the old page might be unmapped but then reused,
so memcg code has to prevent untimely uncharging in that case.
Because this code - which should be a simple charge transfer - is so
special-cased, it is not reusable for replace_page_cache().
But now that charged pages always have a page->mapping, introduce
mem_cgroup_uncharge(), which is called after the final put_page(), when we
know for sure that nobody is looking at the page anymore.
For page migration, introduce mem_cgroup_migrate(), which is called after
the migration is successful and the new page is fully rmapped. Because
the old page is no longer uncharged after migration, prevent double
charges by decoupling the page's memcg association (PCG_USED and
pc->mem_cgroup) from the page holding an actual charge. The new bits
PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
to the new page during migration.
mem_cgroup_migrate() is suitable for replace_page_cache() as well,
which gets rid of mem_cgroup_replace_page_cache(). However, care
needs to be taken because both the source and the target page can
already be charged and on the LRU when fuse is splicing: grab the page
lock on the charge moving side to prevent changing pc->mem_cgroup of a
page under migration. Also, the lruvecs of both pages change as we
uncharge the old and charge the new during migration, and putback may
race with us, so grab the lru lock and isolate the pages iff on LRU to
prevent races and ensure the pages are on the right lruvec afterward.
Swap accounting is massively simplified: because the page is no longer
uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
before the final put_page() in page reclaim.
Finally, page_cgroup changes are now protected by whatever protection the
page itself offers: anonymous pages are charged under the page table lock,
whereas page cache insertions, swapin, and migration hold the page lock.
Uncharging happens under full exclusion with no outstanding references.
Charging and uncharging also ensure that the page is off-LRU, which
serializes against charge migration. Remove the very costly page_cgroup
lock and set pc->flags non-atomically.
[mhocko@suse.cz: mem_cgroup_charge_statistics needs preempt_disable]
[vdavydov@parallels.com: fix flags definition]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Jet Chen <jet.chen@intel.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These patches rework memcg charge lifetime to integrate more naturally
with the lifetime of user pages. This drastically simplifies the code and
reduces charging and uncharging overhead. The most expensive part of
charging and uncharging is the page_cgroup bit spinlock, which is removed
entirely after this series.
Here are the top-10 profile entries of a stress test that reads a 128G
sparse file on a freshly booted box, without even a dedicated cgroup (i.e.
executing in the root memcg). Before:
15.36% cat [kernel.kallsyms] [k] copy_user_generic_string
13.31% cat [kernel.kallsyms] [k] memset
11.48% cat [kernel.kallsyms] [k] do_mpage_readpage
4.23% cat [kernel.kallsyms] [k] get_page_from_freelist
2.38% cat [kernel.kallsyms] [k] put_page
2.32% cat [kernel.kallsyms] [k] __mem_cgroup_commit_charge
2.18% kswapd0 [kernel.kallsyms] [k] __mem_cgroup_uncharge_common
1.92% kswapd0 [kernel.kallsyms] [k] shrink_page_list
1.86% cat [kernel.kallsyms] [k] __radix_tree_lookup
1.62% cat [kernel.kallsyms] [k] __pagevec_lru_add_fn
After:
15.67% cat [kernel.kallsyms] [k] copy_user_generic_string
13.48% cat [kernel.kallsyms] [k] memset
11.42% cat [kernel.kallsyms] [k] do_mpage_readpage
3.98% cat [kernel.kallsyms] [k] get_page_from_freelist
2.46% cat [kernel.kallsyms] [k] put_page
2.13% kswapd0 [kernel.kallsyms] [k] shrink_page_list
1.88% cat [kernel.kallsyms] [k] __radix_tree_lookup
1.67% cat [kernel.kallsyms] [k] __pagevec_lru_add_fn
1.39% kswapd0 [kernel.kallsyms] [k] free_pcppages_bulk
1.30% cat [kernel.kallsyms] [k] kfree
As you can see, the memcg footprint has shrunk quite a bit.
text data bss dec hex filename
37970 9892 400 48262 bc86 mm/memcontrol.o.old
35239 9892 400 45531 b1db mm/memcontrol.o
This patch (of 4):
The memcg charge API charges pages before they are rmapped - i.e. have an
actual "type" - and so every callsite needs its own set of charge and
uncharge functions to know what type is being operated on. Worse,
uncharge has to happen from a context that is still type-specific, rather
than at the end of the page's lifetime with exclusive access, and so
requires a lot of synchronization.
Rewrite the charge API to provide a generic set of try_charge(),
commit_charge() and cancel_charge() transaction operations, much like
what's currently done for swap-in:
mem_cgroup_try_charge() attempts to reserve a charge, reclaiming
pages from the memcg if necessary.
mem_cgroup_commit_charge() commits the page to the charge once it
has a valid page->mapping and PageAnon() reliably tells the type.
mem_cgroup_cancel_charge() aborts the transaction.
This reduces the charge API and enables subsequent patches to
drastically simplify uncharging.
As pages need to be committed after rmap is established but before they
are added to the LRU, page_add_new_anon_rmap() must stop doing LRU
additions again. Revive lru_cache_add_active_or_unevictable().
[hughd@google.com: fix shmem_unuse]
[hughd@google.com: Add comments on the private use of -EAGAIN]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do we really need an exported alias for __SetPageReferenced()? Its
callers better know what they're doing, in which case the page would not
be already marked referenced. Kill init_page_accessed(), just
__SetPageReferenced() inline.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Prabhakar Lad <prabhakar.csengg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
correct comments.
Currently, we use (zone->managed_pages + KSWAPD_ZONE_BALANCE_GAP_RATIO-1)
/ KSWAPD_ZONE_BALANCE_GAP_RATIO to avoid a zero gap value. It's better to
use DIV_ROUND_UP macro for neater code and clear meaning.
Besides, the gap value is calculated against the per-zone "managed pages",
not "present pages". This patch also corrects the comment and do some
rephrasing.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
possible
aops->write_begin may allocate a new page and make it visible only to have
mark_page_accessed called almost immediately after. Once the page is
visible the atomic operations are necessary which is noticable overhead
when writing to an in-memory filesystem like tmpfs but should also be
noticable with fast storage. The objective of the patch is to initialse
the accessed information with non-atomic operations before the page is
visible.
The bulk of filesystems directly or indirectly use
grab_cache_page_write_begin or find_or_create_page for the initial
allocation of a page cache page. This patch adds an init_page_accessed()
helper which behaves like the first call to mark_page_accessed() but may
called before the page is visible and can be done non-atomically.
The primary APIs of concern in this care are the following and are used
by most filesystems.
find_get_page
find_lock_page
find_or_create_page
grab_cache_page_nowait
grab_cache_page_write_begin
All of them are very similar in detail to the patch creates a core helper
pagecache_get_page() which takes a flags parameter that affects its
behavior such as whether the page should be marked accessed or not. Then
old API is preserved but is basically a thin wrapper around this core
function.
Each of the filesystems are then updated to avoid calling
mark_page_accessed when it is known that the VM interfaces have already
done the job. There is a slight snag in that the timing of the
mark_page_accessed() has now changed so in rare cases it's possible a page
gets to the end of the LRU as PageReferenced where as previously it might
have been repromoted. This is expected to be rare but it's worth the
filesystem people thinking about it in case they see a problem with the
timing change. It is also the case that some filesystems may be marking
pages accessed that previously did not but it makes sense that filesystems
have consistent behaviour in this regard.
The test case used to evaulate this is a simple dd of a large file done
multiple times with the file deleted on each iterations. The size of the
file is 1/10th physical memory to avoid dirty page balancing. In the
async case it will be possible that the workload completes without even
hitting the disk and will have variable results but highlight the impact
of mark_page_accessed for async IO. The sync results are expected to be
more stable. The exception is tmpfs where the normal case is for the "IO"
to not hit the disk.
The test machine was single socket and UMA to avoid any scheduling or NUMA
artifacts. Throughput and wall times are presented for sync IO, only wall
times are shown for async as the granularity reported by dd and the
variability is unsuitable for comparison. As async results were variable
do to writback timings, I'm only reporting the maximum figures. The sync
results were stable enough to make the mean and stddev uninteresting.
The performance results are reported based on a run with no profiling.
Profile data is based on a separate run with oprofile running.
async dd
3.15.0-rc3 3.15.0-rc3
vanilla accessed-v2
ext3 Max elapsed 13.9900 ( 0.00%) 11.5900 ( 17.16%)
tmpfs Max elapsed 0.5100 ( 0.00%) 0.4900 ( 3.92%)
btrfs Max elapsed 12.8100 ( 0.00%) 12.7800 ( 0.23%)
ext4 Max elapsed 18.6000 ( 0.00%) 13.3400 ( 28.28%)
xfs Max elapsed 12.5600 ( 0.00%) 2.0900 ( 83.36%)
The XFS figure is a bit strange as it managed to avoid a worst case by
sheer luck but the average figures looked reasonable.
samples percentage
ext3 86107 0.9783 vmlinux-3.15.0-rc4-vanilla mark_page_accessed
ext3 23833 0.2710 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
ext3 5036 0.0573 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
ext4 64566 0.8961 vmlinux-3.15.0-rc4-vanilla mark_page_accessed
ext4 5322 0.0713 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
ext4 2869 0.0384 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
xfs 62126 1.7675 vmlinux-3.15.0-rc4-vanilla mark_page_accessed
xfs 1904 0.0554 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
xfs 103 0.0030 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
btrfs 10655 0.1338 vmlinux-3.15.0-rc4-vanilla mark_page_accessed
btrfs 2020 0.0273 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
btrfs 587 0.0079 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
tmpfs 59562 3.2628 vmlinux-3.15.0-rc4-vanilla mark_page_accessed
tmpfs 1210 0.0696 vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
tmpfs 94 0.0054 vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
[akpm@linux-foundation.org: don't run init_page_accessed() against an uninitialised pointer]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Tested-by: Prabhakar Lad <prabhakar.csengg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cold is a bool, make it one. Make the likely case the "if" part of the
block instead of the else as according to the optimisation manual this is
preferred.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally get_swap_page() started iterating through the singly-linked
list of swap_info_structs using swap_list.next or highest_priority_index,
which both were intended to point to the highest priority active swap
target that was not full. The first patch in this series changed the
singly-linked list to a doubly-linked list, and removed the logic to start
at the highest priority non-full entry; it starts scanning at the highest
priority entry each time, even if the entry is full.
Replace the manually ordered swap_list_head with a plist, swap_active_head.
Add a new plist, swap_avail_head. The original swap_active_head plist
contains all active swap_info_structs, as before, while the new
swap_avail_head plist contains only swap_info_structs that are active and
available, i.e. not full. Add a new spinlock, swap_avail_lock, to protect
the swap_avail_head list.
Mel Gorman suggested using plists since they internally handle ordering
the list entries based on priority, which is exactly what swap was doing
manually. All the ordering code is now removed, and swap_info_struct
entries and simply added to their corresponding plist and automatically
ordered correctly.
Using a new plist for available swap_info_structs simplifies and
optimizes get_swap_page(), which no longer has to iterate over full
swap_info_structs. Using a new spinlock for swap_avail_head plist
allows each swap_info_struct to add or remove themselves from the
plist when they become full or not-full; previously they could not
do so because the swap_info_struct->lock is held when they change
from full<->not-full, and the swap_lock protecting the main
swap_active_head must be ordered before any swap_info_struct->lock.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Weijie Yang <weijieut@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic controlling the singly-linked list of swap_info_struct entries
for all active, i.e. swapon'ed, swap targets is rather complex, because:
- it stores the entries in priority order
- there is a pointer to the highest priority entry
- there is a pointer to the highest priority not-full entry
- there is a highest_priority_index variable set outside the swap_lock
- swap entries of equal priority should be used equally
this complexity leads to bugs such as: https://lkml.org/lkml/2014/2/13/181
where different priority swap targets are incorrectly used equally.
That bug probably could be solved with the existing singly-linked lists,
but I think it would only add more complexity to the already difficult to
understand get_swap_page() swap_list iteration logic.
The first patch changes from a singly-linked list to a doubly-linked list
using list_heads; the highest_priority_index and related code are removed
and get_swap_page() starts each iteration at the highest priority
swap_info entry, even if it's full. While this does introduce unnecessary
list iteration (i.e. Schlemiel the painter's algorithm) in the case where
one or more of the highest priority entries are full, the iteration and
manipulation code is much simpler and behaves correctly re: the above bug;
and the fourth patch removes the unnecessary iteration.
The second patch adds some minor plist helper functions; nothing new
really, just functions to match existing regular list functions. These
are used by the next two patches.
The third patch adds plist_requeue(), which is used by get_swap_page() in
the next patch - it performs the requeueing of same-priority entries
(which moves the entry to the end of its priority in the plist), so that
all equal-priority swap_info_structs get used equally.
The fourth patch converts the main list into a plist, and adds a new plist
that contains only swap_info entries that are both active and not full.
As Mel suggested using plists allows removing all the ordering code from
swap - plists handle ordering automatically. The list naming is also
clarified now that there are two lists, with the original list changed
from swap_list_head to swap_active_head and the new list named
swap_avail_head. A new spinlock is also added for the new list, so
swap_info entries can be added or removed from the new list immediately as
they become full or not full.
This patch (of 4):
Replace the singly-linked list tracking active, i.e. swapon'ed,
swap_info_struct entries with a doubly-linked list using struct
list_heads. Simplify the logic iterating and manipulating the list of
entries, especially get_swap_page(), by using standard list_head
functions, and removing the highest priority iteration logic.
The change fixes the bug:
https://lkml.org/lkml/2014/2/13/181
in which different priority swap entries after the highest priority entry
are incorrectly used equally in pairs. The swap behavior is now as
advertised, i.e. different priority swap entries are used in order, and
equal priority swap targets are used concurrently.
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Weijie Yang <weijieut@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In mm/swap.c, __lru_cache_add() is exported, but actually there are no
users outside this file.
This patch unexports __lru_cache_add(), and makes it static. It also
exports lru_cache_add_file(), as it is use by cifs and fuse, which can
loaded as modules.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, page cache radix tree nodes were freed after reclaim emptied
out their page pointers. But now reclaim stores shadow entries in their
place, which are only reclaimed when the inodes themselves are
reclaimed. This is problematic for bigger files that are still in use
after they have a significant amount of their cache reclaimed, without
any of those pages actually refaulting. The shadow entries will just
sit there and waste memory. In the worst case, the shadow entries will
accumulate until the machine runs out of memory.
To get this under control, the VM will track radix tree nodes
exclusively containing shadow entries on a per-NUMA node list. Per-NUMA
rather than global because we expect the radix tree nodes themselves to
be allocated node-locally and we want to reduce cross-node references of
otherwise independent cache workloads. A simple shrinker will then
reclaim these nodes on memory pressure.
A few things need to be stored in the radix tree node to implement the
shadow node LRU and allow tree deletions coming from the list:
1. There is no index available that would describe the reverse path
from the node up to the tree root, which is needed to perform a
deletion. To solve this, encode in each node its offset inside the
parent. This can be stored in the unused upper bits of the same
member that stores the node's height at no extra space cost.
2. The number of shadow entries needs to be counted in addition to the
regular entries, to quickly detect when the node is ready to go to
the shadow node LRU list. The current entry count is an unsigned
int but the maximum number of entries is 64, so a shadow counter
can easily be stored in the unused upper bits.
3. Tree modification needs tree lock and tree root, which are located
in the address space, so store an address_space backpointer in the
node. The parent pointer of the node is in a union with the 2-word
rcu_head, so the backpointer comes at no extra cost as well.
4. The node needs to be linked to an LRU list, which requires a list
head inside the node. This does increase the size of the node, but
it does not change the number of objects that fit into a slab page.
[akpm@linux-foundation.org: export the right function]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VM maintains cached filesystem pages on two types of lists. One
list holds the pages recently faulted into the cache, the other list
holds pages that have been referenced repeatedly on that first list.
The idea is to prefer reclaiming young pages over those that have shown
to benefit from caching in the past. We call the recently usedbut
ultimately was not significantly better than a FIFO policy and still
thrashed cache based on eviction speed, rather than actual demand for
cache.
This patch solves one half of the problem by decoupling the ability to
detect working set changes from the inactive list size. By maintaining
a history of recently evicted file pages it can detect frequently used
pages with an arbitrarily small inactive list size, and subsequently
apply pressure on the active list based on actual demand for cache, not
just overall eviction speed.
Every zone maintains a counter that tracks inactive list aging speed.
When a page is evicted, a snapshot of this counter is stored in the
now-empty page cache radix tree slot. On refault, the minimum access
distance of the page can be assessed, to evaluate whether the page
should be part of the active list or not.
This fixes the VM's blindness towards working set changes in excess of
the inactive list. And it's the foundation to further improve the
protection ability and reduce the minimum inactive list size of 50%.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Metin Doslu <metin@citusdata.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ozgun Erdogan <ozgun@citusdata.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
make lru_add_drain_all() only selectively interrupt the cpus that have
per-cpu free pages that can be drained.
This is important in nohz mode where calling mlockall(), for example,
otherwise will interrupt every core unnecessarily.
This is important on workloads where nohz cores are handling 10 Gb traffic
in userspace. Those CPUs do not enter the kernel and place pages into LRU
pagevecs and they really, really don't want to be interrupted, or they
drop packets on the floor.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
PageSwapCache() is always false when !CONFIG_SWAP, so compiler
properly discard related code. Therefore, we don't need #ifdef explicitly.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
swap cluster allocation is to get better request merge to improve
performance. But the cluster is shared globally, if multiple tasks are
doing swap, this will cause interleave disk access. While multiple tasks
swap is quite common, for example, each numa node has a kswapd thread
doing swap and multiple threads/processes doing direct page reclaim.
ioscheduler can't help too much here, because tasks don't send swapout IO
down to block layer in the meantime. Block layer does merge some IOs, but
a lot not, depending on how many tasks are doing swapout concurrently. In
practice, I've seen a lot of small size IO in swapout workloads.
We makes the cluster allocation per-cpu here. The interleave disk access
issue goes away. All tasks swapout to their own cluster, so swapout will
become sequential, which can be easily merged to big size IO. If one CPU
can't get its per-cpu cluster (for example, there is no free cluster
anymore in the swap), it will fallback to scan swap_map. The CPU can
still continue swap. We don't need recycle free swap entries of other
CPUs.
In my test (swap to a 2-disk raid0 partition), this improves around 10%
swapout throughput, and request size is increased significantly.
How does this impact swap readahead is uncertain though. On one side,
page reclaim always isolates and swaps several adjancent pages, this will
make page reclaim write the pages sequentially and benefit readahead. On
the other side, several CPU write pages interleave means the pages don't
live _sequentially_ but relatively _near_. In the per-cpu allocation
case, if adjancent pages are written by different cpus, they will live
relatively _far_. So how this impacts swap readahead depends on how many
pages page reclaim isolates and swaps one time. If the number is big,
this patch will benefit swap readahead. Of course, this is about
sequential access pattern. The patch has no impact for random access
pattern, because the new cluster allocation algorithm is just for SSD.
Alternative solution is organizing swap layout to be per-mm instead of
this per-cpu approach. In the per-mm layout, we allocate a disk range for
each mm, so pages of one mm live in swap disk adjacently. per-mm layout
has potential issues of lock contention if multiple reclaimers are swap
pages from one mm. For a sequential workload, per-mm layout is better to
implement swap readahead, because pages from the mm are adjacent in disk.
But per-cpu layout isn't very bad in this workload, as page reclaim always
isolates and swaps several pages one time, such pages will still live in
disk sequentially and readahead can utilize this. For a random workload,
per-mm layout isn't beneficial of request merge, because it's quite
possible pages from different mm are swapout in the meantime and IO can't
be merged in per-mm layout. while with per-cpu layout we can merge
requests from any mm. Considering random workload is more popular in
workloads with swap (and per-cpu approach isn't too bad for sequential
workload too), I'm choosing per-cpu layout.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kyungmin Park <kmpark@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
swap can do cluster discard for SSD, which is good, but there are some
problems here:
1. swap do the discard just before page reclaim gets a swap entry and
writes the disk sectors. This is useless for high end SSD, because an
overwrite to a sector implies a discard to original sector too. A
discard + overwrite == overwrite.
2. the purpose of doing discard is to improve SSD firmware garbage
collection. Idealy we should send discard as early as possible, so
firmware can do something smart. Sending discard just after swap entry
is freed is considered early compared to sending discard before write.
Of course, if workload is already bound to gc speed, sending discard
earlier or later doesn't make
3. block discard is a sync API, which will delay scan_swap_map()
significantly.
4. Write and discard command can be executed parallel in PCIe SSD.
Making swap discard async can make execution more efficiently.
This patch makes swap discard async and moves discard to where swap entry
is freed. Discard and write have no dependence now, so above issues can
be avoided. Idealy we should do discard for any freed sectors, but some
SSD discard is very slow. This patch still does discard for a whole
cluster.
My test does a several round of 'mmap, write, unmap', which will trigger a
lot of swap discard. In a fusionio card, with this patch, the test
runtime is reduced to 18% of the time without it, so around 5.5x faster.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kyungmin Park <kmpark@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm using a fast SSD to do swap. scan_swap_map() sometimes uses up to
20~30% CPU time (when cluster is hard to find, the CPU time can be up to
80%), which becomes a bottleneck. scan_swap_map() scans a byte array to
search a 256 page cluster, which is very slow.
Here I introduced a simple algorithm to search cluster. Since we only
care about 256 pages cluster, we can just use a counter to track if a
cluster is free. Every 256 pages use one int to store the counter. If
the counter of a cluster is 0, the cluster is free. All free clusters
will be added to a list, so searching cluster is very efficient. With
this, scap_swap_map() overhead disappears.
This might help low end SD card swap too. Because if the cluster is
aligned, SD firmware can do flash erase more efficiently.
We only enable the algorithm for SSD. Hard disk swap isn't fast enough
and has downside with the algorithm which might introduce regression (see
below).
The patch slightly changes which cluster is choosen. It always adds free
cluster to list tail. This can help wear leveling for low end SSD too.
And if no cluster found, the scan_swap_map() will do search from the end
of last cluster. So if no cluster found, the scan_swap_map() will do
search from the end of last free cluster, which is random. For SSD, this
isn't a problem at all.
Another downside is the cluster must be aligned to 256 pages, which will
reduce the chance to find a cluster. I would expect this isn't a big
problem for SSD because of the non-seek penality. (And this is the reason
I only enable the algorithm for SSD).
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kyungmin Park <kmpark@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Considering the use cases where the swap device supports discard:
a) and can do it quickly;
b) but it's slow to do in small granularities (or concurrent with other
I/O);
c) but the implementation is so horrendous that you don't even want to
send one down;
And assuming that the sysadmin considers it useful to send the discards down
at all, we would (probably) want the following solutions:
i. do the fine-grained discards for freed swap pages, if device is
capable of doing so optimally;
ii. do single-time (batched) swap area discards, either at swapon
or via something like fstrim (not implemented yet);
iii. allow doing both single-time and fine-grained discards; or
iv. turn it off completely (default behavior)
As implemented today, one can only enable/disable discards for swap, but
one cannot select, for instance, solution (ii) on a swap device like (b)
even though the single-time discard is regarded to be interesting, or
necessary to the workload because it would imply (1), and the device is
not capable of performing it optimally.
This patch addresses the scenario depicted above by introducing a way to
ensure the (probably) wanted solutions (i, ii, iii and iv) can be flexibly
flagged through swapon(8) to allow a sysadmin to select the best suitable
swap discard policy accordingly to system constraints.
This patch introduces SWAP_FLAG_DISCARD_PAGES and SWAP_FLAG_DISCARD_ONCE
new flags to allow more flexibe swap discard policies being flagged
through swapon(8). The default behavior is to keep both single-time, or
batched, area discards (SWAP_FLAG_DISCARD_ONCE) and fine-grained discards
for page-clusters (SWAP_FLAG_DISCARD_PAGES) enabled, in order to keep
consistentcy with older kernel behavior, as well as maintain compatibility
with older swapon(8). However, through the new introduced flags the best
suitable discard policy can be selected accordingly to any given swap
device constraint.
[akpm@linux-foundation.org: tweak comments]
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Karel Zak <kzak@redhat.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to __pagevec_lru_add, this patch removes the LRU parameter from
__lru_cache_add and lru_cache_add_lru as the caller does not control the
exact LRU the page gets added to. lru_cache_add_lru gets renamed to
lru_cache_add the name is silly without the lru parameter. With the
parameter removed, it is required that the caller indicate if they want
the page added to the active or inactive list by setting or clearing
PageActive respectively.
[akpm@linux-foundation.org: Suggested the patch]
[gang.chen@asianux.com: fix used-unintialized warning]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alexey Lyahkov <alexey.lyashkov@gmail.com>
Cc: Andrew Perepechko <anserper@ya.ru>
Cc: Robin Dong <sanbai@taobao.com>
Cc: Theodore Tso <tytso@mit.edu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Bernd Schubert <bernd.schubert@fastmail.fm>
Cc: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In page reclaim, huge page is split. split_huge_page() adds tail pages
to LRU list. Since we are reclaiming a huge page, it's better we
reclaim all subpages of the huge page instead of just the head page.
This patch adds split tail pages to shrink page list so the tail pages
can be reclaimed soon.
Before this patch, run a swap workload:
thp_fault_alloc 3492
thp_fault_fallback 608
thp_collapse_alloc 6
thp_collapse_alloc_failed 0
thp_split 916
With this patch:
thp_fault_alloc 4085
thp_fault_fallback 16
thp_collapse_alloc 90
thp_collapse_alloc_failed 0
thp_split 1272
fallback allocation is reduced a lot.
[akpm@linux-foundation.org: fix CONFIG_SWAP=n build]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To prevent flooding the swap device with writebacks, frontswap backends
need to count and limit the number of outstanding writebacks. The
incrementing of the counter can be done before the call to
__swap_writepage(). However, the caller must receive a notification
when the writeback completes in order to decrement the counter.
To achieve this functionality, this patch modifies __swap_writepage() to
take the bio completion callback function as an argument.
end_swap_bio_write(), the normal bio completion function, is also made
non-static so that code doing the accounting can call it after the
accounting is done.
There should be no behavioural change to existing code.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
swap_writepage() is currently where frontswap hooks into the swap write
path to capture pages with the frontswap_store() function. However, if
a frontswap backend wants to "resume" the writeback of a page to the
swap device, it can't call swap_writepage() as the page will simply
reenter the backend.
This patch separates swap_writepage() into a top and bottom half, the
bottom half named __swap_writepage() to allow a frontswap backend, like
zswap, to resume writeback beyond the frontswap_store() hook.
__add_to_swap_cache() is also made non-static so that the page for which
writeback is to be resumed can be added to the swap cache.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
| |
This variable is calculated from nr_free_pagecache_pages so
change its type to unsigned long.
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the amount of RAM that functions nr_free_*_pages return is
held in unsigned int. But in machines with big memory (exceeding 16TB),
the amount may be incorrect because of overflow, so fix it.
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
swap_lock is heavily contended when I test swap to 3 fast SSD (even
slightly slower than swap to 2 such SSD). The main contention comes
from swap_info_get(). This patch tries to fix the gap with adding a new
per-partition lock.
Global data like nr_swapfiles, total_swap_pages, least_priority and
swap_list are still protected by swap_lock.
nr_swap_pages is an atomic now, it can be changed without swap_lock. In
theory, it's possible get_swap_page() finds no swap pages but actually
there are free swap pages. But sounds not a big problem.
Accessing partition specific data (like scan_swap_map and so on) is only
protected by swap_info_struct.lock.
Changing swap_info_struct.flags need hold swap_lock and
swap_info_struct.lock, because scan_scan_map() will check it. read the
flags is ok with either the locks hold.
If both swap_lock and swap_info_struct.lock must be hold, we always hold
the former first to avoid deadlock.
swap_entry_free() can change swap_list. To delete that code, we add a
new highest_priority_index. Whenever get_swap_page() is called, we
check it. If it's valid, we use it.
It's a pity get_swap_page() still holds swap_lock(). But in practice,
swap_lock() isn't heavily contended in my test with this patch (or I can
say there are other much more heavier bottlenecks like TLB flush). And
BTW, looks get_swap_page() doesn't really need the lock. We never free
swap_info[] and we check SWAP_WRITEOK flag. The only risk without the
lock is we could swapout to some low priority swap, but we can quickly
recover after several rounds of swap, so sounds not a big deal to me.
But I'd prefer to fix this if it's a real problem.
"swap: make each swap partition have one address_space" improved the
swapout speed from 1.7G/s to 2G/s. This patch further improves the
speed to 2.3G/s, so around 15% improvement. It's a multi-process test,
so TLB flush isn't the biggest bottleneck before the patches.
[arnd@arndb.de: fix it for nommu]
[hughd@google.com: add missing unlock]
[minchan@kernel.org: get rid of lockdep whinge on sys_swapon]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When I use several fast SSD to do swap, swapper_space.tree_lock is
heavily contended. This makes each swap partition have one
address_space to reduce the lock contention. There is an array of
address_space for swap. The swap entry type is the index to the array.
In my test with 3 SSD, this increases the swapout throughput 20%.
[akpm@linux-foundation.org: revert unneeded change to __add_to_swap_cache]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In certain cases (kswapd reclaim, memcg target reclaim), a fixed minimum
amount of pages is scanned from the LRU lists on each iteration, to make
progress.
Do not make this minimum bigger than the respective LRU list size,
however, and save some busy work trying to isolate and reclaim pages
that are not there.
Empty LRU lists are quite common with memory cgroups in NUMA
environments because there exists a set of LRU lists for each zone for
each memory cgroup, while the memory of a single cgroup is expected to
stay on just one node. The number of expected empty LRU lists is thus
memcgs * (nodes - 1) * lru types
Each attempt to reclaim from an empty LRU list does expensive size
comparisons between lists, acquires the zone's lru lock etc. Avoid
that.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Satoru Moriya <satoru.moriya@hds.com>
Cc: Simon Jeons <simon.jeons@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
page_evictable(page, vma) is an irritant: almost all its callers pass
NULL for vma. Remove the vma arg and use mlocked_vma_newpage(vma, page)
explicitly in the couple of places it's needed. But in those places we
don't even need page_evictable() itself! They're dealing with a freshly
allocated anonymous page, which has no "mapping" and cannot be mlocked yet.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The version of swap_activate introduced is sufficient for swap-over-NFS
but would not provide enough information to implement a generic handler.
This patch shuffles things slightly to ensure the same information is
available for aops->swap_activate() as is available to the core.
No functionality change.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for writing swap pages
Currently swapfiles are managed entirely by the core VM by using ->bmap to
allocate space and write to the blocks directly. This effectively ensures
that the underlying blocks are allocated and avoids the need for the swap
subsystem to locate what physical blocks store offsets within a file.
If the swap subsystem is to use the filesystem information to locate the
blocks, it is critical that information such as block groups, block
bitmaps and the block descriptor table that map the swap file were
resident in memory. This patch adds address_space_operations that the VM
can call when activating or deactivating swap backed by a file.
int swap_activate(struct file *);
int swap_deactivate(struct file *);
The ->swap_activate() method is used to communicate to the file that the
VM relies on it, and the address_space should take adequate measures such
as reserving space in the underlying device, reserving memory for mempools
and pinning information such as the block descriptor table in memory. The
->swap_deactivate() method is called on sys_swapoff() if ->swap_activate()
returned success.
After a successful swapfile ->swap_activate, the swapfile is marked
SWP_FILE and swapper_space.a_ops will proxy to
sis->swap_file->f_mappings->a_ops using ->direct_io to write swapcache
pages and ->readpage to read.
It is perfectly possible that direct_IO be used to read the swap pages but
it is an unnecessary complication. Similarly, it is possible that
->writepage be used instead of direct_io to write the pages but filesystem
developers have stated that calling writepage from the VM is undesirable
for a variety of reasons and using direct_IO opens up the possibility of
writing back batches of swap pages in the future.
[a.p.zijlstra@chello.nl: Original patch]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|