| Commit message (Collapse) | Author | Age |
| |\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* refs/heads/tmp-aab9adb
Linux 4.4.179
kernel/sysctl.c: fix out-of-bounds access when setting file-max
Revert "locking/lockdep: Add debug_locks check in __lock_downgrade()"
ALSA: info: Fix racy addition/deletion of nodes
mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
device_cgroup: fix RCU imbalance in error case
sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
Revert "kbuild: use -Oz instead of -Os when using clang"
mac80211: do not call driver wake_tx_queue op during reconfig
kprobes: Fix error check when reusing optimized probes
kprobes: Mark ftrace mcount handler functions nokprobe
x86/kprobes: Verify stack frame on kretprobe
arm64: futex: Restore oldval initialization to work around buggy compilers
crypto: x86/poly1305 - fix overflow during partial reduction
ALSA: core: Fix card races between register and disconnect
staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf
staging: comedi: ni_usb6501: Fix use of uninitialized mutex
staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
staging: comedi: vmk80xx: Fix use of uninitialized semaphore
io: accel: kxcjk1013: restore the range after resume.
iio: adc: at91: disable adc channel interrupt in timeout case
iio: ad_sigma_delta: select channel when reading register
iio/gyro/bmg160: Use millidegrees for temperature scale
KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
tpm/tpm_i2c_atmel: Return -E2BIG when the transfer is incomplete
modpost: file2alias: check prototype of handler
modpost: file2alias: go back to simple devtable lookup
crypto: crypto4xx - properly set IV after de- and encrypt
ipv4: ensure rcu_read_lock() in ipv4_link_failure()
ipv4: recompile ip options in ipv4_link_failure
tcp: tcp_grow_window() needs to respect tcp_space()
net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv
net: bridge: multicast: use rcu to access port list from br_multicast_start_querier
net: atm: Fix potential Spectre v1 vulnerabilities
bonding: fix event handling for stacked bonds
appletalk: Fix compile regression
ovl: fix uid/gid when creating over whiteout
tpm/tpm_crb: Avoid unaligned reads in crb_recv()
include/linux/swap.h: use offsetof() instead of custom __swapoffset macro
lib/div64.c: off by one in shift
appletalk: Fix use-after-free in atalk_proc_exit
ARM: 8839/1: kprobe: make patch_lock a raw_spinlock_t
iommu/dmar: Fix buffer overflow during PCI bus notification
crypto: sha512/arm - fix crash bug in Thumb2 build
crypto: sha256/arm - fix crash bug in Thumb2 build
cifs: fallback to older infolevels on findfirst queryinfo retry
ACPI / SBS: Fix GPE storm on recent MacBookPro's
ARM: samsung: Limit SAMSUNG_PM_CHECK config option to non-Exynos platforms
serial: uartps: console_setup() can't be placed to init section
f2fs: fix to do sanity check with current segment number
9p locks: add mount option for lock retry interval
9p: do not trust pdu content for stat item size
rsi: improve kernel thread handling to fix kernel panic
ext4: prohibit fstrim in norecovery mode
fix incorrect error code mapping for OBJECTID_NOT_FOUND
x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error
iommu/vt-d: Check capability before disabling protected memory
x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors
x86/hpet: Prevent potential NULL pointer dereference
perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test()
perf tests: Fix a memory leak of cpu_map object in the openat_syscall_event_on_all_cpus test
perf evsel: Free evsel->counts in perf_evsel__exit()
perf top: Fix error handling in cmd_top()
tools/power turbostat: return the exit status of a command
thermal/int340x_thermal: fix mode setting
thermal/int340x_thermal: Add additional UUIDs
ALSA: opl3: fix mismatch between snd_opl3_drum_switch definition and declaration
mmc: davinci: remove extraneous __init annotation
IB/mlx4: Fix race condition between catas error reset and aliasguid flows
ALSA: sb8: add a check for request_region
ALSA: echoaudio: add a check for ioremap_nocache
ext4: report real fs size after failed resize
ext4: add missing brelse() in add_new_gdb_meta_bg()
perf/core: Restore mmap record type correctly
PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller
xtensa: fix return_address
sched/fair: Do not re-read ->h_load_next during hierarchical load calculation
xen: Prevent buffer overflow in privcmd ioctl
arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value
ARM: dts: at91: Fix typo in ISC_D0 on PC9
genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
block: do not leak memory in bio_copy_user_iov()
ASoC: fsl_esai: fix channel swap issue when stream starts
include/linux/bitrev.h: fix constant bitrev
ALSA: seq: Fix OOB-reads from strlcpy
ip6_tunnel: Match to ARPHRD_TUNNEL6 for dev type
net: ethtool: not call vzalloc for zero sized memory request
netns: provide pure entropy for net_hash_mix()
tcp: Ensure DCTCP reacts to losses
sctp: initialize _pad of sockaddr_in before copying to user memory
qmi_wwan: add Olicard 600
openvswitch: fix flow actions reallocation
net: rds: force to destroy connection if t_sock is NULL in rds_tcp_kill_sock().
ipv6: sit: reset ip header pointer in ipip6_rcv
ipv6: Fix dangling pointer when ipv6 fragment
tty: ldisc: add sysctl to prevent autoloading of ldiscs
tty: mark Siemens R3964 line discipline as BROKEN
lib/string.c: implement a basic bcmp
x86/vdso: Drop implicit common-page-size linker flag
x86: vdso: Use $LD instead of $CC to link
x86/build: Specify elf_i386 linker emulation explicitly for i386 objects
kbuild: clang: choose GCC_TOOLCHAIN_DIR not on LD
binfmt_elf: switch to new creds when switching to new mm
drm/dp/mst: Configure no_stop_bit correctly for remote i2c xfers
dmaengine: tegra: avoid overflow of byte tracking
x86/build: Mark per-CPU symbols as absolute explicitly for LLD
wlcore: Fix memory leak in case wl12xx_fetch_firmware failure
regulator: act8865: Fix act8600_sudcdc_voltage_ranges setting
media: s5p-jpeg: Check for fmt_ver_flag when doing fmt enumeration
netfilter: physdev: relax br_netfilter dependency
dmaengine: imx-dma: fix warning comparison of distinct pointer types
hpet: Fix missing '=' character in the __setup() code of hpet_mmap_enable
soc/tegra: fuse: Fix illegal free of IO base address
hwrng: virtio - Avoid repeated init of completion
media: mt9m111: set initial frame size other than 0x0
tty: increase the default flip buffer limit to 2*640K
ARM: avoid Cortex-A9 livelock on tight dmb loops
mt7601u: bump supported EEPROM version
soc: qcom: gsbi: Fix error handling in gsbi_probe()
ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe
cdrom: Fix race condition in cdrom_sysctl_register
fbdev: fbmem: fix memory access if logo is bigger than the screen
bcache: improve sysfs_strtoul_clamp()
bcache: fix input overflow to sequential_cutoff
bcache: fix input overflow to cache set sysfs file io_error_halflife
ALSA: PCM: check if ops are defined before suspending PCM
ARM: 8833/1: Ensure that NEON code always compiles with Clang
kprobes: Prohibit probing on bsearch()
leds: lp55xx: fix null deref on firmware load failure
media: mx2_emmaprp: Correct return type for mem2mem buffer helpers
media: s5p-g2d: Correct return type for mem2mem buffer helpers
media: s5p-jpeg: Correct return type for mem2mem buffer helpers
media: sh_veu: Correct return type for mem2mem buffer helpers
SoC: imx-sgtl5000: add missing put_device()
perf test: Fix failure of 'evsel-tp-sched' test on s390
scsi: megaraid_sas: return error when create DMA pool failed
IB/mlx4: Increase the timeout for CM cache
e1000e: Fix -Wformat-truncation warnings
mmc: omap: fix the maximum timeout setting
ARM: 8840/1: use a raw_spinlock_t in unwind
coresight: etm4x: Add support to enable ETMv4.2
scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
usb: chipidea: Grab the (legacy) USB PHY by phandle first
tools lib traceevent: Fix buffer overflow in arg_eval
fs: fix guard_bio_eod to check for real EOD errors
cifs: Fix NULL pointer dereference of devname
dm thin: add sanity checks to thin-pool and external snapshot creation
cifs: use correct format characters
fs/file.c: initialize init_files.resize_wait
f2fs: do not use mutex lock in atomic context
ocfs2: fix a panic problem caused by o2cb_ctl
mm/slab.c: kmemleak no scan alien caches
mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512!
mm/page_ext.c: fix an imbalance with kmemleak
mm/cma.c: cma_declare_contiguous: correct err handling
enic: fix build warning without CONFIG_CPUMASK_OFFSTACK
sysctl: handle overflow for file-max
gpio: gpio-omap: fix level interrupt idling
tracing: kdb: Fix ftdump to not sleep
h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux-
CIFS: fix POSIX lock leak and invalid ptr deref
tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped
Bluetooth: Fix decrementing reference count twice in releasing socket
i2c: core-smbus: prevent stack corruption on read I2C_BLOCK_DATA
mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
tty/serial: atmel: Add is_half_duplex helper
lib/int_sqrt: optimize initial value compute
ext4: cleanup bh release code in ext4_ind_remove_space()
arm64: debug: Ensure debug handlers check triggering exception level
arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
Make arm64 serial port config compatible with crosvm
Fix merge issue with 4.4.178
Fix merge issue with 4.4.177
ANDROID: cuttlefish_defconfig: Enable CONFIG_OVERLAY_FS
Change-Id: I0d6e7b00f0198867803d5fe305ce13e205cc7518
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
[ Upstream commit 0c81585499601acd1d0e1cbf424cabfaee60628c ]
After offlining a memory block, kmemleak scan will trigger a crash, as
it encounters a page ext address that has already been freed during
memory offlining. At the beginning in alloc_page_ext(), it calls
kmemleak_alloc(), but it does not call kmemleak_free() in
free_page_ext().
BUG: unable to handle kernel paging request at ffff888453d00000
PGD 128a01067 P4D 128a01067 PUD 128a04067 PMD 47e09e067 PTE 800ffffbac2ff060
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
CPU: 1 PID: 1594 Comm: bash Not tainted 5.0.0-rc8+ #15
Hardware name: HP ProLiant DL180 Gen9/ProLiant DL180 Gen9, BIOS U20 10/25/2017
RIP: 0010:scan_block+0xb5/0x290
Code: 85 6e 01 00 00 48 b8 00 00 30 f5 81 88 ff ff 48 39 c3 0f 84 5b 01 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 0f 85 87 01 00 00 <4c> 8b 3b e8 f3 0c fa ff 4c 39 3d 0c 6b 4c 01 0f 87 08 01 00 00 4c
RSP: 0018:ffff8881ec57f8e0 EFLAGS: 00010082
RAX: 0000000000000000 RBX: ffff888453d00000 RCX: ffffffffa61e5a54
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff888453d00000
RBP: ffff8881ec57f920 R08: fffffbfff4ed588d R09: fffffbfff4ed588c
R10: fffffbfff4ed588c R11: ffffffffa76ac463 R12: dffffc0000000000
R13: ffff888453d00ff9 R14: ffff8881f80cef48 R15: ffff8881f80cef48
FS: 00007f6c0e3f8740(0000) GS:ffff8881f7680000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff888453d00000 CR3: 00000001c4244003 CR4: 00000000001606a0
Call Trace:
scan_gray_list+0x269/0x430
kmemleak_scan+0x5a8/0x10f0
kmemleak_write+0x541/0x6ca
full_proxy_write+0xf8/0x190
__vfs_write+0xeb/0x980
vfs_write+0x15a/0x4f0
ksys_write+0xd2/0x1b0
__x64_sys_write+0x73/0xb0
do_syscall_64+0xeb/0xaaa
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f6c0dad73b8
Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 63 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
RSP: 002b:00007ffd5b863cb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f6c0dad73b8
RDX: 0000000000000005 RSI: 000055a9216e1710 RDI: 0000000000000001
RBP: 000055a9216e1710 R08: 000000000000000a R09: 00007ffd5b863840
R10: 000000000000000a R11: 0000000000000246 R12: 00007f6c0dda9780
R13: 0000000000000005 R14: 00007f6c0dda4740 R15: 0000000000000005
Modules linked in: nls_iso8859_1 nls_cp437 vfat fat kvm_intel kvm irqbypass efivars ip_tables x_tables xfs sd_mod ahci libahci igb i2c_algo_bit libata i2c_core dm_mirror dm_region_hash dm_log dm_mod efivarfs
CR2: ffff888453d00000
---[ end trace ccf646c7456717c5 ]---
Kernel panic - not syncing: Fatal exception
Shutting down cpus with NMI
Kernel Offset: 0x24c00000 from 0xffffffff81000000 (relocation range:
0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---
Link: http://lkml.kernel.org/r/20190227173147.75650-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
| | |
| |
| |
| |
| |
| |
| | |
Fix some null pointer dereference flaw and parameter not init issues.
change-Id: I0ed5f3f62c3794775bf97d353c4e50dd8ceb32da
Signed-off-by: Yao Jiang <yaojia@codeaurora.org>
|
| |\|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* refs/heads/tmp-f0b9d2d
Linux 4.4.101
mm/pagewalk.c: report holes in hugetlb ranges
mm/page_ext.c: check if page_ext is not prepared
mm: check the return value of lookup_page_ext for all call sites
coda: fix 'kernel memory exposure attempt' in fsync
mm/page_alloc.c: broken deferred calculation
ipmi: fix unsigned long underflow
ocfs2: should wait dio before inode lock in ocfs2_setattr()
nvme: Fix memory order on async queue deletion
arm64: fix dump_instr when PAN and UAO are in use
serial: omap: Fix EFR write on RTS deassertion
ima: do not update security.ima if appraisal status is not INTEGRITY_PASS
net/sctp: Always set scope_id in sctp_inet6_skb_msgname
fealnx: Fix building error on MIPS
sctp: do not peel off an assoc from one netns to another one
af_netlink: ensure that NLMSG_DONE never fails in dumps
vlan: fix a use-after-free in vlan_device_event()
bonding: discard lowest hash bit for 802.3ad layer3+4
netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed
tcp: do not mangle skb->cb[] in tcp_make_synack()
Conflicts:
mm/debug-pagealloc.c
mm/page_ext.c
mm/page_owner.c
Change-Id: I551aff1b4c8a0d72f64a234abb8ac88990fbc9e5
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
commit e492080e640c2d1235ddf3441cae634cfffef7e1 upstream.
online_page_ext() and page_ext_init() allocate page_ext for each
section, but they do not allocate if the first PFN is !pfn_present(pfn)
or !pfn_valid(pfn). Then section->page_ext remains as NULL.
lookup_page_ext checks NULL only if CONFIG_DEBUG_VM is enabled. For a
valid PFN, __set_page_owner will try to get page_ext through
lookup_page_ext. Without CONFIG_DEBUG_VM lookup_page_ext will misuse
NULL pointer as value 0. This incurrs invalid address access.
This is the panic example when PFN 0x100000 is not valid but PFN
0x13FC00 is being used for page_ext. section->page_ext is NULL,
get_entry returned invalid page_ext address as 0x1DFA000 for a PFN
0x13FC00.
To avoid this panic, CONFIG_DEBUG_VM should be removed so that page_ext
will be checked at all times.
Unable to handle kernel paging request at virtual address 01dfa014
------------[ cut here ]------------
Kernel BUG at ffffff80082371e0 [verbose debug info unavailable]
Internal error: Oops: 96000045 [#1] PREEMPT SMP
Modules linked in:
PC is at __set_page_owner+0x48/0x78
LR is at __set_page_owner+0x44/0x78
__set_page_owner+0x48/0x78
get_page_from_freelist+0x880/0x8e8
__alloc_pages_nodemask+0x14c/0xc48
__do_page_cache_readahead+0xdc/0x264
filemap_fault+0x2ac/0x550
ext4_filemap_fault+0x3c/0x58
__do_fault+0x80/0x120
handle_mm_fault+0x704/0xbb0
do_page_fault+0x2e8/0x394
do_mem_abort+0x88/0x124
Pre-4.7 kernels also need commit f86e4271978b ("mm: check the return
value of lookup_page_ext for all call sites").
Link: http://lkml.kernel.org/r/20171107094131.14621-1-jaewon31.kim@samsung.com
Fixes: eefa864b701d ("mm/page_ext: resurrect struct page extending code for debugging")
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On SPARSEMEM systems page poisoning is enabled after buddy is up, because
of the dependency on page extension init. This causes the pages released
by free_all_bootmem not to be poisoned. This either delays or misses the
identification of some issues because the pages have to undergo another
cycle of alloc-free-alloc for any corruption to be detected.
Enable page poisoning early by getting rid of the PAGE_EXT_DEBUG_POISON
flag. Since all the free pages will now be poisoned, the flag need not be
verified before checking the poison during an alloc.
Link: http://lkml.kernel.org/r/1490358246-11001-1-git-send-email-vinmenon@codeaurora.org
Acked-by: Laura Abbott <labbott@redhat.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[vinmenon@codeaurora.org: resolve trivial merge conflicts.
Remove the redundant free pages RO feature from the
page_poison.c file which is the reason for conflicts +
squash the addendum commit 40961ef8d65f51093bc94de110b97b590b6b9275
('mm-enable-page-poisoning-early-at-boot-v2')]
Git-commit: c5b7cd344fd6341e6db79e55c0f1f4d1d9c67a7e
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Change-Id: I1bb1f99d3a2e1135131911905e0916c837ba9d8a
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
|
| |/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default, page poisoning uses a poison value (0xaa) on free. If this
is changed to 0, the page is not only sanitized but zeroing on alloc
with __GFP_ZERO can be skipped as well. The tradeoff is that detecting
corruption from the poisoning is harder to detect. This feature also
cannot be used with hibernation since pages are not guaranteed to be
zeroed after hibernation.
Credit to Grsecurity/PaX team for inspiring this work
Change-Id: If7116e6bff246abbafc38bdfeb3601d3ea063ad2
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jianyu Zhan <nasa4836@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 1414c7f4f7d72d138fff35f00151d15749b5beda
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Knowing the portion of memory that is not used by a certain application or
memory cgroup (idle memory) can be useful for partitioning the system
efficiently, e.g. by setting memory cgroup limits appropriately.
Currently, the only means to estimate the amount of idle memory provided
by the kernel is /proc/PID/{clear_refs,smaps}: the user can clear the
access bit for all pages mapped to a particular process by writing 1 to
clear_refs, wait for some time, and then count smaps:Referenced. However,
this method has two serious shortcomings:
- it does not count unmapped file pages
- it affects the reclaimer logic
To overcome these drawbacks, this patch introduces two new page flags,
Idle and Young, and a new sysfs file, /sys/kernel/mm/page_idle/bitmap.
A page's Idle flag can only be set from userspace by setting bit in
/sys/kernel/mm/page_idle/bitmap at the offset corresponding to the page,
and it is cleared whenever the page is accessed either through page tables
(it is cleared in page_referenced() in this case) or using the read(2)
system call (mark_page_accessed()). Thus by setting the Idle flag for
pages of a particular workload, which can be found e.g. by reading
/proc/PID/pagemap, waiting for some time to let the workload access its
working set, and then reading the bitmap file, one can estimate the amount
of pages that are not used by the workload.
The Young page flag is used to avoid interference with the memory
reclaimer. A page's Young flag is set whenever the Access bit of a page
table entry pointing to the page is cleared by writing to the bitmap file.
If page_referenced() is called on a Young page, it will add 1 to its
return value, therefore concealing the fact that the Access bit was
cleared.
Note, since there is no room for extra page flags on 32 bit, this feature
uses extended page flags when compiled on 32 bit.
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: kpageidle requires an MMU]
[akpm@linux-foundation.org: decouple from page-flags rework]
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the page owner tracking code which is introduced so far ago. It
is resident on Andrew's tree, though, nobody tried to upstream so it
remain as is. Our company uses this feature actively to debug memory leak
or to find a memory hogger so I decide to upstream this feature.
This functionality help us to know who allocates the page. When
allocating a page, we store some information about allocation in extra
memory. Later, if we need to know status of all pages, we can get and
analyze it from this stored information.
In previous version of this feature, extra memory is statically defined in
struct page, but, in this version, extra memory is allocated outside of
struct page. It enables us to turn on/off this feature at boottime
without considerable memory waste.
Although we already have tracepoint for tracing page allocation/free,
using it to analyze page owner is rather complex. We need to enlarge the
trace buffer for preventing overlapping until userspace program launched.
And, launched program continually dump out the trace buffer for later
analysis and it would change system behaviour with more possibility rather
than just keeping it in memory, so bad for debug.
Moreover, we can use page_owner feature further for various purposes. For
example, we can use it for fragmentation statistics implemented in this
patch. And, I also plan to implement some CMA failure debugging feature
using this interface.
I'd like to give the credit for all developers contributed this feature,
but, it's not easy because I don't know exact history. Sorry about that.
Below is people who has "Signed-off-by" in the patches in Andrew's tree.
Contributor:
Alexander Nyberg <alexn@dsv.su.se>
Mel Gorman <mgorman@suse.de>
Dave Hansen <dave@linux.vnet.ibm.com>
Minchan Kim <minchan@kernel.org>
Michal Nazarewicz <mina86@mina86.com>
Andrew Morton <akpm@linux-foundation.org>
Jungsoo Son <jungsoo.son@lge.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Jungsoo Son <jungsoo.son@lge.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now, debug-pagealloc needs extra flags in struct page, so we need to
recompile whole source code when we decide to use it. This is really
painful, because it takes some time to recompile and sometimes rebuild is
not possible due to third party module depending on struct page. So, we
can't use this good feature in many cases.
Now, we have the page extension feature that allows us to insert extra
flags to outside of struct page. This gets rid of third party module
issue mentioned above. And, this allows us to determine if we need extra
memory for this page extension in boottime. With these property, we can
avoid using debug-pagealloc in boottime with low computational overhead in
the kernel built with CONFIG_DEBUG_PAGEALLOC. This will help our
development process greatly.
This patch is the preparation step to achive above goal. debug-pagealloc
originally uses extra field of struct page, but, after this patch, it will
use field of struct page_ext. Because memory for page_ext is allocated
later than initialization of page allocator in CONFIG_SPARSEMEM, we should
disable debug-pagealloc feature temporarily until initialization of
page_ext. This patch implements this.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Jungsoo Son <jungsoo.son@lge.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
When we debug something, we'd like to insert some information to every
page. For this purpose, we sometimes modify struct page itself. But,
this has drawbacks. First, it requires re-compile. This makes us
hesitate to use the powerful debug feature so development process is
slowed down. And, second, sometimes it is impossible to rebuild the
kernel due to third party module dependency. At third, system behaviour
would be largely different after re-compile, because it changes size of
struct page greatly and this structure is accessed by every part of
kernel. Keeping this as it is would be better to reproduce errornous
situation.
This feature is intended to overcome above mentioned problems. This
feature allocates memory for extended data per page in certain place
rather than the struct page itself. This memory can be accessed by the
accessor functions provided by this code. During the boot process, it
checks whether allocation of huge chunk of memory is needed or not. If
not, it avoids allocating memory at all. With this advantage, we can
include this feature into the kernel in default and can avoid rebuild and
solve related problems.
Until now, memcg uses this technique. But, now, memcg decides to embed
their variable to struct page itself and it's code to extend struct page
has been removed. I'd like to use this code to develop debug feature, so
this patch resurrect it.
To help these things to work well, this patch introduces two callbacks for
clients. One is the need callback which is mandatory if user wants to
avoid useless memory allocation at boot-time. The other is optional, init
callback, which is used to do proper initialization after memory is
allocated. Detailed explanation about purpose of these functions is in
code comment. Please refer it.
Others are completely same with previous extension code in memcg.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Jungsoo Son <jungsoo.son@lge.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|