author Greg Kroah-Hartman <gregkh@google.com> 2018-05-26 10:12:26 +0200
committer Greg Kroah-Hartman <gregkh@google.com> 2018-05-26 10:12:26 +0200
commit 3f51ea2db97d9b434b4f4faca31c82a41c5c3722
tree 42364e11a7ba89acb4b587cc6d5093d36d23cb5e
parent 4b08356a76b859b8f4599bf8821a6702ac50a029
parent 7620164e85e48ea381a52864d729a7731be8d7f2
Merge 4.4.133 into android-4.4

Changes in 4.4.133
    8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
    bridge: check iface upper dev when setting master via ioctl
    dccp: fix tasklet usage
    ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
    llc: better deal with too small mtu
    net: ethernet: sun: niu set correct packet size in skb
    net/mlx4_en: Verify coalescing parameters are in range
    net_sched: fq: take care of throttled flows before reuse
    net: support compat 64-bit time in {s,g}etsockopt
    openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
    qmi_wwan: do not steal interfaces from class drivers
    r8169: fix powering up RTL8168h
    sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
    sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
    tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
    bonding: do not allow rlb updates to invalid mac
    tcp: ignore Fast Open on repair mode
    sctp: fix the issue that the cookie-ack with auth can't get processed
    sctp: delay the authentication for the duplicated cookie-echo chunk
    ALSA: timer: Call notifier in the same spinlock
    audit: move calcs after alloc and check when logging set loginuid
    arm64: introduce mov_q macro to move a constant into a 64-bit register
    arm64: Add work around for Arm Cortex-A55 Erratum 1024718
    futex: Remove unnecessary warning from get_futex_key
    futex: Remove duplicated code and fix undefined behaviour
    xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
    lockd: lost rollback of set_grace_period() in lockd_down_net()
    Revert "ARM: dts: imx6qdl-wandboard: Fix audio channel swap"
    l2tp: revert "l2tp: fix missing print session offset info"
    pipe: cap initial pipe capacity according to pipe-max-size limit
    futex: futex_wake_op, fix sign_extend32 sign bits
    kernel/exit.c: avoid undefined behaviour when calling wait4()
    usbip: usbip_host: refine probe and disconnect debug msgs to be useful
    usbip: usbip_host: delete device from busid_table after rebind
    usbip: usbip_host: run rebind from exit when module is removed
    usbip: usbip_host: fix NULL-ptr deref and use-after-free errors
    usbip: usbip_host: fix bad unlock balance during stub_probe()
    ALSA: usb: mixer: volume quirk for CM102-A+/102S+
    ALSA: hda: Add Lenovo C50 All in one to the power_save blacklist
    ALSA: control: fix a redundant-copy issue
    spi: pxa2xx: Allow 64-bit DMA
    powerpc/powernv: panic() on OPAL < V3
    powerpc/powernv: Remove OPALv2 firmware define and references
    powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL
    cpuidle: coupled: remove unused define cpuidle_coupled_lock
    powerpc: Don't preempt_disable() in show_cpuinfo()
    vmscan: do not force-scan file lru if its absolute size is small
    proc: meminfo: estimate available memory more conservatively
    mm: filemap: remove redundant code in do_read_cache_page
    mm: filemap: avoid unnecessary calls to lock_page when waiting for IO to complete during a read
    signals: avoid unnecessary taking of sighand->siglock
    cpufreq: intel_pstate: Enable HWP by default
    tracing/x86/xen: Remove zero data size trace events trace_xen_mmu_flush_tlb{_all}
    proc read mm's {arg,env}_{start,end} with mmap semaphore taken.
    procfs: fix pthread cross-thread naming if !PR_DUMPABLE
    powerpc/powernv: Fix NVRAM sleep in invalid context when crashing
    mm: don't allow deferred pages with NEED_PER_CPU_KM
    s390/qdio: fix access to uninitialized qdio_q fields
    s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero
    s390/qdio: don't release memory in qdio_setup_irq()
    s390: remove indirect branch from do_softirq_own_stack
    efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode
    ARM: 8771/1: kprobes: Prohibit kprobes on do_undefinstr
    tick/broadcast: Use for_each_cpu() specially on UP kernels
    ARM: 8769/1: kprobes: Fix to use get_kprobe_ctlblk after irq-disabed
    ARM: 8770/1: kprobes: Prohibit probing on optimized_callback
    ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions
    Btrfs: fix xattr loss after power failure
    btrfs: fix crash when trying to resume balance without the resume flag
    btrfs: fix reading stale metadata blocks after degraded raid1 mounts
    net: test tailroom before appending to linear skb
    packet: in packet_snd start writing at link layer allocation
    sock_diag: fix use-after-free read in __sk_free
    tcp: purge write queue in tcp_connect_init()
    ext2: fix a block leak
    s390: add assembler macros for CPU alternatives
    s390: move expoline assembler macros to a header
    s390/lib: use expoline for indirect branches
    s390/kernel: use expoline for indirect branches
    s390: move spectre sysfs attribute code
    s390: extend expoline to BC instructions
    s390: use expoline thunks in the BPF JIT
    scsi: libsas: defer ata device eh commands to libata
    scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()
    scsi: zfcp: fix infinite iteration on ERP ready list
    dmaengine: ensure dmaengine helpers check valid callback
    time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
    gpio: rcar: Add Runtime PM handling for interrupts
    cfg80211: limit wiphy names to 128 bytes
    hfsplus: stop workqueue when fill_super() failed
    x86/kexec: Avoid double free_page() upon do_kexec_load() failure
    Linux 4.4.133

Change-Id: I0554b12889bc91add2a444da95f18d59c6fb9cdb
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Diffstat (limited to 'mm/filemap.c')
-rw-r--r-- mm/filemap.c | 90
1 files changed, 60 insertions, 30 deletions
diff --git a/mm/filemap.c b/mm/filemap.c
index b15f1d8bba43..21e750b6e810 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1581,6 +1581,15 @@ find_page:
 					index, last_index - index);
 		}
 		if (!PageUptodate(page)) {
+			/*
+			 * See comment in do_read_cache_page on why
+			 * wait_on_page_locked is used to avoid unnecessarily
+			 * serialisations and why it's safe.
+			 */
+			wait_on_page_locked_killable(page);
+			if (PageUptodate(page))
+				goto page_ok;
+
 			if (inode->i_blkbits == PAGE_CACHE_SHIFT ||
 					!mapping->a_ops->is_partially_uptodate)
 				goto page_not_up_to_date;
@@ -2215,7 +2224,7 @@ static struct page *wait_on_page_read(struct page *page)
 	return page;
 }
 
-static struct page *__read_cache_page(struct address_space *mapping,
+static struct page *do_read_cache_page(struct address_space *mapping,
 				pgoff_t index,
 				int (*filler)(void *, struct page *),
 				void *data,
@@ -2237,53 +2246,74 @@ repeat:
 			/* Presumably ENOMEM for radix tree node */
 			return ERR_PTR(err);
 		}
+
+filler:
 		err = filler(data, page);
 		if (err < 0) {
 			page_cache_release(page);
-			page = ERR_PTR(err);
-		} else {
-			page = wait_on_page_read(page);
+			return ERR_PTR(err);
 		}
-	}
-	return page;
-}
 
-static struct page *do_read_cache_page(struct address_space *mapping,
-				pgoff_t index,
-				int (*filler)(void *, struct page *),
-				void *data,
-				gfp_t gfp)
-
-{
-	struct page *page;
-	int err;
+		page = wait_on_page_read(page);
+		if (IS_ERR(page))
+			return page;
+		goto out;
+	}
+	if (PageUptodate(page))
+		goto out;
 
-retry:
-	page = __read_cache_page(mapping, index, filler, data, gfp);
-	if (IS_ERR(page))
-		return page;
+	/*
+	 * Page is not up to date and may be locked due one of the following
+	 * case a: Page is being filled and the page lock is held
+	 * case b: Read/write error clearing the page uptodate status
+	 * case c: Truncation in progress (page locked)
+	 * case d: Reclaim in progress
+	 *
+	 * Case a, the page will be up to date when the page is unlocked.
+	 *    There is no need to serialise on the page lock here as the page
+	 *    is pinned so the lock gives no additional protection. Even if the
+	 *    the page is truncated, the data is still valid if PageUptodate as
+	 *    it's a race vs truncate race.
+	 * Case b, the page will not be up to date
+	 * Case c, the page may be truncated but in itself, the data may still
+	 *    be valid after IO completes as it's a read vs truncate race. The
+	 *    operation must restart if the page is not uptodate on unlock but
+	 *    otherwise serialising on page lock to stabilise the mapping gives
+	 *    no additional guarantees to the caller as the page lock is
+	 *    released before return.
+	 * Case d, similar to truncation. If reclaim holds the page lock, it
+	 *    will be a race with remove_mapping that determines if the mapping
+	 *    is valid on unlock but otherwise the data is valid and there is
+	 *    no need to serialise with page lock.
+	 *
+	 * As the page lock gives no additional guarantee, we optimistically
+	 * wait on the page to be unlocked and check if it's up to date and
+	 * use the page if it is. Otherwise, the page lock is required to
+	 * distinguish between the different cases. The motivation is that we
+	 * avoid spurious serialisations and wakeups when multiple processes
+	 * wait on the same page for IO to complete.
+	 */
+	wait_on_page_locked(page);
 	if (PageUptodate(page))
 		goto out;
 
+	/* Distinguish between all the cases under the safety of the lock */
 	lock_page(page);
+
+	/* Case c or d, restart the operation */
 	if (!page->mapping) {
 		unlock_page(page);
 		page_cache_release(page);
-		goto retry;
+		goto repeat;
 	}
+
+	/* Someone else locked and filled the page in a very small window */
 	if (PageUptodate(page)) {
 		unlock_page(page);
 		goto out;
 	}
-	err = filler(data, page);
-	if (err < 0) {
-		page_cache_release(page);
-		return ERR_PTR(err);
-	} else {
-		page = wait_on_page_read(page);
-		if (IS_ERR(page))
-			return page;
-	}
+	goto filler;
+
 out:
 	mark_page_accessed(page);
 	return page;
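
The decision order that the patched do_read_cache_page follows for a page that is already in the page cache can be sketched as a small single-threaded userspace model. This is an illustration only, not kernel code: `model_page`, `read_path`, `model_event`, `classify_read`, `io_completes` and `truncated` are hypothetical names invented for this sketch, and the two callbacks merely stand in for whatever another task might do while the reader sleeps in wait_on_page_locked() or before lock_page() succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct model_page {
	bool uptodate;		/* models PageUptodate(page) */
	bool has_mapping;	/* models page->mapping != NULL */
};

/* Which branch of the patched function the reader would end up in. */
enum read_path {
	PATH_FAST,		/* already up to date, nothing to wait for */
	PATH_AFTER_WAIT,	/* up to date once the lock holder unlocked */
	PATH_RESTART,		/* case c or d: mapping gone, "goto repeat" */
	PATH_LOCKED_OK,		/* filled by someone else just before we locked */
	PATH_REFILL		/* still stale under the lock, "goto filler" */
};

/* Hypothetical hook: what another task does to the page meanwhile. */
typedef void (*model_event)(struct model_page *page);

static void io_completes(struct model_page *page)
{
	page->uptodate = true;
}

static void truncated(struct model_page *page)
{
	page->has_mapping = false;
}

static enum read_path classify_read(struct model_page *page,
				    model_event during_wait,
				    model_event before_lock)
{
	assert(page != NULL);

	if (page->uptodate)
		return PATH_FAST;

	/* Models wait_on_page_locked(): sleep without taking the lock. */
	if (during_wait)
		during_wait(page);
	if (page->uptodate)
		return PATH_AFTER_WAIT;

	/* Models the window before lock_page() is granted. */
	if (before_lock)
		before_lock(page);

	/* From here on the model assumes the page lock is held. */
	if (!page->has_mapping)
		return PATH_RESTART;
	if (page->uptodate)
		return PATH_LOCKED_OK;
	return PATH_REFILL;
}
```

For example, calling `classify_read(&p, io_completes, NULL)` on a page with `uptodate == false` models case a and yields `PATH_AFTER_WAIT`, the path on which the reader never takes the page lock. Passing `truncated` as `before_lock` instead yields `PATH_RESTART`. The model deliberately ignores page reference counting, error returns from the filler, and real concurrency.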