From e7c8b35e486775af65636a2a2b2766c3def9a8e8 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Mon, 12 Dec 2016 16:43:26 -0800 Subject: UPSTREAM: mm: don't cap request size based on read-ahead setting We ran into a funky issue, where someone doing 256K buffered reads saw 128K requests at the device level. Turns out it is read-ahead capping the request size, since we use 128K as the default setting. This doesn't make a lot of sense - if someone is issuing 256K reads, they should see 256K reads, regardless of the read-ahead setting, if the underlying device can support a 256K read in a single command. This patch introduces a bdi hint, io_pages. This is the soft max IO size for the lower level, I've hooked it up to the bdev settings here. Read-ahead is modified to issue the maximum of the user request size, and the read-ahead max size, but capped to the max request size on the device side. The latter is done to avoid reading ahead too much, if the application asks for a huge read. With this patch, the kernel behaves like the application expects. Change-Id: Ibe52ffac7a6e1ac86ed0c6a59a0f7a32d651ee5f Link: http://lkml.kernel.org/r/1479498073-8657-1-git-send-email-axboe@fb.com Signed-off-by: Jens Axboe Acked-by: Johannes Weiner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Jaegeuk Kim --- include/linux/backing-dev-defs.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/linux') diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h index a307c37c2e6c..782d62673292 100644 --- a/include/linux/backing-dev-defs.h +++ b/include/linux/backing-dev-defs.h @@ -136,6 +136,7 @@ struct bdi_writeback { struct backing_dev_info { struct list_head bdi_list; unsigned long ra_pages; /* max readahead in PAGE_CACHE_SIZE units */ + unsigned long io_pages; /* max allowed IO size */ unsigned int capabilities; /* Device capabilities */ congested_fn *congested_fn; /* Function pointer if device is md/dm */ void *congested_data; /* Pointer to aux data for congested func */ -- cgit v1.2.3 From 523972a6e347c178204cb891bd18322df47b134b Mon Sep 17 00:00:00 2001 From: Chao Yu Date: Mon, 22 Feb 2016 18:29:18 +0800 Subject: f2fs: fix to convert inline directory correctly With below serials, we will lose parts of dirents: 1) mount f2fs with inline_dentry option 2) echo 1 > /sys/fs/f2fs/sdX/dir_level 3) mkdir dir 4) touch 180 files named [1-180] in dir 5) touch 181 in dir 6) echo 3 > /proc/sys/vm/drop_caches 7) ll dir ls: cannot access 2: No such file or directory ls: cannot access 4: No such file or directory ls: cannot access 5: No such file or directory ls: cannot access 6: No such file or directory ls: cannot access 8: No such file or directory ls: cannot access 9: No such file or directory ... total 360 drwxr-xr-x 2 root root 4096 Feb 19 15:12 ./ drwxr-xr-x 3 root root 4096 Feb 19 15:11 ../ -rw-r--r-- 1 root root 0 Feb 19 15:12 1 -rw-r--r-- 1 root root 0 Feb 19 15:12 10 -rw-r--r-- 1 root root 0 Feb 19 15:12 100 -????????? ? ? ? ? ? 101 -????????? ? ? ? ? ? 102 -????????? ? ? ? ? ? 103 ... The reason is: when doing the inline dir conversion, we didn't consider that directory has hierarchical hash structure which can be configured through sysfs interface 'dir_level'. By default, dir_level of directory inode is 0, it means we have one bucket in hash table located in first level, all dirents will be hashed in this bucket, so it has no problem for us to do the duplication simply between inline dentry page and converted normal dentry page. However, if we configured dir_level with the value N (greater than 0), it will expand the bucket number of first level hash table by 2^N - 1, it hashs dirents into different buckets according their hash value, if we still move all dirents to first bucket, it makes incorrent locating for inline dirents, the result is, although we can iterate all dirents through ->readdir, we can't stat some of them in ->lookup which based on hash table searching. This patch fixes this issue by rehashing dirents into correct position when converting inline directory. Signed-off-by: Chao Yu Signed-off-by: Jaegeuk Kim [bwh: Backported to 4.4: - Keep using f2fs_crypto functions instead of generic fscrypt API - Use remove_dirty_dir_inode() instead of remove_dirty_inode() - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman --- include/linux/f2fs_fs.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/linux') diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h index 3d6e6ce44c5c..1f81ebcc2948 100644 --- a/include/linux/f2fs_fs.h +++ b/include/linux/f2fs_fs.h @@ -497,4 +497,6 @@ enum { F2FS_FT_MAX }; +#define S_SHIFT 12 + #endif /* _LINUX_F2FS_FS_H */ -- cgit v1.2.3 From 65b9d5326d7d6dcd9239481e98ed2d4938420f31 Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Mon, 19 Sep 2016 17:55:10 -0700 Subject: f2fs: use crc and cp version to determine roll-forward recovery commit a468f0ef516fda9c7d91bb550d458e853d76955e upstream. Previously, we used cp_version only to detect recoverable dnodes. In order to avoid same garbage cp_version, we needed to truncate the next dnode during checkpoint, resulting in additional discard or data write. If we can distinguish this by using crc in addition to cp_version, we can remove this overhead. There is backward compatibility concern where it changes node_footer layout. So, this patch introduces a new checkpoint flag, CP_CRC_RECOVERY_FLAG, to detect new layout. New layout will be activated only when this flag is set. Signed-off-by: Jaegeuk Kim [bwh: Backported to 4.4: - Deleted code is slightly different - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman --- include/linux/f2fs_fs.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/linux') diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h index 1f81ebcc2948..520fd854e7b3 100644 --- a/include/linux/f2fs_fs.h +++ b/include/linux/f2fs_fs.h @@ -99,6 +99,7 @@ struct f2fs_super_block { /* * For checkpoint */ +#define CP_CRC_RECOVERY_FLAG 0x00000040 #define CP_FASTBOOT_FLAG 0x00000020 #define CP_FSCK_FLAG 0x00000010 #define CP_ERROR_FLAG 0x00000008 -- cgit v1.2.3 From 568736f8e5a7c5a1ca4d00e028f952da9bd0934c Mon Sep 17 00:00:00 2001 From: Anders Roxell Date: Tue, 30 Oct 2018 12:35:45 +0100 Subject: writeback: don't decrement wb->refcnt if !wb->bdi [ Upstream commit 347a28b586802d09604a149c1a1f6de5dccbe6fa ] This happened while running in qemu-system-aarch64, the AMBA PL011 UART driver when enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE. arch_initcall(pl011_init) came before subsys_initcall(default_bdi_init), devtmpfs' handle_remove() crashes because the reference count is a NULL pointer only because wb->bdi hasn't been initialized yet. Rework so that wb_put have an extra check if wb->bdi before decrement wb->refcnt and also add a WARN_ON_ONCE to get a warning if it happens again in other drivers. Fixes: 52ebea749aae ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks") Co-developed-by: Arnd Bergmann Signed-off-by: Arnd Bergmann Signed-off-by: Anders Roxell Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- include/linux/backing-dev-defs.h | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'include/linux') diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h index a307c37c2e6c..072501a0ac86 100644 --- a/include/linux/backing-dev-defs.h +++ b/include/linux/backing-dev-defs.h @@ -225,6 +225,14 @@ static inline void wb_get(struct bdi_writeback *wb) */ static inline void wb_put(struct bdi_writeback *wb) { + if (WARN_ON_ONCE(!wb->bdi)) { + /* + * A driver bug might cause a file to be removed before bdi was + * initialized. + */ + return; + } + if (wb != &wb->bdi->wb) percpu_ref_put(&wb->refcnt); } -- cgit v1.2.3