From 353c16247d72937c90e7ef04cbd603502581019a Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Sat, 23 Sep 2017 17:02:18 +0800 Subject: f2fs: updates on 4.15-rc1 Pull f2fs updates from Jaegeuk Kim: "In this round, we introduce sysfile-based quota support which is required for Android by default. In addition, we allow that users are able to reserve some blocks in runtime to mitigate performance drops in low free space. Enhancements: - assign proper data segments according to write_hints given by user - issue cache_flush on dirty devices only among multiple devices - exploit cp_error flag and add more faults to enhance fault injection test - conduct more readaheads during f2fs_readdir - add a range for discard commands Bug fixes: - fix zero stat->st_blocks when inline_data is set - drop crypto key and free stale memory pointer while evict_inode is failing - fix some corner cases in free space and segment management - fix wrong last_disk_size This series includes lots of clean-ups and code enhancement in terms of xattr operations, discard/flush command control. In addition, it adds versatile debugfs entries to monitor f2fs status" Cherry-picked from origin/upstream-f2fs-stable-linux-4.4.y: 56a07b070510 f2fs: deny accessing encryption policy if encryption is off c394842e26e5 f2fs: inject fault in inc_valid_node_count 926292251022 f2fs: fix to clear FI_NO_PREALLOC e6cfc5de2d05 f2fs: expose quota information in debugfs c4cd2efe835b f2fs: separate nat entry mem alloc from nat_tree_lock 48c72b4c8c50 f2fs: validate before set/clear free nat bitmap baf9275a4bbd f2fs: avoid opened loop codes in __add_ino_entry 47af6c72d944 f2fs: apply write hints to select the type of segments for buffered write ac9819160586 f2fs: introduce scan_curseg_cache for cleanup ca28e9670e80 f2fs: optimize the way of traversing free_nid_bitmap 460688b59e8b f2fs: keep scanning until enough free nids are acquired 0186182c0c4d f2fs: trace checkpoint reason in fsync() 5d4b6efcfd09 f2fs: keep isize once block is reserved cross EOF 3c8f767e1374 f2fs: avoid race in between GC and block exchange 4423778adf0e f2fs: save a multiplication for last_nid calculation 3e3b40557525 f2fs: fix summary info corruption 44889e487981 f2fs: remove dead code in update_meta_page 55c7b9595bb9 f2fs: remove unneeded semicolon 8b92814117d5 f2fs: don't bother with inode->i_version 42c7c71824fc f2fs: check curseg space before foreground GC c5470498e59b f2fs: use rw_semaphore to protect SIT cache 82750d346ab7 f2fs: support quota sys files 26dfec49b25a f2fs: add quota_ino feature infra ddb8e2ae9811 f2fs: optimize __update_nat_bits f46ae958c701 f2fs: modify for accurate fggc node io stat c713fdb5a23c Revert "f2fs: handle dirty segments inside refresh_sit_entry" 873ec505cb07 f2fs: add a function to move nid ae66786296b4 f2fs: export SSR allocation threshold 90c28a18d2a4 f2fs: give correct trimmed blocks in fstrim 5612922fb0ac f2fs: support bio allocation error injection 583b7a274c27 f2fs: support get_page error injection 09a073cc8c56 f2fs: add missing sysfs description e945474a9c1b f2fs: support soft block reservation b7b2e629b6f6 f2fs: handle error case when adding xattr entry 7368e30495c5 f2fs: support flexible inline xattr size ada4061e191b f2fs: show current cp state 5b8ff1301a61 f2fs: add missing quota_initialize 46d4a691f035 f2fs: show # of dirty segments via sysfs fc13f9d7ce1e f2fs: stop all the operations by cp_error flag 91bea0c391b3 f2fs: remove several redundant assignments 807486c79534 f2fs: avoid using timespec 03b1cb0bb4a2 f2fs: fix to correct no_fggc_candidate 5c15033ceaea Revert "f2fs: return wrong error number on f2fs_quota_write" 5f5f59322240 f2fs: remove obsolete pointer for truncate_xattr_node 032a6906825a f2fs: retry ENOMEM for quota_read|write 171b638fc49b f2fs: limit # of inmemory pages 83ed7a615f0a f2fs: update ctx->pos correctly when hitting hole in directory 4d6e68be2534 f2fs: relocate readahead codes in readdir() c8be47b54018 f2fs: allow readdir() to be interrupted 2b903fe94cd0 f2fs: trace f2fs_readdir bb0db666d4bc f2fs: trace f2fs_lookup 40d6250f046a f2fs: skip searching non-exist range in truncate_hole 8e84f379df61 f2fs: expose some sectors to user in inline data or dentry case cb98f70dea02 f2fs: avoid stale fi->gdirty_list pointer 5562a3c53963 f2fs/crypto: drop crypto key at evict_inode only 85853e7e38d7 f2fs: fix to avoid race when accessing last_disk_size 0c47a892d555 f2fs: Fix bool initialization/comparison 68e801abc520 f2fs: give up CP_TRIMMED_FLAG if it drops discards df74eacb2075 f2fs: trace f2fs_remove_discard bd502c6e3e7a f2fs: reduce cmd_lock coverage in __issue_discard_cmd a34ab5ca4f94 f2fs: split discard policy 1e65afd14d32 f2fs: wrap discard policy 684447dad138 f2fs: support issuing/waiting discard in range 27eaad09380f f2fs: fix to flush multiple device in checkpoint 08bb9d68d51b f2fs: enhance multiple device flush 9c2526ac2ecb f2fs: fix to show ino management cache size correctly 814b463d262f f2fs: drop FI_UPDATE_WRITE tag after f2fs_issue_flush f555b0a117d3 f2fs: obsolete ALLOC_NID_LIST list 75d3164ae128 f2fs: convert inline data for direct I/O & FI_NO_PREALLOC 4de0ceb6b7ef f2fs: allow readpages with NULL file pointer 322a45d17212 f2fs: show flush list status in sysfs 6d625a93b4a8 f2fs: introduce read_xattr_block 8ea6e1c327c5 f2fs: introduce read_inline_xattr dbce11e9ee5b Revert "f2fs: reuse nids more aggressively" 131bc9f6b7f9 Revert "f2fs: node segment is prior to data segment selected victim" Change-Id: I93b9cd867b859a667a448b39299ff44a2b841b8c Signed-off-by: Jaegeuk Kim --- include/linux/f2fs_fs.h | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) (limited to 'include/linux') diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h index c2a975e4a711..fef1caeddf54 100644 --- a/include/linux/f2fs_fs.h +++ b/include/linux/f2fs_fs.h @@ -36,6 +36,8 @@ #define F2FS_NODE_INO(sbi) (sbi->node_ino_num) #define F2FS_META_INO(sbi) (sbi->meta_ino_num) +#define F2FS_MAX_QUOTAS 3 + #define F2FS_IO_SIZE(sbi) (1 << (sbi)->write_io_size_bits) /* Blocks */ #define F2FS_IO_SIZE_KB(sbi) (1 << ((sbi)->write_io_size_bits + 2)) /* KB */ #define F2FS_IO_SIZE_BYTES(sbi) (1 << ((sbi)->write_io_size_bits + 12)) /* B */ @@ -108,7 +110,8 @@ struct f2fs_super_block { __u8 encryption_level; /* versioning level for encryption */ __u8 encrypt_pw_salt[16]; /* Salt used for string2key algorithm */ struct f2fs_device devs[MAX_DEVICES]; /* device list */ - __u8 reserved[327]; /* valid reserved region */ + __le32 qf_ino[F2FS_MAX_QUOTAS]; /* quota inode numbers */ + __u8 reserved[315]; /* valid reserved region */ } __packed; /* @@ -184,7 +187,8 @@ struct f2fs_extent { } __packed; #define F2FS_NAME_LEN 255 -#define F2FS_INLINE_XATTR_ADDRS 50 /* 200 bytes for inline xattrs */ +/* 200 bytes for inline xattrs by default */ +#define DEFAULT_INLINE_XATTR_ADDRS 50 #define DEF_ADDRS_PER_INODE 923 /* Address Pointers in an Inode */ #define CUR_ADDRS_PER_INODE(inode) (DEF_ADDRS_PER_INODE - \ get_extra_isize(inode)) @@ -238,7 +242,7 @@ struct f2fs_inode { union { struct { __le16 i_extra_isize; /* extra inode attribute size */ - __le16 i_padding; /* padding */ + __le16 i_inline_xattr_size; /* inline xattr size, unit: 4 bytes */ __le32 i_projid; /* project id */ __le32 i_inode_checksum;/* inode meta checksum */ __le32 i_extra_end[0]; /* for attribute size calculation */ -- cgit v1.2.3 From 28850c79d071b4eef0001bb76af2cfb29402f4ac Mon Sep 17 00:00:00 2001 From: John Stultz Date: Thu, 8 Jun 2017 16:44:21 -0700 Subject: BACKPORT: time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting (cherry pick from commit 3d88d56c5873f6eebe23e05c3da701960146b801) Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest. This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed. Signed-off-by: John Stultz Tested-by: Daniel Mentz Cc: Prarit Bhargava Cc: Kevin Brodsky Cc: Richard Cochran Cc: Stephen Boyd Cc: Will Deacon Cc: "stable #4 . 8+" Cc: Miroslav Lichvar Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner Bug: 20045882 Bug: 63737556 Change-Id: I6c55dd7685f6bd212c6af9d09c527528e1dd5fa1 --- include/linux/timekeeper_internal.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'include/linux') diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h index f0f1793cfa49..115216ec7cfe 100644 --- a/include/linux/timekeeper_internal.h +++ b/include/linux/timekeeper_internal.h @@ -56,7 +56,7 @@ struct tk_read_base { * interval. * @xtime_remainder: Shifted nano seconds left over when rounding * @cycle_interval - * @raw_interval: Raw nano seconds accumulated per NTP interval. + * @raw_interval: Shifted raw nano seconds accumulated per NTP interval. * @ntp_error: Difference between accumulated time and NTP time in ntp * shifted nano seconds. * @ntp_error_shift: Shift conversion between clock shifted nano seconds and @@ -97,7 +97,7 @@ struct timekeeper { cycle_t cycle_interval; u64 xtime_interval; s64 xtime_remainder; - u32 raw_interval; + u64 raw_interval; /* The ntp_tick_length() value currently being used. * This cached copy ensures we consistently apply the tick * length for an entire tick, as ntp_tick_length may change -- cgit v1.2.3 From 1d35c0438678c7ad4c367135082685d5754eed20 Mon Sep 17 00:00:00 2001 From: John Stultz Date: Mon, 22 May 2017 17:20:20 -0700 Subject: BACKPORT: time: Clean up CLOCK_MONOTONIC_RAW time handling (cherry pick from commit fc6eead7c1e2e5376c25d2795d4539fdacbc0648) Now that we fixed the sub-ns handling for CLOCK_MONOTONIC_RAW, remove the duplicitive tk->raw_time.tv_nsec, which can be stored in tk->tkr_raw.xtime_nsec (similarly to how its handled for monotonic time). Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Miroslav Lichvar Cc: Richard Cochran Cc: Prarit Bhargava Cc: Stephen Boyd Cc: Kevin Brodsky Cc: Will Deacon Cc: Daniel Mentz Tested-by: Daniel Mentz Signed-off-by: John Stultz Bug: 20045882 Bug: 63737556 Change-Id: I243827d21b08703a09d2d2fe738a9258be224582 --- include/linux/timekeeper_internal.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'include/linux') diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h index 115216ec7cfe..3a5af09af18b 100644 --- a/include/linux/timekeeper_internal.h +++ b/include/linux/timekeeper_internal.h @@ -50,7 +50,7 @@ struct tk_read_base { * @tai_offset: The current UTC to TAI offset in seconds * @clock_was_set_seq: The sequence number of clock was set events * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second - * @raw_time: Monotonic raw base time in timespec64 format + * @raw_sec: CLOCK_MONOTONIC_RAW time in seconds * @cycle_interval: Number of clock cycles in one NTP interval * @xtime_interval: Number of clock shifted nano seconds in one NTP * interval. @@ -91,7 +91,7 @@ struct timekeeper { s32 tai_offset; unsigned int clock_was_set_seq; ktime_t next_leap_ktime; - struct timespec64 raw_time; + u64 raw_sec; /* The following members are for timekeeping internal use */ cycle_t cycle_interval; -- cgit v1.2.3