diff options
| author | Davide Garberi <dade.garberi@gmail.com> | 2023-08-06 15:31:30 +0200 |
|---|---|---|
| committer | Davide Garberi <dade.garberi@gmail.com> | 2023-08-06 15:31:30 +0200 |
| commit | cc57cb4ee3b7918b74d30604735d353b9a5fa23b (patch) | |
| tree | 0be483b86472eaf1c74f747ecbaf6300f3998a1a /Documentation | |
| parent | 44be99a74546fb018cbf2049602a5fd2889a0089 (diff) | |
| parent | 7d11b1a7a11c598a07687f853ded9eca97d89043 (diff) | |
Merge lineage-20 of git@github.com:LineageOS/android_kernel_qcom_msm8998.git into lineage-20
7d11b1a7a11c Revert "sched: cpufreq: Use sched_clock instead of rq_clock when updating schedutil"
daaa5da96a74 sched: Take irq_sparse lock during the isolation
217ab2d0ef91 rcu: Speed up calling of RCU tasks callbacks
997b726bc092 kernel: power: Workaround for sensor ipc message causing high power consume
b933e4d37bc0 sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices
82d3f23d6dc5 sched/fair: Fix bandwidth timer clock drift condition
629bfed360f9 kernel: power: qos: remove check for core isolation while cluster LPMs
891a63210e1d sched/fair: Fix issue where frequency update not skipped
b775cb29f663 ANDROID: Move schedtune en/dequeue before schedutil update triggers
ebdb82f7b34a sched/fair: Skip frequency updates if CPU about to idle
ff383d94478a FROMLIST: sched: Make iowait_boost optional in schedutil
9539942cb065 FROMLIST: cpufreq: Make iowait boost a policy option
b65c91c9aa14 ARM: dts: msm: add HW CPU's busy-cost-data for additional freqs
72f13941085b ARM: dts: msm: fix CPU's idle-cost-data
ab88411382f7 ARM: dts: msm: fix EM to be monotonically increasing
83dcbae14782 ARM: dts: msm: Fix EAS idle-cost-data property length
33d3b17bfdfb ARM: dts: msm: Add msm8998 energy model
c0fa7577022c sched/walt: Re-add code to allow WALT to function
d5cd35f38616 FROMGIT: binder: use EINTR for interrupted wait for work
db74739c86de sched: Don't fail isolation request for an already isolated CPU
aee7a16e347b sched: WALT: increase WALT minimum window size to 20ms
4dbe44554792 sched: cpufreq: Use per_cpu_ptr instead of this_cpu_ptr when reporting load
ef3fb04c7df4 sched: cpufreq: Use sched_clock instead of rq_clock when updating schedutil
c7128748614a sched/cpupri: Exclude isolated CPUs from the lowest_mask
6adb092856e8 sched: cpufreq: Limit governor updates to WALT changes alone
0fa652ee00f5 sched: walt: Correct WALT window size initialization
41cbb7bc59fb sched: walt: fix window misalignment when HZ=300
43cbf9d6153d sched/tune: Increase the cgroup limit to 6
c71b8fffe6b3 drivers: cpuidle: lpm-levels: Fix KW issues with idle state idx < 0
938e42ca699f drivers: cpuidle: lpm-levels: Correctly check for list empty
8d8a48aecde5 sched/fair: Fix load_balance() affinity redo path
eccc8acbe705 sched/fair: Avoid unnecessary active load balance
0ffdb886996b BACKPORT: sched/core: Fix rules for running on online && !active CPUs
c9999f04236e sched/core: Allow kthreads to fall back to online && !active cpus
b9b6bc6ea3c0 sched: Allow migrating kthreads into online but inactive CPUs
a9314f9d8ad4 sched/fair: Allow load bigger task load balance when nr_running is 2
c0b317c27d44 pinctrl: qcom: Clear status bit on irq_unmask
45df1516d04a UPSTREAM: mm: fix misplaced unlock_page in do_wp_page()
899def5edcd4 UPSTREAM: mm/ksm: Remove reuse_ksm_page()
46c6fbdd185a BACKPORT: mm: do_wp_page() simplification
90dccbae4c04 UPSTREAM: mm: reuse only-pte-mapped KSM page in do_wp_page()
ebf270d24640 sched/fair: vruntime should normalize when switching from fair
cbe0b37059c9 mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct
12d40f1995b4 msm: mdss: Fix indentation
620df03a7229 msm: mdss: Treat polling_en as the bool that it is
12af218146a6 msm: mdss: add idle state node
13e661759656 cpuset: Restore tasks affinity while moving across cpusets
602bf4096dab genirq: Honour IRQ's affinity hint during migration
9209b5556f6a power: qos: Use effective affinity mask
f31078b5825f genirq: Introduce effective affinity mask
58c453484f7e sched/cputime: Mitigate performance regression in times()/clock_gettime()
400383059868 kernel: time: Add delay after cpu_relax() in tight loops
1daa7ea39076 pinctrl: qcom: Update irq handle for GPIO pins
07f7c9961c7c power: smb-lib: Fix mutex acquisition deadlock on PD hard reset
094b738f46c8 power: qpnp-smb2: Implement battery charging_enabled node
d6038d6da57f ASoC: msm-pcm-q6-v2: Add dsp buf check
0d7a6c301af8 qcacld-3.0: Fix OOB in wma_scan_roam.c
Change-Id: Ia2e189e37daad6e99bdb359d1204d9133a7916f4
Diffstat (limited to 'Documentation')
| -rw-r--r-- | Documentation/scheduler/sched-bwc.txt | 45 |
1 files changed, 45 insertions, 0 deletions
diff --git a/Documentation/scheduler/sched-bwc.txt b/Documentation/scheduler/sched-bwc.txt index f6b1873f68ab..de583fbbfe42 100644 --- a/Documentation/scheduler/sched-bwc.txt +++ b/Documentation/scheduler/sched-bwc.txt @@ -90,6 +90,51 @@ There are two ways in which a group may become throttled: In case b) above, even though the child may have runtime remaining it will not be allowed to until the parent's runtime is refreshed. +CFS Bandwidth Quota Caveats +--------------------------- +Once a slice is assigned to a cpu it does not expire. However all but 1ms of +the slice may be returned to the global pool if all threads on that cpu become +unrunnable. This is configured at compile time by the min_cfs_rq_runtime +variable. This is a performance tweak that helps prevent added contention on +the global lock. + +The fact that cpu-local slices do not expire results in some interesting corner +cases that should be understood. + +For cgroup cpu constrained applications that are cpu limited this is a +relatively moot point because they will naturally consume the entirety of their +quota as well as the entirety of each cpu-local slice in each period. As a +result it is expected that nr_periods roughly equal nr_throttled, and that +cpuacct.usage will increase roughly equal to cfs_quota_us in each period. + +For highly-threaded, non-cpu bound applications this non-expiration nuance +allows applications to briefly burst past their quota limits by the amount of +unused slice on each cpu that the task group is running on (typically at most +1ms per cpu or as defined by min_cfs_rq_runtime). This slight burst only +applies if quota had been assigned to a cpu and then not fully used or returned +in previous periods. This burst amount will not be transferred between cores. +As a result, this mechanism still strictly limits the task group to quota +average usage, albeit over a longer time window than a single period. This +also limits the burst ability to no more than 1ms per cpu. This provides +better more predictable user experience for highly threaded applications with +small quota limits on high core count machines. It also eliminates the +propensity to throttle these applications while simultanously using less than +quota amounts of cpu. Another way to say this, is that by allowing the unused +portion of a slice to remain valid across periods we have decreased the +possibility of wastefully expiring quota on cpu-local silos that don't need a +full slice's amount of cpu time. + +The interaction between cpu-bound and non-cpu-bound-interactive applications +should also be considered, especially when single core usage hits 100%. If you +gave each of these applications half of a cpu-core and they both got scheduled +on the same CPU it is theoretically possible that the non-cpu bound application +will use up to 1ms additional quota in some periods, thereby preventing the +cpu-bound application from fully using its quota by that same amount. In these +instances it will be up to the CFS algorithm (see sched-design-CFS.rst) to +decide which application is chosen to run, as they will both be runnable and +have remaining quota. This runtime discrepancy will be made up in the following +periods when the interactive application idles. + Examples -------- 1. Limit a group to 1 CPU worth of runtime. |
