summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorDavide Garberi <dade.garberi@gmail.com>2023-08-06 15:31:30 +0200
committerDavide Garberi <dade.garberi@gmail.com>2023-08-06 15:31:30 +0200
commitcc57cb4ee3b7918b74d30604735d353b9a5fa23b (patch)
tree0be483b86472eaf1c74f747ecbaf6300f3998a1a /Documentation
parent44be99a74546fb018cbf2049602a5fd2889a0089 (diff)
parent7d11b1a7a11c598a07687f853ded9eca97d89043 (diff)
Merge lineage-20 of git@github.com:LineageOS/android_kernel_qcom_msm8998.git into lineage-20
7d11b1a7a11c Revert "sched: cpufreq: Use sched_clock instead of rq_clock when updating schedutil" daaa5da96a74 sched: Take irq_sparse lock during the isolation 217ab2d0ef91 rcu: Speed up calling of RCU tasks callbacks 997b726bc092 kernel: power: Workaround for sensor ipc message causing high power consume b933e4d37bc0 sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices 82d3f23d6dc5 sched/fair: Fix bandwidth timer clock drift condition 629bfed360f9 kernel: power: qos: remove check for core isolation while cluster LPMs 891a63210e1d sched/fair: Fix issue where frequency update not skipped b775cb29f663 ANDROID: Move schedtune en/dequeue before schedutil update triggers ebdb82f7b34a sched/fair: Skip frequency updates if CPU about to idle ff383d94478a FROMLIST: sched: Make iowait_boost optional in schedutil 9539942cb065 FROMLIST: cpufreq: Make iowait boost a policy option b65c91c9aa14 ARM: dts: msm: add HW CPU's busy-cost-data for additional freqs 72f13941085b ARM: dts: msm: fix CPU's idle-cost-data ab88411382f7 ARM: dts: msm: fix EM to be monotonically increasing 83dcbae14782 ARM: dts: msm: Fix EAS idle-cost-data property length 33d3b17bfdfb ARM: dts: msm: Add msm8998 energy model c0fa7577022c sched/walt: Re-add code to allow WALT to function d5cd35f38616 FROMGIT: binder: use EINTR for interrupted wait for work db74739c86de sched: Don't fail isolation request for an already isolated CPU aee7a16e347b sched: WALT: increase WALT minimum window size to 20ms 4dbe44554792 sched: cpufreq: Use per_cpu_ptr instead of this_cpu_ptr when reporting load ef3fb04c7df4 sched: cpufreq: Use sched_clock instead of rq_clock when updating schedutil c7128748614a sched/cpupri: Exclude isolated CPUs from the lowest_mask 6adb092856e8 sched: cpufreq: Limit governor updates to WALT changes alone 0fa652ee00f5 sched: walt: Correct WALT window size initialization 41cbb7bc59fb sched: walt: fix window misalignment when HZ=300 43cbf9d6153d sched/tune: Increase the cgroup limit to 6 c71b8fffe6b3 drivers: cpuidle: lpm-levels: Fix KW issues with idle state idx < 0 938e42ca699f drivers: cpuidle: lpm-levels: Correctly check for list empty 8d8a48aecde5 sched/fair: Fix load_balance() affinity redo path eccc8acbe705 sched/fair: Avoid unnecessary active load balance 0ffdb886996b BACKPORT: sched/core: Fix rules for running on online && !active CPUs c9999f04236e sched/core: Allow kthreads to fall back to online && !active cpus b9b6bc6ea3c0 sched: Allow migrating kthreads into online but inactive CPUs a9314f9d8ad4 sched/fair: Allow load bigger task load balance when nr_running is 2 c0b317c27d44 pinctrl: qcom: Clear status bit on irq_unmask 45df1516d04a UPSTREAM: mm: fix misplaced unlock_page in do_wp_page() 899def5edcd4 UPSTREAM: mm/ksm: Remove reuse_ksm_page() 46c6fbdd185a BACKPORT: mm: do_wp_page() simplification 90dccbae4c04 UPSTREAM: mm: reuse only-pte-mapped KSM page in do_wp_page() ebf270d24640 sched/fair: vruntime should normalize when switching from fair cbe0b37059c9 mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct 12d40f1995b4 msm: mdss: Fix indentation 620df03a7229 msm: mdss: Treat polling_en as the bool that it is 12af218146a6 msm: mdss: add idle state node 13e661759656 cpuset: Restore tasks affinity while moving across cpusets 602bf4096dab genirq: Honour IRQ's affinity hint during migration 9209b5556f6a power: qos: Use effective affinity mask f31078b5825f genirq: Introduce effective affinity mask 58c453484f7e sched/cputime: Mitigate performance regression in times()/clock_gettime() 400383059868 kernel: time: Add delay after cpu_relax() in tight loops 1daa7ea39076 pinctrl: qcom: Update irq handle for GPIO pins 07f7c9961c7c power: smb-lib: Fix mutex acquisition deadlock on PD hard reset 094b738f46c8 power: qpnp-smb2: Implement battery charging_enabled node d6038d6da57f ASoC: msm-pcm-q6-v2: Add dsp buf check 0d7a6c301af8 qcacld-3.0: Fix OOB in wma_scan_roam.c Change-Id: Ia2e189e37daad6e99bdb359d1204d9133a7916f4
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/scheduler/sched-bwc.txt45
1 files changed, 45 insertions, 0 deletions
diff --git a/Documentation/scheduler/sched-bwc.txt b/Documentation/scheduler/sched-bwc.txt
index f6b1873f68ab..de583fbbfe42 100644
--- a/Documentation/scheduler/sched-bwc.txt
+++ b/Documentation/scheduler/sched-bwc.txt
@@ -90,6 +90,51 @@ There are two ways in which a group may become throttled:
In case b) above, even though the child may have runtime remaining it will not
be allowed to until the parent's runtime is refreshed.
+CFS Bandwidth Quota Caveats
+---------------------------
+Once a slice is assigned to a cpu it does not expire. However all but 1ms of
+the slice may be returned to the global pool if all threads on that cpu become
+unrunnable. This is configured at compile time by the min_cfs_rq_runtime
+variable. This is a performance tweak that helps prevent added contention on
+the global lock.
+
+The fact that cpu-local slices do not expire results in some interesting corner
+cases that should be understood.
+
+For cgroup cpu constrained applications that are cpu limited this is a
+relatively moot point because they will naturally consume the entirety of their
+quota as well as the entirety of each cpu-local slice in each period. As a
+result it is expected that nr_periods roughly equal nr_throttled, and that
+cpuacct.usage will increase roughly equal to cfs_quota_us in each period.
+
+For highly-threaded, non-cpu bound applications this non-expiration nuance
+allows applications to briefly burst past their quota limits by the amount of
+unused slice on each cpu that the task group is running on (typically at most
+1ms per cpu or as defined by min_cfs_rq_runtime). This slight burst only
+applies if quota had been assigned to a cpu and then not fully used or returned
+in previous periods. This burst amount will not be transferred between cores.
+As a result, this mechanism still strictly limits the task group to quota
+average usage, albeit over a longer time window than a single period. This
+also limits the burst ability to no more than 1ms per cpu. This provides
+better more predictable user experience for highly threaded applications with
+small quota limits on high core count machines. It also eliminates the
+propensity to throttle these applications while simultanously using less than
+quota amounts of cpu. Another way to say this, is that by allowing the unused
+portion of a slice to remain valid across periods we have decreased the
+possibility of wastefully expiring quota on cpu-local silos that don't need a
+full slice's amount of cpu time.
+
+The interaction between cpu-bound and non-cpu-bound-interactive applications
+should also be considered, especially when single core usage hits 100%. If you
+gave each of these applications half of a cpu-core and they both got scheduled
+on the same CPU it is theoretically possible that the non-cpu bound application
+will use up to 1ms additional quota in some periods, thereby preventing the
+cpu-bound application from fully using its quota by that same amount. In these
+instances it will be up to the CFS algorithm (see sched-design-CFS.rst) to
+decide which application is chosen to run, as they will both be runnable and
+have remaining quota. This runtime discrepancy will be made up in the following
+periods when the interactive application idles.
+
Examples
--------
1. Limit a group to 1 CPU worth of runtime.