BACKPORT: sched/fair: Initiate a new task's util avg to a bounded value

A new task's util_avg is set to full utilization of a CPU (100% time
running). This accelerates a new task's utilization ramp-up, useful to
boost its execution in early time. However, it may result in
(insanely) high utilization for a transient time period when a flood
of tasks are spawned. Importantly, it violates the "fundamentally
bounded" CPU utilization, and its side effect is negative if we don't
take any measure to bound it.

This patch proposes an algorithm to address this issue. It has
two methods to approach a sensible initial util_avg:

(1) An expected (or average) util_avg based on its cfs_rq's util_avg:

  util_avg = cfs_rq->util_avg / (cfs_rq->load_avg + 1) * se.load.weight

(2) A trajectory of how successive new tasks' util develops, which
gives 1/2 of the left utilization budget to a new task such that
the additional util is noticeably large (when overall util is low) or
unnoticeably small (when overall util is high enough). In the meantime,
the aggregate utilization is well bounded:

  util_avg_cap = (1024 - cfs_rq->avg.util_avg) / 2^n

where n denotes the nth task.

If util_avg is larger than util_avg_cap, then the effective util is
clamped to the util_avg_cap.

Change-Id: Idafe989b24d9e70911666f09800bf1d5a011e1f4
Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: steve.muckle@linaro.org
Link: http://lkml.kernel.org/r/1459283456-21682-1-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 2b8c41daba327c633228169e8bd8ec067ab443f8)
[integrate with schedfreq - schedfreq has a tuneable for init task util
 but this commit removes the use of the tuneable since we have a new
 algorithm for calculating an initial utilisation. I've left the
 tuneable in place, but it is no longer used even when schedfreq is
 the CPUFreq governor]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
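
The two methods in the commit message can be illustrated with a small userspace sketch. This is not the kernel implementation: initial_util_avg() and flood_aggregate_util() are hypothetical helper names, plain long values stand in for the cfs_rq and sched_entity fields, and returning 0 when no budget is left is an assumption made for this example. The multiplication is done before the division to limit integer rounding loss.

```c
#define SCHED_CAPACITY_SCALE 1024L  /* full utilization of one CPU */

/*
 * Hypothetical helper: bounded initial util_avg for a new task.
 * cfs_util_avg and cfs_load_avg stand in for cfs_rq->avg.util_avg and
 * cfs_rq->avg.load_avg; weight stands in for se.load.weight.
 */
long initial_util_avg(long cfs_util_avg, long cfs_load_avg, long weight)
{
    /* Method (2): the new task may take half of the remaining budget. */
    long cap = (SCHED_CAPACITY_SCALE - cfs_util_avg) / 2;
    long util;

    if (cap <= 0)
        return 0;   /* assumption: no budget left, start from zero */

    if (cfs_util_avg == 0)
        return cap; /* idle queue: nothing to average, use the cap */

    /* Method (1): expected util, scaled by the task's weight. */
    util = cfs_util_avg * weight / (cfs_load_avg + 1);

    /* Clamp the estimate to the cap. */
    return util > cap ? cap : util;
}

/*
 * Hypothetical helper: aggregate util after n tasks are spawned back to
 * back on an initially idle queue. Passing cfs_load_avg = 0 forces the
 * method (1) estimate above the cap, so every task receives exactly the
 * cap (a worst case chosen for this sketch).
 */
long flood_aggregate_util(int n)
{
    long total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += initial_util_avg(total, 0, 1024);

    return total;
}
```

On an idle queue the first task starts at 512, the next at 256, then 128, and so on, so the aggregate approaches 1024 without reaching it no matter how many tasks are spawned.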
author: Yuyang Du <yuyang.du@intel.com> 2016-03-30 04:30:56 +0800
committer: Andres Oportus <andresoportus@google.com> 2017-06-02 08:01:53 -0700
commit: 9de438d27c43863dabf9db696fecbb90bc5c91eb (patch)
tree: 30553ad68c1cd690dfcb9c12a33eca2f66b73c31 /kernel/sched/sched.h
parent: 4e18c8a10de0c4d435dce95e526ecbe97c77d5c5 (diff)
1 files changed, 1 insertions, 0 deletions
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 2051fecdb9e5..50b9229d47ab 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1391,6 +1391,7 @@ extern void init_dl_task_timer(struct sched_dl_entity *dl_se);
 unsigned long to_ratio(u64 period, u64 runtime);
 
 extern void init_entity_runnable_average(struct sched_entity *se);
+extern void post_init_entity_util_avg(struct sched_entity *se);
 
 static inline void __add_nr_running(struct rq *rq, unsigned count)
 {