summaryrefslogtreecommitdiff
path: root/kernel
diff options
context:
space:
mode:
authorBrendan Jackman <brendan.jackman@arm.com>2017-08-07 17:39:00 +0100
committerChris Redpath <chris.redpath@arm.com>2017-10-27 13:30:33 +0100
commit795a6867cfe1ea0bbe967d2b037ae856484556a4 (patch)
treedf43fd6b36c15e3f898ea9ce0f250008314ce4c5 /kernel
parentfd4a95dab858fa1350cc94ce4b435b7295db3ed3 (diff)
UPSTREAM: sched/fair: Force balancing on nohz balance if local group has capacity
The "goto force_balance" here is intended to mitigate the fact that avg_load calculations can result in bad placement decisions when priority is asymmetrical. The original commit that adds it: fab476228ba3 ("sched: Force balancing on newidle balance if local group has capacity") explains: Under certain situations, such as a niced down task (i.e. nice = -15) in the presence of nr_cpus NICE0 tasks, the niced task lands on a sched group and kicks away other tasks because of its large weight. This leads to sub-optimal utilization of the machine. Even though the sched group has capacity, it does not pull tasks because sds.this_load >> sds.max_load, and f_b_g() returns NULL. A similar but inverted issue also affects ARM big.LITTLE (asymmetrical CPU capacity) systems - consider 8 always-running, same-priority tasks on a system with 4 "big" and 4 "little" CPUs. Suppose that 5 of them end up on the "big" CPUs (which will be represented by one sched_group in the DIE sched_domain) and 3 on the "little" (the other sched_group in DIE), leaving one CPU unused. Because the "big" group has a higher group_capacity its avg_load may not present an imbalance that would cause migrating a task to the idle "little". The force_balance case here solves the problem but currently only for CPU_NEWLY_IDLE balances, which in theory might never happen on the unused CPU. Including CPU_IDLE in the force_balance case means there's an upper bound on the time before we can attempt to solve the underutilization: after DIE's sd->balance_interval has passed the next nohz balance kick will help us out. Change-Id: I807ba5cba0ef1b8bbec02cbcd4755fd32af10135 Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170807163900.25180-1-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry-picked-from: commit 583ffd99d765 tip:sched/core) Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Diffstat (limited to 'kernel')
-rw-r--r--kernel/sched/fair.c7
1 files changed, 5 insertions, 2 deletions
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f6e730bcde27..ae94c1124655 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8707,8 +8707,11 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
if (busiest->group_type == group_imbalanced)
goto force_balance;
- /* SD_BALANCE_NEWIDLE trumps SMP nice when underutilized */
- if (env->idle == CPU_NEWLY_IDLE && group_has_capacity(env, local) &&
+ /*
+ * When dst_cpu is idle, prevent SMP nice and/or asymmetric group
+ * capacities from resulting in underutilization due to avg_load.
+ */
+ if (env->idle != CPU_NOT_IDLE && group_has_capacity(env, local) &&
busiest->group_no_capacity)
goto force_balance;