author    Jeffrey Hugo <jhugo@codeaurora.org>  2017-05-19 23:49:11 -0400
committer Georg Veichtlbauer <georg@vware.at>  2023-07-16 12:47:43 +0200
commit    8d8a48aecde5c4be6c57b9108dc22e8e0cd7f235 (patch)
tree      f5182397801b5c9b90d4962637675023c647d1e5 /kernel/sched
parent    eccc8acbe705a20e0911ea776371d84eba53cc8e (diff)
sched/fair: Fix load_balance() affinity redo path
If load_balance() fails to migrate any tasks because all tasks were
affined, load_balance() removes the source cpu from consideration and
attempts to redo and balance among the new subset of cpus.

There is a bug in this code path where the algorithm considers all
active cpus in the system (minus the source that was just masked out).
This is not valid for two reasons: some active cpus may not be in the
current scheduling domain, and one of the active cpus is dst_cpu. These
cpus should not be considered, as we cannot pull load from them.

Instead of failing out of load_balance(), we may end up redoing the
search with no valid cpus and incorrectly concluding the domain is
balanced. Additionally, if the group_imbalance flag was just set, it
may also be incorrectly unset, so the flag will not be seen by other
cpus in future load_balance() runs, as that algorithm intends.

Fix the check by removing cpus not in the current domain and the
dst_cpu from consideration, thus limiting the evaluation to valid
remaining cpus from which load might be migrated.

Co-authored-by: Austin Christ <austinwc@codeaurora.org>
Co-authored-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Change-Id: Ife6701c9c62e7155493d9db9398f08c4474e94b3
Diffstat (limited to 'kernel/sched')
-rw-r--r--	kernel/sched/fair.c	19
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 42f05c742846..a2f52c35c76a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10697,7 +10697,24 @@ more_balance:
 		/* All tasks on this runqueue were pinned by CPU affinity */
 		if (unlikely(env.flags & LBF_ALL_PINNED)) {
 			cpumask_clear_cpu(cpu_of(busiest), cpus);
-			if (!cpumask_empty(cpus)) {
+			/*
+			 * dst_cpu is not a valid busiest cpu in the following
+			 * check since load cannot be pulled from dst_cpu to be
+			 * put on dst_cpu.
+			 */
+			cpumask_clear_cpu(env.dst_cpu, cpus);
+			/*
+			 * Go back to "redo" iff the load-balance cpumask
+			 * contains other potential busiest cpus for the
+			 * current sched domain.
+			 */
+			if (cpumask_intersects(cpus, sched_domain_span(env.sd))) {
+				/*
+				 * Now that the check has passed, reenable
+				 * dst_cpu so that load can be calculated on
+				 * it in the redo path.
+				 */
+				cpumask_set_cpu(env.dst_cpu, cpus);
 				env.loop = 0;
 				env.loop_break = sched_nr_migrate_break;
 				goto redo;