| author | Jeffrey Hugo <jhugo@codeaurora.org> | 2017-05-19 23:49:11 -0400 |
|---|---|---|
| committer | Georg Veichtlbauer <georg@vware.at> | 2023-07-16 12:47:43 +0200 |
| commit | 8d8a48aecde5c4be6c57b9108dc22e8e0cd7f235 (patch) | |
| tree | f5182397801b5c9b90d4962637675023c647d1e5 /kernel/sched | |
| parent | eccc8acbe705a20e0911ea776371d84eba53cc8e (diff) | |
sched/fair: Fix load_balance() affinity redo path
If load_balance() fails to migrate any tasks because all tasks were
affined, load_balance() removes the source cpu from consideration and
attempts to redo and balance among the new subset of cpus.
There is a bug in this code path where the algorithm considers all active
cpus in the system (minus the source that was just masked out). This is
not valid for two reasons: some active cpus may not be in the current
scheduling domain and one of the active cpus is dst_cpu. These cpus should
not be considered, as we cannot pull load from them.
Instead of failing out of load_balance(), we may end up redoing the search
with no valid cpus and incorrectly concluding the domain is balanced.
Additionally, if the group_imbalance flag was just set, it may also be
incorrectly unset, thus the flag will not be seen by other cpus in future
load_balance() runs as that algorithm intends.
Fix the check by removing cpus not in the current domain and the dst_cpu
from consideration, thus limiting the evaluation to valid remaining cpus
from which load might be migrated.
Co-authored-by: Austin Christ <austinwc@codeaurora.org>
Co-authored-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Change-Id: Ife6701c9c62e7155493d9db9398f08c4474e94b3
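The difference between the old and the new redo check can be illustrated outside the kernel. The following is a minimal sketch, not the kernel implementation: it assumes at most 64 cpus and models a cpumask as a plain 64-bit integer. The names `mask_t`, `cpu_bit()`, `should_redo_old()` and `should_redo_new()` are hypothetical and exist only for this illustration; the real code operates on `struct cpumask` through `cpumask_clear_cpu()`, `cpumask_intersects()` and `sched_domain_span()`, and it re-adds dst_cpu to the mask before jumping to "redo".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per cpu, at most 64 cpus. */
typedef uint64_t mask_t;

/* Return a mask with only the bit for @cpu set. */
mask_t cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (mask_t)1 << cpu;
}

/*
 * Old check: after dropping the pinned source cpu, redo if any cpu at all
 * remains in the mask, even one outside the domain or dst_cpu itself.
 */
bool should_redo_old(mask_t cpus, unsigned int busiest)
{
	cpus &= ~cpu_bit(busiest);
	return cpus != 0;
}

/*
 * New check: additionally drop dst_cpu and redo only if a remaining cpu
 * lies inside the span of the current scheduling domain.
 */
bool should_redo_new(mask_t cpus, mask_t sd_span,
		     unsigned int busiest, unsigned int dst_cpu)
{
	cpus &= ~cpu_bit(busiest);
	cpus &= ~cpu_bit(dst_cpu);
	return (cpus & sd_span) != 0;
}
```

With four active cpus (mask 0xF), a domain spanning cpus 0-1 (0x3), busiest cpu 1 and dst_cpu 0, `should_redo_old(0xF, 1)` returns true although no cpu is left to pull load from, while `should_redo_new(0xF, 0x3, 1, 0)` returns false. If the domain also spans cpu 2 (0x7), the new check returns true because cpu 2 is still a candidate.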
Diffstat (limited to 'kernel/sched')
| -rw-r--r-- | kernel/sched/fair.c | 19 |
1 files changed, 18 insertions, 1 deletions
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 42f05c742846..a2f52c35c76a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10697,7 +10697,24 @@ more_balance:
 		/* All tasks on this runqueue were pinned by CPU affinity */
 		if (unlikely(env.flags & LBF_ALL_PINNED)) {
 			cpumask_clear_cpu(cpu_of(busiest), cpus);
-			if (!cpumask_empty(cpus)) {
+			/*
+			 * dst_cpu is not a valid busiest cpu in the following
+			 * check since load cannot be pulled from dst_cpu to be
+			 * put on dst_cpu.
+			 */
+			cpumask_clear_cpu(env.dst_cpu, cpus);
+			/*
+			 * Go back to "redo" iff the load-balance cpumask
+			 * contains other potential busiest cpus for the
+			 * current sched domain.
+			 */
+			if (cpumask_intersects(cpus, sched_domain_span(env.sd))) {
+				/*
+				 * Now that the check has passed, reenable
+				 * dst_cpu so that load can be calculated on
+				 * it in the redo path.
+				 */
+				cpumask_set_cpu(env.dst_cpu, cpus);
 				env.loop = 0;
 				env.loop_break = sched_nr_migrate_break;
 				goto redo;
