path: root/kernel/sched
...
* sched: kill unnecessary divisions on fast path (Joonwoo Park, 2016-06-21)

  The max_possible_efficiency and CPU's efficiency are fixed values which are
  determined at cluster allocation time. Avoid division on the fast path by
  using a precomputed scale factor. Also, update_cpu_busy_time() doesn't need
  to know how many full windows have elapsed, so replace the unneeded division
  with a simple comparison.

  Change-Id: I2be1aad3fb9b895e4f0917d05bd8eade985bbccf
  Suggested-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
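A minimal sketch of the idea, with hypothetical helper names standing in for the actual kernel functions: the single division is paid once at cluster allocation, leaving only a multiply-and-shift on the fast path, and the full-window division collapses to a comparison when only "did a full window elapse?" matters.

```c
#include <assert.h>
#include <stdint.h>

#define EFF_SHIFT 10 /* fixed-point precision of the precomputed factor */

/* Slow path, run once at cluster allocation: the only division. */
static uint64_t make_scale_factor(uint64_t cpu_efficiency,
                                  uint64_t max_possible_efficiency)
{
    return (cpu_efficiency << EFF_SHIFT) / max_possible_efficiency;
}

/* Fast path: scale an execution-time delta with multiply + shift only. */
static uint64_t scale_exec_time(uint64_t delta, uint64_t scale_factor)
{
    return (delta * scale_factor) >> EFF_SHIFT;
}

/* The full-window count is not needed when only "did at least one full
 * window elapse?" matters, so a comparison replaces a division. */
static int full_window_elapsed(uint64_t delta, uint64_t window_size)
{
    return delta >= window_size;
}
```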
* sched: prevent race when updating CPU cycles (Joonwoo Park, 2016-06-21)

  Updating the cycle counter should be serialized by holding the rq lock. Add
  the missing rq lock hold where the cycle counter is updated by the IRQ entry
  point.

  Change-Id: I92cf75d047a45ebf15a6ddeeecf8fc3823f96e5d
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: fix overflow in scaled execution time calculation (Joonwoo Park, 2016-06-21)

  Task execution time in nanoseconds and CPU cycle counters are large enough
  to cause an overflow when we multiply the two. Avoid the overflow by
  calculating the frequency separately first.

  Change-Id: I076d9ecd27cb1c1f11578f009ebe1a19c1619454
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
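The overflow and its fix can be sketched as follows (hypothetical helper names; the real code operates on per-window statistics). Multiplying a nanosecond delta directly by a raw cycle delta can exceed 64 bits, so the frequency is derived first and the execution time scaled afterwards, keeping every intermediate product small:

```c
#include <assert.h>
#include <stdint.h>

#define HZ_PER_KHZ   1000ULL
#define NSEC_PER_SEC 1000000000ULL

/* Derive the frequency (kHz) from the cycle delta first... */
static uint64_t cycles_to_freq_khz(uint64_t cycle_delta, uint64_t delta_ns)
{
    return cycle_delta * (NSEC_PER_SEC / HZ_PER_KHZ) / delta_ns;
}

/* ...then scale the execution time by freq/max_freq. Each product here
 * stays comfortably within 64 bits, unlike delta_ns * cycle_delta. */
static uint64_t scale_exec_time(uint64_t delta_ns, uint64_t cycle_delta,
                                uint64_t max_freq_khz)
{
    uint64_t freq_khz = cycles_to_freq_khz(cycle_delta, delta_ns);

    return delta_ns * freq_khz / max_freq_khz;
}
```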
* sched: remove unused parameter cpu from cpu_cycles_to_freq() (Joonwoo Park, 2016-06-21)

  The function parameter cpu isn't used anymore by cpu_cycles_to_freq(), so
  remove it.

  Change-Id: Ide19321206dacb88fedca97e1b689d740f872866
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: avoid potential race between governor and thermal driver (Joonwoo Park, 2016-06-09)

  It's possible that the thermal driver and the governor notify that fmax is
  being changed at the same time, in which case we can potentially skip
  updating the CPU's capacity. Fix this by always updating the capacity when
  the limited fmax is changed by the same entity. Also serialize
  sched_update_cpu_freq_min_max() with a spinlock, since this function can be
  called by multiple drivers at the same time.

  Change-Id: I3608cb09c30797bf858f434579fd07555546fb60
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: fix potential deflated frequency estimation during IRQ handling (Joonwoo Park, 2016-06-09)

  The time between the idle task's mark_start and the IRQ handler entry time
  is a CPU cycle counter stall period. Therefore it's inappropriate to include
  that duration in the sample period when we do frequency estimation. Fix this
  suboptimality by replenishing the idle task's CPU cycle counter upon IRQ
  entry and using irqtime as the time delta.

  Change-Id: I274d5047a50565cfaaa2fb821ece21c8cf4c991d
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: fix CPU frequency estimation while idle (Joonwoo Park, 2016-06-09)

  The CPU cycle counter won't increase when the CPU or cluster is idle,
  depending on the hardware. Thus using the cycle counter over that period of
  time can result in incorrect CPU frequency estimation. Use the previously
  calculated CPU frequency when the CPU was idle.

  Change-Id: I732b50c974a73c08038995900e008b4e16e9437b
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
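A hedged sketch of that fallback (names and signature are assumptions, not the kernel's): when the counter did not advance because the CPU idled, the last computed frequency is reused instead of deriving a bogus value from a stalled counter.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate CPU frequency (kHz) from a cycle-counter delta, falling back
 * to the previously calculated frequency when the counter stalled
 * because the CPU or cluster was idle. */
static uint64_t estimate_freq_khz(uint64_t cycle_delta, uint64_t delta_ns,
                                  uint64_t *last_freq_khz)
{
    if (cycle_delta == 0 || delta_ns == 0)
        return *last_freq_khz;   /* counter stalled: reuse last value */

    /* cycles per ns scaled to kHz (NSEC_PER_SEC / HZ_PER_KHZ = 1e6) */
    *last_freq_khz = cycle_delta * 1000000ULL / delta_ns;
    return *last_freq_khz;
}
```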
* sched: preserve CPU cycle counter in rq (Joonwoo Park, 2016-06-09)

  Preserve the cycle counter in the rq, in preparation for the fix to
  wait-time accounting while the CPU is idle.

  Change-Id: I469263c90e12f39bb36bde5ed26298b7c1c77597
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* arm64: Add support for app specific settings (Sarangdhar Joshi, 2016-06-07)

  Add support for an interface that can be used from userspace to decide
  whether app-specific settings need to be applied / cleared when particular
  processes are running.

  CRs-Fixed: 981519 997757
  Change-Id: Id81f8b70de64f291a8586150f4d2c7c8f8b4420f
  Signed-off-by: Sarangdhar Joshi <spjoshi@codeaurora.org>
  [satyap@codeaurora.org: trivial merge conflict resolution and pull fixes
  for CR: 997757]
  Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
* Revert "sched: warn/panic upon excessive scheduling latency" (Joonwoo Park, 2016-06-03)

  This reverts commit 8f90803a45d3aa349 ("sched: warn/panic upon excessive
  scheduling latency") as this feature is no longer used.

  Change-Id: I200d0e9e8dad5047522cd02a68de25d4a70a91a4
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* Revert "sched: add scheduling latency tracking procfs node" (Joonwoo Park, 2016-06-03)

  This reverts commit b40bf941f61756bcc ("sched: add scheduling latency
  tracking procfs node") as this feature is no longer used.

  Change-Id: I5de789b6349e6ea78ae3725af2a3ffa72b7b7f11
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: eliminate sched_early_detection_duration knob (Joonwoo Park, 2016-06-03)

  Kill the unused scheduler knob sched_early_detection_duration.

  Change-Id: I36b7a10982367f9c7ab8eefcb8ef1d0f9955601d
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: Remove the sched heavy task frequency guidance feature (Joonwoo Park, 2016-06-03)

  This has always been an unused feature, given its limitation of adding
  phantom load to the system. Since there are no immediate plans to use it,
  and it adds unnecessary complications to the new load fixup mechanism,
  remove the feature for now. It can be revisited later in light of the new
  mechanism.

  Change-Id: Ie9501a898d0f423338293a8dde6bc56f493f1e75
  Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: eliminate sched_migration_fixup knob (Joonwoo Park, 2016-06-03)

  Kill the unused scheduler knob sched_migration_fixup. With this change the
  scheduler always adjusts a CPU's busy time during migration.

  Change-Id: I5d59e89d5cc0f2c705c40036cd7b47f5d3f89e58
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: eliminate sched_upmigrate_min_nice knob (Joonwoo Park, 2016-06-03)

  Kill the unused scheduler knob sched_upmigrate_min_nice.

  Change-Id: I53ddfde39c78e78306bd746c1c4da9a94ec67cd8
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: eliminate sched_enable_power_aware knob and parameter (Joonwoo Park, 2016-06-01)

  Kill the unused scheduler knob and parameter sched_enable_power_aware. The
  HMP scheduler now always takes power cost into account when placing tasks.

  Change-Id: Ib26a21df9b903baac26c026862b0a41b4a8834f3
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: eliminate sched_freq_account_wait_time knob (Joonwoo Park, 2016-06-01)

  Kill the unused scheduler knob sched_freq_account_wait_time.

  Change-Id: Ib74123ebd69dfa3f86cf7335099f50c12a6e93c3
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: eliminate sched_account_wait_time knob (Joonwoo Park, 2016-06-01)

  Kill the unused scheduler knob sched_account_wait_time. With this change the
  scheduler always accounts a task's wait time into its demand.

  Change-Id: Ifa4bcb5685798f48fd020f3d0c9853220b3f5fdc
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: Aggregate for frequency (Srivatsa Vaddagiri, 2016-05-26)

  Related threads in a group could execute on different CPUs and hence present
  a split-demand picture to the cpufreq governor. In other words, the governor
  fails to see the net CPU demand of all related threads in a given window if
  the threads' execution is split across CPUs. That can result in a
  sub-optimal frequency being chosen compared to the ideal frequency for the
  aggregate work taken up by the related threads.

  This patch aggregates CPU execution stats in a window for all related
  threads in a group. This presents the CPU busy time to the governor as if
  all related threads were part of the same thread, and thus helps select the
  right frequency required by the related threads. This aggregation is done
  per-cluster.

  Change-Id: I71e6047620066323721c6d542034ddd4b2950e7f
  Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
  Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
  [joonwoop@codeaurora.org: Fixed notify_migration() to hold the rcu read
  lock, as this version of Linux doesn't hold p->pi_lock when the function
  gets called, while keeping the use of rcu_access_pointer() since we never
  dereference the return value.]
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: simplify CPU frequency estimation and cycle counter API (Joonwoo Park, 2016-05-20)

  Most CPUs increase their cycle counter by one every cycle, which makes
  frequency = cycles / time_delta correct. Therefore it's reasonable to get
  rid of the current cpu_cycle_max_scale_factor and instead ask the cycle
  counter read callback function to return a scaled counter value when
  needed, for the case where the cycle counter doesn't increase every cycle.
  Thus multiply the CPU cycle counter delta by NSEC_PER_SEC / HZ_PER_KHZ, as
  we calculate frequency in kHz, and remove cpu_cycle_max_scale_factor. This
  simplifies both the frequency estimation and the cycle counter API.

  Change-Id: Ie7a628d4bc77c9b6c769f6099ce8d75740262a14
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: use correct Kconfig macro name CONFIG_SCHED_HMP_CSTATE_AWARE (Joonwoo Park, 2016-05-16)

  Fix the macro name so that CONFIG_SCHED_HMP_CSTATE_AWARE=y takes effect.

  Change-Id: I0218b36b2d74974f50a173a0ac3bc59156c57624
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* Revert "sched: set HMP scheduler's default initial task load to 100%" (Joonwoo Park, 2016-05-16)

  This reverts commit 28f67e5a50d7c1bfc ("sched: set HMP scheduler's default
  initial task load to 100%"), since a 100% initial task load causes too much
  power inefficiency on some targets.

  CRs-fixed: 1006303
  Change-Id: I81b4ba8fdc2e2fe1b40f18904964098fa558989b
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* watchdog: introduce touch_softlockup_watchdog_sched() (Tejun Heo, 2016-05-05)

  touch_softlockup_watchdog() is used to tell the watchdog that a scheduler
  stall is expected. One group of usages is from paths where the task may not
  be able to yield for a long time, such as performing slow PIO to a finicky
  device or coming out of suspend. The other is to account for the scheduler
  and timer going idle.

  For scheduler softlockup detection, there's no reason to distinguish the
  two cases; however, a workqueue lockup detector is planned, and it can use
  the same signals from the former group, while the latter would spuriously
  prevent detection. This patch introduces a new function
  touch_softlockup_watchdog_sched() and converts the latter group to call it
  instead. For now, it just calls touch_softlockup_watchdog() and there's no
  functional difference.

  CRs-Fixed: 1007459
  Change-Id: I6fe77926acd4240458cab29d399f81d8739a16c0
  Signed-off-by: Tejun Heo <tj@kernel.org>
  Cc: Ulrich Obergfell <uobergfe@redhat.com>
  Cc: Ingo Molnar <mingo@redhat.com>
  Cc: Peter Zijlstra <peterz@infradead.org>
  Cc: Thomas Gleixner <tglx@linutronix.de>
  Cc: Andrew Morton <akpm@linux-foundation.org>
  Git-commit: 03e0d4610bf4d4a93bfa16b2474ed4fd5243aa71
  Git-repo: git://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
  Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
* sched: take limited CPU min and max frequencies into account (Joonwoo Park, 2016-04-27)

  A CPU's actual min and max frequencies can be limited by hardware
  components without the governor being aware of it. Provide an API for those
  components to notify the scheduler, so that it can track the accurate
  currently-operating frequency boundaries, which helps it make better task
  placement decisions.

  CRs-fixed: 1006303
  Change-Id: I608f5fa8b0baff8d9e998731dcddec59c9073d20
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
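The effect of such a notification amounts to clamping the governor-reported frequency into the hardware-limited range; a sketch with a hypothetical helper (the log names the real entry point sched_update_cpu_freq_min_max(), whose signature is not shown here):

```c
#include <assert.h>

/* Clamp a governor-reported frequency (kHz) into the boundaries that
 * hardware components reported as the real operating limits. */
static unsigned int effective_freq(unsigned int reported,
                                   unsigned int limited_min,
                                   unsigned int limited_max)
{
    if (reported < limited_min)
        return limited_min;
    if (reported > limited_max)
        return limited_max;
    return reported;
}
```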
* sched: add support for CPU frequency estimation with cycle counter (Joonwoo Park, 2016-04-27)

  At present the scheduler calculates a task's demand from the task's
  execution time weighted by CPU frequency. The CPU frequency is given by the
  governor's CPU frequency transition notification, but such a notification
  may not be available. Provide an API for the CPU clock driver to register
  callback functions, so that the scheduler can access the CPU's cycle
  counter to estimate the CPU's frequency without a notification. At this
  point the scheduler assumes the cycle counter always increases, even when
  the cluster is idle, which might not be true. This will be fixed by a
  subsequent change for more accurate I/O wait time accounting.

  CRs-fixed: 1006303
  Change-Id: I93b187efd7bc225db80da0184683694f5ab99738
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: revise sched_boost to make the best of big cluster CPUs (Joonwoo Park, 2016-04-25)

  At present sched_boost changes the scheduler to place tasks on the least
  loaded CPU, under the assumption that the big and little cluster capacities
  are the same at the same frequency level. This is suboptimal for big.LITTLE
  systems that don't have such symmetrical capacity between big and little
  CPUs. Fix sched_boost to place tasks on the big CPUs for targets with
  non-symmetrical capacity.

  CRs-fixed: 1006303
  Change-Id: I752f020acf1a76580edb5cd0e5ad283b62edfeed
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: fix excessive task packing when CONFIG_SCHED_HMP_CSTATE_AWARE=y (Joonwoo Park, 2016-04-22)

  At present, among CPUs with the same power cost and C-state, the scheduler
  places a newly waking task on the most loaded CPU, which can incur too much
  task packing on the same CPU. Place the task on the most loaded CPU only
  when the best CPU is in an idle C-state; otherwise spread out by placing it
  on the least loaded CPU.

  CRs-fixed: 1006303
  Change-Id: I8ae7332971b3293d912b1582f75e33fd81407d86
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: add option whether CPU C-state is used to guide task placement (Joonwoo Park, 2016-04-22)

  There are CPUs that don't have an obvious low power mode exit latency
  penalty. Add a new Kconfig option, CONFIG_SCHED_HMP_CSTATE_AWARE, which
  controls whether CPU C-state is used to guide task placement.

  CRs-fixed: 1006303
  Change-Id: Ie8dbab8e173c3a1842d922f4d1fbd8cc4221789c
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: update placement logic to prefer C-state and busier CPUs (Syed Rameez Mustafa, 2016-04-22)

  Update the wakeup placement logic for when need_idle is not set. Break ties
  in power with C-state; if the C-state is the same, break ties with
  prev_cpu; finally go for the most loaded CPU.

  CRs-fixed: 1006303
  Change-Id: Iafa98a909ed464af33f4fe3345bbfc8e77dee963
  Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
  [joonwoop@codeaurora.org: fixed a bug where best_cpu_cstate was assigned an
  uninitialized cpu_cstate.]
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: Optimize wakeup placement logic when need_idle is set (Syed Rameez Mustafa, 2016-04-22)

  Try to find the minimum C-state CPU within the little cluster when a task
  fits there. If there is no idle CPU, return the least busy CPU. Also add a
  prev-CPU bias when C-states or load are the same.

  CRs-fixed: 1006303
  Change-Id: I577cc70a59f2b0c5309c87b54e106211f96e04a0
  Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
* kernel: sched: Fix compilation issues for Usermode Linux (Jeevan Shriram, 2016-04-12)

  Fix compilation errors for ARCH=um on the x86_64 architecture.

  CRs-Fixed: 996252
  Change-Id: I414b551e28a950e4b601f31bb4bfa2f1200d1713
  Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org>
* sched: fix circular dependency of rq->lock and kswapd waitqueue lock (Pavankumar Kondeti, 2016-03-23)

  There is a deadlock scenario due to the circular dependency of a CPU's
  rq->lock and kswapd's waitqueue lock.

  (1) When kswapd is woken up, try_to_wake_up() is called with its waitqueue
  lock held. Its previous CPU is offline, so it is woken up on a different
  CPU. We try to acquire the offline CPU's rq->lock in either the cpufreq
  change callback or fixup_busy_time().

  (2) At the same time, the offline CPU is coming online and init_idle() is
  called from __cpu_up(). init_idle() calls __sched_fork() with rq->lock
  held. A debug object allocation in hrtimer_init(), called from
  __sched_fork(), tries to wake up kswapd and attempts to take the waitqueue
  lock held in path (1).

  Task-specific initialization is done in __sched_fork(), and rq->lock is not
  held when it is called for other tasks. The same holds true for the idle
  task as well: __sched_fork() for the idle task is called only when the CPU
  is not active. Acquire the rq->lock after calling __sched_fork() in
  init_idle() to fix this deadlock.

  CRs-Fixed: 965873
  Change-Id: Ib8a265835c29861dba571c9b2a6b7e75b5cb43ee
  Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
  [satyap: trivial merge conflict resolution and omitted changes for QHMP]
  Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
* sched: move migration notification out of spinlock (Joonwoo Park, 2016-03-23)

  Commit 5e16bbc2fb40537 ("sched: Streamline the task migration locking a
  little") hardened the task migration locking, and now __migrate_task() is
  called with the rq lock held. Move the notification out of the spinlock.

  Change-Id: I553adcfe80d5c670f4ddf83438226fd5e0924fe8
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: fix compile failure with !CONFIG_SCHED_HMP (Joonwoo Park, 2016-03-23)

  Fix various compilation failures that occur when CONFIG_SCHED_HMP or
  CONFIG_SCHED_INPUT isn't enabled.

  Change-Id: I385dd37cfd778919f54f606bc13bebedd2fb5b9e
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: restrict sync wakee placement bias with waker's demand (Joonwoo Park, 2016-03-23)

  Biasing a sync wakee task toward the waker CPU's cluster makes sense when
  the waker's demand is high enough that the wakee can also take advantage of
  the high CPU frequency voted for because of the waker's load. Placing the
  sync wakee on a low-demand waker's CPU can lead to placement imbalance,
  which can cause unnecessary migrations. Introduce a new tunable
  "sched_big_waker_task_load" that defines a big waker, so that the scheduler
  avoids biasing the wakee toward the waker's cluster when the waker's load
  is below the tunable.

  CRs-fixed: 971295
  Change-Id: I1550ede0a71ac8c9be74a7daabe164c6a269a3fb
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
  [joonwoop@codeaurora.org: fixed a minor conflict in
  include/linux/sched/sysctl.h.]
* sched: add preference for waker cluster CPU in wakee task placement (Joonwoo Park, 2016-03-23)

  If a sync wakee task's demand is small, it's worth placing the wakee task
  on the waker's cluster for better performance, in the sense that waker and
  wakee are correlated, so the wakee should take advantage of the waker
  cluster's frequency, which is voted for by the waker, along with the cache
  locality benefit. While biasing toward the waker's cluster, we want to
  avoid the waker's CPU itself as much as possible, since placing the wakee
  on the waker's CPU can cause the waker to be preempted and migrated by the
  load balancer. Introduce a new tunable "sched_small_wakee_task_load" that
  identifies eligible small wakee tasks, and place those tasks on the waker's
  cluster.

  CRs-fixed: 971295
  Change-Id: I96897d9a72a6f63dca4986d9219c2058cd5a7916
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
  [joonwoop@codeaurora.org: fixed a minor conflict in
  include/linux/sched/sysctl.h.]
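Together with the sched_big_waker_task_load tunable from the previous entry, the cluster bias reduces to a pair of threshold checks. A sketch under assumed semantics (the exact comparison directions and call site are not shown in the log):

```c
#include <assert.h>
#include <stdint.h>

/* Bias a sync wakee toward the waker's cluster only when the wakee is
 * small (it rides along cheaply) and the waker is big (its frequency
 * vote is worth inheriting). Threshold names follow the tunables in the
 * log; the comparison directions are assumptions. */
static int bias_to_waker_cluster(uint64_t wakee_load, uint64_t waker_load,
                                 uint64_t small_wakee_task_load,
                                 uint64_t big_waker_task_load)
{
    return wakee_load < small_wakee_task_load &&
           waker_load >= big_waker_task_load;
}
```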
* sched/core: Add protection against null-pointer dereference (Olav Haugan, 2016-03-23)

  p->grp is accessed outside of the lock, which can cause a null-pointer
  dereference. Fix this, and also add an RCU critical section around access
  to this data structure.

  CRs-fixed: 985379
  Change-Id: Ic82de6ae2821845d704f0ec18046cc6a24f98e39
  Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
  [joonwoop@codeaurora.org: fixed conflict in init_new_task_load().]
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: allow select_prev_cpu_us to be set to values greater than 100us (Joonwoo Park, 2016-03-23)

  At present the sched_select_prev_cpu_us tunable is restricted to values
  below 100us. Fix this unintended restriction.

  CRs-Fixed: 972237
  Change-Id: I5eaf9f40468805c396328ca1022baef32acf8de0
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: clean up idle task's mark_start restoring in init_idle() (Pavankumar Kondeti, 2016-03-23)

  The idle task's mark_start can get updated even without the CPU being
  online, so the mark_start is restored when the CPU comes online. The idle
  task's mark_start is reset in init_idle()->__sched_fork()->
  init_new_task_load(), and the original mark_start is saved and restored
  later. This can be avoided by moving init_new_task_load() to
  wake_up_new_task(), which never gets called for an idle task. We only care
  about the idle task's ravg.mark_start, and not initializing the other
  fields of the ravg struct has no side effects.

  This cleanup allows the subsequent patches to drop the rq->lock while
  calling __sched_fork() in init_idle().

  CRs-Fixed: 965873
  Change-Id: I41de6d69944d7d44b9c4d11b2d97ad01bd8fe96d
  Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
  [joonwoop@codeaurora.org: fixed a minor conflict in core.c. omitted changes
  for CONFIG_SCHED_QHMP.]
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: let sched_boost take precedence over sched_restrict_cluster_spill (Pavankumar Kondeti, 2016-03-23)

  When the sched_restrict_cluster_spill knob is enabled, RT tasks are
  restricted to the lower power cluster; this knob also restricts
  inter-cluster no-hz kicks. Ignore this knob setting when sched_boost is
  enabled, so that tasks are placed on the CPUs with the highest spare
  capacity.

  CRs-Fixed: 968852
  Change-Id: I01b3fc10b39dc834a733d64c2ee29c308d7ff730
  Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* sched: Add separate load tracking histogram to predict loads (Pavankumar Kondeti, 2016-03-23)

  The current window-based load tracking only saves history for five windows.
  A historically heavy task's heavy load will be completely forgotten after
  five windows of light load. Even before the five windows expire, a heavy
  task that wakes up on the same CPU it used to run on won't trigger any
  frequency change until the end of the window; it would starve for the
  entire window. It also adds one "small" load window to the history, because
  it is accumulating load at a low frequency, further reducing the tracked
  load for this heavy task. Ideally, the scheduler should be able to identify
  such tasks and notify the governor to increase frequency immediately after
  they wake up.

  Add a histogram for each task to track a much longer load history. A
  prediction is made based on the runtime of the previous or current window,
  the histogram data, and the load tracked in recent windows. The predictions
  of all tasks currently running or runnable on a CPU are aggregated and
  reported to the CPUFreq governor in sched_get_cpus_busy().
  sched_get_cpus_busy() now returns the predicted busy time in addition to
  the previous window's busy time and new task busy time, scaled to the CPU's
  maximum possible frequency.

  Tunables:
  - /proc/sys/kernel/sched_gov_alert_freq (KHz): This tunable can be used to
    further filter the notifications. A frequency alert notification is sent
    only when the predicted load exceeds the previous window's load by
    sched_gov_alert_freq converted to load.

  Change-Id: If29098cd2c5499163ceaff18668639db76ee8504
  Suggested-by: Saravana Kannan <skannan@codeaurora.org>
  Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
  Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
  [joonwoop@codeaurora.org: fixed merge conflicts around __migrate_task() and
  removed changes for CONFIG_SCHED_QHMP.]
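A toy sketch of histogram-based prediction. The bucket granularity, the names, and the prediction rule here are all assumptions for illustration, not the actual implementation: runtime is bucketed as a percentage of the window, and the prediction picks the highest historically-hit bucket still consistent with the current window's partial runtime.

```c
#include <assert.h>

#define NR_BUCKETS 10

struct task_hist {
    unsigned int bucket[NR_BUCKETS]; /* hit counts per 10%-of-window band */
};

static int bucket_of(unsigned int pct)
{
    return pct >= 100 ? NR_BUCKETS - 1 : (int)(pct / 10);
}

/* Record one completed window's runtime (as % of the window). */
static void account_runtime(struct task_hist *h, unsigned int pct)
{
    h->bucket[bucket_of(pct)]++;
}

/* Predict this window's demand: the upper edge of the highest bucket the
 * task has ever reached that is >= its current partial runtime; if it
 * never ran that long before, fall back to the current runtime itself. */
static unsigned int predict_pct(const struct task_hist *h, unsigned int cur_pct)
{
    for (int i = NR_BUCKETS - 1; i >= bucket_of(cur_pct); i--)
        if (h->bucket[i])
            return (unsigned int)(i + 1) * 10;
    return cur_pct;
}
```

With this shape, a task that historically ran 85% of the window is predicted at 90% as soon as it has run 20% of the current window, instead of waiting for the window to end.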
* sched: Provide a wake up API without sending freq notifications (Junjie Wu, 2016-03-23)

  Each time a task wakes up, the scheduler evaluates its load and notifies
  the governor if the resulting frequency of the destination CPU is larger
  than a threshold. However, some governors wake up a separate task that
  handles the frequency change, which again calls wake_up_process(). This is
  dangerous: if the task being woken up meets the threshold and ends up
  being moved around, there is a potential for endless recursive
  notifications. Introduce a new API for waking up a task without triggering
  a frequency notification.

  Change-Id: I24261af81b7dc410c7fb01eaa90920b8d66fbd2a
  Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* sched: Take downmigrate threshold into consideration (Pavankumar Kondeti, 2016-03-23)

  If tasks run on the higher capacity cluster solely because they cannot fit
  in the lower capacity cluster, the downmigrate threshold prevents frequent
  task migrations between the clusters.

  Change-Id: I234a23ffd907c2476c94d5f6227dab1bb6c9bebb
  Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* sched: Provide a facility to restrict RT tasks to lower power cluster (Pavankumar Kondeti, 2016-03-23)

  The current CPU selection algorithm for RT tasks looks for the least loaded
  CPU in all clusters. Stop the search at the lowest possible power cluster,
  based on the "sched_restrict_cluster_spill" sysctl tunable.

  Change-Id: I34fdaefea56e0d1b7e7178d800f1bb86aa0ec01c
  Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
* sched: Take cluster's minimum power into account for optimizing sbc() (Pavankumar Kondeti, 2016-03-23)

  The select_best_cpu() algorithm iterates over all the clusters and selects
  the most power-efficient CPU that satisfies the task's needs. During the
  search, skip the next cluster if its minimum power cost is higher than the
  power cost of an eligible CPU found in the previous cluster. In a b.L
  system, if the BIG cluster's minimum power cost is higher than the maximum
  power cost of the little cluster, this optimization avoids searching the
  BIG cluster whenever an eligible CPU is found in the little cluster.

  Change-Id: I5e3755f107edb6c72180edbec2a658be931c276d
  Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
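The pruning can be sketched as follows (structures heavily simplified; the real select_best_cpu() weighs many more constraints per CPU): a cluster whose minimum possible power cost already exceeds the best cost found so far cannot contain a better CPU, so it is skipped without scanning its CPUs.

```c
#include <assert.h>
#include <limits.h>

struct cluster {
    int min_power_cost;    /* cheapest any CPU in this cluster can be */
    int cpu_power_cost[4]; /* per-CPU power cost for the task at hand */
    int nr_cpus;
};

/* Return the lowest power cost among candidate CPUs, pruning clusters
 * that cannot possibly beat the best cost found so far. */
static int best_power_cost(const struct cluster *clusters, int nr_clusters)
{
    int best = INT_MAX;

    for (int i = 0; i < nr_clusters; i++) {
        if (clusters[i].min_power_cost >= best)
            continue; /* prune: nothing cheaper inside this cluster */
        for (int j = 0; j < clusters[i].nr_cpus; j++)
            if (clusters[i].cpu_power_cost[j] < best)
                best = clusters[i].cpu_power_cost[j];
    }
    return best;
}
```

Because the log's later entries describe clusters as sorted by power cost, an eligible CPU found in the little cluster lets the BIG cluster's scan be skipped entirely.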
* sched: Revise the inter cluster load balance restrictions (Pavankumar Kondeti, 2016-03-23)

  The frequency-based inter-cluster load balance restrictions are not
  reliable, as frequency does not provide a good estimate of the CPU's
  current load. Replace them with spill_load and spill_nr_run based checks.
  The higher capacity cluster is restricted from pulling tasks from the lower
  capacity cluster unless all of the lower capacity CPUs are above spill.
  This behavior can be controlled by a sysctl tunable, and it is disabled by
  default (i.e. no load balance restrictions).

  Change-Id: I45c09c8adcb61a8a7d4e08beadf2f97f1805fb42
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
  Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
  [joonwoop@codeaurora.org: fixed merge conflicts due to omitted changes for
  CONFIG_SCHED_QHMP.]
* sched: colocate related threads (Srivatsa Vaddagiri, 2016-03-23)

  Provide a userspace interface for tasks to be grouped together as "related"
  threads. For example, all threads involved in updating the display buffer
  could be tagged as related. The scheduler will attempt to provide special
  treatment for a group of related threads, such as:

  1) Colocation of related threads in the same "preferred" cluster
  2) Aggregation of demand toward the determination of cluster frequency

  This patch extends the scheduler to provide best-effort colocation support
  for a group of related threads.

  Change-Id: Ic2cd769faf5da4d03a8f3cb0ada6224d0101a5f5
  Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
  [joonwoop@codeaurora.org: fixed minor merge conflicts. removed ifdefry for
  CONFIG_SCHED_QHMP.]
  Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: Update fair and rt placement logic to use scheduler clusters (Srivatsa Vaddagiri, 2016-03-23)

  Make use of clusters in the fair and rt scheduling classes. This is needed
  because the freq domain mask can no longer be used to do correct task
  placement; the freq domain mask was being used to demarcate clusters.

  Change-Id: I57f74147c7006f22d6760256926c10fd0bf50cbd
  Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
  Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
  [joonwoop@codeaurora.org: fixed merge conflicts due to omitted changes for
  CONFIG_SCHED_QHMP.]
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* sched: Introduce the concept of CPU clusters in the scheduler (Srivatsa Vaddagiri, 2016-03-23)

  A cluster is a set of CPUs sharing some power controls and an L2 cache.
  This patch builds a list of clusters at bootup, sorted by their
  max_power_cost. Many cluster-shared attributes like cur_freq, max_freq,
  etc. are currently maintained needlessly in the per-CPU 'struct rq';
  consolidate them in a cluster structure.

  Change-Id: I0567672ad5fb67d211d9336181ceb53b9f6023af
  Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
  Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
  [joonwoop@codeaurora.org: fixed minor conflict in
  arch/arm64/kernel/topology.c. fixed conflict due to omitted changes for
  CONFIG_SCHED_QHMP.]
  Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
* sched: remove init_new_task_load from CONFIG_SMP (Jeevan Shriram, 2016-03-23)

  Move the init_new_task_load() function out of CONFIG_SMP to avoid a linking
  error for ARCH=um.

  Signed-off-by: Jeevan Shriram <jshriram@codeaurora.org>