android_kernel_zuk_msm8996.git

	Commit message (Collapse)	Author	Age
...
* \| \|	mm: process reclaim: vmpressure based process reclaim	Vinayak Menon	2016-06-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With this patch, anon pages of inactive tasks can be reclaimed, depending on memory pressure. Memory pressure is detected using vmpressure events. 'N' best tasks in terms of anon size is selected and pages proportional to their tasksize is reclaimed. The total number of pages reclaimed at each run of the swap work, can be tuned from userspace, the default being SWAP_CLUSTER_MAX * 32. The patch also adds tracepoints to debug and tune the feature. echo 1 > /sys/module/process_reclaim/parameters/enable_process_reclaim to enable the feature. echo <pages> > /sys/module/process_reclaim/parameters/per_swap_size, to set the number of pages reclaimed in each scan. /sys/module/process_reclaim/parameters/reclaim_avg_efficiency, provides the average efficiency (scan to reclaim ratio) of the algorithm. /sys/module/process_reclaim/parameters/swap_eff_win, to set the window period (in unit of number of times reclaim is triggered) to detect low efficiency runs. /sys/module/process_reclaim/parameters/swap_opt_eff, to set the optimal efficiency threshold for low efficiency detection. Change-Id: I895986f10c997d1715761eaaadc4bbbee60db9d2 Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
* \| \|	sched: fix incorrect type casting in trace events	Joonwoo Park	2016-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CPU cycles and execution time are in u64. Change-Id: Ifb3ce3fd2c5c6bf4d658137214b73659a60fd9d7 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: remove unused parameter cpu from cpu_cycles_to_freq()	Joonwoo Park	2016-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function parameter cpu isn't used anymore by cpu_cycles_to_freq(). So remove it. Change-Id: Ide19321206dacb88fedca97e1b689d740f872866 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	drivers: thermal: Add ftrace events for LMH DCVSh mitigation	Ram Chandrasekar	2016-06-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LMH DCVSh driver receives interrupt from hardware whenever there is a new mitigation frequency decision is made in hardware. Add ftrace event to print the hardware mitigation frequency value from driver. Change-Id: Ib357ee3c3a461613bfd1268ec8f98973c2982c10 Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org>
* \| \|	Merge branch 'dev/msm-4.4-mmc_port' into msm-4.4	Subhash Jadavani	2016-06-02
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This merge brings MMC/SD card changes from 'dev/msm-4.4-mmc_port' to msm-4.4. * origin/dev/msm-4.4-mmc_port: (472 commits) mmc: sdhci-msm: fix few compilation issues mmc: cmdq_hci: fix compilation issue mmc: host: fix compilation issue when clk_gating config is disabled mmc: sdhci: clean up legacy adma related variables mmc: sdhci: enable 64-bit DMA support only if chipset supports 64-bit mmc: sdhci: Replace SDHCI_USE_ADMA_64BIT flag mmc: auto bkops fixes mmc: card: fix quirk bit map Revert "mmc: core: get drive types supported by eMMC cards" mmc: host: sdhci: don't queue zero length descriptor mmc: core: fix deadlock between runtime-suspend and devfreq mmc: block: Add quirk and increase read data timeout for hynix emmc mmc: card: Fix broken clock gating mmc: core: postpone runtime suspend in case BKOPS is required mmc: core: update AUTO_EN in BKOPS_EN field on runtime resume mmc: revert runtime idle state mmc: host: Set max frequency when disabling clock scaling mmc: queue: Fix queue_lock spinlock bug from CMDQ shutdown path mmc: core: fix issue with devfreq clock scaling mmc: core: set REL_WR_SEC_C register to 0x1 per eMMC5.0 spec ... Change-Id: I702a72fbbecba520f429bf1149106e684335e2a5 Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
\| * \| \|	mmc: sdhci-msm: Add tracepoints to enhance pm debugging	Konstantin Dorfman	2016-05-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instrument the sdhci-msm platform driver with tracepoints to aid in debugging issues and identifying latencies in the following paths: * System suspend/resume * Runtime suspend/resume Change-Id: I4fed1c2ccba7d5d7f978f161e7985c98e869d1d8 Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org>
\| * \| \|	mmc: core: Add tracepoints to enhance pm debugging	Konstantin Dorfman	2016-05-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instrument the mmc core layer with tracepoints to aid in debugging issues and identifying latencies in the following paths: * System suspend/resume * Runtime suspend/resume Change-Id: I1e0fa7d3f8b54c102b4055f910b58a42412748da Signed-off-by: Konstantin Dorfman <kdorfman@codeaurora.org> [subhashj@codeaurora.org: fixed trivial merge conflicts] Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
\| * \| \|	mmc: core: Log MMC clock frequency transitions	Sujit Reddy Thumma	2016-05-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use kernel's ftrace support to capture MMC clock frequency transitions which can be useful for debugging issues related to power consumption. Usage: mount -t debugfs none /sys/kernel/debug echo 1 > /sys/kernel/debug/tracing/events/mmc/mmc_clk/enable cat /sys/kernel/debug/tracing/trace_pipe Change-Id: I25c4ee39dcbe30e7665902a9f723a5a421b55ca3 Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
\| * \| \|	mmc: sdhci: Add tracepoints to enhance debugging	Venkat Gopalakrishnan	2016-05-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instrument the sdhci driver with tracepoints to aid in debugging issues and identifying latencies in the following path: * CMD completion * DATA completion * DMA preparation * Post DMA cleanup Change-Id: Ie8cd0c2fb6c1bd6ab13883123be021081f8b8f78 Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org> [subhashj@codeaurora.org: fixed minor merge conflict] Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
* \| \| \|	sched: Aggregate for frequency	Srivatsa Vaddagiri	2016-05-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Related threads in a group could execute on different CPUs and hence present a split-demand picture to cpufreq governor. IOW the governor fails to see the net cpu demand of all related threads in a given window if the threads's execution were to be split across CPUs. That could result in sub-optimal frequency chosen in comparison to the ideal frequency at which the aggregate work (taken up by related threads) needs to be run. This patch aggregates cpu execution stats in a window for all related threads in a group. This helps present cpu busy time to governor as if all related threads were part of the same thread and thus help select the right frequency required by related threads. This aggregation is done per-cluster. Change-Id: I71e6047620066323721c6d542034ddd4b2950e7f Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: Fixed notify_migration() to hold rcu read lock as this version of Linux doesn't hold p->pi_lock when the function gets called while keeping use of rcu_access_pointer() since we never dereference return value.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \| \|	coresight: enable stm logging for trace events, marker and printk	Shashank Mittal	2016-05-24
\|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Dup ftrace event traffic and writes to trace_marker file from userspace to STM. Also dup trace printk traffic to STM. This allows Linux tracing and log data to be correlated with other data transported over STM. Change-Id: I4fcb42f2e97ab963fdc85853f4f3ea1f208bfc3c Signed-off-by: Pratik Patel <pratikp@codeaurora.org> [spjoshi@codeaurora.org: 3.18 code fixup] Signed-off-by: Sarangdhar Joshi <spjoshi@codeaurora.org> [mittals@codeaurora.org: 4.4 code fixup] Signed-off-by: Shashank Mittal <mittals@codeaurora.org>
* \| \|	sched: fix compile error where !CONFIG_SCHED_HMP	Joonwoo Park	2016-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move trace event sched_get_task_cpu_cycles() under CONFIG_SCHED_HMP=y to fix compile error. Change-Id: Ie00cafeaceedb44100bda97f996ac3efa0e6c91f Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: add support for CPU frequency estimation with cycle counter	Joonwoo Park	2016-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At present scheduler calculates task's demand with the task's execution time weighted over CPU frequency. The CPU frequency is given by governor's CPU frequency transition notification. Such notification may not be available. Provide an API for CPU clock driver to register callback functions so in order for scheduler to access CPU's cycle counter to estimate CPU's frequency without notification. At time point scheduler assumes the cycle counter increases always even when cluster is idle which might not be true. This will be fixed by subsequent change for more accurate I/O wait time accounting. CRs-fixed: 1006303 Change-Id: I93b187efd7bc225db80da0184683694f5ab99738 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	lowmemorykiller: adapt to vmpressure	Vinayak Menon	2016-04-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were issues reported, where page cache thrashing was observed because of LMK not killing tasks when required, resulting in sluggishness and higher app launch latency. LMK does not kill a task for the following reasons. 1. The free and file pages are above the LMK thresholds 2. LMK tries to pick task with an adj level corresponding to current thresholds, but fails to do so because of the absence of tasks in that level. But sometimes it is better to kill a lower adj task, than thrashing. And there are cases where the number of file pages are huge, though we dont thrash, the reclaim process becomes time consuming, since LMK triggers will be delayed because of higher number of file pages. Even in such cases, when reclaim path finds it difficult to reclaim pages, it is better to trigger lmk to free up some memory faster. The basic idea here is to make LMK more aggressive dynamically when such a thrashing scenario is detected. To detect thrashing, this patch uses vmpressure events. The values of vmpressure upon which an action has to be taken, was derived empirically. This patch also adds tracepoints to validate this feature, almk_shrink and almk_vmpressure. Two knobs are available for the user to tune adaptive lmk behaviour. /sys/module/lowmemorykiller/parameters/adaptive_lmk - Write 1 to enable the feature, 0 to disable. By default disabled. /sys/module/lowmemorykiller/parameters/vmpressure_file_min - This parameter controls the behaviour of LMK when vmpressure is in the range of 90-94. Adaptive lmk triggers based on number file pages wrt vmpressure_file_min, when vmpressure is in the range of 90-94. Usually this is a pseudo minfree value, higher than the highest configured value in minfree array. Change-Id: I1a08160c35d3e33bdfd1d2c789c288fc07d0f0d3 Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
* \| \|	cpufreq: interactive: Use load prediction provided by scheduler	Junjie Wu	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With modification in scheduler, governor now gets predicted instantaneous demand waiting to run in addition to demand from previous window for each CPU. Make use of this information since prediction from scheduler could be more accurate than just looking at past few windows. Governor calculates two frequencies during each sampling period: one based on demand in previous sampling period (f_prev), and the other based on prediction provided by scheduler (f_pred). Max of both will be selected as final frequency. Hispeed related logic, including both frequency selection and delay is ignored when prediction is enabled. If only f_pred but not f_prev picked policy->max, max_freq_hysteresis period is not started/extended. This is to reduce power cost of mis-prediction if it happens. One use case prediction could dramatically help is when a heavy task wakes up after sleeping for a long time. With prediction, governor could ramp up to frequency the task needs much faster than before. To enable prediction, echo 1 to enable_prediction file in cpufreq interactive sysfs directory. Change-Id: I27396785886e43ea01c9000c651c8bd142172273 Suggested-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* \| \|	sched: Add separate load tracking histogram to predict loads	Pavankumar Kondeti	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current window based load tracking only saves history for five windows. A historically heavy task's heavy load will be completely forgotten after five windows of light load. Even before the five window expires, a heavy task wakes up on same CPU it used to run won't trigger any frequency change until end of the window. It would starve for the entire window. It also adds one "small" load window to history because it's accumulating load at a low frequency, further reducing the tracked load for this heavy task. Ideally, scheduler should be able to identify such tasks and notify governor to increase frequency immediately after it wakes up. Add a histogram for each task to track a much longer load history. A prediction will be made based on runtime of previous or current window, histogram data and load tracked in recent windows. Prediction of all tasks that is currently running or runnable on a CPU is aggregated and reported to CPUFreq governor in sched_get_cpus_busy(). sched_get_cpus_busy() now returns predicted busy time in addition to previous window busy time and new task busy time, scaled to the CPU maximum possible frequency. Tunables: - /proc/sys/kernel/sched_gov_alert_freq (KHz) This tunable can be used to further filter the notifications. Frequency alert notification is sent only when the predicted load exceeds previous window load by sched_gov_alert_freq converted to load. Change-Id: If29098cd2c5499163ceaff18668639db76ee8504 Suggested-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Signed-off-by: Junjie Wu <junjiew@codeaurora.org> [joonwoop@codeaurora.org: fixed merge conflicts around __migrate_task() and removed changes for CONFIG_SCHED_QHMP.]
* \| \|	sched: colocate related threads	Srivatsa Vaddagiri	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Provide userspace interface for tasks to be grouped together as "related" threads. For example, all threads involved in updating display buffer could be tagged as related. Scheduler will attempt to provide special treatment for group of related threads such as: 1) Colocation of related threads in same "preferred" cluster 2) Aggregation of demand towards determination of cluster frequency This patch extends scheduler to provide best-effort colocation support for a group of related threads. Change-Id: Ic2cd769faf5da4d03a8f3cb0ada6224d0101a5f5 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [joonwoop@codeaurora.org: fixed minor merge conflicts. removed ifdefry for CONFIG_SCHED_QHMP.] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: Introduce the concept CPU clusters in the scheduler	Srivatsa Vaddagiri	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A cluster is set of CPUs sharing some power controls and an L2 cache. This patch buids a list of clusters at bootup which are sorted by their max_power_cost. Many cluster-shared attributes like cur_freq, max_freq etc are needlessly maintained in per-cpu 'struct rq' currently. Consolidate them in a cluster structure. Change-Id: I0567672ad5fb67d211d9336181ceb53b9f6023af Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> [joonwoop@codeaurora.org: fixed minor conflict in arch/arm64/kernel/topology.c. fixed conflict due to ommited changes for CONFIG_SCHED_QHMP.] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
* \| \|	tracing: power: Add trace events for core control	Junjie Wu	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add trace events for core control module. Change-Id: I36da5381709f81ef1ba82025cd9cf8610edef3fc Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* \| \|	msm: vidc: Add Video driver for MSM targets	Arun Menon	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add snapshot for Video driver source for MSM targets. The code is migrated from msm-3.18 kernel at the below commit level - d5809484bb1bf5864dad2f081b0145224762963a. Signed-off-by: Arun Menon <avmenon@codeaurora.org>
* \| \|	soc: qcom: msm_perf: Detect and notify when peak perf Cluster load is seen	Vijay Ganti	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Detect Perf cluster peak loads near FMAX based on the trigger thresholds set. On meeting the peak load criteria, the userspace is notified to take action by applying parameters to enhance performance. CRs-Fixed: 969499 Change-Id: Ie9687bf1aa832434dc61d20056f91a096d7be4f0 Signed-off-by: Vijay Ganti <viganti@codeaurora.org>
* \| \|	soc: qcom: msm_perf: Add support for enter/exit cycle for io detection	Tapas Kumar Kundu	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for enter/exit cycle sysfs nodes for io detection There are some usecases which may benefit from different enter/exit cycle load criteria for IO load. This change adds support for that. Change-Id: Iff135ed11b92becc374ace4578e0efc212d2b731 Signed-off-by: Tapas Kumar Kundu <tkundu@codeaurora.org>
* \| \|	soc: qcom: msm_perf: Add support for multi_cycle entry/exit nodes	Tapas Kumar Kundu	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for multi_enter_cycles/multi_exit_cycles per cluster There are some usecases which may benefit from different enter/exit cycle load criteria for multimode cpu load. This change adds support for that. Change-Id: I3408405307ca03b9bba3f03e216ef59b98f29832 Signed-off-by: Tapas Kumar Kundu <tkundu@codeaurora.org>
* \| \|	soc: qcom: msm_perf: Add timers to exit SINGLE mode	Tapas Kumar Kundu	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Certain governors may stop sending out notifications once CPUs enter idle at min frequency.If governor's notifications stop then single mode will not exit for long time. It can happen only if the exit conditions are set in such a way that the time taken to exit single mode exceeds the time for the governor to ramp down, idle out and hence stop sending notifications leaving the system in single mode indefinitely. This change adds separate enter/exit cycle sysfs nodes along with a per cluster non-deferrable timer for single mode exit. The timer is armed only when the load starts falling below the exit load threshold and is cancelled when either the load starts going up or SINGLE mode is exited due to exceeding exit cycle count. On expiry the timer resets SINGLE mode and the enter/exit cycle counts. Change-Id: I13552b2f4085c435b917833a2993f8c64ff4ed2f Signed-off-by: Tapas Kumar Kundu <tkundu@codeaurora.org>
* \| \|	soc: qcom: msm_perf: Detect & notify userspace about heavy CPU loads	Rohit Gupta	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Detect single and multi threaded heavy workloads based on loads received from interactive governor. - If the max load across all the CPUs is greater than a user-specified threshold for certain number of governor windows then the load is detected as a single-threaded workload. - If the total load across all the CPUs is greater than a user-specified threshold for certain number of governor windows then the load is detected as a multi-threaded workload. If one of these is detected then a notification is sent to the userspace so that an entity can read the nodes exposed to get an idea of the nature of workload running. Change-Id: Iba75d26fb3981886b3a8460d5f8999a632bbb73a Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
* \| \|	soc: qcom: msm_perf: Add detection for heavy IO workloads	Rohit Gupta	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some workloads spend a lot of time in IO activity and need higher performance from system resources (for eg. CPU/DDR frequencies)to complete with decent performance. Unfortunately cpufreq governors and other system resources crucial for IO are tuned for general usecases and hence might be slower to react to such demanding IO workloads. This patch adds functionality to detect IO workloads and then send hints to userspace of the detected activity so that userspace can take necessary tuning action to prepare the system for such activity. IO activity is tracked every interactive governor timer boundary and if the percentage of iowait time in each cycle exceeds certain threshold continuously for certain number of cycles then heavy IO activity is detected. Change-Id: I73859517cb436e50340ef14739183e61fc62f90f Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
* \| \|	soc: qcom: Add a msm_performance module	Rohit Gupta	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sometimes for power saving reasons we might want to keep fewer CPUs online without adversely affecting performance for certain real world usecases. This module helps to provide that hotplug support to the userspace such that it tries to make a best effort in keeping a certain number of CPUs online as specified by the userspace. It allows any userspace entity to specify the CPUs that it wants to manage with this module and of those, the number of CPUs that should be kept online. Change-Id: I82c6d6e998d3740ad6f8c67b47344ce87f328b8b Signed-off-by: Rohit Gupta <rohgup@codeaurora.org>
* \| \|	trace, scm: trace scm calls	Sanrio Alvares	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create a scm group to enable profiling time spent in a scm call. This will help determine which scm call is spending how much time in a higher execution level. To enable "echo 1 > /sys/kernel/debug/tracing/events/scm/enable". It is disabled by default. If enabled, traces can be found in Ftrace logs. Ftrace Output Example: PROCESS CPU TIME SCM ID, X0, Number of args, args[0-2], X5, return values [0-2] kworker/u8:4-329 [002] 128.201129: scm_call_start: func id=0x42000904 (args: 0x6, 0x2, 0x200000000, 0x65b8000000019, 0x142e0f000) kworker/u8:4-329 [002] 128.201383: scm_call_end: ret: 0, 0, 0x4a07e00000001 kworker/u8:4-329 [002] 128.201464: scm_call_start: func id=0x42000904 (args: 0x6, 0x3, 0x1312d0000000000, 0x17900000000, 0x142e0f000) kworker/u8:4-329 [002] 128.201542: scm_call_end: ret: 0x1bf03dddddd, 0x2f72656b726f776b, 0x343a32 CRs-Fixed: 969770 Change-Id: I4e5aaff796dbc9457c55fa529114dcb57780b7ec Signed-off-by: Sanrio Alvares <salvares@codeaurora.org>
* \| \|	mm: cma: add trace events for CMA alloc perf testing	Liam Mark	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add cma and migrate trace events to enable CMA allocation performance to be measured via ftrace. Change-Id: I1e471e9e21f1a14ce2ed167d8515ccb5f83eb88c Signed-off-by: Liam Mark <lmark@codeaurora.org>
* \| \|	skb: Adding trace event for gso.	Ravinder Konka	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds trace events to help with debug for gso feature by identifying the packets(and their lenghts) that are using the segmentation offload feature. Change-Id: Ibfe1194cc63e74c75047040b0c540713d539992e Acked-by: Ashwanth Goli <ashwanth@qti.qualcomm.com> Signed-off-by: Ravinder Konka <rkonka@codeaurora.org>
* \| \|	PM / devfreq : Introduce a memory-latency governor	Rohit Gupta	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use performance counters to detect the memory latency sensitivity of CPU workloads and vote for higher DDR frequency if required. Change-Id: Ie77a3523bc5713fc0315bd0abc3913f485a96e0e Signed-off-by: Rohit Gupta <rohgup@codeaurora.org> Suggested-by: Saravana Kannan <skannan@codeaurora.org> [junjiew@codeaurora.org: dropped changes in arch/arm64/Kconfig] Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* \| \|	PM / devfreq: governor_cache_hwmon: Add mrps and freq vote traces	Junjie Wu	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace measured mrps and frequency vote debug prints with trace events. Change-Id: I78370b068e3819a57635cbabaf5cdd053ebabce4 Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* \| \|	tracing: power: Add trace events for bw_hwmon	Saravana Kannan	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bw_hwmon_meas trace event is to log measurement details and the bw_hwmon_update trace event is to log the final decision. Change-Id: I839ace50b1f1686227bcbf7d38a75f89092d26b1 Signed-off-by: Saravana Kannan <skannan@codeaurora.org> [junjiew@codeaurora.org: resolved trivial conflicts] Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* \| \|	cpufreq: interactive: Ramp up to policy->max for heavy new task	Junjie Wu	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	New tasks don't have sufficient history to predict its behavior, even with scheduler's help. Ramping up conservatively for a heavy task could hurt performance when it's needed. Therefore, separate out new tasks' load with scheduler's help and ramp up more aggressively if new tasks make up a significant portion of total load. Change-Id: Ia95c956369edb9b7a0768f3bdcb0b2fab367fdf7 Suggested-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* \| \|	cpufreq: interactive: Add cpuload trace events	Junjie Wu	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With per-cluster timer implementation, only max load across CPUs in cluster is traced in timer function. Add cpufreq_interactive_cpuload trace to provide per-cpu load information. Change-Id: Icea9f2574332a4bc472b14193e77d76100a896ed Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* \| \|	cpufreq: interactive: Re-evaluate immediately in load change callback	Junjie Wu	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, there was a limitation in load change callback that it can't attempt to wake up a task. Therefore the best we can do is to schedule timer at current jiffy. The timer function will only be executed at next timer tick. This could take up to 10ms. Now that this limitation is removed, re-evaluate load immediately upon receiving this callback. Change-Id: Iab3de4705b9aae96054655b1541e32fb040f7e60 Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
* \| \|	trace: power: add cpu_frequency_switch_{start, end}	Matt Wagantall	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is sometimes useful to profile how long CPU frequency switches take, since they often involve variable overhead (PLL lock times, voltage increase time, etc.). Add additional traces to to make this possible. Since the overhead involved may differ based on the frequencies being switched between, record both the start and the end frequencies as part of the trace. Change-Id: I2de743fc357dad3590fd4980f65f38f6073d426e Signed-off-by: Matt Wagantall <mattw@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> [abhimany: resolve trivial merge conflicts] Signed-off-by: Abhimanyu Kapur <abhimany@codeaurora.org>
* \| \|	sched: select task's prev_cpu as the best CPU when it was chosen recently	Joonwoo Park	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Select given task's prev_cpu when the task slept for short period to reduce latency of task placement and migrations. A new tunable /proc/sys/kernel/sched_select_prev_cpu_us introduced to determine whether tasks are eligible to go through fast path. CRs-fixed: 947467 Change-Id: Ia507665b91f4e9f0e6ee1448d8df8994ead9739a [joonwoop@codeaurora.org: fixed conflict in include/linux/sched.h, include/linux/sched/sysctl.h, kernel/sched/core.c and kernel/sysctl.c] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: use ktime instead of sched_clock for load tracking	Joonwoo Park	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At present, HMP scheduler uses sched_clock to setup window boundary to be aligned with timer interrupt to ensure timer interrupt fires after window rollover. However this alignment won't last long since the timer interrupt rearms next timer based on time measured by ktime which isn't coupled with sched_clock. Convert sched_clock to ktime to avoid wallclock discrepancy between scheduler and timer so that we can ensure scheduler's window boundary is always aligned with timer. CRs-fixed: 933330 Change-Id: I4108819a4382f725b3ce6075eb46aab0cf670b7e [joonwoop@codeaurora.org: fixed minor conflict in include/linux/tick.h and kernel/sched/core.c. omitted fixes for kernel/sched/qhmp_core.c] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: Optimize scheduler trace events to reduce trace buffer usage	Syed Rameez Mustafa	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scheduler ftrace events currently generate a lot of data when turned on. The excessive log messages often end up overflowing trace buffers for long use cases or crowding out other events. Optimize scheduler events so that the log spew is less and more manageable. To that end change the variable type for some event fields; introduce variants of sched_cpu_load that can be turned on/off for separate code paths and remove unused fields from various events. Change-Id: I2b313542b39ad5e09a01ad1303b5dfe2c4883b8a Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in rt.c due to CONFIG_SCHED_QHMP.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: Notify cpufreq governor early about potential big tasks	Syed Rameez Mustafa	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tasks that are on the runqueue continuously for a certain amount of time have the potential to be big tasks at the end of the window in which they are runnable. In such scenarios ramping the CPU frequency early can boost performance rather than waiting till the end of a window for the governor to query load. Notify the governor early at every tick when a task has been observed to execute beyond some percentage of the tick period. The threshold beyond which a task is eligible for early detection can be changed via the tunable sched_early_detection_duration. The feature itself is enabled only when scheduler boost is in effect. Change-Id: I528b72bbc79a55b4593d1b8ab45450411c6d70f3 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in scheduler_tick() in kernel/sched/core.c. fixed minor conflicts in include/linux/sched.h, include/linux/sched/sysctl.h and kernel/sysctl.c due to CONFIG_SCHED_QHMP.] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: update sched_task_load trace event	Joonwoo Park	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add best_cpu and latency field to sched_task_load trace event. The latency field represents combined latency of update_task_ravg(), update_task_ravg() and select_best_cpu() which is useful to analyze latency overhead of HMP scheduler. Change-Id: Ie6d777c918d0414d361d758490e3cd7d509f5837 Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: account new task load so that governor can apply different policy	Joonwoo Park	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Account amount of load contributed by new tasks within CPU load so that governor can apply different policy when CPU is loaded by new tasks. To be able to distinguish new task load a new tunable sched_new_task_windows also introduced. The tunable defines tasks as new when the tasks are have been active less than configured windows. Change-Id: I2e2e62e4103882f7362154b792ab978b181b9f59 Suggested-by: Saravana Kannan <skannan@codeaurora.org> [joonwoop@codeaurora.org: ommited changes for drivers/cpufreq/cpufreq_interactive.c. cpufreq changes needs to be applied separately later. fixed conflict in include/linux/sched.h and include/linux/sched/sysctl.h. omitted changes for qhmp_core.c] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: Add tunables for static cpu and cluster cost	Olav Haugan	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add per-cpu tunable to set the extra cost to use a CPU that is idle. Add the same for a cluster. Change-Id: I4aa53f3c42c963df7abc7480980f747f0413d389 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> [joonwoop@codeaurora.org: omitted changes for qhmp*.[c,h] stripped out CONFIG_SCHED_QHMP in drivers/base/cpu.c and include/linux/sched.h] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: Update the wakeup placement logic for fair and rt tasks	Syed Rameez Mustafa	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For the fair sched class, update the select_best_cpu() policy to do power based placement. The hope is to minimize the voltage at which the CPU runs. While RT tasks already do power based placement, their placement preference has to now take into account the power cost of all tasks on a given CPU. Also remove the check for sched_boost since sched_boost no longer intends to elevate all tasks to the highest capacity cluster. Change-Id: Ic6a7625c97d567254d93b94cec3174a91727cb87 Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
* \| \|	sched: remove the notion of small tasks and small task packing	Syed Rameez Mustafa	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Task packing will now be determined solely on the basis of the power cost of task placement. All tasks are eligible for packing. Remove the notion of "small" tasks from the scheduler. Change-Id: I72d52d04b2677c6a8d0bc6aa7d50ff0f1a4f5ebb Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
* \| \|	sched: Keep track of average nr_big_tasks	Srivatsa Vaddagiri	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Extend sched_get_nr_running_avg() API to return average nr_big_tasks, in addition to average nr_running and average nr_io_wait tasks. Also add a new trace point to record values returned by sched_get_nr_running_avg() API. Change-Id: Id3591e6d04da8db484b4d1cb9d95dba075f5ab9a Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Resolve trivial merge conflicts] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
* \| \|	sched: Consolidate hmp stats into their own struct	Srivatsa Vaddagiri	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Key hmp stats (nr_big_tasks, nr_small_tasks and cumulative_runnable_average) are currently maintained per-cpu in 'struct rq'. Merge those stats in their own structure (struct hmp_sched_stats) and modify impacted functions to deal with the newly introduced structure. This cleanup is required for a subsequent patch which fixes various issues with use of CFS_BANDWIDTH feature in HMP scheduler. Change-Id: Ieffc10a3b82a102f561331bc385d042c15a33998 Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> [rameezmustafa@codeaurora.org: Port to msm-3.18] Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org> [joonwoop@codeaurora.org: fixed conflict in __update_load_avg().] Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
* \| \|	sched: Add temperature to cpu_load trace point	Olav Haugan	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the current CPU temperature to the sched_cpu_load trace point. This will allow us to track the CPU temperature. CRs-Fixed: 764788 Change-Id: Ib2e3559bbbe3fe07a6b7c8115db606828bc36254 Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
* \| \|	sched: extend sched_task_load tracepoint to indicate prefer_idle	Syed Rameez Mustafa	2016-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prefer idle determines whether the scheduler prefers an idle CPU over a busy CPU or not to wake up a task on. Knowing the correct value of this tunable is essential in understanding placement decisions made in select_best_cpu(). Change-Id: I955d7577061abccb65d01f560e1911d9db70298a Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>