| Commit message (Collapse) | Author | Age |
| |\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
https://android.googlesource.com/kernel/common into lineage-17.1-caf-msm8998
This brings LA.UM.8.4.r1-05200-8x98.0 up to date with
https://android.googlesource.com/kernel/common/ android-4.4-p at commit:
4db1ebdd40ec0 FROMLIST: HID: nintendo: add nintendo switch controller driver
Conflicts:
arch/arm64/boot/Makefile
arch/arm64/kernel/psci.c
arch/x86/configs/x86_64_cuttlefish_defconfig
drivers/md/dm.c
drivers/of/Kconfig
drivers/thermal/thermal_core.c
fs/proc/meminfo.c
kernel/locking/spinlock_debug.c
kernel/time/hrtimer.c
net/wireless/util.c
Change-Id: I5b5163497b7c6ab8487ffbb2d036e4cda01ed670
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
[ Upstream commit f47bee2ba447bebc304111c16ef1e1a73a9744dd ]
These regs are write-only, and the hw throws a hissy-fit (ie. reboots)
when we try to read them for GPU state snapshot, in response to a GPU
hang. It is rather impolite when GPU recovery triggers an insta-
reboot, so lets remove the TPL1 registers from the snapshot.
Fixes: 7198e6b03155 drm/msm: add a3xx gpu support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
[ Upstream commit 88b333b0ed790f9433ff542b163bf972953b74d3 ]
Currently the value written to CP_RB_WPTR is calculated on the fly as
(rb->next - rb->start). But as the code is designed rb->next is wrapped
before writing the commands so if a series of commands happened to
fit perfectly in the ringbuffer, rb->next would end up being equal to
rb->size / 4 and thus result in an out of bounds address to CP_RB_WPTR.
The easiest way to fix this is to mask WPTR when writing it to the
hardware; it makes the hardware happy and the rest of the ringbuffer
math appears to work and there isn't any point in upsetting anything.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[squash in is_power_of_2() check]
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
| |\ \ |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
CCU load_bit is supposed to be configured for RB_PERFCTR_CCU register, but
it is configured for RB_POWERCTR_CCU register. Updated the RB_PERFCTR_CCU
register configuration with CCU load_bit.
Change-Id: I3b4ce056923b5bd39bc274a0744008f5bc5db0f1
Signed-off-by: Venkateswara Rao Tadikonda <vtadik@codeaurora.org>
|
| |\| | |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Restore of TP perfcounters before turning ON the GPMU causes the GPU fault
and recovery. Restore the perfcounters after turning ON the GPMU.
Change-Id: I3c00ed0a487d452e29f360300f92227784b81bbf
Signed-off-by: Venkateswara Rao Tadikonda <vtadik@codeaurora.org>
|
| |\ \ \ |
|
| | |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The parsing logic wrongly assumes the position of the initial/active power
level value in the gpu dts file. This leads to the active power level
always defaulting to a value of 1. Look for the initial power level
one level up in the device tree.
Change-Id: I63f8c8efd05ad3693c6f399f58bed44ac84105d2
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
|
| |/ /
| |
| |
| |
| |
| |
| |
| |
| | |
Userspace needs to know the GPU timeout value to support Khronos robust
GPU timeout extension. The timeout value is returned to the user in
millisecond resolution.
Change-Id: Iba2ff43fce6d21da240356b392afa7a6e7a618ad
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| | |
Add functions to save and restore the value of some of the
performance counters during a power cycle.
Change-Id: Ic0dedbad4037d6a2262792b752dc5d33a2d0eb36
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A faulting submit is retired on the recovery worker and not
the retire worker. The usage count for the device must be
decremented for the faulting submit or the device will never
go into suspend following a fault despite being inactive for
the inactivity period.
Change-Id: Ieda698eb00008f5bcc7287f76b9261704e51e28b
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Ringbuffer pointers were getting reset only when resuming after
recovery. However, we need to reset them even after resuming
from SLUMBER or we will end up sending stale commands to the GPU
with bad results. Make ringbuffer reset part of the GPU init
sequence.
Change-Id: I93fc2f2e293245e584184315f8eb8a4ec73d2455
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
msm_gpu's get_timestamp() op (called by the MSM_GET_PARAM ioctl) can
result in register accesses. We need our power domain and clocks to
be active for that. Make sure they are enabled here.
Change-Id: I1b8e59e0246ed7d9b8a0b6ae660ebfbb15b08782
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We need to use pm-runtime properly when IOMMU is using device_link() to
control it's own clocks.
Change-Id: I7c5668e6a0fcfc2d4664355e49c49d4dcb26323e
Signed-off-by: Rob Clark <robdclark@gmail.com>
Git-commit: eeb754746b140c5f55e6b25706a9142aa549b348
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[ssusheel@codeaurora.org: fix some merge conflicts]
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
|
| |\ \ |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
After enabling the GPU clocks, the GPU can pagefault
when trying to access memory(example the ringbuffer).
This patch addresses the pagefault issue by enabling
the memory retention flags on the GPU core clock.
Change-Id: Ibabecba77501d6a3b188b19c90c172de7d667c8c
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
|
| |/ /
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Turn off the GPU power and free all resources allocated during
GPU init in case hardware init fails in adreno_gpu_load. This is
required to make sure further tries to load the GPU again doesn't
fail because of invalid GPU state.
Change-Id: I1d0d68f62be751d76274975e098364131712ca38
Signed-off-by: Deepak Kumar <dkumar@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There is a race condition issue between the IRQ context trying to
trigger preemption and the user context trying to submit commands to
the GPU. The check in a5xx_flush() API only updates the wptr if the GPU is
not in preemption. In the cases where we move from PREEMPT_START to
PREEMPT_NONE there is a small window where the preempt state is still
in START but the CPU context switches to the user thread which is in
the a5xx_flush() call to update the wptr, but fails to update the wptr to
the GPU since the preempt state is not PREEMPT_NONE. This leads to a
GPU stall.
Introduce a new intermediate state PREEMPT_ABORT and
change preempt_trigger() to use gpu's current ring instead of the
ring retrieved from get_next_ring() while in this state.
Change-Id: I333e9de19824bd373901bbc8afc829de04635017
CRs-Fixed: 2081164
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
On A5XX GPU hardware clock gating needs to be turned off before
reading certain GPU registers via AHB. Turn off HWCG before calling
adreno_show() to safely dump all the registers without a system hang.
Change-Id: Ic0dedbad550ab5d414cea7837672e586a7acd370
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| | |
Remember if the A5XX hardware clock gating is currently
enabled or disabled to avoid inadvertently enabling it.
Change-Id: Ic0dedbada3734a257ac966c041d06695f3521ad4
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Enabling and disabling the power at various points in the ->show()
call flow may have detrimental effects. For all targets make sure
power is on before reading any register and leave it on until we are
all done.
Change-Id: Ic0dedbad4d37a11634174105fc3ee6fe3713a143
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The generic msm_gpu_pm_resume/msm_gpu_pm_suspend functions have
built-in reference counting but the a5xx specific functions
are doing unconditional a5xx specific setup / teardown that
would behave very badly if they were not accompanied by an
actual power up / power down.
Change-Id: Ic0dedbad549c4ea9a5c68b0ca43eb98e0449d54b
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
In the confusion of adding the perfcounter API the timestamp query
was broken. Convert the query over to the perfcounter API to avoid
confusion.
Change-Id: Ic0dedbad590489a643e8aa6d678bf19f732c06dd
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| | |
adreno_last_fence is no longer very useful since we have a handy
per-ring pointer directly to the values we need.
Change-Id: Ic0dedbadfb195551afcd016651776965da32fb2d
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When we first did preemption the priority was set at submission
time. In order to be properly backwards compatible we made ring id 0
the lowest priority ring so that when a legacy app made a submission
it didn't get itself onto the highest priority ring by accident.
Now that we set the priority with submitqueues this is no longer
a concern and ordering priorities this way goes against long
standing convention in similar GPU drivers.
Declare a flag day and invert the priority algorithm so that
priority '0' is the highest priority and it descends from there.
The lowest prority ring is 'number of rings - 1' where the number
of active rings can be acquired through a parameter query of
MSM_PARAM_NR_RINGS.
This change also ensures that the legacy submitqueue id '0' will
use the next-to-lowest ring buffer by default for legacy
submissions.
Change-Id: Ic0dedbadeea522e4f07babc4395cbf5fb7143fe3
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
In order to manage ringbuffer priority to its fullest userspace
should know how many ringbuffers it has to work with. Add a
parameter to return the number of active rings.
Change-Id: Ic0dedbada6010dd5122e8409141fd23b414d73e4
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
Remove the queued time from the profile struct and turn the submit time
into a proper timespec (tv_sec + tv_nsec). This should sync up better
with what userspace is used to seeing.
Change-Id: Ic0dedbad0621fa248e6cffde2d1ee3f9b609e19d
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Record the GPU always on timer value at the start and end of a
submission on the ringbuffer. Since the timer runs at a constant
19.2 Mhz this is a handy way of tracking how long each
submission takes.
The timer values are recorded in the memptrs. Each ringbuffer is
given a circular list of 128 entries to store the event ticks;
this should be enough to avoid running out of room even when the
ring is completely full of submissions.
Add trace events for the user to track when submissions are
queued, submitted to the ringbuffer and retired. The submitted
trace point shows the GPU ticks and the current kernel time at
submit time (as read by the CPU) and the retired trace event shows
the GPU ticks at submission start/end as read by the GPU. Taken
together these two events can provide a pretty close match between
the current GPU time and the kernel time which is handy for tracing
tools that try to match up the various kernel events with one
another.
Change-Id: Ic0dedbadbcf89f032890820785b9fb49a6362b01
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Since most of the heavy lifting for managing submits lives in the
msm_gpu domain it makes sense to move the memptrs so that they are
globally visible and we can use them without relying on function
pointers.
Additionally, instead of having a single struct full of per-ring
arrays, reorganize the structure and assign a sub-allocation
to each ring. This simplifies all of the various macros and other
bits and allows us to make the size of the allocation dependent
on the acutal number of rings for the implementation.
Change-Id: Ic0dedbadc18ba1dc786c82b082c5030e13ff8012
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently the normal and secure MMUs are allocated when the
address space is created in msm_gpu_init() but not attached
until the end of adreno_gpu_init(). Since we can't map buffer
objects in the IOMMU without attaching it first this restricts
when we can allocate buffer objects in the sequence.
For arm-smmu based targets there isn't any reason why we can't
immediately attach the MMU after creating the address space -
this makes the whole system immediately available to map memory
and will facilitate moving around global allocations.
Change-Id: Ic0dedbad161396e9d095f3f3d1e4fca2d240a084
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| | |
Nearly all of the buffer allocations for kernel allocate an buffer object,
virtual address and GPU iova at the same time. Make a helper function to
handle the details.
Change-Id: Ic0dedbad0ecd85d360895cc0d1e418277ba44c62
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Buffer object specific resources like pages, domains, sg list
need not be protected with struct_mutex. They can be protected
with a buffer object level lock. This simplifies locking and
makes it easier to avoid potential recursive locking scenarios
for SVM involving mmap_sem and struct_mutex. This also removes
unnecessary serialization when creating buffer objects, and also
between buffer object creation and GPU command submission.
Change-Id: I40cb437d0186c3d9aac365c9baba0aa4792f0aa1
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
|
| | |
| |
| |
| |
| |
| |
| |
| | |
Define a 36-bit address space for TTBR1 which is used for
kernel side GPU buffer objects.
Change-Id: I1c4eaee0fd92236793621c7d3dba1700e56fefd2
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
|
| |\ \ |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The initial version of the patch save the command submit_time and
queue_time in seconds, but its desired by the users of this profiling
API to return the time in nanoseconds resolution.
Change-Id: I3a56e3ffd3ebe86f51a00a12b7c3e7c4b4c9a956
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
|
| |\ \ \ |
|
| | |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
For some optimizations coming on the userspace side, splitting larger
draw or gmem cmds into multiple cmdstream buffers, we need to support
much more than the previous small/arbitrary limit.
Change-Id: Ic0dedbad2f79156f4e6c9f70c8e27cd5fff9acdb
Signed-off-by: Rob Clark <robdclark@gmail.com>
Git-commit: 6b597ce2f7c7a0f8116d753902db9aba6bc05cb0
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[jcrouse@codeaurora.org: fix some merge conflicts]
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| |\ \ \
| |/ /
|/| | |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
At this point, there is nothing left to fail. And submit already has a
fence assigned and is added to the submit_list. Any problems from here
on out are asynchronous (ie. hangcheck/recovery).
Change-Id: Ib6b6bf00099137972649c97cc6cd8c4fe25ce7c3
Signed-off-by: Rob Clark <robdclark@gmail.com>
Git-commit: 1193c3bcb581807d58dd7df90528ec744af387a9
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[smasetty@codeaurora.org: fixed merge conflict issues; made corresponding
changes to A5XX submit function.]
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
|
| |/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A5XX targets support GPU rendering on secured surfaces by going
into a special secure mode to execute the commands. In secure mode
GPU rendering can only write to secure buffers that have been mapped
in an appropriately secured pagetable. In secure mode the GPU can read
both secure and unsecure buffers and the CP engine can only access
unsecured buffers (so commands do not need to be secure).
Secure buffers virtual addresses must fall into a specific range; this
is the clue to the GPU that it should use the secure pagetable
instead of the regular one. For A5XX targets that range will start
at 0xC0000000 and be 256MB in size. All secure buffers in all processes
share the same pagetable.
Add a secure address space for A5XX targets and automatically trigger
into secure mode if any buffer in the submission is marked as secure.
Change-Id: Ic0dedbad8f7168711d10928cd1894b98f908425f
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| |\ \ |
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch helps dump the full 64k per ring preemption
record to GPU snapshot which is collected during GPU
recovery step. We use the general object snapshot section
type to store these records and we only collect the preemption
records if preemption was going to kick in, which is when
the number of rings is greater than one.
Change-Id: I1872bc14c6b39c8c4963ce9c98e96b03cbfec907
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
|
| |\ \ \
| | | |
| | | |
| | | | |
msm-4.4"
|
| | |\ \ \
| | |/ /
| |/| |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Conflicts:
arch/arm/boot/dts/qcom/msm8996-auto-cdp.dtsi
drivers/gpu/drm/msm/Makefile
Change-Id: Ief80c28ff1422fd71a0c3d2041531e2ab078ee7a
Signed-off-by: Zhiqiang Tu <ztu@codeaurora.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This register is not accessible by CPU for certain TZ, trying to read it
will cause CPU hang.
Change-Id: Ica1b18db2c3cc2c9bacfdbd4c5eb1e2e172ade33
Signed-off-by: Kasin Li <donglil@codeaurora.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
GMEM IOVA range is intended to start from 0x100000. But currently
it is initialized with RANGE_MIN_LO:RANGE_MIN_LO. It makes GMEM
IOVA start from 0.
Change-Id: Ib3c9a86d0cd85794881d8708386b18d58bd8e58e
Signed-off-by: Kasin Li <donglil@codeaurora.org>
|
| |\ \ \ \
| |/ / /
|/| | | |
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
With the upcoming secure code the decision tree for configuration
(deciding where virtual addresses start/stop, etc) is going to get
a bit more complex. Head issues off at the pass by moving the
configuration into the GPU specific code. This does result in a
bit more code duplication but it is a lot cleaner.
Change-Id: Ic0dedbad57c11a4bba01825214d0a7853ab537ba
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
None of the existing iommu implementations use the names passed in
at attach time by the API. Save a bit of .data room by removing
the static string definitions and passing NULL to the attach function.
Change-Id: Ic0dedbada9561768b8d9716ea101619e6b549ea4
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|