| Commit message | Author | Age |
|
* Requires new GPU firmware
This reverts commit adec4f93e1705640e7b03d33394224ff5d835280.
Change-Id: I747c00bff92f6e793f207839a7ad0a61b2656f96
|
Mark the scratch buffer as privileged so that it can only be accessed by
the GPU through the ringbuffer. To accomplish this, we need to:
1. Disable the shadow rptr feature.
2. Trigger RPTR update from GPU using a WHERE_AM_I packet.
3. Add support for the new ucode.
Change-Id: I9b388f55f53b69028b9bbb2306cb43fd1297c52f
Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Pranav Patel <pranavp@codeaurora.org>
|
Select a random global GPU address for the "scratch" buffer that is used
by the ringbuffer for various tasks.
Change-Id: Ic0dedbaddda71dbf9cb2adab3c6c33a24d6a604c
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Harshitha Sai Neelati <hsaine@codeaurora.org>
|
Execute user profiling in an indirect buffer. This ensures that addresses
and values specified directly from the user don't end up in the
ringbuffer.
Change-Id: Ic0dedbadedcaab29ce5738a39c1ff6269261bae4
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Harshitha Sai Neelati <hsaine@codeaurora.org>
| |\ |
A3xx devices get the ringbuffer read pointer directly
from the GPU registers, so don’t allocate scratch memory
that A3xx GPU devices cannot use.
Change-Id: I95016dfc169b9fee74e978f5560592740f34515e
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
| |/
The stall-on-page-fault feature is supported on A5XX and later GPUs.
Enabling this feature on unsupported GPUs causes GPU faults, so
don't insert GPU stall related commands in the ringbuffer if it is
not supported. Still allow the user to capture a GPU snapshot on a
GPU page fault.
Change-Id: Ied26a5b4f44c1877b289a0ff5c0a6d47901e453d
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
|
If requested, trace the GPU time to ensure
a useful mapping regardless of the chosen
trace clock.
Change-Id: I76a893975de9a278c8178f935991191354f29e2f
Signed-off-by: Jonathan Wicks <jwicks@codeaurora.org>
|
Check for legacy PM4 commands instead of adreno version to calculate
ringbuffer space for PM4 commands that write to memory.
Change-Id: I5d1d4cfbc70bc73ddee9ee752de24aae154a04dc
Signed-off-by: Lynus Vaz <lvaz@codeaurora.org>
|
A context may be detached without submitting any commands
to GPU ringbuffer. This may cause us to wait on a timestamp
that will never be retired. So return immediately from
adreno_drawctxt_wait_rb() if context has not submitted any
commands to GPU ringbuffer.
Change-Id: If8b3f8df92ec9b54a1a83d2f6704d4d15eb1b979
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
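The early return described above can be sketched as follows. This is a minimal illustration, not the driver's code: struct ctx_sketch, its fields, and wait_rb_sketch() are hypothetical stand-ins, and the real adreno_drawctxt_wait_rb() blocks rather than polling a value.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a draw context. */
struct ctx_sketch {
    bool submitted;        /* true once a command reached the ringbuffer */
    unsigned int last_ts;  /* last timestamp queued by this context */
};

/*
 * Hypothetical wait helper. Returns 0 when there is nothing left to wait
 * for and -ETIMEDOUT when the last timestamp has not retired yet.
 * Timestamp wraparound is ignored to keep the sketch short.
 */
static int wait_rb_sketch(const struct ctx_sketch *ctx, unsigned int retired_ts)
{
    /* Nothing was ever submitted, so no timestamp will ever retire. */
    if (!ctx->submitted)
        return 0;

    return (retired_ts >= ctx->last_ts) ? 0 : -ETIMEDOUT;
}
```

Without the first check, a detached context that never submitted work would report a timeout instead of returning at once.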
|
Currently dispatcher accepts kgsl_cmdbatch object. This object
is a superset of all the types of objects dispatcher accepts.
Split kgsl_cmdbatch object to SYNC and IB/MARKER objects and
structure the code to make it easier for new type of objects
to be added to the dispatcher queue.
CRs-Fixed: 1054354
Change-Id: I2d482d1081ce6fdb7925243c88ce00ea6b864efe
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
|
Rename all cmdbatch to drawobj. This forms a platform
for future changes where cmdbatch is split into different
types of drawobjs.
CRs-Fixed: 1054353
Change-Id: Ib84bee679e859db34e0d1f8a0ac70319eabddf53
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
|
Log the nearby allocations for pagefaults on global buffers.
Print the names of the allocations that fall around the
faulting address on a global buffer. Also add a new debugfs
file to list all the global pagetable entries. Useful for
debugging pagefaults and other issues with "global" objects.
CRs-Fixed: 985631
Change-Id: Ifbbdc69044fc64d7ea02509bf8113ed94eeece1e
Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
|
Currently, if read pointer is behind write pointer and there
is not enough space toward the end of the ringbuffer for
new commands, then write pointer is being set to 0.
This is problematic, because it leads to the overwriting of
unexecuted commands with new commands at the start of the
ringbuffer. So, instead of setting the write pointer to 0,
look for space from the start of the ringbuffer up till the
read pointer and if there is room, update the write pointer
accordingly.
CRs-Fixed: 1028465
Change-Id: I1cbdbf139b14988513a22030aa2be4a99a221880
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
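The allocation rule described above can be sketched as follows. This is a simplified illustration under stated assumptions: RB_DWORDS_SKETCH, struct rb_sketch and rb_alloc_sketch() are hypothetical names, pointers are plain dword indices, and the padding a real ring would write over the skipped tail is left out.

```c
#define RB_DWORDS_SKETCH 16u  /* hypothetical ring size in dwords */

/* Hypothetical ring state: indices into the ring, in dwords. */
struct rb_sketch {
    unsigned int wptr;  /* next dword the CPU will write */
    unsigned int rptr;  /* next dword the GPU will read */
};

/*
 * Reserve 'dwords' contiguous dwords. Returns the start index on success
 * or -1 when there is no room. Strict comparisons keep wptr from catching
 * up with rptr, since wptr == rptr is taken to mean "empty".
 */
static int rb_alloc_sketch(struct rb_sketch *rb, unsigned int dwords)
{
    unsigned int start;

    if (rb->rptr <= rb->wptr) {
        /* Free space is the tail [wptr, end) plus the head [0, rptr). */
        if (rb->wptr + dwords < RB_DWORDS_SKETCH) {
            start = rb->wptr;
            rb->wptr += dwords;
            return (int)start;
        }
        /* Tail too small: wrap only if the block ends before rptr. */
        if (dwords < rb->rptr) {
            rb->wptr = dwords;
            return 0;
        }
        return -1;
    }

    /* wptr has already wrapped: free space is [wptr, rptr). */
    if (rb->wptr + dwords < rb->rptr) {
        start = rb->wptr;
        rb->wptr += dwords;
        return (int)start;
    }
    return -1;
}
```

With wptr at 12 and rptr at 4, a 6-dword request fails instead of overwriting the unexecuted commands at the start of the ring.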
|
Force any command triggered context switch to the GPU - it should
be on the GPU anyway, but we were already passing a flags parameter
(unused) so this is a good chance to force the issue and make sure
that the cpu path decision isn't in play here.
CRs-Fixed: 1009124
Change-Id: Ic0dedbadb277a6498d0840b45c90e1265e2f354a
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
We are only writing the ringbuffer start of pipeline timestamp for
internal commands that do not have a draw context associated which
happen rarely (if ever). We should be recording the timestamp for
*ALL* commands so when something goes wrong we can get a fuller
idea of the timestamp picture for each ringbuffer.
CRs-Fixed: 1009134
Change-Id: Ic0dedbad6d99130e31cd8a06dfe025610e9157a8
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
Allow 5XX targets to preempt quickly from an atomic context. In
particular this allows quicker transition from a high priority
ringbuffer to a lower one without having to wait for the worker
to schedule.
CRs-Fixed: 1009124
Change-Id: Ic0dedbad01a31a5da2954b097cb6fa937d45ef5c
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
Remove some unused gpudev hooks and further segment the A4XX and
A5XX specific code into their respective areas. Remove some bits
that are only applicable to 4XX from the 5XX side.
CRs-Fixed: 1009124
Change-Id: Ic0dedbadc324b979583d7a3998195bf15ac537f6
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
It is no longer power efficient to independently enable and disable
the MMU clocks. We can safely enable and disable them with the rest
of the GPU clocks and take back the infrastructure needed to handle
the clocks.
CRs-Fixed: 1009124
Change-Id: Ic0dedbadc48095eada9c5fce6004475a2cb0f0a9
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
The memstore shared between the CPU and GPU is old but cannot be
messed with. Rather than stealing values from it where available,
add a new block of shared memory that is exclusive to the driver
and GPU. This block can be used more freely than the old
memstore block.
Program the GPU to write the RPTR out to an address the CPU can read rather
than having the CPU read a GPU register directly. There are some very
small but very real conditions where different blocks on the GPU have
outdated values for the RPTR. When scheduling preemption the value read
from the register might not reflect the actual value of the RPTR in the CP.
This can cause the save/restore from preemption to give back incorrect RPTR
values causing much confusion between the GPU and CPU.
Remove the ringbuffer's copy of the read pointer shadow.
Now that the GPU will update a shared memory address with the
value of the read pointer, there is no need to poll the register
to get the value and then keep a local copy of it.
CRs-Fixed: 987082
Change-Id: Ic44759d1a5c6e48b2f0f566ea8c153f01cf68279
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
A3XX doesn't have support for command batch profiling. Return
EOPNOTSUPP for a command batch profiling request on A3XX, so that
userspace code knows that this feature is not supported.
CRs-Fixed: 986169
Change-Id: I6dfcab462a933ef31e3bba6bef07f17016ae50b9
Signed-off-by: Hareesh Gundu <hareeshg@codeaurora.org>
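The check described above can be sketched as follows. The flag value, struct gpu_sketch and submit_sketch() are hypothetical names used only for illustration; only the EOPNOTSUPP return for an unsupported profiling request comes from the commit message.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_FLAG_PROFILING 0x1u  /* hypothetical submit flag */

/* Hypothetical per-GPU capability record. */
struct gpu_sketch {
    bool supports_profiling;  /* assumed false for A3XX in this sketch */
};

/* Reject a profiling request up front instead of silently ignoring it. */
static int submit_sketch(const struct gpu_sketch *gpu, unsigned int flags)
{
    if ((flags & SKETCH_FLAG_PROFILING) && !gpu->supports_profiling)
        return -EOPNOTSUPP;

    return 0;
}
```

Returning a distinct error code lets userspace tell "not supported on this GPU" apart from a failed submission.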
|
Current order:
IB1 batch, timestamp writes, SRM=NULL, CP_YIELD_ENABLE,
CP_CONTEXT_SWITCH_YIELD
Correct order:
IB1 batch, SRM=NULL, CP_YIELD_ENABLE, timestamp writes,
CP_CONTEXT_SWITCH_YIELD
Reason:
If preemption is initiated after the last checkpoint but
before SET_RENDER_MODE == NULL is executed, all of the PM4s
starting at the preamble of the checkpoint will be replayed
up to the SRM == NULL, including an attempt to re-timestamp/
re-retire the last batch of IBs.
If what was intended here was to make sure that the IB batch
would be retired once then the SET_RENDER_MODE == NULL and
CP_YIELD_ENABLE should be placed immediately after IB_PFE packets
and before the time stamping PM4 packets in the ring buffer.
CRs-Fixed: 990078
Change-Id: I04a1a44f12dd3a09c50b4fe39e14a2bd636b24de
Signed-off-by: Harshdeep Dhatt <hdhatt@codeaurora.org>
|
Move device specific features to the device rather than trying
to do them in the common initialization code.
Change-Id: I812db29a2eae90ca532755c265aaa2e52db972d7
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
|
Add a new cmdbatch profiling flag that populates the seconds
and nanosecond fields of the cmdbatch structure with the time
since boot instead of the wall time.
CRs-Fixed: 968114
Change-Id: I4e752d5237a74192b3ea9cc125c11bae574c1b36
Signed-off-by: Jonathan Wicks <jwicks@codeaurora.org>
|
CP_QUEUE_THRESHOLDS is only used in A3XX. Move the register setting
out of common ringbuffer initialization and into A3XX specific region.
CRs-Fixed: 971153
Change-Id: I05ef504a802534f1582e62085c5b12b20ac57209
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
|
The current MMU code assumes a binary state - either there is an
IOMMU or there isn't. This precludes other memory models and
makes for a lot of inherent IOMMU knowledge in the generic MMU
code and the rest of the driver. Reorganize and clean up the
MMU and IOMMU code:
* Add a Kconfig boolean dependent on ARM and/or MSM SMMU support.
* Make "nommu" mode an actual MMU subtype and figure out available
MMU subtypes at probe time.
* Move IOMMU device tree parsing to the IOMMU code.
* Move the MMU subtype private structures into struct kgsl_mmu.
* Move adreno_iommu specific functions out of other generic
adreno code.
* Move A4XX specific preemption code out of the ringbuffer code.
CRs-Fixed: 970264
Change-Id: Ic0dedbad1293a1d129b7c4ed1105d684ca84d97f
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
Add a helper macro to convert an adreno_device pointer to a
struct kgsl_device pointer. This is mostly syntactic sugar
but it makes the code a bit cleaner and it abstracts a bit of
the ugliness away.
Change-Id: Ic0dedbadd97bda3316a58514a5a64757bd4154c7
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
The ringbuffer structures are static members of struct adreno_device
which means that they are permanently associated with a specific
adreno device and by extension a struct kgsl_device too. The upshot
is that we can use macro math to derive the adreno device from
a ringbuffer pointer and get rid of the device shortcut in the
ringbuffer struct. This also gives us a chance to clean up
how functions use the ringbuffer and adreno_device structs
to limit unnecessary dereferencing.
Change-Id: Ic0dedbad909ef71e99cd3319713cee38fb1700f0
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
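The "macro math" described above can be sketched as follows. The struct layouts and the RB_TO_DEV_SKETCH() macro are hypothetical and much smaller than the driver's own; the sketch assumes each ringbuffer records its own index in an id field.

```c
#include <stddef.h>

#define NUM_RB_SKETCH 4

/* Hypothetical ringbuffer: no back-pointer to its device, only its index. */
struct ring_sketch {
    unsigned int id;  /* index of this entry in dev_sketch.ringbuffers */
};

/* Hypothetical device that embeds its ringbuffers as a member array. */
struct dev_sketch {
    int chip;
    struct ring_sketch ringbuffers[NUM_RB_SKETCH];
};

/*
 * Step back from a ringbuffer pointer to the enclosing device: subtract
 * the offset of the array within the device, then the offset of this
 * entry within the array.
 */
#define RB_TO_DEV_SKETCH(rb) \
    ((struct dev_sketch *)((char *)(rb) \
        - offsetof(struct dev_sketch, ringbuffers) \
        - (rb)->id * sizeof(struct ring_sketch)))
```

Because the ringbuffers are embedded in the device, the lookup is pure pointer arithmetic and the ringbuffer struct needs no device field.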
|
Enable CP to process yield packets placed in the IB2s.
Change-Id: I2fadfb108a2dc42f574b3f6ed2e667baddb7889c
Signed-off-by: Jonathan Wicks <jwicks@codeaurora.org>
|
CP_CACHE_FLUSH interrupts can storm on very rare occasions.
Check for this interrupt storm and do nothing when it occurs
rather than thrashing the CPU which can occasionally bring the
system down.
Change-Id: I0528ad4fec43abfaeeba1499d0b0e51e14b09f0d
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
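One way such a storm check could look is sketched below. The commit message does not say how the storm is detected, so the window length, the limit, struct storm_sketch and storm_should_handle() are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define STORM_WINDOW_NS 1000000ull  /* hypothetical window: 1 ms */
#define STORM_LIMIT 8u              /* hypothetical interrupts per window */

/* Hypothetical per-interrupt bookkeeping. */
struct storm_sketch {
    uint64_t window_start_ns;  /* start of the current counting window */
    unsigned int count;        /* interrupts seen in the current window */
};

/*
 * Returns true when the interrupt should be processed and false when it
 * looks like part of a storm and should be ignored.
 */
static bool storm_should_handle(struct storm_sketch *s, uint64_t now_ns)
{
    /* Start a fresh window once the previous one has expired. */
    if (now_ns - s->window_start_ns > STORM_WINDOW_NS) {
        s->window_start_ns = now_ns;
        s->count = 0;
    }

    s->count++;
    return s->count <= STORM_LIMIT;
}
```

Ignoring the excess interrupts keeps the handler cheap during a storm, which is the behavior the commit message asks for.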
|
Global pagetable entries are exclusively for IOMMU and per-process
pagetables. Move all the code out of the generic driver and into
the IOMMU driver and clean up a bunch of stuff along the way.
Change-Id: Ic0dedbadbb368bb2a289ba4393f729d7e6066a17
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|
Snapshot of the Qualcomm Adreno GPU driver (KGSL) as of msm-3.18
commit e70ad0cd5efd ("Promotion of kernel.lnx.3.18-151201.").
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
|