summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/scheduler (follow)
Commit message (Collapse)AuthorAge
* drm/amdgpu: fix race condition in amd_sched_entity_push_jobNicolai Hähnle2015-12-02
| | | | | | | | | | | | | | | As soon as we leave the spinlock after the job has been added to the job queue, we can no longer rely on the job's data to be available. I have seen a null-pointer dereference due to sched == NULL in amd_sched_wakeup via amd_sched_entity_push_job and amd_sched_ib_submit_kernel_helper. Since the latter initializes sched_job->sched with the address of the ring scheduler, which is guaranteed to be non-NULL, this race appears to be a likely culprit. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Bugzilla: https://bugs.freedesktop.org/attachment.cgi?bugid=93079 Reviewed-by: Christian König <christian.koenig@amd.com>
* drm/amdgpu: move dependency handling out of atomic section v2Christian König2015-11-23
| | | | | | | | | This way the driver isn't limited in the dependency handling callback. v2: remove extra check in amd_sched_entity_pop_job() Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: optimize scheduler fence handlingChristian König2015-11-23
| | | | | | | | We only need to wait for jobs to be scheduled when the dependency is from the same scheduler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: fix incorrect mutex usage v3Christian König2015-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch the scheduler fence was created when we push the job into the queue, so we could only get the fence after pushing it. The mutex now was necessary to prevent the thread pushing the jobs to the hardware from running faster than the thread pushing the jobs into the queue. Otherwise the thread pushing jobs into the queue would have accessed possible freed up memory when it tries to get a reference to the fence. So what you get in the end is thread A: mutex_lock(&job->lock); ... Kick of thread B. ... mutex_unlock(&job->lock); And thread B: mutex_lock(&job->lock); .... mutex_unlock(&job->lock); kfree(job); I'm actually not sure if I'm still up to date on this, but this usage pattern used to be not allowed with mutexes. See here as well https://lwn.net/Articles/575460/. v2: remove unrelated changes, fix missing owner v3: rebased, add more commit message Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: cleanup scheduler fence get/put danceChristian König2015-11-16
| | | | | | | | The code was correct, but getting two references when the ownership is linearly moved on is a bit awkward and just overhead. Signed: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add command submission workflow tracepointChunming Zhou2015-11-16
| | | | | | | OGL needs these tracepoints to investigate performance issue. Change-Id: I5e58187d061253f7d665dfce8e4e163ba91d3e2b Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
* drm/amd: add kmem cache for sched fenceChunming Zhou2015-11-16
| | | | | | Change-Id: I45bb8ff10ef05dc3b15e31a77fbcf31117705f11 Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
* drm/amdgpu: fix stoping the scheduler timeoutChristian König2015-11-04
| | | | | | | | cancel_delayed_work_sync is forbidden in interrupt context. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/scheduler: don't oops on failure to loadDave Airlie2015-11-03
| | | | | | | | | | | | | | | | | | In two places amdgpu tries to tear down something it hasn't initalised when failing. This is what happens when you enable experimental support on topaz which then fails in ring init. This patch allows it to fail cleanly. agd: Split out from from the original patch since the scheduler is a driver independent. Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
* drm/amdgpu: ignore scheduler fences from the same entityChristian König2015-10-28
| | | | | | | We are going to submit them before the job anyway. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: fix lockup when clean pending fencesJunwei Zhang2015-10-14
| | | | | | | | | | The first lockup fence will lock the fence list of scheduler. Then cancel the delayed workqueues for all clean pending fences without waiting the workqueues to finish. Change-Id: I9bec826de1aa49d587b0662f3fb4a95333979429 Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
* drm/amdgpu: add timer to fence to detect scheduler lockupJunwei Zhang2015-10-14
| | | | | | | Change-Id: I67e987db0efdca28faa80b332b75571192130d33 Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: David Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
* drm/amdgpu: more scheduler cleanups v2Christian König2015-09-23
| | | | | | | | | | | Embed the scheduler into the ring structure instead of allocating it. Use the ring name directly instead of the id. v2: rebased, whitespace cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com>
* drm/amdgpu: rename fence->scheduler to sched v2Christian König2015-09-23
| | | | | | | | | | Just to be consistent with the other members. v2: rename the ring member as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> (v1) Reviewed-by: Chunming Zhou<david1.zhou@amd.com>
* drm/amdgpu: cleanup entity initChristian König2015-09-23
| | | | | | | | Reorder the fields and properly return the kfifo_alloc error code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com>
* drm/amdgpu: refine the job naming for amdgpu_job and amdgpu_sched_jobJunwei Zhang2015-09-23
| | | | | | | | Use consistent naming across functions. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: David Zhou <david1.zhou@amd.com> Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
* drm/amdgpu: remove process_job callback from the schedulerChristian König2015-09-23
| | | | | | | | | Just free the resources immediately after submitting the job. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
* drm/amdgpu: move scheduler fence callback into fence v2Christian König2015-09-23
| | | | | | | | | | | And call the processed callback directly after submitting the job. v2: split adding error handling into separate patch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
* drm/amdgpu: signal scheduler fence when hw submission fails v3Christian König2015-09-23
| | | | | | | | | | | | Otherwise the resource blocked by it will never be reclaimed. v2: add DRM_ERROR. v3: fix typo in commit message Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou<david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
* drm/amdgpu: add tracepoint for scheduler (v2)Chunming Zhou2015-09-23
| | | | | | | | | | track sched job status like the length of job queue and hw job queue. v2: fix build after rebase Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
* drm/amdgpu: fix warning in schedulerAlex Deucher2015-09-04
| | | | | | | | This should never happen so warn when the count does not equal the expected size. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: make wait_event uninterruptible in push_jobChunming Zhou2015-09-02
| | | | | | | | | | | | | | | | | | | | with interruptible, the push_job maybe return -ERESTARTSYS, then result in push_job error. E.g. bug trace: [ 181.618860] *****amdgpu_copy_buffer:fence->seq:0x0000000048d8758b, contxt:1207959552, ref:683967304, r:-512 [ 181.618929] BUG: unable to handle kernel paging request at ffffffff811aa266 [ 181.625887] IP: [<ffffffff81548ffc>] reservation_object_add_excl_fence+0x3c/0x120 ... [ 181.859767] [<ffffffff811aa266>] ? unmap_mapping_range+0x66/0x110 [ 181.865928] [<ffffffffc0608ac1>] ttm_bo_move_accel_cleanup+0x41/0x3c0 [ttm] [ 181.872971] [<ffffffffc062d382>] amdgpu_move_blit.isra.18+0x122/0x150 [amdgpu] [ 181.880254] [<ffffffff811aa266>] ? unmap_mapping_range+0x66/0x110 [ 181.886420] [<ffffffffc062d709>] amdgpu_bo_move+0xa9/0x200 [amdgpu] [ 181.892753] [<ffffffffc0606e8d>] ttm_bo_handle_move_mem+0x26d/0x5c0 [ttm] Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
* drm/amdgpu: add scheduler dependency callback v2Christian König2015-08-28
| | | | | | | | | | This way the scheduler doesn't wait in it's work thread any more. v2: fix race conditions Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
* drm/amdgpu: let the scheduler work more with jobs v2Christian König2015-08-28
| | | | | | | | v2: fix another race condition Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
* drm/amdgpu: fix wait queue handling in the schedulerChristian König2015-08-26
| | | | | | | | Freeing up a queue after signalling it isn't race free. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: remove extra parameters from scheduler callbacksChristian König2015-08-26
| | | | | | | Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: wake up scheduler only when neccessaryChristian König2015-08-26
| | | | | | | Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: remove entity idle timeout v2Christian König2015-08-26
| | | | | | | | | | | | Removing the entity from scheduling can deadlock the whole system. Wait forever till the remaining IBs are scheduled. v2: fix comment as well Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
* drm/amdgpu: add priv data to schedChunming Zhou2015-08-25
| | | | | Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian K?nig <christian.koenig@amd.com>
* drm/amdgpu: add owner for sched fenceChunming Zhou2015-08-25
| | | | | Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian K?nig <christian.koenig@amd.com>
* drm/amdgpu: remove entity reference from sched fenceChristian König2015-08-25
| | | | | | | Entity don't live as long as scheduler fences. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: fix and cleanup amd_sched_entity_push_jobChristian König2015-08-25
| | | | | | | Calling schedule() is probably the worse things we can do. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: remove unused parameters to amd_sched_createChristian König2015-08-25
| | | | | Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: remove sched_lockChristian König2015-08-25
| | | | | | | It isn't protecting anything. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: remove prepare_job callbackChristian König2015-08-25
| | | | | | | Not used any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: cleanup a scheduler function nameChristian König2015-08-25
| | | | | Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: reorder scheduler functionsChristian König2015-08-25
| | | | | | | Keep run queue, entity and scheduler handling together. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: fix scheduler thread creation error checkingChristian König2015-08-25
| | | | | Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: fix entity wakeup race conditionChristian König2015-08-25
| | | | | | | That actually didn't worked at all. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: cleanup entity pickingChristian König2015-08-25
| | | | | | | | Cleanup function name, stop checking scheduler ready twice, but check if kernel thread should stop instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: remove some more unused entity members v2Christian König2015-08-25
| | | | | | | | | None of them are used any more. v2: fix type in error message Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: rework scheduler submission handling.Christian König2015-08-25
| | | | | | | | | | | | | Remove active_hw_rq and it's protecting queue_lock, they are unused. User 32bit atomic for hw_rq_count, 64bits for counting to three is a bit overkill. Cleanup the function name and remove incorrect comments. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
* drm/amdgpu: remove v_seq handling from the scheduler v2Christian König2015-08-25
| | | | | | | | | | | Simply not used any more. Only keep 32bit atomic for fence sequence numbering. v2: trivial rebase Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> (v1) Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
* drm/amdgpu: use a spinlock instead of a mutex for the rqChristian König2015-08-20
| | | | | | | More appropriate and fixes some nasty lockdep warnings. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: abstract amdgpu_job for schedulerChunming Zhou2015-08-20
| | | | | Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian K?nig <christian.koenig@amd.com>
* drm/amdgpu: cleanup sheduler rq handling v2Christian König2015-08-17
| | | | | | | | | Rework run queue implementation, especially remove the odd list handling. v2: cleanup the code only, no algorithem change. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
* drm/amdgpu: fix unnecessary wake upChunming Zhou2015-08-17
| | | | | | | | decrease CPU extra overhead. Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Christian K?nig <christian.koenig@amd.com>
* drm/amdgpu: add reference for **fenceChunming Zhou2015-08-17
| | | | | | | | fix fence is released when pass to **fence sometimes. add reference for it. Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian K?nig <christian.koenig@amd.com>
* drm/amdgpu: remove scheduler fence list v2Christian König2015-08-17
| | | | | | | | | Unused and missing proper locking. v2: add locking comment to commit message. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
* drm/amdgpu: remove amd_sched_wait_emit v2Christian König2015-08-17
| | | | | | | | | Not used any more. v2: remove amd_sched_emit as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>