Pass pointer to core buckets mgr back to sim layer.
Add support for RDTSC_START/RDTSC_STOP macros in the builder.
Each unique shader now has a unique bucket associated with it,
enabling more detailed reporting at the shader level. Currently
due to some llvm issue with thread local storage, 64bit runs require
single threaded mode.
Use head/tail ring buffer indices for thread synchronization.
1. SwrWaitForIdle loops until ring is empty. (head == tail)
2. GetDrawContext waits until ring is not full. (head - tail) == Ring Size
3. Draw enqueues by incrementing head.
4. Last worker thread to move past a DC dequeues by incrementing tail.
Todo: To reduce contention we can cache the tail in the API thread. For
example, if you know you have 64 free entries in the ring then you don't
need to keep checking the tail until you used those 64 entries.
v2: Polaris chips should be defined after Stoney
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> (v1)
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Signed-off-by: Leo Liu <leo.liu@amd.com> (v2 diff)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v2 diff)
v2: fix indentation as noted by Michel
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
radeonsi uses this property to make the best decision about which
shader to compile, but this is not currently used by our codegen.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
The following Coverity warning
5378 tmpl.fetch_args = atomic_fetch_args;
5379 tmpl.emit = atomic_emit;
>>> CID 1357115: Uninitialized variables (UNINIT)
>>> Using uninitialized value "tmpl". Field "tmpl.intr_name" is uninitialized.
5380 bld_base->op_actions[TGSI_OPCODE_ATOMUADD] = tmpl;
5381 bld_base->op_actions[TGSI_OPCODE_ATOMUADD].intr_name = "add";
... is a false positive, but what the hell. This change should "fix" it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Results are undefined but may not crash. Without this change, out-of-bounds
indexing can lead to VM faults and GPU hangs.
Constant buffers, samplers, and possibly others will eventually need similar
treatment to support GL_ARB_robust_buffer_access_behavior.
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
This fixes arb_shader_image_load_store-host-mem-barrier.
v2: flush TC L2 for index buffers on <= CIK (Marek)
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
These don't get used and haven't been in git history from what I can
see, so drop them.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This enables ARB_shader_image_load_store and ARB_shader_image_size.
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
[allow the same number of images for all shader stages and require LLVM 3.9]
Reviewed-by: Marek Olšák <marek.olsak@amd.com>