GS is tested, tessellation is untested.
Have outputs_written_before_ps for HW VS and outputs_written for other
stages. The reason is that COLOR and BCOLOR alias for HW VS, which
drives elimination of VS outputs based on PS inputs.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
It will contain more variables.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
so that it can be removed and replaced with inline VBO descriptors,
and the pointer can be packed in unused bits of VBO descriptors.
This also removes the pointer from merged TES-GS where it's useless.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address
aligned to 512KB. Hey, it's a 13-bit pointer!
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
If alpha-to-coverage is enabled, we have to compute alpha
even if color writes are disabled.
Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Most files in gallium/radeon now include si_pipe.h.
chip_class and family are now here:
sscreen->info.family
sscreen->info.chip_class
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
With Gallium threaded contexts, creating shader/compute states is
effectively a screen operation, so we should not use context state.
In particular, this allows us to avoid using the context's LLVM
TargetMachine.
This isn't an issue yet because u_threaded_context filters out non-async
debug callbacks, and we disable threaded contexts for debug contexts.
However, we may want to change that in the future.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We only need the lock to guard changes in the variant linked list. The
actual compilation can happen outside the lock, since we use the ready
fence as a guard.
v2: fix double-unlock
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
There's a race condition between si_shader_select_with_key and
si_bind_XX_shader:
Thread 1 Thread 2
-------- --------
si_shader_select_with_key
begin compiling the first
variant
(guarded by sel->mutex)
si_bind_XX_shader
select first_variant by default
as state->current
si_shader_select_with_key
match state->current and early-out
Since thread 2 never takes sel->mutex, it may go on rendering without a
PM4 for that shader, for example.
The solution taken by this patch is to broaden the scope of
shader->optimized_ready to a fence shader->ready that applies to
all shaders. This does not hurt the fast path (if anything it makes
it faster, because we don't explicitly check is_optimized).
It will also allow reducing the scope of sel->mutex locks, but this is
deferred to a later commit for better bisectability.
Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
It's inaccurate. Instead, see the copyright and use "git log" and
"git blame" to know the authorship.
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>