This will be useful for shader objects and also because creating and
emitting a noop FS is useless, the hardware doesn't execute it.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22848>
VK_EXT_sample_locations is only supported on < GFX10 due to some weird
issues on recent GPUs. extendedDynamicState3SampleLocationsEnable is
only enabled on GFX6-GFX9 for the same reason.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22947>
For ACO which won't do this for us. But we still can't
remove the same code in llvm because non-uniform sampler
is keept as index in nir.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22573>
ACO need this to be done in nir. Remove the llvm round code
because both radv and radeonsi do this in nir for both aco
and llvm.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22573>
aco does not implement fpow, need nir to lower it
first. llvm will do by itself in the same way, so
we always lower fpow in nir now.
Remove the llvm fpow implementation that has special
handling for the muliplication. It's not used any
more and does not match GLSL spec as fpow(0,0)=NaN
but here we get 0.
There's some pixel changes for gl-radeonsi-stoney:
ror-default 2 (no tolerance), 0 (1% tol.)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22573>
ACO only support nir_fsin/cos_amd.
There's some pixel changes for gl-radeonsi-stoney trace.
Different pixels:
furmark 61 (no tolerance), 0 (1% tol.)
gimark 93867 (no tolerance), 888 (1% tol.)
tessmark 39 (no tolerance), 0 (1% tol.)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22573>
aco does not implement these idiv ops.
nir_lower_idiv is for idiv ops <= 32bit and ported from
llvm amdgpu, so llvm do the same.
nir_lower_divmod64 is for 64bit idiv ops.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22573>
Use case: radeonsi will generate internal tgsi shader
with 64bit udiv instruction, and we want all 64bit udiv
to be lowered in nir by lower_int64_options.
For GLSL shaders, this is done in glsl to nir, so we do
the same for tgsi here.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22573>
It's the output of ACO compiler. To share the si_shader_binary
struct with ELF type:
* add a type field to indicate RAW or ELF
* rename elf_buffer/size to code_buffer/size
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22573>
committed has to be a constant so there is no need to have a src and
depend on constant folding to remove the i2b.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22963>
For ALU ops where input types are known, we can store that info on
the input sources. This can be used to produce the correct overloads
of load instructions that don't immediately need to be followed by
bitcasts, or similarly to produce a constant value which can be directly
consumed by the relevant instruction without needing a bitcast.
Similarly for values that will be stored in an output, we know type
information. And using that info, we can use more-correct information
for phis instead of forcing all phi sources to be bitcast to int just
to be bitcast back to float on the other side for an alu or an output
store.
One missing piece is SSBO stores, where we can support int or float.
If the input is coming from a phi, we don't influence the phi's type,
so it'll be int, even though the incoming sources might've been float.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22972>
For each phi src, ensure that it's only used as a phi src.
This lets us give each phi their own unique types without worrying
about them stomping on each other. Also scalarize phis.
For each constant, ensure that it's only used once. The DXIL backend
will already dedupe these consts within the module, but this lets a
single load_const have multiple types depending on how it's used.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22972>
Use ACCESS_* flags in call sites instead of GLC/DLC/SLC.
ACCESS_* flags are extended to describe other aspects of memory instructions
like load/store/atomic/smem.
Then add a function that converts the access flags to GLC, DLC, SLC.
The new functions are also usable by ACO.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22770>