These are funky enough that they make more sense as intrinsics than
texture opcodes.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
nir_build_frag_coord generates the correct sysval loads based on NIR
options. nir_load_frag_coord shouldn't be used directly because drivers
don't have to support it.
v2: RADV can't use it because nir->options isn't set, so use load_pixel_coord.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
Instead of lowering frag_coord 4 times during compilation,
just use this.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
to strengthen and simplify pixel_coord lowering
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
to lower bindless handles to 32 bits before nir_opt_varyings, so that
the high 32 bits of (input) loads of bindless handles are eliminated early.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41170>
panfrost has float16 point size, handling that precision too allows the
compiler to call lower_point_size later in the compilation pipeline
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
NVK is only going to use it for `fmul_rtz(frcp(ipa), ipa)` patterns, so
try not too hard to optimize this.
Totals from 10 (0.00% of 1212873) affected shaders:
CodeSize: 34480 -> 34288 (-0.56%); split: -0.60%, +0.05%
Static cycle count: 6225 -> 6132 (-1.49%); split: -1.57%, +0.08%
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
The IEEE754-2019 standard declaring the preceding sign "optional" when
converting NaN values to strings because the standard tries to not
regulate how sign bits in NaNs are interpreted.
In the real world, when using printf-series function to print a number
with type `float` on RISC-V, the sign of NaNs is wiped during the
conversion from `float` to `double` (defined as part of the default
argument promotions rule for variable arguments in the C spec).
Change the code to stop relying on isa_print() to print the negative
sign, instead parse it from the highest bit of value and manually print
it before "nan" string.
This fixes the `etnaviv_isa_disasm` unit test on RISC-V.
Suggested-by: Christian Gmeiner <cgmeiner@igalia.com>
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40887>
Use DFS traversal from main to resolve reachable functions. Avoid spurious
"unresolved reference" linker errors for dead helper functions.
It avoid reporting linking error for following shader test. The shader test used
to pass before merge_requests/31137:
[require]
GLSL >= 1.50
[vertex shader]
/* declared but not defined */
vec4 transform_color(vec3 color, float alpha);
/* calls transform_color — but this function is never called from main */
vec4 apply_transform(vec3 color, float alpha)
{
return transform_color(color, alpha);
}
[vertex shader]
in vec4 piglit_vertex;
void main()
{
/* apply_transform is never called here */
gl_Position = piglit_vertex;
}
Signed-off-by: Xinju Li <xinju.li@broadcom.com>
use pass_flags to mark function as reachable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41065>
For an inexact-associative operation (fadd or fmul), can_reassociate ensures the
root of the chain is inexact to allow reassociating. However, build_chain just
checks for opcodes to match up after, although we do sum up exactness across the
chain. Although an Effort Was Made, it still seems incorrect to reassociate
%3 = fadd! %0, %1
%4 = fadd %3, %2
to instead be (ex.)
%3 = fadd! %0, %2
%4 = fadd! %3, %1
Closes: #14418
Fixes: e0b0f7e73c ("nir: add ALU reassocation pass")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41162>
Rather than always emitting and swizzling 16 components for raw samples,
scale it by the number actually needed as defined by the selected tg4
channel/components.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
The structure will grow in later commits.
The major change is that the preheader and exit blocks are replaced
by tracking just the innermost optimized nir_loop * and getting the
predecessor and successor blocks out of it.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41220>
ac_nir_optimize_outputs() might pack user varyings into the color
built-ins. If this happens we skip adding clamping to the
components that contain the user varying.
This change also fixes a second bug where a color built-in can be
packed into a non-color slot and was no longer being clamped.
Fixes: 3777a5d7 ("radeonsi: assign param export indices before compilation")
Closes: #14443
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40594>
Sampling coeffs with trilinear filtering will output 2x sets of data.
Whether bilinear or trilinear filtering is in use can't be determined
without checking state words, so unconditionally reserve 2x to avoid
clobbering output regs.
Fixes: 7df32ba09d ("pco: initial texture/sampler compiler support")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41051>
Fixes assertion failure when MESA_SPIRV_DUMP_PATH is set for OpenCL
programs.
Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41097>
reduces hello world kernel 57 -> 44 inst on jay. why do we have two opcodes that
do literally the same thing? :/
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41085>
The hangover DXVK builds we want to use for arm64 CI hit this path, and we
have a perfectly reasonable fallback for handling this case (ignore the
sampler, as glslang should have done).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>