This lets nir-to-tgsi fold the constant offset of addressing calculations
into the CONST[] reference, which is important for D3D9-era compatibility:
HW of that age has limited uniform space, and if we do the addressing math
as math in the shader for dynamic indexing, the nir_load_consts end up
taking up uniforms we don't have available.
r300:
total instructions in shared programs: 1279699 -> 1279167 (-0.04%)
instructions in affected programs: 134796 -> 134264 (-0.39%)
total instructions in shared programs: 1279699 -> 1279167 (-0.04%)
instructions in affected programs: 134796 -> 134264 (-0.39%)
total temps in shared programs: 213912 -> 213736 (-0.08%)
temps in affected programs: 2166 -> 1990 (-8.13%)
total consts in shared programs: 953237 -> 952973 (-0.03%)
consts in affected programs: 45980 -> 45716 (-0.57%)
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309>
This saves a lot of pointless gl.h includes across the board,
it moves the one place that needs GLenum into a separate file
only used in those passes that require it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
This creates an internal shader_prim enum, I've fixed up most
users to use it instead of GL types.
don't store the enum in shader_info as it changes size, and confuses
other things.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
To avoid dragging gl.h into places it has no business being,
defined tessellation primitive mode to an enum.
This has a lot of fallout all over the place.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
Sparse ir_texture will set is_sparse and use struct type to
hold both residency code and sampled texel.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362>
This fixes nir_opt_cse miss replace a non-sparse tex instruction
with a sparse tex instruction and fail the nir_validate_shader().
Fixes: 3a7972f72a ("nir,spirv: add sparse texture fetches")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362>
Doing this for ir3 required adding a struct for limits of how much base to
fold in (which NTT wants as well for its case of shared vars), otherwise
the later work to lower to the 1<<9 word limit would emit more
instructions.
The shader-db results are that sometimes the reduction in NIR instruction
count results in the fewer sampler prefetches due to the shader being
estimated to be shorter (dota2, nexuiz):
total instructions in shared programs: 8996651 -> 8996776 (<.01%)
total cat5 in shared programs: 86561 -> 86577 (0.02%)
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023>
GL_LINE_STRIP_ADJACENCY is 0xB. 0xA is GL_LINES_ADJACENCY. While we're
here, drop the ARB prefixes.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14157>
So far we were checking ARB_shader_image_load_store is supported as
requirement to expose GLES 3.1.
But when this extension functionality was added in GLES 3.1 spec, it
was relaxed, and one of its requirements, the support for formatless
writing, was not included.
So this means that a driver that support all the extension
functionality except formatless writing, could expose GLES 3.1, but it
couldn't expose the extension itself (nor GL 4.2, which requires fully
implementation of the extension).
v2:
- Add the same exposure treatment to ARB_shader_image_size (Ilia).
v3:
- Remove dependency for OES_texture_buffer (Ilia).
- Check image resources for GLES 3.1 (Ilia).
v4:
- Check for MaxImageUniforms in compute shader (Ilia).
- Check for max combined image uniforms/ssbo (Ilia).
v5:
- Remove ARB_shader_image_load_store from check (Ilia).
- ARB_shader_image_store and ARB_shader_image required for
ARB_ES3_1_compatibility (Ilia).
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14288>
This optimizations turns
loop {
...
if (cond1) {
if (cond2) {
do_work_1();
break;
} else {
do_work_2();
}
do_work_3();
break;
} else {
...
}
}
into:
loop {
...
if (cond1) {
if (cond2) {
do_work_1();
} else {
do_work_2();
do_work_3();
}
break;
} else {
...
}
}
As this optimizations moves code into the NIF statement,
it re-iterates on the branch legs in case of success.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7587>
This patch is a rewrite of nir_opt_move.
Differently from the previous version, each instruction is checked
if it can be moved downwards and then inserted before the first user
of the definition. The advantage is that less insert operations are
performed, the original order is kept if two movable instructions have
the same first user, and instructions without user in the same block
are moved towards the end.
v2: Only return true if an instruction really changed the position.
Don't care for discards, this will be handled by another MR.
v3: fix self-referring phis and update according to nir_can_move_instr().
v4: use nir_can_move_instr() and nir_instr_ssa_def()
v5: deduplicate some code
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3657>
Adreno GPUs has native instruction for unsigned and mixed dot_4x8 but
not signed dot product.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
Ultimately this is consumed by nir-to-tgsi and needed by virglrenderer
to correctly declare output variables.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>
That's how the TGSI math opcodes work.
This lets lower_vec_to_regs coalesce the DP output into the .yzw channels,
giving an impressive shader-db win on softpipe:
total instructions in shared programs: 2929840 -> 2794036 (-4.64%)
instructions in affected programs: 1651438 -> 1515634 (-8.22%)
total temps in shared programs: 372730 -> 332744 (-10.73%)
temps in affected programs: 118151 -> 78165 (-33.84%)
and a minor one on r300:
total instructions in shared programs: 51238 -> 51149 (-0.17%)
instructions in affected programs: 2621 -> 2532 (-3.40%)
total vinst in shared programs: 15655 -> 15618 (-0.24%)
vinst in affected programs: 468 -> 431 (-7.91%)
total temps in shared programs: 9838 -> 9828 (-0.10%)
temps in affected programs: 59 -> 49 (-16.95%)
and a bigger one on i915g:
total instructions in shared programs: 398064 -> 395901 (-0.54%)
instructions in affected programs: 29271 -> 27108 (-7.39%)
total tex_indirect in shared programs: 12261 -> 12233 (-0.23%)
tex_indirect in affected programs: 98 -> 70 (-28.57%)
LOST: 0
GAINED: 5
The r300 change is less impressive because it does some backend copy-prop,
but also because intermediate storage of DPs now takes a vec4 instead of a
scalar.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14200>