Connor Abbott
9c9e8c3349
nir: Reorder ffma and fsub combining
...
It's relatively common to do something like "a * b - c", which on most
GPUs can be implemented in a single instruction. Before
opt_algebraic_late this will be something like
"fadd(fmul(a, b), fneg(c))", and we want to turn it info
"ffma(a, b, fneg(c))". But because the fsub pattern was first we
instead turned it into "fsub(fmul(a, b), c)". Fix this by reordering
them.
Selected shader-db results on freedreno:
total instructions in shared programs: 1561330 -> 1551619 (-0.62%)
instructions in affected programs: 780272 -> 770561 (-1.24%)
helped: 1941
HURT: 491
helped stats (abs) min: 1 max: 147 x̄: 7.98 x̃: 4
helped stats (rel) min: 0.07% max: 30.77% x̄: 4.36% x̃: 3.17%
HURT stats (abs) min: 1 max: 307 x̄: 11.76 x̃: 5
HURT stats (rel) min: 0.09% max: 18.71% x̄: 2.26% x̃: 1.38%
95% mean confidence interval for instructions value: -4.57 -3.41
95% mean confidence interval for instructions %-change: -3.21% -2.84%
Instructions are helped.
total nops in shared programs: 358926 -> 356263 (-0.74%)
nops in affected programs: 167116 -> 164453 (-1.59%)
helped: 1395
HURT: 859
helped stats (abs) min: 1 max: 108 x̄: 6.80 x̃: 3
helped stats (rel) min: 0.17% max: 100.00% x̄: 19.15% x̃: 10.57%
HURT stats (abs) min: 1 max: 307 x̄: 7.95 x̃: 3
HURT stats (rel) min: 0.00% max: 381.82% x̄: 20.04% x̃: 10.00%
95% mean confidence interval for nops value: -1.77 -0.59
95% mean confidence interval for nops %-change: -5.55% -2.87%
Nops are helped.
total non-nops in shared programs: 1202404 -> 1195356 (-0.59%)
non-nops in affected programs: 496682 -> 489634 (-1.42%)
helped: 1951
HURT: 265
helped stats (abs) min: 1 max: 39 x̄: 4.02 x̃: 3
helped stats (rel) min: 0.07% max: 15.38% x̄: 2.97% x̃: 1.96%
HURT stats (abs) min: 1 max: 22 x̄: 2.97 x̃: 2
HURT stats (rel) min: 0.05% max: 10.00% x̄: 1.14% x̃: 0.75%
95% mean confidence interval for non-nops value: -3.38 -2.99
95% mean confidence interval for non-nops %-change: -2.60% -2.36%
Non-nops are helped.
total systall in shared programs: 288317 -> 292975 (1.62%)
systall in affected programs: 87876 -> 92534 (5.30%)
helped: 388
HURT: 431
helped stats (abs) min: 1 max: 214 x̄: 14.39 x̃: 8
helped stats (rel) min: 0.25% max: 100.00% x̄: 22.12% x̃: 11.96%
HURT stats (abs) min: 1 max: 232 x̄: 23.77 x̃: 7
HURT stats (rel) min: 0.00% max: 1300.00% x̄: 51.71% x̃: 17.30%
95% mean confidence interval for systall value: 3.07 8.30
95% mean confidence interval for systall %-change: 9.49% 23.97%
Systall are HURT.
(The systall hurt is probably just due to having having fewer
instructions to hide latency with.)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14554 >
2022-01-18 17:44:50 +00:00
Qiang Yu
7d4b0b7789
glsl/nir: convert is_sparse_texels_resident to nir
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:36 +08:00
Qiang Yu
634bb25123
glsl: add sparseTexelsResidentARB builtin function
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:36 +08:00
Qiang Yu
a0b6515ec4
glsl/nir: adjust sparse texture nir_variable
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:36 +08:00
Qiang Yu
3dc950118c
glsl/nir: convert sparse image load to nir
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:36 +08:00
Qiang Yu
f4a972b748
glsl/nir: convert sparse ir_texture to nir
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Qiang Yu
f62bbe44c9
glsl: add vec5 glsl types
...
Used by nir_variable holds sparse texture output which is
up to vec5 size.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Singed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Qiang Yu
59d94d8188
glsl: add sparse texture image load builtin functions
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Qiang Yu
dcec15d79c
glsl: add _texelFetch related sparse texture builtin function
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Qiang Yu
a26dbfc24b
glsl: add _textureCubeArrayShadow related sparse texture builtin func
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Qiang Yu
640f909862
glsl: add _texture related sparse texture builtin functions
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Qiang Yu
dd84769c55
glsl: ir_texture support sprase texture
...
Sparse ir_texture will set is_sparse and use struct type to
hold both residency code and sampled texel.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Qiang Yu
67b4c88512
glsl: add ARB_sparse_texture2 extension
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Singed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Qiang Yu
2cee73f0f7
nir: fix nir_tex_instr hash not count is_sparse field
...
This fixes nir_opt_cse miss replace a non-sparse tex instruction
with a sparse tex instruction and fail the nir_validate_shader().
Fixes: 3a7972f72a ("nir,spirv: add sparse texture fetches")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Rhys Perry
d95a0b52e4
nir/unsigned_upper_bound: don't follow 64-bit f2u32()
...
Fixes Doom Eternal crash.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 72ac3f6026 ("nir: add nir_unsigned_upper_bound and nir_addition_might_overflow")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14555 >
2022-01-17 10:59:21 +00:00
Emma Anholt
f6ffefba3e
nir: Apply nir_opt_offsets to nir_intrinsic_load_uniform as well.
...
Doing this for ir3 required adding a struct for limits of how much base to
fold in (which NTT wants as well for its case of shared vars), otherwise
the later work to lower to the 1<<9 word limit would emit more
instructions.
The shader-db results are that sometimes the reduction in NIR instruction
count results in the fewer sampler prefetches due to the shader being
estimated to be shorter (dota2, nexuiz):
total instructions in shared programs: 8996651 -> 8996776 (<.01%)
total cat5 in shared programs: 86561 -> 86577 (0.02%)
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023 >
2022-01-16 19:11:29 +00:00
Emma Anholt
b024102d7c
freedreno/ir3: Use nir_opt_offset for removing constant adds for shared vars.
...
Saves some work in carchase and manhattan31:
instructions in affected programs: 2842 -> 2818 (-0.84%)
nops in affected programs: 1131 -> 1105 (-2.30%)
non-nops in affected programs: 1236 -> 1238 (0.16%)
mov in affected programs: 57 -> 61 (7.02%)
dwords in affected programs: 2144 -> 2150 (0.28%)
cat0 in affected programs: 1195 -> 1169 (-2.18%)
cat1 in affected programs: 151 -> 155 (2.65%)
cat2 in affected programs: 142 -> 140 (-1.41%)
sstall in affected programs: 190 -> 178 (-6.32%)
(ss) in affected programs: 63 -> 63 (0.00%)
systall in affected programs: 532 -> 511 (-3.95%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023 >
2022-01-16 19:11:29 +00:00
Jason Ekstrand
8b3d947267
spirv,radv: Fix some GL enum comments
...
GL_LINE_STRIP_ADJACENCY is 0xB. 0xA is GL_LINES_ADJACENCY. While we're
here, drop the ARB prefixes.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14157 >
2022-01-14 15:08:09 +00:00
Juan A. Suarez Romero
3b81d2d30d
mesa/st: do not expose ARB_shader_image_load_store if not fully implemented
...
So far we were checking ARB_shader_image_load_store is supported as
requirement to expose GLES 3.1.
But when this extension functionality was added in GLES 3.1 spec, it
was relaxed, and one of its requirements, the support for formatless
writing, was not included.
So this means that a driver that support all the extension
functionality except formatless writing, could expose GLES 3.1, but it
couldn't expose the extension itself (nor GL 4.2, which requires fully
implementation of the extension).
v2:
- Add the same exposure treatment to ARB_shader_image_size (Ilia).
v3:
- Remove dependency for OES_texture_buffer (Ilia).
- Check image resources for GLES 3.1 (Ilia).
v4:
- Check for MaxImageUniforms in compute shader (Ilia).
- Check for max combined image uniforms/ssbo (Ilia).
v5:
- Remove ARB_shader_image_load_store from check (Ilia).
- ARB_shader_image_store and ARB_shader_image required for
ARB_ES3_1_compatibility (Ilia).
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14288 >
2022-01-13 09:12:35 +00:00
Daniel Schürmann
79a987ad2a
nir/opt_if: also merge break statements with ones after the branch
...
This optimizations turns
loop {
...
if (cond1) {
if (cond2) {
do_work_1();
break;
} else {
do_work_2();
}
do_work_3();
break;
} else {
...
}
}
into:
loop {
...
if (cond1) {
if (cond2) {
do_work_1();
} else {
do_work_2();
do_work_3();
}
break;
} else {
...
}
}
As this optimizations moves code into the NIF statement,
it re-iterates on the branch legs in case of success.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7587 >
2022-01-13 02:30:32 +00:00
Daniel Schürmann
dad609d152
nir/opt_if: merge two break statements from both branch legs
...
This optimization turns
loop {
...
if (cond) {
do_work_1();
break;
} else {
do_work_2();
break;
}
}
into:
loop {
...
if (cond) {
do_work_1();
} else {
do_work_2();
}
break;
}
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7587 >
2022-01-13 02:30:32 +00:00
Daniel Schürmann
8a78706643
nir: refactor nir_opt_move
...
This patch is a rewrite of nir_opt_move.
Differently from the previous version, each instruction is checked
if it can be moved downwards and then inserted before the first user
of the definition. The advantage is that less insert operations are
performed, the original order is kept if two movable instructions have
the same first user, and instructions without user in the same block
are moved towards the end.
v2: Only return true if an instruction really changed the position.
Don't care for discards, this will be handled by another MR.
v3: fix self-referring phis and update according to nir_can_move_instr().
v4: use nir_can_move_instr() and nir_instr_ssa_def()
v5: deduplicate some code
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3657 >
2022-01-12 13:41:54 +00:00
Marcin Ślusarz
f286ecf906
nir: handle per-view clip/cull distances
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Marcin Ślusarz
70a9710eee
spirv: mark [Clip|Cull]DistancePerViewNV variables as compact
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Marcin Ślusarz
0d6f83cbf1
nir: remove invalid assert affecting per-view variables
...
per-view variables can have arbitrary (but > 0) number of array levels
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Marcin Ślusarz
ce5a8bff77
spirv: handle multiview bits of SPV_NV_mesh_shader
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Marcin Ślusarz
4fed440724
nir: add load_mesh_view_count and load_mesh_view_indices intrinsics
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Marcin Ślusarz
e4ff7fd76a
spirv: add MeshViewCountNV/MeshViewIndidcesNV builtins from SPV_NV_mesh_shader
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Marcin Ślusarz
561de760fd
compiler: add new MESH_VIEW_COUNT/MESH_VIEW_INDICES system values
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Marcin Ślusarz
4cb7dcb097
spirv: handle ViewportMaskNV builtin/cap from SPV_NV_mesh_shader
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Rhys Perry
67fc7a1763
nir/uniform_atomics: fix is_atomic_already_optimized without workgroups
...
dims_needed would have been zero, so this would always returned true for
non-compute stages.
Also fix this for variable workgroup sizes.
Improves Shadow of the Tomb Raider RX 6800 performance by 10.6%, 11.5% and
4.5% (day_of_dead, jungle and paititi scenes).
radv_perf before and after:
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '62.913333333333334', 'min_fps': '62.81', 'max_fps': '62.98', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '64.02666666666666', 'min_fps': '63.93', 'max_fps': '64.11', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '74.81666666666666', 'min_fps': '74.72', 'max_fps': '74.88', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '69.57', 'min_fps': '69.52', 'max_fps': '69.63', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '71.41000000000001', 'min_fps': '71.31', 'max_fps': '71.5', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '78.16666666666667', 'min_fps': '78.07', 'max_fps': '78.23', 'interations': '3'}
Performance now seems slightly better than AMDVLK 2021.Q4.3:
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '68.02666666666666', 'min_fps': '67.95', 'max_fps': '68.16', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '70.24666666666667', 'min_fps': '69.83', 'max_fps': '70.51', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '77.19', 'min_fps': '77.18', 'max_fps': '77.2', 'interations': '3'}
fossil-db (Sienna Cichlid):
Totals from 40 (0.03% of 134621) affected shaders:
CodeSize: 62676 -> 65996 (+5.30%)
Instrs: 11372 -> 12111 (+6.50%)
Latency: 144122 -> 142848 (-0.88%); split: -1.09%, +0.21%
InvThroughput: 19686 -> 19847 (+0.82%); split: -0.06%, +0.87%
VClause: 304 -> 306 (+0.66%)
SClause: 603 -> 604 (+0.17%); split: -0.83%, +1.00%
Copies: 780 -> 858 (+10.00%)
Branches: 235 -> 329 (+40.00%)
PreSGPRs: 1072 -> 1083 (+1.03%); split: -0.37%, +1.40%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14407 >
2022-01-10 19:57:38 +00:00
Rhys Perry
b00138090e
nir/lower_shader_calls: fix store_scratch write_mask
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447 >
2022-01-10 19:01:04 +00:00
Danylo Piliaiev
b8d486f298
nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8
...
Adreno GPUs has native instruction for unsigned and mixed dot_4x8 but
not signed dot product.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986 >
2022-01-10 13:20:39 +02:00
Gert Wollny
6f348d9c99
nir_lower_io: propagate the "invariant" flag to outputs
...
Ultimately this is consumed by nir-to-tgsi and needed by virglrenderer
to correctly declare output variables.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423 >
2022-01-07 16:35:43 +00:00
Jesse Natalie
ef27036b4c
shader_info: tess.spacing needs to be unsigned
...
Otherwise MSVC will treat the bit-packed enum values as signed.
Reviewed-by: Marek Olák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14402 >
2022-01-07 12:16:41 +00:00
Emma Anholt
558a600629
nir_to_tgsi: Enable fdot_replicates flag.
...
That's how the TGSI math opcodes work.
This lets lower_vec_to_regs coalesce the DP output into the .yzw channels,
giving an impressive shader-db win on softpipe:
total instructions in shared programs: 2929840 -> 2794036 (-4.64%)
instructions in affected programs: 1651438 -> 1515634 (-8.22%)
total temps in shared programs: 372730 -> 332744 (-10.73%)
temps in affected programs: 118151 -> 78165 (-33.84%)
and a minor one on r300:
total instructions in shared programs: 51238 -> 51149 (-0.17%)
instructions in affected programs: 2621 -> 2532 (-3.40%)
total vinst in shared programs: 15655 -> 15618 (-0.24%)
vinst in affected programs: 468 -> 431 (-7.91%)
total temps in shared programs: 9838 -> 9828 (-0.10%)
temps in affected programs: 59 -> 49 (-16.95%)
and a bigger one on i915g:
total instructions in shared programs: 398064 -> 395901 (-0.54%)
instructions in affected programs: 29271 -> 27108 (-7.39%)
total tex_indirect in shared programs: 12261 -> 12233 (-0.23%)
tex_indirect in affected programs: 98 -> 70 (-28.57%)
LOST: 0
GAINED: 5
The r300 change is less impressive because it does some backend copy-prop,
but also because intermediate storage of DPs now takes a vec4 instead of a
scalar.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14200 >
2022-01-07 09:58:24 +00:00
Dave Airlie
f7bb68e499
glsl/nir: don't pass gl_context to the convertor routine.
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
ef3303385d
glsl/linker: remove a bunch more gl_context references.
...
This leaves only one reference used for the ctx->Driver.NewProgram
hook, this should likely be revisited.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
636b07943e
glsl/linker: drop unused gl_context.
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
32702047d8
glsl/linker/uniform_blocks: don't pass gl_context around.
...
just pass the constants
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
e943af6a55
glsl/nir/linker: avoid passing gl_context inside gl_nir linker
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
7c8017bbd1
glsl/linker: remove gl_context usage from more places.
...
Just pass in consts, exts, api
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
34090712c6
glsl/linker: remove gl_context from check image resources
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
4d6e866b7b
glsl/linker: get rid of gl_context from atomic counters paths
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
e9ec1429ba
glsl/linker: get rid of gl_context from uniform assign paths
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
adbbee980d
glsl/linker: get rid of gl_context from link varyings
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
98f665e613
glsl/linker: remove direct gl_context usage in favour of consts/exts/api
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
d5e7c7351a
glsl/linker: move more ctx->Consts to consts.
...
Don't pass gl contexts around as much
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
eb811bdaf4
glsl/linker: don't pass gl_context just for constants in xfb code
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00
Dave Airlie
e83f0fc620
glsl: don't pass gl_context to lower shared references.
...
this uses the consts only
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433 >
2022-01-07 06:19:49 +00:00