This prevents ~400 shader-db regresssions and a handful of fossil-db
regressions after i2b is always lowered.
All Ivy Bridge and newer Intel platforms had similar results. (Ice Lake shown)
total cycles in shared programs: 855301494 -> 855302118 (<.01%)
cycles in affected programs: 52787 -> 53411 (1.18%)
helped: 4 / HURT: 5
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141206055 -> 141205979 (-0.0%)
Instructions helped: 14
Cycles in all programs: 9133376616 -> 9133387327 (+0.0%)
Cycles helped: 13
Cycles hurt: 3
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
No shader-db changes on any Intel platform.
All of the helped shaders were presumably regressed by 4676b3d3dd (nir:
Use nir_test_mask instead of i2b(iand)).
v2: Add some comments explaining why specific replacements are used. In
the umin pattern, only markup the first usage of 'b' in the source
pattern.
Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
Instructions in all programs: 141384970 -> 141200966 (-0.1%)
Instructions helped: 45842
Cycles in all programs: 9133648977 -> 9133282672 (-0.0%)
Cycles helped: 26812
Cycles hurt: 6025
Gained: 23
Lost: 135
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
A loop below already adds all the permutations... including the 1-bit
version that isn't included in this group.
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
The exact same pattern appears later (around line 1323).
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
A later commit adds a pattern
(('umin', ('iand', a, '#b(is_pos_power_of_two)'),
('iand', c, '#b(is_pos_power_of_two)')),
('iand', ('iand', a, b), ('iand', c, b))),
When I originally made that pattern, I copied and pasted the search to
the replacement as
(('umin', ('iand', a, '#b(is_pos_power_of_two)'),
('iand', c, '#b(is_pos_power_of_two)')),
('iand', ('iand', a, '#b(is_pos_power_of_two)'),
('iand', c, '#b(is_pos_power_of_two)'))),
The caused the variables in the replacement to be marked is_constant,
and that resulted in an assertion failure deep inside nir_search.
src/compiler/nir/nir_search.c:530: construct_value: Assertion `!var->is_constant' failed.
These extra validation rules catch this kind of error at compile time
rather than at run time.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
For used by legacy GS to store output to different ring according
to stream id.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20158>
CSE tries to collapse equal instructions, and collapsing two phis with incompatible dests is illegal.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Fixes: 6bdce55c ("nir: Add a basic CSE pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19960>
Also fix the leading curly for the new function definitions.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19570>
Make the shader_info printing less verbose by skipping the fields that
are likely not used (being zero).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19570>
This is a refactoring, it is not supposed to change the printed output.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19570>
We were not taking into account that when all invocations within workgroup
are active, we'll copy more data than needed, corrupting task payload
of other workgroups.
Fixes: 8aff8d3dd4 ("nir: Add common task shader lowering to make the backend's job easier.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20080>
link_varyings ignores precisions and can assign the same location to
variables with different precisions. nir_link_varying_precision should
check location_frac as well.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20113>
In try_fold_load_store when trying to extract const addition from
non-const offset source, we should take into account that there is
already a constant base offset, which should count towards the limit.
The issue was found in "Monster Hunter: World" running on Turnip.
Fixes: cac6f633b2
("nir/opt_offsets: Use nir_ssa_scalar to chase offset additions.")
Well, the issue was present before this commit but it made a lot
of changes in surrounding code.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20099>
Used by radeonsi for lower nir_atomic_add_gen/xfb_prim_count_amd.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18010>
For radeonsi which use 32bit address in ac_build_load_to_sgpr().
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18010>
We'll use formatted loads and some system values to lower UBOs and VBOs to
global memory in NIR, using the AGX-specific format support and addressing
arithmetic to optimize the emitted code.
Add the intrinsics and teach nir_opt_preamble how to move them so we don't
regress UBO pushing.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19996>
According to the deprecation note:
> It's a pure subset of meson.get_external_property, and works strangely
> in host == build configurations, since it would be more accurately
> described as get_host_property.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19904>
We need this, because on Intel task payload starts with private header,
followed by user-accessible data.
Fixes: 37e78803d7 ("intel/compiler: use nir_lower_task_shader pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19409>
Move them up to where the other conversion helpers. For nir_b2<T>(),
suffix them with N like all the others and make them use
nir_type_convert() as well.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20067>
If both types are the same or both are integer types with the same bit
size, no actual conversion happens and nir_type_conversion_op() will
return nir_op_mov. In this case, there's no point in emitting the move
and we can just return src instead.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20067>
In our handling of load_deref, we were calling builder helpers to create
conversions and then adjusting the destination bit size of the load. We
should adjust the bit size first because the builder sometimes looks at
the bit sizes of SSA values passed in as arguments.
Even though it's not strictly necessary, adjust the store_deref case as
well to make it fully symmetric with the load_deref case.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20067>
Based on nir_create_passthrough_tcs and d3d12_make_passthrough_gs, this
creates a passthrough geometry shader that can be used by drivers that
needs to emulate some graphics features in the geometry shader.
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19987>
We cannot fully use the vectorizer outside of this pass because once
stack load/store operations have been lower to global load/store, the
robustness rule applies to those as they would to application
load/store.
But this is all internal and we know it doesn't require out of bound
checking. So doing the vectorizing here is the best solution. We just
have to teach the vectorizer about our intrinsics.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20058>