Commit graph

4259 commits

Author SHA1 Message Date
Timur Kristóf
b688a6d227 nir: Remove IB address and stride intrinsics.
RADV used these to emulate firstTask for NV_mesh_shader.
They are no longer needed.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22139>
2023-03-29 15:08:55 +00:00
Qiang Yu
bf9c1699cd nir: add nir_fisnan helper function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21552>
2023-03-28 19:57:11 +00:00
Qiang Yu
c9d60547ef nir,radeonsi: add and implement nir_load_alpha_reference_amd
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21552>
2023-03-28 19:57:11 +00:00
Qiang Yu
6848e05f9c nir: pack_(s|u)norm_2x16 support float16 as input
For AMD GPU which has instruction to normalize and pack two float16
inputs, and used when fragment shader export color output.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21552>
2023-03-28 19:57:11 +00:00
Faith Ekstrand
01275a1a95 nir: Drop a bunch of Authors tags
This is what git blame is for.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22120>
2023-03-26 00:16:25 +00:00
Konstantin Seurer
200e551cbb nir/lower_shader_calls: Remat derefs before lowering resumes
Closes: #7923
cc: mesa-stable

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20399>
2023-03-24 14:55:37 +00:00
Alyssa Rosenzweig
47ed0b41be nir: Add Mali load_output taking converison
Mali's LD_TILE instruction (mapping to NIR's load_output) requires a "conversion
descriptor" specifying how to convert from the register foramt to the tilebuffer
format. To implement framebuffer fetch on OpenGL without shader variants, we
generate these descriptors in the driver and pass them in a uniform. However, to
comply with the Ekstrand Rule, we can't have magically materialized system
values -- they should come only from the NIR where the driver can lower as it
pleases (e.g. PanVK can lower to a constant because it knows the framebuffer
format at pipeline create time). Add intrinsics to model this.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
2023-03-23 23:53:45 +00:00
Alyssa Rosenzweig
60bfc4deb9 nir: Add Panfrost intrinsics to lower sample mask
We want to lower this in NIR instead of the backend IR to give the driver a
chance to lower the "is multisampled?" system value, which makes more sense to
do in NIR. This gets rid of one of the magic compiler materialized sysvals.

Plus, this will let us constant fold away the lowering in Vulkan when we know
that the pipeline is single-sampled / multi-sampled.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>
2023-03-23 23:53:45 +00:00
Amber
8da3494d53 freedreno, nir, ir3: implement GL_EXT_shader_framebuffer_fetch
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21260>
2023-03-23 16:59:56 +00:00
Amber
ca92183845 nir: Add memory coherency information to shaders.
Signed-off-by: Amber Amber <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21260>
2023-03-23 16:59:56 +00:00
Amber
1462da2a70 nir: allow nir_lower_fb_read to support multiple render targets
Signed-off-by: Amber Amber <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21260>
2023-03-23 16:59:56 +00:00
Rhys Perry
e99ba0b6d3 nir/range_analysis: use perform_analysis() in nir_analyze_range()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21381>
2023-03-22 09:24:18 +00:00
Rhys Perry
2b03db39b3 nir/range_analysis: use perform_analysis() in nir_unsigned_upper_bound()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21381>
2023-03-22 09:24:18 +00:00
Rhys Perry
29a38b09cf nir/range_analysis: add helpers for limiting stack usage
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21381>
2023-03-22 09:24:18 +00:00
Rhys Perry
2145cf3dd1 nir/range_analysis: add missing masking of shift amounts
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 72ac3f6026 ("nir: add nir_unsigned_upper_bound and nir_addition_might_overflow")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21381>
2023-03-22 09:24:18 +00:00
Alyssa Rosenzweig
2933af7576 nir/builder: Add nir_umod_imm helper
Like nir_udiv_imm, we can do a similar power-of-two trick. It's also really
convenient.

v2: Assert reasonable bounds on the modulus (Faith).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> [v1]
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [v1]
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22010>
2023-03-22 06:18:18 +00:00
Georg Lehmann
cec04adcee nir: optimize i2f(f2i(fsign))
Foz-DB Navi10:
Totals from 3013 (2.23% of 134906) affected shaders:
VGPRs: 138068 -> 136964 (-0.80%); split: -0.80%, +0.00%
CodeSize: 10476416 -> 10391800 (-0.81%)
MaxWaves: 79118 -> 80088 (+1.23%)
Instrs: 1963227 -> 1945003 (-0.93%)
Latency: 24734883 -> 24649279 (-0.35%); split: -0.39%, +0.05%
InvThroughput: 6366777 -> 6334735 (-0.50%); split: -0.50%, +0.00%
VClause: 36845 -> 36882 (+0.10%); split: -0.26%, +0.36%
SClause: 59249 -> 59273 (+0.04%); split: -0.25%, +0.29%
Copies: 108570 -> 108501 (-0.06%); split: -0.19%, +0.13%
PreSGPRs: 105371 -> 105862 (+0.47%)
PreVGPRs: 117675 -> 116625 (-0.89%); split: -0.89%, +0.00%

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22003>
2023-03-22 05:34:55 +00:00
Samuel Pitoiset
bb7e0c4280 spirv,nir: add support for SpvBuiltInFullyCoveredEXT
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21497>
2023-03-21 08:44:09 +00:00
Emma Anholt
5873dcb32f nir/lower_mediump: Fix assertion about copy_deref lowering matching.
Copy and paste typo.  We shouldn't have copy_derefs during this pass,
anyway, but caught a failure with my upcoming unit testing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21666>
2023-03-21 00:51:24 +00:00
Jesse Natalie
49885f87c3 nir: Propagate alignment when rematerializing cast derefs
Fixes: 878a8daca6 ("nir: Add alignment information to cast derefs")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21975>
2023-03-17 08:16:03 +00:00
Timur Kristóf
022e55557b nir: Add load_typed_buffer_amd intrinsic.
This new intrinsic maps to the MTBUF instruction format on AMD GPUs
and represents a typed buffer load in NIR.

Also add an unsigned upper bound for the new intrinsic.
Code for that ported from aco_instruction_selection_setup.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16805>
2023-03-15 14:54:27 +00:00
Isabella Basso
59fea8af3a nir/algebraic: remove duplicate bool conversion lowerings
While [1] added some boolean conversion lowering patterns, those were
already dealt with on [2].

[1] - b86305bb ("nir/algebraic: collapse conversion opcodes (many patterns)")
[2] - d7e0d47b ("nir/algebraic: nir: Add a bunch of b2[if] optimizations")

Fixes: b86305bb ("nir/algebraic: collapse conversion opcodes (many patterns)")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965>
2023-03-11 17:21:38 +00:00
Isabella Basso
a553d3cd29 nir/algebraic: make patterns for float conversion lowerings imprecise
As noted on [1], lowering patterns of the form
floatS -> floatB -> floatS ==> floatS
cannot require precision since this may cause flush denorming.

[1] 3f779013 ("nir: Add an algebraic optimization for float->double->float")

Fixes: b86305bb ("nir/algebraic: collapse conversion opcodes (many patterns)")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965>
2023-03-11 17:21:37 +00:00
Isabella Basso
79c94ef52e nir/algebraic: extend lowering patterns for conversions on smaller bit sizes
Conversions on smaller bit sizes should also be collapsed when composed.

This also adds more patterns on the
intS -> intB -> floatB ==> intS -> floatB
lowering so as to deal with any int size C > B instead of a fixed intB.

Closes: #7776
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965>
2023-03-11 17:21:37 +00:00
Isabella Basso
a27bcd63d0 nir/algebraic: extend mediump patterns
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Italo Nicola <italonicola@collabora.com>
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965>
2023-03-11 17:21:37 +00:00
Isabella Basso
b3685f3ba7 nir/algebraic: insert patterns inside optimizations list
Some patterns were outside the list of optimizations.

Fixes: b86305bb ("nir/algebraic: collapse conversion opcodes (many patterns)")

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965>
2023-03-11 17:21:37 +00:00
Alyssa Rosenzweig
2ba48eea88 nir/lower_point_size: Use shader_instructions_pass
Sleepy code deletion mood.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21750>
2023-03-11 16:42:36 +00:00
Ian Romanick
0cadc3830f nir/lower_int64: Optionally lower ufind_msb using uadd_sat
v2: Fix inverted condition for applying the optimization. Noticed by
Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
831f9d3f61 nir/algebraic: Optimize some ifind_msb to ufind_msb
On Intel platforms, the uclz lowering if ufind_msb is either one
instruction better (Gfx7 and newer) or two instructions better (all
older platforms) than the ifind_msb implementations.

On platforms that use lower_find_msb_to_reverse, there should be no
difference.

All Haswell and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19938662 -> 19938634 (<.01%)
instructions in affected programs: 850 -> 822 (-3.29%)
helped: 2 / HURT: 0

total cycles in shared programs: 858467067 -> 858465538 (<.01%)
cycles in affected programs: 10080 -> 8551 (-15.17%)
helped: 2 / HURT: 0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
db6d1edc1b nir: Restrict ufind_msb and ufind_msb_rev to 32- or 64-bit sources
4d802df3aa loosened the type restrictions
on these opcodes to enable support for 64-bit ballot operations.  In
doing so, it enabled 8-bit and 16-bit sizes as well.

It's impossible to get these sizes through GLSL or SPIR-V.  None of the
lowering in nir_opt_algebraic can handle non-32-bit sizes.  Almost no
drivers can handle non-32-bit sizes.

It doesn't seem possible to enforce anything other than "one bit size"
or "all bit sizes" in nir_opcodes.py.  The only way it seems possible to
enforce this is in nir_validate.  This is not ideal, but it be what it
be.

v2: Remove restriction on find_lsb. It is acutally possible to get this
via GLSL by doing findLSB() on a lowp value. findMSB declares its
parameter as highp, so that path is still impossible.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
2d6f48f6ef nir/algebraic: Do not generate 8- or 16-bit find_msb
The next commit will add validation to restrict this instruction (and
others) to only 32-bit or 64-bit sources.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
2119ab7319 nir/builder: Do not generate 8- or 16-bit find_msb
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
28311f9d02 nir: intel/compiler: Move ufind_msb lowering to NIR
Fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Cycles in all programs: 9098346105 -> 9098333765 (-0.0%)
Cycles helped: 6

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
a4052e70ea nir/algebraic: Only lower ufind_msb with 32-bit sources
The 31-ufind_msb_rev(x) lowering only produces the correct result for
32-bit sources. ufind_msb_rev can also have 64-bit sources, and most
platforms are expected to lower this to 32-bit instructions with extra
logic operations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
0cc7bf63b7 nir: intel/compiler: Move ifind_msb lowering to NIR
Unlike ufind_msb, ifind_msb is only defined in NIR for 32-bit values, so
no @32 annotation is required.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
66840b98e4 nir: ifind_msb_rev can only have int32 sources
Just like ifind_msb.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Eric Engestrom
f5d3d1e7ed meson: inline gtest_test_protocol now that it's always 'gtest'
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21485>
2023-03-10 07:20:29 +00:00
antonino
3a59b2a670 nir: handle output beeing written to deref in nir_lower_point_smooth
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21731>
2023-03-09 04:38:24 +00:00
Daniel Schürmann
3073810397 nir/gather_info: allow terminate() in non-PS
RADV will use terminate() to end ray-tracing shaders.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21736>
2023-03-08 16:59:41 +00:00
Rhys Perry
98cb7e0108 nir: add nir_lower_alu_width_test.fdot_order
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812>
2023-03-08 14:38:26 +00:00
Rhys Perry
50f7e21481 nir: make fdph lowering match fdot
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812>
2023-03-08 14:38:26 +00:00
Rhys Perry
3668da7c83 nir: use xyzw order for precise fdot
Fixes flickering grass in Immortals Fenyx Rising.

fossil-db (gfx1100):
Totals from 13969 (10.38% of 134574) affected shaders:
MaxWaves: 442794 -> 442878 (+0.02%)
Instrs: 4861105 -> 4901408 (+0.83%); split: -0.02%, +0.85%
CodeSize: 24316100 -> 24396272 (+0.33%); split: -0.03%, +0.35%
VGPRs: 446256 -> 445572 (-0.15%); split: -0.20%, +0.05%
Latency: 28122456 -> 28162233 (+0.14%); split: -0.10%, +0.24%
InvThroughput: 2899673 -> 2904323 (+0.16%); split: -0.07%, +0.23%
VClause: 119599 -> 119631 (+0.03%); split: -0.07%, +0.09%
SClause: 186636 -> 186265 (-0.20%); split: -0.23%, +0.03%
Copies: 301370 -> 300386 (-0.33%); split: -0.75%, +0.42%
Branches: 85066 -> 85047 (-0.02%); split: -0.02%, +0.00%
PreSGPRs: 436167 -> 436137 (-0.01%)
PreVGPRs: 329715 -> 329809 (+0.03%); split: -0.01%, +0.04%

fossil-db (gfx1100, RADV_DEBUG=invariantgeom):
Totals from 43116 (32.04% of 134574) affected shaders:
MaxWaves: 1332938 -> 1333012 (+0.01%); split: +0.01%, -0.00%
Instrs: 16424513 -> 16658021 (+1.42%); split: -0.06%, +1.48%
CodeSize: 81258868 -> 81827860 (+0.70%); split: -0.07%, +0.77%
VGPRs: 1720368 -> 1719648 (-0.04%); split: -0.19%, +0.15%
SpillSGPRs: 1670 -> 1600 (-4.19%); split: -5.27%, +1.08%
Latency: 82063766 -> 82425418 (+0.44%); split: -0.23%, +0.67%
InvThroughput: 9665803 -> 9727810 (+0.64%); split: -0.09%, +0.73%
VClause: 449662 -> 451099 (+0.32%); split: -0.32%, +0.64%
SClause: 498841 -> 498639 (-0.04%); split: -0.24%, +0.20%
Copies: 1001020 -> 1000770 (-0.02%); split: -1.20%, +1.17%
Branches: 237580 -> 239637 (+0.87%); split: -0.01%, +0.88%
PreSGPRs: 1198167 -> 1198024 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 1225202 -> 1225035 (-0.01%); split: -0.06%, +0.05%

fossil-db (navi10):
Totals from 13969 (10.38% of 134563) affected shaders:
MaxWaves: 474386 -> 474508 (+0.03%); split: +0.05%, -0.03%
Instrs: 3740895 -> 3771566 (+0.82%); split: -0.00%, +0.82%
CodeSize: 19426592 -> 19459916 (+0.17%); split: -0.00%, +0.18%
VGPRs: 389916 -> 389852 (-0.02%); split: -0.09%, +0.07%
Latency: 25452927 -> 25502482 (+0.19%); split: -0.14%, +0.34%
InvThroughput: 3880807 -> 3923144 (+1.09%); split: -0.07%, +1.16%
VClause: 66835 -> 66712 (-0.18%); split: -0.38%, +0.20%
SClause: 178805 -> 178802 (-0.00%); split: -0.01%, +0.01%
Copies: 167601 -> 167625 (+0.01%); split: -0.54%, +0.56%
Branches: 83788 -> 83784 (-0.00%)
PreSGPRs: 388229 -> 388216 (-0.00%)
PreVGPRs: 342984 -> 343062 (+0.02%); split: -0.01%, +0.03%

fossil-db (navi10, RADV_DEBUG=invariantgeom):
Totals from 43116 (32.04% of 134563) affected shaders:
MaxWaves: 1260184 -> 1256414 (-0.30%); split: +0.10%, -0.40%
Instrs: 12804951 -> 12983628 (+1.40%); split: -0.01%, +1.41%
CodeSize: 65813224 -> 66137852 (+0.49%); split: -0.03%, +0.52%
VGPRs: 1556396 -> 1561340 (+0.32%); split: -0.09%, +0.41%
SpillSGPRs: 1377 -> 1395 (+1.31%)
Latency: 76095867 -> 76355111 (+0.34%); split: -0.32%, +0.66%
InvThroughput: 13546863 -> 13788789 (+1.79%); split: -0.05%, +1.84%
VClause: 310910 -> 311283 (+0.12%); split: -0.63%, +0.75%
SClause: 474878 -> 474941 (+0.01%); split: -0.09%, +0.10%
Copies: 639367 -> 637610 (-0.27%); split: -1.03%, +0.76%
Branches: 240178 -> 240185 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1056594 -> 1056590 (-0.00%); split: -0.00%, +0.00%
PreVGPRs: 1247950 -> 1247798 (-0.01%); split: -0.05%, +0.04%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7920
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812>
2023-03-08 14:38:26 +00:00
Marek Olšák
f7076d129d amd: add nir_intrinsic_xfb_counter_sub_amd and fix overflowed streamout offsets
Fixes: 5ec79f9899 - ac/nir/ngg: nogs support streamout

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21584>
2023-03-07 22:08:47 +00:00
Lionel Landwerlin
a278eeb719 nir: fix nir_ishl_imm
Both GLSL & SPIRV have undefined values for shift > bitsize. But SM5
says :

   "This instruction performs a component-wise shift of each 32-bit
    value in src0 left by an unsigned integer bit count provided by
    the LSB 5 bits (0-31 range) in src1, inserting 0."

Better to not hard code the wrong behavior in NIR.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e227bb9fd5 ("nir/builder: add ishl_imm helper")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@colllabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21720>
2023-03-07 08:14:34 +00:00
Alyssa Rosenzweig
952bd63d6d nir/opt_barrier: Generalize to control barriers
For GLSL, we want to optimize code like

   memoryBarrierBuffer();
   controlBarrier();

into a single scoped_barrier intrinsic for the backend to consume. Now that
backends can get scoped_barriers everywhere, what's left is enabling backends to
combine these barriers together. We already have an Intel-specific pass for
combining memory barriers; it just needs a teensy bit of generalization to allow
combining all sorts of barriers together.

This avoids code quality regression on Asahi when switching to purely scoped
barriers. It's probably useful for other backends too.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21661>
2023-03-06 22:09:27 +00:00
Alyssa Rosenzweig
282aeb9b9c nir/lower_tex: Add lower_index_to_offset
Some backends can handle a constant texture index or a dynamic texture index but
not a constant texture index plus a dynamic texture offset. Add a nir_lower_tex
option to lower to one of these options.

v2: Use more straightforward code proposed by Faith.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21546>
2023-03-06 21:38:32 +00:00
Erik Faye-Lund
c305f97257 nir: add a print_internal debug-flag
It can sometimes be useful to also print the shaders that are marked as
internal, so let's add a flag that lets us do that.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21681>
2023-03-06 09:13:52 +00:00
Alyssa Rosenzweig
586da7b329 nir: Add nir_lower_helper_writes pass
This NIR pass lowers stores in fragment shaders to:

   if (!gl_HelperInvocaton) {
      store();
   }

This implements the API requirement that helper invocations do not have visible
side effects, and the lowering is required on any hardware that cannot directly
mask helper invocation's side effects. The pass was originally written for
Midgard (which has this issue) but is also needed for Asahi. Let's share the
code, and fix it while we're at it.

Changes from the Midgard pass:

1. Add an option to only lower atomics.

   AGX hardware can mask helper invocations for "plain" stores but not for
   atomics.  Accordingly, the AGX compiler wants this lowering for atomics but
   not store_global. By contrast, Midgard cannot mask any stores and needs the
   lowering for all store intrinsics. Add an option to the common pass to
   accommodate both cases.

   This is an optimization for AGX. It is not required for correctness, this
   lowering is always legal.

2. Fix dominance issues.

   It's invalid to have NIR like

      if ... {
         ssa_1 = ...
      }

      foo ssa_1

   Instead we need to rewrite as

      if ... {
         ssa_1 = ...
      } else {
         ssa_2 = undef
      }
      ssa_3 = phi ssa_1, ssa_2

      foo ssa_3

   By default, neither nir_validate nor the backends check this, so this doesn't
   currently fix a (known) real bug. But it's still invalid and fails validation
   with NIR_DEBUG=validate_ssa_dominance.

   Fix this in lower_helper_writes for intrinsics that return data (atomics).

3. Assert that the pass is run only for fragment shaders. This encourages
   backends to be judicious about which passes they call instead of just
   throwing everything in a giant lower everything spaghetti.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21413>
2023-03-04 13:31:05 -05:00
Rhys Perry
aa32dc704f nir/range_analysis: fix vectorized phis and intrinsics
Found by inspection.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21288>
2023-03-04 12:58:38 +00:00
Marek Olšák
b80bd58265 nir: skip nir_op_unpack_32_4x8 in nir_lower_alu_width
The pass can't handle it just like the other unpack opcodes and generates
invalid NIR.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399>
2023-03-03 03:27:40 +00:00