Some hardware (AGX, Imagination, Arm) really want to know the interpolation
qualifiers when compiling the vertex shader. Even though we need to handle this
dynamic for separate shaders, we can improve performance by linking.
nir_opt_varyings already has all the information to do this, so just do so.
Note this has to be done in common code for Gallium, which links varyings within
the GLSL linker but then presents the linked programs as separate shader
objects. This models that nicely, allowing Gallium drivers to optimize without
weird sidebands.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>
we'll want this to be able to link interpolation qualifiers in a simple way with
nir_opt_varyings. add the metadata for it and the FS gathering pass.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>
the info is all messed up so we need to do this right after. merge this
code.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>
no other users now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
it's kind of silly now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
a bunch of drivers have versions of this, might as well make a common one.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
When I went to use opt_reassociate for tu, I was advised that you want to
do this loop to get the best results. If everyone needs it, let's make it
common code and explain what's going on.
In the process, also make it skip work appropriately when there's no
progress.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36342>
we already have a perfectly good spiller and SSA... use it when it helps. yes,
this costs a bit of CPU time, but it's guarded behind enough checks that the
average time should be fine.
this was prompted by a shadertoy where we were losing waves due to way too
many constants pooled at the start of a chunky shader.
in GL shader-db, only affected shaders are in blender:
instrs HURT: shaders/blender/1020.shader_test FS: 3125 -> 3178 (1.70%)
instrs HURT: shaders/blender/981.shader_test FS: 3125 -> 3178 (1.70%)
instrs HURT: shaders/blender/729.shader_test FS: 3086 -> 3154 (2.20%)
instrs HURT: shaders/blender/1023.shader_test FS: 3085 -> 3153 (2.20%)
instrs HURT: shaders/blender/424.shader_test FS: 3085 -> 3153 (2.20%)
threads helped: shaders/blender/1020.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/1023.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/424.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/729.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/981.shader_test FS: 576 -> 640 (11.11%)
in VK fossils, there's a lot more high pressure shaders that benefit:
Totals from 113 (0.21% of 54019) affected shaders:
MaxWaves: 64448 -> 73088 (+13.41%)
Instrs: 388529 -> 391646 (+0.80%); split: -0.00%, +0.80%
CodeSize: 2750064 -> 2769106 (+0.69%); split: -0.00%, +0.69%
ALU: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
FSCIB: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
GPRs: 21297 -> 19289 (-9.43%)
Preamble instrs: 75703 -> 75911 (+0.27%)
notable improvement in Far Cry 5.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
this is asking for trouble, since divergence analysis doesn't handle stuff we
lower quickly. this fixes geometry shaders blowing up since the cited commit,
but since I was the one who r-b'd that change, I don't have anyone to blame but
myself C:
Fixes: d61edf079b ("nir: add nir_move_only_convergent/divergent")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
When debugging a problem in a trace, CTS test,... that is caused by a
known compiler feature, the first step is usually to find which shader
causes the problem. This is often non-trivial as the amount of shaders
in a trace can be huge. This commit adds a debugging tool to help with
this.
The idea behind this tool is to assign every shader a deterministic
(pre-compilation) ID that can be used to order shaders. Once we have
this, we can use it to bisect which shader causes the problem. This
obviously only works if the problem can be traced back to a single
shader. In my experience, this is often the case.
This tool reuses the shader cache key as deterministic ID. It is
concatenated with the variant ID to distinguish the different variants
of a shader.
In practice, bisecting the shaders in a test run works like this:
- Gate the problematic compiler feature using ir3_shader_bisect_select;
E.g., if (ir3_shader_bisect_select(v)) IR3_PASS(...);
- Run test with IR3_SHADER_BISECT_DUMP_IDS_PATH=ids.txt
- Sort ids.txt
- Bisect the shader IDs using IR3_SHADER_BISECT_LO/IR3_SHADER_BISECT_HI.
- Dump the problematic shader using IR3_SHADER_BISECT_DISASM.
A Python script is provided to make all this easier:
- ir3_shader_bisect.py dump-ids -o ids.txt 'test args'
- ir3_shader_bisect.py bisect -i ids.txt 'test args'
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33602>
This MR removes most magic values of the affected code paths, and makes the code more readable. Parsing of the RSW words is now done by genxml.
v2:
- Renamed varying types
- Removed unnecessary whitespaces
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36401>
Anv removed support for sparse depth buffers, but some glcts tests try
to use them without first asking if we support them. We'll have to fix
this in the VK-GL-CTS codebase. In the meantime, keep Marge happy.
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35524>
Funcion anv_get_image_format_properties() can get called from two
different Vulkan entry points:
- anv_GetPhysicalDeviceImageFormatProperties2
- anv_GetPhysicalDeviceSparseImageFormatProperties2
While there is a sparse-named function aimed specifically at sparse
images, you can call vkGetPhysicalDeviceImageFormatProperties2
passing sparse flags in VkPhysicalDeviceImageFormatInfo2::flags. And
when that happens, we need to detect it and properly either return
VK_ERROR_FORMAT_NOT_SUPPORTED or properly set
props->imageFormatProperties->sampleCounts with a value that matches
the sparse usage.
This change affects our behavior in 3 types of cases: color MSAA
cases, depth/stencil MSAA cases and atomic_emulated cases. The
previous patches should have covered these cases, so everything should
be passing now.
v2: Rebase.
v3: Reword the commit message.
v4: Rebase and reword the commit message.
Testcase: dEQP-VK.api.info.sparse_image_format_properties2.2d.optimal.r16g16_unorm
Testcase: dEQP-VK.api.info.image_format_properties.2d.optimal.d16_unorm
Testcase: dEQP-VK.api.info.image_format_properties.2d.optimal.r64_uint
Reviewed-by: Iván Briano <ivan.briano@intel.com> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35524>
We set sparseImageInt64Atomics to false on these formats, so there's
no need for the software detiling. Thus, we can not set the flag,
which will make ISL pick Tile64 for these formats, and things will
work.
Thanks to Lionel for pointing the fix here.
Testcase: dEQP-VK.api.info.image_format_properties.*d.optimal.r64_*int
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35524>