a bunch of drivers have versions of this, might as well make a common one.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
When I went to use opt_reassociate for tu, I was advised that you want to
do this loop to get the best results. If everyone needs it, let's make it
common code and explain what's going on.
In the process, also make it skip work appropriately when there's no
progress.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36342>
we already have a perfectly good spiller and SSA... use it when it helps. yes,
this costs a bit of CPU time, but it's guarded behind enough checks that the
average time should be fine.
this was prompted by a shadertoy where we were losing waves due to way too
many constants pooled at the start of a chunky shader.
in GL shader-db, only affected shaders are in blender:
instrs HURT: shaders/blender/1020.shader_test FS: 3125 -> 3178 (1.70%)
instrs HURT: shaders/blender/981.shader_test FS: 3125 -> 3178 (1.70%)
instrs HURT: shaders/blender/729.shader_test FS: 3086 -> 3154 (2.20%)
instrs HURT: shaders/blender/1023.shader_test FS: 3085 -> 3153 (2.20%)
instrs HURT: shaders/blender/424.shader_test FS: 3085 -> 3153 (2.20%)
threads helped: shaders/blender/1020.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/1023.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/424.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/729.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/981.shader_test FS: 576 -> 640 (11.11%)
in VK fossils, there's a lot more high pressure shaders that benefit:
Totals from 113 (0.21% of 54019) affected shaders:
MaxWaves: 64448 -> 73088 (+13.41%)
Instrs: 388529 -> 391646 (+0.80%); split: -0.00%, +0.80%
CodeSize: 2750064 -> 2769106 (+0.69%); split: -0.00%, +0.69%
ALU: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
FSCIB: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
GPRs: 21297 -> 19289 (-9.43%)
Preamble instrs: 75703 -> 75911 (+0.27%)
notable improvement in Far Cry 5.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
this is asking for trouble, since divergence analysis doesn't handle stuff we
lower quickly. this fixes geometry shaders blowing up since the cited commit,
but since I was the one who r-b'd that change, I don't have anyone to blame but
myself C:
Fixes: d61edf079b ("nir: add nir_move_only_convergent/divergent")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
Firefox ESR requires Rust 1.82 since version 140. Thus, this update
is in line with our Rust update policy.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36526>
The requirements bump a few weeks ago forgot to update the docs.
Fixes: 1a698c75ae ("build: Rust: Bump minimum Meson and bindgen version")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36526>
For stable Rust, specifying the patch version already uniquely identifies a toolchain build. Specifying the date would only be required for nightly releases.
Reviewed-by: Eric Engestrom <eric@igalia.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36526>
When debugging a problem in a trace, CTS test,... that is caused by a
known compiler feature, the first step is usually to find which shader
causes the problem. This is often non-trivial as the amount of shaders
in a trace can be huge. This commit adds a debugging tool to help with
this.
The idea behind this tool is to assign every shader a deterministic
(pre-compilation) ID that can be used to order shaders. Once we have
this, we can use it to bisect which shader causes the problem. This
obviously only works if the problem can be traced back to a single
shader. In my experience, this is often the case.
This tool reuses the shader cache key as deterministic ID. It is
concatenated with the variant ID to distinguish the different variants
of a shader.
In practice, bisecting the shaders in a test run works like this:
- Gate the problematic compiler feature using ir3_shader_bisect_select;
E.g., if (ir3_shader_bisect_select(v)) IR3_PASS(...);
- Run test with IR3_SHADER_BISECT_DUMP_IDS_PATH=ids.txt
- Sort ids.txt
- Bisect the shader IDs using IR3_SHADER_BISECT_LO/IR3_SHADER_BISECT_HI.
- Dump the problematic shader using IR3_SHADER_BISECT_DISASM.
A Python script is provided to make all this easier:
- ir3_shader_bisect.py dump-ids -o ids.txt 'test args'
- ir3_shader_bisect.py bisect -i ids.txt 'test args'
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33602>
This MR removes most magic values of the affected code paths, and makes the code more readable. Parsing of the RSW words is now done by genxml.
v2:
- Renamed varying types
- Removed unnecessary whitespaces
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36401>
Anv removed support for sparse depth buffers, but some glcts tests try
to use them without first asking if we support them. We'll have to fix
this in the VK-GL-CTS codebase. In the meantime, keep Marge happy.
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35524>
Funcion anv_get_image_format_properties() can get called from two
different Vulkan entry points:
- anv_GetPhysicalDeviceImageFormatProperties2
- anv_GetPhysicalDeviceSparseImageFormatProperties2
While there is a sparse-named function aimed specifically at sparse
images, you can call vkGetPhysicalDeviceImageFormatProperties2
passing sparse flags in VkPhysicalDeviceImageFormatInfo2::flags. And
when that happens, we need to detect it and properly either return
VK_ERROR_FORMAT_NOT_SUPPORTED or properly set
props->imageFormatProperties->sampleCounts with a value that matches
the sparse usage.
This change affects our behavior in 3 types of cases: color MSAA
cases, depth/stencil MSAA cases and atomic_emulated cases. The
previous patches should have covered these cases, so everything should
be passing now.
v2: Rebase.
v3: Reword the commit message.
v4: Rebase and reword the commit message.
Testcase: dEQP-VK.api.info.sparse_image_format_properties2.2d.optimal.r16g16_unorm
Testcase: dEQP-VK.api.info.image_format_properties.2d.optimal.d16_unorm
Testcase: dEQP-VK.api.info.image_format_properties.2d.optimal.r64_uint
Reviewed-by: Iván Briano <ivan.briano@intel.com> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35524>
We set sparseImageInt64Atomics to false on these formats, so there's
no need for the software detiling. Thus, we can not set the flag,
which will make ISL pick Tile64 for these formats, and things will
work.
Thanks to Lionel for pointing the fix here.
Testcase: dEQP-VK.api.info.image_format_properties.*d.optimal.r64_*int
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35524>
We can't support multi-sampling with depth/stencil, only 1x and only
with 2D and sometimes 3D formats. Claim everything as not supported,
since games don't seem to be affected.
This will be noticeable once we fix
anv_GetPhysicalDeviceImageFormatProperties2() to stop (accidentally)
lying about what we support: without this patch we'll get failures.
It seems CTS expects that, if we do support the format, we have to
support it with multi-sampling as well.
Testcase: dEQP-VK.api.info.image_format_properties.2d.optimal.s8_uint (and 5 others)
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35524>
Prepare this function in a way where the caller is able to pass
multiple sample bits as the 'samples' argument, and add an output to
the function where we return the subset of 'samples' that is actually
valid, when it's valid.
For now none of the two callers is using the new argument, but this
will be changed in the next patch.
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35524>