I see no point, we allocate for every shader stage anyway. This is a bit
simpler.
I'm not a fan of the brw_compiler singleton at all but torching that is not on
today's agenda. Flattening it a little bit very much is.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>
We do not support VK_EXT_shader_object so far but vk_shader layer
depends on those values so we should fill them.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37452>
We now set fb_fetch_output and fb_fetch_output_coherent to be consistent
with nir_lower_io.
This has no impact in general unless some generic pass depends on those
infos.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37452>
We were not printing IO infos properly for those intrinsics.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37452>
The main challenge is handling tile status (TS) correctly. Full clears
simply mark tiles as "cleared" in TS metadata without touching pixels.
Scissored clears must first decompress existing TS tiles using the
current clear color, then apply the new color to the scissor region.
The implementation maintains the original surface clear color for TS
decompression operations while using the new color for actual clearing.
This prevents rendering artifacts when mixing BLT and 3D operations.
BLT engine operates directly with pixel positions and handles all TS
tile complexity automatically.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35956>
We should not count the last draw command stride padding in
the indirect buffer.
Fixes: 176740c26f4 ("mesa: implement mesh shader draw calls")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37392>
It does not look like our custom version has anything special to offer.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37421>
Advertise the extensions without VK_TIME_DOMAIN_DEVICE_KHR support is
not very useful.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37421>
Thanks to Nanley Chery for pointing out this possibility.
v2: Make it simpler (Nanley).
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37419>
When the auxiliary surface is handled by the hardware directly,
there's nothing to bind besides the main pixels, so we can allow
sparse without doing anything else. We can't do this in the exact same
way with DG2 (which has_flat_ccs) because it uses the
aux_state_tracking_buffer.
v2: Fix spelling (Nanley).
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37419>
We want to be a little more granular than just "aux surfaces are
completely incompatible with sparse!", so have each of
isl_surf_get_*_surf disable itself when sparse is used.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37419>
On r600 ternary operations can't use the fabs source modifier, so
converting "fadd(fabs(fmul(a, b), c)" to "ffma(fabs(a), fabs(b), c)"
adds one more instruction in the backend, hence avoid this.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37440>
Add late optimization to fuse f2i32 and fround_even operations into a
single f2i32_rtne instruction when the intermediate fround_even result
is only used once. This eliminates redundant rounding since f2i32_rtne
performs round-to-nearest-even conversion directly.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Tested-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37426>
Fixes this error during Shader.cpp build:
..\src\util/format/u_formats.h(33): fatal error C1083: Cannot open include file: 'util/format/u_format_gen.h': No such file or directory
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37316>
Again, instrs don't get freed as we go, so the linear gc context saves us
5 pointers per instr.
Fossil replay time for deadspace3 on a debugoptimized build -4.85258% +/-
3.04009% (n=10)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37316>
Since we don't free registers as we go, we can just allocate them in a
linear gc context that gets freed at ralloc destroy. Saves 5 pointers of
memory per register for the ralloc overhead.
Fossil replay time for deadspace3 on a debugoptimized build -4.30353% +/-
1.80078% (n=10).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37316>
In some cases after other passes, `(a == 0.0 ? 0 : b)` can be
turned into `(a != 0.0 ? b : 0)`, so let's match those cases too.
Also matching `min(abs(a), abs(b)) == 0.0 ? 0.0 : a * b`.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31479>
This probably has the same issue as load_shared.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 04956d54ce ("aco: force uniform result for LDS load with uniform address if it can be non uniform")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37417>
The lack of ACCESS_COHERENT was probably fine, the worst it would do would
add a pointless iteration after loading an out-of-date value.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37417>
To lower 64bit sources we emit a COLLECT -> SPLIT pair to force
allocation into consecutive registers. When the sources for COLLECT
are outputs of the same SPLIT already, we can omit the COLLECT + SPLIT
pair.
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Aksel Hjerpbakk <aksel.hjerpbakk@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37398>
No manual changes here, this is simply running
$ ninja -C build/ clang-format
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37222>
SMEM instructions mask off the low bits for the base and offset sources
both before and after they're added. However, NIR expects ACO to only
care about the alignment of the final address.
fossil-db (gfx1201):
Totals from 21 (0.03% of 79839) affected shaders:
Instrs: 229780 -> 229876 (+0.04%)
CodeSize: 1267724 -> 1268080 (+0.03%)
Latency: 2800924 -> 2800978 (+0.00%)
InvThroughput: 520250 -> 520256 (+0.00%)
Copies: 27878 -> 27876 (-0.01%); split: -0.01%, +0.00%
SALU: 29591 -> 29643 (+0.18%)
fossil-db (polaris10):
Totals from 3 (0.00% of 62201) affected shaders:
Latency: 2651 -> 2652 (+0.04%)
InvThroughput: 662 -> 663 (+0.15%)
PreSGPRs: 51 -> 54 (+5.88%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37301>
This seems to be equal assert with febe90e109 as we hit this when
launching Quake II RTX.
Fixes: 69b6b4cb28 ("anv: add shader instruction emission")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37429>
Implement retrieving color buffer info via gralloc0. This is a simpler
alternative to imapper4/5 which requires a dependency on libui that
would require a heavy effort to import headers and stub to be able to
build out of tree.
Since VNDK no longer releases headers since API Level 35 and they are
now only auto-generated, copy over the neccessary defines.
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37185>
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(114): error C2220: the following warning is treated as an error
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(114): warning C5286: implicit conversion from enum type '<unnamed-enum-LVP_CMD_WRITE_BUFFER_CP>' to enum type 'vk_cmd_type'; use an explicit cast to silence this warning
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(114): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(152): warning C5286: implicit conversion from enum type '<unnamed-enum-LVP_CMD_WRITE_BUFFER_CP>' to enum type 'vk_cmd_type'; use an explicit cast to silence this warning
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(152): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(173): warning C5286: implicit conversion from enum type '<unnamed-enum-LVP_CMD_WRITE_BUFFER_CP>' to enum type 'vk_cmd_type'; use an explicit cast to silence this warning
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(173): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(204): warning C5286: implicit conversion from enum type '<unnamed-enum-LVP_CMD_WRITE_BUFFER_CP>' to enum type 'vk_cmd_type'; use an explicit cast to silence this warning
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(204): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(706): warning C5286: implicit conversion from enum type '<unnamed-enum-LVP_CMD_WRITE_BUFFER_CP>' to enum type 'vk_cmd_type'; use an explicit cast to silence this warning
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(706): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(722): warning C5286: implicit conversion from enum type '<unnamed-enum-LVP_CMD_WRITE_BUFFER_CP>' to enum type 'vk_cmd_type'; use an explicit cast to silence this warning
../src/gallium/frontends/lavapipe/lvp_acceleration_structure.c(722): note: to simplify migration, consider the temporary use of /Wv:18 flag with the version of the compiler with which you used to build without warnings
warnings are introduced with new cl compiler:
Microsoft (R) C/C++ Optimizing Compiler Version 19.44.35214 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
usage: cl [ option... ] filename... [ /link linkoption... ]
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37041>
this reuses some of the machinery from regular gfx shaders, but there
are some key differences:
* separate program/GPL caching
* separate GPL vertex input (technically illegal because spec hasn't caught up)
* in descriptor layouts, task+mesh occupy vs+tcs space (and thus vs+tcs layouts add mesh stages)
* lots of 'is_mesh' checks sprinkled all over
otherwise much of this change is just enlarging arrays
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37427>