Reworks:
* Francisco: Rebase on 07b9bfacc7 ("intel/compiler: Move
logical-send lowering to a separate file")
* Jordan: Rebase on 952a523abb ("intel: switch over to unified
atomics")
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>
These fields are overlapping with the ones set by brw_message_desc(),
so the latter should be used instead. This fixes corruption of the
LSC message descriptors when inconsistent values are specified through
both helpers, which can happen if the 'inst->mlen' field is modified
during optimization (e.g. by opt_split_sends()).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>
We cannot rely on the immediate message descriptor having accurate
values for mlen and rlen at the IR level, since they are updated at
codegen time via 'inst->mlen' and 'inst->size_written', which could
end up with values inconsistent with the message descriptor if
e.g. the split sends optimization had an effect. Instead, define
helpers that do the computation without relying on the message
descriptor, and use the pre-existing
brw_message_desc_mlen()/brw_message_desc_rlen() helpers (fully
equivalent to the lsc helpers deleted here) during disassembly.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>
Rework:
* Jordan: Drop FINISHME (s-b Caio)
* Jordan: Use reg_unit() in asserts rather than a ver check (s-b Caio)
* Ian: Make use of reg_unit() in round_components_to_whole_registers()
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>
There's no real benefit to all this fixed size array nonsense. We're
heap allocating all of the nvk_descriptor_set structs anyway so having
an array-of-pointers walk instead of a linked list is no faster. Also,
this gets rid of another fixed-size array that can overflow.
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28482>
Hardly a copyrightable file, though it should inherit the MIT
license as the code originate from MIT licensed r300_winsys.h.
Fixes: 6e3fc2de2a ("r300g: Move bootstrap code to targets")
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28384>
Compaction only moves directly-indexed slots. This prevents unnecessary
num_slots > 1 from appearing in random slots.
Fixes: c66967b5cb - nir: add nir_opt_varyings, new pass optimizing and compacting varyings
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28431>
Backward inter-shader code motion turns ALU results into outputs,
which led to getting IO with unsupported bit sizes. This prevents
that.
There is a new NIR option flag that indicates 16-bit support.
Fixes: c66967b5cb - nir: add nir_opt_varyings, new pass optimizing and compacting varyings
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28431>
We no longer need to emit store_output intrinsics at the
end of the shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
There was 1 more bit left, may as well use it for something.
In the future, this may allow increasing the maximum number of
patches per workgroup.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
These parts are not used anymore, therefore we no longer need to
change the VS state when tessellation states change.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
Use tcs_offchip_layout instead of VS state to determine the
number of LS outputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
Now that neither RADV nor RadeonSI uses TCS epilogs, we don't
need to keep the code to compile them in ACO either.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
Always emit the tessellation factor writes in the main shader,
which is doable now that the necessary information is in the
tcs_offchip_layout SGPR.
This eliminates the need for TCS epilogs, so delete them
entirely from RadeonSI.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
Put the primitive mode and whether TES reads tess factors into
the tcs_offchip_layout SGPR, so they can be used by the main
shader instead of needing the epilog.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
The intention is to free up enough bits in tcs_offchip_layout so
that it can contain information for more dynamic states.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
We'll need to clean this up later, but for now it's better to
have it in common code than in RadeonSI.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
When tu_shader object is destroyed through vk_pipeline_cache, the relevant
destroy callback should relay to the general tu_shader_destroy function
that will also clean up owned resources.
During shader creation, the ir3_shader object should be destroyed once the
shader variants are retrieved. Since those variants are owned by tu_shader
they should be freed up in tu_shader_destroy.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: a03525d8db ("tu: Split program draw state into per-shader states")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27847>
With the exception of some variants for framebuffer fetch (to be addressed in a
follow up MR, this is big enough as it is) -- this switches us to a shader
precompile path for VS & FS. VS prologs let us implement vertex buffer fetch
with dynamic formats, FS prologs let us implement misc emulation like API sample
masking and cull distance, while FS epilogs handle blending and tilebuffer
stores. This should cut down shader recompile jank significantly in the GL
driver. It also prepares us with most of what we need for big ticket Vulkan
extensions like ESO, GPL, and EDS3.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
translate export/load_exported instructions into moves to/from the requested
GPRs at shader part boundaries, with coalescing in RA for perf.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
we don't want the zs_emit dance emitted, all we care about is making sure tag
writes are enabled as if we had a regular tib store
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>