fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 04:58:08 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	d0a3a11258	nir/lower_io: preserve all metadata when no progress Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13168>	2021-10-05 11:23:23 +00:00
Marcin Ślusarz	e26328582a	nir: preserve all metadata when nir_opt_vectorize doesn't make progress Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13189>	2021-10-05 10:02:54 +00:00
Marcin Ślusarz	54df09c8d4	nir: preserve all metadata when nir_propagate_invariant doesn't make progress Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13189>	2021-10-05 10:02:54 +00:00
Marcin Ślusarz	804c56f1a2	nir: preserve all metadata when nir_lower_int_to_float doesn't make progress Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13189>	2021-10-05 10:02:54 +00:00
Marcin Ślusarz	87ecdd4eff	glsl: preserve all metadata when lower_buffer_interface_derefs doesn't make progress Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13189>	2021-10-05 10:02:54 +00:00
Jesse Natalie	82c69c9a9d	compiler/clc: Preserve OCL kernel arg type metadata on LLVM13 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13177>	2021-10-04 18:16:01 +00:00
Jesse Natalie	3a752256f5	compiler/clc: Null extensions should mean all supported, not all Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13177>	2021-10-04 18:16:01 +00:00
Lionel Landwerlin	445996379b	clc: let user specify the targetted SPIRV version This version is given to the LLVM-SPIRV translator. On the SPIRV-Tools side of things, we want to use the highest available version to be sure to be able to parse back what was generated. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13113>	2021-10-03 19:32:54 +00:00
Lionel Landwerlin	72fd81d0ac	clc: print warnings/errors on their own line Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13113>	2021-10-03 19:32:54 +00:00
Lionel Landwerlin	3c8c817ae7	clc: add allowed extension for compile parameter The LLVM-SPIRV translator can include a bunch of capabilities into the generated SPIRV which is not what you always want. That include internal Intel specific capabilities from the translator. v2: Rename options Fixup checks (Jesse) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13113>	2021-10-03 19:32:54 +00:00
Boris Brezillon	7cd402c9c8	nir/lower_blend: Shrink blended result if needed Make sure the new and old sources have the same number of components, otherwise the NIR validation pass complains. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Boris Brezillon	3e07b8d4f8	nir/lower_blend: Make sure we're not passed scaled formats SCALED formats are interpreted as floats, but not in the usual [0, 1] or [-1, 1] range, meaning that the blend lowering logic can't directly apply to those. Assert that the format being passed is not a scaled format. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Boris Brezillon	15b4cab4d5	nir/lower_blend: Don't lower RTs whose format is set to NONE The caller doesn't necessarily want to lower blend operations on all render targets since some of them might be supported natively (panvk will be in that case). Let's just skip RTs that have a format set to PIPE_FORMAT_NONE to allow that. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Boris Brezillon	637cd5ac00	nir/lower_blend: Pad src to a 4-component vector nir_ssa_for_src() is not supposed to pad the src vector if dst->num_components > src->num_components. Let's pad things explicitly with nir_pad_vector(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Boris Brezillon	641bed3103	nir: Make sure src->num_components < dst->num_components in nir_ssa_for_src() The NIR validation complains if the swizzle accesses a component that's not present in the source. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13060>	2021-09-30 16:54:42 +02:00
Vadym Shovkoplias	36c241be01	driconf, glsl: Add a vs_position_always_precise option This is basically the same workaround as in `9b577f2a88` (driconf, glsl: Add a vs_position_always_invariant option) commit but for tesselation evaluation shaders. Some applications do not mark outputs as precise in tesselation evaluation shaders which can lead to different results in case some optimizations were applied. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Fixes: `09705747d7` ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13027>	2021-09-30 10:46:39 +00:00
Jason Ekstrand	d2264489ce	compiler/clc: grab opencl-c.h from the system path by default By default we use the header installed opencl-c.h header. But in the case Mesa is compiled for microsoft clon12 we keep the injected file. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9156>	2021-09-30 07:09:08 +00:00
Jason Ekstrand	8490766f53	compiler/clc: Clean ups Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9156>	2021-09-30 07:09:08 +00:00
Jason Ekstrand	1506ea2ecb	Move a bunch of the CLC stuff from src/microsoft to common code The D3D12-specific stuff isn't useful to have in common code but all the stuff to invoke clang really should be common. v2: Rebase (Lionel) v3: Define a new clc_libclc_new_dxil() entrypoint to create a clc context with DXIL nir_options (Jesse) v4: Fixup meson build (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9156>	2021-09-30 07:09:08 +00:00
Rob Clark	2b826582d8	isaspec: De-duplicate bitset encoding bitset encoding tends to have a lot of duplication, for ex. many instructions with the same encoding modulo the fixed pattern. Now that encode_bitset is split out into it's own template, so that we can capture the result, use a hash table to de-duplicate the bitset encoding into "snippet" functions so that bitset cases with identical encoding can re-use the same generated code. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13049>	2021-09-29 22:59:43 +00:00
Rob Clark	8726ed7221	isaspec: Split encode_bitset() into it's own template In the next patch, we are going to want to be able to capture the result of rendering the template as a py variable, which I don't think you can do otherwise with a <%def>. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13049>	2021-09-29 22:59:43 +00:00
Rob Clark	72db4e51f3	isaspec: Fix comment Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13049>	2021-09-29 22:59:43 +00:00
Rob Clark	cd3ee83ca5	isaspec: Remove unused leftovers These were never used, leftovers from an earlier iteration of isaspec which used an RPN based thing for expressions. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13049>	2021-09-29 22:59:43 +00:00
Lionel Landwerlin	52c0e6e5b3	spirv: switch Groups capability to non AMD specific field Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13081>	2021-09-29 15:40:20 +00:00
Lionel Landwerlin	daa8a81d99	nir: fix opt_memcpy src/dst mixup Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `f6667cb0ce` ("nir: Add a memcpy optimization pass") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13079>	2021-09-28 16:36:08 +00:00
Lionel Landwerlin	f349c8ab4b	spirv: don't bother initializing variables to Undef If an OpVariable's initializer is undef, there is no need to initialize the variable. v2: Comment the code (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13030>	2021-09-27 19:04:42 +00:00
Lionel Landwerlin	a17d24d901	spirv: workaround LLVM-SPIRV Undef variable initializers The LLVM-SPIRV translator creates variables with initializers, but most of those are actually undef initializers. We can just skip composites that are entirely made of undefs, but for partially undefs, we will still zero initialize. v2: Rename wa_llvm_spirv_undef_initializer to wa_llvm_spirv_ignore_workgroup_initializer (Caio) Limit workaround to OpenCL (Caio) Make workaround clearer (Caio) v3: Only apply workaround on workgroup storage (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13030>	2021-09-27 19:04:42 +00:00
Lionel Landwerlin	a17928639d	spirv: avoid shadowing local variable v2: rename s/eval/elem_val/ (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13030>	2021-09-27 19:04:42 +00:00
Lionel Landwerlin	9d9e67d118	spirv: don't fail on CapabilitySubgroupDispatch if supported Since only Anv uses the value, I'm only enabling this on anv. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `518693c3ec` ("spirv: Handle the SubgroupSize execution mode") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13034>	2021-09-24 20:23:14 +00:00
Rhys Perry	e43007af56	nir/opt_if: add opt_if_rewrite_uniform_uses Turns: if (a == (b=readfirstlane(a))) use(a) into: if (a == (b=readfirstlane(a))) use(b) Improves divergence analysis and lets us scalarize use(a). Improves Cyberpunk 2077 performance. fossil-db (Sienna Cichlid, Cyberpunk 2077): Totals from 57 (10.56% of 540) affected shaders: VGPRs: 4904 -> 4040 (-17.62%) CodeSize: 624360 -> 626828 (+0.40%); split: -0.06%, +0.46% MaxWaves: 656 -> 824 (+25.61%) Instrs: 119770 -> 119447 (-0.27%); split: -0.49%, +0.22% Latency: 1950256 -> 1633110 (-16.26%); split: -16.26%, +0.00% InvThroughput: 364852 -> 292089 (-19.94%) VClause: 1512 -> 1008 (-33.33%) SClause: 2693 -> 3196 (+18.68%) Copies: 10050 -> 9955 (-0.95%); split: -3.34%, +2.40% Branches: 3476 -> 3547 (+2.04%) PreSGPRs: 4003 -> 5076 (+26.80%) PreVGPRs: 4709 -> 3810 (-19.09%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12472>	2021-09-24 18:41:18 +00:00
Rhys Perry	69f9a96af1	nir: add nir_src_components_read() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12472>	2021-09-24 18:41:18 +00:00
Caio Marcelo de Oliveira Filho	240e60ba76	nir/lower_io_to_vector: Allow Task/Mesh to load from outputs Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12951>	2021-09-24 14:35:15 +00:00
Caio Marcelo de Oliveira Filho	895cfca641	spirv: Identify non-temporal memory access Map it to the existing ACCESS_STREAM_CACHE_POLICY access mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12945>	2021-09-21 21:55:54 +00:00
Christian Gmeiner	ca0f892191	compiler/isaspec: add alignment support This helps to get a really nice and aligend disasm output. Just use :align=X to define where in the line the field should be printed. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11321>	2021-09-21 20:25:31 +00:00
Christian Gmeiner	eae96d0c4c	compiler/isaspec: keep track of written data Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11321>	2021-09-21 20:25:31 +00:00
Christian Gmeiner	f0104a6c72	compiler/isaspec: add print(..) helper To support field alignment we need to keep track of how much data we have printed to our out FILE. This is a prep commit. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11321>	2021-09-21 20:25:31 +00:00
Christian Gmeiner	bc2e806b0f	freedreno/isa: move isaspec to a new home This commit moves isaspec out of freedreno into a more generic new home - src/compiler/isaspec. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11321>	2021-09-21 20:25:31 +00:00
Christian Gmeiner	3d65cea6ee	util/bitset: s/BITSET_SET_RANGE/BITSET_SET_RANGE_INSIDE_WORD Prep work for the next commit. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11321>	2021-09-21 20:25:31 +00:00
Jason Ekstrand	518693c3ec	spirv: Handle the SubgroupSize execution mode Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12959>	2021-09-21 18:34:59 +00:00
Bas Nieuwenhuizen	0d8bd8518d	nir: Support ray launch size in divergence analysis. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Bas Nieuwenhuizen	56b06c09b4	nir: Add AMD rt intrinsics. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Bas Nieuwenhuizen	b6be96a2bd	radv: Modify load_sbt_amd intrinsic to get the descriptor. That way we can get the address to the entry, which is needed for some nir builtins because extra data in the entry can be used as shader input. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Timur Kristóf	872d21820f	nir: Exclude non-generic patch variables from get_variable_io_mask. These are I/O variables which are not going to be removed anyway. However, get_variable_io_mask handles their location incorrectly. Found using the GCC undefined behavior sanitizer. Fixes the following error: runtime error: shift exponent 4294967258 is too large for 64-bit type 'long unsigned int' Closes: #5319 Fixes: `cf5f8f55c3` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12719>	2021-09-20 18:08:16 +00:00
Ian Romanick	d7ba52cce9	nir/edgeflags: Add a flag to indicate the edge flag input is needed Most modern hardware needs the edge flag added as a hidden vertex input and needs code added to the vertex shader to copy the input to an output. Intel hardware is a little different. Gfx4 and Gfx5 hardware works in the previously described mannter. Gfx6+ hardware needs the edge flag as a specific vertex shader input, and that input is magically processed by fixed-function hardware without need for extra shader code. This flag signals only that the vertex shader input is needed. It would be nice if we could decouple adding the vertex shader input from generating the copy-to-output code, but that has proven to be challenging. Not having that code causes other passes to want to eliminate that shader input. v2: Convert conditional to assertion. This pass is only called for vertex shaders. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>	2021-09-17 16:36:08 -07:00
Rhys Perry	a1af902531	nir/algebraic: distribute fmul(fadd(a, b), c) when b and c are constants This allows for more MAD/FMA instructions to be created. fossil-db (Sienna Cichlid): Totals from 50134 (33.46% of 149839) affected shaders: VGPRs: 2436536 -> 2436000 (-0.02%); split: -0.05%, +0.03% SpillSGPRs: 13136 -> 13135 (-0.01%); split: -0.02%, +0.02% CodeSize: 206621424 -> 206278292 (-0.17%); split: -0.23%, +0.07% MaxWaves: 1116804 -> 1117448 (+0.06%); split: +0.07%, -0.01% Instrs: 38977460 -> 38862886 (-0.29%); split: -0.33%, +0.04% Latency: 832425389 -> 827432260 (-0.60%); split: -0.63%, +0.03% InvThroughput: 184193457 -> 183563350 (-0.34%); split: -0.37%, +0.03% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7458>	2021-09-17 17:28:26 +00:00
Jason Ekstrand	6c7d23e6ca	nir: Stop sweeping indirects They're no longer ralloc'd. Fixes: `879a569884` "nir: Switch from ralloc to malloc for NIR instructions." Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12884>	2021-09-16 11:28:36 +00:00
Jason Ekstrand	d1eae6f36b	nir: Properly clean up nir_src/dest indirects Now that they're no longer ralloc'd, we have to be much more careful about indirects. We have to make sure every time a source or destination is overwritten, its indirect (if any) is freed. We also have to choose a memory ownership convention for the rewrite functions. Assuming that they will be called with the source from some other instruction, we choose to always make a copy of the indirect (if any). It's the responsibility of the caller to ensure its copy of the indirect is freed. Unfortunately, all this extra logic is going to make nir_instr_rewrite/move_src/dest more expensive because they now have all the logic of nir_src/dest_copy instead of a simple struct assignment. Fortunately, the vast majority of rewrite calls are done by nir_ssa_def_rewrite_uses which is an SSA-only fast-path. Fixes: `879a569884` "nir: Switch from ralloc to malloc for NIR instructions." Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12884>	2021-09-16 11:28:36 +00:00
Emma Anholt	aed4c0b5a9	nir: Drop the unused instr arg for src/dest copy functions. Now that we don't use ralloc, we don't need this arg to get at the right ralloc ctx. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00
Emma Anholt	879a569884	nir: Switch from ralloc to malloc for NIR instructions. By replacing the 48-byte ralloc header with our exec_node gc_node (16 bytes), runtime of shader-db on my system across this series drops -4.21738% +/- 1.47757% (n=5). Inspired by discussion on #5034. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00
Emma Anholt	feee5e6974	nir/tests: Fix transmuting an SSA dest to be non-SSA With the de-ralloc changes, having the register dest not have its .reg properly initialized caused crashes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>	2021-09-14 17:53:06 +00:00

1 2 3 4 5 ...

6449 commits