fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-20 17:30:24 +01:00

Author	SHA1	Message	Date
Erik Faye-Lund	f4bd2d35cb	draw: track vertices and vertex_ptr as byte-pointers Reviewed-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>	2023-06-26 09:30:22 +00:00
Erik Faye-Lund	ed4bda8044	draw: use enum for primitive-type Reviewed-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>	2023-06-26 09:30:22 +00:00
Erik Faye-Lund	1569507e26	draw: use uint32_t instead of uint In these cases we actually want uint32_t, because we're doing 32-bit things to them. The hwinfo-bit is only being used by i915, and should probably be moved to i915 instead. But it should also be converted, so let's do that now. While we're at it, fixup the bit-setting as well. Reviewed-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>	2023-06-26 09:30:22 +00:00
Erik Faye-Lund	57abc7d037	draw: use enum for tgsi-semantic Reviewed-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>	2023-06-26 09:30:22 +00:00
Erik Faye-Lund	4844809edb	cso: use enum for render-conditions Reviewed-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>	2023-06-26 09:30:22 +00:00
Samuel Pitoiset	82e2802b7d	radv/amdgpu: add a helper to get a new IB Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>	2023-06-26 09:10:10 +00:00
Samuel Pitoiset	148f42be1d	radv/amdgpu: rename old_ib_buffers to ib_buffers No need to prefix with 'old' actually because this is just an array of IB buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>	2023-06-26 09:10:10 +00:00
Samuel Pitoiset	d74de65069	radv/amdgpu: use cs_finalize() when growing a CS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>	2023-06-26 09:10:10 +00:00
Samuel Pitoiset	437456b47c	radv/amdgpu: use the array of IB buffers for the chained IB path For executing IB on the compute queue (ie. IB2 isn't supported), we will need to break chaining, this is a first step towards this. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>	2023-06-26 09:10:10 +00:00
Samuel Pitoiset	81e308df72	radv/amdgpu: do not set the IB size when ending a CS with RADV_DEBUG=noibs This was only necessary for preambles/postambles, let's clarify this by determining the IB info from the first IB in the array instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>	2023-06-26 09:10:10 +00:00
Samuel Pitoiset	df0c742543	radv/amdgpu: rework growing a CS with the chained IB path slightly This should allow us to use cs_finalize(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>	2023-06-26 09:10:09 +00:00
Samuel Pitoiset	c11a62a7b0	radv/amdgpu: use the correct IB size when growing a CS with RADV_DEBUG=noibs The current IB size is copied when radv_amdgpu_cs_add_old_ib_buffer() is called, which might not be the real IB size because we might still pad the CS with NOP packets after. Found by inspection. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>	2023-06-26 09:10:09 +00:00
Matt Coster	91143f45b8	pvr: Advance entry pointer in pvr_setup_vertex_buffers() Fixes: dEQP-VK.robustness.robustness1_vertex_access .out_of_bounds_stride_0 .out_of_bounds_stride_16_single_buffer .out_of_bounds_stride_30_middle_of_buffer .out_of_bounds_stride_8_middle_of_buffer_separate Signed-off-by: Matt Coster <matt.coster@imgtec.com> Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23834>	2023-06-26 08:40:13 +00:00
Corentin Noël	bc2828a436	compiler: Allow the explicit_stride of aoa types to be zero The explicit stride doesn't have to be defined to aoa and therefore can be zero in some cases, like in arrays of arrays of uniform blocks. Resolves crash with spec@arb_gl_spirv@execution@ubo@aoa-2.shader_test piglit test for virgl. Signed-off-by: Corentin Noël <corentin.noel@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Acked-by: Gert Wollny <gert.wollny@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23648>	2023-06-26 09:19:43 +02:00
Hyunjun Ko	9f4299d6b2	anv: fix to set predicted weight tables correctly. Fixes: `8d519eb5f` ("anv: add initial video decode support for h265") Closes: mesa/mesa#9214 Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23790>	2023-06-26 15:08:05 +09:00
Hyunjun Ko	b8dc7675f2	intel/genxml: changes the type for predicted weight to unsigned. Turned out to be unsigned here after some experiments. Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23790>	2023-06-26 15:08:00 +09:00
Hyunjun Ko	e2f95ad296	vulkan/video: keep delta weight and offsets of predicted weight tables in h265 slice parsing Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23790>	2023-06-26 15:07:53 +09:00
Caio Oliveira	c421ecea56	vulkan: Update XML and headers to 1.3.255 Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23837>	2023-06-25 15:52:55 +00:00
Caio Oliveira	73af0475cb	vulkan: Add NV suffix to VK_NV_cooperative_matrix feature names In the new Vulkan Headers, VK_KHR_cooperative_matrix gets added and the feature names are the same. Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23837>	2023-06-25 15:52:55 +00:00
Karol Herbst	0759759658	rusticl/program: skip linking compiled binaries Applications can do their own caching, but are in any case required to properly "compile" the binaries via clBuildProgram or clCompileProgram + clLinkPrograms. In any case, there is no point building something if we already have the result. Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23847>	2023-06-25 11:15:17 +02:00
Karol Herbst	18f1087a21	rusticl: bump bindgen requirement Apparently on some ARM systems any older bindgen version crashes. Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23840>	2023-06-24 15:37:18 +00:00
Yonggang Luo	5b29463746	nir: Add function nir_function_set_impl This function is added to create a strong relationship between nir_function_impl and nir_function, so that nir_function->impl->function == nir_function is always true when (nir_function->impl != NULL && nir_function->impl != NIR_SERIALIZE_FUNC_HAS_IMPL). This invariant is already checked by the functions validate_function and validate_function_impl of nir_validate. Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23820>	2023-06-24 14:48:47 +00:00
Yonggang Luo	9fa38cf142	vtn: Do not assign main_entry_point->impl twice Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23820>	2023-06-24 14:48:47 +00:00
Yonggang Luo	0d9f474381	draw: Update the comment and function name to match the type Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23845>	2023-06-24 20:52:56 +08:00
Yonggang Luo	e7f0dd2710	draw: Replace usage of ubyte/ushort/uint with uint8_t/uint16_t/uint32_t in draw_pt_vsplit.c This can not be done with tools, so do it manually Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23845>	2023-06-24 20:52:53 +08:00
Yonggang Luo	f35ebd221f	draw: Replace usage of boolean/TRUE/FALSE with bool/true/false in draw_pt_vsplit* These changes cannot be done with tools, so do them manually Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23845>	2023-06-24 20:52:49 +08:00
Karol Herbst	fbe9a7ca3e	rusticl/mesa: create proper build-id hash for the disk cache Without generating a proper timestamp for the disk cache, we pull old binaries out of the disk cache, potentially being buggy or simply outdated. Once meson 1.2 lands we can easily pull in LLVM functions. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21612>	2023-06-24 12:36:36 +00:00
Karol Herbst	29b932512a	rusticl/meson: extract common bindgen rust args Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21612>	2023-06-24 12:36:36 +00:00
Karol Herbst	c896373889	rusticl: generate bindings for build-id stuff Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21612>	2023-06-24 12:36:36 +00:00
Karol Herbst	d14af00432	rusticl: structurize and reorder mesa binding args It became quite a mess, I had enough 🙃 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21612>	2023-06-24 12:36:36 +00:00
Eric Engestrom	337908440e	v3dv: replace boolean and uint with bool and size_t There's no reason to use the gallium `p_compiler.h` types in vulkan code. Inspired by https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23577, but using `size_t` for `ulist_data_size` because its two users are `blob_read_bytes()` and `memcpy()`, both of which expect a `size_t`. Signed-off-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23795>	2023-06-24 12:21:09 +00:00
Eric Engestrom	fa8a232691	docs/coding-style: add pre-commit hook fallback for clang-format Signed-off-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23722>	2023-06-24 12:04:15 +00:00
Eric Engestrom	270d898e75	docs/coding-style: add example emacs config for clang-format Signed-off-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23722>	2023-06-24 12:04:14 +00:00
Eric Engestrom	342196f7b0	docs/coding-style: add example vim config for clang-format Signed-off-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23722>	2023-06-24 12:04:14 +00:00
Pavel Ondračka	89873e5e5c	r300: properly count maximum used register index The problem is when we have a DP2 or DP3 instruction that writes a w channel like here: DP3 temp[148].w, -temp[147].xyz_, temp[57].xyz_; will get pair-converted to src0.xyz = temp[147], src1.xyz = temp[57] DP3, -src0.xyz, src1.xyz DP3 temp[148].w, -src0._, src0._ where the alpha instruction is basically just a replicate of the result from the RGB sub instruction. However the destination register index in the RGB slot is also 148. Now we pair-schedule and regalloc src0.xyz = temp[13], src1.xyz = temp[3] DP3, -src0.xyz, src1.xyz DP3 temp[3].w, -src0._, src0._ We properly regalloc the alpha channel, but we obviously skip the rgb, because the writemask is empty there. However when we emit the shader later, we actually check the number of used regs based on the maximum used register index and we don't consider the writemasks, so we would think we use 149 temps. AFAIK the shader would be still completely OK. But we would think it hits the HW limits and use a dummy one instead. Fix this by checking for empty writemasks when marking the registers as used. GAINED: shaders/glmark/1-22.shader_test FS This is also needed to prevent another lost Trine shader from https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089 Reviewed-by: Filip Gawin <filip.gawin@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23838>	2023-06-24 11:30:47 +00:00
Matt Turner	561cce32f1	anv: Only expose video decode bits with KHR_video_decode_queue This fixes dEQP-VK.api.info.format_properties.g8_b8r8_2plane_420_unorm in combination with the CTS fix from https://gerrit.khronos.org/c/vk-gl-cts/+/12191 Fixes: `9361481780` ("anv: add video format features for the one supported video output format") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8263 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23776>	2023-06-24 02:54:37 +00:00
Matt Turner	727335045d	anv: Pipe anv_physical_device to anv_get_image_format_features2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23776>	2023-06-24 02:54:37 +00:00
Karol Herbst	02aaf58908	nv50/ir/nir: set numBarriers if we emit an OP_BAR Even though the field is called `numBarriers` we set it to 1 just like we do with TGSI. It's unknown what the proper behavior here is. But without this set the GPU will complain to us loudly, so this silences at least that. Fixes: `a2d7a4f978` ("nv50/ir: convert to scoped_barrier") Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23749>	2023-06-24 02:12:14 +00:00
Karol Herbst	69c452781b	nvc0: fix printing shaders Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23749>	2023-06-24 02:12:14 +00:00
Karol Herbst	45d86b419b	rusticl/program: add debugging option to disable SPIR-V validation This is useful for running applications known to pass in invalid SPIR-V. Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23818>	2023-06-24 01:52:07 +00:00
Karol Herbst	2b2a513890	rusticl/program: add debugging for OpenCL C compilation Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23818>	2023-06-24 01:52:07 +00:00
Karol Herbst	2362fd502b	docs: document CLC_DEBUG Signed-off-by: Karol Herbst <git@karolherbst.de> Reviewed-by: Nora Allen <blackcatgames@protonmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23818>	2023-06-24 01:52:07 +00:00
Kenneth Graunke	1b3669a1ed	intel: Initialize FF_MODE2 on all Gfx12 platforms On Alchemist, the FF_MODE2 documentation says that we must set the FF_MODE2 timer values for GS and HS to 224. The hardware performance tuning guide also recommends setting the TDS timer to 4. On Tigerlake, i915 applies workarounds to set the GS timer to 224 (failing to do so can cause HS/DS unit hangs), and the TDS timer to 4 (for performance). It doesn't currently apply a HS timer there, and I'm not sure if it's strictly necessary, but given that Alchemist needed it, and the other two settings matched, let's assume that it ought to match as well. Unfortunately, there has been a bug in the i915 workarounds infrastructure for non-masked context registers where writing one field of the register zeroes out all the others. So, I believe the Tigerlake TDS timer value of 4 isn't being applied correctly there, though the register is also not readable on that platform which makes it hard to verify. So, this may also speed up tessellation. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9233 Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23839>	2023-06-24 01:20:36 +00:00
Francisco Jerez	427fee3507	intel/gfx12.5: Enable L3 partial write merging for compressible surfaces among other cases. This enables L3 partial write merging for a number of cases that seem to be getting accidentally disabled by the kernel, which was causing a serious performance bottleneck on DG2 and MTL platforms. The "Compressible Partial Write Merge Enable", "Coherent Partial Write Merge Enable" and "Cross-Tile Partial Write Merge Enable" bits in L3SQCREG5 were expected to be enabled by default (and confusingly, they even read off as enabled if you ran 'intel_reg read 0xb158' on an idle system), but they are getting clobbered during 3D context initialization by an i915 workaround. Enabling L3 partial write merging of compressible surfaces in particular seems to increase rendering fillrate by over 3x in some cases (e.g. the "VulkanFillRate/FillRateGPU/resolution:1[0-3]/format:*/blend:0" fillrate-bound microbenchmarks). Significant improvements can also be reproduced in most real-world workloads we've tested so far, e.g. Counter Strike GO improves by ~11%, Shadow Of the Tomb Raider improves by ~5.5%, and AztecRuins-VK improves by ~6.5% on DG2-512 -- Thanks a lot to Caleb Callaway for these figures. No regressions have been observed so far. Even though this patch might strike as surprisingly simple for such a large payoff, it's the result of Felix DeGrood and I trying to root-cause the rendering performance gap of DG2 on Linux vs Windows on and off during the last year, and some of the OA statistics captured by Felix early this month were greatly helpful for me to connect the last few dots, so Felix deserves a big chunk of the credit for this work. Cc: mesa-stable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23783>	2023-06-23 21:24:27 +00:00
David Heidelberg	d7ec6f1724	ci/fastboot: use gzipped Image to avoid compressing on the runner Faster download, one less step. Win-win. Signed-off-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23816>	2023-06-23 20:47:53 +00:00
Thong Thai	7d3c29dc60	frontends/va: fix some coverity scan reported issues Added some checks for NULL pointer dereferencing and loop bounds. v2: Use ARRAY_SIZE instead of magic numbers (@jenatali) Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23598>	2023-06-23 20:31:21 +00:00
Caio Oliveira	dc93f205c1	meson: Explicitly add "check : false" to a couple instances of run_command In both cases there's code right after the execution to check the result and give a proper message. This gets rid of meson warning ``` WARNING: You should add the boolean check kwarg to the run_command call. It currently defaults to false, but it will default to true in future releases of meson. See also: https://github.com/mesonbuild/meson/issues/9300 ``` Reviewed-by: Eric Engestrom <eric@igalia.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23821>	2023-06-23 18:57:31 +00:00
Rhys Perry	d3e5e04a75	amd/drm-shim: use fixed-width types Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Engestrom <eric@igalia.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9221 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23725>	2023-06-23 18:35:52 +00:00
Alyssa Rosenzweig	766535c867	agx: Implement vector live range splitting The SSA killer feature is that, under an "optimal" allocator, the number of registers used (register demand) is equal to the number of registers required (register pressure, the maximum number of variables simultaneously live at any point in the program). I put "optimal" in scare quotes, because we don't need to use the exact minimum number of registers as long as we don't sacrifice thread count or introduce spilling, and using a few extra registers when possible can help coalesce moves. Details-shmetails. The problem is that, prior to this commit, our register allocator was not well-behaved in certain circumstances, and would require an arbitrarily large number of registers. In particular, since different variables have different sizes and require contiguous allocation, in large programs the register file may become fragmented, causing the RA to use arbitrarily many registers despite having lots of registers free. The solution is vector live range splitting. First, we calculate the register pressure (the minimum number of registers that it is theoretically possible to allocate successfully), and round up to the maximum number of registers we will actually use (to give some wiggle room to coalesce moves). Then, we will treat this maximum as a bound, requiring that we don't use more registers than chosen. In the event that register file fragmentation prevents us from finding a contiguous sequence of registers to allocate a variable, rather than giving up or using registers we don't have, we shuffle the register file around (defragmenting it) to make room for the new variable. That lets us use a few moves to avoid sacrificing thread count or introducing spilling, which is usually a great choice. Android GLES3.1 shader-db results are as expected: some noise / small regressions for instruction count, but a bunch of shaders with improved thread count. 
The massive increase in register demand may seem weird, but this is the RA doing exactly what it's supposed to: using more registers if and only if they would not hurt thread count. Notice that no programs whatsoever are hurt for thread count, which is the salient part. total instructions in shared programs: 1781473 -> 1781574 (<.01%) instructions in affected programs: 276268 -> 276369 (0.04%) helped: 1074 HURT: 463 Inconclusive result (value mean confidence interval includes 0). total bytes in shared programs: 12196640 -> 12201670 (0.04%) bytes in affected programs: 1987322 -> 1992352 (0.25%) helped: 1060 HURT: 513 Bytes are HURT. total halfregs in shared programs: 488755 -> 529651 (8.37%) halfregs in affected programs: 295651 -> 336547 (13.83%) helped: 358 HURT: 9737 Halfregs are HURT. total threads in shared programs: 18875008 -> 18885440 (0.06%) threads in affected programs: 64576 -> 75008 (16.15%) helped: 82 HURT: 0 Threads are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
Alyssa Rosenzweig	72e6b683f3	agx/lower_parallel_copy: Lower 64-bit copies To 32-bit. This way we don't get into bad situations where we need to eg swap unaligned 64-bit values or something funny like that. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
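
The register-pressure and fragmentation ideas described in commit 766535c867 above can be illustrated with a minimal sketch. This is not the AGX allocator: the interval representation, the function names `register_pressure` and `has_contiguous_gap`, and the boolean register-file model are all hypothetical and chosen only for illustration.

```python
def register_pressure(intervals):
    """Return the peak number of registers live at once.

    intervals: list of (start, end, size) tuples (a hypothetical model),
    where a variable of `size` registers is live on [start, end).
    """
    events = []
    for start, end, size in intervals:
        events.append((start, size))   # variable becomes live
        events.append((end, -size))    # variable dies
    # At the same program point, process deaths before births so a
    # register freed at point p can be reused by a definition at p.
    events.sort(key=lambda e: (e[0], e[1]))
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


def has_contiguous_gap(used, size):
    """True if the register file `used` (list of bools) has `size`
    free registers in a row."""
    run = 0
    for slot in used:
        run = 0 if slot else run + 1
        if run >= size:
            return True
    return False


# Four variables of different sizes with overlapping live ranges.
pressure = register_pressure([(0, 4, 2), (1, 3, 1), (2, 6, 4), (4, 7, 2)])

# A fragmented file: four registers are free, but no two are adjacent.
fragmented = [True, False] * 4
# "Defragmenting" here just packs the used registers together.
packed = sorted(fragmented, reverse=True)
print(pressure, has_contiguous_gap(fragmented, 2), has_contiguous_gap(packed, 2))
```

In this toy model the fragmented file cannot hold a 2-register vector even though half of it is free, while the packed file can. That is the situation the commit message says is resolved by shuffling the register file instead of using more registers.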

173300 commits