fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 00:38:06 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	778cb59086	anv: optimize STATE_BYTE_STRIDE emission Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30803>	2024-08-23 10:52:19 +00:00
Lionel Landwerlin	195c5b68ba	anv: don't miss workaround for indirect draws Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30803>	2024-08-23 10:52:19 +00:00
Lionel Landwerlin	f25b500af4	anv: move conditional render predicate after gfx_flush_state Following up on `f8c0a99d52` ("anv: emit conditional after gfx state flushing"), this should have been applied everywhere. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0147908a89` ("anv: predicate emission of STATE_BASE_ADDRESS") Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30803>	2024-08-23 10:52:19 +00:00
Tapani Pälli	5bf6602d23	anv: check if RT writes are happening for HasWriteableRT Fixes: `eebb6cd236` ("anv: stop using 3DSTATE_WM::ForceThreadDispatchEnable") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11749 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30785>	2024-08-23 06:28:00 +00:00
Lionel Landwerlin	a88898a28f	anv: optimize CLIP::MaximumVPIndex setting Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11746 Fixes: `982106e676` ("anv: only set 3DSTATE_CLIP::MaximumVPIndex once") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30762>	2024-08-23 05:45:03 +00:00
Kenneth Graunke	b97e10208c	intel/brw: Add a file parameter to idom_tree::dump() The other dump methods in this file also take a file parameter, defaulting to stderr. Dumping dot files to stdout is probably not what anybody really wanted. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30530>	2024-08-22 22:54:45 +00:00
Kenneth Graunke	bb4f05005e	intel/brw: Print blocks in brw_print_instructions_to_file() Useful when examining the control flow graph. For some reason, we printed this for the final assembly but not the IR. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30530>	2024-08-22 22:54:45 +00:00
Kenneth Graunke	2d73e42333	intel/brw: Fix OOB reads when printing instructions post-reg-alloc Post-register allocation, but before brw_fs_lower_vgrfs_to_fixed_grfs, we have registers with the VGRF file but they are actually fixed GRFs. brw_print_instructions_to_file() was seeing VGRFs and trying to access their size, but using bogus register numbers that could be out-of-bound. Detect when we're post-RA and avoid doing this. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30530>	2024-08-22 22:54:45 +00:00
Lionel Landwerlin	d9406658ed	brw: remove unused prog_data field Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30713>	2024-08-22 19:44:40 +00:00
Lionel Landwerlin	3769b58272	anv: move lowering of descriptor intrinsics to apply_layout Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30713>	2024-08-22 19:44:40 +00:00
Lionel Landwerlin	45117c0ed5	anv: simplify loading driver internal constants Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30713>	2024-08-22 19:44:39 +00:00
Lionel Landwerlin	7a55a930f6	anv: reuse common pipeline state for compute push allocations Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30713>	2024-08-22 19:44:39 +00:00
Eric Engestrom	d7f7aede15	intel/ci: don't trigger anv-jsl-full & anv-tgl-full on GL changes These are pure VK-CTS jobs, they don't run any GL tests. It doesn't matter right now because these two jobs are disabled, but when they get re-enabled, we'll want this to have been fixed. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30677>	2024-08-22 16:24:24 +00:00
Daniel Stone	cc507536db	ci/intel: Move manual/nightly jobs to postmerge stage Create a new stage called intel-postmerge and move the full and manual jobs over there, to avoid entanglement with the pre-merge jobs. Signed-off-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30784>	2024-08-22 15:35:18 +00:00
Daniel Stone	f1aab081b5	ci: Create new 'performance' stage Move all jobs doing performance testing to a separate stage. Signed-off-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30784>	2024-08-22 15:35:18 +00:00
Kenneth Graunke	6a292c2699	intel: Fix bad align_offset on global_constant_uniform_block_intel We were specifying align_offset = 64 and align_mul = 64, which is invalid. nir_combined_align() asserts that align_offset < align_mul. Our intention here is to perform cacheline-aligned (64B-aligned) block loads, so we should set align_mul = 64 and can leave align_offset = 0. Fixes: `fbafa9cabd` ("intel/nir: remove load_global_const_block_intel intrinsic") Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30755>	2024-08-21 20:44:57 +00:00
Ian Romanick	c96ceb50d0	intel/brw/xe2: Allow int64 conversions As far as I can tell from looking at the Bspec, MOV between integers of all sizes appears to be supported. shader-db: total instructions in shared programs: 17480631 -> 17480535 (<.01%) instructions in affected programs: 26284 -> 26188 (-0.37%) helped: 21 / HURT: 13 total cycles in shared programs: 897601907 -> 897664293 (<.01%) cycles in affected programs: 10929664 -> 10992050 (0.57%) helped: 48 / HURT: 45 fossil-db: Totals: Instrs: 140686824 -> 140686155 (-0.00%); split: -0.00%, +0.00% Cycle count: 21525129188 -> 21524717729 (-0.00%); split: -0.01%, +0.00% Spill count: 70778 -> 70776 (-0.00%) Fill count: 139172 -> 139168 (-0.00%) Max live registers: 47513859 -> 47513795 (-0.00%) Totals from 612 (0.11% of 549272) affected shaders: Instrs: 964441 -> 963772 (-0.07%); split: -0.09%, +0.02% Cycle count: 1215564312 -> 1215152853 (-0.03%); split: -0.09%, +0.06% Spill count: 16172 -> 16170 (-0.01%) Fill count: 37962 -> 37958 (-0.01%) Max live registers: 70749 -> 70685 (-0.09%) Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30700>	2024-08-21 20:16:00 +00:00
Ian Romanick	09cf9fe8ab	anv: Larger memory pools for huge shaders At least one ray tracing shader in cp2077 is over 4MB on Xe2. There isn't a memory pool large enough for the allocation, so the driver crashes instead. This commit adds 8MB and 16MB pools. I intend this as a stop gap fix. I would prefer to figure out why this shader is so much larger than on previous platforms. The shader in question has 3824 spills and 8625 fills. That is not good. I suspect dealing with that will also solve the problem, but that will require a bit more time. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11739 Suggested-by: Lionel Landwerlin Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30751>	2024-08-21 19:45:17 +00:00
Ian Romanick	0921dfa044	anv: Protect against OOB access to anv_state_pool::buckets Suggested-by: Paulo Zanoni Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30751>	2024-08-21 19:45:17 +00:00
Rohan Garg	29a2e5358d	anv: enable KHR_shader_relaxed_extended_instruction Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30726>	2024-08-21 14:13:46 +00:00
Francisco Jerez	71ca8529c5	intel/brw/gfx12.5+: Fix IR of sub-dword atomic LSC operations. We were currently emitting logical atomic instructions with a packed destination region for sub-dword LSC atomics, along the lines of: > untyped_atomic_logical(32) dst<1>:HF, ... However, these instructions use an LSC data size D16U32, which means that the 16b data on the return payload is expanded to 32b by the LSC shared function, so we were lying to the compiler about the location of the individual channels on the return payload, its execution masking, etc. This is why the hacks that manually set the 'inst->size_written' of the instruction were required. In some cases this worked, but any non-trivial manipulation of the instruction destination by lowering or optimization passes could have led to corruption, as has been reproduced in deqp-vk during lower_simd_width() for shaders that use 16-bit atomics in SIMD32 dispatch mode. Note that LSC sub-dword reads aren't affected by this because they use raw UD destinations and specify the actual bit size of the operation datatype as the immediate SURFACE_LOGICAL_SRC_IMM_ARG, which doesn't work for atomic operations since that immediate specifies the atomic opcode. Instead, have the logical operation implement the behavior of 16-bit destinations correctly instead of silently replacing the 16-bit region with an inconsistent 32-bit region -- This is done by emitting the MOV instructions used to pack the data from the UD temporary into the packed destination from the lower_logical_sends() pass instead of from the NIR translation pass. Fixes: `43169dbbe5` ("intel/compiler: Support 16 bit float ops") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30683>	2024-08-21 02:33:12 +00:00
Nanley Chery	5e86087940	intel: Move depth clear value writes to drivers This improves drivers in the following ways: * iris_hiz_exec() and crocus_hiz_exec() gets rid of the narrowly-used update_clear_depth parameters. * iris avoids fast-clearing if the aux state is CLEAR. crocus avoids this too, but didn't actually need it in the first place. * iris updates the value once per fast_clear_depth() call instead of doing an update for each layer being cleared. * anv now updates the clear value when transitioning from an undefined layout instead of doing so on every fast-clear. This should be safer because we don't perform state cache invalidates when changing the clear value. So, existing surface states won't have any stale values. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30520>	2024-08-20 21:29:43 +00:00
Nanley Chery	d7b0d32c28	intel/blorp: Simplify depth clear value updates Use a single MI_STORE_DATA_IMM instead of five. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30520>	2024-08-20 21:29:43 +00:00
Nanley Chery	3294200098	intel: Add and use isl_get_sampler_clear_field_offset Add and use a function which documents the sampler's behavior around fast-clears on gfx11-12. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30520>	2024-08-20 21:29:43 +00:00
Nanley Chery	07e0834774	intel: Use a simpler workaround for HiZ WT fast-clears The new workaround tries to strike a balance between simplicity and functionality (for testing purposes). Instead of checking for the alignment of a specific LOD when fast-clearing, we take an all-or-nothing approach for LOD1+. I haven't found any app to clear LOD1+ except for a Dirt Rally trace some time ago. If I remember correctly, that trace clears all LODs, doesn't render to them, then clears again with a different color, incurring resolves. So, skipping LOD1+ fast clears will avoid those resolves. Other apps I tested include Synmark2, glmark2, GfxBench5, and the Vulkan games in our internal benchmarking tool. Now that we've added updated and simplified checks in the drivers themselves, we delete blorp_can_hiz_clear_depth. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30250>	2024-08-20 19:43:15 +00:00
Nanley Chery	a28bd0abdf	intel: Adjust partial depth fast clear checks None of our tracked games use partial depth clears, so only allow it in simple cases for testing purposes. This change also fixes an issue on gfx8, where we had been accidentally disabling full surface clears if the LOD was not 8x4 aligned. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30250>	2024-08-20 19:43:15 +00:00
Nanley Chery	dd384104b7	intel/blorp: Allow LOD0 fast-clears with HiZ WT I did some more debugging of this feature, but this time with a modified version of the piglit test, ./bin/depthstencil-render-miplevels. I modified the test to: * Control which LOD to stop populating/clearing * Print out the results of readpixels to stderr From there, I could see how different surface dimensions affected fast-clears. Depending on the surface dimensions, fast-clearing an LOD above the LOD0 could cause other LODs to be cleared and/or cause the targeted LOD to be only partially cleared (for example, when the LOD0 dimension is 66x66 and the test doesn't clear LOD3+). This never happens when fast-clearing LOD0 however. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5258 Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30250>	2024-08-20 19:43:15 +00:00
Nanley Chery	6afdc9c5a6	intel: Enable more LOD0 HIZ+CCS fast clears For correct fast-clearing with HiZ+CCS, we require roughly 16x8 alignment of LODs. The next patch will cause drivers to ignore the alignment of LOD0, so align the qpitch to 8 to avoid breakage and so that fast clears will be enabled more often. Prevents failures with the piglit test case: ./bin/fbo-depth-array depth-clear -fbo in the next patch. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30250>	2024-08-20 19:43:15 +00:00
Kenneth Graunke	d22d6d814d	intel/brw: Fix Xe2+ SWSB encoding/decoding for DPAS instructions SBID SET can only be used on SEND, SENDC, or DPAS instructions. The existing code was handling SET for SEND/SENDC, but was using the wrong encoding for DPAS. Add a new case to handle that and make it clear that the existing code is only for SEND/SENDC. While here, rewrite the encoder to use 2-bit binary immediates shifted up into the mode [9:8] field, rather than pre-shifted hex values. This matches the documentation better and is a little easier to follow. On the decode side, we were incorrectly decoding MATH instructions. Because they're marked is_unordered, we were hitting the SEND/SENDC decoding, which is incorrect for MATH. Fixes 22 cooperative matrix tests on Lunar Lake. Huge thanks to Paulo Zanoni for bisecting failures to one of my commits, then analyzing shaders and experimenting to discover that the failure was really an unrelated bug, just being provoked by different choices of registers. His work narrowing the problem down made it much easier to discover and fix this bug. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30705>	2024-08-20 19:09:37 +00:00
Kenneth Graunke	89f9a6e10b	intel/brw: Pass opcode to brw_swsb_encode/decode We're going to need to handle encoding/decoding differently for DPAS vs. SEND/SENDC vs. other instructions. Pass the opcode so we can figure out the encodings for each type of instruction. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30705>	2024-08-20 19:09:37 +00:00
Rohan Garg	1f06e70bdc	anv: migrate indirect mesh draws to indirect draws on ARL+ Backport-to: 24.2 Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30690>	2024-08-20 09:41:51 +00:00
Rohan Garg	f69c74b6d5	anv: dispatch indirect draws with a count buffer through the XI hardware on ARL+ ARL+ can dispatch indirect draws through the hardware. Backport-to: 24.2 Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30690>	2024-08-20 09:41:51 +00:00
Rohan Garg	74cd70841d	anv: refactor indirect draw support into its own function ARL+ supports some form of indirect draws. Instead of trying to mash support for indirect draws across various generations, let's make things cleaner by factoring out XI support into its own function. Backport-to: 24.2 Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30690>	2024-08-20 09:41:51 +00:00
Rohan Garg	c1af71c9c2	anv,iris: prefix the argument format with XI for an upcoming refactor Backport-to: 24.2 Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30690>	2024-08-20 09:41:51 +00:00
Rohan Garg	dc23db2a0d	anv: program a custom byte stride on Xe2 for indirect draws Xe2 allows us to program in a custom byte stride for indirect draws Backport-to: 24.2 Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30690>	2024-08-20 09:41:50 +00:00
Tapani Pälli	d4e8c8f874	anv: move setting 3DSTATE_CLIP::MaximumVPIndex from loop Loop iterates viewports but for MaximumVPIndex we only need viewport count and last stage that writes viewport. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30732>	2024-08-20 06:48:50 +00:00
Jianxun Zhang	8c623b6a7e	Revert "anv: Disable PAT-based compression on depth images (xe2)" This reverts commit `6073f091bb`. With the progress on Xe2 platforms, we are not seeing many issues caused by compression on depth buffers. Backport-to: 24.2 Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30653>	2024-08-19 17:50:10 -07:00
José Roberto de Souza	12656571fd	anv/gfx20: Enable depth buffer write through for multi sampled images BSpec: 56419 Backport-to: 24.2 Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29615>	2024-08-19 20:04:36 +00:00
Nanley Chery	ebe3eabda6	anv: Add want_hiz_wt_for_image() Backport-to: 24.2 Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29615>	2024-08-19 20:04:36 +00:00
José Roberto de Souza	2553878fba	intel/isl/gfx20: Allow hierarchical depth buffer write through for multi sampled surfaces BSpec: 56419 Backport-to: 24.2 Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29615>	2024-08-19 20:04:36 +00:00
Lionel Landwerlin	e10cbb59a5	anv: add assert to detect problematic instruction merges We stick to a rule in the driver that each field is only set in a single place in the driver. Therefore when merging instructions, we should never have any bit set to 1 from both sides. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30684>	2024-08-19 11:02:44 +00:00
Lionel Landwerlin	982106e676	anv: only set 3DSTATE_CLIP::MaximumVPIndex once Currently we can end up merging 2 prepacked 3DSTATE_CLIP instructions where 2 different places in the driver fill the MaximumVPIndex. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `50f6903bd9` ("anv: add new low level emission & dirty state tracking") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30684>	2024-08-19 11:02:44 +00:00
Lionel Landwerlin	7c73346549	anv: remove unused macro Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30684>	2024-08-19 11:02:44 +00:00
Lionel Landwerlin	9eff285a46	anv: fix extended buffer flags usages Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `bcc0ec8e6c` ("anv: enable KHR_maintenance5") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30714>	2024-08-19 10:13:09 +00:00
Caio Oliveira	40f77b6936	intel/brw: Avoid modifying the shader in assign_curb_setup if not needed If there are no uniforms to push, don't emit the AND or invalidate the shader analysis. This affects only compute shaders. Not a significant impact since lots of shaders end up pushing uniforms. Fossil-db numbers (restricted to compute pipelines only) for DG2 ``` Totals: Instrs: 3071016 -> 3070894 (-0.00%) Cycle count: 8320268863 -> 8320264519 (-0.00%) Totals from 122 (2.70% of 4520) affected shaders: Instrs: 10675 -> 10553 (-1.14%) Cycle count: 2060003 -> 2055659 (-0.21%) ``` Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30631>	2024-08-17 16:25:01 -07:00
José Roberto de Souza	38c989ada2	anv: Nuke anv_utrace_submit::trace_bo There is no usage for this bo. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30676>	2024-08-16 19:38:19 +00:00
José Roberto de Souza	f7b386bd6d	anv: Use batch_bo_pool in utrace anv_async_submit_init() calls In practice, the only change here is that batch_bo_pool is captured in error dumps. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30676>	2024-08-16 19:38:19 +00:00
José Roberto de Souza	168e26fc04	anv: Add trivial_batch and query-pool to the error capture Those are batch buffers that are not allocated from batch_bo_pool, so they were left out of error capture without the capture-all parameter. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30676>	2024-08-16 19:38:18 +00:00
Sagar Ghuge	c4f2a8d984	intel/compiler: Fix indirect offset in GS input read for Xe2+ Make sure to take new GRF size into consideration and adjust the indirect offset according to new size so that when we do the indirect load with address register, we load right values. This helps pass the following tests: - dEQP-VK.binding_model.descriptor_buffer.mutable_descriptor.geom - dEQP-VK.ray_query.geometry_shader. Backport-to: 24.2 Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30679>	2024-08-16 18:40:13 +00:00
Ian Romanick	c8038643b8	intel/brw: Make ifind_msb SSA friendly No shader-db changes on any Intel platform. v2: Use negate(tmp) instead of creating a new temporary. Suggested by Ken. fossil-db: Meteor Lake, DG2, and Skylake had similar results. (Meteor Lake shown) Totals: Instrs: 152535897 -> 152535883 (-0.00%); split: -0.00%, +0.00% Cycle count: 17112329592 -> 17112406110 (+0.00%); split: -0.06%, +0.06% Totals from 40 (0.01% of 633223) affected shaders: Instrs: 458813 -> 458799 (-0.00%); split: -0.01%, +0.00% Cycle count: 4358016282 -> 4358092800 (+0.00%); split: -0.23%, +0.24% Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 150560511 -> 150560465 (-0.00%); split: -0.00%, +0.00% Cycle count: 15484534441 -> 15482372893 (-0.01%); split: -0.12%, +0.11% Spill count: 59795 -> 59794 (-0.00%) Fill count: 103513 -> 103509 (-0.00%) Totals from 40 (0.01% of 632445) affected shaders: Instrs: 368877 -> 368831 (-0.01%); split: -0.01%, +0.00% Cycle count: 3918398264 -> 3916236716 (-0.06%); split: -0.49%, +0.43% Spill count: 16896 -> 16895 (-0.01%) Fill count: 27819 -> 27815 (-0.01%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30650>	2024-08-16 14:52:04 +00:00


12582 commits