fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-30 13:58:15 +02:00

Author	SHA1	Message	Date
Kenneth Graunke	9e750f00c3	intel/brw: Make opt_copy_propagation_defs clean up its own trash Copy propagation often eliminates all uses of an instruction. If we detect that we've done so, we can eliminate the instruction ourselves rather than leaving it hanging until the next DCE pass. This saves some CPU time as other passes don't see dead code. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	2af84c2d49	intel/brw: Use the defs-based copy propagation along with the old one The new def-based pass works better in many cases, and should be less resource intensive. However, the limited visibility of the defs-based pass due to many values not being SSA yet makes it unable to fully replace the old pass. Try the new one, and if it can't make progress, then try the old one. That way, things will mostly be handled by the new pass, but everything that was being cleaned up still will be. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	580e1c592d	intel/brw: Introduce a new SSA-based copy propagation pass (Quite a few of the restrictions here are ported from the old pass.) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	9690bd369d	intel/brw: Delete old local common subexpression elimination pass We no longer use this older pass, so there's no need to keep it. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	8f09c58ddc	intel/brw: Switch to the new defs-based global CSE pass While the limited visibility due to partial SSA is a downside to the new pass, it has a huge number of advantages that make it worth switching over even now. It's much more efficient, can eliminate redundant memory loads across blocks, and doesn't generate loads of unnecessary copies that other passes have to clean up. This means we also eliminate the infighting between the old CSE, coalescing, and copy propagation passes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	234c45c929	intel/brw: Write a new global CSE pass that works on defs This has a number of advantages compared to the pass I wrote years ago: - It can easily perform either Global CSE or block-local CSE, without needing to roll any dataflow analysis, thanks to SSA def analysis. This global CSE is able to detect and coalesce memory loads across blocks. Although it may increase spilling a little, the reduction in memory loads seems to more than compensate. - Because SSA guarantees that values are never written more than once, the new CSE pass can directly reuse an existing value. The old pass emitted copies at the point where it discovered a value because it had no idea whether it'd be mutated later. This led it to generate a ton of trash for copy propagation to clean up later, and also a nasty fragility where CSE, register coalescing, and copy propagation could all fight one another by generating and cleaning up copies, leading to infinite optimization loops unless we were really careful. Generating less trash improves our CPU efficiency. - It uses hash tables like nir_instr_set and nir_opt_cse, instead of linearly walking lists and comparing each element. This is much more CPU efficient. - It doesn't use liveness analysis, which is one of the most expensive analysis passes that we have. Def analysis is cheaper. In addition to CSE'ing SSA values, we continue to handle flag writes, as this is a huge source of CSE'able values. These remain block local. However, we can simply track the last flag write, rather than creating entire sets of instruction entries like the old pass. Much simpler. The only real downside to this pass is that, because the backend is currently only partially SSA, it has limited visibility and isn't able to see all values. However, the results appear to be good enough that the new pass can effectively replace the old pass in almost all cases. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	2b30b3bbd4	intel/brw: Print defs in dump_instructions Like NIR, we print SSA defs as %1, %2, and so on. The number here is the VGRF number. VGRFs that don't correspond to a SSA def remain printed as vgrf1, vgrf2, and so on. This makes it much easier to see what values are SSA and which aren't. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Caio Oliveira	08da7edc0e	intel/brw: Track the number of uses of each def in def_analysis Even without a full use list, simply tracking the number of uses will let us tell "this is the only use of the def" or "we've just replaced all uses of a def". It's inexpensive to calculate and will be useful. (rebased by Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	0d144821f0	intel/brw: Add a new def analysis pass This introduces a new analysis pass that opportunistically looks for VGRFs which happen to satisfy the SSA definition properties. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	ad9e414aa9	intel/brw: Skip LOAD_PAYLOADs after every texture instruction if possible This avoids generating a bunch of trash we have to clean up later. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	84219892ad	intel/brw: Make gl_SubgroupInvocation lane index loading SSA Our code to initialize gl_SubgroupInvocation uses multiple instructions some of which are partial writes. This makes it difficult to analyze expressions involving gl_SubgroupInvocation, which appear very frequently in compute shaders. To make this easier, we add a new virtual opcode which initializes a full VGRF to the value of gl_SubgroupInvocation. (We also expand it to UD for SIMD8 so there are not partial write issues.) We then lower it to the original code later on in compilation, after we've done the bulk of our optimizations. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	344d4ee9f0	intel/brw: Make VEC() perform a single write to its destination. This gathers a number of sources into a contiguous vector register, typically using LOAD_PAYLOAD. However, it uses MOV for a single source. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Francisco Jerez	06e4e088a3	intel/brw/xe2+: Use active-thread-only barriers available since Xe2+. These allow avoiding dead-locks in non-compliant applications that execute barriers under non-uniform control flow. They're not expected to have any major disadvantage so let's enable them unconditionally. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>	2024-06-17 16:19:18 -07:00
Francisco Jerez	8e61d32db8	iris,anv/xe2+: Use pipelined variant of 3DSTATE_DRAWING_RECTANGLE. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>	2024-06-17 16:19:17 -07:00
Francisco Jerez	576c9e3af2	iris,anv/xe2+: Set tessellation redistribution regions per patch to recommended values. See also HSDES#14015504893 regarding the region-based tessellation redistribution feature which allows fine-tuning the number of regions per patch. This sets it to the recommended value, since region-based redistribution is enabled by default. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>	2024-06-17 16:19:17 -07:00
Francisco Jerez	2aa4652a68	iris,anv/xe2+: Enable the DX10/OGL border mode for YCrCb as per Wa_14014226147. Hardware defaults to DX9 YCrCb border color mode instead of the behavior expected for DX10/OGL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>	2024-06-17 16:19:17 -07:00
Alyssa Rosenzweig	15257b65c6	treewide: use nir_metadata_control_flow Via Coccinelle patch: @@ @@ -nir_metadata_block_index \| nir_metadata_dominance +nir_metadata_control_flow ...plus some manual fixups for call sites missed by coccinelle. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom] Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>	2024-06-17 16:28:14 -04:00
Daniel Schürmann	7af16e9f1e	nir/shader_info: remove uses_demote This flag is mostly redundant with uses_discard and was only introduced to implement demote with LLVM when it didn't have that intrinsic. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Daniel Schürmann	9b1a748b5e	nir: remove nir_intrinsic_discard The semantics of discard differ between GLSL and HLSL and their various implementations. Subsequently, numerous application bugs occurred and SPV_EXT_demote_to_helper_invocation was written in order to clarify the behavior. In NIR, we now have 3 different intrinsics for 2 things, and while demote and terminate have clear semantics, discard still doesn't and can mean either of the two. This patch entirely removes nir_intrinsic_discard and nir_intrinsic_discard_if and replaces all occurrences either with nir_intrinsic_terminate{_if} or nir_intrinsic_demote{_if} in the case that the NIR option 'discard_is_demote' is being set. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Faith Ekstrand	4a84725ebb	intel/blorp: Set nir_shader::options up-front before building Previously, we left it NULL until later in the compile. However, some builder helpers are starting to check the options and they blow up when options == NULL. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:15 +00:00
Daniel Schürmann	f3d8bd18dd	nir: introduce discard_is_demote compiler option This new option indicates that the driver emits the same code for nir_intrinsic_discard and nir_intrinsic_demote. Otherwise, it is assumed that discard is implemented as terminate. spirv_to_nir uses this option in order to directly emit nir_demote in case of OpKill. RADV GFX11: Totals from 3965 (4.99% of 79439) affected shaders: MaxWaves: 119418 -> 119424 (+0.01%); split: +0.03%, -0.03% Instrs: 1608753 -> 1620830 (+0.75%); split: -0.18%, +0.93% CodeSize: 8759152 -> 8785152 (+0.30%); split: -0.18%, +0.48% VGPRs: 152292 -> 149232 (-2.01%); split: -2.37%, +0.36% Latency: 9162314 -> 10033923 (+9.51%); split: -0.46%, +9.97% InvThroughput: 1491656 -> 1493408 (+0.12%); split: -0.10%, +0.22% VClause: 21424 -> 21452 (+0.13%); split: -0.31%, +0.44% SClause: 53598 -> 55871 (+4.24%); split: -2.15%, +6.39% Copies: 90553 -> 90462 (-0.10%); split: -2.91%, +2.81% Branches: 16283 -> 16311 (+0.17%) PreSGPRs: 113993 -> 113254 (-0.65%); split: -1.84%, +1.19% PreVGPRs: 110951 -> 108914 (-1.84%); split: -2.08%, +0.24% VALU: 963192 -> 963167 (-0.00%); split: -0.01%, +0.01% SALU: 87926 -> 90795 (+3.26%); split: -2.92%, +6.18% VMEM: 25937 -> 25936 (-0.00%) SMEM: 110012 -> 109799 (-0.19%); split: -0.20%, +0.01% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:15 +00:00
Jianxun Zhang	09277c7ea6	blorp: Fix offset when ambiguating MCS buffer (xe2) The MCS region to ambiguate needs to shift 4KB from its starting address. The first 4KB is reserved for hardware. Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28919>	2024-06-15 14:57:59 +00:00
Jianxun Zhang	8aa0373a50	blorp: Scaledown rectangle of MSAA fast clear (xe2) The scaledown rectangle of MSAA fast clear on Xe2 is 8 times in X and 2 in Y dimension of previous platforms. Absorb refactoring change suggested by Nanley Chery <nanley.g.chery@intel.com> Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28919>	2024-06-15 14:57:59 +00:00
Jianxun Zhang	4b64b04963	isl: Add AUX MCS encoding into aux modes (xe2) Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28919>	2024-06-15 14:57:59 +00:00
Jianxun Zhang	765fb3e158	isl: Add a heading 4KB to MCS surface (xe2) Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28919>	2024-06-15 14:57:59 +00:00
Lionel Landwerlin	13dc2a28ce	intel/fs: fix lower_simd_width for MOV_INDIRECT MOV_INDIRECT picks one lane from the src[0] and moves it to all lanes in the destination. Even if we split the instruction, src[0] should remain identical. Noticed this while trying to use this instruction in SIMD32. All current use cases are limited to SIMD8 shaders (or SIMD16 on Xe2). Or maybe in SIMD32 but with a uniform src[0]. That's why we think we've never seen the issue so far. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28036>	2024-06-14 22:21:26 +00:00
Lionel Landwerlin	86813c60a4	mi-builder: add read/write memory fencing support on Gfx20+ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	3b88a77b45	genxml: add MI_MEM_FENCE for Gfx20 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	5b4278ccd8	anv: use new mi-builder write check API to avoid stalls Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	59f11ef774	anv: set query mi-builder mocs only once Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	4f50cc12b9	anv: use default mocs for memory bits only touched by CS Since we don't need to share that data with other fixed functions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	c343cfc8b1	anv: move more MI_SDI to mi_builder Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	d056f36fab	anv: use the new relocated write mi-builder api Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	3e4f6def87	anv: centralize mi_builder setup Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	243ced4eb2	mi-builder: add a write check parameter All the MI_SDI currently have forced write checks (meaning the command streamer will stall until completion) on Gfx12.0+. Now on Gfx12.0/12.5, the read commands have implicit waits on previous writes (BSpec ). So if we're only dealing with CS writes & reads, we don't need forced write checks. In the few cases where CS is writing data for other bits of HW, we need the forced write checks. This change adds an API that will let the driver decide when to enable forced write checks. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	a623760f82	mi-builder: add relocated register/memory writes For when you want to write a value to a register or memory but don't yet know that value when you emit the command. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	775db77baf	mi-builder: add missing write completion check Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	8ecc2ff56d	mi-builder: make instruction pointer manipulation more obvious Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	634c7b097b	mi-builder: c++ warning fix Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	eef1a5b607	mi-builder: rename relocated api It wasn't clear what this was doing. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	49d2d25e24	anv: make device initialization more asynchronous With this change, the engine initialization batches are built and submitted at vkCreateDevice() but the function doesn't wait for them to complete. Instead we wait at vkDestroyDevice() or, whenever another submission happens on the queue, we check whether the initialization batch has completed (without waiting) and free it if completed. Seems to be about a 25% reduction in vkCreateDevice() time. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>	2024-06-13 08:29:25 +00:00
Lionel Landwerlin	729c0b54b6	anv: use reserved array pool for legacy custom border colors The array pool does a single allocation and then splits it out. The downside is that the pool is not lockless, but for border colors that likely doesn't matter much, as there is a maximum of 4k border colors. Seems to be a 30% time reduction for vkCreateDevice(). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>	2024-06-13 08:29:25 +00:00
Lionel Landwerlin	7da5b1caef	anv: move trtt submissions over to the anv_async_submit We can remove a bunch of TRTT specific code from the backends as well as manual submission tracking. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>	2024-06-13 08:29:25 +00:00
Lionel Landwerlin	1adafbddbd	anv: rework utrace submission We want to make this more generic so that it can be reused for device initialization as well as TRTT submissions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>	2024-06-13 08:29:25 +00:00
Lionel Landwerlin	dd19e4240e	anv: reuse setup_execbuf_fence_params for utrace submissions Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>	2024-06-13 08:29:25 +00:00
Lionel Landwerlin	8c7e1052a3	anv: simplify TRTT initialization Drop usage of pthread mutex so initialization never fails. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>	2024-06-13 08:29:25 +00:00
Sviatoslav Peleshko	5ca51156e2	intel/elk: Actually retype integer sources of sampler message payload According to PRMs: "All parameters are of type IEEE_Float, except those in the The ld*, resinfo, and the offu, offv of the gather4_po[_c] instruction message types, which are of type signed integer." Currently, we load parameters with the correct types, but use them as send sources with the default float type, which may confuse passes downstream. Fix this by actually storing the retyped sources. Cc: mesa-stable Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29581>	2024-06-12 18:59:17 +00:00
Sviatoslav Peleshko	2358c997f3	intel/brw: Actually retype integer sources of sampler message payload According to PRMs: "All parameters are of type IEEE_Float, except those in the The ld*, resinfo, and the offu, offv of the gather4_po[_c] instruction message types, which are of type signed integer." Currently, we load parameters with the correct types, but use them as send sources with the default float type, which may confuse passes downstream. Fix this by actually storing the retyped sources. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11118 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29581>	2024-06-12 18:59:17 +00:00
Lionel Landwerlin	99f92dd6d3	anv: ensure completion of surface state copies before secondaries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29671>	2024-06-12 10:06:05 +00:00
Lionel Landwerlin	1851629407	anv: limit aux invalidations to primary command buffers The AUX-TT is only updated on the CPU since `ee6e2bc4a3` ("anv: Place images into the aux-map when safe to do so"). So the only really important invalidation that needs to happen is at the beginning of a primary command buffer. We are required to idle the pipes prior to invalidating the AUX-TT. This might not be happening when the invalidation is put at the beginning of the secondary command buffers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29671>	2024-06-12 10:06:05 +00:00
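
The defs-based CSE commits above (234c45c929, 8f09c58ddc) describe two ideas: because an SSA value is written only once, a redundant instruction can reuse the earlier value directly instead of emitting a copy, and lookups go through a hash table rather than a linear list walk. The sketch below illustrates just those two ideas on a toy IR. It is not Mesa code: `Inst`, `InstHash`, and `cse_remap` are hypothetical names, the program is a single straight-line block (so no dominance checks are needed), and flag writes and non-SSA VGRFs are ignored entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical toy IR, not the brw backend's: instruction i defines SSA
// value i. A source >= 0 names an earlier def; a source < 0 names an
// external input that is never rewritten.
struct Inst {
   int opcode;
   int src0;
   int src1;

   bool operator==(const Inst &o) const {
      return opcode == o.opcode && src0 == o.src0 && src1 == o.src1;
   }
};

struct InstHash {
   std::size_t operator()(const Inst &i) const {
      std::size_t h = std::hash<int>()(i.opcode);
      h = h * 31 + std::hash<int>()(i.src0);
      h = h * 31 + std::hash<int>()(i.src1);
      return h;
   }
};

// For each instruction, return the index of the def that can stand in for
// it. remap[i] == i means the instruction is kept; anything else means it
// is redundant with an earlier def and its uses can point there directly.
std::vector<int> cse_remap(const std::vector<Inst> &prog)
{
   std::unordered_map<Inst, int, InstHash> seen;
   std::vector<int> remap(prog.size());

   for (std::size_t i = 0; i < prog.size(); i++) {
      Inst key = prog[i];

      // Rewrite sources through earlier replacements first, so chains of
      // redundant expressions collapse in one walk.
      if (key.src0 >= 0) {
         assert(static_cast<std::size_t>(key.src0) < i);
         key.src0 = remap[key.src0];
      }
      if (key.src1 >= 0) {
         assert(static_cast<std::size_t>(key.src1) < i);
         key.src1 = remap[key.src1];
      }

      // emplace() leaves an existing entry untouched, so the first def of
      // each expression wins and later duplicates map to it.
      auto result = seen.emplace(key, static_cast<int>(i));
      remap[i] = result.first->second;
   }
   return remap;
}
```

With a program that computes the same ADD twice and then multiplies each result by the same input, the second ADD maps to the first, and so does the second MUL once its source has been rewritten. No copies are generated along the way, which is the property the commit message contrasts with the old pass.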


15202 commits