fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-09 23:30:13 +01:00

Author	SHA1	Message	Date
Kevin Chuang	7b526de18f	intel/compiler/rt: Calculate barycentrics on demand This commit moves the calculation of tri_bary out of brw_nir_rt_load_mem_hit_from_addr(), and only do the calculation on demand, since unorm_float_convert can be expensive. We do this for both Xe1/2 and Xe3+ for consistency. Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>	2025-04-21 20:10:45 +00:00
Sagar Ghuge	afc23dffa4	intel/compiler: Update MemHit data structure to 64-bit version Rework (Kevin): - Fix inst leaf ptr - Handle 24bit unorm barycentric coord Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>	2025-04-21 20:10:45 +00:00
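The 24-bit unorm barycentric handling mentioned in `afc23dffa4` can be pictured with a minimal sketch. It assumes the conventional unorm definition (the integer divided by 2^24 - 1); the helper name `unorm24_to_float` is hypothetical and is not taken from the Mesa tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only. Assumes the usual unorm
 * convention: 0 maps to 0.0 and the all-ones 24-bit value maps to 1.0. */
static inline float
unorm24_to_float(uint32_t bits)
{
   const uint32_t max24 = 0x00ffffffu;   /* 2^24 - 1 */

   assert(bits <= max24);                /* caller masks off the upper bits */
   return (float)bits / (float)max24;
}
```

A division like this for every hit is the kind of cost that `7b526de18f` avoids by computing the barycentrics only when a shader actually reads them.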
Kevin Chuang	40fb95d51a	intel/compiler: Use 24bits for hit_kind on Xe3+ For Xe3+, the upper 8 bits of the second dword of a potential hit are used to store hitGroupIndex0, which is stuffed by the HW. This hitGroupIndex0 will later be used by the HW again to reconstruct the whole hitGroupIndex when the driver issues a TRACE_RAY_COMMIT. We were corrupting this hitGroupIndex0 in the driver by setting the whole dword to hit_kind, which causes the HW to read a wrong hitGroupIndex and therefore invoke the wrong closest hit shader. The behavior can be seen in dEQP-VK.ray_tracing_pipeline.pipeline_no_null_shaders_flag.gpu.boxes.* and dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.* This commit changes the driver to use only the lower 24 bits to store the hit_kind, and leaves the upper 8 bits as they are. Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>	2025-04-21 20:10:45 +00:00
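The fix in `40fb95d51a` amounts to a masked update of one dword. A minimal sketch, assuming the layout given in the commit message (upper 8 bits owned by the hardware, lower 24 bits holding hit_kind); `store_hit_kind` and `HIT_KIND_MASK` are hypothetical names, not the driver's:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define HIT_KIND_MASK 0x00ffffffu   /* lower 24 bits: hit_kind */

/* Write hit_kind into the lower 24 bits of dword while leaving the upper
 * 8 bits (hitGroupIndex0, written by the hardware) untouched. */
static inline uint32_t
store_hit_kind(uint32_t dword, uint32_t hit_kind)
{
   assert((hit_kind & ~HIT_KIND_MASK) == 0);   /* must fit in 24 bits */
   return (dword & ~HIT_KIND_MASK) | (hit_kind & HIT_KIND_MASK);
}
```

Overwriting the whole dword instead would zero the hardware-owned byte, which is the corruption the commit message describes.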
Sagar Ghuge	64fd66407b	intel/compiler: Pass around intel_device_info parameter in helper This will help us to handle code path separately for Xe3+ for updated 64bit memory data structure for RT. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>	2025-04-21 20:10:45 +00:00
Sagar Ghuge	6deb1950a4	anv: Update RT dispatch globals to use 64bit data structure Rework (Kevin) - Fix Hit/Miss/Resume shader group table value Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>	2025-04-21 20:10:45 +00:00
Sagar Ghuge	fcd5fe4a75	intel/genxml/xe3: Update 3STATE_BTD field Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>	2025-04-21 20:10:45 +00:00
Sushma Venkatesh Reddy	4084527876	intel/compiler: Always run opt_algebraic after descriptor_lowering This change ensures that `brw_opt_algebraic` is always executed after `brw_lower_send_descriptors` in `brw_opt.cpp`. By doing so, redundant logical operations are optimized, resulting in cleaner and more compact assembly output. fossil-db results on LNL: - Totals: - Instructions: 215857290 -> 215857028 (-0.00%) - Cycle count: 32008929636 -> 32008935384 (+0.00%); split: -0.00%, +0.00% - Max live registers: 66940643 -> 66940557 (-0.00%) - Affected shaders (104 out of 713963): - Instructions: 31090 -> 30828 (-0.84%) - Cycle count: 5955908 -> 5961656 (+0.10%); split: -0.16%, +0.26% - Max live registers: 10888 -> 10802 (-0.79%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34615>	2025-04-19 07:05:54 +00:00
Iván Briano	949d2e507d	anv: expose promoted KHR_depth_clamp_zero_one Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34614>	2025-04-18 21:31:37 +00:00
Rohan Garg	a5033c54e7	anv: use the common function for detecting a mesh shader stage Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34604>	2025-04-18 10:08:22 +00:00
Rohan Garg	9b477eea19	intel/compiler: use an immediate when doing the shift We can pass immediates to SHL and don't need to allocate a separate register here. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34604>	2025-04-18 10:08:22 +00:00
Konstantin Seurer	2dee1117b7	vulkan: Add a vk_device parameter to get_encode_key Useful for selecting different encoding options based on hardware generation. Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>	2025-04-17 20:20:40 +00:00
Caio Oliveira	fd0a7efb5a	spirv, nir: Delay calculation of shared_size when using explicit layout Move the calculation to nir_lower_vars_to_explicit_types(). This consolidates the check of shader_info::shared_memory_explicit_layout in a single place instead of in all drivers. This is motivated by SPV_KHR_untyped_pointers. Before that extension we had essentially two modes for shared memory variables - No layout decorations in the SPIR-V, and both internal layout and driver location was _given by the driver_. - Explicitly laid out, i.e. they are blocks, and decorated with Aliased. Because they all alias, we could assign them driver location directly to the start of the shared memory. With the untyped pointers extension, there's a third option, to be added by a later commit - Explicitly laid out, i.e. they are blocks, and NOT decorated with Aliased. Driver location is _given by the driver_. Blocks with and without Aliased can be mixed. The driver location of multiple blocks that don't alias depend on alignment that is driver-specific, which we can more easily do from the nir_lower_vars_to_explicit_types() that already has access to a function to obtain such value. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> (hk) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3dv) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (anv/hasvk) Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (panvk) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv) Reviewed-by: Rob Clark <robdclark@gmail.com> (tu) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34139>	2025-04-17 19:13:17 +00:00
José Roberto de Souza	a96e280dfe	intel: Program XY_FAST_COLOR_BLT::Destination Mocs for gfx12 Copy engine is not used in gfx12 platforms on ANV but that is possible in Iris. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34560>	2025-04-17 18:11:44 +00:00
Rohan Garg	cbc1ec4f73	anv: re enable compression for CPS surfaces on platforms other than Xe I accidentally disabled compression on CPS surfaces marked as storage or color attachment for all platforms, when this should only be limited to Xe. Fixes: 80f9b6 ('anv: CPB surfaces that are used as color attachments or for stores cannot be compressed') Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34297>	2025-04-17 14:24:11 +00:00
Daniel Stone	8d08cde667	ci/piglit: Use structured tagging for Piglit Structured tagging (cf. mesa/mesa!33421) captures a checksum of the thing we think we're building, and verifies this through the chain. When we run container builds, we check that the tag we've captured in the CI variables matches the calculated checksum, to make sure the declared tags are consistent and we always have traceability. When we run tests, we check the tags again between what was declared in the CI variables and what we're actually running from the test container. This makes sure that we're always testing what we think we're testing. As a side advantage, the rule inheritance we need to make this work means that we can start doing more optional downloads via overlays, instead of pulling a whole container full of stuff we might not ever use. Signed-off-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34539>	2025-04-17 09:22:39 +00:00
Caio Oliveira	d5ad798140	spirv, radv, intel: Add NIR intrinsic for cmat conversion A cooperative matrix conversion operation was represented in NIR by the cmat_unary_op intrinsic with an nir_alu_op as extra parameter, that was already lowered to a specific conversion operation based on the matrix types. Instead of that, add a new intrinsic `cmat_convert` that is specific for that conversion. In addition to the src/dst matrix descriptions already available, also include the signedness information in the intrinsic (reuse nir_cmat_signed for that). This is needed because different Convert operations define different interpretations for integers, regardless their original type. In this patch, both radv and intel were changed to use the same logic that was previously used to pick the lowered ALU op. This change will help represent cmat conversions involving BFloat16, because it avoids having to create new NIR ALU ops for all the combinations involving BFloat16. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34511>	2025-04-16 23:13:36 +00:00
Valentine Burley	2f02fa5db4	intel/ci: Start using the new 6.14 kernel on JSL The new 6.14 kernel in gfx-ci linux is now stable on Jasper Lake, so drop the kernel override from the anv-jsl-vk job. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34553>	2025-04-16 22:13:19 +00:00
Ian Romanick	e783930b10	elk/algebraic: Don't optimize float SEL.CMOD to MOV Floating point SEL.CMOD may flush denorms to zero. We don't have enough information at this point in compilation to know whether or not it is safe to remove that. Integer SEL or SEL without a conditional modifier is just a fancy MOV. Those are always safe to eliminate. See also `3f782cdd25`. Fixes: `fab92fa1cb` ("i965/fs: Optimize SEL with the same sources into a MOV.") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>	2025-04-15 23:59:31 +00:00
Ian Romanick	f4ede9c10a	elk/algebraic: Clear condition modifier on optimized SEL instruction The condition modifier on SEL means something completely different than it means on MOV. On MOV it means to modify the flags based on the value written to the destination. On SEL it means to compare the sources using that mode and pick the result (i.e., as min() or max()) without modifying the flags. The resulting MOV should not have a condition modifier for the same reason it (already) doesn't have a predicate. This bug was found by inspection, so I added a unit test. Fixes: `fab92fa1cb` ("i965/fs: Optimize SEL with the same sources into a MOV.") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>	2025-04-15 23:59:31 +00:00
Ian Romanick	6a19d8915f	brw/algebraic: Don't optimize float SEL.CMOD to MOV Floating point SEL.CMOD may flush denorms to zero. We don't have enough information at this point in compilation to know whether or not it is safe to remove that. Integer SEL or SEL without a conditional modifier is just a fancy MOV. Those are always safe to eliminate. See also `3f782cdd25`. Fixes: `fab92fa1cb` ("i965/fs: Optimize SEL with the same sources into a MOV.") No shader-db changes on any Intel platform. fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 209903490 -> 209903492 (+0.00%) Cycle count: 30546025224 -> 30546021980 (-0.00%); split: -0.00%, +0.00% Max live registers: 65516231 -> 65516235 (+0.00%) Totals from 2 (0.00% of 706657) affected shaders: Instrs: 3197 -> 3199 (+0.06%) Cycle count: 361650 -> 358406 (-0.90%); split: -10.05%, +9.15% Max live registers: 300 -> 304 (+1.33%) Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>	2025-04-15 23:59:31 +00:00
Ian Romanick	07dc1d4043	brw/algebraic: Clear condition modifier on optimized SEL instruction The condition modifier on SEL means something completely different than it means on MOV. On MOV it means to modify the flags based on the value written to the destination. On SEL it means to compare the sources using that mode and pick the result (i.e., as min() or max()) without modifying the flags. The resulting MOV should not have a condition modifier for the same reason it (already) doesn't have a predicate. This bug was found by inspection, so I added a unit test. No shader-db or fossil-db changes on any Intel platform. Fixes: `fab92fa1cb` ("i965/fs: Optimize SEL with the same sources into a MOV.") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>	2025-04-15 23:59:31 +00:00
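The rules spelled out in the four SEL commits above (`e783930b10`, `f4ede9c10a`, `6a19d8915f`, `07dc1d4043`) can be sketched on a toy instruction record. Everything here (`toy_inst`, the enums, `toy_opt_sel_same_src`) is hypothetical and far simpler than the real brw/elk IR; it only illustrates the decision logic the messages describe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy IR, for illustration only; not the brw/elk data structures. */
enum toy_opcode { TOY_SEL, TOY_MOV };
enum toy_cmod { TOY_CMOD_NONE, TOY_CMOD_GE, TOY_CMOD_L };

struct toy_inst {
   enum toy_opcode op;
   enum toy_cmod cmod;
   bool is_float;
   int src0, src1;   /* register numbers */
};

/* Turn "SEL dst, x, x" into "MOV dst, x" when that is safe.
 * Returns true if the instruction was rewritten. */
static bool
toy_opt_sel_same_src(struct toy_inst *inst)
{
   assert(inst != NULL);

   if (inst->op != TOY_SEL || inst->src0 != inst->src1)
      return false;

   /* A float SEL with a condition modifier may flush denorms to zero,
    * so it is not equivalent to a plain MOV and is left alone. */
   if (inst->is_float && inst->cmod != TOY_CMOD_NONE)
      return false;

   inst->op = TOY_MOV;
   /* The modifier selects min/max on SEL but would write flags on MOV,
    * so it must not be carried over to the new instruction. */
   inst->cmod = TOY_CMOD_NONE;
   return true;
}
```

An integer `SEL.ge` with identical sources becomes a plain MOV with no modifier, while the float variant with a modifier is returned unchanged.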
Caio Oliveira	fafdd24285	intel/executor: Update bfloat example Elaborate on the packed/unpack restrictions, use ADD(x, 0.0f) as a workaround for F->BF conversion. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>	2025-04-14 18:23:43 +00:00
Caio Oliveira	fbe5d559bd	brw: Update EU validation to allow packed BF mixed with packed F Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>	2025-04-14 18:23:43 +00:00
Caio Oliveira	d1dd088ede	brw: Allow DPAS with BF on Gfx125 MTL doesn't support, but both ACM and ARL-H do. Fixes: `e384ccde28` ("brw: Expand EU validation for DPAS") Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>	2025-04-14 18:23:43 +00:00
Caio Oliveira	050acb9def	intel: Disable has_bfloat16 for MTL Not supported. Some operations do work, but proper support was removed since it also doesn't support DPAS. Fixes: `9916cc1050` ("brw: Add BRW_TYPE_BF for bfloat16") Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>	2025-04-14 18:23:43 +00:00
Caio Oliveira	adfab666a4	intel: Add intel_device_info::has_systolic Gfx125+ has systolic, with exception for MTL and some ARL variants. Update code and tests to use it. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>	2025-04-14 18:23:43 +00:00
Konstantin Seurer	cb31b5a958	clc,libcl: Clean up CL includes This patch does a couple of things to make CL integration with drivers as seamless as possible: - We pull in opencl-c.h and opencl-c-base.h to stop relying on system headers. - Parts of libcl.h are moved to new headers that are incomplete CL-safe variants of libc headers. - A couple of util headers are changed to remove now unnecessary __OPENCL_VERSION__ guards and make more headers CL safe. - Drivers now include src/compiler/libcl and use headers like macros.h,u_math.h instead of libcl.h. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33576>	2025-04-11 21:27:37 +00:00
Kenneth Graunke	eb1ec9cf8e	brw: Don't assert about MAX_VGRF_SIZE in brw_opt_split_virtual_grfs() This allows us to create temporary VGRFs that are larger than MAX_VGRF_SIZE(devinfo), which will be split eventually. They may not be split on the initial pass, because we may need LOAD_PAYLOAD lowering, copy propagation, and so on to occur first. So we allow registers to exceed that size initially. The "Register allocation relies on split_virtual_grfs()" assertion in brw_reg_allocate.cpp still asserts that all VGRFs which reach the register allocator have been properly split. One case where this is useful is for vectorizing convergent block loads. We create temporaries to splat the SIMD1 values out to SIMD(N), which can lead to some very large temporaries. However, copy propagation and so on ultimately eliminate these and they'll get split down to proper sizes or elided entirely in the end. (Note: both this and the prior commits from this merge request are needed to close the linked issue.) Cc: mesa-stable Reviewed-by: Matt Turner <mattst88@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12324 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>	2025-04-11 20:34:51 +00:00
Kenneth Graunke	a45583f078	brw: Use live->max_vgrf_size in pre-RA scheduling Post-RA scheduling doesn't use liveness analysis, so we continue using MAX_VGRF_SIZE(devinfo). But for pre-RA scheduling, we now use live->max_vgrf_size. This helps get us to a place where we can emit arbitrarily large VGRFs early on in compilation, but which will be split and cleaned up prior to register allocation. It may also allocate smaller arrays in practice since MAX_VGRF_SIZE(devinfo) assumes the worst case scenario for things we actually could need to allocate. Cc: mesa-stable Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>	2025-04-11 20:34:51 +00:00
Kenneth Graunke	4b27b5895c	brw: Use live->max_vgrf_size in register coalescing We already require liveness, so just use the actual maximum size we saw instead of a hardcoded pessimal size. Cc: mesa-stable Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>	2025-04-11 20:34:51 +00:00
Kenneth Graunke	ea468412f6	brw: Track the largest VGRF size in liveness analysis We're already looking at this data to calculate the per-component vars_from_vgrf[] and vgrf_from_vars[] mappings, so just record the largest VGRF size while we're here. This will allow passes to size arrays based on the actual size needed, rather than hardcoding some fixed size. In many cases, MAX_VGRF_SIZE(devinfo) is larger than necessary, because e.g. vec5 sparse sampling results aren't used. Not hardcoding this means we can also temporarily handle very large VGRFs which we know will be split eventually, without having to increase the maximum which is ultimately used for RA classes. Cc: mesa-stable Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>	2025-04-11 20:34:51 +00:00
José Roberto de Souza	68a617076d	intel/perf: Update intel_perf to match xe_drm.h There was a mismatch between drm-next version of xe_drm.h and the one in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142. So this does the necessary changes to build with current and new xe_drm.h Fixes: `2a828c35a1` ("intel/perf: add eu stall sampling support") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34457>	2025-04-11 18:35:49 +00:00
Lionel Landwerlin	243c01c703	anv/iris: implement Wa_18040903259 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>	2025-04-11 13:54:35 +00:00
Lionel Landwerlin	d123aedfc7	anv: remove ALWAYS_INLINE from globally visible functions Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>	2025-04-11 13:54:35 +00:00
Lionel Landwerlin	bcaf08b47c	intel/dev: remove ADLN references Not used anymore, just use the existing ADL definitions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>	2025-04-11 13:54:35 +00:00
Lionel Landwerlin	938f79ed82	anv: update Wa_1607156449 to use WA infrastructure Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>	2025-04-11 13:54:35 +00:00
Valentine Burley	b49eaf0966	ci/lava: Consolidate piglit trace job definitions Clean up LAVA job definitions. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>	2025-04-11 07:05:07 +00:00
Valentine Burley	1aeedddbb6	ci/piglit: Drop redundant PIGLIT_PROFILES variable PIGLIT_PROFILES was only used with the piglit-runner.sh script, which no jobs were using anymore. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>	2025-04-11 07:05:06 +00:00
Valentine Burley	09f86df938	intel/ci: Convert iris-kbl-piglit to deqp-runner suite This was the last job using the piglit-runner.sh script. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>	2025-04-11 07:05:06 +00:00
Lionel Landwerlin	06ad9a25e5	brw: fix Wa_22013689345 emission 2 problems : - not detecting null destination correctly - applied too late using SHADER_OPCODE_MEMORY_FENCE, when lowering already happened Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34319>	2025-04-10 16:44:28 +00:00
Lionel Landwerlin	e321c438dc	anv: fix self dependency computation Some upcoming changes in the runtime will make it impossible to rely on the pipeline or runtime information to know whether a fragment shader has input attachments. Instead we gather that information at compile time and store it in our shader bind_map. At runtime we check whether the fragment shader has input attachments and whether those map to the runtime depth/stencil input attachments to set the 3DSTATE_PS_EXTRA::PixelShaderKillsPixel. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d2f7b6d5a7` ("anv: implement VK_KHR_dynamic_rendering_local_read") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Paulo Zanoni	fdbdfaed01	anv: add ANV_SYS_MEM_LIMIT for debugging system memory restrictions If you suspect a workload is failing because it needs more memory, you can set ANV_SYS_MEM_LIMIT=100 to give it all the memory available. This could make, for example, certain games start working (it really depends on how much RAM you have and how much the game wants). If you suspect a workload is too resource hungry, you can try to limit it with ANV_SYS_MEM_LIMIT=30 (or some other value) to see if it can deal with the more restricted environment and behave accordingly. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>	2025-04-09 22:48:18 +00:00
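How a percentage override such as `ANV_SYS_MEM_LIMIT` might be applied can be sketched as follows. Only the variable name and the example values 100 and 30 come from the commit message; the accepted range (1 to 100), the fallback behaviour, and all function names are assumptions made for illustration, not the driver's actual parsing code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a percentage string. Returns fallback when s is NULL, is not a
 * plain decimal number, or lies outside 1..100 (assumed range). */
static int
percent_from_string(const char *s, int fallback)
{
   if (s == NULL)
      return fallback;

   char *end;
   long pct = strtol(s, &end, 10);
   if (end == s || *end != '\0' || pct < 1 || pct > 100)
      return fallback;

   return (int)pct;
}

/* Divide first so that very large totals cannot overflow. */
static uint64_t
scale_by_percent(uint64_t total, int pct)
{
   assert(pct >= 0 && pct <= 100);
   return total / 100 * (uint64_t)pct;
}

/* Hypothetical entry point: use the override when set, otherwise the
 * default percentage chosen by the caller. */
static uint64_t
sys_heap_limit(uint64_t total_ram, int default_pct)
{
   int pct = percent_from_string(getenv("ANV_SYS_MEM_LIMIT"), default_pct);
   return scale_by_percent(total_ram, pct);
}
```

With `ANV_SYS_MEM_LIMIT=30` in the environment, this sketch reports 30% of the total regardless of the default passed in.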
Paulo Zanoni	ec4b2ce664	anv: restore the old behavior of up to 75% of RAM for the system heap "We paid for sixteen gigs of RAM, so we gonna use the whole damn sixteen gigs of RAM!" - My Mom First, some history: The Anv 50%-or-75% rule was originally added in 2017 by `060a6434ec` ("anv: Advertise larger heap sizes"). When i915.ko started reporting memory sizes in its ioctls, it didn't impose any restrictions: 100% of SRAM was reported as available, so the restriction was in Mesa. When xe.ko was introduced, it only reported 50% of the SRAM as available through its ioctls, so commit `b571ae6e7a` ("intel: Make memory heaps consistent between KMDs") adapted the code to not take an extra 25% of the 50% that was already cut, and restricted i915.ko to 50% instead of the 50%-or-75%. In Kernel commit d2d5f6d57884 ("drm/xe: Increase the XE_PL_TT watermark"), xe.ko changed to reporting 100% of SRAM through its ioctls, so we adapted Mesa to do the right thing depending on which Kernel version was running. While this was all happening, we were discussing about which behavior was actually the best: restrict everything to 50% in order to avoid issues when many things are running in parallel, or keep the restriction only at 75% in order to allow high demanding workloads to make full use of the hardware. The way I see, if parallel applications are causing the system to run out of resources, the user always has the option to kill applications and use one thing at a time. On the other hand, if a single application needs more than 50% of the SRAM and we don't allow it in our heaps, the application will never work (unless, of course, the user patches Mesa). So in this commit we go back to allowing high-demanding applications to work by restoring the 50%-or-75% rule. This commit is especially useful in systems with integrated graphics, like LNL, where the option to upgrade RAM is not present. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>	2025-04-09 22:48:18 +00:00
Paulo Zanoni	02e896bc49	anv/xe: detect the newer xe.ko memory reporting model and act accordingly Kernel commit d2d5f6d57884 ("drm/xe: Increase the XE_PL_TT watermark") changed how xe.ko reports memory: its ioctls now report 100% of the system RAM as available. Since our policy is to report 50% of the SRAM as available for the heaps, add some code to check the amount reported by xe.ko against the amount reported by the system, then act accordingly. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>	2025-04-09 22:48:18 +00:00
Paulo Zanoni	3db8931d4a	intel/i915: restrict the RAM size restrictions to Anv Before commit `b571ae6e7a` ("intel: Make memory heaps consistent between KMDs"), we had the following policy for reporting System RAM memory sizes: - For OpenGL, we reported the total available RAM. - For Vulkan, we reported the total available RAM as: - 50% of the total RAM if the total RAM was <= 4GB, - 75% otherwise - In addition, the Memory Budget (for VK_EXT_memory_budget) is 90% of the "free" memory, which can be an extra 10% off of the 50% or 75%. When xe.ko was added, one key difference was noted: while i915.ko reported the "real" RAM memory sizes in its ioctls, xe.ko reported only 50% of the system RAM as available. Because of that (and other reasons, see this discussion on MR 28513), commit `b571ae6e7a` decided to unify the behavior by changing the Anv i915.ko rule to "always 50%" instead of "50% or 75%". This also changed the Iris rule to 50% instead of 100%. In my research, I couldn't find any reason why this restriction should also apply to Iris, so here we revert back to handling these size restrictions on Anv only. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>	2025-04-09 22:48:18 +00:00
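The Vulkan policy listed in `3db8931d4a` (50% of RAM at or below 4GB, 75% above, and a budget of 90% of free memory) reduces to a few lines of arithmetic. This is a sketch of that rule only: it treats "4GB" as 4 GiB, which is an assumption, and the function names are hypothetical rather than Anv's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size of the system heap under the 50%-or-75% rule.
 * "4GB" is interpreted as 4 GiB here, which is an assumption. */
static uint64_t
sketch_sys_heap_size(uint64_t total_ram)
{
   const uint64_t four_gib = 4ull * 1024 * 1024 * 1024;
   uint64_t size;

   if (total_ram <= four_gib)
      size = total_ram / 2;        /* 50% on small systems */
   else
      size = total_ram / 4 * 3;    /* 75% otherwise; divide first to avoid overflow */

   assert(size <= total_ram);
   return size;
}

/* Hypothetical helper: budget as 90% of the memory currently free. */
static uint64_t
sketch_heap_budget(uint64_t free_mem)
{
   return free_mem / 10 * 9;
}
```

Under these assumptions, a 16 GiB machine advertises a 12 GiB heap, and a 4 GiB machine advertises 2 GiB.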
Ian Romanick	cb69d019cf	brw/nir: Use offset() for all uses of offs in emit_pixel_interpolater_alu_at_offset This is necessary to appropriately uniformize the first component access of a convergent vector. Without this, this is produced: load_payload(16) %18:D, 0d, 0d NoMask group0 add(32) %21:F, %18+0.0:F, 0.5f add(32) %22:F, %18+2.0<0>:F, 0.5f This is the correct code: load_payload(16) %18:D, 0d, 0d NoMask group0 add(32) %21:F, %18+0.0<0>:F, 0.5f add(32) %22:F, %18+2.0<0>:F, 0.5f Without `38b58e286f`, the code generated was more incorrect, but happened to work for this test case: load_payload(16) %18:D, 0d, 0d NoMask group0 add(32) %21:F, %18+0.0<0>:F, 0.5f add(32) %22:F, %18+0.4<0>:F, 0.5f Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `38b58e286f` ("brw/nir: Fix source handling of nir_intrinsic_load_barycentric_at_offset") Closes: #12969 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34427>	2025-04-09 22:21:18 +00:00
Caleb Callaway	64b5ee3001	intel/tools: fix 32b build for EU stall tool Fixes: `610ad8d3` ("intel/tools: create intel_monitor for sampling eu stalls") Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34439>	2025-04-09 21:40:46 +00:00
Caio Oliveira	7457c4ecfd	brw: Make brw_range use half-open ranges Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>	2025-04-09 19:06:49 +00:00
Caio Oliveira	6509f8139d	brw: Use brw_range::last() to explicitly get the last valid IP This is a preparation to change what is stored in brw_range::end. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>	2025-04-09 19:06:49 +00:00
Caio Oliveira	596bbb2c95	brw: Use brw_range to store Vars ranges Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>	2025-04-09 19:06:49 +00:00
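The half-open convention adopted by the `brw_range` commits above can be sketched in plain C. The struct and helpers below (`ip_range`, `ip_range_last`, and so on) are hypothetical stand-ins written for illustration; the real `brw_range` lives in the C++ compiler sources and may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical half-open range [start, end): end is one past the last
 * valid instruction pointer (IP). */
struct ip_range {
   int start;
   int end;
};

static inline int
ip_range_len(struct ip_range r)
{
   return r.end - r.start;      /* no "+ 1" needed with half-open ranges */
}

static inline bool
ip_range_is_empty(struct ip_range r)
{
   return r.end <= r.start;
}

/* Last valid IP; only meaningful for a non-empty range. */
static inline int
ip_range_last(struct ip_range r)
{
   assert(!ip_range_is_empty(r));
   return r.end - 1;
}

static inline bool
ip_range_contains(struct ip_range r, int ip)
{
   return ip >= r.start && ip < r.end;
}
```

Routing callers through a `last()` accessor first, as `6509f8139d` does, means the later change to what `end` stores does not have to touch each use site.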


13900 commits