The backend only recognizes the MUL a*0.5 pattern if one of the sources
is an immediate; however, by that point the source would already be
converted to RC_FILE_NONE with constant half swizzles. So teach
peephole_omod to recognize this pattern as well.
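A minimal sketch of the check, assuming the rc_src_register layout of
the r300 compiler (the helper name and exact channel handling are
illustrative, not the actual peephole_omod code):

/* Does this source read 0.5 in every channel selected by the writemask?
 * Sketch only; rc_src_reads_half is a hypothetical helper name. */
static bool rc_src_reads_half(struct rc_src_register src, unsigned writemask)
{
   if (src.File != RC_FILE_NONE || src.Negate || src.Abs)
      return false;
   for (unsigned chan = 0; chan < 4; chan++) {
      if (!(writemask & (1 << chan)))
         continue;
      if (GET_SWZ(src.Swizzle, chan) != RC_SWIZZLE_HALF)
         return false;
   }
   return true;
}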
RV530:
total instructions in shared programs: 128860 -> 128750 (-0.09%)
instructions in affected programs: 11942 -> 11832 (-0.92%)
helped: 106
HURT: 17
total presub in shared programs: 8739 -> 8736 (-0.03%)
presub in affected programs: 32 -> 29 (-9.38%)
helped: 3
HURT: 0
total omod in shared programs: 427 -> 1212 (183.84%)
omod in affected programs: 38 -> 823 (2065.79%)
helped: 0
HURT: 160
total temps in shared programs: 17544 -> 17554 (0.06%)
temps in affected programs: 70 -> 80 (14.29%)
helped: 0
HURT: 10
total lits in shared programs: 3153 -> 3159 (0.19%)
lits in affected programs: 9 -> 15 (66.67%)
helped: 0
HURT: 6
total cycles in shared programs: 191334 -> 191253 (-0.04%)
cycles in affected programs: 21240 -> 21159 (-0.38%)
helped: 101
HURT: 27
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28784>
Consider the following case:
0: MUL temp[1].y, input[0]._x__, input[1]._y__;
1: MOV temp[1].x, input[0].x___;
2: MOV temp[1].z, const[0].__x_;
3: MUL temp[2].xyz, const[1].xxx_, temp[1].yxz_;
...
We correctly recognize that we can fold the MUL into an omod on all
three preceding instructions; however, the MUL swizzle was not handled
correctly:
0: MUL temp[2].y / 2, input[0]._x__, input[1]._y__;
1: MOV temp[2].x / 2, input[0].x___;
2: MOV temp[2].z / 2, const[0].__x_;
...
Just create the conversion swizzle from the initial MUL swizzle when
rewriting the original instructions' writemasks.
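A minimal sketch of that remap, assuming the rc_instruction layout of
the r300 compiler (variable names are illustrative):

/* Build the channel remap from the final MUL's source swizzle:
 * conv[i] = which dst channel is fed by temp channel i. */
unsigned conv[4] = {0};
for (unsigned chan = 0; chan < 4; chan++) {
   if (mul->U.I.DstReg.WriteMask & (1 << chan))
      conv[GET_SWZ(mul->U.I.SrcReg[0].Swizzle, chan)] = chan;
}
/* When rewriting a writer of the temp, push its writemask through conv,
 * e.g. the instruction writing temp[1].y above must now write temp[2].x. */
unsigned new_mask = 0;
for (unsigned chan = 0; chan < 4; chan++) {
   if (writer->U.I.DstReg.WriteMask & (1 << chan))
      new_mask |= 1u << conv[chan];
}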
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28784>
We add a cycle penalty when we see a TEX block begin and then subtract
from it based on when the first ALU instruction that needs the results
arrives. However, if the only instruction in the TEX block is a KIL, we
don't have to add any penalty, as nothing waits for its result.
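A minimal sketch of the rule, with hypothetical types and names rather
than the actual r300 scheduler code:

struct tex_block { unsigned num_insts; unsigned first_opcode; };
#define TEX_PENALTY 30 /* illustrative value */

static unsigned tex_block_penalty(const struct tex_block *b)
{
   /* A KIL-only block has no consumers: nothing ever waits on its
    * result, so charge no latency penalty up front. */
   if (b->num_insts == 1 && b->first_opcode == RC_OPCODE_KIL)
      return 0;
   return TEX_PENALTY;
}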
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28784>
There is a hole between SE1 and SE2 occupied by COMPUTE_TMPRING_SIZE.
Fixes: 3c8b48e310 ("ac,radeonsi: add a function to initialize compute preambles")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29622>
This will let us reuse the bulk of this code in a new copy propagation
pass without replicating it. We retain a wrapper function for dealing
with ACP entries, which the new pass won't have.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
This was for Sandybridge's IF with embedded comparison, which only
existed for a single generation of hardware. Since the compiler fork,
we no longer support Sandybridge here.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
We were missing the following "newer" fields:
- ex_desc
- predicate_trivial
- sdepth
- rcount
- writes_accumulator
- no_dd_clear
- no_dd_check
- check_tdr
- send_is_volatile
- send_ex_desc_scratch
- send_ex_bso
- last_rt
- keep_payload_trailing_zeroes
- has_packed_lod_ai_src
We can actually just check ex_desc and the new "bits" union to handle
most of them with fewer checks.
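A minimal sketch of that shape; the exact layout and naming of the
union are an assumption here, based only on the field list above:

#include <string.h>

/* Sketch only: with the flag bits gathered in one "bits" union, most of
 * the per-flag comparisons collapse into two checks.  fs_inst is the
 * brw IR instruction class. */
static bool inst_flags_match(const fs_inst *a, const fs_inst *b)
{
   return a->ex_desc == b->ex_desc &&
          memcmp(&a->bits, &b->bits, sizeof(a->bits)) == 0;
}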
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
I want to be able to hash an fs_reg, including all the brw_reg fields.
It's easiest to do this if I can use the "bits" union field that
incorporates many of the other ones.
We also move the using declaration for "nr" down because that field was
moved to the second section a while back.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
Pass in the nir_src and check if it's constant, handling it via CPU-side
arithmetic instead of emitting instructions. While we can constant fold
these via our optimization passes, we have to do opt_algebraic to fold
the binary operation with constant sources into a MOV of an immediate,
then opt_copy_propagation to put it in the next expression, and so on,
until the entire expression is folded. This can take several iterations
of the optimization loop, which is inefficient.
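A minimal sketch of the idea (the surrounding names are hypothetical;
only the nir_src_is_const/nir_src_as_uint helpers are real NIR API):

/* Fold a constant source on the CPU instead of emitting an ADD. */
if (nir_src_is_const(instr->src[0])) {
   const unsigned const_offset = base_offset + nir_src_as_uint(instr->src[0]);
   /* ...use the immediate offset directly in the message... */
} else {
   /* ...emit the ADD against the non-constant source as before... */
}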
For example, gfxbench5/aztec-ruins/normal/7 has load/store_scratch
intrinsics with constant sources, and this patch eliminates a number of
optimization pass iterations according to INTEL_DEBUG=optimizer.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
Geometry shaders write outputs multiple times, with EmitVertex()
between them. The value of output variables becomes undefined after
calling EmitVertex(), so we don't need to preserve them. This lets us
create fresh registers after each EmitVertex(), assuming we aren't in
control flow, allowing them to have separate live ranges. It also
means that those registers are more likely to be written once, rather
than having multiple writes, which can make optimization easier.
This is pretty much a total hack, but it's helpful.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
This is faster for 8 samples because it forms a VMEM clause, unlike
the default shader.
It also uses 16-bit types in the shader when possible and averages fewer
components if the format has fewer than 4.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
The target is 8-16B per lane regardless of the format and number of
samples. This is needed to fully utilize the memory bandwidth instead
of only a small fraction of it. These are optimal numbers identified by
benchmarking.
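For illustration (single-sample arithmetic only): with a 4 bytes-per-pixel
format such as RGBA8, that works out to 2-4 pixels per lane, while a
16 bytes-per-pixel format such as RGBA32F gets a single pixel per lane.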
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
This merges the separate MSAA, downsampling, upsampling, and non-MSAA blocks.
It's not meant to change behavior, but some changes are necessary:
- disallow 16 samples
- loads only load the number of components that we need
- optimization barriers are placed optimally and include the sample index
in the same vector as the coordinates, so that LLVM is forced to form VMEM
clauses for loads and stores
- the shader queries the descriptor for the dst image manually and passes
it to the image store instead of the image variable (this is needed to get
latency hiding for scalar loads in the presence of optimization barriers)
This is a prerequisite for blitting multiple pixels per lane.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
This generates less shader code if only one of the axes needs clamping.
Also use util_is_box_out_of_bounds instead of open-coding the check.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
It's faster and handles more cases.
This is mostly the same code as the old version, but it calls
si_compute_blit at the end.
A later commit will remove the old version, so that there is no code
duplication.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>