fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 15:30:14 +01:00

Author	SHA1	Message	Date
Mike Blumenkrantz	453f49ce6d	lavapipe: move noop fs creation to device this avoids creating a separate noop fs for every pipeline Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21051>	2023-02-02 04:49:42 +00:00
Chia-I Wu	dc7f6c5324	freedreno: support UBWC scanout On sway+xwayland, both explicit and implicit modifiers are advertised. While dri3proto says nothing about it, zwp_linux_dmabuf_v1 says A compositor that sends valid modifiers and DRM_FORMAT_MOD_INVALID for a given format supports both explicit modifiers and implicit modifiers. "glmark2 -b build:model=bunny --fullscreen" goes from 468 to 598fps on a618 @ 2160x1440. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20892>	2023-02-02 04:33:25 +00:00
Chia-I Wu	1cf28bd049	freedreno: add has_implicit_modifier helper Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20892>	2023-02-02 04:33:25 +00:00
Timur Kristóf	1244506c15	nir/opt_algebraic: Add optimization for ieq/ine and right-shift. Fossil DB stats on GFX11: Totals from 1343 (1.00% of 134913) affected shaders: SpillSGPRs: 7145 -> 7137 (-0.11%) CodeSize: 20737744 -> 20739148 (+0.01%); split: -0.02%, +0.03% Instrs: 4010443 -> 4008449 (-0.05%); split: -0.05%, +0.00% Latency: 50021520 -> 50021105 (-0.00%); split: -0.00%, +0.00% InvThroughput: 6354371 -> 6354112 (-0.00%); split: -0.00%, +0.00% VClause: 63035 -> 63038 (+0.00%); split: -0.01%, +0.01% SClause: 121162 -> 121166 (+0.00%) Copies: 251354 -> 251058 (-0.12%); split: -0.18%, +0.06% PreSGPRs: 137283 -> 137299 (+0.01%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20936>	2023-02-02 03:08:19 +00:00
Kenneth Graunke	873dfb673b	anv: Perform load_constant address math in 32-bit rather than 64-bit We lower NIR's load_constant to load_global_constant, which uses A64 bindless messages. As such, we do the following math to produce the address for each load: base_lo@32 <- BRW_SHADER_RELOC_CONST_DATA_ADDR_LOW base_hi@32 <- BRW_SHADER_RELOC_CONST_DATA_ADDR_HIGH base@64 <- pack_64_2x32_split(base_lo, base_hi) addr@64 <- iadd(base@64, u2u64(offset@32)) On platforms that emulate 64-bit math, we have to emit additional code for the 64-bit iadd to handle the possibility of a carry happening and affecting the top bits. However, NIR constant data is always uploaded adjacent to the shader assembly, in the same buffer. These buffers are required to live in a 4GB region of memory starting at Instruction State Base Address. We always place the base address at a 4GB address. So the constant data always lives in a buffer entirely contained within a 4GB region, which means any offsets from the start of the buffer cannot possibly affect the high bits. So instead, we can simply do a 32-bit addition between the low bits of the base and the offset, then pack that with the unchanged high bits. On anv, INSTRUCTION_STATE_POOL_MIN_ADDRESS is 8GB, so the high bits are always 0x2. We don't even need to patch that portion of the address and can just use an immediate value. We do still need to pack, however. fossil-db on Icelake indicates the following for affected shaders: Instrs: 10830023 -> 10750080 (-0.74%) Cycles: 1048521282 -> 1046770379 (-0.17%); split: -0.33%, +0.16% Subgroup size: 103104 -> 103112 (+0.01%) Send messages: 570886 -> 570760 (-0.02%) Loop count: 14428 -> 14429 (+0.01%) Spill count: 14246 -> 14244 (-0.01%); split: -0.06%, +0.04% Fill count: 22802 -> 22794 (-0.04%); split: -0.04%, +0.01% Scratch Memory Size: 654336 -> 662528 (+1.25%) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20999>	2023-02-02 02:45:04 +00:00
Kenneth Graunke	a0e7e7ff41	iris: Perform load_constant address math in 32-bit rather than 64-bit We lower NIR's load_constant to load_global_constant, which uses A64 bindless messages. As such, we do the following math to produce the address for each load: base_lo@32 <- BRW_SHADER_RELOC_CONST_DATA_ADDR_LOW base_hi@32 <- BRW_SHADER_RELOC_CONST_DATA_ADDR_HIGH base@64 <- pack_64_2x32_split(base_lo, base_hi) addr@64 <- iadd(base@64, u2u64(offset@32)) On platforms that emulate 64-bit math, we have to emit additional code for the 64-bit iadd to handle the possibility of a carry happening and affecting the top bits. However, NIR constant data is always uploaded adjacent to the shader assembly, in the same buffer. These buffers are required to live in a 4GB region of memory starting at Instruction State Base Address. We always place the base address at a 4GB address. So the constant data always lives in a buffer entirely contained within a 4GB region, which means any offsets from the start of the buffer cannot possibly affect the high bits. So instead, we can simply do a 32-bit addition between the low bits of the base and the offset, then pack that with the unchanged high bits. On iris, IRIS_MEMZONE_SHADER is at [0, 4GB) so the high bits are always zero. We don't even need to patch that portion of the address and can simply use u2u64 to promote the 32-bit add result to a 64-bit value where the top bits are 0. shader-db on Icelake indicates that this: - Helps instructions: -1.13% in 135 affected programs - Helps spills/fills: -4.08% / -4.18% in 4 affected programs - Gains us 1 SIMD16 compute shader instead of SIMD8 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20999>	2023-02-02 02:45:04 +00:00
Timur Kristóf	95d06343c6	radv: Don't place CS in VRAM when bandwidth is too low. People who use RADV on eGPU have reported poor performance by default. They also noted that the "nosam" option helps. This commit disables placing CS objects in VRAM when the bandwidth is below that of PCIe 3.0 x8. Note that eGPUs are typically PCIe 3.0 x4. Contributes-to: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7340 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20842>	2023-02-02 02:13:10 +00:00
Timur Kristóf	ef668f3714	ac/gpu_info: Add has_pcie_bandwidth_info. This is so that we can tell whether the current kernel has the PCIe bandwidth info available or not. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20842>	2023-02-02 02:13:10 +00:00
Jesse Natalie	d7730fcf22	vulkan/wsi/win32: Support tearing (immediate) and VSync (FIFO) present modes Reviewed-by: Giancarlo Devich <gdevich@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20945>	2023-02-02 01:30:28 +00:00
Jesse Natalie	747604b17c	vulkan/wsi: Add a wsi_device param to get_present_modes The Win32 WSI will want to query capabilities of the device to determine what's available. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20945>	2023-02-02 01:30:28 +00:00
Sagar Ghuge	0c083d29a5	intel/fs: Always stall between the fences on Gen11+ Be conservative in Gfx11+ and always stall in a fence, since there are two different fences and the shader might want to synchronize between them. This change also brings back the original code block for the stall between the fence and comment from the commit `b390ff3517`. v2: (Caio) - Re-arrange code block. - Adjust comment. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6958 Fixes: `f7262462` ("intel/fs: Rework fence handling in brw_fs_nir.cpp") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Tested-by: Mark Janes <markjanes@swizzler.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20996>	2023-02-02 00:21:21 +00:00
Emma Anholt	51ea81c0a1	ci: Fix perf job condition. We were supposed to be checking that the job had "performance" in the name, not that the user (which we already checked is marge) has "performance" in their name. Fixes: `f6c06ef2f6` ("ci: Add manual rules variations to disable irrelevant driver jobs.") Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21002>	2023-02-01 23:53:26 +00:00
Emma Anholt	5d1c693893	ci: Fix perf jobs blocking Marge pipelines. They got accidentally disabled entirely, so they didn't block merge, but once they re-enable then they'll block us again. The problem was that I moved allow_failure to a .performance-rules section, but we only ever inherit the rules from that location, not the rest of yml. This is basically a revert of `67547a04b6` ("ci: Move the performance jobs' allow_failure:true to the gl rules."), though I still keep the allow_failure in a more common location with comments, since perf jobs are a huge trap. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21002>	2023-02-01 23:53:26 +00:00
Samuel Pitoiset	aa68b98b87	radv: remove radv_pipeline_stage::spirv::sha1 This is no longer used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21048>	2023-02-01 23:25:52 +00:00
Samuel Pitoiset	853f8eb930	radv: remove redundant zero initialization of pipeline layout It's already zeroed in radv_pipeline_layout_init(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21048>	2023-02-01 23:25:52 +00:00
Samuel Pitoiset	1f67782eb2	radv: optimize radv_pipeline_layout_add_set() slightly That value is already computed when a descriptor set layout is created. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21048>	2023-02-01 23:25:52 +00:00
Yiwei Zhang	a73a5915fb	venus: log upon device creation Log the deviceName and driverInfo gated behind VN_DEBUG=log_ctx_info Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21030>	2023-02-01 22:04:41 +00:00
Pavel Ondračka	7e6acfd587	nir: mark progress when removing trailing unused load_const channels When the unused channels were at the end and so no reswizzling was needed, we wouldn't correctly mark the progress. Fixes: `3305c960` Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>	2023-02-01 20:33:31 +00:00
Pavel Ondračka	fe56dd9c42	nir: mark progress when removing trailing unused alu channels When the unused channels were at the end and so no reswizzling was needed, we wouldn't correctly mark the progress. Fixes: `cb7f2012` Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>	2023-02-01 20:33:31 +00:00
Pavel Ondračka	ef800da3f7	nir: nir opt_shrink_vectors whitespace fix Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>	2023-02-01 20:33:31 +00:00
Amber	ab4c2990ed	intel/compiler: use lower_image_samples_to_one Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Amber Amber <amber@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20813>	2023-02-01 19:52:49 +00:00
Amber	e8bfb71660	ir3: use lower_image_samples_to_one This is necessary to properly support ARB_shader_texture_image_samples fixes crash in KHR-GL45.shader_texture_image_samples_tests.image_functional_test Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Amber Amber <amber@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20813>	2023-02-01 19:52:49 +00:00
Amber	c384690ab7	nir: support lowering nir_intrinsic_image_samples to a constant load This can be used by multiple drivers that do not support ms images Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Amber Amber <amber@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20813>	2023-02-01 19:52:49 +00:00
Konstantin Seurer	a568a5492f	radv: Fix creating accel structs with unbound buffers If the buffer hasn't been bound to memory yet, we will dereference a NULL pointer in radv_CreateAccelerationStructureKHR. cc: mesa-stable Closes: #8199 Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21019>	2023-02-01 19:31:43 +00:00
Sil Vilerino	37652da616	d3d12: Honor suggested driver profile/level for H264/HEVC encode Fixes some H264 <-> HEVC transcode cases where the wrong level/profile was assigned to the output bitstream Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21043>	2023-02-01 19:17:21 +00:00
Rhys Perry	bfd4ac4581	aco: limit VALUPartialForwardingHazard search Complicated CFG and lots of SALU can cause this to take an extremely long time to finish. Fixes dEQP-VK.graphicsfuzz.cov-value-tracking-selection-dag-negation-clamp-loop and Monster Hunter Rise demo compile times. fossil-db (gfx1100): Totals from 57 (0.04% of 134574) affected shaders: Instrs: 170919 -> 171165 (+0.14%) CodeSize: 860144 -> 861128 (+0.11%) Latency: 961466 -> 961505 (+0.00%) InvThroughput: 127598 -> 127608 (+0.01%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8153 Fixes: `5806f0246f` ("aco/gfx11: workaround VALUPartialForwardingHazard") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20941>	2023-02-01 18:52:40 +00:00
José Roberto de Souza	8092bc2158	intel/ds: Fix crash when allocating more intel_ds_queues than u_vector was initialized u_vector_add() doesn't keep the returned pointers valid. After the initial size allocated in u_vector_init() is reached it will allocate a bigger buffer and copy data from older buffer to the new one and free the old buffer, making all the previous pointers returned by u_vector_add() invalid and crashing the application when trying to access it. This is reproduced when running dEQP-VK.synchronization.signal_order.timeline_semaphore.* in DG2 SKUs that have 4 CCS engines, INTEL_COMPUTE_CLASS=1 is set and of course perfetto build is enabled. To fix this issue here I'm moving the storage/allocation of struct intel_ds_queue to struct anv_queue/iris_batch and using struct list_head to maintain a chain of intel_ds_queue of the intel_ds_device. This allows us to append or remove queues dynamically in future if necessary. Fixes: `e760c5b37b` ("anv: add perfetto source") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20977>	2023-02-01 18:31:29 +00:00
Faith Ekstrand	1b3c746eec	hasvk: Let spirv_to_nir() set UBO/SSBO base cast alignments Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21027>	2023-02-01 17:54:40 +00:00
Faith Ekstrand	85d44b0f97	anv: Let spirv_to_nir() set UBO/SSBO base cast alignments Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21027>	2023-02-01 17:54:40 +00:00
Faith Ekstrand	f78e4cec32	vtn: Set alignment on initial UBO/SSBO casts Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21027>	2023-02-01 17:54:40 +00:00
Rob Clark	e29001d0e7	freedreno/a6xx: Remove excess CS flushing Also requires fixing where we emit barriers, and flushing pending barriers at the end of the batch. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	9b22bdc956	freedreno/a6xx: Also FLUSH_CACHE on image barrier For the same reason we need to on an UPDATE_BUFFER barrier. Fixes KHR-GLES31.core.compute_shader.pipeline-post-fs once the hard-coded cache-flush is removed from launch_grid path. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	23e65c6084	freedreno/a6xx: Make shader state independent of grid info Eventually we want to move this into a state group, so we can pre-bake the cmdstream and re-emit it via CP_SET_DRAW_STATE when it is dirty. But in order to do that it needs to not depend on grid info. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	1faf7133d4	freedreno: Don't open-code setting dirty CS state There is actually no issue with setting FD_DIRTY_PROG, since all state is marked dirty when we switch from compute to 3d. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	5a37cd8569	freedreno/a6xx: Don't double-write SP_CS_OBJ_START Also SP_CS_INSTRLEN. This is already done in fd6_emit_shader(). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	a063caa46a	freedreno: Skip flush_resource with explicit sync Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	2503e22717	freedreno: nondraw-batch Allow multiple compute grids to be combined into a single non-draw batch. This will allow us to optimize state emit and remove excess flushing between compute jobs. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	0e3f2646dd	freedreno/a6xx: Add CS instrlen workaround Based on !19023. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	bfd7d9e22e	freedreno/a6xx: Add missing CS_BINDLESS mapping Fixes: `e51975142c` ("freedreno/a6xx: Add bindless state") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	13fe9c3e63	freedreno/ir3: Scalarize load_ssbo The benefits of turning it into isam (which needs to be scalar as the SSBO is sampled as a single component R32 texture) outweigh the benefits of vectorizing. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	951d963565	freedreno/a6xx: LRZ for MSAA We don't need to fall off the LRZ path when we fall back to clearing depth with a u_blitter draw, since u_blitter uses zsa state to achieve the depth/stencil clear and this is entirely compatible with LRZ. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Rob Clark	5eb85ef756	freedreno/decode: Increase size of offsets table The offsets table stores offsets of a buffer (such as cmdstream) that we've already dumped. The suballoc pool results in more suballocated cmdstream allocated from a single backing buffer, meaning that we need to increase the size of this table. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20975>	2023-02-01 17:28:41 +00:00
Georg Lehmann	2b264455b5	aco: use s_pack_ll_b32_b16 for constant copies Totals from 2 (0.00% of 134913) affected shaders: CodeSize: 28636 -> 28628 (-0.03%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20970>	2023-02-01 17:07:25 +00:00
Georg Lehmann	9ee9b0859b	aco: use s_bfm_64 for constant copies Foz-DB Navi21: Totals from 1025 (0.76% of 134913) affected shaders: CodeSize: 1436752 -> 1432412 (-0.30%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20970>	2023-02-01 17:07:25 +00:00
Rhys Perry	bbc5247bf7	aco/spill: always end spill vgpr after control flow To fix a hypothetical issue: v0 = start_linear_vgpr if (...) { } else { use_linear_vgpr(v0) } v0 = phi We need a p_end_linear_vgpr to ensure that the phi does not use the same VGPR as the linear VGPR. This is also much simpler. fossil-db (gfx1100): Totals from 1195 (0.89% of 134574) affected shaders: Instrs: 4123856 -> 4123826 (-0.00%); split: -0.00%, +0.00% CodeSize: 21461256 -> 21461100 (-0.00%); split: -0.00%, +0.00% Latency: 62816001 -> 62812999 (-0.00%); split: -0.00%, +0.00% InvThroughput: 9339049 -> 9338564 (-0.01%); split: -0.01%, +0.00% Copies: 304028 -> 304005 (-0.01%); split: -0.02%, +0.01% PreVGPRs: 115761 -> 115762 (+0.00%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621>	2023-02-01 15:45:22 +00:00
Rhys Perry	850d945baf	aco/tests: add setup_reduce_temp.divergent_if_phi Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621>	2023-02-01 15:45:22 +00:00
Rhys Perry	44fdd2ebcb	aco: end reduce tmp after control flow, when used within control flow In the case of: v0 = start_linear_vgpr if (...) { } else { use_linear_vgpr(v0) } v0 = phi We need a p_end_linear_vgpr to ensure that the phi does not use the same VGPR as the linear VGPR. fossil-db (gfx1100): Totals from 3763 (2.80% of 134574) affected shaders: MaxWaves: 90296 -> 90164 (-0.15%) Instrs: 6857726 -> 6856608 (-0.02%); split: -0.03%, +0.01% CodeSize: 35382188 -> 35377688 (-0.01%); split: -0.02%, +0.01% VGPRs: 234864 -> 235692 (+0.35%); split: -0.01%, +0.36% Latency: 47471923 -> 47474965 (+0.01%); split: -0.03%, +0.04% InvThroughput: 5640320 -> 5639736 (-0.01%); split: -0.04%, +0.03% VClause: 93098 -> 93107 (+0.01%); split: -0.01%, +0.02% SClause: 214137 -> 214130 (-0.00%); split: -0.00%, +0.00% Copies: 369895 -> 369305 (-0.16%); split: -0.31%, +0.15% Branches: 164996 -> 164504 (-0.30%); split: -0.30%, +0.00% PreVGPRs: 210655 -> 211438 (+0.37%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621>	2023-02-01 15:45:22 +00:00
Marek Olšák	e2d63c9a62	ac/gpu_info: add PCIe info Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20790>	2023-02-01 14:58:57 +00:00
Marek Olšák	e267b86d80	amd: update amdgpu_drm.h Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20790>	2023-02-01 14:58:57 +00:00
Samuel Pitoiset	cd6712e3a8	radv: pass pCreateInfo to radv_graphics_pipeline_compile() This removes some duplicated code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20990>	2023-02-01 14:20:47 +00:00
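
The address math described in 873dfb673b and a0e7e7ff41 can be sketched in plain C. This is only an illustration of the arithmetic, not code from Mesa: `pack_64_2x32_split` here is a stand-in named after the NIR operation in the commit message, and `addr_64bit_add` / `addr_32bit_add` are hypothetical helper names. It assumes, as the commits state, that the constant buffer lies entirely inside one 4GB-aligned region, so adding the offset to the low half can never carry into the high half.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for NIR's pack_64_2x32_split: combine low and high 32-bit halves. */
static uint64_t pack_64_2x32_split(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Original form (hypothetical name): pack first, then do a full 64-bit add.
 * On hardware that emulates 64-bit math this needs extra carry handling. */
static uint64_t addr_64bit_add(uint32_t base_lo, uint32_t base_hi, uint32_t offset)
{
    return pack_64_2x32_split(base_lo, base_hi) + (uint64_t)offset;
}

/* Optimized form (hypothetical name): add in 32 bits, then pack with the
 * unchanged high bits. Only equivalent when base_lo + offset does not wrap,
 * which holds if the buffer stays inside one 4GB-aligned region. */
static uint64_t addr_32bit_add(uint32_t base_lo, uint32_t base_hi, uint32_t offset)
{
    return pack_64_2x32_split(base_lo + offset, base_hi);
}
```

With a high half of 0x2 (the 8GB base the anv commit mentions) and a low half plus offset that stays below 4GB, both helpers return the same address. If the sum of the low half and the offset were allowed to wrap, the two would differ, which is why the containment guarantee matters.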


167720 commits