fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-01 05:58:05 +02:00

Author	SHA1	Message	Date
Dave Airlie	ccdee8aade	llvmpipe: convert texture barrier to a finish. Need to flush the rasterizer and wait for everything to finish; with the new overlap, a flush isn't enough. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>	2022-02-21 03:52:02 +00:00
Dave Airlie	604ed15e56	lavapipe: handle non-timeline semaphores wait/signal. When llvmpipe is allowed to execute fragment shaders overlapping with other stuff, we have to start using the pipe fences. With presentation, the acquire path needs to signal a semaphore that can be waited on by the user, so add support for passing signal/wait semaphores for non-timeline in, and just put a fence pointer in the semaphore for that case. This fixes rendering once we allow overlapping rasterization. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>	2022-02-21 03:52:02 +00:00
Dave Airlie	70dfa8b32f	lavapipe: don't flush on transfer operations. The pipeline barrier/wait event code should handle this. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>	2022-02-21 03:52:02 +00:00
Dave Airlie	20696aa170	lavapipe: execute a finish in pipeline barrier and event waiting. Refactor out the code for finishing a fence and use it in pipeline barrier and event waiting. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>	2022-02-21 03:52:02 +00:00
Dave Airlie	1f1e62c15d	lavapipe: handle endless fence timeout properly. If the user asks for an infinite timeout, just pass it through to gallium. When llvmpipe ends up allowing async fragment shaders, it's important to get this right for lots of CTS tests. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>	2022-02-21 03:52:02 +00:00
Dave Airlie	7a1426db66	lavapipe: fix pipeline statistic query results with availability. The availability is meant to be the last integer value written, but for pipeline stats this was being done wrong. Calculate the availability position properly. With the old non-overlapping execution model queries never were unavailable. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>	2022-02-21 03:52:02 +00:00
Dave Airlie	aeed07f604	drisw: fence drawing to the swap/copy buffers. Currently neither llvmpipe nor softpipe ever leaves any drawing in the pipeline, but I'd like to change that for llvmpipe. This makes drisw block for completed rendering before sending data to the X server. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>	2022-02-21 03:52:02 +00:00
Ilia Mirkin	65c4b6a4c6	freedreno/ir3: document GETINFO's x/y results The zw were already known, but throw them in here too. I'm not extremely happy with the description of "y", feels like there's a simpler explanation there, but I couldn't find it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14672>	2022-02-21 02:09:19 +00:00
Qiang Yu	80974a5f1e	radeonsi: fix depth stencil multi sample texture blit This causes the flushed_depth_texture to be allocated without multisampling, so the blit causes a VM fault. cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14990>	2022-02-21 01:47:23 +00:00
Dave Airlie	0f989a840e	crocus: fix leak on gen4/5 stencil fallback blit path. Noticed by Ilia. Fixes: `f3630548f1` ("crocus: initial gallium driver for Intel gfx 4-7") Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15100>	2022-02-21 10:21:56 +10:00
Ilia Mirkin	357dae424f	freedreno/a4xx: make luminance formats renderable, add missing L8A8_SNORM If the luminance formats aren't renderable, they back out to R* formats, but those will end up with a 1 in alpha rather than 0 when textured. So instead make them explicitly renderable, which will cause the correct texture format swizzle to be applied. Fixes query-rgba-signed-components and probably others. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15097>	2022-02-20 16:58:03 +00:00
Ilia Mirkin	56b1bd086f	freedreno/a4xx: use correct macro for color Doesn't actually matter since all the colors are encoded the same. But for consistency... Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15097>	2022-02-20 16:58:03 +00:00
Danylo Piliaiev	a814a4f9db	turnip: Add a refcount mechanism to BOs Until now we have lived without a refcount mechanism in the driver because in Vulkan the user is responsible for handling the life span of memory allocations for all Vulkan objects. However, imported BOs are tricky because the kernel doesn't refcount, so user-space needs to make sure that: 1. When importing a BO into the same device used to create it (self-importing) it does not double free the same BO. 2. It frees imported BOs that were not allocated through the same device. Our initial implementation always freed BOs when requested, so we handled 2) correctly but not 1) on drm, and we would double-free self-imported BOs because the kernel doesn't return a unique gem_handle on each import. Besides this, the submit ioctl checks for duplicates in the BO list and returns an error if there is one. This fixes the problem for good by adding refcounts to BOs so that self-imported BOs have a refcnt > 1 and are only freed when all references are freed. KGSL on the other hand does not have the same problems, at least not with ION buffers which are used for exportable BOs on pre 5.10 android kernels. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5936 Fixes CTS tests: dEQP-VK.drm_format_modifiers.export_import.* Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15031>	2022-02-19 15:16:55 +00:00
Lionel Landwerlin	2763a8af5a	anv/genxml/intel/fs: fix binding shader record entry Bit is flipped compared to all the other packets. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `705395344d` ("intel/fs: Add support for compiling bindless shaders with resume shaders") Fixes: `c3ac9afca3` ("anv: Create and return ray-tracing pipeline SBT handles") Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15078>	2022-02-19 13:50:56 +00:00
Chia-I Wu	5f3e50b27c	venus: trace vn_ring_wait_space It is good to know that we run out of ring space and have to wait. This happens easily with fossilize-replay because encoding a vkCreateGraphicsPipeline takes microseconds while executing it can take milliseconds, >100ms sometimes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14966>	2022-02-19 03:57:30 +00:00
Chia-I Wu	7cb2e9a8f0	venus: cache VkFormatProperties This is for fossilize-replay which keeps querying for the same formats. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14966>	2022-02-19 03:57:30 +00:00
Alyssa Rosenzweig	e392dd8237	pan/bi: Promote MUX to CSEL in the scheduler Helps scheduling, and makes scheduling more predictable when deciding between MUX and CSEL. total tuples in shared programs: 1523328 -> 1516256 (-0.46%) tuples in affected programs: 509800 -> 502728 (-1.39%) helped: 1977 HURT: 181 helped stats (abs) min: 1.0 max: 48.0 x̄: 3.71 x̃: 2 helped stats (rel) min: 0.04% max: 14.29% x̄: 1.98% x̃: 1.28% HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.43 x̃: 1 HURT stats (rel) min: 0.14% max: 7.69% x̄: 1.40% x̃: 0.70% 95% mean confidence interval for tuples value: -3.47 -3.08 95% mean confidence interval for tuples %-change: -1.79% -1.60% Tuples are helped. total clauses in shared programs: 350552 -> 349906 (-0.18%) clauses in affected programs: 34839 -> 34193 (-1.85%) helped: 570 HURT: 49 helped stats (abs) min: 1.0 max: 16.0 x̄: 1.22 x̃: 1 helped stats (rel) min: 0.67% max: 20.00% x̄: 3.26% x̃: 2.22% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.92% max: 16.67% x̄: 4.31% x̃: 4.17% 95% mean confidence interval for clauses value: -1.13 -0.96 95% mean confidence interval for clauses %-change: -2.95% -2.38% Clauses are helped. total cycles in shared programs: 202589.37 -> 202512.25 (-0.04%) cycles in affected programs: 7644.46 -> 7567.33 (-1.01%) helped: 771 HURT: 147 helped stats (abs) min: 0.041665999999999315 max: 1.8333360000000027 x̄: 0.11 x̃: 0 helped stats (rel) min: 0.16% max: 14.29% x̄: 2.10% x̃: 1.35% HURT stats (abs) min: 0.041665999999999315 max: 0.3333340000000007 x̄: 0.07 x̃: 0 HURT stats (rel) min: 0.24% max: 7.41% x̄: 1.49% x̃: 1.11% 95% mean confidence interval for cycles value: -0.09 -0.07 95% mean confidence interval for cycles %-change: -1.69% -1.36% Cycles are helped. 
total arith in shared programs: 56755.96 -> 56585.50 (-0.30%) arith in affected programs: 18746.29 -> 18575.83 (-0.91%) helped: 1605 HURT: 352 helped stats (abs) min: 0.04166399999999726 max: 1.8333360000000027 x̄: 0.12 x̃: 0 helped stats (rel) min: 0.07% max: 20.00% x̄: 1.92% x̃: 1.12% HURT stats (abs) min: 0.041665999999999315 max: 0.3333340000000007 x̄: 0.06 x̃: 0 HURT stats (rel) min: 0.17% max: 33.33% x̄: 2.09% x̃: 1.08% 95% mean confidence interval for arith value: -0.09 -0.08 95% mean confidence interval for arith %-change: -1.34% -1.07% Arith are helped. total quadwords in shared programs: 1429737 -> 1424670 (-0.35%) quadwords in affected programs: 418175 -> 413108 (-1.21%) helped: 1682 HURT: 198 helped stats (abs) min: 1.0 max: 35.0 x̄: 3.17 x̃: 2 helped stats (rel) min: 0.04% max: 13.33% x̄: 1.72% x̃: 1.29% HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.38 x̃: 1 HURT stats (rel) min: 0.15% max: 7.41% x̄: 1.30% x̃: 0.92% 95% mean confidence interval for quadwords value: -2.86 -2.53 95% mean confidence interval for quadwords %-change: -1.48% -1.32% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>	2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig	a8418abd74	pan/bi: Revert "Fix load_const of 1-bit booleans" This reverts commit `29d319c767`. Now that we use nir_lower_bool_to_bitsize, we don't see 1-bit booleans anymore, so the issue this fixed doesn't apply. Actually, that issue was (in part) why I started looking into boolean handling in the first place. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>	2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig	21bdee7bcc	pan/bi: Switch to lower_bool_to_bitsize Instead of ingesting 1-bit booleans and trying to force everything to be 16-bit, except when it isn't, and creating a mess in the backend... just use the NIR pass designed to select bitsize for booleans. Yes, this means we need to handle more NIR instructions, but the handling is easier and the conversion is more obvious (except for some edge cases like 16-bit vectorized b32csel). This generates noticeably better code, and the generated code will be easier to optimize. total instructions in shared programs: 90257 -> 88941 (-1.46%) instructions in affected programs: 49145 -> 47829 (-2.68%) helped: 201 HURT: 2 helped stats (abs) min: 1.0 max: 40.0 x̄: 6.57 x̃: 3 helped stats (rel) min: 0.29% max: 13.89% x̄: 2.57% x̃: 1.90% HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2 HURT stats (rel) min: 2.15% max: 2.74% x̄: 2.45% x̃: 2.45% 95% mean confidence interval for instructions value: -7.71 -5.26 95% mean confidence interval for instructions %-change: -2.84% -2.20% Instructions are helped. total tuples in shared programs: 73740 -> 72922 (-1.11%) tuples in affected programs: 36564 -> 35746 (-2.24%) helped: 184 HURT: 7 helped stats (abs) min: 1.0 max: 74.0 x̄: 4.49 x̃: 2 helped stats (rel) min: 0.30% max: 16.67% x̄: 2.86% x̃: 1.89% HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.29 x̃: 1 HURT stats (rel) min: 0.12% max: 12.50% x̄: 4.26% x̃: 3.33% 95% mean confidence interval for tuples value: -5.29 -3.28 95% mean confidence interval for tuples %-change: -3.06% -2.13% Tuples are helped. 
total clauses in shared programs: 15993 -> 15928 (-0.41%) clauses in affected programs: 2464 -> 2399 (-2.64%) helped: 35 HURT: 16 helped stats (abs) min: 1.0 max: 27.0 x̄: 2.31 x̃: 1 helped stats (rel) min: 0.49% max: 18.88% x̄: 7.63% x̃: 5.88% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.79% max: 6.25% x̄: 1.91% x̃: 1.01% 95% mean confidence interval for clauses value: -2.46 -0.09 95% mean confidence interval for clauses %-change: -6.38% -2.90% Clauses are helped. total cycles in shared programs: 7622.13 -> 7594.75 (-0.36%) cycles in affected programs: 1078.67 -> 1051.29 (-2.54%) helped: 103 HURT: 4 helped stats (abs) min: 0.041665999999999315 max: 3.0833319999999986 x̄: 0.27 x̃: 0 helped stats (rel) min: 0.32% max: 21.05% x̄: 3.62% x̃: 2.44% HURT stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.05 x̃: 0 HURT stats (rel) min: 0.13% max: 7.14% x̄: 2.94% x̃: 2.25% 95% mean confidence interval for cycles value: -0.33 -0.19 95% mean confidence interval for cycles %-change: -4.14% -2.61% Cycles are helped. total arith in shared programs: 2762.46 -> 2728.08 (-1.24%) arith in affected programs: 1550.12 -> 1515.75 (-2.22%) helped: 197 HURT: 6 helped stats (abs) min: 0.041665999999999315 max: 3.0833319999999986 x̄: 0.18 x̃: 0 helped stats (rel) min: 0.32% max: 21.05% x̄: 2.93% x̃: 1.61% HURT stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.06 x̃: 0 HURT stats (rel) min: 0.13% max: 20.00% x̄: 5.78% x̃: 3.37% 95% mean confidence interval for arith value: -0.21 -0.13 95% mean confidence interval for arith %-change: -3.20% -2.15% Arith are helped. 
total quadwords in shared programs: 68155 -> 67555 (-0.88%) quadwords in affected programs: 27944 -> 27344 (-2.15%) helped: 151 HURT: 9 helped stats (abs) min: 1.0 max: 52.0 x̄: 4.09 x̃: 3 helped stats (rel) min: 0.23% max: 12.35% x̄: 2.87% x̃: 2.17% HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.89 x̃: 1 HURT stats (rel) min: 0.20% max: 6.76% x̄: 1.91% x̃: 1.13% 95% mean confidence interval for quadwords value: -4.67 -2.83 95% mean confidence interval for quadwords %-change: -2.99% -2.21% Quadwords are helped. total threads in shared programs: 2232 -> 2233 (0.04%) threads in affected programs: 1 -> 2 (100.00%) helped: 1 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>	2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig	a64534754d	pan/bi: Handle vectorized u2f16/i2f16 Will be useful when we enable int16, I guess... No shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>	2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig	6a05852f5b	pan/bi: Handle trivial i2i32 lower_bool_to_bitsize can generate i2i32 from a 32-bit source, which is trivial but needs to be handled explicitly to avoid going down the 8-bit conversion path. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>	2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig	f7d44a46cd	pan/bi: Optimize replication Bifrost's 16-bit support comes in the form of vectorized instructions, so when we manipulate scalars, we usually replicate to both bottom and top halves of 32-bit registers. Add an analysis pass that detects replication. Then, use that replication pass to optimize out useless swizzle instructions (by changing them to plain moves, which can be copypropped). This optimization is a slight shader-db win on its own, and allows us to transition to lower_bool_to_bitsize without regressing shader-db. total instructions in shared programs: 90323 -> 90257 (-0.07%) instructions in affected programs: 2513 -> 2447 (-2.63%) helped: 20 HURT: 0 helped stats (abs) min: 1.0 max: 16.0 x̄: 3.30 x̃: 2 helped stats (rel) min: 1.25% max: 11.11% x̄: 4.80% x̃: 4.29% 95% mean confidence interval for instructions value: -5.05 -1.55 95% mean confidence interval for instructions %-change: -6.06% -3.54% Instructions are helped. total tuples in shared programs: 73769 -> 73740 (-0.04%) tuples in affected programs: 1611 -> 1582 (-1.80%) helped: 17 HURT: 0 helped stats (abs) min: 1.0 max: 9.0 x̄: 1.71 x̃: 1 helped stats (rel) min: 0.58% max: 16.67% x̄: 4.80% x̃: 3.33% 95% mean confidence interval for tuples value: -2.70 -0.71 95% mean confidence interval for tuples %-change: -7.06% -2.54% Tuples are helped. total clauses in shared programs: 15997 -> 15993 (-0.03%) clauses in affected programs: 27 -> 23 (-14.81%) helped: 4 HURT: 0 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 7.69% max: 25.00% x̄: 18.17% x̃: 20.00% 95% mean confidence interval for clauses value: -1.00 -1.00 95% mean confidence interval for clauses %-change: -29.91% -6.44% Clauses are helped. 
total cycles in shared programs: 7623.13 -> 7622.13 (-0.01%) cycles in affected programs: 64.83 -> 63.83 (-1.54%) helped: 13 HURT: 0 helped stats (abs) min: 0.0416660000000002 max: 0.375 x̄: 0.08 x̃: 0 helped stats (rel) min: 1.02% max: 5.56% x̄: 2.82% x̃: 2.50% 95% mean confidence interval for cycles value: -0.13 -0.02 95% mean confidence interval for cycles %-change: -3.79% -1.85% Cycles are helped. total arith in shared programs: 2763.75 -> 2762.46 (-0.05%) arith in affected programs: 67.17 -> 65.88 (-1.92%) helped: 18 HURT: 0 helped stats (abs) min: 0.0416660000000002 max: 0.375 x̄: 0.07 x̃: 0 helped stats (rel) min: 1.02% max: 22.22% x̄: 5.68% x̃: 3.16% 95% mean confidence interval for arith value: -0.11 -0.03 95% mean confidence interval for arith %-change: -8.56% -2.80% Arith are helped. total quadwords in shared programs: 68173 -> 68155 (-0.03%) quadwords in affected programs: 1258 -> 1240 (-1.43%) helped: 14 HURT: 0 helped stats (abs) min: 1.0 max: 3.0 x̄: 1.29 x̃: 1 helped stats (rel) min: 0.42% max: 8.70% x̄: 3.88% x̃: 3.67% 95% mean confidence interval for quadwords value: -1.64 -0.93 95% mean confidence interval for quadwords %-change: -5.27% -2.49% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>	2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig	35ff537814	pan/bi: Constant fold swizzles on constants This lets us avoid generating SWZ instructions. Those instructions could be constant folded but that complicates the replication analysis introduced in the next commit. Almost no shader-db changes. quadwords HURT: shaders/glmark/1-22.shader_test MESA_SHADER_FRAGMENT: 718 -> 722 (0.56%) total quadwords in shared programs: 68169 -> 68173 (<.01%) quadwords in affected programs: 718 -> 722 (0.56%) helped: 0 HURT: 1 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>	2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig	62533a6e64	pan/bi: Lower swizzles on MUX.v2i16 We'll generate this in a moment. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>	2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig	8bd4976d98	pan/bi: Lower swizzles on CSEL.i32/MUX.i32 This is counter-intuitive, but required for correct operation when CSEL.i32 takes a 1-bit (stored 16-bit) boolean argument. The impedance mismatch ultimately is between CSEL.b32 (nir's bcsel, nonexistent in the hardware) and the lowering CSEL.i32. However, a similar problem exists even with MUX.i32 which lacks a good way of zero/sign-extending booleans. Cherry-picked from my Valhall branch though the issue also affects Bifrost. Fixes piglit shaders@glsl-vs-if-bool on Bifrost. Unfortunately, shader-db is quite unhappy :-( The proper fix is to use lower_bool_to_bitsize, but that can't be backported to mesa-stable. total instructions in shared programs: 157539 -> 158953 (0.90%) instructions in affected programs: 55621 -> 57035 (2.54%) helped: 2 HURT: 259 helped stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.11% max: 2.67% x̄: 2.39% x̃: 2.39% HURT stats (abs) min: 1.0 max: 40.0 x̄: 5.47 x̃: 2 HURT stats (rel) min: 0.36% max: 16.13% x̄: 2.55% x̃: 1.59% 95% mean confidence interval for instructions value: 4.44 6.40 95% mean confidence interval for instructions %-change: 2.21% 2.82% Instructions are HURT. total tuples in shared programs: 132322 -> 132907 (0.44%) tuples in affected programs: 31806 -> 32391 (1.84%) helped: 5 HURT: 152 helped stats (abs) min: 1.0 max: 2.0 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.39% max: 3.03% x̄: 1.70% x̃: 1.61% HURT stats (abs) min: 1.0 max: 42.0 x̄: 3.89 x̃: 2 HURT stats (rel) min: 0.29% max: 18.18% x̄: 2.50% x̃: 1.79% 95% mean confidence interval for tuples value: 2.88 4.58 95% mean confidence interval for tuples %-change: 1.87% 2.85% Tuples are HURT. 
total clauses in shared programs: 28672 -> 28698 (0.09%) clauses in affected programs: 869 -> 895 (2.99%) helped: 1 HURT: 24 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 5.88% max: 5.88% x̄: 5.88% x̃: 5.88% HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.12 x̃: 1 HURT stats (rel) min: 0.49% max: 33.33% x̄: 8.46% x̃: 3.59% 95% mean confidence interval for clauses value: 0.82 1.26 95% mean confidence interval for clauses %-change: 3.84% 11.93% Clauses are HURT. total cycles in shared programs: 15119.04 -> 15137.88 (0.12%) cycles in affected programs: 922.87 -> 941.71 (2.04%) helped: 4 HURT: 79 helped stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.05 x̃: 0 helped stats (rel) min: 0.40% max: 3.17% x̄: 1.57% x̃: 1.35% HURT stats (abs) min: 0.041665999999999315 max: 1.75 x̄: 0.24 x̃: 0 HURT stats (rel) min: 0.30% max: 20.00% x̄: 2.83% x̃: 2.12% 95% mean confidence interval for cycles value: 0.17 0.29 95% mean confidence interval for cycles %-change: 1.86% 3.37% Cycles are HURT. total arith in shared programs: 4922.71 -> 4947.71 (0.51%) arith in affected programs: 1423.79 -> 1448.79 (1.76%) helped: 5 HURT: 177 helped stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.06 x̃: 0 helped stats (rel) min: 0.40% max: 3.17% x̄: 1.82% x̃: 1.67% HURT stats (abs) min: 0.041665999999999315 max: 1.75 x̄: 0.14 x̃: 0 HURT stats (rel) min: 0.30% max: 22.22% x̄: 2.50% x̃: 1.52% 95% mean confidence interval for arith value: 0.11 0.17 95% mean confidence interval for arith %-change: 1.86% 2.90% Arith are HURT. 
total quadwords in shared programs: 120605 -> 120956 (0.29%) quadwords in affected programs: 26535 -> 26886 (1.32%) helped: 6 HURT: 143 helped stats (abs) min: 1.0 max: 7.0 x̄: 2.83 x̃: 1 helped stats (rel) min: 0.93% max: 6.33% x̄: 2.29% x̃: 1.71% HURT stats (abs) min: 1.0 max: 21.0 x̄: 2.57 x̃: 2 HURT stats (rel) min: 0.34% max: 13.79% x̄: 2.02% x̃: 1.22% 95% mean confidence interval for quadwords value: 1.86 2.86 95% mean confidence interval for quadwords %-change: 1.45% 2.24% Quadwords are HURT. total threads in shared programs: 4670 -> 4669 (-0.02%) threads in affected programs: 2 -> 1 (-50.00%) helped: 0 HURT: 1 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>	2022-02-19 03:02:10 +00:00
Emma Anholt	a2b7d9b9cd	ci/freedreno: Add a known spilling hangcheck flake. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15085>	2022-02-19 02:37:13 +00:00
Emma Anholt	b39d5e9705	ci/freedreno: Cut down pre-merge a630 VK coverage. We've got lots of VK coverage on 618, so take some of the load off (but leave a little bit of testing just to make sure we don't totally break 630). This should help with our Marge times since we've added some other coverage to 630 that's started overloading the runners. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15085>	2022-02-19 02:37:13 +00:00
Emma Anholt	04790ec8bb	ci/freedreno: Move a 60s timeout test to skips instead of flakes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15085>	2022-02-19 02:37:13 +00:00
Connor Abbott	7e8d885919	spirv: Rewrite determinant calculation The old calculation for mat3 was clever, but it turns out that a straightforward application of subdeterminants similar to how mat4 is handled is more efficient: on a scalar architecture with some sort of combined multiply+add instruction with a negate modifier (both fairly common), the new determinant is 9 instructions vs. 15 for the old one, and without the multiply-add it's 14 instructions vs. 18 for the old one. When used as a routine for inverse() the savings are compounded, because we now use the same method as used to compute the adjugate matrix and so CSE can combine most of the calculations with the adjugate matrix ones. Once mat3 and mat4 use the same method for computing determinants, we can combine them into a single recursive function. I also pulled up the mat_subdet() function because it was doing basically what we need, so it's now shared between determinant and inverse. This shrinks the implementation significantly, as can be seen from the diffstat. The real reason I want to change this, though, is that it fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.inverse.compute.mat3 with turnip. Qualcomm uses round-to-zero for 16-bit frcp, which combined with some inaccuracy in the old method of calculating the determinant led us to fail. Qualcomm's driver uses something like the new method to calculate the determinant in the inverse. We could argue that Mesa's method should be allowed, because round-to-zero for floating-point division is within spec and there are no precision guarantees given for determinant() or inverse(). However we might as well use the more efficient method. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14652>	2022-02-19 02:03:25 +00:00
Connor Abbott	c21065c87a	util/blob: Clarify rules on blob::data Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15028>	2022-02-19 01:25:46 +00:00
Connor Abbott	6761550357	nir/serialize: Don't access blob->data directly It won't work if the blob is fixed-size and we overrun the size, which will be the case with the Vulkan pipeline cache. This gets a bit tricky for the repeated-header optimization, because we can't read the header from the blob. Instead we have to store the header itself. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15028>	2022-02-19 01:25:46 +00:00
Alyssa Rosenzweig	9168dcbbc1	pan/bi: Disambiguate IDVS variants in shader-db Label IDVS variants as being MESA_SHADER_{POSITION, VARYING} stages; reserve the MESA_SHADER_VERTEX label for non-IDVS shaders. This reduces confusion where a single shader compiles to two MESA_SHADER_VERTEX shaders with different stats. While we're at it, de-vendor the blend shader stage name; these stats are internal anyway. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15086>	2022-02-19 00:01:07 +00:00
Alyssa Rosenzweig	01d1bf6228	asahi: Wire in pure integer texture formats Passes dEQP-GLES3.functional.texture.format.sized.2d.r* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:33 +00:00
Alyssa Rosenzweig	fded99b1c5	asahi: Support LOD clamps Passes: dEQP-GLES3.functional.texture.mipmap.2d.min_lod.* dEQP-GLES3.functional.texture.mipmap.2d.max_lod.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:33 +00:00
Alyssa Rosenzweig	cc3e98e201	asahi: Identify minimum/maximum LOD fields Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:33 +00:00
Alyssa Rosenzweig	6554790dfb	asahi: Add LOD clamp packing unit tests With GTest. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	e3a5c1b478	asahi: Add LOD type Automatically packs and unpacks float <==> clamped 4:6 fixed point, used for min/max LOD fields on the Sampler descriptor. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	db93090ffc	asahi: Allow GenXML to be used in C++ C++ requires explicit casts from integers to enums. Fixes errors like the following when trying to use Asahi GenXML from a GTest unit test. src/asahi/lib/agx_pack.h:554:23: error: assigning to 'enum agx_channels' from incompatible type 'uint64_t' (aka 'unsigned long long') values->channels = __gen_unpack_uint(cl, 0, 6); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	055c5a59f8	agx: Round and clamp array indices Conforming with the GLSL spec. Fixes: dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray_fixed_fragment (and probably others) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	a822b7b6cc	agx: Naturally align uniform pushes Required to pack correctly, e.g. if we push a 16-bit value then a 64-bit value. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	0c2bbb470a	agx: Add agx_size_align_16 helper Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	9aeb5156bc	agx: Add typed move helper Useful for u2u16 in lowering code. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	830d16e9f0	asahi: Add AGX_PUSH_ARRAY_SIZE_MINUS_1 Required to clamp array indices against the array sizes per the GLSL spec. Metal also does this, implying it's required by the hardware for correct operation. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	7b4ea2fd38	asahi: Implement texturing with non-zero start level Unsure if this comes up anywhere. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	11072cfd21	asahi: Handle reloads of specific cube/mipfaces The texture descriptor we construct for reloading needs to respect the surface's texture/layer selection. Fix exactly the same bug as `b8c31ac06d` ("lima: fix glCopyTexSubImage2D"). Fixes: dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgb dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgba dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgb dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgba Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	062ca49ca7	asahi: Add agx_map_texture_{cpu,gpu} helpers Streamline access to particular layer/levels. These patterns show up across the driver and are easy to screw up, so add a helper. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	a8bf729f8a	asahi: Support 2D array and 3D textures As far as I can tell, these must be tiled. Other than that, the implementation is completely routine. Passes dEQP-GLES3.functional.texture.format.unsized.2d_array Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	204e2ffe1b	asahi: Track mipmap state explicitly Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	e714fae263	asahi: Pass correct tile shift to tiling routines Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig	5f10ffd6e2	asahi: Handle page alignment of miptrees Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>	2022-02-18 23:48:32 +00:00
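
The BO refcounting described in commit a814a4f9db can be sketched roughly as below. This is a simplified, single-threaded illustration and not turnip's actual code: the `struct bo`, `struct device`, `bo_import` and `bo_release` names, the fixed-size table, and the `closed_handles` counter standing in for the real GEM close call are all assumptions made for the example. The point it shows is that importing a handle the device already knows bumps a reference count instead of creating a second owner, so the handle is only closed when the last reference goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BOS 16 /* hypothetical table size, for illustration only */

struct bo {
    uint32_t gem_handle; /* 0 marks an unused slot */
    uint32_t refcnt;
};

struct device {
    struct bo bos[MAX_BOS];
    unsigned closed_handles; /* stand-in for real GEM close calls */
};

/* Import a handle. If the kernel handed back a handle we already track
 * (the self-import case), take another reference instead of creating a
 * second owner. Returns NULL if the table is full. */
static struct bo *bo_import(struct device *dev, uint32_t gem_handle)
{
    struct bo *free_slot = NULL;

    assert(gem_handle != 0);
    for (size_t i = 0; i < MAX_BOS; i++) {
        struct bo *bo = &dev->bos[i];
        if (bo->gem_handle == gem_handle) {
            bo->refcnt++;
            return bo;
        }
        if (bo->gem_handle == 0 && free_slot == NULL)
            free_slot = bo;
    }
    if (free_slot == NULL)
        return NULL;

    free_slot->gem_handle = gem_handle;
    free_slot->refcnt = 1;
    return free_slot;
}

/* Drop one reference. The handle is closed only when the last reference
 * is released; returns true in that case. */
static bool bo_release(struct device *dev, struct bo *bo)
{
    assert(bo->refcnt > 0);
    if (--bo->refcnt > 0)
        return false;

    dev->closed_handles++; /* a real driver would close the GEM handle here */
    bo->gem_handle = 0;
    return true;
}
```

Importing handle 7 twice returns the same `struct bo` with `refcnt == 2`; the first `bo_release` leaves it alive and only the second one closes it. A real driver would also need locking around the table, which this sketch omits.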

150339 commits
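
Commit 7e8d885919 above describes computing mat3 and mat4 determinants with a single recursive subdeterminant routine. A minimal sketch of that idea in plain C follows. It is not Mesa's NIR-builder implementation: the `subdet` and `determinant` names, the flat row-major 4x4 storage, and the choice to expand along the first remaining row are assumptions made for illustration, and the instruction counts quoted in the commit do not apply to this scalar code.

```c
#include <assert.h>

#define MAT_STRIDE 4 /* matrices are stored row-major in a 4x4 float array */

/* Determinant of the size x size submatrix of m selected by the given row
 * and column index lists, via cofactor expansion along its first row. */
static float subdet(const float *m, const int *rows, const int *cols, int size)
{
    assert(size >= 1 && size <= 4);
    if (size == 1)
        return m[rows[0] * MAT_STRIDE + cols[0]];

    float det = 0.0f;
    float sign = 1.0f;
    for (int i = 0; i < size; i++) {
        /* Column list with column i removed */
        int sub_cols[3];
        int n = 0;
        for (int j = 0; j < size; j++) {
            if (j != i)
                sub_cols[n++] = cols[j];
        }

        det += sign * m[rows[0] * MAT_STRIDE + cols[i]] *
               subdet(m, rows + 1, sub_cols, size - 1);
        sign = -sign;
    }
    return det;
}

/* Determinant of the top-left size x size block (size 1 to 4). */
static float determinant(const float *m, int size)
{
    static const int idx[4] = {0, 1, 2, 3};
    return subdet(m, idx, idx, size);
}
```

The same function covers mat3 and mat4 by changing `size`. For the matrix with rows (1, 2, 3), (4, 5, 6), (7, 8, 10), `determinant(m, 3)` evaluates 1·2 − 2·(−2) + 3·(−3) = −3.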