fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-11 18:00:36 +02:00

Author	SHA1	Message	Date
Axel Davy	90a7573a65	st/nine: Add RAM memory manager for textures On 32 bits, virtual address space is sometimes too small for apps. Textures can hold virtual memory in 3 ways: 1) MANAGED textures have a RAM copy of any texture 2) SYSTEMMEM is used to have a RAM copy of DEFAULT textures (to upload them, for example) 3) Textures being mapped. Nine cannot do much for 3). It's up to the driver to really unmap textures when possible on 32 bits to reduce virtual memory usage. It's not clear whether anything special is done on Windows for 1) and 2). However, there is clear indication that some effort has been made on 3) to really unmap when it makes sense. My understanding is that other implementations reduce the usage of 1) by deleting the RAM copy once the texture is uploaded (Dxvk's behaviour is controlled by evictManagedOnUnlock). The obvious issue with that approach is what happens when the texture is read by the application after some time. In that case, we have to recreate the RAM backing from the GPU buffer. And apps DO that. Indeed, I found that, for example, in Mass Effect 2 with High Texture mods (one of the crash cases fixed by this patch series), when the character gets close to an object, a high res texture replaces the low res one. The high res one simply has more levels, and the game seems to optimize reading the high res texture by retrieving the small-resolution levels from the original low res texture. In other words, during gameplay, the game will randomly read MANAGED textures. This is expected to be fast as the data is supposed to be in RAM... Instead of taking that RAM copy eviction approach, this patchset proposes a different approach: storing the data in a memfd and releasing the virtual memory until it is needed. Basically, instead of using malloc(), we create a memfd file and map it. When the data doesn't seem to be accessed anymore, we can unmap the memfd file. If the data is needed, the memfd file is mapped again. This trick makes it possible to allocate more than 4GB in 32-bit apps. 
The advantage of this approach over the RAM eviction one is that the load is much faster and doesn't block the GPU. Of course we have problems if there's not enough memory to map the memfd file. But the problem is the same for the RAM eviction approach. Naturally on 64 bits, we do not use memfd. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377>	2021-03-07 13:13:53 +00:00
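The map/unmap/remap cycle described in the commit message above can be sketched roughly as follows. This is a minimal illustration for Linux with glibc 2.27 or later, not the code from the nine state tracker: `struct backing` and the `backing_*` helpers are hypothetical names, and the decision of when data "doesn't seem to be accessed anymore" is left out entirely.

```c
#define _GNU_SOURCE           /* for memfd_create() and MFD_CLOEXEC */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one allocation backed by a memfd. */
struct backing {
    int fd;       /* memfd that keeps the data alive while it is unmapped */
    size_t size;  /* allocation size in bytes */
    void *map;    /* current mapping, or NULL when address space is released */
};

/* Create the memfd and size it; no virtual address space is used yet. */
static int backing_create(struct backing *b, size_t size)
{
    b->fd = memfd_create("texture-backing", MFD_CLOEXEC);
    if (b->fd < 0)
        return -1;
    if (ftruncate(b->fd, (off_t)size) != 0) {
        close(b->fd);
        return -1;
    }
    b->size = size;
    b->map = NULL;
    return 0;
}

/* Map the data on demand; returns NULL if the mapping fails. */
static void *backing_map(struct backing *b)
{
    if (!b->map) {
        void *p = mmap(NULL, b->size, PROT_READ | PROT_WRITE,
                       MAP_SHARED, b->fd, 0);
        if (p == MAP_FAILED)
            return NULL;
        b->map = p;
    }
    return b->map;
}

/* Release the virtual address range; the contents stay in the memfd. */
static void backing_unmap(struct backing *b)
{
    if (b->map) {
        munmap(b->map, b->size);
        b->map = NULL;
    }
}

/* Drop the mapping and the file, freeing the memory for good. */
static void backing_destroy(struct backing *b)
{
    backing_unmap(b);
    close(b->fd);
}
```

A caller would write through the pointer returned by `backing_map()`, call `backing_unmap()` once the data goes idle, and call `backing_map()` again when the application reads the texture. The mapping uses `MAP_SHARED` so that writes land in the memfd and survive the unmap.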
Axel Davy	6087ff44ae	st/nine: Add new function to know if we are the worker This will be useful in a later patch Signed-off-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377>	2021-03-07 13:13:53 +00:00
Ilia Mirkin	fd017458bc	mesa: fix fbo attachment size check for RBs, make it trigger in ES2 Makes dEQP-GLES2.functional.fbo.completeness.size.distinct pass. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9441>	2021-03-06 20:29:41 +00:00
Ilia Mirkin	a8044e87e7	mesa: fix conditions for fp16 render format eligibility GLES3 adds all of these, but they're also available in GLES2 with an ext. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4400 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9441>	2021-03-06 20:29:41 +00:00
Karol Herbst	12f1e42ed3	tegra/context: unwrap indirect_draw_count as well Fixes: `22f6624ed3` "gallium: separate indirect stuff from pipe_draw_info - 80 -> 56 bytes" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9425>	2021-03-06 11:48:57 +00:00
Karol Herbst	a84c8ddb19	tegra/context: fix regression in tegra_draw_vbo We should only pass in a new indirect_info object if we actually set valid values in it. Fixes: `abe8ef862f` "gallium: make pipe_draw_indirect_info * a draw_vbo parameter" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9425>	2021-03-06 11:48:57 +00:00
Icecream95	efd7711e0e	st/mesa: Update constants on alpha test change if it's lowered nir_lower_alpha_test creates a uniform for the alpha reference value; this needs to be updated when changing alpha test state. Fixes: `b1c4c4c7f5` ("mesa/gallium: automatically lower alpha-testing") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4390 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9439>	2021-03-06 00:32:51 +00:00
Dave Airlie	24ce0862fe	zink/ci: update results after layer extensions enabled in lavapipe Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Dave Airlie	d061e21b7e	lavapipe: enable EXT_shader_viewport_index_layer This is already implemented afaik Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Dave Airlie	dad5d5099a	llvmpipe: add support for shader viewport layer This should already be implemented just never enabled the CAP Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Dave Airlie	4cf898b988	draw/prim_assembler: write correct decomposed primitive lengths In order for shader viewport index to be calculated correctly, the cliptest code needs proper primitive lengths to work out the provoking vertex. I half fixed this before for GL4 but looks like I didn't make it all the way. This fixes: dEQP-VK.draw.shader_viewport* Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Dave Airlie	52dc22055f	draw: fix uses viewport index for tess eval shader Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Kenneth Graunke	cdffa3e114	vbo: Fix vbo_sw_primitive_restart for start > 0 Commit `e99e7aa4` began passing start > 0 to indexed draw calls rather than keeping start at 0 and manually advancing ib->ptr. This should work fine, however, there have been instances of software fallbacks not handling things right. vbo_sw_primitive_restart had a bug where it was ignoring "start" and always calling find_sub_primitives with start = 0 and end = ib->count. This meant that when start > 0, it was analyzing the wrong part of the index buffer when finding subprimitives. In theory, each _mesa_prim can have a different "start" value. But the code only calls find_sub_primitives once, because it wants to map, analyze, and unmap the index buffer before calling ctx->Draw, as some drivers don't support drawing with the index buffer mapped. To handle this, we break vbo_sw_primitive_restart calls into sections where "start" matches across all the primitives, similar to how I handled the issue in tnl in commit `bd6120f562`. In the common case, start matches and we handle it in one pass anyway. Fixes Piglit's primitive-restart VBO_COMBINED_VERTEX_AND_INDEX test and KHR-GL33.pipeline_statistics_query_tests_ARB.functional_primitives_vertices_submitted_and_clipping_input_output_primitives on Intel Ivybridge and older (which don't do arbitrary cut indices). Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4052 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9417>	2021-03-05 21:16:32 +00:00
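The bug described in the commit message above comes down to which slice of the index buffer is scanned for restart indices. The sketch below illustrates the idea only; `find_sub_prims` and `struct sub_prim` are hypothetical names and do not match the signature of Mesa's `find_sub_primitives`. It scans only `[start, start + count)` and splits that range at each restart index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One run of indices between restart markers (hypothetical type). */
struct sub_prim {
    uint32_t start;  /* first index position of the run */
    uint32_t count;  /* number of indices in the run */
};

/*
 * Split indices[start .. start + count) at every restart_index.
 * Returns the number of sub-primitives written to out (at most max_out).
 */
static size_t find_sub_prims(const uint32_t *indices,
                             uint32_t start, uint32_t count,
                             uint32_t restart_index,
                             struct sub_prim *out, size_t max_out)
{
    size_t n = 0;
    uint32_t end = start + count;
    uint32_t run_start = start;

    for (uint32_t i = start; i <= end; i++) {
        /* A run ends at a restart index or at the end of the range. */
        if (i == end || indices[i] == restart_index) {
            if (i > run_start && n < max_out) {
                out[n].start = run_start;
                out[n].count = i - run_start;
                n++;
            }
            run_start = i + 1;
        }
    }
    return n;
}
```

Scanning from position 0 instead of `start` would pull unrelated leading indices into the first sub-primitive, which is the kind of mismatch the commit describes for `start > 0`.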
Adam Jackson	cf468b7ad8	zink: more and better debug printfs Use debug_printf more consistently, normalize formatting a bit, and trace a few more places you're likely to care about. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9436>	2021-03-05 15:03:09 -05:00
Gert Wollny	f3aa2f15c2	r600/sfn: eliminate loading unused component loads from shared memory LDS loads are quite expensive, so try to eliminate as many as possible Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9416>	2021-03-05 18:25:25 +00:00
Rhys Perry	9f8a0b797e	radv: cache pipeline statistics Applications rarely require them, but this improves fossil-db replay time. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9411>	2021-03-05 17:01:16 +00:00
Rhys Perry	7c7e8942f8	radv,aco: remove aco_compiler_statistics This removes a pointer from radv_shader_binary_legacy::data. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9411>	2021-03-05 17:01:16 +00:00
Lionel Landwerlin	8955d179d3	anv: fix MI_PREDICATE_RESULT write This register is only 32bits. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `1952fd8d2c` ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9428>	2021-03-05 16:19:20 +00:00
Alyssa Rosenzweig	718bfdb3da	pan/bi: Implement fsin/fcos Instead of lowering it in NIR, use the lookup tables as inputs to a second-order Taylor expansion. shader-db results aren't amazing but keep in mind this is without backend CSE yet. total instructions in shared programs: 115913 -> 115707 (-0.18%) instructions in affected programs: 3151 -> 2945 (-6.54%) helped: 12 HURT: 0 Instructions are helped. total nops in shared programs: 84045 -> 84041 (<.01%) nops in affected programs: 1571 -> 1567 (-0.25%) helped: 1 HURT: 7 Inconclusive result (value mean confidence interval includes 0). total clauses in shared programs: 20498 -> 20489 (-0.04%) clauses in affected programs: 188 -> 179 (-4.79%) helped: 6 HURT: 0 Clauses are helped. total quadwords in shared programs: 90395 -> 90291 (-0.12%) quadwords in affected programs: 2287 -> 2183 (-4.55%) helped: 12 HURT: 0 Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9420>	2021-03-05 15:15:10 +00:00
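As a rough illustration of "lookup tables as inputs to a second-order Taylor expansion": around a table point x_k, with d = x - x_k, sin(x) ≈ sin(x_k) + d·cos(x_k) - (d²/2)·sin(x_k). The sketch below assumes a toy 8-entry table and an input in [0, 2π); the table size, the names, and the range handling are illustrative assumptions and are not taken from the Bifrost compiler or hardware.

```c
#include <assert.h>

#define TABLE_SIZE 8  /* toy table: one entry every pi/4 (assumption) */

static const float PI_F = 3.14159265f;

/* sin and cos sampled at k * pi/4 for k = 0..7 */
static const float sin_table[TABLE_SIZE] = {
    0.0f, 0.70710678f, 1.0f, 0.70710678f,
    0.0f, -0.70710678f, -1.0f, -0.70710678f
};
static const float cos_table[TABLE_SIZE] = {
    1.0f, 0.70710678f, 0.0f, -0.70710678f,
    -1.0f, -0.70710678f, 0.0f, 0.70710678f
};

/* Approximate sin(x) for x in [0, 2*pi). */
static float approx_sin(float x)
{
    const float step = 2.0f * PI_F / TABLE_SIZE;
    int k = (int)(x / step + 0.5f);   /* nearest table entry */
    float d = x - (float)k * step;    /* offset from that entry */
    float s = sin_table[k % TABLE_SIZE];
    float c = cos_table[k % TABLE_SIZE];

    /* Second-order Taylor expansion of sin around the table point. */
    return s + d * c - 0.5f * d * d * s;
}
```

With only 8 entries the offset can reach π/8, so the truncation error is on the order of 0.01; a denser table shrinks it cubically.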
Alyssa Rosenzweig	253b795451	pan/bi: Allow negating constants Useful for representing -0 in transcendental sequences matching the blob. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9420>	2021-03-05 15:15:10 +00:00
Alyssa Rosenzweig	362756ad09	pan/bi: Use replace_index in more places Needed to respect abs/neg. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9420>	2021-03-05 15:15:10 +00:00
Pierre-Eric Pelloux-Prayer	c276bde34a	radeonsi/sqtt: export shader code to RGP With these changes the shader code is visible in RGP. Vk pipeline feature is emulated using si_update_shaders: when shaders are updated we compute a sha1 of their code and use it as a pipeline hash. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	729d3eb0e0	radeonsi/sqtt: don't always use WGP 0 Because it may be disabled. Instead use the cu mask to pick the first active WGP. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	47eafb3f51	radeonsi/sqtt: remove duplicate token V_008D18_REG_INCLUDE_CONTEXT was set twice. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	a27ea38d2a	radeonsi/sqtt: keep a copy of the uploaded shader code Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	7f5a8db96d	ac/rgp: move radv/sqtt functions to ac pso_correlation and code_object_loader don't depend on drivers specific logic so move them to the shared code. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	b2ef94943f	ac/rtld: make ac_rtld_upload returns the code size This will be useful to keep a copy of the uploaded code. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	e5b1e645e7	ac/rgp: make the max gap between shader code a warning For radeonsi the shaders don't live in the same BOs, so they're unlikely to be less than 0x1000 bytes apart. So this commit bumps the threshold to 0x10000 and warns once when hitting it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	0e97d817f5	radeonsi: properly set SPI_SHADER_PGM_HI_ES When not using S_00B324_MEM_BASE the value isn't properly truncated. Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Iago Toral Quiroga	6e6e71ddf9	broadcom/compiler: fix flags check for ldvary merge We were checking that the previous instruction doesn't write flags, but we also need to check it doesn't read them. Fixes: `1784dd22a3` ('broadcom/compiler: pipeline smooth ldvary sequences') Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9431>	2021-03-05 12:55:47 +00:00
Iago Toral Quiroga	21c1853c55	broadcom/compiler: ldvary doesn't implicitly write to r3 since V3D 4.1 total instructions in shared programs: 13805979 -> 13786037 (-0.14%) instructions in affected programs: 2263244 -> 2243302 (-0.88%) helped: 10646 HURT: 1508 Instructions are helped. total threads in shared programs: 412220 -> 412242 (<.01%) threads in affected programs: 58 -> 80 (37.93%) helped: 17 HURT: 6 Threads are helped. total uniforms in shared programs: 3793200 -> 3790401 (-0.07%) uniforms in affected programs: 131281 -> 128482 (-2.13%) helped: 1547 HURT: 281 Uniforms are helped. total max-temps in shared programs: 2326309 -> 2324834 (-0.06%) max-temps in affected programs: 31836 -> 30361 (-4.63%) helped: 1139 HURT: 153 Max-temps are helped. total spills in shared programs: 5932 -> 5940 (0.13%) spills in affected programs: 80 -> 88 (10.00%) helped: 2 HURT: 3 total fills in shared programs: 13370 -> 13372 (0.01%) fills in affected programs: 480 -> 482 (0.42%) helped: 2 HURT: 3 total sfu-stalls in shared programs: 30829 -> 30685 (-0.47%) sfu-stalls in affected programs: 2190 -> 2046 (-6.58%) helped: 570 HURT: 533 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13836808 -> 13816722 (-0.15%) inst-and-stalls in affected programs: 2276152 -> 2256066 (-0.88%) helped: 10643 HURT: 1525 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9430>	2021-03-05 13:37:39 +01:00
Rhys Perry	524848707b	radv: don't set sx_blend_opt_epsilon for V_028C70_COLOR_10_11_11 Matches radeonsi and PAL. From PAL: // 1 is recommended, but doesn't provide sufficient precision Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4394 Fixes: `ed94638156` ("radv: Enable RB+ where possible.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9427>	2021-03-05 11:16:40 +00:00
Iago Toral Quiroga	839007e490	broadcom/compiler: always restart ldvary pipelining when scheduling ldvary When we were only able to pipeline smooth varyings, if we had to disable ldvary pipelining in the middle of a sequence it would stay disabled for the rest of the program, to prevent us from prioritizing scheduling of ldvary instructions that we would not be able to pipeline effectively. Now that we can pipeline all ldvary sequences we can change this. This change re-enables ldvary pipelining upon finding the next ldvary in the program in the hopes that we can continue pipelining successfully. To do this, we track the number of ldvary instructions we emitted so far and compare that to the number of inputs in the fragment shader we are scheduling. This also allows us to simplify our ldvary tracking at nir to vir time, since that is all now handled in the QPU scheduler. total instructions in shared programs: 13817048 -> 13810783 (-0.05%) instructions in affected programs: 810114 -> 803849 (-0.77%) helped: 4843 HURT: 591 Instructions are helped. total max-temps in shared programs: 2326612 -> 2326300 (-0.01%) max-temps in affected programs: 4689 -> 4377 (-6.65%) helped: 285 HURT: 7 Max-temps are helped. total sfu-stalls in shared programs: 30942 -> 30865 (-0.25%) sfu-stalls in affected programs: 207 -> 130 (-37.20%) helped: 120 HURT: 42 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13847990 -> 13841648 (-0.05%) inst-and-stalls in affected programs: 825378 -> 819036 (-0.77%) helped: 4899 HURT: 590 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9404>	2021-03-05 10:32:19 +01:00
Samuel Pitoiset	2169c4f763	radv: re-enable TC-compat HTILE for MSAA D32S8 images on GFX9+ Should help MSAA games. Note that it's broken on GFX8 because the tiling doesn't match. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3868 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9284>	2021-03-05 08:44:40 +00:00
Xin He	97b196b921	virgl: use atomic operations when increase sub_ctx_id Use atomic operations to avoid competition. In addition, since sub_ctx_id 0 has been used by default, sub_ctx_id should start from 1. Signed-off-by: Xin He <hexin.op@bytedance.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9406>	2021-03-05 08:35:29 +00:00
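The two points in the commit message above (an atomic increment, and IDs starting at 1 because 0 is already taken by default) can be sketched with C11 `<stdatomic.h>`. This is a swapped-in illustration: Mesa has its own atomic helpers, and `next_sub_ctx_id` and `alloc_sub_ctx_id` are hypothetical names, not identifiers from the virgl driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Starts at 1 because sub_ctx_id 0 is already used by default. */
static atomic_uint next_sub_ctx_id = 1;

/* Hand out a unique ID even when several threads call this at once. */
static uint32_t alloc_sub_ctx_id(void)
{
    /* atomic_fetch_add returns the value held before the increment. */
    return (uint32_t)atomic_fetch_add(&next_sub_ctx_id, 1);
}
```

A plain `next_sub_ctx_id++` on a non-atomic variable would let two threads read the same value and end up sharing one ID, which is the race the commit is avoiding.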
Samuel Pitoiset	367a93830b	radv: skip useless FCE when fast-clearing MSAA images with DCC enabled The clear code is 0xCC which means CMASK isn't fast-cleared. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9392>	2021-03-05 08:11:28 +00:00
Samuel Pitoiset	6102507a74	radv: remove useless check about mips+layers for TC-compat HTILE images radv_use_htile_for_image() prevents it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9405>	2021-03-05 08:10:19 +01:00
Samuel Pitoiset	438f65fb1e	radv: cleanup enabling TC-compat HTILE for depth surfaces It makes more sense to try to enable TC-compat if the image has HTILE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9405>	2021-03-05 08:09:42 +01:00
Mike Blumenkrantz	55b57db84d	zink: add vk/spirv caps/extension for shader LAYER variable this is required if gl_Layer is used outside of GEOMETRY stage Fixes: `c77df59c9e` ("zink: export PIPE_CAP_TGSI_VS_LAYER_VIEWPORT") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9410>	2021-03-05 03:45:51 +00:00
Dave Airlie	1186fbcdf1	lavapipe: fix dynamic viewport/scissor pipeline emission Just fixup the tests for when the pipeline vp/scissors are emitted. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9422>	2021-03-05 03:34:47 +00:00
Dave Airlie	6bcd304278	lavapipe: fix pipeline vp/scissor mixup. Not copying all the scissors caused dEQP-VK.pipeline.extended_dynamic_state.two_draws_dynamic.2_viewports to fail but that test pointlessly relies on KHR_multiview (cts issue filed). Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Fixes: `b38879f8c5` ("vallium: initial import of the vulkan frontend") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9422>	2021-03-05 03:34:47 +00:00
Iván Briano	194e477615	anv: don't advertise mipmaps for linear 3D surfaces on BDW Prior to SKL, the mipmaps for 3D surfaces are laid out in a way that make it impossible to represent in the way that VkSubresourceLayout expects. Since we can't tell users how to make sense of them, don't report them as available. "Fixes" dEQP-VK.image.subresource_layout.3d.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9419>	2021-03-04 16:23:23 -08:00
Ian Romanick	2c4fd24c01	nir/algebraic: Apply addition property of equality to the other ordering too Inequality comparison operations are not commutative, so `foo < bar` and `bar < foo` both have to be explicitly listed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel GPUs had similar results. (Ice Lake shown) total instructions in shared programs: 20027051 -> 20026899 (<.01%) instructions in affected programs: 37181 -> 37029 (-0.41%) helped: 85 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.79 x̃: 1 helped stats (rel) min: 0.05% max: 6.78% x̄: 0.92% x̃: 0.68% 95% mean confidence interval for instructions value: -2.42 -1.15 95% mean confidence interval for instructions %-change: -1.23% -0.61% Instructions are helped. total cycles in shared programs: 979762793 -> 979753527 (<.01%) cycles in affected programs: 2653905 -> 2644639 (-0.35%) helped: 104 HURT: 50 helped stats (abs) min: 1 max: 1048 x̄: 119.99 x̃: 11 helped stats (rel) min: <.01% max: 9.88% x̄: 0.77% x̃: 0.20% HURT stats (abs) min: 1 max: 734 x̄: 64.26 x̃: 8 HURT stats (rel) min: <.01% max: 3.06% x̄: 0.36% x̃: 0.10% 95% mean confidence interval for cycles value: -98.65 -21.68 95% mean confidence interval for cycles %-change: -0.66% -0.15% Cycles are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Ian Romanick	33031bdab6	nir/algebraic: Apply addition property of equality more conservatively This allows a lot more CSE. Depending on where the addition and the comparison are scheduled, it may also reduce register pressure by reducing the live range of the addends. Across all the platforms, the shaders affected for spills or fills were all fragment shaders from Dirt Rally. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 21043103 -> 21038804 (-0.02%) instructions in affected programs: 892878 -> 888579 (-0.48%) helped: 1549 HURT: 724 helped stats (abs) min: 1 max: 225 x̄: 4.14 x̃: 2 helped stats (rel) min: 0.05% max: 11.18% x̄: 1.04% x̃: 0.78% HURT stats (abs) min: 1 max: 71 x̄: 2.93 x̃: 1 HURT stats (rel) min: 0.07% max: 6.90% x̄: 0.80% x̃: 0.56% 95% mean confidence interval for instructions value: -2.33 -1.45 95% mean confidence interval for instructions %-change: -0.50% -0.40% Instructions are helped. total cycles in shared programs: 855054155 -> 855757566 (0.08%) cycles in affected programs: 58275918 -> 58979329 (1.21%) helped: 1213 HURT: 1680 helped stats (abs) min: 1 max: 107405 x̄: 1684.00 x̃: 10 helped stats (rel) min: <.01% max: 38.09% x̄: 1.51% x̃: 0.25% HURT stats (abs) min: 1 max: 126632 x̄: 1634.59 x̃: 12 HURT stats (rel) min: <.01% max: 85.91% x̄: 2.75% x̃: 0.49% 95% mean confidence interval for cycles value: -98.06 584.35 95% mean confidence interval for cycles %-change: 0.71% 1.22% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 9843 -> 9771 (-0.73%) spills in affected programs: 72 -> 0 helped: 5 HURT: 0 total fills in shared programs: 9600 -> 9451 (-1.55%) fills in affected programs: 149 -> 0 helped: 5 HURT: 0 LOST: 14 GAINED: 9 Skylake total instructions in shared programs: 18185074 -> 18183866 (<.01%) instructions in affected programs: 575180 -> 573972 (-0.21%) helped: 1286 HURT: 468 helped stats (abs) min: 1 max: 15 x̄: 1.55 x̃: 1 helped stats (rel) min: 0.03% max: 4.08% x̄: 0.67% x̃: 0.65% HURT stats (abs) min: 1 max: 8 x̄: 1.69 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.87% x̃: 0.45% 95% mean confidence interval for instructions value: -0.77 -0.60 95% mean confidence interval for instructions %-change: -0.30% -0.22% Instructions are helped. total cycles in shared programs: 960518105 -> 960608234 (<.01%) cycles in affected programs: 42536073 -> 42626202 (0.21%) helped: 1210 HURT: 1714 helped stats (abs) min: 1 max: 7015 x̄: 123.41 x̃: 10 helped stats (rel) min: <.01% max: 33.76% x̄: 1.32% x̃: 0.26% HURT stats (abs) min: 1 max: 14474 x̄: 139.71 x̃: 14 HURT stats (rel) min: <.01% max: 58.94% x̄: 2.00% x̃: 0.44% 95% mean confidence interval for cycles value: 4.02 57.63 95% mean confidence interval for cycles %-change: 0.43% 0.82% Cycles are HURT. LOST: 16 GAINED: 42 Broadwell total instructions in shared programs: 17856880 -> 17852158 (-0.03%) instructions in affected programs: 564836 -> 560114 (-0.84%) helped: 1243 HURT: 418 helped stats (abs) min: 1 max: 115 x̄: 4.36 x̃: 1 helped stats (rel) min: 0.03% max: 9.67% x̄: 0.90% x̃: 0.67% HURT stats (abs) min: 1 max: 8 x̄: 1.67 x̃: 1 HURT stats (rel) min: 0.14% max: 7.69% x̄: 0.89% x̃: 0.46% 95% mean confidence interval for instructions value: -3.45 -2.23 95% mean confidence interval for instructions %-change: -0.51% -0.38% Instructions are helped. 
total cycles in shared programs: 1031140321 -> 1029856892 (-0.12%) cycles in affected programs: 66986946 -> 65703517 (-1.92%) helped: 1084 HURT: 1653 helped stats (abs) min: 1 max: 415168 x̄: 1835.32 x̃: 10 helped stats (rel) min: <.01% max: 57.16% x̄: 1.19% x̃: 0.28% HURT stats (abs) min: 1 max: 43930 x̄: 427.14 x̃: 12 HURT stats (rel) min: <.01% max: 57.53% x̄: 1.32% x̃: 0.39% 95% mean confidence interval for cycles value: -915.76 -22.07 95% mean confidence interval for cycles %-change: 0.17% 0.47% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total spills in shared programs: 20891 -> 20335 (-2.66%) spills in affected programs: 1567 -> 1011 (-35.48%) helped: 70 HURT: 0 total fills in shared programs: 27307 -> 25905 (-5.13%) fills in affected programs: 5381 -> 3979 (-26.05%) helped: 71 HURT: 0 LOST: 17 GAINED: 20 Haswell total instructions in shared programs: 16411850 -> 16409414 (-0.01%) instructions in affected programs: 602666 -> 600230 (-0.40%) helped: 1152 HURT: 781 helped stats (abs) min: 1 max: 103 x̄: 3.59 x̃: 1 helped stats (rel) min: 0.03% max: 8.61% x̄: 0.85% x̃: 0.65% HURT stats (abs) min: 1 max: 41 x̄: 2.18 x̃: 1 HURT stats (rel) min: 0.12% max: 7.69% x̄: 0.88% x̃: 0.69% 95% mean confidence interval for instructions value: -1.74 -0.78 95% mean confidence interval for instructions %-change: -0.21% -0.10% Instructions are helped. total cycles in shared programs: 1035338781 -> 1036977801 (0.16%) cycles in affected programs: 68961096 -> 70600116 (2.38%) helped: 1246 HURT: 2206 helped stats (abs) min: 1 max: 392022 x̄: 1040.28 x̃: 14 helped stats (rel) min: <.01% max: 56.44% x̄: 2.32% x̃: 0.38% HURT stats (abs) min: 1 max: 68630 x̄: 1330.56 x̃: 18 HURT stats (rel) min: <.01% max: 69.97% x̄: 3.31% x̃: 0.61% 95% mean confidence interval for cycles value: 90.43 859.17 95% mean confidence interval for cycles %-change: 1.02% 1.54% Cycles are HURT. 
total spills in shared programs: 17805 -> 17457 (-1.95%) spills in affected programs: 1202 -> 854 (-28.95%) helped: 34 HURT: 31 total fills in shared programs: 20939 -> 20387 (-2.64%) fills in affected programs: 2702 -> 2150 (-20.43%) helped: 34 HURT: 31 LOST: 24 GAINED: 45 Ivy Bridge and earlier Intel GPUs had similar results. (Ivy Bridge shown) total instructions in shared programs: 15515912 -> 15516757 (<.01%) instructions in affected programs: 396569 -> 397414 (0.21%) helped: 578 HURT: 858 helped stats (abs) min: 1 max: 9 x̄: 1.32 x̃: 1 helped stats (rel) min: 0.04% max: 3.70% x̄: 0.65% x̃: 0.65% HURT stats (abs) min: 1 max: 11 x̄: 1.87 x̃: 1 HURT stats (rel) min: 0.08% max: 12.90% x̄: 0.95% x̃: 0.53% 95% mean confidence interval for instructions value: 0.47 0.70 95% mean confidence interval for instructions %-change: 0.24% 0.37% Instructions are HURT. total cycles in shared programs: 584395455 -> 584466352 (0.01%) cycles in affected programs: 20346570 -> 20417467 (0.35%) helped: 1192 HURT: 1896 helped stats (abs) min: 1 max: 4108 x̄: 123.27 x̃: 14 helped stats (rel) min: <.01% max: 37.20% x̄: 2.27% x̃: 0.46% HURT stats (abs) min: 1 max: 3698 x̄: 114.89 x̃: 19 HURT stats (rel) min: <.01% max: 70.28% x̄: 3.02% x̃: 0.71% 95% mean confidence interval for cycles value: 10.75 35.16 95% mean confidence interval for cycles %-change: 0.73% 1.23% Cycles are HURT. LOST: 20 GAINED: 12 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Kenneth Graunke	206495cac4	iris: Enable u_threaded_context This implements most of the remaining u_threaded_context support. Most of the heavy lifting was done in the previous patches which fixed things up for the new thread safety requirements. Only a few things remain. u_threaded_context support can be disabled via an environment variable: GALLIUM_THREAD=0 On Felix's Tigerlake with the GPU at fixed frequency, enabling u_threaded_context improves performance of several games: - Civilization VI: +17% - Shadow of Mordor: +6% - Bioshock Infinite +6% - Xonotic: +6% Various microbenchmarks improve substantially as well: - GfxBench5 gl_driver2: +58% - SynMark2 OglBatch6: +54% - Piglit drawoverhead: +25% Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	c133d0930f	iris: Use thread safe slab allocators in transfer_map handling pipe->transfer_map can be called from u_threaded_context's thread rather than the driver thread. We need to use two different slab allocators, one for each thread. transfer_unmap, on the other hand, is only ever called from the driver thread. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	1b1c857248	iris: Make various classes inherit from u_threaded_context base classes u_threaded_context requires various objects to inherit from a new threaded_foo base class rather than directly from pipe_foo. This patch does most of the mechanical changes required for that. It also initializes the new threaded_resource fields. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	3358c7125a	iris: Use different shader uploaders for precompile vs. draw time When we enable u_threaded_context, the pipe->create_*_state hooks (precompile variants) are going to be called from one thread, while iris_update_compiled_shaders (on-the-fly variants) are going to be called from a driver thread. BLORP shaders also happen from clear, blit, and so on in the driver thread. u_upload_mgr isn't thread-safe, so use an uploader for each purpose. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	ec0d61c14c	iris: Support rebinding of stream output targets This enables us to replace the backing storage of resources that have been used as stream output targets, in case we're invalidating their entire contents. This can avoid stalls. We simply hadn't supported it because it was going to be tricky to re-emit 3DSTATE_SO_BUFFER without screwing up "reset offset to zero" vs. "keep appending". But that should be working fine with the previous patch's refactor. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	08e04ddd2c	iris: Rework zeroing of stream output buffer offsets The previous mechanism was a bit fragile. We stored the zero offset in the pre-baked packet, and used a flag to override 0xFFFFFFFF (append) offsets until our first emit - then prohibited anyone from trying to re-emit the packet by flagging IRIS_DIRTY_SO_BUFFERS, because that would re-emit the version with the zeroing of the offset. Now, we always store 0xFFFFFFFF in the pre-baked packet, and use a flag to override it to zero on the first emit. That way, we can re-emit that packet at any time, and it'll just keep appending. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00


125501 commits