Commit graph

148656 commits

Author SHA1 Message Date
Alyssa Rosenzweig
b550b3c89c vc4: Use u_box_pixels_to_blocks helper
Eliminates a ETC1 special case. In fact this unit conversion applies to
all formats; the original code path works since ETC1 is the only format
with blocks bigger than 1x1 supported by vc4 (I assume).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>
2022-01-10 23:16:56 +00:00
Alyssa Rosenzweig
6f07159a1d v3d: Use u_box_pixels_to_blocks helper
Rather than open-coding.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>
2022-01-10 23:16:56 +00:00
Alyssa Rosenzweig
b920ace4bc lima,panfrost: Correct pixel vs block mismatches
Different parts of our codebase disagree on whether spatial
coordinates/dimensions are given in pixels or blocks, which differ by a
constant factor for block-compressed formats. This disagreement
manifests as incorrect results accessing block-compressed formats.

To resolve this, define the public tiling routines to take their
coordinates in pixels, and align the relevant code in Panfrost
accordingly.

Fixes rendering glitches in Factorio, as well as a pile of piglits on
Panfrost. It should also fix glTexSubImage() with ETC1 on Lima, but
there are no tests for this in dEQP/Piglit.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> [dEQP/Lima]
Tested-by: Erico Nunes <nunes.erico@gmail.com> [Piglit/Lima]
Reported-by: Icecream95 <ixn@disroot.org>
Closes: #5560
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>
2022-01-10 23:16:56 +00:00
Alyssa Rosenzweig
26c533f167 gallium/util: Add pixel->blocks box helper
There is a lot of unit confusion in Gallium due to pixels versus blocks
matching only with uncompressed textures. Add a helper to do a common
pixels->blocks unit conversion required in multiple drivers.

v2: Rename dst->blocks, src->pixels to avoid confusion about the units
to casual readers (Mike).

Note to mesa-stable maintainers: this is marked as Cc: mesa-stable so
the next patch (a set of bug fixes for Lima and Panfrost) can be
backported. It's not a bug fix in its own right, of course.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net> [v1]
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>
2022-01-10 23:16:56 +00:00
Thomas H.P. Andersen
7daba1fe65 replace 0 with NULL for NULL pointers
This updates many places where 0 is used as NULL pointer.

There are a few warnings left when I build the default
configuration but they either relate to code
outside of mesa or where "None" is used instead.

Found with static analysis (smatch)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12174>
2022-01-10 22:53:32 +00:00
Rhys Perry
60c711833f aco: remove pack_half_2x16(a, 0) optimization
This makes the compiler less predictable and should only have a very small
effect on performance.

fossil-db (Vega):
Totals from 2410 (1.79% of 134756) affected shaders:
CodeSize: 6911568 -> 6942840 (+0.45%)

Fixes Horizon Zero Dawn artifacts.

If a shader has:
   a = pack_half_2x16(a, 0) //rtne
   store(pack_half_2x16(0, b) | a) //rtne
   a = unpack_2x16(a).x
It will become:
   store(pack_half_2x16(a, b)) //rtz
   a = unpack_2x16(pack_half_2x16(a, 0)).x //rtne

So a later shader with "unpack_2x16(load()).x" will use "a" rounded to
zero, while the previous shader will use "a" rounded to the nearest even.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 2f125908b3 ("radv,aco: lower_pack_half_2x16")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14475>
2022-01-10 22:19:29 +00:00
Christian Gmeiner
6e08d8fc3d ci: Uprev piglit to af1785f31
Brings in these changes:

af1785f31 occlusion_query_conform: skip GetQueryCounterBits test if needed
dad078717 occlusion_query_conform: convert to pilgit subtests
b52c1c761 glsl-1.30: test nested preprocessor concat
6c4da153b texture-storage: Fix subtest result handling of skips.
4343f19db fbo-integer: Remove the invalid DrawPixels test.
e3842f2fe arb_dsa: exclude stencil8 textures from test sets.
ce8649be7 spec/ext_external_objects: Fix build on Debian systems
4e553838f glsl: add basic tests for desktop GLSL invariant qualifier linking
7e61e5199 Tests for variable in and out of loop scope
f855ad1c8 fbo-mrt-alphatest: Only require GLSL 1.20
9be2fe999 glx: add glx-multi-display-single-pbuffer test
bfe290725 glx: add glx-swap-pbuffer test
efa64335e framework: Fix build on Windows when using waffle

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14468>
2022-01-10 21:52:42 +00:00
Jordan Justen
0fc93928f1 isl: Don't enable HDC:L1 caches on DG2
The MOCS entry used for this on Tigerlake doesn't exist on DG2.

Ref: aca31baafc ("isl: Enable Tigerlake HDC:L1 caches via MOCS in various cases.")
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14467>
2022-01-10 21:20:03 +00:00
Rhys Perry
67fc7a1763 nir/uniform_atomics: fix is_atomic_already_optimized without workgroups
dims_needed would have been zero, so this would always returned true for
non-compute stages.

Also fix this for variable workgroup sizes.

Improves Shadow of the Tomb Raider RX 6800 performance by 10.6%, 11.5% and
4.5% (day_of_dead, jungle and paititi scenes).

radv_perf before and after:
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '62.913333333333334', 'min_fps': '62.81', 'max_fps': '62.98', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '64.02666666666666', 'min_fps': '63.93', 'max_fps': '64.11', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '74.81666666666666', 'min_fps': '74.72', 'max_fps': '74.88', 'interations': '3'}

{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '69.57', 'min_fps': '69.52', 'max_fps': '69.63', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '71.41000000000001', 'min_fps': '71.31', 'max_fps': '71.5', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '78.16666666666667', 'min_fps': '78.07', 'max_fps': '78.23', 'interations': '3'}

Performance now seems slightly better than AMDVLK 2021.Q4.3:
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '68.02666666666666', 'min_fps': '67.95', 'max_fps': '68.16', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '70.24666666666667', 'min_fps': '69.83', 'max_fps': '70.51', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '77.19', 'min_fps': '77.18', 'max_fps': '77.2', 'interations': '3'}

fossil-db (Sienna Cichlid):
Totals from 40 (0.03% of 134621) affected shaders:
CodeSize: 62676 -> 65996 (+5.30%)
Instrs: 11372 -> 12111 (+6.50%)
Latency: 144122 -> 142848 (-0.88%); split: -1.09%, +0.21%
InvThroughput: 19686 -> 19847 (+0.82%); split: -0.06%, +0.87%
VClause: 304 -> 306 (+0.66%)
SClause: 603 -> 604 (+0.17%); split: -0.83%, +1.00%
Copies: 780 -> 858 (+10.00%)
Branches: 235 -> 329 (+40.00%)
PreSGPRs: 1072 -> 1083 (+1.03%); split: -0.37%, +1.40%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14407>
2022-01-10 19:57:38 +00:00
Konstantin Seurer
aaa90c37e0 panvk: Fixed maxFragmentCombinedOutputResources
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>
2022-01-10 19:28:17 +00:00
Konstantin Seurer
651bec0971 turnip: Fixed maxFragmentCombinedOutputResources
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>
2022-01-10 19:28:17 +00:00
Konstantin Seurer
e0d590cafb anv: Fixed maxFragmentCombinedOutputResources
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>
2022-01-10 19:28:17 +00:00
Konstantin Seurer
2b5cf84efd lavapipe: Fixed maxFragmentCombinedOutputResources
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>
2022-01-10 19:28:17 +00:00
Rhys Perry
0f5d90c2a7 ac/nir: fix store_buffer_amd write_masks
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>
2022-01-10 19:01:04 +00:00
Rhys Perry
b00138090e nir/lower_shader_calls: fix store_scratch write_mask
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>
2022-01-10 19:01:04 +00:00
Lucas Stach
d799a4be27 etnaviv: drm: defer destruction of softpin BOs
When destroying a BO with a userspace managed address and thus freeing
the VMA space, we need to make sure that the BO isn't in use by any
active submit anymore, as the kernel will rightfully reject the next
submit that re-uses the still active VMA. Keep the BO alive as long
as it isn't fully idle to prevent the VMA being reused prematurely.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>
2022-01-10 16:49:00 +00:00
Lucas Stach
98a2049c08 etnaviv: drm: rename _etna_bo_del
Rename it to a somwhat more descriptive name, which makes it easier
to distinguish between the etna_bo_del function in the public interface
and the internal function. Also remove the duplicated forward declaration
and move it to the common interal header.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>
2022-01-10 16:49:00 +00:00
Lucas Stach
77ebbcbf9a etnaviv: drm: export BO idle check function
The ability to check if a BO is idle is not only useful in the
buffer cache, but also in other parts of the winsys and even the
pipe driver. Make this functionality available in the interface.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>
2022-01-10 16:49:00 +00:00
Lucas Stach
1b1f8592c0 etnaviv: drm: properly handle reviving BOs via a lookup
If a BO is removed from a cache bucket list via a lookup, we must
handle it in the same way as if a allocation from the cache happened:
tell valgrind that the buffer is active again and take a reference
to the etna_device, which the BO had given up while being in the
cache.

Cc: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>
2022-01-10 16:49:00 +00:00
Lucas Stach
ccfd5054a4 etnaviv: drm: fix size limit in etna_cmd_stream_realloc
The intended limit for command stream size is 64KB, as this is what old
kernels can reliably do and what allows for maximum number of queued
streams on newer kernels. However, due to unit confusion with the size
member, which is in dwords, the submitted streams could grow up to
~128KB. Fix this by using the proper limit in dwords.

Flushing due to some limits being exceeded is not an issue, but is
expected with certain workloads, so lower the severity of the message
being emitted in this case to debug level.

Cc: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14425>
2022-01-10 15:38:27 +00:00
Lucas Stach
22d796feb8 egl/wayland: break double/tripple buffering feedback loops
Currently we dispose any unneeded color buffers immediately if we detect that
there are more unlocked buffers than we need. This can lead to feedback loops
between the compositor and the application causing rapid toggling between
double and tripple buffering.

Scenario: 2 buffers already queued to the compositor, egl/wayland allocates a
new back buffer to avoid throttling, slowing down the frame. This allows the
compositor to catch up and unlock both buffers. EGL detects that there are
more buffers than currently needed, freeing the buffer, restarting the loop
shortly after.

To avoid wasting CPU time on rapidly freeing and reallocating color buffers
break those feedback loops by letting the unneeded buffers sit around for a
short while before disposing them.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14451>
2022-01-10 15:11:44 +00:00
Danylo Piliaiev
d77bfc117c tu,ir3: Implement VK_KHR_shader_integer_dot_product
- gen4 - has dp4acc and dp2acc, dp4acc is used to implement
  4x8 dot product.
- gen3 - has dp2acc, in OpenCL blob uses dp2acc for dot product
  on both get3 and gen4.
- gen2 - unknown, lower everything.
- gen1 - no dp2acc, lower everything. OpenCL blob doesn't advertise
  cl_qcom_dot_product8 but still generates code for it.
  The assembly is more verbose and uses yet to be documented
  mad32.u16 instruction.

Passes:
 dEQP-VK.spirv_assembly.instruction.compute.opsdotkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opudotkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opsudotkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opsdotaccsatkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opsudotaccsatkhr.*

Only packed 4x8 unsigned and mixed versions are accelerated.
However in theory we should be able to do better for signed version
than current NIR lowering.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:21:24 +02:00
Danylo Piliaiev
e1f89a1da2 ir3: Make nir compiler options a part of ir3_compiler
This would allow for sub-gens to have different options.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:20:39 +02:00
Danylo Piliaiev
b8d486f298 nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8
Adreno GPUs has native instruction for unsigned and mixed dot_4x8 but
not signed dot product.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:20:39 +02:00
Danylo Piliaiev
c1d5c318bc ir3: New cat3 instructions
* shrm - (src2 >> src1) & src3
* shlm - (src2 << src1) & src3
* shrg - (src2 >> src1) | src3
* shlg - (src2 << src1) | src3
* andg - (src2 & src1) | src3
* dp2acc - dot product of two {i,u}8vec2 packed into
  SRC1 and SRC2, added to 32b SRC3
* dp4acc - dot product of two {i,u}8vec4 packed into
  SRC1 and SRC2, added to 32b SRC3
* wmm - vec4(x_1, x_2, x_3, x_4) * (y_1 + y_2 + y_3 + y_4), which is
  duplicated (1 << (SRC3 / 32)) times starting from DST register
* wmm.accu - same as wmm but result is added to DST registers, however
  the first reg in each vec4 result is overwritten instead of
  accumulating.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:20:39 +02:00
Connor Abbott
c45c6e36eb tu: Implement VK_EXT_subgroup_size_control
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Connor Abbott
1a1e25dcce tu, ir3: Support runtime gl_SubgroupSize in FS
We already supported it in the CS for computing the subgroup ID, but
soon we'll need it in the FS too. Vertex stages will always have it
lowered.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Connor Abbott
e6e34883a9 ir3: Add wavesize control
This allows the wavesize to be controlled per-shader. This will be used
by VK_EXT_subgroup_size_control, and freedreno will also need it if
legacy ARB_shader_ballot is to be supported (since it forces a wavesize
of 64 or less).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Connor Abbott
30237b3d9c ir3: Pass shader to ir3_nir_post_finalize()
We'll need to add shader-specific lowering for gl_SubgroupSize.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Connor Abbott
9ebc48005c ir3, freedreno: Add options struct for ir3_shader_from_nir()
We'll expand this in a moment.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
2022-01-10 10:58:28 +00:00
Danylo Piliaiev
fe9c9ec83f tu: fix workaround for depth bounds test without depth test
Fixes: bb4db22ff4

("turnip: apply workaround for depth bounds test without depth test")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14390>
2022-01-10 09:36:59 +00:00
Lionel Landwerlin
07bc6b7ed9 anv: limit compiler valid color outputs using NIR variables
This fixes a test from the vkd3d-proton test_dual_source_blending_dxbc
test which asserts in the backend with :

   brw_fs_visitor.cpp:716: void fs_visitor::emit_fb_writes(): Assertion `!prog_data->dual_src_blend || key->nr_color_regions == 1' failed.

This is because there is 2 color attachments provided by the
renderpass so we initially set nr_color_regions = 2. But once we've
parsed the shader, we can see it's only using one output (with dual
source color blending).

This change looks at the output variables to update the valid output
variables.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14417>
2022-01-10 09:38:32 +02:00
Tapani Pälli
b8f0459d6f iris: unref syncobjs and free r/w dependencies array for slab entries
Fixes memory leak with dependencies array:

  ==5224== 104 (96 direct, 8 indirect) bytes in 3 blocks are definitely lost in loss record 1,954 of 2,035
  ==5224==    at 0x484178A: malloc (vg_replace_malloc.c:380)
  ==5224==    by 0x484670B: realloc (vg_replace_malloc.c:1437)
  ==5224==    by 0x14DBAB9B: update_bo_syncobjs (iris_batch.c:819)
  ==5224==    by 0x14DBADB8: update_batch_syncobjs (iris_batch.c:898)
  ==5224==    by 0x14DBB3D5: _iris_batch_flush (iris_batch.c:1031)
  ==5224==    by 0x14DB77D0: iris_transfer_map (iris_resource.c:2348)
  ==5224==    by 0x157786FD: u_transfer_helper_transfer_map (u_transfer_helper.c:243)
  ==5224==    by 0x14C479E7: tc_buffer_map (u_threaded_context.c:2252)
  ==5224==    by 0x1434F3F8: pipe_buffer_map_range (u_inlines.h:393)
  ==5224==    by 0x1435094A: _mesa_bufferobj_map_range (bufferobj.c:491)
  ==5224==    by 0x143586D9: map_buffer_range (bufferobj.c:3737)
  ==5224==    by 0x14358DA3: _mesa_MapBuffer (bufferobj.c:3947)

  ==5224== 240 (192 direct, 48 indirect) bytes in 6 blocks are definitely lost in loss record 1,984 of 2,035
  ==5224==    at 0x484178A: malloc (vg_replace_malloc.c:380)
  ==5224==    by 0x484670B: realloc (vg_replace_malloc.c:1437)
  ==5224==    by 0x14DBAB9B: update_bo_syncobjs (iris_batch.c:819)
  ==5224==    by 0x14DBADB8: update_batch_syncobjs (iris_batch.c:898)
  ==5224==    by 0x14DBB3D5: _iris_batch_flush (iris_batch.c:1031)
  ==5224==    by 0x14FF72CC: iris_get_query_result (iris_query.c:631)
  ==5224==    by 0x14C4396A: tc_get_query_result (u_threaded_context.c:880)
  ==5224==    by 0x1458F4F7: get_query_result (st_cb_queryobj.c:273)
  ==5224==    by 0x1458F7EB: st_WaitQuery (st_cb_queryobj.c:352)
  ==5224==    by 0x144EFF66: get_query_object (queryobj.c:742)
  ==5224==    by 0x144F01AE: _mesa_GetQueryObjectuiv (queryobj.c:811)

And leak with syncobjs:

  ==13644== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1,846
  ==13644==    at 0x484186F: malloc (vg_replace_malloc.c:381)
  ==13644==    by 0x639789B: iris_create_syncobj (iris_fence.c:69)
  ==13644==    by 0x63B213A: iris_batch_reset (iris_batch.c:512)
  ==13644==    by 0x63B3637: _iris_batch_flush (iris_batch.c:1056)
  ==13644==    by 0x65EF2BC: iris_get_query_result (iris_query.c:631)
  ==13644==    by 0x623B970: tc_get_query_result (u_threaded_context.c:880)
  ==13644==    by 0x5B874F7: get_query_result (st_cb_queryobj.c:273)
  ==13644==    by 0x5B877EB: st_WaitQuery (st_cb_queryobj.c:352)
  ==13644==    by 0x5AE7F66: get_query_object (queryobj.c:742)
  ==13644==    by 0x5AE8150: _mesa_GetQueryObjectiv (queryobj.c:801)

Fixes: ce2e2296ab ("iris: Suballocate BO using the Gallium pb_slab mechanism")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14387>
2022-01-09 13:43:45 +00:00
Christian Gmeiner
9cb91010ab iris/ci: update piglit fails
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14442>
2022-01-07 23:12:37 +00:00
Christian Gmeiner
4d624f189e i915g/ci: update piglit fails
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14439>
2022-01-07 23:00:16 +00:00
Emma Anholt
a2dbdc645f ci: Shrink container/rootfs sizes.
Cutting the extra VK mustpass files is 315MB out of 1.5GB of the amd64
rootfs.  pip was 10MB.  The rustup toolchains were massive (over a GB
IIRC) on the x86 container images.

Hopefully helps with #5837

Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14460>
2022-01-07 22:32:31 +00:00
Yiwei Zhang
48712b8cc5 venus: subtract appended header size in vn_CreatePipelineCache
Use header->header_size to offset cache data as well in case the header
struct extends on a newer driver but the cache data was appended with
an old header.

Fixes: 723f0bf74a ("venus: initial support for module and pipelines")

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14463>
2022-01-07 22:10:53 +00:00
Danylo Piliaiev
3792fbfcf6 ir3: Assert that we cannot have enough concurrent waves for CS with barrier
If we have a compute shader that has a big workgroup, a barrier, and
a branchstack which limits max_waves - this may result in a situation
when we cannot run concurrently all waves of the workgroup, which
would lead to a hang.

Blob just explodes in such case.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14110>
2022-01-07 18:40:15 +00:00
Danylo Piliaiev
9ed4d49c97 ir3: Be able to reduce register limit for RA when CS has barriers
If barriers are used, it must be possible for all waves in the workgroup
to execute concurrently. Thus we may have to reduce the registers limit.

Fixes a hang in "Digital Combat Simulator".

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14110>
2022-01-07 18:40:15 +00:00
Hoe Hao Cheng
9323d2ea6d zink/codegen: remove bogus print statement
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14434>
2022-01-07 17:51:45 +00:00
Hoe Hao Cheng
37f01832eb zink/codegen: remove core_since in constructor
the variable is now automatically filled in according to registry values

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14434>
2022-01-07 17:51:45 +00:00
Hoe Hao Cheng
029e871239 zink/codegen: support platform tags
Some extensions are locked behind certain platforms, don't include them
if the extension is unsupported.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14434>
2022-01-07 17:51:45 +00:00
Lionel Landwerlin
1d40d53e03 anv: don't leave anv_batch fields undefined
Because the extend_cb vfunc is not initialized, there is a risk that
the emission code calls into a random pointer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14418>
2022-01-07 17:28:11 +00:00
Gert Wollny
8685a505e7 ntt: Set the output invariant flag according to the semantics
This is used by virglrenderer to create the correct shaders on the
host. Fixes:

dEQP-GLES31.functional.primitive_bounding_box.triangles.tessellation_set_per_primitive.vertex_tessellation_fragment.fbo

when using ntt with virgl.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>
2022-01-07 16:35:43 +00:00
Gert Wollny
6f348d9c99 nir_lower_io: propagate the "invariant" flag to outputs
Ultimately this is consumed by nir-to-tgsi and needed by virglrenderer
to correctly declare output variables.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>
2022-01-07 16:35:43 +00:00
Gert Wollny
5bfe292708 util/primconvert: map only index buffer part that is needed
By putting vertex store and indices all in one buffer the larger part
of the shared buffer might actually only be vertex data we are not
interested in. Hence only map the part of the buffer that contains the
index data for the currently active draw command.

This helps drivers where a mapping operation is expensive, like e.g. virgl.

v2: - add comment about ranged buffer mapping (Pierre-Eric)
    - keep passing direct_draws[i].start to direct_draw_func, it looks
      like the "start" parameter is properly set in
      util_prim_restart_convert_to_direct

v3: Fix ws error (Mike)

Related: #5825

Fixes: f9d12bf50e
   vbo/dlist: use a single buffer object

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>
2022-01-07 16:35:43 +00:00
Christian Gmeiner
86b19db459 etnaviv/ci: update piglit fails
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14441>
2022-01-07 16:24:12 +00:00
Rhys Perry
1756930a79 radv: increase maxTaskOutputCount to 65535
This is the minimum required by the spec.

Fixes dEQP-VK.api.info.vulkan1p2_limits_validation.nv_mesh_shader

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14446>
2022-01-07 15:04:11 +00:00
Connor Abbott
cb45120556 ir3: Use (ss) for instructions writing shared regs
The blob uses *both* nops and (ss). It turns out that in some rare cases
the hardware does take more than 6 cycles, at least for movmsk, but
adding nops is unnecessary. I believe the extra nops are only there due
to the immaturity of the blob's implementation of subgroup ops, so we
don't have to copy them - just handle shared reg producers the same as
SFU instructions.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00
Connor Abbott
d45678cac4 ir3/postsched: Rename tex/sfu to sy/ss
Analogous to the previous commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
2022-01-07 14:26:08 +00:00