Commit graph

184774 commits

Author SHA1 Message Date
Marek Olšák
1afe6f3321 radeonsi: don't print the preamble state separately for GALLIUM_DDEBUG
because it's always printed as part of command buffers.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
9e76459616 radeonsi: execute streamout_begin after cache flushes
so that si_emit_streamout_begin can assume that cache flushes have
finished.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
2022854360 radeonsi/gfx11: skip si_set_streamout_enable because it has no effect
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
bf7debee82 radeonsi: in bind_{blend,rs}_state, only call 1 update function per if
Also don't use "key.ps.part.prolog.color_two_side" during updates
because it would depend on the order the update functions are called,
which is not a problem now, but it's a trap for the future.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
53aa36772a radeonsi: rewrite si_get_total_colormask as si_any_colorbuffer_written
The result is only used as bool.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
e2b817b948 radeonsi: rewrite how shader key bits dependent on current_rast_prim are updated
Don't set do_update_shaders every time current_rast_prim changes, which can
be EVERY DRAW. Instead, just update the shader key bits and set
do_update_shaders only if any bits are different.

When we bind a new rasterizer state, do the same.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
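As a rough illustration of the approach described in e2b817b948 (not the radeonsi code itself), the sketch below recomputes the primitive-dependent key bits and raises the update flag only when the packed bits actually differ. The struct, its fields apart from the `do_update_shaders` name, the bit macros, and the function are all hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are not the radeonsi structures. */
#define SKETCH_KEY_POLY_STIPPLE (1u << 0)
#define SKETCH_KEY_LINE_SMOOTH  (1u << 1)

struct sketch_context {
   uint32_t key_bits;       /* packed shader key bits */
   bool do_update_shaders;  /* set when shader variants must be re-selected */
};

/* Recompute the key bits that depend on the rasterized primitive type and
 * flag a shader update only if the resulting bits differ from the old ones. */
void
sketch_update_prim_key(struct sketch_context *ctx, bool is_poly, bool is_line,
                       bool stipple_enabled, bool smooth_enabled)
{
   uint32_t old_bits = ctx->key_bits;
   uint32_t new_bits =
      old_bits & ~(SKETCH_KEY_POLY_STIPPLE | SKETCH_KEY_LINE_SMOOTH);

   if (is_poly && stipple_enabled)
      new_bits |= SKETCH_KEY_POLY_STIPPLE;
   if (is_line && smooth_enabled)
      new_bits |= SKETCH_KEY_LINE_SMOOTH;

   ctx->key_bits = new_bits;
   if (new_bits != old_bits)
      ctx->do_update_shaders = true;
}
```

Calling this on every draw is cheap: repeated draws with the same primitive type and rasterizer settings leave `do_update_shaders` untouched.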
Marek Olšák
4ab5374ec3 radeonsi: clean up setting poly/line/stipple shader key bits
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
f9c4ac3477 radeonsi: update shaders for rasterizer state only if the shader key changed
Check if any key bit changed before setting do_update_shaders.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
613ea16aab radeonsi: update shaders for blend state only if the shader key changed
Check if any key bit or state changed before setting do_update_shaders.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
c8411ddf17 radeonsi: change the low-priority compiler queue to normal priority
I'm guessing that low priority could cause us to get optimized shaders later
than we need.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
98e7a7123b radeonsi: don't set non-existent VGT_GS_MAX_PRIMS_PER_SUBGROUP on gfx10
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Marek Olšák
f51b960af1 radeonsi/gfx11: fix unaligned SET_CONTEXT_PAIRS_PACKED
It set an invalid register. Luckily it didn't cause any issues.

Fixes: 2ac6816b70 - radeonsi/gfx11: use SET_CONTEXT_REG_PAIRS_PACKED for other states

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26307>
2023-12-09 00:05:27 +00:00
Paulo Zanoni
af65af8267 intel/tools: fix compilation of intel_hang_viewer on 32 bits
Because gcc was complaining:

../../src/intel/tools/intel_hang_viewer.cpp: In function ‘void display_hang_stats()’:
../../src/intel/tools/intel_hang_viewer.cpp:365:31: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘std::vector<hang_bo>::size_type’ {aka ‘unsigned int’} [-Werror=format=]
  365 |    ImGui::Text("BOs:        %lu", context.bos.size());
      |                             ~~^   ~~~~~~~~~~~~~~~~~~
      |                               |                   |
      |                               long unsigned int   std::vector<hang_bo>::size_type {aka unsigned int}
      |                             %u
../../src/intel/tools/intel_hang_viewer.cpp:366:31: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘std::vector<hang_exec>::size_type’ {aka ‘unsigned int’} [-Werror=format=]
  366 |    ImGui::Text("Execs       %lu", context.execs.size());
      |                             ~~^   ~~~~~~~~~~~~~~~~~~~~
      |                               |                     |
      |                               long unsigned int     std::vector<hang_exec>::size_type {aka unsigned int}
      |                             %u
../../src/intel/tools/intel_hang_viewer.cpp:367:31: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘std::vector<hang_map>::size_type’ {aka ‘unsigned int’} [-Werror=format=]
  367 |    ImGui::Text("Maps:       %lu", context.maps.size());
      |                             ~~^   ~~~~~~~~~~~~~~~~~~~
      |                               |                    |
      |                               long unsigned int    std::vector<hang_map>::size_type {aka unsigned int}
      |                             %u
cc1plus: some warnings being treated as errors

I'm not sure if STL's size_type is defined by the spec to be anything
specific, but for the platforms we care about it seems to be size_t,
so change it to %z.

Fixes: 33fd93f3b1 ("intel/tools: hang viewer/editor")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26581>
2023-12-08 22:53:03 +00:00
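The format mismatch reported in af65af8267 follows the usual printf rule: `z` is a length modifier, and the complete conversion for a `size_t` value is `%zu`. A minimal plain-C sketch, assuming the count really is a `size_t` (the function name and the use of `snprintf` are illustrative and not taken from intel_hang_viewer):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a label and a size_t count portably.
 * "%zu" matches size_t on both 32-bit and 64-bit targets, whereas "%lu"
 * only matches where size_t happens to be unsigned long. */
int
sketch_format_count(char *buf, size_t buf_size, const char *label, size_t count)
{
   return snprintf(buf, buf_size, "%s %zu", label, count);
}
```

When the argument is not guaranteed to be `size_t`, an explicit cast such as `(size_t)value` at the call site keeps the format and the argument in agreement.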
Eric Engestrom
b0ad9995d6 v3dv/ci: only trigger on relevant changes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26597>
2023-12-08 22:19:50 +00:00
Matt Turner
6d2be84672 ci/lava: Add firmware-misc-nonfree on amd64
Hopefully this provides the GuC firmware files we need for testing on
Intel ADL+ boards.

Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9841
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25792>
2023-12-08 21:25:12 +00:00
Rob Clark
2132f95de0 freedreno/a6xx: Fix NV12+UBWC import
Treat R8_G8B8_420_UNORM and NV12 the same, because dri2 frontend doesn't
understand or care about the difference from the sampler PoV.

Fixes: 1e820ac128 ("freedreno: Rework supported-modifiers handling")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26601>
2023-12-08 21:06:05 +00:00
Sagar Ghuge
708d4f59f8 anv: Use RCS cmd buffer if blit src/dest has 3 components
The Blitter engine lacks support for 3-component color formats, so we can
just fall back to the RCS companion command buffer for the blit operation.

Even though the blitter supports 96-bit formats, it supports them only with
linear tiling. We can support other types of tiling by falling back to the RCS
companion command buffer.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26300>
2023-12-08 20:44:03 +00:00
Ian Romanick
87cdcbd7d7 intel/compiler: Verify that DO is alone in the block
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26439>
2023-12-08 20:21:28 +00:00
Ian Romanick
65237f8bbc intel/fs: Don't add MOV instructions to DO blocks in combine constants
There was a subtle bug related to CFG tracking. Namely, some branch
instructions may point *only* to the block after the DO instruction
for the loop. If the MOV instructions are in the DO block, they may not
have liveness properly tracked.

Like in !25132, having the MOV instructions in blocks that might
contain other instructions helps scheduling.

shader-db:

All Broadwell and newer Intel GPUs had similar results (Ice Lake shown)
total cycles in shared programs: 848577248 -> 848557268 (<.01%)
cycles in affected programs: 78256396 -> 78236416 (-0.03%)
helped: 361 / HURT: 18

fossil-db:

All Skylake and newer Intel GPUs had similar results (Ice Lake shown)
Totals:
Cycles: 15021501924 -> 15021372904 (-0.00%); split: -0.00%, +0.00%

Totals from 735 (0.11% of 656080) affected shaders:
Cycles: 676429502 -> 676300482 (-0.02%); split: -0.02%, +0.00%

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26439>
2023-12-08 20:21:28 +00:00
Sil Vilerino
23f07f4942 d3d12: Check video encode codec cap before checking encode profile/level cap
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26598>
2023-12-08 20:04:49 +00:00
Timur Kristóf
1c8c3e5a7a radv: Don't retile DCC on transfer queues.
Instead, the retile will be executed on another queue type
when the image is transitioned to another queue.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
2023-12-08 14:46:17 +00:00
Timur Kristóf
5c30d462b9 radv: Disable HTILE on exclusive images with transfer queues when SDMA doesn't support it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
2023-12-08 14:46:17 +00:00
Timur Kristóf
1764259ba8 radv: Disable DCC on exclusive images with transfer queue when SDMA doesn't support it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
2023-12-08 14:46:17 +00:00
Timur Kristóf
89a6b08cba radv: disable HTILE/DCC for concurrent images with transfer queue if unsupported.
DCC and HTILE are only supported by SDMA on GFX10+ (unless disabled by a workaround).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
2023-12-08 14:46:16 +00:00
Chia-I Wu
ad6b6673be radv: convert a check in radv_get_memory_fd to assert
VUID-VkBindImageMemoryInfo-memory-02628 and
VUID-VkBindImageMemoryInfo-memory-02629 make sure the memory offset is 0
for dedicated allocations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Chia-I Wu
8aa62ba240 radv: fix asserts for radv_init_metadata
radv_init_metadata hits several assert failures when the image is
multi-planar.  Make sure we use plane 0.

This change should make no difference in practice.  Also, this is done
only to follow radeonsi.  Since the opaque metadata is mainly for
validations and DCC, and we don't enable DCC for multi-planar images, we
probably don't need to call radv_query_opaque_metadata at all.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Chia-I Wu
035cf7ab97 radv: fix a typo in radv_image_view_make_descriptor
Only GFX8 and before have legacy_surf_level.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Chia-I Wu
07f575a8a6 radv: fix VkSubresourceLayout2KHR for multi-planar formats with modifiers
Memory planes and format planes are equivalent for multi-planar formats
with modifiers.  Do not return the DCC info of plane 0.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Chia-I Wu
8f60ccf969 radv: fix VkDrmFormatModifierProperties2EXT for multi-planar formats
Do not report DCC modifiers for multi-planar formats.  We don't support
DCC for them and drmFormatModifierPlaneCount had incorrect values.

Fix vkGetImageSubresourceLayout for multi-planar images with modifiers.
In that case, memory planes and format planes are equivalent.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Samuel Pitoiset
90dda31901 radv: simplify disabling MRT compaction for PS epilogs
If the fragment shader isn't compiled, the PS epilog key isn't used
at all with GPL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26563>
2023-12-08 13:52:40 +00:00
Samuel Pitoiset
0cf00390c5 ci: uprev vkd3d-proton to a0ccc383937903f4ca0997ce53e41ccce7f2f2ec
To cover DGC mesh shaders which are only tested as part of vkd3d-proton.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26590>
2023-12-08 11:14:22 +00:00
Yonggang Luo
5bf68ab701 osmesa: Make osmesa.h compatible with Windows SDK's GL.h
glext.h and glcorearb.h already use 'APIENTRY', so do the same for osmesa.h.

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26561>
2023-12-08 09:55:54 +00:00
Dave Airlie
10db6948da nvk/nak: fix regression with shf changes on sm70
The commit "nak: implement SHL and SHR on SM50" caused a regression in
KHR-GL45.gpu_shader_fp64.* using zink.

This fixes the regression, by setting the wrap fields.

Fixes: 00be041ffc ("nak: implement SHL and SHR on SM50")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26586>
2023-12-08 05:30:09 +00:00
Marek Olšák
64b769a102 glthread: add a string table of function names
for printing glthread batches

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
adfab9794e mesa: deduplicate glVertexPointer and glNormalPointer vs DSA error checking
Regular and direct state access functions did the same thing. The new
functions will be used later.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
3a74cdcd91 glthread: pass struct marshal_cmd_DrawElementsUserBuf into Draw directly
Pass the whole structure directly instead of as separate parameters.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
98e42c6efb glapi: only allow deprecated="" on non-aliased functions
Merging deprecated="" of aliased and real functions isn't completely
predictable. The function (real or aliased) that's defined last overwrites
attributes of its alias defined before it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
61e19c53e7 glthread: don't do "if (COMPAT)" if the function is not in the GL core profile
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
a3992379cb glapi: only expose GL_EXT_direct_state_access functions to GL compatibility
The extension is only exposed in GL compatibility.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
666d53214a glthread: rework type reduction and reduce vertex stride params to 16 bits
- add get_marshal_type(), which reduces type sizes
- rework all places to use the result of get_marshal_type()

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
162c890614 glthread: use autogenerated marshal structures for custom functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
e9d08bb043 glapi: rename primcount -> instance_count in a few Draw functions
In order to match the marshal structures we already have in the tree.
The next commit will depend on this.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
a02ed8a95f glthread: add option to put autogenerated marshal structures in the header file
This is used when we want to be able to read the calls of autogenerated
functions, or when we want to use the default structure for our custom
marshal functions.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
bdb771b27c glthread: eliminate push/pop calls in PushMatrix+Draw/MultMatrixf+PopMatrix
Viewperf benefits. This implements glPushMatrix marshalling manually and
looks ahead in the unmarshal function to see what the following calls are.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
Marek Olšák
c3b95d1507 glthread: add a marker at the end of batches indicating the end
Unmarshal calls that "look ahead" in the batch will use it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
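One way to picture how an end-of-batch marker supports look-ahead, as described in c3b95d1507 and used by the PushMatrix optimization: the executor can peek at following commands without a separate bounds check, because the marker never matches a real command. The sketch below is a simplified assumption, not glthread's actual batch layout; the enum, the fixed-size command array, and the exact pattern matched (push, one draw, pop) are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical command IDs; real glthread batches hold variable-size structs. */
enum sketch_cmd {
   SKETCH_PUSH_MATRIX,
   SKETCH_DRAW,
   SKETCH_POP_MATRIX,
   SKETCH_OTHER,
   SKETCH_END_MARKER, /* always the last entry of a batch */
};

/* Walk a batch and return how many commands would really be executed.
 * A push immediately followed by a draw and a pop is collapsed to just the
 * draw, since the draw leaves the matrix stack unchanged. */
size_t
sketch_execute_batch(const enum sketch_cmd *batch)
{
   size_t executed = 0;
   size_t i = 0;

   while (batch[i] != SKETCH_END_MARKER) {
      /* Short-circuit evaluation keeps the look-ahead in bounds: batch[i + 2]
       * is read only if batch[i + 1] is a draw, so the marker comes later. */
      if (batch[i] == SKETCH_PUSH_MATRIX &&
          batch[i + 1] == SKETCH_DRAW &&
          batch[i + 2] == SKETCH_POP_MATRIX) {
         executed += 1; /* only the draw */
         i += 3;
      } else {
         executed += 1;
         i += 1;
      }
   }
   return executed;
}
```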
Marek Olšák
5af047d40a mesa: optimize setting the identity matrix
instead of memcpy from a static mutable place ("const" doesn't help
anything here), just set the values directly

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
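The idea in 5af047d40a, storing the sixteen values directly rather than copying them from a static array, might look roughly like this. The function name is hypothetical and this is not the Mesa implementation; it assumes a flat array of 16 floats.

```c
/* Hypothetical sketch: write the 4x4 identity matrix element by element.
 * Direct stores let the compiler emit immediate writes instead of loading
 * from a static source array and copying it. */
void
sketch_set_identity(float m[16])
{
   m[0]  = 1.0f; m[1]  = 0.0f; m[2]  = 0.0f; m[3]  = 0.0f;
   m[4]  = 0.0f; m[5]  = 1.0f; m[6]  = 0.0f; m[7]  = 0.0f;
   m[8]  = 0.0f; m[9]  = 0.0f; m[10] = 1.0f; m[11] = 0.0f;
   m[12] = 0.0f; m[13] = 0.0f; m[14] = 0.0f; m[15] = 1.0f;
}
```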
Marek Olšák
5fb106c253 mesa: skip checking for identity matrix in glMultMatrixf with glthread
glMultMatrixf was doing it. glMatrixMultfEXT is the other user of
matrix_mult that needs to do it before we can skip it here.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
Marek Olšák
d321b1500b mesa: optimize _mesa_matrix_is_identity
+5% performance in VP13/Sw/teslaTower_shaded

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
Yiwei Zhang
d17ddcc847 venus: dispatch background shader tasks to secondary ring
Summary:
- Add a perf option to force primary ring submission
- Let device own secondary ring(s) for ad-hoc spawn
- For threads where swapchain and command pool are created, track with
  TLS to instruct ring dispatch.
- If the pipeline creation or cache retrieval happens on background
  threads that are not on the hot paths, force synchronous and dispatch to the
  secondary ring after waiting for the primary ring to become current.
- If the pipeline creation or cache retrieval happens on hot path
  threads, dispatch to the primary ring to avoid being blocked by those
  tasks on the secondary ring.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
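The dispatch rules summarized in d17ddcc847 reduce to a small decision, sketched below under stated assumptions. The enum, the function, and both boolean inputs are hypothetical; in particular, `on_hot_path_thread` stands in for the TLS tracking of threads that created a swapchain or command pool, and the wait for the primary ring to become current is not modeled.

```c
#include <stdbool.h>

/* Hypothetical ring identifiers; not the venus types. */
enum sketch_ring {
   SKETCH_RING_PRIMARY,
   SKETCH_RING_SECONDARY,
};

/* Pick a ring for a pipeline creation or cache retrieval task.
 * - force_primary: the perf option that forces primary ring submission.
 * - on_hot_path_thread: the calling thread created a swapchain or command
 *   pool, so it must not be blocked by background work on the secondary ring. */
enum sketch_ring
sketch_select_ring(bool force_primary, bool on_hot_path_thread)
{
   if (force_primary || on_hot_path_thread)
      return SKETCH_RING_PRIMARY;

   /* Background shader task: keep it off the primary ring. */
   return SKETCH_RING_SECONDARY;
}
```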
Yiwei Zhang
5b26bebcf4 venus: add vn_gettid helper
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00