Commit graph

147698 commits

Author SHA1 Message Date
Iago Toral Quiroga
a1e723cace broadcom/simulator: add a helper to get the amount of free heap memory
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18483>
2022-09-09 11:14:03 +00:00
Iago Toral Quiroga
861fff6339 v3dv: limit heap size to 4GB
GPU addresses are 32-bit, so we can't address more than 4GB.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18483>
2022-09-09 11:14:03 +00:00
Iago Toral Quiroga
b5b3a1634f v3dv: fix variable type
The heap size is a 64-bit value.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18483>
2022-09-09 11:14:03 +00:00
Iago Toral Quiroga
f27d3a08c6 v3dv: expose VK_EXT_attachment_feeback_loop_layout
We don't have any special requirements for this, so we can just expose
the extension.

The tests in CTS have an issue where they only check if a format is
supported for sampling but don't check if an image with that format
can be created for sampling. In our case, since we can't sample
1D depth/stencil images, this causes affected tests to crash in the
simulator (they pass on the device though). There is an issue with
a fix here:

https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/3923

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18489>
2022-09-09 12:31:02 +02:00
Iago Toral Quiroga
bcc37775f1 v3dv: implement VK_EXT_depth_clip_control
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18387>
2022-09-09 10:03:58 +00:00
Karmjit Mahil
c6a9897b76 pvr: Add depth_bias_array handling on dbenable.
On dbenable depth bias is enabled so we need to write the depth
bias data into the depth_bias_array (which gets uploaded to the
device) and also setup the depth bias index (used in the control
stream).

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18438>
2022-09-09 09:46:27 +00:00
Jason Ekstrand
441598eca6 radv: Switch to dynamic rendering only
Also, update list of expected failures.

dEQP-VK.image.sample_texture.*_bit_compressed_format_two_samplers_*
now reliably pass on Polaris10 (GFX8) and Pitcairn (GFX6).

Stoney has new failures but given there is already a lot of
depth/stencil resolve failures, we shouldn't worry about them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15587>
2022-09-09 09:24:59 +00:00
Jason Ekstrand
08e1af52ee radv: Leave image layouts alone when doing HW MSAA resolves
If the current layout supports DCC, we initialize it.  There's no reason
why we can't leave it in that layout and need to stomp it to
COLOR_ATTACHMENT_OPTIMAL.  If the layout supports DCC, it's effectively
identical to COLOR_ATTACHMENT_OPTIMAL anyway.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15587>
2022-09-09 09:24:59 +00:00
Jason Ekstrand
0461d59098 radv: Only copy the render area from VRS to HTILE
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15587>
2022-09-09 09:24:59 +00:00
Jason Ekstrand
c7d0d328d5 radv: Set the window scissor to the render area, not framebuffer
With dynamic rendering, the concept of framebuffer dimensions goes away
so this won't make sense.  Even with render passes, the render area is
guaranteed to be inside the framebuffer so we may as well clip to the
potentially smaller render area.  This commit also moves window scissor
setup to CmdBeginRenderPass2() time.  This should be fine, even for meta
ops, as the only meta ops which happen inside a render pass need the
same render area as the render pass itself.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15587>
2022-09-09 09:24:59 +00:00
Martin Roukala (né Peres)
bbb3749077 radv/ci: document an unstable test
The test seem to fail when run in conjunction with other tests. This
got revealed after I introduced parrallelism in the VKCTS execution on
VanGogh.

This was caught by pre-merge CI, but idiot me thought this was a
flake... and did not try re-running the job to verify...</BrownBag>

Reference: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7220
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18480>
2022-09-09 08:38:10 +00:00
Mark Collins
21aec585c1 tu: Retain allocated CSes in tu_autotune_on_submit
It was determined that a significant part of queue submission
overhead was from allocation/freeing of CSes constantly inside
`tu_autotune_on_submit`. This has been reduced by retaining
instances of `tu_submission_data` with their corresponding
CSes, this results in entirely eliminating that overhead as
resetting a CS is a very cheap operation compared to allocation
or even freeing it wholly.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18461>
2022-09-09 07:57:54 +00:00
Gert Wollny
762d377292 mesa/glsl: Add support for NV_shader_noperspective_interpolation
With EXT_gpu_shader4 the support is already in place, we just
have to allow it in glsl and expose the extension name.

v2: Check whether the extension is enabled in the shader (Adam Jackson)
v3: Don't check GLES version in lexer (mareko)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18460>
2022-09-09 07:22:20 +00:00
Samuel Pitoiset
c26e0e5070 radv: fix hw remapping of MRT holes with color attachments without export
If a color attachment is used in a render pass but not exported by the
FS, cb_shader_mask would be non-zero for this MRT. Though, to make sure
the hw remapping of SPI_SHADER_COL_FORMAT<->CB_SHADER_MASK works as
expected, we should also clear the unused color attachment in
CB_SHADER_MASK. Otherwise, the hw will remap to the wrong MRT.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7221
Fixes: 8fcb4aa0eb ("radv: compact MRTs to save PS export memory space")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18491>
2022-09-09 07:03:00 +00:00
Samuel Pitoiset
f8209ddc5b radv: import PS epilog from libraries if present
This enables using PS epilogs with GPL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
2022-09-09 06:26:27 +00:00
Samuel Pitoiset
dcff89994c radv: add support for emitting and prefetching PS epilogs
Long jumps seem to be slow and prefetching might help.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
2022-09-09 06:26:27 +00:00
Samuel Pitoiset
4ba84b4d64 radv: create a PS epilog from a library without the main FS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
2022-09-09 06:26:27 +00:00
Samuel Pitoiset
3f5d31ae69 radv: keep track of the code size for VS prologs and PS epilogs
This will be used to prefetch PS epilogs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
2022-09-09 06:26:27 +00:00
Samuel Pitoiset
8c4e33cdab radv: do not try to remove color exports for FS that need an epilog
The color format would be zero and all exports would be removed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
2022-09-09 06:26:27 +00:00
Samuel Pitoiset
b5e25e3a30 radv: add radv_remove_color_exports() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
2022-09-09 06:26:27 +00:00
Samuel Pitoiset
baf3924631 radv: do not lower color exports for FS that need an epilog
When building the main FS with GPL we don't know the color export
formats.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
2022-09-09 06:26:27 +00:00
Samuel Pitoiset
a14354cf21 radv: fix reporting RT shaders in RGP
RGP expects a compute bind point. This allows it to show ISA of RT
shaders and also enables instruction timing.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7213
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18496>
2022-09-09 05:51:23 +00:00
Samuel Pitoiset
2e04aeb1e5 radv: capture RT pipelines from the SQTT layer
They were just not recorded.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18496>
2022-09-09 05:51:23 +00:00
Samuel Pitoiset
8866e6582d radv: emit SQTT markers for RT related commands
This reports RT commands like vkCmdTraceRaysKHR and
vkCmdBuildAccelerationStructuresKHR in RGP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18496>
2022-09-09 05:51:23 +00:00
Adam Jackson
057c58b39b glx: Use XSaveContext, delete glxhash.c
libX11 has a perfectly good XID-based hash table we can be using, let's.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18474>
2022-09-08 23:43:42 +00:00
Yiwei Zhang
63703de443 venus: force synchronous submission for external signal semaphore
This is to ensure semaphore export under globalFencing represents the
correct submission.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18475>
2022-09-08 21:46:40 +00:00
Yiwei Zhang
0a3647fb88 venus: clean up vn_QueueSubmit
and ensure all submissions are synchronous upon NO_ASYNC_QUEUE_SUBMIT

No other intended change in behavior.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18475>
2022-09-08 21:46:40 +00:00
Timur Kristóf
e58a5cca02 nir/gather_info: Clear cross-invocation output mask.
Similar to how other I/O info is cleared at the beginning
of gather_info we should also clear the cross-invocation
mesh shader output mask.

Fixes: 112a856813
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18464>
2022-09-08 20:26:03 +00:00
Timur Kristóf
c80d811403 nir/lower_system_values: Add shortcut for 1D workgroups.
When the workgroup is 1 dimensional, simply use	a vec3
filled with zeroes and the local invocation index.
This is is better than lower_id_to_index + constant folding,
because this way we don't leave behind extra ALU instrs.

Note, this is relevant to mesh shaders on RDNA2 because
it enables us to better detect cross-invocation output
access.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18464>
2022-09-08 20:26:03 +00:00
Yiwei Zhang
9fdfe43240 zink: implement fence_get_fd required by EGL android platform
fence_get_fd is required for any kind of surface flush or native fence
sync export on Android. The typical scenarios are:
- eglDupNativeFenceFDANDROID
- eglSwapBuffers*
- eglMakeCurrent
- glFlush/glFinish for front buffer rendering

This change updates zink_flush to handle PIPE_FLUSH_FENCE_FD via a
forced submit to signal an external sync_fd semaphore. fence_get_fd is
implemented to export the sync file from that semaphore.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
2022-09-08 19:30:38 +00:00
Yiwei Zhang
6d1e214238 zink: fix in-fence lifecycle
For in-fence handling, dri2 has this below sequence in a row:
1. create_fence_fd: import external fence fd
2. fence_server_sync: import the pipe fence into the driver ctx
3. fence_reference: deref the created pipe fence

Before this change, zink pushed the wrapped external semaphore to the
wait semaphores of the next batch but the followed fence_reference will
destroy the imported semaphore immediately. Instead of extending the
lifecycle of the pipe fence throughout the batch state, we can simply
transfer the semaphore ownership to the batch and destroy it upon batch
reset.

Fixes: 32597e116d ("zink: implement GL semaphores")

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
2022-09-08 19:30:38 +00:00
Yiwei Zhang
c1b827d6a2 zink: fix zink_create_fence_fd to properly import
This change fixes below:
1. Dup the fence fd, otherwise, since external semaphore import takes
   the ownership of the fd, non-Vulkan part touches the fd leading to
   undefined behavior. This can be hit on implementations that defer
   the processing of the passed fd.
2. Use VK_SEMAPHORE_IMPORT_TEMPORARY_BIT for importing since that's
   required for SYNC_FD handle type because of its copy transference.
   Meanwhile, doing temporary import for opaque fd is fine in this path.

Fixes: 32597e116d ("zink: implement GL semaphores")

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
2022-09-08 19:30:38 +00:00
Yiwei Zhang
76397d18ae zink: fix core support on Android
- use legacy EGL driver interface
- use Android Vulkan loader
- avoid unrelated kopper source files

v2: update false #elif to #else

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
2022-09-08 19:30:38 +00:00
Yiwei Zhang
8fe667afbb loader: use os_get_option for driver override
Android requires this to enable zink.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
2022-09-08 19:30:38 +00:00
Chad Versace
d0cb99e96a venus: Enable VK_EXT_pipeline_creation_feedback
Implement natively by always returning invalid feedback. This is a legal
(but useless) implementation according to the spec.

In the future, I want to return the real feedback values from the host,
but that requires changes to the venus protocol.  The protocol does not
know that the VkPipelineCreationFeedback structs in the
VkGraphicsPipelineCreateInfo pNext are output parameters. Before
VK_EXT_pipeline_creation_feedback, the pNext chain was input-only.

Tested with `dEQP-VK.pipeline.*.creation_feedback.*`.

The tests in vulkan-cts-1.3.3.0 are buggy. I submitted a fix to dEQP
upstream; see below.

Results with the bug:
    Passed:         0/30 ( 0.0%)
    Failed:        12/30 (40.0%)
    Not supported: 18/30 (60.0%)
    Warnings:       0/30 ( 0.0%)

Results with bugfix:
    Passed:        12/30 (40.0%)
    Failed:         0/30 ( 0.0%)
    Not supported: 18/30 (60.0%)
    Warnings:       0/30 ( 0.0%)

See: https://gerrit.khronos.org/c/vk-gl-cts/+/10086
See: https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/909
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Signed-off-by: Chad Versace <chadversary@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18035>
2022-09-08 19:13:51 +00:00
Emma Anholt
e1f032acc3 zink: Don't lower indirect derefs of temp arrays.
nir_to_spirv can handle it.  Cuts instructions in a turnip CS shader on
Aztec Ruins from 36k to 3k.

Part of #6115

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18374>
2022-09-08 16:44:28 +00:00
Emma Anholt
09f6acc4b7 zink: Don't upload shader immediate arrays through UBO 0.
Trust the host vulkan driver to load it through whatever way it finds to
be most efficient.  Saves upload on the frontend when other uniforms
change.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18374>
2022-09-08 16:44:28 +00:00
Mike Blumenkrantz
a0f6fecc6a zink: flag all assigned output slots as mapped
this ensures types which consume more than 1 slot are effectively tagged
so that the next stage inputs are also assigned properly

fixes:
spec@arb_enhanced_layouts@execution@component-layout@vs-fs-array-dvec3

cc: mesa-stable

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18444>
2022-09-08 16:10:42 +00:00
James Park
c48c53c21f vulkan: Augment _WIN32 stub comparison
Make current check robust to incremental linking.

Compare JMP targets if the first byte is opcode 0xE9.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18400>
2022-09-08 15:37:11 +00:00
Kenneth Graunke
19fc870ac6 intel/compiler: Use subgroup invocation for ICP handle loads
When loading a TCS or GS input, we generate some code to read the URB
handle for a particular input control point (ICP handle), which often
involves indirect addressing due to a non-constant vertex.

For example:

   mov(8) vgrf148+0.0:UW, 76543210V
   shl(8) vgrf149:UD, vgrf148+0.0:UW, 2u
   shl(8) vgrf150:UD, vgrf145:UD, 5u
   add(8) vgrf151:UD, vgrf150:UD, vgrf149:UD
   mov_indirect(8) vgrf147:UD, g2:UD, vgrf151:UD, 96u

Unfortunately, the first load with 76543210V is considered a partial
write because the 8 channels of 16-bit UW data doesn't fill an entire
register, and we can't allocate VGRFs at sub-register granularity.

This causes none of the above math to be CSE'd, even though the first
two instructions are common to *all* input loads, and the rest may be
reused sometimes as well.

To work around this, we stop emitting 76543210V to a temporary, and
instead use nir_system_values[SYSTEM_VALUE_SUBGROUP_INVOCATION], which
already contains this value, and is unconditionally set up for us.
With all input loads using the same register for the sequence, our
CSE pass is able to eliminate the rest of the common math.

shader-db results on Tigerlake:

   total instructions in shared programs: 20748243 -> 20744844 (-0.02%)
   instructions in affected programs: 73410 -> 70011 (-4.63%)
   helped: 242 / HURT: 21
   helped stats (abs) min: 1 max: 37 x̄: 14.17 x̃: 15
   helped stats (rel) min: 0.17% max: 19.58% x̄: 6.13% x̃: 6.32%
   HURT stats (abs)   min: 1 max: 4 x̄: 1.38 x̃: 1
   HURT stats (rel)   min: 0.18% max: 1.31% x̄: 0.58% x̃: 0.58%
   95% mean confidence interval for instructions value: -13.73 -12.12
   95% mean confidence interval for instructions %-change: -6.00% -5.19%
   Instructions are helped.

   total cycles in shared programs: 785828951 -> 785788480 (<.01%)
   cycles in affected programs: 597593 -> 557122 (-6.77%)
   helped: 227 / HURT: 13
   helped stats (abs) min: 6 max: 624 x̄: 182.19 x̃: 185
   helped stats (rel) min: 0.24% max: 18.22% x̄: 7.85% x̃: 7.80%
   HURT stats (abs)   min: 2 max: 153 x̄: 68.08 x̃: 36
   HURT stats (rel)   min: 0.03% max: 7.79% x̄: 2.97% x̃: 1.25%
   95% mean confidence interval for cycles value: -182.55 -154.71
   95% mean confidence interval for cycles %-change: -7.84% -6.69%
   Cycles are helped.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18455>
2022-09-08 15:12:41 +00:00
Georg Lehmann
4d7fe94f3a nir/opt_algebraic: Optimize unpacking of upcasts to 64bit integers.
Foz-DB Navi21:
Totals from 7 (0.01% of 134913) affected shaders:
CodeSize: 213364 -> 213028 (-0.16%)
Instrs: 38347 -> 38319 (-0.07%)
Latency: 780148 -> 779776 (-0.05%)
InvThroughput: 520098 -> 519851 (-0.05%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18435>
2022-09-08 14:37:56 +00:00
Karmjit Mahil
b36e9b6187 pvr: Remove unimplemented push descriptor code.
Push descriptors are part of VK_KHR_push_descriptor.
Not supporting it for now.

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18430>
2022-09-08 11:56:43 +00:00
Karol Herbst
54709efd5e nv50: properly flush the TSC cache on 3D
The change didn't make any sense. `s` will always be
`NV50_SHADER_STAGE_COMPUTE`, because it's used to loop over all shader
stages. And the TSC cache on the compute side is already flushed in
`nv50_compute_validate_samplers`.

Fixes spurious `CACHE_ERROR` dmesg messages.

Fixes: ba6ba8c990 ("nv50: adapt texture and constbuf paths for compute shaders")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18382>
2022-09-08 11:46:51 +00:00
Karol Herbst
b23b94fbc9 nv50/ir: fix OP_UNION resolving when used for vector values
When an OP_UNION def takes part in a vector source e.g. for a tex
instruction we failed to clean up the OP_UNION instruction as rep() points
towards the coalesced value instead.

This fixes a regression on nv50 moving to NIR, but also potentially issues
with nvc0.

The main reason this is common in nv50 is, that we lower OP_SLCT to a set,
predicated movs and a union.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6406
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7117
Cc: mesa-stable
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18377>
2022-09-08 11:35:35 +00:00
Frank Binns
cbc477d057 pvr: finish render job sample count setup
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18437>
2022-09-08 11:25:02 +00:00
Thomas H.P. Andersen
1980827aeb util: avoid deprecated builtin has_trivial_destructor
From clang 16 has_trivial_destructor is deprecated.
Use the replacement __is_trivially_destructible if it
is available.

Fixes new warnings with clang 16 like:

../src/compiler/glsl/list.h:58:4: warning: builtin __has_trivial_destructor is deprecated; use __is_trivially_destructible instead [-Wdeprecated-builtins]
../src/util/ralloc.h:551:4: note: expanded from macro 'DECLARE_RZALLOC_CXX_OPERATORS'
   DECLARE_ALLOC_CXX_OPERATORS_TEMPLATE(type, rzalloc_size)
   ^
../src/util/ralloc.h:542:12: note: expanded from macro 'DECLARE_ALLOC_CXX_OPERATORS_TEMPLATE'
      if (!HAS_TRIVIAL_DESTRUCTOR(TYPE))                                 \
           ^
../src/util/macros.h:233:44: note: expanded from macro 'HAS_TRIVIAL_DESTRUCTOR'

Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18423>
2022-09-08 10:53:32 +00:00
Karmjit Mahil
61d265cfce pvr: Finish setting up job resolve info.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18431>
2022-09-08 10:44:57 +00:00
Karmjit Mahil
c6113def84 pvr: Set descriptor dirty flag based on other flags.
Set the flag if the descriptor set is updated.
Set the fragment descriptor dirty flag if blend consts are updated.

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18429>
2022-09-08 10:36:27 +00:00
Mark Collins
dd19da31f2 tu: Expose VK_EXT_tooling_info using common implementation
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18390>
2022-09-08 08:14:40 +00:00
Mark Collins
c82249aa68 tu: Clamp priority in DRM submitqueue creation
The kernel driver has a range of valid priority values that can
be supplied to it, submitting any priority value outside these
bounds will result in `-EINVAL`. To avoid this, the priority
value is now clamped to the range that the kernel supports.

Fixes: 0c6fbfca0c
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18389>
2022-09-08 08:04:10 +00:00