This patch exposes shader hashes (computes and draws) to Perfetto and
utrace. By including these hashes in traces, developers can correlate
compute and draw calls with their assoicated ASM dumps when analyzing
the traces.
To achieve this, intel_tracepoint.py has been reworked to preprocess
tracepoint arguments dynamically. Any argument containing "hash" in its
variable name is now forrmated as hexadecimal before being passed to the
tracepoint definition.
Signed-off-by: Michael <michael.cheng@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32708>
Whenever we execute a fast-clear due to LOAD_OP_CLEAR, we decrease the
number of layers to clear by one. We then enter the slow clear function
and possibly exit without clearing if the layer count is zero.
Unfortunately, we've already compiled the shader for slow clears by the
time we exit. Skip the slow clear function if there are no layers to
clear.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31167>
We're going to be storing clear colors from the drivers rather than
BLORP. Add a function for this purpose.
For now, the first use replaces init_fast_clear_color(). One change in
behavior is that the clear color initialization is now done without
write-checking on gfx12. This actually matches what anv does to other
writes to the image's fast-clear tracking state. We can fix this later
if and when we address the larger issue.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30824>
The new workaround tries to strike a balance between simplicity and
functionality (for testing purposes). Instead of checking for the
alignment of a specific LOD when fast-clearing, we take an
all-or-nothing approach for LOD1+.
I haven't found any app to clear LOD1+ except for a Dirt Rally trace
some time ago. If I remember correctly, that trace clears all LODs,
doesn't render to them, then clears again with a different color,
incurring resolves. So, skipping LOD1+ fast clears will avoid those
resolves. Other apps I tested include Synmark2, glmark2, GfxBench5, and
the Vulkan games in internal our benchmarking tool.
Now that we've added updated and simplified checks in the drivers
themselves, we delete blorp_can_hiz_clear_depth.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30250>
None of our tracked games use partial depth clears, so only allow it in
simple cases for testing purposes. This change also fixes an issue on
gfx8, where we had been accidentally disabling full surface clears if
the LOD was not 8x4 aligned.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30250>
Add up to four reasons per flush to perfetto flushes. PC reasons
will help debuggers understand why flushes were required, and
perhaps provide hints as to how they can be avoided.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27400>
Wa_1409600907 was enabled for gen12+. It should not be applied for
platforms after gen12.0. Use generated helpers to ensure application
to all relevant platforms.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21743>
Since the dirty range started out as 0..0, you would have 0..VBend as the
new dirty range on the first draw, and if your VB was >32b then you'd
flush every time you used it. Instead, if there's no existing dirty range
then just set it to our new VB's range.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21370>
Field must be disabled if any render targets have integer format.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20671>
For INTEL_MEASURE, ensure all prior instructions completed before
timestamp taken. Continue to support no CS flush case for Perfetto.
CS stall was dropped from pipecontrol when adding u_trace support.
Fixes: cc5843a573 ("anv: implement u_trace support")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20502>
Delete anv_CmdDispatch, anv_CmdSetDeviceMask, and
anv_GetDeviceGroupPeerMemoryFeatures so that the vk_common_*
versions will be used instead. This will avoid repeated code.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20218>
On cmd_buffer_emit_scissor(), if VkViewport height or width are set to
a value lower than 1.0, y_max or x_max can be attributed negative values,
causing an overflow. That leads to ScissorRectangleYMax or
ScissorRectangleXMax to be set to values on an unsupported range.
Clamping x_max and y_max in the valid range solves the problem.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7471
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20200>
Oh, for the days of Broadwell and earlier where compression was called
fast-clear. That was a simpler time. The birds sang in the trees, the
oceans weren't brown from oil spills, and Intel surface compression was
actually comprehendable by humans. To help the reviewer, keep the
following in mind:
1. CCS_E is SKL+
2. Implicit CCS is TGL+
3. The AUX TT (AKA aux map) is TGL+
4. HIZ+CCS, stencil CCS, and CCS for storage images are all TGL+
4. CCS_D surfaces only ever get full resolves and MCS surfaces only
ever get partial resolves
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19852>
Intel has different Z interpolation float point rounding
than other mesa gpus
For example gl_Position.z = 0.0 will be interpolated to
gl_FragCoord.z = 0.5 for all gpus
gl_FragCoord = -0.00000001 will be interpolated to
gl_FragCoord.z = 0.4999999702 for Intel
and rounded to gl_FragCoord.z = 0.5 for other gpus
Games with LEQUAL depth func will fail depth test on Intel
and will pass it on other gpus in such case
This workaround lowers translated depth range
and several gl_FragCoord.z coords with extra small difference
will be translated to the same UINT16\UINT24\UINT32
value of an integer depth buffer
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7199
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18412>
This new driver is a copy of the current Anv code, it will only load
on gfx7/8 platforms though.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18208>