The WA is meant to be here to apply some state that is not propagated
properly inside the HW. But if you have a loop like :
for ( ... ) {
emit(3DPRIMITIVE, some param);
}
You're not really changing any state, just push more draws into the
pipeline.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: f2645229c2 ("anv: implement Wa_14016118574")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21660>
We don't want the WA to kick-in if it's not point/line topology.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: f2645229c2 ("anv: implement Wa_14016118574")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21660>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21390>
As mentioned in the previous patch, if intel-xe-kmd is disabled
it will fail to detected in run time but it will still compile all
Xe files.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21368>
With the addition of workarounds, the output from this tool is more
verbose than some users will want. Provide optional parameters for
enabling hwconfig and workaround details.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21639>
This probably doesn't affect Vulkan or GL because they can't have
anything bigger than a vec4 anyway unless it's a u64vec4 and those have
to be at least 8B aligned. This may affect CL apps if they use
__attribute__((packed)) on something with big vectors, depending on how
LLVM decides to translate that.
Fixes: f8aa83f0c8 ("intel/nir: Use nir_lower_mem_access_bit_sizes()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>
We need to have the scratch buffer added to the pipeline BO tracking
list, so it's added to the batch buffer and finally to the execbuffer
list. Otherwise we pagefault (or read the default scratch page on
i915).
Fixes
dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcast_u16vec4
on CI (and probably other tests).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2028f1caa3 ("anv: emit 3DSTATE_HS in cmd_buffer_flush_gfx_state")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21653>
Using texelFetch to read samples from an 8xMSAA fast cleared image on
Haswell can read transparent black pixels around triangles from where
there should be none. This issue isn't present when using sample
shading, resolving the image using vkCmdResolveImage or in a copy the
image. The easiest way to fix this is by just disabling non-zero fast
clears for 8xMSAA images.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7587
Cc: mesa-stable
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21444>
Workarounds for defects in Intel silicon have been manually
implemented:
- consult defect database for the current platform
- add workaround code behind platform ifdef or devinfo->ver checks
Some bugs have occurred due to the manual process. Typical failure
modes:
- defect database is updated after a platform is enabled
- version checks are overly broad (eg gfx11+) for defects that were
fixed (eg in gfx12)
- version checks are too narrow for defects that were extended to
subsequent platforms.
- missed workarounds
This commit automates workaround handling:
- Internal automation queries the defect database to collate and
summarize defect documentation in json.
- mesa_defs.json describes all public defects and impacted platforms.
Defects which are extended to subsequent platforms are listed under
the original defect.
- gen_wa_helpers.py generates workaround helpers to be called
in place of version checks:
- NEEDS_WORKAROUND_{ID} provides a compile time check suitable for
use in genX routines.
- intel_device_info_needs_wa() provides a more precise runtime
check, differentiating platforms within a generation and
platform steppings.
Internal automation will generate new mesa_defs.json as needed.
Workarounds enabled with these helpers will apply correctly based on
updated information in Intel's defect database.
Reviewed-by: Dylan Baker <dylan@pnwbakers>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20825>
This ensures that users of libintel_dev.a won't be compiled until
include files are generated, and that they are recompiled when the
header changes.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20825>
The two points calling this macro pass dyn->rs.provoking_vertex to it,
which is why it works, but it's cleaner to use the parameter instead.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21613>
Coverity complains that we could end up rolling over on a 32bit
platform, which isn't really true because of the assertion, but there's
also no harm in ensuring that we have exactly the same behavior for both
32 bit and 64 bit platforms.
CID: 1515989
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21572>
I wonder if the docs are correct for Gfx11 because this is the
generation that gave us the Bindless Sampler Heap of 4Gb. So it would
make sense that the border colors can also be placed anywhere in that
4Gb heap.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21600>
It's only used by fragment shaders, so halving it matches the size
used in the most optimal primitive pipeline (VS + FS).
This change frees some URB space for mesh and task shaders and as
a result improves vk_meshlet_cadscene performance by up to 2%,
depending on the model.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21559>
The spec citation says it's just for when the RT write message BTI might
point to a different RT, and if we don't have any color attachments then
we won't have one of those at all.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21508>
Power-of-two SW stack sizes are prone to causing collisions in the
hashing function used by the L3 to map memory addresses to banks,
which can cause stack accesses from most DSSes to bottleneck on a
single L3 bank. Fix it by padding the SW stack stride by a single
cacheline if it was a power of two. This has been reported by Felix
DeGrood to improve Quake2 RTX performance by ~30% on DG2-512 in
combination with other RT patches Lionel Landwerlin has been working
on.
Many thanks to Felix DeGrood for doing much of the legwork and
providing several iterations of Q2RTX performance counter dumps which
eventually prompted me to consider the hash collision theory and
motivated this patch, and for providing additional performance counter
dumps confirming that there is no longer an appreciable imbalance in
traffic across L3 banks after this change.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21461>
This allows us to also inherit `-mfpmath=sse` added in previous commit.
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21371>
Now that all color blend bits are dynamic, emit_cb_state() is doing
almost nothing and half of that is wrong.
In the case that color write enable is dynamic, at the time the pipeline
state is emitted, it sees all the color attachments as having write
disabled and stores the WriteDisabled bit for each channel.
When all dynamic state is flushed, we have the right values already but
the values recorded into the command buffer get ORed with the ones
stored in the pipeline, and so WriteDisabled tag along when they
shouldn't.
Since all disabled color attachments are handled already when dynamic
state is flushed, there's no point in doing so at pipeline creation
time too. And since the only other thing done by emit_cb_state() is
writing three hardcoded values, they might as well be taken care of in
the same place as everything else.
Fixes CTS from the future:
dEQP-VK.pipeline.*.extended_dynamic_state.*.color_blend_equation_*dynamic*
dEQP-VK.pipeline.*.extended_dynamic_state.*.color_blend_all_*
Fixes: fc3fd7c69e (anv: dynamic color write mask)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21509>
Not necessary but, all things being equal, be consistent with Iris.
Now that intel_debug_write_identifiers() already add the padding,
there's no need to include extra "+ 8" to the offset.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21479>
We need to emit 3DSTATE_HS for each primitive with tessellation.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21308>