Given a situation like this:
- CB_A: begin, renderDepthA, end
- CB_B: begin, computeA, barrier (depth), computeB, end
The depth cache is not flushed between renderDepthA & computeB
because:
- it's not flushed at the end of CB_A (it's not required)
- when CB_B starts, we're still in GFX pipeline mode but do not
flush render caches because the pipeline mode is unknown
- when the barrier in CB_B is executed, we're already in compute
pipeline mode and HW cannot flush depth.
The fix is to flush RT/depth caches when switching from unknown
pipeline mode to any pipeline mode.
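
A minimal sketch of the idea, not the actual anv code: every name below
(`cmd_state`, `select_pipeline`, the `FLUSH_*` bits) is hypothetical, and
the GFX-to-compute branch is an assumption about the pre-existing
behaviour. The point is only that a command buffer starting in the
unknown mode queues the render cache flushes on its first mode switch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; these are not the anv definitions. */
enum pipeline_mode { MODE_UNKNOWN, MODE_GFX, MODE_COMPUTE };

#define FLUSH_RT_CACHE    (1u << 0)
#define FLUSH_DEPTH_CACHE (1u << 1)

struct cmd_state {
   enum pipeline_mode mode;
   uint32_t pending_flushes;
};

/* At the start of a command buffer nothing is known about what the
 * previous command buffer left behind.
 */
static void
cmd_begin(struct cmd_state *s)
{
   assert(s != NULL);
   s->mode = MODE_UNKNOWN;
   s->pending_flushes = 0;
}

static void
select_pipeline(struct cmd_state *s, enum pipeline_mode new_mode)
{
   assert(s != NULL);
   if (s->mode == new_mode)
      return;

   /* Leaving the unknown mode: render caches may still hold data from a
    * previous command buffer, so flush them while it is still possible.
    * Leaving GFX for compute is assumed to flush as well.
    */
   if (s->mode == MODE_UNKNOWN ||
       (s->mode == MODE_GFX && new_mode == MODE_COMPUTE))
      s->pending_flushes |= FLUSH_RT_CACHE | FLUSH_DEPTH_CACHE;

   s->mode = new_mode;
}
```

With this shape, CB_B's first switch to compute already carries the depth
flush, so the later barrier does not need the HW to do it in compute mode.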
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e6dae6ef5f ("vulkan: Optimize implicit end_subpass barrier")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14816
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: David Gow <david@davidgow.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39824>
This bit is set in mocs for other protected attachment types by
anv_image_fill_surface_state(), but was omitted for depth/stencil
attachments here.
Without the protected bit set, attaching a protected depth attachment
image to a framebuffer causes heavy black artifacting.
Fixes: 794b0496e9 ("anv: enable protected memory")
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39818>
The destination for CmdResolve can be a 3D image, and while some
restrictions on the base layer and count exist, the Z offset into which
the resolve will happen has no such restriction.
Fixes some new tests: dEQP-VK.pipeline.*.multisample.m10_resolve.resolve_cmd.*.full_3d.*
Fixes: 0e7761b35cd ("anv, hasvk: allow using a 3D image as a resolve target")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39793>
`operands_match` was modifying instruction source operands in-place
(through the `elk_fs_reg *src` pointer member) and relying on a
save/restore pattern to undo the modifications. Work on local copies
instead, which is simpler and avoids mutating shared state in a
comparison function.
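
As a rough illustration of the local-copy approach (a simplified sketch:
`struct reg`, `struct inst` and `mul_operands_match` are hypothetical and
much smaller than the real compiler types), the comparison normalizes
copies of the sources and never writes to the instructions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified register and instruction types. */
struct reg {
   unsigned nr;
   bool negate;
};

struct inst {
   struct reg src[2];
};

static bool
reg_equal(struct reg a, struct reg b)
{
   return a.nr == b.nr && a.negate == b.negate;
}

/* Two MULs compute the same value if their operands match (in either
 * order) once negations are stripped and the overall sign agrees.
 */
static bool
mul_operands_match(const struct inst *a, const struct inst *b)
{
   assert(a != NULL && b != NULL);

   /* Local copies: the instructions themselves are never modified. */
   struct reg xs[2] = { a->src[0], a->src[1] };
   struct reg ys[2] = { b->src[0], b->src[1] };

   bool xs_negative = xs[0].negate != xs[1].negate;
   bool ys_negative = ys[0].negate != ys[1].negate;

   xs[0].negate = xs[1].negate = false;
   ys[0].negate = ys[1].negate = false;

   if (xs_negative != ys_negative)
      return false;

   return (reg_equal(xs[0], ys[0]) && reg_equal(xs[1], ys[1])) ||
          (reg_equal(xs[0], ys[1]) && reg_equal(xs[1], ys[0]));
}
```

Taking `const` pointers makes the "no mutation" property something the
compiler checks, which a save/restore pattern cannot offer.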
Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39814>
The MUL case in `operands_match` was reading and writing the `.f` union
member unconditionally, even when the register's `.file != IMM`. In that
case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the
`fabsf()` call could corrupt the `.nr` by clearing bit 31.
Guard all `.f` accesses with `.file == IMM` checks.
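
The hazard can be reproduced with a small stand-in type. This is a sketch
under assumptions: `struct reg`, `FILE_GRF`/`FILE_IMM` and
`strip_imm_sign` are hypothetical names, and the real register struct has
many more fields. Only the aliasing of a float with register bits and the
`file == IMM` guard are taken from the description above.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the register file and register struct. */
enum reg_file { FILE_GRF, FILE_IMM };

struct reg {
   enum reg_file file;
   union {
      float f;                        /* meaningful only for FILE_IMM */
      struct { uint32_t nr; } bits;   /* meaningful for other files */
   } u;
};

/* Strip the sign of an immediate and report whether it was negative.
 * Non-immediates are left untouched: calling fabsf() on their storage
 * would clear bit 31 of whatever aliases the float.
 */
static bool
strip_imm_sign(struct reg *r)
{
   assert(r != NULL);
   if (r->file != FILE_IMM)
      return false;

   bool negative = r->u.f < 0.0f;
   r->u.f = fabsf(r->u.f);
   return negative;
}
```

Without the early return, a register whose aliased bits have bit 31 set
would silently change identity after the comparison.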
Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39814>
`operands_match` was modifying instruction source operands in-place
(through the `brw_reg *src` pointer member) and relying on a
save/restore pattern to undo the modifications. Work on local copies
instead, which is simpler and avoids mutating shared state in a
comparison function.
Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39814>
The MUL case in `operands_match` was reading and writing the `.f` union
member unconditionally, even when the register's `.file != IMM`. In that
case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the
`fabsf()` call could corrupt the `.nr` by clearing bit 31.
Guard all `.f` accesses with `.file == IMM` checks.
Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39814>
The entire array is always initialized to zero and never modified.
Cuts the size of brw_wm_prog_data by 32%.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39791>
Instead of asserting, let's simply not enumerate any configuration if
cooperative matrix is disabled. This can happen for example when
neither systolic nor software lowering is being used.
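
A sketch of the "enumerate nothing" behaviour using the usual two-call
count/fill pattern. The function, the config struct and the table values
are all made up for illustration and do not reflect the configurations
the driver actually exposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical configuration record; the values below are invented. */
struct coop_matrix_config {
   uint32_t m, n, k;
};

static const struct coop_matrix_config supported_configs[] = {
   { 8, 8, 16 },
   { 8, 16, 16 },
};

/* Two-call enumeration: with out == NULL only the count is returned.
 * When the feature is disabled the list is simply empty, no assert.
 */
static void
enumerate_coop_matrix_configs(bool coop_matrix_enabled,
                              uint32_t *count,
                              struct coop_matrix_config *out)
{
   assert(count != NULL);

   uint32_t available = coop_matrix_enabled ?
      (uint32_t)(sizeof(supported_configs) / sizeof(supported_configs[0])) : 0;

   if (out == NULL) {
      *count = available;
      return;
   }

   uint32_t n = *count < available ? *count : available;
   for (uint32_t i = 0; i < n; i++)
      out[i] = supported_configs[i];
   *count = n;
}
```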
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39728>
This was not a problem before the compression hint was given to the
kernel, but now that we set it, we hit problems when allocating a bo if
the modifier does not support compression.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14625
Fixes: f91de58818 ("anv: Add support to DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39710>
This is the shader key for the fragment shader. Nobody even knows
what the windowizer/masker unit is or does anymore. Even on Gen4-6,
"fs" is still clearer. This makes the codebase easier to read.
This is only about 15 years overdue.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748>
This is the program data for the fragment shader. Nobody even knows
what the windowizer/masker unit is or does anymore. Even on Gen4-6,
"fs" is still clearer. This makes the codebase easier to read.
This is only about 15 years overdue.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748>
This started out as dynamic configuration for MSAA related state, but
has since expanded to cover many dynamic fragment shader options.
We rename it to intel_fs_config, similar to intel_tess_config, to
better indicate its purpose.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748>
Assign a new QPitch when fast-clearing the unaligned top rows on a
redescribed surface. Fixes the following piglit test on gfx12.5:
$ test_folder=generated_tests/spec/EXT_shader_framebuffer_fetch/execution/gles3/
$ ./bin/shader_runner_gles3 $test_folder/single-slice-2darray.shader_test -auto -fbo
Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 3e331e4fe9 ("intel/blorp: Optimize non-zero-layer fast-clears")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39722>
Add virtio-gpu native context support to the ANV driver.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29870>
Add virtio-intel native DRM context base preparatory code. Virtio-intel
works by passing ioctls from the guest to the host for execution, utilizing
the available VirtIO-GPU infrastructure.
This patch adds initial experimental native context support using the i915
KMD UAPI.
Compile Mesa with -Dintel-virtio-experimental=true to enable virtio-intel
native context support.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29870>
Check whether the userptr UAPI is present and disable userptr features if not.
The kernel i915 driver has a config option that disables the userptr ioctl.
The ioctl may also be absent in the case of a virtio native context driver.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29870>
Fixes a race condition where a BVH will be dumped before its command buffer is
actually submitted if a different command buffer completes between the time the
BVH dump is recorded and the time the command buffer is actually submitted.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Fixes: 1b55f101 ("anv/bvh: Dump BVH synchronously upon command buffer completion")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39599>
Not tested enough on platforms beyond Gen12.
It turns out not to work on DG2, for example.
Cc: mesa-stable
Closes: #14449
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39676>
For VS/TES/GS, we lower all outputs to temporaries and emit copies at the
end of the shader (or for GS, at each EmitVertex() call) from those
temporaries back to real outputs. We use vec8 URB writes without
writemasking, since our output area's contents are undefined anyhow.
This is simpler than what TCS and Mesh do, which allow for output
variables to be read/written at a per-component level at any time,
with the output memory being used for cross-thread communication.
Rather than using the complicated TCS/Mesh handling and relying on
vectorization, we port the emit_urb_writes() approach to NIR. This
also takes care of emitting the VUE header with default values when
fields aren't explicitly written by the shader.
We also handle multiview in the process. It simplifies things, and
also drops another case of non-semantic IO in brw.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39666>
The TES workaround code is still going to be needed even after
we rework URB output handling for VS/TES/GS to use NIR intrinsics.
For VS, we know at least one URB write will have been emitted at
the end of the program, so we can just tag it.
GS already handles EOT via emit_gs_thread_end().
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39666>
This lets us look up things in varying_to_slot[] without having to
special case VIEWPORT, LAYER, and PRIMITIVE_SHADING_RATE. All of them
map to the same slot as PSIZ, slot 0, the VUE header.
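
A small sketch of what the uniform lookup could look like. The enum, the
struct and the non-header slot numbers are hypothetical; only the fact
that VIEWPORT, LAYER and PRIMITIVE_SHADING_RATE share slot 0 with PSIZ
comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical varying identifiers, not the real gl_varying_slot enum. */
enum varying {
   VAR_POS,
   VAR_PSIZ,
   VAR_LAYER,
   VAR_VIEWPORT,
   VAR_PRIMITIVE_SHADING_RATE,
   VAR_GENERIC0,
   VAR_COUNT,
};

struct vue_map {
   int8_t varying_to_slot[VAR_COUNT];   /* -1 means "not present" */
};

static void
vue_map_init(struct vue_map *map)
{
   assert(map != NULL);

   for (int i = 0; i < VAR_COUNT; i++)
      map->varying_to_slot[i] = -1;

   /* All header fields live in slot 0, the VUE header. */
   map->varying_to_slot[VAR_PSIZ] = 0;
   map->varying_to_slot[VAR_LAYER] = 0;
   map->varying_to_slot[VAR_VIEWPORT] = 0;
   map->varying_to_slot[VAR_PRIMITIVE_SHADING_RATE] = 0;

   /* Illustrative placement for the remaining varyings. */
   map->varying_to_slot[VAR_POS] = 1;
   map->varying_to_slot[VAR_GENERIC0] = 2;
}
```

Callers can then index `varying_to_slot[]` directly for any varying
instead of checking for the header fields first.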
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39666>
We'll need the VUE map when we convert to using URB intrinsics.
Prepare for that by reordering VUE map setup before IO lowering.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39666>
For now this runs on anv and freedreno a618 -- other devices have manual
skips for it currently, or run under a compositor, or don't have a
connector with a mode that the tests are willing to use. Hopefully we can
extend coverage to other devices soon.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39568>
The implicit_unmap tests complete in ~18s each on my A740, so I think they
should be fine to remove from all devices' skips files -- the problem was
hitting swap in parallel.
This reshuffles some test groups, making new xfails show up. The changes
are particularly notable in virgl, where virglrenderer gets wedged at some
point and arbitrary sets of tests after that fail.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39568>
This brings what ANV reports closer to what Iris reports, and mostly drops
redundancies.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39633>