Commit graph

6156 commits

Author SHA1 Message Date
Hyunjun Ko
e510efed05 anv: support in-loop super resolution for AV1 decoding
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>
2025-01-10 21:45:04 +00:00
Hyunjun Ko
788263501d anv: calculate global parmeters correctly for AV1 decoding
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>
2025-01-10 21:45:04 +00:00
Dave Airlie
8432b8b282 anv: add initial support for AV1 decoding
Co-authored-by: Hyunjun Ko <zzoon@igalia.com>
- Allow intrabc
- Fix to manage refrenece frames using referenceNameSlotIndices
- Fix to set bitmask of motion field projection correctly
- Set destination buffer offset to the BSD_OBJECT
- Support 10-bit decoding.
- Fix small bugs.
- Change to C-style comment.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>
2025-01-10 21:45:04 +00:00
Hyunjun Ko
0fd0a51df6 anv/video: Fix to return supported video format correctly.
Since 8-bit decoding is not default, we need to check the flag too.

Fixes: a64ae20d0 ("anv: support HEVC 10-bit decoding" )

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>
2025-01-10 21:45:04 +00:00
Dave Airlie
6a28e7a6c7 anv: add default av1 tables from media-driver
Co-authored-by: Hyunjun Ko <zzoon@igalia.com>
- Change to C-style comment.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32775>
2025-01-10 21:45:04 +00:00
Michael Cheng
c3c05ffb5f intel : Expose Shader hashes for utrace and Perfetto
This patch exposes shader hashes (computes and draws) to Perfetto and
utrace. By including these hashes in traces, developers can correlate
compute and draw calls with their assoicated ASM dumps when analyzing
the traces.

To achieve this, intel_tracepoint.py has been reworked to preprocess
tracepoint arguments dynamically. Any argument containing "hash" in its
variable name is now forrmated as hexadecimal before being passed to the
tracepoint definition.

Signed-off-by: Michael <michael.cheng@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32708>
2025-01-10 17:38:16 +00:00
Sagar Ghuge
710624fcc0 anv: Use 3DSTATE_URB_ALLOC_* instructions
Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device
config.

In case only one slice is available in the device, SliceN fields will be
ignored by HW.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>
2025-01-09 21:26:40 +00:00
Lionel Landwerlin
08e82b28e8 anv: use the correct MOCS for depth destinations
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31778>
2025-01-09 17:47:27 +00:00
José Roberto de Souza
1d1d5653ac anv: Check VkResult main batch buffer before start companion batch buffer
It could run the companion batch buffer even if the main batch buffer
failed, that was possible to happen in i915 and Xe KMD.

In case the main context/queue is banned and companion is not it could
still return that submission was properly start what was not.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32850>
2025-01-09 13:47:28 +00:00
José Roberto de Souza
4c6194cae0 anv: Check VkResult of perf query batch buffer
On i915 it could be executing the main batch buffer in
i915_queue_exec_locked() even if the perf query batch buffer failed.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32850>
2025-01-09 13:47:28 +00:00
Sagar Ghuge
33d9a685a5 anv: Add pipelined coarse pixel state
3DSTATE_CPS_POINTERS is deprecated on PTL, so let's switch to
3DSTATE_COARSE_PIXEL to deliver CPS state as pipelined state.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32737>
2025-01-07 23:53:44 +00:00
Chia-I Wu
83dec767da anv: use common calibrated timestamp support partially
Use the common GetPhysicalDeviceCalibrateableTimeDomainsKHR.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32689>
2025-01-07 03:39:29 +00:00
José Roberto de Souza
7ac9ac0f93 anv: Allow larger SLM sizes for task and mesh shader
It was hard-coded to 64k but Xe2 platforms and newer supports
larger SLM sizes.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Cc: mesa-stable
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32874>
2025-01-06 18:31:20 +00:00
Tapani Pälli
72351afe24 anv: handle mesh in sbe_primitive_id_override
This prevents crashes seen in some upcoming cts tests.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32861>
2025-01-06 08:41:18 +00:00
Hyunjun Ko
5ecea6ec4a anv: handle negative value of slot index for h265 decoding.
Fixes: 8d519eb5 ("anv: add initial video decode support for h265")
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32823>
2025-01-06 01:02:14 +00:00
Hyunjun Ko
168298b891 anv: Enable remapping picture ID
Fix to handle 16 refs.

v1. handle the case where a slot index is negative.
(Lionel Landwerlin <lionel.g.landwerlin@intel.com>)

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32823>
2025-01-06 01:02:14 +00:00
Hyunjun Ko
9221feaf79 anv: define ANV_VIDEO_H264_MAX_DPB_SLOTS
prep work for remapping slot ids for h264 decoding.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32823>
2025-01-06 01:02:13 +00:00
Lionel Landwerlin
98cdb9349a anv: ensure null-rt bit in compiler isn't used when there is ds attachment
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15987f49bb ("anv: avoid setting up a null RT unless needed")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12396
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32867>
2025-01-03 23:12:22 +00:00
Lionel Landwerlin
1448778385 anv: rework tbimr push constant workaround
We'll want to know about the empty push constant for device generated
commands. It's easier if the information is stored in
anv_pipeline_bind_map::push_ranges[].

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32828>
2025-01-03 11:48:42 +00:00
Lionel Landwerlin
6281b207db anv: add tracepoints timestamp mode for empty dispatches
When the runtime is going to potentially emit no dispatch, we need to
have a way to capture a timestamp. Add a new flag for this to tell
whether we don't have a HW instruction to capture the timestamp and
rely on MI_STORE_REGISTER_MEM instead.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: de00fe3f66 ("anv: add BVH building tracking through u_trace")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12382
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32835>
2025-01-03 10:36:49 +00:00
Lionel Landwerlin
6fb2d3b163 anv: limit the memcpy data for push constants
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32824>
2025-01-02 16:48:04 +00:00
Sagar Ghuge
76e85df2d2 anv: Switch to ANISOTROPIC_FAST filter mode
Same thing as ANISOTROPIC including all restrictions except HW is
allowed to take liberties with precision to speed things up, Currently
only has an affect on formats of type *_sRGB.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32738>
2024-12-31 21:49:41 +00:00
Lionel Landwerlin
5e4aeb3ad7 anv: fix index buffer size changes
With vkCmdBindIndexBuffer2KHR only the provided size can change which
currently fails to reprogram the index buffer properly.

Signed-off-by: Lionel Landwerlin <llandwerlin@gmail.com>
Fixes: 5c2aca456e ("anv: implement vkCmdBindIndexBuffer2KHR")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12376
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32785>
2024-12-27 13:20:49 +00:00
Rohan Garg
308c2b9828 anv: refactor choose_isl_tiling_flags to pass fewer arguments
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32771>
2024-12-23 19:33:36 +00:00
Erik Faye-Lund
e17abeca44 anv: use vk_descriptor_type_is_dynamic
No need to open-code this one now that we have a generic helper.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32627>
2024-12-19 15:12:58 +00:00
José Roberto de Souza
2bd3df75e5 anv: Emit STATE_SYSTEM_MEM_FENCE_ADDRESS
According to HAS it is necessary to emit this instruction once per
context so MI_MEM_FENCE works properly.

Fixes: 86813c60a4 ("mi-builder: add read/write memory fencing support on Gfx20+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32680>
2024-12-18 17:16:05 +00:00
José Roberto de Souza
b8f93bfd38 anv: Always create anv_async_submit in init_copy_video_queue_state()
A next patch will emit more instructions in video and copy queues
for Gfx 200 and newer but the current code only creates anv_async_submit
if device has aux_map.
Instead we can always create anv_async_submit and only submit it to
hardware if any instruction was emited.

Fixes: 86813c60a4 ("mi-builder: add read/write memory fencing support on Gfx20+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32680>
2024-12-18 17:16:05 +00:00
Kevin Chuang
1b55f10105 anv/bvh: Dump BVH synchronously upon command buffer completion
Modified the BVH dumping mechanism to synchronously wait for the command
buffer to complete before saving BVH data to files. This approach is
more robust compared to the previous method of dumping during
acceleration strucutre destruction.

Note: if DEBUG_BVH_ANY is enabled but intel-rt is disabled, we will wait
for nothing.

Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32585>
2024-12-16 23:01:11 +00:00
Felix DeGrood
0f46c53b0c anv: Use vfg distribution mode = RR_STRICT for Xe2+
Performance tuning. Round Robin strict faster on Xe2 for some
workloads.

Speedup:
 - Borderlands3-dx11-trace: +4%
 - WolfensteinYoungblood-vk.g6: +1.5%
 - Cyberpunk2077-dx12vk-2160p-ultra: +0.5%

Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32566>
2024-12-13 19:15:48 +00:00
Sagar Ghuge
d3f9139e49 intel: Use Morton compute walk order
According to HSD 14016252163 if compute shader uses the sample
operation, morton walk order and set the thread group batch size to 4 is
expected to increase sampler cache hit rates by increasing sample
address locality within a subslice.

Rework:
 * Caio: "||" => "&&" for type checking in instr_uses_sampler()
 * Jordan: Use nir's foreach macros rather than
   nir_shader_lower_instructions()

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32430>
2024-12-12 19:56:47 -08:00
Lionel Landwerlin
2bb98a8f99 anv: document UBO descriptor range alignments
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32347>
2024-12-12 07:35:18 +00:00
Benjamin Lee
becb014d27 nir: treat per-view outputs as arrayed IO
This is needed for implementing multiview in panvk, where the address
calculation for multiview outputs is not well-represented by lowering to
nir_intrinsic_store_output with a single offset.

The case where a variable is both per-view and per-{vertex,primitive} is
now unsupported. This would come up with drivers implementing
NV_mesh_shader or using nir_lower_multiview on geometry, tessellation,
or mesh shaders. No drivers currently do either of these. There was some
code that attempted to handle the nested per-view case by unwrapping
per-view/arrayed types twice, but it's unclear to what extent this
actually worked.

ANV and Turnip both rely on per-view outputs being assigned a unique
driver location for each view, so I've added on option to configure that
behavior rather than removing it.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>
2024-12-09 20:31:49 +00:00
Benjamin Lee
975c3ecd1e nir: handle arbitrary per-view outputs in nir_lower_multiview
This is needed for panvk, where multiview is "all or nothing". When
multiview is enabled, all outputs may be written with separate values
for each view.

The edge case mentioned in the previous `nir_can_lower_multiview` is now
handled because we now handle an arbitrary number of per-view output
vars instead of expecting to find exactly one.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>
2024-12-09 20:31:49 +00:00
Mi, Yanfeng
06d3eb8e01 anv:increase instruction heap to 3Gb
Black Myth Wukong is generating more than 2Gb of shaders in
pre-compiling stage after VK_EXT_shader_image_atomic_int64 extension
enabled. Driver will crash in create shader stages due to dereference
null pointer of kernel map.

Signed-off-by: Mi, Yanfeng <yanfeng.mi@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32548>
2024-12-09 19:14:38 +00:00
Mi, Yanfeng
0a5a04f509 anv:Fix memory grow calculation overflow issue
when old buffer size is large than 2G, 32bit cannot hold
2 times buffer size (>4G).

Fixes: 8d813a90d6 ("anv: fail pool allocation when over the maximal size")

Signed-off-by: Mi, Yanfeng <yanfeng.mi@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32551>
2024-12-09 18:49:17 +00:00
Lionel Landwerlin
de00fe3f66 anv: add BVH building tracking through u_trace
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32483>
2024-12-09 14:45:00 +00:00
Alyssa Rosenzweig
972f8aa287 vulkan: rename depth bias graphics states
"constant" is a special keyword in OpenCL C, and we'd like to #define it
suitably in host C23 to facilitate compatiblity between host/device headers.
That means we can't have any identifiers named "global" or "constant".
Fortunately, this is the only 'constant' in any file I'm hitting.

To avoid the clash, don't abbreviate "constant factor", use "constant_factor"
instead. For consistency, "slope factor" then becomes "slope_factor".
The new names are longer but match the Vulkan API exactly.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [Intel]
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> [NVK and panvk]
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [V3DV]
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> [IMG]
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32505>
2024-12-06 13:48:26 -05:00
Nanley Chery
483c40a21d anv: Allow compressed memtypes with default buffer types
Source 2 games segfault if certain buffers are not able to use the same
memory types as images. CS2 specifically expects this to be the case for
vertex and index buffers (VK_BUFFER_USAGE_2_INDEX_BUFFER_BIT,
VK_BUFFER_USAGE_2_VERTEX_BUFFER_BIT). I have not tested other Source 2
games to see how much the requirement differs for the usage (if at all).

Up until now, we've disabled CCS for the Source 2 engine with the
anv_disable_xe2_ccs driconf option. However, this option is not great
for performance. So, replace this with a new option to allow the same
memory types we use for images on buffers - anv_enable_buffer_comp.

Compression of buffers is generally not good for performance. I
collected the result of unconditionally enabling the feature in the
performance CI on BMG. I used the default configuration to average the
result of two runs of each trace.

The CI reports that 4 game traces would regress between 0.44-1.01% FPS
with buffer compression. However, the CI actually shows it to be
beneficial in three of our game traces:

* Cyberpunk-trace-dx12-1080p-high 106.51%
* Hitman3-trace-dx12-1080p-med    101.59%
* Blackops3-trace-dx11-1080p-high 100.44%

So, enable the option for the two games we already have driconf entries
for, Cyberpunk and Hitman3.

Of course, also enable the option for Source 2 games. Casey Bowman
reports that on BMG, some frame times drop from ~15ms to ~7ms in CS2.
This is in large part due to the removal of HiZ resolves, which is a
consequence of the game now using of HIZ_CCS_WT instead of plain HIZ.

Ref: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11520
Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32519>
2024-12-06 17:21:06 +00:00
Lionel Landwerlin
371b7a9b0d anv: set pipeline flags correct for imported libs
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3d49cdb71e ("anv: implement VK_EXT_graphics_pipeline_library")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32507>
2024-12-05 19:53:34 +00:00
Lionel Landwerlin
6e396b400a anv: fix missing bindings valid dynamic state change check
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9ddd296cd3 ("anv: implement VK_EXT_vertex_input_dynamic_state")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32507>
2024-12-05 19:53:34 +00:00
Lionel Landwerlin
80c0d2718c anv: report formats supported by the common bvh framework
Enables DXR 1.1 with vkd3d-proton

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge<sagar.ghuge@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32487>
2024-12-05 15:54:10 +00:00
Michael Cheng
ed620bcd41 anv : Add tracepoint for as_build
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Kevin Chuang
5098c0c5df anv: Add INTEL_DEBUG for bvh dump and visualization tools
This commit allows you to dump different regions of memory related to
bvh building. An additional script to decode the memory dump is also
added, and you're able to view the built bvh in 3D view in html. See the
included README.md for usage.

Rework:
- you can now view the actual child_coord in internalNode in html
- change exponent to be int8_t in the interpreter
- fix the actual coordinates using an updated formula
- now you can have 3D view of the bvh
- blockIncr could be 2 and vk_aabb should be first
- Now, if any bvh dump is enabled, we will zero out tlas, to prevent gpu
  hang caused by incorrect tlas traversal
- rootNodeOffset is back to the beginning
- Add INTEL_DEBUG=bvh_no_build.
- Fix type of dump_size
- add assertion for a 4B alignment
- when clearing out bvh, only clear out everything after
  (header+bvh_offset)
- TODO: instead of dumping on destory, track in the command buffer

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
5561db68c3 anv: Add helper to copy data from src to dest anv_address
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
41baeb3810 anv: Implement acceleration structure API
Rework: (Kevin)
- Properly setup bvh_layout
   Our bvh resides in contiguous memory and can be divided into two sections:
      1. anv_accel_struct_header, tightly followed by
      2. actual bvh, which starts with root node, followed by interleaving
         leaves or internal nodes.
- Update comments for some fields for BVH and nodes.
- Properly populate the UUIDs in serialization header
- separate header func into completely two paths based on compaction bit
- Encode rt_uuid at second VK_UUID_SIZE.
- Write query result at correct slot
- add assertion for a 4B alignment
- move bvh_layout to anv_bvh
- Use meson option to decide which files to compile
- The alignment of serialization size is not needed
- Change static_assert to STATIC_ASSERT and move them inside functions

Rework (Sagar)
- Use anv_cmd_buffer_update_buffer instead of MI to copy data

Rework (Lionel)
- Remove flush after builds, and add flush in copy before dispatch
- Handle the flushes in CmdWriteAccelerationStructuresPropertiesKHR properly

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
9002e52037 anv: Implement cmd_dispatch_unaligned callback
Rework: (Kevin)
- Calculate correct number of threads in GPGPU thread group based on
  SIMD size.
- Instead of round up, just use the simple division and let the
  remainder part handle groupCount < local_size_x.
- Drop indirect_unroll_off and fix the bug that we're not using is_unaligned_size_x

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
0cab02ca9b anv: Implement flush_buffer_write_cp callbck
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
b2cffdb1ed anv: Implement write_buffer_cp callback
Rework: (Kevin)
 - Fix pointer arithmatic calculation.
 - Add assertion for a 4B alignment

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
8817ff26fc anv: Move update buffer code in helper
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
0edf208ab9 anv: Implement cmd_fill_buffer_addr callback
Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00