mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 04:48:07 +02:00

Konstantin Seurer 077292f65b radv/bvh: Use box16 nodes when bvh8 is not used

Using box16 nodes trades bvh quality for memory bandwidth which seems to be roughly equal in performance.

Stats assuming box16 nodes are as expensive as box32 nodes:

Totals from 7668 (79.68% of 9624) affected BVHs:
compacted_size: 951666944 -> 742347648 (-22.00%)
max_depth: 57606 -> 57615 (+0.02%)
sah: 129114796242 -> 129998517775 (+0.68%); split: -0.00%, +0.68%
scene_sah: 188564162 -> 192063633 (+1.86%); split: -0.02%, +1.88%
box16_node_count: 0 -> 3270600 (+inf%)
box32_node_count: 3365707 -> 95100 (-97.17%)

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>

2026-01-10 11:36:28 +01:00
.clang-format	clang-format: Disable formatting by default	2023-08-13 16:48:49 +02:00
build_helpers.h	radv/bvh: Use box16 nodes when bvh8 is not used	2026-01-10 11:36:28 +01:00
build_interface.h	radv/bvh: Use box16 nodes when bvh8 is not used	2026-01-10 11:36:28 +01:00
bvh.h	radv/bvh: Add radv_aabb16 and use it for box16 nodes	2026-01-10 11:36:19 +01:00
copy.comp	vulkan/bvh: Enable glsl extensions in meson	2025-09-16 20:18:01 +00:00
copy_blas_addrs_gfx12.comp	vulkan/bvh: Enable glsl extensions in meson	2025-09-16 20:18:01 +00:00
encode.comp	radv/bvh: Use box16 nodes when bvh8 is not used	2026-01-10 11:36:28 +01:00
encode.h	radv: re-format using clang-format	2025-09-09 05:48:56 +00:00
encode_gfx12.comp	radv/bvh: Pair compress triangles in more cases	2025-10-21 19:32:55 +00:00
encode_triangles_gfx12.comp	radv/bvh: Avoid a slow case when compressing triangles	2025-12-11 16:26:01 +00:00
header.comp	vulkan/bvh: Enable glsl extensions in meson	2025-09-16 20:18:01 +00:00
invocation_cluster.h	radv/bvh: Add radv_first_active_invocation	2025-10-21 19:32:53 +00:00
leaf.comp	vulkan/bvh: Enable glsl extensions in meson	2025-09-16 20:18:01 +00:00
meson.build	radv/bvh: Use box16 nodes when bvh8 is not used	2026-01-10 11:36:28 +01:00
README.md	radv/bvh: Document GFX12 BVH encoding	2025-04-17 20:20:40 +00:00
update.comp	radv: Optimize BVH4 acceleration structure updates	2026-01-05 15:24:54 +00:00
update.h	radv/rt: Keep updated nodes always active	2025-11-25 15:25:21 +00:00
update_gfx12.comp	vulkan/bvh: Enable glsl extensions in meson	2025-09-16 20:18:01 +00:00

README.md

GFX12

GFX12 introduces a new BVH encoding for the image_bvh_dual_intersect_ray and image_bvh8_intersect_ray instructions.

BVH8 box node

bitsize/range	name	description
32	`internal_child_offset`	Offset of child BVH8 box nodes in units of 8 bytes.
32	`primitive_child_offset`	Offset of child primitive nodes in units of 8 bytes.
32	`unused`	Used by amdvlk for storing the parent node ID.
32	`origin_x`	x-offset applied to all child AABBs.
32	`origin_y`	y-offset applied to all child AABBs.
32	`origin_z`	z-offset applied to all child AABBs.
8	`exponent_x`
8	`exponent_y`
8	`exponent_z`
4	`unused`
4	`child_count_minus_one`
32	`obb_matrix_index`	Selects a matrix for transforming the ray before performing intersection tests. `0x7F` to disable OBB.
96x8	`children[8]`

children[8] element layout:

bitsize/range	name	description
12	`min_x`	Fixed point child AABB coordinate.
12	`min_y`
4	`cull_flags`
4	`unused`
12	`min_z`
12	`max_x`
8	`cull_mask`
12	`max_y`
12	`max_z`
4	`node_type`
4	`node_size`	Increment for the child offset in units of 128 bytes.
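
As a rough illustration of the element layout, the sketch below packs one child into three dwords. It assumes that fields are stored in the listed order starting at the least significant bit of each dword, and that the third dword holds `max_y`, a 12-bit `max_z`, `node_type` and `node_size` so that the element fills 96 bits. Both points are assumptions, and `struct bvh8_child` and `bvh8_pack_child` are hypothetical names that do not come from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type; field names follow the element layout table. */
struct bvh8_child {
   uint32_t min_x, min_y, min_z; /* 12-bit fixed point */
   uint32_t max_x, max_y, max_z; /* 12-bit fixed point */
   uint32_t cull_flags;          /* 4 bits */
   uint32_t cull_mask;           /* 8 bits */
   uint32_t node_type;           /* 4 bits */
   uint32_t node_size;           /* 4 bits, units of 128 bytes */
};

/* Packs one child into 3 dwords, first listed field in the lowest bits
 * (assumed bit order). */
void bvh8_pack_child(const struct bvh8_child *c, uint32_t out[3])
{
   assert(c->min_x < 4096 && c->min_y < 4096 && c->min_z < 4096);
   assert(c->max_x < 4096 && c->max_y < 4096 && c->max_z < 4096);
   assert(c->cull_flags < 16 && c->cull_mask < 256);
   assert(c->node_type < 16 && c->node_size < 16);

   /* The top 4 bits of the first dword are the unused field. */
   out[0] = c->min_x | (c->min_y << 12) | (c->cull_flags << 24);
   out[1] = c->min_z | (c->max_x << 12) | (c->cull_mask << 24);
   out[2] = c->max_y | (c->max_z << 12) | (c->node_type << 24) |
            (c->node_size << 28);
}
```

Packing with explicit shifts rather than C bitfields keeps the result independent of the compiler's bitfield ordering.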

The coordinates of child AABBs are encoded as follows:

min: floor((x - origin_x) / extent)
max: ceil((x - origin_x) / extent) - 1
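
A minimal sketch of these two formulas follows. It treats `extent` as the size of one fixed-point step along the axis (passed as `step`), which is an assumption since the text does not say how it is derived from the exponent fields. Clamping to the 12-bit range is also an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* floor() and ceil() for values that fit in int32_t, written out by hand
 * so that the example does not depend on libm. */
static int32_t floor_i32(float v)
{
   int32_t i = (int32_t)v; /* truncates toward zero */
   return ((float)i > v) ? i - 1 : i;
}

static int32_t ceil_i32(float v)
{
   int32_t i = (int32_t)v; /* truncates toward zero */
   return ((float)i < v) ? i + 1 : i;
}

/* Clamp to the range of a 12-bit field (assumed behavior). */
static uint32_t clamp_u12(int32_t v)
{
   return v < 0 ? 0u : v > 4095 ? 4095u : (uint32_t)v;
}

/* min: floor((x - origin_x) / extent) */
uint32_t quantize_child_min(float x, float origin, float step)
{
   assert(step > 0.0f);
   return clamp_u12(floor_i32((x - origin) / step));
}

/* max: ceil((x - origin_x) / extent) - 1 */
uint32_t quantize_child_max(float x, float origin, float step)
{
   assert(step > 0.0f);
   return clamp_u12(ceil_i32((x - origin) / step) - 1);
}
```

The `- 1` on the maximum suggests that a stored value `q` stands for the upper edge of cell `q`, presumably `origin + (q + 1) * extent`, so both bounds round outward and the quantized box still contains the original one.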

image_bvh8_intersect_ray will return the node IDs of the child nodes.

Primitive node

High-level layout:

bitsize/range	name	description
52	`header`	Misc information about this node.
	`vertex_prefixes[3]`
	`data`	Compressed vertex positions followed by primitive/geometry index data.
29x`triangle_pair_count`	`pair_desc[triangle_pair_count]`	Misc information about a triangle pair.

header layout:

bitsize/range	name	description
5	`x_vertex_bits_minus_one`
5	`y_vertex_bits_minus_one`
5	`z_vertex_bits_minus_one`
5	`trailing_zero_bits`
4	`geometry_index_base_bits_div_2`
4	`geometry_index_bits_div_2`
3	`triangle_pair_count_minus_one`
1	`vertex_type`
5	`primitive_index_base_bits`
5	`primitive_index_bits`
10	`indices_midpoint`	Bit offset where the geometry and primitive indices start (geometry indices in negative direction, primitive indices in positive direction)
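
The `_minus_one` and `_div_2` suffixes imply a small amount of arithmetic when reading the header. The sketch below unpacks the 52 bits, assuming the fields are stored in the listed order starting at the least significant bit (an assumption) and reading the suffixes literally. `struct prim_header` and `decode_prim_header` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded form of the 52-bit header. */
struct prim_header {
   uint32_t x_vertex_bits, y_vertex_bits, z_vertex_bits;
   uint32_t trailing_zero_bits;
   uint32_t geometry_index_base_bits, geometry_index_bits;
   uint32_t triangle_pair_count;
   uint32_t vertex_type;
   uint32_t primitive_index_base_bits, primitive_index_bits;
   uint32_t indices_midpoint;
};

/* Returns the low `count` bits and shifts them out of `*bits`. */
static uint32_t take_bits(uint64_t *bits, unsigned count)
{
   uint32_t value = (uint32_t)(*bits & ((1ull << count) - 1));
   *bits >>= count;
   return value;
}

struct prim_header decode_prim_header(uint64_t raw)
{
   struct prim_header h;

   assert((raw >> 52) == 0); /* only 52 bits belong to the header */
   h.x_vertex_bits = take_bits(&raw, 5) + 1;
   h.y_vertex_bits = take_bits(&raw, 5) + 1;
   h.z_vertex_bits = take_bits(&raw, 5) + 1;
   h.trailing_zero_bits = take_bits(&raw, 5);
   h.geometry_index_base_bits = take_bits(&raw, 4) * 2;
   h.geometry_index_bits = take_bits(&raw, 4) * 2;
   h.triangle_pair_count = take_bits(&raw, 3) + 1;
   h.vertex_type = take_bits(&raw, 1);
   h.primitive_index_base_bits = take_bits(&raw, 5);
   h.primitive_index_bits = take_bits(&raw, 5);
   h.indices_midpoint = take_bits(&raw, 10);
   return h;
}
```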

The data field is split into three sections:

1. Vertex data: a list of floats that share the same prefix and the same number of trailing zero bits. The decompressed value (for example, the x component of a vertex) is (prefix << (32 - prefix_bits_x)) | (read(x_vertex_bits) << trailing_zero_bits), where prefix_bits_x = 32 - x_vertex_bits - trailing_zero_bits.
2. Geometry indices.
3. Primitive indices.
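
The vertex decompression can be sketched as below. `payload` stands for the value produced by `read(x_vertex_bits)` and `prefix` for the matching `vertex_prefixes` entry, taken here to hold the shared bits in its low `prefix_bits` bits, as the formula implies. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rebuilds the 32-bit pattern of one vertex component. */
uint32_t decompress_component_bits(uint32_t prefix, uint32_t payload,
                                   unsigned vertex_bits,
                                   unsigned trailing_zero_bits)
{
   assert(vertex_bits >= 1 && vertex_bits + trailing_zero_bits <= 32);
   unsigned prefix_bits = 32 - vertex_bits - trailing_zero_bits;

   /* trailing_zero_bits is at most 31 here, so the shift is well defined. */
   uint32_t value = payload << trailing_zero_bits;

   /* Skip the prefix when it is empty: shifting by 32 is undefined in C. */
   if (prefix_bits > 0)
      value |= prefix << (32 - prefix_bits);
   return value;
}

/* Reinterprets the rebuilt bit pattern as a float. */
float decompress_component(uint32_t prefix, uint32_t payload,
                           unsigned vertex_bits, unsigned trailing_zero_bits)
{
   uint32_t bits = decompress_component_bits(prefix, payload, vertex_bits,
                                             trailing_zero_bits);
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}
```

For example, with 8 vertex bits and 16 trailing zero bits the prefix is 8 bits wide, so a prefix of `0x3F` and a payload of `0x80` rebuild `0x3F800000`, the bit pattern of `1.0f`.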

Geometry and primitive indices are encoded the same way; the only difference is that geometry indices are read/written in the negative direction starting from indices_midpoint. Each indices section starts with a *_index_base_bits-bit value *_index_base, which is the index of the first triangle. Subsequent triangles use indices calculated from a *_index_bits-bit value:

*_index = read(*_index_bits) if *_index_bits >= *_index_base_bits
*_index = read(*_index_bits) | (*_index_base & ~BITFIELD_MASK(*_index_bits)) otherwise.
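
The two cases can be sketched as a single helper. `raw` is the value returned by `read(*_index_bits)`, `bitfield_mask` is a local stand-in for the `BITFIELD_MASK` used in the formula (low `bits` bits set), and `decode_index` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set; handles bits == 32 without an
 * undefined shift. */
static uint32_t bitfield_mask(unsigned bits)
{
   return bits >= 32 ? 0xFFFFFFFFu : (1u << bits) - 1u;
}

/* Reconstructs the index of a triangle after the first one. */
uint32_t decode_index(uint32_t raw, unsigned index_bits,
                      uint32_t index_base, unsigned index_base_bits)
{
   assert(index_bits <= 32 && index_base_bits <= 32);
   assert((raw & ~bitfield_mask(index_bits)) == 0);

   /* Wide enough to hold a full index: use the value as is. */
   if (index_bits >= index_base_bits)
      return raw;

   /* Otherwise the stored bits replace the low bits of the base index. */
   return raw | (index_base & ~bitfield_mask(index_bits));
}
```

With a 13-bit base of `0x1234` and 4-bit per-triangle values, a stored `0x7` decodes to `0x1237`: the upper bits come from the base and the low four bits from the stored value.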

pair_desc(s) layout:

bitsize/range	name	description
1	`prim_range_stop`
1	`tri1_double_sided`
1	`tri1_opaque`
4	`tri1_v0_index`	Indices into `data`, `0xF` for procedural nodes.
4	`tri1_v1_index`	`0xF` for procedural nodes.
4	`tri1_v2_index`
`tri0` has identical fields:
1	`tri0_double_sided`
1	`tri0_opaque`
4	`tri0_v0_index`
4	`tri0_v1_index`
4	`tri0_v2_index`
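
A sketch of unpacking one 29-bit `pair_desc` follows, assuming the fields are stored in the listed order starting at the least significant bit. That bit order is an assumption, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of one triangle inside a pair. */
struct tri_desc {
   bool double_sided;
   bool opaque;
   uint8_t v[3]; /* 4-bit vertex indices, 0xF marks procedural nodes */
};

struct pair_desc {
   bool prim_range_stop;
   struct tri_desc tri1;
   struct tri_desc tri0;
};

/* Unpacks the 14 bits that describe one triangle. */
static struct tri_desc decode_tri_desc(uint32_t bits)
{
   struct tri_desc t;
   t.double_sided = bits & 1;
   t.opaque = (bits >> 1) & 1;
   t.v[0] = (bits >> 2) & 0xF;
   t.v[1] = (bits >> 6) & 0xF;
   t.v[2] = (bits >> 10) & 0xF;
   return t;
}

struct pair_desc decode_pair_desc(uint32_t raw)
{
   struct pair_desc d;

   assert((raw >> 29) == 0); /* a descriptor is 29 bits wide */
   d.prim_range_stop = raw & 1;
   d.tri1 = decode_tri_desc((raw >> 1) & 0x3FFF);  /* bits 1..14 */
   d.tri0 = decode_tri_desc((raw >> 15) & 0x3FFF); /* bits 15..28 */
   return d;
}
```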

image_bvh8_intersect_ray will return the following data for triangle nodes:

VGPR index	value
0	t0
1	`(procedural0 << 31) \| u0`
2	`(opaque0 << 31) \| v0`
3	`(primitive_index0 << 1) \| backface0`
4	t1
5	`(procedural1 << 31) \| u1`
6	`(opaque1 << 31) \| v1`
7	`(primitive_index1 << 1) \| backface1`
8	`(geometry_index0 << 2) \| navigation_bits`
9	`(geometry_index1 << 2) \| navigation_bits`

image_bvh8_intersect_ray will return the following data for procedural nodes:

VGPR index	value
3	`primitive_index0 << 1`
8	`(geometry_index0 << 2) \| navigation_bits`
9	`(geometry_index1 << 2) \| navigation_bits`

navigation_bits is 0 if there are more triangle pairs to process, 1 if this was the last triangle pair and 3 if prim_range_stop is set.
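
The returned registers for triangle nodes can be unpacked as in the sketch below. It keeps `t`, `u` and `v` as raw bit patterns, because their numeric format is not stated here. The enum and struct names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values of navigation_bits as described above (hypothetical names). */
enum navigation {
   NAV_MORE_PAIRS = 0, /* more triangle pairs to process */
   NAV_LAST_PAIR = 1,  /* this was the last triangle pair */
   NAV_RANGE_STOP = 3, /* prim_range_stop is set */
};

struct tri_hit {
   uint32_t t_bits;
   uint32_t u_bits, v_bits; /* bit 31 removed */
   bool procedural, opaque, backface;
   uint32_t primitive_index;
   uint32_t geometry_index;
   uint32_t navigation_bits;
};

/* Decodes triangle `which` (0 or 1) from the ten returned VGPRs. */
struct tri_hit decode_tri_hit(const uint32_t vgpr[10], unsigned which)
{
   struct tri_hit h;

   assert(which < 2);
   const uint32_t *r = vgpr + which * 4; /* VGPRs 0..3 or 4..7 */

   h.t_bits = r[0];
   h.procedural = r[1] >> 31;
   h.u_bits = r[1] & 0x7FFFFFFFu;
   h.opaque = r[2] >> 31;
   h.v_bits = r[2] & 0x7FFFFFFFu;
   h.primitive_index = r[3] >> 1;
   h.backface = r[3] & 1;
   h.geometry_index = vgpr[8 + which] >> 2;
   h.navigation_bits = vgpr[8 + which] & 3;
   return h;
}
```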

Instance node

bitsize/range	name	description
32x3x4	`world_to_object`
62	`bvh_addr`	Units of 4 bytes.
1	`aabbs`	Does the BLAS (only) contain AABBs? Used for pointer flag based culling.
1	`unused`
32	`unused`
24	`user_data`	Returned by the intersect instruction for instance nodes.
8	`cull_mask`
The instance node can have up to 4 quantized child nodes:
32	`origin_x`	x-offset applied to all child AABBs.
32	`origin_y`	y-offset applied to all child AABBs.
32	`origin_z`	z-offset applied to all child AABBs.
8	`exponent_x`
8	`exponent_y`
8	`exponent_z`
4	`unused`
4	`child_count_minus_one`
96x4	`children[4]`

image_bvh8_intersect_ray will return:

VGPR index	value
2	BLAS addr lo
3	BLAS addr hi
6	`user_data`
7	`(child_ids[0] & 0xFF) \| ((child_ids[1] & 0xFF) << 8) \| ((child_ids[2] & 0xFF) << 16) \| ((child_ids[3] & 0xFF) << 24)`
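
A short sketch of reading this result follows. The BLAS address is kept exactly as returned, with no unit conversion, since its units are not stated for the return value. `struct instance_hit` and `decode_instance_hit` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

struct instance_hit {
   uint64_t blas_addr;   /* VGPR 3 (high half) and VGPR 2 (low half) */
   uint32_t user_data;   /* VGPR 6 */
   uint8_t child_ids[4]; /* low 8 bits of each child ID, from VGPR 7 */
};

/* Decodes the registers returned for an instance node. */
struct instance_hit decode_instance_hit(const uint32_t vgpr[8])
{
   struct instance_hit h;

   h.blas_addr = ((uint64_t)vgpr[3] << 32) | vgpr[2];
   h.user_data = vgpr[6];
   for (unsigned i = 0; i < 4; i++)
      h.child_ids[i] = (vgpr[7] >> (8 * i)) & 0xFF;
   return h;
}
```

Only the low 8 bits of each child ID survive the packing in VGPR 7, so anything above that has to be recovered by other means.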
