mesa/src
Marcin Ślusarz c1685f08dd intel/compiler,anv: put some vertex and primitive data in headers
Both per-primitive and per-vertex space are allocated in MUE in 8-dword
chunks and those 8-dword chunks (granularity of
3DSTATE_SBE_MESH.Per[Primitive|Vertex]URBEntryOutputReadLength)
are passed to fragment shaders as inputs (either non-interpolated
for per-primitive and flat vertex attributes or interpolated
for non-flat vertex attributes).

Some attributes have a special meaning and must be placed in a separate
8/16-dword slot called the Primitive Header or Vertex Header.

Primitive Header contains 4 such attributes (Cull Primitive,
ViewportIndex, RTAIndex, CPS), leaving 4 dwords (the rest of 8-dword
slot) potentially unused.

Vertex Header is similar - it starts with 3 unused dwords, 1 dword for
Point Size (but if we declare that shader doesn't produce Point Size
then we can reuse it), followed by 4 dwords for Position and optionally
8 dwords for clip distances.
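
The Vertex Header layout described above can be sketched as simple dword
arithmetic. This is only an illustration: the enum and function names are
hypothetical and are not taken from the Mesa sources; only the dword counts
come from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Dword counts taken from the text above; all names are hypothetical. */
enum {
   VERTEX_HEADER_UNUSED_DW     = 3, /* leading unused dwords */
   VERTEX_HEADER_POINT_SIZE_DW = 1, /* Point Size */
   VERTEX_HEADER_POSITION_DW   = 4, /* Position */
   VERTEX_HEADER_CLIP_DIST_DW  = 8, /* optional clip distances */
};

/* Total Vertex Header size in dwords: 8 without clip distances, 16 with. */
static unsigned
vertex_header_size_dw(bool has_clip_distances)
{
   unsigned size = VERTEX_HEADER_UNUSED_DW +
                   VERTEX_HEADER_POINT_SIZE_DW +
                   VERTEX_HEADER_POSITION_DW;
   if (has_clip_distances)
      size += VERTEX_HEADER_CLIP_DIST_DW;
   return size;
}

/* Dwords at the start of the header that could hold user attributes:
 * the 3 unused dwords, plus the Point Size dword when the shader is
 * declared not to produce Point Size.
 */
static unsigned
vertex_header_free_dw(bool writes_point_size)
{
   return VERTEX_HEADER_UNUSED_DW +
          (writes_point_size ? 0 : VERTEX_HEADER_POINT_SIZE_DW);
}
```

With these assumptions, a shader that does not write Point Size leaves
4 dwords of the Vertex Header available for packing, and one that does
leaves 3.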

This means we have an interesting optimization problem - we can put
some user attributes into holes in Primitive and Vertex Headers, which
may lead to smaller MUE size and potentially more mesh threads running
in parallel, but we have to be careful to use those holes only when
we need them, otherwise we could force HW to pass too much data to
the fragment shader.

Example 1:
Let's assume that Primitive Header is enabled and user defined
12 dwords of per-primitive attributes.

Without packing we would consume 8 + ALIGN(12, 8) = 24 dwords of
MUE space and pass ALIGN(12, 8) = 16 dwords to fragment shader.

With packing, we'll consume 4 + 4 + ALIGN(12 - 4, 8) = 16 dwords of
MUE space and pass ALIGN(4, 8) + ALIGN(12 - 4, 8) = 16 dwords to
fragment shader.

16/16 is better than 24/16, so packing makes sense.

Example 2:
Now let's assume that Primitive Header is enabled and user defined
16 dwords of per-primitive attributes.

Without packing we would consume 8 + ALIGN(16, 8) = 24 dwords of
MUE space and pass ALIGN(16, 8) = 16 dwords to fragment shader.

With packing, we'll consume 4 + 4 + ALIGN(16 - 4, 8) = 24 dwords of
MUE space and pass ALIGN(4, 8) + ALIGN(16 - 4, 8) = 24 dwords to
fragment shader.

24/24 is worse than 24/16, so packing doesn't make sense.
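
The arithmetic of both examples can be reproduced with a small model.
This is a sketch under stated assumptions, not the code from this change:
all names are hypothetical, the handling of fewer than 4 user dwords is
an extrapolation, and the decision rule (pack only if MUE space shrinks
and fragment shader input does not grow) is a simplification of the
comparison made in the examples.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants matching the Primitive Header described above. */
#define PRIM_HEADER_DW      8u /* whole Primitive Header slot */
#define PRIM_HEADER_HOLE_DW 4u /* dwords left after the 4 special attributes */
#define CHUNK_DW            8u /* MUE allocation / SBE read granularity */

struct mue_cost {
   unsigned mue_dw; /* per-primitive MUE space consumed */
   unsigned fs_dw;  /* per-primitive dwords passed to the fragment shader */
};

/* Round v up to a multiple of a; a must be a power of two. */
static unsigned
align_up(unsigned v, unsigned a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* User attributes live entirely after the header. */
static struct mue_cost
cost_unpacked(unsigned user_dw)
{
   struct mue_cost c;
   c.mue_dw = PRIM_HEADER_DW + align_up(user_dw, CHUNK_DW);
   c.fs_dw = align_up(user_dw, CHUNK_DW);
   return c;
}

/* Up to 4 user dwords go into the header hole, so the whole header
 * chunk has to be passed to the fragment shader as well.
 */
static struct mue_cost
cost_packed(unsigned user_dw)
{
   unsigned in_header =
      user_dw < PRIM_HEADER_HOLE_DW ? user_dw : PRIM_HEADER_HOLE_DW;
   unsigned rest = user_dw - in_header;
   struct mue_cost c;
   c.mue_dw = PRIM_HEADER_DW + align_up(rest, CHUNK_DW);
   c.fs_dw = (in_header ? PRIM_HEADER_DW : 0) + align_up(rest, CHUNK_DW);
   return c;
}

/* Simplified rule: pack only when it saves MUE space for free. */
static bool
should_pack(unsigned user_dw)
{
   struct mue_cost u = cost_unpacked(user_dw);
   struct mue_cost p = cost_packed(user_dw);
   return p.mue_dw < u.mue_dw && p.fs_dw <= u.fs_dw;
}
```

Under this model, should_pack(12) is true (16/16 vs 24/16) and
should_pack(16) is false (24/24 vs 24/16), matching Examples 1 and 2.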

This change doesn't affect vk_meshlet_cadscene in default configuration,
but it speeds it up by up to 25% with "-extraattributes N", where
N is some small value divisible by 2 (by default N == 1) and we
are bound by URB size.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
2023-07-24 07:55:29 +00:00
amd radv: add radv_compile_cs() to compile a compute shader 2023-07-24 07:04:44 +00:00
android_stub util/log: improve logger_android 2023-02-22 17:55:40 +00:00
asahi asahi: Don't depend on glibc to decode 2023-07-22 12:42:58 -04:00
broadcom nir: Remove register arrays 2023-07-21 11:25:49 +00:00
c11 treewide: Replace the usage of TRUE/FALSE with true/false 2023-06-27 18:18:28 +08:00
compiler nir/opt_copy_prop_vars: drop reuse of dynamic arrays 2023-07-24 02:29:54 +00:00
drm-shim drm-shim: Avoid assertion fail if someone does close(-1). 2023-06-01 01:50:41 +00:00
egl egl/wayland: wait for compositor to release shm buffers 2023-07-19 15:11:46 +00:00
etnaviv ci/etnaviv: update ci expectations 2023-07-22 04:16:32 +00:00
freedreno ci/freedreno: cover all texture gather flakes 2023-07-24 01:37:01 +02:00
gallium radeonsi: enable aco compile for mono merged ES/GS 2023-07-24 01:49:21 +00:00
gbm gbm: drop unnecessary vulkan dependency 2023-02-23 18:31:22 +00:00
getopt
glx glx: Assign unique serial number to GLXBadFBConfig error 2023-07-15 03:27:17 +00:00
gtest gtest: Update to 1.13.0 2023-05-14 11:09:02 +00:00
imagination pvr: Fix writing query availability write out 2023-07-19 12:22:30 +00:00
imgui
intel intel/compiler,anv: put some vertex and primitive data in headers 2023-07-24 07:55:29 +00:00
loader dri3: only invalidate drawables on geometry change if geometry has changed 2023-06-15 12:22:24 +00:00
mapi mapi: Remove dead struct _glapi_function in glapi/glapi_getproc.c 2023-06-29 01:36:09 +00:00
mesa mesa: propagate shader source sha1 from gl_shader to nir_shader 2023-07-20 09:08:08 +00:00
microsoft ci: move microsoft files rules to src/microsoft/ci/gitlab-ci.yml 2023-07-18 23:07:52 +00:00
nouveau nouveau: Drop BuildUtil::Location 2023-07-21 02:40:36 +00:00
panfrost panfrost: Remove unused helpers 2023-07-21 11:25:48 +00:00
tool meson: remove needless c++17-overrides 2023-05-19 12:45:31 +00:00
util util/u_queue: always enable UTIL_QUEUE_INIT_SCALE_THREADS, remove the flag 2023-07-18 11:11:12 -04:00
virtio venus: use in_render_pass to skip present_src counting 2023-07-22 01:49:43 +00:00
vulkan vulkan: bump header register to 1.3.258 2023-07-21 16:36:26 +00:00
.clang-format clang-format: add wayland foreach macros 2023-07-07 23:00:06 +00:00
meson.build lavapipe: Include llvmpipe 2023-06-30 12:56:35 +00:00