glthread_attrib_binding has 2 fields and 4 bytes of padding, and it is
stored in an array. Remove the padding by splitting the structure
into 2 arrays, one for each field.
This also fixes the pointer alignment.
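As a minimal sketch of the split (the type names, the field names
`offset` and `pointer`, and `MAX_ATTRIBS` are hypothetical and not taken
from the Mesa source), an array of structures pays 4 bytes of padding per
element on a 64-bit CPU, while two parallel arrays pay none:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ATTRIBS 32 /* hypothetical element count */

/* Before: one struct per element. With 8-byte pointers the compiler adds
 * 4 bytes of padding after "offset" so that "pointer" stays aligned. */
struct binding_aos {
   int offset;          /* hypothetical field */
   const void *pointer; /* hypothetical field */
};

struct bindings_before {
   struct binding_aos bindings[MAX_ATTRIBS];
};

/* After: one array per field. Each array is naturally aligned, so no
 * per-element padding is needed. */
struct bindings_after {
   const void *pointer[MAX_ATTRIBS];
   int offset[MAX_ATTRIBS];
};

/* The split layout is never larger than the original one. */
static_assert(sizeof(struct bindings_after) <= sizeof(struct bindings_before),
              "splitting into two arrays must not grow the storage");

/* Bytes saved by the split: 0 with 4-byte pointers, 4 per element with
 * 8-byte pointers. */
static size_t bytes_saved(void)
{
   return sizeof(struct bindings_before) - sizeof(struct bindings_after);
}
```

Putting the pointer array first also keeps every pointer at an offset that
is a multiple of the pointer size, provided the enclosing structure is
itself pointer-aligned.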
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27350>
glthread (the Python generator) needs to know the pointer size at compile
time so that it can sort the fields of call structures for optimal
packing on the target CPU.
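To illustrate why the pointer size matters, here is a sketch in C rather
than in the Python generator itself. The helper names are hypothetical, and
it assumes that every field is aligned to its own size, which simplifies
real ABI rules:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Size of a struct whose fields have the given sizes, in declaration order.
 * Assumption: each field is aligned to its own size (a power of two) and the
 * struct is padded to a multiple of its largest field. */
static size_t packed_size(const size_t *sizes, size_t n)
{
   size_t offset = 0, max_align = 1;

   for (size_t i = 0; i < n; i++) {
      size_t a = sizes[i];
      assert(a != 0 && (a & (a - 1)) == 0);
      offset = (offset + a - 1) / a * a; /* pad up to the field alignment */
      offset += a;
      if (a > max_align)
         max_align = a;
   }
   return (offset + max_align - 1) / max_align * max_align;
}

/* qsort comparator: larger fields first. */
static int by_size_desc(const void *a, const void *b)
{
   size_t x = *(const size_t *)a, y = *(const size_t *)b;
   return (x < y) - (x > y);
}

/* Sort the fields from largest to smallest, then compute the struct size. */
static size_t sorted_packed_size(size_t *sizes, size_t n)
{
   qsort(sizes, n, sizeof(sizes[0]), by_size_desc);
   return packed_size(sizes, n);
}
```

With 8-byte pointers, fields of sizes {2, 8, 4, 8, 1} occupy 40 bytes in
that order but 24 bytes once sorted. With 4-byte pointers the same fields
({2, 4, 4, 4, 1}) take 20 and 16 bytes, so the best order depends on the
pointer size.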
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27350>
glthread will compare the whole type string, so the string must not have
trailing spaces.
No functional change.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27350>
Since 1 field was changed to 8 bits and cmd_size was removed, call sizes
have decreased, so we have 4 unused bytes in 2 DrawArrays structures.
So far we have used:
- DrawArrays
- DrawArraysInstancedBaseInstance
- DrawArraysInstancedBaseInstanceDrawID
Change them to these by either removing 4 more bytes or adding 4 bytes,
so that we don't waste space, which drops the number of used calls by 1:
- DrawArraysInstanced
- DrawArraysInstancedBaseInstanceDrawID
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27350>
Since 2 fields were changed to 8 bits and cmd_size was removed, call sizes
have decreased by 4 bytes, so we have 4 unused bytes in most DrawElements
structures. So far we have used these calls for all DrawElements variants:
- DrawElementsBaseVertex
- DrawElementsInstanced
- DrawElementsInstancedBaseVertexBaseInstance
- DrawElementsInstancedBaseVertexBaseInstanceDrawID
Change them to these by either removing 4 more bytes or adding 4 bytes,
so that we don't waste space:
- DrawElements
- DrawElementsInstancedBaseVertex
- DrawElementsInstancedBaseInstance
- DrawElementsInstancedBaseVertexBaseInstanceDrawID
This decreases the size of 1 frame in glthread batches by 12%
in Viewperf2020/Catia1.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27350>
The only legal values are {1, 2, 3, 4, GL_BGRA}.
We need GLpacked16i to be unsigned, not signed, because GL_BGRA (0x80E1)
is greater than 0x7FFF, the largest signed 16-bit value.
This decreases the size of 1 frame by 10% in Viewperf2020/Catia1.
It decreases the size of many Pointer calls by 8 bytes.
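A small sketch of the signedness issue follows. The helper names are
hypothetical, and `GL_BGRA` is redefined locally with its value from the
OpenGL headers so that the example stands alone:

```c
#include <assert.h>
#include <stdint.h>

#define GL_BGRA 0x80E1 /* value from the OpenGL headers */

/* Hypothetical helper: store "size" in an unsigned 16-bit field and read it
 * back. All legal values {1, 2, 3, 4, GL_BGRA} survive the round trip. */
static int roundtrip_u16(int size)
{
   assert(size >= 0 && size <= 0xFFFF);
   uint16_t packed = (uint16_t)size;
   return packed;
}

/* The same with a signed 16-bit field. GL_BGRA does not fit, so the
 * conversion is implementation-defined; common compilers wrap it to a
 * negative number, and the original value is lost either way. */
static int roundtrip_s16(int size)
{
   int16_t packed = (int16_t)size;
   return packed;
}
```

The small sizes 1 to 4 round-trip through both helpers. Only `GL_BGRA`
needs the unsigned type.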
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27350>
Only variable-sized calls keep cmd_size in their structures, and it's
renamed to num_slots because it's in units of 8-byte elements.
The motivation is to make room for reducing call sizes.
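As a sketch of the unit conversion, assuming a 16-bit num_slots field (the
helper name and the field width are assumptions, not taken from the
source), a byte size rounds up to whole 8-byte slots like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_SIZE 8 /* bytes per slot, as stated in the commit message */

/* Hypothetical helper: number of 8-byte slots needed to hold size_bytes. */
static uint16_t bytes_to_num_slots(size_t size_bytes)
{
   size_t slots = (size_bytes + SLOT_SIZE - 1) / SLOT_SIZE;

   assert(slots <= UINT16_MAX); /* assumed field width */
   return (uint16_t)slots;
}
```

For example, a 20-byte call rounds up to 3 slots, that is 24 bytes.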
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27350>
The main motivation is that no_error allows us to drop count==0 draws
at the beginning of the marshal function, instead of forwarding them
to the frontend thread. Such draws are plentiful with Viewperf.
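The early-out can be sketched as follows. All names here are hypothetical
and the real marshal functions look different. The sketch assumes that
without no_error a count==0 draw must still be forwarded, because it can
still generate GL errors:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate: a draw can be discarded up front only when the
 * context is no_error and there is nothing to draw. */
static bool can_drop_draw(bool no_error, int count)
{
   return no_error && count == 0;
}

/* Hypothetical marshal function: returns 1 if the draw was queued for the
 * other thread and 0 if it was dropped. "queued" counts the queued calls. */
static int marshal_draw(bool no_error, int count, int *queued)
{
   assert(queued != NULL);

   if (can_drop_draw(no_error, count))
      return 0; /* dropped before any batch space is used */

   (*queued)++;
   return 1;
}
```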
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27350>
The main motivation is that no_error allows us to drop count==0 draws
at the beginning of the marshal function, instead of forwarding them
to the frontend thread. Such draws are plentiful with Viewperf.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27350>
We have to write all the same regs that the blob writes, or we risk using
a stale reg value written by the blob.
I went through the blob trace again and added all the missing magic regs,
hopefully for the last time.
This fixes screen corruption for Mobox users and, in some cases,
for users of other emulators. The reg which caused the issue
is HLSQ_UNKNOWN_A9AC.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27721>
I think it's better to keep the other checks (check for dmabuf, uuid,
...) as we can use them to know the features required.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26071>
Some drivers have all the required features but do not work with
gl_sharing, so we end up advertising it wrongly. Add this cap so that
cl_khr_gl_sharing is only advertised for drivers that were tested to
work with it.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26071>
Use the same initializing trick as in 27d5543: first we initialize our
properties struct to { false }, then we fill the fields in one by one.
C++ does not allow assigning to an array from an initializer list, so
the properties exposed as an array in the struct are initialized either
one by one, or assigned in a chain.
As the properties are initialized at init time, move tu_get_properties
and tu_get_physical_device_properties_* before tu_physical_device_init,
so that the latter can call get_properties().
This lets us delegate the physical device property entrypoints to
common runtime code.
Tested with drm-shim, doing a diff on vulkaninfo output. Differing
fields were pipelineCacheUUID, driverInfo and driverUUID, i.e. the
actual properties do not differ.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27723>
Some D3D11 games rely on out-of-bounds indirect UBO loads returning
real values from the underlying bound descriptor. This workaround
prevents us from lowering indirectly accessed UBOs to consts.
DXVK is expected to later declare dynamically indexed uniforms with an
upper size bound to make the accesses spec compliant, but for now
we need our own workaround.
Known affected games:
- Dark Souls 3
- Sekiro: Shadows Die Twice
- Final Fantasy Type-0 HD
- Ultrakill
- Dishonored 2
DXVK discussions:
- https://github.com/doitsujin/dxvk/issues/405
- https://github.com/doitsujin/dxvk/issues/3861
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27727>
This was a copy+paste error, probably from vk_drm_syncobj.c. If we do
WAIT_AVAILABLE, it only waits for the dma_fence to exist, not for it to
signal. Instead, we want WAIT_FOR_SUBMIT. (Technically, that's not
necessary but it is typical for CPU waits to also wait for the time
point to materialize.)
Fixes: 2074e28a0d ("nvk: Add an upload queue")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27757>
Zink is the only way to use hw accelerated GL on
a7xx and the preferred way for hw supporting NVK.
Start building Zink by default everywhere that we
would build swrast by default, except for Mac +
Cygwin + Haiku.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27737>