On non-Windows OSes, align_realloc is the os_realloc_aligned() from
src/util/os_memory_aligned.h, which doesn't use realloc internally.
Instead, it uses os_malloc_aligned() and memcpy's over the old data,
which is why it needs an "old size" (unlike normal realloc).
In _mesa_reserve_parameter_storage, the call to align_realloc above
passes (oldValNum * sizeof(gl_constant_value)) as the old size, which
is all the actual data. The actual allocation size of the array may
be larger (in fact, we allocate 16 extra components), which is tracked
in SizeValues. After realloc, we memset to zero starting at the old
allocation size, to the new allocation size.
This would work if it were a real realloc. However, because we actually
malloc + memcpy and only copy the previous /data/, not the allocated
size, and then memset from the old /allocated size/, our new copy will
have the spaces between the old data and the old allocation size neither
copied nor memset, leaving them as uninitialized garbage memory.
These values then get written to the shader cache, meaning that if you
compile the same shader multiple times, you may get different shader
cache entries. This is bad for reproducible, deterministic compiles.
While at it, we also memset to zero in _mesa_add_parameter, as this
looks like another place where memset-to-zero is missing.
To reproduce this error, one can run shader-db:
$ MESA_SHADER_CACHE_DIR=a ./run -b shaders/godot3.4/49-28.shader_test
$ MESA_SHADER_CACHE_DIR=b ./run -b shaders/godot3.4/49-28.shader_test
and see an occasional difference in the end of the ParameterValues
array, where there's a padding gap between the last two elements that
was never zero-initialized.
Thanks to Mark Janes for discovering this and tracking it down together!
Cc: mesa-stable
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25316>
Now that we don't need the lock, we can return directly. Also, now that
we don't have the old UAPI, we can clean things up and make the whole
function make a bit more sense. Also, drop some pointless braces while
we're just moving code around.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25357>
These existed entirely to support shadow memory for VkImage cases where
we needed tiling. Now that we have VM_BIND, these are no longer used so
we can drop the wrappers and just implement VkAllocate/FreeMemory again.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25357>
This existed to let us lock the memory_objects list and for handling
BO-based vk_sync waits. We don't have either of these things anymore so
there's no need for a device-level lock. We already have fine-grained
locks around the data structures that need them.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25357>
When creating a new node, we were clobbering the original size
requested, and use that as offset, so the node would always be full.
Fixes: 591db9a9a5 ("util: Remove per-buffer header in linear alloc for release mode")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25382>
Otherwise previous non-mesh draws and can leave things dirty.
Fixes crashes in:
dEQP-VK.mesh_shader.ext.query.all_queries.triangles.reset_before.copy.no_wait.indirect_draw.32bit.no_availability.no_blocks.task_mesh.inside_rp.single_view.with_secondary
after other shaders have run.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25385>
When a surface is to be created for a buffer that is to be persistently mapped,
set the UAV bind flag as well, otherwise, an extra copy of the surface will have
to be created when the buffer is later bound as shader storage buffer, and
it will then incur extra buffer copies to keep the shadow copies of the surface
in sync. But when a buffer is simultanously bound to uniform buffer and shader
storage buffer, having both srv and uav on the same resource can also be costly.
This patch also provides optimization for readonly shader buffer. In this case,
instead of creating uav for accessing a readonly shader buffer, srv
raw buffer is used instead.
Tested with yuzu tracefile included in VMware bug 3029385.
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25381>
Since constant buffers can be accessed as raw buffers, indices to
the constant buffers will have to be declared as immediates.
But it is a waste to define immediates for every possible indices
to the constant buffer, we will only include immediates that are used in
the shader. But since immediate block is declared in a very early stage
of a shader, this patch will append any new immediates to the immediate
list and reemit the immediate block if needed.
Fixes assertion running yuzu
Reviewed-by: Maaz Mombasawalam <mombasawalam@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25381>
Restrict use of rawbuf for constant buffer access to GL43 capable
device only.
Fixes glretrace regressions running with SW Renderer.
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25381>
Currently when a buffer is mapped with the persistent bit set and is
later bound as a constant buffer, the updated buffer content is not
properly updated to the constant buffer surface as the constant buffer
surface is different from the original buffer surface. Doing
a buffer copy to sync the content of the constant buffer will fix
the problem, but the buffer copy can be costly.
To properly fix the issue, instead of creating a secondary surface
for the constant buffer, the original buffer surface will be accessed as
a raw buffer.
This fixes the rendering issue running yuzu deko_basic.nro
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25381>
The new host version checks the support of these features better,
so report here accordingly. This fixes a number of texwrap piglit
tests on Intel.
v2: Stick to old test for PIPE_CAP_TEXTURE_MIRROR_CLAMP because
host has to be backward compatible.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25343>
They now match the order in vulkan_core.h which makes it easier to find
missing entries of which there was one. This also fixes a bug where we
were setting .bufferImageGranularity twice and we were overwriting the
correct value with an incorrect one.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25383>
For dynamic states that are precompiled from static state, just set the
corresponding dynamic draw state directly, and keep a record of which
ones are precompiled when we go to emit states at draw time so we don't
accidentally re-emit them.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25276>
These draw states are things that depend on pipeline-only state:
- The load state depends on knowing the pipeline layout, which we won't
know for a shader that's loaded from a binary. This is going away on
a7xx anyway, and we should be able to use the a7xx strategy of
prefetching the descriptors in the preamble on a6xx too.
- The prim order state depends on feedback loops and raster order
attachment access, which isn't supported at the moment.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25276>
Now that the FS-specific parts are split out, the only remaining part is
the blend state part. Use the same state that we use for dynamic
blending for static blending, eliminating the last use of the pipeline
in the LRZ code. While we're at it fix a bug where dynamic blending
didn't always disable LRZ writes (even though it only mattered with a
non-conformant debug flag because we invalidated LRZ anyway).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25276>
There are a couple pieces of state that we can't calculate until we know
all of the shaders:
- The actual variants to use (i.e. whether to use safe-const variants)
- Program config and VPC draw states
- Const layout, which depends on the variants
- Whether per-view viewports should be enabled
Now that these are all combined in tu_pipeline::program, move these into
a separate struct that can be referenced directly without a pipeline.
The next step is to refactor the code filling it out so that it can be
called at draw time when given just the shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25276>
The way this works now is awkward to map to shader objects. We don't
have the pipeline layout when "linking" shaders at draw time, so we have
to piece it together from the shaders. Store the information we need in
the shaders and piece it together.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25276>
Before this, we combined the modes after compiling the shaders when
constructing the pipeline. But that's a bit awkward with shader objects,
where there is no good place to put state derived from TCS and TES but
not the other stages. However, shader objects leaves us with an out:
when compiling separately, the modes must be on one of the shaders. So
instead we just copy the modes earlier, in the NIR shaders, and then get
them from the appropriate shader later. That way there is no extra
overhead when fast-linking, as there currently is, and we don't need to
create an awkward separate object just for this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25276>
We will gradually transition over users of cmd->state.pipeline and
TU_CMD_DIRTY_PIPELINE to shaders and derived state from shaders. This
just puts in place the framework to start doing that.
When importing a library with all of the shader state, we now have to
import the shaders in addition to the program, so that they are
available when we bind the pipeline.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25276>
The most important of these is the empty FS, which should cut down on
time spent in the compiler when we fast-link a pipeline where there is
only a library with the VS and no library with the empty FS. Source
engine is known to do this.
This is also necessary for shader objects where the "empty" shaders are
never created up-front. We will use these when a NULL shader is bound.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25276>
Not comprehensive, but those were the ones used to work on the
previous linear_alloc changes. Also having a test already
set up lower the barrier to add more tests for future in case of
bugs.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>
There's only need to keep the offset and size of the latest buffer,
so rename linear_header into linear_ctx and change the code to
keep records there.
For debug mode we still keep a header, now called linear_node_canary,
to have a magic check. Since due to alignment we have a free space,
also keep the individual occupation of each node (offset), for
debugging.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>
With linear_realloc() gone, there's no code that reads the size
in linear_size_chunk struct, so it can be removed. This removes
the 8-byte overhead per child allocation and simplifies the
allocation code.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>
Now that linear_realloc() is unused, remove it. It is not an actual
realloc, will always allocate new memory and copy data around -- and
had a big warning about it in the documentation.
In the couple of uses we had before, the client code knew the size,
so it could be changed to perform the allocation and the copy by
themselves. The client code keeping the size is the recommended
way here.
This will allow us remove linear_size_chunk later.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>
Note that for linear allocator, the realloc will always
allocate new memory. In both cases that realloc was used,
the existing size was known, so we can just allocate
and do the copy ourselves.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>
In the linear allocation only the parent (context) can be used
to allocate new children, so let's use an opaque type to identify
the linear context. This is similar to what's done in GC allocator.
Update the documentation and a couple of function names to
refer to linear context instead of linear parent.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>
Linear allocator doesn't support calling custom destructors to
its child allocations nor freeing individual child allocations.
So the destructor callback and the delete operator don't apply
to objects using linear allocator.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>