There's already a single command queue for the screen, meaning that
all commands are being serialized implicitly into that queue. There's
no need to have separate fences for parallel contexts when those
fences would all share the same underlying timeline.
This adds an explicit lock to expand the scope of the implicit screen
command queue ordering to include fence signals.
Each context still gets its own submit sequence, which is used for 1
purpose right now: A uniqueness check in the state manager to see
if states are coming from separate command lists, to apply promotion
and decay logic.
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14959>
Turn the context state and shader key into a bitmask. When lowering
the depth invert into the shader, scan for writes to viewport index.
If found, move position to the end of the function (or current vertex)
and check if the current viewport needs the depth invert before applying.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14881>
If an indirect arg buffer is being produced by a compute shader, then when
we go to consume it as an SSBO in a compute transform pass, we need to insert
a UAV barrier to prevent the two dispatches from overlapping. For app dispatches,
this is the app's responsibility via explicit barrier APIs, and if they don't,
then they're allowed to overlap.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14787>
GL has separate no-attachment framebuffer size parameters, but
D3D uses the viewport. Clamp the viewport dimensions to lie within
GL's default framebuffer sizes.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14504>
Some more refactoring in d3d12_draw.cpp to re-use a bunch of state
and descriptor management, and some refactoring of the dirty states.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>
We don't key off of it to try to figure out if we need to produce
a new shader variant, so there's no need to set it when changing
properties that feed into variants. If we do have a new shader or
variant at draw time, we'll produce a new PSO without this.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>
This is handled in 2 ways:
* For casts in the same "class," we can just do the cast. There's also
limited support for 4x8 => 1x32 and 2x16 => 1x32.
* For casts that are just by size, use a lowering pass. This reads the
data in the appropriate UINT format (except R11G11B10), retrieves
the original bits, re-packs it into the dest, and returns it to the
app for loads. For stores, the process is done in reverse - bits
for the final format are computed and re-expanded to the format that
should be used for the store.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
This is a bit fragile. The algorithm is essentially:
- Let the driver track state for non-dual-bound resources.
- For resources that are dual-bound as SSBO/image and a second
bind point, assume they're being used as UAVs.
- When a MemoryBarrier is issued, dirty all destination bind points
so they re-assert their state, and if the destination is not UAV,
temporarily suppress UAV state re-assertion.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
This removes the bitfield packing/unpacking.
pipe_format entries are reordered to have vertex formats first because
vertex formats must be <= 255.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11370>
Some GL applications, including Blender, are producing rendering
artifacts due to missing resource state barriers.
The d3d12_context keeps track of all resources bound as shader resource
or constant buffers. If any of these resources are used for Copy,
Resolve, or Clear source/target, the context tracking must be updated
so the correct state can be restored before the next draw call.
This change is something of a big hammer. Essentially, if a resource
currently bound as an SRV or CBV gets used for a non-shader access, a
flag is set in the context that invalidates all bindings of the same
type on the same shader stage. Thus the next Draw execution refreshes
the shader views and state transitions state before invoking Draw on the
command list.
A more elegant (and complex) fix would limit the invalidation to
resource state only, rather than also forcing a recreation of resource
views. It is unclear right now whether it is worth the time to
implement a more elegant fix.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10871>
Surfaces can be shared across contexts, and can even outlive the
original context that created them, so move the pools to the screen.
Since they no longer belong to a single context, they need a lock now.
v2: Samplers moved back to the context
v3: Fix Linux build
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3812
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9349>
There are a few places (mainly u_threaded_context) that do:
set_vertex_buffers(...);
for (i = 0; i < count; i++)
pipe_resource_reference(&buffers[i].resource.buffer, NULL);
set_vertex_buffers increments the reference counts while the loop
decrements them.
This commit eliminates those reference count changes by adding a parameter
into set_vertex_buffers that tells the callee to accept all buffers
without incrementing the reference counts.
AMD Zen benefits from this because it has slow atomics if they come from
different CCXs.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
Instead of calling this functions again to unbind trailing slots,
extend it to do it when binding. This reduces CPU overhead.
A lot of drivers ignore "start" and always unbind all slots after "count".
Such drivers don't need any changes here.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
Instead of calling this functions again to unbind trailing slots,
extend it to do it as part of the call that sets vertex buffers.
This reduces CPU overhead. Only st/mesa benefits from this.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
We often do this:
pipe->set_constant_buffer(pipe, shader, slot, &cb);
pipe_resource_reference(&cb->buffer, NULL);
That results in atomic increment in set_constant_buffer followed by
atomic decrement after set_constant_buffer. This new interface
eliminates those atomics.
For the case above, this should be used instead:
pipe->set_constant_buffer(pipe, shader, slot, true, &cb);
cb->buffer = NULL; // if cb is not a local variable, else do nothing
AMD Zen benefits from this. The perf improvement is ~3% for Viewperf13/Catia.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
For non-CPU-accessible pipe resource types (DEFAULT/IMMUTABLE),
allocate non-CPU-accessible buffers directly from the cache_bufmgr.
Update the d3d12_bo creation to handle nonmappable buffers.
For CPU-write-only (DYNAMIC/STREAM), use the upload slab_bufmgr.
Update this slab manager to use CPU_WRITE | GPU_READ PB usage.
For CPU-read-write (STAGING), use the readback_slab_bufmgr.
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8095>
Missed these last time through, not sure how. I couldn't find a
reason for the nested loop in d3d12_enable_fake_so_buffers to go
backwards, which would require signed, so I switched it to forward.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8248>
Fix defect reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable ss going out of scope leaks the storage it points to.
Fixes: 2ea15cd661 ("d3d12: introduce d3d12 gallium driver")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8179>