DXIL doesn't have a "subgroup ID" or "num subgroups" construct,
so add lowering to construct them. Subgroup ID is done using
once-per-subgroup atomics on a workgroup-shared variable, and
then broadcasting that (using read_first_invocation) to the other
threads. Num subgroups is just the workgroup size divided by the
subgroup size.
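
As a rough illustration of the scheme described above, here is a small
Python model of a single workgroup. It is only a sketch, not the NIR
lowering itself: the function name is hypothetical, threads are assumed
to be packed into subgroups in linear order, subgroups are assumed to
reach the atomic in that same order (on real hardware the order is
arbitrary), and the division is assumed to round up so that a partial
trailing subgroup is counted.

```python
def assign_subgroup_ids(workgroup_size, subgroup_size):
    """Model the subgroup ID / num subgroups lowering for one workgroup.

    Returns (per_thread_subgroup_id, num_subgroups).
    """
    counter = 0  # stands in for the workgroup-shared variable
    subgroup_ids = [None] * workgroup_size

    # Assumption: threads are packed into subgroups in linear order.
    for first in range(0, workgroup_size, subgroup_size):
        # Once per subgroup: one elected thread does an atomic add and
        # gets back the previous value of the shared counter.
        elected_value = counter
        counter += 1

        # Broadcast the elected thread's value to the rest of the
        # subgroup (the read_first_invocation step).
        last = min(first + subgroup_size, workgroup_size)
        for thread in range(first, last):
            subgroup_ids[thread] = elected_value

    # Num subgroups: workgroup size / subgroup size, rounded up (assumed).
    num_subgroups = (workgroup_size + subgroup_size - 1) // subgroup_size
    return subgroup_ids, num_subgroups
```

With a workgroup of 10 threads and a subgroup size of 4, the model
yields three subgroups, the last of which is only partially filled.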
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20777>
Some D3D12 drivers, like my PC's AMD driver, don't like using a
dynamic index to load from a constant buffer that's bound via
root constants. Instead, load the full set of vertex data and use
bcsel to pick which one to use.
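
The idea can be sketched in Python as a chain of selects. This is only
a model of the pattern, with a hypothetical function name; in the real
shader the loop would be fully unrolled so that every load uses a
constant index.

```python
def select_vertex(vertices, index):
    """Pick vertices[index] without indexing by a dynamic value."""
    result = vertices[0]  # constant-index load
    for i in range(1, len(vertices)):
        # Models bcsel(index == i, vertices[i], result). Here `i` is a
        # compile-time constant once the loop is unrolled.
        result = vertices[i] if index == i else result
    return result
```

Note that in this model an out-of-range index falls through to element
0, which is a property of the sketch rather than a statement about the
driver.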
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20778>
This is a weird way to do queries, but in multiview, each query
takes up N slots, where N is the number of views. D3D doesn't do
it that way, and only has one result, which fortunately is a valid
way to do Vulkan queries. We just need to take care to zero out
the other view results, and make sure they get "signaled" when
the cmdbuf is submitted.
Note that it is invalid in D3D to use ResolveQueryData on query
slots that have never actually been begun/ended, so we zero out
the data by copying zeroes into the buffer. This probably could
be optimized but oh well.
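
A minimal Python model of the resulting slot layout, assuming N is the
number of bits set in the view mask and that the single D3D result
lands in the first slot. The function name and the dictionary
representation are hypothetical, chosen only for illustration.

```python
def multiview_query_slots(first_slot, view_mask, result):
    """Return {slot: value} for one query recorded under multiview."""
    num_views = bin(view_mask).count("1")  # N = number of active views

    # The one real D3D result goes in the first slot.
    slots = {first_slot: result}

    # The remaining N-1 slots were never begun/ended in D3D, so they
    # are filled by copying zeroes rather than by resolving a query.
    for extra in range(1, num_views):
        slots[first_slot + extra] = 0
    return slots
```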
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20650>
For draws, when we're emulating multiview, we need to loop them
and set up the right sysval. For clears, we always need to loop.
When not emulating, we also need to set up the right view instance
mask.
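
The control flow can be sketched as follows. This is a hedged Python
model, not the driver code: the function name, the `emulated` and
`is_clear` parameters, and the dictionary entries are all hypothetical
stand-ins for the sysval and the view instance mask.

```python
def plan_commands(view_mask, emulated, is_clear):
    """Return the list of commands to record for one draw or clear."""
    if not emulated and not is_clear:
        # Native view instancing: one draw, with the view instance mask
        # telling the hardware which views to run.
        return [{"view_instance_mask": view_mask}]

    # Emulated draws, and all clears, loop over the active views.
    commands = []
    view = 0
    mask = view_mask
    while mask:
        if mask & 1:
            # One command per set bit, with the view ID passed along.
            commands.append({"view_id": view})
        mask >>= 1
        view += 1
    return commands
```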
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20650>
D3D's view instancing is an optional feature, and even when it's
supported, it only goes up to 4 views, whereas Vulkan requires a
minimum of 6 supported views. So, we need to have a path for handling
the cases where we can't use the native feature.
In this mode, pass the view ID as a runtime var. The caller is then
responsible for looping the draw calls and filling out the constant
buffer value correctly for each draw. When we get to the last pre-rast
stage, we'll additionally want to write out gl_Layer to select the
right RTV array slice. Lastly, for the fragment shader, if there are
any input attachments, those get loaded using the RTV slice instead
of the view ID. RTV slice input into the PS is done with a signature
entry (which must be output from the previous stage) rather than a
system value.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20650>
This adds support for D3D12-native view instancing to the compiler.
Essentially, it's just the ability to load SV_ViewID (dx.op.viewID),
set the right capability, and fill out some more PSV data. Note that
the PSV data is currently garbage. Ideally, we'd fill out a proper
input -> output and viewID -> output dependency table, but AFAIK
this is only used to enforce D3D API validation, and drivers ignore
it, so it's less critical to get it right.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20650>
Besides the DLL names being different, we can skip the versioning
work, since (for example) there are no known bad versions to fuss with.
Co-authored-by: Ethan Lee <flibitijibibo@gmail.com>
Co-authored-by: David Jacewicz <david.jacewicz@protonmail.com>
Co-authored-by: tieuchanlong <tieuchanlong@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19022>
From the Vulkan spec, the WAIT flag on vkCmdCopyQueryPoolResults only
serves to increase the first synchronization scope to include query end
commands, but either way, the synchronization scope only includes
commands that occur earlier in submission order. In other words, we
don't need to enforce queue ordering; a pipeline barrier is all that's
needed.
Fixes deadlocks in the timestamp.misc_tests.two_cmd_buffers_primary test.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20617>
D3D considers the rasterizer enabled if there's a pixel shader *or* if
depth is enabled, since you can do depth-only rendering. After parsing
shaders, if we find that there was supposed to be a pixel shader, but
we removed it because there was no output position, disable depth too.
Also, store this info in the cache, since we might not even load the
nir shaders if we'd seen this pipeline before.
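
The decision can be modeled roughly like this. It is a sketch under
stated assumptions: the function and parameter names are hypothetical,
and `ps_removed_no_position` stands for the flag that would also be
stored in the cache.

```python
def rasterizer_state(has_pixel_shader, depth_enabled, ps_removed_no_position):
    """Return (rasterizer_enabled, depth_enabled) as D3D would see them."""
    if ps_removed_no_position:
        # The pixel shader was dropped because the previous stage wrote
        # no position, so depth has to go too, or D3D would still treat
        # the rasterizer as enabled.
        has_pixel_shader = False
        depth_enabled = False

    # D3D rule: a pixel shader *or* depth enables the rasterizer.
    rasterizer_enabled = has_pixel_shader or depth_enabled
    return rasterizer_enabled, depth_enabled
```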
Fixes dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_graphics
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20617>
When querying capabilities or creating views using a scoped aspect
mask, we want to return the correct single-channel format, but
when actually creating the resource (aspect mask 0),
we want to use the typeless format, since the single-channel formats
don't report multisampling support.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20614>
Only add SV_SampleIndex if there exists a sample-rate var that has either flat
interpolation or centroid (and therefore can't force sample rate implicitly),
unless there is also a sample-rate var that doesn't have those properties.
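
One way to read this rule as a predicate, sketched in Python. The
`FsInput` class and the function name are hypothetical; the three
flags model the sample, flat, and centroid qualifiers on a fragment
shader input.

```python
from dataclasses import dataclass


@dataclass
class FsInput:
    sample: bool = False
    flat: bool = False
    centroid: bool = False


def needs_sample_index(inputs):
    """Decide whether SV_SampleIndex has to be added explicitly."""
    sample_vars = [v for v in inputs if v.sample]

    # Sample-rate vars that cannot force sample rate on their own.
    cannot_force = any(v.flat or v.centroid for v in sample_vars)

    # Sample-rate vars that already force sample rate implicitly.
    can_force = any(not (v.flat or v.centroid) for v in sample_vars)

    return cannot_force and not can_force
```

For example, a single flat sample-rate input needs SV_SampleIndex, but
adding a second, plain sample-rate input makes it unnecessary.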
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20614>
There are VK tests that have mismatching interpolation specifiers between FS
and the previous stage. For structs, that resulted in different types, which
breaks DXIL validation.
We could link the shaders and have that overwrite the interpolation field from
the previous shader, but we could also just not care and always use float.
I don't see any regressions from that.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20614>