This doesn't seems to make a practical difference, but it seems better
to do it the same way as we do for outputs, as well as what DXC does.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12197>
Rather surprisingly, the value stored in the NumVectors field of the
DXIL PSV header isn't the number of vectors, but rather the *maximum*
vector used.
This makes a difference when we're not writing to the first element of
an array, where we would previously generate a validation error.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12197>
We say that they're for debug only but we don't really have a good
policy around when to set them and when not to. In particular,
nir_lower_system_values and nir_lower_vars_to_ssa which are the chief
producers of SSA values which might reasonably have a name do not bother
to set one. We have some names set from things like BLORP and RADV's
meta shaders but AFAICT, they're setting a name more because it's there
than because they actually care.
Also, most things other than nir_clone and nir_serialize don't bother to
try and preserve them. You can see in the diffstat of this commit
exactly what passes attempt to preserve names. Notably missing from the
list is opt_algebraic which is the single largest source of SSA def
churn and it happily throws names away.
These observations lead me to question whether or not names are actually
useful at all or if they're just taking up space (8B per instruction)
and wasting CPU cycles (to ralloc_strdup on the off chance we do have
one). I don't think I can think of a single time in recent history
where I've been debugging a shader issue and a SSA value name has been
there and been useful. If anything, the few times they are there, they
just throw me off because they mess up the indentation in nir_print.
iris shader-db on my system gets runtime -2.07734% +/- 1.26933% (n=5)
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5439>
If the barrier tries to apply to memory that we can't express, just
don't apply the memory portion of the barrier. Similarly, if it tries
to apply a global memory barrier at invocation level, upgrade it to
thread-group.
Reviewed-by: Enrico Galli <enrico.galli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11670>
Since DXC doesn't perform de-duplication for arbitrary semantic names,
and the DXIL validator checks against this behavior. We need to remove
the de-duplication.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10989>
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
This change moves the SRVs associated with read-only SSBOs to be emitted
before any other UAV. We do this because the validator expects resources
to be emitted in a specific order, as noted by `emit_module`.
Previously, we emitted SSBOs as SRVs (read-only) or UAVs (read-write)
after other UAVs.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10514>
For non 64bit devices the key stored in hash_table_u64 is wrapped in
hash_key_u64 structure, which is never free.
This commit fixes this issue by just removing the user-defined
`delete_function` parameter in hash_table_u64_{destroy,clear} (which
nobody is using) and using instead a delete function to free this
structure.
Fixes: 608257cf82 ("i965: Fix INTEL_DEBUG=bat")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10480>
Note that it's no longer sufficient to check for >=1 sampler/image
in a potential array, because unbounded arrays have 0 of them.
Reviewed-by: Enrico Galli <enrico.galli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10298>
For GL, we've previously mostly ignored the binding property for sampler variables
during the shader compilation step. For CL, our image bindings were always 0-based as well.
Now, for Vulkan, we are going to be getting explicit bindings and need to emit DXIL that
respects those bindings. Since Vulkan can also have both split and combined images and samplers,
we now need to be smarter about recognizing when NIR is trying to use a "sampler" as *both* an
image and sampler (in deref mode, the same variable will be deref'd as both image and sampler).
That "being smarter" bit comes next, but first, let's prep GL for building correct root
signatures and binding the resources correctly.
Reviewed-By: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10298>
For non-CL, intrinsic access isn't set, because the image type doesn't
have access qualifier. Instead, the access qualifier is set on the variable.
So, add a mode to this pass which can chase back to the variable in addition
to the intrinsic access. Also, update the variable type and the deref chain
types so everything is consistent, that the tex is accessing a sampler. Note
we can't do this for CL, because void-typed samplers don't exist.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>
Creating resource handles upfront works well while we have fixed-size resource
counts, but once we start talking about bindless, having arrays or even sets
of handles becomes prohibitive. It also precludes dynamic indexing for textures.
Instead, rely on the load_vulkan_descriptor instruction for UBO/SSBO, and undo
nir_lower_samplers so we continue to have deref chains for image/sampler accesses.
Then, emit handles at the end of a deref chain - the chain should only have
array offsets, so once we get to a type that's not an array anymore, we can
emit the handle.
Reviewed-by: Enrico Galli <enrico.galli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10288>
Otherwise we can end up in situations like having divide-by-zero. If the optimization is smart enough
that we end up with a *constant* divide-by-zero, then the DXIL validator will fail to sign, which
can trigger fatal errors with CLOn12.
We want to run an initial translation of all kernels during program build, but at that point we don't
know the local size to be able to specify it through kernel specialization data.
v2: Metadata output of 0 is used to indicate that the size wasn't explicitly specific. Copy the
size to the metadata before overriding it to (1,1,1). If conf was explicitly specified,
update the metadata again (though nobody should be paying attention to it).
Closes: https://github.com/microsoft/OpenCLOn12/issues/20
Closes: https://github.com/darktable-org/darktable/issues/8700
Reviewed-By: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10303>
Previously UBOs only supported static indices, and SSBOs only
supported dynamic indices. UBO support for descriptors was added
as an alternative to static indices, but the logic for detecting
descriptors to SSBOs couldn't just differentiate on constants vs not.
Add a helper which can differentiate cleanly across the board and
handle pre-created handles from descriptors, or static/dynamic raw
indices.
Reviewed-by: Enrico Galli <enrico.galli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10149>
Instead of doing all of the handle logic in the descriptor load, split
it so that the resource index is actually computed during resource_index
processing, and it's converted to a handle during the load_descriptor.
At the same time, add SSBO handling and dynamic indexing handling.
Reviewed-by: Enrico Galli <enrico.galli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10149>
The resources need to be emitted in a particular order, so CBVs
have to be emitted first and can't be emitted as we iterate through
instructions.
Reviewed-by: Enrico Galli <enrico.galli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10149>