This is a no-op from a codegen PoV since both SrcSwizzle::Xx and
SrcSwizzle::None will result in .high not being set. However, it allows
other parts of the compiler to more easily reason about the fact that it
only reads the bottom 16 bits.
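
As a rough illustration of why the two swizzles are equivalent for codegen but not for analysis, the sketch below derives a `.high` flag from a source swizzle. The `SrcSwizzle::None` and `SrcSwizzle::Xx` names come from the text above; the `Yy` variant, both helper functions, and the exact mapping are assumptions made for illustration and are not the actual NAK encoder.

```rust
/// Swizzle applied to a packed pair of 16-bit components.
/// `None` and `Xx` are named in the commit text; `Yy` is assumed here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SrcSwizzle {
    None, // read (x, y) as-is
    Xx,   // broadcast the low half
    Yy,   // broadcast the high half
}

/// Hypothetical helper: whether an op that consumes a single 16-bit
/// component would set `.high` for this swizzle.
pub fn f2f_high_bit(swizzle: SrcSwizzle) -> bool {
    match swizzle {
        // Both encode identically: `.high` stays clear.
        SrcSwizzle::None | SrcSwizzle::Xx => false,
        SrcSwizzle::Yy => true,
    }
}

/// Hypothetical helper: true when the swizzle alone proves that only
/// bits 0..16 of the source register are read.
pub fn reads_only_low_half(swizzle: SrcSwizzle) -> bool {
    swizzle == SrcSwizzle::Xx
}
```

With this model, `None` and `Xx` produce the same encoding, but only `Xx` lets a pass conclude from the source alone that the upper 16 bits are dead.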
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39572>
Instead of depending on a global "high" bit that affects both source and
destination, this models f2f.32.16 as an F16v2 op which ignores one of
the two components. This makes encoding the op a tiny bit more complex
(though that's easy enough to shove in a helper) in exchange for letting
copy-prop propagate OpPrmt and swizzles into it.
Shader-db stats:
Totals:
CodeSize: 24304240 -> 24298928 (-0.02%)
Static cycle count: 274812403 -> 274809320 (-0.00%)
Totals from 39 (0.57% of 6891) affected shaders:
CodeSize: 266672 -> 261360 (-1.99%)
Static cycle count: 138321 -> 135238 (-2.23%)
PERCENTAGE DELTAS          Shaders  CodeSize  Static cycle count
google-meet-clvk/BgBlur         49    -0.49%              -0.44%
google-meet-clvk/Relight        81    -0.55%              -0.18%
q2rtx/q2rtx-rt-pipeline         42    -0.31%              -0.10%
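
For reference, the percentages in the totals can be reproduced from the raw before/after counts. The snippet below is only a small sketch of that arithmetic; the function names are made up for this example and are not part of shader-db.

```rust
/// Relative change from `before` to `after`, in percent.
pub fn percent_delta(before: u64, after: u64) -> f64 {
    (after as f64 - before as f64) / before as f64 * 100.0
}

/// Formats a delta with two decimals, matching the style of the stats.
pub fn format_delta(before: u64, after: u64) -> String {
    format!("{:.2}%", percent_delta(before, after))
}
```

For example, `format_delta(266672, 261360)` gives the -1.99% CodeSize change reported for the affected shaders.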
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39572>
Previously, we allowed up to 1024 entries to be accumulated and pushed.
The Nouveau kernel side currently always reports 510 entries, but we are
going to increase this at some point.
We now dynamically allocate nvkmd_nouveau_exec_ctx::req_push so that its
size follows the limit reported by the kernel.
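
The idea can be sketched as follows. This is a simplified Rust analogue, not the C code from the driver: `PushEntry`, `ExecCtx`, and the `submits` counter are hypothetical stand-ins, and `flush` only counts submissions instead of talking to the kernel.

```rust
/// One push-buffer entry; a stand-in for the real kernel structure.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PushEntry {
    pub addr: u64,
    pub len: u32,
}

/// Hypothetical exec context whose push array is sized at runtime.
pub struct ExecCtx {
    max_push: usize,
    req_push: Vec<PushEntry>,
    pub submits: usize,
}

impl ExecCtx {
    /// `max_push` is whatever limit the kernel reports (510 today).
    pub fn new(max_push: usize) -> Self {
        assert!(max_push > 0);
        ExecCtx {
            max_push,
            // Allocate once, based on the reported limit rather than
            // a compile-time constant.
            req_push: Vec::with_capacity(max_push),
            submits: 0,
        }
    }

    /// Submit the accumulated entries (here: count and clear).
    pub fn flush(&mut self) {
        if !self.req_push.is_empty() {
            self.submits += 1;
            self.req_push.clear();
        }
    }

    /// Queue an entry, flushing first if the array is already full.
    pub fn push(&mut self, entry: PushEntry) {
        if self.req_push.len() == self.max_push {
            self.flush();
        }
        self.req_push.push(entry);
    }

    /// Number of entries queued but not yet submitted.
    pub fn pending(&self) -> usize {
        self.req_push.len()
    }
}
```

Sizing the array from the reported limit means a later kernel-side increase needs no change to the accumulation logic.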
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39239>
The offset for dynamic buffers needs to be computed from the currently
bound pipeline layout. This change fixes selecting the wrong offset for a
dynamic buffer when a descriptor with a lower index than the one
currently being bound contains a dynamic buffer but has not been bound
yet. It also prevents the binding from overriding the dynamic buffers, in
order to preserve the already-bound dynamic descriptors.
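
A minimal sketch of the intended behaviour, assuming the layout records how many dynamic buffers each descriptor set declares. `PipelineLayout`, `dynamic_buffer_start`, and `bind_dynamic_offsets` are hypothetical names for this example and do not correspond to the driver's C code.

```rust
/// Hypothetical pipeline layout: number of dynamic buffers per set.
pub struct PipelineLayout {
    pub dynamic_buffers_per_set: Vec<u32>,
}

impl PipelineLayout {
    /// Start index of `set`'s dynamic buffers: the sum over all
    /// lower-indexed sets in the layout, whether or not they are bound.
    pub fn dynamic_buffer_start(&self, set: usize) -> u32 {
        self.dynamic_buffers_per_set[..set].iter().sum()
    }
}

/// Write the dynamic offsets for `set` into the shared table without
/// touching the slots that belong to other sets.
pub fn bind_dynamic_offsets(
    layout: &PipelineLayout,
    table: &mut [u32],
    set: usize,
    offsets: &[u32],
) {
    let start = layout.dynamic_buffer_start(set) as usize;
    let count = layout.dynamic_buffers_per_set[set] as usize;
    assert_eq!(offsets.len(), count);
    table[start..start + count].copy_from_slice(offsets);
}
```

Because the start index depends only on the layout, binding set 2 before set 0 still leaves room for set 0's dynamic buffers, and a later bind of set 0 does not clobber set 2's entries.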
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39203>
Unifies NIR per-instruction float control.
In the future this can be split into contract/reassoc/transform
like SPIR-V.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (except SPIR-V)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>
This is the one that has Vulkan semantics. We could probably make
nir_lower_io_lower_64bit_to_32_new work but it assumes the weird GL
semantics which don't map to what Vulkan does. Using it with a Vulkan
driver would require remapping all the attribute indices.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39135>
They're implemented as RG32 so it's fine to claim storage texel buffer
support.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39135>
This is a no-op right now because NIL never claims buffer support on
anything that can't support texturing. That will change in the next
commit.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39135>
We don't need one bit per bitsize per instruction if only one actually
matters in the end.
First step towards moving NIR in the direction of full float_controls2
only.
Also rename this from fp_fast_math, because that name implied that 0 is
the no fast math mode, while the opposite was the case.
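
To illustrate why 0 is not the "no fast math" value, here is a sketch that assumes the field holds "preserve" bits, where each set bit forbids an optimization. The type, the constants, and the query are hypothetical and only loosely modelled on float-controls-style flags; they are not NIR's actual definitions.

```rust
/// Hypothetical per-instruction float controls stored as "preserve"
/// bits. A set bit forbids an optimization, so 0 is the most
/// permissive value, not the strictest one.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FpMathCtrl(pub u8);

impl FpMathCtrl {
    pub const SIGNED_ZERO_PRESERVE: u8 = 1 << 0;
    pub const INF_PRESERVE: u8 = 1 << 1;
    pub const NAN_PRESERVE: u8 = 1 << 2;

    /// Nothing preserved: every fast-math transform is allowed.
    pub const FAST: FpMathCtrl = FpMathCtrl(0);
    /// Everything preserved: no fast-math transform is allowed.
    pub const EXACT: FpMathCtrl = FpMathCtrl(
        Self::SIGNED_ZERO_PRESERVE | Self::INF_PRESERVE | Self::NAN_PRESERVE,
    );

    pub fn preserves(self, bit: u8) -> bool {
        self.0 & bit != 0
    }

    /// Example query: may `x + 0.0` be folded to `x`? Only if signed
    /// zeros need not be preserved, since -0.0 + 0.0 is +0.0.
    pub fn can_fold_add_zero(self) -> bool {
        !self.preserves(Self::SIGNED_ZERO_PRESERVE)
    }
}
```

Under this assumption a single set of bits per instruction is enough, and a name containing "fast math" would read backwards because the all-zero value is the fastest mode.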
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39026>
Currently the sm120 instruction latency code expects registers to be out
of SSA. The prepass scheduler breaks this assumption.
This commit removes the code that is specific to non-SSA registers.
Fixes: b55b8da012 ("nak: Add a prepass instruction scheduler")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Signed-off-by: Lorenzo Rossi <git@rossilorenzo.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39072>
This helper was introduced in b4bac84d3b ("nak: Add a Dst::file()
helper function") which missed updating the sm120 file.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39074>