- Implement load_interpolated_input and friends.
- Optimize load_barycentric_* cases that can be simplified.
- Initial support for non-standard sample locations.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37540>
Without this, somebody trying to map a buffer for write by the CPU would
fail. This is not common to do in hardware driver environments, but it
shouldn't be disallowed, and there's no downside to allowing it.
I did skip virgl, because that's one where I don't know for sure if there
wouldn't be a downside to allowing RDWR (there are other virt exports
where RDWR is gated on a mappable flag).
This is a follow-up to !37088 to keep copy and paste from introducing the
same bug anywhere else.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37339>
This is required for OpenCL but not Vulkan. This fixes a bunch of
OpenCL CTS fails using the SPIR-V back-end in LLVM as opposed to
SPIRV-LLVM-Translator.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37555>
on sm70+ where we don't have a native bfe instruction. This
implementation is fewer instructions than the nir lowering.
This also implements the bfe semantics that vkd3d-proton wants. d3d12
wants specific behavior for out-of-bounds ibfe like
bitfieldExtract(-1, 15, 20) while spirv considers this undefined
behavior. Tested with VKD3D_TEST_FILTER=test_shader_instructions
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13795
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37592>
In case of a unbound subchannel usage, the hardware will route it to the
GPFIFO class.
NVIDIA blobs use this and this was causing an assertion to be fired up
in nv_push_dump.
This adds support for that and also add mapping for cls_gpfifo in
nv_push_dump.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
From nvidia-open-kernel-modules as they missing on open-gpu-doc at the
moment.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
If a given generation hasn't removed any methods, then the default case
can defer to the previous generation without changing the output at all.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
Note that this changes the output so it lists the method as being in the
class where it most recently changed rather than always in the class of
the running gpu. Personally, I find this more useful than the old
behavior.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
A good chunk of it excluding AMPERE B (header have a typo on some struct
making it clash with AMPERE A def)
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
We now handle from BLACKWELL_B down to FERMI_A (FERMI_B is excluded as we are missing the tex header)
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
Instead of typing this in nv_push.c, we now generate it based on the
class headers that are generated.
This makes it that we never have human errors in any of the checks and
allow to just support parsing everything we can.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37475>
We currently only support fast clearing the first layer of an image.
Attachments use VkImageView which can specify a base-layer of the view
for an image attachment.
Fixes: 44351d67f8 ("anv: Change params of anv_can_fast_clear_color_view")
Ref: https://projects.blender.org/blender/blender/issues/141181
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37562>
src[1]/src0 is signed and Xe2+ SHR don't support operations over signed
data types so lets switch this over ASR that supports signed data
types.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37557>
Customers suggested that Xe KMD should change all possible interfaces
visible to users to canonical address, with that we need some changes
to keep the decode of devcoredump working.
A old version of the tool will not be able to decode secondary batch
buffers when parsing a new version of the file but the new version of
this tool will be able to parse both versions of devcoredump file.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37570>
Labels set via VK_EXT_debug_utils are in a separate track due to the
following part of the spec:
"An application may open a debug label region in one command buffer and
close it in another, or otherwise split debug label regions across
multiple command buffers or multiple queue submissions."
This means labels can start in one renderpass and end in another command
buffer, which breaks our assumption that stages can be modeled as a stack.
While applications aren't expected to use labels in such extreme ways,
even simpler cases can break our assumptions.
Having annotations in a separate track prevents the main track(s) from
entering an invalid state.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37028>
Using NaN doesn't usually allow any extra optimizations, and 0 is an inline
constant on AMD hw where this opcode is used with undef for fragment shader
exports.
Foz-DB GFX1201:
Totals from 889 (1.11% of 80287) affected shaders:
Instrs: 1676365 -> 1676348 (-0.00%)
CodeSize: 8827040 -> 8821760 (-0.06%)
Latency: 13346728 -> 13346699 (-0.00%)
InvThroughput: 1799283 -> 1799262 (-0.00%)
Copies: 108125 -> 108102 (-0.02%)
VALU: 974875 -> 974852 (-0.00%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37552>