Commit graph

198723 commits

Author SHA1 Message Date
Marek Olšák
a618a2aa8b nir/linking_helpers: don't promote interpolated varyings to flat
Even the most flexible interpolation that we have in NIR options
(nir_io_has_flexible_input_interpolation_except_flat) doesn't allow
mixing flat and non-flat in the same vec4. This (legacy) optimization
can't promote interpolated inputs to flat if it doesn't consider
the interpolation mode of the whole vec4 slot.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>
2024-12-04 13:40:41 +00:00
Marek Olšák
16f7d22394 util/bitset: add BITSET_GET_RANGE_INSIDE_WORD
to be used later

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>
2024-12-04 13:40:41 +00:00
Marek Olšák
da3f9e3626 util/bitset_test: test the return value of BITSET_TEST_RANGE_INSIDE_WORD better
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32424>
2024-12-04 13:40:41 +00:00
Konstantin Seurer
16f4b93cac lavapipe: Implement VK_KHR_compute_shader_derivatives
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31056>
2024-12-04 12:53:57 +00:00
Konstantin Seurer
eac613bc70 gallivm: Use an accurate log2 implementation for lodq
The fast implementation can be off by a lot for small values.

Fixes:

dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.compute.lod_op.query.linear.16_1_1.mip_0
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.compute.lod_op.query.linear.16_1_1.mip_1
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.compute.lod_op.query.linear.4_4_1.mip_0
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.compute.lod_op.query.linear.4_4_1.mip_1
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.mesh.lod_op.query.linear.16_1_1.mip_0
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.mesh.lod_op.query.linear.16_1_1.mip_1
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.mesh.lod_op.query.linear.4_4_1.mip_0
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.mesh.lod_op.query.linear.4_4_1.mip_1
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.task.lod_op.query.linear.16_1_1.mip_0
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.task.lod_op.query.linear.16_1_1.mip_1
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.task.lod_op.query.linear.4_4_1.mip_0
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.task.lod_op.query.linear.4_4_1.mip_1

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31056>
2024-12-04 12:53:57 +00:00
Timothy Arceri
fcebbfc399 glsl: drop unused array refcount code and tests
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32450>
2024-12-04 11:50:57 +00:00
Sagar Ghuge
2af9853432 intel: Use the common RT BVH framework
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Michael Cheng
ed620bcd41 anv : Add tracepoint for as_build
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Kevin Chuang
5098c0c5df anv: Add INTEL_DEBUG for bvh dump and visualization tools
This commit allows you to dump different regions of memory related to
bvh building. An additional script to decode the memory dump is also
added, and you're able to view the built bvh in 3D view in html. See the
included README.md for usage.

Rework:
- you can now view the actual child_coord in internalNode in html
- change exponent to be int8_t in the interpreter
- fix the actual coordinates using an updated formula
- now you can have 3D view of the bvh
- blockIncr could be 2 and vk_aabb should be first
- Now, if any bvh dump is enabled, we will zero out tlas, to prevent gpu
  hang caused by incorrect tlas traversal
- rootNodeOffset is back to the beginning
- Add INTEL_DEBUG=bvh_no_build.
- Fix type of dump_size
- add assertion for a 4B alignment
- when clearing out bvh, only clear out everything after
  (header+bvh_offset)
- TODO: instead of dumping on destory, track in the command buffer

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
5561db68c3 anv: Add helper to copy data from src to dest anv_address
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
41baeb3810 anv: Implement acceleration structure API
Rework: (Kevin)
- Properly setup bvh_layout
   Our bvh resides in contiguous memory and can be divided into two sections:
      1. anv_accel_struct_header, tightly followed by
      2. actual bvh, which starts with root node, followed by interleaving
         leaves or internal nodes.
- Update comments for some fields for BVH and nodes.
- Properly populate the UUIDs in serialization header
- separate header func into completely two paths based on compaction bit
- Encode rt_uuid at second VK_UUID_SIZE.
- Write query result at correct slot
- add assertion for a 4B alignment
- move bvh_layout to anv_bvh
- Use meson option to decide which files to compile
- The alignment of serialization size is not needed
- Change static_assert to STATIC_ASSERT and move them inside functions

Rework (Sagar)
- Use anv_cmd_buffer_update_buffer instead of MI to copy data

Rework (Lionel)
- Remove flush after builds, and add flush in copy before dispatch
- Handle the flushes in CmdWriteAccelerationStructuresPropertiesKHR properly

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
9002e52037 anv: Implement cmd_dispatch_unaligned callback
Rework: (Kevin)
- Calculate correct number of threads in GPGPU thread group based on
  SIMD size.
- Instead of round up, just use the simple division and let the
  remainder part handle groupCount < local_size_x.
- Drop indirect_unroll_off and fix the bug that we're not using is_unaligned_size_x

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
0cab02ca9b anv: Implement flush_buffer_write_cp callbck
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
b2cffdb1ed anv: Implement write_buffer_cp callback
Rework: (Kevin)
 - Fix pointer arithmatic calculation.
 - Add assertion for a 4B alignment

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
8817ff26fc anv: Move update buffer code in helper
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
0edf208ab9 anv: Implement cmd_fill_buffer_addr callback
Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Kevin Chuang
2fe57947e3 anv: Implement encode shader to fit in ANV BVH
This shader gets called and will construct ANV BVH from IR BVH. More
specifically, each invocation will take care of one internal node. The
internal nodes get processed starting from root node all the way to the
bottom leaves.

During processing, we keep track of the destination of
where the internal node should be encoded (tracked in
vk_ir_box.bvh_offset), and where its leaves should be encoded (tracked
in vk_ir_header.dst_node_offset).

The processed bvh is in contiguous memory, which starts with header,
followed by interleaving internal nodes and leaves. The nodes
information are also populated.

Rework: (Sagar)
- Return out of bounds threads early
- Mimic GRL internal node encoding
- Handle node mask
- Fix block_incr_and_start_prim
- Fix shader_index_and_geom_mask for instance node
- Fix instance flag
- Fix block_incr and instance_contribution_and_geom_flags initialized to be zero
- Fix lower_x and upper_x to be properly flipped for invalid child
- For invalid node, clear blockIncr and set startPrim to INVALID
- Calculated things upfront and assign, cutting down more than ~200
  instructions

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
692b5fa9f2 anv: Add shader to copy acceleration structures
Rework (Kevin)
- encode the address of anv_instance_leaf after header in order to
  handle serialization and deserialization part.
- draw serialized data layout and explanation

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
a6b1a1fce1 anv: Add shader to build BVH header
Rework: (Kevin)
- Calculate the compacted_size properly
- Update instance count and self pointer
- The alignment of serialization size is not needed

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
ef94b7097b anv: Add header to track BVH data structures
This commit adds build interface and helper header for ANV BVH.

Rework: (Kevin)
- Use block_size macro to represent bvh node/leaf size
- Rename BVH-related node/leaf size macros for clarity
- Updated comments for some fields for bvh and nodes.
- move bvh_layout to anv_bvh.h
- Draw anv_bvh layout
- rename child_offset to child_block_offset

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Sagar Ghuge
617b7602ea anv: Split GRL code path in separate file
Rework (Kevin)
- Remove genX_acceleration_structure.c from meson option to avoid
  linking error

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:44 +00:00
Sagar Ghuge
b002b2589c anv: Update include dir for anv_tests
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:44 +00:00
Samuel Pitoiset
5d072e0e73 radv: fix stencil only copies of depth/stencil images with SDMA
This was broken for two reasons:
- the number of bytes per element should be 1 (8-bit for stencil)
- the base offset should be adjusted for the stencil

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32459>
2024-12-04 09:30:36 +00:00
Lionel Landwerlin
69edf4144a brw: use transpose unspill messages when possible
This simplifies the unspill messages quite a bit.

A/B testing on DG2 :

BlackOps3 : +0.96%
TotalWarPharaoh: +0.31%

DG2 shader changes :

  Assassin's Creed Valhalla:
  Totals from 19 (0.89% of 2131) affected shaders:
  Instrs: 70542 -> 64369 (-8.75%)
  Cycle count: 18810945 -> 18560169 (-1.33%); split: -1.40%, +0.06%

  Black Ops 3:
  Totals from 55 (3.41% of 1612) affected shaders:
  Instrs: 389549 -> 350646 (-9.99%)
  Cycle count: 344168275 -> 340652311 (-1.02%); split: -1.17%, +0.15%

  Control:
  Totals from 1 (0.11% of 878) affected shaders:
  Instrs: 3409 -> 3212 (-5.78%)
  Cycle count: 255991 -> 250411 (-2.18%)

  Cyberpunk 2077:
  Totals from 1 (0.08% of 1264) affected shaders:
  Instrs: 2363 -> 2337 (-1.10%)
  Cycle count: 69283 -> 69186 (-0.14%)

  Fallout 4:
  Totals from 1 (0.06% of 1601) affected shaders:
  Instrs: 27946 -> 20056 (-28.23%)
  Cycle count: 2391398 -> 2153658 (-9.94%)

  Fortnite:
  Totals from 273 (3.65% of 7470) affected shaders:
  Instrs: 634377 -> 601519 (-5.18%)
  Cycle count: 31870433 -> 31624089 (-0.77%); split: -0.78%, +0.01%

  Hogwarts Legacy:
  Totals from 50 (3.02% of 1656) affected shaders:
  Instrs: 110455 -> 103339 (-6.44%)
  Cycle count: 6613728 -> 6530832 (-1.25%); split: -1.28%, +0.03%

  Metro Exodus:
  Totals from 70 (0.16% of 43076) affected shaders:
  Instrs: 253847 -> 245321 (-3.36%)
  Cycle count: 13269473 -> 13209131 (-0.45%)
  Spill count: 1111 -> 1108 (-0.27%)
  Fill count: 2868 -> 2865 (-0.10%)

  Red Dead Redemption 2:
  Totals from 139 (2.38% of 5847) affected shaders:
  Instrs: 496551 -> 450180 (-9.34%)
  Cycle count: 43233944 -> 40947386 (-5.29%); split: -5.33%, +0.04%
  Spill count: 6322 -> 6326 (+0.06%)
  Fill count: 15558 -> 15568 (+0.06%)

  Rise Of The Tomb Raider:
  Totals from 1 (0.56% of 178) affected shaders:
  Instrs: 1682 -> 1437 (-14.57%)
  Cycle count: 603670 -> 586766 (-2.80%)

  Spiderman Remastered:
  Totals from 820 (11.77% of 6965) affected shaders:
  Instrs: 4622877 -> 3984893 (-13.80%)
  Cycle count: 235094963186 -> 234483925430 (-0.26%); split: -0.42%, +0.16%
  Spill count: 73414 -> 73581 (+0.23%); split: -0.02%, +0.25%
  Fill count: 215090 -> 215627 (+0.25%); split: -0.02%, +0.27%
  Scratch Memory Size: 3520512 -> 3528704 (+0.23%); split: -0.12%, +0.35%

Some of stats show spilling changes which is telling of how our spill
code is not adequate. Some of the spilled values are probably being
respilled which shouldn't be the case.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32110>
2024-12-04 08:59:07 +00:00
Pavel Ondračka
dcfa8851bd ci: bring back some i915g testing
Only single g33 as part of r300 ci-tron-based farm.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32376>
2024-12-04 08:18:43 +00:00
Kenneth Graunke
2ade3ec2a9 brw: Allow SIMD32 math instructions on Xe2
There's no restriction here AFAICT - only when HF types are involved.

fossil-db results on Lunar Lake:

   Totals:
   Instrs: 143665291 -> 142654109 (-0.70%)
   Cycle count: 22516049016 -> 22514172014 (-0.01%); split: -0.02%, +0.01%
   Max live registers: 49038116 -> 49017687 (-0.04%); split: -0.04%, +0.00%

   Totals from 117623 (21.07% of 558370) affected shaders:
   Instrs: 25098642 -> 24087460 (-4.03%)
   Cycle count: 1038884570 -> 1037007568 (-0.18%); split: -0.48%, +0.29%
   Max live registers: 12423219 -> 12402790 (-0.16%); split: -0.16%, +0.00%

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32471>
2024-12-04 02:42:34 +00:00
Kenneth Graunke
815236b417 brw: Fix register unit calculation in SIMD32 LOAD_PAYLOAD lowering
We were wanting to check if the destination region spanned multiple
registers.  But we were checking against REG_SIZE, when the register
size is actually REG_SIZE * reg_unit(devinfo) now.

This meant that SIMD32 LOAD_PAYLOAD was always getting SIMD-split
on Xe2 platforms, generating a lot of unnecessary mess for compute
shaders.

fossil-db results on Lunar Lake:

   Totals:
   Instrs: 146178614 -> 143291988 (-1.97%); split: -1.98%, +0.00%
   Subgroup size: 11089632 -> 11089376 (-0.00%); split: +0.00%, -0.00%
   Cycle count: 22528892444 -> 22507551650 (-0.09%); split: -0.12%, +0.03%
   Max live registers: 48834202 -> 48886685 (+0.11%); split: -0.09%, +0.20%

   Totals from 134306 (24.10% of 557327) affected shaders:
   Instrs: 28806335 -> 25919709 (-10.02%); split: -10.02%, +0.00%
   Subgroup size: 4297680 -> 4297424 (-0.01%); split: +0.00%, -0.01%
   Cycle count: 956867650 -> 935526856 (-2.23%); split: -2.84%, +0.61%
   Max live registers: 13085711 -> 13138194 (+0.40%); split: -0.33%, +0.73%

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32471>
2024-12-04 02:42:34 +00:00
Caio Oliveira
dfa4c55a4f intel/brw: Add is_control_source for the new subgroup ops
Fixes: 019770f026 ("intel/brw: Add SHADER_OPCODE_VOTE_*")
Fixes: 9537b62759 ("intel/brw: Add SHADER_OPCODE_REDUCE")
Fixes: 0ba1159b0a ("intel/brw: Add SHADER_OPCODE_*_SCAN")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32411>
2024-12-04 01:19:37 +00:00
Nanley Chery
428a970511 anv: Only consider R32 image formats as supporting atomics
Only consider R32 image formats as supporting atomics because we only
expose VK_FORMAT_FEATURE_2_STORAGE_IMAGE_ATOMIC_BIT for those formats.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32192>
2024-12-03 22:54:35 +00:00
Nanley Chery
122c01a496 anv: Enable more storage compression on gfx12+
On gfx12.0, allow storage compression unless the image may be used with
atomics.

On gfx20, use the CCS_E aux-usage for storage compression. This causes
ISL to create surface states with more appropriate render compression
formats.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5657
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32192>
2024-12-03 22:54:35 +00:00
Nanley Chery
01c4ea771c anv: Enable storage accesses with modifiers on gfx12+
I tested this patch with an ACM card. It enables "Halo: The Master Chief
Collection" to use the clear color modifier instead falling back to the
uncompressed Tile4 modifier.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32192>
2024-12-03 22:54:35 +00:00
Nanley Chery
2dedd8dbb2 intel/isl: Fix DecompressInL3 assignment on gfx12.5
* In the ACM PRMs, the programming notes under
  RENDER_SURFACE_STATE::MemoryCompressionEnable state that the
  DecompressInL3 bit must be set for media compression.

* Unlike TGL, ACM seems to handle format reinterpretation just fine
  without using the bit.

Update the assignment accordingly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32192>
2024-12-03 22:54:34 +00:00
Autumn Ashton
7e9ea5c1b5 radv/video: Fix bitstreamStartOffset including dstBufferOffset
The bitstreamStartOffset from the VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR query in RADV is currently relative to the dstBuffer, and not dstBuffer + dstBufferOffset like the spec states.

To fix this, let's append the offset to the VA directly and not tell the encoder about the offset relative to the VA at all.

The Vulkan spec states:
"VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR specifies that queries managed by the pool will capture the byte offset of the bitstream data written by the video encode operation to the bitstream buffer specified in VkVideoEncodeInfoKHR::dstBuffer relative to the offset specified in VkVideoEncodeInfoKHR::dstBufferOffset."

The relevant part being that is is relative to dstBufferOffset and not the start of the VkBuffer.

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32452>
2024-12-03 22:19:43 +00:00
Georg Lehmann
1a7ebfd2a8 radv: rework vk_property initialization
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32457>
2024-12-03 21:43:34 +00:00
Georg Lehmann
b961537a17 radv: fix reporting mesh/task/rt as supported dgc indirect stages
Fixes: 8300378bf3 ("radv: advertise VK_EXT_device_generated_commands on GFX8+")

Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32457>
2024-12-03 21:43:34 +00:00
Gurchetan Singh
03b527ea92 gfxstream: fix issues with VK1.4 build
Fixes build after VK1.4 update.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32446>
2024-12-03 20:35:44 +00:00
Gurchetan Singh
ade6a19f14 gfxstream: remove abort()
I have no idea why it just started complaining now about
this.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32446>
2024-12-03 20:35:44 +00:00
Erik Faye-Lund
7b8f5b0881 panvk: report minmax-support for sampled formats
We also need to report minmax as part of the format-features.

This fixes the following CTS tests for me:
- dEQP-VK.api.info.format_properties.r8_unorm
- dEQP-VK.api.info.format_properties.r8_snorm
- dEQP-VK.api.info.format_properties.r16_sfloat
- dEQP-VK.api.info.format_properties.r32_sfloat
- dEQP-VK.api.info.format_properties.d16_unorm

Fixes: 1fc454673a ("panvk: Implement VK_EXT_sampler_filter_minmax for v10")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32445>
2024-12-03 20:16:58 +00:00
Aleksi Sapon
0812a8bccc draw: front-face injection must check geometry shader primitive type
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32306>
2024-12-03 19:42:22 +00:00
Mary Guillemard
cdf822632a panvk: Add a nightly job for Mali-G52
We have quite a big fraction currently and it has been proven that we
are missing new failures right now.

This adds a new nightly job that run a full CTS on a single VIM3.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32421>
2024-12-03 19:11:30 +00:00
Mary Guillemard
913a7b26e1 panvk: Update Mali-G52 CI baseline
We seems to have new regressions that were introduced but never seen
because of the massive fraction used.

This adds the failures seen with a full run while trying to document
some.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32421>
2024-12-03 19:11:30 +00:00
Pavel Ondračka
61d890b6db r300/ci: update RV410 CI expectations
The test was almost passing before but we can't really always get the
required five decimal point tolerance with the R300/R400 hw.
nir_opt_algebraic improvements in 92797c6878
shuffled the ALUs a bit and we now do a bit better.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32458>
2024-12-03 18:48:32 +00:00
Konstantin Seurer
4ed867825a lavapipe: Implement VK_KHR_shader_float_controls2
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31273>
2024-12-03 16:15:25 +00:00
Konstantin Seurer
540e84bedb gallivm: Preserve -0 and nan
Some operations need additional or different code to preserve the sign
of 0 or nan.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31273>
2024-12-03 16:15:25 +00:00
Konstantin Seurer
f5db70cb24 gallivm: Add float operation behavior flags to lp_type
Used to emit additional code if -0 or nan needs to be preserved.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31273>
2024-12-03 16:15:25 +00:00
Samuel Pitoiset
9df3c9e4a1 ac/parse_ib: print VA for the SDMA CONSTANT_FILL/WRITE packets
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32456>
2024-12-03 15:29:40 +00:00
Samuel Pitoiset
31524d42a2 ac/parse_ib: fix parsing SDMA CONSTANT_FILL packet
This packet only has 5 DWORDS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32456>
2024-12-03 15:29:39 +00:00
Georg Lehmann
34a47e4b14 nir/opt_algebraic: mark a - ffract(a) as nan incorrect.
Inf + fract(Inf) -> Inf + NaN -> NaN
floor(Inf) -> Inf

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>
2024-12-03 14:42:18 +00:00
Georg Lehmann
2ee96cf514 nir/opt_algebraic: optimize d3d9 ceil
No Foz-DB changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>
2024-12-03 14:42:18 +00:00
Georg Lehmann
34caed8adb nir/opt_algebraic: optimize d3d9 ftrunc
Foz-DB Navi21:
Totals from 85 (0.11% of 79395) affected shaders:
MaxWaves: 1972 -> 1968 (-0.20%)
Instrs: 48682 -> 47067 (-3.32%)
CodeSize: 255664 -> 247172 (-3.32%)
VGPRs: 3752 -> 3768 (+0.43%)
Latency: 154414 -> 150360 (-2.63%)
InvThroughput: 37186 -> 35081 (-5.66%)
VClause: 847 -> 865 (+2.13%); split: -0.24%, +2.36%
SClause: 768 -> 796 (+3.65%)
Copies: 2763 -> 2869 (+3.84%); split: -0.14%, +3.98%
VALU: 28133 -> 26781 (-4.81%)
SALU: 7182 -> 6939 (-3.38%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32393>
2024-12-03 14:42:18 +00:00