Mesa 26.1.0 Release Notes / 2026-05-06
======================================

Mesa 26.1.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 26.1.1.

Mesa 26.1.0 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.

Mesa 26.1.0 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
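
The apiVersion value mentioned above is a packed 32-bit integer. The
following is a minimal sketch of how it can be decoded, assuming the bit
layout used by Vulkan's VK_API_VERSION_* macros (3-bit variant, 7-bit major,
10-bit minor, 12-bit patch). The helper names ``make_api_version`` and
``decode_api_version`` are hypothetical and are not part of Mesa or of any
Vulkan binding.

```python
from typing import NamedTuple


class ApiVersion(NamedTuple):
    variant: int
    major: int
    minor: int
    patch: int


def make_api_version(variant: int, major: int, minor: int, patch: int) -> int:
    """Pack a version using the same bit layout as VK_MAKE_API_VERSION."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def decode_api_version(value: int) -> ApiVersion:
    """Unpack a 32-bit apiVersion into its four fields."""
    return ApiVersion(
        variant=(value >> 29) & 0x7,   # top 3 bits
        major=(value >> 22) & 0x7F,    # next 7 bits
        minor=(value >> 12) & 0x3FF,   # next 10 bits
        patch=value & 0xFFF,           # low 12 bits
    )


if __name__ == "__main__":
    # Illustrative value only; a real one comes from the driver via
    # vkGetPhysicalDeviceProperties and may differ per driver.
    packed = make_api_version(0, 1, 4, 0)
    print(decode_api_version(packed))
```

An application would typically compare the decoded major and minor fields
against the version it requires before relying on Vulkan 1.4 functionality.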

SHA checksums
-------------

::

    SHA256: a5095e6dc2986c78f0cef4c5555dc803e93b6bfe5670e991f9e8bd49395bae19 mesa-26.1.0.tar.xz
    SHA512: 02972b1a2e6a2d10fa9c970ab579ddd6f0d3acfb782d82616b91cca7a28dd494b15fb14748fff333b1078898cf990a25666e9122d7fd41cb212a151caf933786 mesa-26.1.0.tar.xz
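
A downloaded tarball can be checked against the digests above. The following
is a minimal sketch using only Python's standard ``hashlib`` and ``hmac``
modules; the function names ``file_digest_hex`` and ``matches_checksum`` are
hypothetical helpers, and the local file path in the usage comment is an
assumption.

```python
import hashlib
import hmac


def file_digest_hex(path: str, algorithm: str = "sha256",
                    chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected_hex: str,
                     algorithm: str = "sha256") -> bool:
    """Return True if the file's digest equals the published hex digest."""
    actual = file_digest_hex(path, algorithm)
    return hmac.compare_digest(actual, expected_hex.strip().lower())


# Usage (assumes the tarball sits in the current directory):
#   matches_checksum(
#       "mesa-26.1.0.tar.xz",
#       "a5095e6dc2986c78f0cef4c5555dc803e93b6bfe5670e991f9e8bd49395bae19",
#   )
```

Passing ``algorithm="sha512"`` checks the second digest in the same way.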


New features
------------

- GL_NV_timeline_semaphore on radeonsi
- VK_QCOM_image_processing on Turnip
- VK_EXT_present_timing on RADV, NVK, Turnip, ANV, Honeykrisp, panvk
- VK_KHR_sampler_ycbcr_conversion on pvr
- VK_EXT_image_drm_format_modifier on pvr
- VK_KHR_internally_synchronized_queues on RADV
- VK_EXT_blend_operation_advanced on lavapipe
- VK_KHR_get_surface_capabilities2 on panvk
- VK_KHR_get_display_properties2 on panvk
- VK_EXT_acquire_drm_display on panvk
- VK_KHR_present_id on panvk, v3dv
- VK_KHR_present_wait on panvk, v3dv
- VK_KHR_pipeline_executable_properties on pvr
- VK_EXT_zero_initialize_device_memory on panvk
- GL_EXT_shader_image_load_store on panfrost
- VK_KHR_swapchain_mutable_format on panvk
- VK_EXT_astc_decode_mode on panvk
- VK_KHR_copy_memory_indirect on nvk, RADV/GFX8+
- VK_EXT_color_write_enable on panvk
- VK_EXT_hdr_metadata on v3dv
- VK_EXT_image_view_min_lod on panvk
- VK_EXT_depth_clamp_control on panvk
- VK_VALVE_shader_mixed_float_dot_product on RADV (Vega20, Navi14, RDNA2+)
- VK_EXT_legacy_dithering on panvk
- GL_ARB_sample_shading on v3d
- VK_KHR_maintenance4 on pvr
- VK_ARM_scheduling_controls on panvk
- cl_khr_subgroup_ballot on asahi, iris, llvmpipe, radeonsi and zink
- cl_khr_subgroup_clustered_reduce on asahi, llvmpipe, radeonsi and zink
- cl_khr_subgroup_extended_types on asahi, iris, llvmpipe, radeonsi and zink
- cl_khr_subgroup_non_uniform_arithmetic on asahi, llvmpipe, radeonsi and zink
- cl_khr_subgroup_non_uniform_vote on asahi, iris, llvmpipe, radeonsi and zink
- cl_khr_subgroup_rotate on asahi, llvmpipe and zink
- VK_EXT_nested_command_buffer on panvk
- VK_VALVE_mutable_descriptor_type on panvk
- VK_EXT_shader_stencil_export on panvk
- VK_EXT_map_memory_placed on panvk
- VK_EXT_conditional_rendering on panvk
- VK_{KHR,EXT}_{surface,swapchain}_maintenance1 on panvk
- VK_EXT_shader_atomic_float on panvk
- VK_KHR_device_address_commands on RADV
- VK_EXT_non_seamless_cube_map on pvr
- fragmentStoresAndAtomics on panvk/v6-7
- VK_KHR_shader_untyped_pointers on panvk
- VK_EXT_primitive_restart_index on RADV
- VK_EXT_attachment_feedback_loop_layout on panvk
- VK_EXT_attachment_feedback_loop_dynamic_state on panvk
- VK_KHR_shader_integer_dot_product on pvr
- VK_EXT_descriptor_heap on RADV (with 'export RADV_EXPERIMENTAL=heap')
- fullDrawIndexUint32 on pvr
- multiDrawIndirect on pvr
- depthBiasClamp on pvr
- wideLines on pvr
- VK_EXT_rgba10x6_formats on panvk
- static C++ stdlib required on rusticl to workaround applications using their own C++ stdlib


Bug fixes
---------

- 26.0.1 fails to build: \`create_context.c: error: 'struct glx_screen' has no member named 'frontend_screen'`
- A770: Counter-Strike 2 visual glitches (regression)
- ACO: assertion in insert_exec_mask()
- ANV: SkQP regression in Android CTS
- Accumulation of black squares with OpenGL applications at high resolutions (hiz-related)
- After updating GStreamer, all videos in Showtime are green/purple
- Anv/Blorp: implement RT clear operations without changing the binding table
- Anv: implement transfer operations on buffers without render targets
- Bisected regression: Assertion texObj->pt == view->texture failed.
- Compile failure on gentoo since !40305
- Intel BDW regression due to load_push_data_intel intrinsic
- Is maxFragmentCombinedOutputResources=16 in Honeykrisp reflects an actual HW limit?
- KHR-GL46.geometry_shader.limits.max_output_components
- KHR_display: plane size limits
- Kodi regression with panthor >= 1.7 after updating to Linux 7.0-rc1
- MDK2 HD (opengl) has most elements rendered as black
- Mesa 25.3 amdgpu memory issue
- Mesa LLVMpipe Memory Leak
- Missing Haswell case after a097a3d214eda7fb7b9ff63176754b7260e09e03 leads to bogus assert in intel_perf_mdapi.c
- OpenGL 4.1 VRAM Memory Leak with setting uniform variables
- Panfrost Bifrost compiler assertion failure: wrong vectorization in bi_alu_src_index (Mesa 26.0.0)
- Portal hard locks the machine on rv350.
- Question: Does building Lavapipe on Windows require building "microsoft-experimental" as well?
- RADV: Invalid hitAttributeEXT value when using function-call RT pipelines
- RADV: RDNA4 visual corruption in DX11 (DXVK) – Mafia III character model glitches, AMDVLK renders correctly (9070XT)
- Segmentation fault in gm200_validate_sample_locations with Firefox on GTX 1070 Ti (nouveau)
- Sekiro: Shadows Die Twice lighting corruption on RX 7900 XT
- Shader inputs/outputs for vertex/pixel shaders that have the integer (int) type are broken on RDNA 3 and 4 graphics cards
- Support for timeline semaphores in radeonsi
- The End is Nigh (Wine): No lighting in The Hollows
- Transcoding mpeg2video with ffmpeg h264_vulkan on Intel cause Conversion failed!
- Turnip crash with lazy depth textures: GPUMEM_BIND_RANGES failed (Not a typewriter)
- Unroll loops before lowering indirects
- VK_KHR_display fails on PowerVR on Mesa master
- Vulkan CTS regression bisected to 5d2c17a5fdce ("vtn: skip make-available/visible for shared")
- [26.0.0~rc1] d3d12_screen.cpp:1165:(.text._ZL31d3d12_interop_query_device_infoP11pipe_screenjPv+0x4b): undefined reference to \`d3d12_video_encoder_get_last_slice_completion_fence(pipe_video_codec*, void*, pipe_fence_handle**)'
- [ANV]: Regression in dxvk Greedfall
- [ANV][A770] Deep Rock Galactic assert fails
- [ANV][BMG] Building Mesa with Clang causes Missing Skin Textures in UE games - Tekken 8
- [ANV][BMG] Dying Light The Beast graphics artifacts
- [ANV][BMG] Regression - Broken lighting and flickering in Kingdom Come: Deliverance II
- [ANV][DG2][Regression]: Flickering water "boxes" in Civilization VII
- [ANV][LNL] - Elden Ring (1245620) - Vertex explosions with ray tracing enabled
- [ANV][PTL] R.E.P.O. GPU Hang
- [ANV][Xe2+] Resource Barriers Invalid Signal Stage
- [NVL-S] deqp-vk failures within 2d_array and cubemap tests
- [RADV] Killer7 has a blue tint with RDNA3/4
- [anv] Intel ARC B390 | Horizon Forbidden West | DX12 | Flashing effects
- [anv][ptl] valheim gpu hang and visual corruption
- [bisected] Xe3 regression with piglit tess/barrier-patch.shader_test after cmod prop change
- [docs] add documentation on how to install debug symbols in various distros
- [hasvk] Regressions from enabling VK_KHR_maintenance6
- [radeonsi] Missing ground texture in Lethis Path of the Progress
- [radeonsi] Regression: GL_FEEDBACK returns 0.0 for X-coordinates (Legacy GL)
- [radv] Regression causes glitches in Strange Brigade (Vulkan renderer)
- [radv][bisected][regression] GhostwireTokyo RT gpu hangs with HPLOC commit
- [regression] Left 4 Dead 2 crashing when joining or starting survival with "Official Dedicated" servers
- amdgpu reset/crash when simulating stereo camera
- amdgpu_device_initialize: amdgpu_query_info(ACCEL_WORKING) failed (-13)
- anv, bisected: Genshin Impact wrong shadows, flickering grass
- anv: Implement BTI switching for fast-clears
- anv: Implement multi-layer fast-clear
- anv: Suballocate indirect clear colors
- anv: missing implementation of vkGetPhysicalDeviceCooperativeMatrixFlexibleDimensionsPropertiesNV
- anv: optimize descriptor buffer binding
- anv: ycbcr CTS tests asserts
- building mesa_clc on ubutu-26.04 with gcc-16 fails link
- ci: Wayland issues after 6641c891fdaa15923f0b61a5fef4b9d9ed91ac0e
- freedreno/avx2 build failure
- freedreno/decode: Usage based register summaries
- glcpp: incorrect macro expansion in token pasting
- glsl: spec\@glsl-es-1.00\@linker\@glsl-mismatched-uniform-precision-unused broken
- gnome-control-center hitting assert
- intel/isl: Support CCS on Ys-tiled images
- ir3: ir3_get_predicate() vs &ctx->build
- isl: Prefer Tile64 when it saves space
- lavapipe: crash in caselist
- mesa: deleting a buffer bound only to an index also undoes the associated general target binding
- nir/rusticl: optimize 64b sys vals
- nir: possible exactness bug in reassociate
- nvk: Enable ZCULL
- nvk: shader does unnecesary move to local
- panvk: VK_KHR_present_id, VK_KHR_present_wait not implemented
- r300 , regression , bisected : Glitches with Sauerbraten
- r300: HiZ related dEQP failures
- radv, regression : Crysis 2 Remastered raytracing blocky reflections
- radv: Port si_emit_guardband to RADV
- radv: enable rdna3 bfloat16 support
- regression, bisect: commit a8272bf0f1f9229d73252b03d0fb32d563396a9c breaks KWin through NVK & Zink
- static linking regression since !37495 - spirv-tools shared library required at runtime if exists at build time
- tu,ir3: Lowering IO before unrolling loops broke forced loop unrolling heuristic, breaking at least multiview
- tu: GPU faults during LRZ clears on unallocated transient attachments in gmem mode
- turnip: VK_EXT_device_memory_report unsupported
- turnip: inconsistent host allocator use for command buffers
- turnip: llama.cpp: Running test-backend-ops results in segmentation fault
- util/hash_table: regression: use after free on 32-bit platforms
- util: Build regression with MSYS2 MinGW-W64 x64 clang 21 on 26.0.0-rc3
- venus crashes in vn_CreateDevice() with latest mesa/main [bisected]
- virgl: Trace scripts timeout with Linux 6.17
- virtio/vulkan: vkcube fails to run when calling the libvulkan_virtio dynamic library
- wsi: \`assert(chain->dxgi);` may failed under venus for win32
- zink, turnip: compilation is failing when compiling zink and turnip with only kgsl support
- zink: mesh shaders broken


Changes
-------

Adam Jackson (2):

- zink: use VK_EXT_pci_bus_info for PCI address
- venus: advertise VK_KHR_shader_fma

Adam Simpkins (1):

- iris: fix a crash in disable_rb_aux_buffer

Aditya Swarup (11):

- anv: Add helper macros for address binding report extension
- anv: Report address binding events for memory buffers
- anv: Report address binding events for images
- anv: Add BO helper macros for binding report extension
- anv: Report address binding events for VkQueryPool
- anv: Report address binding events for VkDescriptorPool
- anv: Report address binding events for VkCommandPool
- anv: Report bind/unbind events for sparse VA range
- anv: Report addr bind events for opaque/non-opaque sparse allocations
- anv: Enable support for VK_EXT_device_address_binding_report
- anv: Report bind events for image private binding

Adrián Larumbe (2):

- pan/kmod: Fix minor version number check for USER_MMIO_OFFSET ioctl
- pan/kmod: fix double syncop count sum when populating vm_bind syncs

Agate, Jesse (3):

- amd/vpelib: Add RGB 601 Primaires to BG Color
- amd/vpelib: Predication fix
- amd/vpelib: Visual Confirm Fix

Ahmed Hesham (3):

- rusticl: return correct error from clCreateSubBuffer
- rusticl: fix flag validation when creating an image
- pan/bi: Restore b3210 as a valid swizzle

Aitor Camacho (45):

- nvk: Handle unbound sets that contain dynamic buffers
- hk: Handle unbound sets that contain dynamic buffers
- kk: Update kk_bind_descriptor_sets comment to reflect updated binding
- kk: Bundle nir_to_msl options into a struct for easier option addition
- kk: Force frag output component count to match render targets'
- kk: Use nir_opt_shrink_stores
- kk: Fill pipelineUUID
- kk: Fix shader uint32_t value serialization
- kk: Correctly release pipeline handles at shader destroy
- kk: Fix compute pipeline cache
- kk: Move gfx pipeline data to the info struct within kk_shader
- kk: Fix graphics pipeline serialization
- kk: Remove primitive type from pipeline and rely on dynamic one
- kk: Enable vertexPipelineStoresAndAtomics
- kk: Move nir_opt_shrink_stores after nir_opt_remove_phis for correct shrink
- kk: Fix disabling workaround 4
- kk: Expose pushDescriptor from 1.4
- kk: Expose VK_EXT_extended_dynamic_state2
- kk: Expose VK_EXT_texel_buffer_alignment
- kk: Expose depthBiasClamp
- kk: Expose largePoints
- kk: Expose VK_EXT_image_2d_view_of_3d
- kk: Expose sampleRateShading
- wsi/metal: Expose additional color spaces if instance extension enabled
- kk: Remove helper invocation flag in read system values
- kk: Fix crash in PositiveShaderImageAccess.UndefImage
- kk: Assign type to load_frag_coord
- kk: Fix push descriptor set layout when rebinding
- kk: Force fragment output matches render targets'
- kk: Fix image access issues
- kk: Default to max descriptor size if mutable list is empty
- kk: Increase push constant size to 256 from 128
- docs/kk: Update build instructions to add --prefer-static
- docs: Add KosmicKrisp to the list of layered drivers
- kk: Set command buffer state to 0 when reset
- kk: Add clc in a similar fashion to other drivers like HK
- kk: Rework draw recording for easier addition of stages like tessellation
- kk: Implement VK_KHR_draw_indirect_count as HK does
- kk: Rework command buffers' compute shader state tracking
- kk: Rework command buffers' graphics shader state tracking
- kk: Clean up gfx state flushing
- kk: Reset queries through compute dispatch instead of queue writes
- kk: Remove buffer arg from queue writes
- kk: Demote events, query availabilities and queue writes to 32 bits
- kk: Fix pre-gfx encoder dependency with gfx encoder

Alejandro Piñeiro (3):

- v3dv/meson: fix missing headers and duplicate entry
- v3dv: split v3dv_private.h into smaller headers
- broadcom/vulkan: remove v3dv_private.h

Aleksi Sapon (8):

- llvmpipe: pass explicit derivatives to sampling codegen
- llvmpipe: elliptical derivative transform for anisotropic filtering
- llvmpipe: implement per-fragment anisotropic rho
- llvmpipe: add GALLIVM_PERF=no_lod_ellipse
- llvmpipe: add stride argument to lp_build_swizzle_aos_n
- llvmpipe: update traces
- llvmpipe: fix incorrect image 64bit fetch return value type
- lavapipe: update fails

Alexander Koskovich (1):

- freedreno/common: add support for the Adreno 810

Ali, Nawwar (4):

- amd/vpelib: update 3dlut and shaper FL
- amd/vpelib: coding style rectify
- amd/vpelib: Fix crash during encoding test
- amd/vpelib: Move shaper and 3D LUT updates to vpe_color_update_movable_cm

Allen Ballway (1):

- vulkan: update ALLOWED_ANDROID_VERSION for api level 37

Alyssa Milburn (1):

- nv50,nvc0: Avoid uninitialized cbuf reads in blits

Alyssa Rosenzweig (92):

- brw: move nir_opt_memcpy OOTL
- brw: remove redundant nir_opt_combine_stores
- brw: hoist lower_pack OOTL
- brw: unloop post-mem vectorize opts
- brw: run opt_deref only once
- brw: only optimize ray queries once
- brw: only optimize ray queries if there are any
- brw: run nir_opt_idiv_const only once
- brw: optimize bfi only late
- brw: combine more peephole select
- brw: remove a redundant DCE
- util: add linear_memdup
- util: add BITSET_LINEAR_ZALLOC
- brw: use BITSET_LINEAR_ZALLOC
- brw: move fsign lower OOTL
- brw: hoist fsat lower OOTL
- pvr,pan,agx: drop cargo-culted nir_opt_loop calls
- util: allow string shader "statistics"
- util: hide hashes from GL shader stats
- intel: add scheduling mode statistic
- intel: simplify shader stats names
- intel: report code size in shader stats
- nir: disable fast-math for lowering conversions
- vtn: fix wait_group_events memory scope
- brw: drop buggy SLM optimization
- nir: add missing ssbo atomics to nir_get_io_index_src_number
- util/bitset: add an assert for big BITSET_EXTRACT
- nir/opt_constant_folding: optimize ballot(false)
- nir: add nir_get_io_data_src
- nir/lower_atomics: use data helper
- nir/opt_uniform_atomics: use data helper
- nir/opt_fragdepth: use data helper
- nir/opt_intrinsics: use data helpers
- brw: use data helper
- util,intel: move probably_float to common code
- agx: use util_is_probably_float
- mailmap: update my personal email
- panfrost: drop email from ancient copyright lines
- asahi: fix some copyright headers
- nir: optimize u2u32(unpack_32_2x16_split_*)
- nir/lower_subgroups: generalize vote lowering
- agx: use common code vote lowering
- nir/lower_io: remove incorrect Intel _block cases
- brw: move brw_nir_pack_vs_input to brw_nir.c
- brw: move brw_can_coherent_fb_fetch to a C header
- nir/lower_io: handle Intel URB intrinsics
- nir/lower_subgroups: fix boolean clustered reductions
- nir/opt_sink: sink Intel UBO loads
- nir/opt_sink: sink pack_64_2x32_split
- agx: drop NIR continue handling
- util/sparse_bitset: add u_sparse_bitset_clear_all
- brw: explicitly pad tgl_swsb
- brw/eu_emit: relax assertion to allow ARF NULL
- brw/nir_lower_fs_load_output: optimize pixel coord
- brw: wire up MACL
- brw: Move intel_nir_opt_peephole_imul32x16 later in compilation
- brw: scalarize even 64-bit scratch access
- brw: lower 16-bit mulh
- brw: lower mem access sizes even for UBOs
- brw: chop up unaligned access
- brw: round up block components
- nir: add frag_coord_w_rcp intrinsic
- nir: add Intel RT write intrinsic
- nir: add shuffle_intel
- nir: add pixel_coord_intel
- brw: subgroup lowering for jay
- brw: disable hw generate local ID for jay
- brw: disable nir_opt_uniform_atomics for Jay
- brw: add Jay-specific SIMD selection rule
- brw: lower ifind_msb for Jay
- intel: add Jay
- iris: wire up jay
- anv: wire up jay
- jay: fix W-entry calcs
- jay: rematerialize address regs
- jay: drop GRF reg stats
- jay: roundrobin RA
- jay: marginally improve send splitting heuristic
- jay: tweak roundrobin
- jay: generalize alignment heuristic
- jay: improve vector affinities
- jay: fix SEND scoreboarding
- jay: fix simd split swsb bugs
- jay: fix instr counts
- jay: move deswizzle hack outside of swsb
- jay: split up jay_from_nir.c
- jay: load_simd_width_intel
- jay: fix a bunch of opcode properties
- jay: fix bfn cmod
- jay: allow cmod on cvt
- intel: fuse off Jay in Mesa 26.1
- nir/opt_reassociate: fix exactness bug

Anders Roxell (7):

- teflon: Add support for symmetric per-channel quantization
- ethosu: Add support for per-channel quantization
- ethosu: Handle per-channel zero_points
- ethosu: fix RESIZE upscale mode
- ethosu: clean up ADD elementwise scaling
- teflon/tests: add micronet_large anomaly detection model
- ethosu: fix blockdep to check for data dependencies

Andy Nguyen (1):

- amd/addrlib: Add more GFX1013 GPUs

Anna Maniscalco (2):

- freedreno/common: set has_astc_hdr true for a7xx targets
- zink: don't care about generated gs output primitive

Ansari, Muhammad (2):

- amd/vpelib: Fix potential overflow calculation
- amd/vpelib: Adding new wrapper for register profiling

Arjob Mukherjee (4):

- doc: Added documentation for imagination tree
- pvr: Fixup for deqp-vk.api 2d.optimal.* conformance
- pvr: Fixup for deqp-vk.api 2d.optimal.* conformance
- pvr: increase value of maxPerStageDescriptorStorageBuffers

Arkady Shlykov (2):

- brw: Implement divergent atomics fusion optimization (single message approach)
- anv: Add control over divergent atomics fusion opt via driconf

Assadian, Navid (1):

- amd/vpelib: Reorder function pointers

Bas Nieuwenhuizen (1):

- ac/llvm: Fix build with LLVM 23.

Benjamin Cheng (16):

- radv/video: Use a more reliable way of computing tile sizes
- radv/video: Use ac_video_dec for decode
- radv/video: Split cdf buffer and encode ctx
- ac/parse_ib: Fix VCN address parsing
- radv/video: Disable qp map for h265 on vcn1
- radeonsi/vcn: Use full pitch for pre-encode input
- radv: Disable video features for some DRM modifiers
- frontends/va: Assert that slices come in order
- ac: Fix naming of hevc encode params IB
- radv/video_enc: Use variable slice mode when possible
- radeonsi/vcn: Reorder get_slice_ctrl_param
- ac: Update FW required for variable slice mode
- radv/video: Add low-latency flags
- ac/surface: Filter swizzle modes for VCN
- radv: Relax linear requirement to VCN1 and prior
- radv/wsi: Re-use transfer queue if it exists

Benjamin Otte (1):

- lavapipe: Fix features for nonsubsampled ycbcr formats

Bernd Kuhls (1):

- blake3: add blake3_neon.c only for little endian archs

Boris Brezillon (2):

- pan/kmod: Allow mmap() on foreign buffers
- pan/format: Advertise support for AFBC(16x16,sparse,split)

Boyuan Zhang (1):

- ac/vcn_dec: add addr_mode for VCN 5.0.1

Brian Paul (2):

- gallivm: fix undefined CALLOC_STRUCT build error
- util,loader: silence asprintf() unused result warnings

Caio Oliveira (49):

- brw/scoreboard: Use std::vector when applicable
- brw/scoreboard: Add tests showing implicit unordered dependencies in SWSB
- brw: Provide ~ and &= operators for tgl_sbid_mode
- brw/scoreboard: Support local implicit out-of-order dependencies
- brw: Create a struct to hold parser state
- brw: Move brw_last_inst macro to assembler
- brw: Move the brw_codegen inside brw_asm_parser
- brw: Remove global variables from brw_asm parser
- brw: Remove tabs from brw_cfg.cpp
- brw: Remove foreach_block_safe / reverse_safe
- brw: Remove block_list in favor of blocks array
- brw: Don't increment block loads addresses unless needed
- intel/compiler: Use SPDX annotations
- spirv: Check Capability for identifying SPV_NV_mesh_shader
- intel/mda: add difflog command
- brw: Include backend NIR passes in mda files
- brw: Use the "early break" loop macros when possible
- intel/mda: Change the matching logic
- intel/mda: Use -W for color words diff and -U for regular unified diff
- brw: Remove outdated comment about remove_dead_variables
- brw: Fix "GRF registers" stats output
- brw: Print "GRF registers" in INTEL_DEBUG=shaders output
- brw/print: Don't print extra space at the end
- anv: Don't enumerate cooperative matrix configurations if disabled
- anv: Simplify cooperative matrix feature advertising
- brw: Fix cooperative matrix constant sources other than src0
- brw: Make brw_builder::uniform() ignore previous group
- brw: Explicitly set group=0 in generator for SYNC used in workaround
- nir: Handle nir_instr_type_cmat_call in more places
- brw/scoreboard: Don't track dependencies for UNDEFs
- brw: Add lowering for nir_cmat_call_op_per_element_op
- anv: Enable cooperativeMatrixPerElementOperations
- anv: Set PIPELINE_SELECT systolic mode based on shader usage
- spirv: Refactor ALU opcode translation to take bit sizes
- spirv: Pull constant source fixup to the existing loop
- spirv: Fix spec constant to handle Select for non-native floats
- spirv: Remove conversions from vtn_nir_alu_op_for_spirv_opcode()
- nir: Fix constant folding for iadd_sat
- anv: Add vkGetPhysicalDeviceCooperativeMatrixFlexibleDimensionsPropertiesNV
- spirv: Use SPDX annotations
- spirv: Remove dead code in subgroup instruction handling
- nir: Add intrinsics for ShuffleUpINTEL and ShuffleDownINTEL
- intel/compiler: Handle shuffle_*_intel intrinsics in bit size lowering
- spirv: Lower ShuffleUpINTEL and ShuffleDownINTEL to intrinsics
- brw: Always use split send in generator
- anv: Fix assert in anv_nir_compute_push_layout
- brw: In "Clear Accumulator" workaround, never set predicate_inverse
- anv: Lower any remaining globals when cmat_calls are inlined
- brw: Fix max_dispatch_width collection for CS with variable size

Caius-Moldovan-img (3):

- pco: Move DITR and DITRP fencing from translation to legalization
- pco: Add pseudo instruction fencing for DITR and DITRP
- pco: Move part of legalization after register allocation

Calder Young (4):

- Revert "anv,brw: Allow multiple ray queries without spilling to a shadow stack"
- anv: Avoid dumping BVH before command buffer is submitted
- anv: Fix address bit masking for indirect SBTs
- anv: Fix support for indirect SBTs on Xe3+

Caleb Callaway (2):

- driconf: LTO disable
- anv/driconf: Disable shader LTO for MHW

Casey Bowman (3):

- anv: Fix shaders-lineno implementation for eu stall sampling
- intel/tools: Add xe3p format for intel_monitor
- intel/ds: Modify rejection threshold to scale with requested sample period

Caterina Shablia (20):

- panvk: fix sparse image non-opaque binds
- panvk: let the mod handler handle DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED
- panvk: merge vm_bind ops in some cases
- drm-uapi: update drm_fourcc.h
- pan/genxml: add interleaved 64k clump ordering and block format
- pan/lib: introduce standard_sparse_mapping_granularity
- panvk: implement sparse in terms of interleaved 64k
- pan/lib: remove deadcode
- panvk: implement sparseResidencyImage3D
- spirv: plumb spirv-dis --offsets
- pan/bi: remove trailing space
- panvk: propagate debug info through NIR when BIFROST_MESA_DEBUG=debuginfo is specified
- pan/bi: propagate debug info
- pan/bi: print shaders with debug info when BIFROST_MESA_DEBUG=shaders,debuginfo is specified
- panvk: never report identicalMemoryLayout
- pan/lib: use interleaved 64k in more cases
- panvk: leave a TODO for U-interleaved copies
- pan/lib: use tiled AFBC
- radv: skip retiling if transitioning away from ZERO_INITIALIZED
- radv: move all image checks into radv_image_need_retile

Chan, Roy (2):

- amd/vpelib: fix uninitialized variable
- amd/vpelib: add a new cap to differentiate scaler coeff handling

Chang, Tomson (1):

- amd/vpelib: avoid using reg_update for multi-thread
|
||
Christian Gmeiner (62):
|
||
|
||
- compiler/mesa/st: Move gl_advanced_blend_mode to util/blend.h
|
||
- util/blend: Add advanced blend modes
|
||
- util/blend: Add pipe_blend_overlap_mode enum
|
||
- vulkan/runtime: Add helper to convert VkBlendOp to pipe_advanced_blend_mode
|
||
- vulkan/runtime: Add helper to convert VkBlendOverlapEXT to pipe_blend_overlap_mode
|
||
- vulkan/runtime: Add VK_EXT_blend_operation_advanced state tracking
|
||
- nir: Extract blend equation helpers to shared header
|
||
- nir/lower_blend: Add advanced blending support
|
||
- asahi/hk: Implement VK_EXT_blend_operation_advanced
|
||
- gallium: Add pipe cap for masked clears and support stencil masking
|
||
- etnaviv: hwdb: Add BLT_64BPP_MASKED_CLEAR_FIX cap
|
||
- etnaviv: blt: Enable masked clear for color and stencil
|
||
- etnaviv: Emit alpha_to_coverage dither table only when needed
|
||
- lavapipe: Implement VK_EXT_blend_operation_advanced
|
||
- meson: Restore .clang-format for ninja clang-format target
- pan/format: Disable storage image support for compressed formats
- panvk: Support VK_EXT_zero_initialize_device_memory
- pan/compiler: Fix progress reporting in pan_nir_lower_store_component
- panvk: Support VK_EXT_astc_decode_mode
- etnaviv: Validate MSAA sample count for depth/stencil formats
- etnaviv: blt: Fix clear_bits overflow for 32-bit formats
- panvk: Advertise VK_EXT_color_write_enable
- panvk: implement VK_EXT_image_view_min_lod
- etnaviv: blt: Use img->swizzle for CONFIG SWIZ fields
- etnaviv: Add translate_pe_internal_format helper
- etnaviv: Use BGRA-internal texture format with BLT/RS R/B swizzle
- etnaviv: Compute blend color registers directly in etna_set_blend_color(..)
- panvk: Support VK_EXT_depth_clamp_control
- panvk: Support VK_EXT_legacy_dithering
- u_blitter: Add single-triangle draw mode for NEAREST blit consistency
- etnaviv: Enable single-triangle blitter mode
- etnaviv: hwdb: Add WIDELINE_TRIANGLE_EMU cap
- etnaviv: Limit max line width to 1.0 on GPUs needing wide line emulation
- etnaviv: Mark TS config dirty after BLT blit
- etnaviv: Implement stencil-only blit using util_blitter fallback
- etnaviv: Add S8_UINT texture format support for stencil texturing
- panvk: Use per-queue shader core count for CSF group creation
- panvk: Advertise VK_ARM_scheduling_controls on CSF
- isaspec: Use %g format for float display to ensure round-trip fidelity
- panvk: advertise VK_EXT_nested_command_buffer on v10+
- etnaviv: isa: Restrict COND field to conditional instructions
- etnaviv: isa: Split texkill into concrete bitset variants
- panvk: Advertise VK_VALVE_mutable_descriptor_type
- etnaviv: isa: Add unary texkill variant
- panvk: Advertise VK_EXT_shader_stencil_export
- docs/features: VK_VALVE_mutable_descriptor_type: Add missing version info
- panvk: Implement VK_EXT_memory_budget support
- pan/kmod: Simplify pan_kmod_bo_mmap() to always map the whole BO
- panvk: Implement VK_EXT_map_memory_placed
- panvk: Add VK_EXT_conditional_rendering state and commands
- panvk: Wrap draws and dispatches with conditional rendering
- panvk: Support inherited conditional rendering in secondaries
- panvk: Disable conditional rendering during meta operations
- panvk: Advertise VK_EXT_conditional_rendering
- panvk: Advertise VK_EXT_shader_atomic_float
- panvk: Lower memcpy derefs
- panvk: Advertise VK_KHR_shader_untyped_pointers on v9+
- panvk: Advertise VK_EXT_attachment_feedback_loop_layout
- panvk: Advertise VK_EXT_attachment_feedback_loop_dynamic_state
- radv: Don't advertise any features for R10X6G10X6B10X6A10X6_UNORM_4PACK16
- util/format, vulkan: Add PIPE_FORMAT_X6R10X6G10X6B10X6A10_UNORM
- panvk: Advertise VK_EXT_rgba10x6_formats

Christoph Pillmayer (20):

- pan: Add some missing ForEachMacros to .clang-format
- pan/bi: Fix spill limit value order
- pan/bi: Reindex SSA before NIR->BIR
- pan/bi: Copy agx_repair_ssa.c
- pan/bi: Copy reindex_ssa.c from agx
- pan/bi: Fixup bi_reindex_ssa.c for bi
- pan/bi: Fixup bi_repair_ssa.c for bi
- pan/bi: Repair SSA after spilling
- pan/bi: Teach bi_print_instr about memory vars
- pan/bi: Pull out size recording
- pan/bi: Abstract away spills/fills when spilling
- pan/bi: Don't allocate lut space for temps
- pan/bi: Account for MEMMOV in bi_record_sizes
- pan/bi: Make SSA spilling vector aware
- panvk: Don't create MS2SS views for internal views
- pan/bi: Fix coupling spill placement
- pan/bi: Move FAUs to memory for memory phis
- CODEOWNERS: Update panfrost
- pan/bi: Fix MEMMOV size calculation
- pan/bi: Fix source swizzle in bi_repair_ssa

Chuanyu Tseng (1):

- Revert "amd/vpelib: Adding new wrapper for register profiling"

Collabora's Gfx CI Team (13):

- Uprev ANGLE to 63d1dd7c2dfccf6acbd92af224b48aa6ada45f1c
- Uprev VVL to snapshot-2026wk06
- Uprev Piglit to 0d79fb4a59c7d213ff144afa4c73e3b32ebe6500
- Uprev VVL to snapshot-2026wk07
- Uprev Piglit to 8e2c8bc0018f42b107d470a2de1bf7f53e8fb012
- Uprev ANGLE to b90b9ee1a4f901e6ba9e649d8f6cf9098a944f50
- Uprev Vulkan Validation Layers
- Uprev Piglit to d0a16eee4f7b24abe7e3aab6ee77db8f82e5ad49
- Uprev ANGLE to 599125448d7ad53b2868a7b5d2e3e8d3bfbc1717
- Uprev VVL to adfdda5b665f59aae31acf5c12c73e64a8f89553
- Uprev VVL to cb2acdf7f49053406770ae73cbb315229a9131eb
- Uprev Piglit to 11ce9eb56edb00e6a7702d13168cc827ce5e0cbd
- Uprev ANGLE to 5e591d03650dd427001e355f4884b857cadab113

Connor Abbott (65):

- nir: Move is is_compact() out of unlower_io_to_vars
- nir: Fix recompute_io_bases with compact i/o arrays
- nir/recompute_io_bases: Fix handling of dual-source blending
- nir/recompute_io_bases: Fix num_inputs with dual-slot VS inputs
- nir: Allow lower_clip_fs with lowered IO
- nir/lower_clip: Correctly handle driver_location in VS lowering
- st/mesa: Call nir_recompute_io_bases after lowering FP variants
- nir, ir3: Make ir3 GS varyings use a proper index
- ir3, freedreno, tu: Move nir_lower_io out of post_finalize()
- ir3: Fix ir3_nir_lower_layer_id indentation
- ir3: Stop relying on variables in ir3_nir_lower_layer_id
- ir3: Stop using variables when translating from NIR
- ir3: Don't use variables for passthrough TCS
- ir3: Remove variables after i/o lowering
- freedreno, turnip, ir3: Always gather streamout info from NIR
- ir3: Stop asserting tess levels are scalar
- ir3: Don't support indirect inputs in FS
- ir3, freedreno, turnip: Lower io earlier
- freedreno, ir3: Fix branchstack register definitions on a5xx+
- ir3: Split out max_branchstack and branchstack_size
- ir3: Fix branchstack max_waves calculation
- tu: Fix FDM texel size calculation
- tu: Handle FDM-per-layer in CmdClearAttachments paths
- tu: Use a patchpoint for subpass clears with FDM
- tu: Implement VK_QCOM_multiview_per_view_viewports
- tu/autotune: Take render pass layers into account
- tu: Support VK_QCOM_multiview_per_view_render_areas
- tu: Remove fdm argument from tu6_emit_tile_select
- tu: Implement bin merging for views
- tu: Implement bin skipping for zero-density regions
- ir3: Fix barrier error case calculation
- tu: Fix condition for re-emitting FDM-related state
- tu: Use HW offset 0 in 3d loads/clears with FDM
- ir3: Fix constlen trimming when more than one stage is trimmed
- tu: Store tile the tile was merged with
- tu: Refactor FDM sampling and bin merging
- tu: Move FDM tile configuration to a new file
- tu: Always call tu_emit_renderpass_begin()
- tu: Pass through tile_config to FDM patchpoints
- tu: Move immutable sampler handling above descriptor size calc
- vulkan: Store a few more fields in vk_sampler
- tu: Set polygon mode when blitting
- tu: Fix setting will_be_resolved with MSRTSS
- tu: Track which views an attachment is used as a resolve attachment
- tu: Refactor immutable sampler handling with descriptor update templates
- tu: Multiply bin size by GMEM extent
- tu: Implement subsampled images
- freedreno/afuc: Update cread/cwrite syntax in README
- freedreno: Rename afuc to QRisc
- vtn: Fix vtn_mediump_downconvert_value() for transposed matrices
- vtn: Fix vtn_mediump_upconvert_value() with transposed matrices
- nir: Use better calculation for alpha-to-coverage mask
- tu, ir3, nir: Plumb through driver param for alpha-to-coverage
- tu: Enable alpha-to-coverage emulation
- freedreno: Name GS/DS ViewID register fields
- ir3: Implement ViewIndex for GS
- ir3: Support multiview in GS lowering
- tu: Adjust multiview lowering for GS
- tu: Fill GS/DS ViewID register fields
- tu: Lower maxMultiviewViewCount to 6
- tu: Enable multiviewGeometryShader
- ir3: Don't reset immediate count to 0 after lowering
- ir3: Use correct immediate size for constlen calculation
- tu: Fix LRZ+FDM offset+secondaries
- tu: Disable LRZ when resuming if the GPU doesn't support tracking

Daivik Bhatia (9):

- broadcom/compiler: Update comment clarifying OpTerminate implementation
- v3dv: use vk_graphics_pipeline_state for pipeline creation
- v3d/v3dv: drop manual log2_tile_width/height asserts. Move the log2_tile_width/height asserts to pack header functions.
- v3dv: parse V3DV_ENABLE_PIPELINE_CACHE with parse_debug_string
- v3d/v3dv: drop unused UIF XOR disable plumbing
- nir: Handle format swizzles for OOB image loads
- v3dv: Implement robust_image_access_2 flag
- broadcom/compiler: lower txf LOD for robustImageAccess2 on V3D 4.2
- v3dv: Enable VK_KHR_robustness2

Daniel Lang (3):

- etnaviv: hwdb: Import gc_feature_database from Amlogic
- etnaviv: hwdb: Import gc_feature_database from D-Robotics
- etanviv: hwdb: Import gc_feature_database from eYs3D

Daniel Schürmann (59):

- nir/lower_non_uniform_access: flag IF as always divergent taken
- panfrost/clc: call nir_opt_remove_phis after nir_opt_loop
- asahi/clc: call nir_opt_remove_phis after nir_opt_loop
- nir/opt_loop: Relax restrictions on opt_loop_peel_initial_break() for more loops
- nir/opt_load_store_vectorize: use linear allocator instead of ralloc
- nir/opt_load_store_vectorize: create add_entry_to_hash_table() helper
- nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks
- radv: vectorize UBO, SSBO and shared across blocks
- radeonsi: vectorize UBO, SSBO and shared across blocks
- nir/opt_load_store_vectorize: Vectorize speculatable instructions across blocks
- nir/opt_load_store_vectorize: don't use shared2 vectorization across blocks
- nir/opt_loop_unroll: Always unroll loops with a known trip-count of 0
- nir/loop_analyze: also set force_unroll if the array_size is larger than max_trip_count
- nir/clone: Fix cloning indirect call instructions
- aco/builder: Fix v_add_co_u32 carry-out to VCC if post_ra
- aco/isel: Do IF-simplification if that didn't happen during NIR optimizations
- aco: don't emit p_logical_start / p_logical_end after divergent branches
- aco/isel: Don't emit ELSE side of uniform branches which jump
- aco/isel: Don't emit ELSE side of divergent branches which jump
- aco/lower_branches: Consider branch target of nested conditional branches
- aco/print_asm: Sort block markers by block offset
- aco: introduce notion of block_kind_loop_latch
- aco/assembler: emit block_kind_loop_latch before the loop header
- aco/insert_delay_alu: handle loop latch block before loop body
- aco/lower_branches: Add try_rotate_latch_block() optimization
- glsl_to_nir: emit loop continue construct
- nir/divergence: rename divergent_loop_cf to divergent_cf
- nir/divergence: Fix nir_block::divergent in presence of divergent breaks
- nir/divergence: Ignore divergent_loop_{continue|break} for nir_block::divergent
- nir/opt_remove_phis: recursively check loop header phis for triviality
- nir/lower_continue_constructs: Simplify loops before lowering continue constructs
- nir/lower_continue_constructs: Remove unnecessary handling of multiple continue statements
- radv/rt: add and lower loop continue construct in traversal shaders
- radv/dgc: add and lower loop continue construct
- tgsi_to_nir: Add and lower loop continue constructs
- dxil/nir: Remove nir_jump_continue from lower_subgroup_scan()
- dozen: add and lower loop continue construct
- nir/lower_goto_ifs: Add and lower loop continue constructs
- ac: add and lower loop continue construct for streamout buffer info loop
- tu/rt: add and lower loop continue construct in traversal shaders
- lavapipe/rt: add and lower loop continue construct in traversal shaders
- aco/tests: add and lower loop continue constructs in all tests which use continues
- nir/tests: change opt_loop_peel_initial_break test to not use nir_jump_continue
- nir: ensure that loop continue statements always link to continue constructs
- nir: validate that loop continue statements always link to continue constructs
- radv: call nir_opt_if late again
- radv: increase limit for peephole_select in radv_optimize_nir_algebraic_early()
- nir/opt_if: allow load_const instructions on ELSE-side for if-simplifaction
- nir/opt_if: allow undef instructions on ELSE side for if-simplification
- aco/isel: Remove loop_context* parameter from begin_loop() / end_loop() helper functions
- aco/isel: Remove if_context* parameter from begin_if() / end_if() helper functions
- aco/lower_branches: Don't remove branches which jump over loops
- aco/lower_branches: Fix try_rotate_latch_block()
- aco/isel: remove handling of nir_jump_continue
- aco/insert_exec_mask: remove handling of loop continues
- aco/lower_phis: remove handling of block_kind_continue
- aco/opt_value_numbering: remove handling of block_kind_continue
- aco/lower_branches: remove handling of block_kind_continue
- aco: remove remaining occurences of block_kind_continue

Daniel Stone (6):

- panvk: Support VK_KHR_get_surface_capabilities2
- panvk: Support VK_KHR_get_display_properties2
- panvk: Support VK_EXT_acquire_drm_display
- panvk: Support VK_KHR_present_id and present_wait
- ci/panvk: Skip DRM WSI tests for v10/ASan
- vulkan/wsi/wayland: Correctly map 24bpp format types

Danylo Piliaiev (23):

- tu: Fix typo in min bounds calculation of FDM scissors
- tu: Avoid disabling LRZ when possible for suspend/resume+depth-only draws
- util/u_trace: Fix usage of variable-sized strings in non-queued case
- docs/envvars: Document TU_DEBUG and IR3_SHADER_DEBUG
- docs: Add documentation on how to debug GPU crashes and misrenderings
- tu/a7xx: Fix D/S corruption when loading them via load_3d_blit
- freedreno/rddecompiler: Fix shader editing when REG_BUNCH is used
- ir3: Align TCS per-patch output to 64 bytes to prevent stale reads
- tu: Fix double emission of PC_DS_CNTL due to missing break
- tu: Add lrzWriteDisableReason to render_pass tracepoint
- tu: vk_dont_care_as_load should not affect internal DONT_CARE cases
- tu: Store gmem attachments after custom resolve in dyn RP
- tu: Fix stomping of D/S test for custom resolve with D/S
- tu: Don't read .patch_input_gmem of unused attachment
- tu/kgsl: Better detection of sparse support
- tu: Fix imported memory not being affected by DEVICE_ADDRESS_CAPTURE_REPLAY
- tu: u_trace usage fixes before u_trace refactoring
- tu/autotune: Improve RP hash
- tu: Support EXT_shader_image_atomic_int64
- tu: Support transfer commands for R64 formats
- tu: Add tracepoints for clear/copy/blit/lrz ops
- tu: Fix CP_CCHE_INVALIDATE not being applied at the right point
- tu/u_trace: Fix explicit toggle_name not being used

Dave Airlie (12):

- lavapipe: add NV_cooperative_matrix2 flexible dimensions support
- lavapipe: add NV_cooperative_matrix2 conversions support
- nir: add cmat call to propogate invariants
- lavapipe: add NV_cooperative_matrix2 reductions support
- lavapipe: add support for NV_cooperative_matrix2 per element operations
- gallivm: handle u16 correct on const loads.
- st/mesh: handle mesh shader point size
- nvk: don't set sector promotion on texture headers
- nouveau: drop sector promotion.
- gallivm: handle llvm 22 coroutine end change
- gallivm: handle llvm 22 scatter/gather intrinsic changes.
- lavapipe: treat NULL pColorAttachmentLocations as no handles

David Headrick (2):

- dozen: Add support for VK_EXT_memory_budget
- dozen: Resolve Seg Fault in dzn_physical_device_create

David Rosca (40):

- radeonsi/vcn: Clean up decode flags
- radeonsi/vcn: Add low latency decode debug option
- radv/video: Use coded size from session params instead of codedExtent
- radv/video: Fix maxActiveReferencePictures for H265 decode
- radv/video: Support AV1 encode frame size override
- ac: Add video decode interface
- ac: Add VCN ac_video_dec implementation
- ac: Add VCN JPEG ac_video_dec implementation
- ac: Add UVD ac_video_dec implementation
- radeonsi: Don't assert when using src texture in si_compute_clear_copy_buffer
- radeonsi/video: Add video decoder using ac_video_dec
- radeonsi/video: Remove old VCN and UVD decode implementation
- radeonsi: Rename si_uvd_create_decoder to si_video_codec_create
- radeonsi: Rename si_uvd.c to si_video.c
- radeonsi: Move functions from radeon_video.c to si_video.c
- radeonsi/video: Drop offsets parameter for si_vid_resize_buffer
- radv/video: Remove old VCN and UVD decode implementation
- radv: Drop videoarraypath debug option
- ac/vcn_dec: Make the helper functions static
- radv/video: Support UVD decode on hawaii and older
- ac/vcn_dec: Fix tier2 dpb array size
- vl: Also disable MPEG2 Main profile when mpeg12 decode is disabled
- ac: Add variable slice mode interface
- radeonsi/vcn: Use variable slice mode when possible
- radv/video: Fix AV1 encode min tile size
- radv/video: Fix coding pic_parameter_set_id in H264 slice header
- frontends/va: Fix leaking H264/5 PPS/SPS objects when decoder wasn't created
- frontends/va: Fix leaks when create_video_codec fails
- radv/video: Use quality level for encode preset instead of tuning mode
- radeonsi: Set multi plane format also for imported textures
- radeonsi/video: Fix setting decode surface format for single plane formats
- radv/video: Remove unused function radv_vcn_sq_start
- radeonsi/vcn: Remove encode op_preset overrides
- radeonsi/vcn: Don't force balance encode preset with sao on VCN5
- d3d12: Use HEVC RefPicSet order from frontend
- ac/parse_ib: Fix printing enc recon VAs on VCN5
- radv/video: Fix initializing rc structs with default rate control
- frontends/va: Fix finding LTRs from POCs in HEVC decode
- frontends/va: Fix setting output color properties from color standard
- frontends/va: Add missing NULL check for additional output surface

Derek Lesho (1):

- zink: Guard bo map/unmap on map_count.

Dhruv Mark Collins (24):

- freedreno,u_trace: Fix various UBSAN errors
- tu: Increase clang-format ColumnLimit to 120
- tu: Move tu_autotune_end_renderpass as late as possible
- tu: Rewrite autotune in C++
- util/rand_xor: Add extern C for C++ compatibility
- tu/autotune: Add "Profiled" algorithm
- util/math: Add ROUND_DOWN_TO_NPOT
- tu/autotune: Prefer SYSMEM when only SW binning is possible
- tu/autotune: Disable autotuning for small renderpasses by default
- tu/autotune: Add "Preempt Optimize" mode
- tu/autotune: Add prefer SYSMEM/GMEM mode
- tu+util: Allow setting autotune mode from driconf
- tu+util: Prefer SYSMEM for DXVK/VKD3D
- tu/autotune: Add render mode locking to PROFILED algorithm
- tu/autotune: Allow 99% max probability in profiled mode
- tu/autotune: Only lock RPs sustain certain mode for 30s
- freedreno/fdperf: Detect when counter values are invalid
- zink+turnip/ci: Add failures uncovered by new autotune
- tu: Disable features using performance counter for KGSL
- tu: Only emit preempt optimization ambles when active
- tu/autotune: Fail gracefully when CP counters are unavailable
- fd/pps: Allocate performance counters from high-to-low
- tu/autotune: Allocate performance counters from low-to-high
- tu/query_pool: Avoid CP counter conflict with autotune

Dmitry Baryshkov (1):

- freedreno/ci: update nightly expectations

Dmitry Osipenko (7):

- intel: Check for userptr UAPI presence
- intel: Add virtio-gpu native context
- iris: Open-code drm prime ioctls
- iris: Support virtio-gpu native context
- anv: Support virtio-gpu native context
- crocus: Use intel_ioctl() consistently
- crocus: Support virtio-gpu native context

Duncan Brawley (4):

- pvr: add basic support for shader statistics framework
- pvr: Add support for VK_KHR_pipeline_executable_properties
- pco: Use vertex input registers in register allocation
- pco: Fix pco_last_igrp returning the first element instead of the last

Dylan Baker (15):

- bin/pick: When the main widget is replaced, trigger a redraw
- docs: add release notes for 25.3.4
- docs: Add SHA sums for 25.3.4
- docs: update calendar for 25.3.4
- docs/release-calendar: Update calendar for 1 week bump
- docs: add release notes for 25.3.5
- docs: Add 25.3.5 SHA sums
- docs: update calendar for 25.3.5
- vulkan/runtime: Tie vulkan log printing to debug option rather than buildtype
- docs: add release notes for 25.3.6
- docs: Add SHA sums for 25.3.6
- docs: Fix unescaped \`*` in 25.3.6 release notes
- docs: update calendar for 25.3.6
- intel/tools: Don't allocate in noop_drm_shim until after error checking
- anv: assert we haven't gone over the maximum number of push_buffers

Ella Stanforth (13):

- vulkan: add plane aspect format helper
- vulkan/runtime: use nir_shader_tex_pass for ycbcr lowering
- pvr: fix transfer double stride
- pvr/csbgen: fix packing multiple addresses
- pvr: add multiplanar format support
- pvr: handle packing texstate for ycbcr images
- pvr: handle ycbcr swizzle
- pvr: handle plane addresses for ycbcr images.
- pvr: setup csc tables
- pvr: implement chroma swap
- pvr: workaround hardware clamping for YCBCR_IDENTITY conversion
- pvr: add ycbcr formats
- pvr: enable sampler ycbcr conversion

Emma Anholt (74):

- nir: Fix C UB in imad24_ir3 evaluation.
- nir/opt_algebraic_tests: Allow testing imad24_ir3.
- nir/opcodes: Define the mul/mad_relaxed opcodes to return poison for OOB.
- nir/opt_algebraic_tests: Allow testing mul/mad_relaxed opcodes.
- nir/opcodes: Define udiv_aligned_4 to return poison for not-aligned-4.
- nir/opt_algebraic_tests: Allow testing udiv_aligned_4.
- nir/opt_algebraic_tests: Allow testing of fdot*_replicated opcodes.
- nir/opt_algebraic_tests: Add support for expression swizzles.
- nir/opt_algebraic_tests: Remove unnecessary input_count.
- nir/opt_algebraic_tests: Move more of the base class code to be methods.
- nir/opt_algebraic_tests: Rename and use the enum result type more.
- nir/opt_algebraic_tests: Make sure we test the same inputs on BE as LE.
- nir/opt_algebraic_tests: Test !nir_fp_preserve_signed_zero behavior.
- nir/opt_algebraic_tests: Fix fuzzing levels for multi-component inputs.
- nir/opt_algebraic: Fix a bit of imad24_ir3's optimization.
- nir/opt_algebraic_tests: Fix leak of the variable conds ht.
- nir/opt_algebraic_tests: Fix annotating uint values.
- nir/opt_algebraic_tests: Initialize an obvious dummy value for all defs.
- ci: Skip dEQP-VK.wsi.direct_drm.
- vulkan/wsi/display: Rename XCB RandR functions to mention "randr"
- vulkan/wsi/display: Add some super useful debug messaging.
- wsi/display: Fix up the swapchain init error paths.
- vulkan/wsi/display: Avoid holding drm master for the device's fd.
- isaspec: Print a useful error for an assert I hit.
- isaspec: Improve debug info for extractor_fallback().
- isaspec: Print the bitset we're processing when missing a field.
- isaspec: Print the bit number when just a single bit is undefined.
- ir3/tests: Print a helpful bit number on re-assembly failures.
- nir,spirv: Add support for SPV_QCOM_image_processing.
- ir3: Refactor bindless tex src info collection.
- ir3: Add support for VK_QCOM_image_processing opcodes.
- tu: Implement VK_QCOM_image_processing.
- vulkan/wsi: Add some comments about how the vblank/flip sequencing happens.
- wsi/display: Delete dead vblank_handler path.
- vulkan/wsi: Delete ancient libdrm support for the page flip handler.
- nir: Bump test timeouts.
- tu: Add support for VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 color attachments
- lima/ci: Remove erroneous skips.
- ci/freedreno: Clean up some expectations for the nightlies.
- ci/intel: Clean up some expectations for the nightlies.
- ci/tu: Skip more subgroup.clustered vector tests pre-merge.
- ci/tu: Clear stale xfails from the nightlies.
- ci: Add some flakes that I tripped over when test groups got reshuffled.
- ci/deqp-runner: Drop silly catting of flakes/skips files together.
- ci/deqp-runner: Bump to 0.23.2 for single-threaded and vkd3d support.
- ci/deqp-runner: Enable a common single-threaded test list.
- ci/vulkan: Enable dEQP-VK.wsi.direct_drm testing.
- ci/rpi4: Move OOM-causing test skips to the single-thread list.
- ci/tu: Move vkd3d-proton testing from nightly to pre-merge.
- ci/zink: Skip ext-no-config-context for now, due to taking out the X server.
- freedreno/a5xx: Convert a bunch of LO/HI regs to 64-bit regs.
- docs/xlib: Drop docs about long-dead X features.
- docs/xlib: Stop calling the fake GLX xlib frontend the most mature of sw.
- docs/helpwanted: Rewrite this page.
- docs/debugging: Drop this page.
- docs/perf: Drop a bunch of old hints on getting faster GL1 swrast.
- docs/shading.rst: Drop a bunch of old docs about the GLSL compiler.
- docs: Rename the "shading" page to "GLSL" since it's just that.
- docs/index: Move the Xlib software GL driver with the other drivers.
- docs: Conformance is done through SFC now, not SPI.
- docs/systems: Minor touch-ups from reading the page.
- ci/vulkan: Single-thread non-drm VK_KHR_display testing as well.
- ci/radv: Add some flakes I hit while testing WSI.
- ci/radv: Enable WSI testing.
- ir3/ra: Clean up the ra_ctx_dump() output a bit.
- ir3/ra: Fix DOUBLE_ONLY limit pressure computation.
- ir3/ra: Assert that our calculated pressures aren't bigger than the reg file.
- freedreno/crashdec: Print an error instead of crashing on fopen() fail.
- nir: Do NIR_DEBUG=print under a lock.
- vulkan/wsi/display: Don't re-probe connectors in between hotplugs.
- vulkan/wsi/display: Check with an atomic commit if the swapchain fails.
- ir3: Fix shared IMAD24 lowering.
- tu: Add capture/replay for sparse buffers and descriptor buffer.
- screenshot-layer: Fix leftover VK queues in the map at DeviceDestroy.

Emre Cecanpunar (1):

- aco: drop optimizer peephole TODO comment

Eric Engestrom (124):

- VERSION: bump to 26.1
- docs: reset new_features.txt
- docs/releasing: s/pull request/merge request/
- docs/releasing: rephrase sentence about not letting the mr label script run
- docs/releasing: skip ci when creating the branchpoint
- docs: update calendar for 26.0.0-rc1
- pick-ui: update for python 3.14 support
- pvr/ci: document fixed tests
- pvr/ci: sort expectations
- pvr/ci: document last night's flakes
- docs/release-calendar: add 26.1 branchpoint and dates
- hk: enable VK_EXT_present_timing
- nir/meson: fix cpp_args of nir_opt_algebraic_pattern_tests
- docs: update calendar for 26.0.0-rc2
- Revert "meson: static link spirv-tools for darwin"
- docs: update calendar for 26.0.0-rc3
- pvr/drm-shim: avoid trying a random bvnc by default
- pick-ui: add \`Backport-to: \*\` as a synonym to \`Cc: mesa-stable\`
- pvr/ci: rename deqp suite to a less generic name
- pvr/ci: simplify the renderer string check
- ci: split get_job_seconds() computation out of get_current_minsec() formatting
- ci: sync time domains
- mr-label-maker: label wsi files that have a label
- Revert "add VK CTS validation report for a0 interpolation fix"
- bin/gen_release_notes: fix support for python 3.14
- docs: add release calendar for the 26.0.x cycle
- docs: add release notes for 26.0.0
- docs: add sha sum for 26.0.0
- ci/deqp-runner: also limit the number of test log and caselist files
- mr-label-maker: add label CI to bin/ci/*
- docs/precompiled: modernize distro documentation
- docs/precompiled: document debug packages
- ci: close individual build sections by default
- ci/vkd3d: drop duplicate submodule update
- ci/vkd3d: fetch only the desired commit
- ci/vkd3d: drop separate build directory
- ci/vkd3d: drop no-longer-needed file
- ci/vkd3d: ensure test-runner.sh has the right mode
- ci/vkd3d: keep tests/ directory structure
- ci/vkd3d: only build and install the files we actually want
- ci/vkd3d: update tags
- ci-tron: add job template for the x86_64 video test image
- radv/ci: add vulkan fluster job on navi48
- marge/pipeline_message: print job status for jobs still running or waiting for manual action
- marge/pipeline_message: consider any job that hasn't succeded as problematic
- marge/pipeline_message: print details for any pipeline that hasn't succeeded
- ci: drop redundant MESA_IMAGE
- nvk+zink/ci: update fails & flakes for nightly jobs
- ci/build: include rusticl in debian-build-x86_64
- nvk+zink/ci: add rusticl testing
- docs: update calendar for 26.0.1
- docs: add release notes for 26.0.1
- docs: add sha sum for 26.0.1
- docs/linkcheck: ignore a few more websites that don't allow linkcheck
- Revert "ci/gitlab_gql: disable schema fetch"
- etnaviv/ci: update expectations
- i915/ci: update expectations
- r300/ci: update expectations
- nvk/ci: update expectations
- freedreno/ci: update expectations
- anv/ci: update expectations
- venus/ci: update expectations
- radv/ci: document recent flakes
- radeonsi/ci: document recent flakes
- broadcom/ci: document recent flakes
- etnaviv/ci: document recent flakes
- freedreno/ci: document recent flakes
- anv/ci: document recent flakes
- nvk/ci: document recent flakes
- llvmpipe/ci: document recent flakes
- lavapipe/ci: document recent flakes
- docs/linkcheck: ignore one more website that doesn't allow linkcheck
- docs: update link to ubuntu's debug symbols documentation
- etnaviv/ci: fix expectation
- anv/ci: document more flakes
- etnaviv/ci: fix expectation
- freedreno: fix a few missed afuc -> qrisc renames
- docs: update calendar for 26.0.2
- docs: add release notes for 26.0.2
- docs: add sha sum for 26.0.2
- ci: let shader-db run on regular runners
- ci: yaml-toml-shell-test runs on generic runners, not hw farm runners
- ci: drop workaround for jobs not being created in fork pipelines
- ci: changing .gitlab-ci.yml itself also means the container jobs must exist
- docs: update calendar for 26.0.3
- docs: add release notes for 26.0.3
- docs: add sha sum for 26.0.3
- ci: fix rebase mistake
- ci: fix scheduled pipelines
- docs: fix release calendar
- meson: flip python check to avoid nesting conditions in the next commits
- meson: move mako lib check inside python check
- meson: move yaml lib check inside python check
- Revert "meson: Fix build break on f43, gentoo, etc"
- docs: fix various pep8 issues
- docs: replace html redirects with http redirects
- docs: delete now-unused html_redirects extension
- freedreno/ci: document regressions
- llvmpipe/ci: document regressions
- nvk/ci: give more time to nightly job nvk-ga106-vkcts-valve
- radv/ci: document recent flakes
- radeonsi/ci: document recent flakes
- vc4,v3d/ci: document recent flakes
- turnip/ci: document recent flakes
- nvk/ci: document recent flakes
- zink+radv/ci: document recent flakes
- zink+lvp/ci: document recent flakes
- ci: vmware farm is offline, stop using it
- ci: abort init-stage2.sh early if install dir is missing
- ci/init-stage2: symlink install dir between both CI_PROJECT_DIR paths
- ci: drop redundant existance check before \`rm -rf`
- ci: always make sure the results dir is created, not just when changing its path
- ci: only clean the artifacts folder if gitlab hasn't already done it
- ci-tron: ensure the test jobs start with a clean job folder
- docs: update calendar for 26.0.4
- docs: add release notes for 26.0.4
- docs: add sha sum for 26.0.4
- Revert "ci-tron: ensure the test jobs start with a clean job folder"
- VERSION: bump for 26.1.0-rc1
- .pick_status.json: Update to 806fcc6193e305c22366baa17ccf88c8e1da1bda
- VERSION: bump for 26.1.0-rc2
- .pick_status.json: Update to d4d7055aee547f452689f8165e21ca100869e6fe
- VERSION: bump for 26.1.0-rc3
- .pick_status.json: Update to 2b9e491b6789f60a7993cc9b74fe5ac7fa60c9c5

Eric Guo (2):

- panfrost: Fix NULL pointer dereference in panfrost_emit_images
- panfrost: disable round_to_nearest_even for NEAREST samplers

Eric R. Smith (15):

- mesa: do not unbind general point when different indexed points are deleted
- pan: add some missing formats to pan_nir_lower_framebuffer
- panfrost: optimize blending with DST_ALPHA when there is no alpha
- panfrost: remove I8_UNORM from the blendable format table
- panfrost: adjust format in blend shaders
- panfrost: blending fixes for Midgard
- pco: fix a typo in the check for optimization looping
- panfrost: fix texel buffer calculations
- panfrost: fix typos in architecture detection
- panfrost: add sysval for number of samples
- panvk: store number of samples in unused bits in the attribute descriptor
- pan: change image2DMSArray lowering to use Z instead of Y
- panvk: remove a redundant conditional
- panfrost: make sure INDEX_OFFSET is cleared
- panfrost: add helper function for checking for active queries
|
||
|
||
Erico Nunes (3):
|
||
|
||
- Revert "ci: lima farm maintenance"
|
||
- lima: add support for srgb framebuffers
|
||
- lima: add support for srgb textures
|
||
|
||
Erik Faye-Lund (56):
|
||
|
||
- pvr: drop needless include
|
||
- pvr: avoid needless dispatches in powervr winsys
|
||
- pvr/srv: query full pvr_device_info on winsys init
|
||
- pvr/srv: define per-arch winsys-ops
|
||
- pvr: prepare pvr_winsys_render_ctx_create_info for multi-arch
|
||
- pvr: prepare pvr_winsys_compute_ctx_create_info for multi-arch
|
||
- gallium/aux: do not hard-code linear interpolation
|
||
- gallium: make needless linear interpolation optional
|
||
- panfrost: expose the prefer_persp cap
|
||
- v3d: move a failure to a flake
|
||
- pan/ci: mark new xfails
- pan/ci: skip a few more slow tests
- pan/ci: clean up t720 expectations
- panfrost: group image load/store flags a bit
- panfrost: also check for PAN_BIND_STORAGE_IMAGE
- panfrost: expose EXT_shader_image_load_store
- pvr: add basic volcanic hw-definitions
- pan/ci: add missing xfails from nightly run
- pan/ci: update list of DRM-related skips
- pan/ci: add missing t720-flakes
- pan/ci: g720 and t720 isn't the same
- pan/ci: add some more flakes
- pan/ci: correct syntax for flakes
- pan/ci: update missed expectation
- pan/ci: add a slow test to the skip-list
- gallium/dri: set LIBVA_DRIVERS_PATH in devenv
- gallium/va: set up symlinks in build-dir
- panvk: fixup v7 check
- pan/lib: kill compiler-warning
- pan/lib: harmonize default-case handling
- pan/ci: update traces result
- util: add common ycbcr coefficient math code
- compiler/nir: use common ycbcr math
- vulkan: use common ycbcr code
- gallium/vl: rename scale/bias variables
- gallium/vl: do not adjust matrix twice
- gallium/vl: use common ycbcr helpers
- pan/genxml: remove non-existent YUV Enable for AFRC
- pan/lib: do not try to use stencil-aspect of color attachment
- pan/lib: set srgb-flag for afrc render-targets
- pan/lib: divide extent by tile-extend, not itself
- pan/lib: drop redundant assign
- panvk: fix incorrect sorting
- panvk: advertise wsi maintenance extensions
- pan/ci: move flake from fails to flakes file
- panvk: remove unused flag
- docs/panfrost: fix heading-levels
- pan/ci: update expectations
- panvk: drop out-of-date TODO
- pan/lib: fix up afbc and linear layout
- pan/lib: emit high bits of buffer-size
- nouveau: do not report unsupported feature
- radeonsi: remove old, unsupported cap
- panvk: do not enable extension without required feature
- panvk: do not enable extension without required feature
- dri: deprecate post-processing dri-confs

Erik Kurzinger (2):

- wsi/display: retrieve monitor name from EDID
- wsi/display: retrieve monitor size from EDID

Faith Ekstrand (195):

- nvk: Enable ZPASS_PIXEL_COUNT in draw_state_init()
- nak: Make OpF2F take a F16v2 source
- nak: Use .xx swizzles for f2f.32.16
- nir/lower_bool_to_bit_size: Use the correct num_components for conversions
- nir/lower_bool_to_bit_size: Make smarter canonicalization choices
- pan/bi: Run lower_alu_width after opt_algebraic_late
- pan/bi: Add support for unpack_32_4x8
- pan/bi: Add support for unpack_64_2x32
- pan/bi: Stop calling nir_lower_pack()
- pan/bi: Don't attempt to fuse AND(ICMP, ICMP) if the AND is swizzled
- pan/bi: Vectorize comparisons
- pan/bi: Set lower_vector_cmp
- pan/bi: Allow vector booleans
- panvk: Don't emit storage descriptors for compressed views
- pan/texture: ASTC is not allowed for storage
- pan/clear: Stop packing undefined bits in colors
- pan: Add a new framebuffer abstraction
- nir/print: Add panfrost blend intrinsics
- nir/gather_info: Add support for panfrost tile load/store intrinsics
- pan/fb: Add a common FB load shader builder
- pan/fb: Add a mapping to the old FB info
- panvk/csf: Use a panvk_rendering_state temp variable
- panvk: Add and use a new pan_ptr_offset() helper
- panvk: Return frame shader DCDs and modes from cmd_fb_preload()
- panvk: Stop doing the pre/post DCD offsetting in common code
- panvk: Memset fb state to zero
- panvk: Drop all the crc_valid stuff
- panvk/jm: Emit FRAGMENT_JOB ourselves
- util/format: Place PAN_FORMAT_FOO_START after the first format
- panvk: Refactor resolveMode handling
- pan/desc: Set clean_tile.write_zs for interleaved Z/S stencil clears
- panvk: Use thew new pan_fb_layout for setting up attachments
- panvk: Use the new FB code for tile size selection
- panvk: Use the new structs to re-populate fbinfo in force_fb_preload()
- panvk/csf: Use the new structs in prepare_incremental_rendering_fbinfos()
- panvk: Add a version of fb_preload which takes the new structs
- panvk/csf: Use the new pan_fb code for emitting FBDs
- panvk/jm: Re-generate FB info in cmd_preload_fb_after_batch_split()
- panvk/jm: Use the new pan_fb code for emitting fragment jobs
- panvk: Create both Z/S descriptors, even for separate Z/S
- panvk/preload: Stop assuming 32 registers
- panvk: Switch to the new preload shader framework
- panvk/preload: Populate our own texture descriptors
- panvk/trace: Trace using pan_fb_layout instead of info
- panvk: Use pan_fb_load to decide when we have a clear
- panvk: Drop pan_fb_info
- panvk/jm: Refactor BeginRendering()
- panvk/jm: Rework load/store/spill
- panvk: Insert a pipeline barrier if we have any FB loads
- panvk: Also load output attachments with LOAD_OP_NONE+STORE_OP_NONE
- panvk: Use partial FB preloads to deal with alignments
- pan/desc: Pass image views directly to attachment helpers
- pan/desc: Drop cbuf_offset from emit_*_color_attachment()
- pan/desc: Set Z/S MSAA averaging mode in common code
- pan/genxml: Make sections more typesafe
- pan/fb: Fill out our own descriptors
- pan/fb: Refactor load shader building
- pan/fb: Add support for more MSAA modes in shaders
- pan/fb: Separate MSAA ops into in_bounds and border
- pan/fb: Add new shader ops for copying from RTs
- pan/fb: Add an option to only write sample0 of a render target
- pan/fb: Add a concept of resolve ops and resolve shaders
- panvk/meta: Set color_attachment_count based on bound attachments
- panvk: Always set pan_fb_layout.rt_count to at least 1
- panvk: Move cmd_fb_preload to cmd_frame_shaders
- panvk/frame_shaders: Set no_shader_depth/stencil_read
- panvk/frame_shaders: Set modes in cmd_preload_*_attachments()
- panvk/frame_shaders: Only allow preload shaders to be killed
- panvk/frame_shaders: Add support for resolve shaders
- panvk: Respect storeOp for color attachments
- panvk: Use resolve shaders for color resolves
- panvk: Also use resolve shaders for Z/S
- pan: Add a pan_format_supports_hw_blend() helper
- panvk: Optimize resolves if possible
- pan/ci: Mark couple of WSI crashes as flakes
- util/format: Add a util_format_get_depth_bits() helper
- pan/fb: Improve depth format asserts
- pan/desc: Pass emit_*_attachments args through a struct
- pan/fb: Set reverse_issue_order when needed
- panvk: Avoid direct MSAA resolves to AFBC on v6 and earlier
- pan/fb: Figure out clean tile enables up-front
- pan/fb: Set clean_tile_write for mismatched superblock/tile sizes
- pan/fb: Force pre-frame shaders to ALWAYS for clean tiles
- panvk: Relax ms2ss afbc disablement
- etnaviv: Call lower_bool_to_int32 not to_bitsize
- nir/lower_bool_to_bitsize: Make all bN_csel sources match
- nir,panfrost: Move lower_bool_to_bitsize to panfrost
- pan/bi: Be more careful about bit sizes in b2f lowering
- pan/bi: Delete the b32csel special case and assert sizes match
- pan/compiler: Handle store_per_view_output in collect_varyings()
- pan/compiler: Add a pan_varying_layout struct
- pan/bi: Stop pretending to support f16 gl_Position
- nir: Allow 8-bit vertex output stores
- pan: Add a pass to resize I/O load/stores as needed by the varying layout
- pan/bi: Allow 8-bit varying direct stores
- pan/bi: Use the pan_varying_layout for Vallhall+ direct varing load/store
- pan/bi: Stop lowering point size to float16 early
- Revert "nir: Add a type parameter to nir_lower_point_size()"
- pan/nir: Improve collect_noperspective_varyings_fs()
- panvk: Scrape noperspective varyings out of the FS first
- panvk: Compile shaders in pipeline stage order
- panvk: Build the VS varying layout early
- panvk/csf: Emit varying descriptors based on the VS varying layout
- pan/bi: Handle varying layout mismatches in emit_load_vary()
- panvk: Pass the varying layout from the VS to the FS compile
- pan: Add a helper for generating more compact varying layouts
- panvk: Use a new, more compact varying layout
- vulkan/render_pass: Always use separate depth/stencil layouts
- nak: Report progress from nak_nir_rematerialize_load_const()
- pan/bi: Add new FS input load intrinsics
- pan/bi: Lower FS input loads in NIR
- pan,nir: Rework converted_mem_pan intrinsics
- pan/bi: Lower VS outputs in NIR
- pan/bi: Drop lower_sample_mask_writes
- pan/bi: Drop bifrost_nir_lower_blend_components()
- nir: Consider if uses in nir_def_all_uses_*
- nir: Support primitive_id in lower_sysvals_to_varyings
- treewide: Enable lowering of primitive ID in a bunch of Vulkan drivers
- nak: Move lowering of load_*_id to lower_vtg_io.c
- nak: Add support for load_primitive_id
- vtn: Use a system value for primitive ID in fragment shaders
- panvk: Drop lower_load_fs_input
- pan/bi/ra: Dump verbose debug logging to stderr
- pan/bi: v2x16 conversions don't replicate
- pan/buffer: Add the offset to the size for buffer textures
- pan/buffer: Drop pan_buffer_view::offset
- panvk: Reduce minTexelBufferOffsetAlignment
- panvk: Rework setting dyn_buf_offsets
- panvk: Track which dynamic buffers are SSBOs
- panvk: Increase robust buffer access alignments
- panvk: Set min_ubo/ssbo_alignment in spirv_options
- pan/bi: Always vectorize UBO access
- panvk: Replace robust2_modes with robust_modes
- pan/bi: Vectorize SSBOs when not robust
- pan/bi: Allow 64-bit vectors in bi_make_vec_to()
- pan/bi: Handle 64-bit sources in bi_alu_src_index()
- pan/bi: Properly handle large 8-bit vectors in bi_alu_src_index()
- pan/bi: Move nir_op_mov handling to the top
- pan/bi: Handle pack_*_split with vecN
- pan/bi: Unify handling of pack_*
- pan/bi: Unify handling of unpack_*
- pan/bi: Simplify unpack_64_2x32_split_*
- pan/bi: Rework mem_vectorize_cb
- nir: Add sampler and resource heap system values
- nir: Add intrinsics for descriptor heaps
- nir: Add tex sources for descriptor heaps
- spirv: Improve the error message for invalid SPIR-V sections
- spirv: Add new SPV_KHR_descriptor_heap Builtins
- spirv: Handle OpTypeBufferKHR
- spirv,vulkan: Implement OpConstantSizeOfKHR
- spirv: Handle ArrayStrideIdKHR and OffsetIdKHR decorations
- spirv: Handle OpBufferPointerKHR
- spirv: Mark DescriptorHeapKHR as implemented
- vulkan: Rename some VK_EXT_descriptor_buffer properties
- vulkan: Add a lowering pass for descriptor heap mappings
- vulkan: Support descriptor heaps in vk_nir_convert_ycbcr()
- vulkan/pipeline: Allow compiling compute/rt pipelines with a NULL layout
- vulkan/shader: Call vk_nir_lower_descriptor_heaps()
- vulkan: Add a vk_hash_descriptor_heap_mappings() helper
- vulkan/pipeline: Reorder vk_pipeline_precomp_shader_deserialize()
- vulkan/pipeline: Call vk_nir_lower_descriptor_heaps()
- vulkan: Add a common implementation of GetPhysicalDeviceDescriptorSizeKHR
- vulkan: Add a no-op implementation of [Un]RegisterCustomBorderColor()
- pan/bi: Support more swizzle aliases in the bifrost pack code
- pan/bi: Delete a few instruction encodings
- pan/bi/ra: Allow offsets on tied sources
- pan/bi: Add a bi_swizzle_from_half() helper
- pan/bi: Compose swizzles in bi_half() and bi_byte()
- pan/bi: Use bi_half() for texture MS indices
- pan/bi: Return void from bi_swizzle_to_byte_channels()
- pan/bi: Add a bi_swizzle_from_byte_channels() helper
- pan/bi: Add a bi_try_compose_swizzles() helper
- pan/bi: Add a bi_op_supports_swizzle() helper
- pan/bi: Add a lowering pass for MKVEC and SWZ
- pan/bi: Always use SWZ.v4i8 in bi_lower_swizzle()
- pan/bi: Stop lowering swizzles on mkvec and swz
- pan/bi: Emit MKVEC directly
- pan/bi: Add bytewise copy propagation
- pan/bi: Pack 8-bit vec2s
- pan/bi: Vectorize 8-bit ops up to v4i8
- pan/bi: Delete BI_SWIZZLE_1123
- pan/bi: Add BI_SWIZZLE_NONE
- pan/bi: Support all the swizzles in the packer
- nir: Add a couple is_zero() helpers
- pan/bi: Use nir_src_is_zero()
- pan/bi: Handle arbitrary size constants
- pan/nir: Stop doing manual optimization after resize_varying_io
- pan/nir: Stop being so conservative about phi scalarizing
- pan/nir: Use minimum-width constants instead of scalar
- pan/bi: Simplify extract_i8 handling
- nir: Add a nir_alu_src_comp_as_uint() helper
- pan/bi: Handle vector 16-bit extract_[ui]8
- pan/bi: Vectorize more conversions
- panvk/csf: Emit INDEX_BUFFER[_SIZE] even for non-indexed draws
- zink: Assert if we try to use a dedicated allocation with offset > 0

Felix DeGrood (4):

- intel/tools: intel_measure.py correctly parse cmdbuf-only data
- intel/tools: intel_measure.py avoid early exit on corrupted data
- anv: report correct format for depth/stencil blorps in utrace
- intel/decoder: update warning message when buildtype=release

Francisco Jerez (17):

- intel/isl: Define ISL_AUX_STATE_COMPRESSED_HIER_DEPTH aux state.
- iris/gfx12.5: Allocate indirect color state for depth surfaces.
- iris/gfx12.5+: Keep HIZ_CCS aux usage while sampling from resolved depth surfaces.
- anv/gfx12.5: Allocate indirect color state for depth surfaces.
- anv: Use actual layout in anv_fast_clear_depth_stencil() instead of ANV_IMAGE_LAYOUT_EXPLICIT_AUX.
- anv/gfx12.5: Can't fast clear multisampled Z/S with HIZ CCS WT aux usage.
- anv/gfx12.5: Resolve depth during layout transitions from ISL_AUX_STATE_COMPRESSED_HIER_DEPTH.
- anv/gfx12.5: Infer ISL_AUX_STATE_COMPRESSED_HIER_DEPTH from anv_layout_to_aux_state().
- anv/gfx12.5+: Keep HIZ_CCS aux usage while sampling from depth surfaces.
- intel/measure: Define snapshot type for HiZ partial resolves.
- intel/blorp: Add support for partial resolves of HiZ-CCS surfaces.
- intel/isl: Teach ISL about HIZ CCS partial resolves.
- anv/gfx12.5: Take advantage of partial resolves in depth layout transitions.
- anv/gfx12.5: Apply HIZ-CCS resolve TC flush on full resolves for all gfx12.5.
- iris/gfx12.5: Apply HIZ-CCS resolve DC flush after full resolves for all gfx12.5.
- intel/isl: Add unit tests for ISL_AUX_STATE_COMPRESSED_HIER_DEPTH.
- iris: Rework iris_sample_with_depth_aux() into helper that returns aux usage.

Francois Coulombe (1):

- vulkan/wsi/headless: add sRGB swapchain format support

Frank Binns (14):

- docs/pvr: fix some typos and wording
- docs/pvr: some minor improvements
- pvr/ci: document some recent flakes
- pvr: remove asserts in pvr_get_image_subresource_layout()
- pvr/ci: update fails to remove two tests that have started passing
- pvr/ci: move some timing out tests from fails to skips
- pvr: Fix alloc callbacks usage when freeing frame buffers
- zink: add renderonly scanouts handling
- zink: add a winsys library exposing renderonly screen creation
- kmsro: wire Zink up as a fallback
- pvr: re-enable fullDrawIndexUint32
- pvr: re-enable multiDrawIndirect
- pvr: re-enable depthBiasClamp
- pvr: re-enable wideLines

GKraats (1):

- crocus: Fix shader precompilation on Gen6 and higher

Georg Lehmann (297):

- radv/gfx11: add a RADV_PERFTEST flag to expose bfloat16 cmat
- aco/gfx12: use 64bit add/sub to swap sgprs
- nir/opt_algebraic: optimize f2f16_rtz(b2f(a))
- nir/opt_algebraic: optimize f2f16_rtz(min/max)
- nir/opt_algebraic: remove f2f16 roundtrip conversions
- nir/opt_algebraic: optimize f2f16_rtz of bcsel with constants
- nir/opt_algebraic: optimize bcsel to hi 16bits with undef lo
- aco/isel: optimize pack_32_2x16_split(undef, const)
- aco/optimizer: fix parsing salu p_insert as shift
- aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for gfx6/7
- aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for gfx8+
- aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for salu
- ac/nir/lower_ps_late: CSE partial packed exports
- ac/nir/lower_ps_late: emit scalar f2f16_rtz for when one half of a packed export is undef
- aco: fix demote in header of single iteration loop
- aco: add a helper function for non supported DPP opcodes
- aco/optimizer: use opcode_supports_dpp
- aco: disable DPP for rev integer subs and shifts
- aco/validate: allow dpp with scalar src1 on gfx11.5+
- aco: undo operand swap if applying DPP fails
- aco: don't convert VOP3P to VOP3 when applying DPP
- aco/ra: don't move sgpr into v_fmac_f32_dpp src0
- aco: apply DPP with scalar src1 on gfx11.5+
- aco/optimizer: allow DPP with scalar src1 in alu_opt_info_is_valid
- aco/optimizer: rework how dpp is applied
- aco: only apply DPP with 3 or less uses
- aco: allow v_cmpx with DPP
- spirv: move NoContraction handling into vtn_handle_fp_fast_math
- spirv: handle fast_math for opencl opcodes
- spirv: use base type instead of bit size to determine fp_math_ctrl
- spirv: consider both source and dest type for fast math
- spirv: remove vtn_builder::exact
- spirv: assert fp_math_ctrl was reset after use
- nir/opt_algebraic: use correct syntax to create exact fsat
- nir/algebraic: terminate opcode regex
- nir/algebraic: remove manual opcode validation
- nir/opt_algebraic: use contract instead of inexact for more patterns
- nir/opt_algebaric: improve a < 0.0 ? 0.0 : sqrt(a) pattern
- nir/opt_algebraic: optimize pack_half_2x16_rtz of float converted from 16bit
- nir/search: gather union of all fp_math_ctrl
- nir/search: preserve nan/inf/sz if any alu in a replaced expression did
- nir/opt_algebraic: rework ignore_exact to work like other internal conditions
- nir/algebraic: remove ability to create Value from Expression
- nir/algebraic: make subexpression inexact on creation
- nir/opt_algebraic: optimize unpack_32_2x16 of extract
- nir/lower_alu_width: emit f2f32 for unpack_half_2x16
- nir/lower_tex: use f2f32 instead of unpack_half
- nir/opt_16bit_tex_image: remove unpack_half support
- nir/format_convert: use f2f32 instead of unpack_half
- nir/opt_algebraic: remove unpack_half_2x16_split
- aco: remove unpack_half support
- ac/llvm: remove unpack_half support
- nak: remove unpack_half support
- kk/compiler: remove unpack_half support
- broadcom/compiler: use f2f32 when lowering image load
- broadcom/compiler: remove unpack_half support
- asahi/compiler: remove unpack_half support
- brw/lower_storage_image: use f2f32 instead of unpack_half
- brw: remove unpack_half support
- elk/lower_storage_image: use f2f32 instead of unpack_half
- elk: remove unpack_half support
- microsoft/compiler: switch to a backend specific unpack half opcode
- nouveau/codegen: remove support for unpack_half
- panfrost/compiler/bi: remove unpack_half support
- r600/sfn: implement minimal 16bit f2f32 support
- r600/sfn: lower unpack_half to f2f32
- r600/sfn: remove unpack_half support
- nir: remove split unpack_half opcodes
- aco: clean up emit_extract_vector a bit
- aco/optimizer: repeat vector of split opt
- aco/optimizer: don't remove label_extract for splits
- aco: improve emit_extract_vector for vector of vecs
- aco/isel: split vector into dwords/words first
- aco/isel: avoid extracts for continuous alu src components
- aco/optimizer: apply further extracts to v_cvt_f32_ubyte
- aco/optimizer: apply byte p_split_vector as extract
- aco/optimizer: add second copy prop for pseudo instructions
- aco/optimizer: only copy propagate p_split_vector if it can be eliminated
- aco/optimzer: apply extract with any uses
- aco/optimizer: use nan preserve flag to prevent incorrect med3
- spirv: use nan/inf preserve instead of exact for fp compare
- spirv: use nan/inf preserve for glsl.std.450 min/max instead of exact
- mesa/prog_to_nir: use nan/inf preserve instead of exact for kill's flt
- gallium/ttn: use nan/inf preserve instead of exact for kill's flt
- radv/nir/rt: preserve inf/nan for emulated RT intersect
- nir/format_convert: use nan/inf preserve flag for fmax instead of exact
- nir/lower_double_ops: don't create more exact ops than the input requires
- nir/lower_uniform_subgroup: use nan/inf preserve instead of exact for feq
- glsl: preserve inf/nan for precise/invariant
- glsl: make fp (not) equal always nan/inf preserving
- glsl: make fmin/fmax/fsat nan/inf preserving
- nir/search: add option to set nan/inf/sz preserve on replacement patterns
- nir/opt_algebraic: use nan/inf/sz preserve flags instead of exact for cmp/min/max replacement
- nir/opt_algebraic: mark newly created fmulz nan/inf preserving
- nir: remove special fp_math_ctrl rules
- nir/opt_algebraic: remove inexact a * 0.0 patterns
- nir/opt_algebraic: add a - a with nnan
- nir/opt_algebraic: mark fmulz(finite, finite) -> fmul pattern as nsz
- nir/opt_algebraic: remove inexact from floor->trunc pattern
- nir/opt_algebraic: make pattern pushing fmul into bcsel exact
- nir/opt_algebraic: make 1.0 - fsat(a) -> fsat(1.0 - a) pattern exact using nnan
- nir/opt_algebraic: remove redundant patterns with fcmp(fneg(...), #c)
- nir/opt_algebraic: use better float control for some fcmp patterns
- nir/opt_algebraic: optimize flt/fge(#c, fadd(a, #b))
- nir/opt_algebraic: update flt -> fneu patterns
- nir: make alu fp_math_ctrl helpers const
- nir/opt_algebraic: make bcsel(fcmp(b, a), b, a) -> fmin/fmax patterns exact
- nir/opt_algebraic: make some fmin/fmax/fsat patterns exact using nsz/nnan
- nir/opt_algebraic: make some more fcmp patterns exact using nnan
- nir/opt_algebraic: make ffract(is_integral) exact using nnan
- nir/opt_algebraic: make a < 0.0 ? -a : a exact using search helpers
- nir/opt_algebraic: make fcmp(a+b, 0.0) -> fcmp(a, -b) exact using ninf
- brw/nir_lower_fsign: try to fix NaN correctness
- nir/algebraic: allow inexact optimizations with sz/inf/nan preserve
- aco/optimizer: stop checking precise for med3
- ci: skip invalid float_control2 tests
- ci: update trace checksums
- anv/ci: add cross signed zero expected fails
- aco/lower_branches: consider jump target of conditional branches based on vcc
- aco: handle all SALU that modifies PC in needs_exec_mask
- aco/opt_postRA: don't optimize across calls
- aco: remove redundant can_use_DPP declaration
- nir/opt_algebraic: remove few uses of integer nir_analyze_range
- nir: remove non float nir_analyse_range support
- nir: rename nir_analyze_range because it's float only
- nir: let nir_analyze_fp_range take a nir_def
- nir/opt_algebraic: optimize 1.0 - b2f(a) to b2f(inot(a))
- ci: disable debian-ppc64el and debian-s390x
- zink: do not check type when emitting fp_fast_math_mode
- nir/serialize: omit serializing fp_math_ctrl if it has to be 0
- nir/opcodes: remove valid_fp_math_ctrl bits from some opcodes
- nir/opt_algebraic: preserve signed zero when creating new b2f
- nir/opt_algebraic: create more b2f if sign of zero doesn't matter
- vulkan,spirv: update headers
- nir: add mixed float dot opcodes
- spirv: implement SPV_VALVE_mixed_float_dot_product
- aco: mixed float dot product opcodes
- aco/ra: create v_dot2c_f32_f16
- aco: allow modifiers for fp16 dot
- aco: allow dpp for fp8/bf8 dot4
- ac/llvm: implement mixed float dot
- radv: expose VK_VALVE_shader_mixed_float_dot_product on supported hardware
- nir/lower_subgroups: lower shuffles and bitwise reduce to 32bit before scalarizing
- aco/insert_fp_mode: don't skip setting round for fract
- nir: print all fp_math_ctrl bits
- nir/opt_algebraic: optimize b2f(a) - 1.0 to -b2f(a)
- nir/opt_algebraic: optimize d3d9 iand(a, inot(b))
- nir/opt_algebraic: generalize late fcmp(fneg(a), const) patterns
- nir/opt_algebraic: remove fcmp+fneg patterns that are cleaned up earlier
- nir/opt_algebraic: optimize all comparisons of b2f/b2i with constants
- nir_opt_algebraic: remove more specific cmp+bcsel opts
- nir/opt_algebraic: remove loops for b2f/b2i equality handling
- nir_opt_algebraic: reassociate fadd into ffma where one factor is a constant
- nir_opt_algebraic: remove unneeded is_not_const
- nir/opt_algebraic: remove is_used_once on outer instruction
- ci: update expectations
- nir/opt_algebraic: optimize b2i(a) * b to bcsel
- nir/lower_bool_to_float: assert that vector comparisons were lowered
- nir: remove fall_equal/fany_nequal opcodes
- ntt: lower vector comparisons using nir_lower_alu_to_scalar
- i915,nv30,softpipe,svga,mesa/st: remove lower_vector_cmp for tgsi backends
- kk,llvmpipe,nouveau: remove lower_vector_cmp from scalar backends
- zink: use nir_lower_alu_to_scalar to lower vector compare
- bifrost: use nir_lower_alu_width to lower vector comparisons
- etnaviv: use nir_lower_alu_width to lower vector compare
- lima: use nir_lower_alu_width to lower vector compare
- freedreno/ir2: use nir_lower_alu_width to lower vector compare
- r300: use nir_lower_alu_width to lower vector compare
- nir: remove lower_vector_cmp
- nir/opt_algebraic: remove pattern that skips iabs with range analysis
- nir/opt_algebraic: fix frsq clamp pattern
- nir/opt_algebraic: add is_not_uint_zero for b2i16(uge) pattern
- nir: rewrite fp range analysis as a fp class analysis
- nir: add fp class analysis for sin/cos
- nir: add fp class analysis for flog2
- nir: add fp class analysis for some intrinsics
- nir: add fp class analysis for shadow compare
- nir: add fp class analysis for fsub
- nir: add fp class analysis for fround_even
- aco/isel: skip min/max for SALU fsat if possible
- nir/gather_tcs_info: use nir_analyze_fp_class directly
- nir/search_helpers: switch to fp class analysis
- nir: remove nir_analyze_fp_range
- nir/search_helpers: use fp class analysis more
- nir: remove more fsat using range analysis
- nir: add fp class analysis tests
- panvk/ci: document new crashes on bifrost
- nir: add a pass to optimize fp_math_ctrl
- radv: use nir_opt_fp_math_ctrl
- nir: create more fsat using range analysis
- nir/opt_algebraic: remove min(a, >= 1.0) before fsat
- nir/opt_algebraic: skip more redundant alignment iand
- nir: don't assume indicies are always 32bit when accessing them as raw data
- nir: support intrinsic indicies larger than 32 bits
- nir: merge xfb and xfb2 into one 64bit intrinsic index
- nir: add free bits in nir_io_semantics for future use
- aco/scheld_vopd: make VOPDInfo more flexible by adding a swizzle
- aco/sched_vopd: convert fma with inline constants to fmamk/fmaak
- aco/opt_postRA: remove try_convert_fma_to_vop2
- aco/sched_vopd: create dot2acc from VOP3P dot2
- aco/ra: try to allocate registers for dot2 to allow VOPD
- aco/ra: don't tie definition when the operand is in a preserved reg
- nir: fix nir_intrinsic_copy_const_indices for large indices
- nir: add no_signed_zero flag to io semantics
- nir/opt_fp_math_ctrl: handle input/output no_signed_zero flag
- radv: set no_signed_zero for FS store_output when format doesn't care
- nir/opt_load_store_vectorize: use nir_intrinsic_has_align_mul
- nir/gather_info: use nir_intrinsic_has_io_semantics
- nir/lower_frexp: preserve fp_math_ctrl
- ac/nir/cull: make fisfinite nan/inf preserving
- nir/opt_algebraic: lower ninf fisfinite correctly
- glsl: reset fp_math_ctrl when changing it per alu
- glsl: make flt/fge/fabs/fneg inf preserving
- nir/opt_algebraic: be more strict when optimizing fcmp(a + #b, #c)
- ntt: set precise correctly for comparisons/min/max
- nir/tests: test algebraic patterns with maximum fp_math_ctrl
- nir/search_helpers: assume float sources without preserve flag can't be inf/nan
- nir/opt_algebraic: take advantage of range helpers including nnan
- nir/opt_algebraic: turn fabs(a) into fneg(a) if a is not positive
- nir/opt_algebraic: remove manual pattern that removes fmax(..., 0.0)
- nir/opt_algebraic: remove manual patterns that optimizes flt([0.0, 1.0], 0.0)
- nir/opt_algebraic: move some fsat patterns next to the other fsat patterns
- nir/algebraic/tests: invert all excluded fp_math_ctrl flags
- ir3: set progress for nir_opt_large_constants
- nir/opt_large_constants: don't add constants implemented with ALU to the constant data
- nir/opt_large_constants: set fp_math_ctrl for bit exact results
- nir/opt_large_constants: enable small constant optimization for non trivial strides
- nir/opt_large_constants: optimize small vector constant arrays
- nir/opt_large_constants: support negative small constants
- nir/opt_large_constants: handle floating point power of two fractions
- nir/opt_large_constants: only use 16bit float alu when supported
- radv/ci: update restricted trace checksum
- nir: replace lower_ldexp with has_ldexp
- nir/opt_algebraic: create ldexp from exp2
- nir/opt_algebraic: optimize more near useless bcsel
- nir: rework nir_alu_src_is_trivial_ssa to take an alu src
- nir/search: never insert movs for alu uses
- nir/opt_algebraic: optimize more fmulz(1.0, a) remains
- nir/opt_algebraic: optimize b2f(a) * b
- broadcom/ci: skip rpi4 timeout
- aco: skip fract for sin/cos on gfx6-8 if the src is already in range
- gallivm: don't optimize fadd(a, 0.0) with signed zero preserve
- gallivm: remove dead code in lp_build_add
- aco/optimizer: apply dpp to v_dot before RA for gfx10.3
- nir/opt_fp_math_ctrl: ignore ffract input sign of zero
- nir: add fp_math_ctrl as intrinsic index
- nir: add fp_math_ctrl to ddx/ddy
- nir/opt_uniform_subgroup: use ddx/ddy fp_math_ctrl
- nir/opt_fp_math_ctrl: use ddx/ddy fp_math_ctrl
- nir: add fp_math_ctrl to cmat alu ops
- spirv: set fp_math_ctrl for cmat alu
- radv: preserve fp_math_ctrl when lowering cmat alu ops
- nak: preserve fp_math_ctrl when lowering cmat
- brw: preserve fp_math_ctrl when lowering cmat alu
- lavapipe: preserve fp_math_ctrl when lowering cmat alu
- nir: add fp_math_ctrl to convert_alu_types
- aco: spill VGPRs to LDS if it doesn't further limit occupancy
- aco: allow spilling to LDS in RT shaders without stack pointer
- nir/lower_non_uniform_access: fix fusing loops for same index but different array variable
- nir/lower_tex: fix lowering 16bit textureGatherOffsets
- nir,radv: lower shadow compare gather to 16bit
- aco/isel: use s_bitcmp1 for 1bit ubfe
- nir/opt_algebraic: remove a few non 1bit bool patterns
- nir/search: remove matching variable type
- nir/opt_large_constants: optimize constant arrays with just two different values
- nir/opt_load_skip_helpers: don't skip helpers for store_scratch data
|
||
- nir/opt_algebraic: update open coded flerp(..., b2f(c)) to bcsel patterns
|
||
- nir/opt_algebraic: move some lower_lerp patterns
|
||
- aco/isel: optimize 16/64bit non constant valu bit test
|
||
- nir/opt_algebraic: create more 64bit bit test
|
||
- aco/optimizer: do not try to create 3 byte constant operands
|
||
- aco/spill: fix mixed lds+scratch spill/reload
|
||
- nir: split exact bit into no_contract/reassoc/transform
|
||
- spirv: map float control2 to fine grained nir flags instead of exact
|
||
- nir/algebraic: actually separate contract and inexact
|
||
- nir/opt_reassociate: use nir_fp_no_reassoc instead of exact
|
||
- intel/peephole_fma: use nir_fp_no_contract instead of exact
|
||
- zink/ntv: separate float control2 exact bits
|
||
- aco/tests: fix med3 NaN tests
|
||
- aco: use no contract/reassoc instead of exact
|
||
- radv: clarify that copy prop is required for correctness after D16 opt
|
||
- radv: remove point size in lowered io
|
||
- radv: do not remove point size variable
|
||
- radv: immediately remove phis after loop unrolling
|
||
- radv: call radv_optimize_nir after lowering io
|
||
- radv: remove radv_remove_color_exports
|
||
- radv: lower lowered io to scalar
|
||
- radv: do not vectorize fs out variables
|
||
- radv: do not vectorize io variables
|
||
- radv: do not remove dead variables
|
||
- radv: remove some unneeded passes from radv_nir_lower_io_vars_to_scalar
|
||
- radv: do not shrink vectors when lowering IO vars to scalar
|
||
- radv: don't lower io vars to scalar
|
||
- radv: remove lower array vars to elem
|
||
- radv: remove radv_nir_lower_viewport_to_zero
|
||
- nir: disable fp class analysis for 64bit transcendentals
|
||
- intel/nir_opt_peephole_ffma: fix fp_math_ctrl for modifiers
|
||
|
||
Gurchetan Singh (12):
|
||
|
||
- gallium: fix sometimes-uninitialized warning
|
||
- gfxstream: fixes related to -Wmissing-prototypes
|
||
- gfxstream: fix build after vk.xml update
|
||
- lavapipe: fix uninitialized variable warning
|
||
- virtio/kumquat: add safety comments
|
||
- gfxstream: explicitly assign INVALID_DESCRIPTOR
|
||
- gfxstream: meson: remove duplicate includes
|
||
- gfxstream: add vulkan_gfxstream_structure_type.h to codegen output
|
||
- gfxstream: fixes to get Fuchsia headless to compile
|
||
- gfxstream: simple compile fix
|
||
- gfxstream: cereal: fix 'None' in gfxstream codegen
|
||
- gfxstream: additional Goldfish logic for Android builds
|
||
|
||
Haixiang Tang (1):
|
||
|
||
- zink/kopper: Allow surface creation for Pixmaps (non-window surfaces)
|
||
|
||
Hans-Kristian Arntzen (15):
|
||
|
||
- vulkan/wsi: Add common infrastructure for EXT_present_timing.
|
||
- vulkan/runtime: Expose PRESENT_STAGE_LOCAL as calibrateable domain.
|
||
- anv: Add PRESENT_STAGE_LOCAL_EXT path for calibration.
|
||
- vulkan/wsi: Add no-op present timing support to most backends.
|
||
- wsi/wayland: Implement EXT_present_timing on Wayland.
|
||
- radv: Enable EXT_present_timing.
|
||
- turnip: Enable EXT_present_timing.
|
||
- anv: Enable VK_EXT_present_timing.
|
||
- nvk: Enable EXT_present_timing.
|
||
- panvk: Enable EXT_present_timing.
|
||
- vulkan/wsi: Implement QUEUE_OPERATIONS_END present timing query.
|
||
- wsi/wayland: Fix some locking quirks around present ID update.
|
||
- wsi/display: Implement present timing on KHR_display.
|
||
- wsi/common: Allow timestampValidBits < 64 for present timing.
|
||
- docs: Add VK_EXT_present_timing to new features.
|
||
|
||
Hoe Hao Cheng (1):
|
||
|
||
- zink/codegen: do not enable extensions that are fully core-promoted
|
||
|
||
Hsieh, Mike (1):
|
||
|
||
- amd/vpelib: Move feature skip after buffer size return
|
||
|
||
Hyunjun Ko (9):
|
||
|
||
- anv/video: fix a typo in Vulkan AV1 decoding.
|
||
- anv/video: Compute AV1 tile positions internally
|
||
- anv: Add dummy workload for AV1 decode on affected platforms (Wa_1508208842)
|
||
- anv/video: disable encoder on untested platforms
|
||
- anv/video: set transform skip numbers according to qp
|
||
- anv/video: set Qp passed from apps for h265 encoder
|
||
- anv/video: Handle GPB(Generalized P and B frames) properly for H265 enc.
|
||
- anv/video: set Sad Qp Lambda values properly for H265 encoder.
|
||
- anv/video: remove unsupported features for encoders
|
||
|
||
Iago Toral Quiroga (5):
|
||
|
||
- broadcom/compiler: drop unnecessary MOV
|
||
- broadcom/compiler: don't always clear undefined bits from sub-32 integers
|
||
- broadcom/compiler: optimize alu(shr(x, 16).l) to alu(x.h)
|
||
- broadcom/compiler: inform NIR scheduler about 0 cost ALU instructions
|
||
- nir/opt_vectorize_load_store: allow sizes unaligned with high offset for loads
|
||
|
||
Ian Douglas Scott (1):
|
||
|
||
- wsi/wayland: Use \`wl_fixes` to destroy \`wl_registry`
|
||
|
||
Ian Forbes (7):
|
||
|
||
- svga: Implement GL_ARB_derivative_control
|
||
- svga: Increase max_combined_shader_output_resources and SSBO limit to 16
|
||
- svga: Implement GL_ARB_conditional_render_inverted
|
||
- svga: Always emit VGPU10_OPCODE_DCL_GLOBAL_FLAGS for VGPU10
|
||
- svga: Enable GL_ARB_vertex_type_10f_11f_11f_rev
|
||
- svga: Make svga_screen::hud members atomic
|
||
- svga: Implement GL_ARB_pipeline_statistics_query
|
||
|
||
Ian Romanick (29):
|
||
|
||
- spirv: Use STACK_ARRAY instead of NIR_VLA
|
||
- nir: Use STACK_ARRAY instead of NIR_VLA
|
||
- elk: Use F16TO32 for nir_op_f2f32 of float16 source
|
||
- brw: Call nir_opt_algebraic_late in brw_nir_create_raygen_trampoline
|
||
- brw: Call nir_opt_algebraic_late later in brw_postprocess_nir_opts
|
||
- elk: Call nir_opt_algebraic_late in elk_postprocess_nir
|
||
- brw/cmod: Don't propagate from CMP to ADD if there is a write between
|
||
- elk/cmod: Don't propagate from CMP to possible Inf + (-Inf)
|
||
- elk/cmod: Don't propagate from CMP to ADD if there is a write between
|
||
- brw: Don't mark_invalid in update_for_reads for non-VGRF destination
|
||
- brw: Use brw_reg_is_arf in update_for_reads
|
||
- brw: Also check for ADDRESS file in update_for_reads
|
||
- brw/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
|
||
- elk/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
|
||
- brw/validate: Eliminate duplicate integer multiply validation
|
||
- brw/validate: Implicit read of accumulator cannot also have explicit read
|
||
- brw/validate: Perform more 3-src validation in brw_validate instead of brw_eu_emit
|
||
- brw/emit: Src1 can be accumulator on Gfx12.5 and newer
|
||
- brw: Fix encoding of accumulator sources of 3-source instructions
|
||
- brw/asm: Don't drop accumulator number in the assembler
|
||
- anv: Use different logic to isolate lowest flag in anv_foreach_vk_stage
|
||
- anv: Use u_foreach_bit
|
||
- util: Use same method to clear bits in u_foreach_bit as util_bitcount
|
||
- brw/algebraic: Allow mixed types in saturate constant folding
|
||
- brw: Handle scalars and swizzles correctly in is_const_zero
|
||
- brw/lower_regioning: Allow integer conversions in SEL
|
||
- brw: Change the flags written by some CMP
|
||
- brw/const: Don't allow type changes when accumulators are involved
|
||
- brw: brw_reg::nr for an accumulator is not part of the offset
|
||
|
||
Icenowy Zheng (39):
|
||
|
||
- pvr: preliminary EXT_image_drm_format_modifier support
|
||
- util/cpu: add a number of RISC-V extensions
|
||
- util/cpu: support detecting RISC-V FD/C/V/Zb[abs] with riscv_hwprobe
|
||
- pvr: only specially handle gfx subcmd for BeginQuery
|
||
- pvr: suppress VkDescriptorSetLayoutBindingFlagsCreateInfo ignored warn
|
||
- mailmap: map all mailboxes for Icenowy Zheng
|
||
- gallium/frontends/dri: only reserve a few bind flags for MSAA cbufs
|
||
- glsl: support adding point size to io_lowered shaders
|
||
- pipe-loader: make get_driver_descriptor() return NULL for unknown driver
|
||
- pipe-loader: fallback to zink instead of kmsro for render nodes
|
||
- vulkan/wsi/headless: properly use CPU images for CPU devices
|
||
- zink: skip all post-process when importing and resource_create fails
|
||
- pvr: re-indent pseudocode for DDMADT behavior
|
||
- glsl: initialize PSIZ variable to NULL when adding pointsize
|
||
- pco: fix encoding of fred's s0abs bit
|
||
- pvr: drop master for the display FD if it's not needed
|
||
- pvr: Align width for PBE write when creating linear image
|
||
- pvr: fix "obb" typo in oob_buffer_size when building vertex pds data
|
||
- pvr: save vertex attribute size for DMA checking
|
||
- pvr: move PVR_BUFFER_MEMORY_PADDING_SIZE definition to pvr_buffer.h
|
||
- pvr: consider the size of DMA request when setting msize of DDMADT
|
||
- vulkan/wsi/headless: properly cleanup swapchain init failure
|
||
- vulkan/wsi/headless: implement wait_for_present for swapchain
|
||
- pvr: support VK_EXT_non_seamless_cube_map
|
||
- pvr: fix dirty tracking for stencil ops
|
||
- docs: add missing bits for pvr's VK_EXT_non_seamless_cube_map
|
||
- pvr: fix pvr_clear_vdm_state_get_size_in_dw() inverted feature condition
|
||
- pvr: set has_usc_alu_roundingmode_rne for all B-series Rogue cores
|
||
- pvr: finalize query_indices array after ending last sub_cmd
|
||
- pvr: fix the code copying query_indices to sub_query_indices
|
||
- pvr: propagate get_vis_results flag from secondary cmdbuf gfx jobs
|
||
- pvr: follow other drivers' practice for copying build ID
|
||
- pvr: skip emitting query program when copy result / reset with 0 queries
|
||
- pvr: wait for graphics jobs in CopyQueryPoolResults
|
||
- pvr: increase maxPerStageResources for new maxPerStageDescriptorStorageBuffers
|
||
- pvr: do not setup deferred RTA clear for active render targets
|
||
- pvr: properly handle deferred RTA clears for 2D array view of 3D image
|
||
- pvr: add deferred RTA clear command to list after checking it's not NULL
|
||
- pvr: record deferred RTA clears for secondary cmdbuf subcmds
|
||
|
||
Iván Briano (5):
|
||
|
||
- brw: fix local_invocation_index with quad derivatives on mesh/task shaders
|
||
- anv, hasvk: handle MSAA resolving to a 3D slice
|
||
- anv: don't try to fast clear D/S with multiview
|
||
- anv: fix anv_is_dual_src_blend_equation
|
||
- brw: do not omit RT writes if dual_src_blend is on
|
||
|
||
Jakob Sinclair (6):
|
||
|
||
- pan: improve debug printing of multiple registers
|
||
- pan: move discard/kill_ssa flag after index for debug prints
|
||
- pan: add sigil to SSA values for debug printing
|
||
- pan/compiler: Do not assume split 64-bit registers in va_mark_last
|
||
- pan/compiler: Fix style formatting in lower_split_src
|
||
- pan/compiler: Use SHADDX instruction for i64 add
|
||
|
||
Jan Alexander Steffens (heftig) (1):
|
||
|
||
- kk: Fix debug printf specifier
|
||
|
||
Janne Grunau (9):
|
||
|
||
- asahi: ci: Rename asahi-g13g Vulkan CTS suite to asahi-agx2
|
||
- asahi: Add OpenGL / EGL CTS CI expectations
|
||
- asahi: Use GPU for buffer copies in resource_copy_region()
|
||
- asahi: Implement clear_buffer using libagx_fill*
|
||
- hk: Use aligned vector fill in hk_CmdFillBuffer if possible
|
||
- hk: Increase maxFragmentCombinedOutputResources to HK_MAX_DESCRIPTORS
|
||
- hk: ci: Skip timing out wsi xlib tests
|
||
- hk: ci: Handle more spurious wsi CTS timeouts/fails
|
||
- nir/gather_info: clear interpolation qualifiers only in fragment stage
|
||
|
||
Jarred Davies (2):
|
||
|
||
- pvr: Fix allocating the required scratch buffer space for tile buffers
|
||
- pvr: Add missing support for tile buffers to SPM EOT programs
|
||
|
||
Jason Macnak (4):
|
||
|
||
- gfxstream: enable VK_EXT_primitives_generated_query
|
||
- gfxstream: Fix StagingInfo destruction ordering
|
||
- gfxstream: fix submit to not hold lock when calling into encoder
|
||
- gfxstream: fix goldfish guards on fence functions
|
||
|
||
Jesse Natalie (11):
|
||
|
||
- wgl: Delete stw_pfd_flag
|
||
- wgl: Support PFD_SWAP_COPY pixel formats
|
||
- wgl: Add a driconf option to force pixel formats with GDI support
|
||
- driconf: Add a driconf entry for 文香白板 (Wenxiang whiteboard)
|
||
- d3d12: Set packed_uniforms cap
|
||
- meson: Include DirectX-Headers dependency for all VK Windows builds
|
||
- d3d12: Fix importing external resources
|
||
- wgl: Flush and wait when unbinding a context that references a swapchain
|
||
- mesa/st: Pass the context to fence_finish as part of flush+wait
|
||
- d3d12: Don't allow CPU storage for huge buffers
|
||
- wgl: Use an hwnd xor hdc for framebuffers
|
||
|
||
Jianxun Zhang (2):
|
||
|
||
- anv: Limit modifier disabling workaround to specific GTK versions
|
||
- driconf: Refactor CCS modifier disabling entry
|
||
|
||
Job Noorman (43):
|
||
|
||
- ir3/isa: attach (sat) to dst
|
||
- ir3/isa: fix shift/reduce conflict for mova.r
|
||
- ir3/parser: make bison fail on warnings
|
||
- tu,ir3: lower multiview indirect stores to register indirects
|
||
- ir3: add block_can_be_predicated helper
|
||
- ir3: don't use predication for large blocks
|
||
- ir3: update context builder after ir3_get_predicate
|
||
- ir3: don't predicate vote_all/vote_any
|
||
- ir3/legalize: don't drop sync flags on removed predt/predf
|
||
- nir/opt_varyings_bulk: add data parameter to optimize callback
|
||
- nir/opt_varyings: fix alu def cloning
|
||
- nir/gather_info: gather per_view info
|
||
- nir/gather_info: clear interpolation qualifiers before gathering
|
||
- nir/recompute_io_bases: fix num_slots for per_view outputs
|
||
- ir3: fix handle_partial_const with vectorized src
|
||
- ir3: call nir_lower_io_vars_to_temporaries for GS outputs
|
||
- ir3: call nir_io_add_intrinsic_xfb_info after IO lowering
|
||
- tu: extract NIR lowering to a separate function
|
||
- tu: use nir_opt_varyings_bulk for linking
|
||
- nir/opt_uniform_subgroup: fix ballot_bit_count components
|
||
- nir/lower_atomics: add support for bindless_image_atomic
|
||
- ir3: allow imm src0 (IBO) on bindless atomics
|
||
- ir3: support isam with less than 4 components
|
||
- ir3: add support for r64u?i image loads/stores
|
||
- ir3: add support for 64-bit image atomics
|
||
- ir3/analyze_ubo_ranges: add const_align_vec4 helper
|
||
- ir3/analyze_ubo_ranges: don't over-align consts when loaded via preamble
|
||
- ir3: simplify constlen calculation
|
||
- ir3: remove unused ir3_context::has_relative_load_const_ir3
|
||
- ir3/collect_info: remove max_const calculation
|
||
- ir3/postsched: update legalize state for terminators
|
||
- ir3: set cat6.dst_offset for ldc
|
||
- ir3/legalize: track need_ss/sy_for_const per const reg
|
||
- ir3/parser: set constlen when adding const regs
|
||
- ir3/parser: add \@constlen header
|
||
- ir3: allow shared address src for ldg.k
|
||
- ir3: add support for the ldg.k a1.x addressing mode
|
||
- ir3/isa: fix load size encoding for ldg.k
|
||
- ir3: use ldg.k load size
|
||
- ir3/shared_ra: fix live-out reload after src reload
|
||
- ir3/cf: fix rewriting uses with different dst types
|
||
- ir3/shared_ra: use ir3_cursor instead of instr in reload helpers
|
||
- ir3/shared_ra: insert reloads before tied dst pcopies
|
||
|
||
Jon Turney (1):
|
||
|
||
- ddebug: Fix use of alloca() without #include "c99_alloca.h"
|
||
|
||
Jordan Justen (19):
|
||
|
||
- intel/decoder: Use array of filenames in get_embedded_xml_data_by_name()
|
||
- intel/genxml: Rename Xe2 genxml to xe2.xml and xe2_rt.xml
|
||
- intel/genxml: Rename Xe3 genxml to xe3.xml and xe3_rt.xml
|
||
- intel/genxml: Start Xe3P (GFX_VERx10 == 350) support (xe3p.xml, xe3p_rt.xml)
|
||
- intel/genxml: Update README notes on hardware version numbers
|
||
- intel/genxml: Fix Xe3P import filenames in intel_genxml.py
|
||
- intel/genxml: Add gen125_rt.xml to default_imports in intel_genxml.py
|
||
- intel/isl: Build for Xe3P (GFX_VERx10 == 350)
|
||
- intel/shaders: Build for Xe3P (GFX_VERx10 == 350)
|
||
- iris: Build for Xe3P (GFX_VERx10 == 350)
|
||
- intel/l3: Add Xe3P (GFX_VERx10==350)
|
||
- anv: Add Xe3P (GFX_VERx10==350)
|
||
- intel/dev: Add INTEL_PLATFORM_NVL_P platform enum
|
||
- intel/dev: Split out Xe3 threads and URBs macros
|
||
- intel/dev: Add XE3P devinfo macros
|
||
- intel/tools/intel_dev_info: Verify stage_names size in print_base_devinfo()
|
||
- intel/dev: Handle Xe3P in intel_device_info_init_common() (for build tests)
|
||
- intel/dev: Add NVL-P device info
|
||
- intel/dev: Add NVL-P PCI IDs (with FORCE_PROBE required)
|
||
|
||
Jose Maria Casanova Crespo (14):
|
||
|
||
- v3dv: disable blending when logicOpEnable is set
|
||
- v3d: flush write jobs before BO replacement in DISCARD_WHOLE path
|
||
- vc4: flush write jobs before BO replacement in DISCARD_WHOLE path
|
||
- v3d: reject fast TLB blit when RT formats don't match
|
||
- v3d: simplify fast TLB blit format check
|
||
- broadcom/ci: broaden glx-copy-sub-buffer flake entry on RPi5
|
||
- broadcom/common: fix V3D 7.1 TFU ICFG IFORMAT values
|
||
- broadcom/common: add tile alloc block size macros and sizing helper
|
||
- v3d: use shared v3d_tile_alloc_sizes() and 128B initial blocks
|
||
- v3dv: use shared v3d_tile_alloc_sizes() and 128B initial blocks
|
||
- v3dv: defer tile_alloc creation in meta TLB ops
|
||
- broadcom/compiler: really enable branch in delay slots validation
|
||
- broadcom/compiler: MULTOP in branch delay slots doesn't generate RTOP hazard
|
||
- broadcom/compiler: move nir_lower_undef_to_zero out of optimization loop
|
||
|
||
José Expósito (2):
|
||
|
||
- winsys/amdgpu: Fix userq job info log on PPC
|
||
- venus: Fix error log on PPC
|
||
|
||
José Roberto de Souza (22):
|
||
|
||
- intel/dev: Remove INTEL_DEVICE_INFO_MMAP_MODE_XD
|
||
- intel/dev: Remove INTEL_DEVICE_INFO_MMAP_MODE_UC
|
||
- intel/dev: Improve PAT entries comment
|
||
- anv: Move anv_bo_get_mmap_mode() to i915 backend
|
||
- intel/dev: Add INTEL_DEVICE_INFO_MMAP_MODE_INVALID
|
||
- intel/isl/gfx12.5: Allow hierarchical depth buffer write through for multi sampled surfaces
|
||
- intel/brw: Add BRW_DEPENDENCY_INSTRUCTIONS invalidation when instructions are added or removed in brw_opt_split_virtual_grfs()
|
||
- iris: Fix invalid reads when uploading blend state
|
||
- intel/brw: Use computed push constants size in brw_assign_urb_setup()
|
||
- intel/brw: Add and call brw_lsc_supports_base_offset() in places that check for support of this feature
|
||
- intel/perf: Add HSW verx10 to intel_perf_query_result_write_mdapi()
|
||
- intel/dev: Add URB min/max entries for Mesh and Task
|
||
- intel/dev/xe3p: Add min URB entries for task and mesh shaders
|
||
- anv: Fix CmdResetEvent2() with RESOURCE_BARRIER::Wait stage == none
|
||
- anv: Remove asserts() added in resource_barrier_wait_stage()
|
||
- anv: Always have a valid Resource barrier::Wait stage set
|
||
- anv: Fix invalid resource barrier signal stage
|
||
- anv: Fix placed address mmap with slab bo
|
||
- anv: Rename and share get_scratch_surf() with other files
|
||
- anv: Make use of anv_shader_get_scratch_surf() in genX_cmd_compute.c
|
||
- anv: Use helper to get anv_address in emit_simple_shader_dispatch()
|
||
- intel/brw: Remove unused functions to get data port message type
|
||
|
||
Juan A. Suarez Romero (34):
|
||
|
||
- broadcom/ci: remove asan failures from rpi3 and rpi4
|
||
- broadcom/ci: re-evaluate timeout tests
|
||
- broadcom/ci: re-adjust fractions
|
||
- broadcom/ci: rename rusticl job
|
||
- broadcom/ci: re-evaluate all the flakes
|
||
- broadcom/ci: remove duplicate entries
|
||
- broadcom/cle: bump up gen version for v3d
|
||
- broadcom/cle: ensure zlib inflate assign memory
|
||
- broadcom: don't hardcode pagesize
|
||
- broadcom/ci: update expected results
|
||
- v3dv: serialize all the tests causing OOM
|
||
- broadcom/ci: fetch custom packaged kernel in CI-Tron
|
||
- broadcom/ci: update available devices
|
||
- broadcom/ci: update expected results
|
||
- v3d: fix leak in blit fast
|
||
- v3d,v3dv: emit always set point size
|
||
- st/pbo_compute: remove unused variables
|
||
- broadcom/ci: update expected results
|
||
- vc4/ci: update expected results
|
||
- v3d: add support for GL_ARB_sample_shading
|
||
- broadcom/ci: update expected results
|
||
- broadcom/ci: update expected results
|
||
- broadcom/ci: update expected results
|
||
- v3dv: fix mutable resolve attachment format mismatch
|
||
- vc4: fix unwanted buffer release on uploader
|
||
- v3d/ci: add new OpenCL failure
|
||
- v3dv/ci: add link to failing CTS test
|
||
- vc4: add dot on static QPU unpack strings
|
||
- vc4: make some dump functions return strings instead of printf
|
||
- vc4: use Mesa logging functions
|
||
- broadcom/compiler: make some dump functions return strings instead of printf
|
||
- broadcom: use Mesa logging functions
|
||
- broadcom/cle: parse once the XML spec
|
||
- broadcom/ci: update expected results
|
||
|
||
Julia Zhang (2):
|
||
|
||
- vulkan: return pQueue with matching flags
|
||
- radv/amdgpu: handle DISCARDABLE flag in get_flags_from_fd
|
||
|
||
Juston Li (1):
|
||
|
||
- anv: set missing protected bit for protected depth/stencil surfaces
|
||
|
||
Karmjit Mahil (14):
|
||
|
||
- tu: Allocate cmd_buffer from its pool
|
||
- tu: Set tu_ignore_frag_depth_direction driconf for Creed
|
||
- zink: Fix incorrect assert checking for linear state format
|
||
- freedreno/registers: Add some missing include in fd6_hw.h
|
||
- freedreno/a6xx: Add missing include to fd6_pack.h
|
||
- freedreno: Add fd{2,3,4,5}_hw.h and fd_hw_common.h
|
||
- freedreno: Add check_xml_includes test
|
||
- freedreno: Add check_xml_includes to meson setup
|
||
- tu: Use "nir/" for the nir includes
|
||
- tu: Undef before redefining MESA_LOG_TAG
|
||
- tu: Update .clang-format include categories
|
||
- tu: Reorder includes
|
||
- tu: Cleanup some includes
|
||
- tu: Remove unnecessary forward declaration
|
||
|
||
Karol Herbst (76):
|
||
|
||
- nvk: reorder exposed coop matrix types
|
||
- clc: reorder headers to fix compilation errors due to UNUSED
|
||
- clc: support some atomic and generic address space features
|
||
- clc: enable generic address space and seq_cst and device scope atomic features
|
||
- nir: fix nir_fixup_is_exported for LLVM-22
|
||
- clc: fix compile compatibility with LLVM-22
|
||
- khronos-update: synchronize OpenCL header file list
|
||
- khronos-update: add Intel's OpenCL header
|
||
- include: synchronize OpenCL headers
|
||
- rusticl/platform: add rusticl_warn_once macro
|
||
- rusticl/program: accept and ignore Intel's 4G memory flags
|
||
- nir: add nvidia IO intrinsics
|
||
- nir: add BASE to nvidia memory intrinsics
|
||
- nak: convert memory load/stores to nv variants
|
||
- nir/opt_offsets: support negative offsets and 64 bit sources
|
||
- nir/opt_offsets: support nvidias intrinsics
|
||
- nak: replace get_io_addr_offset with nir_opt_offsets
|
||
- rusticl/mesa: only use resource_from_user_memory if the cap is advertised
|
||
- vtn/opencl: flush denorms for cbrt()
|
||
- vtn: set default fp_math_ctrl values for kernels
|
||
- nir: add nvidias shared memory non uniform address shift
|
||
- nak: add LDS/STS/ATOM address shift encoding
|
||
- nak: Fold constant ishl into shared ld/st/atoms
|
||
- zink: handle drivers with multiple subgroup sizes correctly
|
||
- ac/llvm: handle int8 inside ac_build_optimization_barrier
|
||
- zink: implement subgroup rotate
|
||
- rusticl: support more subgroup extensions
|
||
- asahi: support subgroup_rotate
|
||
- nir: fix nir_alu_type_range_contains_type_range for fp16 to int
|
||
- nir: fix nir_round_int_to_float for fp16
|
||
- nouveau/drm-shim: implement get_zcull_info
|
||
- nvk: run nir_opt_large_constants before nir_lower_load_const_to_scalar
|
||
- nak: invalidate loop analysis with nak_nir_lower_load_store
|
||
- nak: replace legalize_ext_instr with explicit lowering
|
||
- nak: add input predicate to load_global_nv and OpLd
|
||
- nak: use ldg input predicate in nak_nir_lower_non_uniform_ldcx
|
||
- nak: support has_load_global_bounded on turing and newer
|
||
- nvk: skip lowering load_global_constant_bounded on turing inside lower_load_intrinsic
|
||
- nak: enable vectorize_vec2_16bit
|
||
- nak: allow vector sources for f2f16 conversions
|
||
- nak: vectorize f2f16
|
||
- nak: vectorize f2f16 even more
|
||
- nak: make nak_mem_vectorize_cb create only aligned and supported vectors
|
||
- nir: rename fsin_amd and fcos_amd to a more generic name
|
||
- nir: unvendor ac_nir_lower_sin_cos
|
||
- nak: run nir_normalize_sin_cos on Volta+
|
||
- ci: add api\@clgetmemobjectinfo to fails
|
||
- nak: rework swizzling on scalar FP16 ops
|
||
- nak: remove OpF2F::dst_high
|
||
- nak: support MUFU.F16
|
||
- nak: add hw_test for MUFU.F16
|
||
- nak: enable MUFU.F16 on Turing and newer
|
||
- nak: add algebraic patterns to improve MUFU.F16
|
||
- radeonsi: set valid_buffer_range for CL buffers
|
||
- docs: clarify the use of autonomously acting tooling
|
||
- docs: add AI disclosure requirements
|
||
- radeonsi: properly report unified memory on APUs
|
||
- rusticl/kernel: implement CL_KERNEL_GLOBAL_WORK_SIZE for custom devices
|
||
- rusticl/device: Fix reporting of global memory on mixed memory devices
|
||
- nak/copy_prop: allow modified F16v2 and F16 sources
|
||
- nak: properly copy prop neg/abs float sources for flushed values
|
||
- nak: add scalar tex encoding support
|
||
- nak/nvdisasm_tests: test .SCR flag in TEX, TLD and TLD4
|
||
- nak: scalarize tex, tld and tld4 on SM70+
|
||
- nak/nvdisasm_tests: fix offset stride for gens older than Turing
|
||
- nak: add ugpr latency classes for memory instructions
|
||
- nak: add is_gpr_reg and is_ugpr_reg helpers
|
||
- nak: uregs are 6 bits before Hopper, so enforce that
|
||
- nak: the MS location comes last in TLD, same spot as depth compare in TEX
|
||
- mesa/st: do not advertise CL subgroup features on the GL side
|
||
- ci: install libstdc++-static on fedora
|
||
- rusticl: link the C++ runtime statically
|
||
- softfloat: make sign bit an unsigned int
|
||
- nir: add fmul_rtz
|
||
- nak: handle nir_op_fmul_rtz
|
||
- nak: use fmul_rtz for NAK_INTERP_MODE_PERSPECTIVE
|
||
|
||
Kenneth Graunke (78):
|
||
|
||
- nir: Add memory modes to URB load intrinsics
|
||
- nir: Teach opt_load_store_vectorize how to handle Intel URB intrinsics
|
||
- nir: Add load/store vectorizer option for rounding up masked stores
|
||
- nir: Add a round_up_components callback to load/store vectorization
|
||
- brw: Assert that urb_vec4_intel stores only have 4/8 components
|
||
- brw: Skip vec8 store_urb_vec4_intel noop writemasks as well
|
||
- brw: Avoid using URB global offset with per-slot offsets on <= Icelake
|
||
- brw: Vectorize URB intrinsics using nir_opt_load_store_vectorize
|
||
- brw: Use nir_imul_imm in per-vertex/per-primitive offset calculation
|
||
- brw: Fix geometry shaders with non-constant vertex indices
|
||
- iris: Switch to SPDX headers
|
||
- brw: Drop urb_handle parameter from store_urb()
|
||
- brw: Implement load_urb_output_handle_intel for VS/GS stages
|
||
- brw: Move TES VUE map calculation before lowering outputs
|
||
- brw: Set a valid varying_to_slot for VUE header fields other than PSIZ
|
||
- brw: Add VUE header varyings to io_component()
|
||
- brw: Split EOT handling out of emit_urb_writes()
|
||
- brw: Convert VS/TES/GS outputs to URB intrinsics.
|
||
- intel: Rename intel_msaa_flags to intel_fs_config
|
||
- intel: Rename wm_prog_data to fs_prog_data
|
||
- intel: Rename wm_prog_key to fs_prog_key
|
||
- brw: Delete wm_prog_data::urb_setup_channel[]
|
||
- elk: Delete mesh shader remnants
|
||
- iris: Fix scratch shift after recent bindless changes
|
||
- intel/elk, hasvk: Drop indirect_ubos_use_sampler option and DP code
|
||
- brw: Make indirect_ubos_use_sampler a static inline bool taking devinfo
|
||
- brw: Make use_tcs_multi_patch a static inline taking devinfo
|
||
- iris: Move recompile debugging to work on iris program keys
|
||
- anv: Make a devinfo local in anv_shader_lower_nir
|
||
- anv: Pass devinfo to anv_shader_compute_fragment_rts, not compiler
|
||
- anv: Drop an outdated comment about indirect descriptors
|
||
- anv, brw: Consolidate ex_bso bits to a static devinfo inline
- brw: Delete use_bindless_sampler_offset flag
- brw: Pass devinfo to lower_bit_size, not compiler
- brw: Make a devinfo temporary in lower_mem_access_bitsizes
- brw: Drop brw_compiler option from brw_no_indirect_mask()
- iris: Drop SBE component overrides for layer/viewport varyings
- iris: Use the first FS input's value for all undefined FS inputs
- iris: Drop sprite coord checks from SBE_SWIZ setup
- iris: Drop use of BFC(n) when it exists but COL(n) is unwritten
- iris: Defeature native two-sided color support
- brw: Use NUM_TOTAL_VARYING_SLOTS instead of VARYING_SLOT_TESS_MAX
- brw: Drop BRW_VARYING_SLOT_PAD and brw_varying_slot enum
- brw: Drop VUE header values and position from wm_prog_data->inputs
- brw: Use memset for initializing varying/slot maps
- brw: Only lower system values for passthrough TCS
- brw: Drop extra validation from TCS passthrough creation
- iris: Move passthrough TCS generation out of brw and into iris
- iris: Create our own enums for system values
- iris: Move ALT mode handling from brw to iris
- nir: Fix divergence of Intel URB input/output handle intrinsics
- brw: Purge source_depth_to_render_target
- brw: Simplify GS load_invocation_id handling
- brw: Combine brw_assign_*_urb_setup() into one function
- brw: Fix single patch thread dispatch masks in NIR
- brw: Lower TCS single patch invocation ID calculations in NIR
- brw: Don't emit HALT_TARGET for VS/TCS/TES/GS
- brw: Simplify mark_last_urb_write_with_eot()
- nir: Add is_sparse flag to texture builders
- intel/nir: Use txf builder in intel_nir_lower_sparse
- intel/nir: Set new image intrinsic parameters via builder helpers
- intel/nir: Generalize lower_tex_compare to split_tex_residency
- intel/nir: Replace tg4 with txl/txb/tex when splitting texture residency
- nir: Add nir_texop_resinfo_intel
- brw: Use nir_texop_resinfo_intel for query_levels and txs
- nir: Increase tex opcode bits from 5 to 6 in nir_instr_set
- anv: Use nir_lower_memory_model
- intel/compiler: Use nir_static_workgroup_size helper
- brw: Support load_simd_width_intel for fragment shaders
- brw: Set nir->info.{min,max}_subgroup_size in brw_nir_apply_key
- brw: Have brw_nir_apply_key call brw_nir_lower_simd for all stages
- nir, brw: lower scratch in NIR
- nir: Add nir_texop_sparse_residency[_txf]_intel operations
- intel: add INTEL_JAY environment variable
- intel/nir: Make intel_nir_lower_sparse work for either brw or jay
- jay: Assert that source is not null in jay_copy_strided
- jay: Make lower_immediates bail if there are no sources
- jay: Clear default group for quad swizzles

Khem Raj (1):

- glx: fix const qualifier warnings found with C23 glibc support

Kitlith (2):

- panvk: Free drm device in can_present_on_device
- pvr: Free drm device in can_present_on_device

Konstantin Seurer (33):

- vulkan: Handle inactive primitives with LBVH builds
- vulkan: Avoid NAN in the IR BVH
- vulkan: Limit the number of LBVH invocations
- radv/rra: Fix nullptr dereference
- vulkan: Make sure no NaNs end up in the BVH
- radv/bvh: Make sure internal nodes are collapsed when possible
- radv: Use stderr for shader printf
- radv: Stop saving descriptors before acceleration structure OPs
- radv: Fix setting the viewport for depth stencil FS resolves
- util/ralloc: Allow creating a linear context without parent context
- vulkan/cmd_queue: Use a linear allocator
- lavapipe: Extend vk_cmd_queue_entry_base for internal commands
- vulkan: Remove vk_cmd_queue_entry::driver_data
- vulkan/cmd_queue: Remove get_array_member_copy
- vulkan/cmd_queue: Fixup stride for multi draws
- vulkan/cmd_queue: Do not zero initialize vk_cmd_queue_entry
- radv/meta: Add and use helpers for setting state
- radv/meta: Rework saving/restoring state
- vulkan/cmd_queue: Rework copy codegen
- vulkan/cmd_queue: Don't explicitly set struct members to NULL
- nir/tests: Test nir_opt_large_constants
- spirv,nir: Preserve more information about the descriptor type
- vulkan: Add helper for dispatching BVH build stages
- vulkan: Request less scratch space for lbvh
- vulkan: move internal_node_count to vk_acceleration_structure_build_state
- vulkan: Remove bvh_state
- vulkan: Init all update scratch at once
- radv/bvh: Prefer selecting quads as the first pair of a HW node
- radv: Add an option for dumping BVH stats
- radv: Add RT prolog information to hang reports
- radv: Refactor declaring shader args
- radv: Set debug info in radv_shader_create_uncached
- radv: Gather debug info about shader args

Kovac, Krunoslav (1):

- amd/vpelib: Apply external CSC

Krzysztof Sobiecki (3):

- gallium/dril: Don't use gbm if there is no gbm configured
- os: Don't use PATH_MAX as it's not portable.
- os: Add support for GNU/HURD compilation and use of dri swrast/llvmpipe.

Lakshman Chandu Kondreddy (1):

- freedreno/layout, tu: Fix UBWC block sizes for PIPE_FORMAT_R8_G8B8_420_UNORM

Lars-Ivar Hesselberg Simonsen (25):

- panfrost/bi: Fix unbound texel buffers
- pan/decode/jm: IDVS decode improvements
- panvk: Fix dcd_flags1 dirty bit
- pan/genxml/v13: Fix HSR Prepass typo
- pan/genxml/v13: Add HSR operation enums
- pan/compiler: Add pass to collect HSR info
- panvk/v13: Set HSR flags
- panvk/v13: Support HSR Prepass
- pan: Drop ASTC support for v5 texel buffers building
- pan: Move buffer functions to pan_buffer
- pan/va: Remove non-existent unused CLPERs
- pan/va: Clean up unused/removed instructions
- pan/va: Add opcode modifier to ISA.xml
- pan/va: XMLify opcode2
- pan/va/disasm: Move instr print to function
- pan/va: Generalize opcode/opcode2
- pan/va/disasm: Clean up hardcoded values
- pan/va/disasm: Move src discard marker behind reg
- pan: Centralize preload registers
- pan/model: Redo gpu_prod_id in the model
- pan: Add support for 64 bit gpu_id
- pan/va/isa: Src for X16_TO* takes lane, not swizzle
- pan/model: Expose prod_id and rev functions
- panfrost: Add support for 64 bit gpu_id
- panvk: Fix debug flag overlap

Leder, Brendan Steve (Brendan) (1):

- amd/vpelib: Add missing JFIF enum

Leon Perianu (6):

- pvr: fix logic for when to reset fill blit
- pvr: fix format table properties duplicate
- pvr: remove hardcoded buffer alignment and image alignment
- pvr: refactor image initialization with helper functions
- pvr: use align64 for large buffer memory requirements
- pvr: enable VK_KHR_maintenance4

Lin, Ricky (1):

- amd/vpelib: Augment swizzling modes

Link Mauve (1):

- docs/panfrost: fix outdated path to complete GPUs list

Lionel Landwerlin (111):

- iris: fix incorrect intrinsic usage on ELK
- anv/iris: add drirc to enable sampler state & compute surface state prefetch
- vulkan/wsi/direct: remove VkDisplay created from GetDrmDisplayEXT on ReleaseDisplayEXT
- vulkan/pipeline: don't consider capture-replay flag for shader hashing
- anv: fix shader heap replay addr
- anv/xe: move special WaitIdle optimization to submission path
- anv: implement VK_KHR_internally_synchronized_queues
- anv: flush render caches on first pipeline select
- anv: fix nested command buffer relocations
- anv: add missing constant cache invalidation for descriptor buffers
- isl: fix 32bit math with 4GB buffer size
- brw: make PULL_CONSTANT opcodes more like MEMORY opcodes
- brw: enable ex_bso for LSC_SS
- anv: rename/document a layout helper
- anv: rework descriptor set indexing in NIR
- anv: remove set index for descriptor buffers
- anv: add a couple of surfaces to read descriptors
- brw: handle non-GRF aligned pushed UBO masking
- anv: delay BRW prog_data filling
- anv: drop unused argument for compute_push_layout
- anv: use internal surface state on Gfx12.5+ to access descriptor buffers
- anv: remove unused arguments
- brw/iris: move ubo range analysis pass to iris
- intel/tools: print out GRF size in intel_dev_info
- anv: enable compute bti prefetch
- anv: apply the same ccs disabling for Xe3 than Xe2
- anv: disable ccs modifier reporting when ccs modifiers are disabled
- anv: move ALU registers used for mi commands
- anv: enable sharing binding table pool programming
- anv: predicate BTP emissions
- anv: add a drirc to control binding table block size
- anv: dirty descriptors after blorp operations
- anv: program HW to gather push constants at 3DSTATE_CONSTANT parsing time on Gfx9
- anv: specialize code for Wa_18019110168
- anv: remove snprintf for aux op transition
- anv: remove old comment related to pre softpin platforms
- anv: add a debug printout for dirty descriptors
- anv: make a helper for push constant allocation
- anv: optimize binding table flushing
- anv: track descriptor buffers used & promoted to push constants
- anv: avoid filling PC reason for timestamp u_trace captures
- anv: pack bind_map further
- anv: delay num-workgroups usage until push remapping
- anv: merge 2 push constants fields
- anv: add a shrinking push constant loading pass
- brw/nir: handle inline_data_intel more like push_data_intel
- anv: implement inline parameter promotion from push constants
- anv: fix dynamic buffers again
- anv: add missing handling for attachment locations in secondaries
- docs/anv: add some debug tips
- anv: dirty all push constant stages in simple shader
- anv: remove unused bind_map field
- anv: add an analysis pass to detect compute shaders clearing data
- anv: add drirc option to workaround missing application barriers on typed/untyped data
- blorp: add mda support
- brw: use scalar build for immediate offsets
- brw/nir: enable constant offsets for global_constant_uniform_block_intel
- brw/nir: add new intrinsics to load data from the indirect address
- blorp: switch to new load_indirect_address_intel intrinsic
- anv/brw: remove push constant load emulation from the backend compiler
- anv: fix dEQP-VK.memory.address_binding_report*
- anv: only go into buffer relocs after we've looked at all batches
- anv: fix pulling constant data in compute/mesh/task shaders
- brw/nir: improve shader_indirect_data_intel handling
- anv: fix internal compute shader constant data pull
- anv: use companion RCS for hiz ops on compute queue
- anv: reduce optimalBufferCopy(Offset|RowPitch)Alignment
- vulkan/runtime: add new helper for vertex strides
- vulkan/runtime: build (address|copy)_flags for vk_buffer
- vulkan/runtime: add implementation of older entrypoints using KHR_DAC
- anv: enable compression control on Android VP17
- vulkan/runtime: break view mask from renderpass information
- anv: don't queue pipe control reasons without a trace
- anv: limit aux disabling on concurrent images to pre-Xe2
- vulkan/runtime: fix missing copy image layout
- vulkan/runtime: fix incorrect entrypoint call for CmdCopyImageToBuffer2
- anv: deal with Wa 14024015672 on the blorp path
- anv: move depth/stencil BeginRendering handling prior to color
- anv: rename variables in CmdBeginRendering
- anv: batch rendering initialization commands
- anv: merge null surface state packing with previous attachments
- anv: document more stalling
- anv: rework color_aux operation tracking
- drm-uapi: Sync xe_drm.h
- intel/dev: add state cache perf fix support xe detection
- brw: fence SLM writes between workgroups
- nir: improve deref_instr_get_variable
- isl: speedup buffer fills by dropping swizzle programming
- nir/lower_image_atomics_to_global: add heap support
- nir/lower_non_uniform: add heap support
- nir/lower_robust_access: add heap/bindless support
- vulkan/runtime: convert descriptor heap pipeline flag to shader flag
- nir/divergence: handle resource_intel like other intrinsics
- nir: add heap variant of load_param_intel
- nir/lower_io: add index support for load_param_intel
- nir: divergence analysis support for image_heap_load_param_intel
- brw: make the program key available on pass_tracker
- anv: use arrays instead of vectors for descriptor set mapping
- anv: bump MAX_SETS to 32
- anv: don't relocate memory from blob
- brw: add support for < 32bit io values
- anv: enable storageInputOutput16
- brw: don't support frontfacing ternary optimization on != 32bit
- elk: don't support frontfacing ternary optimization on != 32bit
- anv: avoid C23
- anv: fix compute push constant allocations on pre Gfx12.5 platforms
- anv: fix invalid value for push block index
- anv: fix debug printfs on hang
- anv: fixup compute queue detection
- anv: fix null pointer access
- anv: fix arc artifacts on Farming simulator 2022

Liu, Mengyang (1):

- aco: fix broken VGPRs reservation for 64-bit attributes in VS prologs

Liviu Prodea (2):

- clc: Fix static link with clang>=22
- util: Fix use of undeclared identifier 'NULL' in src/util/os_misc.h when compiling with clang

Lorenzo Rossi (49):

- nvk,nak: Add nir_printf_fmt
- nir/opt_varyings: Skip code-motion for upconversions
- pan/compiler: Remove mediump from collect_varyings
- panvk: Constant fold location in panvk_lower_nir_io
- pan/compiler: Add formats to varyings info
- panvk/jm: Emit varying descriptors based on the VS varying layout
- panvk: Cleanup shaders linking
- panfrost: Build explicit varying layout
- pan/compiler: Don't build ABI automatically
- panfrost: Refine names in pan_cmdstream descriptor functions
- panfrost/bi: Emit varying descriptors based on the VS varying layout
- panfrost/val: Emit varying descriptors based on the VS varying layout
- pan/compiler: Remove collect_varyings
- pan/compiler: Remove unused descriptor info
- pan/compiler: Mostly remove auto32 varying store
- pan/compiler: Remove auto32 quirk
- panfrost/docs: Document varyings
- panfrost: Switch to compact varyings layout
- pan: Remove dead code for sso_abi builder and fixed_varyings
- people: Update my email
- mailmap: Update my email
- pan/bi: Add is_psiz_store flag in bi_instr
- pan/bi,nir: Divide memory_access from segments
- Revert "pan/bi: Model pos/vary segments in STORE instructions"
- panfrost: Lower indirect derefs before lower_io
- pan/bi: Resize varyings IO early
- pan/compiler: Remove dead ABI function
- panvk,panfrost: Always emit ld_var_buf when possible
- panfrost/docs: Fix v9+ varyings description
- pan/compiler: Remove unused hack in varyings stores
- panfrost/bi: Separate va_shader_output from bitmasks
- pan: Add PAN_MAX_MULTIVIEW_VIEW_COUNT
- pan/compiler: Refactor va_shader_output_from\_ in common code
- pan/compiler: Split lower_varyings_io into fs_inputs and vs_outputs
- pan/compiler: Group outputs in lower_vs_outputs
- pan/compiler: Make lower_vs_outputs write needs_extended_fifo
- pan/compiler: Add bound assert on emit_split_i32
- pan/compiler: Replace frag_coord_zw_pan with var_special_pan
- pan/compiler: Replace bi_lower_ldexp16 with algebraic pass
- pan/compiler: Split bi_debug.c from bifrost_compile.c
- pan/compiler: Split bifrost_nir.c from bifrost_compile.c
- pan/compiler: Don't crash nopersp if pos is undefined
- pan/compiler: Fix noperspective int varyings
- panfrost,panvk: Move postprocess near shader_compile
- panfrost: Move lower_res_indices before postproc
- panfrost,panvk: Move lower_texture_late inside postproc
- panfrost,panvk: Move lower_texture_early inside preproc
- pan/compiler: Document compilation pipeline expectations
- panvk/jm: Fix tls_size overwrite in indirect draws

Louis Montagne (1):

- zink: relax build-id length assertion for Mach-O

Loïc Molinari (17):

- util/perf: Replace tabs with spaces
- util/perf: Reorder ending CPU scope traces to match beginning order
- util/perf: Add support for conditional CPU scope traces
- pan/trace: Add wrappers for Mesa CPU scope traces
- panfrost: Port CPU scope traces to PAN_TRACE_*()
- panfrost: Add new CPU scope traces
- panfrost: Fix clean_pixel_write_enable forced check for AFBC
- pan/desc: Only set clean_pixel_write_enable on clear (v4)
- pan/desc: Emit common RGB render target config in pan_emit_rt()
- pan/desc: Force pan_merge() ending semicolon
- pan/desc: Move funcs closer to callers
- pan/desc: Cache clean tile state
- pan/desc: Issue TSIX-2033 only affects pre-frame shaders
- panfrost: Update clean_pixel_write_enable flag name for v6+
- panfrost: Fix -Wunused-variable warning on arch > 7
- panfrost: Fix -Wunused-variable warnings using ASSERTED
- panfrost: Fix -Wunused-but-set-variable warnings using ASSERTED

Lucas Fryzek (8):

- drisw: Properly mark shmid as -1 when alloc fails
- x11: Add helper util to check for xshm support
- egl/dri: Check that xshm can be attached
- glx: Check that xshm can be attached
- vulkan/wsi: Check that xshm can be attached
- lvp: Mark opaque FD and dmabuf as compatible is supported
- lvp: Export -1 as sync fd
- tu: fix reg size for a8xx_gen1

Lucas Stach (1):

- etnaviv: idle the pipe before flushing texture caches

Luigi Santivetti (14):

- pvr: fix logic for setting DSMERGE and PICKD
- pvr: fix src/dst image formats for DS resolve ops
- pvr: fix ds merge logic for blit image to image
- pvr: fixup for wrong conflict resolution in rebase
- pvr: allow pvr_get_copy_format to handle ycbcr formats
- pvr: drop redundant check on VK_FORMAT_X8_D24_UNORM_PACK32
- pvr: stop using samples to determine what src and dst formats
- Revert "pvr: Fixup for deqp-vk.api 2d.optimal.* conformance"
- pvr/ci: update bxs flakes to add one ycbcr test
- zink: fix format conversion logic for the alpha emulation case
- pco: fix Mesa-CI regression in pco texture packed formats
- pvr: expose partial usc mrt init routine
- pvr: keep compiler resources in sync with attachments
- pvr: add missing multi-arch support for pipeline exec and stats

Maaz Mombasawala (7):

- ci: Update vmware farm admins.
- svga: Update ci failure expectations.
- svga: Update CI expectations
- svga: Update CI expectations.
- Revert "ci: disable vmware farm"
- svga: Use gfx-ci kernel in CI
- Revert "ci: vmware farm is offline, stop using it"

Marc Alcala Prieto (1):

- pan/cs: Fix cs_run_fragment() calls with swapped arguments

Marek Olšák (122):

- ac/nir/meta: tune image clear & copy performance for gfx12
- ac/nir/meta: tune 12B clear buffer performance for gfx12
- ac,radeonsi: set optimal COMPUTE_DISPATCH_INTERLEAVE for buffer clears/copies
- radeonsi: don't use pipe_surface_size in clears
- radeonsi: add faster color clear for gfx12
- radeonsi: test bigger sizes for AMD_TEST=dmaperf
- radeonsi: disable 2D compute dispatch SE interleaving
- nir/print: fix a crash due to unhandled GLSL_SAMPLER_DIM_EXTERNAL
- nir: add ACCESS_SPARSE
- nir: add dest_type to load_buffer_amd
- nir/opt_16bit_tex_image: lower dst of load_buffer_amd
- radeonsi: unify tex descriptor loads
- ac,radeonsi: add AC_NIR_TEX_BACKEND_FLAG_IS_IMAGE
- aco: handle ACCESS_SPARSE and ACCESS_SKIP_HELPERS for load_buffer_amd
- ac: set missing dest_type for image_deref_load
- aco,ac/llvm: force IDXEN=1 for buffer format opcodes on GFX9
- ac/llvm: handle ACCESS_SPARSE in load_buffer_amd
- ac/nir: rename ac_nir_lower_tex -> ac_nir_lower_image_tex
- ac/nir: lower buffer txf to load_buffer_amd in NIR
- ac/nir: lower buffer image_load to load_buffer_amd in NIR
- ac: remove txf buffer code from ACO & LLVM
- ac: remove image_load buffer code from ACO & LLVM
- ac/llvm: fix buffer_load_format with TFE by replacing inline asm with LLVM code
- ac/llvm: remove scalarizing emit_intrin_1f_param_scalar
- ac/llvm: simplify emitting FP intrinsics
- ac/llvm: pass soffset to buffer_load/store_format
- radeonsi: move VB descriptor emission and upload into separate functions
- radeonsi: don't allocate a VB descriptor upload buffer if it's not needed
- nir: reassociate a $op (b ? #c : #d) for div, mod, rem
- ac: unify DCC clear code definitions
- radv: use DCC definitions more
- ac: unify and demystify CMASK clear codes
- ac: unify HTILE codes and encoding
- ac: add FMASK codes
- ac: lower load_workgroup_ids for ACO in NIR
- ac: lower load_subgroup_id for ACO in NIR
- ac/nir: add ac_nir_lower_intrinsics_to_args_options structure
- ac: lower load_num_workgroups in NIR
- ac/llvm: remove unused functions
- nir: handle get_ubo_size as a resource query in nir_shader_gather_info
- nir: add ACCESS to get_ubo_size
- nir: allow get_ssbo_size to return a 64-bit result
- nir/lower_non_uniform_access: add an option not to lower tex & image queries
- nir/opt_non_uniform_access: use new query flags
- radeonsi: remove CB_RESOLVE
- radeonsi: don't fail a CB_RESOLVE assertion on gfx11
- ac/nir/meta: don't scalarize sparse loads if the address is aligned to load size
- ac/nir/meta: use the clear/copy compute shader if CP DMA doesn't support sparse
- ac/nir/meta: properly align sparse buffer clears with 12-byte clear values
- radeonsi: remove the CP DMA workaround for sparse for GFX6-9
- radeonsi: replace null descriptors with memset
- st/mesa: optimize glCopyImageSubData for 3D and array textures
- amd: add meson variable idep_amd_generated_headers for all generated headers
- amd: add gfx11 and gfx12 CP packet definitions
- ac/gpu_info: handle more VRAM types
- ac/llvm: encode LLVM cache flags manually
- ac: tidy up ac_hw_cache_flags
- radeonsi: don't set any EXPCLEAR state on gfx12+
- ac: set the correct number of Z planes for ALLOW_EXPCLEAR
- ac: add ac_cu_info::has_fmask, adjust FMASK checks
- radv: make radv_postprocess_binary_config more correct and more readable
- radv,radeonsi: PA_SC_BINNER changes for gfx12
- radeonsi: rename si_shader_gs/vs -> si_shader_gs/vs_legacy
- radeonsi: don't fail si_compute_blit for compressed/subsampled formats properly
- radeonsi: add debug options forcing fast clear, gfx and compute blits
- radeonsi: remove AMD_TEST=blitperf
- amd/packets: remove non-existent CLEAR_STATE from gfx12 definitions
- amd: generate a packet parser/printer automatically from packet definitions
- ac: enable the new auto-generated CP packet parser
- ac: replace some packet field definitions in sid.h by generated ones
- meson.build: require python 3.10, try python3.12
- nir/inline_uniforms: rename num_offsets -> num_uniforms
- nir/inline_uniforms: rename new_num -> new_num_uniforms
- nir/inline_uniforms: update comments
- nir/inline_uniforms: track visited state per component
- nir: change export_amd intrinsics to use enabled_channels instead of write_mask
- nir: change export_amd intrinsics to use target instead of base
- Inline SHA1_DIGEST_LENGTH
- Inline SHA1_DIGEST_STRING_LENGTH
- Inline mesa_sha1, SHA1_CTX
- Inline SHA1* functions, remove sha1.h
- Inline _mesa_sha1_init/update/final functions
- Remove redundant BLAKE3_KEY_LEN32
- Inline _mesa_sha1_compute/format, remove the other unused ones
- Remove mesa-sha1.h
- util: rename the sha1 test to blake3 test
- Rename SHA1_* names to BLAKE3_*
- Rename sha1_* and sha_* names to blake3_*
- Rename sha words to blake3
- Rename more sha and sha1 names to blake3
- Rename SHA1 words to BLAKE3
- Rename \*_sha1 names to \*_blake3
- Final rename of sha1 names to blake3
- Change remaining SHA-1 occurrences to BLAKE3
- driconf: unbreak profiles for "runner" by merging them and ignoring sha1s
- driconf: rename sha1 option to blake3
- radeonsi: recompute IO bases after optimizations
- radeonsi/meson: don't use llvm variables when LLVM is disabled
- ac/llvm: remove almost duplicated ac_build_varying_gather_values
- ac/llvm: inline ac_build_gather_values_extended
- radeonsi: remove unnecessary ac_to_integer in si_llvm_ps_build_end
- radeonsi: fix compiler selection for fixed-func TCS
- radeonsi: fix an assertion failure for sampler descriptor loads with LLVM
- ac/nir/meta_cs_blit: use uint16 for coordinates to fix 64K blits
- gallium/u_blitter: allow using the single triangle for scaled blits too
- radeonsi: fix blits via util_blitter_draw_rectangle
- radeonsi: disable streamout queries for u_blitter
- radeonsi: add 64K texture support to gfx blits
- radeonsi: remove always-set SI_SAVE_FRAGMENT_STATE
- radeonsi: sink si_get_pipe_constant_buffer in si_blitter_begin
- radeonsi: draw using a single triangle in u_blitter
- nir: return a failure value from nir_system_value_from_intrinsic
- nir: factor out nir_system_value_from_instr from nir_opt_varyings
- nir/opt_varyings: move expressions with view_index into preceding shaders
- nir/tests: test nir_opt_varyings with sysvals
- ac,radv: use AC_TRACKED_DB_PA_SC_VRS_OVERRIDE_CNTL for PA_SC_VRS_OVERRIDE_CNTL
- radv,radeonsi: don't set PA_SC_HIS_INFO
- ac,radv: remove AC_TRACKED_DB_VRS_OVERRIDE_CNTL as well
- amd/packets: fix the size of 1-bit bitfields
- amd/packets: remove the underscore between opcode number and word index, use %x
- amd/packets: add disable_wr_confirm alias to dis_wc
- amd: switch to new packet definitions for all packets

Mario Kleiner (3):

- v3dv: Enable VK_KHR_present_id and VK_KHR_present_wait
- v3dv: Enable VK_EXT_hdr_metadata.
- dri: Fix "cosmetic" undefined behaviour warning for RGB[A]16_UNORM formats.

Martin Roukala (né Peres) (3):

- ci: disable the valve-kws farm
- Revert "ci: disable the valve-kws farm"
- zink/ci: mark the unvanquished trace on vangogh as flake

Mary Guillemard (40):

- nvk/nvkmd: Do not limit exec_push count in nvkmd_nouveau_exec_ctx
- mr-label-maker: Mark CI files for NVK with the NVK label
- nvk: Reenable compression support with nouveau 1.4.2
- nvk: Report NIR shader in pipeline executable properties
- nvk: Reorder view_mask checks in nvk_mme_clear
- nvk: Rename DRAW_BEGIN scratch to DRAW_TOPOLOGY
- nvk: Use DRAW_CONTROL_A on Turing+
- nvk: Early return in draw commands when no draw will be performed
- hk: Fix crash in hk_handle_passthrough_gs
- vulkan: Do not override the shader_flags in case of no task shader
- nir: Add isbewr_nv intrinsic and extends isberd_nv
- nak: Legalize ISBERD
- nak: Implement ISBEWR and extend ISBERD implementation
- nak/nvdisasm_tests: Test ISBERD and ISBEWR
- nir, nvk, nak: Add base to isbewr_nv and isberd_nv
- docs/nvk: Fix link for subchannel switches
- nvk/mme: Add missing nullcheck in nvk_mme_test_state_state
- nvk: Put nvk_mme in the nouveau test suite
- nvk/mme: Enable testing for Kepler
- nvk: Validate push constant offset in nvk_root_descriptor_table
- nvk: Move viewport and scissor emit to their own function
- nvk: Broadcast viewport0 and scissor0 in case of FSR on Turing
- nir/dead_cf: Add missing load_ssbo_ir3 handling
- nir/dead_cf: Add missing load_global_bounded handling
- nir/dead_cf: Add missing load_global_nv handling
- nak: Do not allow load_helper_invocation reordering
- agx: Fix alpha-to-coverage bit size
- nvk: Use SET_PRIMITIVE_TOPOLOGY instead of MME scratch
- nvk: Move shader size and offset calculations to nvk_shader_get_shader_size
- nvk: Wire up shader program prefetch method
- nvk: Ensure that shader I-cache prefetch is enabled on Ada+
- nvk: Do not fill cb0 at queue creation
- nvk: Do not use SET_L1_CONFIGURATION on 3D state init
- nvk: Set VAF eviction policy to normal
- nvk: adjust reduce color thresholds default values
- nvk: Remove old comments from draw state init
- bin: Add Tested-by in rb.py
- nvk: Adjust maxFragmentCombinedOutputResources to match max descriptors limit
- hk: Add HK_MAX_RTS to maxFragmentCombinedOutputResources
- nak: Allows predicate in legalize_ext_instr

Matt Arsenault (2):

- ac/llvm: Remove -promote-alloca workaround
- ac/llvm: Use new denormal_fpenv attribute for llvm >= 23

Matt Coster (1):

- ci,cirnm: Fix program name in usage example

Matt Turner (6):

- brw/cse: fix \`operands_match` corrupting non-IMM register data
- brw/cse: use copies in \`operands_match` instead of in-place modification
- elk/cse: fix \`operands_match` corrupting non-IMM register data
- elk/cse: use copies in \`operands_match` instead of in-place modification
- intel/elk: Remove dead TXL_LZ/TXF_LZ opcodes
- radv: fix UB in radv_format_pack_clear_color for snorm formats

Mauro Rossi (12):

- vulkan/runtime: Fix gnu-empty-initializer error in vk_pipeline.c
- lavapipe: Fix gnu-empty-initializer error in NV_cooperative_matrix2 conversions
- lavapipe: Fix gnu-empty-initializer error in NV_cooperative_matrix2 reductions
- vulkan/runtime: Fix gnu-empty-initializer error in vk_shader.c
- radv: Fix gnu-empty-initializer error in radv_pipeline_graphics.c
- radv: Fix gnu-empty-initializer error in radv_pipeline_rt.c
- radv: Fix gnu-empty-initializer error in radv_pipeline_compute.c
- radv: Fix gnu-empty-initializer error in radv_shader_object.c
- radv: Fix gnu-empty-initializer error in prolog_stage
- intel/jay: fix static_assert expression
- radv: Fix gnu-empty-initializer errors in 480a94fb
- radv: Fix gnu-empty-initializer errors in 8c10eab1

Maíra Canal (11):

- broadcom/ci: skip tests that causes GPU resets/hangs in RPi 3
- broadcom/compiler: Don't lower to LCSSA before calling nir_divergence_analysis()
- vc4: drop redundant shader->failed reassignment
- nir: add load_texture_scale intrinsic
- vc4: fail VS compilation on divergent loops
- broadcom/ci: don't skip dynamic loop tests in RPi 3
- v3d: increase BO allocation size when growing CLs
- v3d: use the state uploader for the image view texture shader state
- v3d: sub-allocate sampler view texture state from state uploader
- v3d: Rename cle_buffer_min_size to page_size
- v3d: use devinfo->page_size for state uploader default size

Mel Henning (66):

- nvk: Use layout->vk.dynamic_descriptor_count
- nvk: Use pipeline_layout.dynamic_descriptor_offset
|
||
- hk: Use layout->vk.dynamic_descriptor_count
|
||
- hk: Use pipeline_layout.dynamic_descriptor_offset
|
||
- kk: Use layout->vk.dynamic_descriptor_count
|
||
- kk: Use pipeline_layout.dynamic_descriptor_offset
|
||
- nvk: Ignore meta ops in occlusion queries
|
||
- nvk: Disable large pages for now
|
||
- nvk: Add a NVK_MME_VAL_MASK macro
|
||
- nvk: Use macros for nvk_mme_set_tess_params tests
|
||
- nvk: Add CCW, POINT_MODE flags for set_tess_params
|
||
- nvk: Compute tess prims in the MME macro
|
||
- nvk: Remove prims from tess state
|
||
- nvk: Move tess flags between other fields
|
||
- nvk: Use some additional drf macros
|
||
- nak: Split out TesselationCommonShaderInfo
|
||
- nak: Handle unspecified tess spacing
|
||
- nvk: Merge tese/tesc state in the MME
|
||
- vulkan/wsi: Call wl_display_roundtrip on our queue
|
||
- nvk: Initialize SET_ALPHA_TO_COVERAGE_OVERRIDE
|
||
- nvk: Report additional host_image_copy layouts
|
||
- zink: Emit float controls for preserve_denorms too
|
||
- zink: Generalize spirv_builder_emit_exec_mode_id3
|
||
- zink: Use float_controls2
|
||
- zink: Use NMin/NMax for fmin/fmax if nan_preserve
|
||
- nvk,nak: Store offsets in a const extern struct
|
||
- nak: Remove some unused fs_key parameters
|
||
- nvk: Don't include u_math.h in generated headers
|
||
- nouveau/headers: Don't use 128-bit comparisons
|
||
- nouveau/headers: Use UINT64_C in drf.h
|
||
- libcl_vk: Add VkCopyMemoryIndirectCommandKHR
|
||
- nouveau/headers: Add P_IMMD_WORD()
|
||
- nvk: VK_KHR_copy_memory_indirect
|
||
- drm-uapi: Sync nouveau_drm.h
|
||
- nouveau/winsys: Fetch zcull_info on device create
|
||
- nouveau/headers: Preserve _ before 0-9 in to_camel
|
||
- nil: Add zcull support
|
||
- nvk: Enable basic zcull support
|
||
- nvk: Enable zcull for VK_ATTACHMENT_LOAD_OP_LOAD
|
||
- nvk: Remove unused cmd.tls_space_needed
|
||
- driconf: force_vk_vendor on No Man's Sky + NVK
|
||
- nvk: Use SET_GLOBAL_RENDER_ENABLE
|
||
- nvk: Use the MME for cond rendering on Turing+
|
||
- nvk: Expose VK_KHR_depth_clamp_zero_one
|
||
- nvk: Disable descriptorBufferCaptureReplay for now
|
||
- nir/lower_io: Add global_bounded to io_offset_src
|
||
- nir/mem_access_bit_sizes: Handle global_bounded
|
||
- nak: Fix mufu's f16 bit on sm90+
|
||
- nvk/lower_descriptors: Move load_root_table up
|
||
- nvk/lower_descriptors: Use more load_root_table
|
||
- nvk/lower_descriptors: .base in load_root_table
|
||
- nvk/lower_descriptors: Add load_root_table_array()
|
||
- nvk/lower_descriptors: Change ROOT_DESC addr space
|
||
- nvk: Rename macro loop index from i to _index
|
||
- nvk: Swizzle root_table.dynamic_buffers[]
|
||
- nvk: Initialize NVC597_SET_ROOT_TABLE_VISIBILITY
|
||
- nvk: Reorder nvk_root_descriptor_table
|
||
- nvk: Factor out build_push_write_push_const
|
||
- nak: Turn nak_const_offsets into a function.
|
||
- nak: Add an is_graphics param to nak_const_offsets
|
||
- nak: Add printf_cb to nak_constant_offset_info
|
||
- nvk/cmd_indirect: Pass pdev into more functions
|
||
- nvk: Move mme_set_anti_alias_tests to a check func
|
||
- nvk: Wire up ROOT_TABLE
|
||
- nvk: SET_ROOT_TABLE_PREFETCH
|
||
- nvk: Disable zcull save/restore regions for now
|
||
|
||
Michael Cheng (15):
|
||
|
||
- vulkan: add vk_shader_ops::replay_at vfunc stub
- anv: Implement RT shader group handle capture/replay
- anv: Rename instruction_state_pool to shader_heap
- intel/blorp: add explicit clear op enums for stencil and linear paths
- intel/blorp: Remove unused blorp_gfx8_hiz_clear_attachments
- intel/blorp: use dedicated clear ops in clear paths
- vulkan/runtime: allow drivers to enable vk_log output in release builds
- anv: enable perf warning logging in release builds
- hasvk: enable perf warning logging in release builds
- intel/ds: report when OA metric access is blocked by kernel policy
- intel/ds: report when OA metrics are unavailable
- anv: log fast color clear fallback reasons in vkCmdClearAttachments
- anv: log fast depth clear fallback reasons in vkCmdClearAttachments
- anv: log aux disable and aux-skip reasons during image setup
- anv: log aux disable reasons in image init and DRM modifier selection

Michal Krol (3):

- gallium: add rasterization_stream to pipe_rasterizer_state
- draw: fix per-stream vertex buffer leak in non-LLVM path
- lavapipe: implement transformFeedbackRasterizationStreamSelect

Michel Dänzer (13):

- Pass the destination buffer size minus one to strncpy
- ci: Drop -Wno-error=vla-cxx-extension from debian-x86_64-msan job
- ci: Drop remaining -Wno-error stanzas from debian-x86_64-asan/ubsan jobs
- ci: Drop -Wno-error=stringop-overread from debian-release job
- ci: Drop some -Wno-error stanzas from the debian-android job
- ci: Drop -Wno-error stanzas from debian-no-libdrm job
- ci: Drop half of -Wno-error stanzas from fedora-release job
- ci: Drop most -Wno-error stanzas from debian-arm64 jobs
- ci: Drop a couple of -Wno-error stanzas from alpine-build-testing job
- vulkan/wsi/x11: Guard XCB_PRESENT_OPTION_SUBOPTIMAL by ignore_suboptimal
- vulkan/wsi/x11: Don't use modifiers when ignoring SUBOPTIMAL
- winsys/amdgpu: Prefer render node FD for ac_drm_device_initialize
- winsys/amdgpu: Use render node only as fallback

Mike Blumenkrantz (100):

- zink: re-allow transient images during blitting
- zink: break out ntv into separate meson dep
- ntv: emit extra decorations for matrix members of structs
- ntv: stop explicitly tracking variables for samplers/images
- ntv: handle a couple trivial builtin loads
- ntv: shore up shader_temp var handling
- ntv: add push const variable to ctx->vars hash table
- ntv: handle glsl texture types
- ntv: handle bare sampler arrays
- ntv: add basic vulkan support
- ntv: emit demote extension/capability when emitting demote
- ntv: add a simple pass to convert vulkan descriptor access to direct derefs
- ntv: stop tracking ubo variables
- ntv: avoid setting Block decoration repeatedly on bo struct types
- ntv: improve setting Aliased decoration on bo emits
- ntv: emit ViewIndex with flat for fragment stage
- ntv: handle nir_intrinsic_load_first_vertex as basevertex
- zink: fix broken compiler assert
- zink: only do pre-sync transfer barrier after a renderpass
- zink: only update the value of VkAttachmentFeedbackLoopInfoEXT, not the pNext
- zink: use maintenance10 info for DRLR optimization
- ci: add ASAN_OPTIONS=malloc_fill_byte=1 for asan jobs
- ntv: also use base glsl type for non-zink array derefs
- ntv: ignore stuff for get_ssbo_size() in vulkan mode
- zink: add TRANSFER_WRITE -> HOST_READ sync to end of batch
- st/bitmap: only release YUV samplerviews
- ntv: run nir_cleanup_functions() in ntv_shader_prepare()
- ntv: re-gather shader info after ntv_shader_prepare
- ntv: run nir_remove_dead_variables during ntv_shader_prepare()
- ntv: call nir_lower_variable_initializers() from ntv_shader_prepare
- radv: fix multiview fast clears
- ntv: do gl-style shared/task lowering for vulkan mode
- ntv: run opt_algebraic late for prep optimization pass
- vk/cmd_queue: use arrays to directly manage refcounting
- vk/cmd_queue: handle descriptor layout refcounting
- vk/cmd_queue: return cmd instead of error code
- vk/cmd_queue: pass command to struct copying methods
- vk/cmd_queue: generate CmdBindDescriptorSets
- vk/cmd_queue: generate CmdPushDescriptorSet
- vk/cmd_queue: move pipeline layout refs into builder
- vk/cmd_queue: generate the rest of the descriptor functions
- vk/cmd_queue: generate CmdPushConstants2
- egl/device: fix the fix for explicit sw rejection in non-sw EGL_PLATFORM=device
- zink: reapply zsbuf state after unordered blits
- zink: allow renderpass termination for clears with ZINK_DEBUG=rp and GENERAL layouts
- zink: run opt_combine_stores when optimizing
- aux/trace: handle set_sample_locations
- llvmpipe: enable GLSL 4.60
- nir: fix nir_is_io_compact for mesh shaders
- mesa/st: fix unlower_io_to_vars to work with mesh shaders
- lavapipe/llvmpipe: make mesh draw params consistent
- llvmpipe: support EXT_mesh_shader
- mesa/st: make st_texture_get_current_sampler_view static
- mesa/st/sampler_view: use a local variable for buffer sv format
- mesa/st/sampler_view: use a local variable for texture sv format
- mesa/st/sampler_view: eliminate st_sampler_view::srgb_skip_decode
- mesa/st/samplerview: explicitly block releasing in-use samplerviews
- zink: work around drivers with broken mesh shader properties
- nir/print: print per_vertex for variables
- llvmpipe: save mesh shader when calling u_blitter
- llvmpipe: fix mesh cap exports
- lavapipe: fix mesh property exports
- llvmpipe: set prefer_real_buffer_in_constbuf0 and delete user buffer path
- r300: import util_framebuffer_init
- gallium/util: kill off util_framebuffer_init
- util/cso: use the mesh_shader pipe cap for mesh support
- ntv: always emit const coord components for fbfetch loads
- mesa/renderbuffer: always add PIPE_BIND_SAMPLER_VIEW to rendering textures
- llvmpipe: fix color fbfetch
- softpipe: delete pipe_context::create_surface
- svga: delete pipe_context surface hooks
- llvmpipe: delete pipe_context surface hooks
- svga: simplify some surface management
- crocus: clean up surface management
- tc: delete unused surface ref code
- freedreno: clean up some surface management
- nouveau: delete unused surface hook
- tegra: delete pipe_context surface hooks
- freedreno: delete pipe_context surface hooks
- r300: clean up some surface management
- r300: delete pipe_context surface hooks
- gallium: add a destructor param to surface refcounting functions
- gallium: delete pipe_context surface hooks
- gallium: add a pipe_context param to pipe_surface_reference()
- svga: move surface context member onto internal surface type
- gallium: kill off pipe_surface::context
- zink: use EXT_primitive_restart_index
- lavapipe: update prim restart index on index buffer bind
- lavapipe: VK_EXT_primitive_restart_index
- radv: handle null pCounterBuffers with xfb binds
- vulkan/runtime: handle null pCounterBuffers with xfb binds
- llvmpipe: fix min_samples + A2C
- lavapipe: fix indirect memory copies
- lavapipe: fix pushconst data updating
- util/format: support 256-bit formats in util_format_get_tilesize()
- lavapipe: use the right type for DGC mesh draws
- lavapipe: rework immutable samplers
- lavapipe: allow fbfetch with shader objects
- vk/cmd_queue: always ceil() param lens
- llvmpipe: always set view_index for linear rasterizer

Mixie (6):

- xlib: clear currentDpy when releasing the current context
- xlib: use XMesaDestroyVisual when destroying display visuals
- xlib: use XMesaDestroyVisual instead of manual free
- xlib: fix skipping visuals in destroy_visuals_on_display
- xlib: remove vishandle from XMesaVisual and fix XVisualInfo leak
- xlib: clear currentDpy when switching current context

Mohamed Ahmed (7):

- nil/modifiers: Clarify drm_format_mods_for_format rejecting modifiers for unsupported color formats
- nvk: Calculate and stash the plane offset and alignment at create time
- nvk: Extend tiled_shadow to be multiplanar
- nvk: Defer tiled shadow plane memory allocation to draw time
- nvk: Enable multiplanar YCbCr linear modifiers
- nvk: Use the pre-calculated offsets for sparse binds
- nvk: Remove nvk_image_plane_size_align_B()

Máté Pinczel (1):

- nak: implement uror and urol using shf

Nanley Chery (63):

- intel/isl: Use 1x ACM Tile64 swizzle on Xe2
- intel/isl: Use 1x Ys/Yf swizzle for IMS layout
- intel/isl: Set TileAddressMappingMode for CMS/UMS
- intel/isl: Fix miptail selection for compressed textures
- iris: Disable some 8bpp fast-clears within miptail
- iris: Increase imported dmabuf alignment for 64K+ BOs
- iris: Use PIPE_BIND_SHADER_IMAGE more
- iris: Limit resolves for atomics to R32 formats
- iris: Allow Yf and Ys tilings more often
- intel/isl: Rework miptail restrictions with CCS
- intel/isl: Reduce scope of Yf-disabling workaround
- anv: Disable multisampled host transfer support
- anv: Ensure host-transfer tilings are supported by ISL
- blorp: Fix Tile64 clear redescription assertion
- intel: Add and use ISL_SURF_USAGE_PREFER_4K_ALIGNMENT
- intel/isl: Refactor tiling selection in isl_surf_init_s
- intel/isl: Prefer the smallest suggested tiling
- intel/isl: Drop HIZ/MCS checks in CCS support query
- intel/isl: Prefer suggested tilings which use CCS
- anv: Query the plane in anv_can_fast_clear_color()
- anv,iris: Don't fast-clear 3D + Ys on gfx12.0
- intel: Enable CCS support for Yf and Ys
- intel/isl: Fix QPitch of arrayed MCS
- iris: Set missing flags on clear color changes
- iris: Use the CLEAR state on Xe2+ for MCS
- anv: Update predicated resolve documentation
- anv: Fix the fast clear type for FCV writes
- anv: Reset fast-clear type in transition_color_buffer()
- anv: Support partial resolves on any level/layer
- anv: Set compressed bit separately from fast-clear type
- anv: Delete conversion of CCS_D partial resolve
- anv: Inline the CCS/MCS predicated resolve functions
- anv: Line wrap anv_CmdClearColorImage
- anv: Don't return the Xe2+ fast-clear type early
- anv: Use variable default value for some images using CLEAR
- anv: Support fast clears on more layers
- anv: Don't partial resolve LOD1+ for non-FCV CCS
- intel/blorp: Avoid unused surface redescription calc
- intel/blorp: Optimize non-zero-layer fast-clears
- intel/blorp: Bump pitch when clearing unaligned bottom rows
- anv: Fix clear state of WSI blit sources during presentation
- anv: Treat non-WSI PRESENT_SRC as TRANSFER_SRC
- anv: Don't set the display flag on WSI blit sources
- anv: Drop anv_image::wsi_blit_src
- intel: Disable CCS_E support for YCRCB on gfx12
- intel/isl: Add YCRCB CMF mappings for Xe2+
- anv: Enable YCRCB CMFs on Xe2+
- intel/blorp: Fix the redescribed fast-clear qpitch
- intel/isl: Replace mc_format with aux_format
- intel/isl: Increase 3D miptail workaround scope
- intel/isl: Generalize and move some Yf/Ys miptail limits
- intel/isl: Relax some alignments in get_image_surf()
- intel/isl: Use a fixed alignment for single slices
- intel/blorp: Lower bit-casting code in blorp_copy()
- intel/blorp: Add blorp_surf::has_replicated_pixel
- anv: Prepare for format width changes in blorp_copy()
- anv: Add WaSamplerCacheFlushBetweenRedescribedSurfaceReads
- intel/blorp: Make blorp_copy() format queries aux-dependent
- intel/blorp: Use stencil hardware less for CPB copies
- intel/blorp: Add blorp_surf_convert_to_single_level_tile()
- intel/blorp: Redescribe surfaces for copies
- isl: Apply VALIGN_8 fast-clear restriction on Xe3P+
- intel/blorp: Fix width scaling for YCBCR copies

Natalie Vock (26):

- aco: Fix parameter stack size calculation
- radv/rt: Refactor shader group stack size calculation to include traversal stack
- aco: Don't exclude discardable parameters from register preservation
- radv/rt: Fix some tail-call compatibility checks
- radv/rt: Fix discardable attributes on chit and traversal shaders
- meson: Identify LTO builds in the package version
- mesa: Prevent building with LTO
- radv/rt: Only use ds_bvh_stack_rtn if the stack base is possible to encode
- radv: Initialize nir_lower_io_to_scalar progress variable
- radv/nir: Correctly handle workgroup sizes not aligned to 32
- radv/rt: Bump ray query stack base limit for GFX12
- radv/rt: Fix shared ray query stack on top of application LDS
- vulkan: Rename {encode,update}_bind_pipeline to {encode,update}_prepare
- vulkan: Bump MAX_ENCODE_PASSES
- radv/rt: Fix cases in which the bound BVH build pipeline gets clobbered
- radv/rt: Remove RADV_OFFSET_UNUSED
- radv/rt: Don't enable midpoint sorting
- radv/rt: Don't combine config of unreachable shaders
- radv: Disable RADV_DEBUG=llvm in release builds
- aco/spill_preserved: Only compute preserved registers if in a callee
- aco/isel: Fix stack_ptr synthesis
- aco/lower_to_hw_instr: Run p_init_scratch if the program has a call
- radv: Rewrite the RT prolog in NIR
- aco: Nuke ACO-side prolog selection
- nir/deref: Elide loads/stores from deref cast of undef
- radv: Run nir_opt_deref after first optimization loop

Nataraj Deshpande (1):

- anv: Fix ASTC emulation sync in CopyImage and CopyBufferToImage

Nick Hamilton (12):

- pvr: Temporarily disable the buffer device address extension
- pco: Fix for atomic operations on an image buffer
- pvr: Fix the isp samples per tile calculation
- pco: Fix multiview sampling of subpass input attachments
- pvr: Fix incorrect subpass merging optimisation
- pvr: Rename pvr_render_input_attachment
- pvr: Add missing support for preserve attachments
- pvr: Update CI fails list after render pass fixes
- pvr: Add support for fragment pass through shader
- pvr: Fix for multiple attachments being assigned to the same tile buffer.
- pco: fix clamping the array index when shaderImageGatherExtended is enabled
- pvr: Revert don't csb emit multi-layer clear attachments without rta support

OPNA2608 (2):

- vc4: Fix printing of get_tiling.modifier
- rocket: Fix printing of rknpu_mem_create.dma_addr

Olivia Lee (13):

- Revert "panvk: advertise VK_EXT_primitives_generated_query on v10+"
- hk: fix hk_passthrough_gs_key size computation
- hk: fix passthrough GS key invalidation
- panvk/csf: use different resource registers for precomp vs user dispatch
- pan/va: weaken barrier requirements for allow_merging_workgroups
- pan/va: move allow_merging_workgroups decision to drivers
- pan/va: don't merge workgroups when subgroups are used
- panvk/csf: take merged workgroups into account for divergence
- panvk/csf: lower divergent values introduced by merged workgroups
- panvk/csf: enable allow_merging_workgroups when possible
- panfrost: don't try to emit varying shader stats on v12+
- panvk/csf: check printf buffer one last time when queue is lost
- pan/bi: fix memory access alignment

Olle Lögdahl (3):

- aco/isel: move if_context and loop_context to heap
- aco/isel: use iterative visitor during traversal
- aco/isel: added test-case for iterative cf visitor

Osama Abdelkader (1):

- vulkan/wsi: Fix realloc error handling in wsi_get_modifiers_for_format

Patrick Lerda (13):

- r600: fix cayman msaa shading behavior
- r600: disable l8_srgb on r700 and older gpus
- r600: fix rv770 dot4 operations
- r600: make vertex r10g10b10a2_sscaled conformant on palm and beyond
- r600: fix rv770 clamp to max_texel_buffer_elements
- r600: enable GL_EXT_shader_realtime_clock
- r600: update cubearray imagesize calculation
- r600: improve vs_as_ls switch reliability
- r600: fix cs atomic operations when the shader is called multiple times
- r600: fix alpha-to-coverage and alpha-to-one used together
- r600: fix atomic buffer offset
- r600: update vertex emit_varying_pos
- r600: fix atomic_counter_post_dec

Paulo Zanoni (13):

- anv: anv_get_image_format_features2() can be static
- anv: don't pass vk_format to anv_get_image_format_features2()
- anv: unify blit_cts_workaround handling
- anv: extract anv_color_format_supports_drm_modifier_tiling()
- anv: extract 2 subvariants of anv_get_image_format_features2()
- anv: extract anv_get_color_format_features()
- vtn_bindgen2: limit the nir_opt_peephole_select optimization
- elk: don't use instr->const_index[] directly
- anv: avoid VK_STRUCTURE_TYPE_BIND_MEMORY_STATUS warnings
- intel/blorp: remove always-true #if
- intel/genxml: move the GPGPU_DISPATCHDIM* registers to genxml
- intel/isl: fix assert when surf->size_B is > UINT_MAX
- intel/isl: warn about excessive num_elements only once

Pavel Ondračka (36):

- r300: split unaligned 3D texsubimage uploads by layer
- r300: align macro-tiled stride-addressed textures in X
- r300/ci: enable glx tests
- i915/ci: update expectation missed in piglit uprev
- mesa: implement FRAMEBUFFER_RENDERABLE internalformat query
- r300/ci: update expectations
- i915/ci: update expectations
- r300: handle polygon-mode points in point sprite path
- r300/ci: update expectations
- r300: Z16 polygon offset fixes
- r300: copy target when merging alpha output instruction
- r300: disable HiZ for PIPE_FUNC_ALWAYS
- r300/ci: enable HiZ in CI
- r300: make occlusion queries work without a bound depth buffer
- r300: pause and resume occlusion queries for blitter/meta paths
- r300: return zero for finished occlusion queries with no emitted results
- frontends/dri: fix NUM_PLANES for imported dma-buf images
- r300: disable clip-discard watermark for triangles
- r300: pad short vertex shaders to avoid R3xx hangs
- r300/ci: update expectations
- r300/ci: expectation update
- r300: fix bias presubtract algebraic transformation
- r300: lower Z16 polygon offset scale coefficient
- r300: don't apply odd macroblock rounding to 3D textures
- r300: disable zmask clears for large surfaces
- r300: add shared HyperZ pipe-count helper
- r300: split large HiZ clears into multiple packets
- st/bitmap: release the temporary bitmap sampler view
- r300: keep all vertex atributes 32bit on big endian
- r300: fix BE 32-bit CBZB clear values
- r300: fix BE CBZB clears for swapped 8888 formats
- gallium/u_blitter: remove unused CONST declaration when using IMM
- r300/ci: rv380 piglit
- r300/ci: update expectations
- r300: fix MSAA resolve COLORPITCH tiling after pipe_surface de-pointerization
- r300: dirty VS state when switching variants

Pierre-Eric Pelloux-Prayer (37):

- radeonsi/sqtt: retrieve sqtt data after the flush ended
- radeonsi/sqtt: use pipe_aligned_buffer_create to allocate bo
- radeonsi/sqtt: use pipe_buffer_map instead of ws->buffer_map
- radeonsi/sqtt: allocate BOs in VRAM
- radeonsi/sqtt: use radeon_add_to_buffer_list
- mesa/vbo: update NeedFlush before flushing
- dri: prevent read_sbc from going backward
- ac: keep a single instance of sid_table
- radeonsi: move mediump code to a separate compilation unit
- radeonsi: split shaders/draw code from si_debug to a new file
- radeonsi: move buffer high-level functions to si_buffer.c
- radeonsi: move si_ps_key_update_framebuffer to si_state.c
- ac: add u_stub.h helper
- meson: add with_gfx_compute property
- radeonsi: use with_gfx_compute to disable parts of the driver
- radeonsi: remove nir references when graphics is disabled
- frontends/va: fix undefined ref error
- mesa: don't wraparound st_context::work_counter
- radeonsi: move spi_shader_*_format to si_shader_variant_info
- radeonsi: account for outputs_written when updating spi_shader_col_format
- radeonsi/test: update failures
- gallium/u_blitter: add a new fs_color_clear variant
- drm-shim: fix shim on GLX
- winsys/amdgpu: remove assert
- ac: remove ac_null_device
- ac/info: add ac_fill_tiling_info
- ac/info: add ac_fill_memory_info
- ac/info: add ac_fill_hw_ip_info
- ac/info: add ac_identify_chip
- ac/info: move more memory properties to ac_fill_memory_info
- ac/info: remove has_bo_metadata
- ac/info: add ac_fill_bug_info
- ac/info: add ac_fill_feature_info
- ac/info: add ac_fill_hw_info
- ac/info: add ac_fill_tess_info
- ac/info: constify ac_fill_compiler_info
- ac/tests: use amdgpu shim devices

Pohsiang (John) Hsu (12):

- mediafoundation: refactor update picture desc
- mediafoundation: remove published codecapi
- mediafoundation: update version to 1.09
- mediafoundation: in slice generation mode, send METransformNeedInput once per frame.
- d3d12: add workaround for max subregion number reported in slice auto mode
- mediafoundation: add workaround for max subregion number reported in slice auto mode
- mediafoundation: fix hevc vui time_scale
- mediafoundation: set defualt unwrapped poc for h264 to true
- mediafoundation: set reasonable number of reference frames if the user didn't set CODECAPI_AVEncVideoMaxNumRefFrame
- d3d12: ifdef the surfaces member from d3d12_batch under HAVE_GALLIUM_D3D12_GRAPHICS
- mediafoundation: add support for GPU priority setting via IMFDXGIScheduler
- mediafoundation: remove published codecapi

Priya Hosur (1):

- ac/nir/ngg: re-enable use of known compile-time GS connectivity

Qiang Yu (11):

- radeonsi: move sqtt draw code to shared function with mesh pipeline
- radeonsi: mesh shader support sqtt
- radeonsi: be able to record sqtt for frame 0|1 and no swap
- radeonsi: not overlap ib print for multi context
- radeonsi: fix mesh shader outputs kill
- winsys/amdgpu: add timeline point support to fence lists
- winsys/amdgpu: use timeline syncobj chunks in kernelq submission
- radeonsi: add timeline semaphore support to fence operations
- radeonsi: advertise GL_NV_timeline_semaphore
- docs: add GL_NV_timeline_semaphore support for radeonsi
- ac,radeonsi,radv: fix print IB assertion fail for reserved fields

Radu Costas (7):

- pvr, pco: Commonize texture packing code
- pco: Add hwinfo check for features in sampler code
- pco: Commonize atomic sync operations
- pvr,ci: Update expected fails list with new tests
- pvr, ci: Update expected failures list
- pco: Amend errant nir_move_option
- pvr, ci: Remove tests from expected failure list

Raviraj Uppal (2):

- ac/nir: Fixed OpenGL CTS transform feedback overflow detection test case The ordered atomic commits the post-add offset to memory, but overflow was computed using the pre-add offset, causing partial overflows to be missed and counters to become corrupted.
- driconf: disable allow_rgb16_configs for SPECviewperf

Reilly Brogan (1):

- amd,compiler: fix const errors found with C23 glibc support

Renato Pereyra (5):

- pps: On data source register, report all counters as enabled by default
- pps: Remove timestamps from counter descriptions
- pps: Skip emitting repeated zero counter values
- intel: Add pid and tid to Vulkan QueueSubmit events
- intel: Include available counter descriptions in the perfetto counter spec

Rhys Perry (113):

- aco/insert_fp_mode: remove incorrect assertion
- radv: fix RADV_DEBUG=shaderstats with RT pipelines
- aco: add lv1/lv2 as alias for v1/v2.as_linear()
- aco: use lv1/lv2 instead of v1/v2.as_linear()
- aco: use lv1.resize() pattern
- radv: fix when incomplete rt pipeline libraries are loaded from cache
- radv: improve skipping of creation of NIR for cached rt pipeline libraries
- aco: use ABI::numClobbered() more
- aco: use Program::stack_ptr instead of Program::static_scratch_rsrc
- aco: add return address to call_clobbered_regs
- aco: move return address to a clobbered register
- aco/insert_waitcnt: improve s_setpc_b64/s_swappc_b64/end_with_regs a bit
- radv: include ahit/isec shaders in radv_get_shader_from_executable_index
- nir/search: remove creation of swizzle
- nir/search: use memcmp/memcpy/memset
- aco: consider 64-bit transcendental normal valu for s_delay_alu
- radv: add ngg_wave_id_en to radv_shader_info
- radv,aco/gfx11: preserve s2 when NGG_WAVE_ID_EN=1
- aco: only consider cost of memory loads at waitcnt
- aco/sched_ilp: improve scheduling with VMEM/DS->VALU WaW
- aco/ra: track cost of moving variables
- aco/ra: move variables from affinity register to avoid waitcnt
- aco/ra: prefer phi operands which don't create waitcnt
- aco/ra: create vectors for affinities of split definitions
- nir/opt_intrinsics: optimize inot(inverse_ballot(const))
- ac/nir,radv,radeonsi: flip branches to avoid waitcnts
- nir/load_store_vectorize: more carefully add entries from loop preheader
- nir/load_store_vectorize: don't update last_entry after a barrier
- nir: fix fmin_agx/fmax_agx constant folding
- nir: only set fp_math_ctrl if meaningful
- nir/algebraic: remove ignore_exact
- aco: fix gfx6-8 store_scratch() with function calls
- aco/ra: don't modify parallelcopies if get_reg_for_affinity fails
- aco: fix VALUReadSGPRHazard with s_call_b64/s_swappc_b64
- aco: reset all vgpr_used_by_vmem\_ in resolve_all_gfx11
- aco: resolve hazards before calls
- radv: disable fast math for frag_coord.z adjust
- radv: combine v_mov_dpp into fma in frag_coord.z adjust
- radv: fix size of reserved in radv_physical_device_cache_key
- radv: remove radv_physical_device::max_shared_size
- ac/nir: remove gfx_level parameter from ac_nir_lower_indirect_derefs
- ac/nir: remove ac_nir_lower_ps_late_options::family
- ac/gpu_info: fix outdated comment
- ac/gpu_info: remove padding from ac_cu_info
- amd: move various flags to ac_cu_info
- amd: add ac_cu_info::has_vrs_frag_pos_z_bug
- amd: add ac_cu_info::local_invocation_ids_packed
- ac/llvm: pass ac_cu_info to ac_llvm_context_init
- radv: don't cache esgs_ring_size/gsvs_ring_size
- aco: perform dce for blocks skipped for process_block()
- ac/gpu_info: move some NGG flags to ac_cu_info
- ac/nir: use ac_nir_lower_ngg_options for ac_nir_lower_ngg_mesh
- ac/nir: don't pass radeon_info to NGG lowering
- ac/nir: don't pass radeon_info to ac_nir_set_options
- ac/nir: pass ac_cu_info to ac_nir_compute_tess_wg_info
- ac/nir/ngg: add ac_cu_info shortcut
- amd: rename ac_cu_info to ac_compiler_info
- ac/gpu_info: print most of ac_compiler_info
- aco: fix PS epilog dual-source blending with only one color output
- ac/nir: fix when both dual source outputs are unwritten
- radeonsi: replace dual_src_blend_swizzle flag with dual_src_blend
- radeonsi: fix PS epilog dual-source blending with only one color output
- radeonsi: fix dual-source blending with only one output
- radv: don't mask PS epilog spi_shader_col_format with dual source blending
- nir/range_analysis: set deleted key
- nir: add nir_fp_analysis_state
- nir/range_analysis: use SSA index for hash table keys
- nir/range_analysis: use function pointers for lookup
- nir/range_analysis: use sparse array for float analysis
- zink: use hash_table_u64 instead of _mesa_hash_table_create_u32_keys
- amd/common/virtio: use hash_table_u64
- util: make UINT32_MAX a reserved key for _mesa_hash_table_create_u32_keys
- nir/range_analysis: use uint16_t for sparse array elements
- nir/range_analysis: use a dense array
- nir/range_analysis: cache results of non-alu fp class queries
- amd/drm-shim: enable conformant_trunc_coord for navi31
- radv: improve lower_array_layer_round_even condition
- aco/tests: fix assembler tests with LLVM 22
- aco/tests: fix assembler/isel tests with LLVM 23
- radv: don't copy radv_vertex_input_state in CmdSetVertexInputEXT
- radv: fix memory leak in radv_rt_nir_to_asm
- radv: add radv_shader_debug_info
- radv: simplify radv_shader_dump_debug_info
- radv: add radv_parse_binary_debug_info
- radv: add radv_shader_debug_info parameter to radv_shader_create
- radv: move radv_shader_create out of radv_compute_pipeline_compile
- radv: copy spirv in radv_graphics_shaders_nir_to_asm instead
- radv: move radv_shader_create out of radv_graphics_shaders_nir_to_asm
- radv: move radv_shader_create out of radv_graphics_shaders_compile
- radv: move radv_shader_create out of radv_rt_nir_to_asm
- radv: create radv_rt_spirv_to_nir
- util: allow any key for hash tables
- util: simplify hash_table_u64
- util: fix UBSan error with _mesa_bfloat16_bits_to_float
- nir/tests: fix NaN/inf checks in skip_test()
- nir/algebraic: optimize exact f2u32(fmul(unpack_norm))
- nir/propagate_invariant: include intrinsics
- nir/propagate_invariant: set fp_math_ctrl for intrinsics
- nir/propagate_invariant: include derefs
- nir/propagate_invariant: be more conservative with NULL variables
- nir/propagate_invariant: be more conservative with aliasing variables
- nir/propagate_invariant: handle images
- nir: add and use block predecessor helpers
- nir: add nir_loop_has_back_edge helper
- nir/cf: don't remove block predecessors while iterating
- nir: use a u_dynarray for block predecessors
- aco: ignore copykill+latekill operands in get_temp_reg_changes
- docs/aco: add live variable analysis documentation
- ir3/array_to_ssa: skip remove_trivial_phi for non-array phis
- ir3/array_to_ssa: initialize ir3_instruction::data
- ir3/ra: fix copy-paste error
- aco: support s_bitset
- aco/ra: create s_bitset

Rob Clark (115):

- ir3: Use fd_dev_info from ir3_compiler
|
||
- ir3: Handle dual-wave reconvergence
- freedreno/common: Fix gen8 EFU float control
- freedreno: Force single wavesize if double threadsize is unsupported
- tu: Drop HIC support for depth images
- freedreno/lrz: Correct lrz fc layout for gen8
- tu: Re-enable LRZ for gen8
- freedreno/a6xx: Better program state size calc
- freedreno/decode: Skip bindless dumps on pre-bindless hw
- freedreno/decode: Enable --bindless for cffdump tests
- freedreno/decode: Add multi-plane descriptor coverage
- freedreno/registers: Drop a6xx descriptor chip use
- freedreno/registers: Rename A6XX_TEX_MEMOBJ
- freedreno/decode: Fix gen8 descriptor address
- freedreno/decode: Extract out helper to set varset
- freedreno/decode: Decode all descriptor variants
- freedreno/registers: Descriptor variants
- freedreno/decode: Add script support for enum types
- freedreno/decode: Replace/remove __tonumber()
- freedreno/decode: Allow direct access to domain bitfield
- freedreno/decode: Add lua handler to filter descriptors
- freedreno/decode: Allow dom[1] to be NULL
- ir3: Rename cat6 UBO/UAV descriptor src
- ir3: Disasm shader descriptor stats
- freedreno/decode: Split out domain based decoding
- freedreno/decode: fix domain decode for "structs"
- freedreno/decode: add shader stats object
- freedreno/decode: pass more info to descriptor handler
- freedreno/decode: call show_descriptor() for UBO and SAMPLERs as well
- freedreno/decode: filter unused descriptors in lua
- freedreno/decode: Dump filtered bindless descriptors by default
- freedreno/decode: Fix query bin vals
- freedreno/decode: Expose gpu buffers to lua
- freedreno/decode: Handle strips
- freedreno/decode: Allow raw access to pm4 packets
- freedreno/decode: Emulate CP_MEM_WRITE
- freedreno/decode: Shorten query string
- freedreno/decode: Split out endswith() helper
- freedreno/decode: Filter redundent _HI regs
- freedreno/decode: Keep intereactive for query mode
- nir: Fill in missing conversion opts
- freedreno/decode: Fix endswith()
- freedreno/registers: Update CP_COND_WRITE
- freedreno/registers: Update GRAS_BIN_FOVEAT
- tu: Split out stomp_regs() helper
- tu: Mark TU_CMD_DIRTY_COMPUTE_DESC_SETS after stomping
- freedreno/registers: Move binning regs to "cmd"
- freedreno/registers: Rename some unknown A2D regs
- freedreno/registers: Split out "blit" usage
- freedreno/registers: Split out "resolve" usage
- freedreno/registers: Split out compute usage
- freedreno/registers: Move remaining rp_blit to draw
- freedreno/registers: Usage additions/corrections
- freedreno/rnn: Track reg usage
- freedreno/decode: Remove prefetch-test
- freedreno/decode: Use reg usage for reg summary
- freedreno: Move some draw regs into driver
- gallium: Add PIPE_QUERY_TIMESTAMP_RAW
- freedreno/a6xx: Implement PIPE_QUERY_TIMESTAMP_RAW
- nir: Fix validation error after nir_round_int_to_float()
- ir3: More COMPUTE vs KERNEL
- freedreno+ir3: Implement CL isam mode
- gallium: Switch TIMESTAMP_RAW back to callback
- gallium: Add warning about PIPE_QUERY_x's ABIness
- rusticl: Add CL specific bind flag
- freedreno/a6xx: Hide 10_10_10_2 for opencl
- ir3: Lower 8b usub_sat
- freedreno/fdl: Set layer_size in explicit_layout case
- freedreno: Add missing cl_gl_sharing cap
- freedreno: Fix stdout vs stderr logging
- freedreno/a6xx: Barrier debug
- freedreno: Flip logging to debug
- freedreno: Block rusticl on older gens
- freedreno/a6xx: Fix num_groups programming
- freedreno: Attach fence to last batch
- freedreno: Reuse last_fence when possible
- freedreno/a6xx: Don't emit epilogue per-tile
- freedreno/a6xx: Rework flushing events
- freedreno/decode: Add missing a6xx/a7xx reg decoding
- freedreno: Use linear for 1d/1d_array
- freedreno: Reduce advertised memory
- freedreno/drm: bo cache logging vs tsan
- freedreno: Initialize debug once
- ir3: Initialize debug once
- freedreno: Avoid shadow blits for compute contexts
- freedreno/a6xx: Avoid touching long lived stateobj refcnt
- rusticl: Let backend control convert_alu_types lowering
- ir3: Handle (some) convert_alu_types in backend
- freedreno: Rename a830
- freedreno: Split up freedreno_devices.py
- freedreno: Add --nvtop arg
- freedreno/fdl: Use 4k alignment for tiled
- ir3: Move shader upload under variants_lock
- freedreno/drm: Fix bo_flush race
- freedreno: Check for flushed batches
- freedreno: Update pscreen->num_contexts
- freedreno: Don't re-bind global buffers
- freedreno: Move pvtmem to screen
- freedreno/a6xx: Fix sharable cs races
- freedreno/drm: Shareable stateobjs
- ir3: Lower ffma
- ir3: Late lowering of fmul+fadd to ffma
- freedreno/ci: Update trace expectations
- ir3: Set max_workgroup options
- freedreno/registers: Add a couple missing bitfields
- freedreno/registers: Remove left-over comment
- meson: Fix build break on f43, gentoo, etc
- freedreno/a6xx: Move A2D reg write to ncrb
- freedreno/common: Fix upstream a830 chip_id
- freedreno/registers: Update gmu reg offsets
- freedreno/a6xx: Fix supported-blit fmt check
- freedreno/common: Drop gen8 0x78000 offset
- freedreno: Add a829
- freedreno/a6xx: Fix blit fmt check
- tu/kgsl: Add UBWC_5 and UBWC_6 support

Rob Herring (Arm) (17):

- ethosu: Fix padding calculation
- teflon/tests: Add 16-bit output support
- teflon: Add debug string for concatenation
- test_teflon: Fix crash with read-only buffers
- test_teflon: Fix missing UInt16/Int16 output size
- test_teflon: Add 32-bit integer output comparison
- teflon: Add support for setting the tensor type size
- ethosu: Add support for 16-bit tensors
- ethosu: Add scalar ADD support
- ethosu: Handle reversing IFM and IFM2 operands
- ethosu: Handle IFM2 H/W/D broadcast
- teflon: Support ReLU activation for ADD ops
- ethosu: Support ReLU activation for ADD ops
- ethosu: Fix buffer overrun in stridedslice
- ethosu: Fix U85 AvgPool for greater than 8x8 kernel sizes
- ethosu: Drop 2nd allocation of IFM and OFM
- ethosu: Move ethosu_allocate_feature_map() to ethosu_lower.c

Robert Mader (5):

- lavapipe: enable dmabuf import for planar drm formats
- lavapipe: Remove some dead code
- llvmpipe: Stop aligning height to raster block size for unbacked handles
- nir/lower_tex: Reinstate LSB to MSB shift
- llvmpipe: Implement manual context resets

Rohan Garg (2):

- anv: set a private binding when the image is not externally shared
- anv: refactor add_aux_state_tracking_buffer for conciseness

Rohit Athavale (1):

- mediafoundation: Test compile steps v/s step , and set build flag

Roland Scheidegger (3):

- llvmpipe: get rid of unused code in float to small float code
- llvmpipe: don't rely on cpu denorms for float to smallfloat conversion
- llvmpipe: disable denorms in compute shaders on x86/sse

Romaric Jodin (1):

- pan/bi: lower phis to scalar early

Rouf, Farhan (1):

- amd/vpelib: Embedded Buffer Size for 3DLUT FL

Rudi Heitbaum (1):

- mesa: retain const qualifier from pointer

Ruijing Dong (1):

- ac/vcn: correct a typo in av1 dec header

Ryan Houdek (1):

- freedreno/fdl: Fix compiling with GCC and AVX2

Ryan Mckeever (2):

- panvk: lower multisampled images before nir_lower_descriptors
- panvk: enable fragmentStoresAndAtomics for Bifrost

Ryan Zhang (5):

- panvk: guard against NULL pointers to avoid crash
- panvk/csf: use DEFERRED_FLUSH for fragment job cache flush
- panvk: trivial fix to remove repeated assignment
- panvk/csf: rework IR descriptor handling for tiler OOM
- panvk: add VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL to host copy layouts

Sagar Ghuge (14):

- anv: Mark RootNodeOffset at 256B always
- nir: Add nir_resource_intel_internal entry
- anv: Set max outstanding ray queries to 1024
- intel/blorp: drop unused BLORP_BATCH_COMPUTE_ENGINE flag
- anv: Improve bvh_no_build option
- anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921
- anv: Write IR header using shader instead of CS
- anv/rt: Drop header update using blorp code path
- intel/genxml: Add new State Cache Perf Fix Disabled field
- anv: enable BTP+BTI RCC keying for some workloads
- anv/bvh: Drop atomic on instance_count
- intel/compiler: Handle TerminateOnFirstHit in ray query execution
- intel/compiler: Remove unused brw_nir_memclear_global helper
- anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921

Samuel Pitoiset (358):

- vulkan: fix missing begin debug marker for HPLOC
- spirv: Update the JSON and headers
- vulkan: update spec to 1.4.340
- radv: move emitting SQTT markers closer to the draw/dispatch packets
- radv: use the SQTT enable bit for PKT3_DRAW_{INDEX}_INDIRECT_MULTI
- radv: use the SQTT enable bit for PKT3_DISPATCH_MESH_INDIRECT_MULTI
- radv: use the SQTT enable bit for PKT3_DISPATCH_TASKMESH_INDIRECT_MULTI_ACE
- radv: fix applying radv_ssbo_non_uniform=true for Crysis 2/3 remastered
- radv: add a workaround for a synchronization bug in Strange Brigade Vulkan
- radv/meta: fix 3D color resolves with compute when base slice isn't zero
- radv/meta: return the flush bits from radv_clear_hiz()
- radv: optimize barriers when clearing HiZ on GFX12
- radv/sqtt: rework acquiring GPU timestamps
- radv/sqtt: rework acquiring timed cmdbufs
- radv/sqtt: reduce the number of timed cmdbufs
- radv: rework app workarounds implemented using internal layers
- vulkan: add support for VK_KHR_internally_synchronized_queues
- radv: advertise VK_KHR_internally_synchronized_queues
- radv: zero-initialize image view objects
- radv: fix tracking of pipelines used in secondaries
- radv/meta: remove declared but unused radv_decompress_resolve_rendering_src()
- radv/amdgpu: remove radv_dummy_winsys_create()
- radv/meta: remove unused emit_depth_stencil_resolve()
- radv/amdgpu: bypass GL2 for command buffer BOs
- radv: disable unordered submits when SQTT queue events are enabled
- ac,radv,radeonsi: shorten some emit macro names
- radv: emit pending flushes after late decompressions with fbfetch
- radv: stop delaying decompression passes for feedback loops with DRLR
- radv: emit the VRS surface as part of the framebuffer state on GFX11+
- radv: track redundant PA_SC_VRS_OVERRIDE_CNTL register writes
- radv: remove occurrences of VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR
- radv/meta: decompress resolve src outside of depth/stencil resolves
- radv/meta: make radv_decompress_resolve_src() static
- radv/meta: stop saving/restoring rendering state for FS/HW resolves
- radv/meta: fix the key for DCC decompress on compute
- radv/meta: remove useless check in radv_CmdClearAttachments()
- radv/meta: remove dead code for VK_FORMAT_R4G4_UNORM_PACK8
- radv/meta: remove dead DCC clear code about E5B9B9R9_UFLOAT_PACK32
- radv: fix late decompressions for fbfetch with more corner cases
- radv: only pass custom sample locations when relevant
- radv: emit late decompressions for fbfetch slightly earlier
- radv/meta: stop saving/restoring rendering state for color/depth decompressions
- radv/meta: remove unused saving/restoring rendering state logic
- zink/ci: update checksum of one trace running on VANGOGH
- radv/meta: remove dead code in the gfx depth/stencil clear path
- radv: move color/depth-stencil init surface helpers to radv_image_view.c/h
- radv: remove declared but unused radv_get_dcc_max_uncompressed_block_size()
- radv: move {depth,stencil}_compress_disable to the image view extra info
- radv: add a new dirty bit for the GFX12 HiZ workaround
- radv: emit the framebuffer state when rendering begins
- radv/meta: fix CmdCopyBufferToImage2() on compute queue with compressed HTILE
- radv/meta: add a function to fixup HTILE metadata for copies on compute queue
- radv/meta: stop fixing up HTILE after a partial copy
- radv: set COMPRESSION_EN=1 for depth or stencil storage images when supported
- radv/meta: do not disable compression for depth/stencil expand on compute
- radv/meta: skip some HTILE operations when it's decompressed on image stores
- radv/meta: remove an useless barrier when fixing up HTILE for copies on compute
- radv/meta: stop using custom sample locations for color resolves
- radv: pass VkSampleLocationsInfoEXT for depth/stencil expand
- radv: clear rendering state before performing resolves
- radv: make sure rendering isn't already active in CmdBeginRendering()
- radv: do not resolve when rendering is suspended
- radv: do not set the resume rendering flag for custom resolves
- radv/meta: stop trying to reduce the number of format variants
- radv/meta: use R32G32 formats for R64 slow color clears
- radv: enable trimming FS color exports for internal shaders
- radv/meta: stop fixing up DCC after a partial resolve using compute
- radv/meta: remove an useless barrier when fixing up DCC for compute resolves
- radv/meta: add a function to fixup DCC metadata for compute resolves
- radv: rename radv_image_use_dcc_image_stores()
- radv/meta: fix partial depth/stencil resolves with compute
- radv: cleanup barriers after a depth/stencil expand
- radv/meta: stop fixing up HTILE after a partial resolve using compute
- radv/meta: add HTILE support to radv_fixup_resolve_dst_metadata()
- radv/meta: optimize a barrier with depth/stencil compute resolves
- radv/meta: move the barrier for depth/stencil compute resolves outside
- ac,radv,radeonsi: use correct swizzle/pitch for depth-only images with SDMA
- radv/meta: remove useless DCC decompressions for image<->buffer
- radv/ci: mark more WSI flakes for NAVI21
- radv/ci: mark more WSI tests as flakes on NAVI21
- radv: remove a redundant check in radv_image_is_renderable()
- radv/meta: rename some variables for btoi 96-bit shader
- radv/meta: rename r32g32b32 to 96bit
- radv/meta: rework get_image_stride_for_96bit() and make it non-static
- vulkan/runtime: add a separate function to build ETC2 decode core shader
- vulkan/runtime: add support for ETC2 emulation with copy_memory_indirect
- radv: simplify buffer-to-image and image-to-image operations for 96-bit formats
- radv: fix potential corruption after FMASK decompression on GFX6-8
- radv: skip some operations when the image is already zero-initialized
- radv/meta: fix depth/stencil resolves with different regions
- radv: reserve CS space for the HiZ WA on GFX12
- radv: skip some redundant operations when rendering is resumed
- radv: handle the cache flush workaround for mips before emitting the fb
- radv: suspend/resume dynamic rendering completely
- radv: fix independent sets with dynamic buffers and GPL
- ac/nir: fix writemask for dual source blending on GFX11+
- radv: fix potential GPU hangs with secondaries on transfer queue
- radv/nir: use radv_physical_cache_key::no_rt more
- radv/nir: use radv_physical_cache_key::emulate_rt more
- radv/nir: use radv_physical_cache_key::bvh8 more
- radv: use radv_physical_cache_key::disable_trunc_coord more
- radv: use radv_physical_cache_key::disable_aniso_single_level more
- radv: use radv_physical_cache_key::disable_shrink_image_store more
- radv: use radv_physical_cache_key::clear_lds more
- ac/nir: stop passing radeon_info for addr->coord helpers
- radv/meta: stop using pdev for shaders that use addr<->coord helpers
- radv/meta: stop using pdev for building the resolve meta shaders
- radv/meta: stop using pdev for some query resolve shaders
- radv: remove an useless check for VK_IMAGE_LAYOUT_PREINITIALIZED
- radv/meta: replace radv_meta_resolve_type by VkImageAspectFlags
- radv/meta: add depth/stencil support to the core resolve shader
- radv/meta: use the same shader for color/depth&stencil compute resolves
- radv/meta: inline one function in the compute resolve path
- radv/meta: pass a resolve mode for compute color resolves
- radv/meta: simplify creating pipelines for color/depth&stencil compute resolves
- radv/meta: add a single function for color/depth&stencil compute resolves
- radv/meta: fix the shader stage for push constants in the fragment resolve path
- radv/meta: move the barrier for color fragment resolves outside
- radv/meta: use the same shader for color/depth&stencil fragment resolves
- radv/meta: inline one function in the fragment resolve path
- radv/meta: pass a resolve mode for fragment color resolves
- radv/meta: simplify creating pipelines for color/depth&stencil fragment resolves
- radv/meta: add a single function for color/depth&stencil fragment resolves
- ac/cmdbuf: only set DCC_WRITE_COMPRESS for DCC on SDMA 5+
- radv: only enable DCC/HTILE if it's compressed with SDMA
- ac/sdma: fix pitch assertion for SDMA7
- radv: fix copying images with different swizzle modes on SDMA7
- radv: remove redundant radv_sdma_surf::micro_tile_mode
- radv: tidy up determining 3D alignment for SDMA
- ac/parse_ib: fix parsing some packets on SDMA7
- ac,radv,radeonsi: merge tiled/linear surfaces into one struct
- radv: mark linear images for SDMA as potentially compressed on GFX12
- ac/sdma: rework and fix metadata for SDMA7
- radv: fix computing pitch/slice_pitch for compressed block formats
- radv/meta: remove redundant barriers in vkCmdResolveImage2()
- radv/meta: remove an useless barrier after color resolves with graphics
- radv/meta: remove an useless barrier before color resolves with graphics
- radv/meta: optimize synchronization with compute resolves
- radv/meta: rework the barrier for depth/stencil resolves
- radv/meta: set the depth/stencil resolve region depth to 1
- radv: disable STORAGE for depth-only formats
- radv: remove radv_disable_depth_storage drirc
- radv: always fast-clear non-MSAA color image with comp-to-single on GFX10-10.3
- ac/surface: remove remaining occurrences of HiS on GFX12
- radv: add radv_image_has_hiz()
- radv: add radv_image_has_hiz_metadata()
- radv: remove unused radv_gfx12_get_hiz_clear_value()
- radv: initialize HiZ also for depth-only images
- radv: always enable DISABLE_CONSERVATIVE_ZPASS_COUNTS on GFX11
- radv: fix a GPU hang with PS epilogs and secondary command buffers
- radv: remove redundant radv_sdma_surf::is_3d
- radv: remove redundant radv_sdma_surf::is_linear
- radv: simplify 96-bit copies with SDMA
- radv: use vk_image_buffer_copy_layout() for SDMA buf layout
- radv: simplify computing offset/extent of SDMA surfaces
- radv: remove unnecessary radv_sdma_surf::{blk_w,blk_h}
- radv: simplify getting bpe for SDMA surfaces
- radv: tidy up radv_sdma_surf
- radv: replace radv_sdma_surf by ac_sdma_surf
- radv/meta: fix HTILE fixup after copying depth/stencil image copies
- radv: fix local invocation index for mesh/task and quad derivatives on GFX12
- radv: apply the 1D workgroup optimization for mesh/task shaders too
- radv: dump the PS epilog in the GPU hang report
- amd/drm-shim: add rembrandt
- amd/drm-shim: add phoenix
- amd/drm-shim: bump version_minor to 52
- ac,radeonsi: pre-compute some raster config in ac_gpu_info
- ac,radeonsi: move guardband computations to common code
- radv: use common guardband computations
- radv: optimize clipping performance with PA_SU_HARDWARE_SCREEN_OFFSET
- zink/ci: update the lists for CEZANNE and VANGOGH
- zink/ci: update traces expectations for VANGOGH/GFX1201
- radv: initialize HiZ for UNDEFINED transitions on transfer queue
- ac,radv,radeonsi: add has_db_force_stencil_valid_bug
- radv: set {color,ds}_samples for inherited rendering state
- radv: only emit FORCE_S_VALID(1) for MSAA depth/stencil images
- radv: fix missing L2 cache invalidation with streamout on GFX12
- radv: rewrite a comment explaining why PFP waits for ME with streamout
- ci: uprev vkd3d
- radv/meta: make some functions non-static
- radv: implement VK_KHR_copy_memory_indirect
- radv: advertise VK_KHR_copy_memory_indirect on GFX8+
- radv: handle FRAGMENT_SHADING_RATE_ATTACHMENT_READ properly on GFX10.3
- radv/meta: remove redundant cache flushes when copying VRS rates to HTILE
- radv: tidy up radv_postprocess_nir()
- radv: remove empty gather_shader_info_cs()
- radv: remove unused radv_device parameter to few functions
- radv: use radv_physical_device_cache_key::use_ngg_culling more
- radv: remove radv_nir_compiler_options::info
- radv: remove unnecessary radv_device parameter to few functions
- radv: remove radv_use_llvm_for_stage()
- ac/rtld: remove radeon_info
- radv/meta: use the fragment resolve path by default
- radv/meta: remove CB_RESOLVE
- vulkan: adjust MESA_VK_PIPELINE_RAY_TRACING_FLAGS with beta extensions disabled
- vulkan: update spec to 1.4.346
- vulkan: use vk_object_zalloc() for acceleration structs
- vulkan: add support for vkCreateAccelerationStructure2KHR()
- vulkan: add helpers for device address range
- vulkan: add vk_image_memory_copy_layout()
- vulkan: add helpers for depth/stencil only layouts
- radv: only set the relevant image views for custom depth/stencil resolves
- radv: stop checking whether HTILE is compressed with the UNDEFINED layout
- radv/meta: add separate ds layouts support to the HTILE expand pass
- radv/meta: stop setting the other depth/stencil attachments when unused
- radv: handle separate depth/stencil layouts correctly for fbfetch decompressions
- radv: handle separate depth/stencil layouts correctly for barriers
- radv: fix missing HTILE decompression with separate depth/stencil layouts
- radv: always use separate depth/stencil layouts for rendering
- radv: cleanup valid image layouts in radv_layout_is_htile_compressed()
- vulkan: do not pass vk_instance for debug report messages
- aco,radv,radeonsi: remove debug report support in ACO
- radv: stop passing radv_device for SPIR-V debug reports
- radv/meta: fix missing sync for compute resolves
- radv/amdgpu: free the VA range in case the BO allocation failed
- radv/amdgpu: remove dead code in radv_amdgpu_winsys_bo_create()
- vulkan: stop passing vk_device to vk_set_subgroup_size()
- radv: remove unnecessary radv_device parameter to few functions
- radv: move radv_printf_data to radv_debug_nir
- radv: move valid VA debug info to radv_valid_va data
- radv: stop associating NIR with device for debugging tools
- radv: move setting NIR options for meta shaders
- radv: stop passing radv_device for creating NIR meta shaders
- radv/meta: cleanup determining the resolve method
- radv/meta: decompress source resolve images slighly earlier
- radv: use nir_intrinsic_printf in radv_build_printf()
- radv: use nir_shader::uses_printf for lowering printf
- ac/gpu_info: remove a TODO about LOAD_CONTEXT_REG on GFX6-7
- ac/cmdbuf: add an assertion for COPY_DATA+PFP with registers
- radv: use LOAD_CONTEXT_REG_INDEX when supported for streamout
- radv: remove useless PFP_SYNC_ME when loading color/ds metadata on GFX6-7
- radv: update color/ds clear metadata in ME
- radv: emit PFP_SYNC_ME right after STRMOUT_BUFFER_UPDATE is emitted
- radv: stop allowing users to disable the global BO list
- radv: remove radv_device::use_global_bo_list
- radv: cleanup functions that writes descriptors
- radv: stop allocating an array of BO for descriptors
- radv/amdgpu: remove the virtual BOs tracking logic
- radv: remove adding a BO to the per-cmdbuf list when unnecessary
- radv: ignore the GFX12 HiZ WA for internal blits
- radv: only consider HiZ as valid after clears with the depth aspect
- radv: fix a perf issue when clearing depth/stencil images on GFX12
- radv: cleanup redundant radv_image_has_hiz_metadata() checks
- vulkan: fix memory leak in vkQueueBeginDebugUtilsLabelEXT()
- radv/ci: stop skipping ASTC tests with ANGLE+STONEY
- radv/ci: stop skipping some memory related tests on POLARIS10
- radv/ci: remove skipping mesh shader tests for NAVI31
- Revert "radv: remove adding a BO to the per-cmdbuf list when unnecessary"
- radv/ci: move slow tests to radv-slow-skips.txt
- radv/ci: add a new dEQP test suite for nightly jobs
- radv/ci: add new jobs that run full VKCTS on NAVI21/NAVI31/GFX1201
- radv/amdgpu: always return VK_ERROR_INVALID_EXTERNAL_HANDLE for host ptr imports
- radv/ci: fix radv-slow-skips.txt path
- radv/ci: fix a typo in radv-navi10-vkcts-full
- radv: replace radv_copy_flags by VkAddressCopyFlagsKHR
- radv: implement VK_KHR_device_address_commands
- radv: advertise VK_KHR_device_address_commands
- nir: make nir_variable::descriptor_set a 32-bit variable
- vulkan/runtime: handle custom border color index with samplers
- docs: add missing description of RADV_PERFTEST=rtcps
- radv: introduce RADV_EXPERIMENTAL envvar for experimental features
- radv: do not try to resize the SPM BO for per-submit captures
- radv: improve dumping RGP captures
- radv: add RADV_DEBUG=fullsync
- treewide: cleanup non-existent descriptor types from nir_intrinsic_desc_type()
- nir: introduce nir_descriptor_type for Vulkan like descriptors
- amd/drm-shim: bump version_minor to 54
- amd: bump required DRM version to 3.54 (Linux kernel 6.6+)
- nir,treewide: add nir_image_intrinsic_type
- vulkan: add DGC support with descriptor heap
- nir: add texture_heap_offset/sampler_heap_offset to nir_build_tex()
- nir/lower_mediump: add heap support
- nir/opt_shrink_vectors: add heap support
- nir/opt_sink: add heap support
- nir/opt_move_discards_to_top: add heap support
- nir/lower_image: add heap support
- nir/divergence_analysis: add missing nir_intrinsic_image_heap_texel_address
- nir/opt_intrinsics: add heap support
- nir/opt_uniform_atomics: add heap support
- nir/opt_access: add heap support
- nir/opt_group_loads: add heap support
- nir/opt_preamble: add heap support
- nir/validate: add heap support
- nir/gather_info: add heap support
- nir/lower_helper_writes: add heap support
- nir/opt_shrink_stores: add heap support
- radv: fix a typo when determining if a VS needs a prolog
- radv: emit BOP events after every draw to workaround a VRS bug on GFX12
- vulkan: rename VK_EXT_device_fault features
- vulkan,spirv: update headers
- spirv: fix OpUntypedVariableKHR with optional data type parameter
- spirv: handle untyped pointer storage class with descriptor heap
- vulkan: remove unused parameters in vk_build_descriptor_heap_address()
- vulkan: fix determining the heap ptr
- vulkan: update spec to 1.4.348
- ac/nir: adjust lowering of query size for descriptor heap
- ac/nir: add descriptor heap support to ac_nir_lower_image_tex()
- ac/nir: add descriptor heap support to opt_flip_if_for_mem_loads()
- nir: add new variable modes for the resource/sampler heap pointers
- spirv: change the resource/sampler builtins variable mode
- spirv: set the image format for image intrinsics
- spirv: emit nir_intrinsic_image_heap when resource/sampler ptrs are used
- spirv: implement SpvOpUntypedImageTexelPointerEXT
- vulkan: adjust lowering of descriptor heaps
- nir: remove resource/sampler heap ptrs sysvals
- spirv: mark all resources as non-uniform by default with descriptor heap
- vulkan: stop emitting global_addr_to_descriptor
- nir: remove nir_intrinsic_global_addr_to_descriptor
- radv/meta: fix computing extent for image->image with both compressed formats
- nir: allow heap image intrinsics in nir_rewrite_image_intrinsic()
- radv: pre-compute the primitive restart index
- radv: implement VK_EXT_primitive_restart_index
- radv: advertise VK_EXT_primitive_restart_index
- radv/meta: remove an outdated comment in vkCmdClearAttachments()
- radv: replace remaining occurrences of VK_ACCESS_xxx
- vulkan: mark RP attachments as invalid when no rendering create info
- nir: add new system values for descriptor heap RT traversal inputs
- radv: zero-allocate graphics shader stages
- radv: add a new helper to make a sampler descriptor
- radv: add support for custom border colors with descriptor heap
- radv: make radv_make_sampler_descriptor() non-static
- radv: use 32-bit memory types for descriptor heap buffers
- radv: add shader info about whether descriptor heap is used
- radv: declare shader arguments for resource/sampler heaps
- radv/rt: declare shader arguments for resource/sampler heaps
- radv: keep track of descriptor heap mapping in the shader layout
- radv: call vk_nir_lower_descriptor_heaps()
- radv/nir: adjust lowering of ycbcr tex instructions for descriptor heap
- radv/nir: adjust lowering of immediate samplers for descriptor heap
- radv/nir: rename radv_nir_apply_pipeline_layout
- radv/nir: lower descriptor heap in radv_nir_lower_descriptors
- radv: set descriptor heap sizes/alignments for VTN
- radv: allow to create pipelines with a NULL pipeline layout
- radv: copy mapping info for graphics pipeline libraries
- radv: implement vkWrite{Resource,Sampler}DescriptorsEXT()
- radv: implement vkCmdBind{Resource,Sampler}HeapEXT()
- radv: add support for emitting descriptor heaps
- radv: implement vkCmdPushDataEXT()
- radv: implement vkGetPhysicalDeviceDescriptorSizeEXT()
- radv: add support for capture&replay with descriptor heap
- radv: add support for inherited descriptor heap for secondaries
- radv: flush caches with descriptor heap access flags
- radv: add support for DGC with descriptor heap
- radv: advertise VK_EXT_descriptor_heap with RADV_EXPERIMENTAL=heap
- radv/ci: set RADV_EXPERIMENTAL=heap
- ci: uprev VKCTS main to 634a3fc62d82c34de68c3b1add25e6b7f5777524
- radv/ci: remove a hack for the number of deqp instances with RENOIR
- radv/ci: update flakes of VKCTS jobs
- radv/ci: fix setting RADV_EXPERIMENTAL=heap
- radv/ci: document a descriptor heap failure
- vulkan: add an option to lower SHADER_RECORD_INDEX to non-uniform
- radv: lower SHADER_RECORD_INDEX to non-uniform
- radv: fix GPU hangs with PS epilogs and secondaries properly
- radv: re-introduce DGC+multiview support and enable it for vkd3d-proton only
- vulkan: add missing VkMemoryRangeBarriersInfoKHR support
- radv: add missing VkMemoryRangeBarriersInfoKHR from DAC
- radv/meta: fix expanding HTILE on compute with multisampling
- radv: fix determining needed dynamic states when rasterization is disabled

Sergi Blanch Torne (7):

- ci: disable Collabora's farm due to maintenance
- Revert "ci: disable Collabora's farm due to maintenance"
|
||
- ci: disable Collabora's farm due to maintenance
|
||
- Revert "ci: disable Collabora's farm due to maintenance"
|
||
- ci: fix envvar default value
|
||
- ci: nightly run xfiles for gc2000 and a618 piglit jobs
|
||
- ci: nightly run xfiles for a618 angle job
|
||
|
||
Shih, Jude (1):
|
||
|
||
- amd/vpelib: Gate assertion on debug flag
|
||
|
||
Silvio Vilerino (20):
|
||
|
||
- d3d12: Add missing using Microsoft::WRL:ComPtr in d3d12_context_common
|
||
- d3d12: Add HAVE_GALLIUM_D3D12_VIDEO guards for d3d12_video_encoder_set_max_async_queue_depth/d3d12_video_encoder_get_last_slice_completion_fence
|
||
- pipe: Add PIPE_VIDEO_CAP_SLICE_STRUCTURE_AUTO for PIPE_VIDEO_SLICE_MODE_AUTO
|
||
- d3d12: Implement PIPE_VIDEO_CAP_SLICE_STRUCTURE_AUTO reporting
|
||
- mediafoundation: Query PIPE_VIDEO_CAP_SLICE_STRUCTURE_AUTO
|
||
- ci: Bump DirectX-Headers and Agility SDK dependencies to v1.619.1
|
||
- d3d12: Implement trim notification residency eviction
|
||
- d3d12: Truncate move_rects_support.bits.max_motion_hints 16 bit var to 65535, not 65536
- d3d12: d3d12_video_encode_support_caps was assigning a stack variable address to capEncoderSupportData in/out arg
- d3d12: Fix video fence leak and double assign
- d3d12: Video sliced encode: Use same ID3D12Fence/different per slice values as optimization
- pipe: Add fence_get_win32_event since fence_get_fd return int type is smaller than HANDLE/void*
- d3d12: Implement pipe_screen::fence_get_win32_event
- mediafoundation: Use WaitForMultipleObjects for AUTO slices wait in sliced encode mode
- mediafoundation: Prefetch the slice fence handles before the waits
- mediafoundation: Pre-create all MFSamples to avoid per slice COM allocation in the hot loop
- mediafoundation: Remove unnecessary staging variable in ProcessSliceBitstreamZeroCopy
- d3d12: Check queues are registered before unregistering in unregister_work_queue
- mediafoundation: MFTRegisterWorkQueue/MFTUnregisterWorkQueue to validate null param instead of crash
- Revert "d3d12: Video sliced encode: Use same ID3D12Fence/different per slice values as optimization"

Simon Perretta (18):

- pco: update formatless skip check
- pvr: remove drm device config table
- pvr: add initial yuv tex/smp state words
- pvr: rename PVR_HAS_ERN to PVR_HAS_ENHANCEMENT
- pvr: add support for drm-shim
- docs/pvr: add drm-shim documentation
- pvr: drop pvr_assert macro
- pvr: handle SRC_SMRG_D32S8_D32S8 in tq shader
- pvr: set transfer flags based on derived formats
- pvr: allow primary drm node to be optional
- pvr: report nir shader in pipeline executable properties
- pvr: downgrade vs out/fs in mismatch assertion to a warning
- pco: add encodings and mappings for smp integer and array flags
- pco: use vm/icm for tile buffer store coverage mask
- pco: add native u{add,sub}{carry,borrow,sat} ops
- pvr: advertise VK_KHR_shader_integer_dot_product
- pco: reserve additional outputs for trilinear sampled coeffs
- pco: amend tg4 lowering

Stéphane Cerveau (1):

- anv/ci: add vulkan fluster job on adl

Sushma Venkatesh Reddy (1):

- brw: Use lookup tables for Gfx12+ 3src type encoding/decoding

Tanner Van De Walle (1):

- d3d12: Fix d3d12_surface_destroy() to match pipe_surface_destroy_func()

Tapani Pälli (34):

- intel/dev: update mesa_defs.json from workaround database
- anv: add handling for Wa_14026600921
- intel/genxml: bring some missing fields to gen125.xml
- drirc/anv: implement steps to disable RHWO for Wa_14024015672
- iris: implement steps to disable RHWO for Wa_14024015672
- blorp: fix asserts hit with msaa blorp blits on xe3
- anv: route clear operations on compute to companion
- intel/dev: update mesa_defs.json from workaround database
- anv: use workaround framework for Wa_1508208842
- intel/genxml: add CHICKEN_RASTER_2 with required bit for Xe3
- anv: set DisableAnyMCTRresponsefix to zero on init
- iris: set DisableAnyMCTRresponsefix to zero on init
- anv: skip compressed flag for bo if not supported by modifier
- util: bring back fix to avoid strict aliasing bugs in xxhash
- intel/dev: implement urb handle limits for Wa_16025326720
- anv: update btp address after CmdExecuteCommands
- anv: dirty descriptors in set_dirty_for_bind_map if sha changed
- intel/dev: add NVL_U, NVL_P platforms to gen_wa_helpers.py
- intel/dev: update mesa_defs.json from workaround database
- intel/compiler: move validation assert after brw_shader_debug_log
- anv: remove barrier special handling for RT_BTI_CHANGE
- anv: add required barrier for Wa_14026570320
- genxml/mi: add additional bit to FF_MODE and autostrip helper
- anv: use mi_set_autostrip_state for autostrip control
- iris: use mi_set_autostrip_state for autostrip control
- intel/compiler: expose inferred_exec_pipe from scoreboarding
- intel/compiler: implement dummy mov for Wa_18035690555
- intel/dev: update mesa_defs.json from workaround database
- anv: fix Wa_14024015672 interaction in blorp
- intel/compiler: implement macl part of Wa_18035690555
- drirc/anv: add flag to disable VK_EXT_subgroup_size_control
- drirc: set anv_disable_subgroup_size_control for bg3
- drirc: use anv_disable_drm_ccs_modifiers for any GTK version
- anv: do not use resource barrier with split barriers

Thomas H.P. Andersen (2):

- nvk: prepare for driver internal layers
- nvk: add app workaround layer

Thong Thai (1):

- radeonsi: remove radeonsi prefix from si_pipe.h includes

Tim Van Patten (1):

- anv: Enable Vulkan 1.4 for SDK 37+

Timothy Arceri (18):

- nir: make nir_collect_src_uniforms() private
- nir: make nir_add_inlinable_uniforms() private
- nir: update asserts in inline uniforms
- nir: speedup nir_find_inlinable_uniforms()
- mesa: add _mesa_lookup_state_param_idx() helper
- st/glsl_to_nir: make sure the variant has the correct locations set
- mesa/st: use same path for setting state ref locations
- st/glsl_to_nir: update state var locations earlier
- glsl: relax precision matching on unused uniforms ES
- glsl: add workaround for MDK2 HD
- glx: guard glx_screen frontend_screen member
- mesa: add force_explicit_uniform_loc_zero workaround
- util/driconf: add workarounds for Lethis - Path Of Progress
- nir: remove is_only_uniform_src() restriction
- nir: test loop analyze sets exact trip flags correctly
- radeonsi: add Gun Godz workaround
- glcpp: fix paste within macro function expansion
- amd/radeonsi: don't clamp packed user varyings

Tomeu Vizoso (43):

- dril: don't build a rocket_dri.so
- teflon/tests: Fail tests with unsupported output types
- teflon/tests: Add EfficientDet model
- teflon/tests: Add InceptionV1 model
- teflon/tests: Add MobileNetV2 model
- teflon/tests: Add SSD MobileNetV2 model
- teflon/tests: Add MoveNet Lightning and Thunder models
- ethosu: Update tests baseline for new models
- ethosu: Update test expectations
- teflon: Fix leak of tensor structs
- ethosu: Add U85 fields, these are compatible with the U65
- ethosu: Let maxblockdeps be arch-specific
- ethosu: Make the UBlock sizes arch-specific
- ethosu: Compute is_partkernel during scheduling
- ethosu: Switch to the weight encoder from Regor
- ethosu: Invert lowering order of concatenation suboperations
- ethosu: Add debug option for forcing U85 generation
- ethosu: Don't emit redundant state changes
- ethosu: Add a separate scheduler for the U85
- ethosu: Improve parallelism by detecting overlaps for BLOCKDEP
- ethosu: Expand pooling to U85
- ethosu: Refactor ethosu_allocate_feature_map to return the new offset
- ethosu: Emission changes for U85
- ethosu: Implement simplified scaling for U85
- ethosu: Fix ublock selection for 8-bit depthwise/pooling on U85-256
- ethosu: map BOs at creation time and unmap at destruction
- ethosu: Fix scalar ADD on U85
- ethosu: Properly emit IFM_BROADCAST and IFM2_BROADCAST on U85
- ethosu: Set test baseline for the Corstone 1000 (U85)
- etnaviv/ml: Skip all synthetic tests as we now have several real models
- rocket: Skip all synthetic tests as we now have several real models
- gallium: pipe_tensor.resource → pipe_tensor.data
- gallium: replace padding_same with per-side padding
- gallium: add pipe_ml_device, pipe_screen::get_ml_device()
- ethosu: move hardware description from ethosu_screen to ethosu_ml_device
- ethosu: add U85-256 support to ethosu_ml_device_create()
- ethosu: parse optional SRAM size from device spec string
- ethosu: Specifying SRAM size in pipe_ml_device ID
- gallium: add pipe_context::ml_subgraph_deserialize()
- ethosu: implement ml_subgraph_deserialize()
- ethosu: handle NULL bias tensor in convolution
- gallium: add ml_device_destroy callback to pipe_ml_device
- ethosu: implement ml_device_destroy for standalone ML device

Trigger Huang (1):

- vulkan/queue: pass protected submit info to driver

Urja Rannikko (1):

- hash_table: fix use-after-free by reorganization of destruct callbacks

Utku Iseri (19):

- zink: add a variable splitter for component-xfb + unlowering
- zink: manually ignore centroid with sample shading
- zink: set mediump is 32 bits
- zink: set flat interpolation for integer FS inputs
- zink: switch to using unlower_io_to_vars
- zink: remove rework_io and revectorization
- zink: add an rpstores debug option
- zink: track which stages a sampler gets bound to
- zink: add arrayness I/O matching
- zink: emulate clip distance
- zink: add arm and panvk to invalid<->linear
- zink: ignore msrtss support on panvk
- pan/genxml: make pandecode comparisons return -1,1
- panvk: pass heap explicitly to as_alloc/free
- panvk: increase mappable VA range to 48 bits
- panvk: expose swapchain_mutable_format support
- panvk: use AUTO_VA with v9
- panvk: add max supported va to physical device
- panvk: BDA capture/replay support on v10+

Val Packett (1):

- util/rust: Add memory map access mode detection to descriptor API

Valentine Burley (84):

- tu: Assign stable unique_id to buffer objects
- tu: Implement VK_EXT_device_memory_report
- zink/ci: Fix a few job timeouts
- zink/ci: Re-enable optimal_keys for zink-tu-a618
- zink/ci: Enable optimal_keys for zink-tu-a750
- tu: Handle VkDrmFormatModifierPropertiesList2EXT
- lavapipe/ci: Update Android CTS expectations
- ci/deqp: Retry GitHub API lookup for main-branch check
- tu: Fix memory leak of patchpoints_ctx in dynamic rendering
- tu/ci: Document a618-vk-asan failure
- tu: Free cmd_buffer from its pool
- tu: Simplify cmd_buffer allocation
- docs: Update features.txt for Turnip
- tu: Enable VK_KHR_compute_shader_derivatives for a6xx
- tu: Advertise VK_EXT_shader_uniform_buffer_unsized_array
- tu: Advertise VK_KHR_depth_clamp_zero_one
- docs/features: Remove VK_KHR_android_surface
- venus/ci: Increase android-angle-venus-anv-cml-cts timeout
- venus/ci: Skip invalid SkQP test on ANV
- ci: Update kernel to pull in new build for freedreno
- turnip/ci: Move a618-vk job to new sc7180 runner
- turnip/ci: Promote a618-vk-asan to pre-merge
- turnip/ci: Skip more slow tests
- turnip/ci: Remove a618-vk-full job
- ci: Disable Collabora's farm due to network issues
- Revert "ci: Disable Collabora's farm due to network issues"
- zink/ci: Drop fixed VU from VVL filters
- zink: Enable optimal keys for GPL on Turnip
- radeonsi/ci: Skip subgroups.arithmetic tests on Mendocino
- ci/lava: Uprev lava-job-submitter
- ci: Update kernel to Linux 6.19.6
- freedreno/ci: Switch sm8650 to gfx-ci/linux kernel
- etnaviv/ci: Switch CI-tron to gfx-ci/linux kernel
- intel/ci: Document recent Intel flakes
- venus/ci: Remove hanging timeout override for ADL and TGL jobs
- ci: Strip qemu from rootfs
- ci/android: Disable wifi for Cuttlefish
- ci/android: Update Cuttlefish build
- ci/container: Generalize debian/x86_64_test-android container
- ci/container: Prepare test-android for multi-arch support
- ci: Add test-android container for arm64
- venus/ci: Add an Android Venus on Turnip job on a618
- ci: Update kernel to pick up new network adapter
- tu: Add support for VK_EXT_depth_clamp_control
- ci: Enable legacy-wayland=bind-wayland-display for debian-arm32 and debian-arm64
- zink/ci: Enable mesh shader tests on lavapipe
- zink/ci: Run zink-lavapipe on regular runner
- ci/android: Update Cuttlefish build
- ci: Uprev GL & GLES CTS
- ci: Capture weston logs
- tu/drm/virtio: Add missing lock to virtio_bo_init_dmabuf
- tu/drm/virtio: Move set_iova into success path of virtio_bo_init_dmabuf
- tu/drm/virtio: Avoid freeing zombified tu_sparse_vma
- tu/drm/virtio: Do not free iova from heap for lazy BOs
- tu/drm/virtio: Fix GEM handle leak in tu_bo_init error path
- tu/drm/virtio: Fix GEM handle leak on failed dmabuf res_id lookup
- ci: Avoid mixing libwayland versions in build and test containers
- ci: Drop legacy-wayland option for debian-arm32
- ci: Drop duplicate Intel shader-db run
- ci: Run Intel shader-db on Lunar Lake and Panther Lake
- ci: Enable ZSTD support for ZRAM in the kernel
- zink/ci: Move zink-tu-a618 to sc7180-trogdor-kingoftown
- ci/venus: Skip crashing Android CTS test on ANV
- util: Add more libdrm stubs
- egl: Use util/libdrm.h instead of xf86drm.h
- meson: Add support for building zink + Turnip/KGSL
- meson: Fix Turnip libdrm-linking check
- ci: Enable EGL and GLX in debian-no-libdrm
- meson: Update freedreno-kmds comment
- lavapipe/ci: Skip flaky Android CTS test
- venus/ci: Move android-angle-venus-tu-a618 to sc7180-trogdor-kingoftown
- ci/android: Add 5-minute timeout to Cuttlefish launch
- ci/android: Refactor replacing Vulkan drivers
- ci/android: Enable virtio freedreno KMD support
- ci/android: Update Cuttlefish build
- turnip/ci: Add Android job with ANGLE on a618
- pan/ci: Document recent flakes and timeouts
- ci/freedreno: Move remaining lazor a618 jobs, retire device type
- ci: Disable Collabora's farm due to network issues
- Revert "ci: Disable Collabora's farm due to network issues"
- anv/ci: Add full VKCTS pre-merge job on Raptor Lake
- zink/ci: Remove Cezanne job
- tu/drm/virtio: Fix tu_wait_fence timeout handling
- freedreno/drm/virtio: Fix wait_fence ret ordering

Vignesh Raman (1):

- ci/gitlab_gql: disable schema fetch

Vinson Lee (9):

- compiler/clc: Fix const correctness in libclc_add_generic_variants
- freedreno/decode: Fix const correctness in get_tex_count
- freedreno/decode: replace lua_pushunsigned with lua_pushinteger
- llvmpipe: fix build on macOS due to st_mtim
- nil: Fix Rust test link failure under Coverity due to missing -lm
- d3d12: Fix MinGW cross-build error in resource_state_if_promoted
- zink: remove unused variable in zink_instance.py
- st/mesa: fix implicit conversion warning in st_atom_framebuffer
- vulkan/screenshot-layer: initialize info to NULL

Wang Ruitang (1):

- amd/common/virtio: use device fd to init sync provider

Wei Hao (1):

- radeonsi: fix threaded shader compilation finishing after context is destroyed

Wei Zhao (1):

- vulkan/wsi/wayland: use roundtrip instead of flush on swapchain free

Wenfeng Gao (2):

- mediafoundation: Support externally provided motion hints
- mediafoundation: Fix the frame number validation logic for motion hint

Wujian Sun (1):

- mesa: Fix inconsistent multisampled CopyTexImage checks

Xianzhong Li (1):

- panfrost: Fix GEM handle refcount leak in panfrost_bo_import

Yiwei Zhang (77):

- venus: track prime blit dst buffer memory in the wsi image
- venus: track dedicated image during mem alloc
- venus: add vn_renderer_bo_export_sync_file helper
- venus: refactor vn_AcquireNextImage2KHR
- venus: properly handle wsi implicit in-fence
- venus: refactor Android ANB tracking to avoid confusions with WSI
- venus: remove obsolete asserts for ANB image creation
- ci/android: revive some previously skipped tests
- pan/kmod: drop pan_kmod_bo_check_import_flags validation
- pan/kmod: clean up unused flags from bo import paths
- venus: fix a prime blit assert...again
- venus: sync latest protocol for VK_EXT_descriptor_heap support
- venus: implement all descriptor heap commands
- venus: cmd inheritance info fix to consider descriptor heap
- venus: pipeline layout is now optional
- venus: skip image cache for VkOpaqueCaptureDataCreateInfoEXT
- venus: add vn_descriptor.h to be shared between different desc systems
- venus: rename format_update_mutex for general purpose
- venus: cache descriptor size query
- venus: ensure descriptor writes invariance
- venus: take care of combined image sampler descriptor for ycbcr
- venus: fill descriptor heap feats and props
- venus: expose VK_EXT_descriptor_heap behind a debug option
- venus: workaround a gcc-15 dead store elimination (DSE) bug
- venus: sync latest protocol for VK_KHR_shader_fma
- vulkan/wsi/drm: force prime buffer blit for WSI_DEBUG_BUFFER
- venus: sync protocol for strict aliasing compliance
- venus: the GCC DSE workaround is no longer needed
- venus: amend to mark descriptor size cache initialized
- venus: RegisterCustomBorderColorEXT can be async when index is requested
- venus: expose VK_EXT_descriptor_heap by default
- pan/fb: fix return type for mali_to_glsl_dim
- ci/venus: skip broken drm display tests
- util: convert tabs to spaces for ralloc.c
- pan: fix to not clear out of bitset range
- lvp: avoid advertising dmabuf support for kms_swrast
- lvp: hide import-only dmabuf support from zink
- ci/venus: update expectation based on nightly job runs
- virgl: set DRM_RDWR for exported dma-bufs (non-blob)
- venus: force prime blit on Nvidia GPU
- vulkan/android: add new helpers for aliased ANB support (spec v8+)
- lvp: add lvp_image_init helper
- lvp: support VK_ANDROID_native_buffer v8+
- llvmpipe: drop unused dt_format
- lvp: import_memory_fd returns a boolean
- lvp: properly initialize AHB image layout
- lvp: fix dedicated allocation requirements for AHB images
- ci/lvp: update android cts expectations
- lvp: drop redundant lvp_image::offset
- lvp: rename lvp_image_plane::plane_offset to offset
- lvp: fix multi-planar image memory binding with explicit layout
- llvmpipe: follow winsys handle attributes when imported with explicit layout
- lvp: raise LVP_MAX_PLANE_COUNT to 3 and update ci expectations
- lvp: follow winsys handle size when imported with explicit layout
- lvp: refactor image plane initialization
- venus: fix to relax the KHR_external_memory_fd requirement
- vulkan/anv: use vk_device_get_timestamp and drop vk_clock_gettime
- util/list: fix formatting
- panvk: hide swapchainMaintenance1 behind WSI guard
- ci/panvk: update expectations with new flakes
- docs/venus: update instructions around Intel pat issue
- docs/venus: adjust driver support list and drop obsolete descriptions
- docs/venus: add QEMU instructions
- docs/venus: add Android Cuttlefish instructions
- vulkan/wsi/win32: add wsi_win32_find_idle_image helper
- vulkan/wsi/win32: respect acquire timeout for sw wsi
- venus: add vn_get_query_pool_results for non-qfb
- venus: relocate vn_query_feedback_wait_ready into qfb query
- venus: add vn_relax_warn to check if at warn order
- venus: ensure qfb can catch device lost
- venus: add vn_get_semaphore_counter_value that takes vn_relax_state
- venus: ensure sfb can catch device lost
- venus: add vn_get_fence_status that takes vn_relax_state
- venus: ensure ffb can catch device lost
- ci/venus: update expectation for an expected fail
- docs/vulkan: fix the order of KHR and EXT extensions
- docs/vulkan: fix the order of platform and vendor extensions

Yogesh Mohan Marimuthu (2):

- winsys/amdgpu: pointers to be NULL if num 0 for kernel ioctl
- winsys/amdgpu: call userq wait ioctl only once

Yonggang Luo (2):

- vulkan/anv: Remove unused anv_clock_gettime
- pvr: Remove two unused functions

You, Min-Hsuan (1):

- amd/vpelib: refactor minor change

Yuxuan Shui (4):

- vulkan/wsi/x11: Make sure error is returned if create_swapchain fails
- wsi/display: add connectors to connectors list during allocation
- wsi/display: initialize Xlib display connector property IDs in all cases
- wsi/display: move set atomic cap out of wsi_display_get_connector

Zan Dobersek (15):

- tu: handle DS_DEPTH_BOUNDS_TEST_BOUNDS state under TU_DYNAMIC_STATE_RB_DEPTH_CNTL
- tu: avoid incorrect pipeline draw state for disabled depth/stencil attachments
- tu: allocate transient attachments used for LRZ
- tu/kgsl: wait-only submit handling should not ignore sparse bind commands
- freedreno/common: make a8xx magic regs common between all such devices
- freedreno/common: set up a830 properties
- tu/a8xx: fix tu_desc_set_ubwc() to avoid unwanted bitfield override
- tu: use pkt_field macros in tu_desc_{get,set}_addr()
- fd: make RD dump output path configurable through FD_RD_DUMP_PATH
- tu/a8xx: add missing register state in tu_clear_sysmem_attachments()
- fd: support a8xx in rddecompiler
- fd/replay: kgsl context should use no-fault tolerance, report reset state
- tu/kgsl: bump msm_kgsl.h header
- tu: only support userspace-managed perfcounters on a7xx and earlier
- tu/a8xx: remove enforced TU_DEBUG_FLUSHALL

Zeyang Lyu (1):

- radv: Fix incorrect misaligned_mask_invalid for VK_EXT_vertex_input_dynamic_state

Zhao, Jiali (1):

- amd/vpelib: Re-enable new feature support check

aerith (1):

- zink: fix codegen for extensions with non-standard struct names

anonymix007 (1):

- vulkan/runtime: Implement VK_TIME_DOMAIN_QUERY_PERFORMANCE_COUNTER_KHR

emre (1):

- nvk: fix barrier cache invalidation

irql-notlessorequal (8):

- hasvk: Allow NULL index buffers
- hasvk: Remove no longer valid assert
- hasvk: Handle VkBindMemoryStatusKHR on buffer/image memory bind
- hasvk: Add support for Cmd*DescriptorSet*2KHR
- hasvk: Advertise VK_KHR_maintenance6
- docs/features: Mark VK_KHR_maintenance6 complete for hasvk
- Revert "hasvk: Remove no longer valid assert"
- hasvk: Stop advertising blockTexelViewCompatibleMultipleLayers

jaap aarts (1):

- radv/sqtt: Prevent concurrent submit when sqtt is enabled

jiajia Qian (1):

- nir/opt_phi_precision: Fix bit size mismatch when moving widening conversions

juntak0916 (1):

- nvk: fix BindImageMemory2 per-bind status result

kingstom.chen (1):

- radv/rt: only run move_rt_instructions() for CPS shaders

osy (2):

- vulkan: external sync for vk_sync_binary
- kk: enable VK_KHR_external_{fence,semaphore}_fd

rdh (1):

- mesa: allow MAX_TRANSFORM_FEEDBACK_BUFFERS in GL40+ contexts

scavenger (1):

- add VK CTS validation report for a0 interpolation fix

utzcoz (1):

- gfxstream: Fix vkSetDebugUtilsObjectNameEXT crash for unwrapped objects