<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
< html lang = "en" >
< head >
< meta http-equiv = "content-type" content = "text/html; charset=utf-8" >
< title > Mesa Release Notes< / title >
< link rel = "stylesheet" type = "text/css" href = "../mesa.css" >
< / head >
< body >
< div class = "header" >
< h1 > The Mesa 3D Graphics Library< / h1 >
< / div >
< iframe src = "../contents.html" > < / iframe >
< div class = "content" >
< h1 > Mesa 20.1.0 Release Notes / 2020-05-27< / h1 >
< p >
Mesa 20.1.0 is a new development release. People who are concerned
with stability and reliability should stick with a previous release or
wait for Mesa 20.1.1.
< / p >
< p >
Mesa 20.1.0 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is < strong > only< / strong > available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
< / p >
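< p >
Because the reported version varies by driver and context type, an
application usually checks it at run time. The following is a minimal
sketch of parsing the leading "major.minor" pair from a string of the kind
returned by glGetString(GL_VERSION). The helper name parse_gl_version and
the sample strings are illustrative assumptions, not output captured from
any particular driver.
< / p >

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) parsed from the start of a GL_VERSION-style string.

    Hypothetical helper: desktop GL strings start with "major.minor",
    while OpenGL ES 2.0+ strings start with "OpenGL ES major.minor".
    """
    match = re.match(r"(?:OpenGL ES )?(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def supports_gl(version_string, major, minor):
    """True if the reported version is at least major.minor."""
    return parse_gl_version(version_string) >= (major, minor)
```

< p >
For example, supports_gl("3.1 Mesa 20.1.0", 4, 6) is false, which is the
situation a compatibility context on some drivers could produce.
< / p >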
< p >
Mesa 20.1.0 implements the Vulkan 1.2 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
< / p >
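< p >
The apiVersion field is a packed 32-bit integer. As a sketch, assuming the
layout used by the VK_VERSION_MAJOR, VK_VERSION_MINOR and VK_VERSION_PATCH
macros (major above bit 22, minor in bits 12 to 21, patch in the low 12
bits), it can be decoded as follows. The function names are hypothetical,
and the patch value 131 is only an example.
< / p >

```python
def encode_api_version(major, minor, patch):
    """Pack a version the way VK_MAKE_VERSION does."""
    return (major << 22) | (minor << 12) | patch


def decode_api_version(api_version):
    """Unpack a Vulkan apiVersion value into (major, minor, patch)."""
    major = api_version >> 22
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return major, minor, patch


def supports_vulkan(api_version, major, minor):
    """True if the driver-reported version is at least major.minor."""
    reported_major, reported_minor, _ = decode_api_version(api_version)
    return (reported_major, reported_minor) >= (major, minor)
```

< p >
A driver that only exposes Vulkan 1.1 would report a value for which
supports_vulkan(value, 1, 2) is false, even though Mesa as a whole
implements 1.2.
< / p >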
< h2 > SHA256 checksum< / h2 >
< pre >
2109055d7660514fc4c1bcd861bcba9db00c026119ae222720111732dba27c83 mesa-20.1.0.tar.xz
< / pre >
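< p >
One way to check a downloaded tarball against the checksum above is
sketched below using Python's standard hashlib module. The helper names
and the assumption that the file is saved locally as mesa-20.1.0.tar.xz
are illustrative only.
< / p >

```python
import hashlib

# Checksum published in these release notes for mesa-20.1.0.tar.xz.
EXPECTED_SHA256 = "2109055d7660514fc4c1bcd861bcba9db00c026119ae222720111732dba27c83"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals the expected checksum."""
    return sha256_of_file(path) == expected.lower()
```

< p >
Calling matches_expected("mesa-20.1.0.tar.xz") should return True for an
intact download. Reading in chunks keeps memory use flat regardless of the
archive size.
< / p >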
< h2 > New features< / h2 >
< ul >
< li > GL_ARB_compute_variable_group_size on i965.
< / li >
< li > GL_EXT_depth_bounds_test on Iris.
< / li >
< li > GL_EXT_texture_shadow_lod on radeonsi, nvc0.
< / li >
< li > GL_NV_alpha_to_coverage_dither_control on radeonsi
< / li >
< li > GL_NV_copy_image on all gallium drivers.
< / li >
< li > GL_NV_pixel_buffer_object on all gallium drivers, i915, i965, swrast.
< / li >
< li > GL_NV_viewport_array2 on nvc0 (GM200+).
< / li >
< li > GL_NV_viewport_swizzle on nvc0 (GM200+).
< / li >
< li > VK_AMD_memory_overallocation_behavior on RADV.
< / li >
< li > VK_KHR_shader_non_semantic_info on Intel, RADV.
< / li >
< li > GL_EXT_draw_instanced on gles2
< / li >
< li > VK_KHR_8bit_storage for ACO on GFX8+
< / li >
< li > VK_KHR_16bit_storage for ACO on GFX8+ (storageInputOutput16 is still unsupported)
< / li >
< li > shaderInt16 for ACO on GFX9+
< / li >
< li > VK_KHR_shader_float16_int8 for ACO on GFX8+ (shaderFloat16 is still unsupported)
< / li >
< li > VK_EXT_robustness2 on Intel, RADV.
< / li >
< li > Add Rocket Lake (RKL) support on anvil and iris.
< / li >
< / ul >
< h2 > Bug fixes< / h2 >
< ul >
< li > Reproduceable i915 gpu hang Intel Iris Plus Graphics (Ice Lake 8x8 GT2)< / li >
< li > glsl: regression affecting shader compilation time< / li >
< li > freedreno: glamor issue with x11 desktops< / li >
< li > [gles3] supertuxkart: some textures are incorrect< / li >
< li > Double lock in fbobject.c< / li >
< li > [bisected] Steam crashes when newest Iris built with LTO< / li >
< li > i965/vec4: opt_cse_local cause the out of bound array access< / li >
< li > NIR: Regression on shader using 8/16-bit integers< / li >
< li > lp_bld_intr.c:70:16: error: use of undeclared identifier 'LLVMFixedVectorTypeKind'; did you mean 'LLVMVectorTypeKind'?< / li >
< li > Deadlock in anv_timelines_wait()< / li >
< li > post_version.py does not work with release candidates< / li >
< li > radv regression on android< / li >
< li > src\util\meson.build:294:4: ERROR: Program or command 'winepath' not found or not executable< / li >
< li > debug builds are massively broken on Windows< / li >
< li > heavy glitches on amd ryzen 5 since version 20.x< / li >
< li > zink asserts with 32-bit boolean< / li >
< li > Dirt: Showdown bad performance and broken rendering with enabled advanced lighting< / li >
< li > gravit & Firefox WebGL broken since 3dc2ccc14c0e035368fea6ae3cce8c481f3c4ad2 "ac/surface: replace RADEON_SURF_OPTIMIZE_FOR_SPACE with !FORCE_SWIZZLE_MODE"< / li >
< li > mesa 20.0.5 causing kitty to crash< / li >
< li > radeonsi: "Torchlight II" trace showing regression on mesa-20.0.6 [bisected]< / li >
< li > [RADV/LLVM/ACO/Regression] After mesa commit a3dc7fffbb7be0f1b2ac478b16d3acc5662dff66 all games stucks at start< / li >
< li > Android building error after commit 2ab45f41< / li >
< li > iris: Crash when trying to capture window in OBS Studio< / li >
< li > Properly annotate control flow convergence points< / li >
< li > intel/compiler: Register coalesce doesn't move conditional modifiers< / li >
< li > [bisected] [iris] mpv under wayland: failed to import supplied dmabufs: Unsupported buffer format 808669784< / li >
< li > [Bisected][Iris] piglit.spec.!opengl 1_1.max-texture-size crashes on x32 platform< / li >
< li > anv : android deqp assert dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.image#export_import_bind_bind< / li >
< li > GL cts gtf30.GL3Tests.sgis_texture_lod.sgis_texture_lod_basic_getter failure< / li >
< li > freedreno/a6xx: texture cache vs realloc_bo()< / li >
< li > [Bisected] dEQP-VK.subgroups.ballot_mask.ext_shader_subgroup_ballot.* failures< / li >
< li > dEQP-VK.subgroups.size_control.compute.* crashes on HSW and TGL< / li >
< li > zink: framebuffer and pipeline caches accumulate due to zink_create_surface()< / li >
< li > FTBFS due to LLVM commit 2dea3f129878 (LLVMVectorTypeKind is gone)< / li >
< li > [r600/Turks] 20.0.2: modesetting/radeon driver SIGABRT at loading X (kernel 5.5.10, ppc64)< / li >
< li > piglit spec.!opengl 1.0.gl-1.0-fpexceptions crash on Iris< / li >
< li > ci: Update the Wine version< / li >
< li > SPIR-V: Failure in dEQP-VK.graphicsfuzz.control-flow-switch< / li >
< li > SPIR-V: OpConvertUToPtr from spec constant fails to compile< / li >
< li > ACO: Regression: Texture corruption< / li >
< li > radv: Reading ViewportIndex in fragment shader returns garbage< / li >
< li > piglit spec.arb_gpu_shader_fp64.execution.arb_gpu_shader_fp64-vs-non-uniform-control-flow-ssbo crash on Iris< / li >
< li > piglit spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-neg-abs.shader_test failure on IVB< / li >
< li > [ANV] gfxbench Aztec Ruins misrenders on gen11+< / li >
< li > glxinfo cmd crashed< / li >
< li > radeonsi: GL_LINES rendering is affected by GL_POINT_SPRITE< / li >
< li > nir: nir_lower_returns can't handle nested loops< / li >
< li > Graphic artifacts with Mesa 20.0.4 on intel HD 510 GPU< / li >
< li > [Iris] [Bisected] Some KHR-GL46.arrays_of_arrays_gl. tests are failing< / li >
< li > Mesa 20 regression makes Lightsprint demos crash< / li >
< li > metro redux games crash upon loading certain levels on amdgpu< / li >
< li > dri_common.h:58:8: error: unknown type name '__GLXDRIdrawable'< / li >
< li > Graphical glitches on Intel Graphics when Xorg started on Iris driver< / li >
< li > GL/GLES test crashes on G33/i915 platforms< / li >
< li > SIGSEGV src/compiler/glsl/ast_function.cpp:53< / li >
< li > manywin aborts with "i965: Failed to submit batchbuffer: Invalid argument"< / li >
< li > v3d: transform feedback issue< / li >
< li > radv: Enable TC-compat HTILE in VK_IMAGE_LAYOUT_GENERAL.< / li >
< li > radv: dEQP-VK.binding_model.descriptorset_random.sets4.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.noia.0 segfault< / li >
< li > radv: RAVEN fails dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy< / li >
< li > buffer overflow in nouveau driver on mesa 20.0.2< / li >
< li > xmlconfig sha1 code has overflow and possible bug< / li >
< li > enable storageBuffer16BitAccess feature in radv for SI and CIK< / li >
< li > Build Fails with Clang Shared Library< / li >
< li > Thousands of 32 bit regressions in VulkanCTS and GL test suites due to handling of cross-invocation< / li >
< li > anv: isl assert when running dEQP-VK.geometry.layered.3d.*.readback< / li >
< li > Weston drm-backend.so seems to fail with Mesa master and LIBGL_ALWAYS_SOFTWARE=1< / li >
< li > freedreno/turnip: Don't request pixlodenable when we don't use it< / li >
< li > VulkanCTS uniform_buffer_block_geom spins forever< / li >
< li > freedreno: dEQP-GLES3.functional.fbo.msaa.4_samples.r16f flakiness in CI< / li >
< li > src\util\meson.build:291:4: ERROR: Program or command 'winepath' not found or not executable< / li >
< li > RADV: flickering textures in Q.U.B.E. 2 through Proton< / li >
< li > Missing ENDBR in entry_x86-64_tls.h, entry_x86_tls.h and entry_x86_tsd.h< / li >
< li > [regression][bisected] Android build test fails: marshal_generated.c', missing and no known rule to make it< / li >
< li > Missing ENDBR in rtasm_x86sse.c< / li >
< li > src/intel/tools/aubinator_viewer.cpp:383:52: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘uint64_t {aka long long unsigned int}’ [-Werror=format=]< / li >
< li > src/compiler/glsl/ast_to_hir.cpp:2134: ir_rvalue* ast_expression::do_hir(exec_list*, _mesa_glsl_parse_state*, bool): Assertion `result != NULL || !needs_rvalue' failed.< / li >
< li > process_test fails on macOS< / li >
< li > Vulkan Overlay is blinking< / li >
< li > Regression: 9d64ad2fe79 broke Rocket League< / li >
< li > GameMaker games (Memoranda and Undertale) + amdgpu — Segmentation fault on launch< / li >
< li > Civilization VI - Animated leader characters small black squares artifacts< / li >
< li > [ACO] Reliable crash with RPCS3 that is not present with LLVM< / li >
< li > [RADV] vkCmdBindTransformFeedbackBuffersEXT pSizes optional parameter not handled< / li >
< li > [RadeonSI] - Curse of the Dead Gods (1123770) - Lighting is not rendering correctly.< / li >
< li > soft-fp64: __fsat64 incorrectly returns NaN for a NaN input. It should return zero.< / li >
< li > Hang when using glWaitSync with multithreaded shared GL contexts< / li >
< li > RPCS3 / Persona 5 - Performance regression [RADV / Navi]< / li >
< li > [ANV] Rendering corruption in Shadow of the Tomb Raider< / li >
< li > src/compiler/glsl/glcpp/glcpp-parse.y:1297: _token_print: Assertion `!"Error: Don't know how to print token."' failed.< / li >
< li > [CTS] dEQP-VK.descriptor_indexing.* fails on RADV/LLVM< / li >
< li > Unigine Valley failure / assert< / li >
< li > [Gen9/icl] [Bisected] [Regression] dEQP-GLES3.functional.shaders.loops.short_circuit.do_while_fragment fail< / li >
< li > [RadeonSI][gfx10/navi] Kerbal Space Program crash: si_draw_vbo: Assertion `0' failed< / li >
< li > Budget Cuts hits VK_AMD_shader_fragment_mask assert< / li >
< li > Follow-up from "i965/blorp: Don't resolve HiZ unless we're reinterpreting"< / li >
< li > crash in vc4_write_uniforms with shaders involving YUV textures< / li >
< li > Corrupted output with vaapi 10 bit -> 8 bit transcoding on AMD RAVEN< / li >
< li > tessellator.cpp:78:7: error: 'fmin' is missing exception specification 'noexcept'< / li >
< li > Please add Raspberry Pi 4 to features.txt< / li >
< li > Build failure with bison 2.3.< / li >
< li > Mesa build fails on 32 bit architecture< / li >
< li > Incorrect rendering with vaapi + uyvy422< / li >
< li > V3D/Broadcom (Raspberry Pi 4) - GLES 3.1 - GL_EXT_texture_norm16 advertised, but not usable< / li >
< li > mesa-20.0.0/src/amd/compiler/aco_instruction_selection.cpp:7221:55: style: Same expression on both sides of '&&< / li >
< li > i965 assertion failure in fallback_rgbx_to_rgba< / li >
< li > vaapi bob deinterlacer produces wrong output height on AMD< / li >
< li > Compute copies do not handle SUBSAMPLED formats< / li >
< li > Please document RADV_TEX_ANISO variable in envvars.html< / li >
< li > unexpected CI failure< / li >
< li > Multiple glapi_mapi_tmp.h< / li >
< li > drisw crashes on calling NULL putImage on EGL surfaceless platform (pbuffer EGLSurface)< / li >
< li > VRAM leak with vulkan external memory + opengl memory objects< / li >
< li > [radeonsi][vaapi][bisected] invalid VASurfaceID when playing interlaced DVB stream in Kodi< / li >
< li > [RADV] GPU hangs while the cutscene plays in the game Assassin's Creed Origins< / li >
< li > ACO: The Elder Scrolls Online crashes on startup (Navi)< / li >
< li > Broken rendering of glxgears on S/390 architecture (64bit, BigEndian)< / li >
< li > aco: sun flickering with Assassins Creeds Origins< / li >
< li > !1896 broke ext_image_dma_buf_import piglit tests with radeonsi< / li >
< li > aco: wrong geometry with Assassins Creed Origins on GFX6< / li >
< li > valgrind errors since commit a8ec4082a41< / li >
< li > src/broadcom/qpu/qpu_pack.c:962:25: error: implicit declaration of function 'ffs' is invalid in C99 [-Werror,-Wimplicit-function-declaration] mux_b = ffs(desc->mux_b_mask) - 1;< / li >
< li > X fails to start with amdgpu and Mesa 20.1 on Fedora< / li >
< li > GPU hangs in Factorio on Radeon RX 5700 XT (MSI GAMING X)< / li >
< li > OSMesa osmesa_choose_format returns a format not supported by st_new_renderbuffer_fb< / li >
< li > Build error with VS on WIN< / li >
< li > Using EGL_KHR_surfaceless_context causes spurious "libEGL warning: FIXME: egl/x11 doesn't support front buffer rendering."< / li >
< li > !3460 broke texsubimage test with piglit on zink+anv< / li >
< li > VERSION needs to be bumped for trunk master< / li >
< li > The screen is black when using ACO< / li >
< / ul >
< h2 > Changes< / h2 >
< ul >
< p > Abhishek Kumar (1):< / p >
< li > anv/android: fix assert in anv_import_ahw_memory< / li >
< p > < / p >
< p > Adam Jackson (1):< / p >
< li > gallium: enable EGL_EXT_image_dma_buf_import_modifiers unconditionally< / li >
< p > < / p >
< p > Albert Astals Cid (5):< / p >
< li > cube_face_coord: Use fabsf instead of fabs since we know it's floats< / li >
< li > cube_face_index: Use fabsf instead of fabs since we know it's floats< / li >
< li > aco: Minor optimization in spill_ctx constructor< / li >
< li > aco: pass vars by const &< / li >
< li > Fix promotion of floats to doubles< / li >
< p > < / p >
< p > Alejandro Piñeiro (7):< / p >
< li > docs/features: add v3d driver< / li >
< li > nir/linker: remove reference to just SPIR-V linking< / li >
< li > v3d/tex: don't configure tmu config 1 if not needed< / li >
< li > v3d/tex: Configuration Parameter 1 can be only skipped if P2 can be skipped too< / li >
< li > v3d/packet: fixing TMU_Config_Parameter_2 definition< / li >
< li > nir: add nir_tex_instr_need_sampler helper< / li >
< li > v3d: support for textureQueryLOD< / li >
< p > < / p >
< p > Alexandros Frantzis (3):< / p >
< li > gitlab-ci: Automated testing with OpenGL traces< / li >
< li > gitlab-ci: Fix traces caching in tracie< / li >
< li > gitlab-ci: Check the Mesa version used for tracie tests< / li >
< p > < / p >
< p > Alyssa Rosenzweig (505):< / p >
< li > pan/midgard: Break out one-src read_components< / li >
< li > pan/midgard: Implement mixed-type constant packing< / li >
< li > panfrost: Avoid overlapping copy< / li >
< li > pan/midgard: Check for null consts< / li >
< li > pan/midgard: Remove unused variable< / li >
< li > panfrost: Use size0 when calculating the offset to a depth level< / li >
< li > pan/midgard: Fix scheduling issue with csel + render target reference< / li >
< li > panfrost: Simplify swizzle translation< / li >
< li > panfrost: Update comment about magic number relating to barriers< / li >
< li > panfrost: Ensure compute shader_meta is zeroed< / li >
< li > panfrost: Identify mali_shared_memory structure< / li >
< li > panfrost: Unify bifrost_scratchpad with mali_shared_memory< / li >
< li > panfrost: Rename bifrost_framebuffer->mali_framebuffer< / li >
< li > panfrost: Rename unknown2_8 to padding< / li >
< li > panfrost: Allocate RAM backing of shared memory< / li >
< li > pan/midgard: Track pressure when scheduling ld/st< / li >
< li > pan/midgard: Fix missing prefixes< / li >
< li > pan/midgard: Fix swizzles harder< / li >
< li > pan/midgard: Implement barriers< / li >
< li > pan/midgard: Allow jumping out of a shader< / li >
< li > pan/midgard: Fix 32/64 mixed swizzle packing< / li >
< li > pan/midgard: Use dummy tag for empty shaders< / li >
< li > pan/midgard: Improve barrier disassembly< / li >
< li > pan/midgard: Overhaul tag handling< / li >
< li > pan/midgard: Imply next tags< / li >
< li > pan/midgard: Infer tags entirely< / li >
< li > pan/midgard: Set xyzx swizzle for load_compute_arg< / li >
< li > pan/midgard: Identify stack barrier flag< / li >
< li > pan/midgard: Don't crash with constants on unknown ops< / li >
< li > pan/midgard: Use fprintf instead of printf for constants< / li >
< li > pan/decode: Remove extraneous newline< / li >
< li > pan/decode: Add `minimal` mode< / li >
< li > pan/decode: Cleanup pandecode_jc< / li >
< li > panfrost: Implement PAN_DBG_SYNC with pandecode/minimal< / li >
< li > panfrost: Print synced traces to stderr< / li >
< li > panfrost: Rewrite scoreboarding routines< / li >
< li > panfrost: Update scoreboarding notes< / li >
< li > panfrost: Cleanup transfer_map< / li >
< li > panfrost: Avoid reading GPU memory when packing vertices< / li >
< li > panfrost: Debitfieldize mali_uniform_buffer_meta< / li >
< li > panfrost: Remove enum panfrost_memory_layout< / li >
< li > panfrost: Remove dirty tracking< / li >
< li > panfrost: Remove old comment< / li >
< li > panfrost: Remove old hack< / li >
< li > panfrost: Remove flush_frontbuffer< / li >
< li > pan/midgard: Identify clamp(x, -1.0, 1.0) flag< / li >
< li > panfrost: Move checksum routines to root panfrost< / li >
< li > panfrost: Move pan_afbc.c to root< / li >
< li > panfrost: Move format translation to root< / li >
< li > panfrost: Rewrite texture descriptor creation logic< / li >
< li > nir: Add SSBO->global lowering pass< / li >
< li > pan/midgard: Lower SSBOs in NIR< / li >
< li > pan/midgard: Implement nir_intrinsic_get_buffer_size< / li >
< li > pan/midgard: Implement load/store_shared< / li >
< li > panfrost: Combine get_index_buffer with bound computation< / li >
< li > panfrost: Implement index buffer cache< / li >
< li > pan/decode: Dump scratchpad size if present< / li >
< li > pan/midgard: Don't spill near a branch< / li >
< li > panfrost: Fix gl_VertexID/InstanceID< / li >
< li > panfrost: Fix padded_vertex_count generation< / li >
< li > panfrost: Update spilling comment framebuffer->shared< / li >
< li > panfrost: Don't set shared->unk0< / li >
< li > panfrost: Fix param getting< / li >
< li > panfrost: Default to 256 threads for TLS< / li >
< li > panfrost: Reserve an extra page for spilling< / li >
< li > panfrost: Simplify stack shift calculation< / li >
< li > panfrost: Expose PIPE_CAP_PRIMITIVE_RESTART< / li >
< li > panfrost: Add PAN_MESA_DEBUG=gles3 option< / li >
< li > panfrost: Increase SSBO/image limit from 4->8< / li >
< li > pan/midgard: Allow inverted inverted ops< / li >
< li > pan/midgard: Allow fusing inverted sources for inverted ops< / li >
< li > pan/midgard: Partially fix 64-bit swizzle alignment< / li >
< li > pan/midgard: Extract nir_ssa_index helper< / li >
< li > pan/midgard: Add LDST_ADDRESS property< / li >
< li > pan/midgard: Fix load/store argument sizing< / li >
< li > pan/midgard: Round up bytemasks when promoting uniforms< / li >
< li > pan/midgard: Force address alignment< / li >
< li > pan/midgard: Add address analysis framework< / li >
< li > pan/midgard: Use address analysis for globals, etc< / li >
< li > pan/decode: Calm an assert to a pandecode error< / li >
< li > pan/decode: Restore bifrost sample_locations< / li >
< li > pan/decode: Fix tiler weights printing< / li >
< li > pan/decode: Skip analysis for Bifrost tiler structures< / li >
< li > pan/bi: Add discard ops< / li >
< li > pan/bi: Add ICMP.GL.NEQ op< / li >
< li > pan/bi: Move notes on FMA opcodes from disassembler< / li >
< li > pan/bi: Introduce CSEL4 class< / li >
< li > pan/bi: Move notes on ADD ops to notes file< / li >
< li > pan/bi: Decode FMA_SHIFT properly< / li >
< li > pan/bi: Add v4i8 mode to FMA_SHIFT< / li >
< li > pan/bi: Identify extended FMA opcodes< / li >
< li > pan/bi: Decode ADD_SHIFT properly< / li >
< li > pan/bi: Combine LOAD_VARYING_ADDRESS instructions by type< / li >
< li > pan/bi: Squash LD_ATTR ops together< / li >
< li > pan/bi: Structify FMA_FADD< / li >
< li > pan/bi: Move some definitions from disasm to bifrost.h< / li >
< li > panfrost: Add note about preloaded varyings< / li >
< li > pan/bi: Gut old compiler< / li >
< li > pan/bi: Stub out new compiler< / li >
< li > pan/bi: Add the control flow graph< / li >
< li > pan/bi: Add src/dest fields to bifrost_instruction< / li >
< li > pan/bi: Add class properties< / li >
< li > pan/bi: Add modifiers to bi_instruction< / li >
< li > pan/bi: Add BI_GENERIC property< / li >
< li > pan/bi: Factor out enum bifrost_minmax_mode< / li >
< li > pan/bi: Add a bifrost_roundmode field< / li >
< li > pan/bi: Add bifrost_minmax_mode field< / li >
< li > pan/bi: Add bi_load structure< / li >
< li > pan/bi: Pull out bifrost_load_var< / li >
< li > pan/bi: Add bi_load_vary structure< / li >
< li > pan/bi: Add PAN_SCHED_* flags< / li >
< li > pan/bi: Add bi_clause, bi_bundle abstractions< / li >
< li > pan/bi: Add dest_type field to bifrost_instruction< / li >
< li > pan/bi: Add special indices< / li >
< li > pan/bi: Add constant field to bi_instruction< / li >
< li > pan/bi: Add class-specific ops< / li >
< li > pan/bi: Add clause header fields to bi_clause< / li >
< li > pan/bi: Clarify special op scheduling< / li >
< li > pan/bi: Add swizzles< / li >
< li > pan/bi: Add source type for conversions< / li >
< li > pan/bi: Add EXTRACT, MAKE_VEC synthetic ops< / li >
< li > pan/bi: Add constants to bi_clause< / li >
< li > pan/bi: Add pred/successors to build CFG< / li >
< li > pan/bi: Extract bifrost_branch structure< / li >
< li > pan/bi: Add bi_branch data< / li >
< li > pan/bi: Add CSEL condition< / li >
< li > pan/bi: Add high-latency property for classes< / li >
< li > pan/bi: Add quirks system< / li >
< li > pan/bi: Add IR iteration macros< / li >
< li > pan/bi: Move some print routines out of the disasm< / li >
< li > pan/bi: Add BIR manipulation routines to bir.c< / li >
< li > pan/bi: Move bi_interp_mode_name to bi_print< / li >
< li > pan/bi: Add bi_instruction printing< / li >
< li > pan/bi: Add bi_print_bundle for printing bi_bundle< / li >
< li > pan/bi: Add bi_print_clause< / li >
< li > pan/bi: Add bi_print_block< / li >
< li > pan/bi: Add bi_print_shader< / li >
< li > pan/bi: Lower and optimize NIR< / li >
< li > pan/bi: Walk through the NIR control flow graph< / li >
< li > pan/bi: Improve block printing< / li >
< li > pan/bi: Don't print types for unconditional branches< / li >
< li > pan/bi: Print branch target< / li >
< li > pan/bi: Add instruction emit/remove helpers< / li >
< li > pan/bi: Call nir_lower_io_to_temporaries in cmdline< / li >
< li > pan/bi: Add support for if-else blocks< / li >
< li > pan/bi: Handle loops when ingesting CFG< / li >
< li > pan/bi: Handle jumps (breaks, continues)< / li >
< li > pan/bi: Fix destination printing< / li >
< li > pan/bi: Implement nir_intrsinic_load_interpolated_input< / li >
< li > pan/bi: Add blend_location to IR for BI_BLEND< / li >
< li > pan/bi: Add bi_schedule_barrier helper< / li >
< li > pan/bi: Implement store_output for fragment shaders< / li >
< li > pan/bi: Implement load_input for vertex shaders< / li >
< li > pan/bi: Add helpers for creating temporaries< / li >
< li > pan/bi: Implement store_vary for vertex shaders< / li >
< li > pan/bi: Add preliminary LOAD_UNIFORM implementation< / li >
< li > pan/bi: Implement load_const< / li >
< li > pan/bi: Add dummy scheduler< / li >
< li > pan/bi: Rename next-wait to simply 'wait'< / li >
< li > pan/bi: Fix Android.mk< / li >
< li > panfrost: Move mir_to_bytemask to common code< / li >
< li > pan/bi: Generalize swizzles to avoid extracts< / li >
< li > pan/bi: Introduce writemasks< / li >
< li > pan/bi: Remove bi_load< / li >
< li > pan/bi: Lower vec* to writemasks in NIR< / li >
< li > pan/bi: Add initial handling of ALU ops< / li >
< li > pan/bi: Allow inlining constants< / li >
< li > pan/bi: Implement fsat as mov.sat< / li >
< li > pan/bi: Add a bunch of ALU ops< / li >
< li > pan/bi: Add BI_SPECIAL_* enum< / li >
< li > pan/bi: Handle special ops in NIR->BIR< / li >
< li > pan/bi: Implement fabs, fneg as fmov with mods< / li >
< li > pan/bi: Disable lower_sub< / li >
< li > pan/bi: Add isub op< / li >
< li > pan/bi: Import algebraic pass from midgard< / li >
< li > pan/bi: Implement nir_op_bcsel< / li >
< li > pan/bi: Lower b2f to bcsel< / li >
< li > pan/bi: Specify comparison op for BI_CMP< / li >
< li > pan/bi: Print source types unconditionally< / li >
< li > pan/bi: Implement comparison opcodes via BI_CMP< / li >
< li > panfrost: Promote midgard_program to panfrost/util< / li >
< li > pan/midgard: Remove unused iterators< / li >
< li > pan/midgard: Adjust sysval-related prototypes< / li >
< li > pan/midgard: Remove indexing dependency of sysvals< / li >
< li > pan/midgard: Decontextualize midgard_nir_assign_sysval_body< / li >
< li > pan/midgard: Remove dest_override sysval argument< / li >
< li > panfrost: Move Midgard sysval code to common Panfrost< / li >
< li > pan/bi: Switch to panfrost_program< / li >
< li > pan/bi: Implement sysvals< / li >
< li > pan/midgard: Localize `visited` tracking< / li >
< li > pan/midgard: Decontextualize liveness analysis core< / li >
< li > pan/midgard: Sync midgard_block field names with Bifrost< / li >
< li > pan/midgard: Subclass midgard_block from pan_block< / li >
< li > panfrost: Move liveness analysis to root panfrost/< / li >
< li > panfrost: Sync Midgard/Bifrost control flow< / li >
< li > pan/bi: Paste over bi_has_arg< / li >
< li > pan/bi: Add bi_bytemask_of_read_components helpers< / li >
< li > pan/bi: Add bi_next/prev_op helpers< / li >
< li > pan/bi: Add bi_max_temp helper< / li >
< li > pan/bi: Add liveness analysis pass< / li >
< li > pan/bi: Add dead code elimination pass< / li >
< li > pan/bi: Implement nir_op_ffma< / li >
< li > pan/bi: Fix swizzle for second argument to ST_VARY< / li >
< li > panfrost: Move lcra to panfrost/util< / li >
< li > pan/midgard: Remove incorrect comment in RA< / li >
< li > pan/bi: Minor fixes in iteration macros< / li >
< li > pan/bi: Fix vector handling of readmasks< / li >
< li > pan/bi: Fix missing src_types< / li >
< li > pan/bi: Add register allocator< / li >
< li > pan/bi: Interpret register allocation results< / li >
< li > pan/bi: Setup initial clause packing< / li >
< li > pan/bi: Sketch out instruction word packing< / li >
< li > pan/bi: Add packing for register control field< / li >
< li > pan/bi: Pack register fields< / li >
< li > pan/bi: Add missing __attribute__((packed))< / li >
< li > pan/bi: Assign registers to ports< / li >
< li > pan/bi: Route through first_instruction field< / li >
< li > pan/bi: Model 3-bit Bifrost srcs in IR< / li >
< li > pan/bi: Add struct bifrost_fma_fma< / li >
< li > pan/bi: Pack BI_FMA ops< / li >
< li > pan/bi: Pack fadd32< / li >
< li > pan/bi: List ADD classes in bi_pack_add< / li >
< li > pan/bi: Generalize bi_get_src a bit< / li >
< li > pan/bi: Pass second src for load_vary ops< / li >
< li > pan/bi: Emit load_vary ops< / li >
< li > pan/bi: Skip over data registers in port assignment< / li >
< li > pan/bi: Route through clause header< / li >
< li > pan/bi: Pretty-print clause types in disassembler< / li >
< li > pan/bi: Don't hide SCHED_ADD inside HI_LATENCY< / li >
< li > pan/bi: Track clause types during scheduling< / li >
< li > pan/bi: Flesh out ATEST in IR< / li >
< li > pan/bi: Add ATEST packing< / li >
< li > pan/bi: Flesh out BI_BLEND< / li >
< li > pan/bi: Pack BI_BLEND< / li >
< li > pan/bi: Implement FMA/MOV without modifiers< / li >
< li > pan/bi: Add bi_emit_before helper< / li >
< li > pan/bi: Add move lowering pass< / li >
< li > pan/bi: Pack a constant quadword< / li >
< li > pan/bi: Document constant related errata(?)< / li >
< li > pan/bi: Index out constants in instructions< / li >
< li > pan/bi: Include UBO index for sysval reads< / li >
< li > pan/bi: Add bi_load32_components helper< / li >
< li > pan/bi: Pack ld_ubo ops< / li >
< li > pan/bi: Pack ld_var_addr< / li >
< li > pan/bi: Flesh out st_vary IR< / li >
< li > pan/bi: Generalize data register setting< / li >
< li > pan/bi: Add store_channels property< / li >
< li > pan/bi: Pack st_vary< / li >
< li > pan/bi: Pack LD_ATTR< / li >
< li > pan/bi: Lower bool to ints< / li >
< li > pan/bi: Remove hacks for 1-bit booleans in IR< / li >
< li > pan/bi: Add `soft` NIR->BIR condition translation< / li >
< li > pan/bi: Implement csel fusing< / li >
< li > pan/bi: Respect shift when printing immediates< / li >
< li > pan/bi: Use bi_lookup_immediate when packing< / li >
< li > pan/bi: Default csel to "!= 0" mode< / li >
< li > pan/bi: Pack csel4 opcodes< / li >
< li > pan/bi: Ingest vecN directly (again)< / li >
< li > pan/bi: Lower combines to rewrites for scalars< / li >
< li > pan/bi: Rewrite aligned vectors as well< / li >
< li > panfrost: Split panfrost_device from panfrost_screen< / li >
< li > panfrost: Isolate panfrost_bo_access_for_stage to pan_cmdstream.c< / li >
< li > panfrost: Inline reference counting routines< / li >
< li > panfrost: Move pan_bo to root panfrost< / li >
< li > pan/bit: Link standalone compiler with en/decoder< / li >
< li > panfrost: Move device open/close to root panfrost< / li >
< li > pan/bit: Open up the device< / li >
< li > panfrost: Stub out G31/G52 quirks< / li >
< li > pan/bit: Submit a WRITE_VALUE job as a sanity check< / li >
< li > pan/bit: Begin generating a vertex job< / li >
< li > pan/bi: Fix overzealous write barriers< / li >
< li > pan/bi: Fix off-by-one in scoreboarding packing< / li >
< li > pan/bi: Enable precision lowering in standalone compiler< / li >
< li > panfrost: Enable PIPE_SHADER_CAP_FP16 on Bifrost< / li >
< li > pan/bi: Handle f2f* opcodes< / li >
< li > pan/bi: Ignore swizzle in unwritten component< / li >
< li > pan/bi: Finish FMA structures< / li >
< li > pan/bi: Fix missing type for fmul< / li >
< li > pan/bi: Add FMA16 packing< / li >
< li > pan/bi: Pack outmod and roundmode with FMA< / li >
< li > pan/bi: Expand out FMA conversion opcodes< / li >
< li > pan/bi: Enumerate conversions< / li >
< li > pan/bi: Handle standard FMA conversions< / li >
< li > pan/bi: Add bifrost_fma_2src generic< / li >
< li > pan/bi: Add one-source f32->f16 op< / li >
< li > pan/bi: Assert out i16 related converts for now< / li >
< li > pan/bi: Handle round opcodes in frontend< / li >
< li > pan/bi: Add v2f16 versions of rounding ops< / li >
< li > pan/bi: Structify fadd/min/max16< / li >
< li > pan/bi: Handle core faddminmax16 packing< / li >
< li > pan/bi: Handle abs packing for fp16/FMA add/min< / li >
< li > pan/bi: Handle fp16/abs scheduling restriction< / li >
< li > pan/bi: Fix handling of constants with COMBINE< / li >
< li > pan/bit: Add `run` mode to the cmdline< / li >
< li > pan/bit: Wire through I/O< / li >
< li > pan/bi: Fix writes_component for VECTOR< / li >
< li > pan/bi: Use STAGE srcs for scheduler nops< / li >
< li > pan/bi: Don't set the back-to-back bit yet< / li >
< li > pan/bi: Add cmdline option for verbose disassembly< / li >
< li > pan/bi: Fix unused port swapping< / li >
< li > pan/bi: Handle fmov class ops< / li >
< li > pan/bi: Fix outmod/roundmode flip< / li >
< li > pan/bi: Export bi_class_name< / li >
< li > pan/bi: Fix duplicated source in ADD.v2f16< / li >
< li > pan/bi: Fix negation in ADD.v2f16< / li >
< li > pan/bi: Don't gobble zero ports< / li >
< li > pan/bi: Allow BI_FMA to take mods< / li >
< li > pan/bi: Handle BIFROST_FIRST_WRITE_FMA_P2_READ_P3< / li >
< li > pan/bi: Add helper to debug port assignment< / li >
< li > pan/bi: Match CSEL argument order with hw< / li >
< li > pan/bit: Stub out BIR interpreter< / li >
< li > pan/bit: Handle read/write< / li >
< li > pan/bit: Add preliminary FMA/ADD/MOV implementations< / li >
< li > pan/bit: Implement outmods< / li >
< li > pan/bit: Implement floating source mods< / li >
< li > pan/bit: Add packing test framework< / li >
< li > pan/bit: Add helper for generating floating mod tests< / li >
< li > pan/bit: Add verbose printing for tests< / li >
< li > pan/bit: Add 16-bit fmod tests< / li >
< li > pan/bit: Add FMA tests< / li >
< li > pan/bit: Add CSEL to interpreter< / li >
< li > pan/bit: Add csel tests< / li >
< li > pan/bit: Make run more useful< / li >
< li > pan/bit: Add mode to run unit tests< / li >
< li > pan/bi: Remove nontrivial SPECIAL ops< / li >
< li > pan/bi: Add 32-bit _FAST packing< / li >
< li > pan/bi: Add fp16 support for frcp/frsq< / li >
< li > pan/bit: Add special op interpreting< / li >
< li > pan/bit: Add special unit test< / li >
< li > pan/bi: Implement min/max on FMA< / li >
< li > pan/bi: Structify ADD unit add/min/max< / li >
< li > pan/bi: Add ADD add/min/max fp32 packing< / li >
< li > pan/bi: Set BI_MODS for MINMAX< / li >
< li > pan/bi: Fix incorrect abs flip in fma/fadd16< / li >
< li > pan/bi: Force ADD scheduling for MINMAX< / li >
< li > pan/bit: Unify test frontends< / li >
< li > pan/bit: Add min/max support to interpreter< / li >
< li > pan/bit: Enable more debug for `run`< / li >
< li > pan/bit: Add fmin/max16 tests< / li >
< li > pan/bit: Wire up add/add op+test< / li >
< li > panfrost: Add IS_BIFROST quirk< / li >
< li > panfrost: Populate bifrost-specific structs within mali_shader_meta< / li >
< li > panfrost: Staticize a few cmdstream functions< / li >
< li > panfrost: Unify vertex/tiler structures< / li >
< li > panfrost: Set mfbd.msaa.sample_locations on Bifrost< / li >
< li > panfrost: Call the Bifrost compiler on bi devices< / li >
< li > pan/bi: Fix nondeterministic register packing< / li >
< li > pan/midgard: Remove unused max_varying variable< / li >
< li > panfrost: Move varying linking to cmdstream< / li >
< li > panfrost: Move uniform_count to pan_assemble< / li >
< li > panfrost: Pass compiler-appropriate options< / li >
< li > pan/bi: Fix backwards registers ports< / li >
< li > panfrost: Fix BI_BLEND packing< / li >
< li > pan/bi: Let !b2b imply branch_cond< / li >
< li > pan/decode: Print Bifrost blend descriptor< / li >
< li > panfrost: Drop dependency on nonexistent write_value< / li >
< li > pan/bi: Lower fsqrt< / li >
< li > pan/midgard: Fix f2u naming confusion< / li >
< li > pan/bi: Set BI_ROUNDMODE for BI_CONVERT< / li >
< li > pan/bi: Fix incorrect swizzle packing assert< / li >
< li > pan/bi: Rewrite conversion packing< / li >
< li > pan/bi: ADD packing for CONVERT< / li >
< li > pan/bit: Add BI_CONVERT interpretation< / li >
< li > pan/bit: Add BI_CONVERT tests< / li >
< li > pan/bi: Add disasm for ADD.i8< / li >
< li > pan/bi: Disable FMA scheduling for CONVERT< / li >
< li > pan/bi: Add BI_TABLE for fast table accesses< / li >
< li > pan/bi: Add special op for exp2< / li >
< li > pan/bi: Add op for ADD_FREXPM< / li >
< li > pan/bi: Add FLOG2_U op to disassembler< / li >
< li > pan/bi: Add log_frexpe op to IR< / li >
< li > pan/bi: Add frexp_log packing< / li >
< li > pan/bi: Add bi_pack_fma_2src helper< / li >
< li > pan/bi: Pack ADD_FREXPM< / li >
< li > pan/bi: Add log2_help packing< / li >
< li > pan/bi: Add _MSCALE flag for FMA/ADD< / li >
< li > pan/bi: Structify FMA_MSCALE< / li >
< li > pan/bi: Pack FMA_MSCALE< / li >
< li > pan/bi: Add fexp2_fast packing< / li >
< li > pan/bi: Split src/dest index printing< / li >
< li > pan/bi: Ensure CONSTANT srcs have types< / li >
< li > pan/bi: Fix bi_get_immediate with multiple imms< / li >
< li > pan/bi: Fix packing with multiple constants< / li >
< li > pan/bi: Fix packing with low-nibble-set on hi constant< / li >
< li > pan/bi: Fix lower_combine swizzle rewrite< / li >
< li > pan/bi: Add fexp2 implementation< / li >
< li > pan/bi: Implement flog2< / li >
< li > pan/bi: Fix vec2/3 handling< / li >
< li > pan/bi: Handle st_vary with < 4 components< / li >
< li > pan/bi: Try to reuse constants in ALU< / li >
< li > pan/bi: Workaround constant packing errata< / li >
< li > pan/bi: Structify add and min/max fp16 ADD< / li >
< li > pan/bi: Pack ADD.v2f16< / li >
< li > pan/bi: Pack MAX.v2f16< / li >
< li > pan/bi: Dump extra bits for disasm< / li >
< li > pan/bi: Round constants to 32-bit< / li >
< li > pan/bi: Lower special ops to 32-bit< / li >
< li > pan/bit: Add FREXP interp support< / li >
< li > pan/bit: Add frexp_log test< / li >
< li > pan/bit: Add BI_REDUCE_FMA interp< / li >
< li > pan/bit: Add FMA_REDUCE test< / li >
< li > pan/bit: Add log2 helper interp< / li >
< li > pan/bit: Add BI_TABLE test< / li >
< li > pan/bit: _MSCALE interp< / li >
< li > pan/bit: Add FMA_MSCALE test< / li >
< li > pan/bit: Add fexp2_fast interp< / li >
< li > pan/bit: Add fexp2_fast test< / li >
< li > pan/bit: Add constants test< / li >
< li > pan/bit: Add fp16 min/max tests< / li >
< li > pan/bi: Print tex_compact coordinates< / li >
< li > pan/bi: Document when dual-tex is triggered< / li >
< li > pan/bi: Disassemble f16 dual tex< / li >
< li > pan/bi: Structify TEX compact< / li >
< li > pan/bi: Include TEX_COMPACT f16 opcode< / li >
< li > pan/bi: Feed data register to BI_TEX< / li >
< li > pan/bi: Add normal/compact/dual switch to IR< / li >
< li > pan/bi: Stub out tex_compact logic< / li >
< li > pan/bi: Generate TEX_COMPACT instruction< / li >
< li > pan/bi: Pack TEX compact instructions< / li >
< li > pan/bi: Assert out multiple textures< / li >
< li > panfrost: Fix crashes with small BOs< / li >
< li > panfrost: Assert on unimplemented fragcoord etc< / li >
< li > panfrost: Set clear_color_[12] in the extra fb desc< / li >
< li > panfrost: Add tentative bifrost_texture_descriptor< / li >
< li > panfrost: decode textures and samplers on bifrost< / li >
< li > pan/decode: Remove is_zs weirdness< / li >
< li > panfrost: Identify texture layout field< / li >
< li > panfrost: The texture descriptor has a pointer to a trampoline< / li >
< li > pan/bi: Pack fp16 ATEST< / li >
< li > pan/bi: Passthrough type for ATEST< / li >
< li > pan/bi: Passthrough blend types< / li >
< li > pan/bi: Assign blend descriptor for BLEND op< / li >
< li > pan/bi: Add missing BI_VECTOR< / li >
< li > pan/bi: Fix ADD.v4i8 opcode< / li >
< li > pan/bi: Eliminate writemasks in the IR< / li >
< li > pan/bi: Rename BI_SWIZZLE to BI_SELECT< / li >
< li > pan/bi: Pack FMA SEL16< / li >
< li > pan/bi: Pack FMA SEL8< / li >
< li > pan/bi: Pack ADD SEL16< / li >
< li > pan/bi: Force BI_SELECT arguments scalar< / li >
< li > pan/bit: Interpret BI_SELECT< / li >
< li > pan/bit: Add SELECT tests< / li >
< li > pan/bi: Fix RA wrt 16-bit swizzles< / li >
< li > pan/bi: Implement 16-bit COMBINE lowering< / li >
< li > nir: Move nir_lower_mediump_outputs from ir3< / li >
< li > ir3: Use shared mediump output lowering< / li >
< li > pan/bi: Add bool->float opcodes< / li >
< li > pan/bi: Add CSEL.64 opcode< / li >
< li > pan/bi: Add some 8-bit compares< / li >
< li > pan/bi: Add 64-bit int compares< / li >
< li > pan/bi: Add FCMP.GL.v2f16 on ADD opcode< / li >
< li > pan/bi: Add CSEL.8 opcode< / li >
< li > pan/bi(t): Fix SELECT tests< / li >
< li > pan/bi: Deduplicate csel/cmp cond< / li >
< li > pan/bi: Remove bi_round_op< / li >
< li > pan/bi: Structify FMA FCMP< / li >
< li > pan/bi: Structify ADD FCMP 32< / li >
< li > pan/bi: Structify FMA FCMP16< / li >
< li > pan/bi: Structify ADD FCMP16< / li >
< li > pan/bi: Structify FMA ICMP 32< / li >
< li > pan/bi: Structify FMA ICMP 16< / li >
< li > pan/bi: Structify ADD ICMP 32< / li >
< li > pan/bi: Fix source mod testing for CMP< / li >
< li > pan/bi: Pack FMA 32 FCMP< / li >
< li > pan/bi: Factor out fp16 abs logic< / li >
< li > pan/bi: Pack fma.fcmp16< / li >
< li > pan/bi: Relax double-abs condition< / li >
< li > pan/bit: Prepare condition evaluation for vectors< / li >
< li > pan/bit: Interpret CMP< / li >
< li > pan/bi: Add initial fcmp test< / li >
< li > pan/bi: Add bitwise modifiers< / li >
< li > pan/bi: Pack BI_BITWISE< / li >
< li > pan/bi: Handle iand/ior/ixor in NIR->BIR< / li >
< li > pan/bit: Interpret BI_BITWISE< / li >
< li > pan/bit: Add BITWISE test< / li >
< li > panfrost: Fix BO reference counting< / li >
< li > panfrost: Move Bifrost IR indexing to common< / li >
< li > pan/bi: Use common IR indices< / li >
< li > pan/mdg: Remove nir_alu_src_index< / li >
< li > pan/mdg: Use PAN_IS_REG< / li >
< li > pan/mdg: SSA_FIXED_MINIMUM already covered by PAN_IS_REG< / li >
< li > pan/mdg: Don't break SSA< / li >
< li > pan/mdg: Remove goofy 16-bit comment< / li >
< li > pan/mdg: Remove old hack< / li >
< li > pan/mdg: Set lower_flrp16< / li >
< li > pan/bi: Share ALU type printing< / li >
< li > pan/mdg: Add type fields to IR< / li >
< li > pan/mdg: Track ALU src types< / li >
< li > pan/mdg: Track ALU dest type< / li >
< li > pan/mdg: Another goofy comment gone< / li >
< li > pan/mdg: Track a primary type for I/O< / li >
< li > pan/mdg: Denoise prints< / li >
< li > pan/mdg: Track v_mov type (force uint32 for now?)< / li >
< li > pan/mdg: Track texture types< / li >
< li > pan/mdg: Set texture full fields at pack time< / li >
< li > pan/mdg: Move sampler_type emission to pack time< / li >
< li > pan/mdg: Lower specials to 32-bit< / li >
< li > pan/mdg: Specialize swizzle to type< / li >
< li > pan/mdg: Always print the mask< / li >
< li > pan/mdg: Make some branch targets more explicit< / li >
< li > pan/mdg: Don't crash on unknown branch target< / li >
< li > pan/mdg: Pass through some types from scheduling< / li >
< li > pan/mdg: Move condense_writemask to disasm< / li >
< li > pan/mdg: Ensure fdot is scalar out in disasm< / li >
< li > pan/mdg: Replicate 16-bit swizzles< / li >
< p > < / p >
< p > Andreas Baierl (8):< / p >
< li > lima/parser: Fix RSW depth test parsing< / li >
< li > lima/parser: Extend AUX0 findings< / li >
< li > lima/parser: Change value name in RSW parser< / li >
< li > lima/parser: Extend rsw parsing showing strings instead of numbers< / li >
< li > gitlab-ci: lima: Add flaky tests to the skips list< / li >
< li > gitlab-ci: Enable the lima job again< / li >
< li > gitlab-ci: Add a set of lima flakes< / li >
< li > lima: Add etc1 support< / li >
< p > < / p >
< p > Andres Gomez (27):< / p >
< li > tracie: correct typo< / li >
< li > gitlab-ci: add missing popd to the build-deqp-vk.sh script< / li >
< li > gitlab-ci: build gfxreconstruct into the Vulkan testing container< / li >
< li > gitlab-ci: build VulkanTools into the Vulkan testing container< / li >
< li > gitlab-ci: Change devices format to &lt;api-vendor-deviceId&gt;< / li >
< li > gitlab-ci: Add gfxreconstruct traces support< / li >
< li > gitlab-ci: Add jobs to be able to test Vulkan< / li >
< li > gitlab-ci: Fix indentation and dangerous "\" in the last multiline line< / li >
< li > gitlab-ci: Remove unneeded python3-pilkit dependency< / li >
< li > gitlab-ci: Sort packages to install alphabetically< / li >
< li > gitlab-ci: add python3-requests to the test-vk container< / li >
< li > gitlab-ci/traces: Add Vulkan sample entries for POLARIS10< / li >
< li > gitlab-ci: Don't use buster-backports packages by default for x86_test-vk< / li >
< li > gitlab-ci: add Wine, win64's apitrace and DXVK to the Vulkan testing container< / li >
< li > gitlab-ci: add apitrace's DXGI traces support< / li >
< li > gitlab-ci: replay apitrace traces in headless mode< / li >
< li > gitlab-ci: add Wine and DXVK env variables to Vulkan's tracie runner< / li >
< li > gitlab-ci/traces: Add D3D11 sample entry for POLARIS10< / li >
< li > gitlab-ci: Vulkan tracie runner to return last command exit code< / li >
< li > gitlab-ci: protect usage of shell variables with double quotes< / li >
< li > gitlab-ci: make explicit tracie is gitlab specific< / li >
< li > gitlab-ci: adapt query_traces_yaml to gitlab specific changes< / li >
< li > gitlab-ci: install winehq-stable to get 5.0 instead of 4.0< / li >
< li > Revert "meson,ci: Disable sparse_array tests on windows"< / li >
< li > gitlab-ci: update tracie README after changes in main script< / li >
< li > gitlab-ci: create always the "results" directory with tracie< / li >
< li > gitlab-ci: correct tracie behavior with replay errors< / li >
< p > < / p >
< p > Andrii Simiklit (2):< / p >
< li > Revert "glx: convert glx_config_create_list to one big calloc"< / li >
< li > i965/vec4: Ignore swizzle of VGRF for use by var_range_end()< / li >
< p > < / p >
< p > Anuj Phogat (2):< / p >
< li > intel/gen12+: Reserve 4KB of URB space per bank for Compute Engine< / li >
< li > intel/gen12+: Set way_size_per_bank to 4< / li >
< p > < / p >
< p > Arcady Goldmints-Orlov (7):< / p >
< li > compiler/nir: Add support for variable initialization from a pointer< / li >
< li > compiler/spirv: Add support for non-constant initializers< / li >
< li > Rename nir_lower_constant_initializers to nir_lower_variable_initalizers< / li >
< li > spirv: Remove outdated SPIR-V decoration warnings< / li >
< li > nir: Lower returns correctly inside nested loops< / li >
< li > anv: increase minUniformBufferOffsetAlignment to 64< / li >
< li > intel/compiler: fix alignment assert in nir_emit_intrinsic< / li >
< p > < / p >
< p > Axel Davy (1):< / p >
< li > gallium/util: Fix leak in the live shader cache< / li >
< p > < / p >
< p > Bas Nieuwenhuizen (29):< / p >
< li > radv: Allow non-dedicated linear images and buffer.< / li >
< li > radv: Do not set SX DISABLE bits for RB+ with unused surfaces.< / li >
< li > radv: Optimize emitting index buffer changes.< / li >
< li > radv: Do not redundantly set the RB+ regs on pipeline switch.< / li >
< li > radeonsi: Fix compute copies for subsampled formats.< / li >
< li > amd/llvm: Fix divergent descriptor indexing. (v3)< / li >
< li > amd/llvm: Fix divergent descriptor regressions with radeonsi.< / li >
< li > radv: Store 64-bit availability bools if requested.< / li >
< li > radv: Consider maximum sample distances for entire grid.< / li >
< li > radv: Whitespace fixup.< / li >
< li > radv: Use correct buffer count with variable descriptor set sizes.< / li >
< li > winsys/amdgpu: Retrieve WC flags from imported buffers.< / li >
< li > drm-uapi,radv,radeonsi: Add amdgpu_drm.h header.< / li >
< li > vulkan/wsi: Add callback to set ownership of buffer.< / li >
< li > radv: Add WSI buffers to BO list only if they can be used.< / li >
< li > st/dri: Set next in template instead of after creation. (v2)< / li >
< li > radeonsi: Count planes for imported textures.< / li >
< li > radv: Use actual memory type count for setting app-visible bitset.< / li >
< li > radv: Stop using memory type indices.< / li >
< li > radv/winsys: Add function to get domains/flags from fd.< / li >
< li > radv: Determine memory type for import based on fd.< / li >
< li > radv: Expose 4G element texel buffers.< / li >
< li > radv: Fix implicit sync with recent allocation changes.< / li >
< li > radv: Extend tiling flags to 64-bit.< / li >
< li > radv: Provide a better error for permission issues with priorities.< / li >
< li > radv/winsys: Remove extra sizeof multiply.< / li >
< li > radv: Handle failing to create .cache dir.< / li >
< li > radv: Do not close fd -1 when NULL-winsys creation fails.< / li >
< li > radv: Implement vkGetSwapchainGrallocUsage2ANDROID.< / li >
< p > < / p >
< p > Bernd Kuhls (1):< / p >
< li > util/os_socket: Include unistd.h to fix build error< / li >
< p > < / p >
< p > Blaž Tomažič (1):< / p >
< li > radeonsi: Fix omitted flush when moving suballocated texture< / li >
< p > < / p >
< p > Boris Brezillon (45):< / p >
< li > pan/midgard: Add an enum to describe the render targets< / li >
< li > pan/midgard: Make sure we pass the right RT id to emit_fragment_store()< / li >
< li > pan/midgard: Lower bitfield extract to shifts< / li >
< li > pan/midgard: Don't check 'branch &amp;&amp; branch->writeout' twice in mir_schedule_alu()< / li >
< li > pan/midgard: Stop leaking instruction objects in mir_schedule_alu()< / li >
< li > panfrost: Fix the damage box clamping logic< / li >
< li > pan/midgard: Turn Z/S stores into zs_output_pan intrinsics< / li >
< li > pan/midgard: Add nir_intrinsic_store_zs_output_pan support< / li >
< li > panfrost: Z24 variants should be sampled as R32UI< / li >
< li > panfrost: Add the MALI_WRITES_{Z,S} flags< / li >
< li > panfrost: Set the MALI_WRITES_{Z,S} flags when needed< / li >
< li > Revert "panfrost: Z24 variants should be sampled as R32UI"< / li >
< li > panfrost: Pass the sampler view format when creating a tex descriptor< / li >
< li > panfrost: Assign primitive_size.pointer only if writes_point_size() returns true< / li >
< li > panfrost: Add a helper to retrieve the currently active shader state< / li >
< li > panfrost: Move the batch stack size adjustment out of panfrost_queue_draw()< / li >
< li > panfrost: Move viewport desc emission out of panfrost_emit_for_draw()< / li >
< li > panfrost: Move the const buf emission logic out of panfrost_emit_for_draw()< / li >
< li > panfrost: Move shared mem desc emission out of panfrost_launch_grid()< / li >
< li > panfrost: Dissociate shader meta patching from the desc emission< / li >
< li > panfrost: Move panfrost_attach_vt_framebuffer() to pan_cmdstream.c< / li >
< li > panfrost: Stop using panfrost_emit_for_draw() for compute jobs< / li >
< li > panfrost: Simplify panfrost_emit_for_draw() and make it private< / li >
< li > panfrost: Add a helper to update the occlusion query part of a tiler job desc< / li >
< li > panfrost: Add a helper to update the rasterizer part of a tiler job desc< / li >
< li > panfrost: Prepare things to get rid of panfrost_shader_state.tripipe< / li >
< li > panfrost: Prepare shader_meta descriptors at emission time< / li >
< li > panfrost: Add a panfrost_sampler_desc_init() helper< / li >
< li > panfrost: Move sampler/tex descs emission helpers to pan_cmdstream.c< / li >
< li > panfrost: Add a helper to emit a pair of vertex/tiler jobs< / li >
< li > panfrost: Drop initial mali_attr_meta.src_offset assignment< / li >
< li > panfrost: Ignore BO start addr when adjusting src_offset< / li >
< li > panfrost: Prepare attribute for builtins at state creation time< / li >
< li > panfrost: Emit attribute descriptors after patching the templates< / li >
< li > panfrost: Move the mali_attr.src_offset adjustment to a sub-function< / li >
< li > panfrost: Rename panfrost_stage_attributes()< / li >
< li > panfrost: Move streamout offset update out of panfrost_draw_vbo()< / li >
< li > panfrost: Move vertex/tiler payload initialization out of panfrost_draw_vbo()< / li >
< li > panfrost: Inline panfrost_queue_draw() and panfrost_emit_for_draw()< / li >
< li > panfrost: Move panfrost_emit_vertex_data() to pan_cmdstream.c< / li >
< li > panfrost: Move panfrost_emit_varying_descriptor() to pan_cmdstream.c< / li >
< li > panfrost: Re-init the VT payloads at draw/launch_grid() time< / li >
< li > panfrost: Use ctx->active_prim in panfrost_writes_point_size()< / li >
< li > panfrost: Get rid of ctx->payloads[]< / li >
< li > vtn/opencl: add rint-support< / li >
< p > < / p >
< p > Brian Ho (17):< / p >
< li > turnip: Promote tu_cs_get_size/is_empty to header< / li >
< li > turnip: Execute main cs for secondary command buffers< / li >
< li > turnip: Advertise 8 bit subpixel precision< / li >
< li > ir3: Disable copy prop for immediate ldlw offsets< / li >
< li > turnip: Set has_gs in ir3_shader_key< / li >
< li > turnip: Emit geometry shader obj and related consts< / li >
< li > turnip: Configure VPC for geometry shaders< / li >
< li > turnip: Configure VFD_CONTROL with gsheader and primitiveid< / li >
< li > turnip: Set up REG_A6XX_SP_GS_CONFIG< / li >
< li > turnip: Selectively configure GRAS_LAYER_CNTL< / li >
< li > turnip: Update maxGeometryShaderInvocations to match blob< / li >
< li > turnip: Populate tu_pipeline.active_stages< / li >
< li > turnip: Enable geometry shaders for CP_DRAWs< / li >
< li > turnip: Enable geometryShader device feature< / li >
< li > turnip: Correctly set layer stride for 3D images< / li >
< li > turnip: Emit geometry shader descriptor consts< / li >
< li > freedreno/turnip: Update GRAS_LAYER_CNTL to GRAS_MAX_LAYER_INDEX< / li >
< p > < / p >
< p > Caio Marcelo de Oliveira Filho (46):< / p >
< li > anv: Advertise VK_KHR_shader_non_semantic_info< / li >
< li > radv: Advertise VK_KHR_shader_non_semantic_info< / li >
< li > intel/gen12: Take into account opcode when decoding SWSB< / li >
< li > spirv: Be consistent when checking for Shader/Kernel< / li >
< li > anv: Use intel_debug_flag_for_shader_stage()< / li >
< li > anv: Add pipe_state_for_stage() helper< / li >
< li > nir/builder: Add nir_scoped_memory_barrier()< / li >
< li > nir: Add the alias NIR_MEMORY_ACQ_REL< / li >
< li > nir/tests: Use nir_scoped_memory_barrier() helper< / li >
< li > nir, intel: Move use_scoped_memory_barrier to nir_options< / li >
< li > anv: Remove unused field xfb_used from anv_pipeline< / li >
< li > anv: Remove unused field `urb.total_size`< / li >
< li > nir: Don't skip a bit in nir_memory_semantics< / li >
< li > nir: Reorder nir_scopes so wider scope has larger numeric value< / li >
< li > nir: Add pass to combine adjacent scoped memory barriers< / li >
< li > intel/fs: Combine adjacent memory barriers< / li >
< li > anv: Add a new enum to identify the pipeline type< / li >
< li > anv: Use pipeline type to decide whether or not lower multiview< / li >
< li > anv: Use a dynamic array for storing executables in pipeline< / li >
< li > anv: Keep the shader stage in anv_shader_bin< / li >
< li > anv: Pass the right pipe_state to flush_descriptor_sets()< / li >
< li > anv: Remove redundant check in flush_descriptor_sets() helpers< / li >
< li > anv: Decouple flush_descriptor_sets() helpers from pipeline struct< / li >
< li > anv: Decouple flush_descriptor_sets() from pipeline struct< / li >
< li > anv: Use a separate field in the pipeline for compute shader< / li >
< li > anv: Split graphics and compute bits from anv_pipeline< / li >
< li > anv: Reduce compute pipeline batch_data size< / li >
< li > anv: Remove duplicate code in anv_cmd_buffer_bind_descriptor_set< / li >
< li > intel/blorp: Plumb the stage through blorp upload_shader< / li >
< li > mesa/main: Fix overflow in validation of DispatchComputeGroupSizeARB< / li >
< li > nir: Add per_view attribute to nir_variable< / li >
< li > intel/gen12: Add XML description for 3DSTATE_PRIMITIVE_REPLICATION< / li >
< li > intel/fs: Allow multiple slots for position< / li >
< li > anv/gen12: Lower VK_KHR_multiview using Primitive Replication< / li >
< li > intel/compiler: Replace cs_prog_data->push.total with a helper< / li >
< li > anv: Stop using cs_prog_data->threads< / li >
< li > iris: Stop using cs_prog_data->threads< / li >
< li > intel/compiler: Remove cs_prog_data->threads< / li >
< li > intel/fs,vec4: Properly account SENDs in IVB memory fence< / li >
< li > spirv: Fix propagation of OpVariable access flags< / li >
< li > spirv: Handle instruction aliases in vtn_gather_types< / li >
< li > spirv: Update the headers from latest Khronos master< / li >
< li > intel/fs: Allow FS_OPCODE_SCHEDULING_FENCE stall on registers< / li >
< li > intel/fs,vec4: Pull stall logic for memory fences up into the IR< / li >
< li > intel/fs: Only stall after sending all memory fence messages< / li >
< li > i965: Use correct constant for max_variable_local_size< / li >
< p > < / p >
< p > Chad Versace (12):< / p >
< li > anv: Drop unused anv_image_get_surface_for_aspect_mask()< / li >
< li > anv: Rename param make_surface::dev to device< / li >
< li > anv: Delete anv_image::ccs_e_compatible< / li >
< li > anv: Clarify behavior of anv_image_aspect_to_plane()< / li >
< li > anv: Respect ISL_SURF_USAGE_DISABLE_AUX_BIT in make_surface()< / li >
< li > turnip: Add magic register values to tu_physical_device< / li >
< li > turnip: Add a618 support< / li >
< li > anv: Drop anv_image.c:get_surface()< / li >
< li > anv: Add anv_image_plane_needs_shadow_surface() (v2)< / li >
< li > anv: Refactor creation of aux surfaces (v2)< / li >
< li > anv: Flatten the logic add_aux_surface_if_supported (v3)< / li >
< li > anv: Use isl_drm_modifier_get_default_aux_state()< / li >
< p > < / p >
< p > Chia-I Wu (2):< / p >
< li > egl/android: require ANDROID_native_fence_sync for buffer age< / li >
< li > egl/android: enable/disable KHR_partial_update correctly< / li >
< p > < / p >
< p > Chris Lord (2):< / p >
< li > vc4: fix vc4_yuv_blit overwriting fragment constant buffer slot 0< / li >
< li > vc4: Fix query_dmabuf_modifiers mis-reporting external_only property< / li >
< p > < / p >
< p > Chris Wilson (1):< / p >
< li > iris: Fix import sync-file into syncobj< / li >
< p > < / p >
< p > Christian Gmeiner (44):< / p >
< li > etnaviv: enable texture upload memory throttling< / li >
< li > etnaviv: update headers from rnndb< / li >
< li > etnaviv: fix alpha test on GC3000< / li >
< li > etnaviv: add etna_constbuf_state object< / li >
< li > etnaviv: ask kernel for max number of supported varyings< / li >
< li > etnaviv: update headers from rnndb< / li >
< li > etnaviv: increase number of supported varyings to 16< / li >
< li > etnaviv: implement emit_string_marker< / li >
< li > etnaviv: get rid of etna_spec in etna_context< / li >
< li > etnaviv: enable shareable shaders< / li >
< li > freedreno: calculate modified bit mask only once< / li >
< li > freedreno: simplify fd_set_shader_buffers(..)< / li >
< li > freedreno: ssbo: keep track if a buffer gets written< / li >
< li > freedreno: ssbo: mark resource read or written depending on usage< / li >
< li > etnaviv: get rid of SE_CLIP_*< / li >
< li > etnaviv: rework clipping calculation to be a derived state< / li >
< li > etnaviv: do the left shift by 16 at emit time< / li >
< li > etnaviv: get rid of struct compiled_scissor_state< / li >
< li > etnaviv: s/scissor_s/scissor< / li >
< li > etnaviv: compiled_framebuffer_state: get rid of SE_SCISSOR_*< / li >
< li > etnaviv: rename hw queries to acc queries< / li >
< li > etnaviv: rework etna_acc_sample_provider< / li >
< li > etnaviv: explicitly call resource_written(..)< / li >
< li > etnaviv: reset no_wait_cnt after triggered flush< / li >
< li > etnaviv: rework wait/flush logic< / li >
< li > etnaviv: extend acc query provider with supports(..) function< / li >
< li > etnaviv: make use of a fixed size array to keep track of all acc query providers< / li >
< li > etnaviv: extend result(..) to return if data is ready< / li >
< li > etnaviv: extend acc sample provider with an allocate(..)< / li >
< li > etnaviv: move generic perfmon functionality into own file< / li >
< li > etnaviv: convert perfmon queries to acc queries< / li >
< li > etnaviv: drop redundant calls to etna_acc_query_suspend(..)< / li >
< li > etnaviv: change begin_query(..) to a void function< / li >
< li > etnaviv: remove the "active" member of queries< / li >
< li > etnaviv: anisotropic filtering is supported starting with HALTI0< / li >
< li > etnaviv: update headers from rnndb< / li >
< li > etnaviv: add anisotropic filter support< / li >
< li > docs/features: mark GL_ARB_texture_filter_anisotropic as done for etnaviv< / li >
< li > etnaviv: drop default state for FE_HALTI5_ID_CONFIG< / li >
< li > etnaviv: call util_blitter_save_fragment_constant_buffer_slot(..)< / li >
< li > etnaviv: support for using generic blit path< / li >
< li > ci: bare-metal: power down device after tests< / li >
< li > etnaviv: fix SAMP_ANISOTROPY register value< / li >
< li > etnaviv: do not use int filter when anisotropic filtering is used< / li >
< p > < / p >
< p > Christopher Egert (1):< / p >
< li > radv: use util_float_to_half_rtz< / li >
< p > < / p >
< p > Christopher James Halse Rogers (1):< / p >
< li > egl/wayland: Fix zwp_linux_dmabuf usage< / li >
< p > < / p >
< p > Connor Abbott (55):< / p >
< li > freedreno: Fix CP_COND_REG_EXEC bit positions< / li >
< li > freedreno: Add CP_REG_WRITE documentation< / li >
< li > freedreno: Fix CP_COND_EXEC< / li >
< li > tu: Move vsc_data and vsc_data2 allocation into the device< / li >
< li > tu: Don't emit initial render target state in tile_load_ib< / li >
< li > tu: Properly set UBWC flags in RB_RENDER_CNTL< / li >
< li > tu/blit: Support blits in secondary cmdstreams< / li >
< li > tu: Support multisample image clears< / li >
< li > tu: Disable linear depth attachments< / li >
< li > tu: Sysmem rendering< / li >
< li > tu: Add helper for CP_COND_REG_EXEC< / li >
< li > tu: Handle vkCmdClearAttachments() with sysmem< / li >
< li > tu: Support resolve ops with sysmem rendering< / li >
< li > tu: Support input attachments with sysmem< / li >
< li > tu: Force sysmem with mipmapped non-aligned linear stores< / li >
< li > tu: Rewrite border color handling< / li >
< li > lima/gpir: Make lima_gpir_node_insert_child() useful< / li >
< li > lima/gpir: Optimize conditional break/continue< / li >
< li > lima/gpir: Optimize nots created from branch lowering< / li >
< li > tu: Fix border color with compute shaders< / li >
< li > freedreno/fdl: Add base_align< / li >
< li > tu: Return the correct alignment for images< / li >
< li > freedreno: Cleanup event names< / li >
< li > freedreno: Rename RB_DONE_TS< / li >
< li > tu: Dump out shader assembly when requested< / li >
< li > tu: ir3: Emit push constants directly< / li >
< li > freedreno/a6xx: Add UBO size field< / li >
< li > freedreno/a6xx: Add registers for the bindless model< / li >
< li > ir3: Add bindless instruction encoding< / li >
< li > ir3: Plumb through support for a1.x< / li >
< li > ir3: Also don't propagate immediate offset with LDC< / li >
< li > ir3: LDC also has a destination< / li >
< li > ir3: Plumb through bindless support< / li >
< li > ir3: Rewrite UBO push analysis to support bindless< / li >
< li > tu: Switch to the bindless descriptor model< / li >
< li > tu: Emit CP_LOAD_STATE6 for descriptors< / li >
< li > tu: Add missing code for immutable samplers< / li >
< li > tu: Implement descriptor set update templates< / li >
< li > ir3: Fix txs with bindless< / li >
< li > ir3: Fix LDC offset units< / li >
< li > ir3: Handle load_ubo_ir3 when promoting to constants< / li >
< li > tu: Align GMEM resolve blit scissor< / li >
< li > tu: Use tu_cs_add_entries() with non-render-pass secondaries< / li >
< li > ir3/ra: Fix off-by-one issues with live-range extension< / li >
< li > freedreno/a6xx: Expand various varying-count bitfields< / li >
< li > tu: Fix the advertised maxFragmentInputComponents< / li >
< li > ir3: Don't double-insert the first block< / li >
< li > ir3: Fix bug with shaders that only exit via discard< / li >
< li > freedreno/a6xx: Document PrimID passthrough registers< / li >
< li > ir3: Skip missing VS outputs in VS out map when linking< / li >
< li > tu: Implement PrimID passthrough< / li >
< li > freedreno/a6xx: Implement PrimID passthrough< / li >
< li > st/nir: Fix assigning PointCoord location with !PIPE_CAP_TEXCOORD< / li >
< li > ir3: Remove VARYING_SLOT_PNTC remapping hack< / li >
< li > tu: Don't invert point coords< / li >
< p > < / p >
< p > D Scott Phillips (6):< / p >
< li > intel/tools/aubinator_error_decode: read HW Context before other batches< / li >
< li > intel/tools/aubinator_error_decode: Decode ring buffers from HEAD to TAIL< / li >
< li > util/sparse_array: don't stomp head's counter on pop operations< / li >
< li > intel/fs: Update location of Render Target Array Index for gen12< / li >
< li > anv,iris: Fix input vertex max for tcs on gen12< / li >
< li > anv/gen11+: Disable object level preemption< / li >
< p > < / p >
< p > Daniel Schürmann (73):< / p >
< li > aco: fix image_atomic_cmp_swap< / li >
< li > nir: gather info whether a shader uses demote_to_helper< / li >
< li > nir: add pass to lower discard() to demote()< / li >
< li > amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation< / li >
< li > radeonsi: lower discard to demote when FS_CORRECT_DERIVS_AFTER_KILL is enabled< / li >
< li > radv: use nir_lower_discard_to_demote to work around game bugs< / li >
< li > amd: join emit_kill() from radv and radeonsi in ac_nir_to_llvm< / li >
< li > nir: fix unpack_64_4x16 in lower_alu_to_scalar()< / li >
< li > aco: add comparison operators for PhysReg< / li >
< li > aco: add sub-dword regclasses< / li >
< li > aco: refactor regClass setup for subdword VGPRs< / li >
< li > aco: validate p_create_vector with subdword elements properly< / li >
< li > aco: validate register alignment of subdword operands and definitions< / li >
< li > aco: validate uninitialized operands< / li >
< li > aco: validate RA of subdword assignments< / li >
< li > aco: print subdword registers< / li >
< li > aco: fix Temp and assignment of renamed operands during RA< / li >
< li > aco: remove unnecessary reg_file.fill() operation in get_reg_create_vector()< / li >
< li > aco: add notion of subdword registers to register allocator< / li >
< li > aco: create helper function to collect variables from register area< / li >
< li > aco: adapt register allocation for subdword registers< / li >
< li > aco: align subdword registers during RA when necessary< / li >
< li > aco: small refactoring of shuffle code lowering< / li >
< li > aco: add builder function for subdword copy()< / li >
< li > aco: lower subdword shuffles correctly.< / li >
< li > aco: don't propagate SGPRs into subdword PSEUDO instructions< / li >
< li > aco: don't assume split_vector(create_vector) has the same number of elements when optimizing< / li >
< li > aco: don't vectorize 8/16bit load/store_ssbo< / li >
< li > aco: add missing conversion operations for small bitsizes< / li >
< li > aco: add byte_align_scalar() & trim_subdword_vector() helper functions< / li >
< li > aco: prepare helper functions for subdword handling< / li >
< li > aco: implement vec2/3/4 with subdword operands< / li >
< li > aco: implement storagePushConstant8 & storagePushConstant16< / li >
< li > aco: implement 8bit/16bit load_buffer< / li >
< li > aco: implement 8bit/16bit store_ssbo< / li >
< li > aco: use MUBUF to load subdword SSBO< / li >
< li > aco: guarantee that Temp fits in 4 bytes< / li >
< li > aco: add explicit padding for all Instruction sub-structs< / li >
< li > aco: improve hashing for value numbering< / li >
< li > aco: improve register assignment when live-range splits are necessary< / li >
< li > aco: replace assignment hashmap by std::vector in register allocation< / li >
< li > aco: during RA only insert into renames table if a variable got renamed< / li >
< li > aco: improve speed of live_var_analysis< / li >
< li > aco: refactor try_remove_trivial_phi() in RA< / li >
< li > aco: change some std::map to std::unordered_map in register_allocation< / li >
< li > aco: change live_out variables to std::unordered_set< / li >
< li > aco: move all needed helper containers to ra_ctx< / li >
< li > aco: RA - move all std::function objects into proper functions< / li >
< li > aco: setup subdword regclasses for ssa_undef & load_const< / li >
< li > aco: ensure correct bit representation of subdword constants< / li >
< li > aco: don't constant-propagate into subdword PSEUDO instructions< / li >
< li > aco: lower subdword phis with SGPR operands< / li >
< li > aco: rename aco_lower_bool_phis() -> aco_lower_phis()< / li >
< li > aco: make some reg_file helpers private and fix their uses< / li >
< li > aco: fix p_extract_vector optimization in presence of unequally sized vector operands< / li >
< li > aco: use v_subrev_f32 for fsub with an sgpr operand in src1< / li >
< li > aco: fix 64bit fsub< / li >
< li > aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel< / li >
< li > aco: simplify operand handling in RA< / li >
< li > aco: refactor get_reg() to take Temp instead of RegClass< / li >
< li > aco: refactor get_reg() to also handle affinities< / li >
< li > aco: create pseudo dummy instruction in RA to be used for live-range splits< / li >
< li > aco: create and use DefInfo struct in RA< / li >
< li > aco: use DefInfo in more places to simplify RA< / li >
< li > aco: move attempt to find strided register into get_reg_simple()< / li >
< li > aco: allocate full register for subdword definitions if HW doesn't support it< / li >
< li > aco: don't create vector affinities for operands which are not killed or are duplicates< / li >
< li > aco: refactor get_reg_simple() to return early on exact matches< / li >
< li > aco: stop get_reg_simple after reaching max_used_gpr< / li >
< li > aco: try to always find a register with stride for even sizes< / li >
< li > aco: use upper part of gap in register file if it is beneficial for striding< / li >
< li > aco: coalesce v_mad's accumulator with definition's affinities< / li >
< li > aco: either copy-propagate or inline create_vector operands< / li >
< p > < / p >
< p > Daniel Stone (15):< / p >
< li > Revert "gitlab-ci: disable panfrost runners"< / li >
< li > egl/wayland: Don't invalidate buffers on no-op resize< / li >
< li > util/test: Use MAX_PATH on Windows< / li >
< li > CI: Add native Windows VS2019 build< / li >
< li > CI: Windows: Fix Docker tag argument inversion< / li >
< li > CI: Disable Panfrost Mali-T820 jobs< / li >
< li > CI: Avoid htz4 runner for VS2019< / li >
< li > meson: Add VS 4624 warning exclusion to remove piles of LLVM warnings< / li >
< li > CI: Re-enable Windows VS2019 builds< / li >
< li > EGL: Add eglSetDamageRegionKHR to GLVND dispatch list< / li >
< li > meson: Make shared-llvm into a tri-state boolean< / li >
< li > CI: Disable Windows/VS2019 builds< / li >
< li > Revert "CI: Disable Windows/VS2019 builds"< / li >
< li > ci/windows: Make Chocolatey installs more reliable< / li >
< li > CI: Disable Lima jobs due to lab unhealthiness< / li >
< p > < / p >
< p > Danylo Piliaiev (29):< / p >
< li > i965: Do not set front_buffer_dirty if there is no front buffer< / li >
< li > st/mesa: Handle the rest renderbuffer formats from OSMesa< / li >
< li > osmesa/tests: Cover OSMESA_RGB GL_UNSIGNED_BYTE case< / li >
< li > st/nir: Unify inputs_read/outputs_written before serializing NIR< / li >
< li > brw_nir: Cast bitshift to unsigned< / li >
< li > brw_fs: Avoid zero size vla< / li >
< li > intel/compiler: Do not qsort zero sized array< / li >
< li > intel/bufmgr: Cast bitshift to unsigned< / li >
< li > glsl/blob: Do not call memcpy if there is nothing to copy< / li >
< li > iris: Do not dereference nullptr with pipe_reference< / li >
< li > i965: Do not generate D16 B5G6R5_UNORM configs on gen < 8< / li >
< li > intel/tools: Fix compilation with UBSan< / li >
< li > glsl: do not crash if string literal is used outside of #include/#line< / li >
< li > st/mesa: Fix signed integer overflow when using util_throttle_memory_usage< / li >
< li > intel/aub_viewer: Fix format specifier for uint64_t< / li >
< li > nir: Fix breakage of foreach_list_typed_safe assumptions in loop unrolling< / li >
< li > anv: Do not sample from 3d depth image with HiZ< / li >
< li > glsl/list: Fix undefined behaviour of foreach_* macros< / li >
< li > st/mesa: Update shader info of ffvp/ARB_vp after translation to NIR< / li >
< li > st/mesa: Re-assign vs in locations after updating nir info for ffvp/ARB_vp< / li >
< li > spirv: Expand workaround for OpControlBarrier on old GLSLang< / li >
< li > st/mesa: Treat vertex inputs absent in inputMapping as zero in mesa_to_tgsi< / li >
< li > iris/bufmgr: Check if iris_bo_gem_mmap failed< / li >
< li > i965: Fix out-of-bounds access to brw_stage_state::surf_offset< / li >
< li > anv: Translate relative timeout to absolute when calling anv_timelines_wait< / li >
< li > anv: Fix deadlock in anv_timelines_wait< / li >
< li > meson: Disable GCC's dead store elimination for memory zeroing custom new< / li >
< li > mesa: Fix double-lock of Shared->FrameBuffers and usage of wrong mutex< / li >
< li > intel/fs: Work around dual-source blending hangs in combination with SIMD16< / li >
< p > < / p >
< p > Dave Airlie (69):< / p >
< li > llvmpipe/query: add support for indexed queries< / li >
< li > gallivm/swr: add stream_id to geom epilogue emit< / li >
< li > gallivm/nir: add support for multiple vertex streams< / li >
< li > draw: change geom shader output to an array of outputs.< / li >
< li > draw/gs: track emitted prims + verts per stream.< / li >
< li > draw: emit multiple streams to streamout.< / li >
< li > draw: don't emit vertex to streams with no outputs< / li >
< li > llvmpipe: advertise 4 vertex streams< / li >
< li > gallivm/s390: fix pass init order on s390 with llvm 8 (v2)< / li >
< li > ci: bump debian image and change llvm deps to 8< / li >
< li > dri: add another get shm variant.< / li >
< li > glx/drisw: add getImageShm2 path< / li >
< li > glx/drisw: return false if shmid == -1< / li >
< li > glx/drisw: fix shm put image fallback< / li >
< li > gallivm/tgsi: fix stream id regression< / li >
< li > gallivm/nir: fix integer divide SIGFPE< / li >
< li > gallivm/nir: handle mod 0 better.< / li >
< li > gallium/auxiliary: add the microsoft tessellator and a pipe wrapper.< / li >
< li > gallivm/nir: split out 64-bit splitting code< / li >
< li > gallivm/nir: add support for tess system values< / li >
< li > gallivm/nir: align store_var param order with load_var< / li >
< li > gallivm/tgsi/swr: add mask vec to the tcs store< / li >
< li > gallivm/nir: add tessellation i/o support.< / li >
< li > draw: add JIT context/functions for tess stages.< / li >
< li > draw: add main tessellation code< / li >
< li > draw: hook up final bits of tessellation< / li >
< li > gallium/nir/tgsi: only scan fragment shader inputs for usage_mask< / li >
< li > llvmpipe: add support for tessellation shaders< / li >
< li > gallivm/tessellator: use private functions for min/max to avoid namespace issues< / li >
< li > gallium: fix build with latest meson and gcc10< / li >
< li > gallivm/s3tc: split out dxt5 alpha code< / li >
< li > gallivm: add support for rgtc/latc fetches.< / li >
< li > gallium/llvmpipe: add an optimised 32-bit memset< / li >
< li > gallivm/rgtc: fix the truncation to 8-bit< / li >
< li > gallivm/rgtc: enable fast path for snorm types.< / li >
< li > Revert "gallivm: disable rgtc/latc SNORM accellerated fetches"< / li >
< li > llvmpipe: fixup context leaks.< / li >
< li > draw: collect tessellation invocations statistics< / li >
< li > llvmpipe: report tessellation shader statistics.< / li >
< li > llvmpipe/query: fix transform feedback overflow any queries.< / li >
< li > gallivm: fix left over shader vote debug< / li >
< li > gallivm/nir: lower implicit lod to tex.< / li >
< li > gallivm/draw: calloc prim id to avoid undef< / li >
< li > llvmpipe: fix no tokens detections.< / li >
< li > draw: fix tessellation stats query< / li >
< li > llvmpipe/setup: move line stats collection earlier.< / li >
< li > draw/cull: run pipeline for culled points.< / li >
< li > draw: fix user culling pipeline order. (v2)< / li >
< li > u_blitter: fix stencil blitting< / li >
< li > draw: free the NIR IR.< / li >
< li > draw/tess: free the NIR< / li >
< li > llvmpipe/nir: free the nir shader< / li >
< li > nir/linking: fix issue with two compact variables in a row. (v2)< / li >
< li > gallivm/nir: fix image store conversions< / li >
< li > gallivm/nir: add helper invocation support< / li >
< li > util/indirect: handle stride less than number of parameters.< / li >
< li > llvmpipe: bump max images to 16< / li >
< li > llvmpipe: fix ssbo alignment< / li >
< li > draw/tess: fix TES patch vertices in.< / li >
< li > llvmpipe: fix d32 unorm depth conversions.< / li >
< li > llvmpipe/setup: add point size clamping< / li >
< li > llvmpipe: enable stencil only formats. (v2)< / li >
< li > llvmpipe: clamp color storage for integer types.< / li >
< li > gallivm: fix stencil border< / li >
< li > vulkan: add initial device selection layer. (v6.1)< / li >
< li > ci: add llvmpipe paths to virgl rules< / li >
< li > draw/tess: free tessellation control shader i/o memory.< / li >
< li > llvmpipe/nir: free compute shader NIR< / li >
< li > llvmpipe: compute shaders work better with all the threads.< / li >
< p > < / p >
< p > David Stevens (1):< / p >
< li > egl/android: set window usage flags< / li >
< p > < / p >
< p > Denys (1):< / p >
< li > gitlab: add bug report template< / li >
< p > < / p >
< p > Dominik Behr (1):< / p >
< li > meson: fix debug build on Android< / li >
< p > < / p >
< p > Drew Davenport (1):< / p >
< li > radv: Filter extensions not whitelisted for Android< / li >
< p > < / p >
< p > Duncan Hopkins (2):< / p >
< li > zink: Added storage CIS to descriptor pool. Added storage in descriptor pool for combined image samplers as well as uniform buffers. Stops some shaders from running through a pool's storage faster than zink's internal tracking.< / li >
< li > zink: zero out zink_render_pass_state< / li >
< p > < / p >
< p > Dylan Baker (48):< / p >
< li > docs/release-calendar: 20.0.0-rc1 has been released< / li >
< li > docs: Mark 20.0-rc2 as done< / li >
< li > docs: Add release notes for 19.3.4< / li >
< li > docs: Add SHA256 sum for 19.3.4< / li >
< li > docs: Mark 19.3.4 as done< / li >
< li > docs: Mark 20.0.0-rc3 as done< / li >
< li > Docs: Add 20.0.0 release notes< / li >
< li > docs: Update index, relnotes, and release-calendar for 20.0< / li >
< li > docs: Update stable process around using fixes: and gitlab< / li >
< li > docs/submittingpatches: Fix confusing typo + missing pronoun< / li >
< li > docs: Update release notes with current process< / li >
< li > bin/post_version.py: Update the release calendar as well< / li >
< li > bin/post_version.py: Pretty print the html< / li >
< li > bin/post_version.py: Make the git commit as well.< / li >
< li > docs: update releasing to cover updated post_version.py< / li >
< li > docs: add relnotes for 20.0.1< / li >
< li > docs: Add sha256sums for 20.0.1< / li >
< li > docs: update news, calendar, and link release notes for 20.0.1< / li >
< li > Docs: Add release notes for 20.0.2< / li >
< li > docs/relnotes: Add sha256 sums for 20.0.2< / li >
< li > docs: update calendar, add news item, and link releases notes for 20.0.2< / li >
< li > docs/release-calendar: Add calendar for 20.1 Release candidates< / li >
< li > bin/gen_release_notes.py: Fix version detection for .0 release< / li >
< li > bin/pick-ui: Add a new maintainer script for picking patches< / li >
< li > replace _mesa_is_pow_two with util_is_power_of_two_*< / li >
< li > replace _mesa_next_pow_two_* with util_next_power_of_two_*< / li >
< li > replace _mesa_logbase2 with util_logbase2< / li >
< li > replace LOG2 with util_fast_log2< / li >
< li > u_math: add x86 optimized version of ifloor< / li >
< li > replace IFLOOR with util_ifloor< / li >
< li > Replace IROUND_POS with _mesa_roundevenf< / li >
< li > mesa/main: remove unused IROUNDD< / li >
< li > replace IROUND with util functions< / li >
< li > move windows strtok_r define to u_string< / li >
< li > Replace IS_INF_OR_NAN with util_is_inf_or_nan< / li >
< li > replace malloc macros in imports.h with u_memory.h versions< / li >
< li > util: Add an aligned realloc function< / li >
< li > replace imports memory functions with utils memory functions< / li >
< li > mesa|mapi: replace _mesa_[v]snprintf with [v]snprintf< / li >
< li > mesa: move ADD_POINTERS to macros.h< / li >
< li > dri/nouveau: replace assert with unreachable< / li >
< li > remove final imports.h and imports.c bits< / li >
< li > meson: update llvm dependency logic for meson 0.54.0< / li >
< li > docs: Add relnotes for 20.0.5< / li >
< li > docs: Add sha256 sums for 20.0.5< / li >
< li > docs: update calendar, add news item, and link releases notes for 20.0.5< / li >
< li > mesa: Follow OpenGL conversion rules for values that exceed storage size< / li >
< li > tests: Make tests aware of meson test wrapper< / li >
< p > < / p >
< p > Edmondo Tommasina (1):< / p >
< li > radv/sqtt: fix RADV_THREAD_TRACE_BUFFER_SIZE spelling< / li >
< p > < / p >
< p > Eduardo Lima Mitev (3):< / p >
< li > turnip/pipeline: Don't assume tu_shader is a valid object< / li >
< li > turnip: Instance can be NULL resolving 'GetInstanceProcAddr' entry point< / li >
< li > anv/radv: Resolving 'GetInstanceProcAddr' should not require a valid instance< / li >
< p > < / p >
< p > Eli Schwartz (1):< / p >
< li > docs: fix typo in v20 release notes< / li >
< p > < / p >
< p > Elie Tournier (3):< / p >
< li > spirv2nir: print nir shader if translation succeeds< / li >
< li > spirv2nir: Add kernel spirv support< / li >
< li > docs/features: Update virgl OpenGL 4.5 features GL_ARB_clip_control and GL_KHR_robustness are now exposed in the guest.< / li >
< p > < / p >
< p > Emil Velikov (11):< / p >
< li > meson: glx: drop with_glx == dri check< / li >
< li > glx: set the loader_logger early and for everyone< / li >
< li > egl/drm: reinstate (kms_)swrast support< / li >
< li > Revert "egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure"< / li >
< li > loader: use a maximum of 64 drmDevices< / li >
< li > loader: simplify loader_get_user_preferred_fd()< / li >
< li > loader: simplify codeflow in drm_get_pci_id_for_fd< / li >
< li > loader: move "using driver..." message to loader_get_kernel_driver_name< / li >
< li > loader: fallback to kernel name, if PCI fails< / li >
< li > glx: omit loader_loader() for macOS< / li >
< li > egl: simplify client/platform extension handling< / li >
< p > < / p >
< p > Emmanuel Gil Peyrot (1):< / p >
< li > Expose EGL_KHR_platform_* when EXT is supported< / li >
< p > < / p >
< p > Eric Anholt (144):< / p >
< li > gallium/osmesa: Fix a typo in the unit test's test names.< / li >
< li > gallium/osmesa: Fix MakeCurrent of non-8888 contexts.< / li >
< li > gallium/osmesa: Fill out other format tests.< / li >
< li > gallium/osmesa: Try to fix the test for big-endian.< / li >
< li > util: Make helper functions for pack/unpacking pixel rows.< / li >
< li > mesa/st: Use direct util_format_pack/unpack instead of u_tile.< / li >
< li > gallium/util: Remove pipe_get_tile_z/put_tile_z.< / li >
< li > softpipe: Drop the raw_to* part of the tile cache interface.< / li >
< li > softpipe: Refactor pipe_get/put_tile_rgba_* paths.< / li >
< li > gallium: Add and use a helper for packing uc from a color_union.< / li >
< li > gallium: Refactor some single-pixel util_format_read/writes.< / li >
< li > util: Drop unpacking from int signed to unsigned and vice versa.< / li >
< li > freedreno: Move the layout debug under FD_MESA_DEBUG=layout.< / li >
< li > freedreno: Include the layer size in layout debug.< / li >
< li > freedreno: Rename the UBWC layer size field and store it as bytes.< / li >
< li > freedreno/a6xx: Disable the core layer-size setup.< / li >
< li > freedreno: Swap the whole resource layout in shadowing.< / li >
< li > freedreno: Blit all array levels when uncompressing UBWC.< / li >
< li > freedreno: Disable UBWC on Z24S8 if not TEXTURE_2D.< / li >
< li > freedreno: Allow UBWC on textures with multiple mipmap levels.< / li >
< li > mesa: Clean up some endianness adapters for shader image formats.< / li >
< li > intel/isl: Move iris's pipe-to-isl format function to isl.< / li >
< li > glsl,nir: Switch the enum representing shader image formats to PIPE_FORMAT.< / li >
< li > mesa/st: Move the SYSTEM_VALUE -> TGSI_SEMANTIC map to tgsi_from_mesa.< / li >
< li > nouveau: Reuse tgsi_get_sysval_semantic().< / li >
< li > nouveau: reuse tgsi_get_gl_frag_result_semantic().< / li >
< li > nouveau: Reuse tgsi_get_gl_varying_semantic().< / li >
< li > u_tile: Skip the packed temporary and just store tiles directly.< / li >
< li > ci: Disable a bunch of tests on freedreno a630.< / li >
< li > ci: Bump the GLES CTS version to 3.2.6.1.< / li >
< li > Revert "gallium: Fix big-endian addressing of non-bitmask array formats."< / li >
< li > ci: Extend the a630 flake list to reduce spurious failures.< / li >
< li > radv: Squelch possibly-undefined warning< / li >
< li > llvmpipe: Fix real uninitialized use of "atype" for SEMANTIC_FACE< / li >
< li > llvmpipe: Silence "possibly uninitialized value" warning for ssbo_limit.< / li >
< li > llvmpipe: Silence uninitialized variable warning about "chan"< / li >
< li > llvmpipe: Fix warning about uninitialized "op" in the NIR path.< / li >
< li > llvmpipe: Silence uninitialized variable warning about "vals"< / li >
< li > llvmpipe: Silence uninitialized variable warning about "scissor"< / li >
< li > llvmpipe: Fix another uninitialized value warning, on init_val.< / li >
< li > gallium: Only define PIPE_ALIGNSTACK on x86.< / li >
< li > ci: prepare-artifacts: Make the indent here match previously in the file< / li >
< li > ci: Make sure that we have a proper shell prompt for LAVA.< / li >
< li > ci: Make LAVA job fails emit the full list of unexpected test results.< / li >
< li > ci: Document how LAVA runners work.< / li >
< li > ci: Don't bother generating deqp junit results since we don't present it.< / li >
< li > ci: Remove a useless filtering of the lava logs.< / li >
< li > nir: Rename gl_nir_lower_bindless_images.c in preparation for extending it.< / li >
< li > nir: Make image lowering optionally handle the !bindless case as well.< / li >
< li > gallium: Add a cap for enabling lowering of image load/store intrinsics.< / li >
< li > v3d: Ask the state tracker to lower image accesses off of derefs.< / li >
< li > glsl: Factor out the sampler dim coordinate components switch statement.< / li >
< li > spirv_to_nir: Reuse glsl_sampler_dim_coordinate_components().< / li >
< li > freedreno/ir3: Reuse glsl_get_sampler_dim_coordinate_components() in tex_info.< / li >
< li > tgsi_to_nir: Reuse glsl_get_sampler_dim_coordinate_components().< / li >
< li > prog_to_nir: Reuse glsl_get_sampler_dim_coordinate_components().< / li >
< li > freedreno/ir3: Fix the arg to ir3_get_num_components_for_image_format()< / li >
< li > nir: Move intel's intrinsic_image_coordinate_components() to core nir.< / li >
< li > freedreno: Switch to using lowered image intrinsics.< / li >
< li > ci: Blacklist another freedreno flaky test.< / li >
< li > meson: Disable bison's -Wdeprecated since we still support old bison.< / li >
< li > turnip: Fix compiler warning about casting a nondispatchable handle.< / li >
< li > freedreno/computerator: Fix defined-but-not-used warnings from lex/yacc.< / li >
< li > ci: Remove LLVM from ARM test drivers.< / li >
< li > ci: Stop disabling ACPI in the LAVA arm64 kernel build.< / li >
< li > ci: Shrink the arm64 kernel build a bit.< / li >
< li > ci: Include db410c support in the ARM container.< / li >
< li > aco: Fix signed-vs-unsigned warning.< / li >
< li > ci: Enable -Werror on meson-vulkan and meson-testing.< / li >
< li > ci: Switch testing on db410c over to LAVA.< / li >
< li > ci: Add a disabled-by-default job for GLES3 testing on db410c.< / li >
< li > ci: Flip db410c back to docker mode.< / li >
< li > ci: Print the renderer/version that our dEQP invocation is using.< / li >
< li > ci: Fix installation of firmware for db410c's nic.< / li >
< li > ci: Make a simple little bare-metal fastboot mode for db410c.< / li >
< li > glsl/tests: Catch mkdir errors to help explain when they happen.< / li >
< li > glsl/tests: Fix waiting for disk_cache_put() to finish.< / li >
< li > ci: Update the ci-templates commit.< / li >
< li > ci: Enable ccache in the container builds.< / li >
< li > ci: Enable ccaching of CMake builds as well.< / li >
< li > ci: Enable testing GLES2-3 on a530 (Dragonboard 820c).< / li >
< li > freedreno/a5xx: Fix min-vs-mag filtering decisions on non-mipmap tex.< / li >
< li > gallium/util: Switch util_float_to_half to _mesa_float_to_half()'s impl.< / li >
< li > ci: Ban the recent popular freedreno a630 flakes.< / li >
< li > ci: Disable tests that showed intermittent fails on a530 in day 1.< / li >
< li > ci: Only run the freedreno baremetal tests when freedreno/core changes.< / li >
< li > freedreno: Switch to exposing only half-integer pixel centers.< / li >
< li > ci: Move db820c and db410c's gles3 tests to manual, like radv did.< / li >
< li > glsl: Restore the IsES flag on the shader when reading from cache.< / li >
< li > ci: Ban the recent popular freedreno a630 intermittent failure.< / li >
< li > freedreno: Remove always-true return from per-gen begin_query.< / li >
< li > freedreno: Remove the "active" member of queries.< / li >
< li > freedreno: Fix acc query handling in the presence of batch reordering.< / li >
< li > freedreno: Associate the acc query bo with the batch.< / li >
< li > freedreno: Count blits in GL_TIME_ELAPSED and perf counter queries.< / li >
< li > freedreno/a6xx: Fix timestamp queries.< / li >
< li > freedreno: Rename "is_blit" to "is_discard_blit"< / li >
< li > freedreno: Fix detection of being in a blit for acc queries.< / li >
< li > freedreno: Work around UBWC flakiness.< / li >
< li > freedreno: Drop an unnecessary include marked "this should go away"< / li >
< li > freedreno/turnip: Use the NIR info to decide if we need helper invocations.< / li >
< li > loader: Warn when we fail to open a device node due to permissions.< / li >
< li > ci: Consistently use -j4 across x86 build jobs and -j8 on ARM.< / li >
< li > freedreno/a6xx: Sink the per-level size temps inside the loop.< / li >
< li > freedreno/a6xx: Remove the " aligned_height" temporary.< / li >
< li > freedreno/a6xx: Drop the " alignment" layout temporary.< / li >
< li > freedreno: Add the outline of a test for a6xx texture layout.< / li >
< li > freedreno/a6xx: Set a level's pitch based on minified level0 pitch, not width0.< / li >
< li > freedreno: Fix leak of binning shader variants.< / li >
< li > freedreno/ir3: Stop doing b2n on the SEL condition.< / li >
< li > freedreno/ir3: CSE the up/downconversion of SEL's cond's size.< / li >
< li > freedreno/a5xx+: Skip compiling the old gmem blit programs.< / li >
< li > freedreno/drm-shim: Add support for faking other adreno chips.< / li >
< li > freedreno/ir3: Drop handling FRAG_RESULT_DEPTH writing to .z< / li >
< li > freedreno: Introduce a "cpp_shift" value for cpp divs/muls.< / li >
< li > freedreno: Make the slice pitch be bytes, not pixels.< / li >
< li > drm-shim: Let the driver choose to overwrite the first render node.< / li >
< li > nir/lower_two_sided_color: Fix picking of new driver location.< / li >
< li > nir/lower_clip: Fix picking of unused driver locations.< / li >
< li > gallium: Fix setup of pstipple frag coord var.< / li >
< li > freedreno/ir3: Fix driver_location of the added vertex_flags varying.< / li >
< li > freedreno/ir3: Fix sizing of the inputs/outputs array.< / li >
< li > vc4: Use NIR shader's num_outputs for generating our new output.< / li >
< li > ci: Drop redundant freedreno stage specification.< / li >
< li > ci: Enable GLES3 testing on db410c/db820c (freedreno a306 and a530).< / li >
< li > freedreno: Fix derivatives without texturing on a3xx-a5xx.< / li >
< li > ci: Enable GLES 3.1 testing on db820c (a530).< / li >
< li > freedreno/ir3: Fix the disasm of half-float STG dests.< / li >
< li > freedreno/ir3: Print a space after nop counts, like qcom's disasm.< / li >
< li > freedreno/ir3: Add a unit test for our disassembler.< / li >
< li > freedreno/ir3: Convert remaining disasm src prints to reginfo.< / li >
< li > freedreno/ir3: Refactor out print_reg_src().< / li >
< li > freedreno/ir3: Add support for disasm of cat2 float32 immediates.< / li >
< li > ci: Enable --compact-display false on all dEQP runs.< / li >
< li > ci: Add sanity checking that dEQP gets the expected GL_RENDERER.< / li >
< li > freedreno: Fix calculation of the const buffer cmdstream size.< / li >
< li > ci: Allow namespacing of dEQP run results files.< / li >
< li > ci: Clean up some excessive use of pipes in dEQP results processing.< / li >
< li > ci/freedreno: Add a test run of a few driver options.< / li >
< li > util/ra: Sanity check that the driver selected a valid reg.< / li >
< li > util/ra: Sanity check that we're adding a valid reg to a class.< / li >
< li > util/ra: Use util_dynarray for the adjacency list.< / li >
< li > util/ra: Use util_dynarray for handling the conflict lists.< / li >
< li > util/ra: Improve ra_set_finalize() performance.< / li >
< p > < / p >
< p > Eric Engestrom (58):< / p >
< li > VERSION: bump after 20.0 branch point< / li >
< li > egl: put full path to libEGL_mesa.so in GLVND json< / li >
< li > gitlab-ci: disable a630 tests as mesa-cheza is down< / li >
< li > util/os_socket: fix header unavailable on windows< / li >
< li > freedreno/perfcntrs: fix fd leak< / li >
< li > dri: delete gen-symbol-redefs.py< / li >
< li > util/disk_cache: check for write() failure in the zstd path< / li >
< li > meson: don't bother trying `python2`< / li >
< li > Revert "egl: put full path to libEGL_mesa.so in GLVND json"< / li >
< li > egl: directly access static members instead of using _egl{Get,Set}ConfigKey()< / li >
< li > meson: explicitly disallow unsupported build directory layout< / li >
< li > docs: fix typos in the release docs< / li >
< li > bin/gen_release_notes.py: fix commit list command< / li >
< li > gen_release_notes: fix vulkan version reported< / li >
< li > docs/relnotes/19.3: fix vulkan version reported< / li >
< li > docs/relnotes/20.0: fix vulkan version reported< / li >
< li > Revert "docs/relnotes/19.3: fix vulkan version reported"< / li >
< li > docs: trivial fix for html structure< / li >
< li > docs/releasing: add missing &lt;/li&gt; tags< / li >
< li > docs: add release notes for 19.3.5< / li >
< li > docs: update calendar, add news item, and link releases notes for 19.3.5< / li >
< li > vulkan/wsi: fix cleanup when dup() fails< / li >
< li > gen_release_notes: fix version in "you should wait" message< / li >
< li > gen_release_notes: resolve ambiguity by renaming `version` to `previous_version` and `next_version` to `this_version`< / li >
< li > meson: use existing variables in inc_common< / li >
< li > meson: inline `inc_common`< / li >
< li > vulkan: drop unused include directories< / li >
< li > intel: drop unused include directories< / li >
< li > scons: prune unused Makefile.sources< / li >
< li > docs: add release notes for 20.0.3< / li >
< li > docs/relnotes: add sha256sum for 20.0.3< / li >
< li > docs: update calendar, add news item, and link releases notes for 20.0.3< / li >
< li > docs: add release notes for 20.0.4< / li >
< li > docs/relnotes: add sha256sum for 20.0.4< / li >
< li > docs: update calendar, add news item, and link releases notes for 20.0.4< / li >
< li > glx: fix 630 times -Wlto-type-mismatch when building with LTO enabled< / li >
< li > glx: use anonymous namespace to avoid -Wodr issues when building with LTO enabled< / li >
< li > pick-ui: auto-scroll the feedback window< / li >
< li > pick-ui: compute .pick_status.json path only once< / li >
< li > pick-ui: make .pick_status.json path relative to the git root instead of the script< / li >
< li > pick-ui: show commit sha in the pick list< / li >
< li > VERSION: bump to 20.1.0-rc1< / li >
< li > .pick_status.json: Update to af55bdd05d94eda59ee1c9331a50045000da5db5< / li >
< li > .pick_status.json: Update to 57796946985de60204189426ca8eb7bbfa97c396< / li >
< li > .pick_status.json: Mark 3fac55ce0d066d767d6c6c8308f79d0c3e566ec0 as denominated< / li >
< li > .pick_status.json: Update to 29da52128090a1ef8ef782188c0f67c7f5ec8d19< / li >
< li > VERSION: bump to 20.1.0-rc2< / li >
< li > .pick_status.json: Update to 772b15ad3227e08bb4e18932ac9ecf4c29271160< / li >
< li > .pick_status.json: Update to 56f955e4850035d915a2a87e2ebea7fa66ab5e19< / li >
< li > .pick_status.json: Update to c1c0cf7a66905e8d7ad506842a41b0ad0c5b10da< / li >
< li > VERSION: bump to 20.1.0-rc3< / li >
< li > .pick_status.json: Update to 5a6beb6a24aa084adfd6c57edd0a64f0a044611a< / li >
< li > post_version.py: fix branch name construction for release candidates< / li >
< li > post_version.py: invert `is_point` into `is_first_release` to make its purpose clearer< / li >
< li > post_version.py: stop adding release candidates to the index and relnotes< / li >
< li > VERSION: bump to 20.1.0-rc4< / li >
< li > .pick_status.json: Update to a91306677c613ba7511b764b3decc9db42b24de1< / li >
< li > tree-wide: fix deprecated GitLab URLs< / li >
< p > < / p >
< p > Erik Faye-Lund (154):< / p >
< li > zink: enable texture-buffer objects< / li >
< li > zink: implement load_instance_id< / li >
< li > zink: implement support for derivative-control< / li >
< li > zink: be more careful about the mask-check< / li >
< li > zink: disallow depth-stencil blits with format-change< / li >
< li > st/mesa: use uint-result for sampling stencil buffers< / li >
< li > zink: lower away fdph< / li >
< li > zink: fixup sampler-usage< / li >
< li > zink: replace unset buffer with a dummy-buffer< / li >
< li > zink: emit blend-target index< / li >
< li > zink: only inspect dual-src limit if feature enabled< / li >
< li > Revert "nir: Add a couple trivial abs optimizations"< / li >
< li > zink: do not use SpvDimRect< / li >
< li > zink: fix binding-usage< / li >
< li > zink: do not report texture-samplers for unsupported stages< / li >
< li > zink/spirv: do not reinvent store_dest< / li >
< li > zink/spirv: prefer store_dest over store_dest_uint< / li >
< li > zink/spirv: rename functions a bit< / li >
< li > zink/spirv: unit_value -> raw_value< / li >
< li > zink/spirv: uint -> raw< / li >
< li > zink: do not convert bools to/from uint< / li >
< li > util: promote u_debug_memory.c to src/util< / li >
< li > util: move debug_memory_{begin,end} to os_memory_debug.h< / li >
< li > gallium/util: do not use debug_print_format< / li >
< li > gallium/util: remove unused debug_print_foo helpers< / li >
< li > zink/spirv: do not use bitwise operations on booleans< / li >
< li > pipebuffer: clean up cast-warnings< / li >
< li > rbug: clean up cast-warnings< / li >
< li > rbug: do not return void-value< / li >
< li > vtn/opencl: fully enable OpenCLstd_Clz< / li >
< li > compiler/nir: move build_exp helper into builtin-builder< / li >
< li > compiler/nir: move build_log helper into builtin-builder< / li >
< li > vtn/opencl: add native exp/log-support< / li >
< li > vtn/opencl: add native exp10/log10-support< / li >
< li > vtn/opencl: add native exp2/log2-support< / li >
< li > nv50: remove unused variable< / li >
< li > meson: disable some more warnings on msvc< / li >
< li > mesa/main: correct extension-checks for GL_BLACKHOLE_RENDER_INTEL< / li >
< li > mesa/main: clean-up extension-checks for point-sprites< / li >
< li > mesa/main: clean up extension-check for GL_VERTEX_PROGRAM< / li >
< li > mesa/main: clean up extension-check for GL_VERTEX_PROGRAM_TWO_SIDE< / li >
< li > mesa/main: clean up extension-check for GL_VERTEX_PROGRAM_POINT_SIZE< / li >
< li > mesa/main: clean up extension-check for GL_TEXTURE_RECTANGLE< / li >
< li > mesa/main: clean up extension-check for GL_STENCIL_TEST_TWO_SIDE< / li >
< li > mesa/main: clean up extension-check for GL_DEPTH_BOUNDS_TEST< / li >
< li > mesa/main: clean up extension-check for AMD_depth_clamp_separate< / li >
< li > mesa/main: clean up extension-check for GL_FRAGMENT_SHADER_ATI< / li >
< li > mesa/main: clean up extension-check for GL_TEXTURE_CUBE_MAP_SEAMLESS< / li >
< li > mesa/main: clean up extension-check for GL_RASTERIZER_DISCARD< / li >
< li > mesa/main: clean up extension-check for GL_TEXTURE_EXTERNAL< / li >
< li > mesa/main: remove unused macro< / li >
< li > wgl: drop pointless debug_printf< / li >
< li > wgl: drop unused member< / li >
< li > wgl: move screen-init to a helper< / li >
< li > wgl: do not create screen from DllMain< / li >
< li > st/dri: make sure software color-buffers are linear< / li >
< li > zink: be less picky about tiled resources< / li >
< li > .mailmap: add an alias for Alan Swanson< / li >
< li > .mailmap: add an alias for Alyssa Rosenzweig< / li >
< li > .mailmap: add an alias for Andrii Simiklit< / li >
< li > .mailmap: add an alias for Anuj Phogat< / li >
< li > .mailmap: add an alias for Axel Davy< / li >
< li > .mailmap: add an alias for Boris Brezillon< / li >
< li > .mailmap: add an alias for Bruce Cherniak< / li >
< li > .mailmap: update aliases for Carl-Philip Hänsch< / li >
< li > .mailmap: add an alias for Chad Versace< / li >
< li > .mailmap: add a couple of aliases for Chandu Babu Namburu< / li >
< li > .mailmap: add alias for Chenglei Ren< / li >
< li > .mailmap: add an alias for Christian Gmeiner< / li >
< li > .mailmap: add an alias for Christian Inci< / li >
< li > .mailmap: add a few aliases for Christoph Haag< / li >
< li > .mailmap: add an alias for Colin McDonald< / li >
< li > .mailmap: specify spelling for Constantine Kharlamov< / li >
< li > .mailmap: add an alias for Craig Stout< / li >
< li > .mailmap: add an alias for Daniel Schürmann< / li >
< li > .mailmap: add an alias for Danylo Piliaiev< / li >
< li > .mailmap: add an alias for Dave Airlie< / li >
< li > .mailmap: add an alias for Dylan Baker< / li >
< li > .mailmap: add a couple of aliases for Dylan Noblesmith< / li >
< li > .mailmap: add an alias for Emmanuel Gil Peyrot< / li >
< li > .mailmap: add an alias for Erik Faye-Lund< / li >
< li > .mailmap: specify spelling for Francesco Ansanelli< / li >
< li > .mailmap: specify spelling for Gurchetan Singh< / li >
< li > .mailmap: add an alias for Haihao Xiang< / li >
< li > .mailmap: add an alias for Harish Krupo< / li >
< li > .mailmap: specify spelling for Heinrich Fink< / li >
< li > .mailmap: specify spelling for Henri Verbeet< / li >
< li > .mailmap: add an alias for Igor Gnatenko< / li >
< li > .mailmap: add an alias for Illia Iorin< / li >
< li > .mailmap: specify spelling for James Zhu< / li >
< li > .mailmap: add an alias for Jan Beich< / li >
< li > .mailmap: clean up aliases for Jeremy Huddleston< / li >
< li > .mailmap: add an alias for Julien Isorce< / li >
< li > .mailmap: add a few aliases for Karol Herbst< / li >
< li > .mailmap: add a few aliases for Kevin Rogovin< / li >
< li > .mailmap: add a few aliases for Kristian Høgsberg< / li >
< li > .mailmap: add an alias for Lionel Landwerlin< / li >
< li > .mailmap: specify spelling for Liviu Prodea< / li >
< li > .mailmap: update aliases for Marc-André Lureau< / li >
< li > .mailmap: add alias for Matthias Groß< / li >
< li > .mailmap: add an alias for Neha Bhende< / li >
< li > .mailmap: add an alias for Neil Roberts< / li >
< li > .mailmap: specify spelling for Nian Wu< / li >
< li > .mailmap: add an alias for Nicholas Bishop< / li >
< li > .mailmap: update aliases for Nicolai Hähnle< / li >
< li > .mailmap: add an alias for Philipp Zabel< / li >
< li > .mailmap: update aliases for Pierre-Eric Pelloux-Prayer< / li >
< li > .mailmap: add an alias for Plamena Manolova< / li >
< li > .mailmap: add an alias for Qiang Yu< / li >
< li > .mailmap: specify spelling for Randy Xu< / li >
< li > .mailmap: add an alias for Renato Caldas< / li >
< li > .mailmap: add an alias for Rob Clark< / li >
< li > .mailmap: add an alias for Rodrigo Vivi< / li >
< li > .mailmap: add an alias for Samuel Li< / li >
< li > .mailmap: add an alias for Sergii Romantsov< / li >
< li > .mailmap: specify spelling for Sonny Jiang< / li >
< li > .mailmap: add a couple of aliases for Steinar H. Gunderson< / li >
< li > .mailmap: add a couple of aliases for Suresh Guttula< / li >
< li > .mailmap: add an alias for Thierry Reding< / li >
< li > .mailmap: add an alias for Timo Aaltonen< / li >
< li > .mailmap: add a couple of aliases for Timothy Arceri< / li >
< li > .mailmap: add an alias for Tim Wiederhake< / li >
< li > .mailmap: add an alias for Tom Stellard< / li >
< li > .mailmap: add an alias for Tomasz Figa< / li >
< li > .mailmap: add an alias for Topi Pohjolainen< / li >
< li > .mailmap: add an alias for Vadym Shovkoplias< / li >
< li > .mailmap: add an alias for Varad Gautam< / li >
< li > .mailmap: specify spelling for Vivek Kasireddy< / li >
< li > .mailmap: specify spelling for Wladimir J. van der Laan< / li >
< li > .mailmap: add an alias for Xavier Bouchoux< / li >
< li > .mailmap: add an alias for Yaakov Selkowitz< / li >
< li > .mailmap: add alias for Zhaowei Yuan< / li >
< li > .mailmap: add an alias for Zhongmin Wu< / li >
< li > meson: use override_options to change warning-level< / li >
< li > wgl: silence some cast-warnings< / li >
< li > util/tests: initialize variable< / li >
< li > mesa: fixup cast expression< / li >
< li > vbo: avoid including wingdi.h on win32< / li >
< li > meson: tell flex that we support c99< / li >
< li > gtest: Update to 1.10.0< / li >
< li > meson: do not disable incremental linking for debug-builds< / li >
< li > docs: remove outdated sentence< / li >
< li > mesa/gallium: do not use enum for bit-allocated member< / li >
< li > meson: correct windows-version define< / li >
< li > mesa/main: do not store unrecognized extensions in context< / li >
< li > mesa/main: do not pass context to one-time extension init< / li >
< li > mesa/main: do not init remap-table per api< / li >
< li > mesa/main: Do not pass context to one_time_init< / li >
< li > mesa/main: one_time_init() -> _mesa_initialize()< / li >
< li > mesa/st: call _mesa_initialize() early< / li >
< li > zink: lower b2b to b2i< / li >
< li > util/os_memory: never use os_memory_debug.h< / li >
< li > zink: implement i2b1< / li >
< li > zink: use general-layout when blitting to/from same resource< / li >
< p > < / p >
< p > Francisco Jerez (57):< / p >
< li > intel/fs/cse: Make HALT instruction act as CSE barrier.< / li >
< li > intel/fs/gen7: Fix fs_inst::flags_written() for SHADER_OPCODE_FIND_LIVE_CHANNEL.< / li >
< li > intel/fs: Add virtual instruction to load mask of live channels into flag register.< / li >
< li > intel/fs/gen12: Workaround unwanted SEND execution due to broken NoMask control flow.< / li >
< li > intel/fs/gen12: Fixup/simplify SWSB annotations of SIMD32 scratch writes.< / li >
< li > intel/fs/gen12: Workaround data coherency issues due to broken NoMask control flow.< / li >
< li > intel/fs: Set src0 alpha present bit in header when provided in message payload.< / li >
< li > intel/fs/gen11: Work around dual-source blending hangs in combination with SIMD32.< / li >
< li > intel/fs: Make sample_mask_reg() local to brw_fs.cpp and use it in more places.< / li >
< li > intel/fs: Use helper for discard sample mask flag subregister number.< / li >
< li > intel/fs/gen7+: Swap sample mask flag register and FIND_LIVE_CHANNEL temporary.< / li >
< li > intel/fs: Refactor predication on sample mask into helper function.< / li >
< li > intel/fs: Return consistent UW types from sample_mask_reg() in fragment shaders.< / li >
< li > intel/fs/gen7+: Implement discard/demote for SIMD32 programs.< / li >
< li > intel/compiler: Move base IR definitions into a separate header file< / li >
< li > intel/compiler: Reverse inclusion dependency between brw_cfg.h and brw_shader.h< / li >
< li > intel/compiler: Nest definition of live variables block_data structures< / li >
< li > intel/compiler: Reverse inclusion dependency between brw_fs_live_variables.h and brw_fs.h< / li >
< li > intel/compiler: Reverse inclusion dependency between brw_vec4_live_variables.h and brw_vec4.h< / li >
< li > intel/compiler: Introduce simple IR analysis pass framework< / li >
< li > intel/compiler: Introduce backend_shader method to propagate IR changes to analysis passes< / li >
< li > intel/compiler: Define more detailed analysis dependency classes< / li >
< li > intel/compiler: Pass detailed dependency classes to invalidate_analysis()< / li >
< li > intel/compiler: Mark virtual_grf_interferes and vars_interfere as const< / li >
< li > intel/compiler: Move all live interval analysis results into fs_live_variables< / li >
< li > intel/compiler: Move all live interval analysis results into vec4_live_variables< / li >
< li > intel/compiler: Restructure live intervals computation code< / li >
< li > intel/compiler: Pass single backend_shader argument to the fs_live_variables constructor< / li >
< li > intel/compiler: Pass single backend_shader argument to the vec4_live_variables constructor< / li >
< li > intel/compiler/fs: Add live interval validation pass< / li >
< li > intel/compiler/vec4: Add live interval validation pass< / li >
< li > intel/compiler/fs: Switch liveness analysis to IR analysis framework< / li >
< li > intel/compiler/vec4: Switch liveness analysis to IR analysis framework< / li >
< li > intel/compiler: Drop invalidate_live_intervals()< / li >
< li > intel/compiler: Move idom tree calculation and related logic into analysis object< / li >
< li > intel/compiler: Move dominance tree data structure into idom_tree object< / li >
< li > intel/compiler: Simplify new_idom reduction in dominance tree calculation< / li >
< li > intel/compiler: Move register pressure calculation into IR analysis object< / li >
< li > intel/compiler: Calculate num_instructions in O(1) during register pressure calculation< / li >
< li > intel/fs: Fix workaround for VxH indirect addressing bug under control flow.< / li >
< li > intel/fs/gen12: Fix interaction of SWSB dependency combination with EU fusion workaround.< / li >
< li > intel/fs/gen12: Fix hangs with per-sample SIMD32 fragment shader dispatch.< / li >
< li > intel/fs/gen12: Work around dual-source blending hangs in combination with SIMD32.< / li >
< li > intel/fs/gen12: Fix Render Target Read header setup for new thread payload layout.< / li >
< li > intel/ir: Add missing initialization of backend_reg::offset during construction.< / li >
< li > intel/fs: Rename half() helpers to quarter(), allow index up to 3.< / li >
< li > intel/fs: Fix constness of argument of fs_instruction_scheduler::is_compressed().< / li >
< li > intel/fs: Replace fs_visitor::bank_conflict_cycles() with stand-alone function.< / li >
< li > intel/vec4: Fix constness of vec4_instruction::reads_flag() and ::writes_flag().< / li >
< li > intel/ir: Import shader performance analysis pass.< / li >
< li > intel/fs: Heap-allocate fs_visitors in brw_compile_fs().< / li >
< li > intel/fs: Implement performance analysis-based SIMD32 heuristic for fragment shaders.< / li >
< li > intel/fs: Add INTEL_DEBUG=no32 debugging flag.< / li >
< li > intel/ir: Use brw::performance object instead of CFG cycle counts for codegen stats.< / li >
< li > intel/ir: Pass block cycle count information explicitly to disassembler.< / li >
< li > intel/ir: Remove scheduling-based cycle count estimates.< / li >
< li > intel/ir: Update performance analysis parameters for memory fence codegen changes.< / li >
< p > < / p >
< p > Fritz Koenig (3):< / p >
< li > Revert "gitlab-ci: disable a630 tests as mesa-cheza is down"< / li >
< li > Revert "gitlab-ci: disable a630 tests as mesa-cheza is down (again)"< / li >
< li > freedreno: allow FMT6_8_UNORM as a UBWC format< / li >
< p > < / p >
< p > Georg Lehmann (3):< / p >
< li > Correctly wait in the fragment stage until all semaphores are signaled< / li >
< li > Vulkan Overlay: Don't try to change the image layout to present twice< / li >
< li > Vulkan overlay: use the corresponding image index for each swapchain< / li >
< p > < / p >
< p > Gert Wollny (63):< / p >
< li > r600: force new CF with TEX only if any texture value is written< / li >
< li > r600: Increase space for IO values to agree with PIPE_MAX_SHADER_IN/OUTPUTS< / li >
< li > r600: Add NIR compiler options< / li >
< li > r600: Update state code to accept NIR shaders< / li >
< li > r600/sfn: Add a basic nir shader backend< / li >
< li > r600: enable NIR backend DEBUG flag for supported architectures< / li >
< li > r600/sfn: Add the VS in and FS out vectorization< / li >
< li > r600/sfn: Add the WaitAck instruction< / li >
< li > r600/sfn: add live range evaluation for the GPR< / li >
< li > r600/sfn: add register remapping< / li >
< li > r600/sfn: Add lowering arrays to scratch and according instructions< / li >
< li > r600/sfn: Add a load GDS result instruction< / li >
< li > r600/sfn: Add MemRingOut instructions< / li >
< li > r600/sfn: add emitVertex instructions< / li >
< li > r600/sfn: Add support for geometry shader< / li >
< li > r600/sfn: Add VS for TCS shader skeleton< / li >
< li > r600/sfn: Add compute shader skeleton< / li >
< li > r600/sfn: Add GDS instructions< / li >
< li > r600/sfn: Add lowering UBO access to r600 specific codes< / li >
< li > r600: Make sure LLVM is not used for DRAW< / li >
< li > r600/sfn: Add support for atomic instructions< / li >
< li > r600/sfn: Add support for SSBO load and store< / li >
< li > r600/sfn: Add .editorconfig file< / li >
< li > r600/sfn: Add some documentation< / li >
< li > r600/sfn: Avoid using dynamic_cast to identify type< / li >
< li > r600/sfn: Use static_cast when type is already known< / li >
< li > r600/sfn: Don't try to catch exceptions, the driver doesn't throw any< / li >
< li > gallium/tgsi_to_nir: Set nir_intrinsic_align_mul to 16 and offset to 0< / li >
< li > r600: Dump a few more variables when requested< / li >
< li > r600/sfn: Reduce array limit for scratch usage< / li >
< li > r600/sfn: Fix setting alignments when lowering UBOs< / li >
< li > r600/sfn: Implementing instructions blocks< / li >
< li > r600/nir: Pin interpolation results to channel< / li >
< li > r600/sfn: Fix null pointer deref in live range evalation< / li >
< li > r600/sfn: Handle b2b1 like it was a mov< / li >
< li > r600/sfn: Fix handling of GS inputs< / li >
< li > r600/sfn: Fix using the result of a fetch instruction in next fetch< / li >
< li > r600/sfn: Count only literals that are not inline to split instruction groups< / li >
< li > r600/sfn: use new temp register allocation when loading single value temporaries< / li >
< li > nir: Add r600 specific intrinsics for tesselation shader IO< / li >
< li > nir: Add umad24 and umul24 opcodes< / li >
< li > r600: Handle texcoord semantics in LDS index evaluation< / li >
< li > r600/sfn: simplify UBO lowering pass< / li >
< li > r600/sfn: Don't emit inline constants in the r600 IR< / li >
< li > r600/sfn: Add LDS IO instructions to r600 IR< / li >
< li > r600/sfn: Add LDS instruction to assembly conversion< / li >
< li > r600/sfn: Add TF write instruction< / li >
< li > r600/sfn: Add IR instruction to fetch the TESS parameters< / li >
< li > r600/sfn: Handle umul24 and umad24< / li >
< li > r600/sfn: Emit some LDS instructions< / li >
< li > r600/sfn: Move emission of barrier from compute shader to shader base< / li >
< li > r600/sfn: Add methods to valuepool to get a vector of values< / li >
< li > r600/sfn: Move some shader base methods to the public interface< / li >
< li > r600/sfn: extract class to handle the VS export to different stages< / li >
< li > r600/sfn: derive the GS from the vertex stage for a common interface< / li >
< li > r600/sfn: Handle LDS output in VS< / li >
< li > r600/sfn: Move removing of unused variables< / li >
< li > r600/sfn: Add lowering passes for Tesselation IO< / li >
< li > r600/sfn: Add tesselation shaders< / li >
< li > r600: Enable tesselation for NIR< / li >
< li > r600: Fix nir compiler options, i.e. don't lower IO to temps for TESS< / li >
< li > r600/sfn: Fix printing vertex fetch instruction flags< / li >
< li > r600: Fix duplicated subexpression in r600_asm.c< / li >
< p > < / p >
< p > Greg V (3):< / p >
< li > amd/addrlib: fix build on non-x86 platforms< / li >
< li > r600: add missing &lt;array&gt; include< / li >
< li > svga: fix build on FreeBSD< / li >
< p > < / p >
< p > H.J. Lu (2):< / p >
< li > x86_init_func_common: Add ENDBR at function entry< / li >
< li > x86: Add ENDBR at function entries< / li >
< p > < / p >
< p > Hanno Böck (1):< / p >
< li > Properly check mmap return value< / li >
< p > < / p >
< p > Hyunjun Ko (27):< / p >
< li > freedreno/ir3: fix printing half constant registers.< / li >
< li > freedreno/ir3: Add cat4 mediump opcodes< / li >
< li > freedreno/ir3: put the conversion back for half const to the right place.< / li >
< li > freedreno/ir3: Fold const only when the type is float< / li >
< li > freedreno/ir3: Add new ir3 pass to fold out fp16 conversions< / li >
< li > nir: Add optimization for doing removing f16/f32 conversions< / li >
< li > freedreno/ir3: handle half registers for arrays during register allocation.< / li >
< li > turnip: support indirect draw< / li >
< li > glsl: Handle fp16 unary operations when lowering matrix operations< / li >
< li > glsl/lower_instructions: Handle fp16 for MOD_TO_FLOOR< / li >
< li > turnip: Gather information for transform feedback< / li >
< li > turnip: Define structs for transform feedback< / li >
< li > turnip: Setup stream-output when linking program< / li >
< li > turnip: Implement stream-out emit and vkApis for transform feedback< / li >
< li > turnip: Implement an empty function vkCmdDrawIndirectByteCountEXT< / li >
< li > turnip: Enable VK_EXT_transform_feedback< / li >
< li > turnip: Add tu6_control struct.< / li >
< li > turnip: Fix wrong assignment of xfb output's offset.< / li >
< li > turnip: Do gathering xfb info after nir_remove_dead_variables< / li >
< li > freedreno: Enable mediump lowering< / li >
< li > freedreno/ir3: enable nir_opt_loop_unroll on a6xx< / li >
< li > nir: fix wrong assignment to buffer in xfb_varyings_info< / li >
< li > turnip: make the struct slot_value of queries get 2 values< / li >
< li > turnip: Implement and enable VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT< / li >
< li > turnip : Fix wrong offset calculation for xfb buffer.< / li >
< li > turnip: Skip unused regs when setting up streamout buffers< / li >
< li > turnip: Fix crashes when geometry shader constants aren't used< / li >
< p > < / p >
< p > Iago Toral Quiroga (1):< / p >
< li > nir: add a bool bitsize lowering pass< / li >
< p > < / p >
< p > Ian Romanick (62):< / p >
< li > intel/fs: Don't count integer instructions as being possibly coissue< / li >
< li > nir: Mark fmin and fmax as commutative and associative< / li >
< li > mesa/draw: Make sure all the unused fields are initialized to zero< / li >
< li > nir/search: Use larger type to hold linearized index< / li >
< li > intel/fs: Correctly handle multiply of fsign with a source modifier< / li >
< li > intel/fs: Do cmod prop again after scheduling< / li >
< li > intel/fs: Allow NOT instructions in conditional discard optimization< / li >
< li > intel/fs: Fix NULL destinations on 3-source instructions again after late DCE< / li >
< li > nir/algebraic: Simplify logic to detect sign of an integer< / li >
< li > nir/algebraic: optimize ior(ine(a, 0), ine(b, 0)) to ine(ior(a, b), 0)< / li >
< li > nir/algebraic: Generalize some and-of-shift-right patterns [v2]< / li >
< li > nir/algebraic: Constant reassociation for bitwise operations too< / li >
< li > nir/algebraic: Simplify a contradiction that can occur in __flt64_nonnan< / li >
< li > soft-fp64/b2f: Reimplement using bitwise logic ops< / li >
< li > soft-fp64: Don't open-code umulExtended< / li >
< li > soft-fp64: Simplify __countLeadingZeros32 function< / li >
< li > soft-fp64: Pick a single idiom for treating sign value as a Boolean< / li >
< li > soft-fp64: Store sign value as 0 or 0x80000000< / li >
< li > soft-fp64/fneg: Don't treat NaN specially< / li >
< li > soft-fp64/flt: Perform checks in a different order< / li >
< li > soft-fp64/fsat: Correctly handle NaN< / li >
< li > soft-fp64/fsat: Micro-optimize x < 0 test< / li >
< li > soft-fp64/fsat: Micro-optimize x >= 1 test< / li >
< li > soft-fp64: Relax the way NaN is propagated< / li >
< li > soft-fp64/ffloor: Simplify the >= 0 comparison< / li >
< li > soft-fp64: Optimize __fmin64 and __fmax64 by using different evaluation order [v2]< / li >
< li > soft-fp64/fadd: Instead of tracking "b < a", track sign of the difference< / li >
< li > soft-fp64/fadd: Massively split the live range of zFrac0 and zFrac1< / li >
< li > soft-fp64/fadd: Pick zero or non-zero result based on subtraction result< / li >
< li > soft-fp64/fadd: Just let the subtraction happen when the result will be zero< / li >
< li > soft-fp64/fadd: Delete a redundant condition check< / li >
< li > soft-fp64/fadd: Reformat after previous commit< / li >
< li > soft-fp64/fadd: Combine an if-statement into the preceeding else-clause< / li >
< li > soft-fp64/fadd: Rename aFrac and bFrac variables< / li >
< li > soft-fp64/fadd: Use absolute value of expDiff< / li >
< li > soft-fp64/fadd: Move common code out of both branches of an if-statement< / li >
< li > soft-fp64/fadd: Common code optimization for differing sign case< / li >
< li > soft-fp64: Split a block that was missing a cast on a comparison< / li >
< li > intel/vec4: Allow late copy propagation on vec4< / li >
< li > nir/algebraic: Change the default cursor location when replacing a unary op< / li >
< li > nir/algebraic: Distribute source modifiers into instructions< / li >
< li > nir/algebraic: Use value range analysis to convert fmax to fsat< / li >
< li > nir/algebraic: Remove a redundant fabs pattern< / li >
< li > tnl: Don't dereference NULL obj pointer in bind_indices< / li >
< li > tnl: Don't dereference NULL obj pointer in replay_init< / li >
< li > tnl: Don't dereference NULL obj pointer in t_rebase_prims< / li >
< li > tnl: Silence unused parameter 'attrib' warning in convert_half_to_float< / li >
< li > tnl: Silence unused parameter warnings in _tnl_draw_prims< / li >
< li > tnl: Silence unused parameter warnings in dump_draw_info< / li >
< li > tnl: Silence unused parameter warnings in _tnl_split_inplace< / li >
< li > tnl: Code formatting in t_draw.c< / li >
< li > tnl: Code formatting in t_rebase.c< / li >
< li > intel/compiler: Silence unused parameter warnings in vec4_tcs_visitor< / li >
< li > intel/compiler: Silence unused parameter warning in fs_live_variables::setup_one_read< / li >
< li > intel/compiler: Silence unused parameter warning in update_inst_scoreboard< / li >
< li > intel/compiler: Only GE and L modifiers are commutative for SEL< / li >
< li > intel/compiler: CSEL can do saturate< / li >
< li > intel/compiler: Fixup operands in fs_builder::emit() that takes array< / li >
< li > nir/algebraic: Detect some kinds of malformed variable names< / li >
< li > nir/algebraic: Require operands to iand be 32-bit< / li >
< li > nir/algebraic: Optimize ushr of pack_half, not ishr< / li >
< li > anv/tests: Don't rely on assert or changing NDEBUG in tests< / li >
< p > < / p >
< p > Icecream95 (16):< / p >
< li > panfrost: Fix non-debug builds< / li >
< li > panfrost: Inline panfrost_get_default_swizzle< / li >
< li > panfrost: LogicOp support< / li >
< li > nir: Allow nir_format conversions to work on 32-bit values< / li >
< li > panfrost: LogicOp fixes and non 8-bit format support< / li >
< li > mesa/format_utils: Add a fast-path for RGBA to BGRA< / li >
< li > panfrost: Extend the tiled store fast-path to loads< / li >
< li > panfrost: Mark 64-bit formats as unsupported< / li >
< li > panfrost: Add support for B5G5R5X1< / li >
< li > st/mesa: Fall back on R3G3B2 for R3_G3_B2< / li >
< li > panfrost: Add support for R3G3B2< / li >
< li > panfrost: Correctly identify format 0x4c< / li >
< li > pan/midgard: Fix a divide by zero in emit_alu_bundle< / li >
< li > panfrost: Fix GL_EXT_vertex_array_bgra< / li >
< li > panfrost: Enable PIPE_CAP_VERTEX_COLOR_UNCLAMPED< / li >
< li > panfrost: Fix background showing when using discard< / li >
< p > < / p >
< p > Icenowy Zheng (3):< / p >
< li > lima: remove its hash table entry when invalidating a resource< / li >
< li > lima: expose fragment shader derivatives capability< / li >
< li > lima: implement zsbuf reload< / li >
< p > < / p >
< p > Ilia Mirkin (24):< / p >
< li > nv50: report max lod bias of 15.0< / li >
< li > gitlab-ci: disable panfrost runners< / li >
< li > mesa: fix _mesa_draw_nonzero_divisor_bits to return nonzero divisors< / li >
< li > nv50,nvc0: add newly added PIPE_CAP's to list< / li >
< li > st/mesa: allow TXB2/TXL2 to work with cube array shadow textures< / li >
< li > nvc0: enable EXT_texture_shadow_lod< / li >
< li > st/vdpau: avoid asserting on new VDP_YCBCR_* formats< / li >
< li > st/vdpau: make query test for 2D support< / li >
< li > nv50: don't try to upload MSAA settings for BUFFER textures< / li >
< li > gallium: add viewport swizzling state and cap< / li >
< li > mesa: add GL_NV_viewport_swizzle support< / li >
< li > st/mesa: add NV_viewport_swizzle support< / li >
< li > nvc0: add NV_viewport_swizzle support for GM200+< / li >
< li > compiler: add VARYING_SLOT_VIEWPORT_MASK< / li >
< li > glsl: add NV_viewport_array2 support< / li >
< li > mesa: add NV_viewport_array2 enable, attach to glsl< / li >
< li > gallium: add TGSI_SEMANTIC_VIEWPORT_MASK< / li >
< li > gallium: add TGSI_PROPERTY_LAYER_VIEWPORT_RELATIVE< / li >
< li > gallium: add PIPE_CAP_VIEWPORT_MASK< / li >
< li > st/mesa: add support for GL_NV_viewport_array2< / li >
< li > nvc0: enable GL_NV_viewport_array2< / li >
< li > nv50,nvc0: update with latest caps< / li >
< li > docs: update for recently-added nvc0 features< / li >
< li > mesa: add interaction between compute derivatives and variable local sizes< / li >
< p > < / p >
< p > Indrajit Kumar Das (4):< / p >
< li > glapi/copyimage: Implement CopyImageSubDataNV< / li >
< li > gallium: prepare framework for supporting AlphaToCoverageDitherControlNV< / li >
< li > mesa: add support for AlphaToCoverageDitherControlNV< / li >
< li > radeonsi: enable support for AlphaToCoverageDitherControlNV< / li >
< p > < / p >
< p > Ivan Molodetskikh (1):< / p >
< li > egl: allow INVALID format for linux_dmabuf< / li >
< p > < / p >
< p > James Xiong (2):< / p >
< li > iris: handle the failure of converting unsupported yuv formats to isl< / li >
< li > gallium: let the pipe drivers decide the supported modifiers< / li >
< p > < / p >
< p > James Zhu (1):< / p >
< li > radeonsi: fix Segmentation fault during vaapi enc test< / li >
< p > < / p >
< p > Jan Palus (1):< / p >
< li > targets/opencl: fix build against LLVM>=10 with Polly support< / li >
< p > < / p >
< p > Jan Vesely (2):< / p >
< li > clover: Use explicit conversion from llvm::StringRef to std::string< / li >
< li > clover: Check if the detected clang libraries are usable< / li >
< p > < / p >
< p > Jan Zielinski (8):< / p >
< li > gallium/swr: Fix various asserts and security issues< / li >
< li > gallium/swr: fix corruptions in Unigine Heaven< / li >
< li > gallium/swr: use ElementCount type arguments for getSplat()< / li >
< li > gallium/gallivm: Remove workaround disabling AVX code for newer CPUs< / li >
< li > gallium/gallivm: fix compilation issues with llvm 11< / li >
< li > gallium/gallivm: remove unused header include for newer LLVM< / li >
< li > gallium/swr: Fix LLVM 11 compilation issues< / li >
< li > gallium/swr: Fix crashes and failures in vertex fetch< / li >
< p > < / p >
< p > Jason Ekstrand (202):< / p >
< li > genxml: Add a new 3DSTATE_SF field on gen12< / li >
< li > anv,iris: Set 3DSTATE_SF::DerefBlockSize to per-poly on Gen12+< / li >
< li > intel/genxml: Drop SLMEnable from L3CNTLREG on Gen11< / li >
< li > iris: Set SLMEnable based on the L3$ config< / li >
< li > iris: Store the L3$ configs in the screen< / li >
< li > iris: Use the URB size from the L3$ config< / li >
< li > i965: Re-emit l3 state before BLORP executes< / li >
< li > intel: Take a gen_l3_config in gen_get_urb_config< / li >
< li > intel/blorp: Always emit URB config on Gen7+< / li >
< li > iris: Consolidate URB emit< / li >
< li > anv: Emit URB setup earlier< / li >
< li > intel/common: Return the block size from get_urb_config< / li >
< li > intel/blorp: Plumb deref block size through to 3DSTATE_SF< / li >
< li > anv: Plumb deref block size through to 3DSTATE_SF< / li >
< li > iris: Plumb deref block size through to 3DSTATE_SF< / li >
< li > anv: Always fill out the AUX table even if CCS is disabled< / li >
< li > intel/eu/validate: Don't validate regions of sends< / li >
< li > intel/disasm: SEND has two sources on Gen12+< / li >
< li > intel/tools: Handle strides better when dumping buffers< / li >
< li > intel/fs: Write the address register with NoMask for MOV_INDIRECT< / li >
< li > anv/blorp: Use the correct size for vkCmdCopyBufferToImage< / li >
< li > anv: No-op submit and wait calls when no_hw is set< / li >
< li > anv: Reject modifiers on depth/stencil formats< / li >
< li > vulkan: Update the XML and headers to 1.2.133< / li >
< li > nir: Fix the nir_builder include path for nir_builtin_builder< / li >
< li > nir/builder: Return an integer from nir_get_texture_size< / li >
< li > intel/isl: Add isl_aux_info.c to Makefile.sources< / li >
< li > anv: Always enable the data cache< / li >
< li > nir: Drop nir_tex_instr::texture_array_size< / li >
< li > anv: Use the PIPE_CONTROL instead of bits for the CS stall W/A< / li >
< li > anv: Use a proper end-of-pipe sync instead of just CS stall< / li >
< li > anv: Do end-of-pipe sync around MCS/CCS ops instead of CS stall< / li >
< li > nir: Flush to zero with OOB low exponents in ldexp< / li >
< li > isl: Set 3DSTATE_DEPTH_BUFFER::Depth correctly for 3D surfaces< / li >
< li > iris: Allow HiZ on blit sources< / li >
< li > blorp: Write to depth/stencil images as depth/stencil when possible< / li >
< li > anv: Enable HiZ for VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL< / li >
< li > iris: Enable CCS for copies from HiZ+CCS depth buffers< / li >
< li > iris: Enable HiZ and stencil CCS for blorp blit destinations< / li >
< li > iris: Don't skip fast depth clears if the color changed< / li >
< li > anv: Parse VkPhysicalDeviceFeatures2 in CreateDevice< / li >
< li > anv: Mark max_push_range UNUSED and simplify the code< / li >
< li > anv: Pass buffer addresses into emit_push_constant*< / li >
< li > anv: Delete some pointless break statements< / li >
< li > anv: Align UBO sizes to 32B< / li >
< li > anv: Add an align_down_u32 helper< / li >
< li > anv: Bounds-check pushed UBOs when robustBufferAccess = true< / li >
< li > vulkan/wsi: Don't leak the FD when GetImageDrmFormatModifierProperties fails< / li >
< li > vulkan/wsi: Return an error if dup() fails< / li >
< li > intel/isl: Clean up some aux surface logic< / li >
< li > intel/isl: Add a separate ISL_AUX_USAGE_HIZ_CCS_WT< / li >
< li > intel/blorp: Allow HIZ_CCS_WT in copy sources< / li >
< li > iris: Use ISL_AUX_USAGE_HIZ_CCS_WT to indicate write-through HiZ< / li >
< li > intel/isl: Require ISL_AUX_USAGE_HIZ_CCS_WT for HZ+CCS WT mode< / li >
< li > intel/isl: Add a separate ISL_AUX_USAGE_STC_CCS< / li >
< li > intel/blorp: Allow STC_CCS in blit sources< / li >
< li > iris: Use ISL_AUX_USAGE_STC_CCS for stencil CCS< / li >
< li > intel: Require ISL_AUX_USAGE_STC_CCS for stencil CCS< / li >
< li > intel/isl: Set DepthStencilResource based on aux usage< / li >
< li > anv: Dump push ranges via VK_KHR_pipeline_executable_properties< / li >
< li > anv: Fix the comparison in an assert< / li >
< li > anv: Push UBO ranges relative to the start of the binding< / li >
< li > anv: Do an end-of-pipe sync before updating AUX table entries< / li >
< li > intel/isl: Don't align linear images to 64K on Gen12+< / li >
< li > intel/blorp: Add support for swizzling fast-clear colors< / li >
< li > anv: Swizzle fast-clear values< / li >
< li > intel/iris: Always initialize CCS to 0< / li >
< li > anv: Only add END_OF_PIPE_SYNC if we actually have AUX_INVAL< / li >
< li > util/sparse_array: Finish the sparse_array in the tests< / li >
< li > util/sparse_array: Add a node_size_log2 temporary< / li >
< li > meson,ci: Disable sparse_array tests on windows< / li >
< li > util/sparse_array: Stash the node level in the node pointer< / li >
< li > anv: Stop fetching the timestamp frequency ourselves< / li >
< li > intel/dump_gpu: Add an ensure_device_info helper< / li >
< li > intel/dump_gpu: Handle a bunch of getparam in the no-HW case< / li >
< li > intel/nir: Run copy-prop and DCE after lower_bool_to_int32< / li >
< li > nir: Add b2b opcodes< / li >
< li > aco: Implement b2b32 and b2b1< / li >
< li > nir: Use b2b opcodes for shared and constant memory< / li >
< li > nir: Insert b2b1s around booleans in nir_lower_to< / li >
< li > anv: Set alignments on descriptor and constant loads< / li >
< li > nir: Validate that memory load/store ops work on whole bytes< / li >
< li > nir: Set UBO alignments in lower_uniforms_to_ubo< / li >
< li > nir/opt_loop_unroll: Fix has_nested_loop handling< / li >
< li > nir/lower_int64: Lower 8 and 16-bit downcasts with nir_lower_mov64< / li >
< li > nir/algebraic: Add downcast-of-pack opts< / li >
< li > nir: Add a nir_op_is_vec helper< / li >
< li > nir: Copy propagate through vec8s and vec16s< / li >
< li > nir: Handle vec8/16 in bool_to_bitsize< / li >
< li > nir: Handle vec8/16 in gather_ssa_types< / li >
< li > nir: Handle vec8/16 in lower_phis_to_scalar< / li >
< li > nir: Handle vec8/16 in lower_regs_to_ssa< / li >
< li > nir: Handle vec8/16 in opt_split_alu_of_phi< / li >
< li > nir: Treat vec8/16 as select in opt_peephole_select< / li >
< li > nir: Handle vec8/16 in opt_undef_vecN< / li >
< li > nir: Handle vec8/16 in nir_shrink_array_vars< / li >
< li > anv: Account for the header in anv_state_stream_alloc< / li >
< li > anv/allocator: Use util_dynarray for blocks in anv_state_stream< / li >
< li > spirv: Implement OpCopyObject and OpCopyLogical as blind copies< / li >
< li > Revert "spirv: Implement OpCopyObject and OpCopyLogical as blind copies"< / li >
< li > anv/image: Use align_u64 for image offsets< / li >
< li > nir/from_ssa: Only chain movs when a src is also a dest< / li >
< li > intel/fs: Choose memory message type based on bit size< / li >
< li > anv: Improve brw_nir_lower_mem_access_bit_sizes< / li >
< li > iris: Set alignments on cbuf0 and constant reads< / li >
< li > intel/nir: Lower memory access bit sizes later< / li >
< li > nir/load_store_vectorize: Fix shared atomic info< / li >
< li > nir/load_store_vectorize: Use nir_iadd_imm for offsets< / li >
< li > nir/load_store_vectorize: Add support for nir_var_mem_global< / li >
< li > intel/nir: Enable load/store vectorization< / li >
< li > spirv: Add a vtn_block() helper< / li >
< li > spirv: Add cast and loop helpers for vtn_cf_node< / li >
< li > spirv: Make vtn_case a vtn_cf_node< / li >
< li > spirv: Make vtn_function a vtn_cf_node< / li >
< li > spirv: Add a parent field to vtn_cf_node< / li >
< li > spirv: Rewrite CFG construction< / li >
< li > Revert "spirv: Rewrite CFG construction"< / li >
< li > nir: Assert memory loads are aligned< / li >
< li > anv: Advertise SEND count through VK_EXT_pipeline_executable_properties< / li >
< li > anv: Fix UBO range detection in anv_nir_compute_push_layout< / li >
< li > nir: Add an alignment to nir_intrinsic_load_constant< / li >
< li > nir: Add some sanity assertions in opt_large_constants< / li >
< li > intel: Add _const versions of prog_data cast helpers< / li >
< li > anv: Report correct SLM size< / li >
< li > intel/batch_decoder: Stop printing to stdout< / li >
< li > intel/cfg: Add first/last_block helpers< / li >
< li > anv: Emit pushed UBO bounds checking code in the back-end compiler< / li >
< li > intel/blorp: Delete an unused enum< / li >
< li > spirv: Handle OOB vector extract operations< / li >
< li > spirv,nir: Add a better vector_insert< / li >
< li > spirv: Error if OpCompositeInsert/Extract has OOB indices< / li >
< li > nir/builder: Handle any bit-size selector in nir_extract< / li >
< li > spirv: Call nir_builder directly for vector_extract< / li >
< li > spirv,nir: Move the SPIR-V vector insert code to NIR< / li >
< li > anv: Move vb_emit setup closer to where it's used in flush_state< / li >
< li > anv: Apply any needed PIPE_CONTROLs before emitting state< / li >
< li > nir/dominance: Better handle unreachable blocks< / li >
< li > nir/gcm: Loop over blocks in pin_instructions< / li >
< li > nir/gcm: Use an array for storing the early block< / li >
< li > nir/gcm: Move block choosing into a helper function< / li >
< li > nir/gcm: Add a real concept of "progress"< / li >
< li > nir/gcm: Delete dead instructions< / li >
< li > nir/gcm: Prefer the instruction's original block< / li >
< li > intel/fs: Rename block to scan_block in can_coalesce_vars< / li >
< li > intel/fs: Coalesce when the src live range is contained in the dst< / li >
< li > glsl: Hard-code noise to zero in builtin_functions.cpp< / li >
< li > nir: Delete the fnoise opcodes< / li >
< li > meta,i965: Rip GL_EXT_texture_multisample_blit_scaled support out of meta< / li >
< li > spirv: Allow constants and NULLs in SpvOpConvertUToPtr< / li >
< li > anv: Properly handle all sizes of specialization constants< / li >
< li > radv: Properly handle all sizes of specialization constants< / li >
< li > turnip: Properly handle all sizes of specialization constants< / li >
< li > spirv: Use nir_const_value for spec constants< / li >
< li > nir/opt_deref: Remove certain sampler type casts< / li >
< li > spirv: Fix passing combined image/samplers through function calls< / li >
< li > anv: Drop an assert< / li >
< li > nir/lower_subgroups: Mask off unused bits in ballot ops< / li >
< li > anv: Add a vk_image_layout_to_usage_flags helper< / li >
< li > anv: Move vk_image_layout_is_read_only higher< / li >
< li > anv: Be more conservative about image view usage< / li >
< li > anv: Rework anv_layout_to_aux_state< / li >
< li > anv/blorp: Do less hard-coding of aux usages< / li >
< li > anv: Generalize some aux usage checks< / li >
< li > intel/blorp: Allow more HiZ usages in hiz_clear_depth_stencil< / li >
< li > anv: Simplify a case in layout_to_aux_usage< / li >
< li > anv/cmd_buffer: Move anv_image_init_aux_tt higher< / li >
< li > intel/isl: Delete a misleading comment< / li >
< li > intel/isl: Refactor isl_surf_get_ccs_surf< / li >
< li > anv: Add support for HiZ+CCS< / li >
< li > spirv: Rewrite CFG construction< / li >
< li > intel/devinfo: Compute the correct L3$ size for Gen12< / li >
< li > anv: Expose CS workgroup sizes based on a maximum of 64 threads< / li >
< li > anv: Return an error if allocating attachment memory fails< / li >
< li > anv: Add TRANSFER_SRC to pass usage not subpass usage< / li >
< li > anv: Stop filling out the clear color in compute_aux_usage< / li >
< li > anv: Assert surface states are valid< / li >
< li > anv: Use ANV_FROM_HANDLE for pInheritanceInfo fields< / li >
< li > anv: Mark images written in end_subpass< / li >
< li > anv: Split command buffer attachment setup in three< / li >
< li > anv: Allocate surface states per-subpass< / li >
< li > intel: Move swizzle_color_value from blorp to ISL< / li >
< li > anv: Disallow fast-clears which require format-reinterpretation< / li >
< li > anv: Stop allowing non-zero clear colors in input attachments< / li >
< li > anv: Refactor cmd_buffer_setup_attachments< / li >
< li > anv: Rework depth_stencil_attachment_compute_aux_usage< / li >
< li > anv: Split color_attachment_compute_aux_usage in two< / li >
< li > anv: Use anv_layout_to_aux_usage for color during render passes< / li >
< li > anv: Allow all clear colors for texturing on Gen11+< / li >
< li > vulkan: Update Vulkan XML and headers to 1.2.139< / li >
< li > nir/copy_prop_vars: Handle volatile better< / li >
< li > nir/copy_prop_vars: Report progress when deleting self-copies< / li >
< li > nir/dead_write_vars: Handle volatile< / li >
< li > nir/combine_stores: Handle volatile< / li >
< li > anv: Handle NULL descriptors< / li >
< li > anv: Handle null vertex buffer bindings< / li >
< li > anv: Claim VK_EXT_robustness2 support< / li >
< li > intel/fs: Don't delete coalesced MOVs if they have a cmod< / li >
< li > vulkan: Allow destroying NULL debug report callbacks< / li >
< li > anv:gpu_memcpy: Emit 3DSTATE_VF_INDEXING on Gen8+< / li >
< li > nir/lower_double_ops: Rework the if (progress) tree< / li >
< li > nir/opt_deref: Report progress if we remove a deref< / li >
< li > nir/copy_prop_vars: Record progress in more places< / li >
< p > < / p >
< p > Jesse Natalie (3):< / p >
< li > wgl: add official gldrv.h header-file< / li >
< li > wgl: use gldrv.h instead of stw_icd.h< / li >
< li > util/ralloc: fix ralloc alignment on Win64< / li >
< p > < / p >
< p > John Stultz (7):< / p >
< li > freedreno: Add ir3_cf.c and ir3_delay.c to Makefile.sources< / li >
< li > panfrost: Move pan_afbc.c file to the right Makefile.source file< / li >
< li > gallium: hud_context: Fix scalar initializer warning.< / li >
< li > Android.mk: Tweak MESA_ENABLE_LLVM checks< / li >
< li > etnaviv: Avoid shift overflow< / li >
< li > vc4_bufmgr: Remove duplicative VC definition< / li >
< li > r600: Fix build error in sfn_nir_lower_fs_out_to_vector.cpp< / li >
< p > < / p >
< p > Jon Turney (1):< / p >
< li > Fix util/process test on Cygwin< / li >
< p > < / p >
< p > Jonathan Marek (79):< / p >
< li > freedreno/a6xx: use single format enum< / li >
< li > freedreno/a6xx: fix Z24_UNORM_S8_UINT_AS_R8G8B8A8< / li >
< li > freedreno: name sysmem color/depth flush events< / li >
< li > freedreno/a6xx: document some unknown bits< / li >
< li > turnip: add option to force use of hw binning< / li >
< li > turnip: fix COND_EXEC reserved size in tu_query< / li >
< li > turnip: add tu_device pointer to tu_cs< / li >
< li > turnip: automatically reserve cmdstream space in emit_pkt4/emit_pkt7< / li >
< li > turnip: remove marker seqno< / li >
< li > turnip: make cond_exec helper easier to use< / li >
< li > turnip: move tile_load_ib/sysmem_clear_ib into draw_cs< / li >
< li > hud: add GALLIUM_HUD_SCALE< / li >
< li > turnip: enable sampleRateShading feature< / li >
< li > turnip: enable fullDrawIndexUint32/independentBlend/dualSrcBlend/logicOp< / li >
< li > etnaviv: disable INT_FILTER for ASTC< / li >
< li > util/format: add missing BC4/BC5 vulkan formats< / li >
< li > turnip: rework format table to support r5g5b5a1_unorm/b5g5r5a1_unorm< / li >
< li > turnip: add r5g5b5a1_unorm/b5g5r5a1_unorm formats< / li >
< li > turnip: check the right alignment requirement on shader iova< / li >
< li > turnip: move some constant state to tu6_init_hw< / li >
< li > turnip: remove unnecessary MRT_CONTROL fill< / li >
< li > turnip: minify image_view extent< / li >
< li > turnip: fix hw binning + render_area offset interaction< / li >
< li > turnip: fix srgb MRT< / li >
< li > turnip: don't hardcode gmem base for input attachment< / li >
< li > turnip: remove unnecessary fb size check< / li >
< li > turnip: fall back to sysmem when attachments don't fit into gmem< / li >
< li > turnip: increase array sizes in tu_descriptor_map< / li >
< li > turnip: improve binning pipe layout config< / li >
< li > turnip: fix tile-&gt;slot calculation< / li >
< li > etnaviv: nir: add compile_check_limits< / li >
< li > freedreno/registers: more GRAS_CL_CNTL bits, Z_CLAMP< / li >
< li > turnip: fix znear clipping< / li >
< li > turnip: implement depth clamp< / li >
< li > turnip: implement timestamp query< / li >
< li > turnip: fix compute shaders crashing after geometry shader change< / li >
< li > turnip: improve vertex input handling< / li >
< li > turnip: use buffer size instead of bo size for VFD_FETCH_SIZE< / li >
< li > freedreno/registers: add RB_CCU_CNTL bitfields< / li >
< li > freedreno/a6xx: set bypass RB_CCU_CNTL value for blitter< / li >
< li > turnip: RB_CCU_CNTL fixes< / li >
< li > turnip: split up gmem/tile alignment< / li >
< li > turnip: fix nir validate failure from push constant lowering< / li >
< li > turnip: disable 8x msaa< / li >
< li > turnip: save attachment samples in renderpass state< / li >
< li > turnip: use dirty bits for dynamic viewport/scissor state< / li >
< li > turnip: rework format helpers< / li >
< li > turnip: add vk_format_is_snorm/is_float< / li >
< li > turnip: new clear/blit implementation with shader path fallback< / li >
< li > freedreno/computerator: support nop prefix< / li >
< li > freedreno/computerator: support bindless sampler instructions< / li >
< li > freedreno/ir3: fix emit_tex_info split_dest< / li >
< li > freedreno/ir3: don't overwrite wrmask in ir3_SAM< / li >
< li > turnip: compute render_components/srgb_cntl at renderpass creation time< / li >
< li > turnip: don't limit framebuffer size to image size< / li >
< li > turnip: image_view rework< / li >
< li > nir: add common convert_ycbcr for vulkan csc< / li >
< li > nir: convert_ycbcr: preserve alpha channel< / li >
< li > anv: use common nir_convert_ycbcr< / li >
< li > radv: use common nir_convert_ycbcr< / li >
< li > turnip: fix GMEM resolve in CmdNextSubpass< / li >
< li > turnip: disable depth test for S8_UINT attachment< / li >
< li > turnip: improve GMEM load/store logic< / li >
< li > turnip: enable VK_FORMAT_S8_UINT as stencil format< / li >
< li > turnip: set shader key msaa field< / li >
< li > turnip: implement VK_EXT_sample_locations< / li >
< li > turnip: implement VK_EXT_filter_cubic< / li >
< li > turnip: enable cube arrays< / li >
< li > turnip: implement VK_EXT_sampler_filter_minmax< / li >
< li > turnip: divide cube map depth by 6< / li >
< li > freedreno/ir3: fix 16-bit ssbo access< / li >
< li > freedreno/ir3: set even bit for f2f16_rtne< / li >
< li > freedreno/ir3: fix incorrect conversion folding< / li >
< li > turnip: remove unused RB_UNKNOWN_8E04_blit< / li >
< li > turnip: use RESOLVE_TS event< / li >
< li > turnip: add adreno 650< / li >
< li > nir: add pack_32_2x16_split/unpack_32_2x16_split lowering< / li >
< li > freedreno/ir3: run nir_lower_pack< / li >
< li > turnip: fix wrong substream size in parse_multisample_and_color_blend< / li >
< p > < / p >
< p > Jordan Justen (6):< / p >
< li > intel/compiler: Restrict cs_threads to 64< / li >
< li > intel: Update TGL PCI strings< / li >
< li > intel: Add TGL PCI ID< / li >
< li > intel/dev: Split .num_subslices out of GEN12_FEATURES macro< / li >
< li > intel/dev: Add device info for RKL< / li >
< li > docs/relnotes/new_features.txt: Add RKL to 20.1 release notes< / li >
< p > < / p >
< p > Jose Maria Casanova Crespo (5):< / p >
< li > broadcom: Fix implicit declaration of ffs for Android build< / li >
< li > v3d: Sync on last CS when non-compute stage uses resource written by CS< / li >
< li > v3d: Primitive Counts Feedback needs an extra 32-bit padding.< / li >
< li > v3d: Fix swizzle in DXT3 and DXT5 formats< / li >
< li > v3d: Include supported DXT formats to enable s3tc/dxt extensions< / li >
< p > < / p >
< p > Joshua Ashton (3):< / p >
< li > radv: Use TRUNC_COORD on samplers< / li >
< li > radv: Pass logical device to si_emit_graphics< / li >
< li > radeonsi: Use TRUNC_COORD on samplers< / li >
< p > < / p >
< p > José Fonseca (4):< / p >
< li > meson: Avoid duplicate symbols.< / li >
< li > scons: Prune out unnecessary targets.< / li >
< li > gitlab-ci: Prune all SCons jobs except scons-win64, and allows failures.< / li >
< li > appveyor: Remove Meson job.< / li >
< p > < / p >
< p > Juan A. Suarez Romero (6):< / p >
< li > nir/lower_double_ops: add note for lowering mod< / li >
< li > nir/lower_double_ops: relax lower mod()< / li >
< li > nir/algebraic: coalesce fmod lowering< / li >
< li > anv: use urb_setup_attribs in SBE< / li >
< li > intel/compiler: store the FS inputs in WM prog data< / li >
< li > anv/pipeline: allow more than 16 FS inputs< / li >
< p > < / p >
< p > Karol Herbst (18):< / p >
< li > clover: add trivial clCreateCommandQueueWithProperties implementation< / li >
< li > nir/lower_ssbo: handle atomics< / li >
< li > gallium: make handles of set_global_binding 64 bit< / li >
< li > Revert "gallium: make handles of set_global_binding 64 bit"< / li >
< li > nv50, nvc0: fix must_check warning of util_dynarray_resize_bytes< / li >
< li > clover: fix build with single library clang build< / li >
< li > gallium: add PIPE_CAP_SYSTEM_SVM< / li >
< li > clover: add stubs for SVM< / li >
< li > clover: implement CL_DEVICE_SVM_CAPABILITIES< / li >
< li > clover: implement clSetKernelArgSVMPointer< / li >
< li > clover: implement SVM functions for devices with fine grained system SVM support< / li >
< li > clover: implement cl_arm_shared_virtual_memory< / li >
< li > clover: expose cl_arm_shared_virtual_memory for devices with SVM support< / li >
< li > nvc0: enable ASTC and ETC on GM20B< / li >
< li > mesa: fix enum value of VIEWPORT_SWIZZLE_POSITIVE_W_NV< / li >
< li > gallium: initialize viewport swizzle in cso_set_viewport_dims< / li >
< li > Revert "nvc0: fix line width on GM20x+"< / li >
< li > st/mesa: properly guard fallback_copy_texsubimage against failed maps< / li >
< p > < / p >
< p > Kenneth Graunke (14):< / p >
< li > intel/genxml: Drop "reserved" enum< / li >
< li > isl: Fix the android build.< / li >
< li > iris: Dump frame markers with INTEL_DEBUG=submit< / li >
< li > iris: Trim "../../src/gallium/drivers/iris/" out of debug dump filenames< / li >
< li > iris: Make mocs an inline helper in iris_resource.h< / li >
< li > iris: Fix BLORP vertex buffers to respect ISL MOCS settings< / li >
< li > iris: Set MOCS for constant packets on Gen12+< / li >
< li > intel/compiler: Drop nir_lower_to_source_mods() and related handling.< / li >
< li > intel/compiler: Put back saturate on [iu]add_sat opcodes< / li >
< li > intel/compiler: Don't copy prop source mods into PICK_HIGH_32BIT< / li >
< li > intel/compiler: Delete abs/neg handling in fsign code< / li >
< li > intel/compiler: Don't create 64-bit src1 immediates in opt_peephole_sel< / li >
< li > nir: Actually do load/store vectorization beyond vec2< / li >
< li > iris: Fix downcast of bound_vertex_buffers from uint64_t to int< / li >
< p > < / p >
< p > Konrad Dybcio (1):< / p >
< li > freedreno/a4xx: enable A405< / li >
< p > < / p >
< p > Kristian Høgsberg (39):< / p >
< li > nir: Delete unused is_var_constant() helper< / li >
< li > nir: Make unroll pragma work on clang< / li >
< li > freedreno/fdperf: Cast away some ignored return values< / li >
< li > spirv/opencl: Cast opcode up front to avoid warnings< / li >
< li > glsl: Use 'using' to be explicit about visitor overloads< / li >
< li > nir: Remove always-true assert< / li >
< li > turnip: Be explicit about converting vk compare func to a6xx< / li >
< li > freedreno/a6xx: Add fd6_resource_screen_init()< / li >
< li > freedreno: Set up supported modifiers in fd*_resource_screen_init()< / li >
< li > freedreno: Add layout_resource_for_modifier screen vfunc< / li >
< li > freedreno/a6xx: Implement layout for DRM_FORMAT_MOD_QCOM_COMPRESSED< / li >
< li > turnip: Drop explicit configure opt-in for turnip< / li >
< li > ci: Drop turnip opt-in option< / li >
< li > freedreno/ir3: Set IR3_REG_HALF flag on src as well in immediate MOV< / li >
< li > Mark a few static inline helpers with ASSERTED< / li >
< li > main/get: Converted type conversion macros to inline functions< / li >
< li > nir/types: Add glsl_float16_type() helper< / li >
< li > freedreno/ir3: Lower output precision< / li >
< li > Revert "glsl: Use a simpler formula for tanh"< / li >
< li > Revert "spirv: Use a simpler and more correct implementaiton of tanh()"< / li >
< li > freedreno/ir3: Don't fold conversions into sign< / li >
< li > glsl: Add ir_constant constructor for fp16< / li >
< li > glsl: Add fp16 case for ir_triop_lrp optimization< / li >
< li > glsl: Implement constant propagation for fp16< / li >
< li > glsl: Expand fp16 to float before constant expression evaluation< / li >
< li > glsl: Add type queries for fp16+float and fp16+float+double< / li >
< li > glsl/lower_instructions: Handle fp16 for FDIV_TO_MUL_RCP< / li >
< li > radeonsi: Stop exposing PIPE_SHADER_CAP_FP16< / li >
< li > turnip: Add missing VKAPI_ATTR annotations< / li >
< li > turnip: Stub out VK_KHR_external_{fence,semaphore}_fd< / li >
< li > turnip: Make Android platform build< / li >
< li > turnip: Drop dep_llvm from dependencies< / li >
< li > freedreno/ir3: Fix sz vs class confusion< / li >
< li > freedreno/computerator: Decouple ir3 assembler< / li >
< li > freedreno/ir3: Move ir3 assembler to backend compiler< / li >
< li > freedreno/ir3: Parse, but ignore @in, @out and @tex headers< / li >
< li > freedreno/ir3: Reset lex line number when we start parsing< / li >
< li > freedreno/ir3: Print @tex write mask using 0x%x< / li >
< li > freedreno: Use the right amount of &amp;'s< / li >
< p > < / p >
< p > Krzysztof Raszkowski (10):< / p >
< li > gallium/swr: fix gcc warnings< / li >
< li > gallium/swr: Fix gcc 4.8.5 compile error< / li >
< li > gallium/swr: Fix llvm11 compilation issues< / li >
< li > gallium/swr: simplify environment variable expansion code< / li >
< li > gallium/swr: fix rdtsc debug statistics mechanism< / li >
< li > gallium/swr: Fix min/max range index draw< / li >
< li > Revert "gallium/swr: Fix min/max range index draw"< / li >
< li > gallium/swr: Fix vcvtph2ps llvm intrinsic compile error< / li >
< li > gallium/swr: Fix array stride problem.< / li >
< li > gallium/swr: Re-enable scratch space for client-memory buffers< / li >
< p > < / p >
< p > Leandro Ribeiro (1):< / p >
< li > i965: remove duplicated comment< / li >
< p > < / p >
< p > Leo Liu (1):< / p >
< li > radeon/jpeg: fix the jpeg dt_pitch with YUYV format< / li >
< p > < / p >
< p > Lepton Wu (1):< / p >
< li > virgl: Use ETC2 formats directly when possible.< / li >
< p > < / p >
< p > Lionel Landwerlin (49):< / p >
< li > iris: implement gen12 post sync pipe control workaround< / li >
< li > anv: implement gen9 post sync pipe control workaround< / li >
< li > anv: implement gen12 post sync pipe control workaround< / li >
< li > anv: set MOCS on push constants< / li >
< li > mesa: add INTEL_blackhole_render< / li >
< li > i965: enable INTEL_blackhole_render< / li >
< li > st: add support for INTEL_blackhole_render< / li >
< li > iris: add support INTEL_blackhole_render< / li >
< li > intel/tools/aub_dump: move aub file initialization to maybe_init()< / li >
< li > intel/tools/aub_dump: fix crash when using the default legacy context< / li >
< li > intel/aub_dump: stub the waits when overriding the device< / li >
< li > intel/tools/dump_gpu: fix getparam values< / li >
< li > anv: stop storing prog param data into shader blobs< / li >
< li > intel/decoder: don't consider header fields past dword0< / li >
< li > isl: implement linear tiling row pitch requirement for display< / li >
< li > isl: properly filter supported display modifiers on Gen9+< / li >
< li > isl: only apply main surface ccs pitch constraint with CCS< / li >
< li > isl: drop min row pitch alignment when set by the driver< / li >
< li > intel: add new TGL pci ids< / li >
< li > i965/iris: fix crash when calling GetPerfQueryDataINTEL< / li >
< li > vulkan/overlay: Add a workaround semaphore for application presenting without one< / li >
< li > intel/perf: move register definition to special file< / li >
< li > intel/perf: break GL query stuff away< / li >
< li > intel/perf: move mdapi query definitions to their own file< / li >
< li > intel/perf: document meaning of query field< / li >
< li > intel/perf: store the probed i915-perf version< / li >
< li > isl: set bpb for Y8_UNORM< / li >
< li > isl: don't warn in physical extent calculation for yuv formats< / li >
< li > intel/aub_viewer: fix access to freed memory< / li >
< li > drm-shim: return device platform as specified< / li >
< li > drm-shim: stub libdrm's use of realpath()< / li >
< li > iris: properly free resources on BO allocation failure< / li >
< li > iris: share buffer managers across screens< / li >
< li > iris: make resources take a ref on the screen object< / li >
< li > i965: store DRM fd on intel_screen< / li >
< li > i965: share buffer managers across screens< / li >
< li > iris: drop cache coherent cpu mapping for external BO< / li >
< li > intel/perf: Enable MDAPI queries for Gen12< / li >
< li > anv: skip writing perfcntr in results on Gen12+< / li >
< li > util/sparse_free_list: manipulate node pointers using atomic primitives< / li >
< li > iris: fail screen creation when kernel support is not there< / li >
< li > include/drm-uapi: bump headers< / li >
< li > intel/perf: store default sseu configuration< / li >
< li > intel/perf: specify sseu configuration when supported< / li >
< li > anv: force whole EU array to be powered for perf queries< / li >
< li > drm-shim: provide a valid fake syncobj handle at creation< / li >
< li > drm-shim: stub syncobj wait ioctl< / li >
< li > iris: don't assert on unfinished aux import in copy paths< / li >
< li > anv: don't expose VK_INTEL_performance_query without kernel support< / li >
< p > < / p >
< p > Liviu Prodea (2):< / p >
< li > scons/windows: Support build with LLVM 10.< / li >
< li > util: Make process_test path compatible with mingw native toolchains< / li >
< p > < / p >
< p > Louis-Francis Ratté-Boulianne (7):< / p >
< li > glsl/linker: add DisableTransformFeedbackPacking workaround< / li >
< li > glsl/linker: handle array/struct members for DisableXfbPacking< / li >
< li > glsl/linker: add xfb workaround for modified built-in variables< / li >
< li > gallium: add PIPE_CAP_PACKED_STREAM_OUTPUT< / li >
< li > gallium: add PIPE_CAP_VIEWPORT_TRANSFORM_LOWERED< / li >
< li > gallium: add PIPE_CAP_PSIZ_CLAMPED< / li >
< li > panfrost: fix transform feedback< / li >
< p > < / p >
< p > Lucas Stach (1):< / p >
< li > etnaviv: retarget transfer to render resource when necessary< / li >
< p > < / p >
< p > Marek Olšák (254):< / p >
< li > vbo: move GLvertexformat initialization into a template header file for reuse< / li >
< li > vbo: use the template for noop GLvertexformat initialization< / li >
< li > vbo: use the template for save GLvertexformat initialization< / li >
< li > vbo: move reusable code from vbo_attrib_tmp.h into vbo_util.h< / li >
< li > mesa: implement missing display list functions while switching to the template< / li >
< li > radeonsi: don't report that multi-plane formats are supported< / li >
< li > radeonsi: fix the DCC MSAA bug workaround< / li >
< li > radeonsi: don't update states for the DCC MSAA bug on GFX6-7< / li >
< li > glx: print FPS with 2 decimal places< / li >
< li > mesa: fix incorrect uses of FLUSH_CURRENT< / li >
< li > mesa: remove FLUSH_CURRENT calls that have no effect< / li >
< li > mesa: import PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET handling< / li >
< li > vbo: create the immediate mode buffer only in vbo_exec_vtx_map< / li >
< li > vbo: skip FlushMappedBufferRange for glBegin/End by using a persistent mapping< / li >
< li > vbo: don't unmap persistent buffer mappings for glBegin/End< / li >
< li > vbo: remove immediate mode code that doesn't do anything and simplify stuff< / li >
< li > vbo: interleave attrsz, attrtype, and active_sz in memory< / li >
< li > vbo: remove a funky recursive call in glBegin< / li >
< li > vbo: don't check ctx-&gt;NewState twice in glBegin< / li >
< li > vbo: keep the immediate mode buffer always mapped for simplicity< / li >
< li > vbo: don't set FLUSH_UPDATE_CURRENT for glVertex< / li >
< li > vbo: pass only either uint32_t or uint64_t into ATTR_UNION< / li >
< li > vbo: don't store glVertex values temporarily into exec< / li >
< li > vbo: optimize resizing vertex attributes during immediate mode< / li >
< li > vbo: fix resizing 64-bit vertex attributes< / li >
< li > vbo: use FlushVertices flags properly and clear NeedFlush correctly< / li >
< li > vbo: increase the size of the immediate mode buffer to decrease draw count< / li >
< li > vbo: add/update unlikely statements in ATTR_UNION< / li >
< li > vbo: delay flagging FLUSH_STORED_VERTICES until glEnd< / li >
< li > vbo: also map the immediate mode buffer for read< / li >
< li > vbo: clean up resetting vertex attribs< / li >
< li > vbo: merge use_buffer_objects into vbo_CreateContext to skip the big malloc< / li >
< li > i965: don' t use _mesa_prim::is_indirect< / li >
< li > mesa: remove unused _mesa_prim::is_indirect< / li >
< li > mesa: don' t use bitfields in _mesa_prim< / li >
< li > st/mesa: optimize st_update_array with ALWAYSINLINE< / li >
< li > radeonsi: don' t wait for shader compilation to finish when destroying a context< / li >
< li > mesa: translate into gallium vertex formats in mesa/main< / li >
< li > mesa: remove unused _mesa_draw_indirect< / li >
< li > st/mesa: always inline the code setting non-64bit vertex elements< / li >
< li > st/mesa: simplify determination whether a draw has user vertex buffers< / li >
< li > st/mesa: simplify determination whether a draw needs min/max index< / li >
< li > st/mesa: change some loops from while to do..while in st_atom_array.c< / li >
< li > st/mesa: make st_setup_current static< / li >
< li > st/mesa: simplify releasing the current attrib buffer< / li >
< li > gallium/u_upload_mgr: reduce dereferences by adding buffer_size< / li >
< li > gallium/u_upload_mgr: don' t do align twice in the u_upload_alloc fast path< / li >
< li > gallium/u_vbuf: adjust the heuristic for unrolling indices< / li >
< li > gallium/cso_hash: inline a bunch of functions< / li >
< li > gallium/cso_hash: make cso_hash declared within structures instead of alloc' d< / li >
< li > gallium/cso_hash: remove always constant variable nodeSize< / li >
< li > gallium/cso_hash: cosmetic changes, no behavior changes< / li >
< li > gallium/cso_hash: remove another layer of pointer indirection< / li >
< li > st/mesa: try to fix MSVC build failure due to ALWAYS_INLINE< / li >
< li > vbo: remove dead code in vbo_can_merge_prims< / li >
< li > vbo: remove redundant code in vbo_exec_fixup_vertex< / li >
< li > mesa: document _mesa_prim::begin/end< / li >
< li > mesa: don' t use memset in glDrawArrays< / li >
< li > mesa: fix immediate mode with tessellation and varying patch vertices< / li >
< li > gallium/util: remove unused u_surfaces.c/h< / li >
< li > util: remove the dependency on kcmp.h< / li >
< li > nir: fix gl_nir_lower_images for bindless images< / li >
< li > tgsi_to_nir: set num_images and num_samplers with holes correctly< / li >
< li > gallium/hash_table: consolidate hash tables with pointer keys< / li >
< li > gallium/hash_table: consolidate hash tables with FD keys< / li >
< li > gallium/hash_table: use the same callback signatures as util/hash_table< / li >
< li > gallium/hash_table: turn it into a wrapper around util/hash_table< / li >
< li > gallium/hash_table: remove some function wrappers< / li >
< li > mesa: remove leftovers from ARB_shadow_ambient< / li >
< li > mesa: call FLUSH_VERTICES before updating CoordReplace< / li >
< li > i965: stop using " indirect" parameter from Driver.Draw (non-indirect)< / li >
< li > mesa: remove unused " indirect" parameter from Driver.Draw< / li >
< li > gallium/cso_hash: pack cso_node better< / li >
< li > gallium/cso_hash: inline struct cso_hash_data< / li >
< li > gallium: pass cso_velems_state into cso_context instead of pipe_vertex_element< / li >
< li > gallium/u_threaded: fix uploading user indices with start != 0< / li >
< li > gallium/u_threaded: convert dividing by index_size to a bit shift< / li >
< li > mesa/i965: remove _mesa_prim::indirect_offset< / li >
< li > mesa: remove redundant _mesa_prim::is_indexed< / li >
< li > mesa: move num_instances and base_instance out of _mesa_prim< / li >
< li > mesa: clean up glMultiDrawElements code, use alloca for small draw count (v2)< / li >
< li > mesa: don' t unroll glMultiDrawElements if one count is 0< / li >
< li > mesa: optimize glMultiDrawArrays, call Draw only once (v2)< / li >
< li > mesa: fix incorrect prim.begin/end for glMultiDrawElements< / li >
< li > nir: replace GCC unroll with an option that works on GCC < 8.0< / li >
< li > gallivm: fix 5 warnings< / li >
< li > nir: fix 5 warnings< / li >
< li > mesa: fix 11 warnings< / li >
< li > gallium/u_vbuf: silence a warning by using unreachable< / li >
< li > mesa: add index_size_shift = log2(index_size) into _mesa_index_buffer< / li >
< li > mesa: replace some index_size multiplications and divisions with shifts< / li >
< li > vbo: don' t look at the second draw' s count when merging 2 glBegin/End draws< / li >
< li > vbo: deduplicate copy_vertices functions< / li >
< li > vbo: clean up vbo_copy_vertices< / li >
< li > vbo: handle GS and tess primitive types when splitting Begin/End< / li >
< li > vbo: clean up conditional blocks in ATTR_UNION< / li >
< li > vbo: fold code from vbo_exec_fixup_vertex to vbo_exec_wrap_upgrade_vertex< / li >
< li > Revert " mesa: check for z=0 in _mesa_Vertex3dv()" < / li >
< li > mesa: remove _mesa_index_buffer::index_size in favor of index_size_shift< / li >
< li > mesa: optimize get_index_size< / li >
< li > mesa: deduplicate draw indirect functions< / li >
< li > vbo: merge more primitive types for glBegin/End (v2)< / li >
< li > vbo: merge draws even when begin==0 or end==0< / li >
< li > glthread: don' t generate the sync fallback if the call size is not variable< / li >
< li > glthread: don' t prefix variable_data with const< / li >
< li > glthread: inline _mesa_unmarshal_dispatch_cmd and convert the switch to a table< / li >
< li > glthread: reduce pointer dereferences in glthread_unmarshal_batch< / li >
< li > glthread: use int instead of size_t where it' s OK< / li >
< li > glthread: simplify repeated function sequences in marshal_generated.c< / li >
< li > glthread: don' t insert _mesa_post_marshal_hook into every function< / li >
< li > glthread: don' t increment variable_data if it' s the last variable-size param< / li >
< li > glthread: add GL_DRAW_INDIRECT_BUFFER tracking and generator support< / li >
< li > glthread: add/update count and marshal fields for many GL functions< / li >
< li > glthread: handle complex pointer parameters and support GL functions with strings< / li >
< li > glthread: check the size of all variable params and clean up the code< / li >
< li > glthread: replace custom ClearBuffer marshalling with generated one< / li >
< li > glthread: add support for TexParameteri and SamplerParameteri functions< / li >
< li > glthread: add support for glFog, glLight, glLightModel, glTexEnv, glTexGen< / li >
< li > glthread: add support for glClearNamedFramebuffer, glMaterial, glPointParameter< / li >
< li > glthread: add support for glCallLists, glPatchParameterfv< / li >
< li > glthread: add support for glMemoryObjectParameteriv, glSemaphoreParameterui64v< / li >
< li > glthread: don' t insert an empty line after (void) cmd;< / li >
< li > glthread: add marshal_call_after and remove custom glFlush and glEnable code< / li >
< li > glthread: track for each VAO whether the user has set a user pointer< / li >
< li > glthread: sync instead of disabling glthread for non-VBO pointers< / li >
< li > glthread: replace custom glBindBuffer marshalling with generated one< / li >
< li > glthread: merge glBufferData and glNamedBufferData into 1 set of functions< / li >
< li > glthread: merge glBufferSubData and glNamedBufferSubData into 1 set of functions< / li >
< li > glthread: add custom marshalling for glNamedBuffer(Sub)DataEXT< / li >
< li > glthread: fix a crash with incorrect glShaderSource parameters< / li >
< li > glthread: fall back if a param size is non-zero and a pointer param is NULL< / li >
< li > radeonsi: add a bug workaround for NGG - LATE_ALLOC_GS< / li >
< li > ac: add a bug workaround for the 100% NGG culling case< / li >
< li > radeonsi: determine uses_bindless_samplers correctly< / li >
< li > st/mesa: flush the bitmap cache before st/dri and vbo flushes< / li >
< li > st/mesa: fix a possible crash with selection and feedback modes< / li >
< li > gallium/cso_context: remove cso_delete_xxx_shader helpers to fix the live cache< / li >
< li > st/mesa: keep serialized NIR instead of nir_shader in st_program< / li >
< li > vbo: use vbo_exec_wrap_upgrade_vertex for glVertex in ATTR_UNION< / li >
< li > vbo: fix transitions from glVertexN to glVertexM where M < N< / li >
< li > vbo: fix vbo_copy_vertices for GL_PATCHES and adjacency primitive types< / li >
< li > gallium: add PIPE_CAP_DRAW_INFO_START_WITH_USER_INDICES< / li >
< li > mesa: don' t unroll glMultiDrawElements with user indices for gallium< / li >
< li > radeonsi/gfx10: cache metadata in L2 on small chips< / li >
< li > radeonsi: set better tessellation tunables on gfx9 and gfx10< / li >
< li > radeonsi: tune primitive binning for small chips< / li >
< li > ac: add radeon_info::use_late_alloc to control LATE_ALLOC globally< / li >
< li > ac: disable late alloc on small gfx10 chips< / li >
< li > gallium/u_threaded: don' t sync the thread for all unsynchronized mappings< / li >
< li > gallium/u_vbuf: simplify the first if statement in u_vbuf_upload_buffers< / li >
< li > ac: unify denorm setting enforcement< / li >
< li > ac: set new LLVM denormal flags< / li >
< li > ac: don' t set old denormals flags with LLVM > = 11< / li >
< li > nir: fix clip/cull_distance_array_size in nir_lower_clip_cull_distance_arrays< / li >
< li > mesa: use vbo_attrib_tmp.h to generate display list vertex attrib functions< / li >
< li > mesa: remove redundant api_loopback functions< / li >
< li > glthread: align the batch buffer to 8 bytes for pointers and doubles again< / li >
< li > glthread: enable display lists< / li >
< li > glthread: track VAOs created by CreateVertexArrays< / li >
< li > glthread: don' t execute any custom VAO and BindBuffer code in the Core profile< / li >
< li > glthread: remove debug_print_marshal function< / li >
< li > glthread: clean up debug_print_sync code< / li >
< li > glthread: don' t declare unmarshal functions as inline< / li >
< li > winsys/radeon: change to 3-space indentation< / li >
< li > driconf: enable glthread for " From The Depths" < / li >
< li > glthread: remove _mesa_post_marshal_hook, because it' s not very useful< / li >
< li > glthread: simplify printing safe_mul in gl_marshal.py< / li >
< li > glthread: autogenerate prototypes for custom-marshalled functions< / li >
< li > glthread: move buffer functions into glthread_bufferobj.c< / li >
< li > glthread: rename marshal.h/c to glthread_marshal.h and glthread_shaderobj.c< / li >
< li > mesa: put gl_thread_state inside gl_context to remove pointer indirection< / li >
< li > glthread: handle buffer unbinding via glDeleteBuffers< / li >
< li > glthread: rename non_vbo helper functions< / li >
< li > glthread: track which vertex array attribs are enabled< / li >
< li > glthread: ignore vertex arrays with user pointers if they' re disabled< / li >
< li > glthread: remove the marshal_fail XML attribute< / li >
< li > vbo,gallium: make glBegin/End buffer size configurable by drivers< / li >
< li > ac: fix fast division< / li >
< li > st/mesa: fix use of uninitialized memory due to st_nir_lower_builtin< / li >
< li > glthread: inline SET_func and add -O1 to build _mesa_create_marshal_table faster< / li >
< li > glthread: declare marshal and unmarshal functions as non-static< / li >
< li > glthread: compile marshal_generated.c faster by breaking it up into 8 files< / li >
< li > nir: add and gather shader_info::writes_memory< / li >
< li > glsl_to_tgsi: set shader_info::writes_memory< / li >
< li > mesa: allow out-of-order drawing to optimize immediate mode if it' s safe< / li >
< li > radeonsi: enable full out-of-order drawing when allow_draw_out_of_order is set< / li >
< li > mesa: try to fix the android build< / li >
< li > Move compiler.h and imports.h/c from src/mesa/main into src/util< / li >
< li > mesa: don' t use < > for including internal headers< / li >
< li > util: stop including files from mesa/main< / li >
< li > radv: stop including files from mesa/main< / li >
< li > util: don' t include p_defines.h and u_pointer.h from gallium< / li >
< li > util: remove duplicated MALLOC_STRUCT and CALLOC_STRUCT< / li >
< li > radeonsi: remove obsolete TODO comment related to compute-based culling< / li >
< li > radeonsi: fix incorrect ordered_wave_id initialization for compute-based culling< / li >
< li > radeonsi: set amdgpu-gds-size for mode == 2 of compute-based culling< / li >
< li > radeonsi: always create wait_mem_scratch for compute-based culling< / li >
< li > radeonsi: add num_vbos_in_user_sgprs into the shader cache key< / li >
< li > radeonsi/gfx10: don' t use NGG culling if compute-based culling is used< / li >
< li > radeonsi/gfx10: fix ds.ordered.add intrinsic for compute-based culling< / li >
< li > radeonsi/gfx10: use correct ACQUIRE_MEM packet for compute-based culling< / li >
< li > radeonsi/gfx10: fix the wave size for compute-based culling< / li >
< li > radeonsi/gfx10: fix descriptors and compute registers for compute-based culling< / li >
< li > gallium/u_threaded: call the driver to pin threads to L3 immediately< / li >
< li > st/mesa: add environment variable pin_app_thread for faster glthread on AMD Zen< / li >
< li > driconf: whitelist more games for glthread< / li >
< li > mesa: optimize initialization of new VAOs< / li >
< li > mesa: don' t ever set NullBufferObj in gl_vertex_array_binding< / li >
< li > mesa: don' t ever bind NullBufferObj for glBindBuffer targets< / li >
< li > mesa: don' t ever bind NullBufferObj to glBindBuffer(Base,Range) slots< / li >
< li > mesa: remove NullBufferObj< / li >
< li > mesa: remove no longer needed _mesa_is_bufferobj function< / li >
< li > mesa: precompute _mesa_primitive_restart_index during state changes< / li >
< li > mesa: split _mesa_primitive_restart_index into a function without gl_context< / li >
< li > vbo: expose helper function vbo_get_minmax_index_mapped for glthread< / li >
< li > util: move and adjust the vertex upload heuristic equation from u_vbuf< / li >
< li > st/mesa: fix a crash due to passing a draw vertex shader into the driver< / li >
< li > ac: out-of-order rasterization is not supported on gfx10< / li >
< li > ac,radeonsi: simplify checking for Navi1x chips< / li >
< li > radeonsi: use pipe_blend_state::max_rt to update fewer blend registers< / li >
< li > ac: force enable -structurizecfg-skip-uniform-regions for LLVM 11< / li >
< li > ac: update and document fast math flags used by radeonsi< / li >
< li > ac: generate FMA for inexact instructions for radeonsi< / li >
< li > ac: reassociate FP expressions for inexact instructions for radeonsi< / li >
< li > mesa: replace _NEW_EVAL with vbo_exec_update_eval_maps< / li >
< li > mesa: reset primitive restart state in glClientAttribDefaultEXT< / li >
< li > mesa: remove exec=" dynamic" from Draw functions that are not really dynamic< / li >
< li > glthread: use 32-bit align instead of 64-bit ALIGN< / li >
< li > glthread: reduce dereferences of the next batch< / li >
< li > glthread: use GLenum16 in batch buffers to save space< / li >
< li > glthread: sort variables in marshal structures to pack them optimally< / li >
< li > gallium: add PIPE_CAP_MAP_UNSYNCHRONIZED_THREAD_SAFE for glthread< / li >
< li > mesa: add Const.BufferCreateMapUnsynchronizedThreadSafe & MESA_MAP_THREAD_SAFE< / li >
< li > mesa: add offset_is_int32 param into _mesa_bind_vertex_buffer for glthread< / li >
< li > mesa: extend _mesa_bind_vertex_buffer to take ownership of the buffer reference< / li >
< li > mesa: replace GLenum target with gl_shader_stage in NewProgram< / li >
< li > ac/surface: rename micro tile mode enums like gfx10 uses them< / li >
< li > ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it' s always set< / li >
< li > ac/surface: replace RADEON_SURF_OPTIMIZE_FOR_SPACE with !FORCE_SWIZZLE_MODE< / li >
< li > ac/surface: match get_display_flag() with expectations for is_displayable< / li >
< li > ac/surface: don' t compute DCC if it' s unsupported by DCN on gfx9+< / li >
< li > ac/surface: move non-displayable DCC to the end of the buffer< / li >
< li > ac/surface: add code for gfx10 displayable DCC< / li >
< li > ac/surface: validate that DCC is enabled correctly on gfx9+< / li >
< li > ac: enable displayable DCC on Navi12 & Navi14< / li >
< li > mesa: report GL_INVALID_OPERATION for invalid glTextureBuffer target< / li >
< li > st/mesa: expose more SPIR-V capabilities< / li >
< li > radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size< / li >
< li > radeonsi: revert an accidental change in si_clear_buffer< / li >
< li > Revert " ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it' s always set" < / li >
< li > Revert " ac: reassociate FP expressions for inexact instructions for radeonsi" < / li >
< li > ac/surface: fix MSAA crash with FORCE_SWIZZLE_MODE on gfx9< / li >
< li > radeonsi: fix compilation of monolithic PS< / li >
< li > radeonsi: don' t expose 16xAA on chips with 1 RB due to an occlusion query issue< / li >
< p > < / p >
< p > Marek Vasut (4):< / p >
< li > etnaviv: Destroy rsc-> pending_ctx set in etna_resource_destroy()< / li >
< li > etnaviv: Emit PE.ALPHA_COLOR_EXT* on GPUs with half-float support< / li >
< li > etnaviv: Fix depth stencil ops on GC880/GC2000< / li >
< li > etnaviv: Disable seamless cube map on GC880< / li >
< p > < / p >
< p > Mark Janes (2):< / p >
< li > nir: check shader type before writing to shaderinfo.tess union< / li >
< li > nir: place aligned members after bitfields in shader_info.tess< / li >
< p > < / p >
< p > Mark Menzynski (2):< / p >
< li > util/blob: Add overwrite function for uint8< / li >
< li > tgsi/util: Change boolean for bool< / li >
< p > < / p >
< p > Martin Fuzzey (3):< / p >
< li > freedreno: android: fix build failure on android due to python version< / li >
< li > freedreno: android: add a6xx-pack.xml.h generation to android build< / li >
< li > freedreno: android: fix build of perfcounters.< / li >
< p > < / p >
< p > Mathias Fröhlich (19):< / p >
< li > egl: Implement getImage/putImage on pbuffer swrast.< / li >
< li > mesa: Fix FLUSH_VERTICES in SubpixelPrecisionBiasNV.< / li >
< li > egl: Fix A2RGB10 platform_{device,surfaceless} PBuffer configs.< / li >
< li > egl: Factor out dri2_add_pbuffer_configs_for_visuals {device,surfaceless}.< / li >
< li > mesa: Check for OpenGL state change before flushing vertices.< / li >
< li > mesa: Flush vertices before changing the OpenGL state.< / li >
< li > i965: Move down genX_upload_sbe in profiles.< / li >
< li > iris: Move down iris_emit_sbe_swiz in profiles.< / li >
< li > i965: Use 32 bit u_bit_scan for vertex attribute setup.< / li >
< li > i965: Use the VAOs binding information in array setup.< / li >
< li > i965: Test original vertex array pointer to skip array upload.< / li >
< li > i965: Split merge_inputs and clear_buffers.< / li >
< li > i965: Reorder workaround flags computation.< / li >
< li > i965: Remove glbinding from brw_vertex_element.< / li >
< li > mesa: Remove now unused _mesa_draw_attrib_and_binding.< / li >
< li > mesa: Remove now unused _mesa_draw_attrib.< / li >
< li > mesa: Provide gl_vertex_format accessors.< / li >
< li > i965: Make use of the vertex format functions in i965.< / li >
< li > i965: Use gl_vertex_format in brw_vertex_element.< / li >
< p > < / p >
< p > Matt Turner (11):< / p >
< li > intel/tools: Do not print type/qualifiers/name for c_literal< / li >
< li > intel/vec4: Make implied_mrf_writes() a vec4_instruction method< / li >
< li > intel/compiler: Remove unnecessary local variables< / li >
< li > intel/compiler: Make instructions_to_schedule a local variable< / li >
< li > intel/compiler: Mark some methods and parameters const< / li >
< li > intel/compiler: Mark visitor parameters to scheduler const< / li >
< li > intel/compiler: Pass backend_shader * to cfg_t()< / li >
< li > intel/compiler: Pass shader_stats for each SIMD mode< / li >
< li > intel/compiler: Discount NOPs from instruction counts< / li >
< li > isl: Avoid EXPECT_DEATH in unit tests< / li >
< li > meson: Specify the maximum required libdrm in dri.pc< / li >
< p > < / p >
< p > Mauro Rossi (5):< / p >
< li > android: gallium/auxiliary: fix " Unused source files" in tesselator< / li >
< li > android: aco: fix PIPE_FORMAT related building errors< / li >
< li > android: r600/sfn: fix includes and libmesa_nir dependency< / li >
< li > android: r600/sfn: Add GDS instructions< / li >
< li > android: aco: add various compiler statistics< / li >
< p > < / p >
< p > Michel Dänzer (33):< / p >
< li > gitlab-ci: Update to latest ci-templates HEAD< / li >
< li > gitlab-ci: Pass -j4 to make< / li >
< li > gitlab-ci: Merge ccache and libxml2-utils into main apt-get install< / li >
< li > gitlab-ci: Add ppc64el and s390x cross-build jobs< / li >
< li > gitlab-ci: Build radeonsi & RADV in the ppc64el job< / li >
< li > llvmpipe: Bump test timeout to 180 seconds< / li >
< li > gitlab-ci: Only use gstreamer runners for the s390x job for now< / li >
< li > gitlab-ci: Sort random failure softpipe skips< / li >
< li > gitlab-ci: Add three more dEQP-GLES31 tests to softpipe skips< / li >
< li > st/vdpau: Only call is_video_format_supported hook if needed< / li >
< li > winsys/amdgpu: Make local variable r signed< / li >
< li > util: Change os_same_file_description return type from bool to int< / li >
< li > gitlab-ci: Drop " test-" prefix from llvmpipe/softpipe job names< / li >
< li > gitlab-ci: Distribute jobs across more stages< / li >
< li > gitlab-ci: Always name artifacts archive after the job producing it< / li >
< li > gitlab-ci: Don' t restrict ppc64el/s390x build jobs to gstreamer runners< / li >
< li > gitlab-ci: Don' t use buster-backports packages by default for x86_build< / li >
< li > gitlab-ci: Fold scons-swr job into scons job< / li >
< li > gitlab-ci: Move classic driver testing to a new meson-classic job< / li >
< li > llvmpipe: Use uintptr_t for pointer values< / li >
< li > gitlab-ci: Enable more Gallium drivers in meson-i386 job< / li >
< li > gitlab-ci: Restrict s390x/ppc64el jobs to packet runners< / li >
< li > gitlab-ci: Update to current templates< / li >
< li > gitlab-ci: Rename " paths" YAML anchor to " all_paths" < / li >
< li > gitlab-ci/lava: Add needs: for container image to test jobs (again)< / li >
< li > gitlab-ci: Don' t require triggering build/test jobs manually< / li >
< li > gitlab-ci: Run merge request pipelines automatically only for Marge Bot< / li >
< li > gitlab-ci: Use all_paths in .test-manual rules< / li >
< li > gbm/dri: Propagate queryDmaBufModifiers return value< / li >
< li > amd/addrlib: Use enum instead of sparse chars to identify dimensions< / li >
< li > mesa: Skip 3-byte array formats in _mesa_array_format_flip_channels< / li >
< li > Revert " ac,radeonsi: fix compilations issues with LLVM 11" < / li >
< li > Revert " gallium/gallivm: fix compilation issues with llvm 11" < / li >
< p > < / p >
< p > Mike Blumenkrantz (6):< / p >
< li > zink: set UBO alignments in nir_intrinsic_load_uniform lowering< / li >
< li > zink: remove framebuffer cache< / li >
< li > zink: explicitly unref old fb object when setting new one< / li >
< li > iris: move iris_vtable to iris_screen< / li >
< li > gallium: add pipe cap for scissored clears and pass scissor state to clear() hook< / li >
< li > iris: handle PIPE_CAP_CLEAR_SCISSORED< / li >
< p > < / p >
< p > Nanley Chery (6):< / p >
< li > isl: Add a module which manages aux resolves< / li >
< li > iris: Use isl_aux_usage_has_fast_clear()< / li >
< li > iris: Use ISL' s access preparation functions< / li >
< li > iris: Use isl_aux_state_transition_write()< / li >
< li > i965: Use ISL' s access preparation functions< / li >
< li > i965: Use isl_aux_state_transition_write()< / li >
< p > < / p >
< p > Nataraj Deshpande (1):< / p >
< li > dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_R8G8B8X8_UNORM< / li >
< p > < / p >
< p > Neha Bhende (2):< / p >
< li > svga: fix size of format_conversion_table[]< / li >
< li > svga: Use pipe_shader_state_from_tgsi to set shader state< / li >
< p > < / p >
< p > Neil Armstrong (4):< / p >
< li > gitlab-ci/lava: fix handling of lava tags< / li >
< li > Revert " ci: Remove T820 from CI temporarily" < / li >
< li > gitlab-ci: add FILES_HOST_URL and move FILES_HOST_NAME into jobs< / li >
< li > gitlab-ci: re-enable mali400/450 and t820 jobs< / li >
< p > < / p >
< p > Neil Roberts (17):< / p >
< li > nir/opcodes: Add nir_op_f2fmp< / li >
< li > glsl: Add support for float16 types in the IR tree< / li >
< li > glsl: Add IR conversion ops for 16-bit float types< / li >
< li > glsl: Add b2f16 and f162b conversion operations< / li >
< li > glsl: Add ir_unop_f2fmp< / li >
< li > glsl/validate: Allow float16 in the expression tree< / li >
< li > glsl/lower_instructions: Use float16 constants when appropriate< / li >
< li > glsl/opt_minmax: Add support for float16< / li >
< li > glsl: Add a method to get precision from a deref instruction< / li >
< li > glsl/hierarchical_visitor: Call leave_callback on leaf nodes< / li >
< li > glsl: Add an IR lowering pass to convert mediump operations to 16-bit< / li >
< li > glsl/standalone: Add an option to lower the precision< / li >
< li > glsl: Add unit tests for the lower_precision pass< / li >
< li > freedreno/ir3: Lower bools to bitsize< / li >
< li > glsl: Inline builtins in a separate pass< / li >
< li > glsl/lower_precision: Lower builtins depending on arguments< / li >
< li > glsl/lower_precision: Use vector.back() instead of vector.end()[-1]< / li >
< p > < / p >
< p > Paulo Zanoni (8):< / p >
< li > intel: fix the gen 11 compute shader scratch IDs< / li >
< li > intel: fix the gen 12 compute shader scratch IDs< / li >
< li > intel/device: bdw_gt1 actually has 6 eus per subslice< / li >
< li > anv: multiply the scratch space by 4 on gen9-10 like iris and i965< / li >
< li > iris: remove hole from struct iris_bo< / li >
< li > iris: remove unnecessary forward declaration< / li >
< li > iris: remove useless bo-> gtt_offset assignment< / li >
< li > iris: make BATCH_SZ smaller by BATCH_RESERVED bytes< / li >
< p > < / p >
< p > Peng Huang (1):< / p >
< li > radeonsi: make si_fence_server_signal flush pipe without work< / li >
< p > < / p >
< p > Pierre Moreau (1):< / p >
< li > clover/nir: Check the result of spirv_to_nir< / li >
< p > < / p >
< p > Pierre-Eric Pelloux-Prayer (44):< / p >
< li > radeonsi/ngg: add VGT_FLUSH when enabling fast launch< / li >
< li > radeonsi: test subsampled format in testdma< / li >
< li > format: add format_to_chroma_format< / li >
< li > gallium/video: remove pipe_video_buffer.chroma_format< / li >
< li > gallium/vl: add 4:2:2 support< / li >
< li > radeonsi: fix surf_pitch for subsampled surface< / li >
< li > st/va: enable 4:2:2 chroma format< / li >
< li > st/va: add support for YUY2< / li >
< li > radeonsi: remove AMD_DEBUG=sisched option< / li >
< li > omx: fix build with gcc 10< / li >
< li > meson: enable -fno-common by default< / li >
< li > gitlab-ci: rules:changes to test on tested drivers changes< / li >
< li > vdpau: remove bogus assert< / li >
< li > st/mesa: disallow deferred flush if there are multiple contexts< / li >
< li > radeonsi: enable glsl_zero_init for Curse of the Dead Gods< / li >
< li > radeonsi: clarify the conditions when FLUSH_AND_INV_DB is needed< / li >
< li > util/os_file: extend os_read_file to return the file size< / li >
< li > util/u_process: add util_get_process_exec_path< / li >
< li > util/xmlconfig: add new sha1 application attribute< / li >
< li > radeonsi: enable workarounds for YoYo engine based games< / li >
< li > util/u_process: fix Windows build< / li >
< li > nir: update uses_demote flag in discard_to_demote pass< / li >
< li > ac: fix ac_build_is_helper_invocation when postponed_kill is null< / li >
< li > util: fix process_test path< / li >
< li > ddebug: add missing forward declaration< / li >
< li > radeon: fix includes< / li >
< li > radeonsi: switch to 3-spaces style< / li >
< li > radeon: switch to 3-spaces style< / li >
< li > gallium/util: let shader live cache users know if a hit occurred< / li >
< li > radeonsi: dump shader stats when hitting the live cache< / li >
< li > util/xmlconfig: fix sha1 comparison code< / li >
< li > mesa: update pipeline when re-linking a program in use< / li >
< li > gallium/u_threaded: flush batch when hitting mapping limit< / li >
< li > radeonsi: use thread_context::bytes_mapped_limit< / li >
< li > radeonsi: don' t assume ctx is always a threaded_context< / li >
< li > radeonsi: skip vs output optimizations for some outputs< / li >
< li > mesa: fix crash in find_value< / li >
< li > gallium/utils: silence strncpy warning< / li >
< li > st/omx: fix gcc warnings< / li >
< li > radeonsi: fix export count< / li >
< li > mesa: add gl_context::ForceIntegerTexNearest< / li >
< li > driconf: add force_integer_tex_nearest option< / li >
< li > radeonsi: don' t print gs_copy_shader stats for shaderdb< / li >
< li > amd/addrlib: fix forgotten char -> enum conversions< / li >
< p > < / p >
< p > Plamena Manolova (2):< / p >
< li > intel/compiler: Add support for variable workgroup size< / li >
< li > i965: Implement ARB_compute_variable_group_size< / li >
< p > < / p >
< p > Qiang Yu (35):< / p >
< li > lima: remove definition of lima_is_scanout< / li >
< li > lima: use util_copy_framebuffer_state< / li >
< li > lima: always add texture bo to submit< / li >
< li > lima: remove lima_ctx_buff_va submit flags (v2)< / li >
< li > lima: pass array as parameter to PLBU and VS command macros< / li >
< li > lima: delay add plb buffer to submit when flush< / li >
< li > lima: delay plbu head command generation to flush stage (v2)< / li >
< li > lima: add render target to submit by dirty buffer flags< / li >
< li > lima: add missing resolve check for damage and reload< / li >
< li > lima: move syncobj from lima_submit to lima_context< / li >
< li > lima: merge gp/pp submit< / li >
< li > lima: put hardware related info to lima_gpu.h< / li >
< li > lima: move flush code to lima_submit.c< / li >
< li > lima: pass submit parameter for functions in lima_submit.c (v2)< / li >
< li > lima: add lima_submit_create_stream_bo< / li >
< li > lima: adjust pp_stream to use lima_submit_create_stream_bo< / li >
< li > lima: use lima_submit_create_stream_bo for plbu/vs_cmd and pp_stack< / li >
< li > lima: add lima_submit_get< / li >
< li > lima: make lima_submit one time use drop data (v3)< / li >
< li > lima: track write submits of context (v3)< / li >
< li > lima: move plbu/vs_cmd_array into lima_submit< / li >
< li > lima: move resolve into lima_submit< / li >
< li > lima: move pp_max_stack_size to lima_submit< / li >
< li > lima: move damage_rect into lima_submit< / li >
< li > lima: move clear into submit (v2)< / li >
< li > lima: move framebuffer info to lima_submit< / li >
< li > lima: use per submit dump file< / li >
< li > lima: optional flush submit in lima_clear< / li >
< li > lima: enable multi submit optimization< / li >
< li > lima: move dump check to macro for lima_dump_command_stream_print< / li >
< li > lima: rename lima_submit to lima_job< / li >
< li > lima: fix buffer import with offset< / li >
< li > lima: also check tiled and depth case when import< / li >
< li > lima: set offset when export resource< / li >
< li > panfrost: don' t always build bifrost_compiler< / li >
< p > < / p >
< p > Quentin Glidic (1):< / p >
< li > meson: Use dependency.partial_dependency()< / li >
< p > < / p >
< p > Rafael Antognolli (18):< / p >
< li > intel: Load the driver even if I915_PARAM_REVISION is not found.< / li >
< li > intel/tools: Update aubinator_error_decode.< / li >
< li > intel/blorp: Implement GEN:BUG:1605967699.< / li >
< li > iris: Apply the flushes when switching pipelines.< / li >
< li > anv: Wait for the GPU to be idle before invalidating the aux table.< / li >
< li > iris: Split aux map initialization from invalidation.< / li >
< li > iris: Wait for the GPU to be idle before invalidating the aux table.< / li >
< li > intel/isl: Implement D16_UNORM workarounds.< / li >
< li > intel/gen12+: Disable mid thread preemption.< / li >
< li > iris: Enable EXT_depth_bounds_test extension.< / li >
< li > drm-uapi: Update headers from Linux 5.7-rc1.< / li >
< li > i965/bufmgr: Factor out GEM_MMAP ioctl from mmap_cpu and mmap_wc.< / li >
< li > iris/bufmgr: Factor out GEM_MMAP ioctl from mmap_cpu and mmap_wc.< / li >
< li > i965/bufmgr: Add support for MMAP_OFFSET ioctl.< / li >
< li > iris/bufmgr: Add support for MMAP_OFFSET ioctl.< / li >
< li > anv: Add anv_device parameter to anv_gem_munmap.< / li >
< li > anv: Add support for new MMAP_OFFSET ioctl.< / li >
< li > anv: Enable HiZ on multi-layer depth buffers.< / li >
< p > < / p >
< p > Rhys Perry (118):< / p >
< li > aco: fix gfx10_wave64_bpermute< / li >
< li > aco: gfx10_wave64_bpermute reduce op to print_ir< / li >
< li > aco: disable some instruction combining if it could change an exec operand< / li >
< li > aco: improve SCC handling in some SALU combines< / li >
< li > nir: fix nir_const_value_as_uint bit size in load/store vectorizer tests< / li >
< li > gitlab-ci: remove load_store_vectorizer from expected s390x test failures< / li >
< li > aco: add RegisterFile< / li >
< li > aco: add some helpers for filling/testing register ranges< / li >
< li > aco: improve GFX9 1D ddx/ddy assertion< / li >
< li > spirv: improve creation of memory_barrier< / li >
< li > spirv: fix memory_barrier_tcs_patch emission< / li >
< li > aco: keep track of which events are used in a barrier< / li >
< li > aco: fix carry-out size for wave32 v_add_co_u32_e64< / li >
< li > aco: handle v_add_co_u32_e64 in parse_base_offset()< / li >
< li > aco: add new NOP insertion pass for GFX6-9< / li >
< li > aco: improve get_wait_states()< / li >
< li > aco: consider non-hazard writes in handle_raw_hazard_internal< / li >
< li > aco: improve control flow handling in GFX6-9 NOP pass< / li >
< li > aco: only reserve sgprs for vcc if it's used< / li >
< li > aco: fix uninitialized data error in waitcnt pass< / li >
< li > glsl/list: use uintptr_t for exec_node_data()'s subtraction< / li >
< li > aco: add helpers for moving instructions for scheduling< / li >
< li > aco: add helpers for ensuring correct ordering while scheduling< / li >
< li > aco: allow barriers to be skipped during scheduling< / li >
< li > aco: don't stop scheduling at exports< / li >
< li > aco: move some register demand helpers into aco_live_var_analysis.cpp< / li >
< li > aco: add a late kill flag< / li >
< li > aco: set late kill for v_interp_p1_f32 for some APUs< / li >
< li > aco: fix instruction encoding for LS VGPR init bug workaround< / li >
< li > aco: fix operand order for LS VGPR init bug workaround< / li >
< li > nir/gather_info: handle emit_vertex_with_counter< / li >
< li > radv: call nir_shader_gather_info again< / li >
< li > radv/winsys: set has_syncobj_wait_for_submit in the null winsys< / li >
< li > aco: set has_divergent_branch for discards in loops< / li >
< li > aco: handle missing second predecessors at merge block phis< / li >
< li > aco: handle when ACO adds new continue edges< / li >
< li > aco: skip NIR in unreachable merge blocks< / li >
< li > aco: improve check for unreachable loop continue blocks< / li >
< li > aco: emit IR in IF's merge block instead if the other side ends in a jump< / li >
< li > aco: fix boolean undef regclass< / li >
< li > nir/gather_info: fix per-vertex handling in try_mask_partial_io< / li >
< li > aco: remove dead code in handle_operands()< / li >
< li > aco: implement 64-bit VGPR constant copies in handle_operands()< / li >
< li > aco: look at p_{extract,split}_vector's definitions in pred_by_exec_mask()< / li >
< li > glsl: fix race in instance getters< / li >
< li > util/u_queue: fix race in total_jobs_size access< / li >
< li > radv: add code for exposing compiler statistics< / li >
< li > aco: add various compiler statistics< / li >
< li > aco: add vmem/smem score statistic< / li >
< li > radv, aco: collect statistics if requested but executables are not< / li >
< li > radv: fix null winsys gpu_info array< / li >
< li > aco: make PhysReg in units of bytes< / li >
< li > aco: add SDWA_instruction< / li >
< li > aco: print and validate opsel< / li >
< li > aco: add emission support for register-allocated sdwa sels< / li >
< li > aco: remove divergence check in sanitize_if()< / li >
< li > aco: zero-initialize Temp< / li >
< li > aco: improve vector optimization with sub-dword vectors< / li >
< li > aco: fix p_extract_vector validation< / li >
< li > aco: improve p_create_vector RA for sub-dword operands< / li >
< li > aco: clear moved operands in get_reg_create_vector()< / li >
< li > aco: fix 1D textureGrad() on GFX9< / li >
< li > aco: implement various 8/16-bit conversions< / li >
< li > aco: add missing scc clobber to nir_op_unpack_32_2x16_split_y< / li >
< li > aco: fix copy statistic for 64-bit vgpr constant copy< / li >
< li > aco: add VOP3P_instruction< / li >
< li > aco: implement sub-dword swaps< / li >
< li > aco: implement 64-bit sgpr swaps< / li >
< li > nir/lower_bit_size: fix lowering of shifts< / li >
< li > nir/lower_bit_size: fix lowering of {imul,umul}_high< / li >
< li > nir/algebraic: don't undo lowering of 8/16-bit comparisons to 32-bit< / li >
< li > aco: decrease the uses of other copy operations after splitting/removing< / li >
< li > aco: copy-propagate p_create_vector copies of vectors< / li >
< li > aco: remove copy in load_input_from_temps()< / li >
< li > aco: move call to store_output_to_temps in store_ls_or_es_output earlier< / li >
< li > aco: combine VALU and SALU into various VOP3 instructions< / li >
< li > aco: improve code for 32-bit isign< / li >
< li > aco: fix v_or(s_lshl) and v_add(s_lshl) optimizations< / li >
< li > aco: fix outdated label_vec from p_create_vector labelling< / li >
< li > radv: align buffer descriptor sizes to dword< / li >
< li > radv: allocate larger shader memory slabs if needed< / li >
< li > aco: be more careful about using SMEM for load_global< / li >
< li > aco: add and use RegClass::get() helper< / li >
< li > aco: add emit_load helper< / li >
< li > aco: refactor load_lds to use new helpers< / li >
< li > aco: use emit_load helper for VMEM/SMEM loads< / li >
< li > aco: add helpers for splitting stores< / li >
< li > aco: refactor store_lds() to use new helpers< / li >
< li > aco: refactor store_vmem_mubuf() to use new helpers< / li >
< li > aco: refactor visit_store_ssbo() to use new helpers< / li >
< li > aco: refactor visit_store_global() to use new helpers< / li >
< li > aco: refactor visit_store_scratch() to use new helpers< / li >
< li > aco: add and use get_buffer_store_op() helper< / li >
< li > aco: allow 8/16-bit shared loads< / li >
< li > aco: vectorize global loads/stores< / li >
< li > aco: handle undef p_create_vector operands in the optimizer< / li >
< li > aco: clobber scc in s_bfe_u32 in get_alu_src()< / li >
< li > aco: improve sub-dword emit_split_vector() with sgprs< / li >
< li > aco: lower 8/16-bit integer arithmetic< / li >
< li > radv/aco: enable 8/16-bit storage and int8/int16 on GFX8+< / li >
< li > aco: make RegisterFile::block() take a regclass< / li >
< li > aco: check alignment of non-subdword registers in get_reg_specified()< / li >
< li > aco: fix neighboring register check in get_reg_simple()< / li >
< li > aco: split self-intersecting copies instead of swapping< / li >
< li > aco: don't recurse in sub-dword get_reg_simple()< / li >
< li > aco: improve RA for uneven p_split_vector< / li >
< li > aco: add missing adjust_max_used_regs()< / li >
< li > aco: fix sub-dword out-of-bounds check in RA validator< / li >
< li > aco: fix sub-dword overwrite check in RA validator< / li >
< li > aco: add various GFX10 int16 opcodes< / li >
< li > aco: improve clamped integer addition disassembly workaround< / li >
< li > aco: fix vgpr nir_op_vecn with sgpr operands< / li >
< li > aco: consider blocks unreachable if they are in the logical cfg< / li >
< li > aco: remove use of f-strings< / li >
< li > aco: add message to static_assert< / li >
< li > nir: add missing group_memory_barrier handling< / li >
< li > nir/opt_if: run opt_peel_loop_initial_if after all other optimizations< / li >
< li > nir: fix lowering to scratch with boolean access< / li >
< p > < / p >
< p > Rob Clark (147):< / p >
< li > freedreno/drm: readonly cmdstream< / li >
< li > freedreno/ir3: shuffle a few ir3_register fields< / li >
< li > freedreno/ir3: cleanup after lower_locals_to_regs< / li >
< li > freedreno/ir3: fix crash when no non-input instructions< / li >
< li > freedreno/ir3: split out delay helpers< / li >
< li > freedreno/ir3: move nop padding to legalize< / li >
< li > freedreno/ir3: move block-scheduling into legalize< / li >
< li > freedreno/ir3: move atomic fixup after RA< / li >
< li > freedreno/ir3: a bit more optmsgs debug< / li >
< li > freedreno/ir3/ra: make use()/def() functions instead of macros< / li >
< li > freedreno/ir3: fix kill scheduling< / li >
< li > freedreno/ir3: post-RA sched pass< / li >
< li > freedreno/ir3: number instructions from one< / li >
< li > freedreno/ir3: add is_tex_or_prefetch()< / li >
< li > freedreno/ir3: don't precolor unused inputs< / li >
< li > freedreno/ir3: two pass register allocation< / li >
< li > freedreno/a6xx: fix lrz overflow< / li >
< li > freedreno/ir3: add RA sanity check< / li >
< li > freedreno/ir3: remove unused tex arg harder< / li >
< li > freedreno/ir3: create fragcoord instructions in input block< / li >
< li > freedreno/ir3: simplify split from collect< / li >
< li > freedreno/ir3: fix a dirty lie< / li >
< li > freedreno: allow ctx->batch to be NULL< / li >
< li > freedreno/ir3: fold const conversion into consumer< / li >
< li > freedreno: allow INVALID modifier< / li >
< li > freedreno/registers: teach gen_header.py about a3xx_regid< / li >
< li > freedreno/a6xx: few register updates< / li >
< li > freedreno: quiet INFO_MSG< / li >
< li > freedreno/registers: cleanup CP_SET_MARKER< / li >
< li > freedreno/computerator: import parser/lexer from fdre-a3xx< / li >
< li > freedreno/computerator: polish out some of the rust< / li >
< li > freedreno/computerator: rename prefix asm->ir3< / li >
< li > freedreno/ir3: allow block->predecessors to be null< / li >
< li > freedreno/computerator: add computerator< / li >
< li > freedreno/computerator: fix build dependency< / li >
< li > freedreno/ir3: remove from_tgsi< / li >
< li > freedreno/a6xx: remove unused param< / li >
< li > freedreno/a6xx: emit LRZ clear in sysmem too< / li >
< li > freedreno/a6xx: whitespace fix< / li >
< li > freedreno/a6xx: don't emit YIELD packet< / li >
< li > freedreno/a6xx: enable SKIP_IB2_ENABLE properly< / li >
< li > freedreno: honor FD_MESA_DEBUG=nogrow< / li >
< li > freedreno/ir3: remove regmask_set_if_not()< / li >
< li > freedreno/ir3: rewrite regmask to better support a6xx+< / li >
< li > freedreno/ir3: don't hide latency when there is none to hide< / li >
< li > freedreno/ir3: track half-precision live values< / li >
< li > freedreno/ir3: update SFU delay< / li >
< li > freedreno/ir3: fix crash with samgq workaround< / li >
< li > freedreno/ir3: don't precolor unassigned inputs< / li >
< li > freedreno/ir3: fix assert with getinfo< / li >
< li > freedreno/ir3: add assert< / li >
< li > nir/print: show variable precision< / li >
< li > freedreno/ir3: also lower lowp frag outputs< / li >
< li > freedreno/computerator: add hrsq/hlog2/hexp2< / li >
< li > freedreno/ir3: remove extra nops inserted in scheduler< / li >
< li > freedreno/ir3: add simplified stall estimation< / li >
< li > freedreno: fix FD_MESA_DEBUG=inorder< / li >
< li > util/ra: spiff out select_reg_callback< / li >
< li > util/ra: move NO_REG to header< / li >
< li > freedreno/ir3: split out has_latency_to_hide()< / li >
< li > freedreno/ir3: fix has_latency_to_hide< / li >
< li > freedreno/ir3: track register usage in first RA pass< / li >
< li > freedreno/ir3: round-robin RA< / li >
< li > freedreno/ir3: try to avoid syncs< / li >
< li > freedreno/computerator: add performance counter support< / li >
< li > freedreno/fdperf: set locale< / li >
< li > freedreno/a6xx: register update< / li >
< li > freedreno/ir3: small cleanup and comments< / li >
< li > freedreno/ir3: add bary_ij as src for meta:tex_prefetch< / li >
< li > freedreno/ir3: remove unused helper< / li >
< li > freedreno/ir3: fix bogus register footprint with tess/gs< / li >
< li > freedreno/ir3: reformat disasm output< / li >
< li > freedreno/ir3: convert debug bitfield to BITFIELD_BIT()< / li >
< li > freedreno/ir3/ra: add debug option for RA debug msgs< / li >
< li > freedreno/ir3/ra: split-up< / li >
< li > freedreno/ir3/ra: add helper to map name to instruction< / li >
< li > freedreno/ir3/ra: fix target register calculation< / li >
< li > freedreno/ir3/ra: add helper to map name to array< / li >
< li > freedreno/ir3/ra: drop extending output live-ranges< / li >
< li > freedreno/ir3/ra: add def/use iterators< / li >
< li > freedreno/ir3/ra: fix array liveranges< / li >
< li > freedreno/ir3/ra: compute register target from liveranges< / li >
< li > freedreno/ir3/ra: pick higher numbered scalars in first pass< / li >
< li > freedreno/ir3/ra: split building regs/classes and conflicts< / li >
< li > freedreno/ir3/ra: re-work a6xx merged register file conflicts< / li >
< li > gitlab-ci: disable vs2019 build< / li >
< li > freedreno: remove some obsolete debug options< / li >
< li > util: fix u_fifo_pop()< / li >
< li > freedreno: add logging infrastructure< / li >
< li > freedreno/a6xx: timestamp logging support< / li >
< li > freedreno: add some initial fd_log tracepoints< / li >
< li > freedreno/a6xx: add some more tracepoints< / li >
< li > freedreno/log: avoid duplicate ts's< / li >
< li > util: move ALIGN/ROUND_DOWN_TO to u_math.h< / li >
< li > freedreno/ir3: fix android build< / li >
< li > freedreno/log: fix build error< / li >
< li > nir: fix definition of imadsh_mix16 for vectors< / li >
< li > freedreno/ir3/cf: handle widening too< / li >
< li > freedreno/ir3: fixup cat3 32b vs 16b< / li >
< li > freedreno/ir3/cf: skip array load/store< / li >
< li > freedreno/ir3: add a pass to collect SSA uses< / li >
< li > freedreno/ir3/cf: use ssa-uses< / li >
< li > freedreno/a6xx: add some compute logging< / li >
< li > freedreno: fix missing locking< / li >
< li > freedreno/ir3: also precompile compute shaders for shaderdb< / li >
< li > freedreno: limit fp16 to frag and compute< / li >
< li > glsl: don't limit fp16 lowering to frag< / li >
< li > nir: add some swizzle helpers< / li >
< li > nir/lower_amul: fix slot calculation< / li >
< li > freedreno/log: android support< / li >
< li > freedreno/log: spiff out parser some more< / li >
< li > freedreno/log: better decoding for multiple chunks per batch< / li >
< li > freedreno/ir3: spiff out disasm a bit< / li >
< li > freedreno/ir3: make falsedep use's optional< / li >
< li > freedreno/ir3: simplify grouping pass< / li >
< li > freedreno/ir3: fix location of inserted mov's< / li >
< li > freedreno/ir3: new pre-RA scheduler< / li >
< li > freedreno/ir3/sched: awareness of partial liveness< / li >
< li > freedreno/ir3/postsched: remove some leftovers< / li >
< li > freedreno/ir3/postsched: avoid moving tex ahead of kill< / li >
< li > freedreno/ir3: add mov/cov stats< / li >
< li > freedreno/ir3/ra: handle array case for SFU select_reg opt< / li >
< li > freedreno/ir3: better cleanup when removing unused instructions< / li >
< li > freedreno/ir3: rename depth->dce< / li >
< li > freedreno/ir3/ra: cleanup some leftovers< / li >
< li > mesa: avoid redundant VBO updates< / li >
< li > mesa/st: avoid u_vbuf for GLES< / li >
< li > gallium: add # of MRT to blend state< / li >
< li > freedreno/computer: add script to test widening/narrowing< / li >
< li > freedreno/ir3/ra: remove unused variable< / li >
< li > freedreno/ir3/ra: use ir3_debug_print helper< / li >
< li > freedreno/ir3/ra: split out helper for array assignment< / li >
< li > freedreno/ir3/ra: only assign array base in first pass< / li >
< li > freedreno/a6xx+tu: rename VSC_DATA/VSC_DATA2< / li >
< li > freedreno: add helper to estimate # of bins per pipe< / li >
< li > freedreno/a6xx: pre-calculate expected vsc stream sizes< / li >
< li > freedreno/log-parser: support to read gzip'd logs< / li >
< li > freedreno: small whitespace fix< / li >
< li > freedreno: don't realloc idle bo's< / li >
< li > freedreno: mark more state dirty when rebinding resources< / li >
< li > freedreno: optimize rebind_resource()< / li >
< li > freedreno: rebind resource in all contexts< / li >
< li > freedreno: rebind_resource() *before* bo changes< / li >
< li > freedreno/a6xx: invalidate tex state cache entries on rebind< / li >
< li > freedreno: fix buffer import< / li >
< li > freedreno/ir3: fix indirect cb0 load_ubo lowering< / li >
< li > freedreno: clear last_fence after resource tracking< / li >
< p > < / p >
< p > Rohan Garg (5):< / p >
< li > ci: Split out radv build-testing on arm64< / li >
< li > ci: Drop the git dependency in tracie< / li >
< li > tracie: Switch to using shutil.move for cross filesystem moves< / li >
< li > tracie: Print results in a machine readable format< / li >
< li > tracie: Reformat code to fix indentation< / li >
< p > < / p >
< p > Roland Scheidegger (7):< / p >
< li > gallivm: fix crash with bptc border color sampling< / li >
< li > gallivm: fix crash in emit_get_buffer_size< / li >
< li > gallivm: disable rgtc/latc SNORM accelerated fetches< / li >
< li > gallium/util: Add back (and rename) util_float_to_half implementation< / li >
< li > gallivm: fix rgtc2 format< / li >
< li > gallivm: switch the mask6/mask7 cases for signed rgtc formats< / li >
< li > gallivm: fix stream id fetch< / li >
< p > < / p >
< p > Roman Stratiienko (3):< / p >
< li > panfrost: Align Android makefiles with recent changes< / li >
< li > lima: Add missing source file to Android.mk< / li >
< li > panfrost: Align Android makefiles with recent changes< / li >
< p > < / p >
< p > Sagar Ghuge (13):< / p >
< li > intel/isl: Move get_format_encoding function to isl< / li >
< li > intel/isl: Switch to R8_UNORM format for compatibility< / li >
< li > intel/tools: Handle illegal instruction< / li >
< li > intel/tools: Handle STATE_REG in typed source operand< / li >
< li > intel/tools: Set correct address register file and number in i965_asm< / li >
< li > intel/tools: Add test for address register as source< / li >
< li > intel/tools: Add test for state register as source< / li >
< li > intel/tools: Print c_literals 4 byte wide< / li >
< li > intel/tools: Allow i965_disasm to disassemble c_literal input type< / li >
< li > intel/genxml: Add patch count threshold field on gen12< / li >
< li > intel/compiler: Track patch count threshold< / li >
< li > anv: Set patch count threshold in 3DSTATE_HS< / li >
< li > iris: Set patch count threshold in 3DSTATE_HS< / li >
< p > < / p >
< p > Samuel Iglesias Gonsálvez (2):< / p >
< li > radv: check buffer size in vkCreateBuffer()< / li >
< li > radv: set sparseAddressSpaceSize to RADV_MAX_MEMORY_ALLOCATION_SIZE< / li >
< p > < / p >
< p > Samuel Pitoiset (197):< / p >
< li > aco: fix MUBUF VS input loads when expanding vec3 to vec4 on GFX6< / li >
< li > aco: do not use ds_{read,write}2 on GFX6< / li >
< li > gitlab-ci: disable a630 tests as mesa-cheza is down (again)< / li >
< li > aco: fix waiting for scalar stores before "writing back" data on GFX8-GFX9< / li >
< li > radv: make sure to not submit any IBs when RADV_FORCE_FAMILY is set< / li >
< li > radv: set the chip name to GCN-NOOP when RADV_FORCE_FAMILY is set< / li >
< li > aco: fix creating v_madak if v_mad_f32 has two sgpr literals< / li >
< li > nir: do not use De Morgan's Law rules for flt and fge< / li >
< li > radv: fix line width range and granularity< / li >
< li > radv: implement VK_EXT_line_rasterization< / li >
< li > radv: remove LLVM sicheduler enable for The Talos Principle< / li >
< li > radv: remove RADV_DEBUG=nosisched and RADV_PERFTEST=sisched< / li >
< li > radv: remove unused RADV_HASH_SHADER_IS_GEOM_COPY_SHADER< / li >
< li > radv: remove unnecessary RADV_DEBUG=nobatchchain option< / li >
< li > docs/new_features: empty the feature list for the 20.1 cycle< / li >
< li > radv: enable shaderStorageImageMultisample on GFX6-GFX7< / li >
< li > radv: enable VK_EXT_sampler_filter_minmax on GFX6< / li >
< li > radv: enable VK_NV_compute_shader_derivatives on GFX6-GFX7< / li >
< li > radv: add a comment about VK_AMD_mixed_attachment_samples on GFX6-GFX7< / li >
< li > docs/envvars: document RADV_TEX_ANISO< / li >
< li > radv/winsys: add a new flag that requests zerovram allocations< / li >
< li > radv: use RADEON_FLAG_ZERO_VRAM when creating the trace BO< / li >
< li > radv: add the trace BO to the BO list at submit time< / li >
< li > radv: implement a dummy winsys for creating devices without AMDGPU< / li >
< li > ac,radeonsi: add ac_gpu_info::lds_size_per_cu< / li >
< li > ac: add more ac_gpu_info related shader fields< / li >
< li > radv/gfx10: adjust the number of simd per compute unit< / li >
< li > radv/gfx10: adjust SGPRs/VGPRs related info< / li >
< li > radv/gfx10: adjust the LDS size used to compute waves< / li >
< li > radv/gfx10: adjust the number of VGPRs used to compute waves< / li >
< li > radv: make use of ac_gpu_info::max_wave64_per_simd< / li >
< li > radv: fix creating null devices if KHR_display is enabled< / li >
< li > ac/llvm: fix 64-bit fmed3< / li >
< li > ac/llvm: fix 16-bit fmed3 on GFX8 and older gens< / li >
< li > ac/llvm: flush denorms for nir_op_fmed3 on GFX8 and older gens< / li >
< li > ac: add more fields to ac_gpu_info< / li >
< li > ac/registers: add definitions for thread trace< / li >
< li > radv: add a small helper that allows to submit internal CS< / li >
< li > radv: add initial SQ Thread Trace support for GFX9< / li >
< li > radv: emit thread trace markers after every draw/dispatch call< / li >
< li > radv: add initial SQTT files generation support< / li >
< li > radv: allow to capture SQTT traces with RADV_THREAD_TRACE=&lt;start_frame&gt;< / li >
< li > radv: fix 32-bit build failure in radv_queue_internal_submit()< / li >
< li > radv: fix size of sqtt_file_chunk_asic_info on 32-bit system< / li >
< li > radv/rgp: adjust trace memory/shader clocks to fix frame duration< / li >
< li > radv/sqtt: do not assume that the number of shader engines is 4< / li >
< li > radv/sqtt: update SPI_CONFIG_CNTL.EXP_PRIORITY_ORDER value< / li >
< li > ac/registers: add definitions for thread trace on GFX10< / li >
< li > radv/sqtt: add support for GFX10< / li >
< li > radv: update entrypoints generation from ANV< / li >
< li > ac: rename lds_size_per_cu to lds_size_per_workgroup< / li >
< li > ac: rename vgpr_alloc_granularity to wave64_vgpr_alloc_granularity< / li >
< li > ac: rename min_vgpr_alloc to min_wave64_vgpr_alloc< / li >
< li > aco: fix image load/store with lod and 1D images< / li >
< li > gitlab-ci: build Fossilize in the test image for VK< / li >
< li > gitlab-ci: add Fossilize support to detect compiler regressions< / li >
< li > gitlab-ci: enable building the test image for VK unconditionally< / li >
< li > gitlab-ci: add a job that runs Fossilize on RADV/Polaris10< / li >
< li > radv/winsys: fix missing initializations of shader info in the null device< / li >
< li > radv/sqtt: fix wrong check in radv_is_thread_trace_complete()< / li >
< li > radv/sqtt: tidy up radv_emit_thread_trace_{start,stop}< / li >
< li > radv/sqtt: add radv_copy_thread_trace_info_regs() helper< / li >
< li > ac/registers: adjust some definitions for thread trace on GFX8< / li >
< li > radv/sqtt: add support for GFX8< / li >
< li > radv/sqtt: abort if SQTT is used on GFX6-GFX7< / li >
< li > ac: add ac_gpu_info::cu_mask to store bitmask of compute units< / li >
< li > radv/rgp: report correct cu_mask info< / li >
< li > radv/rgp: report correct system ram size< / li >
< li > nir/lower_input_attachments: remove bogus assert in try_lower_input_texop()< / li >
< li > radv/entrypoints: declare a driver internal layer for SQTT< / li >
< li > radv: use device entrypoints from the SQTT layer if enabled< / li >
< li > radv/sqtt: add a helper that emits thread trace userdata markers< / li >
< li > radv: initial implementation of the driver internal layer SQTT< / li >
< li > radv/sqtt: describe begin/end command buffers with user markers< / li >
< li > radv/sqtt: describe draw/dispatch and emit event markers< / li >
< li > radv/sqtt: describe render pass color/depthstencil clears< / li >
< li > radv/rgp: bump the instrumentation spec version to 1< / li >
< li > radv/sqtt: describe pipeline and wait events barriers< / li >
< li > gitlab-ci: add rules:changes for RADV< / li >
< li > radv: do not recursively begin/end render pass for meta operations< / li >
< li > radv: fix 32-bits build (again)< / li >
< li > gitlab-ci: build RADV in meson-i386 to avoid 32-bit build failures< / li >
< li > ac/llvm: add missing optimization barrier for 64-bit readlanes< / li >
< li > radv/sqtt: describe begin/end subpass barriers with user markers< / li >
< li > radv/sqtt: describe layout transitions with user markers< / li >
< li > radv/gfx10: cache metadata in L2 on small chips< / li >
< li > radv: use better tessellation tunables on GFX9+< / li >
< li > radv: tune primitive binning for small chips< / li >
< li > radv: rewrite late alloc computation< / li >
< li > radv: use ac_gpu_info::use_late_alloc< / li >
< li > radv: cleanup occurrences of use_aco everywhere< / li >
< li > radv: remove radv_shader_variant::aco_used< / li >
< li > radv: remove unnecessary LLVM includes< / li >
< li > radv: add llvm_compiler_shader() helper< / li >
< li > gitlab-ci: remove useless 'patch' package in the VK test image< / li >
< li > gitlab-ci: allow deqp-runner to use the maximum number of jobs< / li >
< li > gitlab-ci: do not set the number of deqp-parallel jobs for RADV CTS< / li >
< li > gitlab-ci: bump Vulkan CTS to 1.2.1.0< / li >
< li > radv/sqtt: handle thread trace capture in sqtt_QueuePresentKHR()< / li >
< li > radv: only inject implicit subpass dependencies if necessary< / li >
< li > radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control< / li >
< li > radv/gfx10: fix required ballot size with VK_EXT_subgroup_size_control< / li >
< li > radv: fix random depth range unrestricted failures due to a cache issue< / li >
< li > radv: remove wrong assert that checks compute subgroup size< / li >
< li > radv: fix optional pSizes parameter when binding streamout buffers< / li >
< li > radv/winsys: fix wrong PCI ID for Vega10 in the null winsys< / li >
< li > radv/winsys: spoof some values for num_render_backends in the null winsys< / li >
< li > gitlab-ci: compile fossils with both RADV compiler backends (LLVM/ACO)< / li >
< li > gitlab-ci: compile fossils with more ASICs< / li >
< li > gitlab-ci: add a new stage for RADV CI< / li >
< li > gitlab-ci: add a bunch of new fossils from the Sascha Vulkan demos< / li >
< li > radv/llvm: fix subgroup shuffle for chips without bpermute< / li >
< li > radv: enable VK_KHR_8bit_storage on GFX6-GFX7< / li >
< li > ac/nir: use llvm.amdgcn.rcp for nir_op_frcp< / li >
< li > ac/nir: use llvm.amdgcn.rsq for nir_op_frsq< / li >
< li > ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv()< / li >
< li > nir/algebraic: add fexp2(fmul(flog2(a), 0.5) -> fsqrt(a) optimization< / li >
< li > aco: only break SMEM clauses if XNACK is enabled (mostly APUs)< / li >
< li > aco: always optimize v_mad to v_madak in presence of literals< / li >
< li > ac/nir: split 8-bit load/store to global memory on GFX6< / li >
< li > ac/nir: split 8-bit SSBO stores on GFX6< / li >
< li > radv/llvm: enable 8-bit storage features on GFX6-GFX7< / li >
< li > ac/nir: split 16-bit load/store to global memory on GFX6< / li >
< li > ac/nir: split 16-bit SSBO stores on GFX6< / li >
< li > radv/llvm: enable 16-bit storage features on GFX6-GFX7< / li >
< li > radv: rename decompress/resummarize depth/stencil functions< / li >
< li > radv: rename extra graphics pipeline decompress/resummarize fields< / li >
< li > radv: cleanup creating the decompress/resummarize pipelines< / li >
< li > radv: remove radv_layout_has_htile() helper< / li >
< li > radv: enable lowering of GS intrinsics for the LLVM backend< / li >
< li > ac,radv: add ac_gpu_info::has_double_rate_fp16< / li >
< li > radv: only expose shaderFloat16 for chips with double rate fp16< / li >
< li > radv: only expose storageInputOutput16 for chips with double rate fp16< / li >
< li > radv: only expose fp16 control features for chips with double rate fp16< / li >
< li > radv: only enable TC-compat HTILE for images readable by a shader< / li >
< li > radv: allow TC-compat HTILE with GENERAL outside of render loops< / li >
< li > aco: implement 16-bit nir_op_frexp_sig/nir_op_frexp_exp< / li >
< li > aco: implement 16-bit nir_op_ffract< / li >
< li > aco: implement 16-bit nir_op_fexp2/nir_op_flog2< / li >
< li > aco: implement 16-bit nir_op_ftrunc/nir_op_fround_even< / li >
< li > aco: implement 16-bit nir_op_fsqrt/nir_op_frcp/nir_op_frsq< / li >
< li > aco: implement 16-bit nir_op_ffloor/nir_op_fceil< / li >
< li > aco: implement 16-bit nir_op_fmax/nir_op_fmin< / li >
< li > aco: implement 16-bit nir_op_fabs/nir_op_fneg< / li >
< li > aco: implement 16-bit nir_op_fsub/nir_op_fadd< / li >
< li > aco: implement 16-bit nir_op_fcos/nir_op_fsin< / li >
< li > aco: implement 16-bit nir_op_fmul< / li >
< li > aco: implement 16-bit nir_op_fsat< / li >
< li > aco: implement 16-bit nir_op_fsign< / li >
< li > aco: implement 16-bit nir_op_bcsel< / li >
< li > aco: implement 16-bit nir_op_f2i32/nir_op_f2u32< / li >
< li > aco: implement 16-bit nir_op_ldexp< / li >
< li > aco: implement 16-bit nir_op_fmax3/nir_op_fmin3/nir_op_fmed3< / li >
< li > aco: implement 16-bit comparisons< / li >
< li > aco: implement nir_op_b2f16/nir_op_i2f16/nir_op_u2f16< / li >
< li > aco: fix f2i64/f2u64 with sgprs if the exponent computation overflow< / li >
< li > aco: implement 16-bit nir_op_f2i64/nir_op_f2u64< / li >
< li > aco: fix nir_op_pack_32_2x16_split if one operand is a constant< / li >
< li > radv: add radeon_set_context_reg_rmw() helper< / li >
< li > radv: use RMW packets for updating the maximum sample distance< / li >
< li > aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents< / li >
< li > radv/aco: do not advertise VK_KHR_shader_subgroup_extended_types< / li >
< li > aco: implement nir_op_f2i8/nir_op_f2u8< / li >
< li > aco: fix emitting stream output with tess eval shaders< / li >
< li > radv: do not abort with unknown/unimplemented descriptor types< / li >
< li > radv: fix geometry shader primitives query with ACO on GFX10< / li >
< li > radv: set missing SHARED_VGPR_CNT for NGG VS and ACO< / li >
< li > radv/llvm: fix exporting the viewport index if the fragment shader needs it< / li >
< li > aco: fix exporting the viewport index if the fragment shader needs it< / li >
< li > nir/lower_int64: lower imin3/imax3/umin3/umax3/imed3/umed3< / li >
< li > nir/opt_algebraic: lower 64-bit fmin3/fmax3/fmed3< / li >
< li > gitlab-ci: add a list of excluded tests for RADV< / li >
< li > radv: make sure to export the viewport index if FS needs it< / li >
< li > radv: simplify checking for Navi1x chips< / li >
< li > radv: adjust the supported subgroup stages< / li >
< li > radv: fix robust_buffer_access if enabled via VkPhysicalDeviceFeatures2< / li >
< li > gitlab-ci: add lists of expected failures for RADV CI< / li >
< li > ac,radeonsi: fix compilations issues with LLVM 11< / li >
< li > radv: do not expose GTT as device local memory mostly for APUs< / li >
< li > radv: enable FMASK for color attachments only< / li >
< li > radv: remove unused radv_device_memory::map_size field< / li >
< li > radv: track memory heaps usage if overallocation is explicitly disallowed< / li >
< li > radv: advertise VK_AMD_memory_overallocation_behavior< / li >
< li > ac/llvm: fix nir_texop_texture_samples with NULL descriptors< / li >
< li > aco: fix nir_texop_texture_samples with NULL descriptors< / li >
< li > aco: fix adjusting the sample index with FMASK if value is negative< / li >
< li > radv: handle NULL descriptors< / li >
< li > radv: handle NULL vertex bindings< / li >
< li > radv: advertise VK_EXT_robustness2< / li >
< li > gitlab-ci: add a list of expected failures for FIJI with ACO< / li >
< li > ci: fix reporting the number of unexpected/flakes< / li >
< li > radv: report INITIALIZATION_FAILED when the amdgpu winsys init failed< / li >
< li > radv: don't report error with other vendor DRM devices< / li >
< li > aco: fix 64-bit trunc with negative exponents on GFX6< / li >
< li > radv: limit the Vulkan version to 1.1 for Android< / li >
< li > radv: handle different Vulkan API versions correctly< / li >
< li > radv: update the list of allowed Android extensions< / li >
< p > < / p >
< p > Satyajit Sahu (1):< / p >
< li > st/va: GetConfigAttributes: check profile and entrypoint combination< / li >
< p > < / p >
< p > Simon Ser (1):< / p >
< li > mesa: add support for NV_pixel_buffer_object< / li >
< p > < / p >
< p > Simon Zeni (1):< / p >
< li > mesa: enable GL_EXT_draw_instanced for gles2< / li >
< p > < / p >
< p > Sonny Jiang (1):< / p >
< li > radeonsi: enable EXT_texture_shadow_lod< / li >
< p > < / p >
< p > Szymon Andrzejuk (1):< / p >
< li > virgl: Use align_free for align_malloc allocated buffer< / li >
< p > < / p >
< p > Tapani Pälli (27):< / p >
< li > intel/vec4: fix valgrind errors with vf_values array< / li >
< li > glsl: fix a memory leak with resource_set< / li >
< li > iris: fix aux buf map failure in 32bits app on Android< / li >
< li > mesa: introduce boolean toggle for EXT_texture_norm16< / li >
< li > i965: toggle on EXT_texture_norm16< / li >
< li > mesa/st: toggle EXT_texture_norm16 based on format support< / li >
< li > mesa/st: fix formats required for EXT_texture_norm16< / li >
< li > nir: fix compilation warning on glsl_get_internal_ifc_packing< / li >
< li > iris: toggle on PIPE_CAP_MIXED_COLOR_DEPTH_BITS< / li >
< li > nir/glsl: gather bitmask of images used by program< / li >
< li > iris: use the images_used mask in resolve pass< / li >
< li > intel/compiler: detect if atomic load store operations are used< / li >
< li > iris: provide dummy iris_image_view_aux_usage< / li >
< li > iris: move existing image format fallback as a helper function< / li >
< li > iris: determine aux usage during predraw and state setup< / li >
< li > isl: allow compression for storage images on gen12+< / li >
< li > iris: allow compression conditionally for images on gen12< / li >
< li > glsl: set error_emitted true if type not ok for assignment< / li >
< li > mesa/st: unbind shader state before deleting it< / li >
< li > mesa/st: release variants for active programs before unref< / li >
< li > mesa: remove redudant check< / li >
< li > mesa: remove redudant assignment< / li >
< li > glsl: remove redudant assignment< / li >
< li > glsl: stop processing function parameters if error happened< / li >
< li > mesa/st: initialize all winsys_handle fields for memory objects< / li >
< li > anv: remove assert from GetImageMemoryRequirements[2]< / li >
< li > st/mesa: destroy only own program variants when program is released< / li >
< p > < / p >
< p > Thomas Hellstrom (5):< / p >
< li > svga: Fix banded DMA upload< / li >
< li > svga, winsys/svga: Fix persistent memory discard maps< / li >
< li > svga: Treat forced coherent maps as maps of persistent memory< / li >
< li > gallium/pipebuffer: Use persistent maps for slabs< / li >
< li > winsys/svga: Optionally avoid caching buffer maps< / li >
< p > < / p >
< p > Thong Thai (7):< / p >
< li > Revert "st/va: Convert interlaced NV12 to progressive"< / li >
< li > gallium/auxiliary/vl: fix bob compute shaders for deint yuv< / li >
< li > st/va: remove unneeded code< / li >
< li > st/va/postproc: reallocate interlaced destination buffer< / li >
< li > radeonsi: add 10-bit HEVC encode support for VCN2.0 devices< / li >
< li > radeon: add support for 10-bit HEVC encoding to VCN 2.0< / li >
< li > st/va: add check for P010 and P016 encode/decode support< / li >
< p > < / p >
< p > Timothy Arceri (51):< / p >
< li > glsl: fix gl_nir_set_uniform_initializers() for image arrays< / li >
< li > glsl: fix possible memory leak in nir uniform linker< / li >
< li > glsl: set the correct number of samplers in a shader< / li >
< li > glsl: set the correct number of images in a shader< / li >
< li > glsl: fix resizing of the uniform remap table< / li >
< li > glsl: reset next_image_index count for each shader stage< / li >
< li > glsl: fix sampler index calculation in nir linker< / li >
< li > glsl: add some error checks to the nir uniform linker< / li >
< li > glsl: move nir link uniforms struct defs earlier< / li >
< li > glsl: move add_parameter() earlier in nir link uniforms< / li >
< li > glsl: move get_next_index() earlier in nir link uniforms< / li >
< li > glsl: add name support to nir uniform linker< / li >
< li > glsl: correctly find block index when linking glsl with nir linker< / li >
< li > nir: add glsl_get_internal_ifc_packing() helper< / li >
< li > nir: add glsl_get_std140_base_alignment() helper< / li >
< li > nir: add glsl_get_std140_size() helper< / li >
< li > nir: add glsl_get_std430_base_alignment() helper< / li >
< li > nir: add glsl_get_std430_size() helper< / li >
< li > glsl: add std140 and std430 layouts to nir uniform linker< / li >
< li > glsl: correctly set explicit offsets for struct members< / li >
< li > glsl: find the base offset for block members from unnamed blocks< / li >
< li > glsl: nir linker fix setting of ssbo top level array< / li >
< li > glsl: set ShaderStorageBlocksWriteAccess in the nir linker< / li >
< li > glsl: add support for builtins to the nir uniform linker< / li >
< li > glsl: dont try to assign uniform storage for uniform blocks< / li >
< li > glsl: add subroutine support to nir linker< / li >
< li > glsl: fix varying packing for 64bit integers< / li >
< li > nir: fix packing of TCS varyings not read by the TES< / li >
< li > nir: fix crash in varying packing on interface mismatch< / li >
< li > glsl_to_nir: remove dead code< / li >
< li > radeonsi: don't lower constant arrays to uniforms in GLSL IR< / li >
< li > nir: make opt_if_loop_terminator() less strict< / li >
< li > nir: add matrix_layout to nir_variable data< / li >
< li > glsl: fix struct offsets in the nir uniform linker< / li >
< li > glsl: tidy up uniform storage value count code in NIR linker< / li >
< li > Revert "glsl: fix resizing of the uniform remap table"< / li >
< li > glsl: fix explicit locations for the glsl linker< / li >
< li > glsl: error check max user assignable uniform locations< / li >
< li > glsl: fix block index in NIR uniform linker< / li >
< li > glsl: pull mark_array_elements_referenced() out into common helper< / li >
< li > glsl: only set stage ref when uniforms referenced in stage< / li >
< li > nir/gcm: allow derivative dependent intrinisics to be moved earlier< / li >
< li > nir/gcm: be more conservative about moving instructions from loops< / li >
< li > nir/gcm: dont move movs unless we can replace them later with their src< / li >
< li > glsl: add bindless support to nir uniform linker< / li >
< li > glsl: fix gl_nir_set_uniform_initializers() for bindless textures< / li >
< li > st/glsl_to_nir: make use of nir linker for linking uniforms< / li >
< li > glsl: some nir uniform linker fixes< / li >
< li > glsl: remove some duplicate code from the nir uniform linker< / li >
< li > glsl: stop cascading errors if process_parameters() fails< / li >
< li > glsl: fix slow linking of uniforms in the nir linker< / li >
< p > < / p >
< p > Timur Kristóf (90):< / p >
< li > aco/optimizer: Don't combine uniform bool s_and to s_andn2.< / li >
< li > radv: Move some helper functions to the radv_shader.h header file.< / li >
< li > aco: Extract setup_gs_variables into a separate function.< / li >
< li > aco: Setup tessellation control shader variables.< / li >
< li > aco: Implement load_tess_coord.< / li >
< li > aco: Implement load_primitive_id for tessellation shaders.< / li >
< li > aco: Implement load_patch_vertices_in.< / li >
< li > aco: Implement load_invocation_id for tessellation control shaders.< / li >
< li > aco: Implement control_barrier for tessellation control shaders.< / li >
< li > aco: Implement memory_barrier_tcs_patch.< / li >
< li > aco: Implement load_view_index for TCS and TES.< / li >
< li > aco: Setup correct HW stages when tessellation is used.< / li >
< li > aco: Use mesa shader stage when loading inputs.< / li >
< li > aco: Remove vertex_geometry_gs assertion from merged shaders.< / li >
< li > aco: Extract LDS alignment calculation to a separate function.< / li >
< li > aco: Remove esgs_itemsize from LDS alignment calculation.< / li >
< li > aco: Introduce new VMEM load/store helpers.< / li >
< li > aco: Introduce new helpers for calculating address offsets.< / li >
< li > aco: Refactor load_per_vertex_input in preparation for tessellation.< / li >
< li > aco: Refactor VS output stores in preparation for tessellation.< / li >
< li > aco: Slight fix to lds_store and lds_load.< / li >
< li > aco: Fix combining DS additions in the optimizer.< / li >
< li > aco: Implement tessellation control shader input/output.< / li >
< li > aco: Store VS outputs correctly when tessellation is used.< / li >
< li > aco: Fix LS VGPR init bug on affected hardware.< / li >
< li > radv: Enable ACO for tessellation control shaders.< / li >
< li > aco: Setup tessellation evaluation shader variables.< / li >
< li > aco: Use TES output info when TES runs on the VS stage.< / li >
< li > aco: Store TES outputs when TES runs on the HW VS stage.< / li >
< li > aco: Enable streamout when TES runs on the HW VS stage.< / li >
< li > aco: Implement loading TES inputs.< / li >
< li > radv: Enable ACO for TES when there is no GS.< / li >
< li > aco: Enable running TES as ES, including merged TES+GS.< / li >
< li > radv: Enable ACO on all stages.< / li >
< li > aco: Don't generate an if when the first part of a merged HS or GS is empty.< / li >
< li > aco: Store tess factors in VMEM only at the end of the shader.< / li >
< li > aco: Only write TCS outputs to LDS when they are read by the TCS.< / li >
< li > aco: Don't store TCS outputs to LDS when we're sure that none are read.< / li >
< li > nir: Add ability to lower non-const quad broadcasts to const ones.< / li >
< li > radv: Enable lowering dynamic quad broadcasts.< / li >
< li > radv: Enable subgroup shuffle on GFX10 when ACO is used.< / li >
< li > aco: Create null exports in instruction selection instead of assembler.< / li >
< li > aco: Extract tcs_driver_location_matches_api_mask to separate function.< / li >
< li > aco: Fix handling of tess factors.< / li >
< li > aco: Allow combining TCS output VMEM stores.< / li >
< li > aco: Allow combining LDS loads when loading tess factors.< / li >
< li > aco: Skip 2nd read of merged wave info when TCS in/out vertices are equal.< / li >
< li > aco: Use more optimal sequence at the beginning of merged shaders.< / li >
< li > nir: Collect if shader uses cross-invocation or indirect I/O.< / li >
< li > aco: Treat outputs of the previous stage as inputs of the next stage.< / li >
< li > aco: Change isel inputs/outputs to a flat array.< / li >
< li > aco: Zero-fill undefined elements in create_vec_from_array.< / li >
< li > aco: Extract setup_tcs_info to a separate function.< / li >
< li > aco: Fix workgroup size calculation.< / li >
< li > aco: Extract store_output_to_temps into a separate function.< / li >
< li > aco: When LS and HS invocations are the same, pass LS outputs in temps.< / li >
< li > aco: Don't store LS VS outputs to LDS when TCS doesn't need them.< / li >
< li > aco: Fix crash in insert_wait_states.< / li >
< li > aco: Extract uniform if handling to separate functions.< / li >
< li > aco: Print block_kind_export_end.< / li >
< li > aco: Extract merged_wave_info_to_mask to its own function.< / li >
< li > aco: Treat s_setprio as a scheduling barrier.< / li >
< li > aco/ngg: Add new stage for hw_ngg_gs.< / li >
< li > aco/ngg: Initialize exec mask for NGG VS and TES.< / li >
< li > aco/ngg: Fix exports for NGG VS and TES.< / li >
< li > aco/ngg: Setup NGG VS and TES stages.< / li >
< li > aco/ngg: Implement NGG VS and TES.< / li >
< li > aco/ngg: Schedule position exports of NGG VS/TES.< / li >
< li > aco/ngg: Run GS_ALLOC_REQ on priority 3 for NGG VS and TES.< / li >
< li > radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.< / li >
< li > aco: Print shader stage in aco_print_program.< / li >
< li > radv: Print shader stage before disassembly.< / li >
< li > radv: Add inputs read by TES to radv_shader_info.< / li >
< li > aco: Only store TCS outputs to VMEM when they are read by TES.< / li >
< li > aco: Increase barrier_count to 7 to include barrier_barrier.< / li >
< li > aco: Abort when RA can't find a register.< / li >
< li > aco: Const correctness for get_barrier_interaction.< / li >
< li > aco: Const correctness for aco_print_ir.< / li >
< li > aco: Use 24-bit multiplication in TCS I/O< / li >
< li > aco: Use 24-bit multiplication for NGG wave id and thread id.< / li >
< li > aco: Move s_setprio to correct place after the gs_alloc_req.< / li >
< li > radv: Refactor calculate_tess_lds_size and get_tcs_num_patches.< / li >
< li > aco: Use context variables instead of calculating TCS inputs/outputs.< / li >
< li > aco: Remember VS/TCS output driver locations.< / li >
< li > aco: Calculate workgroup size of legacy GS.< / li >
< li > aco: Set config->lds_size when TES or VS is running on HW ESGS.< / li >
< li > nir: Add new linking helper to set linked driver locations.< / li >
< li > radv: Use new linking helper to set default driver locations.< / li >
< li > aco: Use new default driver locations.< / li >
< li > radv: Use smaller esgs_itemsize for ACO.< / li >
< p > < / p >
< p > Tobias Jakobi (1):< / p >
< li > meson: Link Gallium Nine with ld_args_build_id< / li >
< p > < / p >
< p > Tomasz Pyra (1):< / p >
< li > gallium/swr: spin-lock performance improvement< / li >
< p > < / p >
< p > Tomeu Vizoso (34):< / p >
< li > panfrost: Print intended field when decoding< / li >
< li > panfrost: Add more info to some assertions< / li >
< li > pan/midgard: Handle nir_intrinsic_load_barycentric_centroid< / li >
< li > panfrost: Use DBG macro to avoid noise in the console< / li >
< li > panfrost: Fix decoding of tiled 3D textures< / li >
< li > panfrost: Only clamp the LOD to disable mipmapping when needed< / li >
< li > gitlab-ci: Switch kernel for LAVA jobs to 5.5< / li >
< li > gitlab-ci: Disable the lima job for now< / li >
< li > gitlab-ci: Run GLES3 tests in dEQP on Panfrost< / li >
< li > panfrost: Remove some more prints to stdout< / li >
< li > gitlab-ci: Move to 5.5 kernel plus fixes for Panfrost< / li >
< li > gitlab-ci: Use PAN_MESA_DEBUG=gles3 for Panfrost< / li >
< li > gitlab-ci: Remove GLES3 test from Panfrost fails list< / li >
< li > gitlab-ci: Skip dEQP-GLES3.functional.shaders.derivate.*< / li >
< li > gallium: Add forgotten docs for new CAPs related to transform feedback< / li >
< li > gitlab-ci: Update renderdoc< / li >
< li > gitlab-ci: Use surfaceless platform also for apitrace< / li >
< li > gitlab-ci: Place files from the Mesa repo into the build tarball< / li >
< li > gitlab-ci: Serve files for LAVA via separate service< / li >
< li > gitlab-ci: Disable jobs for Collabora's LAVA lab< / li >
< li > Revert "gitlab-ci: Disable jobs for Collabora's LAVA lab"< / li >
< li > panfrost: Remove most usage of midgard_payload_vertex_tiler< / li >
< li > panfrost: Pass IS_BIFROST to pandecode_jc< / li >
< li > panfrost: Don't emit write_value jobs on Bifrost< / li >
< li > panfrost: On Bifrost, set the right tiler descriptor< / li >
< li > gitlab-ci: Test virgl driver< / li >
< li > panfrost: Clean up a bit the tiler structs for Bifrost< / li >
< li > panfrost: Emit sampler descriptor on bifrost< / li >
< li > panfrost: Emit texture descriptor on bifrost< / li >
< li > gitlab-ci: Update virglrenderer in the x86_test-gl image< / li >
< li > gitlab-ci: Allow test jobs to add options to the dEQP invocation< / li >
< li > gitlab-ci: Test OpenGL ES 3.1 on virgl< / li >
< li > gitlab-ci: Test Virgl with traces< / li >
< li > panfrost: Add Bifrost texture trampoline BO to batch< / li >
< p > < / p >
< p > Uros Bizjak (1):< / p >
< li > doc: Update features.txt for r600 with misc supported features< / li >
< p > < / p >
< p > Vasily Khoruzhick (19):< / p >
< li > lima: handle early-z and pixel kill better< / li >
< li > lima: implement PLB PP stream cache< / li >
< li > lima: add RGBA5551 and RGBA4444 formats< / li >
< li > lima: don't disable tiling if there's linear modifier in list< / li >
< li > lima: gpir: enforce instruction limit earlier< / li >
< li > panfrost: split index cache into shared part< / li >
< li > lima: enable minmax cache for index buffers< / li >
< li > lima: print gp uniforms if gp debug is enabled< / li >
< li > lima/gpir: improve disassembler output< / li >
< li > lima/gpir: print acc ops even if we have only one source< / li >
< li > lima/gpir: kill dead writes to regs in DCE< / li >
< li > lima/gpir: add better lowering for ftrunc< / li >
< li > lima/gpir: fix crash in schedule_insert_ready_list()< / li >
< li > lima: disable Z16 format< / li >
< li > lima: decode depth/stencil write bits in RSW< / li >
< li > lima: split pixel and texel format tables< / li >
< li > lima: add support for R and RG formats< / li >
< li > lima: Implement lima_texture_subdata< / li >
< li > lima: avoid situations when scissor minx > maxx or miny > maxy< / li >
< p > < / p >
< p > Veerabadhran (1):< / p >
< li > radeon/vce: Move global function pointer si_get_pic_param to local encoder structure. Multi gpu use case broken when the function was global< / li >
< p > < / p >
< p > Vilya Harvey (1):< / p >
< li > zink. Don't set incorrect sType in VkImportMemoryFdInfoKHR struct< / li >
< p > < / p >
< p > Vinson Lee (16):< / p >
< li > swr: Fix build with GCC 10.< / li >
< li > lima: Fix build with GCC 10.< / li >
< li > swr: Fix GCC 4.9 checks.< / li >
< li > panfrost: Remove unused anonymous enum variables.< / li >
< li > meson: Enable -Wno-deprecated only for bison > 2.3.< / li >
< li > swr: Fix non-pod-varargs error.< / li >
< li > st/nine: Fix incompatible-pointer-types-discards-qualifiers errors.< / li >
< li > panfrost: Fix gnu-empty-initializer error.< / li >
< li > util/u_process: Add util_get_process_exec_path for macOS.< / li >
< li > mesa: Change _mesa_exec_malloc argument type.< / li >
< li > gallivm: Add missing header for powf.< / li >
< li > swr/rasterizer: Use private functions for min/max to avoid namespace issues.< / li >
< li > swr: Remove Byte Order Mark.< / li >
< li > r600/sfn: Initialize VertexStageExportForGS m_num_clip_dist member variable.< / li >
< li > r600/sfn: Use correct setter method.< / li >
< li > freedreno: Add missing va_end.< / li >
< p > < / p >
< p > Yevhenii Kolesnikov (1):< / p >
< li > intel/compiler: fix cmod propagation optimisations< / li >
< p > < / p >
< p > Zhang, Boyuan (1):< / p >
< li > radeonsi: Add support for midstream bitrate change in encoder< / li >
< p > < / p >
< p > luc (1):< / p >
< li > zink: confused compilation macro usage for zink in target helpers.< / li >
< p > < / p >
< p > < / p >
< / ul >
< / div >
< / body >
< / html >