No shader-db or fossil-db changes on any Intel platform. In this MR, the
only benefit of these changes is to convert some "-a > 0" CSEL
comparisons to "a < 0" for improved readability.
v2: Add integer CSEL support
v3: Use fs_inst::resize_sources and brw_type_is_sint. Both suggested by
Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
No shader-db or fossil-db changes on any Intel platform. This ends up
begin helpful in "intel/brw: Use range analysis to optimize fsign."
v2: Add integer CSEL support
v3: Massive simplification (-20 lines!) of constant propagation
logic. Suggested by Ken. Add missing CSEL case in supports_src_as_imm.
Noticed by Ken.
v4: While MAD can mix F and HF sources on some platforms, CSEL
cannot. Found by skqp on TGL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
Gfx9 can only have F, but newer GPUs can have F, HF, *D, or *W. The
source and destination types must still match in size.
v2: Simplify the float vs integer logic. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
This bit from the comment should have been a big red flag:
There are currently zero instances of fsign(double(x))*IMM in
shader-db or any test suite, so it is hard to care at this time.
The implementation of that path was incorrect. The XOR instructions
should be predicated like the OR instruction in the non-multiplication
path. As a result, dsign(zero_value) * x will not produce the correct
result.
Instead of fixing this code that is never exercised by anything, replace
it with the simple lowering in NIR.
Ironically, the vec4 implementation is correct. The odds of encountering
an application that is performace limited by dsign performance in vertex
processing stages on Ivy Bridge or Haswell is infinitesimal.
No shader-db changes on any Intel platform.
v2: Delete 's' in emit_fsign as it is now unused.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
This bit from the comment should have been a big red flag:
There are currently zero instances of fsign(double(x))*IMM in
shader-db or any test suite, so it is hard to care at this time.
The implementation of that path was incorrect. The XOR instructions
should be predicated like the OR instruction in the non-multiplication
path. As a result, dsign(zero_value) * x will not produce the correct
result.
Instead of fixing this code that is never exercised by anything, replace
it with the simple lowering in NIR.
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
This change is built on top of work originally done by Benjamin Lee.
Implement conservative rasterization on GPUs that support it. This is done
through a MME method on pre-Volta, and through SET_CONSERVATIVE_RASTER* (newly
published) on more recent GPUs.
primitiveUnderestimation and fullyCoveredFragmentShaderInputVariable will be
supported later as they require SPH and compiler work.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9627
Signed-off-by: Arthur Huillet <ahuillet@nvidia.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28937>
A couple pieces were missed when this was originally added in
b172fd62f5. Without this, NVK doesn't
pick up the value of extraPrimitiveOverestimationSize in 'dyn->rs'.
Signed-off-by: Benjamin Lee <benjamin@computer.surgery>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28937>
VK_EXT_legacy_vertex_attributes was enabled for both Anv and Turnip
but this never fully made its way to the docs.
Fixes: 8c1cc405d3 ("anv: VK_EXT_legacy_vertex_attributes")
Fixes: 660a47ecbf ("tu: support VK_EXT_legacy_vertex_attributes")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29176>
For dma-buf, if the import and finish occur back-2-back for the same
dma-buf, zombie vma cleanup will unexpectedly close the re-imported
dma-buf gem handle. This change fixes it by trying to resurrect from
zombie vmas on the dma-buf import path.
Fixes: 63904240f2 ("tu: Re-enable bufferDeviceAddressCaptureReplay")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29173>
I tested that v_swap_b16 can be encoded as VOP3, because the ISA doc doesn't list
it as a possible VOP3 opcode. VOP3 is nessecary to access v128+.
Foz-DB Navi31:
Totals from 32 (0.04% of 79395) affected shaders:
Instrs: 201865 -> 195168 (-3.32%)
CodeSize: 1082220 -> 1031228 (-4.71%); split: -4.71%, +0.00%
Latency: 2258198 -> 2238586 (-0.87%)
InvThroughput: 796731 -> 788934 (-0.98%)
Copies: 34514 -> 29220 (-15.34%)
VALU: 122457 -> 117163 (-4.32%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29143>
Indeed, the object returned by LLVMCreatePassBuilderOptions()
was not freed.
For instance, this issue is triggered with "piglit/bin/cl-api-build-program":
Direct leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x7f6b15abdf57 in operator new(unsigned long) (/usr/lib64/libasan.so.6+0xb2f57)
#1 0x7f6afff6529e in LLVMCreatePassBuilderOptions llvm-18.1.5/lib/Passes/PassBuilderBindings.cpp:83
#2 0x7f6b1186ee41 in optimize ../src/gallium/frontends/clover/llvm/invocation.cpp:521
#3 0x7f6b1186ee41 in clover::llvm::link_program(std::vector<clover::binary, std::allocator<clover::binary> > const&, clover::device const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) ../src/gallium/frontends/clover/llvm/invocation.cpp:554
#4 0x7f6b1150ce67 in link_program ../src/gallium/frontends/clover/core/compiler.hpp:78
#5 0x7f6b1150ce67 in clover::program::link(clover::ref_vector<clover::device> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, clover::ref_vector<clover::program> const&) ../src/gallium/frontends/clover/core/program.cpp:78
#6 0x7f6b11401a2b in clBuildProgram ../src/gallium/frontends/clover/api/program.cpp:283
Fixes: 2d4fe5f229 ("clover/llvm: move to modern pass manager.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29164>
Use IO semantics to find the locations instead of more complex ways.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29133>