This commit used to be "brw/emit: Allow scalar sources to 64-bit
3-source instructions". These instructions were fixed up in
brw_eu_emit. There seems to be some conflict with the <0,1,0> stride an
post-RA scheduling. The only difference between the passing code
generated by this commit and the failing code generated by the older
commit is some post-RA scheduling.
v2: Change the stride of a MAD even if the instruction isn't
lowered. MAD instructions that are already SIMD8 have to follow the same
rules. 🤦
v3: Pull the lowering out to its own pass. Update the comment in
brw_fs_validate. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884>
Sources that are scalars (almost all source) and convergent generally
want <0,1,0> source stride. Sources that are vectors (e.g., texture
coordinates, SSBO write data, etc.) and convergent want no extra strides
applied. In nearly all cases LOAD_PAYLOAD lowering will do the right
thing.
v2: Use VEC in emit_pixel_interpolater_send. Suggested by Ken.
v3: With the elimination of offset_to_component(), offset() may not
convert an is_scalar source to have a zero stride. Explicitly do this
in get_nir_src and prepare_alu_destination_and_sources.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884>
In SIMD16 and SIMD32, storing convergent values in full 16- or
32-channel registers is wasteful. It wastes register space, and in most
cases on SIMD32, it wastes instructions. Our register allocator is not
clever enough to handle scalar allocations. It's fundamental unit of
allocation is SIMD8. Start treating convergent values as SIMD8.
Add a tracking bit in brw_reg to specify that a register represents a
convergent, scalar value. This has two implications:
1. All channels of the SIMD8 register must contain the same value. In
general, this means that writes to the register must be
force_writemask_all and exec_size = 8;
2. Reads of this register can (and should) use <0,1,0> stride. SIMD8
instructions that have restrictions on source stride can us <8,8,1>.
Values that are vectors (e.g., results of load_uniform or texture
operations) will be stored as multiple SIMD8 hardware registers.
v2: brw_fs_opt_copy_propagation_defs fix from Ken. Fix for Xe2.
v3: Eliminte offset_to_scalar(). Remove mention of vec4 backend in
brw_reg.h. Both suggested by Caio. The offset_to_scalar() change
necessitates some trickery in the fs_builder offset() function, but I
think this is an improvement overall. There is also some rework in
find_value_for_offset to account for the possibility that is_scalar
sources in LOAD_PAYLOAD might be <8;8,1> or <0;1,0>.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884>
This isn't used now, but future commits will add uses. Doing this as a
separate commit removes a lot of "just typing" churn from commits that
have real changes to review.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884>
Farm has started experiencing intermittent dhcp/pxe issues with DUTs.
Disable the farm to investigate.
Signed-off-by: Martin Krastev <martin.krastev@broadcom.com>
Reviewed-by: David Heidelberg <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32776>
We only set buf_filled_size for the first target, but draws from non-zero
streams use buf_filled_size from other targets, so share the same
buf_filled_size buffer among all streamout targets because it contains
all 4 offsets.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
This compiles an optimized shader variant for NGG streamout where the output
primitive is known at compile time. This allows putting stores for all
vertices into the same VMEM clause.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
This uses si_shader_info to store whether gl_FragDepth can be removed,
and it uses the kill_z epilog flag to do the removal without recompilation.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
It works exactly like gfx11 except that COVERAGE_TO_MASK_ENABLE must be 1
to indicate that alpha for alpha-to-coverage should be read from mrtz.a.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
alpha-to-coverage must be applied before alpha-to-one. The only way to do
that is to export alpha for alpha-to-coverage via mrtz, and export 1 via
mrt0.a.
ACO and monolithic shader support is already in place thanks to RADV,
so we only need to change the LLVM PS epilog and the shader key.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
This adds kill_z and kill_stencil flags to the shader PS epilog key, which
removes those outputs if depth or stencil are disabled.
It must be implemented in:
* ACO PS epilog
* LLVM PS epilog
* ac_nir_lower_ps for monolithic shaders
Some of the samplemask code wasn't completely correct, but probably harmless.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
This will take effect after nir_opt_varyings is enabled by another MR, and
will fix existing shader compiler crashes thanks to better optimizations.
For example, one GLSL program that failed to compile and had 226 VS
instructions and 356 FS instructions in NIR will be reduced to 2 or 3
instructions per shader.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31942>
Regression on test:
dEQP-GLES31.functional.geometry_shading.basic.output_256
voffset is missing if buffer store base >=4096, we need to
re-calculate offen after resolve_excess_vmem_const_offset().
Fixes: cdaf269924 ("aco: inline store_vmem_mubuf/emit_single_mubuf_store")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32767>
CI tests are carried out in debian/x86_64_pyutils container which is using
python version 3.11 so use this version also for local testing.
This makes local testing more accurate. For example repeated double
quotes in f-formatted strings will raise an error in python 3.11 but not
in python 3.12.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32602>
Since run-pytest.sh uses the debian/x86_64_pyutils container, it's not
necessary to add an additional layer of isolation by creating a virtual
environment for run-pytest.sh.
So stop creating a venv when run-pytest is run in a container, but keep
the option of using a venv to run-pytest.sh locally.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32602>
Currently the update_traces_checksum script prints a label verification
request with a line that is 167 characters long.
Split the long line to make it more readable. Update the flake8
configuration to enforce a maximum line length of 159 characters to ensure
consistency.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32602>
Currently the ci scripts and tests don't have any linting checks. Add
.flake8 linting to start adding some consistency to the scripts. Ignore
most of the existing errors until they can be addressed on an individual
basis.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32602>
Currently, run-pytest.sh won't run locally outside of a pipeline
because it can't find the `setup-test-env.sh` which provides necessary
functions.
Add a default value for the SCRIPTS_DIR so that run-pytest can find and
run the setup-test-env.sh and be run locally outside of a pipeline.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32602>
Without this the game crashes during the loading screen.
The game uses vkUpdateDescriptorSetWithTemplate and, in certain cases,
passes VkDescriptorBufferInfo structures where the offset + range
exceeds the size of the buffer. This triggers an assertion when
vk_buffer_range() is called, causing the game to crash.
When the nvidia vendor id is used the range is consistently set to 65536.
Without it the range varies and is much smaller - never exceeding 1000.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12349
Cc: stable
Reviewed-by: Faith Ekstrand <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32764>
There is no requirement on LLVM or SPIR-V tools since the introduction
of mesa-clc.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32719>
As Asahi, Intel and soon Panfrost requires an offline compiler for their
respective internal shaders, this commit adds generic new options to
workaround meson current limitations around cross-compillation.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32719>