CSE tries to collapse equal instructions, and collapsing two phis with incompatible dests is illegal.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Fixes: 6bdce55c ("nir: Add a basic CSE pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19960>
(cherry picked from commit a54c2c8289)
SCC is 1-bit, and we can't copy a 32-bit value into it.
Fixes dEQP-VK.spirv_assembly.type.scalar.i32.iequal_tesse with
ACO_DEBUG=noopt.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 9476986e6f ("aco/ra: special-case get_reg_for_create_vector_copy()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20240>
(cherry picked from commit a05dd58309)
problem:
when doing "gst-launch-1.0 -v videotestsrc num-buffer=10 !
vaapih264enc ! fakeink"
The command will fail due to gst will fetch the first
available supported format in the list, it becomes P010_LE
due to the commit in
[0b02db3007]
frontends/va: fixed av1 decoding 10bit ffmpeg output YUV issue
fix:
move the P010_LE code block to the end of the function, the sequence
of the supported formats restored to its original.
cc: mesa-stable
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20242>
(cherry picked from commit a73e86e0a5)
The first instruction of any kernel should have non-zero emask. This
restriction needs to be obeyed to avoid GPU hangs.
Patch adds a function to insert dummy mov as first instruction
to make sure this requirement is fulfilled.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20194>
(cherry picked from commit bc4b7de0d0)
This is trying to clear a bit in the control register. However, it's
executing with whatever channel mask happens to be active. Typically
this is the one at the start of the program, so at least some channels
will be active. Typically the first channel will be active due to
packed dispatch, but that's not always guaranteed. Without NoMask,
the float controls writes may randomly not happen.
Recent GPUs also seem to have a hang issue when the first instruction in
the shader doesn't have any active channels. Having an instruction with
NoMask at the start of the program works around the issue. See HSD bug
14017989577. In our case, the float controls preamble was breaking that
restriction every time, causing us to run into this problem frequently.
Thanks to Tapani Pälli for finding this hang issue, and Francisco
Jerez and Lionel Landwerlin for helping pinpoint this issue during
review of a workaround patch in !20194.
Fixes GPU hangs in Elder Scrolls Online, Witcher 3, and likely more.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7639
Fixes: 9da56ffc52 ("i965/fs: add emit_shader_float_controls_execution_mode() and aux functions")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20214>
(cherry picked from commit bafbe7c23a)
On cmd_buffer_emit_scissor(), if VkViewport height or width are set to
a value lower than 1.0, y_max or x_max can be attributed negative values,
causing an overflow. That leads to ScissorRectangleYMax or
ScissorRectangleXMax to be set to values on an unsupported range.
Clamping x_max and y_max in the valid range solves the problem.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7471
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20200>
(cherry picked from commit 2e775b8bdb)
Android may use either DRM or some downstream solution, KGSL is a
downstream kernel driver for Adreno. Don't enable DRM when we want
Turnip to use KGSL instead of DRM.
Fixes: 09ac29cca9
("meson: Enable system_has_kms_drm for android")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20168>
(cherry picked from commit 1cfc413c9a)
If points or lines are drawn using the polygon mode, the guardband
should be adjusted for large points/lines.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20185>
(cherry picked from commit 12f26b5e6d)
We were not taking into account that when all invocations within workgroup
are active, we'll copy more data than needed, corrupting task payload
of other workgroups.
Fixes: 8aff8d3dd4 ("nir: Add common task shader lowering to make the backend's job easier.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20080>
(cherry picked from commit ffefa386fd)
Death stranding does scratch_arr[80-idx]. This doesn't seem to work if we
try to combine the subtraction into the access.
fossil-db (navi21):
Totals from 52 (0.04% of 135636) affected shaders:
Instrs: 78560 -> 79036 (+0.61%)
CodeSize: 427940 -> 431188 (+0.76%)
Latency: 1313809 -> 1318142 (+0.33%)
InvThroughput: 292833 -> 293842 (+0.34%)
VClause: 2361 -> 2555 (+8.22%); split: -0.51%, +8.73%
Copies: 8767 -> 8746 (-0.24%); split: -0.35%, +0.11%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 0e783d687a ("aco: use scratch_* for scratch load/store on GFX9+")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7735
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20117>
(cherry picked from commit 381de3c809)
add the logic for calculating film grain related
coefficients for VCN to generate film grain output.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19660>
(cherry picked from commit b283567456)
To make sure shader invocations read the correct values.
Fixes dEQP-VK.rasterization.rasterization_order_attachment_access.*.samples_*.multi_draw_barriers
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19728>
(cherry picked from commit a42f8d49c3)
link_varyings ignores precisions and can assign the same location to
variables with different precisions. nir_link_varying_precision should
check location_frac as well.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20113>
(cherry picked from commit 7244d88516)
`dzn_instance_create` will call `dzn_instance_destroy` when the d3d12
library fails to load. Just like the issue in `d3d12_screen`, this will
lead to a crash because `d3d12_mod` is NULL.
To fix this, only close the library after if it was actually opened.
Cc: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20145>
(cherry picked from commit b513389400)
`d3d12_destroy_screen` is called by `d3d12_create_dxcore_screen` after
`d3d12_init_screen_base` fails and attempts to call `util_dl_close` on
a NULL pointer, leading to an abort.
To fix this, only close the library after if it was actually opened.
Cc: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20145>
(cherry picked from commit b3d1ae19f2)
This change should avoid any accidental rounding issues because of
border colors getting stored in a float in dxil_wrap_sampler_state. It
also switches us to using the correct helpers for nir_const_value so we
can avoid any weird uninitialized data failures that can be caused by
filling out the fields in the struct directly.
Fixes: b9c61379ab ("microsoft/compiler: translate nir to dxil")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19689>
(cherry picked from commit cd5c66e165)
There were a couple of issues here:
1. We should be using nir_const_value_for_uint instead of setting the
union fields directly to ensure the rest of the union is zeroed.
2. It was always filling out the first two components of val even if
the incoming constant had 2 64-bit components.
Fixes: 165fb5117b ("r600/sfn: add lowering passes to get 64 bit ops lowered to 32 bit vec2")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19689>
(cherry picked from commit f3f1c28f8e)
In try_fold_load_store when trying to extract const addition from
non-const offset source, we should take into account that there is
already a constant base offset, which should count towards the limit.
The issue was found in "Monster Hunter: World" running on Turnip.
Fixes: cac6f633b2
("nir/opt_offsets: Use nir_ssa_scalar to chase offset additions.")
Well, the issue was present before this commit but it made a lot
of changes in surrounding code.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20099>
(cherry picked from commit 5d025f4003)
STL/LDL have 13 bits to store imm offset. However the most significant
bit in the offset is a sign bit, so the positive offset is limited by
12 bits.
nir_opt_offsets only has the upper limit and doesn't deal with
negative offsets, so shared_max should be changed to `(1 << 12) - 1`.
The issue was found in "Monster Hunter: World".
Fixes: 0b2da9d795
("ir3: Limit the maximum imm offset in nir_opt_offset for shared vars")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20100>
(cherry picked from commit 8f0177b334)
We've recently rebalanced our lab devices to get a fewer number of
grunts. Switch to scheduling only on the newer shinier ones, running
fewer tests. We'll evaluate the runtime, and if they're quick enough
then we can increase the amount of testing we do.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20081>
(cherry picked from commit 921cfcf4c4)
The idea in the single sync path is that we serialize any job that
needs to wait, however, our ANY queue syncobj only tracks the last job
submitted to any hardware queue, so in practice when we wait on this
we are only serializing against the queue to which we have submitted
the last job, which is not correct.
Fix that by accumulating the last job sync into the ANY queue synbcobj
to ensure that waiting on this syncobj effectively waits on all
hardware queues.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20078>
(cherry picked from commit 4276ec9f2a)
Instead of having functions that return early in multi-sync mode
let's only call them when we are in single-sync mode. I think this
makes the code more explicit.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20078>
(cherry picked from commit 95b9293eeb)
All data written by the user are offset by TUE header size.
Without this patch we copy the correct amount of user data, but both
"from" and "to" offsets are wrong.
Fixes: 37e78803d7 ("intel/compiler: use nir_lower_task_shader pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19409>
(cherry picked from commit db0e6f9a07)
We need this, because on Intel task payload starts with private header,
followed by user-accessible data.
Fixes: 37e78803d7 ("intel/compiler: use nir_lower_task_shader pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19409>
(cherry picked from commit f6adfd6278)
This probably wasn't noticed earlier because tests using sRGB storage
images didn't exist, and we didn't know whether this works, but this
fixes dEQP-VK.image.store.without_format.2d.*_srgb which also proves
that the bit works.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20060>
(cherry picked from commit ccef6d1f5f)
We don't care to support i8vec16, we just need a bit of 8-bit support to
implement format packing/unpacking in blend shaders. We're already doing
this by using the 16-bit pipe, we just need to commit to it all the way
-- reporting the correct sizes in max_bitsize_for_alu so the mask
packing logic works as intended -- and dropping the imov-specific hack
that was introduced to workaround a similar class of bugs.
With the previous patch, fixes:
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.1
Fixes: 39e4b7279d ("pan/midg: Fix swizzling on 8-bit sources")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19763>
(cherry picked from commit 976405907e)
We're about to make scratch surface states part of the surface state
heap. Because those are required to be in the low 26bits parts surface
state heap (we're limited in bits handed in the CFE_STATE, 3DSTATE_VS,
etc... instructions), this change splits the 32bit surface state heap
as follow:
- 8Mb of surface states for scratch
- 1Gb - 8Mb of binding tables
- 3Gb of surface states
That way all of the surfaces are located within a 4Gb region visible
from STATE_BASE_ADDRESS::SurfaceStateBaseAddress
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19727>
(cherry picked from commit daab161535)