spirv_to_nir sometimes wraps derefs in vec2 or mov instructions as part of
its texture handling. These get in the way of
nir_rematerialize_derefs_in_use_blocks_impl. Running copy propagation
should get rid of the extra move instructions and get us back to intact
deref chains for everything except variable pointer use-cases.
fossil-db (Sienna Cichlid):
Totals from 6 (0.00% of 134572) affected shaders:
CodeSize: 92656 -> 93088 (+0.47%)
Instrs: 17060 -> 17138 (+0.46%)
Latency: 224408 -> 227539 (+1.40%)
InvThroughput: 37402 -> 37924 (+1.40%)
VClause: 408 -> 402 (-1.47%)
Copies: 1065 -> 1107 (+3.94%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5668
Fixes: 14a12b771d ("spirv: Rework our handling of images and samplers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13924>
(cherry picked from commit b425100781)
The function need_temp_reg_initialization looks suspecious.
It will only ever return true if we get past this if:
if (!(emit->info.indirect_files && (1u << TGSI_FILE_TEMPORARY)) ...
Using the logical && means the intended initialization done
based on the result of this check is not performed.
This code was both introduced and altered in MR 5317.
ccb4ea5a introduces the function.
ba37d408 is a collection of performance improvements and misc
fixes. This altered the if from using bitwise to logical and.
This commit changes it back to bitwise.
Spotted from a compile warning.
Fixes: ba37d408da ("svga: Performance fixes")
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12157>
(cherry picked from commit 64292c0f05)
Android build system may use different internal variables to specify
cflags/cppflags.
Small change in product confguration may force Android to use diffrent
set of variables, therefore we should keep all of them attached to the
make rule's target.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5549
Fixes: 8621bd8d5e ("android: Add scripts to build using meson")
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13914>
(cherry picked from commit 32ec0fffa6)
Now that we removed the intel intrinsic and just use the generic one,
we can skip it in the intel call lowering pass and just deal with it
in the intel rt intrinsic lowering.
v2: rewrite with nir_shader_instructions_pass() (Jason)
v3: handle everything in switch (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 423c47de99 ("nir: drop the btd_resume_intel intrinsic")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12113>
(cherry picked from commit c5a42e4010)
If the source and destination were within the same full register, like
hr90.x and hr90.y (which both map to r45.x), then we'd perform the
swap/copy with the wrong register. This broke
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.35 once BDA is enabled.
Fixes: 0ffcb19b9d ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
(cherry picked from commit c98adc56f4)
VK CTS just added some new tests to write to a compressed image
from a compute shader, which was overrunning memory.
The image width/height need to be sized according to the block
sizes to avoid overwriting memory.
dEQP-VK.image.sample_texture.*bit_compressed*
Cc: mesa-stable
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13618>
(cherry picked from commit 27903abbb6)
p_is_helper doesn't have any operands, so ACO's value numbering and/or
the pre-RA optimizer could incorrectly recognize two such instructions
as the same.
This patch adds exec as an operand to p_is_helper in order to achieve
correct behavior.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5570
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13577>
(cherry picked from commit d80c7f3406)
Future LLVM header leads to __declspec(__restrict), which is invalid.
Just undefine the restrict macro to keep __declspec(restrict).
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit e0de7aa4d7)
# Conflicts:
# src/amd/compiler/aco_print_asm.cpp
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13813>
When the client calls vkMapMemory(), we have to align the requested
offset down to the nearest page or else the map will fail. On platforms
where we have DRM_IOCTL_I915_GEM_MMAP_OFFSET, we always map the whole
buffer. In either case, the original map may start before the requested
offset and we need to take that into account when we clflush.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
(cherry picked from commit 90ac06e502)
This is already done on lines 475-480, resulting in them appearing twice
in the summary.
Fixes: 47946855f1 ("meson: allow egl_native_platform to be specified")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13278>
(cherry picked from commit 9ad375bdcd)
A bit difficult to find what commit introduced the issue because of
all the renaming, but it was my bug :)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
(cherry picked from commit 349bfb7275)
10c75ae4 moved handling of this state to the functions that
depend on ctx->_ImageTransferState.
So we can't depend on _NEW_PIXEL being set to call this function,
since it'll be always clear earlier by _mesa_update_state_locked.
Example sequence that would trigger the issue:
glPixelTransferi(...)
glClear(...)
glTexSubImage2D(...) <-- won't use the new value set by
glPixelTransferi because glClear caused
_NEW_PIXEL to be cleared.
_NEW_PIXEL itself is kept because st_update_pixel_transfer depends
on it.
Fixes: 10c75ae4 ("mesa: move _mesa_update_pixel out of _mesa_update_state")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5273
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13596>
(cherry picked from commit 1ee3fbd703)
- Unset depth_cleared_level_mask for non-clear blits. Set the flag after
the clear, so that we don't have to check blitter_running.
- Set depth_cleared_level_mask only when we set depth_clear_value.
Fixes: ff8a930cf7 - radeonsi: add _once suffix to depth_cleared_level_mask
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13603>
(cherry picked from commit 5d3aea49b8)
vertex shader stages that can produce xfb must have
their input size clamped to the compiler define MAX_VARYING
to successfully be able to export an xfb output for each input
fixes KHR-GL46.geometry_shader.limits.max_input_components
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13632>
(cherry picked from commit 5d1b81d8ac)
if no restart indices are found, this draw must be discarded to avoid
crashing later on
Fixes: 583070748c ("util/primconvert: handle rewriting of prim-restart draws with unsupported primtype")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13630>
(cherry picked from commit bc345281ab)
Fixes dEQP-VK.reconvergence.*nesting* tests.
There are cases when cmod is set to an instruction that cannot have
conditional modifier. E.g. following:
find_live_channel(32) vgrf166:UD, NoMask
cmp.z.f0.0(32) null:D, vgrf166+0.0<0>:D, 0d
is optimized to:
find_live_channel.z.f0.0(32) vgrf166:UD, NoMask
v2:
- Add unit test to check cmod is not set to 'find_live_channel' (Matt Turner)
- Update flag_subreg when conditonal_mod is updated (Ian Romanick)
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5431
Fixes: 32b7ba66b0 ("intel/compiler: fix cmod propagation optimisations")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13268>
(cherry picked from commit 2dbb66997e)
there's other variants of implicit lod sampling, and none of them are valid
outside fragment stage
Fixes: 3ad06b6949 ("zink: always use explicit lod for texture() when legal in non-fragment stages")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13585>
(cherry picked from commit 87fbb0eab0)
sysctl is the correct way of getting the current executable's path.
procfs is not mounted by default.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1598>
(cherry picked from commit 98dbd01a96)