This reverts commit 939ddf3f67.
Intel has a separate pass for fusing FFMAs selectively. We split
these flags in commit 1b72c31e1f and
the reasoning still stands. The patch being reverted was just a
cleanup, so there should be no issue with reverting it.
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6849>
To avoid having a separate "wsi_fence" path in the driver, make it so wsi
fences can signal a syncobj.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6707>
This makes it explicit that this intrinsic is only for SSBOs. For the
v3dv driver, we'll be adding a get_ubo_size intrinsic and we want to be
able to distinguish between the two.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6812>
It's not really unordered in the sense that it can still stall on
ordered things and we don't need a SYNC_NOP for that because it is a
SYNC_NOP. However, it also doesn't count when computing instruction
distances.
Fixes: 18e72ee210 "intel/fs: Add FS_OPCODE_SCHEDULING_FENCE"
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6781>
this catches some undefined behavior like e.g., using a stale descriptorset
that references deleted bos, which I would absolutely never do
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6747>
This has been shown to help performance on TGL and DG1. This could be
applied to gen9+, but we still need to show if it helps with those
platforms.
Rework:
* Make change in src/intel/vulkan/genX_cmd_buffer.c too. (Ken)
* Keep mask as 3 for gen < 12
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6684>
Anv doesn't do multi-layer fast-clear tracking, but TGL may add
fast-clears to multiple layers. Disable CCS_E for image arrays on TGL+
until anv gets more clear color tracking abilities.
With this change, anv+TGL now passes:
* dEQP-VK.multiview.readback_implicit_clear.15_15_15_15
* dEQP-VK.multiview.readback_implicit_clear.8_1_1_8
* dEQP-VK.multiview.readback_implicit_clear.1_2_4_8_16_32
* dEQP-VK.multiview.renderpass2.readback_implicit_clear.15_15_15_15
* dEQP-VK.multiview.renderpass2.readback_implicit_clear.8_1_1_8
* dEQP-VK.multiview.renderpass2.readback_implicit_clear.1_2_4_8_16_32
v2. Mention HSD 14010672564. (Sagar)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6528>
Coverity complains about possible out-of-bounds write & read, because
it thinks that "loc + i" can be bigger than sizes of the 2 used arrays.
It's not obvious from the code it cannot happen, so add asserts here.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667>
Found by Coverity, as "argument cannot be negative", referring to
fread's 2nd argument.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667>
These variables seem to be initialized before being used, so this
patch is not fixing any bug, but leaving them unitialized may become
a bug after some refactoring.
These classes were affected: fs_reg_alloc, fs_visitor, fs_generator,
instruction_scheduler.
Found by Coverity.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667>
Add a function suitable for planar YUV surfaces. For these surfaces,
drivers remap each plane to an RGB-formatted surface. Enable drivers to
pass the plane index and the original YUV format to get the right
aux-map format bits.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6486>
* Define ISL equivalents for the P010, P012, and P016 formats.
* Add aux-map encodings for the YUV formats iris will soon support.
v2. Replace &&'s with ||'s in isl_format_is_planar() (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6486>
For UBO accesses to be the same performance as classic GL default uniform
block uniforms, we need to be able to push them through the same path. On
freedreno, we haven't been uploading UBOs as push constants when they're
used for indirect array access, because we don't know what range of the
UBO is needed for an access.
I believe we won't be able to calculate the range in general in spirv
given casts that can happen, so we define a [0, ~0] range to be "We don't
know anything". We use that at the moment for all UBO loads except for
nir_lower_uniforms_to_ubo, where we now avoid losing the range information
that default uniform block loads come with.
In a departure from other NIR intrinsics with a "base", I didn't make the
base an be something you have to add to the src[1] offset. This keeps us
from needing to modify all drivers (particularly since the base+offset
thing can mean needing to do addition in the backend), makes backend
tracking of ranges easy, and makes the range calculations in
load_store_vectorizer reasonable. However, this could definitely cause
some confusion for people used to the normal NIR base.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
brw_compile_gs and brw_compile_tcs are extern C functions, but are
defined inside of brw namespace, which somehow works but confuses
Eclipse CDT's code analysis.
Move these functions out of brw namespace and fix references to
objects from brw namespace.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6602>
This also fixes the inverted last parameter of nir_lower_flrp in most drivers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>
This doesn't really do anything for us today. One day, I suppose we
could use it to do something with wide loads with non-uniform offsets.
The big reason to do this is to get better testing to make sure that NIR
doesn't blow up on the deref paths.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
Instead, we do a limited indirect deref lowering and then use
nir_lower_vars_to_explicit_types and nir_lower_explicit_io to lower it
as if it were SSBO or global memory access. Among other things, this
should enable pointer arithmetic on local variables. Fun!
The only shader-db change from this change on ICL was a few tiny cycle
count changes in 7 Aztec Ruins compute shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>
Instead of always lowering everything, we add a threshold such that if
the total indirected array size (AoA size) is above that threshold, it
won't lower. It's assumed that the driver will sort things out somehow
by, for instance, lowering to scratch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>
Since everything flows through NIR and we're doing all of our indirect
deref lowering there now, there's no reason to keep making those
decisions in brw_compiler and stuffing them in the GLSL compiler
structs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>
There's no good reason to split this into three. Sure, CS indirects are
only guaranteed by the spec to be DWORD aligned, but that's all untyped
surface reads require anyway.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6570>
This can come up if, for instance, the shader does a derivative of a
uniform or flat input. Ideally, NIR would use divergence analysis to
get rid of the derivative in this case but it doesn't right now. This
fixes a crash in F1 2017.
Cc: mesa-stable@lists.freedesktop.org
Reported-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6564>
When we have softpin, we know the address of the shader constant data at
shader upload time because it's sitting at the end of the shader. This
commit changes ANV to use patch constants to embed the address in the
shader patch the right address in at upload time. This allows us to
avoid having to set up a UBO binding on-the-fly for shader constants.
This commit uses an A64 message but it's quite possible that we could
also use an A32 message and make the dataport do the 64-bit add for us.
However, load_global is what we have right now so it was easier to just
use that.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>