Instead of checking for MESA_SHADER_COMPUTE (and KERNEL). Where
appropriate, also use gl_shader_stage_is_compute().
This allows most of the workgroup-related lowering to be applied to
Task and Mesh shaders. These will be added later and "inherit" from
cs_prog_data structure.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13629>
Fix defect reported by Coverity Scan.
Uninitialized pointer field (UNINIT_CTOR)
uninit_member: Non-static class member error is not initialized
in this constructor nor in any functions that it calls.
Fixes: 7558340ebb ("intel/compiler: Add helpers to select SIMD for compute shaders")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13608>
brw_simd_select return type is int.
Fix defect reported by Coverity Scan.
Unsigned compared against 0 (NO_EFFECT)
unsigned_compare: This less-than-zero comparison of an unsigned value is never true. selected_simd < 0U.
Fixes: 7dda0cf2b8 ("intel/compiler: Use SIMD selection helpers for CS")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13606>
Fixes dEQP-VK.reconvergence.*nesting* tests.
There are cases when cmod is set to an instruction that cannot have
conditional modifier. E.g. following:
find_live_channel(32) vgrf166:UD, NoMask
cmp.z.f0.0(32) null:D, vgrf166+0.0<0>:D, 0d
is optimized to:
find_live_channel.z.f0.0(32) vgrf166:UD, NoMask
v2:
- Add unit test to check cmod is not set to 'find_live_channel' (Matt Turner)
- Update flag_subreg when conditonal_mod is updated (Ian Romanick)
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5431
Fixes: 32b7ba66b0 ("intel/compiler: fix cmod propagation optimisations")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13268>
Unless we are combining multiple workgroups in the same HW thread,
there's no advantage of using SIMD16 when SIMD8 already fits the
entire workgroup.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13249>
Variable workgroup size works by compiling as much SIMD variants as
possible and then selecting the right one during dispatch (when the
actual workgroup size is passed to us).
Instead of replicating the logic in a separate function, reuse the
same logic for regular SIMD selection. And move function for that
together with the remaining simd selection functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13249>
Clean up the logic and move it to functions that work with prog_data
attributes to select the right SIMD. This shouldn't change any
behavior compared to the original.
Having it extracted will allow reuse by Task/Mesh and make it easier
to write tests.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13249>
This doesn't impact any performance since the previous typo value
matches the current cache control value.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13458>
This is separate from images and samplers. It's a texture (not a
storage image) without a sampler. We also add C-visible helpers to
convert between sampler and image types.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13389>
With the `gtest` protocol meson will add some extra arguments to the
test to generate better junit results, which may be useful. This
protocol is only available in meson 0.55.0+, so keep using the default
`exitcode` protocol for meson older than that.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8484>
INTEL_DEBUG is defined (since 4015e1876a) as:
#define INTEL_DEBUG __builtin_expect(intel_debug, 0)
which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.
Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.
Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
With gtest is possible to filter execution and run only a specific
test suite or individual test, so there's no particular reason here to
generate multiple binaries for the tests of a single module.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13303>
Include vec4 in their names to avoid same names as the fs
counterparts. This will allow compiling all the tests together in the
future.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13303>
These got moved into common code a good while ago.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13328>
We're about to allow unknown format for specific formats in Anv.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13198>
Changes:
- nir_metadata_preserve(..., nir_metadata_block_index | nir_metadata_dominance)
is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
make progress
- pass returns true ONLY when it makes progress ("progress" was initialized incorrectly)
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13189>
In the upcoming intel_clc tool, we're allowing to print these messages
out and some of them just don't look right.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13165>
Make INTEL_DEBUG=blorp dump the blorp compute shaders instead using
the general INTEL_DEBUG=cs which is now reserved for actual compute
programs.
Ref: 05933fb0f7 ("intel/compiler: Use INTEL_DEBUG=blorp to dump blorp shaders")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
When they re-arranged all the dataport stuff and added the LSC, doing
URB fencing through the dataport no longer makes sense. Instead, there
is now a fence message on the URB shared function.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13092>
Found this nugget while looking at the ACO driver. It seems sensible to
avoid SLM fences if there is no SLM. This also makes the check depend
on SLM usage rather than just shader stage which will be useful if we
ever implement task/mesh because task shaders also have SLM.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13092>
Start off making everything look like LSC where we have three types of
fences: TGM, UGM, and SLM. Then, emit the actual code in a generation-
aware way. There are three HW generation cases we care about:
XeHP+ (LSC), ICL-TGL, and IVB-SKL. Even though it looks like there's a
lot to deduplicate, it only increases the number of ubld.emit() calls
from 5 to 7 and entirely gets rid of the SFID juggling and other
weirdness we've introduced along the way to make those cases "general".
While we're here, also clean up the code for stalling after fences and
clearly document every case where we insert a stall.
There are only three known functional changes from this commit:
1. We now avoid the render cache fence on IVB if we don't need image
barriers.
2. On ICL+, we no longer unconditionally stall on barriers. We still
stall if we have more than one to help tie them together but
independent barriers are independent. Barrier instructions will
still operate in write-commit mode and still be scheduling barriers
but won't necessarily stall.
3. We now assert-fail for URB fences on LSC platforms. We'll be adding
in the new URB fence message for those platforms in a follow-on
commit.
It is a big enough refactor, however, that other minor changes may be
present.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13092>
The lowering passes will soon be moved to another function, so there
won't be any choice.
As a side benefit, this allows eliminating the uses_atomic_load_store
**pointer** parameter from brw_nir_lower_storage_image. For some reason
crocus was passing false instead of NULL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
Previously, we misunderstood how conditional modifiers and saturate
interacted. We thought the condition was evaulated before the saturate
was applied. For the floating point cases, we went to some heroics to
modify the condition to maintain the same results. For integer cases,
it was not clear that this could even work. We had no use-cases and no
tests-cases, so we just disallowed everything.
Now we understand that the condition is evaluated after the saturate.
Earlier commits in this series removed the various floating point
heroics. It is easier to just delete the code that prevents some cases
that just work.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
Originally this was part of "intel/fs: Remove condition-based
restriction for cmod propagation to saturated operations". With some
additional changes to that commit, it caused a lot of extra churn in the
unit tests. I felt that made it harder to see the actual changes in the
unit tests, so I split it out.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>