In order to implement the ballot intrinsic, we do a MOV from flag
register to some GRF. If that GRF is used in a SEL, cmod propagation
helpfully changes it into a MOV from the flag register with a cmod.
This is perfectly valid but when lower_simd_width comes along, it simply
splits into two instructions which both have conditional modifiers.
This is a problem since we're reading the flag register. This commit
makes us check whether or not flags_written() overlaps with the flag
values that we are reading via the instruction source and, if we have
any interference, will force us to emit a copy of the source.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
v2: use a more generic compat function
v3: rename and formatting cleanup
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103388
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
CC: <mesa-stable@lists.freedesktop.org>
The beginning of the end for the shader keys. Not entirely sure
what I'm going to replace them with for the compiler though, so this
is the first step.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This helps valgrind when encode_type_to_blob is used.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Not sure if this is the best place to put it, but we're going to need
this for NIR too.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
These assertions were revisited a couple of times in the past, and they
still weren't quite right.
The problem I was seeing (with some other state tracker) was a copy between
two 512x512 s3tc textures, but from mip level 0 to mip level 8. Therefore,
the destination has only size 2x2 (not a full block), so the box width/height
was only 2, causing the assertion to trigger for src alignment.
As far as I can tell, such a copy is completely legal, and because a correct
assertion would get ridiculously complicated just get rid of it for good.
Reviewed-by: Brian Paul <brianp@vmware.com>
This way, we know what we're allowed to use (no nested include lists
for instance) and users get immediate feedback when trying to use
unsupported versions, rather than a cryptic crash or things being
silently not built correctly.
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Libunwind has some issues on some platforms, so let's allow people
who have issues to opt-out. This is similar to what we do in automake,
and the implementation is modelled after our opt-out for valgrind.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Following test checking entrypoints passes:
dEQP-EGL.functional.get_proc_address.extension.gl_ext_occlusion_query_boolean
Piglit test 'ext_occlusion_query_boolean-any-samples' passes with these changes.
No changes/regression observed in WebGL occlusion tests or Intel CI.
v2: add es2="2.0" for glapi entrypoints, clean up xml
dispatch_sanity changes (fix 'make check')
Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Some of the checks are valid for generic ES 3.2 as well.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
It's still printed after linking, but it makes more sense to
have SPIRV->NIR->LLVM IR->ASM.
Fixes: f0a2bbd1a4 (radv: move nir print after linking is done)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This is needed for RADV to support explicit component packing.
This is also required to use the new NIR component splitting /
packing passes.
V2:
- add commponent packing support for interpolate_at* intrinsics
- improve store packing support when not all varyings are scalar
as spotted by Bas the store source was incorrectly offset.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Not sure how useful this is, but it makes it more consistent.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
For certain buffer meta ops we can use the CP or a compute shader,
we should use a define to rather than hardcoding 4096, allows
for easier testing and more consistency.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Drivers have supported KHR_no_error for a while. We'd been leaving it
marked as "in progress" because there's a zillion places that could get
slightly more optimized. But, Timothy and Samuel have already done
piles of work, and I think we have a solid implementation at this point.
Let's check it off the list.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This properly sets stage_state->push_constant_dirty = true, so that we
emit 3DSTATE_CONSTANT_XS to disable the constant buffer for the shader
stage. It also sets stage_state->push_const_size = 0.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
We have a gl_program and we want a gl_program. There's no point in
converting to brw_program and back again. This probably made more
sense in the old days before Tim dropped a layer of subclassing.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
radeonsi only emits these when dfsm is enabled, so for now
just hinge them on a flag we never set.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Mostly copy/pasta from Dylan Baker's conversion of nouveau and i965.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Also needed in freedreno/ir3.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Similar to 848da66222, pass an arg to
ir3_nir_trig.py to add to python path, rather than using $PYTHONPATH,
to prep for meson build support.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Compute shaders don't have access to the framebuffer, so there's no
point in worrying whether a texture is bound as a render target.
This saves a bunch of resolves in GFXBench4 Manhattan 3.1, but doesn't
seem to impact performance at all, at least on Apollolake.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
gcc is throwing this warning in my meson build:
../src/intel/compiler/brw_eu_validate.c:50:11: warning
argument 1 null where non-null expected [-Wnonnull]
return memmem(haystack.str, haystack.len,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
needle.str, needle.len) != NULL;
~~~~~~~~~~~~~~~~~~~~~~~
The first check for CONTAINS has a NULL error_msg.str and 0 len. The
glibc implementation will exit without looking at any haystack bytes if
haystack.len < needle.len, so this was safe, but silence the warning
anyway by guarding against implementation variablility.
Fixes: 122ef3799d ("i965: Only insert error message if not already present")
Reviewed-by: Matt Turner <mattst88@gmail.com>
To enable per-context priorities, we need to have per-context pipe's.
Unfortunately we still need to keep the global screen pipe, mostly just
for screen->get_timestamp().
Signed-off-by: Rob Clark <robdclark@gmail.com>
To add context priority support we need to have an fd_pipe per context,
rather than per-screen. Which conflicts with existing ctx->pipe (which
is actually a visibility stream pipe (hw resource). So just rename it.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Instead of plain snprintf(). To fix the MSVC build.
snprintf() is used in various places in Mesa/gallium, but apparently,
not in code built with MSVC.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
I'm working on radeonsi support in the Chrome OS Android container
(ARC++). Mesa in ARC++ uses autotools instead of Android.mk, but all
the necessary EGL bits are there, so the existing check is too strict.
Signed-off-by: Benjamin Gordon <bmgordon@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
This restores performance for the drirc workaround, i.e.
KILL_IF does:
visible = src0 >= 0;
kill_flag &= visible; // accumulate kills
amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only
And all helper pixels are killed at the end of the shader:
amdgcn_kill(kill_flag);
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
We now have linking optimisations so we want to delay dumping the
nir until after these are complete.
Fixes: 06f05040eb (radv: Link shaders)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This fixes a regression I introduced refactoring this code,
I managed to invert range twice, I moved the inversion into
the common code, but forgot to stop doing it in the callee.
Fixes: GL45-CTS.multi_bind.dispatch_bind_buffers_base
Fixes: 35ac13ed3 (mesa/bufferobj: consolidate some codepaths between ubo/ssbo/atomics.)
Reported-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>