v2: Do the synchronization in the correct place. Noticed by Curro.
Fixes: b5fa43952a ("intel/fs: Better handle constant sources of FS_OPCODE_PACK_HALF_2x16_SPLIT")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Felix DeGrood <felix.j.degrood@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17037>
This was an optimization done a while ago that doesn't seem to be having
much of an impact anymore, and on the other hand, causes all sorts of
breakage with queries, as many of our HW counters don't get incremented
when rasterization is disabled.
This fixes a bunch of issues Zink has with ANV, but more importantly, it
fixes upcoming CTS tests:
dEQP-VK.transform_feedback.primitives_generated_query.*.empty_frag.*
dEQP-VK.transform_feedback.primitives_generated_query.*.no_attachment.*
dEQP-VK.transform_feedback.primitives_generated_query.*.color_write_disable_*
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17038>
The caller may have passed ownership of intel_measure_batch structures
to intel_measure until they are ready to be gathered. The caller
needs a notification when rendering is complete and snapshots have
been processed, so it can free the resources that measure the batch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16571>
Re-allocating the buffer object for snapshots carries a heavy penalty
at run-time. When resetting a command buffer, the buffer object that
is allocated for snapshots may be re-used directly on subsequent
renders.
Stale snapshot data will persist in the buffer object. To verify that
rendering is complete, zero the final timestamp value and check that
it has been written before gathering data.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16571>
It is possible that a secondary command buffer was submitted with no
renders in it. In that case, no timestamp will be collected. Only
verify the timestamps if the index is nonzero.
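
A minimal sketch of the guard, not the actual intel_measure code. The
struct and function names (snapshot_buffer, snapshot_buffer_arm,
snapshot_buffer_ready) are hypothetical. It assumes completion is
detected by a nonzero final timestamp in a reused buffer, and treats a
batch with index zero as having nothing to verify:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SNAPSHOTS 8

/* Hypothetical stand-in for a reused snapshot buffer object. */
struct snapshot_buffer {
   uint64_t timestamps[MAX_SNAPSHOTS];
   size_t index;   /* number of snapshots recorded in this batch */
};

/* Zero the final timestamp so stale data from a previous use of the
 * buffer cannot be mistaken for a completed write. */
static void snapshot_buffer_arm(struct snapshot_buffer *buf)
{
   if (buf->index > 0)
      buf->timestamps[buf->index - 1] = 0;
}

/* Returns true when it is safe to gather results. */
static bool snapshot_buffer_ready(const struct snapshot_buffer *buf)
{
   if (buf->index == 0)
      return true;   /* no renders, so there is no timestamp to verify */
   return buf->timestamps[buf->index - 1] != 0;
}

static void snapshot_buffer_selfcheck(void)
{
   /* Buffer still holds stale values from an earlier submission. */
   struct snapshot_buffer buf = { .timestamps = { 111, 222, 333 }, .index = 3 };
   snapshot_buffer_arm(&buf);
   assert(!snapshot_buffer_ready(&buf));   /* stale 333 was cleared */
   buf.timestamps[2] = 444;                /* simulate the GPU write */
   assert(snapshot_buffer_ready(&buf));

   /* Secondary command buffer with no renders. */
   struct snapshot_buffer empty = { .index = 0 };
   assert(snapshot_buffer_ready(&empty));
}
```

Without the index check, the empty case would read the slot before the
start of the array.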
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16571>
If this environment variable is set, then a detected compute engine
will be used as described in docs/envvars.rst.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14395>
This is now needed following Ken's 8831cb38aa.
Ref: 8831cb38aa ("anv: Stop updating STATE_BASE_ADDRESS on XeHP")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14395>
I don't know when this was added but it's really neat and we should use
it instead of NIR_PASS_V since NIR_DEBUG=print and a few validation
things will work better.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17014>
I don't know when this was added but it's really neat and we should use
it instead of NIR_PASS_V since NIR_DEBUG=print and a few validation
things will work better.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17014>
This fixes many tests in the following groups on DG2:
dEQP-VK.memory_model.*
dEQP-VK.fragment_shader_interlock.*
v2: use memory scope and setup descriptor also
for barriers without defined scope (Curro),
use local scope and flush type none with
NIR_SCOPE_NONE scope, cleanups (Lionel)
v3: use LSC_FENCE_THREADGROUP for NIR_SCOPE_WORKGROUP,
remove default case (Curro), use eviction if scope
was not defined, use LSC_FENCE_GPU scope for vertex
stage
v4: use LSC_FENCE_TILE independent of stage for device
scope (Curro)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15743>
fs_visitor::assign_curb_setup() maps UNIFORM registers to HW regs,
and contains the following assert:
assert(inst->src[i].stride == 0);
emit_a64_oword_block_header's striding tricks run afoul of this
restriction, by producing stride 1 values on a 64-bit UNIFORM source.
Work around this by copying the UNIFORM value to a VGRF first.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16938>
This will happen automatically when they're waited on by the dummy
submit in wsi_common_queue_present().
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4037>
With the old value, anv didn't think that the hardware supported 48-bit
addresses, and hit this assert:
assert(device->supports_48bit_addresses == !device->use_relocations);
The new value of 1ull << 48 is the one reported on my Icelake machine.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16933>
If the executables are still hanging out,
anv_GetPipelineExecutableStatisticsKHR will try to dereference NULL
pointers in pipeline->shaders[MESA_SHADER_FRAGMENT].
At least in terms of fossil-db output, this matches the behavior from
before 73b3efcd59.
Fixes: 73b3efcd59 ("anv: Handle the null FS optimization after compiling shaders")
Closes: #6590
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16898>
Previously `install-intel-gpu-tests` controlled this, but now
`install-intel-gpu-tests` will only be used to decide if it should be
installed.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16902>
We need this so C++ will understand "restrict" which is used in the
genxml output.
Fixes: 9f717b5f23 ("util: remove needless c99_compat.h includes")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16899>
NIR has fdiv, and all the NIR backends have to have lower_fdiv set
appropriately already since various passes (format conversions,
tgsi_to_nir, nir_fast_normalize(), etc.) might generate one.
This causes softpipe and llvmpipe to now do actual divides, since
lower_fdiv is not set there. Note that llvmpipe's rcp implementation is a
divide of 1.0 by x, so now we're going to be just doing div(x, y) instead
of mul(x, div(1.0, y)).
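
For illustration only, the two forms can be sketched in plain C. The
function names are hypothetical and this is not llvmpipe code. The
direct divide rounds once, while the reciprocal form rounds twice, so
the two may differ in the last bit for some inputs:

```c
#include <assert.h>

/* One operation, one rounding step. */
static float div_direct(float x, float y)
{
   return x / y;
}

/* Two operations: the reciprocal is rounded, then the product is. */
static float div_via_rcp(float x, float y)
{
   float rcp = 1.0f / y;
   return x * rcp;
}

static void div_selfcheck(void)
{
   /* 0.25 and 2.5 are exactly representable, so both forms agree here. */
   assert(div_direct(10.0f, 4.0f) == 2.5f);
   assert(div_via_rcp(10.0f, 4.0f) == 2.5f);
}
```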
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16823>
We were using both uint32_t and anv_cmd_dirty_mask_t. This cleanup
makes the type usage consistent. It also changes the type of the mask
to enum anv_cmd_dirty_bits.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16849>
This will allow the use of static_assert here instead of our
compiler-specific implementation.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16670>
Call it once instead of calling the very same function for each source
and destination. This should make those ternary operators a little
easier to read, IMHO.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15835>
In opt_algebraic(), handle TYPE_DF in a different check than TYPE_Q. We have a
separate flag for each type, so use separate checks so that platforms where one
is true and the other is not can work properly.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15835>
Don't compute it based on devinfo->has_64bit_float. Otherwise we may
end up emitting 64-bit int (Q) instructions on platforms with 64-bit
floats but not 64-bit integers.
Right now, the only platforms where has_64bit_int is different from
has_64bit_float are the platforms that use GFX7_FEATURES.
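
A rough sketch of the idea, using a hypothetical devinfo_sketch struct
and reg_type enum rather than the real compiler types. Each 64-bit type
is gated on its own flag instead of both being derived from
has_64bit_float:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of register types. */
enum reg_type { TYPE_F, TYPE_DF, TYPE_D, TYPE_Q };

/* Hypothetical stand-in carrying only the two flags discussed here. */
struct devinfo_sketch {
   bool has_64bit_float;
   bool has_64bit_int;
};

static bool type_is_supported(const struct devinfo_sketch *devinfo,
                              enum reg_type type)
{
   switch (type) {
   case TYPE_DF:
      return devinfo->has_64bit_float;   /* doubles: float flag only */
   case TYPE_Q:
      return devinfo->has_64bit_int;     /* quadwords: int flag only */
   default:
      return true;                       /* 32-bit types in this sketch */
   }
}

static void type_selfcheck(void)
{
   /* A platform with 64-bit floats but no 64-bit integers. */
   struct devinfo_sketch devinfo = { .has_64bit_float = true,
                                     .has_64bit_int = false };
   assert(type_is_supported(&devinfo, TYPE_DF));
   assert(!type_is_supported(&devinfo, TYPE_Q));
}
```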
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15835>
This expression accidentally performs a 32-bit sign-extension when
processing the second half of the expression (the low 16 bits).
Consider -7W, which is represented as 0xfff9fff9 in our encoding (the
16-bit word is replicated to both halves of the 32-bit dword).
Tigerlake's compaction stores the low 11-bits of an immediate as-is,
and replicates the 12th bit. So here, compacted_imm will be 0xff9.
(((int)(0xff9 << 20) >> 4) |
 ((short)(0xff9 << 4) >> 4))
0xfff90000 | (0xff90 >> 4)
0xfff90000 | 0xfffffff9 ...oops...
0xfffffff9
By casting the second line of the expression to unsigned short, we
prevent the sign-extension when it combines both parts, so we get:
0xfff90000 | 0x0000fff9
0xfff9fff9
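
The arithmetic above can be reproduced with a small standalone sketch.
The function names are hypothetical and this is not the compaction code
itself. It assumes the usual GCC/Clang behavior of arithmetic right
shift on signed values and wrapping narrowing conversions:

```c
#include <assert.h>
#include <stdint.h>

/* Old behavior: the (short) result is promoted to int and sign-extended,
 * so the OR overwrites the high 16 bits with ones. */
static uint32_t uncompact_buggy(uint32_t compacted_imm)
{
   return (uint32_t)(((int)(compacted_imm << 20) >> 4) |
                     ((short)(compacted_imm << 4) >> 4));
}

/* With the unsigned short cast, the low half is zero-extended before
 * it is combined with the high half. */
static uint32_t uncompact_fixed(uint32_t compacted_imm)
{
   return (uint32_t)(((int)(compacted_imm << 20) >> 4) |
                     (unsigned short)((short)(compacted_imm << 4) >> 4));
}

static void uncompact_selfcheck(void)
{
   /* -7W compacts to 0xff9 and should expand back to 0xfff9fff9. */
   assert(uncompact_buggy(0xff9) == 0xfffffff9u);
   assert(uncompact_fixed(0xff9) == 0xfff9fff9u);
}
```

Positive values such as 0x007 are unaffected, since there is no sign
bit to extend in either half.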
Fixes: 12d3b11908 ("intel/compiler: Add instruction compaction support on Gen12")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16833>
When this landed, the Autotools build system was already removed. Why
was this file added in the first place? Probably a rebase-mistake...
Fixes: 134e750e16 ("i965: extract performance query metrics")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16790>
Fixes tests matching:
dEQP-VK.pipeline.extended_dynamic_state.cmd_buffer_start.*unused_ms
These tests bind a mesh pipeline, immediately bind a non-mesh pipeline
afterwards, and expect that binding the mesh pipeline was a no-op.
v2: do it in one place & add comment (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16811>
It looks like atomics are slow on compressed surfaces, so enabling
compression for storage images that may be used for atomic operations
hinders performance. Let's just disable compression in this scenario.
v2: Reword comment (Ken)
Allow mutable with 16/32/64 bits (Ken)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14712>
v2: Add a field in isl_format with per-gen support (Lionel)
v3: Fixup R32_FLOAT from 80 to 90
Fixup R32_[SU]INT from 80 to 70 (Ken)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14712>
Also document additional piglit failures and passes.
Multiple changes, most notably:
- a few new tests
- a fixed test for an upcoming mesa MR
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16785>
Now that the resulting xfb_info is stashed on the shader, we can put
this with all the other NIR stuff and only fetch it out at the last
minute when we upload the kernel.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16750>
This isn't really necessary because the API doesn't allow MSAA and
mipmapping at the same time, but people forget that pretty often, so
it's good to have it as documentation if nothing else.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14129>