Given an array of gen_insts representing a structured program,
fill in the missing JIPs and UIPs to follow that structure.
The input array must provide JIPs for the WHILE instructions (the
"back-edges", since there's no DO in Gfx9+). It optionally can
provide other JIPs or UIPs, their values will be used instead of
the calculated one.
The input JIPs and UIPs are absolute index values in the array,
and after finish they will be converted into relative byte offsets,
which is what the hardware will use.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Port the validation rules from brw_eu_validate.cpp. This also
ports the tests of the validation, so we can check whether the
rules actually flag the cases.
Also include some new validation cases derived from asserts in
brw_eu encoding logic.
Assisted-by: Pi coding agent (gpt-5.5, opus-4.6)
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Add a new module that can produce the binary encoded representation of
the instructions. Some key differences from existing encoding logic
in brw:
- Use a struct to represent the instructions before final encoding.
This is similar to the struct we already use in validation. This
allows generator/validation code to ignore details of instruction
formatting and just "set src0 to something".
- Split the encoding logic between Pre-Xe (Gfx9 and Gfx11) and Xe (from
Gfx12 and up). They are documented differently, so splitting makes
both sides easier to deal with.
- Try to follow the bit range numbers as they are documented in the
spec, programatically shifting them when needed. This means numbers
in code match PRMs / BSpec.
Later patches will add compaction and make use of the module in various
parts of the code.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
We're required to support this extension for Android VP17.
We've tried supporting it through the use of
CMF_DISABLE_WRITE_COMPRESSION but some regressions are measures
(-0.5~-1.0%).
We're not aware using CMF_DISABLE_WRITE_COMPRESSION would prevent any
application bug so it doesn't feel useful to implement.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41187>
This implements the extension on the Graphics and Compute queues using
Blorp OpenCL compute shaders. Support for the Transfer queue will come
in a later patch. We also don't support 24/48/96 bpp formats yet.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
In the next patch we will use mi_builder.h from blorp code, so this
commit prepares the terrain for that by adding the necessary
definitions that the header requires.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
We're going to use this for indirect copies, as we need to iterate
through the indirect buffer checking the copy sizes, then pick the
maximum copy size in order to launch the indirect compute shader.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
Just like mi_ior(), but for xor. We're going to use it in one of the
next commits.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
Directories are named using the process name and PID to avoid overwriting dumps from
subsequent runs of the same application.
v2 (Caio): Use util_get_process_name(). Change to be default behavior.
Old behavior still accessible via MDA_OUTPUT_DIR="." env var.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39125>
This reverts commit 97391328a3.
This broke devenv because DRIRC_CONFIGDIR doesn't point the folder that
contains everything anymore.
DRIRC_CONFIGDIR will be modified to take the standard `:`-separated list
of paths, but until then, revert this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41890>
The `libanv_drirc` custom target passes `00-mesa-defaults.conf` to the
generator script via the `--validate` flag, but it was hardcoded as a
path rather than being declared as an input dependency.
When translating the build to a hermetic build system, this missing
dependency causes the build to fail because the undeclared file is not
provisioned in the sandboxed build environment.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41886>
genX(batch_emit_vertex_input) reserves 3DSTATE_VERTEX_ELEMENTS and then
writes into that reserved memory. Any later anv_batch_emit() may
allocate a new batch and finalize the previous one, running the valgrind
defined-memory check over it.
Fill the draw-parameter and dynamic VERTEX_ELEMENT_STATE entries before
emitting 3DSTATE_VF_INSTANCING.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41790>
Coverity notices that there is an error case where
`nir_get_io_data_src_number` could return `-1`, and that is then used to
index into an array. Given that that is an exceptional case, we can just
assert here.
CID: 1681480
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40146>
On Xe2 and Xe3, the flushing is necessary due to aliasing of TGM data
in L1 memory (HSD 14020414266). On newer platforms, it is necessary
for proper post-format data conversion handling (HSD 22020984324).
See the Instruction_Fence page (63969) for documentation on the fact
that the threadgroup scope ignores flushes.
Thanks to Francisco Jerez and Kenneth Graunke on their help for this
patch.
v2: restrict the flushing to TGM (Lionel).
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40732>
Some values were wrong, so here adding the whole table with all fixed values.
Just to make easier to read and compare I have added all shader stages to
XEHP_URB_MIN_MAX_ENTRIES.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41789>
Right now this value is not use but it will in the next patch.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41789>
These formats are not supported natively on gfx20+. However, with a
driconf option enabled, we do create surfaces with these formats and use
them for transfer and decompression operations. Provide a CMF for these
formats to avoid hitting the unreachable in
isl_get_render_compression_format().
Fixes: 27d515772e ("intel/isl: Replace mc_format with aux_format")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15547
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41830>
As long as we round up the /alignments/ in RA, and pad to power-of-two when
calculating partitions (trivially true now, this informs future work though),
this is fine.
SIMD16:
Totals from 1001 (37.82% of 2647) affected shaders:
Instrs: 1897734 -> 1896157 (-0.08%); split: -0.25%, +0.16%
CodeSize: 28330256 -> 28315472 (-0.05%); split: -0.30%, +0.25%
Number of spill instructions: 1003 -> 999 (-0.40%)
Number of fill instructions: 990 -> 986 (-0.40%)
SIMD32:
Totals from 1230 (46.47% of 2647) affected shaders:
Instrs: 3284649 -> 3277437 (-0.22%); split: -1.18%, +0.96%
CodeSize: 48977696 -> 48907376 (-0.14%); split: -1.10%, +0.96%
Number of spill instructions: 41004 -> 40582 (-1.03%); split: -1.05%, +0.02%
Number of fill instructions: 39298 -> 38572 (-1.85%); split: -1.91%, +0.06%
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41808>
Jay's novel SSA-based register allocator relies on a fixed partition of Intel
GRFs mapping to logical GPRs.
Previously, Jay used a simple partitioning scheme, which was good enough for
simple compute and fragment shaders, but has both limitations preventing new
feature bring-up and performance issues.
Here we rewrite the Jay partitioning code at the heart of the Jay RA in
order to lift these restrictions and allow fully flexible partitions. This
should be easier to reason about, fix a bunch of issues around simd32 payloads,
enable better performance, etc.
The # of stride 16 GRFs reserved is halved in simd32 mode here to match how
multisampling stuff works, which explains the large simd32-only instruction
count reduction.
While churning all this code, I took the opportunity to break off
jay_partition.c... I think that is better organized and the diff was garbage
otherwise.
SIMD16:
Totals from 2189 (82.70% of 2647) affected shaders:
Instrs: 2702159 -> 2670951 (-1.15%); split: -1.41%, +0.26%
CodeSize: 40296128 -> 39850304 (-1.11%); split: -1.40%, +0.30%
SIMD32:
Totals from 2373 (89.65% of 2647) affected shaders:
Instrs: 4559418 -> 4072897 (-10.67%); split: -10.77%, +0.10%
CodeSize: 68185488 -> 60635616 (-11.07%); split: -11.17%, +0.09%
Number of spill instructions: 44069 -> 44055 (-0.03%)
Number of fill instructions: 43292 -> 43278 (-0.03%)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41808>