This is what LowerBoundedU32Array is doing internally, and having all
these extra cases is unlikely to be faster. This also gets rid of a bunch
of transmute() calls, which is always nice. It does mean a small copy in
the case of large SSARefs, but those should be uncommon.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41462>
This is a generalization of NAK's SSAValue and SSAValueArray structs.
But instead of depending on NAK's bespoke invariants, this depends on
something far simpler: A lower bound on the u32. As long as you can
guarantee that the maximum array length is strictly less than the
minimum U32 value, we can pull the same trick as NAK and generalize it
into a LowerBoundedU32Array type.
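The lower-bound trick described above can be sketched roughly as follows.
This is a minimal illustration, not the actual Mesa type: the name
BoundedArray, its const parameters MIN and N, and the choice of storing the
length in the last slot are all assumptions made for this example.

```rust
/// Hypothetical illustration (not the Mesa type): holds up to N u32
/// values, each required to be >= MIN, without a separate length field.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BoundedArray<const MIN: u32, const N: usize> {
    slots: [u32; N],
}

impl<const MIN: u32, const N: usize> BoundedArray<MIN, N> {
    /// Returns None if the invariants the trick relies on do not hold.
    pub fn new(values: &[u32]) -> Option<Self> {
        // The maximum length (N) must be strictly less than MIN, so a
        // stored length can never be mistaken for a valid element.
        if N == 0 || (N as u64) >= (MIN as u64) || values.len() > N {
            return None;
        }
        if values.iter().any(|&v| v < MIN) {
            return None;
        }
        let mut slots = [0u32; N];
        slots[..values.len()].copy_from_slice(values);
        if values.len() < N {
            // Not full: the last slot is free, so store the length there.
            slots[N - 1] = values.len() as u32;
        }
        Some(Self { slots })
    }

    /// A last slot >= MIN is a real element (array is full); otherwise
    /// it is the stored length.
    pub fn len(&self) -> usize {
        let last = self.slots[N - 1];
        if last >= MIN { N } else { last as usize }
    }

    pub fn as_slice(&self) -> &[u32] {
        &self.slots[..self.len()]
    }
}
```

With MIN = 16 and N = 4, for example, the whole array fits in 16 bytes and
the length costs no extra storage.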
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41462>
Now that every caller goes through brw_to_binary(), brw_generator has
a single user (brw_to_binary.cpp itself). Move the class definition
into that .cpp inside an anonymous namespace and delete the header,
so it can no longer leak into other translation units.
No functional change.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
This port tries to keep the same structure and still uses a lot of
brw_reg, converting types at the last moment. The idea is to make the
change easier to verify. A later patch will convert the code to use
gen_operand and other types earlier.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
It is convenient for us to have the new code in a different file (for
various validation tasks). So first make a copy of the current generator
to a new file that later will be updated to use the gen module.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Remove the compatibility handling from gen and update the expected
results for the relevant parser tests.
This was the last remaining quirk in the "compat" mode, so also remove the
INTEL_BRW_ASM_COMPAT env var plumbing.
Assisted-by: Pi coding agent (opus-4.7)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Remove the compatibility handling from gen and update the expected
results for the relevant parser tests.
Assisted-by: Pi coding agent (opus-4.7)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Remove the compatibility handling from gen and update the expected
results for the relevant parser tests.
Assisted-by: Pi coding agent (opus-4.7)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Convert test inputs to the new syntax while leaving the previous expected
files untouched.
This uses the previously added compatibility mode when running those
tests. A later patch will remove those quirks and adjust the expected
files accordingly.
Assisted-by: Pi coding agent (opus-4.7)
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
To convert tests to the new syntax while leaving the previous expected
files untouched, we are temporarily adding a few quirks of the old parser
to the gen module (behind an env variable).
The quirks:
- Small integer immediates encoding their value also in the upper half
of the 32-bit immediate field.
- Pre-Xe default `null` source region to <0;1,0>.
- Pre-Xe SEND normalizing destination stride from 0 to 1.
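As a rough illustration of the first quirk only, the sketch below replicates
a 16-bit immediate into the upper half of a 32-bit field. The function name,
the boolean flag, and the assumption that "small" means 16-bit are all
hypothetical; this is not the gen module's code.

```rust
/// Hypothetical sketch: encode a 16-bit immediate into a 32-bit field.
/// With `replicate_upper` set, the value is also copied into bits 31:16,
/// mimicking the old parser's behavior described above.
pub fn encode_imm16(value: u16, replicate_upper: bool) -> u32 {
    let low = u32::from(value);
    if replicate_upper {
        (low << 16) | low
    } else {
        low
    }
}
```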
None of those quirks should be needed, at least for the parser. A later
patch will remove those quirks and adjust the expected files accordingly.
This separation was done to make verification easier in the main patch
that ports the tests to a new syntax.
Assisted-by: Pi coding agent (opus-4.7)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Add support for assembly and disassembly with the new syntax,
a helper to show register regions in ASCII "diagrams", and
a cheat sheet for the new syntax.
Assisted-by: Pi coding agent (gpt-5.5, opus-4.6)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
This reverts commit 0cc89ca03a.
This allows using PCI IDs without having to set the INTEL_FORCE_PROBE
environment variable, which is useful for tests and tools. I'm assuming the
original user of it is gone or out-of-tree.
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Given an array of gen_insts representing a structured program,
fill in the missing JIPs and UIPs to follow that structure.
The input array must provide JIPs for the WHILE instructions (the
"back-edges", since there's no DO in Gfx9+). It can optionally
provide other JIPs or UIPs; their values will be used instead of
the calculated ones.
The input JIPs and UIPs are absolute index values in the array.
Once the pass finishes, they are converted into relative byte offsets,
which is what the hardware uses.
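The final conversion step (absolute index to relative byte offset) can be
sketched as below. This covers only that step, not the structured-program
analysis that fills in missing targets. The function name is hypothetical,
and the sketch assumes every instruction is uncompacted (16 bytes) and that
offsets are measured from the start of the jumping instruction.

```rust
/// Assumed size of one uncompacted instruction, in bytes.
const INST_SIZE_BYTES: i64 = 16;

/// Hypothetical sketch: `targets[i]` is the absolute index that
/// instruction `i` jumps to, or None if it has no jump target.
/// Returns the same targets as byte offsets relative to instruction `i`.
pub fn to_relative_byte_offsets(targets: &[Option<usize>]) -> Vec<Option<i64>> {
    targets
        .iter()
        .enumerate()
        .map(|(i, target)| {
            // Backward jumps (e.g. WHILE back-edges) yield negative offsets.
            target.map(|t| (t as i64 - i as i64) * INST_SIZE_BYTES)
        })
        .collect()
}
```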
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Port the validation rules from brw_eu_validate.cpp. This also
ports the tests of the validation, so we can check whether the
rules actually flag the cases.
Also include some new validation cases derived from asserts in
brw_eu encoding logic.
Assisted-by: Pi coding agent (gpt-5.5, opus-4.6)
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
Add a new module that can produce the binary encoded representation of
the instructions. Some key differences from existing encoding logic
in brw:
- Use a struct to represent the instructions before final encoding.
This is similar to the struct we already use in validation. This
allows generator/validation code to ignore details of instruction
formatting and just "set src0 to something".
- Split the encoding logic between Pre-Xe (Gfx9 and Gfx11) and Xe (from
Gfx12 and up). They are documented differently, so splitting makes
both sides easier to deal with.
- Try to follow the bit range numbers as they are documented in the
spec, programmatically shifting them when needed. This means numbers
in code match PRMs / BSpec.
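One way to keep documented bit ranges visible in code is a pair of
"high:low" accessors over the 128-bit instruction word, plus a variant that
shifts a range by a base offset. The sketch below is only an illustration
under that assumption; set_bits, get_bits and set_bits_at are hypothetical
names, not the module's API.

```rust
/// Mask with the low `width` bits set (callers ensure 1 <= width <= 64).
fn mask(width: u32) -> u128 {
    (1u128 << width) - 1
}

/// Write `value` into bits high:low (inclusive), as a spec would number them.
pub fn set_bits(inst: &mut u128, high: u32, low: u32, value: u64) {
    assert!(low <= high && high < 128, "invalid bit range");
    let width = high - low + 1;
    assert!(width <= 64, "field wider than 64 bits");
    let m = mask(width);
    assert!(u128::from(value) <= m, "value does not fit in field");
    // Clear the field, then OR in the new value.
    *inst = (*inst & !(m << low)) | (u128::from(value) << low);
}

/// Read bits high:low (inclusive).
pub fn get_bits(inst: u128, high: u32, low: u32) -> u64 {
    assert!(low <= high && high < 128, "invalid bit range");
    let width = high - low + 1;
    assert!(width <= 64, "field wider than 64 bits");
    ((inst >> low) & mask(width)) as u64
}

/// Same as set_bits, but the range is given relative to `base`, for
/// sub-fields whose documentation numbers bits from a local origin.
pub fn set_bits_at(inst: &mut u128, base: u32, high: u32, low: u32, value: u64) {
    set_bits(inst, base + high, base + low, value);
}
```

A call such as `set_bits(&mut inst, 6, 0, 0x40)` then reads like a "6:0"
entry in a field table; the field positions used here are arbitrary
examples.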
Later patches will add compaction and make use of the module in various
parts of the code.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41413>
ends_program calls into nir_cf_node_get_function repeatedly to fetch the
same function and to check whether we are inside an entry point or not.
But we already have that information higher up the call chain, so use
that instead.
nir_cf_node_get_function is quite expensive, because it follows pointers
through the tree.
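The shape of the optimization can be sketched with a toy tree. Nothing here
is NIR API: CfNode, Ctx and both functions are hypothetical stand-ins that
only contrast "walk the parent pointers on every query" with "compute once
and pass the answer down".

```rust
/// Toy control-flow node: the root stands in for the function.
pub struct CfNode {
    pub parent: Option<usize>,
    /// Only meaningful on the root node.
    pub is_entrypoint: bool,
}

/// Slow path: follow parent links up to the root on every call.
/// The cost grows with the depth of the node.
pub fn in_entrypoint_by_walk(nodes: &[CfNode], mut idx: usize) -> bool {
    while let Some(parent) = nodes[idx].parent {
        idx = parent;
    }
    nodes[idx].is_entrypoint
}

/// Information computed once by the caller and handed down.
pub struct Ctx {
    pub in_entrypoint: bool,
}

/// Fast path: no tree walk, just use what the caller already knows.
/// The condition itself is a placeholder, not the real ends_program logic.
pub fn ends_program_sketch(ctx: &Ctx, is_last_block: bool) -> bool {
    ctx.in_entrypoint && is_last_block
}
```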
This speeds up compilation of more complex shaders by quite a bit. I am
seeing a 66% reduction in the compilation time spent in e.g. llama-bench.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41891>
We're required to support this extension for Android VP17.
We've tried supporting it through the use of
CMF_DISABLE_WRITE_COMPRESSION, but some regressions were measured
(-0.5~-1.0%).
We're not aware that using CMF_DISABLE_WRITE_COMPRESSION would prevent
any application bug, so it doesn't feel useful to implement.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41187>