This prevents assertion failures in brw_eu_emit in a later commit in
this MR. Even though they have not been previously observed, these
assertion failures could happen even without that commit.
No shader-db or fossil-db changes on any Intel platform.
Fixes: 04e1783278 ("brw: Call brw_fs_opt_algebraic less often")
v2: Add SHUFFLE. Suggested by Ken. Fixed indentation.
v3: Update BROADCAST exec_size after rebasing on "brw/build: Use SIMD8
temporaries in emit_uniformize".
v4: Explain why munging the exec_size is correct.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
The changes to try_copy_propagate will be removed later in the series.
v2: Fix up some comments to note that offset != 0 is allowed only when
stride == 0. Apply same offset=0 restriction in try_copy_propagate_def
too. Allow copy propagation if the source is either a def or
UNIFORM. Don't copy prop a load_reg through a non-def value.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
load_reg is something like load_payload except it has a single
source. It copies the entire source to the destination. Its purpose is
to convert a non-SSA VGRF into an SSA value. This copy is marked as
volatile so that it will act as a scheduling barrier.
v2: Fix some typos in the commit message. Eliminate the
brw_builder::LOAD_REG overload that returns a brw_inst*. This is
unlikely to ever be used. Add some checks to brw_validate. All
suggested by Caio.
v3: Force the source and destination types of the LOAD_REG to by
integer. This will (eventually) simplify the creating of unit tests for
the pass that adds LOAD_REG instructions.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
v2: Move to immediately before the main optimization loop. Most
importantly, this is after the first call to DCE.
fossil-db:
All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Non SSA regs after NIR: 237045283 -> 100183460 (-57.74%); split: -58.12%, +0.39%
Totals from 701423 (99.26% of 706657) affected shaders:
Non SSA regs after NIR: 236868848 -> 100007025 (-57.78%); split: -58.17%, +0.39%
Suggested-by: Ken
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
This was handled in other instances in a previous patch, but this
instance remains, as the zlib decompression routine is slightly
different.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34118>
There are three copies of this function, all of them have the same
memory leak in them. Instead of fixing them one by one, just use a
common implementation for all three, since they already all have a
shared helper lib.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34118>
Before:
30453 ./build/src/intel/genxml/gen125_pack.h
After:
17026 ./build/src/intel/genxml/gen125_pack.h
21589 ./build/src/intel/genxml/gen125_video_pack.h
The idea is to have fewer line to parse in each genX_*.c file.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34276>
Add a command line option `-d DEVICE` to select
a device either by index or by substring of the
name.
Use `-d list` to print available devices.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34267>
That way we can deal with upcoming non constant values for
VK_KHR_maintenance8.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>
The problem this fixes is currently hidden because of the order in
which we run nir_lower_tex & intel_nir_lower_texture. The issue is
that nir_lower_tex removes the LOD source in some cases and the second
run of nir_lower_tex can add it back.
This is also only needed on Gfx12.5+ if the LOD is present.
Finally move all of the texture lowering to the postprocess phase. No
need to run this multiple times.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>
Now that the brw_ip_ranges analysis is being used, there's no
need to track start_ip/end_ips in the blocks as they are mutate. And
also no need to call adjust_block_ips at the end of some passes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
Calculate the IP ranges of the shader as an analysis pass. This will
later replace the existing tracking of start_ip/end_ip as the blocks are
changed (and the defer/adjust scheme to avoid too much churn when that
happen).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
Stop using the start_ip / end_ip, these are not really important for
those tests. What the test care was the number of instructions in the
block to check for changes and ensure we can peek at them by index.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
It was used by the pass tests to verify output with TEST_DEBUG=1,
replace it with brw_print_instructions().
The output is slightly different (not printing IP, not reordering the
blocks), we can add those features as we need, but given the usage was
already very reduced, don't bother with that until need arises.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
This should probably be done different, params should probably be considered immutable,
and this should be moved into the command buffer, also this gets set on decode paths as well
which might not make sense.
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34204>