There was a subtle bug related to CFG tracking. Namely, some branch
instructions may point *only* to the block after the DO instruction
for the loop. If the MOV instructions are in the DO block, the may not
have liveness properly tracked.
Like in !25132, having the MOV instructions in blocks that might
contain other instructions helps scheduling.
shader-db:
All Broadwell and newer Intel GPUs had similar results (Ice Lake shown)
total cycles in shared programs: 848577248 -> 848557268 (<.01%)
cycles in affected programs: 78256396 -> 78236416 (-0.03%)
helped: 361 / HURT: 18
fossil-db:
All Skylake and newer Intel GPUs had similar results (Ice Lake shown)
Totals:
Cycles: 15021501924 -> 15021372904 (-0.00%); split: -0.00%, +0.00%
Totals from 735 (0.11% of 656080) affected shaders:
Cycles: 676429502 -> 676300482 (-0.02%); split: -0.02%, +0.00%
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26439>
nir_emit_global_atomic should utilize lsc_op_num_data_values to
infer the number of operands for global atomic ops, following the same
pattern as nir_emit_surface_atomic
Fixes: 90a2137 ('intel/compiler: Use LSC opcode enum rather than legacy BRW_AOPs')
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26432>
align is a function and when we want use it, the align variable will shadow it
So replace it with other names
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25997>
A previous patch already removed individual compilation of the inputs,
by simply concatenating the files. This patch removes the linking of
the remaining single object that's compiled.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26458>
A previous change concatenated multiple SPIR-V inputs to be
compiled together, so we have a single clc_binary to work on.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26458>
v2 (idr): Fix expectations BottomBreakWithContinue. opt_predicated_break
will remove the IF and make the CONTINUE predicated.
v3 (idr): Temporarily disable the one test that fails.
v4 (idr): Free strings allocated by open_memstream. Fixes gitlab CI
failures in debian-testing-asan.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
Add optional argument for both cfg and block dump() function to pass
a FILE*. Default behavior remains dumping to stderr.
v2 (idr): Don't add the new test framework in this commit.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
Imagine 3 blocks A, B, and C. A has a physical link to B, and B has a
logical link to C. Previous to this commit, if B were removed, A would
get a logical link to C. This is not correct.
This was specifically observed to occur when block A was a DO block and
B was the WHILE block. The DO block would have two logical successors,
and that is completely invalid.
v2: Assert that the links from A-to-B and B-back-to-A are the same
kind. Suggested by Caio.
v3: Assume the successor and predecessor lists are well formed. Use this
to simplify the logic. Suggested by Caio. Add checks to cfg_t::validate
to ensure the lists are well formed.
v4: Remove (now unused) bblock_link_invalid. Suggested by Curro.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
The previous is_successor_of and is_predecessor_of checks prevented
creating a physical link when a logical link already existed. However, a
logical link could be added when a physical link already existed. This
change causes an existing physical link to be "promoted" to a logical
link.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
Previously when earlier_block->children.make_empty() was called, the
child blocks would still have links back to earlier_block.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
v2: Use _mesa_shader_stage_to_abbrev(stage) instead of
stage_abbrev. Noticed by Caio and GCC. That's what I get for not
recompiling after rebasing. Wrap cfg_t::validate in NDEBUG
magic. Suggested by Caio.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
This matches the behavior of fs_visitor::nir_emit_if.
This is not technically wrong, but the cfg_t generates some invalid
parent / child links in this case.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
Per Google Test FAQ recommendation, prefer consutrctors and destructors
unless there's a need to use SetUp/TearDown.
We will take advantage of this later to initialize an fs_builder.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>
Once NIR code is lowered and a few optimization passes have run, there
might be flag register interactions between instructions quite far
away from one another.
In the following case :
f0 = and r0, r1
...
fs_interpolate r2, r3
...
if f0
...
endif
If we lower fs_inteporlate while using the f0 register, we completely
garble the value meant for the if block.
To fix this, emit the predication for fs_interpolate in brw_fs_nir.cpp
when doing the NIR translation to the backend IR. This will guarantee
that the flag register interactions are visible to the optimization
passes, avoiding the problem above.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 68027bd38e ("intel/fs: implement dynamic interpolation mode for dynamic persample shaders")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9757
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26306>
Those are not reused, so this will be the first and only allocation, so
no need to use the "realloc" variants.
For the fs_reg arrays, there's currently no particular reason to keep
them uninitialized, so zero-initialize them too -- not ideal but better
than random values.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26302>
We're currently updating the hitT value in the traversal result with
the hitT value from reportIntersection(), but this is not correct.
First the hitT value of reportIntersection() should update the
gl_RayTmaxEXT value (maps to brw_nir_rt_mem_ray_defs::t_far).
Second the hitT determined by traversal should only be updated if the
reportIntersection() hitT value has updated the gl_RayTmaxEXT and that
the new gl_RayTmaxEXT is smaller than the determined hitT value from
traversal.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Fixes: 303378e1dd ("intel/rt: Add lowering for combined intersection/any-hit shaders")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25146>
Avoids fixing up list pointers that we don't care about anymore -- since
all the instructions will be re-added in a different order anyway.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
Some fields in schedule_node will need to be reset each time they are
used. The `cand_generation` needs to be back to zero, and both
`unblocked_time` and `parent_count` need to be back to their initial
values, which were pre-calculated.
Rename the initial data fields and add new ones for the temporary data.
Note the helper function is `per node` to allow it "tag along" with an
existing loops.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
This will be useful later on when we reuse the same scheduler for
multiple modes.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
- Just assert in functions we expect it to exist
- Predicate usage with `!post_reg_alloc` to avoid suggest there are more
combinations.
- Reuse an existing loop to call the count function.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
Values are used together, saves one pointer in schedule_node,
reduces amount of reallocations when children count grows.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
Pull run() and schedule_instructions() for fs, and pull a very
simplified version of those into a run() for vec4. Because of the
previous patches the duplication is small.
Since we are touching these, change run() implementations to use the
cfg from the existing reference to the visitor/shader instead of taking
one as argument.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
The list was used for iterating through all instructions and then
later also to track the available ones. Now that the array iteration
is used, change how we fill it and rename it to reflect its only job.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
For all the preparation data collection before the scheduling
actually happens, it is possible to walk the schedule nodes
in order by iterating on the range of the array dedicated to
a given block.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>