`rules:changes:compare_to` resolved firstly pushed branch pipelines,
which always evaluated `rules:changes` as true which breaks the workflow
Since we now explicitely say, that we compare against `main` repository,
GitLab can evaluate against real changes.
Fixes: 79f7882fc6 ("ci: add quirk for GitLab assuming changes is always true for scheduled runs")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24305>
A few titles show max live register reductions, but nothing
significant in instruction count or other stats.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24282>
This follows the same convention as shader object where the last stage
would have nextStage to 0. This will allow more refactoring.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24313>
Add helper functions vk_memory_to_image_copy_layout() and
vk_image_to_memory_copy_layout(), which will be useful in
VK_EXT_host_image_copy implementations.
vk_memory_to_image_copy_layout() is similar to
vk_image_buffer_copy_layout(), except the second parameter is
VkMemoryToImageCopyEXT instead of VkBufferImageCopy2.
vk_image_to_memory_copy_layout() is similar to
vk_image_buffer_copy_layout(), except the second parameter is
VkImageToMemoryCopyEXT instead of VkBufferImageCopy2.
Signed-off-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24290>
Calling anything after nir_trivialize_registers() risks undoing some of
its work.
In this case, brw_nir_adjust_payload() will do a constant folding pass
if any payload adjusting happened, and that can turn a bunch of
@store_regs into basically noops.
Fixes dEQP-VK.subgroups.*task
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24325>
This is a modified version of a commit originally in !7698. This version
add the changes to brw_fs_copy_propagation. If the address passed to
fs_visitor::swizzle_nir_scratch_addr is a constant, that function will
generate SHL with two constant sources.
DG2 uses a different path to generate those addresses, so the constant
folding can't occur there yet. That will be addressed in the next
commit.
What follows is the commit change history from that older MR.
v2: Previously this commit was after `intel/fs: Combine constants for
integer instructions too`. However, this commit can create invalid
instructions that are only cleaned up by `intel/fs: Combine constants
for integer instructions too`. That would potentially affect the
shader-db results of each commit, but I did not collect new data for
the reordering.
v3: Fix masking for W/UW and for Q/UQ types. Add an assertion for
!saturate. Both suggested by Ken. Also add an assertion that B/UB types
don't matically come back.
v4: Fix sources count. See also ed3c2f73db ("intel/fs: fixup sources
number from opt_algebraic").
v5: Fix typo in comment added in v3. Noticed by Marcin. Fix a typo in a
comment added when pulling this commit out of !7698. Noticed by Ken.
shader-db results:
DG2
No changes.
Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
total instructions in shared programs: 20655696 -> 20651648 (-0.02%)
instructions in affected programs: 23125 -> 19077 (-17.50%)
helped: 7 / HURT: 0
total cycles in shared programs: 858436639 -> 858407749 (<.01%)
cycles in affected programs: 8990532 -> 8961642 (-0.32%)
helped: 7 / HURT: 0
Broadwell and Haswell had similar results. (Broadwell shown)
total instructions in shared programs: 18500780 -> 18496630 (-0.02%)
instructions in affected programs: 24715 -> 20565 (-16.79%)
helped: 7 / HURT: 0
total cycles in shared programs: 946100660 -> 946087688 (<.01%)
cycles in affected programs: 5838252 -> 5825280 (-0.22%)
helped: 7 / HURT: 0
total spills in shared programs: 17588 -> 17572 (-0.09%)
spills in affected programs: 1206 -> 1190 (-1.33%)
helped: 2 / HURT: 0
total fills in shared programs: 25192 -> 25156 (-0.14%)
fills in affected programs: 156 -> 120 (-23.08%)
helped: 2 / HURT: 0
No shader-db changes on any older Intel platforms.
fossil-db results:
DG2
Totals:
Instrs: 197780415 -> 197780372 (-0.00%); split: -0.00%, +0.00%
Cycles: 14066412266 -> 14066410782 (-0.00%); split: -0.00%, +0.00%
Totals from 16 (0.00% of 668055) affected shaders:
Instrs: 16420 -> 16377 (-0.26%); split: -0.43%, +0.17%
Cycles: 220133 -> 218649 (-0.67%); split: -0.69%, +0.01%
Tiger Lake, Ice Lake and Skylake had similar results. (Ice Lake shown)
Totals:
Instrs: 153425977 -> 153423678 (-0.00%)
Cycles: 14747928947 -> 14747929547 (+0.00%); split: -0.00%, +0.00%
Subgroup size: 8535968 -> 8535976 (+0.00%)
Send messages: 7697606 -> 7697607 (+0.00%)
Scratch Memory Size: 4380672 -> 4381696 (+0.02%)
Totals from 6 (0.00% of 662749) affected shaders:
Instrs: 13893 -> 11594 (-16.55%)
Cycles: 5386074 -> 5386674 (+0.01%); split: -0.42%, +0.43%
Subgroup size: 80 -> 88 (+10.00%)
Send messages: 675 -> 676 (+0.15%)
Scratch Memory Size: 91136 -> 92160 (+1.12%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>
opt_copy_propagation can create invalid instructions like
shl(8) vgrf96:UD, 2d, 8u
These instructions will be cleaned up by opt_algebraic. The irony is
opt_algebraic converts these to simple mov instructions that
opt_copy_propagation should clean up. I don't think we want a loop like
do {
progress = false;
if (OPT(opt_copy_propagation)) {
OPT(opt_algebraic);
OPT(dead_code_eliminate);
}
} while (progress);
But maybe we do?
Maybe this would be sufficient:
while (OPT(opt_copy_propagation))
OPT(opt_algebraic);
OPT(dead_code_eliminate);
No shader-db or fossil-db changes (yet) on any Intel platform. This is
expected.
v2: Do opt_algebraic immediately after every call to
opt_copy_propagation instead of being clever. Suggested by Lionel.
Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>
The sequence was two threads A and B on a shared VkDevice:
A: move a BO to zombie VMA list
A: drop the BO VMA lock
B: prepare to allocate a BO
B: Lock BO VMA lock
B: call tu_free_zombie_vma_locked()
B: close the gem handle from the VMA list
B: Drop BO VMA lock
B: allocate a BO, getting the recently-closed handle back.
B: initialize the BO struct for the new handle.
A: memset the BO struct to 0.
Multithreading in C is the worst.
Closes: #9049, #9247
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24324>
This is required to support discrete GPUs placed in systems with large
PCI bar or resizeble PCI bar not available or disabled.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23781>
This is required to support discrete GPUs placed in systems with large
PCI bar or resizeble PCI bar not available or disabled.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23781>
This adds support for discrete GPUs placed in systems with large PCI
bar or resizeble PCI bar not available or disabled.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23781>
Just drop the store. Written while debugging
dEQP-VK.pipeline.monolithic.logic_op.r8_uint.no_op.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24252>
nir_const_value_for_int asserts signed bounds on the input, but we pass in an
unsigned value that would be out-of-bounds for 32-bit channels, causing the
assert to fail for 32-bit channel formats.
Fixes dEQP-VK.pipeline.monolithic.logic_op.r32_uint.* on AGXV (and probably
PanVK).
Fixes: dbd0615e7a ("nir/lower_blend: Avoid useless iand with logic ops")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24252>
Prevents regression from the series, since we don't support empty blend
shaders. This could be fixed more generically but I'm not inclined to compile
more blend shaders than needed so shrug.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24252>
Now that we're using load/store_reg intrinsics, the previous checks for
registers aren't what we want. Instead, we need to be looking for a mov
or vec where both the destination and a source are load/store_reg with
matching decl_reg.
Fixes: b8209d69ff ("intel/fs: Add support for new-style registers")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>
These are tracked the same way as register reads and writes, allowing
them to be re-arranged as long as they respect dependencies within the
same reg.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
This makes it so we never find a reg_decl in between a reg_store and the def
for its value, which helps avid inserting copy movs.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
Consider the snippet of NIR:
div 32 %447 = @load_reg (%442) (base=0, legacy_fabs=0, legacy_fneg=0)
div 32 %463 = @load_reg (%442) (base=0, legacy_fabs=0, legacy_fneg=0)
con 32 %409 = iadd %17 (0x3), %447
@store_output (%182 (0x601), %463) (base=0, wrmask=x, component=0, src_type=invalid...
@store_reg (%409, %442) (base=0, wrmask=x, legacy_fsat=0)
The load_reg's are trivial, so the %442 read will get folded into store_output.
But under the old definition, the store_reg is also trivial so it gets folded
into the iadd... causing a read-after-write hazard and invalid code generation.
The fix is to amend our definition of store_reg triviality to account for loads
getting folded in. It's not good enough that there's no intervening load_reg,
there can also be no intervening source that gets chased to a load_reg. Handle
that case as well.
Identified in dEQP-VK.geometry.input.basic_primitive.triangles_adjacency on
V3DV.
Fixes: d313eba94e ("nir: Add pass for trivializing register access")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reported-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
In order for a register load to be trivial, it cannot be used in any
block other than the one in which it is loaded. We're not currently
explicitly doing anything to ensure this invariant holds. It may be
that it holds regardless but I couldn't find any documented reason why
it should so let's explicitly handle that case. Worst case, the newly
added code does nothing.
Fixes: d313eba94e ("nir: Add pass for trivializing register access")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>