This consolidates several duplicated pieces of code into devinfo.
max_scratch_ids is an array that provides the max number of threads
for the rendering and compute stages.
This fixes some exceptions missed by crocus for scratch ids on haswell
and cherryview.
It also fills out devinfo->max_scratch_ids properly for stages VS
through CS on Gfx12.5. But, functionally this should not make a
difference as Gfx12.5 already uses COMPUTE for all stages.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12799>
By default, Mesa's X11 Vulkan WSI will wait for buffers to be ready
before submitting them to Xwayland when the swapchain is created
with the IMMEDIATE mode.
This is undesirable when the Wayland compositor already monitors
fences. A Wayland compositor may want to know the delay between
the buffer submition and the end of the GPU work, this is impossible
to measure if the WSI waits for the buffer to be ready before
submission.
Since most compositors don't monitor fences, let's introduce a driconf
option for this for now. We can reconsider once more compositors
have better support for fences.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11290>
With LSC support, we can do 64-bit atomics with A32/64 messages.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12566>
For now, only mark the 4x8BitPacked variants as accelerated.
Applications are unlikely to use the "add with saturate" opcodes from
VK_INTEL_shader_integer_functions2, so, technically, all of the
AccumulatingSaturating variants "[provide] a performance advantage over
user-provided code composed from elementary instructions..." on all
Intel platforms. If we encounter an application that cares, we can do
things differently then. Ditto for the non-packed 8Bit, 4-element
vector variants.
v2: Don't memset props as this also zeros sType and pNext. Noticed by
Georg Lehmann in !12617.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12624>
Previously, we misunderstood how conditional modifiers and saturate
interacted. We thought the condition was evaulated before the saturate
was applied. For the floating point cases, we went to some heroics to
modify the condition to maintain the same results. For integer cases,
it was not clear that this could even work. We had no use-cases and no
tests-cases, so we just disallowed everything.
Now we understand that the condition is evaluated after the saturate.
Earlier commits in this series removed the various floating point
heroics. It is easier to just delete the code that prevents some cases
that just work.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
Originally this was part of "intel/fs: Remove condition-based
restriction for cmod propagation to saturated operations". With some
additional changes to that commit, it caused a lot of extra churn in the
unit tests. I felt that made it harder to see the actual changes in the
unit tests, so I split it out.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
I don't know why the float_saturate_l_mov test was #if'ed out, but it
passes... so this commit enables it.
No shader-db or fossil-db changes.
In a previous iteration of this MR, this commit helped ~200 shaders in
shader-db. Now all of those same shaders are helped by "intel/fs: cmod
propagate from MOV with any condition". All of these shaders come from
Mad Max. After initial translation from NIR to assembly, these shader
contain patterns like:
mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
...
mov.sat(8) g90<1>F g90<8,8,1>F
...
cmp.nz.f0(8) null<1>F g90<8,8,1>F 0 /* 0F */
An initial pass of cmod propagation converts this to
mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
...
mov.sat.XX.f0(8) g90<1>F g90<8,8,1>F
Without this commit, XX is G. With this commit, XX is NZ. Saturate
propagation moves the saturate:
mul.sat(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
...
mov.XX.f0(8) g90<1>F g90<8,8,1>F
Without this commit (but with "intel/fs: cmod propagate from MOV with
any condition"), the G gets propagated:
mul.sat.g.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
With this commit (with or without "intel/fs: cmod propagate from MOV
with any condition"), the NZ gets propagated:
mul.sat.nz.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
There were tests related to propagating conditional modifiers from a MOV
to an instruction with a .SAT modifier for a very long time, but they
were #if'ed out.
There are restrictions later in the function that limit the kinds of MOV
instructions that can propagate. This avoids the dangers of
type-converting MOVs that may generate flags in different ways.
v2: Update the added comment to look more like the existing comment.
That makes the small differences between the two cases more obvious.
Noticed by Marcin.
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19827127 -> 19826924 (<.01%)
instructions in affected programs: 62024 -> 61821 (-0.33%)
helped: 201
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1
helped stats (rel) min: 0.13% max: 0.60% x̄: 0.35% x̃: 0.36%
95% mean confidence interval for instructions value: -1.02 -1.00
95% mean confidence interval for instructions %-change: -0.36% -0.34%
Instructions are helped.
total cycles in shared programs: 954655879 -> 954655356 (<.01%)
cycles in affected programs: 1212877 -> 1212354 (-0.04%)
helped: 155
HURT: 6
helped stats (abs) min: 1 max: 6 x̄: 3.65 x̃: 4
helped stats (rel) min: <.01% max: 0.17% x̄: 0.07% x̃: 0.07%
HURT stats (abs) min: 2 max: 12 x̄: 7.00 x̃: 8
HURT stats (rel) min: 0.04% max: 0.23% x̄: 0.14% x̃: 0.15%
95% mean confidence interval for cycles value: -3.60 -2.90
95% mean confidence interval for cycles %-change: -0.07% -0.05%
Cycles are helped.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
...and rename it to brw_reg_type_is_unsigned_integer. It is now next to
brw_reg_type_is_floating_point and brw_reg_type_is_integer.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
We were previously operating with the mindset "a MOV is just a compare
with zero." As a result, we were trying to share as much code between
the MOV path and the CMP path as possible. However, MOV instructions
can perform type conversions that affect the result of the comparison.
There was some code added to better handle this for cases like
and(16) g31<1>UD g20<8,8,1>UD g22<8,8,1>UD
mov.nz.f0(16) null<1>F g31<8,8,1>D
The flaw in these changed special cases is that it allowed things like
or(8) dest:D src0:D src1:D
mov.nz(8) null:D dest:F
Because both destinations were integer types, the propagation was
allowed. The source type of the MOV and the destination type of the OR
do not match, so type conversion rules have to be accounted for.
My solution was to just split the MOV and non-MOV paths with completely
separate checks. The "else" path in this commit is basically the old
code with the BRW_OPCODE_MOV special case removed.
The new MOV code further splits into "destination of scan_inst is float"
and "destination of scan_inst is integer" paths. For each case I
enumerate the rules that I belive apply. For the integer path, only the
"Z or NZ" rules are listed as only NZ is currently allowed (hence the
conditional_mod assertion in that path). A later commit relaxes this
and adds the rule.
The new rules slightly relax one of the previous rules. Previously the
sizes of the MOV destination and the MOV source had to be the same. In
some cases now the sizes can be different by the following conditions:
- Floating point to integer conversion are not allowed.
- If the conversion is integer to floating point, the size of the
floating point value does not matter as it will not affect the
comparison result.
- If the conversion is float to float, the size of the destination
must be greater than or equal to the size of the source.
- If the conversion is integer to integer, the size of the destination
must be greater than or equal to the size of the source.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
Of particular interest are the tests where the MOV performs a type
conversion. If the restriction on conditional modifier for a MOV is
ever relaxed, some of these cases must still be disallowed.
v2: s/NZ/Z/ in one of the comments. Notice by Marcin.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
This foreach_inst_in_block_reverse_starting_from loop only applies
CMP, MOV, and AND. AND instructions break out of the loop before this
point.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
This will simplify some later changes to these tests.
v2: Combine test_positive_saturate_prop and test_negative_saturate_prop
into a single function. Suggested by Marcin.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
Avoid using a sparse and relatively large array for HALIGN encoding.
Additionally, this provides validation of the input alignment values.
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
We're going to start asserting for valid halign/valign values during
surface state emission. Pre-SKL, isl_surf_get_uncompressed_surf creates
uncompressed surfaces with invalid halign/valign values - 1x1. Fix this
by replacing the call to isl_surf_get_image_surf with isl_surf_init,
passing in the uncompressed format up-front.
As we're no longer using isl_surf_get_image_surf, we also need to get
the x and y offset of the image ourselves. Instead of getting a
sample-based offset, then converting to elements later on, we use
isl_surf_get_image_offset_B_tile_el to get the offset in elements
up-front.
With the above two changes, the generic code after the else-block is no
longer needed for the single-layer-view code path. We move it and
specialize it to the if-block (which is executed SKL+ and handles
multi-layer views).
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
Stencil on Gfx7 has a vertical alignment element of 8, but the largest
its surface state can express is 4. Apply the Gfx6 solution of changing
the alignment in blorp_surf_retile_w_to_y.
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
On XeHP, NPOT and POT formatted surfaces will use different image
alignment units when emitting surface states. When BLORP fakes an RGB
image as RED, update the image alignment to prevent assert failures when
emitting surface states.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
Try to avoid having to update isl_gfx6_filter_tiling when new tilings
are added for new platforms. Note that the allow-list uses
ISL_TILING_ANY_Y_MASK and thus assumes that no new Y-tilings will be
added in the future.
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
In order to size Tile64 surfaces correctly, make sure that the total
physical extent is arrayed. The code should handle 3D surfaces as well,
but is untested for now.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
Implement the new XeHP alignment rules for surface layout.
RENDER_SURFACE_STATE objects still need updating, but that's left for a
separate commit.
Rework:
* Nanley: Include Sagar's VALIGN fix for D16_UNORM.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
XeHP defines new tiling formats, Tile4 and Tile64. They are needed in
order to support depth/stencil surfaces and multisampling. Create new
ISL enums and define some initial tiling information in order to enable
them later on.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
They are not used yet but the layout of Yf and Ys tiles are dependent on
these parameters. While we're here, better document the function.
Rework:
* Nanley: Update crocus.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
v2: Very significant rebase on changes to previous commits.
Specifically, brw_fs_nir.cpp changes were pretty much rewritten from
scratch after changing the NIR opcode names and types.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
Program the producers/consumer fields for TCS Barrier messages.
Producer and consumer fields are set to number of TCS threads.
Ref: Bspec 54006 for Barrier Data Payload
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>
Rearrange if/else fragments to unify case for Gen11 or later
platforms. This will help the code look cleaner for adding
unified barrier support to TCS.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>
This commit does several things:
* Unify code common to several drivers by evaluating INTEL_NO_HW within
intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
code simplification, but we may find reason to undo this later on.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
The prior method of checking the result of getenv() for NULL would cause
the feature to be enabled for INTEL_NO_HW=0.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
Implement the workarounds in anv and iris instead.
Before this commit, ISL unconditionally modified workaround registers
while filling out depth stencil state. To account for this, drivers
unconditionally stalled prior to emitting depth stencil packets. This
hurt performance.
By having the drivers perform the workarounds, they can choose when to
modify the relevant registers. The drivers now avoid emitting the
workaround for NULL depth buffers. This reduces stalls and leads to
better performance.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (the ISL/Anv bits)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (the Iris bits)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>