Commit graph

11864 commits

Author SHA1 Message Date
Kenneth Graunke
84139470a5 intel/brw: Use VEC for emit_unzip()
Helps make SIMD-split code more SSA-friendly.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:54 -07:00
Kenneth Graunke
1b54b4fad5 intel/brw: Use VEC for NIR vec*() sources
This writes the whole destination register in a single builder call.
Eventually, VEC will write the whole destination register in one go,
allowing better visibility into how it is defined.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:50 -07:00
Kenneth Graunke
d4563747d9 intel/brw: Use VEC for output stores
This writes the whole destination register in a single builder call.
Eventually, VEC will write the whole destination register in one go,
allowing better visibility into how it is defined.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:49 -07:00
Kenneth Graunke
f0c29c9b71 intel/brw: Use VEC for FS outputs
This writes the whole destination register in a single builder call.
Eventually, VEC will write the whole destination register in one go,
allowing better visibility into how it is defined.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:49 -07:00
Kenneth Graunke
cbe7a13f2b intel/brw: Use VEC for TCS/TES/GS input/output loads
This writes the whole destination register in a single builder call.
Eventually, VEC will write the whole destination register in one go,
allowing better visibility into how it is defined.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:48 -07:00
Kenneth Graunke
a94e1bd0ac intel/brw: Use VEC for gl_FragCoord
This writes the whole destination register in a single builder call.
Eventually, VEC will write the whole destination register in one go,
allowing better visibility into how it is defined.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:47 -07:00
Kenneth Graunke
d0a24496fd intel/brw: Use VEC for load_const
This writes the whole destination register in a single builder call.
Eventually, VEC will write the whole destination register in one go,
allowing better visibility into how it is defined.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:45 -07:00
Kenneth Graunke
3c867bf2c7 intel/brw: Add a new VEC() helper.
This gathers a number of sources into a contiguous vector register.
Eventually, the plan is that it will use a MOV for a single source,
or LOAD_PAYLOAD for multiple sources.  For now, it emits a series of
MOVs to allow us to rewrite a bunch of existing code to use the new
helper, then change them all over at once later.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:42 -07:00
Kenneth Graunke
c194df565a intel/brw: Don't include unnecessary undefined values in texture results
When emitting a sampler message, we allocate a temporary destination
large enough to hold 4 values (or 5 for sparse).  This is the maximum
size needed to hold any result.  However, we shrink the size written by
the sampler message to skip writing any trailing components that NIR
tells us are never read.  So we may not write the entire temporary.

The NIR texture instruction has a destination VGRF which is sized
assuming that all components are present.  We issue a LOAD_PAYLOAD
instruction to copy our sampler result temporary to the NIR destination.
When we reduce the response length of the sampler messages, then some of
these temporary components have undefined values.  The correct way to
indicate that is by using a BAD_FILE source.  Unfortunately, we were
naively reading offsets of the temporary that were never written, but
are still part of a larger VGRF.  This complicates things.

For example, sampling and only using RGB (not RGBA) was producing this:

   txl_logical(8) (written: 3) vgrf3+0.0:F, ...
   undef(8) (written: 4) vgrf4:UD
   load_payload(8) (written: 4) vgrf4:F, vgrf3+0.0:F, vgrf3+1.0:F, vgrf3+2.0:F, vgrf3+3.0:F

The last source, vgrf3+3.0:F, is undefined, and should be BAD_FILE.
Doing so allows VGRF splitting and other optimizations to work better.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:41 -07:00
Kenneth Graunke
e42914529a intel/brw: Support CSE on more ops
This has no changes in shader-db or fossil-db, surprisingly, but at
least CSEL will be useful shortly.  Presumably the others may matter
somewhere.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:40 -07:00
Kenneth Graunke
ed3e4c16dc intel/brw: Do not create empty basic blocks when removing instructions
If there's only a single instruction in a basic block, then removing it
would create an empty block.  We seem to have trouble representing those
as there are no instructions with an IP inside the block; several places
mess up connections.  While most blocks end in control flow instructions
(which are rarely eliminated), ones preceding a DO instruction may end
in an ordinary instruction.  This makes such blocks tricky to merge with
adjacent blocks - they may be between loops.  Any optimization pass may
may find such an instruction and want to eliminate it, and most of them
are unprepared to perform such CFG link surgery.  Nor do we want to make
every pass aware of this issue.

To work around this, we simply replace an instruction with a NOP when
removing it from a block containing only that instruction, leaving the
block in place.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:39 -07:00
Kenneth Graunke
391da3610c intel/brw: Print W/UW immediates correctly
We were printing 24w as 0x180018d which not only scarily shows the
wrong type, but also the replicated format of the word.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:33 -07:00
Kenneth Graunke
674e89953f intel/brw: Use new builder helpers that allocate a VGRF destination
With the previous commit, we now have new builder helpers that will
allocate a temporary destination for us.  So we can eliminate a lot
of the temporary naming and declarations, and build up expressions.

In a number of cases here, the code was confusingly mixing D-type
addresses with UD-immediates, or expecting a UD destination.  But the
underlying values should always be positive anyway.  To accomodate the
type inference restriction that the base types much match, we switch
these over to be purely UD calculations.  It's cleaner to do so anyway.

Compared to the old code, this may in some cases allocate additional
temporary registers for subexpressions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>
2024-04-29 07:51:45 +00:00
Kenneth Graunke
4c2c49f7bc intel/brw: Add builder helpers that allocate temporary destinations
In many cases, we calculate an expression by generating a series of
instructions.  We'd either overwrite the same register repeatedly,
or call vgrf(BRW_TYPE_X) repeatedly to allocate temporaries for each
intermediate step.  In many cases, we overwrote the same register simply
because allocating and naming temporaries for each step was annoying.

This commit adds new builder helpers that will allocate a temporary
destination for you, using simple type interference: unary operations
use the source type, and binary operations require a matching base type
and return the largest of the two types.

The helpers return the destination register, allowing us to write in an
expression-tree style, chaining together builder operations to produce
whole values.  Sort of like nir_builder.  We still optionally will write
out the fs_inst pointer in case the caller wants to do things like set
predicates or saturation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>
2024-04-29 07:51:45 +00:00
Kenneth Graunke
319ba85e10 intel/brw: Add builder helpers for math functions
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>
2024-04-29 07:51:45 +00:00
Kenneth Graunke
cf8ed9925f intel/brw: Make a helper for finding the largest of two types
Some instructions can operate on mixed types.  Typically this is
something like a binary operation with UD and UW sources resulting
in a UD destination.  In order to make it easier to find the result
type of such operations, let's make a type helper that returns the
larger of the two types (but requires the base type to match).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>
2024-04-29 07:51:45 +00:00
Kenneth Graunke
f5473e6edd intel/brw: Don't use inst return value when it isn't needed
We just want to emit an instruction, but we don't need to do anything
further with it, so we don't need to store the resulting inst pointer
anywhere.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>
2024-04-29 07:51:45 +00:00
Matt Turner
2a417e3fc1 intel: Build float64 shader only for Vulkan
It's only used by anv and it requires glslang, which isn't otherwise
required for building iris.

Fixes: b52e25d3a8 ("anv: rewrite internal shaders using OpenCL")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28943>
2024-04-26 14:08:32 +00:00
Lionel Landwerlin
9926aedc96 anv: enable EDS3 AlphaToCoverageEnable & RasterizationSamples
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10647
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
ada806baa3 anv: remove fs_msaa_flags from the graphics pipeline
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
ddf31d2f40 anv: move 3DSTATE_MULTISAMPLE to partial emission
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
815d2e3e8b anv: move 3DSTATE_PS to partial packing
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
3a336a98e9 anv: move more PS_EXTRA programming to runtime
All the stuff related to fs_msaa_flags.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
355549e7b0 anv: move 3DSTATE_WM::BarycentricInterpolationMode programming to runtime
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
11b348a1c5 anv: add dirty tracking of fs_msaa_flags in runtime
At the moment this is useless as the pipeline already holds the same
value. But in the next changes we'll stop building this value on the
pipeline to allow for more dynamic states.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
25b57a6a75 anv: track sample shading enable & min sample shading
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
b80dd22d57 intel/brw: add min_sample_shading value in wm_prog_data
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
bdfa25dc77 intel/fs: decouple alphaToCoverage from per sample dispatch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:03 +00:00
Lionel Landwerlin
1bbe2d9833 intel/brw: fixup wm_prog_data_barycentric_modes()
Always select sample barycentric when persample dispatch is unknown at
compile time and let the payload adjustments feed the expected value
based on dispatch.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:02 +00:00
Lionel Landwerlin
48bf95ba96 anv: factor out wm_prog_data get in runtime flush
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:02 +00:00
Lionel Landwerlin
e302825fef anv: fixup indentation
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:02 +00:00
Lionel Landwerlin
2f0c2d2ed7 anv: simplify multisampling check
We've already checked that ms != NULL in the if condition.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27803>
2024-04-26 05:13:02 +00:00
Iván Briano
8ebf07eccd anv: check requirements for VK_IMAGE_USAGE_FRAGMENT_SHADING_RATE
Somehow I missed this one in 164c0951a0

If the format the image is being created with doesn't have the FSR
format feature, report it as unsupported.

Also fixes future CTS tests: dEQP-VK.api.info.unsupported_image_usage.*

Cc: mesa-stable

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28913>
2024-04-26 03:01:07 +00:00
Rohan Garg
b406759479 anv: formatting fix when printing pipe controls
Fixes: abc4111 ('anv: pass steam output as argument for anv_dump_pipe_bits')
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28931>
2024-04-25 21:38:30 +00:00
Lionel Landwerlin
4b0362637b anv: reuse embedded samplers across shaders
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10804
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28865>
2024-04-25 17:53:31 +00:00
Iván Briano
0fbaf8703a anv: enable VK_KHR_shader_float_controls2
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27281>
2024-04-25 12:13:41 +00:00
Kenneth Graunke
df6cfb4dd0 intel/brw: Rename brw_reg_type_to_hw_type to brw_type_encode
And similarly brw_hw_type_to_reg_type to brw_type_decode.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
9205f6ff51 intel/brw: Combine a1/a16 3src type decoding functions
Align16 is only used on Gfx9, while Align1 is used on Gfx11+.  We can
decode both kinds of encodings in the same function with a simple
devinfo check.  One snag is that the align16 encodings didn't have a
separate exec_type field, but we can just pass 0.

This lets us have a single function named brw_type_decode_for_3src,
which is much less of a mouthful.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
28034aac34 intel/brw: Combine a1/a16 3src type encoding functions
Align16 is only used on Gfx9, while Align1 is used on Gfx11+.  We can
handle both encodings in the same function with a simple devinfo check,
and give that function a simple name like brw_type_encode_for_3src.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
545bb8fb6f intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_*
Both of these helpers do the same thing.  We now have brw_type_size_bits
and brw_type_size_bytes and can use whichever makes sense in that place.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
c22f44ff07 intel/brw: Replace brw_reg_type_from_bit_size by brw_type_with_size
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
007d891239 intel/brw: Use newer brw_type_is_* shorter names
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
f523bfcf90 intel/brw: Reindent after shortening BRW_REGISTER_TYPE_* to BRW_TYPE_*
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
873fcdff38 intel/brw: Stop using long BRW_REGISTER_TYPE enum names
s/BRW_REGISTER_TYPE/BRW_TYPE/g

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
9d8f2c4421 intel/brw: Rework BRW_REGISTER_TYPE's representation semantics
In ancient days, we directly used the hardware register type encodings
throughout the compiler.  As more GPU generations came out, encodings
shifted, and we moved to an abstract enum that we could encode/decode
to a particular GPU's hardware encoding.  But there was no particular
meaning behind any particular value.

One downside to this approach is that we end up with switch statements
galore.  Want to know a type's size?  Switch.  Convert a unsigned type
to a signed one?  Switch.  Get a type with the same base type, but
different bit size?  Switch.  This is both inefficient and inconvenient.

In contrast, nir_alu_type takes a nicer approach - the type encoding has
certain bits representing the base type, and others encoding the size of
the type.  Switching base types or sizes is a simple matter of masking
out the relevant field and substituting a different one.

Tigerlake's encoding adopts a similar approach: two bits represent the
size as a 2-bit unsigned number n, where the bit size is (8 * 2^n).
Two more bits represent the base type.  Past encodings were a bit ad hoc
as new data types were added over time, but Gfx12 is organized (mostly).

This patch converts our brw_reg_type enum over to a new system that's
patterned after the Tigerlake style (for easy conversion) while
deviating in a few ways that make our vector immediate type size
handling simpler.  Should we add additional base types, we're likely
to continue deviating.  Still, converting is much simpler.

Type size calculations (which are performed all the time) are now a
simple mask and shift, instead of a switch.

We also adopt the name BRW_TYPE_* instead of BRW_REGISTER_TYPE_* because
it's much shorter and easier to type.  Similarly, we create new helper
functions named brw_type_* for working with these types, with a cleaner
naming convention.  Legacy names still exist but will we dropped over
the next few patches as pieces get cleaned up.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
c45e235df5 intel/brw: Drop NF type support
Icelake removed the PLN instruction for interpolating fragment shader
inputs, instead adding a special "Native Float" (NF) data type which
was a 66-bit floating point data type that could only be used with the
accumulator.  On Tigerlake, they dropped NF support in favor of just
doing the interpolation with MAD instructions.

We stopped using NF years ago (commit 9ea90aae1e),
instead just using the fs_visitor::lower_linterp() pass to emit MADs.

Since this existed only for a short time, and had very limited utility,
we drop it from the compiler.  One downside is that we can no longer
disassemble Icelake shaders containing NF types properly, but I doubt
anyone really minds.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
1c6f863fc7 intel/brw: Delete gfx10 table for align1 3src type encoding
align1 three-source instructions do not exist on gfx9, and this
compiler does not support gfx10.  So the oldest case is gfx11.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Lionel Landwerlin
68dfe17abc anv: disable dual source blending state if not used in shader
Fixing some simulation issues on Gfx9/11 with zink on anv running dual
source blending piglit tests like :

   ./bin/arb_blend_func_extended-dual-src-blending-discard-without-src1 -auto -fbo

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28901>
2024-04-25 09:03:30 +00:00
Kenneth Graunke
e6fb3ba037 isl: Set MOCS to uncached for Gfx12.0 blitter sources/destinations
We were accidentally leaving XY_BLOCK_COPY_BLT's Source and Destination
MOCS fields set to 0 (Error: Reserved for Non-Use) on Gfx12.0 systems.
This was causing assert fails in debug builds, since we try to ensure
that we don't do that.  In theory, MOCS 0 is supposed to be equivalent
to MOCS 2 (all the caching), but...we probably ought to use MOCS 3
(uncached).  Every Gfx12.5+ platform requires it, so although there
isn't a note about Gfx12.0 needing that, it's possible that it does.
We're currently only using the blitter for DRI PRIME blits on Gfx12.0,
anyway, and I think we're flushing all the caches regardless.

This bug was somewhat obscure to hit:
- You need a hybrid graphics system with Gfx12.0 and some other GPU
- You have to be using "reverse PRIME", i.e. rendering on the integrated
  GPU and displaying on the discrete one.  This is not the common case.
- You have to be using a debug build.

No observable performance delta in GfxBench5 Car Chase (an arbitrary
program) when rendering on Alderlake GT1 and displaying on an Arc A770.

Fixes: 194afe8416 ("anv/iris/blorp: use the right MOCS values for each engine")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28894>
2024-04-25 08:05:48 +00:00
Karol Herbst
d22f936019 nir: remove workgroup_id_zero_base
This removes the need for drivers to handle both versions. The base will
get added once in nir_lower_system_values when converting from deref to
intrinsic and will be replaced by a zero for users not supporting it.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800>
2024-04-24 20:18:49 +00:00