Commit graph

230 commits

Author SHA1 Message Date
Jordan Justen
6495fe3d37 intel/xe2+: Implement brw_wm_state_simd_width_for_ksp() on Xe2+.
The mechanism for selecting dispatch modes has changed from previous
platforms, add a new implementation brw_wm_state_simd_width_for_ksp()
using the new kernel dispatch controls.

[ Francisco Jerez: Split from a larger patch, handle multipolygon
                   dispatch, add additional comments. ]

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 14:12:59 -08:00
Francisco Jerez
5e0760a993 intel/fs/gfx12: Don't consider multipolygon PS to have packed dispatch.
This fixes a number of regressions and hangs in multipolygon fragment
shaders that have FIND_LIVE_CHANNEL sequences which would otherwise
lead to access of a dead channel.  Note that the failures don't seem
to be reproducible in simulation.

Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Francisco Jerez
1eff2fcb62 intel/compiler: Add polygon count statistic to brw_compile_stats.
And use it in ANV in order to return a "SIMDNxM" name from
vkGetPipelineExecutablePropertiesKHR.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:30 +00:00
Francisco Jerez
ccf9174655 intel/compiler: Add multipolygon dispatch fields to brw_wm_prog_data.
Add fields that track the number of polygons processed per PS SIMD
thread (note that this might be lower than the value that was
specified to the compiler via brw_compile_fs_params if compilation at
the desired polygon count wasn't possible), and the dispatch width of
the multi-polygon PS kernel.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:30 +00:00
Francisco Jerez
e7b1993376 intel/compiler: Add max_polygons FS compilation parameter.
Add a brw_compile_fs_params parameter that specifies to the compiler
the maximum number of polygons that may be processed in parallel per
PS SIMD thread.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:30 +00:00
Jordan Justen
f170995e66 anv, blorp, iris: Update 3DSTATE_PS programming for xe2
Rework:
 * Jordan: Move code into intel_update_ps_state()

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26600>
2023-12-18 15:41:30 +00:00
Iván Briano
d36da7c5f8 anv: track what kind of pipeline a fragment shader may be used with
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047>
2023-09-12 02:51:31 +00:00
Iván Briano
b200e5765c anv: use a simpler MUE layout for fast linked libraries
The compaction introduced in a252123363 ("intel/compiler/mesh: compactify MUE layout")
is not suitable for the case where graphics pipeline libraries are fast
linked, as the fragment shader won't receive the mue_map to know where
to locate its inputs.
For that case, keep doing what we did before and lay things down in the
order varyings are defined, which is also how it works for the non-mesh
case.

Fixes dEQP-VK.fragment_shading_rate.*fast_linked_library*.ms

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047>
2023-09-12 02:51:31 +00:00
Lionel Landwerlin
d74c301026 intel/compiler: disable per-sample interpolation modes with non-per-sample dispatch
Fixes hangs in dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 5644011f06 ("intel/compiler: Convert wm_prog_key::persample_interp to a tri-state")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24716>
2023-08-29 23:19:12 +00:00
Lionel Landwerlin
9934613c74 anv/hasvk: track robustness per pipeline stage
And split them into UBO and SSBO

v2 (Lionel):
 - Get rid of robustness fields in anv_shader_bin
v3 (Lionel):
 - Do not pass unused parameters around

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17545>
2023-08-09 09:00:12 +03:00
Jason Ekstrand
739e21fa9a intel/fs: Add a parameter to speed up register spilling
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24299>
2023-07-28 14:51:42 +00:00
Faith Ekstrand
079e8a9674 anv,hasvk,iris: sampler_prog_key::swizzles is only used on crocus
The field is no longer consumed by brw_complie_* and is instead handled
directly by the crocus driver.  Therefore, it's safe to leave it zero
and not even bother setting it.  This removes our reliance on the
SWIZZLE_* macros in prog_instructions.h.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24288>
2023-07-24 15:40:40 +00:00
Marcin Ślusarz
48885c7fe3 intel/compiler: load debug mesh compaction options once
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
2023-07-24 07:55:29 +00:00
Marcin Ślusarz
c1685f08dd intel/compiler,anv: put some vertex and primitive data in headers
Both per-primitive and per-vertex space is allocated in MUE in 8 dword
chunks and those 8-dword chunks (granularity of
3DSTATE_SBE_MESH.Per[Primitive|Vertex]URBEntryOutputReadLength)
are passed to fragment shaders as inputs (either non-interpolated
for per-primitive and flat vertex attributes or interpolated
for non-flat vertex attributes).

Some attributes have a special meaning and must be placed in separate
8/16-dword slot called Primitive Header or Vertex Header.

Primitive Header contains 4 such attributes (Cull Primitive,
ViewportIndex, RTAIndex, CPS), leaving 4 dwords (the rest of 8-dword
slot) potentially unused.

Vertex Header is similar - it starts with 3 unused dwords, 1 dword for
Point Size (but if we declare that shader doesn't produce Point Size
then we can reuse it), followed by 4 dwords for Position and optionally
8 dwords for clip distances.

This means we have an interesting optimization problem - we can put
some user attributes into holes in Primitive and Vertex Headers, which
may lead to smaller MUE size and potentially more mesh threads running
in parallel, but we have to be careful to use those holes only when
we need it, otherwise we could force HW to pass too much data to
fragment shader.

Example 1:
Let's assume that Primitive Header is enabled and user defined
12 dwords of per-primitive attributes.

Without packing we would consume 8 + ALIGN(12, 8) = 24 dwords of
MUE space and pass ALIGN(12, 8) = 16 dwords to fragment shader.

With packing, we'll consume 4 + 4 + ALIGN(12 - 4, 8) = 16 dwords of
MUE space and pass ALIGN(4, 8) + ALIGN(12 - 4, 8) = 16 dwords to
fragment shader.

16/16 is better than 24/16, so packing makes sense.

Example 2:
Now let's assume that Primitive Header is enabled and user defined
16 dwords of per-primitive attributes.

Without packing we would consume 8 + ALIGN(16, 8) = 24 dwords of
MUE space and pass ALIGN(16, 16) = 16 dwords to fragment shader.

With packing, we'll consume 4 + 4 + ALIGN(16 - 4, 8) = 24 dwords of
MUE space and pass ALIGN(4, 8) + ALIGN(16 - 4, 8) = 24 dwords to
fragment shader.

24/24 is worse than 24/16, so packing doesn't make sense.

This change doesn't affect vk_meshlet_cadscene in default configuration,
but it speeds it up by up to 25% with "-extraattributes N", where
N is some small value divisible by 2 (by default N == 1) and we
are bound by URB size.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
2023-07-24 07:55:29 +00:00
Marcin Ślusarz
a252123363 intel/compiler/mesh: compactify MUE layout
Instead of using 4 dwords for each output slot, use only the amount
of memory actually needed by each variable.

There are some complications from this "obvious" idea:
- flat and non-flat variables can't be merged into the same vec4 slot,
  because flat inputs mask has vec4 stride
- multi-slot variables can have different layout:
   float[N] requires N 1-dword slots, but
   i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot
- some output variables occur both in single-channel/component split
  and combined variants
- crossing vec4 boundary requires generating more writes, so avoiding them
  if possible is beneficial

This patch fixes some issues with arrays in per-vertex and per-primitive data
(func.mesh.ext.outputs.*.indirect_array.q0 in crucible)
and by reduction in single MUE size it allows spawning more threads at
the same time.

Note: this patch doesn't improve vk_meshlet_cadscene performance because
default layout is already optimal enough.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
2023-07-24 07:55:29 +00:00
Felix DeGrood
d04be9770b intel/compiler: use shader source hash in shader dump code
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
2023-07-20 09:08:08 +00:00
Lionel Landwerlin
3384f029be intel/compiler: rework input parameters
Use a struct for various common parameters rather than per stage
structure or arguments to stage specific entrypoints.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
2023-07-20 09:08:08 +00:00
Caio Oliveira
26f6ea5c30 intel/compiler: Remove unused functions and declarations
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23539>
2023-06-09 20:09:51 +00:00
Lionel Landwerlin
3f1ff326e0 anv: reduce push constant size for descriptor sets
Now that descriptor sets are located a in a 1Gb area, we can avoid
storing the whole address to the descriptor and add the base address
of the area to a 32bit offset.

Replay a bunch of fossils with this and changes not really significant
one way or another :

Totals:
Instrs: 9278246 -> 9277148 (-0.01%); split: -0.01%, +0.00%
Cycles: 3547598421 -> 3547579435 (-0.00%); split: -0.00%, +0.00%

Totals from 353 (1.14% of 31021) affected shaders:
Instrs: 581546 -> 580448 (-0.19%); split: -0.23%, +0.04%
Cycles: 25885422 -> 25866436 (-0.07%); split: -0.31%, +0.24%

No difference on send messages or spills/fills.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
2023-05-30 06:36:38 +00:00
Lionel Landwerlin
05089f305f intel/fs: enable bindless sampler state offsets
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
2023-05-30 06:36:37 +00:00
Lionel Landwerlin
6d6877bf99 intel/fs: enable extended bindless surface offset
Gives use 4Gb of bindless surface state on Gfx12.5+ instead of 64Mb.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
2023-05-30 06:36:37 +00:00
Lionel Landwerlin
429ef02f83 intel/fs: make tcs input_vertices dynamic
We need to do 3 things to accomplish this :

   1. make all the register access consider the maximal case when
      unknown at compile time

   2. move the clamping of load_per_vertex_input prior to lowering
      nir_intrinsic_load_patch_vertices_in (in the dynamic cases, the
      clamping will use the nir_intrinsic_load_patch_vertices_in to
      clamp), meaning clamping using derefs rather than lowered
      nir_intrinsic_load_per_vertex_input

   3. in the known cases, lower nir_intrinsic_load_patch_vertices_in
      in NIR (so that the clamped elements still be vectorized to the
      smallest number of URB read messages)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22378>
2023-05-24 18:32:07 +00:00
Lionel Landwerlin
b4b17f8aaa Revert "intel/compiler: make uses_pos_offset a tri-state"
This reverts commit 5489033fa8.

The problem I was trying to address is that we were programming the
3DSTATE_PS::PositionXYOffsetSelect bit differently with GPL (CENTROID)
than without (NONE).

I failed to understand that this bit also impacts the thread payload
layout. GPL fragment shaders don't know ahead of time if pos_offset is
going to be used. It'll be choosen at runtime base on push constant
bits. So we need to program this bit different just to have a payload
matching the compiled shader code.

This fixes the freedoom replay with GPL FS shader in SIMD32.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22938>
2023-05-11 08:01:46 +00:00
Lionel Landwerlin
5489033fa8 intel/compiler: make uses_pos_offset a tri-state
This value depends on the per-sample value which can be unknown at
compile time with graphics pipeline libraries. So we need to have this
dynamic has well and pick the right value when generating the
3DSTATE_PS/3DSTATE_WM packet.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d8dfd153c5 ("intel/fs: Make per-sample and coarse dispatch tri-state")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22728>
2023-05-03 10:03:57 +00:00
Lionel Landwerlin
2acc2f18ea intel/compiler: report max dispatch width statistic
Most tools looking at shader stats assume that there is only a single
resulting binary shader out of a single input. On Intel HW this is not
always the case. So having a statistic on each variant that reports
the maximum dispatch width helps showing improvement on a single
shader in terms of how large we manage to compile it.

For shaders that can be compiled in multiple SIMD width (like fragment
shaders), this will report the maximum dispatch width in the
statistics of each variants.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22014>
2023-03-21 11:53:04 +00:00
Lionel Landwerlin
09cdb77a92 intel/fs: report max register pressure in shader stats
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21756>
2023-03-08 13:37:07 +00:00
Marcin Ślusarz
465c241266 intel/compiler/mesh: use U888X packed index format
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20910>
2023-02-10 21:03:33 +00:00
Lionel Landwerlin
fd7debc8bb intel/fs: make alpha_to_coverage a tristate
That way in some cases we can do this dynamically.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:18 +00:00
Jason Ekstrand
f3969e2413 intel/fs: Rework dynamic coarse handling
Use 2 flags for PI & RT messages.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:18 +00:00
Jason Ekstrand
949b42c4dc intel/compiler: Convert wm_prog_key::multisample_fbo to a tri-state
This allows us to communicate to the back-end that we don't actually
know if the framebuffer is multisampled or not.  No drivers set anything
but ALWAYS/NEVER and we still have a few ALWAYS/NEVER assumptions but
those should be asserted.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:18 +00:00
Jason Ekstrand
5644011f06 intel/compiler: Convert wm_prog_key::persample_interp to a tri-state
This allows for the possibility that we may not know at compile time if
sample shading is enabled through the API.  While we're here, also
document exactly what this bit means so we don't confuse ourselves.

v2: Fixup coarse pixel values (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:18 +00:00
Jason Ekstrand
d8dfd153c5 intel/fs: Make per-sample and coarse dispatch tri-state
Whenever one of them is BRW_SOMETIMES, we depend on dynamic flag pushed
in as a push constant.  In this case, we have to often have to do the
calculation both ways and SEL the result.  It's a bit more code but
decouples MSAA from the shader key.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:18 +00:00
Jason Ekstrand
43ca7f4178 intel/compiler: Convert brw_wm_aa_enable to brw_sometimes
There are other cases where we want a tri-state logic like this.  May as
well have one enum for all the cases.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:17 +00:00
Jason Ekstrand
991d546102 intel/compiler: Document wm_prog_key::persample_interp
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:17 +00:00
Jordan Justen
78a75e0d25 intel/common/intel_genX_state.h: Add intel_set_ps_dispatch_state()
This replaces brw_fs_get_dispatch_enables(), which was added in
b9403b1c47 ("intel: factor out dispatch PS enabling logic"), but this
function will not work well for future changes to 3DSTATE_PS.

So, instead, this moves the related code into a "genX" file which can
directly update 3DSTATE_PS for the given platform.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20329>
2022-12-15 00:54:59 -08:00
Kenneth Graunke
8c2448d4e6 intel/compiler: Delete sampler key handling for planar format stuff
i965 used these, but Gallium drivers do this lowering via a separate
nir_lower_tex call from st/mesa.  Vulkan drivers don't use these at all.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>
2022-12-09 10:18:25 +00:00
Kenneth Graunke
88918baf5c intel/compiler: Delete key->msaa_16
None of the drivers have used this since we dropped i965, and BLORP
no longer uses it as of the previous commit.  We can also drop the
former compressed_multisample_tex_mask (now padding) field so that
things remain 64-bit aligned.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>
2022-12-09 10:18:25 +00:00
Kenneth Graunke
584e18863e intel: Drop compressed_multisample_layout_mask from the compiler keys
The compiler looks at this key field to determine whether to perform
an MCS fetch for a txf_ms or samples_identical texture message, if a
nir_tex_src_ms_mcs_intel source wasn't provided.  If it isn't set,
it instead uses constant 0 (nothing is compressed).

All of the drivers (iris, crocus, anv, hasvk) unconditionally set this
to ~0 because we don't want to pay for costly shader recompiles (which
can cause nasty stuttering).  Most textures are compressed anyway, and
the hardware ignores the l2dms MCS parameter if MCS is disabled.

The only user was BLORP, which sets the key field based on whether the
texture's aux usage has MCS.  But if it has MCS, it also does the MCS
fetch itself and supplies it directly.  Otherwise, it relies on the
compiler to fill in the 0 value.  But it could easily just provide the
0 value itself in that case and not rely on the compiler at all.

With that fixed, we can just drop the key fields entirely.  We leave
them as padding for now to avoid repacking structures; we won't need
to after the next commits anyway.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>
2022-12-09 10:18:25 +00:00
Lionel Landwerlin
d4cd33630a intel: add missing restriction on fragment simd dispatch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7755
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20169>
2022-12-06 00:37:50 +02:00
Lionel Landwerlin
b9403b1c47 intel: factor out dispatch PS enabling logic
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20169>
2022-12-06 00:37:47 +02:00
Iván Briano
d9747169b6 anv: support VK_PIPELINE_CREATE_RAY_TRACING_SKIP_*
VK_PIPELINE_CREATE_RAY_TRACING_SKIP_AABBS_BIT_KHR and
VK_PIPELINE_CREATE_RAY_TRACING_SKIP_TRIANGLES_BIT_KHR, when specified,
make TraceRay behave as if the corresponding shader flags were set, but
without affecting the value of IncomingRayFlags in shaders.

v2 (Lionel): Improve comments

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19152>
2022-10-20 00:03:55 +00:00
Jason Ekstrand
f1768f5640 intel/compiler: Store the number of position slots in the VUE map
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17602>
2022-08-31 02:00:18 +00:00
Caio Oliveira
a1b1fdf70d intel/compiler: Rename 8_PATCH to MULTI_PATCH
Make it clearer we are dealing with multiple patches,
works better in constrast with SINGLE_PATCH.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>
2022-08-24 00:39:57 +00:00
Kenneth Graunke
2cea0d6ef6 intel/compiler: Drop variable group size lowering
This backend lowering code has been dead since the removal of i965 -
nothing in the current source tree ever sets the flag.

This is handled by iris_setup_uniforms() and crocus_setup_uniforms().
Variable group size does not appear to be a feature in anv.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18055>
2022-08-18 16:17:03 +00:00
Lionel Landwerlin
03ab1d6aaa intel/compiler: document units of brw_ubo_range fields
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17908>
2022-08-05 11:51:31 +00:00
Lionel Landwerlin
9cb9390962 intel/fs: store num of resume shaders in prog_data
That way we can look at the SBT entries for debug purposes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17908>
2022-08-05 11:51:31 +00:00
Mark Janes
dc8df485e9 intel/compiler: reorder shader cache keys to minimize padding
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17749>
2022-07-29 20:45:25 +00:00
Mark Janes
a4a4aefa03 intel/compiler: pad all data structures used by shader cache keys
When the compiler pads a data structure, the padded bytes will not be
initialized.  Shader keys are compared with memcmp and unitialized
bytes within the structure breaks this mechanism.

Explicitly pad the structures with members, so the compiler is forced
to initialize them.  Add a warning to indicate if a change to
alignment in any of the data structures requires additional padding.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17749>
2022-07-29 20:45:25 +00:00
Lionel Landwerlin
2d1f021e16 intel/fs: Set NonPerspectiveBarycentricEnable when the interpolator needs it.
[anholt: changed to make all drivers do the right thing by moving the
payload barycentric check into the compiler]

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17381>
2022-07-19 01:25:47 +00:00
Jason Ekstrand
530de844ef intel,anv,iris,crocus: Drop subgroup size from the shader key
Use nir->info.subgroup_size instead.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17337>
2022-07-08 22:47:22 +00:00