Commit graph

130 commits

Author SHA1 Message Date
Francisco Jerez
6dae56cc57 intel/fs/xe2+: Fix for new layout of X/Y pixel coordinates in PS payload.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 11:07:03 -08:00
Francisco Jerez
3f92dde55e intel/fs/xe2+: Stop building SIMD8 shaders for geometry stages (VS/TCS/TES/GS).
They are no longer suppored by the fixed-function hardware.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26605>
2023-12-22 10:37:00 -08:00
Francisco Jerez
4672fcbc76 intel/fs: Fix PS thread payload setup for depth_w_coef_reg.
It's not replicated per SIMD16 half of a SIMD32 thread on the PS
payload.  Make fs_visitor::payload::depth_w_coef_reg a scalar rather
than an array.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Francisco Jerez
b62ad4e028 intel/fs: Rework layout of FS vertex setup data in ATTR file to support multi-polygon dispatch.
The updated layout includes one copy of each plane parameter per
channel of the SIMD thread, in order to allow channels to process
different polygons.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Francisco Jerez
83a0252e8d intel/fs: Pass builder to per_primitive_reg().
Matches prototype of interp_reg(), will be useful in a subsequent commit.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:30 +00:00
Francisco Jerez
8e9f09dbe5 intel/fs: Provide component index explicitly to interp_reg().
Main motivation is that for multipolygon PS shaders the i-th plane
parameter for the j-th input attribute will no longer necessarily be a
scalar, since different channels may be processing different polygons
with different input plane parameters, so simply taking a component()
of the result of interp_reg() will no longer work.  Instead of
duplicating the multipolygon handling logic in every caller of
interp_reg(), fold the component() call into interp_reg() so we can
replace it with multipolygon-correct code more easily.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:30 +00:00
Francisco Jerez
e4aca2ebaa intel/fs: Add separate constructor of fs_visitor for fragment shaders.
To allow specifying the number of polygons that will be processed per
SIMD thread.

Rework:
 * Jordan: Add needs_register_pressure following
   09cdb77a92 ("intel/fs: report max register pressure in shader stats")

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:30 +00:00
Caio Oliveira
bfc953add7 intel/compiler: Use C helpers to access builtin types
Remove usage of C++ static members as they are going to be removed.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26658>
2023-12-15 03:09:19 +00:00
Caio Oliveira
a8b2426419 intel/compiler: Use reference instead of pointer for fs_visitor
Per Ian suggestion.  Also clear up a few unnecessary casts around the code and
use `s` for fs_visitor ("shader").  Note to include a reference in ntf we need
to set it during initialization, so create an explicit mem_ctx for it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:14 +00:00
Caio Oliveira
38a42e5aa1 intel/compiler: Add ctor to fs_builder that just takes the shader
Uses the dispatch_width from the shader (fs_visitor).  This was not
possible before because the dispatch_width was not part of
backend_shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:14 +00:00
Caio Oliveira
cf730adc58 intel/compiler: Make fs_builder include fs_visitor and not the other way
This will allow fs_builder have a reference to an fs_visitor (a
"fs_shader" really), instead of a reference to a backend_shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:14 +00:00
Caio Oliveira
4f991dec00 intel/compiler: Remove fs_visitor::bld
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:14 +00:00
Caio Oliveira
5b8ec015f2 intel/compiler: Don't use fs_visitor::bld in remaining places
The remaining users can simply create a new builder at_end() if needed.
In many places a new builder object is already being constructed, so
just give more specific instructions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:14 +00:00
Caio Oliveira
79735fa783 intel/compiler: Move remaining NIR conversion fields to nir_to_brw_state
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:13 +00:00
Caio Oliveira
5cb189636d intel/compiler: Move nir_ssa_value into a local structure
Create a nir_to_brw_state struct that is valid only during the
NIR to backend translation and use it for nir_ssa_values array.

This removes some NIR specific handling out of the fs_visitor -- nowadays
effectively an fs_shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:13 +00:00
Caio Oliveira
2385d6087a intel/compiler: Make setup functions of NIR emission static
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:13 +00:00
Caio Oliveira
c12460b01e intel/compiler: Move NIR emission code to brw_fs_nir.cpp
This is a preparation to reorganize NIR emission code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:13 +00:00
Lionel Landwerlin
e22e88f8ce intel/fs: reuse set_predicate()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26306>
2023-11-28 13:40:07 +00:00
Kenneth Graunke
b35f1fc910 intel/compiler: Delete unused emit_dummy_fs()
This code is compiled out, but has been left in place in case we wanted
to use it for debugging something.  In the olden days, we'd use it for
platform enabling.  I can't think of the last time we did that, though.

I also used to use it for debugging.  If something was misrendering, I'd
iterate through shaders 0..N, replacing them with "draw hot pink" until
whatever shader was drawing the bad stuff was brightly illuminated.
Once it was identified, I'd start investigating that shader.

These days, we have frameretrace and renderdoc which are like, actual
tools that let you highlight draws and replace shaders.  So we don't
need to resort iterative driver hacks anymore.  Again, I can't think of
the last time I actually did that.

So, this code is basically just dead.  And it's using legacy MRF paths,
which we could update...or we could just delete it.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20172>
2023-10-30 23:03:23 +00:00
Francisco Jerez
53d1d793cb intel/fs: Delete manual 'inst->mlen' calculations from all uses of logical URB writes.
Rework:
 * Marcin: update emit_urb_indirect_vec4_write

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25195>
2023-09-27 23:57:25 +00:00
Francisco Jerez
34a2c9ce35 intel/fs: Specify number of data components of logical URB writes via control immediate.
This is what most logical SEND messages do when they take a variable
number of components.  'inst->mlen' is expected to be zero for logical
SEND opcodes, which are expected to behave like plain arithmetic
operations, so certain automated transformations (like SIMD lowering)
can manipulate them without opcode-specific special-casing.

Guessing the number of components from 'inst->mlen' has other
disadvantages, because it requires duplicating the logic that infers
the message payload size in every use of the instruction -- Instead we
can just do the computation once during logical send lowering.  In
addition on LNL platform this causes the 'inst->mlen' field of URB
writes to have units inconsistent with every other SEND instruction,
which is likely to lead to confusion and bugs down the road.

Rework:
 * Marcin: update emit_urb_indirect_vec4_write

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25195>
2023-09-27 23:57:25 +00:00
Ian Romanick
623465e26d intel/compiler/xe2: Update fs_visitor::emit_urb_writes to not assume SIMD8
v2: Account for 512b physical registers which causes the URB handle to be in FIXED_GFR 2 instead of 1.

XXX - Use fs_builder::vgrf() instead of open-coded dispatch_width calculations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25195>
2023-09-27 23:57:25 +00:00
Lionel Landwerlin
d33aff783d intel/fs: add support for sparse accesses
Purely from the backend point of view it's just an additional
parameter to sampler messages.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>
2023-07-27 02:02:30 +03:00
Marcin Ślusarz
a252123363 intel/compiler/mesh: compactify MUE layout
Instead of using 4 dwords for each output slot, use only the amount
of memory actually needed by each variable.

There are some complications from this "obvious" idea:
- flat and non-flat variables can't be merged into the same vec4 slot,
  because flat inputs mask has vec4 stride
- multi-slot variables can have different layout:
   float[N] requires N 1-dword slots, but
   i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot
- some output variables occur both in single-channel/component split
  and combined variants
- crossing vec4 boundary requires generating more writes, so avoiding them
  if possible is beneficial

This patch fixes some issues with arrays in per-vertex and per-primitive data
(func.mesh.ext.outputs.*.indirect_array.q0 in crucible)
and by reduction in single MUE size it allows spawning more threads at
the same time.

Note: this patch doesn't improve vk_meshlet_cadscene performance because
default layout is already optimal enough.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
2023-07-24 07:55:29 +00:00
Lionel Landwerlin
3384f029be intel/compiler: rework input parameters
Use a struct for various common parameters rather than per stage
structure or arguments to stage specific entrypoints.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
2023-07-20 09:08:08 +00:00
Lionel Landwerlin
46958bcb74 intel/fs: fix missing predicate on SEL instruction
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d8dfd153c5 ("intel/fs: Make per-sample and coarse dispatch tri-state")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9381
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24236>
2023-07-19 21:57:25 +00:00
Faith Ekstrand
39b5bb0809 intel/fs: Drop support for nir_register
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>
2023-07-19 02:11:57 +00:00
Rohan Garg
ef2b763d9c anv: fix incorrect asserts when combining CPS and per sample interpolation
CPS is dynamically turned off when per sample interpolation is active.
Update the asserts to reflect this.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 5644011f06 ("intel/compiler: Convert wm_prog_key::persample_interp to a tri-state")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23103>
2023-05-31 19:26:59 +00:00
Lionel Landwerlin
04777171e0 intel/fs: try to rematerialize surface computation code
This helps a lot with accessing surface handles in control flow. Our
resource_intel intrinsic has a non_uniform flag, in which case we
cannot apply this optimization. But in uniform cases, this is just a
massive win. We drop all kind of pipeline stalls due to
find_live_channel. We also reduce register pressure by doing the
surface handle computation in a single GRF (instead of 2 or 4).

There are some regressions in max dispatch width but those I think are
only on SIMD32 and due to the current heuristic disabling it after
throughput comparison with SIMD16. We know this heuristic is not
perfect, it should probably be updated in another change.

Here are some stats (all titles seem to have similar gains) :

 PERCENTAGE DELTAS    Shaders   Instrs    Cycles  Subgroup size Send messages Spill count Fill count Scratch Memory Size Max live registers Max dispatch width
 red_dead_redemption2 5860     -36.80%    -5.67%      +0.77%        +0.06%      -81.26%     -79.16%        -70.62%             -8.63%             -6.93%
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------
 All affected         4716     -37.29%    -5.67%      +0.95%        +0.07%      -81.26%     -79.16%        -70.62%             -9.15%             -8.47%
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Total                5860     -36.80%    -5.67%      +0.77%        +0.06%      -81.26%     -79.16%        -70.62%             -8.63%             -6.93%

 PERCENTAGE DELTAS          Shaders   Instrs    Cycles  Subgroup size Send messages Spill count Fill count Scratch Memory Size Max live registers Max dispatch width
 rise_of_the_tomb_raider_g2 12010    -37.19%   -22.12%      +0.01%        +0.00%      -99.01%     -99.14%        -98.65%             -7.62%             -4.96%
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 All affected               11732    -37.27%   -22.14%      +0.01%        +0.00%      -99.01%     -99.14%        -98.65%             -7.67%             -5.11%
 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Total                      12010    -37.19%   -22.12%      +0.01%        +0.00%      -99.01%     -99.14%        -98.65%             -7.62%             -4.96%

 PERCENTAGE DELTAS    Shaders   Instrs    Cycles  Spill count Fill count Scratch Memory Size Max live registers Max dispatch width
 total_war_warhammer2 462      -27.45%   -12.42%    -82.35%     -88.46%        -66.67%             -5.52%             -5.62%
 -----------------------------------------------------------------------------------------------------------------------------------
 All affected         335      -28.31%   -12.77%    -82.35%     -88.46%        -66.67%             -6.25%             -7.24%
 -----------------------------------------------------------------------------------------------------------------------------------
 Total                462      -27.45%   -12.42%    -82.35%     -88.46%        -66.67%             -5.52%             -5.62%

 PERCENTAGE DELTAS Shaders   Instrs    Cycles  Subgroup size Send messages Spill count Fill count Scratch Memory Size Max live registers Max dispatch width
 witcher_3_dxvk_g2 1049     -36.94%   -57.82%      +0.06%        +0.01%      -98.52%     -97.29%        -98.10%             -7.81%             -1.00%
 ------------------------------------------------------------------------------------------------------------------------------------------------------------
 All affected      693      -41.93%   -58.45%      +0.09%        +0.01%      -98.52%     -97.29%        -98.10%             -10.25%            -1.33%
 ------------------------------------------------------------------------------------------------------------------------------------------------------------
 Total             1049     -36.94%   -57.82%      +0.06%        +0.01%      -98.52%     -97.29%        -98.10%             -7.81%             -1.00%

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
2023-05-30 06:36:37 +00:00
Lionel Landwerlin
3d0cc3f63b intel/fs: keep track of new resource_intel information
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
2023-05-30 06:36:37 +00:00
Lionel Landwerlin
56474fae93 intel/fs: fix subgroup invocation read bounds checking
nir->info.subgroup_size can be set to an enum :
  SUBGROUP_SIZE_VARYING = 0
  SUBGROUP_SIZE_UNIFORM = 1
  SUBGROUP_SIZE_API_CONSTANT = 2
  SUBGROUP_SIZE_FULL_SUBGROUPS = 3

So compute the API subgroup size value and compare it to the dispatch
size to determine whether we need some bound checking.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9ac192d79d ("intel/fs: bound subgroup invocation read to dispatch size")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21856>
2023-03-14 12:15:48 +00:00
Lionel Landwerlin
09cdb77a92 intel/fs: report max register pressure in shader stats
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21756>
2023-03-08 13:37:07 +00:00
Tapani Pälli
207eb94445 intel/compiler: add comment about workaround on simd width
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21619>
2023-03-02 14:06:36 +00:00
Jason Ekstrand
f3969e2413 intel/fs: Rework dynamic coarse handling
Use 2 flags for PI & RT messages.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:18 +00:00
Jason Ekstrand
964b878986 intel/fs: Break out yet another FB write helper
This new helper, do_emit_fb_writes() does the actual walk over all the
render targets to emit each of the different FB writes.  We want this in
a helper because we're about to go a bit crazy with coarse.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:18 +00:00
Jason Ekstrand
5644011f06 intel/compiler: Convert wm_prog_key::persample_interp to a tri-state
This allows for the possibility that we may not know at compile time if
sample shading is enabled through the API.  While we're here, also
document exactly what this bit means so we don't confuse ourselves.

v2: Fixup coarse pixel values (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:18 +00:00
Jason Ekstrand
d8dfd153c5 intel/fs: Make per-sample and coarse dispatch tri-state
Whenever one of them is BRW_SOMETIMES, we depend on dynamic flag pushed
in as a push constant.  In this case, we have to often have to do the
calculation both ways and SEL the result.  It's a bit more code but
decouples MSAA from the shader key.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>
2023-02-06 09:12:18 +00:00
Kenneth Graunke
bafbe7c23a intel/compiler: Set NoMask on cr0 access for float controls mode
This is trying to clear a bit in the control register.  However, it's
executing with whatever channel mask happens to be active.  Typically
this is the one at the start of the program, so at least some channels
will be active.  Typically the first channel will be active due to
packed dispatch, but that's not always guaranteed.  Without NoMask,
the float controls writes may randomly not happen.

Recent GPUs also seem to have a hang issue when the first instruction in
the shader doesn't have any active channels.  Having an instruction with
NoMask at the start of the program works around the issue.  See HSD bug
14017989577.  In our case, the float controls preamble was breaking that
restriction every time, causing us to run into this problem frequently.

Thanks to Tapani Pälli for finding this hang issue, and Francisco
Jerez and Lionel Landwerlin for helping pinpoint this issue during
review of a workaround patch in !20194.

Fixes GPU hangs in Elder Scrolls Online, Witcher 3, and likely more.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7639
Fixes: 9da56ffc52 ("i965/fs: add emit_shader_float_controls_execution_mode() and aux functions")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20214>
2022-12-08 09:54:09 +00:00
Caio Oliveira
3272868218 intel/compiler: Make thread_payload struct abstract
Each shader stage has its own struct and will instantiate it, so the
base class doesn't need to be instantiated anymore.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>
2022-09-13 01:44:24 +00:00
Caio Oliveira
5b6987daee intel/compiler: Create and use struct for GS thread payload
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>
2022-09-13 01:44:24 +00:00
Caio Oliveira
0ca65b3c4c intel/compiler: Create and use struct for VS thread payload
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>
2022-09-13 01:44:24 +00:00
Caio Oliveira
19c6e1b447 intel/compiler: Create and use struct for TES thread payload
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>
2022-09-13 01:44:24 +00:00
Caio Oliveira
73920b7e2f intel/compiler: Use FS thread payload only for FS
Move the setup into the FS thread payload constructor.  Consolidate
payload setup for that in brw_fs_thread_payload.cpp file.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>
2022-09-13 01:44:24 +00:00
Tapani Pälli
40c2e0a317 intel/compiler: fix assert from ver to verx10
Fixes: 027b8b4249 ("intel/compiler: Add helper for barrier message payload setup for gfx >= 125")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18546>
2022-09-12 19:03:17 +00:00
Caio Oliveira
027b8b4249 intel/compiler: Add helper for barrier message payload setup for gfx >= 125
CS-like and TCS control barriers converged in gfx >= 125, so use a
common helper for the message payload setup.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18362>
2022-09-09 09:35:08 -07:00
Caio Oliveira
55db3aaa3a intel/compiler: Create fs_visitor::emit_tcs_barrier()
Allow us to implement this in brw_fs_visitor.cpp, which then will
let us deduplicate code between the CS-like barrier and the TCS
barrier in a later patch.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18362>
2022-09-09 09:35:08 -07:00
Marcin Ślusarz
7ebae85955 intel/compiler: insert URB fence before task/mesh termination
Bspec 53421 says:
"A URB fence memory is typically performed prior the thread
exit message, so that the next thread dispatch that reads
that URB memory will see it."

Cc: 22.1 <mesa-stable>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16665>
2022-08-02 09:31:24 +00:00
Ian Romanick
f7f232385f intel/fs: Use canonical form for "work around" tags
Trivial.  Also clean up some weird whitespace.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>
2022-07-26 17:25:19 +00:00
Ian Romanick
377246318a intel/fs: Eliminate "masked" and "per slot offset" URB messages
All of this information can be inferred from the sources.

v2: Fix "error: unused variable 'opcode'" detected by marge-bot.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>
2022-07-26 17:25:19 +00:00
Ian Romanick
349a040f68 intel/fs: Make logical URB write instructions more like other logical instructions
The changes to fs_visitor::validate() helped track down a place where I
initially forgot to convert a message to the new sources layout.  This
had caused a different validation failure in
dEQP-GLES31.functional.tessellation.tesscoord.triangles_equal_spacing,
but this were not detected until after SENDs were lowered.

Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 19951145 -> 19951133 (<.01%)
instructions in affected programs: 2429 -> 2417 (-0.49%)
helped: 8 / HURT: 0

total cycles in shared programs: 858904152 -> 858862331 (<.01%)
cycles in affected programs: 5702652 -> 5660831 (-0.73%)
helped: 2138 / HURT: 1255

Broadwell
total cycles in shared programs: 904869459 -> 904835501 (<.01%)
cycles in affected programs: 7686744 -> 7652786 (-0.44%)
helped: 2861 / HURT: 2050

Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
Instructions in all programs: 141442369 -> 141442032 (-0.0%)
Instructions helped: 337

Cycles in all programs: 9099270231 -> 9099036492 (-0.0%)
Cycles helped: 40661
Cycles hurt: 28606

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>
2022-07-26 17:25:18 +00:00