Commit graph

475 commits

Author SHA1 Message Date
Mark Janes
85e314db5d Revert "intel/fs: handle interpolation modes for at_sample and at_offset too"
This reverts commit 5afbb0e730.

Closes: #6198
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15534>
2022-03-23 11:03:47 -07:00
Iván Briano
5afbb0e730 intel/fs: handle interpolation modes for at_sample and at_offset too
Cc: mesa-stable

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15424>
2022-03-22 19:05:05 +00:00
Lionel Landwerlin
ec6e247a40 intel/fs: handle inline data on OpenCL style kernels
This is for Gfx12.5 with the COMPUTE_WALKER::Inline Data payload. We
do this in a similar way to the compute kernels.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Lionel Landwerlin
4ec5da7270 intel/nir/fs: replace COMPUTE || KERNEL by gl_shader_stage_is_compute()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>
2022-03-21 11:26:44 +00:00
Sagar Ghuge
6031ad4bf6 intel/fs: Add Wa_22013689345
v2: Use a simpler framework (Lionel)

v3: Rebase, add task/mesh (Lionel)

v4: Fixup fence exec size (SIMDX -> SIMD1)

v5: Fix invalidate_analysis, add finishme comment (Curro)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: 22.0 <mesa-stable>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14947>
2022-03-17 14:18:02 +00:00
Lionel Landwerlin
96c8880900 intel/fs: fix total_scratch computation
We only have a single prog_data::total_scratch for all shader variants
(SIMD 8, 16, 32). Therefore we should always max the total_scratch on
top of existing variant.

We probably haven't run into that issue before because we compile by
increasing SIMD size and higher SIMD size is more likely to spill. But
for bindless shaders with return shaders, if the last return part
doesn't spill, we completely ignore the previous parts' scratch
computation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15193>
2022-03-02 13:13:03 +00:00
Marcin Ślusarz
e5c39bc427 intel/compiler: optimize flat inputs mask calculation
Don't bother looking at urb if variable is not flat.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15169>
2022-02-25 22:34:22 +00:00
Marcin Ślusarz
e2cb562dd1 intel/compiler: ignore per-primitive attrs when calculating flat input mask
If we say that per-primitive attributes are flat (which is communicated by
3DSTATE_SBE.ConstantInterpolationEnable), GPU freaks out and applies it
to other (non-flat) attributes.

Fixes: be89ea3231 ("intel/compiler: Handle per-primitive inputs in FS")

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15169>
2022-02-25 22:34:22 +00:00
Marcin Ślusarz
f91bfc80ba intel/compiler: remove redundant code from fs_visitor::run_*
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15079>
2022-02-22 09:09:05 +00:00
Lionel Landwerlin
2763a8af5a anv/genxml/intel/fs: fix binding shader record entry
Bit is flipped compared to all the other packets.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 705395344d ("intel/fs: Add support for compiling bindless shaders with resume shaders")
Fixes: c3ac9afca3 ("anv: Create and return ray-tracing pipeline SBT handles")
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15078>
2022-02-19 13:50:56 +00:00
Iván Briano
db48dcb4f3 intel/compiler: remove what looks like a bad rebase
This bit in the compiler looks like it was added by accident on one of
the latest versions of the original commit, but it clearly doesn't
belong there.

Fixes: 03e1e19246 ("anv: Refactor descriptor copy")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15016>
2022-02-15 01:04:47 +00:00
Ian Romanick
38a94c82e6 intel/fs: Don't optimize out 1.0*x and -1.0*x
This (sort of) matches the behavior of nir_opt_algebraic.  This ensures
that subnormal values are properly flushed to zero.

With the aid of "nir/search: Float sources of texture instructions are
float users" and "nir/search: Transitively apply is_only_used_as_float",
there would have been no shader-db regressions on Intel platforms.
However, those caused a significant increase in compile time.  Since the
instruction regressions were so small, I just dropped those commits
rather than improve them.

All Haswell and newer platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20125042 -> 20125094 (<.01%)
instructions in affected programs: 7184 -> 7236 (0.72%)
helped: 0
HURT: 32
HURT stats (abs)   min: 1 max: 4 x̄: 1.62 x̃: 2
HURT stats (rel)   min: 0.11% max: 1.49% x̄: 0.85% x̃: 0.78%
95% mean confidence interval for instructions value: 1.39 1.86
95% mean confidence interval for instructions %-change: 0.74% 0.96%
Instructions are HURT.

total cycles in shared programs: 862745586 -> 862746551 (<.01%)
cycles in affected programs: 109872 -> 110837 (0.88%)
helped: 12
HURT: 23
helped stats (abs) min: 2 max: 774 x̄: 90.83 x̃: 19
helped stats (rel) min: 0.07% max: 25.23% x̄: 3.06% x̃: 0.40%
HURT stats (abs)   min: 2 max: 1106 x̄: 89.35 x̃: 12
HURT stats (rel)   min: 0.08% max: 45.40% x̄: 3.01% x̃: 0.47%
95% mean confidence interval for cycles value: -60.09 115.23
95% mean confidence interval for cycles %-change: -2.21% 4.07%
Inconclusive result (value mean confidence interval includes 0).

All of the shaders hurt are in either UE4 shooter-game or shooter_demo.

Tiger Lake
Instructions in all programs: 159893213 -> 159893290 (+0.0%)
SENDs in all programs: 6936431 -> 6936431 (+0.0%)
Loops in all programs: 38385 -> 38385 (+0.0%)
Cycles in all programs: 7019259514 -> 7019260087 (+0.0%)
Spills in all programs: 101389 -> 101389 (+0.0%)
Fills in all programs: 131532 -> 131532 (+0.0%)

Ice Lake
Instructions in all programs: 143624164 -> 143624235 (+0.0%)
SENDs in all programs: 6980289 -> 6980289 (+0.0%)
Loops in all programs: 38383 -> 38383 (+0.0%)
Cycles in all programs: 8440082767 -> 8440083238 (+0.0%)
Spills in all programs: 102246 -> 102246 (+0.0%)
Fills in all programs: 131908 -> 131908 (+0.0%)

Skylake
Instructions in all programs: 134185424 -> 134185495 (+0.0%)
SENDs in all programs: 6938790 -> 6938790 (+0.0%)
Loops in all programs: 38356 -> 38356 (+0.0%)
Cycles in all programs: 8222366529 -> 8222366923 (+0.0%)
Spills in all programs: 98821 -> 98821 (+0.0%)
Fills in all programs: 125218 -> 125218 (+0.0%)

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: f5dd6dfe01 ("anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>
2022-02-10 18:15:39 +00:00
Rohan Garg
03e1e19246 anv: Refactor descriptor copy
Refactor descriptor copies to use the existing helper functions instead
of rolling our own. In order to facilitate this, we need to store the
appropriate buffer views for the relevant descriptors internally and
reuse them in the helpers.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14909>
2022-02-09 09:24:37 +00:00
Lionel Landwerlin
bb40e999d1 intel/nir: use a single intel intrinsic to deal with ray traversal
In the future we'll want to reuse this intrinsic to deal with ray
queries. Ray queries will use a different global pointer and
programmatically change the control/level arguments of the trace send
instruction.

v2: Comment on barrier after sync trace instruction (Caio)
    Generalize lsc helper (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
2022-02-08 12:55:25 +00:00
Lionel Landwerlin
2665595244 intel/fs: limit FS dispatch to SIMD16 when using ray queries
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
2022-02-08 12:55:25 +00:00
Lionel Landwerlin
57eed6698b intel/compiler: tracker number of ray queries in prog_data
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
2022-02-08 12:55:25 +00:00
Lionel Landwerlin
9d22f8ed23 intel/fs: add support for ACCESS_ENABLE_HELPER
v2: Factor out fragment shader masking on send messages (Caio)
    Update comments (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
2022-02-08 12:55:24 +00:00
Lionel Landwerlin
c199f44d17 intel/fs: name sources for A64 opcodes
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
2022-02-08 12:55:24 +00:00
Marcin Ślusarz
1d9f47325b intel/compiler: handle gl_[Clip|Cull]Distance from mesh in fragment shaders
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14788>
2022-01-29 06:32:19 +00:00
Caio Oliveira
856a0cacb1 intel/compiler: Merge Per-Primitive attribute handling in Mesh case
Just a refactor, no behavior change.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14788>
2022-01-29 06:32:19 +00:00
Caio Oliveira
2b8b884bcd intel/compiler: Have specific mesh handling in calculate_urb_setup()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14788>
2022-01-29 06:32:19 +00:00
Kenneth Graunke
d475e839da intel/fs: Reuse the same FS input slot for VUE header fields.
VARYING_SLOT_{VIEWPORT,LAYER,PSIZ} all live in the same VUE header slot,
and the FS is already set up to read the x/y/z/w component of that vec4.

However, we were setting up the SBE to pass each of those items as a
separate FS input, so hypothetically if a shader read all three, we
would burn 3 FS inputs with redundant data.  Not only was this passing
extra data to the FS, but it would count as extra input slots for the
"Do we have 16 or fewer attributes?" check for using SBE swizzling to
rearrange them in a convenient manner.

Now we make them share a single FS attribute and only count them once.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14210>
2022-01-19 01:31:47 +00:00
Lionel Landwerlin
30a8b8d2df intel/fs: disable VRS when omask is written
As indicated by
VkPhysicalDeviceFragmentShadingRatePropertiesKHR::fragmentShadingRateWithShaderSampleMask
our implementation will clamp to 1x1 when reading samplemask or
writing to samplemask.

This fixes vkd3d-proton tests test_sample_mask_dxbc & test_sample_mask_dxil

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b6332fc4a8 ("intel/compiler: handle coarse pixel in render target writes descriptors")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14553>
2022-01-14 19:14:06 +00:00
Jason Ekstrand
a1de102479 intel/fs: Use compare_func for wm_prog_key::alpha_test_func
Because 0 is no longer a recognizable value (it's NEVER, which isn't a
good default), we add an emit_alpha_test bool to tell the back-end when
to bother alpha testing.  This lets us only touch crocus with the
change.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14157>
2022-01-14 15:08:09 +00:00
Jordan Justen
d57b10ab98 intel/compiler: Adjust TCS instance-id for dg2+
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14385>
2022-01-05 16:13:28 -08:00
Marcin Ślusarz
a48f1d51e2 intel/compiler: disable workaround not applicable to gfx >= 11
There's nothing in bspec that would suggest this is still needed.
It only affected gfx 9 and 10.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14280>
2021-12-22 10:13:25 +00:00
Jason Ekstrand
eebb2dedb2 intel/fs: Add a NONE scheduling mode
While our LIFO scheduling mode attempts to optimize for register
pressure, it's often hard for a scheduling algorithm to do better than
the instruction order provided by the shader author.  Shader authors
often do perfectly reasonable things like using texture results
immediately after fetching them or constructing texture coordinates
immediately before the texture op.  When we throw all the instruction
ordering information away, we loose any help the author may have given
us.  By attempting NONE before we fall back to the worst case LIFO mode.

And, yes, I tried this with NONE both before and after LIFO and doing
NONE before LIFO is substantially better, according to shader-db.

    total instructions in shared programs: 19673152 -> 19665202 (-0.04%)
    instructions in affected programs: 33669 -> 25719 (-23.61%)
    helped: 20
    HURT: 0
    helped stats (abs) min: 15 max: 4609 x̄: 397.50 x̃: 107
    helped stats (rel) min: 2.33% max: 67.50% x̄: 14.60% x̃: 9.12%
    95% mean confidence interval for instructions value: -867.61 72.61
    95% mean confidence interval for instructions %-change: -21.74% -7.46%
    Inconclusive result (value mean confidence interval includes 0).

    total cycles in shared programs: 935562500 -> 935020920 (-0.06%)
    cycles in affected programs: 18620349 -> 18078769 (-2.91%)
    helped: 104
    HURT: 48
    helped stats (abs) min: 88 max: 60986 x̄: 8031.48 x̃: 3680
    helped stats (rel) min: 0.61% max: 51.44% x̄: 14.95% x̃: 8.87%
    HURT stats (abs)   min: 10 max: 54724 x̄: 6118.62 x̃: 1530
    HURT stats (rel)   min: 0.13% max: 46.45% x̄: 10.28% x̃: 6.46%
    95% mean confidence interval for cycles value: -5724.34 -1401.71
    95% mean confidence interval for cycles %-change: -9.86% -4.10%
    Cycles are helped.

    total spills in shared programs: 12158 -> 10327 (-15.06%)
    spills in affected programs: 1831 -> 0
    helped: 20
    HURT: 0

    total fills in shared programs: 14749 -> 12635 (-14.33%)
    fills in affected programs: 2114 -> 0
    helped: 20
    HURT: 0

    LOST:   8
    GAINED: 649

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
e6ddee764e intel/fs: Reset instruction order before re-scheduling
The way the current scheduler loop is implemented, each scheduling pass
starts with what the previous pass had.  This means that, if PRE screwed
everything up majorly, PRE_NON_LIFO would have to try to fix it.  It
also meant that tiny changes to one pass would affect every later pass.
Instead, reset the order of the instructions before each scheduling
pass.  This makes the passes entirely independent of each other.

Shader-db results on Ice Lake:

    total instructions in shared programs: 19670486 -> 19670648 (<.01%)
    instructions in affected programs: 25317 -> 25479 (0.64%)
    helped: 2
    HURT: 7
    helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
    helped stats (rel) min: 0.07% max: 0.07% x̄: 0.07% x̃: 0.07%
    HURT stats (abs)   min: 8 max: 70 x̄: 24.29 x̃: 12
    HURT stats (rel)   min: 0.41% max: 4.95% x̄: 1.47% x̃: 0.87%
    95% mean confidence interval for instructions value: -1.28 37.28
    95% mean confidence interval for instructions %-change: -0.04% 2.30%
    Inconclusive result (value mean confidence interval includes 0).

    total cycles in shared programs: 935535948 -> 935490243 (<.01%)
    cycles in affected programs: 421994824 -> 421949119 (-0.01%)
    helped: 1269
    HURT: 879
    helped stats (abs) min: 1 max: 12008 x̄: 259.38 x̃: 52
    helped stats (rel) min: <.01% max: 28.02% x̄: 1.12% x̃: 0.14%
    HURT stats (abs)   min: 1 max: 29931 x̄: 322.46 x̃: 20
    HURT stats (rel)   min: <.01% max: 32.17% x̄: 1.74% x̃: 0.22%
    95% mean confidence interval for cycles value: -71.37 28.81
    95% mean confidence interval for cycles %-change: -0.11% 0.21%
    Inconclusive result (value mean confidence interval includes 0).

    total spills in shared programs: 12403 -> 12430 (0.22%)
    spills in affected programs: 1355 -> 1382 (1.99%)
    helped: 2
    HURT: 7

    total fills in shared programs: 15128 -> 15182 (0.36%)
    fills in affected programs: 3294 -> 3348 (1.64%)
    helped: 2
    HURT: 7

    LOST:   21
    GAINED: 28

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
d49d092259 Revert "intel/fs: Do cmod prop again after scheduling"
This reverts commit ba2fa1ceaf.  Doing
optimizations after scheduling but before RA means doing them in the
middle of the scheduling loop which introduces additional dependencies
between one scheduling iteration and the next.  That won't work if we
want to make the scheduling modes independent, at least not unless we
have some way of fully cloning the IR.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
cf98a3cc19 intel/fs: Use OPT() for split_virtual_grfs
Now that we're being conservative in the pass, it's easy to tell when it
makes progress and we can put it in the OPT() macro.  This way, we get
nice INTEL_DEBUG=optimizer dumps for it.  While we're here, fix the
header comment which is massively out-of-date.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
38fa18a7a3 intel/fs: Be more conservative in split_virtual_grfs
Instead of modifying every single instruction, keep track of which VGRFs
are actually split in a bit-set, and only modify the instructions that
actually touch split regs.

This cuts the runtime of the following Vulkan CTS on my SKL box by 45%
from 3:21 to 1:51:  dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13734>
2021-12-18 01:46:19 +00:00
Jason Ekstrand
3c89dbdbfe intel/fs: Implement the sample_pos_or_center system value
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>
2021-12-17 16:02:16 +00:00
Jason Ekstrand
a580fd55e1 intel/fs: Rework emit_samplepos_setup()
This rolls compute_sample_position into emit_samplepos_setup, its only
caller, by using a loop instead of calling it twice.  We also
early-return for the !persample_dispatch case instead of doing it as
part of the sample calculation.  This means that we don't call
fetch_payload_reg() to get sample_pos_reg unless we're actually going to
use it so the function is safe to call even if we haven't set up
sample_pos_reg.  This will be important for the next commit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>
2021-12-17 16:02:16 +00:00
Jason Ekstrand
ac7255ed1e intel/fs: Return fs_reg directly from builtin setup helpers
There's no good reason why we're allocating them on the heap and
returning a pointer.  Return the fs_reg directly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>
2021-12-17 16:02:16 +00:00
Caio Oliveira
2ad11b39bd intel/compiler: Use a struct for brw_compile_bs parameters
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14139>
2021-12-13 01:08:16 +00:00
Jason Ekstrand
278d12f991 intel/fs,vec4: Drop prog_data binding tables
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>
2021-12-10 21:20:47 +00:00
Jason Ekstrand
4fa58d27a5 intel/fs,vec4: Drop support for shader time
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>
2021-12-10 21:20:47 +00:00
Jason Ekstrand
8f3c100d61 intel/fs,vec4: Drop uniform compaction and pull constant support
The only driver using these was i965 and it's gone now.  This is all
dead code.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>
2021-12-10 21:20:47 +00:00
Caio Oliveira
1f438eb033 intel/compiler: Implement Mesh Output
Use the same URB access helpers that were added for Task Output.  The
Arrayed I/O (per-primitive and per-vertex) is handled by applying the
pitch from the MUE layout into the NIR intrinsics and including the
non-arrayed offset on top of it.  After that, the index src can be
used directly for lowering.

Because we keep around the non-arrayed offset AND the pitch is
aligned, we can identify cases where the access is indirect but
guaranteed to be aligned, and dispatch a single message.  Added a TODO
to explore that later.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
2021-12-04 00:41:46 +00:00
Caio Oliveira
db23c41537 intel/compiler: Add backend compiler basics for Task/Mesh
Task/Mesh stages are CS-like stages, and include many
builtins (e.g. workgroup ID/index) and intrinsics (e.g. workgroup
memory primitives) originally present only in CS.

This commit add two new stages (task and mesh) that 'inherit' from CS
by embedding a brw_cs_prog_data in their own prog_data structure, so
that CS functionality can be easily reused.  They also currently use
the same helpers to select the SIMD variant to use -- that was
recently added for CS.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
2021-12-04 00:41:46 +00:00
Caio Oliveira
827cf65a26 intel/compiler: Export brw_nir_lower_simd
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
2021-12-04 00:41:46 +00:00
Caio Oliveira
09dd05a219 intel/compiler: Make MUE available when setting up FS URB access
Allows to assert its existence for per-primitive variables and will
later be useful to implement the "more than 16 attributes" case for
Mesh.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
2021-12-04 00:41:46 +00:00
Caio Oliveira
be89ea3231 intel/compiler: Handle per-primitive inputs in FS
In Fragment Shader, regular inputs are laid out in the thread payload
in a one dword per each half-GRF, that gives room for having the two
delta dwords needed for interpolation.

Per-primitive inputs are laid out before the regular inputs, and since
there's no need to have delta information, they are packed.  So
half-GRF will be fully filled with 4 dwords of input.

When num_per_primitive_inputs is zero (the default case), behavior
should be the same as before.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
2021-12-04 00:41:46 +00:00
Caio Oliveira
7938c38778 intel/compiler: Properly lower WorkgroupId for Task/Mesh
Task/Mesh currently only support a single dimension (both in NV API
and HW), so make Y and Z be zero.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
2021-12-04 00:41:46 +00:00
Topi Pohjolainen
261dd6c8f8 intel/compiler: Add new variant for TXF_CMS_W
This allows, for example, fs_inst::components_read() without passing
devinfo as extra argument.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
2021-11-22 21:27:30 -08:00
Topi Pohjolainen
24831bbd40 intel/compiler: Prepare ld2dms_w for 4 mcs components
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
2021-11-22 21:27:30 -08:00
Topi Pohjolainen
dfe0ba9080 intel/compiler: Demote sampler params to 16-bit for CMS/UMS/MCS
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
2021-11-22 21:27:30 -08:00
Topi Pohjolainen
0374b56faa intel/compiler/fs: Add support for 16-bit sampler msg payload
For SIMD8 half float payload, each component takes a full register, so
we can use existing LOAD_PAYLOAD infrastruture for required padding by
alternating plain 8-wide half float vector and null vector.

Also this patch removes an unwanted assertion from
opt_copy_propagation_local for LOAD_PAYLOAD.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
2021-11-22 21:27:30 -08:00
Sagar Ghuge
936412af27 intel/compiler: Add helper to support half float payload with padding
To support SIMD8 half float payloads, each component takes one full
32bit wide register in both SIMD8H and SIMD16H mode. So we can make use
of existing LOAD_PAYLOAD infrastructure alternating a half float vector
and a null vector, in order to handle required padding.

v2: (Francisco)
- Skip header sources
- Fix comparision units
- Don't allocate VGRF for padded source

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
2021-11-22 21:27:30 -08:00
Sagar Ghuge
be2bfe5fe8 intel/compiler: Don't hardcode padding source type to 32bit
We can use LOAD_PAYLOAD infrastructure in order to handle 16bit float
payload. Let's rely on source type for padding sources, if not set
previously then default one would be 32-bit.

This patch will be used later in the series to handle 16-bit float
payloads.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
2021-11-22 21:27:30 -08:00