Commit graph

5373 commits

Author SHA1 Message Date
Jason Ekstrand
4c8b100388 anv: Improve brw_nir_lower_mem_access_bit_sizes
This commit makes us take both bit size and alignment into account so
that we can properly handle cases such as when we have a 32-bit store
to an 8-bit-aligned address.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
2020-04-03 20:26:54 +00:00
Jason Ekstrand
c643979228 intel/fs: Choose memory message type based on bit size
Thanks to the NIR vectorizing pass, we're about to see alignments that
are higher than the bit size.  Previously, we could use either and we
just happened to choose alignment (probably the wrong choice) so it's
harmless to switch to detecting based on bit size.  This commit changes
things to take both into account which is more accurate to what the
messages we're using do.  We also beef up the asserts and make them more
consistent, more accurate, and more complete.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
2020-04-03 20:26:54 +00:00
Lionel Landwerlin
b38c32a573 intel/aub_viewer: fix access to freed memory
Windows closed while we're displaying them might lead to invalid
memory accessed, so use the safe iterators on the list of windows.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4430>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4430>
2020-04-03 15:46:24 +03:00
Jason Ekstrand
5cc27d59a1 anv/image: Use align_u64 for image offsets
The ALIGN functions in util/u_math.h work on uintptr_t whose size
changes depending on your platform.  Use ones which take an explicit
64-bit type instead to avoid 32-bit platform issues.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>
2020-04-02 15:08:42 +00:00
Juan A. Suarez Romero
191ced539a anv/pipeline: allow more than 16 FS inputs
A fragment shader can have more than 16 inputs, so SBE emission should
deal with all of them.

This fixes dEQP-VK.pipeline.max_varyings.*

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
2020-04-01 23:36:28 +00:00
Juan A. Suarez Romero
460de2159e intel/compiler: store the FS inputs in WM prog data
Store the fragment shader inputs in the program data so we can use them
later when required without needing the NIR shader.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
2020-04-01 23:36:28 +00:00
Juan A. Suarez Romero
67c7cabd7f anv: use urb_setup_attribs in SBE
Avoid looping over all VARYING_SLOT_MAX urb_setup arrray entries.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
2020-04-01 23:36:28 +00:00
Danylo Piliaiev
e47bf7dadf anv: Do not sample from 3d depth image with HiZ
For Gen8-11, there are some restrictions around sampling from HiZ.
The Skylake PRM docs for RENDER_SURFACE_STATE::AuxiliarySurfaceMode
say:

    "If this field is set to AUX_HIZ, Number of Multisamples must
    be MULTISAMPLECOUNT_1, and Surface Type cannot be SURFTYPE_3D."

Fixes: dEQP-VK.geometry.layered.3d.*.readback

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2720
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4409>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4409>
2020-04-01 20:12:29 +00:00
Ian Romanick
62795475e8 nir/algebraic: Distribute source modifiers into instructions
There are three main classes of cases that are helped by this change:

1. When the negation is applied to a value being type converted (e.g.,
   float(-x)).  This could possibly also be handled with more clever
   code generation.

2. When the negation is applied to a phi node source (e.g., x = -(...);
   at the end of a basic block).  This was the original case that caught
   my attention while looking at shader-db dumps.

3. When the negation is applied to the source of an instruction that
   cannot have source modifiers.  This includes texture instructions and
   math box instructions on pre-Gen7 platforms (see more details below).

In many these cases the negation can be propagated into the instructions
that generate the value (e.g., -(a*b) = (-a)*b).

In addition to the operations implemtned in this patch, I also tried:

 - frcp - Helped 6 or fewer shaders on Gen7+, and hurt just as many on
   pre-Gen7.  On Gen6 and earlier, frcp is a math box instruction, and
   math box instructions cannot have source modifiers.

   I suspect this is why so many more shaders are helped on Gen6 than on
   Gen5 or Gen7.  Gen6 supports OpenGL 3.3, so a lot more shaders
   compile on it.  A lot of these shaders may have things like cos(-x)
   or rcp(-x) that could result in an explicit negation instruction.

 - bcsel - Hurt a few shaders with none helped.  bcsel operates on
   integer sources, so the fabs or fneg cannot be a source modifier in
   the bcsel itself.

 - Integer instructions - No changes on any Intel platform.

Some notes about the shader-db results below.

 - On Tiger Lake, a single Deus Ex fragment shader is hurt for both
   spills and fills.

 - On Haswell, a different Deus Ex fragment shader is hurt for both
   spills and fills.

 - On GM45, the "LOST: 1" and "GAINED: 1" is a single Left4Dead 2
   (very high graphics settings, lol) fragment shader that upgrades
   from SIMD8 to SIMD16.

v2: Add support for fsign.  Add some patterns that remove redundant
negations and redundant absolute value rather than trying to push them
down the tree.

Tiger Lake
total instructions in shared programs: 17611333 -> 17586465 (-0.14%)
instructions in affected programs: 3033734 -> 3008866 (-0.82%)
helped: 10310
HURT: 632
helped stats (abs) min: 1 max: 35 x̄: 2.61 x̃: 1
helped stats (rel) min: 0.04% max: 16.67% x̄: 1.43% x̃: 1.01%
HURT stats (abs)   min: 1 max: 47 x̄: 3.21 x̃: 2
HURT stats (rel)   min: 0.04% max: 5.08% x̄: 0.88% x̃: 0.63%
95% mean confidence interval for instructions value: -2.33 -2.21
95% mean confidence interval for instructions %-change: -1.32% -1.27%
Instructions are helped.

total cycles in shared programs: 338365223 -> 338262252 (-0.03%)
cycles in affected programs: 125291811 -> 125188840 (-0.08%)
helped: 5224
HURT: 2031
helped stats (abs) min: 1 max: 5670 x̄: 46.73 x̃: 12
helped stats (rel) min: <.01% max: 34.78% x̄: 1.91% x̃: 0.97%
HURT stats (abs)   min: 1 max: 2882 x̄: 69.50 x̃: 14
HURT stats (rel)   min: <.01% max: 44.93% x̄: 2.35% x̃: 0.74%
95% mean confidence interval for cycles value: -18.71 -9.68
95% mean confidence interval for cycles %-change: -0.80% -0.63%
Cycles are helped.

total spills in shared programs: 8942 -> 8946 (0.04%)
spills in affected programs: 8 -> 12 (50.00%)
helped: 0
HURT: 1

total fills in shared programs: 9399 -> 9401 (0.02%)
fills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

Ice Lake
total instructions in shared programs: 16124348 -> 16102258 (-0.14%)
instructions in affected programs: 2830928 -> 2808838 (-0.78%)
helped: 11294
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.07% max: 17.65% x̄: 1.32% x̃: 0.93%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.45% max: 4.00% x̄: 3.72% x̃: 3.72%
95% mean confidence interval for instructions value: -1.99 -1.93
95% mean confidence interval for instructions %-change: -1.34% -1.29%
Instructions are helped.

total cycles in shared programs: 335393932 -> 335325794 (-0.02%)
cycles in affected programs: 123834609 -> 123766471 (-0.06%)
helped: 5034
HURT: 2128
helped stats (abs) min: 1 max: 3256 x̄: 43.39 x̃: 11
helped stats (rel) min: <.01% max: 35.79% x̄: 1.98% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2634 x̄: 70.63 x̃: 16
HURT stats (rel)   min: <.01% max: 49.49% x̄: 2.73% x̃: 0.62%
95% mean confidence interval for cycles value: -13.66 -5.37
95% mean confidence interval for cycles %-change: -0.69% -0.48%
Cycles are helped.

LOST:   0
GAINED: 2

Skylake
total instructions in shared programs: 14949240 -> 14927930 (-0.14%)
instructions in affected programs: 2594756 -> 2573446 (-0.82%)
helped: 11000
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.94 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.91
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 324829346 -> 324821596 (<.01%)
cycles in affected programs: 121566087 -> 121558337 (<.01%)
helped: 4611
HURT: 2147
helped stats (abs) min: 1 max: 3715 x̄: 33.29 x̃: 10
helped stats (rel) min: <.01% max: 36.08% x̄: 1.94% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2551 x̄: 67.88 x̃: 16
HURT stats (rel)   min: <.01% max: 53.79% x̄: 3.69% x̃: 0.89%
95% mean confidence interval for cycles value: -4.25 1.96
95% mean confidence interval for cycles %-change: -0.28% -0.02%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14971203 -> 14949957 (-0.14%)
instructions in affected programs: 2635699 -> 2614453 (-0.81%)
helped: 10982
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.93 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.90
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 336215033 -> 336086458 (-0.04%)
cycles in affected programs: 127383198 -> 127254623 (-0.10%)
helped: 4884
HURT: 1963
helped stats (abs) min: 1 max: 25696 x̄: 51.78 x̃: 12
helped stats (rel) min: <.01% max: 58.28% x̄: 2.00% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3401 x̄: 63.33 x̃: 16
HURT stats (rel)   min: <.01% max: 39.95% x̄: 2.20% x̃: 0.70%
95% mean confidence interval for cycles value: -29.99 -7.57
95% mean confidence interval for cycles %-change: -0.89% -0.71%
Cycles are helped.

total fills in shared programs: 24905 -> 24901 (-0.02%)
fills in affected programs: 117 -> 113 (-3.42%)
helped: 4
HURT: 0

LOST:   0
GAINED: 16

Haswell
total instructions in shared programs: 13148927 -> 13131528 (-0.13%)
instructions in affected programs: 2220941 -> 2203542 (-0.78%)
helped: 8017
HURT: 4
helped stats (abs) min: 1 max: 12 x̄: 2.17 x̃: 1
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.40% x̃: 0.93%
HURT stats (abs)   min: 1 max: 7 x̄: 2.50 x̃: 1
HURT stats (rel)   min: 0.33% max: 4.76% x̄: 2.73% x̃: 2.91%
95% mean confidence interval for instructions value: -2.21 -2.13
95% mean confidence interval for instructions %-change: -1.43% -1.37%
Instructions are helped.

total cycles in shared programs: 321221791 -> 321079870 (-0.04%)
cycles in affected programs: 126886055 -> 126744134 (-0.11%)
helped: 4674
HURT: 1729
helped stats (abs) min: 1 max: 23654 x̄: 56.47 x̃: 16
helped stats (rel) min: <.01% max: 53.22% x̄: 2.13% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3694 x̄: 70.58 x̃: 18
HURT stats (rel)   min: <.01% max: 63.06% x̄: 2.48% x̃: 0.90%
95% mean confidence interval for cycles value: -33.31 -11.02
95% mean confidence interval for cycles %-change: -0.99% -0.78%
Cycles are helped.

total spills in shared programs: 19872 -> 19874 (0.01%)
spills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

total fills in shared programs: 20941 -> 20941 (0.00%)
fills in affected programs: 62 -> 62 (0.00%)
helped: 1
HURT: 1

LOST:   0
GAINED: 8

Ivy Bridge
total instructions in shared programs: 11875553 -> 11853839 (-0.18%)
instructions in affected programs: 1553112 -> 1531398 (-1.40%)
helped: 7304
HURT: 3
helped stats (abs) min: 1 max: 16 x̄: 2.97 x̃: 2
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.62% x̃: 1.15%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.05% max: 3.33% x̄: 2.44% x̃: 2.94%
95% mean confidence interval for instructions value: -3.04 -2.90
95% mean confidence interval for instructions %-change: -1.65% -1.59%
Instructions are helped.

total cycles in shared programs: 178246425 -> 178184484 (-0.03%)
cycles in affected programs: 13702146 -> 13640205 (-0.45%)
helped: 4409
HURT: 1566
helped stats (abs) min: 1 max: 531 x̄: 24.52 x̃: 13
helped stats (rel) min: <.01% max: 38.67% x̄: 2.14% x̃: 1.02%
HURT stats (abs)   min: 1 max: 356 x̄: 29.48 x̃: 10
HURT stats (rel)   min: <.01% max: 64.73% x̄: 1.87% x̃: 0.70%
95% mean confidence interval for cycles value: -11.60 -9.14
95% mean confidence interval for cycles %-change: -1.19% -0.99%
Cycles are helped.

LOST:   0
GAINED: 10

Sandy Bridge
total instructions in shared programs: 10695740 -> 10667483 (-0.26%)
instructions in affected programs: 2337607 -> 2309350 (-1.21%)
helped: 10720
HURT: 1
helped stats (abs) min: 1 max: 49 x̄: 2.64 x̃: 2
helped stats (rel) min: 0.07% max: 20.00% x̄: 1.54% x̃: 1.13%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.04% max: 1.04% x̄: 1.04% x̃: 1.04%
95% mean confidence interval for instructions value: -2.69 -2.58
95% mean confidence interval for instructions %-change: -1.57% -1.51%
Instructions are helped.

total cycles in shared programs: 153478839 -> 153416223 (-0.04%)
cycles in affected programs: 22050900 -> 21988284 (-0.28%)
helped: 5342
HURT: 2200
helped stats (abs) min: 1 max: 1020 x̄: 20.34 x̃: 16
helped stats (rel) min: <.01% max: 24.05% x̄: 1.51% x̃: 0.86%
HURT stats (abs)   min: 1 max: 335 x̄: 20.93 x̃: 6
HURT stats (rel)   min: <.01% max: 20.18% x̄: 1.03% x̃: 0.30%
95% mean confidence interval for cycles value: -9.18 -7.42
95% mean confidence interval for cycles %-change: -0.82% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 8114882 -> 8105574 (-0.11%)
instructions in affected programs: 1232504 -> 1223196 (-0.76%)
helped: 4109
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.27 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 0.99% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.94% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.31 -2.21
95% mean confidence interval for instructions %-change: -1.01% -0.96%
Instructions are helped.

total cycles in shared programs: 188504036 -> 188466296 (-0.02%)
cycles in affected programs: 31203798 -> 31166058 (-0.12%)
helped: 3447
HURT: 36
helped stats (abs) min: 2 max: 92 x̄: 11.03 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.21% x̃: 0.13%
HURT stats (abs)   min: 2 max: 30 x̄: 7.33 x̃: 6
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.18% x̃: 0.10%
95% mean confidence interval for cycles value: -11.16 -10.51
95% mean confidence interval for cycles %-change: -0.22% -0.20%
Cycles are helped.

LOST:   0
GAINED: 1

GM45
total instructions in shared programs: 4989697 -> 4984531 (-0.10%)
instructions in affected programs: 703952 -> 698786 (-0.73%)
helped: 2493
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 1.03% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.95% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.13 -2.01
95% mean confidence interval for instructions %-change: -1.07% -0.99%
Instructions are helped.

total cycles in shared programs: 128929136 -> 128903886 (-0.02%)
cycles in affected programs: 21583096 -> 21557846 (-0.12%)
helped: 2214
HURT: 17
helped stats (abs) min: 2 max: 92 x̄: 11.44 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.24% x̃: 0.13%
HURT stats (abs)   min: 2 max: 8 x̄: 4.24 x̃: 4
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.20% x̃: 0.09%
95% mean confidence interval for cycles value: -11.75 -10.88
95% mean confidence interval for cycles %-change: -0.25% -0.22%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick
d2b4f3f137 intel/vec4: Allow late copy propagation on vec4
This change incurs a small amount of hurt now, but it enables a lot of
benefit on vec4 shaders on the next commit.  nir_opt_algebraic_late
converts dph, dot3, etc. to dhp_replicated, dot_replicated3, etc.  In
the process, it introduces extra moves.  If the original NIR contained

        vec1 32 ssa_45 = fdot4 ssa_51, ssa_44
        vec1 32 ssa_46 = fneg ssa_45

nir_opt_algebraic_late will produce

        vec4 32 ssa_18 = fdot_replicated4 ssa_1, ssa_15
        vec1 32 ssa_19 = mov ssa_18.x
        vec1 32 ssa_17 = fneg ssa_19

The algebraic pass added in the next commit can't see through the move
to know that the fneg applies to a fdot_replicated4.

Haswell, Ivy Bridge, and Sandybridge had similar results. (Haswell shown)
total cycles in shared programs: 187077604 -> 187079858 (<.01%)
cycles in affected programs: 350132 -> 352386 (0.64%)
helped: 174
HURT: 194
helped stats (abs) min: 2 max: 124 x̄: 23.60 x̃: 16
helped stats (rel) min: 0.12% max: 15.88% x̄: 4.98% x̃: 3.86%
HURT stats (abs)   min: 2 max: 164 x̄: 32.78 x̃: 16
HURT stats (rel)   min: 0.17% max: 22.82% x̄: 6.46% x̃: 0.86%
95% mean confidence interval for cycles value: 2.04 10.21
95% mean confidence interval for cycles %-change: 0.17% 1.93%
Cycles are HURT.

No shader-db changes on any other Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Lionel Landwerlin
88c046a6d3 isl: don't warn in physical extent calculation for yuv formats
Those format have correct descriptions already with the exception of
the planar format. In that case we introduce an assert.

This fine because we don't use the planar format in any of our
drivers. There are restrictions on how the addresses of the 2 planes
are relative to one another which make this annoying. The sampler is
also more limited than what we can do with a shader snippet.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2999>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2999>
2020-03-31 15:59:21 +00:00
Lionel Landwerlin
015f08dd43 isl: set bpb for Y8_UNORM
This isn't a format we use in any of the drivers but for consistency
just give it a correct bpb.

We also set the luminance in the G channel. We can't actually use this
format with the 3D sampler (only media).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2999>
2020-03-31 15:59:21 +00:00
Jason Ekstrand
896a7c28eb anv/allocator: Use util_dynarray for blocks in anv_state_stream
When we originally wrote a bunch of the allocation data structures, we
re-used the GPU memory for CPU-side data structures.  It's a bit more
memory efficient and usually ok.  However, this has a couple of
problems:

 1. It makes it MUCH more likely that the GPU will accidentlly stomp
    CPU-side data structures and cause nearly impossible to debug
    crashes.

 2. With discrete GPUs, the memory will be mapped somehow and that map
    may be across the BAR so it could have horribly slow CPU access.
    This is bad for our CPU-side data structures.

In the case of anv_state_stream, it also made the data structure
massively more complex than it needed to be.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
2020-03-31 08:12:07 +00:00
Jason Ekstrand
63bec07e14 anv: Account for the header in anv_state_stream_alloc
If we have an allocation that's exactly the block size, we end up
computing a new block size to allocate that's exactly the block size,
add in the header, and then assert fail.  When computing the block size,
we need to account for the header.

Fixes: 955127db93 "anv/allocator: Add support for large stream..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
2020-03-31 08:12:07 +00:00
Jason Ekstrand
4e80151c5d anv: Set alignments on descriptor and constant loads
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
2020-03-30 15:46:19 +00:00
Jason Ekstrand
2cb9cc56d5 intel/nir: Run copy-prop and DCE after lower_bool_to_int32
No shader-db impact on ICL with iris.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
2020-03-30 15:46:19 +00:00
Eric Engestrom
8970b7839a intel: drop unused include directories
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
2020-03-28 21:36:54 +01:00
Eric Engestrom
79af30768d meson: inline inc_common
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
2020-03-28 21:36:54 +01:00
Marek Olšák
e5339fe4a4 Move compiler.h and imports.h/c from src/mesa/main into src/util
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4324>
2020-03-27 21:00:09 +00:00
Lionel Landwerlin
aad0e6f810 intel/perf: store the probed i915-perf version
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
2020-03-27 14:14:49 +00:00
Lionel Landwerlin
8e7202d45f intel/perf: document meaning of query field
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
2020-03-27 14:14:49 +00:00
Lionel Landwerlin
dde96d31b7 intel/perf: move mdapi query definitions to their own file
Where they belong.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
2020-03-27 14:14:49 +00:00
Lionel Landwerlin
33b9c7a7f6 intel/perf: break GL query stuff away
This stuff is somewhat specific to the GL extension & drivers. On
Vulkan we won't use this, it also made a rather large file.

v2: Fix Android build (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
2020-03-27 14:14:49 +00:00
Lionel Landwerlin
f5c5574f42 intel/perf: move register definition to special file
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
2020-03-27 14:14:49 +00:00
Francisco Jerez
36c155a017 intel/fs/gen12: Fix interaction of SWSB dependency combination with EU fusion workaround.
This has been reported to fix a hang in Shadow of Mordor on Gen12.
One of its compute shaders seems to cause an in-order exec_all
dependency to be merged into an out-of-order SET dependency slot,
which would prevent us from baking the SET dependency into the parent
instruction, leading to an assert failure in emit_inst_dependencies()
(Thanks to Rafael for noticing that).  Prevent that by avoiding
combination of in-order dependencies whenever that would cause a SET
dependency to be demoted to a SYNC.NOP instruction.

Fixes: e14529ff32 "intel/fs/gen12: Workaround data coherency issues due to broken NoMask control flow."
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-03-26 19:09:42 +00:00
Danylo Piliaiev
2d0599b1b4 intel/aub_viewer: Fix format specifier for uint64_t
Use PRIx64 instead of lx for uint64_t

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2692
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4331>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4331>
2020-03-26 18:00:15 +00:00
Pierre-Eric Pelloux-Prayer
2cb965e5b6 util/os_file: extend os_read_file to return the file size
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4181>
2020-03-24 08:30:34 +01:00
Jason Ekstrand
67a10ea215 intel/dump_gpu: Handle a bunch of getparam in the no-HW case
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4250>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4250>
2020-03-24 06:28:29 +00:00
Jason Ekstrand
7fd4184378 intel/dump_gpu: Add an ensure_device_info helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4250>
2020-03-24 06:28:29 +00:00
Jason Ekstrand
be451f71ab anv: Stop fetching the timestamp frequency ourselves
gen_get_device_info_from_fd fetches the timestamp frequency from the
kernel.  ANV also carrying code for it is redundant.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4250>
2020-03-24 06:28:28 +00:00
D Scott Phillips
1182a3934a intel/tools/aubinator_error_decode: Decode ring buffers from HEAD to TAIL
Capture the HEAD and TAIL register values from the dump and
properly index the ring buffer using those. Previously we would
decode the ring buffer from the beginning, printing out whatever
happened to be there.

Also, properly pass the `from_ring` parameter to gen_print_batch()
so that decoding doesn't stop once MI_BATCH_BUFFER_END is
encoutered.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4261>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4261>
2020-03-23 19:10:50 +00:00
D Scott Phillips
49f9a0bb57 intel/tools/aubinator_error_decode: read HW Context before other batches
The hardware context buffer has state that was set before the
batch started. By decoding it first, references to things like
Dynamic State Base Address are decodable in the command batches.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4246>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4246>
2020-03-23 18:13:37 +00:00
Sagar Ghuge
60c789543e anv: Set patch count threshold in 3DSTATE_HS
Lets specifiy maximum number of patches that will be accumulated before
a thread is dispatched.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3563>
2020-03-23 17:57:57 +00:00
Sagar Ghuge
1a5ac646ce intel/compiler: Track patch count threshold
Return the number of patches to accumulate before an 8_PATCH mode thread
is launched.

v2: (Kenneth Graunke)
- Track patch count threshold instead of input control points.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3563>
2020-03-23 17:57:57 +00:00
Sagar Ghuge
b3dd54fe13 intel/genxml: Add patch count threshold field on gen12
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3563>
2020-03-23 17:57:57 +00:00
Jason Ekstrand
3252041a78 anv: Only add END_OF_PIPE_SYNC if we actually have AUX_INVAL
Fixes: 43dc842cb9 "anv: Wait for the GPU to be idle before..."
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4234>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4234>
2020-03-19 21:58:49 +00:00
Jason Ekstrand
9dbff6f6ce intel/iris: Always initialize CCS to 0
Previously, we were initializing the CCS to 0xFF for MCS+CCS due to a
misunderstanding of the following lines in the bspec:

    The following are the general SW requirements for MCS buffer clear
    functionality:
        ...
         - If Software wants to enable Color Compression without Fast
           clear, Software needs to initialize MCS with zeros.
         - Lossless compression and CCS initialized to all F (using HW
           Fast Clear or SW direct Clear) on the same surface is not
           supported.

The first line does not refer to the CCS as the comment author supposed
but refers to the MCS as the comment says.  It means that if you want to
use MCS compression without a fast-clear, you should initialize the MCS
to 0x00.  This is because the value 0x00 in the MCS means "all data is
in plane 0" which is a perfectly valid non-fast-clear initialization.
It's also the value the MCS should be in if you do a RECTLIST slow-clear
where the primitive fully covers each pixel such that the same value is
written to all samples.

The second line in the above quote seems to imply that CCS fast-clear is
incompatible with MCS fast-clear.  In particular, MCS+CCS fast-clear
uses a 0xff value in the MCS (like on Gen7-11) and leaves the CCS in
either the compressed or the pass-through state.  Therefore, we should
initialize the CCS to 0x00 even for MCS+CCS surfaces.

Reviewed-by: Sagar Ghuge<sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4074>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4074>
2020-03-19 20:54:19 +00:00
Lionel Landwerlin
507abc3959 isl: drop min row pitch alignment when set by the driver
When the caller of the isl_surf_init() specifies a row pitch, do not
consider the minimum CCS requirement if it's incompatible with the
caller's value.

isl_surf_get_ccs_surf() will check that the main surface alignment
matches CCS expectations.

v2: Simplify checks (Nanley)

v3: Add Comment about isl_surf_get_ccs_surf() (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: a3f6db2c4e ("isl: drop CCS row pitch requirement for linear surfaces")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
2020-03-19 19:17:10 +00:00
Lionel Landwerlin
def3470e9b isl: only apply main surface ccs pitch constraint with CCS
We could be creating a Y-tiled surface that isn't going to use CCS
(this could be the case when clearly indicated through modifiers).
Don't apply the main surface pitch alignment constraint in that case.

v2: Use logical NOT (Sagar)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a3f6db2c4e ("isl: drop CCS row pitch requirement for linear surfaces")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
2020-03-19 19:17:10 +00:00
Lionel Landwerlin
dab0aadea9 isl: properly filter supported display modifiers on Gen9+
Y tiling is supported for display on Gen9+ so don't filter it from the
possible flags.

v2: Drop Yf from display supported tilings on Gen12+ (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
2020-03-19 19:17:10 +00:00
Lionel Landwerlin
157a3cf3ec isl: implement linear tiling row pitch requirement for display
We're missing a requirement for alignment of row pitch for the display
HW. In linear tiling, the row pitch must be a 64bytes aligned.

v2: Use correct formula to align to 64bytes (Chad)

v3: Matching {} (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>
2020-03-19 19:17:10 +00:00
Jason Ekstrand
46187bb54f anv: Swizzle fast-clear values
Starting with Gen12, we can fast-clear a lot more surface formats and we
are suddenly in the position of having to fast-clear surfaces with
formats with an implicit swizzle such as VK_FORMAT_R4G4B4A4_UNORM_PACK16
which is represented as ISL_FORMAT_A4B4G4R4 with a BGRA swizzle.  In
order for blorp to do the fast-clear color conversion for us, it needs
a properly swizzled color.

This fixes the following Vulkan CTS groups on TGL:

 - dEQP-VK.pipeline.blend.format.b4g4r4a4_unorm_pack16.*
 - dEQP-VK.api.image_clearing.core.clear_color_image.*.b4g4r4a4*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
2020-03-18 21:05:07 +00:00
Jason Ekstrand
3fb8f19481 intel/blorp: Add support for swizzling fast-clear colors
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>
2020-03-18 21:05:07 +00:00
Chad Versace
6ee971c882 anv: Use isl_drm_modifier_get_default_aux_state()
Use it in anv_layout_to_aux_state().

Refactor only. No change in behavior.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>
2020-03-18 11:39:33 -07:00
Jason Ekstrand
0905d5a14a intel/isl: Don't align linear images to 64K on Gen12+
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>
2020-03-18 17:33:28 +00:00
Lionel Landwerlin
25a54554b3 intel/decoder: don't consider header fields past dword0
v2: use ULL

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>
2020-03-18 09:19:53 +00:00
Jason Ekstrand
d60375cbc2 anv: Do an end-of-pipe sync before updating AUX table entries
We've found in GL that an actual end-of-pipe sync is required before
invalidating the aux tables and that a simple CS stall is insufficient.
If we're about to modify the actual AUX table entries from the GPU, we
should definitely make sure it's stopped dead before we do so.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>
2020-03-17 16:38:50 +00:00
Caio Marcelo de Oliveira Filho
3dd0d12aa5 intel/blorp: Plumb the stage through blorp upload_shader
Vulkan uses that for its own upload function -- even though for BLORP
it doesn't really currently care.  Neither Iris and i965 makes use of
it at the moment.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170>
2020-03-17 08:24:46 -07:00
Jason Ekstrand
4061ac859d anv: Push UBO ranges relative to the start of the binding
There was a disconnect between anv_nir_compute_push_layout and the code
which sets up the push_ubo_sizes array.  The NIR code we emit checks
relative to the start of the bound UBO range so that, if we end up with
a vector which straddles the start of the push range, we can perform the
bounds check without risking overflow issues.  The code which sets up
the push_ubo_sizes, on the other hand, assumed it was relative to the
start of the push range.  Somehow, this didn't get get caught by any of
the available tests.

Fixes: e03f965280 "anv: Bounds-check pushed UBOs when ..."
Closes: #2623
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4195>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4195>
2020-03-16 15:14:14 +00:00
Jason Ekstrand
ae15b4fd73 anv: Fix the comparison in an assert
Fixes: e03f965280 "anv: Bounds-check pushed UBOs when ..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4195>
2020-03-16 15:14:14 +00:00