Commit graph

4346 commits

Author SHA1 Message Date
Ian Romanick
47c2aa5b48 intel/vec4: Reswizzle VF immediates too
Previously, an instruction like

mul(8) vgrf29.xy:F, vgrf25.yxxx:F, [-1F, 1F, 0F, 0F]

would get rewritten as

mul(8) vgrf0.yz:F, vgrf25.yyxx:F, [-1F, 1F, 0F, 0F]

The latter does not produce the correct result.  The VF immediate in the
second should be either [-1F, -1F, 1F, 1F] or [0F, -1F, 1F, 0F].  This
commit produces the former.

Fixes: 1ee1d8ab46 ("i965/vec4: Reswizzle sources when necessary.")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-08 11:30:10 -07:00
Dongwon Kim
f734e2a042 anv: disable repacking for compression for applicable gen
set bit15 (Disable Repacking for Compression) of CACHE_MODE_0 register
if the gen attribute, 'disable_ccs_repack' is set.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2019-07-08 10:54:38 -07:00
Dongwon Kim
eb6d067e68 intel: add disable_ccs_repack to gen_device_info
add a new attribute, 'disable_ccs_repack' to gen_device info, which
indicates whether repacking of components in certain pixel formats
before compression needs to be disabled to keep the compatibility
with decompression capability of display controller (gen11+)

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2019-07-08 10:54:38 -07:00
Dongwon Kim
e6ac6d3224 intel/genxml: correct bit fields in CACHE_MODE_0 reg for gen11
correct bit fields information of CACHE_MODE_0 reg in current gen11.xml

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2019-07-08 10:54:37 -07:00
Caio Marcelo de Oliveira Filho
9c7adaeb5f anv: Advertise VK_EXT_shader_demote_to_helper_invocation
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-08 08:57:25 -07:00
Caio Marcelo de Oliveira Filho
45f5db5a84 intel/fs: Implement "demote to helper invocation"
The "demote" intrinsic works like "discard" but don't change the
control flow, allowing derivative operations to work.  This is the
semantics of D3D discard.

The "is_helper_invocation" intrinsic will return true for helper
invocations -- both the ones that started as helpers and the ones that
where demoted.  This is needed to avoid changing the behavior of
gl_HelperInvocation which is an input (so not expected to change
during shader execution).

v2: Emit the discard jump and comment why it is safe.  (Jason)
    Rework the is_helper_invocation() that was stomping f0.1.  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-08 08:57:25 -07:00
Connor Abbott
6b28808b22 intel/nir: Extract add_const_offset_to_base
Pretty much every driver using nir_lower_io_to_temporaries followed by
nir_lower_io is going to want this. In particular, radv and radeonsi in
the next commits.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 14:14:53 +02:00
Connor Abbott
27f0c3c15e radv: Make FragCoord a sysval
load_fragcoord is already handled in common code for radeonsi, so we
don't need to do anything to handle it. However, there were some passes
creating NIR with the varying, so we switch them over to the sysval. In
the case of nir_lower_input_attachments which is used by both radv and
anv, we add handling for both until intel switches to using a sysval.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 14:14:53 +02:00
Daniel Schürmann
c31f470066 anv,nir: Move lower_input_attachments pass from ANV to NIR.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 14:02:50 +02:00
Chia-I Wu
5824130389 anv: fix VkExternalBufferProperties for host allocation
It was reported as unsupported previously.  It should be importable
and is compatible with itself.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Fixes: 69cc6272fb ("anv: Implement VK_EXT_external_memory_host")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-07-07 13:31:58 -07:00
Chia-I Wu
f3c7a02a62 anv: fix VkExternalBufferProperties for unsupported handles
compatibleHandleTypes must include the queried handle type.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-07-07 13:31:58 -07:00
Lionel Landwerlin
5493ec3c19 anv: manually add KHR_display to the list of platforms
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 38305e6c94 ("anv: replace hard-coded platform list with vk.xml parse")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111078
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-07-07 15:34:09 +03:00
Juan A. Suarez Romero
e06bc0b166 intel: fix wrong format usage
Do not use the view format when filling the surface state.

Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.*

Fixes: fb1350c76f ("intel: Add and use helpers for level0 extent")

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-07-03 10:14:54 +02:00
Jason Ekstrand
e708261cb7 anv: Advertise a more accurate minTexelBufferOffsetAlignment
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-02 22:28:44 +00:00
Jason Ekstrand
0bc657f2db anv: Implement VK_EXT_texel_buffer_alignment
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-02 22:28:44 +00:00
Jason Ekstrand
fa869f45c8 intel/fs: Use nir_lower_interpolation on gen11+
On gen11, the removed the PLN instruction so we have to emit a pile of
MAD to emulate it.  We may as well do that in NIR so we can optimize and
later schedule it.

Shader-db results on Ice Lake:

    total instructions in shared programs: 17145644 -> 16556440 (-3.44%)
    instructions in affected programs: 11507454 -> 10918250 (-5.12%)
    helped: 35763
    HURT: 42085
    helped stats (abs) min: 1 max: 140 x̄: 19.09 x̃: 18
    helped stats (rel) min: 0.04% max: 37.93% x̄: 15.40% x̃: 14.49%
    HURT stats (abs)   min: 1 max: 248 x̄: 2.22 x̃: 2
    HURT stats (rel)   min: 0.05% max: 50.00% x̄: 5.00% x̃: 2.47%
    95% mean confidence interval for instructions value: -7.67 -7.47
    95% mean confidence interval for instructions %-change: -4.46% -4.29%
    Instructions are helped.

    total loops in shared programs: 4370 -> 4370 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total cycles in shared programs: 360624645 -> 368220857 (2.11%)
    cycles in affected programs: 269631244 -> 277227456 (2.82%)
    helped: 15583
    HURT: 65874
    helped stats (abs) min: 1 max: 28561 x̄: 78.45 x̃: 32
    helped stats (rel) min: <.01% max: 67.81% x̄: 5.38% x̃: 2.44%
    HURT stats (abs)   min: 1 max: 238638 x̄: 133.87 x̃: 20
    HURT stats (rel)   min: <.01% max: 306.25% x̄: 5.81% x̃: 3.97%
    95% mean confidence interval for cycles value: 67.42 119.09
    95% mean confidence interval for cycles %-change: 3.61% 3.73%
    Cycles are HURT.

    total spills in shared programs: 8943 -> 8981 (0.42%)
    spills in affected programs: 1925 -> 1963 (1.97%)
    helped: 44
    HURT: 14

    total fills in shared programs: 21815 -> 21925 (0.50%)
    fills in affected programs: 3511 -> 3621 (3.13%)
    helped: 41
    HURT: 18

    LOST:   70
    GAINED: 14

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-02 16:15:25 +00:00
Jason Ekstrand
2b79a9e5a5 intel/fs: Implement nir_intrinsic_load_fs_input_interp_deltas
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-02 16:15:25 +00:00
Jason Ekstrand
8e7d066682 intel/fs: Actually implement the load_barycentric intrinsics
If they never get used, dead code should clean them up.  Also, we rework
the at_offset and at_sample intrinsics so they return a proper vec2
instead of returning things in PLN layout.  Fortunately, copy-prop is
pretty good at cleaning this up and it doesn't result in any actual
extra MOVs.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-02 16:15:25 +00:00
Sagar Ghuge
d5f63990b4 intel/tools: Add assembler unit tests for ROL/ROR instructions
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-01 10:14:22 -07:00
Sagar Ghuge
e9c35dd7cc intel/tools: Add ROL/ROR support in assembler
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-01 10:14:22 -07:00
Sagar Ghuge
1e92e83856 intel/compiler: Emit ROR and ROL instruction
v2: Reorder patch (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-01 10:14:22 -07:00
Sagar Ghuge
83fdec0f0d intel/compiler: Enable the emission of ROR/ROL instructions
v2: 1) Drop changes for vec4 backend as on Gen11+ we don't support
       align16 mode (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-01 10:14:22 -07:00
Eric Engestrom
5f9764bc0b anv: fix indentation
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-29 22:41:06 +01:00
Eric Engestrom
42eb85a9d8 anv: fix typo
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-29 22:41:06 +01:00
Eric Engestrom
38305e6c94 anv: replace hard-coded platform list with vk.xml parse
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-29 22:38:54 +01:00
Lionel Landwerlin
5847de6e9a intel/compiler: don't use byte operands for src1 on ICL
The simulator complains about using byte operands, we also have
documentation telling us.

Note that add operations on bytes seems to work fine on HW (like ADD).
Using dwords operands with CMP & SEL fixes the following tests :

   dEQP-VK.spirv_assembly.type.vec*.i8.*

v2: Drop the GLK changes (Matt)
    Add validator tests (Matt)

v3: Drop GLK ref (Matt)
    Don't mix float/integer in MAD (Matt)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
BSpec: 3017
Cc: <mesa-stable@lists.freedesktop.org>
2019-06-29 12:56:09 +00:00
Ian Romanick
b04beaf41d intel/vec4: Try both sources as candidates for being immediates
For some reason, when I first wrote try_immediate_source, I thought the
sources had already been ordered so that the immediate value was the
second source.  That's rubbish.  The generator assumes *neither* source
is immediate, and it relies on later copy/constant propagation passes to
do the reordering.

For this reason, the changes to try_immediate_source have to go to some
efforts to reorder the operands and tell the caller when it reordered
them.  The generator for comparison instructions uses this to determine
when the comparison needs to change (e.g., from GT to LT).

No changes on any Gen8 or later platform because those platforms do not
use the vec4 backend.

Haswell
total instructions in shared programs: 13484431 -> 13480500 (-0.03%)
instructions in affected programs: 441138 -> 437207 (-0.89%)
helped: 1883
HURT: 0
helped stats (abs) min: 1 max: 49 x̄: 2.09 x̃: 1
helped stats (rel) min: 0.07% max: 8.91% x̄: 1.10% x̃: 0.90%
95% mean confidence interval for instructions value: -2.19 -1.98
95% mean confidence interval for instructions %-change: -1.14% -1.06%
Instructions are helped.

total cycles in shared programs: 376420286 -> 376406400 (<.01%)
cycles in affected programs: 15995668 -> 15981782 (-0.09%)
helped: 1692
HURT: 219
helped stats (abs) min: 2 max: 764 x̄: 13.78 x̃: 4
helped stats (rel) min: <.01% max: 9.69% x̄: 0.69% x̃: 0.35%
HURT stats (abs)   min: 2 max: 516 x̄: 43.09 x̃: 22
HURT stats (rel)   min: 0.02% max: 12.09% x̄: 2.30% x̃: 1.13%
95% mean confidence interval for cycles value: -9.70 -4.83
95% mean confidence interval for cycles %-change: -0.42% -0.28%
Cycles are helped.

total spills in shared programs: 23166 -> 23158 (-0.03%)
spills in affected programs: 66 -> 58 (-12.12%)
helped: 2
HURT: 0

total fills in shared programs: 34592 -> 34580 (-0.03%)
fills in affected programs: 75 -> 63 (-16.00%)
helped: 2
HURT: 0

Ivy Bridge
total instructions in shared programs: 12051590 -> 12048513 (-0.03%)
instructions in affected programs: 355911 -> 352834 (-0.86%)
helped: 1481
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 2.08 x̃: 1
helped stats (rel) min: 0.07% max: 4.92% x̄: 1.08% x̃: 0.90%
95% mean confidence interval for instructions value: -2.17 -1.98
95% mean confidence interval for instructions %-change: -1.12% -1.04%
Instructions are helped.

total cycles in shared programs: 180319624 -> 180307642 (<.01%)
cycles in affected programs: 15591028 -> 15579046 (-0.08%)
helped: 1340
HURT: 174
helped stats (abs) min: 2 max: 764 x̄: 14.19 x̃: 2
helped stats (rel) min: <.01% max: 8.68% x̄: 0.64% x̃: 0.32%
HURT stats (abs)   min: 2 max: 518 x̄: 40.41 x̃: 14
HURT stats (rel)   min: 0.02% max: 8.37% x̄: 1.59% x̃: 0.67%
95% mean confidence interval for cycles value: -10.85 -4.97
95% mean confidence interval for cycles %-change: -0.45% -0.31%
Cycles are helped.

All Gen6 and earlier platforms had simlar results. (Sandy Bridge shown)
total instructions in shared programs: 10863159 -> 10861462 (-0.02%)
instructions in affected programs: 157839 -> 156142 (-1.08%)
helped: 715
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 2.37 x̃: 2
helped stats (rel) min: 0.23% max: 4.33% x̄: 1.07% x̃: 0.85%
95% mean confidence interval for instructions value: -2.53 -2.21
95% mean confidence interval for instructions %-change: -1.13% -1.02%
Instructions are helped.

total cycles in shared programs: 153957782 -> 153948778 (<.01%)
cycles in affected programs: 3171648 -> 3162644 (-0.28%)
helped: 696
HURT: 62
helped stats (abs) min: 2 max: 390 x̄: 15.72 x̃: 4
helped stats (rel) min: 0.02% max: 10.57% x̄: 0.57% x̃: 0.12%
HURT stats (abs)   min: 2 max: 300 x̄: 31.29 x̃: 2
HURT stats (rel)   min: 0.11% max: 7.23% x̄: 0.83% x̃: 0.34%
95% mean confidence interval for cycles value: -15.65 -8.11
95% mean confidence interval for cycles %-change: -0.56% -0.36%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-28 18:13:18 -07:00
Ian Romanick
379cf3bb87 intel/vec4: Try immediate sources for dot products too
No changes on any Gen8 or later platform because those platforms do not
use the vec4 backend.

All Haswell and earlier platforms has similar results. (Haswell shown)
total instructions in shared programs: 13484467 -> 13484431 (<.01%)
instructions in affected programs: 8540 -> 8504 (-0.42%)
helped: 33
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.09 x̃: 1
helped stats (rel) min: 0.31% max: 1.53% x̄: 0.49% x̃: 0.35%
95% mean confidence interval for instructions value: -1.19 -0.99
95% mean confidence interval for instructions %-change: -0.60% -0.38%
Instructions are helped.

total cycles in shared programs: 376420572 -> 376420286 (<.01%)
cycles in affected programs: 56260 -> 55974 (-0.51%)
helped: 26
HURT: 5
helped stats (abs) min: 2 max: 204 x̄: 11.85 x̃: 2
helped stats (rel) min: 0.11% max: 3.08% x̄: 0.39% x̃: 0.13%
HURT stats (abs)   min: 2 max: 6 x̄: 4.40 x̃: 6
HURT stats (rel)   min: 0.03% max: 0.35% x̄: 0.24% x̃: 0.35%
95% mean confidence interval for cycles value: -22.91 4.45
95% mean confidence interval for cycles %-change: -0.56% -0.02%
Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-28 17:16:16 -07:00
Ian Romanick
eeebeb211f intel/vec4: Try emitting non-scalar immediates
Sometimes an instruction has a vector as a source, but all of the
components have the same value.  For example,

    vec3 32 ssa_16 = load_const (1.0, 1.0, 1.0)
    ...
    vec3 32 ssa_82 = fadd ssa_16, -ssa_81.xyz

No changes on any Gen8 or later platform because those platforms do not
use the vec4 backend.

Haswell
total instructions in shared programs: 13487811 -> 13484467 (-0.02%)
instructions in affected programs: 421981 -> 418637 (-0.79%)
helped: 1859
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 1.80 x̃: 1
helped stats (rel) min: 0.04% max: 9.80% x̄: 1.04% x̃: 0.84%
95% mean confidence interval for instructions value: -1.85 -1.74
95% mean confidence interval for instructions %-change: -1.07% -1.00%
Instructions are helped.

total cycles in shared programs: 376423252 -> 376420572 (<.01%)
cycles in affected programs: 14800970 -> 14798290 (-0.02%)
helped: 1519
HURT: 329
helped stats (abs) min: 2 max: 462 x̄: 10.59 x̃: 4
helped stats (rel) min: 0.03% max: 16.73% x̄: 0.79% x̃: 0.36%
HURT stats (abs)   min: 2 max: 598 x̄: 40.74 x̃: 16
HURT stats (rel)   min: <.01% max: 10.32% x̄: 2.56% x̃: 0.98%
95% mean confidence interval for cycles value: -3.53 0.63
95% mean confidence interval for cycles %-change: -0.30% -0.09%
Inconclusive result (value mean confidence interval includes 0).

total fills in shared programs: 34601 -> 34592 (-0.03%)
fills in affected programs: 91 -> 82 (-9.89%)
helped: 9
HURT: 0

Ivy Bridge
total instructions in shared programs: 12053565 -> 12051626 (-0.02%)
instructions in affected programs: 298103 -> 296164 (-0.65%)
helped: 1228
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 1.58 x̃: 1
helped stats (rel) min: 0.04% max: 3.57% x̄: 0.91% x̃: 0.81%
95% mean confidence interval for instructions value: -1.63 -1.53
95% mean confidence interval for instructions %-change: -0.95% -0.88%
Instructions are helped.

total cycles in shared programs: 180322270 -> 180319922 (<.01%)
cycles in affected programs: 14123840 -> 14121492 (-0.02%)
helped: 1036
HURT: 195
helped stats (abs) min: 2 max: 462 x̄: 11.93 x̃: 2
helped stats (rel) min: 0.03% max: 14.05% x̄: 0.82% x̃: 0.35%
HURT stats (abs)   min: 2 max: 598 x̄: 51.33 x̃: 16
HURT stats (rel)   min: <.01% max: 9.68% x̄: 3.02% x̃: 0.72%
95% mean confidence interval for cycles value: -4.92 1.10
95% mean confidence interval for cycles %-change: -0.35% -0.07%
Inconclusive result (value mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10864286 -> 10863189 (-0.01%)
instructions in affected programs: 159722 -> 158625 (-0.69%)
helped: 724
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.52 x̃: 1
helped stats (rel) min: 0.10% max: 2.91% x̄: 0.79% x̃: 0.62%
95% mean confidence interval for instructions value: -1.58 -1.46
95% mean confidence interval for instructions %-change: -0.82% -0.75%
Instructions are helped.

total cycles in shared programs: 153967938 -> 153957926 (<.01%)
cycles in affected programs: 1923186 -> 1913174 (-0.52%)
helped: 654
HURT: 56
helped stats (abs) min: 2 max: 170 x̄: 20.00 x̃: 4
helped stats (rel) min: 0.03% max: 11.82% x̄: 0.89% x̃: 0.18%
HURT stats (abs)   min: 2 max: 390 x̄: 54.75 x̃: 32
HURT stats (rel)   min: 0.05% max: 6.92% x̄: 3.09% x̃: 2.92%
95% mean confidence interval for cycles value: -17.42 -10.78
95% mean confidence interval for cycles %-change: -0.76% -0.40%
Cycles are helped.

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8142677 -> 8141721 (-0.01%)
instructions in affected programs: 139511 -> 138555 (-0.69%)
helped: 588
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 1.63 x̃: 1
helped stats (rel) min: 0.21% max: 4.39% x̄: 0.84% x̃: 0.46%
95% mean confidence interval for instructions value: -1.70 -1.55
95% mean confidence interval for instructions %-change: -0.89% -0.78%
Instructions are helped.

total cycles in shared programs: 188549394 -> 188547676 (<.01%)
cycles in affected programs: 3171960 -> 3170242 (-0.05%)
helped: 527
HURT: 0
helped stats (abs) min: 2 max: 18 x̄: 3.26 x̃: 2
helped stats (rel) min: <.01% max: 0.80% x̄: 0.08% x̃: 0.06%
95% mean confidence interval for cycles value: -3.49 -3.03
95% mean confidence interval for cycles %-change: -0.09% -0.07%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-06-28 17:16:06 -07:00
Anuj Phogat
387e43b52f Revert "anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch"
SLICE_COMMON_CHICKEN3 is a privileged register not accesible from userspace.
This patch silences a simulator warning about it.

We don't need to add this workaround in linux kernel as the WA description
says it's fixed on latest stepping.

This reverts commit 2be60e0c73.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-28 14:02:13 -07:00
Nanley Chery
02f6995d76 isl: Don't align phys_level0_sa by block dimension
Aligning phys_level0_sa by the compression block dimension prior to
mipmap layout causes the layout of compressed surfaces to differ from
the sampler's expectations in certain cases. The hardware docs agree:

From the BDW PRM, Vol. 5, Compressed Mipmap Layout,

   The compressed mipmaps are stored in a similar fashion to
   uncompressed mipmaps [...]

   The following exceptions apply to the layout of compressed (vs.
   uncompressed) mipmaps:
      * [...]
      * The dimensions of the mip maps are first determined by applying
	the sizing algorithm presented in Non-Power-of-Two Mipmaps
	above. Then, if necessary, they are padded out to compression
	block boundaries.

The last bullet indicates that alignment should not be done for
calculating a miplevel's dimensions, but rather for determining miplevel
placement/padding. Comply with this text by removing the extra
alignment.

Fixes some fbo-generatemipmap-formats piglit failures on all tested
platforms (SNB-KBL).

v2:
- Note fixed platforms.
- Update some consumers via a helper function.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-27 23:38:38 +00:00
Nanley Chery
fb1350c76f intel: Add and use helpers for level0 extent
Prepare for a bug fix by adding and using helpers which convert
isl_surf::logical_level0_px and isl_surf::phys_level0_sa to units of
surface elements.

v2:
- Update iris (Ken).
- Update anv.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-27 23:38:37 +00:00
Lionel Landwerlin
836225840c intel/compiler: fix derivative on y axis implementation
This rewrites the ddy in EXECUTE_4 mode with a loop to make it more
obvious what is going on and also sets the group each of the 4 threads
in the groups are supposed to execute.

Fixes the following CTS tests :

   dEQP-VK.glsl.derivate.dfdyfine.dynamic_*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Co-Authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 2134ea3800 ("intel/compiler/fs: Implement ddy without using align16 for Gen11+")
2019-06-27 18:14:58 +00:00
Kenneth Graunke
748e5dac72 intel/blorp: Disable sampler state prefetching on Gen11
Sampler state prefetching is broken on Gen11, and WA_160668216 says
to disable it.  Apparently sampler state prefetching also has basically
zero impact on performance, so we don't need to worry there.

i965, anv, and iris already handle this correctly, but we missed BLORP.
Ideally the kernel should globally disable this by writing SARCHKMD, at
which point we wouldn't have to worry about it.  But let's be defensive
and handle it ourselves too.

v2: separate out from BTP workaround in case we change that eventually

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> [v1]
2019-06-25 13:29:31 -07:00
Jason Ekstrand
0a364a4a74 anv/descriptor_set: Only write texture swizzles if we have an image view
When immutable samplers are set we call write_image_view with a NULL
image view.  This causes issues on IVB where we have to fake texture
swizzling.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110999
Fixes: d2aa65eb18 "anv: Emulate texture swizzle in the shader when..."
2019-06-25 19:43:25 +00:00
Tapani Pälli
7a6e5a4bc3 intel/compiler: silence a warning of using different enum type
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-25 10:09:22 +03:00
Nataraj Deshpande
d94fca5420 anv: Add HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED in vk_format
When HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED is used, then the platform
gralloc module will select a format based on the usage flags provided by
the camera device and the other endpoint of the stream.

The patch fixes crash in vulkan when the test is run with camera stream
set to HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED.

Test: android.graphics.cts.CameraVulkanGpuTest#testCameraImportAndRendering
on chromebook with camera HAL3.

v2: use AHARDWAREBUFFER_FORMAT_IMPLEMENTATION_DEFINED and take
    AHARDWAREBUFFER_USAGE_CAMERA_MASK in to account (Gurchetan)

Fixes: f1654fa7e3 "anv/android: support creating images from external format"
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-24 08:28:18 +03:00
Jason Ekstrand
1a9e5b9094 anv: Implement "pop-free" clipping
This is the preferred clipping mode since it doesn't mean your points
disappear the moment part of the point crosses over the edge of the
viewport and that lines have weird endpoints at viewport edges.  We've
just never bothered to hook it up until now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-21 14:18:59 +00:00
Jason Ekstrand
4a757d6c31 anv: Enable the guardband clip test
In workloads where there is a lot of geometry drawn that crosses over
the edge of the viewport, this should substantially improve clipper
performance.  Not really sure why it's taken 3 years to turn it on but
we never got around to it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-21 14:18:59 +00:00
Jason Ekstrand
13f0c278c5 i965,iris: Move guardband calculations to a common location
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-21 14:18:59 +00:00
Kenneth Graunke
d4a4384b31 iris: Implement INTEL_DEBUG=pc for pipe control logging.
This prints a log of every PIPE_CONTROL flush we emit, noting which bits
were set, and also the reason for the flush.  That way we can see which
are caused by hardware workarounds, render-to-texture, buffer updates,
and so on.  It should make it easier to determine whether we're doing
too many flushes and why.
2019-06-20 13:32:15 -05:00
Lionel Landwerlin
4a61be24fe anv: only resort to sync fds internally with no syncobj support
We can rely on only one kind of synchronization object (drm-syncobj)
when it is available. This reduces the number of file descriptors we
use in our implementation.

This will be required later for timeline semaphores implementation, at
this point we won't ever want to use anything else but syncobjs.

v2: Only use has_syncobj for semaphores (Jason)

v3: Only has_syncobj in assert on semaphores in QueueSubmit (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-20 14:59:51 +00:00
Eric Engestrom
a9e09d56a9 isl: tag unreachable path as such
GCC should be able to figure out that all the possible enum values are
exhausted in the switch() and all the branches return from the function,
but apparently it doesn't, so let's tell the compiler explicitly.

This gets rid of the following warnings in GCC 9:

    [1/24] Compiling C object 'src/intel/isl/60d23f8@@isl@sta/isl.c.o'.
    ../src/intel/isl/isl.c: In function ‘isl_surf_init_s’:
    ../src/intel/isl/isl.c:1569:10: warning: ‘array_pitch_el_rows’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     1569 |    *surf = (struct isl_surf) {
          |    ~~~~~~^~~~~~~~~~~~~~~~~~~~~
     1570 |       .dim = info->dim,
          |       ~~~~~~~~~~~~~~~~~
     1571 |       .dim_layout = dim_layout,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~
     1572 |       .msaa_layout = msaa_layout,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1573 |       .tiling = tiling,
          |       ~~~~~~~~~~~~~~~~~
     1574 |       .format = info->format,
          |       ~~~~~~~~~~~~~~~~~~~~~~~
     1575 |
          |
     1576 |       .levels = info->levels,
          |       ~~~~~~~~~~~~~~~~~~~~~~~
     1577 |       .samples = info->samples,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~
     1578 |
          |
     1579 |       .image_alignment_el = image_align_el,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1580 |       .logical_level0_px = logical_level0_px,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1581 |       .phys_level0_sa = phys_level0_sa,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1582 |
          |
     1583 |       .size_B = size_B,
          |       ~~~~~~~~~~~~~~~~~
     1584 |       .alignment_B = base_alignment_B,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1585 |       .row_pitch_B = row_pitch_B,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1586 |       .array_pitch_el_rows = array_pitch_el_rows,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1587 |       .array_pitch_span = array_pitch_span,
          |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1588 |
          |
     1589 |       .usage = info->usage,
          |       ~~~~~~~~~~~~~~~~~~~~~
     1590 |    };
          |    ~
    ../src/intel/isl/isl.c:1488:24: warning: ‘*((void *)&phys_total_el+4)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     1488 |    struct isl_extent2d phys_total_el;
          |                        ^~~~~~~~~~~~~
    ../src/intel/isl/isl.c:1335:38: warning: ‘phys_total_el’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     1335 |       isl_align_div(phys_total_el->w * tile_el_scale,
          |                     ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
    ../src/intel/isl/isl.c:1488:24: note: ‘phys_total_el’ was declared here
     1488 |    struct isl_extent2d phys_total_el;
          |                        ^~~~~~~~~~~~~

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-20 12:05:14 +00:00
Bas Nieuwenhuizen
755c633b8d anv: Fix vulkan build in meson.
Apparently the android part was never ported to meson.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-19 23:27:46 +00:00
Jason Ekstrand
ef323d02d8 anv/image: Set different usage flags for shadow surfaces
For the block BLOCK_TEXEL_VIEW_COMPATIBLE case, this didn't matter
because the flags were already more-or-less what we wanted.  However,
for gen7 stencil shadow images, it still had ISL_SURF_USAGE_STENCIL_BIT
so we were getting W-tiled which isn't what we want for the shadow.  By
passing just ISL_SURF_USAGE_TEXTURE_BIT (and CUBE if we care), we now
get something that's actually texturable.

Fixes: f3ea0cf828 "anv: Add stencil texturing support for gen7"
2019-06-19 22:21:46 +00:00
Jason Ekstrand
215f9f83f5 anv: Flush caches in anv_image_copy_to_shadow
Copies to a shadow image happen during a VkCmdPipelineBarrier or at
subpass transitions.  We could potentially be a bit more conservative
but these transitions shouldn't happen often and it's better to have our
bases covered.

Fixes: f3ea0cf828 "anv: Add stencil texturing support for gen7"
2019-06-19 22:21:46 +00:00
Kenneth Graunke
9c19d07b1c anv: Fix wrong printf formatter
%lu is for unsigned long, %zu is for size_t.  Just cast the data.
2019-06-19 11:57:01 -05:00
Lionel Landwerlin
bc62673dce anv: write spirv-nir logs back to the application
Using the existing VK_EXT_debug_report extension.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-19 15:45:52 +03:00
Jason Ekstrand
58cb865313 anv: Make border colors the right size and alignment on HSW
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-18 16:07:08 +00:00
Jason Ekstrand
9672b7044c anv: Set STATE_BASE_ADDRESS upper bounds on gen7
This should fix floating-point border color on all gen7 HW.  Integer is
still thoroughly busted on gen7 because it doesn't exist on IVB and it's
crazy on HSW.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-17 18:53:07 -05:00