fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 13:58:05 +02:00

Author	SHA1	Message	Date
Ian Romanick	47c2aa5b48	intel/vec4: Reswizzle VF immediates too Previously, an instruction like mul(8) vgrf29.xy:F, vgrf25.yxxx:F, [-1F, 1F, 0F, 0F] would get rewritten as mul(8) vgrf0.yz:F, vgrf25.yyxx:F, [-1F, 1F, 0F, 0F] The latter does not produce the correct result. The VF immediate in the second should be either [-1F, -1F, 1F, 1F] or [0F, -1F, 1F, 0F]. This commit produces the former. Fixes: `1ee1d8ab46` ("i965/vec4: Reswizzle sources when necessary.") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-08 11:30:10 -07:00
Dongwon Kim	f734e2a042	anv: disable repacking for compression for applicable gen set bit15 (Disable Repacking for Compression) of CACHE_MODE_0 register if the gen attribute, 'disable_ccs_repack' is set. Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-07-08 10:54:38 -07:00
Dongwon Kim	eb6d067e68	intel: add disable_ccs_repack to gen_device_info add a new attribute, 'disable_ccs_repack' to gen_device info, which indicates whether repacking of components in certain pixel formats before compression needs to be disabled to keep the compatibility with decompression capability of display controller (gen11+) Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-07-08 10:54:38 -07:00
Dongwon Kim	e6ac6d3224	intel/genxml: correct bit fields in CACHE_MODE_0 reg for gen11 correct bit fields information of CACHE_MODE_0 reg in current gen11.xml Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-07-08 10:54:37 -07:00
Caio Marcelo de Oliveira Filho	9c7adaeb5f	anv: Advertise VK_EXT_shader_demote_to_helper_invocation Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-08 08:57:25 -07:00
Caio Marcelo de Oliveira Filho	45f5db5a84	intel/fs: Implement "demote to helper invocation" The "demote" intrinsic works like "discard" but don't change the control flow, allowing derivative operations to work. This is the semantics of D3D discard. The "is_helper_invocation" intrinsic will return true for helper invocations -- both the ones that started as helpers and the ones that where demoted. This is needed to avoid changing the behavior of gl_HelperInvocation which is an input (so not expected to change during shader execution). v2: Emit the discard jump and comment why it is safe. (Jason) Rework the is_helper_invocation() that was stomping f0.1. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-08 08:57:25 -07:00
Connor Abbott	6b28808b22	intel/nir: Extract add_const_offset_to_base Pretty much every driver using nir_lower_io_to_temporaries followed by nir_lower_io is going to want this. In particular, radv and radeonsi in the next commits. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Connor Abbott	27f0c3c15e	radv: Make FragCoord a sysval load_fragcoord is already handled in common code for radeonsi, so we don't need to do anything to handle it. However, there were some passes creating NIR with the varying, so we switch them over to the sysval. In the case of nir_lower_input_attachments which is used by both radv and anv, we add handling for both until intel switches to using a sysval. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Daniel Schürmann	c31f470066	anv,nir: Move lower_input_attachments pass from ANV to NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:02:50 +02:00
Chia-I Wu	5824130389	anv: fix VkExternalBufferProperties for host allocation It was reported as unsupported previously. It should be importable and is compatible with itself. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Fixes: `69cc6272fb` ("anv: Implement VK_EXT_external_memory_host") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-07 13:31:58 -07:00
Chia-I Wu	f3c7a02a62	anv: fix VkExternalBufferProperties for unsupported handles compatibleHandleTypes must include the queried handle type. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-07 13:31:58 -07:00
Lionel Landwerlin	5493ec3c19	anv: manually add KHR_display to the list of platforms Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `38305e6c94` ("anv: replace hard-coded platform list with vk.xml parse") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111078 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-07 15:34:09 +03:00
Juan A. Suarez Romero	e06bc0b166	intel: fix wrong format usage Do not use the view format when filling the surface state. Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.* Fixes: `fb1350c76f` ("intel: Add and use helpers for level0 extent") Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-03 10:14:54 +02:00
Jason Ekstrand	e708261cb7	anv: Advertise a more accurate minTexelBufferOffsetAlignment Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-02 22:28:44 +00:00
Jason Ekstrand	0bc657f2db	anv: Implement VK_EXT_texel_buffer_alignment Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-02 22:28:44 +00:00
Jason Ekstrand	fa869f45c8	intel/fs: Use nir_lower_interpolation on gen11+ On gen11, the removed the PLN instruction so we have to emit a pile of MAD to emulate it. We may as well do that in NIR so we can optimize and later schedule it. Shader-db results on Ice Lake: total instructions in shared programs: 17145644 -> 16556440 (-3.44%) instructions in affected programs: 11507454 -> 10918250 (-5.12%) helped: 35763 HURT: 42085 helped stats (abs) min: 1 max: 140 x̄: 19.09 x̃: 18 helped stats (rel) min: 0.04% max: 37.93% x̄: 15.40% x̃: 14.49% HURT stats (abs) min: 1 max: 248 x̄: 2.22 x̃: 2 HURT stats (rel) min: 0.05% max: 50.00% x̄: 5.00% x̃: 2.47% 95% mean confidence interval for instructions value: -7.67 -7.47 95% mean confidence interval for instructions %-change: -4.46% -4.29% Instructions are helped. total loops in shared programs: 4370 -> 4370 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 360624645 -> 368220857 (2.11%) cycles in affected programs: 269631244 -> 277227456 (2.82%) helped: 15583 HURT: 65874 helped stats (abs) min: 1 max: 28561 x̄: 78.45 x̃: 32 helped stats (rel) min: <.01% max: 67.81% x̄: 5.38% x̃: 2.44% HURT stats (abs) min: 1 max: 238638 x̄: 133.87 x̃: 20 HURT stats (rel) min: <.01% max: 306.25% x̄: 5.81% x̃: 3.97% 95% mean confidence interval for cycles value: 67.42 119.09 95% mean confidence interval for cycles %-change: 3.61% 3.73% Cycles are HURT. total spills in shared programs: 8943 -> 8981 (0.42%) spills in affected programs: 1925 -> 1963 (1.97%) helped: 44 HURT: 14 total fills in shared programs: 21815 -> 21925 (0.50%) fills in affected programs: 3511 -> 3621 (3.13%) helped: 41 HURT: 18 LOST: 70 GAINED: 14 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-02 16:15:25 +00:00
Jason Ekstrand	2b79a9e5a5	intel/fs: Implement nir_intrinsic_load_fs_input_interp_deltas Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-02 16:15:25 +00:00
Jason Ekstrand	8e7d066682	intel/fs: Actually implement the load_barycentric intrinsics If they never get used, dead code should clean them up. Also, we rework the at_offset and at_sample intrinsics so they return a proper vec2 instead of returning things in PLN layout. Fortunately, copy-prop is pretty good at cleaning this up and it doesn't result in any actual extra MOVs. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-02 16:15:25 +00:00
Sagar Ghuge	d5f63990b4	intel/tools: Add assembler unit tests for ROL/ROR instructions Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Sagar Ghuge	e9c35dd7cc	intel/tools: Add ROL/ROR support in assembler Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Sagar Ghuge	1e92e83856	intel/compiler: Emit ROR and ROL instruction v2: Reorder patch (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Sagar Ghuge	83fdec0f0d	intel/compiler: Enable the emission of ROR/ROL instructions v2: 1) Drop changes for vec4 backend as on Gen11+ we don't support align16 mode (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Eric Engestrom	5f9764bc0b	anv: fix indentation Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-29 22:41:06 +01:00
Eric Engestrom	42eb85a9d8	anv: fix typo Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-29 22:41:06 +01:00
Eric Engestrom	38305e6c94	anv: replace hard-coded platform list with vk.xml parse Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-29 22:38:54 +01:00
Lionel Landwerlin	5847de6e9a	intel/compiler: don't use byte operands for src1 on ICL The simulator complains about using byte operands, we also have documentation telling us. Note that add operations on bytes seems to work fine on HW (like ADD). Using dwords operands with CMP & SEL fixes the following tests : dEQP-VK.spirv_assembly.type.vec.i8. v2: Drop the GLK changes (Matt) Add validator tests (Matt) v3: Drop GLK ref (Matt) Don't mix float/integer in MAD (Matt) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1) Reviewed-by: Matt Turner <mattst88@gmail.com> BSpec: 3017 Cc: <mesa-stable@lists.freedesktop.org>	2019-06-29 12:56:09 +00:00
Ian Romanick	b04beaf41d	intel/vec4: Try both sources as candidates for being immediates For some reason, when I first wrote try_immediate_source, I thought the sources had already been ordered so that the immediate value was the second source. That's rubbish. The generator assumes neither source is immediate, and it relies on later copy/constant propagation passes to do the reordering. For this reason, the changes to try_immediate_source have to go to some efforts to reorder the operands and tell the caller when it reordered them. The generator for comparison instructions uses this to determine when the comparison needs to change (e.g., from GT to LT). No changes on any Gen8 or later platform because those platforms do not use the vec4 backend. Haswell total instructions in shared programs: 13484431 -> 13480500 (-0.03%) instructions in affected programs: 441138 -> 437207 (-0.89%) helped: 1883 HURT: 0 helped stats (abs) min: 1 max: 49 x̄: 2.09 x̃: 1 helped stats (rel) min: 0.07% max: 8.91% x̄: 1.10% x̃: 0.90% 95% mean confidence interval for instructions value: -2.19 -1.98 95% mean confidence interval for instructions %-change: -1.14% -1.06% Instructions are helped. total cycles in shared programs: 376420286 -> 376406400 (<.01%) cycles in affected programs: 15995668 -> 15981782 (-0.09%) helped: 1692 HURT: 219 helped stats (abs) min: 2 max: 764 x̄: 13.78 x̃: 4 helped stats (rel) min: <.01% max: 9.69% x̄: 0.69% x̃: 0.35% HURT stats (abs) min: 2 max: 516 x̄: 43.09 x̃: 22 HURT stats (rel) min: 0.02% max: 12.09% x̄: 2.30% x̃: 1.13% 95% mean confidence interval for cycles value: -9.70 -4.83 95% mean confidence interval for cycles %-change: -0.42% -0.28% Cycles are helped. total spills in shared programs: 23166 -> 23158 (-0.03%) spills in affected programs: 66 -> 58 (-12.12%) helped: 2 HURT: 0 total fills in shared programs: 34592 -> 34580 (-0.03%) fills in affected programs: 75 -> 63 (-16.00%) helped: 2 HURT: 0 Ivy Bridge total instructions in shared programs: 12051590 -> 12048513 (-0.03%) instructions in affected programs: 355911 -> 352834 (-0.86%) helped: 1481 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 2.08 x̃: 1 helped stats (rel) min: 0.07% max: 4.92% x̄: 1.08% x̃: 0.90% 95% mean confidence interval for instructions value: -2.17 -1.98 95% mean confidence interval for instructions %-change: -1.12% -1.04% Instructions are helped. total cycles in shared programs: 180319624 -> 180307642 (<.01%) cycles in affected programs: 15591028 -> 15579046 (-0.08%) helped: 1340 HURT: 174 helped stats (abs) min: 2 max: 764 x̄: 14.19 x̃: 2 helped stats (rel) min: <.01% max: 8.68% x̄: 0.64% x̃: 0.32% HURT stats (abs) min: 2 max: 518 x̄: 40.41 x̃: 14 HURT stats (rel) min: 0.02% max: 8.37% x̄: 1.59% x̃: 0.67% 95% mean confidence interval for cycles value: -10.85 -4.97 95% mean confidence interval for cycles %-change: -0.45% -0.31% Cycles are helped. All Gen6 and earlier platforms had simlar results. (Sandy Bridge shown) total instructions in shared programs: 10863159 -> 10861462 (-0.02%) instructions in affected programs: 157839 -> 156142 (-1.08%) helped: 715 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 2.37 x̃: 2 helped stats (rel) min: 0.23% max: 4.33% x̄: 1.07% x̃: 0.85% 95% mean confidence interval for instructions value: -2.53 -2.21 95% mean confidence interval for instructions %-change: -1.13% -1.02% Instructions are helped. total cycles in shared programs: 153957782 -> 153948778 (<.01%) cycles in affected programs: 3171648 -> 3162644 (-0.28%) helped: 696 HURT: 62 helped stats (abs) min: 2 max: 390 x̄: 15.72 x̃: 4 helped stats (rel) min: 0.02% max: 10.57% x̄: 0.57% x̃: 0.12% HURT stats (abs) min: 2 max: 300 x̄: 31.29 x̃: 2 HURT stats (rel) min: 0.11% max: 7.23% x̄: 0.83% x̃: 0.34% 95% mean confidence interval for cycles value: -15.65 -8.11 95% mean confidence interval for cycles %-change: -0.56% -0.36% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-28 18:13:18 -07:00
Ian Romanick	379cf3bb87	intel/vec4: Try immediate sources for dot products too No changes on any Gen8 or later platform because those platforms do not use the vec4 backend. All Haswell and earlier platforms has similar results. (Haswell shown) total instructions in shared programs: 13484467 -> 13484431 (<.01%) instructions in affected programs: 8540 -> 8504 (-0.42%) helped: 33 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.09 x̃: 1 helped stats (rel) min: 0.31% max: 1.53% x̄: 0.49% x̃: 0.35% 95% mean confidence interval for instructions value: -1.19 -0.99 95% mean confidence interval for instructions %-change: -0.60% -0.38% Instructions are helped. total cycles in shared programs: 376420572 -> 376420286 (<.01%) cycles in affected programs: 56260 -> 55974 (-0.51%) helped: 26 HURT: 5 helped stats (abs) min: 2 max: 204 x̄: 11.85 x̃: 2 helped stats (rel) min: 0.11% max: 3.08% x̄: 0.39% x̃: 0.13% HURT stats (abs) min: 2 max: 6 x̄: 4.40 x̃: 6 HURT stats (rel) min: 0.03% max: 0.35% x̄: 0.24% x̃: 0.35% 95% mean confidence interval for cycles value: -22.91 4.45 95% mean confidence interval for cycles %-change: -0.56% -0.02% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-28 17:16:16 -07:00
Ian Romanick	eeebeb211f	intel/vec4: Try emitting non-scalar immediates Sometimes an instruction has a vector as a source, but all of the components have the same value. For example, vec3 32 ssa_16 = load_const (1.0, 1.0, 1.0) ... vec3 32 ssa_82 = fadd ssa_16, -ssa_81.xyz No changes on any Gen8 or later platform because those platforms do not use the vec4 backend. Haswell total instructions in shared programs: 13487811 -> 13484467 (-0.02%) instructions in affected programs: 421981 -> 418637 (-0.79%) helped: 1859 HURT: 0 helped stats (abs) min: 1 max: 15 x̄: 1.80 x̃: 1 helped stats (rel) min: 0.04% max: 9.80% x̄: 1.04% x̃: 0.84% 95% mean confidence interval for instructions value: -1.85 -1.74 95% mean confidence interval for instructions %-change: -1.07% -1.00% Instructions are helped. total cycles in shared programs: 376423252 -> 376420572 (<.01%) cycles in affected programs: 14800970 -> 14798290 (-0.02%) helped: 1519 HURT: 329 helped stats (abs) min: 2 max: 462 x̄: 10.59 x̃: 4 helped stats (rel) min: 0.03% max: 16.73% x̄: 0.79% x̃: 0.36% HURT stats (abs) min: 2 max: 598 x̄: 40.74 x̃: 16 HURT stats (rel) min: <.01% max: 10.32% x̄: 2.56% x̃: 0.98% 95% mean confidence interval for cycles value: -3.53 0.63 95% mean confidence interval for cycles %-change: -0.30% -0.09% Inconclusive result (value mean confidence interval includes 0). total fills in shared programs: 34601 -> 34592 (-0.03%) fills in affected programs: 91 -> 82 (-9.89%) helped: 9 HURT: 0 Ivy Bridge total instructions in shared programs: 12053565 -> 12051626 (-0.02%) instructions in affected programs: 298103 -> 296164 (-0.65%) helped: 1228 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 1.58 x̃: 1 helped stats (rel) min: 0.04% max: 3.57% x̄: 0.91% x̃: 0.81% 95% mean confidence interval for instructions value: -1.63 -1.53 95% mean confidence interval for instructions %-change: -0.95% -0.88% Instructions are helped. total cycles in shared programs: 180322270 -> 180319922 (<.01%) cycles in affected programs: 14123840 -> 14121492 (-0.02%) helped: 1036 HURT: 195 helped stats (abs) min: 2 max: 462 x̄: 11.93 x̃: 2 helped stats (rel) min: 0.03% max: 14.05% x̄: 0.82% x̃: 0.35% HURT stats (abs) min: 2 max: 598 x̄: 51.33 x̃: 16 HURT stats (rel) min: <.01% max: 9.68% x̄: 3.02% x̃: 0.72% 95% mean confidence interval for cycles value: -4.92 1.10 95% mean confidence interval for cycles %-change: -0.35% -0.07% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10864286 -> 10863189 (-0.01%) instructions in affected programs: 159722 -> 158625 (-0.69%) helped: 724 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.52 x̃: 1 helped stats (rel) min: 0.10% max: 2.91% x̄: 0.79% x̃: 0.62% 95% mean confidence interval for instructions value: -1.58 -1.46 95% mean confidence interval for instructions %-change: -0.82% -0.75% Instructions are helped. total cycles in shared programs: 153967938 -> 153957926 (<.01%) cycles in affected programs: 1923186 -> 1913174 (-0.52%) helped: 654 HURT: 56 helped stats (abs) min: 2 max: 170 x̄: 20.00 x̃: 4 helped stats (rel) min: 0.03% max: 11.82% x̄: 0.89% x̃: 0.18% HURT stats (abs) min: 2 max: 390 x̄: 54.75 x̃: 32 HURT stats (rel) min: 0.05% max: 6.92% x̄: 3.09% x̃: 2.92% 95% mean confidence interval for cycles value: -17.42 -10.78 95% mean confidence interval for cycles %-change: -0.76% -0.40% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8142677 -> 8141721 (-0.01%) instructions in affected programs: 139511 -> 138555 (-0.69%) helped: 588 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 1.63 x̃: 1 helped stats (rel) min: 0.21% max: 4.39% x̄: 0.84% x̃: 0.46% 95% mean confidence interval for instructions value: -1.70 -1.55 95% mean confidence interval for instructions %-change: -0.89% -0.78% Instructions are helped. total cycles in shared programs: 188549394 -> 188547676 (<.01%) cycles in affected programs: 3171960 -> 3170242 (-0.05%) helped: 527 HURT: 0 helped stats (abs) min: 2 max: 18 x̄: 3.26 x̃: 2 helped stats (rel) min: <.01% max: 0.80% x̄: 0.08% x̃: 0.06% 95% mean confidence interval for cycles value: -3.49 -3.03 95% mean confidence interval for cycles %-change: -0.09% -0.07% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-28 17:16:06 -07:00
Anuj Phogat	387e43b52f	Revert "anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch" SLICE_COMMON_CHICKEN3 is a privileged register not accesible from userspace. This patch silences a simulator warning about it. We don't need to add this workaround in linux kernel as the WA description says it's fixed on latest stepping. This reverts commit `2be60e0c73`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-28 14:02:13 -07:00
Nanley Chery	02f6995d76	isl: Don't align phys_level0_sa by block dimension Aligning phys_level0_sa by the compression block dimension prior to mipmap layout causes the layout of compressed surfaces to differ from the sampler's expectations in certain cases. The hardware docs agree: From the BDW PRM, Vol. 5, Compressed Mipmap Layout, The compressed mipmaps are stored in a similar fashion to uncompressed mipmaps [...] The following exceptions apply to the layout of compressed (vs. uncompressed) mipmaps: * [...] * The dimensions of the mip maps are first determined by applying the sizing algorithm presented in Non-Power-of-Two Mipmaps above. Then, if necessary, they are padded out to compression block boundaries. The last bullet indicates that alignment should not be done for calculating a miplevel's dimensions, but rather for determining miplevel placement/padding. Comply with this text by removing the extra alignment. Fixes some fbo-generatemipmap-formats piglit failures on all tested platforms (SNB-KBL). v2: - Note fixed platforms. - Update some consumers via a helper function. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-27 23:38:38 +00:00
Nanley Chery	fb1350c76f	intel: Add and use helpers for level0 extent Prepare for a bug fix by adding and using helpers which convert isl_surf::logical_level0_px and isl_surf::phys_level0_sa to units of surface elements. v2: - Update iris (Ken). - Update anv. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-27 23:38:37 +00:00
Lionel Landwerlin	836225840c	intel/compiler: fix derivative on y axis implementation This rewrites the ddy in EXECUTE_4 mode with a loop to make it more obvious what is going on and also sets the group each of the 4 threads in the groups are supposed to execute. Fixes the following CTS tests : dEQP-VK.glsl.derivate.dfdyfine.dynamic_* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Co-Authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `2134ea3800` ("intel/compiler/fs: Implement ddy without using align16 for Gen11+")	2019-06-27 18:14:58 +00:00
Kenneth Graunke	748e5dac72	intel/blorp: Disable sampler state prefetching on Gen11 Sampler state prefetching is broken on Gen11, and WA_160668216 says to disable it. Apparently sampler state prefetching also has basically zero impact on performance, so we don't need to worry there. i965, anv, and iris already handle this correctly, but we missed BLORP. Ideally the kernel should globally disable this by writing SARCHKMD, at which point we wouldn't have to worry about it. But let's be defensive and handle it ourselves too. v2: separate out from BTP workaround in case we change that eventually Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> [v1]	2019-06-25 13:29:31 -07:00
Jason Ekstrand	0a364a4a74	anv/descriptor_set: Only write texture swizzles if we have an image view When immutable samplers are set we call write_image_view with a NULL image view. This causes issues on IVB where we have to fake texture swizzling. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110999 Fixes: `d2aa65eb18` "anv: Emulate texture swizzle in the shader when..."	2019-06-25 19:43:25 +00:00
Tapani Pälli	7a6e5a4bc3	intel/compiler: silence a warning of using different enum type Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-25 10:09:22 +03:00
Nataraj Deshpande	d94fca5420	anv: Add HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED in vk_format When HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED is used, then the platform gralloc module will select a format based on the usage flags provided by the camera device and the other endpoint of the stream. The patch fixes crash in vulkan when the test is run with camera stream set to HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED. Test: android.graphics.cts.CameraVulkanGpuTest#testCameraImportAndRendering on chromebook with camera HAL3. v2: use AHARDWAREBUFFER_FORMAT_IMPLEMENTATION_DEFINED and take AHARDWAREBUFFER_USAGE_CAMERA_MASK in to account (Gurchetan) Fixes: `f1654fa7e3` "anv/android: support creating images from external format" Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-24 08:28:18 +03:00
Jason Ekstrand	1a9e5b9094	anv: Implement "pop-free" clipping This is the preferred clipping mode since it doesn't mean your points disappear the moment part of the point crosses over the edge of the viewport and that lines have weird endpoints at viewport edges. We've just never bothered to hook it up until now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-21 14:18:59 +00:00
Jason Ekstrand	4a757d6c31	anv: Enable the guardband clip test In workloads where there is a lot of geometry drawn that crosses over the edge of the viewport, this should substantially improve clipper performance. Not really sure why it's taken 3 years to turn it on but we never got around to it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-21 14:18:59 +00:00
Jason Ekstrand	13f0c278c5	i965,iris: Move guardband calculations to a common location Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-21 14:18:59 +00:00
Kenneth Graunke	d4a4384b31	iris: Implement INTEL_DEBUG=pc for pipe control logging. This prints a log of every PIPE_CONTROL flush we emit, noting which bits were set, and also the reason for the flush. That way we can see which are caused by hardware workarounds, render-to-texture, buffer updates, and so on. It should make it easier to determine whether we're doing too many flushes and why.	2019-06-20 13:32:15 -05:00
Lionel Landwerlin	4a61be24fe	anv: only resort to sync fds internally with no syncobj support We can rely on only one kind of synchronization object (drm-syncobj) when it is available. This reduces the number of file descriptors we use in our implementation. This will be required later for timeline semaphores implementation, at this point we won't ever want to use anything else but syncobjs. v2: Only use has_syncobj for semaphores (Jason) v3: Only has_syncobj in assert on semaphores in QueueSubmit (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-20 14:59:51 +00:00
Eric Engestrom	a9e09d56a9	isl: tag unreachable path as such GCC should be able to figure out that all the possible enum values are exhausted in the switch() and all the branches return from the function, but apparently it doesn't, so let's tell the compiler explicitly. This gets rid of the following warnings in GCC 9: [1/24] Compiling C object 'src/intel/isl/60d23f8@@isl@sta/isl.c.o'. ../src/intel/isl/isl.c: In function ‘isl_surf_init_s’: ../src/intel/isl/isl.c:1569:10: warning: ‘array_pitch_el_rows’ may be used uninitialized in this function [-Wmaybe-uninitialized] 1569 \| surf = (struct isl_surf) { \| ~~~~~~^~~~~~~~~~~~~~~~~~~~~ 1570 \| .dim = info->dim, \| ~~~~~~~~~~~~~~~~~ 1571 \| .dim_layout = dim_layout, \| ~~~~~~~~~~~~~~~~~~~~~~~~~ 1572 \| .msaa_layout = msaa_layout, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1573 \| .tiling = tiling, \| ~~~~~~~~~~~~~~~~~ 1574 \| .format = info->format, \| ~~~~~~~~~~~~~~~~~~~~~~~ 1575 \| \| 1576 \| .levels = info->levels, \| ~~~~~~~~~~~~~~~~~~~~~~~ 1577 \| .samples = info->samples, \| ~~~~~~~~~~~~~~~~~~~~~~~~~ 1578 \| \| 1579 \| .image_alignment_el = image_align_el, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1580 \| .logical_level0_px = logical_level0_px, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1581 \| .phys_level0_sa = phys_level0_sa, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1582 \| \| 1583 \| .size_B = size_B, \| ~~~~~~~~~~~~~~~~~ 1584 \| .alignment_B = base_alignment_B, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1585 \| .row_pitch_B = row_pitch_B, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1586 \| .array_pitch_el_rows = array_pitch_el_rows, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1587 \| .array_pitch_span = array_pitch_span, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1588 \| \| 1589 \| .usage = info->usage, \| ~~~~~~~~~~~~~~~~~~~~~ 1590 \| }; \| ~ ../src/intel/isl/isl.c:1488:24: warning: ‘((void )&phys_total_el+4)’ may be used uninitialized in this function [-Wmaybe-uninitialized] 1488 \| struct isl_extent2d phys_total_el; \| ^~~~~~~~~~~~~ ../src/intel/isl/isl.c:1335:38: warning: ‘phys_total_el’ may be used uninitialized in this function [-Wmaybe-uninitialized] 1335 \| isl_align_div(phys_total_el->w tile_el_scale, \| ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~ ../src/intel/isl/isl.c:1488:24: note: ‘phys_total_el’ was declared here 1488 \| struct isl_extent2d phys_total_el; \| ^~~~~~~~~~~~~ Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-20 12:05:14 +00:00
Bas Nieuwenhuizen	755c633b8d	anv: Fix vulkan build in meson. Apparently the android part was never ported to meson. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-19 23:27:46 +00:00
Jason Ekstrand	ef323d02d8	anv/image: Set different usage flags for shadow surfaces For the block BLOCK_TEXEL_VIEW_COMPATIBLE case, this didn't matter because the flags were already more-or-less what we wanted. However, for gen7 stencil shadow images, it still had ISL_SURF_USAGE_STENCIL_BIT so we were getting W-tiled which isn't what we want for the shadow. By passing just ISL_SURF_USAGE_TEXTURE_BIT (and CUBE if we care), we now get something that's actually texturable. Fixes: `f3ea0cf828` "anv: Add stencil texturing support for gen7"	2019-06-19 22:21:46 +00:00
Jason Ekstrand	215f9f83f5	anv: Flush caches in anv_image_copy_to_shadow Copies to a shadow image happen during a VkCmdPipelineBarrier or at subpass transitions. We could potentially be a bit more conservative but these transitions shouldn't happen often and it's better to have our bases covered. Fixes: `f3ea0cf828` "anv: Add stencil texturing support for gen7"	2019-06-19 22:21:46 +00:00
Kenneth Graunke	9c19d07b1c	anv: Fix wrong printf formatter %lu is for unsigned long, %zu is for size_t. Just cast the data.	2019-06-19 11:57:01 -05:00
Lionel Landwerlin	bc62673dce	anv: write spirv-nir logs back to the application Using the existing VK_EXT_debug_report extension. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-19 15:45:52 +03:00
Jason Ekstrand	58cb865313	anv: Make border colors the right size and alignment on HSW Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-18 16:07:08 +00:00
Jason Ekstrand	9672b7044c	anv: Set STATE_BASE_ADDRESS upper bounds on gen7 This should fix floating-point border color on all gen7 HW. Integer is still thoroughly busted on gen7 because it doesn't exist on IVB and it's crazy on HSW. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-17 18:53:07 -05:00

1 2 3 4 5 ...

4346 commits