Commit graph

5027 commits

Author SHA1 Message Date
Timur Kristóf
25775d346c aco: Only store TCS outputs to VMEM when they are read by TES.
Totals from affected shaders (GFX10):
Code Size: 10832 -> 10736 (-0.89 %) bytes

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
2020-04-24 17:58:57 +00:00
Timur Kristóf
b779d05d71 radv: Add inputs read by TES to radv_shader_info.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
2020-04-24 17:58:57 +00:00
Rhys Perry
ede1c171c5 aco: fix outdated label_vec from p_create_vector labelling
Fixes random dEQP-VK.transform_feedback.fuzz.* crashes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 2dc550202e
    ('aco: copy-propagate p_create_vector copies of vectors')

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4730>
2020-04-24 12:21:15 +00:00
Marek Olšák
b4fd8f1919 ac,radeonsi: simplify checking for Navi1x chips
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4698>
2020-04-24 10:38:54 +00:00
Marek Olšák
d8443b211e ac: out-of-order rasterization is not supported on gfx10
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4698>
2020-04-24 10:38:54 +00:00
Jason Ekstrand
f4addfdde3 spirv: Use nir_const_value for spec constants
When we originally wrote spirv_to_nir we didn't have a good scalar value
union to handily use so we rolled our own thing for spec constants.  Now
that we have nir_const_value, we can use that and simplify a bunch of
the spec constant logic.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
2020-04-24 09:23:59 +00:00
Jason Ekstrand
a4885df9f8 radv: Properly handle all sizes of specialization constants
cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
2020-04-24 09:23:59 +00:00
Eduardo Lima Mitev
4dc7b76276 anv/radv: Resolving 'GetInstanceProcAddr' should not require a valid instance
Since vk_icdGetInstanceProcAddr() is wired through
vkGetInstanceProcAddr() in both drivers, we lost the ability for
'GetInstanceProcAddr' to resolve itself prior to having a valid
instance.

An upcoming spec change will fix that and allow
vkGetInstanceProcAddr() to resolve itself passing NULL as
instance. See https://gitlab.khronos.org/vulkan/vulkan/issues/2057
for details.

This patch implements the change in both radv and anvil.

CTS changes have already landed:
https://gitlab.khronos.org/Tracker/vk-gl-cts/issues/2278

vulkan-loader changes have also landed:
https://gitlab.khronos.org/Tracker/vk-gl-cts/issues/2278

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4273>
2020-04-24 09:09:14 +00:00
Rhys Perry
665250e830 aco: fix v_or(s_lshl) and v_add(s_lshl) optimizations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: d1621834f3
    ('aco: combine VALU and SALU into various VOP3 instructions')

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2822
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4717>
2020-04-24 08:55:19 +00:00
Samuel Pitoiset
6a6e71524d radv: adjust the supported subgroup stages
VK_SHADER_STAGE_ALL now includes all ray-tracing related stages.
Noticed while comparing vulkaninfo with some other drivers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4679>
2020-04-23 16:16:09 +00:00
Samuel Pitoiset
ff3f775476 radv: simplify checking for Navi1x chips
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4702>
2020-04-23 15:54:32 +02:00
Rhys Perry
0d9fe0405f aco: improve code for 32-bit isign
No shader-db changes on Navi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
d1621834f3 aco: combine VALU and SALU into various VOP3 instructions
shader-db (Navi):
Totals from 2916 (2.28% of 127638) affected shaders:
SGPRs: 184427 -> 184283 (-0.08%); split: -0.10%, +0.02%
VGPRs: 143520 -> 143640 (+0.08%); split: -0.00%, +0.09%
CodeSize: 14913548 -> 14913288 (-0.00%); split: -0.00%, +0.00%
MaxWaves: 26034 -> 26012 (-0.08%)
Instrs: 2935435 -> 2930960 (-0.15%); split: -0.15%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
607fb4153d aco: move call to store_output_to_temps in store_ls_or_es_output earlier
Skips get_intrinsic_io_basic_offset()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
b497b774a5 aco: remove copy in load_input_from_temps()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
2dc550202e aco: copy-propagate p_create_vector copies of vectors
Instead of copying the operands of the other p_create_vector and labelling
the definition with label_vec, copy the operands and label it with
label_temp so that it can be copy-propagated.

This was found while removing a redundant copy in load_input_from_temps()
which removed duplicate p_create_vector instructions.

shader-db (Navi):
Totals from 139 (0.11% of 127638) affected shaders:
VGPRs: 8472 -> 7948 (-6.19%)
CodeSize: 514592 -> 512368 (-0.43%)
MaxWaves: 1089 -> 1195 (+9.73%)
Instrs: 100214 -> 99658 (-0.55%)
Cycles: 400856 -> 398632 (-0.55%)
VMEM: 15545 -> 15338 (-1.33%)
Copies: 5140 -> 4584 (-10.82%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
e4383b5c7f aco: decrease the uses of other copy operations after splitting/removing
For copies like v[7:8] = v[8:9], what currently happens is:
- do_copy() will skip the second dword
- the uses of the second dword will be reduced to 0
- the copy operation will be removed from the map
and v8 will never be set to v9.

So just decrease the uses of other operations after splitting or removing
the current operation, so: "v8 = v9" will be split off, it's uses reduced
and then the new copy will be done in the next iteration.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4686>
2020-04-23 11:39:23 +00:00
Joshua Ashton
58f25098a0 radv: Use TRUNC_COORD on samplers
The default behaviour (0) is: "round-nearest-even to n.6 and drop fraction when point sampling" whereas the Vulkan spec simply wants us to floor it (1) "truncate when point sampling".

See 15.6.1 in the Vulkan spec.
https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#textures-normalized-operations

The Direct3D spec also mandates this (https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#7.18.7%20Point%20Sample%20Addressing)

This fixes some point-sampling texture precision issues in some Direct3D 9 titles such as Guild Wars 2 and htoL#NiQ: The Firefly Diary that are not present on other vendors.

Fixes dEQP-VK.pipeline.sampler.exact_sampling.*

https://github.com/Joshua-Ashton/d9vk/issues/450
https://github.com/doitsujin/dxvk/issues/1433

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3951>
2020-04-23 09:57:08 +00:00
Samuel Pitoiset
7086b38c81 radv: make sure to export the viewport index if FS needs it
If FS reads gl_ViewportIndex but VS doesn't export it, it should
be zero to avoid reading garbage.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2818
Fixes: b424d49ac0 ("radv/llvm: fix exporting the viewport index if the fragment shader needs it")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4687>
2020-04-23 08:10:25 +00:00
Daniel Schürmann
36e0d2f39b aco: coalesce v_mad's accumulator with definition's affinities
Totals from affected shaders:
Code Size: 8922676 -> 8915192 (-0.08 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d000d76f13 aco: use upper part of gap in register file if it is beneficial for striding
Totals from affected shaders:
SGPRS: 1717288 -> 1716984 (-0.02 %)
VGPRS: 1305924 -> 1304904 (-0.08 %)
Code Size: 138508892 -> 138420144 (-0.06 %) bytes
Max Waves: 115726 -> 115735 (0.01 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d666d83be2 aco: try to always find a register with stride for even sizes
Totals from affected shaders:
SGPRS: 1162400 -> 1162400 (0.00 %)
VGPRS: 947364 -> 946960 (-0.04 %)
Code Size: 98399300 -> 98399004 (-0.00 %) bytes
Max Waves: 74665 -> 74682 (0.02 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
5a3c1f4f0b aco: stop get_reg_simple after reaching max_used_gpr
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
2796cb4c24 aco: refactor get_reg_simple() to return early on exact matches
in the best fit algorithm

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
6792e134f3 aco: don't create vector affinities for operands which are not killed or are duplicates
Totals from affected shaders:
SGPRS: 825184 -> 825184 (0.00 %)
VGPRS: 697640 -> 697240 (-0.06 %)
Code Size: 79244104 -> 79201072 (-0.05 %) bytes
Max Waves: 42388 -> 42386 (-0.00 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
edc2b57ac1 aco: allocate full register for subdword definitions if HW doesn't support it
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
97a870cf88 aco: move attempt to find strided register into get_reg_simple()
This simplifies code and helps some shaders

Totals from affected shaders:
Code Size: 51227172 -> 51202216 (-0.05 %) bytes
Max Waves: 19955 -> 19948 (-0.04 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
c7f97f110c aco: use DefInfo in more places to simplify RA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
734f86db6b aco: create and use DefInfo struct in RA
for maintaining all information necessary to find a register.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
5b2f628da3 aco: create pseudo dummy instruction in RA to be used for live-range splits
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d9f7d1d5cb aco: refactor get_reg() to also handle affinities
This simplifies definition handling and
helps a few shaders

Totals from affected shaders:
Code Size: 659540 -> 659376 (-0.02 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
7c8f4ebca9 aco: refactor get_reg() to take Temp instead of RegClass
This patch also moves get_reg_specified() and
get_reg_vec() before get_reg() to make use of it later.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:22 +00:00
Daniel Schürmann
0a9ed98178 aco: simplify operand handling in RA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:22 +00:00
Rhys Perry
f13049f48a aco: implement 64-bit sgpr swaps
In our pipeline-db, helps almost exclusively Detroit: Become Human.

Totals from 6726 (5.36% of 125503) affected shaders:
CodeSize: 74680952 -> 74102228 (-0.77%)
Instrs: 14551507 -> 14406001 (-1.00%)
Cycles: 1748272436 -> 1690173104 (-3.32%)
VMEM: 964671 -> 964058 (-0.06%)
Copies: 1993312 -> 1847806 (-7.30%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Rhys Perry
2ab45f41e0 aco: implement sub-dword swaps
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Rhys Perry
83fdb1ed3d aco: add VOP3P_instruction
The optimizer isn't yet updated to handle this, since lower_to_hw_instr
will be the only user for now.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Rhys Perry
8fc24f9a45 aco: fix copy statistic for 64-bit vgpr constant copy
The statistic is in units of instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Jonathan Marek
064d39e620 radv: use common nir_convert_ycbcr
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
2020-04-20 22:01:43 +00:00
Daniel Schürmann
c3c1f4d6bc aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel
Is simpler and helps a couple of shaders.
Totals from affected shaders: (Vega)
Code Size: 16341296 -> 16335460 (-0.04 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>
2020-04-20 15:12:50 +00:00
Daniel Schürmann
be0bb7e101 aco: fix 64bit fsub
Fixes: 425558bfd5 ('aco: use v_subrev_f32 for fsub with an sgpr operand in src1')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>
2020-04-20 15:12:50 +00:00
Pierre-Eric Pelloux-Prayer
17acff01a0 radeonsi: skip vs output optimizations for some outputs
If PT_SPRITE_TEX is enabled, PS inputs are overriden at runtime so
we can't apply the vs output optim.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2747
Fixes: 3ec9975555 ("radeonsi: eliminate trivial constant VS outputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4559>
2020-04-20 08:45:16 +02:00
Daniel Schürmann
425558bfd5 aco: use v_subrev_f32 for fsub with an sgpr operand in src1
This fixes an accidentally introduced regression.

Fixes: 9be4be515f ('aco: implement 16-bit nir_op_fsub/nir_op_fadd')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4633>
2020-04-19 16:16:27 +00:00
Albert Astals Cid
06c5875fd6 Fix promotion of floats to doubles
Use the f variants of the math functions if the input parameter is a
float, saves converting from float to double and running the double
variant of the math function for gaining no precision at all

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3969>
2020-04-18 19:55:45 +00:00
Samuel Pitoiset
c4ca9e66dd aco: fix exporting the viewport index if the fragment shader needs it
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.

Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
2020-04-17 16:23:24 +00:00
Samuel Pitoiset
b424d49ac0 radv/llvm: fix exporting the viewport index if the fragment shader needs it
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.

Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
2020-04-17 16:23:24 +00:00
Samuel Pitoiset
19aa68ae31 radv: set missing SHARED_VGPR_CNT for NGG VS and ACO
shuffle is implemented with shared VGPRs with ACO and Wave64.

Fixes dEQP-VK.subgroups.shuffle.framebuffer.subgroupshuffle*_vertex
with Wave64.

Fixes: c24d9522da ("radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4595>
2020-04-17 16:11:17 +00:00
Samuel Pitoiset
fd6e44236c radv: fix geometry shader primitives query with ACO on GFX10
Fixes
dEQP-VK.query_pool.statistics_query.*.geometry_shader_primitives.*.

Fixes: c24d9522da ("radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4593>
2020-04-17 17:39:16 +02:00
Rhys Perry
839c886b34 aco: add missing scc clobber to nir_op_unpack_32_2x16_split_y
The ISA doc is inconsistent whether this instruction writes SCC. It does.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552>
2020-04-16 17:04:53 +01:00
Rhys Perry
ac74367bef aco: implement various 8/16-bit conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552>
2020-04-16 17:04:45 +01:00
Michel Dänzer
e3e704c7e7 amd/addrlib: Use enum instead of sparse chars to identify dimensions
The enum values can be used directly as indices into arrays, simplifying
the code.

This significantly cuts down the number of CPU cycles spent inside

* Addr::V2::Gfx9Lib::HwlComputeDccAddrFromCoord:

+------------------------------------------------------------------------+
|+         +++    +                                                x x xx|
|    |_____AM____|                                                 |_A__||
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5         14.89         15.44         15.14        15.156    0.24704251
+   5          8.26          9.96          9.37         9.282     0.6262747
Difference at 95.0% confidence
	-5.874 +/- 0.694294
	-38.7569% +/- 4.58098%
	(Student's t, pooled s = 0.476051)

* Addr::V2::CoordEq::solve:

+------------------------------------------------------------------------+
| +                                                                x     |
| + +   +   +                                       x           x  x    x|
||__MA____|                                              |______A__M____||
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5          8.11          9.59          9.21          9.02    0.55605755
+   5          4.28          5.05          4.48         4.564    0.32867917
Difference at 95.0% confidence
	-4.456 +/- 0.666135
	-49.4013% +/- 7.38509%
	(Student's t, pooled s = 0.456744)

(The measured numbers are the percentages of samples inside the
respective function and its calles for
`perf record --call-graph=fp kitty -e false`, measured on a Lenovo
Thinkpad E595 (Picasso))

v2:
* Add missed 'coords[dim] |= bit << ord;' (Pierre-Eric Pelloux-Prayer)
* Put 'ADDR_ASSERT(dim < DIM_S);' where the code previous had
  'ADDR_ASSERT_ALWAYS()' for the s/m dimensions.
* Use 1u for BitsValid (since it's 32-bit unsigned values).
* Use parens in 'BitsValid[dim] & (1u << ord)' for clarity.

Acked-by: Marek Olšák <marek.olsak@amd.com> # v1
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4523>
2020-04-16 11:10:52 +00:00