Commit graph

5018 commits

Author SHA1 Message Date
Samuel Pitoiset
6a6e71524d radv: adjust the supported subgroup stages
VK_SHADER_STAGE_ALL now includes all ray-tracing related stages.
Noticed while comparing vulkaninfo with some other drivers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4679>
2020-04-23 16:16:09 +00:00
Samuel Pitoiset
ff3f775476 radv: simplify checking for Navi1x chips
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4702>
2020-04-23 15:54:32 +02:00
Rhys Perry
0d9fe0405f aco: improve code for 32-bit isign
No shader-db changes on Navi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
d1621834f3 aco: combine VALU and SALU into various VOP3 instructions
shader-db (Navi):
Totals from 2916 (2.28% of 127638) affected shaders:
SGPRs: 184427 -> 184283 (-0.08%); split: -0.10%, +0.02%
VGPRs: 143520 -> 143640 (+0.08%); split: -0.00%, +0.09%
CodeSize: 14913548 -> 14913288 (-0.00%); split: -0.00%, +0.00%
MaxWaves: 26034 -> 26012 (-0.08%)
Instrs: 2935435 -> 2930960 (-0.15%); split: -0.15%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
607fb4153d aco: move call to store_output_to_temps in store_ls_or_es_output earlier
Skips get_intrinsic_io_basic_offset()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
b497b774a5 aco: remove copy in load_input_from_temps()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
2dc550202e aco: copy-propagate p_create_vector copies of vectors
Instead of copying the operands of the other p_create_vector and labelling
the definition with label_vec, copy the operands and label it with
label_temp so that it can be copy-propagated.

This was found while removing a redundant copy in load_input_from_temps()
which removed duplicate p_create_vector instructions.

shader-db (Navi):
Totals from 139 (0.11% of 127638) affected shaders:
VGPRs: 8472 -> 7948 (-6.19%)
CodeSize: 514592 -> 512368 (-0.43%)
MaxWaves: 1089 -> 1195 (+9.73%)
Instrs: 100214 -> 99658 (-0.55%)
Cycles: 400856 -> 398632 (-0.55%)
VMEM: 15545 -> 15338 (-1.33%)
Copies: 5140 -> 4584 (-10.82%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
e4383b5c7f aco: decrease the uses of other copy operations after splitting/removing
For copies like v[7:8] = v[8:9], what currently happens is:
- do_copy() will skip the second dword
- the uses of the second dword will be reduced to 0
- the copy operation will be removed from the map
and v8 will never be set to v9.

So just decrease the uses of other operations after splitting or removing
the current operation, so: "v8 = v9" will be split off, it's uses reduced
and then the new copy will be done in the next iteration.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4686>
2020-04-23 11:39:23 +00:00
Joshua Ashton
58f25098a0 radv: Use TRUNC_COORD on samplers
The default behaviour (0) is: "round-nearest-even to n.6 and drop fraction when point sampling" whereas the Vulkan spec simply wants us to floor it (1) "truncate when point sampling".

See 15.6.1 in the Vulkan spec.
https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#textures-normalized-operations

The Direct3D spec also mandates this (https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#7.18.7%20Point%20Sample%20Addressing)

This fixes some point-sampling texture precision issues in some Direct3D 9 titles such as Guild Wars 2 and htoL#NiQ: The Firefly Diary that are not present on other vendors.

Fixes dEQP-VK.pipeline.sampler.exact_sampling.*

https://github.com/Joshua-Ashton/d9vk/issues/450
https://github.com/doitsujin/dxvk/issues/1433

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3951>
2020-04-23 09:57:08 +00:00
Samuel Pitoiset
7086b38c81 radv: make sure to export the viewport index if FS needs it
If FS reads gl_ViewportIndex but VS doesn't export it, it should
be zero to avoid reading garbage.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2818
Fixes: b424d49ac0 ("radv/llvm: fix exporting the viewport index if the fragment shader needs it")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4687>
2020-04-23 08:10:25 +00:00
Daniel Schürmann
36e0d2f39b aco: coalesce v_mad's accumulator with definition's affinities
Totals from affected shaders:
Code Size: 8922676 -> 8915192 (-0.08 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d000d76f13 aco: use upper part of gap in register file if it is beneficial for striding
Totals from affected shaders:
SGPRS: 1717288 -> 1716984 (-0.02 %)
VGPRS: 1305924 -> 1304904 (-0.08 %)
Code Size: 138508892 -> 138420144 (-0.06 %) bytes
Max Waves: 115726 -> 115735 (0.01 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d666d83be2 aco: try to always find a register with stride for even sizes
Totals from affected shaders:
SGPRS: 1162400 -> 1162400 (0.00 %)
VGPRS: 947364 -> 946960 (-0.04 %)
Code Size: 98399300 -> 98399004 (-0.00 %) bytes
Max Waves: 74665 -> 74682 (0.02 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
5a3c1f4f0b aco: stop get_reg_simple after reaching max_used_gpr
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
2796cb4c24 aco: refactor get_reg_simple() to return early on exact matches
in the best fit algorithm

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
6792e134f3 aco: don't create vector affinities for operands which are not killed or are duplicates
Totals from affected shaders:
SGPRS: 825184 -> 825184 (0.00 %)
VGPRS: 697640 -> 697240 (-0.06 %)
Code Size: 79244104 -> 79201072 (-0.05 %) bytes
Max Waves: 42388 -> 42386 (-0.00 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
edc2b57ac1 aco: allocate full register for subdword definitions if HW doesn't support it
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
97a870cf88 aco: move attempt to find strided register into get_reg_simple()
This simplifies code and helps some shaders

Totals from affected shaders:
Code Size: 51227172 -> 51202216 (-0.05 %) bytes
Max Waves: 19955 -> 19948 (-0.04 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
c7f97f110c aco: use DefInfo in more places to simplify RA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
734f86db6b aco: create and use DefInfo struct in RA
for maintaining all information necessary to find a register.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
5b2f628da3 aco: create pseudo dummy instruction in RA to be used for live-range splits
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d9f7d1d5cb aco: refactor get_reg() to also handle affinities
This simplifies definition handling and
helps a few shaders

Totals from affected shaders:
Code Size: 659540 -> 659376 (-0.02 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
7c8f4ebca9 aco: refactor get_reg() to take Temp instead of RegClass
This patch also moves get_reg_specified() and
get_reg_vec() before get_reg() to make use of it later.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:22 +00:00
Daniel Schürmann
0a9ed98178 aco: simplify operand handling in RA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:22 +00:00
Rhys Perry
f13049f48a aco: implement 64-bit sgpr swaps
In our pipeline-db, helps almost exclusively Detroit: Become Human.

Totals from 6726 (5.36% of 125503) affected shaders:
CodeSize: 74680952 -> 74102228 (-0.77%)
Instrs: 14551507 -> 14406001 (-1.00%)
Cycles: 1748272436 -> 1690173104 (-3.32%)
VMEM: 964671 -> 964058 (-0.06%)
Copies: 1993312 -> 1847806 (-7.30%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Rhys Perry
2ab45f41e0 aco: implement sub-dword swaps
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Rhys Perry
83fdb1ed3d aco: add VOP3P_instruction
The optimizer isn't yet updated to handle this, since lower_to_hw_instr
will be the only user for now.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Rhys Perry
8fc24f9a45 aco: fix copy statistic for 64-bit vgpr constant copy
The statistic is in units of instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Jonathan Marek
064d39e620 radv: use common nir_convert_ycbcr
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
2020-04-20 22:01:43 +00:00
Daniel Schürmann
c3c1f4d6bc aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel
Is simpler and helps a couple of shaders.
Totals from affected shaders: (Vega)
Code Size: 16341296 -> 16335460 (-0.04 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>
2020-04-20 15:12:50 +00:00
Daniel Schürmann
be0bb7e101 aco: fix 64bit fsub
Fixes: 425558bfd5 ('aco: use v_subrev_f32 for fsub with an sgpr operand in src1')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>
2020-04-20 15:12:50 +00:00
Pierre-Eric Pelloux-Prayer
17acff01a0 radeonsi: skip vs output optimizations for some outputs
If PT_SPRITE_TEX is enabled, PS inputs are overriden at runtime so
we can't apply the vs output optim.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2747
Fixes: 3ec9975555 ("radeonsi: eliminate trivial constant VS outputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4559>
2020-04-20 08:45:16 +02:00
Daniel Schürmann
425558bfd5 aco: use v_subrev_f32 for fsub with an sgpr operand in src1
This fixes an accidentally introduced regression.

Fixes: 9be4be515f ('aco: implement 16-bit nir_op_fsub/nir_op_fadd')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4633>
2020-04-19 16:16:27 +00:00
Albert Astals Cid
06c5875fd6 Fix promotion of floats to doubles
Use the f variants of the math functions if the input parameter is a
float, saves converting from float to double and running the double
variant of the math function for gaining no precision at all

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3969>
2020-04-18 19:55:45 +00:00
Samuel Pitoiset
c4ca9e66dd aco: fix exporting the viewport index if the fragment shader needs it
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.

Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
2020-04-17 16:23:24 +00:00
Samuel Pitoiset
b424d49ac0 radv/llvm: fix exporting the viewport index if the fragment shader needs it
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.

Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
2020-04-17 16:23:24 +00:00
Samuel Pitoiset
19aa68ae31 radv: set missing SHARED_VGPR_CNT for NGG VS and ACO
shuffle is implemented with shared VGPRs with ACO and Wave64.

Fixes dEQP-VK.subgroups.shuffle.framebuffer.subgroupshuffle*_vertex
with Wave64.

Fixes: c24d9522da ("radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4595>
2020-04-17 16:11:17 +00:00
Samuel Pitoiset
fd6e44236c radv: fix geometry shader primitives query with ACO on GFX10
Fixes
dEQP-VK.query_pool.statistics_query.*.geometry_shader_primitives.*.

Fixes: c24d9522da ("radv: Enable ACO for NGG VS/TES, but disable NGG for ACO GS.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4593>
2020-04-17 17:39:16 +02:00
Rhys Perry
839c886b34 aco: add missing scc clobber to nir_op_unpack_32_2x16_split_y
The ISA doc is inconsistent whether this instruction writes SCC. It does.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552>
2020-04-16 17:04:53 +01:00
Rhys Perry
ac74367bef aco: implement various 8/16-bit conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552>
2020-04-16 17:04:45 +01:00
Michel Dänzer
e3e704c7e7 amd/addrlib: Use enum instead of sparse chars to identify dimensions
The enum values can be used directly as indices into arrays, simplifying
the code.

This significantly cuts down the number of CPU cycles spent inside

* Addr::V2::Gfx9Lib::HwlComputeDccAddrFromCoord:

+------------------------------------------------------------------------+
|+         +++    +                                                x x xx|
|    |_____AM____|                                                 |_A__||
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5         14.89         15.44         15.14        15.156    0.24704251
+   5          8.26          9.96          9.37         9.282     0.6262747
Difference at 95.0% confidence
	-5.874 +/- 0.694294
	-38.7569% +/- 4.58098%
	(Student's t, pooled s = 0.476051)

* Addr::V2::CoordEq::solve:

+------------------------------------------------------------------------+
| +                                                                x     |
| + +   +   +                                       x           x  x    x|
||__MA____|                                              |______A__M____||
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5          8.11          9.59          9.21          9.02    0.55605755
+   5          4.28          5.05          4.48         4.564    0.32867917
Difference at 95.0% confidence
	-4.456 +/- 0.666135
	-49.4013% +/- 7.38509%
	(Student's t, pooled s = 0.456744)

(The measured numbers are the percentages of samples inside the
respective function and its calles for
`perf record --call-graph=fp kitty -e false`, measured on a Lenovo
Thinkpad E595 (Picasso))

v2:
* Add missed 'coords[dim] |= bit << ord;' (Pierre-Eric Pelloux-Prayer)
* Put 'ADDR_ASSERT(dim < DIM_S);' where the code previous had
  'ADDR_ASSERT_ALWAYS()' for the s/m dimensions.
* Use 1u for BitsValid (since it's 32-bit unsigned values).
* Use parens in 'BitsValid[dim] & (1u << ord)' for clarity.

Acked-by: Marek Olšák <marek.olsak@amd.com> # v1
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4523>
2020-04-16 11:10:52 +00:00
Samuel Pitoiset
35b3963928 radv: do not abort with unknown/unimplemented descriptor types
To workaround a crash with Wolfeinstein Younglood because the
games creates one descriptor with
VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV...

I reported the problem to Machine Games, but still no answer, so
let's remove the unreachable calls (which are technically not
unreachable for buggy apps) to help gamers.

Note that AMDVLK and AMDGPU-PRO don't crash because they ignore
unsupported descriptor types.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4571>
2020-04-16 08:14:56 +00:00
Samuel Pitoiset
11faaf646d aco: fix emitting stream output with tess eval shaders
Fixes dEQP-VK.transform_feedback.simple.winding_patch_list_12.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4553>
2020-04-16 07:57:39 +00:00
Samuel Pitoiset
91aa596ca7 aco: implement nir_op_f2i8/nir_op_f2u8
I think we should really refactor the conversions path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4551>
2020-04-16 08:47:49 +02:00
Samuel Pitoiset
e2650db952 radv/aco: do not advertise VK_KHR_shader_subgroup_extended_types
It's unsupported because small bitsizes are still not completely
supported. It should have been disabled by default with ACO.

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4549>
2020-04-15 14:45:43 +02:00
Rhys Perry
c818b5c089 aco: fix 1D textureGrad() on GFX9
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6f718edced ('aco: simplify gathering of MIMG address components')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4550>
2020-04-15 10:45:07 +00:00
Samuel Pitoiset
08a396033b aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents
v_frexp_exp_i16_f16 returns the two's complement for negative
exponents. For example, with 0.333252 it returns 0.666504 for
the mantissa and 65535 for the exponent (-1 in decimal).

RADV/LLVM and AMDVLK do a v_bfe_i32 and AMDGPU-PRO uses SDWA with
the sign extension bit set. The latter is probably what we want to
do in long term but for now RA doesn't support changing non-SDWA
instructions to SDWA if useful/needed.

Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.frexp.compute.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4546>
2020-04-15 10:12:44 +02:00
Rhys Perry
fbd2be3f5d aco: clear moved operands in get_reg_create_vector()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00
Rhys Perry
52cc1f8237 aco: improve p_create_vector RA for sub-dword operands
These's still improvements needed for sub-dword definitions, but that's
not as simple.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00
Rhys Perry
e18711cda3 aco: fix p_extract_vector validation
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00