4814 commits

Author SHA1 Message Date
Samuel Pitoiset
68abc07317 aco: fix emitting SMEM instructions with no operands on GFX6-GFX7
Like s_memtime.

Fixes dEQP-VK.glsl.shader_clock.* on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3407>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3407>
2020-01-16 08:18:18 +01:00
Marek Olšák
eeb4a11c11 ac/cull: don't read Position.Z if it's not needed for culling
It could be NULL.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:20 -05:00
Samuel Pitoiset
7f5462e349 radv: enable Vulkan 1.2
This bumps the Vulkan version to 1.2.128.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
68d6bead78 radv: implement Vulkan 1.2 features and properties
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b3033198a8 radv: implement Vulkan 1.1 features and properties
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
a09ab76828 radv: update VK_KHR_timeline_semaphore for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
fab0aa9182 radv: update VK_KHR_uniform_buffer_standard_layout for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
3ff8d12458 radv: update VK_KHR_shader_subgroup_extended_types for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
af25c8d57b radv: update VK_KHR_shader_float_controls for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
5335bb6c39 radv: update VK_KHR_shader_float16_int8 for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
a73d01b1db radv: update VK_KHR_shader_atomic_int64 for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
83d1773a57 radv: update VK_KHR_imageless_framebuffer for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b3bdb4e6ff radv: update VK_KHR_image_format_list for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
a80229941f radv: update VK_KHR_driver_properties for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
af883bf3dc radv: update VK_KHR_draw_indirect_count for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b537be4368 radv: update VK_KHR_depth_stencil_resolve for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
5993f13b27 radv: update VK_KHR_create_renderpass2 for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b2be00fbc1 radv: update VK_KHR_buffer_device_address for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
0eb26aae1c radv: update VK_KHR_8bit_storage for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
b4eed4e548 radv: update VK_EXT_scalar_block_layout for Vulkan 1.2
Promoted to Vulkan 1.2 with the EXT suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
efdf9d8969 radv: update VK_EXT_sampler_filter_minmax for Vulkan 1.2
Promoted to Vulkan 1.2 with the EXT suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
65e215e6f3 radv: update VK_EXT_host_query_reset for Vulkan 1.2
Promoted to Vulkan 1.2 with the EXT suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
95ec0c050b radv: update VK_EXT_descriptor_indexing for Vulkan 1.2
Promoted to Vulkan 1.2 with the EXT suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset
fce28a7341 radv/gfx10: simplify some duplicated NGG GS code
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>
2020-01-15 07:45:29 +00:00
Samuel Pitoiset
53b50be35c radv/gfx10: enable all CUs if NGG is never used
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>
2020-01-15 07:45:29 +00:00
Samuel Pitoiset
5ff12322c9 radv: only use VkSamplerCreateInfo::compareOp if enabled
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2350
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3392>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3392>
2020-01-15 08:16:15 +01:00
Bas Nieuwenhuizen
4e3c81517b radv: Disable VK_EXT_sample_locations on GFX10.
Workaround for https://gitlab.freedesktop.org/mesa/mesa/issues/2163

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3236>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3236>
2020-01-15 01:54:27 +00:00
Timur Kristóf
dfaa3c0af6 aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible.
When possible, get rid of an s_not when all it does is invert the SCC,
and its successor s_cbranch / s_cselect can be inverted instead.

Also modify some parts of instruction_selection to take advantage of
this feature.

Example:
s2: %3900,  s1: %3899:scc = s_andn2_b64 %0:exec, %406
s2: %3902 = s_cselect_b64 -1, 0, %3900:scc
s2: %407,  s1: %3903:scc = s_not_b64 %3902
s2: %3906,  s1: %3905:scc = s_and_b64 %407, %0:exec
p_cbranch_z %3905:scc
Can now be optimized to:
s2: %3900,  s1: %3899:scc = s_andn2_b64 %0:exec, %406
p_cbranch_nz %3900:scc

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
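Note on dfaa3c0af6: the following is a toy check of the s_not flip from the example above, not ACO code. It models a 4-lane wave in Python and compares the original five-instruction sequence with the two-instruction result. The 4-bit width, the function names, and the assumption that exec is non-zero are illustrative choices that do not come from the commit.

```python
MASK = 0xF  # 4-lane toy wave instead of 64 lanes (assumption for brevity)


def original_branches(exec_mask, x):
    """Model of the unoptimized sequence; returns True if the branch is taken."""
    t = exec_mask & ~x & MASK        # s_andn2_b64 exec, x
    scc = t != 0                     # SCC = result is non-zero
    sel = MASK if scc else 0         # s_cselect_b64 -1, 0
    inv = ~sel & MASK                # s_not_b64
    scc = (inv & exec_mask) != 0     # s_and_b64 with exec
    return not scc                   # p_cbranch_z: taken when SCC is 0


def optimized_branches(exec_mask, x):
    """Model of the optimized sequence; returns True if the branch is taken."""
    t = exec_mask & ~x & MASK        # s_andn2_b64 exec, x
    scc = t != 0
    return scc                       # p_cbranch_nz: taken when SCC is 1


def equivalent_for_all_inputs():
    """Exhaustively compare both sequences for every non-zero exec mask."""
    return all(
        original_branches(e, x) == optimized_branches(e, x)
        for e in range(1, MASK + 1)
        for x in range(MASK + 1)
    )
```

Under these assumptions, equivalent_for_all_inputs() compares every combination of a non-zero exec mask and an operand mask.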
Timur Kristóf
c0f82165a7 aco: Optimize out s_and with exec, when used on uniform bitwise values.
Previously all booleans needed an s_and with exec when they were turned
into a scalar condition. However, this is not needed for uniform booleans.

v2 by Daniel Schürmann:
- Make the code more readable
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
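Note on c0f82165a7: a small Python model of why the s_and with exec can be dropped for uniform booleans. It assumes a uniform boolean is stored as a lane mask that is either all ones or zero, and that exec is non-zero. Both assumptions and all names are hypothetical, chosen only to illustrate the idea.

```python
MASK = 0xF  # 4-lane toy wave (assumption for brevity)


def to_scc_with_and(lane_mask, exec_mask):
    """Scalar condition with the s_and: true if any active lane is set."""
    return (lane_mask & exec_mask) != 0


def to_scc_without_and(lane_mask):
    """Scalar condition that skips the s_and with exec."""
    return lane_mask != 0


def uniform_bool(value):
    """Assumed encoding of a uniform boolean: every lane set, or none."""
    return MASK if value else 0


def and_is_redundant_for_uniform():
    """Check both forms agree for uniform booleans and any non-zero exec."""
    return all(
        to_scc_with_and(uniform_bool(v), e) == to_scc_without_and(uniform_bool(v))
        for v in (False, True)
        for e in range(1, MASK + 1)
    )
```

A divergent mask shows why the s_and is still needed in general: with lane_mask 0b1000 and exec 0b0111, only an inactive lane is set, so the two forms disagree.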
Timur Kristóf
1c44129db3 aco: Don't skip combine_instruction when definitions[1] is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Timur Kristóf
338d03090f aco: Allow optimizing vote_all and nir_op_iand.
By adding an extra instruction, we can replace the operands of
the s_cselect_b64, which allows it to get picked up by the
optimizer when it looks for uniform booleans.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Timur Kristóf
d962bbd895 aco: Implement 64-bit constant propagation.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Jason Ekstrand
7c16a1ae4e vulkan/wsi: Add a driconf option to force WSI to advertise BGRA8_UNORM first
The Aztec Ruins benchmark just grabs the first format in the list and
SRGB causes it to render washed out.  With this workaround, it renders
the same as OpenGL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350>
2020-01-14 19:27:13 +00:00
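Note on 7c16a1ae4e: a minimal sketch of the reordering the commit describes, written in Python rather than the driver's C. The function name and the force_unorm_first flag are hypothetical; the commit message does not name the driconf option, so none is given here.

```python
PREFERRED = "VK_FORMAT_B8G8R8A8_UNORM"


def advertise_bgra8_unorm_first(formats, force_unorm_first):
    """Return the surface format list, optionally with BGRA8_UNORM moved first."""
    if not force_unorm_first:
        return list(formats)
    # sorted() is stable, so every other format keeps its relative order.
    return sorted(formats, key=lambda fmt: fmt != PREFERRED)
```

An application that takes the first entry would then see the UNORM format instead of the SRGB one whenever the workaround is enabled.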
Rhys Perry
f978e0e516 aco: add integer min/max to can_swap_operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f92a89a979 aco: improve readfirstlane after uniform LDS loads
Totals from affected shaders:
SGPRS: 976 -> 968 (-0.82 %)
VGPRS: 580 -> 584 (0.69 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 106032 -> 103076 (-2.79 %) bytes
Max Waves: 237 -> 237 (0.00 %)
Instructions: 19452 -> 18740 (-3.66 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
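Note on reading the pipeline-db totals in this log: the percentages appear to be the relative change from the first number to the second. The helper below is a sketch that reproduces that arithmetic; it is not the pipeline-db report script, and the handling of a zero baseline is an assumption inferred from the "0 -> 0 (0.00 %)" lines.

```python
def pipeline_db_delta(before, after):
    """Format a before/after pair the way the totals in this log are printed."""
    if before == 0:
        pct = 0.0  # assumption: mirrors the "0 -> 0 (0.00 %)" lines
    else:
        pct = (after - before) / before * 100.0
    return f"{before} -> {after} ({pct:.2f} %)"
```

For example, pipeline_db_delta(976, 968) corresponds to the SGPRS line of f92a89a979.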
Rhys Perry
92ace0bb31 aco: replace extract_vector with copies
Helps a small number of small shaders with situations like this:
a = p_create_vector ...
b = p_extract_vector a, 3
and copy propagation can't be done

Totals from affected shaders:
SGPRS: 14304 -> 14416 (0.78 %)
VGPRS: 8716 -> 6592 (-24.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 184664 -> 176888 (-4.21 %) bytes
Max Waves: 6260 -> 6260 (0.00 %)
Instructions: 35561 -> 33617 (-5.47 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
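Note on 92ace0bb31: a toy rewrite pass in Python that illustrates replacing p_extract_vector with a copy of the matching p_create_vector operand. The tuple-based IR, the "copy" opcode name, and the assumption that every vector operand is exactly one element wide are all hypothetical simplifications, not ACO's data structures.

```python
def replace_extract_with_copy(instructions):
    """Rewrite extracts of a known p_create_vector into plain copies.

    instructions: list of (dest, opcode, operands) tuples in program order,
    with each dest defined once (SSA-like).
    """
    vectors = {}  # dest of a p_create_vector -> list of its source operands
    out = []
    for dest, opcode, operands in instructions:
        if opcode == "p_create_vector":
            vectors[dest] = list(operands)
        if (
            opcode == "p_extract_vector"
            and operands[0] in vectors
            and operands[1] < len(vectors[operands[0]])
        ):
            src_vec, index = operands
            # Assumes every p_create_vector operand is one element wide.
            out.append((dest, "copy", [vectors[src_vec][index]]))
        else:
            out.append((dest, opcode, operands))
    return out
```

Extracts from vectors the pass has not seen are left untouched.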
Rhys Perry
20d869079d aco: allow input modifiers on v_cndmask_b32
Totals from affected shaders:
SGPRS: 594099 -> 594019 (-0.01 %)
VGPRS: 441016 -> 441124 (0.02 %)
Spilled SGPRs: 101 -> 101 (0.00 %)
Spilled VGPRs: 18 -> 18 (0.00 %)
Code Size: 30266652 -> 30125256 (-0.47 %) bytes
Max Waves: 67044 -> 67057 (0.02 %)
Instructions: 5753097 -> 5726607 (-0.46 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f9405ceb8a aco: don't move literal to reg when making an instruction VOP3 on GFX10
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 163398 -> 163398 (0.00 %)
VGPRS: 143820 -> 143820 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 13065744 -> 13044308 (-0.16 %) bytes
Max Waves: 18921 -> 18921 (0.00 %)
Instructions: 2514644 -> 2509285 (-0.21 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
e686e4765e aco: add min(-max(), ) and max(-min(), ) optimization
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
fa8357eb70 aco: improve clamp optimization
Not sure why it checked the use count; it doesn't apply the constants.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 269409 -> 269745 (0.12 %)
VGPRS: 238120 -> 238132 (0.01 %)
Spilled SGPRs: 305 -> 305 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 22908584 -> 22904672 (-0.02 %) bytes
Max Waves: 20217 -> 20217 (0.00 %)
Instructions: 4275312 -> 4263869 (-0.27 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 155409 -> 155233 (-0.11 %)
VGPRS: 153072 -> 153072 (0.00 %)
Spilled SGPRs: 269 -> 269 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 14650824 -> 14650396 (-0.00 %) bytes
Max Waves: 9609 -> 9609 (0.00 %)
Instructions: 2762802 -> 2755517 (-0.26 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
edc888ccb1 aco: fix clamp optimization
We can't do the optimization if there are neg/abs in-between.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f664cb01ec aco: improve creation of v_madmk_f32/v_madak_f32
Using needs_vop3 check was flawed because it would only combine the
literal if the first operand is the literal. If the second or third
operand is the literal, then needs_vop3 will be true and the literal will
not be combined.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 782051 -> 782051 (0.00 %)
VGPRS: 630048 -> 630048 (0.00 %)
Spilled SGPRs: 195 -> 195 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 54743740 -> 54585548 (-0.29 %) bytes
Max Waves: 67340 -> 67340 (0.00 %)
Instructions: 10182030 -> 10182030 (0.00 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 701990 -> 699590 (-0.34 %)
VGPRS: 566632 -> 566784 (0.03 %)
Spilled SGPRs: 218 -> 218 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 49173564 -> 49007856 (-0.34 %) bytes
Max Waves: 59650 -> 59612 (-0.06 %)
Instructions: 9315135 -> 9293330 (-0.23 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
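Note on f664cb01ec: a sketch of choosing between v_madmk_f32 (D = S0 * K + S1) and v_madak_f32 (D = S0 * S1 + K) based on which operand of a multiply-add holds the literal, regardless of its position. The function names and the tuple layout are hypothetical, and register-class constraints on the operands are ignored here.

```python
def select_mad_literal_form(operands, literal_index):
    """Pick the literal form for mad(a, b, c) = a * b + c.

    Returns (opcode, s0, s1, k), where k is the literal.
    """
    a, b, c = operands
    if literal_index == 0:
        return ("v_madmk_f32", b, c, a)  # b * K + c, multiplication commutes
    if literal_index == 1:
        return ("v_madmk_f32", a, c, b)  # a * K + c
    if literal_index == 2:
        return ("v_madak_f32", a, b, c)  # a * b + K
    raise ValueError("literal_index must be 0, 1 or 2")


def evaluate(form):
    """Evaluate a selected form so it can be compared with a * b + c."""
    opcode, s0, s1, k = form
    if opcode == "v_madmk_f32":
        return s0 * k + s1
    return s0 * s1 + k
```

Whichever operand is the literal, the selected form evaluates to the same value as the original multiply-add.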
Rhys Perry
15e25da3e5 aco: take advantage of GFX10's constant bus limit and VOP3 literals
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 2397159 -> 2392494 (-0.19 %)
VGPRS: 1756036 -> 1753920 (-0.12 %)
Spilled SGPRs: 461 -> 470 (1.95 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 110287304 -> 109946304 (-0.31 %) bytes
Max Waves: 318341 -> 318475 (0.04 %)
Instructions: 21019327 -> 20533618 (-2.31 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 0 -> 0 (0.00 %) bytes
Max Waves: 0 -> 0 (0.00 %)
Instructions: 0 -> 0 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
9c2d37308f aco: allow an extra SGPR with multiple uses to be applied to VOP3
This is in a separate patch from the apply_sgprs() rewrite so that the
rewrite can be more easily tested.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 3056 -> 3056 (0.00 %)
VGPRS: 1632 -> 1632 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156468 -> 156304 (-0.10 %) bytes
Max Waves: 288 -> 288 (0.00 %)
Instructions: 29510 -> 29469 (-0.14 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 1616 -> 1616 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156132 -> 155968 (-0.11 %) bytes
Max Waves: 289 -> 289 (0.00 %)
Instructions: 29426 -> 29385 (-0.14 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f4c2c90e1a aco: allow applying two sgprs to an instruction
We could create VALU instructions which read two sgprs, but only if isel
created an instruction which already read one of them.

This change is in a separate patch from the apply_sgprs() rewrite so that
it can be tested if the rewrite affected anything.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1756 -> 1708 (-2.73 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 312 -> 300 (-3.85 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1784 -> 1736 (-2.69 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 319 -> 307 (-3.76 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
7da07ca3e4 aco: follow through temporary when merging tests into constant comparisons
This can happen with v_mov_b32(s_mov_b32(literal))

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77488 -> 76928 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14426 -> 14332 (-0.65 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77512 -> 76952 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14432 -> 14338 (-0.65 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
dc6c35e1c3 aco: be more careful with literals in combine_salu_{n2,lshl_add}
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
fcf52eb42d aco: add check_vop3_operands()
This will be useful when taking advantage of GFX10 features.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
cef7879719 aco: rewrite apply_sgprs()
This will make it easier to apply two different sgprs (for GFX10) or apply
the same sgpr twice (just remove the break).

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
0be7409069 aco: rewrite literal combining
Should make taking advantage of GFX10's increased constant bus limit and
VOP3 literals easier.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00