570 commits

Author SHA1 Message Date
Samuel Pitoiset
0c4a30eb51 radv: do not add extra SGPR when push constants are not used
Just because the vertex stage needs push constants does not mean
that other stages need them too. This should reduce the number
of loaded SGPRs in some situations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:18 +01:00
Samuel Pitoiset
39097282f7 radv: change the needs_push_constants logic
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:16 +01:00
Samuel Pitoiset
1cecaa9174 radv: remove one useless check in ac_nir_shader_info_pass()
pipeline->layout can't be NULL now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:12 +01:00
Dave Airlie
dd517ad96d ac/nir: fix lds store for patch outputs.
This wasn't calculating the correct value. Together with
a NIR patch, this fixes a regression in:
dEQP-VK.tessellation.shader_input_output.barrier

Fixes: 043d14db30 (ac/nir: don't write tcs outputs to LDS that aren't read back.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-19 06:44:24 +10:00
Samuel Pitoiset
79b34d0832 amd/common: add ac_vgt_gs_mode() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:50:50 +01:00
Samuel Pitoiset
55f8431c76 amd/common: add ac_get_cb_shader_mask() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:50:48 +01:00
Bas Nieuwenhuizen
b308bb8773 amd/common: Add detection of the syncobj wait/signal/reset ioctls.
First amdgpu bump after inclusion was 20 (which was done for local BOs).

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:06 +01:00
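As a rough illustration of the version-based detection described in b308bb8773, the check could look like the sketch below. The struct, its field names, and the helper are hypothetical and are not taken from the Mesa sources; the only fact used from the commit message is that minor version 20 was the first amdgpu bump after the ioctls were included.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the device info; field names are assumptions. */
struct example_drm_info {
    int drm_minor;    /* amdgpu kernel driver minor version */
    bool has_syncobj; /* basic syncobj support already detected elsewhere */
};

/* The wait/signal/reset ioctls are assumed present once the kernel
 * reports the first minor version bumped after their inclusion (20). */
static bool example_has_syncobj_wait(const struct example_drm_info *info)
{
    assert(info != NULL);
    return info->has_syncobj && info->drm_minor >= 20;
}
```

A caller would fill in the structure from the kernel version query and only use the new ioctls when the helper returns true.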
Samuel Pitoiset
225b198802 amd/common: add ac_build_waitcnt()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:44 +01:00
Samuel Pitoiset
24601810e9 amd/common: more use of i32_1
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:42 +01:00
Samuel Pitoiset
ec4e566560 amd/common: more use of i32_0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:41 +01:00
Samuel Pitoiset
d43e72fd8c radeonsi: make use of ac_build_fdiv()
And move the comment to amd/common.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:38 +01:00
Samuel Pitoiset
88522e2bcd radv: export SampleMask from pixel shaders at full rate
Use 16_ABGR instead of 32_ABGR if Z isn't written.

Ported from RadeonSI.

No CTS regressions on Polaris.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:23:28 +01:00
Samuel Pitoiset
91f4d746e4 amd/common: add ac_get_spi_shader_z_format()
ac_shader_util.c will contain shader helpers for RadeonSI
and RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:23:23 +01:00
Samuel Pitoiset
90c3bf0789 radv: do not load the local invocation index when it's unused
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:22:26 +01:00
Samuel Pitoiset
e001944410 amd/common: scan which components of gl_LocalInvocationID are used
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:22:04 +01:00
Samuel Pitoiset
42285ed8c3 amd/common: scan which components of gl_WorkGroupID are used
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:22:02 +01:00
Samuel Pitoiset
2e58ef46a8 radv: replace grid_components_used by uses_grid_size
Use a boolean instead because the number of needed SGPRs
is always 3.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:19:42 +01:00
Samuel Pitoiset
97e57740d8 radv: always emit all compute block components
The number of grid components is always 3 when gl_NumWorkGroups
is declared, because it relies on the number of components of
nir_intrinsic_load_num_work_groups.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:19:39 +01:00
Timothy Arceri
a5f9ac2928 ac: fix nir_op_f2f64
Without this we get the error "FPExt only operates on FP" when
converting the following:

   vec1 32 ssa_5 = b2f ssa_4
   vec1 64 ssa_6 = f2f64 ssa_5

Which results in:

   %44 = and i32 %43, 1065353216
   %45 = fpext i32 %44 to double

With this patch we now get:

   %44 = and i32 %43, 1065353216
   %45 = bitcast i32 %44 to float
   %46 = fpext float %45 to double

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-13 13:20:28 +11:00
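The f2f64 fix in a5f9ac2928 can be mirrored in plain C as a minimal sketch: the b2f result is an integer holding the bit pattern of 1.0f, so it has to be reinterpreted as a float before being widened to double. The helper names are hypothetical, and the sketch assumes the boolean input is an all-ones or all-zeros 32-bit mask.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret 32 raw bits as a float; the C analogue of
 * "bitcast i32 %44 to float" in the fixed LLVM IR. */
static float example_bits_to_float(uint32_t bits)
{
    float f;
    assert(sizeof(f) == sizeof(bits));
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* b2f followed by f2f64: mask with the bit pattern of 1.0f
 * (1065353216 == 0x3f800000), reinterpret, then extend to double. */
static double example_b2f_then_f2f64(int condition)
{
    uint32_t mask = condition ? 0xFFFFFFFFu : 0u; /* assumed boolean encoding */
    uint32_t bits = mask & 1065353216u;
    return (double)example_bits_to_float(bits);
}
```

Converting the integer directly, without the reinterpretation step, would produce 1065353216.0 rather than 1.0, which is the value-level counterpart of the type error LLVM reported.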
Bas Nieuwenhuizen
3342a432fa ac/nir: Support vulkan_resource_reindex.
Fixes: 93b4cb61eb "spirv: Allow OpPtrAccessChain for block indices"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-12 00:16:18 +01:00
Bas Nieuwenhuizen
368f49b284 ac/nir: Don't load the descriptor in vulkan_resource_index.
To support the reindex intrinsic, we need the result to be
something on which we can adjust the index/address.

Since it is all within a basic block, the compiler should be
able to merge any extra loads.

v2: Change visit_get_buffer_size too.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-12 00:16:18 +01:00
Samuel Pitoiset
5f81a43535 radv: use a faster version for nir_op_pack_half_2x16
This patch is ported from RadeonSI and it has two effects.

It fixes a rendering issue which affects F1 2017 and Dawn
of War 3 (Vega only) because LLVM ended up generating
the new v_mad_mix_{hi,lo} instructions, which appear to be
buggy in some way. Not sure if Mesa is generating something
wrong or if the issue is in LLVM only. Anyway, that explains why
the DOW3 issue can't be reproduced with GL on Vega.

It also improves performance because v_cvt_pkrtz_f16 is faster,
and because I guess the rounding mode behaviour is similar between
GL and VK, we can use it. It improves Talos performance by about
3-4%, but I don't see any other impact.

No CTS regressions on Polaris.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-07 17:21:50 +01:00
Timothy Arceri
ccd1810bba ac: add si_nir_load_input_gs() to the abi
V2: make use of driver_location and don't expose NIR to the ABI.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:19 +11:00
Timothy Arceri
caf15ce670 ac: move build_varying_gather_values() to ac_llvm_build.h and expose
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:19 +11:00
Timothy Arceri
6fd6cb6616 ac: add basic nir -> llvm type helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Marek Olšák
186adc514b ac/surface: always compute DCC info when DCC is possible on GFX9
The same code for VI doesn't check for scanout either.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-30 18:46:11 +01:00
Marek Olšák
e4cce7dbba radeonsi: dismantle si_common_screen_init/destroy
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
757ea3e613 radeonsi: move/remove ac_shader_binary helpers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Marek Olšák
e3c0a5b6e8 ac/surface: enable DCC computation for MSAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Dylan Baker
5060c51b6f meson: build r600 driver
v4: - Ensure inc_amd_common defined when radeonsi is disabled (needed by
      r600)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-28 14:06:33 -08:00
Nicolai Hähnle
377a062321 ac/surface: fix indentation
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Nicolai Hähnle
97f42d11df amd/common: sid.h cleanups
Fix a bunch of labels indicating when registers were added/removed
and normalize the SI-class GRBM_GFX_INDEX.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Marek Olšák
6b8909f2d1 ac: pack legacy_surf_level better
r600_texture: 1488 -> 1248 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:46:16 +01:00
Marek Olšák
ec15ff78c3 ac: change legacy_surf_level::slice_size to dword units
The next commit will reduce the size even more.

v2: typecast to uint64_t manually
v3: add more typecasts, add asserts

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:44:04 +01:00
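A minimal sketch of the idea behind ec15ff78c3: storing the slice size in dword units lets a 32-bit field describe sizes beyond 4 GiB, provided the value is widened to 64 bits before converting back to bytes. The struct and helper names below are hypothetical and are not the actual legacy_surf_level definition; the typecast and the asserts correspond to the v2 and v3 notes in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-level surface record. */
struct example_surf_level {
    uint32_t slice_size_dw; /* slice size in dwords (4-byte units) */
};

/* Store a byte size in dword units; it must be dword-aligned and fit. */
static void example_set_slice_size(struct example_surf_level *level,
                                   uint64_t size_bytes)
{
    assert(size_bytes % 4 == 0);
    assert(size_bytes / 4 <= UINT32_MAX);
    level->slice_size_dw = (uint32_t)(size_bytes / 4);
}

/* Widen to 64 bits before multiplying so large slices do not overflow. */
static uint64_t example_get_slice_size(const struct example_surf_level *level)
{
    return (uint64_t)level->slice_size_dw * 4;
}
```

Without the cast in the getter, the multiplication would be done in 32 bits and any slice of 4 GiB or more would wrap around.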
Marek Olšák
474b4a9191 ac: pack ac_surface better
r600_texture: 1736 -> 1488 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:12:38 +01:00
Dave Airlie
043d14db30 ac/nir: don't write tcs outputs to LDS that aren't read back.
If the TCS doesn't read back the outputs, there is no need to store
them to LDS in the first place (except for tess factors).

This seems to give about 50fps (3290->3330) with tessellation demo.

I haven't tested if it impacts DoW3 at all.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-27 13:50:24 +10:00
Boyuan Zhang
436a3f8d6d radeon/common: add vcn enc ip info query
New ip info query is needed for vcn encode

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-11-17 12:25:47 -05:00
Timothy Arceri
b73ce64fb8 ac: add gs_{prim,invocation}_id to the abi
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-16 10:54:03 +11:00
Dylan Baker
46a7fdd7ca meson: Remove build_by_default from amd code
This is the same logic as the previous two patches.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-13 13:43:20 -08:00
Timothy Arceri
8fe6abd964 ac: add emit_vertex to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-12 11:08:26 +11:00
Dave Airlie
6bec8bcd79 ac/nir: add support for all intrinsics. (v2)
This is derived from tgsi/radeonsi code from the GLSL intrinsics.

This should pre-fix radv for the upcoming spirv patches.

v2: actually use wait_cnt, sleep deprived dad time! (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-09 01:25:59 +00:00
Marek Olšák
7f33e94e43 amd/addrlib: update to latest version
This uses C++11 initializer lists.

I just overwrote all Mesa files with internal addrlib and discarded
hunks that we should probably keep, but I might have missed something.

The code depending on ADDR_AM_BUILD is removed. We can add it back next
time if needed.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 00:55:13 +01:00
Marek Olšák
cde664ab81 radeonsi: use ac_create_target_machine
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:38 +01:00
Marek Olšák
81f81fdb54 radeonsi: use ac_get_llvm_processor_name
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:36 +01:00
Marek Olšák
24e9004708 radeonsi: remove unused field in the PCI ID table
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-11-07 17:26:36 +01:00
Dave Airlie
0084f4a422 ac/nir: for ubo load use correct num_components
I was hacking something stupid in Doom and hit an assert for the bitcast
following this. It definitely looks like this should be the number of 32-bit
components, not the instr-level ones.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-07 14:54:19 +10:00
Timothy Arceri
6e2eb96b64 ac: remove the remaining duplicate llvm types
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
e73a467005 ac: remove unused v4f32
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
7f4966731f ac: add v2f32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
cd6cfd1095 ac: use the ac f16 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00