Commit graph

371 commits

Author SHA1 Message Date
Nicolai Hähnle
28634ff7d3 ac/nir: pass ac_nir_context to emit_ddxy
Allocating the ddxy_lds is considered to be part of the API shader
translation and not part of the ABI.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:41 +02:00
Nicolai Hähnle
c5f3912e13 ac/nir: pass ac_nir_context to SSBO intrinsic handlers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:41 +02:00
Nicolai Hähnle
b78eae6f2a ac/nir: load buffer descriptors via ac_shader_abi::load_ssbo
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:40 +02:00
Nicolai Hähnle
aa66fec47e ac/nir: pass ac_nir_context to emit_discard_if
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:40 +02:00
Nicolai Hähnle
4ba201ee36 ac/nir: extract shader_info->fs.can_discard from NIR shader info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:40 +02:00
Nicolai Hähnle
9061dca872 ac/nir: handle old-style shadow tex instructions correctly
The first element is only extracted for new-style shadow tex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:39 +02:00
Nicolai Hähnle
07597632a5 ac/nir: whitespace fixes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:39 +02:00
Nicolai Hähnle
ba06e8bbe8 ac/nir: use shader_info pass to determine whether instance_id is used
This improves the separation of ABI and NIR translation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:39 +02:00
Nicolai Hähnle
be0488a173 ac/nir: move setting shader_info->fs.writes_memory to radv-specific code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:39 +02:00
Nicolai Hähnle
f37f9aed84 ac/nir: add image and write parameter to ac_shader_abi::load_sampler_desc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:38 +02:00
Nicolai Hähnle
b36b6f76fa ac/nir: add support for arrays-of-arrays to get_sampler_desc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:38 +02:00
Nicolai Hähnle
35b7b3a80f ac/nir: pass ac_nir_context to tex_fetch_ptrs and related functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:37 +02:00
Nicolai Hähnle
6ff5317589 ac/nir: add and use ac_shader_abi::load_sampler_desc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:37 +02:00
Nicolai Hähnle
57fbf3f9eb ac/nir: pass ac_nir_context to visit_tex and various related functions
Get most of the churn out of the way before actually loading samplers
via the ABI.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:37 +02:00
Nicolai Hähnle
7763c7b2ba ac/nir,radeonsi: add ac_shader_abi::chip_class
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:37 +02:00
Nicolai Hähnle
d007919d99 ac/nir,radeonsi: add ac_shader_abi::load_ubo
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:36 +02:00
Nicolai Hähnle
220ed150bc ac/nir: pass ac_nir_context to visit_load_ubo_buffer
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
df62e5eed0 ac/nir: pass ac_nir_context to visit_{load,store}_var and get_deref_offset helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
e139705c98 ac/nir: pass ac_llvm_context to some helper functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
cb96a36b04 ac/nir: pass ac_nir_context to visit_intrinsic
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
48737e1890 ac/nir: add ac_nir_context::main_function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
2be774b196 ac/nir: split scanning outputs from setting up output allocas
The scanning phase sets the driver_location, because it is part of the
ABI: radeonsi does the assignment differently.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:34 +02:00
Nicolai Hähnle
1a508cf8d6 ac/nir: pass ac_llvm_context to *build_alloca* helpers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:34 +02:00
Nicolai Hähnle
b99a169869 ac/nir: use ac_shader_abi::emit_outputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:34 +02:00
Nicolai Hähnle
0c3b6a4bd9 ac,radeonsi: add ac_shader_abi::emit_outputs for hardware VS shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:34 +02:00
Nicolai Hähnle
9df23db13d radeonsi: translate NIR to LLVM
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:33 +02:00
Nicolai Hähnle
73c7e92d3a ac/nir: add ac_shader_abi::inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:32 +02:00
Nicolai Hähnle
b2367cfcc7 ac/nir: begin splitting off ac_nir_context
The eventual goal is to hide all radv-specific details behind
ac_nir_context::abi, so that the NIR->LLVM code can be re-used by
radeonsi.

During development, we live with a partial split, where some of the
NIR->LLVM code still relies on linking back to the nir_to_llvm_context
(which should ultimately be renamed to reflect that it's radv-specific).
The idea is to get rid of these backlinks over time.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:32 +02:00
Nicolai Hähnle
fa5ae8db2e ac/nir: start using ac_shader_abi
v2: update for LLVMValueRefs in ac_shader_abi

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:31 +02:00
Nicolai Hähnle
61ad2f13c3 ac,radeonsi: move some VS input descriptions to ac_shader_abi
v2: use LLVM values instead of function parameter indices

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:31 +02:00
Dave Airlie
e77ff11ffe radv/ac: port SI TC L1 write corruption fix.
This ports 72e46c988 to radv.
    radeonsi: apply a TC L1 write corruption workaround for SI

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-26 23:39:24 +01:00
Dave Airlie
a81e99f50a radv/ac: realign SI workaround with radeonsi.
This ports: da7453666a
radeonsi: don't apply the Z export bug workaround to Hainan
to radv.

Just noticed in passing.

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-26 23:38:17 +01:00
Marek Olšák
5e81df0f10 ac/surface: fix hybrid graphics where APU=GFX9, dGPU=older
v2: don't do it for compressed textures (bpp = 0)

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-07-26 19:53:26 +02:00
Dave Airlie
80562f2b77 ac/gpu: add code to detect if kernel supports sync objects.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-21 21:31:54 +01:00
Connor Abbott
91dd2ca99f ac/nir: rewrite shared variable handling (v2)
Translate the NIR variables directly to LLVM instead of lowering to a
TGSI-style giant array of vec4's and then back to a variable. This
should fix indirect dereferences, make shared variables more tightly
packed, and make LLVM's alias analysis more precise. This should fix an
upcoming Feral title, which has a compute shader that was failing to
compile because the extra padding made us run out of LDS space.

v2: Combine the previous two patches into one, only use this for shared
variables for now until LLVM becomes smarter.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
2017-07-17 14:16:03 -07:00
Marek Olšák
3d1a576fa6 ac/gpu_info: if clock crystal frequency is 0, print an error and set 1
During bring-up, this is often 0. Prevent automatic disablement of
ARB_timer_query and demotion of the OpenGL version to 3.2 by setting
a non-zero frequency. Print an error message instead.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:59 -04:00
Marek Olšák
ddbd2f4c54 ac/surface/gfx9: flags.texture currently refers to TC-compatible HTILE
This should lead to better MSAA performance on GFX9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:46 -04:00
Marek Olšák
4560f2b90a radeonsi: merge si_llvm_get_amdgpu_target into ac_get_llvm_target
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Dave Airlie
acf1e132af amd/addrlib: fix typo in api name.
This fixes the misspelling of ALIGNMENTS in addrlib.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:44:14 +01:00
Dave Airlie
f8d5b377c8 radv: set cb base tile swizzles for MRT speedups (v4)
This patch uses addrlib to workout the tile swizzles according
to the surface index. It seems to produce the same values as
amdgpu-pro for the deferred test.

v2: don't apply swizzle to CMASK. the eg docs don't mention
it, and we clearly don't align cmask for that.
v3: disable surf index for dedicated images, as these will
most likely be shared, and I don't think the metadata has
space for this info in it yet.
v4: update for shareable images, rename combined_swizzle
to tile_swizzle

This gets the deferred demo from 730->950fps on my rx480.
(dcc cmask elim predication patches get it further)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:43:41 +01:00
Dave Airlie
7b5f2e0070 radv/ac: drop setting xnack
Since radv uses compute rings and we can't know when we are setting
up the shaders what ring they are to be used on, we should just use
the default xnack setting. This may be suboptimal in some places,
but if we hit a problem, we likely should try and address this
between llvm and mesa.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-09 22:21:43 +01:00
Dave Airlie
edf2acbeb1 radv: add support for using addrlib max alignment.
Rather than using 64k, use what addrlib returns as the base
alignment for vulkan allocations.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-09 22:17:59 +01:00
Alex Smith
c2a5cb6427 ac/nir: Fix ordering of parameters for image atomic cmpswap intrinsics
The NIR parameters are ordered "compare, data", matching GLSL, but both
the image and buffer LLVM intrinsics take them the other way around.
This is already handled correctly for SSBO atomics.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
2017-07-07 00:57:25 +02:00
Dave Airlie
09d7c7be4f radv: enable sisched toggle in perftest flags.
RADV_PERFTEST=sisched

to enable it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:07:49 +01:00
Dave Airlie
d97275e42c ac/llvm: set xnack like radeonsi does.
Use family, but only set xnack+ for gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:07:45 +01:00
Dave Airlie
01e958d631 ac/llvm: create features list using snprintf.
Just more moving code around before adding things to it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:06:04 +01:00
Dave Airlie
9d9f051390 ac/radv: change api to create target machine
This just modifies the API to make it easier to add other flags
to target machine creation.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:05:59 +01:00
Bas Nieuwenhuizen
860a8e6b99 ac/nir: Move VS position exports before param exports.
According to Nicolai the SX can already start work when all
the position exports are done, so do those first.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-07-05 20:23:00 +02:00
Connor Abbott
2ec77f7a3c ac/nir: fix 64-bit shifts
NIR always makes the shift amount 32 bits, but LLVM asserts if the two
sources aren't the same type. Zero-extend the shift amount to make LLVM
happy.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-07-03 11:58:59 -07:00
Connor Abbott
7168425dd7 ac/nir: implement 64-bit packing and unpacking
We implement the split opcodes, and tell NIR to lower the original ones.
The lowering to LLVM is a little more complicated, but NIR can optimize
the split ones a little better, and some NIR lowering passes that we
might want to use (particularly for doubles) emit the split ones.

This should fix pack/unpackDouble2x32, which seems like a bug since when
we enabled the Float64 capability. It will also fix pack/unpackInt2x32
when we enable the Int64 capability.

Fixes: 798ae37c ("radv: Enable Float64 support.")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-07-03 11:58:58 -07:00