NIR always makes the shift amount 32 bits, but LLVM asserts if the two
sources aren't the same type. Zero-extend the shift amount to make LLVM
happy.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We implement the split opcodes, and tell NIR to lower the original ones.
The lowering to LLVM is a little more complicated, but NIR can optimize
the split ones a little better, and some NIR lowering passes that we
might want to use (particularly for doubles) emit the split ones.
This should fix pack/unpackDouble2x32, which seems like a bug since when
we enabled the Float64 capability. It will also fix pack/unpackInt2x32
when we enable the Int64 capability.
Fixes: 798ae37c ("radv: Enable Float64 support.")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We apparently still used v16i8 ....
As radeonsi doesn't use it with LLVM version checks I don't think
we need them either.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The buffer intrinsics should be used instead of the image ones.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
When using cmpswap on an image, it was being trunctated to
lvm.amdgcn.image.atomic.cmpswa, with the coords type missing entirely.
v2: Add stable CC
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Booleans in NIR are ~0 for true, b2i returns 0/1.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
LLVM has required an i1 here for a long time. llvm.ctlz.* was fixed in
commit edd23e0606 ("ac/llvm: fix various findMSB bugs").
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Noticed in passing.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Just noticed in passing.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The parses skips the line if it contains parentheses.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The very last entry in the sid_strings_offsets table ended up missing,
leading to out-of-bounds reads and potential crashes.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The commit did not add the relevant includes - in particular
stdint.h and stdbool.h for the respective standard types.
At the same time, the amdgpu_device_handle typedef redeclaration was
off.
Fixes: 81945ded0d ("ac: remove amdgpu.h dependency")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101471
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Gregor Münch <gr.muench@gmail.com>
Reported-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reported-by: Gregor Münch <gr.muench@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Add a couple of forward declarations and drop the amdgpu.h requirement.
With this we can build the r300 and r600 drivers without the need for
amdgpu.
v2:
- Add amdgpu.h include in the C file (Marek)
- Add a comment about pre C11 typedef redeclaration warning (Eric)
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101189
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
has_hw_decode is assigned twice.
Pointed out by coverity.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Also solve "outinfo may be used uninitialized" warning by putting in an
unreachable().
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Most functions are only inspecting nir, so nir related arguments can be
marked const. Some more can be done if/when some nir changes are
accepted.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This just moves this code in here to it's cleaner.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of having the fragile code to do a second pass, just
give the pointers you want params in to the initial code,
then call a later pass to assign them.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Just pass a pointer and increment inside the function,
makes the code less error prone.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Simple search for a backslash followed by two newlines.
If one of the newlines were to be removed, this would cause issues, so
let's just remove these trailing backslashes.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We always compute HTILE size using addrlib, even when not TC compatible.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlied <airlied@redhat.com>
This just adds the strings and includes the gfx9 register defs
in some files that we need them in.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is mostly mechanical changes of renaming types and introducing
"legacy" everywhere.
It doesn't use the ac_surface computation functions yet.
Reviewed-by: Dave Airlie <airlied@redhat.com>
This ports: 55445ff189 from radeonsi
radeonsi: tell LLVM not to remove s_barrier instructions
LLVM 5.0 removes s_barrier instructions if the max-work-group-size
attribute is not set. What a surprise.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is needed to add the max workgroup size attribute.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
BOs larger than the minimum fragment size should have their VA
alignet to at least the fragment size for optimal performance.
v2: drop unused leftover from initial implementation
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>