Fix loading of a 3x16 vector as a single 48-bit load
on big-endian systems (PPC64, S390).
Roland Scheidegger's commit e827d91756 plus Ray Strode's patch reduce
pre-Roland Piglit failures from ~4000 to ~2000. This patch fixes three
of the four regressions observed by Ray:
- draw-vertices
- draw-vertices-half-float
- draw-vertices-half-float_gles2
One regression remains:
- draw-vertices-2101010
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
lp_build_fetch_rgba_soa fetches a texel from a texture.
Part of that process involves first gathering the element
together from memory into a packed format, and then breaking
out the individual color channels into separate, parallel
arrays.
The code fails to account for endianness when reading the packed
values.
This commit attempts to correct the problem by reversing the order
in which the packed values are read on big-endian systems.
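The ordering problem can be sketched in plain C. This is only an
illustration under simplifying assumptions, not the llvmpipe code: it
uses a 4x8-bit texel loaded as one 32-bit word, and the helper names
(host_is_big_endian, load_packed, extract_channel) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if the host stores the most significant
 * byte of a multi-byte integer first. */
static bool host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Load four consecutive 8-bit channels as a single 32-bit word. */
static uint32_t load_packed(const uint8_t texel[4])
{
    uint32_t packed;
    memcpy(&packed, texel, sizeof packed);
    return packed;
}

/* Extract channel `chan`, where 0 is the first byte in memory.
 * On little-endian hosts the first byte lands in the low bits of the
 * word; on big-endian hosts it lands in the high bits, so the shift
 * order has to be reversed. */
static uint8_t extract_channel(uint32_t packed, unsigned chan)
{
    assert(chan < 4);
    unsigned shift = host_is_big_endian() ? 8 * (3 - chan) : 8 * chan;
    return (uint8_t)(packed >> shift);
}
```

With the endian-dependent shift, extract_channel(load_packed(texel), 0)
returns texel[0] on both kinds of host; a fixed shift of 8 * chan would
return texel[3] on a big-endian machine.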
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ray Strode <rstrode@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
When the kernel supports it, set the local flag and
stop adding those BOs to the BO list.
Can probably be optimized much more.
v2: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
For lower overhead in the CS ioctl.
Winsys allocators are not used with interprocess-sharable resources.
v2: It shouldn't crash anymore, but the kernel will reject the new flag.
v3 (christian): Rename the flag, avoid sending those buffers in the BO list.
v4 (christian): Remove setting the kernel flag for now
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Only useful when that debug option is enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This fixes a rendering issue with Hitman when bindless textures
are enabled.
Fixes: 2263610827 ("radeonsi: flush DB caches only when transitioning from DB to texturing")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
If llvmpipe_set_scissor_states() is never called, we still need to be sure
that derived scissor/clip state is updated. As of commit 743ad599a9
that function might not be called.
Fixes regressed Piglit gl-1.0-scissor-offscreen -fbo -auto test.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101709
Fixes: 743ad599a9 ("st/mesa: don't set 16 scissors and 16 viewports if they're unused")
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
The discard range codepath takes precedence, so if we get both
unsynchronized and discard_range, choose unsynchronized.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The previous expression was the same as the one used for
TGSI_SEMANTIC_SUBGROUP_GT_MASK. This fixes the direction of the
inequality for TGSI_SEMANTIC_SUBGROUP_LT_MASK.
before:
bit index > TGSI_SEMANTIC_SUBGROUP_INVOCATION
after:
bit index < TGSI_SEMANTIC_SUBGROUP_INVOCATION
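The two inequalities correspond to the following masks. This is a
minimal sketch assuming a 64-wide subgroup held in a uint64_t; the
function names are hypothetical and this is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Bits set for every invocation whose index is > `invocation`. */
static uint64_t subgroup_gt_mask(unsigned invocation)
{
    assert(invocation < 64);
    /* Shifting a 64-bit value by 64 is undefined in C, so the last
     * invocation is handled separately. */
    return invocation == 63 ? 0 : ~UINT64_C(0) << (invocation + 1);
}

/* Bits set for every invocation whose index is < `invocation`. */
static uint64_t subgroup_lt_mask(unsigned invocation)
{
    assert(invocation < 64);
    return (UINT64_C(1) << invocation) - 1;
}
```

The two masks never overlap, and together with the invocation's own
bit they cover the whole subgroup, which is a quick way to check that
the inequality points the right way.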
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This reverts commit fc99cb3c9e.
"The performance went down from 64.7 to 51.4 fps in Valley and from 30.8 to
25.1 fps in Heaven on Radeon HD 7970. Other games seem to have also a 10-25%
performance decrease."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102429
It looks like we can't use the raster config values from the kernel.
Found by code inspection.
Fixes: c9e8b49b88 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
In two places we called pipe_resource_reference() to remove a reference
to a vertex buffer resource. But we neglected to check if the buffer was
a user buffer and not a pipe_resource. This caused us to pass an invalid
pipe_resource pointer to pipe_resource_reference().
Instead of calling pipe_resource_reference(&vbuf->resource, NULL), use
pipe_vertex_buffer_unreference(&vbuf) which checks the is_user_buffer
field and does the right thing.
Also, explicitly set the is_user_buffer field to false after setting the
vbuf->resource pointer to out_buffer.
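The pattern can be illustrated with mock types. Everything prefixed
with mock_ below is a hypothetical stand-in written for this sketch,
not the Gallium definitions; it only shows why the unreference helper
has to look at is_user_buffer before touching a reference count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted resource. */
struct mock_resource {
    int refcount;
};

/* Hypothetical vertex buffer: either a counted resource or a raw
 * user pointer, selected by is_user_buffer. */
struct mock_vertex_buffer {
    bool is_user_buffer;
    union {
        struct mock_resource *resource; /* valid when !is_user_buffer */
        const void *user;               /* valid when is_user_buffer */
    } buffer;
};

/* Drop one reference and clear the pointer. */
static void mock_resource_unref(struct mock_resource **res)
{
    if (*res) {
        assert((*res)->refcount > 0);
        (*res)->refcount--;
    }
    *res = NULL;
}

/* Release whatever the vertex buffer holds. A user pointer is not
 * reference counted, so it is only cleared. */
static void mock_vertex_buffer_unreference(struct mock_vertex_buffer *vb)
{
    if (vb->is_user_buffer)
        vb->buffer.user = NULL;
    else
        mock_resource_unref(&vb->buffer.resource);
}
```

Calling mock_resource_unref() directly on a buffer that holds a user
pointer would treat arbitrary user memory as a resource, which is the
class of bug described above.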
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102377
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Bruce Cherniak <bruce.cherniak@intel.com>
If we're rendering to a format without alpha, convert DST_ALPHA blend to
a ONE so that factors are properly computed. This same workaround is
done on a3xx+ as well.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
All of the coordinates and LOD args are integers for TXF. This mostly
doesn't matter, except for converting into a levelZero=true operation by
removing an explicit zero LOD. For the comparison against zero to work
properly, the sType of the instruction has to be set correctly.
Fixes: KHR-GL45.robust_buffer_access_behavior.texel_fetch
Reported-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Since the encoder only supports de-interlaced buffers.
v2: move to parameter call to tell dec/enc
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Only copy this value when in restart drawing mode.
Eliminates valgrind errors when running trivial programs.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
They are only used for debug info.
Together with making tgsi_opcode_info::opcode a bitfield, this reduces
the size of tgsi_opcode_info on 64-bit systems from 24 bytes to 4 bytes,
and makes the whole data structure a bit more linker friendly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
So we can easily re-arrange members of tgsi_opcode_info, and readers of
the code don't have to guess what all the 0s mean.
Mostly done with regex search&replace.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
It's not clear why they were ever 2 bits to begin with. Perhaps
the original intent was to use signed values, but that doesn't
seem to have ever been the case in master.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Various index-related fields are only initialized when required, so
they should only be dumped in those cases.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
When assertions were disabled, the compiler removed
the call to util_idalloc_alloc() and the first allocated
bindless slot was 0 which is invalid per the spec.
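The pitfall is a call with side effects placed inside assert(). A
minimal sketch follows, using a hypothetical allocator (id_alloc,
id_alloc_get) rather than the real util_idalloc API:

```c
#include <assert.h>

/* Hypothetical stand-in for an ID allocator: hands out 0, 1, 2, ... */
struct id_alloc {
    unsigned next;
};

static unsigned id_alloc_get(struct id_alloc *a)
{
    return a->next++;
}

/* Reserve slot 0 so it is never handed out as a real slot. */
static void reserve_slot_zero(struct id_alloc *a)
{
    /* Buggy form: assert(id_alloc_get(a) == 0);
     * With NDEBUG defined the whole expression, including the call,
     * is compiled out and slot 0 is never reserved. */
    unsigned reserved = id_alloc_get(a);
    assert(reserved == 0);
    (void)reserved; /* avoid an unused-variable warning under NDEBUG */
}
```

Keeping the call outside the assertion makes the allocation happen in
both debug and release builds, so the first slot handed out afterwards
is 1.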
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Pass the dri.sym version script to the linker. This ensures only
explicitly exported symbols are exported and shrinks the library by up
to 60KB.
HAVE_DLADDR also needs to be set so that __driDriverExtensions is defined.
We need to pass "--undefined-version" because the Android build system
sets --no-undefined-version by default; without the option we get an
error on driver-specific symbols if those drivers are disabled.
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>