This still treats everything as RGBA8888 for the most part, same as
before. This is a prerequisite for handling other texture formats, since
only RGBA8888 has a raster-layout mode.
It meant that LUMALPHA was being marked as *many* miplevels, and
unsurprisingly wouldn't validate. On the other hand, some miplevel counts
wouldn't get the small mips validated at all.
This change will double cache size for branches which have a lower
LP_MAX_SHADER_VARIANTS limit (it will not do anything on master).
The reason is that nowadays shaders tend to be quite a bit larger than they
were (they were big when llvmpipe didn't have a fs loop, got much smaller with
that loop, and since then have gradually increased quite a bit though still
smaller than without the fs loop for various reasons - among them being d3d10
compliance, usage of 8-wide vectors, non-swizzled blend code). Thus effectively
less shaders would be cached (unless they were very small and the variant limit
was hit first). Also, since we're getting rid of the IR nowadays, the cached
shaders shouldn't need all that much memory actually.
If only the flat/smooth shade state changed between
two render calls the prior code would miss updating the
hardware state.
Also add check for sprite coord, potentially same type
of issue otherwise for it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81967
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
cs_vertex_buffer_state.enabled_mask and
cs_vertex_buffer_state.dirty_mask are both updated when
r600_set_constant_buffer() is called, so we don't need to manually
update these values.
This fixes a crash with OpenCL programs that have a kernel with no
arguments.
https://bugs.freedesktop.org/show_bug.cgi?id=82671
CC: "10.2" <mesa-stable@lists.freedesktop.org>
SB needs a bit of special handling to handle
instructions without obvious side effects, to
avoid it deleting them.
Fixes failing non-const ARB_gpu_shader5
textureOffsets piglits with sb enabled.
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Based on the old code, the new layout code describes the layout with the new,
well-documented, ilo_layout. It also gains new features such as MCS support
and extended ARYSPC_LOD0 that i965 comes up with (see
6345a94a9b).
Instead create a staging texture with pipe_buffer_create and
PIPE_USAGE_STAGING.
u_upload_mgr sets the usage of its staging buffer to PIPE_USAGE_STREAM.
But since 150ac07b85 CPU -> GPU streaming buffers
are created in VRAM. Therefore the staging texture (in VRAM) does not offer any
performance improvements for buffer downloads.
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
... and store the value in intel_winsys_info/ilo_dev_info.
Suggested-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
olv: check for errors and report raw values
The SWIZZLE_1 of the winsys destination was dereffing off the end of the
array, which surprisingly often worked out (since nobody reads the
rendered value anyway, so whatever junk was referenced in the QIR didn't
matter), but shader dumping would sometimes segfault.
We weren't accounting for the level 0 offset in the texture setup (so it
only worked if it happened to be a single-level texture), and doing so
required that we get the level 0 offset page aligned so that the offset
bits don't get interpreted as the texture format and such.
I had the right viewports in vc4_emit.c, but grabbed the wrong values in
the uniform setup, so primitives would claim to be in the wrong parts of
the screen. (The vc4_emit.c state looks like it just decides how big the
clipping guardband is).
This gets fbo-viewport closer to working (which still has the problem that
the HW is always guard-band clipping), and fixes inverted FBO rendering in
general.
Passes blendminmax and blendsquare. glean's more serious blendFunc fails
in simulation due to binner memory overflow (I really need to work around
that), and fbo-blending-formats fails due to Mesa refusing one of the
getter requests, even before it could fail due to the driver not actually
supporting different formats yet.
The hw_mask is the set of primitives you actually support, so this attempt
to provide the set of formats that's unsupported was wrong in two ways (it
was intended to be '~' not '!'). However, we only call this code when
prim isn't one of the actually supported hw_mask bits, so missing out on
the memcpy didn't matter anyway.