Tell LLVM the exact alignment we can guarantee, based on the fs block
dimensions, pixel format, and the alignment of the resource base pointer
and stride.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
The fs shader now depends on the color buffer formats. The shader key was
extended to accommodate this, but llvmpipe_update_derived needs to be
updated to check the framebuffer dirty flag.
This fixes bug 57674.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
I fixed the only known bugs on r500 with 0222b2bd41.
Now there are no piglit regressions with Hyper-Z and all apps I tested seem
to work.
To summarize how it works:
- Only one process can use it at a time. This is a hardware limitation.
- The first process to clear a zbuffer gets the exclusive access to use
Hyper-Z.
- Compositors don't use any zbuffer, so they won't steal it, but some web
browsers do, so make sure there's no web browser running if you want your
game to use Hyper-Z.
- There's no need to restart an app which couldn't get the access to Hyper-Z.
Just quit the app which took it, the driver can turn it on for the other app
in the middle of rendering.
- If an app gets the access to Hyper-Z, it prints "radeon: Acquired Hyper-Z"
to stdout.
r300-r400:
Hyper-Z will be enabled by default on r300-r400 once sufficient testing is
done with piglit and Lightsmark at least.
Be sure to set the env var RADEON_HYPERZ and run piglit with parameters: -c 0
This fixes wrong rendering in Lightsmark and
the piglit/depthstencil-render-miplevels.
I think I fixed Hyper-Z. So far every app seems to work like a charm.
The border clamping code is unnecessary, since we don't care if a wrapped
coord value is -1 or <-1 (same for length vs. >length), in either case the
border handling code will mask out the offset and replace the texel value with
the border color.
Note that technically this is not entirely correct. Omitting clamping on the
float coords means that flt->int conversion may result in undefined values for
values of very large magnitude.
However there's no reason we should honor this here since:
a) we don't care for that for ordinary wrap modes in the aos code when
converting coords and the problem is worse there (as we've got only
effectively 24 instead of 32bits)
b) at least in some cases the clamping was done already in int space hence
doing nothing to fix that problem.
c) with sse2 flt->int conversion with such values results in 0x80000000 which
is just perfect (for clamp to border - not so much for the ordinary clamp to
edge).
Reviewed-by: Brian Paul <brianp@vmware.com>
all unsigneds are >= 0 :-)
There may be an argument for leaving this in, in case someone
changes min_lod to an integer, so feel free to apply or drop.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
I haven't confirmed this is doing the correct thing, but at
least this might make someone review it!
Reported by internal RH coverity scan.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
To fix a pipe_context::surface_destroy() use-after-free problem.
We previously added pipe_sampler_view_release() for similar reasons.
Note: this is a candidate for the stable branches.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Completely forgot about updating Makefile when removing it. Stephane
already fixed the make build, but there were a few mentions of
lp_tile_soa left in the tree.
This should help avoid confusion now that we're using the gl_api enum
to distinguishing between core and compatibility API's. The
corresponding enum value for core API's is API_OPENGL_CORE.
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
The current implementation was close by not fully correct: several
operations that should be done in floating point were being done in
integer.
Fixes piglit fbo-clear-formats GL_ARB_texture_float
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
untested (couldn't get the piglit test to run even with version overrides)
but seemed blatantly wrong.
In any case it would only affect an error case which when it would happen
probably all hope is lost anyway.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
This adds array (1d,2d) texture support to llvmpipe.
Though probably should do something about 1d array textures requiring gobs
of memory (this issue is not strictly limited to arrays but it is probably
worse there).
Initial code by Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Support 1d and 2d array textures (including shadow samplers),
and (as a side effect mostly) also shadow cube samplers.
Seems to pass the relevant piglit tests both for sampling and rendering
to (though some require version overrides).
Since we don't support render target indices rendering to array textures
is still restricted to a single layer at a time.
Also, the min/max layer in the sampler view (which is unnecessary for GL)
is ignored (always use all layers).
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Now dead code.
Also had to remove the show_tiles/show_subtiles because now the color
buffers are always stored in their native format, so there is no longer
an easy way to paint the tile sizes.
Depth-stencil buffers are still swizzled.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Update llvmpipe_is_format_supported and llvmpipe_is_format_unswizzled
so that only the formats that we can render without swizzling are
advertised.
We can still render all D3D10 required formats except
PIPE_FORMAT_R11G11B10_FLOAT, which needs to be implemented in a future
opportunity.
Removal of rendertarget swizzling will be done in a subsequent change.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
It is buggy (it was giving wrong results for some of the formats with
padding), and util_format_description::is_array already does precisely
what's intended.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This is what we want in practice.
The only change is in PIPE_FORMAT_R8SG8SB8UX8U_NORM, which no longer is
considered an array format.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This patch fixes various format manipulation for big-endian
architectures.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch fixes various format manipulation for big-endian
architectures.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch adds two more functions in type conversions header:
* lp_build_bswap: construct a call to llvm.bswap intrinsic for an
element
* lp_build_bswap_vec: byte swap every element in a vector base on the
input and output types.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch fixes the vector constant generation used for vector shuffle
for big-endian machines.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch enforces the clear of NJ bit in VSCR Altivec register so
denormal numbers are handles as expected by IEEE standards.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch adds Altivec intrinsics for float vector types. It changes
the SSE specific definitions to a platform neutral and adds the calls
to Altivec intrinsic builder.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch add correct vector addition and substraction intrisics when
using Altivec with PPC. Current code uses default path and LLVM backend
ends up issuing carry-out arithmetic instruction while it is expected
saturated ones.
It also includes a fix for PowerPC where char are unsigned by default,
resulting in bogus values for vector shifting.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch adds PPC Altivec support for pack/unpack operations using Altivec
supported vector type (8xi8, 16xi16, 4xi32, 4xf32).
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Fixes 7 piglit tests, and prevents many more from crashing.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-and-Tested-by: Christian König <christian.koenig@amd.com>
The rest of the plumbing was in place already.
I have tested this by turning on all GL 3.1 features.
The drivers not supporting GL 3.1 will fail to create a core profile
as they should.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Add a DEBUG_FREED_MEMORY option to help catch use-after-free errors.
Add debug_memory_check() function which can be periodically called to
check that all known blocks are good.
Reviewed-by: José Fonseca <jfonseca@vmware.com>