Commit graph

17182 commits

Author SHA1 Message Date
José Fonseca
6a2f2300a8 llvmpipe: Refactor convert_to/from_blend_type to convert in place.
This fixes the "Source and destination overlap in memcpy" valgrind
warnings.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-12-03 14:02:43 +00:00
José Fonseca
03aa3fd54b llvmpipe: Improve color buffer loads/stores alignment.
Tell LLVM the exact alignment we can guarantee, based on the fs block
dimensions, pixel format, and the alignment of the resource base pointer
and stride.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-12-03 14:02:43 +00:00
José Fonseca
0bc6ec238b llvmpipe: Recompute the fs shader key when framebuffer varies.
The fs shader now depends on the color buffer formats. The shader key was
extended to accommodate this, but llvmpipe_update_derived needs to be
updated to check the framebuffer dirty flag.

This fixes bug 57674.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-12-03 14:02:43 +00:00
Marek Olšák
54ff536823 r300g: increment num_z_clears only if we have Hyper-Z 2012-12-02 22:22:39 +01:00
Marek Olšák
838b19609f r300g: add blacklist for apps that shouldn't steal hyperz access 2012-12-02 22:18:11 +01:00
Marek Olšák
12dcbd5954 r300g: enable Hyper-Z by default on r500
I fixed the only known bugs on r500 with 0222b2bd41.
Now there are no piglit regressions with Hyper-Z and all apps I tested seem
to work.

To summarize how it works:
- Only one process can use it at a time. This is a hardware limitation.
- The first process to clear a zbuffer gets the exclusive access to use
  Hyper-Z.
- Compositors don't use any zbuffer, so they won't steal it, but some web
  browsers do, so make sure there's no web browser running if you want your
  game to use Hyper-Z.
- There's no need to restart an app which couldn't get the access to Hyper-Z.
  Just quit the app which took it, the driver can turn it on for the other app
  in the middle of rendering.
- If an app gets the access to Hyper-Z, it prints "radeon: Acquired Hyper-Z"
  to stdout.

r300-r400:
  Hyper-Z will be enabled by default on r300-r400 once sufficient testing is
  done with piglit and Lightsmark at least.
  Be sure to set the env var RADEON_HYPERZ and run piglit with parameters: -c 0
2012-12-02 18:07:26 +01:00
Marek Olšák
0222b2bd41 r300g: clear the ZB cache before clearing ZMASK or HIZ
This fixes wrong rendering in Lightsmark and
the piglit/depthstencil-render-miplevels.

I think I fixed Hyper-Z. So far every app seems to work like a charm.
2012-12-02 07:07:33 +01:00
Marek Olšák
62cba629c0 Revert "r300g: fix occlusion queries when depth test is disabled or zbuffer is missing"
It broke Hyper-Z terribly.
2012-12-02 07:07:33 +01:00
Marek Olšák
3039addf93 st/dri: implement new driver hook flush_with_flags
v2: added documentation for dri_flush as per Brian's request
2012-12-02 00:19:02 +01:00
Marek Olšák
8ad9d42b33 r300g: refuse to create too large textures 2012-12-01 22:41:39 +01:00
Marek Olšák
e694ea09f5 r300g: fix memory leaks in texture_create error paths 2012-12-01 22:38:36 +01:00
Marek Olšák
3e3a586236 r300g: fix revoking hyperz access
The bug was uncovered by 67c8e96f5a.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57763
2012-12-01 21:43:17 +01:00
Roland Scheidegger
587bd16d0d gallivm: drop border wrap clamping code
The border clamping code is unnecessary, since we don't care if a wrapped
coord value is -1 or <-1 (same for length vs. >length), in either case the
border handling code will mask out the offset and replace the texel value with
the border color.
Note that technically this is not entirely correct. Omitting clamping on the
float coords means that flt->int conversion may result in undefined values for
values of very large magnitude.
However there's no reason we should honor this here since:
a) we don't care for that for ordinary wrap modes in the aos code when
   converting coords and the problem is worse there (as we've got only
   effectively 24 instead of 32bits)
b) at least in some cases the clamping was done already in int space hence
   doing nothing to fix that problem.
c) with sse2 flt->int conversion with such values results in 0x80000000 which
   is just perfect (for clamp to border - not so much for the ordinary clamp to
   edge).

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-01 17:05:48 +01:00
Marek Olšák
224d0e4a3f r300g: handle map flag DISCARD_WHOLE_RESOURCE
This should improve performance in apps which trigger this codepath.
(e.g. Wine does)
2012-12-01 14:33:11 +01:00
Dave Airlie
d128ae347a svga: remove pointless assert on unsigned >= 0
all unsigneds are >= 0 :-)

There may be an argument for leaving this in, in case someone
changes min_lod to an integer, so feel free to apply or drop.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-01 11:25:15 +10:00
Dave Airlie
67c8e96f5a r300g: fix comparison of hyperz flush time.
I haven't confirmed this is doing the correct thing, but at
least this might make someone review it!

Reported by internal RH coverity scan.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-12-01 11:23:48 +10:00
Brian Paul
51223784d6 util: added pipe_surface_release() function
To fix a pipe_context::surface_destroy() use-after-free problem.
We previously added pipe_sampler_view_release() for similar reasons.

Note: this is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-30 12:08:07 -07:00
José Fonseca
e7177e362e llvmpipe: Remove remnants of lp_tile_soa from Makefile.
Completely forgot about updating Makefile when removing it. Stephane
already fixed the make build, but there were a few mentions of
lp_tile_soa left in the tree.
2012-11-30 07:07:38 +00:00
Vinson Lee
f126f34c1d llvmpipe: Fix incorrect sizeof.
Fixes sizeof not portable defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-29 21:08:48 -08:00
Stéphane Marchesin
4430d44eac llvmpipe: Fix build break from 75da95c50
The Makefile looks for a file which is gone (lp_tile_soa.c)

http://bugs.freedesktop.org/show_bug.cgi?id=57713
2012-11-29 19:54:34 -08:00
Vincent Lejeune
3fcb3fbf22 r600g: mirror simplification of if/break opcodes
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-11-29 22:15:18 +01:00
Vincent Lejeune
5fda2990aa r600g: separate resource_id and sampler_id tex info in tgsi-to-llvm
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-11-29 22:15:18 +01:00
Paul Berry
dbd6135bc1 mesa: Rename API_OPENGL to API_OPENGL_COMPAT.
This should help avoid confusion now that we're using the gl_api enum
to distinguishing between core and compatibility API's.  The
corresponding enum value for core API's is API_OPENGL_CORE.

Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-29 11:33:15 -08:00
Marek Olšák
3e163a137b gallium/postprocess: share pipe_context and cso_context with the state tracker
Using one context instead of two is more efficient and
we can skip another context flush.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-29 20:31:41 +01:00
José Fonseca
9c9c18a395 gallivm: Fix lp_build_float_to_half.
The current implementation was close by not fully correct: several
operations that should be done in floating point were being done in
integer.

Fixes piglit fbo-clear-formats GL_ARB_texture_float

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 16:52:42 +00:00
Roland Scheidegger
b5918d8f1d gallivm: fix a trivial txq issue for 2d shadow and cube shadow samplers
untested (couldn't get the piglit test to run even with version overrides)
but seemed blatantly wrong.
In any case it would only affect an error case which when it would happen
probably all hope is lost anyway.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-29 15:31:46 +01:00
Roland Scheidegger
6d50148742 llvmpipe: support array textures
This adds array (1d,2d) texture support to llvmpipe.
Though probably should do something about 1d array textures requiring gobs
of memory (this issue is not strictly limited to arrays but it is probably
worse there).
Initial code by Jakob Bornecrantz <jakob@vmware.com>

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-29 15:30:19 +01:00
Roland Scheidegger
95e03914d8 gallivm: support array textures
Support 1d and 2d array textures (including shadow samplers),
and (as a side effect mostly) also shadow cube samplers.
Seems to pass the relevant piglit tests both for sampling and rendering
to (though some require version overrides).
Since we don't support render target indices rendering to array textures
is still restricted to a single layer at a time.
Also, the min/max layer in the sampler view (which is unnecessary for GL)
is ignored (always use all layers).

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-29 15:28:25 +01:00
José Fonseca
88e92f5bcd llvmpipe: Remove lp_build_blend_soa()
No longer used/necessary, as we always blend in AoS now.

Trivial.
2012-11-29 14:08:43 +00:00
José Fonseca
75da95c50a llvmpipe: Eliminate color buffer swizzling.
Now dead code.

Also had to remove the show_tiles/show_subtiles because now the color
buffers are always stored in their native format, so there is no longer
an easy way to paint the tile sizes.

Depth-stencil buffers are still swizzled.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:43 +00:00
José Fonseca
6916387e53 llvmpipe: Only advertise unswizzled formats.
Update llvmpipe_is_format_supported and llvmpipe_is_format_unswizzled
so that only the formats that we can render without swizzling are
advertised.

We can still render all D3D10 required formats except
PIPE_FORMAT_R11G11B10_FLOAT, which needs to be implemented in a future
opportunity.

Removal of rendertarget swizzling will be done in a subsequent change.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:42 +00:00
José Fonseca
9f06061d50 util/u_format: Kill util_format_is_array().
It is buggy (it was giving wrong results for some of the formats with
padding), and util_format_description::is_array already does precisely
what's intended.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:42 +00:00
José Fonseca
a47674ee89 util/u_format: Tighten the meaning of is_array bit to exclude mixed type formats.
This is what we want in practice.

The only change is in PIPE_FORMAT_R8SG8SB8UX8U_NORM, which no longer is
considered an array format.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:42 +00:00
Adhemerval Zanella
64e9ec634b util/u_format: Fix format manipulation for big-endian
This patch fixes various format manipulation for big-endian
architectures.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:54:23 +00:00
Adhemerval Zanella
e25abacc18 gallivm: Fix format manipulation for big-endian
This patch fixes various format manipulation for big-endian
architectures.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:54:18 +00:00
Adhemerval Zanella
b772d784b2 gallivm: Add byte-swap construct calls
This patch adds two more functions in type conversions header:
* lp_build_bswap: construct a call to llvm.bswap intrinsic for an
  element
* lp_build_bswap_vec: byte swap every element in a vector base on the
  input and output types.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:54:14 +00:00
Adhemerval Zanella
86902b5134 gallivm: Fix vector constant for shuffle
This patch fixes the vector constant generation used for vector shuffle
for big-endian machines.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:54:10 +00:00
Adhemerval Zanella
29ba79b2c9 gallivm: clear Altivec NJ bit
This patch enforces the clear of NJ bit in VSCR Altivec register so
denormal numbers are handles as expected by IEEE standards.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:52:05 +00:00
Adhemerval Zanella
43ce9efdbf gallivm: Altivec floating-point rounding
This patch adds Altivec intrinsics for float vector types. It changes
the SSE specific definitions to a platform neutral and adds the calls
to Altivec intrinsic builder.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:52:00 +00:00
Adhemerval Zanella
dd5c580816 gallivm: Altivec vector add/sub intrisics
This patch add correct vector addition and substraction intrisics when
using Altivec with PPC. Current code uses default path and LLVM backend
ends up issuing carry-out arithmetic instruction while it is expected
saturated ones.

It also includes a fix for PowerPC where char are unsigned by default,
resulting in bogus values for vector shifting.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:51:53 +00:00
Adhemerval Zanella
2ea7d3dabd gallivm: Altivec vector max/min intrisics
This patch adds the PPC Altivec instrics max/min instruction for
supported Altivec vector types (16xi8, 8xi16, 4xi32, 4xf32).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:51:46 +00:00
Adhemerval Zanella
31c63b058e gallivm: Altivec pack/unpack intrisics
This patch adds PPC Altivec support for pack/unpack operations using Altivec
supported vector type (8xi8, 16xi16, 4xi32, 4xf32).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-29 11:51:41 +00:00
Michel Dänzer
8b6aec6533 radeonsi: Bitcast result of packf16 intrinsic to float for export intrinsic.
Fixes 7 piglit tests, and prevents many more from crashing.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-and-Tested-by: Christian König <christian.koenig@amd.com>
2012-11-29 10:08:53 +01:00
Marek Olšák
aa46cc2879 st/mesa: allow forward-compatible contexts and set Const.ContextFlags
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-29 01:07:26 +01:00
Marek Olšák
249f86e3f8 st/mesa: add support for GL core profiles
The rest of the plumbing was in place already.

I have tested this by turning on all GL 3.1 features.
The drivers not supporting GL 3.1 will fail to create a core profile
as they should.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-29 01:07:26 +01:00
Brian Paul
0904973e39 util: add more memory debugging features
Add a DEBUG_FREED_MEMORY option to help catch use-after-free errors.
Add debug_memory_check() function which can be periodically called to
check that all known blocks are good.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-28 15:03:29 -07:00
José Fonseca
1cead8845b llvmpipe: Implement logic ops for the AoS path.
It was forgotten in the previous patch series, but it is trivial to
implement, based on the SoA path.

This fixes glean logicOp failures.
2012-11-28 20:45:18 +00:00
José Fonseca
547efc76df llvmpipe: Don't use dynamically sized arrays.
Unfortunately for MSVC arrays with a constant variable size are still
considered dynamically sized.
2012-11-28 19:58:47 +00:00
James Benton
960ab06da0 llvmpipe: Update llvmpipe_is_format_unswizzled to reflect latest changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton
66fdf626bb llvmpipe: Enable vertex color clamping.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00