Untested (I couldn't get the piglit test to run, even with version overrides),
but it seemed blatantly wrong.
In any case it only affects an error case, and if that case is ever hit
all hope is probably lost anyway.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
This adds array (1d,2d) texture support to llvmpipe.
We should probably still do something about 1d array textures requiring gobs
of memory (this issue is not strictly limited to arrays, but it is probably
worse there).
Initial code by Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Support 1d and 2d array textures (including shadow samplers),
and (as a side effect mostly) also shadow cube samplers.
Seems to pass the relevant piglit tests for both sampling from and rendering
to array textures (though some require version overrides).
Since we don't support render target indices, rendering to array textures
is still restricted to a single layer at a time.
Also, the min/max layer in the sampler view (which is unnecessary for GL)
is ignored (always use all layers).
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Now dead code.
Also had to remove the show_tiles/show_subtiles because now the color
buffers are always stored in their native format, so there is no longer
an easy way to paint the tile sizes.
Depth-stencil buffers are still swizzled.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Update llvmpipe_is_format_supported and llvmpipe_is_format_unswizzled
so that only the formats that we can render without swizzling are
advertised.
We can still render all D3D10 required formats except
PIPE_FORMAT_R11G11B10_FLOAT, which still needs to be implemented in a
future change.
Removal of rendertarget swizzling will be done in a subsequent change.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
It is buggy (it was giving wrong results for some of the formats with
padding), and util_format_description::is_array already does precisely
what's intended.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This is what we want in practice.
The only change is in PIPE_FORMAT_R8SG8SB8UX8U_NORM, which no longer is
considered an array format.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This patch fixes various format manipulation for big-endian
architectures.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch fixes various format manipulation for big-endian
architectures.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch adds two more functions in type conversions header:
* lp_build_bswap: construct a call to llvm.bswap intrinsic for an
element
* lp_build_bswap_vec: byte swap every element in a vector based on the
input and output types.
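
As an illustration of what the two helpers compute, here is a minimal
sketch in plain C++ rather than LLVM IR. The names bswap32 and bswap32_vec
are hypothetical and only mirror the scalar/vector split described above;
the real functions emit the llvm.bswap intrinsic through the LLVM builder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar helper: reverse the byte order of one 32-bit element.
// This is the operation llvm.bswap.i32 performs on a single value.
static inline uint32_t bswap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) <<  8) |
           ((v & 0x00ff0000u) >>  8) |
           ((v & 0xff000000u) >> 24);
}

// Hypothetical vector helper: byte swap every element of an array in place.
static inline void bswap32_vec(uint32_t *elems, std::size_t count)
{
    assert(elems != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
        elems[i] = bswap32(elems[i]);
}
```

For example, bswap32(0x11223344) yields 0x44332211.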
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch fixes the vector constant generation used for vector shuffle
for big-endian machines.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch enforces clearing the NJ bit in the Altivec VSCR register so
denormal numbers are handled as expected by the IEEE standard.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch adds Altivec intrinsics for float vector types. It changes
the SSE-specific definitions to platform-neutral ones and adds the calls
to the Altivec intrinsic builder.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch adds the correct vector addition and subtraction intrinsics when
using Altivec with PPC. The current code uses the default path and the LLVM
backend ends up issuing carry-out arithmetic instructions where saturating
ones are expected.
It also includes a fix for PowerPC, where char is unsigned by default,
resulting in bogus values for vector shifting.
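
To make the expected semantics concrete, here is a minimal scalar sketch
of wraparound versus saturating arithmetic on unsigned 8-bit values, plus
the char signedness pitfall. All function names are hypothetical; this
only models the arithmetic and is not the Altivec intrinsic code itself.

```cpp
#include <cassert>
#include <cstdint>

// Modular add: the result wraps around on overflow.
static inline uint8_t add_wrap_u8(uint8_t a, uint8_t b)
{
    return static_cast<uint8_t>(a + b);
}

// Saturating add: the result is clamped to the largest representable value.
static inline uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
    return static_cast<uint8_t>(sum > 255u ? 255u : sum);
}

// Saturating subtract: the result is clamped to zero instead of wrapping.
static inline uint8_t sub_sat_u8(uint8_t a, uint8_t b)
{
    return a > b ? static_cast<uint8_t>(a - b) : static_cast<uint8_t>(0);
}

// The signedness of plain char is implementation-defined, so a table of
// plain chars holding negative values reads back differently on targets
// where char is unsigned. Spelling the type as signed char keeps the value
// the same on every target.
static inline int widen_byte(signed char c)
{
    assert(c >= -128);
    return c;
}
```

With these definitions, 200 + 100 wraps to 44 but saturates to 255.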
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch adds PPC Altivec support for pack/unpack operations using the
vector types Altivec supports (16xi8, 8xi16, 4xi32, 4xf32).
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Fixes 7 piglit tests, and prevents many more from crashing.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-and-Tested-by: Christian König <christian.koenig@amd.com>
The brw_compile structure contains the brw_instruction store and the
brw_eu_emit.c state tracking fields. These are only useful for the
final assembly generation pass; the earlier compilation stages don't
need them.
This also means that the code generator for future hardware won't have
access to the brw_compile structure, which is extremely desirable
because it prevents accidental generation of Gen4-7 code.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Compiling shaders requires several main steps:
1. Generating VS IR from either GLSL IR or Mesa IR
2. Optimizing the IR
3. Register allocation
4. Generating assembly code
This patch splits out step 4 into a separate class named "vec4_generator."
There are several reasons for doing so:
1. Future hardware has a different instruction encoding. Splitting
this out will allow us to replace vec4_generator (which relies
heavily on the brw_eu_emit.c code and struct brw_instruction) with
a new code generator that writes the new format.
2. It reduces the size of the vec4_visitor monolith. (Arguably, a lot
more should be split out, but that's left for "future work.")
3. Separate namespaces allow us to make helper functions for
generating instructions in both classes: ADD() can exist in
vec4_visitor and create IR, while ADD() in vec4_generator() can
create brw_instructions. (Patches for this upcoming.)
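
The shape of the split can be sketched as follows. This is a simplified
illustration under stated assumptions: the Visitor and Generator classes,
the IrInst struct, and the 32-bit encoding are all hypothetical stand-ins,
not the actual vec4_visitor, vec4_generator, or Gen4-7 instruction format.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class Opcode { ADD, MOV };

// Hypothetical hardware-independent IR instruction.
struct IrInst { Opcode op; int dst, src0, src1; };

// Stand-in for the visitor: ADD() and MOV() here only append IR.
class Visitor {
public:
    void ADD(int dst, int a, int b) { ir.push_back({Opcode::ADD, dst, a, b}); }
    void MOV(int dst, int a) { ir.push_back({Opcode::MOV, dst, a, 0}); }
    const std::vector<IrInst> &instructions() const { return ir; }
private:
    std::vector<IrInst> ir;
};

// Stand-in for the generator: the only class that knows the encoding.
// Its ADD() and MOV() share names with the visitor's but emit machine words.
class Generator {
public:
    std::vector<uint32_t> generate(const std::vector<IrInst> &ir) const {
        std::vector<uint32_t> out;
        for (const IrInst &inst : ir) {
            switch (inst.op) {
            case Opcode::ADD:
                out.push_back(ADD(inst.dst, inst.src0, inst.src1));
                break;
            case Opcode::MOV:
                out.push_back(MOV(inst.dst, inst.src0));
                break;
            }
        }
        return out;
    }
private:
    // Made-up encoding: opcode in the top byte, then dst, src0, src1.
    static uint32_t encode(uint32_t opcode, int dst, int s0, int s1) {
        assert(dst >= 0 && dst < 256 && s0 >= 0 && s0 < 256 &&
               s1 >= 0 && s1 < 256);
        return (opcode << 24) | (static_cast<uint32_t>(dst) << 16) |
               (static_cast<uint32_t>(s0) << 8) | static_cast<uint32_t>(s1);
    }
    static uint32_t ADD(int dst, int a, int b) { return encode(0x40, dst, a, b); }
    static uint32_t MOV(int dst, int a) { return encode(0x01, dst, a, 0); }
};
```

Supporting a different instruction encoding then means writing another
generator class that consumes the same IR, leaving the visitor untouched.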
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Final code generation should never fail. This is a bug, and there
should be no user-triggerable cases where this could occur.
Also, we're not going to have a fail() method after the split.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The brw_compile structure is closely tied to the Gen4-7 hardware
encoding. However, do_vs_prog is very generic: it just calls out to
get a compiled program and then uploads it.
This isn't ultimately where we want it, but it's a step in the right
direction: it's now closer to the code generator.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
During compilation, we allocate a bunch of things: the IR needs to last
at least until code generation...and then the program store needs to
last until after we upload the program.
For simplicity's sake, just keep it all around until we upload the
program. After that, it can all be freed.
This will also save a lot of headaches during the upcoming refactoring.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
We used to steal it out of the brw_compile struct, but that won't be
initialized in time soon (and is eventually going away).
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
We used to steal it out of the brw_compile struct...but vec4_visitor
isn't going to have one of those in the future.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
This leaves only the final code generation stage in brw_vec4_emit.cpp,
moving the payload setup, run(), and brw_vs_emit functions to brw_vec4.cpp.
The fragment shader backend puts these functions in brw_fs.cpp, so this
patch also helps with consistency.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
I ran across this while running a glGenerateMipmap() test.
_meta_GenerateMipmap sets MESA_META_TRANSFORM, which causes
_mesa_meta_begin to try to set a default orthographic projection.
Unfortunately, if the drawbuffer isn't set up, ctx->DrawBuffer->Width
and Height are 0, which just causes a GL_INVALID_VALUE error.
Fixes oglconform's fbo/mipmap.automatic, mipmap.manual, and
mipmap.manualIterateTexTargets.
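
The failure mode can be sketched in isolation. glOrtho generates
GL_INVALID_VALUE when left equals right, bottom equals top, or near equals
far, so a 0x0 drawbuffer yields a degenerate volume. The struct and
function names below are hypothetical, and the guard is only one plausible
way to avoid the error; it is not necessarily how this commit does it.

```cpp
#include <cassert>

// Hypothetical stand-in for the drawbuffer dimensions.
struct DrawBufferSize { int width, height; };

// Mirrors the glOrtho validity rule: a degenerate volume is an error.
static inline bool ortho_is_valid(double left, double right,
                                  double bottom, double top,
                                  double near_val, double far_val)
{
    return left != right && bottom != top && near_val != far_val;
}

// Returns true when a default projection covering the whole drawbuffer
// could be set without raising GL_INVALID_VALUE.
static inline bool can_set_default_ortho(const DrawBufferSize &fb)
{
    assert(fb.width >= 0 && fb.height >= 0);
    return ortho_is_valid(0.0, fb.width, 0.0, fb.height, -1.0, 1.0);
}
```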
Reviewed-by: Brian Paul <brianp@vmware.com>
The rest of the plumbing was in place already.
I have tested this by turning on all GL 3.1 features.
The drivers not supporting GL 3.1 will fail to create a core profile
as they should.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Add a DEBUG_FREED_MEMORY option to help catch use-after-free errors.
Add debug_memory_check() function which can be periodically called to
check that all known blocks are good.
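
One common way to catch writes through stale pointers is to keep freed
blocks around, fill them with a known pattern, and verify the pattern
later. The sketch below illustrates that idea only; the dbgmem namespace,
its functions, and the 0xDD fill byte are assumptions and do not reproduce
the actual DEBUG_FREED_MEMORY or debug_memory_check() implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

namespace dbgmem {

const unsigned char FREED_FILL = 0xDD;  // hypothetical fill pattern

struct Block { unsigned char *ptr; std::size_t size; bool freed; };

static std::vector<Block> blocks;  // every block ever handed out

inline void *alloc(std::size_t size)
{
    unsigned char *p = static_cast<unsigned char *>(std::malloc(size));
    if (p)
        blocks.push_back({p, size, false});
    return p;
}

// Instead of returning the memory to the system, poison it and keep it,
// so that later writes through a stale pointer can be detected.
inline void release(void *ptr)
{
    for (Block &b : blocks) {
        if (b.ptr == ptr && !b.freed) {
            std::memset(b.ptr, FREED_FILL, b.size);
            b.freed = true;
            return;
        }
    }
    assert(!"release() of an unknown or already released pointer");
}

// Returns true if no released block has been written to since release().
inline bool check()
{
    for (const Block &b : blocks) {
        if (!b.freed)
            continue;
        for (std::size_t i = 0; i < b.size; ++i)
            if (b.ptr[i] != FREED_FILL)
                return false;
    }
    return true;
}

}  // namespace dbgmem
```

The trade-off is that memory is never actually reclaimed, which is why
this kind of check belongs behind a debug option.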
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Fixes a ton of piglit regressions since the depthstencil fixes for gen6+.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57309
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes a crash in http://workshop.chromeexperiments.com/stars/ on i965,
and the new piglit test glsl-fs-clamp-5.
We were trying to emit a saturating move into a uniform, which the code
generator appropriately choked on. This was broken in the change in
32ae8d3b32.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57166
NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Will allow formats with padding, e.g. RGBX.
Will now allow swizzled formats as long as the alpha is channel 3.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Also add test cases to ensure that this works:
- 110 verifies that glcpp rejects #elif<digits> which glcpp
previously accepted.
- 111 verifies that glcpp accepts #if followed immediately by
(, +, -, !, or ~.
- 112 does the same as 111 but for #elif.
See 17f9beb6 for #if change.
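
The rule the tests exercise can be sketched as a token-boundary check:
the directive name must not be followed by a character that could
continue an identifier. The function below is a hypothetical illustration
of that rule, not the glcpp lexer itself.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Returns true if `line` starts with the directive `name` (for example
// "#if" or "#elif") as a complete token. A following digit, letter, or
// underscore means the text is some other token, so "#elif123" does not
// match, while "#elif(1)" or "#if!defined(X)" does.
static inline bool matches_directive(const char *line, const char *name)
{
    assert(line != nullptr && name != nullptr);
    std::size_t len = std::strlen(name);
    if (std::strncmp(line, name, len) != 0)
        return false;
    unsigned char next = static_cast<unsigned char>(line[len]);
    return !(std::isalnum(next) || next == '_');
}
```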
Reviewed-by: Carl Worth <cworth@cworth.org>