
58232 commits

Author SHA1 Message Date
Neil Roberts
a018a3f3f5 mesa/meta: Support decompressing floating-point formats
Previously the Meta implementation of glGetTexImage would fall back to
_mesa_get_teximage if the texture was not using an unsigned normalised
format. However in order to support the half-float formats of BPTC textures we
can make it render to a floating-point renderbuffer instead. This patch makes
decompression_state have two FBOs, one for the GL_RGBA format and one for
GL_RGBA32F. If a floating-point texture is encountered it will try setting up
a floating-point FBO. It will now also check the status of the FBO and fall
back to _mesa_get_teximage if the FBO is not complete.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:23:50 +01:00
Neil Roberts
817051ab5b swrast: Enable GL_ARB_texture_compression_bptc
Enables BPTC texture compression on the software rasterizer.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:23:50 +01:00
Neil Roberts
9782b8a80c i965: Enable the GL_ARB_texture_compression_bptc extension
Enables the BPTC extension on Gen>=7 and adds the necessary format mappings to
get the right surface type value.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:23:50 +01:00
Neil Roberts
88a8830390 mesa/main: Modify generate_mipmap_compressed to cope with float textures
Once we add BPTC texture support we will need to generate mipmaps for
compressed floating point textures too. Most of the code seems to already be
there but it just needs a few extra lines to get it to use GL_FLOAT instead of
GL_UNSIGNED_BYTE as the type for the temporary buffers.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:23:50 +01:00
Neil Roberts
17cde55c53 mesa: Add texstore functions for BPTC-compressed textures
This adds compressors for all four of the BPTC compressed-texture formats. The
compressor is written from scratch and takes a very simple approach. It always
uses a single mode of the BPTC format (4 for unorm and 3 for half-floats) and
picks the two endpoints by dividing the texels into those which have more or
less than the average luminance of the block and then calculating an average
color of the texels within each division.
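
The endpoint selection described above can be sketched roughly as follows. This is an illustrative reading of the description, not the code from the patch: the function names, the 8-bit RGB layout, the integer luminance weights and the handling of ties are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Approximate luminance of one 8-bit RGB texel. The weights (77, 150, 29)
 * are an assumed integer approximation; they sum to 256. */
static unsigned texel_luminance(const uint8_t *rgb)
{
    return (77u * rgb[0] + 150u * rgb[1] + 29u * rgb[2]) / 256u;
}

/* Hypothetical helper: pick two endpoints for a block of n RGB texels
 * (3 bytes each) by splitting the texels around the average luminance
 * and averaging the color of each half. */
static void pick_endpoints(const uint8_t *rgb, size_t n,
                           uint8_t lo[3], uint8_t hi[3])
{
    unsigned sum_lo[3] = {0, 0, 0}, sum_hi[3] = {0, 0, 0};
    unsigned n_lo = 0, n_hi = 0, total = 0, avg;
    size_t i;
    int c;

    assert(n > 0);

    /* Average luminance of the whole block. */
    for (i = 0; i < n; i++)
        total += texel_luminance(rgb + 3 * i);
    avg = total / (unsigned)n;

    /* Accumulate each texel into the darker or the brighter half. */
    for (i = 0; i < n; i++) {
        const uint8_t *t = rgb + 3 * i;
        if (texel_luminance(t) <= avg) {   /* assumption: ties go dark */
            for (c = 0; c < 3; c++)
                sum_lo[c] += t[c];
            n_lo++;
        } else {
            for (c = 0; c < 3; c++)
                sum_hi[c] += t[c];
            n_hi++;
        }
    }

    for (c = 0; c < 3; c++) {
        /* n_lo is at least 1 because the darkest texel is never above avg. */
        lo[c] = (uint8_t)(sum_lo[c] / n_lo);
        /* A flat block has no brighter half; reuse the darker endpoint. */
        hi[c] = n_hi ? (uint8_t)(sum_hi[c] / n_hi) : lo[c];
    }
}
```

For a 4x4 BPTC block, n would be 16; the resulting pair would then be quantized to whatever precision the chosen mode allows, a step this sketch leaves out.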

It's probably not really sensible to try to use BPTC compression at runtime
because, for example, the Nvidia offline compression tool can take on
the order of an hour to compress a full-screen image. With that in mind I
don't think it's worth having a proper compressor in Mesa and this approach
gives reasonable results for a usage that is basically a corner case.

v2: Always use the custom compressor, even for the unorm formats. Fix the
    quantization step for the half-float format compressor. Fixed a typo which
    was breaking the right-hand edge of half-float textures with a width that
    isn't a multiple of four.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:23:50 +01:00
Neil Roberts
442bcd7fd3 mesa: Add texel fetch functions for BPTC-compressed textures
Adds functions to fetch from any of the four BPTC-compressed formats.

v2: Set the alpha component to 1.0 when fetching from the half-float formats
    instead of leaving it uninitialised. Don't linearize the alpha component
    when fetching from sRGB.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:23:50 +01:00
Neil Roberts
7e78033c11 mesa: Add the format enums for BPTC-compressed images
This adds the following four Mesa image format enums which correspond to the
four BPTC compressed texture formats:

 MESA_FORMAT_BPTC_RGBA_UNORM
 MESA_FORMAT_BPTC_SRGB_ALPHA_UNORM
 MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT
 MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT

It also updates the format information functions to handle these and the
corresponding GL enums.

v2: Also modify _mesa_get_format_color_encoding, _mesa_get_srgb_format_linear
    and _mesa_get_uncompressed_format

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:23:50 +01:00
Neil Roberts
cc9c30b8a7 mesa/format_info: Add support for the BPTC layout
Adds the ‘bptc’ layout to get_channel_bits. The channel bits for BPTC depend
on the mode but as it only has to be an approximation this sets it to 8 for
the two UNORM formats and 16 for the two half-float formats. These represent
the minimum number of bits of variation that can be generated by the
interpolation of the two formats.

This doesn't quite match what we do for S3TC which only returns 4 even though
it can similarly generate 8 bits from the interpolation. However it does match
what we return for ETC2. For reference, NVidia seems to return 8 bits for the
UNORM formats and 32 bits for the half-float formats.

v2: Change the number of bits to 8/8/8/8 for the UNORM formats and 16/16/16
    for the half-float formats.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-08-12 18:23:38 +01:00
Neil Roberts
84218b598f mesa/format_info: Add support for compressed floating-point formats
If the name of a compressed texture format contains ‘FLOAT’, the data type of
the format is now set to GL_FLOAT. This will be needed for the BPTC
half-float formats.
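
As a rough illustration of the name-based rule, the check amounts to a substring test. The sketch below is written in C for consistency with the rest of Mesa and is not the patch's actual code; the helper name and the `fallback` parameter are hypothetical.

```c
#include <string.h>

#ifndef GL_FLOAT
#define GL_FLOAT 0x1406  /* value from the OpenGL headers */
#endif

/* Hypothetical helper: return GL_FLOAT when the format name contains
 * "FLOAT", otherwise the data type the caller would have used anyway. */
static unsigned compressed_format_datatype(const char *name, unsigned fallback)
{
    return strstr(name, "FLOAT") != NULL ? GL_FLOAT : fallback;
}
```

With this rule, a name such as MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT maps to GL_FLOAT, while MESA_FORMAT_BPTC_RGBA_UNORM keeps the fallback type.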

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:00:26 +01:00
Neil Roberts
0c6e230eb1 mesa: Fix the base format for GL_COMPRESSED_RGB_BPTC_*_FLOAT_ARB
The signed and unsigned half-float BPTC-compressed formats were being reported
as having a base format of GL_RGBA but they don't store an alpha channel so it
should be GL_RGB.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:00:26 +01:00
Neil Roberts
5ceb4bff33 mesa: Add the GL_ARB_texture_compression_bptc extension
This adds a boolean in the gl_extensions struct for
GL_ARB_texture_compression_bptc as well as an entry in extension_table.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-12 18:00:26 +01:00
Andreas Boll
36771dc60f winsys/radeon: fix nop packet padding for hawaii
The initial firmware for hawaii does not support type3 nop packet.
Detect the new hawaii firmware with query RADEON_INFO_ACCEL_WORKING2.
If the returned value is 3, then the new firmware is used.

This patch uses type2 for the old firmware and type3 for the new firmware.

It fixes the cases when the old firmware is used and the user wants to
manually enable acceleration.
The two possible scenarios are:
 - the kernel has no support for the new firmware.
 - the kernel has support for the new firmware but only the old firmware
   is available.

Additionally this patch disables GPU acceleration on hawaii if the kernel
returns a value < 2. In this case the kernel lacks the required fixes
for proper acceleration.
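
The thresholds in this message (including the v3 and v4 notes) can be summarized as a small decision function. This is only a sketch of the policy as described here; the enum and function names are hypothetical and the real winsys code is structured differently.

```c
/* Possible outcomes on hawaii, keyed on the value the kernel returns for
 * the RADEON_INFO_ACCEL_WORKING2 query. All names here are hypothetical. */
enum hawaii_nop_policy {
    HAWAII_NO_ACCEL,   /* value < 2: winsys initialization fails */
    HAWAII_NOP_TYPE2,  /* value == 2: old firmware, pad with type2 nops */
    HAWAII_NOP_TYPE3   /* value >= 3: new firmware, pad with type3 nops */
};

static enum hawaii_nop_policy choose_hawaii_nop(unsigned accel_working2)
{
    if (accel_working2 < 2)
        return HAWAII_NO_ACCEL;
    if (accel_working2 < 3)
        return HAWAII_NOP_TYPE2;
    return HAWAII_NOP_TYPE3;
}
```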

v2:
 - Fix indentation
 - Use private struct radeon_drm_winsys instead of public struct radeon_info
 - Rename r600_accel_working2 to accel_working2

v3:
 - Use type2 nop packet for returned value < 3

v4:
 - Fail to initialize winsys for returned value < 2

Cc: mesa-stable@lists.freedesktop.org
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-12 12:16:06 -04:00
Charmaine Lee
0c065270c0 svga: Add a limit to the maximum surface size
This patch adds a limit to the maximum surface size which is
based on the maximum size of a single mob. If this value is not
available, the maximum surface size is by default set to 128 MB.
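
A minimal sketch of the fallback, assuming the limit is taken directly from the mob size and that a value of 0 stands for "not available" (both are assumptions; the names are hypothetical):

```c
#include <stdint.h>

#define DEFAULT_MAX_SURFACE_SIZE (128ull * 1024 * 1024)  /* 128 MB */

/* Hypothetical helper: max_mob_size == 0 means the device did not
 * report a maximum mob size. */
static uint64_t max_surface_size(uint64_t max_mob_size)
{
    return max_mob_size ? max_mob_size : DEFAULT_MAX_SURFACE_SIZE;
}
```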

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-08-12 08:03:24 -06:00
José Fonseca
d839be24b3 mesa/st: Move declaration to top of block.
To fix MSVC build failure.

Trivial.
2014-08-12 14:25:37 +01:00
Ilia Mirkin
6174f49170 mesa/st: add support for dynamic sampler offsets
Replace the plain sampler index with a register reference to a sampler.
We also need to keep track of the sampler array size when there is a
relative reference so that we can mark the whole array used.

To facilitate implementation, we add a separate ADDR register that
exclusively handles the sampler relative address. Other approaches would
be more invasive.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-08-12 08:52:14 -04:00
Christian König
83012b5085 radeon/uvd: fix gpu_address for video surfaces
We need to get the new gpu_address as well when
reallocating the cs buffer.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=82428

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2014-08-12 11:53:52 +02:00
Chris Forbes
3b48f6a4c0 mesa: Add a new function for getting the nonconst sampler array index
If the array index is not a constant expression, the existing support
will assume a zero offset (giving us the sampler index of the base of
the array).

For dynamically uniform indexing of sampler arrays, we need both that
and the indexing expression.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-08-12 19:18:55 +12:00
Chris Forbes
1b4761bc27 glsl: Allow dynamically uniform sampler array indexing with 4.0/gs5
V2: Expand comment to explain what dynamically uniform expressions are
about.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-08-12 19:17:56 +12:00
Ilia Mirkin
f525bd01d1 nvc0/ir: describe the tex arguments for fermi/kepler
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-08-11 19:07:34 -04:00
Ilia Mirkin
b3cbd86224 nvc0/ir: add kepler+ support for indirect texture references
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-08-11 19:07:34 -04:00
Ilia Mirkin
af3619e880 nvc0/ir: add base tex offset for fermi indirect tex case
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-08-11 19:07:34 -04:00
Kenneth Graunke
f73594778b i965: Revert part of f5cc3fdcf1.
Fixes non-termination in various Piglit tests.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2014-08-11 15:07:17 -07:00
Eric Anholt
602a3f92d4 vc4: Flip which primitives are considered front-facing.
This mostly fixes glxgears rendering.
2014-08-11 14:47:54 -07:00
Eric Anholt
f097516505 vc4: Don't forget to set the depth clear value in the packet.
This gets glxgears partially rendering again.
2014-08-11 14:47:54 -07:00
Eric Anholt
e63598aecb vc4: Add support for gl_FragCoord.
This isn't passing all tests (glsl-fs-fragcoord-zw-ortho, for example),
but it does get a bunch more tests passing.

v2: Rebase on helpers change.
2014-08-11 14:47:54 -07:00
Eric Anholt
d34fbdda12 vc4: Refactor shader input setup again.
This makes some space for handling special inputs like fragcoords.
2014-08-11 14:47:54 -07:00
Eric Anholt
a7faca5d27 vc4: Clean up the tile alloc buffer size.
This prevents some simulator assertion failures, but it does mean (since
I've dropped the "* 16" padding) that on real hardware you need a kernel
that does overflow memory management (currently, "drm/vc4: Add support for
binner overflow memory allocation." in my kernel tree).
2014-08-11 14:47:51 -07:00
Eric Anholt
7050ab510d vc4: Clarify some values implicitly chosen for binning config.
These #defines are 0, but they should help the math above make more sense.
2014-08-11 14:45:32 -07:00
Eric Anholt
ed5cb5d7d5 vc4: Improve simulator memory allocation.
This should reduce a bunch of spurious failures in sim.
2014-08-11 14:45:32 -07:00
Eric Anholt
f5f8dd29c3 vc4: Handle stride==0 in VBO validation
2014-08-11 14:45:32 -07:00
Eric Anholt
0f034055f9 vc4: Stash some debug code for looking at what BOs are at what hindex.
When you're debugging validation, it's nice to know what the BOs are for.
2014-08-11 14:45:32 -07:00
Eric Anholt
8ebfa8fdb2 vc4: Use GEM under simulation even for non-winsys BOs.
In addition to reducing sim-specific code, it also avoids our local handle
allocation conflicting with the host GEM's handle numbering, which was
causing vc4_gem_hindex() to not distinguish between winsys BOs and the
same-numbered non-winsys bo.
2014-08-11 14:45:32 -07:00
Eric Anholt
cdc208bdaf vc4: Don't forget to unmap the GEM BO when freeing.
Otherwise it'll stick around forever.
2014-08-11 14:45:32 -07:00
Eric Anholt
d2cc7f97df vc4: Add validation of raster-format textures.
... and reject everything else, for now.

v2: Rebase on v2 of the rendering config validation change.
2014-08-11 14:45:32 -07:00
Eric Anholt
b384d16733 vc4: Drop VC4_PACKET_PRIMITIVE_LIST_FORMAT.
It's not relevant to our command streams any more.

v2: Fix indentation and a typo in the comment.
2014-08-11 14:45:32 -07:00
Eric Anholt
3aba1b124f vc4: Add validation that vertex indices don't overflow VBO bounds.
2014-08-11 14:45:32 -07:00
Eric Anholt
5692122147 vc4: Fix the shader record size for extended strides.
It turns out they aren't packed when attributes are missing, according to
both docs and simulation.
2014-08-11 14:45:32 -07:00
Eric Anholt
aaff32ded0 vc4: Fix the shader record size for extended strides.
It turns out they aren't packed when attributes are missing, according to
both docs and simulation.

v2: Drop unused variable.
2014-08-11 14:45:31 -07:00
Eric Anholt
9f24e4e6ed vc4: Add a bunch of validation of render mode configuration.
v2: Fix a build break after some previous rebase.
2014-08-11 14:45:31 -07:00
Eric Anholt
ff4748491b vc4: Store the (currently always linear) tiling format in the resource.
2014-08-11 14:45:31 -07:00
Eric Anholt
0bc2aed90f vc4: Add a bunch of validation of the binning mode config.
2014-08-11 14:45:31 -07:00
Eric Anholt
b6caa9556c vc4: Validate that the same BO doesn't get reused for different purposes.
We don't care if things like vertex data get smashed by render target
data, but we do need to make sure that shader code doesn't get rendered
to.
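
One way to picture this rule: each BO remembers the first kind of use it was given within a submit, and shader code may never be mixed with any other kind. The sketch below is an assumption about how such tracking could look, not the validator's actual code; all names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-BO classification within one submit. */
enum bo_use {
    BO_USE_NONE,         /* not referenced yet */
    BO_USE_SHADER_CODE,  /* holds shader instructions */
    BO_USE_DATA          /* vertex data, render targets, and so on */
};

/* Record a use of the BO. Returns false when the BO would be used both
 * as shader code and as something else; data-on-data reuse is allowed. */
static bool bo_record_use(enum bo_use *current, enum bo_use wanted)
{
    if (*current == BO_USE_NONE) {
        *current = wanted;
        return true;
    }
    return *current == wanted;
}
```

Lumping every non-shader use into one category mirrors the message's point that vertex data being overwritten by render target data is not a concern.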

v2: Fix an overflowing read of gl_relocs[] that incorrectly flagged some
    VBOs as shader code.
2014-08-11 14:45:31 -07:00
Eric Anholt
fa26d334cb vc4: Use the packet #defines in the kernel validation code.
2014-08-11 14:45:31 -07:00
Eric Anholt
5969f9b79c vc4: Rename GEM_HANDLES to be in a namespace.
It's not a real VC4 hardware packet, but I've put in a comment to explain
it.
2014-08-11 14:45:31 -07:00
Eric Anholt
27b8a0a025 vc4: Clean up TMU write validation.
The comment conflicted with the support in the code, so I moved the TMU
write validation to where the comment was, and dropped some dead arguments
from the functions while changing their signatures.
2014-08-11 14:45:31 -07:00
Eric Anholt
7969a15325 vc4: Update a comment about shader validation
2014-08-11 14:45:31 -07:00
Eric Anholt
99070c6daa vc4: Add proper translation from Zc to Zs for vertex output.
This fixes the remaining failure in depthfunc.
2014-08-11 14:45:31 -07:00
Eric Anholt
4160ac5ee4 vc4: Add support for depth clears and tests within a tile.
This doesn't load/store the Z contents across submits yet.  It also
disables early Z, since it's going to require tracking of Z functions
across multiple state updates to track the early Z direction and whether
it can be used.

v2: Move the key setup to before the search for the key.
2014-08-11 14:45:31 -07:00
Eric Anholt
2259cc5aeb vc4: Avoid flushing when mapping buffers that aren't in the batch.
This should prevent a bunch of unnecessary flushes for things like
updating immediate vertex data.
2014-08-11 14:45:31 -07:00
Eric Anholt
6b2583412f vc4: Drop the flush at the end of the draw
Now we actually get multiple draw calls per submit.
2014-08-11 14:45:31 -07:00