fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-19 05:50:36 +02:00

Author	SHA1	Message	Date
Kenneth Graunke	db6ffa29c8	i965: Retype atomics to UD in Gen8 code generation. Kind of a moot point since we're deleting Gen8 code generation, but this at least helps make it match the Gen4-7 code. It's probably more reasonable than using float. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	04f5b2f4e4	i965/vp: Use the sampler for pull constant loads on Gen7/7.5. This improves performance in Trine 2 at 1280x720 (windowed) on "Very High" settings by 30% (in the interactive menu) to 45% (in the forest by the giant frog) on Haswell GT3e. It also now generates the same assembly on Gen7 as it does on Gen8, which always used the sampler for both types. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	f7e9756201	i965/vec4: Drop gen <= 7 assertion in pull constant load handling. I don't see any reason for this to exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	ce90fd9676	i965/eu: Set src0 file to IMM on Gen8+ flow control instructions. According to the documentation, we need to set the source 0 register type to IMM for flow control instructions that have both JIP and UIP. Out of paranoia, just make all flow control instructions use IMM; there's no benefit to using ARF anyway, and it could cause trouble that's difficult to diagnose. See commit `9584959123`, which did the analogous change in the gen8_generator code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	d8ef0eab5a	i965/eu: Refactor brw_WHILE to share a bit more code on Gen6+. We're going to add a Gen8+ case shortly, which would need to duplicate this code again. Instead, share it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	aafdf9eef4	i965/eu: Emulate F32TO16 and F16TO32 on Broadwell. When we combine the Gen4-7 and Gen8+ generators, we'll need to handle half float packing/unpacking functions somehow. The Gen8+ generator code today just emulates the behavior of the Gen7 F32TO16/F16TO32 instructions, including the align16 mode bugs. Rather than messing with fs_generator/vec4_generator, I decided to just emulate the instructions at the brw_eu_emit.c layer. v2: Change gen >= 7 asserts to gen == 7 (suggested by Chris Forbes). Fix regressions on Haswell in VS tests due to type assertions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	849046b842	i965/vec4: Port Gen8 SET_VERTEX_COUNT handling to vec4_generator. Broadwell requires the number of vertices written by the geometry shader to be specified in a separate register, as part of the terminating message's payload. This also means GS_OPCODE_THREAD_END needs to increment mlen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	17c17b87f9	i965/vec4: Switch to MOV, not OR, for GS_OPCODE_THREAD_END on Gen8. Either should work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	af13cf609f	i965/vec4: Use MOV, not OR, to set URB write channel mask bits. g0.5 has nothing of value to contribute to m0.5. In both the VS and GS payload, g0.5 contains the scratch space pointer - which is definitely not of any use. The GS payload also contains FFTID, but the URB write message header doesn't want FFTID. The only reason I used OR was because Eric originally requested it. On Broadwell, I used MOV, and that's worked out fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	efc818e3a4	i965/fs: Don't set flag_subreg_nr = 1 on predicated FB write setup. On Haswell, we implement "discard" via predicated SEND messages, using f0.1 instead of f0.0. To accomplish this, we set inst->flag_subreg to 1 on the FS_OPCODE_FB_WRITE. Most instructions using fs_inst::flag_subreg expand to a single assembly instruction. However, FS_OPCODE_FB_WRITE can generate several MOVs for setting up header information. We don't want to set flag_subreg on those, so override the default state back to 0. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	2e180e4c09	i965/vec4: Respect ir->force_writemask_all in Gen8 code generation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	7b6b61ba83	i965/vec4: Set NoMask for GS_OPCODE_SET_VERTEX_COUNT on Gen8+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-12 13:39:24 -07:00
Jason Ekstrand	97d57f1142	gallium/r300: Fix a link error in the tests The link error occurs because the static libraries are linked in the wrong order. This fixes it. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82483 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-08-12 11:35:07 -07:00
Matt Turner	e005c1148d	i965: Return NONE from brw_swap_cmod on unknown input. Comparing ~0u with a packed enum (i.e., 1 byte) always evaluates to false. Shouldn't gcc warn about this? Reported-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-12 11:09:45 -07:00
Neil Roberts	ab66b19669	docs: Update release notes and GL3.txt for GL_ARB_texture_compression_bptc Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	a018a3f3f5	mesa/meta: Support decompressing floating-point formats Previously the Meta implementation of glGetTexImage would fall back to _mesa_get_teximage if the texturing is not using an unsigned normalised format. However in order to support the half-float formats of BPTC textures we can make it render to a floating-point renderbuffer instead. This patch makes decompression_state have two FBOs, one for the GL_RGBA format and one for GL_RGBA32F. If a floating-point texture is encountered it will try setting up a floating-point FBO. It will now also check the status of the FBO and fall back to _mesa_get_teximage if the FBO is not complete. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	817051ab5b	swrast: Enable GL_ARB_texture_compression_bptc Enables BPTC texture compression on the software rasterizer. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	9782b8a80c	i965: Enable the GL_ARB_texture_compression_bptc extension Enables the BPTC extension on Gen>=7 and adds the necessary format mappings to get the right surface type value. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	88a8830390	mesa/main: Modify generate_mipmap_compressed to cope with float textures Once we add BPTC texture support we will need to generate mipmaps for compressed floating point textures too. Most of the code seems to already be there but it just needs a few extra lines to get it to use GL_FLOAT instead of GL_UNSIGNED_BYTE as the type for the temporary buffers. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	17cde55c53	mesa: Add texstore functions for BPTC-compressed textures This adds compressors for all four of the BPTC compressed-texture formats. The compressor is written from scratch and takes a very simple approach. It always uses a single mode of the BPTC format (4 for unorm and 3 for half-floats) and picks the two endpoints by dividing the texels into those which have more or less than the average luminance of the block and then calculating an average color of the texels within each division. It's probably not really sensible to try to use BPTC compression at runtime because for example with the Nvidia offline compression tool it can take in the order of an hour to compress a full-screen image. With that in mind I don't think it's worth having a proper compressor in Mesa and this approach gives reasonable results for a usage that is basically a corner case. v2: Always use the custom compressor, even for the unorm formats. Fix the quantization step for the half-float format compressor. Fixed a typo which was breaking the right-hand edge of half-float textures with a width that isn't a multiple of four. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	442bcd7fd3	mesa: Add texel fetch functions for BPTC-compressed textures Adds functions to fetch from any of the four BPTC-compressed formats. v2: Set the alpha component to 1.0 when fetching from the half-float formats instead of leaving it uninitialised. Don't linearize the alpha component when fetching from sRGB. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	7e78033c11	mesa: Add the format enums for BPTC-compressed images This adds the following four Mesa image format enums which correspond to the four BPTC compressed texture formats: MESA_FORMAT_BPTC_RGBA_UNORM MESA_FORMAT_BPTC_SRGB_ALPHA_UNORM MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT It also updates the format information functions to handle these and the corresponding GL enums. v2: Also modify _mesa_get_format_color_encoding, _mesa_get_srgb_format_linear and _mesa_get_uncompressed_format Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	cc9c30b8a7	mesa/format_info: Add support for the BPTC layout Adds the ‘bptc’ layout to get_channel_bits. The channel bits for BPTC depend on the mode but as it only has to be an approximation this sets it to 8 for the two UNORM formats and 16 for the two half-float formats. These represent the minimum number of bits of variation that can be generated by the interpolation of the two formats. This doesn't quite match what we do for S3TC which only returns 4 even though it can similarly generate 8 bits from the interpolation. However it does match what we return for ETC2. For reference, NVidia seems to return 8 bits for the UNORM formats and 32 bits for the half-float formats. v2: Change the number of bits to 8/8/8/8 for the UNORM formats and 16/16/16 for the half-float formats. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-12 18:23:38 +01:00
Neil Roberts	84218b598f	mesa/format_info: Add support for compressed floating-point formats If the name of a compressed texture format has ‘FLOAT’ in it it will now set the data type of the format to GL_FLOAT. This will be needed for the BPTC half-float formats. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:00:26 +01:00
Neil Roberts	0c6e230eb1	mesa: Fix the base format for GL_COMPRESSED_RGB_BPTC_*_FLOAT_ARB The signed and unsigned half-float BPTC-compressed formats were being reported as having a base format of GL_RGBA but they don't store an alpha channel so it should be GL_RGB. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:00:26 +01:00
Neil Roberts	5ceb4bff33	mesa: Add the GL_ARB_texture_compression_bptc extension This adds a boolean in the gl_extensions struct for GL_ARB_texture_compression_bptc as well as an entry in extension_table. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:00:26 +01:00
Andreas Boll	36771dc60f	winsys/radeon: fix nop packet padding for hawaii The initial firmware for hawaii does not support type3 nop packet. Detect the new hawaii firmware with query RADEON_INFO_ACCEL_WORKING2. If the returned value is 3, then the new firmware is used. This patch uses type2 for the old firmware and type3 for the new firmware. It fixes the cases when the old firmware is used and the user wants to manually enable acceleration. The two possible scenarios are: - the kernel has no support for the new firmware. - the kernel has support for the new firmware but only the old firmware is available. Additionally this patch disables GPU acceleration on hawaii if the kernel returns a value < 2. In this case the kernel hasn't the required fixes for proper acceleration. v2: - Fix indentation - Use private struct radeon_drm_winsys instead of public struct radeon_info - Rename r600_accel_working2 to accel_working2 v3: - Use type2 nop packet for returned value < 3 v4: - Fail to initialize winsys for returned value < 2 Cc: mesa-stable@lists.freedesktop.org Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Marek Olšák <marek.olsak@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-12 12:16:06 -04:00
Brian Paul	fa5b76e3a2	mesa: regenerate gl_mangle.h Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 08:09:45 -06:00
Brian Paul	0a96e7adaa	mesa: update wglext.h to version 20140810 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 08:09:45 -06:00
Brian Paul	eeb7fc8b7d	mesa: update glxext.h to version 20140810 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 08:09:45 -06:00
Brian Paul	b7d36efe93	mesa: update glext.h to version 20140810 This brings in the new OpenGL 4.5 features. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 08:09:44 -06:00
Charmaine Lee	0c065270c0	svga: Add a limit to the maximum surface size This patch adds a limit to the maximum surface size which is based on the maximum size of a single mob. If this value is not available, the maximum surface size is by default set to 128 MB. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-12 08:03:24 -06:00
José Fonseca	d839be24b3	mesa/st: Move declaration to top of block. To fix MSVC build failure. Trivial.	2014-08-12 14:25:37 +01:00
Ilia Mirkin	6174f49170	mesa/st: add support for dynamic sampler offsets Replace the plain sampler index with a register reference to a sampler. We also need to keep track of the sampler array size when there is a relative reference so that we can mark the whole array used. To facilitate implementation, we add a separate ADDR register that exclusively handles the sampler relative address. Other approaches would be more invasive. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-12 08:52:14 -04:00
Christian König	83012b5085	radeon/uvd: fix gpu_address for video surfaces We need to get the new gpu_address as well when reallocating the cs buffer. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=82428 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2014-08-12 11:53:52 +02:00
Chris Forbes	3b48f6a4c0	mesa: Add a new function for getting the nonconst sampler array index If the array index is not a constant expression, the existing support will assume a zero offset (giving us the sampler index of the base of the array). For dynamically uniform indexing of sampler arrays, we need both that and the indexing expression. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 19:18:55 +12:00
Chris Forbes	1b4761bc27	glsl: Allow dynamically uniform sampler array indexing with 4.0/gs5 V2: Expand comment to explain what dynamically uniform expressions are about. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 19:17:56 +12:00
Ilia Mirkin	f525bd01d1	nvc0/ir: describe the tex arguments for fermi/kepler Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 19:07:34 -04:00
Ilia Mirkin	b3cbd86224	nvc0/ir: add kepler+ support for indirect texture references Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 19:07:34 -04:00
Ilia Mirkin	af3619e880	nvc0/ir: add base tex offset for fermi indirect tex case Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 19:07:34 -04:00
Kenneth Graunke	f73594778b	i965: Revert part of `f5cc3fdcf1`. Fixes non-termination in various Piglit tests. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-11 15:07:17 -07:00
Eric Anholt	602a3f92d4	vc4: Flip which primitives are considered front-facing. This mostly fixes glxgears rendering.	2014-08-11 14:47:54 -07:00
Eric Anholt	f097516505	vc4: Don't forget to set the depth clear value in the packet. This gets glxgears partially rendering again.	2014-08-11 14:47:54 -07:00
Eric Anholt	e63598aecb	vc4: Add support for gl_FragCoord. This isn't passing all tests (glsl-fs-fragcoord-zw-ortho, for example), but it does get a bunch more tests passing. v2: Rebase on helpers change.	2014-08-11 14:47:54 -07:00
Eric Anholt	d34fbdda12	vc4: Refactor shader input setup again. This makes some space for handling special inputs like fragcoords.	2014-08-11 14:47:54 -07:00
Eric Anholt	a7faca5d27	vc4: Clean up the tile alloc buffer size. This prevents some simulator assertion failures, but it does mean (since I've dropped the "* 16" padding) that on real hardware you need a kernel that does overflow memory management (currently, "drm/vc4: Add support for binner overflow memory allocation." in my kernel tree).	2014-08-11 14:47:51 -07:00
Eric Anholt	7050ab510d	vc4: Clarify some values implicitly chosen for binning config. These #defines are 0, but it should help make math above make more sense.	2014-08-11 14:45:32 -07:00
Eric Anholt	ed5cb5d7d5	vc4: Improve simulator memory allocation. This should reduce a bunch of spurious failures in sim.	2014-08-11 14:45:32 -07:00
Eric Anholt	f5f8dd29c3	vc4: Handle stride==0 in VBO validation	2014-08-11 14:45:32 -07:00
Eric Anholt	0f034055f9	vc4: Stash some debug code for looking at what BOs are at what hindex. When you're debugging validation, it's nice to know what the BOs are for.	2014-08-11 14:45:32 -07:00

64559 commits