fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 07:08:04 +02:00

Author	SHA1	Message	Date
Marek Olšák	f2328ffdc8	tgsi: add tgsi_get_processor_type helper from radeon Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-04 14:34:13 +01:00
Kenneth Graunke	ccbe15f332	i965/fs: Fix saturate on MAD and LRP with the NIR backend. Fixes misrendering in "Witcher 2" with INTEL_USE_NIR=1, and probably many other programs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-04 00:34:57 -08:00
Iago Toral Quiroga	1b029f8a4a	mesa: Fix _mesa_format_convert fallback path when src is not an array format When a rebase swizzle is provided and we call _mesa_swizzle_and_convert after unpacking the source format we were always passing normalized=false. We should pass true or false depending on the formats involved in the conversion for the byte and float paths (the integer path cannot ever be normalized). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-02-04 08:08:34 +01:00
Park, Jeongmin	6fd4a61ad6	st/osmesa: Fix osbuffer->textures indexing Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930 Cc: 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-03 15:46:56 -07:00
Connor Abbott	ab24e12706	i965/nir: use redundant phi optimization Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 16:00:13 -05:00
Connor Abbott	a135f34080	nir: add an optimization to remove useless phi nodes This removes phi nodes whose sources all point to the same thing. Shader-db results: total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%) NIR instructions in affected programs: 126564 -> 122480 (-3.23%) helped: 615 HURT: 0 total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%) FS instructions in affected programs: 24622 -> 23174 (-5.88%) helped: 138 HURT: 0 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 16:00:13 -05:00
Jason Ekstrand	572d1f6e41	nir/validate: Ensure that phi sources are SSA-only Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 12:52:42 -08:00
Jason Ekstrand	5420774510	nir/validate: Validate that only float ALU outputs are saturated Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 12:46:55 -08:00
Jason Ekstrand	c0df85cca4	nir/lower_source_mods: Don't lower saturate for non-float outputs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 12:46:38 -08:00
Jason Ekstrand	8776b1b14b	i965/fs_nir: Get rid of get_alu_src Originally, get_alu_src was supposed to handle resolving swizzles and things like that. However, now that basically every instruction we have only takes scalar sources, we don't really need it anymore. The only case where it's still marginally useful is for the mov and vecN operations that are left over from SSA form. We can handle those cases as a special case easily enough. As a side-effect, we don't need the vec_to_movs pass anymore. v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Rework the way we detect if we need an extra copy for swizzling. The old code involved a pile of confusing switch fall-throughs; we now use a loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:33:11 -08:00
Jason Ekstrand	112d738b91	i965/fs: Use NIR's scalarizing abilities and stop handling vectors Now that we can scalarize with NIR, there's no need for all this code anymore. Let's get rid of it and just do scalar operations. v2: run copy prop before lowering phi nodes v3: Get rid of the "emit(...)->saturate = foo" pattern v4: Run alu_to_scalar as an optimization pass total instructions in shared programs: 5998321 -> 5974070 (-0.40%) instructions in affected programs: 732075 -> 707824 (-3.31%) helped: 3137 HURT: 191 GAINED: 18 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:33:11 -08:00
Jason Ekstrand	f2adcd36cb	nir: Add a pass to lower vector phi nodes to scalar phi nodes v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Add better comments - Use nir_ssa_dest_init and nir_src_for_ssa more places - Fix some void * casts v3 Jason Ekstrand <jason.ekstrand@intel.com>: - Rework the way we determine whether or not to scalarize a phi node to make the recursion non-bogus - Treat load_const instructions as scalarizable v4 Jason Ekstrand <jason.ekstrand@intel.com>: - Allow uniform and input loads to be scalarizable v5 Jason Ekstrand <jason.ekstrand@intel.com>: - Also consider loads of inputs (varying, uniform, or ubo) to be scalarizable. We were already doing this for load_var on uniforms and inputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:33:11 -08:00
Matt Turner	e87928a494	i965/fs: Add support for constant propagating into sources with modifiers. All but 16 of the programs helped were ARB fp programs. total instructions in shared programs: 5949286 -> 5945470 (-0.06%) instructions in affected programs: 275162 -> 271346 (-1.39%) helped: 1197 GAINED: 1 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	cfa2165642	i965/vec4: Use abs/negate functions in const propagation. No changes in shader-db. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	dbd4c22a37	i965: Add function to take the abs of immediates. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	638beee24a	i965: Add function to negate immediates. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	1f4bdad316	i965: Mark UB/B immediates as unreachable. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	32e98e8ef0	gallium/util: Don't use __builtin_clrsb in util_last_bit(). Unclear circumstances lead to undefined symbols on x86. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-03 12:25:14 -08:00
Matt Turner	d8be1b9aba	glsl/list: Note that exec_lists may not be realloc'd. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:25:14 -08:00
Nils Wallménius	cfb5b1c59e	st/mesa: mark constant array of swizzles as static const This saves about 0.5k in the text section for a gallium driver on amd64. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-02-04 09:07:13 +13:00
Eduardo Lima Mitev	0ed3bffc08	mesa: Returns a GL_INVALID_VALUE error on several APIs when buffer size is negative Section 2.3.1 (Errors) of the OpenGL 4.5 spec says: "If a negative number is provided where an argument of type sizei or sizeiptr is specified, an INVALID_VALUE error is generated." This patch adds checks for negative buffer size values passed to different APIs. It also moves up the check on other APIs that already had it, making it the first error check performed in the function, for consistency. While there may be other APIs throughout the code lacking this check (or at least not at the beginning of the function), this patch focuses on the cases that break the dEQP tests listed below. It could be a good exercise for the future to check all other cases, and improve consistency in the order of the checks throughout the whole Mesa code base. This fixes 5 dEQP tests: * dEQP-GLES3.functional.negative_api.state.get_attached_shaders * dEQP-GLES3.functional.negative_api.state.get_shader_source * dEQP-GLES3.functional.negative_api.state.get_active_uniform * dEQP-GLES3.functional.negative_api.state.get_active_attrib * dEQP-GLES3.functional.negative_api.shader.program_binary Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Samuel Iglesias Gonsalvez	284bd1ecdf	mesa: fix error value in GetFramebufferAttachmentParameteriv for OpenGL ES 3.0 Section 6.1.13 "Framebuffer Object Queries" of OpenGL ES 3.0 spec: "If the default framebuffer is bound to target, then attachment must be BACK, identifying the color buffer; DEPTH, identifying the depth buffer; or STENCIL, identifying the stencil buffer." OpenGL ES 3.0, section 2.5 (GL Errors): "If a command that requires an enumerated value is passed a symbolic constant that is not one of those specified as allowable for that command, an INVALID_ENUM error is generated." Then change the returned error to INVALID_ENUM. Fixes: dEQP-GLES3.functional.fbo.api.attachment_query_default_fbo Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Iago Toral Quiroga	5dfb085ff3	glsl: Improve precision of mod(x,y) Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement mod(x,y) as y * fract(x/y). This implementation has a down side though: it introduces precision errors due to the fract() operation. Even worse, since the result of fract() is multiplied by y, the larger y gets the larger the precision error we produce, so for large enough numbers the precision loss is significant. Some examples on i965: Operation Precision error ----------------------------------------------------- mod(-1.951171875, 1.9980468750) 0.0000000447 mod(121.57, 13.29) 0.0000023842 mod(3769.12, 321.99) 0.0000762939 mod(3769.12, 1321.99) 0.0001220703 mod(-987654.125, 123456.984375) 0.0160663128 mod( 987654.125, 123456.984375) 0.0312500000 This patch replaces the current lowering pass with a different one (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL man pages: mod(x,y) = x - y * floor(x/y) This implementation eliminates the precision errors at the expense of an additional add instruction on some systems. On systems that can do negate with multiply-add in a single operation this new implementation would come at no additional cost. v2 (Ian Romanick) - Do not clone operands because when they are expressions we would be duplicating them and that can lead to suboptimal code. Fixes the following 16 dEQP tests: dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_* dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_* Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Eduardo Lima Mitev	c27d23f0c8	mesa: Allow querying for GL_PRIMITIVE_RESTART_FIXED_INDEX under GLES 3 GLES 3.0.0 spec introduces context state PRIMITIVE_RESTART_FIXED_INDEX (2.8.1 Transferring Array Elements, page 26) which is not currently possible to query using glGet() funcs. Fixes 4 dEQP tests: dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getboolean * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger64 * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getfloat Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Iago Toral Quiroga	ec7dcaf578	glsl: can't have 'const' qualifier used with struct or interface block members Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Iago Toral Quiroga	5d655a43e6	glsl: interface blocks must be declared at global scope Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Iago Toral Quiroga	6dd346c232	i965: Fix negate with unsigned integers For code such as: uint tmp1 = uint(in0); uint tmp2 = -tmp1; float out0 = float(tmp2); We produce code like: mov(8) g5<1>.xF -g9<4,4,1>.xUD which does not produce correct results. This code produces the results we would expect if tmp1 and tmp2 were signed integers instead. It seems that a similar problem was detected and addressed when using negations with unsigned integers as part of conditionals, but it looks like the problem has a wider impact than that. This patch fixes the problem by preventing copy-propagation of negated UD registers in all scenarios, not only in conditionals. Fixes the following 24 dEQP tests: dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uint_ dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec2_ dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec3_ dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec4_ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-03 13:19:36 +01:00
Jose Fonseca	5b941ce857	scons: Fix Windows builds with LLVM 3.5. LLVMBitReader dependency was introduced, as pointed out by Rob Conde.	2015-02-03 10:18:51 +00:00
Ilia Mirkin	bc321db75b	st/mesa: add EXT_polygon_offset_clamp support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-02 20:44:22 -05:00
Ilia Mirkin	7c211a12aa	gallium: add a cap to determine whether the driver supports offset_clamp Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-02 20:44:02 -05:00
Ilia Mirkin	2ce29ce5af	i965/gen6+: enable EXT_polygon_offset_clamp Replace the hard-coded 0's with the context clamp value. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-02 20:35:36 -05:00
Ilia Mirkin	81998dda63	mesa: add support for GL_EXT_polygon_offset_clamp Nothing enables the extension yet, but the values are now available. The spec calls for it to only be exposed for GL 3.3+, which is core-only in mesa. Instead we allow any driver to enable it, including in a compat context for any GL version. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-02 20:35:36 -05:00
Ilia Mirkin	83321009de	glapi: add GL_EXT_polygon_offset_clamp Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-02 20:35:36 -05:00
Kenneth Graunke	0f06f12c11	glsl: Pick ast_conditional branch regardless of op1/2 being constant. If the ?: operator's condition is a constant value, and both branches were pure expressions, we can just make the resulting value one or the other. Previously, we only did this if op[1] and op[2] were also constant values - but there's no actual reason for that restriction. No changes in shader-db, probably because we usually optimize this later anyway. But it does make us generate less stupid code up front. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-02 17:14:55 -08:00
Kenneth Graunke	534f07ee85	i965: Add a better PRM citation for the IMS dimension mangling. Paul originally had to reverse engineer these formulas based on the description about how the sampler works. The description here is not the easiest to follow - especially given that it's from the Sandybridge era, when the hardware only did 4x multisampling. Jordan and I recently found another part of the documentation where they simply state that IMS dimensions must be adjusted by a set of formulas. Quoting this section provides an easy to follow explanation for the code, including 2x/4x/8x/16x. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-02-02 17:14:38 -08:00
Laura Ekstrand	e9b86cb5d6	swrast: Whitespace fixes. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-02 13:22:26 -08:00
Laura Ekstrand	e187c2f543	DD: Refactor BlitFramebuffer. In preparation for glBlitNamedFramebuffer, the DD table function BlitFramebuffer needs to accept two arbitrary framebuffer objects rather than assuming ctx->ReadBuffer and ctx->DrawBuffer. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-02 13:21:20 -08:00
Laura Ekstrand	ad2c64abbd	GL: Update glext.h to Khronos Revision 29537. Khronos Revision 29537 fixes ARB_direct_state_access function prototypes that had GLsizei where they should have had GLsizeiptr. This mainly affects functions related to buffer objects. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-02 10:39:55 -08:00
Jason Ekstrand	2cebaac479	i965: Don't use tiled_memcpy to download from RGBX or BGRX surfaces Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-02 10:18:42 -08:00
Neil Roberts	af8fd694d4	dir-locals.el: Don't set variables for non-programming modes This limits the style changes to modes inherited from prog-mode. The main reason to do this is to avoid setting fill-column for people using Emacs to edit commit messages because 78 characters is too many to make it wrap properly in git log. Note that makefile-mode also inherits from prog-mode so the fill column should continue to apply there. v2: Apply to all the .dir-locals.el files, not just the one in the root directory. Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-02 12:02:55 +00:00
Iago Toral Quiroga	68155e5a36	i965: Fix intel_miptree_copy_teximage for GL_TEXTURE_1D_ARRAY For GL_TEXTURE_1D_ARRAY targets we store the depth of the array in the Height field and leave Depth=1 in the underlying texture object. When we call intel_miptree_copy_teximage in the process of re-creating a miptree (possibly because the number of miplevels has changed) we didn't account for this, so we were only copying texture images for the first slice. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-02 09:29:18 +01:00
Eric Anholt	753c327151	vc4: Kill a bunch of color write calculation when colormask is all off. I could have done this in the bit that generates the ANDs and ORs, but it's probably generally useful. Sadly, I still need this even if I move to NIR, because I can't yet express my read of the destination color in NIR, which I would need to move my blend/logicop/colormask handling into NIR. total uniforms in shared programs: 13497 -> 13455 (-0.31%) uniforms in affected programs: 101 -> 59 (-41.58%) total instructions in shared programs: 40797 -> 40296 (-1.23%) instructions in affected programs: 1639 -> 1138 (-30.57%)	2015-02-01 16:07:24 -08:00
Fredrik Höglund	0508032413	docs: Update ARB_direct_state_access Mark vertex array objects as started.	2015-02-01 23:00:42 +01:00
Martin Peres	9272022353	doc: break down ARB_direct_state_access in GL3.txt A student was wondering what was going on + I started working on it too. CC: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Fredrik Höglund <fredrik@kde.org>	2015-02-01 22:50:35 +01:00
Eric Anholt	12ebd7e20e	vc4: Dump the VPM read index in QIR disasm. Since the VPM reads have to be in order, it's useful to see their indices in the dump.	2015-02-01 12:53:08 -08:00
Jason Ekstrand	6094619c02	i965/pixel_read: Don't try to do a tiled_memcpy from a multisampled buffer The GL spec guarantees that glGetTexImage will never get a multisampled texture, but this is not true for glReadPixels. If we get a multisampled buffer, we have to do a multisample resolve on it before we can pull the data down for the user. Since this isn't practical to handle in tiled_memcpy, we just fall back to the other paths that can handle this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-31 08:54:32 -08:00
Francisco Jerez	11f5d8a5d4	i965: Enable L3 caching of buffer surfaces. And remove the mocs argument of the emit_buffer_surface_state vtbl hook. Its semantics vary greatly from one generation to another, so it kind of encourages the caller to pass 0 which is the only valid setting across generations. After this commit the hardware-specific code decides what the best cacheability settings are for buffer surfaces, just like we do for textures. This together with some additional changes coming is expected to improve performance of pull constants, buffer textures, atomic counters and image objects on Gen7 and up. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-31 17:01:49 +02:00
José Fonseca	11a955aef4	egl: Pass the correct X visual depth to xcb_put_image(). The dri2_x11_add_configs_for_visuals() function happily matches a 32 bits EGLconfig with a 24 bits X visual. However it was passing 32bits depth to xcb_put_image(), making X server unhappy: https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911 Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-31 09:14:36 +00:00
Jason Ekstrand	5c31184cf5	intel/pixel_read: Properly flip the results for window system buffers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841 Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-30 18:56:56 -08:00
Jason Ekstrand	837a4c42a6	i965/tiled_memcpy: Support a signed linear pitch Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-30 18:56:56 -08:00
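
Commit a135f34080 describes its optimization as removing phi nodes whose sources all point to the same thing. The following is a minimal sketch of that idea on a toy representation, not NIR's actual data structures or API: the dictionary layout, the function name `remove_redundant_phis`, and the choice to repeat until nothing changes are all assumptions made for illustration.

```python
def remove_redundant_phis(phis):
    """Drop phis whose sources all name the same value (toy model, not NIR).

    `phis` maps an SSA name to the list of SSA names feeding that phi.
    Returns (remaining_phis, replacements); `replacements` maps each removed
    phi to the single value that every one of its sources pointed to.
    """
    phis = {name: list(srcs) for name, srcs in phis.items()}
    replacements = {}
    progress = True
    while progress:
        progress = False
        for name, srcs in list(phis.items()):
            distinct = set(srcs)
            if len(distinct) != 1 or name in distinct:
                continue
            (value,) = distinct
            del phis[name]
            replacements[name] = value
            # Rewrite uses of the removed phi inside the phis that remain.
            for other_srcs in phis.values():
                other_srcs[:] = [value if s == name else s for s in other_srcs]
            # Keep earlier replacements pointing at a live value.
            for old, new in list(replacements.items()):
                if new == name:
                    replacements[old] = value
            progress = True
    return phis, replacements


remaining, replaced = remove_redundant_phis({
    "x1": ["a", "a"],   # every source is "a": removable
    "x2": ["x1", "a"],  # becomes removable once x1 is replaced by "a"
    "x3": ["a", "b"],   # two distinct sources: kept
})
print(remaining, replaced)
```

Repeating until a fixed point is a design choice of this sketch: removing one phi can make another one removable, as with `x2` above.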

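Commit 5dfb085ff3 contrasts two ways of computing mod(x,y): y * fract(x/y) (MOD_TO_FRACT) and x - y * floor(x/y) (MOD_TO_FLOOR). The sketch below compares the two formulas in emulated single precision. It assumes that rounding to 32-bit float after every operation is a fair model; real GPUs may divide, fuse, or round differently, so it is not expected to reproduce the i965 figures quoted in the commit message. The helper names `f32`, `mod_fract`, and `mod_floor` are invented for this example.

```python
import math
import struct


def f32(value):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def mod_fract(x, y):
    """mod(x, y) computed as y * fract(x / y), rounding after each step."""
    x, y = f32(x), f32(y)
    q = f32(x / y)
    fract = f32(q - math.floor(q))
    return f32(y * fract)


def mod_floor(x, y):
    """mod(x, y) computed as x - y * floor(x / y), rounding after each step."""
    x, y = f32(x), f32(y)
    q = f32(x / y)
    return f32(x - f32(y * math.floor(q)))


# Compare both formulas against a double-precision reference for two of the
# operand pairs listed in the commit message.
for x, y in [(121.57, 13.29), (3769.12, 321.99)]:
    reference = f32(x) - f32(y) * math.floor(f32(x) / f32(y))
    print(x, y,
          abs(mod_fract(x, y) - reference),
          abs(mod_floor(x, y) - reference))
```

In the fract form, the rounding error of the fractional part is multiplied by y, which matches the commit's observation that the error grows with y.
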

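Commit 6dd346c232 says the generated code gave the results one would expect if tmp1 and tmp2 were signed integers. The sketch below shows, with plain integer arithmetic, why the two interpretations differ. It models only the arithmetic, not the i965 instruction or its source modifiers, and the function names are invented for this example.

```python
MASK32 = 0xFFFFFFFF


def negate_uint32(value):
    """Negate a 32-bit unsigned integer with wraparound modulo 2**32."""
    return (-value) & MASK32


def float_of_negated_uint(value):
    """Expected result of: uint tmp2 = -tmp1; float out0 = float(tmp2);

    Python floats are doubles; a GLSL float would additionally round the
    result to single precision.
    """
    return float(negate_uint32(value))


def float_of_negated_as_signed(value):
    """Result if the same 32 bits are read as a signed integer and negated."""
    signed = value - (1 << 32) if value & 0x80000000 else value
    return float(-signed)


print(float_of_negated_uint(1), float_of_negated_as_signed(1))
```

For an input of 1, the unsigned negation wraps to 2**32 - 1, while the signed reading yields -1, so the converted floats have nothing in common.
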
67854 commits