fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-25 16:58:10 +02:00

Author	SHA1	Message	Date
Dave Airlie	753ba6b999	glsl/ir: Add cloning support for doubles Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	57c6c3d3bd	glsl/ir: Add printing support for doubles Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	5a69bdb599	glsl/ir: Add builtin function support for doubles v2: add d2b, more ir_constant stuff (Ilia) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Ilia Mirkin	53bf7c8fd2	glsl: fix uniform linking logic in the presence of structs Add an enter/leave record callback so that the offset may be aligned to the proper value. Otherwise only leaf fields are called, and the first field needs to be aligned to the outer struct's base alignment while the last field needs to be aligned to the inner struct's base alignment. This removes most usage of the last field/record type values passed into visit_field. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 00:28:34 -05:00
Ilia Mirkin	1ec715ce8b	glsl: teach std140_base_alignment about samplers These functions are about to be used more aggressively for determining uniform layout. Samplers may be inside of structs, and it's easier to reuse the existing base alignment logic. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 00:28:34 -05:00
Dave Airlie	fe23bb85ba	glsl: Uniform linking support for doubles Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	3af8db94cd	glsl: Add double builtin type generation Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:33 -05:00
Dave Airlie	277f4d75a7	glsl: add ARB_gpu_shader_fp64 to the glsl extensions. (v2) v2: add define bit (Tapani Pälli) Patch makes following Piglit tests pass: arb_gpu_shader_fp64/preprocessor/define.vert arb_gpu_shader_fp64/preprocessor/define.frag Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:33 -05:00
Dave Airlie	bf257d2c90	glsl: Add double builtin type This causes a lot of warnings about unchecked type in switch statements - fix them later. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:33 -05:00
Eric Anholt	6eadde51bb	nir: Recognize and reduce duplicated fsats. No effect on vc4 shader-db. v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	1907a3a7ee	nir: Add a flag for lowering fsat. vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44356 -> 44354 (-0.00%) instructions in affected programs: 55 -> 53 (-3.64%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	e5ecf8e427	nir: Add a flag for lowering ffma. vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13966 -> 13791 (-1.25%) uniforms in affected programs: 435 -> 260 (-40.23%) total instructions in shared programs: 44732 -> 44356 (-0.84%) instructions in affected programs: 9599 -> 9223 (-3.92%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	42a8ace66e	nir: Add a flag for lowering fneg/ineg. vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44911 -> 44732 (-0.40%) instructions in affected programs: 11371 -> 11192 (-1.57%) v2: Fix broken iabs(isub(0, a)) transformation. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	cb95a228e8	nir: Add a flag for lowering fsqrt(x) to frcp(frsqrt(x)). vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13972 -> 13966 (-0.04%) uniforms in affected programs: 408 -> 402 (-1.47%) total instructions in shared programs: 44973 -> 44911 (-0.14%) instructions in affected programs: 1551 -> 1489 (-4.00%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:50 -08:00
Eric Anholt	ccf14bca4b	nir: Add lowering of POW instructions if the lower flag is set. This could be done in a separate pass like we do in GLSL IR, but it seems to me like having the definitions of the transformations in the two directions next to each other makes a lot of sense. v2: Reorder the comment about the transformation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-18 14:47:50 -08:00
Eric Anholt	8e9dbfff17	nir: Conditionalize the POW reconstruction on shader compiler options. Mesa has a shader compiler struct flagging whether GLSL IR's opt_algebraic and other passes should try and generate certain types of opcodes or patterns. Extend that to NIR by defining our own struct, which is automatically generated from the Mesa struct in glsl_to_nir and provided directly by the driver in TGSI-to-NIR. v2: Split out the previous two prep patches. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)	2015-02-18 14:47:50 -08:00
Eric Anholt	955a6bb57d	nir: Add an optional expression controlling nir_algebraic xforms. This will be used so that we can customize the transforms for the target GPU, so we don't un-lower expressions that had already been lowered (or introduce new lowering transformations that not all GPUs want) v2: Drop the complication of having the condition->index dictionary, since we don't actually expect there to be many different conditions (change by Kenneth). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-18 14:47:50 -08:00
Eric Anholt	f90bb54734	nir: Add a nir_shader_compiler_options struct pointed to by the shaders. This will be used to give the optimization passes a chance to customize behavior for the particular target device. v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:50 -08:00
Alan Coopersmith	d602fbd861	Avoid fighting with Solaris headers over isnormal() When compiling in C99 or C++11 modes, Solaris defines isnormal() as a macro via <math.h>, which causes the function definition to become too mangled to compile. Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-17 18:16:33 -08:00
Alan Coopersmith	815b3bd096	Remove extraneous ; after DECL_TYPE usage The macro is defined to provide a trailing ; so this caused the expansion to end in ";;" which made the Solaris Studio compilers issue warnings for every line of: "builtin_type_macros.h", line 113: Warning: extra ";" ignored. for every file that included the header, filling build logs with thousands of useless warnings. Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-17 18:16:33 -08:00
Kenneth Graunke	76960a55e6	glsl: Reduce memory consumption of copy propagation passes. opt_copy_propagation and opt_copy_propagation_elements create new ACP and Kill sets each time they enter a new control flow block. For if blocks, they also copy the entire existing ACP set contents into the new set. When we exit the control flow block, we discard the new sets. However, we weren't freeing them - so they lived on until the pass finished. This can waste a lot of memory (57MB on one pessimal shader). This patch makes the pass allocate ACP entries using this->acp as the memory context, and Kill entries out of this->kill. It also steals kill entries when moving them from the inner kill list to the parent. It then frees the lists, including their contents. v2: Move ralloc_free(this->acp) just before this->acp = orig_acp (suggested by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>	2015-02-17 17:33:27 -08:00
Ian Romanick	147afac80c	glcpp: Silence GCC warning glcpp/glcpp.c:124:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration] const static struct option ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-17 12:29:58 -08:00
Ilia Mirkin	b53fbec01d	glsl/tests: add IMAGE type. This fixes a warning when running make check. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-17 11:26:06 +10:00
Jason Ekstrand	dd110cdfd8	nir: Make gl_FrontFacing a system_value GLSL IR labels gl_FrontFacing as an input variable and not a system value. This commit makes NIR silently translate gl_FrontFacing to a system value so that it properly gets translated into a load_system_value intrinsic. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-14 13:47:16 -08:00
Jason Ekstrand	929f43851e	nir/lower_phis_to_scalar: Fix some logic in is_phi_scalarizable Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-14 13:46:59 -08:00
Emil Velikov	72e602905d	nir: add missing header to the sources list Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-12 13:19:13 +00:00
Emil Velikov	556fc4b84d	nir: resolve nir.h dependency list (fix make distcheck) Use nir/nir_opcodes.h as is (w/o the absolute path), as it is the target name used to generate the actual file. Otherwise the target is missing, the file won't get generated and the build will fail. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-12 13:18:52 +00:00
Matt Turner	69ad5fd4ce	glsl: Optimize (f2i(trunc x)) into (f2i x). total instructions in shared programs: 5950326 -> 5949286 (-0.02%) instructions in affected programs: 88264 -> 87224 (-1.18%) helped: 692	2015-02-11 13:50:19 -08:00
Matt Turner	c262b2b582	glsl: Optimize round-half-up pattern. Hurts some Psychonauts shaders, but after the next patch (which this enables) they're fewer instructions than before this patch.	2015-02-11 13:50:19 -08:00
Matt Turner	a5455ab1ca	glsl: Add trunc() to ir_builder.	2015-02-11 13:50:19 -08:00
Matt Turner	4c42e1116b	nir: Recognize open-coded fmin/fmax. And unfortunately other shaders do the same thing but with >=/<= which we can't apply this optimization to because of NaNs. instructions in affected programs: 23309 -> 22938 (-1.59%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-11 13:50:19 -08:00
Eric Anholt	56e21647e2	nir: Add algebraic opt for int comparisons with identical operands. No change on shader-db on i965. v2: Reword the comment due to feedback from Erik Faye-Lund Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)	2015-02-11 11:52:38 -08:00
Eric Anholt	2919bdf466	nir: Fix load_const comparisons for CSE. We want the size of a float per component, not the size of a whole vec4. NIR instructions on i965: total instructions in shared programs: 1261937 -> 1261929 (-0.00%) instructions in affected programs: 114 -> 106 (-7.02%) Looking at one of these examples (tesseract), it's from vec4 load_consts for a MRT solid fill, which do get CSEed now that we don't memcmp off the end of the const value and into the SSA def. For the 1-component loads that are common in i965, we were only memcmping off into the rest of the usually zero-filled const_value. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-11 11:52:38 -08:00
Matt Turner	ea0f0eb6c0	glsl: Optimize 1/exp(x) into exp(-x). Lots of shaders divide by exp2(...) which we turn into a multiplication by the reciprocal. We can avoid the reciprocal by simply negating exp2's argument. total instructions in shared programs: 5947154 -> 5946695 (-0.01%) instructions in affected programs: 118661 -> 118202 (-0.39%) helped: 380 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-10 17:48:44 -08:00
Matt Turner	a9065cef48	nir: Remove casts from void*. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-10 17:48:42 -08:00
Matt Turner	bb1e007157	nir: Replace assert(0) with unreachable(). Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-10 17:48:31 -08:00
Matt Turner	942b56ad05	nir: Remove unused has_indirect variable. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-10 17:48:16 -08:00
Francisco Jerez	e6146e6f14	glsl: Forbid calling the constructor of any opaque type. The spec doesn't define any opaque type constructors. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-10 15:49:43 +02:00
Francisco Jerez	c4111dfa0a	glsl: Return correct number of coordinate components for cubemap array images. Cubemap array images are unlike cubemap array samplers in that they don't need an additional coordinate to index individual cubemaps in the array, instead they behave like a 2D array of 6n layers, with n the number of cubemaps in the array. Take this exception into account. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-10 15:49:43 +02:00
Kenneth Graunke	480ee1f0b4	nir: Mark nir_print_instr's instr pointer as const. Printing instructions doesn't modify them, so we can mark the parameter const. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-10 03:37:55 -08:00
Eric Anholt	bff4cbdafa	nir: Fix broken fsat recognizer. We've probably never seen this ridiculous pattern in the wild, so it didn't matter. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-06 15:57:55 -08:00
Eric Anholt	6706537dd4	nir: Slightly simplify algebraic code generation by reusing a struct. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-06 15:57:55 -08:00
Iago Toral Quiroga	71a36e0a2c	glsl: GLSL ES identifiers cannot exceed 1024 characters v2 (Ian Romanick) - Move the check to the lexer before rallocing a copy of the large string. Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_vertex dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-06 12:21:42 +01:00
Connor Abbott	a135f34080	nir: add an optimization to remove useless phi nodes This removes phi nodes whose sources all point to the same thing. Shader-db results: total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%) NIR instructions in affected programs: 126564 -> 122480 (-3.23%) helped: 615 HURT: 0 total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%) FS instructions in affected programs: 24622 -> 23174 (-5.88%) helped: 138 HURT: 0 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 16:00:13 -05:00
Jason Ekstrand	572d1f6e41	nir/validate: Ensure that phi sources are SSA-only Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 12:52:42 -08:00
Jason Ekstrand	5420774510	nir/validate: Validate that only float ALU outputs are saturated Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 12:46:55 -08:00
Jason Ekstrand	c0df85cca4	nir/lower_source_mods: Don't lower saturate for non-float outputs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 12:46:38 -08:00
Jason Ekstrand	f2adcd36cb	nir: Add a pass to lower vector phi nodes to scalar phi nodes v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Add better comments - Use nir_ssa_dest_init and nir_src_for_ssa more places - Fix some void * casts v3 Jason Ekstrand <jason.ekstrand@intel.com>: - Rework the way we determine whether or not to scalarize a phi node to make the recursion non-bogus - Treat load_const instructions as scalarizable v4 Jason Ekstrand <jason.ekstrand@intel.com>: - Allow uniform and input loads to be scalarizable v5 Jason Ekstrand <jason.ekstrand@intel.com>: - Also consider loads of inputs (varying, uniform, or ubo) to be scalarizable. We were already doing this for load_var on uniforms and inputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:33:11 -08:00
Matt Turner	d8be1b9aba	glsl/list: Note that exec_lists may not be realloc'd. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:25:14 -08:00
Iago Toral Quiroga	5dfb085ff3	glsl: Improve precision of mod(x,y) Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement mod(x,y) as y * fract(x/y). This implementation has a downside though: it introduces precision errors due to the fract() operation. Even worse, since the result of fract() is multiplied by y, the larger y gets the larger the precision error we produce, so for large enough numbers the precision loss is significant. Some examples on i965 (operation: precision error): mod(-1.951171875, 1.9980468750): 0.0000000447; mod(121.57, 13.29): 0.0000023842; mod(3769.12, 321.99): 0.0000762939; mod(3769.12, 1321.99): 0.0001220703; mod(-987654.125, 123456.984375): 0.0160663128; mod( 987654.125, 123456.984375): 0.0312500000. This patch replaces the current lowering pass with a different one (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL man pages: mod(x,y) = x - y * floor(x/y) This implementation eliminates the precision errors at the expense of an additional add instruction on some systems. On systems that can do negate with multiply-add in a single operation this new implementation would come at no additional cost. v2 (Ian Romanick) - Do not clone operands because when they are expressions we would be duplicating them and that can lead to suboptimal code. Fixes the following 16 dEQP tests: dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_* dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_* Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
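
Illustration for commit 5dfb085ff3 (not part of the commit itself): the two mod(x,y) lowerings named in the message can be compared with a small Python sketch. This is not Mesa code. The helper names f32, mod_to_fract and mod_to_floor are hypothetical, and the sketch assumes that rounding every intermediate result to IEEE binary32 is a fair stand-in for single-precision shader arithmetic. Real hardware (for example a GPU that divides via a reciprocal) may round differently, so the printed errors need not match the i965 figures quoted in the commit message. The pair (5.0, 3.0) is an added example; the other operand pairs come from the message.

```python
import math
import struct


def f32(value):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def mod_to_fract(x, y):
    """Old lowering: y * fract(x / y), rounding every step to binary32."""
    q = f32(x / y)
    fract = f32(q - math.floor(q))
    return f32(y * fract)


def mod_to_floor(x, y):
    """New lowering: x - y * floor(x / y), rounding every step to binary32."""
    q = f32(x / y)
    product = f32(y * math.floor(q))
    return f32(x - product)


if __name__ == "__main__":
    pairs = [
        (5.0, 3.0),                      # added example, not from the commit
        (121.57, 13.29),
        (3769.12, 1321.99),
        (987654.125, 123456.984375),
    ]
    for x, y in pairs:
        x, y = f32(x), f32(y)            # operands as a shader would see them
        reference = x - y * math.floor(x / y)  # binary64 reference value
        err_fract = abs(mod_to_fract(x, y) - reference)
        err_floor = abs(mod_to_floor(x, y) - reference)
        print(f"mod({x!r}, {y!r}): fract error {err_fract:.10f}, "
              f"floor error {err_floor:.10f}")
```

For mod(5.0, 3.0) the fract-based form returns 1.9999998807907104 under this emulation, because the rounding error in x/y survives fract() and is then scaled by y. The floor-based form returns exactly 2.0, since floor() discards that error before the multiplication.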


3281 commits