fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-03-17 12:30:33 +01:00

Author	SHA1	Message	Date
Carl Worth	28510d69ff	i965: Split out brw_<stage>_populate_key into their own functions This commit splits portions of the existing brw_upload_vs_prog and brw_upload_gs_prog function into new brw_vs_populate_key and brw_gs_populate_key functions. This follows the same style as is already present for all other stages, (see brw_wm_populate_key, etc.). This commit is intended to have no functional change. It exists in preparation for some upcoming code movement in preparation for the shader cache. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-02 22:15:45 -07:00
Ilia Mirkin	01d3b750b3	nv50/ir: avoid folding immediates into imad operations Commit `09ee907266` added logic to fold immediates into mad operations, but the emission code is only there for fmad. Only allow it on float types. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 18:42:31 -04:00
Ilia Mirkin	603d28f32c	nv50/ir: fix imad emission when dst == src2 Commit `fb63df2215` added 4-byte mad support, but only supported emission for floats. Disable it for ints for now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 18:35:59 -04:00
Kenneth Graunke	da5ec2ac0b	nir: Allocate nir_tex_instr::sources out of the instruction itself. The lifetime of the sources array needs to match the nir_tex_instr itself. So, allocate it using the instruction itself as the context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:20:03 -07:00
Kenneth Graunke	7380c641b1	nir: Allocate predecessor and dominance frontier sets from block itself. These sets are part of the block, and their lifetime needs to match the block itself. So, allocate them using the block itself as the context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:20:02 -07:00
Kenneth Graunke	131444e1c5	nir: Allocate register fields out of the register itself. The lifetime of each register's use/def/if_use sets needs to match the register itself. So, allocate them using the register itself as the context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:20:01 -07:00
Kenneth Graunke	587b3a20a1	nir: Make nir_create_function() strdup the function name. glsl_to_nir passes in the ir_function's name field; we were copying the pointer, but not duplicating the memory. We want to be able to free the linked GLSL IR program after translating to NIR, so we'll need to create a copy of the function name that the NIR shader actually owns. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:20:00 -07:00
Kenneth Graunke	f61b6c3e48	nir: Free dead variables when removing them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:19:58 -07:00
Kenneth Graunke	f4e4491080	nir: Combine remove_dead_local_vars() and remove_dead_global_vars(). We can just pass a pointer to the list of variables, and reuse the code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:19:56 -07:00
Kenneth Graunke	33f0f68d59	ralloc: Implement a new ralloc_adopt() API. ralloc_adopt() reparents all children from one context to another. Conceptually, ralloc_adopt(new_ctx, old_ctx) behaves like this pseudocode: foreach child of old_ctx: ralloc_steal(new_ctx, child) However, ralloc provides no way to iterate over a memory context's children, and ralloc_adopt does this task more efficiently anyway. One potential use of this is to implement a memory-sweeper pass: first, steal all of a context's memory to a temporary context. Then, walk over anything that should be kept, and ralloc_steal it back to the original context. Finally, free the temporary context. This works when the context is something that can't be freed (i.e. an important structure). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:19:41 -07:00
Jason Ekstrand	ca3b4d6d17	nir/opt_peephole_ffma: Fix a couple typos in a comment Acked-by: Matt Turner <mattst88@gmail.com>	2015-04-02 11:09:37 -07:00
Ilia Mirkin	4609ba6ea3	mesa: add ARB_depth_buffer_float to ES3.0 required extension list Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-02 13:35:18 -04:00
Eric Anholt	a9152376b4	vc4: Add support for nir_iabs. Tested using the GLSL 1.30 tests for integer abs(). Not currently used, but it was one of the new opcodes used by robclark's idiv lowering.	2015-04-02 10:32:35 -07:00
Jason Ekstrand	e50cf5faa5	i965/generator: Get rid of the ! in the unreachable statement Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-04-02 10:21:18 -07:00
Jason Ekstrand	0573d0e484	nir/print: Correctly print swizzles for explicitly sized alu sources Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-02 10:21:18 -07:00
Ilia Mirkin	4a3c0e9950	freedreno/a3xx: add MRT support The hardware only supports 4 MRTs. It should be possible to emulate support for 8, but doesn't seem worth the trouble. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	6f4c1976f4	freedreno: convert blit program to array for each number of rts Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	d9992ab35a	freedreno: add support for laying out MRTs in gmem Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	602bc6c88d	freedreno: add core infrastructure support for MRTs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	d13803c76f	freedreno/ir3: add support for FS_COLOR0_WRITES_ALL_CBUFS property This will enable the driver to tell which regids to link up to which MRT outputs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	f27ec59084	freedreno/a3xx: add independent blend function support This is needed for MRT support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	8efa3e340d	freedreno: remove alpha key from ir3_shader This complication is unnecessary and makes MRTs more complicated and likely to generate tons of variants. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Stéphane Marchesin	70eed78cac	i915g: Implement EGL_EXT_image_dma_buf_import This adds all the plumbing to get EGL_EXT_image_dma_buf_import in i915g. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2015-04-01 20:13:37 -07:00
Matt Turner	a03d0ba78f	i965/fs: Relax type check in cmod propagation. The thing we want to avoid is int/float comparisons, but int/unsigned comparisons with 0 are equivalent. total instructions in shared programs: 6194829 -> 6193996 (-0.01%) instructions in affected programs: 117192 -> 116359 (-0.71%) helped: 471 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-01 13:43:57 -07:00
Matt Turner	781badee7a	nir: Remove useless ftrunc inside f2i/f2u. No shader-db changes, probably because they're all removed by the GLSL compiler optimization added in commit `69ad5fd4`. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	97e6c1b957	nir: Recognize (a < b \|\| a < c) as a < max(b, c). Doesn't work for analogous && cases, because of NaNs. total instructions in shared programs: 6195712 -> 6194829 (-0.01%) instructions in affected programs: 42000 -> 41117 (-2.10%) helped: 403 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	a2b6e908cf	nir: Add addition/multiplication identities of exp/log. instructions in affected programs: 2858 -> 2808 (-1.75%) helped: 12 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	099c729b4c	nir: Add identities for the log function. The rcp(log(x)) pattern affects instruction counts. instructions in affected programs: 144 -> 138 (-4.17%) helped: 6 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	8a6ae384b2	nir: Add identities for the exponential function. No changes in shader-db. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	e26783d445	nir: Recognize another open coded lrp. total instructions in shared programs: 6195924 -> 6195768 (-0.00%) instructions in affected programs: 4876 -> 4720 (-3.20%) helped: 58 HURT: 10 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	e82437e141	nir: Recognize open coded lrp. total instructions in shared programs: 6197614 -> 6195924 (-0.03%) instructions in affected programs: 34773 -> 33083 (-4.86%) helped: 147 HURT: 6 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Kenneth Graunke	25e214db00	nir: Use _mesa_flsll(InputsRead) in prog->nir. InputsRead is a 64-bit bitfield. Using _mesa_fls would silently truncate off the high bits, claiming inputs 32..56 (VARYING_SLOT_MAX) were never read. Using <= here was a hack I threw in at the last minute to fix programs which happened to use input slot 32. Switch back to using < now that the underlying problem is fixed. Fixes crashes in "Euro Truck Simulator 2" when using prog->nir, which uses input slot 33. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 13:30:13 -07:00
Kenneth Graunke	3d166b313d	mesa: Implement _mesa_flsll(). This is _mesa_fls() for 64-bit values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 13:30:13 -07:00
Kenneth Graunke	4b38c5c783	nir: In prog->nir, don't wrap dot products with ptn_channel(..., X). ptn_move_dest and nir_fadd already take care of replicating the last channel out, so we can just use a scalar and skip splatting it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:30:13 -07:00
Jason Ekstrand	218e45e2f7	i965: Use the same nir options for all gens If we tell NIR to split ffma's, then we don't need separate options anymore. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	b9d7454571	i965/nir: Run DCE again before going out of SSA We run lowering and optimization passes that might leave garbage lying around. This keeps the FS cse from having to clean it up. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	37703040a1	i965/nir: Run the ffma peephole after the rest of the optimizations The idea here is that fusing multiply-add combinations too early can reduce our ability to perform CSE and value-numbering. Instead, we split ffma opcodes up-front, hope CSE cleans up, and then fuse after-the-fact. Unless an algebraic pass does something silly where it inserts something between the multiply and the add, splitting and re-fusing should never cause a problem. We run the late algebraic optimizations after this so that things like compare-with-zero don't hurt our ability to fuse things. shader-db results for fragment shaders on Haswell: total instructions in shared programs: 4390538 -> 4379236 (-0.26%) instructions in affected programs: 989359 -> 978057 (-1.14%) helped: 5308 HURT: 97 GAINED: 78 LOST: 5 This does, unfortunately, cause some substantial hurt to a shader in Kerbal Space Program. However, the damage is caused by changing a single instruction from a ffma to an add. This, in turn, decreases register pressure in one part of the program causing it to fail to register allocate and spill. Given the overwhelmingly positive results in other shaders and the fact that the NIR for the Kerbal shaders is actually better, this should be considered a positive. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	7f344721b1	nir/peephole_ffma: Be less aggressive about fusing multiply-adds shader-db results for fragment shaders on Haswell: total instructions in shared programs: 4395688 -> 4389623 (-0.14%) instructions in affected programs: 355876 -> 349811 (-1.70%) helped: 1455 HURT: 14 GAINED: 5 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	a8c8b3b872	nir: Add a dedicated ffma peephole optimization i965/nir: Use the dedicated ffma peephole total instructions in shared programs: 4418748 -> 4394618 (-0.55%) instructions in affected programs: 1292790 -> 1268660 (-1.87%) helped: 5999 HURT: 457 GAINED: 4 LOST: 9 Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	e06a3d0282	nir: Move the compare-with-zero optimizations to the late section total instructions in shared programs: 4422307 -> 4422363 (0.00%) instructions in affected programs: 4230 -> 4286 (1.32%) helped: 0 HURT: 12 While this does hurt some things, the losses are minor and it prevents the compare-with-zero optimization from fighting with ffma which is much more important. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:03 -07:00
Jason Ekstrand	da294f9b2f	nir/algebraic: Add a separate section for "late" optimizations i965/nir: Use the late optimizations Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:03 -07:00
Jason Ekstrand	1779dc060f	nir/algebraic: Remove a duplicate optimization This optimization is repeated verbatim above Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:03 -07:00
Jason Ekstrand	22ee7eeb4e	nir/algebraic: #define around structure definitions Previously, we couldn't generate two algebraic passes in the same file because of multiple structure definitions. To solve this, we play the age-old header file trick and just #define around it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:03 -07:00
Jason Ekstrand	793a94d6b5	nir/print: Don't print extra swizzle components Previously, NIR would just print 4 swizzle components if the swizzle was anything other than foo.xyzw. This creates lots of noise if, for example, you have a one-component element with a swizzle of foo.xxxx. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-01 12:49:49 -07:00
Emil Velikov	d99135b2e9	configure: nuke --with-max-{width,height} Unused as of commit 630ab0d27ba(mesa: remove last of MAX_WIDTH, MAX_HEIGHT). Update all the remaining references to the defines. v2: Use the correct variable name in the comments Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-04-01 19:43:34 +00:00
Emil Velikov	bd4925c6ac	gallium: ship tgsi_to_nir.h in the tarball Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-04-01 19:33:37 +00:00
Matt Turner	3384179faa	glsl: Make sure not to dereference NULL. Found by Coverity.	2015-04-01 12:25:29 -07:00
Laura Ekstrand	142909f19d	main: create_buffers unlocks mutex when throwing OUT_OF_MEMORY. Ilia Mirkin found that I had forgotten to free the mutex in the error case. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-01 12:07:28 -07:00
Jose Fonseca	3321724c10	automake,scons: Put NIR source files in a separate var to fix SCons build. SCons does not build NIR yet. Trivial.	2015-04-01 19:49:09 +01:00
Jose Fonseca	7f0682cebf	automake: Fix out-of-source builds. Add include path for generated nir_opcodes.h. Trivial.	2015-04-01 19:48:09 +01:00
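
Commits 3d166b313d and 25e214db00 above describe a 64-bit InputsRead bitfield being passed to a 32-bit find-last-set helper, which silently drops the high bits. The following is a minimal sketch of that failure mode, not Mesa's code: the sketch_* functions and the num_inputs_* helpers are hypothetical stand-ins, and the sketch assumes the usual fls convention of returning the 1-based index of the highest set bit, or 0 when no bit is set.

```c
#include <stdint.h>

/* Illustrative stand-ins only; these are NOT Mesa's _mesa_fls()/_mesa_flsll().
 * Assumed convention: return the 1-based index of the highest set bit,
 * or 0 when the value is zero. */
static int sketch_fls32(uint32_t v)
{
    int n = 0;
    while (v) {
        n++;
        v >>= 1;
    }
    return n;
}

static int sketch_fls64(uint64_t v)
{
    int n = 0;
    while (v) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Hypothetical helper: how many input slots a loop would visit if the
 * 64-bit mask is narrowed to 32 bits before the highest bit is located. */
static int num_inputs_truncating(uint64_t inputs_read)
{
    return sketch_fls32((uint32_t)inputs_read); /* bits 32..63 are lost here */
}

/* Hypothetical helper: the same computation on the full 64-bit mask. */
static int num_inputs_full(uint64_t inputs_read)
{
    return sketch_fls64(inputs_read);
}
```

With a mask that has bits 0 and 33 set (the commit message mentions a program using input slot 33), num_inputs_truncating() reports 1 while num_inputs_full() reports 34, so a loop bounded by the truncated count would never reach slot 33.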


62706 commits