fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-05 11:48:06 +02:00

Author	SHA1	Message	Date
Emil Velikov	d2811c29da	docs: add news item and link release notes for mesa 10.4.3 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-01-24 13:18:10 +00:00
Emil Velikov	48818a0fc7	docs: Add sha256 sums for the 10.4.3 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `49a5bce780`)	2015-01-24 13:14:56 +00:00
Emil Velikov	9f35423270	Add release notes for the 10.4.3 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `e92bfa3f95`)	2015-01-24 13:14:54 +00:00
Matt Turner	94e7b59a75	i965: Convert CMP.GE -(abs)reg 0 -> CMP.Z reg 0. total instructions in shared programs: 5952059 -> 5951603 (-0.01%) instructions in affected programs: 138812 -> 138356 (-0.33%) GAINED: 1 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	40ae302a3c	i965/fs: Add support for removing MOV.NZ instructions. For some reason, we occasionally write the flag register with a MOV.NZ instruction: add(8) g25<1>F -g6<0,1,0>F g15<8,8,1>F cmp.l.f0(8) g26<1>D g25<8,8,1>F 0F mov.nz.f0(8) null g26<8,8,1>D A MOV.NZ instruction on the result of a CMP is like comparing for equality with true in C. It's useless. Removing it allows us to generate: add.l.f0(8) null -g6<0,1,0>F g15<8,8,1>F total instructions in shared programs: 5955701 -> 5951657 (-0.07%) instructions in affected programs: 302910 -> 298866 (-1.34%) GAINED: 1 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	9a3a294224	i965/fs: Allow flipping cond mod for negated arguments. This allows us to apply the optimization in cases where the CMP's argument is negated, by flipping the conditional mod. For example, it allows us to optimize this: add(8) temp a b cmp.l.f0(8) null -temp 0.0 into add.g.f0(8) temp a b total instructions in shared programs: 5958360 -> 5955701 (-0.04%) instructions in affected programs: 466880 -> 464221 (-0.57%) GAINED: 0 LOST: 1 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	d6317beb46	i965/fs: Propagate cmod across flag read if it contains the same value. total instructions in shared programs: 5959463 -> 5958900 (-0.01%) instructions in affected programs: 70031 -> 69468 (-0.80%) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	3fb5b2bc47	i965/fs: Add unit tests for cmod propagation pass. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	19f9cb72c8	i965/fs: Add pass to propagate conditional modifiers. total instructions in shared programs: 5974160 -> 5959463 (-0.25%) instructions in affected programs: 1743737 -> 1729040 (-0.84%) GAINED: 0 LOST: 12 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	3759a89ad3	i965/fs: Eliminate null-dst instructions without side-effects. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	7452f18b22	i965/fs: Apply conditional mod specially to split MAD/LRP. Otherwise we'll apply the conditional mod to only one of SIMD8 instructions and trigger an assertion. NoDDClr/NoDDChk have the same problem but we never apply those to these instructions, so I'm leaving them for a later time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	eed7223243	i965/fs: Add a pass to fixup 3-src instructions that have a null dest. 3-src instructions can only have GRF/MRF destinations. It's really difficult to deal with that restriction in dead code elimination (that wants to give instructions null destinations to show that their result isn't used) while allowing 3-src instructions to have conditional mod, so don't, and just give them a destination before register allocation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:39 -08:00
Matt Turner	215b081c2a	i965: Add is_3src() to backend_instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:39 -08:00
Matt Turner	0654ca7d7e	i965: Add backend_instruction::can_do_cmod(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:39 -08:00
Matt Turner	71486e9f2d	i965/cfg: Add a foreach_block_reverse macro. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:57:39 -08:00
Matt Turner	65dd4a255a	i965/cfg: Add a foreach_inst_in_block_reverse_safe macro. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:57:39 -08:00
Matt Turner	579157e6c1	glsl: Add a foreach_in_list_reverse_safe macro. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:39 -08:00
Matt Turner	c638ea3d19	i965: Don't make instructions with a null dest a barrier to scheduling. Now that we properly track accumulator dependencies, the scheduler is able to schedule instructions between the mach and mov in the common integer multiplication pattern: mul acc0, x, y mach null, x, y mov dest, acc0 Since a null destination implies no dependency on the destination, we can also safely schedule instructions (that don't write the accumulator) between the mul and mach. GAINED: 103 LOST: 43 Causes one program to spill (643 -> 1076 instructions). I committed this patch last year (commit `42a26cb5`) but reverted it (commit `0d3f83f4`) after inexplicable artifacts in Kerbal Space Program (bug 78648). Tapani reapplied this patch and could not reproduce the bug with current Mesa. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:57:39 -08:00
Ian Romanick	f02f1af9f7	i965/fs: Allow SIMD16 on pre-SNB when try_replace_with_sel is successful If try_replace_with_sel is able to replace the flow control with a SEL instruction, then there is no flow control... failing SIMD16 because of nonexistent flow control is wrong. No piglit regressions on any i965 platform in Jenkins. total instructions in shared programs: 4382707 -> 4382707 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 GAINED: 2089 LOST: 0 No other platforms affected in shader-db. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:34:47 -08:00
Eric Anholt	0680d170d1	nir: Expose nir_print_instr() for debug prints It's nice to have this present in your default cases so you can see what instruction is triggering an abort. v2: Just pass a NULL state, now that it won't crash when you do. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:30:11 -08:00
Eric Anholt	6445a40520	nir: When asked to print with a NULL state, just use bare variable names. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:30:01 -08:00
Eric Anholt	447ddfc137	nir: Add nir_lower_alu_to_scalar. This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted for vc4. v2: Use the nir_src_for_ssa() helper, and another instance of nir_alu_src_copy(). v3: Drop the non-SSA support. All intended callers will have SSA-only ALU ops. v4: Use insert_before, drop stale bcsel/fcsel comment, drop now-unused unsupported() function, drop lower_context struct. v5: Completely rename the pass to nir_lower_alu_to_scalar(), add an assert about weird input_sizes[]. Reviewed-by: Jason Ekstrand <jason.ekstrand@iastate.edu>	2015-01-23 16:37:23 -08:00
Eric Anholt	b200127816	nir: Make some helpers for copying ALU src/dests. There aren't many users yet, but I wanted to do this from my scalarizing pass. v2: Constify the src arguments. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 16:37:16 -08:00
Kenneth Graunke	15063d2ad0	nir: Add algebraic optimizations for division and reciprocal. These also exist in opt_algebraic.cpp. total NIR instructions in shared programs: 2011430 -> 2011211 (-0.01%) NIR instructions in affected programs: 42221 -> 42002 (-0.52%) helped: 198 total i965 instructions in shared programs: 6020553 -> 6020116 (-0.01%) i965 instructions in affected programs: 84322 -> 83885 (-0.52%) helped: 394 HURT: 1 (by 1 instruction) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	bbd60f6d79	nir: Add algebraic optimizations for exponential/logarithmic functions. Most of these exist in the GLSL IR algebraic pass already. However, SSA allows us to find more instances of the patterns. total NIR instructions in shared programs: 2015593 -> 2011430 (-0.21%) NIR instructions in affected programs: 124189 -> 120026 (-3.35%) helped: 604 total i965 instructions in shared programs: 6025505 -> 6018717 (-0.11%) i965 instructions in affected programs: 261295 -> 254507 (-2.60%) helped: 1295 HURT: 3 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	391fb32bbe	nir: Add algebraic optimizations for simplifying comparisons. The first batch removes bonus fnot/inot operations, possibly allowing other optimizations to better recognize patterns. The next batch replaces a fadd and constant 0.0 with an fneg - negation is usually free on GPUs, while addition is not. total NIR instructions in shared programs: 2020814 -> 2015593 (-0.26%) NIR instructions in affected programs: 411143 -> 405922 (-1.27%) helped: 2233 HURT: 214 A few shaders are hurt by a few instructions due to moving neg such that it has a constant operand, which is then folded, resulting in two distinct load_consts for x and -x. We can always clean that up later. total i965 instructions in shared programs: 6035392 -> 6025505 (-0.16%) i965 instructions in affected programs: 784980 -> 775093 (-1.26%) helped: 4508 HURT: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	551a752a59	nir: Add algebraic optimizations for pointless shifts. The GLSL IR optimization pass contained these; we may as well include them too. v2: Fix a >> 0 and a << 0 optimizations (caught by Matt). No change in the number of NIR instructions on a shader-db run. total i965 instructions in shared programs: 6035397 -> 6035392 (-0.00%) i965 instructions in affected programs: 542 -> 537 (-0.92%) helped: 2 (in glamor) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	3e56572c49	nir: Add a bunch of algebraic optimizations on logic/bit operations. Matt and I noticed a bunch of "val <- ior a a" operations in a shader, so we decided to add an algebraic optimization for that. While there, I decided to add a bunch more of them. v2: Delete bogus fand/for optimizations (caught by Jason). total NIR instructions in shared programs: 2023511 -> 2020814 (-0.13%) NIR instructions in affected programs: 149634 -> 146937 (-1.80%) helped: 1032 total i965 instructions in shared programs: 6035392 -> 6035397 (0.00%) i965 instructions in affected programs: 537 -> 542 (0.93%) HURT: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	978b0a9cda	nir: Implement CSE on intrinsics that can be eliminated and reordered. Matt and I noticed that one of the shaders hurt by INTEL_USE_NIR=1 had load_input and load_uniform intrinsics repeated several times, with the same parameters, but each one generating a distinct SSA value. This made ALU operations on those values appear distinct as well. Generating distinct SSA values is silly - these are read only variables. CSE'ing them makes everything use a single SSA value, which then allows other operations to be CSE'd away as well. Generalizing a bit, it seems like we should be able to safely CSE any intrinsics that can be eliminated and reordered. I didn't implement support for variables for the time being. v2: Assert that info->num_variables == 0 (requested by Jason). total NIR instructions in shared programs: 2435936 -> 2023511 (-16.93%) NIR instructions in affected programs: 2413496 -> 2001071 (-17.09%) helped: 16872 total i965 instructions in shared programs: 6028987 -> 6008427 (-0.34%) i965 instructions in affected programs: 640654 -> 620094 (-3.21%) helped: 2071 HURT: 585 GAINED: 14 LOST: 25 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	cbdd623f13	nir: Pull nir_instr_can_cse()'s SSA checks out of the switch. This should not be a change in behavior, as all current cases that potentially answer "yes" require SSA. The next patch will introduce another case that requires SSA. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	d7743bb1c2	i965/nir: Report NIR instruction counts (in SSA form) via KHR_debug. This allows us to count NIR instructions via shader-db. Use "run" as normal. The results file will contain both NIR and assembly. Then, to generate a NIR report: ./report.py <(grep NIR results/foo) <(grep NIR results/bar) Or, to generate an i965 report: ./report.py <(grep -v NIR results/foo) <(grep -v NIR results/bar) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	f3e06fcc6a	i965/nir: Print NIR on INTEL_DEBUG=fs. This is useful for debugging and looking for optimization opportunities. It will need to be expanded when we add support for other scalar stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	faa38e16aa	i965/nir: Do optimizations again just before lowering source mods. We want to run CSE and algebraic optimizations again after lowering IO. Some of the passes in the optimization loop don't handle saturates and other modifiers, so run it before lowering to source modifiers. total instructions in shared programs: 6046190 -> 6045768 (-0.01%) instructions in affected programs: 22406 -> 21984 (-1.88%) helped: 47 HURT: 0 GAINED: 0 LOST: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 14:53:25 -08:00
Matt Turner	9b5efac461	loader: Remove NEED_OPENGL_COMMON check. HAVE_DRICOMMON is sufficient since OpenGL must be enabled for DRI.	2015-01-23 14:28:44 -08:00
Matt Turner	2e7b62cbb9	gitignore: Ignore .tar.xz files.	2015-01-23 14:28:44 -08:00
Matt Turner	dd6f641303	mesa: Build with subdir-objects.	2015-01-23 14:28:44 -08:00
Matt Turner	145919b2ab	glsl: Build a libglsl_util library. Rather than sourcing files with ../dir/file.c which leads to distclean wiping out ../dir's .deps directory.	2015-01-23 14:28:44 -08:00
Matt Turner	a37ae2ab92	mapi: Build with subdir-objects.	2015-01-23 14:28:44 -08:00
Matt Turner	961def1074	mapi: Remove vgapi from SUBDIRS. OpenVG is disabled via autotools.	2015-01-23 14:28:44 -08:00
Matt Turner	ce98519266	mesa: Drop inclusion of glapi_gen.mk. Some glapi headers used to be generated from this Makefile.am, but no longer.	2015-01-23 14:28:43 -08:00
Matt Turner	618c3b35f1	glsl: Build with subdir-objects. Apparently $(top_srcdir) is not expanded in a source list when using subdir-objects, so remove that. It's not clear to me why we were going to such lengths to prefix each source file anyway.	2015-01-23 14:28:42 -08:00
Matt Turner	a8b880bd63	nir: Add headers to distribution.	2015-01-23 14:27:39 -08:00
Matt Turner	ae494281a4	nir: Add nir_{opt_,}algebraic.py to distribution.	2015-01-23 14:26:53 -08:00
Matt Turner	4db329ddff	mesa: Add format_{un,}pack.py to distribution.	2015-01-23 14:26:53 -08:00
Matt Turner	195488e945	mesa: Remove pack_tmp.h from sources. Missed in commit `3a4de321`.	2015-01-23 13:35:25 -08:00
Connor Abbott	68a9d0b36f	nir: add generated file to .gitignore Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 10:20:46 -08:00
Ville Syrjälä	f4b31d29d7	i965: Fix min_vs_entries for CHV According to BSpec the correct number for min_vs_entries is 34 for CHV. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2015-01-23 12:09:41 +02:00
Ville Syrjälä	99754446ab	i965: Fix max_wm_threads for CHV Change max_wm_threads to match the spec on CHV. The max number of threads in 3DSTATE_PS is always programmed to 64 and the hardware internally scales that depending on the GT SKU. So this doesn't change the max number of threads actually used, but it does affect the scratch space calculation. On CHV the old value was too small, so the amount of scratch space allocated wasn't sufficient to satisfy the actual max number of threads used. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2015-01-23 12:09:35 +02:00
Connor Abbott	c8761c8559	glsl: fix stale comment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-23 00:23:51 -05:00
Jason Ekstrand	6be2434031	i965/emit: Assert that src1 is not an MRF after doing the MRF->GRF conversion When emitting texturing from indirect texture units, we need to be able to scratch around in the header message. Since we only do this for >= HSW, this is ok since there are no MRFs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj phogat <anuj.phogat@gmail.com>	2015-01-22 16:00:34 -08:00

67731 commits