fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-05 05:18:08 +02:00

Author	SHA1	Message	Date
Brian Paul	971122a9c0	st/mesa: minor simplification of some state atom assignments	2014-07-09 06:43:25 -06:00
Brian Paul	301ffe7b26	st/mesa: minor fix-up in st_GetSamplePosition() If the driver doesn't implement get_sample_position(), let's return some non-garbage values.	2014-07-09 06:43:25 -06:00
Brian Paul	91affc8b32	mesa: use float to silence MSVC warning in _mesa_GetMultisamplefv()	2014-07-09 06:43:25 -06:00
Samuel Pitoiset	50bbe49c33	nvc0: allocate more space before a counter is configured On nvc0, a counter can have up to 6 sources instead of only one for nve4+. This fixes a crash when a counter uses more than one source. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-08 19:41:00 -04:00
Tobias Klausmann	a9b21015f5	nv50/ir: use unordered_set instead of list to keep track of var uses The set of variable uses does not need to be ordered in any way, and removing/adding elements is a fairly common operation in various optimization passes. This shortens runtime of piglit test fp-long-alu to ~22s from ~4h Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-08 19:41:00 -04:00
Kenneth Graunke	503391b46f	i965/disasm: Fix disassembly of the any16h/all16h predicates. BRW_PREDICATE_ALIGN1_ANY16H was incorrectly being disassembled as "all16h", and ALL16H would probably print as "(null)". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-08 12:31:01 -07:00
Kenneth Graunke	e13a6406c3	glsl: Fix the foreach_in_list_reverse macro. We clearly don't want to start at the head and walk backwards; we want to start at the last real element before the tail sentinel. If the list is empty, tail_pred will be the head sentinel, and we'll stop. Nothing uses this function, so I guess nobody noticed it was broken. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-08 12:31:01 -07:00
Marek Olšák	be536efe20	radeonsi: mark MSAA config state as dirty at the beginning of CS Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81020 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-08 20:46:23 +02:00
Marek Olšák	fe6be9926f	gallium: fix u_default_transfer_inline_write for textures This doesn't fix any known issue. In fact, radeon drivers ignore all the discard flags for textures and implicitly do "discard range" for any write transfer. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-08 20:46:23 +02:00
Matt Turner	cf430408c4	i965: Remove artificial dependency between math instructions. ... on Gen6+. I'm not actually sure which class Gen6 fits into. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-08 11:12:02 -07:00
Matt Turner	099cbc1477	i965/fs: Track dependencies in instruction scheduling per reg offset. Previously instruction scheduling tracked dependencies on a per-register basis. This meant that there was an artificial dependency between interpolation instructions writing into the same virtual register. Instruction scheduling would insert a number of instructions between the two instructions in this example, when they are actually independent. linterp vgrf8+0.0:F, hw_reg2:F, hw_reg3:F, hw_reg6:F linterp vgrf8+1.0:F, hw_reg2:F, hw_reg3:F, hw_reg6+16:F This led to cases where the first texture coordinate is interpolated at the beginning of the shader, but the second is done immediately before the texture operation that uses it as a source. After this change, the artificial dependency is removed and the interpolation instructions are scheduled together. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-08 11:12:02 -07:00
Jon TURNEY	7a641dd58d	configure: Don't special case Cygwin to use gnu99, define _XOPEN_SOURCE instead Revert "build: Build on Cygwin with gnu99 instead of c99." and define _XOPEN_SOURCE appropriately. This reverts commit `53e36d333c`. Since Cygwin 1.7.18 (April 2013), its headers correctly prototype strtoll() when using -std=c99, and correctly prototype strdup() when _XOPEN_SOURCE is defined appropriately, so this workaround is no longer needed. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Cc: Vinson Lee <vlee@freedesktop.org>	2014-07-08 14:25:21 +01:00
Chia-I Wu	8ff16111ee	ilo: fix fence reference counting The old code was complicated, and was wrong when *ptr is NULL.	2014-07-08 15:00:36 +08:00
Kristian Høgsberg	bbefb15e01	i965: Extend compute-to-mrf pass to understand blocks of MOVs The current compute-to-mrf pass doesn't handle blocks of MOVs. Shaders that end with a texture fetch followed by an fb write are left like this: 0x00000000: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000008: pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000010: send(8) g2<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 WE_normal 1Q }; 0x00000020: mov(8) g113<1>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000028: mov(8) g114<1>F g3<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000030: mov(8) g115<1>F g4<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000038: mov(8) g116<1>F g5<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000040: sendc(8) null g113<8,8,1>F render ( RT write, 0, 4, 12) mlen 4 rlen 0 { align1 WE_normal 1Q EOT }; This patch lets compute-to-mrf recognize blocks of MOVs and match them to instructions (typically SEND) that write multiple registers. With this, the above shader becomes: 0x00000000: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000008: pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000010: send(8) g113<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 WE_normal 1Q }; 0x00000020: sendc(8) null g113<8,8,1>F render ( RT write, 0, 20, 12) mlen 4 rlen 0 { align1 WE_normal 1Q EOT }; which is the bulk of the shader db results: total instructions in shared programs: 987040 -> 986720 (-0.03%) instructions in affected programs: 844 -> 524 (-37.91%) GAINED: 0 LOST: 0 The optimization also applies to MRT shaders that write the same color value to multiple RTs, in which case we can eliminate four MOVs in a similar fashion. See fbo-drawbuffers2-blend in piglit for an example. No measurable performance impact. No piglit regressions. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-07-07 23:39:40 -07:00
Ilia Mirkin	8aa34dc9cb	nvc0/ir: fill offset in properly for TXD Apparently TXD wants its offset differently than TEX, accepting it in the upper bits of the layer index. Unclear what happens when this is combined with indirect sampler indexing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-08 00:14:33 -04:00
Ilia Mirkin	114d46829d	nvc0/ir: use manual TXD when offsets are involved Something about how we're implementing offsets for TXD is wrong, just flip to the generic quadop-based implementation in that case. This is the minimal fix appropriate for backporting. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-08 00:14:33 -04:00
Ilia Mirkin	afea9bae67	nvc0/ir: do quadops on the right texture coordinates for TXD handleTEX moves the layer as the first argument. This makes sure that the quadops deal with the texture coordinates. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-08 00:14:33 -04:00
Ilia Mirkin	1065aa92f4	nv50/ir: ignore bias for samplerCubeShadow on nv50 Unfortunately there's no good way to do this on the nv50 shader isa. Dropping the bias seems preferable to doing the compare post-filtering. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-08 00:14:33 -04:00
Ilia Mirkin	30d91e0eec	nv50/ir: retrieve shadow compare from first arg This can only happen with texture(samplerCubeShadow, bias), where the compare will be in the first argument. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-08 00:14:33 -04:00
Carl Worth	9007c4f9f4	docs: Import 10.2.3 release notes And add a news item.	2014-07-07 16:28:37 -07:00
Matt Turner	f6db414f3c	i965/fs: Disable unlit_centroid_workaround on Haswell. Although the HSW PRM shows it, the BSpec lists this workaround as being for Ivybridge only. total instructions in shared programs: 1994951 -> 1993675 (-0.06%) instructions in affected programs: 27325 -> 26049 (-4.67%)	2014-07-06 18:19:17 -07:00
Matt Turner	6f7c4a8d05	i965/vec4: Perform CSE on CMP(N) instructions. Port of commit `b16b3c87` to the vec4 code. No shader-db improvements, but might as well. The fs backend saw an improvement because it's scalar and multiple identical CMP instructions were generated by the SEL peepholes.	2014-07-06 18:19:15 -07:00
Matt Turner	7921bf0062	i965/vec4: Don't emit null MOVs in CSE. Port of commit `219b43c6` to the vec4 code.	2014-07-06 18:18:52 -07:00
Matt Turner	949991cc99	i965/vec4: Improve CSE performance by expiring some available expressions. Port of commit `5daf867f` to the vec4 code.	2014-07-06 18:18:52 -07:00
Kenneth Graunke	3c8dc48ad1	i965/vec4: Add basic common subexpression elimination. [mattst88]: Modified to perform CSE on instructions with the same writemask. Offered no improvement before. total instructions in shared programs: 1995633 -> 1995185 (-0.02%) instructions in affected programs: 14410 -> 13962 (-3.11%) Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-06 18:18:51 -07:00
Matt Turner	848fc7f710	i965: Fix warnings introduced in commit `e24ef5ab`. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-06 18:15:36 -07:00
Christian König	042b061fef	gallium/radeon: use PRIX64 instead of PRIu64 We want hex values here, not decimals. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-06 13:28:04 +02:00
Matt Turner	1580865a8c	i965: Move assembly annotation functions to intel_asm_annotation.c. It's C. Compile it as such. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	423932791d	i965: Rename intel_asm_printer -> intel_asm_annotation. The #ifndef include guards already said the right thing :) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	6d3e24a5c2	i965: Make backend_instruction usable from C. With a hack to place an exec_node in the struct in C to be at the same location as the inherited exec_node in C++. Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	0db30fcf89	i965/cfg: Make cfg_t usable from C. Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	857c06236c	i965: Repack backend_instruction struct. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	ce706b4a9b	i965: Make a brw_predicate enum. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	46e5b2a497	i965: Make a brw_conditional_mod enum. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	ab74a42eef	i965: Move common fields into backend_instruction. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	3de11cacf0	i965: Use enum brw_reg_type for register types. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	34ef6a7651	i965: Move is_zero/one/null/accumulator into backend_reg. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	c019105f37	i965: Make a common backend_reg class. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	9377b189f7	i965: Drop imm union from visitor register classes. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:29 -07:00
Matt Turner	53992a102f	i965: Use immediate storage in brw_reg for visitor regs. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:29 -07:00
Andreas Boll	45446efc30	docs: add news item for mesa-demos 8.2.0 release	2014-07-05 11:32:54 +02:00
Chris Forbes	4087d9ec0b	glsl: Fix merging of layout(invocations) with other qualifiers If another layout qualifier appeared to the left of `invocations` in the GS input layout declaration, the invocation count would be dropped on the floor. Fixes the piglit tests: spec/ARB_transform_feedback3/arb_transform_feedback3-ext_interleaved_two_bufs_gs_max spec/ARB_gpu_shader5/arb_gpu_shader5-invocation-id spec/ARB_gpu_shader5/compiler/correct-multiple-layout-qualifier-invocations.geom spec/ARB_gpu_shader5/execution/invocations-conflicting Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-05 09:42:17 +12:00
Ilia Mirkin	9a37eb8adb	nvc0: add a memory barrier when there are persistent UBOs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-03 20:08:41 -04:00
Ilia Mirkin	5d4f5218bb	nv50: do an explicit flush on draw when there are persistent buffers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-03 20:01:07 -04:00
Ilia Mirkin	b2b7c65122	nv50: disable dedicated ubo upload method The hardware allows multiple simultaneous renders with the same memory-backed constbufs but with each invocation having different values. However in order for that to work, the data has to be streamed in via the right constbuf slot. We weren't doing that for UBOs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>	2014-07-03 20:01:06 -04:00
Ilia Mirkin	32b71246e7	gallium: rename PIPE_CAP_TGSI_VS_LAYER to also have _VIEWPORT Now that this cap is used to determine the availability of both, adjust its name to reflect the new reality. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-03 19:39:25 -04:00
Ilia Mirkin	0fb6f1bf1d	mesa/st: enable AMD_vertex_shader_viewport_index The assumption is that any driver capable of emitting layer from the vertex shader and supporting viewports should be able to also handle emitting viewport index from the vertex shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tobias Droste <tdroste@gmx.de>	2014-07-03 19:39:25 -04:00
Ilia Mirkin	313acb3ffa	r600g: allow vs to write to gl_ViewportIndex Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tobias Droste <tdroste@gmx.de>	2014-07-03 19:39:25 -04:00
Thomas Hellstrom	556a415033	svga: Don't unnecessarily reemit BindGBShader commands v2 The Linux winsys can no longer relocate shader code, so avoid reemitting BindGBShader commands. They are costly. v2: Correctly handle errors from SVGA3D_BindGBShader() Reported-by: Michael Banack <banackm@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-07-03 22:26:00 +02:00
Aaron Watry	824197efd5	radeon/llvm: Allocate space for kernel metadata operands Previously, we were assuming that kernel metadata nodes only had 1 operand. Kernels which have attributes can have more than 1, e.g.: !0 = metadata !{void (i32 addrspace(1)) @testKernel, metadata !1} !1 = metadata !{metadata !"work_group_size_hint", i32 4, i32 1, i32 1} Attempting to get the kernel without the correct number of attributes led to memory corruption and luxrays crashing out. Fixes the cl/program/execute/attributes.cl piglit test. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76223 CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-03 15:18:03 -05:00
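
Note on e13a6406c3 (glsl: Fix the foreach_in_list_reverse macro): the message describes reverse iteration over a sentinel-terminated list, starting at the last real element and stopping at the head sentinel, so that an empty list yields no iterations. The following is only a minimal sketch of that idea. It assumes a simplified list with two explicit sentinel nodes; struct node, struct list, list_foreach_reverse and the helper functions are hypothetical names for illustration, not Mesa's exec_list API or the actual macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified list node; not Mesa's exec_node. */
struct node {
    struct node *prev;
    struct node *next;
    int value;
};

/* Hypothetical list with two explicit sentinels. */
struct list {
    struct node head; /* head sentinel: head.prev is always NULL */
    struct node tail; /* tail sentinel: tail.next is always NULL */
};

static void list_init(struct list *l)
{
    l->head.prev = NULL;
    l->head.next = &l->tail;
    l->tail.prev = &l->head;
    l->tail.next = NULL;
}

static void list_push_tail(struct list *l, struct node *n)
{
    assert(n != NULL);
    n->next = &l->tail;
    n->prev = l->tail.prev;
    l->tail.prev->next = n;
    l->tail.prev = n;
}

/* Start at the last real element (the node before the tail sentinel) and
 * stop on reaching the head sentinel, whose prev pointer is NULL. For an
 * empty list, tail.prev is the head sentinel, so the body never runs. */
#define list_foreach_reverse(it, l) \
    for (struct node *it = (l)->tail.prev; it->prev != NULL; it = it->prev)

/* Copy up to cap values into out in reverse order; return how many. */
static size_t list_collect_reverse(struct list *l, int *out, size_t cap)
{
    size_t n = 0;
    list_foreach_reverse(it, l) {
        if (n == cap)
            break;
        out[n++] = it->value;
    }
    return n;
}
```

Starting the walk at the head sentinel instead, which is the bug the message describes, would test the sentinel's NULL prev pointer first and end the loop before visiting any element.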

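
Note on 042b061fef (gallium/radeon: use PRIX64 instead of PRIu64): both macros come from the standard <inttypes.h> header and expand to printf conversion specifiers for uint64_t. PRIu64 prints decimal and PRIX64 prints uppercase hexadecimal. The sketch below only illustrates that difference; format_hex64 and format_dec64 are hypothetical helpers, not functions from the radeon driver.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write v as zero-padded uppercase hex, e.g. an
 * address or register value in a debug message. */
static int format_hex64(char *buf, size_t len, uint64_t v)
{
    assert(buf != NULL);
    return snprintf(buf, len, "0x%016" PRIX64, v);
}

/* Hypothetical helper: write the same value in decimal for comparison. */
static int format_dec64(char *buf, size_t len, uint64_t v)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%" PRIu64, v);
}
```

Calling format_hex64 with the value 255 produces the text 0x00000000000000FF, while format_dec64 produces 255.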
63861 commits