fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-05 05:18:08 +02:00

Author	SHA1	Message	Date
Kenneth Graunke	4a1c8a3037	i965: Push most TES inputs in SIMD8 mode. Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 32 vec4 slots (16 registers) is more than sufficient to ensure that 100% of TES inputs are pushed for Shadow of Mordor, Unigine Heaven, GPUTest/TessMark, and SynMark. Note that unlike most SIMD8 stages, this actually reads packed vec4 data, since that is what our vec4 TCS programs write. Improves performance in GPUTest's tessmark_x64 microbenchmark by 93.4426% +/- 5.35541% (n = 25) on my Lenovo X250 at 1024x768. Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 22.74% +/- 0.309394% (n = 5). Improves performance in Shadow of Mordor at low settings with tessellation enabled at 1280x720 by 2.12197% +/- 0.478553% (n = 4). shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 184358 -> 181181 (-1.72%) instructions in affected programs: 27971 -> 24794 (-11.36%) helped: 226 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	b022150d70	i965: Use LOAD_PAYLOAD for SIMD8 TES input loads, not MOV. We need a MOV to replicate g0.0<0,1,0> to all 8 channels. Since the message payload is a single register, MOV seemed more sensible than LOAD_PAYLOAD. However, MOV cannot be CSE'd, while LOAD_PAYLOAD can. All input loads can use the same header - we don't need to re-expand g0 every time. CSE accomplishes this, saving instructions. shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 186923 -> 184358 (-1.37%) instructions in affected programs: 30536 -> 27971 (-8.40%) helped: 226 HURT: 0 total cycles in shared programs: 1009850 -> 1005356 (-0.45%) cycles in affected programs: 168206 -> 163712 (-2.67%) helped: 226 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	53a9b6223f	i965: Move 3-src subnr swizzle handling into the vec4 backend. While most align16 instructions only support a SubRegNum of 0 or 4 (using swizzling to control the other channels), 3-src instructions actually support arbitrary SubRegNums. When the RepCtrl bit is set, we believe it ignores the swizzle and uses the equivalent of a <0,1,0> region from the subnr. In the past, we adopted a vec4-centric approach of specifying subnr of 0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper SubRegNum. This isn't a great fit for the scalar backend, where we don't set swizzles at all, and happily set subnrs in the range [0, 7]. This patch changes brw_eu_emit.c to use subnr and swizzle directly, relying on the higher levels to set them sensibly. This should fix problems where scalar sources get copy propagated into 3-src instructions in the FS backend. I've only observed this with TES push model inputs, but I suppose it could happen in other cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Eric Anholt	64253fdb2e	vc4: Fix build from upload changes.	2016-01-02 17:33:19 -08:00
Nicolai Hähnle	8f384d07a8	gallium/radeon: send LLVM diagnostics as debug messages Diagnostics sent during code generation and the error message reported by LLVMTargetMachineEmitToMemoryBuffer are disjoint reporting mechanisms. We take care of both and also send an explicit message indicating failure at the end, so that log parsers can more easily tell the boundary between shader compiles. Removed an fprintf that could never be triggered. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	255ccd1e99	gallium/radeon: pass pipe_debug_callback into radeon_llvm_compile (v2) This will allow us to send shader debug info via the context's debug callback. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	f8cd11403a	radeonsi: send shader info as debug messages in addition to stderr output The output via stderr is very helpful for ad-hoc debugging tasks, so that remains unchanged, but having the information available via debug messages as well will allow the use of parallel shader-db runs. Shader stats are always provided (if the context is a debug context, that is), but you still have to enable the appropriate R600_DEBUG flags to get disassembly (since it is rather spammy and is only generated by LLVM when we explicitly ask for it). Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	4bb1c8dfec	radeonsi: pass pipe_debug_callback down into si_shader_binary_read (v2) This will allow us to send shader debug info. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Nicolai Hähnle	b6847062dd	gallium/radeon: implement set_debug_callback Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Marek Olšák	ecb2da1559	u_upload_mgr: allow specifying PIPE_USAGE_* for the upload buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:45 +01:00
Marek Olšák	37d0aea772	u_upload_mgr: remove alignment parameter from u_upload_create Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:45 +01:00
Marek Olšák	1bb79c3a7b	u_upload_mgr: pass alignment to u_upload_buffer manually Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	e0f932846c	u_upload_mgr: pass alignment to u_upload_data manually Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	020009f7cc	u_upload_mgr: pass alignment to u_upload_alloc manually The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	ffc4716e97	u_upload_mgr: rework the application of alignment The function only aligned the size, but not the offset. The offset was aligned only when the previous suballocation was aligned. That yielded the correct offset alignment if the alignment was constant for all suballocations. Instead, directly align the offset, but allow an unaligned size. There is no change in behavior, because the alignment is constant at the moment. This is a prerequisite for allowing a variable alignment for suballocations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	36c93a6fae	st/mesa: fix GLSL uniform updates for glBitmap & glDrawPixels (v2) Spotted by luck. The GLSL uniform storage is only associated once in LinkShader and can't be reallocated afterwards, because that would break the association. v2: don't remove st_upload_constants calls, clarify why they're needed Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>	2016-01-02 15:15:44 +01:00
Marek Olšák	294ed5cd13	program: add _mesa_reserve_parameter_storage The next commit will use this. Reviewed-by: Brian Paul <brianp@vmware.com> Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>	2016-01-02 15:15:44 +01:00
Jordan Justen	a2942d8f26	mesa: Fix warning with MESA_VERBOSE=api for BindBufferRange Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-01 17:27:14 -08:00
Ilia Mirkin	c1d14c6817	nv50,nvc0: make sure there's pushbuf space and that we ref the bo early First off, we can't flush in the middle of a command. Secondly, requesting the extra push space might cause a flush to happen. If that flush happens, we'd have to do the PUSH_REFN again. So instead do PUSH_REFN after the push space request. This helps avoid rare crashes with supertuxkart in libdrm due to assertion failures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-01 19:52:41 -05:00
Ilia Mirkin	33a415310b	st/mesa: sort extensions enablement array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-01 19:50:02 -05:00
Rob Clark	816ddee6b8	nir/lower_clip: add missing writemask on store Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-01 15:32:46 -05:00
Jordan Justen	3dce7bf268	mesa: Add MESA_VERBOSE=api for GL_ARB_program_interface_query v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-01 12:00:51 -08:00
Jordan Justen	36db91c4c4	mesa: Add MESA_VERBOSE=api for several indexed BindBuffer variants v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-01 12:00:51 -08:00
Dave Airlie	b835255992	st/glsl_to_tgsi: fix block movs for doubles While playing with fp64, I disabled varying packing to debug something else, and noticed we never emitted half the output movs for double matrix arrays. We should be moving the left index two slots for dual source doubles, and the right index two slots for non-vs input doubles. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	d214ce86cf	st/glsl_to_tgsi: handle different attrib size vertex inputs are counted differently in some cases, with vertex inputs we need to make sure we don't double count them. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	dc7b33c1f3	st/glsl_to_tgsi: readd the double_reg2 for input index mapping Otherwise we end up emitting the wrong index for the second double. This fixes dmat-vs-gs-tcs-tes.shader_test and dvec3-vs-gs-tcs-tes.shader_test Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	84dbf3c4ff	st/glsl_to_tgsi: when doing reladdr get vec4 of correct type This fixes fp64 relative addressing, in the upcoming dmat-vs-gs-tcs-tes.shader_test. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	d87894b98f	st/glsl_to_tgsi: handle double immediates in matrices properly. This handles matrix initialisation properly. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	7351c7684f	st/glsl_to_tgsi: setup writemask for double arrays and matrices. It's important for the double instruction emission code that the writemasks are correct going in for double so it knows which channels to replicate. This fixes it for the array and matrix cases. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	14506dcae2	st/glsl_to_tgsi: handle doubles in array shrinking code. This code takes into account double inputs in the array shrinking code. This fixes some issues with doubles and geom/tess inputs. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	aab0c6c9c4	st/glsl_to_tgsi: handle doubles outputs in arrays. This handles the case where a double output is stored in an array, and tracks it for use in the double instruction emit code. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	fc890d703e	st/glsl_to_tgsi: store if dst is double in array This is just a precursor patch to a fix for doubles with tessellation that I've written. We need to descend into output arrays in that case and mark dst's as double. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Kenneth Graunke	65d3f85eb3	nvc0: Set winding order regardless of domain. Quads need to respect winding order, too - not just triangles. Fixes rendering in GFXBench 4.0's tessellation benchmark. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-30 16:04:12 -08:00
Kenneth Graunke	7cdc2b9ca0	glsl: Fix varying struct locations when varying packing is disabled. varying_matches::record tries to compute the number of components in each varying, which varying_matches::assign_locations uses to assign locations. With varying packing, it uses glsl_type::component_slots() to come up with a reasonable value. Without varying packing, it fell back to an open-coded computation that didn't bother to handle structs at all. I believe we can simply use 4 * glsl_type::count_attribute_slots(false), which already handles these cases correctly. Partially fixes rendering in GFXBench 4.0's tessellation benchmark. (NVE0 is almost right after this, but i965 is still mostly garbage.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-30 16:04:12 -08:00
Kenneth Graunke	4acf71c89b	drirc: Disable ARB_blend_func_extended for Heaven 4.0/Valley 1.0. Unigine Heaven 4.0 and Valley 1.0 use dual color blending but don't specify which fragment shader output is which, so there's at best a 50/50 chance of us guessing it correctly. This is invalid. Unigine fixed this in 4.1 and 1.1 versions over a year and a half ago, but hasn't actually released them for whatever reason. So, add the workaround back so that it works for most people. Fixes Heaven 4.0/Valley 1.0 rendering on Ivybridge. For whatever reason, Broadwell worked. 4.1 and 1.1 have always worked. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-12-30 16:04:12 -08:00
Ilia Mirkin	5ac15f788b	glsl: add GL_ARB_shader_draw_parameters define Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-30 18:59:18 -05:00
Ilia Mirkin	517a93b346	nvc0: add ARB_shader_draw_parameters support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-30 16:55:57 -05:00
Ilia Mirkin	89bda9772d	st/mesa: add GL_ARB_shader_draw_parameters support Hooks up the new system values, passes the drawid in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	daaf0bdf46	gallium: add a drawid to pipe_draw_info This will allow the state tracker to inform the driver where in a broken-up multidraw we currently are. This can then be passed into the vertex shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	87b4e4e29f	gallium: add PIPE_CAP_DRAW_PARAMETERS This allows the state tracker to know that the various draw parameters are available in vertex shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	bb52ea45cc	gallium: add baseinstance/drawid semantics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	d50e6128b8	nv50/ir: attempt to do more constant folding on mad -> add conversion The add might actually have a 0 as an argument, which would convert it into a mov. Make sure to detect that. Also avoid the hack of putting the immediate directly into the instruction, instead use a mov to put it into place and let the later LoadPropagation pass place it if possible. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-30 12:29:07 -05:00
Marta Lofstedt	97685ff10e	i965/gen8: Always use BRW_REGISTER_TYPE_UW for MUL on GEN8+ The imulExtended tests in the shader bitfield tests of the OpenGL ES 3.1 CTS fail on gen8+ when BRW_REGISTER_TYPE_W is used for SHADER_OPCODE_MULH. Also, remove unused helper function: static inline bool type_is_signed(unsigned type) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-30 09:29:14 +01:00
Timothy Arceri	0d4cd045c8	glsl: tidy up struct with a single member There used to be more members but they now share other fields in order to keep memory use low. Also making the naming more generic will allow us to reuse the field for explicit byte offsets within blocks for ARB_enhanced_layouts. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-30 11:52:05 +11:00
Emil Velikov	2c1a215409	glsl/linker: annotate static functions as such Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-30 11:51:58 +11:00
Emil Velikov	c704b89fe4	glsl: annotate ast_process_struct_or_iface_block_members() as static Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-30 11:51:51 +11:00
Jason Ekstrand	0119773ffc	nir/builder: Add an init function that creates a simple shader for you A hugely common case when using nir_builder is to have a shader with a single function called main. This adds a helper that gives you just that. This commit also makes us use it in the NIR control-flow unit tests as well as tgsi_to_nir and prog_to_nir. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-29 13:44:05 -08:00
Kristian Høgsberg Kristensen	55ca5b0e74	mesa/st: Pad out _mesa_sysval_to_semantic for new SYSTEM_VALUE_* enums GL_ARB_shader_draw_parameters added two new system values. This gets us back to mapping mesa system values to the right TGSI semantics. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-29 12:15:01 -08:00
Ilia Mirkin	724134f683	nv50/ir: float(s32 & 0xff) = float(u8), not s8 Make sure to make conversion unsigned when we're ANDing the high bits away. Fixes corruption in dolphin. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-29 15:08:20 -05:00
Kristian Høgsberg Kristensen	581f81860e	i965: Reemit vertex state between indirect multi draws If we're doing an indirect draw, prims[i].basevertex is always 0 and the real base vertex value is in the indirect parameter buffer. We try to avoid flagging BRW_NEW_VERTICES if prims[i].basevertex doesn't change, which then breaks down for indirect draws. Thus, if a program uses base vertex or base instance, and the draw call is indirect, always flag BRW_NEW_VERTICES. A new piglit test, spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-29 10:39:25 -08:00
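
The alignment rework described in commit ffc4716e97 above can be illustrated with a small model. This is only a sketch in Python, not the Mesa code: the class names, the `alloc` method, and the `align_up` helper are hypothetical, and the model ignores buffer capacity and reallocation. It contrasts rounding up only the size of each suballocation with rounding up the offset directly.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


class SizeAlignedUploader:
    """Hypothetical model of the old scheme: only the size is rounded up."""

    def __init__(self):
        self.offset = 0

    def alloc(self, size, alignment):
        start = self.offset
        # The next offset is aligned only because this size was padded.
        self.offset += align_up(size, alignment)
        return start


class OffsetAlignedUploader:
    """Hypothetical model of the new scheme: the offset is rounded up."""

    def __init__(self):
        self.offset = 0

    def alloc(self, size, alignment):
        # Align the start directly; the size may stay unaligned.
        start = align_up(self.offset, alignment)
        self.offset = start + size
        return start


if __name__ == "__main__":
    # Constant alignment: both schemes hand out the same offsets.
    old, new = SizeAlignedUploader(), OffsetAlignedUploader()
    print([old.alloc(s, 16) for s in (10, 20, 5)])
    print([new.alloc(s, 16) for s in (10, 20, 5)])

    # Variable alignment: only the offset-aligned scheme honors the request.
    old, new = SizeAlignedUploader(), OffsetAlignedUploader()
    old.alloc(10, 4)
    new.alloc(10, 4)
    print(old.alloc(8, 64), new.alloc(8, 64))
```

With a constant alignment the two models return identical offsets, which matches the commit's note that behavior does not change yet. Once callers pass different alignments, only the offset-aligned model returns a start that satisfies each request.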

75410 commits