fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-27 07:40:31 +01:00

Author	SHA1	Message	Date
Oded Gabbay	529aa8249a	llvmpipe: fix arguments order given to vec_andc This patch fixes a classic "confuse the enemy" bug. _mm_andnot_si128 (SSE) and vec_andc (VMX) do the same operation, but the arguments are opposite. _mm_andnot_si128 performs "r = (~a) & b" while vec_andc performs "r = a & (~b)" To make sure this error won't return in another place, I added a wrapper function, vec_andnot_si128, in u_pwr8.h, which makes the swap inside. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-17 21:07:27 +02:00
Rob Clark	02ac91d717	freedreno/ir3: fix mad 3rd src delay calc In `fad158a0` ("freedreno/ir3: array rework") the src # (n) shifted by one, but missed updating delay-slot calc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-17 12:21:45 -05:00
Rob Clark	2a6ec1e061	freedreno/ir3: better array register allocation Detect arrays which don't conflict with each other and allow overlapping register allocation. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:23:52 -05:00
Rob Clark	6a33c5c0df	freedreno/ir3: array offset can be negative It at least happens with some piglit tests, like $piglit/bin/vp-address-01 VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..7] DCL ADDR[0] 0: ARL ADDR[0].x, IN[1].xxxx 1: MOV_SAT OUT[1], CONST[ADDR[0].x-1] 2: DP4 OUT[0].x, CONST[4], IN[0] 3: DP4 OUT[0].y, CONST[5], IN[0] 4: DP4 OUT[0].z, CONST[6], IN[0] 5: DP4 OUT[0].w, CONST[7], IN[0] 6: END Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:23:20 -05:00
Rob Clark	ddede497b8	freedreno/ir3: workaround bug/feature Seems like in certain cases, we cannot use c<a0.x+0> as the third src to cat3 instructions. This may be slightly conservative, we may only have this restriction when the first src is also const. This fixes, for example, +24/-0 of the variable-indexing piglit tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:22:43 -05:00
Rob Clark	ebd3a1fc17	ttn: use writemask for store_var Only user is freedreno, and after array-rework it can cope. Avoids generating loads for a store. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:21:52 -05:00
Rob Clark	fad158a0e0	freedreno/ir3: array rework Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:21:08 -05:00
Rob Clark	cc7ed34df9	freedreno/ir3: refactor/simplify cp If we handle separately the special case of eliminating output mov (which includes keeps and various other cases where we don't have a consuming instruction's src register to collapse things into), we can simplify the logic. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:20:46 -05:00
Rob Clark	680664dff9	freedreno/ir3: fix incorrect decoding of mov instructions Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:20:37 -05:00
Rob Clark	2809c87f90	freedreno/ir3: remove unused tgsi tokens ptr Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:18:59 -05:00
Rob Clark	fc0d2f7e02	freedreno/ir3: bit of ra refactor Shuffle things slightly, passing instr-data to ra_name() to reduce the number of places where we need to add support for array names. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:18:47 -05:00
Rob Clark	d430f443de	freedreno/ir3: cosmetic de-indent Collapse two nested if's into one to reduce indent level. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:18:33 -05:00
Rob Clark	6f0377d651	ttn: add missing writemask on store_output Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-16 13:35:44 -05:00
Rob Clark	683794fd60	nir/print: const_index is signed Noticed this with $piglit/bin/vp-address-01 Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-16 13:35:44 -05:00
Rob Clark	211b0644e6	nir: few missing struct names nir.h is a bit inconsistent about 'typedef struct {} nir_foo' vs 'typedef struct nir_foo {} nir_foo'. But missing struct name tags is inconvenient when you need a fwd declaration without pulling in all of nir. So add missing struct name tag for nir_variable, and a couple other spots where it would likely be useful. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-16 13:35:43 -05:00
Ilia Mirkin	32a9fe013b	nv50/ir: add saturate support on ex2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-16 00:10:56 -05:00
Jeff Muizelaar	e5fefe49f2	gallivm: avoid crashing in mod by 0 with llvmpipe This adds code that is basically the same as the code in umod, udiv and idiv. However, unlike idiv we return -1. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-16 03:36:29 +01:00
Kenneth Graunke	d54a70aa18	glsl: Allow implicit int -> uint conversions for bitwise operators (&, ^, \|). The ARB has decided that implicit conversions should be performed for bitwise operators in future language revisions. Implementations of current language revisions may or may not perform them. This patch makes Mesa apply implicit conversions even on current language versions. Applications appear to expect this behavior, and there's really no downside to doing so. Fixes shader compilation in Shadow of Mordor. Bugzilla: https://www.khronos.org/bugzilla/show_bug.cgi?id=1405 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-01-15 17:53:44 -08:00
Jason Ekstrand	61b0cfd84e	i965/fs: Always set channel 2 of texture headers in some stages In the vertex and fragment stages, the hardware is nice to us and leaves g0.2 zeroed out for us so we can use it for headers. However, in compute, geometry, and tessellation stages, the hardware is not so nice. In particular, for compute shaders on BDW, the hardware places some debug bits in 23:15. As it happens, bit 15 is interpreted by the sampler as the alpha channel mask. This means that if you use a texturing instruction with a header in a compute shader, you may randomly get the alpha channel disabled. Since channel masks affect the return length of the sampler message, this can lead the GPU to expect a different mlen to the one you specified in the shader and this, in turn, hangs your GPU. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-15 16:44:02 -08:00
Jason Ekstrand	9870f798be	i965/fs/generator: Take an actual shader stage rather than a string Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-15 16:44:02 -08:00
Jason Ekstrand	0a6811207f	i965/vec4: Use UW type for multiply into accumulator on GEN8+ BDW adds the following restriction: "When multiplying DW x DW, the dst cannot be accumulator." Cc: "11.1,11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-15 16:44:02 -08:00
Jason Ekstrand	f509a89082	nir/lower_system_values: Lower vertexID to id+base if needed	2016-01-15 16:15:50 -08:00
Jason Ekstrand	6b64dddd71	anv/batch_chain: Remove padding from the BO before emitting BUFFER_END	2016-01-15 15:59:58 -08:00
Jason Ekstrand	67bf74f020	anv/batch_chain: Don't call current_batch_bo() again We call it once at the top of the function and then hold on to the pointer. It shouldn't have changed, so there's no reason to query for it again.	2016-01-15 15:49:32 -08:00
Jason Ekstrand	117cac75d0	nir/spirv: Stop trusting the SPIR-V for the number of texture coordinates	2016-01-15 11:13:51 -08:00
Roland Scheidegger	03f66dfb4b	llvmpipe: ditch additional ref counting for vertex/geometry sampler views The cleaning up was quite a performance hog (making pipe_resource_reference the number two in profilers on the vertex path, and 3rd overall, with its cousin pipe_reference_described not far behind) if there were lots of tiny draw calls (ipers). Now the reason was really that it was blindly calling this for all potential shader views (so 32 each for vs and gs) even though the app never touched a single one which could have been fixed, however I can't come up with a good reason why we refcount these. We've got references, of course, in the sampler views, which should be quite sufficient as we do all vertex and geometry shader execution fully synchronous. (Calling prepare_shader_sampling for all draw calls even if there were no changes looks quite suboptimal too, but generally we don't really expect vs/gs shader sampling to be used much with llvmpipe, and there's even an early exit if there aren't any views to avoid the "null loop" albeit it's now no longer always trying to loop through all 32 slots. Maybe improve another time...). Of course, if we manage to make vertex loads run asynchronously some day, we need references again, but adding that back would be the least of the problems... Also only set LP_NEW_SAMPLER_VIEW for fragment sampler views. Nothing on the vertex side depends on it (I suppose we'd really wanted a separate flag in any case). (Good for a 3% improvement or so in ipers under the right conditions.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-15 20:13:45 +01:00
Roland Scheidegger	2f9a325b6a	llvmpipe: fix "leaking" textures This was not really a leak per se, but we were referencing the textures for longer than intended. If textures were set via llvmpipe_set_sampler_views() (for fs) and then picked up by lp_setup_set_fragment_sampler_views(), they were referenced in the setup state. However, the only way to unreference them was by replacing them with another texture, and not when the texture slot was replaced with a NULL sampler view. (They were then further also referenced by the scene too which might have additional minor side effects as we limit the memory size which is allowed to be referenced by a scene in a rather crude way.) Only setup destruction (at context destruction time) then finally would get rid of the references. Fix this by noting the number of textures the last time, and unreference things if the new view is NULL (avoiding having to unreference things always up to PIPE_MAX_SHADER_SAMPLER_VIEWS which would also have worked). Found by code inspection, no test... v2: rename var Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-15 20:13:45 +01:00
Chad Versace	0e420cb67f	anv: Populate SURFACE_STATE more safely genX_image_view_init allocates up to 3 separate SURFACE_STATE structures, and populates each from a single template. Stop mutating the template between each final SURFACE_STATE.	2016-01-15 11:00:22 -08:00
Chad Versace	eab6212efd	anv/meta: Stop leaking renderpass and framebuffer	2016-01-15 10:14:07 -08:00
Chad Versace	482a1f5eab	anv/meta: Reuse code for vkCmdClear{Color,DepthStencil}Image The two function bodies were very similar. Move common code to anv_cmd_clear_image(). Fixes all 'dEQP-VK.renderpass.formats.*' on Skylake.	2016-01-15 07:46:10 -08:00
Chad Versace	1afe33f8b3	anv/gen8: Fix SF_CLIP_VIEWPORT's Z elements SF_CLIP_VIEWPORT does not clamp Z values. It only scales and shifts them. Clamping to VkViewport::minDepth,maxDepth is instead handled by CC_VIEWPORT. Fixes dEQP-VK.renderpass.simple.depth on Broadwell.	2016-01-14 22:53:05 -08:00
Chad Versace	842b424d3b	anv/meta: Implement vkCmdClearDepthStencilImage	2016-01-14 22:53:05 -08:00
Chad Versace	e4b17a2e1a	anv/meta: Implement vkCmdClearAttachments	2016-01-14 22:53:05 -08:00
Chad Versace	0038ae2e4a	anv/meta: Add VkClearRect param to emit_clear() Prepares for vkCmdClearAttachments.	2016-01-14 22:53:05 -08:00
Chad Versace	11f5433715	anv: Distinguish between subpass setup and subpass start vkCmdBeginRenderPass, vkCmdNextSubpass, and vkBeginCommandBuffer with VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, all set up the command buffer for recording commands for some subpass. But only the first two, vkCmdBeginRenderPass and vkCmdNextSubpass, can start a subpass. Therefore, calling anv_cmd_buffer_begin_subpass() inside vkBeginCommandBuffer is misleading. Clarify its purpose by renaming it to anv_cmd_buffer_set_subpass() and adding comments.	2016-01-14 22:53:05 -08:00
Chad Versace	deb8dd89b5	anv: Emit load clears at start of each subpass This should improve cache residency for render targets. Pre-patch, vkCmdBeginRenderPass emitted all the meta clears for VK_ATTACHMENT_LOAD_OP_CLEAR before any subpass began. Post-patch, vkCmdBeginRenderPass and vkCmdNextSubpass emit only the clears needed for that current subpass.	2016-01-14 22:53:05 -08:00
Chad Versace	0679bef49f	anv/meta: Create 8 pipelines for color clears This prepares for moving the clear ops from the start of the render pass into each subpass. Pipeline N will be used to clear color attachment N of the current subpass. Currently meta color clears still create a throwaway subpass with exactly one attachment, so currently only pipeline 0 is used. This is an ugly hack to workaround the compiler's current inability to dynamically set the render target index in the render target write message.	2016-01-14 22:53:05 -08:00
Chad Versace	2997b0da4a	anv: Allow override of pipeline color attachment count Add anv_graphics_pipeline_create_info::color_attachment_count. If non-negative, then it overrides the color attachment count in the pipeline's subpass. Useful for meta. (All the hacks for meta!)	2016-01-14 22:53:05 -08:00
Chad Versace	13610c03a7	anv/meta: Name the nir shaders The names appear in debug output.	2016-01-14 22:53:05 -08:00
Chad Versace	6a1a760e3c	anv: Move MAX_* defs to top of anv_private.h Because I need to use MAX_RTS in struct anv_meta_state.	2016-01-14 22:53:05 -08:00
Chad Versace	4c2bafb9bf	anv: Define zero() macro zero(x) memsets x to zero. Eliminates bugs due to errors in memset's size param.	2016-01-14 22:53:05 -08:00
Chad Versace	f2700d665c	anv/meta: Rename emit_load_*_clear funcs The functions will soon handle clears unrelated to VK_ATTACHMENT_LOAD_OP_CLEAR, namely vkCmdClearAttachments. So remove "load" from their name: emit_load_color_clear -> emit_color_clear emit_load_depthstencil_clear -> emit_depthstencil_clear	2016-01-14 22:53:05 -08:00
Chad Versace	356f952f87	anv/meta: Use anv_cmd_state::attachments for clears Rewrite anv_cmd_buffer_clear_attachments, which emits the top-of-pass clears, to use the data provided in anv_cmd_state::attachments. This prepares for deferring each attachment clear to the first subpass that uses the attachment.	2016-01-14 22:53:05 -08:00
Chad Versace	a4b045ca44	anv: Add anv_cmd_state::attachments This array contains attachment state when recording a renderpass instance. It's populated on each call to anv_cmd_buffer_set_pass. The data is currently set but unused. We'll use it later to defer each attachment clear to the subpass that first uses the attachment.	2016-01-14 22:53:05 -08:00
Samuel Iglesias Gonsálvez	781d2787bc	glsl: restrict consumer stage condition to modify interpolation type Only modify interpolation type for integer-based varyings or when the consumer is known and different than fragment shader. If we are linking separate shader programs and the consumer is unknown, the consumer could be added later and be a fragment shader. If we modify the interpolation type in this case, we could read wrong values in the fragment shader inputs, as shown in bug 93320. Fixes the following CTS test: ES31-CTS.vertex_attrib_binding.advanced-bindingUpdate Fixes the following dEQP tests: dEQP-GLES31.functional.separate_shader.random.102 dEQP-GLES31.functional.separate_shader.random.111 dEQP-GLES31.functional.separate_shader.random.115 dEQP-GLES31.functional.separate_shader.random.17 dEQP-GLES31.functional.separate_shader.random.22 dEQP-GLES31.functional.separate_shader.random.23 dEQP-GLES31.functional.separate_shader.random.3 dEQP-GLES31.functional.separate_shader.random.32 dEQP-GLES31.functional.separate_shader.random.39 dEQP-GLES31.functional.separate_shader.random.64 dEQP-GLES31.functional.separate_shader.random.73 dEQP-GLES31.functional.separate_shader.random.91 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93320 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-15 07:06:41 +01:00
Jason Ekstrand	5d1c2736b6	i965/fs/generator: Change a comment as per jordan's suggestion	2016-01-14 22:03:15 -08:00
Kenneth Graunke	3657cbf24f	i965: Apply add_const_offset_to_base for vec4 VS inputs too. This shouldn't hurt anything, and I'm about to introduce a pass that will want it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Kenneth Graunke	a3500f943e	i965: Make add_const_offset_to_base() work at the shader level. This makes it a pass, hiding the parameter structs and block callbacks so it's simpler to work with. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Kenneth Graunke	824d82025d	i965: Make an is_scalar boolean in brw_compile_vs(). Shorter than compiler->scalar_stage[MESA_SHADER_VERTEX], which can help with line-wrapping. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Kenneth Graunke	bb6612f06b	nir/builder: Add a nir_build_ivec4() convenience helper. nir_build_ivec4 is more readable and succinct than using nir_build_imm directly, even if you have C99. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
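
The argument-order mismatch described in commit 529aa8249a (llvmpipe: fix arguments order given to vec_andc) can be illustrated with a minimal scalar sketch. This is not the code from u_pwr8.h: the functions below are hypothetical stand-ins that operate on a single 32-bit word instead of 128-bit vectors, and they only model the semantics quoted in the commit message, "r = (~a) & b" for the SSE operation and "r = a & (~b)" for the VMX operation.

```c
#include <stdint.h>

/* Hypothetical scalar model of _mm_andnot_si128 (SSE): r = (~a) & b */
static uint32_t sse_style_andnot(uint32_t a, uint32_t b)
{
    return ~a & b;
}

/* Hypothetical scalar model of vec_andc (VMX): r = a & (~b) */
static uint32_t vmx_style_andc(uint32_t a, uint32_t b)
{
    return a & ~b;
}

/*
 * Wrapper in the spirit of the one the commit describes: it keeps the
 * SSE argument order for callers and swaps the arguments before calling
 * the VMX-style primitive, so the swap lives in exactly one place.
 */
static uint32_t andnot_wrapper(uint32_t a, uint32_t b)
{
    return vmx_style_andc(b, a);
}
```

Under these assumptions, passing the same arguments to both primitives gives different results, while the wrapper agrees with the SSE-style operation for any inputs.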

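Commit 4c2bafb9bf (anv: Define zero() macro) says that zero(x) memsets x to zero and removes bugs caused by a wrong size argument to memset. A plausible sketch of such a macro follows. The exact definition in anv_private.h is not shown in this log, so the macro body, the struct, and the helper function here are assumptions for illustration only.

```c
#include <string.h>

/*
 * Assumed shape of the macro: the size is derived from the object
 * itself, so the caller can no longer pass a mismatched size.
 */
#define zero(x) (memset(&(x), 0, sizeof(x)))

/* Hypothetical struct, used only to exercise the macro. */
struct example_state {
    int count;
    float clear_color[4];
};

/* Returns 1 if every field reads back as zero after zero(). */
static int example_reset(void)
{
    struct example_state s = { 7, { 1.0f, 2.0f, 3.0f, 4.0f } };

    zero(s);

    return s.count == 0 &&
           s.clear_color[0] == 0.0f &&
           s.clear_color[3] == 0.0f;
}
```

Note that a macro of this shape must be given the object itself, not a pointer to it: with a pointer, sizeof would yield the pointer size and only the pointer variable would be cleared.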

82384 commits