fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 06:48:09 +02:00

Author	SHA1	Message	Date
Timothy Arceri	96527c3cf2	glsl: copy explicit offset to uniform storage Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:44 +11:00
Timothy Arceri	e12a49ac12	glsl: update comment on offset field The old comment was for the location not the offset, we now use the field for block members so mention that also. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:39 +11:00
Timothy Arceri	9f24f42c49	glsl: add offset to glsl interface type In this patch we also copy the offset value from the ast and implement offset linking rules by adding it to the record_compare() function. From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers) of the GLSL 4.50 spec: "Two blocks linked together in the same program with the same block name must have the exact same set of members qualified with offset and their integral-constant-expression values must be the same, or a link-time error results." Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:34 +11:00
Timothy Arceri	8abed7f185	glsl: apply compile-time rules for the offset layout qualifier This implements the rules for the offset qualifier on block members. From Section 4.4.5 (Uniform and Shader Storage Block Layout Qualifiers) of the GLSL 4.50 spec: "The offset qualifier can only be used on block members of blocks declared with std140 or std430 layouts." ... "It is a compile-time error to specify an offset that is smaller than the offset of the previous member in the block or that lies within the previous member of the block." ... "The specified offset must be a multiple of the base alignment of the type of the block member it qualifies, or a compile-time error results." Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:30 +11:00
Timothy Arceri	6f45484ac7	glsl: enable offset layout qualifier for ARB_enhanced_layouts Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-03-05 19:38:26 +11:00
Timothy Arceri	1824ff1c2a	glsl: reject invalid input layout qualifiers Global in validation is already handled, this will do the validation for variables, blocks and block members. This fixes some CTS tests for the new enhanced layouts transform feedback qualifiers. V2: add some more valid input flags Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:07:09 +11:00
Timothy Arceri	bd53cc7b45	glsl: only apply default stream to output blocks This is needed to allow invalid qualifier checks on inputs. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:07:04 +11:00
Timothy Arceri	78d3098c05	glsl: rework parsing of blocks Previously interface blocks were given the global default flags of uniform blocks. This meant we could not check for invalid qualifiers on interface blocks because they always contained invalid flags. This changes parsing so that interface blocks now get an empty set of layouts. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:07:00 +11:00
Timothy Arceri	d244986bf2	glsl: don't apply uniform/buffer layouts to interface blocks In the following patch we will stop setting these layouts by default on interface blocks, so we need to do this to avoid hitting the assert. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-05 19:06:56 +11:00
Nanley Chery	4e75f9b219	anv: Implement VK_REMAINING_{MIP_LEVELS,ARRAY_LAYERS} v2: Subtract the baseMipLevel and baseArrayLayer (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-04 21:25:23 -08:00
Kenneth Graunke	4ba7ad6cc1	i965: Only magnify depth for 3D textures, not array textures. When BaseLevel > 0, we magnify the dimensions to fill out the size of miplevels [0..BaseLevel). In particular, this was magnifying depth, thinking that the depth doubles at each level. This is perfectly reasonable for 3D textures, but dead wrong for array textures. Changing the depth != 1 condition to a target == GL_TEXTURE_3D check should make this only happen in the appropriate cases. Fixes about 32 dEQP tests: - dEQP-GLES31.functional.texture.gather.*.level_{1,2} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-03-04 21:25:08 -08:00
Jason Ekstrand	c1436e80ef	anv/meta_clear: Set the right number of dynamic states	2016-03-04 19:18:20 -08:00
Juan A. Suarez Romero	2f76a9924e	i965/vec4: add opportunistic behaviour to opt_vector_float() opt_vector_float() transforms several scalar MOV operations into a single vectorial MOV. This is done when those MOVs cover all the components of the destination register. So something like: mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf3.0.z:D, 0D is transformed into: mov vgrf3.0:F, [0F, 0F, 0F, 1F] But there are cases where not all the components are written. For example, in: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xy:D, 0D mov vgrf3.0.w:D, 1065353216D mov vgrf4.0.xy:D, 1065353216D mov vgrf4.0.w:D, 0D mov vgrf6.0:UD, u4.xyzw:UD Neither the vgrf3 nor the vgrf4 .z component is written, so the optimization is not applied. But it could be applied anyway with the components covered, using a writemask to select the ones written. So we could transform it into: mov vgrf2.0.x:D, 1073741824D mov vgrf3.0.xyw:F, [0F, 0F, 0F, 1F] mov vgrf4.0.xyw:F, [1F, 1F, 0F, 0F] mov vgrf6.0:UD, u4.xyzw:UD This commit does precisely that: opportunistically apply opt_vector_float() when possible. total instructions in shared programs: 7124660 -> 7114784 (-0.14%) instructions in affected programs: 443078 -> 433202 (-2.23%) helped: 4998 HURT: 0 total cycles in shared programs: 64757760 -> 64728016 (-0.05%) cycles in affected programs: 1401686 -> 1371942 (-2.12%) helped: 3243 HURT: 38 v2: change vectorize_mov() signature (Matt). v3: take into account predicates (Juan). v4 [mattst88]: Update shader-db numbers. Fix some whitespace issues. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2016-03-04 19:16:52 -08:00
Jason Ekstrand	cc57efc67a	anv/pipeline: Fix depthBiasEnable on gen7 The first time I tried to fix this, I set the wrong fields.	2016-03-04 17:56:12 -08:00
Jason Ekstrand	653261285e	anv/cmd_buffer: Reset the state streams when resetting the command buffer	2016-03-04 17:54:29 -08:00
Jason Ekstrand	f700d16a89	anv/cmd_buffer: Include Haswell in set_subpass	2016-03-04 17:54:29 -08:00
George Kyriazis	feb71117ae	st/xlib: Don't destroy screen on XCloseDisplay() screen may still be used by other resources that are not yet freed. To correctly fix this there will be a need to account for resources differently, but this quick fix is not any worse than the original code that leaked screens anyway. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 18:14:46 -07:00
Nanley Chery	a6fb62a864	isl: Fix RenderTargetViewExtent for mipmapped 3D surfaces Match the comment stated above the assignment. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-04 13:20:44 -08:00
Nanley Chery	b80c8ebc45	isl: Get rid of isl_surf_fill_state_info::level0_extent_px This field is no longer needed. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-04 13:20:03 -08:00
Jason Ekstrand	d154a5ebd6	anv/cmd_buffer: Let the pipeline set StencilBufferWriteEnable on gen9	2016-03-04 12:23:01 -08:00
Jason Ekstrand	f374765ce6	anv/cmd_buffer: Mask stencil reference values	2016-03-04 12:22:32 -08:00
Jason Ekstrand	d61dcec64d	anv/clear: Pull the stencil write mask from the pipeline The stencil write mask wasn't getting set at all so we were using whatever write mask happened to be left over by the application.	2016-03-04 12:03:00 -08:00
Jason Ekstrand	ec18fef88d	anv/pipeline: Set StencilBufferWriteEnable from the pipeline The hardware docs say that StencilBufferWriteEnable should only be set if StencilTestEnable is set. It seems reasonable to set them together.	2016-03-04 12:03:00 -08:00
Jason Ekstrand	fcd8e57185	anv/pipeline: More competent gen8 clipping	2016-03-04 12:03:00 -08:00
Jason Ekstrand	a8afd29653	anv/pipeline: Use the right provoking vertex for triangle fans	2016-03-04 12:03:00 -08:00
Jason Ekstrand	fa8539dd6b	anv/pipeline: Respect pRasterizationState->depthBiasEnable	2016-03-04 12:03:00 -08:00
Matt Turner	1f862e923c	i965/fs: Optimize float conversions of byte/word extract. instructions in affected programs: 31535 -> 29966 (-4.98%) helped: 23 cycles in affected programs: 272648 -> 266022 (-2.43%) helped: 14 HURT: 1 The patch decreases the number of instructions in the two Unigine programs by: #1721: 4374 -> 4155 instructions (-5.01%) #1706: 3582 -> 3363 instructions (-6.11%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Matt Turner	905ff86198	nir: Recognize open-coded extract_u16. No shader-db changes, but does recognize some extract_u16 which enables the next patch to optimize some code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Matt Turner	76289fbfa8	nir: Recognize open-coded extract_u8. Two shaders that appear in Unigine benchmarks (Heaven and Valley) unpack three bytes from an integer and convert each into a float: float((val >> 16u) & 0xffu) float((val >> 8u) & 0xffu) float((val >> 0u) & 0xffu) Instead of shifting, masking, and type converting like this: shr(8) g15<1>UD g25<8,8,1>UD 0x00000010UD and(8) g16<1>UD g15<8,8,1>UD 0x000000ffUD mov(8) g17<1>F g16<8,8,1>UD shr(8) g18<1>UD g25<8,8,1>UD 0x00000008UD and(8) g19<1>UD g18<8,8,1>UD 0x000000ffUD mov(8) g20<1>F g19<8,8,1>UD and(8) g21<1>UD g25<8,8,1>UD 0x000000ffUD mov(8) g22<1>F g21<8,8,1>UD i965 can simply extract a byte and convert to float in a single instruction: mov(8) g17<1>F g25.2<32,8,4>UB mov(8) g20<1>F g25.1<32,8,4>UB mov(8) g22<1>F g25.0<32,8,4>UB This patch implements the first step: recognizing byte extraction. A later patch will optimize out the conversion to float. instructions in affected programs: 28568 -> 27450 (-3.91%) helped: 7 cycles in affected programs: 210076 -> 203144 (-3.30%) helped: 7 This patch decreases the number of instructions in the two Unigine programs by: #1721: 4520 -> 4374 instructions (-3.23%) #1706: 3752 -> 3582 instructions (-4.53%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-03-04 11:52:34 -08:00
Kenneth Graunke	9d7faadd8a	anv: Fix backwards shadow comparisons sample_c is backwards from what GL and Vulkan expect. See intel_state.c in i965. v2: Drop unused vk_to_gen_compare_op. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-03-04 11:35:46 -08:00
George Kyriazis	01e92e7010	st/xlib: Hang off screen destructor off main XCloseDisplay() callback. This resolves some order dependencies between the already existing callback and the newly created one. Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 10:57:24 -07:00
George Kyriazis	51e562c3ea	st/xlib: Support unlimited number of display connections There is a limit of 10 display connections, which was a problem for apps/tests that were continuously opening/closing display connections. This fix uses XAddExtension() and XESetCloseDisplay() to keep track of the status of the display connections from the X server, freeing mesa-related data as X displays get destroyed by the X server. Poster child is the VTK "TimingTests" Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 10:57:09 -07:00
Brian Paul	192ee9adb1	svga: add new command-buffer-size HUD query To plot a graph of the command buffer size. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-04 07:57:41 -07:00
Brian Paul	1258f907f4	svga: add new svga_winsys_context::get_command_buffer_size() To ask how large the current command buffer is. Will be used for a new GALLIUM_HUD graph. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-04 07:57:41 -07:00
Brian Paul	6fc8d90fa9	svga: reorder SVGA_QUERY_ switch cases to match declaration order Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-03-04 07:57:41 -07:00
Sinclair Yeh	f1410c5b91	svga: Force an RGBA view creation for an RGBA resource glXCreatePixmap() may specify a GLX_TEXTURE_FORMAT_RGB_EXT format for an RGBA resource, causing us to create an RGBX view for an RGBA resource, a combination vgpu10 does not support. When this is detected, change the request to create an RGBA view instead. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 07:57:41 -07:00
Charmaine Lee	8366701f4c	svga: fix an error in svga_texture_generate_mipmap With this patch, make sure the shader resource view is properly created before referencing it in the generate mipmap command. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-03-04 07:57:41 -07:00
Thomas Hellstrom	395c7b8fa1	winsys/svga: Increase the fence timeout If running with a software renderer backend, the timeout may be insufficient, and we don't want to release busy buffers too early. In practice, SVGA gpu lockups are extremely rare. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-03-04 13:55:23 +01:00
Thomas Hellstrom	24ad7e16cd	winsys/svga: Fix an uninitialized return value Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-03-04 13:54:38 +01:00
Kenneth Graunke	9ec246796f	i965: Set MaxFramebufferWidth/Height to 16384, not viewport. dEQP-GLES31.functional.fbo.no_attachments.maximums.{all,height,size,width} started hitting assertion failures when emitting SURFACE_STATE, after commit `e8fd60e789` where Samuel increased the maximum viewport size to 32768, from 16384. MaxFramebufferWidth/Height were being set to the maximum viewport size, but are actually limited by the SURFACE_STATE width/height field range, which is 16384 on Gen7+ (where ARB_framebuffer_no_attachments is exposed). So, reduce these to 16384 explicitly. Fixes assert fails in the above mentioned dEQP tests. (Those tests still fail, however.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-03-03 21:31:22 -08:00
Francisco Jerez	a6046d217d	glsl: Improve the accuracy of the acos() approximation. The adjusted polynomial coefficients come from the numerical minimization of the L2 norm of the relative error. The old coefficients would give a maximum relative error of about 15000 ULP in the neighborhood around acos(x) = 0, the new ones give a relative error bounded by less than 2000 ULP in the same neighborhood. Fixes four dEQP subtests: dEQP-GLES31.functional.shaders.builtin_functions.precision.acos. highp_compute.{scalar,vec2,vec3,vec4} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	2795fbcae3	glsl: Parameterize asin_expr() on the fit coefficients. This will allow us to share the implementation while using different polynomials for asin() and acos(). Francisco Jerez did this in the SPIR-V front-end; I'm merely porting his idea to the GLSL world. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	aa37cbdff7	mesa: Allow Get() of several forgotten IsEnabled() pnames. From section 6.2 ("State Tables") of the GL 2.1 specification (the text also appears in the GL 3.0 and ES 3.1 specifications): "However, state variables for which IsEnabled is listed as the query command can also be obtained using GetBooleanv, GetIntegerv, GetFloatv, and GetDoublev." GL_DEBUG_OUTPUT, GL_DEBUG_OUTPUT_SYNCHRONOUS, and GL_FRAGMENT_SHADER_ATI were missing from the glGet() functions. All other IsEnabled() pnames look to be present, as far as I can tell. Fixes 8 dEQP-GLES31.functional.debug.state_query subtests: debug_output[_synchronous]_get{boolean,float,integer,integer64}. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	b4b50b074b	mesa: Make glGet queries initialize ctx->Debug when necessary. dEQP-GLES31.functional.debug.state_query.debug_group_stack_depth_* tries to call glGet on GL_DEBUG_GROUP_STACK_DEPTH right away, before doing any other debug setup. This should return 1. However, because ctx->Debug wasn't allocated, we bailed and returned 0. This patch removes the open-coded locking and switches the two glGet functions to use _mesa_lock_debug_state(), which takes care of allocating and initializing that state on the first time. It also conveniently takes care of unlocking on failure for us, so we don't need to handle that in every caller. Fixes dEQP-GLES31.functional.debug.state_query.debug_group_stack_depth_ {getboolean,getfloat,getinteger,getinteger64}. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-03-03 21:31:22 -08:00
Kenneth Graunke	3ed260f54c	hack to make dota 2 menus work	2016-03-03 16:21:09 -08:00
Jason Ekstrand	56ba13c994	isl/surface_state: Set L2 bypass disable for certain BC* formats	2016-03-03 16:16:57 -08:00
Eduardo Lima Mitev	47392011c0	Update docs to advertise new support for ARB_internalformat_query2 Support in Mesa main and i965 has just been added. v2: Include note in 'New Features' of docs/relnotes/11.3.0.html. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-03 22:19:35 +01:00
Kenneth Graunke	623ce595a9	anv: Compile shader stages in pipeline order. Instead of the arbitrary order modules might be specified in. Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:36:19 -08:00
Nanley Chery	8dddc3fb1e	anv/meta: Delete unused functions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:44 -08:00
Nanley Chery	d20f6abc85	anv/meta: Use blitter API for state-handling in Buffer Update/Copy Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-03-03 11:26:42 -08:00

82384 commits