fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-22 06:38:09 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	507626304c	glsl/nir: Call nir_lower_constant_initializers Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:09 -08:00
Jason Ekstrand	c5d664f9dc	anv/pipeline: Call nir_lower_constant_initializers Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:09 -08:00
Jason Ekstrand	f5232db9e5	nir: Add a pass for lowering away constant initializers Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-05 15:40:09 -08:00
Jason Ekstrand	0291bf4db2	Revert "i965: use nir_lower_indirect_derefs() for GLSL" This reverts commit `9404439a75`. I didn't intend to push it and it breaks clip and cull distance.	2016-12-05 15:21:20 -08:00
Jason Ekstrand	5f0e4c7c79	i965: Delete the meta-base CopyImageSubData implementation When I originally implemented the ARB_copy_image extension, the fast-path was written in meta using texture views. This path only worked if both images were uncompressed color images. All of the other cases fell back to the blitter or, in the worst case, mapping and memcpy on the CPU. Now that we have the blorp path, it handles all copies ever and the old meta, blitter, and CPU paths are only used on gen5 and below. The primary reason why we needed the meta path (apart from having a slow blitter on later hardware) was to handle multisampling which gen5 and earlier don't support anyway. Since the blitter is reasonably fast on gen5, we can just delete the meta path and get rid of all that terrible code. If we decide that we're ok with just disabling ARB_copy_image on gen5 and earlier (I personally am), then we could get rid of another 300 lines or so of semi-hairy code. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-12-05 14:00:35 -08:00
Jason Ekstrand	06d864921e	i965/copy_image: Re-implement the blitter path with emit_miptree_blit By using emit_miptree_blit which does chunking, this fixes the blitter path for the case where the image is too tall to blit normally. We also pull it into intel_blit as intel_miptree_copy. This matches the naming of the blorp blit and copy functions brw_blorp_blit and brw_blorp_copy. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "13.0" <mesa-dev@lists.freedesktop.org>	2016-12-05 14:00:35 -08:00
Jason Ekstrand	6c74e7f492	i965/blit: Break the guts of intel_miptree_blit into a helper Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "13.0" <mesa-dev@lists.freedesktop.org>	2016-12-05 14:00:35 -08:00
Timothy Arceri	9404439a75	i965: use nir_lower_indirect_derefs() for GLSL This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so that it is called by both OpenGL and Vulkan, and removes the call to the old GLSL IR pass lower_variable_index_to_cond_assign(). We want to do this pass in nir to be able to move loop unrolling to nir. There is an increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space Program shaders increase by 32 instructions. Shader-db results BDW: total instructions in shared programs: 8705873 -> 8706194 (0.00%) instructions in affected programs: 32515 -> 32836 (0.99%) helped: 3 HURT: 79 total cycles in shared programs: 74618120 -> 74583476 (-0.05%) cycles in affected programs: 528104 -> 493460 (-6.56%) helped: 47 HURT: 37 LOST: 2 GAINED: 0	2016-12-05 14:00:35 -08:00
Tim Rowley	0c70b26a2d	swr: mark PIPE_CAP_NATIVE_FENCE_FD unsupported Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-12-05 13:42:39 -06:00
Tim Rowley	efc3ca64ba	swr: include llvm version and vector width in renderer string Uses llvmpipe's string formatting. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-12-05 13:42:39 -06:00
Tim Rowley	b035d9cab5	gallivm: use getHostCPUFeatures on x86/llvm-4.0+. Use llvm provided API based on cpuid rather than our own manually maintained list of mattr enabling/disabling. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-12-05 13:42:39 -06:00
Juan A. Suarez Romero	48416b6f4d	st/va: declare vlVaBuffer before vlVaContext And declare coded_buf in vlVaContext as "vlVaBuffer *" instead of "struct vlVaBuffer *". This fixes several warnings later about assignment from incompatible pointer type. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 17:03:57 +00:00
Juan A. Suarez Romero	5a585d019e	st/va: remove unused variable pbuff Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2016-12-05 17:03:56 +00:00
Emil Velikov	510722d146	st/va: automake: cleanup C{PP,}FLAGS Remove some transitional left overs from the gallium pipe-loader rework and kill off unneeded AM_CPPFLAGS. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-12-05 17:03:56 +00:00
Tobias Droste	9d14a25bee	configure.ac: Move llvm_set_environment_variables higher. This moves the function to get the LLVM environment variables higher in the file. It still needs to be below the "--enable-opencl" because it uses $enable_opencl. It can be called without condition now as it only throws errors if openCL is enabled. v5: HAVE_MESA_LLVM is only used for gallium. Rename it to HAVE_GALLIUM_LLVM. In order to only link LLVM when it is needed, HAVE_GALLIUM_LLVM is only set if "$enable-gallium-llvm" is yes. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-12-05 14:43:46 +00:00
Boyuan Zhang	3949d7c6ea	st/va: fix gop size for rate control The gop_size in rate control is the budget window for internal rate control calculation, and shouldn't always be equal to the idr period. Define a coefficient to let the budget window contain a number of idr periods for proper rate control calculation. Adjust the number of i/p frames remaining accordingly. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-12-05 09:23:38 -05:00
Boyuan Zhang	8206882392	st/va: force to submit two consecutive single jobs The gop_size in rate control is the budget window for internal rate control calculation, and shouldn't always be equal to the idr period. Define a coefficient to let the budget window contain a number of idr periods for proper rate control calculation. Adjust the number of i/p frames remaining accordingly. v2: fixed regression issues introduced by previous version Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2016-12-05 09:23:38 -05:00
Nayan Deshmukh	7b811c362a	st/vdpau: fix compiler warning in vlVdpVideoMixerRender Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-12-05 11:20:55 +01:00
Topi Pohjolainen	5b27405eff	i965: Release aux buffer when disabling ccs Otherwise subsequent render cycles keep on using compression and/or fast clear. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-12-05 09:20:05 +02:00
Bas Nieuwenhuizen	92d7563fba	ac/nir: Only use the first component for SSBO atomics. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-12-05 01:40:54 +01:00
Dave Airlie	8033f78f94	radv: fix another regression since shadow fixes. This fixes: dEQP-VK.glsl.texture_gather.basic.2d.depth32f.* Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-12-05 10:14:37 +10:00
Iago Toral Quiroga	66e7effc85	spirv: Builtin Layer is an input for fragment shaders This change makes it so we emit a load_input intrinsic when Layer is read in a fragment shader. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-12-03 20:50:57 +01:00
Bruce Cherniak	a7b510f656	swr: Fix active_queries count The active_query count was incorrect for query types that don't require a begin_query. Removed the unnecessary assert. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-12-02 14:36:28 -06:00
George Kyriazis	2085088033	swr: Fix type to match parameters of std::max() Include propagation of comparisons further down. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-12-02 14:36:28 -06:00
Tim Rowley	f1ca377ab1	swr: [rasterizer jitter] include cstdarg in builder_misc.cpp Fixes build problem with llvm-svn. v2: use cstdarg instead of stdarg.h Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-12-02 14:36:28 -06:00
Jason Ekstrand	19a541f496	nir: Get rid of nir_constant_data This has bothered me for about as long as NIR has been around. Why do we have two different unions for constants? No good reason other than one of them is a direct port from GLSL IR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-12-02 10:53:32 -08:00
Timothy Arceri	c45d84ad83	Revert "st/mesa: get Version from gl_program rather than gl_shader_program" This reverts commit `6bf63b0119`. A patch that adds a reference to gl_shader_program_data to gl_program needs to land before this one.	2016-12-02 16:44:44 +11:00
Timothy Arceri	6bf63b0119	st/mesa: get Version from gl_program rather than gl_shader_program Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-02 13:54:54 +11:00
Timothy Arceri	ab8c01386a	st/mesa/glsl: move Version to gl_shader_program_data This is mostly just used during linking however the st uses it when updating textures. In order to store gl_program in the CurrentProgram array rather than gl_shader_program we need to move this field to the shared gl_shader_program_data struct. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-02 13:54:47 +11:00
Rob Clark	534917495d	freedreno: no-op render when we need a fence If app tries to create a fence but there is no rendering to submit, we need a dummy/no-op submit. Use a string-marker for the purpose, mostly since it avoids needing to realize that the packet format changes in later gens (so one less place to fix up for a5xx). Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-01 20:24:59 -05:00
Rob Clark	0b98e84e9b	freedreno: native fence fd support Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-01 20:24:46 -05:00
Rob Clark	16f6ceaca9	freedreno: some fence cleanup Prep-work for next patch, mostly move to tracking last_fence as a pipe_fence_handle (created now only in fd_gmem_render_tiles()), and a bit of superficial renaming. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-12-01 20:16:31 -05:00
Rob Clark	026a7223a6	gallium: support for native fence fd's This enables gallium support for EGL_ANDROID_native_fence_sync, for drivers which support PIPE_CAP_NATIVE_FENCE_FD. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-12-01 20:16:31 -05:00
Rob Clark	72cc1ca58d	gallium: wire up server_wait_sync This will be needed for explicit synchronization with devices outside the gpu, ie. EGL_ANDROID_native_fence_sync. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-01 20:16:31 -05:00
Rob Clark	0201f01dc4	egl: add EGL_ANDROID_native_fence_sync With fixes from Chad squashed in, plus fixes for issues that Rafael found while writing piglit tests. Signed-off-by: Rob Clark <robclark@freedesktop.org> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org>	2016-12-01 10:57:35 -08:00
Rob Clark	2ba4c7e154	egl: un-fallthrough sync attr parsing Doesn't work so well when you start having more than one possible attrib. Prep-work for next patch. Signed-off-by: Rob Clark <robdclark@gmail.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org>	2016-12-01 10:57:24 -08:00
Rob Clark	cce04a4630	egl: initialize SyncCondition after attr parsing Reduce the noise in the next patch. For EGL_SYNC_NATIVE_FENCE_ANDROID the sync condition is conditional on EGL_SYNC_NATIVE_FENCE_FD_ANDROID attribute. Signed-off-by: Rob Clark <robclark@freedesktop.org> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org>	2016-12-01 10:52:55 -08:00
Tim Rowley	05f35a868c	tgsi: store writes_primid when scanning tgsi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-01 11:33:01 -06:00
Ilia Mirkin	7c16552f8d	mesa: only verify that enabled arrays have backing buffers We were previously also verifying that no backing buffers were available when an array wasn't enabled. This has no basis in the spec, and it causes GLupeN64 to fail as a result. Fixes: `c2e146f487` ("mesa: error out in indirect draw when vertex bindings mismatch") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-12-01 06:35:13 -05:00
Eric Anholt	51244859e3	vc4: Avoid false scheduling dependencies for LOAD_IMMs. Noticed in shaders with branching, where we ended up scheduling delay slots near the start of a block for the uniforms reset setup. total instructions in shared programs: 93970 -> 93951 (-0.02%) instructions in affected programs: 3117 -> 3098 (-0.61%) 3DMMES performance +0.423087% +/- 0.133521% (n=9,10)	2016-11-30 19:58:09 -08:00
Eric Anholt	6c34084d8e	vc4: Try to schedule QIR instructions between writing to and reading math. This helps us get the delay slots between SFU writes and reads filled. total instructions in shared programs: 94494 -> 93970 (-0.55%) instructions in affected programs: 59206 -> 58682 (-0.89%) 3DMMES performance +1.89967% +/- 0.157611% (n=10,9)	2016-11-30 19:58:09 -08:00
Eric Anholt	d182740ac8	vc4: Improve interleaving of texture coordinates vs results. The latency_between was trying to handle the delay between the coordinate write ("before") and the corresponding sample read ("after"), but we were handing in the two instructions swapped. This meant that we tried to fit things between a tex_s and its preceding tex_result. This made us only interleave normal texture coordinates by accident, and pessimized UBO reads by pushing the tex_result collection earlier until there was nothing but it (and then its preceding coordinate setup) left. In addition to latency reduction, things end up packing better (probably due to reduced live ranges of the texture results): total instructions in shared programs: 98121 -> 94775 (-3.41%) instructions in affected programs: 91196 -> 87850 (-3.67%) 3DMMES performance +1.15569% +/- 0.124714% (n=8,10)	2016-11-30 19:58:09 -08:00
Eric Anholt	1f9daf7cd1	vc4: Fix stray "." on no-op MUL packs. This happened when the PM bit was set for R4 unpacks, where the MUL pack was NOP.	2016-11-30 19:58:09 -08:00
Eric Anholt	98d7e87488	vc4: Allow merging instructions with SF set where the other writes NOP. I'm not sure how I managed to write the SF merge code (`7d8b79f398`) without allowing merges with NOPs. Everything we try to merge with will have a NOP on one or the other side of the instruction, and that's why that commit showed no benefit. total instructions in shared programs: 99347 -> 95128 (-4.25%) instructions in affected programs: 91906 -> 87687 (-4.59%) 3DMMES performance +2.57105% +/- 0.135276% (n=6,8)	2016-11-30 19:58:09 -08:00
Eric Anholt	8e5ec33f11	vc4: In a loop break/continue, jump if everyone has taken the path. This should be a win for most loops, which tend to have uniform control flow. More importantly, it exposes important information to live variables: that the break/continue here means that our jump target may have access to values that were live on our input. Previously, we were just setting the exec mask and letting control flow fall through, so an intervening def between the break and the end of the loop would appear to live variables as if it screened off the variable, when it didn't actually. Fixes a regression in glsl-vs-loop-redundant-condition.shader_test when a perturbing of register allocation caused a live variable to get stomped. Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2016-11-30 19:58:09 -08:00
Ilia Mirkin	fda1d0187d	anv: expose support for VK_KHR_sampler_mirror_clamp_to_edge This is already supported in genX_state.c, expose the extension string. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-11-30 20:49:04 -05:00
Jason Ekstrand	27433b26b1	anv/cmd_buffer: Actually use the stencil dimension In an attempt to fix 3DSTATE_DEPTH_BUFFER for stencil-only cases, I accidentally kept setting the SurfaceType to 2D in the stencil-only case thanks to a copy+paste error. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-11-30 17:42:42 -08:00
Ilia Mirkin	ef59cb0820	swr: add streamout buffer offset into pBuffer pointer The buffer_size does not take the offset into account. Just add the offset into the pointer which lines up the structures much better. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:36:03 -05:00
Ilia Mirkin	3d837a8871	swr: fix assertion for max number of so targets The number has to be less than or equal to the max, not just less than. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:36:00 -05:00
Ilia Mirkin	02b2efa5eb	swr: properly report max number of SO components The components count the number of individual values, not the number of slots. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-30 20:35:56 -05:00

79886 commits