fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-24 08:28:16 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	4455bfa9a0	nir/algebraic: Add lowering for ldexp The algorithm used is different from both the naive suggestion from the GLSL spec and the one used in GLSL IR today. Unfortunately, the GLSL IR implementation that we have today doesn't handle denormals (for those that care) or the case where the float source is +-inf. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:19 -07:00
Jason Ekstrand	765dd65349	i965: Implement the new imod and irem opcodes Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:08 -07:00
Jason Ekstrand	745b3d295e	nir: Add more modulus opcodes These are all needed for SPIR-V Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-13 15:44:00 -07:00
Jason Ekstrand	d880c6f9f5	i965/vec4: Inline get_pull_constant_offset It's not really doing enough anymore to justify a helper function. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-13 15:39:20 -07:00
Jason Ekstrand	dd616cab01	nir/lower_io: Allow for a full bitmask of modes Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:44:10 -07:00
Jason Ekstrand	2caaf0ac5e	nir/lower_indirect: nir_variable_mode is now a bitfield Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:44:07 -07:00
Jason Ekstrand	ffa0e12e15	nir: Convert nir_variable_mode to a bitfield There are several passes where we need to specify some set of variable modes that the pass needs to operate on. This lets us easily do that. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 12:40:12 -07:00
George Kyriazis	f69a61b1aa	gallium/swr: Make flat shading tris work. - Incorporate flatshade flag into the shader generation - Use provoking vertex (vc) in shader when flat shading. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-04-13 13:46:37 -05:00
Rob Clark	c53a12fedc	Revert "freedreno/a4xx: better occlusion/sample counting" This reverts commit `62fa868728`. dEQP-GLES3.functional.occlusion_query.* was unhappy about that change. Still not really sure what the other slots in the sample results buffer are. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:40 -04:00
Rob Clark	46e9bbc918	freedreno/a4xx: rasterizer_discard support This one is slightly annoying, since trying to write RBRC from draw would clobber values set in the tiling/gmem code. We could do command- stream patching for RBRC, as is done on a3xx. Although since it seems to be a rarely used feature, it is easier just to do RMW to set/clear the bit. Fixes dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_triangles and related tests. a3xx still needs the same feature, although there it probably makes more sense to take advantage of the existing cmdstream patching which is required for RBRC for other reasons. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:21 -04:00
Rob Clark	216225ce57	freedreno/ir3: fix array textures on a4xx Seems like a4xx needs offset added to array index for all arrays, whereas a3xx only for cubemap arrays. Fixes a whole swath of dEQP fails (roughly sampler2darray). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:14 -04:00
Rob Clark	7e93b26b5d	freedreno: fix stream-out offset handling for lines/tris We need to increment offset by # of vertices, not by # of prims. Fixes a bunch of dEQP fails involving prims other than points. For example, dEQP-GLES3.functional.transform_feedback.position.lines_separate Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:16:02 -04:00
Rob Clark	6ca6e80f61	freedreno: fix handling for stream-out offsets If changed && append, we shouldn't be resetting the internal offset back to zero. This fixes issues w/ sequences like: glBeginTransformFeedback() glDraw() glPauseTransformFeedback() glDraw() glResumeTransformFeedback() glDraw() glEndTransformFeedback() Fixes dEQP-GLES3.functional.transform_feedback.array.separate.points.lowp_vec3 and related tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:54 -04:00
Rob Clark	0a4b0fc315	freedreno: fix prims-emitted query This should only count when TF is not paused. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:47 -04:00
Rob Clark	a7eb12d089	freedreno: fix max-line-width dEQP noticed that we were advertising completely bogus values. The actual maximum is 127.0f. But we have to use an artificially low maximum to work around a bug in the dEQP test, which gets confused when the max line width is too large and lines start going off-screen. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:31 -04:00
Rob Clark	6bf462a1ab	freedreno: add flag to enable dEQP hacks Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:24 -04:00
Rob Clark	f68f6c0246	freedreno/ir3: hack to avoid getting stuck in a loop There are still some edge cases which result in a neighbor-loop. Which needs to be fixed, but this hack at least makes deqp tests finish. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:13 -04:00
Rob Clark	dd70945e09	freedreno/ir3: use (ss) instead of (sy) for ldlv Fixes a bunch of flat-varying fail on a4xx (where we need to use ldlv to read the un-interpolated varying). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:15:05 -04:00
Rob Clark	b35ad6e701	freedreno/ir3: cleanup double cmps.s from frontend Since we cannot mov into a predicate register, the frontend uses a 'cmps.s p0.x, cond, 0' as a stand-in for mov to p0.x. It does this since it has no way to know that the source cond instruction (ie. for a kill, br, etc) will only be used to write the predicate reg. Detect this, and re-write the instruction writing p0.x to skip the original cmps.[sfu]. (It is done like this, rather than re-writing the dest of the first cmps.[sfu] in case the first cmps.[sfu] actually has other users.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-13 14:14:41 -04:00
Matt Turner	9bac27dbf9	glsl: Rename "vertex_input_slots" -> "is_vertex_input" vertex_input_slots would be an appropriate name for an integer, but not a bool. Also remove a cond ? true : false from a count_attribute_slots() call site, noticed during the rename. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-04-13 11:00:21 -07:00
Jose Fonseca	9586468c03	gallivm: Workaround LLVM PR 27332. The credit for finding and isolating this bug goes to Vinson and Roland. The buggy LLVM versions were found by doing opt -instcombine llvm-pr27332.ll > /dev/null where llvm-pr27332.ll is the IR from https://llvm.org/bugs/show_bug.cgi?id=27332#c3 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-13 16:42:55 +01:00
Marek Olšák	dd0a296895	gallium/radeon: move a comment to the correct place trivial	2016-04-13 17:31:03 +02:00
Nicolai Hähnle	9e9a2bb44a	radeonsi: gate PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT by LLVM version Otherwise we incorrectly claim ARB_ssbo support even with older LLVM versions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94917 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-13 10:06:22 -05:00
Marek Olšák	04f15e491f	gallium/radeon: add an env variable to force a level of aniso filtering Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-04-13 12:42:28 +02:00
Jose Fonseca	cc5d8b678e	llvmpipe: Test rounding of x.5. Leverage nearbyintf function, which should be available on all C99 implementations. Trivial.	2016-04-13 11:13:05 +01:00
Roland Scheidegger	cb438d8b3e	gallivm: use llvm.nearbyint instead of llvm.round. We used to use sse roundps intrinsic directly, but switched to use the llvm intrinsics for rounding with `e4f01da15d`. However, llvm semantics follows standard math lib round function which is specced to do roundNearestAwayFromZero but we really want roundNearestEven (moreover, using round generates atrocious code since the cpu can't do it directly and it results in scalar calls to libm __roundf). So, use llvm.nearbyint instead, which does exactly the right thing, and even has the advantage of being available with llvm 3.3 too. (I've verified it actually generates a roundps instruction with llvm 3.3.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-13 11:13:03 +01:00
Pierre Moreau	f525db6358	nv50/ra: `isinf()` is in namespace `std` since C++11. This fixes a compile error while building Nouveau with C++11 enabled (and glibc >= 2.23). This happens if SWR is enabled, as it forces C++11. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Jose Fonseca <jfonseca@vmware.com> https://bugs.freedesktop.org/show_bug.cgi?id=94907	2016-04-13 07:41:13 +01:00
Jose Fonseca	fa46848e51	scons: Allow building with Address Sanitizer. libasan is never linked to shared objects (which doesn't go well with -z,defs). It must either be linked to the main executable, or (more practically for OpenGL drivers) be pre-loaded via LD_PRELOAD. Otherwise works. I didn't find anything with llvmpipe. I suspect the fact that the JIT compiled code isn't instrumented means there are lots of errors it can't catch. But for non-JIT drivers, the Address/Leak Sanitizers seem like a faster alternative to Valgrind. Usage (Ubuntu 15.10): scons asan=1 libgl-xlib export LD_LIBRARY_PATH=$PWD/build/linux-x86_64-debug/gallium/targets/libgl-xlib LD_PRELOAD=libasan.so.2 any-opengl-application Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-04-13 06:54:32 +01:00
Kenneth Graunke	d1c89f6005	mesa: Change an error code in glSamplerParameterI[iu]v(). This is supposed to be INVALID_OPERATION in ES. We already did this for the fv/iv variants, but not Iiv/Iuv, which are new in ES 3.2 (or extensions). Fixes: ES31-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-12 20:30:32 -07:00
Jose Fonseca	46bfcd61f5	softpipe: Free tgsi.image elements on context destruction. Courtesy of address sanitizer. [airlied: free buffers as well] Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 13:21:37 +10:00
Edward O'Callaghan	5a3d928e2c	softpipe: Enable ARB_framebuffer_no_attachments Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 13:21:37 +10:00
Eric Anholt	3b63301d9f	vc4: Work around hardware limits on the number of verts in a single draw. Fixes rendering failures in glmark2's refract and bump:render-mode=high-poly demos, and partially in its terrain demo.	2016-04-12 19:10:51 -07:00
Thomas Hindoe Paaboel Andersen	6d6525a377	softpipe: avoid buffer overflow Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 11:51:35 +10:00
Thomas Hindoe Paaboel Andersen	b89708f95f	tgsi: fix buffer overflow Increase r to four channels as rgba is written to it Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-04-13 11:51:34 +10:00
Tim Rowley	b9294bc345	swr: handle pci cap requests Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-04-12 20:18:00 -05:00
Tim Rowley	b19d214b23	swr: support samplers in vertex shaders Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-04-12 20:18:00 -05:00
Nicolai Hähnle	10cfd7a604	radeonsi: enable GLSL 4.20 and therefore OpenGL 4.2 This is the last necessary bit for OpenGL 4.2 support. All driver-specific functionality has already been implemented as part of extensions. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-04-12 20:13:49 -05:00
Iurie Salomov	047e3264f6	va: check null context in vlVaDestroyContext Signed-off-by: Iurie Salomov <iurcic@gmail.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com>	2016-04-13 00:52:53 +01:00
Jason Ekstrand	8f3b516f2e	nir/clone: Copy bit size when cloning registers Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-04-12 16:41:58 -07:00
Marek Olšák	8e70a58af3	radeonsi: fix a critical SI hang since PIPELINESTAT_START/STOP was added For some reason unknown to me, SI hangs if the event is written after CONTEXT_CONTROL.	2016-04-13 01:05:15 +02:00
Kenneth Graunke	95d622e16d	glsl: Don't copy propagate or tree graft precise values. This is kind of a hack. We currently track precise requirements by decorating ir_variables. Propagating or grafting the RHS of an assignment to a precise value into some other expression tree can lose those decorations. In the long run, it might be better to replace these ir_variable decorations with an "exact" decoration on ir_expression nodes, similar to what NIR does. In the short run, this is probably good enough. It preserves enough information for glsl_to_nir to generate "exact" decorations, and NIR will then handle optimizing these expressions reasonably. Fixes ES31-CTS.gpu_shader5.precise_qualifier. v2: Drop invariant handling, as it shouldn't be necessary (caught by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-12 15:57:48 -07:00
Kristian Høgsberg Kristensen	8ec971a997	i965/tiled_memcpy: Fix rgba8_copy_16_aligned_dst() typo Copy and paste error in commit `eafeb8db66`: i965/tiled_memcpy: Unroll bytes==64 case. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-12 15:32:43 -07:00
Kristian Høgsberg Kristensen	1af0f0151c	glsl/linker: Recurse on struct fields when adding shader variables ARB_program_interface_query requires that we add struct fields recursively down to basic types. Fixes 52 struct test cases in dEQP-GLES31.functional.program_interface_query.* Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen	778fd46aa4	glsl/linker: Pass name and type through to create_shader_variable() No functional change here, but this now lets us recurse through structs in add_shader_variable(). Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen	09f0121593	glsl/linker: Pass absolute location to add_shader_variable() This lets us pass in the absolute location of a variable instead of computing it in add_shader_variable() based on variable location and bias. This is in preparation for recursing into struct variables. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Kristian Høgsberg Kristensen	8ab6aae4dc	glsl/linker: Add add_shader_variable() helper This consolidates the combination of create_shader_variable() and add_program_resource() into a new helper function. No functional difference, but we'll expand add_shader_variable() in the next few commits. Signed-off-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-12 14:38:26 -07:00
Matt Turner	eafeb8db66	i965/tiled_memcpy: Unroll bytes==64 case. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 14:37:05 -07:00
Roland Scheidegger	0e605d9b3a	i965/tiled_memcpy: Provide SSE2 for RGBA8 <-> BGRA8 swizzle. The existing code uses SSSE3, and because it isn't compiled in a separate file compiled with that, it is usually not used (that, of course, could be fixed...), whereas SSE2 is always present with 64-bit builds. This should be pretty much as fast as the pshufb version, albeit those code paths aren't really used on chips without llc in any case. v2: fix andnot argument order, add comments v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments v4: [mattst88] Rebase Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-12 14:37:01 -07:00
Matt Turner	fc88b4babf	i965/tiled_memcpy: Move SSSE3 code back into inline functions. This will make adding SSE2 code a lot cleaner. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 14:36:59 -07:00
Matt Turner	0a5d8d9af4	i965/tiled_memcpy: Optimize RGBA -> BGRA swizzle. Replaces four byte loads and four byte stores with a load, bswap, rotate, store; or a movbe, rotate, store. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-12 14:36:56 -07:00
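
The RGBA -> BGRA swizzle described in `0a5d8d9af4` (load, bswap, rotate, store) can be illustrated with a minimal sketch. This is not the code from the commit: the `sketch_*` names are hypothetical, the byte swap is written out portably rather than using a bswap or movbe instruction, and the pixel is assumed to have been loaded as a little-endian 32-bit word.

```c
#include <stdint.h>

/* Portable stand-in for a bswap instruction: reverse the four bytes. */
static uint32_t sketch_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Rotate a 32-bit value right by 8 bits. */
static uint32_t sketch_ror8(uint32_t v)
{
    return (v >> 8) | (v << 24);
}

/*
 * Assumption: `pixel` is a little-endian load of the bytes R,G,B,A,
 * so its value is 0xAABBGGRR. After the byte swap it is 0xRRGGBBAA,
 * and after the rotate it is 0xAARRGGBB, which a little-endian store
 * writes out as the bytes B,G,R,A.
 */
static uint32_t sketch_rgba_to_bgra(uint32_t pixel)
{
    return sketch_ror8(sketch_bswap32(pixel));
}
```

The point of the approach is that one 32-bit load and one 32-bit store replace four byte loads and four byte stores per pixel, with the reordering done in registers.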


70939 commits