fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-08 08:08:25 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	237f2f2d8b	nir: Get rid of function overloads When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can have a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> ir3 bits are Reviewed-by: Rob Clark <robclark@gmail.com>	2015-12-28 09:59:53 -08:00
Ilia Mirkin	109c348284	nvc0: don't forget to reset VTX_TMP bufctx slot after blit completion Also release the scratch allocation if any. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-27 21:33:36 -05:00
Ilia Mirkin	28e07fdd4a	nv50,nvc0: add a note when converting vertex elements using CPU Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-27 19:49:44 -05:00
Connor Abbott	41c7912d04	gallium/auxiliary: don't build NIR sources with MSVC2008 flags NIR has never been built with MSVC2008, so we shouldn't add MSVC2008_COMPAT_CFLAGS to anything that uses it. This allows us to get rid of the pragma in tgsi_to_nir.c. Build tested with freedreno. v2: Use MSVC2013_COMPAT_CFLAGS instead. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-12-23 20:46:48 -05:00
Rob Clark	843cec6d3a	freedreno/ir3: spelling.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-23 00:28:24 -05:00
Kenneth Graunke	7d539080c1	nir: Add a writemask to store intrinsics. Tessellation control shaders need to be careful when writing outputs. Because multiple threads can concurrently write the same output variables, we need to only write the exact components we were told. Traditionally, for sub-vector writes, we've read the whole vector, updated the temporary, and written the whole vector back. This breaks down with concurrent access. This patch prepares the way for a solution by adding a writemask field to store_var intrinsics, as well as the other store intrinsics. It then updates all producers to emit a writemask of "all channels enabled". It updates nir_lower_io to copy the writemask to output store intrinsics. Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks by doing a read-modify-write cycle (which is safe, because local variables are specific to a single thread). This should have no functional change, since no one actually emits partial writemasks yet. v2: Make nir_validate momentarily assert that writemasks cover the complete value - we shouldn't have partial writemasks yet (requested by Jason Ekstrand). v3: Fix accidental SSBO change that arose from merge conflicts. v4: Don't try to handle writemasks in ir3_compiler_nir - my code for indirects was likely wrong, and TTN doesn't generate partial writemasks today anyway. Change them to asserts as requested by Rob Clark. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v3]	2015-12-22 15:57:59 -08:00
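The writemask semantics described in 7d539080c1 can be illustrated with a small sketch. This is not NIR code: `store_masked_vec4` is a hypothetical helper, and the bit layout (bit i enables component i) is an assumption made for the example. It only shows why a masked store leaves the unselected components alone, which is what a read-modify-write of a thread-local variable achieves.

```c
/* Hypothetical helper, not part of NIR: copy src into dst, touching only
 * the components whose bit is set in writemask (bit i = component i). */
static void
store_masked_vec4(float dst[4], const float src[4], unsigned writemask)
{
   for (unsigned i = 0; i < 4; i++) {
      if (writemask & (1u << i))
         dst[i] = src[i];   /* enabled channel: overwrite */
      /* disabled channel: the old value in dst[i] is preserved */
   }
}
```

A writemask of 0xf corresponds to the "all channels enabled" case that every producer emits after this commit.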
Ben Skeggs	a8c4747602	nouveau: enable use of new kernel interfaces Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:17 +10:00
Ben Skeggs	5b614b141a	nvc0: remove use of deprecated sw class identifier Also emits a method to properly bind the class to a subchannel, which was missing previously. The kernel currently doesn't care, but this will break if it ever decides to (ie. to support multiple sw classes). Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:13 +10:00
Ben Skeggs	33a3ba8c59	nv50: fix g98+ vdec class allocation The kernel previously exposed incorrect classes for some of the chipsets that this code supports. It no longer does, but the older object ioctls have compatibility to avoid breaking userspace. This needs to be fixed before switching over to the newer interfaces. Rather than hardcoding chipset->class like the rest of the driver does, this makes use of (new) sclass queries to determine what's available. v2. - update to use symbolic class identifier from <nvif/class.h> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:10 +10:00
Ben Skeggs	791a3e1850	nouveau: remove use of deprecated nouveau_device_wrap() Switching to the newer libdrm entry-points tells libdrm that it's OK to make use of newer kernel interfaces. We want to be able to isolate any bugs to either the interfaces changes, or the use of NVIF itself. As such, this commit has a slight hack which forces libdrm to continue using the older kernel interfaces. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:08 +10:00
Ben Skeggs	323d4da372	nouveau: fix screen creation failure paths The winsys layer would attempt to cleanup the nouveau_device if screen init failed, however, in most paths the pipe driver would have already destroyed it, resulting in accesses to freed memory etc. This commit fixes the problem by allowing the winsys to detect whether the pipe driver's destroy function needs to be called or not. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:05 +10:00
Ben Skeggs	6c1bfff66c	nouveau: return nouveau_screen from hw-specific creation functions Kills off a void cast. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:03 +10:00
Ben Skeggs	1a9ec8e062	nouveau: remove use of deprecated nouveau_device::drm_version v2. update for libdrm nouveau_drm::lib_version removal Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:01 +10:00
Ben Skeggs	a458ffacba	nouveau: remove use of deprecated nouveau_device::fd Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:23:59 +10:00
Dave Airlie	d19106649f	r600: fix viewport clipping handling (v2) If oViewport is written, vertex reuse needs to be turned off. If oViewport is constant, vertex reuse is fine, and VPORT_PROVOKE_DISABLE needs to be set. (we don't have enough info to program VPORT_PROVOKE). Fixes: arb_viewport_array-render-viewport-2 and some CTS tests. v2: drop vport provoke write, drop initial state writing this on evergreen, only program it on evergreen. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-22 09:09:56 +10:00
Dave Airlie	73e7c5fd7f	radeonsi: fix viewport clipping handling. (v2) If oViewport is written, vertex reuse needs to be turned off. If oViewport is constant, vertex reuse is fine, and VPORT_PROVOKE_DISABLE needs to be set. (We don't know if oViewport is constant so we skip this.) Fixes: arb_viewport_array-render-viewport-2 and some CTS tests. v2: drop writing to provoke disable, drop write in initial state. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-22 09:09:52 +10:00
Dave Airlie	847f91f4e5	r600: drop VTX_CNT_EN write from initial state we always program this in shader stages atom now. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-22 09:09:48 +10:00
Nicolai Hähnle	ea8c0b16ec	gallium/radeon: fix regression in a number of driver queries This rather silly mistake was introduced by commit `01910676`. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-21 15:47:10 -05:00
Eric Anholt	f1fb85e544	vc4: Do instruction scheduling on the QIR to hide texture fetch latency. This is a rewrite of vc4_opt_qpu_schedule.c to operate on QIR. Texture fetch can probably take as much as the rest of the cycles of the program, so it's important to hide our other cycles during it (which is hard to do after register allocation). Also, we can queue up multiple texture requests before collecting the resulting samples, so that we keep the texture unit busy more of the time. High-settings openarena performance +2.35849% +/- 0.221154% (n=7). Also about 2-3% on the multiarb demo. 8 piglit tests (ext_framebuffer_multisample accuracy depthstencil) go from failing in rendering to failing in register allocation, but hopefully I can fix that up with some better register pressure handling here. total instructions in shared programs: 87723 -> 88448 (0.83%) instructions in affected programs: 78411 -> 79136 (0.92%) total estimated cycles in shared programs: 276583 -> 246306 (-10.95%) estimated cycles in affected programs: 265691 -> 235414 (-11.40%)	2015-12-18 17:12:10 -08:00
Eric Anholt	5278c64de5	vc4: Fix latency handling for QPU texture scheduling. There's only high latency between a complete texture fetch setup and collecting its result, not between each step of setting up the texture fetch request.	2015-12-18 17:09:03 -08:00
Eric Anholt	960f48809f	vc4: Keep sample mask writes from being reordered after TLB writes Fixes a regression I noticed after introducing scheduling on the QIR. Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-18 17:09:03 -08:00
Rob Herring	b201a6ed9f	freedreno/ir3: fix 32-bit builds with pointer-to-int-cast error enabled Android builds with -Werror=pointer-to-int-cast causing an error on 32-bit builds. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-18 14:01:07 -05:00
Matt Turner	c8a74e3a4e	nir: Delete bany, ball, fany, fall. As in the previous patches, these can be implemented as any(v) -> any_nequal(v, false) all(v) -> all_equal(v, true) and their removal simplifies the code in the next patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-18 13:20:13 -05:00
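The rewrite named in c8a74e3a4e, any(v) -> any_nequal(v, false) and all(v) -> all_equal(v, true), can be checked with a small sketch over 4-component boolean vectors. The function names below are hypothetical stand-ins written for this example, not NIR opcodes or Mesa helpers.

```c
#include <stdbool.h>

/* True if any component of a differs from the matching component of b. */
static bool
any_nequal4(const bool a[4], const bool b[4])
{
   for (int i = 0; i < 4; i++)
      if (a[i] != b[i])
         return true;
   return false;
}

/* True if every component of a equals the matching component of b. */
static bool
all_equal4(const bool a[4], const bool b[4])
{
   return !any_nequal4(a, b);
}

/* any(v): some component is not false. */
static bool
any4(const bool v[4])
{
   const bool all_false[4] = { false, false, false, false };
   return any_nequal4(v, all_false);
}

/* all(v): every component equals true. */
static bool
all4(const bool v[4])
{
   const bool all_true[4] = { true, true, true, true };
   return all_equal4(v, all_true);
}
```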
Nicolai Hähnle	0a6a17b9d7	gallium/radeon: only dispose locally created target machine in radeon_llvm_compile Unify the cleanup paths of the function rather than duplicating code. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-18 12:17:40 -05:00
Roland Scheidegger	61e5f8d073	gallium/util: (trivial) include p_shader_tokens.h in u_simple_shaders.h as it uses definition from it (enum tgsi_return_type).	2015-12-18 01:02:16 +01:00
Roland Scheidegger	6743c68a11	draw: fix clip test with NaNs NaNs mean it should be clipped, otherwise the NaNs might get passed to the next stages (if clipping didn't happen for another reason already), which might cause all kind of problems. The llvm path got this right already (possibly by luck), but this isn't used when there's a gs active. Found by code inspection, verified with some hacked piglit test and some more hacked debug output. (Note the clipper can still itself incorrectly generate NaN and INF position values in its output prims (at least after w divide / viewport transform) even if the inputs weren't NaNs, if the position data of the vertices is "sufficiently bad".) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-18 00:57:07 +01:00
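Why NaNs slip through a clip test depends on how the comparison is phrased, since every ordered comparison with a NaN is false. The sketch below is an assumption-laden illustration of that point and not the code changed in 6743c68a11: both function names are hypothetical, and only the +x plane (x <= w) is shown.

```c
#include <math.h>
#include <stdbool.h>

/* Naive form: "outside if x > w". A NaN makes the comparison false,
 * so a NaN vertex is reported as inside and is not clipped. */
static bool
outside_pos_x_naive(float x, float w)
{
   return x > w;
}

/* NaN-aware form: "outside unless x <= w". A NaN makes the inner
 * comparison false, so the vertex is reported as outside and clipped. */
static bool
outside_pos_x_nan_clips(float x, float w)
{
   return !(x <= w);
}
```

For ordinary finite inputs the two forms agree; they differ only when x or w is NaN.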
Roland Scheidegger	44e87b7b7b	draw: fix pstipple and aaline stages wrt sampler_views/samplers Those stages only really work for OGL-style texturing (so number of samplers and views mostly the same, certainly for the max values). These get often set up all at once, thus there might be max number of both even if all of them are just NULL. We must not set the max number of samplers and views to the same value since that will lead to terrible things if a driver supports more views than samplers (and the state tracker set up all the views). (This will not make these stages magically work if a shader uses dx10-style texturing, they might still replace an actually used sview in that case.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-18 00:55:35 +01:00
Jonathan Gray	7f585a6a98	configure.ac: use pkg-config for libelf Use PKG_CHECK_MODULES to get the flags to link libelf v2: keep AC_CHECK_LIB as a fallback for elfutils provided libelf that doesn't install a pkg-config file. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-17 09:37:24 +00:00
Samuel Pitoiset	695ae816da	nv50: free memory allocated by the prog which reads MP perf counters This fixes a memory leak introduced in `6a9c151` ("nv50: add compute-related MP perf counters on G84+") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-16 21:52:43 -05:00
Brian Paul	f992d02ba2	st/osmesa: add OSMesaCreateContextAttribs() function As with the previous commit, except for gallium. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-16 19:39:05 -07:00
Brian Paul	c2c0983215	svga: don't use debug code in update_state() in release builds Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-16 19:38:15 -07:00
Samuel Pitoiset	aeee7f2a4d	nv50,nvc0: free memory allocated by performance metrics The destroy_query() helper was actually never called. This fixes a memory leak while monitoring performance metrics. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-16 23:03:08 +01:00
Samuel Pitoiset	9aca60bfb0	nvc0: free memory allocated by the prog which reads MP perf counters This fixes a long-standing memory leak (even before all my query related changes). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-16 22:00:57 +01:00
Samuel Pitoiset	8022c7480e	nvc0: fix metric-achieved_occupancy calculation on Kepler The maximum number of resident warps per multiprocessor is 64 on Kepler instead of 48 on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-16 22:00:57 +01:00
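The effect of the per-generation warp limit in 8022c7480e can be sketched as below. This assumes the usual definition of achieved occupancy (average active warps per active cycle, divided by the maximum resident warps per multiprocessor); the enum, the function names, and the counter arguments are hypothetical and do not mirror the nvc0 driver's code. Only the 64 (Kepler) and 48 (Fermi) limits come from the commit message.

```c
/* Hypothetical GPU generation tag for this example only. */
enum gpu_family { GPU_FERMI, GPU_KEPLER };

/* Maximum resident warps per multiprocessor, per the commit message. */
static unsigned
max_warps_per_mp(enum gpu_family family)
{
   return family == GPU_KEPLER ? 64 : 48;
}

/* Achieved occupancy in percent, assuming two raw counters:
 * summed active warps and the number of active cycles. */
static double
achieved_occupancy_pct(double active_warps, double active_cycles,
                       enum gpu_family family)
{
   if (active_cycles == 0.0)
      return 0.0;   /* nothing ran; avoid dividing by zero */
   return (active_warps / active_cycles) / max_warps_per_mp(family) * 100.0;
}
```

Using the Fermi limit of 48 on Kepler would overstate the metric by a factor of 64/48.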
Christian König	a87a1420d6	st/va: remove fence handling v3 It's nonsense to drain the pipeline like this. v2: keep the drain for DMA-buf exports. v3: flush before the export and after compositing and add TODO comment. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2015-12-16 21:13:42 +01:00
Julien Isorce	89eb342def	st/va: retrieve size from the temporary img variable "image" is not ready yet since it will be set at the end of the function by: image = img; Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-12-16 14:12:31 +00:00
Roland Scheidegger	8e195a6251	draw: handle edge flags in llvm path We just ignored them altogether. While this feature is rather old-fashioned supporting it is actually rather trivial. This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag and all (7) of point-vertex-id). v2: comment fixes, and make the use of the edgeflag in clipmask consistent with when it's actually there (should be impossible to hit a case where the difference would actually matter but still...) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-16 03:55:25 +01:00
Roland Scheidegger	13c0b1c780	draw: don't set start_instance and instance id for pt emit This just adds confusion, these parameters are used when fetching vertices by translate, but certainly not when emitting hw vertices for drivers, they make no sense there (setting them has no consequences otherwise since there won't be any elements with instance_divisor set). So just set them to 0 (the draw_pipe_vbuf code for emitting vertices when the draw pipeline is run already does exactly that). Also while here do some whitespace cleanup. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-16 03:55:14 +01:00
Samuel Pitoiset	276837cbe4	nvc0: remove old comment related to metric calculations I forgot to remove it when I refactored all performance metrics. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-15 22:49:37 +01:00
Eric Anholt	3858722740	vc4: Add support for dumping executed commands to a file. The VC4_DEBUG=cl,qpu is nice and all, but I want to be able to get more detailed dumps, and to replay the same exact commands in simulation. For that I need a dump with all of the VBOs, shaders, shader recs, etc. This dump can be parsed by vc4-gpu-tools. For now this is only doable from simulator mode, because otherwise we don't have access to the RCL contents generated by the kernel.	2015-12-15 12:05:48 -08:00
Eric Anholt	07570edb98	vc4: Import updated vc4_drm.h with hang state.	2015-12-15 12:02:54 -08:00
Eric Anholt	c5b886b028	vc4: Only update vc4->msaa when the framebuffer changes. Any update here should have been the same as in vc4_set_framebuffer_state(), except for the point where vc4_blit.c temporarily sets different state for its different buffers.	2015-12-15 12:02:53 -08:00
Eric Anholt	f2cf2a63f1	vc4: Don't consider nr_samples==1 surfaces to be MSAA. This is apparently a weirdness of gallium -- nr_samples==1 is occasionally used and means the same thing as nr_samples==0. Fixes a bunch of ARB_framebuffer_srgb blit cases in piglit.	2015-12-15 12:02:53 -08:00
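The rule stated in f2cf2a63f1, that nr_samples==1 means the same as nr_samples==0, reduces to a single comparison. The helper below is a hypothetical illustration, not the vc4 driver's code; only the `nr_samples` field name comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical helper: both 0 and 1 describe a single-sampled surface,
 * so only sample counts above 1 are treated as MSAA. */
static bool
surface_is_msaa(unsigned nr_samples)
{
   return nr_samples > 1;
}
```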
Eric Anholt	da92f16c50	vc4: Fix min() wrapper definition for the simulator's kernel code.	2015-12-15 12:02:53 -08:00
Eric Anholt	02bcb443ee	vc4: Warn instead of abort()ing on exec ioctl failures. It's really harsh to abort() the X Server because of a momentary failure (particularly -ENOMEM). I don't see a way to pass an -ENOMEM up the stack from here, but we can at least log to stderr before proceeding on. Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-15 12:02:44 -08:00
Nicolai Hähnle	c8d9d289ff	radeonsi: fix perfcounter selection for SI_PC_MULTI_BLOCK layouts The incorrectly computed register count caused lockups. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 11:23:40 -05:00
Nicolai Hähnle	149d049676	gallium/radeon: remove unnecessary test in r600_pc_query_add_result This test is a left-over of the initial development. It is unneeded and misleading, so let's get rid of it. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 11:23:40 -05:00
Rob Clark	e677b3047b	freedreno/a4xx: fix fragcoord.z + fragdepth It seems like disabling earlyz on a4xx also, by default, disables fragcoord.z to the FS. For frag shaders that both read fragcoord(.z) and write fragdepth, we need to set some extra bits to prevent a lockup. This lets us get rid of the hack of disabling fragcoord.z (which prevented 0ad from lockups, but resulted in rendering corruption). Also fixes fbo-depth-sample-compare. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-15 09:40:54 -05:00
Rob Clark	cad0920d11	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-15 09:39:10 -05:00
Rob Clark	249b2be3bc	freedreno/ir3/cmdline: don't dump nir by default By default we only want the disasm dumped, which we get anyways. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-15 09:39:10 -05:00


27608 commits