fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 02:48:06 +02:00

Author	SHA1	Message	Date
Marek Olšák	f2cdb68c8b	radeonsi: use LRP from gallivm Totals: SGPRS: 344552 -> 344368 (-0.05 %) VGPRS: 197132 -> 197552 (0.21 %) Code Size: 7375376 -> 7366304 (-0.12 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1679360 -> 1615872 (-3.78 %) bytes per wave Totals from affected shaders: SGPRS: 47736 -> 47552 (-0.39 %) VGPRS: 27952 -> 28372 (1.50 %) Code Size: 1392724 -> 1383652 (-0.65 %) bytes LDS: 39 -> 39 (0.00 %) blocks Scratch: 513024 -> 449536 (-12.38 %) bytes per wave Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:04 +02:00
Marek Olšák	eb11efc989	radeonsi: don't emit AMDGPU intrinsics for integer abs, min, max No difference according to shader-db. (with the new S_ABS_I32 pattern) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-10-17 21:40:04 +02:00
Marek Olšák	d72a26ec5d	radeonsi: don't emit AMDGPU intrinsics for EX2, ROUND, TRUNC No difference according to shader-db. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-10-17 21:40:04 +02:00
Marek Olšák	6660ca7121	radeonsi: initialize output, temp, and address registers to "undef" This removes "v_mov v0, 0" which typically occurs before exports. Totals: SGPRS: 345216 -> 344552 (-0.19 %) VGPRS: 197684 -> 197132 (-0.28 %) Code Size: 7390408 -> 7375376 (-0.20 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1842176 -> 1679360 (-8.84 %) bytes per wave Totals from affected shaders: SGPRS: 101336 -> 100672 (-0.66 %) VGPRS: 53920 -> 53368 (-1.02 %) Code Size: 2170176 -> 2155144 (-0.69 %) bytes LDS: 2 -> 2 (0.00 %) blocks Scratch: 1015808 -> 852992 (-16.03 %) bytes per wave Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	529c5e7740	gallivm: implement the correct version of LRP The previous version has precision issues. This can be a problem with tessellation. Sadly, I can't find the article where I read it anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert this. v2: added the comment	2015-10-17 21:40:03 +02:00
Marek Olšák	a2197cac7f	gallivm: set correct opcode info from unary/binary/ternary emits and clear the emit_data structure. The new radeonsi min/max opcode implementation requires this. (it looks good according to Roland S.)	2015-10-17 21:40:03 +02:00
Marek Olšák	5bc871a4ca	radeonsi: implement vertex color clamping This is only supported in the compatibility profile (without GS and tess). Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	208d1ed38d	radeonsi: implement fragment color clamping using the shader key for now. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	acc6a07874	radeonsi: clean up other scratch buffer functions Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	9098d7e9bd	radeonsi: clean up copy-pasted scratch buffer updates Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	938a1bee34	radeonsi: unify shader create functions The shader specifies the processor type, so use that instead. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	b0167809f1	radeonsi: unify shader delete functions Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	aa060e276c	radeonsi: fix a GS copy shader leak Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	c4f086f399	radeonsi: remove an unused ctx parameter in si_shader_destroy Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	4f4f477d6d	radeonsi: print export_prim_id from the shader key Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	b11edf8872	radeonsi: disable NaNs for LS and HS They're disabled for all other shaders except compute, but I forgot to do this for tess stages. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	73e3fba335	radeonsi: clean up si_llvm_init_export_args Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	82335978bb	tgsi: move pipe_shader_from_tgsi_processor function to util Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Brian Paul	8c5647db5e	mesa: remove FLUSH_VERTICES() in _mesa_MatrixMode() Changing the matrix mode alone has no effect on rendering and does not need to trigger a flush or state validation. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-10-17 19:36:46 +02:00
Marek Olšák	3c6156a4a7	st/mesa: fix clip state dependencies This allows removing FLUSH_VERTICES in MatrixMode. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-17 19:36:44 +02:00
Marek Olšák	006fcc0da6	gallium/hud: fix possible NULL pointer dereference Trivial.	2015-10-17 19:06:27 +02:00
Brian Paul	3272f632ee	scons: fix MSVC, MinGW build Duplicate the glsl_types_hack.cpp work-around from the libgl-xlib target.	2015-10-17 10:06:49 -06:00
Rob Clark	7e6aafd6ab	build: fix make-check after `a6a6a71` commit `a6a6a71092` Author: Rob Clark <robclark@freedesktop.org> AuthorDate: Sat Oct 10 14:13:50 2015 -0400 glsl: (mostly) remove libglsl_util Was a bit too ambitious on removal of libglsl_util.. it is still needed by some of the tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-17 09:51:29 -04:00
Rob Clark	b7963b6926	build: fix out-of-tree build after `b9b40ef` commit `b9b40ef9b7` Author: Rob Clark <robclark@freedesktop.org> AuthorDate: Sat Oct 10 13:55:07 2015 -0400 nir: remove dependency on glsl broke things for i965 out of tree build. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-17 09:51:29 -04:00
Samuel Pitoiset	c188235d1b	nvc0: add support for performance monitoring metrics on Fermi As explained in the CUDA toolkit documentation, "a metric is a characteristic of an application that is calculated from one or more event values." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-17 10:50:00 +02:00
Rob Clark	a6a6a71092	glsl: (mostly) remove libglsl_util Now that NIR does not depend on glsl, we can (mostly[*]) get rid of the libglsl_util hack. [*] glsl_compiler is the one remaining user of libglsl_util Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:38 -04:00
Rob Clark	b9b40ef9b7	nir: remove dependency on glsl Move glsl_types into NIR, now that the dependency on glsl_symbol_table has been split out. Possibly makes sense to rename things at this point, but if we do that I'd like to keep it split out into a separate patch to make git history easier to follow (IMHO). v2: fix android build v3: I f***ing hate scons.. but at least it builds Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:38 -04:00
Rob Clark	183db3a645	glsl: move half<->float conversion to util Needed in NIR too, so move out of mesa/main/imports.c Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:37 -04:00
Rob Clark	60690cb3b3	glsl: move builtin vector types to glsl_types.cpp First step at untangling NIR's dependency on glsl_types without bringing in the dependency on glsl_symbol_table. The builtin types are now in glsl_types (which will end up in NIR), but adding them to the symbol- table stays in builtin_types.cpp (which will not be part of NIR). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:37 -04:00
Rob Clark	33de998230	glsl: couple shader_enums cleanups Add missing enum to gl_system_value_name() and move VARYING_SLOT_MAX / FRAG_RESULT_MAX / etc into shader_enums.h as suggested by Emil. v2: add STATIC_ASSERT()'s Reported-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:37 -04:00
Timothy Arceri	698cdbf492	glsl: initialise record array count to 1 This was only being done in one of the two process methods. Fixes an issue with samplers using the array size of a previous record. Tested-by: Marek Olšák <marek.olsak@amd.com> Cc: Jason Ekstrand <jason@jlekstrand.net>	2015-10-17 08:50:40 +11:00
Timothy Arceri	3c87377d0b	nir: add atomic lowering support for AoA Cc: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-17 08:43:21 +11:00
Timothy Arceri	2e1798f183	nir: wrapper for glsl_type arrays_of_arrays_size() Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-17 08:43:15 +11:00
Ilia Mirkin	fd5e0581dd	configure: show which gallium drivers/sts are built Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-16 17:18:43 -04:00
Brian Paul	2023906667	tgsi: initialize ctx.file in tgsi_dump_instruction() Fixes segfault because of uninitialized file pointer. Trivial.	2015-10-16 14:32:09 -06:00
Samuel Pitoiset	a3b1757551	nvc0: add a note about MP counters on GF100/GF110 MP counters on GF100/GF110 (compute capability 2.0) are buggy because there is a context-switch problem that we need to fix. Results might be wrong sometimes, be careful! Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	0461260d77	nvc0: add MP counters variants for GF100/GF110 GF100 and GF110 chipsets are compute capability 2.0, while the other Fermi chipsets are compute capability 2.1. That is why some MP counters differ between these chipsets and we need to handle variants. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	ec5001d25b	nvc0: move SW/HW queries info to their respective files This will help for handling HW SM queries variants on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	00d61869a5	nvc0: enable compute support by default on Fermi Compute support was not enabled by default because weird effects on 3D state happened, but I can't reproduce them anymore. This also enables MP performance counters by default on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	8cd4b8478a	nvc0: allow only one active query for the MP counters group Because we can't expose the number of hardware counters needed for each different query, we don't want to allow more than one active query simultaneously to avoid failure when the maximum number of counters is reached. Note that these groups of GPU counters are currently only used by AMD_performance_monitor. Like for Kepler, this limits the maximum number of active queries to 1 on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	cef22f3490	nvc0: read MP counters of all GPCs on Fermi When a card has more than one GPC, the grid used by the compute kernel which reads MP performance counters seems to be too small. The consequence is that the kernel is not launched on all TPCs. Increasing the grid size using the number of GPCs now launches enough blocks and we can read MP performance counters of all TPCs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	1825898e04	nvc0: store the number of GPCs to nvc0_screen NOUVEAU_GETPARAM_GRAPH_UNITS param returns the number of GPCs, the total number of TPCs and the number of ROP units. Note that when the DRM version is too old the default number of GPCs is fixed to 4. This will be used to launch the compute kernel which is used to read MP performance counters over all GPCs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	c4896c99cb	nvc0: fix unaligned mem access when reading MP counters on Fermi Memory access have to be aligned to 128-bits. Note that this doesn't happen when the card only has TPC. This patch fixes the following dmesg fail: gr: GPC0/TPC1/MP trap: global 00000004 [MULTIPLE_WARP_ERRORS] warp 000f [UNALIGNED_MEM_ACCESS] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	7abd707251	nvc0: fix monitoring multiple MP counters queries on Fermi For strange reasons, the signal id depends on the slot selected on Fermi but not on Kepler. Fortunately, the signal ids are just offset by the slot id! Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	4fcb661711	nvc0: fix queries which use multiple MP counters on Fermi Queries which use more than one MP counter were misconfigured, and computing the final result was also wrong because sources need to be configured on different hardware counters instead. According to the blob, computing the result is now as follows: FOR i..n val += ctr[i] * pow(2, i) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	6353f620cd	nvc0: allow to use 8 MP counters on Fermi On Fermi, we have one domain of 8 MP counters while we have two domains of 4 MP counters on Kepler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	cac897197b	nvc0: fix sequence field init for MP counters on Fermi Sequence fields are located at MP[i] + 0x20 in the buffer object. This is used to check if result is available for MP[i]. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	409658c367	nvc0: correctly enable the MP counters' multiplexer on Fermi Writing 0x408000 to 0x419e00 (like on Kepler) has no effect on Fermi because we only have one domain of 8 counters. Instead, we have to write 0x80000000. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	c3570c3fb9	nvc0: rip off the kepler MP-enabling logic from the Fermi codepath Writing 0x1fcb to 0x419eac is definitely not related to MP counters and has no effect on Fermi (although this enables MP counters on Kepler). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	dab7e0ed09	nvc0: split out begin_query() hook used by MP counters The way we configure MP performance counters is going to be pretty different between Fermi and Kepler. Having two separate functions is much better. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
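
Note on 529c5e7740 (gallivm: implement the correct version of LRP): the commit message says the previous version had precision issues but does not spell out the two formulas. The sketch below assumes they are the usual pair for linear interpolation, a single-multiply form and the two-product form that matches the TGSI definition of LRP. The function names are hypothetical and the code is plain Python on doubles, not gallivm code; it only illustrates how the single-multiply form can miss an endpoint.

```python
def lrp_single_multiply(t, a, b):
    # Assumed "previous" form: one multiply, b + t * (a - b).
    # The subtraction (a - b) can round away the smaller operand.
    return b + t * (a - b)


def lrp_two_products(t, a, b):
    # Assumed "correct" form, matching TGSI LRP: t * a + (1 - t) * b.
    # At t == 1 this is exactly a, at t == 0 exactly b.
    return t * a + (1.0 - t) * b


if __name__ == "__main__":
    a, b = 1.0, 1e20
    # At t == 1 the result should be exactly a.
    print(lrp_single_multiply(1.0, a, b))  # endpoint is lost
    print(lrp_two_products(1.0, a, b))     # endpoint is preserved
```

With a = 1.0 and b = 1e20, the difference a - b rounds to -1e20, so the single-multiply form returns 0.0 at t = 1 instead of a. Endpoint mismatches of this kind are one plausible reason the commit mentions tessellation, where values interpolated along a shared edge need to agree.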

73617 commits