fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-24 08:28:16 +02:00

Author	SHA1	Message	Date
Marek Olšák	4ea0febcb0	radeonsi: move POSITION and FACE fragment shader inputs to system values And FACE becomes integer instead of float. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-13 12:27:28 +01:00
Marek Olšák	caf3c2abea	radeonsi: simplify gl_FragCoord behavior It will become a system value, not an input. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-13 12:27:28 +01:00
Roland Scheidegger	38cdcb000d	llvmpipe: (trivial) use cast wrapper for __m128d to __m128 casts some compiler was unhappy.	2016-01-13 04:48:41 +01:00
Roland Scheidegger	49ec647c3b	llvmpipe: avoid most 64 bit math in rasterization The trick here is to recognize that in the c + n * dcdx calculations, not only can the lower FIXED_ORDER bits not change (as the dcdx values have those all zero) but that this means the sign bit of the calculations cannot be different as well, that is sign(c + n * dcdx) == sign((c >> FIXED_ORDER) + n * (dcdx >> FIXED_ORDER)). That shaves off more than enough bits to never require 64bit masks. A shifted plane c value could still easily exceed 32 bits, however since we throw out planes which are trivial accept even before binning (and similarly don't even get to see tris for which there was a trivial reject plane) this is never a problem. The idea isn't all that revolutionary, in fact something similar was tried ages ago (`9773722c2b`) back when the values were only 32 bit anyway. I believe now it didn't quite work then because the adjustment needed for testing trivial reject / partial masks wasn't handled correctly. This still keeps the separate 32/64 bit paths for now, as the 32 bit one still looks minimally simpler (and also because if we'd pass in dcdx/dcdy/eo unscaled from setup which would be a good reason to ditch the 32 bit path, we'd need to change the special-purpose rasterization functions for small tris). This passes piglit triangle-rasterization (-fbo -auto -max_size -subpixelbits 8) and triangle-rasterization-overdraw (with some hacks to make it work correctly with large sizes) easily (full piglit as well of course, but most tests wouldn't use triangles large enough to be affected, that is tris with a bounding box over 128x128). The profiler says indeed time spent in rast_tri functions is reduced substantially, BUT of course only if the tris are large. I measured a 3% improvement in mesa gloss demo when supersized to twice the screen size... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-13 03:50:57 +01:00
Roland Scheidegger	16530fdc82	llvmpipe: scale up bounding box planes to subpixel precision Otherwise some planes we get in rasterization have subpixel precision, others not. Doesn't matter so far, but will soon. (OpenGL actually supports viewports with subpixel accuracy, so could even do bounding box calcs with that). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-13 03:34:59 +01:00
Roland Scheidegger	0298f5aca7	llvmpipe: add sse code for fixed position calculation This is quite a few less instructions, albeit it still does the 2 64bit muls with scalar c code (they'd need way more shuffles, plus fixup for the signed mul so it totally doesn't seem worth it - x86 can do 32x32->64bit signed scalar muls natively just fine after all (even on 32bit)). (This still doesn't have a very measurable performance impact in reality, although profiler seems to say time spent in setup indeed has gone down by 10% or so overall. Maybe good for a 3% or so improvement in openarena.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-13 03:34:09 +01:00
Roland Scheidegger	9422999e40	draw: fix key comparison with uninitialized value Discovered by accident, valgrind was complaining (could have possibly caused us to create redundant geometry shader variants). v2: convinced by Brian and Jose, just use memset for both gs and vs keys, just as easy and less error prone.	2016-01-13 02:43:04 +01:00
Tom St Denis	56fc2986d5	st/omx: Avoid segfault in deconstructor if constructor fails If the constructor fails before the LIST_INIT calls the pointers will be null and the deconstructor will segfault. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-01-12 19:13:19 +01:00
Christian König	6f898f740c	vl: use preferred format for deinterlacing Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:42 +01:00
Christian König	5fdd4a5aef	vl: improve motion adaptive deinterlacer Handle other formats than YV12 as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:39 +01:00
Christian König	e945235aed	st/va: add BOB deinterlacing v2 Tested with MPV. v2: correctly handle compositor deinterlacing as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:35 +01:00
Christian König	3949cf0e02	st/va: add NV12 -> NV12 post processing v2 Useful for mpv and GStreamer. v2: use common functionality for size adjustment. Signed-off-by: Indrajit-kumar Das <Indrajit-kumar.Das@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:28 +01:00
Christian König	9f644295dc	st/va: use vl_video_buffer_adjust_size Use the new helper function instead of open coding it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:24 +01:00
Christian König	da39637764	st/vdpau: use vl_video_buffer_adjust_size Use the new helper function instead of open coding it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:21 +01:00
Christian König	52ca9a9b8b	vl/buffers: extract vl_video_buffer_adjust_size helper Useful for the state trackers as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:16 +01:00
Christian König	8479782361	st/va: make the implementation thread safe v2 Otherwise we might crash with MPV. v2: minor cleanups suggested on the list. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-01-12 13:26:24 +01:00
Erik Faye-Lund	2a15dc0dd5	gallium/util: removed unused header-file This hasn't been in use since `c476305` ("gallium/util: pregenerate half float tables"), where the last bit of run-time init using this was killed. So let's just get rid of the pointless header. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-12 11:02:08 +11:00
Samuel Pitoiset	e67f5cac79	nvc0: do not force re-binding of compute constbufs on Fermi Re-binding compute constant buffers after launching a grid has no effect because they are not currently validated and because dirty_cp is not updated accordingly. This might also prevent weird future behaviours when UBOs will be bound for compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-12 00:47:20 +01:00
Samuel Pitoiset	3029d60de7	nvc0: remove useless goto in nvc0_launch_grid() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-12 00:19:34 +01:00
Ilia Mirkin	f21df5c513	nv50/ir: the whole point of data array is to hand out regular registers Fixes: `0d3051f75a` (nv50/ir: Fix scratch allocation size and file) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-11 13:01:11 -05:00
Pierre Moreau	0d3051f75a	nv50/ir: Fix scratch allocation size and file Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-09 12:58:21 -05:00
Ilia Mirkin	e3706a7118	nv50,nvc0: use a face sysval to avoid the useless back-and-forth conversion Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-08 17:40:52 -05:00
Ilia Mirkin	dff1caccac	freedreno: add ir3_compiler to gitignore Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-08 15:16:37 -05:00
Ilia Mirkin	90ba06618e	gallium: add a RESQ opcode to query info about a resource Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	ebfb5446c7	gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	266d001261	gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	8cb493acc7	tgsi: update atomic op docs Specify that the operation only applies to the x component, not per-component as previously specified. This is unnecessary for GL and creates additional complications for images which need to support these operations as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	bdef02ff26	tgsi: add a is_store property Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	50b8488926	tgsi: provide a way to encode memory qualifiers for SSBO Each load/store on most hardware can specify what caching to do. Since SSBO allows individual variables to also have separate caching modes, allow loads/stores to have the qualifiers instead of attempting to encode them in declarations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:32 -05:00
Ilia Mirkin	888ddd632d	ureg: add buffer support to ureg Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:32 -05:00
Ilia Mirkin	8cc9a8aa2a	tgsi: add ureg support for image decls Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:32 -05:00
Marek Olšák	1e463d20ba	nine: allow fragment shader POSITION and FACE to be system values Reported-by: Axel Davy <axel.davy@ens.fr>	2016-01-08 20:07:16 +01:00
Marek Olšák	d0cf66d835	vl: allow fragment shader POSITION to be a system value Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:16 +01:00
Marek Olšák	69f43c2cc9	util/pstipple: allow fragment shader POSITION to be a system value Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:16 +01:00
Marek Olšák	c00e534283	tgsi/scan: update for POSITION and FACE system values Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:15 +01:00
Marek Olšák	34738a92de	gallium: add caps for POSITION and FACE system values v2: document the integer behavior Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:15 +01:00
Marek Olšák	c07cf5f5a9	tgsi/ureg: handle redundant declarations in ureg_DECL_system_value Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-08 20:06:22 +01:00
Marek Olšák	c886422656	tgsi/ureg: remove index parameter from ureg_DECL_system_value It can be trivially derived from the number of already declared system values. This allows ureg users not to worry about which index to choose. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-08 20:06:22 +01:00
Edward O'Callaghan	cb513485a0	radeon, si: Use TGSI chan name defines in lp_build_emit_fetch() calls Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-08 12:18:36 -05:00
Edward O'Callaghan	b42254eff3	gallium/aux: Use TGSI chan name defines in place of literals Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-08 12:18:24 -05:00
Ilia Mirkin	67b31b3c59	nvc0: add ARB_indirect_parameters support I chose to make separate macros for this due to the additional complexity and extra scratch usage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	7ca67c752b	nvc0: add support for real ARB_multi_draw_indirect The draw groups are now split up into groups of 32 if there's a non-packed stride, or in groups of 400-500 if the draw data is packed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	d3e43baffe	nvc0: adjust indirect draw macros to handle multiple draws at once These are still invoked one at a time, but the underlying macro can handle multiple draws. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	d67b9ba9a1	gallium: add caps to expose support for multi indirect draws Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	3e11656694	gallium: add sufficient draw interface to allow new indirect features This makes it possible to support indirect multidraws as well as having the number of such draws to come from a separate GPU resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Roland Scheidegger	2923c7a0ed	llvmpipe: do 64bit plane calculations in the sse path The sse path was pretty much disabled for practical purposes because the largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations. This is actually not that difficult, though a problem is that we can't do a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall, the code still looks reasonable, though it's not like changes there in setup really make much of a difference in the end... Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:14 +01:00
Roland Scheidegger	fad283ba9e	llvmpipe: don't store eo as 64bit int eo, just like dcdx and dcdy, cannot overflow 32bit. Store it as unsigned though just in case (it cannot be negative, but in theory twice as big as dcdx or dcdy so this gives it one more bit). This doesn't really change anything, albeit it might help minimally on 32bit archs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:14 +01:00
Roland Scheidegger	b61b9a377e	llvmpipe: use aligned data for the assembly program in setup Back in the day (before `24678700ed`) the values were not actually in a struct but even then I can't see why we didn't simply align the values. Especially since it's trivial to do so. (Not that it actually matters since the code is pretty much unused for now.) Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>	2016-01-08 00:34:13 +01:00
Roland Scheidegger	9db7309595	draw: initialize prim header flags when clipping lines Otherwise, clipped lines would have undefined stippling reset bit if line stippling is enabled. (Untested, and I just assume copying over the bits from the original line is actually the right thing to do.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-08 00:34:13 +01:00
Roland Scheidegger	64da11f052	draw: fix line stippling with unfilled prims The unfilled stage was not filling in the prim header, and the line stage then decided to reset the stipple counter or not based on the uninitialized data. This causes some failures in conform linestipple test (albeit quite randomly happening depending on environment). So fill in the prim header in the unfilled stage - I am not entirely sure if anybody really needs determinant after that stage, but there's at least later stages (wide line for instance) which copy over the determinant as well. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:13 +01:00
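
The sign argument in commit 49ec647c3b can be checked with a small sketch. This is not code from the commit: the value of `FIXED_ORDER` and the helper names `sign_full`, `sign_shifted`, and `check` are assumptions chosen for illustration. It only demonstrates the stated identity, namely that when the low `FIXED_ORDER` bits of dcdx are zero, the sign of c + n * dcdx equals the sign of the same expression computed on values shifted right by `FIXED_ORDER`.

```python
FIXED_ORDER = 8  # assumed number of subpixel bits, for illustration only


def sign_full(c, dcdx, n):
    """Return True if the full-precision edge value c + n * dcdx is negative."""
    return (c + n * dcdx) < 0


def sign_shifted(c, dcdx, n):
    """Same sign test after dropping the low FIXED_ORDER bits of c and dcdx."""
    # Python's >> on negative ints is an arithmetic (floor) shift.
    return ((c >> FIXED_ORDER) + n * (dcdx >> FIXED_ORDER)) < 0


def check(c, dcdx_pixels, n):
    """Compare both sign tests for a dcdx whose low FIXED_ORDER bits are zero."""
    dcdx = dcdx_pixels << FIXED_ORDER
    return sign_full(c, dcdx, n) == sign_shifted(c, dcdx, n)
```

The identity holds because c can be written as q * 2^FIXED_ORDER + r with 0 <= r < 2^FIXED_ORDER, so the remainder r can never flip the sign of a multiple of 2^FIXED_ORDER. The shifted operands need correspondingly fewer bits, which is the saving the commit message describes.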

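Commit 2923c7a0ed mentions that only an unsigned 32x32->64bit multiply is available in the SSE path and that the result needs a fixup. The commit's actual SSE code is not shown in this log, so the following is only a sketch of one standard way to derive a signed product from an unsigned one; the function names `to_signed64` and `signed_mul_via_unsigned` are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed64(x):
    """Reinterpret the low 64 bits of x as a two's-complement signed value."""
    x &= MASK64
    return x - (1 << 64) if x & (1 << 63) else x


def signed_mul_via_unsigned(a, b):
    """Signed 32x32->64 product built from an unsigned 32x32->64 multiply."""
    ua, ub = a & MASK32, b & MASK32  # reinterpret the 32-bit patterns as unsigned
    prod = ua * ub                   # what an unsigned-only multiplier would yield
    # Reading a negative a as unsigned adds 2**32 to it, which adds ub * 2**32
    # to the product; subtract that excess again (and likewise for b).
    if a < 0:
        prod -= ub << 32
    if b < 0:
        prod -= ua << 32
    return to_signed64(prod)
```

For example, `signed_mul_via_unsigned(-3, 5)` yields the same value as the plain signed product `-3 * 5`. In vector code the two conditional subtractions are typically done branch-free with sign masks, which is where the extra instructions come from.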

25769 commits