fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-02 00:28:20 +02:00

Author	SHA1	Message	Date
Eric Anholt	24c5ab7bbb	vc4: Drop dependency on r3 for color packing. We can avoid it by carefully ordering the packing. This is important as a step in giving r3 to the register allocator. total instructions in shared programs: 56087 -> 55957 (-0.23%) instructions in affected programs: 18368 -> 18238 (-0.71%)	2014-12-08 16:08:13 -08:00
Eric Anholt	dfbf58c439	vc4: Add support for GL 1.0 logic ops.	2014-12-08 16:08:13 -08:00
Eric Anholt	5045d8ca42	vc4: Add support for TGSI_OPCODE_UCMP. This is being emitted now from st_glsl_to_tgsi.cpp.	2014-12-08 16:08:13 -08:00
Tom Stellard	c16436149c	radeonsi/compute: Clamp COMPUTE_TMPRING_SIZE.WAVES to: num_cu * 32 This is the maximum value allowed for this field.	2014-12-08 17:20:50 -05:00
Tom Stellard	0e1c085f17	winsys/radeon: Always report at least 1 compute unit All uses of this require that the value be at least one, so it's easier to report at least one than having to wrap all uses in MAX2(max_compute_units, 1). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-12-08 17:20:50 -05:00
Tom Stellard	67dcbcd92c	radeonsi: Program RASTER_CONFIG for harvested GPUs v5 Harvested GPUs have some of their render backends disabled, so in order to prevent the hardware from trying to render things with these disabled backends we need to correctly program the PA_SC_RASTER_CONFIG register. v2: - Write RASTER_CONFIG for all SEs. v3: - Set GRBM_GFX_INDEX.INSTANCE_BROADCAST_WRITES bit. - Set GRBM_GFX_INDEX.SH_BROADCAST_WRITES bit when done setting PA_SC_RASTER_CONFIG. - Get num_se and num_sh_per_se from kernel. v4: - Get correct value for num_se - Remove loop for setting PA_SC_RASTER_CONFIG - Only compute raster config when a backend has been disabled. v5: Michel Dänzer - Fix computation for chips with multiple SEs https://bugs.freedesktop.org/show_bug.cgi?id=60879 CC: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2014-12-08 17:20:50 -05:00
Ilia Mirkin	043b79461f	freedreno/a2xx: silence warning about missing DEPTH32X Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:53 -05:00
Ilia Mirkin	c416f49ebe	freedreno/a3xx: handle index_bias (i.e. base_vertex) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:50 -05:00
Ilia Mirkin	b38b40d7bb	freedreno/a3xx: add bgr565 texturing and rendering Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:47 -05:00
Ilia Mirkin	e02ed16cb5	freedreno/a3xx: add support for SRGB render targets Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:43 -05:00
Ilia Mirkin	39a7c049d3	freedreno/a3xx: output RGBA16_FLOAT from fs for certain outputs Fixes R11G11B10F rendering, and is required for SRGB format support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:40 -05:00
Ilia Mirkin	3674c76edf	freedreno/a3xx: re-enable rgb10_a2 render targets There were previously regressions regarding border colors, which the updated swizzle logic resolves. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:37 -05:00
Ilia Mirkin	fc94b2c2a0	freedreno/a3xx: fix border color swizzle to match texture format desc This is a hack since it uses the texture information together with the sampler, but I don't see a better way to do it. In OpenGL, there is a 1:1 correspondence. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:33 -05:00
Ilia Mirkin	97fef2db5c	freedreno/a3xx: fix alpha-blending on RGBX formats Expert debugging assistance provided by Chris Forbes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:20 -05:00
Roland Scheidegger	6f2cf5f3d0	llvmpipe: decrease MAX_SCENES from 2 to 1 Multiple scenes per context are meant to be used so a new scene can be built while another one is processed in rasterization. However, quite surprisingly, this does not actually work (and according to git log, possibly never did, though maybe it did at some point further back (5 years+) but was buggy) because we always wait immediately on the rasterizer to finish the scene when the context (and hence setup/scene) is flushed. This means when we try to get an empty scene later, any old one is already empty again. Thus using multiple scenes is just a waste of memory (not too bad, since the additional scenes are guaranteed to be empty, which means their size ought to be one data block (64kB) plus the size of some structs), without actually really doing anything. (There is also quite some code for the whole concept of multiple scenes which doesn't really do much in practice, but keep it hoping the wait-on-scene-flush can be fixed some day.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-06 18:03:18 +01:00
Eric Anholt	befdff8142	vc4: Try swapping the regfile A to B to pair instructions. total instructions in shared programs: 56995 -> 56087 (-1.59%) instructions in affected programs: 40503 -> 39595 (-2.24%)	2014-12-05 16:27:58 -08:00
Eric Anholt	7d8b79f398	vc4: Allow pairing of some instructions that disagree about the WS bit. No difference on shader-db because we tend to have a lot of other conflicts going on as well (like RADDR_A disagreements)	2014-12-05 16:27:06 -08:00
Eric Anholt	6f32deb538	vc4: Add separate write-after-read dependency tracking for pairing. If an operation is the last one to read a register, the instruction containing it can also include the op that has the next write to that register. total instructions in shared programs: 57486 -> 56995 (-0.85%) instructions in affected programs: 43004 -> 42513 (-1.14%)	2014-12-05 10:53:53 -08:00
Eric Anholt	042962df2d	vc4: Fix inverted priority of instructions for QPU scheduling. We were scheduling TLB operations as early as possible, and texture setup as late as possible. When I introduced prioritization, I visually inspected that an independent operation got moved above texture results collection, which tricked me into thinking it was working (but it was just because texture setup was being pushed late). total instructions in shared programs: 57651 -> 57486 (-0.29%) instructions in affected programs: 18532 -> 18367 (-0.89%)	2014-12-05 10:43:14 -08:00
Eric Anholt	bd4057a5d7	vc4: Refuse to merge two ops that both access shared functions. Avoids assertion failures in vc4_qpu_validate.c if we happen to find the right set of operations available.	2014-12-05 10:43:14 -08:00
Eric Anholt	dadc32ac80	vc4: Allow dead code elimination of color reads. This might happen if the blending functions are set up to not actually use the destination color/alpha, for example.	2014-12-05 10:43:14 -08:00
Eric Anholt	34cf86bdc4	vc4: Add a debug flag for waiting for sync on submit. This is nice when you're tracking down which command list is hanging the GPU.	2014-12-05 10:43:14 -08:00
Rob Clark	4265148ac6	freedreno/a4xx: unify vertex/texture formats into a single table Similar to the scheme that Ilia put in place for a3xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-04 16:01:37 -05:00
Rob Clark	e9589a8fcf	freedreno/a4xx: fd4_util -> fd4_format Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-04 16:01:37 -05:00
Rob Clark	8bf69a29bb	freedreno: update generated headers / a4xx fmt rename Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-04 16:01:37 -05:00
Rob Clark	c74f2db0a5	freedreno/a4xx: frag-depth fixes Also seems to fix kill/discard. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-03 16:38:26 -05:00
Ilia Mirkin	79f9a106b9	freedreno/a3xx: implement anisotropic filtering Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-12-03 09:23:46 -05:00
Rob Clark	b491d1ca6e	freedreno/a4xx: rect textures Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-03 09:22:05 -05:00
Rob Clark	fbba633f2f	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-03 09:22:05 -05:00
Rob Clark	4cfe905a9b	freedreno: fix signed vs unsigned lols Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-03 09:22:05 -05:00
Jan Vesely	02cc9e9f9e	r600, llvm: Don't leak global symbol offsets Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-12-02 22:32:05 -05:00
Jan Vesely	ca0616f17e	r600, llvm: Fix mem leak Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-12-02 11:30:13 -05:00
Eric Anholt	29c7cf2b2b	vc4: Pair up QPU instructions when scheduling. We've got two mostly-independent operations in each QPU instruction, so try to pack two operations together. This is fairly naive (doesn't track read and write separately in instructions, doesn't convert ADD-based MOVs into MUL-based movs, doesn't reorder across uniform loads), but does show a decent improvement on shader-db-2. total instructions in shared programs: 59583 -> 57651 (-3.24%) instructions in affected programs: 47361 -> 45429 (-4.08%)	2014-12-01 22:29:42 -08:00
Dave Airlie	7b0067d23a	r600g/sb: fix issues caused by GLSL switching to loops for switch Since `73dd50acf6` (glsl: implement switch flow control using a loop), the SB backend was falling over in an assert or crashing. Tracked this down to the loops having no repeats, but requiring a working break. The initial code just called the loop handler for all non-if statements, but this caused a regression in tests/shaders/dead-code-break-interaction.shader_test. So I had to add further code to detect if all the departure nodes are empty and avoid generating an empty loop for that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86089 Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-12-02 13:57:27 +10:00
Rob Clark	036f434ac2	freedreno/a4xx: alpha blend fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-01 20:31:23 -05:00
Rob Clark	a7d91c33c2	freedreno/a4xx: fix DRAW initiator encoding of index size Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-01 20:31:23 -05:00
Rob Clark	81194ac767	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-01 20:31:23 -05:00
Brian Paul	f54162857c	svga: fix comment typo	2014-12-01 16:30:12 -07:00
Eric Anholt	3fe4d8e1e3	vc4: Introduce scheduling of QPU instructions. This doesn't reschedule much currently, just tries to fit things into the regfile A/B write-versus-read slots (the cause of the improvements in shader-db), and hide texture fetch latency by scheduling setup early and results collection late (haven't performance tested it). This infrastructure will be important for doing instruction pairing, though. shader-db2 results: total instructions in shared programs: 61874 -> 59583 (-3.70%) instructions in affected programs: 50677 -> 48386 (-4.52%)	2014-12-01 11:00:23 -08:00
Eric Anholt	6958c404ca	vc4: Drop the explicit scoreboard wait. This is actually implicitly handled by the TLB operations.	2014-12-01 11:00:23 -08:00
Eric Anholt	334036fb64	vc4: Also deal with VPM reads at thread end. Prevents a regression with QPU scheduling, which happens to put the no-op reads for unused VPM contents at the end of the program.	2014-12-01 11:00:23 -08:00
Eric Anholt	a7b1a93137	vc4: Fix assertion about SFU versus texturing. We're supposed to be checking that nothing else writes r4, which is done by the TMU result collection signal, not the coordinate setup. Avoids a regression when QPU instruction scheduling is introduced.	2014-12-01 11:00:23 -08:00
Eric Anholt	2d5784c825	vc4: Add another check for invalid TLB scoreboard handling. This was caught by an assertion in the simulator.	2014-12-01 11:00:23 -08:00
Rob Clark	bb19f2c3c4	freedreno/a4xx: invalidate cache when vbo's change Otherwise vertex shader can see stale cache data. This in particular happens when the same vbo is updated and reused. Not sure yet if vbo's at differing addresses but bound to same vertex buffer slot could have issues, but seems safest to flush whenever new vertex buffers are bound. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-01 12:02:25 -05:00
Ilia Mirkin	4907c31385	freedreno/a3xx: add missing integer formats and enable rendering The mesa state tracker doesn't fall back on similar integer formats, so they must all be provided. Remove the restriction against integer color rendering. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	82104c19f3	freedreno/a3xx: enable sampling from integer textures We need to produce a u32 destination type on integer sampling instructions, so keep that in a shader key set based on the currently-bound textures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	8e336ef55b	freedreno: allow each generation to hook into sampler view setting Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	618ff11457	freedreno/a3xx: don't use half precision shaders for int/float32 Integer outputs end up getting mangled due to cov.f32f16, and float32 loses precision. Use full precision shaders in both of those cases. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	f866446e8c	freedreno/a3xx: disable blending for integer formats Also add support for the BLENDABLE bind flag, similarly predicated on non-int formats. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	8e147e9ec8	freedreno/a3xx: remove blend clamp enables from gmem/clears Just pass the data through unmolested. This probably has no effect since blending isn't actually enabled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:00:41 -05:00

13003 commits