fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-24 14:58:10 +02:00

Author	SHA1	Message	Date
Brian Paul	46205ab8cc	tgsi: rename the TGSI fragment kill opcodes TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional kill (if any src component < 0). The latter was unconditional kill. At one time KILP was supposed to work with NV-style condition codes/predicates but we never had that in TGSI. This patch renames both opcodes: TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0) TGSI_OPCODE_KILP -> KILL (unconditional kill) Note: I didn't just transpose the opcode names to help ensure that I didn't miss updating any code anywhere. I believe I've updated all the relevant code and comments but I'm not 100% sure that some drivers had this right in the first place. For example, the radeon driver might have llvm.AMDGPU.kill and llvm.AMDGPU.kilp mixed up. Driver authors should review their code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-12 08:32:51 -06:00
Brian Paul	f501baabdb	tgsi: fix-up KILP comments KILP is really unconditional fragment kill. We've had KIL and KILP transposed forever. I'll fix that next. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-12 08:32:51 -06:00
Brian Paul	e7c3898725	tgsi: exec TGSI_OPCODE_SQRT as a scalar instruction, not vector To align with the docs and the state tracker. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-12 08:32:51 -06:00
Brian Paul	f3fad24b62	tgsi: use X component of the second operand in exec_scalar_binary() The code happened to work in the past since the (scalar) src args effectively always have a swizzle of .xxxx, .yyyy, .zzzz, or .wwww so whether you grab the X or Y component doesn't really matter. Just fixing the code to make it look right. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-12 08:32:51 -06:00
Brian Paul	9fc532a263	os: add os_get_process_name() function v2: explicitly test for BSD/APPLE, #warning for unexpected environments.	2013-07-12 08:32:50 -06:00
Brian Paul	919236f3a2	softpipe: silence some MSVC warnings	2013-07-12 08:19:52 -06:00
Brian Paul	76666b9394	hud: silence some MSVC warnings	2013-07-12 08:19:52 -06:00
Brian Paul	d7a852b3a1	util: add casts to silence MSVC warnings in u_blit.c	2013-07-12 08:19:51 -06:00
Brian Paul	c45d8f2e98	tgsi: s/unsigned/int/ to silence MSVC warning	2013-07-12 08:19:50 -06:00
Christian König	1681bd7f2b	radeon/uvd: fall back to shader based decoding for MPEG2 on UVD 2.x v2 UVD 2.x doesn't support hardware decoding of MPEG2, just use shader based decoding for those chipsets. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66450 v2: fix interlacing as well Signed-off-by: Christian König <christian.koenig@amd.com>	2013-07-12 10:52:27 +02:00
Christoph Bumiller	9974593dfb	r600g: x/y coordinates must be divided by block dim in dma blit Note: this is a candidate for the 9.1 branch. Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-11 19:11:44 -04:00
Chih-Wei Huang	1d9271a95c	r600g/sb: Fix Android build v2 Add the sb CXX files to the Android Makefile and also stop using some c++11 features. v2 (Vadim Girlin): use &bc[0] instead of bc.begin()	2013-07-12 01:11:04 +04:00
Vadim Girlin	758ac6f918	r600g/sb: improve math optimizations v2 This patch adds support for some math optimizations that are generally considered unsafe, that's why they are currently disabled for compute shaders. GL requirements are less strict, so they are enabled for GL shaders by default. In case of any issues with applications that rely on higher precision than guaranteed by GL, 'sbsafemath' option in R600_DEBUG allows disabling them. v2 - always set proper src vector size for transformed instructions - check for clamp modifier in the expr_handler::fold_assoc Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-07-11 23:01:01 +04:00
Jonathan Gray	c451619dde	st/xvmc/tests: avoid non portable error.h functions Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-07-11 09:52:00 +02:00
Chia-I Wu	ad244884fc	winsys/intel: build with VISIBILITY_CFLAGS There is no public symbol in this winsys.	2013-07-11 09:03:59 +08:00
Chia-I Wu	79bc245c01	ilo: reduce PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS to 12 So that there are at most (2^22 * 6) texels, lower than the 2^26 limit.	2013-07-11 08:03:27 +08:00
Chia-I Wu	29af29b8dc	ilo: correctly initialize undefined registers in fs Initialize all 4 channels of undefined registers (that is, TEMPs that are used before being assigned) in FS.	2013-07-11 07:01:51 +08:00
Michel Dänzer	a06ee5a09e	radeonsi: Handle TGSI_OPCODE_DDX/Y using local memory 16 more little piglits. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-10 18:40:32 +02:00
Michel Dänzer	a6b83c0f23	radeonsi: Handle TGSI_OPCODE_TXD One more little piglit. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-10 12:16:38 +02:00
José Fonseca	b042aae70d	util/u_math: Use xmmintrin.h whenever possible. It seems __builtin_ia32_ldmxcsr is only available on gcc and only when -msse is used. xmmintrin.h/pmmintrin.h provide portable intrinsics, but these too are only available with gcc when -msse/-msse3 are set. scons build always sets -msse on x86 builds, but autotools doesn't seem to. We could try to get this working on gcc x86 without -msse by emitting assembly, but I believe that in this day and age we really should be building Mesa with -msse and -msse2.	2013-07-10 07:56:17 +01:00
Chia-I Wu	045bf0db52	ilo: honor surface padding requirements The PRM specifies several padding requirements that we failed to honor.	2013-07-10 12:40:22 +08:00
Zack Rusin	63386b2f66	util: treat denorm'ed floats like zero The D3D10 spec is very explicit about treatment of denorm floats and the behavior is exactly the same for them as it would be for -0 or +0. This makes our shading code match that behavior, since OpenGL doesn't care and on a few cpu's it's faster (worst case the same). Float16 conversions will likely break but we'll fix them in a follow up commit. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-09 23:30:55 -04:00
Marek Olšák	1faa375573	r600g: improve the mechanism for recognizing an empty CS Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	287b2fa115	r600g: explicitly flush caches for streamout-based buffer copying & clearing It's done automatically for vertex buffers, but not for constant buffers, textures, and colorbuffers. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	7948ed1250	r600g: only flush the caches that need to be flushed during CP DMA operations This should increase performance if constant uploads are done with the CP DMA, because only the cache that needs to be flushed is flushed. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	1b40398d02	r600g: split INVAL_READ_CACHES into vertex, tex, and const cache flags also flushing any cache in evergreen_emit_cs_shader seems to be superfluous (we don't flush caches when changing the other shaders either) Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Alex Deucher	098316211c	r600g: adjust flush flags (v3) 1. flush SH with read caches 2. add flag for DB flushes 3. add flag for CB flushes v2: flush all CBs, remove redundant emit_state variable. v3: Marek: also set the new flags in r600_context_flush, the CP dma functions, and texture_barrier, and rename them Signed-off-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	862f69fbe1	r600g: don't call buffer_wait in buffer_mmap_sync_with_rings The winsys should do this, because it measures how much time we spend in buffer_map doing synchronization, which can be viewed with the gallium HUD. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	94d294137e	r600g: don't read back the MSAA depth buffer if the read flag is not set Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	141b892620	r600g: don't flush the context in texture_transfer_map the winsys does this automatically Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	ae87aae0c4	r600g: fix texture offset computation for mapped MSAA depth buffers It was wrong, because the offset shouldn't be applied to MSAA depth buffers. This small cleanup should prevent such issues in the future. This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n". Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	a3263cca59	r600g: fix color resolve for RGBX8 and RGBX16 integer formats Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	b1a061b81e	r600g: enable fast MSAA color clear for array/3D/cube textures Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	87669c3654	r600g: implement fast MSAA color clear for integer textures this also fixes the fast clear with multiple colorbuffers and each having a different format Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Christian König	085c695488	r600/uvd: fix check for UVD 2.x Signed-off-by: Christian König <christian.koenig@amd.com>	2013-07-08 19:51:20 +02:00
Roland Scheidegger	9ef49cfd84	gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch The logic for choosing number of lods was bogus. (The code should ultimately handle the case of only one lod even with multiple quads but currently can't.)	2013-07-05 18:07:51 +02:00
José Fonseca	45f174ce40	gallivm: Remove bogus assert. It is perfectly valid for the swizzle to be bigger than 2. For example the texel offsets could be SAMPLE ..., IMM[0].zzz What is not correct is for chan_index to be bigger than 2. Trivial.	2013-07-05 14:35:54 +01:00
Ben Skeggs	c29c6b2b2e	nvc0: enable very initial support for nvf0 (GK110) Shaders need a lot of work still. Basic stuff generally works, so this is basically just fine for gnome-shell, OA etc at this point. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2013-07-05 14:15:04 +10:00
Roland Scheidegger	4dbca8672b	gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources The assertion was always broken but the code unused until enabling the per-element lod code. Fixes piglit texelFetch vs isampler1D and similar tests (only run with GL 3.0 version override).	2013-07-05 01:19:23 +02:00
Roland Scheidegger	f3bbf65929	gallivm: do per-pixel lod calculations for explicit lod d3d10 requires per-pixel lod calculations for explicit lod, lod bias and explicit derivatives, and we should probably do it for OpenGL too - at least if they are used from vertex or geometry shaders (so doesn't apply to lod bias) this doesn't just affect neighboring pixels. Some code was already there to handle this so fix it up and enable it. There will no doubt be a performance hit unfortunately, we could do better if we knew we had a real vector shift instruction (with variable shift count) but this requires AVX2 on x86 (or an AMD Bulldozer family cpu). Don't do anything for lod bias and explicit derivatives yet, though no special magic should be needed for them either. Likewise, the size query is still broken just the same. v2: Use information if lod is a (broadcast) scalar or not. The idea would be to base this on the actual value, for now just pretend it's a scalar in fs and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same code is generated for fs as before). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-04 19:42:04 +02:00
Zack Rusin	bbd1e60198	draw: fix overflows in the indexed rendering paths The semantics for overflow detection are a bit tricky with indexed rendering. If the base index in the elements array overflows, then the index of the first element should be used, if the index with bias overflows then it should be treated like a normal overflow. Also overflows need to be checked for in all paths that either the bias, or the starting index location. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-03 09:06:30 -04:00
Zack Rusin	09820902d7	draw/llvm: index overflows if it's greater than elt max The comparison, incorrectly, was greater-than-or-equal to elt max. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-03 09:06:24 -04:00
Matthew McClure	012ba47076	postprocess: move second temporary assertion into isolated configuration With this patch we will only assert that the second temporary is allocated, when there are more than two active filters. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423 Signed-off-by: Brian Paul <brianp@vmware.com>	2013-07-03 09:19:04 -06:00
Ilia Mirkin	4bc8e3c3e4	targets/xvmc-nouveau: add in missing nv30 lib Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so that it may be dlopen'd. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-07-03 09:02:40 +02:00
Marek Olšák	30c3e8718d	mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies Not needed with do_dead_builtin_varyings. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-02 17:02:14 +02:00
José Fonseca	84f367e69a	gallivm: Simplify intrinsic name construction. Just noticed this could be slightly shortened when fixing MSVC build. Trivial.	2013-07-02 13:12:31 +01:00
José Fonseca	4c859901ce	gallivm: Fix MSVC build.	2013-07-02 06:41:32 +01:00
José Fonseca	e621ec816d	gallivm: Fix indirect immediate registers. If reg->Register.Indirect is true then the immediate is not truly a constant LLVM expression. There is no performance regression in using LLVMBuildBitCast, as it will fallback to LLVMConstBitCast internally when the argument is a constant. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-07-02 06:30:06 +01:00
Zack Rusin	70bc43acdb	gallium/tests: fix the translate test	2013-06-28 09:43:17 -04:00
Zack Rusin	1c2e5c223d	draw/translate: fix instancing We were incorrectly computing the buffer offset when using the instances. The buffer offset is always equal to: start_instance * stride + (instance_num / instance_divisor) * stride We were completely ignoring the start instance quite often producing instances that were completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be: 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be: 1 * stride, not 0 as we'd have. This fixes it and adjusts all the code to the changes. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 05:21:20 -04:00
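
The buffer-offset formula quoted in the draw/translate instancing commit (1c2e5c223d) can be sketched in C as below. This is only an illustration of the arithmetic stated in the commit message, not the code from the patch: the function name instance_buffer_offset and its parameter list are hypothetical, and the sketch assumes a non-zero instance divisor and plain unsigned integer division.

```c
#include <assert.h>

/*
 * Hypothetical helper (not taken from Mesa): computes the per-instance
 * buffer offset using the formula given in the commit message:
 *
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 *
 * Assumption: instance_divisor is non-zero, i.e. the attribute really is
 * instanced. The division is integer division, so the offset only
 * advances once every instance_divisor instances.
 */
static unsigned
instance_buffer_offset(unsigned start_instance, unsigned instance_num,
                       unsigned instance_divisor, unsigned stride)
{
   assert(instance_divisor != 0);

   /* start_instance is added as-is; it is NOT divided by the divisor. */
   return start_instance * stride +
          (instance_num / instance_divisor) * stride;
}
```

With a stride of 16 bytes, the two cases from the commit message work out as follows: start instance 5 with divisor 2 gives 5 * 16 = 80 on the first iteration (instance_num = 0), and start instance 1 with divisor 3 gives 1 * 16 = 16, rather than (5/2) * 16 = 32 and 0 as the old computation would have produced.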

18818 commits