fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-25 08:18:11 +02:00

Author	SHA1	Message	Date
Zack Rusin	63386b2f66	util: treat denorm'ed floats like zero The D3D10 spec is very explicit about treatment of denorm floats and the behavior is exactly the same for them as it would be for -0 or +0. This makes our shading code match that behavior, since OpenGL doesn't care and on a few CPUs it's faster (worst case the same). Float16 conversions will likely break but we'll fix them in a follow-up commit. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-09 23:30:55 -04:00
Marek Olšák	1faa375573	r600g: improve the mechanism for recognizing an empty CS Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	287b2fa115	r600g: explicitly flush caches for streamout-based buffer copying & clearing It's done automatically for vertex buffers, but not for constant buffers, textures, and colorbuffers. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	7948ed1250	r600g: only flush the caches that need to be flushed during CP DMA operations This should increase performance if constant uploads are done with the CP DMA, because only the cache that needs to be flushed is flushed. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	1b40398d02	r600g: split INVAL_READ_CACHES into vertex, tex, and const cache flags also flushing any cache in evergreen_emit_cs_shader seems to be superfluous (we don't flush caches when changing the other shaders either) Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Alex Deucher	098316211c	r600g: adjust flush flags (v3) 1. flush SH with read caches 2. add flag for DB flushes 3. add flag for CB flushes v2: flush all CBs, remove redundant emit_state variable. v3: Marek: also set the new flags in r600_context_flush, the CP dma functions, and texture_barrier, and rename them Signed-off-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	862f69fbe1	r600g: don't call buffer_wait in buffer_mmap_sync_with_rings The winsys should do this, because it measures how much time we spend in buffer_map doing synchronization, which can be viewed with the gallium HUD. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	94d294137e	r600g: don't read back the MSAA depth buffer if the read flag is not set Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	141b892620	r600g: don't flush the context in texture_transfer_map the winsys does this automatically Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	ae87aae0c4	r600g: fix texture offset computation for mapped MSAA depth buffers It was wrong, because the offset shouldn't be applied to MSAA depth buffers. This small cleanup should prevent such issues in the future. This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n". Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	a3263cca59	r600g: fix color resolve for RGBX8 and RGBX16 integer formats Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	b1a061b81e	r600g: enable fast MSAA color clear for array/3D/cube textures Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	87669c3654	r600g: implement fast MSAA color clear for integer textures this also fixes the fast clear with multiple colorbuffers and each having a different format Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Christian König	085c695488	r600/uvd: fix check for UVD 2.x Signed-off-by: Christian König <christian.koenig@amd.com>	2013-07-08 19:51:20 +02:00
Roland Scheidegger	9ef49cfd84	gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch The logic for choosing number of lods was bogus. (The code should ultimately handle the case of only one lod even with multiple quads but currently can't.)	2013-07-05 18:07:51 +02:00
José Fonseca	45f174ce40	gallivm: Remove bogus assert. It is perfectly valid for the swizzle to be bigger than 2. For example the texel offsets could be SAMPLE ..., IMM[0].zzz What is not correct is for chan_index to be bigger than 2. Trivial.	2013-07-05 14:35:54 +01:00
Ben Skeggs	c29c6b2b2e	nvc0: enable very initial support for nvf0 (GK110) Shaders need a lot of work still. Basic stuff generally works, so this is basically just fine for gnome-shell, OA etc at this point. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2013-07-05 14:15:04 +10:00
Roland Scheidegger	4dbca8672b	gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources The assertion was always broken but the code unused until enabling the per-element lod code. Fixes piglit texelFetch vs isampler1D and similar tests (only run with GL 3.0 version override).	2013-07-05 01:19:23 +02:00
Roland Scheidegger	f3bbf65929	gallivm: do per-pixel lod calculations for explicit lod d3d10 requires per-pixel lod calculations for explicit lod, lod bias and explicit derivatives, and we should probably do it for OpenGL too - at least if they are used from vertex or geometry shaders (so doesn't apply to lod bias) this doesn't just affect neighboring pixels. Some code was already there to handle this so fix it up and enable it. There will no doubt be a performance hit unfortunately, we could do better if we knew we had a real vector shift instruction (with variable shift count) but this requires AVX2 on x86 (or an AMD Bulldozer family cpu). Don't do anything for lod bias and explicit derivatives yet, though no special magic should be needed for them either. Likewise, the size query is still broken just the same. v2: Use information if lod is a (broadcast) scalar or not. The idea would be to base this on the actual value, for now just pretend it's a scalar in fs and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same code is generated for fs as before). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-04 19:42:04 +02:00
Zack Rusin	bbd1e60198	draw: fix overflows in the indexed rendering paths The semantics for overflow detection are a bit tricky with indexed rendering. If the base index in the elements array overflows, then the index of the first element should be used; if the index with bias overflows, then it should be treated like a normal overflow. Also, overflows need to be checked for in all paths that use either the bias or the starting index location. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-03 09:06:30 -04:00
Zack Rusin	09820902d7	draw/llvm: index overflows if it's greater than elt max The comparison, incorrectly, was greater-than-or-equal to elt max. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-03 09:06:24 -04:00
Matthew McClure	012ba47076	postprocess: move second temporary assertion into isolated configuration With this patch we will only assert that the second temporary is allocated, when there are more than two active filters. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423 Signed-off-by: Brian Paul <brianp@vmware.com>	2013-07-03 09:19:04 -06:00
Ilia Mirkin	4bc8e3c3e4	targets/xvmc-nouveau: add in missing nv30 lib Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so that it may be dlopen'd. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-07-03 09:02:40 +02:00
Marek Olšák	30c3e8718d	mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies Not needed with do_dead_builtin_varyings. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-02 17:02:14 +02:00
José Fonseca	84f367e69a	gallivm: Simplify intrinsic name construction. Just noticed this could be slightly shortened when fixing MSVC build. Trivial.	2013-07-02 13:12:31 +01:00
José Fonseca	4c859901ce	gallivm: Fix MSVC build.	2013-07-02 06:41:32 +01:00
José Fonseca	e621ec816d	gallivm: Fix indirect immediate registers. If reg->Register.Indirect is true then the immediate is not truly a constant LLVM expression. There is no performance regression in using LLVMBuildBitCast, as it will fallback to LLVMConstBitCast internally when the argument is a constant. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-07-02 06:30:06 +01:00
Zack Rusin	70bc43acdb	gallium/tests: fix the translate test	2013-06-28 09:43:17 -04:00
Zack Rusin	1c2e5c223d	draw/translate: fix instancing We were incorrectly computing the buffer offset when using the instances. The buffer offset is always equal to: start_instance * stride + (instance_num / instance_divisor) * stride We were completely ignoring the start instance quite often, producing instances that were completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be: 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be: 1 * stride, not 0 as we'd have. This fixes it and adjusts all the code to the changes. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 05:21:20 -04:00
Zack Rusin	df4ab7974a	draw: fix incorrect clipper invocation statistics clipper invocations are computed earlier (of course before the emission) so this code was adding bogus numbers to already computed clipper invocations. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:29 -04:00
Zack Rusin	34546d61c1	draw/gallivm: export overflow arithmetic to its own file We'll be reusing this code so let's put it in a common file and use it in the draw module. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:24 -04:00
Zack Rusin	88de009cc1	draw: check for integer overflows in instance computation Integers could easily overflow if the starting instance was large enough. Instead of letting bogus counts through, set the instance to max if it overflowed and let our regular buffer overflow computation handle it. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:20 -04:00
Zack Rusin	2f13f28120	draw: check for an integer overflow when computing stride Our buffer overflow arithmetic was susceptible to integer overflows, which caused the buffer overflow logic to break. Let's use the llvm overflow intrinsics to check for integer overflows while computing the stride/needed buffer size. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:16 -04:00
Zack Rusin	e742f7788e	draw: account for elem size when computing overflow We weren't taking into account the size of element that is to be fetched, which meant that it was possible to overflow the buffer reads if the stride was very close to the end of the buffer, e.g. stride = 3, buffer size = 4, and the element to be read = 4. This should be properly detected as an overflow. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:12 -04:00
José Fonseca	acc6a141b8	tools/trace: Return dummy fence object to silence warnings.	2013-07-01 12:06:58 +01:00
José Fonseca	0fd71ac9eb	tools/trace: Don't crash if a trace has no timing information.	2013-07-01 12:05:57 +01:00
Maarten Lankhorst	bf95ca7de0	nvc0: allow frame dropping in h264 The only reason the checks existed was paranoia; when I first wrote the code I wasn't sure it was correct. Now that I am, the asserts triggered when XBMC was dropping frames, so remove them. NOTE: This is a candidate for the 9.1 branch.	2013-07-01 08:47:49 +02:00
Tom Stellard	24fa43675f	r300g/compiler: Prevent regalloc from swizzling texture operands v2 https://bugs.freedesktop.org/show_bug.cgi?id=63520 NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-30 21:38:57 -07:00
Tom Stellard	e2c3640540	r300g/compiler/tests: Add an assembly parser The assembly parser can be used to load r300 assembly dumps and run them through any of the r300 compiler passes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-30 21:38:57 -07:00
Tom Stellard	ab40d8d56f	r300g: Fix make check Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-30 21:24:55 -07:00
Grigori Goronzy	30004b20c2	r600g: implement fast color clears for MSAA on evergreen+ Allows MSAA colorbuffers, which have a CMASK automatically and don't need any further special handling, to be fast cleared. Instead of clearing the buffer, set the clear color and the CMASK to the cleared state. Fast clear is used only when all bound colorbuffers fulfill certain conditions: a CMASK is required, we have to be able to create a clear color value for the format and the texture mustn't contain multiple images. Technically, it should be possible to support array textures and cubemaps if all images are attached to the framebuffer, but this does not appear to be common. v2: fix fast clear check v3: Marek: - disable fast clear with 128-bit formats, which are unsupported - set tex->dirty_level_mask in r600_clear, so that the driver knows the resource must be decompressed/expanded - return early from r600_clear if there's nothing else to do Signed-off-by: Marek Olšák <maraeo@gmail.com>	2013-07-01 03:02:43 +02:00
Marek Olšák	b1693194ee	r600g/compute: disable unused colorbuffer slots Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-01 03:02:43 +02:00
Marek Olšák	f83e220d36	st/mesa: handle SNORM formats in generic CopyPixels path v2: check desc->is_mixed in util_format_is_snorm	2013-06-30 22:14:37 +02:00
Roland Scheidegger	7d430bfab9	llvmpipe: fix timer query if there's no bins `b04a295a4a` removed seemingly unnecessary code in get_query. Turns out this code could in fact be reached - while timestamps are always binned, if there are no bins (which happens if fb size is 0) then the rasterization query code filling this in is still never executed. So fix this up by filling in some timestamp, but do it at EndQuery time not GetQuery time which should be more appropriate. Makes piglit arb_timer_query-timestamp-get happy again. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-29 16:58:02 +02:00
Tom Stellard	5a925cc550	clover: Don't segfault when compiling a program with no kernel	2013-06-28 15:19:06 -07:00
Alex Deucher	d669992e35	radeonsi: disable 2D tiling on CIK for now Causes GPU hangs. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:17:10 -04:00
Alex Deucher	1357624abc	radeonsi: add llvm processor names for CIK Requires updated llvm. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:17:00 -04:00
Alex Deucher	234d81e6b2	radeonsi: emit PA_SC_RASTER_CONFIG[_1] on cik Use the golden values for each asic. Todo: update Kabini and Kaveri. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:53 -04:00
Alex Deucher	9d8ad222c6	radeonsi: PA_CL_ENHANCE is privileged on CIK Needs to be and is set by the kernel. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:46 -04:00
Alex Deucher	72c10be3a7	radeonsi: update surface sync packet emit for CIK Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:35 -04:00
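
Note on commit 1c2e5c223d (draw/translate: fix instancing): the buffer-offset rule stated in that message can be sketched as below. This is a minimal Python illustration, not Mesa code. Both function names are hypothetical, and legacy_instance_offset is only an assumed reconstruction of the old behaviour, chosen because it reproduces the two wrong values the message quotes. Rejecting a divisor below 1 is a choice made for this sketch.

```python
def instance_buffer_offset(start_instance, instance_num, instance_divisor, stride):
    """Byte offset of the per-instance data, following the formula quoted in
    commit 1c2e5c223d:
        start_instance * stride + (instance_num / instance_divisor) * stride
    Integer division is used, as in the commit's examples."""
    if instance_divisor < 1:
        # Assumption of this sketch: only instanced attributes are handled.
        raise ValueError("instance_divisor must be >= 1")
    return start_instance * stride + (instance_num // instance_divisor) * stride


def legacy_instance_offset(start_instance, instance_num, instance_divisor, stride):
    """Assumed form of the pre-fix computation (hypothetical). It divides the
    start instance by the divisor as well, which matches the wrong results
    described in the commit message."""
    if instance_divisor < 1:
        raise ValueError("instance_divisor must be >= 1")
    return ((start_instance + instance_num) // instance_divisor) * stride


if __name__ == "__main__":
    stride = 16  # arbitrary stride chosen for illustration

    # start instance = 5, divisor = 2, first iteration: expect 5 * stride
    print(instance_buffer_offset(5, 0, 2, stride))  # 80
    print(legacy_instance_offset(5, 0, 2, stride))  # 32, i.e. (5/2) * stride

    # start instance = 1, divisor = 3, first iteration: expect 1 * stride
    print(instance_buffer_offset(1, 0, 3, stride))  # 16
    print(legacy_instance_offset(1, 0, 3, stride))  # 0
```

With the corrected formula the start instance always contributes a whole multiple of the stride, and only the running instance number is divided by the divisor.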


18797 commits