fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-29 16:18:20 +02:00

Author	SHA1	Message	Date
Alex Deucher	00f554abba	radeonsi: enable optimal raster config setting for fiji (v2) Requires proper kernel tiling configuration so check the tiling config registers. v2: send the right version of the patch Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-16 10:09:47 -05:00
Alex Deucher	5b37d8b50c	radeonsi: use proper GRBM_GFX_INDEX offset for CI+ The offset is different on CI and newer. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-16 10:09:34 -05:00
Emil Velikov	1780a562bc	nv50: add missing header into the sources list Otherwise it won't end up in the tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-16 10:49:14 +00:00
Ilia Mirkin	ff17b3ccf4	nv50,nvc0: disable render condition around clear_* functions Only the regular "clear" call is supposed to respect the render condition. The rest should ignore it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 20:15:22 -05:00
Samuel Pitoiset	848fa3101d	nv50: add support for performance metrics on G84+ Currently only one metric is exposed but more will be added later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:42:46 +01:00
Samuel Pitoiset	6a9c151dbb	nv50: add compute-related MP perf counters on G84+ These compute-related MP performance counters have been reverse engineered using CUPTI which is part of NVIDIA CUDA. As for nvc0, we use a compute kernel to read out those performance counters, and the command stream to configure them. Note that Tesla only exposes 4 MP performance counters, while Fermi has 8. Only G84+ is supported because G80 is an old and weird card. Tested on G84, G96, G200, MCP79 and GT218 with glxgears, glxspheres64, xonotic-glx, heaven and valley. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:42:42 +01:00
Samuel Pitoiset	ff72440b40	nv50: implement a basic compute support This adds the ability to launch simple compute kernels like the one I will use to read out MP performance counters in the upcoming patch. This compute support is based on the work of Francisco Jerez (aka curro) that he did as part of his EVoC project in 2011/2012 to get OpenCL working on Tesla. His original work can be found here: https://github.com/curro/mesa/commits/nv50-compute I did some improvements on the original code, like fixing using both 3D and COMPUTE simultaneously, improving global buffers binding, and making the code closer to what nvc0 already does. This compute support has been tested by Pierre Moreau and myself with some compute kernels. This is a step towards OpenCL. Speaking about this, it seems like compute programs overlap fragment programs when they are used both. To fix this, we need to re-validate fragment programs when binding compute programs and vice versa. Note that, textures, samplers and surfaces still need to be implemented. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:42:15 +01:00
Samuel Pitoiset	7167a058ba	nv50: free interpolation parameters in nv50_program_destroy() As for nvc0, we need to free memory allocated by interpolation parameters. This fixes a memory leak spotted by valgrind. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:16:12 +01:00
Samuel Pitoiset	69271bba06	nvc0: reduce the number of GPR used when reading MP perf counters No need to allocate more GPR than used in the compute kernel which reads MP performance counters on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-14 17:38:57 +01:00
Ilia Mirkin	f94e1d9738	nouveau: don't expose HEVC decoding support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-14 10:32:10 -05:00
Marek Olšák	3694d58e6c	radeonsi: remove dead code after ES-GS linkage change Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	d79a3449a7	radeonsi: link ES-GS just like LS-HS This reduces the shader key for ES. Use a fixed attrib location based on (semantic name, index). The ESGS item size is determined by the physical index of the highest ES output, so it's almost always larger than before, but I think that shouldn't matter as long as the ESGS ring buffer is large enough. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	b1c5f3faa9	radeonsi: calculate optimal GS ring sizes to fix GS hangs on Tonga I discovered that increasing the ESGS ring size fixes GS hangs on Tonga, so let's do it properly. There is now a separate init_config_gs_rings state that is not immutable, because GS rings are resized when needed. This also saves some memory. Most apps won't need more than 1MB per ring per shader engine. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	2f5d911ba2	radeonsi: rename si_update_gs_rings Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	4acd856088	radeonsi: calculate ESGS_RING_ITEMSIZE in create_shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	a0cf589961	radeonsi: move maximum gs stream calculation into create_shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	3ab0c49f04	radeonsi: clean up small duplication in si_shader_gs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	eb0d3e8a90	gallium/radeon: shorten render_cond variable names and ..._cond -> ..._invert Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	70c40cc989	gallium/radeon: remove predicate_drawing flag Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	12596cfd4c	gallium/radeon: atomize render condition (SET_PREDICATION) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	3521907622	gallium/radeon: simplify restoring render condition after flush Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	600e212d87	gallium/radeon: don't use PREDICATION_OP_CLEAR Not setting the predication bit is sufficient. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	6eff5415e4	gallium/radeon: simplify disabling render condition for u_blitter just disable it by not setting the predication bit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	8dd1ee6ff3	r600g: don't set predication on non-draw packets This has no effect. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	6cc8f6c6a7	gallium/radeon: inline the r600_rings structure Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	3d963abc81	radeonsi: prevent recursion in si_context_gfx_flush The recursion can only occur if you modify need_cs_space to always flush. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	8569f9a87e	gallium/radeon: remove the IB flushing flag Not needed anymore. A similar flag will be introduced in the next commit, which will be private in radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	81d412e02c	gallium/radeon: move GFX/DMA flushing from add_to_buffer_list to need_cs_space need_cs_space isn't invoked so often and is called before all commands too. This is a lot cleaner. The code in radeon_add_to_buffer_list always seemed dodgy to me. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	c6012a6650	radeonsi: rename cache flushing flags once more KCACHE, TC L1 and TC L2 are renamed to: - SMEM L1 - VMEM L1 - GLOBAL L2 You can easily tell what they are used for now. Shaders must deal with coherency issues between both L1s manually, e.g. by setting GLC=1 or by using s_dcache_*. BOTH_ICACHE_KCACHE was an unused definition. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	10130ccd8c	radeonsi: set the DISABLE_WR_CONFIRM flag on CI-VI as well I missed this in commit `c3e527f93d` radeonsi: only enable write confirmation on the last CP DMA packet Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	40912dd91e	radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney otherwise the SX or CB blocks can go bananas Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-13 19:54:41 +01:00
Marek Olšák	f7757100f2	radeonsi: add glClearBufferSubData acceleration 8-bit and 16-bit clears which are not aligned to dwords are done in software. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	19773f9805	radeonsi: add SI_SAVE_FRAGMENT_STATE blitter flag Buffer clears via transform feedback won't set this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	19a9c1ecc7	gallium/u_blitter: add support for multi-dword clear values in clear_buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	e15c5c7a06	radeonsi: fix a future crash in emit_cb_target_mask This can't crash currently, but it would crash if clear_buffer from u_blitter were used with a clean context. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	65d0c558d5	radeonsi: fix unaligned clear_buffer fallback This is unreachable currently, but it will be used by unaligned 8-bit and 16-bit fills. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:40 +01:00
Marek Olšák	7f1e34e6c8	r600g: fix clear_buffer fallback with offset != 0 Discovered by luck. This code path hasn't been exercised since transform feedback was implemented. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:40 +01:00
Marek Olšák	01526136ba	gallium/radeon: fix PIPE_QUERY_GPU_FINISHED Broken by the addition of r600_multi_fence in `3b37155a68` Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89014 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-13 19:54:40 +01:00
Ilia Mirkin	39f51ec96f	nvc0/ir: add support for TGSI_SEMANTIC_HELPER_INVOCATION Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-12 17:58:42 -05:00
Ilia Mirkin	e3d9dbe304	gallium: add support for gl_HelperInvocation semantic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-11-12 17:58:23 -05:00
Brian Paul	3e74038280	st/wgl: add a comment about recursive locking in stw_make_current() Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	f45b644e11	st/wgl: add a lock assertion in stw_framebuffer_from_hwnd_locked() Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
José Fonseca	a1c9feafd5	st/wgl: add some mutex checking code This would have caught the locking bug that was fixed in the earlier "st/wgl: fix locking issue in stw_st_framebuffer_present_locked()" patch. v2: minor coding style changes by Brian. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	166769fe4b	st/wgl: rename stw_framebuffer_release() to stw_framebuffer_unlock() To match the new stw_framebuffer_lock() function. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	dabc423ed0	st/wgl: reimplement stw_framebuffer::mutex with CRITICAL_SECTION v2: update comments on the stw_framebuffer::mutex field regarding locking order. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	f71508ae79	st/wgl: include u_debug.h To get declaration for debug_printf() directly instead of getting it indirectly through os_thread.h Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	fce68832c5	st/wgl: reimplement stw_device::fb_mutex with CRITICAL_SECTION Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	fa30de7643	st/wgl: re-implement stw_device::ctx_mutex with CRITICAL_SECTION This is Windows-only code so we can use the native Win32 functions for critical sections. This will also allow us to (cleanly) add some mutex check/debug code in subsequent patches. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:24 -07:00
Brian Paul	a02385cd69	gallium/hud: add cpu graph support for Windows We support "cpu" but not "cpu#" because there's no good way of querying per-cpu usage. Also, the cpu usage is for the process, not the whole system. Original code cobbled together by Brian and then fixed/polished by Jose. Signed-off-by: Brian Paul <brianp@vmware.com>	2015-11-12 09:11:15 -07:00
Ilia Mirkin	c4182bb9b0	nv50,nvc0: add ARB_clear_texture support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-11 19:20:41 -05:00

1 2 3 4 5 ...

25092 commits