fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-27 10:08:13 +02:00

Author	SHA1	Message	Date
Eric Anholt	743dcdd936	vc4: Allow VBOs to be mapped during execution. There's no reason we can't -- the mappings we expose are basically equivalent to persistent/coherent, already. Improves mesa-demos drawoverhead (no state change) performance by 5.21362% +/- 1.25078% (n=11).	2017-06-20 09:05:44 -07:00
Brian Paul	d8148ed10a	gallium/vbuf: avoid segfault when we get invalid glDrawRangeElements() A common user error is to call glDrawRangeElements() with the 'end' argument being one too large. If we use the vbuf module to translate some vertex attributes this error can cause us to read past the end of the mapped hardware buffer, resulting in a crash. This patch adjusts the vertex count to avoid that issue. Typically, the vertex_count gets decremented by one. This fixes crashes with the Unigine Tropics and Sanctuary demos with older VMware hardware versions. The issue isn't hit with VGPU10 because we don't hit this fallback. No piglit changes. CC: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-20 08:03:18 -06:00
Brian Paul	2a9d8a45a6	gallium/vbuf: add some const qualifiers Helps understandability a bit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-20 08:03:12 -06:00
Brian Paul	ed83e73c4e	translate: whitespace fixes in translate_generic.c	2017-06-20 07:56:34 -06:00
Brian Paul	ceb9ca7fa5	softpipe: remove unused softpipe_context::line_stipple_counter Trivial.	2017-06-20 07:56:34 -06:00
Samuel Pitoiset	ea2492b62f	radeonsi: set correct usage flag according to image access type Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-20 13:01:18 +02:00
Marek Olšák	58af1f6bb0	winsys/amdgpu: fix a deadlock when waiting for submission_in_progress First this happens: 1) amdgpu_cs_flush (lock bo_fence_lock) -> amdgpu_add_fence_dependency -> os_wait_until_zero (wait for submission_in_progress) - WAITING 2) amdgpu_bo_create -> pb_cache_reclaim_buffer (lock pb_cache::mutex) -> pb_cache_is_buffer_compat -> amdgpu_bo_wait (lock bo_fence_lock) - WAITING So both bo_fence_lock and pb_cache::mutex are held. amdgpu_bo_create can't continue. amdgpu_cs_flush is waiting for the CS ioctl to finish the job, but the CS ioctl is trying to release a buffer: 3) amdgpu_cs_submit_ib (CS thread - job entrypoint) -> amdgpu_cs_context_cleanup -> pb_reference -> pb_destroy -> amdgpu_bo_destroy_or_cache -> pb_cache_add_buffer (lock pb_cache::mutex) - DEADLOCK The simple solution is not to wait for submission_in_progress, which we need in order to create the list of dependencies for the CS ioctl. Instead of building the list of dependencies as a direct input to the CS ioctl, build the list of dependencies as a list of fences, and make the final list of dependencies in the CS thread itself. Therefore, amdgpu_cs_flush doesn't have to wait and can continue. Then, amdgpu_bo_create can continue and return. And then amdgpu_cs_submit_ib can continue. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101294 Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-20 12:53:46 +02:00
Samuel Pitoiset	afeaa2e98a	radeonsi: update all resident texture descriptors when needed To avoid useless DCC fetches when DCC is disabled, descriptors have to be updated in order to reflect this change. This is quite similar to how we update descriptors of bound textures. As a side effect, this should also prevent VM faults when bindless textures are invalidated, because the VA in the descriptor has to be updated accordingly as well. I don't see any performance improvements with DOW3. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-20 10:14:55 +02:00
Samuel Pitoiset	f00e80e3f7	radeonsi: keep track of the sampler state for texture handles Needed for updating all resident texture descriptors when dirty_tex_counter changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-20 10:14:52 +02:00
Marek Olšák	3fc99f1299	radeonsi: fix dumping shader descriptors into ddebug logs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-19 20:16:20 +02:00
Marek Olšák	f9dc29a9a5	radeonsi: add a workaround for inexact SNORM8 blitting again GFX9 is affected. We only have tests for GL_x_SNORM where x is R8, RG8, RGB8, and RGBA8. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-19 20:15:36 +02:00
Marek Olšák	0f827b51c0	radeonsi/gfx9: fix TC-compatible stencil compression Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-19 20:15:36 +02:00
Marek Olšák	8a264dd829	radeonsi/gfx9: fix TXF_LZ with 1D textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-19 20:15:36 +02:00
Marek Olšák	353b60cab5	radeonsi/gfx9: disable sparse buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-19 20:15:36 +02:00
Nicolai Hähnle	25e5534734	gallium/radeon/gfx9: fix PBO texture uploads to compressed textures st/mesa creates a surface that reinterprets the compressed blocks as RGBA16UI or RGBA32UI. We have to adjust width0 & height0 accordingly to avoid out-of-bounds memory accesses by CB. Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-19 12:05:15 +02:00
Nicolai Hähnle	4d5bb1b987	r600: fix off-by-one in egd_tables.py Port of the corresponding fix in sid_tables.py. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-19 12:05:12 +02:00
Samuel Pitoiset	6ff6863c32	radeonsi: reduce overhead for resident textures which need color decompression This is done by introducing a separate list. si_decompress_textures() is now 5x faster. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-18 14:10:38 +02:00
Samuel Pitoiset	06ed251c32	radeonsi: reduce overhead for resident textures which need depth decompression This is done by introducing a separate list. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-18 14:10:36 +02:00
Samuel Pitoiset	705a6a560e	radeonsi: use util_dynarray_foreach for bindless resources Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-18 14:10:34 +02:00
Samuel Pitoiset	8d9e76ce1f	gallium/radeon: add a new HUD query for the number of resident handles Useful for debugging performance issues when ARB_bindless_texture is enabled. This query doesn't make a distinction between texture and image handles. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-18 14:08:08 +02:00
Emil Velikov	68aa39d5c2	r600: include libelf headers only as needed Headers are required only when building with OpenCL. As we're building w/o it libelf may be missing, hence we'll error out as below: src/gallium/drivers/r600/evergreen_compute.c:27:10: fatal error: 'gelf.h' file not found ^ 1 error generated. Fixes: `d96a210842` ("r600g,compute: provide local copy of functions from ac_binary.c") Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reported-by: Mauro Rossi <issor.oruam@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-06-17 16:57:18 +01:00
Emil Velikov	1f958c1337	radeonsi: include ac_binary.h for struct ac_shader_binary The header embeds the struct so it needs the header inclusion instead of the dummy forward declaration. Cc: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: Marek Olšák <marek.olsak@amd.com> Cc: Tom Stellard <tstellar@redhat.com> Fixes: `32206c5e56` ("radeonsi: Add radeon_shader_binary member to struct si_shader") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-06-17 11:38:02 +01:00
Emil Velikov	7e1c42cf89	r600, radeon: move radeon_shader_binary_{init,clean} back to radeon Those are used by r600 and radeonsi, so moving them within the former was a bad idea. Fixes: `d96a210842` ("r600g,compute: provide local copy of functions from ac_binary.c") Cc: Jan Vesely <jan.vesely@rutgers.edu> Cc: Aaron Watry <awatry@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-06-17 11:37:58 +01:00
Brian Paul	e3f5b8ac16	svga: add new num-failed-allocations HUD query This counter is incremented if we fail to allocate memory for vertex/index/const buffers, textures, etc. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2017-06-16 17:04:08 -06:00
Brian Paul	b27281c110	gallium/hud: support GALLIUM_HUD_DUMP_DIR feature on Windows Use a dummy implementation of the access() function. Use \ path separator. Add a few comments. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2017-06-16 17:04:02 -06:00
Brian Paul	d6cb912d65	svga: add a few minor comments Trivial.	2017-06-16 17:03:01 -06:00
Tim Rowley	a6237e4b7f	swr/rast: Fix read-back of viewport array index Binner/clipper read viewport array index from the vertex header as needed. Move viewport state to BACKEND_STATE. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	9b448da60f	swr/rast: Refactor includes to limit simdintrin.h usage Reduces the files rebuilt after modifying simdintrin.h from 84 to 64. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	08a466aec0	swr/rast: Fix read-back of render target array index The last FE stage can emit render target array index. Currently we only check to see if GS is emitting it. Moved the state to BACKEND_STATE and plumbed the driver to set it. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	17cdd1e796	swr/rast: Adjust cast for gcc warning Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	bea00a7b6e	swr/rast: Don't transition hottile resolved->dirty during store tiles Fixes crash when dumping render targets and RT surface has been deleted. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	5c08bfbd17	swr/rast: gen_llvm_types.py support for SIMD256/SIMD512 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	21baadfe58	swr/rast: Properly size GS stage scratch space Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	3695c8ec1e	swr/rast: Fix early z / query interaction For certain cases, we perform early z for optimization. The GL_SAMPLES_PASSED query was providing erroneous results because we were counting the number of samples passed before the fragment shader, which did not work if the fragment shader contained a discard. Account properly for discard and early z, by anding the zpass mask with the post fragment shader active mask, after the fragment shader. Fixes the following piglit tests: - occlusion-query-discard - occlusion_query_meta_fragments Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	b7eb86c617	swr/rast: Share vertex memory between VS input/output Removes large simdvertex stack allocation. Vertex shader must ensure reads happen before writes. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	7f3be3f0b8	swr/rast: Add support for dynamic vertex size for VS output Add support for dynamic vertex size for the vertex shader output. Add new state in SWR_FRONTEND_STATE to specify the size. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	8e5d11cd7b	swr/rast: SIMD16 FE - improve calcDeterminantIntVertical Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	01eca81cd4	swr/rast: Add support to PA for variable sized vertices Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	b10cdb217a	swr/rast: Rework attribute layout Move fixed attributes to the top and pack single component SGVs. WIP to support dynamically allocated vertex size. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	36ac8ba511	swr/rast: Remove explicit primitive id slot in the vertex layout - Remove any special casing in the PS stage when primitive ID is input. Treat as a normal attribute that must be set up properly in the FE linkage. - Remove primitive id from the PS_CONTEXT and TRI_FLAGS Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	8716e0d8b4	swr/rast: Fix invalid 16-bit format traits for A1R5G5B5 Correctly handle formats of <= 16 bits where the component bits don't add up to the pixel size. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Tim Rowley	a25093de71	swr/rast: Implement JIT shader caching to disk Disabled by default; currently doesn't cache shaders (fs,gs,vs). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-06-16 16:20:16 -05:00
Brian Paul	1c33dc77f7	gallium/docs: improve docs for SAMPLE_POS, SAMPLE_INFO, TXQS, MSAA semantics For the SAMPLE_POS and SAMPLE_INFO opcodes, clarify resource vs. render target queries, range of position values, swizzling, etc. We basically follow the DX10.1 conventions. For the TXQS opcode and TGSI_SEMANTIC_SAMPLEID, clarify return value and type. For the TGSI_SEMANTIC_SAMPLEPOS system value, clarify the range of positions returned. v2: use 'undef' for unused vector components. Use (0.5, 0.5, undef, undef) for sample pos when MSAA not applicable. v3: Add note that OPCODE_SAMPLE_INFO, OPCODE_SAMPLE_POS are not used yet and the information is subject to change. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-06-16 14:07:31 -06:00
Brian Paul	005c978c5a	svga: add some missing SVGA_STATS_* enum values, prefix strings To fix the build when VMX86_STATS is defined. Also, some minor whitespace changes to match upstream code. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-06-16 14:06:53 -06:00
Bruce Cherniak	80b587ba27	swr: Don't crash when encountering a VBO with stride = 0. The swr driver uses vertex_buffer->stride to determine the number of elements in a VBO. A recent change to the state-tracker made it possible for VBO's with stride=0. This resulted in a divide by zero crash in the driver. The solution is to use the pre-calculated vertex element stream_pitch in this case. This patch fixes the crash in a number of piglit and VTK tests introduced by `17f776c27b`. There are several VTK tests that still crash and need proper handling of vertex_buffer_index. This will come in a follow-on patch. v2: Correctly update all parameters for VBO constants (stride = 0). Also fixes the remaining crashes/regressions that v1 did not address, without touching vertex_buffer_index. Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-06-16 13:45:24 -05:00
Christian Gmeiner	82db591155	etnaviv: add rs-operations sw query It could be useful to get the number of emitted resolve operations when doing driver optimizations. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2017-06-16 15:26:12 +02:00
Lucas Stach	5065549e2a	etnaviv: advertise correct max LOD bias The maximum LOD bias supported is the same as the max texture level supported. Fixes piglit: ext_texture_lod_bias Fixes: `c9e8b49b` ("etnaviv: gallium driver for Vivante GPUs") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Lucas Stach <dev@lynxeye.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-06-16 15:26:23 +02:00
Lucas Stach	8644b59b5d	etnaviv: mask correct channel for RB swapped rendertargets Now that we support RB swapped targets by using a shader variant, we must derive the color mask from both the blend state and the bound framebuffer. Fixes piglit: fbo-colormask-formats Fixes: `7f62ffb68a` ("etnaviv: add support for rb swap") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Lucas Stach <dev@lynxeye.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-06-16 15:26:23 +02:00
Lucas Stach	d6aa2ba2b2	etnaviv: replace translate_clear_color with util_pack_color This replaces the open coded etnaviv version of the color pack with the common util_pack_color. Fixes piglits: arb_color_buffer_float-clear fcc-front-buffer-distraction fbo-clearmipmap Fixes: `c9e8b49b` ("etnaviv: gallium driver for Vivante GPUs") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Lucas Stach <dev@lynxeye.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-06-16 15:26:23 +02:00
Lucas Stach	6633880e7e	etnaviv: remove bogus assert etna_resource_copy_region handles resources with multiple samples by falling back to the software path. There is no need to kill the application there. Fixes: `c9e8b49b` ("etnaviv: gallium driver for Vivante GPUs") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Lucas Stach <dev@lynxeye.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-06-16 15:26:23 +02:00
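
Commit 3695c8ec1e above describes fixing GL_SAMPLES_PASSED by ANDing the zpass mask with the post fragment shader active mask. The following is a minimal sketch of that idea only, not the swr code: the function names, the 32-bit mask width, and the bit-counting helper are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: count the set bits in a sample/lane mask. */
static unsigned count_bits(uint32_t mask)
{
    unsigned n = 0;
    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Counting straight from the early-z result, before the fragment shader
 * runs. This over-counts when the shader discards, which is the problem
 * the commit message describes. */
static unsigned samples_passed_early_z_only(uint32_t zpass_mask)
{
    return count_bits(zpass_mask);
}

/* Counting after the fragment shader: a sample is counted only if it
 * passed the depth test AND is still active after the shader
 * (i.e. it was not discarded). */
static unsigned samples_passed_after_discard(uint32_t zpass_mask,
                                             uint32_t post_fs_active_mask)
{
    return count_bits(zpass_mask & post_fs_active_mask);
}
```

With a zpass mask of 0xF0 and a post-shader active mask of 0x3C, the early-z-only count is 4, while the combined count is 2, since only bits 4 and 5 are set in both masks.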

31705 commits