fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-04 19:10:35 +02:00

Author	SHA1	Message	Date
Nicolai Hähnle	e0af3bed2c	amd/common: round cube array slice in ac_prepare_cube_coords The NIR-to-LLVM pass already does this; now the same fix covers radeonsi as well. Fixes various tests of dEQP-GLES31.functional.texture.filtering.cube_array.combinations.* Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-18 11:25:18 +02:00
Nicolai Hähnle	6fb0c1013b	radeonsi: workaround for gather4 on integer cube maps This is the same workaround that radv already applied in commit `3ece76f03d` ("radv/ac: gather4 cube workaround integer"). Fixes dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i/ui.* Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-18 11:25:17 +02:00
Nicolai Hähnle	b7b4a14db5	st/glsl_to_tgsi: fix theoretical memory leak It can't really happen since we don't use subroutines. CID: 1417491 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2017-09-18 11:25:17 +02:00
Iago Toral Quiroga	3d9cb39fd0	i965: emit BRW_NEW_AUX_STATE on aux state changes Fixes a regression introduced with `b96313c0e1`, which removed BRW_NEW_BLORP for a bunch of SURFACE_STATE setup code, including render targets, on the basis that blorp invalidates binding tables but not surface states, however, at least on Broadwell, this caused a regression in a CTS test, which Ken and Jason tracked down to the fact that we are not uploading new render target surface states after allocating new CCS_D surfaces for fast clears (which allocation is deferred until an actual clear occurs). The reason this only fails in BDW is that on SKL+ we use CCS_E which is allocated up front so it exists in the initial surface state, the problem can be reproduced in these platforms too if we use INTEL_DEBUG=norcb to force the CCS_D path. This patch, together with the ones preceding it, fixes the regression by ensuring that we track and flag as dirty all aux state changes. Credit goes to Jason and Ken for figuring out the reason for the regression. Fixes: KHR-GL45.transform_feedback.draw_xfb_test Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-09-18 10:47:51 +02:00
Iago Toral Quiroga	9a8bf42308	i965: emit BRW_NEW_AUX_STATE when we change the fast clear value v2: rename intel_miptree_set_clear_value to intel_miptree_set_clear_color (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-09-18 10:47:51 +02:00
Iago Toral Quiroga	ca65b9e62d	i965: emit BRW_NEW_AUX_STATE if we drop the aux surface Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-09-18 10:47:51 +02:00
Iago Toral Quiroga	5b27816b22	i965: rename BRW_NEW_FAST_CLEAR_COLOR to BRW_NEW_AUX_STATE We want to use this flag to signal changes to the aux surfaces, so let's not make it about fast clearing only. Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-09-18 10:47:51 +02:00
Gert Wollny	e688a9ef6a	gbm: Add gbm_device_get_format_modifier_plane_count to test Adding gbm_device_get_format_modifier_plane_count made the test gbm-symbols-check fail, this patch adds the according function name to the test. Fixes: `8824141b8d` (gbm: Add a gbm_device_get_format_modifier_plane_count function) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-09-17 12:53:46 +03:00
Bas Nieuwenhuizen	969537d935	radv: Add support for more DCC compression with VK_KHR_image_format_list. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-16 11:55:56 +02:00
Bas Nieuwenhuizen	d398db2acb	radv: Add code to check if two formats can share DCC metadata. Ported from radeonsi. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-16 11:55:42 +02:00
Kenneth Graunke	4f8d1af0f6	i965: Add an INTEL_DEBUG=reemit option. Jason and I use this for debugging all the time. Recompiling the driver to enable it is kind of annoying. It's a great thing to try along with always_flush_batch=true and always_flush_cache=true to detect a class of problems - namely, atoms listening to an insufficient set of dirty bits. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-09-15 21:51:45 -07:00
Jan Vesely	3115687f9b	clover: Fix build after LLVM r313390 v2: pass llvm context reference instead of a pointer Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-09-15 21:39:54 -04:00
Bas Nieuwenhuizen	5ef3c2bcef	radv: Don't redundantly emit pipelines after secondary cmd buffer. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-15 23:12:25 +02:00
Bas Nieuwenhuizen	979978ee06	radv: Check for GFX9 for 1D arrays in image_size intrinsic. Only on GFX9 we implement them as 2D images. This fixes: dEQP-VK.image.image_size.1d_array.readonly_12x34 dEQP-VK.image.image_size.1d_array.readonly_1x1 dEQP-VK.image.image_size.1d_array.readonly_32x32 dEQP-VK.image.image_size.1d_array.readonly_7x1 dEQP-VK.image.image_size.1d_array.readonly_writeonly_12x34 dEQP-VK.image.image_size.1d_array.readonly_writeonly_1x1 dEQP-VK.image.image_size.1d_array.readonly_writeonly_32x32 dEQP-VK.image.image_size.1d_array.readonly_writeonly_7x1 dEQP-VK.image.image_size.1d_array.writeonly_12x34 dEQP-VK.image.image_size.1d_array.writeonly_1x1 dEQP-VK.image.image_size.1d_array.writeonly_32x32 dEQP-VK.image.image_size.1d_array.writeonly_7x1 Fixes: `1bcb953e16` "radv: handle GFX9 1D textures" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-15 22:06:56 +02:00
Eric Engestrom	915dc6db45	i965: drop unused variables Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-09-15 12:09:13 -07:00
Jason Ekstrand	7bd5931cc1	i965/tex: Unify the TexImage and TexSubImage code It's nearly the same so there's no good reason why it can't be in a common function. The one difference is that _mesa_store_teximage calls AllocTextureImageBuffer for us, while _mesa_store_texsubimage doesn't, but we don't need that anyway - intelTexImage already does it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-09-15 10:59:05 -07:00
Jason Ekstrand	bb811fa828	i965/tex: Remove the for_glTexImage parameter from texsubimage_tiled_memcpy It is set to false in both callers. It isn't needed for glTexImage because intelTexImage calls AllocTextureImageBuffer before calling texsubimage_tiled_memcpy. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-09-15 10:59:04 -07:00
Jason Ekstrand	6314dd13f7	i965/tex: Make a couple of helpers static Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-09-15 10:59:03 -07:00
Jason Ekstrand	82b3ca1981	i965: Move TexSubImage functions to intel_tex_image.c These two paths are basically the same. There's no good reason to have them in different files. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-09-15 10:58:58 -07:00
Jason Ekstrand	a43d379000	i965/blorp: Set r8stencil_needs_update when writing stencil This fixes a crash on Haswell when we try to upload a stencil texture with blorp. It would also be a problem if someone tried to texture from stencil after glBlitFramebuffers. Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-09-15 10:58:55 -07:00
Matt Turner	1bbe180873	util/u_atomic: Add implementation of __sync_val_compare_and_swap_8 Needed for 32-bit PowerPC. Cc: "17.2" <mesa-stable@lists.freedesktop.org> Fixes: `a6a38a038b` ("util/u_atomic: provide 64bit atomics where they're missing") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-09-15 09:37:30 -07:00
Matt Turner	d075a4089e	util: Link libmesautil into u_atomic_test Platforms without particular atomic operations require the implementations in u_atomic.c Cc: "17.2" <mesa-stable@lists.freedesktop.org> Fixes: `a6a38a038b` ("util/u_atomic: provide 64bit atomics where they're missing") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-09-15 09:37:30 -07:00
Lionel Landwerlin	5ff06ddf3b	vulkan: update headers & registry to VK 1.0.61 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-09-15 08:56:40 -07:00
Gert Wollny	c75d781610	mesa/st/tests: Correct build flags and force -std=c++11 Include src/gallium/Automake.inc, correct the build flags accordingly. Force -std=c++11 (extensively used by the test) as otherwise it gets defined only when building against llvm >= 3.9. Fixes: `7be6d8fe12` ("mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665 Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)	2017-09-15 13:56:28 +01:00
Emil Velikov	3c5fb7346f	automake: include radv_shader.h in the sources list Otherwise it will be missing from the tarball, leading to build failure. Fixes: `d4d777317b` ("radv: move shaders related code to radv_shader.c") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-09-15 13:56:27 +01:00
Gurkirpal Singh	6a8aa11c20	st/omx_bellagio: Rename state tracker and option Changes --enable-omx option to --enable-omx-bellagio Signed-off-by: Gurkirpal Singh <gurkirpal204@gmail.com> Reviewed-and-Tested-by: Julien Isorce <julien.iso...@gmail.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-09-15 14:28:36 +02:00
Tapani Pälli	acbfcb7105	i965: fix build warning on clang fixes following warning: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') cast is needed to avoid this change turning in to another warning: warning: format specifies type 'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned long') Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-09-15 12:39:33 +03:00
Samuel Pitoiset	8e8c7c6703	radv: fix a potential crash if attachments allocation failed Also, it's useless to set the error code twice. Though, we should probably skip the next commands when the command buffer is considered invalid. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-09-15 09:16:38 +02:00
Samuel Pitoiset	a0495d4bb3	radv: dump the device name into the hang report Similar to RadeonSI renderer string. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-09-15 09:16:35 +02:00
Samuel Pitoiset	176c2ad10c	radv: add get_chip_name() callback Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-09-15 09:16:34 +02:00
Dave Airlie	1b163238f5	r600: add .gitignore for egd_tables.h	2017-09-15 13:55:01 +10:00
Timothy Arceri	a70a401f52	radeonsi: enable STD430 packing of UBOs by default Before this change we were defaulting to STD140 which is slightly less efficient at packing arrays. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-15 11:42:55 +10:00
Timothy Arceri	fac9f2c4b0	st/mesa: set UseSTD430AsDefaultPacking const based on CAP Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-15 11:42:55 +10:00
Timothy Arceri	c96e45ebf0	gallium: introduce PIPE_CAP_LOAD_CONSTBUF Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-15 11:42:55 +10:00
Timothy Arceri	b4401cc104	radeonsi: make use of LOAD for UBOs v2: always set can_speculate and allow_smem to true Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-15 11:42:55 +10:00
Timothy Arceri	51cf16319d	mesa/st: add LOAD support for UBOs This will allow us to use STD430 packing by default if the driver supports it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-15 11:42:55 +10:00
Timothy Arceri	ee0fbc8b71	mesa/st: create add_buffer_to_load_and_stores() helper Will be used to add LOAD support to UBOs. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-15 11:42:54 +10:00
Timothy Arceri	6fa60b5e40	gallium: add CONSTBUF type to tgsi_file_type This will be used to distinguish between load types when using the TGSI_OPCODE_LOAD opcode. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-15 11:42:54 +10:00
Dave Airlie	b6f6ead198	virgl: drop const dimensions on first block. The virgl protocol version of tgsi doesn't handle this yet, transform it back to the old ways. Thanks to Nicolai Hähnle <nicolai.haehnle@amd.com> for also writing nearly the same patch. Fixes: `41e342d5` tgsi/ureg: always emit constants (and their decls) as 2D Tested-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-15 10:33:14 +10:00
Dave Airlie	a7a7bf21bd	st/glsl->tgsi: fix u64 to bool comparisons. Otherwise we end up using a 32-bit comparison which didn't end well. Timothy caught this while playing around with some opt passes. Fixes: `278580729a` (st/glsl_to_tgsi: add support for 64-bit integers) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-15 09:49:50 +10:00
Kenneth Graunke	62f2670cba	i965: Print size of validation and relocation lists in INTEL_DEBUG=flush It's nice to have this information. While we're at it, tweak the formatting to try and vertically align numbers in the common case. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-09-14 16:17:36 -07:00
Kenneth Graunke	7c5988e615	i965: Disentangle batch and state buffer flushing. We now flush the batch when either the batchbuffer or statebuffer reaches the original intended batch size, instead of when the sum of the two reaches a certain size (which makes no sense now that they're separate buffers). With this change, we also need to update our "are we near the end?" estimate to require separate batch and state buffer space. I obtained these estimates by looking at the size of draw calls in the Unreal 4 Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true). This will significantly impact the size of our batches. I've adjusted both down to try and be roughly similar to what we had been doing. On various benchmarks, a 20kB batch and 16kB statebuffer seemed to be about right, but we may need to adjust this further. I tried a 16kB batch, but that regressed Synmark OglMultithread performance by a fair bit. 32kB for both would have significantly increased our batch sizes. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-09-14 16:17:36 -07:00
Kenneth Graunke	2c46a67b41	i965: Delete BATCH_RESERVED handling. Now that we can grow the batchbuffer if we absolutely need the extra space, we don't need to reserve space for the final do-or-die ending commands. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-09-14 16:17:36 -07:00
Kenneth Graunke	9034d157c0	i965: Make BLORP properly avoid batch wrapping. We need to set brw->no_batch_wrap to actually avoid flushing in the middle of our BLORP operation, and instead grow the batchbuffer. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-09-14 16:17:36 -07:00
Kenneth Graunke	2dfc119f22	i965: Grow the batch/state buffers if we need space and can't flush. Previously, we would just assert fail and die in this case. The only safeguard is the "estimated max prim size" checks when starting a draw (or compute dispatch or BLORP operation)...which are woefully broken. Growing is fairly straightforward: 1. Allocate a new larger BO. 2. memcpy the existing contents over to the new buffer 3. Set the new BO to the same GTT offset as the old BO. When emitting relocations, we write the presumed GTT offset of the target BO. If we changed it, we'd have to update all the existing values (by walking the relocation list and looking at offsets), which is more expensive. With the old BO freed, ideally the kernel could simply place the new BO at that offset anyway. 4. Update the validation list to contain the new BO. 5. Update the relocation list to have the GEM handle for the new BO (which we can skip if using I915_EXEC_HANDLE_LUT). v2: Update to handle malloc'd shadow buffers. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-09-14 16:17:36 -07:00
Kenneth Graunke	78c404f106	i965: Use a separate state buffer, but avoid changing flushing behavior. Previously, we emitted GPU commands and indirect state into the same buffer, using a stack/heap like system where we filled in commands from the start of the buffer, and state from the end of the buffer. We then flushed before the two met in the middle. Meeting in the middle is fatal, so you have to be certain that you reserve the correct amount of space before emitting commands or state for a draw. Currently, we will assert !no_batch_wrap and die if the estimate is ever too small. This has been mercifully obscure, but has happened on a number of occasions, and could in theory happen to any application that issues a large draw at just the wrong time. Estimating the amount of batch space required is painful - it's hard to get right, and getting it right involves a lot of code that would burn CPU time, and also be painful to maintain. Rolling back to a saved state and retrying is also painful - failing to save/restore all the required state will break things, and redoing state emission burns a lot of CPU. memcpy'ing to a new batch and continuing is painful, because commands we issue for a draw depend on earlier commands as well (such as STATE_BASE_ADDRESS, or the GPU being in a particular state). The best plan is to never run out of space, which is totally doable but pretty wasteful - a pessimal draw requires a huge amount of space, and rarely occurs. Instead, we'd like to grow the batch buffer if we need more space and can't safely flush. We can't grow with a meet in the middle approach - we'd have to move the state to the end, which would mean updating every offset from dynamic state base address. Using separate batch and state buffers, where both fill starting at the beginning, makes it easy to grow either as needed. This patch separates the two concepts. We create a separate state buffer, with a second relocation list, and use that for brw_state_batch. 
However, this patch tries to retain the original flushing behavior - it adds the amount of batch and state space together, as if they were still co-existing in a single buffer. The hope is to flush at the same time as before. This is necessary to avoid provoking bugs caused by broken batch wrap handling (which we'll fix shortly). It also avoids suddenly increasing the size of the batch (due to state not taking up space), which could have a significant performance impact. We'll tune it later. v2: - Mark the statebuffer with EXEC_OBJECT_CAPTURE when supported (caught by Chris). Unfortunately, we lose the ability to capture state data on older kernels. - Continue to support the malloc'd shadow buffers. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-09-14 16:17:36 -07:00
Kenneth Graunke	0bf3fa4c53	i965: Pass screen to intel_batchbuffer_reset(). This will let us access screen->kernel_features in the next patch. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-09-14 16:17:36 -07:00
Kenneth Graunke	2e68c4e454	i965: Prepare INTEL_DEBUG=bat decoding for a separate statebuffer. We'll need to read from both buffers when decoding state. This also drops the "failed to map" fallback - it's completely useless on LLC systems where we write directly to the mapped BO. It's not that useful on non-LLC systems either. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-09-14 16:17:36 -07:00
Kenneth Graunke	e723255901	i965: Split brw_emit_reloc into brw_batch_reloc and brw_state_reloc. brw_batch_reloc emits a relocation from the batchbuffer to elsewhere. brw_state_reloc emits a relocation from the statebuffer to elsewhere. For now, they do the same thing, but when we actually split the two buffers, we'll change brw_state_reloc to use the state buffer. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-09-14 16:17:36 -07:00
Kenneth Graunke	1674a0bcbc	i965: Refactor relocs into a brw_reloc_list structure. I'm planning on splitting batch and state into separate buffers, at which point we'll need two relocation lists. In preparation for that, this patch refactors the relocation stuff into a structure we can replicate...which looks a lot like anv_reloc_list. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-09-14 16:17:36 -07:00
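
The five growth steps listed in commit 2dfc119f22 can be illustrated with a toy model. This is only a sketch in Python, not the driver's C code: `BufferObject`, `Relocation`, `Batch`, and the doubling policy are hypothetical names and choices made for illustration, and the integer `handle` merely stands in for a GEM handle.

```python
from dataclasses import dataclass, field


@dataclass
class BufferObject:
    handle: int        # hypothetical stand-in for a GEM handle
    size: int
    gtt_offset: int    # presumed GPU address written into relocations
    data: bytearray = field(default_factory=bytearray)


@dataclass
class Relocation:
    target_handle: int
    presumed_offset: int


class Batch:
    """Toy buffer that grows instead of failing when it runs out of space."""

    def __init__(self, size, gtt_offset):
        self._next_handle = 1
        self.bo = self._alloc(size, gtt_offset)
        self.used = 0
        self.validation_list = [self.bo.handle]
        self.relocs = []

    def _alloc(self, size, gtt_offset):
        bo = BufferObject(self._next_handle, size, gtt_offset, bytearray(size))
        self._next_handle += 1
        return bo

    def emit(self, payload):
        end = self.used + len(payload)
        if end > self.bo.size:
            # Doubling is an assumption of this sketch, not the driver's policy.
            self.grow(max(self.bo.size * 2, end))
        self.bo.data[self.used:end] = payload
        self.used = end

    def grow(self, new_size):
        old = self.bo
        # Steps 1 and 3: allocate a larger BO and give it the old GTT offset,
        # so presumed offsets already written into relocations stay valid.
        new = self._alloc(new_size, old.gtt_offset)
        # Step 2: copy the existing contents over.
        new.data[:self.used] = old.data[:self.used]
        # Step 4: point the validation list at the new BO.
        self.validation_list = [
            new.handle if h == old.handle else h for h in self.validation_list
        ]
        # Step 5: retarget relocations that referenced the old handle.
        for reloc in self.relocs:
            if reloc.target_handle == old.handle:
                reloc.target_handle = new.handle
        self.bo = new
```

Keeping the GTT offset unchanged is what lets the sketch skip rewriting `presumed_offset` in every relocation, which is the cost the commit message says it wants to avoid.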

88154 commits