fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-05 00:58:05 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	9168e7a65d	pan/midgard: Improve barrier disassembly Just move some state from unknowns to actual keywords. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:47 -05:00
Alyssa Rosenzweig	d208212f80	pan/midgard: Use dummy tag for empty shaders Fixes INSTR_INVALID_ENC in dEQP-GLES31.functional.compute.basic.empty Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:47 -05:00
Alyssa Rosenzweig	b2cab6b6db	pan/midgard: Fix 32/64 mixed swizzle packing Occurs in SSBO address computation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:47 -05:00
Alyssa Rosenzweig	a55a2e02a5	pan/midgard: Allow jumping out of a shader This comes up as a `return;` instruction in a compute shader. We need to use the special tag 1 to signify "break". Fixes numerous INSTR_INVALID_ENC faults in dEQP-GLES31.functional.compute.basic.* Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:47 -05:00
Alyssa Rosenzweig	3f59098d1a	pan/midgard: Implement barriers Barriers execute on the texture pipeline on Midgard, so let's tentatively handle barrier() as conservatively as possible (forcing memory barriers of both buffers and shared memory). Implementation isn't quite there yet -- it doesn't look at interactions of adjacent barriers like it's supposed to -- but the core is there. Fixes dEQP-GLES31.functional.compute.basic.ssbo_local_barrier_single_invocation Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:47 -05:00
Alyssa Rosenzweig	4f0b928921	pan/midgard: Fix swizzles harder Just for disassembly for now~ Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:47 -05:00
Alyssa Rosenzweig	fbe1fd3de0	pan/midgard: Fix missing prefixes I was wondering where those were going... :) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `c1952779d6` ("pan/decode: Dump to a file") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:46 -05:00
Alyssa Rosenzweig	521406a069	pan/midgard: Track pressure when scheduling ld/st Fixes RA failure in dEQP-GLES31.functional.shaders.builtin_functions.common.modf.* (which uses multiple indirect SSBO writes) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:46 -05:00
Alyssa Rosenzweig	9603126b74	panfrost: Allocate RAM backing of shared memory Unlike other GPUs, Mali does not have dedicated shared memory for compute workloads. Instead, we allocate shared memory (backed by RAM), and the general memory access functions have modes to access shared memory (essentially, think of these modes as adding the allocated base + workgroupid * stride in hardware). So let's allocate enough memory based on the shared_size parameter and supply it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:46 -05:00
Alyssa Rosenzweig	50138abb5a	panfrost: Rename unknown2_8 to padding It's zero everywhere. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:46 -05:00
Alyssa Rosenzweig	6d9ee3e65a	panfrost: Rename bifrost_framebuffer->mali_framebuffer (And bifrost_fb_extra to mali_framebuffer_extra, bifrost_render_target to mali_render_target) These structures are the norm on midgard t760+, drop the bifrost names, it's silly... unrelated to the rest of the series but while I'm messing with pandecode and cleaning up bifrost abstractions, might as well. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:46 -05:00
Alyssa Rosenzweig	6dc105555b	panfrost: Unify bifrost_scratchpad with mali_shared_memory It looks like these are the same structure, so this allows us to reuse mali_shared_memory across architectures, and dispenses with the Bifrost-specific mystery of the scratchpads... nothing so mysterious after all, just stack. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:46 -05:00
Alyssa Rosenzweig	254f40fd53	panfrost: Identify mali_shared_memory structure This small structure is used to configure shared memory and stack for compute shaders, and is also present at the beginning of framebuffer descriptors. Let's factor it out. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:46 -05:00
Alyssa Rosenzweig	418ca5dc1a	panfrost: Ensure compute shader_meta is zeroed In theory the hardware doesn't care but it'll make for easier traces. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:46 -05:00
Alyssa Rosenzweig	058faf5a4b	panfrost: Update comment about magic number relating to barriers It's a complicated story. But from what I can tell, in GL compute without barriers, the blob is able to redistribute the workgroups in various ways (that are not yet understood), whereas with barriers it cannot redistribute anything, which accounts for erratic workgroup packing without barriers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>	2020-02-16 09:16:46 -05:00
Dave Airlie	8f5a252d35	ci: bump debian image and change llvm deps to 8 v3: remove version in a few places (Michel) Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3805> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3805>	2020-02-15 04:15:00 +00:00
Dave Airlie	e7375e1795	gallivm/s390: fix pass init order on s390 with llvm 8 (v2) llvm 8 has some missing pass dependencies, fix the s390 case as well. v2: add ARM also (Michel) Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3805>	2020-02-15 04:15:00 +00:00
Kenneth Graunke	a603822b2f	iris: Trim "../../src/gallium/drivers/iris/" out of debug dump filenames Easier to read. v2: Also trim "/iris/" (Jordan Justen) Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3830> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3830>	2020-02-15 00:55:55 +00:00
Kenneth Graunke	96f247d1b3	iris: Dump frame markers with INTEL_DEBUG=submit Now you can see which batches go with which frames. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3830>	2020-02-15 00:55:55 +00:00
Marek Olšák	e395ce03e9	gallium/cso_hash: remove another layer of pointer indirection Convert this: `struct cso_hash { union { struct cso_hash_data d; struct cso_node e; } data; };` to this: `struct cso_hash { struct cso_hash_data data; struct cso_node *end; };` 1) data is not a pointer anymore. 2) "end" points to "data" and acts as the end of the linked list. 3) This code is still crazy. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:28 -05:00
Marek Olšák	e0bb7b87e2	gallium/cso_hash: cosmetic changes, no behavior changes Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:28 -05:00
Marek Olšák	789ed29d59	gallium/cso_hash: remove always constant variable nodeSize Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:28 -05:00
Marek Olšák	a8bbf10540	gallium/cso_hash: make cso_hash declared within structures instead of alloc'd This removes one level of indirection. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:28 -05:00
Marek Olšák	f8594a06e4	gallium/cso_hash: inline a bunch of functions I'm probably not getting anything out of this, but it's harmless. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	cf86f522b2	gallium/u_vbuf: adjust the heuristic for unrolling indices This improves performance in the first subtest of Viewperf11/Catia by 10%. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	55d8baa285	gallium/u_upload_mgr: don't do align twice in the u_upload_alloc fast path Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	19c18d532e	gallium/u_upload_mgr: reduce dereferences by adding buffer_size Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	909a2d0ed3	st/mesa: simplify releasing the current attrib buffer Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	6954efce23	st/mesa: make st_setup_current static Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	e3617fd00b	st/mesa: change some loops from while to do..while in st_atom_array.c Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	fd6636ebc0	st/mesa: simplify determination whether a draw needs min/max index Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	1d93372802	st/mesa: simplify determination whether a draw has user vertex buffers Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	61e4c582e0	st/mesa: always inline the code setting non-64bit vertex elements Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	3c98dccd40	mesa: remove unused _mesa_draw_indirect All drivers that expose ARB_draw_indirect also set the driver callback. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Marek Olšák	e6448f993b	mesa: translate into gallium vertex formats in mesa/main Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3829>	2020-02-14 18:16:27 -05:00
Francisco Jerez	8d3b86e34a	intel/fs/gen7+: Implement discard/demote for SIMD32 programs. At this point this simply involves fixing the initialization of the sample mask flag register to take the right dispatch mask from the thread payload, and fixing sample_mask_reg() to return f1.1 for the second half of a SIMD32 thread. This improves Manhattan 3.1 performance by 2.4%±0.31% (N>40) on my ICL with SIMD32 enabled relative to falling back to SIMD16 for the shaders that use discard. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:49 -08:00
Francisco Jerez	04c7d3d4b1	intel/fs: Return consistent UW types from sample_mask_reg() in fragment shaders. In SIMD32 programs that don't use discard, the upper 16 bits of the UD result of sample_mask_reg() don't contain the sample mask of the upper 16 channels as one would expect. Stop pretending we are returning a valid 32-bit mask. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:49 -08:00
Francisco Jerez	1c6853a9be	intel/fs: Refactor predication on sample mask into helper function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	a792e11f5c	intel/fs/gen7+: Swap sample mask flag register and FIND_LIVE_CHANNEL temporary. FIND_LIVE_CHANNEL was using f1.0-f1.1 as temporary flag register on Gen7, instead use f0.0-f0.1. In order to avoid collision with the discard sample mask, move the latter to f1.0-f1.1. This makes room for keeping track of the sample mask of the second half of SIMD32 programs that use discard. Note that some MOVs of the sample mask into f1.0 become redundant now in lower_surface_logical_send() and lower_a64_logical_send(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	083fd96a97	intel/fs: Use helper for discard sample mask flag subregister number. Use it instead of hard-coding f0.1 for the sample mask of programs that use discard. This will make the task easier when we replace f0.1 with another flag register location in order to support discard with SIMD32 shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	a6bc11a789	intel/fs: Make sample_mask_reg() local to brw_fs.cpp and use it in more places. It's only really useful there. This will avoid confusion with another helper with a similar purpose I'm about to introduce that will be useful in multiple files from the FS back-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	b84fa0b31e	intel/fs/gen11: Work around dual-source blending hangs in combination with SIMD32. The SIMD8 dual-source blending framebuffer write messages seem to have trouble releasing the pixel scoreboard dependency in SIMD32 dispatch mode, which leads to hangs. I have a better workaround for this which doesn't involve disabling SIMD32 when dual-source blending is enabled, but I'm still investigating some issues with it. Limit the dispatch width to SIMD16 in such cases for the moment in order to make the CI happy on ICL with SIMD32 fragment shaders enabled. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	57dee58c82	intel/fs: Set src0 alpha present bit in header when provided in message payload. Currently the "Source0 Alpha Present to RenderTarget" bit of the RT write message header is derived from brw_wm_prog_data::replicate_alpha. However the src0_alpha payload is provided anytime it's specified to the logical message. This could theoretically lead to an inconsistency if somebody provided a src0_alpha value while brw_wm_prog_data::replicate_alpha was false, as I'm planning to do in a future commit in order to implement a hardware workaround. Instead calculate the header bit based on whether a src0_alpha value was provided to the logical message, which guarantees the same behavior on pre-ICL and ICL+ (the latter used an extended descriptor bit for this which didn't suffer from the same issue). Remove the brw_wm_prog_data::replicate_alpha flag. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	e14529ff32	intel/fs/gen12: Workaround data coherency issues due to broken NoMask control flow. Together with the fixup_nomask_control_flow() pass introduced in a previous patch, this implements a less invasive alternative to the workaround documented in the hardware spec for GEN:BUG:1407528679, which doesn't involve disabling structured control flow. Under some conditions Gen12 hardware can end up executing a BB with all channels disabled, which will lead to the execution of any NoMask instructions in it, even though any execution-masked instructions will be correctly shot down. This could break assumptions of the SWSB pass if the data computed by a NoMask instruction is synchronized against by using an SWSB annotation baked into a regular execution-masked instruction, since the first (NoMask) instruction may be executed redundantly by the hardware, even though the second will correctly be shot down, potentially leading to a RaW or WaW hazard if a third instruction subsequently accesses the destination register of the first instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	4e4e8d793f	intel/fs/gen12: Fixup/simplify SWSB annotations of SIMD32 scratch writes. Found by inspection. Existing code was trying to avoid assuming that an SBID had been assigned to the virtual instruction, but synchronizing the header setup with respect to the previous SIMD16 SEND by using SYNC.ALLRD doesn't really seem possible unless the SEND instruction had been assigned an SBID. Assert-fail instead if no SBID has been allocated. Fixes: `15e3a0d9d2` "intel/eu/gen12: Set SWSB annotations in hand-crafted assembly." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	a8ac0bd759	intel/fs/gen12: Workaround unwanted SEND execution due to broken NoMask control flow. This is a less invasive alternative to the workaround documented in the hardware spec for GEN:BUG:1407528679, which doesn't involve disabling structured control flow (it's unlikely that switching to GOTO/JOIN would have actually fixed the problem anyway). Under some conditions Gen12 hardware can end up executing a BB with all channels disabled, which will lead to the execution of any NoMask instructions in it, even though any execution-masked instructions will be correctly shot down. This may break assumptions of some NoMask SEND messages whose descriptor depends on data generated by live invocations of the shader. This avoids the problem by predicating certain instructions on an ANY horizontal predicate that makes sure that their execution is omitted when all channels of the program are disabled. The shader-db impact of this patch seems to be minimal: total instructions in shared programs: 17169833 -> 17169913 (0.00%) instructions in affected programs: 30663 -> 30743 (0.26%) helped: 0 HURT: 42 total cycles in shared programs: 336966176 -> 336968568 (0.00%) cycles in affected programs: 2367290 -> 2369682 (0.10%) helped: 0 HURT: 13 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	008f95a043	intel/fs: Add virtual instruction to load mask of live channels into flag register. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	b8b509fb92	intel/fs/gen7: Fix fs_inst::flags_written() for SHADER_OPCODE_FIND_LIVE_CHANNEL. We need to pass a width of 32 since the opcode bashes the whole f1.0 register on IVB. This is unlikely to have caused problems since f1.0 is largely unused currently. That's likely to change soon though, even on platforms other than Gen7. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	c9e33e5cbf	intel/fs/cse: Make HALT instruction act as CSE barrier. Found by inspection. This seems particularly likely to cause problems with instructions dependent on the current execution mask like SHADER_OPCODE_FIND_LIVE_CHANNEL or the FS_OPCODE_LOAD_LIVE_CHANNELS instruction I'm about to introduce, but one could imagine it leading to data corruption if CSE ever managed to combine two instructions before and after the FS_OPCODE_PLACEHOLDER_HALT, since the one before may not be executed for some channels. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Andreas Baierl	fe1b0b7c50	lima/parser: Extend rsw parsing showing strings instead of numbers Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3807> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3807>	2020-02-14 21:48:25 +00:00
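
Commit 9603126b74 above describes shared memory on Mali as a RAM allocation that is addressed as base + workgroupid * stride. A minimal sketch of that arithmetic follows. The names `demo_shared_slice` and `demo_shared_total` are hypothetical and are not Mesa functions, and using one fixed stride per workgroup is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of Mesa): address of the shared-memory
 * slice that belongs to one workgroup, following the
 * base + workgroupid * stride description in the commit message. */
static uint64_t demo_shared_slice(uint64_t base, uint32_t workgroup_id,
                                  uint32_t stride)
{
    /* Widen before multiplying so the product cannot wrap at 32 bits. */
    return base + (uint64_t)workgroup_id * stride;
}

/* Hypothetical helper: bytes needed to back every workgroup's slice,
 * assuming each workgroup gets exactly one stride-sized slice. */
static uint64_t demo_shared_total(uint32_t stride, uint32_t num_workgroups)
{
    assert(stride > 0);
    return (uint64_t)stride * num_workgroups;
}
```

With a base of 0x1000 and a stride of 256 bytes, workgroup 3 would start at 0x1300 under these assumptions.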

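Commit e395ce03e9 above embeds the hash data in the parent structure and uses an `end` pointer that points at that embedded data as the list terminator. The sketch below shows the general idea with simplified, hypothetical `demo_*` types. The field layout is an assumption and does not reproduce the real `cso_hash` definitions; the sentinel pointer is only ever compared against, never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the real cso_hash layout. */
struct demo_node {
    struct demo_node *next;
    int value;
};

struct demo_data {
    struct demo_node *first;  /* head of the chain */
    int size;
};

struct demo_hash {
    struct demo_data data;    /* embedded: no separate allocation */
    struct demo_node *end;    /* sentinel: the address of `data` */
};

static void demo_init(struct demo_hash *h)
{
    /* `end` aliases the embedded data and marks the end of the list. */
    h->end = (struct demo_node *)&h->data;
    h->data.first = h->end;
    h->data.size = 0;
}

static void demo_push(struct demo_hash *h, struct demo_node *n, int value)
{
    assert(n != h->end);      /* the sentinel is never a real node */
    n->value = value;
    n->next = h->data.first;
    h->data.first = n;
    h->data.size++;
}

static int demo_sum(const struct demo_hash *h)
{
    int sum = 0;
    /* Walk until the sentinel address is reached instead of NULL. */
    for (const struct demo_node *n = h->data.first; n != h->end; n = n->next)
        sum += n->value;
    return sum;
}
```

One consequence of this design is that the structure holds a pointer into itself, so in this sketch a `demo_hash` must be re-initialized if it is copied or moved.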

120208 commits