fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-05 20:28:04 +02:00

Author	SHA1	Message	Date
Daniel Schürmann	703ce617ca	aco: restrict scheduling depending on max_waves Previously, we allowed all shaders to reduce the number of max_waves to as low as 5. Restricting this on shaders with low register demand, increases the total number of waves while the VMEM def-use distances hardly change. This patch also changes the max number of move operations per MEM instruction. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Jason Ekstrand	beca63c6c0	anv: Avoid emitting UBO surface states that won't be used This shaves around 4-5% off of a CPU-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 16:05:57 +00:00
Jason Ekstrand	24c0545b2d	intel/vec4: Set brw_stage_prog_data::has_ubo_pull In `0e4a75f917`, Ken added a flag brw_stage_prog_data which indicates whether any UBO pulls ever occur. Unfortunately, he neglected to set the bit in the vec4 back-end. This was fine at the time because the optimization was intended for iris which does not support gen7 and using the vec4 back-end on Gen8+ requires an environment variable. We want to use this in Vulkan which does support Gen7 so we want the information from the vec4 back-end as well as scalar. Fixes: `0e4a75f917` "intel/compiler: Record whether any pull constant..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 16:05:57 +00:00
Samuel Pitoiset	5a9d777f5a	radv: fix perftest options RADV_PERFTEST=outooforder has been removed a while ago. This fixes dumping the options into hang reports. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 14:49:30 +01:00
Samuel Pitoiset	c895e08281	radv: move nomemorycache debug option at the right place Fixes: `6571000071` ("radv: add debug option to turn off in memory cache") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 14:49:28 +01:00
Samuel Pitoiset	d4e0bef1bb	radv: fix dumping SPIR-V into hang reports Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 13:02:08 +00:00
Tapani Pälli	4f8c86e6a5	mesa: enable ARB_gpu_shader_int64 in compat profile Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 14:37:27 +02:00
Tapani Pälli	2d8b8d3bd1	mesa: add [Program]Uniform*64ARB display list support This is required for int64 to be enabled in compat profile. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 14:37:27 +02:00
Bas Nieuwenhuizen	396195e8f1	radv: Enable VK_KHR_timeline_semaphore. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	4aa75bb3bd	radv: Add wait-before-submit support for timelines. This is actually a non-threaded implementation. I'd summarize this as event-based submission. When submit happens we walk a tree of submissions that depend on the syncobj signal operations to be submitted and if those submissions have no other dependencies we start to execute them immediately. Or, well I still use a list to avoid issues with long chains and the stack size when using recursion. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	88d41367b8	radv: Add timelines with a VK_KHR_timeline_semaphore impl. This does not fully do wait-before-submit, to be done in a follow up patch. For kernels without support for timeline syncobjs, this adds an implementation of non-shareable timelines using legacy syncobjs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	2117c53b72	radv: Add temporary datastructure for submissions. So we can defer them. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	c3eae659e7	radv: Split semaphore into two parts as enum+union. This is in preparation to adding more types. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	84d9551b23	radv: Always enable syncobj when supported for all fences/semaphores. This simplifies code for timeline semaphores by needing to support less configurations. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	45f4a639a8	radv: Improve fence signalling in QueueSubmit. Only signalling it once. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	a9c8424e08	radv: Do sparse binding in queue submission. So we have one place to do queue things if we end up deferring submissions. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	915e9178fa	radv: Split out commandbuffer submission. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	43ba44357c	radv: Clean up unused variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	2e3a635ee6	radv: Add an early exit in the secure compile if we already have the cache entries. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-30 11:38:50 +01:00
Bas Nieuwenhuizen	d78809632f	radv: Compute hashes in secure process for secure compilation. To prevent poisoning arbitrary cache entries. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-30 11:37:41 +01:00
Erik Faye-Lund	4c4ac2d4d5	zink: drop nop descriptor-updates If there's nothing to be done, let's actually do nothing. Seems like a good idea. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-10-30 10:29:23 +00:00
Erik Faye-Lund	b222f28357	zink: use bitfield for dirty flagging Bitfields are a bit more idiomatic than explicit flags, and harder to get wrong. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-10-30 10:29:23 +00:00
Erik Faye-Lund	6d30abb4f1	zink: use dynamic state for line-width This will lead to fewer pipelines in the cache, which is assumed to become our most unavoidable performance bottle-neck down the line. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-10-30 10:29:23 +00:00
Duncan Hopkins	d2bb63c8d4	zink: Use optimal layout instead of general. Reduces valid layer warnings. Fixes RADV image noise. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-10-30 09:09:49 +00:00
Michel Dänzer	aaf1b09270	gitlab-ci: Disable meson-windows job for the time being It needs a CI runner carrying the mesa-windows tag, but there's none available currently.	2019-10-30 09:38:20 +01:00
Timothy Arceri	cf25664686	radv: make use of radv_sc_read() Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Timothy Arceri	28fff3efbc	radv: add radv_sc_read() helper This is a function with timeout support for reading from the pipe between processes used for secure compile. Initially we hardcode the timeout to 5 seconds. We can adjust the timeout limit in future if needed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Timothy Arceri	23a6827e4d	radv: allow select() calls in secure compile This will be used in the following patch to support timeouts for reading the pipe between processes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Lepton Wu	1abf05764b	mapi: Improve the x86 tsd stubs performance. This skips touching %ebx most times and it shows that glGetString performance increased from 114M/s to 120M/s on my desktop. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 20:50:05 -07:00
Lepton Wu	41407d5e9f	mapi: Inline call x86_current_tls. This saves one return and a simple benchmark which calls glGetString repeatedly on my desktop shows it improves calls per second from 123M to 141M. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1997 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 17:18:06 -07:00
Lepton Wu	b2b8639d8e	mapi: Clean up entry_patch_public for x86 tls Remove hard coded 16 and use entry_generate_or_patch to patch public stubs. The generated code actually is slightly tighter than before since the "nop" instructions before the final "jmp" get removed. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 17:18:06 -07:00
Lepton Wu	1fb75bee90	mapi: split entry_generate_or_patch for x86 tls The code works exactly the same as before. Just split this function out so we can reuse it. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 17:18:06 -07:00
Jonathan Gray	45206d7673	mapi: Adapted libglvnd x86 tsd changes The x86 assembly language stub in src/mapi/entry_x86_tsd.h does not generate PIC (position-independent code). This causes text relocations which bring troubles on recent versions of FreeBSD, OpenBSD, Android. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108541 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 17:13:14 -07:00
Caio Marcelo de Oliveira Filho	9c3c206e71	spirv: Don't fail if multiple ordering semantics bits are set Vulkan requires that only one bit for the ordering is set, but old versions of GLSLang just set all the bits. This was fixed as part of `c51287d744` but we can still find older versions (or shaders compiled with it) around. So instead of failing, emit a warning and fallback to the effective result of any combination of multiple bits: AcquireRelease. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2018 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-29 14:53:46 -07:00
Sagar Ghuge	f0db4c5204	intel/isl: Allow stencil buffer to support compression on Gen12+ v2: (Nanley Chery) - Fix commit title - Fix comment Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	b22b349443	iris: Resolve stencil resource prior to copy or used by CPU v2: Decide aux usage in get_copy_region_aux_settings (Nanley Chery) v3: Use isl_surf_usage_is_stencil function (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	5d331251cf	iris: Prepare resources before stencil blit operation We have to resolve destination surfaces if we are blitting to and from the same surface. v2: Revert unrelated change (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	4e0ed40ed7	iris: Prepare depth resource if clear_depth enable Avoid preparing depth resource, if we did fast depth clear before. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	81de49a9f2	iris: Prepare stencil resource before clear depth stencil Let aux surface state tracker track the stencil buffer's aux state while clearing depth stencil buffer. v2: Fix condition check (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	b8223991b5	iris: Resolve stencil buffer lossless compression with WM_HZ_OP packet Even though stencil buffer compression looks like regular lossless color compression w/o fast clear support, we have to resolve stencil buffer with WM_HZ_OP packet. v2: Check if resource is stencil with helper function (Nanley Chery) v3: Remove unnecessary included file (Nanley Chery) v4: (Nanley Chery) - Avoid stencil buffer aux state transition by improving condition check Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	87c57b8dae	intel/blorp: Set stencil resolve enable bit When set, the stencil buffer is filled with the true stencil values and we have to disable stencil buffer clear enable bit. v2: 1) Refactor code little bit (Nanley Chery) 2) Fix assertion (Nanley Chery) v3: 1) Remove unnecessary assignment (Nanley Chery) 2) Fix GEN_GEN check (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	c401186762	intel: Track stencil aux usage on Gen12+ Enable stencil compression enable and control surface enable bit if stencil buffer lossless compression is enabled. v2: Remove unnecessary GEN_GEN check (Nanley Chery) v3: (Nanley Chery) - Change commit subject tag from intel/isl to intel - Keep assignment order correct Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	53d472df24	intel/blorp: Add helper function for stencil buffer resolve On Gen12+, Stencil buffer's lossless compression should be resolved with WM_HZ_OP packet. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	ce208be2d8	intel/blorp: Assign correct view while clearing depth stencil We never saw any failures regarding this typo but it's good to assign correct stencil view while constructing blorp_params. Fixes: `0cabf93b80` "intel/blorp: Add an entrypoint for clearing depth and stencil" Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	4287e0a4e4	genxml/gen12: Add Stencil Buffer Resolve Enable bit Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Nanley Chery	0a2a9a4a5b	iris: Allocate main and aux surfaces together On Gen12, the CCS buffer address doesn't have to be referenced in state packets. In the case of a stencil buffer with CCS, the kernel won't know the location of the CCS unless an extra call is made to pin its address. To avoid this extra call, make the CCS part of the main surface. v2. Update comment above bo_size. (Jordan) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 14:46:15 -07:00
Nanley Chery	ff5bc81b51	iris: Determine aux offsets within configure_aux If a resource has a modifier, the main and aux surfaces will share a BO. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 14:46:15 -07:00
Nanley Chery	f0ed86c6c6	iris: Bail resource creation upon aux creation error The functions used during aux buffer configuration and creation only return false for exceptional errors. Don't proceed with surface creation in those cases. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 14:46:15 -07:00
Nanley Chery	8b62e3d978	iris: Drop iris_resource::aux::extra_aux::bo The primary and secondary aux buffers are always allocated in the same BO. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 14:46:15 -07:00
Duncan Hopkins	bb8e6994cc	zink: pass line width from rast_state to gfx_pipeline_state. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-10-29 20:38:26 +00:00
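
Commits 23a6827e4d and 28fff3efbc describe a select()-based read from the pipe between the secure-compile processes, with a timeout initially hardcoded to 5 seconds. The Mesa helper itself is written in C and is not shown in this log. The following is only a minimal Python sketch of the general technique, assuming a POSIX pipe; the name read_with_timeout and its return convention are hypothetical and do not come from the radv source.

```python
import os
import select
import time


def read_with_timeout(fd, size, timeout=5.0):
    """Read exactly `size` bytes from `fd`.

    Returns the bytes on success, or None if the deadline passes or the
    writer closes the pipe first. Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    chunks = []
    remaining = size
    while remaining > 0:
        wait = deadline - time.monotonic()
        if wait <= 0:
            return None  # overall deadline expired
        # Block until the pipe is readable or the remaining time runs out.
        readable, _, _ = select.select([fd], [], [], wait)
        if not readable:
            return None  # select() timed out
        data = os.read(fd, remaining)
        if not data:
            return None  # end of file: the other process closed its end
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)
```

The deadline is tracked across iterations so that a peer which trickles data in small pieces cannot extend the total wait beyond the timeout.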


117102 commits