fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-01 03:48:06 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	d1c4e64a69	intel/compiler: Add a flag to avoid compacting push constants In vec4, we can just not run the pass. In fs, things are a bit more deeply intertwined. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	aecde23519	anv: Pre-compute push ranges for graphics pipelines It turns out that emitting push constants is one of the hottest paths in the driver and ANY work we do there costs us. By pre-computing things a bit ahead of time, we shave 5% off the runtime of a CPU-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	4b392ced2d	anv: Stop bounds-checking pushed UBOs The bounds checking is actually less safe than just pushing the data. If the bounds checking actually ever kicks in and it's not on the last UBO push range, then the shrinking will cause all subsequent ranges to be pushed to the wrong place in the GRF. One of the behaviors we definitely don't want is for OOB UBO access to result in completely unrelated UBOs returning garbage values. It's safer to just push the UBOs as-requested. If we're really concerned about robustness, we can emit shader code to do bounds checking which should be stupid cheap (a CMP followed by SEL). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	ebad00d9e7	anv: Delete dead shader constant pushing code As of `2d78e55a8c`, nir_intrinsic_load_constant with a constant offset is constant-folded so we should never end up with any that trigger brw_nir_analyze_ubo_ranges. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	0709c0f6b4	anv: Flatten descriptor bindings in anv_nir_apply_pipeline_layout This lets us stop tracking the pipeline layout. It also means less indirection on a very hot path. As an extra bonus, we can make some of our data structures smaller. No measurable CPU overhead improvement. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	fa120cb31c	anv: Input attachments are always single-plane Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	0a02f2a278	genxml: Mark everything in genX_pack.h always_inline Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	abfd4651ed	anv/pipeline: Assume layout != NULL In the early days of the driver we allowed layout to be VK_NULL_HANDLE and used that for some internal pipelines when we wanted to be lazy. Vulkan doesn't actually allow NULL layouts, however, so there's no reason to have this check. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Italo Nicola	59623f211b	intel/compiler: remove old comment This comment was correct some time ago, but since commit `d3c10ad427`, it isn't true anymore. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-11-18 10:20:34 -08:00
Alyssa Rosenzweig	3663340049	pan/midgard: Use shader stage in mir_op_computes_derivative A 'normal' texture op may be emitted in a vertex shader on T720 but it still doesn't take any derivatives. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-18 08:48:54 -05:00
Danylo Piliaiev	6f17fe0606	i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround Re-emitting 3DSTATE_CC_STATE_POINTERS after emitting 3DSTATE_BLEND_STATE_POINTERS fixes the shadow flickering in SuperTuxKart and Tropico 6 which was seen only on Haswell. The reason for this is unknown and the fix was found empirically. The closest mention in PRM is that it should improve performance. From the HSW PRM, volume 2b, page 823 (3DSTATE_BLEND_STATE_POINTERS): "When the BLEND_STATE pointer changes but not the CC_STATE pointer, driver needs to force a CC_STATE pointer change to improve blend performance in pixel backend." Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1834 Fixes: `eca4a654` ("i965: Disable dual source blending when shader doesn't support it on gen8+") Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-18 11:00:23 +02:00
Samuel Pitoiset	1ebd9459e7	radv: implement VK_AMD_device_coherent_memory This extension adds the device coherent and device uncached memory types. It's known to be slower than non-device coherent memory but it might be useful for debugging. This is only exposed for chips that support L2 uncached. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-18 08:20:19 +00:00
Samuel Pitoiset	2af7511ed2	ac: add radeon_info::has_l2_uncached For chips that have uncached device memory (ie. MTYPE_UC). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-18 08:20:19 +00:00
Pierre-Eric Pelloux-Prayer	3c9ea6bdfd	radeonsi: enable mesa_glthread for GfxBench It improves offscreen tests performance. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-18 09:16:18 +01:00
Alyssa Rosenzweig	bc9a7d0699	pan/midgard: Represent ld/st offset unpacked This simplifies manipulation of the offsets dramatically, fixing some UBO access related bugs. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-17 22:19:31 -05:00
Alyssa Rosenzweig	1798f6bfc3	pan/midgard: Fix masks/alignment for 64-bit loads These need to be handled with special care. Oh, Midgard, you're extra special. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-17 22:19:31 -05:00
Alyssa Rosenzweig	34a860b9e3	pan/midgard: Expose more typesize helpers Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-17 21:30:14 -05:00
Alyssa Rosenzweig	2236904f72	pan/midgard: Implement non-aligned UBOs The field is more fine-grained than we had assumed. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-17 21:18:45 -05:00
Christian Gmeiner	ee3ad0fad2	etnaviv: rs: upsampling is not supported This change makes it possible to support different downsample cases like 4 -> 2 or 4 -> 1. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-11-17 18:42:31 +00:00
Jonathan Marek	75e58d1fae	freedreno/registers: fix a6xx_2d_blit_cntl ROTATE A change from `b7093882` got overwritten by `610c8c93` Fixes: `610c8c93` ("freedreno/registers: Update with GS, HS and DS registers") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-17 17:40:53 +00:00
Jonathan Marek	0f5743429c	freedreno/ir3: disable texture prefetch for 1d array textures Prefetch only supports the basic 2D texture case, checking is_array is needed because 1d array textures pass the coord num_components==2 test. Fixes: `2a0d45ae` ("freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-17 17:01:18 +00:00
Andreas Baierl	ef9635d0bc	lima: Parse VS and PLBU command stream while making a dump This makes the streams more readable and comparable with the blob's parser as it parses the VS and PLBU stream and shows the currently known values. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-11-17 05:39:17 +00:00
Andreas Baierl	c76eb7ea84	lima: Beautify stream dumps Change the dump so that the output looks more like the output of mali-syscall-tracker [1]. This is a preparation for a more detailed stream analysis. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> [1]: https://gitlab.freedesktop.org/lima/mali-syscall-tracker	2019-11-17 05:39:17 +00:00
Aaron Watry	3b3494174d	clover/llvm: fix build after llvm 10 commit 1dfede3122ee CodeGenFileType moved from ::llvm::TargetMachine in llvm/Target/TargetMachine.h to ::llvm:: in llvm/Support/CodeGen.h Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-11-15 22:54:31 -06:00
Mauro Rossi	09ab297e9f	android: util/format: fix include path list To avoid the following build error: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_util_intermediates/format/u_format_table.c:30:10: fatal error: 'u_format.h' file not found ^~~~~~~~~~~~ 1 error generated. Fixes: `882ca6d` ("util: Move gallium's PIPE_FORMAT utils to /util/format/") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-11-16 00:06:31 +01:00
Mauro Rossi	3cd522c70a	android: radeonsi: fix build error due to wrong u_format.csv file path GEN10_FORMAT_TABLE_INPUTS requires correction of the u_format.csv file path in order to avoid the following build error: ninja: error: 'external/mesa/util/format/u_format.csv', needed by 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_radeonsi_intermediates/radeonsi/gfx10_format_table.h', missing and no known rule to make it Fixes: `882ca6d` ("util: Move gallium's PIPE_FORMAT utils to /util/format/") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-11-15 23:20:03 +01:00
Eric Anholt	b30589cbd3	mesa/st: Reuse st_choose_matching_format from st_choose_format(). We had this ad-hoc exact size matching for unsized internalformats, but st_choose_matching_format() can do exactly what we want. This means, that, for example, we'll now prefer the matching ordering for 565/565_REV if the driver supports both orders. We also pass Unpack.SwapBytes through from ChooseTextureFormat so that we can hit the memcpy path for 8888 formats when that flag is set. Some interesting format choice changes from this (on softpipe): intf/form/type before after ---------------------------------------------------- RGBA/RGBA/USHORT: R8G8B8A8_UNORM -> RGBA_UNORM16 RGB/RGBA/8888: X8B8G8R8_UNORM -> R8G8B8X8_UNORM RGB/ABGR/8888_REV: X8B8G8R8_UNORM -> R8G8B8X8_UNORM RGBA/RGBA/5551: B5G5R5A1_UNORM -> A1B5G5R5_UNORM RGBA/RGBA/4444: R8G8B8A8_UNORM -> A4B4G4R4_UNORM RGBA/GL_RGBA/1010102: R8G8B8A8_UNORM -> A2B10G10R10_UNORM DEPTH/DEPTH/UINT: Z24X8 -> Z_UNORM32 DEPTH/DEPTH/USHORT: Z24X8 -> Z_UNORM16 v2: Make sure that the baseformat still matches. v1 would pick MESA_FORMAT_L16_UNORM for RED/LUMINANCE/SHORT, when we clearly want a red format. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-15 20:32:17 +00:00
Eric Anholt	bc2b14a4a3	mesa: Don't put sRGB formats in the array format table. sRGB vs unorm was the only conflict case being guarded against in this function. Before the PIPE_FORMAT conversion, we always listed the unorm before the sRGB in the enums, but PIPE_FORMAT_A8B8G8R8_SRGB happens to be before _UNORM. We always want the unorm result here. Fixes: `807a800d8c` ("mesa: Redefine MESA_FORMAT_* in terms of PIPE_FORMAT_*.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-15 20:32:17 +00:00
Eric Anholt	e5b06008f1	mesa/st: Simplify st_choose_matching_format(). We now have a nice helper function for finding those memcpy formats, without needing to go through each entry of the mesa format table to see if it happens to match. While looking at sysprof of a softpipe GLES2 CTS run, we were spending ~8% of the CPU on ChooseTextureFormat. With this, roughly the same region of the testsuite was .4%. v2: Add Ken's fix for canonicalizing array formats. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-15 20:32:17 +00:00
Kenneth Graunke	69f109cc37	mesa: Handle GL_COLOR_INDEX in _mesa_format_from_format_and_type(). Just return MESA_FORMAT_NONE to avoid triggering unreachable; there's really no sensible thing to return for this case anyway. This prevents regressions in the next commit, which makes st/mesa start using this function to find a reasonable format from GL format and type enums. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-15 20:32:17 +00:00
Alyssa Rosenzweig	ea232c7cfd	pan/midgard: Use generic constant packing for 8/64-bit Eventually, we will want to combine constants across types, but for now let's not break the world. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig	4c182a6d11	pan/midgard: Pack 64-bit swizzles 64-bit ops have their own funky swizzles. Let's pack them, both for native 64-bit sources as well as extended 32-bit sources. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig	ba2fb98d36	pan/midgard: Fix mir_round_bytemask_down for !32b Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig	2655a300a3	pan/midgard: Implement i2i64 and u2u64 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig	855eec93b1	pan/midgard: Expand 64-bit writemasks Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Marek Olšák	bda3ec5d55	radeonsi/nir: don't lower fma, instead, fuse fma We want fma. This decreases compile times by 4% for Borderlands 2. 48505 shaders in 30515 tests Totals: SGPRS: 2206584 -> 2204784 (-0.08 %) VGPRS: 1647892 -> 1648964 (0.07 %) Spilled SGPRs: 6256 -> 6078 (-2.85 %) Spilled VGPRs: 72 -> 72 (0.00 %) Private memory VGPRs: 2176 -> 2176 (0.00 %) Scratch size: 2240 -> 2240 (0.00 %) dwords per thread Code Size: 49680804 -> 49837988 (0.32 %) bytes LDS: 74 -> 74 (0.00 %) blocks Max Waves: 371387 -> 371352 (-0.01 %) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-15 14:34:49 -05:00
Marek Olšák	dec34e880d	radeonsi/nir: call nir_lower_flrp only once per shader Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-15 14:34:49 -05:00
Marek Olšák	0714b3d57e	radeonsi/nir: remove dead function temps glxgears has dead temps after lowering color inputs to load intrinsics. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-15 14:34:49 -05:00
Marek Olšák	bc5097a7d9	gallium/noop: call finalize_nir For measuring st/mesa compile time. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-15 14:34:49 -05:00
Tomeu Vizoso	27801b90fa	panfrost: Make sure the shader descriptor is in sync with the GL state State was leaking from previous frames as we weren't updating the descriptor in all cases. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:34 +00:00
Alyssa Rosenzweig	095654e3c2	pan/midgard: Prioritize texture registers On newer GPUs, this is a no-op. On older GPUs, this prevents needless spilling since texture registers are shared with a subset of work registers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:34 +00:00
Alyssa Rosenzweig	339401b53c	pan/midgard: Disassemble with old pipeline always on T720 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig	8344d7425b	pan/midgard: Use texture, not textureLod, on early Midgard We have to disable the fixup. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig	29f5b00e6e	pan/midgard: Fix vertex texturing on early Midgard We use a different set of texture registers, probably to save hardware. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig	3866d0776f	pan/midgard: Generalize texture registers across GPUs Early Midgard uses a different set of texture registers; let's not hardcode. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:33 +00:00
Rhys Perry	df645fa369	aco: implement VK_KHR_shader_float_controls This actually supports more of the extension than the LLVM backend but we can't enable it because ACO doesn't work with all stages yet. With more of it enabled, some CTS tests fail because our 64-bit sqrt is very imprecise. I can't find any precision requirements for it anywhere, so I'm thinking it might be a CTS issue. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-15 17:36:21 +00:00
Rhys Perry	be1d11249b	aco: fix 64-bit fsign with 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-15 17:36:21 +00:00
Rhys Perry	b062b92ab1	aco: don't combine literals into v_cndmask_b32/v_subb/v_addc No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-15 17:36:21 +00:00
Rhys Perry	d7b0d9a8d8	radv: enable FP16/FP64 denormals earlier and only for LLVM ACO sets this itself and will have to set it differently in the future to support shaderDenormFlushToZeroFloat64. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-15 17:36:21 +00:00
Michel Dänzer	c6c7652753	gitlab-ci: Organize images using new REPO_SUFFIX templates feature Two benefits: Most docker image related environment variables can now be defined in the jobs where they're used instead of globally. The DEBIAN_TAG values are propagated to other jobs via YAML anchors. Images on https://gitlab.freedesktop.org/mesa/mesa/container_registry are now organized in separate repositories with a suffix matching the name of the job which makes sure the image is there. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-15 16:23:22 +01:00

117674 commits