fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-09 02:28:10 +02:00

Author	SHA1	Message	Date
Alexandros Frantzis	e0ffcdf16a	virgl: Don't try to use cached resources for legacy fences Resources for fences should not be from the cache, since we are basing the fence status on the resource creation busy status. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:16 -07:00
Alexandros Frantzis	8089d3658a	virgl: More info about chosen alignment value Add more info about why the value of VIRGL_MAP_BUFFER_ALIGNMENT was chosen. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:44:53 -07:00
Chia-I Wu	371743157e	virgl: store all info about atomic buffers We will need the full info. This also speeds up virgl_attach_res_atomic_buffers and fixes resource leaks when the context is destroyed. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Chia-I Wu	98fd742d7e	virgl: add shader images to virgl_shader_binding_state It replaces virgl_context::images. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Chia-I Wu	f965efb3c8	virgl: add SSBOs to virgl_shader_binding_state It replaces virgl_context::ssbos. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Chia-I Wu	920c4143f0	virgl: add UBOs to virgl_shader_binding_state It replaces virgl_context::ubos. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Chia-I Wu	2e21d66d7a	virgl: add virgl_shader_binding_state virgl_shader_binding_state will be used to manage all per-stage shader bindings. For now, it manages only sampler views. This replaces virgl_textures_info and fixes some issues - start_slot is now honored - views outside of [start_slot, start_slot+count) are unmodified - views are released when the context is destroyed Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Kenneth Graunke	30314270d4	iris: Zero shs->cbuf0 when binding a passthrough TCS Fixes valgrind errors when running two CTS tests back to back: - KHR-GL45.shader_image_load_store.basic-allTargets-loadStoreT* (The first test has an actual TCS, the second uses passthrough.)	2019-06-07 15:13:42 -07:00
Jason Ekstrand	1e6b32d08c	intel/blorp: Only double the fast-clear rect alignment on HSW This restriction was accidentally added to the BSpec/PRM as an unrestricted restriction starting with the HSW docs and it was never removed. However, it only ever applied to HSW and actually potentially causes problems on BDW and above where we have mipmapped fast-clears. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-06-07 22:00:55 +00:00
Rob Clark	3c456cf583	freedreno/a6xx: re-arrange program stageobj/group Split out a separate program config state group to run early before the other groups. This seems to help w/ intermittent "missed tiles" (although I had assumed that was a mem2gmem issue), or at least I can't reproduce that issue with this patch, but can without. It has the benefit of HLSQ_VS_CNTL.CONSTLEN matching for VS and BS. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Rob Clark	958f6ffb60	freedreno/a6xx: fix hangs with newer sqe fw With the newer (v1.76) fw, we were getting hangs (compared to older v1.66 fw). Re-work the GMEM code to structure things a bit closer to the blob. This moves some PKT7 packets from IB2 to IB1, which I think is what was confusing SQE and causing it to get stuck in an infinite loop. But in general structuring things at least closer to the same way blob does makes it easier to compare cmdstream. Note: this is a bit on the large side for what I'd normally consider for stable.. but right now it is looking like it is the newer fw that is headed for linux-firmware. This should defn have some soak time on master, but probably a good idea for this patch to end up in distro mesa builds by the time a630_sqe.fw hits linux-firmware. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Rob Clark	1d002cfade	freedreno/a6xx: WFI before RB_CCU_CNTL writes This seems to be in a block of non buffered/context regs. Blob always WFIs before write, so probably a good idea. Annoyingly, compared to earlier gens, it is a bit harder to tell from the register offset whether it is a buffered reg, it isn't as simple as everything below 0x2000, it seems. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Rob Clark	8a02ca807d	freedreno/a6xx: don't pre-dispatch texture fetch on accident Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Rob Clark	b820c09fa8	freedreno/a6xx: fix issues with gallium HUD In some cases the draw for the text wasn't working. This seems to be fixed by resyncing some of the "golden registers" from blob (initial values were based on somewhat older blob version). Perhaps good to have a bit of soak time on master, but would be good to eventually land in 19.x stable branches. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Nanley Chery	b4198e792c	anv/cmd_buffer: Initialize the clear color struct for CNL+ On CNL+, the clear color struct is composed of RGBA channel values and fields which are either reserved by the HW or used to control fast-clears. Currently anv initializes the channel values to zero and allows the other fields to be undefined. Satisfy the MBZ field requirements by removing an optimization that doesn't hold true for CNL+ and pulling in the number of dwords to initialize from ISL. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-07 18:43:06 +00:00
Jon Turney	87173ded6e	glx/windows: Fix compilation with -Werror-format Fix compilation where the DWORD type is used with a format, after -Werror-format added by `c9c1e261`. Some Win32 API types are different fundamental types in the 32-bit and 64-bit versions. This problem is then further compounded by the fact that whilst both 32-bit Cygwin and 32-bit MinGW use the ILP32 data model, 64-bit MinGW uses the LLP64 data model, but 64-bit Cygwin uses the LP64 data model. This makes it near impossible to write printf format specifiers which are correct for all those targets. In the Win32 API, DWORD is an unsigned, 32-bit type. So, it is defined in terms of an unsigned long, except in the LP64 data model used by 64-bit Cygwin, where it is an unsigned int. It should always be safe to cast it to unsigned int and use %u or %x. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 11:28:48 -07:00
Kenneth Graunke	cd796120c9	iris: Rename bind_state to bind_shader_state. bind_state is possibly the worst name ever. For create, we used create_shader_state, which is more descriptive. Put shader in the name.	2019-06-07 11:26:20 -07:00
Kenneth Graunke	d5d2fb5c4c	isl: Mark enum isl_channel_select packed so it becomes 1 byte. I recently discovered that the following code led to valgrind errors: struct isl_swizzle swizzle = ISL_SWIZZLE_IDENTITY; VALGRIND_CHECK_MEM_IS_DEFINED(&swizzle, sizeof(swizzle)); which is surprising, because struct isl_swizzle is simply: struct isl_swizzle { enum isl_channel_select r:4; enum isl_channel_select g:4; enum isl_channel_select b:4; enum isl_channel_select a:4; }; and the above code initializes all of them with a C99 initializer. Iván Briano reminded me that C99 initializers don't necessarily zero padding. A quick inspection revealed that sizeof(struct isl_swizzle) was 4 (rather than the expected 2). Ian Romanick suggested changing it to uint16_t, since this is essentially dicing up an unsigned, and that worked. This patch marks enum isl_channel_select packed, changing its size from 4 bytes to 1 byte. This then makes struct isl_swizzle 2 bytes, with no bogus padding fields. This eliminates valgrind undefined memory warnings. These isl_swizzle values become part of our BLORP blit program keys, which are then hashed. This undefined padding was being included in the hashing, possibly leading to issues. I originally saw this error when running KHR-GL45.texture_size_promotion.functional in iris under valgrind. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-07 11:09:44 -07:00
Alyssa Rosenzweig	e1c14b2820	panfrost/ci: Texture wrap tests are legitimately fixed These depended on the wallpaper reload. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig	8442dde169	panfrost/midgard: Lower inot to inor with 0 We were previously lowering to inand, but the second arg was not duplicated so inot would always return ~0. Oops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig	d415748955	panfrost/midgard: Cleanup tag fetch in disassembler Trivial. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig	d3ad8d6b48	panfrost/midgard: Use fancy iterator Trivial cleanup. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig	ae20bee75e	panfrost/midgard: Cull dead branches This fixes bugs with complex control flow. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	c62f2ff852	panfrost/midgard: Add mir_print_bundle helper This helps with debugging scheduling/emission. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	fd6d6c1b15	panfrost/midgard/disasm: Pretty-print branch tags Just makes it a little more obvious what's going on. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	2ebf22c399	panfrost/ci: Note some since-fixed tests Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	de8d49acdc	panfrost/midgard: Vectorize I/O This uses the new mesa/st functionality for NIR I/O vectorization, which eliminates a number of corner cases (resulting in assorted dEQP failures and regressions) and should improve performance substantially due to lessened pressure on the load/store pipe. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	4aced18031	panfrost/midgard: Remove varyings delay pass This pass interfered with the more delicate path required for non-vectorized I/O. It's also ugly and duplicating the job of an actual honest-to-goodness scheduler. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	43568f2675	panfrost/midgard: Apply component to load_input Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Eric Engestrom	440fe0eb43	nir: fix s/&&/\|\|/ typo Fixes: `cd73b6174b` "nir/lower_to_source_mods: Stop turning add, sat, and neg into mov" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-07 16:06:25 +01:00
Kristian H. Kristensen	b9bbac6234	freedreno/a6xx: Drop struct stage array This now boils down to just picking between binning or vertex shader and dummy_fs or real fs, which we can do in a couple of lines of code instead. The constlen logic isn't doing what it thinks it's doing, both constlens at this point MAX2(s[VS].constlen, align(state->bs->constlen, 4)); are binning shader constlens. We'll have to revisit the constlen logic, but this commit doesn't change how it works. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:33:12 -07:00
Kristian H. Kristensen	9382a3c11d	freedreno/a6xx: Drop support for SS6_DIRECT shader upload a6xx only supports indirect shaders. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:33:10 -07:00
Kristian H. Kristensen	0ef00ceb2e	freedreno/a6xx: Share shader_t_to_opcode We have a similar function in fd6_program.c. Move to fd6_emit.h and share. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:33:03 -07:00
Kristian H. Kristensen	4552162e2d	freedreno/a6xx: Consolidate more of dword 0 building in fd6_draw_vbo There's already a bit of duplicated logic here and tessellation will add more. Build up dword 0 in fd6_draw_vbo() and drop the a4xx in the process. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:32:59 -07:00
Kristian H. Kristensen	cae6b4d741	freedreno: Move fd4_size2indextype() helper to freedreno_util.h In preparation for refactoring fd6_draw.c a bit. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:32:34 -07:00
Samuel Pitoiset	0905189a25	radv: enable VK_EXT_sample_locations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:17 +02:00
Samuel Pitoiset	05f5fa661f	radv: enable HTILE for images that might need variable sample locations This is now supported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:14 +02:00
Samuel Pitoiset	e7677a697b	radv: handle sample locations during automatic layout transitions From the Vulkan spec 1.1.109: "Some implementations may need to evaluate depth image values while performing image layout transitions. To accommodate this, instances of the VkSampleLocationsInfoEXT structure can be specified for each situation where an explicit or automatic layout transition has to take place. [...] and VkRenderPassSampleLocationsBeginInfoEXT can be chained from VkRenderPassBeginInfo to provide sample locations for layout transitions performed implicitly by a render pass instance." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:11 +02:00
Samuel Pitoiset	d0d41e58c3	radv: determine the first subpass id for every attachment Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:08 +02:00
Samuel Pitoiset	f58e9f6d69	radv: handle sample locations during explicit depth/stencil transitions From the Vulkan spec 1.1.109, "Some implementations may need to evaluate depth image values while performing image layout transitions. To accommodate this, instances of the VkSampleLocationsInfoEXT structure can be specified for each situation where an explicit or automatic layout transition has to take place. VkSampleLocationsInfoEXT can be chained from VkImageMemoryBarrier structures to provide sample locations for layout transitions performed by vkCmdWaitEvents and vkCmdPipelineBarrier calls." This handles explicit depth/stencil layout transitions performed with CmdWaitEvents() or CmdPipelineBarrier(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:01 +02:00
Samuel Pitoiset	a20925f2a9	radv: allow the depth decompress pass to emit dynamic sample locations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:00 +02:00
Samuel Pitoiset	2dd8dfd913	radv: allow to set dynamic sample locations to the depth decompress pass If VK_EXT_sample_locations is used, the driver might need to emit the sample locations specified during layout transitions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:10:55 +02:00
Samuel Pitoiset	d78990c174	radv: allow to save/restore sample locations during meta operations This will be used for the depth decompress pass that might need to emit variable sample locations during layout transitions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:10:50 +02:00
Kenneth Graunke	22025595f3	iris: Sweep the NIR in iris_create_uncompiled_shader(). We run a ton of backend specific passes here (mostly brw_preprocess_nir) and ought to sweep up any unused memory at this point, since we're going to hang on to this NIR for as long as the linked program lives.	2019-06-07 01:29:38 -07:00
Eduardo Lima Mitev	c02ffd2700	ir3: Use the new NIR lowering pass for integer multiplication Shader-db stats courtesy of Eric Anholt: total instructions in shared programs: 6480215 -> 6475457 (-0.07%) instructions in affected programs: 662105 -> 657347 (-0.72%) helped: 1209 HURT: 13 total constlen in shared programs: 1432704 -> 1427769 (-0.34%) constlen in affected programs: 100063 -> 95128 (-4.93%) helped: 512 HURT: 0 total max_sun in shared programs: 875561 -> 873387 (-0.25%) max_sun in affected programs: 46179 -> 44005 (-4.71%) helped: 1087 HURT: 0 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	340277ad71	ir3/nir: Add new NIR AlgebraicPass for lowering imul Currently, ir3 backend compiler is lowering integer multiplication from: dst = a * b to: dst = (al * bl) + (ah * bl << 16) + (al * bh << 16) by emitting this code: mull.u tmp0, a, b ; mul low, i.e. al * bl madsh.m16 tmp1, a, b, tmp0 ; mul-add shift high mix, i.e. ah * bl << 16 madsh.m16 dst, b, a, tmp1 ; i.e. al * bh << 16 which at that point has very low chances of being optimized. This patch adds a new nir_algebraic.AlgebraicPass to perform this lowering during NIR algebraic optimization passes, giving it a better chance for optimizing the resulting code. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	3addd7c8d9	nir_algebraic: Add basic optimizations for umul_low and imadsh_mix16 For umul_low (al * bl), zero is returned if the low 16-bits word of either source is zero. For imadsh_mix16 (ah * bl << 16 + c), c is returned if either 'ah' or 'bl' is zero. A couple of nir_search_helpers are added: is_upper_half_zero() returns true if the highest word of all components of an integer NIR alu src are zero. is_lower_half_zero() returns true if the lowest word of all components of an integer nir alu src are zero. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	e45de3a6c3	ir3/compiler: Handle new alu opcodes 'umul_low' and 'imadsh_mix16' They directly emit ir3_MULL_U and ir3_MADSH_M16 respectively. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	c27b3758fa	nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes 'umul_low' is the low 32-bits of unsigned integer multiply. It maps directly to ir3's MULL_U. 'imadsh_mix16' is multiply add with shift and mix, an ir3 specific instruction that maps directly to ir3's IMADSH_M16. Both are necessary for the lowering of integer multiplication on Freedreno, which will be introduced later in this series. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Iago Toral Quiroga	9b96ae69bc	v3d: don't emit point coordinates varyings if the FS doesn't read them We still need to emit them in V3D 3.x since there is no mechanism to disable them. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:29:42 +02:00
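
The imul lowering described in `340277ad71` and `3addd7c8d9` can be checked with a small C model. This is only a sketch of the arithmetic quoted in those commit messages, not the Mesa implementation: the function names below are hypothetical stand-ins that mirror the opcode names, and operands are assumed to be 32-bit unsigned values.

```c
#include <stdint.h>

/* umul_low as described in the log: al * bl (product of the low 16-bit halves). */
static uint32_t umul_low(uint32_t a, uint32_t b)
{
    return (a & 0xffffu) * (b & 0xffffu);
}

/* imadsh_mix16 as described in the log: (ah * bl << 16) + c. */
static uint32_t imadsh_mix16(uint32_t a, uint32_t b, uint32_t c)
{
    return (((a >> 16) * (b & 0xffffu)) << 16) + c;
}

/* The three-instruction sequence quoted in the commit message:
 *   mull.u    tmp0, a, b
 *   madsh.m16 tmp1, a, b, tmp0
 *   madsh.m16 dst,  b, a, tmp1
 * The ah * bh term is shifted out of the low 32 bits, so it is not needed. */
static uint32_t lowered_imul(uint32_t a, uint32_t b)
{
    uint32_t tmp0 = umul_low(a, b);
    uint32_t tmp1 = imadsh_mix16(a, b, tmp0);
    return imadsh_mix16(b, a, tmp1);
}
```

For any pair of 32-bit inputs, `lowered_imul(a, b)` should equal `a * b` truncated to 32 bits. The two simplifications mentioned in `3addd7c8d9` follow from the definitions: `umul_low` is zero when either low half is zero, and `imadsh_mix16` returns `c` when `ah` or `bl` is zero.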


111612 commits