fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-22 21:48:09 +02:00

Author	SHA1	Message	Date
Eric Anholt	112c65825f	freedreno/a6xx: Use LDC for UBO loads. It saves addressing math, but may cause multiple loads to be done and bcseled due to NIR not giving us good address alignment information currently. I don't have any workloads I know of using non-const-uploaded UBOs, so I don't have perf numbers for it. This makes us match the GLES blob's behavior, and turnip (other than being bindful). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	ab93a631b4	freedreno: Trim num_ubos to just the ones we haven't lowered to constbuf. With the upcoming LDC usage in the GL driver, we don't want to be uploading descriptors for every UBO when they aren't actually in use. Trimming NIR's num_ubos will avoid that, and cleans up num_ubo handling elsewhere right now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	d5176c453e	freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges. I found that when moving more UBOs to load_ubo_ir3, analyze_ubo_ranges would move things back in a broken way. We can just run this pass later and drop the _ir3 path. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	5387c27140	freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift. Otherwise, we might end up inserting the nir_intrinsic_load_ubo_ir3() after the non-offset src's definition, leading to nir_validate() failures. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	e0a4d1c4e5	freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa). Just copy the src through. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	6670475a44	freedreno/a6xx: Fix UBWC mipmapping height alignment. After fixing the power of two sizing, pitches worked, but 1-pixel high and unaligned height miplevels were off. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	81f21ff4ef	freedreno/a6xx: Fix UBWC mipmap sizing. The HW requires a log2 width/height of the level 0 meta_* size in the descriptors, making it pretty clear that UBWC mipmapping is all power-of-two sized. Fixes a bunch of failures in the upcoming UBWC layout unit tests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	b5db2a2574	freedreno/a6xx: Fix UBWC blockheight for RG8. Using texturator on a P3A at 1024x1024, RG8 has log2w/h of 6x7 instead of R16I/UI's 6x8. The other blockw/h I verified other than cpp=1 (R8/R8I/R8UI didn't use UBWC) and 32 (would need a bigger type). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	9da4ce9953	freedreno: Pull the tile_alignment lookup for a layout to a helper. The r8g8 case UBWC alignment will be changing in the next commit, so fdl6_get_ubwc_blockwidth needs to start paying attention to r8g8 too. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	dc7ccdb3f5	freedreno/a6xx: Add a testcase for UBWC buffer sharing. These offsets are hand-computed referencing msm_media_info.h, and match our driver's current behavior. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	e32783c644	freedreno/a6xx: Improve layout testcase logging for UBWC fails. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Eric Anholt	2e4ddb6353	freedreno/a4xx+: Increase max texture size to 16384. Noticed when poking around with texture layouts and found that my big texture layout from the blob buffer overflowed. Values come from http://vulkan.gpuinfo.org for Adreno 418, 512, 630. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>	2020-05-13 19:18:16 +00:00
Connor Abbott	b408734e5e	tu: Implement fallback linear staging blit for CopyImage Also, rewrite the format decision code so that we correctly decide when the linear fallback is needed, even if UBWC is disabled. As part of that, I also moved around some of the code to handle compressed formats to make sure that copying compressed formats with a linear staging blit works (this is now possible since we started allowing tiled compressed textures). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>	2020-05-13 13:39:04 +00:00
Connor Abbott	40e842c009	tu: Add noubwc debug flag to disable UBWC Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>	2020-05-13 13:39:04 +00:00
Connor Abbott	ed79f805fa	tu: Add a "scratch bo" allocation mechanism This is simpler than a full-blown memory reuse mechanism, but is good enough to make sure that repeatedly doing a copy that requires the linear staging buffer workaround won't use excessive memory or be slowed down due to repeated allocations. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>	2020-05-13 13:39:04 +00:00
Samuel Pitoiset	91c757b796	turnip: use the common code for generating extensions and dispatch tables Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>	2020-05-13 08:45:29 +02:00
Rob Clark	d6706fdc46	freedreno/ir3/sched: try to avoid syncs Similar to what we do in postsched. It is useful for pre-RA sched to be a bit aware of things that would cause syncs. In particular for the tex fetches, since the vecN src/dst tends to limit postsched's ability to re-order them. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	d95a6e3a0c	freedreno/ir3/sched: avoid scheduling outputs If an instruction's only use is as an output, and it increases register pressure, then try to avoid scheduling it until there are no other options. A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these immed loads to the end of the shader. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	488cf208d5	freedreno/ir3/postsched: try to avoid (sy) syncs Similar to avoidance of `(ss)` syncs, it turns out to be helpful to avoid `(sy)` syncs as well. This helps us turn a tex, (sy)alu, tex, (sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in gfxbench gl_fill2. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	25f4fb346e	freedreno/ir3/postsched: reset sfu_delay on sync Once we schedule an instruction that will require an `(ss)` sync flag, there is no need to delay any further instructions that consume an SFU result (until the next SFU instruction is scheduled). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	f351e1d137	freedreno/ir3: limit # of tex prefetch by shader size It seems for short frag shaders, too much prefetch can be detrimental. I think what we really want to do is decide after pre-RA sched, when we also know about nop's and what the actual ir3 instruction count is. But that will require re-working how prefetch lowering works. For now this is a super crude heuristic to attempt to approximate a good solution. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	d69f6fd852	freedreno/ir3: fix indirect cb0 load_ubo lowering We can no longer assume that `state->ranges[0]` is block 0. It often is, but when we encounter a "real" ubo that we lower to `load_uniform` before a block 0 `load_ubo`, it could end up as another entry in the table. This results in the second pass, after gathering ubo ranges, not finding a valid range, which in turn results in a `load_ubo` for a thing that is not actually a ubo making its way into the ir3 frontend. We then grab what we think is a ubo address out of some unrelated const register and try to dereference that, which, as you can imagine, fails in amusing ways. Fixes: `fc850080ee` ("ir3: Rewrite UBO push analysis to support bindless") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>	2020-05-12 23:51:46 +00:00
Rob Clark	c4dc877cb5	freedreno/ir3: don't allow negative const_offset Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>	2020-05-12 23:51:46 +00:00
Brian Ho	a43e974064	turnip: Execute ir3_nir_lower_gs pass again This commit fixes a GS regression introduced in !4562 where ir3's GS lowering pass was moved from common code (ir3_nir) to freedreno-specific code (ir3_shader). For GS support in turnip, we need to add the GS lowering pass back in, this time in tu_shader. As for the nir_gather_info change, the GS lowering pass has always introduced a discard_if intrinsic into the GS. Previously, we simply ran nir_shader_gather_info before GS lowering, but now since we lower the GS before we need to remove the assertion that only a FS can use the discard_if intrinsic. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>	2020-05-12 13:42:55 -07:00
Jonathan Marek	d76e722ed6	turnip: enable tiling for compressed formats Now that layout code supports this, we can enable it. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>	2020-05-12 17:25:38 +00:00
Jonathan Marek	f543d87f23	turnip: update "fetchsize" value to match fdl6_layout changes It seems this is actually a "minimum pitch" value. For example TFETCH6_2_BYTE means a minimum pitch of 128 bytes for mipmap levels. This fixes breakage with compressed formats. For example this test: dEQP-VK.pipeline.sampler.view_type.2d.format.eac_r11_snorm_block.mipmap.linear.lod.equal_min_3_max_3 Fixes: `a34b3fa198` ("freedreno/fdl: Align after dividing by block size") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>	2020-05-12 17:25:38 +00:00
Eric Anholt	f789c5975c	freedreno: Fix non-constbuf-upload UBO block indices and count. The nir_analyze_ubo_ranges pass removes all UBO block 0 loads to reverse what nir_lower_uniforms_to_ubo() had done, and we only upload UBO pointers to the HW for UBO block 1-N, so let's just fix up the shader state. Fixes an off by one in const state layout setup, and some really dodgy register addressing trying to deal with dynamic UBO indices when the UBO pointers happen to be at the start of the constbuf. There's no fixes tag, though this fixes a bug from September, because it would require the num_ubos fix in nir_lower_uniforms_to_ubo. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>	2020-05-12 17:01:55 +00:00
Eric Anholt	51d7a71bd4	freedreno: Replace OUT_RELOCW with OUT_RELOC. Final cleanup commit now that they're the same. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	064f395a89	freedreno: Tell the kernel that all BOs are for writing. Using non-write flags is pretty dubious -- it means the kernel tracking an array of read-only consumers of the BO and having exclusive consumers wait on each reader's fence. It allows multiple readers through dma-bufs to do work in parallel, but at the cost of kernel CPU time and memory management of the shared array. Other drivers have dropped this distinction since dma-buf sharing is usually producer-consumer, not producer-two-consumers, and the userspace and kernel space tracking is expensive. For us, this lets us drop the flags passed in for relocs and tracked in the ringbuffer reloc lists. The end result of the flags reduction work is drawoverhead uniforms test throughput 2.37195% +/- 0.365579% (n=15) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	b2c23b1e48	freedreno: Mark all ringbuffer BOs as to be dumped on crash. We can avoid passing these flags around in the DRM backends by just marking ring BOs up front. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	554b959df0	freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Eric Anholt	9d8d936dfc	freedreno: Start moving relocs flags into the BOs. It's silly to have all the reloc emitters passing around FD_RELOC_READ when you have to have it set on all relocs (that don't include WRITE, which implies read) for the kernel to actually track the fences on the BO. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Mauro Rossi	a92a483ff7	freedreno: android: add adreno-pm4-pack.xml.h generation to android build Fixes the following building errors: In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:40: external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found ^~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_blend.c:36: external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found ^~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_const.c:26: external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found ^~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `ee293160` "freedreno/a6xx: add OUT_PKT()" Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4973>	2020-05-09 16:19:14 +00:00
Mauro Rossi	5dc3b22dd0	freedreno/drm: android: add libfreedreno_registers static dependency The dependency is required to get the necessary generated headers. Fixes the following build error: In file included from external/mesa/src/freedreno/drm/msm_bo.c:27: In file included from external/mesa/src/freedreno/drm/msm_priv.h:30: In file included from external/mesa/src/freedreno/drm/freedreno_priv.h:51: external/mesa/src/freedreno/drm/freedreno_ringbuffer.h:35:10: fatal error: 'adreno_common.xml.h' file not found #include "adreno_common.xml.h" ^~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `6c688ae8` ("freedreno: Deduplicate ringbuffer macros with computerator/fdperf") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4973>	2020-05-09 16:19:14 +00:00
Eric Anholt	c9e8df61dc	freedreno: Initialize the bo's iova at creation time. Avoids repeated conditionals at reloc time checking if we need to go ask the kernel. No statistically significant difference on the drawoverhead case I'm looking at (n=300). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>	2020-05-08 12:35:39 -07:00
Eric Anholt	b3c4e6a597	freedreno: Rename append_bo() in case it doesn't get inlined. In a debugoptimized build, it wasn't inlined and so I wasn't noticing where a bunch of CPU usage was going in the DRM functions. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>	2020-05-08 12:35:39 -07:00
Eric Anholt	e1c74f3fac	freedreno: Clean up tests around ORing in the reloc flags. gcc was surprisingly not seeing through this to just do an AND and an OR. Improves drawoverhead's few uniforms / 1 change throughput 1.64141% +/- 0.188152% (n=60). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>	2020-05-08 12:35:39 -07:00
Eric Anholt	6c688ae81f	freedreno: Deduplicate ringbuffer macros with computerator/fdperf They're sugar around freedreno_ringbuffer.h, so put them there and reuse them. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>	2020-05-08 12:35:38 -07:00
Hyunjun Ko	094c7646a3	freedreno,tu: Don't request fragcoord components not being read. v1. Replace the existing bool type with a new bitfield and edit the register files to take a mask instead of duplicating code to do the masking. v2. Use fragcoord_compmask != 0 instead of fragcoord_compmask > 0 since it represents a bitfield. Tested with dEQP-VK.glsl.builtin_var.simple.fragcoord_xyz/w dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz/w Closes: #2680 Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4723>	2020-05-08 17:45:03 +00:00
Connor Abbott	6d513eb0db	tu: Support pipelines without a fragment shader Apparently this is allowed, and the CTS started doing this more often recently which resulted in frequent hangs running the entire CTS. I copied the code to create an empty FS from radv. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4928>	2020-05-07 16:05:53 +00:00
Kristian H. Kristensen	a34b3fa198	freedreno/fdl: Align after dividing by block size For compressed formats, we need to align the number of blocks, not the logical number of pixels in the texture. Only compressed formats have block width/height > 1, so we can just unconditionally multiply the alignment by the block width/height. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4868>	2020-05-06 17:11:34 -07:00
Eric Anholt	9a6bbf4c80	freedreno/ir3: Disable sin/cos range reduction for mediump. robclark noted that the blob wasn't doing range reduction in the mediump case, and I confirmed it on dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.mediump_float_fragment vs dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.highp_float_fragment. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4893>	2020-05-05 17:23:34 +00:00
Joshua Ashton	785803a2e5	turnip: Remove RANGE_SIZE usage These were removed from the latest Vulkan headers https://github.com/KhronosGroup/Vulkan-Docs/issues/1230 Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4878>	2020-05-05 00:28:00 +00:00
Eric Anholt	5c81f51c3c	freedreno/ir3: Define the bindful uniform/nonuniform desc modes for cat6 a6xx. These come from the disasm tests, and fix our disasm of blob's uniform/nonuniform cat6 operands. We also now include human-readable names for all the modes we know about (though bindless gets distinguished by its .baseN, like Connor's original disasm). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>	2020-05-04 11:15:50 -07:00
Eric Anholt	97b21110b8	freedreno/ir3: Sync some new changes from envytools. With this I also brought in a few new control flow instruction disasm tests that I'd made back when I wrote the disasm test, but which were too far from correct to include until now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>	2020-05-04 11:14:46 -07:00
Eric Anholt	1e5b0c92c5	freedreno/ir3: Add some more tests of cat6 disasm. I put these together from traces I had while trying to do LDC for GL. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>	2020-05-04 11:14:46 -07:00
Eric Anholt	29f58cfbd0	freedreno/ir3: Set up outputs for multi-slot varyings. Necessary to avoid compiler assertion failures in: dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	88dcfaf0ee	freedreno/ir3: Stop initializing regid of so->outputs during setup. It's unused and overwritten by ir3_compile_shader_nir(). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	8c1c218909	freedreno/ir3: Improve shader key normalization. We can remove a bunch of conditional code at key comparison time by computing a bitmask of used key bits at ir3_shader creation time. This also gives us a nice place to put additional key simplification to reduce how many variants we create (like skipping rastflat if we don't read colors in the FS, or skipping vclamp_color if we don't write colors). It does mean walking the whole key to AND it, but the key is just 28 bytes so far so that seems pretty fine. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	6f1e3235f2	freedreno: Emit debug messages when doing draw-time recompiles of shaders. Right now that's "always" unless you have shaderdb set. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
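
Commit a34b3fa198 above ("freedreno/fdl: Align after dividing by block size") says that for compressed formats the alignment applies to the number of blocks rather than to the pixel count, and that multiplying the alignment by the block width gives the same result. The following is a minimal sketch of that arithmetic only. The helper names (align_up, aligned_blocks, aligned_blocks_via_pixels) and the parameters are hypothetical and are not taken from the fdl code.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a (hypothetical helper, a > 0). */
static inline uint32_t
align_up(uint32_t v, uint32_t a)
{
   assert(a > 0);
   return (v + a - 1) / a * a;
}

/* Divide by the block width first, then align the block count.
 * block_w is 1 for uncompressed formats, so they are unaffected. */
static inline uint32_t
aligned_blocks(uint32_t width_px, uint32_t block_w, uint32_t block_align)
{
   uint32_t nblocks = (width_px + block_w - 1) / block_w;
   return align_up(nblocks, block_align);
}

/* Same result, expressed as the commit message describes it: scale the
 * alignment by the block width, align in pixels, then convert to blocks. */
static inline uint32_t
aligned_blocks_via_pixels(uint32_t width_px, uint32_t block_w,
                          uint32_t block_align)
{
   return align_up(width_px, block_align * block_w) / block_w;
}
```

As an illustration, a 100-pixel-wide level with 4-pixel-wide blocks and a 16-block alignment has 25 blocks, which align up to 32. Aligning the pixel width to 16 first would give 112 pixels, or 28 blocks, which is the kind of mismatch the commit describes.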

1099 commits