fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 02:28:07 +02:00

Author	SHA1	Message	Date
Eric Anholt	90269ba353	broadcom/vc5: Use THRSW to enable multi-threaded shaders. This is a major performance boost on all of V3D, but is required on V3D 4.x where shaders are always either 2- or 4-threaded.	2018-01-12 21:55:30 -08:00
Eric Anholt	86a12b4d5a	broadcom/vc5: Properly schedule the thread-end THRSW. This fills in the delay slots of thread end as much as we can (other than being cautious about potential TLBZ writes). In the process, I moved the thread end THRSW instruction creation to the scheduler. Once we start emitting THRSWs in the shader, we need to schedule the thread-end one differently from other THRSWs, so having it in there makes that easy.	2018-01-12 21:55:23 -08:00
Eric Anholt	a075bb6726	broadcom/vc5: Implement GFXH-1684 workaround. Apparently the VPM writes need to be flushed out before we end the shader.	2018-01-12 21:55:15 -08:00
Eric Anholt	edbd817c30	broadcom/vc5: Use a physical-reg-only register class for LDVPM. This is needed for LDVPM on V3D 4.x, but will also be needed for keeping values out of the accumulators across THRSW.	2018-01-12 21:54:42 -08:00
Eric Anholt	22a02f3e34	broadcom/vc5: Use the new LDVPM/STVPM opcodes on V3D 4.1. Now, instead of a magic write register for VPM stores we have an instruction to do them (which means no packing of other ALU ops into it), with the ability to reorder the VPM stores due to the offset being baked into the instruction. VPM loads also gain the ability to be reordered by packing the row into the A argument. They also no longer write to the r3 accumulator, and instead must be stored to a physical register.	2018-01-12 21:54:33 -08:00
Eric Anholt	dfee62eed3	broadcom/vc5: Add support for V3Dv4 signal bits. The WRTMUC replaces the implicit uniform loads in the first two texture instructions. LDVPM disappears in favor of an ALU op. LDVARY, LDTMU, LDTLB, and LDUNIF*RF now write to arbitrary registers, which required passing the devinfo through to a few more functions.	2018-01-12 21:53:45 -08:00
Dylan Baker	2083a14179	meson: Use dependencies for nir. This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is built. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and its headers is quite high. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Eric Anholt	e60e3a56a2	broadcom/vc5: Fix discard_if during control flow. I want to do the SETMSF.IFA to discard only if execute == 0 and cond, so our dest of the PUSHZ needs to be nonzero if execute or !cond are nonzero. Fixes dEQP-GLES3.functional.shaders.discard.dynamic_loop_dynamic.	2018-01-03 14:31:36 -08:00
Eric Anholt	635131a238	broadcom/vc5: Don't emit component 3/4 F16 TLB writes for float/vec2. Fixes a simulator assertion failure on dEQP-GLES3.functional.fragment_out.array.fixed.r8_highp_float.	2018-01-03 14:31:28 -08:00
Eric Anholt	8e5a0ed953	broadcom/vc5: Emit flat shade flags for varying components > 24. This means that with no flatshading we'll emit the single-byte ZERO_ALL_FLAT_SHADE_FLAGS, and otherwise emit a set of FLAT_SHADE_FLAGS to get all the bits we need set. There's a _SET enum in the packet we could use to possibly set entire ranges of the bitfield without using another packet, but this at least fixes the conformance failure.	2018-01-03 14:25:23 -08:00
Eric Anholt	2056e4a777	broadcom/vc5: Emit proper flatshading code for glShadeModel(GL_FLAT). In updating the simulator, behavior changed slightly so that our old code wasn't getting glxgears's flatshading interpolated right. Emit flat shading code just like we would for a normal flat-shaded varying, by passing a flag in the shader key for glShadeModel(GL_FLAT) state and customizing the color inputs based on that.	2018-01-03 14:25:23 -08:00
Eric Anholt	ba965084b6	broadcom/vc5: Move texture return channel setup into the compiler. The compiler decides how many LDTMUs we're going to emit, and that must match the P1 flags. This brings the return channel counting to a single place (so all that's passed into the compiler is "how many return channels you may request from this texture's format"), and was a necessary step for shadow samplers once we stop using OVRTMUOUT=0.	2018-01-03 14:25:23 -08:00
Eric Anholt	1171f1749d	broadcom/vc5: Enable NIR txd lowering on all txd instructions. Fixes almost all of piglit's arb_shader_texture_lod grad tests, except for the base -texgrad/texgradcube ones which fail on what appear to be precision problems. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-14 14:36:17 -08:00
Eric Anholt	52f024b052	broadcom/vc5: Fix shader input/outputs for gallium's new NIR linking.	2017-12-14 14:36:17 -08:00
Eric Anholt	6a78416dab	broadcom/vc5: Fix BASE_LEVEL handling with txl. The HW doesn't add the base level anywhere (the min/max lod clamping is what does base level), so we need to add it manually in this case. Fixes piglit tex-miplevel-selection *Lod 2D.	2017-11-22 10:56:31 -08:00
Eric Anholt	87391e23cf	broadcom/vc5: Ensure that there is always a TLB write. This should fix some GPU hangs in our (currently always single-threaded) fragment shaders, and definitely fixes assertion failures in simulation.	2017-11-17 16:09:55 -08:00
Andreas Boll	4f29ed38f3	broadcom/vc5: Remove unused v3d_compiler.c. Unused since original import of VC5. Fixes: `ade416d023` ("broadcom: Add VC5 NIR compiler.") Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 18:30:47 +00:00
Eric Anholt	50906e4583	broadcom/vc5: Do 16-bit unpacking of integer texture returns properly. We were doing f16 unpacks, which trashed "1" values. Fixes many piglit texwrap GL_EXT_texture_integer cases.	2017-11-07 12:58:03 -08:00
Eric Anholt	dfff9ce45e	broadcom/vc5: Fix scheduling for a non-SFU R4 write after a dead R4 write. The v3d_qpu_writes_r*() were only checking for fixed-function accumulator writes, not normal ALU writes to those regs. Fixes fs-discard-exit-2 on simulation (but not HW).	2017-11-07 12:57:49 -08:00
Eric Anholt	4d2619a6b3	broadcom/vc5: Stop lowering negates to subs. In the case of fneg(0.0), we were getting back 0.0 instead of -0.0. We were also needing an immediate 0 value for ineg, when there's an opcode to do the job properly. Fixes fs-floatBitsToInt-neg.shader_test.	2017-10-30 13:31:28 -07:00
Eric Anholt	e717e3e7cd	broadcom/vc5: Add lowering for txf_ms to a txf on a 2x2-scaled texture. The HW has no native sampler support for multisample textures, but since we only need to support txf_ms and the layout is UIF, we just need to scale up the texcoords and then add in the sample. This drops the old TEXTURE_MSAA_ADDR special uniform, since we're treating MSAA textures as textures, rather than basically texbos like VC4 had to.	2017-10-30 13:31:27 -07:00
Eric Anholt	125f2a751e	broadcom/vc5: Lower unpack_*_4x8 to normal math. We only have 2x16 unpacking in our ALUs. To enable this, we also need lower_fdiv for its new instructions, which had been handled at a higher level previously.	2017-10-30 13:31:16 -07:00
Eric Anholt	eecdbaa985	broadcom/vc5: Add PIPE_TEX_WRAP_CLAMP support for linear-filtered textures. I already had the texture's wrapping set up to use different behavior for nearest or linear, so we just needed to saturate the coordinates in linear mode to get the "proper" blend between the edge and border values.	2017-10-30 13:31:16 -07:00
Eric Anholt	48615d1ead	meson: Fix vc5 deps on the XML-generated headers. I typoed and was depending on v3d_xml.h (the gzipped xml), not on the v3d_packet_v33_pack.h that the compiler and QPU packing actually use.	2017-10-20 17:16:00 -07:00
Eric Anholt	07bfdb478b	broadcom/vc5: Propagate vc4 aliasing fix to vc5. See `e5fea0d621`	2017-10-20 17:09:47 -07:00
Eric Anholt	9b5fa214f4	broadcom/vc5: Use SETMSF to handle discards. A bit of spec text suggested that (like vc4) condition codes should be used for discards, and the simulator was fine with it, but the 7268 disagrees and you have to use SETMSF instead or the color comes through. Fixes glsl-fs-discard-01 and many of the interpolation-with-clipping tests.	2017-10-20 15:59:41 -07:00
Eric Anholt	a48a38937c	broadcom/vc5: Set the snorm/unorm packing functions to be lowered. We don't have native instructions for them, so set up the lowering. Once we support the bfi instructions that get generated, they should start actually working.	2017-10-20 15:59:41 -07:00
Jason Ekstrand	59fb59ad54	nir: Get rid of nir_shader::stage. It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-20 12:49:17 -07:00
Eric Anholt	4f3e380fa0	meson: Add support for the vc5 driver. v2: Default vc5 to off, since it requires the simulator currently. Add missing dep on the XML generation from libbroadcom_vc5. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)	2017-10-17 13:41:59 -07:00
Eric Anholt	20b91cd568	broadcom/vc5: Don't pair VPMSETUP with other peripheral access. The specs don't say you can't, but pairing it with an SFU write on the 7268 breaks all our simple shader tests using gl_MVP * gl_Vertex.	2017-10-12 10:41:09 -07:00
Eric Anholt	4b7de2a360	broadcom/vc5: Add support for f32 render targets. The TLB write code is getting ugly and needs a refactoring (that will hopefully handle TLBU uniform coalescing as well).	2017-10-10 11:42:06 -07:00
Eric Anholt	dc25a83a7a	broadcom/vc5: Start hooking up multiple render targets support. We now emit as many TLB color writes as there are color buffers.	2017-10-10 11:42:05 -07:00
Eric Anholt	361c5f28bd	broadcom/vc5: Fix handling of interp qualifiers on builtin color inputs. The interpolation qualifier, if specified, is supposed to take precedence over glShadeModel().	2017-10-10 11:42:05 -07:00
Eric Anholt	732a3a72cb	broadcom/compiler: Set up passthrough Z when doing FS discards. In order to keep early-Z from writing early in a discard shader, you need to set the "modifies Z" bit in the shader state (which the new prog_data.discards will indicate). Then, in the shader we do a TLB write to make Z passthrough happen (the QPU result is ignored, so we use a NULL source).	2017-10-10 11:42:05 -07:00
Eric Anholt	4c4fbab345	broadcom/compiler: Don't forget the discard state on TLB Z writes. We don't want to write Z for discarded fragments.	2017-10-10 11:42:05 -07:00
Eric Anholt	84939552d0	broadcom/compiler: Use defines instead of magic values in TLB write setup.	2017-10-10 11:42:05 -07:00
Eric Anholt	ade416d023	broadcom: Add VC5 NIR compiler. This is a pretty straightforward fork of VC4's NIR compiler to VC5. The condition codes, registers, and I/O have all changed, making the backend hard to share, though their heritage is still recognizable. v2: Move to src/broadcom/compiler to match intel's layout, rename more "vc5" to "v3d", rename QIR to VIR ("V3D IR") to avoid symbol conflicts with vc4, use new v3d_debug header, add compiler init/free functions, do texture swizzling in NIR to allow optimization.	2017-10-10 11:42:04 -07:00

37 commits
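
Note on 4d2619a6b3 ("Stop lowering negates to subs"): the message says that lowering fneg to a subtraction returned 0.0 instead of -0.0 for fneg(0.0). The sketch below only illustrates that IEEE 754 behavior in Python; it is not code from the driver, and the helper names `neg_by_sub`, `neg_native`, and `sign_bit` are hypothetical names chosen for this example.

```python
import math


def neg_by_sub(x: float) -> float:
    """Negate by subtracting from zero (hypothetical stand-in for a sub-based lowering)."""
    return 0.0 - x


def neg_native(x: float) -> float:
    """Negate directly, which flips the sign bit even for zero."""
    return -x


def sign_bit(x: float) -> int:
    """Return 1 if the sign bit of x is set, else 0 (distinguishes -0.0 from 0.0)."""
    return 1 if math.copysign(1.0, x) < 0 else 0


if __name__ == "__main__":
    # For each input, print the sign bit produced by both negation strategies.
    for v in (0.0, 1.5):
        print(v, sign_bit(neg_by_sub(v)), sign_bit(neg_native(v)))
```

For nonzero inputs the two approaches agree; they differ only for a zero input, where the subtraction yields positive zero under the default rounding mode.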