fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 20:08:06 +02:00

Author	SHA1	Message	Date
Eric Anholt	940501a446	v3d: Fix copy-propagation of input unpacks. I had a single function for "does this do float input unpacking" with two major flaws: It was missing the most common thing to try to copy propagate an f32 input unpack to (the VFPACK to an FP16 render target) along with several other ALU ops, and also would try to propagate an f32 unpack into a VFMUL which only does f16 unpacks. instructions in affected programs: 659232 -> 655895 (-0.51%) uniforms in affected programs: 132613 -> 135336 (2.05%) and a couple of programs increase their thread counts. The uniforms hit appears to be a pattern in generated code of doing (-a >= a) comparisons, which when a is abs(b) can result in the abs instruction being copy propagated once but not fully DCEed.	2019-02-05 15:46:04 -08:00
Eric Anholt	e5c6938590	v3d: Fix input packing of .l for rounding/fdx/fdy. Avoids a regression in dEQP-GLES3.functional.shaders.derivate.fwidth.texture.* once we start copy-propagating more input packs.	2019-02-05 15:45:23 -08:00
Eric Anholt	1a4170952d	v3d: Fix pack/unpack of VFPACK operand unpacks. We want to be able to copy propagate our texture unpacks into the vfpack.	2019-02-05 15:45:23 -08:00
Eric Anholt	5e9ee6e841	v3d: Fold comparisons for IF conditions into the flags for the IF. total instructions in shared programs: 6193810 -> 6192844 (-0.02%) instructions in affected programs: 800373 -> 799407 (-0.12%)	2019-01-02 14:12:29 -08:00
Eric Anholt	ebde5afb93	v3d: Move "does this instruction have flags" from sched to generic helpers. I wanted to reuse it for DCE of flags updates.	2018-12-30 08:03:51 -08:00
Eric Anholt	248a7fb392	v3d: Do uniform pretty-printing in the QPU dump. If you're trying to trace what's going on in a QPU dump, this will definitely help you find your way.	2018-12-14 17:48:01 -08:00
Eric Anholt	ff80e58b38	v3d: Add missing flagging of SYNCB as a TSY op. Fixes: `f2e41daac5` ("broadcom/vc5: Update QPU instruction pack/unpack for v4.2.")	2018-12-14 17:48:01 -08:00
Dylan Baker	a999798daa	meson: Add tests to suites. Meson test has a concept of suites, which allows tests to be grouped together. This allows only a subset of tests to be run (say, only the tests for nir). A test can be added to more than one suite, but for the most part I've only added a test to a single suite, though I've added a compiler group that includes nir, glsl, and glcpp tests. To use this you'll need to invoke meson test directly, instead of ninja test (which always runs all targets). It can be invoked as: `meson test -C builddir --suite $suitename` (meson test has additional options that are pretty useful). Tested-By: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-20 09:09:22 -08:00
Eric Anholt	3471ce9985	v3d: Add support for the TMUWT instruction. This instruction is used to ensure that TMU stores have been processed before moving on. In particular, you need any TMU ops to be done by the time the shader ends.	2018-07-31 16:05:04 -07:00
Eric Anholt	27f1bfe471	vc4: Fix meson build when enabled without v3d. Reported-by: Rob Clark <robdclark@gmail.com> Fixes: `e92959c4e0` ("v3d: Pass the whole clif_dump structure to v3d_print_group().")	2018-07-29 19:13:29 -07:00
Eric Anholt	e7ae900341	v3d: Switch to using the new SFU instructions on V3D 4.x. These instructions let us write directly to the phys regfile, instead of just R4. That lets us avoid moving out of R4 to avoid conflicting with other SFU results, and to avoid conflicting with thread switches. There is still an extra instruction of latency, which is not represented in the scheduler at the moment. If you use the result before it's ready, the QPU will just stall, unlike the magic R4 mode where you'd read the previous value. That means that the following shader-db results aren't quite representative (since we now cause some stalls instead of emitting nops), but they're impressive enough that I'm happy with the change. total instructions in shared programs: 95669 -> 91275 (-4.59%) instructions in affected programs: 82590 -> 78196 (-5.32%)	2018-07-23 10:21:43 -07:00
Eric Anholt	58c1d3860f	v3d: Add QPU pack/unpack for the new SFU instructions. These instructions allow writing the result to any register, instead of a special writeback to r4.	2018-07-23 10:21:43 -07:00
Eric Anholt	cdfa99657d	v3d: Fix the name of the "flpop" operation. Noticed while trying to sort a new op into the appropriate place to match the documentation.	2018-07-23 10:21:43 -07:00
Eric Anholt	91e24e5718	v3d: Print the instruction we're testing in the QPU disasm/pack round-trip. If we fail initial disassembly, it's good to know what instruction it was that failed.	2018-07-23 10:21:42 -07:00
Eric Anholt	c3a504f470	broadcom/vc5: Add a QPU helper for instructions using the TLB. This will be used for detecting last thread segment in register spilling.	2018-03-19 16:42:59 -07:00
Eric Anholt	09c4dd1971	broadcom/vc5: Introduce v3d_qpu_reads_vpm()/v3d_qpu_writes_vpm(). These helpers will be used in register spilling to determine where to add a last thrsw if needed, and might help refactor QPU scheduling.	2018-03-19 16:42:59 -07:00
Eric Anholt	407f21ef1b	broadcom/vc5: The ldvpm signal is also a case of using the VPM. The QPU scheduling code calling this function already separately checked this signal.	2018-03-19 16:42:59 -07:00
Eric Anholt	4760040c09	broadcom/vc5: Extract v3d_qpu_writes_tmu() helper. This will be reused in register spilling.	2018-03-19 16:42:59 -07:00
Eric Anholt	f2e41daac5	broadcom/vc5: Update QPU instruction pack/unpack for v4.2. After the 4.1 spec, 4.2 retroactively renamed patchid to barrierid because it's used for other barriers in compute.	2018-01-27 19:03:55 +11:00
Eric Anholt	028f6b327c	broadcom/vc5: Add the new TMU write addresses for V3D 4.x (and r5rep). The V3D 3.x series of TMU writes with meaning depending on the texture type is replaced with writes to specific registers for each texture argument semantic.	2018-01-12 21:56:48 -08:00
Eric Anholt	f50d39ab49	broadcom/vc5: Add a test for .ifb in ADD ops. I had a .ifb being decoded weird in sampid, so this is to check that .ifb is fine.	2018-01-12 21:54:57 -08:00
Eric Anholt	267f13dbee	broadcom/vc5: Add the new tessellation opcodes in V3D 4.1.	2018-01-12 21:54:50 -08:00
Eric Anholt	22a02f3e34	broadcom/vc5: Use the new LDVPM/STVPM opcodes on V3D 4.1. Now, instead of a magic write register for VPM stores we have an instruction to do them (which means no packing of other ALU ops into it), with the ability to reorder the VPM stores due to the offset being baked into the instruction. VPM loads also gain the ability to be reordered by packing the row into the A argument. They also no longer write to the r3 accumulator, and instead must be stored to a physical register.	2018-01-12 21:54:33 -08:00
Eric Anholt	55f8a01aca	broadcom/vc5: Drop dead VC5_QPU_* defines from qpu_instr.c. I had all the packing code in this file at one point, but these defines now live in qpu_pack.c.	2018-01-12 21:54:27 -08:00
Eric Anholt	2bd378647b	broadcom/vc5: Add support for QPU pack/unpack/disasm of small immediates.	2018-01-12 21:54:18 -08:00
Eric Anholt	c81cc767e4	broadcom/vc5: Drop signal bit #defines. Signals are more complicated than that, and tables ended up being better.	2018-01-12 21:53:53 -08:00
Eric Anholt	dfee62eed3	broadcom/vc5: Add support for V3Dv4 signal bits. The WRTMUC replaces the implicit uniform loads in the first two texture instructions. LDVPM disappears in favor of an ALU op. LDVARY, LDTMU, LDTLB, and LDUNIF*RF now write to arbitrary registers, which required passing the devinfo through to a few more functions.	2018-01-12 21:53:45 -08:00
Eric Anholt	81ec2ba229	broadcom/vc5: Fix pack/unpack of vfmul input unpack flags.	2018-01-12 21:53:38 -08:00
Dylan Baker	4ccb981673	meson: Use consistent style for tests. Don't use intermediate variables, use consistent whitespace. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Eric Anholt	49e2586bfc	broadcom/vc5: Fix a typo in memcmp for sig unpack checking. This shockingly ended up working out, because only the first byte of sig is used and (sizeof(sig) != 0) == 1. Fixes a compiler warning. Link: https://bugs.freedesktop.org/show_bug.cgi?id=104183	2017-12-14 14:36:24 -08:00
Eric Anholt	dfff9ce45e	broadcom/vc5: Fix scheduling for a non-SFU R4 write after a dead R4 write. The v3d_qpu_writes_r*() were only checking for fixed-function accumulator writes, not normal ALU writes to those regs. Fixes fs-discard-exit-2 on simulation (but not HW).	2017-11-07 12:57:49 -08:00
Eric Anholt	48615d1ead	meson: Fix vc5 deps on the XML-generated headers. I typoed and was depending on v3d_xml.h (the gzipped xml), not on the v3d_packet_v33_pack.h that the compiler and QPU packing actually use.	2017-10-20 17:16:00 -07:00
Eric Anholt	4f3e380fa0	meson: Add support for the vc5 driver. v2: Default vc5 to off, since it requires the simulator currently. Add missing dep on the XML generation from libbroadcom_vc5. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)	2017-10-17 13:41:59 -07:00
Eric Anholt	05c7d9715b	broadcom: Add V3D 3.3 QPU instruction pack, unpack, and disasm. Unlike VC4, I've defined an unpacked instruction format with pack/unpack functions to convert to 64-bit encoded instructions. This will let us incrementally put together our instructions and validate them in a more natural way than the QPU_GET_FIELD/QPU_SET_FIELD used to. The pack/unpack unfortunately are written by hand. While I could define genxml for parts of it, there are many special cases (like operand order of commutative binops choosing which binop is being performed!) and it probably wouldn't come out much cleaner. The disasm unit test ensures that we have the same assembly format as Broadcom's internal tools, other than whitespace changes. v2: Fix automake variable redefinition complaints, add test to .gitignore	2017-10-10 11:42:04 -07:00

34 commits