fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 13:18:09 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	348ac0f4a2	asahi: Make agx_varyings a union More accurate. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>	2022-10-22 14:58:51 -04:00
Alyssa Rosenzweig	721c4f2186	asahi: Remove "padding" field Trivial. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>	2022-10-22 14:58:48 -04:00
Alyssa Rosenzweig	06cb242a54	asahi: Identify more shader-related fields The big discovery is the "number of uniform registers" field. I learned about this one accidentally when my preamble shaders weren't working right, because we had inadvertently hardcoded "at most 32 registers" :-) In the course of identifying that field, I found that the pipeline address is used as a tagged pointer, with some unknown field in the bottom bits and alignment demanded. The XML is updated to account for this. I later found that there's also a "number of general purpose registers used by the preamble shader" field. I missed this one first, because the encoding is slightly different from the usual "number of general purpose registers in the main shader" field. The specification is slightly coarser. I don't know why the hardware needs that information anyway -- occupancy of the preamble shader should be irrelevant -- but it's not a big deal. Finally I found that the "more than 4 textures?" bit is... not that. I do not yet know what it is, but it is... not that. These all use the new groups() modifier for GenXML Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>	2022-10-22 14:58:37 -04:00
Alyssa Rosenzweig	24bfa5af88	asahi: Identify "Uniform high" USC word The start field in the Uniform USC word is only 8-bits, whereas 9-bits are required to address the entire uniform register file. This other word gets used for the high half, with start indexed from u128l in the natural way. Apparently spending the evening stuffing too many uniforms into Metal is paying off. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>	2022-10-22 14:54:07 -04:00
Alyssa Rosenzweig	0e1f9ca9f6	asahi: Route shader-db stats to debug callback This way multithreading works correctly in shader-db including CPU time accounting. Code from v3d via panfrost. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>	2022-10-22 14:54:07 -04:00
Alyssa Rosenzweig	e126338394	asahi: Precompile for shader-db This gets shader-db's runner working, in conjunction with a shader-db ./run modified to set ASAHI_MESA_DEBUG=precompile. This flag triggers precompiles of all shaders with a default key so we can exercise the compiler. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>	2022-10-22 14:54:07 -04:00
Alyssa Rosenzweig	46ae8e659d	asahi: Remove AGX_FAKE_DEVICE environment variable The proper way to fake a device on Linux will be drm-shim. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>	2022-10-22 14:54:07 -04:00
Alyssa Rosenzweig	13e90bebe1	agx: Remove command line compiler It has not been used in quite some time but adds maintenance burden. Its function is replaced by drm-shim in conjunction with shader-db's ./run, which goes through the actual driver. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>	2022-10-22 14:54:07 -04:00
Alyssa Rosenzweig	bb6c43027e	agx: Reserve live-in regs at the start of block ...Rather than reserving the union of the registers live-out of the predecessors. This avoids reserving registers that are killed along a control flow edge (where the predecessor has another successor that does use the register). glmark2 subset of shaderdb: total instructions in shared programs: 6442 -> 6440 (-0.03%) instructions in affected programs: 42 -> 40 (-4.76%) helped: 1 HURT: 0 total bytes in shared programs: 42186 -> 42174 (-0.03%) bytes in affected programs: 270 -> 258 (-4.44%) helped: 1 HURT: 0 total halfregs in shared programs: 1769 -> 1757 (-0.68%) halfregs in affected programs: 75 -> 63 (-16.00%) helped: 3 HURT: 0 helped stats (abs) min: 4.0 max: 4.0 x̄: 4.00 x̃: 4 helped stats (rel) min: 16.00% max: 16.00% x̄: 16.00% x̃: 16.00% Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	de6e11b848	agx: Pass in max regs as a parameter to RA This will allow us to restrict max regs later. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	68f89d4cc5	agx: Introduce ra_ctx data structure We have more parameters to pass, this will get unwieldy otherwise. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	bcb2cf9688	agx: Write to r0l with a "nesting" instruction This avoids modeling the r0l register explicitly in the IR, which would complicate RA for little benefit at this stage. Do the simplest thing that could possibly work in SSA. glmark2 subset. total instructions in shared programs: 6442 -> 6442 (0.00%) instructions in affected programs: 701 -> 701 (0.00%) helped: 4 HURT: 5 helped stats (abs) min: 1.0 max: 3.0 x̄: 2.00 x̃: 2 helped stats (rel) min: 1.46% max: 7.69% x̄: 4.03% x̃: 3.48% HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.60 x̃: 1 HURT stats (rel) min: 0.81% max: 7.41% x̄: 2.67% x̃: 1.14% 95% mean confidence interval for instructions value: -1.58 1.58 95% mean confidence interval for instructions %-change: -3.70% 3.08% Inconclusive result (value mean confidence interval includes 0). total bytes in shared programs: 42196 -> 42186 (-0.02%) bytes in affected programs: 7768 -> 7758 (-0.13%) helped: 8 HURT: 5 helped stats (abs) min: 2.0 max: 18.0 x̄: 7.25 x̃: 4 helped stats (rel) min: 0.13% max: 7.26% x̄: 2.02% x̃: 0.97% HURT stats (abs) min: 6.0 max: 18.0 x̄: 9.60 x̃: 6 HURT stats (rel) min: 0.82% max: 6.32% x̄: 2.37% x̃: 1.02% 95% mean confidence interval for bytes value: -7.02 5.48 95% mean confidence interval for bytes %-change: -2.30% 1.63% Inconclusive result (value mean confidence interval includes 0). total halfregs in shared programs: 1926 -> 1769 (-8.15%) halfregs in affected programs: 1395 -> 1238 (-11.25%) helped: 71 HURT: 0 helped stats (abs) min: 1.0 max: 10.0 x̄: 2.21 x̃: 2 helped stats (rel) min: 1.92% max: 52.63% x̄: 15.33% x̃: 11.76% 95% mean confidence interval for halfregs value: -2.69 -1.73 95% mean confidence interval for halfregs %-change: -17.98% -12.68% Halfregs are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	c9a96d4615	agx: Preload vertex/instance ID only at start This means we don't reserve the registers, which improves RA considerably. Using a special preload pseudo-op instead of a regular move allows us to constrain semantics and guarantee coalescing. shader-db on glmark2 subset: total instructions in shared programs: 6448 -> 6442 (-0.09%) instructions in affected programs: 230 -> 224 (-2.61%) helped: 4 HURT: 0 total bytes in shared programs: 42232 -> 42196 (-0.09%) bytes in affected programs: 1530 -> 1494 (-2.35%) helped: 4 HURT: 0 total halfregs in shared programs: 2291 -> 1926 (-15.93%) halfregs in affected programs: 2185 -> 1820 (-16.70%) helped: 75 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	f665229d77	agx: Print agx_dim appropriately Easier to read, and gets us closer to proper disasm in Mesa. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	6c95572ef0	agx: Print instructions as "dest = src" This makes the dataflow easier to read, especially with splits and collects (which take variable numbers of sources/destinations). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	72a1e1f33f	agx: Emit trap at pack-time, not during isel This makes the shaderdb stats make more sense. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	1dcaade3e2	agx: Rename "combine" to "collect" For consistency with ir3 and bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	82e8e709cb	agx: Dynamically size split instruction This is more flexible. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	7c9fba34bc	agx: Switch to dynamic allocation of srcs/dests So we can handle parallel copies later. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	544c60a132	agx: Improve printing of immediate sources For floats, decode the float. Regardless, the size specifier is redundant. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	c2bc8c1384	agx: Don't prefix pseudo-ops It's not really buying us anything and it clutters the IR. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	40f0ac2082	agx: Emit smaller combines for nir_op_vec2/3 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	6a183a9ffd	agx: Add iterators for phi/non-phi instructions We know that phi nodes are always at the start (this is asserted in agx_validate and a fundamental invariant of SSA form). That means we can cheaply iterate all n phi nodes forward (or n non-phi nodes backwards) in O(n) time. We already open code this idiom in a few places, use common iterators instead so we don't need to justify in random places. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>	2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig	6689d67603	asahi: Remove no-direct-packing It's weird. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>	2022-10-13 18:06:52 -04:00
Alyssa Rosenzweig	ea58edaafb	asahi: Use a header more like Intel's GenXML We're trying to converge on a common schema. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>	2022-10-13 18:06:52 -04:00
Alyssa Rosenzweig	ab2d5deec2	asahi,panfrost: Remove exact attribute Not used, although in the future it might be... Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>	2022-10-13 18:06:52 -04:00
Alyssa Rosenzweig	a64e38b0aa	panfrost,asahi: Remove unused function Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>	2022-10-13 18:06:51 -04:00
Alyssa Rosenzweig	0f24c8ef5f	panfrost,asahi: Remove unused prepare macro Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>	2022-10-13 18:06:51 -04:00
Alyssa Rosenzweig	0302519f1c	asahi/genxml: Defeature uint/float Unused, relic from panfrost and not in upstream genxml. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>	2022-10-13 18:06:51 -04:00
Alyssa Rosenzweig	8eefda4ea9	asahi: Eliminate "Pixel Format" type from GenXML This is leaky and hurts compatibility with upstream GenXML. Just use the actual hardware fields. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>	2022-10-13 18:06:51 -04:00
Alyssa Rosenzweig	deb3810f1e	agx: Remove load_kernel_input path Unused and now won't be used. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18658>	2022-10-05 16:09:21 +00:00
Alyssa Rosenzweig	c17fcbaa2f	agx: Account for mask when writing registers To use fewer registers. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>	2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig	5cd2371318	agx: Pass mask into ld/st_tile instructions Properly handle render target formats with <4 components. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>	2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig	640fd089a2	agx: Ensure that the optimizer sees legitimate SSA Expecting it to keep unused definitions around is wishful. Add an "anchoring" unit_test instruction to consume the results so they don't have to be precoloured registers. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>	2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig	52467c2d1e	agx: Test fsat+f2f16 together Something I hit when mucking with this pass. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>	2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig	3e86522cf2	agx: Validate immediates In particular the new sizing rules. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>	2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig	14f2be1f33	agx: Use 16-bit immediates This is slightly more accurate in the IR, and means we instruction select the current 16-bit size floating point instructions when all non-immediate operands are 16-bit. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>	2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig	e302e5d527	agx: Emit fewer combines for intrinsics A bunch of the emitted combines were unnecessary, or unnecessarily large. Fix the accounting now that combines are variable size. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>	2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig	e887a11b06	agx: Fix bfi_mask packing Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>	2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig	a1faab0b90	agx: Convert and clamp array indices in NIR ...Rather than at backend IR translation time. This is considerably simpler because we can use the txs lowering instead of special casing array sizes. Unfortunately it generates worse code, but that gap should close once nir_opt_preamble is wired in. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18652>	2022-09-19 16:14:24 +00:00
Alyssa Rosenzweig	bcd75a13e0	asahi: Identify shared memory layouts Somehow maps to the tile size. Not sure about the details yet. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig	b8b3c9fa2a	asahi: Identify pixel stride Number of bytes in a pixel in the tilebuffer, does not depend on the tile size. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig	933a9e350e	asahi: Overhaul USC control packing Break up the monolithic SET_SHADER_EXTENDED packet into the separate underlying commands (some only 2-byte sized and aligned), and add a builder for USC control streams like we did for PPP updates to make that change manageable. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig	35d5558fa5	asahi/genxml: Overflow up to words when packing So we can pack things that aren't 4-byte sized. Note this doesn't help with alignment. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig	22d3756207	asahi: Consolidate magic numbers for USC controls Aka "pipeline" states. It's another command/control stream. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig	09cc736c42	asahi: Identify shared memory fields For compute kernels, this encodes how much workgroup-local memory is used ("shared memory" or "threadgroup memory" or "local memory"). This memory is partitioned by the hardware. For fragment shaders, this... encodes exactly the same thing. There is no traditional tilebuffer in AGX, instead local memory is interpreted as an imageblock, where each workgroup is a tile. This is a nifty design. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig	2fbe1ae09c	asahi: Identify spill buffer histogram Histogram of sizes of the spill buffer, with logarithmic bucket sizes (relative to the amount spilled from the perspective of a single thread). Pretty funny. Also mark a few unknowns that are nonzero when spilling is used. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig	adfd213241	asahi: Decode IOGPU compute header Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:25 -04:00
Alyssa Rosenzweig	a9c26df462	asahi: Identify IOGPU compute header Much simpler than the graphics one. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:25 -04:00
Alyssa Rosenzweig	58d138334d	asahi: Shuffle IOGPU structs We need the header to be common between gfx and compute, but everything else seems to be different. Shuffle so we can decode compute without any terrible hacks. I don't know the exact layout and don't care: the layout of the fields here is all software defined in macOS, even though the values are defined by hardware (or firmware in a few cases). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>	2022-09-18 10:34:25 -04:00


451 commits