fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 09:10:11 +01:00

Author	SHA1	Message	Date
Caio Oliveira	d00329e821	intel/brw: Replace some fs_reg constructors with functions Create three helper functions for ATTR, UNIFORM and VGRF creation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>	2024-07-03 02:53:18 +00:00
Kenneth Graunke	1e69ec3b8d	intel/brw: Add a lower_csel pass and allow building it for all types We can do CSEL on F, HF, W, and D on Gfx11+. Gfx9 can only do F. We can lower unsupported types to CMP+CSEL, allowing us to use CSEL in the IR and not worry about the limitations. Rework: (Sagar) - Update validation pass for CSEL Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29316>	2024-07-01 19:06:31 +00:00
Ian Romanick	77ef241577	intel/brw/xe2+: Scale size_written by reg_unit for DPAS Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>	2024-06-25 14:17:47 -07:00
Kenneth Graunke	5cb15a6c67	intel/brw: Make bld.ADD(x, 0) emit no instructions and return x directly There are a lot of places where we add 0 to an offset. Avoiding generating this can save us algebraic + copy_propagation later. Cuts compile time in Borderlands 3 by -0.590631% +/- 0.170108% (n=25). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29849>	2024-06-24 19:12:21 -07:00
Kenneth Graunke	068865ce81	intel/brw: Make an alu2 builder helper Instead of replicating the whole thing in macros, just make an alu2() function and use that in the wrappers. It ought to get inlined anyway. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29849>	2024-06-24 19:12:19 -07:00
Kenneth Graunke	344d4ee9f0	intel/brw: Make VEC() perform a single write to its destination. This gathers a number of sources into a contiguous vector register, typically using LOAD_PAYLOAD. However, it uses MOV for a single source. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	f04bb49465	intel/brw: Delete SAD2 and SADA2 opcodes These were removed with Icelake. While they technically still exist on Skylake, which this compiler supports, we have never used these opcodes in the 14 years we could have done so. So just scrap them. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29665>	2024-06-10 16:47:50 -07:00
Francisco Jerez	6261f4d361	intel/brw/xe2+: Fix 64-bit subgroup scan intrinsics not to rely on SEL instructions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28283>	2024-05-15 17:16:51 +00:00
Sagar Ghuge	e32828f5fc	intel/compiler: Fix destination type for CMP/CMPN For CMP/CMPN, use src0 type if destination is null; otherwise get the src0 type register with destination register size. This fixes dEQP-VK.glsl.builtin_var.frontfacing.* test cases on Xe2+. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28679>	2024-05-06 21:46:18 +00:00
Kenneth Graunke	3c867bf2c7	intel/brw: Add a new VEC() helper. This gathers a number of sources into a contiguous vector register. Eventually, the plan is that it will use a MOV for a single source, or LOAD_PAYLOAD for multiple sources. For now, it emits a series of MOVs to allow us to rewrite a bunch of existing code to use the new helper, then change them all over at once later. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>	2024-04-30 17:16:42 -07:00
Kenneth Graunke	674e89953f	intel/brw: Use new builder helpers that allocate a VGRF destination With the previous commit, we now have new builder helpers that will allocate a temporary destination for us. So we can eliminate a lot of the temporary naming and declarations, and build up expressions. In a number of cases here, the code was confusingly mixing D-type addresses with UD-immediates, or expecting a UD destination. But the underlying values should always be positive anyway. To accommodate the type inference restriction that the base types must match, we switch these over to be purely UD calculations. It's cleaner to do so anyway. Compared to the old code, this may in some cases allocate additional temporary registers for subexpressions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>	2024-04-29 07:51:45 +00:00
Kenneth Graunke	4c2c49f7bc	intel/brw: Add builder helpers that allocate temporary destinations In many cases, we calculate an expression by generating a series of instructions. We'd either overwrite the same register repeatedly, or call vgrf(BRW_TYPE_X) repeatedly to allocate temporaries for each intermediate step. In many cases, we overwrote the same register simply because allocating and naming temporaries for each step was annoying. This commit adds new builder helpers that will allocate a temporary destination for you, using simple type inference: unary operations use the source type, and binary operations require a matching base type and return the largest of the two types. The helpers return the destination register, allowing us to write in an expression-tree style, chaining together builder operations to produce whole values. Sort of like nir_builder. We still optionally will write out the fs_inst pointer in case the caller wants to do things like set predicates or saturation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>	2024-04-29 07:51:45 +00:00
Kenneth Graunke	319ba85e10	intel/brw: Add builder helpers for math functions Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>	2024-04-29 07:51:45 +00:00
Kenneth Graunke	545bb8fb6f	intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_* Both of these helpers do the same thing. We now have brw_type_size_bits and brw_type_size_bytes and can use whichever makes sense in that place. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	c22f44ff07	intel/brw: Replace brw_reg_type_from_bit_size by brw_type_with_size Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	f523bfcf90	intel/brw: Reindent after shortening BRW_REGISTER_TYPE_* to BRW_TYPE_* Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	873fcdff38	intel/brw: Stop using long BRW_REGISTER_TYPE enum names s/BRW_REGISTER_TYPE/BRW_TYPE/g Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	e637c63239	intel/brw: Make an fs_builder::SYNC helper We always want a null destination, so this saves some typing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Ian Romanick	6d85f7129a	intel/brw/xe2+: DPAS must be SIMD16 now Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>	2024-03-29 21:12:32 +00:00
Ian Romanick	671745b616	intel/fs: Don't allow 0 stride on MOV destination Outside SIMD1 instructions, a destination stride of zero doesn't make any sense. When such strides exist, they would be fixed by the FS generator. Currently the only place that intentionally generates such a stride is setup_barrier_message_payload_gfx125, and this commit changes that. The existence of a zero stride that won't really be a zero stride causes a variety of problems with other optimization passes. Those passes don't know that 0 actually means 1, and they make incorrect assumptions about sizes written, etc. The assertion helped catch many bugs in some other work in progress that tries to store convergent values in SIMD8 registers regardless of the dispatch width. That code would accidentally generate destination strides of zero. v2: Check stride differently depending on register file. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28256>	2024-03-19 18:17:59 +00:00
Caio Oliveira	97759ef139	intel/brw: Remove typedefs from fs_builder Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27866>	2024-02-29 21:14:13 -08:00
Caio Oliveira	865ef36609	intel/brw: Remove brw_shader.h Find a better home for its existing content. Some functions are now just static functions at the usage sites. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27861>	2024-02-29 19:28:06 +00:00
Caio Oliveira	5c93a0e125	intel/brw: Remove Gfx8- remaining opcodes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:39 +00:00
Caio Oliveira	b6098676fa	intel/brw: Remove Gfx8- code from builder Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:38 +00:00
Caio Oliveira	071e9f49f1	intel/brw: Remove F16TO32 and F32TO16 opcodes These are done with MOVs and appropriate types in Gfx9+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:38 +00:00
Ian Romanick	e666872c75	intel/compiler: Initial bits for DPAS instruction v2: Add brw_ir_performance.cpp and brw_fs_generator.cpp changes. Fix overlapping register allocation (via has_source_and_destination_hazard). Fix incorrect destination register file encoding. v3: Prevent lower_regioning from trying to "fix" DPAS sources. v4: Add instruction latency information for scheduling and perf estimates. v5: Remove all mention of DPASW. Suggested by Curro and Caio. Update the comment in fs_inst::has_source_and_destination_hazard. Suggested by Caio. v6: Add some comments near the src2 calculation in fs_inst::size_read. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:24:16 -08:00
Caio Oliveira	38a42e5aa1	intel/compiler: Add ctor to fs_builder that just takes the shader Uses the dispatch_width from the shader (fs_visitor). This was not possible before because the dispatch_width was not part of backend_shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	cf730adc58	intel/compiler: Make fs_builder include fs_visitor and not the other way This will allow fs_builder to have a reference to an fs_visitor (a "fs_shader" really), instead of a reference to a backend_shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	f5032c4d52	intel/compiler: Make fs_visitor not depend on fs_builder At this point this is more a header dependency due to inline functions, so shuffle them around. The end goal is to allow fs_builder to have a reference to a fs_visitor (really a fs_shader). Note the header is still included; a later patch will move the includes to the call-sites. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	21cf9323f0	intel/compiler: Add a few more helpers to fs_builder Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>	2023-11-30 20:58:05 +00:00
Francisco Jerez	150b3e87c8	intel/fs/xe2+: Round up fs_builder::vgrf() size calculation to HW register unit. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Caio Oliveira	26f6ea5c30	intel/compiler: Remove unused functions and declarations Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23539>	2023-06-09 20:09:51 +00:00
Lionel Landwerlin	3d0cc3f63b	intel/fs: keep track of new resource_intel information Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Kenneth Graunke	e7ea2aa46c	intel/fs: Make bld.F16TO32 actually emit F16TO32 not F32TO16 Ahem, "add builder helpers that work on Gfx7"...now might actually work. Too much copy and paste... Fixes: `966995d911` ("intel/fs: Add builder helpers for F32TO16/F16TO32 that work on Gfx7.x") Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21974>	2023-03-17 09:01:18 +00:00
Kenneth Graunke	966995d911	intel/fs: Add builder helpers for F32TO16/F16TO32 that work on Gfx7.x These take care of emitting the F32TO16/F16TO32 instructions on Gfx7.x but otherwise just emit a type converting MOV on Gfx8+. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>	2023-03-09 23:26:17 +00:00
Lionel Landwerlin	e5dfff0946	intel/fs: reduce liveness of variables in lowering passes When lowering a single instruction with a destination VGRF to 2 or more, the VGRF is now considered partially written by each generated instruction and that increases its liveness especially in loops. Thus potentially increasing the number of spills/fills due to register allocation. Putting an UNDEF instruction in front of the lowered instructions allows the IR to limit the liveness of the VGRF, reducing register pressure. This has a pretty dramatic effect on spills/fills for RT shaders. Here are the stats on Q2RTX shaders on DG2 (wiping out any spills/fills due to register allocation) : Instructions in all programs: 26150 -> 24955 (-4.6%) SENDs in all programs: 1148 -> 1148 (+0.0%) Loops in all programs: 4 -> 4 (+0.0%) Cycles in all programs: 392179 -> 332787 (-15.1%) Spills in all programs: 132 -> 116 (-12.1%) Fills in all programs: 262 -> 154 (-41.2%) Shader-db results on TGL : total instructions in shared programs: 21158140 -> 21158377 (<.01%) instructions in affected programs: 76629 -> 76866 (0.31%) helped: 18 HURT: 20 helped stats (abs) min: 1 max: 60 x̄: 18.89 x̃: 12 helped stats (rel) min: 0.21% max: 3.61% x̄: 1.02% x̃: 0.77% HURT stats (abs) min: 1 max: 79 x̄: 28.85 x̃: 18 HURT stats (rel) min: 0.04% max: 2.81% x̄: 1.13% x̃: 0.79% 95% mean confidence interval for instructions value: -4.82 17.30 95% mean confidence interval for instructions %-change: -0.34% 0.57% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 5753 -> 5753 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 798856834 -> 798870688 (<.01%) cycles in affected programs: 6208395 -> 6222249 (0.22%) helped: 22 HURT: 17 helped stats (abs) min: 2 max: 8794 x̄: 1438.18 x̃: 782 helped stats (rel) min: 0.05% max: 2.28% x̄: 0.63% x̃: 0.44% HURT stats (abs) min: 2 max: 19178 x̄: 2676.12 x̃: 1358 HURT stats (rel) min: 0.04% max: 23.49% x̄: 2.25% x̃: 0.71% 95% mean confidence interval for cycles value: -952.19 1662.65 95% mean confidence interval for cycles %-change: -0.64% 1.90% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 4078 -> 4066 (-0.29%) spills in affected programs: 40 -> 28 (-30.00%) helped: 2 HURT: 0 total fills in shared programs: 2856 -> 2832 (-0.84%) fills in affected programs: 127 -> 103 (-18.90%) helped: 2 HURT: 0 total sends in shared programs: 998554 -> 998554 (0.00%) sends in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 0 GAINED: 0 Total CPU time (seconds): 2346.06 -> 2304.80 (-1.76%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18657>	2022-10-27 21:05:00 +00:00
Lionel Landwerlin	14b99df7d9	intel/fs: require UNDEFs register offsets to be aligned to REG_SIZE Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18657>	2022-10-27 21:05:00 +00:00
Sagar Ghuge	75c73fcdc4	intel/compiler: Fix instruction size written calculation We are always aligning to REG_SIZE but when we have payload sources less than REG_SIZE, size written is miscalculated. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>	2021-11-22 21:27:30 -08:00
Ian Romanick	0f809dbf40	intel/compiler: Basic support for DP4A instruction v2: Very significant rebase on changes to previous commits. Specifically, brw_fs_nir.cpp changes were pretty much rewritten from scratch after changing the NIR opcode names and types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Sagar Ghuge	705285b9f4	intel/compiler: Add support for ternary add instruction on XeHP v2: - Re-arrange opcode in correct order (Matt Turner) - Move ADD3 case closer to LRP (Jason) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Michel Dänzer	2928c21eb7	Convert most remaining free-form fall-through comments to FALLTHROUGH One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is explicitly disabled. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Anuj Phogat	1d296484b4	intel: Rename Genx keyword to Gfxx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g" Exclude changes in src/intel/perf/oa-*.xml: find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	b75f095bc7	intel: Rename genx keyword to gfxx in source files Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/gen\([[:digit:]]\+\)/gfx\1/g" Exclude pack.h and xml changes in this patch: grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+_pack\.h\)/gen\1/g" grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+\.xml\)/gen\1/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	abe9a71a09	intel: Rename gen field in gen_device_info struct to ver Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Ian Romanick	684ec33c79	intel/compiler: Make the CMPN builder work like the CMP builder Since the CMPN builder was never used, there was no reason to make its interface usable. :) Fixes: `2f2c00c727` ("i965: Lower min/max after optimization on Gen4/5.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027>	2021-02-17 19:52:24 +00:00
Jason Ekstrand	8c2543d037	intel/fs: Implement umin/umax shuffle Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>	2021-01-22 18:38:38 +00:00
Jason Ekstrand	a6500236e3	intel/fs: Refactor our shuffle emit code This adds an emit_scan_step helper which gives us a place to do something a bit more interesting than emitting a single op. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>	2021-01-22 18:38:38 +00:00
Jason Ekstrand	68092df8d8	intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+ Intel hardware supports 8-bit arithmetic but it's tricky and annoying: - Byte operations don't actually execute with a byte type. The execution type for byte operations is actually word. (I don't know if this has implications for the HW implementation. Probably?) - Destinations are required to be strided out to at least the execution type size. This means that B-type operations always have a stride of at least 2. This wreaks havoc on the back-end in multiple ways. - Thanks to the strided destination, we don't actually save register space by storing things in bytes. We could, in theory, interleave two byte values into a single 2B-strided register but that's both a pain for RA and would lead to piles of false dependencies pre-Gen12 and on Gen12+, we'd need some significant improvements to the SWSB pass. - Also thanks to the strided destination, all byte writes are treated as partial writes by the back-end and we don't know how to copy-prop them. - On Gen11, they added a new hardware restriction that byte types aren't allowed in the 2nd and 3rd sources of instructions. This means that we have to emit B->W conversions all over to resolve things. If we emit said conversions in NIR, instead, there's a chance NIR can get rid of some of them for us. We can get rid of a lot of this pain by just asking NIR to get rid of 8-bit arithmetic for us. It may lead to a few more conversions in some cases but having back-end copy-prop actually work is probably a bigger bonus. There is still a bit we have to handle in the back-end. In particular, basic MOVs and conversions because 8-bit load/store ops still require 8-bit types. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>	2020-11-09 18:58:51 +00:00
Francisco Jerez	6310a05f68	intel/fs: Rename half() helpers to quarter(), allow index up to 3. Makes more sense considering SIMD32. Relaxing the assertion in brw_ir_fs.h will be required in order to avoid assertion failures on SNB with SIMD32 fragment shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-04-28 23:00:29 -07:00
Ian Romanick	f7d620f47d	intel/compiler: Fixup operands in fs_builder::emit() that takes array The versions that take a specific number of operands will do various fixups depending on the platform and the opcode. However, the version that takes an array of sources did not. This makes all versions operate similarly. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4582>	2020-04-17 08:21:47 -07:00
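
Several of the commits above (4c2c49f7bc, 068865ce81, 5cb15a6c67) describe builder helpers that allocate their own destination, infer its type, and skip emitting ADD(x, 0). The following is a minimal toy sketch of that idea, not Mesa's actual fs_builder: every name here (ToyBuilder, Reg, Type, Base, imm_ud, build_address) is a hypothetical stand-in invented for illustration, and the optional fs_inst out-pointer mentioned in the commit message is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins only; these are NOT Mesa's real types or enums.
enum class Base { SInt, UInt, Float };

struct Type {
    Base base;
    unsigned bits;
};

struct Reg {
    bool is_imm;        // true for an immediate value
    unsigned nr;        // virtual register number (unused for immediates)
    unsigned long imm;  // immediate value (unused for registers)
    Type type;
};

struct Inst {
    std::string op;
    Reg dst;
    std::vector<Reg> srcs;
};

struct ToyBuilder {
    std::vector<Inst> insts;
    unsigned next_nr = 0;

    // Allocate a fresh virtual register of the given type.
    Reg vgrf(Type t) { return Reg{false, next_nr++, 0, t}; }

    // Build a 32-bit unsigned immediate.
    static Reg imm_ud(unsigned long v) {
        return Reg{true, 0, v, Type{Base::UInt, 32}};
    }

    // Unary helper: the temporary destination takes the source type.
    Reg alu1(const std::string &op, const Reg &src) {
        Reg dst = vgrf(src.type);
        insts.push_back(Inst{op, dst, {src}});
        return dst;
    }

    // Binary helper: base types must match; the destination takes the
    // larger of the two source sizes.
    Reg alu2(const std::string &op, const Reg &a, const Reg &b) {
        assert(a.type.base == b.type.base);
        Reg dst = vgrf(Type{a.type.base, std::max(a.type.bits, b.type.bits)});
        insts.push_back(Inst{op, dst, {a, b}});
        return dst;
    }

    Reg NOT(const Reg &src) { return alu1("not", src); }
    Reg MUL(const Reg &a, const Reg &b) { return alu2("mul", a, b); }

    // Adding an immediate 0 emits nothing and returns the source as-is.
    Reg ADD(const Reg &a, const Reg &b) {
        if (b.is_imm && b.imm == 0)
            return a;
        return alu2("add", a, b);
    }
};

// Expression-tree style: base + index * 16 + 0, with no named temporaries
// for the intermediate steps. The stride of 16 is an arbitrary example.
Reg build_address(ToyBuilder &bld, const Reg &base, const Reg &index) {
    Reg scaled = bld.MUL(index, ToyBuilder::imm_ud(16));
    return bld.ADD(bld.ADD(base, scaled), ToyBuilder::imm_ud(0));
}
```

In this sketch, build_address() records only a mul and an add; the trailing ADD with 0 returns its first operand unchanged, so no later pass has to clean it up.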

69 commits