Commit graph

70 commits

Author SHA1 Message Date
Caio Oliveira
3670c24740 intel/brw: Replace uses of fs_reg with brw_reg
And remove the fs_reg alias.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>
2024-07-03 02:53:19 +00:00
Caio Oliveira
d00329e821 intel/brw: Replace some fs_reg constructors with functions
Create three helper functions for ATTR, UNIFORM and VGRF creation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>
2024-07-03 02:53:18 +00:00
Kenneth Graunke
1e69ec3b8d intel/brw: Add a lower_csel pass and allow building it for all types
We can do CSEL on F, HF, *W, and *D on Gfx11+.  Gfx9 can only do F.

We can lower unsupported types to CMP+CSEL, allowing us to use CSEL
in the IR and not worry about the limitations.

Rework: (Sagar)
- Update validation pass for CSEL

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29316>
2024-07-01 19:06:31 +00:00
Ian Romanick
77ef241577 intel/brw/xe2+: Scale size_written by reg_unit for DPAS
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>
2024-06-25 14:17:47 -07:00
Kenneth Graunke
5cb15a6c67 intel/brw: Make bld.ADD(x, 0) emit no instructions and return x directly
There are a lot of places where we add 0 to an offset.  Avoiding
generating this can save us algebraic + copy_propagation later.

Cuts compile time in Borderlands 3 by -0.590631% +/- 0.170108% (n=25).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29849>
2024-06-24 19:12:21 -07:00
Kenneth Graunke
068865ce81 intel/brw: Make an alu2 builder helper
Instead of replicating the whole thing in macros, just make an alu2()
function and use that in the wrappers.  It ought to get inlined anyway.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29849>
2024-06-24 19:12:19 -07:00
Kenneth Graunke
344d4ee9f0 intel/brw: Make VEC() perform a single write to its destination.
This gathers a number of sources into a contiguous vector register,
typically using LOAD_PAYLOAD.  However, it uses MOV for a single source.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>
2024-06-18 09:02:25 +00:00
Kenneth Graunke
f04bb49465 intel/brw: Delete SAD2 and SADA2 opcodes
These were removed with Icelake.  While they technically still exist on
Skylake, which this compiler supports, we have never used these opcodes
in the 14 years we could have done so.  So just scrap them.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29665>
2024-06-10 16:47:50 -07:00
Francisco Jerez
6261f4d361 intel/brw/xe2+: Fix 64-bit subgroup scan intrinsics not to rely on SEL instructions.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28283>
2024-05-15 17:16:51 +00:00
Sagar Ghuge
e32828f5fc intel/compiler: Fix destination type for CMP/CMPN
For CMP/CMPN, use src0 type if destination is null otherwise get the
src0 type register with destination register size.

This fixes dEQP-VK.glsl.builtin_var.frontfacing.* tests cases on Xe2+.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28679>
2024-05-06 21:46:18 +00:00
Kenneth Graunke
3c867bf2c7 intel/brw: Add a new VEC() helper.
This gathers a number of sources into a contiguous vector register.
Eventually, the plan is that it will use a MOV for a single source,
or LOAD_PAYLOAD for multiple sources.  For now, it emits a series of
MOVs to allow us to rewrite a bunch of existing code to use the new
helper, then change them all over at once later.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971>
2024-04-30 17:16:42 -07:00
Kenneth Graunke
674e89953f intel/brw: Use new builder helpers that allocate a VGRF destination
With the previous commit, we now have new builder helpers that will
allocate a temporary destination for us.  So we can eliminate a lot
of the temporary naming and declarations, and build up expressions.

In a number of cases here, the code was confusingly mixing D-type
addresses with UD-immediates, or expecting a UD destination.  But the
underlying values should always be positive anyway.  To accomodate the
type inference restriction that the base types much match, we switch
these over to be purely UD calculations.  It's cleaner to do so anyway.

Compared to the old code, this may in some cases allocate additional
temporary registers for subexpressions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>
2024-04-29 07:51:45 +00:00
Kenneth Graunke
4c2c49f7bc intel/brw: Add builder helpers that allocate temporary destinations
In many cases, we calculate an expression by generating a series of
instructions.  We'd either overwrite the same register repeatedly,
or call vgrf(BRW_TYPE_X) repeatedly to allocate temporaries for each
intermediate step.  In many cases, we overwrote the same register simply
because allocating and naming temporaries for each step was annoying.

This commit adds new builder helpers that will allocate a temporary
destination for you, using simple type interference: unary operations
use the source type, and binary operations require a matching base type
and return the largest of the two types.

The helpers return the destination register, allowing us to write in an
expression-tree style, chaining together builder operations to produce
whole values.  Sort of like nir_builder.  We still optionally will write
out the fs_inst pointer in case the caller wants to do things like set
predicates or saturation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>
2024-04-29 07:51:45 +00:00
Kenneth Graunke
319ba85e10 intel/brw: Add builder helpers for math functions
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28957>
2024-04-29 07:51:45 +00:00
Kenneth Graunke
545bb8fb6f intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_*
Both of these helpers do the same thing.  We now have brw_type_size_bits
and brw_type_size_bytes and can use whichever makes sense in that place.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
c22f44ff07 intel/brw: Replace brw_reg_type_from_bit_size by brw_type_with_size
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
f523bfcf90 intel/brw: Reindent after shortening BRW_REGISTER_TYPE_* to BRW_TYPE_*
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
873fcdff38 intel/brw: Stop using long BRW_REGISTER_TYPE enum names
s/BRW_REGISTER_TYPE/BRW_TYPE/g

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
e637c63239 intel/brw: Make an fs_builder::SYNC helper
We always want a null destination, so this saves some typing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>
2024-04-16 02:14:49 +00:00
Ian Romanick
6d85f7129a intel/brw/xe2+: DPAS must be SIMD16 now
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
2024-03-29 21:12:32 +00:00
Ian Romanick
671745b616 intel/fs: Don't allow 0 stride on MOV destination
Outside SIMD1 instructions, a destination stride of zero doesn't make
any sense. When such strides exist, they would be fixed by the FS
generator. Currently the only place that intentionally generates such a
stride is setup_barrier_message_payload_gfx125, and this commit changes
that.

The existence of a zero stride that won't really be a zero stride causes
a variety of problems with other optimization passes. Those passes don't
know that 0 actually means 1, and they make incorrect assumptions about
sizes written, etc.

The assertion helped catch many bugs in some other work in progress that
tries to store convergent values in SIMD8 registers regardless of the
dispatch width. That code would accidentally generate destination
strides of zero.

v2: Check stride differently depending on register file. Suggested by
Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28256>
2024-03-19 18:17:59 +00:00
Caio Oliveira
97759ef139 intel/brw: Remove typedefs from fs_builder
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27866>
2024-02-29 21:14:13 -08:00
Caio Oliveira
865ef36609 intel/brw: Remove brw_shader.h
Find a better home for its existing content.  Some functions are
now just static functions at the usage sites.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27861>
2024-02-29 19:28:06 +00:00
Caio Oliveira
5c93a0e125 intel/brw: Remove Gfx8- remaining opcodes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:39 +00:00
Caio Oliveira
b6098676fa intel/brw: Remove Gfx8- code from builder
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:38 +00:00
Caio Oliveira
071e9f49f1 intel/brw: Remove F16TO32 and F32TO16 opcodes
These are done with MOVs and appropriate types in Gfx9+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:38 +00:00
Ian Romanick
e666872c75 intel/compiler: Initial bits for DPAS instruction
v2: Add brw_ir_performance.cpp and brw_fs_generator.cpp changes. Fix
overlapping register allocation (via has_source_and_destination_hazard). Fix
incorrect destination register file encoding.

v3: Prevent lower_regioning from trying to "fix" DPAS sources.

v4: Add instruction latency information for scheduling and perf
estimates.

v5: Remove all mention of DPASW. Suggested by Curro and Caio. Update
the comment in fs_inst::has_source_and_destination_hazard. Suggested
by Caio.

v6: Add some comments near the src2 calculation in
fs_inst::size_read. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:24:16 -08:00
Caio Oliveira
38a42e5aa1 intel/compiler: Add ctor to fs_builder that just takes the shader
Uses the dispatch_width from the shader (fs_visitor).  This was not
possible before because the dispatch_width was not part of
backend_shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:14 +00:00
Caio Oliveira
cf730adc58 intel/compiler: Make fs_builder include fs_visitor and not the other way
This will allow fs_builder have a reference to an fs_visitor (a
"fs_shader" really), instead of a reference to a backend_shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:14 +00:00
Caio Oliveira
f5032c4d52 intel/compiler: Make fs_visitor not depend on fs_builder
At this point this is more a header dependency due to inline functions,
so shuffle them around.  The end goal is to allow fs_builder have a
reference to a fs_visitor (really a fs_shader).

Note the header is still included, a later patch will move the includes
to the call-sites.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>
2023-12-12 19:36:14 +00:00
Caio Oliveira
21cf9323f0 intel/compiler: Add a few more helpers to fs_builder
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Francisco Jerez
150b3e87c8 intel/fs/xe2+: Round up fs_builder::vgrf() size calculation to HW register unit.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>
2023-09-20 17:19:36 -07:00
Caio Oliveira
26f6ea5c30 intel/compiler: Remove unused functions and declarations
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23539>
2023-06-09 20:09:51 +00:00
Lionel Landwerlin
3d0cc3f63b intel/fs: keep track of new resource_intel information
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
2023-05-30 06:36:37 +00:00
Kenneth Graunke
e7ea2aa46c intel/fs: Make bld.F16TO32 actually emit F16TO32 not F32TO16
Ahem, "add builder helpers that work on Gfx7"...now might actually work.
Too much copy and paste...

Fixes: 966995d911 ("intel/fs: Add builder helpers for F32TO16/F16TO32 that work on Gfx7.x")
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21974>
2023-03-17 09:01:18 +00:00
Kenneth Graunke
966995d911 intel/fs: Add builder helpers for F32TO16/F16TO32 that work on Gfx7.x
These take care of emitting the F32TO16/F16TO32 instructions on Gfx7.x
but otherwise just emit a type converting MOV on Gfx8+.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>
2023-03-09 23:26:17 +00:00
Lionel Landwerlin
e5dfff0946 intel/fs: reduce liveness of variables in lowering passes
When lowering a single instruction with a destination VGRF to 2 or
more, the VGRF is now considered partially written by each generated
instruction and that increases its liveness especially in loops. Thus
potentially increasing the number of spills/fills due to register
allocation.

Putting an UNDEF instruction in front of the lowered instructions
allows the IR to limit the liveness of the VGRF, reducing register
pressure.

This has a pretty dramatic effect on spills/fills for RT shaders. Here
the stats on Q2RTX shaders on DG2 (wipping out any spills/fills due to
register allocation) :

Instructions in all programs: 26150 -> 24955 (-4.6%)
SENDs in all programs: 1148 -> 1148 (+0.0%)
Loops in all programs: 4 -> 4 (+0.0%)
Cycles in all programs: 392179 -> 332787 (-15.1%)
Spills in all programs: 132 -> 116 (-12.1%)
Fills in all programs: 262 -> 154 (-41.2%)

Shader-db results on TGL :

total instructions in shared programs: 21158140 -> 21158377 (<.01%)
instructions in affected programs: 76629 -> 76866 (0.31%)
helped: 18
HURT: 20
helped stats (abs) min: 1 max: 60 x̄: 18.89 x̃: 12
helped stats (rel) min: 0.21% max: 3.61% x̄: 1.02% x̃: 0.77%
HURT stats (abs)   min: 1 max: 79 x̄: 28.85 x̃: 18
HURT stats (rel)   min: 0.04% max: 2.81% x̄: 1.13% x̃: 0.79%
95% mean confidence interval for instructions value: -4.82 17.30
95% mean confidence interval for instructions %-change: -0.34% 0.57%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 5753 -> 5753 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 798856834 -> 798870688 (<.01%)
cycles in affected programs: 6208395 -> 6222249 (0.22%)
helped: 22
HURT: 17
helped stats (abs) min: 2 max: 8794 x̄: 1438.18 x̃: 782
helped stats (rel) min: 0.05% max: 2.28% x̄: 0.63% x̃: 0.44%
HURT stats (abs)   min: 2 max: 19178 x̄: 2676.12 x̃: 1358
HURT stats (rel)   min: 0.04% max: 23.49% x̄: 2.25% x̃: 0.71%
95% mean confidence interval for cycles value: -952.19 1662.65
95% mean confidence interval for cycles %-change: -0.64% 1.90%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 4078 -> 4066 (-0.29%)
spills in affected programs: 40 -> 28 (-30.00%)
helped: 2
HURT: 0

total fills in shared programs: 2856 -> 2832 (-0.84%)
fills in affected programs: 127 -> 103 (-18.90%)
helped: 2
HURT: 0

total sends in shared programs: 998554 -> 998554 (0.00%)
sends in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST:   0
GAINED: 0

Total CPU time (seconds): 2346.06 -> 2304.80 (-1.76%)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18657>
2022-10-27 21:05:00 +00:00
Lionel Landwerlin
14b99df7d9 intel/fs: require UNDEFs register offsets to be aligned to REG_SIZE
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18657>
2022-10-27 21:05:00 +00:00
Sagar Ghuge
75c73fcdc4 intel/compiler: Fix instruction size written calculation
We are always aligning to REG_SIZE but when we have payload sources less
than REG_SIZE, size written is miscalculated.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
2021-11-22 21:27:30 -08:00
Ian Romanick
0f809dbf40 intel/compiler: Basic support for DP4A instruction
v2: Very significant rebase on changes to previous commits.
Specifically, brw_fs_nir.cpp changes were pretty much rewritten from
scratch after changing the NIR opcode names and types.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Sagar Ghuge
705285b9f4 intel/compiler: Add support for ternary add instruction on XeHP
v2:
- Re-arragne opcode in correct order (Matt Turner)
- Move ADD3 case closer to LRP (Jason)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>
2021-07-16 15:59:56 +00:00
Michel Dänzer
2928c21eb7 Convert most remaining free-form fall-through comments to FALLTHROUGH
One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
2021-04-15 16:01:22 +00:00
Anuj Phogat
1d296484b4 intel: Rename Genx keyword to Gfxx
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g"

Exclude changes in src/intel/perf/oa-*.xml:
find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Anuj Phogat
b75f095bc7 intel: Rename genx keyword to gfxx in source files
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/gen\([[:digit:]]\+\)/gfx\1/g"

Exclude pack.h and xml changes in this patch:
grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+_pack\.h\)/gen\1/g"
grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+\.xml\)/gen\1/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Anuj Phogat
abe9a71a09 intel: Rename gen field in gen_device_info struct to ver
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Ian Romanick
684ec33c79 intel/compiler: Make the CMPN builder work like the CMP builder
Since the CMPN builder was never used, there was no reason to make its
interface usable. :)

Fixes: 2f2c00c727 ("i965: Lower min/max after optimization on Gen4/5.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027>
2021-02-17 19:52:24 +00:00
Jason Ekstrand
8c2543d037 intel/fs: Implement umin/umax shuffle
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:38 +00:00
Jason Ekstrand
a6500236e3 intel/fs: Refactor our shuffle emit code
This adds an emit_scan_step helper which gives us a place to do
something a bit more interesting than emitting a single op.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:38 +00:00
Jason Ekstrand
68092df8d8 intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+
Intel hardware supports 8-bit arithmetic but it's tricky and annoying:

  - Byte operations don't actually execute with a byte type.  The
    execution type for byte operations is actually word.  (I don't know
    if this has implications for the HW implementation.  Probably?)

  - Destinations are required to be strided out to at least the
    execution type size.  This means that B-type operations always have
    a stride of at least 2.  This means wreaks havoc on the back-end in
    multiple ways.

  - Thanks to the strided destination, we don't actually save register
    space by storing things in bytes.  We could, in theory, interleave
    two byte values into a single 2B-strided register but that's both a
    pain for RA and would lead to piles of false dependencies pre-Gen12
    and on Gen12+, we'd need some significant improvements to the SWSB
    pass.

  - Also thanks to the strided destination, all byte writes are treated
    as partial writes by the back-end and we don't know how to copy-prop
    them.

  - On Gen11, they added a new hardware restriction that byte types
    aren't allowed in the 2nd and 3rd sources of instructions.  This
    means that we have to emit B->W conversions all over to resolve
    things.  If we emit said conversions in NIR, instead, there's a
    chance NIR can get rid of some of them for us.

We can get rid of a lot of this pain by just asking NIR to get rid of
8-bit arithmetic for us.  It may lead to a few more conversions in some
cases but having back-end copy-prop actually work is probably a bigger
bonus.  There is still a bit we have to handle in the back-end.  In
particular, basic MOVs and conversions because 8-bit load/store ops
still require 8-bit types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Francisco Jerez
6310a05f68 intel/fs: Rename half() helpers to quarter(), allow index up to 3.
Makes more sense considering SIMD32.  Relaxing the assertion in
brw_ir_fs.h will be required in order to avoid assertion failures on
SNB with SIMD32 fragment shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-04-28 23:00:29 -07:00