Commit graph

43 commits

Author SHA1 Message Date
Emma Anholt
7c57061b77 nouveau: Enable frexp lowering in the backend.
This would be desired for NVK using this backend, but also for getting
lowering out of the GLSL frontend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22083>
2023-04-06 02:32:01 +00:00
Emma Anholt
3a336a8ffd nouveau: Add missing nir_opt_algebraic_late.
This was needed for nir_lower_frexp, but it's a win anyway.  shader-db
results:

total gpr in shared programs: 1143621 -> 1143502 (-0.01%)
gpr in affected programs: 33918 -> 33799 (-0.35%)

total instructions in shared programs: 7829415 -> 7820124 (-0.12%)
instructions in affected programs: 1204967 -> 1195676 (-0.77%)

total bytes in shared programs: 71802760 -> 71717352 (-0.12%)
bytes in affected programs: 11031888 -> 10946480 (-0.77%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22083>
2023-04-06 02:32:01 +00:00
Michel Dänzer
ff73392774 nouveau: Make getSize return unsigned int
This matches the type of the underlying size member, and is consistent
with other getSize methods.

Avoids compiler warning with LTO enabled:

In member function '__ct ',
    inlined from 'convertToSSA' at ../src/nouveau/codegen/nv50_ir_ssa.cpp:401:26,
    inlined from 'convertToSSA' at ../src/nouveau/codegen/nv50_ir_ssa.cpp:310:28,
    inlined from 'nv50_ir_generate_code' at ../src/nouveau/codegen/nv50_ir.cpp:1331:22:
../src/nouveau/codegen/nv50_ir_ssa.cpp:407:48: error: argument 1 value '18446744073709551615' exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=]
  407 |    stack = new Stack[func->allLValues.getSize()];
      |                                                ^
/usr/include/c++/12/new: In function 'nv50_ir_generate_code':
/usr/include/c++/12/new:128:26: note: in a call to allocation function 'operator new []' declared here
  128 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc)
      |                          ^

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21781>
2023-03-17 16:08:33 +00:00
Daniel Schürmann
2bb369dd8d nir: add assertions that loops don't have a Continue Construct
Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.

Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
2023-02-21 10:41:11 +00:00
Ian Romanick
ea413e826b nir: Eliminate nir_op_f2b
Builds on the work of !15121.  This gets to delete even more code
because many drivers shared a lot of code for i2b and f2b.

No shader-db or fossil-db changes on any Intel platform.

v2: Rebase on 1a35acd8d9.

v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin.

v4: Another rebase. Remove f2b stuff from Midgard.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>
2023-02-03 22:39:57 +00:00
Ian Romanick
eb76cee9f8 nir: Eliminate nir_op_i2b
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b.  Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.

At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.

The failing test on d3d12 is a pre-existing bug that is triggered by
this change.  I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.

v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.

v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.

v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress.  The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).

v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.

v6: Just eliminate the i2b instruction.

v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷

No shader-db changes on any Intel platform.

All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2

Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2

The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.

Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
2022-12-14 06:23:21 +00:00
Ben Skeggs
3a94b3b2a7 gv100/ir: noop OP_BAR for now
Let's get stuff rolling and deal with figuring this out later.

Acked-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17633>
2022-11-09 21:21:22 +00:00
Yusuf Khan
2c5b1d0e3b nv50/ir: Support fmulz and ffmaz
Signed-off-by: Yusuf Khan <yusisamerican@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19333>
2022-11-08 21:10:08 +00:00
Yusuf Khan
47251d2852 nv50/ir: add prefer_nir flag for getting compiler options
So that we dont expose certain options for nir_to_tgsi

Signed-off-by: Yusuf Khan <yusiamerican@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19333>
2022-11-08 21:10:08 +00:00
Joan Bruguera
6014a642ae nv50/ir/nir: ignore sampler for TXF/TXQ ops.
Recently, a regression was reported where videos in Firefox had shifted/
glitched colors on certain Kepler hardware. This was bisected to
bf02bffe15, however, the issue already
existed but didn't hit users until TGSI was switched to NIR as default.

The issue was traced to a YUV-to-RGB fragment shader used by Firefox,
which uses three samplers for the Y/U/V components. The Y component was
handled correctly, but the U/V components were bogus, causing the issue.

After analysis, it appears the TXF/TXQ ops. should only handle the texture
(r) but not the sampler (s), see 63b850403c
and 346ce0b988.
Similarly, handleTXQ/handleTXF on nv50_ir_from_tgsi always sets s=0.
Only Kepler was affected because other hardware ignores s at codegen.

Always set s=0 on NIR for TXF/TXQ, to keep TGSI behavior and fix the
regression.

Thanks: Karol Herbst and M Henning for help diagnosing the issue.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7416
Cc: mesa-stable
Suggested-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Signed-off-by: Joan Bruguera <joanbrugueram@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19453>
2022-11-02 12:29:34 +00:00
Jason Ekstrand
d0c9ab529e nouveau/codegen: Support bindless texture queries
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19431>
2022-11-01 15:29:24 +00:00
Alyssa Rosenzweig
941c37c085 nir/lower_idiv: Remove imprecise_32bit_lowering
NIR has two implementations of lower_idiv, keyed on the
imprecise_32bit_lowering flag. This flag is misleading: the results when
setting this flag "imprecise", they're completely wrong for some values.
If a backend has a native implementation of umul_high, the correct path
isn't that much more expensive. If it doesn't, it's substantially slower
for highp integer divison... but in practice, non-constant highp integer
division is pretty rare.

After a painful migration of the tree, this code path has no more users.
Remove it so nobody else gets the bright idea of using it again.

Closes: #6555
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19303>
2022-10-27 19:37:14 +00:00
Yusuf Khan
d9a257b339 nv50/ir: nir_op_b2i8 and nir_op_b2i16
Signed-off-by: Yusuf Khan <yusisamerican@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19256>
2022-10-27 02:16:24 +00:00
Thomas Debesse
6d5921c623 nv50: call nir_lower_flrp
Fix #7432: unknown nir_op flrp assertion

This copy-pastes src/gallium/drivers/radeonsi/si_shader_nir.c

The lower_flrp16 value differs given chipset >= NVISA_GV100_CHIPSET.

Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19003>
2022-10-10 17:22:49 +00:00
Adam Jackson
14c6f716b4 nouveau: const cleanup
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18586>
2022-09-16 14:23:47 +00:00
Emma Anholt
873365caee nouveau: Fix compiler warnings about silly address checks in ir_print.
in/out/sv are arrays, so &array[i] is a non-null pointer.  Presumably
numSysVals/Inputs/Outputs are only incremented when there's data in the
arrays, anyway.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18536>
2022-09-13 18:39:07 +00:00
Danilo Krummrich
b97590371a nv50/ir: handle U8/U16 integers converting to U64
We can't directly convert from unsigned integers smaller than 64 bit to
unsigned 64 bit integers. Hence, converting from 32 bit to 64 bit is
handled by just merging with 0. To support U8/U16 integers handle them
just the same way.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:32:30 +02:00
Danilo Krummrich
caba679e56 nv50/ir: handle S8/S16 integers converting to S64
We can't convert directly from signed integers smaller 64 bit to signed
64 bit integers. For 32 bit integers this is handled with SHR and MERGE.
In order to also support 8/16 bit singed integers convert them to 32 bit
first.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:32:27 +02:00
Danilo Krummrich
2aaa315eee nv50/ir: split and cvt 64bit integers for {i,u}2{i,u}{8,16}
We can't convert from a 64 bit integer to any integer smaller than
64 bit directly, hence split the value first and then cvt / mov to the
target type.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:32:24 +02:00
Danilo Krummrich
8ccba4ea5c nv50/ir: add intermediate conversion for f2{i,u}{8,16}
Directly converting from a float to an 8 bit integer and from a 64 bit
float to an integer smaller than 32 bit is not supported, therefore add
an intermediate conversion to an 32 bit integer.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:32:22 +02:00
Danilo Krummrich
6a9825bc1b nv50/ir/nir: always round towards zero for f2i/f2u
Conversions to integers must be rounded towards zero, hence, actually
do this for all integers including 8/16 bit sources.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:32:18 +02:00
Danilo Krummrich
109d56f612 nv50/ir/nir: convert 8/16 bit src to 32 bit for {i,u}2f64
Converting signed and unsigned integers from 8/16 bit sources to a 64 bit
floating point destination (i2f64 / u2f64) isn't possible, hence convert
the source to 32 bit first.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:32:16 +02:00
Danilo Krummrich
78fc5e3773 nv50/ir: add isUnsignedIntType() and isIntType() helpers
Add helper functions to check whether a DataType is an unsigned integer
type and whether a DataType is either an unsigned or signed integer
type.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:32:13 +02:00
Danilo Krummrich
ec60dcd870 nv50/ir/nir: avoid 8/16 bit dest regs for OP_MOV
Instructions like

  mov u16 %r78s 0x00ff (0)

are dropped, since they're not supported by the HW, hence avoid using
8/16 bit destination registers for OP_MOV and use the full width of the
register instead.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:32:10 +02:00
Danilo Krummrich
6e2fda15f1 nv50/ir/nir: convert to 32 bit for all OP_SET opcodes
The 'set' instruction does distinguish between signed and unsigned, but
always treats values as 32 bit. For singed values < 0 with a bit width
smaller than 32 bit this falsely results in treating it as a positive
value.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:32:05 +02:00
Danilo Krummrich
cd53bcd325 nv50/ir/nir: add conversion ops for bit width < 32
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18109>
2022-09-09 17:31:57 +02:00
Karol Herbst
b23b94fbc9 nv50/ir: fix OP_UNION resolving when used for vector values
When an OP_UNION def takes part in a vector source e.g. for a tex
instruction we failed to clean up the OP_UNION instruction as rep() points
towards the coalesced value instead.

This fixes a regression on nv50 moving to NIR, but also potentially issues
with nvc0.

The main reason this is common in nv50 is, that we lower OP_SLCT to a set,
predicated movs and a union.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6406
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7117
Cc: mesa-stable
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18377>
2022-09-08 11:35:35 +00:00
M Henning
f90f04d501 nv/nir: Set ssbo CacheMode from intrinsic access
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18354>
2022-09-04 20:32:30 +00:00
Pierre Moreau
16b07b342d nv50/nir: A group barrier is CTA-level not global-level
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>
2022-08-23 18:29:44 +00:00
Pierre Moreau
9236af8b6c nv50/ir: Avoid generating splits of splits
Among others, it would result in the spill offsets being wrong due to
being relative to the parent split and not absolute.

For example when computing a 64-bit multiply on Tesla (which only
supports 16-bit mul in hardware), the sources will first be split into
32-bit values and then a second time down to 16-bit ones. Looking at the
first source, the spill offsets ended being computed as follows:

    { .hihi = +2, .hilo = +0, .lohi = +2, .lolo = +0 }

instead of the expected

    { .hihi = +6, .hilo = +4, .lohi = +2, .lolo = +0 }

This is resolved with this patch.

Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>
2022-08-23 18:29:44 +00:00
Pierre Moreau
b327f46e45 nv50/ra: Fix the offset computation for compounds
compMask is expressed in terms of colours, not bytes, where on Tesla we
have 1 colour per 16-bit (whereas it is 1 per 32-bit for later
architectures). By multiplying by units we will get back to a result in
bytes.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>
2022-08-23 18:29:44 +00:00
Pierre Moreau
4d892829f3 nv50/peephole: Disallow combining sub 4-byte ld/st for now
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>
2022-08-23 18:29:44 +00:00
Pierre Moreau
81828284b2 nv50/ir: Handle non-32-bit values when cst folding SPLIT
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>
2022-08-23 18:29:44 +00:00
Emma Anholt
f6c5b1d6c6 nir: Split usub_sat lowering flag from uadd_sat.
Intel vec4 would like to do uadd_sat, but use lowering for usub_sat.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17637>
2022-07-22 17:54:28 +00:00
M Henning
4ee6345d2e nouveau: Drop C++03 compat code
Mesa as a whole requires C++14 nowadays, so this isn't needed any more.

Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17307>
2022-07-05 13:23:12 +00:00
Connor Abbott
c86563c29e nv50/ir/ra: Fix copying compound for moves
In order to reduce moves when coalescing multiple registers into a
larger register, RA will try to coalesce MERGE instructions with their
definitions. For example, for something like this in GLSL:

uint a = ...;
uint b = ...;
uint64 x = packUint2x32(a, b);

The compiler will try to coalesce x with a and b, in the same way as
something like:

uint a = ...;
uint b = ...;
...
uint x = phi(a, b);

with the crucial difference that the definitions of a and b only clobber
part of the register, instead of the whole thing. This information is
carried through the compound flag and compMask bitmask. If compound is
set, then the value has been coalesced in such a way that not all the
defs clobber the entire register. The compMask bitmask describes which
subregister each def clobbers, although it does it in a slightly
convoluted way. It's an invariant that once compound is set on one def,
it must be set for all the defs in a given coalesced value.

In more detail, the constraints pass will first create extra moves:

uint a = ...;
uint b = ...;
uint a' = a;
uint b' = b;
uint64 x = packUint2x32(a', b');

and then RA will merge values involved in MERGE/SPLIT instructions,
merging x with a' and b' and making the combined value compound -- this
is relatively simple, and will always succeed since we just created a'
and b', so they never interfere with x, and x has no other definitions,
since we haven't started coalescing moves yet. Basically, we just replaced
the MERGE instruction with an equivalent sequence of partial writes to the
destination. The tricky part comes when we try to merge a' with a
and b' with b. We need to transfer the compound information from a' to a
and b' to b, which copyCompound() does, but we also need to transfer it
to any defs coalesced with a and b, which the code failed to do. Similarly,
if x is the argument to a phi instruction, then when we try to merge it
with other arguments to the same phi by coalescing moves, we'd have
problems guaranteeing that all the other merged defs stay up-to-date.

One tricky part of fixing this is that in order to properly propagate
the information from a' to a, we need to do it before the defs for a and
a' are merged in coalesceValues(), since we need to know which defs are
merged with a but not a' -- after coalesceValues() returns, all the defs
have been combined, so we don't know which is which. I took the approach
of calling copyCompound() inside coalesceValues(), instead of
afterwards.

v2: (mhenning) This now loops over mergedDefs in copyCompound, to update
    it for changes made in bcf6a9ec

Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Karol Herbst <kherbst@redhat.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17115>
2022-06-30 11:35:56 +00:00
Emma Anholt
1e2e52eff7 nouveau/nir: Implement mul_zero_wins behavior for use_legacy_math_rules.
This is the same flag TGSI sets for LEGACY_MATH_RULES.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Tested-by: Mobin Aydinfar <mobin@mobintestserver.ir>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16176>
2022-06-10 03:26:33 +00:00
Emma Anholt
76b203eb39 gallium: Rename MUL_ZERO_WINS to LEGACY_MATH_RULES.
This is a clearer name for what it does than MUL_ZERO_WINS, and matches up
to the new name in shader_info.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16176>
2022-06-10 03:26:32 +00:00
Timothy Arceri
bc0f8455e5 nouveau/nvc0: disable GLSL IR loop unrolling
NIR loop unrolling is already enabled so just let it do its job.

Shader-db results (nv120):

total gpr in shared programs: 893490 -> 893898 (0.05%)
gpr in affected programs: 15338 -> 15746 (2.66%)
total instructions in shared programs: 6243205 -> 6237068 (-0.10%)
instructions in affected programs: 71160 -> 65023 (-8.62%)
total bytes in shared programs: 66729616 -> 66664760 (-0.10%)
bytes in affected programs: 759328 -> 694472 (-8.54%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
2022-06-04 16:11:49 +00:00
Timothy Arceri
e5181c2e23 nouveau/nv50: disable GLSL IR loop unrolling
NIR loop unrolling is already enabled so just let it do its job.

Shader-db results (nv92):

total gpr in shared programs: 734638 -> 735037 (0.05%)
gpr in affected programs: 11058 -> 11457 (3.61%)
total instructions in shared programs: 6073415 -> 6073398 (<.01%)
instructions in affected programs: 10079 -> 10062 (-0.17%)
total bytes in shared programs: 41837432 -> 41838872 (<.01%)
bytes in affected programs: 252504 -> 253944 (0.57%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
2022-06-04 16:11:49 +00:00
Dave Airlie
e90fe826a2 nouveau/codegen: drop gallium headers from the interface.
I know pipe defines are still used internally, but I'd want
better testing, before starting to remove that.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Yusuf Khan<yusisamerican@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16763>
2022-06-03 03:57:18 +00:00
Dave Airlie
ee93b32fdd nouveau/codegen: drop all ubytes from codegen.
There wasn't that many, so get rid of them in favour of real types.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Yusuf Khan<yusisamerican@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16763>
2022-06-03 03:57:18 +00:00
Dave Airlie
1f754b7aae nouveau: move codegen to a common higher level directory.
This allows it to be built independently of the gallium driver.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Yusuf Khan<yusisamerican@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16763>
2022-06-03 03:57:18 +00:00