Testing a negative iteration count makes no sense, and can cause issues
when an unsigned type is used.
Testing 0 iterations is already covered by
will_break_on_first_iteration, so it can be skipped too.
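
To see why a negative guess is a problem once it lands in an unsigned type, here is a minimal Python sketch that mimics 32-bit unsigned wraparound. The helper name `as_uint32` is hypothetical and is not part of NIR; it only illustrates the arithmetic.

```python
def as_uint32(value: int) -> int:
    """Reinterpret a Python int as a 32-bit unsigned value (two's complement)."""
    return value & 0xFFFFFFFF

# A candidate iteration count of -1 turns into a huge positive count
# once it is stored in an unsigned 32-bit variable.
wrapped = as_uint32(-1)
print(wrapped)  # 4294967295
```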
Fixes: 6772a17a ("nir: Add a loop analysis pass")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9913
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26173>
@decl_reg intrinsics must be in the first block so it's convenient to be
able to create an insertion point after all @decl_regs when the first
block needs to be split.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26737>
When we moved the bulk of glsl_type to C, these globals were
kept to avoid changes to compiler/glsl code in the MR. Now that
it has landed, change the code to use the actual builtins directly.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26658>
Those are passed as an optional argument and are declared as a list of
(type, name) tuples.
At the moment this can only be used for conditions.
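
As a rough illustration of the shape described above, a list of (type, name) tuples could be turned into C parameter declarations as follows. This is only a sketch: the parameter names and the `render_c_params` helper are hypothetical and are not taken from the Mesa sources.

```python
# Hypothetical extra parameters, declared as (type, name) tuples.
extra_params = [
    ("unsigned", "max_bit_size"),
    ("const void *", "driver_data"),
]

def render_c_params(params):
    """Join (type, name) tuples into a C parameter list string."""
    parts = []
    for ctype, name in params:
        # Pointer types already end in '*', so no extra space is needed.
        sep = "" if ctype.endswith("*") else " "
        parts.append(f"{ctype}{sep}{name}")
    return ", ".join(parts)

print(render_c_params(extra_params))
# unsigned max_bit_size, const void *driver_data
```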
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26214>
Once lowered low enough, it's not always possible to tell what strings
are used. So include them all when linking another shader.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26505>
Once decl_reg is handled, src[0].ssa->divergent will be properly set, so
load_reg and load_reg_indirect do not need special treatment.
shader-db can run to completion on HSW, IVB, and SNB now. No other
testing was done.
v2: Refactor nir_intrinsic_load_reg and nir_intrinsic_load_reg_indirect
handling. Suggested by Daniel Schürmann.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 4fd257d20f ("nir: Properly handle divergence for load_reg")
Fixes: 6dbb5f1e07 ("intel/fs: rerun divergence analysis prior to convert_from_ssa")
Closes: #10233
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26436>
In the wise words of Mike Blumenkrantz, "I hate gl_PointSize and so can you".
The mesa/st lowering won't mesh well with vertex shader epilogues, and it falls
over in various circumstances. I am too tired to go against the grain, so let's
just pretend to be a normal gallium driver and trust in the rasterizer CSO,
lowering point size internally. This properly handles transform feedback without
any hacks, both GL and GLES behaviours, etc.
Fixes:
KHR-GL31.transform_feedback.capture_vertex_separate_test
gl-2.0-large-point-fs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26614>
in prep for tessellation (which will share the IA lowering), and for multidraw
indirect (which greatly complicates IA lowering with geom/tess).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26614>
This fixes the following error:
FAILED: src/vulkan/runtime/vk_synchronization_helpers.c src/vulkan/runtime/vk_synchronization_helpers.h
"C:\CI-Tools\msys64\mingw64\bin/python3.EXE" "../../src/vulkan/util/vk_synchronization_helpers_gen.py" "--xml" "../../src/vulkan/registry/vk.xml" "--out-c" "src/vulkan/runtime/vk_synchronization_helpers.c" "--beta" "false"
Traceback (most recent call last):
File "C:/work/xemu/mesa/src/vulkan/util/vk_synchronization_helpers_gen.py", line 213, in main
f.write(TEMPLATE_C.render(**environment))
UnicodeEncodeError: 'gbk' codec can't encode character '\xa9' in position 15: illegal multibyte sequence
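
The failure above can be reproduced and avoided in isolation. The sketch below assumes the remedy is to pass an explicit encoding to open(); the file name and the sample string are made up for illustration and do not come from the generator script.

```python
import os
import tempfile

text = "Copyright \xa9 example"

# The 'gbk' codec from the traceback cannot represent U+00A9.
try:
    text.encode("gbk")
    gbk_can_encode = True
except UnicodeEncodeError:
    gbk_can_encode = False

# Passing encoding="utf-8" makes the output independent of the locale.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "generated.c")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "rb") as f:
        raw = f.read()
```

Without the encoding argument, open() in text mode falls back to the locale's preferred encoding, which is why the same script can work on one machine and fail on another.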
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26515>
Since nir_opt_algebraic runs on its own results, if the driver doesn't
have [su]dot_4x8_[ui]add then the [su]dot_4x8_[ui]add lowering rules
will kick in and lower that to what we had originally.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26533>
Replacement in try_eval_const_alu() doesn't work because the replacements
are always scalar. The callers also always give a scalar dest.
This is encountered when compiling a Redout shader under ASan.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: bc170e895f ("nir/loop_analyze: Use try_eval_const_alu and induction variable basis info")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26225>
Save a comparison, and move out the comparison to be more backend friendly.
Saves 2 instrs on AGX (as the remaining comparison now fuses with bcsel).
Results on AGX, all affected shaders in asphalt9.
total instructions in shared programs: 1813003 -> 1812611 (-0.02%)
instructions in affected programs: 119646 -> 119254 (-0.33%)
helped: 333
HURT: 0
Instructions are helped.
total bytes in shared programs: 11870344 -> 11867208 (-0.03%)
bytes in affected programs: 820888 -> 817752 (-0.38%)
helped: 333
HURT: 0
Bytes are helped.
and on Mali-G57:
total instructions in shared programs: 2677538 -> 2677205 (-0.01%)
instructions in affected programs: 206923 -> 206590 (-0.16%)
helped: 333
HURT: 0
Instructions are helped.
total cvt in shared programs: 14667.50 -> 14662.30 (-0.04%)
cvt in affected programs: 1953.64 -> 1948.44 (-0.27%)
helped: 333
HURT: 0
Cvt are helped.
total quadwords in shared programs: 1450664 -> 1450544 (<.01%)
quadwords in affected programs: 5064 -> 4944 (-2.37%)
helped: 15
HURT: 0
Quadwords are helped.
total threads in shared programs: 53282 -> 53309 (0.05%)
threads in affected programs: 27 -> 54 (100.00%)
helped: 27
HURT: 0
Threads are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26489>
The NIR intrinsics now take and return a barrier whenever one is
modified instead of modifying in-place. In NAK, we give the internal
instructions the same treatment and convert everything to use barrier
SSA values and RegRefs. In nak_from_nir, we move all barriers to/from
GPRs. We'll clean up the massive pile of OpBMov later.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26463>
asahi will pass in 16-bit values, which works fine if we convert before
clamping. note we don't try to be clever and make a smaller immediate because
it would require extra logic for negatives to make sure we don't have garbage
in upper bits (nir_validate checks that). do the simple, obviously correct thing.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26440>
If the shader passed to nir_lower_vars_to_scratch contains some unused
derefs to a variable that will be lowered, validation will fail because
the variable is not part of the shader after the pass.
cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26271>
This flag indicates the requirement of helper invocations
in fragment shaders, independent from any present instructions.
This fixes the lowering of OpGroupNonUniformQuad* instructions.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26026>
This is based off the original GLSL IR pass but it is much much
simpler as it doesn't need to do all of the hackery required in
GLSL IR to achieve the lowering.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25860>
Since v71, Broadcom hardware includes specific packing/conversion
instructions, so this commit adds opcodes to make use of them,
especially for image stores:
* pack_2x16_to_unorm_2x8 (on backend vftounorm8/vftosnorm8):
  2x16-bit floating point to 2x8-bit unorm/snorm
* f2unorm_16/f2snorm_16 (on backend ftounorm16/ftosnorm16):
  floating point to 16-bit unorm/snorm
* pack_2x16_to_unorm_2x10/pack_2x16_to_unorm_10_2 (on backend
  vftounorm10lo/vftounorm10hi): used to convert floating point to
  r10g10b10a2 unorm
* pack_32_to_r11g11b10 (on backend v11fpack): packs 2 2x16 FP into
  R11G11B10
* pack_uint_32_to_r10g10b10a2 (on backend v10pack): packs 2 2x16
  integers into R10G10B10A2
* pack_4x16_to_4x8 (on backend v8pack): packs 2 2x16-bit integers
  into 4x8 bits
* pack_2x32_to_2x16 (on backend vpack): 2x32-bit to 2x16 integer
  pack
The latter can easily be confused with the existing
pack_32_2x16_split. Note, however, that pack_32_2x16_split receives two
16-bit integers and packs them into a 32-bit integer, whereas the
Broadcom opcode takes two 32-bit integers, takes the lower halfword of
each, and packs them as 2x16 into a 32-bit integer.
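
The difference between the two packing operations can be sketched in Python as below. This is an illustration only, not the NIR or hardware definition: it assumes the first source lands in the low half of the result in both cases, and the function bodies are written just to show the input widths.

```python
def pack_32_2x16_split(lo16: int, hi16: int) -> int:
    """Existing op: two 16-bit inputs packed into one 32-bit value."""
    assert 0 <= lo16 <= 0xFFFF and 0 <= hi16 <= 0xFFFF
    return (hi16 << 16) | lo16

def pack_2x32_to_2x16(a32: int, b32: int) -> int:
    """Broadcom-style op: keep only the low halfword of each 32-bit input."""
    assert 0 <= a32 <= 0xFFFFFFFF and 0 <= b32 <= 0xFFFFFFFF
    return ((b32 & 0xFFFF) << 16) | (a32 & 0xFFFF)

# The upper halfwords (0xAAAA and 0xBBBB) are discarded by the 2x32 variant.
print(hex(pack_2x32_to_2x16(0xAAAA1234, 0xBBBB5678)))  # 0x56781234
print(hex(pack_32_2x16_split(0x1234, 0x5678)))         # 0x56781234
```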
Interestingly, Broadcom also defines a similar opcode that packs the
higher halfword. It is not used yet.
Note that at this point we use hardware-agnostic names, even though we
add a _v3d suffix (as they are only available on Broadcom), in order to
follow current NIR conventions.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25726>