These patterns need is_only_used_as_float because fmin/fmax might change NaN
patterns, while bcsel is bit exact. For the same reason, the replacement
must not add undefined results, so make the replacement NaN/inf preserving.
It's impossible to make them signed zero correct (-0.0 == +0.0),
so it's also important that the user alu doesn't care.
Otherwise, the only thing that matters is whether a is NaN.
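As a rough illustration of why a bit-exact select and a float operation can disagree, the sketch below inspects binary32 bit patterns. The helper names are made up for this example and are not part of NIR.

```python
import math
import struct

def f32_bits(x):
    """Return the IEEE-754 binary32 bit pattern of x as an int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(bits):
    """Reinterpret a 32-bit pattern as a float value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Signed zeros compare equal as floats but are different bit patterns,
# so a comparison cannot tell them apart while a bit-exact select can.
zeros_equal = (-0.0 == 0.0)
zeros_same_bits = (f32_bits(-0.0) == f32_bits(0.0))

# Two quiet NaNs with different payloads: a float consumer sees "NaN"
# either way, a consumer that looks at the raw bits does not.
nan_a, nan_b = 0x7FC00000, 0x7FC00001
both_nan = math.isnan(bits_f32(nan_a)) and math.isnan(bits_f32(nan_b))
```

This is only meant to show the two cases the message names (NaN bit patterns and signed zeros), not the patterns themselves.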
Foz-DB Navi48:
Totals from 453 (0.55% of 82405) affected shaders:
MaxWaves: 8242 -> 8270 (+0.34%)
Instrs: 2382059 -> 2380094 (-0.08%); split: -0.09%, +0.00%
CodeSize: 13197208 -> 13179488 (-0.13%); split: -0.14%, +0.00%
VGPRs: 44688 -> 44604 (-0.19%)
Latency: 22839894 -> 22838985 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 4873352 -> 4872924 (-0.01%)
VClause: 50862 -> 50883 (+0.04%); split: -0.02%, +0.06%
SClause: 54000 -> 53993 (-0.01%)
Copies: 250215 -> 250233 (+0.01%); split: -0.00%, +0.01%
PreVGPRs: 39694 -> 39620 (-0.19%)
VALU: 1116881 -> 1116073 (-0.07%); split: -0.07%, +0.00%
SALU: 492799 -> 492139 (-0.13%); split: -0.14%, +0.00%
VOPD: 85457 -> 85461 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
We already have patterns to move the negation to the constant.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
The only special case here is d == -0.0.
Foz-DB Navi48:
Totals from 3 (0.00% of 82405) affected shaders:
CodeSize: 29140 -> 29188 (+0.16%)
InvThroughput: 2945 -> 2951 (+0.20%)
VALU: 3217 -> 3223 (+0.19%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
This was marked inexact because of me in !21475, but I don't see why now,
even after checking all the special values.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
Outside of isnan/isinf this shouldn't be needed, but at this point
they were already lowered.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
Not sure if this is actually required, but it matches previous behavior in NIR,
and some piglit tests expect this.
Notably GL-CTS does not need this, so maybe piglit is just broken.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
If querying based on a regexp pattern, both FOO_BASE and the synthesized
FOO_BASE_HI could match the regex. But we don't really want to see both
separately, we just want the combined 64b value.
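A minimal sketch of the idea, assuming register names are plain strings and the synthesized high half is the base name plus a `_HI` suffix. `matching_regs` is a hypothetical helper, not the tool's actual code.

```python
import re

def matching_regs(names, pattern):
    """Return names matching pattern, skipping a synthesized FOO_HI
    entry whenever its base register FOO is also known."""
    rx = re.compile(pattern)
    known = set(names)
    out = []
    for name in names:
        if not rx.search(name):
            continue
        # The base register already reports the combined 64b value.
        if name.endswith("_HI") and name[:-3] in known:
            continue
        out.append(name)
    return out
```

For example, `matching_regs(["FOO_BASE", "FOO_BASE_HI"], "FOO_BASE")` would report only `FOO_BASE`.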
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39788>
The output in query mode can take many columns. Tighten it up a bit,
going from:
12: DI_PT_TRILIST(0,0-1381,1381):98304:RM6_BIN_VISIBILITY:CP_SET_THREAD_BR: 00000040!+ RB_BIN_FOVEAT: { BINSCALEEN }
to
12: TRILIST(0,0-1381,1381):98304:BIN_VISIBILITY:BR: 00000040!+ RB_BIN_FOVEAT: { BINSCALEEN }
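The shortening above amounts to dropping redundant prefixes from each field. A small sketch, where the prefix list is inferred from the before/after lines and `shorten` is a hypothetical name:

```python
# Prefixes inferred from the example output; the real list may differ.
PREFIXES = ("DI_PT_", "RM6_", "CP_SET_THREAD_")

def shorten(field):
    """Strip the first matching known prefix from one output field."""
    for prefix in PREFIXES:
        if field.startswith(prefix):
            return field[len(prefix):]
    return field
```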
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39788>
The kernel capability has the `FPFastMathMode` decoration, but not the
`FPFastMathDefault` execution mode, so a SPIR-V module not using
`SPV_KHR_float_controls2` has no way of setting any defaults.
Fixes: 9da2d21804 ("vtn: implement default fp_math_ctrl without using execution mode")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Tested-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39790>
blend_entries is already a uint32_t so its length should only be rt_dwords.
This fixes invalid reads in the line below where the memcpy length is the
sizeof(blend_entries).
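The size mix-up can be sketched with ctypes, assuming the array had been declared with a byte count as its element count. `RT_DWORDS` and its value are hypothetical and not taken from iris.

```python
import ctypes

RT_DWORDS = 8  # hypothetical dword count, for illustration only

# Element count given in bytes: four times too many uint32_t elements.
SizedInBytes = ctypes.c_uint32 * (RT_DWORDS * ctypes.sizeof(ctypes.c_uint32))
# Element count given in dwords: what a uint32_t array needs.
SizedInDwords = ctypes.c_uint32 * RT_DWORDS

# A copy whose length is sizeof(array) inherits the oversized length
# and reads past the end of a source that only holds RT_DWORDS dwords.
oversized_len = ctypes.sizeof(SizedInBytes)
correct_len = ctypes.sizeof(SizedInDwords)
```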
Valgrind warning:
==176211== Invalid read of size 2
==176211== at 0x485A7CF: memmove (vg_replace_strmem.c:1415)
==176211== by 0x810035C: iris_upload_dirty_render_state (iris_state.c:7643)
==176211== by 0x8141B5B: iris_upload_render_state (iris_state.c:9055)
==176211== by 0x8D19941: iris_simple_draw_vbo (iris_draw.c:195)
==176211== by 0x8D1A034: iris_draw_vbo (iris_draw.c:346)
==176211== by 0x7290116: tc_call_draw_single (u_threaded_context.c:3826)
==176211== by 0x7290116: batch_execute (u_threaded_context_calls.h:11)
==176211== by 0x7290116: tc_batch_execute (u_threaded_context.c:5344)
==176211== by 0x7281583: _tc_sync (u_threaded_context.c:744)
==176211== by 0x72868A2: tc_texture_map (u_threaded_context.c:2847)
==176211== by 0x6975C23: pipe_texture_map_3d (u_inlines.h:680)
==176211== by 0x697701C: st_ReadPixels (st_cb_readpixels.c:537)
==176211== by 0x68E97C4: read_pixels (readpix.c:1216)
==176211== by 0x68E97C4: _mesa_ReadnPixelsARB (readpix.c:1233)
==176211== by 0x68E9889: _mesa_ReadPixels (readpix.c:1248)
==176211== Address 0x1a792240 is 0 bytes after a block of size 144 alloc'd
==176211== at 0x484C7A8: malloc (vg_replace_malloc.c:446)
==176211== by 0x80A6829: iris_create_blend_state (iris_state.c:1815)
==176211== by 0x7282E2F: tc_create_blend_state (u_threaded_context.c:1408)
==176211== by 0x71F4FB0: cso_set_blend (cso_context.c:541)
==176211== by 0x695CE93: st_update_blend (st_atom_blend.c:351)
==176211== by 0x64C6C0B: st_validate_state (st_util.h:129)
==176211== by 0x64C6D91: st_prepare_draw (st_draw.c:88)
==176211== by 0x67F1D16: _mesa_draw_arrays (draw.c:1176)
==176211== by 0x67F2795: _mesa_DrawArrays (draw.c:1386)
==176211== by 0x4973339: stub_glDrawArrays (piglit-dispatch-gen.c:12483)
==176211== by 0x49ECF1C: piglit_draw_rect_from_arrays (piglit-util-gl.c:746)
==176211== by 0x49ED3A6: piglit_draw_rect_custom (piglit-util-gl.c:868)
==176211==
==176211== Invalid read of size 2
==176211== at 0x485A7C0: memmove (vg_replace_strmem.c:1415)
==176211== by 0x810035C: iris_upload_dirty_render_state (iris_state.c:7643)
==176211== by 0x8141B5B: iris_upload_render_state (iris_state.c:9055)
==176211== by 0x8D19941: iris_simple_draw_vbo (iris_draw.c:195)
==176211== by 0x8D1A034: iris_draw_vbo (iris_draw.c:346)
==176211== by 0x7290116: tc_call_draw_single (u_threaded_context.c:3826)
==176211== by 0x7290116: batch_execute (u_threaded_context_calls.h:11)
==176211== by 0x7290116: tc_batch_execute (u_threaded_context.c:5344)
==176211== by 0x7281583: _tc_sync (u_threaded_context.c:744)
==176211== by 0x72868A2: tc_texture_map (u_threaded_context.c:2847)
==176211== by 0x6975C23: pipe_texture_map_3d (u_inlines.h:680)
==176211== by 0x697701C: st_ReadPixels (st_cb_readpixels.c:537)
==176211== by 0x68E97C4: read_pixels (readpix.c:1216)
==176211== by 0x68E97C4: _mesa_ReadnPixelsARB (readpix.c:1233)
==176211== by 0x68E9889: _mesa_ReadPixels (readpix.c:1248)
==176211== Address 0x1a792242 is 2 bytes after a block of size 144 alloc'd
==176211== at 0x484C7A8: malloc (vg_replace_malloc.c:446)
==176211== by 0x80A6829: iris_create_blend_state (iris_state.c:1815)
==176211== by 0x7282E2F: tc_create_blend_state (u_threaded_context.c:1408)
==176211== by 0x71F4FB0: cso_set_blend (cso_context.c:541)
==176211== by 0x695CE93: st_update_blend (st_atom_blend.c:351)
==176211== by 0x64C6C0B: st_validate_state (st_util.h:129)
==176211== by 0x64C6D91: st_prepare_draw (st_draw.c:88)
==176211== by 0x67F1D16: _mesa_draw_arrays (draw.c:1176)
==176211== by 0x67F2795: _mesa_DrawArrays (draw.c:1386)
==176211== by 0x4973339: stub_glDrawArrays (piglit-dispatch-gen.c:12483)
==176211== by 0x49ECF1C: piglit_draw_rect_from_arrays (piglit-util-gl.c:746)
==176211== by 0x49ED3A6: piglit_draw_rect_custom (piglit-util-gl.c:868)
==176211==
Cc: stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39661>
This is still going to be limited by the actual range supported
by panthor. Some internal allocations have to be within the
lower 32-bit AS, so this creates a private heap and puts all
priv bo and gpu queue allocations in that.
If panthor doesn't let us use >32bit addresses, we go back
to the old behavior of having a single heap.
The lowest 32MB range is still reserved for device usage.
Also adds a debug option to limit it back to 32 bits for debugging
any potential future issues.
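A sketch of the resulting address-space split. The exact boundaries are an assumption here (private heap covering the rest of the lower 4GB), and `heap_layout` is a hypothetical helper rather than driver code.

```python
MIB = 1 << 20
RESERVED_END = 32 * MIB   # lowest 32MB stays reserved for device usage
LOW_32BIT_END = 1 << 32   # end of the lower 32-bit address space

def heap_layout(va_bits):
    """Return {heap name: (start, end)} for a VA space of va_bits bits."""
    if va_bits <= 32:
        # Old behavior: a single heap for everything.
        return {"main": (RESERVED_END, 1 << va_bits)}
    return {
        # priv bo and gpu queue allocations stay below 4GB.
        "private": (RESERVED_END, LOW_32BIT_END),
        # Everything else can use the larger range.
        "main": (LOW_32BIT_END, 1 << va_bits),
    }
```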
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39383>
libclc doesn't, so we have to. Fixes math_bruteforce cbrt on Iris.
Co-authored-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39794>
When a buffer is deleted, we have to remove it from all binding points.
We were re-using the code for BindBufferRange for this; however, this
caused the general binding point to be unbound (bound to NULL)
unconditionally, even if a different buffer is bound there. Fix this by
inlining the various bind calls into the delete buffers code.
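The intended behavior can be modeled roughly as follows. This is a toy model with hypothetical names, not the Mesa data structures.

```python
class BindingTarget:
    """Toy model of one buffer target: a general binding point
    plus a list of indexed binding points."""
    def __init__(self, num_indexed):
        self.general = None
        self.indexed = [None] * num_indexed

def unbind_deleted(target, buf):
    """Remove buf from every binding point it occupies."""
    for i, bound in enumerate(target.indexed):
        if bound is buf:
            target.indexed[i] = None
    # Only touch the general binding point if the deleted buffer is
    # the one bound there; a different buffer must stay bound.
    if target.general is buf:
        target.general = None
```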
cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14755
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39659>
For sufficiently big images, tiled AFBC offers perf advantages
over linear AFBC. Keep using linear AFBC for images that are thin
and fall through to U-interleaved for even thinner images. Note
that indeed, interleaved 64k will be skipped in this case as it
won't meet the minimum size criteria set out by interleaved 64k's
test_props.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39737>
On GFX10+, the hardware can write decompressed DWORDS to HTILE when
COMPRESSION_EN=1, which means some HTILE decompression/initialization
operations can be avoided because it automatically marks the tiles that
are touched as uncompressed.
Though according to PAL, there are issues with that on GFX10-10.3, so
it's only enabled on GFX11-11.5.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656>