These patterns need is_only_used_as_float because fmin/fmax might change NaN
patterns, while bcsel is bit exact. For the same reason, the replacement
must not add undefined results, so make the replacement NaN/inf preserving.
It's impossible to make them signed zero correct (-0.0 == +0.0),
so it's also important that the user alu doesn't care.
Otherwise, the only thing that matters is whether a is NaN.
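As a rough illustration of the reasoning above, the sketch below compares a bit-exact select with an fmin that drops NaN operands. The select form `b < a ? b : a` and both helper names are assumptions made for this example, not the actual patterns touched here.

```python
import math

def select_min(a, b):
    # Hypothetical select form: returns one of its inputs unchanged,
    # which models bcsel(flt(b, a), b, a).
    return b if b < a else a

def ieee_fmin(a, b):
    # Models an fmin that returns the non-NaN operand and treats
    # -0.0 as smaller than +0.0.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    if a == b:
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b

nan = float("nan")
# a is NaN: the select passes NaN through, fmin returns b.
print(select_min(nan, 1.0), ieee_fmin(nan, 1.0))
# b is NaN: both return a, so only a being NaN matters.
print(select_min(1.0, nan), ieee_fmin(1.0, nan))
# Signed zeros compare equal, so the two forms may pick different signs.
print(select_min(0.0, -0.0), ieee_fmin(0.0, -0.0))
```

With this form the two results differ only when a is NaN or when the operands are zeros of opposite sign, which is why the user of the value must not care about NaN bit patterns or signed zero.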
Foz-DB Navi48:
Totals from 453 (0.55% of 82405) affected shaders:
MaxWaves: 8242 -> 8270 (+0.34%)
Instrs: 2382059 -> 2380094 (-0.08%); split: -0.09%, +0.00%
CodeSize: 13197208 -> 13179488 (-0.13%); split: -0.14%, +0.00%
VGPRs: 44688 -> 44604 (-0.19%)
Latency: 22839894 -> 22838985 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 4873352 -> 4872924 (-0.01%)
VClause: 50862 -> 50883 (+0.04%); split: -0.02%, +0.06%
SClause: 54000 -> 53993 (-0.01%)
Copies: 250215 -> 250233 (+0.01%); split: -0.00%, +0.01%
PreVGPRs: 39694 -> 39620 (-0.19%)
VALU: 1116881 -> 1116073 (-0.07%); split: -0.07%, +0.00%
SALU: 492799 -> 492139 (-0.13%); split: -0.14%, +0.00%
VOPD: 85457 -> 85461 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
We already have patterns to move the negation to the constant.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
The only special case here is d == -0.0.
Foz-DB Navi48:
Totals from 3 (0.00% of 82405) affected shaders:
CodeSize: 29140 -> 29188 (+0.16%)
InvThroughput: 2945 -> 2951 (+0.20%)
VALU: 3217 -> 3223 (+0.19%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
This was marked inexact because of me in !21475, but I don't see why now,
even after checking all the special values.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
Outside of isnan/isinf this shouldn't be needed, but at this point
they were already lowered.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
Not sure if this is actually required, but it matches previous behavior in NIR,
and some piglit tests expect this.
Notably GL-CTS does not need this, so maybe piglit is just broken.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
This reverts commit 4eeda739c4.
Looks like @ajax has been playing with claude "ai" and accidentally
committed and pushed things to main.
1. That doesn't belong here.
2. We shouldn't bypass MR review unless in emergency situations.
This seems like a good time to remind people to not have a push-able
upstream remote, to avoid an accidental `git push` or something else
doing that.
One way to do this is to change the push url like this, assuming your
upstream remote is called `origin` (default if you didn't pick
something else):
    git remote set-url --push origin invalid-url
`invalid-url` will fail if you try to push to it, catching your mistakes :)
Spotted-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39810>
If querying based on a regexp pattern, both FOO_BASE and the synthesized
FOO_BASE_HI could match the regex. But we don't really want to see both
separately, we just want the combined 64b value.
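A minimal sketch of the idea, assuming the query walks a flat list of register names and knows which ones are synthesized `_HI` halves. The function name, the list, and the set are hypothetical and are not the tool's real data structures.

```python
import re

def query(pattern, names, synthesized_hi):
    # Report a register only if it matches and is not a synthesized
    # upper half; the 64b base register already covers that value.
    rx = re.compile(pattern)
    return [n for n in names if rx.search(n) and n not in synthesized_hi]

names = ["FOO_BASE", "FOO_BASE_HI", "BAR_CNTL"]
print(query("FOO_BASE", names, {"FOO_BASE_HI"}))
```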
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39788>
The output in query mode can take many columns. Tighten it up a bit,
going from:
    12: DI_PT_TRILIST(0,0-1381,1381):98304:RM6_BIN_VISIBILITY:CP_SET_THREAD_BR: 00000040!+ RB_BIN_FOVEAT: { BINSCALEEN }
to
    12: TRILIST(0,0-1381,1381):98304:BIN_VISIBILITY:BR: 00000040!+ RB_BIN_FOVEAT: { BINSCALEEN }
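The shortening can be sketched as stripping a redundant prefix from each colon-separated field. The prefix list below is inferred from the example lines only, and the helper names are hypothetical; the real tool may handle this differently.

```python
# Prefixes inferred from the before/after example, not from the tool itself.
PREFIXES = ("DI_PT_", "RM6_", "CP_SET_THREAD_")

def shorten(field):
    # Drop the first known prefix, if the field starts with one.
    for prefix in PREFIXES:
        if field.startswith(prefix):
            return field[len(prefix):]
    return field

def tighten(header):
    # The header fields are separated by ':'.
    return ":".join(shorten(f) for f in header.split(":"))

before = "DI_PT_TRILIST(0,0-1381,1381):98304:RM6_BIN_VISIBILITY:CP_SET_THREAD_BR"
print(tighten(before))
```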
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39788>
The kernel capability has the `FPFastMathMode` decoration, but not the
`FPFastMathDefault` execution mode, so a SPIR-V module not using
`SPV_KHR_float_controls2` has no way of setting any defaults.
Fixes: 9da2d21804 ("vtn: implement default fp_math_ctrl without using execution mode")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Tested-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39790>
blend_entries is already a uint32_t so its length should only be rt_dwords.
This fixes invalid reads in the line below, where the memcpy length is
sizeof(blend_entries).
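To illustrate the size mismatch, here is a small ctypes sketch. It assumes the earlier declaration counted bytes rather than dwords, and the value of `rt_dwords` is made up for the example.

```python
import ctypes

rt_dwords = 8  # hypothetical dword count, for illustration only

# An array of uint32_t sized in dwords occupies rt_dwords * 4 bytes.
sized_in_dwords = (ctypes.c_uint32 * rt_dwords)()
# Assumed old form: the element count was already multiplied by 4.
sized_in_bytes = (ctypes.c_uint32 * (rt_dwords * 4))()

# A copy of sizeof(array) bytes from a source that only holds
# rt_dwords dwords would read past the end of that source.
print(ctypes.sizeof(sized_in_dwords), ctypes.sizeof(sized_in_bytes))
```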
Valgrind warning:
==176211== Invalid read of size 2
==176211== at 0x485A7CF: memmove (vg_replace_strmem.c:1415)
==176211== by 0x810035C: iris_upload_dirty_render_state (iris_state.c:7643)
==176211== by 0x8141B5B: iris_upload_render_state (iris_state.c:9055)
==176211== by 0x8D19941: iris_simple_draw_vbo (iris_draw.c:195)
==176211== by 0x8D1A034: iris_draw_vbo (iris_draw.c:346)
==176211== by 0x7290116: tc_call_draw_single (u_threaded_context.c:3826)
==176211== by 0x7290116: batch_execute (u_threaded_context_calls.h:11)
==176211== by 0x7290116: tc_batch_execute (u_threaded_context.c:5344)
==176211== by 0x7281583: _tc_sync (u_threaded_context.c:744)
==176211== by 0x72868A2: tc_texture_map (u_threaded_context.c:2847)
==176211== by 0x6975C23: pipe_texture_map_3d (u_inlines.h:680)
==176211== by 0x697701C: st_ReadPixels (st_cb_readpixels.c:537)
==176211== by 0x68E97C4: read_pixels (readpix.c:1216)
==176211== by 0x68E97C4: _mesa_ReadnPixelsARB (readpix.c:1233)
==176211== by 0x68E9889: _mesa_ReadPixels (readpix.c:1248)
==176211== Address 0x1a792240 is 0 bytes after a block of size 144 alloc'd
==176211== at 0x484C7A8: malloc (vg_replace_malloc.c:446)
==176211== by 0x80A6829: iris_create_blend_state (iris_state.c:1815)
==176211== by 0x7282E2F: tc_create_blend_state (u_threaded_context.c:1408)
==176211== by 0x71F4FB0: cso_set_blend (cso_context.c:541)
==176211== by 0x695CE93: st_update_blend (st_atom_blend.c:351)
==176211== by 0x64C6C0B: st_validate_state (st_util.h:129)
==176211== by 0x64C6D91: st_prepare_draw (st_draw.c:88)
==176211== by 0x67F1D16: _mesa_draw_arrays (draw.c:1176)
==176211== by 0x67F2795: _mesa_DrawArrays (draw.c:1386)
==176211== by 0x4973339: stub_glDrawArrays (piglit-dispatch-gen.c:12483)
==176211== by 0x49ECF1C: piglit_draw_rect_from_arrays (piglit-util-gl.c:746)
==176211== by 0x49ED3A6: piglit_draw_rect_custom (piglit-util-gl.c:868)
==176211==
==176211== Invalid read of size 2
==176211== at 0x485A7C0: memmove (vg_replace_strmem.c:1415)
==176211== by 0x810035C: iris_upload_dirty_render_state (iris_state.c:7643)
==176211== by 0x8141B5B: iris_upload_render_state (iris_state.c:9055)
==176211== by 0x8D19941: iris_simple_draw_vbo (iris_draw.c:195)
==176211== by 0x8D1A034: iris_draw_vbo (iris_draw.c:346)
==176211== by 0x7290116: tc_call_draw_single (u_threaded_context.c:3826)
==176211== by 0x7290116: batch_execute (u_threaded_context_calls.h:11)
==176211== by 0x7290116: tc_batch_execute (u_threaded_context.c:5344)
==176211== by 0x7281583: _tc_sync (u_threaded_context.c:744)
==176211== by 0x72868A2: tc_texture_map (u_threaded_context.c:2847)
==176211== by 0x6975C23: pipe_texture_map_3d (u_inlines.h:680)
==176211== by 0x697701C: st_ReadPixels (st_cb_readpixels.c:537)
==176211== by 0x68E97C4: read_pixels (readpix.c:1216)
==176211== by 0x68E97C4: _mesa_ReadnPixelsARB (readpix.c:1233)
==176211== by 0x68E9889: _mesa_ReadPixels (readpix.c:1248)
==176211== Address 0x1a792242 is 2 bytes after a block of size 144 alloc'd
==176211== at 0x484C7A8: malloc (vg_replace_malloc.c:446)
==176211== by 0x80A6829: iris_create_blend_state (iris_state.c:1815)
==176211== by 0x7282E2F: tc_create_blend_state (u_threaded_context.c:1408)
==176211== by 0x71F4FB0: cso_set_blend (cso_context.c:541)
==176211== by 0x695CE93: st_update_blend (st_atom_blend.c:351)
==176211== by 0x64C6C0B: st_validate_state (st_util.h:129)
==176211== by 0x64C6D91: st_prepare_draw (st_draw.c:88)
==176211== by 0x67F1D16: _mesa_draw_arrays (draw.c:1176)
==176211== by 0x67F2795: _mesa_DrawArrays (draw.c:1386)
==176211== by 0x4973339: stub_glDrawArrays (piglit-dispatch-gen.c:12483)
==176211== by 0x49ECF1C: piglit_draw_rect_from_arrays (piglit-util-gl.c:746)
==176211== by 0x49ED3A6: piglit_draw_rect_custom (piglit-util-gl.c:868)
==176211==
Cc: stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39661>
This is still going to be limited by the actual range supported
by panthor. Some internal allocations have to be within the
lower 32-bit AS, so this creates a private heap and puts all
priv bo and gpu queue allocations in that.
If panthor doesn't let us use >32bit addresses, we go back
to the old behavior of having a single heap.
The lowest 32MB range is still reserved for device usage.
Also adds a debug option to limit it back to 32 bits for debugging
any potential future issues.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39383>
libclc doesn't, so we have to. Fixes math_bruteforce cbrt on Iris.
Co-authored-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39794>
When a buffer is deleted, we have to remove it from all binding points.
We were re-using the code for BindBufferRange for this; however, this
caused the general binding point to be unbound (bound to NULL)
unconditionally, even if a different buffer is bound there. Fix this by
inlining the various bind calls into the delete buffers code.
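The intended behavior can be sketched as follows. The `Context` class, its fields, and `delete_buffer` are hypothetical stand-ins for the real GL state, used only to show that the general binding point is cleared solely when it refers to the deleted buffer.

```python
class Context:
    def __init__(self, num_indexed=4):
        self.general = None                 # general binding point
        self.indexed = [None] * num_indexed # indexed binding points

def delete_buffer(ctx, buf):
    # Unbind the buffer from every indexed binding point that uses it.
    for i, bound in enumerate(ctx.indexed):
        if bound is buf:
            ctx.indexed[i] = None
    # Only touch the general binding point if it holds this buffer.
    if ctx.general is buf:
        ctx.general = None

a, b = object(), object()
ctx = Context()
ctx.indexed[0] = a
ctx.general = b
delete_buffer(ctx, a)
print(ctx.indexed[0] is None, ctx.general is b)
```

Deleting `a` clears the indexed slot but leaves `b` bound at the general binding point, which is the case the old shared code path got wrong.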
cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14755
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39659>
For sufficiently big images, tiled AFBC offers perf advantages
over linear AFBC. Keep using linear AFBC for images that are thin
and fall through to U-interleaved for even thinner images. Note
that interleaved 64k is skipped in this case, as such images
won't meet the minimum size criteria set out by interleaved 64k's
test_props.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39737>