We are writing 32-bit register value to buffer and were reading back
64-bit value back into two register. We don't need to read the second
register in this case.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29389>
They are still the same, but we don't rely on the BRW compiler
specific symbols. STATIC_ASSERT catches at compile time if they
change independently. At some point we might revisit the need
for them to match.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27646>
On newer platforms (Arrowlake and above) we can issue a
EXECUTE_INDIRECT_DRAW that allows us to:
* Skip issuing mi load/store instructions for indirect parameters
* Skip doing the indirect draw unroll on the CPU side when the
appropriate stride is passed
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26178>
This is a prepare step to remove depends on p_defines.h in src/util/*
This is done by:
replace pipe_prim_type with mesa_prim
replace shader_prim with mesa_prim
replace PIPE_PRIM_MAX with MESA_PRIM_COUNT
replace SHADER_PRIM_ with MESA_PRIM_
replace PIPE_PRIM_ with MESA_PRIM_
This patch only replace code only
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23369>
We can have multiple pipe_screen but only one iris_bufmgr per device.
So better to store intel_device_info into the shared iris_bufmgr and
save some memory.
Also in future patches iris_bufmgr will make more use of
intel_device_info.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19650>
On Tigerlake and later, we enable compression for image views. However,
we never actually added any code to update the aux state, which meant
that if it ever changed, things would break, badly.
We managed to avoid catastrophic effects in most cases because of
two other issues which papered over the problem: if compression wasn't
already enabled for an image, we'd leave it disabled. And, we avoided
writing via the CPU to buffers with auxiliary. So in most cases, CCS
remained disabled, or got enabled (say by glTexImage()) then stayed on
permanently. There were still issues, but they managed to remain more
hidden than one would expect given the severity of the bug.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>
Eventually the resolve code started making everything take ice instead
of batch, and at some point this ceased to be used. It's always render.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>
Make it clearer we are dealing with multiple patches,
works better in constrast with SINGLE_PATCH.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>
On Gfx11+, we're going to stop changing Surface State Base Address
and instead start changing the Binding Table Pool address instead.
So, rename a few things to track the last binder address, which is
what we're actually changing, regardless of how we program it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
include it explicitly in the correct places
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14104>
INTEL_DEBUG is defined (since 4015e1876a) as:
#define INTEL_DEBUG __builtin_expect(intel_debug, 0)
which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.
Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.
Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
Using recommended values based on performance studies across a range
of workloads.
Rework:
* Always enable geometry distribution
* Set ListCutIndexEnable if primitive restart is enabled
* Set distribution mode based on TEEnable
v2:
- Flag missing IRIS_DIRTY_VFG bit (Ken)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12091>
This adds buffer-local barriers so any required synchronization
commands are emitted before a buffer object is used as source for
indirect draw parameters. An unconditional PIPE_CONTROL meant to
flush the contents of the draw count buffer can now be removed, since
it's redundant with the more accurate buffer-local barrier introduced
here, which should avoid flushing in cases where the buffer wasn't
written by any incoherent cache since the last flush.
(Rebased by Kenneth Graunke.)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12691>
We would like draw-only display lists to have immutable draw info and
this is the only GL non-draw state in pipe_draw_info (not counting
view_mask).
It also allows removing some code from draw_vbo for tessellation.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12351>
This is the virtual memory address of the buffer object. Calling it the
BO's address is a lot more obvious than calling it an offset in one of
the now many graphics translation tables.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12206>
iteration needs to be added to the offset now
Fixes: dae3113c3d ("gallium: split drawid out of pipe_draw_info and as a separate draw_vbo param")
Tested-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10555>
the only case in which this is nonzero is if a multidraw gets split by the frontend,
i.e., mesa core, and in all other cases it can be ignored. the value can also be ignored
for all indirect draws, though it seems many (most?) gallium drivers are not aware of this
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10166>
this moves index_bias into the multidraw struct, enabling draws where the value
changes to be merged; the draw_info struct member is renamed and moved to the end
of the struct for tc use
u_vbuf still has some checks to split draws if index_bias changes, maybe
this can be removed at some point?
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10166>
The previous mechanism was a bit fragile. We stored the zero offset
in the pre-baked packet, and used an flag to override 0xFFFFFFFF
(append) offsets until our first emit - then prohibited anyone from
trying to re-emit the packet by flagging IRIS_DIRTY_SO_BUFFERS,
because that would re-emit the version with the zeroing of the offset.
Now, we always store 0xFFFFFFFF in the pre-baked packet, and use a
flag to override it to zero on the first emit. That way, we can
re-emit that packet at any time, and it'll just keep appending.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
The `restart_index` field can be uninitialized if `primitive_restart`
is false so we have to track `restart_index` changes
only if `primitive_restart` is true
Here is a valgrind warning:
Conditional jump or move depends on uninitialised value(s)
==52021== at 0x6D44968: iris_update_draw_info (iris_draw.c:102)
==52021== by 0x6D450B5: iris_draw_vbo (iris_draw.c:273)
==52021== by 0x642FD8E: cso_multi_draw (cso_context.c:1708)
==52021== by 0x5C434D3: st_draw_gallium (st_draw.c:271)
==52021== by 0x5DF5F1B: _mesa_draw_arrays (draw.c:554)
==52021== by 0x5DF68F7: _mesa_DrawArrays (draw.c:768)
==52021== by 0x49011F2: stub_glDrawArrays (piglit-dispatch-gen.c:12181)
==52021== by 0x11C611: piglit_display (shader_runner.c:4549)
==52021== by 0x4994D83: process_next_event (piglit_x11_framework.c:137)
==52021== by 0x4994E47: enter_event_loop (piglit_x11_framework.c:153)
==52021== by 0x49939A4: run_test (piglit_winsys_framework.c:88)
==52021== by 0x49821A9: piglit_gl_test_run (piglit-framework-gl.c:229)
v2: - don't propagate trash to state->cut_index
(Kenneth Graunke <kenneth@whitecape.org>)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8409>
It is currently a bitset on top of a uint64_t but there are already
more than 64 values. Change to use BITSET to cover all the
SYSTEM_VALUE_MAX bits.
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585>
To remove PIPE_CAP checking in the common code.
It's better if drivers lower multi draws even if the hardware doesn't
support it beause the multi draw loop can be moved deeper into the driver
to remove more overhead.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7679>
Essentially rename multi_draw to draw_vbo and remove start and count
from pipe_draw_info.
This is only an interface change. It doesn't add multi draw support
anywhere.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
This removes 8 bytes from pipe_draw_info (think u_threaded_context)
and a lot of info->indirect pointer indirections.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
This removes some overhead from tc_draw_vbo and increases the maximum number
of draws per batch from 153 to 192 in u_threaded_context.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
On Gen12+, we can enable additional caches in certain usage situations.
This routes that decision making to a central place in ISL, based on
surface usage flags, and updates both drivers to use it. (i965 doesn't
need to change because it doesn't support Gen12.)
We continue handling the "external" decision via an anv_mocs() wrapper
for now, since we store that flag in anv_bo, which isl doesn't know
about. (We could introduce an ISL_SURF_USAGE_EXTERNAL, but I'm not
actually sure that would be cleaner.)
This patch should not have any functional nor performance effects, as
we continue selecting the exact same MOCS values for now.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
It's included in declaration of INTEL_DEBUG.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>
Whenever iris_predraw_resolve_inputs() ends up doing a flush or
invalidate, we really want it to be on the same batch which is going
to consume the result. Any resolves should still be performed from
the render batch thanks to the previous patch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
We're nearly out of dirty bits, and some patches pending review on
GitLab no longer apply due to that. Make room for them by splitting
off shader stage-specific bits into a separate stage_dirty mask.
An alternative would be to split compute-related bits into a separate
mask, but that would prevent the '<< stage' indexing done in various
parts of the driver from working.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>
Now that it uses ISL rather than genxml code, there's no need for it to
live as a vtable function inside the state module. We can just make it
a static inline helper in iris_resource.h so it's available throughout
the codebase.
Fixes: a4da6008b6 ("iris: Use mocs from isl_dev.")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3720>
This improves a couple of things:
1. We now only update anything if the shader actually cares.
Previously, is_indexed_draw was causing us to flag dirty vertex
buffers, elements, and SGVs every time the shader switched between
indexed and non-indexed draws. This is a very common situation,
but we only need that information if the shader uses gl_BaseVertex.
We were also flagging things when switching between indirect/direct
draws as well, and now we only bother if it matters.
2. We upload new draw parameters only when necessary.
When we detect that the draw parameters have changed, we upload a
new copy, and use that. Previously we were uploading it every time
the vertex buffers were dirty (for possibly unrelated reasons) and
the shader needed that info. Tying these together also makes the
code a bit easier to follow.
In Civilization VI's benchmark, this code was flagging dirty state
many times per frame (49 average, 16 median, 614 maximum). Now it
occurs exactly once for the entire run.
If we can entirely push uniform data, we don't need a SURFACE_STATE
descriptor for pulling data. Since constant uploads are a very common
operation, and being able to push all data is also very common, we would
like to avoid the overhead in this case.
This patch defers uploading new descriptors. Instead of handling that
at iris_set_constant_buffer, we do it at iris_update_compiled_shaders,
where we can see the currently bound shader variants. If any need pull
descriptors, and descriptors are missing, we update them and flag that
the binding table also needs to be refreshed.
Improves performance in GFXBench5 gl_driver2 on an i7-6770HQ by
31.9774% +/- 1.12947% (n=15).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
In a few cases, we switch to MI_MATH instead of MI_PREDICATE,
just because we were already doing math and it's easier to chain
together.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>