Reworks:
* Jordan: Check for cfg == NULL rather than is_dg1
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4956>
For texturing and draw calls, HW expects the clear color to be in two
different color spaces after sRGB fast-clears - sRGB in the former and
linear in the latter. Up until now, iris has stored the clear color in
the sRGB color space. Limit the allowable clear colors for sRGB
fast-clears to 0/1 so that both color space requirements are satisfied.
Makes iris pass the sRGB -> sRGB subtest of the fcc-write-after-clear
piglit test on gen9+.
v2:
* Drop iris_context::blend_enables. (Ken)
* Drop some more resolve-related blend-state-tracking code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4972>
Remove the CCS_D fallback logic so that iris doesn't attempt to use a
non-existent surface state for some renders. Also, add an assertion to
catch the issue.
The fallback in iris_resource_render_aux_usage can lead to this problem
because it doesn't account for the fact that surface states created from
resources with the Y_TILED_CCS modifier may only have CCS_E or NONE as
aux usages (due to iris_resource_create_with_modifiers).
Without this change, the next commit would have triggered the fallback
and regressed the following tests on gen9:
* dEQP-EGL.functional.wide_color.window_888_colorspace_srgb
* dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb
* dEQP-EGL.functional.wide_color.pbuffer_888_colorspace_srgb
* dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4972>
Even though the clear color BO is bound as a read-only buffer, report
the same caching domain as the main BO in use_surface() (typically
IRIS_DOMAIN_RENDER_WRITE) in order to avoid ping-ponging back and
forth between IRIS_DOMAIN_RENDER_WRITE and IRIS_DOMAIN_OTHER_READ,
which leads to increased stall-at-pixel-scoreboard synchronization
between draw calls.
Fixes a 5%-10% FPS regression in some benchmarks spotted on ICL.
Reported-by: Clayton Craft <clayton.a.craft@intel.com>
Fixes: eb5d1c2722 "iris: Annotate all BO uses with domain and sequence number information."
Closes: #3097
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5411>
This introduces a batch synchronization boundary at every PIPE_CONTROL
command, and updates the cache coherency status tracked during batch
construction according to the specified control bits.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
This is the main performance trade-off of this cache tracking
mechanism: In order for the seqno vector of buffer objects to be
accurate, they need to be marked as used again every time the batch is
split into a new synchronization section if they remain bound to the
pipeline. This can be achieved easily by re-using
iris_restore_render_saved_bos() and iris_restore_compute_saved_bos(),
which currently serve a similar purpose across batch buffer
boundaries.
The impact on Piglit drawoverhead results seems to be within a
standard deviation of the current results.
XXX - It might be possible to completely remove the current
iris_batch::contains_draw flag at a small additional performance
cost.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
Probably the most annoying patch to review from the whole series --
Mark every buffer object use as accessed through some caching domain
with the sequence number of the current synchronization section of the
batch. The additional argument of iris_use_pinned_bo() makes sure I'd
have gotten a compile error if I had missed any buffer added to the
batch validation list.
There are only a few exceptions where a buffer is left untracked while
adding it to the validation list, justified below:
- Batch buffers: These are strictly read-only for the moment.
- BLORP buffer objects: Their seqnos are bumped manually at the end
of iris_blorp_exec() instead, in order to avoid plumbing domain
information through BLORP address combining.
- Scratch buffers: The contents of these are strictly thread-local.
- Shader images and SSBOs: Accesses of these buffers are explicitly
synchronized at the API level.
v2: Opt out of tracking more aggressively (Ken): In addition to the
above, surface states, binding tables, instructions and most
dynamic states are now left untracked, which means a *lot* more BO
uses marked IRIS_DOMAIN_NONE which need to be reviewed extremely
carefully, since the cache tracker won't be able to provide any
coherency guarantees for them.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
This delimits all batch operations which access memory between
iris_batch_sync_region_start() and iris_batch_sync_region_end() calls.
This makes sure that any buffer objects accessed within the region are
considered in use through the same caching domain until the end of the
region.
Adding any buffer to the batch validation list outside of a sync
region will lead to an assertion failure in a future commit, unless
the caller explicitly opted out of the cache tracking mechanism.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
We're nearly out of dirty bits, and some patches pending review on
GitLab no longer apply due to that. Make room for them by splitting
off shader stage-specific bits into a separate stage_dirty mask.
An alternative would be to split compute-related bits into a separate
mask, but that would prevent the '<< stage' indexing done in various
parts of the driver from working.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>
This makes iris_batch_prepare_noop() return a boolean instead of
passing through the relevant set of dirty flags. It will make it
easier to change the representation of dirty flags.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>
There's an update for INTERFACE_DESCRIPTOR_DATA at dispatch, so we can
just move the KSP assignment there. This flexibility will later allow
variable group size to pick the right SIMD variant.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5142>
Add new IRIS_DIRTY_STENCIL_REF dirty flag which would help us to trigger
separate 3DSTATE_WM_DEPTH_STENCIL packet using modify disable fields.
Instead of merging two packets into one in order to build
3DSTATE_WM_DEPTH_STENCIL state, set_stencil_ref can use
IRIS_DIRTY_STENCIL_REF bit and bind_zsa_state can use
IRIS_DIRTY_WN_DEPTH_STENCIL, both could cause packet to happen with
available information using modify disable bits which allow us to
construct packet by ignoring set of fields.
v2: (Kenneth Graunke)
- Fix condition ordering.
- Club GEN cases.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3688>
gen12 does away with the single patch dispatch mode for tcs, and
increases some limits so that 8_patch mode can always work. Make the
necessary changes so we don't try to fall back to single patch mode.
Fixes KHR-GL46.tessellation_shader.single.max_patch_vertices and others
Fixes: 44754279ac ("intel/fs/gen12: Use TCS 8_PATCH mode.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4843>
This is the wrong data type, the original field - and the values we're
adding in - are both 64-bit unsigned. Keep the original data type.
Thanks to Dave Airlie for finding this while reading the code.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4802>
This is a preparation for dropping this field since this value is
expected to be calculated by the drivers now for variable group size
case. And also the field would get in the way of brw_compile_cs
producing multiple SIMD variants (like FS).
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
The push.total field had three values but only one was directly
used (size). Replace it with a helper function that explicitly takes
the cs_prog_data and the number of threads -- and use that in the
drivers.
This is a preparation for ARB_compute_variable_group_size where the
number of threads (hence the total size for push constants) is not
defined at compile time (not cs_prog_data->threads).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
Patch changes surface state setup to alloc/fill states for all possible
usages for image resource on gen12. Also predraw and binding table
population is changed to determine correct aux usage with the new
iris_image_view_aux_usage.
v2: alloc always all states independent on current image
aux state on gen >= 12 , code cleanups (Nanley)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>
Patch adds a helper function for determining image format and changes
iris_set_shader_images to use it.
v2: pass iris_context instead of pipe one, rename function,
code cleanup (Nanley)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>
Harvest the information gathered in the previous patch
inside of iris.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
GEN:BUG:14010455700 (lineage 1808121037):
"To avoid sporadic corruptions “Set 0x7010[9] when Depth Buffer
Surface Format is D16_UNORM , surface type is not NULL & 1X_MSAA"
Required for fixing ttps://gitlab.freedesktop.org/mesa/mesa/issues/2501.
GEN:BUG:1806527549:
"Set HIZ_CHICKEN (7018h) bit 13 = 1 when depth buffer is D16_UNORM."
This one could fix a GPU hang in some workloads.
v2: Implement WA in isl and add another similar WA (Jason).
v3: Add flushes before changing chicken registers (Jason)
v4: Depth flush and stall + end of pipe sync when changing registers
(Jason).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3801>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3801>
Even though the workaround description says:
"all the listed commands are non-pipelined and hence flush caused due
to pipeline mode change must not cause performance issues..."
My understanding is that we still need to have the flushes. Also, the
flushes are required not only to stall the pipeline, but also to clear
caches, so I don't think they can simply be discarded.
Additionally, while doing some testing that increased the number of
surface STATE_BASE_ADDRESS emitted, I got a lot more GPU hangs. Adding
these flushes fixes those hangs.
Fixes: b8fbb39a (iris: Implement Gen12 workaround for non pipelined
state)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3908>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3908>
Now that it uses ISL rather than genxml code, there's no need for it to
live as a vtable function inside the state module. We can just make it
a static inline helper in iris_resource.h so it's available throughout
the codebase.
Fixes: a4da6008b6 ("iris: Use mocs from isl_dev.")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3720>
Like Skylake, Gen12 requires a workaround for PIPE_CONTROLs using a
post-sync operation.
v2: Restrict to A0
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3405>
We only calculate them based on device info and never change them so
this seems like a reasonable place to put them. We could also put them
in the context, but that's not accessible from iris_init_*_context.
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
SML is no longer in the L3$ on Gen11+. It's not incredibly clear from
the docs but no Gen11 platforms are in the list of platforms on which
this bit exists. Also, we've been always setting it false on Gen11 in
ANV and i965 thanks to GEN_L3P_SLM being zero with no ill effects.
Cc: "20.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
According to the BSpec, this should prevent hangs when using shaders
with large URB entries. A more precise fix can be done but it requires
re-arranging URB setup.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3454>
Before flushing the instruction cache with a pipe control, we need to
use a CS Stall pipe control.
Ref: GEN:BUG:1409226450
Rework: Add stall-at-scoreboard (Lionel)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3457>