Commit graph

806 commits

Author SHA1 Message Date
Kenneth Graunke
1b1c857248 iris: Make various classes inherit from u_threaded_context base classes
u_threaded_context requires various objects to inherit from a new
threaded_foo base class rather than directly from pipe_foo.  This
patch does most of the mechanical changes required for that.

It also initializes the new threaded_resource fields.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
2021-03-04 13:59:21 -08:00
Kenneth Graunke
ec0d61c14c iris: Support rebinding of stream output targets
This enables us to replace the backing storage of resources that have
been used as stream output targets, in case we're invalidating their
entire contents.  This can avoid stalls.  We simply hadn't supported it
because it was going to be tricky to re-emit 3DSTATE_SO_BUFFER without
screwing up "reset offset to zero" vs. "keep appending".  But that
should be working fine with the previous patch's refactor.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
2021-03-04 13:59:21 -08:00
Kenneth Graunke
08e04ddd2c iris: Rework zeroing of stream output buffer offsets
The previous mechanism was a bit fragile.  We stored the zero offset
in the pre-baked packet, and used an flag to override 0xFFFFFFFF
(append) offsets until our first emit - then prohibited anyone from
trying to re-emit the packet by flagging IRIS_DIRTY_SO_BUFFERS,
because that would re-emit the version with the zeroing of the offset.

Now, we always store 0xFFFFFFFF in the pre-baked packet, and use a
flag to override it to zero on the first emit.  That way, we can
re-emit that packet at any time, and it'll just keep appending.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
2021-03-04 13:59:21 -08:00
Kenneth Graunke
e40fafa991 iris: Defer stream output target space allocation until set time
In the future, Marek is planning to make u_threaded_context call
create_stream_output_target() from a different thread than the main
driver thread, which means that we can't safely use uploaders there.

To prepare for this eventual future, just defer the allocation of
the offset BO 'til later.  It's a very small amount of overhead.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
2021-03-04 13:59:20 -08:00
Kenneth Graunke
5659460af4 iris: Defer uploading of surface states
With u_threaded_context, create_surface and create_sampler_view will
be called from a different thread than the driver thread.  They aren't
allowed to access the context, which means that they can't use the
uploaders there to upload our SURFACE_STATE entries.

Thanks to backing-storage replacement and iris_rebind_buffer, we already
reworked things to maintain CPU-side copies of the SURFACE_STATE entries
and added the ability to upload or re-upload them later.  So we can skip
the upload at object creation time, and add a simple resource-is-NULL
check at binding table upload time to ensure that they get uploaded by
the time we need them.  (They might get uploaded earlier due to rebinds
or clear color updates, but this is the last moment to do so.)

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
2021-03-04 13:59:20 -08:00
Jason Ekstrand
1e53e0d2c7 intel/mi_builder: Drop the gen_ prefix
mi_ is already a unique prefix in Mesa so the gen_ isn't really gaining
us anything except extra characters.  It's possible that MI_ may
conflict a tiny bit with GenXML but it doesn't seem to be a problem
today and we can deal with that in the future if it's ever an issue.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9393>
2021-03-04 15:14:27 +00:00
Jordan Justen
d846901d9d intel/dev: Add devinfo genx10 field
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9329>
2021-03-01 22:00:08 -08:00
Jordan Justen
36dd7c44f6 intel: Use GEN_VERSIONx10 in more places
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9329>
2021-03-01 22:00:08 -08:00
Kenneth Graunke
b9133e48a6 iris: Pin surface state buffers after possibly updating the clear color
On Gen8, updating the clear color will end up allocating new
SURFACE_STATE entries.  These might end up living in a different BO
than the original copies, which means that we have to pin _after_
updating the clear color, not before.

Found by inspection.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9257>
2021-02-24 18:32:29 +00:00
Francisco Jerez
17add74dec iris/gen12: Implement programming of pixel pipe hashing tables.
Straightforward by using the pixel hashing table computation helper
previously introduced, assuming we know the fraction of work that
needs to be submitted to each pixel pipe.  Note that AFAIA the
hardware maps indices in the table to pixel pipes from largest to
smallest, so it shouldn't be necessary to permute indices based on the
physical IDs of the pixel pipes as we are doing on Gen11.

Improves performance of most non-trivial graphics workloads I've tried
on an 80 EU TGL.  E.g. the following testcases improve performance
significantly with sample size 27 and statistical significance 1%:

  gputest/pixmark_piano:      62.89% ±0.10%
  gputest/pixmark_volplosion: 61.51% ±0.06%
  unigine/valley:             26.72% ±0.25%
  gfxbench/gl_5_high:         24.70% ±0.19%
  unigine/heaven:             23.54% ±0.17%
  steam/csgo:                 22.75% ±4.36%
  gfxbench/gl_manhattan31:    22.43% ±0.29%
  gfxbench/gl_4:              20.92% ±0.35%
  warsow/benchsow:            19.15% ±2.53%
  gfxbench/gl_trex_off:       18.84% ±0.27%

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
2021-02-23 21:15:25 -08:00
Francisco Jerez
f9bcdc5bc7 iris/gen11+: Calculate pixel hashing tables instead of hardcoding.
Pixel hashing tables are a pain to type in, review and maintain IMHO.
In order to obtain satisfactory load balancing on all Gen12 parts
currently in production this series would need to add 5 different
additional tables.  Instead this introduces a simple algorithm able to
calculate a table on the fly based on a handful of parameters.

Note that the Gen11 tables generated with this algorithm are not
identical to the hardcoded ones, however the only difference should be
a phase shift that isn't expected to have any effect on performance,
since it shouldn't change the fraction of work submitted to each pixel
pipe.

The CPU overhead from this change is negligible since the tables only
need to be programmed once at context init time.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
2021-02-23 21:15:16 -08:00
Kenneth Graunke
4c4a91abe5 iris: Reference the shader variant for last_vue_map as well
We call update_last_vue_map after updating the shaders, which compares
the new and old VUE maps.  Except...updating the shaders may have
dropped the last reference to the variant that ice->shaders.last_vue_map
belonged to, leading to a classic use-after-free.

Fix this by taking a reference to the variant for the last VUE stage,
so it stays around until we're done with it.

Fixes: 1afed51445 ("iris: Store a list of shader variants in the shader itself")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4311
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9143>
2021-02-19 18:49:19 +00:00
Lionel Landwerlin
207ee2b6a9 isl: add external parameter to isl_mocs()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9127>
2021-02-18 11:20:59 +02:00
Kenneth Graunke
e7dc48c309 iris: Make a pin_scratch_space() helper
We need to (re-)pin the scratch buffer in four different places, and
it's going to get slightly more complicated on future platforms.  So,
make a helper function, allowing us to add the complexity in one spot.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9023>
2021-02-12 21:14:26 +00:00
Kenneth Graunke
4256f7ed58 iris: Fill out scratch base address dynamically
Now that shaders are shared between contexts, we can't pre-bake the
shader scratch address into the derived 3DSTATE_XS packets.  Scratch
buffers are and must be per-context, as multiple contexts could be
executing shaders using scratch at the same time.

So instead, we leave that field blank when pre-filling those packets
up-front, and merge in the actual address when emitting them.  It's
a little more overhead, but only in the case where scratch is used.

Fixes: 84a38ec133 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922>
2021-02-11 20:51:18 +00:00
Nanley Chery
0079b8543a iris: Fix aux usage of depth buffer prepare/finish
Prepare/finish a framebuffer's depth buffer with the aux usage that's
appropriate for the given miplevel instead of wrongly assuming that
compression is always enabled. Enables code simplifications later on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8853>
2021-02-10 20:48:01 +00:00
Nanley Chery
a0908d0c91 iris: Drop a stale comment about HiZ sampling
With commit 7339660e80, this comment is no
longer needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8853>
2021-02-10 20:48:01 +00:00
Kenneth Graunke
f7d4ebbf86 iris: add hooks to call INTEL_MEASURE
These hooks were written in the initial IRIS_MEASURE implementation.
Minor changes by Mark Janes <markjanes@swizzler.org> to adapt to the
INTEL_MEASURE reimplementation.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>
2021-02-01 17:24:57 -08:00
Mark Janes
b338bb70e0 iris: add a iris_context reference to iris_batch
This eliminates the need to use container_of in error handling code.
INTEL_MEASURE will need to access the iris context from each batch.

suggested-by: Kenneth Graunke <kenneth@whitecape.org>

Acked-by:     Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>
2021-02-01 17:24:57 -08:00
Kenneth Graunke
9d63547f2f iris: Properly handle new unbind_num_trailing_slots parameters
Commits 0278d1fa323cf1f289..b688ea31fcf7e20436 added a new parameter
to set_vertex_buffers(), set_shader_images(), and set_sampler_views()
which specifies a number of trailing slots to unbind.  They updated
the iris functions to do the unbinding, but didn't update the code
to mark which things are bound in the bitfields.  This meant that
later code would assume those unbound slots were bound, and crash
on a NULL dereference.  All that's needed is to add that slot count
when unbinding things in the bitfield.

Fixes: 0278d1fa32 ("gallium: add unbind_num_trailing_slots to set_vertex_buffers")
Fixes: 72ff66c3d7 ("gallium: add unbind_num_trailing_slots to set_shader_images")
Fixes: b688ea31fc ("gallium: add unbind_num_trailing_slots to set_sampler_views")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8758>
2021-01-28 00:54:22 -08:00
Marek Olšák
27dcb46629 gallium: add take_ownership param into set_vertex_buffers to eliminate atomics
There are a few places (mainly u_threaded_context) that do:
   set_vertex_buffers(...);
   for (i = 0; i < count; i++)
      pipe_resource_reference(&buffers[i].resource.buffer, NULL);

set_vertex_buffers increments the reference counts while the loop
decrements them.

This commit eliminates those reference count changes by adding a parameter
into set_vertex_buffers that tells the callee to accept all buffers
without incrementing the reference counts.

AMD Zen benefits from this because it has slow atomics if they come from
different CCXs.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
2021-01-27 23:53:35 +00:00
Marek Olšák
b688ea31fc gallium: add unbind_num_trailing_slots to set_sampler_views
Instead of calling this functions again to unbind trailing slots,
extend it to do it when binding. This reduces CPU overhead.

A lot of drivers ignore "start" and always unbind all slots after "count".
Such drivers don't need any changes here.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
2021-01-27 23:53:35 +00:00
Marek Olšák
72ff66c3d7 gallium: add unbind_num_trailing_slots to set_shader_images
Instead of calling this function again to unbind trailing slots,
extend it to do it when images are being set. This reduces CPU overhead.
Only st/mesa benefits.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
2021-01-27 23:53:34 +00:00
Marek Olšák
0278d1fa32 gallium: add unbind_num_trailing_slots to set_vertex_buffers
Instead of calling this functions again to unbind trailing slots,
extend it to do it as part of the call that sets vertex buffers.
This reduces CPU overhead. Only st/mesa benefits from this.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
2021-01-27 23:53:34 +00:00
Marek Olšák
a51d4b10f1 gallium: add take_ownership param into set_constant_buffer to eliminate atomics
We often do this:
    pipe->set_constant_buffer(pipe, shader, slot, &cb);
    pipe_resource_reference(&cb->buffer, NULL);

That results in atomic increment in set_constant_buffer followed by
atomic decrement after set_constant_buffer. This new interface
eliminates those atomics.

For the case above, this should be used instead:
    pipe->set_constant_buffer(pipe, shader, slot, true, &cb);
    cb->buffer = NULL; // if cb is not a local variable, else do nothing

AMD Zen benefits from this. The perf improvement is ~3% for Viewperf13/Catia.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
2021-01-27 23:53:34 +00:00
Kenneth Graunke
939bc0c588 iris: Reconfigure the URB only if it's necessary or possibly useful
Reconfiguring the URB partitioning is likely to cause shader stalls,
as the dividing line between each stage's section of memory is moving.
(Technically, 3DSTATE_URB_* are pipelined commands, but that mostly
means that the command streamer doesn't need to stall.)  So it should
be beneficial to update the URB configuration less often.

If the previous URB configuration already has enough space for our
current shader's needs, we can just continue using it, assuming we
are able to allocate the maximum number of URB entries per stage.
However, if we ran out of URB space and had to limit the number of
URB entrties for a stage, and the per-entry size is larger than we
need, we should reconfigure it to try and improve concurrency.

So, we begin tracking the last URB configuration in the context,
and compare against that when updating shader variants.

Cuts 36% of the URB reconfigurations (excluding BLORP) from a
Shadow of Mordor trace, and 46% from a GFXBench Manhattan 3.0 trace.

One nice thing is that this removes the need to look at the old
prog_data when updating shaders, which should make it possible to
unbind shader variants without causing spurious URB updates.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8721>
2021-01-27 18:30:54 +00:00
Kenneth Graunke
a710145b5b intel: Produce a "constrained" output from gen_get_urb_config()
When calculating a URB configuration, we start with a notion of how
much space each stage /wants/ (to achieve the maximum amount of
concurrency), but sometimes fall back to giving it less than that,
because we don't have enough space.  (Typically, this happens when
the per-stage size is large, or there are many stages, or both.)

We now output a "constrained" boolean which is true if we weren't
able to satisfy all the "wants" due to a lack of space.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8721>
2021-01-27 18:30:54 +00:00
Yevhenii Kolesnikov
0c08a66ce5 iris: only set point sprite overrides if actually using points
Fixes black screen in some FNA games.

Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3431
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7218>
2021-01-14 18:36:15 +00:00
Jason Ekstrand
f4902bb189 intel/genxml,anv,iris: Drop the legacy compute path from gen125.xml
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8342>
2021-01-13 13:10:28 -08:00
Jordan Justen
32857a6350 iris: Add support for COMPUTE_WALKER
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8342>
2021-01-13 13:10:28 -08:00
Marek Olšák
0cf5d1f226 gallium: remove PIPE_CAP_INFO_START_WITH_USER_INDICES and fix all drivers
Drivers aren't allowed to ignore start with user index buffers anymore.
This is required by the new fast path where mesa/main is using pipe_draw_info.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7679>
2021-01-04 19:22:34 -05:00
Marek Olšák
f2e281c231 iris: don't use index_bias if not indexed
index_bias is undefined if index_size == 0.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7679>
2021-01-04 19:22:33 -05:00
Marek Olšák
912ba743b5 gallium: inline pipe_depth_state to decrease DSA state size by 4 bytes
Depth and alpha states are now packed together, interleaved somewhat.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Marek Olšák
d0534cea7f gallium: inline pipe_alpha_state to enable better DSA bitfield packing
pipe_alpha_state and pipe_depth_state will be packed together
because they have only a few bitfields each. This will eventually
remove 4 bytes of padding in pipe_depth_stencil_alpha_state.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Marek Olšák
b7f12a0452 gallium: pass pipe_stencil_ref by value (it has only 2 bytes)
This changes pipe_context::set_stencil_ref to pass the parameter by value.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7940>
2020-12-22 12:01:38 +00:00
Rob Clark
790144e65a util+treewide: container_of() cleanup
Replace mesa's slightly different container_of() with one more aligned
to the linux kernel's version which takes a type as the 2nd param.  This
avoids warnings like:

  freedreno_context.c:396:44: warning: variable 'batch' is uninitialized when used within its own initialization [-Wuninitialized]

At the same time, we can add additional build-time type-checking asserts

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7941>
2020-12-10 16:48:36 +00:00
Marek Olšák
1cd455b17b gallium: extend draw_vbo to support multi draws
Essentially rename multi_draw to draw_vbo and remove start and count
from pipe_draw_info.

This is only an interface change. It doesn't add multi draw support
anywhere.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
2020-11-18 01:41:25 +00:00
Marek Olšák
1a717dca04 gallium: move count_from_stream_output into pipe_draw_indirect_info
This removes some overhead from tc_draw_vbo and increases the maximum number
of draws per batch from 153 to 192 in u_threaded_context.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7441>
2020-11-18 01:41:24 +00:00
Tapani Pälli
460287adca iris: initialize shared screen->vtbl only once
Screen is shared among contexts, other context might be already using
vtbl while another initializes it again.

 ==45872== Possible data race during write of size 8 at 0x5DDAE78 by thread #549
 ==45872== Locks held: 1, at address 0x5D1B6F8
 ==45872==    at 0x6D66D91: gen9_init_state (iris_state.c:7816)
 ==45872==    by 0x6BA0A31: iris_create_context (iris_context.c:342)
 ==45872==    by 0x621F390: st_api_create_context (st_manager.c:917)
 ==45872==    by 0x620E6F9: dri_create_context (dri_context.c:163)
 ==45872==    by 0x6A40DB1: driCreateContextAttribs (dri_util.c:480)
 ==45872==    by 0x540B963: dri2_create_context (egl_dri2.c:1583)
 ==45872==    by 0x53FB84E: eglCreateContext (eglapi.c:821)
 ==45872==
 ==45872== This conflicts with a previous read of size 8 by thread #544
 ==45872== Locks held: 1, at address 0x5F6E0E0
 ==45872==    at 0x6CB779E: blorp_alloc_binding_table (iris_blorp.c:167)
 ==45872==    by 0x6CAEF70: blorp_emit_surface_states (blorp_genX_exec.h:1540)
 ==45872==    by 0x6CB67F9: blorp_exec (blorp_genX_exec.h:2016)
 ==45872==    by 0x6CB7AFE: iris_blorp_exec (iris_blorp.c:307)
 ==45872==    by 0x70F5916: try_blorp_blit (blorp_blit.c:2145)
 ==45872==    by 0x70F5FCA: do_blorp_blit (blorp_blit.c:2273)
 ==45872==    by 0x70F778F: blorp_copy (blorp_blit.c:2803)
 ==45872==    by 0x6BB9EB6: iris_copy_region (iris_blit.c:725)

v2: move as genX(init_screen_state) (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7544>
2020-11-16 05:53:20 +00:00
Anuj Phogat
3c4e43e72b intel: Pointer to SCISSOR_RECT array should be 64B aligned
v2: Apply the workaround to all gen hardawre

Ref: GEN:BUG:1409725701
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7463>
2020-11-09 21:29:04 +00:00
Jason Ekstrand
cdc546ae7f iris: Flush caches based on brw_compiler::indirect_ubos_use_sampler
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7230>
2020-10-20 19:54:29 +00:00
Kenneth Graunke
02fe825a61 isl, anv, iris: Add a centralized helper to select MOCS based on usage
On Gen12+, we can enable additional caches in certain usage situations.
This routes that decision making to a central place in ISL, based on
surface usage flags, and updates both drivers to use it.  (i965 doesn't
need to change because it doesn't support Gen12.)

We continue handling the "external" decision via an anv_mocs() wrapper
for now, since we store that flag in anv_bo, which isl doesn't know
about.  (We could introduce an ISL_SURF_USAGE_EXTERNAL, but I'm not
actually sure that would be cleaner.)

This patch should not have any functional nor performance effects, as
we continue selecting the exact same MOCS values for now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
2020-10-19 19:18:11 +00:00
Kenneth Graunke
71ed8c5aa6 iris: Fix doubling of shared local memory (SLM) sizes.
Commit 67ee9c5f55 added support for
using the `pipe_compute_state::req_local_mem` field, because Clover
can have a run-time specified size that isn't baked into the shaders.

However, it started adding the static size from the shader to the
dynamic state-supplied size.  The Mesa state tracker fills out
req_local_mem to prog->Base.info.cs.shared_size, which is exactly
what we fill out prog_data->total_shared to be.  Effectively, this
meant that we double-counted the same SLM requirements, doubling
our space requirements.

Fixes a 10% performance regression in Synmark2's OglCSDof test.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>
2020-10-14 23:13:41 +00:00
Jason Ekstrand
9df9f940f0 iris: Add support for load_work_dim as a system value
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047>
2020-10-07 16:01:31 -05:00
Jason Ekstrand
67ee9c5f55 iris: Handle runtime-specified local memory size
The value specified in pipe_compute_state is in addition to the implicit
value computed by NIR.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047>
2020-10-07 16:01:31 -05:00
Anuj Phogat
545d852a7a intel/gen9: Enable MSC RAW Hazard Avoidance
Workaround # 22011374674
Applied to i965, iris and anv drivers
No performance impact is observed with WA.

Cc: mesa-stable
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-10-01 16:57:50 +00:00
Jordan Justen
20a4235c4c anv, iris: Set MediaSamplerDOPClockGateEnable for gen12+
This has been shown to help performance on TGL and DG1. This could be
applied to gen9+, but we still need to show if it helps with those
platforms.

Rework:
 * Make change in src/intel/vulkan/genX_cmd_buffer.c too. (Ken)
 * Keep mask as 3 for gen < 12

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6684>
2020-09-11 17:40:03 -07:00
Jason Ekstrand
bbaa62e4e1 iris: Re-emit push constants if we have a varying workgroup size
Fixes: 33c61eb2f1 "iris: Implement ARB_compute_variable_group_size"

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6570>
2020-09-02 20:38:22 +00:00
Jason Ekstrand
536727c465 iris: Patch constant data pointers into shaders
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>
2020-09-02 19:48:44 +00:00
Jason Ekstrand
63dd1e980c iris: Always re-upload sysvals when we have kernel inputs
They can change on every dispatch and clover never gives us a heads up.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>
2020-08-21 22:49:54 +00:00