461 commits

Author SHA1 Message Date
Kenneth Graunke
10560f8506 iris: Minor tidying 2019-07-03 22:24:44 -07:00
Anuj Phogat
d96cba7754 Revert "iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch"
SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace.
This patch silences a simulator warning about it.

We don't need to add this workaround in linux kernel as the WA description
says it's fixed on latest stepping.

This reverts commit 9c421d6b47.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-28 14:02:13 -07:00
Kenneth Graunke
847ef8ee4f iris: Don't leak resources in iris_create_surface for incomplete FBOs
We were failing to pipe_resource_unreference on the failure path due
to a non-renderable format.  Instead of fixing this, just move the
checks earlier, before we even bother with refcounting or calloc.
2019-06-28 01:13:11 -07:00
Kenneth Graunke
bed305fb7a iris: Fix major resource leak in iris_set_shader_images
We were failing to unreference the old image resource.  Instead of open
coding this and doing it badly, just use the copier function which does
the right thing.
2019-06-27 19:08:46 -07:00
Nanley Chery
fb1350c76f intel: Add and use helpers for level0 extent
Prepare for a bug fix by adding and using helpers which convert
isl_surf::logical_level0_px and isl_surf::phys_level0_sa to units of
surface elements.

v2:
- Update iris (Ken).
- Update anv.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-27 23:38:37 +00:00
Kenneth Graunke
3d3685d354 iris: Fix memory leak of SO targets
We need to pitch these on context destroy.
2019-06-27 14:59:39 -07:00
Kenneth Graunke
d65819f054 iris: Fix memory leak for draw parameter resources
Need to pitch these on context destroy.
2019-06-27 14:59:39 -07:00
Kenneth Graunke
50eb1c1396 iris: Drop u_upload_unmap
We use persistent maps so this does nothing.
2019-06-27 14:59:39 -07:00
Kenneth Graunke
d6683e118f iris: Also properly restore INTERFACE_DESCRIPTOR_DATA buffer object
We were at least cleaning up this reference, but we were failing to
pin it in iris_restore_compute_saved_bos.
2019-06-27 08:12:22 -07:00
Kenneth Graunke
340df53d6a iris: Fix resource tracking for CS thread ID buffer
Today, we stream the compute shader thread IDs simply because they're
(annoyingly) relative to dynamic state base address.  We could upload
them once at compile time, but we'd need a separate non-streaming
uploader for IRIS_MEMZONE_DYNAMIC, and I'm not sure it's worth it.

stream_state pins the buffer for use in the current batch, but also
returns a reference to the pipe_resource.  We dropped this reference
on the floor, leaking a reference basically every time we dispatched
a compute shader after switching to a new one.

The reason it returns a reference is so that we can hold on to it and
re-pin it in iris_restore_compute_saved_bos, which we were also failing
to do.  So if we actually filled up a batch with repeated dispatches to
the same compute shader, and flushed, then continued dispatching, we
would fail to pin it and likely GPU hang.
2019-06-27 08:12:22 -07:00
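The leak described in 340df53d6a comes down to reference ownership: a helper hands back a counted reference, and the caller has to either keep it or release it. Below is a minimal sketch of that pattern with hypothetical names (`struct res`, `stream_ref`, `res_take`); it is not the driver's code, and a real implementation would destroy the object when the count reaches zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted resource. */
struct res {
   int refcount;
};

/* Models an uploader that returns a new reference to the caller. */
static inline struct res *
stream_ref(struct res *r)
{
   r->refcount++;
   return r;
}

/* Store an owned reference in *slot, releasing whatever was saved before.
 * Dropping the old reference here is the step the leak was missing.
 */
static inline void
res_take(struct res **slot, struct res *new_ref)
{
   if (*slot)
      (*slot)->refcount--;   /* a real driver would free at zero */
   *slot = new_ref;
}
```

Keeping the reference in a saved slot also gives later code something to re-pin after a batch flush, which is the second half of the fix described above.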
Kenneth Graunke
16d334951e iris: Only bother with thread ID upload if doing MEDIA_CURBE_LOAD
We were unconditionally uploading the new data, but then conditionally
using it with MEDIA_CURBE_LOAD.  If we're not going to emit the command,
there's no point in uploading the data.
2019-06-27 08:12:22 -07:00
Kenneth Graunke
8f51f1ba6e iris: Do MEDIA_CURBE_LOAD when IRIS_DIRTY_CS is set, not constants
We only push the compute shader thread IDs, not any actual constant
buffer data.  So we should track the compute shader variant changing,
not constbuf changes.
2019-06-27 08:12:22 -07:00
Kenneth Graunke
85c72da1b1 iris: Drop UBO range stuff from iris_restore_compute_saved_bos
Compute doesn't use UBO ranges (annoyingly), so this is dead code.
2019-06-27 08:12:22 -07:00
Kenneth Graunke
f94ebf0c9d iris: Properly align interface descriptor data addresses
MEDIA_INTERFACE_DESCRIPTOR's Interface Descriptor Data Start Address
field's docs say: "This bit specifies the 64-byte aligned address..."

And we were doing 32.  Superfluous thread ID uploading was apparently
saving us from GPU hangs in most cases.
2019-06-27 08:12:22 -07:00
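For the alignment fix in f94ebf0c9d, the arithmetic is a round-up to a power-of-two boundary. The sketch below uses a hypothetical helper name (`align_u32`) and a hypothetical constant for the 64-byte requirement quoted from the docs; it only illustrates the calculation, not where the driver applies it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the 64-byte requirement quoted above. */
#define IDD_ALIGNMENT 64u

/* Round value up to the next multiple of alignment (a power of two). */
static inline uint32_t
align_u32(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

An offset of 32 satisfies a 32-byte alignment but not a 64-byte one; `align_u32(32, IDD_ALIGNMENT)` moves it up to 64.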
Timur Kristóf
3b6d787e40 iris: move sysvals to their own constant buffer
This commit moves the sysvals to a separate, new constant buffer
at the end (before the shader constants). It also allows us to
remove the special handling we had for cbuf0, and enables all
constant buffers to support user-specified resources and user
buffers.

v2: (by Kenneth Graunke)
- Rebase on the previous patch to fix system value uploading.
- Fix disk cache num_cbufs calculation
- Fix passthrough TCS to report num_cbufs = 1 so upload actually occurs
- Change upload_sysvals to assert that num_cbufs > 0 when
  num_system_values > 0.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-23 18:33:23 +02:00
Kenneth Graunke
ebc8c20b3e iris: Mark cbuf0 as not needing uploading every single time
I neglected to mark cbuf0_needs_upload = false after uploading it.
The obvious fix regressed user clip plane tests, because of a second
bug: we also forgot to mark that they may need re-uploading when
changing shader programs (which may have more or less system values).

Thanks to Timur Kristóf for catching the original issue.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
2019-06-23 18:32:11 +02:00
Jason Ekstrand
13f0c278c5 i965,iris: Move guardband calculations to a common location
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-21 14:18:59 +00:00
Kenneth Graunke
31de802e7e iris: Use stream uploader for shader draw parameters.
Most vertex data lives in user VBOs in IRIS_MEMZONE_OTHER, which
typically have high bits set to 0xffff.  The shader draw parameters were
being uploaded in IRIS_MEMZONE_DYNAMIC, which has high bits set to 0x2.
This was causing a lot of ping-ponging of high bits, leading to
unnecessary VF cache flushing.

Cuts 7.2% of the flushes in the Civilization VI demo on Kabylake GT2.
2019-06-20 13:32:16 -05:00
Kenneth Graunke
d4a4384b31 iris: Implement INTEL_DEBUG=pc for pipe control logging.
This prints a log of every PIPE_CONTROL flush we emit, noting which bits
were set, and also the reason for the flush.  That way we can see which
are caused by hardware workarounds, render-to-texture, buffer updates,
and so on.  It should make it easier to determine whether we're doing
too many flushes and why.
2019-06-20 13:32:15 -05:00
Caio Marcelo de Oliveira Filho
f346b277d1 iris: Create binding table slot for num_work_groups only when needed
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-11 17:57:37 -07:00
Caio Marcelo de Oliveira Filho
045aeccf0e iris: Always reserve binding table space for NIR constants
Don't have a separate mechanism for NIR constants to be removed from
the table.  If unused, we will compact it away.  The use_null_surface
is needed when INTEL_DISABLE_COMPACT_BINDING_TABLE is set.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho
97cd865be2 iris: Compact binding tables
Change the iris_binding_table to keep track of what surfaces are
actually going to be used, then assign binding table indices just for
those.  Reducing unused bytes in those tables is valuable because we use
a reduced space for them in Iris.

The rest of the driver can go from "group indices" (i.e. UBO #2) to
BTI and vice-versa using helper functions.  The value
IRIS_SURFACE_NOT_USED is returned to indicate a certain group index is
not used or a certain BTI is not valid.

The environment variable INTEL_DISABLE_COMPACT_BINDING_TABLE can be
set to skip compacting the binding table.

v2: (all from Ken)
    Use BITFIELD64_MASK helper. Improve comments.
    Assert all group is marked as used when we have indirects.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
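One way to picture the compaction in 97cd865be2: each surface group keeps a bitmask of the entries that are actually used, and the binding table index of an entry is the group's starting offset plus the number of used entries before it. The sketch below is an assumption-laden illustration; `struct bt_group`, `SURFACE_NOT_USED` and `group_index_to_bti` are hypothetical names, not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel for "this group index has no binding table slot". */
#define SURFACE_NOT_USED UINT32_MAX

/* Hypothetical per-group bookkeeping. */
struct bt_group {
   uint32_t offset;     /* first binding table index of the group */
   uint64_t used_mask;  /* bit i set when group index i is used */
};

/* Count the set bits in v. */
static inline uint32_t
count_bits64(uint64_t v)
{
   uint32_t n = 0;
   for (; v != 0; v &= v - 1)
      n++;
   return n;
}

/* Map a group index (e.g. UBO #2) to its compacted binding table index. */
static inline uint32_t
group_index_to_bti(const struct bt_group *g, uint32_t index)
{
   if (index >= 64)
      return SURFACE_NOT_USED;

   uint64_t bit = UINT64_C(1) << index;
   if (!(g->used_mask & bit))
      return SURFACE_NOT_USED;

   /* Only used entries below this one occupy slots ahead of it. */
   return g->offset + count_bits64(g->used_mask & (bit - 1));
}
```

With entries 1, 2 and 4 used in a group starting at offset 2, the three surfaces land in consecutive slots 2, 3 and 4, and the unused entries take no space.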
Caio Marcelo de Oliveira Filho
79f1529ae0 iris: Create an enum for the surface groups
This will make it convenient to handle compacting and printing the
binding table.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho
1c8ea8b300 iris: Handle binding table in the driver
Stop using brw_compiler to lower the final binding table indices for
surface access.  This is done by simply not setting the
'prog_data->binding_table.*_start' fields.  Then make the driver
perform this lowering.

This is a better place to perform the binding table assignments, since
the driver has more information and will also later consume those
assignments to upload resources.

This also prepares us for two changes: use ibc without having to
implement binding table logic there; and remove unused entries from
the binding table.

Since the `block` field in brw_ubo_range now refers to the final
binding table index, we need to adjust it before using to index
shs->constbuf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-03 14:14:45 -07:00
Jason Ekstrand
e459d6d6df iris: Enable nir_opt_large_constants
Shader-db results on Kaby Lake:

    total instructions in shared programs: 15306230 -> 15304726 (<.01%)
    instructions in affected programs: 4570 -> 3066 (-32.91%)
    helped: 16
    HURT: 0

    total cycles in shared programs: 361703436 -> 361680041 (<.01%)
    cycles in affected programs: 129388 -> 105993 (-18.08%)
    helped: 16
    HURT: 0

    LOST:   0
    GAINED: 2

The helped programs were in XCom 2, Deus Ex: Mankind Divided, and Kerbal
Space Program

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-29 21:09:16 +00:00
Jason Ekstrand
744f93f5c1 iris: Move upload_ubo_ssbo_surf_state to iris_program.c
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-29 21:09:16 +00:00
Kenneth Graunke
7d2b54e393 iris: Record state sizes for INTEL_DEBUG=bat decoding.
Felix noticed a crash when using INTEL_DEBUG=bat decoding.  It turned
out that we were sometimes placing variable length data near the end
of a buffer, and with the decoder guessing random lengths rather than
having an actual count, it was walking off the end and crashing.  So
this does more than improve the decoder output.

Unfortunately, this is a bit more complicated than i965's handling,
because we don't have a single state buffer.  Various places upload
data via u_upload_mgr, and so there isn't a central place to record
the size.  We don't need to catch every single place, however, since
it's only important to record variable length packets (like viewports
and binding tables).

State data also lives arbitrarily long, rather than being discarded on
every batch like i965, so we don't know when to clear out old entries
either.  (We also don't have a callback when an upload buffer is
released.)  So, this tracking may space leak over time.  That's probably
okay though, as this is only a debugging feature and it's a slow leak.
We may also get lucky and overwrite existing entries as we reuse BOs,
though I find this unlikely to happen.

The fact that the decoder works in terms of offsets from a state base
address is also not ideal, as dynamic state base address and surface
state base address differ for iris.  However, because dynamic state
addresses start from the top of a 4GB region, and binding tables start
from addresses [0, 64K), it's highly unlikely that we'll get overlap.

We can always improve this, but for now it's better than what we had.
2019-05-23 08:07:08 -07:00
Kenneth Graunke
646924cfa1 intel/compiler: Implement TCS 8_PATCH mode and INTEL_DEBUG=tcs8
Our tessellation control shaders can be dispatched in several modes.

- SINGLE_PATCH (Gen7+) processes a single patch per thread, with each
  channel corresponding to a different patch vertex.  PATCHLIST_N will
  launch (N / 8) threads.  If N is less than 8, some channels will be
  disabled, leaving some untapped hardware capabilities.  Conditionals
  based on gl_InvocationID are non-uniform, which means that they'll
  often have to execute both paths.  However, if there are fewer than
  8 vertices, all invocations will happen within a single thread, so
  barriers can become no-ops, which is nice.  We also burn a maximum
  of 4 registers for ICP handles, so we can compile without regard for
  the value of N.  It also works in all cases.

- DUAL_PATCH mode processes up to two patches at a time, where the first
  four channels come from patch 1, and the second group of four come
  from patch 2.  This tries to provide better EU utilization for small
  patches (N <= 4).  It cannot be used in all cases.

- 8_PATCH mode processes 8 patches at a time, with a thread launched per
  vertex in the patch.  Each channel corresponds to the same vertex, but
  in each of the 8 patches.  This utilizes all channels even for small
  patches.  It also makes conditions on gl_InvocationID uniform, leading
  to proper jumps.  Barriers, unfortunately, become real.  Worse, for
  PATCHLIST_N, the thread payload burns N registers for ICP handles.
  This can burn up to 32 registers, or 1/4 of our register file, for
  URB handles.  For Vulkan (and DX), we know the number of vertices at
  compile time, so we can limit the amount of waste.  In GL, the patch
  dimension is dynamic state, so we either would have to waste all 32
  (not reasonable) or guess (badly) and recompile.  This is unfortunate.
  Because we can only spawn 16 thread instances, we can only use this
  mode for PATCHLIST_16 and smaller.  The rest must use SINGLE_PATCH.

This patch implements the new 8_PATCH TCS mode, but leaves us using
SINGLE_PATCH by default.  A new INTEL_DEBUG=tcs8 flag will switch to
using 8_PATCH mode for testing and benchmarking purposes.  We may
want to consider using 8_PATCH mode in Vulkan in some cases.

The data I've seen shows that 8_PATCH mode can be more efficient in
some cases, but SINGLE_PATCH mode (the one we use today) is faster
in other cases.  Ultimately, the TES matters much more than the TCS
for performance, so the decision may not matter much.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-05-14 13:16:30 -07:00
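The thread and register counts quoted in 646924cfa1 for SINGLE_PATCH and 8_PATCH can be put side by side with a little arithmetic. This sketch uses hypothetical function names and assumes that "N / 8 threads" rounds up, so a patch with fewer than 8 vertices still gets one thread; it models only the numbers stated in the message, not the compiler's actual decision logic.

```c
#include <assert.h>
#include <stdbool.h>

/* SINGLE_PATCH: one channel per patch vertex, 8 channels per thread.
 * Rounding up is an assumption made for this sketch.
 */
static inline unsigned
single_patch_threads(unsigned patch_vertices)
{
   return (patch_vertices + 7) / 8;
}

/* 8_PATCH: one thread per vertex in the patch. */
static inline unsigned
eight_patch_threads(unsigned patch_vertices)
{
   return patch_vertices;
}

/* 8_PATCH: the payload spends one register per vertex on ICP handles. */
static inline unsigned
eight_patch_icp_registers(unsigned patch_vertices)
{
   return patch_vertices;
}

/* 8_PATCH is limited to PATCHLIST_16 and smaller (16 thread instances). */
static inline bool
eight_patch_usable(unsigned patch_vertices)
{
   return patch_vertices >= 1 && patch_vertices <= 16;
}
```

For a 3-vertex patch, SINGLE_PATCH runs one thread with five idle channels, while 8_PATCH runs three threads that together cover eight patches.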
Illia Iorin
a35269cf44 iris: Implement ARB_indirect_parameters
iris_draw_vbo is divided into two functions to remove unnecessary
operations from the loop.  This implementation of ARB_indirect_parameters
takes NV_conditional_render into account by saving MI_PREDICATE_RESULT
at the start of a draw call and restoring it at the end.  The result
of NV_conditional_render is also taken into account when computing the
predicates that limit draw calls for ARB_indirect_parameters, in a
similar way to 1952fd8d in ANV.

v2: Optimize indirect draws (suggested by Kenneth Graunke)
v3: (by Kenneth Graunke)
 - Fix an issue where indirect draws wouldn't set patch information
   before updating the compiled TCS.
 - Move some code back to iris_draw_vbo to avoid duplicating it.
 - Fix minor indentation issues.

Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-11 23:56:52 -07:00
Kenneth Graunke
72ccefb529 iris: Use full ways for L3 cache setup on Icelake.
Anuj fixed this in i965 and anv, but the fix never landed in iris.
Fixes tessellation corruption on Icelake.  Thanks to Rafael for
bisecting this and tracking it down.

Fixes: d0996d5fab iris: Emit default L3 config for the render pipeline
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-05-10 16:50:14 -07:00
Kenneth Graunke
a232aa5c50 iris: Also handle res->offset for buffer sampler/image views 2019-05-07 13:36:18 -07:00
Mike Blumenkrantz
ddd716e746 iris: support dmabuf imports with offsets
this adds support for imports where the image data begins at an offset
from the start of the buffer, as used in h/x264

fixes kwg/mesa#47

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-07 13:36:08 -07:00
Kenneth Graunke
a032a9665f iris: Enable PIPE_CAP_SURFACE_REINTERPRET_BLOCKS
This makes CompressedTexSubImage from a PBO source do proper GPU
rendering to upload instead of stalling to map the PBO source on
the CPU (then copying it on the CPU).

Thanks Bas Nieuwenhuizen for pointing out that Vulkan includes this
functionality, and to Jason Ekstrand for writing the code I adapted.
Vulkan only supports a single layer, however, and this code tries to
support multiple layers as long as it's miplevel 0.

Improves performance in Sid Meier's Civilization VI:

   Average frame time (ms):         -3.67423% +/- 1.46201% (n=5)
   99th percentile frame time (ms): -5.09910% +/- 3.87874% (n=5)
2019-05-06 09:50:32 -07:00
Kenneth Graunke
5ff5d0a895 iris: Disable dual source blending when shader doesn't handle it
This is a port of Danylo's eca4a6548d
which fixed the hang on i965.  It fixes GPU hangs in his new Piglit
test, arb_blend_func_extended-dual-src-blending-discard-without-src1.

I avoided my own review feedback here, and decided to simply adjust
3DSTATE_PS_BLEND rather than BLEND_STATE_ENTRY[0].  It has never been
clear to me which the hardware uses in every case.  However, whacking
the enable in 3DSTATE_PS_BLEND seems to be sufficient to fix the hang,
and that packet is already dynamic, so it's easy to handle.  I'd rather
avoid making BLEND_STATE_ENTRY[0] dynamic unless I have to.
2019-05-02 21:14:49 -07:00
Rafael Antognolli
cf3cadacdf iris: Update the surface state clear color address when available.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-30 08:31:44 -07:00
Kenneth Graunke
dcfca0af7c iris: Set XY Clipping correctly.
I was setting it based off a pipe_rasterizer_state field that appears
to be entirely dead outside of the draw module respecting it.

I should be setting it when the primitive type reaching the SF is
neither points nor lines.  This is, unfortunately, rather dirty,
as we have to look at the rasterizer state, the geometry shader state,
the tessellation evaluation shader state, and the primitive type...
2019-04-29 10:53:23 -07:00
Kenneth Graunke
6bd4cb920e iris: Fix zeroing of transform feedback offsets in strange cases.
Some of the dEQP.functional.transform_feedback tests end up doing
the following sequence of operations:

   1. BeginTransformFeedback
   2. PauseTransformFeedback
   3. Draw
   4. ResumeTransformFeedback

At step 1, we'd pack 3DSTATE_SO_BUFFER commands saying to zero the
SO_WRITE_OFFSET registers.  At step 2, we disable streamout, so step 3
doesn't bother emitting those commands.  Then, step 4 re-packs new
3DSTATE_SO_BUFFER commands with offset = 0xFFFFFFFF, saying to continue
appending at the existing offset.  This loads the value from the BO as
the offsets - but we never actually zeroed it.

So, just maintain a flag saying "we actually emitted the commands",
and stomp offset back to zero until we emit some.
2019-04-27 01:07:14 -07:00
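The flag described in 6bd4cb920e can be sketched as a tiny state machine. The names below (`struct so_state`, `so_begin`, `so_pack_offset`, `so_mark_emitted`) are hypothetical; only the 0xFFFFFFFF "append" value comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset value meaning "continue appending at the existing offset". */
#define SO_OFFSET_APPEND 0xFFFFFFFFu

/* Hypothetical tracking state for one transform feedback session. */
struct so_state {
   bool zeroed_offsets;  /* the zeroing commands were really emitted */
};

/* BeginTransformFeedback: nothing has reached the hardware yet. */
static inline void
so_begin(struct so_state *s)
{
   s->zeroed_offsets = false;
}

/* Offset to pack into the buffer command: stomp it back to zero until
 * a previous packet has actually been emitted.
 */
static inline uint32_t
so_pack_offset(const struct so_state *s, uint32_t requested)
{
   return s->zeroed_offsets ? requested : 0;
}

/* Call when a draw really emits the packed commands. */
static inline void
so_mark_emitted(struct so_state *s)
{
   s->zeroed_offsets = true;
}
```

In the Begin, Pause, Draw, Resume sequence above, the draw in step 3 never calls `so_mark_emitted`, so the re-pack in step 4 still yields offset 0 rather than the append value.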
Kenneth Graunke
529ace7887 iris: Silence unused function warning 2019-04-25 17:33:56 -07:00
Andrii Simiklit
4e9592c5fa iris: make the TFB result visible to others
OpenGL 4.6 Spec:
   "5.3.3 Rules
    .......
    Note: “Updates” via rendering or transform feedback
    are treated consistently with updates via GL commands.
    Once EndTransformFeedback has been issued, any subsequent
    command in the same context that uses the results of the
    transform feedback operation will see the results."

v2: removed a wrong comment
    ( Kenneth Graunke <kenneth@whitecape.org> )

v3: - flush+dirty depends on buffers usage history
    - removed an old hack
    ( Kenneth Graunke <kenneth@whitecape.org> )

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110404
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-25 11:48:04 -07:00
Kenneth Graunke
aa7306b4cf iris: Some tidying for preemption support
Just enable it during init_render_context on Gen10+, and move the
Gen9 state tracking into iris_genx_state so it only exists on Gen9.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-04-25 11:26:24 -07:00
Mike Blumenkrantz
7315882023 iris: add preemption support on gen9
this is basically just porting the following two commits to gallium:
d8b50e152a
5c454661c6

resolves kwg/mesa#49

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-04-24 14:47:08 -07:00
Mike Blumenkrantz
b53d256db8 iris: add support for INTEL_conservative_rasterization
this hooks up the iris gallium driver to existing mesa bits which handle
the implementation

resolves kwg/mesa#8

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-23 16:36:30 -07:00
Kenneth Graunke
2208d5a683 iris: Fix DrawTransformFeedback math when there's a buffer offset
We need to subtract the starting offset from the final offset before
dividing by the stride.  See src/intel/vulkan/genX_cmd_buffer.c:3142.

Not known to fix anything.
2019-04-23 15:57:07 -07:00
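The corrected math from 2208d5a683 is short enough to write out. The function name and the guard conditions are assumptions for this sketch; the formula itself is the one stated in the message: subtract the starting offset from the final offset, then divide by the stride.

```c
#include <assert.h>
#include <stdint.h>

/* Number of vertices captured into a buffer that was bound at
 * start_offset, given the final write offset and the vertex stride.
 */
static inline uint32_t
xfb_vertex_count(uint32_t final_offset, uint32_t start_offset,
                 uint32_t stride)
{
   /* Guards added for the sketch; they are not part of the commit. */
   if (stride == 0 || final_offset < start_offset)
      return 0;

   return (final_offset - start_offset) / stride;
}
```

With a buffer bound at byte 256, a 16-byte stride and a final offset of 416, the count is 10; dividing the raw final offset instead would give 26.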
Kenneth Graunke
77449d7c41 iris: Track valid data range and infer unsynchronized mappings.
Applications frequently call glBufferSubData() to consecutive regions
of a VBO to append new vertex data.  If no data exists there yet, we
can promote these to unsynchronized writes, even if the buffer is busy,
since the GPU can't be doing anything useful with undefined content.
This can avoid a bunch of unnecessary blitting on the GPU.

u_threaded_context would do this for us, and in fact prohibits us from
doing so (see TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED).  But we haven't
hooked that up yet, and it may be useful to disable u_threaded_context
when debugging...at which point we'd still want this optimization.  At
the very least, it would let us measure the benefit of threading
independently from this optimization.  And it's not a lot of code.

Removes most stall avoidance blits in "Total War: WARHAMMER."

On my Skylake GT4e at 1920x1080, this appears to improve performance
in games by the following (but I did not do many runs for proper
statistics gathering):

   ----------------------------------------------
   | DiRT Rally        | +2% (avg) | + 2% (max) |
   | Bioshock Infinite | +3% (avg) | + 9% (max) |
   | Shadow of Mordor  | +7% (avg) | +20% (max) |
   ----------------------------------------------
2019-04-23 00:24:08 -07:00
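The idea in 77449d7c41 can be sketched as a single conservative interval of "bytes that may hold defined data". A write that does not touch that interval cannot conflict with anything the GPU is reading, so it can skip synchronization. All names below are hypothetical and the sketch ignores locking and everything else a real driver would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open interval [start, end) of bytes that may contain valid data.
 * It is empty whenever start >= end.
 */
struct valid_range {
   uint32_t start;
   uint32_t end;
};

static inline void
valid_range_init(struct valid_range *r)
{
   r->start = UINT32_MAX;
   r->end = 0;
}

/* Grow the interval to cover a newly written region. */
static inline void
valid_range_add(struct valid_range *r, uint32_t start, uint32_t end)
{
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* True when writing [start, end) touches no previously defined bytes,
 * so the mapping could be promoted to an unsynchronized one.
 */
static inline bool
can_write_unsynchronized(const struct valid_range *r,
                         uint32_t start, uint32_t end)
{
   bool intersects = start < r->end && end > r->start;
   return !intersects;
}
```

Appending consecutive regions, as in the glBufferSubData() pattern described above, always passes the check because each new region starts where the valid interval ends.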
Kenneth Graunke
5ad0c88dbe iris: Replace buffer backing storage and rebind to update addresses.
This implements PIPE_CAP_INVALIDATE_BUFFER and invalidate_resource(),
as well as the PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag.  When either
of these happen, we swap out the backing storage of the buffer for a
new idle BO, allowing us to write to it immediately without stalling
or queueing a blit.

On my Skylake GT4e at 1920x1080, this improves performance in games:

   -----------------------------------------------
   | DiRT Rally        | +25% (avg) | +17% (max) |
   | Bioshock Infinite | +22% (avg) | +11% (max) |
   | Shadow of Mordor  | +27% (avg) | +83% (max) |
   -----------------------------------------------
2019-04-23 00:24:08 -07:00
Kenneth Graunke
b45dff1da8 iris: Rework image views to store pipe_image_view.
This will be useful when rebinding images.
2019-04-23 00:24:08 -07:00
Kenneth Graunke
2f60850a3f iris: Rework UBOs and SSBOs to use pipe_shader_buffer
This unifies a bunch of the UBO and SSBO code to use common structures.
Beyond iris_state_ref, pipe_shader_buffer also gives us a buffer size,
which can be useful when filling out the surface state.
2019-04-23 00:24:08 -07:00
Kenneth Graunke
00d4019676 iris: Track bound constant buffers
This helps avoid having to iterate over [0, PIPE_MAX_CONSTANT_BUFFERS)
looking to see if any resources are bound.
2019-04-23 00:24:08 -07:00
Kenneth Graunke
1566054459 iris: Track bound and writable SSBOs
Marek recently extended pipe->set_shader_buffers() to take an extra
writable_bitmask parameter, indicating which SSBOs are writable (some
may be bound read-only).  We can use this to decide whether to set
EXEC_OBJECT_WRITE when pinning.  Avoiding the write flag can save us
some cross-batch flushing if the SSBO is used for reading in both the
render and compute engines.
2019-04-22 11:31:14 -07:00
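The use of writable_bitmask in 1566054459 amounts to a per-slot bit test when deciding how to pin each buffer. In this sketch `PIN_FLAG_WRITE` is a hypothetical stand-in for the kernel's write flag and `ssbo_pin_flags` is an invented helper; only the idea of consulting the bitmask comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the write flag used when pinning a buffer. */
#define PIN_FLAG_WRITE (1u << 0)

/* Flags for pinning the SSBO bound at slot, given the bitmask of
 * writable slots passed along with the bindings.
 */
static inline uint32_t
ssbo_pin_flags(uint32_t writable_bitmask, unsigned slot)
{
   if (slot < 32 && ((writable_bitmask >> slot) & 1u))
      return PIN_FLAG_WRITE;

   return 0;   /* read-only: no write hazard to track across batches */
}
```

A buffer that is read-only in both the render and compute batches then never carries the write flag, which is what avoids the cross-batch flushing mentioned above.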
Kenneth Graunke
36478b9f77 iris: Enable the dual_color_blend_by_location driconf option.
This fixes rendering in Unigine Valley 1.0 and Heaven 4.0.
2019-04-22 09:36:36 -07:00