flush_pipeline_before_pipeline_select adds workarounds required before
switching the pipeline.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
This fixes a "regression" on Haswell and prior caused by merging the gen7
and gen8 flush_state functions. Haswell should still work just fine if
you're on a 4.4 kernel, but we really should make it detect the command
parser version and do something intelligent.
The old DRI3 implementation just used CopyArea instead of present. We
still don't support all the MST fanciness, but it should at least avoid
some copies and allow for page flipping.
v2 (Jason Ekstrand):
- Better object cleanup and destruction
- Handle the CONFIGURE_NOTIFY event and return OUT_OF_DATE when needed
- Track dirtiness via IDLE_NOTIFY rather than iterating through the
images sequentially
Right now, Vulkan apps can pretty easily DoS the GPU by simply submitting a
lot of batches. This commit makes us wait until the rendering for earlier
frames is complete before continuing. By waiting 2 frames out, we can still
keep the pipe reasonably full but without taking the entire system down.
This is similar to what the GL driver does today.
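The two-frames-out throttle, sketched in application-level Vulkan terms
rather than the driver's internal code (THROTTLE_FRAMES and the helper
name are illustrative):

    #include <vulkan/vulkan.h>

    #define THROTTLE_FRAMES 2  /* stay at most two frames ahead of the GPU */

    /* Before submitting frame N, wait on the fence signaled by frame N - 2
     * so the queue depth stays bounded instead of growing without limit. */
    static VkResult throttled_submit(VkDevice dev, VkQueue queue,
                                     const VkSubmitInfo *submit,
                                     VkFence fences[THROTTLE_FRAMES],
                                     uint64_t frame)
    {
       VkFence fence = fences[frame % THROTTLE_FRAMES];

       if (frame >= THROTTLE_FRAMES) {
          VkResult r = vkWaitForFences(dev, 1, &fence, VK_TRUE, UINT64_MAX);
          if (r != VK_SUCCESS)
             return r;
          r = vkResetFences(dev, 1, &fence);
          if (r != VK_SUCCESS)
             return r;
       }

       return vkQueueSubmit(queue, 1, submit, fence);
    }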
This is more consistent with the way the rest of the driver works and
ensures that all structs we pass into the kernel are zeroed out except for
the fields we actually want to fill. We were previously doing this when
building with valgrind to keep valgrind from complaining. However, we need
to start doing this unconditionally as recent kernels have been getting
touchier about this. In particular, as of kernel commit b31e51360e88 from
Chris Wilson, context creation and destruction fail if the padding bits are not
set to 0.
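The zeroing pattern, sketched with a hypothetical wrapper (the struct and
ioctl are the real i915 uAPI; the function name is illustrative):

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <drm/i915_drm.h>

    /* Zero the whole request up front so any padding the kernel validates
     * is guaranteed to be 0 rather than stack garbage. */
    static int create_hw_context(int fd, uint32_t *ctx_id)
    {
       struct drm_i915_gem_context_create create;
       memset(&create, 0, sizeof(create));

       if (ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &create) == -1)
          return -1;

       *ctx_id = create.ctx_id;
       return 0;
    }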
The new organization is as follows:
* anv_meta_blit.c: Blit and state setup/teardown commands
* anv_meta_copy.c: Copy and update commands
* anv_meta_blit2d.c: 2D Blitter API commands
Also, change the formatting to keep most lines
within 80 columns.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
This can be reverted if the only other consumer, anv_meta_blit2d(),
uses a different method.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
In addition to demystifying the value being added to the height,
this future-proofs the code for new tiling modes and keeps the
image height as small as possible.
v2: Actually use the smallest height possible.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
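The height-padding idea above, as a sketch with illustrative names (the
tile heights are the classic gen values: X-tiles are 8 rows tall, Y-tiles
32, linear has no tile rows):

    #include <stdint.h>

    enum tiling { TILING_LINEAR, TILING_X, TILING_Y };

    /* Round the height up to a whole tile for the given mode instead of
     * adding a magic constant; a new tiling mode is just a new case. */
    static uint32_t pad_image_height(uint32_t height, enum tiling tiling)
    {
       uint32_t tile_h = 1;
       switch (tiling) {
       case TILING_LINEAR: tile_h = 1;  break;
       case TILING_X:      tile_h = 8;  break;
       case TILING_Y:      tile_h = 32; break;
       }
       /* Power-of-two round-up. */
       return (height + tile_h - 1) & ~(tile_h - 1);
    }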
Special-casing the PS_BLEND packet wasn't really gaining us anything. It's
defined to be more-or-less the contents of blend state entry 0, just without
the indirection. We can just copy-and-paste the contents. If there are no
valid color targets, then blend state 0 will be 0-initialized anyway so
it's basically the same as the special case we had before.
Previously, we would always emit all of the render targets in the subpass.
This commit changes it so that we compact render targets just like we do
with other resources. Render targets are represented in the surface map by
using a descriptor set index of UINT16_MAX.
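A sketch of the UINT16_MAX encoding, with hypothetical struct and field
names (only the sentinel value comes from the change itself):

    #include <stdbool.h>
    #include <stdint.h>

    /* Descriptor set index reserved to mean "this is a render target". */
    #define RENDER_TARGET_SET UINT16_MAX

    struct surface_map_entry {
       uint16_t set;     /* descriptor set, or RENDER_TARGET_SET */
       uint16_t index;   /* binding within the set, or color attachment */
    };

    static bool is_render_target(const struct surface_map_entry *e)
    {
       return e->set == RENDER_TARGET_SET;
    }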
This reduces the number of allocations a bit and cuts back on memory usage.
It's kind of a micro-optimization, but it also makes the error handling a
bit simpler, so it seems like a win.
We cast the constant 0xfff values to a uintptr_t before applying a bitwise
negate to ensure that they are actually 64-bit when needed. Also, the
count variable doesn't need to be explicitly cast; it will get upcast as
needed by the "|" operation.
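Roughly, with illustrative macro names (the pointer is 4096-byte aligned,
so its low 12 bits are free to hold the counter):

    #include <stdint.h>

    /* Casting 0xfff to uintptr_t before the ~ makes the mask pointer-sized
     * rather than relying on the promotion of a plain int constant. */
    #define PFL_COUNT(x)       ((x) & 0xfff)
    #define PFL_PTR(x)         ((void *)((uintptr_t)(x) & ~(uintptr_t)0xfff))
    #define PFL_PACK(ptr, c)   ((uintptr_t)(ptr) | ((c) & 0xfff))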
Previously we asserted every time you tried to pack a pointer and a counter
together. However, this wasn't really correct. In the case where you try
to grab the last element of the list, the "next element" value you get may
be bogus if someone else got there first. This was leading to assertion
failures even though the allocator would safely fall through to the failure
case below.
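The race, sketched (hypothetical types; the list head packs a pointer and
a 12-bit counter as above):

    #include <stdatomic.h>
    #include <stdint.h>

    struct node { struct node *next; };
    struct free_list { _Atomic uint64_t u64; };  /* pointer | counter */

    static struct node *pop(struct free_list *list)
    {
       uint64_t current = atomic_load(&list->u64);
       for (;;) {
          struct node *node =
             (void *)((uintptr_t)current & ~(uintptr_t)0xfff);
          if (node == NULL)
             return NULL;
          /* If another thread pops "node" between the load above and the
           * CAS below, this next value is bogus -- so don't assert on it;
           * the CAS fails and we retry with fresh state. */
          struct node *next = node->next;
          uint64_t packed = (uintptr_t)next |
                            (((uint32_t)current + 1) & 0xfff);
          if (atomic_compare_exchange_weak(&list->u64, &current, packed))
             return node;
       }
    }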
Applications may create a *lot* of fences, perhaps as many as one per
vkQueueSubmit. Really, they're supposed to use ResetFence, but it's easy
enough for us to make them crazy-cheap so we might as well.
Between the initial check that returns NO_KERNEL and compiling the
shader, other threads may have added the shader to the cache. Before
uploading the kernel, check again (under the mutex) that the compiled
shader still isn't present.
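In outline (helper names are hypothetical; the double-check under the
mutex is the point):

    #include <pthread.h>
    #include <stddef.h>

    struct cache { pthread_mutex_t mutex; /* hash table omitted */ };

    /* Provided elsewhere in this sketch. */
    const void *cache_search_locked(struct cache *, const unsigned char *);
    const void *cache_add_locked(struct cache *, const unsigned char *,
                                 const void *, size_t);

    static const void *
    upload_kernel(struct cache *cache, const unsigned char sha1[20],
                  const void *kernel, size_t size)
    {
       pthread_mutex_lock(&cache->mutex);

       /* The unlocked lookup that returned NO_KERNEL may be stale by the
        * time compilation finishes; look again before uploading. */
       const void *entry = cache_search_locked(cache, sha1);
       if (entry == NULL)
          entry = cache_add_locked(cache, sha1, kernel, size);

       pthread_mutex_unlock(&cache->mutex);
       return entry;
    }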
There is no API for setting the point size and the shader is always
required to set it. From section 24.4 of the Vulkan spec:
"If the value written to PointSize is less than or equal to zero, or
if no value was written to PointSize, results are undefined."
As such, we can just always program PointWidthSource to Vertex. This
simplifies anv_pipeline a bit and avoids trouble when we enable the
pipeline cache and don't have writes_point_size in the prog_data.
Using anv_pipeline_cache_upload_kernel() will re-upload the kernel and
prog_data when we merge caches. Since the kernel and prog_data are
already in the program_stream, use anv_pipeline_cache_add_entry()
instead to only add the entry to the hash table.
This function is a helper that unconditionally sets a hash table entry
and expects the cache to have enough room. Calling it 'add_entry'
suggests it will grow the cache as needed.
We can serialize as much as the application asks for and just stop once
we run out of memory. This lets applications use a fixed amount of
space for caching and still get some benefit.
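The loop then looks something like this (sketch; the entry layout and
names are made up):

    #include <stdint.h>
    #include <string.h>

    struct blob_entry { uint32_t size; /* followed by `size` bytes */ };

    /* Copy whole entries until the caller's buffer is exhausted; whatever
     * fits is still a valid, just smaller, cache blob. */
    static size_t serialize_entries(const struct blob_entry *const *entries,
                                    size_t count, void *out, size_t out_size)
    {
       uint8_t *p = out;
       size_t used = 0;

       for (size_t i = 0; i < count; i++) {
          size_t entry_size = sizeof(*entries[i]) + entries[i]->size;
          if (used + entry_size > out_size)
             break;            /* out of room: stop, don't fail */
          memcpy(p + used, entries[i], entry_size);
          used += entry_size;
       }

       return used;
    }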