Currently, this helper does nothing but we call it every place where an
image is written through the render pipeline. This will allow us to
properly mark the aux state so that we can handle resolves correctly.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
We'll be performing a GPU memcpy in more places to copy small amounts of
data. Add an alternate function that thrashes less state.
v2:
- Make a new function (Jason Ekstrand).
- Move the #define into the function.
v3:
- Update the function name (Jason).
- Update comments.
v4: Use an indirect drawing register as TEMP_REG (Jason Ekstrand).
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
A GPU memcpy function could alternatively be implemented using MI_*
commands. Provide more detail into how this one operates in case another
memcpy function is created.
v2:
- Update the commit message.
v3:
- Use 'memcpy' instead of 'cpy' (Jason Ekstrand)
- Shorten 'streamout' to 'so'
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This helps Dota 2 on Broadwell by 8-9%. I also hacked up the driver and
used the Sascha "shadowmapping" demo to get some results. Setting
uses_kill to true dropped the framerate on the demo by 25-30%. Enabling
the PMA fix brought it back up to around 90% of the original framerate.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
This code is far too complicated to cut and paste.
v2: Update the newly added genX_gpu_memcpy.c; const a few things.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This method of doing copies has the advantage of touching very little of
the GPU state. While it does disable all the shader stages, it doesn't
have to blow away binding tables, viewports, scissors, or any other bits of
dynamic state other than VBO 32 which is already reserved. All of the
state that it does touch is contained within a pipeline anyway so that's
the only thing that has to be dirtied.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
It really should have gone here all along. We were trying a bit too hard
to make it gen-agnostic just because it didn't have any #if's.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Now that we don't have meta, we have no need for a gen-agnostic pipeline
create path. We can, instead, just generate one Create*Pipelines function
per gen and be done with it.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Now that we no longer have meta, all pipelines get created via the normal
Vulkan pipeline creation mechanics. There is no more need for this bit of
extra magic data that we've been passing around.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Update the comment to reflect the correct filename and add a guard to
catch incorrect inclusion of the header.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Create a function that performs one of three HiZ operations -
depth/stencil clears, HiZ resolve, and depth resolves.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Initially, we had intended set_subpass to be an interesting function that
did whatever (presumably a lot) setup we needed for a subpass. In reality,
it just sets a pointer and a dirty bit and then emits depth and stencil
state. When we call BeginCommandBuffer on a secondary, there's no point in
setting depth and stencil state since it will already be set by the
primary. Instead, the only thing we need to do at the start of a secondary
is set the subpass pointer and the dirty bit.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
This is the only remaining part of genX_l3.c and there's really no good
reason for it to be in its own file.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Now that we're using gen_l3_config.c, we no longer have one set of l3
config functions per gen and we can simplify a bit. Also, we know that
only compute uses SLM so we don't need to look for it in all of the stages.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This is in contrast to emitting it directly in vkCmdPipelineBarrier. This
has a couple of advantages. First, it means that no matter how many
vkCmdPipelineBarrier calls the application strings together it gets one or
two PIPE_CONTROLs. Second, it allow us to better track when we need to do
stalls because we can flag when a flush has happened and we need a stall.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>