Nanley Chery:
(rebase)
- Resolve conflicts with new anv_batch_emit macro
(amend)
- Handle a QPitch TODO
- Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems
- Only use HiZ for single-subpass renderpasses
- Emit the HiZ instruction before the stencil instruction to follow the
optimized clear sequence specified in the PRMs
- Don't modify clear params
- Enable resolves when a HiZ buffer is used to ensure depth buffer validity
Provides an FPS increase of ~15% on the Sascha triangle and multisampling
demos.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Create a function that performs one of three HiZ operations -
depth/stencil clears, HiZ resolve, and depth resolves.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Nanley Chery (amend):
- Change memset value from 0xff to 0 (a defined value for HiZ).
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery:
(rebase)
- Use isl_surf_get_hiz_surf()
(amend)
- Only add a HiZ surface onto a depth/stencil attachment
- Add comment above HiZ surface addition
- Hide HiZ behind INTEL_VK_HIZ prior to BDW
- Disable HiZ for untested cases
- Remove DISABLE_AUX_BIT instead of preventing it from being added
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
According to the spec - 9.6. Pipeline Cache :
If pDataSize is less than the maximum size that can be retrieved by the
pipeline cache, at most pDataSize bytes will be written to pData, and
vkGetPipelineCacheData will return VK_INCOMPLETE.
Fixes the following test from Vulkan CTS :
dEQP-VK.pipeline.cache.pipeline_from_incomplete_get_data.vertex_stage_fragment_stage
dEQP-VK.pipeline.cache.pipeline_from_incomplete_get_data.vertex_stage_geometry_stage_fragment_stage
dEQP-VK.pipeline.cache.misc_tests.invalid_size_test
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Since the gen_device_info structs are no longer just constant memory, a
pointer to one is not a pointer to something in the .data section so we
shouldn't be storing it in a static variable. Instead, we should just
store the entire device_info structure.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Initially, we had intended set_subpass to be an interesting function that
did whatever (presumably a lot) setup we needed for a subpass. In reality,
it just sets a pointer and a dirty bit and then emits depth and stencil
state. When we call BeginCommandBuffer on a secondary, there's no point in
setting depth and stencil state since it will already be set by the
primary. Instead, the only thing we need to do at the start of a secondary
is set the subpass pointer and the dirty bit.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
We have a DIRTY_RENDER_TARGETS flag and that makes a lot more sense than
just dirtying fragment descriptors. We're checking for it in some of the
gen7 code but unfortunately, nothing was setting it and it didn't do what
it was supposed to do in cmd_buffer_flush_state.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Because WSI images are created with VkImageCreateInfo::flags explicitly set
to 0, they don't ever have the VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT set.
This means that you can't create an image view of it with a different
format so applications can't render directly in sRGB (without automatic
encoding) unless we actually advertise UNORM formats. There are a lot of
applications that want to do their own sRGB conversion, so we should allow
for that. We do, however, make UNORM come after sRGB in the list so that
the default for dumb apps that just grab the first thing is to render in
linear and let the sRGB conversion happen automatically.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
The header was renamed with earlier commit, so update the
Makefile.sources respectively.
{vulkan/genX_multisample.h => common/gen_sample_positions.h}
Fixes: c779ad3e661("intel: Move Vulkan sample positions to common code")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
According to chapters 16.5. (Timestamp Queries) and 30.2 (Limits) of the
Vulkan Specification 1.0.29, the .limits.timestampPeriod field returned
by vkGetPhysicalDeviceProperties is measured in nanoseconds, not in
seconds.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It is now the only caller so there's no sense in keeping things split out.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
gcc-4 and earlier don't allow compound literals where a constant
is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE()
when populating the anv_formats[] array. There are a few ways around
it: First one would be -std=c89/gnu89, but the rest of the code
depends on c99 so it's not really an option. The second option
would be to upgrade to gcc-5+ where the compiler behaviour was relaxed
a bit [1]. And the third option is just to avoid using compound
literals. I chose the last option since it keeps gcc-4 and earlier
working.
[1] https://gcc.gnu.org/gcc-5/porting_to.html
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Fixes: 7ddb21708c ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
From the Vulkan spec:
Size is the number of bytes to fill, and must be either a multiple of 4,
or VK_WHOLE_SIZE to fill the range from offset to the end of the buffer.
If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a
multiple of 4, then the nearest smaller multiple is used.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Make gen_device_info a mutable structure so we can update the fields that
can be refined by querying the kernel (like subslices and EU numbers).
This patch does not make any functional change, it just makes
gen_get_device_info() fill a structure rather than returning a const
pointer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reproduces this commit :
commit 0fb85ac08d
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Mon Jun 6 21:37:34 2016 -0700
i965: Use the correct number of threads for compute shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reproduces this commit :
commit 2213ffdb4b
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Mon Jun 6 21:37:34 2016 -0700
i965: Allocate scratch space for the maximum number of compute threads.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Without this bit set, the value in "L3 Atomic Disable" won't get applied by
the hardware so we won't properly get L3 atomic caching.
Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex and 198 of
the dEQP-VK.image.atomic_operations.* tests on HSW
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
The Vulkan driver sets 3DSTATE_DRAWING_RECTANGLE once to MAX_INT x MAX_INT
at the GPU initialization time and never sets it again. The GL driver sets
it every time the framebuffer changes. Originally, blorp set it to the
size of the drawing area but meant we had to set it back in the Vulkan
driver. Instead, we can easily just do that in the GL driver's blorp_exec
implementation and not set it in blorp core.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Previously, we relied on a driver hook for 3DSTATE_MULTISAMPLE. However,
now that Vulkan and GL use the same sample positions, we can set up
3DSTATE_MULTISAMPLE directly in blorp and delete the driver hook.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Since Vulkan doesn't allow single-slice 3D storage images, we need to just
set the base_array_layer and array_len to the full size of the 3-D LOD.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Everything that we were once using the blit2d framework for is now done
with blorp.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
This is a lot cleaner and easier to read than the old piles of if
statements.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
This allows us to #undef them later if we don't want them to persist
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>