For readability, bring the assignment of CCS closer to the assignment of
NONE and MCS.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The lifespan of the fast-clear data will surpass the render pass scope.
We need CCS_D to be enabled in order to invalidate blocks previously
marked as cleared and to sample cleared data correctly.
v2: Avoid refactoring.
v3: Allow CCS_D for subpass resolves.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The next patch enables the use of CCS_D even when the color attachment
will not be fast-cleared. Catch the gen7 case early to simplify the
changes required.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We'll be performing a GPU memcpy in more places to copy small amounts of
data. Add an alternate function that thrashes less state.
v2:
- Make a new function (Jason Ekstrand).
- Move the #define into the function.
v3:
- Update the function name (Jason).
- Update comments.
v4: Use an indirect drawing register as TEMP_REG (Jason Ekstrand).
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: Remove ::first_subpass_layout assertion (Jason Ekstrand).
v3: Allow some fast clears in the GENERAL layout.
v4: Remove extra '||' and adjust line break (Jason Ekstrand).
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: Don't pass in the command buffer (Jason Ekstrand).
v3: Remove an incorrect assertion and an if condition for gen7.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This will be used to load and store clear values from surface state
objects.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
It may technically be possible to enable some sort of fast-clear support
for at least the base slice of a 2D array texture on gen7. However,
it's not documented to work, we've never tried to do it in GL, and we
have no idea what the hardware does if you turn on CCS_D with arrayed
rendering. Let's just play it safe and disallow it for now. If someone
really cares that much about gen7 performance, they can come along and
try to get it working later.
We have a very specific row pitch that we want and we don't want ISL to
be changing it on us so just be explicit about it.
Fixes: a40f043034
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
There is the same constraintg later on as assert in
isl_gen7_choose_image_alignment_el() so catch it earlier in order
to return error instead of crash.
Needed to avoid crashes with piglits on IVB and HSW:
arb_internalformat_query2.image_format_compatibility_type pname checks
arb_internalformat_query2.all internalformat_<x>_type pname checks
arb_internalformat_query2.max dimensions related pname checks
arb_copy_image.arb_copy_image-formats --samples=2/4/6/8
arb_texture_float.multisample-fast-clear gl_arb_texture_float
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.
There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try 16I/32I in addition to GL_RGBA8I.
IvyBridge passed all tests with all sample numbers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.
There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try GL_RGBA16F/32F/16I/16UI/32I/32UI in addition to GL_RGBA/8I.
IvyBridge passed all tests with all sample numbers and even
with 128-bit formats.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Some hardware, like i965, doesn't support group sizes greater than 32.
In that case, we can reduce the destination size of the ballot
intrinsic, which will simplify our code generation.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The implementation of ballotARB() will start by zeroing the flags
register. So, a doing something like
if (gl_SubGroupInvocationARB % 2u == 0u) {
... = ballotARB(true);
[...]
} else {
... = ballotARB(true);
[...]
}
(like fs-ballot-if-else.shader_test does) would generate identical MOVs
to the same destination (the flag register!), and we definitely do not
want to pull that out of the control flow.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The implementations of the ARB_shader_ballot intrinsics will explicitly
read the flag as a source register.
Reviewed-by: Matt Turner <mattst88@gmail.com>
We already had a channel_num system value, which I'm renaming to
subgroup_invocation to match the rest of the new system values.
Note that while ballotARB(true) will return zeros in the high 32-bits on
systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
variables do not consider whether channels are enabled. See issue (1) of
ARB_shader_ballot.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The implementations of the ARB_shader_group_vote intrinsics will
explicitly write the flag as the destination register.
Reviewed-by: Matt Turner <mattst88@gmail.com>
I don't expect anyone is going to care about using this in vec4 programs
(vertex/tessellation/geometry on Gen6/7), no one has come up with a good
way to implement it much less test it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Specifically, constant fold intrinsics from ARB_shader_group_vote, but I
suspect it'll be useful for other things in the future.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: s/i915_tiling_to_isl_tiling(/isl_tiling_from_i915_tiling/
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT. We looked
for the bit in VkImageCreateInfo::usage, but it's actually in
VkImageCreateInfo::flags.
Found by assertion failures while enabling VK_ANDROID_native_buffer.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This will falsely trigger an assert on number of layers once
isl is used for 3D layouts of Gen4 cube maps.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
We don't support the general version yet because that requires us to
lower shared variables up-front in SPIR-V -> NIR. This shouldn't be a
whole lot of work but it's not something we support today.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
From the Vulkan 1.0.53 spec VU for vkCreateImageView:
"image must have been created with a usage value containing at least
one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"
We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from out list.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
We were early returning and never created the NULL surface state.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: James Legg <jlegg@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.
ANV did so since day one (back in 2015)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We always recommend sub-allocation and don't do anything special for
dedicated allocations.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
There is one small ANV change here because we used the
VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX enum in the BO cache and that had
to be updated to have the _KHR suffix.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
These have been formally deprecated by Khronos never to be shipped
again. The KHR versions should be implemented/used instead.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>