This fixes a regression in a bunch of image store vulkan CTS tests
from commit ad38ba1134, which started
using OWORD block read messages to implement UBO loads. The reason
for the failure is that we were giving bogus buffer alignment limits
to the application (1B), so the CTS would happily come back with
descriptor sets pointing at not even word-aligned uniform buffer
addresses.
Surprisingly the sampler messages used to fetch pull constants before
that commit were able to cope with the non-texel aligned addresses,
but the dataport messages used to fetch pull constants after that
commit and the ones used to access storage buffers (before and after
the same commit) aren't as permissive with unaligned addresses.
Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This makes Gen7/7.5 match Gen8-9.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Doesn't look like this can work on 32bit, just rids of annoying
warning.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Following on from the spirit of commit 011e5570f.
Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Enabling this debug switch causes surface shrinking to happen by
default, and lowers the surface size limit which causes blorp blits to
be split.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Detect when the surface sizes are too large for a blorp blit. When it
is too large, the blorp blit will be split into a smaller operation
and attempted again.
For gen7, this fixes the cts test:
ES3-CTS.gtf.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_multisampled_to_singlesampled_blit
It will also enable us to increase our renderable size from 8k x 8k to
16k x 16k.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In blorp_copy, when RGB surfaces are copied, we convert the
destination surface to a Red only surface, but 3 times as wide. This
introduces an implicit restriction of "mod 3" for the destination
width.
It is easier to handle the blorp split buffer offsetting with the
original RGB surface, and do the RGB=>R after this.
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
If try_blorp_blit() previously returned that a blit was too large,
shrink_surface_params() will be used to update the surface parameters
for the smaller blit so the blit operation can proceed.
v2:
* Use double instead of float. (Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We rename do_blorp_blit() to try_blorp_blit(), and add a return error
if the surface size for the blit is too large. Now, do_blorp_blit() is
rewritten to try to split the blit into smaller operations if
try_blorp_blit() fails.
Note: In this commit, try_blorp_blit() will always attempt to blit and
never return an error, which matches the previous behavior. We will
enable the size checking and splitting in a future commit.
The motivation for this splitting is that in some cases when we
flatten an image, it's dimensions grow, and this can then exceed the
programmable hardware limits. An example is w-tiled+MSAA blits.
v2:
* Use double instead of float. (Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This will be useful for splitting blits into smaller sizes.
We also make the coordinates of type double rather than float. Since
we will be splitting and scaling the coordinates, we might require
extra precision in the calculations.
v2:
* Use double instead of float. (Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We would really like it to be false as that's what you get on hardware that
doesn't have RegisterPoleMode (Sky Lake for example). While we're at it,
we change it to a boolean. This fixes dEQP-VK.synchronization.smoke.events
on Broxton.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
This moves the nir_lower_indirect_derefs() call into
brw_preprocess_nir() so thats is called by both OpenGL and Vulkan
and removes that call to the old GLSL IR pass
lower_variable_index_to_cond_assign()
We want to do this pass in nir to be able to move loop unrolling
to nir.
There is a increase of 1-3 instructions in a small number of shaders,
and 2 Kerbal Space program shaders that increase by 32 instructions.
Shader-db results BDW:
total instructions in shared programs: 8705873 -> 8706194 (0.00%)
instructions in affected programs: 32515 -> 32836 (0.99%)
helped: 3
HURT: 79
total cycles in shared programs: 74618120 -> 74583476 (-0.05%)
cycles in affected programs: 528104 -> 493460 (-6.56%)
helped: 47
HURT: 37
LOST: 2
GAINED: 0
This is already supported in genX_state.c, expose the extension string.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
In an attempt to fix 3DSTATE_DEPTH_BUFFER for stencil-only cases, I
accidentally kept setting the SurfaceType to 2D in the stencil-only case
thanks to a copy+paste error.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Set the include paths to consider in-tree headers before out-of-tree
headers.
Avoids the build failing due to stale headers being present in
$prefix. Previosuly 'make -ki install' or something similar was required
to update the out-of-tree headers to allow the build to succeed.
Also avoids having to rebuild the entire thing after every 'make
install'.
Cc: Rob Clark <robdclark@gmail.com>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
This one was split across two dwords as "Kernel Start Pointer" and
"Kernel Start Pointer High", which looks like it works when the driver
only accesses "Kernel Start Pointer". This breaks, of course, with BO
offsets > 4G.
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
When the state fields where shuffled around for gen8, the compare
function enums were downgraded to just uints. Change them to enum
3D_Compare_Function.
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The previous commits got rid of any clashes between #defines and enum
values and we can now emit the genxml enums as debugger friendly C
enums.
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
These values were defined both as an enum and as inline values. Remove
the inline values and reference the 3D_Compare_Function enum instead.
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This lets us reference enums in the type attribute of a field.
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cleaner this way and we avoid including gen9_pack.h when we compile with
gen8_pack.h. We also avoid the if (cherryview) condition for non-gen8
gens that don't need it.
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The batch chain logic only needs the pre-gen8 size of
MI_BATCH_BUFFER_START, which seems like something we can make a special
case for. The other two gen7 references, MI_BATCH_BUFFER_END and
MI_NOOP, are the same on all gens.
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We'll need to define them before we can reference them in structs and
instructions. Enums have no dependencies, so move them first in the
file.
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Remove duplicate .alphaToOne, add missing .shaderResourceMinLod, and
reorder a few entries to match their vulkan.h order. All the sparse
features are still left out entirely.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
This matches what NVIDIA and AMD hardware expose, as well as what Intel
hardware supports.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The 1-D special case doesn't actually apply to depth or HiZ. I discovered
this while converting BLORP over to genxml and ISL. The reason is that the
1-D special case only applies to the new Sky Lake 1-D layout which is only
used for LINEAR 1-D images. For tiled 1-D images, such as depth buffers,
the old gen4 2-D layout is used and the QPitch should be in rows.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
This was already piped through in the CmdDraw(Indexed)Indirect handling.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The gen7/8_cmd_buffer logic already sets the clamp, and it's piped
through via the dynamic state.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This matches maxImageArrayLayers, as well as the same setting in the GL
frontend.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
These are all regularly available in desktop GL, so the backend fully
supports them.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This appears to be fully supported already.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This commit does two things. One is to pull useful and/or interesting
information from the AUB file header and display it as a header above your
decoded batches. Second, it is now capable of pulling the PCI ID from the
AUB file comment left by intel_aubdump. This removes the need to use the
--gen flag all the time.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This requires that a few more state bits become global.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>