This leaves us with a series of little anv_pipeline_compile_* functions
which each take a compiler object, a mem_ctx, the stage to compile, and
the previous stage for VUE linking purposes. Some of them do
interesting things but most are little more than wrappers around
brw_compile_*.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This breaks compilation up a bit into "link" and "compile". In the
"link" stage, new anv_pipeline_link_* helpers are called which are
responsible for setting up the binding table and doing anything needed
to properly link with the next stage in the pipeline if one exists.
They are called in reverse order starting with the fragment shader so
you can assume linking in later stages is already done.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
We can set active_stages much more directly and then it's just candy
around setting pipeline->stages[stage].
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Instead of hashing each stage separately (and TES and TCS together), we
hash the entire pipeline. This means we'll get fewer cache hits if
they, for instance, re-use the same VS over and over again but it also
means we can now safely do cross-stage optimizations.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Instead of having each anv_pipeline_compile_* function populate the
shader key, make it part of the anv_pipeline_stage struct and fill it
out up-front.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
During code review, Jason pointed out that:
2b3064c073 "i965, anv: Use INTEL_DEBUG for disk_cache driver flags"
Didn't account for INTEL_SCALER_* environment variables.
To fix this, let the compiler return the disk_cache driver flags.
Another possible fix would be to pull the INTEL_SCALER_* into
INTEL_DEBUG bits, but as we are currently using 41 of 64 bits, I
didn't think it was a good use of 4 more of these bits. (5 since
INTEL_PRECISE_TRIG needs to be accounted for as well.)
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Instead of just looking at the number of color attachments, look at
which ones are actually used by the subpass. This lets us potentially
throw away chunks of the fragment shader. In DXVK, for example, all
subpasses have 8 attachments and most are VK_ATTACHMENT_UNUSED so this
is very helpful in that case.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The back-end compiler emits the number of color writes specified by
wm_prog_key::nr_color_regions regardless of what nir_store_outputs we
have. Once we've gone through and figured out which render targets
actually exist and are written by the shader, we should restrict the key
to avoid extra RT write messages.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
With the new deref instructions, we have to keep the modes consistent
between the derefs and the variables they reference. Since we remove
outputs by changing them to local variables, we need to run the fixup
pass to fix the modes.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Until now, we had separate passes for lowering gl_PatchVerticesIn to
a statically known constant (for TES inputs when linked against a TCS),
and a uniform in the other cases. Annoyingly, one had to be run before
nir_lower_system_values, and the other afterward. This simplified the
passes, but made life painful for the callers.
This patch combines both into a single pass. If you give it a non-zero
static count, it uses that. If you give it Mesa state slots, it turns
it back into a built-in uniform. Otherwise, it does nothing.
This also moves the i965 uniform lowering out to shared code.
v2: Make token arrays const.
Reviewed-by: Eric Anholt <eric@anholt.net>
CovID: 1438132
Fixes: a99c9e63a0 "anv: finish the binding_table_pool on
destroyDevice when use_softpin"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Since various options within INTEL_DEBUG could impact code generation,
we need to set the disk cache driver_flags parameter based on the
INTEL_DEBUG flags in use.
An example that will affect the program generated by i965 is the
INTEL_DEBUG=nocompact option.
The DEBUG_DISK_CACHE_MASK value is added to mask the settings of
INTEL_DEBUG that can affect program generation.
v2:
* Use driver_flags (Tim)
* Also update Anvil (Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This extra character should not be used by snprintf, but we make it
available to verify that we printed the exact number we wanted, and
didn't overflow.
v2:
* Also update Anvil
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Python 2 has a range() function which returns a list, and an xrange()
one which returns an iterator.
Python 3 lost the function returning a list, and renamed the function
returning an iterator as range().
As a result, using range() makes the scripts compatible with both Python
versions 2 and 3.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
In Python 2, dictionaries have 2 sets of methods to iterate over their
keys and values: keys()/values()/items() and iterkeys()/itervalues()/iteritems().
The former return lists while the latter return iterators.
Python 3 dropped the method which return lists, and renamed the methods
returning iterators to keys()/values()/items().
Using those names makes the scripts compatible with both Python 2 and 3.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
The original pass only looked for load_uniform intrinsics but there are
a number of other places that could end up loading a push constant. One
obvious omission was images which always implicitly use a push constant.
Legacy VS clip planes also get pushed into the shader. This fixes some
new Vulkan CTS tests that test random combinations of bindings and, in
particular, test lots of UBOs and images together.
Cc: mesa-stable@lists.freedesktop.org
Cc: Kenneth Graunke <kenneth@whitecape.org>
It's actually also a bit safer, since now the compiler will warn if
the string is larger than the `.name` array.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
According to the spec, these should apply to all read/write access
types (so would be equivalent to specifying all other access types
individually). Currently, they were doing nothing.
v2: Handle VK_ACCESS_MEMORY_WRITE_BIT in dstAccessMask.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We've had several broadwell hangs that have come down to this bit just
not working correctly. Most recently, we've had a pile of hangs
reported with apps running under DXVK:
https://github.com/doitsujin/dxvk/issues/469
Instead, use the bit that doesn't try to imply weird D3D coherency
things and just force-enables the PS like we want.
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We support mipmapped and arrayed linear images so we need to support
vkGetImageSubresourceLayout on them. Fortunately, it's just a trivial
call into ISL.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Note that the use of ICMS_INNER_CONSERVATIVE disagrees with the GL driver.
Perhaps it's more performant than ICMS_NORMAL and is otherwise permitted?
Not sure, so I left it as-is.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This lets us move the glBlitFramebuffer nonsense into the GL driver and
make the usage of BLORP mutch more explicit and obvious as to what it's
doing.
Reviewed-by: Chad Versace <chadversary@chromium.org>
The error buffer is limited to 256, but the report contains the
filename and possibly other data. So give it more space.
Avoids the warnings
../../src/intel/vulkan/anv_util.c: In function ‘__anv_perf_warn’:
../../src/intel/vulkan/anv_util.c:66:42: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 254 [-Wformat-truncation=]
snprintf(report, sizeof(report), "%s: %s", file, buffer);
^~ ~~~~~~
../../src/intel/vulkan/anv_util.c:66:4: note: ‘snprintf’ output 3 or more bytes (assuming 258) into a destination of size 256
snprintf(report, sizeof(report), "%s: %s", file, buffer);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/intel/vulkan/anv_util.c: In function ‘__vk_errorf’:
../../src/intel/vulkan/anv_util.c:96:48: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 252 [-Wformat-truncation=]
snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer,
^~ ~~~~~~
../../src/intel/vulkan/anv_util.c:96:7: note: ‘snprintf’ output 8 or more bytes (assuming 263) into a destination of size 256
snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
error_str);
~~~~~~~~~~
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
When one of the cases is not part of the enum, the compilar complains:
../../src/intel/vulkan/anv_formats.c: In function ‘anv_GetPhysicalDeviceFormatProperties2’:
../../src/intel/vulkan/anv_formats.c:728:7: warning: case value ‘1000001004’ not in enumerated type ‘VkStructureType’ {aka ‘enum VkStructureType’} [-Wswitch]
case VK_STRUCTURE_TYPE_WSI_FORMAT_MODIFIER_PROPERTIES_LIST_MESA:
^~~~
Given the switch has an "default:" case, we don't lose anything by
switching on the unsigned value to avoid the warning.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The assert is checking that we are not binding more descriptor sets
than the supported by the driver. When binding the descriptor set
number MAX_SETS-1, it was breaking the assert because
descriptorSetCount = 1.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
In a single call to vk_errorf() in the Android code, the arguments were
swapped. The bug has existed since day one. Chrome OS used to forgive
the warning, but it is now a compilation error.
CC: <mesa-stable@lists.freedesktop.org>
Fixes: 053d4c32 "anv: Implement VK_ANDROID_native_buffer (v9)"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
CACHE_MODE_SS is not listed in gfxspecs table for user mode
non-privileged registers. So, making any changes from Mesa
will do nothing. Kernel is already setting this bit in
CACHE_MODE_SS register which is saved/restored to/from
the HW context image.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Enables SPV_KHR_8bit_storage and VK_KHR_8bit_storage on gen 8+
using the VK_KHR_get_physical_device_properties2 functionality
to expose if the extension is supported or not.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The implementation of CreateRenderPass2 uses the helpers we broke out in
previous commits. The implementations of the new vkCmd functions just
call the old versions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This makes certain checks a bit easier and means that we don't have
the attachment information duplicated in the attachment list and in
depth_stencil_attachment.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>