This extension is not properly tested (testing for
GL_ARB_fragment_shader_interlock is not sufficient), and since this was
noted in review on August 28th no tests have been sent.
Revert "i965: Add INTEL_fragment_shader_ordering support."
Revert "mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering"
This reverts commit 03ecec9ed2.
This reverts commit 119435c877.
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Anholt <eric@anholt.net>
Pipeline state pending bits should be taken into account when copying
results.
In the particular bug below, the results of the
vkCmdCopyQueryPoolResults() command was being overwritten by the
preceding vkCmdCopyBuffer() with a same destination buffer. This is
because we copy the buffers using the 3D pipeline whereas we copy the
query results using the command streamer. Those pieces of HW work in
parallel and the results are somewhat undefined.
v2: Unconditionally flush the pipeline before copying the results
(Jason)
v3: Wrap & expressions (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108894
Cc: mesa-stable@lists.freedesktop.org
Vulkan and Gallium don't use Mesa's gl_program data structure, so they
can't poke at 'prog'. But we can simply use the copy of the shader info
stored with the NIR shader, which is guaranteed to exist.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Use L3 configuration specified in h/w specification.
V2: Drop configs which do under allocation of l3 cache.
Bump up the comment above table.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Per chapter 3.2 "Instances":
> Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing
> an apiVersion of 0 is equivalent to providing an apiVersion of
> VK_MAKE_VERSION(1,0,0).
Reported-by: Niklas Haas <git@haasn.xyz>
Fixes: 8c048af589 "anv: Copy the appliation info into the instance"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes issues with following SkQP tests:
unitTest_VulkanHardwareBuffer_Vulkan_EGL_Syncs
unitTest_VulkanHardwareBuffer_Vulkan_Vulkan_Syncs
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Instead of taking a whole pipeline (which could be anything!), just take
a physical device and robust_buffer_access boolean. This makes it
easier to verify that only the things in the hash actually affect
pipeline compilation.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
It affects apply_pipeline_layout. Shaders compiled with the wrong value
will work but they may not be robust as requested by the app.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Our compile already splits UBO loads into scalars and the untyped
surface read messages we use for SSBO reads and writes only require
dword alignment.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
This moves nir_shader_clone() to the driver-specific compile function,
rather than the shared src/intel/compiler code. This allows i965 to do
key-specific passes before calling brw_compile_*. Vulkan should not
need this cloning as it doesn't compile multiple variants.
We do need to continue cloning in the compute shader code because we
lower various things in NIR based on the SIMD width.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Meson test has a concepts of suites, which allow tests to be grouped
together. This allows for a subtest of tests to be run only (say only
the tests for nir). A test can be added to more than one suite, but for
the most part I've only added a test to a single suite, though I've
added a compiler group that includes nir, glsl, and glcpp tests.
To use this you'll need to invoke meson test directly, instead of ninja
test (which always runs all targets). it can be invoked as:
`meson test -C builddir --suite $suitename` (meson test has addition
options that are pretty useful).
Tested-By: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
The existing backend code assumed that if VARYING_SLOT_CLIP_DIST0
was written, then VARYING_SLOT_CLIP_DIST1 would be as well. That's
true with the current lowering, but not necessary if there are 4 or
fewer clip distances. Separate out the checks to allow this.
The new NIR-based lowering will trigger this case, which would have
caused backend validation errors (src is null) without this patch.
Reviewed-by: Eric Anholt <eric@anholt.net>
../src/intel/compiler/brw_fs_nir.cpp:3534:46: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
assert(nir_intrinsic_write_mask(instr) ==
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
(1 << instr->num_components) - 1);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This was caused by 6339aba775 which added these completely valid
checks. However clang likes to complain about signedness mismatches.
Fixes: 6339aba775 "intel/compiler: Lower SSBO and shared..."
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
It's not at all intel-specific; the formula is dictated by OpenGL and
Vulkan. The only intel-specific thing is that we need the lowering. As
a nice side-effect, the new version is variable-group-size ready.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
As the name says, the format is an sRGB format.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We can only start parsing commands from the head pointer. This was
working fine up to now because we only dealt with a "made up" ring
buffer (generated by aub_write) which always had its head at 0.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
Use this value to limit reading the ring buffer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>
We have a bunch of code to do this in the back-end compiler but it's
fairly specific to typed surface messages and the way we emit them.
This breaks it out into NIR were it's easier to do things a bit more
generally. It also means we can easily share the code between the vec4
and FS back-ends if we wish.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Both BRW_SFID_SAMPLER and GEN6_SFID_DATAPORT_SAMPLER_CACHE are getting
disassembled as "sampler", which is misleading for assembler tool.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Some hardware supports source mods only for float operations. Make it
possible to skip lowering to source mods in these cases.
v2: use option flags instead of a boolean (Jason Ekstrand)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
this helps reduce the overall code changes when a bit_size parameter is
added to nir_load_system_value
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
It's only used in anv_image.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This shouldn't make any difference but I feel uneasy to use the
expanded aspects that do not represent the image in its entirety. If
we ever change the implementation of the anv_image_aspect_to_plane()
helper, this is safer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This will make it easier to associate an aspect with a plane number.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
To play around with debugging, we might want to disable one or the
other component. Having 0s as default values makes this work.
Otherwise we might have NULL components, leading to crashes.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added additional engine definitions.
v4: Added missing engine definition to MI_TOPOLOGY_FILTER.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added additional engine definitions.
v4: Added missing engine definition to MI_TOPOLOGY_FILTER.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added additional engine definitions.
v4: Added more missing engine definitions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added additional engine definitions.
v4: Added missing engine tag for MI_TOPOLOGY_FILTER and MI_LOAD_URB_MEM.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added additional engine definitions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added additional engine definitions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added additional engine definitions
v4: Added missing engine to MEDIA_GATEWAY_STATE
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added additional engine definitions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added addition engine definitions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instructions meant for the render engine now have a definition specifying that
so that can differentiate instructions meant for different engines due to shared
opcodes.
v2: Divided into individual patches for each gen
v3: Added additional engine definitions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The engine to which the batch was sent to is now set to the decoder context when
decoding the batch. This is needed so that we can distinguish between
instructions as the render and video pipe share some of the instruction opcodes.
v2: The engine is now in the decoder context and the batch decoder uses a local
function for finding the instruction for an engine.
v3: Spec uses engine_mask now instead of engine, replaced engine class enums
with the definitions from UAPI.
v4: Fix up aubinator_viewer (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Removed the gen_engine enum and changed the involved functions to use the
drm_i915_gem_engine_class enum from UAPI instead.
v3: Wrong engine was being used for blocks in video ring
v4: Fixed aubinator_viewer.cpp
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Preliminary work for adding handling of different pipes to gen_decoder. Each
instruction needs to have a definition describing which engine it is meant for.
If left undefined, by default, the instruction is defined for all engines.
v2: Changed to use the engine class definitions from UAPI
v3: Changed I915_ENGINE_CLASS_TO_MASK to use BITSET_BIT, change engine to
engine_mask, added check for incorrect engine and added the possibility to
define an instruction to multiple engines using the "|" as a delimiter in the
engine attribute.
v4: Fixed the memory leak.
v5: Removed an unnecessary ralloc_free().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This can occur during payload setup of SIMD-split send message
instructions, which can lead to the emission of header setup
instructions with a non-zero channel group and fixed SIMD width. Such
instructions could end up using undefined channel enable signals
except they don't care since they're always marked force_writemask_all.
Not known to affect correctness of any workload at this point, but it
would be trivial to back-port to stable if something comes up.
Reported-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Sagar Ghuge <sagar.ghuge@intel.com>