Gen 11 workarounds table #2056 WABTPPrefetchDisable suggests to
disable prefetching of binding tables for ICLLP A0 and B0
steppings. It fixes multiple gpu hangs in
ext_framebuffer_multisample* tests on ICLLP B0 h/w.
Anuj: Add comments and commit message.
Add gen 11 checks in the code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
The pass can create a temporary result for the instruction and then
moves from it to the original destination, however, if the original
instruction was predicated, the mov has to be predicated as well.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Until now, we had separate passes for lowering gl_PatchVerticesIn to
a statically known constant (for TES inputs when linked against a TCS),
and a uniform in the other cases. Annoyingly, one had to be run before
nir_lower_system_values, and the other afterward. This simplified the
passes, but made life painful for the callers.
This patch combines both into a single pass. If you give it a non-zero
static count, it uses that. If you give it Mesa state slots, it turns
it back into a built-in uniform. Otherwise, it does nothing.
This also moves the i965 uniform lowering out to shared code.
v2: Make token arrays const.
Reviewed-by: Eric Anholt <eric@anholt.net>
These are lowered by brw_nir_lower_vs_inputs(). If they weren't, we
would have already hit the unreachable() in emit_system_values_block().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The various base addresses are simply addresses. There may or may not
be a buffer located at those addresses. So, it doesn't make much sense
to request one. Just save the raw address so we can add it later, when
asking about BOs at the final <base + offset> address.
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Normally, i965 programs STATE_BASE_ADDRESS every batch, and puts all
state for a given base in a single buffer.
I'm working on a prototype which emits STATE_BASE_ADDRESS only once at
startup, where each base address is a fixed 4GB region of the PPGTT.
State may live in many buffers in that 4GB region, even if there isn't
a buffer located at the actual base address itself.
To handle this, we need to save the STATE_BASE_ADDRESS values across
multiple batches, rather than assuming we'll see the command each time.
Then, each time we see a pointer, we need to ask the driver for the BO
map for that data. (We can't just use the map for the base address, as
state may be in multiple buffers, and there may not even be a buffer
at the base address to map.)
v2: Fix things caught in review by Lionel:
- Drop bogus bind_bo.size check.
- Drop "get the BOs again" code - we just get the BOs as needed
- Add a message about interface descriptor data being unavailable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
CovID: 1438132
Fixes: a99c9e63a0 "anv: finish the binding_table_pool on
destroyDevice when use_softpin"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
We might fail on master node drm fd because we won't have the right
permissions.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Since various options within INTEL_DEBUG could impact code generation,
we need to set the disk cache driver_flags parameter based on the
INTEL_DEBUG flags in use.
An example that will affect the program generated by i965 is the
INTEL_DEBUG=nocompact option.
The DEBUG_DISK_CACHE_MASK value is added to mask the settings of
INTEL_DEBUG that can affect program generation.
v2:
* Use driver_flags (Tim)
* Also update Anvil (Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This extra character should not be used by snprintf, but we make it
available to verify that we printed the exact number we wanted, and
didn't overflow.
v2:
* Also update Anvil
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
we need rounding modes on other conversions involving floats and it is easier
to rename f2f16_undef than renaming all the other ones.
v2: rebased on master
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Python 2 has a range() function which returns a list, and an xrange()
one which returns an iterator.
Python 3 lost the function returning a list, and renamed the function
returning an iterator as range().
As a result, using range() makes the scripts compatible with both Python
versions 2 and 3.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
In Python 2, dictionaries have 2 sets of methods to iterate over their
keys and values: keys()/values()/items() and iterkeys()/itervalues()/iteritems().
The former return lists while the latter return iterators.
Python 3 dropped the method which return lists, and renamed the methods
returning iterators to keys()/values()/items().
Using those names makes the scripts compatible with both Python 2 and 3.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
The original pass only looked for load_uniform intrinsics but there are
a number of other places that could end up loading a push constant. One
obvious omission was images which always implicitly use a push constant.
Legacy VS clip planes also get pushed into the shader. This fixes some
new Vulkan CTS tests that test random combinations of bindings and, in
particular, test lots of UBOs and images together.
Cc: mesa-stable@lists.freedesktop.org
Cc: Kenneth Graunke <kenneth@whitecape.org>
It's actually also a bit safer, since now the compiler will warn if
the string is larger than the `.name` array.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
According to the spec, these should apply to all read/write access
types (so would be equivalent to specifying all other access types
individually). Currently, they were doing nothing.
v2: Handle VK_ACCESS_MEMORY_WRITE_BIT in dstAccessMask.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The first fix attempt contained a nasty typo which somehow didn't get
caught in review. It also didn't work as intended because the sRGB
conversion was happening but then throwing away all but the red channel
because it dind't know it was RGB. Really, it's my fault for trying to
fix a bug without first writing tests. I've now written tests and they
pass with this change. :)
Fixes: 11712b9ca1 "intel/blorp: Fix blits to R8G8B8_UNORM_SRGB"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
We've had several broadwell hangs that have come down to this bit just
not working correctly. Most recently, we've had a pile of hangs
reported with apps running under DXVK:
https://github.com/doitsujin/dxvk/issues/469
Instead, use the bit that doesn't try to imply weird D3D coherency
things and just force-enables the PS like we want.
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We support mipmapped and arrayed linear images so we need to support
vkGetImageSubresourceLayout on them. Fortunately, it's just a trivial
call into ISL.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Note that the use of ICMS_INNER_CONSERVATIVE disagrees with the GL driver.
Perhaps it's more performant than ICMS_NORMAL and is otherwise permitted?
Not sure, so I left it as-is.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
When running gdb, make sure to pass the LD_PRELOAD variable only to
the executed program, not the debugger. Otherwise the debugger will
run the preloaded constructor/destructor too and bad things will
happen.
Suggested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
The problem with passing the configuration of the dump lib through a
file descriptor is that it can be read only once. But under gdb you
might want to rerun your program multiple times.
This change hands the configuration through a temporary file that is
deleted once the command line passes to intel_dump_gpu has exited.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Rendering to a linear depth buffer on gen4 is causing a GPU hang in the
CI system. Until a better explanation is found, assume that errata is
applicable to all gen4 platforms.
Fixes fbe01625f6
("i965/miptree: Share tiling_flags in miptree_create").
Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
In commit 86cb05a6d3 ("intel: aubinator: remove standard input
processing option") we removed the ability to process aub as an input
stream because we're now rely on mmapping the aub file to back the
buffers aubinator is parsing.
intel_aubdump was the provider of the standard input data and since
we've copied/reworked intel_aubdump into intel_dump_gpu within Mesa,
we don't need that code anymore.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This lets us move the glBlitFramebuffer nonsense into the GL driver and
make the usage of BLORP mutch more explicit and obvious as to what it's
doing.
Reviewed-by: Chad Versace <chadversary@chromium.org>
At the moment, this is entirely internal but we'll expose it to clients
of the BLORP API in the next commit.
Reviewed-by: Chad Versace <chadversary@chromium.org>
Instead of having quite so many singletons, we use a struct aub_file to
organize the bits we need for writing an aub file.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
For large buffers which span an entire l1 page table, we got the range
calculations wrong. In this case, we end up with an l1_start which is
the first byte represented by the given l1 table and an l1_end which is
the first byte after the range represented by the l1 table. Then
l2_start_index == L2_index(l2_end) due to roll-over. Instead, compute
lN_end using (1Ull << shift) - 1 so that lN_end is the last byte in the
range represented by the Nth level page table. When we do this, we
don't need the conditional expression anymore.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Code assumes that all the necessary fields will exist, but compiler
doesn't know about this. Provide zero as default values, like in other
decoding functions.
Fixes warnings
../../src/intel/common/gen_batch_decoder.c: In function ‘handle_media_interface_descriptor_load’:
../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_entry_count’ may be used uninitialized in this function [-Wmaybe-uninitialized]
dump_binding_table(ctx, binding_table_offset, binding_entry_count);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_table_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_count’ may be used uninitialized in this function [-Wmaybe-uninitialized]
dump_samplers(ctx, sampler_offset, sampler_count);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
../../src/intel/common/gen_batch_decoder.c:343:7: warning: ‘ksp’ may be used uninitialized in this function [-Wmaybe-uninitialized]
ctx_disassemble_program(ctx, ksp, "compute shader");
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/intel/common/gen_batch_decoder.c: In function ‘decode_dynamic_state_pointers’:
../../src/intel/common/gen_batch_decoder.c:663:54: warning: ‘state_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
const uint32_t *state_map = ctx->dynamic_base.map + state_offset;
~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
../../src/intel/common/gen_batch_decoder.c: In function ‘gen_print_batch’:
../../src/intel/common/gen_batch_decoder.c:856:13: warning: ‘next_batch.map’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if (next_batch.map == NULL) {
^
../../src/intel/common/gen_batch_decoder.c:860:13: warning: ‘next_batch.addr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
gen_print_batch(ctx, next_batch.map, next_batch.size,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next_batch.addr);
~~~~~~~~~~~~~~~~
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
strncpy() doesn't guarantee the terminator NUL, so we would need to
set ourselves. Just use snprintf() instead.
Fixes the warnings
../../src/intel/common/gen_decoder.c: In function ‘iter_decode_field’:
../../src/intel/common/gen_decoder.c:897:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation]
strncpy(iter->name, iter->field->name, sizeof(iter->name));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘iter_advance_field’,
inlined from ‘gen_field_iterator_next’ at ../../src/intel/common/gen_decoder.c:1015:9:
../../src/intel/common/gen_decoder.c:844:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation]
strncpy(iter->name, iter->field->name, sizeof(iter->name));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The error buffer is limited to 256, but the report contains the
filename and possibly other data. So give it more space.
Avoids the warnings
../../src/intel/vulkan/anv_util.c: In function ‘__anv_perf_warn’:
../../src/intel/vulkan/anv_util.c:66:42: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 254 [-Wformat-truncation=]
snprintf(report, sizeof(report), "%s: %s", file, buffer);
^~ ~~~~~~
../../src/intel/vulkan/anv_util.c:66:4: note: ‘snprintf’ output 3 or more bytes (assuming 258) into a destination of size 256
snprintf(report, sizeof(report), "%s: %s", file, buffer);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/intel/vulkan/anv_util.c: In function ‘__vk_errorf’:
../../src/intel/vulkan/anv_util.c:96:48: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 252 [-Wformat-truncation=]
snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer,
^~ ~~~~~~
../../src/intel/vulkan/anv_util.c:96:7: note: ‘snprintf’ output 8 or more bytes (assuming 263) into a destination of size 256
snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
error_str);
~~~~~~~~~~
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
When one of the cases is not part of the enum, the compilar complains:
../../src/intel/vulkan/anv_formats.c: In function ‘anv_GetPhysicalDeviceFormatProperties2’:
../../src/intel/vulkan/anv_formats.c:728:7: warning: case value ‘1000001004’ not in enumerated type ‘VkStructureType’ {aka ‘enum VkStructureType’} [-Wswitch]
case VK_STRUCTURE_TYPE_WSI_FORMAT_MODIFIER_PROPERTIES_LIST_MESA:
^~~~
Given the switch has an "default:" case, we don't lose anything by
switching on the unsigned value to avoid the warning.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Explicitly convert to signed integer. Conversion is valid since is the
same (implicitly) used to initialize the loop. Avoids the warning:
../../src/intel/compiler/brw_fs.cpp: In member function ‘bool fs_visitor::lower_simd_width()’:
../../src/intel/compiler/brw_fs.cpp:5761:45: warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare]
split_inst.eot = inst->eot && i == n - 1;
~~^~~~~~~~
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The assert is checking that we are not binding more descriptor sets
than the supported by the driver. When binding the descriptor set
number MAX_SETS-1, it was breaking the assert because
descriptorSetCount = 1.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
SNB doesn't have a definition of 3DSTATE_CONSTANT_BODY, thats
why we got segmentation fault when used INTEL_DEBUG=bat.
Fixed by adding of 3DSTATE_CONSTANT_BODY into 3DSTATE_CONSTANT
of VS, GS and PS structures.
v2: added definition of 3DSTATE_CONSTANT_BODY to the gen6.xml
Fixes: 169d8e011a (intel: Fix 3DSTATE_CONSTANT buffer decoding.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107190
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>