I'm sorry, but I have OCD and the rest of the file is nicely aligned.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39046>
On Broadwell, using the debug mode, you can't create even a single
VkImage:
createimage: ../../src/util/u_math.h:829: util_is_aligned: Assertion `(a != 0) && ((a & (a - 1)) == 0)' failed.
Thread 1 "createimage" received signal SIGABRT, Aborted.
Download failed: Invalid argument. Continuing without source file ./nptl/./nptl/pthread_kill.c.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
warning: 44 ./nptl/pthread_kill.c: No such file or directory
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007ffff789573f in __pthread_kill_internal (threadid=<optimized out>, signo=6) at ./nptl/pthread_kill.c:89
#2 0x00007ffff7840462 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007ffff78284ac in __GI_abort () at ./stdlib/abort.c:77
#4 0x00007ffff7828420 in __assert_fail_base (fmt=<optimized out>, assertion=<optimized out>, file=<optimized out>, line=829, function=<optimized out>) at ./assert/assert.c:118
#5 0x00007ffff5a5fb0c in util_is_aligned (n=0, a=0) at ../../src/util/u_math.h:829
#6 0x00007ffff5a6060d in memory_range_end (memory_range=...) at ../../src/intel/vulkan_hasvk/anv_image.c:51
#7 0x00007ffff5a61c52 in check_memory_range_s (p=0x7fffffffd800) at ../../src/intel/vulkan_hasvk/anv_image.c:779
#8 0x00007ffff5a61ef3 in check_memory_bindings (device=0x555555654d50, image=0x55555566e050) at ../../src/intel/vulkan_hasvk/anv_image.c:830
#9 0x00007ffff5a62ea3 in anv_image_init (device=0x555555654d50, image=0x55555566e050, create_info=0x7fffffffd9d0) at ../../src/intel/vulkan_hasvk/anv_image.c:1263
#10 0x00007ffff5a63147 in anv_image_init_from_create_info (device=0x555555654d50, image=0x55555566e050, pCreateInfo=0x7fffffffda80) at ../../src/intel/vulkan_hasvk/anv_image.c:1333
#11 0x00007ffff5a63211 in anv_CreateImage (_device=0x555555654d50, pCreateInfo=0x7fffffffda80, pAllocator=0x0, pImage=0x7fffffffdd20) at ../../src/intel/vulkan_hasvk/anv_image.c:1356
#12 0x00007ffff44ff376 in vvl::dispatch::Device::CreateImage (this=0x55555562c480, device=0x555555654d50, pCreateInfo=0x7fffffffdcb8, pAllocator=0x0, pImage=0x7fffffffdd20)
at ./layers/vulkan/generated/dispatch_object.cpp:1160
#13 0x00007ffff43e8214 in vulkan_layer_chassis::CreateImage (device=0x555555654d50, pCreateInfo=0x7fffffffdcb8, pAllocator=0x0, pImage=0x7fffffffdd20) at ./layers/vulkan/generated/chassis.cpp:2181
#14 0x0000555555560af4 in vks::Image::init (this=0x7fffffffdcb0) at /home/przanoni/git/random-stuff/vk/vks/libvulkanscript.hpp:1298
#15 0x000055555556557d in main () at createimage.cpp:36
Since we haven't noticed this issue as quickly as I imagined we would,
let's opt for what's mostly a revert of the behavior change in the
original commit.
Fixes: 7be63ef956 ("intel: do not NIH util_is_aligned")
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39045>
Each group of 16 lanes inside a SIMD32 shader will load different globals.
In SIMD8/16 shaders, the divergence analysis will turn this load into
nir_load_global_constant_uniform_block_intel.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>
If it wasn't for the workaround, it wouldn't be necessary to track the
whether instructions are exec_all or not. The workaround affects
results when mixing a dep and inst with different exec_all.
Add the predicate so that, when the workaround is disabled, none of
the effects of having different exec_all will kick in, all them will
be considered `exec_all = true`.
This patch don't change any behavior, just adds the predicate.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36659>
nr_params & params array are gone.
brw_ubo_range is not stored on the prog_data structure anymore (Anv
already stored a copy of that with its own additional information)
The backend now only deals with load_push_data_intel. load_uniform &
load_push_constant have to be lowered by the driver.
Pre Gfx12.5 platforms have to provide a subgroup_id_param to specify
where the subgroup_id value is located in the push constants.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
Anv already manages this itself. This allows removing the logic from
the compiler.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
Drivers can do all the lowering to push constants to find the only
value useful in that array (subgroup_id). Then drivers call into
brw_cs_fill_push_const_info() to get the cross/per thread constant
layout computed in the prog_data.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
The way we build our ranges, the first empty one is the end of the
ranges.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
The current code walks the instructions, and when needed,
it will scan to find the next "end of scope" and sometimes
the next "end of block". It also has a separate patching
logic for HALTs.
The new code collects the necessary scope information up front,
then walks the instruction backwards, making avoiding the need
to scan for the end of scope. It will also walk only the
relevant instructions that were previously collected. It also
replaces the previous HALT-specific patching logic.
With this new change, many cases that were jumping to
intermediate HALTs, will now jump straight to the end of
scope (or the "end of the program" section). E.g. in
```
if
...
(...) HALT
...
(...) HALT
endif
```
both HALTs now will jump to the end of the scope, instead of the
first HALT jumping into the second one.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38914>
Most of instructions follow the basic formats (1, 2 and 3 src), so
consolidate their emission code in generator.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
Move validation, noting that LRP only supports BRW_TYPE_F -- the
previous assert had DF because it also was used by MAD in the past.
With that change, ALU3F can be replaced by ALU3 for LRP.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
When repctrl is used, the swizzle/chansel is ignored. Instead of setting
a swizzle that has all zeros and encode that, don't encode anything.
For context see e7598c5a62 ("intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX
for scalar region").
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
Replaying a dump file requires the VM state in order to feed the
replay tool with the necessary VMA properties that described the hang,
however, these properties are not necessarily useful once the replay
tool re-runs said traces, however, this patch makes this optional.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
The tool now can seamlessly support GPU hang dump files from
either the i915 or the Xe drivers.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
The changes as part of the Contexts state now include:
**** Contexts ****
[HWCTX].replay_offset: 0x0
[HWCTX].replay_length: 0xd000
and the changes as part of the VM state now include:
**** VM state ****
VM.uapi_flags: 0x1
[40000].length: 0x2000
[40000].properties: read_write|bo|mem_region=0x1|pat_index=2|cpu_caching=1
[40000].data: &-)\3!!E9mzzzzzzzzzz
In order to be able to replay a GPU hang from a devcore dump file
new properties have been added describing the offset and the length
of the affected hw context as well as a global VM flag and
several VMA property types: memory region, bo caching, pat index,
memory permission and memory type.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
Before bringing support for Xe let's create a lib so that
the common code can live there.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
initial refactoring of the i915 code in preparation
for Xe. No functional changes.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
That whole comment about Ivy Bridge is not relevant as ANV don't support IVB.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39175>
There is no side affects for this shadowing but better fix it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39175>
When this flag is set, it gives a hint to KMD to skip some operations around
compressed buffers, like copying the auxiliary buffer to smem during eviction.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38425>
There is no users for that function, is_volatile is only used in
brw_opt_cse.cpp is_expression() but it access the information using brw_send_inst
struct.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39104>
We don't need one bit per bitsize per instruction if only one actually
matters in the end.
First step towards moving NIR in the direction of full float_controls2
only.
Also rename this from fp_fast_math, because that name implied that 0 is
the no fast math mode, while the opposite was the case.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39026>
This patch adds compute_w_to_host_r barrier and also throw QueueWaitIdle
just to make sure all operations are completed before we fetch the BVH
data on the host.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39079>
The entire range of the allocated bo up to bo->size will be used, even if
alloc_size was way less, so to track the growth precisely for load factoring,
the allocated_batch_size should increase by bo->size.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38703>
Although a specific size is requested, the entire range of the returned bo up
to bo->size may end up being used by anv_batch_chain, spamming memcheck errors.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38703>