This doesn't really do anything for us today. One day, I suppose we
could use it to do something with wide loads with non-uniform offsets.
The big reason to do this is to get better testing to make sure that NIR
doesn't blow up on the deref paths.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
Instead, we do a limited indirect deref lowering and then use
nir_lower_vars_to_explicit_types and nir_lower_explicit_io to lower it
as if it were SSBO or global memory access. Among other things, this
should enable pointer arithmetic on local variables. Fun!
The only shader-db change from this change on ICL was a few tiny cycle
count changes in 7 Aztec Ruins compute shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>
Instead of always lowering everything, we add a threshold such that if
the total indirected array size (AoA size) is above that threshold, it
won't lower. It's assumed that the driver will sort things out somehow
by, for instance, lowering to scratch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>
Since everything flows through NIR and we're doing all of our indirect
deref lowering there now, there's no reason to keep making those
decisions in brw_compiler and stuffing them in the GLSL compiler
structs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>
There's no good reason to split this into three. Sure, CS indirects are
only guaranteed by the spec to be DWORD aligned, but that's all untyped
surface reads require anyway.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6570>
This can come up if, for instance, the shader does a derivative of a
uniform or flat input. Ideally, NIR would use divergence analysis to
get rid of the derivative in this case but it doesn't right now. This
fixes a crash in F1 2017.
Cc: mesa-stable@lists.freedesktop.org
Reported-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6564>
When we have softpin, we know the address of the shader constant data at
shader upload time because it's sitting at the end of the shader. This
commit changes ANV to use patch constants to embed the address in the
shader patch the right address in at upload time. This allows us to
avoid having to set up a UBO binding on-the-fly for shader constants.
This commit uses an A64 message but it's quite possible that we could
also use an A32 message and make the dataport do the 64-bit add for us.
However, load_global is what we have right now so it was easier to just
use that.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>
It's just a container around a devinfo. The one useful purpose it did
serve is that gen_disasm_create initialized the compaction table
singletons. Now that those no longer exist, this isn't necessary.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>
With discrete GPUs, it's going to be possible to have GPUs from two
different hardware generations in the machine at the same time. Global
singletons like this aren't going to fly. Have a struct containing the
pointers which gets initialized once per shader disassemble instead.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>
WAIT takes a notification register as a destination and a src0 argument.
Since the same notification register is specified in both fields, we
treat it as a special case and disassemble it only once.
If we disassemble it as if it is a source register, its scalar region
will be printed as <0,1,0>. This causes difficulties round-tripping
through the assembler <-> disassembler because that is not an acceptable
destination region. If we instead disassemble the destination, we
instead get a <1> region which is an acceptable and equivalent region
for source and destination.
The test .asm files are regenerated by round-tripping them through the
assembler/disassembler. Note that the <0> region in the tests was a
harmless mistake: the compiler translated it to a <0,1,0> source region
and a <1> destination region, since <0> isn't valid.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6543>
Other tests use the same environment variable to decide whether they
should print debugging information.
Will quiet Coverity's "'Constant' variable guards dead code".
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6126>
When divisor is constant integer != 0 there's no point in checking
whether it's 0.
Complained about by Coverity.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6126>
Use labels instead of numeric JIP/UIP offsets.
Works for gen6+.
v2:
- Change asm tests to use labels on gen6+
- Remove usage of relative offsets on gen6+
- Consider brw_jump_scale when setting relative offset
- Return error if there is a JIP/UIP label without matching target
- Fix matching of label tokens
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>
Shader instructions which use UIP/JIP now get formatted with a label
in addition with immediate value, labels have "LABEL%d" format.
v2: - Consider brw_jump_scale when calculating label's offset
From: "Lonnberg, Toni" <toni.lonnberg@intel.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>
Pre-work for shader disassembly label support.
Introduction of the structures and functions used by the shader disassembly
jump target labeling.
From: "Lonnberg, Toni" <toni.lonnberg@intel.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>
Fixes crashes in:
- Rise of the Tomb Rider (on benchmark start)
- Total War: Three Kingdoms (on game start)
- Total War: Warhammer II (on game start)
Fixes: 34a0ce58c7 ("anv: add a new execution mode for secondary command buffers")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6546>
Only advertise VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT if CLOCK_MONOTONIC_RAW
is defined. Fixes the build on OpenBSD which has CLOCK_MONOTONIC but not
CLOCK_MONOTONIC_RAW.
Fixes: 67a2c1493c ("vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6517>
Replace local get_available_system_memory() function with
os_get_available_system_memory().
Fixes: b80930a6fe ("anv: add support for VK_EXT_memory_budget")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6517>
The immediate case is pretty uncommon to see but it can happen, in
theory. BROADCAST is typically used to uniformize values and those are
usually 32-bit. However, it does come up in some subgroup ops.
Fixes: 49c21802cb "intel/compiler: Split has_64bit_types into float/int"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6211>
Drop unnecessary check which allows enabling of lossless write through
compression (HiZ + CCS) for D16_UNORM format on Gen12+.
We had misleading HSD information previously which used to claim that
compression can not be supported for 16bpp format. Although BSpec does
not have any restriction for D16_UNORM format.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6485>
This implements timeline semaphores using a new type of dma-fence
stored into drm-syncobjs. We use a thread to implement delayed
submissions.
v2: Drop cloning of temporary semaphores and just transfer their ownership (Jason)
Drain queue when dealing with binary semaphore
Ensure we don't submit to the thread as long as we don't need to
v3: Use __u64 not uintptr_t for kernel pointers
Fix commented code for INTEL_DEBUG=bat
Set DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES in timeline fence execbuf extension
Add new anv_queue_set_lost()
Drop multi queue stuff meant for the fake multi queue patch
Rework temporary syncobj handling
Don't use syncobj when not available (DeviceWaitIdle/CreateDevice)
Use ANV_MULTIALLOC
And a few more tweaks...
v4: Drop drained condition helper (Lionel)
Fix missing EXEC_OBJECT_WRITE on BOs we want to wait on (Jason)
v5: Add missing device->lost_reported in _anv_device_report_lost (Lionel)
Fix missing free on submit->simple_bo (Lionel)
Don't drop setting the device in lost state on QueueSubmit error (Jason)
Store submit->fence_bos as an array of uintptr_t (Jason)
v6: condition device->has_thread_submit to i915 & core DRM support (Jason)
v7: Fix submit->in_fence leakage on error (Jason)
Keep dummy semaphore with no thread submission (Jason)
v8: Move ownership of submit->out_fence to submit (Jason)
v9: Don't forget to read the VkFence's syncobj binary payload (Lionel)
v10: Take the mutex lock on anv_gem_close() (Jason/Lionel)
v11: Fix void* -> u64 cast on 32bit (Lionel)
v12: Rebase after BO backed timeline semaphore (Lionel)
v13: Fix missing snippets lost after rebase (Lionel)
v14: Drop update_binary usage (Lionel)
v15: Use ANV_MULTIALLOC (Lionel)
v16: Fix some realloc issues (Ivan)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901>
Needed for dealing with the new DRM timeline syncobj ioctls.
v2: Make use of the anv_gem_get_drm_cap() parameter... (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901>
In 957bbc6ad9 we merged all the per stages allocations of push
constants into a single one. Unfortunately one field remained per
stage.
This fixes the issue by including all the per stage values of the
masked registers for robust buffer access into the push constant data.
v2: Drop unneeded loop (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 957bbc6ad9 ("anv: simplify push constant emissions")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6505>
All objects are expected to have the base internal object for private
data storage.
This also fixes a memory leak of a gen_perf_registers structure.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 51c6bc13ce ("anv,vulkan: Implement VK_EXT_private_data")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6255>
I have no idea how this pass ever worked. I guess it worked ok on the
one or two piglit tests but the whole thing seemed very fragile. It
makes a number of undocumented and unasserted assumptions and they
aren't always valid. This rewrite makes a number of changes:
1. It now properly handles the case where the gl_SampleMask write comes
before the gl_FragColor or gl_FragData[0] write.
2. It should early-exit faster because it now looks at bits in
shader_info::outputs_written instead of looking for variables.
3. Instead of the fragile variable lookup where we try to look the
variable up by both location and driver_location and match, we just
use the driver_location calculations used by brw_fs_nir.
4. It asserts that the index parameter to store_output is a constant
instead of silently failing if it isn't.
5. We now actually assert the implicit assumption that the two writes
are in the same block. We go even further and assert that they are
in the last block in the shader.
6. In the case where 3 or fewer components of the output are written,
we explicitly choose to leave the sample mask alone.
Fixes: 7ecfbd4f6d "nir: Add alpha_to_coverage lowering pass"
Closes: #3166
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>
I'm honestly not sure how passing a builder by-value ever worked. I
guess the struct is mostly copyable. In any case, that's the wrong way
to use it and it's causing issues.
Fixes: 7ecfbd4f6d "nir: Add alpha_to_coverage lowering pass"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>
Instead of allocating a push constant buffer per stage from the
dynamic state pool, we can use the same one for all stages.
We can do this because the push constant data is supposed to be
identical of all stages. Even if vkCmdPushConstants() allows to update
chunks of the push constant data differently per stage, this valid
usage guarantees that any chunk of push constant data used be 2
different stages must be identical :
"For each byte in the range specified by offset and size and for
each push constant range that overlaps that byte, stageFlags must
include all stages in that push constant range’s
VkPushConstantRange::stageFlags"
v2: Fix dirtying of stages (Jason)
v3: Move push constant data into base pipeline state struct (Jason)
v4: Remove duplicated field (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6183>
The raw query is meant to be used with MDAPI [1]. When using this
metric without this library, we usually selected the TestOa metric to
provide some default sensible values (instead of undefined).
Historically this TestOa metric lived in the kernel at ID=1. We
removed all metrics from the kernel in kernel commit 9aba9c188da136
("drm/i915/perf: remove generated code").
This fixes the Mesa code to use a valid metric set ID (1 could work
some of the time, but not guaranteed).
[1] : https://github.com/intel/metrics-discovery
v2: Store fallback metric at init time
v3: Drop TestOa lookout
v4: Skip the existing queries (Marcin)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6438>