We're currently only calling it after creating the screen and the
bufmgr. There are a few cases where Iris checks for the DEBUG_BUFMGR
bit before we call brw_process_intel_debug_variable(), which means
intel_debug is 0 and so we don't run the debug code. Today, these are
all related to the creation of the workaround bo and its mmap.
I found this in a custom branch after I converted to INTEL_DEBUG an
environment variable that I had.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13780>
Mali4x0 PP doesn't have a swizzle for load_input, so use POT-aligned
varyings to avoid unnecessary movs for vec3 and precision downgrade
in case if this vec3 is coordinates for a sampler
shader-db:
total instructions in shared programs: 15707 -> 15623 (-0.53%)
instructions in affected programs: 3906 -> 3822 (-2.15%)
helped: 47
HURT: 18
helped stats (abs) min: 1 max: 9 x̄: 3.09 x̃: 2
helped stats (rel) min: 1.49% max: 23.53% x̄: 8.20% x̃: 6.45%
HURT stats (abs) min: 1 max: 7 x̄: 3.39 x̃: 3
HURT stats (rel) min: 0.78% max: 20.59% x̄: 10.45% x̃: 10.97%
95% mean confidence interval for instructions value: -2.18 -0.41
95% mean confidence interval for instructions %-change: -5.70% -0.38%
Instructions are helped.
total spills in shared programs: 146 -> 136 (-6.85%)
spills in affected programs: 39 -> 29 (-25.64%)
helped: 6
HURT: 0
total fills in shared programs: 617 -> 598 (-3.08%)
fills in affected programs: 125 -> 106 (-15.20%)
helped: 6
HURT: 0
HURT shaders are vertex shaders where we may need more instructions
for non-packed vec3s. It's acceptable trade-off since we don't get
precision downgrade if this varying is coordinates for a sampler.
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13151>
Driver should enable this cap if it prefers varyings to be aligned
to power of two in a slot, i.e. vec4 in .xyzw, vec3 in .xyz, vec2 in .xy
or .zw
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13151>
We no longer have reg defs for the HI fields, so all we can access from
lua is the low 32 bits. LUA has only double-precision floats for numbers,
so we can't fix that. However, the high bits are almost always the same,
so it's not that big of a deal to be ignoring them for this script.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13733>
For vulkan video I need these to parse slice headers, so move
them somewhere easier to get at them.
drops pointer_to_uintptr in favour of a cast.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13768>
there's no way to know what an image will be used for, so this bit needs
to always be added
fixes KHR-GL46.packed_pixels.varied_rectangle.compressed_rgb
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13798>
When you enable video genxml, lots of warnings about redefined things
appear, just clean those up before things get started.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13788>
No matter the format, this should return true if the instruction has an
exec operand.
Otherwise, eliminate_useless_exec_writes_in_block() could remove an exec
write in a block if it's successor begins with:
s2: %3737:s[8-9] = p_parallelcopy %0:exec
s2: %0:exec, s1: %3738:scc = s_wqm_b64 %3737:s[8-9]
Totals from 3 (0.00% of 150170) affected shaders (GFX10.3):
CodeSize: 23184 -> 23204 (+0.09%)
Instrs: 4143 -> 4148 (+0.12%)
Latency: 98379 -> 98382 (+0.00%)
Copies: 172 -> 175 (+1.74%)
Branches: 95 -> 97 (+2.11%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: bc13049747 ("aco: Eliminate useless exec writes in jump threading.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5620
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13776>
for now only for texture dest and if there is no rounding mode required.
Totals from 2 (0.00% of 150170) affected shaders: (GFX10.3)
CodeSize: 7980 -> 7948 (-0.40%)
Instrs: 1441 -> 1422 (-1.32%)
Latency: 7703 -> 7626 (-1.00%)
InvThroughput: 2336 -> 2302 (-1.46%)
VClause: 34 -> 36 (+5.88%)
Copies: 57 -> 58 (+1.75%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13592>
If the same cmdbuf is submitted more than once, they were waiting on
the same fence value. Fix this by clearing the value when beginning
a new command buffer.
This might fix spurious GPU hangs, especially on GFX9.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5401
Cc: 21.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13777>
VK CTS just added some new tests to write to a compressed image
from a compute shader, which was overrunning memory.
The image width/height need to be sized according to the block
sizes to avoid overwriting memory.
dEQP-VK.image.sample_texture.*bit_compressed*
Cc: mesa-stable
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13618>
p_is_helper doesn't have any operands, so ACO's value numbering and/or
the pre-RA optimizer could incorrectly recognize two such instructions
as the same.
This patch adds exec as an operand to p_is_helper in order to achieve
correct behavior.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5570
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13577>