This has been stubbed since 2019, but wasn't advertised or implemented.
Neither kernel offers us any interface for tracking evictions, but we
can at least report the size of various heaps.
Fixes various SPECviewperf subtests on Alchemist.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27168>
Our actual timestamp register from the hardware is 36-bits these days.
Depending on the clock base, a few of the low bits may be discarded.
We scale by 52.08 or so (see intel_device_info_timebase_scale()) in
order to convert clock ticks to nanoseconds. This value could take
~43-44 bits to represent. When our timestamp register overflows, the
reported value would overflow at some value which isn't a power-of-two.
For some reason, when we implemented ARB_timer_query for i965 in 2012,
we thought applications would use GL_QUERY_COUNTER_BITS to detect and
handle overflow, expecting our timer queries to overflow at a power-of-
two value. We tried to hack around this by reporting a 36-bit counter,
for what was then a 32-bit(?) timestamp register, with a timebase scale
of 80(?)...and reported the nanoseconds modulo 2^36, with some
handwaving about wrap around times being close enough.
I don't think this is what anyone wants. All other Gallium drivers
(except maybe zink) report 64 here, as does the Intel Windows driver.
The ARB_timer_query spec defines it as:
"If <pname> is QUERY_COUNTER_BITS, the implementation-dependent
number of bits used to hold the query result for <target> will be
placed in <params>. The number of query counter bits may be zero,
in which case the counter contains no useful information."
and it also mentions about overflow:
"If the elapsed time overflows the number of bits, <n>, available to
hold elapsed time, its value becomes undefined. It is recommended,
but not required, that implementations handle this overflow case by
saturating at 2^n - 1."
There's nothing about roll-over happening at power-of-two times, just
that the value returned has to fit in that many bits, and if the value
were to exceed that, it's undefined, optionally saturated to the maximum
representable value.
This patch makes us report 64 like other drivers, and stop taking the
modulus. Technically, our roll-over will happen before we reach the
number of query counter bits, which may or may not be valid. But this
does let us report longer times, and should be more desirable behavior.
Tested-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26058>
This may help debugging performance problems in the possible case that
TBIMR negatively impacts the performance of some application. It could
also allow applying application-specific band-aid fixes in the XML file
until a more general workaround is implemented.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
We have competent lowering in NIR already available.
Drivers exposing CAP_DOUBLES but not SHADER_CAP_DROUND:
- d3d12 (NIR lowers ~0 if the underlying impl doesn't do floats)
- svga (Now sets the NIR lowering options)
- softpipe (Doesn't do GL4 so you can't use doubles anyway)
- llvmpipe (Lowers dround_even in NIR and passees the rest through
successfully)
- zink (NIR lowers ~0 if the underlying impl doesn't do floats,
otherwise passes things through successfully, except needed
dround_even lowering to avoid lavapipe regression with
native doubles)
- r600 (sets NIR rounding lowering flags, and lowers all fsign)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25777>
Most drivers that can expose GL4 were claiming the cap anyway (llvmpipe,
softpipe, zink, iris, nvc0, radeonsi, r600, freedreno, d3d12), and just
doing lowering in NIR if nessary.
crocus was only claiming the cap for gen8, but the backend compiler
enables NIR lowering regardless.
svga is the only other GL4 driver that didn't set it, and we can just set
the NIR lowering flag.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25777>
This change allow us to insert the MI_SEMAPHORE_WAIT before/after
specific draw call. With GTX tool, we can always update the memory
address to unblock spinning wait.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24308>
Since the mesa state tracker can promote RGB texture formats
to RGBA texture formats (among other formats) without exposing
any of that information to a driver, it is more desirable to
have the behaviour of `PIPE_CAP_RGB_OVERRIDE_DST_ALPHA_BLEND`
be the default. This avoids rendering bugs where an application
sets `DST_ALPHA` blending on a format where there is no alpha
channel, that has been promoted to a format that actually has an
alpha channel. The driver can instead rely on the common code
in the state tracker to convert the blending parameter to one
that reflects the limitations of the application requested format,
as long as `PIPE_CAP_INDEP_BLEND_FUNC` is supported.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24044>
ARB_clear_texture is now implemented in common code.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23735>
This will be required for `cl_intel_required_subgroup_size`, but it
already helps implementing OpenCL subgroups as this allows us to check
with every subgroup size when implementing
`CL_KERNEL_LOCAL_SIZE_FOR_SUB_GROUP_COUNT`.
Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22893>
Now everyone's saying NIR, and doing any NTT internally. The only returns
of TGSI were in gallivm_get_shader_param() and
tgsi_exec_get_shader_param(), but the drivers were returning NIR instead
of calling down to them.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23114>
All drivers should now be using the appropriate NIR lowering, so we can
drop this pile of code.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22083>
Intel Gen9 GPUs have hardware ASTC support, but have a bug where they
don't handle denormalized values in void extent blocks correctly. This
isn't that hard to work around - on upload, we can detect such blocks,
and flush any denorms to zero. Because we're altering the data behind
the application's back, and applications can theoretically ask to
download the original unaltered image data, we unfortunately need to
maintain shadow copies of the data.
To make sure that we don't accidentally skip the void-extent flushing
via any fast-upload paths, and support download correctly, we plug this
into the st/mesa compressed texture format fallback paths, which store
a CPU copy of the original image data, and upload altered data.
This is unfortunately common code for what's likely to be a single
driver's issue (on a single generation), but it beats replicating an
entire framework we already have inside the driver.
Fixes dEQP-GLES3.functional.texture.compressed.astc.void_extent_ldr.*
using iris on Intel Gen9 GPUs.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4167
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21943>
The workaround address is used as a source for push constants when
there's no resource available, that address must be 32B aligned.
This fixes invalid address being used for buffers in
3DSTATE_CONSTANT_* packets.
Now that intel_debug_write_identifiers() already add the padding,
there's no need to include extra "+ 8" to the offset.
Thanks to Xiaoming Wang that contributed to find and fix this issue.
Fixes: 2a4c361b06 ("iris: add identifier BO")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21479>
Here adding kmd_type parameter to
intel_gem_read_render_timestamp(), intel_gem_can_render_on_fd() and
intel_gem_supports_protected_context().
Those 3 functions will have Xe implementations, the other functions
in intel_gem.h will not be called by Xe code paths so not adding
kernel_driver_type to it.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20773>
We can have multiple pipe_screen but only one iris_bufmgr per device.
So better to store intel_device_info into the shared iris_bufmgr and
save some memory.
Also in future patches iris_bufmgr will make more use of
intel_device_info.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19650>
I had simply copied these values from another driver when adding initial
compute support to iris. The actual hardware limit is UINT32_MAX (see
the GPGPU_WALKER/COMPUTE_WALKER ThreadGroupID{X,Y,Z}Dimension fields).
Thanks to Karol Herbst for noticing the unnecessarily low limit.
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7676
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19826>
Iris, hasvk and anv were fetching the same information, better do it
on one place.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19425>
We want to use nir_opt_large_constants() instead (which is already
enabled), since that doesn't involve uploading the large immediate data
array again on each CB0 update. The downside is a bit of addressing math,
since constant_data is accessed using 64-bit global addresses.
The shader-db results are a bit all over:
All Iris driver platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19910185 -> 19913931 (0.02%)
instructions in affected programs: 225374 -> 229120 (1.66%)
helped: 3 / HURT: 348
total cycles in shared programs: 856004856 -> 855016808 (-0.12%)
cycles in affected programs: 22832422 -> 21844374 (-4.33%)
helped: 277 / HURT: 101
total spills in shared programs: 6580 -> 6609 (0.44%)
spills in affected programs: 516 -> 545 (5.62%)
helped: 1 / HURT: 4
total fills in shared programs: 8235 -> 8267 (0.39%)
fills in affected programs: 1022 -> 1054 (3.13%)
helped: 1 / HURT: 3
total sends in shared programs: 1039347 -> 1039095 (-0.02%)
sends in affected programs: 16367 -> 16115 (-1.54%)
helped: 251 / HURT: 0
LOST: 5
GAINED: 2
LOST:
- 3 SIMD16 fragment shaders (Superposition)
- 2 SIMD16 compute shaders (Aztec Ruins)
GAINED:
- fake news... 2 SIMD8 compute shaders that replace the lost SIMD16
compute shaders.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16539>
Supported in all hardware and software drivers. Only that don't support
are virgl and svga, depending on host capabilities. I don't think
there's anything to be done there. This does give fewer places to screw
up the CAPs, though, because everyone wants ARB_buffer_storage.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Ol<C5><A1><C3><A1>k <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19392>
There are three tiers of drivers:
* Drivers that support MRT and support mixed colorbuffer formats. All modern
hardware fits this as it becomes a spec requirement.
* Drivers that do not support MRT. Then this CAP is a no-op, so we might as well
set it by default even here (this commit trivially enables the CAP for lima,
vc4, etanviv).
* Drivers that support MRT but do not support mixed colorbuffer formats! Very
little hardware fits this category as it doesn't suffice for MRT in most APIs.
Unfortunately we have a few drivers that are in this category, preventing us
from bulldozing the CAP altogether.
Given that the CAP only exists for a few legacy drivers, default it to being
enabled to avoid new drivers falling into the trap of forgetting to enable it.
Failing to set this CAP causes failures in
dEQP-GLES3.functional.fbo.completeness.*
Drivers which still do not set this CAP: nv30, r300 (older than r500), virgl in
some cases. r300/r400 is due to a hardware requirement as Emma points out.
v2: Advertise the cap on lima like the commit message claims (delete the special
case).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19079>
All current drivers reports supporting this cap, let's just assume
it's always supported.
It seems better to lower this in the drivers, like we already do for
etnaviv, panfrost and zink...
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Begrudgingly-reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19049>
Timestamp read is not in any hot path so there is no down-sides in
share the same function between iris, crocus, anv and hasvk.
Also while at it also dropping the functions to read MMIO from kernel,
the only use is read render timestamp so we don't need it.
v2:
- fix compilaton of ds
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18920>
when running recent Mesa on i855 (gen 2) without amber drivers:
error: Kernel is too old for Iris. Consider upgrading to kernel v4.16.
libGL error: glx: failed to create dri3 screen
libGL error: failed to load driver: iris
error: Kernel is too old for Iris. Consider upgrading to kernel v4.16.
libGL error: glx: failed to create dri2 screen
libGL error: failed to load driver: iris
move the i915 feature check to after the hardware generation check
which results in:
MESA: warning: Driver does not support the 0x3582 PCI ID.
libGL error: glx: failed to create dri3 screen
libGL error: failed to load driver: iris
MESA: warning: Driver does not support the 0x3582 PCI ID.
libGL error: glx: failed to create dri2 screen
libGL error: failed to load driver: iris
Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18563>
Intel has different Z interpolation float point rounding
than other mesa gpus
For example gl_Position.z = 0.0 will be interpolated to
gl_FragCoord.z = 0.5 for all gpus
gl_FragCoord = -0.00000001 will be interpolated to
gl_FragCoord.z = 0.4999999702 for Intel
and rounded to gl_FragCoord.z = 0.5 for other gpus
Games with LEQUAL depth func will fail depth test on Intel
and will pass it on other gpus in such case
This workaround lowers translated depth range
and several gl_FragCoord.z coords with extra small difference
will be translated to the same UINT16\UINT24\UINT32
value of an integer depth buffer
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7199
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18412>
For those drivers that don't make full use of the 64 bits in
pipe_query_result.u64.
Applications will make use of it via GL_QUERY_COUNTER_BITS to handle
when the value rolls over.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10770>
We have fine NIR lowering for this (already called from mesa/st), no need
for a separate GLSL pass.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18361>