It's not only for GL, change to a generic name.
Use command:
find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>
See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in
And this causes build errors when building for C23:
-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
123 | #define unreachable(str) \
| ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
456 | #define unreachable() (__builtin_unreachable ())
| ^~~~~~~~~~~
-----------------------------------------------------------------------
So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.
Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.
This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.
All the instances of the macro, including the definition, were updated
with the following command line:
git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
while read file; \
do \
sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
done && \
sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
Add support for disabling the VRT (Variable Register Thread) feature.
The strategy here is to force the old BRW_MAX_GRF limit for the
register allocator (locks the upper limit) and make sure
ptl_register_blocks() always return that amount of blocks (locks
the lower limit).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35781>
The recommended settings is just a guidance and not a programming
requirement as per the Bspec.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35904>
Spec recommended values should give us a good balance between progress
in render and compute engines, also with less possibility of values
it will reduce the number of times that we need to emit
STATE_COMPUTE_MODE reducing the number of stalls in the compute engine.
Cc: stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35563>
Spec has several restrictions to the values we program to compute
engine async thread limits.
Without those we risk hit deadlocks, so here adding a function to
return the optimal value based on those restrictions.
Cc: stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35563>
Fixes#13241, where iris_bufmgr occasionally deadlocks while allocating buffers.
The deadlock happens when iris_bufmgr.c calls intel_aux_map_unmap_range while
holding the bufmgr lock, with a range that includes pages that were never created,
which can happen because the iris_resource that adds aux mappings will sometimes
use a slightly larger buffer with an offset to ensure the resource is aligned
correctly.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35266>
We were not using the minimum values from devinfo for anything. For
tessellation control, the minimum value is 0, so we continue taking
MAX2 of that with 1 when tessellation is enabled so we have at least
something guaranteed to be present. For geometry, the minimum value
is already non-zero (and updated by the previous patch).
This will have the side-effect of raising the minimum number of URB
entries for geometry stages. This is currently not known to fix
anything, but should be more closely following the documentation.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
If PXP initialization is not completed and application requested a
protected context the GEM_CONTEXT_CREATE will wait up to 250ms for
PXP to finish initialization but if that do not happens it will
return a error and set errno to EIO.
This patch add the missing retry handling.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30723>
Some compiles don't initialize the upper 32bits of the union that has
u64 addr and u32 handle.
Similar to previous patches but doing that for code in intel/misc.
Cc: stable
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33172>
These don't seem to be disallowed by recent hardware anymore. Stop
disabling SIMD32 due to hardware restrictions of multisample
rasterization, since it should have better performance, and on Xe3+
there may be no shader variant available other than SIMD32.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>
reduces a bit of boilerplate.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33242>
Coverity notices that `err` might be used uninitialized, which is true
as we don't assign the value we want to check! Fix that assignment so
the EXPECT_EQ macro does what we expect.
CID: 1635272
Fixes: 6b931a68c7 ("intel/common: Implement Xe KMD in mi_builder tests")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32849>
If the assert were to fail the memory would leak, which is pretty
harmless in a unit test, but the fix is trivial.
CID: 1635429
Fixes: 6b931a68c7 ("intel/common: Implement Xe KMD in mi_builder tests")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32849>
No functional change, just move i915 specific data to a struct
and check for kmd_type where appropriate. This will make the
next patch (which adds Xe KMD support here) cleaner.
This patch had to make intel_kmd.h header C++ friendly so it
can use its symbols.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32194>
Corrects regression caused by prior commit that created memory
overwrite by not mallocing enough space for filename string.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32013>
Split anv_xe_wait_exec_queue_idle() into 2 functions, the first
function creates the syncobj and prepares it to be signaled when the
last workload in queue is completed.
And the second one that calls the first function, then waits for the
syncobj to be signaled and destroy the syncobj.
The main reason for that is that the first function can be reused in
Iris and a future patch will add another user, so lets share it.
No changes in behavior are expected here.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30958>
As coverity points out, if the second uint64_t was greater than the
first (I don't think it actually can be), then the overflow would result
in the check succeeding when it shouldn't. We could cast this to an
integer type, but since we have uint64_t, we'd need int128_t for that.
Instead, replace the comparison to 0 with a direct comparison, since
that would give the correct result without potential to overflow.
CID: 1604833
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31175>
It has been a while that the GuC version with the compute engine fix
was released, same for the KMD uAPI to query the GuC firmware version.
So at this point this parameters do more harm than good.
Also just setting those don't enable the async compute and copy engines
this is not enabled by default on i915.
If user wants to disable or enable usage of those engines a better
approach would be use ANV_QUEUE_OVERRIDE.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30593>
Add a restriction on SIMD mode for fast-clear pixel
shader according to the Bspec.
Backport-to: 24.2
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29907>
On MTL ChromeOS boards, during AI based video conference, we were
observing a lot of overhead from invalidations. Upon debug, it was found
that we were using clflush in this function and that isn't efficient.
With this change, while executing compute workloads like zoo models, we
are getting ~25% performance improvements in a best case scenario.
Rework:
* Jordan: Call intel_clflushopt_range() rather than
__builtin_ia32_clflushopt() because intel_mem.c is not compiled
with -mclflushopt.
Backport-to: 24.1 24.2
Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30238>
Fixes the following building error:
../out_src/src/intel/common/intel_common.c:29:4: error: implicit declaration of function 'free' is invalid in C99 [-Werror,-Wimplicit-function-declarat
ion]
free(engine_info);
^
1 error generated.
Fixes: 5b8b4f78 ("intel/dev: Add engine_class_supported_count to intel_device_info")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29975>
GuC offers a mechanism for KMD/UMD to provide workload hints and one of
that strategy is low latency hint. We can utilize this hint when the
workload is more latency sensitive like compute usecases.
Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28282>
From Bspec 56423 (r58507), the legacy full resovling and
partial resolving options are gone since Xe2. They also
cause hang on Xe2 if not disabled.
Some suggested code from Nanley Chery <nanley.g.chery@intel.com> is
included.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>
Next patch will need to frequently get the count of supported engine
for compute and copy engines, so to reduce the overhead of doing
KMD queries at every call here caching this information into
intel_device_info struct.
With that ANV and Iris would need to set this information as intel/dev
can't depend on intel/common, so here adding a single function
to update intel_device_info with all fields filled by intel/common
functions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>
All the MI_SDI currently have forced write checks (meaning the command
streamer will stall until completion) on Gfx12.0+.
Now on Gfx12.0/12.5, the read commands have implicit waits on previous
writes (BSpec ). So if we're only dealing with CS writes & reads, we
don't need forced write checks.
In the few cases where CS is writing data for other bits of HW, we
need the forced write checks. This change adds an API that will let
the driver decide when to enable forced write checks.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>
When you want to write a value to a register or memory but you don't
know just yet that value when you emit the command.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>
In iris, use the CCS scale down factor to calculate the impact of CCS on
TBIMR tile sizes. Even though we fall back to a seemingly less accurate
method to calculate the impact of CCS, it ends up giving the same
answer, 1bpp. Anv already uses this factor, so this patch replaces the
constant with this macro.
There are two benefits to doing this:
1) Consistency between anv and iris.
2) Preparation for a future where we no longer use ISL surfaces to
describe CCS on Xe+. In fact, in iris, we already don't create such
surfaces on ACM.
I considered using INTEL_AUX_MAP_MAIN_SIZE_SCALEDOWN for the calculation
in both drivers, but the naming is aux-map specific and the scaledown
actually exists on flat-ccs platforms as well.
So, we introduce a new macro for all Xe platforms, currently only used
for the specific use case of TBIMR calculations. We can add more such
macros for future platforms, as needed.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>