Commit 3fc79582a1db ("drm/i915: Increase I915_PARAM_MMAP_GTT_VERSION version to indicate support for partial mmaps")
increased the I915_PARAM_MMAP_GTT_VERSION version, with that we can
detect what kernel version has the partial mmap fix or not and limit
the usage of this workaround.
This time o mmap will be used in memory pool, so here adding this
propertly to enable or not the feature.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
Re-organize the configuration lists to make easier to include BFloat16
only for the Gfx125+ that support it, while keeping MTL supporting the
"lowered" configurations from pre-Gfx125.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
Add debug option to show current shader type being
compiled within anv_shader_bin_create.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34596>
We are reaching our limit of adding flags to intel_debug
(apporaching 64 flags). Switch intel_debug to a bitset,
which gives us almost "unlimited" bits to use in the future.
v2(Michael Cheng): Fixed a few ci errors
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34596>
Refactored the existing debug flags to use an enum instead of
hardcoded 1ull << N macros. This is a prep step before the
eventual switch of intel_debug to a bitset.
Using enums gives us cleaner indexing and avoids annoying shift
overflow warnings. No functional changes yet.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34596>
Not supported. Some operations *do* work, but proper support
was removed since it also doesn't support DPAS.
Fixes: 9916cc1050 ("brw: Add BRW_TYPE_BF for bfloat16")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>
Gfx125+ has systolic, with exception for MTL and some ARL
variants. Update code and tests to use it.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>
Not used anymore, just use the existing ADL definitions.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>
Before commit b571ae6e7a ("intel: Make memory heaps consistent
between KMDs"), we had the following policy for reporting Sytem RAM
memory sizes:
- For OpenGL, we reported the total available RAM.
- For Vulkan, we reported the total available RAM as:
- 50% of the total RAM if the total RAM was <= 4GB,
- 75% otherwise
- In addition, the Memory Budget (for VK_EXT_memory_budget) is 90%
of the "free" memory, which can be an extra 10% off of the 50% or
75%.
When xe.ko was added, one key difference was noted: while i915.ko
reported the "real" RAM memory sizes in its ioctls, xe.ko reported
only 50% of the system RAM as available. Because of that (and other
reasons, see this discussion on MR 28513), commit b571ae6e7a decided
to unify the behavior by changing the Anv i915.ko rule to "always 50%"
instead of "50% or 75%". This also changed the Iris rule to 50%
instead of 100%.
In my research, I couldn't find any reason why this restriction should
also apply to Iris, so here we revert back to handling these size
restrictions on Anv only.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>
Apparently hwconfig has not implemented this workaround.
This warning was noted on MTL and ADL.
> INTEL_HWCONFIG_TOTAL_GS_THREADS (336) != devinfo->max_gs_threads (312)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34243>
Previously we were printing this information whenever the driver
started, but that proved to noisy.
For example, if running thousands of tests, this would cause thousands
of warnings messages to be printed. (Assuming the driver was built in
debug mode.)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34243>
We will move this check into the `intel_dev_info` tool. Unfortunately,
this means we will be much less likely to notice inconsistencies, but
the current strategy has proven to be far too noisy.
For example, if the driver was built in debug mode, then when test
suites are running thousands of tests, the current approach can lead
to thousands of messages being printed.
Closes: mesa/mesa#12141
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34243>
Xe3 different SKUs can have different max_subslices_per_slice and
Xe KMD topology uAPI only provide us the available subslices.
Therefore, to correctly calculate the available slices, we need
max_subslices_per_slice to match the hardware.
This change retrieves this information from hwconfig for Xe3+.
This avoids adding all the PTL intel_device_info variants.
Additionally, the PTL topology values are currently embargoed and
cannot be hard-coded in public source code.
This could be simplified if we decide to apply max_slices and
max_subslices_per_slice to all platforms that hwconfig is required.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33328>
intel-gpu-tools have a few more entries in enum intel_hwconfig, so
adding the missing ones to Mesa.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34157>
In 4064b5546b, the idea was to have the minimum value as if all
stages are active, however hwconfig does not follow that for the
tessellation control stage. Ignore min values from hwconfig.
Fixes: 4064b5546b ("intel/dev: reduce warning noise from urb settings");
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33554>
This was only needed on Sandybridge. We can delete the brw code,
and replace the generic devinfo bit with a helper inside the elk
compiler itself.
Thanks to Iván Briano for noticing we still had dead brw code for this.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
We've been programming our minimum number of URB entries for geometry
shaders to 2, but it appears that we should have been setting 8 on
Broadwell and later. Additionally, there's a workaround on Skylake
and later that requires us to add flushing (which we haven't) or use
a minimum of 16 URB entries.
This alone will not fix anything, as nothing reads this devinfo field
presently (will be fixed in the next commit).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
As we added new platforms, the device info macros evolved over time
Most platforms had a "FEATURES" macro, some had a "HW_INFO" macro,
a few had macros for URB entries - some with min entries only, some
with min and max, some including the .urb = { ... } braces, others not.
Thread counts or subslice info was sometimes considered FEATURES,
sometimes HW_INFO, sometimes inserted only in the final structure.
FEATURES macros often inherited from an ancestor platform, but not
necessarily the prior platform - many were based on GFX8_FEATURES.
Many redundantly set the same feature bits as prior platforms.
This patch aims to clean up the situation, so it's a little more
organized, especially if you look at multiple generations. Macros
are now split into several separate pieces:
1. The FEATURES macro only has architectural features, such as LSC,
ray tracing support, 64-bit integers, flat CCS, and so on. Thread
counts, subslice info, and URB sizes that may vary by SKU are not
included here. This makes it easy for one platform to inherit the
features from the previous, while not pulling in that extra data.
2. THREAD_COUNTS macros contain maximum thread counts from the
3DSTATE_VS documentation and so on.
3. URB_MIN_MAX_ENTRIES macros contain the entire URB configuration,
including .urb = { ... }.
4. PAT_ENTRIES macros (on modern platforms) contains our choice of which
PAT entries to use for various types of resources.
5. CONFIG macros combine all of the above into a tidy bundle for use
in defining various structures, and may also include the platform
macro or simulator ID for convenience.
On recent platforms where hwconfig tables exist, items #2-3 could
potentially be dropped and filled in from there instead. For XEHP+
where we require hwconfig, we instead have a PLACEHOLDER_THREADS_AND_URB
macro that makes it clear that these values are updated from hwconfig.
One nice thing is that the bits that could (or do) come from hwconfig
tables are now cleanly separate from those that do not (i.e. platform
feature support, PAT entry selection, and so on).
This patch does not touch GFX7 or earlier macros. We could probably
offer a similar treatment there, but they're generally working and not
quite as complex.
To verify that this commit does not have unintentional changes, I
recommend running
objdump -s build/src/intel/dev/libintel_dev.a.p/intel_device_info.c.o
before and after this commit, and diffing the output. The devinfo
structures produced are identical.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
intel_device_info_init_common calculates this for Gfx9+ based on
max_threads_per_psd and slice information. Mark it as zero in the
structures to make clear that the value there isn't useful, and make
it easier to diff binaries for the next commit's refactors.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
The documentation for 3DSTATE_URB_HS has 0 as the minimum number of HS
URB entries for all platforms. See BSpecs 32162, 47137, 56271 for
Gfx6-11, Xe, and Xe2-3, respectively.
This should silence warnings about our device info field not matching
the hwconfig tables.
Notably, nothing in our drivers currently uses this value so it cannot
have a functional impact.
Fixes: 4064b5546b ("intel/dev: reduce warning noise from urb settings")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
This is only needed for original 965G/GM clipper code, which only exists
in the legacy compiler. Send it off to live with the elk.
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
This is used in exactly one place in crocus, which already has a comment
indicating that this code is needed for original Gfx4 hardware. Just
replace that with a verx10 == 40 check.
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
This is used by a single place in ISL only for sanity checking the
decisions it has already made. The knowledge is already all centralized
in ISL these days, so we don't need a device info bit.
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33764>
Add an optimization pass to turn regular SENDs into SEND_GATHERs.
This allows the payload to be "broken" into smaller pieces that
can be further optimized, which _may_ result in
- less register pressure (no need to contiguous space), and
- less instructions (no need to MOV to such space).
For debugging, the INTEL_DEBUG=no-send-gather option skips this
optimization, and reporting how many opportunities were missed.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32410>
For Xe3+, set preferred SLM and SLM per threadgroup size.
Bspec: 73211
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32872>
This updates entry for 14017823839 which fixes issues on BMG with:
dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory.max_workgroup_memory.1
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32550>
Rather than checking hwconfig items when using them, wait until after
devinfo has been fully initialized. This includes having workarounds
implemented.
We can then check if the hwconfig data and final Mesa initialization
agree. If the match fails, we need to investigate if Mesa or the
hwconfig data is wrong.
This code becomes a no-op when not on a release build.
Fixes: a4c5bfd34c ("intel/dev: Use hwconfig for urb min/max entry values")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12141
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32359>
Real Xe KMD actually returns a uint64, so here changing from uint32
to uint64.
Fixes: 04bdbeec31 ("intel/dev/xe: Fix access to eu_per_dss_mask")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32527>
DRM_XE_TOPO_EU_PER_DSS and DRM_XE_TOPO_SIMD16_EU_PER_DSS can be any
number of bytes long but it was assuming it was always 4 bytes long.
That was not a issue because Xe KMD return 4 bytes even if only needs
1 or 2 bytes but that is a problem with our HW simulator that was
returning 2 bytes.
Fixes: a24d93aa89 ("intel/dev: Query and compute hardware topology for Xe")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32307>
This commit allows you to dump different regions of memory related to
bvh building. An additional script to decode the memory dump is also
added, and you're able to view the built bvh in 3D view in html. See the
included README.md for usage.
Rework:
- you can now view the actual child_coord in internalNode in html
- change exponent to be int8_t in the interpreter
- fix the actual coordinates using an updated formula
- now you can have 3D view of the bvh
- blockIncr could be 2 and vk_aabb should be first
- Now, if any bvh dump is enabled, we will zero out tlas, to prevent gpu
hang caused by incorrect tlas traversal
- rootNodeOffset is back to the beginning
- Add INTEL_DEBUG=bvh_no_build.
- Fix type of dump_size
- add assertion for a 4B alignment
- when clearing out bvh, only clear out everything after
(header+bvh_offset)
- TODO: instead of dumping on destory, track in the command buffer
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>