Now that TTN hooks this up to use_legacy_math_rules, we can flip the
switch and gallium nine can get the desired behavior from the hardware
instead of emitting math workarounds.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16176>
to allow exposing 4G - 1. The "SIZE" was also a misnomer because it meant
elements. This no longer clamps the size to INT_MAX in st/mesa.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16881>
The drivers not setting it were:
- nv30, which gets lowering using NIR's lower_fsat flag.
- r300, which gets lowering using NIR's lower_fsat flag.
- a2xx, which has was getting it optimized back to fsat anyway.
This drops the check for the cap from gallium nine. While nine does have
a non-nir path, I think it's safe to assume that if you have SM3
texturing, you can do fsat.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16823>
This is used for the old, buggy and slow GLSL IR loop unrolling
code. All drivers have now switched to the NIR unrolling code so
here we remove the CAP.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
With this option enabled range of input values for fsin and fcos is
limited to [-2*pi : 2*pi] by calculating the reminder after 2*pi modulo
division. This helps to improve calculation precision for large input
arguments on Intel.
-v2: Add limit_trig_input_range option to prog_key to update shader
cache (Lionel)
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16388>
The only interesting ones here were LOWER_IF_THRESHOLD (which previously
had connected to some lowering in GLSL that was broken in the face of side
effects), and FMA (which turned GLSL IR's fma() into TGSI_OPCODE_FMA
instead of MAD).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8044>
v2: Use os_get_available_system_memory() when kernel memory region
uAPI is not available (Lionel)
Cc: 22.1 <mesa-stable>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16210>
This cap is no longer TGSI specific, so let's rename it to reflect
reality.
Because the name got a bit vague when removing the TGSI-bits, let's add
some more details to the name.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
This isn't specific to TGSI, so let's update the name to reflect
reality.
Because the name of the opcode was TGSI specific, let's pick a new one,
based on the naming of the PIPE_CAP_TEXTURE_QUERY_LOD cap.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
These aren't spiecic to TGSI any more, so let's rename them to reflect
reality.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
These aren't specific to TGSI, so let's rename them to reflect the
reality.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
Similar to the previous commits, these aren't TGSI specific, so let's
drop TGSI from their name.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
This cap is no longer specific to TGSI, so let's rename it and update
the documentation to reflect that.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
This cap no longer has anything to do with TGSI, as the lowering happens
on GLSL IR, and applies just as much to NIR drivers. So let's rename
this cap and update the docs to reflect the current situation.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15316>
This callback is only intended for software rasterizers, layered drivers, and
other special drivers that go through the software winsys path. Remove the
unimplemented stubs from the Intel drivers.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Dave Airlie <airlied@redhat.com> [crocus]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15118>
This cap bit only affects DRI_PRIME setups. Since iris now uses the
blitter to perform dGPU -> iGPU copies asynchronously, it's better to
always use at least two backbuffers so the 3D engine can start rendering
the next frame during the copy.
See commit d17e752857 where this change
was made for radeonsi.
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13877>
v2: Fixup gpu_id computation, use minor of /dev/dri/* % 128 since we
don't know whether we get card0 or renderD128 for instance.
(Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com> (v1)
Acked-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13996>
The chipset_id should be named after i915 ioctl that's called
to get the device id. In user space this field holds pci device
id in reality. We now have a pci_device_id queried from drm
instead using the ioctl, so there is no much reason to keep
the chipset_id for the same purpose.
Signed-off-by: Jianxun Zhang <jianxun.zhang@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13936>
With the new input from PCI bus and device fields, we can compute
device uuids in a multi-gpu system.
Signed-off-by: Jianxun Zhang <jianxun.zhang@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13936>
This capability is enabled for drivers supporting formatless image
writing in shader.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>
We're currently only calling it after creating the screen and the
bufmgr. There are a few cases where Iris checks for the DEBUG_BUFMGR
bit before we call brw_process_intel_debug_variable(), which means
intel_debug is 0 and so we don't run the debug code. Today, these are
all related to the creation of the workaround bo and its mmap.
I found this in a custom branch after I converted to INTEL_DEBUG an
environment variable that I had.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13780>
On GFX version 12.5+ with COMPUTE_WALKER, this is the limit based on the
size of the HW packet. On older HW, we can technically go a bit bigger
but there's not much point. Technically, some hardware can support a
scalar workgroup size up to 2048 but most apps don't go any bigger than
1024.
As discussed on the merge request page, the current limit assumes
SIMD32, but it is unclear if we want to encourage applications to use
SIMD32 if it may lead to additional register spilling in shader
programs. Many applications have likely tuned for a limit of 1024
based on the OpenGL minimum limit, so it might not gain much by
advertising more than 1024.
Reworks:
* Jordan: Use MIN2 and limit total invocations as well.
* Jordan: Add second paragraph to commit message based on merge
request discussion.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13538>
When a device has its first slice/subslice fused off, we can't use the
number of slices/subslices to iterate the mask array.
v2: Fix spelling (Marcin)
Use size_t for iterator (Marcin)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Matt Roper <matthew.d.roper@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5601
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10015>
INTEL_DEBUG is defined (since 4015e1876a) as:
#define INTEL_DEBUG __builtin_expect(intel_debug, 0)
which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.
Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.
Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
We don't want to suballocate some buffers, such as ones that we know
we're intending to export to other clients, or ones with special
semantics (such as the workaround BO not having proper synchronization).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12623>
The vast majority of cases need no handling at all, as they simply
read bo->address or similar fields, which works in either case.
Some other cases simply need to unwrap to look at the underlying BO.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12623>
We would like to start performing slab allocation of resources, where
multiple resources can be backed by a single GEM object.
Originally, I had thought to move busy tracking, cache domain tracking,
and so on into resources themselves, instead of having them at the BO
level. Multiple resources would point at the same BO with an offset.
Unfortunately, this meant adjusting the batch BO pinning code to take
resources rather than BOs. That cascades into needing iris_address
for genxml packing to store resources, not BOs. Which means that places
which have use raw BOs would need to start creating resources instead.
Except some places, like aux BO handling, really don't make sense as
pipe resources and really would rather use raw BOs. So iris_address
would need to store both, which convolutes the genxml field. And,
having a BO and resource means that every place in the code needs to
handle that offset correctly. It sounds simple, but is a giant mess.
Instead, we take a different route: adjust iris_bo itself, so that BOs
are either be backed by a GEM object (as is the case today), or backed
by another underlying BO. "Real" BOs have bo->gem_handle != 0. "Slab
allocated" or "fake" or "wrapper" BOs have bo->gem_handle == 0. We move
fields into a union based on these cases. amdgpu takes this approach.
This sounds complex at first glance---in theory, every place that
interacts with BOs might need to handle the wrapper BO special case.
But in practice, they don't. For suballocated BOs, we can set the
wrapper's address field to the underlying BO's address plus any offset,
at which point it looks like any other BO. Most other properties are
easily queried; the main code that needs updating is execbuf handling
and bufmgr internals.
For now, we simply move the fields. Any code that accesses either
bo->real.* or bo->gem_handle will need updating in future patches to
actually handle the slab-allocated case.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12623>
There's going to be at least one more shader function set in
pipe_screen, so it makes more sense to do it in iris_program.c.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>