Commit graph

141245 commits

Author SHA1 Message Date
Iago Toral Quiroga
a48cb7534d v3dv: refactor descriptor updates
Make helper functions for all descriptor types and have them handle
all of the descriptor update so we can reuse them later to implement
template updates.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11213>
2021-06-07 11:10:49 +00:00
Tony Wasserka
3c390e2eb6 aco/scheduler: Move cursor handling state to dedicated interfaces
This clarifies the semantics of the index variables compared to the previous
version, which used the same variables in a slightly different way depending
on whether they were used for downwards moves or upwards ones.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10885>
2021-06-07 12:09:39 +02:00
Tony Wasserka
81761a311e aco/scheduler: Clean up register demand tracking
Refactoring total_demand and total_demand_clause to cover non-overlapping
instruction intervals makes the code easier to follow and allows the register
demand to be updated more efficiently in some cases.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10885>
2021-06-07 12:09:39 +02:00
Marcin Ślusarz
2ebf4e984b intel/disasm: remove useless space after "("
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11070>
2021-06-07 08:46:11 +00:00
Marcin Ślusarz
daba2894ff intel/disasm: decode/describe more send messages
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11070>
2021-06-07 08:46:11 +00:00
Pierre-Eric Pelloux-Prayer
a57e90bfea winsys/amdgpu: use int16 for buffer_indices_hashlist
int16 allows to correctly store the indices of 32k buffers; this
seems sufficient and is twice smaller than regular int.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11010>
2021-06-07 07:38:35 +00:00
Pierre-Eric Pelloux-Prayer
a981105d90 winsys/amdgpu: reduce amdgpu_cs size
buffer_indices_hashlist is only used by the current
amdgpu_cs_context (= amdgpu_cs.csc).

So store a single 16k int array instead of 2, and switch
the owner when flushing the cs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11010>
2021-06-07 07:38:35 +00:00
Pierre-Eric Pelloux-Prayer
74c67f2b72 amdgpu/winsys: remove amdgpu_cs_has_chaining
Store this property in admgpu_cs instead of using a function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11010>
2021-06-07 07:38:35 +00:00
Pierre-Eric Pelloux-Prayer
1bd64d8cfb winsys/amdgpu: don't read bo->u.slab.entry after pb_slab_free
Otherwise the pb_slabs might be freed by another thread in between.

Valgrind example:

==676841== Invalid read of size 1
==676841==    at 0x6B0A8B3: get_slab_wasted_size (amdgpu_bo.c:659)
==676841==    by 0x6B0AD7D: amdgpu_bo_slab_destroy (amdgpu_bo.c:684)
==676841==    by 0x6ACF94F: pb_destroy (pb_buffer.h:259)
==676841==    by 0x6ACF94F: pb_reference_with_winsys (pb_buffer.h:282)
==676841==    by 0x6ACF94F: radeon_bo_reference (radeon_winsys.h:754)
==676841==    by 0x6ACF94F: si_replace_buffer_storage (si_buffer.c:274)
==676841==    by 0x6957036: tc_call_replace_buffer_storage (u_threaded_context.c:1554)
                            [...]
==676841==    by 0x4ECCDEE: clone (clone.S:95)
==676841==  Address 0x27879945 is 5 bytes inside a block of size 208 free'd
==676841==    at 0x48399AB: free (vg_replace_malloc.c:538)
==676841==    by 0x6B0E8BD: amdgpu_bo_slab_free (amdgpu_bo.c:863)
==676841==    by 0x6B89D4A: pb_slabs_reclaim_locked (pb_slab.c:84)
==676841==    by 0x6B89D4A: pb_slab_alloc (pb_slab.c:130)
==676841==    by 0x6B0EE7F: amdgpu_bo_create (amdgpu_bo.c:1429)

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4736
Fixes: 965c6445ad ("winsys/amdgpu,radeonsi: add HUD counters for how much memory is wasted by slabs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11010>
2021-06-07 07:38:35 +00:00
Pierre-Eric Pelloux-Prayer
cd9be8741f radeonsi: dirty msaa_config on rs->multisample_enable change
si_emit_msaa_config uses si_get_num_coverage_samples, and
si_get_num_coverage_samples depends on old_rs->multisample_enable.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4613
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11182>
2021-06-07 09:21:45 +02:00
Iago Toral Quiroga
017a150984 v3dv: expose VK_KHR_storage_buffer_storage_class
This extension is basically only wrapping SPV_KHR_storage_buffer_storage_class
which is entirely implemented in the SPIR-V frontend.

Relevant CTS tests:
dEQP-VK.glsl.opaque_type_indexing.ssbo_storage_buffer_decoration.*
dEQP-VK.spirv_assembly.*

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11184>
2021-06-07 06:09:01 +00:00
Iago Toral Quiroga
71b2ae66c2 v3dv: document VK_KHR_relaxed_block_layout as implemented
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11184>
2021-06-07 06:09:01 +00:00
Steve Pronovost
29e3a11d78 d3d12: Add mechanism for D3D12 Adapter Selection
This add a simple mechanism to select which GPU adapter the d3d12
driver should be using. A new environment variable is introduced.

MESA_D3D12_DEFAULT_ADAPTER_NAME

This represent a substring to search for in the GPU descrition,
for example "NVIDIA" or "INTEL", or "NVIDIA GeForce RTX 3090",
etc...

GPU are searched in order and the first one to include the substring
becomes a match. If no match is found, we default to the first
enumerated GPU.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10710>
2021-06-07 02:22:34 +00:00
Ilia Mirkin
108f34a165 nv50: expose GL ES 3.1 for nva3+ hardware
This hardware supports all of the points of ES 3.1 with the minor
exception of non-red gather operations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10569>
2021-06-06 15:58:38 -04:00
Ilia Mirkin
73a49c84d7 nv50: expose images/buffers/compute
This is not enough for desktop GL, since that requires support for
images/buffers in fragment shaders. However this is sufficient for ES
3.1's compute needs, where images/buffers need only be supported in
compute shaders.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10569>
2021-06-06 15:58:38 -04:00
Ilia Mirkin
503d97445a st/mesa: allow hardware to claim ES 3.1 without hw indirect draws
Such a driver will be expected to handle indirect draws via emulation.
As such we don't want to expose the ext in desktop GL contexts. However
for ES 3.1 it's a required feature, so makes sense to allow fallbacks.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10569>
2021-06-06 15:58:38 -04:00
Ilia Mirkin
08fe7d72d1 mesa/get: allow image/buffer/atomic variables to be fetched in es3.1
Right now these rely on the desktop extension enables being set. However
some drivers may not be able to support that full functionality. Allow
presence of ES 3.1 to be sufficient.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10569>
2021-06-06 15:58:38 -04:00
Ilia Mirkin
a5379ef4a7 st/mesa: properly encode OES_geometry_shader requirement
Since the extension was added, we grew a cap to expose the number of
invocations. Use it to prevent geometry shaders from being spuriously
exposed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10569>
2021-06-06 15:43:06 -04:00
Ilia Mirkin
584799d6a3 mesa: relax ES 3.1 compute shader requirements
The desktop extensions require more than what's needed for ES 3.1.
Reduce this to allow implementations to expose ES 3.1 without supporting
desktop functionality.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10569>
2021-06-06 15:43:02 -04:00
Ilia Mirkin
00c46fec43 st/mesa: avoid enabling image/buffer/compute extensions for weak hardware
The requirements for ES 3.1 are lower than the requirements for desktop
GL. The thread block size can be smaller, and images/buffers/atomics
need not be supported in the fragment stage. Allow a driver to expose
ES 3.1 without flipping on the desktop GL extensions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10569>
2021-06-06 15:42:55 -04:00
Ilia Mirkin
7d49a6f23c nouveau: improve video limit reporting
This corrects max width/height/macroblocks reporting, in line with what
the nvidia driver docs suggest is supported.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10677>
2021-06-06 19:09:44 +00:00
Ilia Mirkin
d50e43c8a1 vdpau: allow state tracker to report a lower number of macroblocks
NVIDIA hardware can process tall or wide videos, but not both at the
same time (for some gens). This limit is provided in units of
macroblocks.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10677>
2021-06-06 19:09:44 +00:00
Ilia Mirkin
c7e877b0bf nvc0: fix 3d images
The hardware has no support for 3d image loads/stores. So present the
image as a larger 2d image and fudge the coordinates. Note that a 2d
image (in the shader) may be backed by a slice of a 3d image, so we
always have to do the coordinate adjustments for 2d as well.

This is largely copied from the nv50 support, which has the same
restriction, with extra care taken to differentiate loads (which
specifies the X coordinate in bytes) and stores, which specifies it in
(formatted) pixels.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10820>
2021-06-06 18:26:26 +00:00
Ilia Mirkin
729020c7e0 nv50: fix streamout queries
Prior to an earlier commit, xfb queries were not being marked as 64-bit.
The end result of this is that they would never appear to be "ready",
which in turn led to there always being a wait happening.

Once these got marked as 64-bit, we started checking the attached fence
for being signalled. However the screen fence does not seem to be enough
to wait for the streamout query data to actually be written out. So
instead we add a bit of extra "data" which emulates the 32-bit query way
of doing things (with the payload in front) which is emitted from the
same "unit" as the other streamout data. This seems to be sufficient.

Note that it does not seem to be required to actually emit the final
32-bit query from the streamout unit, but that seems logical and perhaps
there are edge cases where it is required.

While at it, also make the sequence management/initialization more
similar to the nvc0 driver.

Fixes dEQP-GLES3.functional.transform_feedback.*

Fixes: 58d47ca324 ("nv50: add compute invocations counter")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10867>
2021-06-06 18:11:54 +00:00
Vinson Lee
c51bdac742 v3dv: Fix assert.
Fix defect reported by Coverity Scan.

Side effect in assertion (ASSERT_SIDE_EFFECT)
assignment_where_comparison_intended: Assignment deviceMask = 1U
has a side effect. This code will work differently in a non-debug
build.

Fixes: 234e1b7356 ("v3dv: implement VK_KHR_device_group")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11197>
2021-06-05 23:04:14 -07:00
Yiwei Zhang
5bc47c9cc2 venus: unify VkNativeBufferANDROID and AHardwareBuffer image create info
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11195>
2021-06-05 22:50:23 +00:00
Yiwei Zhang
3a894d00bc venus: refactor gralloc buffer and drm modifier properties query
1. Code clean up
2. Fixed a misused allocator
3. Add error logs for external memory interop

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11195>
2021-06-05 22:50:23 +00:00
Alyssa Rosenzweig
0e2293a52b agx: Handle load_back_face_agx
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>
2021-06-05 20:38:22 +00:00
Alyssa Rosenzweig
c21168a26c agx: Lower front face to back face
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>
2021-06-05 20:38:22 +00:00
Alyssa Rosenzweig
75cafd8472 agx: Pack SR immediate
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>
2021-06-05 20:38:22 +00:00
Alyssa Rosenzweig
307b8f1b2f agx: List sr enum in Python
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>
2021-06-05 20:38:22 +00:00
Alyssa Rosenzweig
cc8fec8b74 agx: Generate enums from Python
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>
2021-06-05 20:38:22 +00:00
Alyssa Rosenzweig
223476aff3 agx: Model get_sr
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>
2021-06-05 20:38:22 +00:00
Alyssa Rosenzweig
f70068583c asahi: Mark special fragment inputs as sysvals
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>
2021-06-05 20:38:22 +00:00
Alyssa Rosenzweig
c509878971 nir: Add nir_intrinsic_load_back_face_agx
On AGX, the special register for front facing is inverted from its meaning in
APIs. We need to lower load_front_face to inot(load_back_face). Doing this in
the backend is trivial, but then we would miss out on algebraic optimizations
for the inot.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>
2021-06-05 20:38:22 +00:00
Dmitry Baryshkov
cac88b5f06 freedreno/regs: split old/not used phy registers to separate DB
In order to simplify main DSI host database, split away phy register
definitions used on DSI v2 hosts to the separate database file.

Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11075>
2021-06-05 19:20:50 +00:00
Rob Clark
80b1e042e4 freedreno: Don't return a flushed batch
Somehow fairly recently the traces CI job started hitting timeouts, not
all the time but enough to be inconvenient for CI.  I tracked it down to
getting into a situation where `ctx->batch->flush == true`, which causes
an infinite loop in the draw_vbo and clear paths (because
fd_batch_lock_submit() checks for flushed batch but fd_context_batch()
does not).  I'm not entirely sure how we get into that state, or what
triggered this (seems possibly triggered by !10937).  But it is easy
enough to recover.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11196>
2021-06-05 18:51:41 +00:00
Rob Clark
ad375d0579 freedreno: Fix typo
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11196>
2021-06-05 18:51:41 +00:00
Ville Syrjälä
db83dc619c i915: Implement __DRI2_FLUSH version 4
DRI3 needs version 4 of __DRI2_FLUSH.

Straight up port of i965 commit 313f2bc32b ("intel: Add
support for the new flush_with_flags extension.").

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9734>
2021-06-05 00:22:22 +00:00
Ville Syrjälä
1c312bfc41 i915: Implement __DRI_IMAGE_ATTRIB_OFFSET query
DRI3 needs __DRI_IMAGE_ATTRIB_OFFSET so implement it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9734>
2021-06-05 00:22:22 +00:00
Jason Ekstrand
b742f73913 intel/isl: Fix isl_format_is_valid
The format enum space isn't necessarily contiguous so we can't assume
that if it's in the table it's valid.  We need to check something.

Fixes: ed6e586562 "intel: properly constify isl_format_layouts"
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11191>
2021-06-04 16:16:44 -05:00
Hoe Hao Cheng
90a5fef85c nir: define NIR_ALU_MAX_INPUTS
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11172>
2021-06-04 19:33:13 +00:00
Emma Anholt
d6d7421e98 util/ra: Use the conflicting neighbor to skip unavailable registers.
Now that we have an idea of how many regs the conflicting allocation uses,
we can just skip to the next one and save repeated tests to find the same
conflicting neighbor again.

shadowrun-returns shader-db time on skl -1.62821% +/- 1.58079% (n=679),
now there's no statistically significant change from the start of the series
(n=420)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
2021-06-04 19:08:57 +00:00
Eric Anholt
23df5dba92 lima: Use ra_alloc_contig_reg_class().
This greatly simplifies our register allocation code and reduces the
number of registers RA has to walk over.

Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
2021-06-04 19:08:57 +00:00
Eric Anholt
cf33316ec0 intel/vec4: Use ra_alloc_contig_reg_class() to reduce RA overhead.
We go from 1672 RA regs to the real 128 HW regs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
2021-06-04 19:08:57 +00:00
Eric Anholt
40e1d798c6 intel/fs: Use ra_alloc_contig_reg_class() to speed up RA.
By using the new class type, we don't need to make 1928 different
registers to represent each contigous reg size starting from the actual
128 HW register, or have a mapping between RA regs and HW base regs.  With
the number of regs reduced, and the fast q computation when using the new
classes, we no longer need to compute our own q.

This drops the FS RA initialization time on my CFL system from about 1ms to
50us.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
2021-06-04 19:08:57 +00:00
Eric Anholt
ec3bc5da74 v3d: Use the ra_alloc_contig_reg_class() function to speed up RA.
It means we don't need to do the n^2 loop over the regs to set up the pq
values, nor do we need the register conflicts lists.

Acked-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
2021-06-04 19:08:57 +00:00
Eric Anholt
15aa8e9189 vc4: Use the ra_alloc_contig_reg_class() function to speed up RA.
It means we don't need to do the n^2 loop over the regs to set up the pq
values, nor do we need to allocate conflicts lists.

Acked-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
2021-06-04 19:08:57 +00:00
Eric Anholt
2d7bcdaf6b ra: Add fast-path support for register classes of contiguous regs.
In the fully general case of register classes, to expose an allocation
class of unaligned 2-contiguous-regs allocations, for example, you'd have
your base individual regs (128 on intel), and another set of 127 regs that
each conflicted with the corresponding pair of the base regs.  Single-reg
nodes would allocate in the 128, and double-reg nodes would allocate in
the 127 and the user would remap from the 127 down to the base regs with
some irritating table.

If you need many different contiguous allocation sizes (16 is a pretty
common number across drivers), your number of regs explodes, wasting
memory and making the q computation expensive at startup.

If all the user has is contiguous-reg classes, we can easily compute the q
value up front (as found in the intel driver and nouveau, for example),
and we only have to change a couple of places in the conflict-checking
logic so the contiguous-reg classes can use the base registers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
2021-06-04 19:08:57 +00:00
Eric Anholt
95d41a3525 ra: Use struct ra_class in the public API.
All these unsigned ints are awful to keep track of.  Use pointers so we
get some type checking.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9437>
2021-06-04 19:08:57 +00:00