In that case we need to use the sysval. That sysval can be optimized anyway in
the nonvariable case. Fixes test_basic.get_linear_ids on panfrost.
Fixes: 998d84fca5 ("nir/lower_system_values: Support lowering more intrinsics")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18662>
Setting the NIR options takes care of iris thanks to the common st/mesa
linking code, and updating brw_nir_link_shaders should handle anv.
The main effort here is updating remap_tess_levels, which needs to
handle vector stores, writemasking, and swizzling. Unfortunately,
we also need to continue handling the existing single-component
access because it's used for TES inputs, which we don't vectorize.
We could try to vectorize TES inputs too, but they're all pushed
anyway, so it wouldn't buy us much other than deleting this code.
Also, we do have opt_combine_stores, but not one for loads.
One limitation of using nir_vectorize_tess_levels is that it works
on variables, and so isn't able to combine outer/inner writes that
happen to live in the same vec4 slot (for triangle domains). That
said, it's still better than before.
For writes, we allow the intrinsics to supply up to the full size
of the variable (vec4 for outer, vec2 for inner) even if the domain
only requires a subset of those components (i.e. triangles needs 3).
shader-db results on Icelake:
total instructions in shared programs: 19600314 -> 19597528 (-0.01%)
instructions in affected programs: 65338 -> 62552 (-4.26%)
helped: 271 / HURT: 0
helped stats (abs) min: 6 max: 24 x̄: 10.28 x̃: 12
helped stats (rel) min: 1.30% max: 18.18% x̄: 5.80% x̃: 7.59%
95% mean confidence interval for instructions value: -10.71 -9.85
95% mean confidence interval for instructions %-change: -6.17% -5.43%
Instructions are helped.
total cycles in shared programs: 851842332 -> 851808165 (<.01%)
cycles in affected programs: 618577 -> 584410 (-5.52%)
helped: 271 / HURT: 0
helped stats (abs) min: 64 max: 540 x̄: 126.08 x̃: 111
helped stats (rel) min: 2.57% max: 37.97% x̄: 6.12% x̃: 5.06%
95% mean confidence interval for cycles value: -135.35 -116.80
95% mean confidence interval for cycles %-change: -6.67% -5.57%
Cycles are helped.
total sends in shared programs: 1025238 -> 1024308 (-0.09%)
sends in affected programs: 6454 -> 5524 (-14.41%)
helped: 271 / HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 3.43 x̃: 4
helped stats (rel) min: 5.71% max: 25.00% x̄: 14.98% x̃: 17.39%
95% mean confidence interval for sends value: -3.57 -3.29
95% mean confidence interval for sends %-change: -15.42% -14.54%
Sends are helped.
According to Felix DeGrood, this results in a 10% improvement in
the draw call time for certain draw calls from Strange Brigade.
v2: Fix assertions about number of components and add more of them.
Combine the quads and triangles handling as it's nearly identical.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19061>
This boring refactor:
- makes it consistent for extension name alias
- shortens the line a bit to not further regress line width
- applies macro when possible to be consistent
- removes some redundant empty lines
v2: rebase
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18993>
if a resource is in use every frame and never goes idle, it becomes
impossible to execute pruning, as there is no tracking for when views
are no longer in use
to avoid eventually ooming in this scenario, add some data to zink_resource_object
which can effectively "queue" pruning of these views if ballooning is
detected at a time when the views are guaranteed to be safe to delete
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19056>
the existing model for descriptor pools works thusly:
* allocate pool
* when possible, reuse existing pool from overflow array
* when pool is full, append to overflow array
* on batch reset, cycle the overflow arrays to enable reuse
the problem with this is it uses two separate arrays for overflow, and these arrays
are cycled but never consolidated, leading to wasted memory allocations for whichever
array is not currently being recycled
to resolve this, make the following changes to batch resets:
* set the recycle index to whichever overflow array is smaller
* copy the elements from the smaller array to the larger array (consolidate)
* set the number of elements in the smaller array to 0
this leads to always appending to an empty array while the array containing all the
"free" pools is always the one that's available for recycled use
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19053>
this mechanism worked off the previous iteration of descriptor updating,
in which pools were stored in a set to the batch state and could be iterated
normally
now, however, they're stored as a sparse array, and so the dynarray util for getting
the number of elements cannot be used
instead, use the calculated size for the array like every other accessor for
the pools to ensure correct indexing
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19053>
in the case where a new descriptor pool cannot be allocated naturally:
* iterate free batch states and try to free up memory
* iterate in-use batch states and try to free up memory
* ???
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19053>
Easy refactor. Change the storage type from `VkImageMemoryBarrier *` to
`void *`. Prepares for VK_KHR_synchronization2.
The patch series is cleaner with this refactor. I promise.
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19046>
Reduces noise in following patches.
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19046>
Last EGL header update changed the logic for the Xlib header inclusion. Now
the caller has to specify USE_X11 if they want the Xlib definitions.
Signed-off-by: Simon Zeni <simon@bl4ckb0ne.ca>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18848>
Since [1], EGL removed the inclusion of the Xlib headers by default.
The logic is now reversed, and the call has to define USE_X11 to include the
Xlib headers instead of EGL_NO_X11_HEADERS to prevent the inclusion.
[1]: 3670d645f4
Signed-off-by: Simon Zeni <simon@bl4ckb0ne.ca>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18848>
With the latest header update from Khronos, KHRONOS_APICALL (which is then
defined as EGLAPI) is empty, preventing the API symbols to be visible.
This commit adds `PUBLIC` to all the symbols from the EGL API that needs to be
visible.
Signed-off-by: Simon Zeni <simon@bl4ckb0ne.ca>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18848>
Move the assignment of draw->start_index into the num_draws loop
in draw_pt_arrays() so that it's updated for each primitive we're
drawing. Previously, it was only set once to the 0th primitive's
start index.
This fixes incorrect gl_BaseVertex values when using
vkCmdDrawMultiEXT() with more than one VkMultiDrawInfoEXT item.
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19039>
The vertex_id_offset value needs to be used both for the elt and
linear cases. As it was, if a long primitive was split, the vertex id
seen in the VS was also getting to reset to zero for each batch of the
split prim's vertices.
This fixes at least five VMware test cases (dx11-dxsdk-FluidCS11,
dx11-amd-TressFX-v1.0, dx10-win7-dxsdk-samples-nbodygravity,
dx10-win7-dxsdk-samples-gpuboids, dx10-wide-points).
Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19039>
So we have less stuff in the global namespace
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
Instead of crewating output or validating in the process code, just
process, then let main handle the rest
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
This makes two relatively small changes, first it addes the encoding to
the xml delcaration, and switches the quote style. Second, it changes
the final newline. These seemed minor enough to not warrent patches to
make the old wrter do the same thing as the new writer.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
This removes a bunch of hand-written code, and allows for fewer corner
cases. The resulting code has some minor differences (no empty newlines
and the encoding is declared in the xml declaration)
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
When we move to using etree to print this it won't have them, so this
minimizes the diff further.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
ElementTree.write will do this, and we want to minimize the diff when we
switch from our own writer to the builtin one.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18955>
To be able to use it from the common pipeline cache implementation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19047>