This is a preparation for dropping this field since this value is
expected to be calculated by the drivers now for variable group size
case. And also the field would get in the way of brw_compile_cs
producing multiple SIMD variants (like FS).
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
The push.total field had three values but only one was directly
used (size). Replace it with a helper function that explicitly takes
the cs_prog_data and the number of threads -- and use that in the
drivers.
This is a preparation for ARB_compute_variable_group_size where the
number of threads (hence the total size for push constants) is not
defined at compile time (not cs_prog_data->threads).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
This is a similiar fix as bb2287ccdf ("gallivm/tessellator: use
private functions for min/max to avoid namespace issues").
Fixes: ab55708200 ("swr/rasterizer: Add tessellator implementation to the rasterizer")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
Tested-by: Jan Zielinski <jan.zielinski@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4208>
Fix Segmentation fault during vaapi enc test on Arcturus.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4472>
This re-enables the change made in 2f5783bc2b which was
incorrectly disabled by 3e1dd99adc.
For radeonsi, we will prefer the NIR pass as it'll generate better code
(some index calculation and a single load vs. a load, then index
calculation, then another load) and oftentimes NIR optimization can kick
in and make all the access indices constant.
Fixes: 3e1dd99adc ("radeonsi: Remove a bunch of default handling of pipe caps.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4474>
debug_stack functions are implemented in another file for Android.
Also add backtrace library dependency.
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-bu: Kristian H. Kristensen <hoegsber@google.com>
Signed-off-by: Dominik Behr <dbehr@chromium.org>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2435>
Fix build error after llvm-11 commit 3a29393b4709 ("Remove
math.h/cmath include from DataTypes.h").
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c: In function ‘lp_build_linear_to_srgb’:
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:194:44: error: implicit declaration of function ‘powf’ [-Werror=implicit-function-declaration]
194 | exp2f_c * powf(coeff_f, 1.0f / exp_f));
| ^~~~
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:194:44: warning: incompatible implicit declaration of built-in function ‘powf’
src/gallium/auxiliary/gallivm/lp_bld_format_srgb.c:78:1: note: include ‘<math.h>’ or provide a declaration of ‘powf’
77 | #include "lp_bld_format.h"
+++ |+#include <math.h>
78 |
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4473>
Change brw_compute_vue_map() to also take the number of pos slots. If
more than one slot is used, the VARYING_SLOT_POS is treated as an
array.
When using Primitive Replication, instead of a single position, the
VUE must contain an array of positions. Padding might be
necessary (after clip distance) to ensure rest of attributes start
aligned.
v2: Add note about array in the commit message and assert that
pos_slots >= 1 to make clear 0 is invalid. (Jason)
Move padding to be after the clip distance.
v3: Apply the correct offset when gathering the sources from outputs.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
This commit updates tu6_emit_vpc to selectively emit GS-specifc
configuration. Most of this is repurposed from fd6_program.c.
This also refactors `link_geometry_stages` to ir3_nir_lower_tess.c
so it can be shared between fd and tu.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
This change removes the basic infrastructure to work with perfmon
from the perfmon query impl and puts it into its own place.
Makes the whole series easier to review and ends smaller changes.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1530>
For the upcoming conversion of perfmon queries to the acc query
framework we need a way to tell that the data is not ready.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1530>
Saves us from doing an extra flush in !wait case and seems more
logical now. Also evaluate etna_bo_cpu_prep(..) retun value.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1530>
We might end in cases where etna_acc_get_query_result(..) gets called
within one draw call (aka before flushing). At this point the status
of the resource was not set but gets used in etna_acc_get_query_result(..)
to handle different wait cases. Fix this issue by calling resource_written(..)
explicitly.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1530>
Simplify the interface a sampler provider needs to implement. The start(..)
and stop(..) functions got called by resume(..) and suspend(..) so lets
get rid of start(..) and stop(..). Also the way we count and use samples
is much easier to follow now.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1530>
The name hw queries was choosen as occlusion queries are 'feeling'
like nothing special. It is possible to interact with them only
via the command stream - unlike perfom queries where some kernel
magic is needed.
Accumulated HW queries is a much better name for this type of queries.
We read some hardware values over some draw calls and need to accumulate
them to get the final result.
This is some prep work for the following perfmon changes.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1530>
The 8 bit type wasn't 8-bit so when doing signed work we lost
the sign bit. This fixes it to use a proper vector type,
even if we just end up in here with the 1-wide path for now.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4425>
Depending on user's vdpau headers, not all of those defines may exist.
Eventually we may want a private copy of these, but this is simple
enough for now.
Fixes asserts when running vdpauinfo which supports these recently added
formats.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4108>
This fixes some regressions where -1.0/1.0 results got flipped, but it's still
broken in some cases.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4416>
In some cases, there can be garbage in the upper bits after the channel
decode - for dxt5 this didn't matter (as the upper bits are shifted out
anyway) but for rgtc2 formats it does.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4416>
While we're at it, drop trying to re-calculate the max-size from the
max-level. It's not accurate on any drivers where the max-size isn't a
power of two anyway.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4307>
With the introduction of the live shader cache, when a shader is
fetched from the cache no stats are printed for shaderdb.
So in a sequence like this: vs1, fs1, vs1, fs2, shaderdb may see
3 or 4 lines, depending on the threads being used.
If one run produces 3 lines while the other produces 4 lines, it
would compare vs1 stats with fs2 stats.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4355>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4355>