Fix this build error on Ubuntu 18.04.
/usr/bin/ld: src/util/libmesa_util.a(u_cpu_detect.c.o): undefined reference to symbol 'pthread_once@@GLIBC_2.2.5'
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110663
Suggested-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
We don't even support replay anymore; this is just wasting characters
and adding clutter.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Mali job dependency graphs, at least for GLES3.0, have the special
property that a given node will only have at most a single dependent.
This allows us to efficiently precompute the dependent array and
replace an inner loop's O(N) search with an O(1) lookup, bringing the
algorithmic complexity of scoreboarding from O(N^2) to O(N).
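The precomputation can be sketched roughly as follows. This is an illustration, not the driver's code: `struct job`, `NO_JOB`, `MAX_DEPS` and `precompute_dependents` are hypothetical names, and the sketch assumes each job lists the indices of the jobs it waits on.

```c
#include <assert.h>
#include <stddef.h>

#define NO_JOB (-1)
#define MAX_DEPS 2 /* assumed upper bound on prerequisites per job */

/* Hypothetical job record: each job lists the jobs it waits on. */
struct job {
    int deps[MAX_DEPS]; /* indices of prerequisite jobs, or NO_JOB */
};

/* Fill dependent[i] with the single job that waits on job i, or NO_JOB.
 * This is one pass over all jobs, so O(N) in total. */
static void
precompute_dependents(const struct job *jobs, size_t count, int *dependent)
{
    for (size_t i = 0; i < count; ++i)
        dependent[i] = NO_JOB;

    for (size_t j = 0; j < count; ++j) {
        for (size_t d = 0; d < MAX_DEPS; ++d) {
            int dep = jobs[j].deps[d];

            if (dep == NO_JOB)
                continue;

            assert(dep >= 0 && (size_t) dep < count);

            /* The single-dependent property: nobody claimed this slot yet */
            assert(dependent[dep] == NO_JOB);
            dependent[dep] = (int) j;
        }
    }
}
```

With the array filled in, finding the job unblocked by job i is a read of dependent[i] rather than a scan over every job.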
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Now that the borrow mechanism has totally replaced it, this code is
unused.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The whole purpose of the transient memory model is to make subdivision
stupidly easy, so let's handle that.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The batch now temporarily possesses the transient buffer, so it'll need
to remember that to free it later.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We use a fixed size slab if we can, otherwise we create a dedicated
("oversized") BO and add that to the job. In the latter case we'll get
reference counting for free so we can forget about this corner case for
the rest of the series.
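A minimal sketch of that decision, under stated assumptions: `SLAB_SIZE`, `struct transient_pool`, `struct transient_alloc` and `allocate_transient` are hypothetical names, and moving on to a fresh slab when the current one is full is a simplification chosen for the example. Creating the dedicated BO is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLAB_SIZE 4096 /* hypothetical fixed slab size in bytes */

/* Hypothetical bookkeeping for the slab currently being filled. */
struct transient_pool {
    unsigned slab_index; /* which slab we are filling */
    size_t offset;       /* bytes already handed out of that slab */
};

struct transient_alloc {
    bool oversized;      /* true: caller must create a dedicated BO */
    unsigned slab_index; /* valid only when !oversized */
    size_t offset;       /* valid only when !oversized */
};

static struct transient_alloc
allocate_transient(struct transient_pool *pool, size_t size)
{
    struct transient_alloc out = { false, 0, 0 };

    assert(size > 0);

    /* Too big for any slab: fall back to the "oversized" path */
    if (size > SLAB_SIZE) {
        out.oversized = true;
        return out;
    }

    /* Does not fit in what is left of this slab: start the next one */
    if (pool->offset + size > SLAB_SIZE) {
        pool->slab_index++;
        pool->offset = 0;
    }

    out.slab_index = pool->slab_index;
    out.offset = pool->offset;
    pool->offset += size;
    return out;
}
```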
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We would like transient allocations to occur on the screen (borrowed by
the batch) rather than on the context. Add fields to track this.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The latter upload is correct, but the former upload is unassociated with
any particular FBO and therefore becomes orphaned. We do have to upload
at draw-time at the latest, if we haven't by then.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Zero-sized allocations will fail with an unhelpful errno from the
kernel; check size explicitly in userspace before it gets that far.
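Roughly, the check looks like the sketch below. The names `checked_allocate` and `fake_kernel_alloc` are hypothetical, the stub stands in for the real kernel call, and returning -EINVAL is an assumption made for the example (the actual patch may well assert instead).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel allocation call. */
static int
fake_kernel_alloc(size_t size)
{
    /* By the time we get here, userspace has already rejected size 0 */
    assert(size > 0);
    return 0;
}

/* Reject zero-sized requests up front with an error we choose ourselves,
 * instead of forwarding them and getting an opaque errno back. */
static int
checked_allocate(size_t size, int (*kernel_alloc)(size_t))
{
    if (size == 0)
        return -EINVAL;

    return kernel_alloc(size);
}
```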
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Don't rely on them being preinitialized to zero; this can cause junk to
appear on the wire.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
A bunch of these are from asserts not being compiled in 32-bit mode
(once Erik's ASSERTABLE stuff is merged, we'll want to switch).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Until now we have always been emitting our scoreboard locks on the last thread
switch to improve parallelism. We did this by emitting our last thread switch
right before our tlb writes at the very end of the program, where we know that
we are outside control flow.
Unfortunately, this strategy is not valid when we have tlb color reads too, as
these will happen before this point in the program and can happen inside
control flow.
To fix this we always emit a thread switch before the first tlb load and if we
see additional thread switches after that point, we change the strategy to lock
on the first thread switch.
v2: change the solution so it is expected to work in more scenarios (Eric).
Reviewed-by: Eric Anholt <eric@anholt.net>
Kind of a funky corner case that does not (as far as I know) apply to
organic shaders from GLES but does pop up in generated shaders from the
fixed-function desktop pipeline.
Fixes: bb483a9166 ("panfrost: Clamp point size")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The screen header includes the common xml, and otherwise we might race
to build before it's done.
Fixes: e03259974e ("freedreno: Generate headers from xml files")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Fixes: 66411521ea ("etnaviv: combine translate_ts_sampler_format/translate_msaa_format")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Only Z24S8 is properly supported right now, so let's be careful. Fixes a
number of issues relating to improper Z/S handling. The most obvious is
depth buffers with incorrect strides, which manifests in truly bizarre
ways and can happen commonly with FBOs.
Fixes WebGL (Aquarium runs, etc).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We were failing to flag the program dirty when it changed. Also, we
were unnecessarily setting key->input_vertices for SINGLE_PATCH mode,
which would reduce program cache hits. Only set it if needed.
Right now, all keys have two things in common: a program string ID and a
sampler_prog_key_data. I'd like to add another thing or two and need a
place to put it. This commit adds a new brw_base_prog_key struct which
contains those two common bits.
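The layout can be sketched as below. Only the name brw_base_prog_key and its two contents come from this message; the field names, the placeholder sampler struct, the example per-stage key and the helper are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for the real sampler key data; contents are assumed. */
struct sampler_prog_key_data {
    unsigned swizzles[4];
};

/* The two bits every key has in common. */
struct brw_base_prog_key {
    unsigned program_string_id;
    struct sampler_prog_key_data tex;
};

/* Hypothetical per-stage key embedding the base as its first member. */
struct example_vs_prog_key {
    struct brw_base_prog_key base;
    unsigned stage_specific_bits;
};

/* Keeping the base first lets stage-agnostic code work on any key. */
_Static_assert(offsetof(struct example_vs_prog_key, base) == 0,
               "base key must be the first member");

/* Stage-agnostic helper: only needs the common part of a key. */
static unsigned
prog_key_string_id(const struct brw_base_prog_key *key)
{
    assert(key != NULL);
    return key->program_string_id;
}
```

Anything added to the base struct later is then available to every stage's key without touching each one.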
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It's not clear that the hardware really has a maximum, which confuses
dEQP; clamp to whatever we report as our maximum.
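As a sketch, with `clamp_point_size` and its parameters being hypothetical names rather than the driver's own:

```c
#include <assert.h>

/* Clamp a requested point size to the maximum the driver advertises. */
static float
clamp_point_size(float size, float max_reported)
{
    assert(max_reported > 0.0f);
    return size > max_reported ? max_reported : size;
}
```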
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
In preparation for a Panfrost-based non-Gallium driver (maybe
Vulkan...?), hoist everything except for the Gallium driver into a
shared src/panfrost. Practically, that means the compilers, the headers,
and pandecode.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The reason for doing this is two-fold:
1. These passes are likely to be shared with the Bifrost compiler
Therefore, we don't want to restrict them to Midgard
2. The coding style is different (NIR-style vs Panfrost-style)
The NIR passes are candidates for moving upstream into
compiler/nir, so don't block that off for stylistic reasons
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Just use the #define instead.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
PIPE_CAP_SM3 has always been an odd one out of all our caps. While most
other caps are fine-grained and single-purpose, this cap encodes several
features in one. And since OpenGL cares more about single features, it'd
be nice to get rid of this one.
As it turns out, this is now relatively simple. We only really care about
three features using this cap, and those already got their own caps. So
we can remove it, and make sure all current drivers just give the same
response to all of them.
The only place we *really* care about SM3 is in nine, and there we can
instead just re-construct the information based on the finer-grained
caps. This keeps DX9 semantics from needlessly leaking into all of the
drivers, most of which don't care a whole lot about DX9 specifically.
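The reconstruction amounts to something like the sketch below. The enum, struct and function names are hypothetical stand-ins rather than the real gallium or nine identifiers; the three features are vertex-shader saturate, fragment-shader derivatives and texture LOD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three fine-grained caps. */
enum fine_cap {
    CAP_VERTEX_SHADER_SATURATE,
    CAP_FRAGMENT_SHADER_DERIVATIVES,
    CAP_FRAGMENT_SHADER_TEXTURE_LOD,
    CAP_COUNT
};

/* Hypothetical record of what a driver reports for each cap. */
struct screen_caps {
    bool supported[CAP_COUNT];
};

/* SM3 is treated as available only if all three features are. */
static bool
screen_supports_sm3(const struct screen_caps *caps)
{
    assert(caps != NULL);

    for (int i = 0; i < CAP_COUNT; ++i) {
        if (!caps->supported[i])
            return false;
    }

    return true;
}
```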
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Shader Model 3.0 is a big promise to make to the state-tracker, and
for instance mobile hardware might support vertex-shader saturate but
not some of the other features of SM3. So let's give this its own cap
for simplicity.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Shader Model 3.0 is a big promise to make to the state-tracker, and
for instance mobile hardware might support fragment-shader derivatives
but not some of the other features of SM3. So let's give this its own
cap for simplicity.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Shader Model 3.0 is a big promise to make to the state-tracker, and
for instance mobile hardware might support texture lod but not some
of the other features of SM3. So let's give this its own cap for
simplicity.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>