We also need to render to the front buffer of a temporary X pixmap;
this is the case when OpenGL is used as the video output for VA-API.
The basic implementation is to pass the pixmap ID to the X server,
which returns a dma-buf fd; we then get the buffer object through
this dma-buf fd.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
When the drawable size changes, a PresentConfigureNotify event is
emitted; handle the event by re-allocating the buffer at the new size.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
This clears the presentation area not covered by video content.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Request drawable content in the pixmap by calling DRI3 PresentPixmap,
and handle the PresentIdleNotify event.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
This implements DRI3 PixmapFromBuffer. Create a buffer object,
associate it with a dma-buf fd, and then pass this fd with a pixmap
ID to the X server to create the pixmap object. Also add a function
for waiting on events.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Also add a placeholder for Present event handling.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Put the required functions into place for the implementation, create
the screen with the device fd returned from the X server, and fall
back to DRI2 under certain conditions.
v2: -organize the error out path (Axel)
-squash previous patch 1 and 2 into one (Emil)
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Not sure how we didn't hit this already, but since we want fdiv
converted into mul + rcp, we should set this.
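As a rough illustration of the lowering mentioned above (not the
driver's code), a float division can be rewritten as a multiply by a
reciprocal. The helper names rcp() and lowered_fdiv() are hypothetical,
and in general the result may differ from a true division in the last
bit.

```c
/* Hypothetical stand-in for a hardware reciprocal instruction. */
static float rcp(float x)
{
   return 1.0f / x;
}

/* fdiv(a, b) rewritten as mul(a, rcp(b)). */
static float lowered_fdiv(float a, float b)
{
   return a * rcp(b);
}
```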
Signed-off-by: Rob Clark <robclark@freedesktop.org>
In the glsl->tgsi path, this already gets translated to VAR8, which
matches up with rasterizer->sprite_coord_enable.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
When we get NIR directly from the state tracker (vs using tgsi_to_nir)
we need to realize this and skip some TGSI-specific hacks.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
INTERP is defined (by me) to have an INPUT source. However the
state tracker does not always obey this. This happens due to varying
packing logic introducing additional mov's which can't always be undone.
Instead of just giving up, we instead try harder to find the original
input. This won't always be possible, for example with indirect
accesses. There's not much we can (easily) do about that though.
This fixes the remaining interpolateAt* failures in dEQP:
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at*
some of which were asserting due to INTERP_* being passed a non-input.
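The "try harder" idea above can be sketched as walking back through a
chain of movs until an input is reached. This is a simplified model
only: the op enum, struct value and find_input() are hypothetical names
and do not mirror the actual IR; anything that is not a plain mov of an
input (for example an indirect access) yields NULL.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified IR. */
enum op { OP_INPUT, OP_MOV, OP_OTHER };

struct value {
   enum op op;
   const struct value *src; /* only meaningful for OP_MOV */
};

/* Follow movs back to the original input, or return NULL if the
 * chain ends in something other than an input. */
static const struct value *find_input(const struct value *v)
{
   while (v && v->op == OP_MOV)
      v = v->src;
   return (v && v->op == OP_INPUT) ? v : NULL;
}
```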
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This fixes
dEQP-GLES31.functional.draw_indirect.draw_elements_indirect.*.default_attribute
These tests were causing a const vbo to be set up, and were small enough
draws that the logic was trying to go via the push path (which emits
data directly into the cmd stream rather than uploading a user vbo).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This was handled in handleTEX(), however the way the logic works, those
extra arguments aren't added on by then, so it did nothing. Instead we
must duplicate that bit here. GK110 appears to complain about
MISALIGNED_GPR, however it's reasonable to believe that GK104 has the
same requirements.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95403
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cull distances are just a special case of clip distances as far as the
hardware is concerned. Make sure that the relevant "planes" are enabled,
and flip the clip mode to cull for those.
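A minimal sketch of the two masks described above, assuming one bit per
distance "plane" and assuming the cull distances are laid out directly
after the clip distances. The struct and function names are
hypothetical and do not correspond to real hardware state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state: one bit per distance "plane". */
struct clip_cull_state {
   uint8_t enable;    /* planes the hardware should evaluate */
   uint8_t cull_mode; /* planes that cull instead of clip */
};

/* Assumes at most 8 distances in total, clip distances first. */
static struct clip_cull_state
clip_cull_masks(unsigned num_clip, unsigned num_cull)
{
   struct clip_cull_state s;

   assert(num_clip + num_cull <= 8);
   s.enable = (uint8_t)((1u << (num_clip + num_cull)) - 1);
   s.cull_mode = (uint8_t)(((1u << num_cull) - 1) << num_clip);
   return s;
}
```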
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: add enables on nvc0, add nv50 support]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
This lets us safely enable or disable the extension as needed
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is purely cosmetic, making it easier to assign blame for space used
in the binary in case somebody else makes a similar cleanup effort in the
future.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This saves some space and avoids the need for relocations.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
So that it gets compiled and emitted only once, saving space in the final
binary.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
So that it gets compiled and emitted only once, saving space in the final
binary.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
So that it gets compiled and emitted only once, saving space in the final
binary.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
v2: Reuse the macro for bind & delete.
Note that we may not be able to share the delete long-term as
pipe_compute_state contains members not in pipe_shader_state,
and we need to distinguish the pointer location if we add that
struct to the union.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This reduces the number of loop iterations for invalidating buffers
and images.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
This is a pretty rare situation, but it can happen.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
The SAMPLEMASK semantic should only return the bits covered by the
current invocation. However we were always retrieving the covmask, which
returns the covered samples of the whole pixel.
When not doing per-sample invocation, this is precisely what we want.
However when doing per-sample invocation, we have to select the
sampleid'th bit and only return that. Furthermore, this means that we
have to have a 1:1 correlation for invocations and samples.
This fixes most
dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.*
tests. A few failures remain due to disagreements about nr_samples==1
logic as well as what happens with MSAA x2 RTs when the shading fraction
is 0.5.
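The selection described above can be sketched as follows. This is an
illustration only: sample_mask_in() is a hypothetical helper, and
covmask stands for the per-pixel coverage mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Without per-sample invocation the whole-pixel coverage is returned;
 * with it, only the bit of the current sample is kept. */
static uint32_t
sample_mask_in(uint32_t covmask, unsigned sample_id, bool per_sample)
{
   assert(sample_id < 32);
   if (!per_sample)
      return covmask;
   return covmask & (1u << sample_id);
}
```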
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
The i965 driver has its own pass for fusing mul+add combinations that's
much smarter than what nir_opt_algebraic can do so we don't want to get the
nir_opt_algebraic one just because we didn't set lower_ffma.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The goal is to allow the pipe driver to request something other than
TGSI, but detect whether what it is getting is TGSI vs what it requested.
The pipe drivers will always have to support TGSI (and convert that into
whatever it is that they prefer), but in some cases we should be able to
skip the TGSI intermediate step (such as glsl->nir vs glsl->tgsi->nir).
I think pipe_compute_state should get similar treatment. Currently,
afaict, it has one user and one consumer, which has allowed it to be
sloppy wrt. supporting alternative IR's.
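One way to picture the idea above is a shader state tagged with the IR
it carries, so a driver can tell TGSI apart from what it requested. The
enum, struct and function below are hypothetical names for
illustration, not the actual gallium definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical IR tags; not the actual gallium enum. */
enum shader_ir {
   SHADER_IR_TGSI,
   SHADER_IR_NIR,
};

/* Hypothetical shader state: the tag says which pointer is valid. */
struct shader_state {
   enum shader_ir type;
   const void *tgsi_tokens; /* valid when type == SHADER_IR_TGSI */
   void *nir;               /* valid when type == SHADER_IR_NIR */
};

/* A NIR-preferring driver still has to accept TGSI and convert it. */
static bool needs_tgsi_conversion(const struct shader_state *state)
{
   return state->type == SHADER_IR_TGSI;
}
```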
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The use of transfer_inline_write() in TexSubImage path (see fb9fe352ea)
exposed a bug for "layer_first" resources (ie. a4xx) not setting correct
layer_stride.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Use GALLIVM_DEBUG=dumpbc for dumping of modules as bitcode.
Instead of a fixed llvmpipe.bc name, use ir_<modulename>.bc so multiple
modules can be dumped (although this might still overwrite previous
modules, particularly since the modules from draw tend to always have
the same name).
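The naming scheme can be sketched like this; bitcode_filename() is a
hypothetical helper, not the function used in gallivm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build "ir_<modulename>.bc" into buf. Returns the snprintf() result,
 * i.e. the length the full name needs, excluding the terminator. */
static int
bitcode_filename(char *buf, size_t size, const char *module_name)
{
   assert(buf && size > 0 && module_name);
   return snprintf(buf, size, "ir_%s.bc", module_name);
}
```

Two modules with the same name map to the same file, which is why a
later dump can still overwrite an earlier one.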
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch fixes this build error.
CXX rasterizer/memory/libswrAVX_la-ClearTile.lo
In file included from rasterizer/memory/ClearTile.cpp:34:0:
./rasterizer/memory/Convert.h: In function ‘uint16_t Convert32To16Float(float)’:
./rasterizer/memory/Convert.h:170:9: error: ‘__builtin_isnan’ is not a member of ‘std’
if (std::isnan(val))
^
./rasterizer/memory/Convert.h:170:9: note: suggested alternative:
<built-in>: note: ‘__builtin_isnan’
./rasterizer/memory/Convert.h:176:14: error: ‘__builtin_isinf_sign’ is not a member of ‘std’
else if (std::isinf(val))
^
./rasterizer/memory/Convert.h:176:14: note: suggested alternative:
<built-in>: note: ‘__builtin_isinf_sign’
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95180
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
The calculated limit gave problems on SI as it was > 32 KiB
and the hardware LDS size on SI is only 32 KiB. It isn't
correct anyway when processing multiple patches in a threadgroup.
As we potentially have any number of patches such that the
used LDS is at most the hardware LDS size, and exact size
per patch is not known at compile time, this seems like
the only valid bound.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We index into these based on var->data.driver_location, which might have
gaps (ie. two inputs, one w/ drvloc 0 and other 2). This shows up in
(for example) 'bin/copyteximage 1D', but was only noticed recently due
to additional asserts.
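To illustrate the sizing problem (a sketch, not the driver code): when
an array is indexed by driver_location, it needs room for the highest
location plus one, not just one slot per variable. slots_needed() is a
hypothetical helper.

```c
#include <stddef.h>

/* Number of array slots needed so that every driver_location is a
 * valid index, even when the locations have gaps. */
static unsigned
slots_needed(const unsigned *driver_locations, size_t count)
{
   unsigned slots = 0;

   for (size_t i = 0; i < count; i++) {
      if (driver_locations[i] + 1 > slots)
         slots = driver_locations[i] + 1;
   }
   return slots;
}
```

For the example above (two inputs at locations 0 and 2) this gives 3
slots rather than 2.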
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Compute support seems to be pretty stable now, and according to piglit
it doesn't seem to break 3D state.
As a side effect, this will expose ARB_compute_shader on GK110/GK208.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>