This restores the previous behaviour before YCBCR landed. For D+S
formats, it returns the depth format.
This fixes an assertion with Thrones of Britannia.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110540
Fixes: 66507cc656 ("radv: Add single plane image views & meta operations")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Since 09f1de97a7 "anv,i965: Lower away image derefs in the driver"
the backend compiler is not expected to handle any derefs, so let's
assert on it.
This helps identifying problems when a deref is not lowered and
"leaks" into the backend compiler.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The MASK macro is used in the RANGE macro, and it should
return the pre-bitset word mask for the (b) value.
i.e.
BITSET_MASK(0) should be undefined since it's meaningless.
BITSET_MASK(31) should give 0x7fffffff
BITSET_MASK(32) should give 0xffffffff
BITSET_MASK(33) should give 0x00000001
BITSET_MASK(64) should give 0xffffffff
However then BITSET_RANGE ends up broken for cases where
it's (b) value is the 0,32,64 value as in that case the lower
mask would be 0 not 0xffffffff.
This fixes the unit tests that I've added, and my code that
uses bitsets.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: bb38cadb1c "More GLSL code"
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
The last test here currently fails as there is a bug in bitset.h
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This has a couple of hardcoded vec4 limits in it, change them
to the proper sizing to avoid future issues.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This is a port of Danylo's eca4a6548d
which fixed the hang on i965. It fixes GPU hangs in his new Piglit
test, arb_blend_func_extended-dual-src-blending-discard-without-src1.
I avoided my own review feedback here, and decided to simply adjust
3DSTATE_PS_BLEND rather than BLEND_STATE_ENTRY[0]. It has never been
clear to me which the hardware uses in every case. However, whacking
the enable in 3DSTATE_PS_BLEND seems to be sufficient to fix the hang,
and that packet is already dynamic, so it's easy to handle. I'd rather
avoid making BLEND_STATE_ENTRY[0] dynamic unless I have to.
It is an input but it comes in as part of the shader payload and doesn't
count towards the limits.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
During a rebase, it seems I accidentally broke the contents-menu,
leading to a duplicate link to freedesktop.org. This was obviously not
intended. Let's fix this.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 7eee13c467 ("docs: use dl/dd instead of blockquote for
freedesktop link")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Support nir_op_ftrunc by turning it into a mov with a round to integer
output modifier.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
There's no need to have a whole build just for that flag, we can add it
to any build.
v2: Add a note about why we put glvnd where we did (by anholt).
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Let's just drop it.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Merge the following into `meson-main`/`meson-loader-classic-dri`/
`meson-gallium-swr`:
- meson-vulkan
- meson-gallium-drivers-other
- meson-gallium-st-other
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
[ Michel Dänzer ]
* Rebase and fix up commit log.
* Don't set VULKAN_DRIVERS in meson-loader-classic-dri.
* Remove extraneous whitespace.
* Squash in follow-up fixes.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
[ anholt]
* Add a note why nine and swrast landed where they did.
* Switch from s/meson-vulkan/meson-main/ to
s/meson-loader-classic-dri/meson-main/ which I think was the original
intent
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (anholt changes)
Acked-by: Dylan Baker <dylan@pnwbakers.com>
- Add GBM_BO_IMPORT_FD_MODIFIER to documentation of supported foreign
object types
- Add newline before documentation block
- Improve language
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Stone <daniels@collabora.com>
- make sure compute shader derivatives are exposed for all extensions
- unify duplicated code
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
It's done by:
- decrease the number of frames in flight by 1
- flush before throttling in SwapBuffers
(instead of wait-then-flush, do flush-then-wait)
The improvement is apparent with Unigine Heaven.
Previously:
draw frame 2
wait frame 0
flush frame 2
present frame 2
The input lag is 2 frames.
Now:
draw frame 2
flush frame 2
wait frame 1
present frame 2
The input lag is 1 frame. Flushing is done before waiting, because
otherwise the device would be idle after waiting.
Nine is affected because it also uses the pipe cap.
This roughly mirrors what we get from autotools. There's a few
differences, though:
1. The "exec_prefix" output has been dropped. Meson doesn't support
this, so it makes no sense here.
2. The "llvm-config" output has been dropped. Meson abstracts dependency
discovery a bit more than our autotools build-system does, so it's
not easy to get this information as-is.
3. HUD extra stats, SWR archs, Shared/Static libs and CFLAGS / CXXFLAGS /
LDFLAGS has been dropped. These can be inspected by "meson configure".
4. How we set defines works quite differently in our Meson build-system,
and the result isn't quite the same. In particular, the DEFINES output
has been dropped, to avoid having to refactor the code too much.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109326
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Variables are cheap, and there's little reason for the dri and gallium
drivers to work on the same variable for the driver list. So let's split
these in two separate lists instead.
This makes it easier to inspect these after-the fact, for instance
for generating a summary of build-settings.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
This way we can mark the dri_drivers and dri_link arrays as temporary,
as all knowledge about them are contained in a single build-file with
clearly visible limited life-span.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
We just need to do a sequence of commands to flush the cache.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Wire up support to sample from the fb (and force GMEM rendering when we
have fb reads). The existing GLSL IR lowering for blend_equation_advanced
does the rest.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Lower load_output to txf_ms_fb and add support for the new texture fetch
instruction.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Needed for sampling from tile buffer (GMEM).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Apparently we never hit this path. Or at least haven't for a rather
long time. But in either case (load_deref or load_frag_coord), we can
just directly use the intrinsic's ssa dest. So stop passing the
nir_variable (which would be NULL in the load_frag_coord case) around
and instead just use &intr->dest.ssa.
(This ofc means we need to setup the cursor to insert *after* the
instruction, which seems to be another bug of the original
implementation.)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
The extra comma at the end was annoying me.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
And a comment.. since we are mixing units of bytes/dwords/vec4,
hopefully this will avoid some unit confusion.
Signed-off-by: Rob Clark <robdclark@chromium.org>
It isn't quite as simple as not running the pass, since with packed
varyings we get load_ubo for block==0 (ie. the "real" uniforms). So
instead run the pass normally but decline to lower anything in
block > 0
Signed-off-by: Rob Clark <robdclark@chromium.org>
Since we emit UBO regions INDIRECTly (ie. not copied into cmdstream but
emit by EXT_SRC_ADDR) we need to keep them 4*vec4 aligned. Which the
code already mostly did, except for aligning the first UBO region itself
(ie. the one after block==0 which is the "real" uniforms).
Fixes: 893425a607 freedreno/ir3: Push UBOs to constant file
Fixes: 3c8779af32 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS
Signed-off-by: Rob Clark <robdclark@chromium.org>
Otherwise we zero out the state again, but all the UBO loads that we
could lower are already lowered. End result is that we didn't emit the
uniforms for lowered UBO access in any case where multiple shader
variants are used.
Fixes: 893425a607 freedreno/ir3: Push UBOs to constant file
Fixes: 3c8779af32 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS
Signed-off-by: Rob Clark <robdclark@chromium.org>
Keen on having other people contribute.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
This is useful to normalize the numbers written into the output file
as those number are accumulated over a period of time and number of
frames.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
v2: switch to VkBase{In,Out}Structure
v3: Add timestamps at begin/end of primary command buffers to estimate
gpu time spent per submission (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
This significantly reworks how numbers displayed are computed. We
accumulate operations written into command buffers and add those to
the device when submitted to a queue. These collected values are then
used to compute per frame overlay data.
We also accumulate the data over the sampling fps period to produce
numbers for that period of time.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This will be used to copy chains of structures so that we can alterate
some of them.
v2: Drop vk_util.h include (Eric)
Use VkBaseInStructure directly (Eric)
v3: Drop --platforms= param to generator script, instead produce a
file with #ifdef based what platforms are compiled.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>