Seems there is no sense in blitting 0-sized sources
or destinations.
Additionaly it may cause segfaults for i965.
v2: Function call replaced with inline check
v3: Added check to avoid devision by zero (L. Landwerlin)
v4: Added simillar check for Iris (L. Landwerlin)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110239
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
CXXLD libxatracker.la
/usr/bin/ld: ../../../../src/gallium/auxiliary/.libs/libgallium.a(tgsi_to_nir.o): in function `ttn_finalize_nir':
src/gallium/auxiliary/nir/tgsi_to_nir.c:2111: undefined reference to `gl_nir_lower_samplers_as_deref'
/usr/bin/ld: src/gallium/auxiliary/nir/tgsi_to_nir.c:2113: undefined reference to `gl_nir_lower_samplers'
Fixes: 9a834447d6 ("tgsi_to_nir: Produce optimized NIR for a given pipe_screen.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109929
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Removed a few unused variables and iris_getparam_boolean().
Kept 'name' around since there's a commented debug that make use of it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When we add new feature checks on the host side that is used to
enable a cap conditionally that was enabled unconditionally before
we might end up with a feature regression when a new mesa version
is used with an old virglrenderer version that doesn't check for
that cap.
To work around this problem add a version id to the caps that corresponds
to the features that are actually checked on the host and check that
version too when enabling the cap.
Fixes: 2ee197d6e8
virgl: Enable mixed color FBO attachemnets only when the host supports it
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Pohsien Wang <pwang@chromium.org>
h/w specification requires this bit to be always set.
See Mesa commit 5eb173304b.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is no longer true since PIPE_CAP_PACKED_UNIFORMS was enabled.
Fixes: 3c8779af32 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS
Signed-off-by: Rob Clark <robdclark@gmail.com>
For depth and stencil blits, we always want the main mask to be Z, and
the secondary pass mask to be S. If asked to blit Z+S to S, we should
handle the blit in the second pass which properly gets the stencil
resources.
Before, we were trying to handle S as the main mask, and accidentally
blitting a Z source to a S destination, which doesn't work out well.
Fixes Piglit's "framebuffer-blit-levels {draw,read} stencil" tests.
I neglected to fill out this driver function, causing us to advertise
0 modifiers. Now we advertise the various tilings and let the driver
pick them. I've verified that X tiling works with Weston (by hacking
the list to skip Y tiling).
Y+CCS doesn't work yet because it's multiplane and the Gallium dri
state tracker isn't really prepared for that. Leave it off for now.
The code to handle image unit indirect was missing
Fixes piglit tests/spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-mixed-const-non-const-uniform-index.shader_test
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This fixes the vertex id fetch in the non-llvm drawing paths.
This vertex id in elt mode comes from the elts not just a linear
value.
Note we don't bad basevertex in the elts case as it's already included
in the elts by the looks of it (at least tests fail if I add it)
Fixes piglit end-primitive tests and some others.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
We have a rather big constant file and it seems that the best way to
use it is to upload all UBOs and lower UBO access the load_uniform.
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
This commit turns on the gallium cap and adds a pass to lower the
load_ubo intrinsics for block 0 back to load_uniform intrinsics and
adjust the backend where the cap switches units from vec4s to dwords.
As we stop using ir3_glsl_type_size() for uniform layout, this also
corrects an issue where we would allocate a vec4 slot for samplers in
uniforms, fixing:
dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_fragment
dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_vertex
dEQP-GLES3.functional.shaders.struct.uniform.sampler_nested_fragment
dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex
dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
VCN supports this profile as well as UVD, so add it
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: <mesa-stable@lists.freedesktop.org>
When the host is running on softpipe/llvmpipe the maximum number of
samples for multisampling is 1. GL 3.0 requires at least 4 samples, and
softpipe/llvmpipe get around this by enabling PIPE_CAP_FAKE_SW_MSAA.
This patch mimics softpipe/llvmpipe behavior in virgl by enabling the
same PIPE_CAP_FAKE_SW_MSAA workaround when the max sample count reported
by the host is 1. This change allows virgl on a softpipe/llvmpipe host
to advertise support for GL 3.0 and beyond.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
This patch refactors a substantial amount of code in preparation for
mipmaps. In particular, we know have a correct slice abstraction based
on offsets; cpu/gpu are no longer arbitrary pointers. We additionally
shuffle around other code to accompany these changes and cleanup how
tiled textures are handled, while drawing some attention to the blit
code.
Mipmaps are still disabled at this point, as autogeneration is not yet
implemented; enabling as-is would cause regressions.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
In fact, the native "fpow" instruction only does half of it; more work
is needed for the actual instruction. For now, just lower.
Fixes: 1ea42894c ("panfrost/midgard: Implement fpow")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Although this is not functional (and the command stream side is not
aiming for ES3 right now), this is enough to run dEQP-GLES3 shader
tests with the version override directive; this is useful, as some ES3
shader feature can occur in ES2 class shaders due to lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
On Midgard, float ops support standard source modifiers (abs/neg) and
destination modifiers (sat/pos/round). Integer ops do not support these,
however. To cope, we use native NIR source modifiers for floats, but
lower them away to iabs/ineg for integers, implementing those ops
simultaneously to avoid regressions.
Fixes the integer tests in
dEQP-GLES2.functional.shaders.operator.unary_operator.minus.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Some of these are not yet fully functional due to related bugs, but this
the correct op mapping. The native ball/bany opcodes act on vec4's
unconditionally. That said, both ball and bany have the nice property
that duplicating an argument does not affect their output, so the
default "hanging swizzles" allow us to implement 2/3-component opcodes
correctly, implicitly lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Whereas a normal fcsel acts on a boolean input in r31.w, the fcsel_i
variant acts on an integer input in r31.w, which can be preloaded with
an instruction like imov (with the appropriate negate flag on the
source).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This preliminary implementation should handle some basic cases. Future
work should scissor the FRAGMENT job as well for efficiency.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Our viewport code hardcoded a number of wrong assumptions, which sort of
sometimes worked but was definitely wrong (and broke most of dEQP). This
corrects the logic, accounting for flipped-Y framebuffers, which
fixes... most of dEQP.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This gets the basevertex from the draw depending on whether
it's an indexed or non-indexed draw.
We still fail a transform feedback test for vertex id, as
the vertex id actually an index id, and isn't getting translated
properly to a vertex id, suggestions on how/where to fix that welcome.
Reviewed-by: Brian Paul <brianp@vmware.com>
In 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") we
started counting number of samplers based on the uniform vars instead
of number of cat5 instructions. We used the number of samplers to
determine whether to enable derivatives, but when we only use
derivatives and no samplers, that now breaks. Track whether we need
derivatives explicitly and use that to enable the state.
Fixes: 1088b788 ("freedreno/ir3: find # of samplers from uniform vars")
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
iris is thread safe, enable csmt for a ~5% performace boost.
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
From "Alpha Coverage" section of SKL PRM Volume 7:
"If Pixel Shader outputs oMask, AlphaToCoverage is disabled in
hardware, regardless of the state setting for this feature."
From OpenGL spec 4.6, "15.2 Shader Execution":
"The built-in integer array gl_SampleMask can be used to change
the sample coverage for a fragment from within the shader."
From OpenGL spec 4.6, "17.3.1 Alpha To Coverage":
"If SAMPLE_ALPHA_TO_COVERAGE is enabled, a temporary coverage value
is generated where each bit is determined by the alpha value at the
corresponding sample location. The temporary coverage value is then
ANDed with the fragment coverage value to generate a new fragment
coverage value."
Similar wording could be found in Vulkan spec 1.1.100
"25.6. Multisample Coverage"
Thus we need to compute alpha to coverage dithering manually in shader
and replace sample mask store with the bitwise-AND of sample mask and
alpha to coverage dithering.
The following formula is used to compute final sample mask:
m = int(16.0 * clamp(src0_alpha, 0.0, 1.0))
dither_mask = 0x1111 * ((0xfea80 >> (m & ~3)) & 0xf) |
0x0808 * (m & 2) | 0x0100 * (m & 1)
sample_mask = sample_mask & dither_mask
Credits to Francisco Jerez <currojerez@riseup.net> for creating it.
It gives a number of ones proportional to the alpha for 2, 4, 8 or 16
least significant bits of the result.
GEN6 hardware does not have issue with simultaneous usage of sample mask
and alpha to coverage however due to the wrong sending order of oMask
and src0_alpha it is still affected by it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109743
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
If the geom shader emits a point size we failed to find it here,
use the correct API to look it up.
Fixes:
tests/spec/glsl-1.50/execution/geometry/point-size-out.shader_test
Reviewed-by: Brian Paul <brianp@vmware.com>
With indirect rendering it's fine to set the instance count
parameter to 0, and expect the rendering to be ignored.
Fixes assert in KHR-GLES31.core.compute_shader.pipeline-gen-draw-commands
on softpipe
v2: return earlier before changing fpstate
Reviewed-by: Brian Paul <brianp@vmware.com>
The wait here is unnecessary since we got a pool of back buffers,
and the wait for swap buffer will happen before the present pixmap,
at the same time the previous back buffer will be put back to pool
for reuse after the check for PresentIdleNotify event
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>