Previously, when q.subroutine was set to 1, a new subroutine
declaration was added to the AST, while 0 meant a subroutine
definition had been detected by the parser.
Thus, setting the q.subroutine flag in both situations is
obviously wrong because a new type identifier is added instead
of trying to match the declaration. To fix it up, introduce
ast_type_qualifier::is_subroutine_decl() to differentiate
declarations and definitions easily.
This fixes a regression with:
arb_shader_subroutine/compiler/direct-call.vert
Cc: Mark Janes <mark.a.janes@intel.com>
Fixes: be8aa76afd ("glsl: remove unecessary flags.q.subroutine_def")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100026
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Depending on the generated Makefile means that all generated sources are
recreated after ./configure.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
While an input attachment may only take on one of those two layouts,
other depth/stencil attachments that use the same image may have
HiZ-enabled layouts. Improves the average frame rate on a release
candidate of a proprietary Vulkan benchmark by 9.94% over 3 runs on my
SKL GT4.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This is needed to transition input attachments.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We'll loop through this array when performing automatic layout
transitions.
v2: Adjust formatting of an assignment (Jason Ekstrand)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Don't allocate space for resolve attachments if the subpass has none.
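The sizing rule can be sketched as follows. This is a minimal illustration, not the driver's code: demo_subpass_slots() and its parameters are hypothetical names. The only outside fact it relies on is that a Vulkan subpass either has no resolve attachments or has one per color attachment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of attachment slots a subpass needs.
 * Resolve slots are counted only when the subpass actually has resolve
 * attachments; in that case there is one per color attachment. */
size_t demo_subpass_slots(uint32_t input_count, uint32_t color_count,
                          bool has_resolve)
{
   size_t slots = (size_t)input_count + color_count;

   if (has_resolve)
      slots += color_count;

   return slots;
}
```

With 2 input and 4 color attachments, the helper returns 6 slots without resolves and 10 with them.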
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We will be using the image layout. Store the full struct directly from
the user.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Due to recent commits, the sampler now bypasses the auxiliary HiZ buffer
when reading from a depth image subresource that is in the general
layout. Remove this unneeded resolve.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This will be used to sample a depth input attachment without having to
pass through the HiZ buffer.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
surf_usage is only useful to image views that may use HiZ buffers.
Storage image views don't use HiZ buffers.
v2: Update commit message and add an assertion.
Fixes: 055ff2ec52 ("anv: Replace anv_image_has_hiz() with ISL_AUX_USAGE_HIZ")
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Validate the inputs, verify that this image has a depth
buffer, use gen_device_info instead of
v2:
- Add parentheses (Jason Ekstrand)
- Make parameters const
- Use gen_device_info instead of gen
- Pass aspect to missed function in transition_depth_buffer
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This function supersedes layout_to_hiz_usage().
v2:
- Don't find the optimal buffer for layout transitions (Jason Ekstrand).
- Pass the devinfo instead of the gen (Jason Ekstrand)
- Update the function documentation.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The header of ralloc needs to be aligned, because the compiler assumes
that malloc returns will be aligned to 8/16 bytes depending on the
platform, leading to degraded performance or alignment faults with ralloc.
Fixes SIGBUS on Raspberry Pi at high optimization levels.
This patch is not perfect for MSVC, as maybe in the future the alignment
for the most demanding data type might change to more than 8.
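The idea can be sketched in portable C11 as follows. This is an illustration under stated assumptions, not ralloc's actual header or the attribute this patch uses: struct demo_header, demo_alloc(), demo_free() and demo_payload_is_aligned() are hypothetical names. The point is that once the header size is a multiple of the strictest fundamental alignment, the payload placed right after it keeps the alignment malloc provided.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator header; ralloc's real header has other fields. */
struct demo_header {
   /* Aligning the first member to max_align_t raises the alignment of
    * the whole struct, so sizeof(struct demo_header) is padded up to a
    * multiple of that alignment. */
   alignas(max_align_t) struct demo_header *parent;
   size_t size;
};

/* Allocate a payload of `size` bytes preceded by the header. */
void *demo_alloc(size_t size)
{
   struct demo_header *hdr = malloc(sizeof(*hdr) + size);

   if (hdr == NULL)
      return NULL;

   hdr->parent = NULL;
   hdr->size = size;
   return hdr + 1;   /* payload starts right after the padded header */
}

/* Free a payload returned by demo_alloc(). */
void demo_free(void *ptr)
{
   if (ptr != NULL)
      free((struct demo_header *)ptr - 1);
}

/* Returns 1 if a fresh payload is aligned like a malloc() result. */
int demo_payload_is_aligned(size_t size)
{
   void *ptr = demo_alloc(size);
   int ok;

   if (ptr == NULL)
      return 0;

   ok = ((uintptr_t)ptr % alignof(max_align_t)) == 0;
   demo_free(ptr);
   return ok;
}
```

Without the alignas, a header of, say, 12 bytes on a 32-bit platform would leave every payload misaligned for 8-byte types.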
v2: Commit message reword/typo fix, and add a bigger explanation in the
code (by anholt)
Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
A recent change to the st/mesa state update logic caused major
regressions in the swr validation code.
swr uses the same validation logic (swr_update_derived) for both draw
and Clear calls. The new st/mesa state update logic results in certain
state objects not being set/bound during Clear. This was causing null
pointer exceptions. Creating static dummy state objects allows setting
these pointers during Clear validation, without interfering with
relevant state validation.
Once fixed, the new logic also highlighted an error in dirty bit
checking for fragment shader and clip validation.
(The alternative is to have a simplified validation routine for Clear,
which we may do at some point.)
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
At least, the extension is exported (gallium capability
PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT is 1)
Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
During the first update of the hw_clear_state atoms, we may not yet
have a current rasterizer state object. So, svga->curr.rast may be
NULL and we crash.
Add a few null pointer checks to work around this. Note that these
are only needed in the state update functions which are called for
'clear' validation.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
The is_color_attachment variable is later read when handling two
separate error cases, where only one of the cases results in the
variable being initialized.
This can be avoided by giving the variable a safe default value.
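The pattern looks roughly like the sketch below. All names here (demo_get_attachment(), demo_query(), the error enum and the slot numbering) are hypothetical stand-ins rather than Mesa's functions; the sketch only shows an out-parameter that is written on one path and read on two error paths.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_error { DEMO_OK = 0, DEMO_INVALID_ENUM, DEMO_INVALID_OPERATION };

/* Hypothetical lookup. It writes *is_color only for indices in the
 * color range; for anything else the out-parameter is left untouched. */
static const char *demo_get_attachment(int index, bool *is_color)
{
   if (index >= 0 && index < 8) {
      *is_color = true;
      return (index < 4) ? "color" : NULL;   /* slots 4..7 are unbound */
   }

   return NULL;   /* unknown attachment: *is_color is not written */
}

enum demo_error demo_query(int index)
{
   /* Safe default: without it, the read below would be undefined for
    * indices outside the color range. */
   bool is_color = false;
   const char *att = demo_get_attachment(index, &is_color);

   if (att == NULL)
      return is_color ? DEMO_INVALID_OPERATION : DEMO_INVALID_ENUM;

   return DEMO_OK;
}
```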
Coverity-Id: 1398631
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This allows us to allocate surface states from the command buffer when
pushing descriptor sets rather than allocating them through a
descriptor set pool.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We will need this declaration closer for readability later.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In validate_DrawElements_common() we need to check for the
OES_geometry_shader extension to determine if we should fail if transform
feedback is unpaused. However, the current code reads
ctx->Extensions.OES_geometry_shader directly, which does not take the
context version into account. This means that if the context is GLES 3.0,
which makes OES_geometry_shader inapplicable, we would not validate the
draw properly. To fix it, let's replace the check with a call to
_mesa_has_OES_geometry_shader().
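The difference between the two checks can be sketched as follows. The struct and function names are hypothetical and much simpler than Mesa's generated helpers; the only outside fact used is that OES_geometry_shader requires OpenGL ES 3.1.

```c
#include <stdbool.h>

/* Hypothetical, simplified context: just what the check needs. */
struct demo_context {
   int gles_version;           /* 30 for GLES 3.0, 31 for GLES 3.1, ... */
   bool ext_geometry_shader;   /* raw "driver supports it" flag */
};

/* Version-aware check: the raw flag alone is not enough, because the
 * extension only applies to contexts of at least GLES 3.1. */
bool demo_has_geometry_shader(const struct demo_context *ctx)
{
   return ctx->ext_geometry_shader && ctx->gles_version >= 31;
}

/* Convenience wrapper so the check can be exercised with plain values. */
bool demo_gles_has_geometry_shader(int gles_version, bool driver_flag)
{
   struct demo_context ctx = { gles_version, driver_flag };

   return demo_has_geometry_shader(&ctx);
}
```

On a GLES 3.0 context the raw flag can still be set by the driver, but the version-aware check reports the extension as unavailable.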
Fixes the following dEQP tests on i965 with a GLES 3.0 context:
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_incomplete_primitive
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced_incomplete_primitive
dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements
dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements_incomplete_primitive
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
One less set of enums. Dropped the #defines from brw_defines.h and ran:
$ for file in *.cpp *.c *.h; do sed -i \
-e 's/BRW_SURFACEFORMAT_/ISL_FORMAT_/g' \
-e 's/ISL_FORMAT_ASTC_[A-Zxs0-9_]*/\U&/g' $file; \
done
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
If we don't have pipelined register access (e.g. Haswell before kernel
v4.2), then we can only implement EXT_transform_feedback by resetting the
SO offsets *between* batches. However, if we do have pipelined access to
the SO registers on gen7, we can simply emit an inline reset of the SO
registers without a full batch flush.
v2 [by Ken]: Simplify after recent kernel feature detection changes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
According to the PRM description of the Depth field:
"This field specifies the total number of levels for a volume texture
or the number of array elements allowed to be accessed starting at the
Minimum Array Element for arrayed surfaces"
However, ISL defines array_len as the length of the range
[base_array_layer, base_array_layer + array_len), so it already represents
a value relative to the base array layer like the hardware expects.
v2: Depth is defined as a U11-1 field, so subtract 1 from
the actual value (Jason)
This fixes a number of new CTS tests that would crash otherwise:
dEQP-VK.pipeline.render_to_image.*
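As a worked example of the encoding described above: a U11-1 field is an 11-bit unsigned field that stores the value minus one. The helper below is a hypothetical sketch, not ISL's code. It takes array_len as already relative to the base array layer and only applies the minus-one bias.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DEPTH_FIELD_BITS 11

/* Hypothetical encoder for a U11-1 field: valid inputs are 1..2048 and
 * the stored value is array_len - 1. No base layer is added, since
 * array_len is already counted from the base array layer. */
uint32_t demo_encode_depth(uint32_t array_len)
{
   assert(array_len >= 1 && array_len <= (1u << DEMO_DEPTH_FIELD_BITS));
   return array_len - 1;
}
```

A view covering 6 layers is therefore programmed as 5, whatever its base layer is.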
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This commit improves the message by telling users that they could probably
enable DRI3. More importantly, it includes a little heuristic to check
to see if we're running on AMD or NVIDIA's proprietary X11 drivers and,
if we are, doesn't emit the warning. This way, users with both a discrete
card and Intel graphics don't get the warning when they're just running
on the discrete card.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715
Co-authored-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Rene Lindsay <rjklindsay@hotmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
The algorithms used by this pass, especially for division, are heavily
based on the work Ian Romanick did for the similar int64 lowering pass
in the GLSL compiler.
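For readers unfamiliar with how wide division is lowered, here is a generic shift-and-subtract (restoring) division in plain C. It is only an illustration of the general technique: it is not the algorithm from this pass or from the GLSL pass, it operates on a native uint64_t rather than on pairs of 32-bit values, and demo_udiv64() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic restoring division: walk the numerator from its most
 * significant bit, shifting each bit into a running remainder and
 * subtracting the denominator whenever it fits. */
uint64_t demo_udiv64(uint64_t numer, uint64_t denom, uint64_t *rem_out)
{
   uint64_t quot = 0;
   uint64_t rem = 0;

   assert(denom != 0);

   for (int bit = 63; bit >= 0; bit--) {
      /* rem holds at most the bits consumed so far, so this shift
       * cannot overflow. */
      rem = (rem << 1) | ((numer >> bit) & 1);

      if (rem >= denom) {
         rem -= denom;
         quot |= (uint64_t)1 << bit;
      }
   }

   if (rem_out != NULL)
      *rem_out = rem;

   return quot;
}
```

A lowering pass has to express the same steps with narrower operations and selects instead of branches, which is where most of the complexity comes from.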
v2: Properly handle vectors
v3: Get rid of log2_denom stuff. Since we're using bcsel, we do all the
calculations anyway and this is just extra instructions.
v4:
- Add back in the log2_denom stuff since it's needed for ensuring that
the shifts don't overflow.
- Rework the looping part of the pass to be easier to expand.
Reviewed-by: Matt Turner <mattst88@gmail.com>