There was only a single user which was using strlen(buf).
As this function is not user facing (i.e. we don't need to feed back
original length via a callback), we can simplify things.
Suggested-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Coverity noticed that we were passing this by value, and it's 152 bytes.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
This patch replaces the old interger constant qualifiers with either
the new ast_layout_expression type if the qualifier requires merging
or ast_expression if the qualifier can't have mulitple declarations
or if all but the newest qualifier is simply ignored.
We also update the process_qualifier_constant() helper to be
similar to the one in the ast_layout_expression class, but in
this case it will be used to process the ast_expression qualifiers.
Global shader layout qualifier validation is moved out of the parser
in this change as we now need to evaluate any constant expression
before doing the validation.
V2: Fix minimum value check for vertices (Emil)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This will allow us to add error checking to this function
in a later patch, if we don't move it the error messages
will go missing.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This adds a state for the maximum dual source draw variables available
and the variable for determining if the extension has been enabled
in the program shaders.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Mesa unconditionally sets this driver flag to true in
_mesa_init_extensions(). There is therefore no need for
the driver to communicate support for this extension.
Replace the driver capability flag with ::dummy_true.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
With NIR, it actually hurts things.
total instructions in shared programs: 6529329 -> 6528888 (-0.01%)
instructions in affected programs: 14833 -> 14392 (-2.97%)
helped: 299
HURT: 1
In all affected programs I inspected (including the single hurt one) the
pass CSE'd some multiplies and caused some reassociation (e.g., caused
(A * B) * C to be A * (B * C)) when the original intermediate result was
reused elsewhere.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
This kind of definitions:
layout(xxx) buffer;
was not supported by commit 84fc5fece0.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
We initialize gl_GlobalInvocationID based on the extension spec
formula:
gl_GlobalInvocationID =
gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID
https://www.opengl.org/registry/specs/ARB/compute_shader.txt
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
v2: use ARB_texture_multisample enable bit
Patch adds extension enable bit and enables required keywords
and builtin functions for the extension.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
The name of both the GLSL built-in variable and the glGetInteger param
with the same value changed in GLSL ES 3.1 and GL 4.5. Its semantics
also changed slightly, since the limit now also takes into account the
number of SSBs in use. Switch our internal data structures to the
up-to-date name.
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This lowers the enhanced ir_call using the lookaside table
of subroutines into an if ladder. This initially was done
at the AST level but it caused some ordering issues so a separate
pass was required.
v2: clone return value derefs.
v2.1: update for subroutine->int convert.
v2.2: add a clone for the array index
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is the guts of the GLSL parser and AST support for
shader subroutines.
The code creates a subroutine type in the parser, and
uses that there to validate the identifiers. The parser
also distinguishes between subroutine types/function prototypes
/uniforms and subroutine defintions for functions.
Then in the AST conversion it recreates the types, and
stores the subroutine definition info or subroutine info
into the ir_function along with a side lookup table in
the parser state. It also converts subroutine calls into
the enhanced ir_call.
v2: move to handling method calls in
function handling not in field selection.
v3: merge Chris's previous parser patches in here, to
make it clearer what's changed in one place.
v3.1: add more documentation, drop unused include
v3.2: drop is_subroutine_def
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is used to identify shader storage buffer interface blocks where
buffer variables are declared.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
We now have is_array() and without_array() that make the
code much clearer and remove the need for this.
For all remaining calls to this we already knew that
the type was an array so returning a null wasn't adding any value.
v2: use without_array() in _mesa_ast_array_index_to_hir() and don't use
without_array() in lower_clip_distance_visitor() as we want to make sure the
array is 2D.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Just more boilerplate stuff.
v2:
bad fallthrough on versioning,
this is my ugly but self contained solution (Ian)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Currently no 3.10 ES features (beyond 3.00 ES) are enabled. That will
come later.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
And rename _mesa_glsl_parse_state::early_fragment_tests to
fs_early_fragment_tests for consistency with other FS-specific flags in the
same struct.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Previously, the ctx->Const.ForceGLSLVersion setting only worked if
the shader lacked a #version directive. Now, the ForceGLSLVersion
setting will override the #version directive too.
This change should be safe since it should be rare to have an app
that has a mix of shader versions and we only wanted to override
the #version for shaders which lacked the #version directive.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
st_glsl_to_tgsi and ir_to_mesa have handled conditional discards for a
long time; the previous patch added that capability to i965.
i965 (Haswell) shader-db stats:
Without NIR:
total instructions in shared programs: 5792133 -> 5776360 (-0.27%)
instructions in affected programs: 737585 -> 721812 (-2.14%)
helped: 6300
HURT: 68
GAINED: 2
With NIR:
total instructions in shared programs: 5787538 -> 5769569 (-0.31%)
instructions in affected programs: 767843 -> 749874 (-2.34%)
helped: 6522
HURT: 35
GAINED: 6
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This is basically Ian's review feedback for my patch that added
_mesa_shader_stage_to_abbrev() - it just makes both consistent again.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is similar to _mesa_shader_stage_to_string(), but returns "VS"
instead of "vertex".
v2: Use unreachable() and add MESA_SHADER_COMPUTE (requested by Ian).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
v2: add define bit (Tapani Pälli)
Patch makes following Piglit tests pass:
arb_gpu_shader_fp64/preprocessor/define.vert
arb_gpu_shader_fp64/preprocessor/define.frag
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Patch enables ES2 extension that utilizes existing ES3 functionality.
Changes make all the subtests to run and pass in WebGL conformance
test 'webgl-draw-buffers' when running Chrome on OpenGL ES, also
Piglit test 'draw_buffers_gles2' passes.
v2: remove unused boolean (Ilia Mirkin)
v3: proper error checking for invalid values (Chad Versace)
v4: run error check explicitly for ES2 and ES3 (Kenneth Graunke)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
It seems to have been forgotten during viewports array implementation time.
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
There may be two contexts compiling shaders at the same time, and we want the
anonymous struct id to be globally unique.
Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Original patch by Petri Latvala <petri.latvala@intel.com>:
Add an optimization pass that drops min/max expression operands that
can be proven to not contribute to the final result. The algorithm is
similar to alpha-beta pruning on a minmax search, from the field of
AI.
This optimization pass can optimize min/max expressions where operands
are min/max expressions. Such code can appear in shaders by itself, or
as the result of clamp() or AMD_shader_trinary_minmax functions.
This optimization pass improves the generated code for piglit's
AMD_shader_trinary_minmax tests as follows:
total instructions in shared programs: 75 -> 67 (-10.67%)
instructions in affected programs: 60 -> 52 (-13.33%)
GAINED: 0
LOST: 0
All tests (max3, min3, mid3) improved.
A full shader-db run:
total instructions in shared programs: 4293603 -> 4293575 (-0.00%)
instructions in affected programs: 1188 -> 1160 (-2.36%)
GAINED: 0
LOST: 0
Improvements happen in Guacamelee and Serious Sam 3. One shader from
Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of
dropping of a (constant float (0.00000)) operand, which was
compiled to a saturate modifier.
Version 2 by Iago Toral Quiroga <itoral@igalia.com>:
Changes from review feedback:
- Squashed various cosmetic changes sent by Matt Turner.
- Make less_all_components return an enum rather than setting a class member.
(Suggested by Mat Turner). Also, renamed it to compare_components.
- Make less_all_components, smaller_constant and larger_constant static.
(Suggested by Mat Turner)
- Change mixmax_range to call its limits "low" and "high" instead of
"range[0]" and "range[1]". (Suggested by Connor Abbot).
- Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
Connor Abbot).
- Make the logic more clearer by rearrenging the code and commenting.
(Suggested by Connor Abbot).
- Added comment to explain why we need to recurse twice. (Suggested by
Connor Abbot).
- If we cannot prune an expression, do not return early. Instead, attempt
to prune its children. (Suggested by Connor Abbot).
Other changes:
- Instead of having a global "valid" visitor member, let the various functions
that can determine this status return a boolean and check for its value
to decide what to do in each case. This is more flexible and allows to
recurse into children of parents that could not be prunned due to invalid
ranges (so related to the last bullet in the review feedback).
- Make sure we always check if a range is valid before working with it. Since
any use of get_range, combine_range or range_intersection can invalidate
a range we should check for this situation every time we use any of these
functions.
Version 3 by Iago Toral Quiroga <itoral@igalia.com>:
Changes from review feedback:
- Now we can make get_range, combine_range and range_intersection static too
(suggested by Connor Abbot).
- Do not return NULL when looking for the larger or greater constant into
mixed vector constants. Instead, produce a new constant by doing a
component-wise minmax. With this we can also remove of the validations when
we call into these functions (suggested by Connor Abbot).
- Add a comment explaining the meaning of the baserange argument in
prune_expression (suggested by Connor Abbot).
Other changes:
- Eliminate minmax expressions operating on constant vectors with mixed values
by resolving them.
No piglit regressions observed with Version 3.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 74 40,578,719,715 67,762,208 62,263,404 5,498,804 0
After (32-bit): 52 40,565,579,466 66,359,800 61,187,818 5,171,982 0
Before (64-bit): 74 37,129,541,061 95,195,160 87,369,671 7,825,489 0
After (64-bit): 76 37,134,691,404 93,271,352 85,900,223 7,371,129 0
A real savings of 1.0MiB on 32-bit and 1.4MiB on 64-bit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Later patches will give every ir_var_temporary the same name in release
builds. Adding a bunch of variables named "compiler_temp" to the symbol
table can only cause problems.
No change Valgrind massif results for a trimmed apitrace of dota2.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>