We get these when we operate on vector variables with array accessors
(i.e. things like a[0] where 'a' is a vec4). When we call variable_referenced()
on these expressions we want to return a reference to 'a' instead of NULL.
This fixes a problem where we pass a[0] as the first argument to an atomic
SSBO function that expects a buffer variable. In order to check this, we use
variable_referenced(), but that is currently returning NULL in this case, since
the underlying rvalue is a vector_extract expression.
Tested-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
The unsized array length is computed with the following formula:
array.length() =
max((buffer_object_size - offset_of_array) / stride_of_array, 0)
Of these, only the buffer size needs to be provided by the backends, the
frontend already knows the values of the two other variables.
This patch identifies the cases where we need to get the length of an
unsized array, injecting ir_unop_ssbo_unsized_array_length expressions
that will be lowered (in a later patch) to inject the formula mentioned
above.
It also adds the ir_unop_get_buffer_size expression that drivers will
implement to provide the buffer length.
v2:
- Do not define a triop that will force backends to implement the
entire formula, they should only need to provide the buffer size
since the other values are known by the frontend (Curro).
v3:
- Call state->has_shader_storage_buffer_objects() in ast_function.cpp instead
of using state->ARB_shader_storage_buffer_object_enable (Tapani).
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
They only can be defined in the last position of the shader
storage blocks.
When an unsized array is used in different shaders, it might be
converted in different sized arrays, avoid get a linker error
in that case.
v2:
- Rework error condition and error messages (Timothy Arceri)
v3:
- Move OpenGL ES check to its own patch.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Will be used for textureSamples()
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
A simple shader such as
vec4 color;
color.xy.x = 1.0;
would cause ir_assignment::set_lhs() to generate bogus IR:
(swiz xy (swiz x (constant float (1.0))))
We were setting the number of components of each new RHS swizzle based
on the highest channel used in the LHS swizzle. So, .xy.y would
generate (swiz xy (swiz xx ...)), while .xy.x would break.
Our existing Piglit test happened to use .xzy.z, which worked, since
'z' is the third component, resulting in an xxx swizzle.
This patch sets the number of swizzle components based on the size of
the LHS swizzle's inner value, so we always have the correct number
at each step.
Fixes new Piglit tests glsl-vs-swizzle-swizzle-lhs-[23].
Fixes ir_validate assertions in in Metro 2033 Redux.
v2: Move num_components updating completely out of update_rhs_swizzle
(suggested by Timothy Arceri). Simplify.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
We need to store two sets of info into the ir_function,
if this is a function definition with a subroutine list
(subroutine_def) or if it a subroutine prototype.
v1.1: add some more documentation.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This type will be used to store the name of subroutine types
as in subroutine void myfunc(void);
will store myfunc into a subroutine type.
This is required to the parser can identify a subroutine
type in a uniform decleration as a valid type, and also for
looking up the type later.
Also add contains_subroutine method.
v2: handle subroutine to int comparisons, needed
for lowering pass.
v3: do subroutine to int with it's own IR
operation to avoid hacking on asserts (Kayden)
v3.1: fix warnings in this patch, fix nir,
fix tgsi
v3.2: fixup tests
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
tests: fix warnings
This will be used to identify buffer variables inside shader storage
buffer objects, which are very similar to uniforms except for a few
differences, most important of which is that they are writable.
Since buffer variables are so similar to uniforms, we will almost always
want them to go through the same paths as uniforms.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
We now have is_array() and without_array() that make the
code much clearer and remove the need for this.
For all remaining calls to this we already knew that
the type was an array so returning a null wasn't adding any value.
v2: use without_array() in _mesa_ast_array_index_to_hir() and don't use
without_array() in lower_clip_distance_visitor() as we want to make sure the
array is 2D.
Reviewed-by: Matt Turner <mattst88@gmail.com>
These were added in commit f2616e56, presumably in preparation for
translating ARB vp/fp into GLSL IR. That never happened, and neither did
a lowering pass that actually generated these instructions.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
The optimization in commit d056863b covers these cases, which were the
first optimizations I added to the GLSL compiler.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 74 40,578,719,715 67,762,208 62,263,404 5,498,804 0
After (32-bit): 52 40,565,579,466 66,359,800 61,187,818 5,171,982 0
Before (64-bit): 74 37,129,541,061 95,195,160 87,369,671 7,825,489 0
After (64-bit): 76 37,134,691,404 93,271,352 85,900,223 7,371,129 0
A real savings of 1.0MiB on 32-bit and 1.4MiB on 64-bit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Specifically, ir_var_temporary variables constructed with a NULL name
will all have the name "compiler_temp" in static storage.
No change Valgrind massif results for a trimmed apitrace of dota2.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
At least one of these pointers must be NULL, and we can determine which
will be NULL by looking at other fields. Use this information to store
both pointers in the same location.
If anyone can think of a better name for the union than "u", I'm all
ears.
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 63 40,574,239,515 68,117,280 62,618,607 5,498,673 0
After (32-bit): 44 40,577,049,140 68,118,608 62,441,063 5,677,545 0
Before (64-bit): 53 37,126,451,468 95,150,256 87,711,304 7,438,952 0
After (64-bit): 63 37,122,829,194 95,153,008 87,333,600 7,819,408 0
A real savings of 173KiB on 32-bit and 368KiB on 64-bit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Also move the new warn_extension_index into ir_variable::data. This
enables slightly better packing.
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 82 40,580,040,531 68,488,992 62,973,695 5,515,297 0
After (32-bit): 73 40,580,476,304 68,488,400 62,796,151 5,692,249 0
Before (64-bit): 65 37,124,013,542 95,892,768 88,466,712 7,426,056 0
After (64-bit): 71 37,124,890,613 95,889,584 88,089,008 7,800,576 0
A real savings of 173KiB on 32-bit and 368KiB on 64-bit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
The payoff for this will come in the next patch.
No change Valgrind massif results for a trimmed apitrace of dota2.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
All of the GL image enums fit in 16-bits.
Also move the fields from the anonymous "image" structucture to the next
higher structure. This will enable packing the bits with the other
bitfield.
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 76 40,572,916,873 68,831,248 63,328,783 5,502,465 0
After (32-bit): 70 40,577,421,777 68,487,584 62,973,695 5,513,889 0
Before (64-bit): 60 36,822,640,058 96,526,824 88,735,296 7,791,528 0
After (64-bit): 74 37,124,603,758 95,891,808 88,466,712 7,425,096 0
A real savings of 346KiB on 32-bit and 262KiB on 64-bit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Just use ir_variable::data.binding... because that's the where the
binding is stored for everything else that can use layout(binding=).
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 50 40,564,927,443 69,185,408 63,683,871 5,501,537 0
After (32-bit): 74 40,580,119,657 69,186,544 63,506,327 5,680,217 0
Before (64-bit): 59 36,822,048,449 96,526,888 89,113,000 7,413,888 0
After (64-bit): 89 36,822,971,897 96,526,616 88,735,296 7,791,320 0
A real savings of 173KiB on 32-bit and 368KiB on 64-bit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Will be used to implement interpolateAt*() from ARB_gpu_shader5
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
This has the added perk that if you forget to set ir_type in the
constructor of a new subclass (or a new constructor of an existing
subclass) the compiler will tell you... instead of relying on
ir_validate or similar run-time detection.
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
This allows them to be moved to .rodata, and allow us to be sure that they
will not be modified.
Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
The i965 MUL instruction doesn't natively support 32-bit by 32-bit
integer multiplication; additional instructions (MACH/MOV) are required.
However, we can avoid those if we know one of the operands can be
represented in 16 bits or less. The vector backend's is_16bit_constant
static helper function checks for this.
We want to be able to use it in the scalar backend as well, which means
moving the function to a more generally-usable location. Since it isn't
i965 specific, I decided to make it an ir_constant method, in case it
ends up being useful to other people as well.
v2: Rename from is_16bit_integer_constant to is_uint16_constant, as
suggested by Ilia Mirkin. Update comments to clarify that it does
apply to both int and uint types, as long as the value is
non-negative and fits in 16-bits.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
To fix MSVC compile breakage. Evidently, _restrict is an MSVC keyword,
though the docs only mention __restrict (with two underscores).
Note: we may want to also rename _volatile to volatile_flag to be
consistent.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74900
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
No opaque types may be statically initialized in the shader, all
opaque variables must be declared uniform or be part of an "in"
function parameter declaration, no opaque types may be used as the
return type of a function.
v2: Add explicit check for opaque types in interface blocks. Check
for opaque types in ir_dereference::is_lvalue().
Reviewed-by: Paul Berry <stereotype441@gmail.com>
When handling function calls, we often want to walk through the list of
formal parameters and list of actual parameters at the same time.
(Both are guaranteed to be the same length.)
Previously, we used a pattern of:
exec_list_iterator 1st_iter = <1st list>.iterator();
foreach_iter(exec_list_iterator, 2nd_iter, <2nd list>) {
...
1st_iter.next();
}
This was awkward, since you had to manually iterate through one of
the two lists.
This patch introduces a foreach_two_lists macro which safely walks
through two lists at the same time, so you can simply do:
foreach_two_lists(1st_node, <1st list>, 2nd_node, <2nd list>) {
...
}
v2: Rename macro from foreach_list2 to foreach_two_lists, as suggested
by Ian Romanick.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
In these cases, we edit the list (or at least might be), so we use the
foreach_list_safe variant.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
foreach_iter and exec_list_iterators have been deprecated for some time now;
we just hadn't ever bothered to convert code to the newer foreach_list
and foreach_list_safe macros.
In these cases, we aren't editing the list, so we can use foreach_list
rather than foreach_list_safe.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch creates a new generic is_value() method, which checks if an
ir_constant has a particular value. (For vectors, it must have the
single value repeated across all components.)
It then rewrites the is_zero/is_one/is_negative_one methods to use this
generic helper. All three were basically identical except for the value
they checked for. The other difference is that is_negative_one rejects
boolean types. The new is_value function maintains this behavior, only
allowing boolean types when checking for 0 or 1.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This patch moves following bitfields and variables to the data
structure:
explicit_location, explicit_index, explicit_binding, has_initializer,
is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray,
from_named_ifc_block_array, depth_layout, location, index, binding,
max_array_access, atomic
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This patch moves following bitfields in to the data structure:
used, assigned, how_declared, mode, interpolation,
origin_upper_left, pixel_center_integer
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Data section helps serialization and cloning of a ir_variable. This
patch includes the helper bits used for read only ir_variables.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Now that loop_controls no longer creates normatively bound loops,
there is no need for ir_loop::normative_bound or the
lower_bounded_loops pass.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch replaces the ir_loop fields "from", "to", "increment",
"counter", and "cmp" with a single integer ("normative_bound") that
serves the same purpose.
I've used the name "normative_bound" to emphasize the fact that the
back-end is required to emit code to prevent the loop from running
more than normative_bound times. (By contrast, an "informative" bound
would be a bound that is informational only).
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
There's no need to loop through the "parameters" list and remove every
element; move_nodes_to(¶meters) already throws away all elements of
the destination list.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
From section 7.1 (Built-In Language Variables) of the GLSL 4.10
spec:
Also, if a built-in interface block is redeclared, no member of
the built-in declaration can be redeclared outside the block
redeclaration.
We have been regarding this text as a clarification to the behaviour
established for gl_PerVertex by GLSL 1.50, so we apply it regardless
of GLSL version.
This patch enforces the rule by adding an enum to ir_variable to track
how the variable was declared: implicitly, normally, or in an
interface block.
Fixes piglit tests:
- gs-redeclares-pervertex-out-after-global-redeclaration.geom
- vs-redeclares-pervertex-out-after-global-redeclaration.vert
- gs-redeclares-pervertex-out-after-other-global-redeclaration.geom
- vs-redeclares-pervertex-out-after-other-global-redeclaration.vert
- gs-redeclares-pervertex-out-before-global-redeclaration
- vs-redeclares-pervertex-out-before-global-redeclaration
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
v2: Don't set "how_declared" redundantly in builtin_variables.cpp.
Properly clone "how_declared".
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
I made this a function (instead of a method of ir_variable) because it
made the change set smaller, and I expect that there will be an overload
that takes an ir_var_mode enum. Having both functions used the same way
seemed better.
v2: Add missing case for ir_var_system_value.
v3: Change the ir_var_mode_count case to just break. Move the assertion
and the return outside the switch-statment. In the unlikely event that
var->mode is an invalid value other than ir_var_mode_count, the
assertion will still fire, and in release builds we won't wind up
returning a garbage pointer. Suggested by Paul.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>