dotlike() uses ir_binop_mul for scalars, and ir_binop_dot for vectors.
When generating built-in functions, we often want to use regular
multiply for scalar signatures, and dot() for vector signatures.
ir_binop_dot only works on vectors, so we have to switch opcodes,
even if the code is otherwise identical. dotlike() makes this easy.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
We use "ret" as the function name since "return" is a C++ keyword, and
"ir_return" is already a class name.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This adds two new signatures:
assign(lhs, rhs, condition, writemask);
assign(lhs, rhs, condition);
All the other existing APIs still exist.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Now that we have the ir_expression constructor that does type inference,
this is trivial to do.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
We already have ir_expression constructors for unary and binary
operations, which automatically infer the type based on the opcode and
operand types.
These are convenient and also required for ir_builder support.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This isn't strictly necessary, since creators of ir_texture objects
should set LOD when relevant. However, it's nice to have a NULL pointer
in case they forget.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
During compilation, we'll use this to determine built-in availability.
The plan is to have a single shader containing every built-in in every
version of the language, but filter out the ones that aren't actually
available to the shader being compiled.
At link time, we don't actually need this filtering capability: we've
already imported prototypes for every built-in that the shader actually
calls, and they're flagged as is_builtin(). The linker doesn't import
any additional prototypes, so it won't pull in any unavailable
built-ins. When resolving prototypes to function definitions, the
linker ensures the values of is_builtin() match, which means that a
shader can't trick the linker into importing the body of an unavailable
built-in by defining a suspiciously similar prototype.
In other words, during linking, we can just pass in NULL. It will work
out fine.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
We can simply call the stored predicate function. If state is NULL,
just report that the function is available.
v2: Add a comment (requested by Paul Berry).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This promises the method won't modify the contents of the object.
This allows us to call it even with a const pointer to the state.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
A signature is a built-in if and only if builtin_info != NULL, so we
don't actually need a separate flag bit. Making a boolean-valued
method allows existing code to ask the same question while not worrying
about the internal representation.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
For the upcoming built-in function rewrite, we'll need to be able to
answer "Is this built-in function signature available?".
This is actually a somewhat complex question, since it depends on the
language version, GLSL vs. GLSL ES, enabled extensions, and the current
shader stage.
Storing such a set of constraints in a structure would be painful, so
instead we store a function pointer. When creating a signature, we
simply point to a predicate that inspects _mesa_glsl_parse_state and
answers whether the signature is available in the current shader.
Unfortunately, IR reader doesn't actually know when built-in functions
are available, so this patch makes it lie and say that they're always
present. This allows us to hook up the new functionality; it just won't
be useful until real data is populated. In the meantime, the existing
profile mechanism ensures built-ins are available in the right places.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Otherwise, coordinates with four components would result in a MOV
with a destination writemask that has no channels enabled:
mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q };
At best, this is stupid: we emit code that shouldn't do anything.
Worse, it apparently causes GPU hangs (observable with Chris's
textureGather test on CubeArrays.)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
The change is very small. Do seamless filtering if either the context
enable is set or the sampler enable is set.
The AMD_seamless_cubemap_per_texture says:
"If TEXTURE_CUBE_MAP_SEAMLESS_ARB is emabled (sic) globally or the
value of the texture's TEXTURE_CUBE_MAP_SEAMLESS_ARB parameter is
TRUE, seamless cube map sampling is enabled..."
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Appendix F.2 of the OpenGL ES 3.0.0 spec says:
"OpenGL ES 3.0 requires that all cube map filtering be
seamless. OpenGL ES 2.0 specified that a single cube map face be
selected and used for filtering."
Setting the field only in the context will work fine with sampler
objects (and drivers that support AMD_seamless_cubemap_per_texture)
because seamless filtering is used if *either* the context or the
sampler enable it:
"If TEXTURE_CUBE_MAP_SEAMLESS_ARB is emabled (sic) globally or the
value of the texture's TEXTURE_CUBE_MAP_SEAMLESS_ARB parameter is
TRUE, seamless cube map sampling is enabled..."
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reported-by: Maxence Le Dore <maxence.ledore@gmail.com>
Thanked-by: Maxence Le Dore <maxence.ledore@gmail.com>
There is no GL_TEXTURE_CUBE_MAP_SEAMLESS in any version of OpenGL ES or
in any extension that applies to OpenGL ES. The same error check
already occurs for glTexParameteri.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: Maxence Le Dore <maxence.ledore@gmail.com>
This aligns the gfx, compute, and dma IBs to 8 DW boundries.
This aligns the the IB to the fetch size of the CP for optimal
performance. Additionally, r6xx hardware requires at least 4
DW alignment to avoid a hw bug. This also aligns the DMA
IBs to 8 DW which is required for the DMA engine. This
alignment is already handled in the gallium driver, but that
patch can be removed now that it's done in the winsys.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
278372b47e added the uninitialized pointer
field gl_query_object:Label. A free of this pointer resulted in a crash.
This patch fixes piglit regressions with swrast introduced by
6d8dd59cf5.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69047
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
We support indirect addressing only on the vertex index, but some
shaders also use indirect addressing on attributes. This patch
adds support for indirect addressing on both dimensions inside
gs arrays.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
When adding a new buffer to the beginning of the memory pool, we were
accidentally deleting the buffer that was first in the buffer list.
This was caused by a bug in the memory pool's linked list
implementation.
DPA2 is listed in the "Defeatured Instructions" section of the
965 PRM, Volume 4:
"The following instructions are removed from Gen4 implementation mainly
due to implementation cost/schedule reasons. They are candidates for
future generations."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
RSR and RSL are listed in the "Defeatured Instructions" section of the
965 PRM, Volume 4:
"The following instructions are removed from Gen4 implementation mainly
due to implementation cost/schedule reasons. They are candidates for
future generations."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes a bug where if an uniform array is passed to a function the accesses
to the array are not propagated so later all but the first vector of the
uniform array are removed in parcel_out_uniform_storage resulting in
broken shaders and out of bounds access to arrays in
brw::vec4_visitor::pack_uniform_registers.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dominik Behr <dbehr@chromium.org>
Haswell GT2 and GT3 require the number of vertex shader URB entries to
be at least 64, not 32.
At the moment, we always meet this requirement automatically, because
in the absence of a geometry shader, we assign all available URB space
to the vertex shader. But when we turn on support for geometry
shaders, this lower limit will become important.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
This patch creates a new file brw_vec4_vs_visitor.cpp, to contain code
that is specific to the vertex shader. Now the organization of vertex
shader and geometry shader visitor code is symmetric: vs-specific code
is in brw_vec4_vs_visitor.cpp, gs-specific code is in
brw_vec4_gs_visitor.cpp, and code shared between vs and gs is in
brw_vec4_visitor.cpp.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
This will allow it to be shared between brw_vec4_visitor.cpp and
brw_vec4_vs_visitor.cpp (which will be created in the next patch).
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
The fixup code emulates non-BGRA render targets by adding an
extra instruction at the end of fragment shaders to swizzle the
output. To do this, we also swizzle the blend function. However
an oversight until now was that the writemask wasn't getting
swizzled. This patch fixes that which fixes a bunch of piglit
tests.
The old code in dri2_glx suffered from a typographical error that caused
the default version to be 2.1 instead of 1.2 (minimum required by the
Linux OpenGL ABI). drisw_glx had a similar error resulting in a default
version of 0.1.
Some driver/card combinations (r200/RV280, i915/915G) don't support
OpenGL 2.1. These create in some corner cases an indirect context
instead of a direct context when calling glXCreateContextAttribsARB().
This happens because of a bad default value. To avoid this, just used
the default value specified by the GLX_ARB_create_context specification:
"The default values for GLX_CONTEXT_MAJOR_VERSION_ARB and
GLX_CONTEXT_MINOR_VERSION_ARB are 1 and 0 respectively. In this
case, implementations will typically return the most recent version
of OpenGL they support which is backwards compatible with OpenGL 1.0
(e.g. 3.0, 3.1 + GL_ARB_compatibility, or 3.2 compatibility
profile)"
Refactor all the default value setting to dri2_convert_glx_attribs, and
make sure the correct defaults are set in that one place.
Signed-off-by: Rico Schüller <kgbricola@web.de>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla http://bugs.winehq.org/show_bug.cgi?id=34238
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
This patch adds liveness analysis to i915g and a couple
optimizations which benefit from it. One interesting
optimization turns (fake) indirect texture accesses into direct
texture accesses (the i915 supports a maximum of 4 indirect
texture accesses). Among other things this fixes a bunch of
piglit tests.
It looks like commit 53febac removed the last user of that parameter.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
The vertex shader color outputs (gl_FrontColor, gl_BackColor,
gl_FrontSecondaryColor, and gl_BackSecondaryColor) don't have the same
names as the matching fragment shader color inputs (gl_Color and
gl_SecondaryColor). As a result, the qualifiers on them were not being
properly cross validated.
Full spec compliance required ir_variable::used and
ir_variable::assigned be set properly. Without the preceeding patch,
which fixes the ::clone method to copy them, this will not be the case.
Fixes all of the previously failing piglit
spec/glsl-1.30/linker/interpolation-qualifiers tests.
v2: Update callers of cross_validate_types_and_qualifiers and
cross_validate_front_and_back_color. The function signature changed in
v2 of a previous patch. Suggested by Paul.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47755