Also rename to _mesa_get_main_function_signature.
We will call it near the end of compilation to insert some code into
main for initializing some compute shader global variables.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
We lower gl_GlobalInvocationID based on the extension spec formula:
gl_GlobalInvocationID =
gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID
https://www.opengl.org/registry/specs/ARB/compute_shader.txt
We need to set this variable in main(), even if gl_GlobalInvocationID
is not referenced by the shader. (It may be used by a linked shader.)
Therefore, we can't eliminate these as dead variables.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This is to avoid needless float<->int conversions, since all
face-related computations are made on integers. Spotted by Emil
Velikov.
Reviewed-by: Brian Paul <brianp@vmware.com>
New enum to add to switch so compiler doesn't complain.
commit 1807a08e4f
Author: Ilia Mirkin <imirkin@alum.mit.edu>
AuthorDate: Thu Aug 27 23:05:03 2015 -0400
Commit: Ilia Mirkin <imirkin@alum.mit.edu>
CommitDate: Thu Sep 10 17:38:33 2015 -0400
nir: add nir_texop_texture_samples and convert from glsl
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rather than make yet another copy of channel(), let's move it into nir.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Sometimes a useful thing for compilers (or, for example, tgsi_to_nir) to
know. And pretty trivial for scan to figure this out for us.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
In commit f9caabe8f1:
One place in r600_llvm.c was forgotten when replacing
R600_UCP_CONST_BUFFER with R600_BUFFER_INFO_CONST_BUFFER.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91985
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Cypress/Cayman/Aruba, earlier r6xx/r7xx chips only support a subset
of the needed fp64 ops, and don't do GL4 anyway.
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Only for Cypress/Cayman/Aruba, older chips have only partial fp64 support.
Uses float intermediate values so only accurate for int24 range, which
matches what the blob does.
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
I'm going to want a driver constant buffer for tess to coordinate
LDS storage, so before I go tackling that I decided to merge the
clip/samplepos and texture info buffers into one. So I can steal
the spare one.
This creates a single constant buffer between the two, with
clip/samplepos taking up a reserved 128 bytes at the start.
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just puts these in one place and #defines them.
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
V2: -Change to "not started" for most entries
-Add status for multisample_2d_array
-Change shader_multisample_interpolation to "not_stared"
V3 (idr): Move the GLES 3.2 section after the "Additional functions"
section from GLES 3.1. Note that GL_KHR_texture_compression_astc_hdr is
done for i965 on gen9+ hardware. Note that GL_OES_shader_io_blocks is
based on some features from GLSL 1.50.
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v2]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This commit makes a lot of variables constant - this is basically done
by moving the computation to variable definition. Some of them are
moved into lower scopes (like in img_filter_2d_ewa).
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Add a small inline function doing the casting - this is to make sure
we don't do a cast from some completely unrelated type. This commit
does not make tgsi_sampler parameters const in vfuncs themselves for
now - probably llvmpipe would need looking at before making such a
change.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Those functions actually could always take them as constants.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Those functions actually could always take them as constants.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
A followup from previous commit - since all functions called by
query_lod take pointers to const sp_sampler_view and const sp_sampler,
which are taken from tgsi_sampler subclass, we can the tgsi_sampler as
const itself now.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This is to prepare for making tgsi_sampler parameter in query_lod a
const too. These functions do not modify anything in either sampler or
view anymore.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
With that, sp_sampler_view instances are not abused anymore as a local
storage, so we can later make them constant.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
As of a10d4937, we would really like things associated with an instruction
to be allocated out of that instruction and not out of the shader. In
particular, you should be passing the instruction that will ultimately be
holding the source into nir_src_copy rather than an arbitrary memory
context.
We also change the prototypes of nir_dest_copy and nir_alu_src/dest_copy to
explicitly take an instruction so we catch this earlier in the future.
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
We copy the output, make the old output the temporary, and give the
temporary a new name. The copy keeps the pointer to the old name. This
works just fine up until the point where we lower things to SSA and delete
the old variable and, with it, the name. Instead, we should re-parent to
the copy.
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
opt_register_coalesce stopped to check previous instructions to
coalesce with if somebody else was writing on the same
destination. This can be optimized to check if somebody else was
writing to the same channels of the same destination using the
writemask.
Shader DB results (taking into account only vec4):
total instructions in shared programs: 1781593 -> 1734957 (-2.62%)
instructions in affected programs: 1238390 -> 1191754 (-3.77%)
helped: 12782
HURT: 0
GAINED: 0
LOST: 0
v2: removed some parenthesis, fixed indentation, as suggested by
Matt Turner
v3: added brackets, for consistency, as suggested by Eduardo Lima
Reviewed-by: Matt Turner <mattst88@gmail.com>
This makes it possible for NIR shaders to know the number of output
vertices and the number of invocations. Drivers could also access
these directly without going through gl_program.
We should probably add InputType and OutputType here too, but currently
those are stored as GL_* enums, and I wanted to avoid using those in
NIR, as I suspect Vulkan/SPIR-V will use different enums. (We should
probably make our own.)
We could add VerticesIn, but it's easily computable from the input
topology, so I'm not sure whether it's worth it. It's also currently
not stored in gl_shader (only gl_shader_program), which would require
changes to the glsl_to_nir interface or require us to store it there.
This is a bit of duplication of data...ideally, we would factor these
substructs out of gl_program, gl_shader_program, and nir_shader, creating
a gl_geometry_info class...but it would need to go in a new place (in
src/glsl?) that isn't mtypes.h nor nir.h.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
These provide a convenient way to do simple variable loads and stores.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cuts compile/link time of the fragment shader in #91857 by 19%
(16.28 -> 13.05).
I didn't bother with the acp sets because they're smaller, but it
might be worth doing as well.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91857
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Cuts compile/link time of the fragment shader in #91857 by 25%
(21.64 -> 16.28).
v2: Drop unnecessary _mesa_hash_table_destroy call, and use
refs.ht->entries == 0 rather than ad-hoc checking (suggested by
Timothy Arceri).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91857
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Cuts compile/link time of the fragment shader in bug #91857 by 31%
(31.79 -> 21.64). It has over 8,000 variables so linked lists are
terrible.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91857
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Previously the result of the complicated clamp() expression just dropped
on the floor: clamp does not modify any of its parameters. Looking at
the surrounding code, I believe this is supposed to modify the value of
tex_coord.
This change (along with a change to avoid the use of
brw_blorp_framebuffer) does not affect any existing piglit tests. I'm
not sure what this clamp is trying to accomplish, so I'm not sure how to
write a test to exercise this path.
I also noticed another bug in this code. There is no way the array
texture case could possibly work. This will generate code for the
TEXEL_FETCH macro like:
#define TEXEL_FETCH(coord) texelFetch(texSampler, ivec3(coord), sample_map[int(2 * fract(coord.x))]);
Since the coord parameter of this macro is a vec2 at all invocations, no
expansion of this macro will even compile.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
We may have been called from glGenerateTextureMipmap with CurrentUnit
still set to 0, so we don't know when we can skip binding the texture.
Assume that _mesa_BindTexture will be fast if we're rebinding the same
texture.
v2: Remove currentTexUnitSave because it is now unused. Suggested by
both Neil and Anuj.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91847
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>