Future patches will add some extra code to this path, and some of that
code will want to exit from the explicit location code early.
v2: Change a geometry shader "break" to a "return" so that we don't try to
apply a bogus geometry shader location qualifier (which could cause
cascading errors). Suggested by Paul.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
The return value has been unused since commit d348b0c. This was
originally included in another patch, but it was split out by Ian
Romanick.
v2: Drop unnecessary final return. Suggested by Paul.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Incremental builds were failing because some generated source files
were missing dependencies on src/mapi/glapi/gen/*.xml.
Hopefully this change will be the end of these incremental build
failures.
This avoids a defect in lower_output_reads.
The problem is lower_output_reads treats the gl_FragData array as a single
variable. It first redirects all output writes to a temporary variable (array)
and then writes the whole temporary variable to the output, generating
assignments to all elements of gl_FragData.
BTW this pass can be modified to lower all arrays, not just inputs and outputs.
The question is whether it is worth it.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
v2: addressed Paul Berry's comments
Use PKG_CHECK_MODULES instead of requiring the user to set up the
option at configure time. Drop unused EXPAT_INCLUDE and
update all targets.
NOTE: This commit removes the --with-expat configure
option. One should ensure that the expat they wish to use
has an expat.pc file accessible by pkg-config.
v2:
* Add note about the removal of --with-expat
(per Tom Stellard)
* Drop EXPAT_CFLAGS for targets that do not build DRI_COMMON
(spotted by Matt Turner)
v3:
* Rebase on top of megadrivers (drop EXPAT_CFLAGS from swrast)
Acked-by: Matt Turner <mattst88@gmail.com> (v2)
Reviewed-by: Tom Stellard <thomas.stellard@amd.com> (v2)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Conflicts:
configure.ac
src/mesa/drivers/dri/common/Makefile.am
Already available and used in other places of configure.ac.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
The function should never have used it in the first place, as it was
a leftover from the DRI1 days of the nouveau ddx. While we're here,
check if KMS is supported before opening the nouveau device, and
add support for Fermi & Kepler cards.
Compile tested only due to the lack of a Fermi/Kepler card.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
The function xf86GetEntityInfo() only retrieves the entity; it does not
make any changes. Remove this no-op code.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Commit a9f8baf00b removed the first and only use of the variables
but forgot to remove them.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
v2: Remove xf86PciInfo.h, all drivers provide their own PCI ID list
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
A convenient front end to indices generate/translate code, for emulating
primitives which are not supported natively by the driver.
This handles saving/restoring index buffer state, etc.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
The idea of the original order was that you'd dead code eliminate accesses
to push constants. But I've never seen a case of that (nor has
shader-db), while we frequently see sparse accesses of large constant
arrays that would overflow into pull constants.
Cuts pull constant use on csgo, serious sam, planeshift, and the cave:
total instructions in shared programs: 1695103 -> 1688795 (-0.37%)
instructions in affected programs: 92024 -> 85716 (-6.85%)
GAINED: 339
LOST: 0
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* Use more consistent data sources
* Fix improper color space assignments
* Remove unnecessary comments and code
* Drop unnecessary round_up function (this was leftover
from moving winsys code out of renderer)
Acked-by: Brian Paul <brianp@vmware.com>
* Instead of assuming the displaytarget is the same
stride / colorspace as the destination, let's
actually check the source bitmap.
* Fixes random stride issues in rendering
Acked-by: Brian Paul <brianp@vmware.com>
The MRF variant is going to be used extensively by the atomic counter
intrinsics to assemble untyped atomic and surface read messages
easily.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
The maximum number of atomic buffer objects is somewhat arbitrary; we
can change it in the future easily if it turns out it's not enough...
v2: Add comments with the relevant mesa dirty bits. Fix usage of
BRW_NEW_UNIFORM_BUFFER in the GS ABO state atom.
v3: Update binding table layout diagrams.
v4: Resolve conflicts with the recent dynamic surface index assignment changes.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Almost a trivial change, it boils down to renaming a few identifiers
so their names still make sense for opaque types other than sampler.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fix the linker to deal with intrinsic functions which are undefined
all the way down to the driver back-end, and introduce intrinsic
definition helpers in the built-in generator.
We still need to figure out what kind of interface we want for drivers
to communicate to the GLSL front-end which of the supported intrinsics
should use a default GLSL implementation and which should use a
hardware-specific override. As there's no default GLSL implementation
for atomic ops, this seems like something we can worry about later on.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v2: Define local helper function to generate ir_call nodes in the
builtin generator.
And use it to forbid comparisons of opaque operands. According to the
GL 4.2 specification:
> Except for array indexing, structure member selection, and
> parentheses, opaque variables are not allowed to be operands in
> expressions.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v2: Fix GLSL version in which the type became available. Add
contains_atomic() convenience method. Split off atomic counter
comparison error checking to a separate patch that will handle all
opaque types. Include new ir_variable fields for atomic types.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch implements the common support code required for the
ARB_shader_atomic_counters extension. It defines the necessary data
structures for tracking atomic counter buffer objects (from now on
"ABOs") associated with some specific context or shader program, it
implements support for binding buffers to an ABO binding point and
querying the existing atomic counters and buffers declared by GLSL
shaders.
v2: Fix extension checks. Drop unused MAX_ATOMIC_BUFFERS constant.
Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Add XML file for the dispatch code generator, update the
dispatch_sanity test and add stub definition for the new entry point.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
These ralloc contexts belong to a specific object and are being
deallocated manually from the class destructor. Now that we've hooked
up destructors to ralloc there's no reason for them to be children of
any other context, and doing so might lead to double frees under
some circumstances. The class destructor has all the responsibility
of freeing class memory resources now.
This patch makes sure that class destructors are called as they should
be when a C++ object allocated by ralloc is released.
Based on a previous patch by Kenneth Graunke, but it doesn't exhibit
the ~0.8% performance regression in shader compilation times because
we now use the HAS_TRIVIAL_DESTRUCTOR() macro to detect the typical
case where the indirect function call can be avoided because the
object's destructor doesn't need to do anything.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Only implemented on GCC and Clang for now. Other compilers use a
dummy implementation that always returns false, which should be a safe
[but slightly inefficient] assumption in all cases.
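
A minimal sketch of the idea, not the macro from this patch: here the
C++11 standard trait stands in for the compiler builtin (an assumption),
and needs_destructor_callback() is a hypothetical helper showing how an
allocator could decide whether to register a destructor at all.

```cpp
#include <string>
#include <type_traits>

// Assumption: std::is_trivially_destructible stands in for the
// compiler-specific builtin that the real macro would use.
#if defined(__GNUC__) || defined(__clang__)
#define HAS_TRIVIAL_DESTRUCTOR(T) (std::is_trivially_destructible<T>::value)
#else
// Conservative fallback: claim that no destructor is trivial, so the
// destructor is always run. Safe, but costs an indirect call per object.
#define HAS_TRIVIAL_DESTRUCTOR(T) (false)
#endif

// Example type whose destructor does nothing.
struct plain_pod {
   int x;
   float y;
};

// Hypothetical helper: true when an allocator would have to register a
// destructor callback for objects of type T.
template <typename T>
bool needs_destructor_callback()
{
   return !HAS_TRIVIAL_DESTRUCTOR(T);
}
```

With the fallback definition every type would report that it needs the
callback, which is why returning false is safe on unknown compilers.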
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This will let us use strcasecmp() from anywhere inside Mesa without
having to worry about the fact that it doesn't exist in MSVC.
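
For illustration only, one common way to paper over the difference is
to map the name onto MSVC's _stricmp(). This is a sketch and not
necessarily how this patch does it; names_equal_ignoring_case() is a
hypothetical helper.

```cpp
#include <string.h>

#if defined(_MSC_VER)
// MSVC has no strcasecmp(); _stricmp() has the same contract.
#define strcasecmp _stricmp
#else
// POSIX declares strcasecmp() in <strings.h>.
#include <strings.h>
#endif

// Hypothetical helper: case-insensitive string equality.
static bool names_equal_ignoring_case(const char *a, const char *b)
{
   return strcasecmp(a, b) == 0;
}
```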
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The layer coming from GS needs to be clamped (not sure if that's actually
the correct error behavior but we need something) as the number can be higher
than the number of layers in the fb. However, this code was using the layer
calculation from the scene, and this was actually calculated in
lp_scene_begin_rasterization() hence too late (so setup was using the value
from the _previous_ scene or just zero if it was the first scene).
Since the value is used in both rasterization and setup, move calculation up
to lp_scene_begin_binning() though it's a bit more inconvenient to calculate
there. (Theoretically could move _all_ code which was in
lp_scene_begin_rasterization() to there, because ever since we got rid of
swizzled render/depth buffers our "map" functions preparing the fb data for
render don't actually change the data in there at all, but it feels like
it would be a hack.)
v2: improve comments
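
Rough sketch of the ordering issue, using hypothetical stand-in types
and names rather than the real llvmpipe structures: the maximum layer
has to be computed when binning begins, so that setup can clamp against
the value for the current scene.

```cpp
#include <algorithm>

// Hypothetical stand-in for the per-scene framebuffer information.
struct scene_info {
   unsigned fb_max_layer;   // index of the last valid layer
};

// Computed at the start of binning, so the value is already valid when
// setup runs for this scene (not left over from the previous one).
static void scene_begin_binning(scene_info *scene, unsigned num_fb_layers)
{
   scene->fb_max_layer = num_fb_layers > 0 ? num_fb_layers - 1 : 0;
}

// Clamp a layer index coming from the GS to the layers the fb has.
static unsigned clamp_layer(const scene_info *scene, unsigned gs_layer)
{
   return std::min(gs_layer, scene->fb_max_layer);
}
```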
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Before we were only checking the st->vertex_array_out_of_memory flag
after updating array state. But if there are two consecutive glDrawArrays
calls and the first one is skipped because of OOM, the second one should
be skipped too.
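
The control flow can be sketched as follows. Everything here is a
simplified, hypothetical model (draw_context, update_array_state() and
draw() are not the real state tracker code); only the flag name is
borrowed from the description above.

```cpp
// Hypothetical, simplified model of the draw path.
struct draw_context {
   bool vertex_array_out_of_memory;  // stays set until arrays are updated again
   bool arrays_dirty;                // array state needs revalidation
   int draws_issued;
};

// Simulates revalidating array state; alloc_fails models an OOM.
static void update_array_state(draw_context *ctx, bool alloc_fails)
{
   ctx->vertex_array_out_of_memory = alloc_fails;
   ctx->arrays_dirty = false;
}

// Returns true if the draw was actually issued.
static bool draw(draw_context *ctx, bool alloc_fails)
{
   if (ctx->arrays_dirty)
      update_array_state(ctx, alloc_fails);

   // Check the flag on every draw, not only right after an update, so a
   // second draw with unchanged (still broken) arrays is skipped too.
   if (ctx->vertex_array_out_of_memory)
      return false;

   ctx->draws_issued++;
   return true;
}
```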
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Orbital Explorer was generating a 4000 instruction geometry shader, which
was taking 275 trips through dead code elimination and register
coalescing, each of which updated live variables to get its work done, and
invalidated those live variables afterwards.
By using bitfields instead of bools (reducing the working set size by a
factor of 8) in live variables analysis, it drops from 88% of the profile
to 57%, and reduces overall runtime from I-got-bored-and-killed-it (Paul
says 3+ minutes) to 10.5 seconds.
Compare to f179f419d1 on the FS side.
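
To illustrate where the factor of 8 comes from, here is a small sketch
of a bitset-based liveness update. It uses the textbook dataflow
equation livein = use | (liveout & ~def) and hypothetical helper names;
it is not the code from this patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

typedef uint32_t bitset_word;

// One bit per variable instead of one byte-sized bool per variable.
static size_t bitset_words(unsigned num_vars)
{
   return (num_vars + 31) / 32;
}

static void bitset_set(std::vector<bitset_word> &set, unsigned var)
{
   set[var / 32] |= 1u << (var % 32);
}

static bool bitset_test(const std::vector<bitset_word> &set, unsigned var)
{
   return ((set[var / 32] >> (var % 32)) & 1u) != 0;
}

// Processes 32 variables per iteration. Returns true if livein changed,
// i.e. the fixed-point loop needs another pass.
static bool update_livein(std::vector<bitset_word> &livein,
                          const std::vector<bitset_word> &liveout,
                          const std::vector<bitset_word> &use,
                          const std::vector<bitset_word> &def)
{
   bool progress = false;
   for (size_t i = 0; i < livein.size(); i++) {
      bitset_word new_in = use[i] | (liveout[i] & ~def[i]);
      if (new_in != livein[i]) {
         livein[i] = new_in;
         progress = true;
      }
   }
   return progress;
}
```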
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This prevents unnecessary (and wrong) register allocation in the
scheduler for preloaded values in fixed registers.
Fixes interpolation-mixed.shader_test on rv770
(and probably on all other pre-evergreen chips).
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
I noticed this in a shader in Unigine Heaven that was spilling. While it
doesn't really reduce register pressure, it shaves a few instructions
anyway (7955 -> 7882).
v2: Fix turning "0 >> x" into "x" instead of "0" (caught by Erik
Faye-Lund).
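
A toy model of the folding rule, assuming the optimization covers shifts
with a zero operand. The operand struct and simplify_shift() are
hypothetical and much simpler than the real IR; the point is that both
cases fold to the left-hand operand, never to the shift count.

```cpp
// Hypothetical, minimal operand representation.
struct operand {
   bool is_constant;
   unsigned value;   // meaningful only when is_constant is true
   int id;           // variable id, meaningful only when not constant
};

// Returns the operand that "lhs << rhs" or "lhs >> rhs" simplifies to,
// or nullptr if no simplification applies.
static const operand *simplify_shift(const operand *lhs, const operand *rhs)
{
   // 0 << x and 0 >> x are 0, which is the lhs constant (not x).
   if (lhs->is_constant && lhs->value == 0)
      return lhs;

   // x << 0 and x >> 0 are x, which is again the lhs.
   if (rhs->is_constant && rhs->value == 0)
      return lhs;

   return nullptr;
}
```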
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
While ir_builder is slightly less efficient, we're only increasing the
work when there's actual optimization being done, and it's way more
readable code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Matt and I had each screwed up these common required patterns recently, in
ways that wouldn't have been noticed for a long time if not for code
review. Just enforce it in the caller so that we don't rely on code
review catching these bugs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
There is nothing in the OpenGL specification which prevents the user from
calling glGenQueries to generate a new query object while another object is
active. Neither is there anything in the Mesa implementation which prevents
this. So remove the INVALID_OPERATION errors in this case.
Similarly, it is explicitly allowed by the OpenGL specification to delete an
active query, so remove the assertion for that case, replacing it with the
necessary state updates to end the query (clear the bindpt pointer and call
into the driver's EndQuery hook).
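
In outline, the delete path then looks something like the sketch below.
The types and function names are hypothetical simplifications (a single
binding point, a counter in place of a real driver hook), not the actual
Mesa code.

```cpp
// Hypothetical, simplified query state.
struct query_object {
   unsigned id;
   bool active;
};

struct context {
   query_object *current_query;  // the binding point for the active query
   int end_query_calls;          // counts calls into the driver hook
};

// Stand-in for the driver's EndQuery hook.
static void driver_end_query(context *ctx, query_object *q)
{
   q->active = false;
   ctx->end_query_calls++;
}

// Deleting an active query is legal: end it first, then free it.
static void delete_query(context *ctx, query_object *q)
{
   if (q->active) {
      if (ctx->current_query == q)
         ctx->current_query = nullptr;  // clear the bindpt pointer
      driver_end_query(ctx, q);
   }
   delete q;
}
```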
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>