We want to start reworking and expanding this code, but it'll be a lot
easier to do once we disentangle it from the rest of the stuff in nir.c.
Unfortunately, there are a few unavoidable dependencies in nir.c on
methods we'd rather not expose publicly, since if not used in very
specific situations they can cause Bad Things (tm) to happen. Namely, we
need to do some magical control flow munging when adding/removing jumps.
In the future, we may disallow adding/removing jumps in
nir_instr_insert_*() and nir_instr_remove(), and use separate functions
that are part of the control flow modification code, but for now we
expose them and put them in a separate, private header.
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
cleanup_cf_node() is part of the control flow modification code, which
we're going to split into its own file, but remove_defs_uses() is an
internal function used by nir_instr_remove(). Break the dependency by
making cleanup_cf_node() use nir_instr_remove() instead, which simply
calls remove_defs_uses() and then removes the instruction from the list.
nir_instr_remove() does do extra things for jumps, though, so we avoid
calling it on jumps, which matches the previous behavior (this will be
fixed later in the series).
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It was being used to initialize function impls and loops, even though
it's really a control flow modification helper. It's pretty trivial, so
just inline it to avoid the dependency.
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We should be checking almost everything now.
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It's simply the first nir_cf_node in the nir_function_impl::body list,
which is easy enough to access - we don't need to store a pointer to it
explicitly. Removing it means we don't need to maintain the pointer
when, say, splitting the start block when modifying control flow.
Thanks to Connor Abbott for suggesting this.
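
As a rough illustration of the idea (not the actual NIR code), the start
block can be derived from the body list whenever it is needed. The struct
and function names below are hypothetical stand-ins; the real
nir_function_impl uses Mesa's exec_list and has many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for nir_cf_node and nir_function_impl. */
struct cf_node {
   struct cf_node *next;
   int id;
};

struct function_impl {
   struct cf_node *body_head;   /* first node of the body list */
};

/* Look the start block up on demand instead of caching a pointer that
 * every control flow modification would have to keep up to date. */
static struct cf_node *
impl_start_block(const struct function_impl *impl)
{
   assert(impl != NULL);
   return impl->body_head;
}

/* Illustrative modification: link a new node in front of the old start
 * block. There is no cached start pointer to fix up afterwards. */
static void
impl_prepend_block(struct function_impl *impl, struct cf_node *node)
{
   node->next = impl->body_head;
   impl->body_head = node;
}
```

With this shape, a pass that changes the head of the body list
automatically changes what impl_start_block() returns.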
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Only uncompressed formats have a non-void type and actual
components per pixel. Rename _mesa_format_to_type_and_comps
to _mesa_uncompressed_format_to_type_and_comps and require
callers to check if the format is not compressed.
v2: Include compressed format cases to avoid gcc warnings (Chad).
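
A minimal sketch of the calling convention this implies, assuming a
heavily reduced format enum. The names fmt, comp_type, fmt_is_compressed
and uncompressed_fmt_to_type_and_comps are made up for illustration and
are not the Mesa API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-ins for mesa_format and the GL type
 * enums; the real tables are far larger. */
enum fmt { FMT_RGBA8, FMT_R16F, FMT_DXT1 };
enum comp_type { TYPE_VOID, TYPE_UBYTE, TYPE_HALF_FLOAT };

static bool
fmt_is_compressed(enum fmt f)
{
   return f == FMT_DXT1;
}

/* Only meaningful for uncompressed formats; callers must check first. */
static void
uncompressed_fmt_to_type_and_comps(enum fmt f, enum comp_type *type,
                                   unsigned *comps)
{
   assert(!fmt_is_compressed(f));

   /* Defaults, kept for the compressed case when asserts are compiled out. */
   *type = TYPE_VOID;
   *comps = 0;

   switch (f) {
   case FMT_RGBA8:
      *type = TYPE_UBYTE;
      *comps = 4;
      break;
   case FMT_R16F:
      *type = TYPE_HALF_FLOAT;
      *comps = 1;
      break;
   case FMT_DXT1:
      /* Listed so that gcc's -Wswitch does not warn about an unhandled
       * enumerator; not reachable for well-behaved callers. */
      break;
   }
}
```

A caller would test fmt_is_compressed() first and only then ask for the
type and component count.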
Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
On the older platforms where we don't have logical contexts preserving
state across batches, we emit the invariant state setup on every batch
using the brw_invariant_state atom. This includes the pipeline selection
which is cached with the introduction of
commit 0e0e23ef53
Author: Jordan Justen <jordan.l.justen@intel.com>
Date: Wed Apr 22 11:43:50 2015 -0700
i965/state: Emit pipeline select when changing pipelines
However, we do not reset the cache between batches on context-less
platforms, so the pipeline selection is not re-emitted. This can cause
GPU hangs if a media pipeline was loaded in the meantime (e.g. mixing
mplayer/gstreamer using libva and gnome-shell). A simple solution
is to just forcibly re-emit the pipeline select along with the invariant
state and reset the cache at that point.
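
The intent can be sketched roughly as follows. This is a toy model, not
the i965 code: struct context, begin_batch() and the counter are
assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

enum pipeline { PIPELINE_UNKNOWN, PIPELINE_3D, PIPELINE_MEDIA };

/* Hypothetical driver context holding the cached pipeline selection. */
struct context {
   bool has_hw_contexts;        /* logical contexts preserve state */
   enum pipeline last_pipeline; /* cache of the last emitted selection */
   unsigned selects_emitted;    /* stands in for commands in the batch */
};

/* Cached emit: skipped when the cache says the pipeline is already set. */
static void
emit_pipeline_select(struct context *ctx, enum pipeline p)
{
   assert(p != PIPELINE_UNKNOWN);
   if (ctx->last_pipeline == p)
      return;
   ctx->last_pipeline = p;
   ctx->selects_emitted++;
}

/* On context-less platforms another client may have switched pipelines
 * between batches, so invalidate the cache and re-emit the selection
 * together with the invariant state. */
static void
begin_batch(struct context *ctx)
{
   if (!ctx->has_hw_contexts) {
      ctx->last_pipeline = PIPELINE_UNKNOWN;
      emit_pipeline_select(ctx, PIPELINE_3D);
   }
}
```

In this model every batch on a context-less platform carries its own
pipeline select, while platforms with logical contexts keep relying on
the cache.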
Reported-and-tested-by: Tomasz C. <tomaszc@o2.pl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91254
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
This reverts commit 567394112d.
It regressed performance. It looks like smaller IBs are better, because
the GPU goes idle quicker and there is less waiting for buffers and fences.
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Same as commit 1af0641db but for nvc0. If an integer texture is
bound to RT0, don't do alpha-to-one or alpha-to-coverage.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
This was missed when I did fp64; I've sent a piglit test to cover
the case as well.
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When the edge flag element is enabled then the elements are slightly
reordered so that the edge flag is always the last one. This was
confusing the code to upload the 3DSTATE_VF_INSTANCING state because
that is uploaded with a separate loop which has an instruction for
each element. The indices used in these instructions weren't taking
into account the reordering so the state would be incorrect.
v2: Use nr_elements instead of brw->vb.nr_enabled so that it will cope
when gl_VertexID is used.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91292
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
The edge flag data on Gen6+ is passed through the fixed function hardware as
an extra attribute. According to the PRM it must be the last valid
VERTEX_ELEMENT structure. However if the vertex ID is also used then another
extra element is added to source the VID. This made it so the vertex ID is in
the wrong register in the vertex shader and the edge attribute is no longer in
the last element.
v2: Also implement for BDW+
v3 [by Ben]: Remove 10.5 tag. Too late.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84677
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Fixes a compiler warning of defined but not used function when
HAVE_MKOSTEMP is defined.
Fixes: eb3e2562a4b (configure.ac: check for mkostemp())
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
The build/file was removed with an earlier commit while the EXTRA_DIST
was forgotten.
Fixes: 66d77cd71c (scons: don't build the kms-dri winsys)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
The files are not referenced anywhere else in the whole of
mesa. They are likely remnants of the early development stage.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
vc4 conflicts with ilo when built on x86, as it's built for emulation
purposes. In that mode an i965-like symbol is exported by vc4, which
conflicts with the ilo one in the gallium-dri megadriver.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
The nv_conditional_render piglits were sporadically failing. Moving
the control flush from the write and placing it just before the read
was sufficient to make the piglits pass 1000/1000 times. The bspec
says that the flush enable bit "waits until all previous writes of
immediate data from post sync circles are complete before executing the
next command" - the operative word being previous!
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90691
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Neil Roberts <neil@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
I switched us to tracking whether the results *could* go to r4, but then
didn't make a separate register class for the class bits that included r4.
Switch the "any" class to actually be "any", and name the "any but r4"
class more appropriately.
total instructions in shared programs: 96798 -> 94680 (-2.19%)
instructions in affected programs: 62736 -> 60618 (-3.38%)
We had several reports of users hitting bugs
with the other path to upload constants,
and switching to the user constant buffer
path solves the bugs.
User constant buffers are expected to be slower
for Nvidia cards, so ideally this patch should be
reverted when the path is fixed.
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Krzysztof Sobiecki <sobkas@gmail.com>
It is very common for d3d9 apps to set the constants they need
again before every draw call, even if nothing changed.
Since we are mostly gpu bound, it is better to check
for changes and upload the constants again (and thus use
gpu bandwidth) only if they changed.
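
A minimal sketch of such a change check, assuming a tiny constant file
and a single dirty flag. const_state, set_constants() and
upload_if_dirty() are hypothetical names for this example, not the nine
state tracker's.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CONSTS 4   /* hypothetical; a real constant file is much larger */

struct const_state {
   float consts[MAX_CONSTS][4];  /* vec4 constants as last set by the app */
   bool dirty;                   /* set only when the contents changed */
};

/* Store new constants, but mark the state dirty only if the bytes differ.
 * memcmp compares bit patterns, so 0.0f and -0.0f count as different. */
static void
set_constants(struct const_state *s, unsigned start, const float *data,
              unsigned count)
{
   assert(start < MAX_CONSTS && count <= MAX_CONSTS - start);
   size_t bytes = count * sizeof(s->consts[0]);

   if (memcmp(s->consts[start], data, bytes) == 0)
      return;   /* redundant set: nothing to upload later */

   memcpy(s->consts[start], data, bytes);
   s->dirty = true;
}

/* Returns true when an upload would actually be issued at draw time. */
static bool
upload_if_dirty(struct const_state *s)
{
   if (!s->dirty)
      return false;
   s->dirty = false;
   return true;
}
```

An app that sets the same values before every draw pays for one memcmp
per call here instead of a constant upload.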
Signed-off-by: Axel Davy <axel.davy@ens.fr>
The number of texture stages is 8.
The 'tex_stage' array was too big, so the checks against
'Elements(state->ff.tex_stage)' let some invalid API calls through.
These then crashed with an out-of-bounds write, since bumpmap_vars
had the correct size.
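
To illustrate why the array size matters, here is a reduced sketch in
which both arrays share one stage count, so an Elements()-style check on
one also bounds the other. The struct layout, the ELEMENTS macro and
set_bumpmap_var() are assumptions for the example only.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_TEX_STAGES 8
#define ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical slice of the fixed function state. */
struct ff_state {
   int tex_stage[NUM_TEX_STAGES][4];
   float bumpmap_vars[NUM_TEX_STAGES][4];
};

/* Rejects out-of-range stages. The check is only as good as the size of
 * tex_stage: if that array were declared larger than bumpmap_vars, a
 * stage that passes the check could still write out of bounds below. */
static bool
set_bumpmap_var(struct ff_state *ff, unsigned stage, unsigned idx, float v)
{
   if (stage >= ELEMENTS(ff->tex_stage) ||
       idx >= ELEMENTS(ff->bumpmap_vars[0]))
      return false;

   ff->bumpmap_vars[stage][idx] = v;
   return true;
}
```

Sizing both arrays from the same constant keeps the check and the write
in agreement.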
Signed-off-by: Axel Davy <axel.davy@ens.fr>
The CSO cache unbinds views that are not needed anymore,
which we don't do.
It checks for change before committing the views.
Signed-off-by: Axel Davy <axel.davy@ens.fr>