Commit graph

70024 commits

Author SHA1 Message Date
Francisco Jerez
d91d6b3f03 nir: Translate memory barrier intrinsics from GLSL IR.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
f8f8b31847 nir: Translate image load, store and atomic intrinsics from GLSL IR.
v2: Undefine coordinate components not applicable to the target.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
6de78e6b0c nir: Fix indexing of atomic counter arrays with a constant value.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
f1269a3e01 nir: Add memory barrier intrinsic.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
d9e930997f nir: Define image load, store and atomic intrinsics.
v2: Undefine coordinate components not applicable to the target.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
ee1a8b5a8c i965/fs: Have component() set the register stride to zero.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:56 +03:00
Francisco Jerez
4171ef371a i965/fs: Fix offset() for registers with zero stride.
stride == 0 implies that the register has one channel per vector
component.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-12 15:47:56 +03:00
Francisco Jerez
0db663503e i965: Don't forget the force_sechalf flag in lower_load_payload().
Regression from commit 41868bb682.
Fixes a bunch of ARB_shader_image_load_store tests.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-05-12 15:47:56 +03:00
Francisco Jerez
cbf204069d i965: Document brw_mask_reg().
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-05-12 15:47:56 +03:00
Tapani Pälli
95774ca258 nir: fix sampler lowering pass for arrays
This fixes bugs with special cases where we have arrays of
structures containing samplers or arrays of samplers.

I've verified that patch results in calculating same index value as
returned by _mesa_get_sampler_uniform_value for IR. Patch makes
following ES3 conformance test pass:

	ES3-CTS.shaders.struct.uniform.sampler_array_fragment

v2: remove unnecessary comment (Topi)
    simplify changes and the overall code (Jason)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90114
2015-05-12 14:28:16 +03:00
Neil Roberts
426023050d i965: Use predicate enable bit for conditional rendering w/o stalling
Previously whenever a primitive is drawn the driver would call
_mesa_check_conditional_render which blocks waiting for the result of
the query to determine whether to render. On Gen7+ there is a bit in
the 3DPRIMITIVE command which can be used to disable the primitive
based on the value of a state bit. This state bit can be set based on
whether two registers have different values using the MI_PREDICATE
command. We can load these two registers with the pixel count values
stored in the query begin and end to implement conditional rendering
without stalling.

Unfortunately these two source registers were not in the whitelist of
available registers in the kernel driver until v3.19. This patch uses
the command parser version from intel_screen to detect whether to
attempt to set the predicate data registers.

The predicate enable bit is currently only used for drawing 3D
primitives. For blits, clears, bitmaps, copypixels and drawpixels it
still causes a stall. For most of these it would probably just work to
call the new brw_check_conditional_render function instead of
_mesa_check_conditional_render because they already work in terms of
rendering primitives. However it's a bit trickier for blits because it
can use the BLT ring or the blorp codepath. I think these operations
are less useful for conditional rendering than rendering primitives so
it might be best to leave it for a later patch.

v2: Use the command parser version to detect whether we can write to
    the predicate data registers instead of trying to execute a
    register load command.
v3: Simple rebase
v4: Changes suggested by Kenneth Graunke: Split the
    load_64bit_register function out to a separate patch so it can be
    a shared public function. Avoid calling
    _mesa_check_conditional_render if we've already determined that
    there's no query object. Some styling fixes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 11:20:47 +01:00
Neil Roberts
9585879d46 i956: Add a function to load a 64-bit register from a buffer
Adds brw_load_register_mem64 which is similar to brw_load_register_mem
except that it queues two GEN7_MI_LOAD_REGISTER_MEM commands in order
to load both halves of a 64-bit register. The function is implemented
by splitting the 32-bit version into an internal helper function which
takes a size.

This will later be used to set the 64-bit predicate source registers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 11:20:35 +01:00
Neil Roberts
8a59f2f26f i965: Store the command parser version number in intel_screen
In order to detect whether the predicate source registers can be used
in a later patch we will need to know the version number for the
command parser. This patch just adds a member to intel_screen and does
an ioctl to get the version.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 11:20:35 +01:00
Roland Scheidegger
971be2b7c9 docs/GL3: (trivial) mark some tf extensions as done for softpipe/llvmpipe
Those extensions were enabled for ages already.
2015-05-12 04:48:48 +02:00
Emil Velikov
95089bfaeb docs: add news item and link release notes for mesa 10.5.5
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-05-11 22:07:46 +01:00
Emil Velikov
d4125c41f9 docs: Add sha256 sums for the 10.5.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 8ee1a1c08b)
2015-05-11 22:06:13 +01:00
Emil Velikov
22aaa746bd Add release notes for the 10.5.5 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit d88fb40505)
2015-05-11 22:06:11 +01:00
Ilia Mirkin
2b5355c8ab st/mesa: make sure to create a "clean" bool when doing i2b
i2b has to work for all integers, not just 1. INEG would not necessarily
result with all bits set, which is something that other operations can
rely on by e.g. using AND (or INEG for b2i).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: mesa-stable@lists.freedesktop.org
2015-05-11 15:52:17 -04:00
Tom Stellard
9c4dc98b29 clover: Fix a bug with multi-threaded events v2
It was possible for some events never to get triggered if one thread
was creating events and another threads was waiting for them.

This patch consolidates soft_event::wait() and hard_event::wait()
into event::wait() so that hard_event objects will now wait for
all their dependencies to be submitted before flushing the command
queue.

v2:
  - Rename variables
  - Use mutable varibales so we can keep event::wait() const
  - Open code signalled() call so mutex can be atted to signalled
    without deadlocking.

CC: 10.5 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-05-11 18:52:29 +00:00
Tom Stellard
f546902d95 clover: Add a mutex to guard queue::queued_events
This fixes a potential crash where on a sequence like this:

Thread 0: Check if queue is not empty.
Thread 1: Remove item from queue, making it empty.
Thread 0: Do something assuming queue is not empty.

CC: 10.5 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-05-11 18:52:18 +00:00
Matt Turner
73f4010082 i965/fs: Add missing initializer in fs_visitor(). 2015-05-11 11:25:03 -07:00
Adam Jackson
7a58262e58 egl: Remove skeleton implementation of EGL_MESA_screen_surface
No backend wires this up to anything, and the extension spec has been
marked obsolete for 4+ years.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2015-05-11 14:19:37 -04:00
Axel Davy
13fa84e1bc egl/swrast: Enable config extension for swrast
Enables to use dri config for swrast, like vblank_mode.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
cdcfe48fb0 egl/wayland: Implement swrast support
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
cd25e52f6b egl/wayland: Simplify dri2_wl_create_surface
This function is always used with EGL_WINDOW_BIT. Pixmaps are forbidden
for Wayland, and PBuffers are unimplemented.

Reviewed-by: Daniel Stone <daniels@collabora.com>.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
f1cc478d89 egl/x11: move dri2_x11_swrast_create_image_khr to egl_dri2_fallback.h
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
4cd546df82 egl/wayland: Implement DRI_PRIME support
When the server gpu and requested gpu are different:
. They likely don't support the same tiling modes
. They likely do not have fast access to the same locations

Thus we do:
. render to a tiled buffer we do not share with the server
. Copy the content at every swap to a buffer with no tiling
that we share with the server.

This is similar to the glx dri3 DRI_PRIME implementation.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
fb0960a14b egl/wayland: Add support for render-nodes
It is possible the server advertises a render-node.
In that case no authentication is needed,
and Gem names are forbidden.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>

v2: do not check for __DRI_IMAGE_DRIVER, but instead
do not advertise __DRI_DRI2_LOADER when on a render-node.
2015-05-11 19:31:44 +02:00
Axel Davy
c4ff6d00cd glx/dri3: Add additional check for gpu offloading case
Checks blitImage is implemented.
Initially having the __DRIimageExtension extension
at version 9 at least meant blitImage was supported.
However some implementation do advertise version >= 9
without implementing it.

CC: 10.5 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
05ac39ac49 doc/egl: Remove depreciated EGL_SOFTWARE
EGL_SOFTWARE is not supported anywhere in the code,
whereas LIBGL_ALWAYS_SOFTWARE is.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:44 +02:00
Axel Davy
6aaf09b93b egl/wayland: properly destroy wayland objects
the wl_registry and the wl_queue allocated weren't destroyed.

CC: 10.5 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2015-05-11 19:31:43 +02:00
Neil Roberts
bfdae9149e i965/fs: Disable opt_sampler_eot for textureGather
The opt_sampler_eot optimisation seems to break when the last
instruction is SHADER_OPCODE_TG4. A bunch of Piglit tests end up doing
this so it causes a lot of regressions. I can't find any documentation
or known workarounds to indicate that this is expected behaviour, but
considering that this is probably a pretty unlikely situation in a
real use case we might as well disable it in order to avoid the
regressions. In total this fixes 451 tests.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-05-11 12:09:20 +01:00
Tapani Pälli
abf3fefa1a mesa: use _mesa_has_compute_shaders instead of extension check
This was really the original purpose, for enabling the path for
ES3.1 tests without the extension being set. Set also fallthrough
comment for Coverity (caught by Matt).

v2: .. and test the right way, not wrong one (Ilia Mirkin)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-05-11 08:14:51 +03:00
Marta Lofstedt
4a8cd2799c main: glGetIntegeri_v fails for GL_VERTEX_BINDING_STRIDE
The return type for GL_VERTEX_BINDING_STRIDE is missing,
this cause glGetIntegeri_v to fail.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-05-11 08:00:30 +03:00
Dave Airlie
9ab90c058f r600: use pipe->hw prim convert from radeonsi
This avoids future addition to PIPE_PRIM_ from causing regressions
on r600g.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-11 06:43:18 +10:00
Rob Clark
1cbdafc47a freedreno/ir3/nir: fix build break after f752effa
Our lower if/else pass was missed when converting NIR to use linked
lists rather than hashsets to track use/def sets.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-05-10 06:03:53 -04:00
Ilia Mirkin
da136dc07d nv50/ir: only enable mul saturate on G200+
Commit 44673512a8 enabled support for saturating fmul. However
experimentally this does not seem to work on the older chips. Restrict
the feature to G200 (NVA0) and later.

Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90350
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Cc: mesa-stable@lists.freedesktop.org
2015-05-09 13:41:51 -04:00
Ilia Mirkin
7892210400 nvc0: reset the instanced elements state when doing blit using 3d engine
Since we update num_vtxelts here, we could otherwise end up with stale
instancing information in the upper bits which wouldn't otherwise get
reset. (Also we run the risk of the previous draw having set the first
element as instanced.)

This appears as one of the causes for the test pointed out in fdo#90363
to fail on nvc0.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90363
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-05-09 13:36:23 -04:00
Ilia Mirkin
e9b1ea29bf nvc0: keep track of PGRAPH state in nvc0_screen
See identical commit for nv50. Destroying the current context and then
creating a new one or switching to another existing context would cause
the "current" state to not be properly initialized, so we save it off in
the screen.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-05-09 13:36:23 -04:00
Ilia Mirkin
f617029db3 nv50: keep track of PGRAPH state in nv50_screen
Normally this is kept in nv50_context, and on switching the active
context, the state is copied from the previous context. However when the
last context is destroyed, this is lost, and a new context might later
be created. When the currently-active context is destroyed, save its
state in the screen, and restore it when setting the current context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90363
Reported-by: Matteo Bruni <matteo.mystral@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-05-09 13:36:23 -04:00
Kenneth Graunke
d6fb155f30 nir: Fix aggressive typos in nir_from_ssa.c.
s/agressive/aggressive/g

Trivial.
2015-05-08 19:38:14 -07:00
Jason Ekstrand
fb5f411248 nir/search: Save/restore the variables_seen bitmask when matching
Shader-db results on Broadwell:

   total instructions in shared programs: 7152330 -> 7137006 (-0.21%)
   instructions in affected programs:     1330548 -> 1315224 (-1.15%)
   helped:                                5797
   HURT:                                  76
   GAINED:                                0
   LOST:                                  8

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:29:15 -07:00
Jason Ekstrand
e0cfe59c37 nir/search: Assert that variable id's are in range
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:29:15 -07:00
Jason Ekstrand
13facfbd5b nir/search: handle explicitly sized sources in match_value
Previously, this case was being handled in match_expression prior to
calling match_value.  However, there is really no good reason for this
given that match_value has all of the information it needs.  Also, they
weren't being handled properly in the commutative case and putting it in
match_value gives us that for free.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:29:14 -07:00
Jason Ekstrand
f752effa08 nir/nir: Use a linked list instead of a hash set for use/def sets
This commit switches us from the current setup of using hash sets for
use/def sets to using linked lists.  Doing so should save us quite a bit of
memory because we aren't carrying around 3 hash sets per register and 2 per
SSA value.  It should also save us CPU time because adding/removing things
from use/def sets is 4 pointer manipulations instead of a hash lookup.

Running shader-db 50 times with USE_NIR=0, NIR, and NIR + use/def lists:

   GLSL IR Only:        586.4 +/- 1.653833
   NIR with hash sets:  675.4 +/- 2.502108
   NIR + use/def lists: 641.2 +/- 1.557043

I also ran a memory usage experiment with Ken's patch to delete GLSL IR and
keep NIR.  This patch cuts an aditional 42.9 MiB of ralloc'd memory over
and above what we gained by deleting the GLSL IR on the same dota trace.

On the code complexity side of things, some things are now much easier and
others are a bit harder.  One of the operations we perform constantly in
optimization passes is to replace one source with another.  Due to the fact
that an instruction can use the same SSA value multiple times, we had to
iterate through the sources of the instruction and determine if the use we
were replacing was the only one before removing it from the set of uses.
With this patch, uses are per-source not per-instruction so we can just
remove it safely.  On the other hand, trying to iterate over all of the
instructions that use a given value is more difficult.  Fortunately, the
two places we do that are the ffma peephole where it doesn't matter and GCM
where we already gracefully handle duplicates visits to an instruction.

Another aspect here is that using linked lists in this way can be tricky to
get right.  With sets, things were quite forgiving and the worst that
happened if you didn't properly remove a use was that it would get caught
in the validator.  With linked lists, it can lead to linked list corruption
which can be harder to track.  However, we do just as much validation of
the linked lists as we did of the sets so the validator should still catch
these problems.  While working on this series, the vast majority of the
bugs I had to fix were caught by assertions.  I don't think the lists are
going to be that much worse than the sets.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
2c2cd368aa util/list: Add a list validation function
Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
addcf41066 util/list: Add list_empty and list_length functions
v2: Don't use C99 when iterating over the list

Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
b31d8983ba util/list: Add C99-based iterator macros
v2: Use LIST_ENTRY instead of container_of in iterators

Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
7a30668ad6 util: Move gallium's linked list to util
The linked list in gallium is pretty much the kernel list and we would like
to have a C-based linked list for all of mesa.  Let's not duplicate and
just steal the gallium one.

Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
258b4194c8 gallium/double_list: s/INLINE/inline and remove the p_compiler include
Acked-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-08 17:16:13 -07:00