Commit graph

63746 commits

Author SHA1 Message Date
Matt Turner
22cd917329 mesa: Add and use foreach_in_list_use_after.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Matt Turner
d49173a97b glsl: Replace uses of foreach_list_const.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Matt Turner
fd8f65498a glsl: Replace another couple uses of foreach_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Matt Turner
6e217ad1d7 glsl: Use foreach_list_typed when possible.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Matt Turner
373824d769 mesa: Use typed foreach_in_list_safe instead of foreach_list_safe.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Matt Turner
c6a16f6d0e glsl: Use typed foreach_in_list_safe instead of foreach_list_safe.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Matt Turner
e0cb82d0c4 mesa: Use typed foreach_in_list instead of foreach_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Matt Turner
4d78446d78 glsl: Use typed foreach_in_list instead of foreach_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Matt Turner
da9f0316e6 glsl: Add typed foreach_in_list_safe macro.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Matt Turner
3597681040 glsl: Add typed foreach_in_list/_reverse macros.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-07-01 08:55:51 -07:00
Axel Davy
4d6c9352f3 mesa: fix the condition in src/loader/Makefile.am
We want to have the dri common files compiled to define USE_DRICONF.
We need to check both NEED_OPENGL_COMMON and HAVE_DRICOMMON

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Tested-by: Brian Paul <brianp@vmware.com>
2014-07-01 09:42:44 -06:00
Brian Paul
ad6e1e12cc mesa: update comment for UniformBufferSize to indicate size is in bytes
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 09:42:44 -06:00
Brian Paul
f4b0ab7afd st/mesa: fix incorrect size of UBO declarations
UniformBufferSize is in bytes so we need to divide by 16 to get the
number of constant buffer slots.  Also, the ureg_DECL_constant2D()
function takes first..last parameters so we need to subtract one
for the last value.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 09:42:44 -06:00
Brian Paul
01bf8bb875 st/mesa: don't use address register for constant-indexed ir_binop_ubo_load
Before, we were always using the address register and indirect addressing
to index into a UBO constant buffer.  With this change we only do that
when necessary.

Using the piglit bin/arb_uniform_buffer_object-rendering test as an
example:

Shader code:
  uniform ub_rot {float rotation; };
  ...
  m[1][1] = cos(rotation);

Before:
  IMM[1] INT32 {0, 1, 0, 0}
  1: UARL ADDR[0].x, IMM[1].xxxx
  2: MOV TEMP[0].x, CONST[3][ADDR[0].x].xxxx
  3: COS TEMP[1].x, TEMP[0].xxxx

After:
  0: COS TEMP[0].x, CONST[3][0].xxxx

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 09:42:44 -06:00
Brian Paul
dfca35f807 st/mesa: allow 2D indexing for all shader types in translate_src()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 09:42:44 -06:00
Brian Paul
f11e3dc122 st/mesa: don't ignore const buf index in src_register()
Otherwise, if we were creating a const buffer src register for a UBO
the index into the UBO was always zero.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 09:42:44 -06:00
Ilia Mirkin
5e04526399 nvc0: expose 4 vertex streams, use stream ids in xfb
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-01 11:34:40 -04:00
Ilia Mirkin
2f2467cb23 nvc0/ir: only merge emit/restart for identical streams
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-01 11:34:40 -04:00
Ilia Mirkin
e5cdbdecd2 nvc0/ir: avoid creating restarts with non-0 stream
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-01 11:34:40 -04:00
Ilia Mirkin
40b8aec251 nvc0/ir: fix emitting vertex stream
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-01 11:34:40 -04:00
Ilia Mirkin
1d16dbf416 mesa/st: add vertex stream support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 11:34:37 -04:00
Ilia Mirkin
746e5260f6 gallium: add a cap for max vertex streams
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 11:34:35 -04:00
Ilia Mirkin
43e4b3e311 gallium: add an index argument to create_query
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 11:34:31 -04:00
Ilia Mirkin
7f1b365f65 gallium: add support for stream in so info
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 11:34:28 -04:00
Ilia Mirkin
0cbefc1bea gallium: add vertex stream argument to EMIT/ENDPRIM
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-01 11:34:24 -04:00
Matt Turner
1bfc0a1102 i965/fs: Mark predicated PLN instructions with dependency hints.
To implement the unlit_centroid_workaround, previously we emitted

   (+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 1Q };
   (-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 1Q };

where the flag register contains the channel enable bits from g0.

Since the predicates are complementary, the pair of pln instructions
write to non-overlapping components of the destination, which is the
case that the dependency control hints are designed for.

Typically setting dependency control hints on predicated instructions
isn't safe (if an instruction doesn't execute due to the predicate, it
won't update the scoreboard, leaving it in a bad state) but since we
must have at least one channel executing (i.e., +f0 is true for some
channel) by virtue of the fact that the thread is running, we can put
the +f0 pln instruction last and set the hints:

   (-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 NoDDClr 1Q };
   (+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 NoDDChk 1Q };

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-30 22:31:06 -07:00
Matt Turner
4fe53ee5d7 i965/fs: Predicate PLN instructions used in unlit centroid WA.
Maybe lets us skip some PLN instructions if whole subspans are disabled?

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-30 22:31:05 -07:00
Matt Turner
6d2536395d i965/fs: Add no_dd_{clear,check} fields to fs_inst.
And plumb them through. Also make the assert in the generator look like
the vec4 one.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-30 22:31:05 -07:00
Matt Turner
bcbb7c41b7 i965/fs: Let sat-prop ignore live ranges if producer already has sat.
This sequence (where both x and w are used afterwards) wasn't handled.

   mul.sat x, y, z
   ...
   mov.sat w, x

We assumed that if x was used after the mov.sat, that we couldn't
propagate the saturate modifier, but in fact x was already saturated.

So ignore the live range check if the producing instruction already
saturates its result. Cuts one instruction from hundreds of TF2 shaders.

total instructions in shared programs: 1995631 -> 1994951 (-0.03%)
instructions in affected programs:     155248 -> 154568 (-0.44%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-06-30 22:31:05 -07:00
Matt Turner
e58992aedd i965/fs: Pass const references to emit functions.
Cuts 10k of .text and saves a bunch of useless struct copies.
2014-06-30 22:31:05 -07:00
Matt Turner
35b741c8e7 i965/vec4: Pass const references to instruction functions.
text	   data	    bss	    dec	    hex	filename
4231165	 123200	  39648	4394013	 430c1d	i965_dri.so
4186277	 123200	  39648	4349125	 425cc5	i965_dri.so

Cuts 43k of .text and saves a bunch of useless struct copies.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-06-30 22:31:05 -07:00
Matt Turner
d35f34cea9 i965/vec4: Pass const references to vec4_instruction().
text	   data	    bss	    dec	    hex	filename
4244821	 123200	  39648	4407669	 434175	i965_dri.so
4231165	 123200	  39648	4394013	 430c1d	i965_dri.so

Cuts 13k of .text and saves a bunch of useless struct copies.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-06-30 22:31:05 -07:00
Matt Turner
e4b05af5d4 i965/fs: Pass const references to instruction functions.
text	   data	    bss	    dec	    hex	filename
4270747	 123200	  39648	4433595	 43a6bb	i965_dri.so
4244821	 123200	  39648	4407669	 434175	i965_dri.so

Cuts 25k of .text and saves a bunch of useless struct copies.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-06-30 22:31:05 -07:00
Axel Davy
5d5c20920e radeonsi: Use dma_copy when possible for si_blit.
This improves GLX DRI3 GPU offloading significantly on CPU
bound benchmarks particularly.
No performance impact for DRI2 GPU offloading.

v2: Add missing tests

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák<marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-07-01 13:10:01 +10:00
Axel Davy
9320c8fea9 glx/dri3: add GPU offloading support.
The differences with DRI2 GPU offloading are:
a) There's no logic for GPU offloading needed in the Xserver

b) for DRI2, the card would render to a back buffer, and
the content would be copied to the front buffer (the same buffers
everytime). Here we can potentially use several back buffers and copy
to buffers with no tiling to share with X. We send them with the
Present extension.

That means than the DRI2 solution is forced to have tearings with GPU
offloading. In the ideal scenario, this DRI3 solution doesn't have this
problem.

However without dma-buf fences, a race can appear (if the card is slow
and the rendering hasn't finished before the server card reads the buffer),
and then old content is displayed. If a user hits this, he should probably
revert to the DRI2 solution (LIBGL_DRI3_DISABLE). Users with cards fast
enough seem to not hit this in practice (I have an Amd hd 7730m, and I
don't hit this, except if I force a low dpm mode)

c) for non-fullscreen apps, the DRI2 GPU offloading solution requires
compositing. This DRI3 solution doesn't have this requirement. Rendering
to a pixmap also works.

d) There is no need to have a DDX loaded for the secondary card.

V4: Fixes some piglit tests

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-07-01 13:07:52 +10:00
Axel Davy
3ecd9e1a93 loader: Use drirc device_id parameter in complement to DRI_PRIME
DRI_PRIME is not very handy, because you have to launch the executable
with it set, which is not always easy to do.
By using drirc, the user specifies the target executable
and the device to use. After that the program will be launched everytime
on the target device.

For example if .drirc contains:

<driconf>
    <device driver="loader">
        <application name="Glmark2" executable="glmark2">
            <option name="device_id" value="pci-0000_01_00_0" />
        </application>
    </device>
</driconf>

Then glmark2 will use if possible the render-node of
ID_PATH_TAG pci-0000_01_00_0.

v2: Fix compilation issue
v3: Add "-lm" and rebase.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-07-01 13:07:40 +10:00
Axel Davy
7ab925a6aa loader: add gpu selection code via DRI_PRIME.
v2: Fix the leak of device_name
v3: Rebased

It enables to use the DRI_PRIME env var to specify
which gpu to use.
Two syntax are supported:
If DRI_PRIME is 1 it means: take any other gpu than the default one.
If DRI_PRIME is the ID_PATH_TAG of a device: choose this device if
possible.

The ID_PATH_TAG is a tag filled by udev.
You can check it with 'udevadm info' on the device node.
For example it can be "pci-0000_01_00_0".

Render-nodes need to be enabled to choose another gpu,
and they need to have the ID_PATH_TAG advertised.
It is possible for not very recent udev that the tag
is not advertised for render-nodes, then
ones need to add a file containing:

SUBSYSTEM=="drm", IMPORT{builtin}="path_id"

in /etc/udev/rules.d/

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-07-01 13:07:30 +10:00
Axel Davy
da3a47d682 drirc: Add string support
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
2014-07-01 13:06:51 +10:00
Dave Airlie
29800e6a3e dri: remove GL types from config queries
This in theory changes ABI for the boolean->bool I think,
but nothing in the tree uses configQueryb AFAICS.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-07-01 13:06:29 +10:00
Dave Airlie
a513daec29 dri/xmlconfig: remove GL types.
This just drops all the GL types from the xmlconfig and use
std C types from stdint and stdbool.

v2: drop further double and header include.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-07-01 13:03:06 +10:00
Dave Airlie
b94dc944df dri3: cache pointer to back instead of looking up.
This is just prep work for the dri3 prime patches.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-07-01 13:00:14 +10:00
Alexandre Demers
11a879f260 configure.ac: (trivial) Fixing a typo
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-06-30 22:50:00 +01:00
Emil Velikov
ce1a137228 targets/egl-static: use inline_drm_helper and Automake.inc helpers
Update all three build systems, and add freedreno to the android
build. Pending future work on the ST we can convert egl-static
to provide either static or dynamic access to the pipe-drivers.

There is no functional change with this patch.

v2: Don't add freedreno to android build, drop the wrapper winsys.

Cc: Chia-I Wu <olv@lunarg.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-06-30 22:27:12 +01:00
Emil Velikov
7689aa28cd targets/gbm: convert to static/shared pipe-driver
Move the gbm "target" code to the state-tracker, similar
to other - dri, omx, vdpau... ST.

v2: Drop inclusion of the wrapper winsys and softpipe/llvmpipe.

Cc: Chia-I Wu <olv@lunarg.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-06-30 22:27:11 +01:00
Emil Velikov
37e640a073 targets/xa: provide alternative(static) xa target
Now we can build the xa target (libxatracker) with either static
pipe-drivers or shared ones. Currently we default to static.

 - Remove the unused CFLAGS/CPPFLAGS.
 - Use GALLIUM_TARGET_CFLAGS where applicable.

v2: Update the printout messages at configure.
v3: Drop inclusion of the wrapper winsys and softpipe/llvmpipe.

Cc: Jakob Bornecrantz <jakob@vmware.com>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-06-30 22:27:11 +01:00
Kenneth Graunke
c60a4ba7e3 i965/disasm: Fix INTEL_DEBUG=fs on Broadwell for ARB_fp applications.
Apparently INTEL_DEBUG=fs has crashed on Broadwell for anything using
ARB_fragment_program since commit 9cee3ff5.  We need to NULL-check the
right field.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-06-30 14:06:51 -07:00
Kenneth Graunke
5dfbfd17e0 i965/disasm: Delete gen8_disasm.c.
The functionality has been merged into brw_disasm.c; use that instead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-30 14:05:28 -07:00
Kenneth Graunke
e59a9ecc98 i965/disasm: Stop using gen8_disassemble in favor of brw_disassemble.
At this point, brw_disassemble can do everything gen8_disassemble can
do - and, thanks to the new brw_inst API, it supports all generations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-30 14:05:28 -07:00
Kenneth Graunke
7b7f95b952 i965/disasm: Improve render target write message disassembly.
Previously, we decoded render target write messages as:

   render ( RT write, 0, 16, 12, 0) mlen 8 rlen 0

which made you remember (or look up) what the numbers meant:

1. The binding table index
2. The raw message control, undecoded:
   - Last Render Target Select
   - Slot Group Select
   - Message Type (SIMD8, normal SIMD16, SIMD16 replicate data, ...)
3. The dataport message type, again (already decoded as "RT write")
4. The write commit bit (0 or 1)

Needless to say, having to decipher that yourself is annoying.  Now, we
do:

   render RT write SIMD16 LastRT Surface = 0 mlen 8 rlen 0

with optional "Hi" and "WriteCommit" for slot group/write commit.

Thanks to the new brw_inst API, we can also stop duplicating code on a
per-generation basis.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-30 14:05:28 -07:00
Kenneth Graunke
0e5b52e35d i965/disasm: Rename msg_target to SFID.
We haven't used the name "message target" in a while - there are a lot
of things called "target", and it gets confusing.  SFID ("Shared
Function ID") is the term commonly used in the modern documentation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-30 14:05:28 -07:00