Commit graph

74732 commits

Author SHA1 Message Date
Rob Clark
2fbe4e7d2f freedreno/a4xx: rework vinterp/vpsrepl
Same as previous commit, for a4xx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-26 12:35:10 -05:00
Rob Clark
5adf4a5cda freedreno/a3xx: rework vinterp/vpsrepl
Make the interpolation / point-sprite replacement mode setup deal with
varying packing.

In a later commit, we switch to packing just the varying components that
are actually used by the frag shader, so we won't be able to assume
everything is vec4's aligned to vec4.  Which would highly confuse the
previous vinterp/vpsrepl logic.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-26 12:35:10 -05:00
Serge Martin
b7c958b7b7 clover: fix tgsi compiler crash with invalid src
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-11-26 15:30:25 +02:00
Francisco Jerez
55ffa64daf i965/gen9+: Switch thread scratch space to non-coherent stateless access.
The thread scratch space is thread-local so using the full IA-coherent
stateless surface index (255 since Gen8) is unnecessary and
potentially expensive.  On Gen8 and early steppings of Gen9 this is
not a functional change because the kernel already sets bit 4 of
HDC_CHICKEN0 which overrides all HDC memory access to be non-coherent
in order to work around a hardware bug.

This happens to fix a full system hang when running any spilling code
on a pre-production SKL GT4e machine I have on my desk (forcing all
HDC access to non-coherent from the kernel up to stepping F0 might be
a good idea though regardless of this patch), and improves performance
of the OglPSBump2 SynMark benchmark run with INTEL_DEBUG=spill_fs by
33% (11 runs, 5% significance) on a production SKL GT2 (on which HDC
IA-coherency is apparently functional so it wouldn't make sense to
disable globally).

Reviewed-by: Kristian Høgsberg  <krh@bitplanet.net>
2015-11-26 14:07:58 +02:00
Francisco Jerez
bc8182808a i965/fs: Don't use Gen7-style scratch block reads on Gen9+.
Unfortunately Gen7 scratch block reads and writes seem to be hardwired
to BTI 255 even on Gen9+ where that index causes the dataport to do an
IA-coherent read or write.  This change is required for the next patch
to be correct, since otherwise we would be writing to the scratch
space using non-coherent access and then reading it back using
IA-coherent reads, which wouldn't be guaranteed to return the value
previously written to the same location without introducing an
additional HDC flush in between.

Reviewed-by: Kristian Høgsberg  <krh@bitplanet.net>
2015-11-26 14:07:58 +02:00
Francisco Jerez
3e6d0d2ca4 i965: Add symbolic defines for some magic dataport surface indices.
Reviewed-by: Kristian Høgsberg  <krh@bitplanet.net>
2015-11-26 14:07:58 +02:00
Nicolai Hähnle
6b5268d202 radeon: use PIPE_DRIVER_QUERY_FLAG_DONT_LIST for perfcounters
Since the query names are not very enlightening, and there are thousands
of them, GALLIUM_HUD=help should only show the first and last query name
for each hardware block.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-26 10:57:44 +01:00
Nicolai Hähnle
f36d9857cd gallium: add PIPE_DRIVER_QUERY_FLAG_DONT_LIST
This allows the driver to give a hint to the HUD so that GALLIUM_HUD=help is
less spammy.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-26 10:57:43 +01:00
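As a rough illustration of the listing hint described in 6b5268d202 and f36d9857cd above, the Python sketch below models a help listing that skips flagged queries. Only the flag name PIPE_DRIVER_QUERY_FLAG_DONT_LIST comes from the commits; the bit value, the query names, and both helper functions are hypothetical, and this is not the Gallium HUD code.

```python
# Placeholder bit value (assumption); the real value lives in Gallium's headers.
PIPE_DRIVER_QUERY_FLAG_DONT_LIST = 1 << 1


def flag_block_queries(names):
    """Return (name, flags) pairs, hiding all but the first and last name."""
    last = len(names) - 1
    return [
        (name, 0 if i in (0, last) else PIPE_DRIVER_QUERY_FLAG_DONT_LIST)
        for i, name in enumerate(names)
    ]


def help_listing(queries):
    """Names a help screen would print: everything not flagged DONT_LIST."""
    return [
        name
        for name, flags in queries
        if not flags & PIPE_DRIVER_QUERY_FLAG_DONT_LIST
    ]


# Hypothetical hardware block with five counters.
block = [f"BLOCK_counter_{i}" for i in range(5)]
print(help_listing(flag_block_queries(block)))
```

The flagged queries stay usable in this model; only the help listing leaves them out.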
Nicolai Hähnle
80a16dece6 radeon: delay the generation of driver query names until first use
This shaves a bit more time off the startup of programs that don't
actually use performance counters.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-26 10:57:43 +01:00
Julien Isorce
ca976e6900 st/va: add missing profiles in PipeToProfile's switch.
Otherwise an assert is raised from the for loop in vlVaQueryConfigProfiles.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-11-26 08:21:45 +00:00
Marta Lofstedt
63b49e1711 mesa: remove ARB_geometry_shader4
No drivers currently implement ARB_geometry_shader4, nor are there
any plans to implement it.  We only support the version of geometry
shaders that was incorporated into OpenGL 3.2 / GLSL 1.50.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-26 08:40:46 +01:00
Tapani Pälli
c2e146f487 mesa: error out in indirect draw when vertex bindings mismatch
This patch adds an additional mask for tracking which vertex arrays have
an associated vertex buffer binding set. This mask can be directly
compared to the mask of enabled vertex arrays, and the two should match
when drawing.

Fixes the following CTS tests:

   ES31-CTS.draw_indirect.negative-noVBO-arrays
   ES31-CTS.draw_indirect.negative-noVBO-elements

v2: update mask in vertex_array_attrib_binding
v3: rename mask and make it track _BoundArrays which matches what
    was actually originally wanted (Fredrik Höglund)
v4: code cleanup, check for GLES 3.1 (Fredrik Höglund)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-11-26 08:01:31 +02:00
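The mask comparison described in c2e146f487 above can be sketched in Python as follows. This is a minimal model under stated assumptions: the function names and the bit layout (bit i stands for vertex array i) are hypothetical, and the GLES 3.1 check mentioned in the v4 note is left out.

```python
def unbound_enabled_arrays(enabled_mask, bound_mask):
    """Indices of arrays that are enabled but have no buffer binding."""
    missing = enabled_mask & ~bound_mask
    return [i for i in range(missing.bit_length()) if (missing >> i) & 1]


def indirect_draw_is_valid(enabled_mask, bound_mask):
    """The commit says the masks should match, so compare them directly."""
    return enabled_mask == bound_mask


# Arrays 0 and 2 are enabled, but only array 0 has a buffer bound.
print(indirect_draw_is_valid(0b101, 0b001))
print(unbound_enabled_arrays(0b101, 0b001))
```

Keeping both sets as bitmasks makes the draw-time check a single integer comparison instead of a loop over every array.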
Michel Dänzer
22d2dda03b targets/xvmc: use the non-inline sw helpers
This was missed in commit 59cfb21d ("targets: use the non-inline sw
helpers").

Fixes build failure:

  CXXLD    libXvMCgallium.la
../../../../src/gallium/auxiliary/pipe-loader/.libs/libpipe_loader_static.a(libpipe_loader_static_la-pipe_loader_sw.o):(.data.rel.ro+0x0): undefined reference to `sw_screen_create'
collect2: error: ld returned 1 exit status
Makefile:756: recipe for target 'libXvMCgallium.la' failed
make[3]: *** [libXvMCgallium.la] Error 1

Trivial.
2015-11-26 12:14:28 +09:00
Emil Velikov
72c33f0dd5 targets/nine: remove freedreno target
Analogous to previous commit. As we no longer have anyone who uses NIR
we can drop the link.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
2015-11-25 20:29:44 +00:00
Emil Velikov
aa335bb01b targets/nine: remove vc4 target
There are no users for it.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-11-25 20:28:38 +00:00
Emil Velikov
b78259c4b5 gallium: remove unused function declarations
Unused as of commit 23fb11455b "{st,targets}/dri: use static/dynamic
pipe-loader"

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-25 20:26:52 +00:00
Emil Velikov
59cfb21d46 targets: use the non-inline sw helpers
Previously (with the inline ones) things were embedded into the
pipe-loader, which means that we cannot control/select what we want in
each target.

That also meant that at runtime we ended up with the empty
sw_screen_create() as the GALLIUM_SOFTPIPE/LLVMPIPE were not set.

v2: Cover all the targets, not just dri.

Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
2015-11-25 20:25:29 +00:00
Emil Velikov
fbc6447c3d target-helpers: add non inline sw helpers
Feeling rather dirty copying the inline ones, yet we need the inline
ones for swrast only targets like libgl-xlib, osmesa.

Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
2015-11-25 20:25:14 +00:00
Emil Velikov
f623517188 pipe-loader: fix off-by-one error
With an earlier commit we dropped the manual iteration over the fixed
size array and preemptively set the variable storing the size that is
to be returned. Yet we forgot to adjust the comparison: before we
were comparing the index, now we're comparing the size.

Fixes: ff9cd8a67c "pipe-loader: directly use
pipe_loader_sw_probe_null() at probe time"
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93091
Reported-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2015-11-25 20:22:35 +00:00
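The index-versus-size mix-up described in f623517188 above is easy to reproduce in a toy model. The Python sketch below is not the pipe-loader code, which this page does not show; the function name, the slot list, and the "null-device" string are hypothetical. It only contrasts an index-style bound check with a size-style one when the variable holds a count.

```python
def probe(capacity, compare_as_size):
    """Fill at most `capacity` slots and return (reported_count, slots)."""
    slots = []
    count = 1  # a size: we intend to report exactly one device
    if compare_as_size:
        fits = count <= capacity  # correct when `count` is a size
    else:
        fits = count < capacity   # only correct when comparing an index
    if fits:
        slots.append("null-device")
    return count, slots


# With room for exactly one device, the index-style check reports one
# device but never fills the slot.
print(probe(1, compare_as_size=False))
print(probe(1, compare_as_size=True))
```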
Emil Velikov
0572e5fea5 nir: include what we want/need
Swap core.h with macros.h, as the latter provides the required MAX2
macro.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-25 20:19:47 +00:00
Kenneth Graunke
3810c15614 i965: Fix scalar vertex shader struct outputs.
While we correctly set output[] for composite varyings, we set completely
bogus values for output_components[], making emit_urb_writes() output
zeros instead of the actual values.

Unfortunately, our simple approach goes out the window, and we need to
recurse into structs to get the proper value of vector_elements for each
field.

Together with the previous patch, this fixes rendering in an upcoming
game from Feral Interactive.

v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt).

Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-25 11:47:47 -08:00
Kenneth Graunke
3e9003e9cf i965: Fix fragment shader struct inputs.
Apparently we have literally no support for FS varying struct inputs.
This is somewhat surprising, given that we've had tests for that very
feature that have been passing for a long time.

Normally, varying packing splits up structures for us, so we don't see
them in the backend.  However, with SSO, varying packing isn't around
to save us, and we get actual structs that we have to handle.

This patch changes fs_visitor::emit_general_interpolation() to work
recursively, properly handling nested structs/arrays/and so on.
(It's easier to read with diff -b, as indentation changes.)

When using the vec4 VS backend, this fixes rendering in an upcoming
game from Feral Interactive.  (The scalar VS backend requires additional
bug fixes in the next patch.)

v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt).

Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-25 11:47:47 -08:00
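To make the recursive handling of nested structs and arrays in 3e9003e9cf above more concrete, here is a small Python model. It is a sketch under stated assumptions, not the i965 backend: the Vec, Array, and Struct classes and the flatten_varying function are hypothetical stand-ins for GLSL types and for whatever the driver does per leaf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec:
    components: int


@dataclass(frozen=True)
class Array:
    element: object
    length: int


@dataclass(frozen=True)
class Struct:
    fields: tuple  # tuple of (field_name, field_type) pairs


def flatten_varying(name, ty, out):
    """Append one (path, component_count) entry per vector leaf."""
    if isinstance(ty, Vec):
        out.append((name, ty.components))
    elif isinstance(ty, Array):
        for i in range(ty.length):
            flatten_varying(f"{name}[{i}]", ty.element, out)
    elif isinstance(ty, Struct):
        for field_name, field_type in ty.fields:
            flatten_varying(f"{name}.{field_name}", field_type, out)
    else:
        raise TypeError(f"unsupported type: {ty!r}")
    return out


light = Struct((("color", Vec(3)), ("uv", Array(Vec(2), 2))))
print(flatten_varying("light", light, []))
```

Each leaf carries its own component count, which is the per-field information the following fix (3810c15614) says it needs for struct outputs.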
Tom Stellard
89851a2965 radeonsi/compute: Use the compiler's COMPUTE_PGM_RSRC* register values
The compiler has more information and is able to optimize the bits
it sets in these registers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

CC: <mesa-stable@lists.freedesktop.org>
2015-11-25 11:03:05 -05:00
Tom Stellard
95e0510916 radeonsi: Rename si_shader::ls_rsrc{1,2} to si_shader::rsrc{1,2}
In the future, these will be used by other shader types.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 11:03:05 -05:00
Samuel Iglesias Gonsálvez
98ceb60177 docs: minimum required python mako version is 0.3.4
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-25 16:50:53 +01:00
Nicolai Hähnle
07bddff460 docs: update relnotes with AMD_performance_monitor for radeonsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 15:52:09 +01:00
Nicolai Hähnle
ad22006892 radeonsi: implement AMD_performance_monitor for CIK+
Expose most of the performance counter groups that are exposed by Catalyst.
Ideally, the driver will work with GPUPerfStudio at some point, but we are not
quite there yet. In any case, this is the reason for grouping multiple
instances of hardware blocks in the way it is implemented.

The counters can also be shown using the Gallium HUD. If one is interested in
seeing how work is distributed across multiple shader engines, one can set the
environment variable RADEON_PC_SEPARATE_SE=1 to obtain finer-grained performance
counter groups.

Part of the implementation is in radeon because an implementation for
older hardware would largely follow along the same lines, but exposing
a different set of blocks which are programmed slightly differently.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 15:52:09 +01:00
Nicolai Hähnle
b9fc01aee7 radeon: scale query buffer size to result size
Performance monitor queries can become very big, especially considering that
instances of a block in different shader engines are queried separately.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 15:28:09 +01:00
Nicolai Hähnle
592928065c radeonsi/sid: add performance counter registers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 15:28:06 +01:00
Nicolai Hähnle
9823048e0b radeonsi/sid: add hardware constants for COPY_DATA packet
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 15:28:03 +01:00
Nicolai Hähnle
1aa3b48c12 radeon: extend CIK_UCONFIG_REG_END for performance counters
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 15:28:00 +01:00
Nicolai Hähnle
b589e18a98 radeon: add perfcounter-related EVENT_TYPEs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 15:27:56 +01:00
Nicolai Hähnle
30462b1826 radeon: additional constants for WAIT_REG_MEM and EVENT_WRITE_EOP
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-25 15:27:34 +01:00
Nicolai Hähnle
bfddd005ea st/mesa: remove outdated comment
The enable of AMD_performance_monitor is no longer related to whether
queries are run by the GPU since the commit mentioned below.

Suggested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

commit ddf27a3dd0
Author: Nicolai Hähnle <nhaehnle@gmail.com>
Date:   Tue Nov 10 13:35:01 2015 +0100

    gallium: remove pipe_driver_query_group_info field type
2015-11-25 15:27:34 +01:00
Nicolai Hähnle
babf655ab2 st/mesa: delay initialization of performance counters
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-11-25 15:27:33 +01:00
Nicolai Hähnle
27a06e0bbe mesa/main: allow delayed initialization of performance monitors
Most applications never use performance counters, so allow drivers to
skip potentially expensive initialization steps.

A driver that wants to use this must enable the appropriate extension(s)
at context initialization and set the InitPerfMonitorGroups driver function
which will be called the first time information about the performance monitor
groups is actually used.

The init_groups helper is called for API functions that can be called before
a monitor object exists. Functions that require an existing monitor object
can rely on init_groups having been called before.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-11-25 15:27:33 +01:00
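The delayed-initialization pattern described in 27a06e0bbe above can be modelled roughly as below. This Python sketch is an assumption-laden illustration, not Mesa code: the class, its methods, and the dictionary used for a monitor are hypothetical, and the constructor argument merely plays the role of the InitPerfMonitorGroups driver function named in the commit message.

```python
class PerfMonitorContext:
    """Toy context that sets up performance monitor groups on first use."""

    def __init__(self, init_perf_monitor_groups):
        self._init_hook = init_perf_monitor_groups  # driver-provided callback
        self._groups = None                         # not initialized yet

    def _init_groups(self):
        # Run the potentially expensive driver hook at most once.
        if self._groups is None:
            self._groups = self._init_hook()

    def get_groups(self):
        # Callable before any monitor exists, so it must initialize.
        self._init_groups()
        return list(self._groups)

    def create_monitor(self):
        # Also callable before any monitor exists.
        self._init_groups()
        return {"active": False}

    def begin_monitor(self, monitor):
        # A monitor only exists after create_monitor(), so groups are ready.
        monitor["active"] = True


ctx = PerfMonitorContext(lambda: ["group0", "group1"])
print(ctx.get_groups())
```

Applications that never touch performance monitors never trigger the hook in this model, which is the saving the commit message describes.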
Tapani Pälli
315c4c315e glsl: handle case where index is array deref in optimize_split_arrays
Previously the pass did not traverse array dereferences that were
used as indices into arrays. This fixes issues in the Synmark2
Gl42CSCloth application.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2015-11-25 11:25:57 +02:00
Julien Isorce
63c344d179 nouveau: move interlaced assert down in nouveau_vp3_video_buffer_create
templat->interlaced is 0 if not NV12 which is the case currently
when using VPP.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-25 08:17:39 +00:00
Iago Toral Quiroga
2bba2152e4 i965: remove trailing spaces in various files
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-25 08:12:08 +01:00
Iago Toral Quiroga
1af0d9d939 glsl: remove trailing spaces in various files
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-25 08:09:17 +01:00
Matt Turner
f1b7fefd4e i965: Pass brw_context pointer, not gl_context pointer.
Fixes a warning introduced by commit dcadd855.
2015-11-24 21:27:57 -08:00
Timothy Arceri
7436d7c33b glsl: only call dead code pass when new inputs/outputs demoted
This will help avoid eliminating inputs/outputs needed by SSOs.

Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2015-11-25 09:50:13 +11:00
Timothy Arceri
404ac4bf9e glsl: move and reuse code to find first and last shaders
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2015-11-25 09:49:48 +11:00
Matt Turner
0ce370a84b mesa: Use unreachable() instead of a default case.
(And add an unreachable() in one place that didn't have a default case)
2015-11-24 13:27:20 -08:00
Ian Romanick
47b3a0d235 meta: Don't save or restore the active client texture
This setting is only used by glTexCoordPointer and related glEnable
calls.  Since the preceding commits removed all of those, it is not
necessary to save, reset to default, or restore this state.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-24 11:31:30 -08:00
Ian Romanick
c63f9c735d meta: Don't save or restore the VBO binding
Nothing left in meta does anything with the VBO binding, so we don't
need to save or restore it.  The VAO binding is still modified.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-24 11:31:30 -08:00
Ian Romanick
58aa56d40b meta/TexSubImage: Don't pollute the buffer object namespace
tl;dr: For many types of GL object, we can *NEVER* use the Gen function.

In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions.  The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.

Here's the problem scenario:

 - Application calls a meta function that generates a name.  The first
   Gen will probably return 1.

 - Application decides to use the same name for an object of the same
   type without calling Gen.  Many demo programs use names 1, 2, 3,
   etc. without calling Gen.

 - Application calls the meta function again, and the meta function
   replaces the data.  The application's data is lost, and the app
   fails.  Have fun debugging that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-24 11:31:30 -08:00
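The problem scenario in 58aa56d40b above can be replayed with a toy name table. The Python sketch below is a simplified model, not Mesa's implementation: BufferNames, run_scenario, and the strings used as buffer contents are all hypothetical. The second mode, where the internal helper keeps its storage outside the application-visible table, is one assumed way to avoid the collision, in the spirit of tracking the object directly rather than by API handle.

```python
class BufferNames:
    """Toy name table for one GL object type."""

    def __init__(self):
        self.objects = {}

    def gen(self):
        """Return the lowest unused name and reserve it."""
        name = 1
        while name in self.objects:
            name += 1
        self.objects[name] = None
        return name

    def upload(self, name, data):
        """Using any name creates or overwrites the object, Gen or not."""
        self.objects[name] = data


def run_scenario(meta_uses_gen):
    """Return what the application finds under name 1 at the end."""
    table = BufferNames()
    private = {}  # internal storage that never enters the name table

    def meta_upload(data):
        if meta_uses_gen:
            if "name" not in private:
                private["name"] = table.gen()  # first Gen returns 1 here
            table.upload(private["name"], data)
        else:
            private["data"] = data

    meta_upload("meta vertices")
    table.upload(1, "application data")  # app uses name 1 without Gen
    meta_upload("meta vertices again")
    return table.objects[1]


print(run_scenario(meta_uses_gen=True))
print(run_scenario(meta_uses_gen=False))
```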
Ian Romanick
76cfe2bc44 meta: Don't pollute the buffer object namespace in _mesa_meta_DrawTex
tl;dr: For many types of GL object, we can *NEVER* use the Gen function.

In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions.  The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.

Here's the problem scenario:

 - Application calls a meta function that generates a name.  The first
   Gen will probably return 1.

 - Application decides to use the same name for an object of the same
   type without calling Gen.  Many demo programs use names 1, 2, 3,
   etc. without calling Gen.

 - Application calls the meta function again, and the meta function
   replaces the data.  The application's data is lost, and the app
   fails.  Have fun debugging that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-24 11:31:30 -08:00
Ian Romanick
a222d4cbc3 meta: Use internal functions for buffer object and VAO access in _mesa_meta_DrawTex
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-24 11:31:30 -08:00
Ian Romanick
b8a7369fb7 meta: Track VBO using gl_buffer_object instead of GL API object handle in _mesa_meta_DrawTex
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-11-24 11:31:30 -08:00