Commit graph

98481 commits

Author SHA1 Message Date
Jason Ekstrand
2f9eb045f3 anv/cmd_buffer: Add support for pushing UBO ranges
In order to do this we have to modify push constant set up to handle
ranges.  We also have to tweak the way we handle dirty bits a bit so
that we re-push whenever a descriptor set changes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
0c879b62b0 anv/cmd_buffer: Add some stage asserts
There are several places where we look up opcodes in an array of stages.
Assert that the we don't end up going out-of-bounds.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
1968cd07a2 anv/cmd_buffer: Add some helpers for working with descriptor sets
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
1bce04deb8 anv/pipeline: Translate vulkan_resource_index to a constant when possible
We want to call brw_nir_analyze_ubo_ranges immedately after
anv_nir_apply_pipeline_layout and it badly wants constants.  We could
run an optimization step and let constant folding do it but that's way
more expensive than needed.  It's really easy to just handle constants
in apply_pipeline_layout.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
3b34ed79f1 i965/fs: Rewrite assign_constant_locations
This rewires the logic for assigning uniform locations to work in terms
of "complex alignments".  The basic idea is that, as we walk the list of
instructions, we keep track of the alignment and continuity requirements
of each slot and assert that the alignments all match up.  We then use
those alignments in the compaction stage to ensure that everything gets
placed at a properly aligned register.  The old mechanism handled
alignments by special-casing each of the bit sizes and placing 64-bit
values first followed by 32-bit values.

The old scheme had the advantage of never leaving a hole since all the
64-bit values could be tightly packed and so could the 32-bit values.
However, the new scheme has no type size special cases so it handles not
only 32 and 64-bit types but should gracefully extend to 16 and 8-bit
types as the need arises.

Tested-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
597c194487 anv: Disable VK_KHR_16bit_storage
The testing for this extension is currently very poor.  The CTS tests
only test accessing UBOs and SSBOs at dynamic offsets so none of our
constant-offset paths get triggered at all.  Also, there's an assertion
in our handling of nir_intrinsic_load_uniform that offset % 4 == 0 which
is never triggered indicating that nothing every gets loaded from an
offset which is not a dword.  Both push constants and the constant
offset pull paths are complex enough, we really don't want to ship
without tests.  We'll turn the extension back on once we have decent
tests.
2017-12-08 15:42:55 -08:00
Leo Liu
6d74cb2570 radeon/vce: move destroy command before feedback command
VCE processing IBs starts from session and task info at first level,
other commands processed subsequently. The task info for destroy is
embedded to destroy command, resulting that feedback command is not
properly procoessed. This is causing kernel spin VM fault messages on
Polaris and Vega10 card when running ends at encode application.

The fix is also verified on VCE physical mode card.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Christian König <christian.koenig@amd.com>
2017-12-08 12:56:48 -05:00
Ben Crocker
060eb314eb docs/llvmpipe: document ppc64le as alternative architecture to x86.
Power8, Power8NV, and Power9 are supported on an equal footing
with X86.

Cc: "17.2" "17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

[Eric: changed formatting, reworded a bit (with Ben's ack)]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-08 14:49:00 +00:00
Emil Velikov
bce489a4ed docs/release-calendar: drop 17.3.0 from the table
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-08 13:59:27 +00:00
Emil Velikov
95c9d751ce docs: add news item and link release notes for 17.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-08 13:58:03 +00:00
Emil Velikov
706986bcc9 docs: add sha256 checksums for 17.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 49a612d158)
2017-12-08 13:54:34 +00:00
Emil Velikov
4124ac51f4 docs: Update 17.3.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8d55da9f57)
2017-12-08 13:54:33 +00:00
Samuel Pitoiset
572b2bad1d radv: do not print ASM to stderr when dumping shaders
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:24:24 +01:00
Samuel Pitoiset
33b329f769 radv/winsys: implement query_value()
Might be useful to know the VRAM/GTT usage, the number of VRAM
CPU page faults, etc. Nothing is currently using that new
interface, but it's a first step.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:35 +01:00
Samuel Pitoiset
c202119286 radv: remove useless check radv_set_dcc_need_cmask_elim_pred()
emit_fast_color_clear() already checks that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:03 +01:00
Samuel Pitoiset
d90b7a4c50 radv: remove useless checks in radv_set_{color,depth}_clear_regs()
Already checked by the respective callers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:00 +01:00
Samuel Pitoiset
c7c7b00889 radv: only re-mit the index type when it changes
dota2 binds a ton of index buffers but the type is always 16-bit.
Note that we have to invalidate the type when switching from
indexed draws to normal draws.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:36 +01:00
Samuel Pitoiset
a302009b7b radv: only reset command buffers that are not in the initial state
dota2 always calls vkResetCommandBuffer() before
vkBeginCommandBuffer() which is quite useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:23 +01:00
Samuel Pitoiset
a380bc7ecf radv: track different status of a command buffer
RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think
it makes sense to declare it. Could be used later with better
command buffer error handling.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:21 +01:00
Samuel Pitoiset
fc6c77e162 radv: fix TC-compat HTILE with VK_FORMAT_D32_SFLOAT_S8_UINT on Vega
Copied from RadeonSI.

This fixes all CTS
dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.clear.*

And some other ones which use the same format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:15:44 +01:00
Jordan Justen
4d81c8e43e docs: Update GL_ARB_get_program_binary docs to support 1 format
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 17:01:02 +11:00
Jordan Justen
b4c37ce214 i965: Add ARB_get_program_binary support using nir_serialization
This resolves an apparent game bug described in 85564. The game
doesn't properly handle ARB_get_program_binary with 0 supported
formats.

V2 (Timothy Arceri):
 - less driver code as more has been moved into the common helpers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85564
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 17:00:57 +11:00
Jordan Justen
c1ff99fd70 main: Clear shader program data whenever ProgramBinary is called
The GL_ARB_get_program_binary extension spec says:

 "If ProgramBinary fails to load a binary, no error is generated, but
  any information about a previous link or load of that program object
  is lost."

v2:
 * Re-initialize shProg->data after clear. (Jordan)
   (Required after 6a72eba755)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
50c09a648f main: add binary support to ProgramBinary
V2: call generic mesa_program_binary() helper rather than driver
    function directly to allow greater code sharing.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
7ee54ad057 main: add binary support to GetProgramBinary
V2: call generic _mesa_get_program_binary() helper rather than driver
    function directly to allow greater code sharing.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
e30ed18215 main: Support getting GL_PROGRAM_BINARY_LENGTH
V2: call generic _mesa_get_program_binary_length() helper
    rather than driver function directly to allow greater
    code sharing.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>i (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
c20fd744fe mesa: Add Mesa ARB_get_program_binary helper functions
V2 (Timothy Arceri):
 - add extra code comment
 - stop passing around void *binary and just pass
   program_binary_header *hdr instead.
 - move to src/mesa/main rather than src/util

V3 (Timothy Arceri):
 - Move more code out of the backend and into the common
   helpers.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Timothy Arceri
90d4abdd87 mesa: add driver callbacks for serialising ProgramBinary blobs
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
64ad804e59 main: Support 1 Mesa format with get for GL_PROGRAM_BINARY_FORMATS
Mesa supports either 0 or 1 formats. If 1 format is supported, it is
GL_PROGRAM_BINARY_FORMAT_MESA as defined in the
GL_MESA_program_binary_formats extension spec.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
fb077d603b main: Allow non-zero NUM_PROGRAM_BINARY_FORMATS
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
2e28494af2 i965: Fix memory leak when serializing nir
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
25b3ce6e3b i965: Add brw_program_serialize_nir
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:59:22 +11:00
Jordan Justen
b3f1b765e9 i965: Free serialized nir after deserializing
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
cdc7ac23b9 i965: Add brw_program_deserialize_nir
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
7cf1037d5a main, glsl: Add UniformDataDefaults which stores uniform defaults
The ARB_get_program_binary extension requires that uniform values in a
program be restored to their initial value just after linking.

This patch saves off the initial values just after linking. When the
program is restored by glProgramBinary, we can use this to copy the
initial value of uniforms into UniformDataSlots.

V2 (Timothy Arceri):
 - Store UniformDataDefaults only when serializing GLSL as this
   is what we want for both disk cache and ARB_get_program_binary.
   This saves us having to come back later and reset the Uniforms
   on program binary restores.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
ebd9e789c4 glsl: Split out shader program serialization
This will allow us to use the program serialization to implement
ARB_get_program_binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
219628c118 include: Add GL_MESA_program_binary_formats to GL/GLES2 ext.h files
Thus was merged into the OpenGL Registry in version
667c5a253781834b40a6ae9eb19d05af4542cfe1.

Ref: https://github.com/KhronosGroup/OpenGL-Registry/pull/127
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
0c48487893 mesa: add GL_PROGRAM_BINARY_FORMAT_MESA enum
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Francisco Jerez
4d1959e693 intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution.
This addresses a long-standing back-end compiler bug that could lead
to cross-channel data corruption in loops executed non-uniformly.  In
some cases live variables extending through a loop divergence point
(e.g. a non-uniform break) into a convergence point (e.g. the end of
the loop) wouldn't be considered live along all physical control flow
paths the SIMD thread could possibly have taken in between due to some
channels remaining in the loop for additional iterations.

This patch fixes the problem by extending the CFG with physical edges
that don't exist in the idealized non-vectorized program, but
represent valid control flow paths the SIMD EU may take due to the
divergence of logical threads.  This makes sense because the i965 IR
is explicitly SIMD, and it's not uncommon for instructions to have an
influence on neighboring channels (e.g. a force_writemask_all header
setup), so the behavior of the SIMD thread as a whole needs to be
considered.

No changes in shader-db.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-12-07 18:27:05 -08:00
Francisco Jerez
9355116bda intel/fs: Don't let undefined values prevent copy propagation.
This makes the dataflow propagation logic of the copy propagation pass
more intelligent in cases where the destination of a copy is known to
be undefined for some incoming CFG edges, building upon the
definedness information provided by the last patch.  Helps a few
programs, and avoids a handful shader-db regressions from the next
patch.

shader-db results on ILK:

  total instructions in shared programs: 6541547 -> 6541523 (-0.00%)
  instructions in affected programs: 360 -> 336 (-6.67%)
  helped: 8
  HURT: 0

  LOST:   0
  GAINED: 10

shader-db results on BDW:

  total instructions in shared programs: 8174323 -> 8173882 (-0.01%)
  instructions in affected programs: 7730 -> 7289 (-5.71%)
  helped: 5
  HURT: 2

  LOST:   0
  GAINED: 4

shader-db results on SKL:

  total instructions in shared programs: 8185669 -> 8184598 (-0.01%)
  instructions in affected programs: 10364 -> 9293 (-10.33%)
  helped: 5
  HURT: 2

  LOST:   0
  GAINED: 2

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-07 18:27:04 -08:00
Francisco Jerez
c3c1aa5aeb intel/fs: Restrict live intervals to the subset possibly reachable from any definition.
Currently the liveness analysis pass would extend a live interval up
to the top of the program when no unconditional and complete
definition of the variable is found that dominates all of its uses.

This can lead to a serious performance problem in shaders containing
many partial writes, like scalar arithmetic, FP64 and soon FP16
operations.  The number of oversize live intervals in such workloads
can cause the compilation time of the shader to explode because of the
worse than quadratic behavior of the register allocator and scheduler
when running out of registers, and it can also cause the running time
of the shader to explode due to the amount of spilling it leads to,
which is orders of magnitude slower than GRF memory.

This patch fixes it by computing the intersection of our current live
intervals with the subset of the program that can possibly be reached
from any definition of the variable.  Extending the storage allocation
of the variable beyond that is pretty useless because its value is
guaranteed to be undefined at a point that cannot be reached from any
definition.

According to Jason, this improves performance of the subgroup Vulkan
CTS tests significantly (e.g. the runtime of the dvec4 broadcast test
improves by nearly 50x).

No significant change in the running time of shader-db (with 5%
statistical significance).

shader-db results on IVB:

  total cycles in shared programs: 61108780 -> 60932856 (-0.29%)
  cycles in affected programs: 16335482 -> 16159558 (-1.08%)
  helped: 5121
  HURT: 4347

  total spills in shared programs: 1309 -> 1288 (-1.60%)
  spills in affected programs: 249 -> 228 (-8.43%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1652 -> 1597 (-3.33%)
  fills in affected programs: 262 -> 207 (-20.99%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 209

shader-db results on BDW:

  total cycles in shared programs: 67617262 -> 67361220 (-0.38%)
  cycles in affected programs: 23397142 -> 23141100 (-1.09%)
  helped: 8045
  HURT: 6488

  total spills in shared programs: 1456 -> 1252 (-14.01%)
  spills in affected programs: 465 -> 261 (-43.87%)
  helped: 3
  HURT: 0

  total fills in shared programs: 1720 -> 1465 (-14.83%)
  fills in affected programs: 471 -> 216 (-54.14%)
  helped: 4
  HURT: 0

  LOST:   2
  GAINED: 162

shader-db results on SKL:

  total cycles in shared programs: 65436248 -> 65245186 (-0.29%)
  cycles in affected programs: 22560936 -> 22369874 (-0.85%)
  helped: 8457
  HURT: 6247

  total spills in shared programs: 437 -> 437 (0.00%)
  spills in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total fills in shared programs: 870 -> 854 (-1.84%)
  fills in affected programs: 16 -> 0
  helped: 1
  HURT: 0

  LOST:   0
  GAINED: 107

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-07 18:27:04 -08:00
Francisco Jerez
acf98ff933 intel/fs: Teach instruction scheduler about GRF bank conflict cycles.
This should allow the post-RA scheduler to do a slightly better job at
hiding latency in presence of instructions incurring bank conflicts.
The main purpuse of this patch is not to improve performance though,
but to get conflict cycles to show up in shader-db statistics in order
to make sure that regressions in the bank conflict mitigation pass
don't go unnoticed.

Acked-by: Matt Turner <mattst88@gmail.com>
2017-12-07 15:56:49 -08:00
Francisco Jerez
af2c320190 intel/fs: Implement GRF bank conflict mitigation pass.
Unnecessary GRF bank conflicts increase the issue time of ternary
instructions (the overwhelmingly most common of which is MAD) by
roughly 50%, leading to reduced ALU throughput.  This pass attempts to
minimize the number of bank conflicts by rearranging the layout of the
GRF space post-register allocation.  It's in general not possible to
eliminate all of them without introducing extra copies, which are
typically more expensive than the bank conflict itself.

In a shader-db run on SKL this helps roughly 46k shaders:

   total conflicts in shared programs: 1008981 -> 600461 (-40.49%)
   conflicts in affected programs: 816222 -> 407702 (-50.05%)
   helped: 46234
   HURT: 72

The running time of shader-db itself on SKL seems to be increased by
roughly 2.52%±1.13% with n=20 due to the additional work done by the
compiler back-end.

On earlier generations the pass is somewhat less effective in relative
terms because the hardware incurs a bank conflict anytime the last two
sources of the instruction are duplicate (e.g. while trying to square
a value using MAD), which is impossible to avoid without introducing
copies.  E.g. for a shader-db run on SNB:

   total conflicts in shared programs: 944636 -> 623185 (-34.03%)
   conflicts in affected programs: 853258 -> 531807 (-37.67%)
   helped: 31052
   HURT: 19

And on BDW:

   total conflicts in shared programs: 1418393 -> 987539 (-30.38%)
   conflicts in affected programs: 1179787 -> 748933 (-36.52%)
   helped: 47592
   HURT: 70

On SKL GT4e this improves performance of GpuTest Volplosion by 3.64%
±0.33% with n=16.

NOTE: This patch intentionally disregards some i965 coding conventions
      for the sake of reviewability.  This is addressed by the next
      squash patch which introduces an amount of (for the most part
      boring) boilerplate that might distract reviewers from the
      non-trivial algorithmic details of the pass.

The following patch is squashed in:

SQUASH: intel/fs/bank_conflicts: Roll back to the nineties.

Acked-by: Matt Turner <mattst88@gmail.com>
2017-12-07 15:56:06 -08:00
Dylan Baker
c34b53f133 meson: Fix building gallium media targets with gallium-xlib glx
To demonstrate this bug run meson with the options:
-Ddri-drivers= -Dglx=gallium-xlib

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-07 10:22:27 -08:00
Dylan Baker
2adc3817c6 meson: Add lmsensors to gallium libgl-xlib target.
Fixes: 5e71efef44 ("meson: Add lmsensors support")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-07 10:20:58 -08:00
Eric Engestrom
4cba39331d meson: add dep_thread to every lib that includes threads.h
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104141
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-12-07 17:29:42 +00:00
Eric Engestrom
f0337f0f70 meson: fix pl111 dependency on vc4
src/gallium/winsys/pl111/drm/libpl111winsys.a(pl111_drm_winsys.c.o): In function `pl111_drm_screen_create':
pl111_drm_winsys.c:(.text+0x33): undefined reference to `vc4_drm_screen_create_renderonly'

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-12-07 17:21:03 +00:00
Samuel Pitoiset
5f81a43535 radv: use a faster version for nir_op_pack_half_2x16
This patch is ported from RadeonSI and it has two effects.

It fixes a rendering issue which affects F1 2017 and Dawn
of War 3 (Vega only) because LLVM was ending up by generating
the new v_mad_mix_{hi,lo} instructions which appear to be
buggy in some way. Not sure if Mesa is generating something
wrong or if the issue is in LLVM only. Anyway, that explains why
the DOW3 issue can't be reproduced with GL on Vega.

It also improves performance because v_cvt_pkrtz_f16 is faster,
and because I guess the rounding mode behaviour is similar between
GL and VK, we can use it. About performance, it improves Talos
by +3/4% but I don't see any other impacts.

No CTS regressions on Polaris.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-07 17:21:50 +01:00
Alejandro Piñeiro
25e56b2eba mesa/spirv: move and rename nir_spirv_supported_capabilities
To avoid any vulkan driver to include the GL mtypes.h. Renamed as
eventually this could be used by drivers not using nir.

v2: remove compiler/spirv/spirv.h from mtypes (Alejandro)
v3: added the definition at compiler/shader_info.h (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-07 17:15:11 +01:00
Vadym Shovkoplias
b2490a326c util/disk_cache: Remove unneeded free() on always null string
At this point dc_job->cache_item_metadata.keys always equals
NULL, so call to free() is useless

Fixes: b86ecea344 ("util/disk_cache: write cache item metadata to disk")
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-07 11:50:41 +00:00