Commit graph

89278 commits

Author SHA1 Message Date
Emil Velikov
912b4f5472 glx/windows: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
5b874cee09 glx/apple: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
d66f9e6d93 glx: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
d221bf9b91 d3dadapter9: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
517f34b4be st/dri: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
02f991c00d clover: automake: remove -I$(srcdir)
Already implicitly handled by the build system.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Aaron Watry <awatry@gmail.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
65d5a60cac clover: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Aaron Watry <awatry@gmail.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
c5921ae0d2 egl: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
90ac5c339e i915: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
4622c75dfb i965: automake: include builddir prior to srcdir
The latter can contain stale generated file, which, as-is, we'll end up
using.

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
a922c82125 freedreno: automake: correctly set MKDIR_GEN
Analogous to previous commit.

Fixes: 4610e5ef28 "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
2017-01-27 17:56:55 +00:00
Emil Velikov
5eed48d237 i965: automake: correctly set MKDIR_GEN
Otherwise we might end up w/o the respective folder (depending on
autotools version) and fail at build time.

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-27 17:56:54 +00:00
Eric Engestrom
1ee2ae8348 anv: add missing extension errors in vk_errorf()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-27 17:23:32 +00:00
Eric Engestrom
86879bf4ed anv: add missing core errors in vk_errorf()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-27 17:23:32 +00:00
Lionel Landwerlin
ba26c79157 anv: don't assert on out of memory descriptor pool in debug mode
Fixes:
   dEQP-VK.api.descriptor_pool.out_of_pool_memory

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-01-27 17:23:32 +00:00
Eric Engestrom
4da0d1c59a docs/repository: fix name of main branch
This is git, not svn :P

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-27 16:52:23 +00:00
Eric Engestrom
87619a1a6a egl: EGL_PLATFORM_SURFACELESS_MESA is now upstream
EGL_PLATFORM_SURFACELESS_MESA is in eglext.h as of last commit.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-01-27 16:52:23 +00:00
Eric Engestrom
a98b3a0872 egl: update headers from registry
Khronos introduced a new macro (suggested by Google) to avoid using
C-style casts in C++ code, as those generate warnings.

Khronos Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16113
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-01-27 16:52:23 +00:00
Eric Engestrom
06842585df radv: add missing extension errors in vk_errorf()
v2(Bas): Remove the extra VK_ERROR_FRAGMENTED_POOL cases.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-27 17:33:05 +01:00
Eric Engestrom
43cf967512 radv: add missing core errors in vk_errorf()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-27 17:33:05 +01:00
Andreas Boll
1f2a890ace configure.ac: Require LLVM for r300 only on x86 and x86_64
b3119a3 introduced a strict LLVM requirement for r300 on all
architectures and thus configure fails on architectures where LLVM is
not available or buggy.

r300 doesn't strictly require LLVM, but for performance reasons we
highly recommend LLVM usage. So require it at least on x86 and x86_64
architectures as we have done before b3119a3.

Fixes: b3119a3 ("configure.ac: Check gallium LLVM version in gallium_require_llvm")
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 12:31:17 +01:00
Nicolai Hähnle
c5e76a262a gallium: enable int64 on radeonsi, llvmpipe, softpipe
All of these have had support for the TGSI opcodes since before most of
the glsl compiler work landed.

Also update the docs accordingly, including the missing note about i965.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-27 10:19:48 +01:00
Dave Airlie
93dc5c1a06 st/mesa: add support for enabling ARB_gpu_shader_int64.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-27 10:19:43 +01:00
Dave Airlie
278580729a st/glsl_to_tgsi: add support for 64-bit integers
v2: add conversion opcodes.

v3 (idr): Rebase on replacemtn of TGSI_OPCODE_I2U64 with
TGSI_OPCODE_I2I64.

v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

v5 (nha): add clarifying comment about a subtle assumption

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-27 10:19:39 +01:00
Dave Airlie
f804506d4d gallium: Add integer 64 capability
v1.1: move to using a normal CAP. (Marek)

v2: fill in the cap everywhere

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-27 10:19:25 +01:00
Topi Pohjolainen
a283a4ee2f meta: Refactor texture format translation
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
542bb85049 intel/blorp/dbg: Name blit shaders for easy recognition in dumps
Blorp clears already have an equivalent.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
56094cfb9e i965/hiz/gen6: Stop setting false qpitch
which is not applicable for "all slices at each lod". Current
logic makes one to believe it has some purpose. When miptree
layout is calculated brw_miptree_layout_texture_array() sets
the qpitch unconditionally but later on ignores it altogether
for ALL_SLICES_AT_EACH_LOD.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
b13d30a72b i965/blorp/gen6: Remove dead code in hiz setup
Such as comment states for intel_miptree_hiz_buffer::mt, hiz_mt
only exists for gen6. In addition, intel_hiz_miptree_buf_create()
uses MIPTREE_LAYOUT_FORCE_ALL_SLICE_AT_LOD unconditionally.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
b864e3d7ee i965/gen6: Simplify hiz surface setup
In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. The same goes for
intel_miptree_aux_buffer::pitch/qpitch.

This will make following patches simpler to read.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
40bf622ced i965/blorp/gen6: Simplify hiz surface setup
In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. Also intel_miptree_aux_buffer::offset
is initialised to zero (calloc()).

This will make following patches significantly simpler to read.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
5201d2991b i965/gen6: Remove check for stencil format
There are is no alternative.

Reviewed-by: Samuel Iglesias Gons\341lvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
19412abb3f i965: Remove check for hiz on earlier gens than SNB
Only caller, brw_workaround_depthstencil_alignment(), returns
early for gen6+.

While at it, reduce scope for brw_get_depthstencil_tile_masks() as
well.

Reviewed-by: Samuel Iglesias Gons\341lvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
26a9e039fd i965/miptree: Remove redundant check for null texture
There exact same check earlier in brw_miptree_layout() which
intel_miptree_create_layout() in turn calls unconditionally.

Reviewed-by: Samuel Iglesias Gons\341lvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
bcec4113cc i965/miptree: Tell when brw_miptree_layout() fails
In addition, let intel_miptree_create_layout() release the
miptree - it is the allocator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:25 +02:00
Topi Pohjolainen
aa9e21a316 i965/meta: Remove unused brw_get_rb_for_slice()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gons<C3><A1>lvez <siglesias@igalia.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:25 +02:00
Michel Dänzer
d9f8bae616 clover: Fix build against clang SVN >= r293097
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-01-27 09:53:14 +09:00
Eric Anholt
9baf1ff8fc vc4: Use NEON to speed up utile stores on Pi2+.
Improves 1024x1024 TexSubImage2D by 41.2371% +/- 3.52799% (n=10).
2017-01-26 12:50:05 -08:00
Eric Anholt
4d30024238 vc4: Use NEON to speed up utile loads on Pi2.
We had a lot of memcpy call overhead because gpu_stride wasn't being
inlined.  But if you split out the stride==8 and stride==16 cases like
this code does while still using memcpy, you'd no longer have glibc's
NEON memcpy applied at which point we'd be doing 16 uncached reads
instead of 64/(NEON memcpy granularity), for about a 30% performance
hit.  By hand writing the assembly, we can get a whole cacheline
loaded at a time.

Unfortunately, NEON intrinsics turned out to be unusable -- they
didn't have the vldm instruction available.

Note that, for now, the NEON code is only enabled when building for ARMv7
(Pi 2+).  We may want to do runtime detection for the Raspbian case, in
the future.

Improves 1024x1024 GetTexImage by 208.256% +/- 7.07029% (n=10).
2017-01-26 12:48:10 -08:00
Eric Anholt
347b69e7d7 vc4: Move LT tiling code to a separate file.
This paves the way for building it twice, with NEON assembly or not.
2017-01-26 12:23:31 -08:00
Eric Anholt
14cf5c60b8 vc4: Use unreachable() in an unreachable codepath for tiling. 2017-01-26 12:23:31 -08:00
Samuel Pitoiset
eca96ea308 gallium/radeon: add VRAM-vis-usage HUD query
This new query returns the current visible usage of VRAM accessed
by the CPU. It will return 0 on radeon because it's unimplemented.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-26 19:40:52 +01:00
Samuel Pitoiset
9f087e1c7c gallium/radeon: query the CPU accessible size of VRAM
R600_DEBUG="info" can be used to display that size, as well as
the total amount of VRAM/GTT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-26 19:40:14 +01:00
Ian Romanick
13439031c8 mesa: Arrange validate_uniform_parameters parameters to match call sites
Saves a measly 20 bytes on IA32 and nothing on x64.  Depending on
exactly when this is applied, a lot of variation is possible due to
function alignment.

   text	   data	    bss	    dec	    hex	filename
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so before
6670111	 228340	  22552	6921003	 699b2b	lib/i965_dri.so after
6342932	 293872	  29880	6666684	 65b9bc	lib64/i965_dri.so before
6342932	 293872	  29880	6666684	 65b9bc	lib64/i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-26 09:46:18 -08:00
Ian Romanick
9be5fd3c87 mesa: Arrange _mesa_uniform parameters to match the call sites
By putting the parameters first that match the parameters to the call
site, 4 (of 14) instructions are saved at _mesa_Uniform4fv on x64.  On
IA32, the details of the instructions change, but it is the same count
and mix of instructions.

Before:

0000000000000830 <_mesa_Uniform4fv>:
     830:       48 83 ec 10             sub    $0x10,%rsp
     834:       49 89 d0                mov    %rdx,%r8
     837:       48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 83e <_mesa_Uniform4fv+0xe>
     83e:       89 f8                   mov    %edi,%eax
     840:       89 f1                   mov    %esi,%ecx
     842:       41 b9 02 00 00 00       mov    $0x2,%r9d
     848:       64 48 8b 3a             mov    %fs:(%rdx),%rdi
     84c:       48 8b 97 c8 01 02 00    mov    0x201c8(%rdi),%rdx
     853:       48 8b 72 70             mov    0x70(%rdx),%rsi
     857:       6a 04                   pushq  $0x4
     859:       89 c2                   mov    %eax,%edx
     85b:       e8 00 00 00 00          callq  860 <_mesa_Uniform4fv+0x30>
     860:       48 83 c4 18             add    $0x18,%rsp
     864:       c3                      retq

After:

00000000000007f0 <_mesa_Uniform4fv>:
     7f0:       48 83 ec 10             sub    $0x10,%rsp
     7f4:       48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7fb <_mesa_Uniform4fv+0xb>
     7fb:       41 b9 02 00 00 00       mov    $0x2,%r9d
     801:       64 48 8b 08             mov    %fs:(%rax),%rcx
     805:       48 8b 81 c8 01 02 00    mov    0x201c8(%rcx),%rax
     80c:       6a 04                   pushq  $0x4
     80e:       4c 8b 40 70             mov    0x70(%rax),%r8
     812:       e8 00 00 00 00          callq  817 <_mesa_Uniform4fv+0x27>
     817:       48 83 c4 18             add    $0x18,%rsp
     81b:       c3                      retq

Saves a measly 416 bytes of text on x64.  Depending on exactly when this
is applied, a lot of variation is possible due to function alignment.

   text	   data	    bss	    dec	    hex	filename
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so before
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so after
6343348	 293872	  29880	6667100	 65bb5c	lib64/i965_dri.so before
6342932	 293872	  29880	6666684	 65b9bc	lib64/i965_dri.so after

There is likely to be no performance change with just this patch.
_mesa_uniform immediately calls validate_uniform_parameters with
parameters in the "wrong" (different from the call site) order.

v2: Rebase on GL_ARB_gpu_shader_fp64.

v3: Rebase on GL_ARB_gpu_shader_int64.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-26 09:46:14 -08:00
Ian Romanick
9f7ac45ce4 mesa: Arrange _mesa_uniform_matrix parameters to match the call sites
By putting the parameters first that match the parameters to the call
site, 4 (of 16) instructions are saved at _mesa_UniformMatrix4fv on
x64.  On IA32, the details of the instructions change, but it is the
same count and mix of instructions.

Before:

0000000000001380 <_mesa_UniformMatrix4fv>:
    1380:       48 83 ec 10             sub    $0x10,%rsp
    1384:       48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 138b <_mesa_UniformMatrix4fv+0xb>
    138b:       41 89 f8                mov    %edi,%r8d
    138e:       41 89 f1                mov    %esi,%r9d
    1391:       0f b6 d2                movzbl %dl,%edx
    1394:       64 48 8b 38             mov    %fs:(%rax),%rdi
    1398:       48 8b b7 c8 01 02 00    mov    0x201c8(%rdi),%rsi
    139f:       48 8b 76 70             mov    0x70(%rsi),%rsi
    13a3:       68 06 14 00 00          pushq  $0x1406
    13a8:       51                      push   %rcx
    13a9:       52                      push   %rdx
    13aa:       b9 04 00 00 00          mov    $0x4,%ecx
    13af:       ba 04 00 00 00          mov    $0x4,%edx
    13b4:       e8 00 00 00 00          callq  13b9 <_mesa_UniformMatrix4fv+0x39>
    13b9:       48 83 c4 28             add    $0x28,%rsp
    13bd:       c3                      retq

After:

0000000000001360 <_mesa_UniformMatrix4fv>:
    1360:       48 83 ec 10             sub    $0x10,%rsp
    1364:       48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 136b <_mesa_UniformMatrix4fv+0xb>
    136b:       0f b6 d2                movzbl %dl,%edx
    136e:       64 4c 8b 00             mov    %fs:(%rax),%r8
    1372:       49 8b 80 c8 01 02 00    mov    0x201c8(%r8),%rax
    1379:       68 06 14 00 00          pushq  $0x1406
    137e:       6a 04                   pushq  $0x4
    1380:       6a 04                   pushq  $0x4
    1382:       4c 8b 48 70             mov    0x70(%rax),%r9
    1386:       e8 00 00 00 00          callq  138b <_mesa_UniformMatrix4fv+0x2b>
    138b:       48 83 c4 28             add    $0x28,%rsp
    138f:       c3                      retq

Saves a measly 576 bytes of text on x64.

   text	   data	    bss	    dec	    hex	filename
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so before
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so after
6343924	 293872	  29880	6667676	 65bd9c	lib64/i965_dri.so before
6343348	 293872	  29880	6667100	 65bb5c	lib64/i965_dri.so after

v2: Rebase on GL_ARB_gpu_shader_fp64.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-26 09:46:09 -08:00
Ian Romanick
874393186b mesa: Trivial clean-ups in uniform_query.cpp
This is C++, so we can mix code and declarations.  Doing so allows
constification.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-26 09:46:07 -08:00
Lionel Landwerlin
bbe8705c57 spirv: handle undefined components for OpVectorShuffle
Fixes:
   dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related
   dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-01-26 17:31:21 +00:00
Lionel Landwerlin
df7063cba3 spirv: handle OpUndef as part of the variable parsing pass
Looking at the following bit of SPIRV shader :

...
%zero        = OpConstant %i32 0
%ivec3_0     = OpConstantComposite %ivec3 %zero %zero %zero
%vec3_undef  = OpUndef %ivec3
%sc_0        = OpSpecConstant %i32 0
%sc_1        = OpSpecConstant %i32 0
%sc_2        = OpSpecConstant %i32 0
...

Our compiler currently stops parsing variables & types on the OpUndef
and switches to instructions, leaving the following sc_[0-2] variables
untreated.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-01-26 17:29:29 +00:00
Lionel Landwerlin
c3421106ec anv: fix descriptor pool internal size allocation
The size of the pool is slightly smaller than the size of the
structure containing the whole pool. We need to take that into account
on when setting up the internals.

Fixes a crash due to out of bound memory access in:
   dEQP-VK.api.descriptor_pool.out_of_pool_memory

v2: Drop debug traces (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-01-26 17:24:21 +00:00