68882 commits

Author SHA1 Message Date
Ian Romanick
ce3f46397d i965/fs: Handle CMP.nz ... 0 and AND.nz ... 1 similarly in cmod propagation
Especially on platforms that do not natively generate 0u and ~0u for
Boolean results, we generate a lot of sequences where a CMP is
followed by an AND with 1.  emit_bool_to_cond_code does this, for
example.  On ILK, this results in a sequence like:

    add(8)          g3<1>F          g8<8,8,1>F      -g4<0,1,0>F
    cmp.l.f0(8)     g3<1>D          g3<8,8,1>F      0F
    and.nz.f0(8)    null            g3<8,8,1>D      1D
    (+f0) iff(8)    Jump: 6

The AND.nz is obviously redundant.  By propagating the cmod, we can
instead generate

    add.l.f0(8)     null            g8<8,8,1>F      -g4<0,1,0>F
    (+f0) iff(8)    Jump: 6

Existing code already handles the propagation from the CMP to the ADD.
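
The CMP-to-AND half of this can be sketched in plain C. This is a minimal illustration on a toy instruction array, not the i965 pass itself: the types and names (toy_inst, drop_redundant_and_nz) are hypothetical, and the sketch assumes straight-line code and checks far fewer conditions than a real compiler pass would need.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR, invented for this illustration only. */
enum toy_opcode { TOY_ADD, TOY_CMP, TOY_AND, TOY_IF };
enum toy_cmod { CMOD_NONE, CMOD_L, CMOD_NZ };

#define TOY_NULL_REG (-1)

struct toy_inst {
    enum toy_opcode op;
    enum toy_cmod cmod;   /* CMOD_NONE means the flag is not written */
    int dst;              /* destination register, or TOY_NULL_REG */
    int src0;             /* first source register */
    bool src1_is_imm;
    int src1;             /* register number or immediate value */
};

static bool is_and_nz_with_one(const struct toy_inst *inst)
{
    return inst->op == TOY_AND && inst->cmod == CMOD_NZ &&
           inst->dst == TOY_NULL_REG && inst->src1_is_imm && inst->src1 == 1;
}

/*
 * Drops every "AND.nz null, rX, 1" whose source rX was last written by a
 * CMP that also wrote the flag, provided no other instruction wrote the
 * flag in between.  Compacts the array in place and returns the new length.
 */
size_t drop_redundant_and_nz(struct toy_inst *insts, size_t count)
{
    size_t out = 0;

    for (size_t i = 0; i < count; i++) {
        bool redundant = false;

        if (is_and_nz_with_one(&insts[i])) {
            /* Walk backwards over the instructions kept so far. */
            for (size_t j = out; j-- > 0;) {
                const struct toy_inst *scan = &insts[j];

                if (scan->dst == insts[i].src0) {
                    redundant = scan->op == TOY_CMP &&
                                scan->cmod != CMOD_NONE;
                    break;
                }
                if (scan->cmod != CMOD_NONE)
                    break;   /* another instruction overwrote the flag */
            }
        }
        if (!redundant)
            insts[out++] = insts[i];
    }
    return out;
}
```

Run on the four-instruction ILK sequence above, the sketch would keep the ADD, the CMP and the IF and drop the AND.nz; folding the CMP into the ADD is left to the existing propagation.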

Shader-db results:

GM45 (0x2A42):
total instructions in shared programs: 3550829 -> 3550788 (-0.00%)
instructions in affected programs:     10028 -> 9987 (-0.41%)
helped:                                24

Iron Lake (0x0046):
total instructions in shared programs: 4993146 -> 4993105 (-0.00%)
instructions in affected programs:     9675 -> 9634 (-0.42%)
helped:                                24

Ivy Bridge (0x0166):
total instructions in shared programs: 6291870 -> 6291794 (-0.00%)
instructions in affected programs:     17914 -> 17838 (-0.42%)
helped:                                48

Haswell (0x0426):
total instructions in shared programs: 5779256 -> 5779180 (-0.00%)
instructions in affected programs:     16694 -> 16618 (-0.46%)
helped:                                48

Broadwell (0x162E):
total instructions in shared programs: 6823088 -> 6823014 (-0.00%)
instructions in affected programs:     15824 -> 15750 (-0.47%)
helped:                                46

No change on Sandy Bridge or on any platform when NIR is used.

v2: Add unit tests suggested by Matt.  Remove spurious writes_flag()
check on scan_inst when scan_inst is known to be BRW_OPCODE_CMP (also
suggested by Matt).

v3: Fix some comments and remove some explicit int() casts in fs_reg
constructors in the unit tests.  Both suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-17 14:59:43 -07:00
Matt Turner
d35720da9b i965: Mark paths in linear <-> tiled functions as unreachable().
text    data     bss     dec     hex filename
9663       0       0    9663    25bf intel_tiled_memcpy.o   before
8215       0       0    8215    2017 intel_tiled_memcpy.o   after
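
For readers unfamiliar with the technique, here is a minimal sketch of marking a path as unreachable. It is not taken from intel_tiled_memcpy.c: toy_unreachable() is a hypothetical stand-in for Mesa's unreachable() macro, pixel_offset() is invented for the example, and the GCC/Clang builtin is assumed to be available. The size reduction above presumably comes from the compiler no longer emitting code for paths it has been told cannot happen.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an unreachable() macro. */
#if defined(__GNUC__)
#define toy_unreachable() __builtin_unreachable()
#else
#define toy_unreachable() abort()
#endif

/*
 * Returns the byte offset of pixel x in a row.  By contract (an assumption
 * of this sketch) cpp, the bytes per pixel, is always 1, 2 or 4.
 */
uint32_t pixel_offset(uint32_t x, uint32_t cpp)
{
    switch (cpp) {
    case 1:
        return x;
    case 2:
        return x * 2;
    case 4:
        return x * 4;
    default:
        /* Callers never pass other values, so no fallback is generated. */
        toy_unreachable();
    }
}
```

Calling pixel_offset() with a cpp outside the contract is undefined behavior when the builtin is used, which is the trade-off that makes the smaller code possible.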

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-03-17 14:09:56 -07:00
Matt Turner
6c6e2a15aa egl: Remove eglQueryString virtual dispatch.
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-03-17 14:09:56 -07:00
Laura Ekstrand
827da841a1 main: Correct _mesa_error with no format in bufferobj.c.
This fixes Bug 89616, a build failure due to line 1639 of bufferobj.c:
_mesa_error(ctx, GL_INVALID_OPERATION, func);

Trivial.
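
The message does not show the fix itself. A common way to resolve this kind of failure (typically reported by compilers under -Werror=format-security) is to pass the string as an argument to a "%s" format. The sketch below assumes that approach; report_error() and report_caller() are hypothetical stand-ins and not the _mesa_error() API.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style reporter standing in for an error function. */
int report_error(char *out, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(out, size, fmt, args);
    va_end(args);
    return n;
}

/*
 * Passes func as data rather than as the format string, so any '%' it
 * contains is copied literally instead of being read as a conversion.
 */
int report_caller(char *out, size_t size, const char *func)
{
    return report_error(out, size, "%s", func);
}
```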
2015-03-17 13:30:54 -07:00
Laura Ekstrand
579297c8bd main: Cosmetic changes to GetBufferSubData.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
23eab47bbe main: Add entry point for GetNamedBufferSubData.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
3706ace244 main: Cosmetic updates to GetBufferPointerv.
v3: Review from Fredrik Höglund
   -Split cosmetic refactor of GetBufferPointerv out into a separate commit

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
105ddc6aea main: Add entry point for GetNamedBufferPointerv.
v3: Review from Fredrik Höglund
   -Split cosmetic refactor of GetBufferPointerv out into a separate commit

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
1e45752aaf main: Add entry points for GetNamedBufferParameteri[64]v.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
efcb830d49 main: Refactor GetBufferParameteri[64]v.
v2: Split into a refactor commit and an entry point commit.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
1cfc18da8d main: Add entry point for FlushMappedNamedBufferRange.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
ee5fae6e89 main: Refactor FlushMappedBufferRange.
v2:-Remove "_mesa" from in front of static software fallback.
   -Split out the refactor from the addition of the DSA entry points.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
f7f5df9954 main: Add entry point for UnmapNamedBuffer.
v2: review from Ian Romanick
   - Restore VBO_DEBUG and BOUNDS_CHECK
   - Remove _mesa from static software fallback unmap_buffer.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
a0cc03929e main: Add entry points for MapNamedBuffer[Range].
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
4f513bc330 main: Refactor MapBuffer[Range].
v2: review from Jason Ekstrand
   - Split refactor from addition of DSA entry points.
    review from Ian Romanick
   - Remove "_mesa" from static software fallback map_buffer_range
   - Restore VBO_DEBUG and BOUNDS_CHECK

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
16244525fb main: Minor whitespace fixes in ClearNamedBuffer[Sub]Data.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2015-03-17 10:18:34 -07:00
Laura Ekstrand
5030d0a4f7 main: Add entry points for ClearNamedBuffer[Sub]Data.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
9fa6c3637a main: Refactor ClearBuffer[Sub]Data.
v2: review by Jason Ekstrand
   - Split refactor of clear buffer sub data from addition of DSA entry
     points.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
4adaad5fcc main: Add entry point for CopyNamedBufferSubData.
v2: remove _mesa in front of static software fallback.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
9cb732b8e9 main: Improve errors and style in BufferSubData.
- More explicit error reporting.
- Removed legacy style.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
566ccdf11b main: Add entry point for NamedBufferSubData.
v2: review by Ian Romanick
   - Remove "_mesa" from name of static software fallback buffer_sub_data.
   - Remove mappedRange from _mesa_buffer_sub_data.
   - Removed some cosmetic changes to a separate commit.

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
cb56835f87 main: Add entry point for NamedBufferData.
v2: review from Ian Romanick
   - Fix space in ARB_direct_state_access.xml.
   - Remove "_mesa" from the name of buffer_data static fallback.
   - Restore VBO_DEBUG and BOUNDS_CHECK.
   - Fix beginning of comment to start on same line as /*

Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
a76808dc19 main: Add entry point for NamedBufferStorage.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
2cf48c37c1 main: Add entry point for CreateBuffers.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-03-17 10:18:33 -07:00
Laura Ekstrand
44ecf0793d Revert "main: _mesa_cube_level_complete checks NumLayers."
This reverts commit 1ee000a0b6.
Failures with the GLES3 conformance suite and Synmark2 OGLHdrBloom revealed
that this commit was in error.

Extensive testing with Piglit prior to patch review and upstreaming did not
reveal this problem because, in the few Piglit tests that test for cube
completeness, NumLayers = 6.  This is because all of the existing tests use
TextureStorage to initialize the texture, which sets NumLayers.

A new Piglit test has been sent to the mailing list that reproduces the bug
related to this patch ("texturing: Testing
glGenerateMipmap(GL_TEXTURE_CUBE_MAP) without glTexStorage2D").

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-17 10:04:10 -07:00
Neil Roberts
5a06ee7384 i965/skl: Send a message header when doing constant loads SIMD4x2
Commit 0ac4c27275 made it add a header for the send message when
using SIMD4x2 on Skylake because without this it will end up using
SIMD8D. However the patch missed the case when a sampler is being used
to implement constant loads from a buffer surface in a SIMD4x2 vertex
shader.

This fixes 29 Piglit tests, mostly related to the ARL instruction in
vertex programs.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-17 16:32:11 +00:00
Tapani Pälli
627c683086 i965/fs: in MAD optimizations, switch last argument to be immediate
Commit bb33a31 introduced optimizations that transform cases of MAD
into simpler forms, but it did not take into account that src[0]
cannot be an immediate and did not report progress. This patch switches
src[0] and src[1] if src[0] is an immediate and adds progress
reporting. If both sources are immediates, this is taken care of by
the same opt_algebraic pass on a later run.
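
A minimal sketch of the operand swap on a toy two-source instruction follows. The struct layout and the name move_imm_to_last are hypothetical, and leaving the both-immediate case untouched is an assumption of the sketch, not a statement about the patch.

```c
#include <stdbool.h>

/* Hypothetical toy operand: either an immediate or a register. */
struct toy_src {
    bool is_imm;
    float value;   /* used when is_imm is true */
    int reg;       /* used when is_imm is false */
};

/* A commutative two-source instruction such as ADD or MUL. */
struct toy_alu {
    struct toy_src src[2];
};

/*
 * If src[0] is an immediate and src[1] is not, swap them through a
 * temporary so the immediate ends up last.  Returns true when the
 * instruction changed, so the caller can report progress.
 */
bool move_imm_to_last(struct toy_alu *inst)
{
    if (inst->src[0].is_imm && !inst->src[1].is_imm) {
        struct toy_src tmp = inst->src[0];

        inst->src[0] = inst->src[1];
        inst->src[1] = tmp;
        return true;
    }
    return false;
}
```

Returning the changed/unchanged status matters because optimization loops usually rerun their passes until no pass reports progress.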

v2: Fix for all cases, use temporary fs_reg (Matt, Kenneth)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89569
Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-03-17 07:59:30 +02:00
Vinson Lee
60f77b22b1 common.py: Fix PEP 8 issues.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 22:55:08 -07:00
Roland Scheidegger
2372275d2f gallivm: abort properly when running out of buffer space in lp_disassembly
Previously, this actually ran into an infinite loop printing out "invalid"...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-17 00:46:48 +01:00
Marek Olšák
9d1682d619 docs/GL3: also mark GLES3/GS5 for radeonsi as done
2015-03-16 23:27:25 +01:00
Emil Velikov
c066669b8d st/dri: remove unused include from the automake/scons build
st/dri/common hasn't been around for a while.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 20:59:52 +00:00
Emil Velikov
55f0c0a29f auxiliary/os: fix the android build - s/drm_munmap/os_munmap/
Squash this silly typo introduced with commit c63eb5dd5ec (auxiliary/os: get
the mmap/munmap wrappers working with android)

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 20:59:36 +00:00
Emil Velikov
5664f57df3 gallium/sw/kms: trivial cleanups
Remove the forward declaration and make use of the DEBUG_PRINT macro for
debug builds.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 20:59:22 +00:00
Emil Velikov
771cd266b9 loader: include <sys/stat.h> for non-sysfs builds
Required by fstat(), otherwise we'll error out due to implicit function
declaration.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com>
Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>
2015-03-16 20:48:07 +00:00
Felix Janda
aead7fe2e2 c11/threads: Use PTHREAD_MUTEX_RECURSIVE by default
Previously PTHREAD_MUTEX_RECURSIVE_NP had been used on Linux for
compatibility with old glibc. Since Mesa defines __GNU_SOURCE__
on Linux, PTHREAD_MUTEX_RECURSIVE has also been available since at least
1998. So we can unconditionally use the portable version
PTHREAD_MUTEX_RECURSIVE.
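
For reference, a minimal sketch of creating a recursive mutex with the portable constant, using only POSIX calls. The helper name init_recursive_mutex is hypothetical and this is not the c11/threads code; the _XOPEN_SOURCE definition is an assumption made here so the constant is visible even in strict compiler modes.

```c
#define _XOPEN_SOURCE 700   /* assumption: request the XSI feature level */
#include <pthread.h>

/*
 * Initializes *mtx as a recursive mutex.
 * Returns 0 on success or the error number from the failing call.
 */
int init_recursive_mutex(pthread_mutex_t *mtx)
{
    pthread_mutexattr_t attr;
    int ret = pthread_mutexattr_init(&attr);

    if (ret != 0)
        return ret;

    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (ret == 0)
        ret = pthread_mutex_init(mtx, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return ret;
}
```

A thread that holds such a mutex can lock it again without deadlocking, as long as it unlocks it the same number of times.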

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88534
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-16 20:41:39 +00:00
Marek Olšák
b5f19db976 radeonsi: implement TGSI_OPCODE_BFI (v2)
v2: Don't use the intrinsics, the shader backend can recognize these
    patterns and generates optimal code automatically.
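
As an illustration of the kind of shift-and-mask pattern meant here, a bitfield insert can be written in plain C as below. This is a sketch under assumptions: the function name is hypothetical, the semantics follow the usual "replace bits [offset, offset + bits) of base with the low bits of insert" definition, and the caller is assumed to guarantee offset < 32 and offset + bits <= 32.

```c
#include <stdint.h>

/*
 * Replaces `bits` bits of `base`, starting at bit `offset`, with the low
 * bits of `insert`.  Assumes offset < 32 and offset + bits <= 32.
 */
uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                         unsigned offset, unsigned bits)
{
    /* bits == 32 is handled separately to avoid shifting by 32. */
    uint32_t mask = bits >= 32
                        ? UINT32_MAX
                        : ((UINT32_C(1) << bits) - 1) << offset;

    return (base & ~mask) | ((insert << offset) & mask);
}
```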

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-16 14:58:19 +01:00
Marek Olšák
d3723c614f radeonsi: add a helper for extracting bitfields from parameters (v2)
This will be used a lot (especially by tessellation).

v2: don't use the bfe intrinsic
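
What such a helper computes can be sketched with a plain shift and mask. The C function below is only an illustration: the name unpack_bits and its signature are hypothetical, and it operates on an ordinary integer, not on shader parameters.

```c
#include <stdint.h>

/*
 * Returns `bitwidth` bits of `param`, starting at bit `rshift`.
 * Assumes rshift < 32 and rshift + bitwidth <= 32.
 */
uint32_t unpack_bits(uint32_t param, unsigned rshift, unsigned bitwidth)
{
    uint32_t value = param >> rshift;

    /* A full 32-bit field needs no mask, and shifting by 32 is undefined. */
    if (bitwidth < 32)
        value &= (UINT32_C(1) << bitwidth) - 1;
    return value;
}
```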

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-16 14:58:19 +01:00
Antia Puentes
9735a62a2c i965: Emit IF/ELSE/ENDIF/WHILE JIP with type W on Gen7
IvyBridge and Haswell PRM say that the JIP should be emitted
with type W but we were using UD. The previous implementation
did not show adverse effects, but IMHO it is safer to follow
the specification thoroughly.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>
2015-03-16 12:56:17 +01:00
Marek Olšák
dc39413640 radeonsi: move scratch reloc state setup
- move it to its own function
- do it after all states are emitted
- bump SI_MAX_DRAW_CS_DWORDS

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
567c8d7300 radeonsi: don't emit PA_SC_LINE_STIPPLE if not rendering lines
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
1f4bb38264 radeonsi: don't emit PA_SC_LINE_STIPPLE after every rasterizer state change
Do it only when the line stipple state is changed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
f5832f3f9d radeonsi: move PA_SU_SC_MODE_CNTL to rasterizer state
This requires enabling the optional GL provoking vertex behavior for quads.

+ some cosmetic changes, so that the register is set exactly the same as
on r600.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
98a2398222 radeonsi: implement line and polygon smoothing
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
303d23e10d radeonsi: add shader code for smoothing
The fragment shader multiplies the alpha channel with gl_SampleMaskIn.
If blending is enabled, it looks like MSAA.
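
One plausible reading, assumed here and not confirmed by the text above, is that alpha is scaled by the fraction of samples set in the coverage mask. The sketch below shows that arithmetic on the CPU; the function names and the num_samples parameter are hypothetical.

```c
#include <stdint.h>

/* Counts the set bits in mask by clearing the lowest set bit each step. */
static unsigned count_bits(uint32_t mask)
{
    unsigned n = 0;

    while (mask) {
        mask &= mask - 1;
        n++;
    }
    return n;
}

/*
 * Scales alpha by covered samples / total samples (assumed behavior).
 * num_samples must be nonzero.
 */
float smoothed_alpha(float alpha, uint32_t sample_mask, unsigned num_samples)
{
    return alpha * (float)count_bits(sample_mask) / (float)num_samples;
}
```

With blending enabled, a pixel that is only partly covered then contributes proportionally less, which is what makes the edge look antialiased.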

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
4f20a8f278 radeonsi: split sample locations into its own state atom
Sample locations are not updated as often as framebuffers.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
f7796a966d radeonsi: add basic code for overrasterization
This will be used for line and polygon smoothing.
This is GCN-only even though it's in shared code.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
1921fa4304 radeonsi: small cleanup in si_shader_selector_key
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
52ff1edc51 radeonsi: simplify accessing alpha pointer in si_llvm_emit_fs_epilogue
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
955ebf2890 radeonsi: add support for easy opcodes from ARB_gpu_shader5
I have to use the BFE intrinsics, because BFE is one of the most complex
instructions that can't be matched easily. BFE has 3 conditional branches
and one of them is quite big.

In the isel DAG, lowered BFE has 27 nodes (including leaves).
2015-03-16 12:54:18 +01:00
Marek Olšák
755a2907a3 radeonsi: implement bit-finding opcodes from ARB_gpu_shader5
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-03-16 12:54:18 +01:00