Commit graph

91007 commits

Author SHA1 Message Date
Boyan Ding
d02829c94e nvc0: Enable ARB_shader_ballot on Kepler+
readInvocationARB() and readFirstInvocationARB() need the SHFL.IDX
instruction, which was introduced in Kepler.

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:25:17 -04:00
Boyan Ding
59f6aa8096 nvc0/ir: Implement TGSI_OPCODE_BALLOT and TGSI_OPCODE_READ_*
v2: Check if each channel is masked in TGSI_OPCODE_BALLOT (Ilia Mirkin)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:25:14 -04:00
Boyan Ding
48d00779d0 nvc0/ir: Implement TGSI_SEMANTIC_SUBGROUP_*
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:25:08 -04:00
Boyan Ding
f7787f224f nvc0/ir: Add SV_LANEMASK_* system values.
v2: Add name strings in nv50_ir_print.cpp (Ilia Mirkin)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:25:04 -04:00
Boyan Ding
2a3c4c6bc3 nvc0/ir: Allow 0/1 immediate value as source of OP_VOTE
The implementation of readFirstInvocationARB() on NVIDIA hardware needs a
ballotARB(true) to determine the first active thread. This is expressed
in gm107 asm as (supposing output is $r0):
	vote any $r0 0x1 0x1

To model the always true input, which corresponds to the second 0x1
above, we make OP_VOTE accept immediate value 0/1 and emit "0x1" and
"not 0x1" in the src field respectively.

v2: Make sure that asImm() is not NULL (Samuel Pitoiset)

v3: (Ilia Mirkin)
Make the handling more symmetric with predicate version in gm107
Use i->getSrc(s)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:59 -04:00
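The first-active-thread selection described in 2a3c4c6bc3 can be modelled on the CPU. This is only a sketch under stated assumptions: a 32-lane warp, an explicit active-lane mask, and the helper names ballot(), first_active_lane() and read_first_invocation() are all hypothetical and do not come from the nvc0 code generator.

```c
#include <assert.h>
#include <stdint.h>

#define WARP_SIZE 32 /* assumed warp width */

/* Software model of a ballot: bit i is set when lane i is active and its
 * predicate is true. ballotARB(true) corresponds to pred_mask = all ones. */
static uint32_t ballot(uint32_t pred_mask, uint32_t active_mask)
{
    return pred_mask & active_mask;
}

/* Index of the lowest set bit, i.e. the first active lane. */
static unsigned first_active_lane(uint32_t ballot_result)
{
    assert(ballot_result != 0); /* at least one lane must be executing */
    unsigned lane = 0;
    while (!(ballot_result & 1u)) {
        ballot_result >>= 1;
        lane++;
    }
    return lane;
}

/* Model of readFirstInvocationARB(): every lane receives the value held
 * by the first active lane. */
static uint32_t read_first_invocation(const uint32_t values[WARP_SIZE],
                                      uint32_t active_mask)
{
    uint32_t all_true = ballot(UINT32_MAX, active_mask);
    return values[first_active_lane(all_true)];
}
```

With lanes 0 to 3 inactive (mask 0xFFFFFFF0), the value broadcast is the one held by lane 4.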
Boyan Ding
f1252996f5 gk110/ir: Emit OP_SHFL
v2: Make sure that asImm() is not NULL (Samuel Pitoiset)

v3: Check the range of immediate in OP_SHFL (Ilia Mirkin)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:55 -04:00
Boyan Ding
c32e150008 nvc0/ir: Emit OP_SHFL
v2: (Samuel Pitoiset)
Add an assertion to check if the target is Kepler
Make sure that asImm() is not NULL

v3: (Ilia Mirkin)
Check the range of immediate value of OP_SHFL
Use the new setPDSTL API

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:52 -04:00
Boyan Ding
d941ef3829 nvc0/ir: Properly handle a "split form" of predicate destination
GF100's ISA encoding has a weird form of predicate destination where its
3 bits are split across the whole instruction. Use a dedicated setPDSTL
function instead of the original defId, which is incorrect in this case.

v2: (Ilia Mirkin)
Change API of setPDSTL() to handle cases of no output
Fix setting of the highest bit in setPDSTL()

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:47 -04:00
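The "split form" in d941ef3829 can be illustrated with a small encoder. This is a sketch only: the bit positions, the two-word instruction layout, and the use of 7 as the "no predicate output" value are assumptions made for illustration, and set_split_pdst() is a hypothetical stand-in rather than the actual setPDSTL().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field positions, chosen only for illustration. */
#define PDST_LO_SHIFT 8  /* low two bits of the predicate id, in word 0 */
#define PDST_HI_SHIFT 26 /* high bit of the predicate id, in word 1 */
#define PRED_NONE 7u     /* assumed encoding for "no predicate output" */

/* Write a 3-bit predicate destination whose bits are not contiguous.
 * A negative pred_id means the instruction has no predicate output. */
static void set_split_pdst(uint32_t code[2], int pred_id)
{
    assert(pred_id < 7);
    uint32_t pred = pred_id >= 0 ? (uint32_t)pred_id : PRED_NONE;

    code[0] |= (pred & 0x3u) << PDST_LO_SHIFT;
    /* Move bit 2 down to bit 0 before shifting it into place, so the
     * highest bit lands exactly at PDST_HI_SHIFT. */
    code[1] |= ((pred >> 2) & 0x1u) << PDST_HI_SHIFT;
}

/* Decode the field again; useful for checking the encoder. */
static unsigned get_split_pdst(const uint32_t code[2])
{
    uint32_t lo = (code[0] >> PDST_LO_SHIFT) & 0x3u;
    uint32_t hi = (code[1] >> PDST_HI_SHIFT) & 0x1u;
    return (hi << 2) | lo;
}
```

A single contiguous shift of the whole 3-bit id would put the highest bit in the wrong place, which is why a dedicated helper is needed for this layout.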
Boyan Ding
854554c314 gm107/ir: Emit third src 'bound' and optional predicate output of SHFL
v2: Emit the original hard-coded 0x1c03 when OP_SHFL is used in gm107's
    lowering (Samuel Pitoiset)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:30 -04:00
Michel Dänzer
a981e68c26 clover: Fix build against clang SVN >= r299965
clang::LangAS::Offset is gone, the behaviour is as if it was 0.

v2: Introduce and use clover::llvm::compat::lang_as_offset (Francisco
    Jerez)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-13 12:51:24 +09:00
Brian Paul
46f49d6fdc st/mesa: add some _mesa_is_winsys_fbo() assertions
A few functions related to FBOs/renderbuffers should only be used with
window-system buffers, not user-created FBOs.  Assert for that.
Add additional comments.  No piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-12 21:13:23 -06:00
Brian Paul
c36d224921 st/mesa: minor optimization in st_DrawBuffers()
We only do on-demand renderbuffer allocation for window-system FBOs,
not user-created FBOs.  So put the loop inside a conditional.

Plus, add some comments.  No piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-12 21:13:23 -06:00
Timothy Arceri
fbcd709a34 mesa/st: only update samplers for stages that have changed
Might help reduce CPU overhead for some apps that use SSO (separate shader objects).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 12:08:31 +10:00
Vinson Lee
f30f575e7b st/mesa: Fix missing-braces warning.
CXX      state_tracker/st_glsl_to_nir.lo
state_tracker/st_glsl_to_nir.cpp:250:57: warning: suggest braces around initialization of subobject [-Wmissing-braces]
      nir_lower_wpos_ytransform_options wpos_options = {0};
                                                        ^
                                                        {}

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-12 15:43:30 -07:00
Alex Smith
4603bea1aa radv: Disable primitive restart for non-indexed draws
According to the Vulkan spec, VkPipelineInputAssemblyStateCreateInfo's
primitiveRestartEnable flag should only apply to indexed draws. However,
it was being enabled regardless of the type of draw. This could cause
problems for non-indexed draws with >=65535 vertices if the previous
indexed draw used 16-bit indices.

Fixes corruption of the credits text in Mad Max.

v2: Reset primitive restart state after executing a secondary command
    buffer.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-12 20:58:41 +02:00
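The rule from 4603bea1aa can be written as a small predicate. This is a minimal sketch, not radv code: struct draw_info and effective_primitive_restart() are hypothetical names, and the only behaviour modelled is that the pipeline flag takes effect for indexed draws alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a single draw call. */
struct draw_info {
    bool indexed;          /* true for indexed draws */
    uint32_t vertex_count; /* number of vertices or indices */
};

/* Primitive restart is programmed only when the pipeline requests it
 * AND the draw is indexed. Non-indexed draws always get it disabled,
 * so a leftover 16-bit restart index cannot match a vertex number. */
static bool effective_primitive_restart(bool pipeline_restart_enable,
                                        const struct draw_info *draw)
{
    assert(draw != NULL);
    return pipeline_restart_enable && draw->indexed;
}
```

Evaluating this per draw, rather than once per pipeline bind, is what keeps a large non-indexed draw from inheriting restart state left behind by an earlier indexed draw.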
Matt Turner
ab18578b03 anv: Only define wsi_cbs when VK_USE_PLATFORM_WAYLAND_KHR defined
2017-04-12 11:00:39 -07:00
Marek Olšák
f7b1371d2d Revert "r600g: get rid of dummy pixel shader"
This reverts commit 61e47d92c5.

It causes a hang on RS780.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100663
2017-04-12 17:46:21 +02:00
Bartosz Tomczyk
bb847e78cf mesa: fix memory leak in arb_fragment_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-12 17:50:36 +10:00
Bas Nieuwenhuizen
c4d43388c0 radv: Hash the immutable samplers.
Since the shader code can include them.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-12 07:43:38 +02:00
Bas Nieuwenhuizen
bd91caf863 radv: Use an offset instead of pointers for immutable samplers.
Makes more sense when we hash the layout for the pipeline cache.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-12 07:43:25 +02:00
Bas Nieuwenhuizen
b35b5951fc radv: Stop shadowing the result in radv_GetQueryPoolResults.
The outer result was referred to, which meant bugs.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-12 07:38:58 +02:00
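The kind of bug fixed by b35b5951fc is easy to reproduce in isolation. The sketch below is an assumption about the general pattern, not a copy of radv_GetQueryPoolResults: the enum, the function names, and the loop body are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum status { STATUS_OK = 0, STATUS_NOT_READY = 1 };

/* Buggy variant: the inner declaration shadows the outer `result`, so
 * the value computed in the loop never reaches the return statement. */
static enum status collect_shadowed(const int *available, size_t n)
{
    enum status result = STATUS_OK;
    assert(available != NULL);
    for (size_t i = 0; i < n; i++) {
        enum status result = available[i] ? STATUS_OK : STATUS_NOT_READY;
        (void)result; /* inner variable dies at the end of the iteration */
    }
    return result; /* always STATUS_OK */
}

/* Fixed variant: a single `result` is updated inside the loop. */
static enum status collect_fixed(const int *available, size_t n)
{
    enum status result = STATUS_OK;
    assert(available != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!available[i])
            result = STATUS_NOT_READY;
    }
    return result;
}
```

Building with -Wshadow makes GCC and Clang report the inner declaration in the first variant.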
Bas Nieuwenhuizen
0763453291 radv: Return VK_NOT_READY if the query results are not available.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 8475a14302 ("radv: Implement pipeline statistics queries.")
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2017-04-12 07:38:58 +02:00
Bas Nieuwenhuizen
2dacb727c2 radv: Set query availability bit even if we don't wait.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 8475a14302 ("radv: Implement pipeline statistics queries.")
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2017-04-12 07:38:58 +02:00
Gregory Hainaut
03d1de387e mesa: avoid NULL ptr in prog parameter name
Context: _mesa_add_parameter is sometimes[0] called with a
NULL name to denote an unnamed parameter.

Allowing a NULL pointer as a name means that it must be NULL-checked
on each access. So far that isn't always[1] the case.

The parameter name is only used for debugging purposes (printf) and
to look up the index/location of the program by the application.

In conclusion, there is no valid reason to use a NULL pointer instead of
an empty string. So it was decided to use an empty string, which avoids all
issues related to NULL pointers.

[0]: texture gather offsets glsl opcode and st_init_atifs_prog
[1]: at least shader cache, st_nir_lookup_parameter_index and some printfs

Issue found by piglit 'texturegatheroffsets' tests on Nouveau

v4: new patch based on Nicolai/Timothy/Ilia discussion
Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-12 14:30:28 +10:00
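The approach chosen in 03d1de387e, storing an empty string instead of NULL, can be sketched as follows. The struct and both helpers are hypothetical and far simpler than Mesa's parameter list; the point is only that later users can call strcmp() without a NULL check.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified parameter record. */
struct param {
    char *name; /* never NULL once param_set_name() has succeeded */
};

/* Store a private copy of the name; a NULL name becomes "". */
static int param_set_name(struct param *p, const char *name)
{
    assert(p != NULL);
    const char *src = name ? name : "";
    size_t len = strlen(src);
    char *copy = malloc(len + 1);
    if (!copy)
        return -1;
    memcpy(copy, src, len + 1); /* includes the terminating NUL */
    p->name = copy;
    return 0;
}

/* Lookup needs no NULL checks because every stored name is a string. */
static int param_lookup(const struct param *params, size_t count,
                        const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(params[i].name, name) == 0)
            return (int)i;
    }
    return -1;
}
```

The normalisation happens once, at insertion time, so every reader of the name benefits without having to be audited individually.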
Kenneth Graunke
754b961f38 i965/drm: Use bools for a few flags.
These one bit values are booleans.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
44ecbbebe2 i965/drm: Make brw_bo_alloc_tiled flags parameter 32-bit.
unsigned long is a terrible type for a bitfield - if you need fewer
than 32 bits, it wastes 4 bytes.  If you need more, things break on
32-bit builds.  Just use unsigned.

Even that's a bit ridiculous as we only have one flag today.
Still, it's at least somewhat better.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
f374b9449e i965/drm: Make BO size a uint64_t rather than unsigned long.
The drm_i915_gem_create ioctl structure uses a __u64 for the size,
so we should probably use uint64_t to match.  In theory, we could
probably have a BO larger than 4GB, using a 48-bit PPGTT - it just
wouldn't be mappable in the CPU's 32-bit address space.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
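The motivation for f374b9449e can be shown with a short truncation example. This is a sketch under one assumption: uint32_t stands in for `unsigned long` as it behaves on a 32-bit (ILP32) build. The function names are hypothetical and unrelated to the brw_bo API.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for `unsigned long` on a 32-bit (ILP32) build. */
typedef uint32_t ulong_ilp32;

/* Models passing a BO size through a 32-bit `unsigned long` parameter. */
static uint64_t bo_size_via_ulong(uint64_t requested)
{
    assert(requested > 0);
    ulong_ilp32 size = (ulong_ilp32)requested; /* high 32 bits are dropped */
    return size;
}

/* Models the same path with a uint64_t parameter, matching the __u64
 * size field of the kernel ioctl structure. */
static uint64_t bo_size_via_u64(uint64_t requested)
{
    assert(requested > 0);
    return requested;
}
```

A 5 GiB request survives the uint64_t path intact but comes out as 1 GiB through the 32-bit path, with no diagnostic at run time.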
Kenneth Graunke
c85d6832fd i965/drm: Make alignment parameter a uint64_t.
Theoretically, with a 48-bit address space, we could have buffers
with an alignment of >= 4GB.  It's a bit silly, but the exec_object
structs (drm_i915_gem_exec_object2) use a __u64 for this, so we may
as well use the same type as the kernel API.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
444ab8126d i965/drm: Make stride/pitch a uint32_t.
struct drm_i915_gem_set_tiling's stride field is a __u32.
intel_mipmap_tree::stride is a uint32_t.  Using unsigned long just
doesn't make sense.  Switching also lets us drop many pointless
locals that only existed to deal with the type mismatch.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
14fc188460 i965/drm: Fix types for pwrite/pread fields.
The ioctl structs contain __u64 offset and size fields, so make them
uint64_t rather than unsigned long.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
193601311c i965/drm: Make brw_bo_alloc_tiled take tiling by value, not pointer.
For some reason we passed tiling by pointer, through several layers,
even though the functions only read the initial value, and never
actually change it.  We even had a do-while loop that executed until
the tiling mode matched - except it always did, so it only ran once.
We then had bogus error handling in case it changed the tiling mode
to something nonsensical...which it never did.

Drop all this nonsense.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Timothy Arceri
9bd7184078 mesa/st: remove _mesa_get_fallback_texture() calls
These calls look like leftovers from fallback texture support first
being added to the st in 8f6d9e12be and then later being added
to core mesa in 00e203fe17.

The piglit test fp-incomplete-tex continues to work with this
change.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-12 12:00:35 +10:00
Timothy Arceri
c72170fb1f mesa: use pre_hashed version of search for the mesa hash table
The key is just an unsigned int so there is never any real hashing
done.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-04-12 12:00:35 +10:00
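The idea behind c72170fb1f, skipping the hash function when the key is already an integer, can be sketched with a toy open-addressing table. Everything here is hypothetical and self-contained: it is not Mesa's hash table, and the fixed 16-slot size and linear probing are simplifications.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16u /* power of two, toy-sized */

struct entry {
    uint32_t key;
    void *data;
    int used;
};

struct table {
    struct entry slots[TABLE_SIZE];
};

static void table_init(struct table *t)
{
    for (uint32_t i = 0; i < TABLE_SIZE; i++)
        t->slots[i].used = 0;
}

/* For an unsigned int key the "hash" is the key itself. */
static uint32_t key_hash(uint32_t key)
{
    return key;
}

/* Insert using a hash the caller already has; linear probing. */
static void insert_pre_hashed(struct table *t, uint32_t hash,
                              uint32_t key, void *data)
{
    for (uint32_t probe = 0; probe < TABLE_SIZE; probe++) {
        struct entry *e = &t->slots[(hash + probe) & (TABLE_SIZE - 1)];
        if (!e->used || e->key == key) {
            e->key = key;
            e->data = data;
            e->used = 1;
            return;
        }
    }
    assert(!"table full");
}

/* Search using a hash the caller already has; no hashing is done here. */
static struct entry *search_pre_hashed(struct table *t, uint32_t hash,
                                       uint32_t key)
{
    for (uint32_t probe = 0; probe < TABLE_SIZE; probe++) {
        struct entry *e = &t->slots[(hash + probe) & (TABLE_SIZE - 1)];
        if (!e->used)
            return NULL; /* empty slot ends the probe sequence */
        if (e->key == key)
            return e;
    }
    return NULL;
}
```

Because key_hash() is the identity, the pre-hashed entry points simply avoid an indirect call through a hash callback on every lookup.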
Tim Rowley
d0f381f865 swr: [rasterizer core] Disable 8x2 tile backend
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
31a23a9d9d swr: [rasterizer common] Add _simd_testz_si alias
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
7abd1f9b24 swr: [rasterizer archrast] Fix archrast for MSVC 2017 compiler
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
54d11b3c95 swr: [rasterizer jitter] Remove unused function
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
af909c0200 swr: [rasterizer jitter] Remove HAVE_LLVM tests supporting llvm < 3.8
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
973d38801d swr: [rasterizer common/core] Fix 32-bit windows build
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
217b791a44 swr: [rasterizer core] Fix unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
da7aa39f93 swr: [rasterizer core] Code formatting change
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
c8cc07ca25 swr: [rasterizer core] SIMD16 Frontend WIP - PA
Fix PA NextPrim for SIMD8 on SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
08a7136848 swr: [rasterizer core] SIMD16 Frontend WIP - Clipper
Implement widened clipper for SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
0033e86b2c swr: [rasterizer core] Multisample sample position setup change
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
4c093869db swr: [rasterizer core] Reduce templates to speed compile
Quick patch to remove some unused template params to cut down
rasterizer compile time.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Francisco Jerez
147e71242c i965/fs: Take into account lower frequency of conditional blocks in spilling cost heuristic.
The individual branches of an if/else/endif construct will be executed
some unknown number of times between 0 and 1 relative to the parent
block.  Use some factor in between as weight while approximating the
cost of spill/fill instructions within a conditional if-else branch.
This favors spilling registers used within conditional branches which
are likely to be executed less frequently than registers used at the
top level.

Improves the framerate of the SynMark2 OglCSDof benchmark by ~1.9x on
my SKL GT4e.  Should have a comparable effect on other platforms.  No
significant regressions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-11 15:28:54 -07:00
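The weighting described in 147e71242c can be sketched on a flat instruction list. This is an illustration only: the instruction representation is hypothetical, and the factor 0.5 is an assumed value for "some factor in between" 0 and 1. An else marker would leave the scale unchanged, so it is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal instruction representation. */
enum op { OP_ALU, OP_IF, OP_ENDIF };

struct inst {
    enum op op;
    int reg; /* register touched by an OP_ALU instruction */
};

#define IF_WEIGHT 0.5f /* assumed weight for code inside a conditional */

/* Accumulate an estimated spill cost per register. Each use adds the
 * current block scale, so uses nested inside if/endif count for less. */
static void estimate_spill_costs(const struct inst *insts, size_t n,
                                 float *costs, size_t num_regs)
{
    float block_scale = 1.0f;

    for (size_t r = 0; r < num_regs; r++)
        costs[r] = 0.0f;

    for (size_t i = 0; i < n; i++) {
        switch (insts[i].op) {
        case OP_IF:
            block_scale *= IF_WEIGHT; /* entering a conditional block */
            break;
        case OP_ENDIF:
            block_scale /= IF_WEIGHT; /* leaving it again */
            break;
        case OP_ALU:
            assert(insts[i].reg >= 0 && (size_t)insts[i].reg < num_regs);
            costs[insts[i].reg] += block_scale;
            break;
        }
    }
}
```

Registers used only inside conditionals end up with the lowest estimated cost, which makes them the preferred spill candidates in this model.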
Tim Rowley
9a7b257450 swr: return true for PIPE_CAP_DOUBLES
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-11 13:16:43 -05:00
Kenneth Graunke
02ccd8f52c i965: Set kernel features before computing max GL version.
We check these bitfields when computing the Haswell max GL version.
We need to set them ahead of time, or they won't exist, and all our
checks will fail.  That sets the max core profile GL version to 4.2.

This introduces the bizarre situation where asking for a GL context
with version 4.3+ fails, but asking for a GL core profile context
with version <= 4.2 actually promotes you to a 4.5 context.

GLX_MESA_query_renderer also reported the bogus 4.2 value.
Now it shows 4.5.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reported-and-tested-by: Rafael Ristovski <rafael.ristovski@gmail.com>
2017-04-11 08:58:16 -07:00
Juan A. Suarez Romero
8d7a82ae32 anv: remove needless VALGRIND_MAKE_MEM_DEFINED
This is already invoked in the following VG_NOACCESS_READ() call.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-11 17:21:57 +02:00
Lucas Stach
4ee7c2c284 etnaviv: enable TS, but disable autodisable
Autodisable seems to cause missed rendering in some cases, but
otherwise TS seems to work properly.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-04-11 16:52:31 +02:00