Commit graph

64121 commits

Author SHA1 Message Date
Brian Paul
7844263f07 svga: remove unneeded depth==1 assertion in svga_texture_view_surface()
We can create 3D texture views.  Avoids an assertion in piglit
fbo-generatemipmap-3d test and allows it to pass.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-07-29 09:16:23 -06:00
José Fonseca
66a1b3a1da st/wgl: Clamp wglChoosePixelFormatARB's output nNumFormats to nMaxFormats.
While running https://github.com/nvMcJohn/apitest with apitrace I noticed that Mesa was producing bogus results:

  wglChoosePixelFormatARB(hdc, piAttribIList = {...}, pfAttribFList = &0, nMaxFormats = 1, piFormats = {19, 65576, 37, 198656, 131075, 0, 402653184, 0, 0, 0, 0, -573575710}, nNumFormats = &12) = TRUE

However https://www.opengl.org/registry/specs/ARB/wgl_pixel_format.txt states

    <nNumFormats> returns the number of matching formats. The returned
    value is guaranteed to be no larger than <nMaxFormats>.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-07-29 15:41:32 +01:00
Michel Dänzer
8d0a1a6bc0 gallium/radeon: Add some Emacs .dir-locals.el files
Based on the toplevel one but adapted to the driver/winsys coding styles.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-29 17:59:13 +09:00
Chia-I Wu
9a53f941c7 ilo: fix fb height of HiZ ops
It was set to aligned width.  It appears to be fine on GEN7+, but causes
random hangs on GEN6.
2014-07-29 10:24:59 +08:00
Tapani Pälli
76b11d15d3 glapi: add indexed blend functions (GL 4.0)
This makes some of the UE4 engine demos (Stylized, Mobile Temple)
render correctly, tested on Intel Haswell machine.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78716
2014-07-28 16:26:27 -07:00
Marek Olšák
a9528cef6b r600g,radeonsi: switch all occurences of array_size to util_max_layer
This fixes 3D texture support in all these cases, because array_size is 1
with 3D textures and depth0 actually contains the "array size".
util_max_layer is universal and returns the last layer index for any texture
target.

A lot of the cases below can't actually be hit with 3D textures, but let's
be consistent.

This fixes a failure in:
    piglit layered-rendering/clear-color-all-types 3d single_level
for r600g and radeonsi, which was caused by an incorrect CMASK size
calculation.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-28 23:57:08 +02:00
Marek Olšák
71ce92200e radeonsi: fix occlusion queries on Hawaii
This was just a guess - and it worked!

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-07-28 23:57:08 +02:00
Marek Olšák
156b7e244c winsys/radeon: fix vram_size overflow with Hawaii
This fixes piglit spec/!OpenGL 3.1/minmax.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-07-28 23:57:08 +02:00
Marek Olšák
0e7f56313d radeonsi: fix a hang with streamout on Hawaii
I actually couldn't reproduce this one, but internal docs recommend this
workaround. Better safe than sorry.

Also, the number of dwords for the sync packets is increased by 4 instead
of 2, because it wasn't bumped last time when a new packet was added there.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-07-28 23:57:08 +02:00
Marek Olšák
3d9e87406c radeonsi: fix a hang with instancing on Hawaii
This fixes "piglit/bin/arb_transform_feedback2-draw-auto instanced".

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-07-28 23:57:08 +02:00
Marek Olšák
c7407b94a8 gallium/util: add a helper for calculating primitive count from vertex count
This is needed by the following commit which is a candidate for stable too.

Cc: mesa-stable@lists.freedesktop.org
2014-07-28 23:57:08 +02:00
Marek Olšák
9b046474c9 radeonsi: fix CMASK and HTILE calculations for Hawaii
This fixes the checkerboard pattern in glxgears and anything that triggers
fast color clear.

num_channels is always <= 8, but Hawaii has 16 pipes.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-07-28 23:57:08 +02:00
Marek Olšák
ecbd3a545a r600g,radeonsi: add debug flags which disable tiling
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-07-28 23:57:08 +02:00
Marek Olšák
04f2c88f45 gallium: rename shader cap MAX_CONSTS to MAX_CONST_BUFFER_SIZE
This new name isn't so confusing.

I also changed the gallivm limit, because it looked wrong.

Reviewed-by: Brian Paul <brianp@vmware.com>

v2: use sizeof(float[4])
2014-07-28 23:57:08 +02:00
Marek Olšák
d5bcb5e8de r600g: switch SNORM conversion to DX and GLES behavior
it also matches GL 4.2

further discussion:
http://lists.freedesktop.org/archives/mesa-dev/2013-August/042680.html

Cc: mesa-stable@lists.freedesktop.org
2014-07-28 23:57:08 +02:00
Tom Stellard
5fe20592d4 util: Fix typo
Spotted by okias on IRC.
2014-07-28 16:40:05 -04:00
Chia-I Wu
cc1e1da24a ilo: correctly propagate resource renames to hardware
Not only should we mark states dirty when the underlying resource is renamed,
we should also update the CSO bo when available.
2014-07-28 23:55:55 +08:00
Chia-I Wu
fb1820355b ilo: add ilo_resource_get_bo() helper
We will need it in the following commit.
2014-07-28 23:55:55 +08:00
Tom Stellard
6f0c1f2b5f radeonsi: Use util_memcpy_cpu_to_le32()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-28 10:14:28 -04:00
Tom Stellard
f0e0737922 util: Add util_memcpy_cpu_to_le32() v3
v2:
  - Preserve word boundaries.

v3:
  - Use const and restrict.
  - Fix indentation.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-28 10:10:43 -04:00
Tom Stellard
3d636b4785 clover: Add checks for image support to the image functions v2
Most image functions are required to return a CL_INVALID_OPERATION
error when used on devices without image support.

v2:
  - Simplified the code

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-07-28 10:10:30 -04:00
Bruno Jiménez
7f96bea5bc r600g/compute: Add debug information to promote and demote functions
v2: Add information about the item's starting point and size
v3: Rebased on top of master

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-07-28 10:10:20 -04:00
Bruno Jiménez
e7715126f7 r600g/compute: Add documentation to compute_memory_pool
v2: Rebased on top of master

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-07-28 10:09:46 -04:00
Chia-I Wu
717e3b1ca1 ilo: unblock an inline write with a staging bo
This should allow a deeper pipeline.
2014-07-28 22:57:22 +08:00
Chia-I Wu
7395432f2e ilo: try unblocking a transfer with a staging bo
When mapping a busy resource with PIPE_TRANSFER_DISCARD_RANGE or
PIPE_TRANSFER_FLUSH_EXPLICIT, we can avoid blocking by allocating and mapping
a staging bo, and emit pipelined copies at proper places.  Since the staging
bo is never bound to GPU, we give it packed layout to save space.
2014-07-28 22:57:22 +08:00
Chia-I Wu
0a0e57b070 ilo: enable persistent and coherent transfers
Enable PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT and reorder caps a bit.
2014-07-28 22:57:22 +08:00
Chia-I Wu
b02e993d8c ilo: drop ptr from ilo_transfer
With the recent clean-ups, we can pass the mapped pointer around between
functions cleanly.  Drop it to make ilo_transfer smaller.
2014-07-28 22:57:22 +08:00
Chia-I Wu
b1dd54d9fe ilo: s/TRANSFER_MAP_UNSYNC/TRANSFER_MAP_GTT_UNSYNC/
It maps to drm_intel_gem_bo_map_unsynchronized(), which results in
unsynchronized GTT mapping.
2014-07-28 22:57:22 +08:00
Chia-I Wu
2a82bb30e8 ilo: drop unused context param from transfer functions
Many of the transfer functions do not need an ilo_context.  Drop it.
2014-07-28 22:57:22 +08:00
Chia-I Wu
8abf6c06e8 ilo: tidy up transfer mapping/unmapping
Add xfer_map() to replace map_bo_for_transfer().  Add xfer_unmap() and
xfer_alloc_staging_sys() to simplify texture and buffer mapping/unmapping, and
enable more code sharing between them.
2014-07-28 22:57:22 +08:00
Chia-I Wu
2f4bed0405 ilo: tidy up choose_transfer_method()
Add a bunch of helper functions and a big comment for
choose_transfer_method().  This also fixes handling of
PIPE_TRANSFER_MAP_DIRECTLY to not ignore tiling.
2014-07-28 22:57:22 +08:00
Chia-I Wu
91656eb375 ilo: free transfers with util_slab_free()
We used FREE() in one of the error path.
2014-07-28 22:57:22 +08:00
EdB
1d3e06c216 clover: Add clUnloadPlatformCompiler.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-07-28 14:46:44 +02:00
EdB
39869423cb clover: Add clCreateProgramWithBuiltInKernels.
[ Francisco Jerez: Check for devices not associated with the specified
  context.  Style fix. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-07-28 14:45:29 +02:00
Jordan Justen
be8bc588b9 glsl/cs: Add several GLSL compute shader variables
With MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader, this fixes piglit:
built-in-constants tests/spec/arb_compute_shader/minimum-maximums.txt

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-07-27 17:59:28 -07:00
Jordan Justen
12029046a2 main/cs: Add additional compute shader constant values
With MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader, this fixes piglit:
* arb_compute_shader-minmax

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-07-27 17:58:58 -07:00
Chris Forbes
74e100affc glsl: No longer require ubo block index to be constant in ir_validate
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-26 16:46:03 +12:00
Chris Forbes
be237a6129 glsl: Accept nonconstant array references in lower_ubo_reference
Instead of falling back to just the block name (which we won't find),
look for the first element of the block array. We'll deal with the rest
in the backend by arranging for the blocks to be laid out contiguously.

V2: Squashed together patches 3, 5 of V1, plus a naming tweak.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-26 16:46:03 +12:00
Chris Forbes
c59802d3a1 glsl: Convert uniform_block in lower_ubo_reference to ir_rvalue.
Previously this was a block index with special semantics for -1.
With ARB_gpu_shader5, this need not be a compile-time constant, so
allow any rvalue here and convert the -1 to a NULL pointer.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-26 16:46:03 +12:00
Chris Forbes
9c90a63378 glsl: Mark entire UBO array active if indexed with non-constant.
Without doing a lot more work, we have no idea which indices may
be used at runtime, so just mark them all.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-26 16:46:03 +12:00
Chris Forbes
8eae5ceb99 glsl: Allow non-constant UBO array indexing with GLSL4/ARB_gpu_shader5.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-26 16:46:03 +12:00
Chia-I Wu
4714c4ec48 ilo: simplify ilo_flush()
Move fence creation to the new ilo_fence_create().
2014-07-26 12:30:39 +08:00
Bruno Jiménez
654fd3e33f r600g/compute: Defrag the pool at the same time as we grow it
This allows us two things: we now need less item copies when we have
to defrag+grow the pool (to just one copy per item) and, even in the
case where we don't need to defrag the pool, we reduce the data copied
to just the useful data that the items use.

Note: The fallback path is a bit ugly now, but hopefully we won't need
it much.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-07-25 17:51:57 -04:00
Bruno Jiménez
4ca04f3112 r600g/compute: Try to use a temporary resource when growing the pool
Now, before moving everything to host memory, we try to create a
new resource to use as a pool. I we succeed we just use this resource
and delete the previous one. If we fail we fallback to using the
shadow.

This should make growing the pool faster, and we can also save
64KB of memory that were allocated for the 'shadow', even if they
weren't used.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-07-25 17:51:57 -04:00
Rob Clark
5eb11eb192 freedreno: fix typo in gpu version check
Opps, I should use larger fonts, I guess.

Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-07-25 14:29:02 -04:00
Rob Clark
db193e5ad0 freedreno/ir3: split out shader compiler from a3xx
Move the bits we want to share between generations from fd3_program to
ir3_shader.  So overall structure is:

  fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3
                                    |- ...
                                    \- ir3_shader_variant -> ir3

So the ir3_shader becomes the topmost generation neutral object, which
manages the set of variants each of which generates, compiles, and
assembles it's own ir.

There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/,
etc.

Keep the split between the gallium level stateobj and the shader helper
object because it might be a good idea to pre-compute some generation
specific register values (ie. anything that is independent of linking).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-07-25 13:29:28 -04:00
Rob Clark
7d7e6ae9c3 freedreno/a3xx/compiler: rename ir3_shader to ir3
First step of reoganization split out compiler (so it can be shared
between a3xx and a4xx).  Rename ir3_shader -> ir3 (since we'll want
the name ir3_shader for a higher level object).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-07-25 13:29:28 -04:00
Rob Clark
faaeddb55e freedreno/a3xx/compiler: scheduler vs pred reg
The scheduler also needs to be aware of predicate register (p0) in
addition to address register (a0).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-07-25 13:29:28 -04:00
Rob Clark
9f391322a0 freedreno/a3xx/compiler: little cleanups
Remove some obsolete comments, rename deref->addr.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-07-25 13:29:28 -04:00
Rob Clark
d48faad3c2 freedreno/a3xx: enable/disable wa's based on patch-level
It seems like for the most part, different behaviors, workarounds, etc,
should be conditional on GPU patch revision (ie. a320.0 vs a320.2)
rather than GPU id (a320 vs a330).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-07-25 13:29:28 -04:00