Commit graph

32650 commits

Author SHA1 Message Date
Neil Roberts
6d87500fe1 dri: Change __DriverApiRec::CreateContext to take a struct for attribs
Previously the CreateContext method of __DriverApiRec took a set of
arguments to describe the attribute values from the window system API's
CreateContextAttribs function. As more attributes get added this could
quickly get unworkable and every new attribute needs a modification for
every driver.

To fix that, pass the attribute values in a struct instead.  The struct
has a bitmask to specify which members are used. The first three members
(two for the GL version and one for the flags) are always set. If the
bit is not set in the attribute mask then it can be assumed the
attribute has the default value. Drivers will error if unknown bits in
the mask are set.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
2017-11-06 16:09:02 -05:00
Wladimir J. van der Laan
96463614a3 etnaviv: Don't over-pad compressed textures
HALIGN_FOUR/SIXTEEN has no meaning for compressed textures, and we can't
render to them anyway. So use the tightest possible packing. This
avoids bugs with non-power-of-two block sizes.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-06 21:31:20 +01:00
Wladimir J. van der Laan
93ba3f29bb etnaviv: ASTC texture support
Add ASTC texture support for hardware that supports this
(currently only GC3000 on i.MX6qp is known to have this).

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-06 21:30:54 +01:00
Wladimir J. van der Laan
f1e1c60ff6 etnaviv: Update from rnndb
Updated as of etnav_viv commit 3b4a8ec.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-11-06 21:29:19 +01:00
Marek Olšák
71f5fe36b7 gallium/u_vbuf: use signed vertex buffers offsets for optimal uploads
Uploaded data must start at (stride * start), because we can't modify
start in all cases. If it's the first allocation, it's also the amount
of memory wasted. If the starting offset is larger than the size of
the upload buffer, the buffer is re-created, used for 1 upload, and then
thrown away. If the upload is small, most of the buffer space is unused
and wasted. Keep doing that and the OOM killer comes. It's actually
pretty quick.

With signed VB offsets, we can set min_out_offset = 0
in u_upload_alloc/u_upload_data.

This fixes OOM situations with SPECviewperf.
2017-11-06 19:09:12 +01:00
Marek Olšák
3f58988b81 radeonsi: enable signed vertex buffer offsets 2017-11-06 19:09:12 +01:00
Marek Olšák
24d6318d24 gallium: add PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET 2017-11-06 19:09:12 +01:00
Marek Olšák
adab7f16ff radeonsi: don't map big VRAM buffers for the first upload directly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-06 16:23:20 +01:00
Marek Olšák
4b0dc098b2 gallium/u_threaded: don't map big VRAM buffers for the first upload directly
This improves Paraview "many spheres" performance 4x along with the radeonsi
commit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-06 16:23:20 +01:00
Marek Olšák
a5d3999c31 gallium/u_threaded: clean up tc_improve_map_buffer_flags and prevent reentry
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-06 16:23:20 +01:00
Pierre Moreau
b041687ed1 nv50,nvc0: Display shared memory usage in pipe_debug_message
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
2017-11-04 14:12:07 -04:00
Pierre Moreau
efe532b739 nv50,nvc0: Copy shared memory per block to the program info structure and back
In OpenCL/CUDA kernels, shared memory usage can be defined within the
kernel code. Those usage will only be picked up while parsing the
SPIR-V, during the translation phase of the program.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
2017-11-04 14:12:07 -04:00
Pierre Moreau
49752e99f8 nv50/ir: Store shared memory per block in nv50_ir_prog_info
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
2017-11-04 14:12:07 -04:00
Andrey Grodzovsky
19fc3cdcfb winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx.
Fixes reverted patch f03b7c9 by doing VMID reservation per
process and not per context.
Also updates required amdgpu libdrm version since the change
involved interface updates in amdgpu libdrm.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-11-03 18:06:17 +01:00
Dave Airlie
0722b6d693 i915g: remove some unknown cap warnings. 2017-11-03 15:03:30 +10:00
Dave Airlie
cc69f2385e i915g: make gears run again.
We need to validate some structs exist before we dirty the states, and
avoid the problem in some other places.

Fixes: e027935a7 ("st/mesa: don't update unrelated states in non-draw calls such as Clear")
2017-11-03 15:03:30 +10:00
Timothy Arceri
439a2febc4 ac/radeonsi: add support for tex instr without a derefence
These are produced by nir_lower_bitmap(), adding the missing derefence
would cause other issues that need to be hacked around such as
skipping sampler lowering and uniform location assignment, so this
change seems the correct way to go.

Fixes 194 piglit crashes on radeonsi using NIR.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:19:51 +11:00
Dave Airlie
de126b0402 r600: add support for early depth/stencil.
This add support for the early depth/stencil property found
on image shaders.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:37 +10:00
Dave Airlie
f3c6149c26 r600: add support for emitting RAT instructions to the assembler.
This adds support for emitting RAT instructions to the assembler.
RAT instructions are used to implement image accessors.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:33 +10:00
Dave Airlie
159bf38c3a r600: add support for mark bit to the assembler.
This adds support to the assembler for the mark bit
 on the export word1.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:30 +10:00
Dave Airlie
90ca378080 r600: add support for valid pixel mode on CF clauses
This just adds support to the assembler for setting the valid
pixel mode on the CF clause.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:33:26 +10:00
Dave Airlie
d584b4671f r600: add support for some ALU sources.
These special ALU sources provide the shader engine,
simd and hw wave ids.

These are required for images support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-03 09:31:50 +10:00
Marek Olšák
529cdce799 radeonsi: remove 'Authors:' comments
It's inaccurate. Instead, see the copyright and use "git log" and
"git blame" to know the authorship.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-02 18:19:03 +01:00
Tim Rowley
0023b5ae67 gallivm: allow arch rounding with avx512
Fixes piglit vs-roundeven-{float,vec[234]} with simd16 VS.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-11-02 10:24:54 -05:00
Wladimir J. van der Laan
0ba4320d94 etnaviv: Allow clearing constant buffer using buffer==NULL user_buffer==NULL
Prevents an assertion when using GALLIUM_HUD with ioquake3,
when cso_restore_constant_buffer_slot0 restores an empty
constant buffer in slot 0.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-02 11:03:30 +01:00
Wladimir J. van der Laan
bc71c31842 etnaviv: Don't flush on transfer when UNSYNCHRONIZED
Structure code to only flush when we will potentially call cpu_prep. This
prevents spurious flushes in applications that heavily rely on u_uploader.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-02 11:00:26 +01:00
Wladimir J. van der Laan
8fbd82f464 etnaviv: don't do resolve-in-place without valid TS
GC3000 resolve-in-place assumes that the TS state is configured.
If it is not, this will result in MMU errors. This is especially
apparent when using glGenMipmaps().

Fixes: 78ade65956 ("etnaviv: Do GC3000 resolve-in-place when possible")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-11-02 10:58:48 +01:00
Dylan Baker
6594213cfa svga: Use __asm__ instead of asm
__asm__ is portable, and allows the svga driver to be compiled with the
c99 standard instead of requiring the gnu99 standard.

I have compile tested this with GCC and Clang on Linux.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-11-01 15:05:26 -07:00
Marek Olšák
1f2640bfa9 Revert "winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx."
This reverts commit f03b7c9ad9.

The libdrm interface is wrong.
2017-11-01 21:42:31 +01:00
Brian Paul
eedecb4eca gallium: increase pipe_sampler_view::target bitfield size for MSVC
MSVC treats enums as being signed.  The 4-bit target field isn't large
enough to correctly store the value 8 (for PIPE_TEXTURE_CUBE_ARRAY).
The bitfield value 0x8 was being interpreted as -8 so matching the
target with PIPE_TEXTURE_CUBE_ARRAY in switch statements, etc. was
failing.

To keep the structure size the same, we reduce the format field from
16 bits to 15.  There don't appear to be any other enum bitfields
which need to be adjusted.

This fixes a number of Piglit cube map array tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-11-01 11:06:02 -06:00
Dave Airlie
d3fdd66401 gallium: add cap for driver specified max combined shader resources.
Some hw (evergreen) has a limit on how many combined (images/buffers/mrts)
a fragment shader can access.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-01 10:07:03 +10:00
Gert Wollny
69eee511c6 r600/sb: bail out if prepare_alu_group() doesn't find a proper scheduling
It is possible that the optimizer ends up in an infinite loop in
post_scheduler::schedule_alu(), because post_scheduler::prepare_alu_group()
does not find a proper scheduling. This can be deducted from
pending.count() being larger than zero and not getting smaller.

This patch works around this problem by signalling this failure so that the
optimizers bails out and the un-optimized shader is used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-01 09:33:40 +10:00
Timothy Arceri
e80bbd6f52 radeonsi: fix culldist_writemask in nir path
The shared si_create_shader_selector() code already offsets the mask.

Fixes the following piglit tests:

arb_cull_distance/clip-cull-3.shader_test
arb_cull_distance/clip-cull-4.shader_test

Fixes: 29d7bdd179 (radeonsi: scan NIR shaders to obtain required info)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-01 09:41:11 +11:00
Andrey Grodzovsky
f03b7c9ad9 winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-10-31 16:55:24 +01:00
Erik Faye-Lund
cf41c19d9f meson: use dep_m in libgallium
The u_format_other.c users sqrtf, which on some systems require
a math-library. So let's make sure we link with it.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-31 08:10:37 +01:00
Eric Anholt
2a77c763fe broadcom/vc5: Force blending to treat alpha as 1 for formats without alpha.
Fixes fbo-blending-formats on RGB8 and 565.  We will still need to demote
blending to shader code in the MRT case to fix it in general, but that can
be added when we start doing 32F blending (which also needs to be done in
the shader).
2017-10-30 13:31:32 -07:00
Eric Anholt
61bb0df60e broadcom/vc5: Do BGRA vs RGBA swapping for the BLEND_CONSTANT_COLOR.
Fixes many of the fbo-blending-formats tests.
2017-10-30 13:31:32 -07:00
Eric Anholt
2e3c7beb1e broadcom/vc5: Pack clear colors according to the TLB internal format/type.
The previous packing I did got us all the R*16F and R*32F formats, where
the pipe format basically matched the TLB's format, but since the clear
color will just be memcpyed to the TLB, we should be looking at its format
for deciding how to pack.

Fixes RGB565, RGB5_A1 and RGBA10 fbo-clear-formats tests and improves
4444.
2017-10-30 13:31:32 -07:00
Eric Anholt
828299d1bd broadcom/vc5: Don't do r/b channel swapping on 565.
The HW's format actually matches the gallium format.
2017-10-30 13:31:32 -07:00
Eric Anholt
9e5df1897c broadcom/vc5: Use the proper gallium format for our RGB10_A2.
This keeps us from needing our own reswizzling of the B vs R fields.
2017-10-30 13:31:31 -07:00
Eric Anholt
2d6088f2a3 broadcom/vc5: Drop duplicated setup of clip_window_height_in_pixels. 2017-10-30 13:31:28 -07:00
Eric Anholt
1b32786de6 broadcom/vc5: Don't forget to actually turn on stencil testing.
I had the rest of stencil state set up, but forgot to actually enable it
in the higher level configuration bits packet.
2017-10-30 13:31:28 -07:00
Eric Anholt
a797f0eb63 broadcom/vc5: Set up MSAA texture type according to the internal format.
It gets most of EXT_framebuffer_multisample-formats passing, but doesn't
really work for texture views.
2017-10-30 13:31:28 -07:00
Eric Anholt
fe6fc579cb broadcom/vc5: Use the sampler view's format, not the resource's.
This should help with texture views, though I just noticed this while
reading the code.
2017-10-30 13:31:27 -07:00
Eric Anholt
0ec4b4178f broadcom/vc5: Emit raw loads for MSAA buffers.
Similar to stores, but we also need to emit dummy stores in between each
load, to flush out the previous queued load.
2017-10-30 13:31:27 -07:00
Eric Anholt
464f1fb733 broadcom/vc5: Use raw stores for MSAA buffers.
We were storing the resolved pixels in all cases, but nr_samples > 0 means
we should be keeping the per-sample values.

We will probably want to change the job structure at some point, as we'll
want to recognize full-buffer resolves and do the resolved store in the
same job as the original rendering, meaning we'll need to track both the
MSAA and single-sample resources in the job.  However, this will be enough
to build the rest of the MSAA support.
2017-10-30 13:31:27 -07:00
Eric Anholt
e717e3e7cd broadcom/vc5: Add lowering for txf_ms to a txf on a 2x2-scaled texture.
The HW has no native sampler support for multisample textures, but since
we only need to support txf_ms and the layout is UIF, we just need to
scale up the texcoords and then add in the sample.

This drops the old TEXTURE_MSAA_ADDR special uniform, since we're treating
MSAA textures as textures, rather than basically texbos like VC4 had to.
2017-10-30 13:31:27 -07:00
Eric Anholt
b1a8b3979c broadcom/vc5: Lay out MSAA textures/renderbuffers as UIF scaled by 4.
We just need to multiply width/height by 2 each, and always set them up as
UIF tiling, since that's how the TLB will store them in raw (per-sample)
mode.
2017-10-30 13:31:27 -07:00
Eric Anholt
eecdbaa985 broadcom/vc5: Add PIPE_TEX_WRAP_CLAMP support for linear-filtered textures.
I already had the texture's wrapping set up to use different behavior for
nearest or linear, so we just needed to saturate the coordinates in linear
mode to get the "proper" blend between the edge and border values.
2017-10-30 13:31:16 -07:00
Eric Anholt
e798455330 broadcom/vc5: Disable GL_ARB_transform_feedback3.
We don't seem to have a way to generally handle gl_SkipComponents.
2017-10-30 13:31:15 -07:00