37426 commits

Author SHA1 Message Date
Khaled Emara
f0fb73dcf6 freedreno: PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT unreachable statement
There seems to be a duplicate return statement,
as A2XX doesn't support shader buffers.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-04-09 17:31:06 -04:00
Boyuan Zhang
d507bcdcf2 st/va: reverse qt matrix back to its original order
The quantiser matrix that VAAPI provides has been applied with inverse z-scan.
However, what we expect in MPEG2 picture description is the original order.
Therefore, we need to reverse it back to its original order.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110257
Cc: mesa-stable@lists.freedesktop.org

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-09 10:51:03 -04:00
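
Note on d507bcdcf2: the reordering described above can be sketched as a pair of table-driven permutations. This is only an illustration, assuming the standard 8x8 zig-zag scan table; the function names are hypothetical and are not taken from st/va.

```c
#include <assert.h>
#include <stdint.h>

/* Standard 8x8 zig-zag scan: zigzag_to_raster[i] is the row-major index
 * of the i-th coefficient in zig-zag (scan) order. */
static const uint8_t zigzag_to_raster[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* Hypothetical helper: scan order -> raster order (the "inverse scan"). */
void inverse_zscan(const uint8_t scan[64], uint8_t raster[64])
{
    for (int i = 0; i < 64; i++)
        raster[zigzag_to_raster[i]] = scan[i];
}

/* Hypothetical helper: raster order -> scan order, undoing inverse_zscan(). */
void forward_zscan(const uint8_t raster[64], uint8_t scan[64])
{
    for (int i = 0; i < 64; i++)
        scan[i] = raster[zigzag_to_raster[i]];
}
```

Applying forward_zscan() to the output of inverse_zscan() returns the matrix in its original order, which is the kind of round trip the commit message describes.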
Gert Wollny
b999865f55 softpipe: Enable PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT
The offset alignment must be set to 16 because the tile cache is
implemented to require this.

This enables ARB_buffer_texture_range and OES_texture_buffer for
softpipe. The corresponding deqp-gles31 tests pass.

Also update the feature table.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-04-09 08:17:45 +00:00
Gert Wollny
8cf8dfe408 softpipe: Add an extra code path for the buffer texel lookup
With buffers the addressing is done on a per-byte basis, so the code
path for normal textures doesn't work properly. Also add an assert
to make sure that the bit count for storing the X coordinate is
large enough.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-04-09 08:17:44 +00:00
Gert Wollny
47dd7c4054 softpipe: raise number of bits used for X coordinate texture lookup
With buffers the addressing is done on a per-byte basis, and with
a maximal block size of 16 bytes we have to take into account four more
bits. For simplicity just remove the TEX_TILE_SIZE_LOG2, which is 5 bits.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-04-09 08:17:44 +00:00
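
Note on 47dd7c4054: the "four more bits" follow from log2(16) = 4. A minimal sketch of that arithmetic, using a hypothetical helper that is not part of softpipe:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bits needed to address n distinct
 * positions, i.e. ceil(log2(n)). */
unsigned bits_for_range(uint32_t n)
{
    unsigned bits = 0;

    assert(n > 0);
    /* Stop at 32 so the shift below never reaches the type width. */
    while (bits < 32 && ((uint32_t)1 << bits) < n)
        bits++;
    return bits;
}
```

With a block size of 16 bytes, bits_for_range(16) gives 4. A tile size of 32 gives 5, which matches the 5 bits quoted for TEX_TILE_SIZE_LOG2.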
Gert Wollny
11f219a5ee softpipe: Don't use mag filter for gather op
For the gather op no magnification filter is provided, so always use
the filter given for minification (which is the linear filter).

Fixes: 0dff1533f2
    softpipe: Use mag texture filter also for clamped lod == 0

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-04-09 09:50:13 +02:00
Jason Ekstrand
6279074de1 nir: Get rid of global registers
We have a pass to lower global registers to locals and many drivers
dutifully call it.  However, no one ever creates a global register ever
so it's all dead code.  It's time we bury it.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-09 00:29:36 -05:00
Dave Airlie
ff852fdc05 virgl: add support for ARB_indirect_parameters
The protocol changes are already in place for it.

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-04-09 14:25:01 +10:00
Dave Airlie
05ff2dbf13 virgl: add support for ARB_multi_draw_indirect
This will pass the multi draw through to the host if it has
support for it instead of using the st to emulate it

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-04-09 14:15:24 +10:00
Dave Airlie
316b785c59 virgl: add support for missing command buffer binding.
When I added indirect support I forgot this, however to use it
now we need to check for a new enough capability on the host side.

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
2019-04-09 14:15:12 +10:00
Caio Marcelo de Oliveira Filho
956226c8ba iris: Enable NV_compute_shader_derivatives
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-04-08 19:29:33 -07:00
Caio Marcelo de Oliveira Filho
f9b29c4a58 gallium: Add PIPE_CAP_COMPUTE_SHADER_DERIVATIVES
To enable NV_compute_shader_derivatives, which allows derivatives (and
texture lookups with implicit derivatives) in compute shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-04-08 19:29:33 -07:00
Timothy Arceri
e30804c602 nir/radv: remove restrictions on opt_if_loop_last_continue()
When I implemented opt_if_loop_last_continue() I had restricted
this pass from moving other if-statements inside the branch opposite
the continue. At the time it was causing a bunch of spilling in
shader-db for i965.

However Samuel Pitoiset noticed that making this pass more aggressive
significantly improved the performance of Doom on RADV. Below are
the statistics he gathered.

28717 shaders in 14931 tests
Totals:
SGPRS: 1267317 -> 1267549 (0.02 %)
VGPRS: 896876 -> 895920 (-0.11 %)
Spilled SGPRs: 24701 -> 26367 (6.74 %)
Code Size: 48379452 -> 48507880 (0.27 %) bytes
Max Waves: 241159 -> 241190 (0.01 %)

Totals from affected shaders:
SGPRS: 23584 -> 23816 (0.98 %)
VGPRS: 25908 -> 24952 (-3.69 %)
Spilled SGPRs: 503 -> 2169 (331.21 %)
Code Size: 2471392 -> 2599820 (5.20 %) bytes
Max Waves: 586 -> 617 (5.29 %)

The code size increase is related to Wolfenstein II; it seems largely
due to an increase in phis rather than the existing jumps.

This gives +10% FPS with Doom on my Vega56.

Rhys Perry also benchmarked Doom on his VEGA64:

Before: 72.53 FPS
After:  80.77 FPS

v2: disable pass on non-AMD drivers

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-09 11:29:41 +10:00
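
Note on e30804c602: the percentages in the shader-db totals above are relative changes, (after - before) / before. A small sketch for re-deriving them; the function name is hypothetical and this is not the shader-db report script:

```c
#include <assert.h>

/* Hypothetical helper: relative change from `before` to `after`, in percent. */
double percent_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

For example, percent_change(24701, 26367) reproduces the 6.74 % increase in spilled SGPRs, and percent_change(72.53, 80.77) puts the VEGA64 Doom result at roughly an 11.4 % gain.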
Dave Airlie
c6cf602121 softpipe: add support for vertex streams (v2)
This enables the ARB_gpu_shader5 vertex streams on softpipe.

v2: only enable when not using llvm.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-04-09 11:20:39 +10:00
Dave Airlie
7720ce32aa draw: add support to tgsi paths for geometry streams. (v2)
This hooks up the geometry shader processing to the TGSI
support added in the previous commits.

It doesn't change the llvm interface other than to
keep things building.

v2: fix some regressions caused by primitiveoffsets

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-04-09 11:19:38 +10:00
Dave Airlie
ddb9ad363d softpipe: add support for indexed queries.
We need indexed queries to retrieve the geom shader info.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-04-09 11:19:38 +10:00
Dave Airlie
00fe67c015 tgsi: add support for geometry shader streams.
This adds support to retrieve the primitive counts
for each stream, along with the offset for each
primitive into the output array.

It also adds support for parsing the stream argument
to the emit and end instructions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-04-09 11:19:38 +10:00
Dave Airlie
333746011d draw: add stream member to stats callback
This just adds space for the member to the callback, doesn't
change anything else.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-04-09 11:19:38 +10:00
Lionel Landwerlin
48e48b8560 intel: add dependency on genxml generated files
Drivers using genxml will start compilation before generated files are
created, so add a dependency to it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable@lists.freedesktop.org
2019-04-08 20:52:47 +00:00
Marek Olšák
4b63f57cbc radeonsi: fix a crash when unbinding sampler states
Acked-by: James Zhu <James.Zhu@amd.com>
2019-04-08 15:23:32 -04:00
Alyssa Rosenzweig
4209a27c61 panfrost: Remove "mali_unknown6" nonsense
This structure was used maaaany moons ago as a placeholder for the
varying meta (now unified with mali_attr_meta and essentially fully
decoded). I don't know why it's still in the file. Let's whack it.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-07 16:05:42 +00:00
Alyssa Rosenzweig
b19d1a1e63 panfrost/midgard: Enable lower_find_lsb
This is exactly what the blob does.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-07 16:01:49 +00:00
Alyssa Rosenzweig
65816ad6e8 panfrost/midgard: Add ibitcount8 op
The mechanics of this opcode are a little opaque, but essentially, it's
used in 8-bit mode to do a bit count in parallel of a uint and then
doing a ton of clever iadd/imov ops to recombine.

v2: Correct opcode. Thank you to jernej on IRC for noticing this awkward
typo!

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-07 16:01:12 +00:00
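
Note on 65816ad6e8: the idea of counting bits per 8-bit lane and then recombining with adds can be sketched on the CPU with the classic SWAR popcount. This is an analogy only; it does not model the Midgard opcode or the driver's actual iadd/imov sequence, and both function names are hypothetical.

```c
#include <stdint.h>

/* Per-byte population count: each byte of the result holds the number
 * of set bits in the corresponding byte of x. */
uint32_t bitcount8x4(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
    return x;
}

/* Recombine the four byte lanes into one 32-bit bit count. */
uint32_t bitcount32(uint32_t x)
{
    uint32_t lanes = bitcount8x4(x);

    return (lanes & 0xffu) + ((lanes >> 8) & 0xffu) +
           ((lanes >> 16) & 0xffu) + (lanes >> 24);
}
```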
Alyssa Rosenzweig
6cba9acb75 panfrost/midgard: Add ilzcnt op
Used for implementing findLSB/MSB

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-07 16:00:39 +00:00
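
Note on 6cba9acb75: one common way to build findMSB and findLSB from a leading-zero count is shown below for the unsigned case. It is a sketch under the assumption that the count returns 32 for a zero input; the helper names are hypothetical and the code is not the Midgard lowering.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit value; returns 32 for x == 0. */
int lzcnt32(uint32_t x)
{
    int n = 0;

    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Index of the most significant set bit, or -1 if x == 0. */
int find_msb_u32(uint32_t x)
{
    return 31 - lzcnt32(x);
}

/* Index of the least significant set bit, or -1 if x == 0.
 * x & (0 - x) isolates the lowest set bit. */
int find_lsb_u32(uint32_t x)
{
    return find_msb_u32(x & (0u - x));
}
```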
Alyssa Rosenzweig
2e7555b14b panfrost/midgard: Add umin/umax opcodes
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-07 15:59:05 +00:00
Alyssa Rosenzweig
d84ee49027 panfrost: Add tilebuffer load? branch
Also document branches better.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-07 15:58:44 +00:00
Alyssa Rosenzweig
7cccc89f80 panfrost/decode: Add flags for tilebuffer readback
These flags are set when reading back the tilebuffer from a fragment
shader via various mechanisms (including ARM_shader_framebuffer_fetch
and EXT_pixel_local_storage).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-07 15:58:19 +00:00
Karol Herbst
1aabb79bdc panfrost/midgard: use nir_src_is_const and nir_src_as_uint
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-07 15:56:10 +00:00
Jason Ekstrand
10a2fdacfa vc4: Prefer nir_src_comp_as_uint over nir_src_as_const_value
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-04-07 15:13:36 +02:00
Kenneth Graunke
4e802089bc gallium/util: Add const to u_range_intersect
This doesn't modify the range, so it can accept a const pointer.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-04-07 00:21:12 -07:00
Greg V
c5a6e72e15 gallium/hud: add CPU usage support for FreeBSD
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-04-07 06:47:57 +00:00
Kenneth Graunke
9c46046f79 iris: Silence unused variable warnings in release mode
2019-04-06 15:58:16 -07:00
Andrii Simiklit
cade9001b1 util: clean the 24-bit unused field to avoid issues
This is a field of FLOAT_32_UNSIGNED_INT_24_8_REV texture pixel.
OpenGL spec "8.4.4.2 Special Interpretations" says:
   "the second word contains a packed 24-bit unused field,
    followed by an 8-bit index"
The spec doesn't require us to clear this unused field,
however it makes sense to do it to avoid
undefined behavior in some apps.

Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110305
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2019-04-05 21:33:53 +00:00
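
Note on cade9001b1: clearing the unused field amounts to masking the second word down to the 8-bit index. The sketch below assumes the index occupies bits 7..0 and the unused field bits 31..8; that layout is an assumption about the _REV packing, and the function name is hypothetical rather than the one used in util.

```c
#include <stdint.h>

/* Assumed layout of the second 32-bit word of a
 * FLOAT_32_UNSIGNED_INT_24_8_REV pixel:
 *   bits 31..8  unused
 *   bits  7..0  8-bit index
 * Keep the index and zero the unused field. */
uint32_t clear_unused_x24(uint32_t second_word)
{
    return second_word & 0x000000ffu;
}
```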
Marek Olšák
26e161b1e9 tegra: fix the build after the set_shader_buffers change 2019-04-05 11:18:39 -04:00
James Zhu
0f416b85fb gallium/auxiliary/vl: Add barrier/unbind after compute shader launch.
Add memory barrier sync for multiple launch cases, and unbind completed
resources after launch.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-05 09:50:52 -04:00
James Zhu
4bbc9c493f gallium/auxiliary/vl: Fixed blank issue with compute shader
Multiple buffer initializations within one open instance cause a blank issue.
Updating the viewport per frame fixes this issue.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Tested-by: Bruno Milreu <bmilreu@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-05 09:50:52 -04:00
James Zhu
32b861d46d gallium/auxiliary/vl: Fixed blur issue with weave compute shader
Correct the wrong interpolation of the top/bottom rows, which caused the blur issue.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Tested-by: Bruno Milreu <bmilreu@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-05 09:50:52 -04:00
Gert Wollny
0dff1533f2 softpipe: Use mag texture filter also for clamped lod == 0
Follow the spec when selecting the magnification filter (OpenGL 4.5,
section 8.14):

  If λ(x, y) is less than or equal to the constant c (see section 8.15)
  the texture is said to be magnified;

While we're here also silence a potential warning about implicit float
to double conversion.

v2: Update commit message to contain a reference to the spec as pointed
    out by Eric.

Fixes a number of dEQP GLES2 and GLES3 tests out of:
 dEQP-GLES2.functional.texture.filtering.*
 dEQP-GLES2.functional.texture.vertex.2d.filtering.*
 dEQP-GLES3.functional.texture.vertex.*.filtering.*
 dEQP-GLES3.functional.texture.filtering.*
 dEQP-GLES3.functional.texture.shadow.2d.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-04-05 09:07:45 +02:00
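
Note on 0dff1533f2: the quoted rule makes the comparison inclusive, so a level of detail exactly equal to c counts as magnification. A minimal sketch of that selection, with the constant c passed in as a parameter rather than derived as section 8.15 specifies; the enum and function names are hypothetical and not softpipe's:

```c
#include <stdbool.h>

enum tex_filter { TEX_FILTER_NEAREST, TEX_FILTER_LINEAR };

/* The texture is magnified when lambda <= c (note the inclusive compare). */
bool is_magnified(float lambda, float c)
{
    return lambda <= c;
}

/* Pick the magnification filter when magnified, else the minification one. */
enum tex_filter select_filter(float lambda, float c,
                              enum tex_filter min_filter,
                              enum tex_filter mag_filter)
{
    return is_magnified(lambda, c) ? mag_filter : min_filter;
}
```

With c = 0, a clamped lod of exactly 0 selects the magnification filter, which is the case the commit title calls out.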
Tapani Pälli
361f3d19f1 iris: handle aux properly in iris_resource_get_handle
Disable aux when the resource is seen for the first time and
EXPLICIT_FLUSH is not set. This fixes issues seen when launching Xorg
with CCS_E in use.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-04 23:35:24 -07:00
Eric Anholt
4c70f276bc v3d: Don't try to use the TFU blit path if a scissor is enabled.
We'll need to do a render-based blit for scissors, since the TFU (as seen
in this conditional) can only update a whole surface.

Fixes: 976ea90bdc ("v3d: Add support for using the TFU to do some blits.")
Fixes piglit fbo-scissor-blit.
2019-04-04 17:30:35 -07:00
Eric Anholt
62360e92ec v3d: Bump the maximum texture size to 4k for V3D 4.x.
4.1 and 4.2 both have the same 16k limit, but I'm seeing GPU hangs in
the CTS at 8k and 16k.  4k at least lets us get one 4k display working.

Cc: mesa-stable@lists.freedesktop.org
2019-04-04 17:30:35 -07:00
Eric Anholt
e3063a8b2f v3d: Add support for handling OOM signals from the simulator.
I have v3d allocating enough initial allocation memory that we've been
passing tests without it, but to match kernel behavior more it would be
good to actually exercise the OOM path.
2019-04-04 17:30:35 -07:00
Dave Airlie
738921afd9 ddebug: add compute functions to help hang detection
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-05 10:01:08 +10:00
Dave Airlie
0ea386128b iris: avoid use after free in shader destruction
While playing with compute shaders, I was getting a random crash and
noticed that bind_state was using the old shader info for comparison,
but gallium allows the shader to be deleted while bound, so this could
lead to a use after free.

This can't happen when using the cso cache, as it tracks all of this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-05 09:57:44 +10:00
Marek Olšák
42f63e6334 radeonsi: set exact shader buffer read/write usage in CS
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-04-04 19:28:52 -04:00
Marek Olšák
66a82ec6f0 gallium: add writable_bitmask parameter into set_shader_buffers
to indicate write usage per buffer.
This is just a hint (it will be used by radeonsi).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-04-04 19:28:52 -04:00
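
Note on 66a82ec6f0: a per-buffer write hint of this kind is typically read one bit per slot. The sketch below assumes bit i of the mask refers to the i-th buffer passed in the call; that mapping and the helper name are assumptions, not the gallium interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the buffer in `slot` is flagged as
 * writable, assuming bit i of the mask describes the i-th buffer. */
bool buffer_is_writable(uint32_t writable_bitmask, unsigned slot)
{
    return slot < 32 && ((writable_bitmask >> slot) & 1u) != 0;
}
```

A mask of 0x5 would then mark buffers 0 and 2 as written and buffer 1 as read-only.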
Danylo Piliaiev
b19494c54e iris: Fix assert when using vertex attrib without buffer binding
The GL 4.5 spec says:
 "If any enabled array’s buffer binding is zero when DrawArrays or
  one of the other drawing commands defined in section 10.4 is called,
  the result is undefined."

The result is undefined but it should not crash.

Fixes: gl-3.1-vao-broken-attrib
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-04 22:57:24 +00:00
Tapani Pälli
61cc379371 iris: move iris_flush_resource so we can call it from get_handle
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-04 13:36:51 -07:00
Kenneth Graunke
8d9e169bdd iris: Save/restore MI_PREDICATE_RESULT, not MI_PREDICATE_DATA.
MI_PREDICATE_DATA is an intermediate storage for the MI_PREDICATE
command's calculations - it holds the result of the subtraction when
the compare operation is SRCS_EQUAL or DELTAS_EQUAL.  But the actual
result of the predication is MI_PREDICATE_RESULT, which is what we
want to copy from the render context to the compute context.
2019-04-04 11:41:10 -07:00
Eric Engestrom
05b114e526 simplify LLVM version string printing
Figure it out once in the build system, then just use that all over the place.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-04 16:08:11 +00:00