Commit graph

32650 commits

Author SHA1 Message Date
Rob Clark
16ac70bdcf freedreno/a5xx: align height to GMEM
Similar to the way width/pitch alignment works, it seems like we need to
do similar for height.  Otherwise the BLIT from system memory to GMEM
can over-fetch beyond the end of the buffer, triggering a fault.

I'm not sure if there is a better solution yet.  Possibly we could fall
back to pre-a5xx style DRAW packets for cases where BLIT might over-
fetch.  (We in theory have that problem already with rendering to higher
mipmap levels, although fortunately those tend to use GMEM bypass.)

This fixes issues reported with glamor.

Reported-by: don.harbin@linaro.org
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-02 09:25:57 -04:00
Nicolai Hähnle
146c2b7c28 radeonsi: adjust clip discard based on line width / point size
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
63680471f9 radeonsi: remove si_context::{scissor_enabled,clip_halfz}
They are just copies of the rasterizer state.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
12f3155e28 radeonsi: simplify the signature of si_update_vs_writes_viewport_index
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
7bbcb6ac6c radeonsi: move current_rast_prim into si_context
v2: rebase fixes

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
6b416ec3d6 radeonsi: move and rename scissor and viewport state and functions
v2: change GET_MAX_SCISSOR to SI_MAX_SCISSOR

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
449ac258d1 radeonsi: remove si_apply_scissor_bug_workaround
It only affects pre-SI chips.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
c955f45946 radeonsi: move r600_viewport.c to si_viewport.c
This is purely a file-move + #include fixup + build system changes.
Other cleanups will follow in subsequent commits.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
30e37289ea radeonsi: fix maximum advertised point size / line width
The hardware registers store the half-size/width in 12.4 fixed point
format, so 8192 is the maximum.

Fixes dEQP-GLES3.functional.rasterization.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
a3fa3b2e02 radeonsi: deduce rast_prim correctly for tessellation point mode
Together with the previous patches, this fixes
dEQP-GLES31.functional.primitive_bounding_box.wide_points.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
4d74432dd3 radeonsi: don't discard points and lines
This is a bit conservative, but a more precise solution requires access
to the rasterizer state. This is something to tackle after the fork between
r600 and radeonsi.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
f86a112b07 radeonsi: move current_rast_prim to r600_common_context
We'll use it in the scissors / clip / guardband state.

v2: avoid a performance regression on r600 when applied to
    (pre-fork) stable branches

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:43 +02:00
Nicolai Hähnle
85a3e1cae0 gallium: add PIPE_FORMAT_R10G10B10X2_UNORM
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:43 +02:00
Rob Clark
d304c467ba freedreno: fix PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE
Fixes an assert in fd_acc_query_register_provider() about query provider
not already registered.

Fixes: 3f6b3d9d ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-02 08:44:57 -04:00
Nicolai Hähnle
6d23f7c65d radeonsi: fix a regression in integer cube map handling
A recent commit fixed the case of 8888 integer cube maps, which need the
workaround of replacing the data format with USCALED/SSCALED. However,
this broke the case of non-8888 integer cube maps; those still need the
fix of shifting the texture coordinates.

Fixes KHR-GL45.texture_gather.plain-gather-int-cube-array and similar.

Fixes: 6fb0c1013b ("radeonsi: workaround for gather4 on integer cube maps")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 12:17:15 +02:00
Nicolai Hähnle
052b974fed amd/common: move ac_build_phi from radeonsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 12:17:15 +02:00
Marek Olšák
da3cf0e206 radeonsi: don't use the template keyword
for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-30 19:03:07 +02:00
Marek Olšák
9592c43a96 gallium/vl: don't use the template keyword
for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-30 19:03:07 +02:00
Benedikt Schemmer
3797a82e78 radeonsi/uvd: clean up si_video_buffer_create
V2: remove code duplication and one unnessecary variable, minor whitespace fix

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-09-30 19:03:07 +02:00
Marek Olšák
e9cf64a67c radeonsi/uvd: fix planar formats broken since f70f6baaa3
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-30 19:03:07 +02:00
Roland Scheidegger
740a1618c3 gallium: add new LOD opcode
The operation performed is all the same as LODQ, but with the usual
differences between dx10 and GL texture opcodes, that is separate resource
and sampler indices (plus result swizzling, and setting z/w channels
to zero).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-30 02:58:09 +02:00
Leo Liu
361d8f82c0 st/va: add dst rect to avoid scale on deint
For 1080p video transcode, the height will be scaled to 1088 when deint
to progressive buffer. Set dst rect to make sure no scale.

Fixes: 3ad8687 "st/va: use new vl_compositor_yuv_deint_full() to deint"

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andy Furniss <adf.lists@gmail.com>
2017-09-29 10:06:30 -04:00
Nicolai Hähnle
d190bfc1ad radeonsi: emit DLDEXP and DFRACEXP TGSI opcodes
Note: this causes spurious regressions in some current piglit tests,
because the tests incorrectly assume that there is no denorm support for
doubles. I'm going to send out a fix for those tests as well.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:08:07 +02:00
Nicolai Hähnle
061303e4fd radeonsi: emit LDEXP opcode
The LLVM intrinsic has existed for a long time. The current name was
established in LLVM 3.9.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:08:04 +02:00
Nicolai Hähnle
cad959d901 gallium: add LDEXP TGSI instruction and corresponding cap
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:08:01 +02:00
Nicolai Hähnle
2b0bfc51de tgsi: infer that dst[1] of DFRACEXP is an integer
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:59 +02:00
Nicolai Hähnle
5cf279bf7e gallivm: add support for TGSI instructions with two outputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:57 +02:00
Nicolai Hähnle
7af64b4d4a gallivm: add dst register index to lp_build_tgsi_context::emit_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:55 +02:00
Nicolai Hähnle
3c78215a1c tgsi: clarify the semantics of DFRACEXP
The status quo is quite the mess:

1. tgsi_exec will do a per-channel computation, and store the dst[0]
   result (significand) correctly for each channel. The dst[1] result
   (exponent) will be written to the first bit set in the writemask.
   So per-component calculation only works partially.

2. r600 will only do a single computation. It will replicate the
   exponent but not the significand.

3. The docs pretend that there's per-component calculation, but even
   get dst[0] and dst[1] confused.

4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions,
   and kind-of assumes that everything is replicated, generating this for
   the dvec4 case:

     DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy
     DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw
     DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy
     DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw

Settle on the simplest behavior, which is single-component calculation
with replication, document it, and adjust tgsi_exec and r600.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:50 +02:00
Nicolai Hähnle
dbe7fc00d5 tgsi: fix the documentation of DLDEXP
Sourcing the exponent for the zw destination pair from Z is consistent
with both tgsi_exec and gallivm. In practice, st_glsl_to_tgsi always
generates per-channel instructions anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:46 +02:00
Nicolai Hähnle
d713af711d tgsi: infer that DLDEXP's second source has an integer type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:33 +02:00
Nicolai Hähnle
c49400a03b r600: cleanup set_occlusion_query_state
This fixes a warning caused by the fork (note the change in the function
signature):

../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c: In function ‘r600_init_common_state_functions’:
../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c:2974:36: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
  rctx->b.set_occlusion_query_state = r600_set_occlusion_query_state;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:47:37 +02:00
Nicolai Hähnle
5184a1e8ee r300: add missing case PIPE_SHADER_CAP_INT64_ATOMICS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:47:34 +02:00
Nicolai Hähnle
797dd12c7b radeonsi: fix border color translation for integer textures
This fixes the extremely unlikely case that an application uses
0x80000000 or 0x3f800000 as border color for an integer texture and
helps in the also, but perhaps slightly less, unlikely case that 1 is
used as a border color.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:45:08 +02:00
Nicolai Hähnle
6eb9483912 radeonsi: clamp border colors for upgraded depth textures
The hardware does this automatically for unorm formats, but we need to
do it manually for unorm depth formats that have been upgraded to
Z32_FLOAT.

Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth
and others.

Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:45:05 +02:00
Nicolai Hähnle
4c56e07029 radeonsi: clamp depth comparison value only for fixed point formats
The hardware usually does this automatically. However, we upgrade
depth to Z32_FLOAT to enable TC-compatible HTILE, which means the
hardware no longer clamps the comparison value for us.

The only way to tell in the shader whether a clamp is required
seems to be to communicate an additional bit in the descriptor
table. While VI has some unused bits in the resource descriptor,
those bits have unfortunately all been used in gfx9. So we use
an unused bit in the sampler state instead.

Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f
and many other tests in dEQP-GLES3.functional.texture.shadow.*

Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:44:50 +02:00
Nicolai Hähnle
7dfa891f32 radeonsi/gfx9: fix geometry shaders without output vertices
Not that those are super common or useful, but hey! Fun corner cases
of the API...

Fixes dEQP-GLES31.functional.geometry_shading.emit.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:43:09 +02:00
Nicolai Hähnle
4ed419328d radeonsi: move descriptor logs to after corresponding draw/compute packet
It has to happen after descriptor uploads since otherwise we'll print out
the wrong GPU list / incorrectly claim descriptor corruption.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:37:06 +02:00
Nicolai Hähnle
9ddc6e16a9 amd/common: remove ac_shader_abi::chip_class
Redundant with the recently added ac_llvm_context::chip_class.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:37:03 +02:00
Nicolai Hähnle
5b86c53b47 gallium/radeon: fix a comment
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:36:46 +02:00
Brian Paul
4d5497d50d svga: add missing PIPE_SHADER_CAP_INT64_ATOMICS switch cases
Silences a compiler warning.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-09-28 10:41:33 -06:00
Brian Paul
e8d09f80ea svga: trivial whitespace clean-ups in svga_screen.c 2017-09-28 10:41:33 -06:00
Brian Paul
f33fbe2cf9 gallium/util: use new util_vasprintf() function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-28 10:41:33 -06:00
Neha Bhende
652bc4b537 svga: start advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
Since our driver support arb_provoking_vertex, we can start
advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
Fixes ./clipflat & ./arb-provoking-vertex-render piglit tests

Tested piglit, glretrace on Hw 11 and Hw 13

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-28 10:41:33 -06:00
Lucas Stach
15e3657e43 etnaviv: optimize RS transfers
Currently we are blitting the whole resource when the RS is used to
de-/tile a resource. This can be very inefficient for large resources
where the transfer is only changing a small part of the resource
(happens a lot with glTexSubImage2D).

Optimize this by only blitting the tile aligned subregion of the
resource, which the transfer is going to change.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-28 17:41:07 +02:00
Lucas Stach
69eb93cbb9 etnaviv: add resource subregion copy
This is useful if we only need to copy part of a larger resource, mostly
when using the RS engine to de-/tile on pipe transfers.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-28 17:41:01 +02:00
Lucas Stach
9df635844c etnaviv: support tile aligned RS blits
The RS can blit abitrary tile aligned subregions of a resource by
adjusting the buffer offset.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-28 17:40:49 +02:00
Leo Liu
6ed61b8d3f st/va: use pipe transfer_map to map upload buffer
The function pipe_buffer_map() is only for linear pipe buffer,
with height as 0, and it's not for any 2D textures.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-28 09:22:55 -04:00
Gwan-gyeong Mun
5603670bc0 gallium/docs: add reference links for resource_create method
It adds reference links for arguments usage and bind of resource_create().

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-28 13:20:14 +01:00
Gwan-gyeong Mun
c6c23e95a7 gallium/docs: fix a reference link for get_paramf
Previous get_paramf links same as get_param. It changes the reference link to
PIPE_CAPF_*

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-28 13:20:14 +01:00