Commit graph

38800 commits

Author SHA1 Message Date
Jonathan Marek
e910acb3f2 etnaviv: rs: don't use etna_compatible_rs_format when possible
This mirrors the change in blt. RS cares about this for msaa/compression.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-07-04 14:05:18 -04:00
Jonathan Marek
66411521ea etnaviv: combine translate_ts_sampler_format/translate_msaa_format
Both translate the same thing, so just add the missing cases into one.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-07-04 14:05:18 -04:00
Jonathan Marek
84c87f40fb etnaviv: fix compression format not set correctly in TS_MEM_CONFIG
VIVS_TS_MEM_CONFIG_COLOR_COMPRESSION_FORMAT() needs to be used.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-07-04 14:05:18 -04:00
Jonathan Marek
53475c85fd etnaviv: set correct ts_clear_value for BLT engine
BLT engine uses all ones to clear TS, set ts_clear_value to match that.
Note: ts_clear_value is never used with BLT engine.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-07-04 14:05:18 -04:00
Jonathan Marek
7c7eaaed4a etnaviv: remove initial CPU ts clear
Since we have "ts_valid" to avoid using uncleared ts, this memset serves
no purpose. Also it is broken because it doesn't use cpu_prep/cpu_fini.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-07-04 14:05:18 -04:00
Jonathan Marek
95d937852e etnaviv: implement TS_MODE for GC7000L
GC7000L has a TS mode with larger tiles, which improves performance.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-07-04 14:05:18 -04:00
Jonathan Marek
bc5ae6a330 etnaviv: fix ts size calculation
The size of the TS is screen->specs.bits_per_tile bits per tile, with each
tile being 64 bytes of the resource.

This gives the same result for 32bpp formats, but reduces the size of TS
for 16bpp formats by 2.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-07-04 14:05:09 -04:00
Jonathan Marek
2f540745ad etnaviv: update headers from rnndb
Update to etna_viv commit 8a8b13a and use new names in the code.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-07-04 14:04:47 -04:00
Bas Nieuwenhuizen
bbbcb49f9b radeonsi: Fix some warnings.
../mesa/src/gallium/drivers/radeonsi/si_compute_blit.c: In function ‘si_clear_buffer’:
../mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:195:11: warning: unused variable ‘clear_alignment’ [-Wunused-variable]
  unsigned clear_alignment = MIN2(clear_value_size, 4);
           ^~~~~~~~~~~~~~~
[23/60] Compiling C object 'src/gallium/drivers/radeonsi/3cdc30e@@radeonsi@sta/si_compute_prim_discard.c.o'.
../mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c: In function ‘si_prepare_prim_discard_or_split_draw’:
../mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:1106:7: warning: unused variable ‘compute_has_space’ [-Wunused-variable]
  bool compute_has_space = sctx->ws->cs_check_space(cs, need_compute_dw, false);

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-04 11:12:27 +00:00
Nicolai Hähnle
cb07f91489 amd/common: move ac_shader_{binary,reloc} into r600 and rename
They are no longer used by radeonsi or radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-04 10:52:26 +00:00
Tomeu Vizoso
0cc02c9ea6 panfrost: Take into account off-screen FBOs
In that case, ctx->pipe_framebuffer.cbufs[0] can be NULL.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 5375d009be ("panfrost: Pass referenced BOs to the SUBMIT ioctls")
2019-07-04 10:48:09 +02:00
Kenneth Graunke
9ea67f0a79 iris: Fix MOCS for grid surface
Hardcoding 4 is bad; we have a function for this now.
2019-07-03 22:24:50 -07:00
Kenneth Graunke
10560f8506 iris: Minor tidying 2019-07-03 22:24:44 -07:00
Marek Olšák
8dfdf5aae4 gallium/u_blitter: add return to fix the build 2019-07-03 23:44:14 -04:00
Marek Olšák
92e34568b7 radeonsi/gfx10: fix legacy GS
LLVM doesn't insert s_waitcnt_vscnt before GS_DONE.

There was also the crash in legacy GS copy shader.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
dfa8e758c2 radeonsi/gfx10: disable clear state
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
0dd57f0fc0 radeonsi/gfx10: disable DPBB
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
815fd77a47 radeonsi/gfx10: disable SDMA
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
f66ee5af2f radeonsi: determine the rasterization primitive type accurately (v2)
v2: reworked version to fix bugs and make it more efficient

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
a4b3eea325 radeonsi/gfx10: consolidate & improve input_prim determination for NGG
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
969e5176c2 ac: rework ac_build_waitcnt for gfx10
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
214ddfb688 radeonsi/gfx10: implement si_shader_vs
Only used with tessellation + GS instancing.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
6cf2fb1fc4 radeonsi/gfx10: unpack GS invocation ID
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
32694456f7 radeonsi/gfx10: jump over the shader query atomic if the queries are disabled
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
244a8e6798 radeonsi/gfx10: cosmetic changes
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
09a905d930 radeonsi/gfx10: set cache control registers
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
b680f723f8 radeonsi/gfx10: export correct PrimitiveID from NGG vertex shaders
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
3203a74dcb radeonsi/gfx10: set PA_SC_TILE_STEERING_OVERRIDE
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
07aacdbfd5 radeonsi/gfx10: add a workaround for stencil HTILE with mipmapping
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
51db950419 radeonsi/gfx10: disable DCC with MSAA
It was only enabled for 2x MSAA anyway.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
6920f09f4b radeonsi/gfx10: fix GL_LINE polygon mode for decomposed primitives
We need to tell PA to accept edge flags generated by the input assembler,
because decomposed primitives shouldn't draw inner edges.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
e39d4594da radeonsi/gfx10: fix NGG GS color clamping
Just need to pass the input from ES to GS. Everything else is done.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
40e7c65590 radeonsi/gfx10: fix vertex color clamping for TES
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
cc7875150a radeonsi/gfx10: unbind NGG shaders when destroyed
This fixes glsl-max-varyings, which creates shaders, draws, and then
destroys them.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
b90ddff477 radeonsi/gfx10: don't use the GS workaround for triangle strips w/ adjancency
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
c3ac22a620 radeonsi/gfx10: don't do the query buffer atomic for blit shaders
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
adbec817d3 radeonsi/gfx10: update spi_map if API VS (as NGG) changes and PS doesn't
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
1e39c21c23 radeonsi/gfx10: fix a possible hang with exp pos0 with done=0 and exec=0
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
683cf11b81 radeonsi/gfx10: prefetch HW GS when NGG is used
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
7f71579064 radeonsi/gfx10: fix PS exports for SPI_SHADER_32_AR
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
4bdf44724f radeonsi/gfx10: set DLC for loads when GLC is set
This fixes L1 shader array cache coherency.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
f81aa6b0c8 radeonsi/gfx10: fix shader images
Don't promote 2D image instructions to 3D, and don't set z=BASE_ARRAY.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
7c805a7c67 radeonsi/gfx10: set the DCC constant encoding flag
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
6eb219e963 radeonsi/gfx10: fix intensity formats
move the ALPHA_IS_ON_MSB fixup into vi_alpha_is_on_msb

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
6944f99176 radeonsi/gfx10: allocate GDS BOs for streamout
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák
395185912d radeonsi/gfx10: make sure GDS is idle between IBs
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
5ff3aff0d6 radeonsi/gfx10: implement streamout
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
792a638b03 radeonsi/gfx10: implement streamout-related queries
The NGG hardware pipeline doesn't track these statistics automatically,
and in fact *cannot* track them automatically when API geometry shaders
are involved, so we accumulate statistics in the shader using atomic
adds.

This implementation accumulates statistics via the memory system and
the RW buffer descriptor setup. We could use GDS, but since these
atomics aren't latency-sensitive, that basically just trades off
L2$ bandwidth vs. export bus bandwidth. One single memory transaction
per shader workgroup doesn't seem too bad. The result ring buffer in
memory is needed either way to avoid pipeline stalls.

The shader code contains the atomic unconditionally, though the
GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The
atomic is simply discarded by the shader hardware in that case.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
bcd2d2e194 radeonsi/gfx10: enable the workaround for unaligned vertex fetch
Yes, really. Note that non-format buffer loads are unaffected and work
just fine with unaligned pointers (as long as SH_MEM_CONFIG is setup
correctly, which amdgpu ensures).

Fixes e.g. KHR-GL45.vertex_attrib_64bit.vao

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
22b85bfc02 radeonsi/gfx10: re-order the initialization order in si_compile_tgsi_main
It's useful to be able to access gs_ngg_scratch before creating the
main wrapping branch.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00