Commit graph

37997 commits

Author SHA1 Message Date
Alyssa Rosenzweig
d1a9b760ea panfrost: Wire up nir_lower_blend
This implements blend shaders via nir_lower_blend, by creating dummy
fragment shaders simply passing through the source color and using the
new lowering pass to inject blendability.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-19 17:56:34 +00:00
Alyssa Rosenzweig
39104221e1 panfrost/midgard: Route new blending intrinsics
To prepare for the new nir_lower_blend pass, we wire up the intrinsics
for tilebuffer reads and constant colour loading.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-19 17:56:14 +00:00
Alyssa Rosenzweig
a1885b2a35 panfrost/nir: Add nir_lower_blend pass
This new lowering pass implements the OpenGL ES blend pipeline in
shaders, applicable to hardware lacking full-featured blending hardware
(including Midgard/Bifrost and vc4). This pass is run on a fragment
shader, rewriting the store to a blended version, loading in the
framebuffer destination color and constant color via intrinsics as
necessary. This pass is sufficient for OpenGL ES 2.0 and is verified to
pass dEQP's blend tests. MIN/MAX modes are included and tested as well.
That said, at present it has the following limitations:

 - MRT is not supported (ES3).
 - sRGB support is missing (ES3).
 - Extended blending is not yet ported from GLSL IR lowering (ES3.2)
 - Dual-source blending is not supported. (N/A)
 - Logic ops are not supported. (N/A)

v2: Fix code conventions (per Ian Romanick's feedback). Implement color
masks.

This pass should be in common nir/ space, but due to non-technical
reasons, for now it's in Panfrost space. In the future, depending if
other drivers need some of the functionality, we can move this back to
src/compiler/nir space.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-19 17:54:56 +00:00
Alyssa Rosenzweig
6b2457e75c panfrost: Fix Bifrost-specific padding
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-05-19 17:41:28 +00:00
Alyssa Rosenzweig
7b5217ad70 panfrost: Cleanup panfrost_job comments
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-05-19 17:41:26 +00:00
Alyssa Rosenzweig
ae705387a9 panfrost/decode: Decode blend constant
This adds a forgotten decode line on Midgard and adds the field of a
blend constant on Bifrost. The Bifrost encoding is fairly weird; whereas
Midgard is just a regular 32-bit float, Bifrost uses a fancy
fixed-point-esque encoding. The decode logic here is experimentally
correct. The encode logic is a sort of "guesstimate", assuming that the
high byte is just int(f / 255.0) and then solving algebraicly for the
low byte. This might be slightly off in some cases.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-05-19 17:41:23 +00:00
Alyssa Rosenzweig
3645c781ab panfrost: Hoist blend constant into Midgard-specific struct
This eliminates one major source of #ifdef parity between Midgard and
Bifrost, better representing how the struct acts on Midgard and allowing
proper decodes on Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-05-19 17:41:21 +00:00
Alyssa Rosenzweig
50382df728 panfrost/decode: Disassemble Bifrost shaders
We already have the Bifrost disassembler in-tree, so now that panwrap is
able to dump Bifrost command streams, hook up the disassembler to
pandecode.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
2019-05-19 17:41:08 +00:00
Eric Engestrom
ccb8ea7acf meson: expose glapi through osmesa
Suggested-by: Pierre Guillou <pierre.guillou@lip6.fr>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109659
Fixes: f121a669c7 "meson: build gallium based osmesa"
Fixes: cbbd5bb889 "meson: build classic osmesa"
Cc: Brian Paul <brianp@vmware.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
2019-05-18 11:15:04 +01:00
Neha Bhende
926a6a35cf draw: fix memory leak introduced 7720ce32a
We need to free memory allocation PrimitiveOffsets in draw_gs_destroy().
This fixes memory leak found while running piglit on windows.

Fixes: 7720ce32a ("draw: add support to tgsi paths for geometry streams. (v2)")

Tested with piglit

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-05-17 12:26:48 -06:00
Alyssa Rosenzweig
ea479fdc1d panfrost/midgard: Typofix
Reported-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-17 14:59:52 +00:00
Thomas Hellstrom
47afc5eed7 svga: Add an environment variable to force coherent surface memory
The vmwgfx driver supports emulated coherent surface memory as of version
2.16. Add en environtment variable to enable this functionality for
texture- and buffer maps: SVGA_FORCE_COHERENT.
This environment variable should be used for testing only.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2019-05-17 08:44:31 +02:00
Thomas Hellstrom
1a66ead1c7 pipebuffer, winsys/svga: Add functionality to update pb_validate_entry flags
In order to be able to add access modes to a pb_validate_entry, update
the pb_validate_add_buffer function to take a pointer hash table and also
to return whether the buffer was already on the validate list.

Update the svga winsys accordingly.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-05-17 08:44:31 +02:00
Thomas Hellstrom
a119da3bc9 svga: Set the rendered-to flag for dma transfers to surfaces
The rendered-to flag indicates that the HW surface content is more recent
than the content of the mob. That's the case after a SurfaceDMA transfer
to the surface.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-05-17 08:44:31 +02:00
Thomas Hellstrom
fb6d09764d winsys/svga: Fix RELOC_INTERNAL mob GPU access
SVGA_RELOC_INTERNAL indicates a transfer between surface and backing mob.
This means that if the GPU for example reads from the surface it writes
to the backing mob. But since the buffer mapping code allows for
simultaneous gpu- and cpu read access, a read from the surface to the mob
will not synchronize a subsequent map to the readback.

Fix this by inverting the mob access mode in a surface relocation with
SVGA_RELOC_INTERNAL set.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-05-17 08:44:31 +02:00
Thomas Hellstrom
eed24156ec svga: Remove the surface_invalidate winsys function
Instead unconditionally call SVGA3D_InvalidateGBSurface() since it's needed
also for Linux for dirty buffers and operation without SurfaceDMA.
For non-guest-backed operation, remove the surface cache surface invalidation
altogether.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-05-17 08:44:31 +02:00
Gert Wollny
0f598ed7b3 Revert "softpipe/buffer: load only as many components as the the buffer resource type provides"
This reverts commit 865b9ddae4.

The buffer always reports format PIPE_FORMAT_R8_UNORM so with this patch only
one component would be supported. The original issue is still relevant, but
the fix should be different.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-05-17 08:27:55 +02:00
Alyssa Rosenzweig
81d3262fa5 panfrost: Cleanup leak todos
Many of these are now patched; one of them we patch here. Regardless,
this is one less thing to worry about in the code, I suppose.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-17 00:14:49 +00:00
Alyssa Rosenzweig
c65271c929 panfrost: assert(0) -> unreachable for some switch
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-05-16 23:42:33 +00:00
Eric Anholt
ef88e23d03 freedreno: Log the number of loops in the shader for shader-db.
shader-db's report.py will use this to see when we've changed loop
unrolling behavior on a shader and skip including other stats like
instruction count from being considered for that shader, since they won't
be useful as a proxy for real world performance in that case.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eduardo Lima Mitev <elima@igalia.com>
2019-05-16 10:25:22 -07:00
Eric Anholt
c2e68bebb4 freedreno: Output the same shader-db format as v3d and intel.
This lets us reuse their report.py, at the expense of fd-report.py no
longer working.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eduardo Lima Mitev <elima@igalia.com>
2019-05-16 10:25:20 -07:00
Eric Anholt
6d9b45171d freedreno: Remove the ir3_tgsi_to_nir() helper function.
It was more of a hindrance, as it pretended that we could compile in the
driver with a missing screen.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eduardo Lima Mitev <elima@igalia.com>
2019-05-16 10:25:18 -07:00
Eric Anholt
a0d4d7febf freedreno: Fix assertion failures in context setup in shader-db mode.
The TTN path needs access to the screen to make the right decisions about
lowering, but we didn't have pctx->screen set up at fdN_prog_init time.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Tested-by: Eduardo Lima Mitev <elima@igalia.com>
2019-05-16 10:25:06 -07:00
Marek Olšák
894e017c9c r600+radeonsi: use ctx_query_reset_status on radeon
This allows a nice cleanup, because the winsys always handles it.
2019-05-16 13:15:36 -04:00
Marek Olšák
4549c36788 winsys/radeon: implement ctx_query_reset_status by copying radeonsi
To make it behave like amdgpu. I'm just trying to move this out of
radeonsi. The radeonsi code will be removed in the next commit.
2019-05-16 13:15:36 -04:00
Marek Olšák
6b3343e5d8 winsys/amdgpu: report a CS rejection as a reset only if there's no GPU reset 2019-05-16 13:15:36 -04:00
Marek Olšák
78e35df52a radeonsi: update buffer descriptors in all contexts after buffer invalidation
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108824

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
2019-05-16 13:15:36 -04:00
Marek Olšák
0f1b070bad radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets
This is a prerequisite for the next commit.

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
2019-05-16 13:14:55 -04:00
Marek Olšák
f3ae455eb0 radeonsi: compute culling - flush CS to remove write references to buffers
Only read-only buffers can use compute culling.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:36 -04:00
Marek Olšák
04122532e3 radeonsi: invalidate caches at the beginning of the prim discard compute IB
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:36 -04:00
Marek Olšák
9f505ce21d radeonsi: disable primitive restart for triangles for DiRT Rally
It may decrease performance and it prevents compute-based primitive culling.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:36 -04:00
Marek Olšák
0252fb92b8 radeonsi: add primitive culling stats to the HUD
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:36 -04:00
Marek Olšák
c9b7a37b8f radeonsi: cull primitives with async compute for large draw calls
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:13:34 -04:00
Marek Olšák
187f1c999f winsys/amdgpu: add REWIND emulation via INDIRECT_BUFFER into cs_check_space
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
4eb377d1c3 radeonsi: add si_vs_prolog_bits::unpack_instance_id_from_vertex_id:1
The prim discard compute shader bakes InstanceID into the output index buffer.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
b206f007de radeonsi: make some functions non-static
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
301344008f radeonsi: allow si_shader_select_with_key to return an optimized shader or fail
If a prim discard compute shader hasn't finished compilation, we don't want
to any shader.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
ca9edd7cd0 radeonsi: use pipe_draw_info::instance_count indirectly
It will be modified by compute shader culling.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
d380fabdbb radeonsi: use pipe_draw_info::prim and primitive_restart indirectly
so that the fields can be changed by the driver.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
43aa2f4f7c radeonsi: make functions for creating LLVM functions non-static
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
b19884e08e winsys/amdgpu: add a parallel compute IB coupled with a gfx IB
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:07:00 -04:00
Marek Olšák
07c83d25fd radeonsi: add a cs parameter into si_cp_copy_data
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:57 -04:00
Marek Olšák
ce264d19a0 radeonsi: add a cs parameter into si_cp_release_mem
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:56 -04:00
Marek Olšák
9624855f13 radeonsi: add threadgroups_per_cu param into si_get_compute_resource_limits
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:54 -04:00
Marek Olšák
6e38af0631 radeonsi: move si_*_descriptors_idx functions into si_state.h
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:53 -04:00
Marek Olšák
49a016ec5d radeonsi: make si_initialize_compute reusable
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:51 -04:00
Marek Olšák
c44c6951d4 radeonsi: extract COMPUTE_RESOURCE_LIMITS code into a helper
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:49 -04:00
Marek Olšák
c7ceeea093 radeonsi: return the last part's return value from @wrapper
The primitive discard compute shader will get the position output this way.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:40 -04:00
Marek Olšák
d569b7cb31 winsys/amdgpu: always set NO_CPU_ACCESS and NO_SUBALLOC on GDS resources
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:06:18 -04:00
Jan Zielinski
d65b160e6a swr: clean up supported OGL4.0/4.1 extensions list
This commit adjusts the capabilities returned
by the SWR driver and the documentation to correctly
report the following extensions:

GL_ARB_texture_query_lod, GL_ARB_texture_cube_map_array,
GL_ARB_gpu_shader_fp64, GL_ARB_texture_gather,
GL_ARB_vertex_attrib_64bit.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-05-16 17:41:14 +02:00