fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 12:58:09 +02:00

Author	SHA1	Message	Date
Kenneth Graunke	6dc1c2d8bd	iris: Fix ALT mode regressions from shader cache We were checking this based on nir->info.name, but with the shader cache enabled, nir_strip throws out the name, causing us to use IEEE mode for ARB programs. gl-1.0-spot-light regressed because it wants ALT mode for 0^0 behavior. Fixes: `dc5dc727d5` iris: Serialize the NIR to a blob we can use for shader cache purposes.	2019-05-21 16:58:54 -07:00
Marek Olšák	d6053bf2a1	radeonsi: fix a regression in si_rebind_buffer Don't update non-buffer images. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110701 Fixes: `78e35df52a` "radeonsi: update buffer descriptors in all contexts after buffer invalidation" Cc: 19.1 <mesa-stable@lists.freedesktop.org> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2019-05-21 18:58:03 -04:00
Kenneth Graunke	fb1d08dcfd	iris: Expose the disk cache to the state tracker as well. This lets st/nir cache the NIR for shaders, based on the shader source string hash, allowing us to skip initial compiles altogether, and also letting us start from there should we need to recompile for NOS. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-21 15:05:38 -07:00
Dylan Baker	601c9bc135	iris: Cache assembly shaders in the on-disk shader cache This implements storing and retrieving iris_compiled_shader objects from the on-disk shader cache. (by Dylan Baker and Kenneth Graunke)	2019-05-21 15:05:38 -07:00
Kenneth Graunke	dc5dc727d5	iris: Serialize the NIR to a blob we can use for shader cache purposes. We will use a hash of the serialized NIR together with brw_prog_*_key (for NOS) as the disk cache key, where the disk cache contains actual assembly shaders. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-21 15:05:38 -07:00
Dylan Baker	4756864cdc	iris: Start wiring up on-disk shader cache This creates the on-disk shader cache data structure, and handles the build-id keying aspects. The next commits will fill it out so it's actually used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-21 15:05:38 -07:00
Kenneth Graunke	6ae2caf201	iris: Move iris_uncompiled_shader definition to iris_context.h It had been internal to iris_program.c, but with the upcoming disk cache code, the "program module" is going to be spread across a couple source files. Into a header it goes! Now it lives alongside iris_compiled_shader, which makes sense. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-21 15:05:38 -07:00
Rob Clark	9f61aa3f75	freedreno/a6xx: WFI in program stateobj too This "fixes" hangs seen with various Android games. I think this is a similar issue to the one with constant state: we need to avoid CP_LOAD_STATE until the previous draw completes. It isn't entirely clear why blob doesn't need to do this, but it might have a different way to accomplish the same thing. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-20 09:10:12 -07:00
Rob Clark	abfb31acdb	freedreno/a6xx: make sure binning pass constlen is large enough Since we use same constant state for both binning pass program state and draw pass state, and it is possible for binning pass shader to use fewer consts, we need to make sure we program a large enough constlen. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-20 09:10:12 -07:00
Rob Clark	d200d58e65	freedreno/a6xx: limit IBO state to draw pass Currently we are only supporting images in FS (and CS) so limit this stateobj to draw pass. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-20 09:10:12 -07:00
Rob Clark	54d94f5780	freedreno/a6xx: don't evaluate FS tex state in binning pass It is unneeded since FS doesn't run in binning pass. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-20 09:10:12 -07:00
Qiang Yu	a1d419603f	lima/gpir: switch to use nir_lower_viewport_transform Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-05-20 10:57:11 +08:00
Qiang Yu	a7688b2713	lima/gpir: support vector ssa load Some vector sysvals can't be lowered to scalar, so they need to be broken into scalars in the nir-to-gpir conversion. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-05-20 10:57:11 +08:00
Qiang Yu	4a74e28130	lima/gpir: add helper function for emit load node Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-05-20 10:57:11 +08:00
Alyssa Rosenzweig	1155446c19	panfrost/midgard: Split up midgard_compile.c (RA) This commit moves the register allocator out of midgard_compile.c and into its own midgard_ra.c file. In doing so, a number of dependencies are identified and moved into their own files in turn. midgard_compile.c is still fairly monolithic, but this should help. Code churn, but no functional changes should be introduced by this commit. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 23:37:45 +00:00
Alyssa Rosenzweig	9cd8cd26de	panfrost: Improve fixed-function blending This fixes a few miscellaneous issues with the fixed-function blending programming, though it is far from complete. For cases known to be buggy, we force a fallback to blend shaders. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 17:56:35 +00:00
Alyssa Rosenzweig	d1a9b760ea	panfrost: Wire up nir_lower_blend This implements blend shaders via nir_lower_blend, by creating dummy fragment shaders simply passing through the source color and using the new lowering pass to inject blendability. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 17:56:34 +00:00
Alyssa Rosenzweig	39104221e1	panfrost/midgard: Route new blending intrinsics To prepare for the new nir_lower_blend pass, we wire up the intrinsics for tilebuffer reads and constant colour loading. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 17:56:14 +00:00
Alyssa Rosenzweig	a1885b2a35	panfrost/nir: Add nir_lower_blend pass This new lowering pass implements the OpenGL ES blend pipeline in shaders, applicable to hardware lacking full-featured blending hardware (including Midgard/Bifrost and vc4). This pass is run on a fragment shader, rewriting the store to a blended version, loading in the framebuffer destination color and constant color via intrinsics as necessary. This pass is sufficient for OpenGL ES 2.0 and is verified to pass dEQP's blend tests. MIN/MAX modes are included and tested as well. That said, at present it has the following limitations: - MRT is not supported (ES3). - sRGB support is missing (ES3). - Extended blending is not yet ported from GLSL IR lowering (ES3.2) - Dual-source blending is not supported. (N/A) - Logic ops are not supported. (N/A) v2: Fix code conventions (per Ian Romanick's feedback). Implement color masks. This pass should be in common nir/ space, but due to non-technical reasons, for now it's in Panfrost space. In the future, depending if other drivers need some of the functionality, we can move this back to src/compiler/nir space. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 17:54:56 +00:00
Alyssa Rosenzweig	6b2457e75c	panfrost: Fix Bifrost-specific padding Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:28 +00:00
Alyssa Rosenzweig	7b5217ad70	panfrost: Cleanup panfrost_job comments Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:26 +00:00
Alyssa Rosenzweig	ae705387a9	panfrost/decode: Decode blend constant This adds a forgotten decode line on Midgard and adds the field of a blend constant on Bifrost. The Bifrost encoding is fairly weird; whereas Midgard is just a regular 32-bit float, Bifrost uses a fancy fixed-point-esque encoding. The decode logic here is experimentally correct. The encode logic is a sort of "guesstimate", assuming that the high byte is just int(f / 255.0) and then solving algebraically for the low byte. This might be slightly off in some cases. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:23 +00:00
Alyssa Rosenzweig	3645c781ab	panfrost: Hoist blend constant into Midgard-specific struct This eliminates one major source of #ifdef parity between Midgard and Bifrost, better representing how the struct acts on Midgard and allowing proper decodes on Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:21 +00:00
Alyssa Rosenzweig	50382df728	panfrost/decode: Disassemble Bifrost shaders We already have the Bifrost disassembler in-tree, so now that panwrap is able to dump Bifrost command streams, hook up the disassembler to pandecode. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:08 +00:00
Alyssa Rosenzweig	ea479fdc1d	panfrost/midgard: Typofix Reported-by: Ryan Houdek <Sonicadvance1@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-17 14:59:52 +00:00
Thomas Hellstrom	47afc5eed7	svga: Add an environment variable to force coherent surface memory The vmwgfx driver supports emulated coherent surface memory as of version 2.16. Add an environment variable to enable this functionality for texture- and buffer maps: SVGA_FORCE_COHERENT. This environment variable should be used for testing only. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-17 08:44:31 +02:00
Thomas Hellstrom	a119da3bc9	svga: Set the rendered-to flag for dma transfers to surfaces The rendered-to flag indicates that the HW surface content is more recent than the content of the mob. That's the case after a SurfaceDMA transfer to the surface. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-05-17 08:44:31 +02:00
Thomas Hellstrom	eed24156ec	svga: Remove the surface_invalidate winsys function Instead unconditionally call SVGA3D_InvalidateGBSurface() since it's needed also for Linux for dirty buffers and operation without SurfaceDMA. For non-guest-backed operation, remove the surface cache surface invalidation altogether. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-05-17 08:44:31 +02:00
Gert Wollny	0f598ed7b3	Revert "softpipe/buffer: load only as many components as the the buffer resource type provides" This reverts commit `865b9ddae4`. The buffer always reports format PIPE_FORMAT_R8_UNORM so with this patch only one component would be supported. The original issue is still relevant, but the fix should be different. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-05-17 08:27:55 +02:00
Alyssa Rosenzweig	81d3262fa5	panfrost: Cleanup leak todos Many of these are now patched; one of them we patch here. Regardless, this is one less thing to worry about in the code, I suppose. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-17 00:14:49 +00:00
Alyssa Rosenzweig	c65271c929	panfrost: assert(0) -> unreachable for some switch Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-16 23:42:33 +00:00
Eric Anholt	ef88e23d03	freedreno: Log the number of loops in the shader for shader-db. shader-db's report.py will use this to see when we've changed loop unrolling behavior on a shader and skip including other stats like instruction count from being considered for that shader, since they won't be useful as a proxy for real world performance in that case. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Eduardo Lima Mitev <elima@igalia.com>	2019-05-16 10:25:22 -07:00
Eric Anholt	c2e68bebb4	freedreno: Output the same shader-db format as v3d and intel. This lets us reuse their report.py, at the expense of fd-report.py no longer working. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Eduardo Lima Mitev <elima@igalia.com>	2019-05-16 10:25:20 -07:00
Eric Anholt	6d9b45171d	freedreno: Remove the ir3_tgsi_to_nir() helper function. It was more of a hindrance, as it pretended that we could compile in the driver with a missing screen. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Eduardo Lima Mitev <elima@igalia.com>	2019-05-16 10:25:18 -07:00
Eric Anholt	a0d4d7febf	freedreno: Fix assertion failures in context setup in shader-db mode. The TTN path needs access to the screen to make the right decisions about lowering, but we didn't have pctx->screen set up at fdN_prog_init time. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Eduardo Lima Mitev <elima@igalia.com>	2019-05-16 10:25:06 -07:00
Marek Olšák	894e017c9c	r600+radeonsi: use ctx_query_reset_status on radeon This allows a nice cleanup, because the winsys always handles it.	2019-05-16 13:15:36 -04:00
Marek Olšák	78e35df52a	radeonsi: update buffer descriptors in all contexts after buffer invalidation Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108824 Cc: 19.1 <mesa-stable@lists.freedesktop.org>	2019-05-16 13:15:36 -04:00
Marek Olšák	0f1b070bad	radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets This is a prerequisite for the next commit. Cc: 19.1 <mesa-stable@lists.freedesktop.org>	2019-05-16 13:14:55 -04:00
Marek Olšák	f3ae455eb0	radeonsi: compute culling - flush CS to remove write references to buffers Only read-only buffers can use compute culling. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Marek Olšák	04122532e3	radeonsi: invalidate caches at the beginning of the prim discard compute IB Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Marek Olšák	9f505ce21d	radeonsi: disable primitive restart for triangles for DiRT Rally It may decrease performance and it prevents compute-based primitive culling. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Marek Olšák	0252fb92b8	radeonsi: add primitive culling stats to the HUD Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Marek Olšák	c9b7a37b8f	radeonsi: cull primitives with async compute for large draw calls Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:34 -04:00
Marek Olšák	187f1c999f	winsys/amdgpu: add REWIND emulation via INDIRECT_BUFFER into cs_check_space Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	4eb377d1c3	radeonsi: add si_vs_prolog_bits::unpack_instance_id_from_vertex_id:1 The prim discard compute shader bakes InstanceID into the output index buffer. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	b206f007de	radeonsi: make some functions non-static Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	301344008f	radeonsi: allow si_shader_select_with_key to return an optimized shader or fail If a prim discard compute shader hasn't finished compilation, we don't want to select any shader. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	ca9edd7cd0	radeonsi: use pipe_draw_info::instance_count indirectly It will be modified by compute shader culling. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	d380fabdbb	radeonsi: use pipe_draw_info::prim and primitive_restart indirectly so that the fields can be changed by the driver. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	43aa2f4f7c	radeonsi: make functions for creating LLVM functions non-static Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00

24701 commits
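
Commit dc5dc727d5 above describes the disk cache key as a hash of the serialized NIR together with the brw_prog_*_key, with the cache holding the compiled assembly. A minimal sketch of that idea follows, in Python rather than the driver's C. The function names, the SHA-1 choice, the length prefix, and the in-memory dict standing in for the on-disk cache are all illustrative assumptions, not iris code.

```python
import hashlib

def disk_cache_key(serialized_nir: bytes, prog_key: bytes) -> str:
    """Hypothetical helper: hash the NIR blob together with the program key."""
    h = hashlib.sha1()
    # Length-prefix the blob so (b"ab", b"c") and (b"a", b"bc") hash differently.
    h.update(len(serialized_nir).to_bytes(8, "little"))
    h.update(serialized_nir)
    h.update(prog_key)
    return h.hexdigest()

# Stand-in for the on-disk cache: key -> compiled assembly bytes.
cache = {}

def get_or_compile(serialized_nir: bytes, prog_key: bytes, compile_fn):
    """Return cached assembly if present, otherwise compile and store it."""
    key = disk_cache_key(serialized_nir, prog_key)
    if key not in cache:
        cache[key] = compile_fn(serialized_nir, prog_key)
    return cache[key]
```

Because the program key is part of the hash, the same shader compiled for different non-orthogonal state (NOS) gets a separate cache entry, which appears to be the intent of the commit message.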