fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 07:08:04 +02:00

Author	SHA1	Message	Date
Iago Toral Quiroga	76fc8c8bb1	v3d: compute appropriate VPM memory configuration for geometry shader workloads Geometry shaders can output many vertices and thus have higher VPM memory pressure as a result. It is possible that too wide geometry shader dispatches exceed the maximum available VPM output allocated, in which case we need to reduce the dispatch width until we can fit the VPM memory requirements. Supported dispatch widths for geometry shaders are 16, 8, 4, 1. There is a limit in the number of VPM output sectors that can be used by a geometry shader that we can meet by lowering the dispatch width at compile time, however, at draw time we need to revisit this number and, together with other elements that can contribute to total VPM memory requirements, decide on a configuration that can fit the program into the available VPM memory. Ideally, we also want to aim for not using more than half of the available memory so that we can run a pair of bin and render programs in parallel. v2: fixed language in comment and typo in commit log. (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	76f4c83815	v3d: add 1-way SIMD packing definition According to the documentation, the 1-way dispatch width is only supported with geometry shaders. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	4f5fbd6490	v3d: implement geometry shader instancing v2: - Remove unused field uses_iid from v3d_gs_prog_data (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	8a81ac2eed	v3d: emit geometry shader state commands This is good enough to get basic GS workloads working, later patches will improve this by adding instancing support, proper SIMD configuration, etc. Notice that most of the TESSELLATION_GEOMETRY_SHADER_PARAMS fields are only relevant when tessellation shaders are present. We do not support tessellation yet, but we still need to fill in this tessellation state with default values since our packing functions require some of these to have non-zero values. v2: - Add a comment in the code explaining why we fill in tessellation fields (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	0934bd4460	v3d: fix packet descriptions for geometry and tessellation shaders Every code address starts at bit 3 (addresses must be 64-bit aligned), with the first 3 bits used to specify threading and NaN propagation parameters for the shader program. We generally skip "reserved" bits, however, doing this when the reserved field is the last in a struct and it is large enough can make us compute incorrect (smaller) struct sizes which can lead to corrupt CLs. In particular, the "Tess/Geom Common Params" struct has a reserved field at the end that is 8-bit, so if we don't include this we compute a packet size that is 1 byte smaller than it should, making the next packet we emit start 1 byte earlier and therefore leading to incorrect CL data from that point forward. The name of one of the fields was not correct. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	5d578c27ce	v3d: add initial compiler plumbing for geometry shaders Most of the relevant work happens in the v3d_nir_lower_io. Since geometry shaders can write any number of output vertices, this pass injects a few variables into the shader code to keep track of things like the number of vertices emitted or the offsets into the VPM of the current vertex output, etc. This is also where we handle EmitVertex() and EmitPrimitive() intrinsics. The geometry shader VPM output layout has a specific structure with a 32-bit general header, then another 32-bit header slot for each output vertex, and finally the actual vertex data. When vertex shaders are paired with geometry shaders we also need to consider the following: - Only geometry shaders emit fixed function outputs. - The coordinate shader used for the vertex stage during binning must not drop varyings other than those used by transform feedback, since these may be read by the binning GS. v2: - Use MAX3 instead of a chain of MAX2 (Alejandro). - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro) - Update comment in IO lowering so it includes the GS stage (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	f63750accf	v3d: remove unused variable Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	52cbef0039	v3d: enable debug options for geometry shader dumps Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	d6b0786a38	v3d: add debug assert While lowering vpm outputs we look for the NIR variables matching particular store output instructions and we expect to find a match, so assert on that. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	6e68f74395	v3d: add missing plumbing for VPM load instructions We will need to use LDVPMG_IN specifically to read VPM inputs in geometry shaders. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Eric Anholt	f58ef5d481	turnip: Lower usub_borrow. Fixes dEQP-VK.glsl.builtin.function.integer.usubborrow.uvec2_mediump_fragment. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986>	2019-12-16 04:52:09 +00:00
Caio Marcelo de Oliveira Filho	c06ba83589	intel/fs: Lower 64-bit MOVs after lower_load_payload() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>	2019-12-14 21:12:21 +00:00
Bas Nieuwenhuizen	b53856aca3	amd/common: Always use addrlib for HTILE tc-compat. Even without depth+stencil addrlib can (correctly!) decide to disable tc compatible HTILE. One example is 8x sampling with 32-bit depth on Stoney. The row size on Stoney is 1024, while the tile size is 2048, which results in tile splits which are not supported with tc-compat. On Stoney, this fixes dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>	2019-12-14 20:39:29 +00:00
Bas Nieuwenhuizen	e197fb1c2f	amd/common: Fix tcCompatible degradation on Stoney. addrlib sometimes returns smaller sizes for tcCompat as it does not seem to take into account the depth+stencil matching config gymnastics with tcCompat. This fixes dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>	2019-12-14 20:39:29 +00:00
Denis Pauk	6bf14e9c47	docs/features: mark GL_ARB_texture_compression_bptc as done for llvmpipe, softpipe, swr Signed-off-by: Denis Pauk <pauk.denis@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: Marek Olšák <maraeo@gmail.com> CC: Rhys Perry <pendingchaos02@gmail.com> CC: Bruce Cherniak <bruce.cherniak@intel.com> CC: Matt Turner <mattst88@gmail.com>	2019-12-14 20:02:10 +00:00
Denis Pauk	3acc15f4f0	gallium/swr: Enable support bptc format. Reuse Code from: `f69bc797e1` gallium/auxiliary: Add helper support for bptc format compress/decompress Signed-off-by: Denis Pauk <pauk.denis@gmail.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: Marek Olšák <maraeo@gmail.com> CC: Tim Rowley <timothy.o.rowley@intel.com>	2019-12-14 20:02:10 +00:00
Rob Clark	1bf3837395	freedreno/a6xx: fix OUT_REG() vs growable cmdstream BEGIN_RING() could decide we can't fit the next packet in the current cmdstream segment, and grow a new segment. So we need to grab ring->cur after BEGIN_RING(), otherwise we are writing cmdstream past the end of the previous segment. Fixes: `bdd98b892f` ("freedreno: New struct packing macros") Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-14 09:12:39 -08:00
Erico Nunes	ce52b49348	lima: split draw calls on 64k vertices The Mali400 only supports draws with up to 64k vertices per command. To handle this, break the draw_vbo call into multiple commands. Indexed drawing is left to a separate code path. This implementation was ported from vc4_draw_vbo. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Erico Nunes	6d46d0e82b	vc4: move the draw splitting routine to shared code This can also be useful for other hardware which has similar limitations on vertex count per single draw. The Mali400 has a similar limitation and can reuse this. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Erico Nunes	2d7be5f01f	lima: refactor indexed draw indices upload As of this commit this is just a refactor in preparation to enable support for more than 64k vertices. To support splitting the draw_vbo call, indices shouldn't be re-uploaded every time. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Erico Nunes	270c282a43	lima: allocate separate bo to store varyings The current strategy using the suballocator with fixed size doesn't scale and causes some programs with large number of vertices (like some glmark2 scenes) to crash. Change it to dynamically allocate a separate bo to accommodate an arbitrary number of vertices. This also fixes the buffer read/write flags for gp. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Erico Nunes	8bf2b5db78	gallium/util: add alignment parameter to util_upload_index_buffer At least on Mali Utgard, index buffers need to be aligned on 0x40. To avoid duplicating this, add an alignment parameter. Keep the previous default for the other existing users. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Kenneth Graunke	9fb45c5bbd	drirc: Final Fantasy VIII: Remastered needs allow_higher_compat_version This gets it running on i965 with Mesa master. (The game won't start without GL 3.3 compatibility, but uses 1.20 with GL_EXT_gpu_shader4 for shaders.) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>	2019-12-13 17:58:42 -08:00
Timothy Arceri	7564c5fc6d	st/glsl_to_nir: fix SSO validation regression Fixes: b77907edb554 ("st/glsl_to_nir: use nir based program resource list builder") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2216 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-13 23:09:57 +00:00
Alyssa Rosenzweig	46f0b9ecc5	ci: Remove T760/T860 from CI temporarily I feel really bad about this but this one test is flaking. I don't want to do a mass revert (and bisection is extremely difficult with nondeterministic/Heisenbugs), but it's Friday night and master needs to pass. This commit should be reverted asap (once the flake is solved) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 22:52:39 +00:00
Rafael Antognolli	59de5d9b6a	iris: Implement WA for push constants. v2: Apply WA to gen11+ instead of gen12+ (Jordan). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-12-13 14:15:04 -08:00
Andreas Baierl	8adeeaa7f2	lima/parser: Add texture descriptor parser Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>	2019-12-13 22:02:03 +00:00
Andreas Baierl	5456916309	lima/parser: Add RSW parsing Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>	2019-12-13 22:02:03 +00:00
Andreas Baierl	31ed081ca3	lima/parser: Some fixes and cleanups Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>	2019-12-13 22:02:03 +00:00
Rafael Antognolli	6a3b8811ea	vulkan/overlay: Update docs. Add mention to overlay control socket. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	56ccea58ae	vulkan/overlay: Add basic overlay control script. This can be used to start/stop statistics capturing from the command line. v3: - Install script (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	a94fa1da93	vulkan/overlay: Add a command to start capturing data to a file. By default, if an output_file is specified, the overlay layer will start capturing data immediately. After this commit, when a control socket is used, the capture starts disabled by default, and is only enabled when a command ":capture=1;" is received. When the capture is enabled, we might have already accumulated some stats. To avoid capturing such noise, we discard and reset the fps and stats, updating the display and capturing only data from that point on. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	606dff1b73	vulkan/overlay: Add support for a control socket. Add support for socket from which the overlay layer can receive commands. This control socket can be useful to allow setting options once the application is already running. For instance, triggering the capture of fps data at a certain point. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	e87d7fea8a	vulkan/overlay: Add a control socket. v2: Use a socket instead of named pipe. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	ef5266ebd5	util/os_socket: Add socket related functions. v3: - Add os_socket.c/h into Makefile.sources (Lionel) - Add empty non-linux implementation to public functions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Eric Engestrom	c327245257	anv: drop unused #include Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-13 20:42:40 +00:00
Eric Engestrom	1a837e803b	util/simple_mtx: don't set the canary when it can't be checked Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-13 20:20:21 +00:00
Eric Engestrom	d600b19640	intel/compiler: replace `0` pointer with `NULL` Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-13 20:16:20 +00:00
Eric Engestrom	8074f68b3b	intel/compiler: add ASSERTED annotation to avoid "unused variable" warning Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-13 20:16:20 +00:00
Kenneth Graunke	91efae4f80	iris: Alphabetize source files after iris_perf.c was added	2019-12-13 11:03:13 -08:00
Rob Clark	3b8feefd9c	freedreno/ir3: add iterator macros So many open coded list iterators were getting annoying. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Rob Clark	ad92aa36ac	freedreno/ir3: add scheduler traces Add some infrastructure to trace scheduler decisions. The next patch will add some more traces, just splitting this out to reduce clutter. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Rob Clark	dd34ccb2c5	freedreno/ir3: add last-baryf shaderdb stat Sometimes sched changes that are a win in terms of instruction count and/or register pressure, are worse in real life, due to keeping varying storage locked for too long. Add a shader-db stat to give this more visibility. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Alejandro Piñeiro	2865d79a33	nir/opt_peephole_select: remove unused variables To avoid "unused variable" warnings. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-12-13 17:14:58 +01:00
Alyssa Rosenzweig	7c972eba40	panfrost: Report GPU name in es2_info We can prettify the ID. Closes #2093 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	09a2c74cfd	panfrost: Add panfrost_model_name helper This gives us a string representation of a GPU ID. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	a215289176	panfrost: Move property queries to _encoder We'll want these in non-Gallium devices. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	102789886c	panfrost: Move nir_undef_to_zero to Midgard compiler Nothing Gallium about it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	ddbbb2db48	pandecode: Add cast Fixes minor coverity warning about the format specifier. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	4f7fddbd71	panfrost: Pass size to panfrost_batch_get_scratchpad We'll compute the size with the new scratchpad helpers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
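
The draw-splitting idea described in ce52b49348 and 6d46d0e82b (break one large non-indexed draw into several commands that each stay under the hardware's per-command vertex limit) can be sketched roughly as below. This is an illustrative sketch only, not the shared Mesa helper: the names `next_chunk_count` and `split_draw`, the callback signature, and passing the limit as a parameter are all assumptions made for the example. It also assumes a list-style primitive (for example a triangle list with 3 vertices per primitive), where chunks need no overlapping vertices; strips and fans would need extra handling that is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Mesa function): size of the next chunk.
 * The limit is rounded down to a whole number of primitives so that no
 * primitive straddles two commands. Returns 0 if nothing can be emitted. */
static unsigned
next_chunk_count(unsigned remaining, unsigned max_verts,
                 unsigned verts_per_prim)
{
   assert(verts_per_prim > 0);
   unsigned step = max_verts - (max_verts % verts_per_prim);
   return remaining < step ? remaining : step;
}

/* Hypothetical driver loop: calls emit(start, count) once per chunk
 * (emit may be NULL) and returns the number of commands produced. */
static unsigned
split_draw(unsigned start, unsigned count, unsigned max_verts,
           unsigned verts_per_prim,
           void (*emit)(unsigned start, unsigned count))
{
   unsigned commands = 0;

   while (count > 0) {
      unsigned n = next_chunk_count(count, max_verts, verts_per_prim);
      if (n == 0)
         break; /* limit smaller than one primitive: cannot make progress */
      if (emit != NULL)
         emit(start, n);
      start += n;
      count -= n;
      commands++;
   }
   return commands;
}
```

With an assumed limit of 65535 vertices per command, a 200000-vertex triangle list would be emitted as three full chunks of 65535 vertices followed by one chunk with the remaining 3395.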

118690 commits