fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-04 14:08:05 +02:00

Author	SHA1	Message	Date
Jordan Justen	8a019f5601	i965: Add shader cache support for compute v2: * Use MAYBE_UNUSED. (Matt) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	42383faf51	i965: add shader cache support for tess stages v2: * Use MAYBE_UNUSED. (Matt) [jordan.l.justen@intel.com: _cached_program => brw_disk_cache__program] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	5a4afd822f	i965: add shader cache support for geometry shaders v2: * Use MAYBE_UNUSED. (Matt) [jordan.l.justen@intel.com: _cached_program => brw_disk_cache__program] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	2589e7ddaf	i965: Add shader cache support for vertex and fragment stages This enables the cache on vertex and fragment shaders only. v2: * Use MAYBE_UNUSED. (Matt) [jordan.l.justen@intel.com: reword subject] [jordan.l.justen@intel.com: _cached_program => brw_disk_cache__program] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	516d50db31	i965: add initial implementation of on disk shader cache This uses the Mesa disk_cache support to write out the final linked binary for vertex and fragment shader programs. This is based off the initial implementation done by Carl Worth. It has been significantly reworked, first by Tim Arceri, and then by Jordan Justen. v2: * Squash 'i965: add image param shader cache support' * Squash 'i965: add shader cache support for pull param pointers' * Sustantially simplified by a rework on top of Jason's `2975e4c56a`. * Rename load_program_data to read_program_data. (Jason) v3: * Simplify and align program read/write. (Jason) v4: * Don't save prog_data size since we know it from the stage. (Ken) * Don't save program size, since prog_data includes the size. (Ken) * Remove `assert` that potentially could be triggered by disk corruption of the cache entries. (Ken) * Fix compute shader scratch allocation. (Ken) * Remove special case mapping for non-LLC. (Ken) * Remove SET_UPLOAD_PARAMS macro [jordan.l.justen@intel.com: _cached_program => brw_disk_cache__program] [jordan.l.justen@intel.com: brw_shader_cache.c => brw_disk_cache.c] [jordan.l.justen@intel.com: don't map to write program when LLC is present] [jordan.l.justen@intel.com: set program_written_to_cache on read from cache] [jordan.l.justen@intel.com: only try cache when status is linking_skipped] [jordan.l.justen@intel.com: all v2-v4 changes noted above] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	f9d5a7add4	i965: Calculate thread_count in brw_alloc_stage_scratch Previously, thread_count was sent in from the stage after some stage specific calculations. Those stage specific calculations were moved into brw_alloc_stage_scratch, which will allow the shader cache to also use the same calculations. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	f082d7f64f	intel/compiler: Add functions to get prog_data and prog_key sizes for a stage v2: * Return unsigned instead of size_t. (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	05b1193361	intel/compiler: Add union types for prog_data and prog_key stages Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	4c7a1ec62a	blob: Don't set overrun if reading 0 bytes at end of data Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	3dcbc5cdaa	intel/compiler: Remove final_program_size from brw_compile_* The caller can now use brw_stage_prog_data::program_size which is set by the brw_compile_* functions. Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Carl Worth	540636045f	intel/compiler: add new field for storing program size This will be used by the on disk shader cache. v2: * Set in brw_compile_* rather than brw_codegen_*. (Jason) Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> [jordan.l.justen@intel.com: Only add to brw_stage_prog_data] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	1edf0fe612	i965: Don't rely on nir for uses_texture_gather When a program is restored from the shader cache, prog->nir will be NULL, but prog->info will be restored. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	0610a624a1	i965/link: Serialize program to nir after linking for shader cache If the shader cache is enabled, after linking the program, we serialize the program to nir. This will be saved out by the glsl shader cache support. Later, if the same program is found in the cache, we can use the nir for a fallback in the unlikely case that the gen binary program is not found in the cache. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	6b815e405d	glsl/shader_cache: Save and restore serialized nir in gl_program v3: * Rename serialized_nir* to driver_cache_blob*. (Tim) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	571bee96d5	main: Add driver cache blob fields to gl_program These fields can be used to optionally save off a driver blob with the program metadata. For example, serialized nir, or tgsi. v3: * Rename serialized_nir* to driver_cache_blob. (Tim) Free memory. (Jason) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:53 -07:00
Jason Ekstrand	54f691311c	nir: Add hooks for testing serialization Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-31 23:36:53 -07:00
Connor Abbott	120da00975	nir: add serialization and deserialization v2 (Jason Ekstrand): - Various whitespace cleanups - Add helpers for reading/writing objects - Rework derefs - [de]serialize nir_shader::num_* - Fix uses of blob_reserve_bytes - Use a bitfield struct for packing tex_instr data v3: - Zero nir_variable struct on deserialization. (Jordan) - Allow nir_serialize.h to be included in C++. (Jordan) - Handle NULL info.name. (Jason) - Set info.name to NULL when name is NULL. (Jordan) Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:53 -07:00
Dave Airlie	57892a23be	mesa/st: implement max combined output resources limiting. if the driver sets the cap, then use the value it gives us. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-01 10:07:07 +10:00
Dave Airlie	d3fdd66401	gallium: add cap for driver specified max combined shader resources. Some hw (evergreen) has a limit on how many combined (images/buffers/mrts) a fragment shader can access. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-01 10:07:03 +10:00
Gert Wollny	69eee511c6	r600/sb: bail out if prepare_alu_group() doesn't find a proper scheduling It is possible that the optimizer ends up in an infinite loop in post_scheduler::schedule_alu(), because post_scheduler::prepare_alu_group() does not find a proper scheduling. This can be deducted from pending.count() being larger than zero and not getting smaller. This patch works around this problem by signalling this failure so that the optimizers bails out and the un-optimized shader is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-01 09:33:40 +10:00
Timothy Arceri	e80bbd6f52	radeonsi: fix culldist_writemask in nir path The shared si_create_shader_selector() code already offsets the mask. Fixes the following piglit tests: arb_cull_distance/clip-cull-3.shader_test arb_cull_distance/clip-cull-4.shader_test Fixes: `29d7bdd179` (radeonsi: scan NIR shaders to obtain required info) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-01 09:41:11 +11:00
Neil Roberts	b697ece10a	nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB Previously the values were calculated by just shifting ~0 by the invocation ID. This would end up including bits that are higher than gl_SubGroupSizeARB. The corresponding CTS test effectively requires that these high bits be zero so it was failing. There is a Piglit test as well but this appears to checking the wrong values so it passes. For the two greater-than bitmasks, this patch adds an extra mask with (~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero. Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Neil Roberts <nroberts@igalia.com>	2017-10-31 23:28:00 +01:00
Nanley Chery	9e849eb8bb	i965: Check CCS_E compatibility for texture view rendering Only use CCS_E to render to a texture that is CCS_E-compatible with the original texture's miptree (linear) format. This prevents render operations from writing data that can't be decoded with the original miptree format. On Gen10, with the new CCS_E-enabled formats handled, this enables the driver to pass the arb_texture_view-rendering-formats piglit test. v2. Add a TODO for texturing. (Jason) Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 14:26:23 -07:00
Nanley Chery	c7baaafe54	intel/isl: Disable some gen10 CCS_E formats for now CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2 formats with CCS_E. None of these formats fit within the current blorp_copy framework so disable them until support is added. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 14:26:23 -07:00
Eric Engestrom	3ba973fe37	meson: pass correct args to gles2 ABI test Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 18:00:11 +00:00
Eric Engestrom	8a3022ffa0	meson: pass correct args to gles1 ABI test Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 18:00:06 +00:00
Eric Engestrom	5eb7bd0e77	meson: pass correct args to gbm symbol test Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 18:00:00 +00:00
Eric Engestrom	be301ab724	meson: pass correct args to wayland-egl symbol test Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:54 +00:00
Eric Engestrom	64f17440b6	automake+meson: don't run egl symbol check on libglvnd lib We might want to add a symbol check for the glvnd variant though. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:49 +00:00
Eric Engestrom	1946de2b71	meson: pass correct env/args to egl tests Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:44 +00:00
Eric Engestrom	ddb3a695a8	gles2: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:39 +00:00
Eric Engestrom	4e7612c54d	gles1: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:34 +00:00
Eric Engestrom	d830351bfa	gbm: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:29 +00:00
Eric Engestrom	4e43ba5687	wayland-egl: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:24 +00:00
Eric Engestrom	0529a63384	egl: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:19 +00:00
Dylan Baker	1c59119368	meson: set visibility flags on gbm This is done in autotools, and is an oversight in the meson build. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-31 10:41:19 -07:00
Dylan Baker	c16486f5db	meson: Don't link gbm with threads It's supposed to be linked with pthread-stubs (if the platform needs pthread-stubs). Pthread stubs support isn't (yet) implemented in the meson build, so add a TODO. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-31 10:41:19 -07:00
Dylan Baker	0589331d54	meson: Use true and false instead of yes and no for tristate options This allows a user to not care whether they're setting a tristate or a boolean option, which is a nice user facing feature, and something I've personally run into. Suggested-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-31 10:37:17 -07:00
Andrey Grodzovsky	f03b7c9ad9	winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-10-31 16:55:24 +01:00
Erik Faye-Lund	5c2ff5773a	meson: do not search for needless deps If we don't want to use these deps, there's no good reason to search for them in the first place. This should shave a bit of time for the initial build. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-31 15:44:13 +01:00
Samuel Pitoiset	5010436e09	radv: bail out when binding the same vertex buffers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-31 10:16:38 +01:00
Samuel Pitoiset	11fdc2cd34	radv: bail out when binding the same index buffer DOW3 appears to hit this path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-31 10:16:35 +01:00
Erik Faye-Lund	cf41c19d9f	meson: use dep_m in libgallium The u_format_other.c users sqrtf, which on some systems require a math-library. So let's make sure we link with it. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-31 08:10:37 +01:00
Timothy Arceri	e92405c55a	radv: use correct alloc function when loading from disk Fixes regression in: dEQP-VK.api.object_management.alloc_callback_fail.graphics_pipeline Fixes: `1e84e53712` "radv: add cache items to in memory cache when reading from disk" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-31 14:51:55 +11:00
Plamena Manolova	048d4c45c9	i965: Fix ARB_indirect_parameters logic. This patch modifies the ARB_indirect_parameters logic in brw_draw_prims, so that our implementation isn't affected if another application attempts to use predicates. Previously we were using a predicate with a DELTAS_EQUAL comparison operation and relying on the MI_PREDICATE_DATA register being 0. Our code to initialize MI_PREDICATE_DATA to 0 was incorrect, so we were accidentally using whatever value was written there. Because the kernel does not initialize the MI_PREDICATE_DATA register on hardware context creation, we might inherit the value from whatever context was last running on the GPU (likely another process). The Haswell command parser also does not currently allow us to write the MI_PREDICATE_DATA register. Rather than fixing this and requiring an updated kernel, we switch to a different approach which uses a SRCS_EQUAL predicate that makes no assumptions about the states of any of the predicate registers. Fixes Piglit's spec/arb_indirect_parameters/tf-count-arrays test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103085 Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-30 20:40:05 -07:00
Kenneth Graunke	877dd14e88	i965: Don't flag BRW_NEW_SURFACES unless some push constants are dirty. Due to a gaffe on my part, we were re-emitting all binding table entries on every single draw call. The push_constant_packets atom listens to BRW_NEW_DRAW_CALL, but skips emitting 3DSTATE_CONSTANT_XS for each stage unless stage_state->push_constants_dirty is true. However, it flagged BRW_NEW_SURFACES unconditionally at the end, by mistake. Instead, it should only flag it if we actually emit 3DSTATE_CONSTANT_XS for a stage. We can move it a few lines up, inside the loop - the early continues will skip over it if push constants aren't dirty for a stage. With INTEL_NO_HW=1 set, improves performance of GFXBench5 gl_driver_2 on Apollolake at 1280x720 by 1.01122% +/- 0.470723% (n=35). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-10-30 20:38:08 -07:00
Kenneth Graunke	28fcf5cd94	intel/genxml: Fix decoding of groups with fields smaller than a DWord. Groups containing fields smaller than a DWord were not being decoded correctly. For example: <group count="32" start="32" size="4"> <field name="Vertex Element Enables" start="0" end="3" type="uint"/> </group> gen_field_iterator_next would properly walk over each element of the array, incrementing group_iter, and calling iter_group_offset_bits() to advance to the proper DWord. However, the code to print the actual values only considered iter->field->start/end, which are 0 and 3 in the above example. So it would always fetch bits 3:0 of the current DWord when printing values, instead of advancing to each element of the array, printing bits 0-3, 4-7, 8-11, and so on. To fix this, we add new iter->start/end tracking, which properly advances for each instance of a group's field. Caught by Matt Turner while working on 3DSTATE_VF_COMPONENT_PACKING, with a patch to convert it to use an array of bitfields (the example above). This also fixes the decoding of 3DSTATE_SBE's "Attribute Active Component Format" fields. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-30 20:22:55 -07:00
Ian Romanick	53c7b8bdca	glsl: Fix bad formatting in a comment Trivial Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2017-10-30 20:08:25 -07:00
Eric Anholt	2a77c763fe	broadcom/vc5: Force blending to treat alpha as 1 for formats without alpha. Fixes fbo-blending-formats on RGB8 and 565. We will still need to demote blending to shader code in the MRT case to fix it in general, but that can be added when we start doing 32F blending (which also needs to be done in the shader).	2017-10-30 13:31:32 -07:00
Eric Anholt	61bb0df60e	broadcom/vc5: Do BGRA vs RGBA swapping for the BLEND_CONSTANT_COLOR. Fixes many of the fbo-blending-formats tests.	2017-10-30 13:31:32 -07:00

1 2 3 4 5 ...

97209 commits