fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-05 13:58:04 +02:00

Author	SHA1	Message	Date
Alejandro Piñeiro	691cee751a	nir/linker: add ubo/ssbo to the program resource list v2: "nir/linker: Use the stageref when adding UBO/SSBO resources" squashed on this one (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	a638971929	nir/linker: Fill the uniform's BLOCK_INDEX Binding comparison is used to determine the block the uniform is part of. Note that to do the binding comparison we need the information in UniformBlocks[] and ShaderStorageBlocks[] to be available, so we have to call gl_nir_link_uniform_blocks() before linking the uniforms. v2: add missing break (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-12 23:42:41 +02:00
Samuel Pitoiset	f239e22813	radv/gfx10: enable 1D textures Mirror RadeonSI. This also fixes crashes in addrlib. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 18:25:45 +02:00
Andres Gomez	f4d2be03b1	intel/compiler: remove abandoned comments `c8665005`: ("intel/compiler: Don't always require precise lowering of flrp") forgot to remove some comments that didn't apply any more after the change. Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrnd.net>	2019-07-12 16:15:20 +00:00
Andres Gomez	9aadd5d688	nir/compiler: keep same bit size when lowering with flrp This was probably not caught before because no supported test was exercising the flrp lowering with other bit size different than 32. With the arrival of VK_KHR_shader_float_controls we will have some of those and, unless we keep the bit size, we will end with something like: ../src/compiler/nir/nir_builder.h:420: nir_builder_alu_instr_finish_and_insert: Assertion `src_bit_size == bit_size' failed. Fixes: `158370ed2a` ("nir/flrp: Add new lowering pass for flrp instructions") Fixes: `ae02622d8f` ("nir/flrp: Lower flrp(a, b, c) differently if another flrp(_, b, c) exists") Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrnd.net>	2019-07-12 16:15:20 +00:00
Jason Ekstrand	16842b2391	anv: Properly compute image usage in CreateImageView With separate stencil usage, we can't just grab the usage from the image directly and have to consider the per-aspect usage instead. Fixes: `1be38f9178` "anv:Use VK_EXT_separate_stencil_usage to avoid..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-12 16:13:48 +00:00
Samuel Pitoiset	b393b2ce95	radv/gfx10: emit DISABLE_CONSERVATIVE_ZPASS_COUNTS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:12 +02:00
Samuel Pitoiset	8cc4e4a81e	radv/gfx10: init more registers in the graphics preamble Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:12 +02:00
Samuel Pitoiset	e68b55f5e3	radv/gfx10: set HS/GS/CS.WGP_MODE Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:12 +02:00
Samuel Pitoiset	5d5e26230a	radv/gfx10: emit GE_PC_ALLOC Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Samuel Pitoiset	df062afa03	radv/gfx10: enable vertex shaders without export parameters GFX10 allows this. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Samuel Pitoiset	3f76c0f47c	radv/gfx10: launch 2 compute waves per CU before going onto the next CU Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Samuel Pitoiset	e631d65fc6	radv: use ac_get_compute_resource_limits() No behaviour change. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Samuel Pitoiset	e510c5ee3b	ac: import ac_get_compute_resource_limits() from RadeonSI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Alyssa Rosenzweig	5f4f8aec74	panfrost: Initialize shift/extra_flags Don't rely on them being preinitialized to zero; this can cause junk to appear on the wire. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 07:38:37 -07:00
Alyssa Rosenzweig	6d8490f900	panfrost: Fix build warnings A bunch of these are from asserts not being compiled in 32-bit mode (once Erik's ASSERTABLE stuff is merged, we'll want to switch). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 07:38:37 -07:00
Samuel Pitoiset	37aefb2be1	radv/gfx10: invalidate everything in L2 when shaders read data This includes metadata as well. On GFX10, we have to invalidate the L2 metadata cache when shaders read DCC. Note that we still have to implement GFX10 coherency by introducing INV_L2_METADATA but for now just flush L2. This fixes a corruption with DCC and Talos. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 14:08:12 +02:00
Samuel Pitoiset	4e38322dd8	radv/gfx10: fix wrong emission of GE_CNTL Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 12:15:08 +02:00
Samuel Pitoiset	219d6939df	radv: add more assertions to make sure packets are correctly emitted Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 12:15:06 +02:00
Alejandro Piñeiro	85b78f96a6	v3d: use inc/dec tmu operation with image atomic sub/add of 1 This allows to remove a mov of 1/-1, as it is implicit with the operation. As with atomic inc/dec/add, usual shader-db set doesn't include any GLES shader using it. So using as workaround vk-gl-cts shaders, we get this: total instructions in shared programs: 1217013 -> 1217006 (<.01%) instructions in affected programs: 53 -> 46 (-13.21%) helped: 2 HURT: 0 One of the helped shader went from 40 to 34 instructions. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:51:22 +02:00
Alejandro Piñeiro	2e22879115	v3d: refactor some code from v3d40_vir_emit_image_load_store And moved to new auxiliar method v3d40_image_load_store_tmu_op, equivalent to the nir_to_nir v3d_general_tmu_op, to clean-up a little. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:49:29 +02:00
Alejandro Piñeiro	934ce48db8	v3d: use inc/dec tmu operation with atomic sub/add of 1 Among other things, this avoid the need of loading 1/-1 constants (so one less operation). The removed comment suggest the option of adding support on NIR for inc/dec. Intel just uses an auxiliar method to get which hw operation is needed, so no lowering is needed. And at the same time, being so small, seems unreasonable to try to add a general one on NIR itself. It is more easy to just adapt the method here (that is what the patch does right now). It is worth to note that we are not getting any change on shader-db stats because all those methods are used on the usual shader-db set with shaders needing GLSL > 4.2. In general there aren't too many GLSL ES 3.1 tests. As an alternative, we captured the GLES3/GLSL31/GLS32 used on vk-gl-cts, even if that is not a real life usage of shaders. With those we get the following: total instructions in shared programs: 1217022 -> 1217013 (<.01%) instructions in affected programs: 117 -> 108 (-7.69%) helped: 6 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.50 x̃: 1 helped stats (rel) min: 3.57% max: 10.00% x̄: 8.09% x̃: 9.09% 95% mean confidence interval for instructions value: -2.07 -0.93 95% mean confidence interval for instructions %-change: -10.54% -5.64% Instructions are helped. Note that the shaders helped are really low because most of the vk-gl-cts tests using AtomicInc/Dec/Add are mostly used on compute shaders. Although right now there is a branch around with CS support, the usual is doing the stats against master. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:48:40 +02:00
Alejandro Piñeiro	3912a32a79	v3d: remove redefinition of tmu operations on nir_to_vir They are already defined, although is a slightly different format on the generated packet headers, so it was needed to change how it is used on nir_to_vir. In addition to allow to remove some duplicated headers, it will allow to define just one get_op_for_atomic_add aux method later to support using inc/dec instead of add of 1/-1. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:48:17 +02:00
Alejandro Piñeiro	c2ff38d2df	v3d: tweak initial comment on pack generator script As the files it mentions to use as reference has slightly different names. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:48:09 +02:00
Yevhenii Kolesnikov	8c5692b696	glsl/link_varyings: Fix hash table leak Hash tables were not destroyed at return. v2: Use ralloc_context (Eric Anholt) Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-12 11:07:08 +03:00
Kenneth Graunke	712ac83033	iris: Simplify devinfo access in calculate_result_on_gpu() We have devinfo, no need for screen->devinfo.	2019-07-12 00:33:19 -07:00
Iago Toral Quiroga	10d50f2904	v3d: remove unused definitions Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	8e50a9f6cf	v3d: move implementation of some intrinsics to separate helpers Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	d69184204e	v3d: emit correct lowering for logic ops with RGB10A2 render targets Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	7bf3676845	v3d: emit correct lowering for logic ops with integer render targets Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	e540775f0c	v3d: add lowering for OpenGL logic operations This implements support for OpenGL logic operations by emitting code to read from the TLB if needed and blending the fragment output accordingly. It is similar to VC4's blend lowering pass, but exclusive to logic operations, since blending is otherwise supported in hardware. The pass doesn't handle MSAA targets yet. Fixes the following piglit tests: spec/!opengl 1.0/gl-1.0-logicop/* spec/!opengl 1.1/gl-1.1-xor spec/!opengl 1.1/gl-1.1-xor-copypixels It also fixes text cursor rendering in Libreoffice with the GTK+2 theme, which is rendered via glamor using the XOR logic operation. v2: fix checks for allowed variable location and maximum render target (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	7c1d708911	v3d: acquire scoreboard lock before first tlb read Until now we have always been emitting our scoreboard locks on the last thread switch to improve parallelism. We did this by emitting our last thread switch right before our tlb writes at the very end of the program, where we know that we are outside control flow. Unfortunately, this strategy is not valid when we have tlb color reads too, as these will happen before this point in the program and can happen inside control flow. To fix this we always emit a thread switch before the first tlb load and if we see additional thread switches after that point, we change the strategy to lock on the first thread switch. v2: change the solution so it is expected to work in more scenarios (Eric). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	47d7c80dc7	v3d: implement tile buffer color read intrinsic We will be emitting this intrinsic to signal TLB color loads when we implement OpenGL logic operations, where we need to blend the fragment shader color output with the existing color in the render target. Per-sample TLB reads are not supported yet. v2: fix the offset into the color_reads array (Eric). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	b0eec9e27d	nir: add a new v3d-specific intrinsic for tile buffer color reads This is intended to be used, for example, with OpenGL logic operations. It takes a render target as source and a sample index in the base index for MSAA color reads. v2: drop the CAN_ELIMINATE and CAN_REORDER flags (Eric). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	6af1bdefa9	v3d: fix size of color_reads and sample_colors arrays We need to scale the size of these arrays to consider up to V3D_MAX_DRAW_BUFFERS render targets and 4 components per color. v2: we want to store each color component separately, so scale by 4 too. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	0279ac6e51	v3d: add color formats and swizzles to the fragment shader key We are going to need these very soon to emit correct reads from the tlb to implement logic operations. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	d26b35ba44	v3d: add helpers to emit ldtlb and ldtlbu signals The ldtlbu version will read an implicit uniform with the TLB read specifier and should be used for the first read in a sequence of TLB reads (unless the default configuration is valid, in which case we can use ldtlb). The ldtlb version is used for any subsequent TLB read in the sequence. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	aff8885cf9	v3d: handle tlb read dependency tracking as if they were writes Tile buffer reads are emitted as ordered sequences and cannot be reordered. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	4793e2c888	v3d: instructions with the ldtlb and ldtlbu signals are tlb instructions Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	83a66e10de	v3d: tlb loads cannot be removed Loads from the tile buffer are emitted in ordered sequences so we cannot eliminate or reorder any of them. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	08f4dc3adc	v3d: the ldtlbu signal reads an implicit uniform Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	271bc8acfb	v3d: handle ldtlb and ldtlbu signals during disassembly We already have code to print these signals but the early return in the code that checks if any signals are present was missing the checks for them, so it would skip printing them unless they were paired with other signals. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Samuel Pitoiset	958ee4c21a	radv: report shader stage name when dumping LLVM IR For debugging purposes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	2b6a089813	radv: tidy up radv_get_shader_name() and add NGG stages Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	ffd6a979bf	radv/gfx10: update OVERWRITE_COMBINER_{MRT_SHARING,WATERMARK} DCC related, mirror RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	c6fa4de15d	radv/gfx10: do not set alignment on the ngg_emit pointer This is invalid and this fixes a crash in LLVM. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	df0a23ad1e	radv/gfx10: fix exporting clip/cull distances for GS This fixes dEQP-VK.clipping.user_defined.clip_distance.geom. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	edcd2bc833	radv/gfx10: fix exporting the subpass view index for GS This fixes dEQP-VK.multiview.geometry. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:20 +02:00
Timothy Arceri	3043908ccb	mesa: save/restore SSO flag when using ARB_get_program_binary Without this the restored program will fail the pipeline validation checks when we attempt to use an SSO program. Fixes: `c20fd744fe` ("mesa: Add Mesa ARB_get_program_binary helper functions") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111010	2019-07-12 09:26:53 +10:00
Alyssa Rosenzweig	fe783c5b0c	pan/midgard: Correct component count clamping PSIZ Kind of a funky corner case that does not (as far as I know) apply to organic shaders from GLES but does pop up in generated shaders from the fixed-function desktop pipeline. Fixes: `bb483a9166` ("panfrost: Clamp point size") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-11 13:30:55 -07:00

113039 commits