fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 15:28:18 +02:00

Author	SHA1	Message	Date
Boyan Ding	04593d9a73	gk110/ir: Add rcp f64 implementation Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	6adb9b38bf	nvc0: stick zero values for the compute invocation counts Not quite perfect, but at least we don't end up with random values in the query buffer. Fixes KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	e00799d3dc	nv50,nvc0: use condition for occlusion queries when already complete For the NO_WAIT variants, we would jump into the ALWAYS case for both nested and inverted occlusion queries. However if the query had previously completed, the application could reasonably expect that the render condition would follow that result. To resolve this, we remove the nesting distinction which unnecessarily created an imbalance between the regular and inverted cases (since there's no "zero" condition mode). We also use the proper comparison if we know that the query has completed (which could happen as a result of an earlier get_query_result call). Fixes KHR-GL45.conditional_render_inverted.functional Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	162352e671	nvc0: fix 3d images on kepler Looks like SUBFM.3D and SUEAU are perfectly capable of dealing with 3d tiling, they just need the correct inputs. Supply them. We also have to deal with the case where a 2d "layer" of a 3d image is bound. In this case, we supply the z coordinate separately to the shader, which has to optionally treat every 2d case as if it could be a slice of a 3d texture. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	5de5beedf2	nvc0/ir: fix second tex argument after levelZero optimization We used to pre-set a bunch of extra arguments to a texture instruction in order to force the RA to allocate a register at the boundary of 4. However with the levelZero optimization, which removes a LOD argument when it's uniformly equal to zero, we undid that logic by removing an extra argument. As a result, we could end up with insufficient alignment on the second wide texture argument. Instead we switch to a different method of achieving the same result. The logic runs during the constraint analysis of the RA, and adds unset sources as necessary right before being merged into a wide argument. Fixes MISALIGNED_REG errors in Hitman when run with bindless textures enabled on a GK208. Fixes: `9145873b15` ("nvc0/ir: use levelZero flag when the lod is set to 0") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	4443b6ddf2	nvc0/ir: always use CG mode for loads from atomic-only buffers Atomic operations don't update the local cache, which means that we would have to issue CCTL operations in order to get the updated values. When we know that a buffer is primarily used for atomic operations, it's easier to just avoid the caching at that level entirely. The same issue persists for non-atomic buffers, which will have to be fixed separately. Fixes the failing dEQP-GLES31.functional.atomic_counter.* tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Ilia Mirkin	399215eb7a	nvc0: add support for handling indirect draws with attrib conversion The hardware does not natively support FIXED and DOUBLE formats. If those are used in an indirect draw, they have to be converted. Our conversion tries to be clever about only converting the data that's needed. However for indirect, that won't work. Given that DOUBLE or FIXED are highly unlikely to ever be used with indirect draws, read the indirect buffer on the CPU and issue draws directly. Fixes the failing dEQP-GLES31.functional.draw_indirect.random.* tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 19:35:57 -05:00
Kristian H. Kristensen	0f7a20e91e	freedreno/a6xx: Use tiling for all resources We used to restrict this to just PIPE_BIND_SAMPLER_VIEW resources, but most resources benefit from being tiled. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-02-06 15:28:48 -08:00
Kristian H. Kristensen	357ea7da51	freedreno/a6xx: Emit blitter dst with OUT_RELOCW We're writing to the bo and the kernel needs to know for fd_bo_cpu_prep() to work. Fixes: `f93e431272` ("freedreno/a6xx: Enable blitter") Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-06 15:22:25 -08:00
Bas Nieuwenhuizen	13ab63bb62	radv: Implement VK_EXT_buffer_device_address. v2: Also update the release notes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:37:38 +01:00
Bas Nieuwenhuizen	3259e7b036	radv: Do not use the bo list for local buffers. The kernel already does it for us. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:36:19 +01:00
Bas Nieuwenhuizen	8a15950211	amd/common: Implement global memory accesses. Needed for VK_EXT_buffer_device_address. The pointers are implemented as i8*, since I could not figure out how to emulate setting struct offsets in LLVM based on the SPIR-V offsets (and more weird stuff like row major matrices). Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:36:11 +01:00
Bas Nieuwenhuizen	5703ecf651	amd/common: Do not use 32-bit loads for shared memory. We use a straight glsl->llvm type conversion so types should already be right. Also even though the writemasks were changed we were not actually doing 32-bit things, so this fails miserably. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:36:06 +01:00
Bas Nieuwenhuizen	8d1718590b	amd/common: handle nir_deref_cast for shared memory from integers. Can happen e.g. after a phi. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:36:02 +01:00
Bas Nieuwenhuizen	830fd0efc1	amd/common: Handle nir_deref_type_ptr_as_array for shared memory. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:58 +01:00
Bas Nieuwenhuizen	dbdb44d575	amd/common: Fix stores to derefs with unknown variable. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:54 +01:00
Bas Nieuwenhuizen	3c24fc64c7	amd/common: Use correct writemask for shared memory stores. The check was for 1 bit being set, which is clearly not what we want. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:49 +01:00
Bas Nieuwenhuizen	00253ab2c4	radv: Fix the shader info pass for not having the variable. For example with VK_EXT_buffer_device_address or VK_KHR_variable_pointers. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:45 +01:00
Bas Nieuwenhuizen	58c8dadd32	amd/common: Implement ptr->int casts in ac_to_integer. For the implicit casts inherent in nir. This should probably have been done for shared memory for VK_KHR_variable_pointers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:40 +01:00
Bas Nieuwenhuizen	e00d9a9a72	amd/common: Add gep helper for pointer increment. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:36 +01:00
Bas Nieuwenhuizen	39ab4e12f7	radv: Only look at pImmutableSamples if the descriptor has a sampler. Equivalent of ANV patch `c7f4a2867c` CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-02-06 22:35:32 +01:00
Eric Engestrom	40b53a7203	xvmc: fix string comparison Fixes: `6fca18696d` "g3dvl: Update XvMC unit tests." Cc: Younes Manton <younes.m@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 18:15:43 +00:00
Eric Engestrom	110a6e1839	xvmc: fix string comparison Fixes: `c7b65dcaff` "xvmc: Define some Xv attribs to allow users to specify color standard and procamp" Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-06 18:15:43 +00:00
Marek Olšák	42a1cd034d	radeonsi: use local ws variable in si_need_dma_space Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	2c4911c652	radeonsi: don't leak an index buffer if draw_vbo fails Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	d72c319867	radeonsi: make allocator_zeroed_memory unmappable and use bigger buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	5068dec5de	radeonsi: clear allocator_zeroed_memory with SDMA so that it can be used in parallel IBs. This also removes the SO_FILLED_SIZE hack. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	7d4c935654	radeonsi: initialize textures using DCC to black when possible Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Jonathan Marek	3361305f57	freedreno: a2xx: fix fast clear Fixes: `912a9c8d` Signed-off-by: Jonathan Marek <jonathan@marek.ca> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-06 14:34:57 +00:00
Eric Engestrom	54fa5eceae	egl: use coherent variable names `EGLDisplay` variables (the opaque Khronos type) have mostly been consistently called `dpy`, as this is the name used in the Khronos specs. However, `_EGLDisplay` variables (our internal struct) have been randomly called `dpy` when there was no local variable clash with `EGLDisplay`s, and `disp` otherwise. Let's be consistent and use `dpy` for the Khronos type, and `disp` for our struct. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-02-06 11:53:24 +00:00
Eric Anholt	3c08ecf147	v3d: Whitespace consistency fix.	2019-02-05 15:46:42 -08:00
Eric Anholt	940501a446	v3d: Fix copy-propagation of input unpacks. I had a single function for "does this do float input unpacking" with two major flaws: It was missing the most common thing to try to copy propagate a f32 input unpack to (the VFPACK to an FP16 render target) along with several other ALU ops, and also would try to propagate an f32 unpack into a VFMUL which only does f16 unpacks. instructions in affected programs: 659232 -> 655895 (-0.51%) uniforms in affected programs: 132613 -> 135336 (2.05%) and a couple of programs increase their thread counts. The uniforms hit appears to be a pattern in generated code of doing (-a >= a) comparisons, which when a is abs(b) can result in the abs instruction being copy propagated once but not fully DCEed.	2019-02-05 15:46:04 -08:00
Eric Anholt	e5c6938590	v3d: Fix input packing of .l for rounding/fdx/fdy. Avoids a regression in dEQP-GLES3.functional.shaders.derivate.fwidth.texture.* once we start copy-propagating more input packs.	2019-02-05 15:45:23 -08:00
Eric Anholt	1a4170952d	v3d: Fix pack/unpack of VFPACK operand unpacks. We want to be able to copy propagate our texture unpacks into the vfpack.	2019-02-05 15:45:23 -08:00
Eric Anholt	d0fdbd4211	v3d: Fix dumping of shaders with alpha test. We were trying to print a NULL entry from the table.	2019-02-05 15:42:14 -08:00
Eric Anholt	bdef17b052	v3d: Store the actual mask of color buffers present in the key. If you only bound rt 1+, we'd still emit a write to the rt0 that isn't present (noticed while debugging an ext_framebuffer_multisample-alpha-to-coverage-no-draw-buffer-zero regression in another change).	2019-02-05 15:42:04 -08:00
Eric Anholt	17a649af05	v3d: Fix precompile of FRAG_RESULT_DATA1 and higher outputs. I was just leaving the other MRT targets than DATA0 out, by accident.	2019-02-05 15:35:49 -08:00
Kristian H. Kristensen	ba4b22011a	st/nir: Use src/ relative include path for autotools Fixes: `cdc53fa81c` Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-05 14:19:51 -08:00
Kenneth Graunke	8fa54bc549	gallium: Add a PIPE_CAP_NIR_COMPACT_ARRAYS capability bit. Iris would like to use compact arrays for tesslevels and clip/cull distances. radeonsi will likely want to switch to these at some point, since it'll be necessary for GL_ARB_gl_spirv support, but it's not ready for them just yet. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	cf731564e6	st/nir: Call nir_lower_clip_cull_distance_arrays(). Today, st always sets LowerCombinedClipCullDistance, causing the GLSL IR lowering to run, giving us vec4[2] arrays. I would like to disable this and instead run the NIR lowering so that we get compact float[] arrays instead. Calling the new pass is a noop if the GLSL IR pass has already run, so it's safe to call the pass unconditionally. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	15c6902117	nir: Avoid splitting compact arrays into per-element variables. Compact arrays are used for special variables like clip and cull distances, or tessellation levels. Drivers using compact arrays assume that these values will always be actual arrays. We don't want to turn a float[1] gl_CullDistance into a single float; that would confuse drivers. Today, i965 uses compact arrays, and Gallium drivers use nir_lower_io_arrays_to_elements, so we haven't had any overlap that would demonstrate the issue. Iris will use both. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	ba9dcc80fb	nir: Avoid clip/cull distance lowering multiple times. A couple places in st/nir assume that cull distances have been lowered away, so it will need to call this lowering pass for drivers which opt out of the GLSL IR lowering. The Intel backend also calls this pass, for i965 and anv. We need to only do it once. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	5730364d69	nir: Bail on clip/cull distance lowering if GLSL IR already did it. We have a GLSL IR pass to convert clip/cull distance float[] arrays into vec4[2] arrays. In `ff281e6204`, we attempted to skip this pass if the GLSL IR lowering had already run. But, that code was not quite right, as we forgot to strip away the per-vertex IO array layer for geometry and tessellation shader varyings. If the GLSL IR pass has run, the variables will not be marked as "compact". So we can simply check that and bail. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	ef99f4c8d1	compiler: Mark clip/cull distance arrays as compact before lowering. nir_lower_clip_cull_distance_arrays() marks the combined clip/cull distance array as compact. However, when translating in from GLSL or SPIR-V, we were not marking the original float[] arrays as compact. We should do so. That way, we can detect these corner cases properly. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-02-05 13:58:46 -08:00
Kenneth Graunke	3327c93510	nir: Record info->fs.pixel_center_integer in lower_system_values radeonsi uses a system value for gl_FragCoord rather than an input var. These get translated into load_frag_coord NIR intrinsics, which lose the pixel_center_integer and origin_upper_left decorations. To cope with this, Tim added a shader_info field for pixel_center_integer, and made glsl_to_nir set it accordingly. prog_to_nir also needs to handle these fragcoord conventions. Instead of duplicating the logic to set the info field, just move it to nir_lower_system_values so it'll happen regardless of who makes the NIR. (For what it's worth, we don't need an info flag for origin_upper_left, because radeonsi lowers origin conventions in nir_lower_wpos_ytransform before nir_lower_system_values destroys the variable and qualifiers.) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:51:52 -08:00
Kenneth Graunke	536abd453b	program: Extend prog_to_nir to handle system values. Some drivers, such as radeonsi, use a system value for gl_FragCoord rather than an input variable. In this case, our Mesa IR will have a PROGRAM_SYSTEM_VALUE register, which we need to translate. This makes prog_to_nir work for Gallium drivers which expose the PIPE_CAP_TGSI_FS_POSITION_IS_SYSVAL capability bit. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:51:51 -08:00
Kenneth Graunke	fa38ca25f6	program: Use u_bit_scan64 in prog_to_nir. We can simply iterate the bits rather than using util_last_bit and checking each one up until that point. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:51:50 -08:00
Kenneth Graunke	a01ad3110a	st/mesa: Add NIR versions of the PBO upload/download shaders. Acked-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:42 -08:00
Kenneth Graunke	a02349b9e7	st/mesa: Add a NIR version of the OES_draw_texture built-in shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:41 -08:00
Kenneth Graunke	be492affa8	st/mesa: Add NIR versions of the clear shaders. We implement the basic VS and FS, as well as the VS that does layered clears by writing gl_Layer from the vertex shader. Drivers which need a geometry shader for writing layer continue falling back to TGSI, as I didn't need this and so didn't bother implementing it. (We certainly could, however, if people want to add it in the future.) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Eric Anholt <eric@anholt.net>	2019-02-05 13:43:39 -08:00

98912 commits