fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 09:18:04 +02:00

Author	SHA1	Message	Date
Tapani Pälli	19a85a704b	nir: add option to use scaling factor when sampling planes YUV lowering Patch adds nir_lower_tex_options as parameter to sample_plane so that we don't need to extend nir_tex_instr for this. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-12 08:41:20 +02:00
Kenneth Graunke	3eedc8f7b1	i965: Use info->textures_used instead of prog->SamplersUsed. prog->SamplersUsed is set by the linker when validating resource limits, while info->textures_used is gathered after NIR optimizations, which may have eliminated some unused surfaces. This may let us skip some work. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:50 -08:00
Kenneth Graunke	59ae985631	i965: Drop unnecessary 'and' with prog->SamplerUnits textures_used_by_txf is a subset of textures_used which is a subset of prog->SamplerUnits. This should do nothing. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:48 -08:00
Kenneth Graunke	f5c7df4dc9	nir: Gather texture bitmasks in gl_nir_lower_samplers_as_deref. Eric and I would like a bitmask of which samplers are used, similar to prog->SamplersUsed, but available in NIR. The linker uses SamplersUsed for resource limit checking, but later optimizations may eliminate more samplers. So instead of propagating it through, we gather a new one. While there, we also gather the existing textures_used_by_txf bitmask. Gathering these bitfields in nir_shader_gather_info is awkward at best. The main reason is that it introduces an ordering dependency between the two passes. If gathering runs before lower_samplers_as_deref, it can't look at var->data.binding. If the driver doesn't use the full lowering to texture_index/texture_array_size (like radeonsi), then the gathering can't use those fields. Gathering might be run early /and/ late, first to get varying info, and later to update it after variant lowering. At this point, should gathering work on pre-lowered or post-lowered code? Pre-lowered is also harder due to the presence of structure types. Just doing the gathering when we do the lowering alleviates these ordering problems. This fixes ordering issues in i965 and makes the txf info gathering work for radeonsi (though they don't use it). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:45 -08:00
Kenneth Graunke	120f9b8362	nir: Use sampler derefs in drawpixels and bitmap lowering. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:44 -08:00
Kenneth Graunke	04bdc56872	program: Make prog_to_nir create texture/sampler derefs. Until now, prog_to_nir has been setting texture_index and sampler_index directly. This is different than GLSL shaders, which create variable dereferences and rely on lowering passes to reach this final form. radeonsi uses variable dereferences for samplers rather than texture_index and sampler_index, so it doesn't even make sense to set them there. By moving to derefs, we ensure that both GLSL and ARB programs produce the same final form that the driver desires. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:40 -08:00
Kenneth Graunke	6a4be25a90	st/nir: Use sampler derefs in built-in shaders. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:38 -08:00
Kenneth Graunke	ba9c1c8217	st/nir: Lower sampler derefs for builtin shaders. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:36 -08:00
Kenneth Graunke	8d1646e0e1	st/nir: Pull sampler lowering into a helper function. This will make it easier to reuse across GLSL / ARB / built-ins. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:35 -08:00
Kenneth Graunke	243c11dc16	i965: Call nir_lower_samplers for ARB programs. An upcoming patch will start building derefs in prog_to_nir, at which point we'll need to lower them to indexes. This gets both GLSL and non-GLSL shaders using the same paths. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:30 -08:00
Kenneth Graunke	529a0711c1	glsl: Don't look at sampler uniform storage for internal vars Passes like nir_lower_drawpixels add additional sampler variables, and set an explicit binding which never changes. These extra samplers don't have proper uniform storage associated with them, and there is no way to update bindings via the API. So, for any 'hidden' variables, just trust that there's an explicit binding set. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:28 -08:00
Kenneth Graunke	d34e434989	glsl: Allow gl_nir_lower_samplers*() without a gl_shader_program I would like to be able to run gl_nir_lower_samplers() to turn texture and sampler variable dereferences into indexes and offsets, even for ARB programs, and built-in shaders. This would make sampler handling more consistent across the various types of shaders. For GLSL programs, the gl_nir_lower_samplers_as_deref() pass looks up the variable bindings in the shader program's uniform storage. But ARB programs and built-in shaders don't have a gl_shader_program, and uniform storage doesn't exist. In this case, we simply skip that lookup, and trust var->data.binding to be set correctly by whoever created the shader. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:34:22 -08:00
Kenneth Graunke	f45dd6d31b	st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048 Piglit's vp-max-array test creates a vertex program containing a uniform array sized to the value of GL_MAX_NATIVE_PROGRAM_PARAMETERS_ARB. Mesa will then add additional state-var parameters for things like the MVP matrix. radeonsi currently exposes a value of 4096, derived from constant buffer upload size. This means the array will have 4096 elements, and the extra MVP state-vars would get a prog_src_register::Index of over 4096. Unfortunately, prog_src_register::Index is a signed 13-bit integer, so values beyond 4096 end up turning into negative numbers. Negative source indexes are only valid for relative addressing, so this ends up generating illegal IR. In prog_to_nir, this would cause an out of bounds array access. st_mesa_to_tgsi checks for a negative value, assumes it's bogus, and remaps it to parameter 0 in order to get something in-range. This isn't right - instead of reading the MVP matrix, it would read the first element of the vertex program's large array. But the test only checks that the program compiles, so we never noticed that it was broken. This patch limits the size of the program limits, with the understanding that we may need to generate additional state-vars internally. i965 has exposed 1024 for this limit for years, so I don't expect lowering it to 2048 will cause any practical problems for radeonsi or other drivers. Fixes vp-max-array with prog_to_nir.c. Cc: "19.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-02-11 21:09:51 -08:00
Francisco Jerez	374eb3cd6f	intel/dump_gpu: Disambiguate between BOs from different GEM handle spaces. This fixes a rather astonishing problem that came up while debugging an issue in the Vulkan CTS. Apparently the Vulkan CTS framework has the tendency to create multiple VkDevices, each one with a separate DRM device FD and therefore a disjoint GEM buffer object handle space. Because the intel_dump_gpu tool wasn't making any distinction between buffers from the different handle spaces, it was confusing the instruction state pools from both devices, which happened to have the exact same GEM handle and PPGTT virtual address, but completely different shader contents. This was causing the simulator to believe that the vertex pipeline was executing a fragment shader, which didn't end up well. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-11 12:27:22 -08:00
Kristian H. Kristensen	e404c6879d	freedreno/a6xx: Fall back to masked RGBA blits for depth/stencil The blitter doesn't seem to have a write mask, so for depth only and stencil only blits to Z24S8 we cast the Z24S8 buffer to an RGBA UNORM8 buffer and fall back to pipeline blits with corresponding write mask. Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_stencil_only dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_depth dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_depth dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8 dEQP-GLES3.functional.fbo.msaa.4_samples.stencil_index8 Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	f03ba155d5	freedreno/a6xx: Add format argument to fd6_tex_swiz() We need to allow overriding the format with that of the image or sampler view, so we can't take it from the resource in fd6_tex_swiz(). Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	bc8c813d5a	freedreno/a6xx: Support y-inverted blits The src coordinates are s24.8. For an inverted blit that ends at y=0 we need to program -1 for sy2, so we need to handle negative values correctly. Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_color dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_color Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	03a01e5d23	freedreno/a6xx: Support some depth/stencil blits on blitter We can rewrite almost all depth stencil blits to various red-only blits. The exception is depth-only or stencil-only blits into z24s8 combined depth stencil buffer. We can fall back for depth-only, but stencil-only remains broken. Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	e9592da2b4	freedreno/a6xx: Move blit check so as to restore comment The explanation for the compressed format check is broken across two comments: /* We can blit if both or neither formats are compressed formats... */ /* ... but only if they're the same compression format. */ but the ok_format() checks were inserted between, breaking up the flow of the sentence. Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	d2639f2eac	freedreno: Don't tell the blitter what it can't do Call ctx->blit() and let it reject blits it can't do instead of giving up on stencil blits and blits u_blitter can't do. Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	8cf1303698	freedreno: Consolidate u_blitter functions in freedreno_blitter.c Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	701d30dda8	freedreno/a6xx: Combine emit_blit and fd6_blit Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	6d1a7bdba3	freedreno/a6xx: Use the right resource for separate stencil stride Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	24b4172375	freedreno: Log number of draw for sysmem passes Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	a201cb157d	freedreno/a6xx: Drop render condition check in blitter We already check earlier in the call chain in fd_blit(). glBlitFramebuffer always sets render_condition_enable and thus we would never try the blitter path for that. Now that we get all of dEQP-GLES3.functional.fbo.blit.conversion.* down this path, it turns out that the fail_if(info->mask != util_format_get_mask(info->src.format)); fail_if(info->mask != util_format_get_mask(info->dst.format)); conditions weren't accurate. util_format_get_mask() returns PIPE_MASK_RGBA for any format with any color channels, while info->mask is the exact set of channels to blit. So we reject things we could blit - for example, PIPE_FORMAT_R16G16_FLOAT where info->mask is RG while util_format_get_mask() returns RGBA - and accept things we can't. It turns out that the blitter is happy to blit different number of channels, but fails to blit formats with different numerical formats and srgb formats. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-02-11 12:26:21 -08:00
Kristian H. Kristensen	4f7a9c23ed	freedreno/a6xx: regen headers Update for a6xx.xml.h to incorporate a few new bits and changes to blit src rect coordinate types. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-02-11 12:26:21 -08:00
Leo Liu	a0a52a0367	st/va/vp9: set max reference as default of VP9 reference number If there is no information about number of render targets Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-11 14:44:16 -05:00
Leo Liu	21cdb828a3	st/va: fix the incorrect max profiles report Add "PIPE_VIDEO_PROFILE_MAX" to enum, so it will make sure here will be correct when adding more profiles in the future. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109107 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-02-11 14:44:16 -05:00
Guttula, Suresh	2cf2a56739	st/va: Add support for indirect manner by returning VA_STATUS_ERROR_OPERATION_FAILED Based on the VA spec, DeriveImage() returns VA_STATUS_ERROR_OPERATION_FAILED if the driver doesn't have support for internal surface formats. Currently vaDeriveImage() fails for non-contiguous planes, and the operation-failed error string is required to support the indirect manner, i.e. vaCreateImage()+vaPutImage(), in case vaDeriveImage() failed with VA_STATUS_ERROR_OPERATION_FAILED. This patch notifies the client that the operation failed with the proper error string, so that the client will fall back to vaCreateImage()+vaPutImage(). v2: updated commit message based on VA spec. Signed-off-by: suresh guttula <suresh.guttula@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2019-02-11 14:44:16 -05:00
Marek Olšák	114a899cc8	winsys/amdgpu: cs_check_space sets the minimum IB size for future IBs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	766e920cdb	winsys/amdgpu: clean up IB buffer size computation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	8c1cb393fc	winsys/amdgpu: remove occurrence of INDIRECT_BUFFER_CONST Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	881ef14b32	winsys/amdgpu: use a separate fence list for syncobjs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	9f00123d51	winsys/amdgpu: unify fence list code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	ddfe209a0d	winsys/amdgpu: don't drop manually added fence dependencies wow, it's hard to believe that fence and syncobjs dependencies were ignored. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:48 -05:00
Marek Olšák	61c678d4bc	radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0 Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:06 -05:00
Marek Olšák	4522f01d4e	gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets > 0 Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-11 12:35:04 -05:00
Jason Ekstrand	9e6a6ef0d4	nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks When nir_rematerialize_derefs_in_use_blocks_impl was first written, I attempted to optimize things a bit by not bothering to re-materialize the sources of deref instructions figuring that the final caller would take care of that. However, in the case of more complex deref chains where the first link or two lives in block A and then another link and the load/store_deref intrinsic live in block B it doesn't work. The code in rematerialize_deref_in_block looks at the tail of the chain, sees that it's already in block B and skips it, not realizing that part of the chain also lives in block A. The easy solution here is to just rematerialize deref sources of deref instructions as well. This may potentially lead to a few more deref instructions being created, but the conditions required for that to actually happen are fairly unlikely and, thanks to the caching, it's all linear time regardless. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603 Fixes: `7d1d1208c2` "nir: Add a small pass to rematerialize derefs per-block" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-11 10:57:23 -06:00
Jason Ekstrand	fd77606b5b	intel/fs: Use enumerated array assignments in fb read TXF setup It's more clear and means we don't have to update the array every time we add an optional texture instruction argument Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-11 10:57:09 -06:00
Michel Dänzer	d6c55f6c62	gitlab-ci: Re-use docker image from the main repo in forked repos Instead of generating it from scratch in each forked repo. This should save time, energy and storage. (The xserver & xf86-video-amdgpu CI scripts do basically the same) v2: * Hardcode "mesa" instead of using $CI_PROJECT_NAME, to avoid breakage if the project name is changed after forking (Eric Engestrom) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-11 12:24:31 +01:00
Ilia Mirkin	cc79a1483f	nvc0: we have 16k-sized framebuffers, fix default scissors For some reason we don't use view volume clipping by default, and use scissors instead. These scissors were set to an 8k max fb size, while the driver advertises 16k-sized framebuffers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2019-02-10 23:36:23 -05:00
Alyssa Rosenzweig	85e2bb58ca	panfrost: Specify supported draw modes per-context Midgard has native support for QUADS and POLYGONS; Bifrost seemingly does not. Thus, Midgard generally skips prim_convert whereas Bifrost needs the pass; this patch allows the setting of allowed primitives to occur on a per-context basis (for runtime hardware selection). v2: Use (POLYGONS + 1) instead of LINES_ADJACENCY. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Robert Foss <robert.foss@collabora.com>	2019-02-11 03:23:00 +00:00
Dave Airlie	90c6880df7	radv: remove alloc parameter from pipeline init clang points out this isn't used. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-11 10:04:40 +10:00
Dave Airlie	a523ae0cac	radv/llvm: initialise passes member. Fixes coverity warning Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-02-11 08:59:02 +10:00
Dave Airlie	d2e82c2682	glsl: glsl to nir fix uninit class member. The constructor should init this to NULL Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-02-11 08:55:07 +10:00
Alyssa Rosenzweig	2458797256	panfrost: Elucidate texture op scheduling comment Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:51:57 +00:00
Alyssa Rosenzweig	658961aec3	panfrost: Remove speculative if 0'd format bit code Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:51:51 +00:00
Alyssa Rosenzweig	b1213a3947	panfrost: Remove if 0'd dead code Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:50:35 +00:00
Alyssa Rosenzweig	e91e1786c5	panfrost: Add kernel-agnostic resource management Various methods relating to resource management were previously marked as kernel-specific, forcing them to stay downstream in the vendor overlay and eventually be duplicated for DRM code. This patch adds back this code in kernel-neutral space, allowing for code sharing and minimising the diff to downstream. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:44:32 +00:00
Alyssa Rosenzweig	4ed23b193a	panfrost: Don't hardcode number of nir_ssa_defs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-10 00:42:52 +00:00
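
Note on f45dd6d31b (st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048): the commit message describes parameter indexes wrapping to negative values when stored in a signed 13-bit field. A 13-bit two's-complement field can hold -4096 through 4095. The sketch below only illustrates that arithmetic; the helper names are hypothetical and it is not Mesa's code, which stores the index in a bitfield rather than calling a function.

```c
/* Illustrative only: models storing an index into a signed 13-bit
 * two's-complement field. Names are hypothetical, not Mesa's. */

#define INDEX_FIELD_BITS 13

/* Keep the low 13 bits of a non-negative value and sign-extend them. */
static int wrap_to_signed_13bit(int value)
{
    int low = value & ((1 << INDEX_FIELD_BITS) - 1);   /* 0..8191 */
    int sign_bit = 1 << (INDEX_FIELD_BITS - 1);        /* 4096 */

    return (low & sign_bit) ? low - (1 << INDEX_FIELD_BITS) : low;
}

/* Index of the first internally added state-var, assuming (as in the
 * commit message) that state-vars are appended after a user array of
 * array_size elements. */
static int first_state_var_index(int array_size)
{
    return wrap_to_signed_13bit(array_size);
}
```

Under these assumptions, a 4096-element array puts the first state-var at index 4096, which wraps to -4096, while a 2048-element array leaves the index at 2048 with room for further state-vars.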


107439 commits