fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 17:38:08 +02:00

Author	SHA1	Message	Date
Dylan Baker	92235e0c48	meson: replace has_exe_wrapper with can_run_host_binaries The former is a deprecated alias for the latter, which more accurately describes what the function does. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20409>	2023-01-19 16:29:03 +00:00
Lionel Landwerlin	ff34e96701	nir/lower_io: fix bounds checking for 64bit_bounded_global If the offset is negative like it's the case in dEQP-VK.robustness.robustness2.bind.notemplate.r32i.unroll.volatile.storage_buffer_dynamic.readwrite.no_fmt_qual.len_256.samples_1.1d.comp we end up passing the bounds checking condition because it's using signed integers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Jason Ekstrand <jason.ekstrand@collabora.com> Cc: mesa-stable Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20762>	2023-01-19 09:16:40 +00:00
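The signed-versus-unsigned problem described in the commit above can be illustrated with a small sketch. This is not the NIR lowering code; the function names and the exact form of the check are assumptions made for illustration, and 64-bit arithmetic is emulated with masks.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit wraparound arithmetic


def to_signed64(value):
    """Reinterpret the low 64 bits of value as a two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def in_bounds_signed(offset, access_size, bound):
    """Hypothetical bounds check using a signed compare (the buggy variant)."""
    last_byte = to_signed64(offset + access_size - 1)
    return last_byte < to_signed64(bound)


def in_bounds_unsigned(offset, access_size, bound):
    """Hypothetical bounds check using an unsigned compare."""
    last_byte = (offset + access_size - 1) & MASK64
    return last_byte < (bound & MASK64)


# A negative offset slips through the signed check but not the unsigned one.
print(in_bounds_signed(-8, 4, 256))    # True
print(in_bounds_unsigned(-8, 4, 256))  # False
```

With an unsigned compare, a negative offset wraps to a very large value and is rejected, which is the behaviour a robustness test would expect.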
Alyssa Rosenzweig	e664082d35	nir/lower_blend: No-op nir_color_mask if no mask In this usual case, do a quick check to avoid generating 5 useless instructions (mov/vec4 instructions). They'll get copypropped but that creates more work for the optimizer and nir/lower_blend runs in a hot variant path on both Asahi and Panfrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	1fc25c8c79	nir/lower_blend: Handle undefs in stores nir/lower_blend asserts: assert(nir_intrinsic_write_mask(store) == nir_component_mask(store->num_components)); For the special blend shaders used in Panfrost, this holds. But for arbitrary shaders coming out of GLSL-to-NIR (as used with Asahi), this does not hold. In particular, after nir_opt_undef runs, undefined components can be trimmed. Concretely, if we have the shader: gl_FragColor.xyz = foo; Then this becomes in NIR gl_FragColor = vec4(foo.xyz, undef); and then opt_undef will give the store_deref a wrmask of xyz but 4 components. Then lower_blend asserts out. Found in a gfxbench shader on asahi. Closes: #6982 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	8b83210ab3	nir/lower_blend: Don't do logic ops on pure float Per the spec. Fixes arb_color_buffer_float-render on both Panfrost and Asahi (before/after reproduced on Mali-T860 and AGX G13 respectively). Without that patch, that test fails the assertion: arb_color_buffer_float-render: ../src/compiler/nir/nir_lower_blend.c:259: nir_blend_logicop: Assertion `util_format_is_pure_integer(format)' failed. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	dbd0615e7a	nir/lower_blend: Avoid useless iand with logic ops The upper bits start correctly, there's no need to clear them as long as we keep them zero'ed by using ixor with a valid bit mask instead of inot. Makes the code generated for logic op slightly less ridiculous. I'm joking. It's still ridiculous but I'm not in the mood to fix up the Midgard compiler and it's just a little ALU for a feature almost nothing uses. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	ee127f03e4	nir/lower_blend: Fix SNORM logic ops We need to sign extend. Incidentally this means the iand above is useless for SNORM. Fixes arb_color_buffer_float-render with GL_RGBA8_SNORM. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	f9839e7e1b	nir/lower_blend: Clamp blend factors Particularly constant colours, but also (more obscurely) SNORM. Fixes arb_color_buffer_float-render with SNORM framebuffers. Issue affects both Asahi and Panfrost (the latter after we start advertising EXT_render_snorm). v2: Check the blend factor to avoid unnecessary clamps. This avoids regressing blend shader code quality on Panfrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> [v1] Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	fca457790e	nir/lower_blend: Fix alpha=1 for RGBX format In this case we have 4 components but the value of the fourth component is undefined. Apply the fixup we already have. Fixes dEQP-GLES3.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.0 on Asahi. That test blend with DST_ALPHA with its RGB565 attachment, which is fine if RGB565 is preserved, but Asahi is demoting that format to RGBX8 which means -- after lowering the tilebuffer access -- we blend with an ssa_undef. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Lionel Landwerlin	b82d9b1a3d	nir/divergence: add missing RT intrinsic handling Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20763>	2023-01-18 22:32:43 +00:00
Qiang Yu	49cfbe1fed	nir/xfb_info: nir_gather_xfb_info_from_intrinsics update nir xfb_info Use this function to update nir_shader->xfb_info. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19489>	2023-01-18 05:30:14 +00:00
Gert Wollny	2e05cfa179	nir: Add range_base to atomic_counter and an option to use it Some drivers may encode constant offsets in the instruction, so make it possible for the drivers to request lowering the atomic uniform offset into the range_base variable of the intrinsic. v2: drop patch to use build-in array offset evaluation, it makes problems with zink, and update the code accordingly v3: always initialize range base Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19980>	2023-01-17 13:19:04 +00:00
Gert Wollny	c4cde91c1b	nir: Add possibility to store image var offset in range_base Add the intrinsic range_base value to the image intrinsics and add the option to store the image array offset into range_base instead of adding it to the image array index if the driver requests it. v2: Always initialize range_base v3: fix for bindless intrinsics Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19980>	2023-01-17 13:19:04 +00:00
Alyssa Rosenzweig	c3839bd540	nir: Optimize vendored sin/cos the same way As we've done for the AMD one, to prevent any codegen regression from switching the Midgard lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>	2023-01-16 22:20:43 +00:00
Alyssa Rosenzweig	a49ba0f1ae	nir: Add Midgard-specific fsin/fcos ops NIR has a fsin instruction that takes an argument in radians. Midgard instead has an fsinpi argument that takes an argument in multiples of pi. So, we had a NIR pass that would change fsin(x) to fsin(x / pi) and then map fsin to fsinpi in the backend. But that's invalid! In NIR, the opcode fsin is well-defined. fsin(x) means something very different than fsin(x / pi). They won't usually be equal. The transform fsin(x) -> fsin(x / pi) is fundamentally unsound. It did work before, by accident. Most NIR passes don't care about the semantics of ALU instructions. fsin(x) and fsin(x / pi) are both well-defined but fundamentally different NIR shaders. So while rewriting is wrong -- the NIR we get out is not equivalent to the NIR we put in, and the Midgard ops we generate are not equivalent to the NIR -- but if we don't run any passes that care about the definition of fsin the two wrongs will cancel out to make a right. However, some NIR passes do care about the definitions of ALU instructions, instead of treating them as named black boxes. In particular, constant folding (nir_opt_constant_fold) evaluates ALU instructions when their inputs are constants, according to the definition in nir_opcodes.py. So our little charade will only work if we don't call nir_opt_constant_fold, or if all the fsin instructions have non-constant inputs. At the beginning of this series, that is the case. With the later scalarization change, that's no longer the case, and the unsoundness translates to real failing tests rather than a quibble of NIR's semantics. To mitigate, we define a new NIR opcode with the semantics we want and translate fsin(x) = fsin_mdg(x / pi), where that equivalence does hold mathematically. So the new translation is sound and doesn't rely on lucky pass ordering. This matches the approach already used for AMD and AGX, which have fsin_amd and fsin_agx opcodes respectively. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>	2023-01-16 22:20:43 +00:00
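The soundness argument in the commit above can be checked numerically. The sketch below models `fsin` as taking radians and `fsin_mdg` as taking multiples of pi, as the message describes; the Python function names `unsound_rewrite` and `sound_rewrite` are hypothetical and only stand in for what a constant folder would evaluate after each rewrite.

```python
import math


def fsin(x):
    """NIR-style fsin: argument in radians."""
    return math.sin(x)


def fsin_mdg(x):
    """Midgard-style sine: argument in multiples of pi (assumed model)."""
    return math.sin(math.pi * x)


def unsound_rewrite(x):
    """Old rewrite fsin(x) -> fsin(x / pi), evaluated with fsin's real meaning."""
    return fsin(x / math.pi)


def sound_rewrite(x):
    """New rewrite fsin(x) -> fsin_mdg(x / pi)."""
    return fsin_mdg(x / math.pi)


x = 1.0
print(fsin(x), sound_rewrite(x), unsound_rewrite(x))
```

For a constant input, the sound rewrite agrees with `fsin(x)` up to rounding, while the old rewrite folds to a different value, which is how the mismatch would surface once constant folding sees such an instruction.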
Jason Ekstrand	b39958a3a1	anv,nir: Move the ANV YCbCr lowering pass to common code Nir changes: Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Anv changes: Acked-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19950>	2023-01-16 14:10:21 +00:00
Jason Ekstrand	f02a11e4e4	nir: Add copyright and include guards to nir_vulkan.h Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19950>	2023-01-16 14:10:21 +00:00
Gert Wollny	0ca325cc10	glsl/nir: only set uses_sample_shading when the output is a fbfetch Constructs like out vec4 fs_out; .... fs_out = vec4(...); if (fs_out.w < alpha_test_value) discard; lead to initial nir that reads from fs_out, even though we don't actually do a framebuffer fetch, and later nir passes will eliminate that direct read from the output variable. As given in the commit message of `1124bee4` we are actually only interested in the framebuffer fetch, so set the property only when an output is used for fbfetch reads. v2: Iterate over all variables (Jason) Fixes: commit `1124bee4ba` glsl/nir: Set sample_shading if a FS output ever shows up as an rvalue Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20694>	2023-01-15 22:04:15 +00:00
Jason Ekstrand	433fe592ac	nir/builder: Add some texture helpers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19480>	2023-01-13 20:25:01 +00:00
Jason Ekstrand	30f3fec380	nir: Add more opcodes to nir_tex_instr_is_query() Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19480>	2023-01-13 20:25:01 +00:00
t0b3	267dd1f4d5	nir/nir_opt_move: fix ALWAYS_INLINE compiler error Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Closes: #6825 Fixes: `f1d20ec6` ("nir/nir_opt_move: handle non-SSA defs ") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17439>	2023-01-13 14:23:35 +00:00
Alyssa Rosenzweig	f4b3201244	nir/peephole_select: Allow load_preamble load_preamble is intended to be almost free (costing at most a move), and it does not have special bounds checking requirement, so it's ok to select with it. With this, drivers that use nir_opt_preamble together with a late call to peephole_select can optimize sequences like: if (x) { <uniform-on-uniform calculation> } else { <different uniform-on-uniform calculation> } to simply bcsel(x, <uniform register 0>, <uniform register 1>) rather than emitting needless control flow / branching over some moves. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20597>	2023-01-13 00:43:04 +00:00
Pavel Ondračka	3fcdd9e4a7	nir/lower_bool: ntt: Generate a good opcode for bcsel This is heavily copy-pasted from a patch of Ian Romanick, including the commit message. Previously, this pass always generated fcsel for bcsel. This was the only place that generated fcsel, so various drivers assumed (and needed!) that src0 was a Boolean with 0.0 or 1.0 as the only values. Specifically, many DX9 / GL_ARB_vertex_program platforms lack a CMP instruction in vertex shaders. In those cases, they would use LRP to implement fcsel. The bummer is that many platforms have a real fcsel instruction, and those platforms would benefit from other places generating that opcode. Instead of leaving assumptions in drivers about the sources of an opcode that they can't really support, allow them to control the way the lowering pass translates bcsel. Two flags are used to control this: - If the driver sets has_fused_comp_and_csel in nir_options, fcsel_gt will be used. Since the Boolean value is 0.0 or 1.0, this is equivalent to fcsel. - If the parameter has_fcsel_ne is set, fcsel will be used. This is the old path. - Otherwise, the lowering pass assumes we're on a crufty, old DX9 vertex program, and it emits flrp. With this, the assumptions about src0 of fcsel in NTT can be removed. If a platform can't handle fcsel, it should ensure that the lowering pass won't generate it. No change in shader-db. Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20162>	2023-01-12 23:01:05 +00:00
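The three lowering choices listed in the commit above give the same result when the condition is a float Boolean (0.0 or 1.0). The sketch below models the opcodes as plain Python functions; the opcode semantics are written from memory of the NIR definitions and should be treated as assumptions, and `lower_bcsel` is a hypothetical helper, not the pass itself.

```python
def fcsel(a, b, c):
    """Select b when a is non-zero, otherwise c."""
    return b if a != 0.0 else c


def fcsel_gt(a, b, c):
    """Select b when a is greater than zero, otherwise c."""
    return b if a > 0.0 else c


def flrp(x, y, t):
    """Linear interpolation: x * (1 - t) + y * t."""
    return x * (1.0 - t) + y * t


def lower_bcsel(cond, b, c, has_fused_comp_and_csel=False, has_fcsel_ne=False):
    """Pick an opcode for bcsel(cond, b, c) following the order in the message."""
    if has_fused_comp_and_csel:
        return fcsel_gt(cond, b, c)
    if has_fcsel_ne:
        return fcsel(cond, b, c)
    # Fallback for hardware without a select: interpolate between c and b.
    return flrp(c, b, cond)


for cond in (0.0, 1.0):
    print(cond,
          lower_bcsel(cond, 3.5, -2.0, has_fused_comp_and_csel=True),
          lower_bcsel(cond, 3.5, -2.0, has_fcsel_ne=True),
          lower_bcsel(cond, 3.5, -2.0))
```

The `flrp` fallback only matches the select forms because `cond` is restricted to 0.0 or 1.0, which is the assumption the message says drivers previously relied on.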
Ian Romanick	70b25d9fe8	nir/lower_int_to_float: Add support for i32csel opcodes These lower naturally to the corresponding fcsel opcodes. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20162>	2023-01-12 23:01:05 +00:00
Alyssa Rosenzweig	2976548e4a	nir/gather_info: Handle store_zs_agx This acts as a depth/stencil write. The AGX compiler checks outputs_written to determine what conservative depth settings the driver needs. Nominally, this should work: the original store_output(FRAG_RESULT_DEPTH) intrinsic causes the DEPTH outputs_written bit to be set, so the metadata is still correct after lowering store_output to store_zs_agx. However, there are a handful of places that call nir_gather_info late, which resets the existing outputs_written value and regathers, causing Asahi to use the wrong conservative depth settings when shuffling NIR pass order and breaking gl_FragDepth. To fix, handle store_zs_agx conservatively when gathering info so we don't have to play games with the pass order or stashing info in a sideband. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20563>	2023-01-11 21:14:20 +00:00
Alyssa Rosenzweig	cc5ca8164d	nir: Add store_agx intrinsic This works like store_global, but lets us optimize address arithmetic. Like load_agx, it is formatted to match the hardware semantic. We don't make use of any clever formats in this series, though. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20558>	2023-01-11 20:36:51 +00:00
Rob Clark	5fb0992a53	mesa/st: Track complete access qualifier for images Don't turn gl_access_qualifier coming from NIR back into GL enums, losing information in the process. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20612>	2023-01-11 20:09:01 +00:00
Jesse Natalie	b0f3a387c9	nir_lower_fragcoord_wtrans: Support Vulkan shaders In Vulkan shaders, you might not have all derefs pointing to a variable Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20400>	2023-01-10 04:25:26 +00:00
Timothy Arceri	ac5af6c06d	util/driconf: add Dune: Spice Wars workaround As per the bug report the game does not correctly handle a uniform index of -1 being returned for the unused array element, which results in rendering issues. So here we skip the uniform array resizing optimisation. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6397 Cc: mesa-stable Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20579>	2023-01-10 03:53:19 +00:00
Mary	d8e5714e81	isaspec: Fix bitmask conversions when isa.bitsize < 64 Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20541>	2023-01-07 00:14:10 +01:00
Jason Ekstrand	39c6f6454c	isaspec: Give decode.c/h more descriptive names Because these are being included across subdir boundaries, the name "decode" is potentially pretty overloaded. Instead, prefix them with "isaspec_". Also, since they're both weird includes now and not really complete files in their own right, give them a descriptive suffix. Acked-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>	2023-01-05 18:21:02 +00:00
Jason Ekstrand	e8945a8ce6	isaspec: Stop depending on glue headers and out-of-folder C files The way the isaspec decoder used to work was that it would generate a header and a C file, each with ISA-specific stuff in it. Then that would get built together with a stand-alone decode.c file which lives in the isaspec folder, not the driver's folder. In order for decode.c to find the ISA-specific headers, it would also generate a glue header which had to be named isaspec-isa.h. This effectively meant that you can't have multiple isaspec definitions in the same folder. To solve this, we do it the other way around and make the generated header and C files include the stand-alone files. This is a bit awkward because it means including a C file from another C file but it's better for the build system. Acked-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>	2023-01-05 18:21:02 +00:00
Jason Ekstrand	4953a8db25	isaspec: Use argparse This also cleans up some of our python script execution conventions and handles mako errors better. Copied a bit from vk_entrypoints_gen.py. Acked-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>	2023-01-05 18:21:02 +00:00
Jason Ekstrand	e83ad77ef5	isaspec: Stop using s and xml from the global namespace We really shouldn't rely on these being global variables. Pass them along instead. Acked-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>	2023-01-05 18:21:02 +00:00
Pavel Ondračka	a93bc6afc4	nir: check for x - ffract(x) patterns when lowering f2i32 We already skip emitting ftrunc in nir_lower_int_to_float when there is ffloor, fround or any other integer-making opcode preceding f2i32. However if lower_ffloor is set for a driver that doesn't support integers, the lowered x - ffract(x) patterns would not be recognized and an extra ftrunc would be emitted, doing unnecessary rounding. This optimization only works if there is no non-trivial swizzling used for the fadd, fneg and ffract involved, which seems to be 99% of the cases according to my testing. This is needed to enable nir ffloor lowering on r300 driver without regressions. I'm not sure if this helps anybody else, the only hardware which sets lower_ffloor and converts ints to floats (and can't do trunc) are some old etnaviv cards, so maybe it will help there a bit. Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20208>	2023-01-05 12:01:32 +00:00
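The reason an extra truncation is redundant after `x - ffract(x)` can be seen with a short numeric sketch. The helper names below are hypothetical, and `ffract` is modelled as `x - floor(x)`, which is an assumption about the opcode's definition.

```python
import math


def ffract(x):
    """Fractional part, modelled as x - floor(x)."""
    return x - math.floor(x)


def lowered_ffloor(x):
    """The lowered floor pattern x - ffract(x)."""
    return x - ffract(x)


def ftrunc(x):
    """Round toward zero, returned as a float."""
    return float(math.trunc(x))


for x in (2.75, -1.5, 4.0, -0.25):
    floored = lowered_ffloor(x)
    # The pattern already yields an integer-valued float, so ftrunc is a no-op.
    print(x, floored, ftrunc(floored))
```

Since the pattern already produces an integer-valued float, applying `ftrunc` before the integer conversion changes nothing, which is why recognising the pattern lets the pass skip it.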
Qiang Yu	cf2ea3fce9	nir/xfb: save high_16bits output info It is combined with slot location to identify a varying when using VARYING_SLOT_VARx_16BIT. Acked-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20157>	2023-01-05 01:12:06 +00:00
Ian Romanick	043508d8f8	glsl: Remove bit_count lowering As far as I can tell, every driver that supports GLSL 1.30 or GL_EXT_gpu_shader4 (and therefore also enables support for GL_MESA_shader_integer_functions) also sets the NIR lower_bit_count flag. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>	2023-01-03 18:37:53 -08:00
Ian Romanick	abe5acf7fd	glsl: Remove bitfield_reverse lowering As far as I can tell, every driver that supports GLSL 1.30 or GL_EXT_gpu_shader4 (and therefore also enables support for GL_MESA_shader_integer_functions) also sets the NIR lower_bitfield_reverse flag. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>	2023-01-03 18:37:53 -08:00
Ian Romanick	f5722c4973	glsl: Remove bitfield_extract and bitfield_insert lowering As far as I can tell, every driver that supports GLSL 1.30 or GL_EXT_gpu_shader4 (and therefore also enables support for GL_MESA_shader_integer_functions) also sets some subset of the various NIR lower_bitfield_extract and lower_bitfield_insert flags. v2: Declaration of 'result' still needs to be added to the IR. Noticed by marge. v3: Fix 'git rebase --autosquash' putting the v2 fix in the wrong place. I've never seen that happen before. :( Reviewed-by: Emma Anholt <emma@anholt.net> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>	2023-01-03 18:37:53 -08:00
Ian Romanick	db241fbd70	nir: Don't allow conflicting bitfield lowering passes Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>	2023-01-03 18:37:53 -08:00
Pavel Ondračka	53d9b696e4	nir: basic tests for nir_opt_shrink_vectors Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>	2023-01-03 12:32:33 +01:00
Pavel Ondračka	3305c9602d	nir: fix shrinking of load_const for large vectors Specifically when shrinking load_const with number of components > 5, if the final number of components is not allowed (for example 8->6) it would report false for progress even if we actually did some reshuffling and also it would skip on the rewrite of the readers. Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>	2023-01-03 12:32:33 +01:00
Pavel Ondračka	cb7f201288	nir: remove duplicate alu channels in nir_opt_shrink_vectors This will clean code like: vec3 32 ssa_8 = frcp ssa_7.www vec3 32 ssa_9 = fmul ssa_7.xyz, ssa_8 into vec1 32 ssa_8 = frcp ssa_7.w vec3 32 ssa_9 = fmul ssa_7.xyz, ssa_8.xxx This helps r300 driver because we can only do single channel for math ops at a time, so the first version would result in three frcp instructions. The nir_opt_shrink_vectors comments even claim the pass should be doing this, however it actually does it only for nir_op_vecx instructions, so extend this for generic alu instructions. RV530 shader-db: total instructions in shared programs: 135032 -> 133707 (-0.98%) instructions in affected programs: 46121 -> 44796 (-2.87%) helped: 452 HURT: 26 total temps in shared programs: 17051 -> 17033 (-0.11%) temps in affected programs: 1509 -> 1491 (-1.19%) helped: 91 HURT: 30 12.02->12.08 (+0.5%) fps gain in Unigine Sanctuary (n=5) with RV530 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7051 Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>	2023-01-03 12:32:33 +01:00
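The channel deduplication shown in the commit above (`frcp ssa_7.www` becoming `frcp ssa_7.w` with readers using `.xxx`) can be sketched on swizzle strings. This is a toy model, not the pass; `dedup_swizzle` is a hypothetical helper that works on strings rather than NIR instructions.

```python
def dedup_swizzle(swizzle):
    """Split a source swizzle into unique channels plus a reader remap.

    Returns (shrunk_swizzle, reader_swizzle): the instruction computes only
    the unique channels, and readers index into that smaller result.
    """
    unique = []
    remap = []
    for channel in swizzle:
        if channel not in unique:
            unique.append(channel)
        # Readers address the shrunk result by position: x, y, z, w.
        remap.append("xyzw"[unique.index(channel)])
    return "".join(unique), "".join(remap)


print(dedup_swizzle("www"))  # ('w', 'xxx')
print(dedup_swizzle("xyx"))  # ('xy', 'xyx')
```

On hardware that evaluates math ops one channel at a time, computing only the unique channels is what saves the duplicated instructions.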
Christian Gmeiner	9e56f69edf	isaspec: encode: handle special fieldname properties Without this change a fieldname like '{DST::align=12}' was not used for encoding. Change the regex to include such fieldnames and remove the fieldname property in a later step. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20462>	2022-12-31 13:43:15 +00:00
Danylo Piliaiev	1c9ee30838	nir/fold_16bit_tex_image: Add type granularity for dst folding Some HW may be able to fold only some of dst types, e.g. for Adreno folding i32 -> i16 could cause a different result since folded variant clamps the result instead of masking it. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20396>	2022-12-23 15:48:18 +01:00
Lionel Landwerlin	3af08b9c30	nir/divergence: handle shader_record_ptr intrinsic Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes `6b8fd65e84` ("spirv: Implement the new ray-tracing storage classes") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20413>	2022-12-23 09:22:13 +00:00
Danylo Piliaiev	8482ad0110	nir/nir_lower_is_helper_invocation: Lower helper invocation if required nir_lower_is_helper_invocation lowers intrinsic_is_helper_invocation and uses load_helper_invocation (which is lowered by nir_lower_system_values). While nir_lower_system_values may lower SYSTEM_VALUE_HELPER_INVOCATION into intrinsic_is_helper_invocation. So they depend on each other. Break the dependency by making nir_lower_is_helper_invocation aware of lower_helper_invocation option and emitting lowered load_helper_invocation when required. Happens with SPIR-V 1.6 for which gl_HelperInvocation is translated into "BuiltIn HelperInvocation" + "Volatile", which nir_lower_system_values translates into is_helper_invocation. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19677>	2022-12-20 11:06:52 +00:00
Qiang Yu	e85c5d8779	nir/divergence_analysis: add missing intrinsics Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>	2022-12-19 09:22:24 +08:00
Qiang Yu	194add2c23	nir: lower image add lower_to_fragment_mask_load_amd option Like lower_to_fragment_fetch_amd option in lower tex, this is for radeonsi to lower MS image ops. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>	2022-12-19 09:22:16 +08:00
Qiang Yu	1461b5f61b	nir: add image fragment mask load intrinsic Like nir_texop_fragment_mask_fetch_amd, this is used to load multi sample image fmask data for AMD GPU. We will lower multi sample image load and samples_identical intrinsics to use it later for radeonsi. RADV does not need this because it always expands fmask images before dispatching compute shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>	2022-12-19 09:22:11 +08:00


7573 commits