fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-16 22:48:05 +02:00

Author	SHA1	Message	Date
Alejandro Piñeiro	3685528c1e	nir: track if var copies lowering was called In general we should only call it once, and then we should avoid calling any lowering that introduces copies again. So far we were tracking that manually outside of the nir shader in several places. Ideally we would like to add a nir_validate rule, but right now there are some exceptions to this rule. For example, right now the Intel compiler calls nir_lower_io_to_temporaries as part of linking tess_ctrl/mesh/task shaders. One option would be to allow drivers to reset the value, but for now let's not add that validation rule. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19338>	2023-02-06 22:11:34 +00:00
Konstantin Seurer	9104dafb6f	vulkan,nir: Refactor ycbcr conversion state into a struct This will be useful for RADV since it hashes the state. v3dv changes: Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20731>	2023-02-06 18:36:29 +00:00
Jason Ekstrand	9c62e0c77d	nir: Remove nir_lower_io_force_sample_interpolation It's no longer used. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:17 +00:00
Alyssa Rosenzweig	6b97f396e6	nir/lower_clip: Only emit 1 discard If we have multiple clip planes, rather than emit multiple discards we can just OR together the discard criteria. Then a nir_opt_algebraic rule kicks in to optimize out the flt/.../flt/ior/.../ior into fmin/.../fmin/flt, generating much less code at the end. Written while debugging an unrelated issue with the clip lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21103>	2023-02-06 02:50:20 +00:00
Alyssa Rosenzweig	93db6094a1	nir/print: Pretty-print color0/1_interp These are an enum. Furthermore, their 0 state is INTERP_MODE_NONE which we shouldn't bother printing at all. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21091>	2023-02-04 17:26:30 +00:00
Alyssa Rosenzweig	b235be1fd4	nir/print: Pretty-print I/O semantic locations Instead of printing the raw location number, which is pretty hard to interpret, let's print the name of the location. Example output: vec4 16 ssa_2 = intrinsic load_interpolated_input (ssa_0, ssa_1) (base=0, component=0, dest_type=float16 /144/, io location=VARYING_SLOT_VAR0 slots=1 mediump /8388768/) One of the "regressions" from moving to purely lowered I/O with all variables removed is a lack of debuggability, since otherwise these location strings don't show up anywhere in the printed shader! By contrast this should make the lowered I/O nice to read like the early I/O. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21091>	2023-02-04 17:26:30 +00:00
Alyssa Rosenzweig	435e7f5e6d	nir/print: Extract get_location_str Locations show up in two places: variables and lowered I/O semantics. We want to reuse the logic in both places, so extract it out. The extracted logic is IMO easier to read, too. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21091>	2023-02-04 17:26:30 +00:00
Hampus Linander	4ffc7c3ff4	nir: Add extr_agx opcode The AGX extr instruction extracts a bitfield from two 32bit registers. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20628>	2023-02-04 11:13:24 -05:00
Ian Romanick	ea413e826b	nir: Eliminate nir_op_f2b Builds on the work of !15121. This gets to delete even more code because many drivers shared a lot of code for i2b and f2b. No shader-db or fossil-db changes on any Intel platform. v2: Rebase on `1a35acd8d9`. v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin. v4: Another rebase. Remove f2b stuff from Midgard. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>	2023-02-03 22:39:57 +00:00
Ian Romanick	024122c069	nir/builder: Handle f2b conversions specially in nir_type_convert No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>	2023-02-03 22:39:57 +00:00
Ian Romanick	b265020b82	nir/builder: Eliminate nir_f2b helper (and use of nir_f2b32 helper) There were only two users. Replace each with nir_fneu instead. This is now a squash of what was two separate commits. nir_lower_pstipple_block is called after nir_lower_bool_to_int32, so nir_fneu32 has to be used here or there will be regressions in stipple tests on llvmpipe. v2: Rebase on !20869. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Suggested-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>	2023-02-03 22:39:57 +00:00
Alyssa Rosenzweig	071ac59960	nir: Add a late texcoord replacement pass Add a second NIR pass for lowering point/texture coordinate replacement (i.e. point sprites). Why a second one? The current pass works on derefs/variables, which is good for drivers that don't lower I/O at all (like Zink, where the pass originates). However, it is problematic for hardware drivers: the inputs to this pass depend on the shader key, so we want to run the pass as late as possible to minimize the cost of building/compiling the associated shader variants. In particular, we need to be able to lower point sprites after lowering I/O if we would like to lower I/O when preprocessing NIR. The logic for early lowering and late lowering is considerably different (the late lowering is a lot simpler), so I've split this out into a second pass rather than trying to weld them together into one. This pass will be used on Asahi, which currently uses the early pass. It may be useful for other drivers as well. (Actually, it's been shipping on Asahi for a little while now, just hasn't been sent upstream yet.) Tested with Neverball. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Acked-by: Asahi Lina <lina@asahilina.net> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21065>	2023-02-03 15:03:06 +00:00
Qiang Yu	f6b194b648	nir,ac/llvm,aco,radv,radeonsi: remove nir_export_vertex_amd Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691>	2023-02-03 12:27:44 +00:00
Qiang Yu	f44872c7b6	nir,ac/llvm,aco: remove nir_export_primitive_amd Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691>	2023-02-03 12:27:44 +00:00
Qiang Yu	5f24d58549	nir: add nir_export_amd intrinsic Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691>	2023-02-03 12:27:43 +00:00
Sagar Ghuge	0ec3522163	nir: Handle other variants of image_samples properly while lowering while lowering image_samples to one, we need to take nir_intrinsic_image_deref_samples and nir_intrinsic_bindless_image_samples intrinsic into account. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8211 Fixes: `ab4c2990ed` ("intel/compiler: use lower_image_samples_to_one") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21053>	2023-02-02 21:40:45 +00:00
Timur Kristóf	1244506c15	nir/opt_algebraic: Add optimization for ieq/ine and right-shift. Fossil DB stats on GFX11: Totals from 1343 (1.00% of 134913) affected shaders: SpillSGPRs: 7145 -> 7137 (-0.11%) CodeSize: 20737744 -> 20739148 (+0.01%); split: -0.02%, +0.03% Instrs: 4010443 -> 4008449 (-0.05%); split: -0.05%, +0.00% Latency: 50021520 -> 50021105 (-0.00%); split: -0.00%, +0.00% InvThroughput: 6354371 -> 6354112 (-0.00%); split: -0.00%, +0.00% VClause: 63035 -> 63038 (+0.00%); split: -0.01%, +0.01% SClause: 121162 -> 121166 (+0.00%) Copies: 251354 -> 251058 (-0.12%); split: -0.18%, +0.06% PreSGPRs: 137283 -> 137299 (+0.01%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20936>	2023-02-02 03:08:19 +00:00
Pavel Ondračka	7e6acfd587	nir: mark progress when removing trailing unused load_const channels When the unused channels were at the end and so no reswizzling was needed, we wouldn't correctly mark the progress. Fixes: `3305c960` Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>	2023-02-01 20:33:31 +00:00
Pavel Ondračka	fe56dd9c42	nir: mark progress when removing trailing unused alu channels When the unused channels were at the end and so no reswizzling was needed, we wouldn't correctly mark the progress. Fixes: `cb7f2012` Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>	2023-02-01 20:33:31 +00:00
Pavel Ondračka	ef800da3f7	nir: nir opt_shrink_vectors whitespace fix Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21014>	2023-02-01 20:33:31 +00:00
Amber	c384690ab7	nir: support lowering nir_intrinsic_image_samples to a constant load This can be used by multiple drivers that do not support ms images Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Rob Clark <robclark@freedesktop.org> Reviewer-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Amber Amber <amber@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20813>	2023-02-01 19:52:49 +00:00
Alyssa Rosenzweig	b0b5a71c74	nir/opt_preamble: Consider load_preamble as movable It's kosher to get load_preamble intrinsics ahead of time if the driver is pushing sysvals. Handle them like load_uniform. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by-(with-sparkles): Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20562>	2023-01-31 17:02:34 +00:00
Alyssa Rosenzweig	05d3238692	nir/opt_preamble: Treat size as an input Some backends may wish to reserve early uniforms for internal system values, and use the remaining space for preamble storage. In this case, it's convenient to teach nir_opt_preamble about a reserved offset. It's logical to treat the output size instead as an in/out variable that nir_opt_preamble adds to. This requires a slight change to the consumers to zero the input. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by-(with-sparkles): Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20562>	2023-01-31 17:02:34 +00:00
Marcin Ślusarz	2255375c4d	nir: add nir_mod_analysis & its tests Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20050>	2023-01-31 13:50:08 +00:00
Erik Faye-Lund	f00c9e85e5	meson: use files() instead of joining paths The Meson docs point out that it's better to use the files() function when referring to files in the source tree than manually constructing paths like this. Let's follow that advice, and get some neat cleanups. Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20907>	2023-01-27 11:35:50 +00:00
Timur Kristóf	65a917cb6e	nir: Add algebraic optimization for VKD3D-Proton fp32->fp16 conversion. VKD3D-Proton DXBC f32 to f16 conversion implements a float conversion using PackHalf2x16. Because the spec does not specify a rounding mode, it emits a sequence to ensure D3D-like behaviour for infinity. When we know the current backend has pack_half_2x16_rtz_split, we can eliminate the extra sequence. Fossil DB stats on GFX11: Totals from 835 (0.62% of 134913) affected shaders: VGPRs: 49368 -> 49224 (-0.29%) CodeSize: 5341956 -> 5124564 (-4.07%) Instrs: 1024062 -> 987041 (-3.62%) Latency: 6530956 -> 6465120 (-1.01%); split: -1.01%, +0.00% InvThroughput: 908189 -> 870253 (-4.18%) VClause: 18704 -> 18702 (-0.01%); split: -0.02%, +0.01% SClause: 33406 -> 33284 (-0.37%); split: -0.38%, +0.01% Copies: 67440 -> 65992 (-2.15%); split: -2.15%, +0.00% Branches: 18498 -> 18465 (-0.18%) PreSGPRs: 38409 -> 38331 (-0.20%) PreVGPRs: 44089 -> 43834 (-0.58%) Note, some fossils are from before this pattern was added to VKD3D-Proton, so the above may not reflect real-world impact. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>	2023-01-26 12:24:24 +00:00
Timur Kristóf	7985933a6d	nir: Lower pack_half_2x16_split to RTZ if available. Constant folding always uses RTNE for pack_half_2x16_split, but some backends implement it with RTZ. Lowering to RTZ when available ensures that the behaviour will be consistent between constant folding and the backend. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>	2023-01-26 12:24:24 +00:00
Timur Kristóf	12652cc549	nir: Add pack_half_2x16_rtz_split opcode. Same as pack_half_2x16_split, but always uses RTZ mode. Note that pack_half_2x16 rounding mode is unspecified. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>	2023-01-26 12:24:24 +00:00
Lionel Landwerlin	ff34e96701	nir/lower_io: fix bounds checking for 64bit_bounded_global If the offset is negative, as is the case in dEQP-VK.robustness.robustness2.bind.notemplate.r32i.unroll.volatile.storage_buffer_dynamic.readwrite.no_fmt_qual.len_256.samples_1.1d.comp, we end up passing the bounds checking condition because it's using signed integers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Jason Ekstrand <jason.ekstrand@collabora.com> Cc: mesa-stable Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20762>	2023-01-19 09:16:40 +00:00
Alyssa Rosenzweig	e664082d35	nir/lower_blend: No-op nir_color_mask if no mask In this usual case, do a quick check to avoid generating 5 useless instructions (mov/vec4 instructions). They'll get copypropped but that creates more work for the optimizer and nir/lower_blend runs in a hot variant path on both Asahi and Panfrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	1fc25c8c79	nir/lower_blend: Handle undefs in stores nir/lower_blend asserts: assert(nir_intrinsic_write_mask(store) == nir_component_mask(store->num_components)); For the special blend shaders used in Panfrost, this holds. But for arbitrary shaders coming out of GLSL-to-NIR (as used with Asahi), this does not hold. In particular, after nir_opt_undef runs, undefined components can be trimmed. Concretely, if we have the shader: gl_FragColor.xyz = foo; Then this becomes in NIR gl_FragColor = vec4(foo.xyz, undef); and then opt_undef will give the store_deref a wrmask of xyz but 4 components. Then lower_blend asserts out. Found in a gfxbench shader on asahi. Closes: #6982 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	8b83210ab3	nir/lower_blend: Don't do logic ops on pure float Per the spec. Fixes arb_color_buffer_float-render on both Panfrost and Asahi (before/after reproduced on Mali-T860 and AGX G13 respectively). Without that patch, that test fails the assertion: arb_color_buffer_float-render: ../src/compiler/nir/nir_lower_blend.c:259: nir_blend_logicop: Assertion `util_format_is_pure_integer(format)' failed. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	dbd0615e7a	nir/lower_blend: Avoid useless iand with logic ops The upper bits start correctly, there's no need to clear them as long as we keep them zero'ed by using ixor with a valid bit mask instead of inot. Makes the code generated for logic op slightly less ridiculous. I'm joking. It's still ridiculous but I'm not in the mood to fix up the Midgard compiler and it's just a little ALU for a feature almost nothing uses. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	ee127f03e4	nir/lower_blend: Fix SNORM logic ops We need to sign extend. Incidentally this means the iand above is useless for SNORM. Fixes arb_color_buffer_float-render with GL_RGBA8_SNORM. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	f9839e7e1b	nir/lower_blend: Clamp blend factors Particularly constant colours, but also (more obscurely) SNORM. Fixes arb_color_buffer_float-render with SNORM framebuffers. Issue affects both Asahi and Panfrost (the latter after we start advertising EXT_render_snorm). v2: Check the blend factor to avoid unnecessary clamps. This avoids regressing blend shader code quality on Panfrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> [v1] Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Alyssa Rosenzweig	fca457790e	nir/lower_blend: Fix alpha=1 for RGBX format In this case we have 4 components but the value of the fourth component is undefined. Apply the fixup we already have. Fixes dEQP-GLES3.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.0 on Asahi. That test blends with DST_ALPHA with its RGB565 attachment, which is fine if RGB565 is preserved, but Asahi is demoting that format to RGBX8 which means -- after lowering the tilebuffer access -- we blend with an ssa_undef. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20016>	2023-01-19 04:09:17 +00:00
Lionel Landwerlin	b82d9b1a3d	nir/divergence: add missing RT intrinsic handling Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20763>	2023-01-18 22:32:43 +00:00
Qiang Yu	49cfbe1fed	nir/xfb_info: nir_gather_xfb_info_from_intrinsics update nir xfb_info Use this function to update nir_shader->xfb_info. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19489>	2023-01-18 05:30:14 +00:00
Gert Wollny	2e05cfa179	nir: Add range_base to atomic_counter and an option to use it Some drivers may encode constant offsets in the instruction, so make it possible for the drivers to request lowering the atomic uniform offset into the range_base variable of the intrinsic. v2: drop patch to use built-in array offset evaluation, it causes problems with zink, and update the code accordingly v3: always initialize range base Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19980>	2023-01-17 13:19:04 +00:00
Gert Wollny	c4cde91c1b	nir: Add possibility to store image var offset in range_base Add the intrinsic range_base value to the image intrinsics and add the option to store the image array offset into range_base instead of adding it to the image array index if the driver requests it. v2: Always initialize range_base v3: fix for bindless intrinsics Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19980>	2023-01-17 13:19:04 +00:00
Alyssa Rosenzweig	c3839bd540	nir: Optimize vendored sin/cos the same way As we've done for the AMD one, to prevent any codegen regression from switching the Midgard lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>	2023-01-16 22:20:43 +00:00
Alyssa Rosenzweig	a49ba0f1ae	nir: Add Midgard-specific fsin/fcos ops NIR has a fsin instruction that takes an argument in radians. Midgard instead has an fsinpi instruction that takes an argument in multiples of pi. So, we had a NIR pass that would change fsin(x) to fsin(x / pi) and then map fsin to fsinpi in the backend. But that's invalid! In NIR, the opcode fsin is well-defined. fsin(x) means something very different than fsin(x / pi). They won't usually be equal. The transform fsin(x) -> fsin(x / pi) is fundamentally unsound. It did work before, by accident. Most NIR passes don't care about the semantics of ALU instructions. fsin(x) and fsin(x / pi) are both well-defined but fundamentally different NIR shaders. So the rewriting is wrong -- the NIR we get out is not equivalent to the NIR we put in, and the Midgard ops we generate are not equivalent to the NIR -- but if we don't run any passes that care about the definition of fsin, the two wrongs will cancel out to make a right. However, some NIR passes do care about the definitions of ALU instructions, instead of treating them as named black boxes. In particular, constant folding (nir_opt_constant_fold) evaluates ALU instructions when their inputs are constants, according to the definition in nir_opcodes.py. So our little charade will only work if we don't call nir_opt_constant_fold, or if all the fsin instructions have non-constant inputs. At the beginning of this series, that is the case. With the later scalarization change, that's no longer the case, and the unsoundness translates to real failing tests rather than a quibble of NIR's semantics. To mitigate, we define a new NIR opcode with the semantics we want and translate fsin(x) = fsin_mdg(x / pi), where that equivalence does hold mathematically. So the new translation is sound and doesn't rely on lucky pass ordering. This matches the approach already used for AMD and AGX, which have fsin_amd and fsin_agx opcodes respectively. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>	2023-01-16 22:20:43 +00:00
Jason Ekstrand	b39958a3a1	anv,nir: Move the ANV YCbCr lowering pass to common code Nir changes: Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Anv changes: Acked-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19950>	2023-01-16 14:10:21 +00:00
Jason Ekstrand	f02a11e4e4	nir: Add copyright and include guards to nir_vulkan.h Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19950>	2023-01-16 14:10:21 +00:00
Jason Ekstrand	433fe592ac	nir/builder: Add some texture helpers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19480>	2023-01-13 20:25:01 +00:00
Jason Ekstrand	30f3fec380	nir: Add more opcodes to nir_tex_instr_is_query() Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19480>	2023-01-13 20:25:01 +00:00
t0b3	267dd1f4d5	nir/nir_opt_move: fix ALWAYS_INLINE compiler error Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Closes: #6825 Fixes: `f1d20ec6` ("nir/nir_opt_move: handle non-SSA defs ") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17439>	2023-01-13 14:23:35 +00:00
Alyssa Rosenzweig	f4b3201244	nir/peephole_select: Allow load_preamble load_preamble is intended to be almost free (costing at most a move), and it does not have special bounds checking requirement, so it's ok to select with it. With this, drivers that use nir_opt_preamble together with a late call to peephole_select can optimize sequences like: if (x) { <uniform-on-uniform calculation> } else { <different uniform-on-uniform calculation> } to simply bcsel(x, <uniform register 0>, <uniform register 1>) rather than emitting needless control flow / branching over some moves. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20597>	2023-01-13 00:43:04 +00:00
Pavel Ondračka	3fcdd9e4a7	nir/lower_bool: ntt: Generate a good opcode for bcsel This is heavily copy-pasted from a patch of Ian Romanick, including the commit message. Previously, this pass always generated fcsel for bcsel. This was the only place that generated fcsel, so various drivers assumed (and needed!) that src0 was a Boolean with 0.0 or 1.0 as the only values. Specifically, many DX9 / GL_ARB_vertex_program platforms lack a CMP instruction in vertex shaders. In those cases, they would use LRP to implement fcsel. The bummer is that many platforms have a real fcsel instruction, and those platforms would benefit from other places generating that opcode. Instead of leaving assumptions in drivers about the sources of an opcode that they can't really support, allow them to control the way the lowering pass translates bcsel. Two flags are used to control this: - If the driver sets has_fused_comp_and_csel in nir_options, fcsel_gt will be used. Since the Boolean value is 0.0 or 1.0, this is equivalent to fcsel. - If the parameter has_fcsel_ne is set, fcsel will be used. This is the old path. - Otherwise, the lowering pass assumes we're on a crufty, old DX9 vertex program, and it emits flrp. With this, the assumptions about src0 of fcsel in NTT can be removed. If a platform can't handle fcsel, it should ensure that the lowering pass won't generate it. No change in shader-db. Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20162>	2023-01-12 23:01:05 +00:00
Ian Romanick	70b25d9fe8	nir/lower_int_to_float: Add support for i32csel opcodes These lower naturally to the corresponding fcsel opcodes. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20162>	2023-01-12 23:01:05 +00:00
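
Note on a49ba0f1ae (Midgard-specific fsin/fcos ops): the soundness argument in that message can be checked numerically. The sketch below is only a model in plain Python, not Mesa code. It assumes, based on the commit message, that the Midgard-style opcode computes sin(pi * y); the function names fsin, fsin_mdg, sound_lowering and unsound_lowering are hypothetical and chosen for illustration.

```python
import math


def fsin(x):
    """Model of NIR fsin: the argument is in radians."""
    return math.sin(x)


def fsin_mdg(y):
    """Assumed model of the Midgard-style opcode: argument in multiples of pi."""
    return math.sin(y * math.pi)


def sound_lowering(x):
    """fsin(x) rewritten as fsin_mdg(x / pi): same value, different opcode."""
    return fsin_mdg(x / math.pi)


def unsound_lowering(x):
    """The old rewrite, fsin(x) -> fsin(x / pi), evaluated with fsin's own
    definition, as a constant folder would evaluate it."""
    return fsin(x / math.pi)


if __name__ == "__main__":
    x = 1.0
    print("fsin(x)           =", fsin(x))
    print("sound lowering    =", sound_lowering(x))
    print("unsound lowering  =", unsound_lowering(x))
```

With a constant input, the sound rewrite agrees with fsin up to rounding, while the old rewrite yields a different value once something evaluates fsin by its definition, which is the constant-folding failure the message describes.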


4732 commits
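
Note on ff34e96701 (bounds checking for 64bit_bounded_global): the failure mode described there, a negative offset slipping through a signed comparison, can be reproduced with a small model. This is a sketch under assumptions, not the code from nir_lower_io: the two check functions and their exact formulas are hypothetical, and 32-bit wrapping is emulated by hand because Python integers are unbounded.

```python
MASK32 = 0xFFFFFFFF


def to_u32(value):
    """Reinterpret an integer as an unsigned 32-bit value (wraps modulo 2**32)."""
    return value & MASK32


def to_s32(value):
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def in_bounds_signed(offset, size, bound):
    """Hypothetical signed check: the end of the access must not exceed bound."""
    return to_s32(offset + size) <= to_s32(bound)


def in_bounds_unsigned(offset, size, bound):
    """Hypothetical unsigned check: the last accessed byte must be below bound."""
    return to_u32(offset + size - 1) < to_u32(bound)


if __name__ == "__main__":
    # A 4-byte access at offset -8 into a 256-byte buffer is out of bounds.
    print("signed check accepts it:  ", in_bounds_signed(-8, 4, 256))
    print("unsigned check accepts it:", in_bounds_unsigned(-8, 4, 256))
```

In the signed form, -8 + 4 = -4 compares as less than 256, so the access is accepted. In the unsigned form the same bit pattern is a very large number and the access is rejected, while in-range accesses behave the same in both.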