fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 22:28:06 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	a3177cca99	nir: Add a lowering pass to lower memcpy Reviewed-by: Jesse Natalie <jenatali@microsoft.com>. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6713>	2020-09-25 23:48:03 +00:00
Jason Ekstrand	b2899f7265	nir: Add a new memcpy intrinsic This matches SPIR-V's OpCopyMemorySized Reviewed-by: Jesse Natalie <jenatali@microsoft.com>. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6713>	2020-09-25 23:48:03 +00:00
Jesse Natalie	09bca4cb95	vtn/opencl: Switch some nir-sequence ops to use libclc All of these are pretty well-defined. Rather than implementing them as a sequence of nir ops, we can just use the libclc implementation. v2 (idr): Delete functions that are now unused. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6035>	2020-09-25 13:14:45 -07:00
Jesse Natalie	93db59e066	nir: Add an internal flag to shader_info Don't print the shader if it's marked internal, unless NIR_PRINT has been explicitly set to 2 (or higher). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6035>	2020-09-25 20:09:08 +00:00
Jason Ekstrand	0206fb3941	nir/liveness: Consider if uses in nir_ssa_defs_interfere Fixes: `f86902e75d` "nir: Add an SSA-based liveness analysis pass" Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3428 Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Yevhenii Kharchenko <yevhenii.kharchenko@globallogic.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6824>	2020-09-25 14:16:15 +00:00
Rhys Perry	a18c84ecce	nir/instr_set: hash intrinsic sources ministat (CSE only): Difference at 95.0% confidence -9.80325 +/- 0.173089 -41.4434% +/- 0.461972% (Student's t, pooled s = 0.0763653) ministat (entire run): Difference at 95.0% confidence -3.13667 +/- 0.61519 -5.11107% +/- 0.990737% (Student's t, pooled s = 0.271416) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6860>	2020-09-25 10:18:36 +00:00
Marek Olšák	ea77958fea	nir: gather information about fbfetch and dual source color Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6758>	2020-09-25 02:29:30 -04:00
Marek Olšák	a6abf175ef	nir: fix input/output info gathering for lowered IO Ooops. Fixes: `17af07024d` - nir: gather all IO info from IO intrinsics Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6758>	2020-09-25 02:29:30 -04:00
Marek Olšák	ef98c175c0	nir: gather fs.uses_sample_qualifier from lowered IO Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6758>	2020-09-25 02:29:30 -04:00
Marek Olšák	7b108e6ac4	nir: set system_values_read for all intrinsics Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6758>	2020-09-25 02:29:30 -04:00
Marek Olšák	abe9588ff0	nir: gather tess.tcs_cross_invocation info from lowered IO intrinsics Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6758>	2020-09-25 02:29:30 -04:00
Marek Olšák	10be706778	nir: gather indirect info from lowered IO intrinsics Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6758>	2020-09-25 02:29:30 -04:00
Kenneth Graunke	140f53e646	Revert "nir: replace lower_ffma and fuse_ffma with has_ffma" This reverts commit `939ddf3f67`. Intel has a separate pass for fusing FFMAs selectively. We split these flags in commit `1b72c31e1f` and the reasoning still stands. The patch being reverted was just a cleanup, so there should be no issue with reverting it. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6849>	2020-09-24 13:11:50 -07:00
Marek Olšák	939ddf3f67	nir: replace lower_ffma and fuse_ffma with has_ffma Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>	2020-09-24 12:29:11 +00:00
Marek Olšák	771aad3027	nir: split lower_ffma into lower_ffma16/32/64 AMD wants different behavior for each bit size Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>	2020-09-24 12:29:11 +00:00
Marek Olšák	21174dedec	nir: split fuse_ffma into fuse_ffma16/32/64 AMD wants different behavior for each bit size Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>	2020-09-24 12:29:11 +00:00
Jesse Natalie	924e27647e	nir_lower_system_values: Fix load_global_invocation_id to use base_work_group_id even with no base_global id Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6668>	2020-09-22 21:22:26 +00:00
Jason Ekstrand	9750164c09	nir: Rename get_buffer_size to get_ssbo_size This makes it explicit that this intrinsic is only for SSBOs. For the v3dv driver, we'll be adding a get_ubo_size intrinsic and we want to be able to distinguish between the two. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6812>	2020-09-22 13:34:12 +00:00
Rhys Perry	f100cf0d30	aco: stop multiplying driver_location by 4 This didn't really serve any purpose, doesn't match how FS inputs are currently done, and prevented us from using nir_io_add_const_offset_to_base in the future. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6689>	2020-09-22 12:38:43 +00:00
Danylo Piliaiev	f2b17dec12	nir/lower_samplers: Clamp out-of-bounds access to array of samplers Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says: "In the subsections described above for array, vector, matrix and structure accesses, any out-of-bounds access produced undefined behavior.... Out-of-bounds reads return undefined values, which include values from other variables of the active program or zero." Robustness extensions suggest to return zero on out-of-bounds accesses, however it's not applicable to the arrays of samplers, so just clamp the index. Otherwise instr->sampler_index or instr->texture_index would be out of bounds, and they are used as an index to arrays of driver state. E.g. this fixes such dereference: if (options->lower_tex_packing[tex->sampler_index] != in nir_lower_tex.c CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>	2020-09-22 09:06:52 +00:00
Danylo Piliaiev	0ba82f78a5	nir/large_constants: Eliminate out-of-bounds writes to large constants Out-of-bounds writes could be eliminated per spec: Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says: "In the subsections described above for array, vector, matrix and structure accesses, any out-of-bounds access produced undefined behavior.... Out-of-bounds writes may be discarded or overwrite other variables of the active program." Fixes: `1235850522` Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>	2020-09-22 09:06:52 +00:00
Danylo Piliaiev	66669eb529	nir/lower_io: Eliminate oob writes and return zero for oob reads Out-of-bounds writes could be eliminated per spec: Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says: "In the subsections described above for array, vector, matrix and structure accesses, any out-of-bounds access produced undefined behavior.... Out-of-bounds writes may be discarded or overwrite other variables of the active program. Out-of-bounds reads return undefined values, which include values from other variables of the active program or zero." GL_KHR_robustness and GL_ARB_robustness encourage us to return zero for reads. Otherwise get_io_offset would return out-of-bound offset which may result in out-of-bound loading/storing of inputs/outputs, that could cause issues in drivers down the line. E.g. this fixes such dereference: int vue_slot = vue_map->varying_to_slot[intrin->const_index[0]]; in brw_nir.c CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>	2020-09-22 09:06:52 +00:00
Jason Ekstrand	e1fc23265f	nir: Add a pass for lowering CL-style image ops to texture ops In CL 1.2, images are required to be either read-only or write-only. We can always translate the read-only image ops to texture ops. In CL 2.0 (and an extension), the ability is added to have read-write images but sampling (with a sampler) is only allowed on read-only images. As long as we only lower read-only images to texture ops, everything should stay consistent. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6578>	2020-09-20 14:28:13 +00:00
Gert Wollny	6f2b6952be	nir: remove ubo_r600 intrinsic since ubo_vec4 is used now As suggested by Eric. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6743>	2020-09-17 10:11:11 +00:00
Alejandro Piñeiro	2aaa1564ad	nir/lower_io: don't reduce range if parent length is zero When handling arrays, range is increased based on the array size minus one. But if such is zero, it has the effect of reducing the range. Handle that case by returning the unknown range value. v2: * Add missing braces. * Return unknown range in this case, instead of keeping the initial range. v3: Simplify code, using existing "fail" label. (Jason) Fixes the following using v3dv: dEQP-VK.graphicsfuzz.cov-simplify-clamp-max-itself Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6737>	2020-09-16 23:24:28 +02:00
Gert Wollny	2c9fee9b6a	nir: Add option lower_uniforms_to_ubo Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6316>	2020-09-16 10:07:42 +00:00
Marek Olšák	57bf4c2028	nir,radeonsi: move ffma fusing to late optimizations for better codegen The freedreno trace changes were suggested by Rob Clark. ALU performance is higher, because ffma is used more often, but so is register usage, because trinary opcodes (such as ffma) usually need at least 3 live registers. 54793 shaders in 33659 tests Totals: SGPRS: 2639746 -> 2642938 (0.12 %) VGPRS: 1534120 -> 1536392 (0.15 %) Spilled SGPRs: 3541 -> 3618 (2.17 %) Spilled VGPRs: 33 -> 44 (33.33 %) Scratch size: 292 -> 312 (6.85 %) dwords per thread Code Size: 55639836 -> 55620116 (-0.04 %) bytes Max Waves: 964785 -> 963977 (-0.08 %) Totals from affected shaders: SGPRS: 1105800 -> 1108992 (0.29 %) VGPRS: 635292 -> 637564 (0.36 %) Spilled SGPRs: 3193 -> 3270 (2.41 %) Spilled VGPRs: 33 -> 44 (33.33 %) Scratch size: 36 -> 56 (55.56 %) dwords per thread Code Size: 31568708 -> 31548988 (-0.06 %) bytes Max Waves: 319991 -> 319183 (-0.25 %) Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6596>	2020-09-16 02:39:02 +00:00
Italo Nicola	00914e2179	nir/algebraic: fold some nested comparisons with ball and bany Signed-off-by: Italo Nicola <italonicola@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6604>	2020-09-14 17:47:39 +00:00
Marek Olšák	656d8edd9e	nir/opt_vectorize: don't lose exact and no_*_wrap flags This fixes a bunch of dEQP GLES tests. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6694>	2020-09-11 17:41:14 -04:00
Marek Olšák	50d335804f	nir/algebraic: add late optimizations that optimize out mediump conversions (v3) v2: move 2mp patterns to the end of late_optimizations v3: remove ftrunc from the optimizations to fix: dEQP-GLES3.functional.shaders.builtin_functions.common.modf.vec2_lowp_vertex Reviewed-by: Rob Clark <robdclark@chromium.org> (v1) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	b86305bb57	nir/algebraic: collapse conversion opcodes (many patterns) mediump inserts a lot of conversions. This cleans up the IR. All other combinations are covered too. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	cdd498bbe8	nir: add new mediump opcodes f2[ui]mp, i2fmp, u2fmp Algebraic optimizations will select them. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	385b4dbc39	nir: enforce 32-bit src type requirement for f2fmp and i2imp Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	3d3df8dbff	nir: remove redundant opcode u2ump Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	26fc5e1f4a	nir/algebraic: expand existing 32-bit patterns to all bit sizes using loops Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	3c8934a644	nir/algebraic: add flrp patterns for 16 and 64 bits Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	40f7afc1e9	nir: fix lower_mediump_outputs to not require variables If IO is lowered, NIR doesn't have to contain any IO variables (and in fact radeonsi removes them and other drivers should too). This makes the pass work without variables. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6621>	2020-09-10 19:52:57 +00:00
Marek Olšák	c2ae39e0ce	nir: add mediump flag to IO semantics Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6621>	2020-09-10 19:52:57 +00:00
Jesse Natalie	89401e5867	nir: More NIR_MAX_VEC_COMPONENTS fixes Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6655>	2020-09-09 20:19:42 +00:00
Jason Ekstrand	c5dd54e600	nir/idiv_const: Use the modern nir_src_as_* constant helpers Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6655>	2020-09-09 20:19:42 +00:00
Jason Ekstrand	d86e38af2c	nir: More NIR_MAX_VEC_COMPONENTS fixes A couple of these probably aren't strictly necessary but they won't hurt. The one that's particularly tricky is a fixed-length array in nir_search.h. However, to avoid blowing up the binary size of nir_opt_algebraic by about 2x, we just assert that only small ops are used. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6655>	2020-09-09 20:19:42 +00:00
Jesse Natalie	7ee5da90ed	nir_dominance: Use uint32_t instead of int16_t for dominance counters We're seeing OpenCL kernels that can hit this INT16_MAX block count. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6657>	2020-09-09 19:01:01 +00:00
Rhys Perry	641d45befb	nir/opt_loop_unroll: fix is_access_out_of_bounds with vectors Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6347>	2020-09-09 12:34:47 +00:00
Jason Ekstrand	45bcb10841	nir: Add a dominance validation pass We don't do full dominance validation of SSA values in nir_validate because it requires generating valid dominance information and, while that's not extremely expensive, it's probably more than we want to do on every pass. Also, dominance information is generated through the metadata system so if we ran it by default in nir_validate, we would get different behavior of the metadata system based on whether or not you have a debug build and metadata bugs would be very hard to find. However, having a pass for it that can be run occasionally, should help detect and expose bugs. For ease of use, we add a NIR_VALIDATE_SSA_DOMINANCE environment variable which can be set to manually enable dominance validation as a standard part of nir_validate. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>	2020-09-08 19:44:01 +00:00
Rhys Perry	6cef804067	nir/opt_if: fix opt_if_merge when destination branch has a jump Fixes a case where opt_if_merge created code like: if (...) { break; loop { ... } } which caused opt_peel_loop_initial_if to complain that the loop pre-header wasn't a predecessor of the loop header. This patch prevents this (invalid, I think) unreachable code from being created. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3496 Fixes: `4d3f6cb973` ('nir: merge some basic consecutive ifs') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6633>	2020-09-08 18:39:47 +00:00
Eric Anholt	1ed78bd247	nir: Use explicit deref information to provide real UBO ranges. freedreno results (note that cat6 is loads from memory as opposed to pushed constants from the constant file): total instructions in shared programs: 8044344 -> 8022085 (-0.28%) total constlen in shared programs: 1411384 -> 1461964 (3.58%) total cat6 in shared programs: 89983 -> 87065 (-3.24%) Over the last 3 commits, we increased Manhattan31 performance by ~10% Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>	2020-09-08 18:20:51 +00:00
Eric Anholt	f3b33a5a35	nir: Add a range_base+range to nir_intrinsic_load_ubo(). For UBO accesses to be the same performance as classic GL default uniform block uniforms, we need to be able to push them through the same path. On freedreno, we haven't been uploading UBOs as push constants when they're used for indirect array access, because we don't know what range of the UBO is needed for an access. I believe we won't be able to calculate the range in general in spirv given casts that can happen, so we define a [0, ~0] range to be "We don't know anything". We use that at the moment for all UBO loads except for nir_lower_uniforms_to_ubo, where we now avoid losing the range information that default uniform block loads come with. In a departure from other NIR intrinsics with a "base", I didn't make the base be something you have to add to the src[1] offset. This keeps us from needing to modify all drivers (particularly since the base+offset thing can mean needing to do addition in the backend), makes backend tracking of ranges easy, and makes the range calculations in load_store_vectorizer reasonable. However, this could definitely cause some confusion for people used to the normal NIR base. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>	2020-09-08 18:20:51 +00:00
Eric Anholt	3a9356831a	nir: Update the comment about nir_lower_uniforms_to_ubo()'s multiplier. I remembered doing this analysis and was arguing in another MR that this pass didn't have any driver dependency, but it actually does based on PIPE_CAP_PACKED_UNIFORMS. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>	2020-09-08 18:20:51 +00:00
Rhys Perry	e4d75c22be	nir/opt_shrink_vectors: shrink image stores using the format fossil-db (Navi): Totals from 657 (0.48% of 135946) affected shaders: VGPRs: 26076 -> 25520 (-2.13%); split: -2.15%, +0.02% CodeSize: 3033016 -> 3014472 (-0.61%); split: -0.64%, +0.03% MaxWaves: 9386 -> 9420 (+0.36%) Instrs: 590109 -> 585502 (-0.78%); split: -0.82%, +0.04% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5772>	2020-09-07 18:06:50 +00:00
Jason Ekstrand	bd428162b6	nir/lower_io: Fix the unknown-array-index case in get_deref_align The current align_mul calculation in the unknown-array-index calculation is align_mul = MIN3(parent_mul, min_pow2_divisor(parent_offset), min_pow2_divisor(stride)) which is certainly correct if parent_offset > 0. However, when parent_offset = 0, min_pow2_divisor(parent_offset) isn't well-defined and our calculation for it is 1 << -1 which isn't well-defined. That said.... it's not actually needed. The offset to the base of the array is array_base = parent_mul * k + parent_offset for some integer k. When we throw in an unknown array index i, we get elem = parent_mul * k + parent_offset + stride * i. If we set new_align = MIN2(parent_mul, min_pow2_divisor(stride)), then both parent_mul and stride are divisible by new_align and elem = (parent_mul / new_align) * new_align * k + (stride / new_align) * new_align * i + parent_offset = new_align * ((parent_mul / new_align) * k + (stride / new_align) * i) + parent_offset so elem = new_align * j + parent_offset where j = (parent_mul / new_align) * k + (stride / new_align) * i. That's a very long-winded way of saying that we can delete one parameter from the align_mul calculation and it's still fine. :-) Fixes: `480329cf8b` "nir: Add a helper for getting the alignment of a deref" Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Tested-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6628>	2020-09-07 17:29:10 +00:00
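
The alignment argument in commit bd428162b6 can be checked numerically. The following is a minimal Python sketch, not Mesa's C code: the function names `unknown_index_align` and `claim_holds` are hypothetical, `min_pow2_divisor` is modeled here as the lowest set bit of a positive integer, and `parent_mul` is assumed to be a power of two. It computes `new_align = MIN2(parent_mul, min_pow2_divisor(stride))` and brute-forces the claim that every `elem = parent_mul * k + parent_offset + stride * i` differs from `parent_offset` by a multiple of `new_align`.

```python
def min_pow2_divisor(x: int) -> int:
    """Largest power of two that divides x (modeled as the lowest set bit).

    Only defined for positive x; x == 0 is the ill-defined case the
    commit message describes.
    """
    if x <= 0:
        raise ValueError("min_pow2_divisor is only defined for positive x")
    return x & -x


def unknown_index_align(parent_mul: int, stride: int) -> int:
    """Alignment multiplier for an array element with unknown index.

    Note that parent_offset is not an input: that is the point of the
    commit message's derivation.
    """
    return min(parent_mul, min_pow2_divisor(stride))


def claim_holds(parent_mul: int, parent_offset: int, stride: int,
                limit: int = 8) -> bool:
    """Brute-force check that elem - parent_offset is a multiple of new_align."""
    new_align = unknown_index_align(parent_mul, stride)
    for k in range(limit):
        for i in range(limit):
            elem = parent_mul * k + parent_offset + stride * i
            if (elem - parent_offset) % new_align != 0:
                return False
    return True
```

For example, with `parent_mul = 16` and `stride = 12` the sketch yields an alignment of 4, and the check passes for `parent_offset = 0` as well as for nonzero offsets, which is the case the original three-argument formula could not handle.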

2585 commits