fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-27 03:38:12 +02:00

Author	SHA1	Message	Date
Mike Blumenkrantz	5342dbe96d	features: mark off GL 4.1 for zink Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8329>	2021-01-05 08:58:28 -05:00
Mike Blumenkrantz	c211f466cc	zink: GLSL 410 Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8329>	2021-01-05 08:58:27 -05:00
Mike Blumenkrantz	3f640e56c4	features: mark off GL 4.0 for zink Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8329>	2021-01-05 08:58:25 -05:00
Mike Blumenkrantz	ae9d6c5620	zink: GLSL 4.00 Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8329>	2021-01-05 08:58:24 -05:00
Mike Blumenkrantz	22be7b9674	zink: handle arrays of ubos with the nir pass removing all dynamic indexing, all that's needed here is generating extra binding points for each array member, as everything else is already handled Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8314>	2021-01-05 08:37:03 -05:00
Mike Blumenkrantz	dbba989907	zink: run nir_lower_dynamic_bo_access this fixes up most cases of dynamic bo loading Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8314>	2021-01-05 08:37:03 -05:00
Mike Blumenkrantz	35e346f428	zink: handle vertex streams we already support all this, it's just a matter of slapping on some Stream decoration flex tape Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8204>	2021-01-05 13:30:34 +00:00
Mike Blumenkrantz	68242767d2	zink: enable PIPE_CAP_START_INSTANCE and add feature Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8313>	2021-01-05 13:24:14 +00:00
Mike Blumenkrantz	351b6c667e	zink: always load (gl_InstanceID - gl_BaseInstance) when loading gl_InstanceID gl's values here always begin at 0, while vk begins with the firstInstance param used in the current draw command Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8313>	2021-01-05 13:24:14 +00:00
Samuel Pitoiset	4bb92d9145	radv: enable TC-compat HTILE in GENERAL on GFX10+ GFX10+ supports compressed writes to HTILE, so it should just work to skip decompressions when transitioning from/to GENERAL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>	2021-01-05 12:10:11 +00:00
Samuel Pitoiset	326c7312bf	radv: only load the DS fast clear values for compressed rendering Otherwise it's useless because we are unlikely to perform a fast depth stencil clear. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>	2021-01-05 12:10:11 +00:00
Samuel Pitoiset	76e33d528b	radv: clean up radv_layout_is_htile_compressed() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>	2021-01-05 12:10:11 +00:00
Samuel Pitoiset	f4f096805b	radv: fix TC-compat HTILE images with DST_OPTIMAL on the compute queue This is probably rare but can happen if someone performs a depth-stencil copy on the compute queue. This might work (untested by CTS) but it looks more conservative to decompress before performing the operation. Found by inspection. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>	2021-01-05 12:10:11 +00:00
Samuel Pitoiset	1c539b6484	radv: add radv_htile_get_initial_value() and document the HTILE dword Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>	2021-01-05 12:10:11 +00:00
Samuel Pitoiset	3038c88661	radv: fix potential HTILE issues for TC-compat images on GFX8 We can only use the entire HTILE buffer if TILE_STENCIL_DISABLE is TRUE. On GFX8+, this is only true if the depth image has no stencil and if it's not TC-compatible because of the ZRANGE_PRECISION issue. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>	2021-01-05 12:10:11 +00:00
Samuel Pitoiset	f7f6e9ad56	radv: always clear the SR0/SR1 bits of the HTILE buffer To make sure the stencil compare state is properly initialized and cleared when the driver performs a fast depth clear. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>	2021-01-05 12:10:11 +00:00
Pierre-Eric Pelloux-Prayer	5c3b471c9f	mesa/st: fix redundant initialization https://gitlab.freedesktop.org/mesa/mesa/-/issues/3966 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7846>	2021-01-05 11:29:12 +00:00
Pierre-Eric Pelloux-Prayer	094ab8bc12	radeonsi: fix redundant initializations See https://gitlab.freedesktop.org/mesa/mesa/-/issues/3966 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7846>	2021-01-05 11:29:11 +00:00
Pierre-Eric Pelloux-Prayer	b1c7a65815	gallium/vl: merge identical h264/h265 enums Use h2645 notations for shared enums to reduce duplication and fix a clang warning. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7846>	2021-01-05 11:29:11 +00:00
Pierre-Eric Pelloux-Prayer	8d347742fe	tesselator: remove unused variable Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7846>	2021-01-05 11:29:11 +00:00
Pierre-Eric Pelloux-Prayer	d0767fc045	amd/addrlib: use cpp.has_argument() to filter compiler arguments Acked-by: Michel Dänzer <mdaenzer@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7846>	2021-01-05 11:29:11 +00:00
Pierre-Eric Pelloux-Prayer	6679a34394	vdpau: fix invalid enum usage Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7846>	2021-01-05 11:29:11 +00:00
Pierre-Eric Pelloux-Prayer	cd1ac36ddd	vdpau: fix -Wabsolute-value warning vdpau specifies that top-left is x0/y0, bottom-right is x1/y1 and that x0/y0 are inclusive while x1/y1 are exclusive. This commit removes the abs() usage and instead verifies that the VdpRects passed by the user match the documentation. When they don't, they're treated as empty rectangles. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7846>	2021-01-05 11:29:11 +00:00
Rhys Perry	c5973ede01	ac/nir: use llvm.readcyclecounter for LLVM9+ Unlike llvm.amdgcn.s.memtime, this works on GFX10.3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4033 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8306>	2021-01-05 10:27:00 +00:00
Eric Anholt	c0bcde8b45	gallium/tgsi_exec: Remove unused MaxGeometryShaderOutputs. Just an indirection from the value you should be grepping for (the one that controls the allocation of the output buffer). Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8283>	2021-01-05 09:54:49 +00:00
Eric Anholt	d31c30007b	gallium/tgsi_exec: Clean up storage of the pixel kill mask. We need one dword per exec, rather than one per channel, since it's the bitmask of channels killed. Removes the remainder of the TGSI_EXEC_NUM_TEMP_EXTRAS! Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8283>	2021-01-05 09:54:49 +00:00
Eric Anholt	6fb9365a07	gallium/tgsi_exec: Drop the unused scratch temp regs. I suspect this was used back in the SSE2 backend days. Definitely dead now. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8283>	2021-01-05 09:54:49 +00:00
Eric Anholt	c27cbfd9ed	gallium/tgsi_exec: Stop doing the weird allocation of the Addrs array. Saves an indirection on referencing the address regs, and also my sanity. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8283>	2021-01-05 09:54:49 +00:00
Eric Anholt	af135bb8af	gallium/tgsi_exec: Simplify GS output vertex count tracking. We had this strange 5-dword-per-stream storage for the single dword current vertex count, due to copy and paste. We can make much cleaner code by just having a 4-element array in the machine. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8283>	2021-01-05 09:54:49 +00:00
Samuel Pitoiset	831d9d406a	radv: remove unused radv_image::aspects Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8324>	2021-01-05 09:46:01 +00:00
Samuel Pitoiset	58c68bac39	radv: fix clearing images with vkCmdClear{Color,DepthStencil}Image() The image aspects field is actually never set and we should use the range aspect anyways. Fixes: `1a7b7b17ad` ("radv: avoid oob read during clear") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8324>	2021-01-05 09:46:01 +00:00
Pierre-Eric Pelloux-Prayer	4c751ad67a	vbo/dlist: use a shared index buffer Draws can be merged by u_threaded if they share the same IB. This improves performance in SPECviewperf13 snx-03: tests fps are improved by a 1.2x - 2.0x factor. v2: reworked error handling Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> (v2) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8111>	2021-01-05 09:51:56 +01:00
Marek Olšák	a0314083be	mesa: fix a second bug in merging light state parameters with unpacked uniforms The memcpy size should be packed even if the allocated parameter size is padded to 4 components. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8017>	2021-01-05 03:47:16 +00:00
Marek Olšák	45acf9b49a	mesa: fix a bug in merging light state parameters with unpacked uniforms This code is not enabled yet. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8017>	2021-01-05 03:47:16 +00:00
Marek Olšák	4db8b171a5	mesa: add STATIC_ASSERTs to the STATE_LIGHT_ATTRIBS case Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8017>	2021-01-05 03:47:16 +00:00
Marek Olšák	6549caf2c2	st/mesa: fix a defect when st_validate_state was invoked for unused states This fixes a small performance issue. Discovered with piglit/drawoverhead. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8017>	2021-01-05 03:47:16 +00:00
Marek Olšák	1f17f8bb6d	st/mesa: simplify checking whether to pin threads to L3 Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8017>	2021-01-05 03:47:16 +00:00
Marek Olšák	a0467b7fa1	util: replace UTIL_MAX_CPUS by util_cpu_caps.num_cpu_mask_bits to reduce overhead when setting thread affinity. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8017>	2021-01-05 03:47:16 +00:00
Alexander von Gluck IV	c7486c996e	glsl/builtin_functions: Rename int64 function to int64_avail * int64 is a core type on Haiku (and potentially other platforms) * rename to int64_avail matching other similar calls Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2021-01-04 21:18:55 -06:00
Alexander von Gluck IV	cd2f3627a6	meson: Add _GNU_SOURCE for Haiku to activate non-posix functions Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2021-01-04 21:18:54 -06:00
Marek Olšák	76eb3478cf	radeonsi: take color interpolation into account for shader variants Fixes: - Sample shading now uses per-sample interpolation for colors if colors are the only inputs. (this is the only case that was broken) Optimizations: - BC_OPTIMIZE (barycentric optimization) is now enabled with MSAA if colors are qualified with both center and centroid. (BC_OPTIMIZE means that the hardware skips initializing centroid (i,j) if they are equal to center (i,j)) - If MSAA is disabled and at least 2 out of (center, centroid, sample) are used by all inputs now including colors, center is forced for all inputs. - If INTERP_MODE_COLOR is not used and the legacy GL shade model is flat, the shader variant for flat shading is not generated. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8225>	2021-01-05 02:43:55 +00:00
Marek Olšák	31240a875c	radeonsi: add driconf options to enable/disable Smart Access Memory so that anybody can test it if they have Above 4G Decoding and compare performance. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8225>	2021-01-05 02:43:55 +00:00
Marek Olšák	b94626d3ee	ac,radeonsi: limit Smart Access Memory to Zen 3 and GFX10.3 due to perf issues Many people experience performance degradation on some systems. There will be a driconf option to enable SAM on other chips as well as disable it on enabled systems. Fixes: `d3d6d38145` - ac: add radeon_info::all_vram_visible for Smart Access Memory Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3982 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8225>	2021-01-05 02:43:55 +00:00
Marek Olšák	e4fa7c440d	util: add AMD CPU family enums and enable L3 cache pinning on Zen3 Based on: https://en.wikichip.org/wiki/amd/cpuid The only reason it's nominated as a fix is because Zen3 might underperform because the CPU detection ignored it. Fixes: `15fa2c5e35` - gallium/u_cpu_detect: get the number of cores per L3 cache for AMD Zen Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8225>	2021-01-05 02:43:55 +00:00
Vinson Lee	8457be1497	radeonsi: Fix typos. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8289>	2021-01-05 02:25:36 +00:00
Ian Romanick	539c25c2da	nir/algebraic: Move the flrp -> bcsel rule earlier If multiple rules could match, the rule that appears first in the file is used. Only Tiger Lake and Ice Lake are affected. Other platforms either have a LRP instruction or can't run any shaders from shader-db that would benefit. v2: Fix issues created when this commit was rebased on top of `3c8934a644` ("nir/algebraic: add flrp patterns for 16 and 64 bits"). Noticed by Caio. Tiger Lake and Ice Lake had similar results. total instructions in shared programs: 20908672 -> 20908661 (<.01%) instructions in affected programs: 419 -> 408 (-2.63%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 3 x̄: 2.20 x̃: 3 helped stats (rel) min: 1.85% max: 3.19% x̄: 2.49% x̃: 2.65% 95% mean confidence interval for instructions value: -3.56 -0.84 95% mean confidence interval for instructions %-change: -3.24% -1.73% Instructions are helped. total cycles in shared programs: 473513940 -> 473513793 (<.01%) cycles in affected programs: 7176 -> 7029 (-2.05%) helped: 12 HURT: 0 helped stats (abs) min: 5 max: 22 x̄: 12.25 x̃: 12 helped stats (rel) min: 0.84% max: 3.24% x̄: 2.09% x̃: 1.80% 95% mean confidence interval for cycles value: -15.43 -9.07 95% mean confidence interval for cycles %-change: -2.57% -1.61% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	ec16f935fe	nir/algebraic: Mark comparisons generated from lowered fsign precise This prevents other transformations from converting them to 'a != 0'. For example, both of these transformations can do this: (('~flt', 0.0, ('fabs', a)), ('fne', a, 0.0)), (('~flt', ('fneg', ('fabs', a)), 0.0), ('fne', a, 0.0)), Both fsign(fabs(NaN)) and fsign(fneg(fabs(NaN))) should produce zero, but, since 'NaN != 0.0' is true, cascading these transformations could cause them to generate 1.0 or -1.0 respectively. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	9771af5dde	nir/algebraic: Fix broken NaN and -0.0 behavior No shader-db or fossil-db changes on any Intel platform. v2: Add a coding line to fix SCons build problems caused by the ± character. Fixes: `25bfba3335` ("nir/algebraic: Recognize open-coded copysign(1.0, a)") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	010e663cc3	spir-v: Mark floating point comparisons exact OpenGL GLSL, OpenGL ARB assembly shaders, and DX9 are pretty loose about the behavior in the presence of NaNs. Many GPUs that implement these specifications do not even have a representation of NaN. However, OpenCL and Vulkan SPIR-V are not so lax. Both actually have some required behavior in the presence of NaN, and, of the two, OpenCL is the most strict. For years we have implemented SPIR-V by using the same comparison opcodes as we use for OpenGL GLSL and OpenGL assembly shaders. This has repeatedly caused problems where an optimization that is valid in the NaN-relaxed world is not valid in Vulkan or OpenCL. To fix this, set the "exact" flag on comparisons instructions generated from SPIR-V. This will block optimizations that may have different NaN behavior. v2: Set the exact flag in the nir_builder, not in the vtn_builder. v3: Add an assertion in vtn_handle_constant that the exact flag wasn't set (because it's ignored). Rebase on `80163bbec3` ("nir/vtn: Support OpOrdered and OpUnordered opcodes"). Mark the NIR generated for those opcodes as exact as well. v4: s/unused_exact/exact/ in a couple places, and assert that exact has the expected value (true in one place, false in the other). Suggested by Caio. Closes: #3345 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Fixes: `8513b12590` ("nir/opt_if: split ALU from Phi more aggressively") This commit doesn't really fix anything in `8513b12590`. However, without `8513b12590`, a regression is triggered in RADV on No Man's Sky. I want to ensure that this change is only applied on top of `8513b12590`, and Fixes: seems the safest way to do that. No shader-db changes on any Intel platform. This only affects SPIR-V, and we have no OpenGL SPIR-V shaders in shader-db. 124 shaders in Shadow of the Tomb Raider (Steam "native") were hurt by 1 spill and 1 fill each. 
All Intel platforms had similar results. (Tiger Lake shown) Instructions in all programs: 155668276 -> 155685764 (+0.0%) SENDs in all programs: 6474570 -> 6474570 (+0.0%) Loops in all programs: 35271 -> 35271 (+0.0%) Cycles in all programs: 3198055373 -> 3198628031 (+0.0%) Spills in all programs: 231522 -> 231646 (+0.1%) Fills in all programs: 347571 -> 347695 (+0.0%) Vega Totals: SGPRs: 20955712 -> 20956756 (+0.00%); split: -0.02%, +0.03% VGPRs: 13476920 -> 13473132 (-0.03%); split: -0.07%, +0.04% CodeSize: 613371940 -> 613339348 (-0.01%); split: -0.06%, +0.05% MaxWaves: 3111886 -> 3112481 (+0.02%); split: +0.02%, -0.00% Instrs: 120723785 -> 120746991 (+0.02%); split: -0.04%, +0.06% Cycles: 626658992 -> 626862708 (+0.03%); split: -0.05%, +0.08% VMEM: 216330854 -> 216343196 (+0.01%); split: +0.04%, -0.04% SMEM: 32079391 -> 32081972 (+0.01%); split: +0.05%, -0.04% VClause: 2688784 -> 2688789 (+0.00%); split: -0.03%, +0.03% SClause: 6554669 -> 6556251 (+0.02%); split: -0.01%, +0.03% Copies: 5356667 -> 5353283 (-0.06%); split: -0.36%, +0.29% Branches: 954466 -> 954716 (+0.03%); split: -0.01%, +0.04% PreSGPRs: 9078300 -> 9081626 (+0.04%); split: -0.01%, +0.05% PreVGPRs: 10972090 -> 10966576 (-0.05%); split: -0.06%, +0.01% Totals from 48239 (12.08% of 399432) affected shaders: SGPRs: 2713984 -> 2715028 (+0.04%); split: -0.16%, +0.19% VGPRs: 1997804 -> 1994016 (-0.19%); split: -0.46%, +0.27% CodeSize: 172094092 -> 172061500 (-0.02%); split: -0.21%, +0.19% MaxWaves: 337327 -> 337922 (+0.18%); split: +0.20%, -0.02% Instrs: 33053657 -> 33076863 (+0.07%); split: -0.15%, +0.22% Cycles: 254961228 -> 255164944 (+0.08%); split: -0.12%, +0.20% VMEM: 15165226 -> 15177568 (+0.08%); split: +0.59%, -0.51% SMEM: 3304938 -> 3307519 (+0.08%); split: +0.49%, -0.41% VClause: 766225 -> 766230 (+0.00%); split: -0.12%, +0.12% SClause: 1332645 -> 1334227 (+0.12%); split: -0.04%, +0.16% Copies: 2040651 -> 2037267 (-0.17%); split: -0.94%, +0.77% Branches: 743668 -> 743918 (+0.03%); split: 
-0.01%, +0.05% PreSGPRs: 1697667 -> 1700993 (+0.20%); split: -0.07%, +0.27% PreVGPRs: 1718424 -> 1712910 (-0.32%); split: -0.39%, +0.07% Polaris Totals: SGPRs: 21349172 -> 21354376 (+0.02%); split: -0.02%, +0.04% VGPRs: 13690680 -> 13686920 (-0.03%); split: -0.07%, +0.04% CodeSize: 613745824 -> 613704988 (-0.01%); split: -0.06%, +0.05% MaxWaves: 2775012 -> 2775189 (+0.01%); split: +0.01%, -0.00% Instrs: 120735079 -> 120756209 (+0.02%); split: -0.04%, +0.06% Cycles: 627906100 -> 628076156 (+0.03%); split: -0.05%, +0.08% VMEM: 216623065 -> 216641838 (+0.01%); split: +0.04%, -0.04% SMEM: 32295618 -> 32299338 (+0.01%); split: +0.05%, -0.04% VClause: 2711025 -> 2711141 (+0.00%); split: -0.03%, +0.04% SClause: 6545185 -> 6546769 (+0.02%); split: -0.01%, +0.03% Copies: 5387723 -> 5383249 (-0.08%); split: -0.37%, +0.29% Branches: 953775 -> 953954 (+0.02%); split: -0.01%, +0.03% PreSGPRs: 9148814 -> 9153211 (+0.05%); split: -0.01%, +0.06% PreVGPRs: 11029429 -> 11023915 (-0.05%); split: -0.06%, +0.01% Totals from 48239 (12.00% of 402052) affected shaders: SGPRs: 2682056 -> 2687260 (+0.19%); split: -0.16%, +0.35% VGPRs: 1994436 -> 1990676 (-0.19%); split: -0.46%, +0.27% CodeSize: 170857060 -> 170816224 (-0.02%); split: -0.21%, +0.19% MaxWaves: 295429 -> 295606 (+0.06%); split: +0.07%, -0.01% Instrs: 32808802 -> 32829932 (+0.06%); split: -0.16%, +0.22% Cycles: 254633252 -> 254803308 (+0.07%); split: -0.13%, +0.20% VMEM: 14897934 -> 14916707 (+0.13%); split: +0.65%, -0.52% SMEM: 3289726 -> 3293446 (+0.11%); split: +0.53%, -0.42% VClause: 775318 -> 775434 (+0.01%); split: -0.11%, +0.13% SClause: 1304867 -> 1306451 (+0.12%); split: -0.04%, +0.16% Copies: 2026334 -> 2021860 (-0.22%); split: -0.99%, +0.77% Branches: 742554 -> 742733 (+0.02%); split: -0.02%, +0.04% PreSGPRs: 1690887 -> 1695284 (+0.26%); split: -0.07%, +0.33% PreVGPRs: 1717709 -> 1712195 (-0.32%); split: -0.40%, +0.07% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	55621c6d1c	nir/algebraic: Add some compare-with-zero optimizations that are exact This prevents some fossil-db regressions in "spir-v: Mark floating point comparisons exact". v2: Note that the patterns and replacements produce the same value when isnan(b). Suggested by Caio. v3: Use C99 isfinite() instead of (obsolete) BSD finite(). Fixes various Windows builds. No fossil-db changes on any Intel platform, Vega, or Polaris10. All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 20908670 -> 20908672 (<.01%) instructions in affected programs: 69 -> 71 (2.90%) helped: 0 HURT: 1 total cycles in shared programs: 473515288 -> 473513940 (<.01%) cycles in affected programs: 4942 -> 3594 (-27.28%) helped: 2 HURT: 0 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
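
The NaN hazard described in ec16f935fe ("nir/algebraic: Mark comparisons generated from lowered fsign precise") can be illustrated with a small sketch in plain Python floats. This is not Mesa code: the function names are hypothetical, and the lowering of fsign(a) to b2f(0.0 < a) - b2f(a < 0.0) is an assumption made for illustration. It only shows why rewriting `0.0 < fabs(a)` to `a != 0.0` changes the result when a is NaN.

```python
import math

def fsign_lowered(a: float) -> float:
    # Assumed lowering of fsign using ordered comparisons only:
    # b2f(0.0 < a) - b2f(a < 0.0). Ordered comparisons with NaN are false,
    # so NaN yields 0.0 - 0.0 = 0.0.
    return float(0.0 < a) - float(a < 0.0)

def fsign_of_fabs_exact(a: float) -> float:
    # fsign(fabs(a)) with the comparisons left untouched ("exact").
    return fsign_lowered(abs(a))

def fsign_of_fabs_rewritten(a: float) -> float:
    # Same expression after the inexact rule quoted in the commit message,
    # ('~flt', 0.0, ('fabs', a)) -> ('fne', a, 0.0), has been applied to
    # the first comparison. 'NaN != 0.0' is true, so NaN now yields 1.0.
    return float(a != 0.0) - float(abs(a) < 0.0)

nan = math.nan
print(fsign_of_fabs_exact(nan))      # 0.0
print(fsign_of_fabs_rewritten(nan))  # 1.0
```

For every non-NaN input the two versions agree, which is why the rewrite is acceptable under NaN-relaxed rules but has to be blocked for comparisons that must keep ordered semantics.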


135153 commits