fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 12:58:09 +02:00

Author	SHA1	Message	Date
Jonathan Marek	0c6702cfa5	nir: improve convert_yuv_to_rgb Use a different arrangement of constants to allow more ffma. A vec4 backend will now use 3 fma for yuv_to_rgb. On freedreno/ir3, it is down from 10 to 7 alu (4 fma, 3 mul, 3 add to 7 fma). Other backends shouldn't be hurt. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-01 04:13:36 -07:00
Juan A. Suarez Romero	bbbe00a101	spirv: add missing SPV_EXT_descriptor_indexing capabilities Add ShaderNonUniformEXT, UniformBufferArrayNonUniformIndexingEXT, SampledImageArrayNonUniformIndexingEXT, StorageBufferArrayNonUniformIndexingEXT, StorageImageArrayNonUniformIndexingEXT, InputAttachmentArrayNonUniformIndexingEXT, UniformTexelBufferArrayNonUniformIndexingEXT and StorageTexelBufferArrayNonUniformIndexingEXT capabilities. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-04-30 09:22:45 +02:00
Caio Marcelo de Oliveira Filho	1fb6630636	spirv: Properly handle SpvOpAtomicCompareExchangeWeak The code was handling the Weak variant in some cases, but missing others, e.g. the get_deref_nir_atomic_op. Add all the missing cases with the same behavior of the non-Weak SpvOpAtomicCompareExchange. Note that the Weak variant is basically an alias, as SPIR-V 1.3, Revision 7 says "OpAtomicCompareExchangeWeak Deprecated (use OpAtomicCompareExchange). Has the same semantics as OpAtomicCompareExchange." Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-29 19:02:44 -07:00
Eric Engestrom	7ca8ba199f	delete autotools .gitignore files One special case, `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-04-29 21:17:19 +00:00
Andres Gomez	c81fbb42d9	glsl/linker: check for xfb_offset aliasing From page 76 (page 80 of the PDF) of the GLSL 4.60 v.5 spec: " No aliasing in output buffers is allowed: It is a compile-time or link-time error to specify variables with overlapping transform feedback offsets." Currently, this is expected to fail, but it succeeds: " ... layout (xfb_offset = 0) out vec2 a; layout (xfb_offset = 0) out vec4 b; ... " Fixes the following piglit test: tests/spec/arb_enhanced_layouts/compiler/transform-feedback-layout-qualifiers/xfb_offset/invalid-overlap.vert Fixes the following test: KHR-GL44.enhanced_layouts.xfb_output_overlapping v2: - Use a data structure to track the used components instead of a nested loop (Ilia). v3: - Take the BITSET_WORD array out from the gl_transform_feedback_buffer struct and make it local to the validation process (Timothy). - Do not use a nested scope for the validation (Timothy). v4: - Add reference to the fixed piglit test in the commit log. - Add reference to the fixed VK-GL-CTS test in the commit log (Tapani). - Empty initialize the BITSET_WORD pointers array (Tapani). Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-04-29 12:13:29 +02:00
Kenneth Graunke	2b44b27dbe	nir: Add a new nir_cf_list_is_empty_block() helper. Helper and name suggested by Eric Anholt. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-28 22:36:08 -07:00
Kenneth Graunke	08dc93c67c	glsl/list: Add an exec_list_is_singular() helper. Similar to list_is_singular() in util/list.h. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-28 22:35:42 -07:00
Andreas Baierl	b82de2b4d7	nir: add rcp(w) lowering for gl_FragCoord On some hardware (e.g. Mali400) the shader needs to apply some transformations for correct gl_FragCoord handling. The lowering actions look like the following in pseudocode: gl_FragCoord.xyz = gl_FragCoord_orig.xyz gl_FragCoord.w = 1.0 / gl_FragCoord_orig.w Add this lowering as a nir pass in preparation for using it in the driver. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-29 02:46:44 +00:00
Tapani Pälli	af06963d24	glsl: use empty brace initializer fixes following warning with clang: warning: suggest braces around initialization of subobject Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-04-26 12:24:41 -07:00
Tapani Pälli	7a7f182dac	nir: use braces around subobject in initializer Used same syntax as elsewhere with Mesa sources, verified result against MSVC with godbolt.org. fixes following warning with clang: warning: suggest braces around initialization of subobject v2: empty braces -> braces around subobject (Caio, Kristian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-04-26 12:01:22 -07:00
Jason Ekstrand	00d4e78ea9	nir/algebraic: Optimize integer cast-of-cast These have been popping up more and more with the OpenCL work and other bits causing extra conversions to/from 64-bit. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-04-26 04:26:08 -05:00
Dave Airlie	d946cbe9f5	nir: fix bit_size in lower indirect derefs. This fixes a case where we are expecting 64-bit but generate 32-bit consts and validate gets angry. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-04-26 12:59:43 +10:00
Marek Olšák	c5f65bfe6c	glsl: fix shader_storage_blocks_write_access for SSBO block arrays (v2) This fixes KHR-GL45.compute_shader.resources-max on radeonsi. Fixes: `4e1e8f684b` "glsl: remember which SSBOs are not read-only and pass it to gallium" v2: use is_interface_array, protect against assertion failures in u_bit_consecutive Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-04-25 18:57:38 -04:00
Rob Clark	2f0b9d2249	freedreno/ir3: lower load_barycentric_at_offset Calculates i,j at specified offset within a pixel. A new load_size_ir3 intrinsic is used in conjunction with fddx/fddy to translate the offset into primitive space and adjust the i,j from load_barycentric_pixel accordingly. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	c4f423aa36	freedreno/ir3: lower load_barycentric_at_sample This lowers load_barycentric_at_sample to load_sample_pos_from_id plus load_barycentric_at_offset. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Rob Clark	4d08c1b595	compiler: rename SYSTEM_VALUE_VARYING_COORD And add corresponding enums for different sorts of varying interpolation. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-04-25 14:13:31 -07:00
Caio Marcelo de Oliveira Filho	d5ac5d6e83	nir: Add option to lower tex to txl when shader doesn't support implicit LOD We already add the LOD src, so go ahead and update the texop as well when this option is set. v2: Make it an option. (Rob Clark) v3: Use a more concise name suggested by Jason. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-25 12:13:06 -07:00
Timothy Arceri	b155f74d7b	nir: fix nir_remove_unused_varyings() We were only setting the used mask for the first component of a varying. Since the linking opts split vectors into scalars this has mostly worked ok. However this causes an issue where for example if we split a struct on one side of the interface but not the other, then we can possibly end up removing the first components on the side that was split and then incorrectly remove the whole struct on the other side of the varying. With this change we simply mark all 4 components for each slot used by a struct. We could possibly make this more fine grained but that would require a more complex change. This fixes a bug in Strange Brigade on RADV when tessellation is enabled, all credit goes to Samuel Pitoiset for tracking down the cause of the bug. Fixes: `f1eb5e6399` ("nir: add component level support to remove_unused_io_vars()") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-25 16:37:36 +10:00
Marek Olšák	45ca7798dc	glsl: handle interactions between EXT_gpu_shader4 and texture extensions also, EXT_texture_buffer_object has to be enabled separately. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	825c35999c	glsl: allow "varying out" for fragment shader outputs with EXT_gpu_shader4 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	4ff3b8e18a	glsl: add texture builtin functions for EXT_gpu_shader4 v2: some fixes to texture functions thanks to piglit tests Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	8dbe23c8c6	glsl: add arithmetic builtin functions for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	7004114102	glsl: add builtin variables for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	1a973aa5e1	glsl: apply some 1.30 and other rules to EXT_gpu_shader4 as well Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Chris Forbes	85fefd1913	glsl: enable types for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Marek Olšák	a7f38e7fbd	glsl: add `unsigned int` type for EXT_GPU_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Chris Forbes	2d8f4fff49	glsl: enable noperspective\|flat\|centroid for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Chris Forbes	8740726e46	glsl: add scaffolding for EXT_gpu_shader4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-24 20:45:15 -04:00
Ian Romanick	3b087f668f	glsl: Silence many unused parameter warnings in glsl/ir.h Every file that included glsl/ir.h had a warning like: src/compiler/glsl/ir.h: In member function ‘virtual bool ir_rvalue::is_lvalue(const _mesa_glsl_parse_state*) const’: src/compiler/glsl/ir.h:236:64: warning: unused parameter ‘state’ [-Wunused-parameter] virtual bool is_lvalue(const struct _mesa_glsl_parse_state *state = NULL) const ^ Cc: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `fa4ebf6b8d` ("glsl: add _mesa_glsl_parse_state object to is_lvalue()") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-23 17:49:19 -07:00
Timothy Arceri	a6b7068ff5	st/mesa/radeonsi: fix race between destruction of types and shader compilation Commit `624789e370` moved the destruction of types out of atexit() and made use of a ref count instead. This is useful for avoiding a crash where drivers such as radeonsi are still compiling in a thread when the app exits and has not called MakeCurrent to change from the current context. While the above scenario is technically an app bug we shouldn't crash. However that change caused another race condition between the shader compilation thread in radeonsi and context teardown functions. This patch makes two changes to fix this new problem: First we explicitly call _mesa_destroy_shader_compiler_types() when destroying the st context rather than calling it indirectly via _mesa_free_context_data(). We do this as we must call it after st_destroy_context_priv() so that we don't destroy the glsl types before the compilation threads finish. Next, wait for the shader threads to finish in si_destroy_context(); this also means we need to call context destroy before destroying the queues in si_destroy_screen(). Fixes: `624789e370` ("compiler/glsl: handle case where we have multiple users for types") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-24 10:23:10 +10:00
Caio Marcelo de Oliveira Filho	7e2684ce01	spirv: Handle SpvOpDecorateId This operation decorates with an Id instead of a Literal or String. It is used by HlslCounterBufferGOOGLE (provided by SPV_GOOGLE_hlsl_functionality1). Even if we don't do anything with that decoration, we must be able to parse SPIR-V that uses it. Fixes: `891886da2f` "spirv: Add no-op support for VK_GOOGLE_hlsl_functionality1" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-23 14:58:01 -07:00
Caio Marcelo de Oliveira Filho	7b66d584a3	spirv: Rename vtn_decoration literals to operands Decorations (and ExecutionModes) can have not only literals, but also Ids associated with them. So rename the field to the more general name "Operand" used by the spec. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-23 14:58:01 -07:00
Kenneth Graunke	47303b466c	Revert "glsl: Set location on structure-split sampler uniform variables" This reverts commit `9e0c744f07`, which regressed dEQP-GLES2.functional.uniform_api.random.3. It turns out that the newly produced location is meaningless and impossible to consume by drivers that want to look at gl_uniform_storage, so it's probably better to leave it unset (0) than a number that looks usable. Leave a tombstone^Wcomment to discourage the next person from making the obvious looking fix. See the next commit for a longer description of the problem. This breaks tests/spec/glsl-1.10/execution/samplers/uniform-struct on i965, which was originally fixed by the revert. The next commit will fix it again. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-22 15:39:55 -07:00
Jason Ekstrand	ccb25aaeaf	nir: Use the NIR_SRC_AS_ macro to define nir_src_as_deref We have a macro for this now; no reason to hand-roll it for derefs. While we're here, move the NIR_DEFINE_CAST for derefs down to where all the other ones are. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-22 15:23:24 +00:00
Jason Ekstrand	470422870a	nir: Add helpers for getting the type of an address format Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	2edf29b933	intel,nir: Lower TXD with a bindless sampler When we have a bindless sampler, we need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	995dc4e5c3	nir/lower_io: Expose some explicit I/O lowering helpers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Kristian H. Kristensen	41593f3c37	nir_opcodes.py: Saturate to expression that doesn't overflow Compiler warns about overflow when assigning UINT64_MAX to something smaller than a uint64_t: src/compiler/nir/nir_constant_expressions.c:16909:50: warning: implicit conversion from 'unsigned long long' to 'uint1_t' (aka 'unsigned char') changes value from 18446744073709551615 to 255 [-Wconstant-conversion] uint1_t dst = (src0 + src1) < src0 ? UINT64_MAX : (src0 + src1); ~~~ ^~~~~~~~~~ Shift UINT64_MAX down to the appropriate maximum value for the type being assigned to. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-19 16:17:37 +00:00
Kristian H. Kristensen	15605cc9d4	glsl_to_nir: Initialize debug variable If we want to assert on found == true when the loop exits early, we need to initialize it to false. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-04-19 16:17:37 +00:00
Eric Anholt	38c75aff4c	nir: Use the nir_builder _imm helpers in setting up deref offsets. When looking at the dEQP nested_struct_array_dynamic_index_fragment code after lowering, I was horrified at the amount of adding and multiplying by 0 we were doing. The builder _imm helpers handle that for you so that the following optimization passes have less work to do. Plus, it's easier to read. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-19 08:45:14 -07:00
Eric Anholt	9ac5ec2f90	nir: Fix deref offset calculation for structs. We were calculating the offset for the field within the struct, and just dropping it on the floor. Fixes a regression in KHR-GLES3.shaders.struct.local.nested_struct_array_dynamic_index_fragment and a few of its friends since the scratch lowering commit. Fixes: `e8e159e9df` ("nir/deref: Add helpers for getting offsets") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-19 08:45:14 -07:00
Erico Nunes	4577eb7b7c	nir/algebraic: add lowering for fsign The mali utgard pp doesn't support a sign instruction. In the ARM offline shader compiler, the sign function is implemented using sub(gt(0.0, a), lt(0.0, a)). This is a generic optimization, so implement it in the nir level when lower_fsign is set, alongside the lowering for isign. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-19 15:42:23 +00:00
Ian Romanick	6b97fa9a99	nir/algebraic: Strength reduce some compares of x and -x Converting the x vs -x comparison to an x vs 0 comparison enables cmod propagation to help. This seems to be a win everywhere except Gen7. Skylake and Broadwell had similar results. (Broadwell shown) total instructions in shared programs: 15566733 -> 15566014 (<.01%) instructions in affected programs: 72617 -> 71898 (-0.99%) helped: 302 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.38 x̃: 2 helped stats (rel) min: 0.15% max: 7.69% x̄: 1.28% x̃: 0.98% 95% mean confidence interval for instructions value: -2.55 -2.21 95% mean confidence interval for instructions %-change: -1.40% -1.16% Instructions are helped. total cycles in shared programs: 413014786 -> 413015475 (<.01%) cycles in affected programs: 707594 -> 708283 (0.10%) helped: 227 HURT: 101 helped stats (abs) min: 1 max: 612 x̄: 36.07 x̃: 20 helped stats (rel) min: 0.04% max: 19.39% x̄: 2.25% x̃: 1.49% HURT stats (abs) min: 2 max: 334 x̄: 87.90 x̃: 45 HURT stats (rel) min: 0.07% max: 14.51% x̄: 4.54% x̃: 3.36% 95% mean confidence interval for cycles value: -8.12 12.32 95% mean confidence interval for cycles %-change: -0.67% 0.34% Inconclusive result (value mean confidence interval includes 0). Haswell and Ivy Bridge had similar results. (Haswell shown) total instructions in shared programs: 13828220 -> 13827881 (<.01%) instructions in affected programs: 60887 -> 60548 (-0.56%) helped: 253 HURT: 6 helped stats (abs) min: 1 max: 5 x̄: 1.36 x̃: 1 helped stats (rel) min: 0.16% max: 3.85% x̄: 0.81% x̃: 0.64% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.26% max: 0.89% x̄: 0.47% x̃: 0.27% 95% mean confidence interval for instructions value: -1.39 -1.23 95% mean confidence interval for instructions %-change: -0.85% -0.70% Instructions are helped. 
total cycles in shared programs: 386870095 -> 386894412 (<.01%) cycles in affected programs: 1537307 -> 1561624 (1.58%) helped: 127 HURT: 188 helped stats (abs) min: 1 max: 381 x̄: 17.89 x̃: 4 helped stats (rel) min: 0.02% max: 14.33% x̄: 1.00% x̃: 0.33% HURT stats (abs) min: 2 max: 5585 x̄: 141.43 x̃: 14 HURT stats (rel) min: 0.03% max: 11.50% x̄: 1.65% x̃: 1.06% 95% mean confidence interval for cycles value: 21.95 132.45 95% mean confidence interval for cycles %-change: 0.32% 0.85% Cycles are HURT. Sandy Bridge total instructions in shared programs: 10896339 -> 10896276 (<.01%) instructions in affected programs: 10757 -> 10694 (-0.59%) helped: 49 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.29 x̃: 1 helped stats (rel) min: 0.12% max: 1.85% x̄: 0.87% x̃: 0.89% 95% mean confidence interval for instructions value: -1.42 -1.15 95% mean confidence interval for instructions %-change: -1.03% -0.72% Instructions are helped. total cycles in shared programs: 155091003 -> 155090480 (<.01%) cycles in affected programs: 102761 -> 102238 (-0.51%) helped: 51 HURT: 0 helped stats (abs) min: 1 max: 36 x̄: 10.25 x̃: 4 helped stats (rel) min: 0.02% max: 2.57% x̄: 0.76% x̃: 0.36% 95% mean confidence interval for cycles value: -12.98 -7.53 95% mean confidence interval for cycles %-change: -0.97% -0.56% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8234667 -> 8234652 (<.01%) instructions in affected programs: 2063 -> 2048 (-0.73%) helped: 15 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.30% max: 1.56% x̄: 0.82% x̃: 0.81% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.97% -0.67% Instructions are helped. 
total cycles in shared programs: 188700906 -> 188700598 (<.01%) cycles in affected programs: 283480 -> 283172 (-0.11%) helped: 83 HURT: 3 helped stats (abs) min: 2 max: 8 x̄: 3.78 x̃: 4 helped stats (rel) min: 0.04% max: 0.55% x̄: 0.15% x̃: 0.12% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.04% 95% mean confidence interval for cycles value: -3.87 -3.29 95% mean confidence interval for cycles %-change: -0.16% -0.12% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
Ian Romanick	f3d6df719c	nir/algebraic: Fix some 1-bit Boolean weirdness Skylake, Broadwell, and Haswell had similar results. (Skylake shown) total cycles in shared programs: 372594532 -> 372594460 (<.01%) cycles in affected programs: 46854 -> 46782 (-0.15%) helped: 9 HURT: 0 helped stats (abs) min: 2 max: 22 x̄: 8.00 x̃: 2 helped stats (rel) min: 0.02% max: 0.41% x̄: 0.16% x̃: 0.09% 95% mean confidence interval for cycles value: -14.34 -1.66 95% mean confidence interval for cycles %-change: -0.28% -0.04% Cycles are helped. Ivy Bridge total instructions in shared programs: 12038379 -> 12038373 (<.01%) instructions in affected programs: 1278 -> 1272 (-0.47%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.31% max: 0.77% x̄: 0.54% x̃: 0.55% total cycles in shared programs: 180889027 -> 180888997 (<.01%) cycles in affected programs: 29979 -> 29949 (-0.10%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 16 x̄: 6.00 x̃: 5 helped stats (rel) min: 0.02% max: 0.34% x̄: 0.11% x̃: 0.07% 95% mean confidence interval for cycles value: -13.40 1.40 95% mean confidence interval for cycles %-change: -0.27% 0.05% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total cycles in shared programs: 155091021 -> 155091003 (<.01%) cycles in affected programs: 8842 -> 8824 (-0.20%) helped: 2 HURT: 0 No changes on Iron Lake or GM45. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
Ian Romanick	403aac7500	nir/algebraic: Replace a pattern where iand with a Boolean is used as a bcsel All of the affected shaders are in Mad Max. I noticed this while looking at some other things. I tried a couple similar patterns, but the effect on cycles was generally negative. It may be worth revisiting this later. v2: Rebase on 1-bit Boolean changes. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15282073 -> 15282053 (<.01%) instructions in affected programs: 1192 -> 1172 (-1.68%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.43 x̃: 1 helped stats (rel) min: 1.16% max: 2.17% x̄: 1.65% x̃: 1.39% 95% mean confidence interval for instructions value: -1.73 -1.13 95% mean confidence interval for instructions %-change: -1.91% -1.38% Instructions are helped. total cycles in shared programs: 372595954 -> 372594532 (<.01%) cycles in affected programs: 11477 -> 10055 (-12.39%) helped: 14 HURT: 0 helped stats (abs) min: 76 max: 122 x̄: 101.57 x̃: 104 helped stats (rel) min: 7.76% max: 15.62% x̄: 12.94% x̃: 14.78% 95% mean confidence interval for cycles value: -111.05 -92.09 95% mean confidence interval for cycles %-change: -14.90% -10.98% Cycles are helped. No changes on any Gen6 or earlier platforms. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
Ian Romanick	25bfba3335	nir/algebraic: Recognize open-coded copysign(1.0, a) All of the affected shaders are in Mad Max. The inner part of the pattern is itself an open-coded sign(a). I tried using that as a pattern, but the results were not good. A bunch of shaders were helped for instructions, but overall cycles, spill, and fills were hurt. v2: Rebase on 1-bit Boolean changes. v3: Fix order of copysign() parameters in comments and commit message. Noticed by Matt. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15282141 -> 15282073 (<.01%) instructions in affected programs: 6106 -> 6038 (-1.11%) helped: 17 HURT: 0 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 1.02% max: 2.20% x̄: 1.15% x̃: 1.06% 95% mean confidence interval for instructions value: -4.00 -4.00 95% mean confidence interval for instructions %-change: -1.30% -1.00% Instructions are helped. total cycles in shared programs: 372597886 -> 372595954 (<.01%) cycles in affected programs: 32701 -> 30769 (-5.91%) helped: 17 HURT: 0 helped stats (abs) min: 6 max: 216 x̄: 113.65 x̃: 118 helped stats (rel) min: 0.40% max: 21.86% x̄: 6.20% x̃: 5.83% 95% mean confidence interval for cycles value: -152.84 -74.45 95% mean confidence interval for cycles %-change: -8.89% -3.51% Cycles are helped. No changes on any Gen6 or earlier platforms. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-18 12:37:48 -07:00
Jason Ekstrand	c6463f8ac2	nir: Add a nir_src_as_intrinsic() helper Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 17:12:44 +00:00
Jason Ekstrand	85c35885b3	nir: Rework nir_src_as_alu_instr to not take a pointer Other nir_src_as_* functions just take a nir_src. It's not that much more memory copying and the constness preserving really isn't worth the cognitive dissonance. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 17:12:44 +00:00
Jason Ekstrand	eee994e769	nir: Drop "struct" from some nir_* declarations Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 17:12:44 +00:00
Iago Toral Quiroga	e6ee07a664	compiler/spirv: move the check for Int8 capability So it is right after the checks for the other various Int* capabilities. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 13:23:03 +02:00
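
The xfb_offset aliasing check in commit c81fbb42d9 is described as tracking used components in a bitset instead of a nested loop. A minimal sketch of that idea follows, assuming 4-byte components and a fixed-size bitset. The names `xfb_usage`, `xfb_usage_init`, `xfb_usage_claim` and the limit `MAX_XFB_COMPONENTS` are hypothetical and are not taken from the Mesa sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limit for this sketch: 512 four-byte components per buffer. */
#define MAX_XFB_COMPONENTS 512
#define WORD_BITS 32

/* One bit per 4-byte component of a transform feedback buffer. */
typedef struct {
   uint32_t used[MAX_XFB_COMPONENTS / WORD_BITS];
} xfb_usage;

static void
xfb_usage_init(xfb_usage *u)
{
   memset(u, 0, sizeof(*u));
}

/* Claim the byte range [offset_bytes, offset_bytes + size_bytes).
 * Returns false as soon as a component is found that an earlier
 * variable already claimed, i.e. the two variables alias.
 * Components visited before the conflict stay marked. */
static bool
xfb_usage_claim(xfb_usage *u, unsigned offset_bytes, unsigned size_bytes)
{
   assert(offset_bytes % 4 == 0 && size_bytes % 4 == 0);
   unsigned first = offset_bytes / 4;
   unsigned count = size_bytes / 4;
   assert(first + count <= MAX_XFB_COMPONENTS);

   for (unsigned c = first; c < first + count; c++) {
      uint32_t mask = 1u << (c % WORD_BITS);
      if (u->used[c / WORD_BITS] & mask)
         return false;
      u->used[c / WORD_BITS] |= mask;
   }
   return true;
}
```

With the example from the commit message, claiming 8 bytes at offset 0 for `vec2 a` succeeds, and the later claim of 16 bytes at offset 0 for `vec4 b` is rejected. Each variable costs one pass over its own components rather than a comparison against every earlier variable.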


3628 commits
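
Commit 41593f3c37 describes shifting UINT64_MAX down to the maximum of the destination type so that a saturating add does not assign an out-of-range constant. The sketch below illustrates that idea in plain C under stated assumptions: `uint_max_for_bits` and `uadd_sat_u8` are hypothetical helper names, and this is not the code that nir_opcodes.py generates.

```c
#include <assert.h>
#include <stdint.h>

/* Largest unsigned value representable in bit_size bits (1..64),
 * obtained by shifting UINT64_MAX down. */
static uint64_t
uint_max_for_bits(unsigned bit_size)
{
   assert(bit_size >= 1 && bit_size <= 64);
   return UINT64_MAX >> (64 - bit_size);
}

/* Saturating unsigned add for 8-bit operands: if the wrapped sum is
 * smaller than an operand, the add overflowed, so clamp to the 8-bit
 * maximum instead of assigning UINT64_MAX to a narrower type. */
static uint8_t
uadd_sat_u8(uint8_t a, uint8_t b)
{
   uint8_t sum = (uint8_t)(a + b);
   return sum < a ? (uint8_t)uint_max_for_bits(8) : sum;
}
```

For a 1-bit type the shift yields 1 and for an 8-bit type 255, so the saturated constant always fits the destination and the implicit-conversion warning quoted in the commit message has nothing to flag.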