fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 20:10:14 +01:00

Author	SHA1	Message	Date
Lionel Landwerlin	96c8880900	intel/fs: fix total_scratch computation We only have a single prog_data::total_scratch for all shader variants (SIMD 8, 16, 32). Therefore we should always max the total_scratch on top of existing variant. We probably haven't run into that issue before because we compile by increasing SIMD size and higher SIMD size is more likely to spill. But for bindless shaders with return shaders, if the last return part doesn't spill, we completely ignore the previous parts' scratch computation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15193>	2022-03-02 13:13:03 +00:00
Lionel Landwerlin	57eed6698b	intel/compiler: tracker number of ray queries in prog_data Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:25 +00:00
Jason Ekstrand	4fa58d27a5	intel/fs,vec4: Drop support for shader time Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Jason Ekstrand	8f3c100d61	intel/fs,vec4: Drop uniform compaction and pull constant support The only driver using these was i965 and it's gone now. This is all dead code. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Marcin Ślusarz	d05f7b4a2c	intel: fix INTEL_DEBUG environment variable on 32-bit systems INTEL_DEBUG is defined (since `4015e1876a`) as: #define INTEL_DEBUG __builtin_expect(intel_debug, 0) which unfortunately chops off upper 32 bits from intel_debug on platforms where sizeof(long) != sizeof(uint64_t) because __builtin_expect is defined only for the long type. Fix this by changing the definition of INTEL_DEBUG to be function-like macro with "flags" argument. New definition returns 0 or 1 when any of the flags match. Most of the changes in this commit were generated using: for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c done but it didn't handle all cases and required minor cleanups (like removal of round brackets which were not needed anymore). Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>	2021-10-15 19:55:14 +00:00
Ian Romanick	a9120eccff	intel/compiler: Move type_is_unsigned_int to brw_reg_type.h ...and rename it to brw_reg_type_is_unsigned_integer. It is now next to brw_reg_type_is_floating_point and brw_reg_type_is_integer. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Dave Airlie	8a81d14271	intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5 This is the equivalent of idr's intel/fs: sel.cond writes the flags on Gfx4 and Gfx5 except for the vec4 backend. This fixes buggy rendering seen with crocus on a qt trace. v2 (idr): Trivial whitespace change. Add unit tests. v3: Fix type in comment in unit tests. Noticed by Jason and Priit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Iron Lake total instructions in shared programs: 8183077 -> 8184543 (0.02%) instructions in affected programs: 198990 -> 200456 (0.74%) helped: 0 HURT: 1355 HURT stats (abs) min: 1 max: 8 x̄: 1.08 x̃: 1 HURT stats (rel) min: 0.29% max: 6.00% x̄: 0.99% x̃: 0.70% 95% mean confidence interval for instructions value: 1.04 1.12 95% mean confidence interval for instructions %-change: 0.96% 1.03% Instructions are HURT. total cycles in shared programs: 238967672 -> 238962784 (<.01%) cycles in affected programs: 4666014 -> 4661126 (-0.10%) helped: 406 HURT: 314 helped stats (abs) min: 4 max: 54 x̄: 22.46 x̃: 18 helped stats (rel) min: <.01% max: 12.80% x̄: 1.82% x̃: 0.65% HURT stats (abs) min: 2 max: 112 x̄: 13.48 x̃: 12 HURT stats (rel) min: <.01% max: 7.82% x̄: 0.81% x̃: 0.16% 95% mean confidence interval for cycles value: -8.60 -4.98 95% mean confidence interval for cycles %-change: -0.87% -0.49% Cycles are helped. GM45 total instructions in shared programs: 4986888 -> 4988354 (0.03%) instructions in affected programs: 198990 -> 200456 (0.74%) helped: 0 HURT: 1355 HURT stats (abs) min: 1 max: 8 x̄: 1.08 x̃: 1 HURT stats (rel) min: 0.29% max: 6.00% x̄: 0.99% x̃: 0.70% 95% mean confidence interval for instructions value: 1.04 1.12 95% mean confidence interval for instructions %-change: 0.96% 1.03% Instructions are HURT. total cycles in shared programs: 153577826 -> 153572938 (<.01%) cycles in affected programs: 4666014 -> 4661126 (-0.10%) helped: 406 HURT: 314 helped stats (abs) min: 4 max: 54 x̄: 22.46 x̃: 18 helped stats (rel) min: <.01% max: 12.80% x̄: 1.82% x̃: 0.65% HURT stats (abs) min: 2 max: 112 x̄: 13.48 x̃: 12 HURT stats (rel) min: <.01% max: 7.82% x̄: 0.81% x̃: 0.16% 95% mean confidence interval for cycles value: -8.60 -4.98 95% mean confidence interval for cycles %-change: -0.87% -0.49% Cycles are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12191>	2021-08-11 13:09:32 -07:00
Ian Romanick	5ffbee84a4	intel/compiler: Add id parameter to shader_perf_log callback There are two problems with the current architecture. In OpenGL, the id is supposed to be a unique identifier for a particular log source. This is done so that applications can (theoretically) filter particular log messages. The debug callback infrastructure in Mesa assigns a unique value when a value of 0 is passed in. This causes the id to get set once to a unique value for each message. By passing a stack variable that is initialized to 0 on every call, every time the same message is logged, it will have a different id. This isn't great, but it's also not catastrophic. When threaded shader compiles are used, the id pointer is saved and dereferenced at a possibly much later time on a possibly different thread. This causes one thread to access the stack from a different thread... and that stack frame might not be valid any more. :( I have not observed any crashes related to this particular issue. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12136>	2021-08-01 23:58:08 +00:00
Dave Airlie	8da92b5c0a	intel/compiler: add flag to indicate edge flags vertex input is last 965 and the mesa st disagree on how vertex elements are ordered when edgeflags are involved. 965 wants them in gl_vert_attrib order, but gallium supplies the edgeflag as the last vertex element regardless. This adds a flag which is enabled for gen4/5 to denote that the edgeflag is at the end. When we reap 965 later we can resolve this better. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11146>	2021-06-14 06:05:18 +10:00
Jason Ekstrand	e23b55c3f0	i965: Use nir_lower_passthrough_edgeflags Now that there's a common NIR pass, there's no point in us doing this in the back-end anymore. In order to use this pass in i965, we do have to make one tiny change. Gallium runs the pass after assigning input and output locations and so needs the pass to respect those locations and num_inputs. i965, however, runs it before any location assignment or I/O lowering so we don't care. We do, however, need the pass to succeed with num_inputs == 0 because we set that later. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11313>	2021-06-11 21:19:06 +00:00
Jason Ekstrand	ebba3cad81	intel/vec4: Add support for UBO pushing Shader-db results on Haswell (vec4 only): total instructions in shared programs: 2853928 -> 2726576 (-4.46%) instructions in affected programs: 855840 -> 728488 (-14.88%) helped: 9500 HURT: 18 helped stats (abs) min: 1 max: 359 x̄: 13.54 x̃: 11 helped stats (rel) min: 0.44% max: 53.33% x̄: 19.13% x̃: 17.44% HURT stats (abs) min: 4 max: 124 x̄: 71.00 x̃: 92 HURT stats (rel) min: 3.64% max: 77.86% x̄: 46.43% x̃: 52.12% 95% mean confidence interval for instructions value: -13.78 -12.98 95% mean confidence interval for instructions %-change: -19.21% -18.81% Instructions are helped. total cycles in shared programs: 101822616 -> 60245580 (-40.83%) cycles in affected programs: 93312382 -> 51735346 (-44.56%) helped: 13292 HURT: 4506 helped stats (abs) min: 2 max: 1229260 x̄: 3370.82 x̃: 776 helped stats (rel) min: 0.04% max: 96.70% x̄: 47.56% x̃: 43.76% HURT stats (abs) min: 2 max: 17644 x̄: 716.37 x̃: 82 HURT stats (rel) min: 0.02% max: 491.80% x̄: 41.00% x̃: 11.11% 95% mean confidence interval for cycles value: -3037.07 -1635.03 95% mean confidence interval for cycles %-change: -26.03% -24.25% Cycles are helped. total spills in shared programs: 1080 -> 1314 (21.67%) spills in affected programs: 74 -> 308 (316.22%) helped: 0 HURT: 47 total fills in shared programs: 310 -> 497 (60.32%) fills in affected programs: 71 -> 258 (263.38%) helped: 0 HURT: 47 total sends in shared programs: 239884 -> 151799 (-36.72%) sends in affected programs: 129302 -> 41217 (-68.12%) helped: 9547 HURT: 0 helped stats (abs) min: 1 max: 226 x̄: 9.23 x̃: 8 helped stats (rel) min: 3.12% max: 98.15% x̄: 72.38% x̃: 80.00% 95% mean confidence interval for sends value: -9.48 -8.98 95% mean confidence interval for sends %-change: -72.80% -71.97% Sends are helped. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Jason Ekstrand	89fd196f6b	intel/vec4: Add support for masking pushed data This is the vec4 equivalent of `d0d039a4d3`, required for proper UBO pushing in vertex stages for Vulkan on HSW. Sadly, the implementation requires us to do everything in ALIGN1 mode and the vec4 instruction scheduler doesn't understand HW_GRF <-> UNIFORM interference so it's easier to do the whole thing in the generator. We add an instruction to the top of the program which just means "emit the blob" and all the magic happens in codegen. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Jason Ekstrand	a881f2295f	intel/vec4: Set up push ranges before we emit any code In order to avoid switching pull constants to push constants and then having to go back to pull, compute the push ranges up-front. This way we know by the time we emit code exactly what ranges are pushable. This is a bit inefficient in the case where the "normal" push constants get compacted. However, most apps don't use giant piles of dead uniforms combined with substantial UBO use so this should be ok. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Jason Ekstrand	c35501ffe8	intel/vec4: Update nr_params in pack_uniform_registers This is where we re-arrange and re-pack the params. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Jason Ekstrand	3d1ac996d0	intel/vec4: Add some asserts to move_push_to_pull Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>	2021-05-19 14:38:13 +00:00
Marcin Ślusarz	3340d5ee02	intel: simplify is_haswell checks, part 1 Generated with: files=`git grep is_haswell | cut -d: -f1 | sort | uniq` for file in $files; do cat $file | \ sed "s/devinfo->ver <= 7 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \ sed "s/devinfo->ver >= 8 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \ sed "s/devinfo->is_haswell || devinfo->ver >= 8/devinfo->verx10 >= 75/g" | \ sed "s/devinfo.is_haswell || devinfo.ver >= 8/devinfo.verx10 >= 75/g" | \ sed "s/devinfo->ver > 7 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \ sed "s/devinfo->ver == 7 && !devinfo->is_haswell/devinfo->verx10 == 70/g" | \ sed "s/devinfo.ver == 7 && !devinfo.is_haswell/devinfo.verx10 == 70/g" | \ sed "s/devinfo->ver < 8 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \ sed "s/device->info.ver == 7 && !device->info.is_haswell/device->info.verx10 == 70/g" \ > tmpXXX mv tmpXXX $file done Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10810>	2021-05-17 09:46:45 +00:00
Anuj Phogat	61e8636557	intel: Rename gen_device prefix to intel_device export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>	2021-04-20 20:06:33 +00:00
Anuj Phogat	926d343acf	intel: Rename files with gen_debug prefix export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" find $SEARCH_PATH -type f -name "gen_debug.[cph]" -exec sh -c 'f="{}"; mv -- "$f" "${f/gen_debug/intel_debug}"' \; grep -E "gen_debug" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_debug\./intel_debug\./g" grep -E "GEN_DEBUG" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN_DEBUG_H/INTEL_DEBUG_H/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>	2021-04-20 20:06:33 +00:00
Michel Dänzer	2928c21eb7	Convert most remaining free-form fall-through comments to FALLTHROUGH One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is explicitly disabled. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Iván Briano	8328989130	intel, anv: propagate robustness setting to nir_opt_load_store_vectorize Closes #4309 Fixes dEQP-VK-robustness.robustness2..readonly. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10147>	2021-04-13 13:30:09 -07:00
Anuj Phogat	e7e55af4d6	intel: Rename GENx keyword to GFXx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "GEN[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN\([[:digit:]]\+\)/GFX\1/g" Exclude the changes to modifiers: grep -E "I915_.*GFX" -rIl $SEARCH_PATH | xargs sed -ie "s/\(I915_.*\)GFX/\1GEN/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	1d296484b4	intel: Rename Genx keyword to Gfxx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g" Exclude changes in src/intel/perf/oa-*.xml: find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	b75f095bc7	intel: Rename genx keyword to gfxx in source files Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/gen\([[:digit:]]\+\)/gfx\1/g" Exclude pack.h and xml changes in this patch: grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+_pack\.h\)/gen\1/g" grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+\.xml\)/gen\1/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	abe9a71a09	intel: Rename gen field in gen_device_info struct to ver Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Caio Marcelo de Oliveira Filho	05933fb0f7	intel/compiler: Use INTEL_DEBUG=blorp to dump blorp shaders Make INTEL_DEBUG=blorp dump the blorp shaders instead using the general INTEL_DEBUG=fs,vs, which is now reserved to the actual FS and VS shaders used by the pipeline. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>	2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho	7fb1e58651	intel/compiler: Make visitors take debug_enabled as a parameter The callers already have this value, and we would like to make it follow different rules other than stage that might not be visible to the helper function, so just pass explicitly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>	2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho	758eb18c6f	intel/compiler: Make vec4 generator take debug_enabled as a parameter The callers already have this value, and we would like to make it follow different rules other than stage that might not be visible to the helper function, so just pass explicitly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>	2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho	244d2daa00	intel/compiler: Make brw_postprocess_nir take debug_enabled as a parameter The callers already have this value, and we would like to make it follow different rules other than stage that might not be visible to the helper function, so just pass explicitly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>	2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho	82d77f0ea8	intel/compiler: Refactor the shader INTEL_DEBUG checks Make the check once in a variable, that can be reused for other parts. Also add `unlikely` to the various conditionals depending on it Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>	2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho	57d664245e	intel/compiler: Use a struct for brw_compile_vs parameters Makes calling code more explicit about what is being set, and allows take advantage of zero initialization for the ones the callsite don't care. Besides moving to the struct, two extra "ergonomic" changes were done: - Add a new shader_time boolean, so shader_time_index is ignored when unused -- this allow taking advantage of the zero initialization of unset fields. - Since we have a struct, provide space for the error_str pointer. Both iris and i965 were using it, and the extra rstrdup in case of failure shouldn't be a burden for the others. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>	2021-03-24 23:18:46 +00:00
Ian Romanick	3c31364f5e	intel/compiler: Use CMPN for min / max on Gen4 and Gen5 On Intel platforms before Gen6, there is no min or max instruction. Instead, a comparison instruction (*more on this below) and a SEL instruction are used. Per other IEEE rules, the regular comparison instruction, CMP, will always return false if either source is NaN. A sequence like cmp.l.f0.0(16) null<1>F g30<8,8,1>F g22<8,8,1>F (+f0.0) sel(16) g8<1>F g30<8,8,1>F g22<8,8,1>F will generate the wrong result for min if g22 is NaN. The CMP will return false, and the SEL will pick g22. To account for this, the hardware has a special comparison instruction CMPN. This instruction behaves just like CMP, except if the second source is NaN, it will return true. The intention is to use it for min and max. This sequence will always generate the correct result: cmpn.l.f0.0(16) null<1>F g30<8,8,1>F g22<8,8,1>F (+f0.0) sel(16) g8<1>F g30<8,8,1>F g22<8,8,1>F The problem is... for whatever reason, we don't emit CMPN. There was even a comment in lower_minmax that calls out this very issue! The bug is actually older than the "Fixes" below even implies. That's just when the comment was added. That we know of, we never observed a failure until #4254. If src1 is known to be a number, either because it's not float or it's an immediate number, use CMP. This allows cmod propagation to still do its thing. Without this slight optimization, about 8,300 shaders from shader-db are hurt on Iron Lake. Fixes the following piglit tests (from piglit!475): tests/spec/glsl-1.20/execution/fs-nan-builtin-max.shader_test tests/spec/glsl-1.20/execution/fs-nan-builtin-min.shader_test tests/spec/glsl-1.20/execution/vs-nan-builtin-max.shader_test tests/spec/glsl-1.20/execution/vs-nan-builtin-min.shader_test Closes: #4254 Fixes: `2f2c00c727` ("i965: Lower min/max after optimization on Gen4/5.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8115134 -> 8115135 (<.01%) instructions in affected programs: 229 -> 230 (0.44%) helped: 0 HURT: 1 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027>	2021-02-17 19:52:24 +00:00
Caio Marcelo de Oliveira Filho	9da54b9252	intel/compiler: Use gl_varying_slot_name_for_stage() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8998>	2021-02-13 00:44:53 +00:00
Caio Marcelo de Oliveira Filho	9f3d5e99ea	compiler: Use util/bitset.h for system_values_read It is currently a bitset on top of a uint64_t but there are already more than 64 values. Change to use BITSET to cover all the SYSTEM_VALUE_MAX bits. Cc: mesa-stable Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585>	2021-01-26 20:20:47 +00:00
Caio Marcelo de Oliveira Filho	b3daf341d4	intel/fs: Add assert on the brw_STAGE_prog_data downcasts Motivation is to detect earlier certain bugs that can occur when missing a check for the stage before using the downcast. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7540>	2020-11-16 12:40:59 -09:00
Ian Romanick	262ca98b3a	intel/compiler: Remove Gen10-specific code Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899>	2020-10-15 09:29:53 -07:00
Marcin Ślusarz	9c25689287	intel: drop likely/unlikely around INTEL_DEBUG It's included in declaration of INTEL_DEBUG. Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>	2020-10-06 18:43:07 +00:00
Ian Romanick	1d71b1a311	intel/vec4: Remove everything related to VS_OPCODE_SET_SIMD4X2_HEADER_GEN9 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>	2020-09-28 11:43:10 -07:00
Ian Romanick	2a49007411	intel/vec4: Remove all support for Gen8+ [v2] v2: Restore the gen == 10 hunk in brw_compile_vs (around line 2940). This function is also used for scalar VS compiles. Squash in: intel/vec4: Reindent after removing Gen8+ support intel/vec4: Silence unused parameter warning in try_immediate_source Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>	2020-09-28 11:43:10 -07:00
Marcin Ślusarz	d4c6e3f196	intel/compiler: use the same name for nir shaders in brw_compile_* functions Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6602>	2020-09-04 17:38:25 +00:00
Jason Ekstrand	90b6745bc8	intel/fs,vec4: Stuff the constant data from NIR in the end of the program Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>	2020-09-02 19:48:44 +00:00
Francisco Jerez	6579f562c3	intel/ir: Use brw::performance object instead of CFG cycle counts for codegen stats. These should be more accurate than the current cycle counts, since among other things they consider the effect of post-scheduling passes like the software scoreboard on TGL. In addition it will enable us to clean up some of the now redundant cycle-count estimation functionality in the instruction scheduler. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-04-28 23:01:27 -07:00
Matt Turner	bb3e7b0fe3	intel/compiler: Pass shader_stats for each SIMD mode Passing shader_stats to the fs_generator constructor means that the SIMD8 shader stats from the visitor (such as the scheduler mode) will be reported out for the SIMD16/SIMD32 versions as well. As you can see, we are now passing 'shader_stats' and 'stats' to generate_code(), which is obviously odd looking. Ian rebased and committed an old patch of mine which added the shader_stats struct on July 30 in commit `dabb5d4bee` (i965/fs: Add a shader_stats struct.) and shortly after on August 12 Jason added the brw_compile_stats struct in commit `134607760a` (intel/compiler: Fill a compiler statistics struct). I'd like to combine the two, but I'm not sure how. shader_stats is an input to generate_code() while brw_compile_stats is an output and is only used by the Vulkan driver. Leave it as is for now... Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4093>	2020-03-09 04:44:12 +00:00
Matt Turner	75a33e268e	intel/compiler: Mark some methods and parameters const Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4093>	2020-03-09 04:44:11 +00:00
Matt Turner	3d0821a216	intel/vec4: Make implied_mrf_writes() a vec4_instruction method Same as commit `c20dc9b836` (intel/fs: Make implied_mrf_writes() an fs_inst method.) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4093>	2020-03-09 04:44:11 +00:00
Francisco Jerez	acf24df201	intel/compiler/vec4: Switch liveness analysis to IR analysis framework This involves wrapping vec4_live_variables in a BRW_ANALYSIS object and hooking it up to invalidate_analysis() so it's properly invalidated. Seems like a lot of churn but it's fairly straightforward. The vec4_visitor invalidate_ and calculate_live_intervals() methods are no longer necessary after this change. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:20:59 -08:00
Francisco Jerez	48dfb30f92	intel/compiler: Move all live interval analysis results into vec4_live_variables This moves the following methods that are currently defined in vec4_visitor (even though they are side products of the liveness analysis computation) and are already implemented in brw_vec4_live_variables.cpp: > int var_range_start(unsigned v, unsigned n) const; > int var_range_end(unsigned v, unsigned n) const; > bool virtual_grf_interferes(int a, int b) const; > int virtual_grf_start; > int virtual_grf_end; It makes sense for them to be part of the vec4_live_variables object, because they have the same lifetime as other liveness analysis results and because this will allow some extra validation to happen wherever they are accessed in order to make sure that we only ever use up-to-date liveness analysis results. The naming of the virtual_grf_start/end arrays was rather misleading, they were indexed by variable rather than by vgrf, this renames them start/end to match the FS liveness analysis pass. The churn in the definition of var_range_start/end is just in order to avoid a collision between the start/end arrays and local variables declared with the same name. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:20:44 -08:00
Francisco Jerez	ab6d792986	intel/compiler: Pass detailed dependency classes to invalidate_analysis() Have fun reading through the whole back-end optimizer to verify whether I've missed any dependency flags -- Or alternatively, just trust that any mistake here will trigger an assertion failure during analysis pass validation if it ever poses a problem for the consistency of any of the analysis passes managed by the framework. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:20:39 -08:00
Francisco Jerez	d966a6b4c4	intel/compiler: Introduce backend_shader method to propagate IR changes to analysis passes The invalidate_analysis() method knows what analysis passes there are in the back-end and calls their invalidate() method to report changes in the IR. For the moment it just calls invalidate_live_intervals() (which will eventually be fully replaced by this function) if anything changed. This makes all optimization passes invalidate DEPENDENCY_EVERYTHING, which is clearly far from ideal -- The dependency classes passed to invalidate_analysis() will be refined in a future commit. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:20:32 -08:00
Francisco Jerez	27ae3c1f68	intel/compiler: Reverse inclusion dependency between brw_vec4_live_variables.h and brw_vec4.h brw_vec4.h (in particular vec4_visitor) is logically a user of the live variables analysis pass, not the other way around. brw_vec4_live_variables.h requires the definition of some VEC4 IR data structures to compile, but those can be obtained directly from brw_ir_vec4.h without including brw_vec4.h. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:20:28 -08:00
Jason Ekstrand	d1c4e64a69	intel/compiler: Add a flag to avoid compacting push constants In vec4, we can just not run the pass. In fs, things are a bit more deeply intertwined. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
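
Note on commit 3c31364f5e (CMPN for min / max): the NaN behaviour that message describes can be modelled on the host in a few lines of C. This is only a sketch of the semantics as stated in the commit message, not Mesa code; the function names (cmp_less, cmpn_less, min_with_cmp, min_with_cmpn) are hypothetical and exist only for this illustration. The SEL step is modelled as "pick the first source when the flag is set, otherwise the second".

```c
#include <math.h>
#include <stdbool.h>

/* Model of CMP with the .l condition: false whenever either source is NaN,
 * which is what an ordinary IEEE "less than" comparison gives. */
static bool cmp_less(float src0, float src1)
{
   return src0 < src1;
}

/* Model of CMPN with the .l condition, per the commit message: behaves like
 * CMP, except that it returns true if the second source is NaN. */
static bool cmpn_less(float src0, float src1)
{
   return isnan(src1) || src0 < src1;
}

/* min lowered as CMP + SEL: a NaN in src1 makes the flag false, so SEL
 * picks src1 and the NaN leaks into the result. */
static float min_with_cmp(float src0, float src1)
{
   return cmp_less(src0, src1) ? src0 : src1;
}

/* min lowered as CMPN + SEL: a NaN in src1 sets the flag, so SEL picks
 * src0. A NaN in src0 leaves the flag clear, so SEL picks src1. */
static float min_with_cmpn(float src0, float src1)
{
   return cmpn_less(src0, src1) ? src0 : src1;
}
```

With these definitions, min_with_cmp(1.0f, NAN) yields NaN while min_with_cmpn(1.0f, NAN) yields 1.0f, and both agree whenever neither input is NaN. That also matches the optimization in the message: when the second source is known to be a number, plain CMP gives the same answer.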


116 commits