fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 04:48:07 +02:00

Author	SHA1	Message	Date
Italo Nicola	59623f211b	intel/compiler: remove old comment This comment was correct some time ago, but since commit `d3c10ad427`, it isn't true anymore. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-11-18 10:20:34 -08:00
Danylo Piliaiev	0904ee0c60	intel/fs: Do not lower large local arrays to scratch on gen7 On gen7 and earlier the scratch space size is limited to 12kB. By enabling this optimization we may easily exceed this limit without having any fallback. arb_compute_shader/linker/bug-93840.shader_test crashes with this lowering on IVB due to exceeding scratch size limit. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2092 Fixes: `69244fc7` Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-14 20:08:30 +00:00
Paulo Zanoni	eb6352162d	intel/compiler: fix nir_op_{i,u}*32 on ICL On ICL we have the src1 restriction which is applied through fix_byte_src() and potentially changes the type of the operands from 8 to 32 bits. When this change happens, we fall into the "else if (bit_size < 32)" case and miscompute src_type because it takes into consideration bit_size (8) instead of the adjusted size of temp_op (32). This results in the shader reading unused memory, giving us mostly failures, but occasional passes due to whatever was already in the registers we were reading. This commit fixes a lot of dEQP subgroup i8vec2 tests on ICL, such as: dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2 This can also be verified by simply changing fix_byte_src() to apply on all platforms. Fixes: `5847de6e9a` ("intel/compiler: don't use byte operands for src1 on ICL") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-11-13 22:13:52 +00:00
Jason Ekstrand	69244fc72a	intel/fs: Lower large local arrays to scratch Shader-db results on Kaby Lake: total instructions in shared programs: 14929212 -> 14880028 (-0.33%) instructions in affected programs: 72428 -> 23244 (-67.91%) helped: 6 HURT: 2 helped stats (abs) min: 2165 max: 15981 x̄: 8590.00 x̃: 7624 helped stats (rel) min: 56.06% max: 74.52% x̄: 67.55% x̃: 72.08% HURT stats (abs) min: 1178 max: 1178 x̄: 1178.00 x̃: 1178 HURT stats (rel) min: 350.60% max: 361.35% x̄: 355.97% x̃: 355.97% 95% mean confidence interval for instructions value: -11947.03 -348.97 95% mean confidence interval for instructions %-change: -125.72% 202.37% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 368585300 -> 342557344 (-7.06%) cycles in affected programs: 28144921 -> 2116965 (-92.48%) helped: 6 HURT: 2 helped stats (abs) min: 1404978 max: 7766106 x̄: 4353922.00 x̃: 3890682 helped stats (rel) min: 82.01% max: 95.57% x̄: 89.95% x̃: 92.28% HURT stats (abs) min: 47778 max: 47798 x̄: 47788.00 x̃: 47788 HURT stats (rel) min: 278.20% max: 282.98% x̄: 280.59% x̃: 280.59% 95% mean confidence interval for cycles value: -5900438.73 -606550.27 95% mean confidence interval for cycles %-change: -140.79% 146.16% Inconclusive result (%-change mean confidence interval includes 0). total spills in shared programs: 9243 -> 8901 (-3.70%) spills in affected programs: 2718 -> 2376 (-12.58%) helped: 4 HURT: 4 total fills in shared programs: 21831 -> 10141 (-53.55%) fills in affected programs: 11804 -> 114 (-99.03%) helped: 6 HURT: 2 total sends in shared programs: 815912 -> 815912 (0.00%) sends in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 1 GAINED: 3 The helped shaders are all compute shaders in Aztec Ruins. There is also a compute shader in synmark2 OglCSDof that's helped but it doesn't show up in above shader-db results because it went from SIMD8 to SIMD16. 
That shader improves enough to yield a 15-20% performance boost to the benchmark as a whole on my KBL laptop. The hurt shaders are a couple of shaders in Kerbal Space Program and a couple in Aztec Ruins. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	53bfcdeecf	intel/fs: Implement the new load/store_scratch intrinsics This commit fills in a number of different pieces: 1. We add support to brw_nir_lower_mem_access_bit_sizes to handle the new intrinsics. This involves simple plumbing work as well as a tiny bit of extra logic to always scalarize scratch intrinsics 2. Add code to brw_fs_nir.cpp to turn nir_load/store_scratch intrinsics into byte/dword scattered read/write messages which use the A32 stateless model. 3. Add code to lower_surface_logical_send to handle dword scattered messages and the A32 stateless model. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	e2297699de	intel/nir: Plumb devinfo through lower_mem_access_bit_sizes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	1dff48af05	intel/fs: refactor surface header setup Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	a0999bc049	intel/fs: Add DWord scattered read/write opcodes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	83f04d80b0	intel/nir: Use nir_extract_bits in lower_mem_access_bit_sizes The new helper solves most of the annoying problems with data wrangling in brw_nir_lower_mem_access_bit_sizes. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Paulo Zanoni	b57383a944	intel/compiler: remove the operand restriction for src1 on GLK Commit `5847de6e9a` implemented a restriction that applies to ICL, but wrongly marked it as also applying to GLK. Reviewers of MR !1125 pointed this out, and the commit history shows removal of GLK from parts of the patch, but it turns out there was still a left-over GLK check in the code. This code was breaking some of the i8vec2 tests on GLK, for example: dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2 Removing the GLK check solves the issue for GLK. I don't see a reason why implementing this restriction would actually break GLK, so there's still more to investigate here since this bug may be affecting ICL+, but let's apply the real GLK fix while we analyze and discuss the other possible issues. Fixes: `5847de6e9a` ("intel/compiler: don't use byte operands for src1 on ICL") BSpec: 3017 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-11-05 00:08:34 +00:00
Ian Romanick	7b3f38ef69	intel/compiler: Report the number of non-spill/fill SEND messages on vec4 too This makes shader-db's report.py work on Haswell and earlier platforms. The problem is that the script would detect the "sends" output for scalar shaders and expect it in vec4 shaders too. When it didn't find it, the script would fail with: Traceback (most recent call last): File "./report.py", line 351, in <module> main() File "./report.py", line 182, in main before_count = before[p][m] KeyError: 'sends' Fixes: `f192741ddd` ("intel/compiler: Report the number of non-spill/fill SEND messages") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 21:27:03 -07:00
Jordan Justen	2b186264cc	intel/eu/validate/gen12: Add TGL to eu_validate tests. These reworks were combined into this patch: * Matt Turner: i965: Disable NoDDChk/NoDDClr test on Gen12+ * Francisco Jerez: intel/eu/validate/gen12: Disable qword_low_power_no_depctrl eu_validate test. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 14:08:51 -07:00
Matt Turner	12d3b11908	intel/compiler: Add instruction compaction support on Gen12 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-30 11:11:50 -07:00
Matt Turner	c8fbc8823f	intel/compiler: Make separate src0/src1 index tables TGL uses different data (and even a different format!) for each source. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-30 11:11:50 -07:00
Matt Turner	cde73625f8	intel/compiler: Inline get_src_index() TGL will have separate tables for src0 and src1, so the shared function will no longer make sense. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-30 11:11:50 -07:00
Matt Turner	d0eff8a539	intel/compiler: Restructure instruction compaction in preparation for Gen12 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-30 11:11:50 -07:00
Matt Turner	ded9fb2b18	intel/compiler: Remove unreachable() from brw_reg_type.c The EU compaction unit test fuzzes the compaction code by flipping bits. We use a simple skip_bits() function with a list of reserved bits to ignore, but for more complex cases like invalid combinations of register file:type, we need either machinery to check validity or for these functions to simply inform us whether a combination was valid. enum brw_reg_type is a 4-bit field in brw_reg, so rather than expanding it with an "INVALID" value, just return -1 and let the caller check for that. Scott suggested redefining unreachable() within the unit test to longjmp() which would allow driver code like this to still use it and allow the test to handle expected failures like this. If that plan works out, I plan to revert this.	2019-10-30 11:11:50 -07:00
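The "return -1 and let the caller check" approach described in the commit above can be sketched roughly as follows. This is only an illustration: the enums, the table contents and the function name `example_type_to_hw` are hypothetical and are not taken from brw_reg_type.c.

```c
#include <stdio.h>

/* Hypothetical register files and types, for illustration only. */
enum example_file { EXAMPLE_FILE_GRF, EXAMPLE_FILE_IMM, EXAMPLE_FILE_COUNT };
enum example_type { EXAMPLE_TYPE_F, EXAMPLE_TYPE_B, EXAMPLE_TYPE_COUNT };

/*
 * Translate a (file, type) pair into a made-up hardware encoding.
 * Instead of calling unreachable() on an invalid combination, return -1
 * so that a fuzzing test can detect the invalid case and carry on.
 */
static int example_type_to_hw(enum example_file file, enum example_type type)
{
   /* Encodings are invented; -1 marks a combination with no encoding. */
   static const int table[EXAMPLE_FILE_COUNT][EXAMPLE_TYPE_COUNT] = {
      [EXAMPLE_FILE_GRF] = { [EXAMPLE_TYPE_F] = 7, [EXAMPLE_TYPE_B] = 4 },
      [EXAMPLE_FILE_IMM] = { [EXAMPLE_TYPE_F] = 7, [EXAMPLE_TYPE_B] = -1 },
   };

   if ((unsigned)file >= EXAMPLE_FILE_COUNT ||
       (unsigned)type >= EXAMPLE_TYPE_COUNT)
      return -1;

   return table[file][type];
}

/* Caller side: check for -1 rather than relying on an assertion. */
static void example_report(enum example_file file, enum example_type type)
{
   int hw = example_type_to_hw(file, type);
   if (hw < 0)
      printf("invalid file:type combination\n");
   else
      printf("hw encoding %d\n", hw);
}
```

The trade-off is that every caller has to handle the -1 case explicitly, which is presumably why the commit message mentions reverting to unreachable() if the longjmp() idea works out.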
Jason Ekstrand	24c0545b2d	intel/vec4: Set brw_stage_prog_data::has_ubo_pull In `0e4a75f917`, Ken added a flag brw_stage_prog_data which indicates whether any UBO pulls ever occur. Unfortunately, he neglected to set the bit in the vec4 back-end. This was fine at the time because the optimization was intended for iris which does not support gen7 and using the vec4 back-end on Gen8+ requires an environment variable. We want to use this in Vulkan which does support Gen7 so we want the information from the vec4 back-end as well as scalar. Fixes: `0e4a75f917` "intel/compiler: Record whether any pull constant..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 16:05:57 +00:00
Timothy Arceri	7f106a2b5d	util: rename list_empty() to list_is_empty() This makes it clear that it's a boolean test and not an action (eg. "empty the list"). Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Danylo Piliaiev	12a8f2616a	intel/compiler: Fix C++ one definition rule violations When building with "-flto" brw::block_data definitions were colliding. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-28 12:02:40 +02:00
Caio Marcelo de Oliveira Filho	e142061399	intel/fs: Implement scoped_memory_barrier Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-24 11:39:56 -07:00
Michel Dänzer	2b1b56cb3a	intel/fs: Check for NULL key in fs_visitor constructor Flagged by UBSan: ../src/intel/compiler/brw_fs_visitor.cpp:986:20: runtime error: member access within null pointer of type 'const struct brw_base_prog_key' #0 0x559fadb48556 in fs_visitor::init() ../src/intel/compiler/brw_fs_visitor.cpp:986 #1 0x559fadb46db3 in fs_visitor::fs_visitor(brw_compiler const, void, void, brw_base_prog_key const, brw_stage_prog_data, nir_shader const, unsigned int, int, brw_vue_map const) ../src/intel/compiler/brw_fs_visitor.cpp:962 #2 0x559fad9c7cd8 in saturate_propagation_fs_visitor::saturate_propagation_fs_visitor(brw_compiler, brw_wm_prog_data, nir_shader) (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x61bcd8) #3 0x559fad9960a1 in saturate_propagation_test::SetUp() ../src/intel/compiler/test_fs_saturate_propagation.cpp:65 #4 0x559fadd7a32d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #5 0x559fadd65c3b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #6 0x559fadd0af75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2470 #7 0x559fadd0d8a4 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #8 0x559fadd10032 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #9 0x559fadd2ba0c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #10 0x559fadd7df46 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #11 0x559fadd69613 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool 
(testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #12 0x559fadd2302e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #13 0x559fadda2d61 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #14 0x559fadda2c21 in main ../src/gtest/src/gtest_main.cc:37 #15 0x7fe8f6748bba in __libc_start_main ../csu/libc-start.c:308 #16 0x559fad9950f9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x5e90f9) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:20:04 +02:00
Michel Dänzer	41623be20e	intel/compiler: Cast to target type before shifting left Otherwise a smaller type may be promoted to int, which can hit undefined behaviour: ../src/intel/compiler/brw_packed_float.c:66:17: runtime error: left shift of 128 by 24 places cannot be represented in type 'int' #0 0x5604a03969aa in brw_vf_to_float ../src/intel/compiler/brw_packed_float.c:66 #1 0x5604a0391305 in vf_float_conversion_test_test_vf_to_float_Test::TestBody() ../src/intel/compiler/test_vf_float_conversions.cpp:70 #2 0x5604a041a323 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #3 0x5604a0405c31 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #4 0x5604a03ab03b in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #5 0x5604a03ad714 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #6 0x5604a03afea2 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #7 0x5604a03cb87c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #8 0x5604a041df3c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #9 0x5604a0409609 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #10 0x5604a03c2e9e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #11 0x5604a0442d57 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #12 0x5604a0442c17 in main ../src/gtest/src/gtest_main.cc:37 #13 0x7f9a1983dbba in __libc_start_main ../csu/libc-start.c:308 #14 0x5604a0390d89 in 
_start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/vf_float_conversions+0x8dd89) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:19:23 +02:00
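The class of problem fixed in the commit above can be sketched in isolation. A value of a type narrower than int is promoted to int before the shift, so shifting 128 left by 24 places overflows a 32-bit signed int, which is undefined behaviour. The function name `byte_to_top_bits` is hypothetical and this is not the actual brw_packed_float.c code.

```c
#include <stdint.h>

/*
 * Move an 8-bit value into the top byte of a 32-bit word.
 *
 * Writing "value << 24" directly would promote value to int first, and
 * for value >= 0x80 the result cannot be represented in a 32-bit int.
 * Casting to the unsigned target type before shifting keeps the whole
 * operation in unsigned arithmetic, where it is well defined.
 */
static uint32_t byte_to_top_bits(uint8_t value)
{
   return (uint32_t)value << 24;
}
```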
Michel Dänzer	59b72bdfb4	intel/compiler: Don't left-shift by >= the number of bits of the type To avoid it, use the modulo of the number of bits in the value being shifted, which is presumably what ended up happening on x86. Flagged by UBSan: ../src/intel/compiler/brw_eu_validate.c:974:33: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int' #0 0x561abb612ab3 in general_restrictions_on_region_parameters ../src/intel/compiler/brw_eu_validate.c:974 #1 0x561abb617574 in brw_validate_instructions ../src/intel/compiler/brw_eu_validate.c:1851 #2 0x561abb53bd31 in validate ../src/intel/compiler/test_eu_validate.cpp:106 #3 0x561abb555369 in validation_test_source_cannot_span_more_than_2_registers_Test::TestBody() ../src/intel/compiler/test_eu_validate.cpp:486 #4 0x561abb742651 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #5 0x561abb72e64d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #6 0x561abb6d5451 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #7 0x561abb6d7b2a in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #8 0x561abb6da2b8 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #9 0x561abb6f5c92 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #10 0x561abb74626a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #11 0x561abb732025 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #12 0x561abb6ed2b4 in 
testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #13 0x561abb768b3b in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #14 0x561abb7689fb in main ../src/gtest/src/gtest_main.cc:37 #15 0x7f525e5a9bba in __libc_start_main ../csu/libc-start.c:308 #16 0x561abb538ed9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/eu_validate+0x1b8ed9) Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:16:49 +02:00
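A minimal sketch of the approach the commit above describes, reducing the shift count modulo the number of bits in the shifted value. The helper name `shift_left_wrapped` is hypothetical and is not the code in brw_eu_validate.c.

```c
#include <stdint.h>

/*
 * Left-shift a 64-bit value by count, with the count reduced modulo the
 * width of the type. In C, shifting by a count greater than or equal to
 * the width of the (promoted) operand is undefined behaviour; taking
 * the count modulo 64 makes the result well defined for any count.
 */
static uint64_t shift_left_wrapped(uint64_t value, unsigned count)
{
   const unsigned bits = sizeof(value) * 8;
   return value << (count % bits);
}
```

With this helper a shift by 64 behaves like a shift by 0, which matches what the commit message presumes was happening on x86.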
Sagar Ghuge	97e6d34e66	intel/compiler: Refactor disassembly of sources in 3src instruction Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	18b28b5654	intel/compiler: Don't move immediate in register On Gen12, we support mixed mode HF/F operands, and 3 source instructions also support immediate values, so keep the immediate as it is if it fits properly in a 16-bit field. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	bf943bdf24	intel/compiler: Set bits according to source file On Gen >= 12, if src0 or src2 holds an immediate value, we need to set the src[0/2]_is_imm bits instead of the register file. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	c018c5a339	intel/compiler: Add Immediate support for 3 source instruction On Gen >= 10, Either src0 or src2 can use 16-bit immediate value, but not both. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	f729ecefef	intel/compiler: Remove emit_alpha_to_coverage workaround from backend Remove emit_alpha_to_coverage workaround from backend compiler and start using ported workaround from NIR. v2: Copy comment from brw_fs_visitor (Caio Marcelo de Oliveira Filho) Fixes piglit test on HSW: - arb_sample_shading-builtin-gl-sample-mask-mrt-alpha-to-coverage-combinations Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-21 11:27:29 -07:00
Sagar Ghuge	7ecfbd4f6d	nir: Add alpha_to_coverage lowering pass Importing this pass from fs_visitor::emit_alpha_to_coverage_workaround() in intel/compiler. v2 (Caio Marcelo de Oliveira Filho): - Track store output and sample mask instruction - Nest math instruction for more readability - Bail out early if no gl_SampleMask v3: (Caio Marcelo de Oliveira Filho): - Do math instructions after instruction block - Restructure code - Move pass under src/intel/compiler v4: (Caio Marcelo de Oliveira Filho): - Organize dither mask calculation Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-21 11:27:29 -07:00
Kenneth Graunke	f192741ddd	intel/compiler: Report the number of non-spill/fill SEND messages This can be useful to measure whether memory access optimizations are having the desired effect. For example, we might see a reduction in image loads/stores, or constant buffer loads. We can already see this in cycle estimates to some extent, but this is a more direct approach, minus a lot of the noise of random scheduler shuffling. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-17 20:44:00 -07:00
Ian Romanick	92252219d3	intel/vec4: Don't try both sources as immediates for DPH DPH isn't actually commutative, so this doesn't work. If the immediate in src0 would be a VF candidate, we could do better. shrug No shader-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `b04beaf41d` ("intel/vec4: Try both sources as candidates for being immediates")	2019-10-17 15:07:01 -07:00
Caio Marcelo de Oliveira Filho	c847bfaaf5	intel/fs/gen12: Add tests for scoreboard pass Tests the combinations of cases of RAW, WAW and WAR hazards involving both inorder and outoforder instructions. Also tests that dependencies combine and propagate correctly through control flow (loops and conditionals). v2: Add an extra test illustrating that the non-logical CFG edge between then-block and else-block is being taken into account. (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-10-17 10:02:35 -07:00
Kenneth Graunke	44754279ac	intel/fs/gen12: Use TCS 8_PATCH mode. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-10-11 12:24:16 -07:00
Jason Ekstrand	c92fb60007	intel/fs/gen12: Implement gl_FrontFacing on gen12+. The bit moved on gen12 in order to prepare for dual-SIMD8 dispatch. This implementation isn't entirely complete as it only works on SIMD8 and SIMD16 and not dual-SIMD8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	ceb123befa	intel/fs/gen11+: Fix CS_OPCODE_CS_TERMINATE codegen. Apparently the ts_request_type and ts_resource_select thread spawner message descriptor bits were removed from the hardware at least since ICL. Drop them in order to avoid assertion failures on Gen12+ platforms which don't have any encoding for this. On Gen9+ these are probably just ignored by the hardware, so this is unlikely to have had any functional implications prior to Gen12. v2: Mark TS message fields as non-existing in brw_inst.h on ICL. (Caio) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	a5efb0eae8	intel/fs/gen12: Fix barrier codegen. The WAIT instruction has been removed, but SYNC.bar can be used instead to wait for a notification on n0.0. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	6b52f81395	intel/eu: Don't set notify descriptor field of gateway barrier message. Apparently this field was removed on SKL, and according to the hardware docs for previous platforms "This field is only valid for a ForwardMsg message. It is ignored for other messages. The BarrierMsg message always increments the N0 notification counter". Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	b0e69d115e	intel/ir/gen12: Update assert in brw_stage_has_packed_dispatch(). Confirmed no regressions after a full Piglit run on TGL with the brw_fs_test_dispatch_packing() test enabled. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Jason Ekstrand	ca7b6fd392	intel/eu/validate/gen12: Don't blow up on indirect src0. They look like a NULL source if you don't look at the address mode. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	ab5aa01689	intel/eu/validate/gen12: Validation fixes for SEND instruction. The following fix-up by Jordan Justen is squashed in: intel/eu/validate: gen12 send instruction doesn't have a dst type field Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	a81f9b5e3e	intel/eu/validate/gen12: Fix validation of SYNC instruction. src0 will typically be null for this instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	45768e6b3c	intel/eu/validate/gen12: Implement integer multiply restrictions in EU validator. Due to hardware bug filed as HSDES#1604601757. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Jordan Justen	f9ec4ac5a1	intel/ir: Lower fpow on Gen12. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	cb6db5bfb3	intel/fs/gen12: Don't support source mods for 32x16 integer multiply. Due to hardware bug filed as HSDES#1604601757. v2: Only return if result of fs_inst::can_do_source_mods() is known to be false for the case new orthogonal restrictions are implemented below in the future. (Caio) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	de5d106ccf	intel/disasm: Disassemble register file of split SEND sources. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	c03869323b	intel/disasm: Don't disassemble saturate control on SEND instructions. The field is gone on Gen12+ and it was illegal on previous generations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	f15e0b3439	intel/disasm/gen12: Disassemble Gen12 SEND instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	fd7e21dd90	intel/disasm/gen12: Disassemble Gen12 SYNC instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	606d823b42	intel/disasm/gen12: Disassemble three-source instruction source and destination regions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00

1147 commits