fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 22:38:06 +02:00

Author	SHA1	Message	Date
Rafael Antognolli	43dc842cb9	anv: Wait for the GPU to be idle before invalidating the aux table. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>	2020-03-02 22:28:11 +00:00
Jason Ekstrand	3ca3050de5	anv: Do end-of-pipe sync around MCS/CCS ops instead of CS stall v2: Do end-of-pipe sync after clear depth stencil too (Jason). v3: Also do end-of-pipe sync before clear depth stencil too (Jason). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>	2020-03-02 22:28:11 +00:00
Jason Ekstrand	2db471953a	anv: Use a proper end-of-pipe sync instead of just CS stall Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>	2020-03-02 22:28:11 +00:00
Jason Ekstrand	ac8d412ba3	anv: Use the PIPE_CONTROL instead of bits for the CS stall W/A Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4005>	2020-03-02 22:28:11 +00:00
Lionel Landwerlin	85457e350d	intel/tools/dump_gpu: fix getparam values Don't return the pci_id for all params Fixes: `76bf38eaf0` ("intel/tools/aub_dump: move aub file initialization to maybe_init()") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3994> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3994>	2020-03-02 08:24:40 +00:00
Jordan Justen	cf12faef61	intel/compiler: Restrict cs_threads to 64 Our current GPGPU_WALKER code only supports up to 64 threads. On HSW we could use up to 70 and TGL up to 112, but only if the walker is adjusted so the width does not exceed 64. Work to support this is in progress. Previous to this change, we might try to downgrade to SIMD8 if the SIMD16 shader spilled. Since HSW and TGL have the max number of threads above 64, we would then try to emit an invalid GPGPU walker command. Fixes: `932045061b` ("i965/cs: Emit compute shader code and upload programs") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2020-02-28 14:45:43 -08:00
Caio Marcelo de Oliveira Filho	dab7a4d82c	anv: Remove unused field `urb.total_size` This was used before the URB calculation functions were shared by GL and Vulkan. Also drop the substruct for the remaining, `l3_config` is a good name on its own. Also-written-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3981> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3981>	2020-02-27 14:45:10 -08:00
Caio Marcelo de Oliveira Filho	811990dc1c	anv: Remove unused field xfb_used from anv_pipeline Since we only use xfb_info for GEN >= 8, make the ifdef cover that local variable. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3973> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3973>	2020-02-27 10:44:11 -08:00
Jason Ekstrand	349898a967	nir: Drop nir_tex_instr::texture_array_size It's set by lots of things and we spend a lot of time maintaining it but no one actually uses the value for anything useful. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3940> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3940>	2020-02-26 18:29:49 +00:00
Matt Turner	cb166aea24	intel/tools: Do not print type/qualifiers/name for c_literal External tools may wish to choose their own type, qualifiers, and name, so do not emit our own. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3952> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3952>	2020-02-25 22:23:38 +00:00
Sagar Ghuge	5feea40889	intel/tools: Allow i965_disasm to disassemble c_literal input type Added extra argument named 'type' which can be 'bin' (default if ommited) or 'c_literal' for input type. Change 'binary-path' argument name to 'input-path'. v2: - Use util_dynarray for assembly (Matt Turner) - Read data in 8 bytes chunk (Matt Turner) - Fix help option (Akeem Abodunrin) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3952>	2020-02-25 22:23:38 +00:00
Sagar Ghuge	2f83daedb1	intel/tools: Print c_literals 4 byte wide We already print hex value a byte wide, instead of printing c_literal byte wide, we can print it 4 byte wide, which gives us 2 different combinations. v2: Fix the aliasing issue (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3952>	2020-02-25 22:23:38 +00:00
Sagar Ghuge	0b0e958f4f	intel/tools: Add test for state register as source Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3952>	2020-02-25 22:23:38 +00:00
Sagar Ghuge	31c29f4f55	intel/tools: Add test for address register as source Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3952>	2020-02-25 22:23:38 +00:00
Sagar Ghuge	9526e5c359	intel/tools: Set correct address register file and number in i965_asm We need to use already created brw_reg and set correct file type, register number and sub register number. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3952>	2020-02-25 22:23:38 +00:00
Sagar Ghuge	87d9e78f26	intel/tools: Handle STATE_REG in typed source operand Also stop using brw_sr0_reg function as it return new brw_reg, we already created register, all we have to is just set file, register number and subnr. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3952>	2020-02-25 22:23:38 +00:00
Sagar Ghuge	2a75e60365	intel/tools: Handle illegal instruction Allow assembler to handle illegal instruction even though mesa doesn't use it but might be required at some point in future. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3952>	2020-02-25 22:23:38 +00:00
Jason Ekstrand	5dfd83d7a1	anv: Always enable the data cache Because we set the needs_data_cache bit from the NIR during compilation, any time a shader was pulled out of the pipeline cache, we wouldn't set the bit and the data cache was disabled. Fortunately, on Gen8+, this bit is ignored because we always use the ALL section in the L3$ config instead of separate DC and RO sections. On Gen7, however, this meant that we were basically never running with the data cache enabled and our compute performance was suffering massively because of it. This commit improves Geekbench 5 scores on my Haswell GT3 by roughly 330% (no, that's not a typo). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3912> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3912>	2020-02-25 20:12:10 +00:00
Lionel Landwerlin	d4e7a11bc3	intel/aub_dump: stub the waits when overriding the device We don't actually want to wait on anything, just complete submitting the commands as fast as possible. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3705> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3705>	2020-02-25 20:56:49 +02:00
Lionel Landwerlin	31461e2379	intel/tools/aub_dump: fix crash when using the default legacy context When execbuffer->rsvd1 == 0, the legacy context is used. Ensure we have context created for this. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3705>	2020-02-25 20:56:45 +02:00
Lionel Landwerlin	76bf38eaf0	intel/tools/aub_dump: move aub file initialization to maybe_init() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3705>	2020-02-25 20:56:40 +02:00
Jason Ekstrand	6fbe3f40a9	intel/isl: Add isl_aux_info.c to Makefile.sources This should fix the Android build. Fixes: `58d4749e56` "isl: Add a module which manages aux resolves" Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reported-by: Clayton Craft <clayton.a.craft@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3934> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3934>	2020-02-25 00:41:15 +00:00
Rafael Antognolli	9ab0e92cff	intel/blorp: Implement GEN:BUG:1605967699. v2: - Update comments and refactor code (Lionel). - Only apply workaround to stencil resolves. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3909> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3909>	2020-02-25 00:04:36 +00:00
Caio Marcelo de Oliveira Filho	956e4b2d37	nir, intel: Move use_scoped_memory_barrier to nir_options This option will be used later by GLSL, so move to a common struct. Because nir_options is filled in the compiler instead of the Vulkan driver, fix that up. GLSL will ignore that for now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913>	2020-02-24 19:12:11 +00:00
Eric Anholt	3e16434acd	nir: Move intel's intrinsic_image_coordinate_components() to core nir. This is a query that both Intel and freedreno need to do. We can simplify it a lot with the new glsl_get_sampler_dim_coordinate_components() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>	2020-02-24 18:25:02 +00:00
Nanley Chery	58d4749e56	isl: Add a module which manages aux resolves Provide a generic interface which manages aux resolves in ISL. The feature differences between this and what's in iris is: * Support for media compression. ISL_AUX_USAGE_MC behaves differently from many other usages of CCS, so it was useful to implement this support upfront, while designing the interfaces. * Optimizations for full-surface writes. For example, after a full-surface write occurs with ISL_AUX_USAGE_CCS_E in the PARTIAL_CLEAR state, isl_aux_state_transition_write() returns COMPRESSED_NO_CLEAR instead of COMPRESSED_CLEAR. A performance suggestion for main-surface-invalidating/replacing writes is given as a comment instead of adding a boolean to isl_aux_prepare_access(). This avoids extra validation and should be simple enough for the caller to handle. v2. Add assertions. (Jason) v3. Use switches in 2 more functions. (Jason) Store aux metadata in a static table. (Jason) Change prepare and finish function signatures. (Jason) Keep isl_aux_state_transition_* functions separate. v4. (Jason) Assert against resolving in AUX_INVALID. Rename aux_info struct to aux_usage_info. Drop the justification for each aux_usage_info field. Split out the NONE case in write function. Restructure tests to more easily confirm coverage. Rename access_compressed field to compressed. Make write behavior less ambiguous. v5. (Jason) Add more detail above WRITES_RESOLVE_AMBIGUATE. Add ISL_AUX_USAGE_MC to WritesResolveAmbiguate. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2957>	2020-02-24 18:00:05 +00:00
Caio Marcelo de Oliveira Filho	89a3856714	anv: Add pipe_state_for_stage() helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3911> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3911>	2020-02-21 13:09:44 -08:00
Caio Marcelo de Oliveira Filho	7df5d36078	anv: Use intel_debug_flag_for_shader_stage() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3911>	2020-02-21 13:09:44 -08:00
Ian Romanick	273b8cd1ca	intel/fs: Correctly handle multiply of fsign with a source modifier The other source of the multiply will be interpreted as a uint32_t in an XOR instruction. Any source modifiers with either not be interpreted at all or will be misinterpreted due to the differing types. If the other operand of the multiplication has a source modifier, just emit an extra move to resolve the source modifiers. The negation source modifier problem is difficult to reproduce due to an algebraic optimization that changes (-ab) to -(ab). However, changes in MR !1359 push the negations back down. On Gen7+ it might be possible to do slightly better for an abs() source modifier by using BFI2 as a glorified copysign(). On Gen8+ it might be possible to do slightly better for a neg() source modifier by emitting (~a ^ b). There were no shader-db changes on any Intel platform, so I think we can deal with that problem when it arises. See also piglit!224. Fixes: `06d2c11641` ("intel/fs: Add a scale factor to emit_fsign") Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3780> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3780>	2020-02-19 23:51:42 +00:00
Chad Versace	d8fe9e045f	anv: Drop anv_image.c:get_surface() It was called exactly once, and even there it returned the wrong surface in a corner case. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3882> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3882>	2020-02-19 19:41:05 +00:00
Danylo Piliaiev	d5931f285b	intel/compiler: Do not qsort zero sized array ../src/intel/compiler/brw_nir_analyze_ubo_ranges.c:316:4: runtime error: null pointer passed as argument 1, which is declared to never be null #0 0x7f78f5916611 in brw_nir_analyze_ubo_ranges ../src/intel/compiler/brw_nir_analyze_ubo_ranges.c:316 #1 0x7f78f255c189 in brw_codegen_wm_prog ../src/mesa/drivers/dri/i965/brw_wm.c:97 #2 0x7f78f2565571 in brw_fs_precompile ../src/mesa/drivers/dri/i965/brw_wm.c:608 #3 0x7f78f24edd2c in brw_shader_precompile ../src/mesa/drivers/dri/i965/brw_link.cpp:56 #4 0x7f78f24f3af8 in brw_link_shader ../src/mesa/drivers/dri/i965/brw_link.cpp:381 #5 0x7f78f39a302a in _mesa_glsl_link_shader ../src/mesa/program/ir_to_mesa.cpp:3119 #6 0x7f78f3a43826 in create_new_program ../src/mesa/main/ff_fragment_shader.cpp:1133 #7 0x7f78f3a43d00 in _mesa_get_fixed_func_fragment_program ../src/mesa/main/ff_fragment_shader.cpp:1163 #8 0x7f78f325ddcd in update_program ../src/mesa/main/state.c:134 #9 0x7f78f325fe64 in _mesa_update_state_locked ../src/mesa/main/state.c:360 #10 0x7f78f32600f1 in _mesa_update_state ../src/mesa/main/state.c:394 #11 0x7f78f2b3e587 in clear ../src/mesa/main/clear.c:169 #12 0x7f78f2b3e587 in _mesa_Clear ../src/mesa/main/clear.c:242 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3825>	2020-02-19 12:07:24 +02:00
Danylo Piliaiev	d596795d4d	brw_fs: Avoid zero size vla ../src/intel/compiler/brw_fs.cpp:2247:46: runtime error: variable length array bound evaluates to non-positive value 0 #0 0x7f78f5697678 in fs_visitor::assign_constant_locations() ../src/intel/compiler/brw_fs.cpp:2247 #1 0x7f78f571d29e in fs_visitor::optimize() ../src/intel/compiler/brw_fs.cpp:7361 #2 0x7f78f574eb84 in fs_visitor::run_fs(bool, bool) ../src/intel/compiler/brw_fs.cpp:8022 #3 0x7f78f575641b in brw_compile_fs ../src/intel/compiler/brw_fs.cpp:8408 #4 0x7f78f255c8e4 in brw_codegen_wm_prog ../src/mesa/drivers/dri/i965/brw_wm.c:123 #5 0x7f78f2565571 in brw_fs_precompile ../src/mesa/drivers/dri/i965/brw_wm.c:608 #6 0x7f78f24edd2c in brw_shader_precompile ../src/mesa/drivers/dri/i965/brw_link.cpp:56 #7 0x7f78f24f3af8 in brw_link_shader ../src/mesa/drivers/dri/i965/brw_link.cpp:381 #8 0x7f78f39a302a in _mesa_glsl_link_shader ../src/mesa/program/ir_to_mesa.cpp:3119 #9 0x7f78f3a43826 in create_new_program ../src/mesa/main/ff_fragment_shader.cpp:1133 #10 0x7f78f3a43d00 in _mesa_get_fixed_func_fragment_program ../src/mesa/main/ff_fragment_shader.cpp:1163 #11 0x7f78f325ddcd in update_program ../src/mesa/main/state.c:134 #12 0x7f78f325fe64 in _mesa_update_state_locked ../src/mesa/main/state.c:360 #13 0x7f78f32600f1 in _mesa_update_state ../src/mesa/main/state.c:394 #14 0x7f78f2b3e587 in clear ../src/mesa/main/clear.c:169 #15 0x7f78f2b3e587 in _mesa_Clear ../src/mesa/main/clear.c:242 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3825>	2020-02-19 12:07:24 +02:00
Danylo Piliaiev	d4e395a27d	brw_nir: Cast bitshift to unsigned ../src/intel/compiler/brw_nir.c:979:40: runtime error: left shift of 1 by 31 places cannot be represented in type 'int' #0 0x7f78f590d10b in brw_nir_apply_sampler_key ../src/intel/compiler/brw_nir.c:979 #1 0x7f78f590e07b in brw_nir_apply_key ../src/intel/compiler/brw_nir.c:1057 #2 0x7f78f5754b45 in brw_compile_fs ../src/intel/compiler/brw_fs.cpp:8347 #3 0x7f78f255c8e4 in brw_codegen_wm_prog ../src/mesa/drivers/dri/i965/brw_wm.c:123 #4 0x7f78f2565571 in brw_fs_precompile ../src/mesa/drivers/dri/i965/brw_wm.c:608 #5 0x7f78f24edd2c in brw_shader_precompile ../src/mesa/drivers/dri/i965/brw_link.cpp:56 #6 0x7f78f24f3af8 in brw_link_shader ../src/mesa/drivers/dri/i965/brw_link.cpp:381 #7 0x7f78f39a302a in _mesa_glsl_link_shader ../src/mesa/program/ir_to_mesa.cpp:3119 #8 0x7f78f3a43826 in create_new_program ../src/mesa/main/ff_fragment_shader.cpp:1133 #9 0x7f78f3a43d00 in _mesa_get_fixed_func_fragment_program ../src/mesa/main/ff_fragment_shader.cpp:1163 #10 0x7f78f325ddcd in update_program ../src/mesa/main/state.c:134 #11 0x7f78f325fe64 in _mesa_update_state_locked ../src/mesa/main/state.c:360 #12 0x7f78f32600f1 in _mesa_update_state ../src/mesa/main/state.c:394 #13 0x7f78f2b3e587 in clear ../src/mesa/main/clear.c:169 #14 0x7f78f2b3e587 in _mesa_Clear ../src/mesa/main/clear.c:242 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3825>	2020-02-19 12:07:24 +02:00
Caio Marcelo de Oliveira Filho	79788b8f7f	intel/gen12: Take into account opcode when decoding SWSB The interpretation of the fields is different depending whether the instruction is a SEND/MATH or not. This fixes the disassembly output for non-SEND/MATH instructions that have both in-order and out-of-order dependencies. Their dependencies were wrongly represented as `@A $B` when the correct would be `@A $B.dst`. Fixes: `6154cdf924` ("intel/eu/gen12: Add auxiliary type to represent SWSB information during codegen.") Fixes: `83612c0127` ("intel/disasm/gen12: Disassemble software scoreboard information.") Acked-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3660> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3660>	2020-02-18 09:17:51 -08:00
Caio Marcelo de Oliveira Filho	8004cb256a	anv: Advertise VK_KHR_shader_non_semantic_info Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3856>	2020-02-18 09:57:15 -06:00
Francisco Jerez	8d3b86e34a	intel/fs/gen7+: Implement discard/demote for SIMD32 programs. At this point this simply involves fixing the initialization of the sample mask flag register to take the right dispatch mask from the thread payload, and fixing sample_mask_reg() to return f1.1 for the second half of a SIMD32 thread. This improves Manhattan 3.1 performance by 2.4%±0.31% (N>40) on my ICL with SIMD32 enabled relative to falling back to SIMD16 for the shaders that use discard. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:49 -08:00
Francisco Jerez	04c7d3d4b1	intel/fs: Return consistent UW types from sample_mask_reg() in fragment shaders. In SIMD32 programs that don't use discard, the upper 16 bits of the UD result of sample_mask_reg() don't contain the sample mask of the upper 16 channels as one would expect. Stop pretending we are returning a valid 32-bit mask. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:49 -08:00
Francisco Jerez	1c6853a9be	intel/fs: Refactor predication on sample mask into helper function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	a792e11f5c	intel/fs/gen7+: Swap sample mask flag register and FIND_LIVE_CHANNEL temporary. FIND_LIVE_CHANNEL was using f1.0-f1.1 as temporary flag register on Gen7, instead use f0.0-f0.1. In order to avoid collision with the discard sample mask, move the latter to f1.0-f1.1. This makes room for keeping track of the sample mask of the second half of SIMD32 programs that use discard. Note that some MOVs of the sample mask into f1.0 become redundant now in lower_surface_logical_send() and lower_a64_logical_send(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>x	2020-02-14 14:31:48 -08:00
Francisco Jerez	083fd96a97	intel/fs: Use helper for discard sample mask flag subregister number. Use it instead of hard-coding f0.1 for the sample mask of programs that use discard. This will make the task easier when we replace f0.1 with another flag register location in order to support discard with SIMD32 shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	a6bc11a789	intel/fs: Make sample_mask_reg() local to brw_fs.cpp and use it in more places. It's only really useful there. This will avoid confusion with another helper with a similar purpose I'm about to introduce that will be useful in multiple files from the FS back-end. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	b84fa0b31e	intel/fs/gen11: Work around dual-source blending hangs in combination with SIMD32. The SIMD8 dual-source blending framebuffer write messages seem to have trouble releasing the pixel scoreboard dependency in SIMD32 dispatch mode, which leads to hangs. I have a better workaround for this which doesn't involve disabling SIMD32 when dual-source blending is enabled, but I'm still investigating some issues with it. Limit the dispatch width to SIMD16 in such cases for the moment in order to make the CI happy on ICL with SIMD32 fragment shaders enabled. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	57dee58c82	intel/fs: Set src0 alpha present bit in header when provided in message payload. Currently the "Source0 Alpha Present to RenderTarget" bit of the RT write message header is derived from brw_wm_prog_data::replicate_alpha. However the src0_alpha payload is provided anytime it's specified to the logical message. This could theoretically lead to an inconsistency if somebody provided a src0_alpha value while brw_wm_prog_data::replicate_alpha was false, as I'm planning to do in a future commit in order to implement a hardware workaround. Instead calculate the header bit based on whether a src0_alpha value was provided to the logical message, which guarantees the same behavior on pre-ICL and ICL+ (the latter used an extended descriptor bit for this which didn't suffer from the same issue). Remove the brw_wm_prog_data::replicate_alpha flag. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	e14529ff32	intel/fs/gen12: Workaround data coherency issues due to broken NoMask control flow. Together with the fixup_nomask_control_flow() pass introduced in a previous patch, this implements a less invasive alternative to the workaround documented in the hardware spec for GEN:BUG:1407528679, which doesn't involve disabling structured control flow. Under some conditions Gen12 hardware can end up executing a BB with all channels disabled, which will lead to the execution of any NoMask instructions in it, even though any execution-masked instructions will be correctly shot down. This could break assumptions of the SWSB pass if the data computed by a NoMask instruction is synchronized against by using an SWSB annotation baked into a regular execution-masked instruction, since the first (NoMask) instruction may be executed redundantly by the hardware, even though the second will correctly be shot down, potentially leading to a RaW or WaW hazard if a third instruction subsequently accesses the destination register of the first instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	4e4e8d793f	intel/fs/gen12: Fixup/simplify SWSB annotations of SIMD32 scratch writes. Found by inspection. Existing code was trying to avoid assuming that an SBID had been assigned to the virtual instruction, but synchronizing the header setup with respect to the previous SIMD16 SEND by using SYNC.ALLRD doesn't really seem possible unless the SEND instruction had been assigned an SBID. Assert-fail instead if no SBID has been allocated. Fixes: `15e3a0d9d2` "intel/eu/gen12: Set SWSB annotations in hand-crafted assembly." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	a8ac0bd759	intel/fs/gen12: Workaround unwanted SEND execution due to broken NoMask control flow. This is a less invasive alternative to the workaround documented in the hardware spec for GEN:BUG:1407528679, which doesn't involve disabling structured control flow (it's unlikely that switching to GOTO/JOIN would have actually fixed the problem anyway). Under some conditions Gen12 hardware can end up executing a BB with all channels disabled, which will lead to the execution of any NoMask instructions in it, even though any execution-masked instructions will be correctly shot down. This may break assumptions of some NoMask SEND messages whose descriptor depends on data generated by live invocations of the shader. This avoids the problem by predicating certain instructions on an ANY horizontal predicate that makes sure that their execution is omitted when all channels of the program are disabled. The shader-db impact of this patch seems to be minimal: total instructions in shared programs: 17169833 -> 17169913 (0.00%) instructions in affected programs: 30663 -> 30743 (0.26%) helped: 0 HURT: 42 total cycles in shared programs: 336966176 -> 336968568 (0.00%) cycles in affected programs: 2367290 -> 2369682 (0.10%) helped: 0 HURT: 13 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	008f95a043	intel/fs: Add virtual instruction to load mask of live channels into flag register. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	b8b509fb92	intel/fs/gen7: Fix fs_inst::flags_written() for SHADER_OPCODE_FIND_LIVE_CHANNEL. We need to pass a width of 32 since the opcode bashes the whole f1.0 register on IVB. This is unlikely to have caused problems since f1.0 is largely unused currently. That's likely to change soon though, even on platforms other than Gen7. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Francisco Jerez	c9e33e5cbf	intel/fs/cse: Make HALT instruction act as CSE barrier. Found by inspection. This seems particularly likely to cause problems with instructions dependent on the current execution mask like SHADER_OPCODE_FIND_LIVE_CHANNEL or the FS_OPCODE_LOAD_LIVE_CHANNELS instruction I'm about to introduce, but one could imagine it leading to data corruption if CSE ever managed to combine two instructions before and after the FS_OPCODE_PLACEHOLDER_HALT, since the one before may not be executed for some channels. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: 20.0 <mesa-stable@lists.freedesktop.org>	2020-02-14 14:31:48 -08:00
Rafael Antognolli	6baeca3689	intel/tools: Update aubinator_error_decode. "ringbuffer" is now called only "ring" in the error state. v2: Keep compatible with old error state (Lionel). v3: Also update "gtt_offset" -> "batch". Closes: https://gitlab.freedesktop.org/drm/intel/issues/1206 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2020-02-13 16:53:18 -08:00

1 2 3 4 5 ...

5237 commits