mesa/src/intel/compiler
Iago Toral Quiroga 5e584a9db7 i965: skip reading unused slots at the begining of the URB for the FS
We can start reading the URB at the first offset that contains a varying
that the fragment shader actually reads. We still need to make sure that
we read at least one varying to honor hardware requirements.
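
A minimal sketch of the idea, not the code from this commit: the helper
name, its parameters, and the rounding to an even slot (on the assumption
that URB read offsets are expressed in pairs of slots) are all hypothetical.
It scans the slot map of the previous stage and returns the first slot
whose varying is read, falling back to slot 0 when nothing is read so that
the read still starts at a valid offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the Mesa function: returns the index of the
 * first URB slot that the fragment shader has to read.
 *
 * inputs_read     - bitmask of varyings the fragment shader reads
 * slot_to_varying - varying stored in each slot, negative for padding
 * num_slots       - number of entries in slot_to_varying
 */
static int
first_urb_slot_required(uint64_t inputs_read,
                        const int *slot_to_varying, int num_slots)
{
   assert(num_slots >= 0);

   for (int slot = 0; slot < num_slots; slot++) {
      int varying = slot_to_varying[slot];

      /* Negative entries mark padding slots that hold no varying. */
      if (varying >= 0 && varying < 64 &&
          (inputs_read & (UINT64_C(1) << varying)) != 0) {
         /* Assumption: read offsets count pairs of slots, so round the
          * starting slot down to an even index.
          */
         return slot & ~1;
      }
   }

   /* Nothing is read: start at the beginning of the URB. */
   return 0;
}
```

With a seven-slot map holding varyings 0 through 6 and only varying 5 read,
this sketch returns slot 4, so slots 0 to 3 are skipped.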

This helps alleviate a problem introduced with 99df02ca26 for
separate shader objects (SSO). Without SSO we assign locations
sequentially. Since that commit, however, the VUE slot assigned for SSO
is the number of builtin slots plus the location assigned to the varying.
This fixed layout is intended to help SSO programs by avoiding on-the-fly
recompiles when swapping out shaders. It also means that if a varying uses
a large location number close to the maximum allowed by the SF/FS units
(31), then the offset introduced by the number of builtin slots can push
the location outside the range and trigger an assertion.
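
The arithmetic can be illustrated with a small sketch. The function names
and the example numbers are hypothetical, and the limit of 31 is taken from
the text above. It assumes the range check applies to the slot index
relative to the first slot read.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption taken from the text above: 31 is the largest usable index. */
#define MAX_FS_INPUT_INDEX 31

/* Fixed SSO layout: builtin slots come first, then one slot per location. */
static int
sso_vue_slot(int num_builtin_slots, int location)
{
   assert(num_builtin_slots >= 0 && location >= 0);
   return num_builtin_slots + location;
}

/* True if the slot is still in range when reading starts at
 * first_slot_read instead of at slot 0.
 */
static bool
slot_in_range(int vue_slot, int first_slot_read)
{
   assert(first_slot_read >= 0 && first_slot_read <= vue_slot);
   return vue_slot - first_slot_read <= MAX_FS_INPUT_INDEX;
}
```

For example, with 4 builtin slots and location 30 the varying lands in
slot 34, which is out of range when reading starts at slot 0 but back in
range when the first 4 unused slots are skipped.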

This problem is affecting at least the following CTS tests for
enhanced layouts:

KHR-GL45.enhanced_layouts.varying_array_components
KHR-GL45.enhanced_layouts.varying_array_locations
KHR-GL45.enhanced_layouts.varying_components
KHR-GL45.enhanced_layouts.varying_locations

which use SSO and the location layout qualifier to select such
location numbers explicitly.

This change helps these tests because for SSO we always have to include
things such as VARYING_SLOT_CLIP_DIST{0,1} even if the fragment shader is
very unlikely to read them. By skipping them we free builtin slots from
the fixed VUE layout and keep the tests from crashing in this scenario.

Of course, this is not a proper fix. We'd still run into problems if someone
tries to use an explicit max location and read gl_ViewportIndex, gl_LayerID or
gl_CullDistance in the FS, but that would be a much less common bug and we
can probably wait to see if anyone actually runs into that situation in a
real-world scenario before deciding that more aggressive changes are
required to support this without reverting 99df02ca26.

v2:
- Add a debug message when we skip clip distances (Ilia)
- We also need to account for this when we compute the urb setup
  for the fragment shader stage, so add a compiler util to compute
  the first slot that we need to read from the URB instead of
  replicating the logic in both places.

v3:
- Make the util more generic so it can account for all unused slots
  at the beginning of the URB, which makes it more useful (Ken).
- Drop the debug message, it was not what Ilia was asking for.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-02 08:27:13 +02:00
..
.gitignore
brw_cfg.cpp
brw_cfg.h intel/compiler: consistently use ifndef guards over pragma once 2017-03-22 16:55:22 +00:00
brw_clip.h i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_line.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_point.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_tri.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_unfilled.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_util.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_compile_clip.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_compile_sf.c i965: Move SF compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_compiler.c i965: Set lower_vote_trivial in vector_nir_options_gen6 too. 2017-07-21 18:09:01 -07:00
brw_compiler.h i965: skip reading unused slots at the begining of the URB for the FS 2017-10-02 08:27:13 +02:00
brw_dead_control_flow.cpp
brw_dead_control_flow.h intel/compiler: consistently use ifndef guards over pragma once 2017-03-22 16:55:22 +00:00
brw_disasm.c i965: Stop using hardware register types directly 2017-08-21 14:05:23 -07:00
brw_eu.c i965: Move brw_reg_type_letters() as well 2017-08-21 14:05:23 -07:00
brw_eu.h i965: Mark src inst pointer const in compaction code 2017-08-21 14:05:23 -07:00
brw_eu_compact.c i965: Switch to using the logical register types 2017-08-21 14:05:23 -07:00
brw_eu_defines.h i965: Hide the register type hardware encodings 2017-08-21 14:05:23 -07:00
brw_eu_emit.c i965: Switch to using the logical register types 2017-08-21 14:05:23 -07:00
brw_eu_util.c intel/compiler: whitespace cleanups 2017-03-13 11:16:35 +00:00
brw_eu_validate.c intel/eu/validate: Look up types on demand in execution_type() 2017-09-12 15:01:00 -07:00
brw_fs.cpp i965: skip reading unused slots at the begining of the URB for the FS 2017-10-02 08:27:13 +02:00
brw_fs.h i965: Use pushed UBO data in the scalar backend. 2017-07-13 20:18:54 -07:00
brw_fs_builder.h
brw_fs_cmod_propagation.cpp
brw_fs_combine_constants.cpp
brw_fs_copy_propagation.cpp
brw_fs_cse.cpp
brw_fs_dead_code_eliminate.cpp
brw_fs_generator.cpp i965: Normalize types for FBL, FBH, etc 2017-09-30 20:18:09 -07:00
brw_fs_live_variables.cpp
brw_fs_live_variables.h intel/compiler: consistently use ifndef guards over pragma once 2017-03-22 16:55:22 +00:00
brw_fs_lower_conversions.cpp i965/fs: rename lower_d2x to lower_conversions 2017-04-14 14:56:07 -07:00
brw_fs_lower_pack.cpp
brw_fs_nir.cpp i965/fs: force pull model for 64-bit GS inputs 2017-09-29 08:18:25 +02:00
brw_fs_reg_allocate.cpp intel/fs: Take into account amount of data read in spilling cost heuristic. 2017-04-24 11:01:40 -07:00
brw_fs_register_coalesce.cpp
brw_fs_saturate_propagation.cpp
brw_fs_sel_peephole.cpp i965/fs: Do not move MOVs writing the flag outside of control flow 2017-07-20 16:56:49 -07:00
brw_fs_surface_builder.cpp i965: Mark functions static 2017-08-21 14:45:44 -07:00
brw_fs_surface_builder.h
brw_fs_validate.cpp
brw_fs_visitor.cpp i965/fs: Lower gl_VertexID and friends to inputs at the NIR level 2017-05-09 15:07:47 -07:00
brw_inst.h i965: Optimize reading the destination type 2017-08-21 14:05:23 -07:00
brw_interpolation_map.c
brw_ir_allocator.h
brw_ir_fs.h i965/fs: add helper to retrieve instruction execution type 2017-04-14 14:56:07 -07:00
brw_ir_vec4.h i965/vec4: don't do horizontal stride on some register file types 2017-04-14 14:56:09 -07:00
brw_nir.c i965/nir: export nir_optimize 2017-09-26 22:37:02 +10:00
brw_nir.h i965/nir: export nir_optimize 2017-09-26 22:37:02 +10:00
brw_nir_analyze_boolean_resolves.c
brw_nir_analyze_ubo_ranges.c i965: Select ranges of UBO data to be uploaded as push constants. 2017-07-13 19:56:49 -07:00
brw_nir_attribute_workarounds.c nir: Rework conversion opcodes 2017-03-14 07:36:40 -07:00
brw_nir_intrinsics.c nir: Add system values from ARB_shader_ballot 2017-07-20 16:56:49 -07:00
brw_nir_opt_peephole_ffma.c
brw_nir_tcs_workarounds.c
brw_nir_trig_workarounds.py intel: use a flag instead of setting PYTHONPATH 2017-09-27 09:07:28 -07:00
brw_packed_float.c
brw_predicated_break.cpp
brw_reg.h i965: Move brw_reg_type_letters() as well 2017-08-21 14:05:23 -07:00
brw_reg_type.c intel/compiler: Cast reg types explicitly 2017-08-28 14:43:39 +03:00
brw_reg_type.h i965: Mark brw_hw_type_to_reg_type() as a pure function 2017-08-21 14:05:23 -07:00
brw_schedule_instructions.cpp
brw_shader.cpp i965/cnl: Make URB {VS, GS, HS, DS} sizes non multiple of 3 2017-06-09 16:02:59 -07:00
brw_shader.h intel/compiler: consistently use ifndef guards over pragma once 2017-03-22 16:55:22 +00:00
brw_vec4.cpp i965/vec4: Use 'class' src_reg, rather than 'struct' src_reg 2017-08-21 14:45:44 -07:00
brw_vec4.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_builder.h
brw_vec4_cmod_propagation.cpp
brw_vec4_copy_propagation.cpp i965: Support copy propagating of untyped atomic surface indexes. 2017-09-26 15:35:14 -07:00
brw_vec4_cse.cpp
brw_vec4_dead_code_eliminate.cpp i965/vec4/dce: improve track of partial flag register writes 2017-04-14 14:56:09 -07:00
brw_vec4_generator.cpp i965: Normalize types for FBL, FBH, etc 2017-09-30 20:18:09 -07:00
brw_vec4_gs_nir.cpp i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_gs_visitor.cpp intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed. 2017-08-03 16:54:08 +10:00
brw_vec4_gs_visitor.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_live_variables.cpp
brw_vec4_live_variables.h i965/vec4: consider subregister offset in live variables 2017-04-14 14:56:08 -07:00
brw_vec4_nir.cpp i965/vec4: Actually handle atomic op intrinsics. 2017-09-26 15:35:06 -07:00
brw_vec4_reg_allocate.cpp i965/vec4: Return float from spill_cost_for_type() 2017-08-21 14:45:44 -07:00
brw_vec4_surface_builder.cpp i965/vec4: Fix swizzles on atomic sources. 2017-09-26 15:35:11 -07:00
brw_vec4_surface_builder.h
brw_vec4_tcs.cpp i965/cnl: Make URB {VS, GS, HS, DS} sizes non multiple of 3 2017-06-09 16:02:59 -07:00
brw_vec4_tcs.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_tes.cpp i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_tes.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_visitor.cpp i965: Handle unwritten PSIZ/VIEWPORT/LAYER outputs in vec4 shaders. 2017-09-21 09:39:27 -07:00
brw_vec4_vs.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_vs_visitor.cpp i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vue_map.c
brw_wm_iz.cpp nir: Embed the shader_info in the nir_shader again 2017-05-09 15:07:47 -07:00
gen6_gs_visitor.cpp i965: Move SOL PSIZ hacks from draw time to link time. 2017-06-01 00:08:29 -07:00
gen6_gs_visitor.h
intel_asm_annotation.c i965: Add a weak no-op nir_print_instr() symbol 2017-05-15 11:43:01 -07:00
intel_asm_annotation.h
meson.build meson: Add build Intel "anv" vulkan driver 2017-09-27 09:12:19 -07:00
test_eu_compact.cpp i965: Remove CONT/BREAK from instruction compaction test 2017-08-21 14:05:23 -07:00
test_eu_validate.cpp i965: Add functions to abstract access to register types 2017-08-21 14:05:23 -07:00
test_fs_cmod_propagation.cpp
test_fs_copy_propagation.cpp
test_fs_saturate_propagation.cpp
test_vec4_cmod_propagation.cpp
test_vec4_copy_propagation.cpp
test_vec4_register_coalesce.cpp
test_vf_float_conversions.cpp