fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 18:18:06 +02:00

Author	SHA1	Message	Date
D Scott Phillips	49f9a0bb57	intel/tools/aubinator_error_decode: read HW Context before other batches The hardware context buffer has state that was set before the batch started. By decoding it first, references to things like Dynamic State Base Address are decodable in the command batches. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4246> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4246>	2020-03-23 18:13:37 +00:00
Sagar Ghuge	60c789543e	anv: Set patch count threshold in 3DSTATE_HS Let's specify the maximum number of patches that will be accumulated before a thread is dispatched. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3563>	2020-03-23 17:57:57 +00:00
Sagar Ghuge	1a5ac646ce	intel/compiler: Track patch count threshold Return the number of patches to accumulate before an 8_PATCH mode thread is launched. v2: (Kenneth Graunke) - Track patch count threshold instead of input control points. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3563>	2020-03-23 17:57:57 +00:00
Sagar Ghuge	b3dd54fe13	intel/genxml: Add patch count threshold field on gen12 Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3563>	2020-03-23 17:57:57 +00:00
Jason Ekstrand	3252041a78	anv: Only add END_OF_PIPE_SYNC if we actually have AUX_INVAL Fixes: `43dc842cb9` "anv: Wait for the GPU to be idle before..." Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4234> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4234>	2020-03-19 21:58:49 +00:00
Jason Ekstrand	9dbff6f6ce	intel/iris: Always initialize CCS to 0 Previously, we were initializing the CCS to 0xFF for MCS+CCS due to a misunderstanding of the following lines in the bspec: The following are the general SW requirements for MCS buffer clear functionality: ... - If Software wants to enable Color Compression without Fast clear, Software needs to initialize MCS with zeros. - Lossless compression and CCS initialized to all F (using HW Fast Clear or SW direct Clear) on the same surface is not supported. The first line does not refer to the CCS as the comment author supposed but refers to the MCS as the comment says. It means that if you want to use MCS compression without a fast-clear, you should initialize the MCS to 0x00. This is because the value 0x00 in the MCS means "all data is in plane 0" which is a perfectly valid non-fast-clear initialization. It's also the value the MCS should be in if you do a RECTLIST slow-clear where the primitive fully covers each pixel such that the same value is written to all samples. The second line in the above quote seems to imply that CCS fast-clear is incompatible with MCS fast-clear. In particular, MCS+CCS fast-clear uses a 0xff value in the MCS (like on Gen7-11) and leaves the CCS in either the compressed or the pass-through state. Therefore, we should initialize the CCS to 0x00 even for MCS+CCS surfaces. Reviewed-by: Sagar Ghuge<sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4074> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4074>	2020-03-19 20:54:19 +00:00
Lionel Landwerlin	507abc3959	isl: drop min row pitch alignment when set by the driver When the caller of the isl_surf_init() specifies a row pitch, do not consider the minimum CCS requirement if it's incompatible with the caller's value. isl_surf_get_ccs_surf() will check that the main surface alignment matches CCS expectations. v2: Simplify checks (Nanley) v3: Add Comment about isl_surf_get_ccs_surf() (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Fixes: `a3f6db2c4e` ("isl: drop CCS row pitch requirement for linear surfaces") Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>	2020-03-19 19:17:10 +00:00
Lionel Landwerlin	def3470e9b	isl: only apply main surface ccs pitch constraint with CCS We could be creating a Y-tiled surface that isn't going to use CCS (this could be the case when clearly indicated through modifiers). Don't apply the main surface pitch alignment constraint in that case. v2: Use logical NOT (Sagar) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `a3f6db2c4e` ("isl: drop CCS row pitch requirement for linear surfaces") Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>	2020-03-19 19:17:10 +00:00
Lionel Landwerlin	dab0aadea9	isl: properly filter supported display modifiers on Gen9+ Y tiling is supported for display on Gen9+ so don't filter it from the possible flags. v2: Drop Yf from display supported tilings on Gen12+ (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>	2020-03-19 19:17:10 +00:00
Lionel Landwerlin	157a3cf3ec	isl: implement linear tiling row pitch requirement for display We're missing a requirement for alignment of row pitch for the display HW. In linear tiling, the row pitch must be 64-byte aligned. v2: Use correct formula to align to 64 bytes (Chad) v3: Matching {} (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4243>	2020-03-19 19:17:10 +00:00
Jason Ekstrand	46187bb54f	anv: Swizzle fast-clear values Starting with Gen12, we can fast-clear a lot more surface formats and we are suddenly in the position of having to fast-clear surfaces with formats with an implicit swizzle such as VK_FORMAT_R4G4B4A4_UNORM_PACK16 which is represented as ISL_FORMAT_A4B4G4R4 with a BGRA swizzle. In order for blorp to do the fast-clear color conversion for us, it needs a properly swizzled color. This fixes the following Vulkan CTS groups on TGL: - dEQP-VK.pipeline.blend.format.b4g4r4a4_unorm_pack16.* - dEQP-VK.api.image_clearing.core.clear_color_image..b4g4r4a4 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>	2020-03-18 21:05:07 +00:00
Jason Ekstrand	3fb8f19481	intel/blorp: Add support for swizzling fast-clear colors Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4218>	2020-03-18 21:05:07 +00:00
Chad Versace	6ee971c882	anv: Use isl_drm_modifier_get_default_aux_state() Use it in anv_layout_to_aux_state(). Refactor only. No change in behavior. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>	2020-03-18 11:39:33 -07:00
Jason Ekstrand	0905d5a14a	intel/isl: Don't align linear images to 64K on Gen12+ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>	2020-03-18 17:33:28 +00:00
Lionel Landwerlin	25a54554b3	intel/decoder: don't consider header fields past dword0 v2: use ULL Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>	2020-03-18 09:19:53 +00:00
Jason Ekstrand	d60375cbc2	anv: Do an end-of-pipe sync before updating AUX table entries We've found in GL that an actual end-of-pipe sync is required before invalidating the aux tables and that a simple CS stall is insufficient. If we're about to modify the actual AUX table entries from the GPU, we should definitely make sure it's stopped dead before we do so. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>	2020-03-17 16:38:50 +00:00
Caio Marcelo de Oliveira Filho	3dd0d12aa5	intel/blorp: Plumb the stage through blorp upload_shader Vulkan uses that for its own upload function -- even though for BLORP it doesn't really currently care. Neither Iris nor i965 makes use of it at the moment. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170>	2020-03-17 08:24:46 -07:00
Jason Ekstrand	4061ac859d	anv: Push UBO ranges relative to the start of the binding There was a disconnect between anv_nir_compute_push_layout and the code which sets up the push_ubo_sizes array. The NIR code we emit checks relative to the start of the bound UBO range so that, if we end up with a vector which straddles the start of the push range, we can perform the bounds check without risking overflow issues. The code which sets up the push_ubo_sizes, on the other hand, assumed it was relative to the start of the push range. Somehow, this didn't get caught by any of the available tests. Fixes: `e03f965280` "anv: Bounds-check pushed UBOs when ..." Closes: #2623 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4195> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4195>	2020-03-16 15:14:14 +00:00
Jason Ekstrand	ae15b4fd73	anv: Fix the comparison in an assert Fixes: `e03f965280` "anv: Bounds-check pushed UBOs when ..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4195>	2020-03-16 15:14:14 +00:00
Tapani Pälli	d836f3fadf	isl: allow compression for storage images on gen12+ This is done to be able to use ISL_AUX_USAGE_CCS_E with images. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>	2020-03-16 10:34:21 +00:00
Tapani Pälli	e8f0483ec4	intel/compiler: detect if atomic load store operations are used The patch adds a new arg and modifies existing calls: i965 and anv pass NULL, but iris stores this information for later use. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>	2020-03-16 10:34:21 +00:00
Matt Turner	b93a195225	isl: Avoid EXPECT_DEATH in unit tests EXPECT_DEATH works by forking the process and letting the forked process fail with an assertion. This process is evidently incredibly expensive, taking ~30 seconds to run the whole isl_aux_info_test on a 2.8GHz Skylake. Annoyingly, all of the (expected) assertion failures also leave lots of messages in dmesg and potentially generate lots of coredumps. Instead, avoid the expense of fork/exec by redefining assert() and unreachable() in the code we're testing to return a unit-test-only value. With this patch, the test takes ~1ms. Also, while modifying the EXPECT_EQ() calls, reverse the arguments so that the expected value comes first, as is intended. Otherwise gtest failure messages don't make much sense. Fixes: https://gitlab.freedesktop.org/mesa/mesa/issues/2567 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4174> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4174>	2020-03-13 17:48:03 -07:00
Jason Ekstrand	4432dd6ea4	anv: Dump push ranges via VK_KHR_pipeline_executable_properties Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4173> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4173>	2020-03-13 16:31:44 +00:00
Caio Marcelo de Oliveira Filho	f8051f77ea	anv: Remove duplicate code in anv_cmd_buffer_bind_descriptor_set Also use a single condition statement instead of two. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	0a5053b687	anv: Reduce compute pipeline batch_data size The batch associated with the compute pipeline only needs room for a MEDIA_VFE_STATE. So this patch moves the batch_data to each pipeline struct and caps the one in the compute pipeline. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	925df46b7e	anv: Split graphics and compute bits from anv_pipeline Add two new structs that use the anv_pipeline as base. Changed all functions that work on a specific pipeline to use the corresponding struct. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	af33f0d767	anv: Use a separate field in the pipeline for compute shader This is a preparation for splitting the compute and graphics pipelines into separate structs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	bff45b6a7f	anv: Decouple flush_descriptor_sets() from pipeline struct Explicitly pass the active stages and the array (and size) of shaders to be processed. This will make it easy to store only the shaders needed for each pipeline. The active stages can be identified by a non-NULL shader in the shaders array, so stop using it and keep track of the flushed stages as iteration happens. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	6df0ac2653	anv: Decouple flush_descriptor_sets() helpers from pipeline struct Pass the `anv_shader_bin *` instead of expecting the helpers to peek into the pipeline struct. Also reach for the device from the cmd_buffer instead of the pipeline. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	d1c13f01aa	anv: Remove redundant check in flush_descriptor_sets() helpers These helpers are only called for stages that are active, so the code for a non-active stage is never executed. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	eec04c0aae	anv: Pass the right pipe_state to flush_descriptor_sets() The caller has this information, so pass directly instead of making each helper function call figure that one out. Also, since we can reach the pipeline from pipe_state, drop that parameter from the function. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	88df3bf79a	anv: Keep the shader stage in anv_shader_bin This will be used to decouple the logic flush_descriptor_sets() from the position in the shader array, allowing us to store just the shaders needed for each pipeline. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	9bf044d254	anv: Use a dynamic array for storing executables in pipeline Avoids waste for pipelines that don't use all the shaders, and is flexible enough to cover cases where there are multiple variants per shader (e.g. SIMD8/16/32 for fragment shader). Even though we could pre-calculate the exact size of the array, this is not a critical path so it is worth preventing the bug that will likely happen when new variants are added but not accounted for. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	9b0682df82	anv: Use pipeline type to decide whether or not to lower multiview Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	613c9b78e3	anv: Add a new enum to identify the pipeline type Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>	2020-03-12 13:18:54 -07:00
Caio Marcelo de Oliveira Filho	7d54b84d49	intel/fs: Combine adjacent memory barriers This will avoid generating multiple identical fences in a row. For Gen11+ we have multiple types of fences (affecting different variable modes), but it is still better to combine them in a single scoped barrier so that the translation to backend IR has the option of dispatching both fences in parallel. This will clean up redundant barriers from various dEQP-VK.memory_model.* tests. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>	2020-03-12 19:21:36 +00:00
Jason Ekstrand	6310c666a4	intel/isl: Set DepthStencilResource based on aux usage In ISL, usage flags only carry intent and not semantic meaning. We don't have a bulletproof way in ISL to specify that an image is of depth/stencil type. The usage flags are great but blorp, for instance, loves to disrespect them. One proposed solution to this problem is to add explicit depth/stencil formats which are distinct from the corresponding color formats. Fortunately, however, empirical evidence suggests that this bit only affects the sampler's interpretation of the CCS data. Therefore, we can set the bit based off of the aux_usage which is now very specific and does carry semantic meaning. In particular, aux_usage now makes a distinction between color CCS and depth/stencil CCS which appears to be exactly what the DepthStencilResource bit is for. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>	2020-03-12 17:51:28 +00:00
Jason Ekstrand	f047e504a5	intel: Require ISL_AUX_USAGE_STC_CCS for stencil CCS Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>	2020-03-12 17:51:28 +00:00
Jason Ekstrand	69a0150e4e	intel/blorp: Allow STC_CCS in blit sources Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>	2020-03-12 17:51:28 +00:00
Jason Ekstrand	6fa92cd015	intel/isl: Add a separate ISL_AUX_USAGE_STC_CCS Stencil CCS is slightly different from color CCS. Using a color CCS resolve with stencil CCS doesn't do the right thing and you can't sample from a stencil CCS image without the DepthStencilResource bit set or you will get the wrong data. Stencil CCS also has its own rules such as it doesn't support fast-clear and has no partial resolve. This seems to indicate that it should probably be its own isl_aux_usage. Now that adding new isl_aux_usage values is pretty cheap, let's split stencil CCS out on its own. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>	2020-03-12 17:51:28 +00:00
Jason Ekstrand	05a8e981ad	intel/isl: Require ISL_AUX_USAGE_HIZ_CCS_WT for HZ+CCS WT mode We also delete the badly named isl_surf_supports_hiz_ccs_wt. The name is misleading because it doesn't return whether or not the surface supports HiZ+CCS in write-through mode (any single-sampled HiZ+CCS capable surface does) but rather a heuristic decision about whether or not we want to enable write-through mode based on the usage flags in the isl_surf. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>	2020-03-12 17:51:28 +00:00
Jason Ekstrand	ff1f0a720d	iris: Use ISL_AUX_USAGE_HIZ_CCS_WT to indicate write-through HiZ Previously, we always set the aux_usage to ISL_AUX_USAGE_HIZ_CCS and let ISL choose write-through based on isl_surf_supports_hiz_ccs_wt. This commit makes us choose explicitly at surface creation time whether to use HIZ_CCS or HIZ_CCS_WT based on the same set of conditions. This is more explicit and should be more robust as it lets us choose WT mode in one place rather than trusting isl_surf_supports_hiz_ccs_wt to return the same thing every time. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>	2020-03-12 17:51:28 +00:00
Jason Ekstrand	e13ed0e9e5	intel/blorp: Allow HIZ_CCS_WT in copy sources Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>	2020-03-12 17:51:28 +00:00
Jason Ekstrand	98dc7f56b7	intel/isl: Add a separate ISL_AUX_USAGE_HIZ_CCS_WT This is distinct from ISL_AUX_USAGE_HIZ_CCS in that the HiZ surface operates in write-through mode which means that the HiZ surface is only used for depth-testing acceleration and the CCS-compressed main surface is always valid so we can texture from it. Separating full HiZ from write-through mode at the isl_aux_usage level has a couple of advantages: 1. It's more explicit. Instead of write-through mode depending on the heuristic decision in isl_surf_supports_hiz_ccs_wt, it's now something that's explicitly requested by the driver. This should be more robust than hoping isl_surf_supports_hiz_ccs_wt always returns the same thing every time. If someone (say BLORP) ever drops a usage flag on the isl_surf, there's a chance it could return a different value without us noticing leading to corruptions. 2. Because ISL_AUX_USAGE_HIZ_CCS_WT is its own isl_aux_usage flag, we can say inside the driver that HIZ_CCS does not support sampling but HIZ_CCS_WT does. We can also pass HIZ_CCS_WT to isl_surf_fill_state and it can do some validation for us beyond what we would be able to do if we conflate HIZ_CCS_WT and CCS_E. 3. In the future, we can add new heuristics to the driver which do things such as start all depth surfaces (regardless of usage flags) off in HIZ_CCS and then do a full resolve and drop to HIZ_CCS_WT the first time it gets used by the sampler. This would potentially let us enable the faster HIZ_CCS mode even in cases where it technically comes in through the API as a texture. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>	2020-03-12 17:51:28 +00:00
Jason Ekstrand	feaedc1fbe	intel/isl: Clean up some aux surface logic The first check is redundant because the first thing we do in the "emit the aux surface" section is assert that we actually have an aux_surf. The second check involves an exclusion list of things which don't have aux surfaces on Gen12 but an inclusion list is much simpler because it's just "does it have MCS?". Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>	2020-03-12 17:51:28 +00:00
Ian Romanick	ba88e95187	intel/fs: Fix NULL destinations on 3-source instructions again after late DCE We considered moving this down near the call to insert_gen4_send_dependency_workarounds. By that point it's too late for a couple reasons. One, we're potentially increasing register pressure that may lead to another spill. Two, fixup_3src_null_dest tries to allocate a VGRF, but the post-register allocation shader uses physical registers. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2621 Fixes: `ba2fa1ceaf` ("intel/fs: Do cmod prop again after scheduling") Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4155> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4155>	2020-03-12 08:22:43 -07:00
Yevhenii Kolesnikov	32b7ba66b0	intel/compiler: fix cmod propagation optimisations Knowing following: - CMP writes to flag register the result of applying cmod to the `src0 - src1`. After that it stores the same value to dst. Other instructions first store their result to dst, and then store cmod(dst) to the flag register. - inst is either CMP or MOV - inst->dst is null - inst->src[0] overlaps with scan_inst->dst - inst->src[1] is zero - scan_inst wrote to a flag register There can be three possible paths: - scan_inst is CMP: Considering that src0 is either 0x0 (false), or 0xffffffff (true), and src1 is 0x0: - If inst's cmod is NZ, we can always remove scan_inst: NZ is invariant for false and true. This holds even if src0 is NaN: .nz is the only cmod, that returns true for NaN. - .g is invariant if src0 has a UD type - .l is invariant if src0 has a D type - scan_inst and inst have the same cmod: If scan_inst is anything than CMP, it already wrote the appropriate value to the flag register. - else: We can change cmod of scan_inst to that of inst, and remove inst. It is valid as long as we make sure that no instruction uses the flag register between scan_inst and inst. Nine new cmod_propagation unit tests: - cmp_cmpnz - cmp_cmpg - plnnz_cmpnz - plnnz_cmpz () - plnnz_sel_cmpz - cmp_cmpg_D - cmp_cmpg_UD () - cmp_cmpl_D () - cmp_cmpl_UD () this would fail without changes to brw_fs_cmod_propagation. 
This fixes optimisation that used to be illegal (see issue #2154) = Before = 0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F 1: cmp.nz.f0.0(8) null:F, vgrf0:F, 0f = After = 0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F Now it is optimised as such (note change of cmod in line 0): = Before = 0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F 1: cmp.nz.f0.0(8) null:F, vgrf0:F, 0f = After = 0: linterp.nz.f0.0(8) vgrf0:F, g2:F, attr0<0>:F No shaderdb changes Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2154 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3348> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3348>	2020-03-11 21:21:25 +00:00
Danylo Piliaiev	10eee6d8c6	intel/tools: Fix compilation with UBSan Compilation failed with several similar errors: ../src/intel/tools/aub_read.c:322:4: error: case label does not reduce to an integer constant 322 \| case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_HEADER): Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4132> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4132>	2020-03-10 15:20:26 +00:00
Mathias Fröhlich	630154e77b	i965: Move down genX_upload_sbe in profiles. Avoid looping over all VARYING_SLOT_MAX urb_setup array entries from genX_upload_sbe. Prepare an array indirection to the active entries of urb_setup already in the compile step. On upload only walk the active arrays. v2: Use uint8_t to store the attribute numbers. v3: Change loop to build up the array indirection. v4: Rebase. v5: Style fix. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>	2020-03-10 14:28:36 +00:00
Francisco Jerez	45d4665dc7	intel/fs: Fix workaround for VxH indirect addressing bug under control flow. The current workaround for this hardware bug involved marking the ADD instruction used to initialize the address register as NoMask on Gen12, which was based on the assumption that the problem was caused by a hardware bug affecting the application of the execution mask to the address register write. However that doesn't seem to be the case: The address register write was working correctly, the real problem leading to hangs on TGL is that the indirect addressing logic is unable to deal with garbage values in the address register (e.g. misaligned offsets), even for channels which are currently inactive due to non-uniform control flow. The current workaround isn't able to avoid that situation in general, since the result of the NoMask ADD instruction for a dead channel is calculated based on the corresponding (dead) component of the indirect_byte_offset source, which would still be undefined in the likely case that the source was initialized under control flow itself. This would lead to hangs whenever MOV_INDIRECT was used under non-uniform control flow in some scenarios like a tessellation shader from GFXBench5/gl_4 (AKA Car Chase) on TGL. In addition I've managed to reproduce the same issue on earlier platforms by initializing the whole address register with garbage before the ADD instruction, so this seems to be a long-standing issue we have avoided mostly by luck. 
This patch fixes the problem and applies the workaround to all platforms, since even when the hardware is able to deal with garbage address values without hanging there might be a significant performance cost from reading random GRF registers due to the useless extra EU cycles spent fetching registers for dead channels and due to the potential for unintended serialization with respect to other random instructions that could be executed in parallel, which may have had a cost of the order of hundreds of cycles in the worst case scenario. Fixes: `f93dfb509c` "intel/fs: Write the address register with NoMask for MOV_INDIRECT" Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-03-10 00:42:50 +00:00
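
As a rough illustration of the linear-tiling requirement described in commit 157a3cf3ec above (the row pitch must be aligned to 64 bytes for the display HW), the sketch below rounds a row pitch up to the next multiple of 64. It is written in Python for brevity and is not the ISL code: the names `align_up` and `linear_display_row_pitch`, and the width and bytes-per-pixel parameters, are hypothetical and chosen only for this example.

```python
def align_up(value: int, alignment: int) -> int:
    """Round a non-negative value up to the next multiple of alignment.

    alignment is assumed to be a positive power of two, which is what
    makes the mask trick below valid.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    if value < 0:
        raise ValueError("value must be non-negative")
    return (value + alignment - 1) & ~(alignment - 1)


def linear_display_row_pitch(width_px: int, bytes_per_pixel: int) -> int:
    """Hypothetical helper: minimum row pitch in bytes for a linear
    surface, padded to the 64-byte alignment named in the commit."""
    return align_up(width_px * bytes_per_pixel, 64)


if __name__ == "__main__":
    # A 1366-pixel-wide, 4-byte-per-pixel row is 5464 bytes unpadded.
    print(linear_display_row_pitch(1366, 4))
```

A row that is already a multiple of 64 bytes is returned unchanged, so the padding only costs memory for widths that do not divide evenly.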


5342 commits