fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 22:08:10 +02:00

Author	SHA1	Message	Date
Francisco Jerez	79fa3eba11	intel/fs/xe2+: Add ALU-based implementation of barycentric interpolation at a per-channel sample. This implements a replacement for the previous implementation of nir_intrinsic_load_barycentric_at_sample that relied on the Pixel Interpolator shared function, since it's going to be removed from the hardware from Xe2 onwards. This implementation simply looks up the X/Y offsets of each sample index on the table provided in the PS thread payload by using indirect addressing, then does the actual interpolation by recursing into emit_pixel_interpolater_alu_at_offset() introduced in the previous commit. Note that even though this is only immediately useful on Xe2+ platforms there's no reason why it shouldn't work on earlier platforms, as long as we have the sample X/Y offsets available in the thread payload. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29847>	2024-06-27 00:18:00 +00:00
Francisco Jerez	95eec5a0dd	intel/fs/xe2+: Add ALU-based implementation of barycentric interpolation at a per-channel offset. This implements a replacement for the previous implementation of nir_intrinsic_load_barycentric_at_offset that relied on the Pixel Interpolator shared function, since it's going to be removed from the hardware from Xe2 onwards. That's okay since we can get all the primitive setup information needed for interpolation at an arbitrary coordinate: We use the X/Y offset relative to the "X/Y Start" coordinates from the thread payload in order to evaluate the plane equations also provided in the thread payload for each barycentric coordinate of each polygon. The evaluation of the barycentric plane equations (and the RHW plane equation for perspective-correct interpolation) uses the accumulator and MAD/MAC for ALU efficiency, but that means we need to manually split instructions to fit the width of the accumulator. The division and scaling for perspective-correct interpolation is also now done in the shader if necessary. Note that even though this is only immediately useful on Xe2+, the thread payload numbers are filled out for older platforms, and the EU restrictions of previous Xe platforms are taken into account, mostly for the purposes of testing and performance evaluation. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29847>	2024-06-27 00:18:00 +00:00
Francisco Jerez	e8007c9325	intel/fs/xe2+: Don't lower barycentric load offsets to fixed-point format on Xe2+. Floating-point offsets work fine in combination with the floating-point arithmetic we're about to lower these intrinsics into, and they require fewer instructions than converting to fixed-point and then back. No reason to take the precision/range hit nor the extra instructions. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29847>	2024-06-27 00:18:00 +00:00
Francisco Jerez	04b5b8b9ec	anv/gfx11+: Request PS payload fields for ALU-based interpolation via 3DSTATE_PS_EXTRA. Plumb the prog_data bits recently introduced for ALU-based interpolation down to 3DSTATE_PS_EXTRA emission in the Vulkan driver. Even though this is only going to be used on Xe2+ for now there seems to be no reason not to plumb the bits on all platforms back to gfx11, since the 3DSTATE_PS_EXTRA enables already existed on ICL. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29847>	2024-06-27 00:18:00 +00:00
Francisco Jerez	3d30cc82f9	intel/fs/xe2+: Ask driver for PS payload registers based on barycentric load intrinsics in use. The ALU-based implementation of the barycentric interpolation intrinsics introduced by a subsequent commit will require some primitive setup information not delivered in the PS thread payload unless explicitly requested: - "Source Depth and/or W Attribute Vertex Deltas" if a perspective-correct interpolation mode is used -- Note that this is already requested for CPS interpolation, we just need to enable it in more cases. - "Perspective Bary Planes" if a perspective-correct interpolation mode is used. - "Non-Perspective Bary Planes" if a non-perspective-corrected interpolation mode is used. - "Sample offsets" if any at_sample interpolation is used so the coordinate offsets of the sample can be calculated. This ALU implementation of barycentric interpolation will only be needed for _at_offset and _at_sample interpolation, since the fixed function hardware still computes barycentrics for us at the current sample coordinates, only the cases that previously relied on the Pixel Interpolator shared function need to be re-implemented with ALU instructions, since that shared function will no longer exist on Xe2 hardware. Thanks to Rohan for a bugfix of the uses_sample_offsets calculation, this patch includes his fix squashed in. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29847>	2024-06-27 00:18:00 +00:00
Eli Schwartz	e60dcaa71d	meson: add various generated header dependencies as order-only deps https://mesonbuild.com/FAQ.html#how-do-i-tell-meson-that-my-sources-use-generated-headers A few locations had underspecified deps on the header files, and this caused builds to fail given sufficient parallelism. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29115>	2024-06-26 22:54:50 +00:00
Ian Romanick	5bc05c6f53	intel/tools: Advertise I915_PARAM_HAS_EXEC_TIMELINE_FENCES This has been required from the kernel for quite some time, but it wasn't (and technically still isn't) explicitly checked. Commit `7da5b1caef` changed the code paths such that an assertion is hit when I915_PARAM_HAS_EXEC_TIMELINE_FENCES is not available. Fixes: `7da5b1caef` ("anv: move trtt submissions over to the anv_async_submit") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29920>	2024-06-26 20:00:26 +00:00
Jianxun Zhang	dc26ad1e86	anv: Update synchronization of fast clear (xe2) Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:44 +00:00
Jianxun Zhang	930ea030ed	isl: Initialize the last usage in isl_encode_aux_mode[] (xe2) The ISL_AUX_USAGE_STC_CCS is the last defined usage. We could get a random value from isl_encode_aux_mode[] once it is passed as index if its element is not initialized. Explicit initialization of ISL_AUX_USAGE_HIZ_CCS_WT is added too. Suggested by Nanley Chery <nanley.g.chery@intel.com> Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:44 +00:00
Jianxun Zhang	9d3ce65628	blorp: Don't convert ccs_e formats for copy (xe2) Fix: dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_linear blorp_blit.c:2770: get_ccs_compatible_copy_format: Assertion `!"" "Not a compressible format"' failed. Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:43 +00:00
Jianxun Zhang	255889a795	isl: Remove restriction of CCS_E support on formats (xe2) Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:43 +00:00
Jianxun Zhang	6073f091bb	anv: Disable PAT-based compression on depth images (xe2) Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:43 +00:00
Jianxun Zhang	e835b53a03	anv: Don't enable compression on external bos (xe2) Fix: dEQP-VK.synchronization.cross_instance.suballocated.write_draw_indexed_read_blit_image.image_128x128_r16_uint_binary_semaphore_fence_fd Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:43 +00:00
Jianxun Zhang	0b75f89f57	anv: Don't enable compression with modifiers (xe2) Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:43 +00:00
Jianxun Zhang	1c92b31888	intel/genxml,blorp,common: Update 3DSTATE_PS command (xe2) From Bspec 56423 (r58507), the legacy full resolving and partial resolving options are gone since Xe2. They also cause hangs on Xe2 if not disabled. Some suggested code from Nanley Chery <nanley.g.chery@intel.com> is included. Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:43 +00:00
Jianxun Zhang	4dfc3367fc	blorp: Pass down fast clear color value (xe2) Also add a quote of Bspec for previous platforms. Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:43 +00:00
Jianxun Zhang	3269d505e7	blorp: Get fast clear rectangle of non-MSAA surfaces (xe2) Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:43 +00:00
Jianxun Zhang	3b89bdb96e	isl: Don't set clear values or their address (xe2) The render surface state doesn't have these features any more since Xe2. Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>	2024-06-26 05:25:43 +00:00
Jianxun Zhang	7be1912625	isl: Update render CMF mapping (xe2) Update mapping between render target surface formats and compression formats. Some preexisting correct mappings are also re-ordered to the order of types in the spec for an easier verification (top to bottom and left to right). Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29905>	2024-06-25 23:02:14 +00:00
Jordan Justen	a985576755	isl: Implement isl_get_render_compression_format for xe2 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29905>	2024-06-25 23:02:14 +00:00
Jordan Justen	bb6e8cab79	isl: Move isl_get_render_compression_format in isl_genX_helpers.h Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29905>	2024-06-25 23:02:14 +00:00
Ian Romanick	2bbd0fd9da	intel/brw/xe2+: Add LNL cooperative matrix configurations Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>	2024-06-25 14:17:47 -07:00
Ian Romanick	556e78f737	intel/brw/xe2+: Allow vec16 for cooperative matrix Xe2 will allow a B matrix large enough that it will be stored in a vec16. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>	2024-06-25 14:17:47 -07:00
Ian Romanick	b6236dd8f3	intel/brw/xe2+: Adjust DPAS lowering to DP4A to accommodate larger GRF and SIMD16 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>	2024-06-25 14:17:47 -07:00
Ian Romanick	77ef241577	intel/brw/xe2+: Scale size_written by reg_unit for DPAS Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>	2024-06-25 14:17:47 -07:00
Ian Romanick	e368b8e01b	intel/brw/xe2+: Adjust size_read() for DPAS v2: Remove "DG2" from a comment because it applies to DG2 and Xe2. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>	2024-06-25 14:17:47 -07:00
Ian Romanick	b051602754	intel/brw/xe2+: Catch invalid uses of writes_accumulator earlier It turns out the problem I was trying to catch in `be4fa59a72` ("intel/brw: Clear write_accumulator flag when changing the destination") also came from the DPAS lowering pass itself. Checking for invalid uses of the feature in fs_validate helped detect the problem. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>	2024-06-25 14:17:47 -07:00
Ian Romanick	7a773ac53e	intel/brw: Major rework of lower_cmat_load_store The original goal was to get rid of a bunch of the magic constants sprinkled through the function. Once I did that, I realized that there was a lot more symmetry possible between the row-major and column-major paths. It's +6 lines of code, but about 15 of those lines are comments explaining things that were not obvious in the original code. v2: Save duplicated condition in a variable with a meaningful name. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>	2024-06-25 14:16:48 -07:00
Ian Romanick	ea6e10c0b2	intel/brw: Temporarily disable result=float16 matrix configs Even though the hardware does not natively support these configurations, there are many potential benefits to advertising them. These configurations can theoretically use half the memory bandwidth for loads and stores. For large matrices, that can be the limiting factor in performance. The current implementation, however, has a number of significant problems. The conversion from float16 to float32 is performed in the driver during conversion from NIR. As a result, many common usage patterns end up doing back-to-back conversions to and from float16 between matrix multiplications (when the result of one multiplication is used as the accumulator for the next). The float16 version of the matrix wastes half the possible register space. Each float16 value sits alone in a dword. This is done so that the per-invocation slice of an 8x8 float16 result matrix and an 8x8 float32 result matrix will have the same number of elements. This makes it possible to do straightforward implementations of all the unary_op type conversions in NIR. It would be possible to perform N:M element type conversions in the backend using specialized NIR intrinsics. However, per #10961, this would be very, very painful. My hope is that, once a suitable resolution for that issue can be found, support for these configs can be restored. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>	2024-06-25 13:52:12 -07:00
Juston Li	33dd38f9d5	anv/android: set ANV_BO_ALLOC_EXTERNAL for imported AHW This fixes some cacheline flush artifacts Signed-off-by: Juston Li <justonli@google.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29882>	2024-06-25 20:21:27 +00:00
José Roberto de Souza	2d29dee889	intel/perf: Extend intel_perf_query_result_read_gt_frequency() to gfx 20 BSpec 62720 states that the previous and current offsets remain the same as previous gfx versions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>	2024-06-25 14:16:45 +00:00
José Roberto de Souza	0a6fe638f3	intel/perf: Add INTEL_PERF_QUERY_FIELD_TYPE_SRM_OA_PEC Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>	2024-06-25 14:16:45 +00:00
José Roberto de Souza	6e1852981b	intel/perf: Add LNL OA XML Also added pec_offset to struct intel_perf_query_info and two new hw variables needed by this XML, those changes are required to at least compile with this new XML. pec_offset will be set in the next patches. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>	2024-06-25 14:16:45 +00:00
José Roberto de Souza	5b8b4f7878	intel/dev: Add engine_class_supported_count to intel_device_info Next patch will need to frequently get the count of supported engines for compute and copy engines, so to reduce the overhead of doing KMD queries at every call, cache this information in the intel_device_info struct. With that, ANV and Iris need to set this information, as intel/dev can't depend on intel/common, so add a single function to update intel_device_info with all fields filled by intel/common functions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>	2024-06-25 14:16:45 +00:00
José Roberto de Souza	2f2a0bc083	intel/perf: Add assert to check if allocated enough query fields Xe2 platforms will have way more query fields and the allocation will need to be increased, but first let's add a function to return the max_fields and assert on any attempt to access more query fields than allocated. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>	2024-06-25 14:16:45 +00:00
José Roberto de Souza	0a51842f7a	intel/perf: Change order of if blocks Most places we follow the newest GFX version first, so doing that here. No changes in behavior expected. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>	2024-06-25 14:16:45 +00:00
Kenneth Graunke	5cb15a6c67	intel/brw: Make bld.ADD(x, 0) emit no instructions and return x directly There are a lot of places where we add 0 to an offset. Avoiding generating this can save us algebraic + copy_propagation later. Cuts compile time in Borderlands 3 by -0.590631% +/- 0.170108% (n=25). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29849>	2024-06-24 19:12:21 -07:00
Kenneth Graunke	068865ce81	intel/brw: Make an alu2 builder helper Instead of replicating the whole thing in macros, just make an alu2() function and use that in the wrappers. It ought to get inlined anyway. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29849>	2024-06-24 19:12:19 -07:00
Kenneth Graunke	c18de3f048	intel/brw: Delay liveness calculations in saturate propagation Wait and see if we actually have a candidate for saturate propagation before requesting liveness info. Saves the calculation in the case where we have nothing to do. Cuts compile time in Borderlands 3 by -0.304754% +/- 0.194162% (n=25). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29849>	2024-06-24 19:12:00 -07:00
Paulo Zanoni	41a95d0b13	anv/sparse: use ANV_SPARSE_BLOCK_SIZE instead of tile_size when possible When I wrote sparse resources support for Anv we didn't have TileYs support so I made non-opaque binds work even for non-standard block shapes, which meant the block size could be either 64k or 4k. Since then we merged TileYs support and changed our sparse resources implementation to treat all the non-standard block shape cases as "everything is the miptail", which means non-opaque binds are not possible. So here we adjust the code to more explicitly represent that. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>	2024-06-24 17:54:30 +00:00
Paulo Zanoni	8271e12b8e	anv/sparse: unify and rework tile size calculation There are 3 different places in our code where we calculate the tile size and until recently the 3 implementations were different and with slight bugs. Unify everything and also change the calculation to use tile_info->phys_extent_B. While doing this we move the isl_surf_get_tile_info() calls from anv_sparse_calc_block_shape() to its callers so the total number of times we call it doesn't change. v2: Adjust the patch now that tile_info is not part of isl_surf anymore. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1) Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>	2024-06-24 17:54:30 +00:00
Paulo Zanoni	2ac35116d1	anv/sparse: remove obsolete linear tiling code path The code that tries to create a "pretend block shape" for linear tiling surfaces was necessary back when we were going to support sparse residency (non-opaque binds) for non-standard block shapes (since there was uncertainty about TileYs support). That hasn't been the case since before we merged sparse resources upstream, so remove the code and leave an assertion instead, just in case. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>	2024-06-24 17:54:30 +00:00
Paulo Zanoni	2f65acfbb8	anv/sparse: fix TR-TT page table bo size and flags Since commit `18d8c3ca33` we were allocating a little more than what we were actually using (2621440 bytes instead of 2097152, aka 0x280000 instead of 0x200000), and we were not properly marking the BO as internal. No applications should be misbehaving because of this. Fixes: `18d8c3ca33` ("anv: Add missing ANV_BO_ALLOC_INTERNAL") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>	2024-06-24 17:54:30 +00:00
Paulo Zanoni	23e91fdd64	anv/sparse: dump info about opaque binds when DEBUG_SPARSE I've found myself adding this piece of code to our codebase when debugging some Zink sparse failures recently, so let's upstream it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>	2024-06-24 17:54:30 +00:00
Paulo Zanoni	49504ab857	intel/isl: pass struct isl_tile_info to choose_image_alignment_el() Pass struct isl_tile_info to isl_choose_image_alignment_el() and its subfunctions. We already compute isl_tile_info at isl_surf_init_s(), don't make the subfunctions compute it again, just reuse the results. Other subfunctions of isl_surf_init_s() also take the tile info as an argument instead of recomputing it. v2: Rebase after the gen20 version was added. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1) Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> (v2) Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>	2024-06-24 17:54:30 +00:00
Paulo Zanoni	6a6d449a1d	anv/sparse: fix reporting of VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT This calculation was wrong for both compressed formats and multi-sampled images. As a result, we misreported the image as having a single miptail. No Vulkan or GL CTS tests were tripping on this bug. I found this while looking for tile size calculations after fixing a similar bug elsewhere in the code. The calculation should now match what we have in anv_sparse_bind_image_memory(), which is widely tested. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>	2024-06-24 17:54:30 +00:00
Paulo Zanoni	789b53c523	anv/sparse: fix the image property sizes for multi-sampled images We have to take the number of samples into account when calculating the tile size. If we don't do this, multi-sampled images may end up falling in the "goto out_everything_is_miptail" case, while in reality multi-sampled images don't even have miptails. Also assert that the value is one of the only two values we expect this to be. This assert would have been useful to catch this issue, since with multi-sampled images we were getting values like 16k or 32k depending on the number of samples. This helps move forward progress in some Zink tests, but does not make them fully pass yet, as those tests are full of sub-cases and this only helps some of them: KHR-GL46.sparse_texture2_tests.UncommittedRegionsAccess KHR-GL46.sparse_texture2_tests.SparseTexture2Commitment KHR-GL46.sparse_texture2_tests.SparseTexture2Lookup Fixes: `7ef3d652b2` ("anv/sparse: enable MSAA for Sparse when applicable") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>	2024-06-24 17:54:30 +00:00
Paulo Zanoni	5c18ccd2d3	anv/sparse: reject 1D sparse residency images The Vulkan spec splits sparse resources in two different features: sparse binding and sparse residency. Sparse binding is much simpler. It requires the resources to be fully bound before being used and it treats them as a black box. We're required to support sparse binding for all the formats that are supported by non-sparse, but that's easy because this feature is simpler. Now sparse residency is the one where we're allowed to partially bind resources, and the one that comes with more complicated features such as block shapes and non-opaque binding of images. This feature is subdivided into: - sparseResidencyBuffer - sparseResidencyImage2D - sparseResidencyImage3D - sparseResidency{2,4,8,16}Samples (which refers to 2D images) Notice that there's no sparseResidencyImage1D. And if you read the specs it's clear that sparse residency is meant for non-1D images. Still, supporting it didn't require any extra effort in Anv so we just did it. That's until we started running GL CTS tests on Zink. There's a CTS test that checks for the standard block shapes. It creates 1D images and expects the block shapes for them to be the standard 2D block shapes. While we could very well just patch anv_sparse_calc_image_format_properties() to return the standard 2D block shapes for 1D images, that's just wrong (block shapes for 1D images are just line segments, not rectangles!) so let's just reject this all until maybe one day Vulkan defines sparseResidencyImage1D and we get GL_ARB_sparse_texture3 to match it, or somebody decides to change the GL CTS test. Testcase: KHR-GL46.sparse_texture2_tests.StandardPageSizesTestCase Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>	2024-06-24 17:54:30 +00:00
Nanley Chery	6fc63b1d56	intel/isl: Enable Tile4 for CPB surfaces I got the image alignment requirements for CPCB surfaces from Bspec authors. The vertical alignment value of 8 was confirmed through the Vulkan CTS test group, dEQP-VK.fragment_shading_ratelayered. It also happens to match the QPitch alignment requirement documented in the Bspec. Hopefully the CTS will add tests for LOD2+ in order to exercise the horizontal alignment value. With this in place, we can start using Tile4. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10784 Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29355>	2024-06-24 13:08:51 +00:00
Tapani Pälli	7934b70ff1	isl/iris/anv: provide drirc toggle intel_sampler_route_to_lsc Some applications may benefit from this while some can get a performance hit. Default to false and make it possible to toggle only for selected workloads. See workaround 14022483228 for some measurements. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29760>	2024-06-24 09:23:07 +00:00


12256 commits