fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 11:38:06 +02:00

Author	SHA1	Message	Date
Karol Herbst	d0c6ef2793	nir: rename global/local to private/function memory the naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all threads of a work group the solution where everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup. v2: rename local to function as well v3: rename vtn_variable_mode_local as well Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-08 18:51:46 +01:00
Jason Ekstrand	05d72d6d48	spirv: Sort supported capabilities Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-07 18:41:15 -06:00
Jason Ekstrand	63b9aa2e25	spirv: Add support for using derefs for UBO/SSBO access For now, it's hidden behind a cap. Hopefully, we can eventually drop that along with all the manual offset code in spirv_to_nir. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	adc155a815	spirv: Add explicit pointer types Instead of baking in uvec2 for UBO and SSBO pointers and uint for push constant and shared memory pointers, make it configurable. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	fc9c4f89b8	nir: Move propagation of cast derefs to a new nir_opt_deref pass We're going to want to do more deref optimizations going forward and this gives us a central place to do them. Also, cast propagation will get a bit more complicated with the addition of ptr_as_array derefs. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Bas Nieuwenhuizen	3cc940277a	radv: Fix rasterization precision bits. Note that these limits are exact, not a "precision is at least x", as texel coords also get snapped to a multiple of this step size before filtering. This fixes CTS tests dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat dEQP-VK.texture.explicit_lod.2d.sizes.57x35_nearest_linear_mipmap_nearest_repeat Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109151 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 23:27:30 +01:00
Bas Nieuwenhuizen	be6cee51c0	amd/common: Add some parentheses to silence warning. [1/59] Compiling C object 'src/amd/common/src@amd@common@@amd_common@sta/ac_nir_to_llvm.c.o'. ../mesa/src/amd/common/ac_nir_to_llvm.c: In function ‘get_inst_tessfactor_writemask’: ../mesa/src/amd/common/ac_nir_to_llvm.c:4089:32: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses] writemask = ((1 << num_comps + 1) - 1) << first_component; ~~~~~~~~~~^~~ ../mesa/src/amd/common/ac_nir_to_llvm.c:4091:33: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses] writemask = (((1 << num_comps + 1) - 1) << first_component) << 4; Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 23:15:37 +01:00
Bas Nieuwenhuizen	64c83efaee	radv: Remove unused variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 23:15:33 +01:00
Bas Nieuwenhuizen	656c1c488c	radv: Remove device path. unused and gcc complains about strncpy. (from what I can see because strncpy does not leave a 0 byte on truncate. That said we don't use it so this does not fix a real bug). Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 23:15:14 +01:00
Marek Olšák	492ad9a402	ac: remove unused variable from ac_build_ddxy trivial	2019-01-07 14:51:25 -05:00
Bas Nieuwenhuizen	9a45a190ad	radv: Implement buffer stores with less than 4 components. We started using it in the btoi paths for r32g32b32, and the LLVM IR checker will complain about it because we end up with intrinsics with the wrong type extension in the name. Fixes: `593996bc02` ("radv: implement buffer to image operations for R32G32B32") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 14:54:14 +01:00
Timothy Arceri	50de3f80a8	nir: rename nir_link_constant_varyings() nir_link_opt_varyings() The following patches will add support for an additional optimisation so this function will no longer just optimise varying constants. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-02 12:19:17 +11:00
Timothy Arceri	2832bc972b	ac/nir_to_llvm: add ac_are_tessfactors_def_in_all_invocs() The following patch will use this with the radeonsi NIR backend but I've added it to ac so we can use it with RADV in future. This is a NIR implementation of the tgsi function tgsi_scan_tess_ctrl(). Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-02 10:01:24 +11:00
Bas Nieuwenhuizen	8c93ef5de9	radv: Do a cache flush if needed before reading predicates. This caused random failures for two conditional rendering tests: dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_discard dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_no_discard These wrote the predicate with the vertex shader, did a barrier and then started the conditional rendering. However the cache flushes for the barrier only happen on first draw, so after the predicate has been read. Fixes: `e45ba51ea4` "radv: add support for VK_EXT_conditional_rendering" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-12-31 20:52:08 +01:00
Bas Nieuwenhuizen	bba5749484	radv: Fix wrongly positioned paren. Trivial. Fixes: `9f0bfbed11` "radv: Work around non-renderable 128bpp compressed 3d textures on GFX9."	2018-12-21 21:06:55 +01:00
Samuel Pitoiset	9606310081	radv: enable shaderStorageImageMultisample feature on GFX8+ Untested on older chips. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:19 +01:00
Samuel Pitoiset	6b976024a8	radv: add support for FMASK expand Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:17 +01:00
Samuel Pitoiset	fa16da53d8	radv: initialize FMASK for images in fully expanded mode The value depends on the number of samples. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:15 +01:00
Samuel Pitoiset	65d82c84d2	ac/nir: restrict fmask lookup to image load intrinsics We don't ever want to do the fmask lookup on an atomic or store, the fmask should have been decompressed if the surface has been moved to IMAGE_LAYOUT. Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:11 +01:00
Samuel Pitoiset	5b1ec10e4c	radv: compute optimal VM alignment for imported buffers This fixes GPU hangs on GFX9 with dEQP-VK.memory.external_memory_host.bind_image_memory_and_render.with_zero_offset.* Copied from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 17:34:04 +01:00
Bas Nieuwenhuizen	9f0bfbed11	radv: Work around non-renderable 128bpp compressed 3d textures on GFX9. Exactly what title says, the new addrlib does not allow the above with certain dimensions that the CTS seems to hit. Work around it by not allowing the app to render to it via compat with other 128bpp formats and do not render to it ourselves during copies. Fixes: `776b911365` "amd/addrlib: update Mesa's copy of addrlib" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-20 15:07:20 +01:00
Samuel Pitoiset	5c7935f8fc	radv: fix subpass image transitions with multiviews The driver needs to decompress all image layers if a fast depth/color clear has been performed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 13:36:37 +01:00
Samuel Pitoiset	0a7e767e58	radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8 This workaround has been introduced by `135e4d434f` for fixing DXVK GPU hangs with many games. It is no longer needed since LLVM r345718. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 12:09:57 +01:00
Samuel Pitoiset	576040f2e5	ac/nir: remove the bitfield_extract workaround for LLVM 8 This workaround has been introduced by `3d41757788` and it is no longer needed since LLVM r346422. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-20 09:40:16 +01:00
Jason Ekstrand	ec1d5841fa	radv/query: Use 1-bit booleans in query shaders Fixes: `44227453ec` "nir: Switch to using 1-bit Booleans for almost..." Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-19 16:36:40 -06:00
Jason Ekstrand	6896c91c10	radv/query: Add a nir_test_flag helper This is little more than an iadd_imm right now but it will help in the next commit where we refactor things further. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-19 16:36:26 -06:00
Nicolai Hähnle	23af72af25	radeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:32 +01:00
Nicolai Hähnle	0ef263d62f	ac/surface: 3D and cube surfaces are never displayable Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:22 +01:00
Nicolai Hähnle	8efaffa893	amd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan Allow for a unified but efficient treatment of adding a bitmask over a wave or an entire threadgroup. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:19 +01:00
Nicolai Hähnle	300876a9a7	amd/common: scan/reduce across waves of a workgroup Order-aware scan/reduce can trade-off LDS traffic for external atomics memory traffic in producer/consumer compute shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:17 +01:00
Nicolai Hähnle	3963402fd3	amd/common: add ac_build_ifcc Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:15 +01:00
Nicolai Hähnle	3c77f26ccc	amd/common: whitespace fixes Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:12 +01:00
Nicolai Hähnle	76c5ad1995	amd/sid_tables: add additional python3 compatibility imports This happened to bite me while doing some experiments. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:09 +01:00
Ian Romanick	378f996771	nir/opt_peephole_select: Don't peephole_select expensive math instructions On some GPUs, especially older Intel GPUs, some math instructions are very expensive. On those architectures, don't reduce flow control to a csel if one of the branches contains one of these expensive math instructions. This prevents a bunch of cycle count regressions on pre-Gen6 platforms with a later patch (intel/compiler: More peephole select for pre-Gen6). v2: Remove stray #if block. Noticed by Thomas. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Ian Romanick	09b7e1d8e4	nir/opt_peephole_select: Don't try to remove flow control around indirect loads That flow control may be trying to avoid invalid loads. On at least some platforms, those loads can also be expensive. No shader-db changes on any Intel platform (even with the later patch "intel/compiler: More peephole select"). v2: Add a 'indirect_load_ok' flag to nir_opt_peephole_select. Suggested by Rob. See also the big comment in src/intel/compiler/brw_nir.c. v3: Use nir_deref_instr_has_indirect instead of deref_has_indirect (from nir_lower_io_arrays_to_elements.c). v4: Fix inverted condition in brw_nir.c. Noticed by Lionel. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Bas Nieuwenhuizen	f67dea5e19	radv: Fix multiview depth clears We were not using the view mask for depth clears, causing only the first view to be cleared. Fixes: `2e86f6b259` "radv: Add multiview clears." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-17 20:16:26 +00:00
Bas Nieuwenhuizen	9add63a3a5	radv: Remove redundant format check. The switch directly after the check has a default case that returns NULL too, so the effective return value is not changed. Also this check is wrong once we start dealing with formats introduced by an extension (e.g. YUV formats). Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-17 20:09:38 +00:00
Samuel Pitoiset	445867c80d	radv: report Vulkan version 1.1.90 for real I thought the value was correctly propagated, but actually not. Fixes: `2ac6d55f38` ("radv: bump reported version to 1.1.90") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-17 17:51:48 +01:00
Jason Ekstrand	cae373117c	anv,radv: Re-enable VK_EXT_pci_bus_info Now at version 2 with the fixed header. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 10:42:35 -06:00
Rhys Perry	ef198e8c6a	radv: switch from nir_bcsel to nir_b32csel Fixes: `191a1dce92` ('nir: Add 1-bit Boolean opcodes') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-17 14:52:39 +00:00
Rhys Perry	bba94a3d85	radv: don't set surf_index for stencil-only images Fixes: `f8d5b377c8` ('radv: set cb base tile swizzles for MRT speedups (v4)') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108116 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-17 14:52:10 +00:00
Jason Ekstrand	47e1e0692c	radv: Fix a stupid if in gather_intrinsic_info Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 15:06:07 -06:00
Jason Ekstrand	11dc130779	nir: Add a bool to int32 lowering pass We also enable it in all of the NIR drivers. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	80e8dfe9de	nir: Rename Boolean-related opcodes to include 32 in the name This is a squash of a bunch of individual changes: nir/builder: Generate 32-bit bool opcodes transparently nir/algebraic: Remap Boolean opcodes to the 32-bit variant Use 32-bit opcodes in the NIR producers and optimizations Generated with a little hand-editing and the following sed commands: sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c Use 32-bit opcodes in the NIR back-ends Generated with a little hand-editing and the following sed commands: sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Rhys Perry	bde9f482de	ac: split 16-bit ssbo loads that may not be dword aligned Fixes: `7e7ee82698` ('ac: add support for 16bit buffer loads') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108114 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-16 14:56:10 +00:00
Rhys Perry	12dc7cb202	ac: refactor visit_load_buffer This is so that we can split different types of loads more easily. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-16 14:56:10 +00:00
Dave Airlie	b3f2b03ece	radv/xfb: fix counter buffer bounds checks. If we gave this function 0 counter buffers, we'd still try and access pCounterBuffers[0] as this check was incorrect. Fixes crash with ext_transform_feedback-pipeline-basic-primgen on zink on radv. Fixes: `677b496b6` (radv: fix begin/end transform feedback with 0 counter buffers.) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-13 19:27:05 +00:00
Samuel Pitoiset	5088ba2aeb	radv: don't check if format is depth in radv_image_can_enable_htile() This is always TRUE if htile_size is not 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:21 +01:00
Samuel Pitoiset	eb0034fe28	radv: check if addrlib enabled HTILE in radv_image_can_enable_htile() When htile_size is 0, we can't enable HTILE. This doesn't change anything, except not calling radv_image_alloc_htile(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:19 +01:00
Samuel Pitoiset	d8325f1f07	radv: switch on EOP when primitive restart is enabled with triangle strips Otherwise, Yakuza hangs the GPU with DXVK. We don't know if linestrip and pointlist are affected, so my point is to do that only for triangle strips. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:16 +01:00


2934 commits