fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-03 11:30:21 +01:00

Author	SHA1	Message	Date
Marek Olšák	8b220877ad	radeonsi/gfx9: set registers and shader key for merged ES-GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	067dacd1b1	radeonsi/gfx9: define and set LS-HS user SGPRs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	0588146cb0	radeonsi/gfx9: set up shader registers for merged LS-HS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	983d7e743e	radeonsi: code shuffling in si_emit_derived_tess_state Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-28 21:47:35 +02:00
Marek Olšák	80814819c2	radeonsi/gfx9: don't set deprecated field PARTIAL_ES_WAVE_ON Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Marek Olšák	60a20e6879	radeonsi/gfx9: set MAX_PRIMGRP_IN_WAVE in the correct register Cc: 17.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-26 13:08:05 +02:00
Marek Olšák	5438e39fae	radeonsi: don't allow user indices with indirect draws Not possible with GL and it will make future gallium rework easier. (also it's something I wouldn't like to support) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-17 01:22:11 +02:00
Marek Olšák	1c94d29984	radeonsi: merge two if (indirect) statements Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-04-17 01:22:11 +02:00
Nicolai Hähnle	4f7e3fbb50	radeonsi: fix gl_BaseVertex in non-indexed draws gl_BaseVertex is supposed to be 0 in non-indexed draws. Unfortunately, the way they're implemented, the VGT always generates indices starting at 0, and the VS prolog adds the start index. There's a VGT_INDX_OFFSET register which causes the VGT to start at a driver-defined index. However, this register cannot be written from indirect draws. So fix this unlikely case by setting a bit to tell the VS whether the draw is indexed or not, so that gl_BaseVertex can be adjusted accordingly when used. Fixes a bug in KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters.* Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:31:11 +02:00
Nicolai Hähnle	472c84d1ad	radeonsi: provide VS_STATE input to all VS variants v2: fix incorrect change in get_tcs_out_patch_stride Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:30:20 +02:00
Nicolai Hähnle	3b9fbcb3b6	radeonsi: change the bit-packing of LS out/TCS in data Avoid conflicts when merging various VS state bits. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:30:19 +02:00
Nicolai Hähnle	ff39f0d59c	radeonsi: emit VS_STATE register explicitly from si_draw_vbo We will merge other derived state information into this register. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:30:18 +02:00
Nicolai Hähnle	8c224d3d9f	radeonsi: extract derived tess state emit to higher level Especially with subsequent changes, this makes it easier to see the sequence of state emits at the higher level. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-13 17:30:17 +02:00
Constantine Kharlamov	fa8bc90990	r600g/radeonsi: use the correct types (taken from pipe_draw_info) Note: si_shader.h has also "type" variable that should be changed to "enum pipe_prim_type", however it triggers a bunch of warnings about unhandled switches, so due not knowing the correct way to handle them, I decided to leave it as is. Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-04-04 22:15:47 +02:00
Marek Olšák	d76c306162	radeonsi: don't make a copy of pipe_index_buffer in draw_vbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-30 16:09:32 +02:00
Marek Olšák	9c100bd693	radeonsi/gfx9: flush CB & DB caches with an EOP TS event Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	6e0d64712a	radeonsi/gfx9: use ACQUIRE_MEM Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	a4f0a1099f	radeonsi/gfx9: draw changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	71eca0780a	radeonsi/gfx9: add a scissor bug workaround Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Nicolai Hähnle	6a1d9684f4	radeonsi: handle MultiDrawIndirect in si_get_draw_start_count Also handle the GL_ARB_indirect_parameters case where the count itself is in a buffer. Use transfers rather than mapping the buffers directly. This anticipates the possibility that the buffers are sparse (once ARB_sparse_buffer is implemented), in which case they cannot be mapped directly. Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type on <= CIK. v2: - unmap the indirect buffer correctly - handle the corner case where we have indirect draws, but all of them have count 0. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-21 10:45:02 +01:00
Marek Olšák	c8ef512398	gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally It's OK for r300g (because r300g can't write to buffers via the GPU), but not later hardware. This issue was spotted randomly. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-19 17:16:26 +01:00
Marek Olšák	a264fee624	radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2) start can only be non-zero with MultiDrawElements, which is unlikely to occur with UNSIGNED_BYTE indices. v2: Also fix the util_shorten_ubyte_elts_to_userptr call. Tested with the new piglit. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-19 17:16:26 +01:00
Marek Olšák	791e8ce04a	radeonsi: use a clever alignment for index buffer uploads Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	620aded541	radeonsi: move index buffer flushing into a non-upload indexed case The other codepaths don't need this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-18 01:22:08 +01:00
Marek Olšák	2ca3548eb9	gallium/radeon: remove the internal u_upload_mgr pointer also remove the BIND flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Tested-by: Charmaine Lee <charmainel@vmware.com>	2017-02-14 21:46:16 +01:00
Marek Olšák	4c288c73ea	radeonsi: remove SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER not necessary Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	65df38b191	radeonsi: remove separate CB/DB_META flush flags not used separately Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-10 11:27:50 +01:00
Marek Olšák	f8bc628b2c	gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter to simplify things in draw_vbo a little Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:45:29 +01:00
Marek Olšák	eba9e9dd1d	radeonsi: handle count_from_stream_output in a few IA_MULTI_VGT_PARAM cases Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:45:29 +01:00
Marek Olšák	f8dd2f5bac	radeonsi: fold info->indirect conditionals into the last one in draw_vbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:29:36 +01:00
Marek Olšák	408f9a1584	radeonsi: atomize the scratch buffer state The update frequency is very low. Difference: Only account for the size when allocating a new one and when starting a new IB, and check for NULL. (v3) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 17:29:36 +01:00
Marek Olšák	5f99c49008	radeonsi: precompute IA_MULTI_VGT_PARAM values into a table The perf difference is very small: 0.99% -> 0.40% for the time spent in si_get_ia_multi_vgt_param when si_draw_vbo is 20%. Pretty much nothing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	c78177fc64	radeonsi: move VGT_VERTEX_REUSE_BLOCK_CNTL into shader states for Polaris Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	ac059f1c23	radeonsi: use a bitmask for looping over dirty PM4 states also move it to draw_vbo, because it should be 0 in most cases Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	802fcdc0d2	radeonsi: atomize L2 prefetches to move the big conditional statement out of draw_vbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	879c73fac8	radeonsi: update dirty_level_mask only after the first draw after FB change Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	573bf0940a	radeonsi: always set the TCL1_ACTION_ENA when invalidating L2 Some CIK-VI docs say this is the default behavior on SI. That doesn't answer whether it's also the default behavior on CIK-VI. Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 23:43:38 +01:00
Marek Olšák	cf248929bf	radeonsi: use a global dirty mask for shader pointers Only vertex buffers use a separate bool flag. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Bas Nieuwenhuizen	0ef1b4d5b1	ac/debug: Move IB decode to common code. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:43:59 +01:00
Marek Olšák	ece6e1f658	radeonsi: add TC L2 prefetch for shaders and VBO descriptors Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	5871ebd7f1	radeonsi: add HUD queries for cache flush stats Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	a816c7fe07	radeonsi: add a tess+GS hang workaround for VI dGPUs ported from Vulkan Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Marek Olšák	78c4528ae7	radeonsi: apply a tessellation bug workaround for SI Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Marek Olšák	72d48fcd8e	radeonsi: apply a multi-wave workgroup SPI bug workaround to affected CIK chips All codepaths are handled except for clover. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-01 02:16:51 +01:00
Marek Olšák	fa476e0566	radeonsi: fast exit si_emit_derived_tess_state early Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Nicolai Hähnle	908f92ad1f	radeonsi: generate GS prolog to (partially) fix triangle strip adjacency rotation Fixes GL45-CTS.geometry_shader.adjacency.adjacency_indiced_triangle_strip and others. This leaves the case of triangle strips with adjacency and primitive restarts open. It seems that the only thing that cares about that is a piglit test. Fixing this efficiently would be really involved, and I don't want to use the hammer of degrading to software handling of indices because there may well be software that uses this draw mode (without caring about the precise rotation of triangles). v2: - skip the GS prolog entirely if workaround is not needed - only check for TES (TES is always non-null when tessellation is used) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:11:24 +01:00
Marek Olšák	dc6bbe2dd0	gallium/radeon: use r600_gfx_write_event_eop everywhere Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	d4d9ec55c5	radeonsi: implement TC-compatible HTILE so that decompress blits aren't needed and depth texturing needs less memory bandwidth. Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible HTILE only supports Z32_FLOAT. This doubles memory footprint for Z16. The format promotion is not visible to state trackers. This is part of TC-compatible renderbuffer compression, which has 3 parts: DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now. I don't see a measurable increase in performance though. (I tested Talos Principle and DiRT: Showdown, the latter is improved by 0.5%, which is almost noise, and it originally used layered Z16, so at least we know that Z16 promoted to Z32F isn't slower now) Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Marek Olšák	40e1f7e09b	radeonsi: use TC write-back instead of full cache invalidation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	8cdce30cc2	radeonsi: implement TC L2 write-back (flush) without cache invalidation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00

344 commits