fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-25 16:58:10 +02:00

Author	SHA1	Message	Date
Roland Scheidegger	dcf2feadc3	gallivm: fix gather implementation a bit gather is defined in terms of bilinear filtering, just without the filtering part. However, there's actually some subtle differences required in our implementation, because we use some tricks to simplify coord wrapping for the two coords per direction. For bilinear filtering, we don't care if we end up with an incorrect texel, as long as the filter weight is 0.0 for it. Likewise, the order of the texels doesn't actually matter (as long as they still have the correct filter weight). But for gather, these tricks lead to incorrect results. Fix this for CLAMP_TO_EDGE, and add some comments to the other wrap functions which look broken (the 3 mirror_clamp plus mirror_repeat) (too complex to fix right now, and no one really seems to care...). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-09-09 03:06:10 +02:00
Charmaine Lee	57d9222ef2	svga: abort shader translation upon indirect indexing of temporaries This patch aborts shader translation upon indirect indexing of temporary register on non-vgpu10 device. This prevents non-supported feature sending to the device. Tested with MTT-piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-09-08 13:58:38 -06:00
Eric Engestrom	f77d06fb28	gallium/tests: use ARRAY_SIZE macro Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-08 10:29:40 +01:00
Eric Engestrom	db8c5ae853	r300: use ARRAY_SIZE macro Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-08 10:29:40 +01:00
Connor Abbott	b8a51c8c4b	radeonsi: move the guts of ARB_shader_group_vote emission to ac Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:12:49 +01:00
Connor Abbott	bd73b89792	radeonsi: move si_emit_ballot() to ac Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:12:42 +01:00
Connor Abbott	ac27fa7294	radeonsi: move emit_optimization_barrier() to ac Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:06:47 +01:00
Connor Abbott	c181d4f2b7	radeonsi: move llvm_get_type_size() to ac Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-09-08 04:04:16 +01:00
Leo Liu	6e8ef53837	Revert "st/va: add enviromental variable to disable interlace" This reverts commit `10dec2de2d`. The environment variable is no longer needed with the previous change Reviewed-by: Christian König <christian.koenig@amd.com>	2017-09-07 13:32:36 -04:00
Leo Liu	15d4d44d9b	st/va: move YUV content to deinterlaced buffer when reallocated for encoder v2: use deinterlace common function v3: make sure deinterlace only Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-09-07 13:32:36 -04:00
Leo Liu	cadeb73f6b	st/va: reallocate the buffer if the layout isn't supported So that it makes more clear for buffer reallocation based on buffers layout for both decoder and encoder. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-09-07 13:32:36 -04:00
Leo Liu	78ec7400c5	vl/compositor: make vl_compositor_set_yuv_layer() static Since it's no longer being called outside of compositor Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-09-07 13:32:36 -04:00
Leo Liu	9f32078c20	st/omx: use vl/compositor helper function for YUV deinterlacing v2: separate helper function in different patch Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-09-07 13:32:36 -04:00
Leo Liu	a6da7e6c3a	vl/compositor: make a helper function for YUV deinterlacing The similar function is in OMX, and only used by OMX. Now have it moved to vl/compositor for other state tracker to use later. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-09-07 13:32:36 -04:00
Marek Olšák	4bd2bdbb3c	ac/surface: add radeon_surf::has_stencil for convenience Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 17:59:37 +02:00
Marek Olšák	7ec64bd88c	radeonsi: don't read tcs_out_lds_layout.patch_stride from an SGPR Same as before, writing TCS outputs to LDS is rare. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 13:00:07 +02:00
Marek Olšák	07fe10c75d	radeonsi: don't read tcs_out_lds_layout.vertex_size from an SGPR TCS outputs are usually not written to LDS, so no stats here. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 13:00:07 +02:00
Marek Olšák	89bf8668c2	radeonsi/gfx9: don't read LS out vertex stride from an SGPR in monolithic HS -44 bytes in a monolithic LS-HS binary. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 13:00:07 +02:00
Marek Olšák	f974bb768b	radeonsi: don't read the LS output vertex stride from an SGPR in LS Now it's able to generate ds_write2_b64 instead of ds_write2_b32. -20 bytes in one shader binary. (having only 1 output) Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 13:00:07 +02:00
Marek Olšák	22f5dfd300	radeonsi: don't read the number of TCS out vertices from an SGPR in TCS -16 bytes in one shader binary. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 13:00:07 +02:00
Marek Olšák	17dd4856a6	radeonsi: don't always apply the PrimID instancing bug workaround on SI It looks like commit `391673af7a` that should have fixed the perf regression didn't really change much if anything. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 13:00:06 +02:00
Marek Olšák	a0823df148	radeonsi: remove 2 callbacks from si_shader_context Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 13:00:06 +02:00
Marek Olšák	1cda9a2fee	winsys/amdgpu: disable local BOs on Raven It hangs with a high degree of reproducibility. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 12:57:48 +02:00
Roland Scheidegger	6d9d6071ee	llvmpipe, tgsi: hook up dx10 gather4 opcode Trivial. We already support tg4 for legacy tex opcodes, so the actual texture sampling code already handles it. (Just like TG4, we don't handle additional capabilities and always sample red channel.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-09-07 03:32:01 +02:00
Roland Scheidegger	de6810d9be	llvmpipe, draw: increase shader cache limits We're not particularly concerned with memory usage, if the tradeoff is shader recompiles. And it's common for apps to have a lot of shaders nowadays (and, since our shaders include a LOT of context state of course we may create quite a bit more shaders even). So quadruple the amount of shaders draw will cache (from 128 to 512). For llvmpipe (fs shaders) quadruple the number of instructions, keep the number of variants the same for now (only with very simple, non-texturing shaders the variant limit could really be reached), and simplify the definition, it's probably easier to just have one different definition per branch... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-09-07 03:32:01 +02:00
Leo Liu	e1e3c0384b	radeon/uvd: fix the assertion check for YUYV format Fixes: 7319ff87 ("radeon/uvd: add YUYV format support for target buffer") Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-09-06 15:53:18 -04:00
Tim Rowley	dad32fc61c	swr/rast: FE/Clipper - unify SIMD8/16 functions using simdlib types Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-09-06 11:02:36 -05:00
Tim Rowley	1ebf6fc865	swr/rast: Remove use of C++14 template variable SWR rasterizer must remain C++11 compliant. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-09-06 11:02:29 -05:00
Tim Rowley	9df5691fff	swr/rast: SIMD16 FE remove templated immediates workaround Fixed properly in gcc-compatible fashion. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-09-06 11:02:23 -05:00
Tim Rowley	404ac6da9e	swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble For consistency and to support overloading. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-09-06 11:02:17 -05:00
Tim Rowley	6cb20c9f3a	swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-09-06 11:02:12 -05:00
Tim Rowley	6afdc8732c	swr/rast: Removed some trailing whitespace caught during review Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-09-06 11:02:06 -05:00
Tim Rowley	4edc5d8305	swr: set caps for VB 4-byte alignment Needed to compensate for change to fetch jit requiring alignment. Fixes regressions in piglit: vertex-buffer-offsets and about another hundred of the vs-inputbyte tests. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-09-06 11:01:59 -05:00
Tim Rowley	4475583f5e	swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-09-06 11:01:39 -05:00
Nicolai Hähnle	45c5c44451	radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug When the HS wave is empty, the hardware writes the LS VGPRs starting at v0 instead of v2. Workaround by shifting them back into place when necessary. For simplicity, this is always done in the LS prolog. According to the hardware team, this will be fixed in future chips, so take that into account already. Note that this is not a bug fix, as the bug was already worked around by commit `166823bfd2` ("radeonsi/gfx9: add a temporary workaround for a tessellation driver bug"). This change merely replaces the workaround by one that should be better. v2: add workaround code to shader only when necessary v3: clarify the prefer_mono comment Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 10:02:49 +02:00
Nicolai Hähnle	274f1dace7	amd/common: pass chip_class to ac_dump_reg Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 09:59:17 +02:00
Nicolai Hähnle	34124e412f	radeonsi/gfx9: always flush DB metadata on framebuffer changes This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 09:57:08 +02:00
Charmaine Lee	c12ef63b69	svga: move index buffer bind flag assertion The buffer bind flags can be promoted in svga_buffer_handle(), so move the assertion after it. This has already been done for vertex buffer in commit `6b4bf7e8be`, but it misses the one for index buffer. Fixes assertion running WarThunder. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2017-09-05 10:31:18 -06:00
Charmaine Lee	98badd7f6e	svga: avoid emitting redundant SetShaderResources and SetVertexBuffers Minor performance improvement in avoiding binding the same shader resource or the same vertex buffer for the same slot. Tested with MTT glretrace. v2: Per Brian's suggestion, add a helper function to do vertex buffer comparison. v3: Change the helper function to vertex_buffers_equal(). Reviewed-by: Brian Paul <brianp@vmware.com>	2017-09-05 10:31:18 -06:00
Marek Olšák	c3ebac6890	radeonsi/gfx9: implement primitive binning This increases performance, but it was tuned for Raven, not Vega. We don't know yet how Vega will perform, hopefully not worse. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-05 12:09:02 +02:00
Marek Olšák	51e10c2770	radeonsi: add more state flags into si_state_dsa 3 flags for primitive binning, 2 flags for out-of-order rasterization (but that will be done some other time) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-05 12:09:02 +02:00
Marek Olšák	0797eea758	radeonsi/gfx9: don't use BREAK_BATCH and FLUSH_DFSM if DFSM is disabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-05 12:09:02 +02:00
Marek Olšák	fb7ba68f6c	radeonsi: eliminate PS color outputs when colormask kills them Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-04 15:10:39 +02:00
Marek Olšák	468c131033	gallium/radeon: sort DBG shader flags according to pipe_shader_type Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-04 15:10:39 +02:00
Nicolai Hähnle	50283109aa	radeonsi: ensure cache flushes happen before SET_PREDICATION packets The data is read when the render_cond_atom is emitted, so we must delay emitting the atom until after the flush. Fixes: `0fe0320dc0` ("radeonsi: use optimal packet order when doing a pipeline sync") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-04 13:50:57 +02:00
Nicolai Hähnle	097cfe9fde	radeonsi: fix ARB_transform_feedback_overflow_query on <= VI The result written by the shader workaround needs to be written back, or the CP may read stale data. Fixes: `78476cfe07` ("radeonsi: enable ARB_transform_feedback_overflow_query") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-04 13:50:54 +02:00
Nicolai Hähnle	55df3d2286	radeonsi: fix compute shader state dumping Fixes: `420c438589` ("radeonsi: log draw and compute state into log context") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-04 13:50:47 +02:00
Nicolai Hähnle	30a2f0dfd4	radeonsi: add an assertion that only two-dimensional constant references are used v2: remove some redundant checks Acked-by: Roland Scheidegger <sroland@vmware.com> (v1) Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-09-04 13:44:09 +02:00
Nicolai Hähnle	3e4dff4f00	gallium/radeon: always use two-dimensional constant references Acked-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-09-04 13:44:06 +02:00
Nicolai Hähnle	83923a1f17	gallium/tests: always use two-dimensional constant references Acked-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-09-04 13:44:04 +02:00

32177 commits