fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-24 08:28:16 +02:00

Author	SHA1	Message	Date
Roland Scheidegger	4d5346aaac	Revert "draw: use vectorized calculations for fetch" Trivial. There's some regressions internally, related to overflow behavior. I'll have to look at it at another time, some interactions with vsplit/vcache are actually mind-blowing. This reverts commit `3fa10ffb49`.	2016-11-09 05:53:16 +01:00
Ilia Mirkin	f037afb701	swr: disable logic op when the rt format is float or srgb Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-08 19:28:35 -05:00
Ilia Mirkin	e2e40e236f	swr: fix AND_INVERTED logic op conversion Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-08 19:28:35 -05:00
Ilia Mirkin	bef4a48d1c	swr: add support for EXT_depth_bounds_test Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-08 19:28:35 -05:00
Ilia Mirkin	aa62fa8fb7	swr: [rasterizer core] set depth hottile when depth bounds test enabled Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-08 19:28:35 -05:00
Tim Rowley	95ed1c19bf	swr: allow alphatest without blend or logicop We need to compile a blend function when alphatest is enabled. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-11-08 14:18:47 -06:00
Marek Olšák	bdd48e47c0	tgsi/scan: turn a huge if-else-if.. chain into a switch statement Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-08 17:56:42 +01:00
Marek Olšák	f864547fa9	tgsi/scan: fix images_buffers regression The first IF statement disabled the second one. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98599 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-08 17:56:42 +01:00
Nicolai Hähnle	88f791db75	gallivm: fix [IU]MUL_HI regression This patch does two things: 1. It separates the host-CPU code generation from the generic code generation. This guards against accidentally breaking things for radeonsi in the future. 2. It makes sure we actually use both arguments and don't just compute a square :-p Fixes a regression introduced by commit `29279f44b3` Cc: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-11-08 16:25:54 +01:00
Roland Scheidegger	3fa10ffb49	draw: use vectorized calculations for fetch Instead of doing all the math with scalars, use vectors. This means the overflow math needs to be done manually, albeit that's only really problematic for the stride/index mul, the rest has been pretty much moved outside the shader loop (albeit the mul could actually be optimized away too), where things are still scalar. Because llvm is complete fail with the zero-extend widening mul, roll our own even... To eliminate control flow in the main shader loop fetch, provide fake buffers (so index 0 is always valid to fetch). Still uses aos fetch though in the end - mostly because some more code would be needed to handle unaligned fetches in that path, and because for most formats it won't make a difference anyway (we generate some truly horrendous code for things like R16G16_something for instance). Instanced fetch however stays roughly the same as before, except that no longer the same element is fetched multiple times (I've seen a reduction of ~3 times in main shader loop size due to apparently llvm not being able to deduce it's really all the same with a couple instanced elements). Also, for elts gathering, use vectorized code as well - provide a fake elt buffer if there's no valid one bound. The generated shaders are smaller and faster to compile (not entirely sure about execution speed, but generally unless there's just single vertices to handle I would expect it to be faster - there's more opportunities for future improvements by using soa fetch). No piglit change. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-08 03:41:26 +01:00
Roland Scheidegger	29279f44b3	gallivm: introduce 32x32->64bit lp_build_mul_32_lohi function This is used by shader umul_hi/imul_hi functions (and soon by draw). It's actually useful separating this out on its own, however the real reason for doing it is because we're using an optimized sse2 version, since the code llvm generates is atrocious (since there's no widening mul in llvm, and it does not recognize the widening mul pattern, so it generates code for real 64x64->64bit mul, which the cpu can't do natively, in contrast to 32x32->64bit mul which it could do). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-11-08 03:41:26 +01:00
Samuel Pitoiset	e32e5d214e	nvc0: simplify draw parameters upload for vertex shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-07 22:50:17 +01:00
Steven Toth	381edca826	gallium/hud: protect against and initialization race In the event that multiple threads attempt to install a graph concurrently, protect the shared list. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-07 18:31:52 +01:00
Steven Toth	5a58323064	gallium/hud: close a previously opened handle We're missing the closedir() to the matching opendir(). Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-07 18:31:52 +01:00
Steven Toth	6ffed08679	gallium/hud: fix a problem where objects are free'd while in use. Instead of trying to maintain a reference counted list of valid HUD objects, and freeing them accordingly, creating race conditions between unanticipated multiple threads, simply accept they're allocated once and never released until the process terminates. They're a shared resource between multiple threads, so accept they're always available for use. Signed-off-by: Steven Toth <stoth@kernellabs.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-07 18:31:52 +01:00
Serge Martin	cc495055cd	clover: Add CL_PROGRAM_BINARY_TYPE support (CL1.2). v3 [Francisco Jerez]: Loosely based on Serge's v1 of this patch in order to avoid CL-specific enums in the clover module binary format. In addition to other changes made in v2: Represent the CL program binary type as the section type instead of adding a CL API-specific enum, check that the binary types of the input objects are valid during clLinkProgram(), pass section type as argument to build_module_library() instead of using separate function. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-06 15:56:54 +01:00
Serge Martin	05fcc73f08	clover: add missing clGetDeviceInfo CL1.2 queries Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2016-11-06 15:56:49 +01:00
Samuel Pitoiset	8cc4a74971	nvc0: get rid of NVE4_COMPUTE_MP_PM_{A,B}_SIGSEL_XXX Instead, hardcode group sigsel because there are a bunch of unknown groups, especially on SM50/SM52. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-11-05 19:28:25 +01:00
Samuel Pitoiset	a295364596	gm107/ir: emit RED instead of ATOM when no dst This is similar to NVC0 and GK110 emitters where we emit reduction operations instead of atomic operations when the destination is not used. Found after writing some tests which check if performance counters return the expected value. In that case, gred_count returned 0 on gm107 while at least gk106 returned the correct value. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-11-05 19:27:35 +01:00
Mauro Rossi	0148313ea3	android: amd/common: add support for libmesa_amd_common Fixes the following building error introduced with commit `7115e56` and related amd/common dependencies: external/mesa/src/gallium/drivers/radeonsi/si_shader.c:6861: error: undefined reference to 'ac_is_sgpr_param' external/mesa/src/gallium/drivers/radeonsi/si_shader.c:6951: error: undefined reference to 'ac_is_sgpr_param' clang++: error: linker command failed with exit code 1 (use -v to see invocation) ninja: build stopped: subcommand failed. build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed make: *** [ninja_wrapper] Error 1 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-11-05 18:42:29 +01:00
Marek Olšák	0f72f7292a	winsys/radeon: don't call surface_best for FMASK Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98518 Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-11-05 18:36:26 +01:00
Eric Anholt	283d4d18e5	vc4: Use Newton-Raphson on the 1/W write to fix glmark2 terrain. The 1/W was apparently not accurate enough, and we were getting sparklies in the distance. The closed driver also did a N-R step here. Cc: <mesa-stable@lists.freedesktop.org>	2016-11-04 15:34:38 -07:00
Eric Anholt	70fc3a941a	vc4: Make sure that vertex shader texture2D() calls use LOD 0. I noticed this while trying to debug glmark2 terrain (which does vertex shader texturing, but no mipmaps on its textures sampled from the VS).	2016-11-04 15:34:38 -07:00
Nicolai Hähnle	2c875158e2	radeonsi: fix vertex fetches for 2_10_10_10 formats The hardware always treats the alpha channel as unsigned, so add a shader workaround. This is rare enough that we'll just build a monolithic vertex shader. The SINT case cannot actually happen in OpenGL, but I've included it for completeness since it's just a mix of the other cases. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-04 21:30:18 +01:00
Dave Airlie	d0d5f7600c	Revert "st/vdpau: use linear layout for output surfaces" This reverts commit `d180de3532`. This is a radeon specific hack that causes problems on nouveau when combined with the SHARED flag later. If radeonsi needs a fix for this, please fix it in the driver. [chk] Using linear surfaces for this makes sense because tiling isn't beneficial and the surfaces can potentially be shared with other GPUs using the VDPAU OpenGL interop. [airlied] I think we need a flag that isn't SHARED/LINEAR that is more SHARED_OTHER_GPU. [mareko] Does radeonsi need PIPE_BIND_VIDEO_DECODE_OUTPUT that it would translate into linear? [mareko] My only concern is decoding performance. If the decoder works in 64x1 blocks, tiling will hurt. That's the theory. I don't know how the decoder works. Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (I+A)	2016-11-04 15:04:21 +00:00
Marek Olšák	00baaa4752	radeonsi: fix an assertion failure in si_decompress_sampler_color_textures This fixes a crash in Deus Ex: Mankind Divided. Release builds were unaffected, so it's not too serious. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-04 11:30:47 +01:00
Nicolai Hähnle	84a74be9e4	radeonsi: enable GLSL 4.50 Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-04 10:33:50 +01:00
Michel Dänzer	8ce7ef75f5	gallium/radeon: Multiply bpe by nsamples in surf_winsys_to_drm For symmetry with surf_drm_to_winsys. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-04 16:51:18 +09:00
Michel Dänzer	356458363d	gallium/radeon: Use flags parameter in radeon_winsys_surface_init Fixes valgrind warnings about surf_ws->flags being uninitialized while starting X. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-04 16:49:39 +09:00
Michel Dänzer	6f844a30c1	gallium/radeon: Only convert stencil info if RADEON_SURF_SBUFFER is set Fixes valgrind warnings about using uninitialized memory when starting X. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-04 16:48:59 +09:00
Michel Dänzer	38fb9aa1aa	gallium/radeon: Only loop up to last_level for drm<->winsys conversion Fixes spurious assertion failure in surf_level_drm_to_winsys when starting X, due to processing a miplevel which was never initialized. Fixes: `e9c76eeeaa` ("gallium/radeon: remove radeon_surf_level::pitch_bytes") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-04 16:47:43 +09:00
Eric Anholt	80157466cd	vc4: Add miptree/texture state support for ETC1 compressed textures. The format isn't flagged as enabled at runtime yet, because we need kernel validation support.	2016-11-03 18:42:58 -07:00
Eric Anholt	bedb996087	vc4: Fix use of undefined values since the ralloc zeroing changes. reralloc() no longer zeroes the new contents, so switch to using rzalloc_array() instead.	2016-11-03 18:42:58 -07:00
Roland Scheidegger	572a952126	draw: fix undefined input handling some more... Previous fixes were incomplete - some code still iterated through the number of elements provided by velem layout instead of the number stored in the key (which is the same as the number defined by the vs). And also actually accessed the elements from the layout directly instead of those in the key. This mismatch could still cause crashes. (Besides, it is a very good idea to only use data stored in the key anyway.) v2: move null format check, remove now unnecessary function parameter, some minor prettify Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-04 01:48:22 +01:00
Brian Paul	f4dd3bde37	gallium/hud: call fflush() after printing error messages For Windows. Otherwise, we don't see the message until the program exits. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-11-03 14:29:23 -06:00
Brian Paul	260d951486	svga: move svga_mark_surfaces_dirty() prototype to svga_surface.h Trivial.	2016-11-03 14:29:23 -06:00
Brian Paul	c96f63cac2	svga: whitespace / formatting clean-up in svga_context.c Trivial.	2016-11-03 14:29:23 -06:00
Brian Paul	1691e29e62	svga: collect stats for time spent in svga_context_finish() This should have appeared with commit "svga: add guest statistic gathering interface" from August 4, but was somehow lost.	2016-11-03 14:29:23 -06:00
Charmaine Lee	8a195e2fd5	svga: invalidate new surface before it is bound to a render target view Invalidate a "new" surface before it is bound to a render target view or depth stencil view in order to avoid the unnecessary host side copy of the surface data before it is rendered to. Note that, recycled surface is already invalidated before it is reused. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:23 -06:00
Charmaine Lee	06bba2452f	Revert "svga: use untyped surface formats in most cases" Using untyped surface formats causes huge performance degradation on Fusion. This reverts commit `eb0ced74f6` until the backend has a better solution to address typeless surface formats.	2016-11-03 14:29:23 -06:00
Charmaine Lee	f2eec4e829	svga: allow quad blit for more formats Currently blitter will fail if the blit format is different and view-incompatible to the resource format. Instead of punting to software blit which will stall the pipeline, we will create temporary resource to allow blitter to work. Fixes piglit test arb_copy_image-formats. Also tested with MTT piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	4bd5ce853b	svga: create BGRX render target view for BGRX_UNORM surface Currently we adjust the view format when we are asked to create a BGRA render target view for BGRX surface. But we only look for SVGA3D_B8G8R8X8_TYPELESS surface format. With this patch, we will also check for SVGA3D_B8G8R8X8_UNORM surface format, and use SVGA3D_B8G8R8X8_UNORM as the view format for that case. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	0d221fcd40	svga: add a helper function to check for typeless format This patch adds a helper function svga_format_is_typeless() which returns TRUE if the specified format is typeless. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Brian Paul	d451421bca	svga: add SVGA_NEW_FRAME_BUFFER to svga_hw_tss_binding state atom We may need to re-emit texture bindings when the framebuffer state changes. In particular, emitting the texture binding can also involve updating a texture from its backing copy during sampler view validation. The backing copy is made during framebuffer validation. This helps to fix an issue with Photoshop on VGPU9 (VMware bug 1723971). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	ec138d6237	svga: allow copy_region if sample counts match With this patch, we will allow blit with copy_region if the source and destination textures have the same sample counts. Fixes failures with piglit tests spec@arb_texture_float@multisample-formats 2 gl_arb_texture_float spec@arb_texture_rg@multisample-formats 2 gl_arb_texture_rg-float Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	a2d49c4b46	svga: set rendered-to flag after updating the texture using PredCopyRegion This patch sets the rendered-to flag for the subresource after it is updated using the PredCopyRegion command. This is to ensure that the GB surface will be synced up properly before it is directly mapped to. Tested with MTT piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	59f14563a3	svga: add can_use_upload flag This patch adds a flag "can_use_upload" to svga_texture structure to avoid some checking of the upload availability at each transfer map time. Tested with Lightsmark2008, Tropics, MTT glretrace, piglit. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	3dfb4243bd	svga: fix texture upload path condition As Thomas suggested, we'll first try to map directly to a GB surface. If it is blocked, then we'll use texture upload buffer. Also if a texture is already "rendered to", that is, the GB surface is already out of sync, then we'll use the texture upload buffer to avoid syncing the GB surface. Tested with Lightsmark2008, Tropics, MTT piglit, glretrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Charmaine Lee	4750c4e543	svga: set rendered_to flag with texture uploaded using TransferFromBuffer command This patch sets the rendered_to flag for the texture subresource that is uploaded using the TransferFromBuffer command. This is to ensure that the subresource will be read back or invalidated before it will be directly mapped to. This makes sure that the content of the GB surface will not be accidentally overwritten by the device at suspend/resume time. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-11-03 14:29:22 -06:00
Neha Bhende	03e1b7cacd	svga: Add render_condition boolean flag in struct svga_context Set the render_condition flag when the driver performs conditional rendering. Blits using the DXPredCopyRegion command are affected by conditional rendering, so we should check this flag while performing a blit operation. Tested with piglit tests. v2: As per Charmaine's comment, set the render_condition flag if svga_query is valid. Tested with piglit tests. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-11-03 14:29:22 -06:00

29229 commits