fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-27 16:38:12 +02:00

Author	SHA1	Message	Date
Dylan Baker	43b0e5f5cd	meson: build virgl driver Build tested only. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:38 -08:00
Dylan Baker	a537231b22	meson: build svga driver on linux Build tested only. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:36 -08:00
Dylan Baker	5060c51b6f	meson: build r600 driver v4: - Ensure inc_amd_common defined when radeonsi is disabled (needed by r600) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:33 -08:00
Dylan Baker	4ae08296d0	meson: build r300 driver This is build tested only Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:30 -08:00
Dylan Baker	9169dde941	meson: build i915g driver Build tested only. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:26 -08:00
Brian Paul	c5d199fa2c	svga: move svga_is_format_supported() to svga_format.c where the other format-related functions live. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-28 06:50:16 -07:00
Brian Paul	bae5b2a87c	svga: s/unsigned/SVGA3dDevCapIndex/ Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-28 06:50:16 -07:00
Eric Engestrom	7bb89e1c8f	vc4: check preprocessor token existence using #ifdef instead of #if (other uses of USE_VC4_SIMULATOR are already correct) Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 09:50:36 +00:00
Nicolai Hähnle	dd07868904	radeonsi/gfx9: simplify condition for on-chip ESGS Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	239d2b5809	radeonsi: clarify that si_shader_selector::esgs_itemsize is set for the ES part Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	26da5d0317	radeonsi: use si_shader_context instead of lp_build_context in more places Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	1c2d19d84d	radeonsi: cleanup si_initialize_color_surface Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	08f6b4dd7b	radeonsi: avoid attempting to create CMASK if the tiling mode doesn't have it Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	e52e8326d9	radeonsi: check that we don't leak fine.buf references Just as an added precaution. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	97f42d11df	amd/common: sid.h cleanups Fix a bunch of labels indicating when registers were added/removed and normalize the SI-class GRBM_GFX_INDEX. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Marek Olšák	ec15ff78c3	ac: change legacy_surf_level::slice_size to dword units The next commit will reduce the size even more. v2: typecast to uint64_t manually v3: add more typecasts, add asserts Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:44:04 +01:00
Marek Olšák	474b4a9191	ac: pack ac_surface better r600_texture: 1736 -> 1488 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Marek Olšák	b5444877c0	radeonsi: always initialize max_forced_staging_uploads r600_resource is malloc'd. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103808 Fixes: `4b0dc098b2` ("gallium/u_threaded: don't map big VRAM buffers for the first upload directly") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Marek Olšák	95cd74abd4	radeonsi: remove an old hack for evergreen Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Marek Olšák	1cb731012c	radeonsi: set COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST when profitable ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Dave Airlie	fd301472bd	r600/eg: dump event type in dumps This just makes it easier to debug some things. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-27 12:53:18 +10:00
Tobias Klausmann	068a72fbcb	nouveau/compiler: Allow to omit line numbers when printing instructions This comes in handy when checking "NV50_PROG_DEBUG=1" outputs with diff! V2: - Use environmental variable (Karol Herbst) V3: - Use the already populated nv50_ir_prog_info to forward information to the print pass (Pierre Moreau) V4: - get rid of default value in PrintPass constructor Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-26 12:51:30 -05:00
Nicolai Hähnle	0fed7f83ba	radeonsi: try flushing unflushed fences in si_fence_finish even when timeout == 0 Under certain conditions, waiting on a GL sync object should act like a flush, regardless of the timeout. Portal 2, CS:GO, and presumably other Source engine games rely on this behavior and hang during loading without this fix. Fixes: `bc65dcab3b` ("radeonsi: avoid syncing the driver thread in si_fence_finish") Signed-off-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103902 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103904	2017-11-26 16:53:00 +01:00
Ilia Mirkin	0bd83d0461	nv50/ir: move LateAlgebraicOpt to the very end Memory loads can take offsets, but the SHLADD will often attempt to consume the offsets too. As there may be multiple memory loads with the same base but different offsets, those would end up in a SHLADD instead of the offset of the memory operation. This moves the pass after we've had a chance to attempt to propagate immediate adds into the indirect offset. total instructions in shared programs : 6580681 -> 6567716 (-0.20%) total gprs used in shared programs : 944261 -> 943375 (-0.09%) total shared used in shared programs : 0 -> 0 (0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) total bytes used in shared programs : 60339896 -> 60221504 (-0.20%) local shared gpr inst bytes helped 0 0 555 2698 2698 hurt 0 0 138 336 336 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-26 01:10:19 -05:00
Ilia Mirkin	3072bbef63	nv50/ir: when merging immediates/consts, load directly When a MERGE operation gets its constraint moves added, it substantially extends live ranges to be reusing an immediate from earlier in the program (not to mention the silliness of loading an immediate into a register, and then moving into another register). We detect these scenarios and insert moves that take the immediate or constbuf load directly into the register. If it's the last use, then we can just move that operation to the closer location. With SM35 (255 regs) we get these results: total instructions in shared programs : 6583670 -> 6580681 (-0.05%) total gprs used in shared programs : 950818 -> 944261 (-0.69%) total shared used in shared programs : 0 -> 0 (0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) total bytes used in shared programs : 60367456 -> 60339896 (-0.05%) local shared gpr inst bytes helped 0 0 4584 3186 3186 hurt 0 0 55 968 968 I suspect they will be better for SM20 and SM30. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-26 01:10:19 -05:00
Ilia Mirkin	50e913b9c5	nv50/ir: add optimization for modulo by a non-power-of-2 value We can still use the optimized division methods which make use of multiplication with overflow. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2017-11-26 01:10:03 -05:00
Ilia Mirkin	3079993727	nv50/ir: optimize signed integer modulo by pow-of-2 It's common to use signed int modulo in GLSL. As it happens, the GLSL specs allow the result to be undefined, but that seems fairly surprising. It's not that much more effort to get it right, at least for positive modulo operators. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-25 22:48:09 -05:00
Ilia Mirkin	f39a91c152	freedreno/a4xx: add ARB_framebuffer_no_attachments support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	4f748d12e8	freedreno/a4xx: add indirect draw support This is a copy of the a5xx logic. Fails a few tests, but basic functionality is there. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	c3c8d48725	freedreno: regenerate pm4 header, adjust code for new names Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	ffdcd51e66	freedreno/a4xx: add stencil texturing support Copied from a5xx, should be identical. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	86f12e9377	freedreno/ir3: add a pass to lower tg4 to txl, enable gather on a4xx Unfortunately Adreno A4xx hardware returns incorrect results with the GATHER4 opcodes. As a result, we have to lower to 4 individual texture calls (txl since we have to force lod to 0). We achieve this using offsets, including on cube maps which normally never have offsets. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 16:56:59 -05:00
Marek Olšák	2cfa319f9f	radeonsi: expose all CB performance counters on Stoney Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	797c447f1c	radeonsi: handle imported textures with DCC robustly now you can hack the driver to enable DCC for displayable textures and Glamor that doesn't enable that by default won't crash anymore. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	992b6e18d0	radeonsi: fix a typo in creating monolithic ES-GS This has no effect because both occupy the same memory in a union. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	f783677a82	radeonsi: don't write undefined output channels to LDS in LS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	b63e7d4c6f	radeonsi: use ac.lds for shared memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	39b098dafb	radeonsi: do 64-bit LDS loads recursively Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Wladimir J. van der Laan	35548cae93	etnaviv: Emit vertex buffers consecutively Vertex buffer legacy state is no longer picked up with new drawing commands. Change to use different cases depending on the number of vertex streams in the GPU specs. This results in slightly more compact state emission as well, on all vivantes. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-23 22:24:51 +01:00
Roland Scheidegger	71e630753e	r600: set DX10_CLAMP for compute shader too I really intended to set this for all shader stages by `3835009796` but missed it for compute shaders (because it's in a different source file...). Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-23 02:28:38 +01:00
Gert Wollny	799d350870	r600/shader: Fix all warnings issued with "-Wall -Wextra" - fix a number of -Wsign-compare warnings - fix two warnings for -Woverride-init because TGSI_OPCODE_CEIL == 83, and the according field was defined two times. [airlied: don't use -1 with unsigned type, fix whitespace] Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-22 22:50:18 +00:00
Gert Wollny	1d076aafbc	r600: Emit EOP for more CF instruction types So far on pre-cayman chipsets, for the CF instructions CF_OP_LOOP_END, CF_OP_CALL_FS, CF_OP_POP, and CF_OP_GDS an extra CF_NOP instruction was added to add the EOP flag, even though this is not actually needed, because all these instructions support the EOP flag. This patch removes the fixup code, adds setting the EOP flag for the according instructions as well as others like CF_OP_TEX and CF_OP_VTX, and adds writing out EOP for this type of instruction in the disassembler. This also fixes a bug where shaders were created that didn't actually have the EOP flag set in the last CF instruction, which might have resulted in GPU lockups. [airlied: cleaned up a little] Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-22 22:39:42 +00:00
Eric Anholt	6a78416dab	broadcom/vc5: Fix BASE_LEVEL handling with txl. The HW doesn't add the base level anywhere (the min/max lod clamping is what does base level), so we need to add it manually in this case. Fixes piglit tex-miplevel-selection *Lod 2D.	2017-11-22 10:56:31 -08:00
Eric Anholt	c55813c22e	broadcom/vc5: Fix array texture layer count setup. Fixes piglit array-texture.	2017-11-22 10:56:31 -08:00
Eric Anholt	ad1521d708	broadcom/vc5: Don't increment primitive queries while they're paused. Fixes ext_transform_feedback-generatemipmap prims_generated	2017-11-22 10:56:31 -08:00
Eric Anholt	1214c2ea2a	broadcom/vc5: Fix incorrect padding of TF outputs. After the first output, we were padding by an extra size of the previous output. Fixes piglit ext_transform_feedback-output-type mat4x3[2] and friends.	2017-11-22 10:56:31 -08:00
Eric Anholt	b18840ac6e	broadcom/vc5: Fix UIF surface size setup for ARB_fbo's mismatched sizes. The HW was computing an implicit height for the surface based on the image size, but that may be smaller than the surface with ARB_fbo mismatched sizes. In that case, we need to tell it about the pad, either with the little 4-bit field in the RT config, or the extended field in CLEAR_COLORS_PART3. Fixes piglit arb_framebuffer_object-mixed-buffer-sizes.	2017-11-22 10:56:31 -08:00
Wladimir J. van der Laan	9f162fa107	etnaviv: Put HALTI level in specs The HALTI level is an indication of the gross architecture of the GPU. It determines for significant part what feature level the GPU has, what state (especially frontend state) is there, and where it is located. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-11-22 14:42:06 +01:00
Wladimir J. van der Laan	391c958f08	etnaviv: Const-correctness etnaviv_emit.h The relocation structure is never changed by submitting it. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-11-22 14:42:00 +01:00
Roland Scheidegger	b5957cee92	llvmpipe: fix snorm blending The blend math gets a bit funky due to inverse blend factors being in range [0,2] rather than [-1,1], our normalized math can't really cover this. src_alpha_saturate blend factor has a similar problem too. (Note that piglit fbo-blending-formats test is mostly useless for anything but unorm formats, since not just all src/dst values are between [0,1], but the tests are crafted in a way that the results are between [0,1] too.) v2: some formatting fixes, and fix a fairly obscure (to debug) issue with alpha-only formats (not related to snorm at all), where blend optimization would think it could simplify the blend equation if the blend factors were complementary, however was using the completely unrelated rgb blend factors instead of the alpha ones... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-11-21 04:06:29 +01:00
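
Commit 3079993727 above mentions optimizing signed integer modulo by a power of two. As a rough illustration of the general idea only (this is not the code from that commit, and the function name `smod_pow2` is hypothetical), one common way to do it is to mask the magnitude of the dividend and then restore its sign. The sketch assumes a positive power-of-two divisor and C-style truncating semantics, where the result takes the sign of the dividend:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes x % d for a positive power-of-two d
 * without a division, matching C's truncating '%' semantics. */
static int32_t smod_pow2(int32_t x, uint32_t d)
{
    /* Precondition (assumption of this sketch): d is a non-zero power of two. */
    assert(d != 0 && (d & (d - 1)) == 0);

    uint32_t mask = d - 1;

    /* Take the magnitude in unsigned arithmetic so INT32_MIN does not overflow. */
    uint32_t mag = x < 0 ? 0u - (uint32_t)x : (uint32_t)x;

    /* The low bits of the magnitude are the remainder's magnitude. */
    uint32_t r = mag & mask;

    /* Restore the sign of the dividend. */
    return x < 0 ? -(int32_t)r : (int32_t)r;
}
```

For example, `smod_pow2(-7, 4)` yields the same value as `-7 % 4` in C. Negative divisors are left out here, in line with the commit message's note that only positive modulo operators are handled.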

20533 commits