fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-29 03:00:23 +01:00

Author	SHA1	Message	Date
Rob Clark	b23fc4cacb	freedreno/a6xx: move VBO state to stateobj Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:49 -04:00
Rob Clark	e194056832	freedreno/a6xx: move ZSA state to stateobj Step towards single cmdstream, where we need different state-group-id's for binning vs draw ZSA state. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	a50a9a44e8	freedreno/a6xx: remove vismode param We don't need to keep this IGNORE_VISIBILITY in binning pass. Prep work for using single cmdstream for both draw and binning passes. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	d9dbc9c21f	freedreno/ir3: move binning-pass fixup for a6xx+ Move this to after ir3_cp (which can add lowered immediates to the const state) for a6xx+, to ensure the uniform state matches between binning and vertex shaders. This way we can emit just a single VS_CONST state- group when we re-use single cmdstream for both binning and draw passes. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	1a51c4a87e	freedreno/a6xx: a bit more state emit cleanup Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	2ffc79c7d1	freedreno/a6xx: move framebuffer state emit to emit_mrt() No point in checking this per-draw, since framebuffer change means new batch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	5894f37b85	freedreno/a6xx: small emit_mrt() cleanup On a6xx, this is only used for pfb->cbufs so we can just directly pass the pfb state. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	b4e94af37d	freedreno/a6xx: use program cache Use the in-memory cache to construct shader program state and re-use it on subsequent draws, to lower driver overhead. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	1d7fbe2cd1	freedreno/ir3: shader variant cache Cache that maps gallium hwcso (in this case, 'struct ir3_shader') plus shader variant key to a generation specific state object. This could eventually replace the linked list of shader variants, but for now it lets us re-use the work currently done in fdN_program_emit() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	2e9c08c0bc	freedreno/ir3: move binning_pass out of shader variant key Prep work for a following patch, that introduces a cache to map from program state (all shader stages) plus variant key to pre-baked hw state (which could be emit'd via CP_SET_DRAW_STATE, for example). To do that, we really want the variant key to be immutable, and to treat the binning pass shader as an extra shader stage, rather than as a VS variant. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	8b1a3b5dde	freedreno/ir3: track # of samplers used by shader This is useful for a6xx to avoid program state from depending on bound tex/samp state. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	1b9d69410c	freedreno/a6xx: texture state obj Unfortunately gallium doesn't match what the hw wants perfectly here, in using a separate CSO for each texture/sampler. So we have to use a hash table to map the collection of texture/samplers to hw state object. We probably could use separate hw state objects for texture and sampler state, but mesa/st tends to update the tex and samp state together. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	e8606b11dd	freedreno: add resource seqno Intended to be something more compact than a 64b pointer, which could be used as a key into hashtables. Prep work for texture state objects. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	abcdf5627a	freedreno/a6xx: move const emit to state group Eventually we want to move nearly everything, but no other state depends on const state, so this is the easiest one to move first. For webgl aquarium, this reduces GPU load by about 10%, since for each fish it does a uniform upload plus draw. Fish are frequently visible in only a single tile, so this skips the uniform uploads for other tiles. The additional step of avoiding WFI's when using CP_SET_DRAW_STATE seems to be worth an additional 10% gain for aquarium. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	a398d26fd2	freedreno/a6xx: add infrastructure for CP_DRAW_STATE Add helper to add state-groups to emit, and code to emit CP_DRAW_STATE packet if we have any state-groups. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	ec717fc629	freedreno: reduce resource dependency tracking overhead Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Neil Roberts	ee61790daf	freedreno: Remove the Emacs mode lines These are not necessary because the corresponding settings are set via the .dir-locals.el file anyway. Most of them were missing a ‘:’ after “tab-width” which was making Emacs display an annoying warning whenever you open the file. This patch was made with: sed -ri '/-\*- mode:/,/^$/d' \ $(find src/gallium/{drivers,winsys} -name \*.\[ch\] \ -exec grep -l -- '-\*- mode:' {} \+) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Neil Roberts	afe640b360	freedreno: Fix the Emacs indentation configuration file The .dir-locals.el had the wrong name for the truthy value so it wasn’t setting indent-tabs-mode. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Hyunjun Ko	8e798e28f7	freedreno: allocate batches from the cache in launch_grid Needs to allocate batches from the cache so that it could get a valid index and make resource dependency tracking right. In addition this fixes assertion on debug build since the commit `1a40faa8` landed. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Hyunjun Ko	2385d7b066	freedreno: adds nondraw param to fd_bc_alloc_batch Needs to specify nondraw when creating a batch through fd_bc_alloc_batch since it'd better create a batch through it rather than fd_batch_create. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	9e6019bd46	freedreno/a6xx: remove fd6_emit_render_cntl() It was dead code carried over from a5xx Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	835cb06965	freedreno/ir3: fix broken texcoord inputs TODO not sure if this is best solution, but current logic is broken for texcoord inputs. It is definitely the simplest solution. Fixes: `1a24f51966` freedreno/ir3: ignore unused inputs Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	cbf9fe50b5	freedreno: fix off-by-one error in BEGIN_RING() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Marek Olšák	669dd22983	util: document a limitation of util_fast_udiv32 trivial	2018-10-17 12:27:58 -04:00
Matt Turner	58a51d0a67	i965/fs: Add 64-bit int immediate support to dump_instructions() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-10-16 17:48:17 -07:00
Marek Olšák	fcc70e4855	radeonsi: track context rolls better for the Vega scissor bug workaround We should get fewer context rolls with the SET_CONTEXT_REG optimization, but it would have been for nothing if the scissor state rolled the context anyway. Don't emit the scissor state if there is no context roll.	2018-10-16 17:23:25 -04:00
Marek Olšák	25ddb15cfe	radeonsi: emit sample locations for 1xAA only when the hw bug is present	2018-10-16 17:23:25 -04:00
Marek Olšák	9b331e462e	radeonsi: use compute shaders for clear_buffer & copy_buffer Fast color clears should be much faster. Also, fast color clears on evicted buffers should be 200x faster on GFX8 and older.	2018-10-16 17:23:25 -04:00
Marek Olšák	5030adcbe0	radeonsi: use copy_buffer in buffer_do_flush_region directly	2018-10-16 17:23:25 -04:00
Marek Olšák	0b40fbc879	radeonsi: use faster integer division for instance divisors We know the divisors when we upload them, so instead we can precompute and upload division factors derived from each divisor. This fast division consists of add, mul_hi, and two shifts, and we have to load 4 dwords instead of 1. This probably won't affect any apps.	2018-10-16 17:23:25 -04:00
Marek Olšák	bfc795670e	ac: add helpers for fast integer division by a constant	2018-10-16 17:23:25 -04:00
Marek Olšák	ea039f789d	radeonsi: use higher subpixel precision (QUANT_MODE) for smaller viewports	2018-10-16 15:28:22 -04:00
Marek Olšák	4fd8d2df9c	radeonsi: move emission of PA_SU_VTX_CNTL into emit_guardband We'll modify the quant mode there, which also affects the guardband computation.	2018-10-16 15:28:22 -04:00
Marek Olšák	41a6c3de1f	radeonsi: don't re-upload the sample position constant buffer repeatedly	2018-10-16 15:28:22 -04:00
Marek Olšák	b94824c787	radeonsi: set PA_SU_PRIM_FILTER_CNTL optimally	2018-10-16 15:28:22 -04:00
Marek Olšák	9e182b8313	radeonsi: center viewport to improve guardband clipping for high resolutions This will be more useful when we change the quant mode to increase subpixel precision and decrease the viewport range (which might not be possible if the viewport is not centered in the viewport range).	2018-10-16 15:28:22 -04:00
Marek Olšák	fedc1fda30	radeonsi: save raster config in screen, add se_tile_repeat	2018-10-16 15:28:22 -04:00
Marek Olšák	ac76aeef20	radeonsi: switch back to standard DX sample positions Apps may rely on them.	2018-10-16 15:28:22 -04:00
Marek Olšák	67f02cf810	radeonsi: add GDS support to CP DMA	2018-10-16 15:28:22 -04:00
Marek Olšák	0d05581578	radeonsi: rename si_gfx_* functions to si_cp_* and write_event_eop -> release_mem	2018-10-16 15:28:22 -04:00
Marek Olšák	6e1cf6532d	radeonsi: make si_gfx_write_event_eop more configurable	2018-10-16 15:28:22 -04:00
Sergii Romantsov	0fa9e6d7b3	anv/skylake: disable ForceThreadDispatchEnable On Skylake enabling of ForceThreadDispatchEnable causes gpu-hang. -v2: enabling of ForceThreadDispatchEnable is only for gen8, for gen9 and higher reverted enabling of PixelShaderHasUAV. -v3 (Jason Ekstrand): Rework the comments a bit. CC: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107941 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760 Fixes: `79270d2140` (anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-16 13:20:51 -05:00
Lionel Landwerlin	322a919a41	anv: Implement VK_EXT_pci_bus_info Even though the Intel GPU are always at the same PCI location, all the info we need is already provided by libdrm. Let's be future proof. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-16 12:47:55 +01:00
Jose Fonseca	8550be7a2f	appveyor: Cache pip's cache files. It should speed up the Python packages installation. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-16 09:41:14 +01:00
Jose Fonseca	bfb8afb14d	appveyor: Update to newer Mako/winflexbison versions. As that's what most people are bound to use. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-16 09:41:12 +01:00
Jose Fonseca	b94f9cd8f9	appveyor: Update to MSVC 2017. That's what we (and I suppose most people out there) are using now. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-16 09:41:07 +01:00
Samuel Pitoiset	647c2b90e9	radv: disable VK_SUBGROUP_FEATURE_VOTE_BIT This feature isn't used for now, so disable it until wwm is fixed in LLVM. Fixes dEQP-VK.subgroups.vote.graphics.subgroupallequal* https://bugs.freedesktop.org/show_bug.cgi?id=108115 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-16 10:24:19 +02:00
Samuel Pitoiset	593996bc02	radv: implement buffer to image operations for R32G32B32 This should fix rendering issues with Batman Arkham City. We will probably need to implement itob and itoi at some point, but currently nothing hits these paths. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-16 09:22:38 +02:00
Alex Smith	ca83d51cfb	ac/nir: Use context-specific LLVM types LLVMInt*Type() return types from the global context and therefore are not safe for use in other contexts. Use types from our own context instead. Fixes frequent crashes seen when doing multithreaded pipeline creation. Fixes: `4d0b02bb5a` "ac: add support for 16bit load_push_constant" Fixes: `7e7ee82698` "ac: add support for 16bit buffer loads" Cc: "18.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-16 08:18:24 +01:00
Vadym Shovkoplias	ad558408ff	glsl: Check the subroutine associated functions names Adding compile time check for subroutine functions with the same names. Similar check for intrastage linking was already landed in commit `5f0567a4f6`. From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification "A program will fail to compile or link if any shader or stage contains two or more functions with the same name if the name is associated with a subroutine type." Fixes: * no-overloads.vert Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109 Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-16 08:15:21 +03:00
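
Commits `0b40fbc879` and `bfc795670e` above describe replacing a per-instance integer division with factors precomputed from a divisor that is known at upload time. The general idea can be sketched as below. This is an illustration only, written in Python for brevity, and it is not the code from those commits: the function names `precompute_udiv` and `fast_udiv` are hypothetical, and the sketch uses a plain round-up multiplier with a single shift rather than the add, mul_hi, and two-shift sequence the commit message mentions.

```python
def precompute_udiv(divisor, bits=32):
    """Return (multiplier, shift) for a known unsigned divisor.

    Hypothetical helper: for every 0 <= n < 2**bits,
    n // divisor == (n * multiplier) >> shift.
    """
    if divisor < 1 or divisor >= (1 << bits):
        raise ValueError("divisor must satisfy 1 <= divisor < 2**bits")
    # ceil(log2(divisor)); 0 when divisor == 1.
    log2_ceil = (divisor - 1).bit_length()
    shift = bits + log2_ceil
    # ceil(2**shift / divisor), computed with integer arithmetic only.
    # The multiplier may need bits + 1 bits, which is one reason real
    # implementations split the work into extra add/shift steps.
    multiplier = -(-(1 << shift) // divisor)
    return multiplier, shift


def fast_udiv(n, multiplier, shift):
    """Divide n by the precomputed divisor using one multiply and one shift."""
    return (n * multiplier) >> shift


if __name__ == "__main__":
    # Precompute once per divisor, then reuse for many numerators.
    mult, sh = precompute_udiv(3)
    for n in (0, 1, 2, 3, 100, 2**32 - 1):
        assert fast_udiv(n, mult, sh) == n // 3
```

The precomputation runs once per divisor on the CPU, and only the multiply and shift would run per element, which is the trade-off the commit message describes (more data to load, cheaper arithmetic).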


105203 commits