fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-06 21:28:15 +02:00

Author	SHA1	Message	Date
Marek Olšák	871d2aff24	gallium/radeon: fix partial layered transfers of cube (array) textures a staging cube texture with array_size % 6 != 0 doesn't work very well just use 2D_ARRAY or 2D for all staging textures Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	c2377b394b	gallium/radeon: align alignments for better buffer reuse It's for the buffer cache. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	544967faf5	gallium/radeon: use gart_page_size instead of hardcoded 4096 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	bfa8a00920	winsys/radeon: use gart_page_size instead of private size_align Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	9d8c283f28	winsys/amdgpu: move gart_page_size to struct radeon_winsys Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Roland Scheidegger	e4cf8717de	gallivm: print declarations of intrinsics with GALLIVM_DEBUG=ir Those aren't really interesting, however outputting them is helpful when trying to feed the IR to llvm llc (or opt) for debugging. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-10 17:08:16 +02:00
Roland Scheidegger	5c200894c8	gallivm: use InternalLinkage instead of PrivateLinkage for texture functions At least with MCJIT the disassembler will crash otherwise when trying to disassemble such functions. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-05-10 17:08:16 +02:00
Roland Scheidegger	8b66e2647d	gallivm: disable avx512 features We don't target this yet, and some llvm versions incorrectly enable it based on cpu string, causing crashes. (Albeit this is a losing battle, it is pretty much guaranteed when the next new feature comes along llvm will mistakenly enable it on some future cpu, thus we would have to proactively disable all new features as llvm adds them.) This should fix https://bugs.freedesktop.org/show_bug.cgi?id=94291 (untested) Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2016-05-10 17:08:16 +02:00
Samuel Iglesias Gonsálvez	d00a239b28	freedreno/ir3: lower lrp when operating with double operands Lower lrp when operating with double operands because float version of lrp is also lowered. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-05-10 11:25:01 +02:00
Samuel Pitoiset	eafe3905d9	nv50/ir: silence unsupported TGSI_PROPERTY_CS_FIXED_BLOCK_* We don't need them for compute shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-09 21:58:56 +02:00
Rob Clark	57763ee735	freedreno/ir3: fix fallout from new block iterators Since this is potentially modifying the block structure of the shader, it needs the _safe() version of the iterator. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 13:52:29 -04:00
Nicolai Hähnle	fe102f7677	radeonsi: workaround for tessellation on SI We request more than 32KB of LDS here, which SI doesn't have. Since LLVM recently started checking the size of declared LDS allocations, all shaders involved in tessellation fail to compile on SI. Note that the entire calculation here seems wrong, given how we calculate indices for generic attributes, so the number ends up wrong on CI+ as well. A proper solution is clearly needed, but this patch should serve as a band-aid for SI in the meantime. Also note that the real size of the LDS allocation in hardware is independent from what we tell LLVM, so this is really more of a "cosmetic" change. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95198 Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Nicolai Hähnle	d8f3e8e626	radeonsi: always allocate export memory for pixel shaders Experiments with framebuffer-no-attachments type draw calls have shown that NULL exports stall terribly unless we ensure that export memory is allocated by the SPI. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Nicolai Hähnle	ad1782cfb5	radeonsi: expose performance counters as 64 bit This is useful for shader-related counters, since they tend to quickly exceed 32 bits. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-09 11:52:46 -05:00
Tim Rowley	b65f7ec450	gallium: enable intel jitevents profiling LLVM when configured with "intel jitevents" enabled can inform VTune about dynamic code, so individual shaders are attributed profiling data and the resulting assembly can be examined. Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-05-09 11:25:02 -05:00
Bruce Cherniak	0062c5f09b	swr: Add missing break in query switch statement. Missed a switch break in query stat collection when refactoring queries. Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2016-05-09 11:21:47 -05:00
Rob Clark	f33083a216	freedreno/ir3: allow for additional VS sysval inputs There are a total of four possible currently, rather than 2. So we need to be prepared for the input array to grow by 16 components. We could get away with less if we could pack sysval inputs.. and the way this is handled currently isn't really the nicest thing. But it's a tactical fix for an issue hit in: GL31-CTS.gtf30.GL3Tests.transform_feedback.transform_feedback_vertex_id Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-05-09 11:51:59 -04:00
Marek Olšák	172bfdaa9e	r300g: add support for PIPE_FORMAT_x8R8G8B8_* And set endian swap for packed formats the way it should be done in theory. This allows big endian to work again, but it can still be buggy. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789 Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-05-09 13:11:40 +02:00
Nicolai Hähnle	b9e6e8e7d4	radeonsi: fix undefined behavior (memcpy arguments must be non-NULL) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	146927ce7b	radeonsi: fix some reported undefined left-shifts One of these is an unsigned bitfield, which I suspect is a false positive, but gcc 5.3.1 complains about it with -fsanitize=undefined. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	60d2fc233b	gallium/radeon: clean left-shift undefined behavior Shifting into the sign bit of a signed int is undefined behavior. Unfortunately, there are potentially many places where this happens using the register macros. This commit is the result of running sed -ie "s/(((\(\w\+\)) & 0x\(\w\+\)) << \(\w\+\))/(((unsigned)(\1) \& 0x\2) << \3)/g" on all header files in gallium/{r600,radeon,radeonsi}. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Nicolai Hähnle	62b7958cd0	gallium: fix various undefined left shifts into sign bit Funnily enough, some of these were turned into a compile-time error by gcc with -fsanitize=undefined ("initializer is not a constant"). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-07 16:46:59 -05:00
Bas Nieuwenhuizen	6291f19f71	radeonsi: Compute correct LDS size for fragment shaders. Not sure where the 36 came from, but we clearly need at least 48 bytes per attribute per primitive. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-06 21:40:17 +02:00
Eric Anholt	a1f698881e	vc4: Add support for loading immediate values in QIR. This will be used for resetting the uniform stream in the presence of branching, but may also be useful as an optimization to reduce how many uniforms we have to copy out per draw call (in exchange for increasing icache pressure).	2016-05-06 10:25:55 -07:00
Eric Anholt	890dc19eeb	vc4: Make vc4_qpu_validate() produce more verbose failures. Seeing the expansion of a QPU_GET_FIELD in an assert isn't very informative, and it's hard to find what's going wrong without getting a dump of the instruction that failed.	2016-05-06 10:25:55 -07:00
Eric Anholt	8e2d0843c0	vc4: Add a small QIR validate pass. This has caught a couple of bugs during loop development so far, and I should probably have written it long ago.	2016-05-06 10:25:55 -07:00
Eric Anholt	daaa9d579d	vc4: Fix the src count on exp2/log2. Found by the upcoming QIR validate pass.	2016-05-06 10:25:55 -07:00
Eric Anholt	d36b28402f	vc4: Reuse QPU disasm's cond flags in QIR. In the process, this made me flatten out the "%s%s%s%s" fprintf arguments.	2016-05-06 10:25:55 -07:00
Eric Anholt	419fee92ee	vc4: When emitting an instruction to an existing temp, mark it non-SSA. Prevents a bug in the later control-flow support series.	2016-05-06 10:25:55 -07:00
Eric Anholt	1387e722cd	vc4: Make sure that we don't overwrite the signal for PROG_END. We should have already emitted a NOP due to the last instruction being a TLB or VPM write. However, if you disable dead code elimination then you might get dead code at the end, and that dead code might have the signal bits set to something non-default, at which point you die in assertion failure.	2016-05-06 10:25:55 -07:00
Samuel Pitoiset	44de03b0f8	nvc0: unreference images when the context is destroyed Like other resources, we need to unreference all images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-06 15:15:32 +02:00
Marek Olšák	901f57dff5	radeonsi: set DECOMPRESS_Z_ON_FLUSH if nr_samples >= 4 Vulkan always sets this. It only affects in-place Z decompression. This is recommended for performance, but what app uses MSAA depth texturing? Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-06 12:56:47 +02:00
Marek Olšák	4489d75a58	r600g: use the hw MSAA resolving if formats are compatible This allows resolving RGBA into RGBX. This should improve HL2 Lost Coast performance. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-05-06 12:56:47 +02:00
Leo Liu	fef0e993a1	st/omx/enc: fix incorrect reference picture order for B frames Stacking frames is for drivers that are capable of dual-instance encoding. That feature is not currently enabled for B frames. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-05-05 19:26:43 -04:00
Connor Abbott	7c36f9eb52	vc4: fixup for new nir_foreach_block() Reviewed-by: Eric Anholt <eric@anholt.net>	2016-05-05 16:19:41 -07:00
Connor Abbott	582815d9ea	ir3: fixup for new nir_foreach_block()	2016-05-05 16:19:41 -07:00
Tim Rowley	ff8c0c9a35	swr: [rasterizer core] Faster modulo operator in ProcessVerts Avoid % operator, since we know that curVertex is always incrementing. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:50:11 -05:00
Tim Rowley	2be7c3e780	swr: [rasterizer] Small warning cleanup Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:50:03 -05:00
Tim Rowley	b39c530f88	swr: [rasterizer] Add SWR_ASSUME / SWR_ASSUME_ASSERT macros Fix static code analysis errors found by coverity on Linux Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:56 -05:00
Tim Rowley	db084f48eb	swr: [rasterizer] Miscellaneous backend changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:48 -05:00
Tim Rowley	3951a2109e	swr: [rasterizer] Add support for X24_TYPELESS_G8_UINT format Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:42 -05:00
Tim Rowley	909aee07f8	swr: [rasterizer jitter] Fix printing bugs for tracing. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:29 -05:00
Tim Rowley	bc084e6b3d	swr: [rasterizer memory] Add missing store tiles function Storing color hot tile to 8bit w-major stencil format. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:23 -05:00
Tim Rowley	5332c9d931	swr: [rasterizer jitter] Add asserts for supported formats in fetch shader Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:18 -05:00
Tim Rowley	6e89227054	swr: [rasterizer core] Fix thread allocation Fix windows in 32-bit mode when hyperthreading is disabled on Xeons. Some support for asymmetric processor topologies. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:11 -05:00
Tim Rowley	c2f5d2daa8	swr: [rasterizer core] Fix threadviz support in buckets Need to do lazy eval of the threadviz knob since order of globals is undefined. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:49:04 -05:00
Tim Rowley	1eb211c4a4	swr: [rasterizer] Whitespace cleanup and misc changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2016-05-05 14:48:55 -05:00
Nicolai Hähnle	d97e333ea4	radeonsi: mark descriptor loads as using dynamically uniform indices This tells LLVM to always use SMEM loads for descriptors. It fixes a regression in piglit's arb_shader_storage_buffer_object/execution/indirect.shader_test that was caused by LLVM r268259 (but the proper fix is really here in Mesa). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-05 12:21:40 -05:00
Bruce Cherniak	9d86a5eea7	swr: Remove stall waiting for core query counters. When gathering query results, swr_gather_stats was unnecessarily stalling the entire pipeline. Results are now collected asynchronously, with a fence marking completion. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2016-05-05 10:50:09 -05:00
Thomas Hindoe Paaboel Andersen	3a6763f0a0	freedreno: remove null check before free Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-05-05 09:34:01 +02:00
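
Commit 60d2fc233b in the list above describes casting register-macro operands to unsigned before a left shift, because shifting a signed int into its sign bit is undefined behavior in C. The following is a minimal sketch of that before/after pattern, not the actual Mesa headers: the macro names `S_FIELD_OLD` and `S_FIELD_NEW` and the helper `pack_top_bit` are hypothetical, and the sketch assumes `unsigned` is at least 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register-field macro in the pre-fix style: the masked value
 * keeps the (promoted) type of x, so for an int argument the shift into
 * bit 31 is undefined behavior. Kept only for comparison; never expanded. */
#define S_FIELD_OLD(x) (((x) & 0x1) << 31)

/* The same macro after the rewrite: the operand is converted to unsigned
 * before masking and shifting, so the shift is well defined. */
#define S_FIELD_NEW(x) (((unsigned)(x) & 0x1) << 31)

/* Packs a 1-bit flag into the top bit of a 32-bit register value. */
static uint32_t pack_top_bit(int flag)
{
    /* Assumption of this sketch: unsigned int has at least 32 bits. */
    assert(sizeof(unsigned) * 8 >= 32);
    return S_FIELD_NEW(flag);
}
```

With a signed argument, only the cast variant is safe to evaluate; building with `-fsanitize=undefined`, as the commit messages mention, is one way to catch the remaining uncast cases.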

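Commit ff8c0c9a35 in the list above mentions avoiding the `%` operator because the vertex index only ever increments. A rough sketch of that general idea follows; it is not the swr rasterizer code, and the function name `next_wrapped` and its parameters are hypothetical. When an index advances by exactly one per step, a compare-and-reset gives the same result as `(idx + 1) % count` without an integer division.

```c
#include <assert.h>
#include <stdint.h>

/* Advances idx by one within [0, count) and wraps to 0 at the end.
 * Equivalent to (idx + 1) % count when idx < count, but uses a compare
 * instead of a division. */
static uint32_t next_wrapped(uint32_t idx, uint32_t count)
{
    /* Precondition of this sketch: a non-empty range and an in-range index. */
    assert(count > 0 && idx < count);

    idx++;
    if (idx == count)
        idx = 0;
    return idx;
}
```

The trade-off is that this only works for unit steps; an index that can jump by more than `count` still needs a real modulo.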
27196 commits