fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-02 04:48:26 +02:00

Author	SHA1	Message	Date
Marek Olšák	076db67217	gallium/radeon: inline radeon_winsys::query_memory_usage Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	9646ae7799	gallium/radeon/winsyses: expose per-IB used_vram and used_gart to drivers The following patches will use this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	1c8f17599e	gallium/radeon/winsyses: print CS submission error number Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	0edc2e433e	radeonsi: flush if constant, shader, and streamout buffers use too much memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	c3efdeb8dd	radeonsi: flush if sampler views and images use too much memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	d82cfab84c	radeonsi: deal with high vertex buffer memory usage correctly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	e62caf576e	radeonsi: take compute shader and dispatch indirect memory usage into account Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	c56ecb68e7	radeonsi: take scratch buffer and draw indirect memory usage into account Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	ed2254d157	radeonsi: check IB memory usage of CP DMA operations Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Marek Olšák	f4b977bf3d	gallium/radeon: add r600_resource::vram_usage and gart_usage Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-06 13:56:14 +02:00
Jason Ekstrand	f29fd7897a	util: Move format_r11g11b10f.h to src/util It's used from both mesa main and gallium. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:57 -07:00
Jason Ekstrand	6c665cdfc5	util: Move format_rgb9e5.h to src/util It's used from both mesa main and gallium. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-08-05 09:06:31 -07:00
Tim Rowley	b521083ffb	swr: [rasterizer core] static analysis fixes for conservative rast Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:35 -05:00
Tim Rowley	68dc544879	swr: [rasterizer core] implement InnerConservative input coverage Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:35 -05:00
Tim Rowley	4034f48833	swr: [rasterizer core] remove CanEarlyZ function Test is now in SetupPipeline. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	b365989875	swr: [rasterizer core] use 32x32 macrotile for openswr Significant performance increase (up to 2x) on high geometry workloads. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	5f4bc9e85b	swr: [rasterizer fetch] add support for 24bit format fetch Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	527d45c8fe	swr: [rasterizer fetch] additional fetch format support Add support for 0 pitch in fetch. Add support for USCALE/SSCALE for 32bit integer fetches. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	f438b7ba81	swr: [rasterizer jitter] fix potential jit exit crash Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	57b07498d2	swr: [rasterizer core] update sync handling Sync now uses a callback to ensure that it's called by the last thread moving past a DC. This will help with the new counter handling. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:38:34 -05:00
Tim Rowley	191786d0f4	swr: [rasterizer core] rename variable Avoid nested declarations of the same name within a single function. Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:01:37 -05:00
Tim Rowley	61cc012e9a	swr: [rasterizer jitter] adjust extern "C" block scope Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:01:31 -05:00
Tim Rowley	9f7d99fcfe	swr: [rasterizer core] conservative rast degenerate handling Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 14:01:25 -05:00
Tim Rowley	f01827a469	swr: [rasterizer core] allow hexadecimal for integer knobs Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-08-04 13:52:12 -05:00
Eric Anholt	c976e164d2	vc4: Move scalarizing and some lowering to link time. This works out to be a wash in terms of memory usage: We use more memory to store the separate ALU instructions, but we optimize out a lot of code as well. The main result, though, is that we do more of our work at link time rather than draw time.	2016-08-04 08:48:27 -07:00
Eric Anholt	2350569a78	vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far. We don't want to bake the whole array into the FS key, because of the hashing overhead. But we can keep a set of the arrays seen, and use a pointer to the copy in as the array's proxy. Between this and the previous patch, gl-1.0-blend-func now passes on hardware, where previously it was filling the 256MB CMA area with shaders and OOMing. Drops 712 shaders from shader-db.	2016-08-04 08:48:27 -07:00
Eric Anholt	62ea2461ed	vc4: Don't recompile the CS when the FS changes. The compiled_fs_id is a proxy for the vc4->prog.fs->input_slots[], but only the VS dereferences it. Drops 754 shaders from shader-db.	2016-08-04 08:48:27 -07:00
Eric Anholt	d577dbc201	vc4: Move FS inputs setup out to a helper function. It's a pretty big block, and I was about to make it bigger.	2016-08-04 08:48:27 -07:00
Michel Dänzer	67c5e843b9	vl/dri3: Destroy Present event context when destroying drawable v2 Without this, the X server may accumulate stale Present event contexts if a client performs several video decoding sessions using the same window. v2: Based on Chris Wilson's review: * Use xcb_discard_reply() instead of free(xcb_request_check()) Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>	2016-08-04 15:45:43 +09:00
Eric Anholt	bc1fc9c985	vc4: Avoid generating a custom shader per level in glGenerateMipmaps(). We were baking in the LOD of the source level to each shader. Instead, pass it in as a uniform -- this requires storing it to a temp register, but that's better than compiling a ton of separate shaders: total instructions in shared programs: 115032 -> 115036 (0.00%) instructions in affected programs: 96 -> 100 (4.17%) LOST: 572	2016-08-03 10:55:54 -07:00
Eric Anholt	e97e9e62a1	vc4: Tell valgrind about BO allocations from mmap time to destroy. This helps in debugging memory pressure. It would be nice if we could tell valgrind about it all the way from allocation time to destroy, but we need a pointer to hand to VALGRIND_MALLOCLIKE_BLOCK.	2016-08-03 10:28:20 -07:00
Eric Anholt	a0671d67de	vc4: Fix a leak of the src[] array of VPM reads in optimization. Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-08-03 10:25:09 -07:00
Eric Anholt	9f95690959	vc4: Fix leak of the bo_handles table.	2016-08-03 10:25:08 -07:00
Eric Anholt	02f8c444e8	vc4: Fix handling of UBO range offsets. The ranges are in units of bytes, not dwords. This wasn't caught by piglit tests because ttn tends to make one big uniform file, so we only had one UBO range with a src and dst offset of 0.	2016-08-03 10:25:08 -07:00
Eric Anholt	36b9eb82c1	vc4: Dump NIR at shader state creation time as well. I keep wanting to see this version of the NIR.	2016-08-03 10:25:08 -07:00
Marek Olšák	435d9595d3	r600g: use last_gfx_fence like radeonsi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	a6bfafa083	gallium/radeon: move last_gfx_fence from radeonsi to common code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c15a9dec29	radeonsi: skip unnecessary si_update_shaders calls Small decrease in draw call overhead. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c2a0e99169	radeonsi: print the command line to VM fault reports (v2) v2: rebase on top of Brian's commit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	6573ad69ef	ddebug: print the command line to all logs (v2) for piglit with the pipelined hang detection mode v2: rebase on top of Brian's commit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	840353059a	ddebug: don't use fmemopen on non-Linux OS Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97140 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c88b309fd5	radeonsi: don't set the last parameter component of llvm.AMDGPU.cube LLVM doesn't use it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	42c5f839ad	radeonsi: use llvm.amdgcn.cube* if available Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	1fb6e55eaf	radeonsi: use llvm.amdgcn.rsq.f64 if available Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	db2d31dab1	radeonsi: use v_mad_f32 for fma v_fma_f32 runs at FP64 rate (= slow). Alien Isolation and F1 2015 seem to use fma for all d3d multiply-add instructions, which is silly. This tries to restore performance for those games. The main difference between v_mad_f32 and v_fma_f32 is that v_mad doesn't support denormals, which we don't enable anyway, because they are slow too. Also, there is code size reduction: Totals from affected shaders: VGPRS: 109796 -> 109808 (0.01 %) Spilled SGPRs: 29995 -> 30022 (0.09 %) Spilled VGPRs: 12 -> 13 (8.33 %) <-- it's just one shader going from 12 to 13 Code Size: 6667596 -> 6476356 (-2.87 %) bytes Max Waves: 26931 -> 26899 (-0.12 %) I've not actually tested real performance. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Tim Rowley	11072de368	swr: build swr with -fno-strict-aliasing swr rasterizer contains numerous data transfers between vectors and ordinary C types. Fixing for strict aliasing will take time. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-02 14:30:33 -05:00
Marek Olšák	6db93cd167	gallium/util: fix align64 it cut off the upper 32 bits Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-08-01 23:28:14 +02:00
Matt Turner	be35c6ba92	draw: Avoid aliasing violations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	8e68f35d32	r600g: Avoid aliasing violations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
Matt Turner	d2838f77ec	r300g: Avoid aliasing violation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-01 12:09:17 -07:00
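
Note on 6db93cd167 (gallium/util: fix align64): the message says only that the helper "cut off the upper 32 bits". The sketch below shows one way such a truncation can arise in C; it is not taken from the Mesa source. The function names align64_truncating and align64_fixed are hypothetical, and the assumption is that the alignment is a power of two passed as a 32-bit unsigned value, so the inverted mask is computed in 32 bits and zero-extended before the AND.

```c
#include <stdint.h>

/* Hypothetical buggy variant (an assumption, not the Mesa code):
 * ~(alignment - 1) is evaluated as a 32-bit value, then zero-extended
 * to 64 bits, so the AND clears bits 32..63 of the result. */
static inline uint64_t align64_truncating(uint64_t value, uint32_t alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical fixed variant: widen the alignment to 64 bits first so
 * the inverted mask keeps its upper 32 bits set. */
static inline uint64_t align64_fixed(uint64_t value, uint32_t alignment)
{
   uint64_t a = alignment;
   return (value + a - 1) & ~(a - 1);
}
```

With a value just above 4 GiB, such as 0x100000001 aligned to 256, the truncating variant returns 0x100 while the widened variant returns 0x100000100.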

28249 commits