fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 10:48:08 +02:00

Author	SHA1	Message	Date
Eric Anholt	69ef08d303	vc4: Make the pack-to-unorm instructions be non-SSA. This helps ensure that the register allocator doesn't force the later pack operations to insert extra MOVs. total instructions in shared programs: 98170 -> 98159 (-0.01%) instructions in affected programs: 2134 -> 2123 (-0.52%)	2015-08-20 23:42:17 -07:00
Eric Anholt	0bba4fa070	vc4: Allow QIR registers to be non-SSA. Now that we have NIR, most of the optimization we still need to do is peepholes on instruction selection rather than general dataflow operations. This means we want to be able to have QIR be a lot closer to the actual QPU instructions, just with virtual registers. Allowing multiple instructions writing the same register opens up a lot of possibilities.	2015-08-20 23:40:22 -07:00
Eric Anholt	ceb1a31842	vc4: We can now move TEX_RESULT accesses across other r4 ops. No difference on shader-db.	2015-08-20 23:40:16 -07:00
Ilia Mirkin	8483577f6b	nv50/ir: pre-compute BFE arg when both bits and offset are imm Due to a quirk in how the nv50 opt passes run, the algebraic optimization that looks for these BFE's happens before the constant folding pass. Rearranging these passes isn't a great idea, but this is easy enough to fix. Allows a following cvt to eliminate the bfe in certain situations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-20 22:16:46 -04:00
Glenn Kennard	4237dfb978	r600g: Fix handling of TGSI_OPCODE_ARR with SB FLT_TO_INT goes in the vector pipes on evergreen/NI, not the trans unit as on earlier chips. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-21 09:46:13 +10:00
Edward O'Callaghan	7a32652231	r600: Turn 'r600_shader_key' struct into union This struct was getting a bit crowded, following the lead of radeonsi, mirror the idea of having sub-structures for each shader type. Turning 'r600_shader_key' into an union saves some trivial memory and CPU cycles for the shader keys. [airlied: drop as_ls, and reorder so larger fields at start.] Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-21 09:46:13 +10:00
Edward O'Callaghan	e2145de74d	r600: Rewrite r600_shader_selector_key() to use a switch stmt Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-21 09:46:13 +10:00
Tobias Klausmann	3e6adbd761	nv50/ir: Handle OP_CVT when folding constant expressions [imirkin: handle more type combinations, use macro] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-20 17:58:30 -04:00
Ilia Mirkin	f5b926183d	nvc0/ir: undo more shifts still by allowing a pre-SHL to occur This happens with unpackSnorm lowering. There's yet another bitfield-extract behind it, but there's too much variation to be worth cutting through. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-20 17:58:30 -04:00
Ilia Mirkin	9ebe7dc094	nvc0/ir: don't require AND when the high byte is being addressed unpackUnorm* lowering doesn't AND the high byte/word as it's unnecessary. Detect that situation as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-20 17:58:30 -04:00
Ilia Mirkin	63cb85e567	nvc0/ir: detect i2f/i2i which operate on specific bytes/words Some Unigine shaders have been observed to unpack bytes out of 32-bit integers and convert them to floats. I2F/I2I can handle this sort of thing directly. Detect the handleable situations. This misses 16-bit word capabilities in nv50, but I haven't seen shaders that would actually make use of that. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-20 17:58:30 -04:00
Ilia Mirkin	51499bb5ff	nvc0/ir: detect AND/SHR pairs and convert into EXTBF Some shaders appear to extract bits using shift/and combos. Detect (some) of those and convert to EXTBF instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-20 17:58:30 -04:00
Chih-Wei Huang	2a4af36517	nv50/ir: support different unordered_set implementations If build with C++11 standard, use std::unordered_set. Otherwise if build on old Android version with stlport, use std::tr1::unordered_set with a wrapper class. Otherwise use std::tr1::unordered_set. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-20 17:58:30 -04:00
Marek Olšák	3b1e283d88	radeonsi: fix a typo as_es -> as_ls in a string Trivial.	2015-08-19 12:04:51 +02:00
Marek Olšák	5fb0180592	winsys/amdgpu: fix the type of memory usage counters If the 32-bit types overflowed, the driver could submit an IB that uses much more memory than is available. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-08-19 12:03:01 +02:00
Marek Olšák	421b809db1	radeonsi: fix indirect indexing of MSAA textures FMASK wasn't handled correctly. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-08-19 12:03:01 +02:00
Jason Ekstrand	f01bdb0484	util/ra: Make allocating conflict lists optional Since i965 is now using make_reg_conflicts_transitive and doesn't need q-value computations, they are disabled on i965. They are enabled everywhere else so that they get the old behavior. This reduces the time spent in eglInitialize() on BDW by around 10-15%. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-08-18 17:48:53 -07:00
Rob Clark	4a0bea3863	freedreno: use fd_pipe_wait_timeout() To properly support the case of waiting on a fence with a 0 timeout, we still need to call down to the kernel. Which requires the use of the new fd_pipe_wait_timeout() API. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-08-18 15:36:30 -04:00
Rob Clark	fd7a14f8dd	freedreno: fence fix Don't take current timestamp/fence from current ring, as we might have already rolled over to new rb. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-08-18 15:36:30 -04:00
Neil Roberts	885762e182	Add mesa.icd to the .gitignore Since `4d7e0fa8c7` this file is generated by the configure script. Reviewed-by: Tapani Palli <tapani.palli@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-08-18 12:12:15 -07:00
Grazvydas Ignotas	97f5d00648	radeon/uvd: remove unused variables Recent commits introduced new unused variable warnings, fix them. Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-18 14:11:48 +02:00
Marcos Paulo de Souza	df97126731	nouveau: recognize tess stages in nouveau_compiler Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 23:05:00 -04:00
Marcos Paulo de Souza	723a5a2e68	tgsi: fix parsing of tessellation shader inputs/outputs Tessellation control shaders write to outputs as OUT[ADDR[0].x][0], make sure to parse the indirect dimension on outputs. Also tess control inputs/outputs and tess eval input declarations need to receive the same treatment as geometry shader inputs. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 23:05:00 -04:00
Marcos Paulo de Souza	a37fa7653b	tgsi: set implicit array size for tess stages Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 22:50:16 -04:00
Ilia Mirkin	5af71fb5ac	freedreno/a3xx: add s3tc texture format support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 11:38:38 -04:00
Ilia Mirkin	581cbfdec1	freedreno/a3xx: fix up logic for handling block formats This only appears in cubemaps which have packed layers, so are very sensitive to any layout disagreement between sw and hw. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 11:38:38 -04:00
Ilia Mirkin	12e1bf0b68	freedreno/a3xx: double the polygon offset value A few other drivers do this, fixes the gl-1.4-polygon-offset piglit test Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 11:38:38 -04:00
Ilia Mirkin	1af0641db3	nvc0: implement the color buffer 0 is integer rule for alpha-to-one/cov The hardware checks for multisampling being enabled, but does not have the rule about cbuf0 being an integer format. Only enable alpha-to-one/alpha-to-coverage if cbuf0 is not an integer format. Fixes piglits ext_framebuffer_multisample-int-draw-buffers-alpha-to-one ext_framebuffer_multisample-int-draw-buffers-alpha-to-coverage Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 04:21:18 -04:00
Ilia Mirkin	2f5ee9bf27	gk110/ir: fix sched calculator to consider all registers in the ISA GK110/GK208 have 256 registers, not 64. Find out the number of registers from the target to avoid unnecessary iteration for pre-GK110. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 02:46:16 -04:00
Ilia Mirkin	ae5cf4f3f7	nvc0: program smooth line width when multisampling is enabled There are separate line widths for smooth and aliased lines. The smooth one is selected when multisampling is enabled even if line smoothing isn't explicitly turned on. Fixes the ext_framebuffer_multisample-line-smooth piglits Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 01:01:02 -04:00
Ilia Mirkin	884b4df3b6	nvc0: bind a fake tess control program when there isn't one available Apparently this is necessary in order for tess factors to work in a tess eval program without a tess control program bound. Probably because it uses the fake program's shader header to work out the number of patch constants. Fixes vs-tes-tessinner-tessouter-inputs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 01:01:02 -04:00
Ilia Mirkin	f13073b775	gm107/ir: avoid letting the lowering pass get out of sync There's a lot of functionality duplicated in the gm107 lowering pass from the nvc0 pass. As that one gets updated, the gm107 one falls behind. Avoid this by sharing the code. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-17 01:01:02 -04:00
Ilia Mirkin	2514c78fba	nv50,nvc0: take level into account when doing eng2d multi-layer blits This fixes arb_get_texture_sub_image-get, and any situation where the 2d engine was being used for multi-layer blits to a non-0 level. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.6" <mesa-stable@lists.freedesktop.org>	2015-08-17 01:01:02 -04:00
Ilia Mirkin	ca628085b6	freedreno/a3xx: add per-texture seamless cubemap control The default is to enable seamless cubemap filtering, but there's a bit to turn it off. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-16 03:01:53 -04:00
Ilia Mirkin	b4ace13eea	freedreno/a4xx: add cube map array support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-15 14:05:37 -04:00
Rob Clark	868b66fce7	freedreno/a4xx: fix srgb render targets Also fixes mipmap level generation for srgb textures. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-08-15 12:09:06 -04:00
Rob Clark	dd412c8fcb	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-08-15 12:08:34 -04:00
Ilia Mirkin	d19a98e2e6	freedreno: expose OES exts for float linear filtering a4xx can do both float and half-float, while a3xx can only do half-float Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-14 20:22:49 -04:00
Ilia Mirkin	d3e23f1ff9	nvc0: disable tessellation on maxwell The address calculations are all different (e.g. see GP), there appear to be sync's in programs, and probably a bunch of other differences. Just disable it for now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-14 16:02:26 -04:00
Eric Anholt	bf3c50fba2	vc4: Move all of our fixed function fragment color handling to NIR. This massively reduces our dependency on VC4-specific optimization passes. shader-db: total uniforms in shared programs: 32077 -> 32067 (-0.03%) uniforms in affected programs: 149 -> 139 (-6.71%) total instructions in shared programs: 98208 -> 98182 (-0.03%) instructions in affected programs: 2154 -> 2128 (-1.21%)	2015-08-14 11:39:18 -07:00
Eric Anholt	38c6c0f5b4	vc4: Add a helper for making driver-specific NIR load_uniform for GL state In order to move more of our lowering into NIR, we need the ability to reference various pipeline state (like texture rectangle scaling factors or blend colors), so we just set those up as a load_uniform with a big offset to indicate that it's not within the shader's uniform storage and is one of our state values.	2015-08-14 11:39:18 -07:00
Eric Anholt	9e6dc5b64d	nir: Add a nir_opt_undef() to handle csels with undef. We may find a cause to do more undef optimization in the future, but for now this fixes up things after if flattening. vc4 was handling this internally most of the time, but a GLB2.7 shader that did a conditional discard and assign gl_FragColor in the else was still emitting some extra code. total instructions in shared programs: 100809 -> 100795 (-0.01%) instructions in affected programs: 37 -> 23 (-37.84%) v2: Use nir_instr_rewrite_src() to update def/use on src[0] (by Thomas Helland). v3: Make sure to flag metadata dirties, and copy the swizzle and abs/neg over to src[0], too (by anholt). Reviewed-by: Thomas Helland <thomashelland90@gmail.com> (v2) Tested-by: Thomas Helland <thomashelland90@gmail.com> (v2)	2015-08-14 11:39:18 -07:00
Ilia Mirkin	b346a84e27	gm107/ir: indirect handle goes first on maxwell also Fixes fs-simple-texture-size.shader_test Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.6" <mesa-stable@lists.freedesktop.org>	2015-08-14 14:11:44 -04:00
Ilia Mirkin	7ff7d5d799	nv30: add depth bounds test support for hw that has it Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-14 13:05:29 -04:00
Ilia Mirkin	a6bf20d153	nv50: add depth bounds test support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-14 13:05:29 -04:00
Ilia Mirkin	d4087265f6	nvc0: add depth bounds test support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-14 13:05:29 -04:00
Marek Olšák	f47c59322e	radeonsi: revert a wrong DB bug workaround for VI The bug was misunderstood. Besides that, the bug affects a DB feature we don't use yet. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-08-14 15:02:31 +02:00
Boyuan Zhang	839bf82606	radeon/uvd: implement HEVC support add context buffer to fix H265 uvd decode issue. fix H265 corruption issue caused by incorrect assigned ref_pic_list. v2: disable interlace for HEVC add CZ sps flag workaround fix coding style Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-08-14 15:02:31 +02:00
Leo Liu	0654a9ca17	radeon/vce: disable VCE dual instance for harvest part Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-08-14 15:02:31 +02:00
Leo Liu	09def7e1e0	radeon/vce: implement VCE dual instance support VCE dual instances are encoding in parallel, it needs two frames for encoding with their own parameters in one IB. Master instance will check the task info to find another frame, assign it to the slave instance Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-14 15:02:31 +02:00

24315 commits