fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-06 10:38:17 +02:00

Author	SHA1	Message	Date
Rob Clark	f8feb97ba5	freedreno/ir3: fix silly brain-fart in RA We want to consider all the vars, not 1/32nd of them, when extending live-ranges. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	8e451c2d06	freedreno/ir3: don't cp into phi's The block defining a phi source might not have been executed. If we allow copy propagation, we could end up pointing to a src instruction in the wrong block. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	383b6e87f9	freedreno/ir3: we can't store immediate values Fixes some transform-feedback piglits, like: bin/ext_transform_feedback-nonflat-integral Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	d47fb856af	freedreno/ir3: add dumping for use/def/live-in/live-out Turned out to be useful to debug an issue in RA. Let's keep it. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	38ae05a340	freedreno/ir3: drop unused instr category arg No longer used, so drop the extra arg to ir3_instr_create() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	19739e4fb9	freedreno/ir3: remove ir3_instruction::category Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Rob Clark	70735643f4	freedreno/ir3: encode instruction category in opc_t Been on my TODO list for a while. If nothing else this will make gdb properly grok the opc_t enum. This first step preserves ir3_instruction::category (with an added assert that category matches what is encoded in opc_t). Next step is to drop the category field (and arg to ir3_instr_create()), but that is split into next commit for bisectability and so that we can run piglit in the intermediate state to flush out any problems. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-04-04 20:18:18 -04:00
Ilia Mirkin	4bc3b1ca48	nvc0: add hardware ETC2 and ASTC support on GK20A and GM107+ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-04 00:32:48 -04:00
Jose Fonseca	7ad49daca6	gallivm: Introduce lp_format_intrinsic. For adding .v4f32 like suffixes to intrinsics, taking special care for scalar case, which was being often neglected. This fixes invalid IR when doing mipmap filtering on SSE2 (the only case where we'd use intrinsics with scalars.) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-04 00:06:09 +01:00
Jose Fonseca	a293f57e13	gallivm: Use llvm.fabs. Exactly the same code. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 22:09:09 +01:00
Jose Fonseca	e4f01da15d	gallivm: Prefer backend agnostic intrinsic for rounding. We could unconditionally use these intrinsics, but performance with SSE2 would suck, as LLVM falls back to calling libm. lp_test_arit. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 22:09:07 +01:00
Jose Fonseca	324451e73f	gallivm: Add debug option to force SSE2. For simulating less capable machines. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 22:08:57 +01:00
Jose Fonseca	5fa31a4aba	llvmpipe: Test abs. Trivial.	2016-04-03 11:17:20 +01:00
Jose Fonseca	522ebe701d	llvmpipe: Build lp_test_arit on MSVC too. It builds fine now. Probably due to C99 support. Trivial.	2016-04-03 11:17:20 +01:00
Jose Fonseca	b284f1f7f9	gallivm: Fix performance regressions due to vector selects. LLVM often can't determine the mask elements are all ones/zeros, and there doesn't seem to be a good way to hint that. Thanks to Roland Scheidegger for spotting and analyzing the issue. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 09:51:27 +01:00
Jose Fonseca	11c4e5b45c	gallivm: Remove lp_build_load_volatile. No longer needed. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 09:51:27 +01:00
Jose Fonseca	bcfb86b09d	gallivm: Use standard LLVMSetAlignment from LLVM 3.4 onwards. Only provide a fallback for LLVM 3.3. One less dependency on LLVM C++ interface. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-03 09:51:27 +01:00
Ilia Mirkin	d64134ecae	gm107/ir: add OP_SELP emission, used in DSQRT lowering The current DSQRT lowering code emits an OP_SELP, so we have to handle its emission. This will eventually go away, but no harm supporting this op. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-02 19:27:51 -04:00
Ilia Mirkin	3610b1466d	nv50/ir: we can't load local memory directly into an output This fixes piglit tests like tests/spec/glsl-1.10/execution/variable-indexing/vs-output-array-float-index-wr.shader_test and related ones. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-02 18:10:20 -04:00
Samuel Pitoiset	0852c5703b	nv50/ir: fix envyas variants when building the code lib nvc0 and nve4 have been respectively replaced by gf100 and gk104. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-02 20:00:57 +02:00
Brian Paul	36d8fed798	svga: remove unused svga_compile_key::texture_msaa field Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-02 08:05:20 -06:00
Brian Paul	b283c76342	svga: check TXF instruction's target to determine MSAA Rather than the currently bound texture. This goes along with the earlier patch to get away from examining bound textures and sampler views during shader translation. Fixes VMware bug 1632739. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-02 08:05:20 -06:00
Brian Paul	ef10b5427a	tgsi: add simple tgsi_is_msaa_target() helper Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-04-02 08:05:20 -06:00
Bas Nieuwenhuizen	1a5c8c24b5	gallium: distinguish between shader IR in get_compute_param For radeonsi, native and TGSI use different compilers and this results in different limits for different IR's. The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE and MAX_THREADS_PER_BLOCK params, but I added a few others as shader related that seemed like they would also typically depend on the compiler. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:51:13 +02:00
Bas Nieuwenhuizen	be5899dcf9	gallium: add global buffer memory barrier bit Currently radeonsi synchronizes after every dispatch and Clover does nothing to synchronize. This is overzealous, especially with GL compute, so add a barrier for global buffers. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:51:06 +02:00
Bas Nieuwenhuizen	01f993a21f	gallium: add threads per block TGSI property The value 0 for unknown has been chosen so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:50:59 +02:00
Bas Nieuwenhuizen	ea8f4a6b13	gallium: add compute shader IR type Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-02 01:49:57 +02:00
Samuel Pitoiset	60e1c6a7fc	nvc0: enable compute shaders on GK104 and GM107+ Compute support on GK110 is still unstable for weird reasons, but this can be fixed later as the NVF0_COMPUTE envvar prevents using compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	71f327aa21	nvc0: bump the maximum number of UBOs for compute on Kepler The maximum number of uniform blocks (MAX_COMPUTE_UNIFORM_BLOCKS) per compute program must be at least 12. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	839a469166	nvc0/ir: do not lower shared+atomics on GM107+ For Maxwell, the ATOMS instruction can be used to perform atomic operations on shared memory instead of this load/store lowering pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	543fb95473	nvc0/ir: add atomics support on shared memory for Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	275019d7db	nvc0/ir: fix wrong pred emission for ld lock on GK104 This fixes `84b9b8f` (nvc0/ir: add missing emission of locked load predicate). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	4f58b78c30	nvc0/ir: add support for compute UBOs on Kepler Make sure to avoid out of bounds access in presence of indirect array indexing by loading the size from the driver constant buffer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	3b246a71d7	nvc0: add indirect compute support on Kepler The grid size is stored as three 32-bit integers in the indirect buffer but the launch descriptor uses a single 32-bit integer for both griddim_y and griddim_z, like this: (z << 16) | y. To make it work, the 16 high bits of griddim_y are overwritten by griddim_z. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	7797d5f7d9	nvc0: reduce likelihood of collision for real buffers on Kepler Reduce likelihood of collision with real buffers by placing the hole at the top of the 4G area. This fixes some indirect draw+compute tests with large buffers. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	e2e8085fac	nvc0: store ubo info to the driver constbuf on Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	12aa047c98	nvc0: bind user uniforms for compute on Kepler Uniform buffer objects will be stuck to the driver constant buffer like buffers because the launch descriptor only allows 8 CBs. Input kernel parameters for OpenCL are still uploaded to screen->parm which is bound on c0, but this will be changed later with a new series. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	1828d90a00	nvc0: bind shader buffers for compute on Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Samuel Pitoiset	debd910512	nvc0: bind driver cb for compute on c7[] for Kepler Instead of using the screen->parm buffer object which will be removed, upload auxiliary constants to uniform_bo to be consistent regarding what we already do for Fermi. This breaks surfaces support (for compute only) but this will be properly re-introduced later for ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-01 22:26:24 +02:00
Jose Fonseca	f72de6f386	gallivm: Prevent disassembly debug output from being truncated. By using os_log_message directly, as _debug_vprintf truncates messages to 4K. Also cleanup the disassemble interface. Spotted by Roland. Trivial.	2016-04-01 21:22:42 +01:00
Mauro Rossi	e09d04cd56	radeonsi: use util_strchrnul() to fix android build error Android Bionic does not support strchrnul() string function, gallium auxiliary util/u_string.h provides util_strchrnul() This change avoids the following building error: external/mesa/src/gallium/drivers/radeonsi/si_shader.c:3863: error: undefined reference to 'strchrnul' collect2: error: ld returned 1 exit status Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-04-01 13:56:57 +01:00
Jose Fonseca	cdf7c6b83d	gallivm: Use vector selects on LLVM 3.3+. This is an old patch I had around. Vector selects seem to work well from LLVM 3.3. Using them should improve code quality, as it might make constant propagation pass more effective. Tested lp_test_* Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-04-01 09:05:19 +01:00
Ilia Mirkin	df03be196a	nv50,nvc0: add PIPE_BIND_LINEAR support to is_format_supported vdpau has recently come to rely on this, so make sure to check it properly. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-03-31 21:53:11 -04:00
Samuel Pitoiset	d22eca5f90	tgsi: silence compiler warning in fetch_sampler_unit() The unit variable can be used uninitialized. Fixes: `24e77cb09` ("tgsi: handle indirect sampler arrays. (v2)") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-01 07:16:24 +10:00
Samuel Pitoiset	05902a6686	tgsi: fix out of bounds access in exec_atomop() The number of channels must be 4 for all RGBA components. Fixes: `22d129601` ("tgsi: add support for image operations to tgsi_exec. (v2.1)") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-04-01 07:15:16 +10:00
Brian Paul	9076e04934	tgsi: split tgsi_util_get_texture_coord_dim() function into two It was kind of overloaded, returning two different things. Now get the index of the shadow reference src register with a new tgsi_util_get_shadow_ref_src_index() function. To verify the new code, I added some temp/debug code which looped over all TGSI_TEXTURE_x values, calling the old function and new and checking that the returned indexes matched. Also tested piglit "shadow" tests with softpipe/llvmpipe. No testing of ilo and radeonsi changes. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:48:00 -06:00
Brian Paul	9d7cd43988	tgsi: skip texture query opcodes when examining texture targets Should fix the assertion in piglit spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the TXQ instruction specifies a 2D target but the sampler view was declared as SHADOW2D. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-03-31 09:47:40 -06:00
Pierre Moreau	f96a403bc3	nv50/ir: Check for valid insn instead of def size This fixes a null pointer dereference during the register allocation pass, if a function had arguments. Function arguments get a definition from the function itself, a definition which is therefore not linked to any instruction. If a value ends up having a definition but no linked instruction, the register allocation pass doesn't need to consider whether that value is generated by an instruction that can only handle "short" registers (on nv50). Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2016-03-31 10:30:29 -04:00
Dave Airlie	eb9ad9faa3	softpipe: add image support to softpipe (v3) This adds support for ARB_shader_image_load_store to softpipe. v2: add RESQ support (Ilia) v3: constify, cleanup internals, add some comments (Brian). Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:16 +10:00
Dave Airlie	0d1f679ded	draw: add support for passing images to vs/gs shaders. This just adds support for passing through images to the tgsi execution stage. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-03-31 09:14:11 +10:00
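
The packing described in commit 3b246a71d7 (griddim_y and griddim_z sharing one 32-bit word as (z << 16) | y) can be illustrated with a small sketch. This is not the driver code: the function names and the sample grid size are hypothetical, and the 16-bit range check is an assumption that follows from the layout rather than something the commit message states.

```python
# Illustrative sketch only; names are hypothetical and not taken from Mesa.

def pack_griddim_yz(y: int, z: int) -> int:
    """Pack two grid dimensions into one 32-bit word as (z << 16) | y."""
    # Assumption: each dimension must fit in 16 bits for this layout to work.
    if not (0 <= y <= 0xFFFF and 0 <= z <= 0xFFFF):
        raise ValueError("y and z must each fit in 16 bits")
    return (z << 16) | y


def unpack_griddim_yz(word: int) -> tuple[int, int]:
    """Recover (y, z) from a word packed by pack_griddim_yz."""
    return word & 0xFFFF, (word >> 16) & 0xFFFF


# Grid size as three separate integers, the way an indirect buffer stores it.
grid_x, grid_y, grid_z = 64, 8, 2
packed = pack_griddim_yz(grid_y, grid_z)
print(f"griddim_x={grid_x} griddim_yz=0x{packed:08x}")
```

Writing z into the high half of the word that already holds y is what the commit message means by overwriting the 16 high bits of griddim_y.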


27608 commits