fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 15:20:10 +01:00

Author	SHA1	Message	Date
Kenneth Graunke	e25a453b7f	i965: Add missing /* BRW_NEW_FRAGMENT_PROGRAM */ comments. I had to dig a bit to figure out why this was necessary. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:39 -07:00
Kenneth Graunke	3d31ed0d93	i965: Use "1ull" instead of "1" in BRW_NEW_* defines. Now that the bitfield is a uint64_t, we should use 1ull. Currently, we only have 32 entries, so 1 works fine, but it's not future-proof. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:38 -07:00
Kenneth Graunke	a114f452ae	i965: Use ~0ull when flagging all BRW_NEW_* dirty flags. ~0 is 0xFFFFFFFF, which only covers the first 32 bits. We need all 64. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:36 -07:00
Kenneth Graunke	5105f9a7ae	i965: Fix INTEL_DEBUG=state to work with 64-bit dirty bits. This will keep INTEL_DEBUG=state working when we add BRW_NEW_* bits beyond 1 << 31. We missed doing this when widening the driver flags from uint32_t to uint64_t. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:35 -07:00
Kenneth Graunke	fbebd5e4a5	i965: Delete CACHE_NEW_BLORP_CONST_COLOR_PROG. Unused since krh rewrote fast clears to use meta. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:24 -07:00
Chris Forbes	e4e3b0fc0d	i965: Fix typo in comment Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 18:37:06 +13:00
Chris Forbes	d8c5c4f3e4	i965: Fix spelling of GEN7_SAMPLER_EWA_ANISOTROPIC_ALGORITHM Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 18:37:06 +13:00
Vinson Lee	6a238ac0b7	llvmpipe: Add missing LLVMGetGlobalContext() arg in lp_test_format.c. Fix build error introduced with commit `eedbce9c63`. lp_test_format.c: In function ‘test_format_unorm8’: lp_test_format.c:226:4: error: too few arguments to function ‘gallivm_create’ gallivm = gallivm_create("test_module_unorm8"); ^ In file included from ../../../../src/gallium/auxiliary/gallivm/lp_bld_format.h:38:0, from lp_test_format.c:42: ../../../../src/gallium/auxiliary/gallivm/lp_bld_init.h:58:1: note: declared here gallivm_create(const char *name, LLVMContextRef context); ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84538 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-09-30 21:52:13 -07:00
Keith Packard	3202926746	glx/dri3: Provide error diagnostics when DRI3 allocation fails Instead of just segfaulting in the driver when a buffer allocation fails, report error messages indicating what went wrong so that we can debug things. As a simple example, chromium wraps Mesa in a sandbox which doesn't allow access to most syscalls, including the ability to create shared memory segments for fences. Before, you'd get a simple segfault in mesa and your 3D acceleration would fail. Now you get: $ chromium --disable-gpu-blacklist [10618:10643:0930/200525:ERROR:nss_util.cc(856)] After loading Root Certs, loaded==false: NSS error code: -8018 libGL: pci id for fd 12: 8086:0a16, driver i965 libGL: OpenDriver: trying /local-miki/src/mesa/mesa/lib/i965_dri.so libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted. libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted. libGL error: DRI3 Fence object allocation failure Operation not permitted [10618:10618:0930/200525:ERROR:command_buffer_proxy_impl.cc(153)] Could not send GpuCommandBufferMsg_Initialize. [10618:10618:0930/200525:ERROR:webgraphicscontext3d_command_buffer_impl.cc(236)] CommandBufferProxy::Initialize failed. [10618:10618:0930/200525:ERROR:webgraphicscontext3d_command_buffer_impl.cc(256)] Failed to initialize command buffer. This made it pretty easy to diagnose the problem in the referenced bug report. Bugzilla: https://code.google.com/p/chromium/issues/detail?id=415681 Signed-off-by: Keith Packard <keithp@keithp.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 21:23:04 -07:00
Keith Packard	f7a355556e	glx/dri3: Use four buffers until X driver supports async flips A driver which doesn't have async flip support will queue up flips without any way to replace them afterwards. This means we've got a scanout buffer pinned as soon as we schedule a flip and so we need another buffer to keep from stalling. When vblank_mode=0, if there are only three buffers we do: current scanout buffer = 0 at MSC 0 Render frame 1 to buffer 1 PresentPixmap for buffer 1 at MSC 1 This is sitting down in the kernel waiting for vblank to become the next scanout buffer Render frame 2 to buffer 2 PresentPixmap for buffer 2 at MSC 1 This cannot be displayed at MSC 1 because the kernel doesn't have any way to replace buffer 1 as the pending scanout buffer. So, best case this will get displayed at MSC 2. Now we block after this, waiting for one of the three buffers to become idle. We can't use buffer 0 because it is the scanout buffer. We can't use buffer 1 because it's sitting in the kernel waiting to become the next scanout buffer and we can't use buffer 2 because that's the most recent frame which will become the next scanout buffer if the application doesn't manage to generate another complete frame by MSC 2. With four buffers, we get: current scanout buffer = 0 at MSC 0 Render frame 1 to buffer 1 PresentPixmap for buffer 1 at MSC 1 This is sitting down in the kernel waiting for vblank to become the next scanout buffer Render frame 2 to buffer 2 PresentPixmap for buffer 2 at MSC 1 This cannot be displayed at MSC 1 because the kernel doesn't have any way to replace buffer 1 as the pending scanout buffer. So, best case this will get displayed at MSC 2. The X server will queue this swap until buffer 1 becomes the scanout buffer. 
Render frame 3 to buffer 3 PresentPixmap for buffer 3 at MSC 1 As soon as the X server sees this, it will replace the pending buffer 2 swap with this swap and release buffer 2 back to the application Render frame 4 to buffer 2 PresentPixmap for buffer 2 at MSC 1 Now we're in a steady state, flipping between buffer 2 and 3 waiting for one of them to be queued to the kernel. ... current scanout buffer = 1 at MSC 1 Now buffer 0 is free and (e.g.) buffer 2 is queued in the kernel to be the scanout buffer at MSC 2 Render frames, flipping between buffer 0 and 3 When the system can replace a queued buffer, and we update Present to take advantage of that, we can use three buffers and get: current scanout buffer = 0 at MSC 0 Render frame 1 to buffer 1 PresentPixmap for buffer 1 at MSC 1 This is sitting waiting for vblank to become the next scanout buffer Render frame 2 to buffer 2 PresentPixmap for buffer 2 at MSC 1 Queue this for display at MSC 1 1. There are three possible results: 1) We're still before MSC 1. Buffer 1 is released, buffer 2 is queued waiting for MSC 1. 2) We're now after MSC 1. Buffer 0 was released at MSC 1. Buffer 1 is the current scanout buffer. a) If the user asked for a tearing update, we swap scanout from buffer 1 to buffer 2 and release buffer 1. b) If the user asked for non-tearing update, we queue buffer 2 for the MSC 2. In all three cases, we have a buffer released (call it 'n'), ready to receive the next frame. Render frame 3 to buffer n PresentPixmap for buffer n If we're still before MSC 1, then we'll ask to present at MSC 1. Otherwise, we'll ask to present at MSC 2. Present already does this if the driver offers async flips, however it does this by waiting for the right vblank event and sending an async flip right at that point. I've hacked the intel driver to offer this, but I get tearing at the top of the screen. 
I think this is because flips are always done from within the ring, and so the latency between the vblank event and the async flip happening can cause tearing at the top of the screen. That's why I'm keying the need for the extra buffer on the lack of 2D driver support for async flips. Signed-off-by: Keith Packard <keithp@keithp.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-09-30 20:08:28 -07:00
Jason Ekstrand	eedbce9c63	i965/fs: Fix the build	2014-09-30 17:27:33 -07:00
Jason Ekstrand	83669fac9d	i965/fs: Fix an uninitialized value warnings Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 17:26:05 -07:00
Roland Scheidegger	9750ae8ca9	galahad: fix indirect draw Need to unwrap the indirect resource otherwise bad things will happen. Fixes random crashes and timeouts with piglit's arb_indirect_draw tests. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-10-01 02:17:24 +02:00
Roland Scheidegger	e3da8c110c	galahad: (trivial) handle cubemap arrays Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-10-01 02:16:57 +02:00
Matt Turner	3e7f8005db	i965/fs: Emit compressed BFI2 instructions on Gen > 7. IVB had a restriction that prevented us from emitting compressed three-source instructions, and although that was lifted on Haswell, Haswell had a new restriction that said BFI instructions specifically couldn't be compressed.	2014-09-30 17:09:34 -07:00
Matt Turner	9f5e5bd34d	i965/fs: Allow SIMD16 borrow/carry/64-bit multiply on Gen > 7. These checks were intended for Gen 7 only. None of these restrictions apply to Gen 8. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Matt Turner	05586f9bc1	i965/fs: Set MUL source type to W/UW in 64-bit mul macro on Gen8. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Matt Turner	94b68109fb	i965/fs: Optimize sqrt+inv into rsq. Transform sqrt a, b rcp c, a into sqrt a, b rsq c, b The improvement here is that we've broken a dependency between these instructions. Leads to 330 fewer INV instructions and 330 more RSQ. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Matt Turner	b52126b44f	i965/vec4: Optimize sqrt+inv into rsq. Transform sqrt a, b rcp c, a into sqrt a, b rsq c, b In most cases the sqrt's result is still used, so the improvement here is that we've broken a dependency between these instructions. Leads to 80 fewer INV instructions and 80 more RSQ. Occasionally the sqrt's result is no longer used, leading to: instructions in affected programs: 5005 -> 4949 (-1.12%) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Matt Turner	189ac07764	i965/vec4: Call opt_algebraic after opt_cse. The next patch adds an algebraic optimization for the pattern sqrt a, b rcp c, a and turns it into sqrt a, b rsq c, b but many vertex shaders do a = sqrt(b); var1 /= a; var2 /= a; which generates sqrt a, b rcp c, a rcp d, a If we apply the algebraic optimization before CSE, we'll end up with sqrt a, b rsq c, b rcp d, a Applying CSE combines the RCP instructions, preventing this from happening. No shader-db changes. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Matt Turner	d13bcdb3a9	i965/fs: Extend predicated break pass to predicate WHILE. Helps a handful of programs in Serious Sam 3 that use do-while loops. instructions in affected programs: 16114 -> 16075 (-0.24%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Mathias Fröhlich	6e7d36fd2c	gallivm: Fix build for LLVM 3.2 Do not rely on LLVMMCJITMemoryManagerRef being available. The C binding to the memory manager objects only appeared in llvm-3.4. The change is based on an initial patch of Brian Paul. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-01 00:29:31 +02:00
Rob Clark	cc355f1c06	freedreno: destroy transfer pool after blitter Blitter can still have transfers hanging around which it frees in util_blitter_destroy(). So let it clean up before we yank the transfer_pool from under it. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-30 16:56:15 -04:00
Rob Clark	01ff0b28b3	freedreno/lowering: fix token calculation for lowering Indirect registers consume an additional token. Try to clean up the token calculation math a bit, and fix it at the same time. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-30 16:56:15 -04:00
Ian Romanick	408aa46ca8	i965/fs: Don't make a name for a vector splitting temporary If the name is just going to get dropped, don't bother making it. If the name is made, release it sooner (rather than later). No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:43 -07:00
Ian Romanick	0b47252999	glsl: Don't make a name for the function return variable If the name is just going to get dropped, don't bother making it. If the name is made, release it sooner (rather than later). No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:43 -07:00
Ian Romanick	c87d09d7f0	glsl: Don't allocate a name for ir_var_temporary variables Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 74 40,578,719,715 67,762,208 62,263,404 5,498,804 0 After (32-bit): 52 40,565,579,466 66,359,800 61,187,818 5,171,982 0 Before (64-bit): 74 37,129,541,061 95,195,160 87,369,671 7,825,489 0 After (64-bit): 76 37,134,691,404 93,271,352 85,900,223 7,371,129 0 A real savings of 1.0MiB on 32-bit and 1.4MiB on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:43 -07:00
Ian Romanick	eaa0c74142	glsl: Use ir_var_temporary for compiler generated temporaries These few places were using ir_var_auto for seemingly no reason. The names were not added to the symbol table. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:43 -07:00
Ian Romanick	04e1357d97	glsl: Add context-level controls for whether temporaries have real names No change Valgrind massif results for a trimmed apitrace of dota2. v2: Minor rebase on _mesa_init_constants changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	a99482482d	glsl: Never put ir_var_temporary variables in the symbol table Later patches will give every ir_var_temporary the same name in release builds. Adding a bunch of variables named "compiler_temp" to the symbol table can only cause problems. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	7625babfae	glsl: Add the possibility for ir_variable to have a non-ralloced name Specifically, ir_var_temporary variables constructed with a NULL name will all have the name "compiler_temp" in static storage. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	0e654ab1b9	glsl: Store ir_variable_data::_num_state_slots and ::binding in 16-bits each Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 44 40,577,049,140 68,118,608 62,441,063 5,677,545 0 After (32-bit): 71 40,583,408,411 67,761,528 62,263,519 5,498,009 0 Before (64-bit): 63 37,122,829,194 95,153,008 87,333,600 7,819,408 0 After (64-bit): 67 37,123,303,706 95,150,544 87,333,600 7,816,944 0 A real savings of 173KiB on 32-bit and no change on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	a32ac726ee	glsl: Squish ir_variable::max_ifc_array_access and ::state_slots together At least one of these pointers must be NULL, and we can determine which will be NULL by looking at other fields. Use this information to store both pointers in the same location. If anyone can think of a better name for the union than "u", I'm all ears. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 63 40,574,239,515 68,117,280 62,618,607 5,498,673 0 After (32-bit): 44 40,577,049,140 68,118,608 62,441,063 5,677,545 0 Before (64-bit): 53 37,126,451,468 95,150,256 87,711,304 7,438,952 0 After (64-bit): 63 37,122,829,194 95,153,008 87,333,600 7,819,408 0 A real savings of 173KiB on 32-bit and 368KiB on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	5aa8d8194c	glsl: Make ir_variable::num_state_slots and ir_variable::state_slots private Also move num_state_slots inside ir_variable_data for better packing. The payoff for this will come in a few more patches. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	21df016902	glsl: Make ir_variable::max_ifc_array_access private The payoff for this will come in a few more patches. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	8afe6efa21	glsl: Store ir_variable::depth_layout using 3 bits warn_extension_index was moved to improve packing. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 73 40,580,476,304 68,488,400 62,796,151 5,692,249 0 After (32-bit): 73 40,575,751,558 68,116,528 62,618,607 5,497,921 0 Before (64-bit): 71 37,124,890,613 95,889,584 88,089,008 7,800,576 0 After (64-bit): 62 37,123,578,526 95,150,784 87,711,304 7,439,480 0 A real savings of 173KiB on 32-bit and 368KiB on 64-bit. v2: Use the enum name with the bit-field and remove the extra casts. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v1]	2014-09-30 13:34:42 -07:00
Ian Romanick	ab51179f1f	glsl: Replace ir_variable::warn_extension pointer with an 8-bit index Also move the new warn_extension_index into ir_variable::data. This enables slightly better packing. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 82 40,580,040,531 68,488,992 62,973,695 5,515,297 0 After (32-bit): 73 40,580,476,304 68,488,400 62,796,151 5,692,249 0 Before (64-bit): 65 37,124,013,542 95,892,768 88,466,712 7,426,056 0 After (64-bit): 71 37,124,890,613 95,889,584 88,089,008 7,800,576 0 A real savings of 173KiB on 32-bit and 368KiB on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:41 -07:00
Ian Romanick	baf5a75664	glsl: Use accessors for ir_variable::warn_extension The payoff for this will come in the next patch. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:41 -07:00
Ian Romanick	1012e95a40	glsl: Eliminate unused built-in variables after compilation After compilation (and before linking) we can eliminate quite a few built-in variables. Basically, any uniform or constant (e.g., gl_MaxVertexTextureImageUnits) that isn't used (with one exception) can be eliminated. System values, vertex shader inputs (with one exception), and fragment shader outputs that are not used and not re-declared in the shader text can also be removed. gl_ModelViewProjectionMatrix and gl_Vertex are used by the built-in function ftransform. There are some complications with eliminating these variables (see the comment in the patch), so they are not eliminated. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 46 40,661,487,174 75,116,800 68,854,065 6,262,735 0 After (32-bit): 50 40,564,927,443 69,185,408 63,683,871 5,501,537 0 Before (64-bit): 64 37,200,329,700 104,872,672 96,514,546 8,358,126 0 After (64-bit): 59 36,822,048,449 96,526,888 89,113,000 7,413,888 0 A real savings of 4.9MiB on 32-bit and 7.0MiB on 64-bit. v2: Don't remove any built-in with Transpose in the name. v3: Fix comment typo noticed by Anuj. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: Eric Anholt <eric@anholt.net>	2014-09-30 13:34:41 -07:00
Ian Romanick	77005cfabd	glsl: Validate that built-in uniforms have backing state All built-in uniforms are supposed to be backed by some GL state. The state_slots field describes this backing state. This helped me track down a bug in a later patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-30 13:34:41 -07:00
Eric Anholt	8786544b3e	vc4: Don't forget to store stencil along with depth when storing either. Otherwise, we'd replace the stencil in our packed depth/stencil with 0s. Fixes about 50 piglit tests.	2014-09-30 12:55:28 -07:00
Mathias Fröhlich	43e2109326	llvmpipe: Reuse llvmpipes LLVMContext in the draw context. Reuse the LLVMContext already allocated in llvmpipe_context for draw_llvm if possible. This should decrease the memory footprint of an llvmpipe context. v2: Fix compile with llvm disabled. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-09-30 20:51:02 +02:00
Mathias Fröhlich	d90ff351f3	llvmpipe: Make a llvmpipe OpenGL context thread safe. This fixes the remaining problem with the recently introduced global jit memory manager. This change again uses a memory manager that is local to gallivm_state. This implementation still frees the majority of the memory immediately after compilation. Only the generated code is deferred until this code is no longer used. With this change and the previous one using private LLVMContext instances, I can now safely run several independent OpenGL contexts driven by llvmpipe from different threads. v3: Rebase on llvm-3.6 compile fixes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-09-30 20:51:02 +02:00
Mathias Fröhlich	83c62597fc	llvmpipe: Use two LLVMContexts per OpenGL context instead of a global one. This is one step to make llvmpipe thread safe as mandated by the OpenGL standard. Using the global LLVMContext is obviously a problem for that kind of use pattern. The patch introduces two LLVMContext instances that are private to an OpenGL context and used for all compiles. One is put into struct draw_llvm and the other one into struct llvmpipe_context. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-09-30 20:45:19 +02:00
Jason Ekstrand	98d00d6640	i965/brw_reg: Make the accumulator register take an explicit width. The big pile of patches I just pushed regresses about 25 piglit tests on SNB. This fixes the regressions. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-30 11:42:34 -07:00
Brian Paul	6b65847835	llvmpipe: move lp_jit_screen_init() call after allocation of screen object The screen argument isn't actually used by lp_jit_screen_init() at this time, but let's move the call so that we pass a valid pointer. v2: don't leak screen if lp_jit_screen_init() fails. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-09-30 12:09:14 -06:00
Brian Paul	b12899d752	tgsi: fix Semantic.Name assignment in tgsi_transform_input_decl() Assign the sem_name parameter, not TGSI_SEMANTIC_GENERIC. Fixes polygon stipple regression. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-30 12:08:49 -06:00
Brian Paul	0fb1e6b7b4	util: simplify PIPE_TEXTURE_CUBE case in util_max_layer() For cube resources, the array_size value should be 6. So handle that case as we do for array texture resources. But assert that array_size==6 just to be safe. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 12:08:49 -06:00
Brian Paul	59562e9ba5	softpipe: don't special case PIPE_TEXTURE_CUBE in softpipe_resource_layout() As with the previous patch for llvmpipe. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-30 12:08:49 -06:00
Brian Paul	3d77b80d80	llvmpipe: remove special case for PIPE_TEXTURE_CUBE in llvmpipe_texture_layout() layers (aka array_size) should be 6 for cube textures so we don't need to special-case it. But add an assertion just to be safe. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-30 12:08:49 -06:00
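
Commits 3d31ed0d93 and a114f452ae above concern dirty-flag bits stored in a uint64_t. The following is a minimal sketch of why the 64-bit suffix matters once a flag lives above bit 31. The `DEMO_*` names, the bit positions, and the helper functions are hypothetical and are not taken from the i965 driver; the 32-bit case is modeled here with `UINT32_MAX`, i.e. the 0xFFFFFFFF value the commit message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags for illustration only; not the driver's BRW_NEW_* values. */
#define DEMO_NEW_LOW   (1ull << 3)   /* would also fit in a 32-bit constant */
#define DEMO_NEW_HIGH  (1ull << 40)  /* only representable in a 64-bit constant */

/* Returns true when every bit of `flags` is set in `dirty`. */
bool demo_all_set(uint64_t dirty, uint64_t flags)
{
   return (dirty & flags) == flags;
}

/* "Everything dirty" built from a 32-bit all-ones value.
 * Widening zero-extends, so the result is 0x00000000FFFFFFFF. */
uint64_t demo_dirty_all_32(void)
{
   return (uint64_t)UINT32_MAX;
}

/* "Everything dirty" built from a 64-bit constant: all 64 bits are set. */
uint64_t demo_dirty_all_64(void)
{
   return ~0ull;
}

/* Small self-check showing the difference between the two masks. */
void demo_dirty_example(void)
{
   assert(demo_all_set(demo_dirty_all_32(), DEMO_NEW_LOW));
   assert(!demo_all_set(demo_dirty_all_32(), DEMO_NEW_HIGH)); /* bit 40 missed */
   assert(demo_all_set(demo_dirty_all_64(), DEMO_NEW_HIGH));
}
```

Calling `demo_dirty_example()` from a `main` exercises all three checks. The sketch only covers the unsigned 32-bit case; how a particular expression widens in real code depends on its type, so the `ull` suffix is the unambiguous choice.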

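Commits 94b68109fb and b52126b44f above describe rewriting `sqrt a, b; rcp c, a` into `sqrt a, b; rsq c, b`. The following is a rough sketch of that kind of peephole over a flat instruction array. The `demo_*` IR, its field names, and the function are hypothetical; Mesa's actual passes operate on its own instruction classes and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-source IR, for illustration only. */
enum demo_op { DEMO_SQRT, DEMO_RCP, DEMO_RSQ, DEMO_OTHER };

struct demo_inst {
   enum demo_op op;
   int dst;  /* destination register number */
   int src;  /* source register number */
};

/* Rewrites "rcp c, a" into "rsq c, b" when a was last written by
 * "sqrt a, b" and b has not been overwritten since.
 * Returns the number of instructions rewritten. */
int demo_opt_rsq(struct demo_inst *insts, size_t count)
{
   int rewrites = 0;

   for (size_t i = 0; i < count; i++) {
      if (insts[i].op != DEMO_RCP)
         continue;

      /* Find the most recent earlier instruction that wrote the RCP's source. */
      size_t j = i;
      while (j > 0 && insts[j - 1].dst != insts[i].src)
         j--;
      if (j == 0)
         continue; /* no producer found in this sequence */

      const struct demo_inst *def = &insts[j - 1];
      if (def->op != DEMO_SQRT || def->dst == def->src)
         continue; /* not a SQRT, or the SQRT overwrote its own source */

      /* The SQRT's source must still hold the same value at the RCP. */
      bool clobbered = false;
      for (size_t k = j; k < i; k++) {
         if (insts[k].dst == def->src)
            clobbered = true;
      }
      if (clobbered)
         continue;

      /* The RCP no longer depends on the SQRT's result. */
      insts[i].op = DEMO_RSQ;
      insts[i].src = def->src;
      rewrites++;
   }
   return rewrites;
}

/* Usage: sqrt r1, r0; rcp r2, r1  becomes  sqrt r1, r0; rsq r2, r0. */
void demo_rsq_example(void)
{
   struct demo_inst prog[] = { { DEMO_SQRT, 1, 0 }, { DEMO_RCP, 2, 1 } };
   assert(demo_opt_rsq(prog, 2) == 1);
   assert(prog[1].op == DEMO_RSQ && prog[1].src == 0);
}
```

The SQRT is left in place, since its result may still be used elsewhere; removing it when dead would be the job of a separate dead-code pass, which this sketch does not attempt.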

65836 commits