fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 11:10:10 +01:00

Author	SHA1	Message	Date
Gregory Hainaut	8ed8592fd6	mesa/sso: Add support for GL_PROGRAM_SEPARABLE query This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	4177d39c1e	mesa/sso: Implement _mesa_IsProgramPipeline Implement IsProgramPipeline based on the VAO code. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	0c26552662	mesa/sso: Implement _mesa_GenProgramPipelines Implement GenProgramPipelines based on the VAO code. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	55311557fd	mesa/sso: Implement _mesa_DeleteProgramPipelines Implement DeleteProgramPipelines based on the VAO code. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	f4c13a890f	mesa/sso: Add pipeline container/state V1: * Extend gl_shader_state as pipeline object state * Add a new container gl_pipeline_shader_state that contains binding point of the previous object * Update mesa init/free shader state due to the extension of the attribute * Add an init/free pipeline function for the context V2: * Rename gl_shader_state to gl_pipeline_object * Rename Pipeline.PipelineObj to Pipeline.Current * Formatting improvement V3 (idr): * Split out from previous uber patch. * Remove '#if 0' debug printfs. V4 (idr): * Fix some errors in comments. Suggested by Jordan. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	0f137a1d73	mesa: Add a mutex and refcounting to gl_shader_state Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	47476fa673	mesa: Make get_shader_flags publicly available Future patches will use this function outside shaderapi.c. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	73b78f9c9f	mesa/sso: Add extension entry points for GL_ARB_separate_shader_objects Nothing is implemented yet except glProgramUniform*, which are mostly a copy/paste of the older glUniform* functions. I created a dedicated pipelineobj.[ch] file that will contain functions related to the "new" pipeline container object. V2: formatting improvement V3: * indentation fix * Update copyright * Add a comment on ProgramParameteri already present in another extension * Remove TODO, will be readded on correct patch V4 (idr): * Fix dispatch_sanity unit test * Make extension string available in core profiles (instead of just compatibility). * Trivial reformatting Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Ian Romanick	4d14b190bb	glsl/sso: Add parser and AST-to-HIR support for separate shader object layouts GL_ARB_separate_shader_objects adds the ability to specify location layouts for interstage inputs and outputs. In addition, this extension makes 'in' and 'out' generally available for shader inputs and outputs. This mimics the behavior of GL_ARB_explicit_attrib_location. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Ian Romanick	f3b184590f	mesa/sso: Add extension tracking for ARB_separate_shader_objects This adds the necessary bits for both the API and the GLSL compiler. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Ian Romanick	79146065f9	mesa: Refactor per-stage link check to its own function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:01 -08:00
Eric Anholt	c2ebbe2728	i965: Stop throwing away our double precision for time calculations. Fixes negative times being reported in our perf debug. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:50 -08:00
Eric Anholt	f2f337c6d5	meta: Add support for integer blits. Compared to i965, the code generated doesn't use the AVG instruction. But I'm not sure that multisampled integer resolves are really that important to worry about. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	b0a8d0ee40	meta: Add support for doing MSAA to MSAA blits. These are non-stretched, non-resolving blits, so it's just a matter of sampling once from our gl_SampleID and storing that to our color/depth. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	eb55b01eef	meta: Save and restore a bunch of MSAA state. We're disabling GL_MULTISAMPLE, so we didn't need to worry about a lot of that state. But to do MSAA to MSAA blits, we need to start handling more state. v2: Fix pasteo caught by Kenneth. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	f7f15d3c2d	meta: Try to do blending of sRGB values in linear colorspace. Blending of values would occur when doing GL_LINEAR filtering with scaling, and in an upcoming commit when doing MSAA resolves. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	7d2f73e737	meta: Add support for doing multisample resolves. Note that this doesn't handle GL_EXT_multisample_scaled_blit yet. The i965 code for that extension bakes in knowledge of the sample positions (well, knowledge of the sample positions aligned to a lower-resolution grid), which we would have to do at runtime somehow for meta. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	aba85d960e	i965: Fix miptree matching for multisampled, non-interleaved miptrees. We haven't been executing this code before the meta-blit case, because we've been flagging the miptree as validated at texstorage time, and never having to revalidate. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Courtney Goeltzenleuchter	941769be81	mesa: Remove unnecessary condition. Identified by Valgrind memory check. Initialized block-opaque in a different patch. This test seems unnecessary. If opaque must be true, just set to true. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>	2014-02-21 10:16:10 -08:00
Francisco Jerez	9b2fe7cf96	clover: Unabbreviate a few data accessor names for consistency. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:23 +01:00
Francisco Jerez	a0d99937a0	clover: Replace the transfer(new ...) idiom with a safer create(...) helper function. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	c4578d2277	clover: Migrate a bunch of pointers and references in the object tree to smart references. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	d82b39ce38	clover: Allow storing a range into a container of different (but compatible) element type. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	1b9fb2fd91	clover: Define an intrusive smart reference class. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	9ae0bd3829	clover: Some improvements for the intrusive pointer class. Define some additional convenience operators, clean up the implementation slightly, and rename it to 'intrusive_ptr' for reasons that will be obvious in the next commit. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	198cd136b9	clover: Fix up NULL constant pointer arguments. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:29:05 +01:00
Jordan Justen	c97763ca2d	tgsi_ureg: add property_gs_invocations Fixes a build break in state_tracker/st_program.c Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75278 Reviewed-by: Dave Airlie <airlied@redhat.com>	2014-02-20 16:41:01 -08:00
Kenneth Graunke	808952a095	i965/fs: Implement FS_OPCODE_[UN]PACK_HALF_2x16_SPLIT[_XY] opcodes. I'd neglected to port these to Broadwell. Most of this code is copy and pasted from Gen7, but instead of using F32TO16/F16TO32, we just use MOV with HF register types. Fixes fs-packHalf2x16 and fs-unpackHalf2x16 tests (both the ARB extension and ES 3.0 variants). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:59 -08:00
Kenneth Graunke	850e372fc7	i965: Drop bogus F32TO16/F16TO32 instructions on Broadwell - use MOV. Broadwell removed the F32TO16 and F16TO32 instructions. However, it has actual support for HF values, so they're actually just MOV. Fixes vs-packHalf2x16 and vs-unpackHalf2x16 tests (both the ARB extension and ES 3.0 variants). v2: Emulate F32TO16's align16 zeroing bug, since Chad's front end code relies on it happening. We can probably refactor this code to be better later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:57 -08:00
Kenneth Graunke	3663bbe773	i965: Create a hardware context before initializing state module. brw_init_state() calls brw_upload_initial_gpu_state(). If hardware contexts are enabled (brw->hw_ctx != NULL), this will upload some initial invariant state for the GPU. Without hardware contexts, we rely on this state being uploaded via atoms that subscribe to the BRW_NEW_CONTEXT bit. Commit `46d3c2bf4d` accidentally moved the call to brw_init_state() before creating a hardware context. This meant brw_upload_initial_gpu_state would always early return. Except on Gen6+, we stopped uploading the initial GPU state via state atoms, so it never happened. Fixes a regression since `46d3c2bf4d`. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	e3823147a5	i965/fs: Implement scratch read/write support for Broadwell. To make sure that both the Gen4 and Gen7 style messages work, I initially disabled the SHADER_OPCODE_GEN7_SCRATCH_READ optimization, ran Piglit, re-enabled it, and ran Piglit again. Both worked fine. Fixes 40 Piglit tests (most of the varying-packing category). v2: Move num_regs assertion from gen8_fs_generator to gen8_set_dp_scratch_message() (suggested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	29a6974403	i965: Add Gen8 assembly support for DP Scratch messages. The new accessors will make it easy to do Gen7-style scratch messages. v2: Move num_regs assertion from gen8_fs_generator into gen8_set_dp_scratch_message() (suggested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	a5e54c91a3	i965: Store absolute thread count in max_wm_threads on Broadwell. In the past, 3DSTATE_PS took an absolute number of threads. Conversely, on Broadwell you always program 64, and it implicitly scales based on the GT-level with no special programming. So, I stored 64 in brw_device_info::max_wm_threads. However, I didn't realize that we also use max_wm_threads to compute the size of the scratch space buffer. In that case, we really need the absolute number of threads. This patch hardcodes 3DSTATE_PS to use the value it expects, and changes max_wm_threads back to a (completely fake) absolute thread count (once again copied from Haswell). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	dca84b4b5b	i965: Use MOV, not OR for setting URB write channel enables on Gen8+. On Broadwell, g0.5 contains the "Scratch Space Pointer"; using OR puts some bits of that into "ignored" sections of our message header. While this doesn't hurt, it's also not terribly /useful/. Using MOV is sufficient to set the only interesting bits in this part of the message header. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:07 -08:00
Kenneth Graunke	e643c7d036	i965: Implement a CS stall workaround on Broadwell. According to the latest documentation, any PIPE_CONTROL with the "Command Streamer Stall" bit set must also have another bit set, with five different options: - Render Target Cache Flush - Depth Cache Flush - Stall at Pixel Scoreboard - Post-Sync Operation - Depth Stall I chose "Stall at Pixel Scoreboard" since we've used it effectively in the past, but the choice is fairly arbitrary. Implementing this in the PIPE_CONTROL emit helpers ensures that the workaround will always take effect when it ought to. Apparently, this workaround may be necessary on older hardware as well; for now I've only added it to Broadwell as it's absolutely necessary there. Subsequent patches could add it to older platforms, provided someone tests it there. v2: Only flag "Stall at Pixel Scoreboard" when none of the other bits are set (suggested by Ian Romanick). v3: Prefix the function with "gen8" (requested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:07 -08:00
Jordan Justen	741782b594	i965: support instanced GS on gen7 v3: * Properly prevent dual object mode execution when the invocation count > 1 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	008338bc4e	i965: support gl_InvocationID for gen7 v2: * Make gl_InvocationID a system value v3: * Properly shift from R0.1 into DST.4 by adding GS_OPCODE_GET_INSTANCE_ID Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	d099019935	glsl: add gl_InvocationID variable for ARB_gpu_shader5 v2: * Make gl_InvocationID a system value Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	22388e2208	main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support v3: * Add check for ARB_gpu_shader5 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	86d6b5546b	mesa: initialize gl_geometry_program Invocations field Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	313402048f	glsl/linker: produce gl_shader_program Geom.Invocations Grab the parsed invocation count, check for consistency during linking, and finally save the result in gl_shader_program Geom.Invocations. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Jordan Justen	02dc74fbd7	glsl: parse invocations layout qualifier for ARB_gpu_shader5 _mesa_glsl_parse_state in_qualifier->invocations will store the invocations count. v3: * Use in_qualifier to allow the primitive to be specified separately from the invocations count (merge_qualifiers) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Jordan Justen	738c9c3c54	glsl: Generate error for invalid input layout declarations Fixes various piglit tests: spec/glsl-1.50/compiler/incorrect-in-layout-qualifier-*.geom Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Jordan Justen	0c558f9ee6	glsl: convert GS input primitive to use ast_type_qualifier We introduce a new merge_in_qualifier ast_type_qualifier which allows specialized handling of merging input layout qualifiers. By merging layout qualifiers into state->in_qualifier, we allow multiple input qualifiers. For example, the primitive type can be specified separately from the invocations count (ARB_gpu_shader5). state->gs_input_prim_type is moved into state->in_qualifier->prim_type state->gs_input_prim_type_specified is still processed separately so we can determine when the input primitive is specified. This is important since certain scenarios are not supported until after the primitive type has been specified in the shader code. v4: * Merge with compute shader input layout qualifiers Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Eric Anholt	5bc0b2f432	i965: Fix extra return value after winsys rb update refactor. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75172 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-20 10:15:13 -08:00
Eric Anholt	9245206cbf	i965/vs: Use samplers for UBOs in the VS like we do for non-UBO pulls. Improves performance of a dolphin emulator trace I had laying around by 3.60131% +/- 0.995887% (n=128). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-20 10:15:13 -08:00
Eric Anholt	9e3cab8881	i965/fs: Add an optimization pass to remove redundant flags movs. We generate steaming piles of these for the centroid workaround, and this quickly cleans them up. total instructions in shared programs: 1591228 -> 1590047 (-0.07%) instructions in affected programs: 26111 -> 24930 (-4.52%) GAINED: 0 LOST: 0 (Improved apps are l4d2, csgo, and dolphin) Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-20 10:15:13 -08:00
Roland Scheidegger	b2b2a2c06c	gallivm: add smallfloat to float conversion not relying on cpu denorm handling The previous code relied on cpu denorm support for converting small float formats (such r11g11b10_float and r16_float) to floats, otherwise denorms are flushed to zero. We worked around that in llvmpipe blend code by reenabling denorms, but this did nothing for texture sampling. Now it would be possible to reenable it there too but I'm not really a fan of messing with fpu flags (and it seems we can't actually do it reliably with llvm in any case looking at some bug reports). (Not to mention if you actually have a lot of denorms in there, you can expect some order-of-magnitude slowdown with x86 cpus.) So instead use code which adjusts exponents etc. directly hence not relying on cpu denorm support for the rescaling mul. (We still need the fpu flag handling as we can't do float-to-smallfloat without using cpu denorms at least for now - I actually wanted to keep both the old and new code and using one or the other depending on from where it's called but that didn't work out as the parameter would have to be passed through too many layers than I'd like.) Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Si Chen <sichen@vmware.com>	2014-02-20 18:41:42 +01:00
Leo Liu	0206f0b3d4	st/omx/enc: add multi scaling buffers for performance improvement Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-20 13:34:16 +01:00
Christian König	754fa3a0d2	st/omx/dec/h264: fix prevFrameNumOffset handling Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-20 13:34:06 +01:00


55884 commits