V2 (idr):
* Keep the behavior of other info logs in Mesa: an empty info log
reports a GL_INFO_LOG_LENGTH of zero.
* Use a NULL pointer to denote an empty info log.
* Split out from previous uber patch.
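
A minimal sketch of the convention described in the list above, assuming a
hypothetical pipeline_object struct with an info_log field (these names are
illustrative, not the actual Mesa structures): a NULL log pointer means
"empty", and the reported length is zero in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pipeline object; only the info log matters. */
struct pipeline_object {
   char *info_log;   /* NULL denotes an empty info log */
};

/* Value to report for GL_INFO_LOG_LENGTH: zero for an empty log, otherwise
 * the string length plus the terminating NUL character. */
int
info_log_length(const struct pipeline_object *pipe)
{
   assert(pipe != NULL);
   return pipe->info_log ? (int) strlen(pipe->info_log) + 1 : 0;
}
```

Counting the terminating NUL for non-empty logs follows how GL defines the
info log length in general; the struct itself is only an assumption made for
illustration.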
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tests that become green in piglit:
The updated ext_transform_feedback-api-errors:useprogstage_noactive useprogstage_active bind_pipeline
arb_separate_shader_object-GetProgramPipelineiv
arb_separate_shader_object-IsProgramPipeline
For the moment I reuse Driver.UseProgram, but I guess it would be better
to create a UseProgramStages function. Opinions are welcome.
V2: formatting & rename
V3 (idr):
* Change spec references to core OpenGL versions instead of issues in the
extension spec.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Now arb_separate_shader_object-GetProgramPipelineiv should pass.
V3 (idr):
* Change spec references to core OpenGL versions instead of issues in
the extension spec.
* Split out from previous uber patch.
v4 (idr): Use _mesa_has_geometry_shaders in _mesa_UseProgramStages to
detect availability of geometry shaders.
v5 (idr): Whitespace cleanup, use _mesa_lookup_shader_program_err
instead of open-coding it again, and update some comments at the end of
_mesa_UseProgramStages. All suggested by Eric.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Extend use_shader_program to support a different target. This allows the
function to be reused to update the pipeline state. Note that I bypass the
flush when the target isn't current. Maybe it would be better to create a
new UseProgramStages driver function.
This was originally included in another patch, but it was split out by
Ian Romanick.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Save and restore the _Shader/Pipeline binding point. Rationale: we don't
want any conflict when the program is unattached.
V2: formatting improvement
V3 (idr):
* Build fix. The original patch added calls to _mesa_use_shader_program
with 4 parameters, but the fourth parameter isn't added to that
function until a much later patch. Just drop that parameter for now.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Basically a sed, except for shaderapi.c and get.c:
get.c => GL_CURRENT_PROGRAM always refers to the "old" UseProgram behavior
shaderapi.c => the old API still updates the Shader object directly
V2: formatting improvement
V3 (idr):
* Rebase fixes after a block of code was moved from ir_to_mesa.cpp to
shaderapi.c.
* Trivial reformatting.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
To avoid NULL pointer checks, a default pipeline object is installed in
_Shader when no program is current.
The spec says that UseProgram/UseShaderProgramEXT/ActiveProgramEXT have
higher priority than the pipeline object. When the default program is
uninstalled, the pipeline is used if one was bound.
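
The selection rule above can be sketched roughly as follows. This is only
an illustration under stated assumptions: the context struct, its field
names, and update_shader_pointer() are hypothetical and do not mirror the
real gl_context layout.

```c
#include <assert.h>
#include <stddef.h>

struct pipeline { int id; };

/* Hypothetical context; field names are illustrative only. */
struct context {
   struct pipeline legacy;     /* state set through UseProgram and friends */
   struct pipeline fallback;   /* default object, so "shader" is never NULL */
   struct pipeline *bound;     /* user-bound pipeline object, may be NULL */
   int program_in_use;         /* nonzero while a UseProgram program is current */
   struct pipeline *shader;    /* plays the role of _Shader */
};

/* Pick what "shader" points at: UseProgram state wins, then a bound
 * pipeline, and finally the default object. */
void
update_shader_pointer(struct context *ctx)
{
   assert(ctx != NULL);
   if (ctx->program_in_use)
      ctx->shader = &ctx->legacy;
   else if (ctx->bound != NULL)
      ctx->shader = ctx->bound;
   else
      ctx->shader = &ctx->fallback;
}
```

With this arrangement, callers can dereference the pointer without a NULL
check, which is the point of installing a default object.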
Note: A careful rename needs to be done now...
V2: formatting improvement
V3 (idr):
* Build fix. The original patch added calls to _mesa_use_shader_program
with 4 parameters, but the fourth parameter isn't added to that
function until a much later patch. Just drop that parameter for now.
* Trivial reformatting.
* Updated comment of gl_context::_Shader
v4 (idr): Reformat spec quotations to look like spec quotations. Update
comments describing what gl_context::_Shader can point to. Both
suggested by Eric.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Eliminate lp_vertex_shader, as it added nothing over draw_vertex_shader.
Simplify lp_geometry_shader, as most of the incoming state is unneeded.
(We could also just use draw_geometry_shader if we were willing to peek
inside the structure.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
As done in draw_pipe_aaline and draw_pipe_aapoint modules.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
* Add HAVE_PTHREAD, we do have pthread support wrappers now for
non-native Haiku threaded applications.
* Viewport changed behavior recently breaking the build.
We fix this by looking at the gl_context ViewportArray
(Thanks Brian for the idea)
Acked-by: Brian Paul <brianp@vmware.com>
The SIMD16 replicated FB write message only works if we don't need the
color calculator to mask our framebuffer writes. Previously, we bailed
on it if color_mask wasn't <true, true, true, true>. However, this was
needlessly strict for formats with fewer than four components - only the
components that actually exist matter.
WebGL Aquarium attempts to clear a BGRX texture with the ColorMask set
to <true, true, true, false>. This will work perfectly fine with the
replicated data message; we just bailed unnecessarily.
Improves performance of WebGL Aquarium on Iris Pro (at 1920x1080) by
about 50%, and Bay Trail (at 1366x768) by over 70% (using Chrome 24).
v2: Use _mesa_format_has_color_component() to properly handle ALPHA
formats (and generally be less fragile).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Dylan Baker <baker.dylan.c@gmail.com>
WebGL Aquarium in Chrome 24 actually hits this.
v2: Move to core Mesa (wisely suggested by Ian); only consider
components which actually exist.
v3: Use _mesa_format_has_color_component to determine whether components
actually exist, fixing alpha format handling.
v4: Add a comment, as requested by Brian. No actual code changes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Dylan Baker <baker.dylan.c@gmail.com>
When considering color write masks, we often want to know whether an
RGBA component actually contains any meaningful data. This function
provides an easy way to answer that question, and handles luminance,
intensity, and alpha formats correctly.
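
As a rough sketch of such a query, assuming a hypothetical format_info
struct that stores per-channel bit counts (the struct and both function
names below are illustrative, not the actual Mesa API): a component counts
as present if the format stores it directly, or through luminance or
intensity. The second function shows the kind of color-mask test the query
enables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-format bit counts; not Mesa's real format tables. */
struct format_info {
   int red_bits, green_bits, blue_bits, alpha_bits;
   int luminance_bits, intensity_bits;
};

/* component: 0 = R, 1 = G, 2 = B, 3 = A.
 * Luminance feeds R, G and B; intensity feeds all four components. */
bool
format_has_component(const struct format_info *f, int component)
{
   assert(component >= 0 && component < 4);
   switch (component) {
   case 0:  return f->red_bits + f->luminance_bits + f->intensity_bits > 0;
   case 1:  return f->green_bits + f->luminance_bits + f->intensity_bits > 0;
   case 2:  return f->blue_bits + f->luminance_bits + f->intensity_bits > 0;
   default: return f->alpha_bits + f->intensity_bits > 0;
   }
}

/* True when the color mask enables every component the format stores, so
 * masked-off components that do not exist are ignored. */
bool
mask_covers_format(const struct format_info *f, const bool color_mask[4])
{
   for (int c = 0; c < 4; c++) {
      if (format_has_component(f, c) && !color_mask[c])
         return false;
   }
   return true;
}
```

For a BGRX-style format with a mask of <true, true, true, false>, the check
passes because the disabled alpha component carries no data.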
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Dylan Baker <baker.dylan.c@gmail.com>
This gets us equivalent code paths on BDW and pre-BDW, except for stencil
(where we don't have MSAA stencil resolve code yet)
Improves MSAA-forced citybench by 7.94496% +/- 2.38429% (n=16). Reduces
DRI2 MSAA glxgears performance by -12.3559% +/- 1.52845% (n=9).
v2: Move the new meta code to brw_meta_updownsample.c, name it
brw_meta_updownsample(), add a comment about
intel_rb_storage_first_mt_slice(), and rename that function and move
the RB generation into it (review ideas by Ken).
v3: Fix 2 src vs dst pasteos in previous change.
v4: Skip this path pre-gen8 for now, until we can analyze the glxgears
performance delta some more.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Now that BindRenderbufferTexImage() is a thing that drivers can do, winsys
FBOs *can* have NeedsFinishRenderTexture set.
v2: Keep the short-circuit for non-BindRenderbufferTexImage() drivers
(review by Ken).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Even if the singlesample_mt got reopened from DRI due to
pageflipping/buffer swapping, our private miptree shouldn't need any
changes.
Improves performance of a little swapbuffers-loving microbenchmark with
MSAA forced on, by 1.2371% +/- 0.624802% (n=102)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The formatting was weird, and the tests were duplicated, and it is
guaranteed that mt->region exists.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes a memory leak with MSAA winsys buffers since my move of
singlesample_mt to the rb in 4e0924c5de
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Sometimes it would be nice to benchmark some app with MSAA versus not, but
it doesn't offer the controls you want. Just provide a handy knob to
force the issue.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
For each write, search previous instructions for unread writes to the
flag register and remove them. Note that this will not eliminate the
last unread write.
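
A simplified sketch of that backward search, under several assumptions: the
inst struct is hypothetical, the program is one straight-line block, and
each flag-writing instruction has no other effect (so marking it dead is
safe). The real pass works on the backend IR and has more cases to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction record; only flag-register effects are modeled. */
struct inst {
   bool reads_flag;
   bool writes_flag;
   bool dead;
};

/* For each flag write, walk backwards until the flag is read. A write found
 * before any read was overwritten without being used, so mark it dead.
 * Returns the number of instructions marked dead. */
size_t
eliminate_dead_flag_writes(struct inst *insts, size_t count)
{
   size_t removed = 0;

   assert(insts != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      /* An instruction that also reads the flag keeps earlier writes alive. */
      if (!insts[i].writes_flag || insts[i].reads_flag)
         continue;

      for (size_t j = i; j-- > 0; ) {
         if (insts[j].dead)
            continue;
         if (insts[j].reads_flag)
            break;
         if (insts[j].writes_flag) {
            insts[j].dead = true;
            removed++;
            break;
         }
      }
   }
   return removed;
}
```

Because a write is only removed when a later write is found, the final
unread write in the block is left alone, matching the note above.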
total instructions in shared programs: 788074 -> 788004 (-0.01%)
instructions in affected programs: 4930 -> 4860 (-1.42%)
Reviewed-by: Eric Anholt <eric@anholt.net>
With an awful O(n^2) algorithm that searches previous instructions for
dead writes.
total instructions in shared programs: 805582 -> 788074 (-2.17%)
instructions in affected programs: 144561 -> 127053 (-12.11%)
Reviewed-by: Eric Anholt <eric@anholt.net>
That is, modify
mad dst, a, b, c
to be
mad dst.xyz, a, b, c
if dst.w is never read.
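
In terms of writemask bits, the transformation amounts to masking off the
channels that no later instruction reads. A minimal sketch, assuming the
set of channels read later has already been computed by some liveness
analysis (the CHAN_* names and reduce_writemask() are hypothetical):

```c
#include <assert.h>

/* Hypothetical channel bits for a 4-component destination. */
enum {
   CHAN_X = 1 << 0,
   CHAN_Y = 1 << 1,
   CHAN_Z = 1 << 2,
   CHAN_W = 1 << 3,
   CHAN_XYZW = CHAN_X | CHAN_Y | CHAN_Z | CHAN_W
};

/* Keep only the destination channels that are read later. A result of zero
 * means the whole write is dead. */
unsigned
reduce_writemask(unsigned writemask, unsigned channels_read_later)
{
   assert((writemask & ~(unsigned) CHAN_XYZW) == 0);
   return writemask & channels_read_later;
}
```

For the mad example, a full xyzw writemask combined with "only x, y and z
are read" yields the xyz writemask.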
total instructions in shared programs: 811869 -> 805582 (-0.77%)
instructions in affected programs: 168287 -> 162000 (-3.74%)
Reviewed-by: Eric Anholt <eric@anholt.net>
A future patch adds support for removing dead writes to the flag
register. This patch simplifies the logic until then.
total instructions in shared programs: 811813 -> 811869 (0.01%)
instructions in affected programs: 3378 -> 3434 (1.66%)
Reviewed-by: Eric Anholt <eric@anholt.net>
To be consistent with the fs backend. Also the instruction scheduler
incorrectly considered SEL with a conditional modifier to read the flag
register.
Reviewed-by: Eric Anholt <eric@anholt.net>
The ARB_framebuffer_object spec lists this case before the
FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER and
FRAMEBUFFER_INCOMPLETE_READ_BUFFER cases.
Fixes two broken cases in piglit's fbo-incomplete test, if
ARB_ES2_compatibility is not advertised. (If it is, this is masked
because the FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER /
FRAMEBUFFER_INCOMPLETE_READ_BUFFER cases are removed by that extension)
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
GL_INTENSITY has never been valid as a pixel format -- to get the memcpy
pack/unpack paths, the app needs to specify GL_RED as the pixel format
(or GL_RED_INTEGER for the integer formats).
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
With shared GLX contexts it is possible that a texture is created and used
in one context and then used in another one, resulting in incorrect
sampler view usage.
v2: avoid template copy
v3: add XXX comment
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
It's useful to know whether a clear is fast (MCS-based), using the
SIMD16 repdata message, or slow.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
This reverts commit 2919c3fdb4.
For formats like BGRX, looping through 0..num_components works fine.
But for formats like XRGB, we'd check the color mask for X and fail to
check it for B.
The SIMD16 replicated FB write message only works if we don't need the
color calculator to mask our framebuffer writes. Previously, we bailed
on it if color_mask wasn't <true, true, true, true>. However, this was
needlessly strict for formats with fewer than four components - only the
components that actually exist matter.
WebGL Aquarium attempts to clear a BGRX texture with the ColorMask set
to <true, true, true, false>. This will work perfectly fine with the
replicated data message; we just bailed unnecessarily.
Improves performance of WebGL Aquarium on Iris Pro (at 1920x1080) by
about 40%, and Bay Trail (at 1366x768) by over 70% (using Chrome 24).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Dylan Baker <baker.dylan.c@gmail.com>
This lets us distinguish MSAA resolves from other ordinary blits.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Currently, we don't use this path on Sandybridge because we suspect
other paths will be faster. But we potentially could. If we do, we
should allow it to support Y-tiled BLTs.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested on ILK and CTG (with the GL3isms taken out of the piglits).
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The conversion code for srgb was tuned for n x 4x8bit AoS -> 4 x nxfloat SoA
(and vice versa); fix this to also handle 16bit 565-style srgb formats.
This is still not really all that generic: things like r10g10b10a2_srgb or
r4g4b4a4_srgb wouldn't work (the latter is trivial to fix; the former would
not require more work to avoid crashing, but would almost certainly need
some higher precision calculation), but that is not needed right now.
The code is not fully optimized for this (could use more direct calculation
instead of expanding to 8-bit range first) but should be good enough.
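
To illustrate what "expanding to 8-bit range first" can look like, here is
a scalar sketch using bit replication, which is one common way to widen a
5- or 6-bit channel. It is an assumption, not necessarily what the
vectorized conversion code does; the function names are hypothetical, and
the bit layout (blue in the low bits, red in the high bits) is assumed for
b5g6r5.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 5-bit value to 8 bits by replicating its top bits into the low
 * bits, so that 0 maps to 0 and 31 maps to 255. */
uint8_t
expand_5_to_8(uint8_t v)
{
   assert(v < 32);
   return (uint8_t) ((v << 3) | (v >> 2));
}

/* Same idea for a 6-bit value: 0 maps to 0 and 63 maps to 255. */
uint8_t
expand_6_to_8(uint8_t v)
{
   assert(v < 64);
   return (uint8_t) ((v << 2) | (v >> 4));
}

/* Split a 16-bit b5g6r5 pixel into 8-bit R, G, B values. Assumed layout:
 * blue in bits 0-4, green in bits 5-10, red in bits 11-15. */
void
unpack_b5g6r5_to_8bit(uint16_t pixel, uint8_t rgb[3])
{
   rgb[0] = expand_5_to_8((uint8_t) ((pixel >> 11) & 0x1f));
   rgb[1] = expand_6_to_8((uint8_t) ((pixel >> 5) & 0x3f));
   rgb[2] = expand_5_to_8((uint8_t) (pixel & 0x1f));
}
```

The resulting 8-bit values could then go through the existing 8-bit srgb
path, at the cost of an extra widening step.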
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
GL generally doesn't seem to allow srgb formats with fewer (or more) than 8
bits for the rgb channels, though some hw could easily do it (typically for
formats with up to 10 bits for the rgb channels; support for formats with
fewer than 8 bits is likely even widespread). While it may be true that there
aren't really any benefits to such formats, we need one for d3d, though
luckily only b5g6r5_srgb it seems.
So add this format along with the util code for conversion - since that util
code is heavily tuned for 8bit srgb this isn't really all that well optimized
and rounding doesn't seem right but at least it should give some halfway
meaningful results.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
The nvc0 texfetch instruction expects the sample id to be in the second
source (usually used for the offset) rather than as part of the texture
coordinate.
This fixes all the sampler2DMS/Array tests on nvc0.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
This fallback to triangle strips is silly and should be done in drivers
if they need it.
This should fix the case when quad strips are used with flatshading that is
enabled by the "flat" GLSL varying modifier. It also fixes primitive restart
for quad strips.
This fixes piglit:
NV_primitive_restart/primitive-restart-draw-mode-quad_strip
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
The last changes to it are from 2008 and 2009.
It doesn't support most texture formats and some texture targets.
Nobody can possibly be using this.
Reviewed-by: Brian Paul <brianp@vmware.com>