v2 (Kayden): Move the enable into an existing intel->gen >= 4 block
(as suggested by Ian).
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Implements BGRA swizzle, sign recovery, and normalization
as required by ARB_vertex_type_2_10_10_10_rev.
V2: Ported to the new VS backend, since that's all that's left;
fixed normalization.
V3: Moved fixups out of the GLSL-only path, so it works for FF/VP too.
V4 (Kayden): Rework ES3 normalization, don't heap allocate registers;
tidy comments.
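
For reference, a minimal CPU-side sketch of the sign recovery and BGRA
swizzle described above. This is not the driver code, which does the
fixups in vertex shader instructions; the names struct unpacked_attr,
sign_extend, and unpack_int_2_10_10_10_rev are made up for illustration.
It assumes the GL_INT_2_10_10_10_REV layout (x in bits 0-9, y in 10-19,
z in 20-29, w in 30-31) and that the fetch delivers the raw unsigned bits.

```c
#include <stdint.h>

/* Hypothetical holder for the four recovered components. */
struct unpacked_attr {
   int32_t x, y, z, w;
};

/* Sign-extend the low `bits` bits of v, treating them as two's complement.
 * The xor/subtract form avoids relying on arithmetic right shifts.
 */
static int32_t
sign_extend(uint32_t v, unsigned bits)
{
   const uint32_t mask = (1u << bits) - 1u;
   const int32_t sign_bit = (int32_t)(1u << (bits - 1));
   return ((int32_t)(v & mask) ^ sign_bit) - sign_bit;
}

/* Recover signed components from a packed 2_10_10_10_REV dword.
 * If `bgra` is nonzero, swap x and z to model the BGRA swizzle.
 */
static struct unpacked_attr
unpack_int_2_10_10_10_rev(uint32_t packed, int bgra)
{
   struct unpacked_attr u;

   u.x = sign_extend(packed >> 0, 10);
   u.y = sign_extend(packed >> 10, 10);
   u.z = sign_extend(packed >> 20, 10);
   u.w = sign_extend(packed >> 30, 2);

   if (bgra) {
      const int32_t tmp = u.x;
      u.x = u.z;
      u.z = tmp;
   }
   return u;
}
```

Normalization is a separate step applied to these integers afterwards.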
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Flag the need for various workarounds to be applied by
the vertex shader.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Next few patches build on this to add other workarounds
for packed formats.
V2: rename BRW_ATTRIB_WA_COMPONENTS to BRW_ATTRIB_WA_COMPONENT_MASK.
V3 (Kayden): remove separate bit for ES3 signed normalization
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Always use R10G10B10A2_UINT; most of the other formats we'd like
don't actually work on the hardware. The VS will emit workarounds for
scaling, sign recovery and BGRA swizzle.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Until we have proper 'make dist' this is an improvement over the current
situation, because each time some old Makefiles got converted to automake
we had to update the tarballs target.
NOTE: This is a candidate for the 9.0 branch.
Cc: Eric Anholt <eric@anholt.net>
Acked-by: Matt Turner <mattst88@gmail.com>
We can't support IF statements in 16-wide on these. To get back to 16-wide
for these shaders, we need to support predication on discard instructions in the
backend IR, which is something we've sort of got on the list to do anyway.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55828
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Commit 774fb90db3 introduced a ralloc context to
each user of struct brw_compile, but for this one a NULL context was used,
causing the later ralloc_free(mem_ctx) to not do anything.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55175
NOTE: This is a candidate for the stable branches.
We have a special case where non-shadow comparison with LOD requires using a
SIMD16 vec4 in an 8-wide shader, which appears in the register allocator as a
size 8 vgrf.
Fixes assertions in various piglit tests and webgl conformance.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56521
Since a signed 2-bit integer can only represent -1, 0, or 1, it is
tempting to simply convert it directly to a float. This maps it
onto the correct range of [-1.0, 1.0]. However, it gives different
values compared to the usual equation:
(2.0 * 1.0 + 1.0) * (1.0 / 3.0) = +1.0 (same)
(2.0 * 0.0 + 1.0) * (1.0 / 3.0) = +0.33333333... (different)
(2.0 * -1.0 + 1.0) * (1.0 / 3.0) = -0.33333333... (different)
According to the GL_ARB_vertex_type_2_10_10_10_rev extension, signed
normalization is performed using equation 2.2 from the GL 3.2
specification, which is:
f = (2c + 1)/(2^b - 1). (2.2)
Comments below that equation state: "In general, this representation is
used for signed normalized fixed-point parameters in GL commands, such
as vertex attribute values." Which is what we're doing here.
The 3.2 specification goes on to declare an alternate formula:
f = max{c/(2^(b-1) - 1), -1.0} (2.3)
which is closer to the existing code, and maps the end points to exactly
-1.0 and 1.0. Comments below the equation state: "In general, this
representation is used for signed normalized fixed-point texture or
framebuffer values." Which is *not* what we're doing here.
It then states: "Everywhere that signed normalized fixed-point
values are converted, the equation used is specified." This is the real
clincher: the extension explicitly specifies that we must use equation
2.2, not 2.3. So we need to do (2x + 1) / 3.
This matches the behavior expected by oglconform's packed-vertex test,
and is correct for desktop GL (pre-4.2). It's not correct for ES 3.0,
but a future patch will correct that.
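
A small sketch of the two equations, to make the difference concrete.
The function names snorm_eq_2_2 and snorm_eq_2_3 are made up for this
illustration and are not driver code; c is the signed integer value and
b is the bit width.

```c
/* Equation 2.2: f = (2c + 1) / (2^b - 1). */
static float
snorm_eq_2_2(int c, unsigned b)
{
   return (2.0f * (float)c + 1.0f) / (float)((1u << b) - 1u);
}

/* Equation 2.3: f = max(c / (2^(b-1) - 1), -1.0). */
static float
snorm_eq_2_3(int c, unsigned b)
{
   const float f = (float)c / (float)((1u << (b - 1)) - 1u);
   return f < -1.0f ? -1.0f : f;
}
```

With b = 2, equation 2.3 returns c unchanged for c = 1, 0, -1, which is
the direct int-to-float conversion, while equation 2.2 returns 1.0,
1/3, and -1/3.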
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>
For the 10-bit components, the divisor was incorrect. A 10-bit signed
integer can represent -2^9 through 2^9 - 1, which leads to the following
ranges:
(float)value.x -> [ -512, 511]
2.0F * (float)value.x -> [-1024, 1022]
2.0F * (float)value.x + 1.0F -> [-1023, 1023]
So dividing by 511 would incorrectly scale it to approximately:
[-2.001956947, 2.001956947]. To correctly scale to [-1.0, 1.0], we need
to divide by 1023.
This correctly implements the desktop GL rules. ES 3.0 has different
rules, but those will be implemented in a separate patch.
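
To check the range arithmetic above, here is a sketch that evaluates
(2c + 1) with both divisors. The helper names are hypothetical and
exist only for this illustration.

```c
/* Desktop GL rule for a 10-bit signed component: (2c + 1) / 1023. */
static float
snorm10_div1023(int c)
{
   return (2.0f * (float)c + 1.0f) / 1023.0f;
}

/* The old, incorrect divisor, kept only for comparison. */
static float
snorm10_div511(int c)
{
   return (2.0f * (float)c + 1.0f) / 511.0f;
}
```

At the end points c = -512 and c = 511, the first function gives -1.0
and 1.0, while the second overshoots to roughly -2.002 and 2.002.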
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>
The bug was found by Coverity.
NOTE: This is a candidate for the stable branches.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This gives us checking of our arguments (no more passing 1 operand to
BRW_OPCODE_MUL!), at the cost of a couple of extra parens.
v2: Rebase on gen6-if fix.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
This was a regression in the brw_fs_fp.cpp change. We just need to return
something good enough to get the IR generation to the end without crashing,
but ir->type isn't initialized and we wanted something of the coordinate's
type anyway.
Fixes around 30 piglit cases on my ilk system in drawpixels and framebuffer
blit.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56962
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The theory of the guardband is that you extend the clip volume to avoid
expensive clipping computation, and just let fragments outside the viewport
get clipped by the drawable's bounds. But if a smaller-than-window-size
viewport is set, and we don't also happen to have a scissor set, then
rendering could incorrectly extend outside of the viewport when it should have
been clipped to the viewport.
Fixes the new piglit triangle-guardband-viewport test.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.0 branch.
When you're comparing to the spec, you're trying to immediately see what
numbered dword of the packet your bit ends up in.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.0 branch.
Fixes oglconform shad-compiler advanced.TestLessThani.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629
NOTE: This is a candidate for the 9.0 branch.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
All Intel code is compiled with -std=c99. There is no excuse to not use
designated initializers.
As a nice benefit, the code is now more friendly to grep. Without
designated initializers, psychic prowess is required to find the
initialization of DRI extension function pointers with grep. I have
observed several people, when they first encounter the DRI code, fail at
statically chasing the DRI function pointers due to this problem.
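
As an illustration of the grep argument, compare the two initializers
below. The struct and function names are invented for this sketch and
do not correspond to any real DRI extension.

```c
/* Hypothetical extension vtable, loosely in the style of a DRI extension. */
struct example_extension {
   const char *name;
   int version;
   int (*create_thing)(int id);
   void (*destroy_thing)(int id);
};

static int
example_create_thing(int id)
{
   return id + 1;
}

static void
example_destroy_thing(int id)
{
   (void)id;
}

/* Positional: the reader must count fields to learn what each entry sets. */
static const struct example_extension by_position = {
   "EXAMPLE", 1, example_create_thing, example_destroy_thing
};

/* Designated: grepping for "create_thing" lands on the assignment. */
static const struct example_extension by_name = {
   .name = "EXAMPLE",
   .version = 1,
   .create_thing = example_create_thing,
   .destroy_thing = example_destroy_thing,
};
```

Both objects hold the same values; only the second makes the field
being assigned visible at the point of initialization.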
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
The dri directory is compiled with -std=c99. There is no excuse to not use
designated initializers.
As a nice benefit, the code is now more friendly to grep. Without
designated initializers, psychic prowess is required to find the
initialization of DRI extension function pointers with grep. I have
observed several people, when they first encounter the DRI code, fail at
statically chasing the DRI function pointers due to this problem.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
For a packed depth/stencil buffer on separate stencil hardware, the
separate depth miptree is set up with alignment of 4,4 and the separate
stencil miptree is set up with alignment of 8,8. We can't just use the
irb->draw_{x,y} offsets for stencil, since that is the offset in the
depth miptree.
Fixes 12 piglit depthstencil testcases on ivb.
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Given that we have the mask information here (assuming the rebase is to
the same tiling, which is safe), we can just save a set of miptrees and
offsets and the global intra-tile offset in the context and cut out a
bunch of logic. This will also save me from having to make the next fix
twice.
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Fixes a theoretical problem where we had an aligned depth buffer and a
misaligned stencil buffer with a matching tile offset, so we would fail
to rebase depth even after the needed tile offset changed due to the
rebase of stencil.
It should also fix double-rebase of a misaligned packed depth/stencil
renderbuffer, which may have been a performance issue.
Acked-by: Chad Versace <chad.versace@linux.intel.com>
We were always passing 0 for one of the two fields, and the code just used
whichever one wasn't 0.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
I noticed these in the next patch where these paths were using the Face
of a teximage but didn't have array handling.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Apparently this was accidentally marked as unimplemented, and thus not
put in the dispatch table.
Fixes 7 es3conform tests:
- copy_buffer_parameters
- copy_buffer_data
- copy_buffer_usage
- pixel_buffer_object_bind
- pixel_buffer_object_parameteriv
- pixel_buffer_object_texture_read
- pixel_buffer_object_usage
v2: Also update the DispatchSanity test for this change.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Only legacy OpenGL allows the use of non-gen'd names. Core profiles
and ES 3 both require the use of glGenQueries().
Note that BeginQuery doesn't exist in ES 1 or ES 2.
Fixes es3conform's occlusion_query_invalid_beginquery test.
Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
GL_READ_FRAMEBUFFER and GL_DRAW_FRAMEBUFFER are valid targets in ES 3.
Fixes 23 es3conform framebuffer_blit tests. Two more go from fail to
crash, but that appears to be because they actually run now.
Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
Calling glTexParameteri() with pname GL_TEXTURE_MAX_LEVEL and either a
target of GL_TEXTURE_RECTANGLE or a negative value previously generated
GL_INVALID_OPERATION. However, GL_INVALID_VALUE seems more appropriate.
Fixes oglconform's api-error/negative.glTexParameter and es3conform's
sgis_texture_lod_basic_error.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
The new brw_reg always had type BRW_REGISTER_TYPE_F, rather than
inheriting the original type of the ATTR file register.
In the past, this hasn't been a problem since we only execute this code
when fixing up GL_FIXED attributes, which always have float types.
However, we'll soon be using it for ARB_vertex_type_2_10_10_10_rev
support, which uses D and UD types.
Reviewed-by: Eric Anholt <eric@anholt.net>
When dri2CreateContextContextAttribs failed, eglCreateContext returned
NULL yet set the error code to EGL_SUCCESS! The problem was that
eglCreateContext ignored the error code returned by
driCreateContextAttribs.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56706
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
For GLES1 and GLES2, brwCreateContext neglected to validate the requested
context version received from the DRI layer. If DRI requested an OpenGL
ES2 context with version 3.9, we provided it one.
Before this fix, the switch statement that validated the requested GL
context flavor was an ugly #ifdef copy-paste mess. Instead of reproducing
the copy-paste mess for GLES1 and GLES2, I first refactored it. Now the
switch statement is readable.
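
A rough sketch of the kind of per-API check this describes. Everything
here is hypothetical: the enum, the function name, and in particular the
accepted version ranges (ES1 up to 1.1, ES2 at 2.0 only, desktop GL up
to an assumed 3.0 ceiling) are assumptions for illustration, not what
brwCreateContext actually accepts.

```c
#include <stdbool.h>

/* Hypothetical API flavors; the real DRI constants are not used here. */
enum example_api {
   EXAMPLE_API_OPENGL,
   EXAMPLE_API_GLES1,
   EXAMPLE_API_GLES2,
};

/* Return true only if the requested version makes sense for the API. */
static bool
example_version_is_valid(enum example_api api, unsigned major, unsigned minor)
{
   switch (api) {
   case EXAMPLE_API_GLES1:
      return major == 1 && minor <= 1;
   case EXAMPLE_API_GLES2:
      return major == 2 && minor == 0;
   case EXAMPLE_API_OPENGL:
      /* Assumed ceiling of 3.0, purely for the sake of the example. */
      return major >= 1 && (major < 3 || (major == 3 && minor == 0));
   }
   return false;
}
```

With a check of this shape, a request for an ES2 context with version
3.9 is rejected instead of silently honored.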
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
It seems that -DNDEBUG and other flags might still be leaked through
those variables, so strip those off there as well.
NOTE: This is a candidate for the 9.0 branch.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
In addition to registers used by instructions, fs_visitor maintains
direct references to certain "special" values used for inputs/outputs.
When I added VGRF compaction, I overlooked these, believing that these
direct references weren't used once instructions were generated. That
was wrong. For example, pixel_x/y are used in virtual_grf_interferes(),
which is called by optimization passes and register allocation.
This patch treats all of them as used and patches them after compacting.
While it's not strictly necessary to patch all of them (as some aren't
used after emitting code), it seems safer to simply fix them all.
Fixes oglconform's textureswizzle/advanced.shader.targets, piglit's
glsl-fs-lots-of-tex, and glean's texCombine on pre-Gen6 hardware.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56790
Reviewed-by: Eric Anholt <eric@anholt.net>
The goal of that change was to skip counting things that aren't actually
outputs from the VS to the FS. However, explicit_location isn't set in
the case of linker-assigned locations (the common case), so basically
varying component counting got disabled. At this stage of the linker,
we've already ensured that var->location is set, so we can just look at
it without worrying.
Fixes i965 assertion failure with the new
piglit glsl-max-varyings --exceed-limits.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51545
Reviewed-by: Brian Paul <brianp@vmware.com>
The diff looks funny, but it's moving the integer vs non-integer check
below the _mesa_source_buffer_exists() check that ensures
_ColorReadBuffer is non-null, so we get a GL_INVALID_OPERATION instead
of a segfault. This looks like it had regressed in the
_mesa_error_check_format_and_type() changes, which removed the first of
the two duplicated checks for the source buffer. Fixes segfault in the
new piglit ARB_framebuffer_object/negative-readpixels-no-rb.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45877
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>