This is used everywhere else in this file because it avoids problems
when count is zero (due to trimming).
No piglit regressions on i915 (G33) or radeon (Radeon 7500).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38109
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: Marius Predut <marius.predut@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 25543d8ec5)
This was missing in the HAVE_TRIANGLES path, and that could cause
incorrect rendering.
No piglit regressions on i915 (G33) or radeon (Radeon 7500).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38109
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: Marius Predut <marius.predut@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c0b3b2f760)
No piglit regressions on i915 (G33) or radeon (Radeon 7500).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0d475ee2b9)
No piglit regressions on i915 (G33) or radeon (Radeon 7500).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit fad8d54de7)
The value passed in count previously was "vertex after the last vertex
to be processed." Calling that "count" was misleading and kind of mean.
Looking at the code, many functions immediately do "count-start" to get
back the true count. That's just silly.
If it is better for the loops to be 'for (j = start; j < (start +
count); j++)', GCC will do that transformation.
NOTE: There is some strange formatting left by this patch. That was
done to make it more obvious that the before and after code is
equivalent. These will be fixed in the next patch.
No piglit regressions on i915 (G33) or radeon (Radeon 7500).
v2: Fix a remaining (count-start) in render_quad_strip_verts.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit d7bf7969b9)
From section 9.2. Binding and Managing Framebuffer Objects:
"Upon successful return from Get*FramebufferAttachmentParameteriv, if
pname is FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE, then params will contain
one of NONE, FRAMEBUFFER_DEFAULT, TEXTURE, or RENDERBUFFER, identifying
the type of object which contains the attached image."
And then it clarifies further:
"If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then
either no framebuffer is bound to target; or the default framebuffer is
bound, attachment is DEPTH or STENCIL, and the number of depth or stencil
bits, respectively, is zero"
Currently, if the default framebuffer is bound, we always return
GL_FRAMEBUFFER_DEFAULT for FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE, but
according to the spec, when GL_DEPTH or GL_STENCIL attachments are
the ones being queried, we should return GL_NONE if they don't exist.
Fixes the following dEQP test:
dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cf439951b7)
The nominated commit 7f8815bcb9 (i965: fix
textureGrad for cubemaps) addresses issue, raised post 10.6 branchpoint.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
The nominated commit afa1efdc85 (mesa:
fix errors when reading depth with glReadPixels) addresses issue,
raised post 10.6 branchpoint.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
The src_reg constructor that received the glsl_type was using it
only to build the swizzle, but not to fill this->type as dst_reg
is doing.
This caused some type mismatch between movs and alu operations
on the NIR path, so copy propagation optimization was not applied
to remove unneeded movs if negate modifier was involved. This was
first detected on minus (negate+add) operations.
Shader DB results (taking into account only vec4):
total instructions in shared programs: 20019 -> 19934 (-0.42%)
instructions in affected programs: 2918 -> 2833 (-2.91%)
helped: 79
HURT: 0
GAINED: 0
LOST: 0
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 4de86e1371)
Nominated-by: Christoph Brill <egore911@egore911.de>
Various pieces of code to create compressed textures will first
generate an uncompressed RGBA texture into a temporary buffer,
and then read from that buffer while creating the final compressed
texture in the requested format.
The code reading from the temporary buffer assumes the buffer is
formatted as an array of bytes in RGBA order. However, the buffer
is filled using a _mesa_texstore call with MESA_FORMAT_R8G8B8A8_UNORM
format -- this is defined as an array of *integers* holding the
RGBA values in packed format (least-significant to most-significant).
This means incorrect bytes are accessed on big-endian systems.
This patch fixes this by using the MESA_FORMAT_A8B8G8R8_UNORM format
instead on big-endian systems when filling the buffer. This fixes
about 100 piglit test case failures on s390x for me.
Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@gmail.com>
(cherry picked from commit bd016a2601)
This is what the hardware supports, there never was any sort of 64K
limit.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7a275fcda8)
This fixes failures with the newly-submitted max-size texture buffer
piglit test for GPUs exposing >= 128M max texels.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
(cherry picked from commit eb081681df)
At the moment if a gbm buffer is imported and the gbm buffer
has an old-style GBM_BO_FORMAT format, the import will crash,
since it's passed directly to DRI functions that expect
a fourcc format (as provided by the newer GBM_FORMAT
definitions)
This commit addresses the problem in two ways:
1) it prevents invalid formats from leading to a crash by
returning EINVAL if the image couldn't be created
2) it translates GBM_BO_FORMAT formats into the comparable
GBM_FORMAT formats.
Reference: https://bugzilla.gnome.org/show_bug.cgi?id=753531
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
(cherry picked from commit 4bf151e662)
If the register types do not match and the instruction
that contains the final destination is saturated, register
coalescing generated non-equivalent code.
This did not happen when using IR because types usually
matched, but it is visible in nir-vec4.
For example,
mov vgrf7:D vgrf2:D
mov.sat m4:F vgrf7:F
is coalesced to:
mov.sat m4:D vgrf2:D
The patch prevents coalescing in such scenario, unless the
instruction we want to coalesce into is a MOV (without type
conversion implied). In that case, the patch sets the register
types to the type of the final destination.
Shader-db results in HSW (only vec4 instructions shown):
total instructions in shared programs: 1754415 -> 1754416 (0.00%)
instructions in affected programs: 74 -> 75 (1.35%)
helped: 0
HURT: 1
GAINED: 0
LOST: 0
Only one extra instruction in one of the shaders, that comes from
eliminating a saturation error by preventing register coalesce.
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 79f1a7ae28)
Something is wrong with the support somewhere. I couldn't get the blob
driver to use it either, although it happily used RGB5_A1.
teximage-colors works, but WoW seems to fail in the menus for drawing
text.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 342e68dc60)
Some modern apps try to use msaa without keeping in mind the
restrictions on videomem of older cards. Resulting in dmesg saying:
[ 1197.850642] nouveau E[soffice.bin[3785]] fail ttm_validate
[ 1197.850648] nouveau E[soffice.bin[3785]] validating bo list
[ 1197.850654] nouveau E[soffice.bin[3785]] validate: -12
Because we are running out of video memory, after which the program
using the msaa visual freezes, and eventually the entire system freezes.
To work around this we do not allow msaa visauls by default and allow
the user to override this via NV30_MAX_MSAA.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[imirkin: move env var lookup to screen so that it's only done once]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3e9df0e3af)
Signed-off-by: Emil Velikov <emil.velikov@collabora.co.uk>
Conflicts:
src/gallium/drivers/nouveau/nv30/nv30_screen.c
Indications are that if the colormask indicates a single bit set on
fermi, that value will always be read from $r0 instead of a potentially
higher register (if e.g. green is set). Not to upset the counting logic,
always set the header up with a full color mask for each RT. Such a
situation can basically only ever happen with generated blit shaders.
Fixes the following piglit on Fermi (Kepler is unaffected):
fbo-stencil blit GL_DEPTH32F_STENCIL8
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 39df725f73)
The sifm object has a limit of 1024x1024 for its input size and 2048x2048
for its output. The code checking this was trying to be clever resulting
in it seeing a surface of e.g 1024x256 being outside of the input size
limit.
This commit fixes this.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 87073c69f3)
Signed-off-by: Emil Velikov <emil.velikov@collabora.co.uk>
Conflicts:
src/gallium/drivers/nouveau/nv30/nv30_transfer.c
Nothing in the spec allows for the reduced precision, and this also
fixes st_QuerySamplesForFormat for nv50, which does not allow MS8 on
RGBA32F. Now this will be respected instead of reporting MS8 as
supported with an assumption that the format used will be RGBA16F.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e40f32d562)
round(val*dscale) produces a double result, as val and dscale are double.
However, LLVMConstInt receives unsigned long long, so there is an
implicit conversion from double to unsigned long long.
This is an undefined behavior. Therefore, we need to first explicitly
convert the round result to long long, and then let the compiler handle
conversion from that to unsigned long long.
This bug manifests itself in POWER, where all IMM values of -1 are being
converted to 0 implicitly, causing a wrong LLVM IR output.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit 4f2290d161)
Note this is not ideal. Since the sifm can only do source sizes upto
1024x1024 we end up using the blitter on nv4x, which is not that fast.
And on nv3x we end up using the cpu which is really slow.
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 3c6c4d4f29)
Scanout buffers on nv30 must always be non-swizzled and have special
width alignment constraints.
These constrains have been taken from the xf86-video-nouveau
src/nv_accel_common.c: nouveau_allocate_surface() function.
nouveau_allocate_surface() applies these width constraints only when a
tiled attribute is set, which it sets for all surfaces allocated via
dri, and this "tiling" is not the same as swizzling, scanout surfaces
must be linear / have a uniform_pitch or only complete garbage is shown.
This commit fixes dri3 on nv30 showing a garbled display, with dri3 the
scanout buffers are allocated by mesa, rather then by the ddx, and the
wrong stride of these buffers was causing the garbled display.
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit 3329703eb1)
Previously we would allow glUniformMatrix4fv on a dmat4 and
glUniformMatrix4dv on a mat4. Both are illegal. That later also
overwrites the storage for the mat4 and causes bad things to happen.
Should fix the (new) arb_gpu_shader_fp64-wrong-type-setter piglit test.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: Dave Airlie <airlied@redhat.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7237c937af)
This matches _mesa_uniform, and it enables the bug fix in the next
patch.
v2: s/type/basicType/ in the assert in _mesa_uniform_matrix.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> [v1]
Cc: Dave Airlie <airlied@redhat.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a6976f0972)
Temporarily undefine DEBUG macro while including LLVM C++ headers,
leveraging the push/pop_macro pragmas, which are supported both by GCC
and MSVC.
https://bugs.freedesktop.org/show_bug.cgi?id=90621
Trivial.
(cherry picked from commit 09d6243aed)
Nominated-by: Sedat Dilek <sedat.dilek@gmail.com>
TargetOptions::NoFramePointerElim was removed in llvm-3.7.0svn r238244
"Remove NoFramePointerElim and NoFramePointerElimOverride from
TargetOptions and remove ExecutionEngine's dependence on CodeGen. NFC."
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
(cherry picked from commit 147ffd4816)
Nominated-by: Sedat Dilek <sedat.dilek@gmail.com>
Broadwell's stencil blitting code attempts to bind a renderbuffer as a
texture, using dd->BindRenderbufferTexImage().
This calls _mesa_init_teximage_fields(), which then attempts to set
img->_BaseFormat = _mesa_base_tex_format(ctx, internalFormat), which
assert fails if internalFormat is GL_STENCIL_INDEX8 but
ARB_texture_stencil8 is unsupported.
To work around this, just pretend to support the extension momentarily,
during the blit. Meta has already munged a variety of other things in
the context (including the API!), so it's not that much worse than what
we're already doing.
Fixes regressions since commit f7aad9da20
(mesa/teximage: use correct extension for accept stencil texture.).
v2: Add an XXX comment explaining the situation (requested by Jason
Ekstrand and Martin Peres), and an assert that we don't support
the extension so we remember to remove this hack (requested by
Neil Roberts).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit f83b9e58f6)
Nominated-by: Emil Velikov <emil.velikov@collabora.co.uk>
In various versions of OpenGL and GLSL, it's possible to declare
multiple VS input variables with aliasing attribute locations.
So, when computing the storage requirements for vertex attributes,
we can't simply add up the sizes. Instead, we need to look at the
enabled slots.
This patch begins tracking which attributes are double types that
are larger than 128-bits (i.e. take up two vec4 slots). We then
count normal attributes once, and count the double-size attributes
a second time.
Fixes deQP functional.attribute_location.bind_aliasing.max_cond_* tests
on i965, which regressed with commit ad208d975a.
No Piglit changes on llvmpipe (which actually supports dvecs).
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c3294ca5a1)
This was using the wrong extension, ARB_stencil_texturing
doesn't mention any changes in this area.
Fixes "dEQP-GLES3.functional.fbo.completeness.renderable.texture.
stencil.stencil_index8."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90751
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f7aad9da20)
Nominated-by: Mark Janes <mark.a.janes@intel.com>
Broadwell's stencil blitting code attempts to bind a renderbuffer as a
texture, using dd->BindRenderbufferTexImage().
This calls _mesa_init_teximage_fields(), which then attempts to set
img->_BaseFormat = _mesa_base_tex_format(ctx, internalFormat), which
assert fails if internalFormat is GL_STENCIL_INDEX8 but
ARB_texture_stencil8 is unsupported.
To work around this, just pretend to support the extension momentarily,
during the blit. Meta has already munged a variety of other things in
the context (including the API!), so it's not that much worse than what
we're already doing.
Fixes regressions since commit f7aad9da20
(mesa/teximage: use correct extension for accept stencil texture.).
v2: Add an XXX comment explaining the situation (requested by Jason
Ekstrand and Martin Peres), and an assert that we don't support
the extension so we remember to remove this hack (requested by
Neil Roberts).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit f83b9e58f6)
Nominated-by: Mark Janes <mark.a.janes@intel.com>
Mesa supports EXT_texture_rg and OES_texture_float. This patch adds
support for using unsized enums GL_RED and GL_RG for floating point
targets and writes proper checks for internalformat when format is
GL_RED or GL_RG and type is of GL_FLOAT or GL_HALF_FLOAT.
Later, internalformat will get adjusted by adjust_for_oes_float_texture
after these checks.
v2: simplify to check vs supported enums
v3: follow the style and break out if internalFormat ok (Kenneth)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90748
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5b0d6f5c1b)
Nominated-by: Mark Janes <mark.a.janes@intel.com>
This reverts commit f3b709c0ac.
The "dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_4.
interpolation.lines_wide" test appears to be broken on Cherryview when
we expose line widths greater than 12.0. I'm not sure why.
For now, just go back to the limits we used on older platforms.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90902
Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 16658f426d)
Nominated-by: Mark Janes <mark.a.janes@intel.com>
commit 472ef9a02f introduced code to
change the types of SEL and MOV instructions for moves that simply
"copy bits around". It didn't account for type conversion moves,
however. So it would happily turn this:
mov(8) vgrf6:D, -vgrf5:D
mov(8) vgrf7:F, vgrf6:UD
into this:
mov(8) vgrf6:D, -vgrf5:D
mov(8) vgrf7:D, -vgrf5:D
which erroneously drops the conversion to float.
Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 2ace64fd59)
The lowered code reads from the destination, which isn't possible from
message registers.
Fixes the following dEQP tests on SNB:
dEQP-GLES3.functional.shaders.precision.int.highp_mul_fragment
dEQP-GLES3.functional.shaders.precision.int.mediump_mul_fragment
dEQP-GLES3.functional.shaders.precision.int.lowp_mul_fragment
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 9390cb8459)
Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 9b510a9652)
Conflicts:
src/gallium/drivers/radeonsi/si_shader.h
src/gallium/drivers/radeonsi/si_state_shaders.c
I've been chasing a geom shader hang on rv635 since I wrote
r600 geom code, and finally I hacked some values from fglrx
in and I could run texelfetch without failures.
This is totally my fault as well, maths fail 101.
This makes geom shaders on r600 not fail heavily.
Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0de53ccc8c)
As Glenn did for finalize_loop we need to update_cf when we
add a POP at the end of a shader.
I think this fixes one of the earlier shader going off end
of memory problems we've stopped.
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 3063913f77)