Commit graph

72592 commits

Author SHA1 Message Date
Emil Velikov
f39bc1c828 docs: add sha256 checksums for 10.6.6
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit e3e2a3e0e5)
2015-09-04 23:10:09 +01:00
Emil Velikov
5685ed72b8 docs: add release notes for 10.6.6
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 4b05739e9d)
2015-09-04 23:10:07 +01:00
Oded Gabbay
4f2290d161 llvmpipe: convert double to long long instead of unsigned long long
round(val*dscale) produces a double result, as val and dscale are double.
However, LLVMConstInt receives unsigned long long, so there is an
implicit conversion from double to unsigned long long.
This is an undefined behavior. Therefore, we need to first explicitly
convert the round result to long long, and then let the compiler handle
conversion from that to unsigned long long.

This bug manifests itself in POWER, where all IMM values of -1 are being
converted to 0 implicitly, causing a wrong LLVM IR output.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-09-04 17:37:17 -04:00
Hans de Goede
3c6c4d4f29 nv30: Implement color resolve for msaa
Note this is not ideal. Since the sifm can only do source sizes upto
1024x1024 we end up using the blitter on nv4x, which is not that fast.

And on nv3x we end up using the cpu which is really slow.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-09-04 16:07:08 -04:00
Hans de Goede
3329703eb1 nv30: Fix creation of scanout buffers
Scanout buffers on nv30 must always be non-swizzled and have special
width alignment constraints.

These constrains have been taken from the xf86-video-nouveau
src/nv_accel_common.c: nouveau_allocate_surface() function.

nouveau_allocate_surface() applies these width constraints only when a
tiled attribute is set, which it sets for all surfaces allocated via
dri, and this "tiling" is not the same as swizzling, scanout surfaces
must be linear / have a uniform_pitch or only complete garbage is shown.

This commit fixes dri3 on nv30 showing a garbled display, with dri3 the
scanout buffers are allocated by mesa, rather then by the ddx, and the
wrong stride of these buffers was causing the garbled display.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-09-04 16:07:08 -04:00
Boyan Ding
48de40ce9c vc4: Initialize pack field of qreg to 0 in qir_get_temp
This avoids generation of undefined packing in qir and qpu instructions,
fixing a lot of rendering errors.

Fixes 8b36d107fd (vc4: Pack the unorm-packing bits into a src MUL
instruction when possible.)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-09-04 12:16:07 -07:00
Chris Wilson
099f5b3a62 i965: Disallow PixelTransfer operations for tiled-memcpy TexImage/ReadPixels
The tiled memcpy fast paths perform a simple blit (with only a couple of
trivial pixel conversion routines) and do not accommodate PixelTransfer
operations. Therefore if any are set, fallback to the regular routines.
Note that PixelTransfer only applies to TexImage and ReadPixels, not to
GetTexImage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2015-09-04 20:11:15 +01:00
Iago Toral Quiroga
96ea166308 i965/vec4: Don't unspill the same register in consecutive instructions
If we have spilled/unspilled a register in the current instruction, avoid
emitting unspills for the same register in the same instruction or consecutive
instructions following the current one as long as they keep reading the spilled
register. This should allow us to avoid emitting costy unspills that come with
little benefit to register allocation.

v2:
  - Apply the same logic when evaluating spilling costs (Curro).

v3:
  - Abstract the logic that decides if a register can be reused in a function.
    that can be used from both spill_reg and evaluate_spill_costs (Curro).

v4:
  - Do not disallow reusing scratch_reg in predicated reads (Curro).
  - Track if previous sources in the same instruction read scratch_reg (Curro).
  - Return prev_inst_read_scratch_reg at the end (Curro).
  - No need to explicitily skip scratch read/write opcodes in spill_reg (Curro).
  - Fix the comments explaining what happens when we hit an instruction that
    does not read or write scratch_reg (Curro)
  - Return true early when the current or previous instructions read
    scratch_reg with a compatible mask.

v5:
  - Do not return true early, the loop should not be expensive anyway
    and this adds more complexity (Curro).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-09-04 15:13:49 +02:00
Iago Toral Quiroga
bd6e516fc2 i965: Add a debug option for spilling everything in vec4 code
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-09-04 12:49:36 +02:00
Francisco Jerez
6cf4142db8 dri/common: Tokenize driParseDebugString() argument before matching debug flags.
Fixes debug string parsing when one of the supported flags is a
substring of another.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-09-04 12:49:36 +02:00
Francisco Jerez
3d4f75506c dri/common: Fix codestyle of driParseDebugString().
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-09-04 12:49:36 +02:00
Tapani Pälli
08e9049e3d glsl: error out on ES 3.1 if VS or FS present but not both
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-09-04 09:22:24 +03:00
Tapani Pälli
69678953d1 glsl: error on linking if no shaders are attached to program
This applies to OpenGL Core >= 4.5 and OpenGL ES >= 3.1.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-09-04 09:01:00 +03:00
Kenneth Graunke
4323e78d3f i965: Improve disassembly of data port read messages.
We now print out the name of the message instead of its numerical
value, and label the message control and surface numbers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:04 -07:00
Kenneth Graunke
0e23c246c0 i965: Optimize VUE map comparisons.
The entire VUE map is computed based on the slots_valid bitfield;
calling brw_compute_vue_map on the same bitfield will return the
same result.  So we can simply compare those.

struct brw_vue_map is 136 bytes; doing a single 8-byte comparison is
much cheaper and should work just as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:04 -07:00
Kenneth Graunke
6e03377daf i965/gs: Don't reserve space for clip plane uniforms.
These were only for legacy userclipping, which we no longer support
in geometry shaders.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:04 -07:00
Kenneth Graunke
fba4823a91 i965: Don't do legacy userclipping in non-compatibility contexts.
According to the GLSL 1.50 specification, page 76:
"The shader must also set all values in gl_ClipDistance that have been
 enabled via the OpenGL API, or results are undefined."

With this patch, we only enable clip distance writes when the shader
actually writes them.  We no longer force a value to be written when
clip planes are enabled in the API.  This could mean the first varying
slot would be used as clip distances - I believe it should be the safe
kind of undefined behavior.

Empirically, it doesn't seem to cause a problem.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:04 -07:00
Kenneth Graunke
4f4b7c4711 i965: Remove the brw_vue_prog_key base class.
The legacy userclip fields are only used for the vertex shader, and at
that point there's only program_string_id and the tex struct, which are
common to all keys.  So there's no need for a "VUE" key base class.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:04 -07:00
Kenneth Graunke
3239621825 i965: Virtualize vec4_visitor::emit_urb_slot().
This avoids a downcast of key, which won't exist in the base class soon.

I'm not a huge fan of this patch, but given that we're currently using
inheritance, this seems like the "right" way to do it.  The alternative
is to make key a void pointer in the parent class and continue
downcasting.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:03 -07:00
Kenneth Graunke
27e83b62bb i965: Store a key_tex pointer in vec4_visitor.
I'm about to remove the base class for VS/GS/HS/DS program keys, at
which point we won't be able to use key->tex anymore.  Instead, we'll
need to store a direct pointer (like we do in the FS backend).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:03 -07:00
Kenneth Graunke
014b90221a i965: Move legacy clip plane handling to vec4_vs_visitor.
This is now only used for the vertex shader, so it makes sense to get it
out of any paths run by the geometry shader.

Instead of passing the gl_clip_plane array into the run() method (which
is shared among all subclasses), we add it as a vec4_vs_visitor
constructor parameter.  This eliminates the bogus NULL parameter in the
GS case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:03 -07:00
Kenneth Graunke
082b7f1876 i965: Delete the brw_vue_program_key::userclip_active flag.
There are two uses of this flag.

The primary use is checking whether we need to emit code to convert
legacy gl_ClipVertex/gl_Position clipping to clip distances.  In this
case, we also have to upload the clip planes as uniforms, which means
setting nr_userclip_plane_consts to a positive value.  Checking if it's
> 0 works for detecting this case.

Gen4-5 also wants to know whether we're doing clipping at all, so it can
emit user clip flags.  Checking if output_reg[VARYING_SLOT_CLIP_DIST0]
is set to a real register suffices for this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:03 -07:00
Kenneth Graunke
294282aaa6 i965: Remove legacy clip plane handling from geometry shaders.
We only support geometry shaders in core profiles, where gl_ClipVertex
doesn't exist.  Presumably the even older behavior of clipping to
gl_Position isn't supported either.  In fact, GLSL 1.50 page 76 claims:

"The shader must also set all values in gl_ClipDistance that have been
 enabled via the OpenGL API, or results are undefined."

So we don't need to handle legacy clipping in geometry shaders.  I think
Paul added this back when we were considering supporting the old
GL_ARB_geometry_shader4 extension.

This removes a non-orthagonal state dependency on GS compilation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:03 -07:00
Kenneth Graunke
a2151560b8 i965: Move brw_setup_tex_for_precompile to brw_program.[ch].
This living in brw_fs.{h,cpp} is a historical artifact of us supporting
texturing for fragment shaders before any other stages.  It's kind of
awkward given that we use it for all stages.

This avoids having to include brw_fs.h in geometry shader code in order
to access this function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-09-03 22:31:03 -07:00
Tapani Pälli
04e201d0c0 mesa: change 'SHADER_SUBST' facility to work with env variables
Patch modifies existing shader source and replace functionality to work
with environment variables rather than enable dumping on compile time.
Also instead of _mesa_str_checksum, _mesa_sha1_compute is used to avoid
collisions.

Functionality is controlled via two environment variables:

MESA_SHADER_DUMP_PATH - path where shader sources are dumped
MESA_SHADER_READ_PATH - path where replacement shaders are read

v2: cleanups, add strerror if fopen fails, put all functionality
    inside HAVE_SHA1 since sha1 is required

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-09-04 08:22:37 +03:00
Tapani Pälli
0db323a624 build: add HAVE_SHA1 define when using --with-sha1 option
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
2015-09-04 08:05:24 +03:00
Kenneth Graunke
2ace64fd59 i965: Fix copy propagation type changes.
commit 472ef9a02f introduced code to
change the types of SEL and MOV instructions for moves that simply
"copy bits around".  It didn't account for type conversion moves,
however.  So it would happily turn this:

   mov(8) vgrf6:D, -vgrf5:D
   mov(8) vgrf7:F, vgrf6:UD

into this:

   mov(8) vgrf6:D, -vgrf5:D
   mov(8) vgrf7:D, -vgrf5:D

which erroneously drops the conversion to float.

Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-09-03 21:12:54 -07:00
Dave Airlie
5fa5a012b1 r600: fix loop overrun in cayman_mul_double_instr
Coverity warned about this. Ilia pointed it out.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-09-04 08:02:14 +10:00
Ben Widawsky
b05619c627 i965/gen9: Annotate input coverage mask change
As far as I can tell, the behavior is preserved from the previous generations.
Before we set a single bit to tell the FS whether or not we'll be using an input
coverage mask. Now we have some options which are implementing various
extensions. These bits are used for the various conservative rasterization
mechanisms (for collision detection, binning, and whatever else).

I believe that the behavior is preserved because the problem which conservative
rasterization is attempting to fix would go away with the "NORMAL" mode (at the
cost of performance, I believe).

This patch serves as documentation of the change by creating the enums, as well
as giving some of the history with the links here so that the next person who
comes along and looks at it doesn't spend as long as I had to in order to
determine if there is an issue or not.

Previously, this algorithm had been done in software, and this can still be used
as long as we don't export an extension stating otherwise.

References: https://www.opengl.org/registry/specs/NV/conservative_raster.txt
References: https://http.developer.nvidia.com/GPUGems2/gpugems2_chapter42.html
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-09-03 11:55:31 -07:00
Brian Paul
70dbdca15f svga: update call to u_upload_alloc()
u_upload_alloc() no longer returns a return value.

Trivial.
2015-09-03 11:24:24 -06:00
Marek Olšák
efea7c3a3f winsys/radeon: remove exported buffers from the cache
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-09-03 18:41:45 +02:00
Marek Olšák
54964c7751 winsys/amdgpu: remove exported buffers from the cache
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-09-03 18:41:42 +02:00
Marek Olšák
35d0f12797 gallium/pb_bufmgr_cache: add a way to remove buffers from the cache explicitly
This must be done before exporting a buffer as dmabuf fds, because
we lose track of who is using it and can't trust the reference counter.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-09-03 18:41:40 +02:00
Marek Olšák
44dbaa1746 u_upload_mgr: remove the return value from u_upload_data
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-09-03 18:14:50 +02:00
Marek Olšák
0c5df863ba u_upload_mgr: remove the return value from u_upload_buffer
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-09-03 18:14:48 +02:00
Marek Olšák
b4f7639955 u_upload_mgr: remove the return value from u_upload_alloc_buffer
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-09-03 18:14:43 +02:00
Marek Olšák
8c6ff05517 u_upload_mgr: remove the return value from u_upload_alloc
The return buffer or the returned pointer can be used instead.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-09-03 18:14:09 +02:00
Marek Olšák
6c1e368cf3 u_upload_mgr: optimize u_upload_alloc
This is probably the most called util function. It does almost nothing,
yet it can consume 10% of the CPU on the profile. This drops it down to 5%.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-09-03 18:09:13 +02:00
Grazvydas Ignotas
722ce74743 gallium/radeon: remove 'dirty' member from r600_atom
It's no longer used by both r600 and radeonsi now.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-09-03 18:06:51 +02:00
Grazvydas Ignotas
ccbc7952a4 r600g: simplify dirty atom tracking
Now that R600_NUM_ATOMS is below 64, dirty atom tracking can be
simplified.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-09-03 18:06:42 +02:00
Grazvydas Ignotas
6ef4572937 r600g: start numbering atoms from 1
There doesn't seem any reason to start from 4.
Start from 1 instead (0 is left reserved to catch uninitialized atoms).

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-09-03 18:06:29 +02:00
Grazvydas Ignotas
4d9af438bc r600g: make all viewport states use single atom
Similarly to scissor states, we can use single atom to track all viewport
states. This will allow to simplify dirty atom handling later.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-09-03 18:06:14 +02:00
Grazvydas Ignotas
fbb423b433 r600g: apply disable workaround on all scissors
During review of the "r600g: make all scissor states use single atom" patch
Marek Olšák noticed that scissor disable workaround should be applied on
all scissor states and not just first one, so let's do so.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-09-03 18:05:58 +02:00
Grazvydas Ignotas
7d475bad66 r600g: make all scissor states use single atom
As suggested by Marek Olšák, we can use single atom to track all scissor
states. This will allow to simplify dirty atom handling later.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-09-03 18:05:54 +02:00
Neil Roberts
ce181aea6c mesa/pbo: Handle zero width, height or depth when validating access
It's legal to call glTexSubImage with zero values for the width,
height or depth. Previously this was breaking the PBO access
validation because it tries to work out the last pixel accessed by
getting the pixel at height-1 and depth-1 which would end up with
bogus values.

This was causing GL errors to be generated during the Piglit
texsubimage test, although the test was passing anyway.

v2: Also check for width == 0. Don't validate the start pointer if any
    of the dimensions are zero.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-09-03 17:00:54 +01:00
Kenneth Graunke
30e84530a0 glsl: Remove unused total_attribs_size variable.
Accidentally left behind by my previous patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2015-09-03 00:56:18 -07:00
Kenneth Graunke
c3294ca5a1 glsl: Handle attribute aliasing in attribute storage limit check.
In various versions of OpenGL and GLSL, it's possible to declare
multiple VS input variables with aliasing attribute locations.

So, when computing the storage requirements for vertex attributes,
we can't simply add up the sizes.  Instead, we need to look at the
enabled slots.

This patch begins tracking which attributes are double types that
are larger than 128-bits (i.e. take up two vec4 slots).  We then
count normal attributes once, and count the double-size attributes
a second time.

Fixes deQP functional.attribute_location.bind_aliasing.max_cond_* tests
on i965, which regressed with commit ad208d975a.

No Piglit changes on llvmpipe (which actually supports dvecs).

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2015-09-02 23:28:20 -07:00
Ian Romanick
6e37304521 i965/meta: Fix typo in comment
Trivial.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2015-09-02 16:24:18 -07:00
Ian Romanick
7237c937af mesa: Don't allow wrong type setters for matrix uniforms
Previously we would allow glUniformMatrix4fv on a dmat4 and
glUniformMatrix4dv on a mat4.  Both are illegal.  That later also
overwrites the storage for the mat4 and causes bad things to happen.

Should fix the (new) arb_gpu_shader_fp64-wrong-type-setter piglit test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: Dave Airlie <airlied@redhat.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-09-02 16:24:17 -07:00
Ian Romanick
a6976f0972 mesa: Pass the type to _mesa_uniform_matrix as a glsl_base_type
This matches _mesa_uniform, and it enables the bug fix in the next
patch.

v2: s/type/basicType/ in the assert in _mesa_uniform_matrix.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> [v1]
Cc: Dave Airlie <airlied@redhat.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-09-02 16:24:17 -07:00