Commit graph

63050 commits

Author SHA1 Message Date
Jordan Justen
d374cfe0bc i965: Add auxiliary surface field #defines for Broadwell.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
(cherry picked from commit a46cb6a971)
2014-07-17 15:59:01 -07:00
Matt Turner
b56908d7db i965/fs: Set correct number of regs_written for MCS fetches.
regs_written is in units of virtual GRFs.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit dfd117b857)
2014-07-17 15:59:01 -07:00
Eric Anholt
c6a6acb6b4 i965: Generalize the pixel_x/y workaround for all UW types.
This is the only case where a fs_reg in brw_fs_visitor is used during
optimization/code generation, and it meant that optimizations had to be
careful to not move pixel_x/y's register number without updating it.

Additionally, it turns out we had a couple of other UW values that weren't
getting this treatment (like gl_SampleID), so this more general fix is
probably a good idea (though I wasn't able to replicate problems with
either pixel_[xy]'s values or gl_SampleID, even when telling the register
allocator to reuse registers immediately)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 66f5c8df06)
2014-07-17 15:59:00 -07:00
Kenneth Graunke
64ff84abae i965/fs: Use WE_all for gl_SampleID header register munging.
This code should execute without regard to the currently executing
channels.  Asking for gl_SampleID inside control flow might break in
strange ways.  It appears to break even at the top of the program in
SIMD16 mode occasionally as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 6dc9e4e22a19108057162d9d8f8c7d559545f8de)
2014-07-17 15:59:00 -07:00
Matt Turner
8f4e03c397 i965/fs: Don't use brw_imm_* unnecessarily.
Using brw_imm_* creates a source with file=HW_REG, and the scheduler
inserts barrier dependencies when it sees HW_REG. None of these are
hardware-registers in the sense that they're special and scheduling
shouldn't touch them. A few of the modified cases already have HW_REGs
for other sources, so it won't allow extra flexibility in some cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit c938be8ad2)
(This patch was cherry-picked to make the next commit apply cleanly.)
2014-07-17 15:59:00 -07:00
Kenneth Graunke
258f35441a i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.
gen8_fs_generator uses these to decide whether to set the execution size
to 8 or 16, so we incorrectly made both of these MOVs the full width in
SIMD16 shaders.  (It happened to work out on Gen4-7.)

Setting them should also help inform optimization passes what's really
going on, which could help avoid bugs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2014-07-17 15:59:00 -07:00
Kenneth Graunke
cb04294b42 i965: Set execution size to 8 for instructions with force_sechalf set.
Both inst->force_uncompressed and inst->force_sechalf mean that the
generated instruction should be uncompressed and have an execution size
of 8.  We don't require the visitor to set both flags - setting
inst->force_sechalf by itself is supposed to be enough.

On Gen4-7, guess_execution_size() demoted instructions to 8-wide based
on the default compression state.  On Gen8+, we instead set a default
execution size, which worked great...except that we forgot to check
inst->force_sechalf when deciding whether to use 8 or 16.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 1c62126612752f6eedb66f705cc3ff1e11beea5d)
2014-07-17 15:59:00 -07:00
Matt Turner
d389a863f2 i965/vec4: Constant propagate into 2-src math instructions on Gen8.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 7192207de1)
2014-07-17 15:59:00 -07:00
Matt Turner
7fcfdfb17b i965/fs: Constant propagate into 2-src math instructions on Gen8.
total instructions in shared programs: 1878133 -> 1876986 (-0.06%)
instructions in affected programs:     153007 -> 151860 (-0.75%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 038eb649b3)
2014-07-17 15:59:00 -07:00
Matt Turner
8612a12a62 i965/fs: Make try_constant_propagate() static.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit aca4a951ea)
2014-07-17 15:59:00 -07:00
Matt Turner
8f787d3ca2 i965/fs: Don't fix_math_operand() on Gen >= 8.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 48f1143c64)
2014-07-17 15:59:00 -07:00
Matt Turner
b323fa8957 i965/vec4: Don't fix_math_operand() on Gen >= 8.
The emit_math?_gen? functions serve to implement workarounds for the
math instruction, none of which exist on Gen8+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit b24e1cc604)
2014-07-17 15:59:00 -07:00
Matt Turner
d5d94598cb i965/vec4: Don't return void from a void function.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0e800dfe75)
2014-07-17 15:59:00 -07:00
Kenneth Graunke
2efd0a3479 i965: Don't copy propagate abs into Broadwell logic instructions.
It's not clear what abs on logical instructions means on Broadwell, and
it doesn't appear to do anything sensible.

Fixes 270 Piglit tests (the bitand/bitor/bitxor tests with abs).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81157
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2de656278)
2014-07-17 15:59:00 -07:00
Abdiel Janulgue
1a832e5846 i965/vec4: skip copy-propate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell when
used with logical operations. Don't copy propagate when negate src modifier is set
and when the destination instruction is a logical op.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit c17db7537f)
2014-07-17 15:59:00 -07:00
Abdiel Janulgue
d55a897929 i965/fs: skip copy-propate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell when
used with logical operations. Don't copy propagate when negate src modifier is set
and when the destination instruction is a logical op.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit 609d00e13e)
2014-07-17 15:59:00 -07:00
Abdiel Janulgue
276c6bb369 i965/fs: Refactor check for potential copy propagated instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
(cherry picked from commit a66660d2b7)
2014-07-17 15:59:00 -07:00
Marek Olšák
0273f22a10 radeonsi: add support for TXB2
This is needed by latest fixes for samplerCubeShadow with bias.
Otherwise, a crash occurs.
2014-07-17 14:29:10 -07:00
Marek Olšák
e731031372 radeonsi: fix samplerCubeShadow with bias
Pack the depth value before overwriting it with cube coordinates.

Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit b279f0143f)

Conflicts:
	src/gallium/drivers/radeonsi/si_shader.c
2014-07-14 11:46:43 -07:00
Marek Olšák
906727dccb st/mesa: fix samplerCubeShadow with bias
It has 5 coordinates: (x,y,z,depth,lodbias)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit a11fff329e)
2014-07-14 11:43:22 -07:00
Brian Paul
d69c9114df gallium/u_blitter: fix some shader memory leaks
The _msaa shaders weren't getting freed.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 378fa34c7b)
2014-07-10 12:59:50 -07:00
Brian Paul
9b062d2020 st/mesa: fix geometry shader memory leak
Spotted by Charmaine Lee.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
(cherry picked from commit d10204930f)
2014-07-10 12:59:34 -07:00
Brian Paul
37005cafa4 mesa: fix geometry shader memory leaks
Spotted by Charmaine Lee.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 176b64b811)
2014-07-10 12:59:19 -07:00
Marek Olšák
abe859d56e gallium: fix u_default_transfer_inline_write for textures
This doesn't fix any known issue. In fact, radeon drivers ignore all
the discard flags for textures and implicitly do "discard range"
for any write transfer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
(cherry picked from commit fe6be9926f)
2014-07-10 12:58:56 -07:00
Ilia Mirkin
1e6620997f nvc0/ir: use manual TXD when offsets are involved
Something about how we're implementing offsets for TXD is wrong, just
flip to the generic quadop-based implementation in that case.

This is the minimal fix appropriate for backporting.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 114d46829d)
2014-07-10 12:58:28 -07:00
Ilia Mirkin
9fd133747b nvc0/ir: do quadops on the right texture coordinates for TXD
handleTEX moves the layer as the first argument. This makes sure that
the quadops deal with the texture coordinates.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit afea9bae67)
2014-07-10 12:58:05 -07:00
Ilia Mirkin
5e1bfed1ca nv50/ir: ignore bias for samplerCubeShadow on nv50
Unfortunately there's no good way to do this on the nv50 shader isa.
Dropping the bias seems preferable to doing the compare post-filtering.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1065aa92f4)
2014-07-10 12:57:43 -07:00
Ilia Mirkin
0618881c82 nv50/ir: retrieve shadow compare from first arg
This can only happen with texture(samplerCubeShadow, bias), where the
compare will be in the first argument.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 30d91e0eec)
2014-07-10 12:57:17 -07:00
Carl Worth
d00d73d1e1 docs: Add sha256 checksums for the 10.2.3 release
This was not possible until the previous commit was complete, used for
building archives, and then tagged.
2014-07-07 16:17:21 -07:00
Carl Worth
33cb9f9503 docs: Add release notes for the 10.2.3 release.
Which is imminent.
2014-07-07 16:12:42 -07:00
Rob Clark
0186858227 freedreno/a3xx: vtx formats
Add support for more vertex buffer formats.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 06e9536e5f)

Squashed with:

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit ba6a490bbc)
2014-07-07 16:09:36 -07:00
Rob Clark
b20c82f74c freedreno: fix for null textures
Some apps seem to give us a null sampler/view for texture slots which
come before the last used texture slot.  In particular 0ad triggers
this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 6aeeb706d2)
2014-07-07 16:09:36 -07:00
Rob Clark
8f77fbb6af freedreno/a3xx: texture fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit aa78c4586d)

Squashed with:

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 2456be63e9)
2014-07-07 16:09:21 -07:00
Rob Clark
afcb63802f freedreno: few caps fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>

(cherry picked from commit 286863939f)
2014-07-07 16:09:21 -07:00
Rob Clark
8b2d1068b5 freedreno/a3xx: fix blend opcode
Seems the opcodes are slightly different from a2xx.  Resync headers and
move blend_func() helper into hw generation specific code.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit a4d229b099)
2014-07-07 16:09:20 -07:00
Rob Clark
f96e3e5351 freedreno/a3xx: fix depth/stencil gmem restore
We already multiply by bytes per pixel for this, so f3ba7611 broke
mem2gmem for depth/stencil.  Drop the now-redundant mutiply by cpp.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit b81de5352d)
2014-07-07 16:09:20 -07:00
Rob Clark
55b6821a9f freedreno/a3xx: fix depth/stencil GMEM positioning
In cases where there was no color buf bound, there were inconsistancies
in register settings related to position of depth/stencil inside GMEM.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit f3ba761129)

Squashed with:

freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 4da8267c36)
2014-07-07 16:09:00 -07:00
Rob Clark
a1b7c7d88e freedreno: use OUT_RELOCW when buffer is written
These aren't buffers we ever read back from CPU, so using incorrect
reloc fxn wasn't really harming anything.  But might as well be correct.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 0d54904c04)
2014-07-03 21:26:09 -07:00
Rob Clark
ff9cea8776 xa: fix segfault
Fixes:

  Program received signal SIGSEGV, Segmentation fault.
  bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430)
      at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445
  445						mask_pic->srf->tex->format);
  (gdb) bt
  #0  bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430)
      at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445
  #1  xa_composite_prepare (ctx=0x211430, comp=comp@entry=0x21b054)
      at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:488
  #2  0xb6f454b4 in XAPrepareComposite (op=<optimized out>, pSrcPicture=<optimized out>,
      pMaskPicture=<optimized out>, pDstPicture=<optimized out>, pSrc=0x5b3ad8, pMask=0x0,
      pDst=0x5923b8) at msm-exa-xa.c:533

We can't yet handle solid fill mask, so explicitly reject that, rather
than segfaulting.  Otherwise DDX would need to check XA version to see
if solid fill mask were supported.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit b7e7ae9f60)
2014-07-03 21:25:50 -07:00
Ilia Mirkin
95ff8c6f18 nvc0: add a memory barrier when there are persistent UBOs
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9a37eb8adb)
2014-07-03 20:04:56 -07:00
Ilia Mirkin
e11b3f8fbc nv50: do an explicit flush on draw when there are persistent buffers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 5d4f5218bb)
2014-07-03 20:04:12 -07:00
Ilia Mirkin
da80e6a1c4 nv50: disable dedicated ubo upload method
The hardware allows multiple simultaneous renders with the same
memory-backed constbufs but with each invocation having different
values. However in order for that to work, the data has to be streamed
in via the right constbuf slot. We weren't doing that for UBOs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b2b7c65122)
2014-07-03 20:03:06 -07:00
Aaron Watry
5ba1cf1893 radeon/llvm: Allocate space for kernel metadata operands
Previously, we were assuming that kernel metadata nodes only had 1 operand.

Kernels which have attributes can have more than 1, e.g.:
!0 = metadata !{void (i32 addrspace(1)*)* @testKernel, metadata !1}
!1 = metadata !{metadata !"work_group_size_hint", i32 4, i32 1, i32 1}

Attempting to get the kernel without the correct number of attributes led
to memory corruption and luxrays crashing out.

Fixes the cl/program/execute/attributes.cl piglit test.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76223
CC: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 824197efd5)
2014-07-03 20:02:17 -07:00
Thomas Hellstrom
ff02e7995c st/xa: Don't close the drm fd on failure v2
If XA fails to initialize with pipe_loader enabled, the pipe_loader's
cleanup function will close the drm file descriptor. That's pretty bad
because the file descriptor will probably be the X server driver's only
connection to drm. Temporarily solve this by dup()'ing the file descriptor
before handing it over to the pipe loader.

This fixes freedesktop.org bugzilla bug #80645.

v2: Fix CC addresses.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
(cherry picked from commit 35cf3831d7)

Conflicts:
	src/gallium/state_trackers/xa/xa_tracker.c
2014-07-03 18:42:46 -07:00
Michel Dänzer
ee4274c393 radeon/llvm: Use the llvm.rsq.clamped intrinsic for RSQ
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>

https://bugs.freedesktop.org/show_bug.cgi?id=80015

CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b9f501bc6b)

Squashed together with the earlier:

radeon/llvm: Adapt to AMDGPU.rsq intrinsic change in LLVM 3.5

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 93b6b1fa83)
2014-07-03 18:39:43 -07:00
Carl Worth
7b21ee08db cherry-ignore: Add a patch that's been rejected
It may be that the patch is just fine, but at the very least it needs a better
commit message.
2014-07-03 18:36:06 -07:00
Kenneth Graunke
f9718e4b93 i965/disasm: Fix INTEL_DEBUG=fs on Broadwell for ARB_fp applications.
Apparently INTEL_DEBUG=fs has crashed on Broadwell for anything using
ARB_fragment_program since commit 9cee3ff5.  We need to NULL-check the
right field.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c60a4ba7e3)

Conflicts:
	src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
2014-07-03 18:35:54 -07:00
Jasper St. Pierre
3ca2119593 glxext: Send the Drawable's ID in the GLX_BufferSwapComplete event
While the official INTEL_swap_event specification says that the drawable
field should contain the GLXDrawable, not the Drawable, the existing
DRI2 code in dri2.c that translates from DRI2_BufferSwapComplete sends out
GLX_BufferSwapComplete with the Drawable's ID, so existing codebases
like Clutter/Cogl rely on getting the Drawable.

Match DRI2's error here and stuff the event with the X Drawable, not
the GLX drawable.

This fixes apps seeing wrong drawables through an indirect GLX context
or with DRI3, which uses the GLX_BufferSwapComplete event directly on
the wire instead of translates Present in mesa.

At the same time, also modify the structure for the event to make sure
that clients don't make the same mistake. This is not an API or ABI
break, as GLXDrawable and Drawable are both typedefs for XID.

Signed-off-by: Jasper St. Pierre <jstpierre@mecheye.net>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit b4dcf87f34)
2014-07-03 18:03:11 -07:00
Kenneth Graunke
ad79d7e987 i965: Include marketing names for Broadwell GPUs.
Intel would like us to include the marketing names.  Developers
additionally want "Broadwell GT1/2/3" because it makes it easier
to identify what hardware users have when they request assistance
or report issues.

Including both makes it easy for everyone to map between the names.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 05126b9bb5)
2014-07-03 18:02:00 -07:00
Takashi Iwai
9bd6dc9371 llvmpipe: Fix zero-division in llvmpipe_texture_layout()
Fix the crash of "gnome-control-center info" invocation on QEMU where
zero height is passed at init.

(sroland: simplify logic by eliminating the div altogether, using 64bit mul.)

Fixes: https://bugzilla.novell.com/show_bug.cgi?id=879462

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6b8b17153a)
2014-07-03 18:01:14 -07:00