58109 commits

Author SHA1 Message Date
Kenneth Graunke
7d86042dee i965/fs: Rename "cont" to "progress" in dataflow algorithm.
This variable indicates that the fixed-point algorithm made changes to
the data at this step, so it needs to run for another iteration.

"progress" seems a nicer name for that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:23 -07:00
Kenneth Graunke
0225dea6c4 i965/fs: Switch to a do-while loop in copy propagation dataflow.
The fixed-point algorithm needs to run at least once, so a do-while loop
is more natural.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:23 -07:00
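The two commits above describe a fixed-point loop that must run at least once and that tracks whether an iteration changed anything. A minimal sketch of that loop shape follows. It is a generic forward dataflow over bitsets, not the i965 copy propagation code; `struct block`, its fields, and `dataflow_fixed_point` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical basic block: one bit per tracked fact. */
struct block {
    uint32_t gen;       /* facts created in this block */
    uint32_t kill;      /* facts invalidated in this block */
    uint32_t in;        /* facts valid on entry */
    uint32_t out;       /* facts valid on exit */
    int preds[2];       /* indices of predecessor blocks */
    int num_preds;
};

/* Iterate until no block's sets change; returns the iteration count. */
int dataflow_fixed_point(struct block *blocks, int num_blocks)
{
    int iterations = 0;
    bool progress;

    /* The body has to run at least once, hence do-while. */
    do {
        progress = false;

        for (int b = 0; b < num_blocks; b++) {
            uint32_t in = 0;

            /* Union of the predecessors' exit sets. */
            for (int p = 0; p < blocks[b].num_preds; p++)
                in |= blocks[blocks[b].preds[p]].out;

            uint32_t out = blocks[b].gen | (in & ~blocks[b].kill);

            /* Any change means another pass is needed. */
            if (in != blocks[b].in || out != blocks[b].out) {
                blocks[b].in = in;
                blocks[b].out = out;
                progress = true;
            }
        }
        iterations++;
    } while (progress);

    return iterations;
}
```

Here "progress" reads naturally at the loop condition: the loop continues while the last pass still made progress.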
Kenneth Graunke
3c68662bb1 i965/fs: Skip global copy propagation step.
The dataflow analysis used for global copy propagation is severely
broken, and I believe it doesn't actually do anything.  Fixing it will
require a lot of changes, each of which might break things.

Once all the fixes land, we can re-enable this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-19 11:29:23 -07:00
Emil Velikov
b9d1173f2c vl/buffers: consistent use on VL_MAX_SURFACES
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:08 +02:00
Emil Velikov
e7c17eb819 st/vdpau: drop unnecessary variable prof
Any decent compiler will do this for us, although doing this
will make grepping through the code a lot easier.

v2: In both mixer and query interface
v3: rebase

Reviewed-by: Christian König <christian.koenig@amd.com> [v1]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:08 +02:00
Emil Velikov
1d260360d8 vl/idct: cleanup all idct buffers
Code should loop through and cleanup the three (VL_NUM_COMPONENTS) idct
buffers, rather than doing the first one three times.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:08 +02:00
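The bug class described above (cleaning up index 0 three times instead of each buffer once) can be sketched as follows. This is an illustration only: `HYP_NUM_COMPONENTS` stands in for VL_NUM_COMPONENTS, and `struct hyp_idct_buffer` and `hyp_cleanup_idct_buffers` are hypothetical, not the vl/idct API.

```c
#include <stdlib.h>

#define HYP_NUM_COMPONENTS 3   /* stand-in for VL_NUM_COMPONENTS */

/* Hypothetical buffer holding one heap allocation. */
struct hyp_idct_buffer {
    void *data;
};

/* Release every component's buffer; returns how many were released. */
int hyp_cleanup_idct_buffers(struct hyp_idct_buffer bufs[HYP_NUM_COMPONENTS])
{
    int released = 0;

    for (int i = 0; i < HYP_NUM_COMPONENTS; i++) {
        /* Index with i. Writing bufs[0] here would clean up the
         * first buffer three times and leak the other two. */
        if (bufs[i].data) {
            free(bufs[i].data);
            bufs[i].data = NULL;
            released++;
        }
    }
    return released;
}
```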
Emil Velikov
5354d2e76a vl/buffer: add sanity check after CALLOC_STRUCT
Check if we have successfully allocated memory.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:08 +02:00
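A minimal sketch of the allocation check described above, assuming a zeroing allocator that returns NULL on failure. `HYP_CALLOC_STRUCT` is a local stand-in written for this example rather than Mesa's own CALLOC_STRUCT macro, and `struct hyp_video_buffer` is hypothetical.

```c
#include <stdlib.h>

/* Local stand-in: allocate one zero-initialized struct T. */
#define HYP_CALLOC_STRUCT(T) ((struct T *)calloc(1, sizeof(struct T)))

struct hyp_video_buffer {
    unsigned width;
    unsigned height;
};

/* Returns NULL if the allocation fails, so callers can bail out early. */
struct hyp_video_buffer *hyp_video_buffer_create(unsigned width, unsigned height)
{
    struct hyp_video_buffer *buf = HYP_CALLOC_STRUCT(hyp_video_buffer);

    /* Sanity check before touching any field. */
    if (!buf)
        return NULL;

    buf->width = width;
    buf->height = height;
    return buf;
}
```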
Emil Velikov
eab9bad1ac st/xvmc: exit gracefully if we fail to create video buffer
Free any allocated memory and return BadAlloc if create_video_buffer()
has failed to create a buffer.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-08-19 18:32:07 +02:00
Emil Velikov
5e91c15290 st/vdpau: don't try to create video buffer when the format is FORMAT_NONE
Not seen in the wild yet, but seems like a reasonable thing to do.
[suggested by Christian]

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-19 18:32:03 +02:00
Andy Furniss
3448b66dac vdpau/vl 422 chroma width/height mix up
I was looking into some minor 422 issues/discrepancies I noticed long
ago using vdpau on my rv790.

I noticed that there is code that is halving height rather than width -
422 is full height AFAIK.

Making the changes below doesn't actually make any noticeable difference
to what I was looking into.

Maybe there are more, but here are three I've found so far.

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-19 18:31:26 +02:00
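The width/height distinction in the commit above follows from how chroma subsampling works: 4:2:0 halves both chroma dimensions, 4:2:2 halves only the width, and 4:4:4 halves neither. A small sketch, assuming even luma dimensions; the enum and function names are hypothetical and are not the vl/vdpau identifiers.

```c
/* Hypothetical chroma format tags. */
enum hyp_chroma_format {
    HYP_CHROMA_420,
    HYP_CHROMA_422,
    HYP_CHROMA_444
};

struct hyp_plane_size {
    unsigned width;
    unsigned height;
};

/* Size of one chroma plane for a given luma size (even dimensions assumed). */
struct hyp_plane_size hyp_chroma_plane_size(enum hyp_chroma_format fmt,
                                            unsigned luma_width,
                                            unsigned luma_height)
{
    struct hyp_plane_size size = { luma_width, luma_height };

    switch (fmt) {
    case HYP_CHROMA_420:
        size.width /= 2;
        size.height /= 2;
        break;
    case HYP_CHROMA_422:
        /* Full height: only the width is halved. */
        size.width /= 2;
        break;
    case HYP_CHROMA_444:
        break;
    }
    return size;
}
```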
Vinson Lee
b1d05eeb1f radeonsi: Ensure fmask_format is initialized in release builds.
Fixes "Uninitialized scalar variable" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-08-19 09:19:19 -07:00
Paul Berry
c6b6c93643 i965: STATIC_ASSERT that there aren't too many BRW_NEW_* flags.
We are getting close to the maximum number of BRW_NEW_* bits that can
be stored in brw->state.dirty.brw without overflowing 32 bits, and
geometry shaders are going to add more.  Add a STATIC_ASSERT so that
we will be alerted when we need to switch to 64 bits.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-19 08:28:17 -07:00
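The idea in the commit above, a compile-time check that the number of dirty-flag bits still fits in a 32-bit word, might look roughly like this. The sketch uses C11 `_Static_assert` in place of Mesa's STATIC_ASSERT macro, and every `HYP_*` name is hypothetical rather than an actual BRW_NEW_* flag.

```c
#include <stdint.h>

/* Hypothetical bit indices; the last enumerator counts the bits in use. */
enum hyp_state_bit {
    HYP_STATE_URB_FENCE,
    HYP_STATE_FRAGMENT_PROGRAM,
    HYP_STATE_VERTEX_PROGRAM,
    HYP_STATE_BATCH,
    HYP_NUM_STATE_BITS
};

#define HYP_NEW_URB_FENCE        (1u << HYP_STATE_URB_FENCE)
#define HYP_NEW_FRAGMENT_PROGRAM (1u << HYP_STATE_FRAGMENT_PROGRAM)
#define HYP_NEW_VERTEX_PROGRAM   (1u << HYP_STATE_VERTEX_PROGRAM)
#define HYP_NEW_BATCH            (1u << HYP_STATE_BATCH)

/* Fails the build as soon as a 33rd bit is added. */
_Static_assert(HYP_NUM_STATE_BITS <= 32,
               "too many state bits for a 32-bit dirty word");

/* Dirty word that the flags are ORed into. */
struct hyp_dirty {
    uint32_t bits;
};
```

Deriving the flag values from an enum keeps the count and the flags in sync, so adding a flag cannot silently bypass the check.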
Christian König
5ddd840f5a vl: add entrypoint to is_video_format_supported
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:15 +02:00
Christian König
a15cbabb8b vl: add entrypoint to get_video_param
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:15 +02:00
Christian König
f2f7064e56 vl: rename pipe_video_decoder to pipe_video_codec
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:15 +02:00
Christian König
8e423ab984 vl: rename enum pipe_video_codec to pipe_video_format
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:15 +02:00
Christian König
53e20b8b41 vl: use a template for create_video_decoder
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-08-19 10:21:14 +02:00
Marek Olšák
d13003f544 glsl: don't eliminate texcoords that can be set by GL_COORD_REPLACE
Tested by examining generated TGSI shaders from piglit/glsl-routing.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Henri Verbeet <hverbeet@gmail.com>
Tested-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-18 12:27:08 +02:00
Ilia Mirkin
a8346a2f52 nv50: allow non-nv12 buffers to be created, just pass them through to vl
Since we expose non-NV12 formats as supported when there is no decoder
profile selected, make sure that those formats are actually allowed to
be allocated.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-17 17:58:36 +02:00
Eric Anholt
bef423bee6 dri: Choose a decent global driNConfigOptions.
Previously, we were asserting that each driver specified an NConfigOptions
exactly equal to the number of options they supplied, leading to frequent
bugs when people would forget to adjust the value when adjusting driver
options.  Instead, just overallocate the table by a bit and leave sanity
checking to the assert in findOption().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-17 11:43:19 +02:00
Kenneth Graunke
703a2f4219 i965: Improve comments for driver hooks in intel_buffer_object.c.
Consistently using a "The ___ driver hook." line at the top of each
function's comment block makes it easy to see at a glance what function
is being implemented.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 19:00:49 -07:00
Kenneth Graunke
96a0fe7e4d i965: Split intel_upload code out into a separate file.
This upload code performs batched uploads via a BO.  By moving it out to
a separate file, intel_buffer_objects.c only provides the core buffer
object functionality.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 19:00:49 -07:00
Kenneth Graunke
76c2533470 i965: Move GL_APPLE_object_purgeable functionality into a new file.
GL_APPLE_object_purgeable creates a mechanism for marking OpenGL objects
as "purgeable" so they can be thrown away when system resources become
scarce.  It specifically applies to buffer objects, textures, and
renderbuffers.

The intel_buffer_objects.c file provides core functionality for GL
buffer objects, such as MapBufferRange and CopyBufferSubData.  Having
texture and renderbuffer functionality in that file is a bit strange.

The 2010 copyright on the new file is because Chris Wilson first added
this code in January 2010 (commit 755915fa).

v2: Actually remember to call the new dd table setup function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 19:00:49 -07:00
Marek Olšák
aafb0f9e06 radeonsi: fix feature support reporting
broken by 21d9a1b5ef
2013-08-17 02:49:00 +02:00
Niels Ole Salscheider
5394ee8f30 clover: Fix linkage of libOpenCL
Clover needs the option component of llvm.

Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2013-08-16 16:52:31 -07:00
Marek Olšák
21d9a1b5ef radeonsi: require LLVM 3.4 for MSAA
2013-08-17 01:48:25 +02:00
Marek Olšák
87b88f1dae radeonsi: don't make scanout resources linear except for cursors
The surface allocator understands the scanout flag just fine.

This seems to improve performance for Ubuntu Unity on top of st/xorg
and it fixes the cursor.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
89ca4a00f5 radeonsi: remove useless code from tex_fetch_args
The array slice has already been added to "address".

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
5550554f1e radeonsi: disable unbound colorbuffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
356c041167 radeonsi: port texture improvements from r600g
This started as an attempt to add support for MSAA texture transfers and
MSAA depth-stencil decompression for the DB->CB copy path.
It has gotten a bit out of control, but it's for the greater good.

Some changes do not make much sense; they are there just to make it look
like the other driver.

With a few cosmetic modifications, r600_texture.c can be shared with
a symlink.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
4855acd461 radeonsi: implement texture fetching for compressed MSAA textures (v2)
v2: use resource slots 16..31 for FMASK textures

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
f671dfa8aa radeonsi: add FMASK texture binding slots and resource setup (v2)
v2: bind FMASK textures to shader resource slots 16..31

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
3c3feb38f4 radeonsi: implement FMASK decompression for MSAA texturing
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
8c04f25360 radeonsi: scanout buffers cannot be a destination of MSAA resolve
Resolving to scanout buffers just doesn't work.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
2a4b2e2305 radeonsi: implement MSAA colorbuffer compression for rendering
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
2f1c449415 radeonsi: implement uncompressed MSAA texturing
This is glBlitFramebuffer support for MSAA surfaces as required by GL 3.0
and texturing as required by GL 3.2 and GL_ARB_texture_multisample.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
f083f79751 radeonsi: disable alpha-to-coverage for integer colorbuffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
6d4755a4d7 radeonsi: implement GL_SAMPLE_ALPHA_TO_ONE
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
07955d4f2b radeonsi: implement uncompressed MSAA rendering and color resolving
This is basic MSAA support which should work with most apps.
Some features are missing; those will be implemented by other commits.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-17 01:48:25 +02:00
Marek Olšák
c8e70e64ac radeonsi: add flexible shader descriptor management and use it for sampler views
It moves all sampler view descriptors to a buffer.
It supports partial resource updates and it can also unbind resources
(required for FMASK texturing).

The buffer contains all sampler view descriptors for one shader stage,
represented as an array. On top of that, there are N arrays in the buffer,
which are used to emulate context registers as implemented by the previous
ASICs (each array is a context).

This uses the RCU synchronization approach to avoid read-after-write hazards
as discussed in the thread:
"radeonsi: add FMASK texture binding slots and resource setup"

CP DMA is used to clear the descriptors at context initialization and to copy
the descriptors from one context to the next.

v2: - use PKT3_DMA_DATA on CIK (I'll test CIK later)
    - turn the bool CP DMA parameters into self-explanatory flags
    - add a nice simple API for packet emission to radeon_winsys.h
    - use 256 contexts, 128 causes texture corruption in openarena
2013-08-17 01:48:25 +02:00
Tom Stellard
764502b481 radeonsi/compute: Let the state tracker do all the flushing
It shouldn't be necessary to call radeon_winsys::cs_flush() from
radeonsi_launch_grid(), because the state tracker is responsible for
flushing the pipeline at the appropriate time.  The current behavior is
also wrong, because radeonsi_launch_grid() submits packets to the
compute ring, but when the state tracker calls pipe->flush() everything
is submitted to the graphics ring.  This has the potential to create a
race condition.

The downside of removing this flush is that the compute dispatch packets
will be sent to the graphics ring rather than the compute ring.
In the future we will need to come up with a way to detect 'compute'
command streams and submit them to the appropriate ring.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2013-08-17 01:48:25 +02:00
Kenneth Graunke
e29931aa74 i965: Dump more information about batch buffer usage.
Previously, INTEL_DEBUG=bat would dump messages like:

intel_mipmap_tree.c:1643: Batchbuffer flush with 456b used

This only reported the space used for command packets, and didn't
report any information on the space used for indirect state.

Now it dumps:

intel_context.c:366: Batchbuffer flush with 6128b (pkt) + 4288b (state)
= 10416b (31.8%)

This conveniently shows the breakdown of space used for packets vs.
state, as well as the percentage of batchbuffer space.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 15:54:24 -07:00
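The arithmetic behind the new message can be sketched as below. The function name and signature are hypothetical, and the batch size is passed in as a parameter: the 31.8% in the example is consistent with a 32768-byte batch (10416 / 32768 ≈ 0.318), but that size is an inference from the numbers above, not something the commit states.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a usage line from packet bytes, state bytes and total batch size.
 * Returns the snprintf() result. */
int hyp_format_batch_usage(char *buf, size_t size,
                           unsigned pkt_bytes, unsigned state_bytes,
                           unsigned batch_size)
{
    unsigned total = pkt_bytes + state_bytes;

    return snprintf(buf, size,
                    "Batchbuffer flush with %ub (pkt) + %ub (state) = %ub (%.1f%%)",
                    pkt_bytes, state_bytes, total,
                    100.0 * total / batch_size);
}
```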
Kenneth Graunke
2a9492f321 i965: Add Gen7 depth stall flushes before disabling depth in BLORP.
We emit these before configuring depth in the normal path, or actually
using the depth buffer in BLORP - we just failed to emit them when
disabling depth altogether.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-16 15:03:55 -07:00
Kenneth Graunke
8fba8d4ee7 i965: Add Gen6 depth stall flushes before disabling depth in BLORP.
We emit these before configuring depth in the normal path, or actually
using the depth buffer in BLORP - we just failed to emit them when
disabling depth altogether.

On Sandybridge, this also requires the post_sync_nonzero flush.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-16 15:03:38 -07:00
Matt Turner
9c48ae751a i965: Don't copy propagate bitcasts with source modifiers.
Previously, copy propagation would cause bitcast_f2u(abs(float)) to
be performed in a single step, but the application of source modifiers
(abs, neg) happens after type conversion, leading to incorrect results.

That is, for bitcast_f2u(abs(float)) we would in fact generate code to
do abs(bitcast_f2u(float)).

For example, whereas bitcast_f2u(abs(float)) might result in a register
argument such as
   (abs)g2.2<0,1,0>UD

v2: Set interfered = true and break in register_coalesce instead of
    returning false.

Reviewed-by: Paul Berry <stereoytpe441@gmail.com>
2013-08-16 13:11:07 -07:00
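Why the order matters can be shown on the CPU. The sketch below assumes IEEE 754 single-precision floats and models the source modifier as a two's-complement absolute value on the 32-bit pattern; that model is an assumption for illustration and is not a statement of what the hardware does with (abs) on a UD operand. All function names are hypothetical.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as an unsigned integer. */
uint32_t hyp_bitcast_f2u(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Intended order: take the float absolute value, then reinterpret. */
uint32_t hyp_abs_then_bitcast(float f)
{
    return hyp_bitcast_f2u(fabsf(f));
}

/* Folded order: reinterpret first, then apply an integer-style abs.
 * Negating in unsigned arithmetic keeps the behaviour well defined. */
uint32_t hyp_bitcast_then_int_abs(float f)
{
    uint32_t u = hyp_bitcast_f2u(f);

    if (u & 0x80000000u)
        u = 0u - u;
    return u;
}
```

For -1.0f the first function yields 0x3F800000 (the bits of 1.0f), while the second yields 0x40800000 (the bits of 4.0f), so the two orders disagree for negative inputs.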
Matt Turner
0ae9ca12a8 i965: Emit MOVs for neg/abs.
Necessary to avoid combining a bitcast and a modifier into a single
operation. Otherwise if safe, the MOV should be removed by
copy-propagation or register coalescing.

With this and the next patch, there are only four changes in shader-db:
all a single extra instruction. The code does something like
   mov a.w, -b.x
and copy propagation doesn't work because it only handles no-op
swizzles. Seems acceptable, given the known limitation of our copy
propagation.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereoytpe441@gmail.com>
2013-08-16 13:11:07 -07:00
Anuj Phogat
079bdba05f i965/blorp: Add support for single sample scaled blit with bilinear filter
Currently, single sample scaled blits with a GL_LINEAR filter fall
back to the meta path. This patch removes that limitation in the BLORP
engine and implements single sample scaled blit with bilinear filter.
No piglit or gles3 regressions are observed with this patch on Ivybridge.

V2: Use "sample" message to utilize the linear filtering functionality
built into the hardware.
V3: Define a bool variable (bilinear_filter) to handle the conditions
for GL_LINEAR blits.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 09:46:15 -07:00
Anuj Phogat
aff371b634 i965/blorp: Define a function to clamp texture coordinates
New function clamp_tex_coords() clamps the texture coordinates
to texture boundaries.  This function will also be utilized later
for the BLORP implementation of single-sample scaled blit with
bilinear filter.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 09:46:15 -07:00
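As a scalar illustration of clamping coordinates to texture boundaries: the sketch below is plain CPU code with hypothetical names, whereas the clamp_tex_coords() function named in the commit belongs to the BLORP program builder, whose signature is not shown here.

```c
/* Clamp one coordinate to the closed range [lo, hi]. */
float hyp_clamp_coord(float value, float lo, float hi)
{
    if (value < lo)
        return lo;
    if (value > hi)
        return hi;
    return value;
}

/* Clamp an (x, y) pair in place to the rectangle [x0, x1] x [y0, y1]. */
void hyp_clamp_tex_coords(float *x, float *y,
                          float x0, float y0, float x1, float y1)
{
    *x = hyp_clamp_coord(*x, x0, x1);
    *y = hyp_clamp_coord(*y, y0, y1);
}
```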
Anuj Phogat
6066fb1721 i965/blorp: Use more appropriate variable names
When we talk about both multi-sample and single-sample scaled blits,
rect_grid_{x1, y1} are more appropriate variable names as compared
to sample_grid_{x1, y1}. There are no functional changes in this patch.
It just prepares for the BLORP implementation of single-sample scaled
blit with bilinear filter.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 09:46:15 -07:00
Anuj Phogat
d944a6144f meta: Fix blitting a framebuffer with renderbuffer attachment
This patch fixes a case of framebuffer blitting with a renderbuffer
as color attachment and GL_LINEAR filter. The meta implementation of
glBlitFramebuffer() converts the source color buffer to a texture and
uses it to do the scaled blitting into the destination buffer. Using
the exact source rectangle to create the texture produces incorrect
linear filtering along the edges. This patch extends the texture
edges by one pixel in the x and y directions, which ensures correct
linear filtering.
It fixes the failing piglit fbo-attachments-blit-scaled-linear test.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-08-16 09:46:15 -07:00