Commit graph

73939 commits

Author SHA1 Message Date
Julien Isorce
1bdea0e579 st/va: handle Video Post Processing for configs
Add support for VA_PROFILE_NONE and VAEntrypointVideoProc
in the 4 following functions:

vlVaQueryConfigProfiles
vlVaQueryConfigEntrypoints
vlVaCreateConfig
vlVaQueryConfigAttributes

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-10-30 13:20:29 +01:00
Julien Isorce
0b868807e4 st/va: add colospace conversion through Video Post Processing
Add support for VPP in the following functions:
vlVaCreateContext
vlVaDestroyContext
vlVaBeginPicture
vlVaRenderPicture
vlVaEndPicture

Add support for VAProcFilterNone in:
vlVaQueryVideoProcFilters
vlVaQueryVideoProcFilterCaps
vlVaQueryVideoProcPipelineCaps

Add handleVAProcPipelineParameterBufferType helper.

One application is:
VASurfaceNV12 -> gstvaapipostproc -> VASurfaceRGBA

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-10-30 13:20:10 +01:00
Julien Isorce
05b6ce4209 st/va: implement dmabuf import for VaCreateSurfaces2
For now it is limited to RGBA, BGRA, RGBX, BGRX surfaces.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-10-30 13:20:03 +01:00
Julien Isorce
adf1133118 st/va: implement VaCreateSurfaces2 and VaQuerySurfaceAttributes
Inspired from http://cgit.freedesktop.org/vaapi/intel-driver/
especially src/i965_drv_video.c::i965_CreateSurfaces2.

This patch is mainly to support gstreamer-vaapi and tools that uses
this newer libva API. The first advantage of using VaCreateSurfaces2
over existing VaCreateSurfaces, is that it is possible to select which
the pixel format for the surface. Indeed with the simple VaCreateSurfaces
function it is only possible to create a NV12 surface. It can be useful
to create a RGBA surface to use with video post processing.

The avaible pixel formats can be query with VaQuerySurfaceAttributes.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-10-30 13:19:54 +01:00
Julien Isorce
d42029d2d9 st/va: do not destroy old buffer when new one failed
If formats are not the same vlVaPutImage re-creates the video
buffer with the right format. But if the creation of this new
video buffer fails then the surface looses its current buffer.
Let's just destroy the previous buffer on success.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-10-30 13:19:47 +01:00
Julien Isorce
87109e5f88 st/va: properly defines VAImageFormat formats and improve VaCreateImage
Added PIPE_VIDEO_CHROMA_FORMAT_NONE in p_format.h
and return it by default in ChromaToPipe.

Renamed YCbCrToPipe to VaFourccToPipeFormat because it now
contains RGB.

Implemented PipeFormatToVaFourcc which will be used later in
VlVaDeriveImage.

Note that gstreamer-vaapi check all the VAImageFormat fields.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-10-30 13:05:23 +01:00
Samuel Iglesias Gonsalvez
7b8cc37585 main: fix basename match's check if it's an array or struct
Commit 4565b6f did not update the basename match's check for
the case that string would exactly match the name of the
variable if the suffix "[0]" were appended to it.

Fixes two dEQP-GLES31 tests:

dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array
dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element

v2:
- Change the position of rname_has_array_index_zero to avoid an out-of-bounds
  read. Reported by Tapani Pälli.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-10-30 08:12:53 +01:00
Kristian Høgsberg
f7f1bc6cca i965: Fix invalid memory accesses after resizing brw_codegen's store table
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-10-30 07:49:10 +01:00
Connor Abbott
73caa26e43 i965/sched: use liveness analysis for computing register pressure
Previously, we were using some heuristics to try and detect when a write
was about to begin a live range, or when a read was about to end a live
range. We never used the liveness analysis information used by the
register allocator, though, which meant that the scheduler's and the
allocator's ideas of when a live range began and ended were different.
Not only did this make our estimate of the register pressure benefit of
scheduling an instruction wrong in some cases, but it was preventing us
from knowing the actual register pressure when scheduling each
instruction, which we want to have in order to switch to register
pressure scheduling only when the register pressure is too high.

This commit rewrites the register pressure tracking code to use the same
model as our register allocator currently uses. We use the results of
liveness analysis, as well as the compute_payload_ranges() function that
we split out in the last commit. This means that we compute live ranges
twice on each round through the register allocator, although we could
speed it up by only recomputing the ranges and not the live in/live out
sets after scheduling, since we only shuffle around instructions within
a single basic block when we schedule.

Shader-db results on bdw:

total instructions in shared programs: 7130187 -> 7129880 (-0.00%)
instructions in affected programs: 1744 -> 1437 (-17.60%)
helped: 1
HURT: 1

total cycles in shared programs: 172535126 -> 172473226 (-0.04%)
cycles in affected programs: 11338636 -> 11276736 (-0.55%)
helped: 876
HURT: 873

LOST:   8
GAINED: 0

v2: use regs_read() in more places.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-30 02:19:43 -04:00
Connor Abbott
c1860299b8 i965/fs: split out calculation of payload live ranges
We'll need this for the scheduler too, since it wants to know when the
live ranges of payload registers end in order to model them in our
register pressure calculations.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-30 02:19:33 -04:00
Connor Abbott
45cd76e342 i965: dump scheduling cycle estimates
The heuristic we're using is rather lame, since it assumes everything is
non-uniform and loops execute 10 times, but it should be enough for
measuring improvements in the scheduler that don't result in a change in
the number of instructions.

v2:
- Switch loops and cycle counts to be compatible with older shader-db.
- Make loop heuristic 10x to match with spilling code.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-30 02:19:24 -04:00
Connor Abbott
486268bdb0 i965: always run the post-RA scheduler
Before, we would only do scheduling after register allocation if we
spilled, despite the fact that the pre-RA scheduler was only supposed to
be for register pressure and set the latencies of every instruction to
1. This meant that unless we spilled, which we rarely do, then we never
considered instruction latencies at all, and we usually never bothered
to try and hide texture fetch latency. Although a later commit removes
the setting the latency to 1 part, we still want to always run the
post-RA scheduler since it's able to take the false dependencies that
the register allocator creates into account, and it can be more
aggressive than the pre-RA scheduler since it doesn't have to worry
about register pressure at all.

Test                   master      post-ra-sched     diff       %diff
bench_OglPSBump2       396.730     402.386           5.656      +1.400%
bench_OglPSBump8       244.370     247.591           3.221      +1.300%
bench_OglPSPhong       241.117     242.002           0.885      +0.300%
bench_OglPSPom         59.555      59.725            0.170      +0.200%
bench_OglShMapPcf      86.149      102.346           16.197     +18.800%
bench_OglVSTangent     388.849     395.489           6.640      +1.700%
bench_trex             65.471      65.862            0.390      +0.500%
bench_trexoff          69.562      70.150            0.588      +0.800%
bench_heaven           25.179      25.254            0.074      +0.200%

Reviewed-by: Jason Ekstrand <jasoan.ekstrand@intel.com>
2015-10-30 02:19:00 -04:00
Connor Abbott
85fce2d2f5 i965/sched: write-after-read dependencies are free
Although write-after-write dependencies have the same latency as
read-after-write dependencies due to how the register scoreboard works,
write-after-read dependencies aren't checked by the EU at all, so
they're purely a constraint on how the scheduler can order the
instructions.

v2: fix accumulator dependencies too.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-30 02:18:56 -04:00
Connor Abbott
6f231fddff i965: fix cycle estimates when there's a pipeline stall
The issue time for an instruction is how many cycles it takes to
actually put it into the pipeline. If there's a pipeline stall that
causes the instruction to be delayed, we should first take that into
account to figure out when the instruction would start executing and
*then* add the issue time. The old code had it backwards, and so we
would underestimate the total time whenever we thought there would be a
pipeline stall by up to the issue time of the instruction.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-30 02:18:53 -04:00
Eric Anholt
04c42f3ab5 vc4: Allow user index buffers, to avoid slow readback for shadow IBs.
Improves low-settings openarena performance by 31.9975% +/- 0.659931%
(n=7).
2015-10-29 22:58:01 -07:00
Ilia Mirkin
06fa2e864a nv50: mark contexts shareable, compile at creation time
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-29 23:25:08 -04:00
Ilia Mirkin
f768eaa87d nv50: allow per-sample interpolation to be forced via rast
Uses the same technique as for nvc0 of fixups before upload, and
evicting in case of state change. Removes one source of variants kept by
st/mesa.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-29 22:42:38 -04:00
Matt Turner
85ee2f7fcf i965: Add INTEL_DEBUG=nocompact to disable instruction compaction.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
93268939e4 i965: Add INTEL_DEBUG=hex to print the hex with the disassembly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
18b194925f i965: Print the type and writemask on null destinations.
These are often useful in debugging, and the writemask (actually
"Channel Enables") determines more than just what goes into the
destination.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
bcdf664682 i965: Test fixed_hw_reg.file against BRW_IMMEDIATE_VALUE, not IMM.
No functional change, since they were both 3, but BRW_IMMEDIATE_VALUE is
the hardware value and IMM was the IR value -- and you can see that
BRW_IMMEDIATE_VALUE was correctly used in the context of this patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
ee46c1e626 i965/vec4: Test against BRW_IMMEDIATE_VALUE, not IMM.
No functional change, since they were both 3, but BRW_IMMEDIATE_VALUE is
the hardware value and IMM was the IR value -- and you can see that
BRW_IMMEDIATE_VALUE was correctly used in the context of this patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
8c4151b866 i965/fs: Use group(4, 0) to emit an exec-size 4 MOV.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
9115fa1d13 i965/cfg: Handle no-idom case in cfg_t::dump_domtree().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
5916b073f6 i965/disasm: Remove unused _addr_mode argument from src_ia1().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
e09e5f992e i965: Set correct field for indirect align16 addrimm.
This has been wrong since the initial import of the i965 driver.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
fa142773d9 i965/vec4: Drop brw_set_default_* before popping insn state.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-29 17:51:16 -07:00
Matt Turner
11a7b6bbaa i965/vec4: Remove unnecessary #includes from the generator.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-10-29 17:51:16 -07:00
Dave Airlie
744cc036b9 r600: enable SB for geom shaders on pre-evergreen
I've checked with piglit and one tests fails, but it fails
on evergreen as well, so will get fixed later.

Otherwise SB seems to be working fine for geom shaders on my
rv635.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-10-30 10:40:59 +10:00
Kenneth Graunke
c6b24448b5 i965/vec4: Eliminate the vec4_generator class altogether.
We really weren't taking advantage of vec4_generator being a class.
By adding a "p" parameter to the helper methods, and "prog_data" to
ones which need binding table information, we can convert everything
to static functions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-29 16:56:41 -07:00
Kenneth Graunke
1a094a2ee2 i965/vec4: Move vec4_generator class definition into the .cpp file.
The public API for the generator is brw_vec4_generate_code(); nobody
actually needs to use the class.  This means we can extend it without
triggering the recompiles associated with altering brw_vec4.h.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-29 16:56:41 -07:00
Kenneth Graunke
4cba8f5d21 i965/vec4: Wrap vec4_generator in a C function.
vec4_generator is a class for convenience, but only exports a single
method as its public API.  It makes much more sense to just export a
single function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-29 16:56:41 -07:00
Kenneth Graunke
73ff0ead36 i965/vec4: Convert src_reg/dst_reg to brw_reg at the end of the visitor.
This patch makes the visitor convert registers to the HW_REG file at the
very end, after register allocation, post-RA scheduling, and dependency
control flagging.  After that, everything is in fixed brw_regs.

This simplifies the code generator, as it can just use the hardware
registers rather than having to interpret our abstract files.  In
particular, interpreting the UNIFORM file meant reading prog_data
to figure out where push constants are supposed to start.

Having the part of the code that performs register allocation also
translate everything to hardware registers seems sensible.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-29 16:56:41 -07:00
Ivan Kalvachev
f75f21a24a r600g: Fix special negative immediate constants when using ABS modifier.
Some constants (like 1.0 and 0.5) could be inlined as immediate inputs
without using their literal value. The r600_bytecode_special_constants()
function emulates the negative of these constants by using NEG modifier.

However some shaders define -1.0 constant and want to use it as 1.0.
They do so by using ABS modifier. But r600_bytecode_special_constants()
set NEG in addition to ABS. Since NEG modifier have priority over ABS one,
we get -|1.0| as result, instead of |1.0|.

The patch simply prevents the additional switching of NEG when ABS is set.

[According to Ivan Kalvachev, this bug was fond via
https://github.com/iXit/Mesa-3D/issues/126 and
https://github.com/iXit/Mesa-3D/issues/127]

Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
CC: <mesa-stable@lists.freedesktop.org>
2015-10-29 23:56:57 +01:00
Nicolai Hähnle
24c90888ae st/mesa: fix mipmap generation for immutable textures with incomplete pyramids
Without the clamping by NumLevels, the state tracker would reallocate the
texture storage (incorrect) and even fail to copy the base level image
after reallocation, leading to the graphical glitch of
https://bugs.freedesktop.org/show_bug.cgi?id=91993 .

A piglit test has been submitted for review as well (subtest of
arb_texture_storage-texture-storage).

v2: also bypass all calls to st_finalize_texture (suggested by Marek Olšák)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-10-29 23:56:57 +01:00
Nanley Chery
65f6caf43e mesa: Enable ASTC in GLES' [NUM_]COMPRESSED_TEXTURE_FORMATS queries
In OpenGL ES, the COMPRESSED_TEXTURE_FORMATS query returns the set of
supported specific compressed formats. Since ASTC formats fit within
that category, include them in the set and update the
NUM_COMPRESSED_TEXTURE_FORMATS query as well.

This enables GLES2-based ASTC dEQP tests to run. See the Bugzilla for
more info.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92193
Reported-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-29 15:40:37 -07:00
Nanley Chery
8090a1c326 mesa/texcompress: Restrict FXT1 format to desktop GL subset
In agreement with the extension spec and commit
dd0eb00487, filter FXT1 formats to the
desktop GL profiles. Now we no longer advertise such formats as supported
in an ES context and then throw an INVALID_ENUM error when the client
tries to use such formats with CompressedTexImage2D.

Fixes the following 26 dEQP tests:
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_neg_x
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_neg_y
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_neg_z
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_pos_x
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_pos_y
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_pos_z
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_size
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_level_max_cube_pos
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_level_max_tex2d
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_level_cube
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_level_tex2d
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_neg_x
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_neg_y
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_neg_z
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_pos_x
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_pos_y
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_pos_z
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_tex2d
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_neg_x
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_neg_y
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_neg_z
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_pos_x
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_pos_y
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_pos_z
* dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_tex2d

v2. Use _mesa_is_desktop_gl() (Ilia, Ian)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-10-29 15:31:59 -07:00
Samuel Pitoiset
0260620ab3 nvc0: expose a group of performance metrics on Fermi
This allows to monitor those performance metrics through
GL_AMD_performance_monitor.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-10-29 23:01:29 +01:00
Ilia Mirkin
6166a8e369 st/mesa: create temporary textures with the same nr_samples as source
Not sure if this is actually reachable in practice (to have a complex
copy with MS textures).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-10-29 13:20:45 -04:00
Tapani Pälli
afbe8b6085 glsl: add fragdata arrays to program resource list
This makes sure that user is still able to query properties about
variables that have gotten removed by opt_dead_builtin_varyings pass.

Fixes following OpenGL ES 3.1 test:
   ES31-CTS.program_interface_query.output-layout

No Piglit regressions.

v2: cleanup, drop extra parenthesis (Topi)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
2015-10-29 17:17:42 +02:00
Tapani Pälli
6ce0857e30 mesa: add fragdata_arrays list to gl_shader
This is required to store information about fragdata arrays, currently
these variables get lost and cannot be retrieved later in sensible way
for program interface queries. List will be utilized by next patch.

Patch also modifies opt_dead_builtin_varyings pass to build list when
lowering fragdata arrays. This is identical approach as taken with
packed varyings pass.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
2015-10-29 17:16:22 +02:00
Samuel Iglesias Gonsalvez
85f1f04413 glsl: fix GL_BUFFER_DATA_SIZE value for shader storage blocks with unsize arrays
From ARB_program_interface_query:

"For the property of BUFFER_DATA_SIZE, then the implementation-dependent
 minimum total buffer object size, in basic machine units, required to hold
 all active variables associated with an active uniform block, shader
 storage block, or atomic counter buffer is written to <params>.  If the
 final member of an active shader storage block is array with no declared
 size, the minimum buffer size is computed assuming the array was declared
 as an array with one element."

Fixes the following dEQP-GLES31 tests:

dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.named_block
dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.unnamed_block
dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.block_array

v2:
- Fix comment's indentation and explain that the parser already
  checked that unsized array is in last element of a shader
  storage block (Iago).
- Add assert (Iago).

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-10-29 08:29:06 +01:00
Kenneth Graunke
cf93251bed docs: Mark GL_ARB_fragment_layer_viewport as done on i965. 2015-10-28 22:05:08 -07:00
Kenneth Graunke
8c902a580a i965: Implement ARB_fragment_layer_viewport.
Normally, we could read gl_Layer from bits 26:16 of R0.0.  However, the
specification requires that bogus out-of-range 32-bit values written by
previous stages need to appear in the fragment shader as-written.

Instead, we pass in the full 32-bit value from the VUE header as an
extra flat-shaded varying.  We have the SF override the value to 0
when the previous stage didn't actually write a value (it's actually
defined to return 0).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-10-28 22:05:08 -07:00
Kenneth Graunke
5392328a32 i965: Make calculate_attr_overrides return the URB read offset.
Traditionally, we've hardcoded "URB Entry Read Offset" to 1 (which
represents 2 vec4 varying slots) to skip over the 8 DWord VUE header.

In order to support ARB_fragment_layer_viewport, we'll need to read
from that header.  This patch adds the basic plumbing necessary to
calculate a value dynamically and hook it up in the SBE packets.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-10-28 22:05:08 -07:00
Kenneth Graunke
b3d19d20f2 glsl: Mark gl_ViewportIndex and gl_Layer varyings as flat.
Integer varyings need to be flat qualified - all others were already.
I think we just missed this.  Presumably some hardware passes this via
sideband and ignores attribute interpolation, so no one has noticed.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-10-28 22:05:08 -07:00
Kenneth Graunke
b94cdcdada i965/fs: Properly check for PAD in fragment shaders with > 16 varyings.
Commit 268008f98c changed unused VUE map
slots to be initialized with BRW_VARYING_SLOT_PAD, not COUNT.  I missed
updating this.  It also means that commit message was wrong, as some
code *did* rely slots being initialized to COUNT.

This may fix a bug with SSO programs with > 16 FS input varyings.
I think we probably just emitted extra pointless code, but probably
didn't break anything.  We might also just have no tests for that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-10-28 22:05:08 -07:00
Kenneth Graunke
6ae47a3eb4 i965: Update stale comment about unused VUE map slots.
I changed this from COUNT to PAD in commit 268008f98c.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-10-28 22:05:08 -07:00
Ilia Mirkin
5227e91580 nv50/ir: adapt to new method for passing in cull/clip distance masks
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-29 00:44:22 -04:00
Ilia Mirkin
a5bae7b31d nvc0: share shaders between contexts and build immediately
Avoid deferring building shaders until draw time, should hopefully
reduce any stuttering, as well as enable shader-db style analysis.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-29 00:44:22 -04:00