Commit graph

73445 commits

Author SHA1 Message Date
Kenneth Graunke
73b01e2711 i965: Fix clear code for ignoring colormask for XRGB formats on Gen9+.
In commit cda886a485, Neil made us stop
advertising RGBX formats on Gen9+, as the hardware apparently no longer
has working fast clear support for those formats.  Instead, we just
fall back to RGBA formats, and use SCS to override alpha to 1.0.

This is fine, but had one unintended side effect: it made us fall back
to slow clears when the color mask disables alpha.  Normally, we ignore
the color mask for non-existent channels.  This includes alpha for XRGB
formats as writing garbage to the X channel is harmless.  But, now that
we use RGBA, we think there's a real alpha channel, and can't do the
optimization.

To hack around this, check if _BaseFormat is GL_RGB and ignore alpha.

Improves WebGL Aquarium performance on Skylake GT3e by about 50%
by letting it use repclears instead of slow clears.
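
As a rough illustration of the check described above, the following sketch ignores a disabled alpha write when the base format has no alpha channel. It is not the i965 code: `enum base_format` and `color_mask_blocks_clear()` are hypothetical names, standing in for the driver's test of `_BaseFormat` against `GL_RGB`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a renderbuffer's base format. */
enum base_format { BASE_FORMAT_RGB, BASE_FORMAT_RGBA };

/* Returns true if the color mask prevents the optimized clear path.
 * color_mask[i] is true when channel i (R, G, B, A) is writable. */
static bool
color_mask_blocks_clear(enum base_format base, const bool color_mask[4])
{
   assert(color_mask != NULL);

   for (int i = 0; i < 4; i++) {
      /* An RGB base format has no real alpha channel, so a disabled
       * alpha write can be ignored. */
      if (i == 3 && base == BASE_FORMAT_RGB)
         continue;

      if (!color_mask[i])
         return true;
   }
   return false;
}
```

With a mask that disables only alpha, this returns false for the RGB base format and true for RGBA, which matches the behavior the commit describes.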

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-21 12:01:49 -07:00
Iago Toral Quiroga
bdaa0e12a2 i965/blorp: Improve precision of blitting coordinates when clipping
We do this in two steps: first we clip the dst rect and adjust the src
rect accordingly. Then we do it the other way around. In both passes
the adjustment part involves multiplying by a scale factor that can lead
to a small precision loss. This is breaking a few dEQP tests.

Specifically, the problem happens when we need to clip the same coordinate
twice. For example, if srcX0 and dstX0 both need to be clipped we want to
avoid the situation where we clip srcX0 first, then adjust dstX0 accordingly
but then we realize that the resulting dstX0 still needs to be clipped, so
we clip dstX0 and adjust srcX0 again. Each of these two passes can lead
to precision loss. What we want to do here is detect the rect that leads
to the largest clip (accounting for the scale factor involved), clip that
rect and adjust the other one. With this we ensure that the adjusted
coordinate does not need to be clipped again and we can skip a second pass,
improving precision.
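
The single-pass idea can be sketched for one edge of a non-mirrored blit as follows. This is a simplified illustration, not the driver's implementation: `clip_left_edge()` and its parameters are hypothetical, and `scale` is assumed to mean source pixels per destination pixel.

```c
#include <assert.h>

/* Hypothetical helper: clip the left edges of a 1-D blit in one pass.
 * clipped_src and clipped_dst are how far each left edge lies outside
 * its surface (0 if it is already inside). */
static void
clip_left_edge(float *src_x0, float *dst_x0,
               float clipped_src, float clipped_dst, float scale)
{
   assert(scale > 0.0f);

   if (clipped_src >= clipped_dst * scale) {
      /* The source needs the larger clip once the scale factor is
       * accounted for: clip it and adjust the destination once. */
      *src_x0 += clipped_src;
      *dst_x0 += clipped_src / scale;
   } else {
      /* Otherwise clip the destination and adjust the source once. */
      *dst_x0 += clipped_dst;
      *src_x0 += clipped_dst * scale;
   }
}
```

Because the larger clip is applied first, the adjusted coordinate on the other rect already lies inside its surface, so only one multiplication or division by the scale factor is needed.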

Fixes the following 4 dEQP tests:
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_x_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_x_linear

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2016-04-21 10:43:39 -07:00
Bas Nieuwenhuizen
38f4cee3ff radeonsi: Add config parameter to si_shader_apply_scratch_relocs.
shader->config is not updated for compute kernels.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2016-04-21 19:36:19 +02:00
Matt Turner
1bc983cd64 glsl: Relax GLSL 1.10 float suffix error to a warning.
Float suffixes are allowed in all subsequent GLSL specifications, and
it's obvious what the user meant if they specify one. Accept it with a
warning to avoid breaking applications, like Planeshift (although it
looks like between 0.6.1 and 0.6.3 they might have removed the suffixes
from their shaders).

Reviewed-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:33:08 -07:00
Matt Turner
33565d6764 i965/fs: Readd opt_drop_redundant_mov_to_flags().
This reverts commit b449366587.

I removed the pass thinking that it was now not useful, but that was not
true. I believe I ran shader-db on HSW and saw no results, but HSW does
not use the unlit centroid workaround code and as a result does not emit
redundant MOV_DISPATCH_TO_FLAGS instructions.

On IVB, the shader-db results are:

total instructions in shared programs: 6650806 -> 6646303 (-0.07%)
instructions in affected programs: 106893 -> 102390 (-4.21%)
helped: 793

total cycles in shared programs: 56195538 -> 56103720 (-0.16%)
cycles in affected programs: 873048 -> 781230 (-10.52%)
helped: 553
HURT: 209

On SNB, the shader-db results are:

total instructions in shared programs: 7173074 -> 7168541 (-0.06%)
instructions in affected programs: 119757 -> 115224 (-3.79%)
helped: 799

total cycles in shared programs: 98128032 -> 98072938 (-0.06%)
cycles in affected programs: 1437104 -> 1382010 (-3.83%)
helped: 454
HURT: 237

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-21 10:32:40 -07:00
Topi Pohjolainen
0020ca3c92 i965/blorp: Do not emit pma stall on gen9+
This was left out from the original gen8 upload introduction.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 20:18:51 +03:00
Tim Rowley
81c1c481ed swr: add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT to get_param
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-21 11:32:09 -05:00
Emil Velikov
9dcb3dfb23 i965: automake: remove gratuitous "+" during variable assignment
There is no initial assignment, so appending to the variable does not work.

Fixes: b27c85c4c0 "i965: add build rule for brw_nir_trig_workarounds.c"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 16:48:34 +01:00
Rob Herring
1ba203a085 gbm: add GBM_FORMAT_XBGR8888 format support
Add GBM_FORMAT_XBGR8888/__DRI_IMAGE_FORMAT_XBGR8888 format support which
is needed for Android.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-21 14:45:56 +01:00
Rob Herring
ccdcf91104 st/dri: add 32-bit RGBX/RGBA formats
Add support for 32-bit RGBX/RGBA formats which are preferred for Android.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-21 14:45:53 +01:00
Rob Herring
3b69076435 dri/common: add MESA_FORMAT_R8G8B8{A8, X8}_UNORM formats as supported configs
Add MESA_FORMAT_R8G8B8A8_UNORM and MESA_FORMAT_R8G8B8X8_UNORM formats as
these are the preferred formats for Android.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-04-21 14:45:21 +01:00
Rob Herring
b27c85c4c0 i965: add build rule for brw_nir_trig_workarounds.c on Android
Commit bfd17c76c1 ("i965: Port INTEL_PRECISE_TRIG=1 to NIR.") added a
generated file brw_nir_trig_workarounds.c which broke the Android build.
Add the necessary makefiles to the Android build.

Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 14:43:26 +01:00
Rob Herring
30239ba056 glsl: android: add back missing generated glcpp include path
Commit 4db8f15a25 ("glsl: move the android build scripts a level up")
dropped a generated include path for glcpp. Add it back adjusting for the
new location.

Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 14:43:21 +01:00
Jonathan Gray
28e3ae344b loader: add a libdrm case for loader_get_device_name_for_fd
Use dev_node_from_fd() with HAVE_LIBDRM to provide an implementation
of loader_get_device_name_for_fd() for non-Linux systems that
use libdrm but don't have udev or sysfs.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 14:41:41 +01:00
Jonathan Gray
5d09394fb1 i965/tiled_memcpy: don't unconditionally use __builtin_bswap32
Use the defines Mesa configure sets to indicate presence of the bswap32
builtins.  This lets i965 work on OpenBSD again after the changes that
were made in 0a5d8d9af4.
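
A guarded byte swap along these lines might look like the sketch below. `portable_bswap32()` is a hypothetical name, and `HAVE___BUILTIN_BSWAP32` is assumed to be the macro the configure step defines when the builtin is available; the fallback branch is a plain shift-and-mask swap.

```c
#include <stdint.h>

/* Hypothetical wrapper: use the compiler builtin only when the build
 * system reports it, otherwise swap the bytes by hand. */
static inline uint32_t
portable_bswap32(uint32_t v)
{
#if defined(HAVE___BUILTIN_BSWAP32)
   return __builtin_bswap32(v);
#else
   return (v >> 24) |
          ((v >> 8) & 0x0000ff00u) |
          ((v << 8) & 0x00ff0000u) |
          (v << 24);
#endif
}
```

Both branches produce the same result, so callers do not need to know which one was compiled in.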

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-21 14:41:41 +01:00
Jonathan Gray
9bbf3737f9 egl/x11: authenticate before doing chipset id ioctls
For systems without udev or sysfs that use drm ioctls in the loader,
drm authentication must take place earlier or the loader will fail with
"MESA-LOADER: failed to get param for i915".

Patch from Mark Kettenis.

Cc: "11.2 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mark Kettenis <kettenis@openbsd.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
[Emil Velikov: remove gratuitous white-space]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-21 14:40:44 +01:00
Bas Nieuwenhuizen
4abe051a3f gallium/radeon: Silence possibly uninitialized variable warning.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 13:40:47 +02:00
Bas Nieuwenhuizen
51d1551241 winsys/amdgpu: Silence possibly uninitialized variable warning.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 13:40:42 +02:00
Bas Nieuwenhuizen
4d13c7c879 radeonsi: Enable loading into CE RAM.
We need to enable a bit in the CONTEXT_CONTROL packet for the
loads to work.

v2: Style issues.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-21 12:50:58 +02:00
Bas Nieuwenhuizen
f45f54e14a radeonsi: Use defines for CONTEXT_CONTROL instead of magic values.
v2: Use field names provided by Nicolai.
v3: Updated to use CONTEXT_CONTROL prefix.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 12:50:58 +02:00
Thomas Hindoe Paaboel Andersen
d4a21a0de0 winsys/amdgpu: fix preamble IB size
The missing break caused the IB size to be overwritten with
the size of IB_CONST.

This was introduced in: 7201230582
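
The class of bug is easy to reproduce in isolation. The sketch below is illustrative only: `ib_size_for_type()` is a hypothetical function, the enum names other than IB_CONST are assumptions, and the sizes are made-up values rather than the ones the winsys uses.

```c
/* Hypothetical IB kinds; only IB_CONST is named in the commit message. */
enum ib_type { IB_CONST_PREAMBLE, IB_CONST, IB_MAIN };

/* Pick an IB size by type. The sizes are illustrative values only. */
static unsigned
ib_size_for_type(enum ib_type type)
{
   unsigned ib_size = 0;

   switch (type) {
   case IB_CONST_PREAMBLE:
      ib_size = 4 * 1024;
      break; /* Without this break, control falls through into the
              * IB_CONST case and the preamble size is overwritten. */
   case IB_CONST:
      ib_size = 128 * 1024;
      break;
   case IB_MAIN:
      ib_size = 64 * 1024;
      break;
   }
   return ib_size;
}
```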

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-04-21 12:14:50 +02:00
Topi Pohjolainen
935ce14a44 i965/blorp: Reduce the urb size requirement for vertex buffer
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
26fdb7e51e i965/blorp: Reduce the size of vertex buffer
Previously the vertex buffer consisted of eight floats per vertex,
of which six were constants. These can just as easily be provided by
the vertex fetcher, as it is capable of filling vertex elements with
constant one and zero. This reduces the size of the vertex buffer
from 3 * 8 * 4 = 96 to 3 * 2 * 4 = 24 bytes.
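
The size arithmetic can be checked with a small helper. `vertex_buffer_bytes()` is a hypothetical function written for this illustration, and it assumes a 4-byte `float`.

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed for a tightly packed buffer of
 * num_vertices vertices with floats_per_vertex floats each. */
static size_t
vertex_buffer_bytes(unsigned num_vertices, unsigned floats_per_vertex)
{
   return (size_t)num_vertices * floats_per_vertex * sizeof(float);
}
```

Three vertices with eight floats each need 96 bytes, while three vertices with two floats each need 24 bytes.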

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
0ae360f098 i965/blorp: Do not trigger urb re-configuration unnecessarily
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
69dfb7b2b7 i965/blorp: Skip re-emitting urb config whenever possible
Otherwise clearing with blorp will regress performance in some
synthetic test cases.

v2: Used vsize >= 2 instead of vsize > 0, and updated the comment.
    Review by Ken in one of the earlier patches revealed this.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
7644e8ab68 i965/blorp: Prepare to switch from compute pipeline
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
aa322f8ae5 i965/blorp: Skip uploading state/options not needed for clears
If there is no source, the program does a simple clear or a resolve.
In that case there is no need to program sampling state or enable
pixel kill in the fragment shader.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
87d333f2fe i965/blorp: Re-introduce clear programs
This partially reverts 2f28a0dc23

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
69c364f2dc i965/meta: Move check for srgb into is_color_fast_clear_compatible()
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
8a696e75d8 i965/meta: Expose check for fast clear compatibility
Also add the additional render format check to the same utility.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
a848ad6806 i965/meta: Expose fast clear value setup
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:03 +03:00
Topi Pohjolainen
fb14a2fc78 i965/meta: Expose non-fast clear rectangle calculation
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
9d79235e4e i965/meta: Expose resolve clear rectangle calculation
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
2757d723da i965/meta: Expose fast clear rectangle calculation
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
3ef957e783 i965: Declare input to mcs alignment calculation constant
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
c40b1efa70 i965/blorp: Switch the order of render and texture targets
On gen8 color resolving won't work anymore if the target isn't
the first entry in the binding table.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
0d062d79c3 i965/blorp: Reduce scope for generator and its inputs
Generator is only needed for getting the assembly.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
4c3de6b2d6 i965/blorp: Add support for disabling color blending
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
da5a477ce4 i965/blorp: Add support for setting fast clear operation
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
7de72f728b i965/blorp: Enable blits on gen8
v2 (Ken): Moved switch cases for gen8/9 in texel_fetch() to
          earlier patch adding gen8/9 sampling support.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
f7ab4e0cc4 i965/blorp: Prepare stencil sampling for gen8
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:02 +03:00
Topi Pohjolainen
708453952b i965/blorp: Add check for supported sample numbers
v2 (Ken): Fix the condition on using meta for stencil blits:
          use_blorp -> !use_blorp

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:01 +03:00
Topi Pohjolainen
9e4d19372b i965/blorp: Add support for sampling 3D textures
This patch adds an additional MOV instruction to all blorp programs
that use SHADER_OPCODE_TXF. The alternative is to augment the blorp
program key to tell whether the z-coordinate is needed, add a condition
to the blorp blit compiler, and produce variants with and without the
MOV. This seems a little overkill.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:01 +03:00
Topi Pohjolainen
6b33d63d77 i965/blorp: Add support for source swizzle
In order to support cases where gen9 uses an RGBA format to back a
client-requested RGB surface, one needs a means to force the alpha
channel to one when that RGB surface is used as a blit source.
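
The effect of such a swizzle can be sketched in plain C. This is an illustration rather than the blorp code: `enum swizzle`, `apply_swizzle()`, and `rgb_on_rgba` are hypothetical names.

```c
/* Hypothetical swizzle selectors: pick a stored channel or a constant. */
enum swizzle { SWZ_R, SWZ_G, SWZ_B, SWZ_A, SWZ_ZERO, SWZ_ONE };

/* Apply a four-component swizzle to a fetched texel. */
static void
apply_swizzle(const float texel[4], const enum swizzle swz[4], float out[4])
{
   for (int i = 0; i < 4; i++) {
      switch (swz[i]) {
      case SWZ_ZERO:
         out[i] = 0.0f;
         break;
      case SWZ_ONE:
         out[i] = 1.0f;
         break;
      default:
         /* SWZ_R..SWZ_A have the values 0..3 and index the texel. */
         out[i] = texel[swz[i]];
         break;
      }
   }
}

/* Swizzle for an RGB surface backed by RGBA storage: keep R, G and B,
 * and ignore whatever happens to be stored in the alpha channel. */
static const enum swizzle rgb_on_rgba[4] = { SWZ_R, SWZ_G, SWZ_B, SWZ_ONE };
```

Applying `rgb_on_rgba` to any texel leaves the color channels unchanged and always yields an alpha of 1.0, regardless of the stored alpha value.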

v2 (Ken): Use helper for constructing the swizzle (this should be
          changed to use brw_get_texture_swizzle() as a follow-up).
          Also calculate the swizzle for CopyTexSubImage.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:01 +03:00
Topi Pohjolainen
52e7008a5a i965/blorp: Pipeline upload support for gen8
v2 (Ken): Drop GEN8_RASTER_FRONT_WINDING_CCW in raster state
          Add emission of pma stall.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:20:01 +03:00
Topi Pohjolainen
2fda441371 i965/gen8: Expose pma stall emission
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 10:19:30 +03:00
Topi Pohjolainen
8b2332e3d1 i965: Allow texture surface state setup to be used by blorp
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:42:10 +03:00
Topi Pohjolainen
0ad83d222b i965/blorp: Prepare sampling for gen9
v2 (Ken): Added switch cases for gen8/9 in texel_fetch(). These
          were wrongly introduced in blit-enabling patch.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:41:40 +03:00
Topi Pohjolainen
328ab6c268 i965/blorp: Prepare render target write for gen8
v2 (Ken): Use payload directly instead of retyping it into vec8.
          Drop the implied header, it isn't used for gen6+ anyway.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:40:33 +03:00
Topi Pohjolainen
135f00e666 i965/blorp/gen6: Prepare vertex buffer setup logic for gen8
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-21 08:37:06 +03:00