Commit graph

96477 commits

Author SHA1 Message Date
Eric Anholt
e676434856 mesa: Implement a new GL_MESA_tile_raster_order extension.
The intent is to use this extension on vc4 to allow X11 to do overlapping
CopyArea() within a pixmap without first blitting the pixmap to a
temporary.  With associated glamor patches, improves x11perf
-copywinwin100 performance on a Raspberry Pi 3 from ~4700/sec to
~5130/sec, and is an even larger boost to uncomposited window movement
performance (most copywinwin100 copies don't overlap).

v2: Fix glIsEnabled() on the new enums.
v3: Drop the local spec since I'm upstreaming the spec.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 10:45:22 -07:00
Eric Anholt
087b39a346 broadcom/vc4: Expose PIPE_CAP_TILE_RASTER_ORDER
Because vc4 can control the order that tiles are rasterized in, we can use
it to implement overlapping blits using normal drawing and
GL_ARB_texture_barrier, as long as we can tell the kernel what order to
render the tiles in.

v2: Fix on the simulator.
v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
v5: Split from the core gallium commit, drop some unnecessary code related
    to glBlitFramebuffer(), fix a crash with clears before state has been
    bound.
2017-10-10 10:45:22 -07:00
Eric Anholt
ac0051a507 gallium: Create a new PIPE_CAP_TILE_RASTER_ORDER for vc4.
Because vc4 can control the order that tiles are rasterized in, we can use
it to implement overlapping blits using normal drawing and
GL_ARB_texture_barrier, as long as we can tell the kernel what order to
render the tiles in.

This commit introduces the core gallium support, vc4 changes will follow.

v2: Fix on the simulator.
v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
v5: Drop vc4 changes from this commit, for clarity.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
2017-10-10 10:45:22 -07:00
Eric Anholt
4aa700e0e0 broadcom/vc4: Implement GL_ARB_texture_barrier.
Improves x11perf -copywinwin100 from ~2000/sec to ~4700/sec.  More
importantly, this is a prerequisite for the new GL_MESA_tile_raster_order
extension.
2017-10-10 10:45:22 -07:00
Eric Anholt
13b303ff92 docs: Update the list of used MESA GL enums.
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 10:45:22 -07:00
Eric Anholt
9ab0d83079 docs: Fix a typo in the old MESA_program_debug spec.
Noticed that we had two 0x8bb4 in the spec while grepping to find an open
slot in the MESA enums set.  gl.xml had the right value.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 10:45:22 -07:00
Brian Paul
a3b2e60aa0 git_sha1_gen: accept MESA_GIT_SHA1_OVERRIDE env var
If one uses a parent build script to download/build Mesa we may not
have a full git repository (maybe a tar archive) so the 'git rev-parse'
command will fail.

This updates the script to look for a MESA_GIT_SHA1_OVERRIDE env var.
If it's set, use that sha1 instead of using git rev-parse.  With this
change we can put a git hash in the GL_VERSION string even when we
don't have a git repo.

v2: incorporate Dylan's suggestions to simplify the code

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-10 11:28:31 -06:00
Brian Paul
c43b0d3f91 mesa: move _mesa_half_is_negative() to half_float.h
v2: use !! in the function to be explicit about type conversion.  Though,
gcc generates the same code with or without the logical !!.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-10 11:28:31 -06:00
Brian Paul
3c5664b78d mesa: move _mesa_exec_malloc/free() prototypes to their own header
Try to start removing things from the cluttered imports.h file.

v2: add new header to Makefile.sources

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-10-10 11:28:31 -06:00
Kenneth Graunke
d670dd6b65 i965: minor whitespace fix 2017-10-10 10:18:17 -07:00
Eric Anholt
45f34d733b mesa: Set new renderbuffers to RGBA4 on all GLES contexts.
Before we were doing RGBA4 on GLES3 only, but as of GLES2 2.0.22 it should
be RGBA4 as well.  Fixes DEQP
functional.state_query.rbo.renderbuffer_internal_format.

Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 09:31:29 -07:00
Eric Anholt
c16a7443e9 mesa: Expose GL_OES_required_internalformat on GLES contexts.
This extension is effectively a backport of GLES3's internalformat
handling to GLES 1/2.  It guarantees that sized internalformats specified
for textures and renderbuffers have at least the specified size stored.
That's a pretty minimal requirement, so I think it can be dummy_true and
exposed as a standard in Mesa.

As a side effect, it also allows GL_RGB565 to be specified as a texture
format, not just as a renderbuffer.  Mesa had previously been allowing 565
textures, which angered DEQP in the absence of this extension being
exposed.

v2: Allow 2101010rev with sized internalformats even on GLES3, citing the
    extension spec.  Extend extension checks for GLES2 contexts exposing
    with texture_float, texture_half_float, and texture_rg.
v3: Fix ALPHA/LUMINANCE/LUMINANCE_ALPHA error checking (GLES3 CTS
    failures)
v4: Mark GL_RGB10 non-color-renderable on ES, fix A/L/LA errors on GLES2
    with float formats.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 09:31:29 -07:00
Eric Anholt
cee5585da7 mesa: Only expose GLES's EXT_texture_type_2_10_10_10_REV if supported in HW.
Previously, we were downconverting to 8888 automatically if the hardware
didn't suport it.  However, with the advent of
GL_OES_required_internalformat, we have to actually store the
internalformats we advertise support for.  And, it seems rather
disingenuous to advertise the extension if we don't actually support it.

v2: Throw an error when using the format on ES2 without the extension present.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 09:31:29 -07:00
Eric Anholt
cbb532429b vc4: Add support for 5551 textures.
This keeps us from promoting them up to 8888, at the cost of not being
color-renderable.
2017-10-10 09:31:29 -07:00
Eric Anholt
ef874ee450 gallium: Add support for 5551 with the 1-bit field in the low bit.
This is how VC4 stores 5551 textures, which we need to support for
GL_OES_required_internalformat.

v2: Extend commit message, fix svga driver build, add BE ordering from
    Roland.
v3: Rebase on PIPE_FORMAT_R10G10B10X2_UNORM addition.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)
2017-10-10 09:31:29 -07:00
Eric Anholt
3078296226 mesa: Add X1B5G5R5 along with A1B5G5R5.
For supporting RGB5 in hardware with A in the low bit (vc4), we need this
format as well.

v2: Add proper _mesa_format_matches_format_and_type() support (from
    Nicolai).

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-10-10 09:31:29 -07:00
Nicolai Hähnle
fbcae1897b st_api: remove unused get_resource_for_egl_image
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:48 +02:00
Nicolai Hähnle
e14fe41e0b st/dri: implement createImageFromRenderbuffer(2)
Tested with dEQP-EGL.functional.image.*renderbuffer* tests.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-10 13:58:48 +02:00
Nicolai Hähnle
4ec2ac11bd egl/dri: remove old left-overs
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:47 +02:00
Nicolai Hähnle
bad24395d9 egl/dri: use createImageFromRenderbuffer2 when available
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:47 +02:00
Nicolai Hähnle
d0d6efcc64 egl/dri: factor out egl_error_from_dri_image_error
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:47 +02:00
Nicolai Hähnle
f12e1c5586 dri_interface: add an error-returning version of createImageFromRenderbuffer
We ought to be able to distinguish between allocation errors and bad
parameters (non-existent renderbuffer object).

Bumps the version of the DRI Image extension to 17.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:46 +02:00
Nicolai Hähnle
9a8f13a33b st/mesa: don't clobber glGetInternalformat* buffer for GL_NUM_SAMPLE_COUNTS
Applications might pass in a buffer that is sized too large and rely
on the extra space of the buffer not being overwritten.

Fixes dEQP-GLES31.functional.state_query.internal_format.partial_query.num_sample_counts

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:46 +02:00
Nicolai Hähnle
1b592d30c5 u_threaded_context: fix a memory leak
The uploaders can own transfers which need to be unmapped. Destroy them
before the final sync (they're not used from the driver thread anyway)
so that the transfer_unmap call is processed by the driver.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:46 +02:00
Nicolai Hähnle
76fcede3f4 disk_cache: remove unnecessary NULL-pointer guards
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:45 +02:00
Nicolai Hähnle
b041bf9f4b disk_cache: fix a memory leak
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:45 +02:00
Nicolai Hähnle
83c54a1402 st/mesa: whitespace fix
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:44 +02:00
Nicolai Hähnle
288dea076e st/mesa: fix import of EGL images with non-zero level or layer
In GL state, textures created from EGL images look like plain 2D textures
with a single level, so we use the existing layer_override facility and
add an analogous level_override one.

Fixes dEQP-EGL.functional.image.create.gles2_cubemap_{positive,negative}_{x,y,z}_rgba_texture

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:44 +02:00
Nicolai Hähnle
d245724399 st/mesa: fix switching from surface-based to non-surface-based textures
This can happen with surface-based texture objects derived from EGL
images, since those aren't immutable.

Fixes tests in dEQP-EGL.functional.sharing.gles2.multithread.random.images.teximage2d.* and others

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:44 +02:00
Nicolai Hähnle
a2c8812f91 glsl/linker: add check for compute shared memory size
Unlike uniforms, the limit on shared memory size is not called out
explicitly in the list of things that cause linker errors, but presumably
that's just an oversight in the spec.

Fixes dEQP-GLES31.functional.debug.negative_coverage.{callbacks,get_error,log}.compute.exceed_shared_memory_size_limit

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-10 13:58:43 +02:00
Lucas Stach
ca949e00d8 etnaviv: update HW headers and fix provoking vertex
Now that the real meaning of the 2 bits in PA_SYSTEM_MODE is known,
we can set them according to the rasterizer state, which fixes uses
that are setting provoking vertex first.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-10 12:32:24 +02:00
Lucas Stach
0ab59f120b etnaviv: remove flat shading workaround
It turned out not to be a hardware bug, but the shader compiler
emitting wrong varying component use information. With that fixed
we can turn flat shading back on.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 12:30:34 +02:00
Lucas Stach
cedab87e76 etnaviv: fix varying interpolation
It seems that newer cores don't use the PA_ATTRIBUTES to decide if the
varying should bypass the flat shading, but derive this from the component
use. This fixes flat shading on GC880+.

VARYING_COMPONENT_USE_POINTCOORD is a bit of a misnomer now, as it isn't
only used for pointcoords, but missing a better name I left it as-is.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 12:29:35 +02:00
Lucas Stach
03b1f8ba20 etnaviv: fix bogus flush requests in transfer handling
The logic to decide if we need to flush the GPU command stream was broken
and hard to reason about. Fix and clarify this.

Fixes the data sync subtests from piglit arb_vertex_buffer_object.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 12:29:03 +02:00
Iago Toral Quiroga
5ec21eb1a0 i965/tes: account for the fact that dvec3/4 inputs take two slots
When computing the total size of the URB for tessellation evaluation
inputs we were not accounting for this, and instead we were always
assuming that each input would take a single vec4 slot, which could
lead to computing a smaller read size than required. Specifically, this
is a problem when the last input is a dvec3/4 such that its XY components
are stored in the the second half of a payload register (which can happen
if the offset for the input in the URB is not 64-bit aligned because
there are 32-bit inputs mixed in) and the ZW components in the
first half of the next, as in this case we would fail to account for the
extra slot required for the ZW components.

Fixes (requires another fix in CTS currently in review):
KHR-GL45.enhanced_layouts.varying_locations
KHR-GL45.enhanced_layouts.varying_array_locations

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-10 08:59:54 +02:00
Tapani Pälli
63e6db18c5 anv: fix null pointer dereference
CID: 1419033

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-10 08:17:44 +03:00
Dave Airlie
4adc456580 radv: export KHR_relaxed_block_layout
This seems to pass all the cts tests it enables.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-10 13:22:44 +10:00
Ilia Mirkin
ce6da2a026 nv50/ir: fix 64-bit integer shifts
TGSI was adjusted to always pass in 64-bit integers but nouveau was left
with the old semantics. Update to the new thing.

Fixes: d10fbe5159 (st/glsl_to_tgsi: fix 64-bit integer bit shifts)
Reported-by: Karol Herbst <karolherbst@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-09 20:42:59 -04:00
Lionel Landwerlin
8ee6828df7 i965: silence coverity warning
Also makes this statement a bit clearer.

CID: 1418920
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
2017-10-10 00:56:01 +01:00
Józef Kucia
91ba331ef4 anv: Do not assert() on VK_ATTACHMENT_UNUSED
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-09 16:28:43 -07:00
Józef Kucia
e0acb630a5 spirv: Fix SpvOpAtomicISub
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2017-10-09 16:28:11 -07:00
Timothy Arceri
7a7fb90af7 glsl: tidy up IR after loop unrolling
c7affbf687 enabled GLSLOptimizeConservatively on some
drivers. The idea was to speed up compile times by running
the GLSL IR passes only once each time do_common_optimization()
is called. However loop unrolling can create a big mess and
with large loops can actually case compile times to increase
significantly due to a bunch of redundant if statements being
propagated to other IRs.

Here we make sure to clean things up before moving on.

There was no measureable difference in shader-db compile times,
but it makes compile times of some piglit tests go from a couple
of seconds to basically instant.

The shader-db results seemed positive also:

Totals:
SGPRS: 2829456 -> 2828376 (-0.04 %)
VGPRS: 1720793 -> 1721457 (0.04 %)
Spilled SGPRs: 7707 -> 7707 (0.00 %)
Spilled VGPRs: 33 -> 33 (0.00 %)
Private memory VGPRs: 3140 -> 2060 (-34.39 %)
Scratch size: 3308 -> 2180 (-34.10 %) dwords per thread
Code Size: 79441464 -> 79214616 (-0.29 %) bytes
LDS: 436 -> 436 (0.00 %) blocks
Max Waves: 558670 -> 558571 (-0.02 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Timothy Arceri
646621c66d glsl: make loop unrolling more like the nir unrolling path
The old code assumed that loop terminators will always be at
the start of the loop, resulting in otherwise unrollable
loops not being unrolled at all. For example the current
code would unroll:

  int j = 0;
  do {
     if (j > 5)
        break;

     ... do stuff ...

     j++;
  } while (j < 4);

But would fail to unroll the following as no iteration limit was
calculated because it failed to find the terminator:

  int j = 0;
  do {
     ... do stuff ...

     j++;
  } while (j < 4);

Also we would fail to unroll the following as we ended up
calculating the iteration limit as 6 rather than 4. The unroll
code then assumed we had 3 terminators rather the 2 as it
wasn't able to determine that "if (j > 5)" was redundant.

  int j = 0;
  do {
     if (j > 5)
        break;

     ... do stuff ...

     if (bool(i))
        break;

     j++;
  } while (j < 4);

This patch changes this pass to be more like the NIR unrolling pass.
With this change we handle loop terminators correctly and also
handle cases where the terminators have instructions in their
branches other than a break.

V2:
- fixed regression where loops with a break in else were never
  unrolled in v1.
- fixed confusing/wrong naming of bools in complex unrolling.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Timothy Arceri
d24e16fe1f glsl: check if induction var incremented before use in terminator
do-while loops can increment the starting value before the
condition is checked. e.g.

  do {
    ndx++;
  } while (ndx < 3);

This commit changes the code to detect this and reduces the
iteration count by 1 if found.

V2: fix terminator spelling

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Timothy Arceri
ab23b759f2 glsl: don't drop instructions from unreachable terminators continue branch
These instructions will be executed on every iteration of the loop
we cannot drop them.

V2:
- move removal of unreachable terminators from the terminator list
  to the same place they are removed from the IR as suggested by
  Nicolai.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Dylan Baker
c63ce5c95d travis: Add a travis profile for meson dri drivers
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-09 13:55:12 -07:00
Dylan Baker
68c91264eb travis: don't run ninja test for meson
This pulls in tons of extra dependencies because the tests are not
properly guarded.

v2: - Put this patch before the one that adds a loader/dri test for
      meson

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-09 13:54:46 -07:00
Dylan Baker
c2cd5801cd meson: build classic swrast
This adds support for building the classic swrast implementation. This
driver has been tested with glxinfo and glxgears.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:44 -07:00
Dylan Baker
816bf7d164 meson: build gbm
This doesn't include egl support, just dri support.

v2: - when gbm is set to 'auto', only build if a dri driver is also
      enabled
    - Fix conditional to check for x11 modules with vulkan as well as
      with dri drivers
v3: - Set pkgconfig libraries.private value

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:44 -07:00
Dylan Baker
db9788420d meson: Add support for configuring dri drivers directory.
v2: - drop with_ from dri_drivers_path variable (Eric A)
v3: - Move HAVE_X11_PLATFORM to the proper patch (Eric A)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:44 -07:00