Commit graph

89540 commits

Author SHA1 Message Date
Marek Olšák
345f04ed92 radeonsi: remove r600_emit_reloc
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:02 +02:00
Marek Olšák
da61946cb1 radeonsi: merge si_set_streamout_targets with si_common_set_streamout_targets
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:00 +02:00
Marek Olšák
a86c9328ce radeonsi: add si_so_target_reference
The src type is different on purpose.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:26:58 +02:00
Marek Olšák
65f2e33500 radeonsi: import r600_streamout from drivers/radeon
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:26:55 +02:00
Marek Olšák
ed7f27ded8 radeonsi: add performance thresholds for CP DMA, decrease it for clears
The first one isn't used yet.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:24:21 +02:00
Marek Olšák
8e969cce38 radeonsi: disable primitive binning on Vega10 (v2)
Our driver implementation is known to decrease performance for some tests,
but we don't know if any apps and benchmarks (e.g. those tested by Phoronix)
are affected. This disables the feature just to be safe.

Set this to enable partial primitive binning:
    R600_DEBUG=dpbb
Set this to enable full primitive binning:
    R600_DEBUG=dpbb,dfsm

v2: add new debug options

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:20:18 +02:00
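Note on 8e969cce38: the R600_DEBUG values shown above are comma-separated
flag names. A minimal sketch in C of how such a string could be folded into
a bitmask follows. The enum, the name table, and parse_debug_flags() are
hypothetical names chosen for illustration; this is not radeonsi's actual
option parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag indices; only the two names from the commit message. */
enum dbg_flag { DBG_DPBB, DBG_DFSM, DBG_COUNT };

static const char *const dbg_names[DBG_COUNT] = {
    [DBG_DPBB] = "dpbb",
    [DBG_DFSM] = "dfsm",
};

/* Turn "dpbb,dfsm" into a bitmask; unknown names are silently ignored. */
static uint64_t parse_debug_flags(const char *str)
{
    uint64_t mask = 0;

    if (str == NULL)
        return 0;

    while (*str != '\0') {
        size_t len = strcspn(str, ",");   /* length of the current token */

        for (int i = 0; i < DBG_COUNT; i++) {
            if (strlen(dbg_names[i]) == len &&
                strncmp(str, dbg_names[i], len) == 0)
                mask |= UINT64_C(1) << i;
        }

        str += len;
        if (*str == ',')
            str++;                        /* skip the separator */
    }
    return mask;
}
```

A caller would typically pass getenv("R600_DEBUG") (from <stdlib.h>) and
test individual bits, e.g. mask & (UINT64_C(1) << DBG_DPBB).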
Marek Olšák
3784ce9782 radeonsi: enumerize DBG flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:20:16 +02:00
Marek Olšák
99fa9ccf96 drirc: whitelist glthread for Spec Ops: The Line
On an i7 4790k and a 280X, FPS improves by about 10%.

Nominated by John Ettedgui.
2017-10-09 15:43:33 +02:00
Samuel Pitoiset
7824cb4b03 radv: configure VGT_VERTEX_REUSE at pipeline creation
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:06:19 +02:00
Samuel Pitoiset
b09b43b166 radv: do not need to zero-init ds/raster states
Already done when creating the pipeline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:06:17 +02:00
Samuel Pitoiset
d4652e7c86 radv: remove unused fields in radv_raster_state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:06:15 +02:00
Samuel Pitoiset
6732a8369a radv: set ALPHA_TO_MASK_ENABLE at blend state init
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:05:06 +02:00
Samuel Pitoiset
5848565ee3 radv: emit PA_SU_POINT_{SIZE,MINMAX} in si_emit_config()
These registers don't change during the lifetime of the
command buffer, so there is no need to re-emit them when
binding a new pipeline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:05:04 +02:00
Samuel Pitoiset
aab1537568 radv: allow launching waves out-of-order for compute
Ported from RadeonSI, and -pro seems to enable it as well.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:04:17 +02:00
Jason Ekstrand
6c7720ed78 anv/wsi: Allocate enough memory for the entire image
Previously, we allocated memory for image->plane[0].surface.isl.size
which is great if there is no compression.  However, on BDW, we can do
CCS_D on X-tiled images so we also have to allocate space for the
auxiliary buffer.  This fixes hangs in some of the WSI CTS tests and
should also reduce hangs in real applications.  In particular, it fixes
the dEQP-VK.wsi.*.incremental_present.* test group.

When we hand the image off to X11 or Wayland, it will ignore the CCS
entirely which is ok because we do a resolve when it's transitioned to
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-07 17:12:38 -07:00
Lionel Landwerlin
e262845e37 anv: fix nir.h include
All over mesa we include "nir/nir.h"; we should probably do the same
here. This fixes the meson build that was broken by the ycbcr series.

Thanks to Dylan for finding the issue.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: f3e91e78a3 ("anv: add nir lowering pass for ycbcr textures")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-07 22:57:50 +01:00
Jason Ekstrand
49a6fb8474 spirv: Don't warn on the ImageCubeArray capability
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-07 14:52:03 -07:00
Kenneth Graunke
37e128b9b7 mesa: make glFramebuffer* check immutable texture level bounds
When a texture is immutable, we can't tack on extra levels
after-the-fact like we could with glTexImage. So check against that
level limit and return an error if it's surpassed.

This fixes:
KHR-GL45.geometry_shader.layered_fbo.fb_texture_invalid_level_number

(Based on a patch by Ilia Mirkin.)

Reviewed-by: Antia Puentes <apuentes@igalia.com> [imirkin v2]
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 13:26:55 -07:00
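Note on 37e128b9b7: the level check described above can be sketched in C
as below. The struct, its field names, and attach_level_is_valid() are
hypothetical stand-ins, and max_target_levels is assumed to be the
per-target level limit that already applied to mutable textures; this is
not the Mesa code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced texture object. */
struct texture {
    bool immutable;        /* storage was allocated once and cannot grow */
    int  immutable_levels; /* number of levels in that fixed allocation */
};

/*
 * A mutable texture may gain levels later, so only the target-wide limit
 * applies. An immutable texture can never have more levels than it was
 * created with, so that count is the bound.
 */
static bool attach_level_is_valid(const struct texture *tex, int level,
                                  int max_target_levels)
{
    assert(tex != NULL);

    int limit = tex->immutable ? tex->immutable_levels : max_target_levels;
    return level >= 0 && level < limit;
}
```

When the function returns false, the caller would report an error instead
of attaching the level.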
Marek Olšák
5a47abb63e radeonsi: don't change viewport for blits, use window-space positions
The viewport state was an identity anyway.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
76ef08f6ee radeonsi: set correct PA_SC_VPORT_ZMIN/ZMAX when viewport is disabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
13b6c1c031 radeonsi: minor cleanup of si_update_vs_writes_viewport_index
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
5f566faa46 radeonsi: don't save and restore vertex buffers and elements for u_blitter
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
69ccb9dae7 radeonsi: use new VS blit shaders (VS inputs in SGPRs)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
6a8401a94e radeonsi: add VS blit shader creation
no users yet

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
f3fe6afba8 radeonsi: split declare_default_desc_pointers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
0a3b5a0232 gallium/u_blitter: let drivers decide which VS to use for draw_rectangle
This approach allows drivers to set their own vertex shader and skip
compilation of u_blitter vertex shaders.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
a46bcf0a77 gallium/u_blitter: let drivers set the vertex elements state
radeonsi won't set it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
7f8af4624d gallium/u_blitter: remove blitter_context_priv::viewport
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
f84a63bc00 radeonsi: don't use util_draw_arrays_instanced in si_draw_rectangle
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
387590accb radeonsi: move si_draw_rectangle into si_state_draw.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
de810f8b84 radeonsi: remove wrappers si_decompress_xx_textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
efd72b31cb gallium/radeon: remove r600_atom::num_dw
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
f1eb9a9c27 gallium/radeon: remove old r600g code checking chip_class and family
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Mark Thompson
c4ed39f85b st/va: Implement vaExportSurfaceHandle()
This is a new interface in libva2 to support wider use-cases of passing
surfaces to external APIs.  In particular, this allows export of NV12 and
P010 surfaces.

v2: Convert surfaces to progressive before exporting them (Christian).

v3: Set destination rectangle to match source when converting (Leo).
    Add guards to allow building with libva1.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-and-Tested-by: Leo Liu <leo.liu@amd.com>
2017-10-07 10:15:14 -04:00
Roland Scheidegger
52b73caaf4 gallivm: don't use pabs intrinsic with llvm version >= 6
The intrinsic is gone, causing shader compilation to crash.
While here, also change the fallback code to match what llvm's auto-updater
of these intrinsics would do (except that there will still be zext/trunc
instructions in there), which should ensure that the sequence gets recognized
and fused back into a pabs in the end (I didn't test this, and it's possible
even the old sequence would get recognized, but I don't see a reason why we
shouldn't use the same sequence in any case).

Tested-by: Vinson Lee <vlee@freedesktop.org>
2017-10-07 00:54:09 +02:00
Tim Rowley
9716c69e22 swr/rast: use proper alignment for debug transposedPrims
This was causing a crash in the ParaView waveletcontour.py test when
_DEBUG is defined, due to a vector aligned copy with an unaligned
address.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-06 13:27:39 -05:00
Lionel Landwerlin
0763f814d7 anv/cmd_buffer: Reset state in cmd_buffer_destroy
This ensures that everything gets cleaned up properly. In particular,
it fixes a memory leak where we were leaking the push constants
structs.

Valgrind stats on
dEQP-VK.pipeline.push_constant.graphics_pipeline.range_size_128 :

Before:
HEAP SUMMARY:
    in use at exit: 2,467,513 bytes in 1,305 blocks
  total heap usage: 697,853 allocs, 696,530 frees, 138,466,600 bytes allocated

LEAK SUMMARY:
   definitely lost: 1,068 bytes in 11 blocks
   indirectly lost: 24,669 bytes in 412 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 2,441,776 bytes in 882 blocks
        suppressed: 0 bytes in 0 blocks

After:
HEAP SUMMARY:
    in use at exit: 2,467,381 bytes in 1,304 blocks
  total heap usage: 697,853 allocs, 696,531 frees, 138,466,600 bytes allocated

LEAK SUMMARY:
   definitely lost: 936 bytes in 10 blocks
   indirectly lost: 24,669 bytes in 412 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 2,441,776 bytes in 882 blocks
        suppressed: 0 bytes in 0 blocks

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
2017-10-06 17:32:34 +01:00
Lionel Landwerlin
d296dea54e anv/cmd_buffer: fix push descriptors with set > 0
When writing to set > 0, we were just wrongly writing to set 0. This
commit fixes this by lazily allocating each set as we write to them.

We chose not to embed them directly in the command buffer, as this
would require an additional ~45Kb per command buffer.

v2: Allocate push descriptors from system memory rather than in BO
    streams. (Lionel)

Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Fixes: 9f60ed98e5 ("anv: add VK_KHR_push_descriptor support")
Reported-by: Daniel Ribeiro Maciel <daniel.maciel@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 17:32:13 +01:00
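Note on d296dea54e: lazily allocating one push descriptor set per set
index can be sketched in C as below. MAX_SETS, the struct layouts, and both
function names are assumptions made for illustration; the real driver's
structures differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_SETS 8 /* assumed limit, for illustration only */

/* Hypothetical payload of one push descriptor set. */
struct push_set {
    uint32_t data[64];
};

/* Hypothetical command buffer state: one pointer per set, NULL until used. */
struct cmd_state {
    struct push_set *push_sets[MAX_SETS];
};

/* Return the storage for 'set', allocating it from system memory on first use. */
static struct push_set *get_push_set(struct cmd_state *state, uint32_t set)
{
    assert(state != NULL);

    if (set >= MAX_SETS)
        return NULL;

    if (state->push_sets[set] == NULL)
        state->push_sets[set] = calloc(1, sizeof(struct push_set));

    return state->push_sets[set]; /* NULL if the allocation failed */
}

/* Release every set that was actually allocated. */
static void free_push_sets(struct cmd_state *state)
{
    for (size_t i = 0; i < MAX_SETS; i++) {
        free(state->push_sets[i]);
        state->push_sets[i] = NULL;
    }
}
```

Only the pointers live in the command buffer state, so sets that are never
written cost one pointer each rather than a full descriptor set.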
Lionel Landwerlin
b24b93d584 anv: enable VK_KHR_sampler_ycbcr_conversion
v2: Make GetImageMemoryRequirements2KHR() iterate over all pInfo
    structs (Lionel)
    Handle VkSamplerYcbcrConversionImageFormatPropertiesKHR (Andrew/Jason)
    Iterate over BindImageMemory2KHR's pNext structs correctly (Jason)

v3: Revert GetImageMemoryRequirements2KHR() change from v2 (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:34:04 +01:00
Lionel Landwerlin
a62a979335 anv: enable multiple planes per image/imageView
This change introduces the concept of planes for images & views. It
matches the planes available in new formats.

We also refactor depth & stencil support through the usage of planes
for the sake of uniformity. In the backend (genX_cmd_buffer.c) we have
to take some care though with regard to auxiliary surfaces.
Multiplanar color buffers can have multiple auxiliary surfaces but
depth & stencil share the same HiZ one (only stored in the depth
plane).

v2: by Jason
    Remove unused aspect parameters from anv_blorp.c
    Assert when attempting to resolve YUV images
    Drop redundant logic for plane offset in make_surface()
    Rework anv_foreach_plane_aspect_bit()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:32:20 +01:00
Jason Ekstrand
185e719090 anv: Take an image in can_sample_with_hiz
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-06 16:32:19 +01:00
Jason Ekstrand
558d8a3979 anv: Take a single aspect in anv_layout_to_aux_usage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-06 16:32:19 +01:00
Jason Ekstrand
3735af0415 anv/cmd_buffer: Make get_fast_clear_state return an address
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-06 16:32:19 +01:00
Jason Ekstrand
fd146e4f3f anv/blorp: Add a concept of default aux usage
A good chunk of anv_blorp just wants the aux usage from the image.  This
magic aux_usage value means just that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-06 16:32:19 +01:00
Lionel Landwerlin
f3e91e78a3 anv: add nir lowering pass for ycbcr textures
This pass implements all the implicit conversions required by the
VK_KHR_sampler_ycbcr_conversion specification.

It also inserts plane sources onto sampling instructions that we then
let the pipeline layout pass deal with, when mapping things correctly
to descriptors.

v2: Add new file to meson build (Lionel)
    Use nir_frcp() rather than (1.0f / x) (Jason)
    Reuse nir_tex_instr_dest_size() rather than handwritten one (Jason)
    Return progress (Jason)
    Account for array of samplers (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:32:19 +01:00
Lionel Landwerlin
3492d56067 anv: prepare sampler emission code for multiplanar images
New settings from the KHR_sampler_ycbcr_conversion specifications
might require different sampler settings for luma and chroma planes.
This change makes the sampler table emission ready to handle multiple
planes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:32:19 +01:00
Lionel Landwerlin
a2a7846d37 anv/apply_pipeline_layout: Prepare for multi-planar images
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:32:19 +01:00
Lionel Landwerlin
72aec2060f anv: add new formats KHR_sampler_ycbcr_conversion
Add new downsampling factors for each plane.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:46:08 +01:00
Lionel Landwerlin
bbc3700798 anv: modify the internal concept of format to express multiple planes
A given Vulkan format can now be decomposed into a set of planes. We
now use 'struct anv_format_plane' to represent the format of those
planes.

v2: by Jason
    Rename anv_get_plane_format() to anv_get_format_plane()
    Don't rename anv_get_isl_format()
    Replace ds_fmt() by fmt2()
    Introduce fmt_unsupported()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:46:03 +01:00
Lionel Landwerlin
18914715d1 anv: prepare formats to handle disjoints sets
Newer format enums start at offset 1000000000, making it impossible to
have them all in one table. This change splits the formats into sets
that we then access through indirection.

v2: rename format_extract to vk_to_anv_format (Chad/Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:45:56 +01:00
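Note on 18914715d1: Vulkan extension enum values are laid out as
1000000000 + (extension number - 1) * 1000 + offset, which is why a single
dense table is impractical. A minimal sketch in C of the "sets accessed
through indirection" idea follows. All type, table, and function names here
are hypothetical, and the tables hold only a couple of entries; this is not
anv's actual format table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXT_ENUM_BASE  1000000000u /* first value used by extension enums */
#define EXT_ENUM_RANGE 1000u       /* values reserved per extension */

/* Hypothetical per-format record and per-set table. */
struct fmt_info { const char *name; };
struct fmt_set  { const struct fmt_info *formats; uint32_t count; };

/* 0 for core enums, otherwise the 1-based extension number. */
static uint32_t enum_ext_number(uint32_t value)
{
    return value < EXT_ENUM_BASE
         ? 0 : (value - EXT_ENUM_BASE) / EXT_ENUM_RANGE + 1;
}

/* Index of the enum within its own set. */
static uint32_t enum_offset(uint32_t value)
{
    return value < EXT_ENUM_BASE
         ? value : (value - EXT_ENUM_BASE) % EXT_ENUM_RANGE;
}

static const struct fmt_info core_formats[] = {
    [0] = { "UNDEFINED" },
    [1] = { "R4G4_UNORM_PACK8" },
};

/* VK_KHR_sampler_ycbcr_conversion is extension number 157. */
static const struct fmt_info ycbcr_formats[] = {
    [0] = { "G8B8G8R8_422_UNORM" }, /* enum value 1000156000 */
    [1] = { "B8G8R8G8_422_UNORM" }, /* enum value 1000156001 */
};

/* Sparse outer table: slots without a set stay { NULL, 0 }. */
static const struct fmt_set fmt_sets[] = {
    [0]   = { core_formats,  2 },
    [157] = { ycbcr_formats, 2 },
};

/* Two-step lookup: pick the set, then index inside it. NULL if unknown. */
static const struct fmt_info *lookup_format(uint32_t value)
{
    uint32_t ext = enum_ext_number(value);
    uint32_t off = enum_offset(value);

    if (ext >= sizeof(fmt_sets) / sizeof(fmt_sets[0]))
        return NULL;
    if (off >= fmt_sets[ext].count)
        return NULL;

    assert(fmt_sets[ext].formats != NULL);
    return &fmt_sets[ext].formats[off];
}
```

The outer array costs one small entry per extension number, while each
inner table stays dense and is indexed directly by the enum offset.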