Commit graph

1112 commits

Author SHA1 Message Date
Jordan Justen
fc12fd05f5
iris: Implement pipe_screen::resource_get_param
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 01:12:30 -07:00
Rafael Antognolli
a1a499e7fe iris/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.
If the pixel pipes have a different number of subslices, emit a slice
hashing table that will ensure proper workload distribution.

v2: Don't need to set the mask - it's mbo (Ken).
v3: Don't keep a reference to the resource used for emitting the table
(Ken).
2019-08-12 16:19:08 -07:00
Jason Ekstrand
134607760a intel/compiler: Fill a compiler statistics struct
This commit is all annoying plumbing work which just adds support for a
new brw_compile_stats struct.  This struct provides a binary driver
readable form of the same statistics we dump out to stderr when we
INTEL_DEBUG is set with a shader stage.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-12 22:56:07 +00:00
Francisco Jerez
026773397b iris/gen9: Optimize slice and subslice load balancing behavior.
See "i965/gen9: Optimize slice and subslice load balancing behavior."
for the rationale.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-12 13:17:58 -07:00
Tapani Pälli
d4b574f26a iris: reorder arguments as expected by the function
CID: 1452262
Fixes: b4c54894bb "iris: Handle vertex shader with window space position"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
2019-08-12 13:08:26 +03:00
Tapani Pälli
590ba15d6e iris/android: move iris_query.c to 'per gen' LIBIRIS_SRC_FILES
Fixes Iris build on Android.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-08-12 10:06:36 +03:00
Kenneth Graunke
0f3768bc5d iris: Free query on error path
CID: 1452276
2019-08-11 14:04:31 -07:00
Kenneth Graunke
661be3fef9 iris: Add missing 'break'
We don't want to fall through to unreachable().

CID: 1452277
2019-08-11 14:04:31 -07:00
Kenneth Graunke
f1dba99639 iris: minor restyling 2019-08-10 00:16:45 -07:00
Mark Janes
9c597514d4 iris/query: enable amd performance monitors
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:34 -07:00
Mark Janes
469af7fdc9 iris/perf: get monitor results
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:32 -07:00
Mark Janes
1cb4fc184f iris/perf: add begin/end hooks
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:24 -07:00
Mark Janes
8c4c346665 iris/perf: add delete query
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:17 -07:00
Mark Janes
aca42759ff iris/perf: implement iris_create_monitor_object
This is the first call that provides the iris context to the monitor
implementation.  On the first call, use the iris context to initialize
the monitor context.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:14 -07:00
Mark Janes
0fd4359733 iris/perf: implement routines to return counter info
With this commit, Iris will report that AMD_performance_monitor is
supported, and will allow the caller to query the available metrics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:03 -07:00
Greg V
c0dc5c1859 meson: define ETIME to ETIMEDOUT if not present
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-08 21:44:33 +01:00
Rhys Perry
c52c54a746 anv,i965,iris: deduplicate setting of total_shared
v5: add patch

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Mark Janes
2446f5cfd8 intel/perf: move perf-related constants to common location
The perf subsystem needs several macro definitions that were
duplicated in Iris and i965 headers.  Place these macros within perf,
if the perf implementation contains the only references to the values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Bas Nieuwenhuizen
5a26f528cb meson,i965: Link with android deps when building for android.
The DBG marco in brw_blorp.c ends up calling an android log function:

error: undefined reference to '__android_log_print'

v2: On suggestion from Lionel, hang the Android dependency onto a new
    libintel_common dependency.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-07 15:34:46 +02:00
Danylo Piliaiev
b4c54894bb iris: Handle vertex shader with window space position
Iris advertises support for PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION
so let's actually implement it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110657

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-06 20:25:35 +00:00
Kenneth Graunke
382f92a814 iris: Increase BATCH_SZ to 64kB
This seems to improve performance by roughly ~1% across the board.
Thanks to Rafael Antognolli and Dan Walsh for their help tuning.
2019-08-06 09:09:26 -07:00
Kenneth Graunke
64b73b770b iris: Fix bad external BO hash table and zombie list interactions
A while ago, we started deferring GEM object closure and VMA release
until buffers were idle.  This had some unforeseen interactions with
external buffers.

We keep imported buffers in hash tables, so if we have repeated imports
of the same GEM object, we map those to the same iris_bo structure.
This is critical for several reasons.  Unfortunately, we broke this
assumption.  When freeing a non-idle external buffer, we would drop it
from the hash tables, then move it to the zombie list.  If someone
reimported the same GEM object, we would not find it in the hash tables,
and go ahead and make a second iris_bo for that GEM object.  But the old
iris_bo would still be in the zombie list, and so we would eventually
call GEM_CLOSE on it - closing a BO that should have still been live.

To work around this, we defer removing a BO from the hash tables until
it's actually fully closed.  This has the strange effect that an
external BO may be on the zombie list, and yet be resurrected before
it can be properly cleaned up.  In this case, we remove it from the
list so it won't be freed.

Fixes severe instability in Weston, which was hitting EINVALs and
ENOENTs from execbuf2, due to batches referring to a GEM object that
had been closed, or at least had its VMA torched.

Fixes: 457a55716e ("iris: Defer closing and freeing VMA until buffers are idle.")
2019-08-05 08:53:41 -07:00
Kenneth Graunke
48e5a99d86 iris/bufmgr: Move iris_bo_reference into hash_find_bo, rename it
Everybody importing an external buffer was looking it up in the hash
table, then referencing it.  We can just do that in the helper instead,
which also gives us a convenient spot to stash extra code shortly.
2019-08-05 08:53:07 -07:00
Jason Ekstrand
aebca3961b iris: Fix handling of SIMD32 fragment shaders
The brw_wm_prog_data_dispatch_grf_start_reg and _prog_offset helpers
read the _NPixelDispatchEnable fields from 3DSTATE_PS to figure out
which bits to pull out of the prog data and stuff where.  Therefore,
they need to be called with the final set of _NPixelDispatchEnable bits
after we've done the workaround for SIMD32 and 16x MSAA.  Otherwise, if
you end up with a somewhat odd combination of enables, the GRF start reg
and KSP data ends up in the wrong slots.  In particular, running
SIMD32-only is broken but several other combinations are as well.

Fixes: 5445c176e2 "iris: Disable SIMD32 when using a 16x MSAA..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-03 22:24:40 +00:00
Timothy Arceri
06ec14d692 iris: bump compat profile support to 4.6
All of the current piglit compat profile tests pass.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-02 18:56:53 +10:00
Kenneth Graunke
18c2e09dc7 gallium: Implement GL_EXT_shader_samples_identical via a new capability
This exposes the textureSamplesIdenticalEXT function in GLSL.

We enable it for iris and radeonsi, because their compilers already
have support for this.  Tested on Intel Kabylake and AMD Vega 64.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-01 23:38:54 -07:00
Mark Janes
49465f1330 iris/screen: use initialization routine for gen_device_info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:39:48 -07:00
Mark Janes
7852fe5415 intel/common: provide common ioctl routine
i965 links against libdrm for drmIoctl, but anv and iris both
re-implement this routine to avoid the dependency.

intel/dev also needs an ioctl wrapper, so lets share the same
implementation everywhere.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-01 16:38:40 -07:00
Timothy Arceri
2afedfaf9a iris: add support for gl_ClipVertex in tess eval shaders
Required for OpenGL compat support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-01 16:12:37 -07:00
Timothy Arceri
00b5bf2d72 iris: add support for gl_ClipVertex in geometry shaders
This will enable us to support the OpenGL compat profile.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-01 16:12:27 -07:00
Kenneth Graunke
b61f17d362 iris: Skip emitting 3DSTATE_INDEX_BUFFER if possible
We were emitting 3DSTATE_INDEX_BUFFER on every indexed draw, even if
back-to-back draws referred to the same index buffer.  This improves
drawoverhead scores in the DrawElements cases by about 10%, by giving
us even more minimal batches.
2019-07-31 15:14:10 -07:00
Kenneth Graunke
3a22a8bf49 iris: Skip repeated depth buffer disables.
Often times, the depth buffer is entirely disabled, but color render
targets change.  For example, GenerateMipmaps will change the color
render target for each miplevel, but there is no depth buffer.

In the Civilization VI benchmark, this drops the median number of
3DSTATE_DEPTH_BUFFER etc. packets emitted per frame from 472 to 34.
2019-07-30 19:47:41 -07:00
Sagar Ghuge
587a497529 iris: Enable EXT_texture_shadow_lod
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-30 10:42:20 -07:00
Kenneth Graunke
44e713eddb iris: Fix SO offset to be 32-bit in DrawTransformFeedback handling
We accidentally started copying a full 64-bit value rather than copying
a 32-bit offset and zeroing the top 32-bits.  This caused us to compute
bogus vertex counts which could lead to GPU hangs in some cases.

Thanks to Clayton Craft for catching the regressions!

Fixes: 0e24d10ff5 ("iris: Use gen_mi_builder to handle CS ALU operations.")
2019-07-29 16:38:19 -07:00
Jason Ekstrand
4bb6e6817e intel: Use a system value for gl_FragCoord
It's kind-of an anomaly that the Intel drivers are still treating
gl_FragCoord as an input.  It also makes zero sense because we have to
special-case it in the back-end.

Because ANV is the only user of nir_lower_wpos_center, we go ahead and
just update it to look for nir_intrinsic_load_frag_coord as part of this
patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-29 23:30:26 +00:00
Kenneth Graunke
0e24d10ff5 iris: Use gen_mi_builder to handle CS ALU operations.
In a few cases, we switch to MI_MATH instead of MI_PREDICATE,
just because we were already doing math and it's easier to chain
together.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke
fe7ed6b057 iris: Make iris_query.c a genxml-compiled file.
This will let us use Jason's new MI-builder shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke
975f7e4a59 iris: Move iris_resolve_conditional_render to the vtable.
It's going to be in genxml code shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke
6c4c7b600d iris: Refactor genxml macros and inlines into iris_genx_macros.h.
This will let us put the genxml boilerplate in one place, before we
expand genxml to more files shortly.  Like i965/genX_boilerplate.h.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke
204a3bb816 iris: Make an iris_genx_protos.h header for prototypes.
This lets us specify the prototypes once, instead of cut and pasting
them per generation.  isl uses a similar approach (isl_genX_priv.h).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Jason Ekstrand
c84b8eeeac intel/compiler: Be more conservative about subgroup sizes in GL
The rules for gl_SubgroupSize in Vulkan require that it be a constant
that can be queried through the API.  However, all GL requires is that
it's a uniform.  Instead of always claiming that the subgroup size in
the shader is 32 in GL like we have to do for Vulkan, claim 8 for
geometry stages, the maximum for fragment shaders, and the actual size
for compute.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Ilia Mirkin
0e30c6b8a7 gallium: switch boolean -> bool at the interface definitions
This is a relatively minimal change to adjust all the gallium interfaces
to use bool instead of boolean. I tried to avoid making unrelated
changes inside of drivers to flip boolean -> bool to reduce the risk of
regressions (the compiler will much more easily allow "dirty" values
inside a char-based boolean than a C99 _Bool).

This has been build-tested on amd64 with:

Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d
                 vc4 i915 svga virgl swr panfrost iris lima kmsro
Gallium st:      mesa xa xvmc xvmc vdpau va

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 22:13:51 -04:00
Kenneth Graunke
7cdde962c5 iris: Support storage images that have matching typed formats for reads
Even if we don't directly support typed reads on a format, we can often
translate them to a reasonable matching format.  Advertise those too.
2019-07-22 17:30:13 -07:00
Kenneth Graunke
2f1c7fae9e iris: Stop advertising MSAA storage images by mistake
st_extensions.c sets const->MaxImageSamples (GL_MAX_IMAGE_SAMPLES) by
looping over [16, 15, .. 1x] MSAA modes, and RGBA/BGRA/ARGB/ABGR 8888
color formats, calling pipe->is_format_supported() for each, with
the usage set to PIPE_BIND_SHADER_IMAGE.  If any are supported, it
selects that number of samples.

We were checking if sample_count <= 1, which meant that we were getting
a value of 1x MSAA, rather than the expected 0x (feature doesn't exist).

But, only on Icelake because Gen11 adds support for typed read messages
for R8G8B8A8_UNORM.  The lack of typed read messages for these formats
was tricking the check on Gen9 to say no correctly.  This caused some
Icelake conformance failures, because we don't implement this feature.

Just check for sample_count == 0 instead.
2019-07-22 17:30:13 -07:00
Timothy Arceri
80c2c17e1e iris: change last_vue_stage() to look at uncompiled shaders
This allows us to find the last vue stage before we have compiled
the shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Rafael Antognolli
393f659ed8 iris: Enable fast clears on other miplevels and layers than 0.
Until now we only supported fast clear colors on the first miplevel and
layer. The main reason for it is that we can't have different fast clear
values at different levels/layers, since the surface state only supports
one clear value.

We can, however, enable it if we make sure we only use the same value
for all levels/layers, and if one of them changes, we resolve all the
others. We already do that for depth fast clears so hopefully it will be
fine for color fast clears too.

v2: Add check for partial clear too (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-17 14:53:37 -07:00
Rafael Antognolli
8bbd4f32bf iris: Allow resolving clear color of CCS_D surfaces.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-17 14:53:16 -07:00
Kenneth Graunke
df4c2ec5e1 iris: Make iris_has_color_unresolved non-static
We want to use this in the transfer code and possibly for fast clears.
2019-07-17 13:43:04 -07:00
Kenneth Graunke
1d5ee31553 iris: Drop copy and pasted iris_timebase_scale
Lionel moved brw_timebase_scale to gen_device_info_timebase_scale a few
months ago, so we should just use that, and not our own copy in iris.
2019-07-16 17:22:48 -07:00
Kenneth Graunke
5e76c99923 iris: Better handle decoder base addresses
It can be useful to call the decoder on a single batch.  But, that batch
may not contain STATE_BASE_ADDRESS, at which point the decoder will have
no idea how to find any buffers.  We can initialize the two static bases
at the beginning of time, so it has them even if it never sees SBA.

Surface base address changes dynamically, possibly in the middle of a
batch.  So we update it at the start of each batch, making it always
start at the value we inherited from the previous one.  SBA commands
inside the batch can update it to a proper value.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-07-15 11:49:19 -07:00