Commit graph

96 commits

Author SHA1 Message Date
Calder Young
a39779e695 iris: Fix issue with conditional dispatching
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35946>
2025-07-08 10:00:45 +00:00
Kenneth Graunke
5ccb9f4632 iris: Don't return timestamps modulo 36-bits
Our actual timestamp register from the hardware is 36-bits these days.
Depending on the clock base, a few of the low bits may be discarded.
We scale by 52.08 or so (see intel_device_info_timebase_scale()) in
order to convert clock ticks to nanoseconds.  This value could take
~43-44 bits to represent.  When our timestamp register overflows, the
reported value would overflow at some value which isn't a power-of-two.

For some reason, when we implemented ARB_timer_query for i965 in 2012,
we thought applications would use GL_QUERY_COUNTER_BITS to detect and
handle overflow, expecting our timer queries to overflow at a power-of-
two value.  We tried to hack around this by reporting a 36-bit counter,
for what was then a 32-bit(?) timestamp register, with a timebase scale
of 80(?)...and reported the nanoseconds modulo 2^36, with some
handwaving about wrap around times being close enough.

I don't think this is what anyone wants.  All other Gallium drivers
(except maybe zink) report 64 here, as does the Intel Windows driver.

The ARB_timer_query spec defines it as:

   "If <pname> is QUERY_COUNTER_BITS, the implementation-dependent
    number of bits used to hold the query result for <target> will be
    placed in <params>.  The number of query counter bits may be zero,
    in which case the counter contains no useful information."

and it also mentions about overflow:

   "If the elapsed time overflows the number of bits, <n>, available to
    hold elapsed time, its value becomes undefined.  It is recommended,
    but not required, that implementations handle this overflow case by
    saturating at 2^n - 1."

There's nothing about roll-over happening at power-of-two times, just
that the value returned has to fit in that many bits, and if the value
were to exceed that, it's undefined, optionally saturated to the maximum
representable value.

This patch makes us report 64 like other drivers, and stop taking the
modulus.  Technically, our roll-over will happen before we reach the
number of query counter bits, which may or may not be valid.  But this
does let us report longer times, and should be more desirable behavior.

Tested-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26058>
2024-01-24 23:13:15 +00:00
José Roberto de Souza
d0890a0b8b iris: Set MI_MATH MOCS field
MOCS = 0 is a invalid MOCS index on MTL, so it is necessary get a
valid value and set to MI_MATH instructions.

So here the mocs index is set with mi_builder_set_mocs(), it can be
always set but it is required when mi_build will emit MI_MATH
instructions.
The mocs index will only be stored and used in gfx12.5+ platforms
so no changes were are required in crocus or hasvk.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22508>
2023-08-01 19:49:44 +00:00
Yonggang Luo
137aa8b2dc util: Replace all usage of PIPE_TIMEOUT_INFINITE with OS_TIMEOUT_INFINITE
They are exactly the same, so it's safe to do the replace
Also gen OS_TIMEOUT_INFINITE var with rusticl_mesa_bindings_rs by OS_ prefix and
include "util/os_time.h" in rusticl/rusticl_mesa_bindings.h

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23401>
2023-06-05 05:12:02 +00:00
Rohan Garg
14dec0c147 iris: correctly set alignment to next power of two for struct size
We're currently aligning the offset to the size of the data structure
itself when the upload manager actually expects a POT. Ideally this
would be the next POT that's greater than the size of the structure.

Fixes: c24a574e6c ("iris: Don't allocate a BO per query object")

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20153>
2023-05-25 21:24:44 +00:00
Lionel Landwerlin
1d13f22174 iris: rework Wa_14017076903 to only apply with occlusion queries
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 415b824bc6 ("iris: implement occlusion query related Wa_14017076903")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22807>
2023-05-25 09:10:33 +00:00
Rohan Garg
ec6ad8c7dc iris: Don't flush the render cache for a compute batch
Make sure we comply with BSpec and ensure that certain flush flags
are not set for compute batches

Signed-off-by: Rohan Garg's avatarRohan Garg <rohan.garg@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15664>
2023-01-20 11:09:24 +00:00
José Roberto de Souza
aff85114fd iris: Store intel_device_info in iris_bufmgr
We can have multiple pipe_screen but only one iris_bufmgr per device.
So better to store intel_device_info into the shared iris_bufmgr and
save some memory.
Also in future patches iris_bufmgr will make more use of
intel_device_info.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19650>
2022-12-15 18:55:02 +00:00
Kenneth Graunke
0ce9d7b7c9 iris: Use PIPE_* defines rather than ones from main/config.h
Gallium drivers shouldn't be including src/mesa/main headers, but we're
picking up a rogue main/config.h via the compiler, so this code I ported
over from i965 kept compiling.  Use the PIPE_* defines instead so that
we can stop including that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
2022-06-30 23:46:35 +00:00
Francisco Jerez
03cf788891 iris: Replace unconditional QBO flush with iris_dirty_for_history().
We can now use the same cache tracking mechanism for synchronizing QBO
writes instead of the unconditional PIPE_CONTROL performed currently,
which is unable to invalidate any incoherent caches which may contain
stale data for the buffer object.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15738>
2022-04-04 10:32:31 -07:00
Dave Airlie
127bcbed18 gallivm/st/lvp: add flags arg to get_query_result_resource api.
Currently this just has wait, but in order to get the right answer
for vulkan partial, lavapipe/llvmpipe need to pass a partial flag
through here in the future.

This just changes the API so that's possible.

v2: use an enum (zmike)

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15009>
2022-02-15 10:12:01 +10:00
Paulo Zanoni
b2c811ccdc iris: syncobjs are now owned by bufmgr instead of screen
The next patches will justify the new ownership. We want the BOs to
have references on the batches' syncobjs so we can implement implicit
tracking. In other words: BOs will be able to wait on syncobjs owned
by different screens. Since our syncobjs are actually just a Kernel
handle with a refcount, they can be used globally and it makes more
sense to map them to the bufmgr, just like the BOs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12363>
2021-09-01 21:48:13 +00:00
Nanley Chery
2944f49610 intel: Parse INTEL_NO_HW for devinfo construction
This commit does several things:

* Unify code common to several drivers by evaluating INTEL_NO_HW within
  intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
  separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
  code simplification, but we may find reason to undo this later on.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
2021-08-24 00:12:47 +00:00
Anuj Phogat
61e8636557 intel: Rename gen_device prefix to intel_device
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Anuj Phogat
9da8a55b08 intel: Rename GEN_GEN macro to GFX_VER
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "GEN_GEN" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN_GEN/GFX_VER/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:06 +00:00
Lionel Landwerlin
33bc2977e5 intel/mi_builder: use device info to use the right CS prefetch size
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9679>
2021-03-18 20:08:45 +00:00
Kenneth Graunke
1b1c857248 iris: Make various classes inherit from u_threaded_context base classes
u_threaded_context requires various objects to inherit from a new
threaded_foo base class rather than directly from pipe_foo.  This
patch does most of the mechanical changes required for that.

It also initializes the new threaded_resource fields.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
2021-03-04 13:59:21 -08:00
Jason Ekstrand
1e53e0d2c7 intel/mi_builder: Drop the gen_ prefix
mi_ is already a unique prefix in Mesa so the gen_ isn't really gaining
us anything except extra characters.  It's possible that MI_ may
conflict a tiny bit with GenXML but it doesn't seem to be a problem
today and we can deal with that in the future if it's ever an issue.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9393>
2021-03-04 15:14:27 +00:00
Nanley Chery
04ac3a6620 iris: Delete iris_resolve_conditional_render
This function has no more users.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7762>
2021-01-08 23:23:31 +00:00
yshi18
3aaac40b12 iris: fix memleak for query_buffer_uploader
In the Chrome WebGL Aquarium  stress test, 20 instances of Chrome will run
Aquarium  simultaneously over 20+ hours. That causes Chrome crash.
During the stress, glBeginQueryIndexed is called frequently.

1.Each query will only use 32 bytes from query_buffer_uploader. After the offset
exceed 4096, it will alloc new buffer for query_buffer_uploader->buffer
and release the old buffer.

2.But iris_begin_query will call u_upload_alloc when the offset changed, and it
will increase the query_buffer_uploader->buffer->reference.count every time
when it called u_upload_alloc.

3.So when u_upload_release_buffer try to release the resource of
query_buffer_uploader->buffer, its reference.count is
already equal to 129. pipe_reference_described will only decrease its reference
count to 128.So it never called old_dst->screen->resource_destroy.

4.The old resouce bo will never be freeed. And chrome will called mmap every time
when it alloc new resource bo.

5. Chrome process map too many vmas in its process. Its map count exceed the
sysctl_max_map_count which is 65530 defined in kernel.

6. When iris_begin_query want to alloc new resource bo, it will meet NULL pointer
because mmap return failed. Finally chrome crashed when it access this NULL resource
bo.

The fix is decrease the reference count in iris_destroy_query.

Patch is verified by chrome webgl Aquarium test case for more than 72 hours.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Yang Shi <yang.a.shi@intel.com>
Reviewed-by: Alex Zuo <alex.zuo@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7890>
2020-12-08 11:48:53 +00:00
Francisco Jerez
ae88e79f69 iris: Drop redundant iris_address::write flag.
The write flag is redundant since it can be inferred easily from the
iris_address::access domain.  This allows the iris_address struct to
be laid out more efficiently in memory, leading to a measurable
improvement in several Piglit Drawoverhead test-cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
2020-06-03 23:12:22 +00:00
Francisco Jerez
eb5d1c2722 iris: Annotate all BO uses with domain and sequence number information.
Probably the most annoying patch to review from the whole series --
Mark every buffer object use as accessed through some caching domain
with the sequence number of the current synchronization section of the
batch.  The additional argument of iris_use_pinned_bo() makes sure I'd
have gotten a compile error if I had missed any buffer added to the
batch validation list.

There are only a few exceptions where a buffer is left untracked while
adding it to the validation list, justified below:

 - Batch buffers: These are strictly read-only for the moment.

 - BLORP buffer objects: Their seqnos are bumped manually at the end
   of iris_blorp_exec() instead, in order to avoid plumbing domain
   information through BLORP address combining.

 - Scratch buffers: The contents of these are strictly thread-local.

 - Shader images and SSBOs: Accesses of these buffers are explicitly
   synchronized at the API level.

v2: Opt out of tracking more aggressively (Ken): In addition to the
    above, surface states, binding tables, instructions and most
    dynamic states are now left untracked, which means a *lot* more BO
    uses marked IRIS_DOMAIN_NONE which need to be reviewed extremely
    carefully, since the cache tracker won't be able to provide any
    coherency guarantees for them.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
2020-06-03 23:12:22 +00:00
Francisco Jerez
e81c07de41 iris: Bracket batch operations which access memory within sync regions.
This delimits all batch operations which access memory between
iris_batch_sync_region_start() and iris_batch_sync_region_end() calls.
This makes sure that any buffer objects accessed within the region are
considered in use through the same caching domain until the end of the
region.

Adding any buffer to the batch validation list outside of a sync
region will lead to an assertion failure in a future commit, unless
the caller explicitly opted out of the cache tracking mechanism.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
2020-06-03 23:12:22 +00:00
Francisco Jerez
46183a999b iris: Extend iris_context dirty state flags to 128 bits.
We're nearly out of dirty bits, and some patches pending review on
GitLab no longer apply due to that.  Make room for them by splitting
off shader stage-specific bits into a separate stage_dirty mask.

An alternative would be to split compute-related bits into a separate
mask, but that would prevent the '<< stage' indexing done in various
parts of the driver from working.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>
2020-06-03 22:22:19 +00:00
Kenneth Graunke
4a1ed75b85 iris: Rename iris_syncpt to iris_syncobj for clarity.
This is just a refcounted wrapper around a drm_syncobj.  There is
enough terminology going on in the area of synchronization (sync
objects, sync files, ...) that I'd rather not invent our own.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
2020-05-01 19:00:02 +00:00
Mike Blumenkrantz
91375f13ce iris: move iris_vtable to iris_screen
instead of inlining this into every context, now a struct is used in the screen
struct to reduce memory usage and simplify a couple of the methods

Closes: https://gitlab.freedesktop.org/kwg/mesa/-/issues/6
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4376>
2020-04-29 16:59:45 +00:00
Lionel Landwerlin
33b9c7a7f6 intel/perf: break GL query stuff away
This stuff is somewhat specific to the GL extension & drivers. On
Vulkan we won't use this, it also made a rather large file.

v2: Fix Android build (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
2020-03-27 14:14:49 +00:00
Danylo Piliaiev
f3ca47d9f3 iris/query: Implement PIPE_QUERY_GPU_FINISHED
Implementation is similar to radeonsi in 5f1cef76

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-06 12:43:14 +02:00
Kenneth Graunke
0f3768bc5d iris: Free query on error path
CID: 1452276
2019-08-11 14:04:31 -07:00
Mark Janes
9c597514d4 iris/query: enable amd performance monitors
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:34 -07:00
Mark Janes
aca42759ff iris/perf: implement iris_create_monitor_object
This is the first call that provides the iris context to the monitor
implementation.  On the first call, use the iris context to initialize
the monitor context.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-09 19:28:14 -07:00
Mark Janes
2446f5cfd8 intel/perf: move perf-related constants to common location
The perf subsystem needs several macro definitions that were
duplicated in Iris and i965 headers.  Place these macros within perf,
if the perf implementation contains the only references to the values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Kenneth Graunke
0e24d10ff5 iris: Use gen_mi_builder to handle CS ALU operations.
In a few cases, we switch to MI_MATH instead of MI_PREDICATE,
just because we were already doing math and it's easier to chain
together.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke
fe7ed6b057 iris: Make iris_query.c a genxml-compiled file.
This will let us use Jason's new MI-builder shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke
975f7e4a59 iris: Move iris_resolve_conditional_render to the vtable.
It's going to be in genxml code shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Ilia Mirkin
0e30c6b8a7 gallium: switch boolean -> bool at the interface definitions
This is a relatively minimal change to adjust all the gallium interfaces
to use bool instead of boolean. I tried to avoid making unrelated
changes inside of drivers to flip boolean -> bool to reduce the risk of
regressions (the compiler will much more easily allow "dirty" values
inside a char-based boolean than a C99 _Bool).

This has been build-tested on amd64 with:

Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d
                 vc4 i915 svga virgl swr panfrost iris lima kmsro
Gallium st:      mesa xa xvmc xvmc vdpau va

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 22:13:51 -04:00
Kenneth Graunke
1d5ee31553 iris: Drop copy and pasted iris_timebase_scale
Lionel moved brw_timebase_scale to gen_device_info_timebase_scale a few
months ago, so we should just use that, and not our own copy in iris.
2019-07-16 17:22:48 -07:00
Kenneth Graunke
712ac83033 iris: Simplify devinfo access in calculate_result_on_gpu()
We have devinfo, no need for screen->devinfo.
2019-07-12 00:33:19 -07:00
Kenneth Graunke
ab009b7d6e iris: Fix overzealous query object batch flushing.
In the past, each query object had their own BO.  Checking if the batch
referenced that BO was an easy way to check if commands were still
queued to compute the query value.  If so, we needed to flush.

More recently (c24a574e6c), we started using an u_upload_mgr for query
objects, placing multiple queries in the same BO.  One side-effect is
that iris_batch_references is a no longer a reasonable way to check if
commands are still queued for our query.  Ours might be done, but a
later query that happens to be in the same BO might be queued.  We don't
want to flush in that case.

Instead, check if the current batch's signalling syncpt is the one we
referenced when ending the query.  We know the syncpt can't have been
reused because our query is holding a reference, so a simple pointer
comparison should suffice.

Removes all batch flushing caused by query objects in Shadow of Mordor.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-06-26 09:49:01 -07:00
Kenneth Graunke
d4a4384b31 iris: Implement INTEL_DEBUG=pc for pipe control logging.
This prints a log of every PIPE_CONTROL flush we emit, noting which bits
were set, and also the reason for the flush.  That way we can see which
are caused by hardware workarounds, render-to-texture, buffer updates,
and so on.  It should make it easier to determine whether we're doing
too many flushes and why.
2019-06-20 13:32:15 -05:00
Kenneth Graunke
bbbf7a538c iris: Bail on queries for INTEL_NO_HW=1.
We don't execute any of the commands to record snapshots, so we can't
actually produce a real result.  We do however need to avoid waiting
on a syncpt which will never be signalled.  So, just return 0.
2019-06-19 11:55:43 -05:00
Illia Iorin
a35269cf44 iris: Implement ARB_indirect_parameters
iris_draw_vbo is divided into two functions to remove unnecessary
operations from the loop. This implementation of ARB_indirect_parameters
takes into account NV_conditional_render by saving MI_PREDICATE_RESULT
at the start of a draw call and restoring it at the end also the result
of NV_conditional_render is taken into account when computing predicates
that limit draw calls for ARB_indirect_parameters in a similar way
to 1952fd8d in ANV.

v2: Optimize indirect draws (suggested by Kenneth Graunke)
v3: (by Kenneth Graunke)
 - Fix an issue where indirect draws wouldn't set patch information
   before updating the compiled TCS.
 - Move some code back to iris_draw_vbo to avoid duplicating it.
 - Fix minor indentation issues.

Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-11 23:56:52 -07:00
Kenneth Graunke
2208d5a683 iris: Fix DrawTransformFeedback math when there's a buffer offset
We need to subtract the starting offset from the final offset before
dividing by the stride.  See src/intel/vulkan/genX_cmd_buffer.c:3142.

Not known to fix anything.
2019-04-23 15:57:07 -07:00
Kenneth Graunke
8d9e169bdd iris: Save/restore MI_PREDICATE_RESULT, not MI_PREDICATE_DATA.
MI_PREDICATE_DATA is an intermediate storage for the MI_PREDICATE
command's calculations - it holds the result of the subtraction when
the compare operation is SRCS_EQUAL or DELTAS_EQUAL.  But the actual
result of the predication is MI_PREDICATE_RESULT, which is what we
want to copy from the render context to the compute context.
2019-04-04 11:41:10 -07:00
Rafael Antognolli
ce830a364e iris: Add iris_resolve_conditional_render().
This function can be used to stall on the CPU and resolve the predicate
for the conditional render. It will convert ice->state.predicate from
IRIS_PREDICATE_STATE_USE_BIT to either IRIS_PREDICATE_STATE_RENDER or
IRIS_PREDICATE_STATE_DONT_RENDER, depending on the result of the query.

v2:
 - return void (Ken)
 - update the stored condition (Ken)
 - simplify the code leading to resolve the predicate (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Kenneth Graunke
1cd001aa63 iris: Make a iris_batch_reference_signal_syncpt helper function.
Suggested by Chris Wilson.  More obvious what's going on.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
9376799bd6 iris: Use READ_ONCE and WRITE_ONCE for snapshots_landed
Suggested by Chris Wilson, if only to make it obvious to the human
readers that these are volatile reads.  It may also be necessary for
the compiler in a few cases.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
18e31a9b31 iris: Fix accidental busy-looping in query waits
When switching from bo_wait to sync-points, I missed that we turned an
if (not landed) bo_wait into a while (not landed) check_syncpt(), which
has a timeout of 0.  This meant, rather than sleeping until the batch
is complete, we'd busy-loop, continually asking the kernel "is the batch
done yet???".  This is not what we want at all - if we wanted a busy
loop, we'd just loop on !snapshots_landed.  We want to sleep.

Add an effectively infinite timeout so that we sleep.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
3b1ac8244e iris: Add a timeout_nsec parameter, rename check_syncpt to wait_syncpt
I want to be able to wait with a non-zero timeout from elsewhere.
2019-02-21 10:26:11 -08:00
Sagar Ghuge
c24a574e6c iris: Don't allocate a BO per query object
Instead of allocating 4K BO per query object, we can create a large blob
of memory and split it into pieces as required.

Having one BO for multiple query objects, we don't want to wait on all
of them, instead when we write last snapshot, we create a sync point, and
check syncpoints while waiting on particular object.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-02-21 10:26:11 -08:00