Introduce stubs to anv_gem_stub.c that match the anv_gem.c ones.
Otherwise we may get link-time errors when building the tests.
v2: Introduce all the missing stubs at once.
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100574
Fixes: c964f0e485 ("anv: Query the kernel for reset status")
Fixes: 651ec926fc ("anv: Add support for 48-bit addresses")
Fixes: 060a6434ec ("anv: Advertise larger heap sizes")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
---
I've intentionally kept the order identical to that of anv_gem.c.
This way we can easily grep & diff in the future ;-)
Instead of just advertising the aperture size, we do something more
intelligent. On systems with a full 48-bit PPGTT, we can address 100%
of the available system RAM from the GPU. In order to keep clients from
burning 100% of your available RAM for graphics resources, we have a
nice little heuristic (which has received exactly zero tuning) to keep
things under a reasonable level of control.
Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
This commit adds support for using the full 48-bit address space on
Broadwell and newer hardware. Thanks to certain limitations, not all
objects can be placed above the 32-bit boundary. In particular, general
and state base address need to live within 32 bits. (See also
Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.) In order
to handle this, we add a supports_48bit_address field to anv_bo and only
set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set. We set the bit
for all client-allocated memory objects but leave it false for
driver-allocated objects. While this is more conservative than needed,
all driver allocations should easily fit in the first 32 bits of address
space, and this keeps things simple because we don't have to think about
whether or not any given one of our allocation data structures will be
used in a 48-bit-unsafe way.
Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
This fixes issues seen when adding support for full 48-bit addresses.
The 48-bit addresses themselves have nothing to do with it other than
that it caused the kernel to place buffers slightly differently so they
interacted differently with the caches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible. In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering. In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It's possible that the device could have been lost while we were
waiting. We should let the user know if this has happened.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When the shader does not set one of these values, they are supposed to
get a default value of 0. We have hardware bits in 3DSTATE_CLIP for
this but haven't been setting them. This fixes the intermittent failure
of dEQP-VK.geometry.layered.3d.render_to_default_layer.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Acked-by: Dave Airlie <airlied@redhat.com>
This allows us to run 32-bit Vulkan apps on Android, where the ftruncate
call would fail for a size of 2GB (the maximum size being 2GB - 1).
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
While having the _3d and _gpgpu versions is nice, there's no reason why
we need to have duplicated logic for tracking the current pipeline.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
The programming note that says we need to do this still exists in the
SkyLake PRM and, from looking at the bspec, seems like it may apply to
all hardware generations SNB+. Unfortunately, this isn't particularly
clear cut since there is also language in the bspec that says you can
skip the flushing and stall to get better throughput. Experimentation
with the "Car Chase" benchmark in GL seems to indicate that some form of
flushing is still needed. This commit makes us do the full set of
flushes regardless of hardware generation. We can always reduce the
flushing later.
Reported-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
A bunch of code was indented in such a way that it looked like it went
with the if statement above but it definitely didn't.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
This fixes rendering issues in the Vulkan port of skia on some hardware.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
All callers of isl_surf_init() that set 'min_row_pitch' wanted to
request an *exact* row pitch, as evidenced by nearby asserts, but isl
lacked an API for doing so. Now that isl has one, update the
code to use it.
v2: Assert that isl_surf_init() succeeds because the callers assume
it. [for jekstrand]
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> (v1)
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
We should use anv_get_layerCount() to access layerCount of
VkImageSubresourceRange in anv_CmdClearColorImage and
anv_CmdClearDepthStencilImage, which handles the
VK_REMAINING_ARRAY_LAYERS (~0) case.
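The special case that a helper such as anv_get_layerCount() has to handle
can be sketched as follows. This is a Python illustration of the logic only,
not the driver's C code; the function name resolve_layer_count and its
parameters are hypothetical.

```python
# Sketch only: the names below are hypothetical, not the anv implementation.
VK_REMAINING_ARRAY_LAYERS = 0xFFFFFFFF  # (~0U) in the Vulkan headers


def resolve_layer_count(array_size, base_array_layer, layer_count):
    """Return the effective number of layers in a subresource range."""
    if layer_count == VK_REMAINING_ARRAY_LAYERS:
        # "Remaining" means every layer from base_array_layer to the end.
        return array_size - base_array_layer
    return layer_count
```

Using the raw layerCount value as a loop bound without this check would
iterate roughly four billion times instead of over the remaining layers.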
Test: Sample multithreadcmdbuf from LunarG can run without crash
Signed-off-by: Xu Randy <randy.xu@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
A resolve is not needed on Skylake in this case. We were forcing
a resolve because we set the input_aux_usage to ISL_AUX_USAGE_NONE.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
We don't need to make the caller (CmdCopyQueryPoolResults) aware of the
problem since compute_query_result() only emits state. The caller is also
expected to hit OOM in this scenario right after calling this function, but
it is already handling it safely.
Fixes:
dEQP-VK.api.out_of_host_memory.cmd_copy_query_pool_results
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
We need to know if sample shading has been requested during shader
compilation since that affects the way fragment coordinates are
computed.
Notice that the semantics of fragment coordinates only depend on
whether sample shading has been requested, not on whether more
than one sample will actually be produced (that is,
minSampleShading and rasterizationSamples do not affect this
behavior).
Because this setting affects the code we generate for the shader, we also
need to include it in the WM prog key. Notice we don't need to alter the
OpenGL code because it doesn't ever use this behavior, so the key's
value is always false (the default).
Fixes:
dEQP-VK.glsl.builtin_var.fragcoord_msaa.*
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
According to section 14.6 of the Vulkan specification:
"When sample shading is enabled, the x and y components of FragCoord
reflect the location of the sample corresponding to the shader
invocation."
So add a boolean parameter to the lowering pass to select this behavior
when we need it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
If we know the device has been lost we should return this error code for
any command that can report it before we attempt to do anything with the
device.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The Vulkan specs say:
"A logical device may become lost because of hardware errors, execution
timeouts, power management events and/or platform-specific events. This
may cause pending and future command execution to fail and cause hardware
resources to be corrupted. When this happens, certain commands will
return VK_ERROR_DEVICE_LOST (see Error Codes for a list of such commands).
After any such event, the logical device is considered lost. It is not
possible to reset the logical device to a non-lost state, however the lost
state is specific to a logical device (VkDevice), and the corresponding
physical device (VkPhysicalDevice) may be otherwise unaffected. In some
cases, the physical device may also be lost, and attempting to create a
new logical device will fail, returning VK_ERROR_DEVICE_LOST."
This means that we need to track if a logical device has been lost so we can
have the commands referenced by the spec return VK_ERROR_DEVICE_LOST
immediately.
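A minimal sketch of the idea, written in Python for brevity rather than in
the driver's C. The Device class and its method names are hypothetical; only
the two result codes come from the Vulkan headers.

```python
VK_SUCCESS = 0
VK_ERROR_DEVICE_LOST = -4  # value from the Vulkan headers


class Device:
    """Hypothetical stand-in for a logical device."""

    def __init__(self):
        self.lost = False

    def mark_lost(self):
        # Once set, the flag is never cleared: per the spec, a lost
        # logical device cannot be reset to a non-lost state.
        self.lost = True

    def queue_wait_idle(self):
        # Check the flag before touching the device at all.
        if self.lost:
            return VK_ERROR_DEVICE_LOST
        # ... the real work would happen here ...
        return VK_SUCCESS
```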
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
So that we don't have to do things like rolling back address relocations in
case we run into OOM after computing them, etc.
Also, make sure that if the queue submission comes with a fence, we set it up
correctly so it behaves according to the spec after returning
VK_ERROR_DEVICE_LOST.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
It's written in C rather than pure Python and is strictly faster; the
only reason not to use it is that its classes cannot be subclassed.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
This has the potential to mask errors, since Element.get works like
dict.get, returning None if the element isn't found. I think the reason
that Element.get was used is that vulkan has one extension that isn't
really an extension, and thus is missing the 'protect' field.
This patch changes the behavior slightly by replacing get with explicit
lookup in the Element.attrib dictionary, and using xpath to only iterate
over extensions with a "protect" attribute.
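The difference can be illustrated with a small sketch. The XML fragment and
the extension names in it are made up for the example and are not taken from
the real registry; only the ElementTree calls are real.

```python
import xml.etree.ElementTree as et

# Made-up registry fragment for illustration only.
XML = """
<registry>
  <extensions>
    <extension name="VK_EXAMPLE_core_like"/>
    <extension name="VK_EXAMPLE_platform" protect="VK_USE_PLATFORM_EXAMPLE"/>
  </extensions>
</registry>
"""

root = et.fromstring(XML)

# Element.get() silently returns None for a missing attribute.
first = root.find("./extensions/extension")
missing = first.get("protect")

# Selecting only elements that carry the attribute makes the direct
# attrib lookup safe; a KeyError would now signal a real problem.
guards = [ext.attrib["protect"]
          for ext in root.findall("./extensions/extension[@protect]")]
```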
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Instead of using an if and a check, use dict.get, which does the same
thing, but more succinctly.
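As a sketch of the equivalence (the dictionary and its key are hypothetical,
not the ones used in the script):

```python
# Hypothetical data; the real script's dictionary and keys differ.
offsets = {"vkCreateDevice": 3}


def lookup_verbose(name):
    # The "if and a check" form.
    if name in offsets:
        return offsets[name]
    return None


def lookup_concise(name):
    # dict.get returns None when the key is absent.
    return offsets.get(name)
```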
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
This produces the header and the code in one command, saving the need to
call the same script twice, which parses the same XML file.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
This produces a file that is identical except for whitespace. There is a
table that has 8 columns in the original, which is easy to do with prints
but ugly using mako, so it no longer has columns; the data is not
inherently tabular.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
This does two things. First, it updates both the .h and the .c file to
have the same "do not edit" string. Second, it uses __file__ to ensure
that even if the file is moved or renamed, the name will be correct.
One thing to note is the use of '{{' and '}}' in the C template. This is
to instruct Python to print a literal '{' and '}' respectively, rather
than treating the contents as a format specifier.
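A short sketch of the brace escaping, assuming the template is filled in with
str.format. The template text and the render helper are illustrative and do
not reproduce the real generated file.

```python
import os

# Illustrative template: '{}' is a placeholder, '{{' and '}}' are literals.
TEMPLATE = (
    "/* This file generated from {}, don't edit directly. */\n"
    "\n"
    "struct example {{ int x; }};\n"
)


def render(script_path):
    # Only the file name is substituted, so moving the script's directory
    # does not change the output.
    return TEMPLATE.format(os.path.basename(script_path))
```

Inside a script, the call would be render(__file__).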
v3: - add this patch
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
This is groundwork for the next patches. It will allow porting the
header and the code to mako separately, and will also allow both to be
run simultaneously.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
It's slow, and has the potential for encoding issues.
v2: - pass xml file location via argument
- update Android.mk
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
These are all fairly small cleanups/tweaks that don't really deserve
their own patch.
- Prefer comprehensions to map() and filter(), since they're faster
- replace unused variables with _
- Use 4 spaces of indent
- drop semicolons from the end of lines
- Don't use parens around if conditions
- don't put spaces around brackets
- don't import modules as caps (ET -> et)
- Use docstrings instead of comments
v2: - Replace comprehensions with multiplication
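For illustration, the first item and the v2 note might look like this on
made-up data. That the v2 note refers to list multiplication of this kind is
an assumption.

```python
values = [1, 2, 3, 4]

# map()/filter() with lambdas ...
old_style = list(map(lambda v: v * v, filter(lambda v: v % 2 == 0, values)))

# ... and the equivalent comprehension.
new_style = [v * v for v in values if v % 2 == 0]

# For a list of identical immutable values, multiplication is shorter
# still (assumed to be the kind of change the v2 note describes).
filler = [None] * 3
```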
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
The query is a properties query so it needs to be handled in
GetPhysicalDeviceProperties2, not GetPhysicalDeviceFeatures2.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Found by inspection.
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
This was used for aubdumping (deleted a while ago) and INTEL_DEBUG=bat
decoding (deleted recently).
While we're changing parameters, delete the wrapper macro and make the
actual function brw_state_batch instead of __brw_state_batch.
This subsumes a patch by Emil Velikov to drop this from BLORP.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
The crash is due to NULL pColorBlendState, which is legal if the
pipeline has rasterization disabled or if the subpass of the render pass
the pipeline is created against does not use any color attachments.
Test: Sample subpasses from LunarG can run without crash
Signed-off-by: Xu,Randy <randy.xu@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>