Commit graph

1528 commits

Author SHA1 Message Date
Jason Ekstrand
060a6434ec anv: Advertise larger heap sizes
Instead of just advertising the aperture size, we do something more
intelligent.  On systems with a full 48-bit PPGTT, we can address 100%
of the available system RAM from the GPU.  In order to keep clients from
burning 100% of your available RAM for graphics resources, we have a
nice little heuristic (which has received exactly zero tuning) to keep
things under a reasonable level of control.

Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
2017-04-04 18:33:52 -07:00
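
The message for 060a6434ec does not spell out the heuristic, so the following is only a minimal sketch of what such a rule could look like. The thresholds (half of RAM on systems with up to 4 GiB, three quarters above that, capped at three quarters of the GPU address space) and the name `sketch_compute_heap_size` are assumptions made for illustration, not values confirmed by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: choose a heap size to advertise from the total
 * system RAM and the size of the GPU address space, both in bytes.
 * All thresholds are illustrative assumptions. */
static uint64_t
sketch_compute_heap_size(uint64_t total_ram, uint64_t gtt_size)
{
   assert(total_ram > 0 && gtt_size > 0);

   const uint64_t four_gib = 4ull << 30;

   /* Leave more headroom for the rest of the system on small machines. */
   uint64_t available_ram = total_ram <= four_gib ? total_ram / 2
                                                  : total_ram / 4 * 3;

   /* Never advertise more than most of the addressable GPU space. */
   uint64_t available_gtt = gtt_size / 4 * 3;

   return available_ram < available_gtt ? available_ram : available_gtt;
}
```

With a 48-bit address space the RAM term dominates; with a small aperture the address-space cap takes over.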
Jason Ekstrand
651ec926fc anv: Add support for 48-bit addresses
This commit adds support for using the full 48-bit address space on
Broadwell and newer hardware.  Thanks to certain limitations, not all
objects can be placed above the 32-bit boundary.  In particular, general
and state base address need to live within 32 bits.  (See also
Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
to handle this, we add a supports_48bit_address field to anv_bo and only
set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the bit
for all client-allocated memory objects but leave it false for
driver-allocated objects.  While this is more conservative than needed,
all driver allocations should easily fit in the first 32 bits of address
space, and it keeps things simple because we don't have to think about
whether or not any given one of our allocation data structures will be
used in a 48-bit-unsafe way.

Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
2017-04-04 18:33:52 -07:00
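
The placement policy described in 651ec926fc can be sketched roughly as follows. `struct sketch_bo` and both helpers are hypothetical stand-ins for the driver's real types, and `SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS` is a local placeholder for the kernel flag so the sketch compiles without the i915 uAPI header; real code should take the value from that header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local placeholder for EXEC_OBJECT_SUPPORTS_48B_ADDRESS (assumption:
 * real code includes the kernel's i915 uAPI header instead). */
#define SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1u << 3)

/* Hypothetical, simplified buffer object. */
struct sketch_bo {
   uint64_t size;
   bool supports_48bit_address;
};

/* Client-allocated memory may live anywhere in the 48-bit space;
 * driver-allocated objects conservatively stay below 4 GiB. */
static struct sketch_bo
sketch_bo_init(uint64_t size, bool client_allocated)
{
   struct sketch_bo bo = {
      .size = size,
      .supports_48bit_address = client_allocated,
   };
   return bo;
}

/* Only set the execbuf flag when the object opted in. */
static uint64_t
sketch_exec_object_flags(struct sketch_bo bo)
{
   return bo.supports_48bit_address ? SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS
                                    : 0;
}
```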
Jason Ekstrand
439da38d18 anv: Replace anv_bo::is_winsys_bo with a uint32_t flags
Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
2017-04-04 18:33:52 -07:00
Jason Ekstrand
5d1ba2cb04 anv/blorp: Align vertex buffers to 64B
This fixes issues seen when adding support for full 48-bit addresses.
The 48-bit addresses themselves have nothing to do with it other than
that it caused the kernel to place buffers slightly differently so they
interacted differently with the caches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-04 18:33:52 -07:00
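
Aligning an offset to 64 bytes, as 5d1ba2cb04 does for vertex buffers, usually comes down to a round-up like the one below. This is a generic sketch; `sketch_align_u32` is a hypothetical name and not necessarily the helper the driver uses.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment.
 * alignment must be a non-zero power of two. */
static inline uint32_t
sketch_align_u32(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```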
Jason Ekstrand
c964f0e485 anv: Query the kernel for reset status
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible.  In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering.  In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-04 18:33:52 -07:00
Jason Ekstrand
82573d0f75 anv: Check for device loss at the end of WaitForFences
It's possible that the device could have been lost while we were
waiting.  We should let the user know if this has happened.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-04 18:33:51 -07:00
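
The guarantee described in c964f0e485 and 82573d0f75 (a successful wait implies trustworthy results) can be sketched as a final check after the wait itself. Everything here is hypothetical: the enum stands in for VkResult, `struct sketch_device` for the driver's device, and the `kernel_reports_reset` parameter for the result of asking the kernel about context status.

```c
#include <stdbool.h>

/* Stand-ins for the relevant VkResult values. */
enum sketch_result {
   SKETCH_SUCCESS,
   SKETCH_TIMEOUT,
   SKETCH_ERROR_DEVICE_LOST,
};

/* Hypothetical device with a sticky "lost" flag. */
struct sketch_device {
   bool lost;
};

/* Called at the end of a fence wait: even if the wait succeeded, the
 * device may have been lost in the meantime, so report that instead. */
static enum sketch_result
sketch_finish_wait(struct sketch_device *device,
                   enum sketch_result wait_result,
                   bool kernel_reports_reset)
{
   if (kernel_reports_reset)
      device->lost = true; /* once lost, the device stays lost */

   if (device->lost)
      return SKETCH_ERROR_DEVICE_LOST;

   return wait_result;
}
```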
Jason Ekstrand
c6f69eea6a anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex
When the shader does not set one of these values, they are supposed to
get a default value of 0.  We have hardware bits in 3DSTATE_CLIP for
this but haven't been setting them.  This fixes the intermittent failure
of dEQP-VK.geometry.layered.3d.render_to_default_layer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-04 18:33:51 -07:00
Jason Ekstrand
3503b2714b i965/fs: Always provide a default LOD of 0 for TXS and TXL
We already provide a default LOD for textureQueryLevels and texture() on
non-fragment stages.  However, there are more cases where one is needed
such as textureSize(gsampler2DMS*) in SPIR-V.  Instead of trying to list
out all of the cases one at a time, just provide the default for all TXS
and TXL operations.  This fixes a shader validation error in the new
Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-04 18:33:35 -07:00
Jason Ekstrand
1fde054b8f intel/isl: Refactor and clarify gen8 alignment calculations
Adding the actual table from the docs makes it clearer exactly what the
restrictions are.  In particular, it becomes clear that compressed
textures ignore the alignment parameters in RENDER_SURFACE_STATE.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-04 14:51:57 -07:00
Lionel Landwerlin
e8d9b76f63 intel: tools: add aubinator_error_decode tool
This is pretty much the same tool as the one in i-g-t, only with fancier
decoding of the instructions/registers. It also doesn't support
anything before gen4.

v2 (from Matt): Drop authors
                Remove undefined automake variable

v3: Fix incorrect offsets for dword > 1 (Jordan)

v4: Fix decompression error with large blobs (Jordan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
567d77885e intel: genxml: add RING_BUFFER_CTL registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
6f260ff049 intel: genxml: add FAULT_REG register
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
ca2771fa18 intel: genxml: add gen7 ERR_INT register
v2: add register to gen7.5 (Matt)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
84613bf6d5 intel: genxml: add ACTHD registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
0f195f22aa intel: genxml: add GFX_ARB_ERROR_RPT register
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
d1a7a54d77 intel: genxml: add INSTDONE registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Mauro Rossi
72175bd2a5 android: intel: genxml: fix genX_xml.h generation rules
Recent changes in Makefile.sources merged the aubinator files into
a single list of generated files and genxml/genX_xml.h is now needed
to avoid the following build error:

ninja: error: '.../genxml/genX_xml.h', needed by '.../genxml/genX_xml.h',
missing and no known rule to make it
build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed

Fixes: 0f83c05 "intel: genxml: compress all gen files into one"
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-04-04 09:10:46 +03:00
Jason Ekstrand
405ef7bb33 intel/vec4: Add some fall through comments
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-03 16:58:35 -07:00
Jason Ekstrand
0817110969 anv: Implement VK_KHR_incremental_present
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-04-03 13:51:08 -07:00
Jason Ekstrand
f82b6c6272 vulkan/wsi: Plumb present regions through the common code
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-04-03 13:51:08 -07:00
Lionel Landwerlin
471c1bc7cc aubinator/gen_decoder/i965: decode instructions from dword 0
Some packets like 3DSTATE_VF_STATISTICS, 3DSTATE_DRAWING_RECTANGLE,
3DPRIMITIVE, PIPELINE_SELECT, etc... have configurable fields in
dword0, we probably want to print those.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-03 20:45:34 +01:00
Lionel Landwerlin
04f2e80257 intel: gen_decoder: store pointer to current decoded field in iterator
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-03 20:45:34 +01:00
Lionel Landwerlin
74a80d579d intel: genxml: fix out of tree builds
v2: use Emil's recommendation
    change rule to closer to genxml/genX_bits.h

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-31 15:29:57 +01:00
Tapani Pälli
3535b87a1a anv: change BLOCK_POOL_MEMFD_SIZE to 1GB
This allows us to run 32-bit Vulkan apps on Android; the ftruncate
call would fail at 2GB (the max size being 2GB - 1).

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-31 08:43:28 +03:00
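
The "2GB - 1" limit mentioned in 3535b87a1a follows from a signed 32-bit file offset. The sketch below assumes a 32-bit process without large-file support, so that the size passed to ftruncate has to fit in 31 bits; `sketch_fits_in_off32` is a hypothetical helper used only to show the arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if size can be represented by a signed 32-bit file offset
 * (assumption: off_t is 32 bits wide in the process in question). */
static bool
sketch_fits_in_off32(uint64_t size)
{
   return size <= (uint64_t)INT32_MAX;
}
```

A 1GB size passes this check, while exactly 2GB is one byte too large.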
Tapani Pälli
2398770c87 android: add libmesa_genxml as dep to libmesa_isl
This is to fix the following compile error with libmesa_isl:
   mesa/src/intel/isl/isl.c:28:10: fatal error: 'genxml/genX_bits.h' file not found

Fixes: f0eaf38 ("genxml: New generated header genX_bits.h (v6)")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-31 08:42:54 +03:00
Lionel Landwerlin
469da094e1 aubinator: enable snb/ilk through --gen
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-03-31 01:25:33 +01:00
Lionel Landwerlin
0f83c05149 intel: genxml: compress all gen files into one
Combining all the files into a single string didn't make any
difference in the size of the aubinator binary.

With this change we now also embed gen4/4.5/5 descriptions, which
increases the aubinator size by ~16Kb.

v2 (Lionel): rebase makefiles

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-03-31 01:24:56 +01:00
Kenneth Graunke
e113dfabad intel: Add INTEL_CFLAGS to aubinator CFLAGS.
It still needs intel_aub.h.  Fixes the build.
2017-03-30 11:58:00 -07:00
Emil Velikov
3df993e1a2 intel: automake: move INTEL_CFLAGS as applicable
Only common/decoder.[ch] requires it [for intel_aub.h].

v2: The code was moved from intel/tools to intel/common,
update accordingly.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-30 19:07:28 +01:00
Emil Velikov
4ffb394961 intel: android: remove libdrm_intel requirement
The only part which requires libdrm_intel, tools/aubinator, is not built
on Android.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-30 19:07:23 +01:00
Craig Stout
1da7a11de8 anv/cmd_buffer: fix host memory leak
push_constants must be free'd.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100452
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-29 14:32:32 -07:00
Jason Ekstrand
9aba81b160 anv/batch_chain: Handle another OOM in cmd_buffer_execbuf
Found by inspection while rebasing other patches.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-29 09:39:49 -07:00
Alejandro Piñeiro
2f8d6bd578 i965: expose BRW_OPCODE_[F32TO16/F16TO32] name on gen8+
Technically those hw operations are only available on gen7, as gen8+
support the conversion on the MOV. But, when using the builder to
implement nir operations (example: nir_op_fquantize2f16), it is not
needed to do the gen check. This check is done later, on the final
emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
specific operation accordingly.

So in the middle, during optimization phases those hw operations can
be around for gen8+ too.

Without this patch, several (at least 95) vulkan-cts quantize tests
crash when using INTEL_DEBUG=optimizer. For example:
dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert

v2: simplify the code using GEN_GE (Ilia Mirkin)
v3: tweak brw_instruction_name instead of changing opcode_descs
    table, that is used for validation (Matt Turner)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-29 17:34:15 +02:00
Jason Ekstrand
f3673db3d6 anv/cmd_buffer: Refactor flush_pipeline_select_*
While having the _3d and _gpgpu versions is nice, there's no reason why
we need to have duplicated logic for tracking the current pipeline.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-03-28 14:57:09 -07:00
Jason Ekstrand
6baae9625d anv: Flush caches prior to PIPELINE_SELECT on all gens
The programming note that says we need to do this still exists in the
SkyLake PRM and, from looking at the bspec, seems like it may apply to
all hardware generations SNB+.  Unfortunately, this isn't particularly
clear cut since there is also language in the bspec that says you can
skip the flushing and stall to get better throughput.  Experimentation
with the "Car Chase" benchmark in GL seems to indicate that some form of
flushing is still needed.  This commit makes us do the full set of
flushes regardless of hardware generation.  We can always reduce the
flushing later.

Reported-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:57:08 -07:00
Jason Ekstrand
0fe3dcce4c anv/cmd_buffer: Fix bad indentation
A bunch of code was indented in such a way that it looked like it went
with the if statement above but it definitely didn't.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:57:06 -07:00
Jason Ekstrand
01a65dc43b anv/cmd_buffer: Apply flush operations prior to executing secondaries
This fixes rendering issues in the Vulkan port of skia on some hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:56:55 -07:00
Jason Ekstrand
9319ef96fd anv/blorp: Use anv_get_layerCount everywhere
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:41:48 -07:00
Jason Ekstrand
1b8fa8dd79 anv: Make anv_get_layerCount a macro
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:41:47 -07:00
Chad Versace
d1032a047b isl: Drop unused isl_surf_init_info::min_pitch
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-28 09:44:44 -07:00
Chad Versace
6cbc13d94c intel: Fix requests for exact surface row pitch (v2)
All callers of isl_surf_init() that set 'min_row_pitch' wanted to
request an *exact* row pitch, as evidenced by nearby asserts, but isl
lacked API for doing so. Now that isl has an API for that, update the
code to use it.

v2: Assert that isl_surf_init() succeeds because the callers assume
    it.  [for jekstrand]

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> (v1)
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2017-03-28 09:44:44 -07:00
Chad Versace
e9017d58dc isl: Let isl_surf_init's caller set the exact row pitch (v2)
The caller does so by setting the new field
isl_surf_init_info::row_pitch.

v2: Validate the requested row_pitch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2017-03-28 09:44:44 -07:00
Chad Versace
23802dafc2 isl: Validate the calculated row pitch (v5)
Validate that isl_surf::row_pitch fits in the below bitfields,
if applicable based on isl_surf::usage.

    RENDER_SURFACE_STATE::SurfacePitch
    RENDER_SURFACE_STATE::AuxiliarySurfacePitch
    3DSTATE_DEPTH_BUFFER::SurfacePitch
    3DSTATE_HIER_DEPTH_BUFFER::SurfacePitch

v2:
  - Add a Makefile dependency on generated header genX_bits.h.
v3:
  - Test ISL_SURF_USAGE_STORAGE_BIT too. [for jekstrand]
  - Drop explicit dependency on generated header. [for emil]
v4:
  - Rebase for new gen_bits_header.py script.
  - Replace gen_10x with gen_device_info*.
v5:
  - Drop FINISHME for validation of GEN9 1D row pitch. [for jekstrand]
  - Reformat bit tests. [for jekstrand]

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v4)
2017-03-28 09:44:44 -07:00
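
The check described in 23802dafc2 amounts to asking whether a pitch can be encoded in a hardware bitfield of a given width. A minimal sketch follows; `sketch_pitch_fits` is a hypothetical name, and the assumption that the field stores the pitch minus one is an illustration choice, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if row_pitch can be encoded in a bitfield of field_bits bits.
 * Assumption: the field stores (pitch - 1), so N bits cover pitches
 * from 1 up to 2^N inclusive. */
static bool
sketch_pitch_fits(uint32_t row_pitch, uint32_t field_bits)
{
   assert(field_bits > 0 && field_bits < 32);

   if (row_pitch == 0)
      return false;

   return row_pitch - 1 < (1u << field_bits);
}
```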
Chad Versace
f0eaf38db2 genxml: New generated header genX_bits.h (v6)
genX_bits.h contains the sizes of bitfields in genxml instructions,
structures, and registers. It also defines some functions to query those
sizes.

isl_surf_init() will use the new header to validate that requested
pitches fit in their destination bitfields.

What's currently in genX_bits.h:

  - Each CONTAINER::Field from gen*.xml that has a bitsize has a macro
    in genX_bits.h:

        #define GEN{N}_CONTAINER_Field_bits {bitsize}

  - For each set of macros whose name, after stripping the GEN prefix,
    is the same, genX_bits.h contains a query function:

      static inline uint32_t __attribute__((pure))
      CONTAINER_Field_bits(const struct gen_device_info *devinfo);

v2 (Chad Versace):
  - Parse the XML instead of scraping the generated gen*_pack.h headers.

v3 (Dylan Baker):
  - Port to Mako.

v4 (Jason Ekstrand):
  - Make the _bits functions take a gen_device_info.

v5 (Chad Versace):
  - Fix autotools out-of-tree build.
  - Fix Android build. Tested with git://github.com/android-ia/manifest.
  - Fix macro names. They were all missing the "_bits" suffix.
  - Fix macro names more. Remove all double-underscores.
  - Unindent all generated code. (It was floating in a sea of whitespace).
  - Reformat header to appear human-written not machine-generated.
  - Sort gens from high to low. Newest gens should come first because,
    when we read code, we likely want to read the gen8/9 code and ignore
    the gen4 code. So put the gen4 code at the bottom.
  - Replace 'const' attributes with 'pure', because the functions now
    have a pointer parameter.
  - Add --cpp-guard flag. Used by Android.
  - Kill class FieldCollection. After Jason's rewrite, it was just
    a dict.

v6 (Chad Versace):
  - Replace `key not in d.keys()` with `key not in d`. [for dylan]

Co-authored-by: Dylan Baker <dylan@pnwbakers.com>
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v5)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v6)
2017-03-28 09:44:44 -07:00
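
To make the genX_bits.h layout from f0eaf38db2 concrete, here is a hand-written sketch of one macro family and its query function. `EXAMPLE_STATE`, its `Pitch` field, the bit sizes, and `struct sketch_device_info` (a stand-in for struct gen_device_info) are all invented for illustration; the real header is generated from the XML.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct gen_device_info. */
struct sketch_device_info {
   int gen;
};

/* One macro per generation, newest first. Sizes are illustrative only. */
#define GEN9_EXAMPLE_STATE_Pitch_bits 18
#define GEN8_EXAMPLE_STATE_Pitch_bits 18
#define GEN7_EXAMPLE_STATE_Pitch_bits 17

/* Query function shared by all macros with the same name once the GEN
 * prefix is stripped. It is 'pure' rather than 'const' because it reads
 * through a pointer parameter. */
static inline uint32_t __attribute__((pure))
EXAMPLE_STATE_Pitch_bits(const struct sketch_device_info *devinfo)
{
   switch (devinfo->gen) {
   case 9: return GEN9_EXAMPLE_STATE_Pitch_bits;
   case 8: return GEN8_EXAMPLE_STATE_Pitch_bits;
   case 7: return GEN7_EXAMPLE_STATE_Pitch_bits;
   default: return 0; /* field not described for this generation */
   }
}
```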
Matt Turner
7dccd38b40 i965/fs: Don't emit SEL instructions for type-converting MOVs.
SEL can only convert between a few integer types, which we basically
never do.

Fixes fs/vs-double-uniform-array-direct-indirect-non-uniform-control-flow
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-03-27 10:59:42 -07:00
Xu Randy
004468de14 anv/blorp: Fix a crash in CmdClearColorImage
We should use anv_get_layerCount() to access layerCount of VkImageSub-
resourceRange in anv_CmdClearColorImage and anv_CmdClearDepthStencil-
Image, which handles the VK_REMAINING_ARRAY_LAYERS (~0) case.

Test: Sample multithreadcmdbuf from LunarG can run without crash

Signed-off-by: Xu Randy <randy.xu@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-27 07:43:17 -07:00
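
The VK_REMAINING_ARRAY_LAYERS handling that 004468de14 relies on can be sketched as a macro that resolves the sentinel against the image's array size. The struct types, `SKETCH_REMAINING_ARRAY_LAYERS`, and `sketch_layers_to_clear` are hypothetical stand-ins so the example compiles without the Vulkan headers; the macro shape is an assumption modeled on the description above.

```c
#include <stdint.h>

/* Mirrors VK_REMAINING_ARRAY_LAYERS, which is defined as (~0U). */
#define SKETCH_REMAINING_ARRAY_LAYERS (~0u)

/* Hypothetical, simplified image and subresource range. */
struct sketch_image {
   uint32_t array_size;
};

struct sketch_range {
   uint32_t baseArrayLayer;
   uint32_t layerCount;
};

/* Resolve the "all remaining layers" sentinel to a real count. */
#define sketch_get_layerCount(image, range)                    \
   ((range)->layerCount == SKETCH_REMAINING_ARRAY_LAYERS ?     \
    (image)->array_size - (range)->baseArrayLayer :            \
    (range)->layerCount)

/* Example caller: how many layers a clear would touch. */
static uint32_t
sketch_layers_to_clear(uint32_t array_size, uint32_t base, uint32_t count)
{
   struct sketch_image image = { .array_size = array_size };
   struct sketch_range range = { .baseArrayLayer = base, .layerCount = count };
   return sketch_get_layerCount(&image, &range);
}
```

Reading layerCount directly would treat the sentinel as roughly four billion layers, which is consistent with the crash the commit describes.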
Samuel Iglesias Gonsálvez
c4c02471f4 anv: enable sampling from fast-cleared images on SKL
A resolve is not needed on Skylake in this case. We were forcing
a resolve because we set the input_aux_usage to ISL_AUX_USAGE_NONE.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-03-27 06:32:24 +02:00
Chad Versace
7414326164 genxml: Add 3DSTATE_DEPTH_BUFFER to gen5.xml
isl will use this for validating the depth buffer pitch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 19:07:05 -07:00
Jason Ekstrand
e6621746dc genxml: Whitespace fixes
Some field names had extra spaces and some had places where we should
have had a space but didn't.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-03-24 15:00:37 -07:00
Jason Ekstrand
34c3f6a27f genxml: Replace "[N]" with "N"
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-03-24 15:00:37 -07:00