Commit graph

85428 commits

Author SHA1 Message Date
Lionel Landwerlin
014bd4acb8 anv: turn on samplerAnisotropy in VkPhysicalDeviceFeatures
According to the Vulkan spec 5.63.4 :

  samplerAnisotropy indicates whether anisotropic filtering is supported. If
  this feature is not enabled, the maxAnisotropy member of the
  VkSamplerCreateInfo structure must be 1.0.

Since we already set maxAnisotropy to 16 and program the hardware according
to the VkSamplerCreateInfo.maxAnisotropy, it seems we can turn this on.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-10 09:25:38 +01:00
Edward O'Callaghan
ba43768a1e radv: Use proper header guards over 'pragma once' directives
Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2016-10-10 16:10:56 +11:00
Tapani Pälli
2d7e0f35c5 mesa: throw error if bufSize negative in GetSynciv on OpenGL ES
Fixes following dEQP tests:

   dEQP-GLES31.functional.debug.negative_coverage.callbacks.state.get_synciv
   dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_synciv
   dEQP-GLES31.functional.debug.negative_coverage.log.state.get_synciv

v2: drop _mesa_is_gles check (Kenneth)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98133
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-10 07:29:31 +03:00
Tapani Pälli
d997d5c0c9 glsl: prohibit lowp, mediump precision on atomic_uint
Fixes following dEQP tests:

   dEQP-GLES31.functional.debug.negative_coverage.callbacks.atomic_counter.atomic_precision
   dEQP-GLES31.functional.debug.negative_coverage.get_error.atomic_counter.atomic_precision
   dEQP-GLES31.functional.debug.negative_coverage.log.atomic_counter.atomic_precision

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98131
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-10 07:29:31 +03:00
Tapani Pälli
c64093e7d5 glsl: optimize copy_propagation_elements pass
Changes make copy_propagation_elements pass faster, reducing link
time spent in test case of bug 94477. Does not fix the actual issue
but brings down the total time. No regressions seen in CI.

v2 (idr): Formatting / whitespace fixes.  Embed the acp_ref in the
acp_entry.

v3 (idr): Delete unused copy constructor.  Use while(pop_head) instead
of foreach() { remove }.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-10 07:29:31 +03:00
Dave Airlie
db5d278541 radv: don't build without SHA1.
Just copy the section from anv above this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98167
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-10 10:08:47 +10:00
Edward O'Callaghan
185be15d9d docs/features.txt: Add GL_KHR_robustness supported on ES 3.2
Both radeonsi and nvc0 should also support ES so fixup doc.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-09 01:06:38 +11:00
Lionel Landwerlin
4682abdaa8 intel: aubinator: enable loading dumps from standard input
In conjuction with an intel_aubdump change, you can now look at your
application's output like this :

$ intel_aubdump -c '/path/to/aubinator --gen=hsw' my_gl_app

v2: Add print_help() comment about standard input handling (Eero)
    Remove shrinked gtt space debug workaround (Eero)

v3: Use realloc rather than memcpy/free (Ben)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
2016-10-08 02:18:47 +01:00
Lionel Landwerlin
619c8de522 intel: aubinator: enable loading xml files from a given directory
This might be useful for people who debug with out of tree descriptions.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
2016-10-08 02:17:35 +01:00
Lionel Landwerlin
63a366a881 intel: aubinator: generate a standalone binary
Embed the xml files into the binary, so aubinator can be used from any
location.

v2: Split generation packing into another patch (Jason)
    Check for xxd (Jason)

v3: Fix out of tree builds (Jason)
    Generate custom variable name rather than names generated by xxd
    (Lionel)

v4: Move generated _xml.h files to genxml/ (Sirisha)

v5: Remove newline from makefile (Jason)

v6: Add comment on gen*_xml.h creation (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-08 02:17:03 +01:00
Nanley Chery
4d7d9825f3 anv/TODO: Update the HiZ task
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2016-10-07 12:54:18 -07:00
Nanley Chery
d8aacc24cc anv: Enable fast depth clears
Provides an FPS increase of ~30% on the Sascha triangle and multisampling
demos.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2016-10-07 12:54:18 -07:00
Chad Versace
78d074b87a anv/cmd_buffer: Enable rendering to HiZ
Nanley Chery:
(rebase)
 - Resolve conflicts with new anv_batch_emit macro
(amend)
 - Handle a QPitch TODO
 - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems
 - Only use HiZ for single-subpass renderpasses
 - Emit the HiZ instruction before the stencil instruction to follow the
   optimized clear sequence specified in the PRMs
 - Don't modify clear params
 - Enable resolves when a HiZ buffer is used to ensure depth buffer validity

Provides an FPS increase of ~15% on the Sascha triangle and multisampling
demos.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-07 12:54:18 -07:00
Nanley Chery
134d181be1 anv/cmd_buffer: Add code for performing HZ operations
Create a function that performs one of three HiZ operations -
depth/stencil clears, HiZ resolve, and depth resolves.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2016-10-07 12:54:18 -07:00
Jason Ekstrand
9919a2d34d anv/image: Memset hiz surfaces to 0 when binding memory
Nanley Chery (amend):
 - Change memset value from 0xff to 0 (a defined value for HiZ).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-07 12:54:18 -07:00
Jason Ekstrand
b4bbabf21b anv: Move BindImageMemory to anv_image.c
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-07 12:54:18 -07:00
Chad Versace
917814dccd anv: Allocate hiz surface
Nanley Chery:
(rebase)
 - Use isl_surf_get_hiz_surf()
(amend)
 - Only add a HiZ surface onto a depth/stencil attachment
 - Add comment above HiZ surface addition
 - Hide HiZ behind INTEL_VK_HIZ prior to BDW
 - Disable HiZ for untested cases
 - Remove DISABLE_AUX_BIT instead of preventing it from being added

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2016-10-07 12:54:18 -07:00
Chad Versace
3aec432ed3 anv: Add func anv_image_has_hiz()
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-07 12:54:17 -07:00
Chad Versace
fe40d026a1 anv: Add anv_image::hiz_surface
Unused.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-07 12:54:17 -07:00
Nanley Chery
814fa12379 isl: Correct a comment in the isl_format enum
HiZ is not a color surface, but an auxiliary depth surface.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-07 12:54:17 -07:00
Rob Clark
495ba8884a gallium: add missing zero-init for resource templates
Mostly test code, plus one spot I noticed in r600.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-07 15:50:46 -04:00
Rob Clark
3ebfc44b42 freedreno: don't try to shadow layered textures
We will only hit this with multi-planar YUV external images, so we would
probably never hit this code path in the first place.  But if we did, it
wouldn't do the right thing so just bail.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-10-07 15:50:46 -04:00
Rob Clark
f88f025e8c freedreno/a3xx+a4xx: fix clip-plane lowering state
If enabled clip-planes have changed, we need to mark program state
dirty.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-10-07 15:50:46 -04:00
Ian Romanick
f546b41f6a glsl: Let cache_test build when the shader cache is not enabled
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Aaron Watry <awatry@gmail.com>
2016-10-07 11:19:37 -07:00
Lionel Landwerlin
eb23de6116 anv: pipeline cache: fix return value of vkGetPipelineCacheData
According to the spec - 9.6. Pipeline Cache :

  If pDataSize is less than the maximum size that can be retrieved by the
  pipeline cache, at most pDataSize bytes will be written to pData, and
  vkGetPipelineCacheData will return VK_INCOMPLETE.

Fixes the following test from Vulkan CTS :

  dEQP-VK.pipeline.cache.pipeline_from_incomplete_get_data.vertex_stage_fragment_stage
  dEQP-VK.pipeline.cache.pipeline_from_incomplete_get_data.vertex_stage_geometry_stage_fragment_stage
  dEQP-VK.pipeline.cache.misc_tests.invalid_size_test

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-07 18:46:12 +01:00
Timothy Arceri
965ebc8b28 util: remove unused variable
Also initialise page at declaration.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-07 21:24:50 +11:00
Martin Peres
a599b1c203 loader/dri3: import prime buffers in the currently-bound screen
This tries to mirrors the codepath taken by DRI2 in IntelSetTexBuffer2()
and fixes many applications when using DRI3:
 - Totem with libva on hw-accelerated decoding
 - obs-studio, using Window Capture (Xcomposite) as a Source
 - gstreamer with VAAPI

v2:
 - introduce get_dri_screen() in the dri3 loader's vtable (krh)

Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Tested-by: Ionut Biru <biru.ionut@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71759
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2016-10-07 11:11:55 +03:00
Martin Peres
0247e5ee3e loader/dri3: add get_dri_screen() to the vtable
This allows querying the current active screen from the
loader's common code.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
2016-10-07 11:11:44 +03:00
Jason Ekstrand
82b4f1c47b anv/entrypoints: Save off the entire devinfo rather than a pointer
Since the gen_device_info structs are no longer just constant memory, a
pointer to one is not a pointer to something in the .data section so we
shouldn't be storing it in a static variable.  Instead, we should just
store the entire device_info structure.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-06 21:13:52 -07:00
Dave Airlie
85a47f647e radv: drop all uint for unsigned.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-07 12:09:13 +10:00
Eric Anholt
20d91e5ce9 vc4: Don't worry about partial Z/S clear if the other is already cleared.
We have to be careful to not smash the value they're clearing to, but
other than that we're fine.  Avoids quad clears in Processing, which likes
to do glClear(Z|S); glClear(Z).

Improves performance of Processing's QuadRendering demo at 5000 quads by
5.46507% +/- 1.35576% (n=15 before, 32 after)
2016-10-06 18:29:16 -07:00
Eric Anholt
cb328123fe vc4: Try to fix the HW-2116 workaround.
We were incrementing the count at the end of vc4_start_draw(), except that
that function returns immediately if we've already started drawing on this
batch.  It also failed to count the statechanges from the GFXH-515
workaround.

This incidentally allows repeated glClear() to be coalesced, because the
fast clears aren't counted in draw_calls_queued any more.  Fixes most of
the extra flushes in Processing, which emits glClear(Z|S); glClear(Z);
glClear(C) during its frame setup.

Improves performance of Processing's QuadRendering demo at 5000 quads by
3.33538% +/- 2.05846% (n=21 before, 15 after)
2016-10-06 18:29:12 -07:00
Eric Anholt
bca9a58d04 vc4: Drop dead argument from vc4_start_draw(). 2016-10-06 18:09:24 -07:00
Eric Anholt
9421a6065c vc4: Fix fallback to quad clears of depth in GLX.
The fix in the vc4-jobs series ended up triggering the fallback path on
GLX apps that use depth but not stencil.
2016-10-06 18:09:24 -07:00
Eric Anholt
8810270d06 vc4: Add the format name in miptree_debug.
I was curious if my Z/S buffer was actually ZS or ZX, and the vc4 format
of "0" didn't tell me much.
2016-10-06 18:09:24 -07:00
Eric Anholt
ee577e7fa7 vc4: Fix perf debug formatting on partial Z/S clear. 2016-10-06 18:09:24 -07:00
Eric Anholt
7c7bcbbc7d vc4: Drop destination register when it's unused.
This slightly reduces instructions on shader-db, but I think it's just
perturbing register allocation -- the allocator should have always
trivially colored these nodes, before.  This commit is just to make QIR
code failing more intelligible when register allocation fails.
2016-10-06 18:09:24 -07:00
Eric Anholt
d4ae5ca823 vc4: Fix live intervals analysis for screening defs in if statements.
If a conditional assignment is only conditioned on the exec mask, that's
still screening off the value in the executed channels (and, since we're
not storing to the unexcuted channels, we don't care what's in there).

Fixes a bunch of extra register pressure on Processing's Ribbons demo,
which is failing to allocate.
2016-10-06 18:09:24 -07:00
Eric Anholt
06cc3dfda4 vc4: Fix simulator when more than one vc4_screen is opened.
We would assertion fail in setting up the simulator the second time
around.  This at least postpones the assertion failure until we've closed
all of the first set of screens and started opening a new set.
2016-10-06 18:09:24 -07:00
Eric Anholt
b30205b112 vc4: Fix assertion fails from trying to cast non-ALU instrs to ALU.
Fixes 100 piglit tests since the assertions were added to nir.h.  What's
amazing is that these tests used to pass, even when casting garbage.
2016-10-06 18:09:24 -07:00
Jason Ekstrand
c81ec84c1e anv/cmd_buffer: Move the clear_subpasses calls to set_subpass
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-06 16:52:31 -07:00
Jason Ekstrand
b548fdbed5 anv/cmd_buffer: Don't call set_subpass in a secondary
Initially, we had intended set_subpass to be an interesting function that
did whatever (presumably a lot) setup we needed for a subpass.  In reality,
it just sets a pointer and a dirty bit and then emits depth and stencil
state.  When we call BeginCommandBuffer on a secondary, there's no point in
setting depth and stencil state since it will already be set by the
primary.  Instead, the only thing we need to do at the start of a secondary
is set the subpass pointer and the dirty bit.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-06 16:52:31 -07:00
Jason Ekstrand
fe4e276b02 anv/cmd_buffer: Rework descriptor dirtying in set_subpass
We have a DIRTY_RENDER_TARGETS flag and that makes a lot more sense than
just dirtying fragment descriptors.  We're checking for it in some of the
gen7 code but unfortunately, nothing was setting it and it didn't do what
it was supposed to do in cmd_buffer_flush_state.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-06 16:52:31 -07:00
Jason Ekstrand
a1db0e87ff anv/wsi: Advertise UNORM formats as well as sRGB
Because WSI images are created with VkImageCreateInfo::flags explicitly set
to 0, they don't ever have the VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT set.
This means that you can't create an image view of it with a different
format so applications can't render directly in sRGB (without automatic
encoding) unless we actually advertise UNORM formats.  There are a lot of
applications that want to do their own sRGB conversion, so we should allow
for that.  We do, however, make UNORM come after sRGB in the list so that
the default for dumb apps that just grab the first thing is to render in
linear and let the sRGB conversion happen automatically.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-06 16:52:31 -07:00
Dave Airlie
5267124648 radv: fix configure.ac check
This should be positive test.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-07 09:28:03 +10:00
Gustaw Smolarczyk
24815bd7b3 radv: Skip already signalled fences.
If the user created a fence with VK_FENCE_CREATE_SIGNALED_BIT set, we
shouldn't fail to wait for a fence if it was not submitted since that is
not necessary.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-07 09:24:09 +10:00
Dave Airlie
f4e499ec79 radv: add initial non-conformant radv vulkan driver
This squashes all the radv development up until now into
one for merging.

History can be found:
https://github.com/airlied/mesa/tree/semi-interesting

This requires llvm 3.9 and is in no way considered
a conformant vulkan implementation. It can run a number
of vulkan applications, and supports all GPUs using
the amdgpu kernel driver.

Thanks to Intel for providing anv and spirv->nir,
and Emil Velikov for reviewing build integration.

Parts of this are:
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>

Authors: Bas Nieuwenhuizen and Dave Airlie
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-07 09:16:09 +10:00
Samuel Pitoiset
28ecd3eac2 nv50/ir: fix wrong check when optimizing MAD to SHLADD
Checking if MAD is supported is definitely wrong, and it's
more likely a typo I introduced few days ago which breaks
NV50 because SHLADD is not supported there.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-10-07 01:13:06 +02:00
Lionel Landwerlin
0b10152b80 intel: aubinator: use getopt to parse arguments
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sirisha Gandikota <sirisha.gandikota@intel.com>
2016-10-07 00:05:56 +01:00
Samuel Pitoiset
a198883bf7 nvc0: dump program binary only when NV50_PROG_DEBUG is set
When the chipset is forced with NV50_PROG_CHIPSET, we actually
only want to output the binary if NV50_PROG_DEBUG is also
enabled. Otherwise, this pollutes the shader-db output.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-10-07 01:01:17 +02:00