Compile times of simple shaders are reduced by ~20%.
Compile times of prologs and epilogs are reduced by up to 40%.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Changes in v2:
- make loadSuInfo32() protected without making the rest protected
- move NVC0_SU_INFO_* into nv50_ir_lowering_nvc0.h instead of duplicating
NVC0_SU_INFO_MS
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
The current code causes:
/usr/include/c++/8/debug/safe_iterator.h:207:
Error: attempt to copy from a singular iterator.
This is due to the iterators getting invalidated, fix the
reverse iterator to use the return value from erase, and
cast it properly.
(used Mathias suggestion)
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
This ports radv to the shared code, however due to a bug in LLVM
version prior to 7, radv cannot add target info at this stage,
as it would leak one for every shader compile, however I'd prefer
to keep this llvm damage in the shared code, since it isn't the
driver at fault here. We just add a flag to denote if the driver
can support leaking the target info or not, and the common code
does the right thing depending on the llvm version.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We want to share this code with radv in the future, so port
it out of radeonsi.
Add a return value as radv will want that to know if this
succeeds
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
As precursor to moving init to common code, just rename the struct
and move it.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This just splits out the non-shared code and reuses ac_get_llvm_target in radv.
v2: rebase on Marek's patch - fixup brace position/whitespace
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes warning at screen creation. We store our outputs in normal temps
and just emit them to shader I/O at the end, due to our I/O ordering
requirements, so reading "outputs" in NIR is fine.
this will be needed for compatibility profiles
v2: handle tess shaders
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
When building with asserts enabled, we'll end up triggering an assert
in pipe_buffer_map_range down this code-path, due to trying to map
an empty range. Even if we avoid that, we'll trigger another assert
a bit later, because u_vbuf_get_minmax_index returns a min-index of
-1 here, which gets promoted to an unsigned value, and gives us an
out-of-bounds buffer-mapping offset.
Since we can't really have a well-defined min/max range here when
the range is empty anyway, we should just drop this dance in the
first place. After all, no rendering is going to be produced.
This fixes a crash in dEQP-GLES31.functional.draw_indirect.random.0
on VirGL for me.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Use caps to obtain the multisample sample positions for up to 16
positions and implement the according Gallium interface.
This implemenation (plus its counterpart in virglrenderer) assume that
the fixed sample position are always the same for a given number of samples
over the whole live time of a qemu session. It also assumes that sample
series are only given for 2, 4, 8, and 16 samples, and for intermediate
numbers N of samples the next higher supported set from above list is picked
and the sample positions for the first N samples are returned accordingly.
Fixes (when run on GL host):
dEQP-GLES31.functional.texture.multisample.samples_1.sample_position
dEQP-GLES31.functional.texture.multisample.samples_2.sample_position
dEQP-GLES31.functional.texture.multisample.samples_3.sample_position
dEQP-GLES31.functional.texture.multisample.samples_4.sample_position
dEQP-GLES31.functional.texture.multisample.samples_8.sample_position
dEQP-GLES31.functional.texture.multisample.samples_10.sample_position
dEQP-GLES31.functional.texture.multisample.samples_12.sample_position
dEQP-GLES31.functional.texture.multisample.samples_13.sample_position
dEQP-GLES31.functional.texture.multisample.samples_16.sample_position
v2: remove unrelated chunk (thanks Ilia Mirkin)
v3: - also return positions for intermediate sample counts
- fix unused varible warning
- update description
v4: explain better what this patch assumes and how it handles sample numbers
that are not directly advertised (thanks go to Erik Faye-Lund for making
me aware that this should be documented)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
v2: none
v3: none
Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
CC: Matt Turner <mattst88@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
v2: none
v3: none
Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reuse code shared with mesa/main/texcompress_bptc.
v2: Use block decompress function
v3: Include static bptc code from texcompress_bptc_tmp.h
Suggested-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
CC: Nicolai Hähnle <nicolai.haehnle@amd.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
This is mainly useful for when one needs to add new opcodes in a painless
and reliable way.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Having this if statement here prevented the next if statement from being
reached in the case of image stores, which is needed for instructions with
indirect bindless handles like "STORE TEMP[ADDR[2].x+1](1) ...".
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Add support for glsl 'invariant' modifier for output data declarations.
Gallium drivers that use TGSI serialization currently loose invariant
modifiers in glsl shaders.
v2: use boolean for invariant instead of unsigned.
Tested: chromiumos on qemu with virglrenderer.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With this commit, things no longer break if NVC0_CB_AUX_TEX_INFO is
changed to anything other than 0x20.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset
could be set to a specific value. The IndirectPropagation pass expected
it to return whether the offset could be increased by a specific value,
which is what TargetNV50::insnCanLoadOffset() does.
Fixes: 37b67db6ae
("nvc0/ir: be careful about propagating very large offsets into const load")
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Fix build error after llvm-7.0svn r332881 ("CodeGen: Add a dwo output
file argument to addPassesToEmitFile and hook it up to dwo output.").
CXX rasterizer/jitter/libmesaswr_la-JitManager.lo
rasterizer/jitter/JitManager.cpp:368:93: error: too few arguments to function call, expected at least 4, have 3
pTarget->addPassesToEmitFile(*pMPasses, filestream, TargetMachine::CGFT_AssemblyFile);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Chromium OS uses Autotools and pkg-config when building Mesa for
Android. The gallium drivers were failing to find the headers and
libraries for zlib and Android's libbacktrace.
v2:
- Don't add a check for zlib.pc. configure.ac already checks for
zlib.pc elsewhere. [for tfiga]
- Check for backtrace.pc separately from the other Android libs.
[for tfiga]
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>