gen8_dump_compile will be called indirectly by code common used by
generations before and after the gen8 instruction format change.
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
brw_dump_compile will be called indirectly by code common used by
generations before and after the gen8 instruction format change.
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Has been misaligned since we added instruction offset prefixes.
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
brw_disasm doesn't disassemble compacted instructions, so we uncompact
before disassembling them which would unset the compaction control bit.
Instead pass it as a separate argument.
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is effective only on gen8 for now as previous generations still
go through blorp.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Create the intel renderbuffer with level hardcoded to zero instead
of overriding it in the surface state configuration. Also moved the
dimension adjustments for tiling, mip level, msaa into the render
buffer creation. Finally prepares for another blit path needed for
miptree updownsampling.
v3 (Ken): Dropped unnecessary memory context for "ralloc_asprintf()"
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
v2: Configure stencil directly for final dimensions instead of
adjusting bit by bit for tiling, mip level and msaa.
v3 (Ken): Used non-static constant for horizontal alignment
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Allow hardware to offset accesses to individual layers. Also leave
the mip-level overriding for the creator of the intel renderbuffer
to handle. Merged with "i965/gen8: Allow stencil buffers to be
configured as single sampled"
Ken: I left the "_mesa_problem()" still in place. I think it is clearer
to remove it in a separate patch.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Use intel_mipmap_tree::total_width in order to get correct alignment
automatically. Also use "mt->total_height / mt->physical_depth0" as
surface height allowing hardware to offset to correct slice.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
mt->format is of type mesa_format, and therefore can't be
used with _mesa_base_fbo_format which requires a GLenum input.
On gen8, this fixes various piglit fbo-depthstencil tests with
samples > 1.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
They've served their purpose (in transitioning blorp to using
fs_generator) and now they just necessitate large amounts of manual
labor to regenerate if the disassembler changes.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
With this and the previous patch, we no longer have multiple
definitions in the final egl_gallium.so.
v2: Drop duplicate libloader link.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chia-I Wu <olv@lunarg.com> (v1)
Reviewed-by: Tom Stellard <thomas.stellard@amd.com> (v1)
It makes more sense to link the core and common parts of the driver as the
target is build. Additionally this will help us drop duplicating symbols
for targets that static link mulitple pipe-drivers. Only egl-static needs
that currently with more to come.
To simplify things a bit add HAVE_GALLIUM_RADEON_COMMON variable.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
The loops for updating the multiple packed fields in SP_VS_OUT[] and
SP_VS_VPC_DST[] will zero out one register beyond the last that on
required. Which is normally not a problem (and is kinda convenient
when looking at cmdstream dumps) unless we have maximum (16) varyings.
Fix loop termination condition so that this does not happen.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
We need to size input/output tables big enough for special inputs/
outputs (gl_Position, gl_FrontFacing, etc) which, while they don't
count towards the hw limit of 16 attributes or 16 varyings, we do
still need to track them all the same.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Hardware only supports 16. Which fd3_shader_variant properly reflected,
but the pipe cap did not, leading to array overflow (and shaders that
could not possibly work).
Also a bunch of asserts to make problems like this easier to see.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
We are starting to add integer support to the compiler, which does not
get exercised with glsl feature level 120 and without advertising
integer support. But doing so breaks too many things right now. So
for now use a debug flag to conditionally expose the functionality
while it is in development.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
The KILL_IF opcode could potentially be merged in to the regular KILL
opcode function. It was a pain to do so, so I've left is separated
for cleanliness.
Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Adds a large sum of TGSI opcodes to the a3xx compiler.
For integer opcodes we have 28 opcodes added.
Adds 4 floating point compare opcodes
If GLSL 1.30 is enabled, this allows the GLSL 1.30 piglits to have a
completion amount of 432/641.
Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
All shaders had the same name.
We could probably use some identifier per shader too, but for now only use
the variant number.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
The setup shaders were composed of both a fs shader number and a variant
number. But since they aren't tied to a particular fragment shader, the
former was a fixed zero while the latter was also always zero because
it was never assigned. So, similar to what the fs code does, use a ever
increasing number to give it a more catchy name (unlike fragment shaders
though where this number is for each explicitly created shader, we just use
it for the implicitly created variants).
And while here, fix whitespace a bit.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Unused except it was increased for both fs and setup shader variants created.
Probably some leftover from ages ago.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Previously the code used the total number of ubos being declared in the
linked program (so the ubos of all shaders combined), use the number
from the particular shader instead.
This fixes an assertion failure with piglit arb_uniform_buffer_object-maxblocks
seen in llvmpipe since 8a9f5ecdb1 as it now emits
code for each declared buffer, not just the ones actually used.
CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
The big missing part here is proper sched data calculations, but
hopefully the chosen placeholder will be sufficient for now.
Passes piglit as well as GK107 does.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Maxwell no longer has the methods to set constant attributes, and we'll
want to be treating stride 0 vtxbufs the same as for stride > 0.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Both changes landed in 10.2, and for people not following the
development cycle these will come as a surprise. Note that the
pipe_* interface is not stable.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Rob Clark <robclark@freedesktop.org>
Use "@GL_LIB@" in src/gallium/targets/libgl-xlib/Makefile.am to produce
the library name specified by the configure --with-gl-lib-name option.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Commit 11623be934 was meant to have this hunk, which
I accidently dropped during git rebase.
Cc: 10.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Julien Cristau <jcristau@debian.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jonathan Gray <jsg@jsg.id.au>
In 1d35f77228 support for multiple constant
buffers was introduced. This meant we had another indirection, and we did
resolve the indirection for each constant buffer access. This looks very
reasonable since llvm can figure out if it's the same pointer, however it
turns out that this can cause llvm compilation time to go through the roof
and beyond (I've seen cases in excess of factor 100, e.g. from 50 ms to more
than 10 seconds (!)), with all the additional time spent in IR optimization
passes (and in the end all of it in DominatorTree::dominate()).
I've been unable to narrow it down a bit more (only some shaders seem affected,
seemingly without much correlation to overall shader complexity or constant
usage) but it is easily avoidable by doing the buffer lookups themeselves just
once (at constant buffer declaration time).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Bring it back in line with r600g. I broke this in the original radeonsi
bringup. :(
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78537
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>