Different generations of hardware measure jump distances in different
units. Previously, every function that needed to set a jump target open
coded this scaling, or made a hardcoded assumption (i.e. just used 2).
Most functions start with the number of instructions to jump, and scale
up to the hardware-specific value. So, I made the function match that.
Others start with a byte offset, and divide by a constant (8) to obtain
the jump distance. That 8 is actually 16 / 2: the 16-byte instruction size
divided by the jump scale for Gen5-7.
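A minimal sketch of the idea, in C. The function names, the plain integer
generation parameter, and the Gen4 scale of 1 are assumptions made for
illustration; only the scale of 2 for Gen5-7 and the 16 / 2 relationship
come from the description above. This is not the helper as committed.

```c
#include <assert.h>

/* One uncompacted instruction is 16 bytes (the "16" in 16 / 2 above). */
#define INSN_SIZE_BYTES 16

/* Hypothetical helper: jump-distance units per instruction. */
static inline unsigned
jump_scale(int gen)
{
   /* This sketch only covers Gen4-7. */
   assert(gen >= 4 && gen <= 7);

   /* Gen5-7 use a scale of 2, per the description above. */
   if (gen >= 5)
      return 2;

   /* Assumption: Gen4 counts whole instructions. */
   return 1;
}

/* Most callers start from a number of instructions and scale up. */
static inline int
jump_from_insn_count(int gen, int insns)
{
   return insns * (int) jump_scale(gen);
}

/* Callers that start from a byte offset divide by (16 / scale),
 * which is the hardcoded 8 on Gen5-7.
 */
static inline int
jump_from_byte_offset(int gen, int bytes)
{
   return bytes / (INSN_SIZE_BYTES / (int) jump_scale(gen));
}
```

Both entry points agree: three instructions (48 bytes) on Gen6 give a jump
distance of 6 either way.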
v2: Make the helper a static inline defined in brw_eu.h, instead of
an actual function in brw_eu_emit.c (as suggested by Matt).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
It has Gen6+ knowledge baked in, and indeed is only called for Gen6+,
but it wasn't immediately obvious that this was the case.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
See gen8_generator::CMP().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Until now, it's been off implicitly: we never call the compactor
function. When we merge the generators, we'll start calling it, so we
should make it do nothing.
Matt will enable instruction compaction properly later.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Broadwell is going to use the brw_eu_emit.c code soon. We want to get
the fake MRF handling and URB HWord channel mask handling.
We don't need the CMP thread switch workaround, though.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Now that we no longer use ctx->DrawBuffer->_Xmin and related fields to
program the screen-space viewport extents, we don't depend on any
scissoring state. So we can drop the +_NEW_SCISSOR dependency.
On GEN8, a change in scissor state does not affect anything for the
clipper/sf hardware state. The hardware will always do the right thing
once the viewport extents are programmed. We can therefore remove the
unnecessary state emission.
Ken originally spotted this.
v2: Reword the commit message. Remove spurious hunk.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The goal of guardband clipping is to try to avoid 3d clipping because it
is an expensive operation. When guardband clipping is disabled, all
geometry that intersects the viewport is sent to the FF 3d clipper.
Objects which are entirely enclosed within the viewport are said to be
"trivially accepted" while those entirely outside of the viewport are
"trivially rejected".
When guardband clipping is turned on, the above behavior is changed such
that if the geometry is within the guardband, and intersects the
viewport, it skips the 3d clipper. Prior to GEN8, this was problematic
if the viewport was smaller than the screen as it could allow for
rendering to occur outside of the viewport. That could be mitigated if
the programmer specified a scissor region which was less than or equal
to the viewport - but this is not required for correctness in OpenGL. In
theory you could be clever with the guardband so as not to invoke this
problem. We do not do this, and have no data that suggests we should
bother (nor the converse data).
With viewport extents in place on GEN8, it should be safe to turn on
guardband clipping for all cases.
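The decision described above can be sketched roughly as follows. This is
an illustrative C model working on 2D bounding boxes, not the hardware
logic or driver code; every type and function name here is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct rect { float x0, y0, x1, y1; };  /* requires x0 <= x1, y0 <= y1 */

enum clip_action {
   TRIVIAL_REJECT,  /* entirely outside the viewport */
   TRIVIAL_ACCEPT,  /* entirely inside the viewport */
   SKIP_CLIPPER,    /* intersects the viewport, but fits in the guardband */
   MUST_CLIP,       /* sent to the 3d clipper */
};

/* True if 'inner' lies entirely within 'outer'. */
static bool
rect_contains(const struct rect *outer, const struct rect *inner)
{
   return inner->x0 >= outer->x0 && inner->x1 <= outer->x1 &&
          inner->y0 >= outer->y0 && inner->y1 <= outer->y1;
}

/* True if the two rectangles share no area or edge. */
static bool
rect_disjoint(const struct rect *a, const struct rect *b)
{
   return a->x1 < b->x0 || b->x1 < a->x0 ||
          a->y1 < b->y0 || b->y1 < a->y0;
}

static enum clip_action
classify(const struct rect *bbox, const struct rect *viewport,
         const struct rect *guardband, bool guardband_enabled)
{
   /* Assumption: the guardband always encloses the viewport. */
   assert(rect_contains(guardband, viewport));

   if (rect_disjoint(bbox, viewport))
      return TRIVIAL_REJECT;
   if (rect_contains(viewport, bbox))
      return TRIVIAL_ACCEPT;
   if (guardband_enabled && rect_contains(guardband, bbox))
      return SKIP_CLIPPER;
   return MUST_CLIP;
}
```

The SKIP_CLIPPER case is the one that used to be risky before GEN8:
nothing in this model trims the geometry back to the viewport, so
something downstream has to discard the pixels outside of it.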
While here, add a comment to the code which confused me thoroughly.
v2: Update grammar in commit message. Reword comments based on Ken's
suggestion.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Viewport extents are a third rectangle that defines which pixels get
discarded as part of the rasterization process. The actual pixels drawn
to the screen are the intersection of the drawing rectangle, the viewport
extents, and the scissor rectangle. Viewport extents also permit the use
of guardband clipping in all cases (see later patch).
The scissor rectangle is not important for this discussion, as it should
always help do the right thing provided the programmer uses it.
switch (viewport dimensions, drawrect dimensions) {
case viewport > drawing rectangle: no effects; break;
case viewport == drawing rectangle: no effects; break;
case viewport < drawing rectangle:
        Pixels (after the viewport transformation but before expensive
        rasterizing and shading operations) which are outside of the
        viewport are discarded.
}
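As a rough model of the three-rectangle intersection, here is a small C
sketch. The struct, the function names, and the half-open integer pixel
coordinates are assumptions for illustration and do not reflect how the
hardware state is actually programmed.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open pixel box: covers [x0, x1) x [y0, y1). */
struct box { int x0, y0, x1, y1; };

static int max_i(int a, int b) { return a > b ? a : b; }
static int min_i(int a, int b) { return a < b ? a : b; }

static bool
box_empty(struct box b)
{
   return b.x0 >= b.x1 || b.y0 >= b.y1;
}

static struct box
box_intersect(struct box a, struct box b)
{
   /* Inputs must be well formed; the result may still be empty. */
   assert(a.x0 <= a.x1 && a.y0 <= a.y1);
   assert(b.x0 <= b.x1 && b.y0 <= b.y1);

   struct box r = {
      max_i(a.x0, b.x0), max_i(a.y0, b.y0),
      min_i(a.x1, b.x1), min_i(a.y1, b.y1),
   };
   /* Normalize a non-overlapping result to a canonical empty box. */
   if (box_empty(r))
      r = (struct box) { 0, 0, 0, 0 };
   return r;
}

/* Pixels that can be drawn: drawing rectangle, viewport extents and
 * scissor rectangle all have to agree.
 */
static struct box
drawable_region(struct box drawrect, struct box viewport, struct box scissor)
{
   return box_intersect(box_intersect(drawrect, viewport), scissor);
}
```

With a viewport at least as large as the drawing rectangle, the viewport
extents change nothing; only a smaller viewport shrinks the result, which
matches the three cases listed above.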
I am unable to find a test case where this improves performance, but in
all my testing it doesn't hurt performance, and intuitively, it should
not ever hurt performance. It also permits us to use the guardband more
freely (see upcoming patch).
v2: Updating commit message.
v3: Commit message updates requested by Ken
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
While working in this part of the code I had a great deal of trouble
understanding what it was trying to do, and matching it with the spec
(mostly due to bad wording in the PRM). To help future people, I've cleaned
up the wording and provided some ASCII art.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
This adds support for Marek's new driconf parameter, which avoids
totally white rendering in Unigine Valley (which attempts to enable
the GL_ARB_sample_shading extension in an illegal place).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75664
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This lets us call dump_instructions() after register allocation without
failing an assertion.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Without this patch I get the following during DMA transfers:
[drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream !
radeon 0000:01:00.0: CP DMA dst buffer too small (21475829792 4096)
This is a fixup for e878e154cd.
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tahiti has 12 tile pipes, but P8 pipe config.
It looks like there is no way to get the pipe config except for reading
GB_TILE_MODE. The TILING_CONFIG ioctl doesn't return more than 8 pipes,
so we can't use that for Hawaii.
This fixes a regression on Tahiti caused by 9b046474c9.
v2: add an assertion and print an error on failure
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This will help to get rid of the buffer_get_virtual_address calls.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
The code is rewritten to take known constraints into account, while always
using 0 by default.
This should improve performance for multi-SE parts in theory.
A debug option is also added for easier debugging. (If there are hangs,
use the option. If the hangs go away, you have found the problem.)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
v2: fix a typo, set max_se for evergreen GPUs according to the kernel driver
This isn't documented anywhere, but it's the only thing that works
for this case.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This validates all bound buffers (CB, ZB, textures, DMA) at the beginning
of CS. This fixes "bo->space_accounted" assertion failures.
Tested-by: Jochen Rollwagen <joro-2013@t-online.de>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This fixes piglit spec/EXT_texture_array/render-1darray.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This bug was probably exposed by Jason's recent commits, which is when
this broke.
The problem is that _mesa_base_tex_format(GL_STENCIL_INDEX) returns -1.
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
ec8ebff "Check for dladdr()" erroneously uses LDFLAGS rather than LIBS to add
-ldl to the dladdr check.
Replace the workaround in 39a4cc4 of explicitly checking in libdl, with a more
correct approach of using LIBS.
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Pali Rohár <pali.rohar@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
These values are supposed to be the minimum/maximum index values used to
read from the vertex buffers. This code either copies index values out of
the old IB (so, same min/max as the original draw call), or generates a
new IB (using index values between the start and the start + count of the
old array draw info, which just happens to be what min/max_index are set
to by st_draw.c).
We were incorrectly setting the max_index in the
converting-from-glDrawArrays case to the start vertex plus the number of
vertices generated in the new IB, which broke QUADS primitive conversion
on VC4 (where max_index really has to be correct, or the kernel might
reject your draw call due to buffer overflow).
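To illustrate the distinction, here is a hedged C sketch of a QUADS to
TRIANGLES conversion for an array draw. The function name and signature
are hypothetical, and the particular vertex order used to split each quad
is just one possible choice. The point is that the new IB holds more
indices than there were vertices, while the index values themselves stay
within [start, start + count - 1].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expand an array draw of quads into a triangle
 * index buffer. 'out' must have room for (count / 4) * 6 indices.
 * Returns the number of indices written.
 */
static size_t
quads_to_tris(unsigned start, unsigned count, unsigned *out,
              unsigned *min_index, unsigned *max_index)
{
   size_t n = 0;

   assert(count >= 4);

   for (unsigned i = 0; i + 3 < count; i += 4) {
      unsigned v = start + i;

      /* Two triangles per quad, six indices for four vertices. */
      out[n++] = v + 0;
      out[n++] = v + 1;
      out[n++] = v + 3;
      out[n++] = v + 1;
      out[n++] = v + 2;
      out[n++] = v + 3;
   }

   /* The bounds describe the index *values* read from the vertex
    * buffers, not how many indices the new IB contains.
    */
   *min_index = start;
   *max_index = start + count - 1;
   /* The buggy variant would be: start + (unsigned) n. */

   return n;
}
```

For one quad starting at vertex 10, this writes 6 indices, yet the largest
value among them is 13, not 16.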
Reviewed-by: Rob Clark <robclark@freedesktop.org> (from verbal description
of the patch)
Some tests start working (useprogram-flushverts, for example) due to
getting the right vertices now. Some that used to pass start failing with
memory overflow during binning, which is weird (glsl-fs-texture2drect).
And a couple stop rendering correctly (glsl-fs-bug25902).
v2: Move the attribute format setup in the key from after search time to
before the search.
v3: Fix reading of attributes other than position (I forgot to respect
attr and stored everything in inputs 0-3, i.e. position).
We could get undefined sources in real programs from the wild, so we'll
need to turn off this debug check eventually. But for now, using undefined
sources is typically me just mistyping something.