Per pixel stats are cached but were not always being flushed as threads
moved from one draw context to the next. Added an explicit flush to allow
all archrast objects to flush any cached events.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Performance is now 50x faster with archrast now that we're properly
filtering out all of the rdtsc begin/end.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Autogen functions that instantiates different BackendPixelRate templates.
Functions get split into separate files after reaching a user defined
threshold (currently 512 per file) to speed up compilation.
This change will enable the addition of more template flags in the pixel
back end.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
The author is Heiko Przybyl(CC'ing), the patch is rebased on top of Bartosz Tomczyk's one per Dieter Nützel's comment.
Tested-by: Constantine Charlamov <Hi-Angel@yandex.ru>
v2: Resend the patch again through git-email. The prev. rebase was sent
through Thunderbird, which screwed up tab characters, making the patch
not apply.
--------------
When fixing the stalls on evergreen I introduced leaking of the useinfo
structure(s). Sorry. Instead of allocating a new object to hold 3 values
where only one is actually used, rework the list to just store the node
pointer. Thus no allocating and deallocation is needed. Since use_info
and use_kind aren't used anywhere, drop them and reduce code complexity.
This might also save some small amount of cycles.
Thanks to Bartosz Tomczyk for finding the bug.
Reported-by: Bartosz Tomczyk <bartosz.tomczyk86 at gmail.com <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>>
Signed-off-by: Heiko Przybyl <lil_tux at web.de <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>>
Supersedes: https://patchwork.freedesktop.org/patch/135852
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
CID 1399479: Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking velems suggests that it may be null,
but it has already been dereferenced on all paths leading to the check.
Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Like in a few other places in that radeon_drm_bo.c file.
CID 715739.
Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
fc_sp variable should indicate number of elements in
fc_stack array, but fc_sp was increased at beginning of fc_pushlevel
function. It leads to situation where idx=0 was never used, and last
32 element was stored outside fs_stack array.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
The second check in the old code looked pretty much unreachable, esp.
because it's not obvious that "max_entries" could be zero. To find out
that it was intentional I had to run some checks, and to dig into
the old versions of the file.
So, rewrite the check to make the intention clear.
v2: s/r600/r600g in the title, and per Dieter Nützel's comment wrap
lines of condition.
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
There should be minimal gain, if any, for nvc0, but nv50 may end up
noticing more often that the lod argument is uniform. This, in turn,
will remove the need for some unnecessary transformations, which were
being hit due to the checks being done pre-ssa.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Helps mainly Feral-ported games, due to their use of fma()
shader-db changes:
total instructions in shared programs : 3901147 -> 3842505 (-1.50%)
total gprs used in shared programs : 471258 -> 467359 (-0.83%)
total local used in shared programs : 27405 -> 27361 (-0.16%)
total bytes used in shared programs : 35749888 -> 35214176 (-1.50%)
local gpr inst bytes
helped 17 1829 4091 4091
hurt 4 44 3 3
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Fixes a bunch of piglit crashes that hit an assert() when trying
to delete the framebuffer. The assert() was triggered because
WinSysDrawBuffer was set to NULL before glDeleteFramebuffers()
was called.
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
threaded gallium can't use pipe_context's LLVM target machine, because
create_shader_selector can be called from a non-driver thread.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This makes bin/gl-3.2-layered-rendering-gl-layer-render fail only with
2DMS_ARRAY, which is expected given the lackluster MSAA support. However
all the regular types pass.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Currently the GLSL-to-TGSI translation pass assumes it can use
floating point source modifiers on the UCMP instruction. See the bug
report linked below for an example where an unrelated change in the
GLSL built-in lowering code for atan2 (e9ffd12827)
caused the generation of floating-point ir_unop_neg instructions
followed by ir_triop_csel, which is translated into UCMP with a negate
modifier on back-ends with native integer support.
Allowing floating-point source modifiers on an integer instruction
seems like rather dubious design for a transport IR, since the same
semantics could be represented as a sequence of MOV+UCMP instructions
instead, but supposedly this matches the expectations of TGSI
back-ends other than tgsi_exec, and the expectations of the DX10 API.
I take no responsibility for future headaches caused by this
inconsistency.
Fixes a regression of piglit glsl-fs-tan-1 on softpipe introduced by
the above-mentioned glsl front-end commit. Even though the commit
that triggered the regression doesn't seem to have made it to any
stable branches yet, this might be worth back-porting since I don't
see any reason why the bug couldn't have been reproduced before that
point.
Suggested-by: Roland Scheidegger <sroland@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99817
Reviewed-by: Roland Scheidegger <sroland@vmware.com>