Accompanying patch "st/mesa: only try to create 1x msaa surfaces for
'fake' msaa" requires driver to report max_samples=1 to enable "fake"
msaa. Previously, 0 and 1 were treated equivalently in st_init_extensions()
and either could enable "fake" msaa.
This patch raises the swr default msaa_max_count from 0 to 1, so that
swr_is_format_supported will report max_samples=1.
Real msaa can still be enabled by exporting SWR_MSAA_MAX_COUNT with a
pow2 value between 2 and 16.
This patch is necessary to prevent an OpenSWR regression resulting from
the st/mesa patch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
This can be used to guard support for EXT_memory_object and related
extensions.
v2: update gallium docs
v3 (Timothy Arceri):
- add cap to nv50
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
v2: rename cap to PIPE_CAP_QUERY_SO_OVERFLOW and be a bit more explicit
in the documentation
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The last user of the function was removed with earlier commit.
Fixes: 50842e8a93 ("swr: replace gallium->swr format enum conversion")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Fixes performance regression from f50aa21456 - was forcing internal
code generation to target AVX (no gather, etc).
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
If size of client memory copy is too large, don't copy. The draw will
access user-buffer directly and then block. This is faster and more
efficient than queuing many large client draws.
Applications that still use large client arrays benefit from this. VMD
is an example.
The threshold for this path defaults to 32KB. This value can be
overridden by setting environment variable SWR_CLIENT_COPY_LIMIT.
v2: Use #define for default value, rather than hard-coded constant.
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Moved reading of environment config options out of
swr_create_screen_internal, into a separate swr_validate_env_options.
This is to keep from cluttering create_screen.
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Use the SWR rasterizer API through the table returned from
SwrGetInterface rather than referencing the functions directly.
This will allow us to move to a model of having the driver dynamically
load the appropriate swr architecture library.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
This patch limits the number of items on the fence work queue (the
deferred deletion list) by submitting a sync fence when the queue size
exceeds a threshold. This initiates deferred deletion of all resources
on the list and decreases the total amount of memory held waiting for
"deferred deletion".
This resolves bug 101467 filed against swr for the piglit
streaming-texture-leak test. For those running on smaller memory
(16GB?) systems, this will prevent oom-killer.
Thus far, we have not seen any real world applications that exhibit
behavior like the streaming-texture-leak test; as any form of pipeline
flush will trigger the defer queue and properly free any retained
allocations. But, this addresses those as well.
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Whether bindless texture operations are supported by the
underlying driver.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The next patch will use it. This is really for svga and GL2-level drivers.
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
v3: list piglit tests fixed by this patch. Fixed typo Tim pointed out.
v2: Reword commit message to more closely adhere to community
guidelines.
This patch moves msaa resolve down into core/StoreTiles where the
surface format conversion routines are available. The previous
"experimental" resolve was limited to 8-bit unsigned render targets.
This fixes a number of piglit msaa tests by adding resolve support for
all the render target formats we support.
Specifically:
layered-rendering/gl-layer-render: fail->pass
layered-rendering/gl-layer-render-storage: fail->pass
multisample-formats *[2,4,8,16] gl_arb_texture_rg: crash->pass
multisample-formats *[2,4,8,16] gl_ext_texture_snorm: crash->pass
multisample-formats *[2,4,8,16] gl_arb_texture_float: fail->pass
multisample-formats *[2,4,8,16] gl_arb_texture_rg-float: fail->pass
MSAA is still disabled by default, but can be enabled with
"export SWR_MSAA_MAX_COUNT=4" (1,2,4,8,16 are options)
The default is 0, which is disabled.
This patch improves the number of multisample-formats supported by swr,
and fixes several crashes currently in the 17.1 branch. Therefore, it
should be considered for inclusion in the 17.1 stable release. Being
disabled by default, it poses no risk to most users of swr.
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
cc: mesa-stable@lists.freedesktop.org
This patch enables multisample antialiasing in the OpenSWR software renderer.
MSAA is a proof-of-concept/work-in-progress with bug fixes and performance
on the way. We wanted to get the changes out now to allow several customers
to begin experimenting with MSAA in a software renderer. So as not to
impact current customers, MSAA is turned off by default - previous
functionality and performance remain intact. It is easily enabled via
environment variables, as described below.
It has only been tested with the glx-lib winsys. The intention is to
enable other state-trackers, both Windows and Linux and more fully support
FBOs.
There are 2 environment variables that affect behavior:
* SWR_MSAA_FORCE_ENABLE - force MSAA on, for apps that are not designed
for MSAA... Beware, results will vary. This is mainly for testing.
* SWR_MSAA_MAX_SAMPLE_COUNT - sets maximum supported number of
samples (1,2,4,8,16), or 0 to disable MSAA altogether.
(The default is currently 0.)
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
Removed unnecessary and probably wrong PIPE_BIND_SCANOUT and PIPE_BIND_SHARED
flags in favor of check on single PIPE_BIND_DISPLAY_TARGET flag.
Reference llvmpipe change <bee4c7718a3bd57e3d99f0913d9081cd13fe5fd>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Changes since v1:
- Add pipe caps for etnaviv, freedreno, swr and virgl
Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
all drivers support it
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com> (VMware driver only)
Nouveau does not currently have logic to implement this as a library
function. Even though such a library could be written, there's no big
advantage to do it that way for now given that int64 is a very uncommon
use-case. Allow a driver to expose INT64 without supporting division and
modulo operations.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
v1.1: move to using a normal CAP. (Marek)
v2: fill in the cap everywhere
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Defer delete on regular resources. This ensures that any work being done
on the resource is completed before freeing up the resource's memory.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
wrap lp_bld_type.h around extern "C".
Windows decorates global variables, so when used from .cpp files, need
to use an undecorated version.
Also, removed related and unneeded code from swr_screen.cpp
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Work can now be added to fences and triggered by fence completion. This
allows for deferred resource deletion, and other asynchronous tasks.
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
The components count the number of individual values, not the number of
slots.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
There is no support for resuming streamout. Furthermore, this also
controls glDrawTransformFeedback functionality which requires the same
ability to query how many primitives were sent out of TF.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Drivers that support this benefit by saving one lowering pass in the
GLSL-to-TGSI conversion.
radeonsi already supports this because all outputs are stored in temporary
variables before the export (except for TCS outputs, which have always
been readable in TGSI anyway due to their special semantics).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
I find this a lot more readable and compact - much easier to scan
through the list and see what's on and what's off.
No functional change intended.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
This is the same value that llvmpipe uses. Since swr uses the same
sampler logic, makes sense for this value to also be the same. Most
applications don't care.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>