mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2026-05-31 13:48:20 +02:00
There are currently two methods in llvmpipe code to calculate coeffs to
be used as inputs for the fragment shader. The two methods use slightly
different ways to do the floating point calculations and thus produce
slightly different results.
The decision which method to use is determined by the size of the vector
that is used by the platform.
For vectors with size of more than 128bit, a single-step method is used,
in which coeffs_init_simple() + attribs_update_simple() are called.
For vectors with size of 128bit or less, a two-step method is used, in
which coeffs_init() + attribs_update() are called.
This causes some piglit tests (clip-distance-bulk-copy,
interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with
128bit vectors (such as ppc64le or x86-64 without AVX).
This patch makes platforms with 128bit vectors use the single-step
method (aka "simple" method) instead of the two-step method.
This would make the resulting coeffs identical between more platforms,
make sure the piglit tests passes, and make debugging and maintainability
a bit easier as the generated LLVM IR will be the same for more platforms.
The performance impact is negligible for x86-64 without AVX, and
basically non-existent for ppc64le, as it can be seen from the following
benchmarking results:
- glxspheres, on ppc64le:
- original code: 4.892745317 frames/sec 5.460303857 Mpixels/sec
- with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec
- Additional 0.8% performance boost
- glxspheres, on x86-64 without AVX:
- original code: 20.16418809 frames/sec 22.50323395 Mpixels/sec
- with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec
- Additional 0.74% performance boost
- glmark2, on ppc64le:
- original code: score of 58
- with my change: score of 57
- glmark2, on x86-64 without AVX:
- original code: score of 175
- with the patch: score of 167
- Impact of of -4.5% on performance
- OpenArena, on ppc64le:
- original code: 3398 frames 1719.0 seconds 2.0 fps
255.0/505.9/2773.0/0.0 ms
- with the patch: 3398 frames 1690.4 seconds 2.0 fps
241.0/497.5/2563.0/0.2 ms
- 29 seconds faster with the patch, which is about 2%
- OpenArena, on x86-64 without AVX:
- original code: 3398 frames 239.6 seconds 14.2 fps
38.0/70.5/719.0/14.6 ms
- with the patch: 3398 frames 244.4 seconds 13.9 fps
38.0/71.9/697.0/14.3 ms
- 0.3 fps slower with the patch (about 2%)
Additional details can be found at:
http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
|
||
|---|---|---|
| .. | ||
| auxiliary | ||
| docs | ||
| drivers | ||
| include | ||
| state_trackers | ||
| targets | ||
| tests | ||
| tools | ||
| winsys | ||
| Android.common.mk | ||
| Android.mk | ||
| Automake.inc | ||
| Makefile.am | ||
| README.portability | ||
| SConscript | ||
CROSS-PLATFORM PORTABILITY GUIDELINES FOR GALLIUM3D
= General Considerations =
The state tracker and winsys driver support a rather limited number of
platforms. However, the pipe drivers are meant to run in a wide number of
platforms. Hence the pipe drivers, the auxiliary modules, and all public
headers in general, should strictly follow these guidelines to ensure
= Compiler Support =
* Include the p_compiler.h.
* Cast explicitly when converting to integer types of smaller sizes.
* Cast explicitly when converting between float, double and integral types.
* Don't use named struct initializers.
* Don't use variable number of macro arguments. Use static inline functions
instead.
* Don't use C99 features.
= Standard Library =
* Avoid including standard library headers. Most standard library functions are
not available in Windows Kernel Mode. Use the appropriate p_*.h include.
== Memory Allocation ==
* Use MALLOC, CALLOC, FREE instead of the malloc, calloc, free functions.
* Use align_pointer() function defined in u_memory.h for aligning pointers
in a portable way.
== Debugging ==
* Use the functions/macros in p_debug.h.
* Don't include assert.h, call abort, printf, etc.
= Code Style =
== Inherantice in C ==
The main thing we do is mimic inheritance by structure containment.
Here's a silly made-up example:
/* base class */
struct buffer
{
int size;
void (*validate)(struct buffer *buf);
};
/* sub-class of bufffer */
struct texture_buffer
{
struct buffer base; /* the base class, MUST COME FIRST! */
int format;
int width, height;
};
Then, we'll typically have cast-wrapper functions to convert base-class
pointers to sub-class pointers where needed:
static inline struct vertex_buffer *vertex_buffer(struct buffer *buf)
{
return (struct vertex_buffer *) buf;
}
To create/init a sub-classed object:
struct buffer *create_texture_buffer(int w, int h, int format)
{
struct texture_buffer *t = malloc(sizeof(*t));
t->format = format;
t->width = w;
t->height = h;
t->base.size = w * h;
t->base.validate = tex_validate;
return &t->base;
}
Example sub-class method:
void tex_validate(struct buffer *buf)
{
struct texture_buffer *tb = texture_buffer(buf);
assert(tb->format);
assert(tb->width);
assert(tb->height);
}
Note that we typically do not use typedefs to make "class names"; we use
'struct whatever' everywhere.
Gallium's pipe_context and the subclassed psb_context, etc are prime examples
of this. There's also many examples in Mesa and the Mesa state tracker.