mesa/src/gallium at 39b4dfe6ab1003863778a25c091c080e098833ec - fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-31 13:48:20 +02:00

History

Oded Gabbay 39b4dfe6ab llvmpipe: use simple coeffs calc for 128bit vectors There are currently two methods in llvmpipe code to calculate coeffs to be used as inputs for the fragment shader. The two methods use slightly different ways to do the floating point calculations and thus produce slightly different results. The decision which method to use is determined by the size of the vector that is used by the platform. For vectors with size of more than 128bit, a single-step method is used, in which coeffs_init_simple() + attribs_update_simple() are called. For vectors with size of 128bit or less, a two-step method is used, in which coeffs_init() + attribs_update() are called. This causes some piglit tests (clip-distance-bulk-copy, interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with 128bit vectors (such as ppc64le or x86-64 without AVX). This patch makes platforms with 128bit vectors use the single-step method (aka "simple" method) instead of the two-step method. This would make the resulting coeffs identical between more platforms, make sure the piglit tests passes, and make debugging and maintainability a bit easier as the generated LLVM IR will be the same for more platforms. The performance impact is negligible for x86-64 without AVX, and basically non-existent for ppc64le, as it can be seen from the following benchmarking results: - glxspheres, on ppc64le: - original code: 4.892745317 frames/sec 5.460303857 Mpixels/sec - with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec - Additional 0.8% performance boost - glxspheres, on x86-64 without AVX: - original code: 20.16418809 frames/sec 22.50323395 Mpixels/sec - with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec - Additional 0.74% performance boost - glmark2, on ppc64le: - original code: score of 58 - with my change: score of 57 - glmark2, on x86-64 without AVX: - original code: score of 175 - with the patch: score of 167 - Impact of of -4.5% on performance - OpenArena, on ppc64le: - original code: 3398 frames 1719.0 seconds 2.0 fps 255.0/505.9/2773.0/0.0 ms - with the patch: 3398 frames 1690.4 seconds 2.0 fps 241.0/497.5/2563.0/0.2 ms - 29 seconds faster with the patch, which is about 2% - OpenArena, on x86-64 without AVX: - original code: 3398 frames 239.6 seconds 14.2 fps 38.0/70.5/719.0/14.6 ms - with the patch: 3398 frames 244.4 seconds 13.9 fps 38.0/71.9/697.0/14.3 ms - 0.3 fps slower with the patch (about 2%) Additional details can be found at: http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>		2015-11-04 02:38:53 +01:00
..
auxiliary	gallivm: disable f16c when not using AVX	2015-10-26 16:45:49 +01:00
docs	gallium: add PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS	2015-10-28 11:52:17 +01:00
drivers	llvmpipe: use simple coeffs calc for 128bit vectors	2015-11-04 02:38:53 +01:00
include	gallium/swrast: fix front buffer blitting. (v2)	2015-10-31 16:04:36 +10:00
state_trackers	gallium/swrast: fix front buffer blitting. (v2)	2015-10-31 16:04:36 +10:00
targets	virgl: add driver for virtio-gpu 3D (v2)	2015-10-23 14:40:07 +10:00
tests	gallium: add flags parameter to pipe_screen::context_create	2015-08-26 19:25:18 +02:00
tools	gallium: add an index argument to create_query	2014-07-01 11:34:31 -04:00
winsys	virgl/vtest: fix extra malloc	2015-10-31 18:05:33 +10:00
Android.common.mk	android: enable the radeonsi driver	2015-06-09 12:25:50 -07:00
Android.mk	winsys/amdgpu: add a new winsys for the new kernel driver	2015-08-14 15:02:28 +02:00
Automake.inc	pipe-loader: remove pipe_loader_sw_probe_xlib	2015-07-13 19:57:38 +01:00
Makefile.am	virgl/vtest: add vtest driver	2015-10-23 14:40:07 +10:00
README.portability	gallium: replace INLINE with inline	2015-07-21 17:52:16 -04:00
SConscript	scons: don't build the kms-dri winsys	2015-07-22 16:35:25 +01:00

README.portability

	      CROSS-PLATFORM PORTABILITY GUIDELINES FOR GALLIUM3D 


= General Considerations =

The state tracker and winsys driver support a rather limited number of
platforms. However, the pipe drivers are meant to run in a wide number of
platforms. Hence the pipe drivers, the auxiliary modules, and all public
headers in general, should strictly follow these guidelines to ensure


= Compiler Support =

* Include the p_compiler.h.

* Cast explicitly when converting to integer types of smaller sizes.

* Cast explicitly when converting between float, double and integral types.

* Don't use named struct initializers.

* Don't use variable number of macro arguments. Use static inline functions
instead.

* Don't use C99 features.

= Standard Library =

* Avoid including standard library headers. Most standard library functions are
not available in Windows Kernel Mode. Use the appropriate p_*.h include.

== Memory Allocation ==

* Use MALLOC, CALLOC, FREE instead of the malloc, calloc, free functions.

* Use align_pointer() function defined in u_memory.h for aligning pointers
 in a portable way.

== Debugging ==

* Use the functions/macros in p_debug.h.

* Don't include assert.h, call abort, printf, etc.


= Code Style =

== Inherantice in C ==

The main thing we do is mimic inheritance by structure containment.

Here's a silly made-up example:

/* base class */
struct buffer
{
  int size;
  void (*validate)(struct buffer *buf);
};

/* sub-class of bufffer */
struct texture_buffer
{
  struct buffer base;  /* the base class, MUST COME FIRST! */
  int format;
  int width, height;
};


Then, we'll typically have cast-wrapper functions to convert base-class 
pointers to sub-class pointers where needed:

static inline struct vertex_buffer *vertex_buffer(struct buffer *buf)
{
  return (struct vertex_buffer *) buf;
}


To create/init a sub-classed object:

struct buffer *create_texture_buffer(int w, int h, int format)
{
  struct texture_buffer *t = malloc(sizeof(*t));
  t->format = format;
  t->width = w;
  t->height = h;
  t->base.size = w * h;
  t->base.validate = tex_validate;
  return &t->base;
}

Example sub-class method:

void tex_validate(struct buffer *buf)
{
  struct texture_buffer *tb = texture_buffer(buf);
  assert(tb->format);
  assert(tb->width);
  assert(tb->height);
}


Note that we typically do not use typedefs to make "class names"; we use
'struct whatever' everywhere.

Gallium's pipe_context and the subclassed psb_context, etc are prime examples 
of this.  There's also many examples in Mesa and the Mesa state tracker.