82384 commits

Author SHA1 Message Date
Roland Scheidegger
4b249ed4cd softpipe: fix misleading TGSI_QUAD_SIZE usage
All these img filter loops iterate through NUM_CHANNELS, not QUAD_SIZE.
In practice both are of course the same unchangeable value (4), but it
makes the code look a bit confusing. Moreover, some of the functions were
actually given an array of 4 values according to the declaration, yet the
code was addressing values 0/4/8/12 out of it, so fix this by just saying
it's a pointer to floats like the other functions.

While here, also add comment about not quite correct filtering.

There's no actual code difference.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-03-15 19:37:59 +01:00
Roland Scheidegger
9e9d69979c softpipe: fix anisotropic filtering crash
The filt_args->offset wasn't assigned but was always used later, leading
to a crash (as far as I can tell, texel offsets don't actually make much
sense with anisotropic filtering, but because there's no explicit setting
for whether offsets are enabled, the array is always accessed).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94481

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>

CC: <mesa-stable@lists.freedesktop.org>
2016-03-15 16:40:05 +01:00
Nicolai Hähnle
4de25fa7b0 radeonsi: set DEPTH_BEFORE_SHADER based on FS_EARLY_DEPTH_STENCIL
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:59 -05:00
Nicolai Hähnle
0ffcc318e6 tgsi: add tgsi_full_src_register_from_dst helper function
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:49 -05:00
Nicolai Hähnle
c02d73af0b gallium/u_inlines: add util_copy_image_view
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:46 -05:00
Nicolai Hähnle
f6dc4f5558 st/mesa: set image access flags in st_bind_images
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:43 -05:00
Nicolai Hähnle
71a1b54b33 gallium: add access field to pipe_image_view
This allows drivers to make smarter decisions e.g. about whether the image
has to be decompressed.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:40 -05:00
Nicolai Hähnle
8c497b8fb5 st/glsl_to_tgsi: set FS_EARLY_DEPTH_STENCIL when required
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:37 -05:00
Nicolai Hähnle
e526f930aa tgsi: add TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:33 -05:00
Nicolai Hähnle
1c0cee8764 st/glsl_to_tgsi: set memory access type on image intrinsics
This is required to preserve the image variable's coherent/restrict/volatile
qualifiers in TGSI.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:30 -05:00
Nicolai Hähnle
dfcf420412 st/glsl_to_tgsi: provide Texture and Format information for image ops
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:26 -05:00
Nicolai Hähnle
3243b6fc97 tgsi: add Texture and Format to tgsi_instruction_memory
Frontends should have this information readily available, and it simplifies
image LOAD/STORE/ATOM* handling especially with indirect image access.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-14 17:24:02 -05:00
Nicolai Hähnle
9b68bdf6f8 get: reconcile aliasing enums for MaxCombinedShaderOutputResources
The enums MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS and
MAX_COMBINED_SHADER_OUTPUT_RESOURCES are equal and should therefore only
appear once.

Noticed while implementing ARB_shader_image_load_store without previously
implementing SSBO.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-14 17:19:14 -05:00
Francisco Jerez
b054605722 i965/fs: Restrict inequality that can only hold equal in saturate propagation.
Should have no functional change.  The IP value of an instruction that
reads src_var cannot possibly be after the end of the live interval of
the variable it's reading from, by the definition of live interval.
Might save future readers a momentary WTF while trying to understand
this code.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-14 14:58:19 -07:00
Francisco Jerez
7d7990cf65 i965/vec4: Consider removal of no-op MOVs as progress during register coalesce.
Bug found by the liveness analysis validation pass that will be
introduced in a later commit.  The no-op MOV check in
opt_register_coalesce() was removing instructions, which makes the
cached liveness analysis calculation inconsistent with the shader IR.
We were failing to set progress to true in that case though, which
means that invalidate_live_intervals() wouldn't necessarily be called
at the end of the function.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-14 14:58:11 -07:00
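
Illustration for 7d7990cf65 above: a minimal sketch of the pattern the message describes, assuming a simplified IR. The types and functions here (inst_sketch, shader_sketch, remove_noop_movs, invalidate_liveness) are hypothetical and are not the i965 code. The point is only that any path which removes an instruction must report progress, because progress is what triggers invalidation of the cached analysis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified IR: each instruction is a MOV dst <- src. */
struct inst_sketch {
    int dst;
    int src;
    bool removed;
};

struct shader_sketch {
    struct inst_sketch *insts;
    size_t count;
    bool liveness_valid;   /* stands in for a cached liveness analysis */
};

/* Marks the cached analysis as stale so it is recomputed on next use. */
void invalidate_liveness(struct shader_sketch *s)
{
    s->liveness_valid = false;
}

/* Removes MOVs whose source and destination are the same register.
 * Every removal counts as progress; progress triggers invalidation. */
bool remove_noop_movs(struct shader_sketch *s)
{
    assert(s != NULL);
    bool progress = false;

    for (size_t i = 0; i < s->count; i++) {
        struct inst_sketch *in = &s->insts[i];
        if (!in->removed && in->dst == in->src) {
            in->removed = true;
            progress = true;   /* the step the original bug was missing */
        }
    }

    if (progress)
        invalidate_liveness(s);
    return progress;
}
```

Running the pass a second time on the same shader finds nothing left to remove and returns false, so the analysis is not invalidated needlessly.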
Francisco Jerez
93be4158ae i965/fs: Add missing analysis invalidation in fixup_3src_null_dest().
Bug found by the liveness analysis validation pass that will be
introduced in a later commit.  fixup_3src_null_dest() was allocating
registers, which makes the cached liveness analysis calculation
incomplete, so it must be invalidated.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-14 14:57:58 -07:00
Francisco Jerez
6691c03fd3 i965/fs: Add missing analysis invalidation in opt_sampler_eot().
Bug found by the liveness analysis validation pass that will be
introduced in a later commit.  opt_sampler_eot() was allocating
registers and inserting and removing instructions, which makes the
cached liveness analysis calculation inconsistent with the shader IR,
so it must be invalidated.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-14 14:56:02 -07:00
Hans de Goede
4d02e91e49 clover: Fix pipe_grid_info.indirect not being initialized.
After pipe_grid_info.indirect was introduced, clover was not modified
to set it, causing it to pass uninitialized memory for it to launch_grid.

This commit fixes this by zeroing the entire pipe_grid_info struct when
declaring it, to avoid similar problems popping up in the future.

Cc: "11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
[ Francisco Jerez: Trivial codestyle fix. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-14 14:12:42 -07:00
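
Illustration for 4d02e91e49 above: a minimal sketch of zero-initializing a struct at its declaration, written in C. The struct grid_info_sketch and the function make_grid_info are hypothetical stand-ins, not the real pipe_grid_info or the clover code. With this form, a member added to the struct later starts out as zero/NULL even in callers that were never updated to set it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical launch descriptor; NOT the real pipe_grid_info layout. */
struct grid_info_sketch {
    unsigned block[3];
    unsigned grid[3];
    const void *indirect;       /* imagine this member was added later */
    unsigned indirect_offset;
};

/* Builds a descriptor: every member is zeroed first, then the members
 * this caller knows about are filled in. */
struct grid_info_sketch make_grid_info(const unsigned block[3],
                                       const unsigned grid[3])
{
    assert(block != NULL && grid != NULL);

    struct grid_info_sketch info = {0};   /* zero-initializes all members */

    for (int i = 0; i < 3; i++) {
        info.block[i] = block[i];
        info.grid[i] = grid[i];
    }
    return info;   /* info.indirect is NULL, not stack garbage */
}
```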
Sarah Sharp
af06190760 mesa: docs: Intel i965 hardware limits.
This should help the next person working on hardware enabling figure out
where in the Intel PRMs to find the magic platform hardware values.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
2016-03-14 14:00:29 -07:00
Sarah Sharp
0f5bfc7f01 mesa: docs: i965: Use correct doxygen groupings syntax
When reading the source code, it's useful to indicate that a group of
fields in a struct are related in some way. There were several places
where people tried to group related structure members with the @{
syntax, without realizing they also needed to add the \name syntax in
order to generate correct doxygen html.

There are several files with groupings that look like this:

struct foo {
    /**
     * Related fields description
     * @{
     */
    int bar;
    char baz;
    /** @} */
    long qux;
}

However, the doxygen syntax for grouping is:

struct foo {
    /**
     * \name Related fields description
     * @{
     */
    int bar;
    char baz;
    /** @} */
    long qux;
}

https://www.stack.nl/~dimitri/doxygen/manual/grouping.html

Without the group name definition, the fields don't get properly
grouped. Instead, the group description is applied to the first field.

Fix the Intel hardware information structure, brw_device_info to
properly group the GPU hardware limitations and hardware quirks fields.

Once you've run `cd doxygen; make clean; make all`,
updated documentation can be found at

mesa/doxygen/i965/structbrw__device__info.html

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
2016-03-14 14:00:29 -07:00
Bruce Cherniak
e9d68cc3da gallium/swr: Resource management
Better tracking of resource state and synchronization.
A follow-on commit will clean up resource functions into a new
swr_resource.cpp file.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2016-03-14 14:07:48 -05:00
Marek Olšák
7a2333e4ef configure.ac: require libdrm 2.4.66 for drmGetDevice
Since 737b6ed13e,
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c no longer compiles:
error: unknown type name ‘drmDevicePtr’
2016-03-14 16:42:41 +01:00
Francisco Jerez
63250d8178 i965: Remove useless IR self-destruct backend_shader method.
From the point it's constructed the CFG contains the only existing
copy of the program IR, and it never becomes invalid.  Calling
backend_shader::invalidate_cfg would have destroyed the program
structure irrecoverably -- We weren't calling it at all for a good
reason.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-13 18:07:53 -07:00
Pierre Moreau
8c7acd87af nv50,nvc0: Set only NEW_CP_GLOBALS upon binding
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-13 22:34:50 +01:00
Rob Clark
e73ac84b93 freedreno/ir3: lower extract_byte/word
The following commits broke things by starting to feed us unhandled
extract_u16/extract_u8 opcodes:

commit 905ff86198
Author:     Matt Turner <mattst88@gmail.com>
AuthorDate: Wed Feb 3 14:28:31 2016 -0800
Commit:     Matt Turner <mattst88@gmail.com>
CommitDate: Fri Mar 4 11:52:34 2016 -0800

    nir: Recognize open-coded extract_u16.

commit 76289fbfa8
Author:     Matt Turner <mattst88@gmail.com>
AuthorDate: Thu Jan 21 09:09:48 2016 -0800
Commit:     Matt Turner <mattst88@gmail.com>
CommitDate: Fri Mar 4 11:52:34 2016 -0800

    nir: Recognize open-coded extract_u8.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 14:10:57 -04:00
Ilia Mirkin
c1e4a6bfbf nv50,nvc0: handle SQRT lowering inside the driver
First off, st/mesa lowers DSQRT incorrectly (it uses CMP to attempt to
find out whether the input is less than 0). Secondly the current
approach (x * rsq(x)) behaves poorly for x = inf - a NaN is produced
instead of inf.

Instead we switch to the less accurate rcp(rsq(x)) method - this behaves
nicely for all valid inputs. We still don't do this for DSQRT since the
RSQ/RCP ops are *really* inaccurate, and don't even have Newton-Raphson
steps right now. Eventually we should have a separate library function
for DSQRT that does it more precisely (and perhaps move this lowering to
the post-opt phase).

This fixes a number of dEQP precision tests that were expecting better
behavior for infinite inputs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-13 13:17:24 -04:00
Ilia Mirkin
b3e7fb5234 nv50/ir: avoid folding mul + add if the mul has a dnz
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-03-13 13:17:24 -04:00
Ilia Mirkin
a651bc027d nvc0: fix blit triangle size to fully cover FB's > 8192x8192
The idea is that a single triangle will cover the whole area being
drawn, allowing the blit shader to do its work. However the max fb size
is 16384x16384, which means that the triangle we draw needs to be twice
that in order to cover the whole area fully. Increase the size of the
triangle to 32768x32768.

This fixes a number of dEQP tests that were failing because a blit was
involved which would miss some of the resulting texture.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-13 13:17:24 -04:00
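
Illustration for a651bc027d above: a minimal sketch of why the triangle must be twice the framebuffer size. It assumes, for illustration only, a right triangle with vertices (0,0), (leg,0) and (0,leg); the commit message does not spell out the vertex layout. Under that assumption a point is covered when x + y <= leg, so covering the far corner (w,h) needs leg >= w + h.

```c
#include <assert.h>
#include <stdbool.h>

/* Point-in-triangle test for the assumed right triangle with vertices
 * (0,0), (leg,0), (0,leg): covered iff x + y <= leg. */
bool triangle_covers_point(unsigned leg, unsigned x, unsigned y)
{
    assert(leg > 0);
    return (unsigned long long)x + y <= leg;
}

/* A w x h rectangle anchored at the origin is fully covered iff its
 * far corner (w,h) is covered, since that corner maximizes x + y. */
bool triangle_covers_rect(unsigned leg, unsigned w, unsigned h)
{
    return triangle_covers_point(leg, w, h);
}
```

With these helpers, a leg of 16384 covers an 8192x8192 area but not 16384x16384, while a leg of 32768 covers the full 16384x16384 framebuffer.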
Rob Clark
01b071d530 freedreno: OUT_RELOC vs OUT_RELOCW fixes
Make sure we use OUT_RELOCW() in cases where the buffer is written to.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
f68c6951b8 freedreno/a4xx: hw binning
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
b3fe196e21 freedreno/a4xx: use generated headers for draw initiator
No need to open-code this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
2224ba5976 freedreno/a4xx: remove RB_RENDER_CONTROL patching
Bitfields were shuffled around for the better on a4xx, so we don't need
any patching on this one.  It appears to be something we set entirely in
the gmem code so no conflict between tiling and render state like we had
in a3xx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
8824a765a2 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
476551a21f freedreno/a3xx: move where we deal w/ binning FS
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
dd9135c452 freedreno/a4xx: move where we deal w/ binning FS
Move where we pick dummy FS for binning pass, so the whole driver sees
the same dummy/no-op FS stage.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
09b3447344 freedreno/a3xx: constify the shader variants
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:41 -04:00
Rob Clark
5b955f09f7 freedreno/a4xx: constify the shader variants
Most of the driver just needs read-only access, so constify.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:40 -04:00
Rob Clark
d9395e4ed8 freedreno/a3xx: remove duplicate mark of end of binning cmds
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 12:23:40 -04:00
Nicolai Hähnle
28d2a7e67c radeonsi: avoid crash when a sampler state is bound for a buffer texture
Sampler states don't really make sense with buffer textures, but they
can be set anyway, so we need to be defensive here. This bug was lurking
for a while and was finally noticed due to PBO uploads setting sampler
states.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Tested-by: Shawn Starr <shawn.starr@rogers.com>
2016-03-13 09:37:23 -05:00
Matt Turner
61b10b4eb7 i965: Use foreach_in_list_reverse_safe() macro.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-03-12 19:23:50 -08:00
Jason Ekstrand
98d58e7320 nir/clone: Add support for cloning a single function_impl
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
036b209484 nir/validate: Better function validation
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
f86f3c90aa nir/print: Better function argument printing
Since we aren't going to put the function parameters or the return variable
in the list of locals, it won't get a proper declaration.  This changes
nir_print to print the type along with each parameter or return variable.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
13969565f9 nir/print: Factor variable name lookup into a helper
Otherwise, we have a problem when we go to print functions with arguments
because their names get added to the hash table during declaration which
happens after we print the prototype.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
e4bebe8a02 nir: Create function parameters in function_impl_create
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
066d3c115e nir: Add a helper for creating a "bare" nir_function_impl
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
2ef4754a20 nir: Add a new "param" variable mode for parameters and return variables
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand
41ae553fda nir/glsl: Remove dead function parameter handling code
NIR has never been used on IR where we haven't already done function
inlining so this code has been dead from the beginning.  Let's just get rid
of it for now.  We can always put it back in if we decide to use NIR for
function inlining at some point in the future.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jordan Justen
b83785d86d anv/gen7: Add stall and flushes before switching pipelines
This is a port of 18c76551ee from OpenGL to Vulkan.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 13:13:37 -08:00
Jordan Justen
c8ec65a1f5 anv: Add flush_pipeline_before_pipeline_select
flush_pipeline_before_pipeline_select adds workarounds required before
switching the pipeline.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 13:13:37 -08:00