Commit graph

29173 commits

Author SHA1 Message Date
Nicolai Hähnle
8dbf2a8570 radeonsi: add DRAWID parameter to vertex shaders
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-09 15:56:04 +02:00
Nicolai Hähnle
febb5dbf72 radeonsi: wire up TGSI_SEMANTIC_BASEINSTANCE
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-09 15:56:03 +02:00
Nicolai Hähnle
d34292a77f radeonsi: remove an incorrect assertion
Byte indices don't need any alignment, so remove this assertion (it got moved
into a path where a piglit test hit it during the refactoring of
commit 64ff23a58c).

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-09 15:56:03 +02:00
Nicolai Hähnle
2852dedaa0 radeonsi: flush TC L2 cache for indirect draw data
This fixes a bug when indirect draw data is generated by transform
feedback.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-09 15:56:03 +02:00
Nicolai Hähnle
76c4a3b567 radeonsi/sid: add additional bits for the DRAW_(INDEX)_INDIRECT_MULTI packets
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-09 15:56:03 +02:00
Marek Olšák
06b2fd04f6 ddebug: dump driver states and shaders for apitrace calls
I think this was an oversight when the PIPE_DUMP flags were added.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-09 15:35:42 +02:00
Nicolai Hähnle
96bbb620a5 radeonsi: add has_draw_indirect_multi flag
Prefer to use DRAW_(INDEX)_INDIRECT_MULTI when available in the firmware.

Versions for SI and CI are already added, as provided by the firmware team, but
keep in mind that they won't currently be used since the radeon kernel module
has no interface to query the firmware version.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:53:06 +02:00
Nicolai Hähnle
5c343cce0f radeonsi: transpose indirect/index draw dispatch
This allows better code sharing for indirect draw calls.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:53:04 +02:00
Nicolai Hähnle
64ff23a58c radeonsi: move index buffer calculations in si_emit_draw_packets up
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:53:02 +02:00
Nicolai Hähnle
cf7d18b75c radeonsi: unify emitting PKT3_SET_BASE for indirect draws
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:52:59 +02:00
Nicolai Hähnle
e0736c438c winsys/amdgpu: query ME/PFP/CE firmware versions
The radeon kernel module doesn't have the firmware query interface, so the
corresponding values will remain 0.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:52:41 +02:00
Nicolai Hähnle
7f5a8dc27e radeonsi: move spi_ps_input_addr override outside of the loop
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:51:32 +02:00
Nicolai Hähnle
287822ee33 radeonsi: drop unnecessary u_pstipple.h include
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:51:29 +02:00
Nicolai Hähnle
3e4c5693a1 radeonsi: do not pass the return type to buffer_load_const
Overriding it is not allowed anyway, and actually led to a crash when polygon
stippling was used with monolithic shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:51:26 +02:00
Marek Olšák
3fb4a9b3b3 Revert "gallium/radeon: count contexts"
This reverts commit b403eb3385.

Not needed.
2016-08-06 17:29:23 +02:00
Marek Olšák
11b1d064a3 radeonsi: add GLSL lit tests
They can only be run manually as described in HOW_TO_RUN.
It should help catch suboptimal code generation.

Some of the tests already fail.

v2: rename the tests to *.glsl,
    fix lit.cfg to find FileCheck

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2016-08-06 16:11:43 +02:00
Marek Olšák
35942ee8a8 radeonsi: add a standalone compiler amdgcn_glslc
This will be used by GLSL lit tests.

For developers only. It shouldn't be distributable and it doesn't use
the Mesa build system.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 16:11:39 +02:00
Marek Olšák
ad8af99c86 radeonsi: add environment variable SI_FORCE_FAMILY
This will be used by: amdgcn_glslc -mcpu=[family]

It can also be used for shader-db if you want stats for a different family.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 16:11:35 +02:00
Marek Olšák
d0646cc745 winsys/radeon: implement cs_get_next_fence
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:31 +02:00
Marek Olšák
63b99590db winsys/amdgpu: implement cs_get_next_fence
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:30 +02:00
Marek Olšák
04a6cb63aa gallium/radeon: add cs_get_next_fence winsys callback
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:30 +02:00
Marek Olšák
b403eb3385 gallium/radeon: count contexts
We don't wanna use unflushed fences when we have multiple contexts.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:30 +02:00
Marek Olšák
16d568d911 gallium/radeon: count gfx IB flushes
This will be used as a counter for whether fence_finish needs to flush
the IB.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:30 +02:00
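
The message for 16d568d911 says only that the flush count will tell fence_finish whether it needs to flush the IB. One way such a counter could be used is sketched below. All struct and function names are hypothetical and are not taken from the driver. The assumption is that a fence records the counter value when it is created, so an unchanged counter means the IB holding the fence has not been submitted yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical context: the counter is bumped on every gfx IB flush. */
struct sketch_ctx {
    uint32_t num_gfx_ib_flushes;
};

/* Hypothetical fence: remembers which IB it was created in. */
struct sketch_fence {
    struct sketch_ctx *ctx;
    uint32_t ib_index;
};

static void sketch_flush_gfx_ib(struct sketch_ctx *ctx)
{
    /* A real driver would submit the IB to the kernel here. */
    ctx->num_gfx_ib_flushes++;
}

static void sketch_fence_init(struct sketch_fence *fence,
                              struct sketch_ctx *ctx)
{
    fence->ctx = ctx;
    fence->ib_index = ctx->num_gfx_ib_flushes;
}

/* True while the IB the fence belongs to is still unflushed. */
static bool sketch_fence_needs_flush(const struct sketch_fence *fence)
{
    return fence->ib_index == fence->ctx->num_gfx_ib_flushes;
}

static void sketch_fence_finish(struct sketch_fence *fence)
{
    if (sketch_fence_needs_flush(fence))
        sketch_flush_gfx_ib(fence->ctx);
    /* Waiting on the submitted work would follow here. */
}
```

Comparing two integers avoids an unconditional flush in fence_finish when the IB has already been submitted by some other path.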
Marek Olšák
c5ff0d3e65 gallium/radeon: move radeon_winsys::cs_memory_below_limit to drivers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
076db67217 gallium/radeon: inline radeon_winsys::query_memory_usage
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
9646ae7799 gallium/radeon/winsyses: expose per-IB used_vram and used_gart to drivers
The following patches will use this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
1c8f17599e gallium/radeon/winsyses: print CS submission error number
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
0edc2e433e radeonsi: flush if constant, shader, and streamout buffers use too much memory
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
c3efdeb8dd radeonsi: flush if sampler views and images use too much memory
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
d82cfab84c radeonsi: deal with high vertex buffer memory usage correctly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
e62caf576e radeonsi: take compute shader and dispatch indirect memory usage into account
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
c56ecb68e7 radeonsi: take scratch buffer and draw indirect memory usage into account
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
ed2254d157 radeonsi: check IB memory usage of CP DMA operations
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
f4b977bf3d gallium/radeon: add r600_resource::vram_usage and gart_usage
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Jason Ekstrand
f29fd7897a util: Move format_r11g11b10f.h to src/util
It's used from both mesa main and gallium.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:06:57 -07:00
Jason Ekstrand
6c665cdfc5 util: Move format_rgb9e5.h to src/util
It's used from both mesa main and gallium.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:06:31 -07:00
Tim Rowley
b521083ffb swr: [rasterizer core] static analysis fixes for conservative rast
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:35 -05:00
Tim Rowley
68dc544879 swr: [rasterizer core] implement InnerConservative input coverage
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:35 -05:00
Tim Rowley
4034f48833 swr: [rasterizer core] remove CanEarlyZ function
Test is now in SetupPipeline.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
b365989875 swr: [rasterizer core] use 32x32 macrotile for openswr
Significant performance increase (up to 2x) on high geometry workloads.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
5f4bc9e85b swr: [rasterizer fetch] add support for 24bit format fetch
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
527d45c8fe swr: [rasterizer fetch] additional fetch format support
Add support for 0 pitch in fetch.

Add support for USCALE/SSCALE for 32bit integer fetches.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
f438b7ba81 swr: [rasterizer jitter] fix potential jit exit crash
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
57b07498d2 swr: [rasterizer core] update sync handling
Sync now uses a callback to ensure that it's called by the last
thread moving past a DC.  This will help with the new counter
handling.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
191786d0f4 swr: [rasterizer core] rename variable
Avoid nested declarations of the same name within a single function.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:01:37 -05:00
Tim Rowley
61cc012e9a swr: [rasterizer jitter] adjust extern "C" block scope
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:01:31 -05:00
Tim Rowley
9f7d99fcfe swr: [rasterizer core] conservative rast degenerate handling
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:01:25 -05:00
Tim Rowley
f01827a469 swr: [rasterizer core] allow hexadecimal for integer knobs
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 13:52:12 -05:00
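
The message for f01827a469 does not say how hexadecimal knob values are parsed. As an illustration only, one common C approach is to call `strtol` with base 0, which accepts a `0x` prefix for hexadecimal and plain digits for decimal. The helper name `parse_int_knob` is hypothetical, and the actual swr knob code may work differently.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an integer knob value from text.
 * With base 0, strtol treats a "0x"/"0X" prefix as hexadecimal,
 * a leading "0" as octal, and anything else as decimal.
 * Returns false on empty input, trailing garbage, or overflow.
 */
static bool parse_int_knob(const char *text, long *out)
{
    char *end = NULL;

    errno = 0;
    long value = strtol(text, &end, 0);

    if (errno != 0 || end == text || *end != '\0')
        return false;

    *out = value;
    return true;
}
```

Note that base 0 also makes a value such as `010` parse as octal, which may or may not be desirable for a knob.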
Eric Anholt
c976e164d2 vc4: Move scalarizing and some lowering to link time.
This works out to be a wash in terms of memory usage: We use more memory
to store the separate ALU instructions, but we optimize out a lot of code
as well.  The main result, though, is that we do more of our work at link
time rather than draw time.
2016-08-04 08:48:27 -07:00
Eric Anholt
2350569a78 vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far.
We don't want to bake the whole array into the FS key, because of the
hashing overhead.  But we can keep a set of the arrays seen, and use a
pointer to the copy in the set as the array's proxy.

Between this and the previous patch, gl-1.0-blend-func now passes on
hardware, where previously it was filling the 256MB CMA area with shaders
and OOMing.

Drops 712 shaders from shader-db.
2016-08-04 08:48:27 -07:00
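
The idea described in 2350569a78, keeping a set of input arrays seen so far and using a pointer to the stored copy in place of the array, can be sketched as follows. This is a minimal illustration and not the vc4 code. Every name here is hypothetical, and a linear search stands in for the hash set a real driver would more likely use.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_INPUTS 32 /* hypothetical upper bound */

/* Hypothetical description of the FS inputs a VS must provide. */
struct sketch_fs_inputs {
    uint32_t num_inputs;
    uint8_t slots[SKETCH_MAX_INPUTS];
};

/* Set of unique input arrays seen so far. */
struct sketch_inputs_set {
    struct sketch_fs_inputs **entries;
    size_t count, capacity;
};

/* A shader key stores only the pointer, so it stays small to hash. */
struct sketch_vs_key {
    const struct sketch_fs_inputs *fs_inputs;
};

static int sketch_inputs_equal(const struct sketch_fs_inputs *a,
                               const struct sketch_fs_inputs *b)
{
    return a->num_inputs == b->num_inputs &&
           memcmp(a->slots, b->slots, a->num_inputs) == 0;
}

/* Return the canonical stored copy of `in`, adding it if unseen.
 * Returns NULL on invalid input or allocation failure. */
static const struct sketch_fs_inputs *
sketch_inputs_intern(struct sketch_inputs_set *set,
                     const struct sketch_fs_inputs *in)
{
    if (in->num_inputs > SKETCH_MAX_INPUTS)
        return NULL;

    for (size_t i = 0; i < set->count; i++) {
        if (sketch_inputs_equal(set->entries[i], in))
            return set->entries[i];
    }

    if (set->count == set->capacity) {
        size_t new_cap = set->capacity ? set->capacity * 2 : 8;
        struct sketch_fs_inputs **grown =
            realloc(set->entries, new_cap * sizeof(*grown));
        if (!grown)
            return NULL;
        set->entries = grown;
        set->capacity = new_cap;
    }

    struct sketch_fs_inputs *copy = calloc(1, sizeof(*copy));
    if (!copy)
        return NULL;
    copy->num_inputs = in->num_inputs;
    memcpy(copy->slots, in->slots, in->num_inputs);
    set->entries[set->count++] = copy;
    return copy;
}
```

Because equal arrays always map to the same stored copy, two keys can be compared and hashed by pointer value alone, which is the point of using the copy as a proxy.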