Commit graph

35470 commits

Author SHA1 Message Date
Dylan Baker
fc39dc9841 gallium/util: move debug_print_bind_flags to u_debug_gallium
This also appears to be unused.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
e4f1fea821 gallium/util: move debug_print_usage_enum to the u_debug_gallium
This isn't used in mesa, maybe vmware uses this in a closed source state
tracker?

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
078b3cdb34 gallium/util: start splitting u_debug into generic and gallium specific components
In order to pull u_debug into src/util we need to break the generically
useful bits from the bits that are tightly coupled to gallium.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Dylan Baker
389d59c72a gallium: split u_prim_name out of u_debug.h
This allows us to pull u_prim.h out of u_debug.h

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 14:32:52 -07:00
Andre Heider
25a3ce97d5 gallium/hud: fix power sensor readings for amdgpu users
amdgpu doesn't use the INPUT but the AVERAGE subfeature:

$ sensors -u
amdgpu-pci-0100
Adapter: PCI adapter
power1:
  power1_average: 17.233
  power1_cap: 180.000

Signed-off-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2018-10-30 16:30:32 -04:00
Marek Olšák
26cb93e229 radeonsi: add support for Raven2 (v2)
v2: fix enabling primitive binning

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-30 16:03:02 -04:00
Marek Olšák
0dea85928e radeonsi: clean up decompress flags in fast color clear 2018-10-30 16:03:02 -04:00
Marek Olšák
99835fff08 radeonsi/gfx9: set optimal OVERWRITE_COMBINER_WATERMARK 2018-10-30 16:03:02 -04:00
Marek Olšák
8ad12c8bec gallium: rework PIPE_HANDLE_USAGE_* flags
Only radeonsi uses them, so adjust them to match its needs.
2018-10-30 16:03:02 -04:00
Eric Engestrom
69eb6d58e8 nouveau: remove unused class member
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-30 18:10:59 +00:00
Eric Engestrom
6f9309d5d4 scons: drop unused HAVE_STDINT_H macro
This was required back when MSVC didn't support C99 and was missing this
header, but since MSVC 2013 (or maybe earlier?) this isn't it does and
this code isn't doing anything anymore.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 18:10:59 +00:00
Eric Engestrom
4a266d01a7 vl: drop left-over variable
Fixes: 6ccc435e7a "pipe-loader: move dup(fd) within pipe_loader_drm_probe_fd"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 18:10:59 +00:00
Eric Anholt
68657d76b9 vc4: Fix unused variable warning.
Fixes: bb84fa146f ("util: use C99 declaration in the for-loop hash_table_foreach() macro")
2018-10-30 10:46:52 -07:00
Eric Anholt
cc54e1acf9 v3d: Use nir_remove_unused_io_vars to handle binner shader output DCE
We were doing this late after nir_lower_io, but we can just reuse the core
code.  By doing it at this stage, we won't even set up the VS attributes
as inputs, reducing our VPM size.
2018-10-30 10:46:52 -07:00
Eric Anholt
17c8198952 v3d: Use nir_lower_io_to_scalar_early to DCE unused VS input components.
This lets us trim unused trailing components in the vertex attributes,
reducing the size of our VPM allocations.
2018-10-30 10:46:52 -07:00
Michał Janiszewski
8ebd7039c4 svga: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-10-30 06:19:09 -06:00
Eric Engestrom
e9fb81375a st/dri: remove leftover local variable
Left over from the cleanup in 6ccc435e7a "pipe-loader: move dup(fd)
within pipe_loader_drm_probe_fd"

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-30 10:20:58 +00:00
Eric Engestrom
1df0c1e8fb clover: add missing meson build dependency
Fixes: 42ea0631f1 "meson: build clover"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-29 16:39:42 +00:00
Eric Engestrom
98e7c3e7a7 svga: add missing meson build dependency
Fixes: a537231b22 "meson: build svga driver on linux"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-10-29 16:39:38 +00:00
Rob Clark
a61952e737 freedreno: don't flush when new and old pfb is identical
In the 'inorder' case (ie. FD_MESA_DEBUG=inorder, or old kernel), if the
u_blitter clear path is used (a3xx, a4xx, and some fallback cases on
newer gens), util_blitter_restore_fb_state() will set_framebuffer_state()
to something that is identical to the current fb state, which triggers
an unnecessary flush, and then eventually an assert:

  (gdb) bt
  #0  0x0000007fbf24a078 in kill () from /lib64/libc.so.6
  #1  0x0000007fbe061278 in _debug_assert_fail (expr=0x7fbe93a820 "!batch->flushed", file=0x7fbe93a628 "../src/gallium/drivers/freedreno/freedreno_batch.c", line=491, function=0x7fbe93a990 <__func__.17380> "fd_batch_check_size") at ../src/gallium/auxiliary/util/u_debug.c:322
  #2  0x0000007fbe1ccb8c in fd_batch_check_size (batch=0x55556d5a70) at ../src/gallium/drivers/freedreno/freedreno_batch.c:491
  #3  0x0000007fbe1d0e08 in fd_clear (pctx=0x55555c61e0, buffers=5, color=0x55556e388c, depth=1, stencil=0) at ../src/gallium/drivers/freedreno/freedreno_draw.c:463
  #4  0x0000007fbe57afa4 in st_Clear (ctx=0x55556e17b0, mask=18) at ../src/mesa/state_tracker/st_cb_clear.c:452

The assert was introduced in 4b847b38ae, so from a functionality
standpoint this patch fixes that commit.  But it should also avoid an
unnecessary flush in the 'inorder' case, fixing a performance bug.

Fixes: 4b847b38ae freedreno: make fd_batch a one-shot thing
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-28 14:03:38 -04:00
Rob Clark
32dd75b927 freedreno: dependency tracking for z/s depends on ZSA state
ZSA state can change whether depth or stencil is enabled

This plus previous patch fix stk, and various things w/
FD_MESA_DEBUG=inorder

Fixes: ec717fc629 freedreno: reduce resource dependency tracking overhead
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-28 14:03:38 -04:00
Rob Clark
05e868925c freedreno: mark all state dirty after switching batch
The problem isn't directly with ec717fc629 but rather that commit
exposes the problem.  When we switch batch we cannot assume previous
state is clean so we should mark all state dirty.

Fixes: ec717fc629 freedreno: reduce resource dependency tracking overhead
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-28 14:03:38 -04:00
Rob Clark
c41772d17a freedreno/a6xx: inline draw_impl()
Now that it is just called once per draw (instead of once for binning
and once for draw), let's just inline it.  If nothing else, it makes
perf-annotate easier to look at.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Rob Clark
604b5f1dca freedreno/a6xx: small cleanup
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Rob Clark
2a74d9ae8d freedreno/a6xx: move where we handle dirty vbo state
Historically this wasn't in fdN_emit_state(), because prior to addition
of blitter in a5xx, fdN_emit_state() was also used in the clear path.
These days that is only true for a2xx (a3xx and a4xx use u_blitter).  So
the reason for it not to be in fd6_emit_state() no longer exists.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Rob Clark
ddb7fadaf8 freedreno: avoid no-op flushes by re-using last-fence
Noticed that with webgl (in chromium, at least) we end up generating a
lot of no-op submits just to get a fence.  Tracking the last fence and
returning that if there is no rendering since last flush avoids this.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
01194cd582 freedreno/a6xx: Move stencil/depth/alpha state to IB
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
a664dc2d59 freedreno/a6xx: Move stencil mask emit to FD_DIRTY_ZSA group
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
3073926512 freedreno/a6xx: Rename FD6_GROUP_ZSA ro FD6_GROUP_LRZ
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
edc0f1b10f freedreno/a6xx: Move rasterizer state to state object
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
3264eb691a freedreno/a6xx: Fix set_blit_scissor helper
The scissor maxx/maxy are non-inclusive, so don't subtract one from
framebuffer width and height.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
4222fe8af2 freedreno/a2xx: Squash a compiler warning
We get a warning here for assigning a const char * pointer to
char *swizzle in struct ir2_src_register.  The constructor strdups a 4
byte string here, so just memcpy to that instead.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Kristian H. Kristensen
4fd6265f42 freedreno/a6xx: Use fd6_emit_ib from a6xx
Move it to a header and use it where possible to avoid vfunc call.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2018-10-26 18:10:00 -04:00
Rob Clark
f3cc0d2747 freedreno: import libdrm_freedreno + redesign submit
In the pursuit of lowering driver overhead, it became clear that some
amount of redesign of how libdrm_freedreno constructs the submit ioctl
would be needed.  In particular, as the gallium driver is starting to
make heavier use of CP_SET_DRAW_STATE state groups/objects, the over-
head of tracking cmd buffers and relocs becomes too much.  And for
"streaming" state, which isn't ever reused (like uniform uploads) the
overhead of allocating/freeing ringbuffer[1] objects is too high.

This redesign makes two main changes:

 1) Introduces a fd_submit object for tracking bos and cmds table
    for the submit ioctl, making ringbuffer objects more light-
    weight.  This was previously done in the ringbuffer.  But we
    have many ringbuffer instances involved in a submit (gmem +
    draw + potentially 1000's of state-group rbs), and only need
    a single bos and cmds table.  (Reloc table is still per-rb)

    The submit is also a convenient place for a slab allocator for
    ringbuffer objects.  Other options would have required locking
    because, while we can guarantee allocations will only happen on
    a single thread, free's could happen either on the application
    thread or the flush_queue thread.  With the slab allocator in
    the submit object, any frees that happen on the flush_queue
    thread happen after we know that the application thread is done
    with the submit.

 2) Introduce a new "softpin" msm_ringbuffer_sp implementation that
    does not use relocs and only has cmds table entries for IB1 (ie.
    the cmdstream buffers that kernel needs to CP_INDIRECT_BUFFER
    to from the RB).  To do this properly will require some updates
    on the kernel side, so whether you get the softpin or legacy
    submit/ringbuffer implementation at runtime depends on your
    kernel version.

To make all these changes in libdrm would basically require adding a
libdrm_freedreno2, so this is a good point to just pull the libdrm code
into mesa.  Plus it allows for using mesa's hashtable, slab allocator,
etc.  And it lets us have asserts enabled for debug mesa buids but
omitted for release builds.  And it makes life easier if further API
changes become necessary.

At this point I haven't tried to pull in the kgsl backend.  Although
I left the level of vfunc indirection which would make it possible
to have other backends.  (And this was convenient to keep to allow
for the "softpin" ringbuffer to coexist.)

NOTE: if bisecting a build error takes you here, try a clean build.
There are a bunch of ways things can go wrong if you still have
libdrm_freedreno cflags.

[1] "ringbuffer" is probably a bad name, the only level of cmdstream
    buffer that is actually a ring is RB managed by kernel.  User-
    space cmdstream is all IB1/IB2 and state-groups.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-26 18:10:00 -04:00
Axel Davy
2318ca68bb st/nine: Handle window resize when a presentation buffer is used
Usually when a window is resized, the app calls d3d to resize the back
buffer to the window size. In some cases, it is not done,
and it expects the output resizes to the window size, even if
the back buffer size is unchanged.

This patch introduces the behaviour when a presentation buffer
is used.

ID3DPresent_GetWindowInfo is a function available with
D3DPresent v1.0, and thus we don't need to check if the
function is available.
The function had been introduced to implement this very
feature.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
3d975e98e4 st/nine: Reduce MaxSimultaneousTextures to 8
Windows drivers don't set this flag (which affects ff) to more than 8.

Do the same in case some games check for 8.

v2: Remove any dependence on MaxSimultaneousTextures. For non-ff
the number of textures is 16 when the device is able of vs/ps3.
Add this requirement of 16 textures to the driver requirements.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
739c700950 st/nine: Enable shadow mapping for ps 1.X
We didn't implement shadow textures for ps 1.X,
assuming the case couldn't happen...
Well it does.

Fixes: https://github.com/iXit/Mesa-3D/issues/261

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
847861aab4 st/nine: Do not set unused states for stateblocks
A lot of these states are used only for the context,
and are unused for stateblocks (which just uses the
changed.* fields instead for a lot of them).

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
6f373b9b74 st/nine: Fix aliasing states for stateblocks
If NINE_STATE_FF_MATERIAL is set, the stateblock will upload
its recorded materials matrix.
If NINE_STATE_FF_LIGHTING is set, the lighting set is uploaded.

These flags could be set by a NineDevice9_SetTransform call
or by setting some states related to ff, but that shouldn't trigger
these stateblock behaviours.

We don't need to follow the context states dirtied by render states.
NINE_STATE_FF_VSTRANSF is exactly the state controlling stateblock
updates of transformation matrices, NINE_STATE_FF is too broad.

These two changes avoid setting the two mentionned states when we
shouldn't.

Fixes: https://github.com/iXit/Mesa-3D/issues/320

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
454201b452 st/nine: Never update device changed.* fields
The device state changed.* field are never used.
These fields are used only for stateblocks.

Avoid setting them at all for clarity.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
2594b2efdc st/nine: Capture also default matrices for D3DSBT_ALL
We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
bbeddb801e st/nine: Mark transform matrices dirty for D3DSBT_ALL
D3DSBT_ALL stateblocks capture the transform matrices.

Fixes some d3d test programs not displaying properly.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
a4e9bbb8f8 st/nine: Don't update unused world matrices
While to the application we have to track
accurately all 256 world matrices (including
in stateblocks), hw vertex processing enables
to set a limit to the number of world matrices
the hardware can access to in the advertised caps,
which is 8 for nine.

Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
2e51c4c7cc st/nine: Remove two unused states.
NINE_STATE_MATERIAL was used incorrectly at one location.
Replace it with the correct state.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
cb8ea21e1c st/nine: Remove commented nine_context_apply_stateblock
At some point the project was to adapt the
commented version to csmt.

The csmt rework enabled to fix some state aliasing
issues between stateblocks and internal state updates.
The commented version needs a lot of work to work with that.
Just drop it.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Brian Paul
d6be0b5556 scons/svga: remove opt from the list of valid build types
This reverts commit a5fd54f8bf.

The whole point was to add a way to pass -DVMX86_STATS to the build,
but we can do that with a command line argument when we invoke scons.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2018-10-26 12:09:00 -06:00
Boyuan Zhang
f4126cfaab radeon/vcn: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
Boyuan Zhang
55cf565698 radeon/vce: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
Boyuan Zhang
b15d0200a9 vl: get h264 profile idc
Adding a function for converting h264 pipe video profile to profile idc

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
David McFarland
07a00a8729 util: Change remaining uint32 cache ids to sha1
After discussion with Timothy Arceri. disk_cache_get_function_identifier
was using only the first byte of the sha1 build-id.  Replace
disk_cache_get_function_identifier with implementation from
radv_get_build_id.  Instead of writing a uint32_t it now writes to a
mesa_sha1.  All drivers using disk_cache_get_function_identifier are
updated accordingly.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: 83ea8dd99b ("util: add disk_cache_get_function_identifier()")
2018-10-26 14:49:22 +11:00