Commit graph

391 commits

Author SHA1 Message Date
Iago Toral Quiroga
39df568ca1 v3d: refactor v3d_tf_statistics_record slightly
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-26 08:29:41 +02:00
Pierre-Eric Pelloux-Prayer
e9cf8c1d30 u_blitter: add a msaa parameter to util_blitter_clear
Fixes: ea5b7de138 ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled")

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-23 14:42:20 -04:00
Ilia Mirkin
0e30c6b8a7 gallium: switch boolean -> bool at the interface definitions
This is a relatively minimal change to adjust all the gallium interfaces
to use bool instead of boolean. I tried to avoid making unrelated
changes inside of drivers to flip boolean -> bool to reduce the risk of
regressions (the compiler will much more easily allow "dirty" values
inside a char-based boolean than a C99 _Bool).

This has been build-tested on amd64 with:

Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d
                 vc4 i915 svga virgl swr panfrost iris lima kmsro
Gallium st:      mesa xa xvmc xvmc vdpau va

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 22:13:51 -04:00
Iago Toral Quiroga
dacaf7ec06 v3d: fill logicop_func in the fragment shader key when precompiling shaders
Since logicop_func 0 is PIPE_LOGIOP_CLEAR, we were trigger lowerinng
of logic ops on precompiled shaders, which we don't want to do. Also, this
had the side effect of making shader-db crash, as during this lowering we
would try to read the color format swizzle information from the fragment shader
key that we don't populate in precompiled shaders because right now we only
need it when logic operations are enabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-22 08:05:59 +02:00
Andreas Bergmeier
f92290a8d9 broadcom: Move v3d_get_device_info to common
In common we can use implementation for Vulkan.
2019-07-17 20:02:34 +00:00
Iago Toral Quiroga
556c299430 v3d: flag dirty state when binding new sampler states
We emit code to saturate texture coordinates when using clamp wrapping
mode so if we don't flag the dirty state here we don't get to recompile
the shaders when the wrapping mode changes.

v2:
  - Do the same when setting sampler views (Eric)
  - Use a switch statement instead of an if ladder.
  - Swap the shader stage assertion with an unreachable.

Fixes:
spec/!opengl 1.1/texwrap 1d bordercolor/gl_rgba8, border color only
spec/!opengl 1.1/texwrap 1d proj bordercolor/gl_rgba8, projected, border color only
spec/!opengl 1.1/texwrap 2d bordercolor/gl_rgba8, border color only
spec/!opengl 1.1/texwrap 2d proj bordercolor/gl_rgba8, projected, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha12, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha16, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha4, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_intensity8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance4_alpha4, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance6_alpha2, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance8_alpha8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_r3_g3_b2, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb10, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb10_a2, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb4, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb5, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb5_a1, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgba4, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgba8, swizzled, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha12, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha16, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha4, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_intensity8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance4_alpha4, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance6_alpha2, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance8_alpha8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_r3_g3_b2, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb10, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb10_a2, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb4, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb5, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb5_a1, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb8, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgba4, border color only
spec/!opengl 1.1/texwrap formats bordercolor/gl_rgba8, border color only
spec/!opengl 1.2/texwrap 3d bordercolor/gl_rgba8, border color only
spec/!opengl 1.2/texwrap 3d proj bordercolor/gl_rgba8, projected, border color only
spec/arb_es2_compatibility/texwrap formats bordercolor-swizzled/gl_rgb565, swizzled, border color only
spec/arb_es2_compatibility/texwrap formats bordercolor/gl_rgb565, border color only
spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_alpha, swizzled, border color only
spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_luminance_alpha, swizzled, border color only
spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_rgb, swizzled, border color only
spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_alpha, border color only
spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_luminance_alpha, border color only
spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_rgb, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_alpha16f_arb, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_intensity16f_arb, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_luminance16f_arb, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_luminance_alpha16f_arb, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_rgb16f, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_rgba16f, swizzled, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_alpha16f_arb, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_intensity16f_arb, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_luminance16f_arb, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_luminance_alpha16f_arb, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_rgb16f, border color only
spec/arb_texture_float/texwrap formats bordercolor/gl_rgba16f, border color only
spec/arb_texture_rectangle/texwrap rect bordercolor/gl_rgba8, border color only
spec/arb_texture_rectangle/texwrap rect proj bordercolor/gl_rgba8, projected, border color only
spec/arb_texture_rg/texwrap formats bordercolor-swizzled/gl_r8, swizzled, border color only
spec/arb_texture_rg/texwrap formats bordercolor-swizzled/gl_rg8, swizzled, border color only
spec/arb_texture_rg/texwrap formats bordercolor/gl_r8, border color only
spec/arb_texture_rg/texwrap formats bordercolor/gl_rg8, border color only
spec/arb_texture_rg/texwrap formats-float bordercolor-swizzled/gl_r16f, swizzled, border color only
spec/arb_texture_rg/texwrap formats-float bordercolor-swizzled/gl_rg16f, swizzled, border color only
spec/arb_texture_rg/texwrap formats-float bordercolor/gl_r16f, border color only
spec/arb_texture_rg/texwrap formats-float bordercolor/gl_rg16f, border color only
spec/ext_packed_float/texwrap formats bordercolor-swizzled/gl_r11f_g11f_b10f, swizzled, border color only
spec/ext_packed_float/texwrap formats bordercolor/gl_r11f_g11f_b10f, border color only
spec/ext_texture_shared_exponent/texwrap formats bordercolor-swizzled/gl_rgb9_e5, swizzled, border color only
spec/ext_texture_shared_exponent/texwrap formats bordercolor/gl_rgb9_e5, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_alpha8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_intensity8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_luminance8_alpha8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_luminance8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_r8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_rg8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_rgb8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_rgba8_snorm, swizzled, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_alpha8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_intensity8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_luminance8_alpha8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_luminance8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_r8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_rg8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_rgb8_snorm, border color only
spec/ext_texture_snorm/texwrap formats bordercolor/gl_rgba8_snorm, border color only
spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_sluminance8, swizzled, border color only
spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_sluminance8_alpha8, swizzled, border color only
spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_srgb8, swizzled, border color only
spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_srgb8_alpha8, swizzled, border color only
spec/ext_texture_srgb/texwrap formats bordercolor/gl_sluminance8, border color only
spec/ext_texture_srgb/texwrap formats bordercolor/gl_sluminance8_alpha8, border color only
spec/ext_texture_srgb/texwrap formats bordercolor/gl_srgb8, border color only
spec/ext_texture_srgb/texwrap formats bordercolor/gl_srgb8_alpha8, border color only

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 08:13:28 +02:00
Iago Toral Quiroga
7c1d708911 v3d: acquire scoreboard lock before first tlb read
Until now we have always been emitting our scoreboard locks on the last thread
switch to improve parallelism. We did this by emitting our last thread switch
right before our tlb writes at the very end of the program, where we know that
we are outside control flow.

Unfortunately, this strategy is not valid when we have tlb color reads too, as
these will happen before this point in the program and can happen inside
control flow.

To fix this we always emit a thread switch before the first tlb load and if we
see additional thread switches after that point, we change the strategy to lock
on the first thread switch.

v2: change the solution so it is expected to work in more scenarios (Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-12 09:16:38 +02:00
Iago Toral Quiroga
0279ac6e51 v3d: add color formats and swizzles to the fragment shader key
We are going to need these very soon to emit correct reads from the tlb
to implement logic operations.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-12 09:16:38 +02:00
Erik Faye-Lund
39e7fbf24a gallium: get rid of PIPE_CAP_SM3
PIPE_CAP_SM3 has always been an odd one out of all our caps. While most
other caps are fine-grained and single-purpose, this cap encode several
features in one. And since OpenGL cares more about single features, it'd
be nice to get rid of this one.

As it turns, this is now relatively simple. We only really care about
three features using this cap, and those already got their own caps. So
we can remove it, and make sure all current drivers just give the same
response to all of them.

The only place we *really* care about SM3 is in nine, and there we can
instead just re-construct the information based on the finer-grained
caps. This avoids DX9 semantics from needlessly leaking into all of the
drivers, most of who doesn't care a whole lot about DX9 specifically.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-10 15:50:51 +02:00
Alejandro Piñeiro
71446bf8e3 v3d: Early return with handle 0 when getting a bo on the simulator
Until now we were just asking entries on the bo hash table, and don't
worry if the handle was NULL, as we were just expecting to get a NULL
in return. It seems that now the hash table assert with some reserverd
pointers, included NULL. This commit just early returns with handle 0.

This change fixes several crashes on vk-gl-cts GLES tests when using
the v3d simulator, like:
KHR-GLES3.core.internalformat.copy_tex_image.*

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-09 08:40:35 +02:00
Iago Toral Quiroga
042aeffd5b v3d: do not flush jobs that are synced with 'Wait for transform feedback'
Generally, we achieve this by skipping the flush on calls to
v3d_flush_jobs_writing_resource() when we detect that the resource is written
in the current job from a transform feedback write.

The exception to this is the case where the caller is about to map the
resource, in which case we need to flush immediately since we can only emit
'Wait for transform feedback' commands on rendering jobs. We add a parameter
to the function so the caller can identify that scenario.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-02 08:57:20 +02:00
Iago Toral Quiroga
88cbc4f7f6 v3d: emit 'Wait for transform feedback' commands when needed
The hardware can flush transform feedback writes before reads in the same
job by inserting this command.

This patch detects when the rendering state for the current draw call reads
resources that had been previously written by transform feedback in the
same job and inserts the 'Wait for transform feedback' command before
emitting the new draw.

v2 (Eric):
  - this was intended to look at job->tf_write_prscs for TF jobs.
  - clear job->tf_write_prscs after we emit the TF flush.
  - can skip flushes for fragment shader reads from TF.

v3 (Eric):
  - all resources in job->tf_write_prscs are resources written by TF so
   we don't need to check if they are bound to PIPE_BIND_STREAM_OUTPUT.
  - documented optimization opportunity for geometry stages.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-02 08:57:20 +02:00
Iago Toral Quiroga
c7dff0e614 v3d: keep track of resources written by transform feedback
The hardware provides a feature to sync reads from previous transform feedback
writes in the same job so if we use this mechanism we no longer have to flush
the job.

In order to identify this scenario we need a mechanism to identify resources
that are written by transform feedback.

v2: use _mesa_pointer_set_create (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-02 08:57:20 +02:00
Iago Toral Quiroga
4d8f82946b v3d: flush jobs writing to vertex buffers used in the current draw call
This can happen when any of our vertex buffers was written by a previous
transform feedback draw.

Fixes the following piglit tests:
spec/ext_transform_feedback/position-render-bufferbase
spec/ext_transform_feedback/position-render-bufferbase-discard
spec/ext_transform_feedback/position-render-bufferoffset
spec/ext_transform_feedback/position-render-bufferoffset-discard
spec/ext_transform_feedback/position-render-bufferrange
spec/ext_transform_feedback/position-render-bufferrange-discard

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-21 08:06:13 +02:00
Iago Toral Quiroga
eb44dcc219 v3d: flush jobs reading from transform feedback output buffers
If we are about to write to a transform feedback buffer, we should
make sure that we flush any prior work that intended to read from
any of these buffers.

Fixes piglit test:
spec/ext_transform_feedback/immediate-reuse

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-21 08:06:13 +02:00
Iago Toral Quiroga
42572f2f7d v3d: add a helper to check if transform feedback is enabled
v2: We should be safe assuming that bind_vs != NULL (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-21 08:06:13 +02:00
Iago Toral Quiroga
6d97c8fac1 v3d: only flush jobs accessing the query BO when reading query results
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-18 08:09:03 +02:00
Iago Toral Quiroga
5491883a9a v3d: add a helper function to flush jobs using a BO
v2: use _mesa_set_search() (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-18 08:09:03 +02:00
Iago Toral Quiroga
9b96ae69bc v3d: don't emit point coordinates varyings if the FS doesn't read them
We still need to emit them in V3D 3.x since there there is no mechanism to
disable them.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:29:42 +02:00
Eric Anholt
60a64f028d v3d: Use driconf to expose non-MSAA texture limits for Xorg.
The V3D 4.2 HW has a limit to MSAA texture sizes of 4096.  With non-MSAA,
we can go up to 7680 (actually probably 8138, but that hasn't been
validated by the HW team).  Exposing 7680 in X11 will allow dual 4k displays.
2019-05-13 12:03:11 -07:00
Eric Anholt
0c31fe9ee7 gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE.
The _LEVELS assumes that the max is always power of two.  For V3D 4.2, we
can support up to 7680 non-power-of-two MSAA textures, which will let X11
support dual 4k displays on newer hardware.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-13 12:03:08 -07:00
Eric Anholt
971a13d805 Revert "v3d: Disable PIPE_CAP_BLIT_BASED_TEXTURE_TRANSFER."
This reverts commit ccce940947, leaving a
note as to why we had to (corruption in chromium, breaking some GLES3.1
tests).
2019-04-26 12:42:30 -07:00
Eric Anholt
49071b2e3f v3d: Don't try to update the shadow texture for separate stencil.
There are two cases where v3d's sampler view's resource doesn't match the
base's: shadow textures for sampling from raster, and pointing at the
separate depth texture for z32f_s8x24.  We only want to update shadow for
the first case.

Fixes
dEQP-GLES31.functional.stencil_texturing.render.depth32f_stencil8_draw
when run after the previous testcase.
2019-04-26 12:42:30 -07:00
Eric Anholt
d8486c2ad7 v3d: Use _mesa_hash_table_remove_key() where appropriate. 2019-04-26 12:42:30 -07:00
Eric Anholt
42210a4351 v3d: Apply the GFXH-930 workaround to the case where the VS loads attrs.
We were emitting a dummy load for when the VS doesn't load any attributes,
but we also need to emit a dummy load for when the render VS loads
attributes but the binner VS doesn't.  Fixes simulator assertion failures
and GPU hangs on KHR-GLES31.core.texture_gather.\*
2019-04-26 12:42:30 -07:00
Eric Anholt
448fc3ea42 v3d: Fill in the ignored segment size fields to appease new simulator.
We are assured that the input segment size field is ignored for
!separate_segs mode, and now the simulator wants an in-range value set
regardless of whether it's functionally ignored or not.
2019-04-26 12:40:31 -07:00
Eric Anholt
d23b47fda5 v3d: Disable SSBOs and atomic counters on vertex shaders.
The CTS fails on
dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.*vertex
when they are enabled, due to the VS being run for both bin and render.  I
think this behavior is expected to be valid, but I can't find text in
atomic counters or SSBO specs saying so (the closed I found was in
shader_image_load_store).  Just disable it for now, since the closed
source driver doesn't expose vertex atomic counters/SSBOs either.
2019-04-24 17:24:11 -07:00
Dylan Baker
95aefc94a9 Delete autotools
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Matt Turner <mattst88@gmail.com>
2019-04-15 13:44:29 -07:00
Eric Anholt
dc402be73e v3d: Use the new lower_to_scratch implementation for indirects on temps.
We can use the same register spilling infrastructure for our loads/stores
of indirect access of temp variables, instead of doing an if ladder.

Cuts 50% of instructions and max-temps from 2 KSP shaders in shader-db.
Also causes several other KSP shaders with large bodies and large loop
counts to not be force-unrolled.

The change was originally motivated by NOLTIS slightly modifying register
pressure in piglit temp mat4 array read/write tests, triggering register
allocation failures.
2019-04-12 16:16:58 -07:00
Eric Anholt
8a2d91e124 v3d: Detect the correct number of QPUs and use it to fix the spill size.
We were missing a * 4 even if the particular hardware matched our
assumption.
2019-04-12 15:59:31 -07:00
Eric Anholt
6b1c659825 v3d: Add Compute Shader compilation support.
While waiting for the CSD UABI to get reviewed, I keep having to rebase
the CS patch.  Just land the compiler side for now to keep it from
diverging.

For now this covers just GLES 3.1 compute shaders, not CL kernels.
2019-04-12 15:59:31 -07:00
Eric Anholt
276ec879fd v3d: Drop a note for the future about PIPE_CAP_PACKED_UNIFORMS. 2019-04-12 15:58:28 -07:00
Timothy Arceri
035759b61b nir/i965/freedreno/vc4: add a bindless bool to type size functions
This required to calculate sizes correctly when we have bindless
samplers/images.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 09:02:59 +02:00
Eric Anholt
771adffec1 st: Lower uniforms in st in the !PIPE_CAP_PACKED_UNIFORMS case as well.
PIPE_CAP_PACKED_UNIFORMS conflates several things: Lowering uniforms i/o
at the st level instead of the backend, packing uniforms with no padding
at all, and lowering to UBOs.

Requiring backends to lower uniforms i/o for !PIPE_CAP_PACKED_UNIFORMS
leads to the driver needing to either link against the type size function
in mesa/st, or duplicating it in the backend.  Given that all backends
want this lower-io as far as I can tell, just move it to mesa/st to
resolve the link issue and avoid the driver author needing to understand
st's uniforms layout.

Incidentally, fixes uniform layout failures in nouveau in:

dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment
dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex
dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_fragment
dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_vertex

and I think in Lima as well.

v2: fix indents

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-10 11:44:20 -07:00
Jason Ekstrand
6279074de1 nir: Get rid of global registers
We have a pass to lower global registers to locals and many drivers
dutifully call it.  However, no one ever creates a global register ever
so it's all dead code.  It's time we bury it.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-04-09 00:29:36 -05:00
Eric Anholt
4c70f276bc v3d: Don't try to use the TFU blit path if a scissor is enabled.
We'll need to do a render-based blit for scissors, since the TFU (as seen
in this conditional) can only update a whole surface.

Fixes: 976ea90bdc ("v3d: Add support for using the TFU to do some blits.")
Fixes piglit fbo-scissor-blit.
2019-04-04 17:30:35 -07:00
Eric Anholt
62360e92ec v3d: Bump the maximum texture size to 4k for V3D 4.x.
4.1 and 4.2 both have the same 16k limit, but it I'm seeing GPU hangs in
the CTS at 8k and 16k.  4k at least lets us get one 4k display working.

Cc: mesa-stable@lists.freedesktop.org
2019-04-04 17:30:35 -07:00
Eric Anholt
e3063a8b2f v3d: Add support for handling OOM signals from the simulator.
I have v3d allocating enough initial allocation memory that we've been
passing tests without it, but to match kernel behavior more it would be
good to actually exercise the OOM path.
2019-04-04 17:30:35 -07:00
Marek Olšák
66a82ec6f0 gallium: add writable_bitmask parameter into set_shader_buffers
to indicate write usage per buffer.
This is just a hint (it will be used by radeonsi).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-04-04 19:28:52 -04:00
Eric Anholt
16f2770eb4 v3d: Upload all of UBO[0] if any indirect load occurs.
The idea was that we could skip uploading the constant-indexed uniform
data and just upload the uniforms that are variably-indexed.  However,
since the VS bin and render shaders may have a different set of uniforms
used, this meant that we had to upload the UBO for each of them.  The
first case is generally a fairly small impact (usually the uniform array
is the most space, other than a couple of FSes in shader-db), while the
second is a larger impact: 3DMMES2 was uploading 38k/frame of uniforms
instead of 18k.

Given that the optimization is of dubious value, has a big downside, and
is quite a bit of code, just drop it.  No change in shader-db.  No change
on 3DMMES2 (n=15).
2019-03-21 14:20:50 -07:00
Eric Anholt
320e96bace v3d: Move constant offsets to UBO addresses into the main uniform stream.
We'd end up with the constant offset in the uniform stream anyway, since
they're bigger than small immediates.  Avoids the extra uniforms and adds
in the shader in favor of just adding once on the CPU.

shader-db:
total instructions in shared programs: 6496865 -> 6494851 (-0.03%)
total uniforms in shared programs: 2119511 -> 2117243 (-0.11%)
2019-03-21 14:20:50 -07:00
Eric Anholt
c36d2793ec v3d: Rename v3d_tmu_config_data to v3d_unit_data.
I want to reuse this for encoding small constant UBO/SSBO offsets into the
uniform stream to reduce the extra uniform loads and adds for the small
constant offsets.
2019-03-21 14:20:50 -07:00
Kenneth Graunke
220c1dce1e gallium: Add PIPE_BARRIER_UPDATE_BUFFER and UPDATE_TEXTURE bits.
The glMemoryBarrier() function makes shader memory stores ordered with
respect to things specified by the given bits.  Until now, st/mesa has
ignored GL_TEXTURE_UPDATE_BARRIER_BIT and GL_BUFFER_UPDATE_BARRIER_BIT,
saying that drivers should implicitly perform the needed flushing.

This seems like a pretty big assumption to make.  Instead, this commit
opts to translate them to new PIPE_BARRIER bits, and adjusts existing
drivers to continue ignoring them (preserving the current behavior).

The i965 driver performs actions on these memory barriers.  Shader
memory stores go through a "data cache" which is separate from the
render cache and other read caches (like the texture cache).  All
memory barriers need to flush the data cache (to ensure shader memory
stores are visible), and possibly invalidate read caches (to ensure
stale data is no longer visible).  The driver implicitly flushes for
most caches, but not for data cache, since ARB_shader_image_load_store
introduced MemoryBarrier() precisely to order these explicitly.

I would like to follow i965's approach in iris, flushing the data cache
on any MemoryBarrier() call, so I need st/mesa to actually call the
pipe->memory_barrier() callback.

Fixes KHR-GL45.shader_image_load_store.advanced-sync-textureUpdate
and Piglit's spec/arb_shader_image_load_store/host-mem-barrier on
the iris driver.

Roland said this looks reasonable to him.
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-19 23:43:33 -07:00
Eric Anholt
17115da6ad v3d: Expose the dma-buf modifiers query.
This allows DRI3 to pick between UIF and raster according to whether we're
pageflipping or not and whether the pageflipping display can do UIF,
avoiding copies for the windowed/composited case that previously was
forced to linear.

Improves windowed glmark2 -b build:use-vbo=false performance by 30.7783%
+/- 13.1719% (n=3)
2019-03-19 08:59:01 -07:00
Eric Anholt
bf6973199d v3d: Allow the UIF modifier with renderonly.
We ask the other side to make a buffer with the right number of pages, and
then just store the UIF in it.  This avoids an extra silent copy of the
buffer from linear to UIF if it gets used for texturing (X11 copy-based
swapbuffers, GL compositors).
2019-03-19 08:54:46 -07:00
Eric Anholt
eb5903a908 v3d: Always lay out shared tiled buffers with UIF_TOP set.
The samplers are already ready for this, we just needed to make sure that
layout chose UIF for level 0.
2019-03-19 08:54:46 -07:00
Alyssa Rosenzweig
cca270bb03 v3d: Use shared drm_find_modifier util
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-03-14 22:42:51 +00:00
Eric Anholt
486b181fd7 v3d: Fix leak of the renderonly struct on screen destruction.
This makes v3d match vc4's destroy path.

Fixes: e113b21cb7 ("v3d: Add renderonly support.")
2019-03-12 16:15:40 -07:00
Eric Anholt
ccce940947 v3d: Disable PIPE_CAP_BLIT_BASED_TEXTURE_TRANSFER.
This reduces the runtime of dEQP-GLES3.functional.shaders.precision.* from
11.5s to 3.3s.  This brings CTS runs down to 4 hours on one of my target
devices.
2019-03-12 09:04:25 -07:00
Timur Kristóf
9a834447d6 tgsi_to_nir: Produce optimized NIR for a given pipe_screen.
With this patch, tgsi_to_nir will output NIR that is tailored to
the given pipe, by reading its capabilities and adjusting the NIR code
to those capabilities similarly to how glsl_to_nir works.

It also adds an optimization loop that brings the output NIR in line
with what glsl_to_nir outputs. This is necessary for the same reason
why glsl_to_nir has its own optimization loop: currently not every
driver does these optimizations yet.

For uses which cannot pass a pipe_screen we also keep a variant
called tgsi_to_nir_noscreen which keeps the old behavior.

Signed-Off-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Acked-By: Eric Anholt <eric@anholt.net>
2019-03-05 19:13:27 +00:00