Commit graph

4322 commits

Author SHA1 Message Date
Marek Olšák
c3fa9de2a9 nir/serialize: try to pack both deref array src into 32 bits
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
ed6b01d5e0 nir/serialize: cleanup - fold nir_deref_type_var cases into switches
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
a0cd67d292 nir/serialize: try to put deref->var index into the unused bits of the header
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
ca201bfe70 nir/serialize: don't serialize mode for deref non-cast instructions
It can be derived from src and var. This frees 10 bits in the header
that will be used later.

"mode" is moved in the structure, because those bits will be used for
something else later.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
2286340fde nir/serialize: don't store deref types if not needed
- type_cast: deduplicate types if the last one is the same
- derive the type from the parent for other derefs

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
70a7f85149 nir/serialize: try to pack two alu srcs into 1 uint32
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
ef4630cf4f nir/serialize: pack nir_intrinsic_instr::const_index[] better
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
d3346b275a nir/serialize: pack 1-component constants into 20 bits if possible
The majority of constants can be packed like this.

v2: - use enum for the packing encoding,
    - trim packed_value to 20 bits add 1 bit to last_component,
      which simplifies a later commit

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
75f7c38863 nir/serialize: pack load_const with non-64-bit constants better
v2: use blob_write_uint8/16

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
a572ba673b nir/serialize: try to store a diff in var data locations instead of var data
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
c8314678ee nir/serialize: deduplicate serialized var types by reusing the last unique one
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
545415f45f nir/serialize: don't serialize var->data for temporaries
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
c358c2b2bf nir/serialize: pack src better and limit the object count to 1M from 1G
We need to limit the object count to 1M to free 10 bits for the src
modifiers.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák
35655865cb nir/serialize: pack instructions better
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Ian Romanick
ca353285cb nir/range_analysis: Make sure the table validation only occurs once
All of the tables are static const, so they only need to be validated
once.  As noted in the previous commit, the compiler should be able to
eliminate all of this code when the assertions would pass.  Even with
the help of the previous commit, this does not always occur.

-Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5
-O1: No difference proven at 95.0% confidence. N=5
-O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
Ian Romanick
ccefce46cb nir/range-analysis: Add pragmas to help loop unrolling
I was pretty liberal with these assertions when I wrote this code
because I had assumed that GCC would unroll the loops, inline the look ups
of static const arrays with now constant indices, and then elmininate
all the actuall assertions.  It seems none of this happens even at -O3.

Adding the pragmas helps encourage loop unrolling at some optimization
levels.  I tested by running shader-db with NIR_VALIDATE=false on a Core
i7 Haswell desktop system.

-Og: No difference proven at 95.0% confidence. N=5
-O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5
-O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5

v2: Add a _Pragma to an inner loop that was accidentally dropped during
a rebase.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
Danylo Piliaiev
25a00b449f glsl: Add varyings to "zero-init of uninitialized vars" workaround
Varyings are similar to already handled cases. And "glsl_zero_init"
name of the workaround already looks like it should include varyings.

The issue was observed in GiMark subtest from GpuTest.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 15:25:56 +00:00
Alyssa Rosenzweig
deaebc82a7 nir: Add load_sampler_lod_paramaters_pan intrinsic
This loads in the <min_lod, max_lod, lod_bias> settings for a given
sampler, which is necessary for lowering clamps/biases on certain
Midgard chips.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Marek Olšák
0b1452ffdd nir/serialize: do ctx = {0} instead of manual initializations
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-21 18:49:57 -05:00
Marek Olšák
ff71fae440 nir: strip as we serialize to remove the nir_shader_clone call
Serializing stripped NIR is faster now.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-21 18:49:57 -05:00
Dave Airlie
cce07ea835 nir: fix deref offset builder
Use the correct bit size

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:41 +10:00
Dave Airlie
7325f6ac98 vtn/opencl: add clz support
This is needed for OpenCL

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:41 +10:00
Dave Airlie
d0d96053e6 nir: add 64-bit ufind_msb lowering support. (v2)
This adds the option to lower 64-bit ufind_msb opcodes.

v2: use split_x/y removes component loops (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:37 +10:00
Dave Airlie
12913bcf86 spirv/nir/opencl: handle some multiply instructions.
This adds support for some missing 24-bit and hi multiply
variants.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:25 +10:00
Dave Airlie
5375c30234 spirv: get the correct type for function returns.
This needs to be derived from the address format, not always 1/32.

Suggested by Jason

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:25 +10:00
Dave Airlie
b62a925ad1 spirv: don't store 0 to cs.ptr_size for non kernel stages.
cs is a union so storing this there is wrong.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:25 +10:00
Iago Toral Quiroga
c573b50179 glsl: add missing initialization of the location path field
This was apparently missed in 67b32190f3, which added support
for ARB_shading_language_include to #line, including the 'path'
field for the location.

Fixes crashes in CTS with all drivers as they attempt to access
an uninitialized path string during parsing.

Fixes: 67b32190f3 ("glsl: add ARB_shading_language_include support to #line")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2132
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-11-21 12:55:15 +01:00
Timothy Arceri
cd6322366d compiler: move build definition of pp_standalone_scaffolding.c
This should fix android build issues while still allowing scons to
build the standalone compiler.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2129

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2019-11-21 16:07:08 +11:00
Karol Herbst
5934a53bfe nir/validate: validate num_components on registers and intrinsics
also make 8 and 16 compoments invalid. We will enable that later again
when we actually support it.

v2: fix validation of nir_intrinsic_instr::num_components
    correct validation of instr->num_components

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-21 01:10:24 +01:00
Rhys Perry
ca2de7ae9c nir/large_constants: use nir_index_vars and nir_variable::index
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-20 15:05:42 +00:00
Rhys Perry
9f92e8b721 nir: add nir_variable::index and nir_index_vars
This will be useful as a deterministic identifier/index for the variable.

v2: fix comment style

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1)
2019-11-20 15:05:42 +00:00
Rhys Perry
45a0b53490 nir: make nir_variable::{num_members,num_state_slots} a uint16_t
Doesn't shrink it (at least, on x86-64) and leaves space for more members.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-20 15:05:42 +00:00
Neil Roberts
f6b5abe91a nir/lower_alu_to_scalar: Support lowering 8- and 16-bit reduce ops
Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
634eb9c04b nir: Add a 8-bit bool type
Adds nir_type_bool8 as well as 8-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
0f5640c577 nir: Add a 16-bit bool type
Adds nir_type_bool16 as well as 16-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
2ec97e78a9 nir/opcodes: Add a helper function to generate reduce opcodes
Adds binop_reduce_all_sizes which generates both 1-bit and 32-bit
versions of the reduce operation. This reduces the code duplication a
bit and will make it easier to later add 16-bit versions as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts
9a96afb97e nir/opcodes: Add a helper function to generate the comparison binops
Adds binop_compare_all_sizes which generates both 1-bit and 32-bit
versions of the comparison operation. This reduces the code
duplication a bit and will make it easier to later add 16-bit versions
as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Timothy Arceri
1201d3377e mesa: add support cursor support for relative path shader includes
This will allow us to continue searching the current path for
relative shader includes.

From the ARB_shading_language_include spec:

   "If it is quoted with double quotes in a previously included
   string, then the first search point will be the tree location
   where the previously included string had been found."

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri
db5197cec5 glsl: delay compilation skip if shader contains an include
If the shader contains an include when need to first run the
preprocessor before deciding if we can skip compilation based
on the shader cache.

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri
17df8f8b5d glsl: add can_skip_compile() helper
We will reuse this in the following commit.

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri
5327b756bf glsl: error if #include used while extension is disabled
In other words make sure the shader does this:

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
13a1426b97 glsl: add preprocessor #include support
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
e0fd2fa689 glsl: pass gl_context to glcpp_parser_create()
This is a small tidy up and will be useful in the following commit.

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
67b32190f3 glsl: add ARB_shading_language_include support to #line
From the ARB_shading_language_include spec:

   "#line must have, after macro substitution, one of the following
    forms:

       #line <line>
       #line <line> <source-string-number>
       #line <line> "<path>"

    where <line> and <source-string-number> are constant integer
    expressions and <path> is a valid string for a path supplied in the
    #include directive. After processing this directive (including its
    new-line), the implementation will behave as if it is compiling at
    line number <line> and source string number <source-string-number>
    or <path> path. Subsequent source strings will be numbered
    sequentially, until another #line directive overrides that
    numbering."

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri
35108caa71 glsl: add infrastructure for ARB_shading_language_include
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Marek Olšák
654efd38bb nir: don't use GLenum16 in nir.h
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:12 -05:00
Marek Olšák
ec7d37c9c0 nir: move data.descriptor_set above data.index for better packing
4 bytes down

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:10 -05:00
Marek Olšák
b160acb9f5 glsl_to_nir: rename image_access to mem_access
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:09 -05:00
Marek Olšák
193e2c9625 nir/print: only print image.format for image variables
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:07 -05:00
Marek Olšák
ebe7579655 nir: move data.image.access to data.access
The size of the data structure doesn't change.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:05 -05:00