Commit graph

742 commits

Author SHA1 Message Date
Jason Ekstrand
dee58ecd2e intel/fs/nir: Use Q immediates for load_const on gen8+
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand
9bb34892bf intel/fs/nir: Setup immediates based on type in i2b and f2b
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-11-07 10:41:24 -08:00
Jason Ekstrand
ab9220edd6 nir,intel/compiler: Use a fixed subgroup size
The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared
as a uniform.  This means that it cannot change across an invocation
such as a draw call or a compute dispatch.  For compute shaders, we're
ok because we only ever use one dispatch size.  For fragment, however,
the hardware dynamically chooses between SIMD8 and SIMD16 which violates
the spec.  Instead, let's just pick a subgroup size based on the shader
stage.  The fixed size we choose for compute shaders is a bit higher
than strictly needed but there's no real harm in that.  The advantage is
that, if they do anything interesting with the value, NIR will see it as
an immediate and can optimize better.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
dc4cf11dfc intel/fs: Explicitly set EXECUTE_1 where needed
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
6132992cdb intel/compiler/fs: Set up subgroup invocation as a system value
Subgroup invocation is computed using a vector immediate and some
dispatch-aware arithmetic.  Unfortunately, due to the vector arithmetic,
and the fact that it's frequently read 16-wide, it's not something that
can easily be CSEd by the back-end compiler.  There are a few different
possible approaches to this problem:

 1) Emit the code to calculate the subgroup invocation on-the-fly and
    trust NIR to do the CSE.  This is what we were doing.

 2) Add a back-end instruction for the subgroup ID.  This has the
    advantage of helping the back-end compiler with CSE but has the
    downside of very poor scheduling for the calculation because it has
    to be emitted in the back-end.

 3) Emit the calculation at the top of the program and re-use the
    result.  This gets rid of the CSE problem but comes at the cost of
    an extra live register.

This commit switches us from 1) to 3).  We choose to store the subgroup
invocation values as a W type to reduce the impact of the extra live
register.  Trusting NIR and using 1) was fine but we're soon going to
want to use the subgroup invocation value for other things in the
back-end compiler and this makes it much easier to do without having to
worry about CSE problems.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
295605c930 intel/cs: Push subgroup ID instead of base thread ID
We're going to want subgroup ID for SPIR-V subgroups eventually anyway.
We really only want to push one and calculate the other from it.  It
makes a bit more sense to push the subgroup ID because it's simpler to
calculate and because it's a real API thing.  The only advantage to
pushing the base thread ID is to avoid a single SHL in the shader.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
80ddfab2f5 intel/cs: Rework the way thread local ID is handled
Previously, brw_nir_lower_intrinsics added the param and then emitted a
load_uniform intrinsic to load it directly.  This commit switches things
over to use a specific NIR intrinsic for the thread id.  The one thing I
don't like about this approach is that we have to copy thread_local_id
over to the new visitor in import_uniforms.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
b299ded02e intel/fs: use pull constant locations to check for first compile of a shader
Before, we bailing in assign_constant_locations based on the minimum
dispatch size.  The more direct thing to do is simply to check for
whether or not we have constant locations and bail if we do.  For
nir_setup_uniforms, it's completely safe to do it multiple times because
we just copy a value from the NIR shader.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
103081c9a9 intel/fs: Retype dest to match value in read[First]Invocation
This is what we really wanted all along.  Always retyping to D works
because that's what get_nir_src() always gives us, at least for 32-bit
types.  The SPIR-V variants of these operations accept arbitrary types
and we need this if we're going to handle 64 or 16-bit values.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
ebaee9da4a intel/fs: Uniformize the index in readInvocation
The index is any value provided by the shader and this can be called in
non-uniform control flow so we can't just take component 0.  Found by
inspection.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
aa4ff4b98c i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
ec8c6649f1 i965/fs/nir: Minor refactor of store_output
Stop retyping the output of shuffle_64bit_data_for_32bit_write.  It's
always BRW_REGISTER_TYPE_D which is perfectly fine for writing out.
Also, when we change get_nir_src to return something with a 64-bit type
for 64-bit values, the retyping will not be at all what we want.  Also,
retyping the output based on src.type before we whack it back to 32 bits
is a problem because the output is always 32 bits.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
030d2b5016 i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
All callers of this function allocate a fs_reg expressly to pass into
it.  It's much easier if we just let the helper allocate the register.
While we're here, we switch it to doing the MOVs with an integer type so
that we don't accidentally canonicalize floats on half of a double.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
6197a6b7ac i965/fs/nir: Simplify 64-bit store_output
The swizzles weren't doing any good because swiz is just XYZW.  Also, we
were emitting an extra set of MOVs because shuffle_64bit_data_for_32bit
already does a MOV for us.  Finally, the temporary was only ever used
inside the inner loop so there's no need for it to actually be an array.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-11-07 10:37:52 -08:00
Jason Ekstrand
1b8ef49f48 intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
For some reason, the any/all predicates don't work properly with SIMD32.
In particular, it appears that a SEL with a QtrCtrl of 2H doesn't read
the correct subset of the flag register and you end up getting garbage
in the second half.  Work around this by using a pair of 1-wide MOVs and
scattering the result.  This fixes the any/all instructions for SIMD32.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
1f41663007 intel/fs: Use an explicit D type for vote any/all/eq intrinsics
The any/all intrinsics return a boolean value so D or UD is the correct
type.  Unfortunately, get_nir_dest has the annoying behavior of
returnning a float type by default.  This causes format conversion which
gives us -1.0f or 0.0f in the register.  If the consumer of the result
does an integer comparison to zero, it will give you the right boolean
value but if we do something more clever based on the 0/~0 assumption
for booleans, this will give the wrong value.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
6c00240bc6 intel/fs: Don't stomp f0.1 in SIMD16 ballot
In fragment shaders f0.1 is used for discards so doing ballot after a
discard can potentially cause the discard to not happen.  However, we
don't support SIMD32 fragment shaders yet so this isn't a problem.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Jason Ekstrand
def013a863 intel/fs: Use ANY/ALL32 predicates in SIMD32
We have ANY/ALL32 predicates and, for the most part, they work just
fine.  (See the next commit for more details.)  Also, due to the way
that flag registers are handled in hardware, instruction splitting is
able to split the CMP correctly.  Specifically, that hardware looks at
the execution group and knows to shift it's flag usage up correctly so a
2H instruction will write to f0.1 instead of f0.0.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-11-07 10:37:52 -08:00
Alejandro Piñeiro
4723933b8e i965/fs: Add brw_reg_type_from_bit_size utility method
Returns the brw_type for a given ssa.bit_size, and a reference type.
So if bit_size is 64, and the reference type is BRW_REGISTER_TYPE_F,
it returns BRW_REGISTER_TYPE_DF. The same applies if bit_size is 32
and reference type is BRW_REGISTER_TYPE_HF it returns BRW_REGISTER_TYPE_F

v2 (Jason Ekstrand):
 - Use better unreachable() messages
 - Add Q types

Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-25 16:14:09 -07:00
Jason Ekstrand
99778e7f9f i965/fs/nir: Use the nir_src_bit_size helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-25 16:14:09 -07:00
Samuel Iglesias Gonsálvez
c6d7d09bd0 i965/fs: remove setting default LOD in the backend
It is already done in NIR.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-20 08:29:53 +02:00
Kenneth Graunke
6f5abf3146 i965: Fix output register sizes when multiple variables share a slot.
ARB_enhanced_layouts allows multiple output variables to share the same
location - and these variables may not have the same sizes.  For
example, consider these output variables:

   // consume X/Y/Z components of 6 vectors
   layout(location = 0) out vec3 a[6];

   // consumes W component of the first vector
   layout(location = 0, component = 3) out float b;

Looking at the first declaration, we see that VARYING_SLOT_VAR0 needs 24
components worth of space (vec3 padded out to a vec4, 4 * 6 = 24).  But
looking at the second declaration, we would think that VARYING_SLOT_VAR0
needs only 4 components of space (a single float padded out to a vec4).

nir_setup_outputs() only considered the space requirements of the first
declaration it happened to see, so if 'float b' came first, it would
underallocate the output register space, causing brw_fs_validator.cpp
to assert fail about inst->dst.offset exceeding the register size.

Fixes Piglit's tests/spec/arb_enhanced_layouts/execution/component-layout/
vs-to-fs-array-interleave-single-location.shader_test.

Thanks to Tim Arceri for finding this bug and writing a test!

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-10 17:29:37 -07:00
Iago Toral Quiroga
5ec21eb1a0 i965/tes: account for the fact that dvec3/4 inputs take two slots
When computing the total size of the URB for tessellation evaluation
inputs we were not accounting for this, and instead we were always
assuming that each input would take a single vec4 slot, which could
lead to computing a smaller read size than required. Specifically, this
is a problem when the last input is a dvec3/4 such that its XY components
are stored in the the second half of a payload register (which can happen
if the offset for the input in the URB is not 64-bit aligned because
there are 32-bit inputs mixed in) and the ZW components in the
first half of the next, as in this case we would fail to account for the
extra slot required for the ZW components.

Fixes (requires another fix in CTS currently in review):
KHR-GL45.enhanced_layouts.varying_locations
KHR-GL45.enhanced_layouts.varying_array_locations

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-10 08:59:54 +02:00
Matt Turner
7e88f93469 i965/fs: Rewrite fsign64 to skip the float -> double conversion
... without the float -> double conversion. Low power parts have
additional restrictions when it comes to operating on 64-bit types, and
the instruction used to do the conversion violates one of them:
specifically, the restriction that "Source and Destination horizontal
stride must be aligned to the same qword".

Previously we generated a float and then converted, but we can avoid the
conversion by using the same extract-the-sign-bit + or-in-1.0 algorithm
by directly operating on the high four bytes of each double-precision
component in the result.

In SIMD8 and SIMD16 this cuts one instruction from the implementation,
and more importantly that instruction is the one which violated the
regioning restriction.

Along the way I removed some comments that I did not think helped, and
some code about double comparisons which does not seem to be necessary
today.

This prevents validation failures caught by the new EU validation code
added in later patches.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-10-04 14:08:54 -07:00
Matt Turner
b541945c20 i965/fs: Unpack count argument to 64-bit shift ops on Atom
64-bit operations on Atom parts have additional restrictions over their
big-core counterparts (validated by later patches).

Specifically, the restriction that "Source and Destination horizontal
stride must be aligned to the same qword" is violated by most shift
operations since NIR uses a 32-bit value as the shift count argument,
and this causes instructions like

   shl(8)          g19<1>Q         g5<4,4,1>Q      g23<4,4,1>UD

where src1 has a 32-bit stride, but the dest and src0 have a 64-bit
stride.

This caused ~4 pixels in the ARB_shader_ballot piglit test
fs-readInvocation-uint.shader_test to be incorrect. Unfortunately no
ARB_gpu_shader_int64 test hit this case because they operate on
uniforms, and their scalar regions are an exception to the restriction.

We work around this by effectively unpacking the shift count, so that we
can read it with a 64-bit stride in the shift instruction. Unfortunately
the unpack (a MOV with a dst stride of 2) is a partial write, and cannot
be copy-propagated or CSE'd.

Bugzilla: https://bugs.freedesktop.org/101984
2017-10-04 14:08:54 -07:00
Iago Toral Quiroga
47e527bd81 i965/fs: force pull model for 64-bit GS inputs
Triggering the push model when 64-bit inputs are involved is not easy due to
the constrains on the maximum number of registers that we allow for this mode,
however, for GS with 'points' primitive type and just a couple of double
varyings we can trigger this and it just doesn't work because the
implementation is not 64-bit aware at all. For now, let's make sure that we
don't attempt this model whith 64-bit inputs and we always fall back to pull
model for them.

Also, don't enable the VUE handles in the thread payload on the fly when we
find an input for which we need the pull model, this is not safe: if we need
to resort to the pull model we need to account for that when we setup the
thread payload so we compute the first non-payload register properly. If we
didn't do that correctly and we enable it on-the-fly here then we will end up
VUE handles on the first non-payload register which will probably lead to
GPU hangs. Instead, always enable the VUE handles for the pull model so we
can safely use them when needed. The GS is going to resort to pull model
almost in every situation anyway, so this shouldn't make a significant
difference and it makes things easier and safer.

v2: Always enable the VUE handles for pull model, this is easier and safer
    and the GS is going to fallback to pull model almost always anyway (Ken)

v3: Only clamp the URB read length if we are over the maximum reserved for
    push inputs as we were doing in the original code (Ken).

v4: No need to clamp the urb read length if invocations > 1

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-29 08:18:25 +02:00
Matt Turner
069bf7c907 i965/fs: Match destination type to size for ballot
No use in taking a 64-bit value when we know the high 32-bits are zero.
2017-07-20 16:56:50 -07:00
Matt Turner
782ef30451 i965/fs: Implement ARB_shader_ballot operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
43ef75b394 nir: Add system values from ARB_shader_ballot
We already had a channel_num system value, which I'm renaming to
subgroup_invocation to match the rest of the new system values.

Note that while ballotARB(true) will return zeros in the high 32-bits on
systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
variables do not consider whether channels are enabled. See issue (1) of
ARB_shader_ballot.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
ee9fa4ac18 i965/fs: Implement ARB_shader_group_vote operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Kenneth Graunke
b2da123801 i965: Use pushed UBO data in the scalar backend.
This actually takes advantage of the newly pushed UBO data, avoiding
pull loads.

Improves performance in GLBenchmark Manhattan 3.1 by:

   HSW: ~1%, BDW/SKL/KBL GT2: 3-4%, SKL GT4: 7-8%, APL: 4-5%.
   (thanks to Eero Tamminen for these numbers)

shader-db results on Skylake, ignoring programs with spill/fill changes:

   total instructions in shared programs: 13963994 -> 13651893 (-2.24%)
   instructions in affected programs: 4250328 -> 3938227 (-7.34%)
   helped: 28527
   HURT: 0

   total cycles in shared programs: 179808608 -> 172535170 (-4.05%)
   cycles in affected programs: 79720410 -> 72446972 (-9.12%)
   helped: 26951
   HURT: 1248

   LOST:   46
   GAINED: 21

Many "Deus Ex: Mankind Divided" shaders which already spilled end up
spill a lot more (about 240 programs hurt, 9 helped).  The cycle
estimator suggests this is still overall a win (-0.23% in cycle counts)
presumably because we trade pull loads for fills.

v2: Drop "PULL" environment variable left in for initial debugging
    (caught by Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:54 -07:00
Lionel Landwerlin
226fae7849 intel/compiler: no need to check unsigned is >= 0
CID: 1338342
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:45 +01:00
Lionel Landwerlin
030abc6109 intel: compiler/i965: fix is_broxton checks
In 5f2fe9302c is_geminilake was introduced for the differenciate
broxton from geminilake. Unfortunately I failed as verifying that
is_broxton is throughout the code base to mean Gen9lp.

Fixes: 5f2fe9302c ("intel: common: add flag to identify platforms by name")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-20 23:26:42 +01:00
Jason Ekstrand
e31042ab40 i965/fs: Move remapping of gl_PointSize to the NIR level
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:06 -07:00
Jason Ekstrand
ca4d192802 i965/fs: Lower gl_VertexID and friends to inputs at the NIR level
NIR calls these system values but they come in from the VF unit as
vertex data.  It's terribly convenient to just be able to treat them as
such in the back-end.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand
5e832302dc i965: Move multiply by 4 for VS ATTR setup into the scalar backend.
The vec4 backend will want to count in units of vec4s, not scalar
components.  The simplest solution is to move the multiplication by 4
into the scalar backend.  This also improves consistency with how we
count varyings.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand
b86dba8a0e nir: Embed the shader_info in the nir_shader again
Commit e1af20f18a changed the shader_info
from being embedded into being just a pointer.  The idea was that
sharing the shader_info between NIR and GLSL would be easier if it were
a pointer pointing to the same shader_info struct.  This, however, has
caused a few problems:

 1) There are many things which generate NIR without GLSL.  This means
    we have to support both NIR shaders which come from GLSL and ones
    that don't and need to have an info elsewhere.

 2) The solution to (1) raises all sorts of ownership issues which have
    to be resolved with ralloc_parent checks.

 3) Ever since 00620782c9, we've been
    using nir_gather_info to fill out the final shader_info.  Thanks to
    cloning and the above ownership issues, the nir_shader::info may not
    point back to the gl_shader anymore and so we have to do a copy of
    the shader_info from NIR back to GLSL anyway.

All of these issues go away if we just embed the shader_info in the
nir_shader.  There's a little downside of having to copy it back after
calling nir_gather_info but, as explained above, we have to do that
anyway.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand
3503b2714b i965/fs: Always provide a default LOD of 0 for TXS and TXL
We already provide a default LOD for textureQueryLevels and texture() on
non-fragment stages.  However, there are more cases where one is needed
such as textureSize(gsampler2DMS*) in SPIR-V.  Instead of trying to list
out all of the cases one at a time, just provide the default for all TXS
and TXL operations.  This fixes a shader validation error in the new
Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-04 18:33:35 -07:00
Jason Ekstrand
762a6333f2 nir: Rework conversion opcodes
The NIR story on conversion opcodes is a mess.  We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random.  This commit re-organizes things and makes them all
consistent:

 - All non-bool conversion opcodes now have the explicit size in the
   destination and are named <src_type>2<dst_type><size>.

 - Integer <-> integer conversion opcodes now only come in i2i and u2u
   forms (i2u and u2i have been removed) since the only difference
   between the different integer conversions is whether or not they
   sign-extend when up-converting.

 - Boolean conversion opcodes all have the explicit size on the bool and
   are named <src_type>2<dst_type>.

Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako.  This will make adding
int8, int16, and float16 versions much easier when the time comes.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
7107b32155 i965/fs: Re-arrange conversion operations
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
b377be9213 i965/fs: Use num_components from the SSA def in image intrinsics
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
700bebb958 i965: Move the back-end compiler to src/intel/compiler
Mostly a dummy git mv with a couple of noticable parts:
 - With the earlier header cleanups, nothing in src/intel depends
files from src/mesa/drivers/dri/i965/
 - Both Autoconf and Android builds are addressed. Thanks to Mauro and
Tapani for the fixups in the latter
 - brw_util.[ch] is not really compiler specific, so it's moved to i965.

v2:
 - move brw_eu_defines.h instead of brw_defines.h
 - remove no-longer applicable includes
 - add missing vulkan/ prefix in the Android build (thanks Tapani)

v3:
 - don't list brw_defines.h in src/intel/Makefile.sources (Jason)
 - rebase on top of the oa patches

[Emil Velikov: commit message, various small fixes througout]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Renamed from src/mesa/drivers/dri/i965/brw_fs_nir.cpp (Browse further)