fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 17:50:12 +01:00

Author	SHA1	Message	Date
Caio Marcelo de Oliveira Filho	7558340ebb	intel/compiler: Add helpers to select SIMD for compute shaders Clean up the logic and move it to functions that work with prog_data attributes to select the right SIMD. This shouldn't change any behavior compared to the original. Having it extracted will allow reuse by Task/Mesh and make it easier to write tests. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13249>	2021-10-26 17:49:09 +00:00
Dylan Baker	e73096bd6d	meson: use gtest protocol for gtest based tests when possible With the `gtest` protocol meson will add some extra arguments to the test to generate better junit results, which may be useful. This protocol is only available in meson 0.55.0+, so keep using the default `exitcode` protocol for meson older than that. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8484>	2021-10-16 03:22:24 +00:00
Caio Marcelo de Oliveira Filho	29177c7cee	intel/compiler: Build all tests in a single binary With gtest it is possible to filter execution and run only a specific test suite or individual test, so there's no particular reason here to generate multiple binaries for the tests of a single module. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13303>	2021-10-15 10:06:51 -07:00
Caio Marcelo de Oliveira Filho	bd2cc4b916	intel/compiler: Convert test_eu_compact to use gtest Be consistent with the other test suites in intel/compiler. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13340>	2021-10-13 17:24:29 +00:00
Jordan Justen	b5514a2236	intel/compiler: Rename brw_nir_lower_image_load_store to brw_nir_lower_storage_image Reworks: * Add crocus Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9466>	2021-07-21 11:02:15 -07:00
Dave Airlie	52e426fd8b	intel/compiler: add support for compiling fixed function gs This is ported from i965, but the interface is cleaned up Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9721>	2021-05-04 03:39:45 +00:00
Anuj Phogat	dc28390e3c	intel: Rename genx keyword in filenames to gfxx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" find $SEARCH_PATH -type f -name "gen[[:digit:]].[cph]" -exec sh -c 'f="{}"; mv -- "$f" "${f/gen/gfx}"' \; grep -E "gen[[:digit:]]+_[[:alnum:]_]\.(c\|h\|cpp)" -rIl $SEARCH_PATH \| xargs sed -ie "s/gen$[[:digit:]]\+_[[:alnum:]_]\.$$c\\|h\\|cpp$/gfx\1\2/g" grep -E "_gen[[:digit:]]+[[:alnum:]_]\.(c\|h\|cpp)" -rIl $SEARCH_PATH \| xargs sed -ie "s/$_$gen$[[:digit:]]\+[[:alnum:]_]\.$$c\\|h\\|cpp$/\1gfx\2\3/g" grep -E "GEN[[:digit:]]+[[:alnum:]_]_H( \|$)" -rIl $SEARCH_PATH \| xargs sed -ie "s/GEN$[[:digit:]]\+[[:alnum:]_]*_H$$ \\|$$/GFX\1\2/g" Exclude the "_pack.h" changes: grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH \| xargs sed -ie "s/gfx$[[:digit:]]\+_pack\.h$/gen\1/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Jason Ekstrand	303378e1dd	intel/rt: Add lowering for combined intersection/any-hit shaders Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>	2020-11-25 05:37:10 +00:00
Jason Ekstrand	ca88cd8e5a	intel/rt: Add return instructions at the end of ray-tracing shaders Each callable ray-tracing shader stage has to perform a return operation at the end. In the case of raygen shaders, it retires the bindless thread because the raygen shader is always the root of the call tree. In the case of any-hit shaders, the default action is to accept the hit. For callable, miss, and closest-hit shaders, it does a return operation. The assumption is that the calling shader has placed a BINDLESS_SHADER_RECORD address for the return in the first QWord of the callee's scratch space. The return operation simply loads this value and calls a btd_spawn intrinsic to jump to it. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>	2020-11-25 05:37:10 +00:00
Jason Ekstrand	2b3f6cdc6c	intel/rt: Add lowering functions for each ray-tracing stage These will eventually contain per-stage lowering for various ray-tracing things. This is separate from brw_nir_lower_rt_intrinsics because, for reasons that will become apparent later, brw_nir_lower_rt_intrinsics has to be run very late in the compile process, right before brw_compile_bs. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>	2020-11-25 05:37:10 +00:00
Jason Ekstrand	c7660918d7	intel/rt: Add a pass to lower the new ray-tracing intrinsics The new intrinsics we added for doing address calculations are all things we fetch from the RT_DISPATCH_GLOBALS struct. We could emit an RT_DISPATCH_GLOBALS load at every point we want it and trust NIR to CSE it for us but it's easier to use intermediate intrinsics. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>	2020-11-25 05:37:10 +00:00
Jason Ekstrand	6e50db4eda	intel/rt: Add builder helpers for accessing RT data structures Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>	2020-11-25 05:37:10 +00:00
Jason Ekstrand	6d5b57aeb7	intel/rt: Add a brw_rt.h header with #defines for basic RT data structures Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>	2020-11-25 05:37:09 +00:00
Rob Clark	53f7d539cd	util: Add helgrind support for simple_mtx Annoyingly mtypes.h pulls in simple_mtx, which means we end up needing to sprinkle a lot of idep_mesautil around. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3773 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7644>	2020-11-24 21:03:34 +00:00
Boris Brezillon	689acc7398	intel/compiler: Extract control barriers from scoped barriers Add a lowering pass extracting all control barriers embedded in scoped barriers into proper control barriers so we can get rid of the logic inserting control barriers when an SpvOpControlBarrier with WorkGroup scope is parsed in spirv_to_nir(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4900>	2020-06-03 07:39:52 +00:00
Dylan Baker	a8e2d79e02	meson: use gnu_symbol_visibility argument This uses a meson builtin to handle -fvisibility=hidden. This is nice because we don't need to track which languages are used, if C++ is suddenly added meson just does the right thing. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>	2020-06-01 18:59:18 +00:00
Francisco Jerez	188a3659ae	intel/ir: Import shader performance analysis pass. This introduces an analysis pass intended to estimate several performance statistics of the shader, including cycle count latency and throughput values, based on static modeling. It has instruction performance information more comprehensive than the current scheduling pass for all platforms between Gen4-11, and works on both the FS and VEC4 back-end. The most immediate purpose of this pass is to implement a heuristic meant to determine whether using SIMD32 dispatch for a fragment shader can be expected to help more than it hurts. In addition this will allow the effect of passes run after scheduling (e.g. the TGL software scoreboard pass and the VEC4 dependency control pass) to be visible in shader-db statistics. But that isn't the end of the story, other potential applications of this pass (not part of this MR) I've been playing around with are: - Implement a similar SIMD16 heuristic allowing the identification of inefficient SIMD16 fragment shaders. - Implement similar SIMD16 and SIMD32 heuristics for the compute shader stage -- Currently compute shader builds always use the SIMD16 shader if available and never use the SIMD32 shader unless strictly necessary, which is suboptimal under certain conditions. - Hook up to the instruction scheduler in order to improve the accuracy of its timing information. - Use as heuristic in order to drive the selection of scheduling modes (Matt was experimenting with that). - Plug to the TGL software scoreboard pass in order to implement a more effective SBID token allocation algorithm, since in general the optimal token allocation depends on the timings of all instructions in the program. - Use its bottleneck detection functionality in order to implement a heuristic computing a more optimal bound for the number of fragment shader threads executed in parallel (by adjusting the MaximumNumberofThreadsPerPSD control of 3DSTATE_PS). As a follow-up I'm planning to submit updated timing information for Gen12 platforms -- Everything else required to support Gen12 like SWSB handling is already included in this patch, but there were some IP concerns regarding the TGL timing parameters since they cannot currently be obtained with the documentation and hardware which is publicly available. The timing parameters for any previous Gen7-11 platforms can be obtained by anyone by sampling the timestamp register using e.g. shader_time, though I have some more convenient instrumentation coming up. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-04-28 23:01:03 -07:00
Eric Engestrom	8970b7839a	intel: drop unused include directories Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>	2020-03-28 21:36:54 +01:00
Eric Engestrom	79af30768d	meson: inline `inc_common` Let's make it clear what includes are being added everywhere, so that they can be cleaned up. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>	2020-03-28 21:36:54 +01:00
Francisco Jerez	03eb46f4a7	intel/compiler: Introduce simple IR analysis pass framework Motivated in detail in the source code. The only piece missing here from the analysis pass infrastructure is some sort of mechanism to broadcast changes in the IR to all existing analysis passes, which will be addressed by a future commit. The analysis_dependency_class enum might seem a bit silly at this point, more interesting dependency categories will be defined later on. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:20:30 -08:00
Francisco Jerez	d46fb2126d	intel/compiler: Move base IR definitions into a separate header file This pulls out the i965 IR definitions into a separate file and leaves the top-level backend_shader structure and back-end compiler entry points in brw_shader.h. The purpose is to keep things tidy and prevent a nasty circular dependency between brw_cfg.h and brw_shader.h. The logical dependency between these data structures looks like: backend_shader (brw_shader.h) -> cfg_t (brw_cfg.h) -> bblock_t (brw_cfg.h) -> backend_instruction (brw_shader.h) This circular header dependency is currently resolved by using forward declarations of cfg_t/bblock_t in brw_shader.h and having brw_cfg.h include brw_shader.h, which seems backwards and won't work at all when the forward declarations of cfg_t/bblock_t are no longer sufficient in a future commit. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4012>	2020-03-06 10:20:11 -08:00
Lionel Landwerlin	397ff2976b	intel: Implement Gen12 workaround for array textures of size 1 Gen12 does not support RENDER_SURFACE_STATE::SurfaceArray = true && RENDER_SURFACE_STATE::Depth = 0. SurfaceArray can only be set to true if Depth >= 1. We workaround this limitation by adding the max(value, 1) snippet in the shaders on the 3 components for texture array sizes. Tested on Gen9 with the following Vulkan CTS tests : dEQP-VK.image.image_size.2d_array.* v2: Drop debug print (Tapani) Switch to GEN:BUG instead of Wa_ v3: Fix dEQP-VK.image.image_size.1d_array.* cases (Lionel) v4: Fix dEQP-VK.glsl.texture_functions.query.texturesize.* cases (Missing tex_op handling) (Lionel) v5: Missing break statement (Lionel) v6: Fixup comment (Tapani) v7: Fixup comment again (Tapani) v8: Don't use sample_dim as index (Jason) Rename pass Simplify control flow Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v7) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3362> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3362>	2020-01-26 22:27:03 +02:00
Sagar Ghuge	7ecfbd4f6d	nir: Add alpha_to_coverage lowering pass Importing this pass from fs_visitor::emit_alpha_to_coverage_workaround() in intel/compiler. v2 (Caio Marcelo de Oliveira Filho): - Track store output and sample mask instruction - Nest math instruction for more readability - Bail out early if no gl_SampleMask v3: (Caio Marcelo de Oliveira Filho): - Do math instructions after instruction block - Restructure code - Move pass under src/intel/compiler v4: (Caio Marcelo de Oliveira Filho): - Organize dither mask calculation Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-21 11:27:29 -07:00
Caio Marcelo de Oliveira Filho	c847bfaaf5	intel/fs/gen12: Add tests for scoreboard pass Tests the combinations of cases of RAW, WAW and WAR hazards involving both inorder and outoforder instructions. Also tests that dependencies combine and propagate correctly through control flow (loops and conditionals). v2: Add an extra test illustrating that the non-logical CFG edge between then-block and else-block is being taken into account. (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-10-17 10:02:35 -07:00
Francisco Jerez	265c7c8971	intel/fs/gen12: Introduce software scoreboard lowering pass. Gen12+ hardware lacks the register scoreboard logic that used to guarantee data coherency between register reads and writes in previous generations. This lowering pass runs after register allocation in order to make up for it. It works by performing global dataflow analysis in order to determine the set of potential dependencies of every instruction in the shader, and then inserts any required SWSB annotations and additional SYNC instructions in order to guarantee data coherency. v2: Drop unnecessary _safe list iteration (Caio). v3: Temporarily workaround potential WaR hazard between FPU instruction and subsequent out-of-order write, pending clarification from the hardware team. Drop redundant tracking of implicit access of acc0-1, since the hardware guarantees coherency of these (but not the other accumulators...). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	25dd67099d	intel/eu: Rework opcode description tables to allow efficient look-up by either HW or IR opcode. This rewrites the current opcode description tables as a more compact flat data structure. The purpose is to allow efficient constant-time look-up by either HW or IR opcode, which will allow us to drop the hard-coded correspondence between HW and IR opcodes -- See the next commits for the rationale. brw_eu.c is now built as C++ source so we can take advantage of pointers to member in order to make the look-up function work regardless of the opcode_desc member used as look-up key. v2: Optimize devinfo struct comparison (Caio) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Eric Engestrom	178811d8f6	meson: drop unused dep_{thread,dl} Unused as of last commit. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-03 00:08:37 +00:00
Eric Engestrom	d2d85b950d	meson: replace libmesa_util with idep_mesautil This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually. Next commit will remove the ones that were only added for that reason. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-03 00:08:37 +00:00
Iago Toral Quiroga	3e377c68f8	intel/compiler: add a NIR pass to lower conversions Some conversions are not directly supported in hardware and need to be split in two conversion instructions going through an intermediary type. Doing this at the NIR level simplifies a bit the complexity in the backend. v2: - Consider fp16 rounding conversion opcodes - Properly handle swizzles on conversion sources. v3 - Run the pass earlier, right after nir_opt_algebraic_late (Jason) - NIR alu output types already have the bit-size (Jason) - Use 'is_conversion' to identify conversion operations (Jason) v4: - Be careful about the intermediate types we use so we don't lose range and avoid incorrect rounding semantics (Jason) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-18 11:05:18 +02:00
Kenneth Graunke	fad7801afd	i965: Move program key debugging to the compiler. The i965 driver has a bunch of code to compare two sets of program keys and print out the differences. This can be useful for debugging why a shader needed to be recompiled on the fly due to non-orthogonal state dependencies. anv doesn't do recompiles, so we didn't need to share this in the past - but I'd like to use it in iris. This moves the bulk of the code to the compiler where it can be reused. To make that possible, we need to decouple it from i965 - we can't get at the brw program cache directly, nor use brw_context to print things. Instead, we use compiler->shader_perf_log(), and simply pass in keys. We put all of this debugging code in brw_debug_recompile.c, and only export a single function, for simplicity. I also tidied the code a bit while moving it, now that it all lives in one file. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-04-16 09:01:15 -07:00
Jason Ekstrand	9d437f9482	intel/fs: Drop the fs_surface_builder All of the actual abstraction (except possibly setting size_written) happens as part of the logical opcodes. The only thing that the surface builder is providing at this point is extra levels of functions to call through. I'm going to be adding bindless image support soon and all the extra abstraction here is just getting in the way. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Francisco Jerez	2c99c7a56c	intel/fs: Remove existing lower_conversions pass. It's redundant with the functionality provided by lower_regioning now. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:09 -08:00
Francisco Jerez	efa4e4bc5f	intel/fs: Introduce regioning lowering pass. This legalization pass is meant to handle situations where the source or destination regioning controls of an instruction are unsupported by the hardware and need to be lowered away into separate instructions. This should be more reliable and future-proof than the current approach of handling CHV/BXT restrictions manually all over the visitor. The same mechanism is leveraged to lower unsupported type conversions easily, which obsoletes the lower_conversions pass. v2: Give conditional modifiers the same treatment as predicates for SEL instructions in lower_dst_modifiers() (Iago). Special-case a couple of other instructions with inconsistent conditional mod semantics in lower_dst_modifiers() (Curro). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:09 -08:00
Ian Romanick	440c051340	i965/vec4/dce: Don't narrow the write mask if the flags are used In an instruction sequence like cmp(8).ge.f0.0 vgrf17:D, vgrf2.xxxx:D, vgrf9.xxxx:D (+f0.0) sel(8) vgrf1:UD, vgrf8.xyzw:UD, vgrf1.xyzw:UD The other fields of vgrf17 may be unused, but the CMP still needs to generate the other flag bits. To my surprise, nothing in shader-db or any test suite appears to hit this. However, I have a change to brw_vec4_cmod_propagation that creates cases where this can happen. This fix prevents a couple dozen regressions in that patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `5df88c20` ("i965/vec4: Rewrite dead code elimination to use live in/out.")	2018-12-17 13:47:06 -08:00
Dylan Baker	a999798daa	meson: Add tests to suites Meson test has a concept of suites, which allow tests to be grouped together. This allows for only a subset of tests to be run (say only the tests for nir). A test can be added to more than one suite, but for the most part I've only added a test to a single suite, though I've added a compiler group that includes nir, glsl, and glcpp tests. To use this you'll need to invoke meson test directly, instead of ninja test (which always runs all targets). It can be invoked as: `meson test -C builddir --suite $suitename` (meson test has additional options that are pretty useful). Tested-By: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-20 09:09:22 -08:00
Jason Ekstrand	6339aba775	intel/compiler: Lower SSBO and shared loads/stores in NIR We have a bunch of code to do this in the back-end compiler but it's fairly specific to typed surface messages and the way we emit them. This breaks it out into NIR where it's easier to do things a bit more generally. It also means we can easily share the code between the vec4 and FS back-ends if we wish. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-15 19:59:49 -06:00
Jason Ekstrand	37f7983bcc	intel/compiler: Do image load/store lowering to NIR This commit moves our storage image format conversion codegen into NIR instead of doing it in the back-end. This has the advantage of letting us run it through NIR's optimizer which is pretty effective at shrinking things down. In the common case of rgba8, the number of instructions emitted after NIR is done with it is half of what it was with the lowering happening in the back-end. On the downside, the back-end's lowering is able to directly use predicates and the NIR lowering has to use IFs. Shader-db results on Kaby Lake: total instructions in shared programs: 15166910 -> 15166872 (<.01%) instructions in affected programs: 5895 -> 5857 (-0.64%) helped: 15 HURT: 0 Clearly, we don't have that much image_load_store happening in the shaders in shader-db.... Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Mathieu Bridon	2ee1c86d71	meson: Build with Python 3 Now that all the build scripts are compatible with both Python 2 and 3, we can flip the switch and tell Meson to use the latter. Since Meson already depends on Python 3 anyway, this means we don't need two different Python stacks to build Mesa. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-10 15:15:09 -07:00
Matt Turner	f3833f1ca7	intel/compiler: Use gen_get_device_info() in test_eu_validate Previously the unit test filled out a minimal devinfo struct. A previous patch caused the test to begin assert failing because the devinfo was not complete. Avoid this by using the real mechanism to create devinfo. Note that we have to drop icl from the table, since we now rely on the name -> PCI ID translation done by gen_device_name_to_pci_device_id(), and ICL's PCI IDs are not upstream yet. Fixes: `f89e735719` ("intel/compiler: Check for unsupported register sizes.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-16 13:20:21 -07:00
Dylan Baker	2083a14179	meson: Use dependencies for nir This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is built. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and its headers is quite high. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	4ccb981673	meson: Use consistent style for tests Don't use intermediate variables, use consistent whitespace. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	fbf192a67e	meson: Use consistent style Currently the meson build has a mix of two styles: arg : [foo, ... bar], and arg : [ foo, ..., bar, ] For consistency let's pick one. I've picked the latter style, which I think is more readable, and is more common in the mesa code base. v2: - fix commit message Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Francisco Jerez	af2c320190	intel/fs: Implement GRF bank conflict mitigation pass. Unnecessary GRF bank conflicts increase the issue time of ternary instructions (the overwhelmingly most common of which is MAD) by roughly 50%, leading to reduced ALU throughput. This pass attempts to minimize the number of bank conflicts by rearranging the layout of the GRF space post-register allocation. It's in general not possible to eliminate all of them without introducing extra copies, which are typically more expensive than the bank conflict itself. In a shader-db run on SKL this helps roughly 46k shaders: total conflicts in shared programs: 1008981 -> 600461 (-40.49%) conflicts in affected programs: 816222 -> 407702 (-50.05%) helped: 46234 HURT: 72 The running time of shader-db itself on SKL seems to be increased by roughly 2.52%±1.13% with n=20 due to the additional work done by the compiler back-end. On earlier generations the pass is somewhat less effective in relative terms because the hardware incurs a bank conflict anytime the last two sources of the instruction are duplicate (e.g. while trying to square a value using MAD), which is impossible to avoid without introducing copies. E.g. for a shader-db run on SNB: total conflicts in shared programs: 944636 -> 623185 (-34.03%) conflicts in affected programs: 853258 -> 531807 (-37.67%) helped: 31052 HURT: 19 And on BDW: total conflicts in shared programs: 1418393 -> 987539 (-30.38%) conflicts in affected programs: 1179787 -> 748933 (-36.52%) helped: 47592 HURT: 70 On SKL GT4e this improves performance of GpuTest Volplosion by 3.64% ±0.33% with n=16. NOTE: This patch intentionally disregards some i965 coding conventions for the sake of reviewability. This is addressed by the next squash patch which introduces an amount of (for the most part boring) boilerplate that might distract reviewers from the non-trivial algorithmic details of the pass. The following patch is squashed in: SQUASH: intel/fs/bank_conflicts: Roll back to the nineties. Acked-by: Matt Turner <mattst88@gmail.com>	2017-12-07 15:56:06 -08:00
Matt Turner	821ec473a8	i965: Rename intel_asm_annotation -> brw_disasm_info It was the only file named intel_* in the compiler. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-17 12:14:38 -08:00
Rob Clark	2207af032b	meson: extract out variable for nir_algebraic.py Also needed in freedreno/ir3. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-24 15:33:40 -04:00
Jason Ekstrand	b1d1b7222a	intel/compiler: Make brw_nir_lower_intrinsics compute-specific It's already only ever called from brw_compile_cs and only handles compute intrinsics. Let's just make it CS-specific. We can always make it handle other stages again later if we want. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Dylan Baker	7a5a986ddd	meson: convert gtest to an internal dependency In truth gtest is an external dependency that upstream expects you to "vendor" into your own tree. As such, it makes sense to treat it more like a dependency than an internal library, and collect its requirements together in a dependency object. v2: - include with -isystem instead of setting compiler args (Eric) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-03 10:02:08 -07:00
Dylan Baker	d1992255bb	meson: Add build Intel "anv" vulkan driver This allows building and installing the Intel "anv" Vulkan driver using meson and ninja, the driver has been tested against the CTS and seems to pass the same series of tests (they both segfault when the CTS tries to run wayland wsi tests). There are still a mess of TODO, XXX, and FIXME comments in here. Those are mostly for meson bugs I'm trying to fix, or for additional things to implement for other drivers/features. I have configured all intermediate libraries and optional tools to not build by default (meaning they will only be built if they're pulled in as a dependency of a target that will actually be installed); this allows us to avoid massive if chains, while ensuring that only the bits that need to be built are. v2: - enable anv, x11, and wayland by default - add configure option to disable valgrind v3: - fix typo in meson_options (Nicholas) v4: - Remove dead code (Eric) - Remove change to generator that was from v0 (Eric) - replace if chain with loop (Eric) - Fix typos (Eric) - define HAVE_DLOPEN for both libdl and builtin dl cases (Eric) v5: - rebase on util string buffer implementation Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v4)	2017-09-27 09:12:19 -07:00

48 commits