fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 15:48:19 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	35955afa7a	intel: aubinator: fix read the context/ring Up to now we've been lucky that the buffer returned was always exactly at the address we requested. Fixes: `144b40db54` ("intel: aubinator: drop the 1Tb GTT mapping") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-04 09:38:34 +01:00
Jason Ekstrand	1d900e55fd	anv/pipeline: Disable FS dispatch for pointless fragment shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-03 05:52:23 -07:00
Andres Gomez	2d4d139877	intel/tools: add error2aub creation into autotools Tarball distribution is done through "make distcheck". We include the meson targets also into autotools so they won't fail when building from the tarball. Fixes: `6a60beba40` ("intel/tools: Add an error state to aub translator") Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Dylan Baker <dylan.c.baker@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-02 21:15:57 +03:00
Jason Ekstrand	7ef6cd0ee8	anv/pipeline: Do cross-stage linking optimizations This appears to help the Aztec Ruins benchmark by about 2% on my Kaby Lake gt2 laptop. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	a5bffa061d	anv/pipeline: Pull most of the anv_pipeline_compile_* into common code This leaves us with a series of little anv_pipeline_compile_* functions which each take a compiler object, a mem_ctx, the stage to compile, and the previous stage for VUE linking purposes. Some of them do interesting things but most are little more than wrappers around brw_compile_*. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	5351339554	anv/pipeline: Add a separate "link" stage This breaks compilation up a bit into "link" and "compile". In the "link" stage, new anv_pipeline_link_* helpers are called which are responsible for setting up the binding table and doing anything needed to properly link with the next stage in the pipeline if one exists. They are called in reverse order starting with the fragment shader so you can assume linking in later stages is already done. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	5b196f39bd	anv/pipeline: Compile to NIR in compile_graphics This pulls the SPIR-V to NIR step out into common code. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	946fcd02a9	anv/pipeline: Recompile all shaders if any are missing from the cache Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	f76d6d8a63	anv/pipeline: Drop anv_pipeline_add_compiled_stage We can set active_stages much more directly and then it's just candy around setting pipeline->stages[stage]. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	703a24932a	anv/pipeline: Pull shader compilation out into a helper. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	f3c59ca947	anv/pipeline: Call anv_pipeline_compile_* in a loop Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	bdc3565c8c	anv/pipeline: Hash the entire pipeline in one go Instead of hashing each stage separately (and TES and TCS together), we hash the entire pipeline. This means we'll get fewer cache hits if they, for instance, re-use the same VS over and over again but it also means we can now safely do cross-stage optimizations. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	4a8236ae17	anv/pipeline: Populate keys up-front Instead of having each anv_pipeline_compile_* function populate the shader key, make it part of the anv_pipeline_stage struct and fill it out up-front. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	76503b319a	anv/pipeline: Add a helper struct for per-stage info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jordan Justen	8fcdb71d8c	intel/compiler: Add brw_get_compiler_config_value for disk cache During code review, Jason pointed out that: `2b3064c073` "i965, anv: Use INTEL_DEBUG for disk_cache driver flags" Didn't account for INTEL_SCALER_* environment variables. To fix this, let the compiler return the disk_cache driver flags. Another possible fix would be to pull the INTEL_SCALER_* into INTEL_DEBUG bits, but as we are currently using 41 of 64 bits, I didn't think it was a good use of 4 more of these bits. (5 since INTEL_PRECISE_TRIG needs to be accounted for as well.) Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 23:49:16 -07:00
Jordan Justen	3887700dfd	i965: Disable shader cache with INTEL_DEBUG=shader_time Shader time hard codes an index of the shader time buffer within the gen program. In order to support shader time in the disk shader cache, we'd need to add the shader time index into the program key. This should work, but probably is not worth it for this particular debug feature. Therefore, let's just disable the disk shader cache if the shader time debug feature is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106382 Fixes: `96fe36f7ac` "i965: Enable disk shader cache by default" Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 23:30:49 -07:00
Jason Ekstrand	de9e5cf35a	anv/pipeline: Add populate_tcs/tes_key helpers They don't really do anything interesting, but it's more consistent this way. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	e621f57556	anv/pipeline: Rework the parameters to populate_wm_prog_key Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	b2e0b0dad6	anv/pipeline: More aggressively optimize away color attachments Instead of just looking at the number of color attachments, look at which ones are actually used by the subpass. This lets us potentially throw away chunks of the fragment shader. In DXVK, for example, all subpasses have 8 attachments and most are VK_ATTACHMENT_UNUSED so this is very helpful in that case. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	80bc0b728c	anv: Restrict the number of color regions to those actually written The back-end compiler emits the number of color writes specified by wm_prog_key::nr_color_regions regardless of what nir_store_outputs we have. Once we've gone through and figured out which render targets actually exist and are written by the shader, we should restrict the key to avoid extra RT write messages. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4d57e543b8	anv/pipeline: Fix up deref modes if we delete a FS output With the new deref instructions, we have to keep the modes consistent between the derefs and the variables they reference. Since we remove outputs by changing them to local variables, we need to run the fixup pass to fix the modes. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4434591bf5	intel/nir: Call nir_lower_io_to_scalar_early Shader-db results on Kaby Lake: total instructions in shared programs: 15166953 -> 15073611 (-0.62%) instructions in affected programs: 2390284 -> 2296942 (-3.91%) helped: 16469 HURT: 505 total loops in shared programs: 4954 -> 4951 (-0.06%) loops in affected programs: 3 -> 0 helped: 3 HURT: 0 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	b0bb547f78	intel/nir: Split IO arrays into elements The NIR nir_lower_io_arrays_to_elements pass attempts to split I/O variables which are arrays or matrices into a sequence of separate variables. This can help link-time optimization by allowing us to remove varyings at a more granular level. Shader-db results on Kaby Lake: total instructions in shared programs: 15177645 -> 15168494 (-0.06%) instructions in affected programs: 79857 -> 70706 (-11.46%) helped: 392 HURT: 0 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	57804efa88	i965/fs: Flag all slots of a flat input as flat Otherwise, only the first vec4 of a matrix or other complex type will get marked as flat and we'll interpolate the others. This was caught by a dEQP test which started failing because it did a SSO vs. non-SSO comparison. Previously, we did the interpolation wrong consistently in both versions. However, with one of Tim Arceri's NIR linking patches, we started splitting the matrix input into vectors at link time in the non-SSO version and it started getting correctly interpolated which didn't match the broken SSO version. As of this commit, they both get correctly interpolated. Fixes: `e61cc87c75` "i965/fs: Add a flat_inputs field to prog_data" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4e060385e9	intel/nir: Use the correct scalar stage for consumers when linking Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Lionel Landwerlin	2477e516d9	intel: tools: aubwrite: split gen[89] from gen10+ Gen10+ has an additional bit in MI_BATCH_BUFFER_END to signal the end of the context image. We select the largest size for the context image regardless of the generation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-01 15:31:56 +01:00
Mathieu Bridon	a71df20855	python: Explicitly use byte strings In both Python 2 and 3, zlib.Compress.compress() takes a byte string, and returns a byte string as well. In Python 2, the script was working because: 1. string literals were byte strings; 2. opening a file in unicode mode, reading from it, then passing the unicode string to compress() would automatically encode to a byte string; On Python 3, the above two points are not valid any more, so: 1. zlib.Compress.compress() refuses the passed unicode string; 2. compressed_data, defined as an empty unicode string literal, can't be concatenated with the byte string returned by compress(); This commit fixes this by explicitly using byte strings where appropriate, so that the script works on both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	c24d826968	python: Open file in binary mode The XML parser wants byte strings, not unicode strings. In both Python 2 and 3, opening a file without specifying the mode will open it for reading in text mode ('r'). On Python 2, the read() method of the file object will return byte strings, while on Python 3 it will return unicode strings. Explicitly specifying the binary mode ('rb') makes the behaviour identical in both Python 2 and 3, returning what the XML parser expects. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	12eb5b496b	python: Better get character ordinals In Python 2, iterating over a byte-string yields single-byte strings, and we can pass them to ord() to get the corresponding integer. In Python 3, iterating over a byte-string directly yields those integers. Transforming the byte string into a bytearray gives us a list of the integers corresponding to each byte in the string, removing the need to call ord(). This makes the script compatible with both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 14:26:19 +01:00
Iago Toral Quiroga	471bce5689	intel/compiler: implement 8-bit constant load Fixes VK-GL-CTS CL#2567 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-01 08:08:15 +02:00
Iago Toral Quiroga	7e6c8b0cb7	intel/compiler: add setup_imm_(u)b helpers The hardware doesn't support byte immediates, so similar to setup_imm_df() for doubles, these helpers work by loading the constant value into a VGRF. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-01 08:08:15 +02:00
Topi Pohjolainen	a5889d70f2	i965/icl: Disable binding table prefetching Gen 11 workarounds table #2056 WABTPPrefetchDisable suggests to disable prefetching of binding tables for ICLLP A0 and B0 steppings. It fixes multiple gpu hangs in ext_framebuffer_multisample* tests on ICLLP B0 h/w. Anuj: Add comments and commit message. Add gen 11 checks in the code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-27 11:05:04 -07:00
Iago Toral Quiroga	615aaedb93	intel/compiler: fix lower conversions to account for predication The pass can create a temporary result for the instruction and then moves from it to the original destination, however, if the original instruction was predicated, the mov has to be predicated as well. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-07-27 14:48:29 +02:00
Kenneth Graunke	488972222c	i965: Combine both gl_PatchVerticesIn lowering passes. Until now, we had separate passes for lowering gl_PatchVerticesIn to a statically known constant (for TES inputs when linked against a TCS), and a uniform in the other cases. Annoyingly, one had to be run before nir_lower_system_values, and the other afterward. This simplified the passes, but made life painful for the callers. This patch combines both into a single pass. If you give it a non-zero static count, it uses that. If you give it Mesa state slots, it turns it back into a built-in uniform. Otherwise, it does nothing. This also moves the i965 uniform lowering out to shared code. v2: Make token arrays const. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-26 21:51:36 -07:00
Kenneth Graunke	8794fe3e30	intel/compiler: Delete dead VS intrinsic handling. These are lowered by brw_nir_lower_vs_inputs(). If they weren't, we would have already hit the unreachable() in emit_system_values_block(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-26 11:45:34 -07:00
Eric Engestrom	2cc1849afb	anv: drop unused local vars Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-26 10:21:03 +01:00
Eric Engestrom	2a4191bb38	anv: remove incorrect `UNUSED` flag Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-26 10:06:11 +01:00
Kenneth Graunke	37c3efca29	intel: Make the decoder just store addresses for bases, not buffers. The various base addresses are simply addresses. There may or may not be a buffer located at those addresses. So, it doesn't make much sense to request one. Just save the raw address so we can add it later, when asking about BOs at the final <base + offset> address. Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-25 14:43:54 -07:00
Kenneth Graunke	933223db3c	intel: Make the decoder handle STATE_BASE_ADDRESS not being a buffer. Normally, i965 programs STATE_BASE_ADDRESS every batch, and puts all state for a given base in a single buffer. I'm working on a prototype which emits STATE_BASE_ADDRESS only once at startup, where each base address is a fixed 4GB region of the PPGTT. State may live in many buffers in that 4GB region, even if there isn't a buffer located at the actual base address itself. To handle this, we need to save the STATE_BASE_ADDRESS values across multiple batches, rather than assuming we'll see the command each time. Then, each time we see a pointer, we need to ask the driver for the BO map for that data. (We can't just use the map for the base address, as state may be in multiple buffers, and there may not even be a buffer at the base address to map.) v2: Fix things caught in review by Lionel: - Drop bogus bind_bo.size check. - Drop "get the BOs again" code - we just get the BOs as needed - Add a message about interface descriptor data being unavailable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-25 14:43:47 -07:00
Eric Engestrom	aa59f9c8bc	anv: don't crash on vkDestroyDevice(NULL) CovID: 1438132 Fixes: `a99c9e63a0` "anv: finish the binding_table_pool on destroyDevice when use_softpin" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-07-25 21:04:30 +01:00
Eric Engestrom	bbf8316fcb	anv: fix python whitespace warning Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-25 20:55:35 +01:00
Eric Engestrom	e0347581f3	anv: cleanup python imports Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-25 20:55:35 +01:00
Eric Engestrom	ce7348507e	anv: remove unnecessary semicolons in python Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-25 20:55:35 +01:00
Lionel Landwerlin	b21b38c46c	intel: tools: dump: only store device id on success We might fail on master node drm fd because we won't have the right permissions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-07-25 16:53:06 +01:00
Jordan Justen	2b3064c073	i965, anv: Use INTEL_DEBUG for disk_cache driver flags Since various options within INTEL_DEBUG could impact code generation, we need to set the disk cache driver_flags parameter based on the INTEL_DEBUG flags in use. An example that will affect the program generated by i965 is the INTEL_DEBUG=nocompact option. The DEBUG_DISK_CACHE_MASK value is added to mask the settings of INTEL_DEBUG that can affect program generation. v2: * Use driver_flags (Tim) * Also update Anvil (Jason) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-24 16:17:28 -07:00
Jordan Justen	69a686b0ae	i965, anv: Add extra unused character in disk_cache renderer temp string This extra character should not be used by snprintf, but we make it available to verify that we printed the exact number we wanted, and didn't overflow. v2: * Also update Anvil Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-24 16:17:25 -07:00
Karol Herbst	7f95564a22	nir: rename f2f16_undef to f2f16 we need rounding modes on other conversions involving floats and it is easier to rename f2f16_undef than renaming all the other ones. v2: rebased on master Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-24 20:40:05 +02:00
Mathieu Bridon	9ebd8372b9	python: Use range() instead of xrange() Python 2 has a range() function which returns a list, and an xrange() one which returns an iterator. Python 3 lost the function returning a list, and renamed the function returning an iterator as range(). As a result, using range() makes the scripts compatible with both Python versions 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	5530cb1296	python: Better iterate over dictionaries In Python 2, dictionaries have 2 sets of methods to iterate over their keys and values: keys()/values()/items() and iterkeys()/itervalues()/iteritems(). The former return lists while the latter return iterators. Python 3 dropped the methods which return lists, and renamed the methods returning iterators to keys()/values()/items(). Using those names makes the scripts compatible with both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-24 11:07:04 -07:00
Kenneth Graunke	9b34742495	intel: Make the disassembler take a const pointer to the assembly. Disassembling doesn't modify the assembly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-24 11:04:56 -07:00
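
The Python 2/3 compatibility commits listed above (a71df20855, c24d826968, 12eb5b496b, 9ebd8372b9, 5530cb1296) each describe one idiom; the sketch below puts them side by side. It is an illustration only, not code from the Mesa tree: the helper names compress_chunks, parse_xml_file, byte_values, offsets and describe are hypothetical, and using xml.etree.ElementTree is an assumption, since the commit message only refers to "the XML parser".

```python
import xml.etree.ElementTree as ET
import zlib


def compress_chunks(chunks):
    """Compress an iterable of byte strings with a streaming compressor."""
    compressor = zlib.compressobj()
    # Start from a byte-string literal so that concatenating the byte
    # strings returned by compress() works on Python 3 as well.
    compressed = b""
    for chunk in chunks:
        compressed += compressor.compress(chunk)
    compressed += compressor.flush()
    return compressed


def parse_xml_file(path):
    """Read a file in binary mode so the parser receives bytes."""
    # 'rb' makes read() return a byte string on both Python 2 and 3.
    with open(path, "rb") as xml_file:
        return ET.fromstring(xml_file.read())


def byte_values(data):
    """Return the integer value of each byte without calling ord()."""
    # bytearray yields integers when iterated on both Python versions.
    return list(bytearray(data))


def offsets(count, stride):
    """Build a list of offsets using range(), which exists in both versions."""
    return [index * stride for index in range(count)]


def describe(table):
    """Format key/value pairs using items(), which exists in both versions."""
    return ["%s=%d" % (key, value) for key, value in sorted(table.items())]
```

On Python 2, range() and items() build full lists rather than iterators, which costs some memory on large inputs but keeps a single code path for both interpreters.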

3329 commits