fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 15:40:11 +01:00

Author	SHA1	Message	Date
Brian Paul	2c07c40d2f	svga: clean up and improve comments in svga_draw_private.h Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	0f983e1793	util/indices: implement unfilled (tri->line) conversion for adjacency prims Tested with new piglit gl-3.2-adj-prims test. v2: re-order trisadj and tristripadj code, per Roland. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	d6c2c7d710	util/indices: implement provoking vertex conversion for adjacency primitives Tested with new piglit gl-3.2-adj-prims test. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	479d364c39	util/indices: assert that the incoming primitive is a triangle type The unfilled index translator/generator functions should only be called when the primitive mode is one of the triangle types. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	26de558072	util/indices: formatting, whitespace fixes in u_unfilled_indices.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	24eadb4810	util/indices: improve comments in u_indices.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-05-26 17:44:17 -06:00
Brian Paul	5393238765	svga: fix primitive mode (point/line/tri) test for unfilled primitives The original mode test was valid before we had GS support. Regression tested with full piglit run. Though, I don't think we have any piglit tests that exercise drawing unfilled adjacency primitives. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-05-26 17:44:17 -06:00
Ian Romanick	b7af108d3e	i965: Enable GL_OES_shader_io_blocks Only one dEQP io_blocks test fails. This test fails for the same reason as the match_different_member_struct_names test in a previous commit. dEQP-GLES31.functional.separate_shader.validation.io_blocks.match_different_member_struct_names v2: Add to release notes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	660240da9e	glsl: Allow shader interface blocks in GLSL ES Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	7a3093efcc	glsl: Add a has_shader_io_blocks helper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	f0902ee813	mesa: Add extension tracking for GL_OES_shader_io_blocks v2: Also support GL_EXT_shader_io_blocks. It's pretty much identical to the OES extension. Suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:24:25 -07:00
Ian Romanick	326a269c77	mesa: Only validate SSO shader IO in OpenGL ES or debug context v2: Move later in series to avoid issues with Gallium drivers and debug contexts. Suggested by Ilia. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-05-26 16:23:53 -07:00
Ian Romanick	3722c76001	mesa: Remove old validate_io function The new validate_io catches all of the cases (and many more) that the old function caught. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:22:25 -07:00
Ian Romanick	bd3f15cffd	mesa: Additional SSO validation using program_interface_query data Fixes the following dEQP tests on SKL: dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_smooth_fragment_flat dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_implicit_explicit_location_1 dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_element_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_none dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_order dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_centroid_fragment_flat dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_array_length dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_precision dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location_type dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_centroid dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_explicit_location dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_qualifier_vertex_flat_fragment_smooth dEQP-GLES31.functional.separate_shader.validation.varying.mismatch_struct_member_name It regresses one test: dEQP-GLES31.functional.separate_shader.validation.varying.match_different_struct_names However, this test is based on language in the OpenGL ES 3.1 spec that I believe is incorrect. I have already submitted a spec bug: https://www.khronos.org/bugzilla/show_bug.cgi?id=1500 v2: Move spec quote about built-in variables to the first place where it's relevant. Suggested by Alejandro. v3: Move patch earlier in series, fix rebase issues. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v2] Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> [v2]	2016-05-26 16:21:01 -07:00
Ian Romanick	cfff746297	mesa: Track the additional data in gl_shader_variable The interface type, interpolation mode, precision, the type of the outermost structure, and whether or not the variable has an explicit location will be used for SSO validation on OpenGL ES. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2016-05-26 16:19:16 -07:00
Jason Ekstrand	15e553daf0	nir: Make nir_const_value a union There's no good reason for it to be a struct of an anonymous union. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96221 Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-26 16:03:44 -07:00
Kenneth Graunke	e7776fa947	i965: Use the buffer object size for VERTEX_BUFFER_STATE's size field. commit `7c8dfa78b9` (i965/draw: Use the real size for vertex buffers) changed how we programmed the VERTEX_BUFFER_STATE size field. Previously, we programmed it to the size of the actual underlying BO, which is page-aligned, and potentially much larger than the GL buffer object. This violated the ARB_robust_buffer_access spec. With that change, we started programming it based on the range of data we expect the draw call to actually access - which is based on the min_index and max_index information provided to glDrawRangeElements(). Unfortunately, applications often provide inaccurate range information to glDrawRangeElements(). For example, all the Unreal demos appear to draw using a range of [0, 3] when the index buffer's actual index range is [0, 5]. Such results are undefined, and we are absolutely allowed to restrict access to the range they specified. However, the failure mode is usually that nothing draws, or misrendering with wild geometry, which is kind of bad for a common mistake. And people tend to assume the range information isn't that important when data is in VBOs. There's no real advantage, either. ARB_robust_buffer_access only requires us to restrict access to the GL buffer object size, not the range of data we think they should access. Doing that allows buggy applications to still function. (Note that we still use this information for busy-tracking, so if they try to overwrite the data with glBufferSubData, they'll still hit a bug.) This seems to be safer. We may want to provide the more strict range as a debug option, or scan the VBO and warn against bogus glDrawRangeElements in debug contexts. That can be done as a later patch, though. Makes Unreal demos draw again. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-05-26 15:56:41 -07:00
Samuel Pitoiset	e01a482182	nvc0: invalidate textures/samplers between 3D and CP on Fermi Like constant buffers, samplers and textures are aliased on Fermi and we need to invalidate the state when switching from 3D to CP and vice versa. This fixes rendering issues in the UE4 demos. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-26 23:51:22 +02:00
Jason Ekstrand	9f0bc0f2b3	anv: Stop linking against libmesa.la and libdri_test_stubs.la This brings the final size of an optimized non-debug build of the Vulkan driver down to 2.9 MB as opposed to 8.7 MB for the dri driver. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	057259655e	i965: Don't link libmesa or libdri_test_stubs into tests Now that the compiler has been completely separated from libmesa, we no longer need these. We can make the tests much smaller by not linking them in. This also ensures that anyone who runs make check won't accidentally put in any dependencies from the compiler to the rest of mesa core. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	870ff6cd38	i965: Move compiler debug functions to intel_screen.c They reference the compiler so they shouldn't go in libi965_compiler.la. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	327161a48d	i965/test: Remove the fragment/vertex_program field from test visitors None of them are actually using it. It's a relic of an older compiler interface that required a gl_program. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	e0ae10c49a	i965: Move brw_new_shader to brw_link.cpp That's where brw_link_shader lives and they seem to go together. Also, this gets it out of libi965_compiler. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	5136b67915	i965: Move brw_nir_lower_uniforms.cpp to i965_FILES This gets it out of i965_compiler.la Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	5e43ba7e9e	i965: Move brw_create_nir to brw_program.c This way it's no longer part of libi965_compiler.la since it depends on GLSL and ARB program stuff. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	86a2447eec	i965/nir: Move the type_size_*_bytes functions to brw_nir.h Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	58d1e82d32	ptn: Include nir.h Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Jason Ekstrand	32210dea8e	compiler: Move glsl_to_nir to libglsl.la Right now libglsl.la depends on libnir.la so putting it in libnir.la adds a dependency on libglsl.la that goes the wrong direction. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:13:38 -07:00
Ben Widawsky	ddcfc35f62	i965/sklgt4: Implement depth/timestamp write w/a The stated bug describes a scenario in which a post sync write operation for depth or timestamp can be ignored. There are two workarounds suggested, the first and easier is to simply do a cs stall when we do these types of writes. The second option is to do a PIPE_CONTROL flush after the post sync but before the data is required. Generally, I believe the data written out is consumed by the application on the CPU side and so doing the easier of the two is ideal. Furthermore, these queries aren't tremendously common in the perf sensitive apps I have looked at. However, there could be cases where a shader stage might directly consume the data, and as a result option 2 may be desirable. This patch goes with the easier solution for now. gen9lp bug_de_id=2137196 By itself, this does not fix any of the GT4 hangs we're currently experiencing. Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-05-26 14:08:17 -07:00
Ben Widawsky	f1fa8b4a1c	i965/bxt: Add 2x6 variant Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-05-26 14:06:43 -07:00
Bas Nieuwenhuizen	43d7305a40	radeonsi: Allow TES distribution between shader engines. The R_028B50_VGT_TESS_DISTRIBUTION value is copied from amdgpu-pro. Smaller values in the ACCUM fields seem to decrease the performance advantage from this patch, higher values don't seem to matter. v2: Add distribution mode field enums. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	f91c85b29b	radeonsi: Process multiple patches per threadgroup. Using more than 1 wave per threadgroup does increase performance generally. Not using too many patches per threadgroup also increases performance. Both catalyst and amdgpu-pro seem to use 40 patches as their maximum, but I haven't really seen any performance increase from limiting the number of patches to 40 instead of 64. Note that the trick where we overlap the input and output LDS does not work anymore as the insertion of the tess factors changes the patch stride. v2: - Add comment about LDS assumptions. - Add constant for buffer size. - Fix code style. v3: - Correct limits for not splitting patches between waves. - Set max num_patches to 40 as in the proprietary driver. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	fd0a7a382f	radeonsi: Add barrier before writing the tess factors. The factors may be stored to LDS by another invocation than the invocation for vertex 0. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	fee3160af9	radeonsi: Enable dynamic HS. This allows running the TES on different CU's than the TCS which results in performance improvements. v2: Only write the control word from one invocation. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	26f436132b	radeonsi: Remove LDS layout user SGPR's from TES. They are unused. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	a4e2146a9d	radeonsi: Use buffer loads and stores for passing data from TCS to TES. We always try to use 4-component loads, as LLVM does not combine loads and they bypass the L1 cache. We can't use a similar strategy for stores and this is especially notable with the tess factors, as they are often set with separate MOV's per component in the TGSI. We keep storing to LDS and the LDS space, so we can load the outputs later, either due to the shader, or for writing the tess factors. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	6217716e8f	radeonsi: Store inputs to memory when not using a TCS. We need to copy the VS outputs to memory. I decided to do this using a shader key, as the value depends on other shaders. I also switch the fixed function TCS over to monolithic, as otherwise many of the user SGPR's need to be passed to the epilog, which increases register pressure, or complexity to avoid that. The main body of the fixed function TCS is not that interesting to precompile anyway, since we do it on demand and it is very small. v2: Use u_bit_scan64. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	7846fa8768	radeonsi: Add offchip buffer address calculation. Instead of creating a memory area per patch and per vertex, we put the same attribute of every vertex & patch together. Most loads and stores access the same attribute across all lanes, only for different patches and vertices. For the TCS this results in tightly packed data for 4-component stores. For the TES this is not the case as within a patch the loads often also access the same vertex. However if there are < 4 vertices/patch, this still results in a reduction of the number of cache lines. In the LDS situation we only do better than worst case if the data per patch < 64 bytes, which due to the tessellation factors is pretty much never. We do not use hardware swizzling for this. It would slightly reduce the number of executed VALU instructions, but I had issues with increased wait times that I haven't been able to solve yet. Furthermore, the tbuffer_store intrinsic does not support both VGPR offset and an index, so we have a problem storing indirectly indexed outputs. This can be solved by temporarily storing arrays in LDS and then copying them, but I don't think that is worth the effort. The difference in VALU cycles hardware swizzling gives is about 0.2% of total busy cycles. That is without handling the array case. I chose attributes instead of components as they are often accessed together, and the software swizzling takes VALU cycles for calculating offsets. v2: - Rename functions to get_tcs_tes_buffer_address. - multiply by 16 as late as possible. - Use tgsi_full_src_register_from_dst. - Remove some bad comments. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	c49e68dc4b	radeonsi: Add user SGPR for the layout of the offchip buffer. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	d9a0c54f6f	radeonsi: Use correct parameter index for LS_OUT_LAYOUT. This happens to be in the right position, but that changes when TCS/TES get new parameters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	3e7a7a9a65	radeonsi: Add buffer load functions. v2: - Use llvm.amdgcn.buffer.load intrinsics for new LLVM. - Code style fixes. v3: - Code style fix. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	9fdb778702	radeonsi: Define build_tbuffer_store_dwords earlier to support new users. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	5c34562d7c	radeonsi: Add offchip tessellation parameters. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Bas Nieuwenhuizen	d27ff7d683	radeonsi: Add buffer for offchip storage between TCS and TES. The buffer is quite large, but should only be allocated if the application uses tessellation. Most non-games don't. v2: - Use the correct register for SI. - Add define for block size. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-05-26 22:07:04 +02:00
Rob Clark	6e51fe75a4	tgsi: fix coverity out-of-bounds warning CID 1271532 (#1 of 1): Out-of-bounds read (OVERRUN)34. overrun-local: Overrunning array of 2 16-byte elements at element index 2 (byte offset 32) by dereferencing pointer &inst.Dst[i]. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 15:17:49 -04:00
Rob Clark	3d66ba971e	tgsi: fix out of bounds access Not sure why coverity calls this an out-of-bounds read vs out-of-bounds write. CID 1358920 (#1 of 1): Out-of-bounds read (OVERRUN)9. overrun-local: Overrunning array r of 3 16-byte elements at element index 3 (byte offset 48) using index chan (which evaluates to 3). Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 15:17:49 -04:00
Anuj Phogat	0c02d7002d	i965: Don't use fast copy blit in case of logical operations other than GL_COPY XY_FAST_COPY_BLT command doesn't have a field for raster operation. So, fall back to using XY_SRC_COPY_BLT to handle those cases. Fixes piglit test gl-1.1-xor-copypixels when fast copy blit is enabled for all tiling formats. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-05-26 10:57:09 -07:00
Anuj Phogat	97f0f91cc1	i965/gen9: Remove the halign/valign field setup code in fast copy blit Experimentation with different values of src/dst horizontal/vertical alignment showed that these fields are not used on gen9 hardware. A recent update in graphics specs has removed these fields from XY_FAST_COPY_BLT command. Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Chad Versace <chad.versace@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2016-05-26 10:57:09 -07:00
Samuel Pitoiset	c52e92ec3a	nvc0: allow to monitor MP perf counters with compute shaders To read out MP perf counters we use a compute shader and need to upload input data like a 64-bit address used to store the values and a sequence ID for synchronization. Currently, this input data is uploaded as user uniforms, which means that it's stuck in c0[], but if a compute shader from a real application is used, monitoring those performance counters will just overwrite some data and crash miserably. Instead, sticking the 64-bit address and the sequence ID into the driver constant buffer seems much better and will allow monitoring counters with GL 4.3 apps. Tested on GF119 and GK110, but should not hurt anything on GK104. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-05-26 19:34:57 +02:00
Kristian Høgsberg Kristensen	329d115ac6	mesa: Move robustness code to main/robustness.c Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-05-26 09:37:17 -07:00


82416 commits