fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-09 10:08:17 +02:00

Author	SHA1	Message	Date
Eric Anholt	cd5e0b2729	v3d: Use the early_fragment_tests flag for the shader's disable-EZ field. Apparently we need disable-EZ flagged, not just "does Z writes". Fixes dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo on 7278, even though it passed in simulation. Signed-off-by: Eric Anholt <eric@anholt.net> Fixes: `051a41d3d5` ("v3d: Add support for the early_fragment_tests flag.")	2019-02-18 18:09:06 -08:00
Eric Anholt	332b969c4e	v3d: Sync indirect draws on the last rendering. Fixes intermittent fails in dEQP-GLES31.functional.draw_indirect.compute_interop.separate.drawelements_compute_cmd_and_data_and_indices and others (particularly when run as part of a CTS run)	2019-02-18 18:09:06 -08:00
Eric Anholt	32f16b0b1e	v3d: Clear the GMP on initialization of the simulator. Otherwise, we might have pages accessible that shouldn't be and miss out on errors. This is unlikely for most tests since v3d_hw_get_mem() is big enough that it'll be a freshly zeroed mmap, but if screens are destroyed and recreated then we'd be reusing the old v3d_hw_get_mem() contents.	2019-02-18 18:09:06 -08:00
Rob Clark	28fc6733cd	freedreno/a6xx: fix helper_invocation (sampler mask/id) Since gl_HelperInvocation is lowered to: !((1 << sample_id) & sample_mask_in) Not setting these enable bits was causing it to be broken. (And probably a bunch of other stuff too.) Fixes dEQP-GLES31.functional.shaders.helper_invocation.* Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-18 10:37:54 -05:00
Alyssa Rosenzweig	2c6a7fbeb7	panfrost: Fix clipping region Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:13:50 +00:00
Alyssa Rosenzweig	fa1b36ddc2	panfrost: Preserve w sign in perspective division This fixes issues where polygons that should be culled (due to negative w, for instance) may not be. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:13:34 +00:00
Alyssa Rosenzweig	49985cebea	panfrost: Cleanup mali_viewport (clipping) code Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:13:03 +00:00
Alyssa Rosenzweig	a94463732a	panfrost: Swap order of tiled texture (de)alloc Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:10:33 +00:00
Alyssa Rosenzweig	4a4ed53c01	panfrost: Free imported BOs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:10:06 +00:00
Alyssa Rosenzweig	b5a01296f4	panfrost: Fix various leaks unmapping resources v2: Don't check for NULL before free() Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-02-18 05:09:41 +00:00
Rob Clark	99b90ecd35	freedreno/a6xx: cache flush harder Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	1af0c5d320	freedreno/a6xx: compute support Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	5118dcf8c3	freedreno/a6xx: image/ssbo state emit Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	2183d9cff7	freedreno/a6xx: border-color offset helper Soon we'll need this logic to deal w/ image/SSBO case, so split out a helper rather than duplicate the logic. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:28:00 -05:00
Rob Clark	2e0ea3f09c	freedreno/ir3: add image/ssbo <-> ibo/tex mapping Images and SSBOs don't map directly to the hw. They end up being part texture and part something else. Starting with a6xx, the hack used for a5xx to smash the image tex state into hw texture state starting from MAX counting down won't work, because we start using tex state also for SSBO read. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:27:59 -05:00
Rob Clark	aefdb9bed2	freedreno/a6xx: clean up some open-coded bits Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:27:59 -05:00
Rob Clark	b51de44dea	freedreno/a6xx: move stream-out emit to helper Split out of the main fd6_emit() code, since it was already getting to be a pretty giant function. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-02-16 16:26:14 -05:00
Alok Hota	f695e43354	swr/rast: Add translation support to streamout Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:29 -06:00
Alok Hota	a7fa0cc0a5	swr/rast: simdlib cleanup, clipper stack space fixes Reduce stack space used by clipper, which had led to crashes in some versions of MSVC Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:23 -06:00
Alok Hota	f9c29a301a	swr/rast: convert DWORD->uint32_t, QWORD->uint64_t Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:19 -06:00
Alok Hota	c503b58878	swr/rast: Refactor scratch space variable names Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:14 -06:00
Alok Hota	0b4db43705	swr/rast: FP consistency between POSH/RENDER pipes - Ensure all threads have optimal floating-point control state - Disable auto-generation of fused FP ops for VERTEX shader stage - Disable "fast" FP ops for VERTEX shader stage Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:09 -06:00
Alok Hota	dc7b3c95a4	swr/rast: Move knob defaults to generated cpp file Reduces amount of compile churn when testing different default values Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:54:04 -06:00
Alok Hota	05e4ff33f5	swr/rast: Flip BitScanReverse index calculation The intrinsic returns the number of leading zeros, not the bit number of the first nonzero, so just flip it based on the mask size Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:53:58 -06:00
Alok Hota	ae400a9b11	swr/rast: Correctly align 64-byte spills/fills Fixes crashes on some compute shaders when running on AVX512 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:53:54 -06:00
Alok Hota	78bab66479	swr/rast: Disable use of __forceinline by default - Was not useful to inline in release builds - FORCEINLINE can be used if absolutely necessary Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:52:51 -06:00
Alok Hota	20d5c88760	swr/rast: Convert system memory pointers to gfxptr_t Fulfills an unused internal interface Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-02-15 14:52:32 -06:00
James Zhu	9364d66cb7	gallium/auxiliary/vl: Add video compositor compute shader render Add compute shader initialization, assign and cleanup in vl_compositor API. Set video compositor compute shader render as default when the pipe supports it. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
James Zhu	f6ac0b5d71	gallium/auxiliary/vl: Add compute shader to support video compositor render Add compute shader to support video compositor render. Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
James Zhu	299e2bc046	gallium/auxiliary/vl: Rename csc_matrix and increase its size. Rename csc_matrix to shader_params, and increase shader_params size to store more constants for compute shader. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
James Zhu	7b7b5f2029	gallium/auxiliary/vl: Split vl_compositor graphic shaders from vl_compositor API Split vl_compositor graphic shaders from vl_compositor API in order to share vl_compositor API with vl_compositor compute shader later. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
James Zhu	b34d7c5daa	gallium/auxiliary/vl: Move dirty define to header file Move dirty define to header file to share with compute shader. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2019-02-15 10:07:03 -05:00
Gurchetan Singh	67426ccd42	virgl: use virgl_transfer_inline_write even less We've noticed the Team Fortress 2 engine seems to do many small calls to glSubData(..). Let's pick our heuristic based on the resource base width, not the size of a particular upload. This will cause transfers to be batched together in the transfer queue. Relevant glbench microbenchmark -- Before: buffer_upload_dynamic_element_array_131072 = 131.17 mbytes_sec After: buffer_upload_dynamic_element_array_131072 = 6828.24 mbytes_sec Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	f0e71b1088	virgl: use transfer queue This improves Unigine Valley benchmark by 3 to 10 fps (depending on the scene). It also improves the Team Fortress 2 benchmark from 6 fps to 13 fps (host: 20 fps). Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	4a7857b377	virgl: introduce transfer queue Transfers will be placed here at unmap time instead of incurring a VM exit. There's an attempt to deduplicate intersecting 1D transfers, which are surprisingly common. This can also help with mipmapped texture upload and smaller textures, where the majority of the time is spent in the guest kernel / QEMU -- not virglrenderer. This is shown by the GLbench texture upload benchmark: Before: texture_upload_rgba_teximage2d_32 = 64.23 mtexel_sec After: texture_upload_rgba_teximage2d_32 = 367.44 mtexel_sec v2: Split up list iteration functions (@gerddie) v3: Support for optimizing glBufferSubData Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	9c4930946a	virgl: add encoder functions for new protocol Let's encode the new protocol with new helper functions. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	5510cc67e0	virgl: make winsys modifications for encoded transfers The idea is to have two command buffers: 1) One for transfers 2) One for commands, which can include transfers At flush time, (2) will be filled. Otherwise, (1) will be used to submit transfers if there are enough of them. v2: Pass size directly to cmd_buf_create (@gerddie) Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	90e9650585	virgl: add extra checks in virgl_res_needs_flush_wait This is motivated by the following scenario: glBufferSubData(GL_ARRAY_BUFFER, ...) glFlush(..) glBufferSubData(GL_ARRAY_BUFFER, ...) glBufferSubData(GL_ARRAY_BUFFER, ...) glBufferSubData(GL_ARRAY_BUFFER, ...) This increases @davidriley's Team Fortress 2 apitrace from 1 fps to 6 fps and helps with the Chromium glbench microbenchmarks: Before: texture_update_rgba_texsubimage2d_2048 = 554.96 mtexel_sec buffer_upload_dynamic_array_12 = 0.02 mbytes_sec buffer_upload_dynamic_array_576 = 1.07 mbytes_sec After: texture_update_rgba_texsubimage2d_2048 = 612.29 mtexel_sec buffer_upload_dynamic_array_12 = 2.22 mbytes_sec buffer_upload_dynamic_array_576 = 164.89 mbytes_sec Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	ab6ea6e9ce	virgl: pass virgl transfer to virgl_res_needs_flush_wait Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	d98fbd9c92	virgl: keep track of number of computations It's good to keep track of these things. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	35515985a9	virgl: limit command length to 16 bits Much of our logic is based around the idea the upper 16 bits of a command dword can encode the length of the command. Now that the command buffer >= 2^16 - 1, we should check for this. v2: alignment, and only check VIRGL_ENCODE_MAX_DWORDS Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	503ffe46bb	virgl: use virgl_transfer in inline write Let's define a helper function and use it. This commit also allows resources to be emitted into different command buffers. Like the ioctls, send 0 for layer_stride and stride. If we actually send the real values, there are various assumptions in virglrenderer for non-1D buffers that may need to be modified. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	0fcd48bac5	virgl: add protocol for resource transfers Mostly similar to VIRGL_CCMD_RESOURCE_INLINE_WRITE. However, this uses the resource's already attached iovecs rather than the command buffer to transfer the data. v2: Used (1 << 16) not (1 << 15) [@gerddie] Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:05 +01:00
Gurchetan Singh	168c3ffce3	virgl: when creating / freeing transfers, pass slab pool directly This will allow us to destroy transfers w/o having a pointer to the context. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	d5c2dacc15	virgl: unmap uploader at flush time This should save some memory when allocating and freeing transfers. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	14f265b533	virgl: make alignment smaller when uploading index user buffers Since we're just uploading to guest memory, let's just align to dword size. Fixes: e0f932 ("u_upload_mgr: pass alignment to u_upload_data manually") Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	7626e6e189	virgl: track level cleanliness rather than resource cleanliness This allows a minor optimization for texture upload. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	c19aedcf1a	virgl: don't mark unclean after a flush The guest memory is still clean until host GL touches it, which we should track elsewhere. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	5b6a2ae987	virgl: use virgl_resource_dirty helper Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
Gurchetan Singh	1d294ad264	virgl: add ability to do finer grain dirty tracking There are levels to cleanliness. Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-02-15 11:19:04 +01:00
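
Commit 35515985a9 above relies on the upper 16 bits of a command dword encoding the command length. The following is a minimal sketch of that bounds check, not the driver's code: the names `sketch_pack_header`, `sketch_header_len`, `SKETCH_LEN_SHIFT` and `SKETCH_MAX_LEN` are hypothetical, and the lower 16 bits are treated as an opaque value because the log does not describe them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: the length lives in bits 16..31 of the
 * header dword, as the commit message states. The lower 16 bits are
 * passed through untouched. */
#define SKETCH_LEN_SHIFT 16
#define SKETCH_MAX_LEN   0xFFFFu /* largest length that fits in 16 bits */

/* Pack a header dword. Returns false (and leaves *out alone) when the
 * length would overflow the 16-bit field. */
static bool sketch_pack_header(uint16_t low_bits, uint32_t len_dwords,
                               uint32_t *out)
{
    assert(out != NULL);
    if (len_dwords > SKETCH_MAX_LEN)
        return false;
    *out = (len_dwords << SKETCH_LEN_SHIFT) | low_bits;
    return true;
}

/* Recover the length field from a packed header dword. */
static uint32_t sketch_header_len(uint32_t header)
{
    return header >> SKETCH_LEN_SHIFT;
}
```

A caller would reject or split any command for which `sketch_pack_header` returns false, rather than letting the length silently spill out of the field.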

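Commit 05e4ff33f5 above notes that a leading-zero count has to be flipped against the mask size to obtain the index of the highest set bit. A rough sketch of that arithmetic for a 32-bit mask follows; the helper names are hypothetical and a portable loop stands in for the hardware intrinsic, so this is not the swr implementation.

```c
#include <stdint.h>

/* Portable stand-in for a leading-zero-count intrinsic.
 * Returns 32 when mask is 0. */
static unsigned sketch_clz32(uint32_t mask)
{
    unsigned n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (mask & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Index of the highest set bit: flip the leading-zero count against
 * the mask width (32 bits here). The mask must be nonzero. */
static unsigned sketch_highest_set_bit32(uint32_t mask)
{
    return 31u - sketch_clz32(mask);
}
```

For a 64-bit mask the same flip would use 63 in place of 31.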

36129 commits