fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-02-04 02:00:35 +01:00

Author	SHA1	Message	Date
Emil Velikov	dafcb21405	virgl: use virgl_screen/surface upcast wrappers Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	7af46b9c74	virgl: introduce and use virgl_transfer/texture/resource inline wrappers The only two remaining cases of (struct virgl_resource *) require a closer look. Either the error checking is missing or the arguments provided feel wrong. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	6b123fa07f	virgl: add virgl_context/sampler_view/so_target() upcast wrappers Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	1f43e4e1a3	winsys/virgl/drm: drop unneeded forward declaration Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	e0056228f6	virgl: remove sw_winsys pointer from virgl_screen The screen already has a pointer to the (base) winsys object. With the latter of which implemented/sub-classed as either drm or sw based one, depending on the target. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	0c82c2fb0b	virgl: rename virgl.h to virgl_screen.h Provide a more meaningful name considering it's purpose. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	87f7d61e19	virgl: move virgl_hw.h into the driver dir Strictly speaking virgl_hw.h should reside in the driver folder, as it describes the hardware. Moving it allows us to nuke the following strange dependency winsys/vtest > driver > winsys/drm Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	014f8ef2ff	virgl: straighten the includes confusion Use the relevant GALLIUM_foo_CFLAGS which has all the requirements (not to mention VISIBITY_CFLAGS) and keep ../ out of the include directives. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	2c705d2220	virgl: remove the _FILE_OFFSET_BITS defines The build already sets it as needed. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	a05648fd7e	winsys/virgl/drm: add all files to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	8b9e69e2ea	winsys/virgl/vtest: list all files in Makefile.sources Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:36:46 +00:00
Emil Velikov	73308ca802	virgl: move sources list to Makefile.sources ... and add the missing files while we're at it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:33:11 +00:00
Emil Velikov	c1bf71f77c	virgl: fix drm.h include path The drm/ prefix is required, if using the kernel provided headers. As most distros don't ship them it and we already depend on libdrm (which adds the relevant -I flag) just drop the drm/ from the include. Once a libdrm release with the virtgpu_drm.h header is released, we can drop our local copy of the file. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:29:01 +00:00
Emil Velikov	60418a28ea	i965: enable ARB_shader_clock on gen7+ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:23:18 +00:00
Emil Velikov	4379ca22f1	i965: Implement nir_intrinsic_shader_clock v2: - Add a few const qualifiers for good measure. - Drop unneeded retype()s (Matt) - Convert timestamp to SIMD8/16, as fs_visitor::get_timestamp() returns SIMD4 (Connor) v3: - Remove unneeded temporary + MOV (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:40 +00:00
Emil Velikov	6a15517242	i965/fs: move the fs_reg::smear() from get_timestamp() to the callers We're about to reuse get_timestamp() for the nir_intrinsic_shader_clock. In the latter the generalisation does not apply, so move the smear() where needed. This also makes the function analogous to the vec4 one. v2: Tweak the comment - The caller -> We (Matt, Connor). v3: More comment tweaks (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:36 +00:00
Emil Velikov	7682844f34	nir: add shader_clock intrinsic v2: Add flags and inline comment/description. v3: None of the input/outputs are variables v4: Drop clockARB reference, relate code motion barrier comment wrt intrinsic flag. v5: Drop the "thus we can eliminate..." comment (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:32 +00:00
Emil Velikov	f1d98fc90a	glsl: add support for the clock2x32ARB function v2: correctly set the return type Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:29 +00:00
Emil Velikov	51265c1b85	glsl: add ARB_shader_clock infrastructure Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:27 +00:00
Emil Velikov	e916d5e013	mesa: add infra for ARB_shader_clock Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:23 +00:00
Samuel Pitoiset	0d0329df8f	nv50: do not create an invalid HW query type While we are at it, store the rotate offset for occlusion queries to nv50_hw_query like on nvc0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2015-10-30 17:57:15 +01:00
Samuel Pitoiset	5f1eeb799b	nv50: move HW queries to nv50_query_hw.c/h files Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2015-10-30 17:57:15 +01:00
Samuel Pitoiset	76b48ceee9	nv50: move nva0_so_target_save_offset() to its correct location Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2015-10-30 17:57:15 +01:00
Samuel Pitoiset	2e3fe0379e	nv50: add a header file for nv50_query Like for nvc0, this will allow to split different types of queries and to prepare the way for both global performance counters and MP counters. While we are at it, make use of nv50_query struct instead of pipe_query. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-30 17:57:15 +01:00
Julien Isorce	e7ed3963ed	st/va: add support to export a surface as dmabuf I.e. implements: VaAcquireBufferHandle VaReleaseBufferHandle for memory of type VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME And apply relatives change to: vlVaMapBuffer vlVaUnMapBuffer vlVaDestroyBuffer Implementation inspired from cgit.freedesktop.org/vaapi/intel-driver Tested with gstreamer-vaapi with nouveau driver. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:21:20 +01:00
Julien Isorce	802ba6f865	st/va: implement VaDeriveImage And apply relatives change to: vlVaBufferSetNumElements vlVaCreateBuffer vlVaMapBuffer vlVaUnmapBuffer vlVaDestroyBuffer vlVaPutImage It is unfortunate that there is no proper va buffer type and struct for this. Only possible to use VAImageBufferType which is normally used for normal user data array. On of the consequences is that it is only possible VaDeriveImage is only useful on surfaces backed with contiguous planes. Implementation inspired from cgit.freedesktop.org/vaapi/intel-driver Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:21:11 +01:00
Julien Isorce	5e763aaa21	st/va: add more errors checks in vlVaBufferSetNumElements and vlVaMapBuffer Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:41 +01:00
Julien Isorce	86eb4131a9	st/va: add headless support, i.e. VA_DISPLAY_DRM This patch allows to use gallium vaapi without requiring a X server running for your second graphic card. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:35 +01:00
Julien Isorce	1bdea0e579	st/va: handle Video Post Processing for configs Add support for VA_PROFILE_NONE and VAEntrypointVideoProc in the 4 following functions: vlVaQueryConfigProfiles vlVaQueryConfigEntrypoints vlVaCreateConfig vlVaQueryConfigAttributes Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:29 +01:00
Julien Isorce	0b868807e4	st/va: add colospace conversion through Video Post Processing Add support for VPP in the following functions: vlVaCreateContext vlVaDestroyContext vlVaBeginPicture vlVaRenderPicture vlVaEndPicture Add support for VAProcFilterNone in: vlVaQueryVideoProcFilters vlVaQueryVideoProcFilterCaps vlVaQueryVideoProcPipelineCaps Add handleVAProcPipelineParameterBufferType helper. One application is: VASurfaceNV12 -> gstvaapipostproc -> VASurfaceRGBA Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:10 +01:00
Julien Isorce	05b6ce4209	st/va: implement dmabuf import for VaCreateSurfaces2 For now it is limited to RGBA, BGRA, RGBX, BGRX surfaces. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:03 +01:00
Julien Isorce	adf1133118	st/va: implement VaCreateSurfaces2 and VaQuerySurfaceAttributes Inspired from http://cgit.freedesktop.org/vaapi/intel-driver/ especially src/i965_drv_video.c::i965_CreateSurfaces2. This patch is mainly to support gstreamer-vaapi and tools that uses this newer libva API. The first advantage of using VaCreateSurfaces2 over existing VaCreateSurfaces, is that it is possible to select which the pixel format for the surface. Indeed with the simple VaCreateSurfaces function it is only possible to create a NV12 surface. It can be useful to create a RGBA surface to use with video post processing. The avaible pixel formats can be query with VaQuerySurfaceAttributes. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:19:54 +01:00
Julien Isorce	d42029d2d9	st/va: do not destroy old buffer when new one failed If formats are not the same vlVaPutImage re-creates the video buffer with the right format. But if the creation of this new video buffer fails then the surface looses its current buffer. Let's just destroy the previous buffer on success. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:19:47 +01:00
Julien Isorce	87109e5f88	st/va: properly defines VAImageFormat formats and improve VaCreateImage Added PIPE_VIDEO_CHROMA_FORMAT_NONE in p_format.h and return it by default in ChromaToPipe. Renamed YCbCrToPipe to VaFourccToPipeFormat because it now contains RGB. Implemented PipeFormatToVaFourcc which will be used later in VlVaDeriveImage. Note that gstreamer-vaapi check all the VAImageFormat fields. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:05:23 +01:00
Samuel Iglesias Gonsalvez	7b8cc37585	main: fix basename match's check if it's an array or struct Commit `4565b6f` did not update the basename match's check for the case that string would exactly match the name of the variable if the suffix "[0]" were appended to it. Fixes two dEQP-GLES31 tests: dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element v2: - Change the position of rname_has_array_index_zero to avoid an out-of-bounds read. Reported by Tapani Pälli. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-30 08:12:53 +01:00
Kristian Høgsberg	f7f1bc6cca	i965: Fix invalid memory accesses after resizing brw_codegen's store table Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-30 07:49:10 +01:00
Connor Abbott	73caa26e43	i965/sched: use liveness analysis for computing register pressure Previously, we were using some heuristics to try and detect when a write was about to begin a live range, or when a read was about to end a live range. We never used the liveness analysis information used by the register allocator, though, which meant that the scheduler's and the allocator's ideas of when a live range began and ended were different. Not only did this make our estimate of the register pressure benefit of scheduling an instruction wrong in some cases, but it was preventing us from knowing the actual register pressure when scheduling each instruction, which we want to have in order to switch to register pressure scheduling only when the register pressure is too high. This commit rewrites the register pressure tracking code to use the same model as our register allocator currently uses. We use the results of liveness analysis, as well as the compute_payload_ranges() function that we split out in the last commit. This means that we compute live ranges twice on each round through the register allocator, although we could speed it up by only recomputing the ranges and not the live in/live out sets after scheduling, since we only shuffle around instructions within a single basic block when we schedule. Shader-db results on bdw: total instructions in shared programs: 7130187 -> 7129880 (-0.00%) instructions in affected programs: 1744 -> 1437 (-17.60%) helped: 1 HURT: 1 total cycles in shared programs: 172535126 -> 172473226 (-0.04%) cycles in affected programs: 11338636 -> 11276736 (-0.55%) helped: 876 HURT: 873 LOST: 8 GAINED: 0 v2: use regs_read() in more places. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:19:43 -04:00
Connor Abbott	c1860299b8	i965/fs: split out calculation of payload live ranges We'll need this for the scheduler too, since it wants to know when the live ranges of payload registers end in order to model them in our register pressure calculations. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:19:33 -04:00
Connor Abbott	45cd76e342	i965: dump scheduling cycle estimates The heuristic we're using is rather lame, since it assumes everything is non-uniform and loops execute 10 times, but it should be enough for measuring improvements in the scheduler that don't result in a change in the number of instructions. v2: - Switch loops and cycle counts to be compatible with older shader-db. - Make loop heuristic 10x to match with spilling code. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:19:24 -04:00
Connor Abbott	486268bdb0	i965: always run the post-RA scheduler Before, we would only do scheduling after register allocation if we spilled, despite the fact that the pre-RA scheduler was only supposed to be for register pressure and set the latencies of every instruction to 1. This meant that unless we spilled, which we rarely do, then we never considered instruction latencies at all, and we usually never bothered to try and hide texture fetch latency. Although a later commit removes the setting the latency to 1 part, we still want to always run the post-RA scheduler since it's able to take the false dependencies that the register allocator creates into account, and it can be more aggressive than the pre-RA scheduler since it doesn't have to worry about register pressure at all. Test master post-ra-sched diff %diff bench_OglPSBump2 396.730 402.386 5.656 +1.400% bench_OglPSBump8 244.370 247.591 3.221 +1.300% bench_OglPSPhong 241.117 242.002 0.885 +0.300% bench_OglPSPom 59.555 59.725 0.170 +0.200% bench_OglShMapPcf 86.149 102.346 16.197 +18.800% bench_OglVSTangent 388.849 395.489 6.640 +1.700% bench_trex 65.471 65.862 0.390 +0.500% bench_trexoff 69.562 70.150 0.588 +0.800% bench_heaven 25.179 25.254 0.074 +0.200% Reviewed-by: Jason Ekstrand <jasoan.ekstrand@intel.com>	2015-10-30 02:19:00 -04:00
Connor Abbott	85fce2d2f5	i965/sched: write-after-read dependencies are free Although write-after-write dependencies have the same latency as read-after-write dependencies due to how the register scoreboard works, write-after-read dependencies aren't checked by the EU at all, so they're purely a constraint on how the scheduler can order the instructions. v2: fix accumulator dependencies too. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:18:56 -04:00
Connor Abbott	6f231fddff	i965: fix cycle estimates when there's a pipeline stall The issue time for an instruction is how many cycles it takes to actually put it into the pipeline. If there's a pipeline stall that causes the instruction to be delayed, we should first take that into account to figure out when the instruction would start executing and then add the issue time. The old code had it backwards, and so we would underestimate the total time whenever we thought there would be a pipeline stall by up to the issue time of the instruction. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:18:53 -04:00
Eric Anholt	04c42f3ab5	vc4: Allow user index buffers, to avoid slow readback for shadow IBs. Improves low-settings openarena performance by 31.9975% +/- 0.659931% (n=7).	2015-10-29 22:58:01 -07:00
Jason Ekstrand	3883728730	anv: Add better push constant support What we had before was kind of a hack where we made certain untrue assumptions about the incoming data. This new support, while it still doesn't support indirects properly (that will come), at least pulls the offsets and strides from SPIR-V like it's supposed to.	2015-10-29 22:26:36 -07:00
Jason Ekstrand	1f2624e6dd	nir/spirv: Add support for push constants	2015-10-29 22:26:00 -07:00
Jason Ekstrand	a2283508b0	nir/intrinsics: Add a load_push_constant intrinsic	2015-10-29 22:26:00 -07:00
Jason Ekstrand	f2a8c9db24	nir/spirv: Rework the way we handle interface types	2015-10-29 22:26:00 -07:00
Ilia Mirkin	06fa2e864a	nv50: mark contexts shareable, compile at creation time Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-29 23:25:08 -04:00
Ilia Mirkin	f768eaa87d	nv50: allow per-sample interpolation to be forced via rast Uses the same technique as for nvc0 of fixups before upload, and evicting in case of state change. Removes one source of variants kept by st/mesa. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-29 22:42:38 -04:00
Matt Turner	85ee2f7fcf	i965: Add INTEL_DEBUG=nocompact to disable instruction compaction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00

82384 commits