fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-09 17:38:09 +02:00

Author	SHA1	Message	Date
Bas Nieuwenhuizen	2dacb727c2	radv: Set query availability bit even if we don't wait. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `8475a14302` ("radv: Implement pipeline statistics queries.") Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2017-04-12 07:38:58 +02:00
Gregory Hainaut	03d1de387e	mesa: avoid NULL ptr in prog parameter name Context: _mesa_add_parameter is sometimes[0] called with a NULL name as a means of denoting an unnamed parameter. Allowing a NULL pointer as a name means that it must be NULL checked on each access. So far that isn't always[1] the case. The parameter name is only used for debug purposes (printf) and to look up the index/location of the program by the application. In conclusion, there is no valid reason to use a NULL pointer instead of an empty string. So it was decided to use an empty string, which avoids all issues related to NULL pointers. [0]: texture gather offsets glsl opcode and st_init_atifs_prog [1]: at least shader cache, st_nir_lookup_parameter_index and some printfs Issue found by piglit 'texturegatheroffsets' tests on Nouveau v4: new patch based on Nicolai/Timothy/ilia discussion Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-12 14:30:28 +10:00
Kenneth Graunke	754b961f38	i965/drm: Use bools for a few flags. These one bit values are booleans. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	44ecbbebe2	i965/drm: Make brw_bo_alloc_tiled flags parameter 32-bit. unsigned long is a terrible type for a bitfield - if you need fewer than 32 bits, it wastes 4 bytes. If you need more, things break on 32-bit builds. Just use unsigned. Even that's a bit ridiculous as we only have one flag today. Still, it's at least somewhat better. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	f374b9449e	i965/drm: Make BO size a uint64_t rather than unsigned long. The drm_i915_gem_create ioctl structure uses a __u64 for the size, so we should probably use uint64_t to match. In theory, we could probably have a BO larger than 4GB, using a 48-bit PPGTT - it just wouldn't be mappable in the CPU's 32-bit address space. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	c85d6832fd	i965/drm: Make alignment parameter a uint64_t. Theoretically, with a 48-bit address space, we could have buffers with an alignment of >= 4GB. It's a bit silly, but the exec_object structs (drm_i915_gem_exec_object2) use a __u64 for this, so we may as well use the same type as the kernel API. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	444ab8126d	i965/drm: Make stride/pitch a uint32_t. struct drm_i915_gem_set_tiling's stride field is a __u32. intel_mipmap_tree::stride is a uint32_t. Using unsigned long just doesn't make sense. Switching also lets us drop many pointless locals that only existed to deal with the type mismatch. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	14fc188460	i965/drm: Fix types for pwrite/pread fields. The ioctl structs contain __u64 offset and size fields, so make them uint64_t rather than unsigned long. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Kenneth Graunke	193601311c	i965/drm: Make brw_bo_alloc_tiled take tiling by value, not pointer. For some reason we passed tiling by pointer, through several layers, even though the functions only read the initial value, and never actually change it. We even had a do-while loop that executed until the tiling mode matched - except it always did, so it only ran once. We then had bogus error handling in case it changed the tiling mode to something nonsensical...which it never did. Drop all this nonsense. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-04-11 21:07:45 -07:00
Timothy Arceri	9bd7184078	mesa/st: remove _mesa_get_fallback_texture() calls These calls look like leftover from fallback texture support first being added to the st in `8f6d9e12be` and then later being added to core mesa in `00e203fe17`. The piglit test fp-incomplete-tex continues to work with this change. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-04-12 12:00:35 +10:00
Timothy Arceri	c72170fb1f	mesa: use pre_hashed version of search for the mesa hash table The key is just an unsigned int so there is never any real hashing done. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-04-12 12:00:35 +10:00
Tim Rowley	d0f381f865	swr: [rasterizer core] Disable 8x2 tile backend Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	31a23a9d9d	swr: [rasterizer common] Add _simd_testz_si alias Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	7abd1f9b24	swr: [rasterizer archrast] Fix archrast for MSVC 2017 compiler Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	54d11b3c95	swr: [rasterizer jitter] Remove unused function Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	af909c0200	swr: [rasterizer jitter] Remove HAVE_LLVM tests supporting llvm < 3.8 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	973d38801d	swr: [rasterizer common/core] Fix 32-bit windows build Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	217b791a44	swr: [rasterizer core] Fix unused variable warnings Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	da7aa39f93	swr: [rasterizer core] Code formatting change Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	c8cc07ca25	swr: [rasterizer core] SIMD16 Frontend WIP - PA Fix PA NextPrim for SIMD8 on SIMD16. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	08a7136848	swr: [rasterizer core] SIMD16 Frontend WIP - Clipper Implement widened clipper for SIMD16. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	0033e86b2c	swr: [rasterizer core] Multisample sample position setup change Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Tim Rowley	4c093869db	swr: [rasterizer core] Reduce templates to speed compile Quick patch to remove some unused template params to cut down rasterizer compile time. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-04-11 18:01:03 -05:00
Francisco Jerez	147e71242c	i965/fs: Take into account lower frequency of conditional blocks in spilling cost heuristic. The individual branches of an if/else/endif construct will be executed some unknown number of times between 0 and 1 relative to the parent block. Use some factor in between as weight while approximating the cost of spill/fill instructions within a conditional if-else branch. This favors spilling registers used within conditional branches which are likely to be executed less frequently than registers used at the top level. Improves the framerate of the SynMark2 OglCSDof benchmark by ~1.9x on my SKL GT4e. Should have a comparable effect on other platforms. No significant regressions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-04-11 15:28:54 -07:00
Tim Rowley	9a7b257450	swr: return true for PIPE_CAP_DOUBLES Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-04-11 13:16:43 -05:00
Kenneth Graunke	02ccd8f52c	i965: Set kernel features before computing max GL version. We check these bitfields when computing the Haswell max GL version. We need to set them ahead of time, or they won't exist, and all our checks will fail. That sets the max core profile GL version to 4.2. This introduces the bizarre situation where asking for a GL context with version 4.3+ fails, but asking for a GL core profile context with version <= 4.2 actually promotes you to a 4.5 context. GLX_MESA_query_renderer also reported the bogus 4.2 value. Now it shows 4.5. Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reported-and-tested-by: Rafael Ristovski <rafael.ristovski@gmail.com>	2017-04-11 08:58:16 -07:00
Juan A. Suarez Romero	8d7a82ae32	anv: remove needless VALGRIND_MAKE_MEM_DEFINED This is already invoked in the following VG_NOACCESS_READ() call. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-04-11 17:21:57 +02:00
Lucas Stach	4ee7c2c284	etnaviv: enable TS, but disable autodisable Autodisable seems to cause missed rendering in some cases, but otherwise TS seems to work properly. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-04-11 16:52:31 +02:00
Lucas Stach	797890bbbd	etnaviv: enable TS also on sampler resources Fixes a performance issue with imported winsys buffers as those are marked with binding sampler view. This might require a TS flush on single pipe chips that directly sample from the rendered buffer, but otherwise seems to work fine. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-04-11 16:52:27 +02:00
Lucas Stach	52f6c8cc31	etnaviv: align TS surface size to number of pixel pipes The TS surface gets cleared by a tiled RS fill. If the chip has more than 1 pixel pipe the size of the TS surface needs to be aligned so that each pipe address matches a tile start, otherwise the RS will hang. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-04-11 16:52:22 +02:00
Lucas Stach	37622ecc79	etnaviv: avoid using invalid TS The TS is only valid after it has been initialized by a fast clear, so it should not be taken into account when blitting resources that haven't been cleared. Also the blit itself invalidates the destination TS, as it's not updated and will retain data from the previous rendering after the blit. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-04-11 16:52:01 +02:00
Samuel Pitoiset	768f81b62b	glsl: use the BA1 macro for textureQueryLevels() For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-11 10:24:57 +02:00
Samuel Pitoiset	981ba1c89b	glsl: use the BA1 macro for textureSamples() For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-11 10:24:54 +02:00
Samuel Pitoiset	29082b0b22	glsl: use the BA1 macro for textureCubeArrayShadow() For both consistency and new bindless sampler types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-04-11 10:24:51 +02:00
Bas Nieuwenhuizen	8475a14302	radv: Implement pipeline statistics queries. The devil is in the shader again, otherwise this is fairly straightforward. The CTS contains no pipeline statistics copy to buffer testcases, so I did a basic smoketest. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	d2906bc72d	radv: Let count be dynamic in radv_break_on_count. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	8473193760	radv: Rename query pipeline/set layout. For using them with both occlusion and pipeline statistics queries. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	95743d5b88	radv: Use VK_WHOLE_SIZE for the query buffer bindings. The buffer sizes are specified just a few lines earlier, so don't repeat ourselves. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	8911dd6d12	radv: Use a shader for occlusion CmdCopyQueryPoolResults. Use the new occlusion query copy shader. We don't use the shader for the waiting as a polling loop interacts badly with having caching enabled. I noticed on my GPU (Tonga) that the values are written out in order, so I just use a WAIT_REG_MEM on the last value. If it turns out other chips don't do that we may need to look a bit more into this. Having 8 WAIT_REG_MEM packets per query doesn't sound ideal. This also restricts the availability word in the pool to timestamp queries only, as occlusion queries don't use it, and pipeline statistic queries likely won't either. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	ce0c8cf941	radv: Add occlusion query shader. Adds a shader for writing occlusion query results to a buffer, as the CP packet isn't supported on SI or secondary buffers, and doesn't handle the availability bit (or partial results) nor truncation to 32-bit. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Kenneth Graunke	50b987c0f0	i965: Fix wonky indentation left by brw_bo_alloc_tiled rename.	2017-04-10 23:25:13 -07:00
Ilia Mirkin	d9cc58d6ec	nouveau: when mapping a persistent buffer, synchronize on former xfers If the buffer is being used, we should wait for those uses to be complete before returning the map. Fixes: GL45-CTS.direct_state_access.buffers_functional Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2017-04-11 00:13:55 -04:00
Ilia Mirkin	8036809799	nvc0: increase texture buffer object alignment to 256 for pre-GM107 We currently don't pass the low byte of the address via the surface info, so in order to work with images, these have to implicitly be aligned to 256. The proprietary driver also doesn't go out of its way to provide lower alignment. Fixes GL45-CTS.texture_buffer.texture_buffer_texture_buffer_range Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-04-11 00:13:55 -04:00
Timothy Arceri	8ffd54fef8	mesa: fix typo and add assert() to _mesa_attach_renderbuffer_without_ref() This function should only be used with a "freshly created" renderbuffer so assert RefCount is 1.	2017-04-11 09:57:45 +10:00
Kenneth Graunke	bd84252be6	i965/drm: Add stall warnings when mapping or waiting on BOs. This restores the performance warnings removed in: i965: Drop brw_bo_map[_gtt] wrappers which issue perf warnings. but adds them for nearly all BO mapping, and also for wait_rendering. Because we add this to the core bufmgr, we automatically get stall warnings in all callers, unlike before where only a few callsites used the wrappers that gave stall warnings. We also do it a bit differently: we simply measure how long set_domain takes (the part that stalls), and complain if it's more than 0.01 ms. We don't bother calling brw_bo_busy(), and we don't measure the mmap time (which doesn't stall). This should be more accurate. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-04-10 14:33:18 -07:00
Kenneth Graunke	f053ee78ed	i965/drm: Make a set_domain() helper function. Less boilerplate. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-04-10 14:33:18 -07:00
Daniel Vetter	a99a4979fd	i965/batch: Ensure we use a consistent offset in relocs In theory gcc is free to re-load them, and if a concurrent execbuf races and updates bo->offset64 then we have a problem: execbuffer api requires that the ->presumed_offset and the one we used for the reloc matches. It does not require that the value is sensible, which means no locks needed, just a consistent load. Ken said his next series will nuke this, so just hand-roll the kernel's READ_ONCE idea inline. FIXME: Most callers of brw_emit_reloc recompute the relocation themselves, which means this doesn't really fix the race. But the long term plan is to move to per-context relocation handling, which will fix this all properly. So leave this for now as just a reminder. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:18 -07:00
Daniel Vetter	7f3c85c21e	i965/bufmgr: Garbage-collect vma cache/pruning This was done because the kernel has 1 global address space, shared with all render clients, for gtt mmap offsets, and that address space was only 32bit on 32bit kernels. This was fixed in commit 440fd5283a87345cdd4237bdf45fb01130ea0056 Author: Thierry Reding <treding@nvidia.com> Date: Fri Jan 23 09:05:06 2015 +0100 drm/mm: Support 4 GiB and larger ranges which shipped in 4.0. Of course you still want to limit the bo cache to a reasonable size on 32bit apps to avoid ENOMEM, but that's better solved by tuning the cache a bit. On 64bit, this was never an issue. On top, mesa never set this, so it's all dead code. Collect and trash it. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:18 -07:00
Daniel Vetter	1f965d3f7a	i965/bufmgr: Remove some reuse functions is_reusable was needed by uxa because it couldn't keep track of its scanout buffers and used this as a proxy. Disabling reuse is a silly idea, we set this once at start. Remove both. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:18 -07:00
Daniel Vetter	edd85c1f04	i965/bufmgr: remove start_gtt_access Iirc this was used by uxa for persistent mmaps of the frontbuffer. For mesa all the set_domain stuff needed before a synchronized mmap is handled within the bufmgr, so no reason ever to call this. Inline the implementation into its only internal user. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-04-10 14:33:17 -07:00
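
Note on a99a4979fd ("i965/batch: Ensure we use a consistent offset in relocs"): the message describes hand-rolling the kernel's READ_ONCE idea so that the presumed offset and the value written for the relocation come from one load. The following is only a minimal sketch of that idea, not the code from the commit. The names example_bo, example_reloc, read_once_u64 and emit_reloc are hypothetical and exist purely for illustration; only the offset64 field name comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for a buffer object; only the field named in
 * the commit message (offset64) is modelled here. */
struct example_bo {
    uint64_t offset64;
};

/* Hypothetical result of emitting one relocation. */
struct example_reloc {
    uint64_t presumed_offset; /* offset reported alongside the reloc */
    uint64_t written_value;   /* value placed in the batch */
};

/* Load the value through a volatile-qualified pointer. The compiler
 * must perform this access exactly once and may not silently re-load
 * the field later, which is the essence of the READ_ONCE idea. */
static inline uint64_t read_once_u64(const uint64_t *p)
{
    return *(const volatile uint64_t *)p;
}

/* Take a single snapshot of the offset and derive both outputs from
 * it, so they always agree even if another thread updates
 * bo->offset64 in the meantime. The snapshot need not be current,
 * only consistent. */
static struct example_reloc emit_reloc(const struct example_bo *bo,
                                       uint32_t delta)
{
    uint64_t offset = read_once_u64(&bo->offset64);
    struct example_reloc r;

    r.presumed_offset = offset;
    r.written_value = offset + delta;
    return r;
}
```

As the commit's FIXME points out, a single load only helps if callers do not recompute the relocation value from the buffer object themselves.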


92185 commits