fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 02:20:11 +01:00

Author	SHA1	Message	Date
Francisco Jerez	e1a918ba7b	i965/fs: Replace fs_inst::regs_read with ::size_read using byte units. The previous regs_read value can be recovered by rewriting each reference of regs_read() like 'x = i.regs_read(j)' to 'x = DIV_ROUND_UP(i.size_read(j), reg_unit)'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	27cb6b081e	i965/ir: Drop backend_instruction::regs_written field. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	69fdf13c21	i965/vec4: Replace vec4_instruction::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	69570bbad8	i965/fs: Replace fs_inst::regs_written with ::size_written field in bytes. The previous regs_written field can be recovered by rewriting each rvalue reference of regs_written like 'x = i.regs_written' to 'x = DIV_ROUND_UP(i.size_written, reg_unit)', and each lvalue reference like 'i.regs_written = x' to 'i.size_written = x * reg_unit'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	d28cfa35fe	i965/vec4: Add wrapper functions for vec4_instruction::regs_read and ::regs_written. This is in preparation for dropping vec4_instruction::regs_read and ::regs_written in favor of more accurate alternatives expressed in byte units. The main reason these wrappers are useful is that a number of optimization passes implement dataflow analysis with register granularity, so these helpers will come in handy once we've switched register offsets and sizes to the byte representation. The wrapper functions will also make sure that GRF misalignment (currently neglected by most of the back-end) is taken into account correctly in the calculation of regs_read and regs_written. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	c458eeb946	i965/fs: Add wrapper functions for fs_inst::regs_read and ::regs_written. This is in preparation for dropping fs_inst::regs_read and ::regs_written in favor of more accurate alternatives expressed in byte units. The main reason these wrappers are useful is that a number of optimization passes implement dataflow analysis with register granularity, so these helpers will come in handy once we've switched register offsets and sizes to the byte representation. The wrapper functions will also make sure that GRF misalignment (currently neglected by most of the back-end) is taken into account correctly in the calculation of regs_read and regs_written. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	be095e11e4	i965/fs: Replace fs_reg::subreg_offset with fs_reg::offset expressed in bytes. The fs_reg::subreg_offset and ::offset fields are now redundant, the sub-GRF offset can just be added to the single ::offset field expressed in byte units. The current subreg_offset value can be recovered by applying the following rule: Replace each rvalue reference of subreg_offset like 'x = r.subreg_offset' with 'x = r.offset % reg_unit', and each lvalue reference like 'r.subreg_offset = x' with 'r.offset = ROUND_DOWN_TO(r.offset, reg_unit) + x'. For the same reason as in the previous patches, this doesn't attempt to be particularly clever about simplifying the result in the interest of keeping the rather lengthy patch as obvious as possible. I'll come back later to clean up any ugliness introduced here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	9a523dd051	i965/ir: Remove backend_reg::reg_offset. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:53 -07:00
Francisco Jerez	fba020e5af	i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes. The dst/src_reg::offset field in byte units introduced in the previous patch is a more straightforward alternative to an offset representation split between ::reg_offset and ::subreg_offset fields. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple FS back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. v2: Fix division by the wrong reg_unit in the UNIFORM case of convert_to_hw_regs(). (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:52 -07:00
Francisco Jerez	86944e063a	i965/fs: Replace fs_reg::reg_offset with fs_reg::offset expressed in bytes. The fs_reg::offset field in byte units introduced in this patch is a more straightforward alternative to the current register offset representation split between fs_reg::reg_offset and ::subreg_offset. The split representation makes it too easy to forget about one of the offsets while dealing with the other, which has led to multiple back-end bugs in the past. To make the matter worse the unit reg_offset was expressed in was rather inconsistent, for uniforms it would be expressed in either 4B or 16B units depending on the back-end, and for most other things it would be expressed in 32B units. This encodes reg_offset as a new offset field expressed consistently in byte units. Each rvalue reference of reg_offset in existing code like 'x = r.reg_offset' is rewritten to 'x = r.offset / reg_unit', and each lvalue reference like 'r.reg_offset = x' is rewritten to 'r.offset = r.offset % reg_unit + x * reg_unit'. Because the change affects a lot of places and is rather non-trivial to verify due to the inconsistent value of reg_unit, I've tried to avoid making any additional changes other than applying the rewrite rule above in order to keep the patch as simple as possible, sometimes at the cost of introducing obvious stupidity (e.g. algebraic expressions that could be simplified given some knowledge of the context) -- I'll clean those up later on in a second pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-09-14 14:50:52 -07:00
Eero Tamminen	8ad5fb3a8f	glsl: grammar fix Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-09-14 13:35:47 -07:00
Kenneth Graunke	aa70ac172e	docs: Mention AEP in release notes	2016-09-14 12:43:16 -07:00
Kenneth Graunke	8c9dddadad	i965: Enable ANDROID_extension_pack_es31a on Gen9+. AEP requires ASTC, which is currently only enabled on Skylake and later. (It may be possible to extend this to Cherryview/Braswell in the future, but earlier hardware doesn't have ASTC support.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-09-14 12:16:25 -07:00
Kenneth Graunke	2d8a3fa7ea	nir: Report progress from nir_lower_phis_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-14 12:01:51 -07:00
Kenneth Graunke	32630e211e	nir: Report progress from nir_lower_alu_to_scalar. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-14 12:01:49 -07:00
Kenneth Graunke	e6eed3533e	nir: Call nir_metadata_preserve from nir_lower_alu_to_scalar(). This is mandatory. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-09-14 12:01:39 -07:00
Rob Clark	bff90aedf1	nir/lower_tex: fix typo with sample_dim. Numeric 2 is actually GLSL_SAMPLER_DIM_3D, which I don't think is what was intended. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-14 13:45:32 -04:00
Rob Clark	1a8424ceba	nir: move tex_instr_remove_src. I want to re-use this in a different pass, so move to nir.h. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-14 13:45:32 -04:00
Rob Clark	2c3f966276	nir/lower_tex: remove tex_instr_find_src(). Turns out it already exists, so don't duplicate it. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-09-14 13:45:32 -04:00
Kyle Brenneman	7206b3a556	egl: Add storage for EGL_KHR_debug's state to EGL objects Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	1d535c1e83	egl: Factor out _eglGetSyncAttribCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	5b0b844ac9	egl: Factor out _eglWaitSyncCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	9a992038e7	egl: Lock the display in _eglCreateSync's callers Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	58338c6b65	egl: Factor out _eglCreateImageCommon (v2) v2: - Pass disp to RETURN_EGL_ERROR so we unlock the display Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	82a2e2cb50	egl: Factor out _eglWaitClientCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	8cc3d9855f	egl: Use _eglCreatePixmapSurfaceCommon consistently This moves the native pixmap fixup to a helper function so we don't repeat ourselves. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	7d7ae5e1c3	egl: Use _eglCreateWindowSurfaceCommon consistently This moves the native window fixup to a helper function so we don't repeat ourselves. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	017946b724	egl: Factor out _eglGetPlatformDisplayCommon Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	fe6ffa79be	egl: Fix typo Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 11:45:58 -04:00
Adam Jackson	e2c067d256	egl: Tear down images and syncs at eglTerminate Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2016-09-14 11:45:58 -04:00
Kyle Brenneman	6e50f12b04	egl: Update eglext.h (v2) Updated eglext.h to revision 33111 from the Khronos repository. v2: - Don't (re)move extension includes from eglext.h (Emil Velikov) - Bump to revision 33111 (Adam Jackson) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2016-09-14 11:45:58 -04:00
Brendan King	95f3e5861c	configure.ac: fix the name of the Wayland Scanner pc file The Wayland Scanner pkg-config file is called wayland-scanner.pc. Fixes: `153539bd9d` ("configure: rework wayland_scanner handling (fix make distcheck)") Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Brendan King <Brendan.King@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 14:38:30 +01:00
Eric Engestrom	4bb9efb592	gbm: remove left-over array `e7c8c85785` ("gbm: Removed unused function.") forgot to remove the global array used only by that function. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-09-14 14:37:34 +01:00
Martina Kollarova	2527e18eeb	gallium: fix return value check A possible error (-1) was being lost because it was first converted to an unsigned int and only then checked. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Martina Kollarova <martina.kollarova@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2016-09-14 14:36:43 +01:00
Marek Olšák	ab29788250	radeonsi: reload PS inputs with direct indexing at each use (v2) The LLVM compiler can CSE interp intrinsics thanks to LLVMReadNoneAttribute. 26011 shaders in 14651 tests Totals: SGPRS: 1146340 -> 1132676 (-1.19 %) VGPRS: 727371 -> 711730 (-2.15 %) Spilled SGPRs: 2218 -> 2078 (-6.31 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 35841268 -> 36009732 (0.47 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222559 -> 224779 (1.00 %) Wait states: 0 -> 0 (0.00 %) v2: don't call load_input for fragment shaders in emit_declaration Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-14 12:33:00 +02:00
Marek Olšák	007b512f9d	radeonsi: get rid of constant buffer preloading 26011 shaders in 14651 tests Totals: SGPRS: 1152636 -> 1146340 (-0.55 %) VGPRS: 728198 -> 727371 (-0.11 %) Spilled SGPRs: 3776 -> 2218 (-41.26 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 35835152 -> 35841268 (0.02 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222372 -> 222559 (0.08 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-14 12:32:59 +02:00
Marek Olšák	16be87c904	radeonsi: get rid of img/buf/sampler descriptor preloading (v2) 26011 shaders in 14651 tests Totals: SGPRS: 1251920 -> 1152636 (-7.93 %) VGPRS: 728421 -> 728198 (-0.03 %) Spilled SGPRs: 16644 -> 3776 (-77.31 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 36001064 -> 35835152 (-0.46 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222221 -> 222372 (0.07 %) Wait states: 0 -> 0 (0.00 %) v2: merge codepaths where possible Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-14 12:32:59 +02:00
Marek Olšák	22797d7d83	radeonsi: rename get_sampler_desc -> load_sampler_desc Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-14 12:32:59 +02:00
Marek Olšák	5f0a8fbcc8	radeonsi: cosmetic changes in si_shader.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-09-14 12:32:59 +02:00
Marek Olšák	afaf27bff3	radeonsi: load streamout buffer descriptors before use (v2) v2: inline the code and remove the conditional that's a no-op now Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-14 12:32:59 +02:00
Eric Anholt	f597ac3966	vc4: Implement job shuffling Track rendering to each FBO independently and flush rendering only when necessary. This lets us avoid the overhead of storing and loading the frame when an application momentarily switches to rendering to some other texture in order to continue rendering the main scene. Improves glmark -b desktop:effect=shadow:windows=4 by 27% Improves glmark -b desktop:blur-radius=5:effect=blur:passes=1:separable=true:windows=4 by 17% While I haven't tested other apps, this should help X rendering a lot, and I've heard GLBenchmark needed it too.	2016-09-14 06:25:41 +01:00
Eric Anholt	f473348468	vc4: Handle resolve skipping at job submit time. This is done in vc4_flush currently, but I'm going to make the job always track the surfaces it might be rendering to instead of putting in the destinations at flush time.	2016-09-14 06:08:03 +01:00
Eric Anholt	9688166bd9	vc4: Move the render job state into a separate structure. This is a preparation step for having multiple jobs being queued up at the same time.	2016-09-14 06:08:03 +01:00
Eric Anholt	c31a7f529f	vc4: Always unref the current job surfaces at job reset time. Drops some tricky logic in vc4_flush() trying to update the pointers, and fixes a broken lack of unref for MSAA surfaces at context destroy time.	2016-09-14 06:08:03 +01:00
Eric Anholt	774a556b6d	vc4: Move job-submit skip cases to vc4_job_submit(). For calling job_submit() directly, I need the skipping here.	2016-09-14 06:08:03 +01:00
Eric Anholt	0ef1b32ebb	vc4: Move bin CL trailer to job_submit() time. To implement job shuffling, I want to be able to call submit() on specific jobs, turning vc4_flush() into the context's flush-all-jobs hook.	2016-09-14 06:08:03 +01:00
Eric Anholt	a2014c2eb9	vc4: Simplify the DISCARD_RANGE handling It's really just an upgrade to attempting WHOLE_RESOURCE. Pulling the logic out caught two bugs in it: We would try to do so on cubemaps (even though we're only mapping 1 of the 6 slices), and we would break persistent coherent mappings by trying to reallocate when we shouldn't.	2016-09-14 06:08:03 +01:00
Eric Anholt	21a27ad956	vc4: Fix incorrect clearing of Z/stencil when cleared separately. The clear of Z or stencil will end up clearing the other as well, instead of masking. There's no way around this that I know of, so if we are clearing just one then we need to draw a quad. Fixes a regression in the job-shuffling code, where the clear values move to the job and don't just have the last clear's value laying around when you do glClear(DEPTH) and then glClear(STENCIL) separately (ext_framebuffer_multisample-clear 4 depth)). This causes regressions in ext_framebuffer_multisample/multisample-blit depth and ext_framebuffer_multisample/no-color depth, but these were formerly false positives due to the reference image also being black. Now the reference and test images are both being drawn, and it looks like there's an incorrect resolve of depth during blitting to an MSAA FBO.	2016-09-14 06:08:03 +01:00
Ilia Mirkin	89a49af31e	glsl: add core plumbing for GL_ANDROID_extension_pack_es31a Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 20:49:55 -04:00
Ilia Mirkin	83116d084f	mesa: introduce glPrimitiveBoundingBoxARB entrypoint This requires a bit of rejiggering, since normally ES entrypoints alias core ones, not vice-versa. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-09-13 20:49:50 -04:00
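
The i965 commits at the top of this log each state a mechanical rewrite rule for moving from register units to byte units. As a rough illustration only, the rules can be collected in a small C++ sketch. The names toy_reg, toy_inst, div_round_up and round_down_to are hypothetical stand-ins rather than Mesa's definitions, and reg_unit is assumed to be 32 bytes here, whereas the commit messages note that its value varied by register file and back-end.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are NOT the Mesa definitions.
// Assumption: one register is 32 bytes. The commit messages say the real
// reg_unit was 4B, 16B or 32B depending on the register file and back-end.
constexpr unsigned reg_unit = 32;

constexpr unsigned div_round_up(unsigned a, unsigned b) { return (a + b - 1) / b; }
constexpr unsigned round_down_to(unsigned a, unsigned b) { return a - a % b; }

struct toy_reg {
   unsigned offset = 0;  // single byte offset replacing reg_offset + subreg_offset

   // rvalue rules: 'x = r.reg_offset'    -> 'x = r.offset / reg_unit'
   //               'x = r.subreg_offset' -> 'x = r.offset % reg_unit'
   unsigned reg_offset() const { return offset / reg_unit; }
   unsigned subreg_offset() const { return offset % reg_unit; }

   // lvalue rules: 'r.reg_offset = x'    -> keep the sub-register part
   //               'r.subreg_offset = x' -> keep the whole-register part
   void set_reg_offset(unsigned x) { offset = offset % reg_unit + x * reg_unit; }
   void set_subreg_offset(unsigned x) { offset = round_down_to(offset, reg_unit) + x; }
};

struct toy_inst {
   unsigned size_written = 0;  // bytes written, replacing regs_written

   // 'x = i.regs_written' -> 'x = DIV_ROUND_UP(i.size_written, reg_unit)'
   unsigned regs_written() const { return div_round_up(size_written, reg_unit); }
   // 'i.regs_written = x' -> 'i.size_written = x * reg_unit'
   void set_regs_written(unsigned x) { size_written = x * reg_unit; }
};

// Small self-check that the old fields can be recovered from the byte fields.
inline void check_rewrite_rules()
{
   toy_reg r;
   r.set_reg_offset(2);     // offset becomes 64
   r.set_subreg_offset(4);  // offset becomes 68
   assert(r.offset == 68);
   assert(r.reg_offset() == 2 && r.subreg_offset() == 4);

   toy_inst i;
   i.size_written = 40;     // a partial second register still counts as one
   assert(i.regs_written() == 2);
   i.set_regs_written(3);
   assert(i.size_written == 96);
}
```

Calling check_rewrite_rules() exercises each rule once. The rounding up in regs_written() is why a write of 40 bytes still counts as two registers under the assumed 32-byte unit.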

85385 commits