From e47ed60ee6133ef562f7fb04b4bfb81296ac9c75 Mon Sep 17 00:00:00 2001
From: "Nemallapudi, Jaikrishna"
Date: Fri, 15 May 2026 07:34:49 +0000
Subject: [PATCH] intel/dev: fix timebase_scale ticks-to-ns precision loss
 across 2^32

Android CTS CtsGpuProfilingDataTest#testProfilingDataProducersAvailable
intermittently fails with "Render stages reported before their
VkQueueSubmit events". Root cause is in the Perfetto clock correlation:
render-stage timestamps go through intel_device_info_timebase_scale()
while VkQueueSubmit packets use BOOTTIME directly, so any drift in the
scaler shows up as render stages preceding their submits.

intel_device_info_timebase_scale() scales the upper and lower halves of
the raw timestamp separately and recombines them, but silently drops the
upper-half division's remainder. When the frequency doesn't evenly
divide 1e9, every wrap past 2^32 loses a fixed number of ns and shows up
as a step in Perfetto's GPU-vs-BOOTTIME snapshot offset.

Carry the upper-half remainder into the lower-half numerator before
dividing, so no precision is lost. All intermediates still fit in
uint64_t.

Cc: mesa-stable
Signed-off-by: Nemallapudi, Jaikrishna
Part-of:
---
 src/intel/dev/intel_device_info.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/intel/dev/intel_device_info.h b/src/intel/dev/intel_device_info.h
index f254d5b0304..bd5bcf8ea8b 100644
--- a/src/intel/dev/intel_device_info.h
+++ b/src/intel/dev/intel_device_info.h
@@ -171,8 +171,11 @@ intel_device_info_timebase_scale(const struct intel_device_info *devinfo,
    /* Try to avoid going over the 64bits when doing the scaling */
    uint64_t upper_ts = gpu_timestamp >> 32;
    uint64_t lower_ts = gpu_timestamp & 0xffffffff;
-   uint64_t upper_scaled_ts = upper_ts * 1000000000ull / devinfo->timestamp_frequency;
-   uint64_t lower_scaled_ts = lower_ts * 1000000000ull / devinfo->timestamp_frequency;
+   uint64_t upper_num = upper_ts * 1000000000ull;
+   uint64_t upper_scaled_ts = upper_num / devinfo->timestamp_frequency;
+   uint64_t upper_remainder = upper_num % devinfo->timestamp_frequency;
+   uint64_t lower_scaled_ts = ((upper_remainder << 32) + lower_ts * 1000000000ull) /
+                              devinfo->timestamp_frequency;
 
    return (upper_scaled_ts << 32) + lower_scaled_ts;
 }
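
Reviewer note, not part of the patch: the effect of the dropped remainder can be reproduced outside Mesa with a small standalone sketch. The function names scale_truncating() and scale_with_carry() are hypothetical stand-ins for the pre-patch and post-patch arithmetic, and the 19.2 MHz frequency used in the note below is only an assumed example value, not something taken from this patch.

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000ull

/* Pre-patch arithmetic: each 32-bit half is scaled on its own, so the
 * remainder of the upper-half division is thrown away. */
static uint64_t
scale_truncating(uint64_t ticks, uint64_t freq_hz)
{
   uint64_t upper = ticks >> 32;
   uint64_t lower = ticks & 0xffffffff;
   uint64_t upper_scaled = upper * NS_PER_SEC / freq_hz;
   uint64_t lower_scaled = lower * NS_PER_SEC / freq_hz;

   return (upper_scaled << 32) + lower_scaled;
}

/* Post-patch arithmetic: the upper-half remainder is shifted up by 32
 * bits and added to the lower-half numerator before the second division.
 * Assumes freq_hz is small enough that (remainder << 32) plus
 * lower * 1e9 fits in 64 bits, which holds for MHz-range frequencies. */
static uint64_t
scale_with_carry(uint64_t ticks, uint64_t freq_hz)
{
   uint64_t upper = ticks >> 32;
   uint64_t lower = ticks & 0xffffffff;
   uint64_t upper_num = upper * NS_PER_SEC;
   uint64_t upper_scaled = upper_num / freq_hz;
   uint64_t upper_remainder = upper_num % freq_hz;
   uint64_t lower_scaled =
      ((upper_remainder << 32) + lower * NS_PER_SEC) / freq_hz;

   return (upper_scaled << 32) + lower_scaled;
}
```

With an assumed frequency of 19200000 Hz and a raw timestamp of exactly 2^32 ticks, scale_truncating() returns 223338299392 ns while scale_with_carry() returns 223696213333 ns, which is floor(2^32 * 1e9 / 19200000). The 357913941 ns difference is the kind of step the commit message describes. For timestamps below 2^32 the two functions agree, since the upper half is zero.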