Converting between C++11 clocks

Posted by zxlwwiss on 2023-05-13 in Other


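If I have a time_point for an arbitrary clock (say high_resolution_clock::time_point), is there a way to convert it to a time_point for another arbitrary clock (say system_clock::time_point)?
I know there would have to be limits if this ability existed, because not all clocks are steady, but is there any functionality in the spec to help with such conversions at all?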

c++

Source: https://stackoverflow.com/questions/35282308/convert-between-c11-clocks

2 Answers


km0tfn4u1#

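I was wondering whether the accuracy of the conversion proposed by T.C. and Howard Hinnant could be improved. For reference, here is the base version that I tested.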

template
<
  typename DstTimePointT,
  typename SrcTimePointT,
  typename DstClockT = typename DstTimePointT::clock,
  typename SrcClockT = typename SrcTimePointT::clock
>
DstTimePointT
clock_cast_0th(const SrcTimePointT tp)
{
  const auto src_now = SrcClockT::now();
  const auto dst_now = DstClockT::now();
  return dst_now + (tp - src_now);
}

I tested it with the following program

int
main()
{
    using namespace std::chrono;
    const auto now = system_clock::now();
    const auto steady_now = CLOCK_CAST<steady_clock::time_point>(now);
    const auto system_now = CLOCK_CAST<system_clock::time_point>(steady_now);
    const auto diff = system_now - now;
    std::cout << duration_cast<nanoseconds>(diff).count() << '\n';
}

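where CLOCK_CAST would be #defined to, for now, clock_cast_0th. I collected a histogram for an idle system and one for a system under high load. Note that this is a cold-start test. I first tried calling the function in a loop, where it gives *much* better results. However, I think this would give a false impression, because most real-world programs probably convert a time point only every now and then and *will* hit the cold case.
The load was generated by running the following tasks in parallel with the test program. (My computer has four CPUs.)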

A matrix multiplication benchmark (single-threaded).
find /usr/include -execdir grep "$(pwgen 10 1)" '{}' \; -print
hexdump /dev/urandom | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip | hexdump | gzip| gunzip > /dev/null
dd if=/dev/urandom of=/tmp/spam bs=10 count=1000

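Those commands that would terminate in finite time were run in an infinite loop.
The following histogram, as well as those after it, shows the errors of 50000 runs with the worst 1 ‰ removed.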

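Note that the ordinate has a logarithmic scale.
The errors roughly fall into the range between 0.5 µs and 1.0 µs in the idle case and between 0.5 µs and 1.5 µs in the contended case.
The most striking observation is that the error distribution is far from symmetric (there are no negative errors at all), which indicates a large systematic component in the error. This makes sense because if we get interrupted between the two calls to now, the error is always in the same direction, and we cannot be interrupted for a "negative amount of time".
The histogram for the contended case almost looks like a perfect exponential distribution (mind the log scale!) with a rather sharp cut-off, which seems plausible; the chance that you are interrupted for time t is roughly proportional to e^(−t).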
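The one-sided error can be reproduced deterministically with two fake clocks. This is only a sketch under stated assumptions: fake_system, fake_steady, real_time and the fixed cost of 100 ns per clock read are hypothetical and not part of my measurements; clock_cast_0th is the function from above.

```cpp
#include <cassert>
#include <chrono>

namespace demo
{

  // Hypothetical "real time" shared by both fake clocks. Every clock read
  // advances it by 100 ns to model the cost of calling now().
  inline std::chrono::nanoseconds& real_time()
  {
    static std::chrono::nanoseconds t {0};
    return t;
  }

  // Fake clock whose epoch lies one hour before that of fake_steady.
  struct fake_system
  {
    using duration = std::chrono::nanoseconds;
    using rep = duration::rep;
    using period = duration::period;
    using time_point = std::chrono::time_point<fake_system>;
    static constexpr bool is_steady = false;
    static time_point now()
    {
      real_time() += std::chrono::nanoseconds {100};
      return time_point {real_time() + std::chrono::hours {1}};
    }
  };

  // Fake clock that reports the shared real time directly.
  struct fake_steady
  {
    using duration = std::chrono::nanoseconds;
    using rep = duration::rep;
    using period = duration::period;
    using time_point = std::chrono::time_point<fake_steady>;
    static constexpr bool is_steady = true;
    static time_point now()
    {
      real_time() += std::chrono::nanoseconds {100};
      return time_point {real_time()};
    }
  };

  // Same logic as clock_cast_0th above: source clock first, then destination.
  template
  <
    typename DstTimePointT,
    typename SrcTimePointT,
    typename DstClockT = typename DstTimePointT::clock,
    typename SrcClockT = typename SrcTimePointT::clock
  >
  DstTimePointT
  clock_cast_0th(const SrcTimePointT tp)
  {
    const auto src_now = SrcClockT::now();
    const auto dst_now = DstClockT::now();
    return dst_now + (tp - src_now);
  }

  // Converts a time point there and back and returns the round-trip error.
  inline std::chrono::nanoseconds round_trip_error_0th()
  {
    const auto start = fake_system::now();
    const auto there = clock_cast_0th<fake_steady::time_point>(start);
    const auto back = clock_cast_0th<fake_system::time_point>(there);
    return back - start;
  }

  // Each conversion adds the delay between its two clock reads, so the
  // round-trip error can never be negative in this model.
  inline void check_bias()
  {
    assert(round_trip_error_0th() > std::chrono::nanoseconds::zero());
  }

}
```

In this model each conversion overshoots by the 100 ns that pass between its two clock reads, so the round trip is off by 200 ns, always with the same sign.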
Then I tried using the following trick,

template
<
  typename DstTimePointT,
  typename SrcTimePointT,
  typename DstClockT = typename DstTimePointT::clock,
  typename SrcClockT = typename SrcTimePointT::clock
>
DstTimePointT
clock_cast_1st(const SrcTimePointT tp)
{
  const auto src_before = SrcClockT::now();
  const auto dst_now = DstClockT::now();
  const auto src_after = SrcClockT::now();
  const auto src_diff = src_after - src_before;
  const auto src_now = src_before + src_diff / 2;
  return dst_now + (tp - src_now);
}

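hoping that interpolating src_now would partially cancel the error introduced by inevitably calling the clocks in sequential order.
In the first version of this answer, I claimed that this didn't help at all. As it turns out, this wasn't true. After Howard Hinnant pointed out that he did observe improvements, I improved my tests, and now there is some observable improvement.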

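It wasn't so much an improvement in terms of the error span; however, the errors are now roughly centered around zero, which means that we now have errors in the range from −0.5 µs to 0.5 µs.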
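A short derivation may make the centering plausible. It is a sketch under the assumption that both clocks advance at the same rate and that clock granularity can be ignored; the symbols t_1, t_2, t_3, e_0 and e_1 are introduced here for illustration only. Let t_1 ≤ t_2 ≤ t_3 be the real times of the three clock reads (source, destination, source). The ideal conversion would use the source clock's reading at t_2.

```latex
% 0th approach: the source reading taken at t_1 stands in for t_2.
e_0 = t_2 - t_1 \;\ge\; 0

% 1st approach: the midpoint of the two source readings stands in for t_2.
e_1 = t_2 - \frac{t_1 + t_3}{2},
\qquad
-\frac{t_3 - t_1}{2} \;\le\; e_1 \;\le\; \frac{t_3 - t_1}{2}
```

So the midpoint does not necessarily shrink the width of the error interval, but it moves the interval from one side of zero to being centered on zero.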
Next, I tried calling the above code in a loop that picks the best value for src_diff.

template
<
  typename DstTimePointT,
  typename SrcTimePointT,
  typename DstDurationT = typename DstTimePointT::duration,
  typename SrcDurationT = typename SrcTimePointT::duration,
  typename DstClockT = typename DstTimePointT::clock,
  typename SrcClockT = typename SrcTimePointT::clock
>
DstTimePointT
clock_cast_2nd(const SrcTimePointT tp,
               const SrcDurationT tolerance = std::chrono::nanoseconds {100},
               const int limit = 10)
{
  assert(limit > 0);
  auto itercnt = 0;
  auto src_now = SrcTimePointT {};
  auto dst_now = DstTimePointT {};
  auto epsilon = detail::max_duration<SrcDurationT>();
  do
    {
      const auto src_before = SrcClockT::now();
      const auto dst_between = DstClockT::now();
      const auto src_after = SrcClockT::now();
      const auto src_diff = src_after - src_before;
      const auto delta = detail::abs_duration(src_diff);
      if (delta < epsilon)
        {
          src_now = src_before + src_diff / 2;
          dst_now = dst_between;
          epsilon = delta;
        }
      if (++itercnt >= limit)
        break;
    }
  while (epsilon > tolerance);
#ifdef GLOBAL_ITERATION_COUNTER
  GLOBAL_ITERATION_COUNTER = itercnt;
#endif
  return dst_now + (tp - src_now);
}

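The function takes two additional optional parameters to specify the desired accuracy and the maximum number of iterations, and it returns the current best value when either condition becomes true.
I'm using the following two straightforward helper functions in the above code.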

namespace detail
{

  template <typename DurationT, typename ReprT = typename DurationT::rep>
  constexpr DurationT
  max_duration() noexcept
  {
    return DurationT {std::numeric_limits<ReprT>::max()};
  }

  template <typename DurationT>
  constexpr DurationT
  abs_duration(const DurationT d) noexcept
  {
    return DurationT {(d.count() < 0) ? -d.count() : d.count()};
  }

}

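The error distribution is now very symmetric around zero, and the magnitude of the error dropped by as much as a factor of almost 100.
I was curious how often the iteration would run on average, so I added the #ifdef to the code and #defined it to the name of a global static variable that the main function would print out. (Note that we collect two iteration counts per experiment, so this histogram has a sample size of 100000.)
The histogram for the contended case, on the other hand, seems more uniform. I have no explanation for this and would have expected the opposite.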

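As it seems, we almost always hit the iteration count limit (but that's okay), and sometimes we do return early. The shape of this histogram can of course be influenced by altering the values of tolerance and limit passed to the function.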
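To get a feeling for how tolerance and limit interact, here is a condensed, non-generic variant of the 2nd approach that also reports the iteration count. It is only a sketch: the names cast_result and steady_to_system are hypothetical, and the function is hard-wired to steady_clock and system_clock instead of being a template.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical result type: the converted time point plus the number of
// iterations that were needed to obtain it.
struct cast_result
{
  std::chrono::system_clock::time_point value;
  int iterations;
};

// Samples (steady, system, steady) triples and keeps the one with the
// narrowest steady_clock bracket, like clock_cast_2nd does.
inline cast_result
steady_to_system(const std::chrono::steady_clock::time_point tp,
                 const std::chrono::nanoseconds tolerance,
                 const int limit)
{
  using namespace std::chrono;
  assert(limit > 0);
  auto best_width = steady_clock::duration::max();
  auto src_now = steady_clock::time_point {};
  auto dst_now = system_clock::time_point {};
  auto iterations = 0;
  do
    {
      const auto before = steady_clock::now();
      const auto between = system_clock::now();
      const auto after = steady_clock::now();
      const auto width = after - before;
      if (width < best_width)
        {
          best_width = width;
          src_now = before + width / 2;  // midpoint of the bracket
          dst_now = between;
        }
      ++iterations;
    }
  while (best_width > tolerance && iterations < limit);
  const auto offset = duration_cast<system_clock::duration>(tp - src_now);
  return {dst_now + offset, iterations};
}
```

Calling steady_to_system(tp, std::chrono::nanoseconds {100}, 10) mirrors the defaults above. A very loose tolerance makes the loop stop after the first sample, while a tolerance that the machine can rarely meet makes it run until limit is reached.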
Finally, I thought I could be clever and, instead of looking at src_diff, use the round-trip error directly as a quality criterion.

template
<
  typename DstTimePointT,
  typename SrcTimePointT,
  typename DstDurationT = typename DstTimePointT::duration,
  typename SrcDurationT = typename SrcTimePointT::duration,
  typename DstClockT = typename DstTimePointT::clock,
  typename SrcClockT = typename SrcTimePointT::clock
>
DstTimePointT
clock_cast_3rd(const SrcTimePointT tp,
               const SrcDurationT tolerance = std::chrono::nanoseconds {100},
               const int limit = 10)
{
  assert(limit > 0);
  auto itercnt = 0;
  auto current = DstTimePointT {};
  auto epsilon = detail::max_duration<SrcDurationT>();
  do
    {
      const auto dst = clock_cast_0th<DstTimePointT>(tp);
      const auto src = clock_cast_0th<SrcTimePointT>(dst);
      const auto delta = detail::abs_duration(src - tp);
      if (delta < epsilon)
        {
          current = dst;
          epsilon = delta;
        }
      if (++itercnt >= limit)
        break;
    }
  while (epsilon > tolerance);
#ifdef GLOBAL_ITERATION_COUNTER
  GLOBAL_ITERATION_COUNTER = itercnt;
#endif
  return current;
}

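It turns out that this was not such a good idea.

We have moved back to a non-symmetric error distribution again, and the magnitude of the error has also increased. (While the function also became more expensive!) Actually, the histogram for the idle case just looks weird. Could it be that the spikes correspond to how often we are interrupted? That doesn't really make sense.
The iteration frequency shows the same trend as before.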

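In conclusion, I would recommend using the 2nd approach, and I think that the default values for the optional parameters are reasonable, but of course this is something that may vary from machine to machine. Howard Hinnant has commented that a limit of only four iterations worked well for him.
If you implement this for real, you wouldn't want to miss the optimization opportunity of checking whether std::is_same<SrcClockT, DstClockT>::value holds and, in that case, simply applying std::chrono::time_point_cast without ever calling any now function (and thus not introducing any error).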
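One possible way to wire in that check is tag dispatch on the result of std::is_same. This is only a sketch: the names clock_cast_opt and clock_cast_impl are hypothetical, and the fallback for different clocks uses the simple 0th approach to keep the example short; the 2nd approach could be substituted there.

```cpp
#include <cassert>
#include <chrono>
#include <type_traits>

namespace sketch
{

  // Same clock: only the duration type may differ, so no now() call is
  // needed and no error is introduced.
  template <typename DstTimePointT, typename SrcTimePointT>
  DstTimePointT
  clock_cast_impl(const SrcTimePointT tp, std::true_type)
  {
    using DstDurationT = typename DstTimePointT::duration;
    return std::chrono::time_point_cast<DstDurationT>(tp);
  }

  // Different clocks: fall back to an approximation (0th approach here).
  template <typename DstTimePointT, typename SrcTimePointT>
  DstTimePointT
  clock_cast_impl(const SrcTimePointT tp, std::false_type)
  {
    using DstDurationT = typename DstTimePointT::duration;
    const auto src_now = SrcTimePointT::clock::now();
    const auto dst_now = DstTimePointT::clock::now();
    return std::chrono::time_point_cast<DstDurationT>(dst_now + (tp - src_now));
  }

  // Entry point: selects the overload at compile time.
  template <typename DstTimePointT, typename SrcTimePointT>
  DstTimePointT
  clock_cast_opt(const SrcTimePointT tp)
  {
    using same_clock = std::is_same<typename SrcTimePointT::clock,
                                    typename DstTimePointT::clock>;
    return clock_cast_impl<DstTimePointT>(tp, same_clock {});
  }

  // Usage: converting to the same time_point type must be the identity.
  inline void check_identity(const std::chrono::system_clock::time_point tp)
  {
    assert(clock_cast_opt<std::chrono::system_clock::time_point>(tp) == tp);
  }

}
```

Because the decision is made on types, the same-clock branch compiles down to a plain time_point_cast and never touches either clock.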

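In case you want to repeat my experiments, I'm providing the full code here. The clock_cast_XYZ code is already complete. (Just concatenate all examples into one file, #include the obvious headers, and save it as clock_cast.hxx.)
Here is the actual main.cxx that I used.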

#include <iomanip>
#include <iostream>

#ifdef GLOBAL_ITERATION_COUNTER
static int GLOBAL_ITERATION_COUNTER;
#endif

#include "clock_cast.hxx"

int
main()
{
    using namespace std::chrono;
    const auto now = system_clock::now();
    const auto steady_now = CLOCK_CAST<steady_clock::time_point>(now);
#ifdef GLOBAL_ITERATION_COUNTER
    std::cerr << std::setw(8) << GLOBAL_ITERATION_COUNTER << '\n';
#endif
    const auto system_now = CLOCK_CAST<system_clock::time_point>(steady_now);
#ifdef GLOBAL_ITERATION_COUNTER
    std::cerr << std::setw(8) << GLOBAL_ITERATION_COUNTER << '\n';
#endif
    const auto diff = system_now - now;
    std::cout << std::setw(8) << duration_cast<nanoseconds>(diff).count() << '\n';
}

The following GNUmakefile builds and runs everything.

CXX = g++ -std=c++14
CPPFLAGS = -DGLOBAL_ITERATION_COUNTER=global_counter
CXXFLAGS = -Wall -Wextra -Werror -pedantic -O2 -g

runs = 50000
cutoff = 0.999

execfiles = zeroth.exe first.exe second.exe third.exe

datafiles =                            \
  zeroth.dat                           \
  first.dat                            \
  second.dat second_iterations.dat     \
  third.dat third_iterations.dat

picturefiles = ${datafiles:.dat=.png}

all: ${picturefiles}

zeroth.png: errors.gp zeroth.freq
    TAG='zeroth' TITLE="0th Approach ${SUBTITLE}" MICROS=0 gnuplot $<

first.png: errors.gp first.freq
    TAG='first' TITLE="1st Approach ${SUBTITLE}" MICROS=0 gnuplot $<

second.png: errors.gp second.freq
    TAG='second' TITLE="2nd Approach ${SUBTITLE}" gnuplot $<

second_iterations.png: iterations.gp second_iterations.freq
    TAG='second' TITLE="2nd Approach ${SUBTITLE}" gnuplot $<

third.png: errors.gp third.freq
    TAG='third' TITLE="3rd Approach ${SUBTITLE}" gnuplot $<

third_iterations.png: iterations.gp third_iterations.freq
    TAG='third' TITLE="3rd Approach ${SUBTITLE}" gnuplot $<

zeroth.exe: main.cxx clock_cast.hxx
    ${CXX} -o $@ ${CPPFLAGS} -DCLOCK_CAST='clock_cast_0th' ${CXXFLAGS} $<

first.exe: main.cxx clock_cast.hxx
    ${CXX} -o $@ ${CPPFLAGS} -DCLOCK_CAST='clock_cast_1st' ${CXXFLAGS} $<

second.exe: main.cxx clock_cast.hxx
    ${CXX} -o $@ ${CPPFLAGS} -DCLOCK_CAST='clock_cast_2nd' ${CXXFLAGS} $<

third.exe: main.cxx clock_cast.hxx
    ${CXX} -o $@ ${CPPFLAGS} -DCLOCK_CAST='clock_cast_3rd' ${CXXFLAGS} $<

%.freq: binput.py %.dat
    python $^ ${cutoff} > $@

${datafiles}: ${execfiles}
    ${SHELL} -eu run.sh ${runs} $^

clean:
    rm -f *.exe *.dat *.freq *.png

.PHONY: all clean

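The auxiliary run.sh script is rather simple. As an improvement over an earlier version of this answer, I am now executing the different programs in the inner loop so as to be fairer and perhaps also to get rid of caching effects better.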

#! /bin/bash -eu

n="$1"
shift

for exe in "$@"
do
    name="${exe%.exe}"
    rm -f "${name}.dat" "${name}_iterations.dat"
done

i=0
while [ $i -lt $n ]
do
    for exe in "$@"
    do
        name="${exe%.exe}"
        "./${exe}" 1>>"${name}.dat" 2>>"${name}_iterations.dat"
    done
    i=$(($i + 1))
done

I also wrote the binput.py script because I couldn't figure out how to do the histograms in Gnuplot alone.

#! /usr/bin/python3

import sys
import math

def main():
    cutoff = float(sys.argv[2]) if len(sys.argv) >= 3 else 1.0
    with open(sys.argv[1], 'r') as istr:
        values = sorted(list(map(float, istr)), key=abs)
    if cutoff < 1.0:
        values = values[:int((cutoff - 1.0) * len(values))]
    min_val = min(values)
    max_val = max(values)
    binsize = 1.0
    if max_val - min_val > 50:
        binsize = (max_val - min_val) / 50
    bins = int(1 + math.ceil((max_val - min_val) / binsize))
    histo = [0 for i in range(bins)]
    print("minimum: {:16.6f}".format(min_val), file=sys.stderr)
    print("maximum: {:16.6f}".format(max_val), file=sys.stderr)
    print("binsize: {:16.6f}".format(binsize), file=sys.stderr)
    for x in values:
        idx = int((x - min_val) / binsize)
        histo[idx] += 1
    for (i, n) in enumerate(histo):
        value = min_val + i * binsize
        frequency = n / len(values)
        print('{:16.6e} {:16.6e}'.format(value, frequency))

if __name__ == '__main__':
    main()

Finally, here are the errors.gp ...

tag = system('echo ${TAG-hist}')
file_hist = sprintf('%s.freq', tag)
file_plot = sprintf('%s.png', tag)
micros_eh = 0 + system('echo ${MICROS-0}')

set terminal png size 600,450
set output file_plot

set title system('echo ${TITLE-Errors}')

if (micros_eh) { set xlabel "error / µs" } else { set xlabel "error / ns" }
set ylabel "relative frequency"

set xrange [* : *]
set yrange [1.0e-5 : 1]

set log y
set format y '10^{%T}'
set format x '%g'

set style fill solid 0.6

factor = micros_eh ? 1.0e-3 : 1.0
plot file_hist using (factor * $1):2 with boxes notitle lc '#cc0000'

... and the iterations.gp scripts.

tag = system('echo ${TAG-hist}')
file_hist = sprintf('%s_iterations.freq', tag)
file_plot = sprintf('%s_iterations.png', tag)

set terminal png size 600,450
set output file_plot

set title system('echo ${TITLE-Iterations}')
set xlabel "iterations"
set ylabel "frequency"

set xrange [0 : *]
set yrange [1.0e-5 : 1]

set xtics 1
set xtics add ('' 0)

set log y
set format y '10^{%T}'
set format x '%g'

set boxwidth 1.0
set style fill solid 0.6

plot file_hist using 1:2 with boxes notitle lc '#3465a4'


kxxlusnw2#

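There's no way to do this precisely unless you know the precise duration difference between the two clocks' epochs. And you don't know this for high_resolution_clock and system_clock unless is_same<high_resolution_clock, system_clock>{} is true.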
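For contrast, here is what the exact case looks like when the epoch difference is known by construction. This is only a sketch: shifted_clock is a hypothetical clock defined as system_clock with its epoch moved by a fixed, arbitrarily chosen offset, and the member names from_system and to_system are made up for this illustration.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical clock that ticks like system_clock but counts from an epoch
// that lies epoch_offset() after system_clock's epoch.
struct shifted_clock
{
  using duration = std::chrono::system_clock::duration;
  using rep = duration::rep;
  using period = duration::period;
  using time_point = std::chrono::time_point<shifted_clock>;
  static constexpr bool is_steady = false;

  // Known, fixed distance between the two epochs (arbitrary value).
  static std::chrono::seconds epoch_offset()
  {
    return std::chrono::seconds {946684800};
  }

  // Exact conversions: pure arithmetic, no call to now() involved.
  static time_point from_system(const std::chrono::system_clock::time_point tp)
  {
    return time_point {
      std::chrono::duration_cast<duration>(tp.time_since_epoch() - epoch_offset())};
  }

  static std::chrono::system_clock::time_point to_system(const time_point tp)
  {
    return std::chrono::system_clock::time_point {
      std::chrono::duration_cast<duration>(tp.time_since_epoch() + epoch_offset())};
  }

  static time_point now()
  {
    return from_system(std::chrono::system_clock::now());
  }
};

// Usage: the round trip is exact, so this holds for every time point.
inline void check_round_trip(const std::chrono::system_clock::time_point tp)
{
  assert(shifted_clock::to_system(shifted_clock::from_system(tp)) == tp);
}
```

Nothing like epoch_offset() is available for an arbitrary pair of standard clocks, which is why only an approximation is possible there.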
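That being said, you can program up an approximately correct translation, and it goes much like T.C. says in his comment. Indeed, libc++ plays this trick in its implementation of condition_variable::wait_for:
https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b50122a161b91f59f19c9bd0d19/include/__mutex_base#L455
The calls to now of the different clocks are made as close together as possible, and one hopes that the thread isn't preempted between these two calls for too long. It's the best I know how to do, and the spec has wiggle room in it to allow for these types of shenanigans. E.g. something is allowed to wake up a little late, but not a little early.
In the case of libc++, the underlying OS only knows how to wait on system_clock::time_point, but the spec says you must wait on steady_clock (for good reasons). So you do what you can.
Here is a HelloWorld sketch of the idea: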

#include <chrono>
#include <iostream>

std::chrono::system_clock::time_point
to_system(std::chrono::steady_clock::time_point tp)
{
    using namespace std::chrono;
    auto sys_now = system_clock::now();
    auto sdy_now = steady_clock::now();
    return time_point_cast<system_clock::duration>(tp - sdy_now + sys_now);
}

std::chrono::steady_clock::time_point
to_steady(std::chrono::system_clock::time_point tp)
{
    using namespace std::chrono;
    auto sdy_now = steady_clock::now();
    auto sys_now = system_clock::now();
    return tp - sys_now + sdy_now;
}

int
main()
{
    using namespace std::chrono;
    auto now = system_clock::now();
    std::cout << now.time_since_epoch().count() << '\n';
    auto converted_now = to_system(to_steady(now));
    std::cout << converted_now.time_since_epoch().count() << '\n';
}

For me, using Apple clang/libc++ at -O3, this output:

1454985476610067
1454985476610073

indicating that the combined conversion had an error of 6 microseconds.

Update

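I have arbitrarily reversed the order of the calls to now() in one of the conversions above, such that one conversion calls them in one order and the other calls them in the reverse order. This *should* have no impact on the accuracy of any *one* conversion. However, when converting *both* ways, as I do in this HelloWorld, there should be a statistical cancellation that helps to reduce the *round-trip* conversion error.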
