Getting the common first zero-value index from three dense Eigen::MatrixXi in C++

Posted by i5desfxk on 2023-11-19 in Other


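I have an NxN non-negative Eigen::MatrixXi called cost_matrix and two Nx1 non-negative Eigen::VectorXi called rowVector and colVector (rowCover and colCover in the code below). I want to find the first index (i, j) such that cost_matrix(i, j), rowVector(i) and colVector(j) are all zero.
I know there are simple solutions, such as traversing all elements, but they take too much time. I want to find the most efficient way to do this with Eigen in C++.
Here is my current code; it is faster than traversing all elements with Eigen::DenseBase::visit.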

Eigen::MatrixXi cost_matrix(4,4);
cost_matrix<<1, 2, 3, 0, 
             5, 0, 0, 8,
             9, 8, 7, 6,
             0, 2, 1, 5;
Eigen::VectorXi rowCover(4);
Eigen::VectorXi colCover(4);
rowCover << 0, 0, 1, 1;
colCover << 1, 1, 0, 1;
//A data demo. Cost_matrix is a NxN nonnegative int matrix.
//RowCover and colCover are N nonnegative int vector.

int i, j;
cost_matrix.colwise() += rowCover;
cost_matrix.rowwise() += colCover.transpose();
Eigen::Index zeroRow, zeroCol;
int min = cost_matrix.transpose().minCoeff(&zeroCol, &zeroRow);
i = zeroRow;
j = zeroCol;
//the result should be
//i = 1
//j = 2


c++

Source: https://stackoverflow.com/questions/77493553/get-common-first-zero-value-indice-from-three-dense-eigenmatrixxi-in-c

1 Answer


62o28rlo1#

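As @chtz already suggested in the comments, Eigen will not really help you here. We can still try to find a faster version.
Here are my ideas:
First, I extend and fix your code to get a working reference implementation.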

/**
 * Reference implementation
 * 
 * Finds lowest row and column (prioritizes low row number)
 * where cost_matrix is zero and both row_cover and col_cover are zero
 * 
 * Returns {-1, -1} if no result is found.
 * The input arrays can be overwritten by the implementation as desired
 */
std::pair<int, int> reference(
      Eigen::MatrixXi& cost_matrix,
      Eigen::VectorXi& row_cover,
      Eigen::VectorXi& col_cover) noexcept
{
    int i, j;
    cost_matrix.colwise() += row_cover;
    cost_matrix.rowwise() += col_cover.transpose();
    Eigen::Index zeroRow, zeroCol;
    int min = cost_matrix.transpose().minCoeff(&zeroCol, &zeroRow);
    i = zeroRow;
    j = zeroCol;
    if(min) // no result found
        return {-1, -1};
    return {i, j};
}

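Going over the entire input matrix seems very wasteful unless the cover vectors are usually zero and the input matrix is not. If the covers contain many non-zero elements, it is better to convert them into vectors holding the indices of their zero entries.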

/**
 * Overwrite the front of cover with indices where it was zero
 * 
 * The return value gives the total number of zeros found.
 * The values behind that are left in undefined state
 */
static Eigen::Index to_zero_indices(Eigen::VectorXi& cover) noexcept
{
    Eigen::Index outpos = 0;
    for(Eigen::Index i = 0, n = cover.size(); i < n; ++i) {
        /* Loop body is written to be branchless */
        int value = cover[i];
        cover[outpos] = static_cast<int>(i);
        outpos += ! value;
    }
    return outpos;
}
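
To make the in-place compaction easier to follow, here is a minimal sketch of the same loop on a plain std::vector<int>. The name zero_indices_demo and the use of std::vector instead of Eigen::VectorXi are assumptions made only to keep the example free of dependencies; they are not part of the answer's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for to_zero_indices, using std::vector<int>
// instead of Eigen::VectorXi (an assumption for illustration only).
// Overwrites the front of `cover` with the positions that held zero and
// returns how many there were; entries past that count are unspecified.
std::size_t zero_indices_demo(std::vector<int>& cover)
{
    std::size_t outpos = 0;
    for (std::size_t i = 0; i < cover.size(); ++i) {
        const int value = cover[i];          // read before the slot can be overwritten
        assert(outpos <= i);                 // the write position never passes the read position
        cover[outpos] = static_cast<int>(i); // speculative write, kept only if value is zero
        outpos += (value == 0);              // advance without branching
    }
    return outpos;
}
```

With the question's demo data, a cover of {0, 0, 1, 1} yields a count of 2 with the indices 0 and 1 at the front, and a cover of {1, 1, 0, 1} yields a count of 1 with the index 2 at the front.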

Now we only need to check the matrix entries for which both covers are zero.

/**
 * Alternative implementation based on sparse probing of the cost matrix
 * 
 * API-compatible to reference but in practice changes the cover vectors,
 * not the cost matrix
 */
std::pair<int, int> probe_indices(
      Eigen::MatrixXi& cost_matrix,
      Eigen::VectorXi& row_cover,
      Eigen::VectorXi& col_cover) noexcept
{
    const Eigen::Index rows_n = to_zero_indices(row_cover);
    const Eigen::Index cols_n = to_zero_indices(col_cover);
    for(int i: row_cover.head(rows_n))
        for(int j: col_cover.head(cols_n))
            if(! cost_matrix(i, j))
                return {i, j};
    return {-1, -1};
}

For best results, compile with -DNDEBUG to avoid range checks in the various indexing operations.
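
As a rough illustration of the probing idea, the sketch below runs the same search on the question's demo data without Eigen. The names zero_positions and probe_demo, the flat row-by-row std::vector storage, and the choice to leave the inputs unmodified are all assumptions made for this example; the answer's probe_indices works on Eigen types and overwrites the cover vectors instead.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: positions of the zero entries of `cover`, ascending.
static std::vector<int> zero_positions(const std::vector<int>& cover)
{
    std::vector<int> out;
    for (std::size_t i = 0; i < cover.size(); ++i)
        if (cover[i] == 0)
            out.push_back(static_cast<int>(i));
    return out;
}

// Dependency-free stand-in for probe_indices (an assumption, not the
// answer's code). `cost` is an n x n matrix stored row by row.
// Returns the first (row, col) in row-major order where the cost and both
// covers are zero, or {-1, -1} if there is no such cell.
std::pair<int, int> probe_demo(const std::vector<int>& cost, std::size_t n,
                               const std::vector<int>& row_cover,
                               const std::vector<int>& col_cover)
{
    assert(cost.size() == n * n);
    assert(row_cover.size() == n && col_cover.size() == n);
    const std::vector<int> rows = zero_positions(row_cover);
    const std::vector<int> cols = zero_positions(col_cover);
    for (int i : rows)          // only uncovered rows
        for (int j : cols)      // only uncovered columns
            if (cost[static_cast<std::size_t>(i) * n + static_cast<std::size_t>(j)] == 0)
                return {i, j};
    return {-1, -1};
}
```

Called with the 4x4 matrix from the question, row cover {0, 0, 1, 1} and column cover {1, 1, 0, 1}, this probes only cells (0, 2) and (1, 2) and returns {1, 2}. Unlike the in-place version, this variant allocates two small vectors per call, which may matter when the search runs many times per frame.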

Testing

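The matrix size N is about 20, but the search runs many times per frame. The density of zeros is about 1/N (possibly a few zeros per row).
I assume this 1/N does not apply to the cover vectors; otherwise your chance of finding an entry is low. I arbitrarily decided to test with a zero density of 4/N for the cover vectors and 1/N for the cost matrix.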
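
A back-of-the-envelope estimate shows what these densities imply. The sketch below assumes that all entries are independent and uses the expected number of candidate cells as if it were the actual count, so the hit probability is only an approximation. The names ProbeEstimate and estimate_probes are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper for a rough estimate; not part of the benchmark.
struct ProbeEstimate {
    double candidate_cells;  // expected number of cells with both covers zero
    double hit_probability;  // approximate chance that one of them is zero
};

ProbeEstimate estimate_probes(int n, double cover_zero_density,
                              double cost_zero_density)
{
    assert(n > 0);
    const double zero_rows = n * cover_zero_density;  // expected uncovered rows
    const double zero_cols = n * cover_zero_density;  // expected uncovered columns
    const double candidates = zero_rows * zero_cols;
    // Chance that at least one candidate cell is zero, assuming independence.
    const double hit = 1.0 - std::pow(1.0 - cost_zero_density, candidates);
    return {candidates, hit};
}
```

For N = 20 with cover zero density 4/N and cost zero density 1/N, this gives about 16 candidate cells out of 400 and a hit probability of roughly 0.56. With a cover zero density of 1/N it gives a single candidate cell and a hit probability of about 0.05, which matches the remark that finding an entry would then be unlikely.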
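The full test and benchmark are shown below. On my system, my version is about 10 times faster for the chosen parameter set. Even if I change the covers to all zeros, it is still 7.5 times faster. Even in the absolute worst case (covers all zero, matrix all ones) it is still twice as fast.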

#include <Eigen/Dense>

#include <chrono>
#include <iostream>
#include <random>
// using std::default_random_engine, std::bernoulli_distribution
#include <utility>
// using std::pair

/**
 * Fill the output with zeros and ones depending on the percentage
 *
 * Results are random but reproducible across process executions.
 * Input percentage is expected to be between 0 and 1
 */
void fill_pct(Eigen::Ref<Eigen::MatrixXi> out, double pct_ones)
{
    static std::default_random_engine::result_type seed = 0xdeadbeef;
    std::default_random_engine rng {seed++};
    std::bernoulli_distribution distr {pct_ones};
    out = Eigen::MatrixXi::NullaryExpr(
          out.rows(), out.cols(),
          [&]() -> int { return distr(rng); });
}

// reference(), to_zero_indices() and probe_indices() as defined above go here.

int main()
{
    using clock_t = std::chrono::steady_clock;
    const int size = 20;
    const double pct_zero = 1. / size;
    const double pct_zero_cover = 4. / size;
    const int repetitions = 1000000;
    Eigen::MatrixXi cost_matrix, matrix_copy;
    Eigen::VectorXi row_cover, col_cover, row_copy, col_copy;
    clock_t::duration reference_time {}, bench_time {};
    for(int i = 0; i < repetitions; ++i) {
        cost_matrix.resize(size, size);
        fill_pct(cost_matrix, 1. - pct_zero);
        matrix_copy = cost_matrix;
        for(Eigen::VectorXi* cover: {&row_cover, &col_cover}) {
            cover->resize(size);
            fill_pct(*cover, 1. - pct_zero_cover);
        }
        row_copy = row_cover;
        col_copy = col_cover;
        clock_t::time_point t1 = clock_t::now();
        std::pair<int, int> bench_result = probe_indices(
              cost_matrix, row_cover, col_cover);
        clock_t::time_point t2 = clock_t::now();
        std::pair<int, int> ref_result = reference(
              matrix_copy, row_copy, col_copy);
        clock_t::time_point t3 = clock_t::now();
        bench_time += t2 - t1;
        reference_time += t3 - t2;
        if(bench_result != ref_result)
            std::cout << bench_result.first << " != " << ref_result.first
                      << ' ' << bench_result.second << " != " << ref_result.second
                      << '\n';
    }
    using std::chrono::milliseconds;
    std::cout << "Reference " << std::chrono::duration_cast<milliseconds>(
                    reference_time).count() * 1e-3
              << "\nBench " << std::chrono::duration_cast<milliseconds>(
                    bench_time).count() * 1e-3
              << '\n';
}

