Rust: most memory-efficient and fastest way to build a HashMap from the elements of a Vec

velaa5lx · asked on 2023-04-21

Suppose we read millions of items (from a DB table, a JSON file, etc.) into a Vec<Item>:

#[derive(Clone)]
struct Item {
    id: u64,
    name: String,
    //...
}
let items: Vec<Item> = load_items();

Usually, to work with this data efficiently, we need an additional data structure, for example a HashMap with one of the item's fields as the key and the item itself as the value. To me the simplest way looks like this:

let map_by_name: HashMap<String, Item> = items.iter().map(|x| (x.name.clone(), x.clone())).collect();

This is simple, but here we clone all the data, which is obviously not optimal in terms of performance and memory usage.
In GC-based languages (Java, .NET, Python, JS, etc.) this particular problem doesn't exist: the simplest and most natural implementation involves no cloning, it just stores additional references to the objects in the vector.
In Rust, AFAIK, we can't *move* or reference the vector elements. So what is the most efficient alternative? The following ideas come to mind:
1. Use Rc:

let items: Vec<Rc<Item>> = load_items();
let map_by_name: HashMap<String, Rc<Item>> = items.iter().map(|x| (x.name.clone(), Rc::clone(&x))).collect();

2. Store only the element's index into the vector as the HashMap value:

let map_by_name: HashMap<String, usize> = items.iter().enumerate().map(|(i, x)| (x.name.clone(), i)).collect();
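
For illustration, a lookup with this variant then goes through both structures (a quick sketch; "Abc 5" is just a made-up key):

// Hypothetical lookup: resolve the index first, then index into the Vec.
if let Some(&idx) = map_by_name.get("Abc 5") {
    let item: &Item = &items[idx];
    println!("{}", item.id);
}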

Is my understanding correct that the second approach is generally more efficient than the first? It may be less convenient, though, if we need to pass the map to other functions, because it then always requires passing the vector along as well.
Is there a third approach that I'm missing?
P.S.: In the examples above I used a cloned String as the key. Obviously, we could also use a &str obtained via x.name.as_ref() where that makes sense for a particular case, but here I want to focus on the bigger issue - the items themselves.


yduiuuwa #1

Storing indices - your second example - is not necessarily more efficient/faster, because an Rc<T> is just a pointer to an RcInner<T> that holds the reference counters and the T. In fact, Rc<T> and a usize index have the same size. Lookup time is essentially the same, since indexing into a Vec and dereferencing the T behind an Rc<T> are both basically a memory fetch. However, when building the lookup map there is more overhead with Rc<T>, because the reference counter of each Rc<Item> has to be incremented, whereas the index approach simply assumes the items will stay where they were originally found.
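
As a quick sanity check of the size claim, one can assert it directly (a minimal sketch, not part of the original answer; Item is the struct from the question):

use std::mem::size_of;
use std::rc::Rc;

// On the usual targets a (non-null) pointer has the same size as usize,
// and Rc<T> is exactly one such pointer.
assert_eq!(size_of::<Rc<Item>>(), size_of::<usize>());
assert_eq!(size_of::<Option<Rc<Item>>>(), size_of::<usize>()); // niche optimization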
That last point is what would tell me which approach to pick: if the Vec<Item> and the HashMap<Key, usize> can be kept tightly coupled - for example in the same struct, or inside the same function that they never escape - then using indices is probably more convenient.
But if the lookup map floats around freely, there is always the danger that the indices held in the map go stale when the Vec changes (e.g. someone may later add code that sorts the Vec after the indices have already been collected, breaking the coupling). I would use Rc<Item> to *guarantee* that the lookup map only ever holds valid references.
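
One way to enforce that coupling is to hide both collections behind a small struct, so outside code can never reorder the Vec and invalidate the indices (a sketch; the name ItemStore is made up):

use std::collections::HashMap;

struct ItemStore {
    items: Vec<Item>,
    by_name: HashMap<String, usize>,
}

impl ItemStore {
    fn new(items: Vec<Item>) -> Self {
        // Build the index map once; the Vec is never exposed mutably afterwards,
        // so the stored indices stay valid for the lifetime of the store.
        let by_name = items
            .iter()
            .enumerate()
            .map(|(i, x)| (x.name.clone(), i))
            .collect();
        Self { items, by_name }
    }

    fn get_by_name(&self, name: &str) -> Option<&Item> {
        self.by_name.get(name).map(|&i| &self.items[i])
    }
}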


pdkcd3nj #2

I wrote a simple test comparing the following:

1. Vec<Item> + HashMap<String, Item> (with cloning).
2. Vec<Item> + HashMap<&str, &Item>
3. Vec<Rc<Item>> + HashMap<&str, Rc<Item>>
4. Vec<Item> + HashMap<&str, usize>

I ran it several times on an Ubuntu 20.04 machine, in release mode with optimizations. The results show that variants 2 and 4 are the fastest. Variant 3 is slower, and of course 1 is the slowest. Note that in this test I only compared the time to *create* the vector and the HashMap; I did not measure lookup times.
Here is the test code:
use std::collections::HashMap;
use std::rc::Rc;
use std::time::Instant;

const CNT: u64 = 1_000_000;
const RUNS: usize = 6;

#[derive(Clone, Debug)]
struct Item {
    id: u64,
    name: String,
    //...
}

fn main() {
    println!("============ Clone ============");
    let t1 = Instant::now();
    for _ in 0..RUNS {
        test_clone();
    }
    println!("Total: {} ms", (Instant::now() - t1).as_millis());
    println!("============ Refs =============");
    let t1 = Instant::now();
    for _ in 0..RUNS {
        test_refs();
    }
    println!("Total: {} ms", (Instant::now() - t1).as_millis());
    println!("============ Rc =============");
    let t1 = Instant::now();
    for _ in 0..RUNS {
        test_rc();
    }
    println!("Total: {} ms", (Instant::now() - t1).as_millis());
    println!("============ Index =============");
    let t1 = Instant::now();
    for _ in 0..RUNS {
        test_index();
    }
    println!("Total: {} ms", (Instant::now() - t1).as_millis());
    println!("===============================");
}

fn test_clone() {
    let t1 = Instant::now();
    let items: Vec<Item> = (0..CNT).map(|i| Item { id: i, name: format!("Abc {}", i) }).collect();
    let t2 = Instant::now();
    let h: HashMap<String, Item> = items.iter().map(|i| (i.name.clone(), i.clone())).collect();
    let t3 = Instant::now();
    let t = (items.len(), h.len());
    println!("Elapsed: {:4} ms  ({:4} for vec, {:4} ms for map). Items: {:?}", (t3 - t1).as_millis(), (t2 - t1).as_millis(), (t3 - t2).as_millis(), t);
}

fn test_refs() {
    let t1 = Instant::now();
    let items: Vec<Item> = (0..CNT).map(|i| Item { id: i, name: format!("Abc {}", i) }).collect();
    let t2 = Instant::now();
    let h: HashMap<&str, &Item> = items.iter().map(|i| (i.name.as_ref(), i)).collect();
    let t3 = Instant::now();
    let t = (items.len(), h.len());
    println!("Elapsed: {:4} ms  ({:4} for vec, {:4} ms for map). Items: {:?}", (t3 - t1).as_millis(), (t2 - t1).as_millis(), (t3 - t2).as_millis(), t);
}

fn test_rc() {
    let t1 = Instant::now();
    let items: Vec<Rc<Item>> = (0..CNT).map(|i| Rc::new(Item { id: i, name: format!("Abc {}", i) })).collect();
    let t2 = Instant::now();
    let h: HashMap<&str, Rc<Item>> = items.iter().map(|i| (i.name.as_ref(), Rc::clone(i))).collect();
    let t3 = Instant::now();
    let t = (items.len(), h.len());
    println!("Elapsed: {:4} ms  ({:4} for vec, {:4} ms for map). Items: {:?}", (t3 - t1).as_millis(), (t2 - t1).as_millis(), (t3 - t2).as_millis(), t);
}

fn test_index() {
    let t1 = Instant::now();
    let items: Vec<Item> = (0..CNT).map(|i| Item { id: i, name: format!("Abc {}", i) }).collect();
    let t2 = Instant::now();
    let h: HashMap<&str, usize> = items.iter().enumerate().map(|(i, x)| (x.name.as_ref(), i)).collect();
    let t3 = Instant::now();
    let t = (items.len(), h.len());
    println!("Elapsed: {:4} ms  ({:4} for vec, {:4} ms for map). Items: {:?}", (t3 - t1).as_millis(), (t2 - t1).as_millis(), (t3 - t2).as_millis(), t);
}

The results were:

============ Clone ============
Elapsed:  419 ms  ( 189 for vec,  230 ms for map). Items: (1000000, 1000000)
Elapsed:  564 ms  ( 363 for vec,  200 ms for map). Items: (1000000, 1000000)
Elapsed:  425 ms  ( 188 for vec,  237 ms for map). Items: (1000000, 1000000)
Elapsed:  372 ms  ( 168 for vec,  204 ms for map). Items: (1000000, 1000000)
Elapsed:  365 ms  ( 167 for vec,  197 ms for map). Items: (1000000, 1000000)
Elapsed:  394 ms  ( 173 for vec,  221 ms for map). Items: (1000000, 1000000)
Total: 4139 ms
============ Refs =============
Elapsed:  242 ms  ( 170 for vec,   71 ms for map). Items: (1000000, 1000000)
Elapsed:  259 ms  ( 187 for vec,   71 ms for map). Items: (1000000, 1000000)
Elapsed:  241 ms  ( 169 for vec,   71 ms for map). Items: (1000000, 1000000)
Elapsed:  243 ms  ( 168 for vec,   75 ms for map). Items: (1000000, 1000000)
Elapsed:  240 ms  ( 168 for vec,   72 ms for map). Items: (1000000, 1000000)
Elapsed:  239 ms  ( 169 for vec,   69 ms for map). Items: (1000000, 1000000)
Total: 1678 ms
============ Rc =============
Elapsed:  274 ms  ( 201 for vec,   73 ms for map). Items: (1000000, 1000000)
Elapsed:  272 ms  ( 199 for vec,   73 ms for map). Items: (1000000, 1000000)
Elapsed:  276 ms  ( 201 for vec,   74 ms for map). Items: (1000000, 1000000)
Elapsed:  278 ms  ( 201 for vec,   76 ms for map). Items: (1000000, 1000000)
Elapsed:  280 ms  ( 203 for vec,   76 ms for map). Items: (1000000, 1000000)
Elapsed:  286 ms  ( 211 for vec,   74 ms for map). Items: (1000000, 1000000)
Total: 2297 ms
============ Index =============
Elapsed:  243 ms  ( 170 for vec,   73 ms for map). Items: (1000000, 1000000)
Elapsed:  244 ms  ( 172 for vec,   72 ms for map). Items: (1000000, 1000000)
Elapsed:  240 ms  ( 168 for vec,   71 ms for map). Items: (1000000, 1000000)
Elapsed:  245 ms  ( 172 for vec,   73 ms for map). Items: (1000000, 1000000)
Elapsed:  243 ms  ( 170 for vec,   73 ms for map). Items: (1000000, 1000000)
Elapsed:  243 ms  ( 171 for vec,   72 ms for map). Items: (1000000, 1000000)
Total: 1664 ms
===============================

Besides the times of the individual iterations, we should also pay attention to the "Total" runtime of each test type. It includes noticeably more extra time between the individual test runs for the Rc, and especially the Clone, variant. I think the reason is that for those variants there is much more data to *drop* at the end of each test_* function call.
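
If one wanted to separate that drop cost from the construction time, each test_* function could time the drops explicitly at the end, e.g. (a sketch of how test_clone might be extended, not something the numbers above include):

    // At the end of test_clone(), after the existing println!:
    let t4 = Instant::now();
    drop(h);      // drops the cloned Strings and Items held by the map
    drop(items);  // drops the original Vec
    println!("Drop: {} ms", (Instant::now() - t4).as_millis());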


ecr0jaav #3

For me these questions are mostly about memory, so I ran a memory benchmark using the size_of crate, criterion, and user at54321's code:

use criterion::{ criterion_group, criterion_main, Criterion};
use size_of::SizeOf;

use std::collections::HashMap;
use std::rc::Rc;

#[derive(Clone, Debug, SizeOf)]
struct Item {
    id: u64,
    name: String,
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

pub fn criterion_benchmark(c: &mut Criterion) {
    println!("============ Clone ============");
    c.bench_function("Cloning: 1000 entities", |b| {
        b.iter(|| test_clone(1000, false))
    });
    c.bench_function("Cloning: 1000000 entities", |b| {
        b.iter(|| test_clone(1000000, false))
    });

    test_clone(1000, true);
    test_clone(1000000, true);

    println!("============ Refs =============");
    c.bench_function("Refs: 1000 entities", |b| {
        b.iter(|| test_refs(1000, false))
    });
    c.bench_function("Refs: 1000000 entities", |b| {
        b.iter(|| test_refs(1000000, false))
    });

    test_refs(1000, true);
    test_refs(1000000, true);

    println!("============ Rc =============");
    c.bench_function("RCs: 1000 entities", |b| {
        b.iter(|| test_rc(1000, false))
    });
    c.bench_function("RCs: 1000000 entities", |b| {
        b.iter(|| test_rc(1000000, false))
    });

    test_rc(1000, true);
    test_rc(1000000, true);

    println!("============ Index =============");
    c.bench_function("Indexes: 1000 entities", |b| {
        b.iter(|| test_index(1000, false))
    });
    c.bench_function("Indexes: 1000000 entities", |b| {
        b.iter(|| test_index(1000000, false))
    });

    test_index(1000, true);
    test_index(1000000, true);

}

fn test_clone(cnt_var: u64, should_print: bool) {
    let items: Vec<Item> = (0..cnt_var).map(|i| Item { id: i, name: format!("Abc {}", i) }).collect();
    let h: HashMap<String, Item> = items.iter().map(|i| (i.name.clone(), i.clone())).collect();

    if should_print {
        println!("Size of {} Clone HashMap: {:?}", cnt_var, h.size_of());
        println!("Size of {} Clone Vec: {:?}", cnt_var, items.size_of());
    }
}

fn test_refs(cnt_var: u64, should_print: bool) {
    let items: Vec<Item> = (0..cnt_var).map(|i| Item { id: i, name: format!("Abc {}", i) }).collect();
    let h: HashMap<&str, &Item> = items.iter().map(|i| (i.name.as_ref(), i)).collect();

    if should_print {
        println!("Size of {} Refs HashMap: {:?}", cnt_var, h.size_of());
        println!("Size of {} Refs Vec: {:?}", cnt_var, items.size_of());
    }
}

fn test_rc(cnt_var: u64, should_print: bool) {
    let items: Vec<Rc<Item>> = (0..cnt_var).map(|i| Rc::new(Item { id: i, name: format!("Abc {}", i) })).collect();
    let h: HashMap<&str, Rc<Item>> = items.iter().map(|i| (i.name.as_ref(), Rc::clone(i))).collect();

    if should_print {
        println!("Size of {} RCs HashMap: {:?}", cnt_var, h.size_of());
        println!("Size of {} RCs Vec: {:?}", cnt_var, items.size_of());
    }
}

fn test_index(cnt_var: u64, should_print: bool) {
    let items: Vec<Item> = (0..cnt_var).map(|i| Item { id: i, name: format!("Abc {}", i) }).collect();
    let h: HashMap<&str, usize> = items.iter().enumerate().map(|(i, x)| (x.name.as_ref(), i)).collect();

    if should_print {
        println!("Size of {} Index HashMap: {:?}", cnt_var, h.size_of());
        println!("Size of {} Index Vec: {:?}", cnt_var, items.size_of());
    }
}

The results are as follows:

============ Clone ============
Cloning: 1000 entities  time:   [147.76 µs 150.60 µs 154.15 µs]
Found 17 outliers among 100 measurements (17.00%)
  1 (1.00%) high mild
  16 (16.00%) high severe

Cloning: 1000000 entities: time:   [525.85 ms 532.25 ms 539.09 ms]
Found 11 outliers among 100 measurements (11.00%)
  10 (10.00%) high mild
  1 (1.00%) high severe

Size of 1000 Clone HashMap: TotalSize { total_bytes: 130572, excess_bytes: 59736, shared_bytes: 0, distinct_allocations: 2001 }
Size of 1000 Clone Vec: TotalSize { total_bytes: 40024, excess_bytes: 1110, shared_bytes: 0, distinct_allocations: 1001 }
Size of 1000000 Clone HashMap: TotalSize { total_bytes: 139315500, excess_bytes: 62537664, shared_bytes: 0, distinct_allocations: 2000001 }
Size of 1000000 Clone Vec: TotalSize { total_bytes: 47920024, excess_bytes: 6031110, shared_bytes: 0, distinct_allocations: 1000001 }

============ Refs =============
Refs: 1000 entities     time:   [75.551 µs 78.979 µs 82.558 µs]
Found 17 outliers among 100 measurements (17.00%)
  3 (3.00%) high mild
  14 (14.00%) high severe

Refs: 1000000 entities  time:   [219.28 ms 222.75 ms 226.50 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Size of 1000 Refs HashMap: TotalSize { total_bytes: 51256, excess_bytes: 26200, shared_bytes: 0, distinct_allocations: 1 }
Size of 1000 Refs Vec: TotalSize { total_bytes: 40024, excess_bytes: 1110, shared_bytes: 0, distinct_allocations: 1001 }
Size of 1000000 Refs HashMap: TotalSize { total_bytes: 52428856, excess_bytes: 27428800, shared_bytes: 0, distinct_allocations: 1 }
Size of 1000000 Refs Vec: TotalSize { total_bytes: 47920024, excess_bytes: 6031110, shared_bytes: 0, distinct_allocations: 1000001 }

============ Rc =============
RCs: 1000 entities      time:   [110.63 µs 117.17 µs 124.82 µs]
Found 13 outliers among 100 measurements (13.00%)
  13 (13.00%) high severe

RCs: 1000000 entities   time:   [320.45 ms 324.97 ms 329.94 ms]
Found 14 outliers among 100 measurements (14.00%)
  7 (7.00%) high mild
  7 (7.00%) high severe

Size of 1000 RCs HashMap: TotalSize { total_bytes: 91256, excess_bytes: 27310, shared_bytes: 40000, distinct_allocations: 2001 }
Size of 1000 RCs Vec: TotalSize { total_bytes: 48024, excess_bytes: 1110, shared_bytes: 40000, distinct_allocations: 2001 }
Size of 1000000 RCs HashMap: TotalSize { total_bytes: 100348856, excess_bytes: 33459910, shared_bytes: 47920000, distinct_allocations: 2000001 }
Size of 1000000 RCs Vec: TotalSize { total_bytes: 55920024, excess_bytes: 6031110, shared_bytes: 47920000, distinct_allocations: 2000001 }

============ Index =============
Indexes: 1000 entities  time:   [73.632 µs 76.370 µs 79.756 µs]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

Indexes: 1000000 entities: time:   [217.83 ms 221.13 ms 224.78 ms]
Found 15 outliers among 100 measurements (15.00%)
  4 (4.00%) high mild
  11 (11.00%) high severe

Size of 1000 Index HashMap: TotalSize { total_bytes: 51256, excess_bytes: 26200, shared_bytes: 0, distinct_allocations: 1 }
Size of 1000 Index Vec: TotalSize { total_bytes: 40024, excess_bytes: 1110, shared_bytes: 0, distinct_allocations: 1001 }
Size of 1000000 Index HashMap: TotalSize { total_bytes: 52428856, excess_bytes: 27428800, shared_bytes: 0, distinct_allocations: 1 }
Size of 1000000 Index Vec: TotalSize { total_bytes: 47920024, excess_bytes: 6031110, shared_bytes: 0, distinct_allocations: 1000001 }

Edit: depending on the application and usage, even Box may help:

Box: 1000 entities      time:   [105.45 µs 108.84 µs 113.61 µs]
Found 13 outliers among 100 measurements (13.00%)
  3 (3.00%) high mild
  10 (10.00%) high severe

Benchmarking Box: 1000000 entities: Warming up for 3.0000 s
Box: 1000000 entities   time:   [288.40 ms 291.78 ms 295.87 ms]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

Size of 1000 Boxes HashMap: TotalSize { total_bytes: 51256, excess_bytes: 26200, shared_bytes: 0, distinct_allocations: 1 }
Size of 1000 Boxes Vec: TotalSize { total_bytes: 48024, excess_bytes: 1110, shared_bytes: 0, distinct_allocations: 2001 }
Size of 1000000 Boxes HashMap: TotalSize { total_bytes: 52428856, excess_bytes: 27428800, shared_bytes: 0, distinct_allocations: 1 }
Size of 1000000 Boxes Vec: TotalSize { total_bytes: 55920024, excess_bytes: 6031110, shared_bytes: 0, distinct_allocations: 2000001 }
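
The Box variant itself is not shown above; assuming it sits in the same file and mirrors the other test functions, it might look roughly like this (a sketch, not necessarily the exact code behind the numbers above), registered with criterion via c.bench_function like the others:

fn test_box(cnt_var: u64, should_print: bool) {
    // One heap allocation per Item, but no reference counter as with Rc.
    let items: Vec<Box<Item>> = (0..cnt_var)
        .map(|i| Box::new(Item { id: i, name: format!("Abc {}", i) }))
        .collect();
    // The map only borrows from the boxed items, so it adds no extra allocations.
    let h: HashMap<&str, &Item> = items
        .iter()
        .map(|i| (i.name.as_ref(), i.as_ref()))
        .collect();

    if should_print {
        println!("Size of {} Boxes HashMap: {:?}", cnt_var, h.size_of());
        println!("Size of {} Boxes Vec: {:?}", cnt_var, items.size_of());
    }
}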

wn9m85ua #4

You mentioned (emphasis mine):
In Rust, AFAIK, we can't *move* or reference the vector elements.
We actually can move the elements out of the vector! Assuming you don't need to keep the vector around, we can use the drain() method (note that items has to be bound as mut for this) to move all the elements out of the vector and into a hashmap, like this:

let map_by_name: HashMap<String, Item> = items.drain(..)
    .map(|x| (x.name.clone(), x)).collect();
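
A small variation on the snippet above: if the vector really is no longer needed at all, into_iter() achieves the same without requiring a mutable binding:

// Consumes `items` entirely; only the String key still needs a clone.
let map_by_name: HashMap<String, Item> = items.into_iter()
    .map(|x| (x.name.clone(), x)).collect();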
