Borrowed data escapes outside of function in Rust: how do I handle it?

qgelzfjb  posted on 2023-08-05  in Other

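I'm following the Rust-Lang Book, and after building the project in chapter 12 I wanted to optimize it with some concurrency. I decided to split the target file into chunks and assign one thread to process each chunk. Here is my current code: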

pub fn run(config: Config, content_per_thread: Vec<&str>) -> Result<(), Box<dyn Error>> {
    let (sender, receiver) = channel::<Vec<&str>>();
    concurrent_search(&config, content_per_thread, &sender);
    drop(sender);

    for v in receiver.iter() {
        v.iter().for_each(|line| println!("{:?}", line))
    }

    Ok(())
}

fn concurrent_search(config: &Config, content_per_thread: Vec<&str>, sender: &Sender<Vec<&str>>) {
    for i in 0..content_per_thread.len() {
        let th_query = config.query.clone();
        let th_sender = sender.clone();
        let th_content = content_per_thread[i].clone();

        thread::spawn(move || {
            let matched_lines = search(&th_query, th_content, true);
            th_sender.send(matched_lines.clone()).unwrap();
        });
    }
}

In the concurrent_search function, I get the following errors at the point where the closure is moved into the spawned thread:

error[E0521]: borrowed data escapes outside of function
  --> src/lib.rs:27:9
   |
21 |   fn concurrent_search(config: &Config, content_per_thread: Vec<&str>, sender: &Sender<Vec<&str>>) {
   |                                         ------------------      - let's call the lifetime of this reference `'1`
   |                                         |
   |                                         `content_per_thread` is a reference that is only valid in the function body
...
27 | /         thread::spawn(move || {
28 | |             let matched_lines = search(&th_query, th_content, true);
29 | |             th_sender.send(matched_lines.clone()).unwrap();
30 | |         });
   | |          ^
   | |          |
   | |__________`content_per_thread` escapes the function body here
   |            argument requires that `'1` must outlive `'static`

error[E0521]: borrowed data escapes outside of function
  --> src/lib.rs:27:9
   |
21 |   fn concurrent_search(config: &Config, content_per_thread: Vec<&str>, sender: &Sender<Vec<&str>>) {
   |                                                                        ------              - let's call the lifetime of this reference `'2`
   |                                                                        |
   |                                                                        `sender` is a reference that is only valid in the function body
...
27 | /         thread::spawn(move || {
28 | |             let matched_lines = search(&th_query, th_content, true);
29 | |             th_sender.send(matched_lines.clone()).unwrap();
30 | |         });
   | |          ^
   | |          |
   | |__________`sender` escapes the function body here
   |            argument requires that `'2` must outlive `'static`
   |
   = note: requirement occurs because of the type `Sender<Vec<&str>>`, which makes the generic argument `Vec<&str>` invariant
   = note: the struct `Sender<T>` is invariant over the parameter `T`
   = help: see <https://doc.rust-lang.org/nomicon/subtyping.html> for more information about variance

For more information about this error, try `rustc --explain E0521`.
error: could not compile `minigrep` due to 2 previous errors


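How can I achieve my goal?
The only way I have found to make this work is to create a copy of the current chunk every time a new thread is spawned. The problem with that approach is that copying 100K+ lines slows down execution.
Minimal reproducible example: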

use std::sync::mpsc::{channel, Sender};
use std::thread;

pub fn main() {
    let content_per_thread = vec!["Hello\nGoodbye", "Wow\nCool\nData!", "a\nb\nc"];
    let (sender, receiver) = channel::<Vec<&str>>();
    concurrent_search("o", content_per_thread, &sender);
    drop(sender);

    for v in receiver.iter() {
        v.iter().for_each(|line| println!("{:?}", line))
    }
}

fn concurrent_search(query: &str, content_per_thread: Vec<&str>, sender: &Sender<Vec<&str>>) {
    for i in 0..content_per_thread.len() {
        let th_query = query.clone();
        let th_sender = sender.clone();
        let th_content = content_per_thread[i].clone();

        thread::spawn(move || {
            let matched_lines = search(&th_query, th_content);
            th_sender.send(matched_lines).unwrap();
        });
    }
}

fn search<'a>(query: &str, content: &'a str) -> Vec<&'a str> {
    content
        .lines()
        .filter(|line| line.contains(query))
        .collect()
}

mcdcgff0


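Here is a solution that stays close to your original code. The main changes are:

  • Use thread::scope to avoid all the clones.
  • Iterate directly over content_per_thread. Do not use C-style iteration with offsets and [i] indexing in Rust. It is a bad habit carried over from C/C++: it is slow and error-prone.
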
use std::sync::mpsc::{channel, Sender};
use std::thread;

pub fn main() {
    let content_per_thread = vec!["Hello\nGoodbye", "Wow\nCool\nData!", "a\nb\nc"];
    let (sender, receiver) = channel::<Vec<&str>>();
    concurrent_search("o", content_per_thread, &sender);
    drop(sender);

    for v in receiver.iter() {
        v.iter().for_each(|line| println!("{:?}", line))
    }
}

fn concurrent_search<'a>(
    query: &str,
    content_per_thread: Vec<&'a str>,
    sender: &Sender<Vec<&'a str>>,
) {
    thread::scope(|s| {
        for th_content in content_per_thread {
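            // `move` copies the captured references (including the loop-local
            // `th_content`) into the closure, so it does not borrow a variable
            // that goes away at the end of each iteration.
            s.spawn(move || {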
                let matched_lines = search(query, th_content);
                sender.send(matched_lines).unwrap();
            });
        }
    })
}

fn search<'a>(query: &str, content: &'a str) -> Vec<&'a str> {
    content
        .lines()
        .filter(|line| line.contains(query))
        .collect()
}

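That said, doing it this way is somewhat slow and inefficient. Parallelism is hard to tune correctly, so I strongly recommend using a crate like rayon, which has already done that work for you.
It is hard to say exactly how to apply it here, though, because you have not given enough context about what you are actually trying to achieve for me to offer more detailed guidance.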
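
If you would rather stay with the standard library, one possible variation on the scoped-thread approach is to return each chunk's matches from its thread and collect them through the join handles, instead of sending them over a channel. This is only a sketch under my own assumptions, not a tuned solution: the function name search_chunks is made up for illustration, it still spawns one thread per chunk, and it needs Rust 1.63 or later for thread::scope. A side effect of joining in order is that the results line up with the input chunks.

```rust
use std::thread;

// Hypothetical variant of `concurrent_search`: returns the matches per chunk
// instead of sending them through a channel.
fn search_chunks<'a>(query: &str, content_per_thread: &[&'a str]) -> Vec<Vec<&'a str>> {
    thread::scope(|s| {
        // Spawn one scoped thread per chunk and keep the handles in chunk order.
        let handles: Vec<_> = content_per_thread
            .iter()
            .map(|&chunk| s.spawn(move || search(query, chunk)))
            .collect();

        // Join in the same order, so result i belongs to chunk i.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("search thread panicked"))
            .collect::<Vec<_>>()
    })
}

fn search<'a>(query: &str, content: &'a str) -> Vec<&'a str> {
    content
        .lines()
        .filter(|line| line.contains(query))
        .collect()
}

fn main() {
    let chunks = ["Hello\nGoodbye", "Wow\nCool\nData!", "a\nb\nc"];
    for matches in search_chunks("o", &chunks) {
        println!("{:?}", matches);
    }
}
```

Nothing is copied except the `&str` references themselves, so the 100K+ lines stay where they are. Whether this is faster than the channel version depends on the workload, so measure before committing to either.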
