R Markdown: output based on predefined text and a loop over a data frame

mtb9vblg  posted on 2022-12-20  in  Other

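What is the simplest way to create Word documents in a loop? I'm fairly new to R Markdown and to working with text in R, so I'm hoping there is a simple approach to the following:
I have a dataset of users, and I need to create a separate page/document for each user. For example, the users: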

df <- data.frame(name = c("Amy", "Bob", "Chloe", "Dan"),  
                 age = c(20, 35, 26, 41),  
                 country = c("USA", "UK", "FR", "AU"))

I also have predefined text:

text <- c("Name: ",
          "Age: ",
          "Country: ")

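Is there a simple way to loop over the names (rows) in df and produce one Word page per person (as plain text, not in a table), like this:
Name: Amy
Age: 20
Country: USA
I tried an R Markdown solution like the one in "Use loop to generate section of text in rmarkdown", and it worked until I added the officer library (I don't know why it fails on my computer), so I tried something like the following: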

for (i in seq(nrow(df))){
current <- df[i,]
current_value <- c(current$name,current$age, current$country)
df_text <- data.frame(text, current_value)
cat("\n\\pagebreak\n")
}

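But the output comes out on one line, with the df_text headers and row numbers. I want to work on the text formatting later, so I'm wondering whether there is a simple way to do this. (The real data may run to 100 docx pages/files.)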

xzv2uavs #1

Files

1. configure.R

## @knitr configure

# --- Be very, very, very quiet loading these packages...
suppressPackageStartupMessages( suppressWarnings( require( purrr     )))
suppressPackageStartupMessages( suppressWarnings( require( rmarkdown )))

# --- Convenience function to create templates
to_file <- function( filespec, what ){
  connection <- file(
      description = filespec
    , open = "wt"
  )
  write( what, connection )
  close( connection )
}
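
As a quick sanity check of `to_file`, the sketch below writes two lines to a temporary file and reads them back. The temporary path and the sample lines are made up for illustration; the function body is copied from `configure.R` above so the example runs on its own.

```r
# Copy of the helper from configure.R so this sketch is self-contained
to_file <- function( filespec, what ){
  connection <- file( description = filespec, open = "wt" )
  write( what, connection )
  close( connection )
}

# Hypothetical scratch file and contents, for illustration only
tmp <- tempfile( fileext = ".R" )
to_file( tmp, c( "## @knitr demo", "x <- 1" ) )

# Each element of the character vector ends up on its own line
written <- readLines( tmp )
print( written )
```

Because every element is written on its own line, a character vector of generated code can be handed to `to_file` directly.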

2. print_asis.R

This one is for MS-Word page breaks.

# Reference:
#  11.17 Customize the printing of objects in chunks (*)
#  https://bookdown.org/yihui/rmarkdown-cookbook/opts-render.html#opts-render

# Three elements are needed in order to print something ** as is **
# within a chunk (when everything else is formatted by knitr):

# NOTE the following:
# 1. `knit_print`, `knitr` and `asis_output` are all defined in the `knitr` package,
#    so don't mess with them.
# 2. `PRINT_ASIS`, `CONTROL`, and `FORMFEED` can be whatever you want them to be,
#    just remember, if you change them to something else, you make those changes
#    everywhere they appear.

## @knitr print_as_is

# --- (1) Define a `knit_print` method
PRINT_ASIS <- function( x, ... ){
  knitr::asis_output( x )
}

# --- (2) Register the method defined above
registerS3method(
    genname = "knit_print"                 # Generic name
  , class   = "CONTROL"                    # For which class of object?
  , method  = PRINT_ASIS                   # The custom method for this class
  , envir   = asNamespace( "knitr" )
)

# --- (3) Set the class of whatever is to be printed
FORMFEED <- '\\newpage'
class( FORMFEED ) <- "CONTROL"
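
To see the registration take effect outside of a document, one can call `knitr::knit_print()` directly on an object of class `CONTROL`. This is a minimal sketch that assumes the `knitr` package is installed; it repeats the three steps from `print_asis.R` so that it is self-contained.

```r
# (1) Method that hands its input to knitr untouched
PRINT_ASIS <- function( x, ... ){
  knitr::asis_output( x )
}

# (2) Register it as the knit_print method for class "CONTROL"
registerS3method(
    genname = "knit_print"
  , class   = "CONTROL"
  , method  = PRINT_ASIS
  , envir   = asNamespace( "knitr" )
)

# (3) Tag the page-break marker with that class
FORMFEED <- '\\newpage'
class( FORMFEED ) <- "CONTROL"

# Dispatch reaches PRINT_ASIS, which marks the text as "knit_asis"
result <- knitr::knit_print( FORMFEED )
print( class( result ) )
print( as.character( result ) )
```

An object of class `knit_asis` is what tells knitr to write the text into the output verbatim rather than formatting it as chunk output.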

3. data.csv

Name,Age,Country
Amy,20,USA
Bob,35,UK
Chloe,26,FR
Dan,41,AU

4. Multipage_MSWord_Doc.Rmd

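The contents of this file are described below in five parts (a YAML header plus four chunks).

How it works

The RMarkdown template, Multipage_MSWord_Doc.Rmd, reads CONFIG_FILE and DATA_FILE, then creates PAGE_TEMPLATE and BODY_TEMPLATE.
PAGE_TEMPLATE uses the column headers from DATA_FILE, so feel free to add fields or change the field names.

YAML header

Note that you must edit PATH in the header so that it points to the directory where you saved the data and configuration files.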

---
output: word_document
params:
  PATH:          'path/to/your/MultipageDoc'    # <------- EDIT THIS
  CONFIG_FILE:   'configure.R'
  DATA_FILE:     'data.csv'
  PAGE_TEMPLATE: 'AUTOGENERATED_page_template.R'
  BODY_TEMPLATE: 'AUTOGENERATED_body_template.R'
---

First chunk

The first chunk locates all the files and reads the input data.

```{r STEP1_CONFIGURE, eval=TRUE,echo=FALSE}
# --- Use these labels to refer to the various files
CONFIG_FILE    <- file.path( params$PATH, params$CONFIG_FILE    )
PAGE_TEMPLATE  <- file.path( params$PATH, params$PAGE_TEMPLATE  )
BODY_TEMPLATE  <- file.path( params$PATH, params$BODY_TEMPLATE  )
DATA_FILE      <- file.path( params$PATH, params$DATA_FILE  )

# --- Get the special sauce that prints MS-Word pagebreaks
knitr::read_chunk( 'print_asis.R'     )

# --- The configuration file must exist
stopifnot( file.exists( CONFIG_FILE   ))
knitr::read_chunk( CONFIG_FILE )

# --- The data file must exist
stopifnot( file.exists( DATA_FILE   ))
THE_DATA <- read.csv( DATA_FILE)
i <- 0    # Initialize the record/page counter
```

### Second chunk

The second chunk generates `PAGE_TEMPLATE`, which controls which data appear on each page and how they are formatted.

# --- Evaluate the <<configure>> chunk

<<configure>>

# --- Find out what this chunk is called and what the .Rmd file name is, as well
#     so we can put a comment in the templates that we generate. This way,
#     when you want to change a template, you can see where to find the code that
#     made it.

THIS_CHUNK <- knitr::opts_current$get()$label
THIS_FILE  <- knitr::current_input()

# --- Define the page template

stuff_to_put_in_page_template <- list(
    chunk_start = "## @knitr page_template\n"
  , counter = 'i <- 1 + i'
  , data = names( THE_DATA ) %>%
      purrr::map_chr(
        ~sprintf(
            "cat( sprintf( '%s: %%s', THE_DATA[ i, '%s' ] ))"
          ,                 .x
          ,                                         .x
         )
      )
  , notice = sprintf(
       "\n# NOTE: This file is generated by %s of %s\n#       Do not edit manually!\n"
     ,                              THIS_CHUNK
     ,                                     THIS_FILE
    )
)
stuff_to_put_in_page_template <- unlist( stuff_to_put_in_page_template )

# --- Write the page template to disk
to_file( PAGE_TEMPLATE, stuff_to_put_in_page_template )

# --- Make sure it gets there
stopifnot( file.exists( PAGE_TEMPLATE ))

# --- Now read it back as a chunk
knitr::read_chunk( PAGE_TEMPLATE )
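
To make the generated template concrete, the sketch below builds the same `cat( sprintf( ... ))` lines for the three columns of `data.csv`. It relies on `sprintf` being vectorized instead of using `purrr::map_chr`, and it hard-codes the column names rather than reading the file; both are simplifications for illustration.

```r
# Column names as they appear in the header row of data.csv
column_names <- c( "Name", "Age", "Country" )

# One line of R code per column; '%%s' survives as '%s' for the inner sprintf
generated_lines <- sprintf(
    "cat( sprintf( '%s: %%s', THE_DATA[ i, '%s' ] ))"
  , column_names
  , column_names
)

writeLines( generated_lines )
# cat( sprintf( 'Name: %s', THE_DATA[ i, 'Name' ] ))
# cat( sprintf( 'Age: %s', THE_DATA[ i, 'Age' ] ))
# cat( sprintf( 'Country: %s', THE_DATA[ i, 'Country' ] ))
```

Adding a column to `data.csv` therefore adds one more `cat` line to the template without any change to the `.Rmd` file.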

### Third chunk

The third chunk generates `BODY_TEMPLATE`, which uses FORMFEED to glue the pages together.

# --- Find out what this chunk is called
#     so we can put a comment in the templates that we generate. This way,
#     when you want to change a template, you can see where to find the code that
#     made it.

THIS_CHUNK <- knitr::opts_current$get()$label

# --- Glue the pages together with form feeds
first_item    <- "<<page_template>>"
repeated_part <- "FORMFEED\n<<page_template>>"

# --- Define the body template

stuff_to_put_in_contents_chunk <- list(
    chunk_start = "## @knitr contents\n"
  , data = c( first_item, rep( repeated_part, nrow( THE_DATA ) - 1 ))
  , notice = sprintf(
       "\n# NOTE: This file is generated by %s of %s\n#       Do not edit manually!\n"
     ,                              THIS_CHUNK
     ,                                     THIS_FILE
    )
)
stuff_to_put_in_contents_chunk <- unlist( stuff_to_put_in_contents_chunk )

# --- Write the body template to disk
to_file( BODY_TEMPLATE, stuff_to_put_in_contents_chunk )

# --- Make sure it gets there
stopifnot( file.exists( BODY_TEMPLATE ))

# --- Now read it back as a chunk
knitr::read_chunk( BODY_TEMPLATE )

# --- Evaluate the <<print_as_is>> chunk
<<print_as_is>>
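
For the four-row sample data, the body template comes out as one chunk reference per record with `FORMFEED` between them. The sketch below reproduces only the `data` element of the list, with the row count hard-coded to 4 as an assumption matching `data.csv`.

```r
n_records <- 4  # assumed: number of data rows in data.csv

first_item    <- "<<page_template>>"
repeated_part <- "FORMFEED\n<<page_template>>"

# The first page has no leading page break; every later page does
body_lines <- c( first_item, rep( repeated_part, n_records - 1 ))

# The embedded "\n" puts FORMFEED and the chunk reference on separate lines
file_lines <- unlist( strsplit( body_lines, "\n", fixed = TRUE ))
writeLines( file_lines )
```

Note that `rep` would stop with an error for an empty data set, because the repeat count would be negative, so this construction assumes at least one record.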

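### Fourth chunk

The fourth chunk puts everything together and renders the multipage document. Note that this chunk uses the `comment` option, as in `comment=''`. This overrides the default formatting, which puts two pound signs at the beginning of each line of output.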
# ,comment='' to prevent default ## at the beginning of each line of output

# --- Evaluate the <<contents>> chunk, populating the body of the multipage document.

<<contents>>

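## Chunk names

Wherever you see `## @knitr` *something*, that is the name of a *chunk*, that is, a way to refer to everything that follows it (up to the next chunk, anyway).
Wherever you see `<<something>>`, it is a reference to the chunk named `something`.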
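
The labelling mechanism can be tried in isolation. This sketch, which assumes the `knitr` package is installed, passes a few lines of code to `knitr::read_chunk()` through its `lines` argument instead of a file and then looks the chunk up by name. The label `hello` and the code under it are made up for illustration.

```r
# Lines as they might appear in an external script (hypothetical content)
script_lines <- c(
    "## @knitr hello"
  , "x <- 1"
)

# Register every labelled chunk found in those lines
knitr::read_chunk( lines = script_lines )

# The code stored under the label; in a document, <<hello>> refers to it
stored <- knitr::knit_code$get( "hello" )
print( as.character( stored ))

# Clean up so the label does not linger in the session
knitr::knit_code$restore()
```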

## Calling from the command line

Use the following command to invoke knitr from the command line:
 `Rscript -e "rmarkdown::render('Multipage_MSWord_Doc.Rmd', output_file = 'Multipage.doc')"`
