csv 使用PowerShell将特定行转置为列

rhfm7lfc  于 2023-01-03  发布在  Shell
关注(0)|答案(5)|浏览(219)

因此,我有一个CSV文件,其中包含我想使用PowerShell将(其中一些)转置为列的行。
示例如下:

ALPHA
CD
CL
CM
-5
0.1
-0.2
0.05
0
0.4
0.4
-0.08
5
0.5
0.8
-0.1

我想要的是这样的:

Alpha  CD    CL    CM
-5     0.1  -0.2   0.05
 0     0.4   0.4  -0.08
 5     0.5   0.8  -0.1

作为参考,我从一个.dat数据文件输出中得到了这些值,其中有超过400行的信息。我使用out-file将其重新格式化为CSV文件,并跳过了所有不需要的行。
信息被分成行而不是列,这意味着ALPHA CD CL CM都在一个单元格中,中间有空格,所以我使用split命令将它们分成行,如下所示。

$ text .split () | where { $ _ }

现在我想把其中的一些转回到列中。
问题是它不是固定的数量,这意味着它不总是四行四列,有时我会得到五行,我想变成五列,然后把每四行变成四列之后。
抱歉,我说得太多了,但事情是这样的:

Row 1 > Column 1 Row 1
Row 2 > Column 2 Row 1
Row 3 > Column 3 Row 1
Row 4 > Column 4 Row 1
Row 5 > Column 5 Row 1
Row 6 > Column 1 Row 2
Row 7 > Column 2 Row 2
Row 8 > Column 3 Row 2
Row 9 > Column 4 Row 2
Row 10 > Column 5 Row 2
Row 11 > Column 1 Row 3
Row 12 > Column 2 Row 3
Row 13 > Column 3 Row 3
Row 14 > Column 4 Row 3
Row 15 > Column 1 Row 4

请注意它是如何从五列变成现在的四列的。
如果在PowerShell之外的其他方法中可以更容易地完成,我可以使用PowerShell来运行它们,即调用PowerShell的批处理文件,这对我来说很好,因为我需要自动化一个非常长的过程,这是后面的过程步骤之一。
PS:数据没有干净地用逗号分隔。使用的程序 * DATCOM * 输出一个数据文件,它看起来整洁,结构化的文本格式,但当你导出CSV时它会破坏它,所以必须使用:

out-file name csv

PPS:没有明确的分隔符/截止点,也没有重复的数字或其他任何可以作为提示的东西,我必须按行号来做,这是我知道的,由于之前处理 * DATCOM *。
我在上面解释了很多,但是我试着用split命令,它把它们都放到行中,所以如果有一种方法,可以用空格分隔列(就像在 * Excel * 中一样)这将是完美的,甚至比将它们分成行然后转置到列更好。它必须完全像 * Excel *。问题是每个值之间有4 - 8个"空格",所以如果我尝试

import-csv -delim " "

在文件上,我得到了类似Alpha H1 H2 H3 CD H4 H5 H6 H7 H8 CL的东西,其他所有东西都被破坏了,而如果我真的打开 * Excel *,突出显示单元格,文本到列〉分隔〉检查"空格",结果是完美的。
以下是这些文件:https://easyupload.io/m/6q70ei

  • for006.dat是由 * DATCOM * 生成的数据文件。
  • Output1是我想做的,如上所述(行到列)。
  • Output2是我希望我以后能做的,即删除一列和一行,使它更干净,这是我理想的最终输出。
vxqlmq5t

vxqlmq5t1#

尝试以下操作:

$columns = 4
$data = 
"@ALPHA
CD
CL
CM
-5
0.1
-0.2
0.05
0
0.4
0.4
-0.08
5
0.5
0.8
-0.1@"
$data | Format-Table
$headers = [System.Collections.ArrayList]::new()
$table = [System.Collections.ArrayList]::new()
$rows = [System.IO.StringReader]::new($data)
for($i = 0; $i -lt $columns; $i++)
{
   $headers.Add($rows.ReadLine())
}
$rowCount = 0
Write-Host $headers
While(($line = $rows.ReadLine()) -ne $null)
{
   if($rowCount % $columns -eq 0)
   {
       $newRow = New-Object -TypeName psobject
       $table.Add($newRow)
   }
   $newRow | Add-Member -NotePropertyName $headers[$rowCount % $columns] -NotePropertyValue $line
   $rowCount++
}
$table | Format-Table
f5emj3cl

f5emj3cl2#

您可以使用PowerShell的开始/Process/End生命周期来“缓冲”输入数据,直到有足够的数据用于“行”,然后输出该数据并开始收集下一行:

# define width of each row as well as the column separator
$columnCount = 5
$delimiter = "`t"

# read in the file contents, "cell-by-cell"
$rawCSVData = Get-Content path\to\input\file.txt |ForEach-Object -Begin {
    # set up a buffer to hold 1 row at a time
    $index = 0
    $buffer = [psobject[]]::new($columnCount)
} -Process  {
    # add input to buffer, and optionally output
    $buffer[$index++] = $_
    if($index -eq $columnCount){
        # output row, reset column index
        $buffer -join $delimiter
        $index = 0
    }
} -End {
    # Output any partial last row
    if($index){
        $buffer -join $delimiter
    }
}

这将生成一个字符串列表,可以将其写入磁盘或使用PowerShell中的常规CSV解析工具进行解析:

$rawCSVData |Set-Content path\to\output.csv
# or
$rawCSVData |ConvertFrom-Csv -Delimiter $delimiter
gev0vcfq

gev0vcfq3#

一旦知道头文件和数据的行数,就可以使用ConvertFrom-Csv将文件转换为对象数组。
完成后,很容易创建一个新的csv,如下所示:

# in this example the first 4 lines are the columns, the rest is data
# other files may need a different number of columns
$columns = 4  
$data    = @(Get-Content -Path 'X:\Somewhere\data.txt')
$count   = 0
$result = while ($count -lt ($data.Count - ($columns - 1))) {
    $data[$count..($count + $columns - 1)] -join "`t"  # join the lines with a TAB
    $count += $columns
}

$result = $result | ConvertFrom-Csv -Delimiter "`t"

# output on screen
$result | Format-Table -AutoSize

# write to new csv file
$result | Export-Csv -Path 'X:\Somewhere\data_new.csv' -NoTypeInformation

屏幕输出:

ALPHA CD  CL   CM   
----- --  --   --   
-5    0.1 -0.2 0.05 
0     0.4 0.4  -0.08
5     0.5 0.8  -0.1
ghhaqwfi

ghhaqwfi4#

两个(自定义)函数可以帮助您从for006.dat中抓取数据:

Get-Content .\for006.dat |
SelectString -From '(?=0 ALPHA     CD       CL.*)' -To '^0.+' |
ForEach-Object { $_.SubString(1) } |
ConvertFrom-SourceTable |
ConvertTo-Csv

结果:

"ALPHA","CD","CL","CM","CN","CA","XCP","CLA","CMA","CYB","CNB","CLB"
"-6","0.013","-0.175","0.2807","-0.176","-0.006","-1.599","3.100E+00","-3.580E+00","-5.643E-02","-3.080E-03","-8.679E-02"
"-3","0.011","-0.011","0.0926","-0.012","0.01","-7.977","3.172E+00","-3.626E+00","","","-8.989E-02"
"0","0.013","0.157","-0.0990","0.157","0.013","-0.631","3.286E+00","-3.740E+00","","","-9.305E-02"
"3","0.019","0.333","-0.2991","0.334","0.001","-0.897","3.426E+00","-3.901E+00","","","-9.635E-02"
"6","0.029","0.516","-0.5075","0.516","-0.025","-0.984","3.529E+00","-4.084E+00","","","-9.979E-02"
"7.5","0.036","0.609","-0.6158","0.608","-0.044","-1.013","3.472E+00","-4.002E+00","","","-1.015E-01"
"9","0.043","0.698","-0.7171","0.696","-0.067","-1.031","3.218E+00","-3.679E+00","","","-1.032E-01"
"10","0.047","0.752","-0.7791","0.748","-0.084","-1.041","2.895E+00","-3.489E+00","","","-1.042E-01"
"11","0.051","0.799","-0.8388","0.794","-0.102","-1.057","2.572E+00","-3.345E+00","","","-1.051E-01"
"12","0.055","0.841","-0.8958","0.835","-0.121","-1.073","2.320E+00","-3.178E+00","","","-1.059E-01"
"13","0.059","0.88","-0.9498","0.87","-0.140","-1.091","2.041E+00","-2.983E+00","","","-1.066E-01"
"14","0.063","0.913","-0.9999","0.901","-0.160","-1.110","1.738E+00","-2.772E+00","","","-1.072E-01"
"15","0.066","0.94","-1.0465","0.925","-0.180","-1.131","1.356E+00","-2.567E+00","","","-1.077E-01"
"16","0.067","0.96","NA","0.941","-0.201","NA","1.798E-02","NA","","","-1.081E-01"
"18","0.055","0.883","NA","0.857","-0.220","NA","-4.434E+00","NA","","","-1.066E-01"
l2osamch

l2osamch5#

嗯......我担心你的描述很混乱,所以我忘记了,专注于你的文件......
下面的批处理文件读取for006.dat文件并生成.csv格式的“理想最终输出”Output2.xlsx文件。

@echo off
setlocal EnableDelayedExpansion

set "skip="
set "lines="
for /F "delims=:" %%a in ('findstr /N /L /C:" ALPHA" for006.dat') do (
   if not defined skip (
      set /A "skip=%%a-1"
   ) else if not defined lines (
      set /A "lines=%%a-skip-1"
   )
)

< for006.dat (
   for /L %%a in (1,1,%skip%) do set /P "="
   for /L %%a in (1,1,%lines%) do (
      set /P "line="
      set "line=!line:~2!"
      if defined line call :reformat
   )
) > Output2.csv
goto :EOF

:reformat
   set "newLine=%line:  = %"
   if "%newLine%" == "%line%" goto continue
   set "line=%newLine%"
goto reformat

:continue
if "%line:~0,1%" == " " set "line=%line:~1%"
if "%line:~-1%" == " " set "line=%line:~0,-1%"
echo "%line: =","%"

这是输出2.csv:

"ALPHA","CD","CL","CM","CN","CA","XCP","CLA","CMA","CYB","CNB","CLB"
"-6.0","0.013","-0.175","0.2807","-0.176","-0.006","-1.599","3.100E+00","-3.580E+00","-5.643E-02","-3.080E-03","-8.679E-02"
"-3.0","0.011","-0.011","0.0926","-0.012","0.010","-7.977","3.172E+00","-3.626E+00","-8.989E-02"
"0.0","0.013","0.157","-0.0990","0.157","0.013","-0.631","3.286E+00","-3.740E+00","-9.305E-02"
"3.0","0.019","0.333","-0.2991","0.334","0.001","-0.897","3.426E+00","-3.901E+00","-9.635E-02"
"6.0","0.029","0.516","-0.5075","0.516","-0.025","-0.984","3.529E+00","-4.084E+00","-9.979E-02"
"7.5","0.036","0.609","-0.6158","0.608","-0.044","-1.013","3.472E+00","-4.002E+00","-1.015E-01"
"9.0","0.043","0.698","-0.7171","0.696","-0.067","-1.031","3.218E+00","-3.679E+00","-1.032E-01"
"10.0","0.047","0.752","-0.7791","0.748","-0.084","-1.041","2.895E+00","-3.489E+00","-1.042E-01"
"11.0","0.051","0.799","-0.8388","0.794","-0.102","-1.057","2.572E+00","-3.345E+00","-1.051E-01"
"12.0","0.055","0.841","-0.8958","0.835","-0.121","-1.073","2.320E+00","-3.178E+00","-1.059E-01"
"13.0","0.059","0.880","-0.9498","0.870","-0.140","-1.091","2.041E+00","-2.983E+00","-1.066E-01"
"14.0","0.063","0.913","-0.9999","0.901","-0.160","-1.110","1.738E+00","-2.772E+00","-1.072E-01"
"15.0","0.066","0.940","-1.0465","0.925","-0.180","-1.131","1.356E+00","-2.567E+00","-1.077E-01"
"16.0","0.067","0.960","NA","0.941","-0.201","NA","1.798E-02","NA","-1.081E-01"
"18.0","0.055","0.883","NA","0.857","-0.220","NA","-4.434E+00","NA","-1.066E-01"

您还可以生成不带引号的.csv输出文件,只需删除最后一个echo命令中的引号即可

相关问题