Split a text file based on text in the file

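Problem description

I have a large text (.txt) file that contains several documents, each of which needs to go into its own file.

Each document begins with a header, which we can use to identify where it starts.

I would like to start a new file at that point and name the files with an incrementing number.

Bonus points: parse each newly split file and grab some sample text, e.g. “Doc No. 1”, to use as the file name.

I tried this, along with a few other suggestions, with no luck: https://forums.windowssecrets.com/showthread.php/174836-Powershell-Split-a-Text-File-Output-With-Delimiter-As-File-Name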

  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA

  ADDRESS CORRECTION REQUESTED                  Document No.         1

                                                period:
                                                DATE thru DATE

EXAMPLE DATA                    EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA


EXAMPLE DATA

          XXXXXXXXXXXX                             XXXX





  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA

  ADDRESS CORRECTION REQUESTED                  Document No.         2

                                                period:
                                                DATE thru DATE

EXAMPLE DATA                    EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA


EXAMPLE DATA

          XXXXXXXXXXXX                             XXXX






  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA

  ADDRESS CORRECTION REQUESTED                  Document No.         3

                                                period:
                                                DATE thru DATE

EXAMPLE DATA                    EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA


EXAMPLE DATA

          XXXXXXXXXXXX                             XXXX






  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA
  HEADER                                        EXAMPLE DATA

  ADDRESS CORRECTION REQUESTED                  Document No.         4

                                                period:
                                                DATE thru DATE

EXAMPLE DATA                    EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA
EXAMPLE DATA                        EXAMPLE DATA


EXAMPLE DATA

          XXXXXXXXXXXX                             XXXX

Tags: powershell

Solution

Given a file SplitText.txt in the current folder:

> Get-Content .\SplitText.txt
xxx FirstFile zzz
FirstFile line 1
FirstFile line 2
FirstFile line 3
FirstFile line 4
FirstFile line 5
FirstFile line 6
xxx SecondFile zzz
SecondFile line A
SecondFile line B
SecondFile line C
SecondFile line D

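This script splits it at each delimiter line and saves every section to its own file, named after the original file's BaseName with a running number appended: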

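## Q:\Test\2019\01\31\SO_54467665.ps1
$File = Get-Item ".\SplitText.txt"
$i = 0
# Read the whole file as one string, split it on each delimiter line,
# and drop the empty piece that precedes the first delimiter.
(Get-Content $File.FullName -Raw) -split 'xxx .*? zzz\r?\n' -ne '' | ForEach-Object {
    $i++
    # Build <folder>\<BaseName>_<n><Extension> next to the source file.
    $Path = "{0}\{1}_{2}{3}" -f $File.DirectoryName, $File.BaseName, $i, $File.Extension
    $_ | Set-Content -Path $Path
}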

> Get-Content .\SplitText_1.txt
FirstFile line 1
FirstFile line 2
FirstFile line 3
FirstFile line 4
FirstFile line 5
FirstFile line 6

> Get-Content .\SplitText_2.txt
SecondFile line A
SecondFile line B
SecondFile line C
SecondFile line D
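
For the documents in the question, the same idea can also be applied line by line, which makes it easy to pick up the "Document No." value for the file name. What follows is only a sketch under some assumptions: the literal text HEADER stands in for the real header text, the header lines are contiguous and occur only at the top of each document, and each document contains a "Document No. N" line. Split-DocumentText, its parameters, the in-memory sample, and the SplitDemo output folder are hypothetical names made up for illustration, and the [...]::new() syntax needs PowerShell 5 or later.

```powershell
# Hypothetical helper (not part of the original answer): groups lines into
# documents and names each one after its "Document No. N" line.
function Split-DocumentText {
    param(
        [string[]]$Lines,
        [string]$HeaderPattern = '^\s*HEADER\b',          # assumed header marker
        [string]$NumberPattern = 'Document No\.\s+(\d+)'  # assumed number format
    )

    $docs     = [System.Collections.Generic.List[object]]::new()
    $current  = $null
    $inHeader = $false

    foreach ($line in $Lines) {
        $isHeader = $line -cmatch $HeaderPattern
        # A header line that follows a non-header line opens a new document.
        if ($isHeader -and (-not $inHeader)) {
            $current = [pscustomobject]@{
                Name  = $null
                Lines = [System.Collections.Generic.List[string]]::new()
            }
            $docs.Add($current)
        }
        $inHeader = $isHeader
        if ($null -eq $current) { continue }   # ignore text before the first header
        $current.Lines.Add($line)
        # Take the name from the first "Document No." line in the document.
        if ((-not $current.Name) -and ($line -match $NumberPattern)) {
            $current.Name = 'Doc No. {0}' -f $Matches[1]
        }
    }

    # Fall back to a running number when no document number was found.
    $i = 0
    foreach ($doc in $docs) {
        $i++
        if (-not $doc.Name) { $doc.Name = "$i" }
    }
    return $docs
}

# Small in-memory sample shaped like the data in the question.
$sample = @(
    '  HEADER                       EXAMPLE DATA'
    '  HEADER                       EXAMPLE DATA'
    ''
    '  ADDRESS CORRECTION REQUESTED   Document No.         1'
    'EXAMPLE DATA'
    '  HEADER                       EXAMPLE DATA'
    '  ADDRESS CORRECTION REQUESTED   Document No.         2'
    'EXAMPLE DATA'
)

$docs = @(Split-DocumentText -Lines $sample)

# Write each document to <Name>.txt in a scratch folder (hypothetical location).
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'SplitDemo'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null
foreach ($doc in $docs) {
    $doc.Lines | Set-Content -LiteralPath (Join-Path $outDir ($doc.Name + '.txt'))
}
```

To run this against a real file, pass the output of Get-Content (without -Raw) as -Lines and point $outDir at the target folder. If two documents could carry the same document number, the later file would overwrite the earlier one, so in that case the running number is the safer name.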
