Count lines in zipped files using Windows PowerShell

Problem description

There is a folder that contains more than 1000 zipped files. Each zipped file contains 12 other zipped files, each of which contains one CSV file. I need to count the total number of lines across all of the files.

This can be done with Windows PowerShell, but I am having trouble unzipping the files, counting the lines, and zipping them again in order to save disk space during the process.

$folderPath="C:\_Unzip_Folder";

Get-ChildItem $folderPath -recurse | %{ 

    # match any file whose name ends in ".zip"
    if ($_.Name -match '\.zip$')
    {
        $parent="$(Split-Path $_.FullName -Parent)";    
        write-host "Extracting $($_.FullName) to $parent"

        $arguments=@("e", "`"$($_.FullName)`"", "-o`"$($parent)`"");
        $ex = start-process -FilePath "`"C:\Program Files\7-Zip\7z.exe`"" -ArgumentList $arguments -wait -PassThru;

        if( $ex.ExitCode -eq 0)
        {
            write-host "Extraction successful, deleting $($_.FullName)"
            Remove-Item -Path $_.FullName -Force
        }
    }
}

Get-ChildItem $folderPath -recurse -Filter *.csv | %{ 
    Get-Content $($_.FullName)  | Measure-Object -Line
}

cmd /c pause | out-null

Now it counts the lines per file, but it would be easier if it summed them for me.

Can someone help me with this task?

Thank you all.

Tags: powershell, zip, counting

Solution


You can also keep everything in memory, like this:

Set-StrictMode -Version "Latest"
$ErrorActionPreference = "Stop"
$InformationPreference = "Continue"

Add-Type -Assembly "System.IO.Compression.FileSystem"

$folderPath = "C:\_Unzip_Folder\*.zip"
$files      = Get-ChildItem $folderPath -Recurse
$csvCount   = 0
$lineCount  = 0
$bufferSize = 1MB
$buffer     = [byte[]]::new($bufferSize)

foreach ($file in $files)
{
    Write-Information "Getting information from '$($file.FullName)'"

    $zip  = [System.IO.Compression.ZipFile]::OpenRead($file.FullName)
    $csvs = $zip.Entries | Where-Object { [System.IO.Path]::GetExtension($_.Name) -eq ".csv" }
    foreach ($csv in $csvs)
    {
        $csvCount++
        Write-Information "Counting lines in '$($csv.FullName)'"

        $stream = $csv.Open()
        try
        {
            $byteCount = $stream.Read($buffer, 0, $bufferSize)
            while ($byteCount)
            {
                for ($i = 0; $i -lt $byteCount; $i++)
                {
                    # assume line feed (LF = 10) is the end-of-line marker
                    # you could also use carriage return (CR = 13)
                    if ($buffer[$i] -eq 10) { $lineCount++ }
                }
                $byteCount = $stream.Read($buffer, 0, $bufferSize)
            }
        }
        finally
        {
            $stream.Close()
        }
    }

    # release the file handle before moving on to the next archive
    $zip.Dispose()
}

Write-Information "Counted a total of $lineCount line(s) in $csvCount CSV-file(s)"
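
Note that the script above only looks at CSV entries stored directly inside each zip file, while the question describes zip files nested inside zip files. The following is a minimal sketch of one way to handle that case, still without writing anything to disk. `Get-ZipLineCount` is a hypothetical helper name, not part of the original answer. It assumes every inner archive is small enough to be buffered in memory, and it keeps the same assumption that a line feed (LF = 10) marks the end of each line.

```powershell
# Sketch only: Get-ZipLineCount is a hypothetical helper, not from the original answer.
Add-Type -AssemblyName "System.IO.Compression"
Add-Type -AssemblyName "System.IO.Compression.FileSystem"

function Get-ZipLineCount
{
    param([System.IO.Compression.ZipArchive] $Archive)

    $total  = 0
    $buffer = [byte[]]::new(1MB)

    foreach ($entry in $Archive.Entries)
    {
        $extension = [System.IO.Path]::GetExtension($entry.Name)
        if ($extension -ne ".zip" -and $extension -ne ".csv") { continue }

        $stream = $entry.Open()
        try
        {
            if ($extension -eq ".zip")
            {
                # open the nested archive straight from the entry stream and recurse into it
                $inner = [System.IO.Compression.ZipArchive]::new($stream, [System.IO.Compression.ZipArchiveMode]::Read)
                try     { $total += Get-ZipLineCount -Archive $inner }
                finally { $inner.Dispose() }
            }
            else
            {
                # count LF bytes, as in the script above
                while (($byteCount = $stream.Read($buffer, 0, $buffer.Length)) -gt 0)
                {
                    for ($i = 0; $i -lt $byteCount; $i++)
                    {
                        if ($buffer[$i] -eq 10) { $total++ }
                    }
                }
            }
        }
        finally
        {
            $stream.Dispose()
        }
    }

    return $total
}

# Possible usage, with the folder path taken from the question:
#
#   $lineCount = 0
#   foreach ($file in Get-ChildItem "C:\_Unzip_Folder\*.zip")
#   {
#       $zip = [System.IO.Compression.ZipFile]::OpenRead($file.FullName)
#       try     { $lineCount += Get-ZipLineCount -Archive $zip }
#       finally { $zip.Dispose() }
#   }
#   Write-Output "Counted a total of $lineCount line(s)"
```

Because the helper calls itself for every `.zip` entry, it should work for any nesting depth, at the cost of holding one decompressed inner archive in memory per level.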
