VBA for Excel: editing very large files

Problem description

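I have a set of very large log files containing AIS (shipping) data. Because these log files are about 200 MB per day, I am trying to reduce their size for archiving. The files look like this: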

244630075;under way ;128°'; 0.0kt;52.395290N;4.886883E;342.0°;511°;55s; 170418 000000;serial#1(A)[1]
244670835;under way ;128°'; 0.0kt;52.410140N;4.833700E;283.8°;511°;54s; 170418 000000;serial#1(B)[3]
244750830;under way ;128°'; 0.0kt;52.404563N;4.864063E;  0.0°;511°;55s; 170418 000000;serial#1(B)[1]
244900124;under way ;000°'; 7.1kt;52.426495N;4.780100E;279.4°;281°;56s; 170418 000000;serial#1(B)[2]
244670779;under way ;000°'; 0.0kt;52.420773N;4.801418E;330.9°;325°;58s; 170418 000000;serial#1(A)[1]
244660512;under way ;128°'; 0.0kt;52.402092N;4.781258E;268.3°;511°;54s; 170418 000000;serial#1(B)[1]
236202000;under way ;000°';11.7kt;52.477408N;4.462048E;285.4°;296°;55s; 170418 000000;serial#1(B)[1]
244690403;under way ;128°'; 0.0kt;52.400760N;4.891647E;  0.0°;511°;55s; 170418 000000;serial#1(A)[1]

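Each file has roughly 2 million lines. To shrink the files, I want to delete every line that contains "0.0kt", because it carries no information that is useful to me. To do this I wrote a VBA script in Excel. The main part of the script seems to work: it walks through the file and blanks out every line that contains "0.0kt". But when the script finishes and should save the result, it writes out an empty file.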

Here is my script:

Sub test()
'this will force the script to end when end of file is reached
On Error GoTo ASD

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\x\170418.log", ForReading)

x = 1

Do
Do While i < 1000

        strline = objFile.ReadLine
         If InStr(strline, " 0.0kt") = 28 Then
            strline = "" & vbCrLf

        End If
    i = i + 1

Loop

'doevents and a calculation to call doevents after 1000 lines to prevent freezing of the script
DoEvents
a = a + 1
b = a * 1000
i = i + b
x = i / 1000
i = 0
iLineNumber = x

Loop

ASD:

objFile.Close

Set objFile = objFSO.OpenTextFile("C:\x\170418.log", ForWriting)
objFile.Write strline

objFile.Close

End Sub

What am I missing to save and close the file with only the lines containing "0.0kt" removed, rather than all lines removed?

Thanks

Tags: excel, large-files, large-data-volumes, vba

Solution


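Your script ends up with an empty file because strline only ever holds the most recently read line, and that single value is all that gets written back when the file is reopened ForWriting. Looking at your sample text, I think any line that contains ; 0.0kt; can be excluded.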

Using something I've already built, I've tweaked it to pick up your file and call DoEvents every 1000 rows, as your script does.

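Sub Test()

    Dim ifileno As Integer, ofileno As Integer, rownum As Long
    Dim ifilename As String, ofilename As String, excludestring As String, strLine As String

    ' Input log and a separate output file, so the original is never overwritten
    ifilename = "C:\Users\v.doynov\Desktop\nd.txt"
    ofilename = "C:\Users\v.doynov\Desktop\nd_output.txt"
    ' Lines containing this text (a speed field of exactly 0.0 knots) are dropped
    excludestring = "; 0.0kt;"

    ifileno = FreeFile
    Open ifilename For Input As #ifileno

    ofileno = FreeFile
    Open ofilename For Output As #ofileno

    rownum = 0

    ' Stream the file line by line; only one line is held in memory at a time
    Do Until EOF(ifileno)
        rownum = rownum + 1
        Line Input #ifileno, strLine
        ' Copy the line across unless it contains the excluded text
        If InStr(strLine, excludestring) = 0 Then Print #ofileno, strLine
        ' Yield to Excel every 1000 rows so the application stays responsive
        If rownum Mod 1000 = 0 Then DoEvents
    Loop

    Close #ifileno
    Close #ofileno

End Sub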
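
If the goal is to end up with the filtered data under the original file name (as in the question), one possible approach is to write to a temporary file first and swap it in only after both files are closed. The following is a minimal sketch, not part of the original answer: the function name FilterLogInPlace and the ".tmp" suffix are hypothetical, and it assumes the log uses Windows (CRLF) line endings, since Line Input would read a file with LF-only line endings as one single line.

```vba
Option Explicit

' Hypothetical helper: copies srcPath to a temporary file, skipping every line
' that contains excludeText, then replaces the original with the filtered copy.
' Returns the number of lines that were kept.
Function FilterLogInPlace(ByVal srcPath As String, ByVal excludeText As String) As Long
    Dim inNo As Integer, outNo As Integer
    Dim tmpPath As String, strLine As String
    Dim rowNum As Long, kept As Long

    ' Assumed temporary name; any existing file with this name is overwritten
    tmpPath = srcPath & ".tmp"

    inNo = FreeFile
    Open srcPath For Input As #inNo
    outNo = FreeFile
    Open tmpPath For Output As #outNo

    Do Until EOF(inNo)
        Line Input #inNo, strLine
        rowNum = rowNum + 1
        ' Keep the line only if the excluded text does not occur in it
        If InStr(strLine, excludeText) = 0 Then
            Print #outNo, strLine
            kept = kept + 1
        End If
        ' Yield every 1000 rows so Excel stays responsive
        If rowNum Mod 1000 = 0 Then DoEvents
    Loop

    Close #inNo
    Close #outNo

    ' The original is deleted only after the filtered copy is complete
    Kill srcPath
    Name tmpPath As srcPath

    FilterLogInPlace = kept
End Function
```

It could be called from the Immediate window with, for example, Debug.Print FilterLogInPlace("C:\x\170418.log", "; 0.0kt;"). Because the original is deleted only at the very end, an error while reading or writing leaves the source log untouched. Testing on a copy of a log first is still advisable.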
