python - How do I check the next letter in Python after detecting the current letter?
Problem description
May 1 00:00:00 date=2018-04-30 time=23:59:59 dev=A devid=1234 msg="test 1"
May 1 00:00:00 date=2018-04-31 time=00:00:01 dev=A devid=1234 msg="test 2"
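Above is a sample of a log file. I am trying to convert it to CSV by checking each line letter by letter for "=" and saving what follows it as a column value in the row.
I managed to capture the columnValue when the value after "=" is not a quoted string. Below is part of the code that extracts the values. In part of the line, the "=" is followed by a string that contains spaces. The space ends the extraction early, and the parser starts looking for a new column. Is it possible to check whether the next letter is "\"", then keep saving letter by letter until the next "\"", so that I can save the column value as a string?
I am using Python 2.7.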
import csv

def outputCSV(log_file_path, outputCSVName, colValueSet):
    data = []
    f = open(log_file_path, "r")
    values = set()  # create empty set for all column values
    content = f.readlines()
    content = [x.strip() for x in content]  # List of lines to iterate through
    colValueSet.add("postingDate")
    for line in content:
        new_dict = dict.fromkeys(colValueSet, "")
        new_dict["postingDate"] = line[0:16]
        findingColHeader = True   # we have to find the columns first
        findingColValue = False   # After column found, starting finding values
        col_value = ""  # Empty at first
        value = ""      # Empty value at first
        start = False
        for letter in line:
            if findingColHeader:
                if letter == " ":
                    # space means start taking in new value
                    # data is in this structure with space prior to column names -> " column=value"
                    start = True
                    col_value = ""
                elif letter == "=":
                    findingColValue = True
                    start = False
                    findingColHeader = False
                elif start:
                    col_value += letter
            elif findingColValue:
                if letter == " ":
                    # a space ends the value, which is what breaks quoted values such as "test 1"
                    new_dict[col_value] = value
                    value = ""
                    col_value = ""
                    findingColHeader = True
                    start = True
                    findingColValue = False
                else:
                    value += letter
        data += [new_dict]
    with open(outputCSVName, 'wb') as csvfile:
        fieldnames = list(colValueSet)
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        for row in data:
            writer.writerow(row)
    print("Writing Complete")

# findColumnValues(a) would calculate all column value from the file path
outputCSV("ttest.log", "MyProcessedLog.csv", findColumnValues("test.log"))
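The quote-aware scan asked about above could look roughly like the sketch below. It is only an illustration, not part of the original code: the function name parse_pairs and its in_quotes flag are made up here, and it assumes values never contain escaped quotes. Once an opening quote is seen right after "=", spaces are kept until the closing quote.

```python
def parse_pairs(line):
    """Scan a line letter by letter and return a list of (key, value) pairs.

    Hypothetical helper: a value that starts with a double quote runs to the
    closing quote, so spaces inside it are kept. Escaped quotes are not handled.
    """
    pairs = []
    token = ""         # text collected since the last separator
    key = None         # column name seen before '=', or None while reading a name
    in_quotes = False  # True while inside a quoted value

    for letter in line:
        if in_quotes:
            if letter == '"':
                in_quotes = False      # closing quote: leave quoted mode
            else:
                token += letter        # spaces are kept inside quotes
        elif letter == '"' and key is not None and token == "":
            in_quotes = True           # opening quote directly after '='
        elif letter == "=" and key is None:
            key = token                # the word before '=' is the column name
            token = ""
        elif letter == " ":
            if key is not None:
                pairs.append((key, token))
            key = None                 # words without '=' (the timestamp) are dropped
            token = ""
        else:
            token += letter
    if key is not None:                # flush the last pair when there is no trailing space
        pairs.append((key, token))
    return pairs


line = 'May 1 00:00:00 date=2018-04-30 time=23:59:59 dev=A devid=1234 msg="test 1"'
print(parse_pairs(line))
```

For the sample line this yields the pairs date, time, dev, devid and msg, with msg holding test 1 without the surrounding quotes.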
Solution
You could try something like this:
>>> a = 'May 1 00:00:00 date=2018-04-30 time=23:59:59 dev=A devid=1234 msg="test 1" '
>>> a.split('=')
['May 1 00:00:00 date', '2018-04-30 time', '23:59:59 dev', 'A devid', '1234 msg', '"test 1" ']
>>> parts = a.split('=')
>>> b = []
>>> for i,j in zip(parts, parts[1:]) :
... b.append( (i[i.rfind(' ')+1:], j[:j.rfind(' ')]) )
...
>>> b
[('date', '2018-04-30'), ('time', '23:59:59'), ('dev', 'A'), ('devid', '1234'), ('msg', '"test 1"')]
>>>
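I could write a cute one-liner, but I think this version is easier for you to follow, because you can see all the intermediate results and grasp the main idea: split the line at the "=" signs, use the last word of each piece as the key, and take the rest of the next piece as the value.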
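The session above relies on the line ending with a space, otherwise the last value is cut at the space inside the quotes. A minimal sketch of the same split-on-"=" idea as a function, treating the last piece separately, might look like this. The name split_pairs is made up for illustration, and the sketch assumes that values never contain "=" themselves.

```python
def split_pairs(line):
    """Split a log line on '=' and pair each key with the text that follows it.

    Hypothetical helper; assumes no value contains '=' itself.
    """
    parts = line.strip().split("=")
    pairs = []
    for i, (left, right) in enumerate(zip(parts, parts[1:])):
        key = left[left.rfind(" ") + 1:]        # last word before '='
        if i == len(parts) - 2:
            value = right                        # last piece: no key follows it
        else:
            value = right[:right.rfind(" ")]     # drop the next key at the end
        pairs.append((key, value.strip('"')))    # remove surrounding quotes
    return pairs


line = 'May 1 00:00:00 date=2018-04-30 time=23:59:59 dev=A devid=1234 msg="test 1"'
print(split_pairs(line))
```

A dict built from these pairs, dict(split_pairs(line)), can then be passed to csv.DictWriter.writerow, provided every key is listed in fieldnames.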