Merging the contents of two text files into one, using a separator to tell which file each part came from

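Problem description

This code reads content from files and then builds a dataframe. I need to create two dataframes: one holding the content of File1.txt and the other holding the content of File2.txt.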

`Content of File1.txt`
Str1:123
   Str2:456

`Content of File2.txt`
Str1:789
   Str2:1011

Expected Dataframe:
  123 456 789 1011



import pandas as pd

file_name = ['File1.txt', 'File2.txt']
string_to_search = ['Str1']
string_next = ['Str2']
list_of_results = []
list_of_results_1 = []

for item in file_name:
    with open(item, 'r') as read_obj:
        for line in read_obj:
            # Collect the [key, value] pair from lines containing a Str1 marker
            for marker in string_to_search:
                if marker in line:
                    list_of_results.append(line.replace(" ", "").strip('\n').split(':'))
            # Collect the [key, value] pair from lines containing a Str2 marker
            for marker in string_next:
                if marker in line:
                    list_of_results_1.append(line.replace(" ", "").strip('\n').split(':'))

# Append each Str2 pair to the matching Str1 pair, giving one row per pair
for i in range(len(list_of_results)):
    list_of_results[i].extend(list_of_results_1[i])
print(list_of_results)
df = pd.DataFrame(list_of_results)
print(df)

Tags: python-3.x

Solution


File 1:

Str1:123
Str2:456
Str1:1212
Str2:9999

File 2:

Str1:789
Str2:1011
Str1:0000
Str2:1000

Code:

import pandas as pd

file_name = ['File1.txt', 'File2.txt']

list_df = []

for item in file_name:
    # Start fresh lists for every file so the values do not mix
    list_of_results = []
    list_of_results_1 = []
    with open(item, 'r') as read_obj:
        for line in read_obj:
            # Drop spaces and the newline, then keep the part after the colon
            if "Str1" in line:
                list_of_results.append(line.replace(" ", "").strip('\n').split(':')[1])
            elif "Str2" in line:
                # elif (not else) so blank lines are skipped instead of raising IndexError
                list_of_results_1.append(line.replace(" ", "").strip('\n').split(':')[1])
    # Suffix the column names with the file name (without extension) to mark the source
    stem = item.split(".")[0]
    list_df.append(pd.DataFrame({"Str1_" + stem: list_of_results, "Str2_" + stem: list_of_results_1}))

print(list_df[0])
print(list_df[1])
# Place the per-file dataframes side by side
final_df = pd.concat(list_df, axis=1)
print(final_df)

Output (final_df, the two per-file dataframes combined side by side):

  Str1_File1 Str2_File1 Str1_File2 Str2_File2
0        123        456        789       1011
1       1212       9999       0000       1000

Individual dataframes (one per file)

File 1:

  Str1_File1 Str2_File1
0        123        456
1       1212       9999

File 2:

  Str1_File2 Str2_File2
0        789       1011
1       0000       1000
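
As a variation on the parsing step, here is a minimal sketch that does not hard-code the Str1 and Str2 keys. It assumes every meaningful line has the form Key:Value and that blank lines or lines without a colon can be ignored. The helper name `parse_pairs` and the in-memory sample lists are hypothetical, introduced only for illustration; in practice the lines would come from the opened files.

```python
from collections import defaultdict


def parse_pairs(lines, prefix):
    """Group the values of 'Key:Value' lines into columns named Key_prefix."""
    columns = defaultdict(list)
    for line in lines:
        # partition splits on the first colon only and never raises
        key, sep, value = line.strip().partition(":")
        if not sep:
            # Blank line or no colon: nothing to record
            continue
        columns[f"{key.strip()}_{prefix}"].append(value.strip())
    return dict(columns)


# Sample lines mirroring the two files shown above (hypothetical in-memory data)
file1 = ["Str1:123\n", "   Str2:456\n", "\n", "Str1:1212\n", "Str2:9999\n"]
file2 = ["Str1:789\n", "Str2:1011\n", "Str1:0000\n", "Str2:1000\n"]

# Merge the per-file columns; the prefix keeps the two sources apart
combined = {**parse_pairs(file1, "File1"), **parse_pairs(file2, "File2")}
print(combined)
```

The resulting dictionary maps column names to lists of strings, so it can be passed to `pd.DataFrame(combined)` as long as all lists have the same length, which holds for the sample data here.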
