Create a dataframe from a list with multiple columns

Problem description

I want to create a dataframe from a list, but the column names are also contained in the list.

List:

['Input_file_column_name,Is_key,Config_file_column_name,Value\nEmployee ID,Y,identifierValue,identityTypeCode:001\nCumb ID,N,identifierValue,identityTypeCode:002\nFirst Name,N,first_Name \nLast Name,N,last_Name   \nEmail,N,email_Address   \nEntityID,N,entity_Id,entity_Id:01\nSourceCode,N,sourceCode,sourceCode:AHRWB\n']

Resulting dataframe:

Input_file_column_name Is_key Config_file_column_name                 Value
0            Employee ID      Y         identifierValue  identityTypeCode:001
1                Cumb ID      N         identifierValue  identityTypeCode:002
5               EntityID      N               entity_Id          entity_Id:01
6             SourceCode      N              sourceCode      sourceCode:AHRWB

How do I convert it? Do I need to convert the list to a dictionary first, or can it be done directly?

Code:

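import pandas as pd

with open('onboard_config.txt') as myFile:
    text = myFile.read()

# Note: this splits on the literal string "regex", which does not occur in
# the file, so result is a one-element list holding the whole text.
result = text.split("regex")
print(result)

# DataFrame must be called with parentheses, not indexed with brackets.
df = pd.DataFrame([sub.split(",") for sub in result])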

Tags: python, pandas, python-2.7

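Solution

It looks like you need splitlines, followed by Series.str.split. The list (called l here) holds a single string, so split it into lines, split each line on commas with expand=True, promote the first row to the header, and drop the rows that have no Value: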

df=pd.Series(l[0].splitlines()).str.split(',',expand=True).T.set_index(0).T.dropna()
df
Out[1183]: 
0 Input_file_column_name          ...                          Value
1            Employee ID          ...           identityTypeCode:001
2                Cumb ID          ...           identityTypeCode:002
6               EntityID          ...                   entity_Id:01
7             SourceCode          ...               sourceCode:AHRWB
[4 rows x 4 columns]
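
The one-liner above can be unpacked into separate steps. The following is a minimal sketch, assuming the list from the question is bound to l as in the answer; the intermediate names lines and table are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample input copied from the question: a one-element list holding the whole file.
l = [
    'Input_file_column_name,Is_key,Config_file_column_name,Value\n'
    'Employee ID,Y,identifierValue,identityTypeCode:001\n'
    'Cumb ID,N,identifierValue,identityTypeCode:002\n'
    'First Name,N,first_Name \n'
    'Last Name,N,last_Name   \n'
    'Email,N,email_Address   \n'
    'EntityID,N,entity_Id,entity_Id:01\n'
    'SourceCode,N,sourceCode,sourceCode:AHRWB\n'
]

# 1. Split the single string into one string per line.
lines = l[0].splitlines()

# 2. Split each line on commas; expand=True spreads the fields over columns
#    and pads rows that have fewer fields with missing values.
table = pd.Series(lines).str.split(',', expand=True)

# 3. Use the first row as the header and keep the remaining rows as data.
df = table.iloc[1:].copy()
df.columns = table.iloc[0].tolist()

# 4. Drop the rows that have no Value field (First Name, Last Name, Email).
df = df.dropna()

print(df)
```

This should keep the same four rows as the one-liner. The only intended difference is that the header is assigned from a plain list, so the column axis does not carry the name 0 seen in the output above.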
