pandas - How to extract words from rows in a Pandas DataFrame
Question
If I have a column named Category whose rows contain values like Plane Travel|Train Travel|Bus Travel, how can I extract Plane Travel in a pandas DataFrame?
Answer
You need to use the .str accessor and then .split() your strings; you can then put the result into separate columns.
Let's generate the proper DataFrame:
import pandas as pd

df = pd.DataFrame({"Category": ["Plane France", "Train Russia", "Spacecraft Moon"],
                   "other_variable": [1, 2, 3]})
print(df)
          Category  other_variable
0     Plane France               1
1     Train Russia               2
2  Spacecraft Moon               3
You can now access the strings through the .str accessor (take a look at the pandas documentation) and split them.
df["category_list"] = df.Category.str.split(" ") # you can replace " " with any
# other word delimiter
You then have to assign each element of the list to a new column:
df[["transportation", "destination"]] = pd.DataFrame(df.category_list.values.tolist(),
index = df.index)
This gives:
          Category  other_variable       category_list transportation  \
0     Plane France               1     [Plane, France]          Plane
1     Train Russia               2     [Train, Russia]          Train
2  Spacecraft Moon               3  [Spacecraft, Moon]     Spacecraft

  destination
0      France
1      Russia
2        Moon
You now have your transportation and destination columns.
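The same approach can be applied to the pipe-delimited values from the question. The following is only a minimal sketch: the sample rows and the names first_category and parts are made up for illustration and are not part of the original answer. It splits on "|" and keeps the first piece, and it also shows the expand=True option of .str.split(), which may save the intermediate list column.

```python
import pandas as pd

# Hypothetical sample data modeled on the question's pipe-delimited column
df = pd.DataFrame({"Category": ["Plane Travel|Train Travel|Bus Travel",
                                "Car Travel|Boat Travel"]})

# A single-character delimiter such as "|" is treated as a literal string,
# so no regex escaping is needed; .str[0] picks the first piece of each list
df["first_category"] = df["Category"].str.split("|").str[0]

# expand=True returns a DataFrame with one column per piece;
# rows with fewer pieces are padded with missing values
parts = df["Category"].str.split("|", expand=True)

print(df["first_category"].tolist())
# ['Plane Travel', 'Car Travel']
```

If only the first entry is needed, .str[0] is probably enough; expand=True is more useful when every piece should end up in its own column.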