r - How to split strings for a tibble
Problem description
I am working on a data project with sports teams. Is there a way to take plain text and turn it into a tibble, with each team name split into a city column and a mascot column?
tibble("City, Mascot,
Arizona Diamondbacks
Atlanta Braves
Baltimore Orioles
Boston Red Sox
Chicago White Sox
Chicago Cubs
Cincinnati Reds
Cleveland Indians
Colorado Rockies
Detroit Tigers
Houston Astros
Kansas City Royals
Los Angeles Angels
Los Angeles Dodgers
Miami Marlins
Milwaukee Brewers
Minnesota Twins
New York Yankees
New York Mets
Oakland Athletics
Philadelphia Phillies
Pittsburgh Pirates
San Diego Padres
San Francisco Giants
Seattle Mariners
St. Louis Cardinals
Tampa Bay Rays
Texas Rangers
Toronto Blue Jays
Washington Nationals
"
)
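Ideally I could edit the code rather than change each entry by hand, while still being able to make small adjustments where necessary. I want to do this so that I can join the result with other data by city.
Solution
One option is a regular expression anchored at the end of the string: treat the last word as the mascot and everything before it as the city. Note that two-word mascots (Red Sox, White Sox, Blue Jays) are split in the wrong place by this rule and need a small manual adjustment.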
library(tidyverse)
example_data <- tibble::tribble(
~data,
"Arizona Diamondbacks",
"Atlanta Braves",
"Baltimore Orioles",
"Boston Red Sox",
"Chicago White Sox",
"Chicago Cubs",
"Cincinnati Reds",
"Cleveland Indians",
"Colorado Rockies",
"Detroit Tigers",
"Houston Astros",
"Kansas City Royals",
"Los Angeles Angels",
"Los Angeles Dodgers",
"Miami Marlins",
"Milwaukee Brewers",
"Minnesota Twins",
"New York Yankees",
"New York Mets",
"Oakland Athletics",
"Philadelphia Phillies",
"Pittsburgh Pirates",
"San Diego Padres",
"San Francisco Giants",
"Seattle Mariners",
"St. Louis Cardinals",
"Tampa Bay Rays",
"Texas Rangers",
"Toronto Blue Jays",
"Washington Nationals"
)
example_data |>
  # city: drop the last word, then trim the trailing space
  mutate(city = str_remove(data, '[[:alpha:]]+$') |> str_trim(),
         # mascot: keep only the last word
         mascot = str_extract(data, '[[:alpha:]]+$'))
#> # A tibble: 30 x 3
#>    data                 city          mascot
#>    <chr>                <chr>         <chr>
#>  1 Arizona Diamondbacks Arizona       Diamondbacks
#>  2 Atlanta Braves       Atlanta       Braves
#>  3 Baltimore Orioles    Baltimore     Orioles
#>  4 Boston Red Sox       Boston Red    Sox
#>  5 Chicago White Sox    Chicago White Sox
#>  6 Chicago Cubs         Chicago       Cubs
#>  7 Cincinnati Reds      Cincinnati    Reds
#>  8 Cleveland Indians    Cleveland     Indians
#>  9 Colorado Rockies     Colorado      Rockies
#> 10 Detroit Tigers       Detroit       Tigers
#> # ... with 20 more rows
Created on 2021-10-18 by the reprex package (v2.0.1)
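The last-word pattern above puts only the final word in the mascot column, so a name like Boston Red Sox comes out as city "Boston Red" and mascot "Sox". The following is a minimal sketch of one way around that, not part of the original answer: it starts from pasted plain text, as the question asks, and checks a small lookup of two-word mascots before falling back to the last word. The names `raw_text`, `multi_word_mascots`, and `mascot_pattern` are hypothetical, and the lookup is an assumption you would maintain by hand for your own list.

```r
library(dplyr)
library(stringr)
library(tibble)

# Plain text as it might be pasted in, one team per line (shortened here).
raw_text <- "
Boston Red Sox
Chicago Cubs
Toronto Blue Jays
St. Louis Cardinals
"

# Split on newlines, trim whitespace, and drop the empty first/last entries.
team_names <- str_split(raw_text, "\n")[[1]] |> str_trim()
team_names <- team_names[team_names != ""]

# Assumed lookup of mascots that are longer than one word; edit as needed.
multi_word_mascots <- c("Red Sox", "White Sox", "Blue Jays")

# Match a known multi-word mascot at the end of the string,
# otherwise fall back to the last word.
mascot_pattern <- paste0(
  "(", paste(multi_word_mascots, collapse = "|"), "|[[:alpha:]]+)$"
)

result <- tibble(data = team_names) |>
  mutate(
    mascot = str_extract(data, mascot_pattern),
    city   = str_remove(data, mascot_pattern) |> str_trim()
  )

print(result)
```

Because the regex engine returns the leftmost match that reaches the end of the string, "Red Sox" is picked up before the shorter "Sox". Adding another exception is a one-word change to `multi_word_mascots`, which keeps the small adjustments in one place. The resulting `city` column can then be used as the key in a join with other data.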