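r - Can anyone help me process data with R? [image updated]
Problem description
Good morning, everyone! I am new to R programming and have recently run into a problem: I want to process data like that in the attached picture. One column holds two kinds of data. The Excel sheet has 9000+ rows and several columns, and the column with two kinds of data is only one of them.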
ID Fruit/FruitJuice
<dbl> <chr>
1 1 NA/applejuice:15(ml)
2 2 banana:10(kg)/orangejuice:20(ml);tomatojuice:25(ml)
3 3 watermelon:10(kg)/NA
4 4 banana:5(kg);grape:6(kg)/orangejuice:30(ml);applejuice:50(ml);mangojuice:25(ml)
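To describe the data briefly: the column with two kinds of data holds fruit and fruit juice, separated by "/". Each item carries its unit in parentheses, and multiple items within the fruit part or the juice part are separated by ";".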
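For anyone who wants to reproduce the problem, the four sample rows can be rebuilt roughly as follows. This is only a sketch: the object name `data` is an assumption, and the real input is an Excel sheet with more rows and columns.

```r
library(tibble)

# Sample rows copied from the question. "NA" here is literal text inside the
# cell, not a missing value. The name `data` is an assumption.
data <- tibble(
  ID = c(1, 2, 3, 4),
  `Fruit/FruitJuice` = c(
    "NA/applejuice:15(ml)",
    "banana:10(kg)/orangejuice:20(ml);tomatojuice:25(ml)",
    "watermelon:10(kg)/NA",
    "banana:5(kg);grape:6(kg)/orangejuice:30(ml);applejuice:50(ml);mangojuice:25(ml)"
  )
)

print(data)
```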
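The actual data frame contains more columns than the picture shows. I have searched online but still do not know how to solve this. I would like to end up with the final table below. Could anyone lend me a hand? Thanks.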
ID `Fruit/FruitJuice`
1 1 Fruit_NA
2 1 Fruitjuice_applejuice:15(ml)
3 2 Fruit_banana:10(kg)
4 2 Fruitjuice_orangejuice:20(ml)
5 2 Fruitjuice_tomatojuice:25(ml)
6 3 Fruit_watermelon:10(kg)
7 3 Fruitjuice_NA
8 4 Fruit_banana:5(kg)
9 4 Fruit_grape:6(kg)
10 4 Fruitjuice_orangejuice:30(ml)
11 4 Fruitjuice_applejuice:50(ml)
12 4 Fruitjuice_mangojuice:25(ml)
Solution
Here is one approach using tidyr:
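library(tidyverse)
data %>%
  # split column 2 at "/" into a fruit part and a juice part
  separate(2, into = c("Fruit", "FruitJuice"), sep = "/") %>%
  # stack both parts as name/value pairs, then one row per ";"-separated item
  pivot_longer(-ID) %>%
  separate_rows(value, sep = ";") %>%
  # prefix each item with its source column; transmute() keeps only these two
  # columns (multi-row summarise() is deprecated since dplyr 1.1.0)
  transmute(ID, `Fruit/FruitJuice` = str_c(name, "_", value))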
# A tibble: 12 x 2
ID `Fruit/FruitJuice`
<int> <chr>
1 1 Fruit_NA
2 1 FruitJuice_applejuice:15(ml)
3 2 Fruit_banana:10(kg)
4 2 FruitJuice_orangejuice:20(ml)
5 2 FruitJuice_tomatojuice:25(ml)
6 3 Fruit_watermelon:10(kg)
7 3 FruitJuice_NA
8 4 Fruit_banana:5(kg)
9 4 Fruit_grape:6(kg)
10 4 FruitJuice_orangejuice:30(ml)
11 4 FruitJuice_applejuice:50(ml)
12 4 FruitJuice_mangojuice:25(ml)
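Since the question mentions that the real data frame has more columns, `pivot_longer(-ID)` would stack those columns as well. A possible variant, sketched below, pivots only the two split columns so that everything else is carried along. The extra column `Store` and its values are invented purely for illustration.

```r
library(tidyverse)

# The question's four rows plus a hypothetical extra column `Store`
data <- tibble(
  ID = 1:4,
  `Fruit/FruitJuice` = c(
    "NA/applejuice:15(ml)",
    "banana:10(kg)/orangejuice:20(ml);tomatojuice:25(ml)",
    "watermelon:10(kg)/NA",
    "banana:5(kg);grape:6(kg)/orangejuice:30(ml);applejuice:50(ml);mangojuice:25(ml)"
  ),
  Store = c("a", "b", "c", "d")
)

result <- data %>%
  # refer to the column by name instead of by position
  separate(`Fruit/FruitJuice`, into = c("Fruit", "FruitJuice"), sep = "/") %>%
  # pivot only the two new columns; ID and Store stay attached to each row
  pivot_longer(c(Fruit, FruitJuice)) %>%
  separate_rows(value, sep = ";") %>%
  mutate(`Fruit/FruitJuice` = str_c(name, "_", value)) %>%
  select(-name, -value)

print(result)
```

The result should have the same 12 rows as above, with `Store` repeated for every item that came from the same original row.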