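r - Reshaping data into long format in R
Problem description
I have a source dataset from a Nature article. I would like to know how to extract the values in rows 4 and 12 into long format, together with their associated group assignment (i.e. Inefficient/Efficient).
This is the code I used to import the data into R.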
# load the required libraries
library(ggsignif)
library(readxl)
library(svglite)
library(tidyverse)
library(tidyr)
library(dplyr)
# The paper from which the figure is taken is Tasdogan et al. (2020)
# Metabolic heterogeneity confers differences in melanoma metastatic potential
# The figure is 2b and can be accessed at
# https://www.nature.com/articles/s41586-019-1847-2#MOESM3
# The link to the raw data used in the article is given below and directly imported for plotting
url <- 'https://static-content.springer.com/esm/art%3A10.1038%2Fs41586-019-1847-2/MediaObjects/41586_2019_1847_MOESM3_ESM.xlsx'
# Create a data frame from the Excel data
temp <- tempfile()
download.file(url, temp, mode='wb')
myData <- read_excel(path = temp)
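I don't know how to insert an image of the dataset, but the preceding code should load it. I need columns 2-31 for the Efficient group and columns 2-37 for the Inefficient group.
I hope this is enough information for people to understand what I mean.
Solution
It may not be pretty, but I believe this solves your problem using only the readxl and tidyverse packages: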
# Select the first set of rows (group labels in row 2, values in row 4)
set1 <-
  myData %>%
  filter(row_number() %in% c(2, 4))
# Select the second set of rows (group labels in row 10, values in row 12)
set2 <-
  myData %>%
  filter(row_number() %in% c(10, 12))
# Join both sets on the first column, so that all group labels end up in one row and all values in another
left_join(set1, set2, by = "Fractional enrichment of glucose m+6 in primary subcutaneous tumors after [U-13C]glucose infusion") %>%
  # Pivot to long format: one row per (row label, original column) pair
  pivot_longer(cols = !`Fractional enrichment of glucose m+6 in primary subcutaneous tumors after [U-13C]glucose infusion`) %>%
  # Pivot wider so that the group labels and the values each get their own column
  pivot_wider(names_from = `Fractional enrichment of glucose m+6 in primary subcutaneous tumors after [U-13C]glucose infusion`, values_from = value) %>%
  # Remove the old column names/numbers
  select(-name)
# A tibble: 72 x 2
Group `Glucose m+6`
<chr> <chr>
1 Inefficient 0.48499999999999999
2 Inefficient 0.47399999999999998
3 Inefficient 0.48799999999999999
4 Inefficient 0.45600000000000002
5 Inefficient 0.53100000000000003
6 Inefficient 0.318
7 Inefficient 0.26600000000000001
8 Inefficient 0.30399999999999999
9 Inefficient 0.309
10 Inefficient 0.33
# ... with 62 more rows
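Note that the printed tibble stores `Glucose m+6` as `<chr>`, which is awkward for plotting or statistics. The following is a minimal sketch of the same pivot_longer/pivot_wider idea followed by a numeric conversion. It runs on a small made-up tibble rather than the real spreadsheet: the object `toy`, its columns `label` and `s1` to `s4`, and the values in it are assumptions for illustration only. It also assumes that empty cells (which may appear when the two groups have different numbers of samples) arrive as NA and should be dropped.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the two joined rows of the spreadsheet:
# the first column holds the row labels, every other column is one sample.
# Everything is character, as it is after read_excel() on a mixed sheet.
toy <- tibble(
  label = c("Group", "Glucose m+6"),
  s1 = c("Inefficient", "0.485"),
  s2 = c("Inefficient", "0.474"),
  s3 = c("Efficient", "0.318"),
  s4 = c("Efficient", NA)   # an empty cell in the shorter group
)

long <- toy %>%
  # One row per (label, sample) pair
  pivot_longer(cols = !label) %>%
  # Spread the labels back out into two columns: Group and Glucose m+6
  pivot_wider(names_from = label, values_from = value) %>%
  # The sample identifier is no longer needed
  select(-name) %>%
  # Drop empty cells, then store the measurements as numbers
  filter(!is.na(`Glucose m+6`)) %>%
  mutate(`Glucose m+6` = as.numeric(`Glucose m+6`))

print(long)
```

Applied to the real data, the last two steps (`filter()` and `mutate()`) could be appended after `select(-name)` in the pipeline above, assuming the resulting columns are named `Group` and `Glucose m+6` as in the printed output.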