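r - read_tsv in readr does not parse the table correctly
Question
I'm trying to read a tab-separated table, and it keeps producing parsing failures. I think the cause is double quotes in the text that are not backslash-escaped. See the example below: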
concept_id concept_name domain_id vocabulary_id concept_class_id standard_concept concept_code valid_start_date valid_end_date invalid_reason
2618087 Services delivered under an outpatient speech language pathology plan of care Observation HCPCS HCPCS Modifier S GN 19990101 20991231
2618083 "opt out" physician or practitioner emergency or urgent service Observation HCPCS HCPCS Modifier S GJ 19981001 20991231
2618082 Diagnostic mammogram converted from screening mammogram on same day Observation HCPCS HCPCS Modifier S GH 19981001 20991231
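Note the "opt out" in the second column, which seems to be where the problem comes from. The following code produces parsing failures: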
df <- read_delim(
  file = "~/_data/test.csv",
  col_types = cols(
    col_integer(), col_character(), col_character(),
    col_character(), col_character(), col_character(),
    col_character(), col_date(format = "%Y%m%d"), col_date(format = "%Y%m%d"),
    col_character()),
  delim = "\t"
)
Warning: 4 parsing failures.
row col expected actual file
1 NA 10 columns 9 columns '~/_data/test.csv'
2 concept_name delimiter or quote '~/_data/test.csv'
2 concept_name closing quote at end of file '~/_data/test.csv'
2 NA 10 columns 2 columns '~/_data/test.csv'
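I can't seem to pin down a fix.
Solution
This solved the problem. By default, read_delim uses quote = "\"", so the double quote that opens the "opt out" field is treated as the start of a quoted field and the parser loses track of the delimiters. I needed to change the quote argument to quote = "" so that quote characters are read as ordinary text: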
df <- read_delim(
  file = "~/_data/test.csv",
  col_types = cols(
    col_integer(), col_character(), col_character(),
    col_character(), col_character(), col_character(),
    col_character(), col_date(format = "%Y%m%d"), col_date(format = "%Y%m%d"),
    col_character()),
  quote = "",
  delim = "\t"
)
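To check the fix without the original file, here is a minimal self-contained sketch. It assumes a shortened, hypothetical four-column version of the table (the real file has ten columns), writes it to a temporary file, and reads it back with quote = "". The column subset and the temporary path are illustrative only.

```r
library(readr)

# Hypothetical stand-in for test.csv: tab-separated, with an unescaped
# pair of double quotes at the start of the second field in the second data row.
lines <- c(
  "concept_id\tconcept_name\tconcept_code\tvalid_start_date",
  "2618087\tServices delivered under an outpatient speech language pathology plan of care\tGN\t19990101",
  "2618083\t\"opt out\" physician or practitioner emergency or urgent service\tGJ\t19981001",
  "2618082\tDiagnostic mammogram converted from screening mammogram on same day\tGH\t19981001"
)
path <- tempfile(fileext = ".tsv")
writeLines(lines, path)

# quote = "" turns off quote handling, so the double quotes are kept
# as literal characters inside concept_name.
df <- read_delim(
  file = path,
  delim = "\t",
  quote = "",
  col_types = cols(
    concept_id = col_integer(),
    concept_name = col_character(),
    concept_code = col_character(),
    valid_start_date = col_date(format = "%Y%m%d")
  )
)

# Inspect any remaining parsing problems and the previously failing field.
print(problems(df))
print(df$concept_name[2])
```

Since the title mentions read_tsv: it accepts the same quote argument, so read_tsv(path, quote = "", col_types = ...) should behave the same way. The trade-off is that fields which really are wrapped in quotes will keep their quote characters after this change.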