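r - What does a-hat mean when importing a CSV into R (and how do I get rid of it)?
Problem description
I'm importing a CSV into R, and there are a-hats (an a with a circumflex / up caret) all over the place that are not in the original data.
Does anyone know what they are and how to get rid of them?
Here is the output of dput(head(df)), as suggested by @foc: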
structure(list(V1 = c("", "Race3 and Hispanic Origin", "Whiteâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦",
" White, not Hispanicâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦",
"Blackâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦",
"Asianâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦"
), V2 = c("", "", "245,985", "195,221", "41,962", "18,879"),
V3 = c("", "", "27,113", "17,263", "9,234", "1,908"), V4 = c("",
"", "547", "493", "388", "175"), V5 = c("", "", "11.0", "8.8",
"22.0", "10.1"), V6 = c("", "", "0.2", "0.3", "0.9", "0.9"
), V7 = c("", "", "247,272", "195,256", "42,474", "19,475"
), V8 = c("", "", "26,436", "16,993", "8,993", "1,953"),
V9 = c("", "", "714", "571", "373", "190"), V10 = c("", "",
"10.7", "8.7", "21.2", "10.0"), V11 = c("", "", "0.3", "0.3",
"0.9", "1.0"), V12 = c("", "", "-677", "-270", "-241", "45"
), V13 = c("", "", "*-0.3", "-0.1", "-0.8", "-0.1")), row.names = c(NA,
6L), class = "data.frame")
Solution
Not sure if this is what you want:
Sample data:
df <- structure(list(V1 = c("", "Race3 and Hispanic Origin", "Whiteâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦",
" White, not Hispanicâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦",
"Blackâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦",
"Asianâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦"
), V2 = c("", "", "245,985", "195,221", "41,962", "18,879"),
V3 = c("", "", "27,113", "17,263", "9,234", "1,908"), V4 = c("",
"", "547", "493", "388", "175"), V5 = c("", "", "11.0", "8.8",
"22.0", "10.1"), V6 = c("", "", "0.2", "0.3", "0.9", "0.9"
), V7 = c("", "", "247,272", "195,256", "42,474", "19,475"
), V8 = c("", "", "26,436", "16,993", "8,993", "1,953"),
V9 = c("", "", "714", "571", "373", "190"), V10 = c("", "",
"10.7", "8.7", "21.2", "10.0"), V11 = c("", "", "0.3", "0.3",
"0.9", "1.0"), V12 = c("", "", "-677", "-270", "-241", "45"
), V13 = c("", "", "*-0.3", "-0.1", "-0.8", "-0.1")), row.names = c(NA,
6L), class = "data.frame")
Remove the characters:
df[] <- lapply(df, gsub, pattern = 'â€¦', replacement = '')
Result:
df
                         V1      V2     V3  V4   V5  V6      V7     V8  V9  V10 V11  V12   V13
1
2 Race3 and Hispanic Origin
3                     White 245,985 27,113 547 11.0 0.2 247,272 26,436 714 10.7 0.3 -677 *-0.3
4       White, not Hispanic 195,221 17,263 493  8.8 0.3 195,256 16,993 571  8.7 0.3 -270  -0.1
5                     Black  41,962  9,234 388 22.0 0.9  42,474  8,993 373 21.2 0.9 -241  -0.8
6                     Asian  18,879  1,908 175 10.1 0.9  19,475  1,953 190 10.0 1.0   45  -0.1
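As for what the characters are: the sequence in the dput output (a-hat, \200, broken bar) matches the bytes E2 80 A6, which is the UTF-8 encoding of a single horizontal ellipsis "…" (U+2026). That suggests, though I can't confirm it without the file, that the CSV is UTF-8 and was read as a single-byte encoding, so each leader dot turned into three stray characters. Below is a minimal sketch of that idea. The variable names are made up for illustration, and the one-cell example is rebuilt by hand rather than taken from the real file.

```r
# Bytes E2 80 A6 are the UTF-8 encoding of U+2026 (horizontal ellipsis).
ellipsis_bytes <- as.raw(c(0xE2, 0x80, 0xA6))

# Rebuild one cell by hand: the text "White" followed by three ellipses.
# (Assumption: this mirrors how the cell's bytes look after the import.)
cell <- rawToChar(c(charToRaw("White"), rep(ellipsis_bytes, 3)))

# Treated as Latin-1, every byte counts as its own character:
# 5 letters + 3 * 3 stray characters.
Encoding(cell) <- "latin1"
n_as_latin1 <- nchar(cell)

# Declared as UTF-8, each byte triple collapses into one ellipsis:
# 5 letters + 3 ellipses.
Encoding(cell) <- "UTF-8"
n_as_utf8 <- nchar(cell)

# Strip the leader dots via the Unicode escape instead of pasted mojibake.
clean <- gsub("\u2026", "", cell, fixed = TRUE)

# At import time, declaring the encoding may avoid the problem altogether.
# "table.csv" is a placeholder path, not the asker's real file:
# df <- read.csv("table.csv", header = FALSE, fileEncoding = "UTF-8")
```

If the guess about the encoding is right, re-reading the file with fileEncoding = "UTF-8" (or encoding = "UTF-8") should give real ellipsis characters, which can then be removed with the "\u2026" pattern.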