Convert non-numeric rows and columns to zero

Problem description

I have this data from an R package (ISLR), where X is the data frame holding all the data:

library(ISLR)
data("Hitters")
X=Hitters
head(X)

Here is part of the data:

                 AtBat Hits HmRun Runs RBI Walks Years CAtBat CHits CHmRun CRuns CRBI CWalks League Division PutOuts Assists Errors Salary NewLeague
-Andy Allanson      293   66     1   30  29    14     1    293    66      1    30   29     14      A        E     446      33     20     NA         A
-Alan Ashby         315   81     7   24  38    39    14   3449   835     69   321  414    375      N        W     632      43     10  475.0         N
-Alvin Davis        479  130    18   66  72    76     3   1624   457     63   224  266    263      A        W     880      82     14  480.0         A
-Andre Dawson       496  141    20   65  78    37    11   5628  1575    225   828  838    354      N        E     200      11      3  500.0         N
-Andres Galarraga   321   87    10   39  42    30     2    396   101     12    48   46     33      N        E     805      40      4   91.5         N
-Alfredo Griffin    594  169     4   74  51    35    11   4408  1133     19   501  336    194      A        W     282     421     25  750.0         A

I want to convert all the rows and columns with non-numeric values to zero. Is there a simple way to do this? I found here an example of how to remove the rows for just one column, but for more columns I would have to do it manually for each one.

Is there a function in R that does this for all columns and rows?

Tags: r

Solution


To remove the non-numeric columns, perhaps something like this?

library(dplyr)  # provides %>% and select()
df <- X         # the Hitters data frame from the question
df %>%
    select(which(sapply(., is.numeric)))
#                  AtBat Hits HmRun Runs RBI Walks Years CAtBat CHits CHmRun
#-Andy Allanson      293   66     1   30  29    14     1    293    66      1
#-Alan Ashby         315   81     7   24  38    39    14   3449   835     69
#-Alvin Davis        479  130    18   66  72    76     3   1624   457     63
#-Andre Dawson       496  141    20   65  78    37    11   5628  1575    225
#-Andres Galarraga   321   87    10   39  42    30     2    396   101     12
#-Alfredo Griffin    594  169     4   74  51    35    11   4408  1133     19
#                  CRuns CRBI CWalks PutOuts Assists Errors Salary
#-Andy Allanson       30   29     14     446      33     20     NA
#-Alan Ashby         321  414    375     632      43     10  475.0
#-Alvin Davis        224  266    263     880      82     14  480.0
#-Andre Dawson       828  838    354     200      11      3  500.0
#-Andres Galarraga    48   46     33     805      40      4   91.5
#-Alfredo Griffin    501  336    194     282     421     25  750.0

Or:

df %>%
    select(-which(sapply(., function(x) is.character(x) | is.factor(x))))

Or, more tidily (thanks to @AntoniosK):

df %>% select_if(is.numeric)
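
For readers without dplyr, the same column filter can be sketched in base R. The small data frame below is a hypothetical stand-in for Hitters (its values are copied from the question, but it is not the real dataset), and `num_only` is an illustrative name. In dplyr 1.0.0 and later, `select_if()` is superseded and the same selection is usually written as `select(where(is.numeric))`.

```r
# Hypothetical three-row stand-in for the Hitters data
df <- data.frame(
  AtBat  = c(293, 315, 479),
  Salary = c(NA, 475, 480),
  League = factor(c("A", "N", "A")),
  row.names = c("-Andy Allanson", "-Alan Ashby", "-Alvin Davis")
)

# sapply() returns one TRUE/FALSE per column;
# indexing with that logical vector keeps only the numeric columns
is_num   <- sapply(df, is.numeric)
num_only <- df[is_num]

# dplyr >= 1.0.0 spelling of the same idea (not run here):
#   df %>% select(where(is.numeric))
names(num_only)
```

The factor column `League` is dropped, while `Salary` keeps its `NA`, since `is.numeric()` only looks at the column type and not at missing values.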

Update

To additionally replace the NAs with 0, you can do:

df %>% select_if(is.numeric) %>% replace(is.na(.), 0)
#                  AtBat Hits HmRun Runs RBI Walks Years CAtBat CHits CHmRun
#-Andy Allanson      293   66     1   30  29    14     1    293    66      1
#-Alan Ashby         315   81     7   24  38    39    14   3449   835     69
#-Alvin Davis        479  130    18   66  72    76     3   1624   457     63
#-Andre Dawson       496  141    20   65  78    37    11   5628  1575    225
#-Andres Galarraga   321   87    10   39  42    30     2    396   101     12
#-Alfredo Griffin    594  169     4   74  51    35    11   4408  1133     19
#                  CRuns CRBI CWalks PutOuts Assists Errors Salary
#-Andy Allanson       30   29     14     446      33     20    0.0
#-Alan Ashby         321  414    375     632      43     10  475.0
#-Alvin Davis        224  266    263     880      82     14  480.0
#-Andre Dawson       828  838    354     200      11      3  500.0
#-Andres Galarraga    48   46     33     805      40      4   91.5
#-Alfredo Griffin    501  336    194     282     421     25  750.0 
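
The question can also be read literally: keep every column, but set the non-numeric ones to zero instead of dropping them. The following is a minimal base R sketch of both readings, assuming a small hypothetical stand-in for Hitters; the names `num` and `zeroed` are illustrative only.

```r
# Hypothetical three-row stand-in for the Hitters data
df <- data.frame(
  AtBat  = c(293, 315, 479),
  Salary = c(NA, 475, 480),
  League = factor(c("A", "N", "A"))
)

# Reading 1: drop non-numeric columns, then replace NA with 0
num <- df[sapply(df, is.numeric)]
num[is.na(num)] <- 0

# Reading 2: keep all columns, overwrite non-numeric ones with zeros
zeroed  <- df
non_num <- !sapply(zeroed, is.numeric)
zeroed[non_num] <- lapply(zeroed[non_num], function(col) numeric(length(col)))
# Every column is numeric now, so the remaining NAs can be zeroed safely
zeroed[is.na(zeroed)] <- 0
```

Zeroing the factor columns before the `NA` replacement matters: assigning 0 into a factor column directly would produce an invalid-level warning rather than a zero.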
