r - 使用二进制变量进行聚类
问题描述
我有一个数据集,其中一些变量具有二进制类型。第一列是名称,因此在应用聚类分析时会显示错误。
kc <- kmeans(j1,4) ## j1 is the stored data frame
do_one(nmeth) 中的错误:外部函数调用中的 NA/NaN/Inf (arg 1) 另外:警告消息:在 storage.mode(x) <- "double" : NAs 由强制引入 -</p>
我在这里给出的数据头使用dput(j1[1:5,]
:
structure(list(OUTPUT_NAME = c("nonsaturation_fba268_2ch_0_out.wav",
"nonsaturation_fba268_2ch_32_out.wav", "substreaminfo_fba268_2ch_96_out.wav",
"substreaminfo_fba268_2ch_201_out.wav", "substreaminfo_fba268_2ch_93_out.wav"
), PEAK_MIPS = c(82.47, 82.5, 82.63, 82.73, 82.73), PRESENTATION = c(0,
0, 0, 0, 0), DTHD_ATMOS_PRE = c(0, 0, 0, 0, 0), FBAFBBDETECTER = c(1,
1, 1, 1, 1), DIAL_NORM = c(31, 31, 31, 31, 31), NORMAL_DRC = c(0,
0, 0, 0, 0), ANALOG_DB_GAIN_REQ = c(0, 0, 0, 0, 0), DECODER_CH_ASSIGN = c(1,
1, 1, 1, 1), DECODER_6_CH_ASSIGN = c(1, 1, 13, 1, 1), DECODER_8_CH_ASSIGN = c(1,
1, 13, 1, 1), DECODER_16_CH_ASSIGN = c(0, 0, 0, 0, 0), CH_MODIFIER = c(0,
0, 0, 0, 0), CH_ASSIGNMENT_TYPE = c(0, 0, 0, 0, 0), FILTER_ORDER = c(0,
0, 0, 0, 0), COEFF_BITS = c(9, 9, 9, 9, 9), COEFF_SHIFT = c(7,
7, 7, 7, 7), STATE_BITS = c(4, 4, 6, 6, 6), STATE_SHIFT = c(0,
0, 0, 0, 0), `31EC_PRIMITIVE_MATRIX_CNT` = c(16, 16, 8, 8, 8),
LSB_BYPASS_COUNT = c(0, 0, 0, 0, 0), DITHER_SCALE = c(1,
1, 1, 1, 1), `31EC_FRAC_BITS` = c(14, 14, 12, 12, 12), INTERPOLATION_USED = c(1,
1, 0, 0, 0), `31EA_31EB_PRIMITIVE_MATIX_CNT` = c(0, 0, 0,
0, 0), `31EA_31EB_FRAC_BITS` = c(14, 14, 12, 12, 12), LSB_BYPASS_USED = c(0,
0, 0, 0, 0), AU_LENGTH = c(937, 937, 937, 937, 937), VARIABLE_RATE = c(1,
1, 1, 1, 1), PEAK_DATA_RATE = c(6000, 6000, 6000, 6000, 6000
), SUBSTREAM_CNT = c(1, 1, 2, 2, 2), EXTENDED_SUBSTREAM_CNT = c(0,
0, 0, 0, 0), SUBSTREAM_INFO = c(20, 20, 40, 24, 24), SPEAKER_LAYOUT = c(0,
0, 0, 0, 0), CONTROL_EN_2 = c(0, 0, 0, 0, 0), CONTROL_EN_6 = c(0,
0, 0, 0, 0), CONTROL_EN_8 = c(0, 0, 0, 0, 0), MIX_LEVEL_2 = c(35,
35, 35, 35, 35), MIX_LEVEL_6 = c(35, 35, 35, 35, 35), MIX_LEVEL_8 = c(35,
35, 35, 35, 35), DIALOGUE_NORM_2 = c(31, 31, 31, 31, 31),
DIALOGUE_NORM_6 = c(31, 31, 31, 31, 31), DIALOGUE_NORM_8 = c(31,
31, 31, 31, 31), SOURCE_FORMAT_6 = c(0, 0, 0, 0, 0), SOURCE_FORMAT_8 = c(0,
0, 0, 0, 0), DRC_STARTUP_GAIN = c(0, 0, 0, 0, 0), DIALOGUE_NORM_16 = c(28,
28, 31, 31, 31), MIX_LEVEL_16 = c(35, 35, 35, 35, 35), CHANNEL_CNT_16 = c(16,
16, 16, 16, 16), DYNAMIC_OBJ_ONLY = c(1, 1, 1, 1, 1), DYNAMIC_CHANNEL_CNT_16 = c(0,
0, 0, 0, 0), LFE_PRE = c(1, 1, 0, 0, 0), CHANNEL_CONTENT_DES_16 = c(0,
0, 0, 0, 0), MIN_CHAN = c(0, 0, 0, 0, 0), MAX_CHAN = c(1,
1, 1, 1, 1), RESTART_SYNC_WORD = c(12778, 12778, 12778, 12778,
12778), MAX_MATRIX_CHAN = c(1, 1, 1, 1, 1), DITHER_SHIFT = c(0,
0, 0, 0, 0), ERROR_PROTECT = c(1, 1, 1, 1, 1), LOSSLESS_PROTECT = c(0,
0, 1, 1, 1), BLOCK_SIZE = c(32, 32, 40, 40, 40), OUTPUT_SHIFT = c(0,
0, 0, 0, 0), QUANT_STEP_SIZE = c(0, 0, 0, 0, 0), HUFF_OFFSET = c(0,
0, 0, 0, 0), HUFF_TYPE = c(1, 1, 0, 2, 2), HUFF_LSBS = c(6,
6, 8, 5, 5), SAMPLE_RATE = c(0, 3, 0, 3, 0), OUTPUT_SAMPLE_COUNT = c(40,
40, 40, 40, 40), RESTART_HEADER_EXISTS = c(0, 0, 0, 0, 0)), row.names = c(NA,
-5L), class = c("tbl_df", "tbl", "data.frame"))
解决方案
您使用的变量不是数字,请看:
class(j1[,1])
[1] "character"
你必须删除它,才能kmeans
工作:
set.seed(1234)
kmeans(j1[,-1],2)
推荐阅读
- javascript - 无法在反应类中声明常量
- python - Python单线程比CPU绑定中的两个线程慢
- javascript - 带有文本输入的自定义对话框 React Native
- spring-boot - 在 Gradle Spring Boot 项目中的 wsdl2Java 任务中将正确的路径传递给本地 WSDL 文件
- c++ - 读取数据包时出现FFMPEG c++内存泄漏问题
- sql - 如何从底部 SQL Server 创建层次结构路径
- loops - 对于高维数组,Julia 中的数组访问循环很慢
- python - Viber Bot 无法启动
- javascript - Angular 自定义元素、构造函数与 ngDoBootstrap
- pycharm - 如何在 PyCharm 中启用“停止并重新运行”弹出窗口?