CorruptStatistics warning when reading a Parquet file

Problem description

I get the following warning when reading a Parquet file with Spark version 2.4.5:

Sep 2, 2021 10:54:03 PM WARNING: org.apache.parquet.CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-cpp version 1.4.0
org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-cpp version 1.4.0 using format: (.+) version ((.*) )?\(build ?(.*)\)
        at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
        at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:60)
        at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:263)
        at org.apache.parquet.hadoop.ParquetFileReader$Chunk.readAllPages(ParquetFileReader.java:583)
        at org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:513)
        at org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:130)
        at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:214)

Do you have any idea where this message comes from, and how to resolve it?

Tags: java, apache-spark, parquet

Solution
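
The exception text in the log above already hints at the source of the message: the reader tries to match the file's created_by string against the format (.+) version ((.*) )?\(build ?(.*)\), and "parquet-cpp version 1.4.0" has no "(build ...)" part. The sketch below only reproduces that regular-expression check in isolation; it is not the Parquet library's code. The class name CreatedByCheck, the method parses, and the second sample string are hypothetical, and whole-string matching is an assumption.

```java
import java.util.regex.Pattern;

public class CreatedByCheck {
    // Pattern copied verbatim from the exception message in the log.
    static final Pattern FORMAT =
        Pattern.compile("(.+) version ((.*) )?\\(build ?(.*)\\)");

    // Returns true when the whole created_by string fits the pattern
    // (assumption: the reader requires a full match).
    public static boolean parses(String createdBy) {
        return FORMAT.matcher(createdBy).matches();
    }

    public static void main(String[] args) {
        // The created_by value from the warning: no "(build ...)" suffix.
        System.out.println(parses("parquet-cpp version 1.4.0")); // prints false

        // Hypothetical value that does carry a "(build ...)" suffix.
        System.out.println(parses("parquet-mr version 1.8.1 (build abc123)")); // prints true
    }
}
```

According to the message itself, the consequence of the failed parse is that the statistics stored in the file are ignored; the stack trace shows the read continuing through the normal record-reader path.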

