java - Why does HDFS serialize using protocol buffers, not Java serialization APIs?
Problem description
Why does HDFS use a protocol buffer instead of the Java serialization API?
What if I want to send an object from a data node to another data node via Java serialization?
I've tried a couple of things and I get the following error: java.io.WriteAbortedException: writing aborted; java.io.NotSerializableException: java.lang.Thread
Solution
Because formats with an external schema definition, like Protocol Buffers, are more space efficient than the built-in Java serialization, which produces very verbose output.
HDFS can use different formats to store data. Formats that provide the best space efficiency while not being overly CPU-intensive are generally preferred. Some formats are designed for a specific goal, which helps with data processing.
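To get a feel for the overhead, here is a minimal sketch (the `Block` class and field values are made up for illustration) comparing the size of built-in Java serialization against writing only the raw field bytes, which is roughly what a schema-based format like Protocol Buffers stores:

```java
import java.io.*;

public class SerializationOverheadDemo {
    // A simple value class with two long fields (hypothetical example).
    static class Block implements Serializable {
        private static final long serialVersionUID = 1L;
        long blockId;
        long length;
        Block(long blockId, long length) { this.blockId = blockId; this.length = length; }
    }

    // Size in bytes when using built-in Java serialization: includes the
    // stream header, the full class descriptor (class name, field names
    // and types), and only then the field values.
    static int javaSerializedSize(Block b) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(b);
        }
        return bos.size();
    }

    // Size when writing only the raw field values, as a format with an
    // external schema can do, since the schema lives outside the data.
    static int rawFieldSize(Block b) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bos)) {
            dos.writeLong(b.blockId);
            dos.writeLong(b.length);
        }
        return bos.size();
    }

    public static void main(String[] args) throws IOException {
        Block b = new Block(42L, 134217728L);
        System.out.println("Java serialization: " + javaSerializedSize(b) + " bytes");
        System.out.println("Raw fields only:    " + rawFieldSize(b) + " bytes");
    }
}
```

The two `long` fields need only 16 bytes of payload, while the Java-serialized form carries the class metadata alongside every stream, which is the verbosity the answer refers to.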
The java.io.NotSerializableException: java.lang.Thread exception shows that you are attempting to serialize a Thread, which does not implement Serializable.
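One common fix, sketched below with a hypothetical `Worker` class, is to mark the non-serializable Thread field as transient so Java serialization skips it (the field comes back as null after deserialization and must be re-created if needed):

```java
import java.io.*;

public class TransientThreadDemo {
    // A class holding a Thread: without `transient`, serializing it would
    // throw java.io.NotSerializableException: java.lang.Thread.
    static class Worker implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient Thread runner;   // excluded from the serialized form
        Worker(String name) { this.name = name; this.runner = new Thread(); }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    static Worker deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Worker) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Worker w = new Worker("dn-transfer");
        Worker copy = deserialize(serialize(w));
        System.out.println(copy.name);    // the String field round-trips
        System.out.println(copy.runner);  // null: transient fields are skipped
    }
}
```

The same applies to any field whose type does not implement Serializable; the Thread in the question's stack trace is likely reachable through such a field.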