Connect and read data from Elasticsearch to Hive

Problem description

I want to connect Hive to Elasticsearch. I followed the instructions here. I performed the following steps:

1. start-dfs.sh
2. start-yarn.sh
3. launch Elasticsearch
4. launch Kibana
5. launch Hive
Inside Hive:
a- create a database
b- create a table
c- load data into the table (LOAD DATA LOCAL INPATH '/home/myuser/Documents/datacsv/myfile.csv' OVERWRITE INTO TABLE students; )
d- add jar /home/myuser/elasticsearch-hadoop-7.10.1/dist/elasticsearch-hadoop-hive-7.10.1.jar
e- create a table for Elasticsearch:
create table students_es (stt int not null, mahocvien varchar(10), tenho string, ten string, namsinh date, gioitinh string, noisinh string, namvaodang date, trinhdochuyenmon string, hesoluong float, phucaptrachnhiem float, chucvudct string, chucdqh string, dienuutien int, ghichu int) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.nodes' = '127.0.0.1', 'es.port' = '9201', 'es.resource' = 'students/student');

f- insert overwrite table students_es select * from students;  

Then I get the following error:

FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. org/apache/commons/httpclient/protocol/ProtocolSocketFactory

I am using these component versions: Kibana 7.10.1, Hive 3.1.2, Hadoop 3.1.2.

Tags: java, apache-spark, elasticsearch, hadoop, hive

Solution


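I finally found a fix. The class named in the error, org.apache.commons.httpclient.protocol.ProtocolSocketFactory, belongs to Commons HttpClient 3.x, which was evidently not on Hive's classpath. Download the jar file commons-httpclient-3.1.jar and put it in your Hive lib directory.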
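As a rough sketch of that copy step, something like the following could work. It assumes the jar has already been downloaded and that HIVE_HOME points at your Hive installation; install_jar is a hypothetical helper name used only for illustration, not something Hive provides.

```shell
#!/bin/sh
# Sketch: copy a jar into a Hive lib directory unless it is already there.
# install_jar is a hypothetical helper name, not part of Hive.
install_jar() {
    jar_path=$1   # path to the downloaded jar
    lib_dir=$2    # target directory, e.g. "$HIVE_HOME/lib" (assumed layout)

    if [ ! -f "$jar_path" ]; then
        echo "jar not found: $jar_path" >&2
        return 1
    fi
    if [ ! -d "$lib_dir" ]; then
        echo "lib directory not found: $lib_dir" >&2
        return 1
    fi

    target="$lib_dir/$(basename "$jar_path")"
    if [ -f "$target" ]; then
        # Nothing to do; avoid overwriting an existing copy.
        echo "already present: $target"
        return 0
    fi
    cp "$jar_path" "$target" && echo "installed: $target"
}

# Example call (assumes HIVE_HOME is set and the jar is in the current directory):
#   install_jar ./commons-httpclient-3.1.jar "$HIVE_HOME/lib"
```

After copying, you will probably need to restart the Hive session before re-running the insert, since the lib directory is normally read when Hive starts.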

