mysql - logstash-input-jdbc: how to use UTF-8 characters in the statement
Question
I use logstash-input-jdbc to sync my database to Elasticsearch.
Environment: Logstash 7.5, Elasticsearch 7.5, mysql-connector-java-5.1.48.jar, logstash-input-jdbc-4.3.16
materials.conf:
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/sc_education"
    jdbc_driver_library => "connector/mysql-connector-java-5.1.48.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_user => "dauser"
    jdbc_password => "daname"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50"
    statement_filepath => "./materials.sql"
    schedule => "* * * * *"
    last_run_metadata_path => "./materials.info"
    record_last_run => true
    tracking_column => updated_at
    codec => plain { charset => "UTF-8" }
    # parameters => { "favorite_artist" => "Beethoven" }
    # statement => "SELECT * from songs where artist = :favorite_artist"
  }
}
filter {
  json {
    source => "message"
    remove_field => ["message"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "materials"
    document_id => "%{material_id}"
  }
  stdout {
    codec => json_lines
  }
}
materials.sql:
SELECT material_name, material_id,
  CASE grade_id
    WHEN grade_id = 1 THEN "一年级"
    WHEN grade_id = 2 THEN "二年级"
    WHEN grade_id = 3 THEN "三年级"
    WHEN grade_id = 4 THEN "四年级"
    WHEN grade_id = 5 THEN "五年级"
    WHEN grade_id = 6 THEN "六年级"
    WHEN grade_id = 7 THEN "初一"
    WHEN grade_id = 8 THEN "初二"
    WHEN grade_id = 9 THEN "初三"
    WHEN grade_id = 10 THEN "高一"
    WHEN grade_id = 11 THEN "高二"
    WHEN grade_id = 12 THEN "高三"
    ELSE "" END as grade,
  CASE subject_id
    WHEN subject_id = 1 THEN "数学"
    WHEN subject_id = 2 THEN "物理"
    WHEN subject_id = 3 THEN "化学"
    WHEN subject_id = 4 THEN "语文"
    WHEN subject_id = 5 THEN "英语"
    WHEN subject_id = 6 THEN "科学"
    WHEN subject_id = 7 THEN "音乐"
    WHEN subject_id = 8 THEN "绘画"
    WHEN subject_id = 9 THEN "政治"
    WHEN subject_id = 10 THEN "历史"
    WHEN subject_id = 11 THEN "地理"
    WHEN subject_id = 12 THEN "生物"
    WHEN subject_id = 13 THEN "奥数"
    ELSE "" END as subject,
  CASE course_term_id
    WHEN course_term_id = 1 THEN "春"
    WHEN course_term_id = 2 THEN "暑"
    WHEN course_term_id = 3 THEN "秋"
    WHEN course_term_id = 4 THEN "寒"
    ELSE "" END as season,
  created_at, updated_at
from sc_materials
where updated_at > :sql_last_value and material_id in (2025,317,2050);
./bin/logstash -f materials.conf
{"@version":"1","updated_at":"2019-08-19T02:04:54.000Z","season":"?","grade":"","created_at":"2019-08-19T02:04:54.000Z","@timestamp":"2019-12-13T01:02:01.907Z","material_name":"test material seri''al","material_id":2025,"subject":"??"}
{"@version":"1","updated_at":"2019-08-26T09:25:35.000Z","season":"","grade":"","created_at":"2019-08-26T09:25:35.000Z","@timestamp":"2019-12-13T01:02:01.908Z","material_name":"人教版高中英语必修三第10讲Unit5 Canada The True North语法篇A学生版2.pdf","material_id":2050,"subject":""}
{"@version":"1","updated_at":"2019-08-10T06:50:48.000Z","season":"?","grade":"","created_at":"2019-05-27T06:26:44.000Z","@timestamp":"2019-12-13T01:02:01.880Z","material_name":"90aca2238832143fb75dcf0fe6dbbfa9.pdf","material_id":317,"subject":""}
Chinese characters that come from the database are displayed correctly, but the Chinese characters written in the statement itself are turned into ? characters.
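Answer
For me, adding characterEncoding=utf8 to the connection string did not work.
After I added the following stdin block to the input section,
stdin {
  codec => plain { charset => "UTF-8" }
}
it worked.
Here is my working conf file. This answer is a bit late, but I hope it helps someone.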
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/atlasdb?useTimezone=true&useLegacyDatetimeCode=false&serverTimezone=UTC&useSSL=false&useUnicode=true&characterEncoding=utf8"
    jdbc_user => "atlas"
    jdbc_password => "atlas"
    jdbc_validate_connection => true
    jdbc_driver_library => "/lib/postgres-42-test.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    schedule => "* * * * *"
    statement => "SELECT * from naver_city"
  }
  stdin {
    codec => plain { charset => "UTF-8" }
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    index => "2020-04-23-2"
    doc_as_upsert => true
    action => "update"
    document_id => "%{code}"
  }
  stdout { codec => rubydebug }
}
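Carried over to the MySQL pipeline from the question, the same workaround might look like the sketch below. This is an untested assumption: the answer only confirms the fix for the PostgreSQL setup above, so whether it also restores the Chinese literals in materials.sql would need to be verified. All settings are copied from the question; only the stdin block is new.

```
input {
  jdbc {
    # Settings copied unchanged from the question's materials.conf.
    jdbc_connection_string => "jdbc:mysql://localhost:3306/sc_education"
    jdbc_driver_library => "connector/mysql-connector-java-5.1.48.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_user => "dauser"
    jdbc_password => "daname"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50"
    statement_filepath => "./materials.sql"
    schedule => "* * * * *"
    last_run_metadata_path => "./materials.info"
    record_last_run => true
    tracking_column => updated_at
  }
  # Workaround from the answer: an extra stdin input with an explicit UTF-8 codec.
  # Assumption: it has the same effect here as in the PostgreSQL pipeline.
  stdin {
    codec => plain { charset => "UTF-8" }
  }
}
```

If the grade, subject, and season fields still come back as ? after this change, the workaround does not carry over to this setup and the cause lies elsewhere.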