Postgres geometry type error in Spark (Scala)

Problem description

I created a DataFrame in Spark from a PostgreSQL table that contains a geometry column, as follows:

val df = sparkSession.read.format("jdbc")
      .option("url", pgInfo)
      // dbtable expects a table name or a parenthesized subquery with an alias
      .option("dbtable", "(SELECT * FROM tableName) AS t")
      .option("user", "user")
      .option("password", "password")
      .option("header", "true")
      .option("driver", "org.postgresql.Driver").load

The DataFrame is then written to a new table.

df.filter(row => ~~ ).write.mode(SaveMode.Overwrite).jdbc("pg_info1", "new_table", "pg_info2")

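When I run this against a table containing polygons, it works fine. But when I run it against a table containing linestring geometries, the following error occurs: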

Caused by: java.sql.BatchUpdateException: Batch entry 0 INSERT INTO linestring_table ("geom","val") VALUES ('0102000020110F00000C00000086584F5529496A410F80812D274D4541E641D63829496A416AED7E75194D4541AE49ED1F29496A41178BA0CD0A4D45410936C4C329496A41869567EF064D454104A6EEA92F496A410A642E67E84C4541F0375FE330496A41DE709EC8DB4C4541232DB89F30496A412D1C8D53CF4C45417707220E31496A4108D93FCDCC4C45412DF9BD7634496A412068C137C24C4541F9DC327F37496A4184EA9324B04C4541E107F7BD3C496A4199B6C256944C4541ACC0E3133E496A4100F6692C8D4C4541','1234') was aborted: ERROR: column "geom" is of type geometry but expression is of type character varying
  Hint: You will need to rewrite or cast the expression.
  Position: 73  Call getNextException to see other errors in the batch.
    at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:154)
    at org.postgresql.core.ResultHandlerDelegate.handleError(ResultHandlerDelegate.java:50)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2269)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:511)
    at org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:851)
    at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:874)
    at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1569)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:659)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:821)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:821)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:935)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:935)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.postgresql.util.PSQLException: ERROR: column "geom" is of type geometry but expression is of type character varying
  Hint: You will need to rewrite or cast the expression.
  Position: 73
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2533)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2268)
    ... 17 more

Tags: postgresql, apache-spark, jdbc, gis

Solution
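The error message says the driver sent the `geom` value as `character varying`, and PostgreSQL will not implicitly cast that to `geometry`. One workaround that may apply here, not taken from the original post, is the PostgreSQL JDBC driver's `stringtype=unspecified` connection parameter, which sends string parameters untyped so the server can infer the target column type. The sketch below assumes the write URL is a plain `jdbc:postgresql://...` string; the `PgUrl` object and its method name are hypothetical helpers introduced only for illustration.

```scala
// Hypothetical helper (not part of Spark or pgjdbc): appends the pgjdbc
// connection parameter stringtype=unspecified to a JDBC URL if it is absent.
object PgUrl {
  def withUnspecifiedStringType(url: String): String =
    if (url.contains("stringtype=")) url // leave an explicit setting untouched
    else {
      // Use '?' for the first parameter, '&' for any further one.
      val sep = if (url.contains("?")) "&" else "?"
      s"$url${sep}stringtype=unspecified"
    }
}

// Usage sketch (needs Spark and a reachable database, so shown as comments):
// val props = new java.util.Properties()
// props.setProperty("user", "user")
// props.setProperty("password", "password")
// props.setProperty("driver", "org.postgresql.Driver")
// df.write.mode(SaveMode.Append)
//   .jdbc(PgUrl.withUnspecifiedStringType(pgInfo), "new_table", props)
```

The usage sketch writes with `SaveMode.Append` on the assumption that the target table already exists with a `geometry` column, as the error message suggests. Whether the server accepts the untyped hex string for `geom` depends on the PostGIS installation, so this should be checked against a test table first.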

