Cannot connect to Cloud SQL through the SQL proxy on Dataproc

Problem description

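I am trying to access Cloud SQL from Dataproc through the Cloud SQL proxy (without using Hive).

After a lot of tinkering based on the instructions here: https://github.com/GoogleCloudPlatform/dataproc-initialization-actions/tree/master/cloud-sql-proxy

I got to the point where at least the cluster is created without errors and the proxy appears to be installed. However, my Java Spark job fails to connect, with the following error: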

Exception in thread "main" java.sql.SQLException: Access denied for user 'root'@'localhost' (using password: NO)

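I deliberately created an instance whose root user has no password, but it does not work for instances with a password either.

What I find strange is that when I access the same database from my local machine, also through a locally running Cloud SQL proxy, everything works. But when I try to force a similar error by deliberately submitting a wrong password, I get a similar but different error: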

Exception in thread "main" java.sql.SQLException: Access denied for user 'root'@'cloudsqlproxy~217.138.38.242' (using password: YES)

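So in the Dataproc error the user appears as root@localhost, whereas with my local proxy it appears as root@cloudsqlproxy~<IP address>. Why is that? The code running in both places is exactly the same. It looks as if the job is trying to connect to something local on the Dataproc master?

Further supporting this, when the attempt fails on Dataproc I do not see the error logged on the server side, but when I force a failure from my local machine, the error is logged. So the proxy on Dataproc does not seem to point to the Cloud SQL server?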
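One way to test the "something local is answering" hypothesis without valid credentials is to read the greeting that a MySQL server sends as soon as a TCP connection opens. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes the standard MySQL wire format (a 4-byte packet header, then a payload that starts with protocol version 10 followed by a NUL-terminated server version string). Comparing the version string seen on the Dataproc master with the one seen through the local proxy may show whether both ports lead to the same server.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any Google or MySQL library.
class MySqlHandshakeProbe {

    // Extracts the server version from the payload of an initial
    // handshake packet (protocol version 10 = 0x0a).
    static String parseServerVersion(byte[] payload) {
        if (payload.length < 2 || payload[0] != 0x0a) {
            throw new IllegalArgumentException("not a protocol-10 handshake payload");
        }
        int end = 1;
        while (end < payload.length && payload[end] != 0) {
            end++;
        }
        return new String(payload, 1, end - 1, StandardCharsets.US_ASCII);
    }

    // Opens a TCP connection, reads one packet (3-byte little-endian
    // length, 1-byte sequence id, then the payload) and parses it.
    static String probe(String host, int port) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 5000);
            socket.setSoTimeout(5000);
            DataInputStream in = new DataInputStream(socket.getInputStream());
            byte[] header = new byte[4];
            in.readFully(header);
            int length = (header[0] & 0xff)
                    | ((header[1] & 0xff) << 8)
                    | ((header[2] & 0xff) << 16);
            byte[] payload = new byte[length];
            in.readFully(payload);
            return parseServerVersion(payload);
        }
    }

    public static void main(String[] args) throws IOException {
        // Only touches the network when a host and port are supplied,
        // e.g. "127.0.0.1 3306".
        if (args.length == 2) {
            System.out.println(probe(args[0], Integer.parseInt(args[1])));
        }
    }
}
```

If the server answers with an error packet instead of a greeting, `parseServerVersion` throws, which is itself a hint that the port is not relaying to the expected instance.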

I created the cluster with the following options:

--scopes sql-admin \
--initialization-actions gs://bucket_name/cloud-sql-proxy.sh \
--metadata 'enable-cloud-sql-hive-metastore=false' \
--metadata 'additional-cloud-sql-instances=project_id:europe-west2:sql_instance_id' \

The connection string I specify in my Spark code is:

jdbc:mysql://127.0.0.1:3306/database_name

Thanks for your help!

**** Update:

Following the suggestion below, I changed the connection string to:

"jdbc:mysql://google/DATABASE_NAME?cloudSqlInstance=INSTANCE_NAME&socketFactory=com.google.cloud.sql.mysql.SocketFactory&useSSL=false&user=root"

However, in this case I get the following error:

Exception in thread "main" java.sql.SQLNonTransientConnectionException: Cannot connect to MySQL server on google:3,306.

Make sure that there is a MySQL server running on the machine/port you are trying to connect to and that the machine this software is running on is able to connect to this host/port (i.e. not firewalled). Also make sure that the server has not been started with the --skip-networking flag.


    at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:108)
    at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:95)
    at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:87)
    at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:61)
    at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:71)
    at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:458)
    at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:230)
    at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:226)

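How or where should I get the driver for "google"? Also, note that it seems to misformat the default port 3306 and show it as 3,306. (I tried supplying the port explicitly, but that did not help.)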
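On the `3,306` formatting: this is most likely cosmetic rather than a sign that the wrong port is used. If the driver builds its messages with `java.text.MessageFormat`, as Connector/J appears to, a numeric argument is rendered with the locale's grouping separator. The sketch below reproduces the effect; the pattern string is a stand-in modeled on the error text above, not copied from the driver, and the class name is hypothetical.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical demo class; the patterns are illustrative assumptions.
class PortFormattingDemo {

    // A plain {1} placeholder formats numbers with grouping separators.
    static String render(String host, int port) {
        MessageFormat fmt = new MessageFormat(
                "Cannot connect to MySQL server on {0}:{1}.", Locale.US);
        return fmt.format(new Object[] {host, port});
    }

    // {1,number,#} applies a DecimalFormat pattern without grouping.
    static String renderPlain(String host, int port) {
        MessageFormat fmt = new MessageFormat(
                "Cannot connect to MySQL server on {0}:{1,number,#}.", Locale.US);
        return fmt.format(new Object[] {host, port});
    }

    public static void main(String[] args) {
        System.out.println(render("google", 3306));      // google:3,306
        System.out.println(renderPlain("google", 3306)); // google:3306
    }
}
```

Under that assumption the port is still 3306 on the wire, and the message only reflects how the number was printed.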

Tags: google-cloud-sql, google-cloud-dataproc

Solution


I followed the instructions in the tutorial you shared, and both the Cloud SQL instance and the Dataproc cluster were created. The validation step also ran successfully:

$ gcloud dataproc jobs submit pyspark --cluster githubtest pyspark_metastore_test.py
Job [63d2e1ef8c9f45ae818c135c775dcf93] submitted.
Waiting for job output...
18/08/22 17:21:51 INFO org.spark_project.jetty.util.log: Logging initialized @3074ms
...
Successfully found table table_mdhw in Cloud SQL Hive metastore                 
18/08/22 17:22:53 INFO org.spark_project.jetty.server.AbstractConnector: Stopped Spark@5061d2ce{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
Job [63d2e1ef8c9f45ae818c135c775dcf93] finished successfully.

I only got the same error as yours when I supplied a different password for root. Could you update the root password and then run the following command from the master node?

 mysql -u root -h 127.0.0.1 -p

In my environment, the command above connects successfully. If it works for you too, please check this link for the further steps needed to connect your Java application. Authentication and the mysql-connector-java connector are required as additional steps.
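As a rough sketch of the Java side, assuming the proxy listens on 127.0.0.1:3306 as in the question and that mysql-connector-java is on the job's classpath, the connection could pass the user and password explicitly so the server no longer reports `using password: NO`. The class name, the helper methods, and the `DB_PASSWORD` environment variable are hypothetical choices for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

// Hypothetical example class; adapt names to your own job.
class ProxyJdbcExample {

    // Builds a plain JDBC URL that targets the proxy's local port.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    // Packs the credentials as standard JDBC connection properties.
    static Properties credentials(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        return props;
    }

    // Opens a connection and returns the server version, which shows
    // which server actually answered.
    static String serverVersion(String url, Properties props) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT VERSION()")) {
            rs.next();
            return rs.getString(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        String url = buildUrl("127.0.0.1", 3306, "database_name");
        System.out.println(url);
        // DB_PASSWORD is an assumed variable name; connect only when it is set.
        String password = System.getenv("DB_PASSWORD");
        if (password != null) {
            System.out.println(serverVersion(url, credentials("root", password)));
        }
    }
}
```

Reading the password from the environment rather than hard-coding it is only one option; any secret store available to the job would serve the same purpose.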

Hope it helps!

