首页 > 解决方案 > kubernetes 中的 Janusgraph 无法连接到作为另一个服务运行的 Cassandra

问题描述

我正在尝试运行 Janusgraph,存储为 Cassandra,作为同一集群中的另一个服务运行,Elasticsearch 用于索引,再次作为同一集群中的另一个服务运行。

虽然所需的端口在两个服务中都是打开的,但 janusgraph pods 的日志显示它在连接到 Cassandra 时面临的连接超时。

23343 [main] WARN  org.apache.tinkerpop.gremlin.server.GremlinServer  - Graph [graph] configured at [conf/gremlin-server/janusgraph.properties] could not be instantiated and will not be available in Gremlin Server.  GraphFactory message: GraphFactory could not instantiate this Graph implementation [class org.janusgraph.core.JanusGraphFactory]
java.lang.RuntimeException: GraphFactory could not instantiate this Graph implementation [class org.janusgraph.core.JanusGraphFactory]
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:82)
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:70)
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:104)
    at org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager.lambda$new$0(DefaultGraphManager.java:57)
    at java.util.LinkedHashMap$LinkedEntrySet.forEach(LinkedHashMap.java:671)
    at org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager.<init>(DefaultGraphManager.java:55)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:110)
    at org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor.<init>(ServerGremlinExecutor.java:89)
    at org.apache.tinkerpop.gremlin.server.GremlinServer.<init>(GremlinServer.java:110)
    at org.apache.tinkerpop.gremlin.server.GremlinServer.main(GremlinServer.java:354)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:78)
    ... 13 more
Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager
    at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:69)
    at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:477)
    at org.janusgraph.diskstorage.Backend.getStorageManager(Backend.java:409)
    at org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1376)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:164)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:133)
    at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:113)
    ... 18 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:58)
    ... 24 more
Caused by: org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
    at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:619)
    at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager.<init>(AstyanaxStoreManager.java:314)
    ... 29 more
Caused by: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=cassandra(SERVICE_IP):9160, latency=10001(10001), attempts=1]Timed out waiting for connection
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:231)
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:198)
    at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:84)
    at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:117)
    at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:352)
    at com.netflix.astyanax.thrift.ThriftClusterImpl.executeSchemaChangeOperation(ThriftClusterImpl.java:146)
    at com.netflix.astyanax.thrift.ThriftClusterImpl.internalCreateKeyspace(ThriftClusterImpl.java:321)
    at com.netflix.astyanax.thrift.ThriftClusterImpl.addKeyspace(ThriftClusterImpl.java:294)
    at org.janusgraph.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:614)

我正在为 cassandra 运行 janusgrah v2 图像和 gcr.io/google-samples/cassandra:v13 图像。

我也尝试从busybox pod连接到cassandra端口9160。但似乎不起作用。但有趣的是:ping似乎适用于服务名称(此处为 cassandra)。但只有当它到达telnet端口 9160 或 9042 时,我才会收到连接被拒绝错误。

这是卡桑德拉 STS:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
  - port: 9042
    name: cql
  - port: 9160
    name: thrift
  selector:
    app: cassandra
---    
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      #schedulerName: stork       #Check benefits of using STORK as scheduler.
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        ports:
          - containerPort: 7000
            name: intra-node
          - containerPort: 7001
            name: tls-intra-node
          - containerPort: 7199
            name: jmx
          - containerPort: 9042
            name: cql
          - containerPort: 9160
            name: thrift
          - containerPort: 9142
            name: transportssl
        resources:
          limits:
            cpu: "1Gi"
            memory: 2Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
              - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command: 
              - /bin/sh
              - -c
              - nodetool drain
        env:
          - name: CASSANDRA_SEEDS
            value: cassandra-0.cassandra.default.svc.cluster.local
          - name: MAX_HEAP_SIZE 
            value: 512M
          - name: HEAP_NEWSIZE
            value: 512M
          - name: CASSANDRA_CLUSTER_NAME
            value: "Cassandra"
          - name: CASSANDRA_DC
            value: "DC1"
          - name: CASSANDRA_RACK
            value: "Rack1"
          - name: CASSANDRA_AUTO_BOOTSTRAP
            value: "false"            
          - name: CASSANDRA_ENDPOINT_SNITCH
            value: GossipingPropertyFileSnitch
          - name: CASSANDRA_RPC_ADDRESS
            value: 0.0.0.0
          - name: CASSANDRA_NUM_TOKENS
            value: "32"
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - name: nfs-pvc-cassandra
          mountPath: /srv/nfs/kubedata/janus
      restartPolicy: Always
      volumes:
        - name: nfs-pvc-cassandra
          persistentVolumeClaim:
            claimName: nfs-pvc-cassandra

我可以进一步调试的方法是什么?

标签: kuberneteskubernetes-helmconnection-timeoutjanusgraph

解决方案


如果您的 janusgraph 在您的主机上运行,​​您可能需要对您的 kubernetes 服务端口进行端口转发才能在本地访问它。也许你已经做到了


推荐阅读