java - NLTK: using the Stanford dependency parser
Problem description
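NLTK seems to have a lot of conflicting documentation (where is the authoritative source for NLTK/StanfordNLP documentation?).
My question: what is the preferred way to call the StanfordParser from nltk? Here is my code, but something is wrong in the Java call.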
from nltk.parse.stanford import StanfordDependencyParser
import os
parser_home = '/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/'
# os.environ['CLASSPATH'] = parser_home
parser = StanfordDependencyParser(
    model_path=parser_home + 'stanford-parser.jar',
    path_to_models_jar=parser_home + 'stanford-parser-3.9.1-models.jar',
    verbose=True
)
result = parser.raw_parse('Here is an example sentence.')
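Here is my error. Any help is appreciated. I have not found an existing question that exactly matches mine. I am setting the classpath, but I am not sure that is required.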
[Found stanford-parser\.jar: /Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/stanford-parser.jar]
[Found stanford-parser-(\d+)(\.(\d+))+-models\.jar: /Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/stanford-parser-3.9.1-models.jar]
/Users/myname/anaconda3/envs/nlp/lib/python3.6/site-packages/ipykernel_launcher.py:12: DeprecationWarning: The StanfordDependencyParser will be deprecated
Please use nltk.parse.corenlp.StanforCoreNLPDependencyParser instead.
if sys.path[0] == '':
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.RuntimeException: /Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/stanford-parser.jar: expecting BEGIN block; got PK��aL META-INF/��PKPK��aLMETA-INF/MANIFEST.MFE��
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.confirmBeginBlock(LexicalizedParser.java:536)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromTextFile(LexicalizedParser.java:546)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromFile(LexicalizedParser.java:406)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:186)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.main(LexicalizedParser.java:1400)
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-18-052e46a6f6aa> in <module>()
----> 1 result = parser.raw_parse('Here is an example sentence.')
~/anaconda3/envs/nlp/lib/python3.6/site-packages/nltk/parse/stanford.py in raw_parse(self, sentence, verbose)
132 :rtype: iter(Tree)
133 """
--> 134 return next(self.raw_parse_sents([sentence], verbose))
135
136 def raw_parse_sents(self, sentences, verbose=False):
~/anaconda3/envs/nlp/lib/python3.6/site-packages/nltk/parse/stanford.py in raw_parse_sents(self, sentences, verbose)
150 '-outputFormat', self._OUTPUT_FORMAT,
151 ]
--> 152 return self._parse_trees_output(self._execute(cmd, '\n'.join(sentences), verbose))
153
154 def tagged_parse(self, sentence, verbose=False):
~/anaconda3/envs/nlp/lib/python3.6/site-packages/nltk/parse/stanford.py in _execute(self, cmd, input_, verbose)
216 cmd.append(input_file.name)
217 stdout, stderr = java(cmd, classpath=self._classpath,
--> 218 stdout=PIPE, stderr=PIPE)
219
220 stdout = stdout.replace(b'\xc2\xa0', b' ')
~/anaconda3/envs/nlp/lib/python3.6/site-packages/nltk/__init__.py in java(cmd, classpath, stdin, stdout, stderr, blocking)
134 if p.returncode != 0:
135 print(_decode_stdoutdata(stderr))
--> 136 raise OSError('Java command failed : ' + str(cmd))
137
138 return (stdout, stderr)
OSError: Java command failed : ['/usr/bin/java', '-mx1000m', '-cp', '/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/stanford-parser-3.9.1-models.jar:/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/stanford-parser-3.9.1-javadoc.jar:/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/ejml-0.23.jar:/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/stanford-parser-3.9.1-sources.jar:/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/slf4j-api.jar:/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/stanford-parser-3.9.1-models.jar:/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/stanford-parser.jar:/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/slf4j-api-1.7.12-sources.jar', 'edu.stanford.nlp.parser.lexparser.LexicalizedParser', '-model', '/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/stanford-parser.jar', '-sentences', 'newline', '-outputFormat', 'conll2007', '-encoding', 'utf8', '/var/folders/kg/y1g8nszj77z0pm6mzplqv7580000gp/T/tmp93uyyya_']
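One way to read the failed command above is to pull out the value that follows -model. The sketch below does that on an abbreviated copy of the command list from the OSError, using a small helper named flag_value (a hypothetical name for illustration, not part of NLTK). It suggests that the model_path keyword received the parser jar rather than a model file, which would be consistent with LexicalizedParser reporting a "PK" zip header where it expected a BEGIN block. In nltk.parse.stanford the jar is normally passed as path_to_jar, so this is one plausible diagnosis rather than a confirmed one.

```python
# Abbreviated copy of the command list printed in the OSError above.
# The long classpath value and the temp input file are left out.
PARSER_HOME = '/Users/myname/Documents/nlp/stanford-parser-full-2018-02-27/'
failed_cmd = [
    '/usr/bin/java', '-mx1000m', '-cp', '<classpath omitted>',
    'edu.stanford.nlp.parser.lexparser.LexicalizedParser',
    '-model', PARSER_HOME + 'stanford-parser.jar',
    '-sentences', 'newline',
    '-outputFormat', 'conll2007',
    '-encoding', 'utf8',
]


def flag_value(cmd, flag):
    """Return the argument that follows `flag` in `cmd`, or None if absent."""
    if flag in cmd:
        index = cmd.index(flag)
        if index + 1 < len(cmd):
            return cmd[index + 1]
    return None


model_arg = flag_value(failed_cmd, '-model')
print(model_arg)
# A jar is a zip archive, and zip archives begin with the bytes "PK",
# which matches the "got PK..." text in the Java exception.
print(model_arg.endswith('.jar'))
```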
Solution
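After some digging, it appears that the StanfordDependencyParser class has been deprecated in NLTK, as the DeprecationWarning in the output above also says.
The new, improved way: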
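First, download the full CoreNLP package, then start the CoreNLP server by running the following command inside the downloaded folder (I chose port 9010). For you, that folder will look like the stanford-parser-full-2018-02-27 directory: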
$ java -mx1g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9010 -timeout 15000
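Before sending text to the parser, it can help to confirm that something is listening on the chosen port. The sketch below is a minimal check using only the standard library; server_is_listening is a hypothetical helper name, and the host and port defaults are assumptions taken from the command above. It only tests that a TCP connection succeeds, not that the process behind the port is actually CoreNLP.

```python
import socket


def server_is_listening(host='localhost', port=9010, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == '__main__':
    print(server_is_listening())
```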
Then run the following code:
from nltk.parse.corenlp import CoreNLPParser
# Use the port the server was started on (9010 in the command above).
parser = CoreNLPParser(url='http://localhost:9010')
next(
    parser.raw_parse('The quick brown fox sucks at jumping.')
).pretty_print()
                      ROOT
                        |
                        S
         _______________|________________
         |                VP            |
         |            ____|____         |
         |            |       PP        |
         |            |    ___|____     |
         |            |    |      S     |
         |            |    |      |     |
        NP            |    |     VP     |
 ________|________    |    |      |     |
DT   JJ    JJ   NN   VBZ  IN     VBG    .
 |    |     |    |    |    |      |     |
The quick brown fox sucks  at  jumping  .
Also, fun fact: once the server is running, you can navigate to http://localhost:9010 (or whichever port you chose) and get a nice little interface to play around with.
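The example above returns a constituency tree, while the question asked about dependencies. The same module also provides CoreNLPDependencyParser, so a dependency parse against the same server might look like the sketch below. The helper names dependency_triples and format_triples are hypothetical, and the URL assumes the server is still running on port 9010.

```python
def dependency_triples(sentence, url='http://localhost:9010'):
    """Parse one sentence via a running CoreNLP server and return its triples."""
    # Imported here so the formatting helper below works without NLTK installed.
    from nltk.parse.corenlp import CoreNLPDependencyParser

    parser = CoreNLPDependencyParser(url=url)
    graph = next(parser.raw_parse(sentence))  # an nltk DependencyGraph
    return list(graph.triples())


def format_triples(triples):
    """Render ((head, head_tag), relation, (dependent, dep_tag)) triples as lines."""
    return [
        '{}({}/{}, {}/{})'.format(rel, head, head_tag, dep, dep_tag)
        for (head, head_tag), rel, (dep, dep_tag) in triples
    ]


# Usage, with the CoreNLP server running on port 9010:
#   for line in format_triples(dependency_triples('Here is an example sentence.')):
#       print(line)
```

Keeping the formatting separate from the server call means the output layout can be changed or checked without a running server.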