NLTK was unable to find the java file! for Stanford POS Tagger

2024/11/16 15:19:13

I have been stuck trying to get the Stanford POS Tagger to work for a while. From an old SO post I found the following (slightly modified) code:

stanford_dir = 'C:/Users/.../stanford-postagger-2017-06-09/'

from nltk.tag import StanfordPOSTagger
#from nltk.tag.stanford import StanfordPOSTagger # I tried it both ways
from nltk import word_tokenize

# Add the jar and model via their path (instead of setting environment variables):
jar = stanford_dir + 'stanford-postagger.jar'
model = stanford_dir + 'models/english-left3words-distsim.tagger'

pos_tagger = StanfordPOSTagger(model, jar, encoding='utf8')

text = pos_tagger.tag(word_tokenize("What's the airspeed of an unladen swallow ?"))
print(text)

However, I get the following error:

LookupError: ===========================================================================
NLTK was unable to find the java file!
Use software specific configuration paramaters or set the JAVAHOME environment variable.
===========================================================================

I don't know what java file it is talking about. I'm sure it's finding the right files because if I change the path to something incorrect I get a different error:

LookupError: Could not find stanford-postagger.jar jar file at C:/Users/.../stanford-postagger-2017-06-09/sstanford-postagger.jar

What java file is missing? How do I get the Stanford POS tagger to work?

EDIT:

I went to this link for Stanford NLP on Windows and tried:

(Second EDIT - adding the installation procedures):

import urllib.request
import zipfile
urllib.request.urlretrieve(r'http://nlp.stanford.edu/software/stanford-postagger-full-2015-04-20.zip', r'C:/Users/HMISYS/Downloads/stanford-postagger-full-2015-04-20.zip')
zfile = zipfile.ZipFile(r'C:/Users/HMISYS/Downloads/stanford-postagger-full-2015-04-20.zip')
zfile.extractall(r'C:/Users/HMISYS/Downloads/')
# End second edit

import nltk  # needed for nltk.word_tokenize below
from nltk.tag.stanford import StanfordPOSTagger
# Trying on an older version
_model_filename = r'C:/Users/HMISYS/Downloads/stanford-postagger-full-2015-04-20/models/english-bidirectional-distsim.tagger'
_path_to_jar = r'C:/Users/HMISYS/Downloads/stanford-postagger-full-2015-04-20/stanford-postagger.jar'
st = StanfordPOSTagger(model_filename=_model_filename, path_to_jar=_path_to_jar)
text = st.tag(nltk.word_tokenize("What's the airspeed of an unladen swallow ?"))
print(text)

but I got the same error. Based on this post, I set my environment variables as follows:

set STANFORDTOOLSDIR=$HOME
set CLASSPATH=$STANFORDTOOLSDIR/stanford-postagger-full-2015-04-20/stanford-postagger.jar
set export STANFORD_MODELS=$STANFORDTOOLSDIR/stanford-postagger-full-2015-04-20/models

But I get this error:

NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH environment variable.

Answer

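The "java file" in the error is the Java executable itself, not the tagger jar. I added the following lines to my code, pointing JAVAHOME at java.exe, and it worked: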

import os
java_path = "C:/Program Files/Java/jdk1.8.0_161/bin/java.exe"
os.environ['JAVAHOME'] = java_path
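
The hard-coded path above only works on a machine with that exact JDK installed. As a rough sketch (not part of the original answer), the same idea can fall back to whatever java is on PATH. The helper name resolve_java_path is hypothetical, and the install location is just the one from the snippet above; yours will probably differ.

```python
import os
import shutil


def resolve_java_path(explicit_path=None):
    """Return a path to the Java executable, or None if none is found.

    An explicit path is used if it points to an existing file;
    otherwise fall back to searching PATH for "java".
    """
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path
    return shutil.which("java")


# Assumed install location (copied from the answer); adjust to your own JDK/JRE.
java_path = resolve_java_path("C:/Program Files/Java/jdk1.8.0_161/bin/java.exe")

if java_path is not None:
    # Same environment variable the answer sets for NLTK.
    os.environ["JAVAHOME"] = java_path
else:
    print("No Java executable found; install a JRE/JDK or pass its path explicitly.")
```

Running this before creating the StanfordPOSTagger should be the safest ordering, so the variable is already set when NLTK first looks for Java.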