NLTK Data installation issues

2024/9/27 20:23:38

I am trying to install NLTK Data on Mac OS X 10.9. According to the NLTK 3.0 documentation, the download directory for a central installation should be /usr/share/nltk_data. But with this path I get the error OSError: [Errno 13] Permission denied: '/usr/share/nltk_data'

Can I set the download directory to /Users/ananya/nltk_data for a central installation?

I have Python 2.7 installed on my machine.

Thanks, Ananya

Answer

Have you tried:

$ sudo python
>>> import nltk
>>> nltk.download()
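
If you don't have root access (or would rather not use sudo), a per-user download works too: NLTK searches ~/nltk_data by default, so a directory like /Users/ananya/nltk_data needs no extra configuration. A minimal sketch (downloading only 'wordnet' here is just an example):

>>> import nltk
>>> # download a single corpus into a user-writable directory
>>> nltk.download('wordnet', download_dir='/Users/ananya/nltk_data')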

To check that the download worked, try loading a few of the corpora you downloaded, e.g.:

>>> from nltk.corpus import wordnet
>>> wordnet.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]

If the corpora are not installed properly, you will see something like this:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/util.py", line 68, in __getattr__
    self.__load()
  File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/util.py", line 56, in __load
    except LookupError: raise e
LookupError:
**********************************************************************
  Resource 'corpora/wordnet' not found.  Please use the NLTK
  Downloader to obtain the resource:  >>> nltk.download()
  Searched in:
    - '/home/alvas/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************
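
Note the "Searched in:" list above: if your data lives in a directory that is not on that list, you can point NLTK at it before loading a corpus. A minimal sketch, assuming the user directory from the question (setting the NLTK_DATA environment variable has the same effect):

>>> import nltk
>>> # add the custom directory to NLTK's data search path
>>> nltk.data.path.append('/Users/ananya/nltk_data')
>>> from nltk.corpus import wordnet
>>> wordnet.synsets('dog')  # should now resolve against the custom path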