Question 1

I have two sets of WordNet synsets (contained in two separate list objects, s1 and s2). For each synset in s1, I want to find its maximum path similarity score against the synsets in s2, so that the output has the same length as s1. For example, if s1 contains 4 synsets, the output should have length 4.

I have experimented with the following code (so far):

import numpy as np
import nltk
from nltk.corpus import wordnet as wn
import pandas as pd

# two lists of WordNet synsets (s1, s2)
s1 = [wn.synset('be.v.01'),
      wn.synset('angstrom.n.01'),
      wn.synset('trial.n.02'),
      wn.synset('function.n.01')]
s2 = [wn.synset('use.n.01'),
      wn.synset('function.n.01'),
      wn.synset('check.n.01'),
      wn.synset('code.n.01'),
      wn.synset('inch.n.01'),
      wn.synset('be.v.01'),
      wn.synset('correct.v.01')]

# define a function to find the highest path similarity score for each synset in s1 onto s2, with the length of output equal that of s1
ps_list = []
def similarity_score(s1, s2):
    for word1 in s1:
        best = max(wn.path_similarity(word1, word2) for word2 in s2)
        ps_list.append(best)
    return ps_list

similarity_score(s1, s2)

But it raises the following error:

'>' not supported between instances of 'NoneType' and 'float'

I couldn't figure out what's going on with the code. Would anyone care to take a look and share their insights on the for loop? It would be really appreciated.

Thank you.

The full error traceback is here

TypeError                                 Traceback (most recent call last)
<ipython-input-73-4506121e17dc> in <module>()
38     return word_list
39 
---> 40 s = similarity_score(s1, s2)
41 
42 
<ipython-input-73-4506121e17dc> in similarity_score(s1, s2)
33 def similarity_score(s1, s2):
34     for word1 in s1:
---> 35         best = max(wn.path_similarity(word1, word2) for word2 in s2)
36         word_list.append(best)
37 
TypeError: '>' not supported between instances of 'NoneType' and 'float'

[edit] I came up with this temporary solution:

s_list = []
for word1 in s1:
    best = [word1.path_similarity(word2) for word2 in s2]
    b = pd.Series(best).max()
    s_list.append(b)

It's not elegant, but it works. I wonder if anyone has a better solution or handy tricks to handle this?

Answer 1

I have no experience with the nltk module, but from reading the docs I can see that path_similarity is a method of whatever object wn.synset(args) returns. You are instead treating it as a function.

What you should be doing is something like this:

ps_list = []
for word1 in s1:
    best = max(word1.path_similarity(word2) for word2 in s2)  # path_similarity is a method of each synset
    ps_list.append(best)
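Note that the TypeError in the question is consistent with max() receiving a None among the scores: NLTK documents that path_similarity returns None when no path connects two synsets, which can happen across parts of speech (for example a verb and a noun). The following is a minimal sketch of one way to guard against that, not a definitive fix. The function name best_path_similarities and the default parameter are hypothetical, introduced here for illustration; the only assumption about the inputs is that each element of s1 exposes a path_similarity(other) method, as Synset objects do.

```python
def best_path_similarities(s1, s2, default=None):
    """For each item in s1, return its highest path similarity to any item in s2.

    Hypothetical helper: pairs whose path_similarity is None are skipped.
    If no pair yields a score for an item, `default` is used instead,
    so the output always has the same length as s1.
    """
    results = []
    for word1 in s1:
        # Collect only the scores that are actual numbers.
        valid = [
            score
            for score in (word1.path_similarity(word2) for word2 in s2)
            if score is not None
        ]
        # max() with `default` avoids a ValueError on an empty list.
        results.append(max(valid, default=default))
    return results
```

With the s1 and s2 from the question, the call would be best_path_similarities(s1, s2). The pandas workaround in the question's edit behaves similarly because Series.max() skips missing values by default.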

Iterate one list of synsets over another
