Iterate one list of synsets over another

2024/10/14 2:22:32

I have two sets of WordNet synsets (contained in two separate list objects, s1 and s2). For each synset in s1, I want to find the maximum path similarity score against the synsets in s2, so that the output has the same length as s1. For example, if s1 contains 4 synsets, then the output should have length 4.

I have experimented with the following code (so far):

import numpy as np
import nltk
from nltk.corpus import wordnet as wn
import pandas as pd

# two lists of WordNet synsets (s1, s2)
s1 = [wn.synset('be.v.01'),
      wn.synset('angstrom.n.01'),
      wn.synset('trial.n.02'),
      wn.synset('function.n.01')]
s2 = [wn.synset('use.n.01'),
      wn.synset('function.n.01'),
      wn.synset('check.n.01'),
      wn.synset('code.n.01'),
      wn.synset('inch.n.01'),
      wn.synset('be.v.01'),
      wn.synset('correct.v.01')]

# define a function to find the highest path similarity score for each
# synset in s1 onto s2, with the length of output equal to that of s1
ps_list = []
def similarity_score(s1, s2):
    for word1 in s1:
        best = max(wn.path_similarity(word1, word2) for word2 in s2)
        ps_list.append(best)
    return ps_list

similarity_score(s1, s2)

But it raises the following error:

'>' not supported between instances of 'NoneType' and 'float'

I couldn't figure out what's going on with the code. Would anyone care to take a look and share their insights on the for loop? It would be really appreciated.

Thank you.

The full error traceback is here:

TypeError                                 Traceback (most recent call last)
<ipython-input-73-4506121e17dc> in <module>()
38     return word_list
39 
---> 40 s = similarity_score(s1, s2)
41 
42 
<ipython-input-73-4506121e17dc> in similarity_score(s1, s2)
33 def similarity_score(s1, s2):
34     for word1 in s1:
---> 35         best = max(wn.path_similarity(word1, word2) for word2 in s2)
36         word_list.append(best)
37 
TypeError: '>' not supported between instances of 'NoneType' and 'float'

[edit] I came up with this temporary solution:

s_list = []
for word1 in s1:
    best = [word1.path_similarity(word2) for word2 in s2]
    b = pd.Series(best).max()
    s_list.append(b)

It's not elegant, but it works. I wonder if anyone has a better solution or handy tricks to handle this?

Answer

I have no experience with the nltk module, but from reading the docs I can see that path_similarity is a method of whatever object wn.synset(args) returns. You are instead treating it as a function.

What you should be doing is something like this:

ps_list = []
for word1 in s1:
    # path_similarity is a method of each synset
    best = max(word1.path_similarity(word2) for word2 in s2)
    ps_list.append(best)
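One thing to watch for: the error message suggests that some of the pairwise scores are None, and the path_similarity docstring does say it returns None when no connecting path is found. If that is what is happening here, max() will still fail on a mix of None and float values, whichever call style is used. Below is a minimal sketch of one way to skip the None scores. The names max_similarity_scores and toy_similarity are made up for illustration and are not part of nltk; toy_similarity only stands in for the real scoring call so the example runs without the WordNet data.

```python
def max_similarity_scores(s1, s2, similarity):
    """For each item in s1, return its highest non-None score against s2.

    `similarity` is any callable taking two items and returning a float
    or None. The result has the same length as s1; an entry is None when
    every comparison for that item returned None.
    """
    results = []
    for a in s1:
        scores = (similarity(a, b) for b in s2)
        # Drop the None scores so max() only ever compares floats.
        valid = [s for s in scores if s is not None]
        results.append(max(valid, default=None))
    return results


# Hypothetical stand-in for a synset similarity call: items are
# (word, pos) tuples, and items with different pos tags give None.
def toy_similarity(a, b):
    if a[1] != b[1]:
        return None
    return 1.0 if a == b else 0.5


s1_demo = [("be", "v"), ("trial", "n")]
s2_demo = [("use", "n"), ("be", "v")]
demo_scores = max_similarity_scores(s1_demo, s2_demo, toy_similarity)
```

With real synsets, the same helper could presumably be called as max_similarity_scores(s1, s2, lambda a, b: a.path_similarity(b)). Returning None for an item with no valid score keeps the output the same length as s1, which is what the question asks for; substituting 0.0 instead would be an equally reasonable choice.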