Similarity between two text documents in Python

2024/7/4 15:55:30

You are provided with four documents, numbered 1 to 4, each with a single sentence of text. Determine the identifier of the document which is the most similar to the first document, as computed according to the TF-IDF scores.

My name is Ankit,
Ankit name is very famous,
Ankit like his name
India has a lot of beautiful cities

Output the integer (which may be either 2 or 3 or 4), leaving no leading or trailing spaces.

Answer
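from sklearn.feature_extraction.text import TfidfVectorizer

documents = [
    "My name is Ankit",
    "Ankit name is very famous",
    "Ankit like his name",
    "India has a lot of beautiful cities",
]

vect = TfidfVectorizer(min_df=1)
tfidf = vect.fit_transform(documents)

# TF-IDF rows are L2-normalised by default, so this product is the
# pairwise cosine-similarity matrix. Row 0 compares document 1 with all
# four documents; its largest off-diagonal entry marks the closest one.
print((tfidf * tfidf.T).toarray())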
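
The matrix still has to be reduced to the single integer the question asks for. The following is a minimal sketch of that step in plain Python, with no scikit-learn dependency. It assumes the weighting that TfidfVectorizer uses by default (lowercasing, tokens of two or more word characters, smoothed IDF, L2 normalisation). The helper names tokenize, tfidf_vectors and cosine are hypothetical and were chosen for this illustration only.

```python
import math
import re
from collections import Counter

documents = [
    "My name is Ankit",
    "Ankit name is very famous",
    "Ankit like his name",
    "India has a lot of beautiful cities",
]


def tokenize(text):
    # Lowercase, then keep tokens of two or more word characters
    # (assumed to mirror TfidfVectorizer's default token pattern).
    return re.findall(r"\b\w\w+\b", text.lower())


def tfidf_vectors(docs):
    counts = [Counter(tokenize(doc)) for doc in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for count in counts for term in count)
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = []
    for count in counts:
        vec = {t: tf * idf[t] for t, tf in count.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        # L2-normalise so that a dot product equals cosine similarity.
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(u, v):
    # Dot product of two sparse, unit-length vectors stored as dicts.
    return sum(w * v.get(t, 0.0) for t, w in u.items())


vectors = tfidf_vectors(documents)
# Score documents 2..4 against document 1 (identifiers are 1-based).
scores = {i + 1: cosine(vectors[0], vectors[i]) for i in range(1, len(vectors))}
most_similar = max(scores, key=scores.get)
print(most_similar)
```

Document 2 shares three terms with document 1 ("name", "is", "ankit"), document 3 shares two, and document 4 shares none, so the script prints 2.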
https://en.xdnf.cn/q/120745.html
