N-gram frequency in Python with NLTK

2024/9/20 17:23:06

I want to write a function that returns the frequency of each element in the n-grams of a given text. Help please. I wrote this code for counting the frequency of 2-grams.

code:

from nltk import FreqDist
from nltk.util import ngrams

def compute_freq():
    textfile = "please write a function"
    bigramfdist = FreqDist()
    threeramfdist = FreqDist()
    for line in textfile:
        if len(line) > 1:
            tokens = line.strip().split(' ')
            bigrams = ngrams(tokens, 2)
            bigramfdist.update(bigrams)
    return bigramfdist

bigramfdist = compute_freq()

Answer

I don't see an expected output in the question, so I assume this is what you need.

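import nltk

def compute_freq(sentence, n_value=2):
    # Split the sentence into word and punctuation tokens
    tokens = nltk.word_tokenize(sentence)
    # Build the n-grams and count how often each one occurs
    ngrams = nltk.ngrams(tokens, n_value)
    ngram_fdist = nltk.FreqDist(ngrams)
    return ngram_fdist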

By default, this function returns the frequency distribution of bigrams. For example,

text = "This is an example sentence."
freq_dist = compute_freq(text)

Now freq_dist would look like this:

FreqDist({('is', 'an'): 1, ('example', 'sentence'): 1, ('an', 'example'): 1, ('This', 'is'): 1, ('sentence', '.'): 1})

From here you can print the keys and values like so:

for k, v in freq_dist.items():
    print(k, v)

('is', 'an') 1
('example', 'sentence') 1
('an', 'example') 1
('This', 'is') 1
('sentence', '.') 1
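
The order in which the items print is not sorted by count, so it can help to list the most frequent n-grams first. FreqDist is a subclass of collections.Counter, so it provides most_common(). Here is a minimal sketch of that idea. It uses a plain Counter with made-up counts as a stand-in for the FreqDist, purely so that it runs without NLTK installed:

```python
from collections import Counter

# Hypothetical bigram counts, standing in for the FreqDist returned by compute_freq().
freq_dist = Counter({("is", "an"): 3, ("an", "example"): 2, ("This", "is"): 1})

# most_common(n) returns the n highest-count items as (ngram, count) pairs.
top_two = freq_dist.most_common(2)
for ngram, count in top_two:
    print(ngram, count)

# Relative frequency of each n-gram: its count divided by the total count.
total = sum(freq_dist.values())
relative = {ngram: count / total for ngram, count in freq_dist.items()}
```

The same most_common() call should work unchanged on the object returned by compute_freq().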

For anything other than bigrams, just change the 'n_value' argument when calling the function. For example,

freq_dist = compute_freq(text, n_value=3)  # will give you the trigram distribution

('example', 'sentence', '.') 1
('an', 'example', 'sentence') 1
('This', 'is', 'an') 1
('is', 'an', 'example') 1
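
The code in the question loops over lines and updates one distribution as it goes. Note that looping directly over a string yields single characters rather than lines, so the text has to be passed as a list of lines (or an open file). A rough sketch of the line-by-line approach follows. It deliberately swaps in the standard library (collections.Counter and zip in place of FreqDist and nltk.ngrams, and a whitespace split in place of word_tokenize) so it runs without NLTK. The name count_ngrams is hypothetical:

```python
from collections import Counter

def count_ngrams(lines, n=2):
    """Count n-grams over an iterable of text lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        # Whitespace split: punctuation stays attached to the neighbouring word.
        tokens = line.split()
        # Zipping n shifted copies of the token list yields consecutive n-tuples.
        # Lines with fewer than n tokens contribute nothing.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts

lines = ["please write a function", "please write a test"]
bigram_counts = count_ngrams(lines)
trigram_counts = count_ngrams(lines, n=3)
print(bigram_counts[("please", "write")])
```

Because each line is processed separately, no n-gram spans a line break. If n-grams across line boundaries are wanted, join the lines into one string first.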
