How do I count the unique words in text files in a specific directory with Python? [closed]

2024/7/7 6:26:57

I'm writing a report and I need to count the unique words in some text files.

My texts are in D:\shakeall, and there are 42 files in total.

I know a little Python, but I don't know where to go from here.

This is how I think it should work:

  1. Read the files in the directory

  2. Build a list of words from the texts

  3. Count the total and unique words

That is all I know, plus a little about for and while loops, lists and indexes, and variables.

What I want to do is write my own function library and use it to get the result.

I'd really appreciate any advice on my question.

------ P.S.

I really know almost nothing about Python. All I can do is simple math or printing the words in a list. The given topic is too hard for me. Sorry.

Answer
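# Read the whole file and split on whitespace to get one flat list of words
with open('somefile.txt', 'r') as textfile:
    text_list = textfile.read().split()
# A set keeps a single copy of each word, so its length is the unique count
unique_words = set(text_list)
print(len(unique_words))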

That's the general gist of it for a single file.
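To cover a whole directory and keep the pieces in reusable functions, as the question asks, something along these lines might work. It is only a sketch: the function names `read_words` and `count_words` are made up for illustration, and it assumes the files end in `.txt`, are roughly UTF-8 encoded, and that a "word" is a run of English letters (with optional apostrophes) compared case-insensitively. Adjust those assumptions to fit your texts.

```python
import re
from pathlib import Path

# Assumption: a word is a run of letters, optionally joined by apostrophes
# (so "don't" counts as one word). Change this pattern if you need another rule.
WORD_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)*")


def read_words(path):
    """Return the list of lowercased words found in one text file."""
    # errors="ignore" skips bytes that are not valid UTF-8 instead of crashing
    text = Path(path).read_text(encoding="utf-8", errors="ignore")
    return WORD_PATTERN.findall(text.lower())


def count_words(directory, pattern="*.txt"):
    """Return (total_words, unique_words) over all matching files in directory."""
    total = 0
    unique = set()
    for path in sorted(Path(directory).glob(pattern)):
        words = read_words(path)
        total += len(words)   # every occurrence counts toward the total
        unique.update(words)  # the set keeps one copy of each distinct word
    return total, len(unique)
```

With the functions saved in a module of your own, `total, unique = count_words(r"D:\shakeall")` followed by `print(total, unique)` would report both numbers for the directory from the question. Using a set avoids checking every word against a growing list, which gets slow for long texts.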

https://en.xdnf.cn/q/119642.html
