Search engine using python for bookmarked sites [closed]

2024/11/17 11:37:57

The idea I have is to build a search engine based on my bookmarks file which I have in CSV format.

The motivation behind this idea is that I have a large number of bookmarks related to the educational resources which I want to be able to search and find related content for a particular topic or subject.

I am not a very good programmer(I can write simple programs in c++ and java) and have recently started learning python.

Is the implementation of such project possible in one month?

I have searched and found that a CSV module exists in python language and the only idea I can get is from the udacity CS101 course of building a search engine using python.

My question is whether this is possible and where to start ?

Answer

I implemented a search engine both in Perl and Python at work. The first one was put together for a production problem in great haste and took 2 hours to build, from concept to run. I want to open-source the final version, but not sure where to start since it was work done for employer. Anyhow, here's the algorithm:

st={} #dictonary for search engine tree
for bokm in bookmarks:bokm=re.sub('\W_',' ',bokm).toupper() #filter out junk charsct = st;   #cursor for traversing and building our treefor c in bokm.split():if not ct[c]: ct[c]={}ct = ct[c]

At this point you have a dict-tree of chars that comprise your bookmarks. It will only find matches from beginning of bookmark, you can modify the algorithm to hash bookmarks starting from any word instead. Make sure to pprint.pprint(st) to see the beauty of it for yourself.

So let's say you are searching now and typed the word "dog":

def search(word, st):word=re.sub('\W_',' ',word).toupper() #pass word through same filter!ct = st #init our cursorfor c in word.split():try:ct = ct[c]     #traverse the treeexcept KeyError:return False    #pattern diverged, no matchreturn True #run out of word chars and every character matched. Found a match!

You can pretty much plug this in and start using. It does not return WHICH pattern it matched, you need to record that at the ends of search tree branches and recursively traverse the subtree after the last search word character to print all bookmarks that matched.

PS: There are many possible ways to implement word search. The beauty of this method is that it find matches almost instantly, always, regardless of the size of your bookmarks file. The second benefit is that search() can be modified to show you results as you type, with each key press, because it traverses our bookmark tree character by character, and it will do it instantaneously.

https://en.xdnf.cn/q/120211.html

Related Q&A

How to extract only particular set of structs from a file between braces in python

a. Have a scenario, where in my function reads in a file which contains list of c-structures as shown below, reads the file and extracts all the information between { } braces for each structure and st…

Selection of Face of a STL by Face Normal value Threshold

I want to write a script in Python which can generate facegroups in a STL as per the Face Normal value condition. For example, Provided is the snap of Stl, Different colour signifies the face group con…

Python , Changing a font size of a string variable

I have a variable that gets sent to a email as text but the text is all pretty much a standard size with everything the same. I would like to add some emphasis to it as well as make it bigger and make …

Python http.server command gives Syntax Error [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.This question was caused by a typo or a problem that can no longer be reproduced. While similar q…

If statement to check if the value of a variable is in a JSON file [closed]

Closed. This question needs debugging details. It is not currently accepting answers.Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to repro…

Assign variables to a pandas dataframe when specific cell is empty

I am assigning some variables to values from a data frame. The data frame created using this code data = [[tom, 10], ["", 15], [juli, 14]] df = pd.DataFrame(data, columns=[Name, Age])So after…

Try Except for one variable in multiple variables

I am reading every row in a dataframe and assigning its values in each column to the variables The dataframe created using this code data = [[tom, 10], [, 15], [juli, 14]] df = pd.DataFrame(data, colum…

Print method invalid syntax reversing a string [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.This question was caused by a typo or a problem that can no longer be reproduced. While similar q…

How QRegularExpression can be passed to Qt::MatchRegularExpression

I am trying this sample code that I found which is really really good. I am also trying to figure out the same thing to find an item and scroll to it, but this time I wanted to match the the string whi…

Python list rearrange the elements in a required order

My main list has 44 names as elements. I want to rearrange them in a specific order. I am giving here an example. Note that elements in my actual list are some technical names. No way related to what I…