How to read a FASTA file in Python?

2024/11/15 14:57:08

I'm trying to read a FASTA file, find a specific motif (a string pattern) in it, and print out the sequence and the number of times the motif occurs. A FASTA file is just a series of sequences (strings), each starting with a header line; the signature of a header, and so of the start of a new sequence, is ">". The sequence of letters begins on a new line immediately after the header. I'm not done with the code, but so far I have the code below, and it gives me this error:

AttributeError: 'str' object has no attribute 'next'

I'm not sure what's wrong here.

import re
header=""
counts=0
newline=""
f1=open('fpprotein_fasta(2).txt','r')
f2=open('motifs.xls','w')
for line in f1:
    if line.startswith('>'):
        header=line
        #print header
        nextline=line.next()
        for i in nextline:
            motif="ML[A-Z][A-Z][IV]R"
            if re.findall(motif,nextline):
                counts+=1
        #print (header+'\t'+counts+'\t'+motif+'\n')
        fout.write(header+'\t'+counts+'\t'+motif+'\n')
f1.close()
f2.close()

Answer

The error is likely coming from the line:

nextline=line.next()

line is the string you have already read from the file, and a string has no next() method.

Part of the problem is that you are trying to mix two different ways of reading the file: iterating over the lines with for line in f1, and fetching lines explicitly with <handle>.next().
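One way to avoid mixing the two styles is to iterate over the file exactly once and collect the sequence lines that follow each header. The following is only a minimal sketch of that idea, not the original poster's final code: the function names read_fasta and count_motif and the in-memory example records are made up for illustration, and the motif is the one from the question. It also allows a sequence to span several lines, which FASTA files often do.

```python
import re

# Motif taken from the question; compiled once and reused.
MOTIF = re.compile(r"ML[A-Z][A-Z][IV]R")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def count_motif(lines, pattern=MOTIF):
    """Return (header, count) for every record that contains the motif."""
    results = []
    for header, sequence in read_fasta(lines):
        count = len(pattern.findall(sequence))  # non-overlapping matches
        if count:
            results.append((header, count))
    return results


# Hypothetical example data; with a real file, pass the open file object.
example = """>seq1 test protein
MLAAIRGG
MLKKVR
>seq2 no motif here
GGGGGG
""".splitlines()

for header, count in count_motif(example):
    print(f"{header}\t{count}\t{MOTIF.pattern}")
```

With a real file, the same function can be given the open handle directly, for example with open('fpprotein_fasta(2).txt') as f1: results = count_motif(f1). If you do want to pull the next line by hand, call it on the file handle rather than on the string: next(f1) in Python 3, or f1.next() in Python 2.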

Also, if you are working with FASTA files, I recommend using Biopython: it makes working with collections of sequences much easier. Chapter 14 on motifs will be of particular interest to you. This will likely require learning more Python to achieve what you want, but if you are going to do a lot more bioinformatics than your example shows, it is definitely worth the investment of time.
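As a rough illustration of the Biopython route, the sketch below parses the records with SeqIO.parse and counts the motif with a plain regular expression. It assumes Biopython is installed; the function name count_motif_biopython is hypothetical, and this does not use Biopython's own motif tools, only its FASTA parser.

```python
import re

# Motif taken from the question.
MOTIF = re.compile(r"ML[A-Z][A-Z][IV]R")


def count_motif_biopython(handle, pattern=MOTIF):
    """Return {record id: motif count} for a FASTA file handle or filename."""
    # Imported here so the sketch can be loaded even without Biopython.
    from Bio import SeqIO

    return {
        record.id: len(pattern.findall(str(record.seq)))
        for record in SeqIO.parse(handle, "fasta")
    }
```

Calling count_motif_biopython('fpprotein_fasta(2).txt') would then return one count per record, keyed by the record id (the first word of each header line).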
