Printing files based on character

2024/11/13 9:28:39

I have a directory (data) that contains thousands of files. Each time, I want to select three files that differ by only one character, AB[C,D,E], and later perform some computation on the three selected files.

The files in the directory look like this:

DT.ABC.2007.182.144018.txt
DT.ABD.2007.182.144018.txt
DT.ABE.2007.182.144018.txt

DT.ABC.2001.005.1444.txt
DT.ABD.2001.005.1444.txt
DT.ABE.2001.005.1444.txt

DT.ABC.2003.005.1244.txt
DT.ABD.2003.005.1244.txt
DT.ABE.2003.005.1244.txt

At first I want to print:

DT.ABC.2007.182.144018.txt
DT.ABD.2007.182.144018.txt
DT.ABE.2007.182.144018.txt

then:

DT.ABC.2001.005.1444.txt
DT.ABD.2001.005.1444.txt
DT.ABE.2001.005.1444.txt

and the same process should continue until all the files in the directory have been read.

I tried the code below:

import glob
for file in glob.glob('/data/*.txt'):
    print(file)

But it prints all the files in no particular order instead of printing together the three files that differ only by the [C,D,E] character. I hope experts may help me. Thanks in advance.

Answer

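Here is a simple function which lists files and groups them by the first and third dot-separated components of the file name (for DT.ABC.2007.182.144018.txt, matched without a directory prefix, the key is DT.2007).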

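import glob
from collections import defaultdict

def groupfiles(pattern):
    files = glob.glob(pattern)
    # Map each key (first and third components) to the files that share it
    filedict = defaultdict(list)
    for file in files:
        parts = file.split(".")
        filedict[".".join([parts[0], parts[2]])].append(file)
    # Hand back one group of files per iteration
    for filegroup in filedict.values():
        yield filegroup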

This groups the files and returns one list of files at a time. yield is a keyword which turns the function into a generator; you can think of it as a sort of replacement for return, except that the function continues where it left off after the previous yield instead of running from the start the next time a value is requested. Because a group is simply every file that shares a key, the limit of three files at a time is not hard-coded.
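To see the generator behaviour described above without touching the file system, here is a minimal sketch. The name group_names and the in-memory list of names are assumptions made for illustration (they are not part of the answer); the grouping key is the same first-and-third-component key.

```python
from collections import defaultdict


def group_names(names):
    """Yield lists of names sharing the first and third dot-separated fields.

    Hypothetical helper: same key as groupfiles, but it takes the names
    directly instead of calling glob.
    """
    groups = defaultdict(list)
    for name in names:
        parts = name.split(".")
        groups[".".join([parts[0], parts[2]])].append(name)
    # Each yield hands back one group and pauses here until the next request.
    for group in groups.values():
        yield group


names = [
    "DT.ABC.2007.182.144018.txt",
    "DT.ABD.2007.182.144018.txt",
    "DT.ABE.2007.182.144018.txt",
    "DT.ABC.2001.005.1444.txt",
    "DT.ABD.2001.005.1444.txt",
    "DT.ABE.2001.005.1444.txt",
]

gen = group_names(names)
first = next(gen)      # runs the function body up to the first yield
remaining = list(gen)  # resumes after that yield and collects the rest
print(first)
print(remaining)
```

In practice you would normally just write a for loop over the generator; next() is only used here to make the pause-and-resume step visible.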

Demo: https://ideone.com/w2Sf80
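One thing to watch: a key built from the first and third components only distinguishes groups by year, so two recordings from the same year would end up in one group, and a dot in the directory part of the path would shift the components. If that matters for your data, a possible variant (a sketch only, with the hypothetical name group_ignoring_second_field) is to key on every component of the base name except the second:

```python
import os
from collections import defaultdict


def group_ignoring_second_field(paths):
    """Group paths whose base names differ only in the second dot-separated field.

    Hypothetical variant of the grouping above, not the answer's function.
    """
    groups = defaultdict(list)
    for path in paths:
        # Use the base name so dots in directory names do not shift the fields.
        parts = os.path.basename(path).split(".")
        # Key on every field except the second one (ABC / ABD / ABE).
        key = ".".join(parts[:1] + parts[2:])
        groups[key].append(path)
    # Sort keys and members so the output order is reproducible.
    return [sorted(groups[key]) for key in sorted(groups)]


paths = [
    "/data/DT.ABD.2007.183.090000.txt",  # made-up names for illustration
    "/data/DT.ABC.2007.182.144018.txt",
    "/data/DT.ABC.2007.183.090000.txt",
    "/data/DT.ABD.2007.182.144018.txt",
]
for group in group_ignoring_second_field(paths):
    print(group)
```

This returns a sorted list rather than a generator, which trades laziness for a predictable order; glob itself makes no promise about the order of its results.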
