Fastest way to extract tar files using Python

2024/7/7 6:01:29

I have to extract hundreds of tar.bz files each with size of 5GB. So tried the following code:

import tarfile
from multiprocessing import Poolfiles = glob.glob('D:\\*.tar.bz') ##All my files are in D
for f in files:tar = tarfile.open (f, 'r:bz2')pool = Pool(processes=5)pool.map(tar.extractall('E:\\') ###I want to extract them in Etar.close()

But the code has type error: TypeError: map() takes at least 3 arguments (2 given)

How can I solve it? Any further ideas to accelerate extracting?

Answer

Define a function that extract a single tar file. Pass that function and a tar file list to multiprocessing.Pool.map:

import glob
import tarfile
from functools import partial
from multiprocessing import Pooldef extract(path, dest):with tarfile.open(path, 'r:bz2') as tar:tar.extractall(dest)if __name__ == '__main__':files = glob.glob('D:\\*.tar.bz')pool = Pool(processes=5)pool.map(partial(extract, dest='E:\\'), files)
https://en.xdnf.cn/q/119937.html

Related Q&A

Python - Split a string but keep contiguous uppercase letters [duplicate]

This question already has answers here:Splitting on group of capital letters in python(3 answers)Closed 3 years ago.I would like to split strings to separate words by capital letters, but if it contain…

Python: Find a Sentence between some website-tags using regex

I want to find a sentence between the ...class="question-hyperlink"> tags. With this code:import urllib2 import reresponse = urllib2.urlopen(https://stackoverflow.com/questions/tagged/pyth…

How to download all the href (pdf) inside a class with python beautiful soup?

I have around 900 pages and each page contains 10 buttons (each button has pdf). I want to download all the pdfs - the program should browse to all the pages and download the pdfs one by one. Code only…

Reducing the complexity/computation time for a basic graph formula [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.Want to improve this question? Add details and clarify the problem by editing this post.Closed 4 years ago.Improve…

Find All Possible Fixed Size String Python

Problem: I want to generate all possible combination from 36 characters that consist of alphabet and numbers in a fixed length string. Assume that the term "fixed length" is the upper bound f…

What is the concept of namespace when importing a function from another module?

main.py:from module1 import some_function x=10 some_function()module1.py:def some_function():print str(x)When I execute the main.py, it gives an error in the moduel1.py indicating that x is not availab…

How to pass a literal value to a kedro node? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.Want to improve this question? Add details and clarify the problem by editing this post.Closed 4 years ago.This po…

How to Loop a List and Extract required data (Beautiful Soup)

I need help in looping a list and extracting the src links. This is my list and the code: getimages = getDetails.find_all(img) #deleting the first image in the list getimages[0].decompose() print(getim…

square root without pre-defined function in python

How can one find the square root of a number without using any pre-defined functions in python?I need the main logic of how a square root of a program works. In general math we will do it using HCF bu…

How do I sort a text file by three columns with a specific order to those columns in Python?

How do I sort a text file by three columns with a specific order to those columns in Python?My text_file is in the following format with whitespaces between columns:Team_Name Team_Mascot Team_Color Te…