Creating an h5 file for storing a dataset to train a super-resolution GAN

2024/7/6 22:46:32

I am trying to create an h5 file to store a dataset for training a super-resolution GAN, where each training pair consists of a low-resolution (LR) and a high-resolution (HR) image. The dataset will contain the data in the following manner: [[LR1,HR1],[LR2,HR2],...[LRn,HRn]]. I have 256x256 RGB images for HR and 128x128 RGB images for LR. I am unsure about the best way to store this in an h5 file. Also, should I scale the images by 255 before storing them in the h5 file?

I have written the following code to do so. Any help or suggestions would be highly appreciated.

import h5py
import numpy as np
import os
import cv2
import glob

def store_super_resolution_dataset_in_h5_file(path_to_LR,path_to_HR):
    '''This function takes the files with the same name from LR and HR folders and stores the new dataset in h5 format'''
    #create LR and HR image lists
    LR_images = glob.glob(path_to_LR+'*.jpg')
    HR_images = glob.glob(path_to_HR+'*.jpg')
    #sort the lists
    LR_images.sort()
    HR_images.sort()
    print('LR_images: ',LR_images)
    print('HR_images: ',HR_images)
    #create a h5 file
    h5_file = h5py.File('super_resolution_dataset.h5','w')
    #create a dataset in the h5 file
    dataset = h5_file.create_dataset('super_resolution_dataset',(len(LR_images),2,256,256),dtype='f')
    #store the images in the dataset
    for i in range(len(LR_images)):
        LR_image = cv2.imread(LR_images[i])
        HR_image = cv2.imread(HR_images[i])
        dataset[i,0,:,:] = LR_image
        dataset[i,1,:,:] = HR_image
    #close the h5 file
    h5_file.close()

Answer

There are two code segments below. The first shows my recommended method: loading the high-resolution and low-resolution images into separate datasets to reduce the HDF5 file size. The second simply corrects the errors in your code (and is modified to use the with/as context manager). Both code segments begin after the #create a h5 file comment.

I ran a test with 43 images to compare resulting file sizes. Results are:

  • 1 dataset size = 66.0 MB
  • 2 dataset size = 41.3 MB (37% reduction)
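
The measured reduction is close to what the array shapes alone predict. A quick sketch of that arithmetic, assuming both layouts use the same dtype and ignoring any HDF5 overhead or compression (the variable names are illustrative only):

```python
# Elements stored per LR/HR pair in each layout.
# The dtype size cancels out in the ratio, so only element counts matter.
hr_elems = 256 * 256 * 3   # one 256x256 3-channel image
lr_elems = 128 * 128 * 3   # one 128x128 3-channel image

one_dataset = 2 * hr_elems           # LR image padded into a 256x256 slot
two_datasets = hr_elems + lr_elems   # each image stored at its own size

reduction = 1 - two_datasets / one_dataset
print(f"expected reduction: {reduction:.1%}")   # expected reduction: 37.5%
```

The measured figures give 1 - 41.3/66.0, or about 37.4%, which agrees with this estimate.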

Recommended method using 2 datasets:

# get image dtypes and create a h5 file
LR_dt = cv2.imread(LR_images[0]).dtype
HR_dt = cv2.imread(HR_images[0]).dtype
with h5py.File('low_hi_resolution_dataset.h5','w') as h5_file:
    #create 2 datasets for LR and HR images in the h5 file
    lr_ds = h5_file.create_dataset('low_res_dataset',(len(LR_images),128,128,3),dtype=LR_dt)
    hr_ds = h5_file.create_dataset('hi_res_dataset',(len(LR_images),256,256,3),dtype=HR_dt)
    #store the images in the dataset
    for i in range(len(LR_images)):
        LR_image = cv2.imread(LR_images[i])
        HR_image = cv2.imread(HR_images[i])
        lr_ds[i] = LR_image
        hr_ds[i] = HR_image
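
On the scaling question: storing the images in their original dtype (uint8 for typical JPEGs read with cv2.imread) keeps the file small, and the division by 255 can be done when a batch is read instead. A minimal sketch of that idea, assuming the two-dataset layout and dataset names used here. The file is filled with random stand-in data so the snippet runs on its own, and load_batch is a hypothetical helper, not part of h5py:

```python
import os
import tempfile

import h5py
import numpy as np

def load_batch(h5_path, start, stop):
    """Read pairs [start, stop) and scale them to float32 in [0, 1]."""
    with h5py.File(h5_path, 'r') as h5_file:
        lr = h5_file['low_res_dataset'][start:stop]
        hr = h5_file['hi_res_dataset'][start:stop]
    return lr.astype(np.float32) / 255.0, hr.astype(np.float32) / 255.0

# Build a tiny stand-in file with random uint8 data
# (assumption: the real images are stored as uint8).
rng = np.random.default_rng(0)
h5_path = os.path.join(tempfile.mkdtemp(), 'low_hi_resolution_dataset.h5')
with h5py.File(h5_path, 'w') as h5_file:
    h5_file.create_dataset(
        'low_res_dataset',
        data=rng.integers(0, 256, (4, 128, 128, 3), dtype=np.uint8))
    h5_file.create_dataset(
        'hi_res_dataset',
        data=rng.integers(0, 256, (4, 256, 256, 3), dtype=np.uint8))

lr_batch, hr_batch = load_batch(h5_path, 0, 2)
print(lr_batch.shape, hr_batch.shape, lr_batch.dtype)
```

Note that cv2.imread returns channels in BGR order, so if the model expects RGB the last axis can be reversed after loading, for example with lr_batch[..., ::-1].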

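Modifications to your method (the dataset shape gains a channel axis of size 3, each LR image is written into the top-left 128x128 corner of its 256x256 slot, and the dataset uses the image dtype instead of 'f'):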

# get LR image dtype and create a h5 file
LR_dt = cv2.imread(LR_images[0]).dtype
with h5py.File('super_resolution_dataset.h5','w') as h5_file:
    #create a dataset in the h5 file
    dataset = h5_file.create_dataset('super_resolution_dataset',(len(LR_images),2,256,256,3),dtype=LR_dt)
    #store the images in the dataset
    for i in range(len(LR_images)):
        LR_image = cv2.imread(LR_images[i])
        HR_image = cv2.imread(HR_images[i])
        dataset[i,0,0:128,0:128,:] = LR_image
        dataset[i,1,:,:,:] = HR_image
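
With this single-dataset layout, each LR image occupies only the top-left corner of its 256x256 slot, so it has to be cropped out again when reading. A small sketch of that indexing, using an in-memory NumPy array as a stand-in for the HDF5 dataset (h5py datasets accept the same slice syntax). The pixel values are made up for illustration:

```python
import numpy as np

# Stand-in for the (n, 2, 256, 256, 3) dataset; zeros mimic the unwritten padding.
dataset = np.zeros((1, 2, 256, 256, 3), dtype=np.uint8)
LR_image = np.full((128, 128, 3), 7, dtype=np.uint8)   # made-up LR pixels
HR_image = np.full((256, 256, 3), 9, dtype=np.uint8)   # made-up HR pixels

# Same writes as in the corrected single-dataset code.
dataset[0, 0, 0:128, 0:128, :] = LR_image
dataset[0, 1, :, :, :] = HR_image

# Reading back: crop the LR image out of its slot; the HR image fills its slot.
lr_back = dataset[0, 0, 0:128, 0:128, :]
hr_back = dataset[0, 1]
unused = dataset[0, 0].size - lr_back.size   # padded elements in the LR slot
print(lr_back.shape, hr_back.shape, unused)
```

Three quarters of every LR slot is padding, which is why this layout produces the larger file.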
