Remove empty lines from Hive query output using Python

2024/11/17 0:20:42

I am running a Hive query and storing the output in a TSV file on the local filesystem. The query runs inside a for loop, with different parameters passed on each iteration. If the query returns no output for an iteration, an empty line is written to the TSV file, which causes NULL values to be pushed to my database in the backend. So after the loop finishes and the file is created, I use the code below to remove all the empty lines, but it doesn't work.

How do I remove the empty lines from this file?

` 395.9   429.61  PT  `

code:

with open('file.tsv','r+w') as file:
    for line in file:
        if line.strip():
            file.write(line)

thanks

Answer

Usually you would open the input file and write the non-empty lines to a second file:

with open('file.tsv') as infile, open('filtered_file.tsv', 'w') as outfile:
    for line in infile:
        if line.strip():
            outfile.write(line)
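If the downstream load expects the original filename, the two-file approach can be wrapped so that the filtered copy replaces the original once it has been written completely. The following is only a sketch: `drop_blank_lines` is a hypothetical helper name, and it assumes the file can be read with the platform's default text encoding.

```python
import os
import tempfile


def drop_blank_lines(path):
    """Rewrite `path` without blank or whitespace-only lines.

    Returns the number of lines dropped. Hypothetical helper for illustration.
    """
    dropped = 0
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so that os.replace
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        # newline='' keeps the original line endings untouched.
        with os.fdopen(fd, 'w', newline='') as outfile, \
                open(path, newline='') as infile:
            for line in infile:
                if line.strip():
                    outfile.write(line)
                else:
                    dropped += 1
        # Swap the filtered copy into place only after it is fully written.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
    return dropped
```

Calling `drop_blank_lines('file.tsv')` after the for loop leaves `file.tsv` containing only the non-empty rows. Like the snippet above, it needs room on disk for a second copy while it runs.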

If you want to filter the file in place, you can use fileinput.FileInput with the inplace option:

import fileinput

for line in fileinput.FileInput("infile", inplace=True):
    if line.strip():
        # the line keeps its own newline, so suppress the one print would add
        print(line, end='')

However, this uses an intermediate file and may not work when disk space is low.
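To make that intermediate file visible, `fileinput.input` accepts a `backup` extension: when one is given, the original content is kept next to the filtered file instead of being deleted at the end. A minimal sketch, where the function name `drop_blank_lines_inplace` and the `.bak` extension are illustrative choices:

```python
import fileinput


def drop_blank_lines_inplace(path, backup_ext='.bak'):
    """Filter `path` in place, keeping the original as path + backup_ext."""
    # With inplace=True, fileinput moves the original aside and redirects
    # print() output into a new file at the original path.
    with fileinput.input(files=[path], inplace=True, backup=backup_ext) as lines:
        for line in lines:
            if line.strip():
                print(line, end='')  # the line already ends with its newline
```

After `drop_blank_lines_inplace('file.tsv')`, the filtered rows are in `file.tsv` and the untouched original is in `file.tsv.bak`, so both copies occupy disk space at the same time.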

To filter the file in place without allocating any additional disk space, you could try something like this:

with open('file.tsv', 'rb+') as infile:  # binary mode, so len(line) is a byte count
    read_pos = write_pos = 0
    line = infile.readline()
    while line:
        read_pos += len(line)
        if line.strip():
            # copy the kept line back to the current write position
            infile.seek(write_pos)
            infile.write(line)
            write_pos += len(line)
        infile.seek(read_pos)
        line = infile.readline()
    # update file size to the new, possibly reduced, size
    infile.truncate(write_pos)