I am new to Python. I have a large log file in which the same sequence of lines repeats throughout.
Example:
abc
def
efg
gjk
abc
def
efg
gjk
abc
def
efg
gjk
abc
def
efg
gjk
Expected Result
--------------------Section1---------------------------
abc
def
efg
gjk
--------------------Section2---------------------------
abc
def
efg
gjk
--------------------Section3---------------------------
abc
def
efg
gjk
--------------------Section4---------------------------
abc
def
efg
gjk
Could someone give me pointers on how to proceed?
I tried grep for a particular string, but it only returns the lines that match that string.
I want every block of log lines from abc to gjk placed in its own section.
If a section is defined by the starting line, you can use a generator function to yield sections from an input iterable:

    def per_section(iterable):
        section = []
        for line in iterable:
            if line.strip() == 'abc':
                # start of a section, yield previous
                if section:
                    yield section
                section = []
            section.append(line)
        # lines done, yield last
        if section:
            yield section

Use this with an input file, for example:

    with open('somefile') as inputfile:
        # number sections from 1, as in the expected result
        for i, section in enumerate(per_section(inputfile), 1):
            print('------- section {} ---------'.format(i))
            print(''.join(section))

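To try the generator approach without a real log file, here is a minimal self-contained sketch, assuming Python 3. The `start` parameter, the `format_sections` helper and the in-memory `io.StringIO` sample are illustrative names introduced here, not part of the question; the dash counts in the header only approximate the layout shown in the expected result.

```python
import io


def per_section(iterable, start='abc'):
    """Yield lists of lines; a new list begins at each line equal to `start`."""
    section = []
    for line in iterable:
        if line.strip() == start and section:
            # a new section begins, hand back the one collected so far
            yield section
            section = []
        section.append(line)
    # input exhausted, hand back whatever is left
    if section:
        yield section


def format_sections(lines, start='abc'):
    """Return the input text with a numbered header before each section."""
    out = []
    for i, section in enumerate(per_section(lines, start), 1):
        # hypothetical header layout, modelled on the expected result
        out.append('-' * 20 + 'Section{}'.format(i) + '-' * 27 + '\n')
        out.extend(section)
    return ''.join(out)


# in-memory stand-in for the log file (assumption: two repeats of the block)
sample = io.StringIO('abc\ndef\nefg\ngjk\n' * 2)
print(format_sections(sample), end='')
```

Because `per_section` only ever holds one section in memory, the same loop should also work on a large file object passed in place of `sample`.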
If sections are simply based on the number of lines, use the itertools
grouper recipe to group the input iterable into groups of a fixed length:

    try:
        from itertools import zip_longest  # Python 3
    except ImportError:
        from itertools import izip_longest as zip_longest  # Python 2

    def grouper(iterable, n, fillvalue=None):
        "Collect data into fixed-length chunks or blocks"
        # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
        args = [iter(iterable)] * n
        return zip_longest(*args, fillvalue=fillvalue)

    with open('somefile') as inputfile:
        # a short final group is padded with empty lines
        for i, section in enumerate(grouper(inputfile, 4, '\n'), 1):
            print('------- section {} ---------'.format(i))
            print(''.join(section))
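
To see how the fixed-length grouping treats an input whose length is not a multiple of the group size, here is a small sketch, assuming Python 3 (where the function is named `zip_longest`). The six-line `lines` list is made-up sample data.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one."""
    # n references to the SAME iterator, so each tuple pulls n consecutive items
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# made-up input: one full block of four lines plus two leftover lines
lines = ['abc\n', 'def\n', 'efg\n', 'gjk\n', 'abc\n', 'def\n']
groups = list(grouper(lines, 4, '\n'))

for i, group in enumerate(groups, 1):
    print('------- section {} ---------'.format(i))
    print(''.join(group), end='')

# the second group is padded to length 4 with the fill value '\n'
```

If the log can contain stray or missing lines, the fixed-length approach will drift out of step with the real blocks, so splitting on the starting line is likely the safer choice in that case.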