Regular expression - replace all spaces in beginning of line with periods

2024/12/9 20:59:04

I don't care whether I achieve this through vim, sed, awk, Python, etc. I tried all of them and could not get it done.

For an input like this:

top           f1    f2    f3
   sub1       f1    f2    f3
   sub2       f1    f2    f3
      sub21   f1    f2    f3
   sub3       f1    f2    f3

I want:

top           f1    f2    f3
...sub1       f1    f2    f3
...sub2       f1    f2    f3
......sub21   f1    f2    f3
...sub3       f1    f2    f3

Then I want to just load this up in Excel (delimited by whitespace) and still be able to look at the hierarchy-ness of the first column!

I tried many things, but I always end up losing the hierarchy information.

Answer

With this as the input:

$ cat file
top           f1    f2    f3
   sub1       f1    f2    f3
   sub2       f1    f2    f3
      sub21   f1    f2    f3
   sub3       f1    f2    f3

Try:

$ sed -E ':a; s/^( *) ([^ ])/\1.\2/; ta' file
top           f1    f2    f3
...sub1       f1    f2    f3
...sub2       f1    f2    f3
......sub21   f1    f2    f3
...sub3       f1    f2    f3

How it works:

  • :a

    This creates a label a.

  • s/^( *) ([^ ])/\1.\2/

    If the line begins with spaces, this replaces the last space in the leading spaces with a period.

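    In more detail, ^( *) matches all leading blanks except the last and stores them in group 1. The rest of the regex, " ([^ ])", starts with a literal blank (easy to miss when the page renders it) followed by the group ([^ ]). It matches the last leading blank plus the nonblank after it, and stores the nonblank in group 2. On later passes that nonblank is the period written by the previous pass, so the periods fill in from right to left.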

    \1.\2 replaces the matched text with group 1, followed by a period, followed by group 2.

  • ta

    If the substitute command made a substitution, branch back to label a and try again. The loop ends once the substitution no longer matches, that is, when no leading blank remains.
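
The loop described above can be imitated outside sed. The following is only a rough Python sketch of the same idea, not part of the original answer; the function name dot_leading_spaces_stepwise is made up for illustration. It assumes each line is passed without its trailing newline, as sed's pattern space would hold it.

```python
import re

# Same pattern as the sed command: group 1 is every leading blank but the
# last, then a literal blank, then one nonblank character in group 2.
PATTERN = re.compile(r"^( *) ([^ ])")


def dot_leading_spaces_stepwise(line):
    """Hypothetical helper: replace leading blanks with periods, one per pass."""
    while True:
        # Equivalent of s/^( *) ([^ ])/\1.\2/ : at most one replacement,
        # because the pattern is anchored at the start of the line.
        line, count = PATTERN.subn(r"\1.\2", line)
        if count == 0:
            # Equivalent of `ta` not branching: nothing was substituted.
            return line


if __name__ == "__main__":
    sample = [
        "top           f1    f2    f3",
        "   sub1       f1    f2    f3",
        "      sub21   f1    f2    f3",
    ]
    for row in sample:
        print(dot_leading_spaces_stepwise(row))
```

Blanks between the fields are left alone because the pattern only ever matches at the start of the line.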

Compatibility:

  1. The above was tested on modern GNU sed. BSD/macOS sed may need each command passed as a separate -e expression:

    sed -E -e :a -e 's/^( *) ([^ ])/\1.\2/' -e ta file
    

    On older GNU sed versions, use -r in place of -E:

    sed -r ':a; s/^( *) ([^ ])/\1.\2/; ta' file
    
  2. The above assumes that the leading whitespace consists of blanks (space characters). If it contains tabs, you will have to decide what your tabstop is and make the substitutions accordingly.
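
For tab-indented input, one possible approach is to expand the leading tabs to columns first and then emit one period per column. The Python sketch below is an illustration under assumptions, not something from the original answer: the tabstop of 8 and the name dot_indent are both hypothetical choices. Only the leading whitespace is touched; tabs or blanks between fields are kept as they are.

```python
import re

TABSTOP = 8  # assumption: adjust to whatever tab width the file was written with


def dot_indent(line, tabstop=TABSTOP):
    """Hypothetical helper: turn leading blanks/tabs into one period per column."""
    # Grab the run of blanks and tabs at the start of the line (may be empty).
    match = re.match(r"[ \t]*", line)
    # The indent starts at column 0, so expandtabs gives its true width.
    width = len(match.group().expandtabs(tabstop))
    return "." * width + line[match.end():]


if __name__ == "__main__":
    print(dot_indent("\tsub1\tf1\tf2\tf3"))
    print(dot_indent("   sub2       f1    f2    f3"))
```

With a tabstop of 3, a single leading tab would give the same three periods per level as the blank-indented example in the question.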
