I don't care whether I achieve this through vim, sed, awk, python, etc. I tried all of them and could not get it done.
For an input like this:
top f1 f2 f3
   sub1 f1 f2 f3
   sub2 f1 f2 f3
      sub21 f1 f2 f3
   sub3 f1 f2 f3
I want:
top f1 f2 f3
...sub1 f1 f2 f3
...sub2 f1 f2 f3
......sub21 f1 f2 f3
...sub3 f1 f2 f3
Then I want to just load this up in Excel (delimited by whitespace) and still be able to look at the hierarchy-ness of the first column!
I tried many things, but ended up losing the hierarchy information.
With this as the input:
$ cat file
top f1 f2 f3
   sub1 f1 f2 f3
   sub2 f1 f2 f3
      sub21 f1 f2 f3
   sub3 f1 f2 f3
Try:
$ sed -E ':a; s/^( *) ([^ ])/\1.\2/; ta' file
top f1 f2 f3
...sub1 f1 f2 f3
...sub2 f1 f2 f3
......sub21 f1 f2 f3
...sub3 f1 f2 f3
How it works:
:a
This creates a label named a.
s/^( *) ([^ ])/\1.\2/
If the line begins with spaces, this replaces the last space in the leading spaces with a period.
In more detail, ^( *) matches all leading blanks except the last and stores them in group 1. The rest of the regex, " ([^ ])" (which, despite how Stack Overflow may render it, begins with a blank), matches a blank followed by a nonblank and stores the nonblank in group 2.
\1.\2
replaces the matched text with group 1, followed by a period, followed by group 2.
ta
If the substitute command resulted in a substitution, then branch back to label a and try again.
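Since the question also allows Python, the same idea can be sketched there without a loop: match the whole run of leading blanks at once and replace it with the same number of periods. This is only a rough equivalent of the sed command, not a translation of it; the function name dot_indent is made up for illustration, and the sample lines assume the three-blank indent implied by the desired output.

```python
import re

def dot_indent(line: str) -> str:
    """Replace each leading blank with a period (hypothetical helper).

    Assumes the line has no trailing newline and is indented with blanks only.
    """
    # "^ +" grabs the whole run of leading blanks; the lookahead requires a
    # nonblank to follow, so blank-only lines are left alone, as with the sed loop.
    return re.sub(r"^ +(?=[^ ])", lambda m: "." * len(m.group(0)), line)

sample = [
    "top f1 f2 f3",
    "   sub1 f1 f2 f3",
    "      sub21 f1 f2 f3",
]
for line in sample:
    print(dot_indent(line))
# top f1 f2 f3
# ...sub1 f1 f2 f3
# ......sub21 f1 f2 f3
```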
Compatibility:
The above was tested on modern GNU sed. For BSD/OSX sed, one might or might not need to use:
sed -E -e :a -e 's/^( *) ([^ ])/\1.\2/' -e ta file
On ancient GNU sed, one needs to use -r in place of -E:
sed -r ':a; s/^( *) ([^ ])/\1.\2/; ta' file
The above assumes that the leading whitespace consists of blanks (space characters). If it consists of tabs, then you will have to decide what your tabstop is and make substitutions accordingly.
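If the indentation does turn out to be tabs, one possible approach is to expand the leading whitespace to a chosen tabstop first and then emit that many periods. The sketch below does this in Python; the name dot_indent_tabs and the default tabstop of 3 are assumptions chosen to match the example output, not anything taken from the question.

```python
import re

TABSTOP = 3  # assumed tab width; pick whatever matches your file

def dot_indent_tabs(line: str, tabstop: int = TABSTOP) -> str:
    """Turn leading blanks and tabs into periods (hypothetical helper)."""
    # Split the line into its leading run of blanks/tabs and the remainder.
    indent = re.match(r"[ \t]*", line).group(0)
    rest = line[len(indent):]
    if not rest:
        # Whitespace-only line: leave it untouched.
        return line
    # expandtabs pads each tab out to the next multiple of `tabstop` columns.
    width = len(indent.expandtabs(tabstop))
    return "." * width + rest

print(dot_indent_tabs("\tsub1 f1 f2 f3"))     # ...sub1 f1 f2 f3
print(dot_indent_tabs("\t\tsub21 f1 f2 f3"))  # ......sub21 f1 f2 f3
```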