Transformation of pandas DataFrame adds a blank row

2024/11/19 20:18:41

My original question was posted here. I have a dataframe as follows:

ID  START   END  SEQ
1   11      12   1
1   14      15   3 
1   13      14   2 
2   10      14   1
3   11      15   1
3   16      17   2

I wanted to transform it into this DataFrame:

ID  START_1  END_1  SEQ_1  START_2  END_2  SEQ_2 START_3  END_3  SEQ_3
1   11       12     1      13       14     2     14       15     3 
2   10       14     1      NA       NA     NA    NA       NA     NA   
3   11       15     1      16       17     2     NA       NA     NA 

After pivot_table transformations I received a DataFrame that has an additional blank row after the header:

test_2['SEQ1'] = test_2.SEQ
test_2 = test_2.pivot_table(index= ['ID','SEQ1']).unstack()
test_2 = test_2.sort_index(axis=1, level=1)
test_2.columns = ['_'.join((col[0], str(col[1]))) for col in test_2]
test_2

    START_1  END_1  SEQ_1  START_2  END_2  SEQ_2 START_3  END_3  SEQ_3
ID
1   11       12     1      13       14     2     14       15     3 
2   10       14     1      NA       NA     NA    NA       NA     NA   
3   11       15     1      16       17     2     NA       NA     NA 

How can I delete this row and align all the headers? I tried to delete the row in the conventional way using test_2[:2], but that did not remove the blank row.

EDIT:

This is an extract of a more realistic dataset:

ID  INDEX           START                   END                 SEQ     NUM_PREV     NUM_ACTUAL   NUM_NEXT             TIME   PRE_TIME      LOC_IND
079C    333334.0    2016-06-23 12:45:32 2016-06-23 12:51:05 1   1      23456           25456           29456           30      2               YES
079C    333334.0    2016-06-23 12:47:05 2016-06-23 12:51:05 2   2     29456           39458           39945           20      0               NO
Answer

Consider resetting index after the pivot/unstack operation:

from io import StringIO
import pandas as pd

data = '''
ID  START   END  SEQ
1   11      12   1
1   14      15   3
1   13      14   2
2   10      14   1
3   11      15   1
3   16      17   2
'''

test_2 = pd.read_table(StringIO(data), sep="\\s+")
seq = set(test_2['SEQ'].tolist())

# Copy SEQ so it can serve as an index level and still remain a value column
test_2['SEQ1'] = test_2.SEQ
test_2 = test_2.pivot_table(index=['ID', 'SEQ1']).unstack()
test_2 = test_2.sort_index(axis=1, level=1)
test_2.columns = ['_'.join((col[0], str(col[1]))) for col in test_2]

# Turn the ID index back into an ordinary column
test_2 = test_2.reset_index()
#    ID  END_1  SEQ_1  START_1  END_2  SEQ_2  START_2  END_3  SEQ_3  START_3
# 0   1   12.0    1.0     11.0   14.0    2.0     13.0   15.0    3.0     14.0
# 1   2   14.0    1.0     10.0    NaN    NaN      NaN    NaN    NaN      NaN
# 2   3   15.0    1.0     11.0   17.0    2.0     16.0    NaN    NaN      NaN

However, as you can see, this changes the column ordering. To restore the desired order, build the column list with a nested list comprehension and flatten it with sum():

seqmax = max(seq) + 1
colorder = ['ID'] + sum([['START_' + str(i), 'END_' + str(i), 'SEQ_' + str(i)]
                         for i in range(1, seqmax) if i in seq], [])

test_2 = test_2[colorder]
#    ID  START_1  END_1  SEQ_1  START_2  END_2  SEQ_2  START_3  END_3  SEQ_3
# 0   1     11.0   12.0    1.0     13.0   14.0    2.0     14.0   15.0    3.0
# 1   2     10.0   14.0    1.0      NaN    NaN    NaN      NaN    NaN    NaN
# 2   3     11.0   15.0    1.0     16.0   17.0    2.0      NaN    NaN    NaN
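
As a rough sketch of the same idea, the snippet below rebuilds the example from the question end to end. It also checks that the seemingly blank row is just the index name (ID) printed on its own line, which is why slicing rows cannot remove it. The names df, wide, colorder and result are chosen here for illustration and are not part of the original code, and the column order is built with a plain nested comprehension instead of sum().

```python
from io import StringIO

import pandas as pd

data = """
ID  START   END  SEQ
1   11      12   1
1   14      15   3
1   13      14   2
2   10      14   1
3   11      15   1
3   16      17   2
"""
df = pd.read_csv(StringIO(data), sep=r"\s+")

# Copy SEQ into SEQ1 so SEQ survives as a value column after pivoting.
wide = df.assign(SEQ1=df["SEQ"]).pivot_table(index=["ID", "SEQ1"]).unstack()

# The "blank row" in the printed frame is the index name, not a data row.
index_name = wide.index.name

# Flatten the two-level column labels, e.g. ('START', 1) becomes 'START_1'.
wide.columns = [f"{name}_{seq}" for name, seq in wide.columns]

# Desired order: START, END, SEQ for each sequence number in ascending order.
seqs = sorted(df["SEQ"].unique())
colorder = [f"{name}_{seq}" for seq in seqs for name in ("START", "END", "SEQ")]

# Reorder the value columns, then move ID from the index into a column.
result = wide[colorder].reset_index()
print(result)
```

If ID should stay as the index and only the extra header line should disappear, wide.rename_axis(None) clears the index name instead of resetting the index.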