Fill an empty DataFrame with common index values from another DataFrame

2024/10/12 0:34:05

I have a DataFrame holding a time series that spans one month at a nominal frequency of one second.

The problem is that the time step between records is not always 1 second.

time                c1  c2
2013-01-01 00:00:01 5   3
2013-01-01 00:00:03 7   2
2013-01-01 00:00:04 1   5
2013-01-01 00:00:05 4   3
2013-01-01 00:00:06 5   6
2013-01-01 00:00:09 4   2
2013-01-01 00:00:10 7   8

I then want to create an empty DataFrame with the same columns that covers the whole period at a regular one-second step, that is, with as many records as there are seconds in the month. Initially this empty DataFrame is filled with NaN values:

time                c1  c2
2013-01-01 00:00:01 nan nan
2013-01-01 00:00:02 nan nan
2013-01-01 00:00:03 nan nan
2013-01-01 00:00:04 nan nan
2013-01-01 00:00:05 nan nan
2013-01-01 00:00:06 nan nan
2013-01-01 00:00:07 nan nan
2013-01-01 00:00:08 nan nan
2013-01-01 00:00:09 nan nan
2013-01-01 00:00:10 nan nan

Then I want to compare the two and fill the empty one with the rows whose index it shares with my first DataFrame. The rows that are not in common should keep their NaN values.

time                c1  c2
2013-01-01 00:00:01 5   3
2013-01-01 00:00:02 nan nan
2013-01-01 00:00:03 7   2
2013-01-01 00:00:04 1   5
2013-01-01 00:00:05 4   3
2013-01-01 00:00:06 5   6
2013-01-01 00:00:07 nan nan
2013-01-01 00:00:08 nan nan
2013-01-01 00:00:09 4   2
2013-01-01 00:00:10 7   8

My attempt:

import pandas as pd

# Read the first dataframe from a file (fin, ch and c are defined elsewhere)
df1 = pd.read_table(fin, parse_dates=[0], names=ch, index_col=0, header=0, decimal='.', skiprows=c)
# Create an empty dataframe on a regular one-second index
N = 86400 * 31  # seconds per month
index = pd.date_range(df1.index[0], periods=N-1, freq='1s')
df2 = pd.DataFrame(index=index, columns=df1.columns)

I then tried merge and concat, but neither gives the expected result:

df2.merge(df1, how='outer')
pd.concat([df2,df1], axis=0, join='outer')

Answer

I don't think you need a second DataFrame. If you resample to one-second frequency without specifying a fill method, the missing periods are stored as NaN:

df.resample("s").max()
Out[62]: 
                      c1   c2
time                         
2013-01-01 00:00:01  5.0  3.0
2013-01-01 00:00:02  NaN  NaN
2013-01-01 00:00:03  7.0  2.0
2013-01-01 00:00:04  1.0  5.0
2013-01-01 00:00:05  4.0  3.0
2013-01-01 00:00:06  5.0  6.0
2013-01-01 00:00:07  NaN  NaN
2013-01-01 00:00:08  NaN  NaN
2013-01-01 00:00:09  4.0  2.0
2013-01-01 00:00:10  7.0  8.0

max() here is just an arbitrary aggregation, used so that the call returns a DataFrame. You can replace it with mean, min, etc., assuming you have no duplicate timestamps. If you do have duplicates, they will be aggregated by that function.
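
To make the effect of the aggregation concrete, here is a minimal sketch. It uses the first rows of the question's data plus one duplicated timestamp, which is an assumption added purely for illustration and is not in the original data. With no duplicates, max and mean would return the same values; with a duplicate they differ.

```python
import pandas as pd

# First rows of the question's data, plus one duplicated timestamp
# (the duplicate is an illustrative assumption, not part of the original data).
times = pd.to_datetime([
    "2013-01-01 00:00:01",
    "2013-01-01 00:00:03",
    "2013-01-01 00:00:03",  # duplicate timestamp
    "2013-01-01 00:00:04",
])
df = pd.DataFrame({"c1": [5, 7, 9, 1], "c2": [3, 2, 1, 5]}, index=times)
df.index.name = "time"

# Missing seconds become NaN; duplicated seconds are collapsed by the aggregation.
by_max = df.resample("s").max()
by_mean = df.resample("s").mean()

print(by_max)
print(by_mean)
```

The second 00:00:02 has no records, so both results hold NaN there, while the duplicated second 00:00:03 is reduced to a single row whose values depend on the aggregation chosen.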

As Paul H suggested in the comments, you can use df.resample("s").asfreq() without any aggregation. It skips the unnecessary aggregation step, so it is probably more efficient. It will raise an error if the index contains duplicate values.
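
A minimal sketch of the asfreq() variant on the question's sample data follows. Note that resample only spans the range between the first and last observed timestamps. The reindex step at the end is therefore an additional suggestion of this sketch, not part of the original answer: it assumes the month starts at 2013-01-01 00:00:00 and has 31 days, and it pads the result to every second of that month.

```python
import pandas as pd

# Sample data from the question.
times = pd.to_datetime([
    "2013-01-01 00:00:01", "2013-01-01 00:00:03", "2013-01-01 00:00:04",
    "2013-01-01 00:00:05", "2013-01-01 00:00:06", "2013-01-01 00:00:09",
    "2013-01-01 00:00:10",
])
df = pd.DataFrame(
    {"c1": [5, 7, 1, 4, 5, 4, 7], "c2": [3, 2, 5, 3, 6, 2, 8]},
    index=times,
)
df.index.name = "time"

# Regular one-second grid between the first and last record, no aggregation.
filled = df.resample("s").asfreq()

# Assumption: the target period is all of January 2013 (31 days).
# Build the full one-second index and align the data to it.
full_index = pd.date_range(
    "2013-01-01 00:00:00", periods=86400 * 31, freq="s", name="time"
)
full_month = df.reindex(full_index)

print(filled)
print(len(full_month))
```

Like asfreq(), reindex requires the original index to have no duplicate timestamps; if duplicates are possible, aggregating first (for example with max) avoids the error.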

