Pandas convert yearly to monthly

2024/10/4 1:23:07

I'm pulling financial data, some of which is reported yearly and some monthly. My model needs all of it monthly, so I need each yearly value repeated for every month of that year. I've been following a Stack Overflow post and trying to adapt its code to my data.

Here is my dataframe:

df.head()
        date   ticker  value
0 1999-12-31  ECB/RA6  1.0
1 2000-12-31  ECB/RA6  4.0
2 2001-12-31  ECB/RA6  2.0
3 2002-12-31  ECB/RA6  3.0
4 2003-12-31  ECB/RA6  2.0

Here are the first 5 rows of my desired output:

   date ticker value
0 1999-12-31  ECB/RA6  1.0
1 2000-01-31  ECB/RA6  4.0
2 2000-02-29  ECB/RA6  4.0
3 2000-03-31  ECB/RA6  4.0
4 2000-04-30  ECB/RA6  4.0

And my code:

df['date'] = pd.to_datetime(df['date'], format='%Y-%m')
df = df.pivot(index='date', columns='ticker')
start_date = df.index.min() - pd.DateOffset(day=1)
end_date = df.index.max() + pd.DateOffset(day=31)
dates = pd.date_range(start_date, end_date, freq='M')
dates.name = 'date'
df = df.reindex(dates, method='ffill')
df = df.stack('ticker')
df = df.sortlevel(level=1)
df = df.reset_index()

However, it does not repeat the yearly value for each month as expected.

Answer

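You want resample.

First, set date as the index so that resample will work. Then backfill and reset the index. Backfilling is the right direction here because each yearly value is dated at year end, so it has to be carried back over the preceding months of that year.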

df.set_index('date').resample('M').bfill().reset_index()

         date   ticker  value
0  1999-12-31  ECB/RA6    1.0
1  2000-01-31  ECB/RA6    4.0
2  2000-02-29  ECB/RA6    4.0
3  2000-03-31  ECB/RA6    4.0
4  2000-04-30  ECB/RA6    4.0
5  2000-05-31  ECB/RA6    4.0
6  2000-06-30  ECB/RA6    4.0
7  2000-07-31  ECB/RA6    4.0
8  2000-08-31  ECB/RA6    4.0
9  2000-09-30  ECB/RA6    4.0
10 2000-10-31  ECB/RA6    4.0
11 2000-11-30  ECB/RA6    4.0
12 2000-12-31  ECB/RA6    4.0
13 2001-01-31  ECB/RA6    2.0
14 2001-02-28  ECB/RA6    2.0
15 2001-03-31  ECB/RA6    2.0
...
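
For reference, here is a minimal self-contained sketch of the same steps, rebuilding the five sample rows from the question. It assumes the date column can be parsed with pd.to_datetime without an explicit format, and it passes pd.offsets.MonthEnd() instead of the 'M' string, because recent pandas versions deprecate that alias in favor of 'ME'. The variable name monthly is only illustrative.

```python
import pandas as pd

# Sample data copied from the question (yearly values dated at year end).
df = pd.DataFrame({
    "date": ["1999-12-31", "2000-12-31", "2001-12-31", "2002-12-31", "2003-12-31"],
    "ticker": ["ECB/RA6"] * 5,
    "value": [1.0, 4.0, 2.0, 3.0, 2.0],
})

# resample needs a datetime index, so parse the strings first.
df["date"] = pd.to_datetime(df["date"])

monthly = (
    df.set_index("date")
      .resample(pd.offsets.MonthEnd())  # one row per month end
      .bfill()                          # fill each month from the next year-end value
      .reset_index()
)

print(monthly.head())
```

With the sample data this should give 49 rows, one for each month end from 1999-12-31 through 2003-12-31.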

To handle this per ticker:

df.set_index('date').groupby('ticker', group_keys=False) \
    .resample('M').bfill().reset_index()
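
As an alternative sketch of the per-ticker case, the version below resamples each ticker in an explicit loop and concatenates the pieces. This is a swapped-in technique, not the one-liner above: it is spelled out so the result does not depend on how a given pandas version treats the grouping column in groupby().resample(). The second ticker "ECB/XYZ" and its values are hypothetical, added only to show two series with different date ranges.

```python
import pandas as pd

# Two tickers with different date ranges; "ECB/XYZ" is made up for illustration.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "1999-12-31", "2000-12-31", "2001-12-31",
        "2000-12-31", "2001-12-31",
    ]),
    "ticker": ["ECB/RA6"] * 3 + ["ECB/XYZ"] * 2,
    "value": [1.0, 4.0, 2.0, 10.0, 20.0],
})

parts = []
for ticker, group in df.groupby("ticker"):
    # Resample only the value column, within this ticker's own date range.
    out = (
        group.set_index("date")[["value"]]
             .resample(pd.offsets.MonthEnd())
             .bfill()
    )
    # Put the ticker label back on every monthly row.
    out.insert(0, "ticker", ticker)
    parts.append(out)

monthly = pd.concat(parts).reset_index()
print(monthly)
```

Each ticker is filled only between its own first and last dates, so values never leak from one ticker into another.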