Pandas dataframe: omit weekends and days near holidays

2024/9/23 20:26:16

I have a Pandas dataframe with a DataTimeIndex and some other columns, similar to this:

import pandas as pd
import numpy as nprange = pd.date_range('2017-12-01', '2018-01-05', freq='6H')
df = pd.DataFrame(index = range)# Average speed in miles per hour
df['value'] = np.random.randint(low=0, high=60, size=len(df.index))df.info()
# DatetimeIndex: 141 entries, 2017-12-01 00:00:00 to 2018-01-05 00:00:00
# Freq: 6H
# Data columns (total 1 columns):
# value    141 non-null int64
# dtypes: int64(1)
# memory usage: 2.2 KBdf.head(10)
#                      value
# 2017-12-01 00:00:00     15
# 2017-12-01 06:00:00     54
# 2017-12-01 12:00:00     19
# 2017-12-01 18:00:00     13
# 2017-12-02 00:00:00     35
# 2017-12-02 06:00:00     31
# 2017-12-02 12:00:00     58
# 2017-12-02 18:00:00      6
# 2017-12-03 00:00:00      8
# 2017-12-03 06:00:00     30

How can I select or filter the entries that are:

  1. Weekdays only (that is, not weekend days Saturday or Sunday)

  2. Not within N days of the dates in a list (e.g. U.S. holidays like '12-25' or '01-01')?

I was hoping for something like:

df = exclude_Sat_and_Sun(df)omit_days = ['12-25', '01-01']
N = 3 # days near the holidays
df = exclude_days_near_omit_days(N, omit_days)

I was thinking of creating a new column to break out the month and day and then comparing them to the criteria for 1 and 2 above. However, I was hoping for something more Pythonic using the DateTimeIndex.

Thanks for any help.

Answer

The first part can be easily accomplished using the Pandas DatetimeIndex.dayofweek property, which starts counting weekdays with Monday as 0 and ending with Sunday as 6.

df[df.index.dayofweek < 5] will give you only the weekdays.


For the second part you can use the datetime module. Below I will give an example for only one date, namely 2017-12-25. You can easily generalize it to a list of dates, for example by defining a helper function.

from datetime import datetime, timedeltaN = 3df[abs(df.index.date - datetime.strptime("2017-12-25", '%Y-%m-%d').date()) > timedelta(N)]

This will give all dates that are more than N=3 days away from 2017-12-25. That is, it will exclude an interval of 7 days from 2017-12-22 to 2017-12-28.


Lastly, you can combine the two criteria using the & operator, as you probably know.

df[(df.index.dayofweek < 5) & (abs(df.index.date - datetime.strptime("2017-12-25", '%Y-%m-%d').date()) > timedelta(N))]
https://en.xdnf.cn/q/71783.html

Related Q&A

How to dump a boolean matrix in numpy?

I have a graph represented as a numpy boolean array (G.adj.dtype == bool). This is homework in writing my own graph library, so I cant use networkx. I want to dump it to a file so that I can fiddle wit…

Cant append_entry FieldList in Flask-wtf more than once

I have a form with flask-wtf for uploading images, also file field can be multiple fields. my form: class ComposeForm(Form):attachment = FieldList(FileField(_(file)), _(attachment))add_upload = SubmitF…

What is the best way to use python code from Scala (or Java)? [duplicate]

This question already has answers here:Closed 11 years ago.Possible Duplicate:Java Python Integration There is some code written in Python and I need to use it from Scala. The code uses some native C.…

Pandas groupby week given a datetime column

Lets say I have the following data sample:df = pd.DataFrame({date:[2011-01-01,2011-01-02,2011-01-03,2011-01-04,2011-01-05,2011-01-06,2011-01-07,2011-01-08,2011-01-09,2011-12-30,2011-12-31],revenue:[5,3…

Django form to indicate input type

Another basic question Im afraid which Im struggling with. Ive been through the various Django documentation pages and also search this site. The only thing I have found on here was back in 2013 which…

run multi command in the same jupyter cells

Im trying to display 2 output of 2 lines in the same time, I use Panda library and it seems like it display only the output of second line:import pandas as pd data = {"state": ["Ohio&quo…

Pandas how to get rows with consecutive dates and sales more than 1000?

I have a data frame called df: Date Sales 01/01/2020 812 02/01/2020 981 03/01/2020 923 04/01/2020 1033 05/01/2020 988 ... ...How can I get the first occurrence of 7 conse…

Use Python alongside C# in Windows UWP app

I started writing an application in Python, but I now want to switch to C# and UWP. I know that you cannot write a UWP app in Python, but I am trying to see if I can write some code in Python and acces…

How do you go from a sip.voidptr (QImage.constBits()) to a ctypes void or char pointer?

Im using python and of course you cant loop through every pixel of a large image very quickly, so I defer to a C DLL.I want to do something like this:img = QImage("myimage.png").constBits() i…

Just returning the text of elements in xpath (python / lxml)

I have an XML structure like this:mytree = """ <path><to><nodes><info>1</info><info>2</info><info>3</info></nodes></to> …