Python: Grouping into timeslots (minutes) for days of data

2024/10/4 11:28:06

I have a list of events, timestamped to millisecond accuracy, that spans a few days. I want to cluster all the events that occur in each 'per-n-minutes' slot (a slot may hold twenty events or none). I have a datetime.datetime object for each event, so I can get datetime.datetime.minute without any trouble.

My list of events is sorted in time order, earliest first, latest last. The list is complete for the time period I am working on.

The idea is that I can change a list like this:

[[a],[b],[c],[d],[e],[f],[g],[h],[i]...]

where a, b, c, occur between mins 0 and 29, d,e,f,g occur between mins 30 and 59, nothing between 0 and 29 (next hour), h, i between 30 and 59 ...

into a new list:-

[[[a],[b],[c]],[[d],[e],[f],[g]],[],[[h],[i]]...]

I'm not sure how to build an iterator that loops through the two time slots until the time series list ends. Anything I can think of using xrange stops once it completes, so I wondered if there was a way of using `while` to do the slicing?

I will also be using a smaller timeslot, probably 5 minutes; I used 30 minutes here to keep the example short.

(For context, I'm making a geo-plotted, time-based view of the recent quakes in New Zealand, and I want to show all the quakes that occur in a small block of time in one step to speed up the replay.)

Answer
# create sample data
from datetime import datetime, timedelta
d = datetime.now()
data = [d + timedelta(minutes=i) for i in range(100)]

# prepare and group the data
from itertools import groupby

def get_key(d):
    # group by 30 minutes: round down to the start of the half hour
    k = d + timedelta(minutes=-(d.minute % 30))
    return datetime(k.year, k.month, k.day, k.hour, k.minute, 0)

g = groupby(sorted(data), key=get_key)

# print data
for key, items in g:
    print(key)
    for item in items:
        print('-', item)

This is a Python translation of another answer. It works by rounding each datetime down to the preceding 30-minute boundary and using the result as the grouping key.
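The rounding step can be generalised to other slot widths, such as the 5 minutes mentioned in the question. The following is only a sketch: `floor_to_slot` is a hypothetical helper name, not part of the answer above, and it assumes naive datetimes and a slot width that divides a day evenly.

```python
from datetime import datetime, timedelta

def floor_to_slot(dt, minutes=5):
    """Round dt down to the start of its slot (hypothetical helper).

    Slots are counted from midnight of dt's own day, so widths that
    divide 24 hours evenly (5, 15, 30, 60, ...) line up across days.
    """
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    slot = timedelta(minutes=minutes)
    # Number of whole slots elapsed since midnight (timedelta // timedelta is an int).
    elapsed_slots = (dt - midnight) // slot
    return midnight + elapsed_slots * slot
```

It could be passed to groupby in place of get_key, for example `groupby(sorted(data), key=lambda d: floor_to_slot(d, 5))`.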


If you really need the possible empty groups, you can just add them by using this or a similar method:

def add_missing_empty_frames(g):
    last_key = None
    for key, items in g:
        if last_key:
            # compare timedeltas directly; .seconds would ignore whole days
            while (key - last_key) > timedelta(minutes=30):
                empty_key = last_key + timedelta(minutes=30)
                yield (empty_key, [])
                last_key = empty_key
        yield (key, items)
        last_key = key

for key, items in add_missing_empty_frames(g):
    ...
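Since the question asks for a nested list and mentions `while`, here is an alternative sketch that builds that list in a single pass, empty slots included. It swaps groupby for a plain loop; `group_into_slots` is a hypothetical name, and the code assumes the events are naive datetime objects already sorted in time order, as the question describes.

```python
from datetime import datetime, timedelta

def group_into_slots(events, minutes=30):
    """Return one list per slot, including empty slots (hypothetical helper).

    Assumes `events` is a time-sorted list of naive datetimes.
    """
    if not events:
        return []
    slot = timedelta(minutes=minutes)
    first = events[0]
    midnight = first.replace(hour=0, minute=0, second=0, microsecond=0)
    # End of the slot that contains the first event, counted from midnight.
    slot_end = midnight + ((first - midnight) // slot + 1) * slot
    groups = [[]]
    for event in events:
        # Open new (possibly empty) slots until this event fits.
        while event >= slot_end:
            groups.append([])
            slot_end += slot
        groups[-1].append(event)
    return groups
```

If each event is a record rather than a bare datetime, the comparison would need to use the record's timestamp field instead.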