Generate random timeseries data with dates

2024/10/13 9:23:55

I am trying to generate random data (integers) with dates so that I can practice pandas data analysis commands on it and plot time series graphs.

             temp     depth   acceleration
2019-01-1 -0.218062 -1.215978 -1.674843
2019-02-1 -0.465085 -0.188715  0.241956
2019-03-1 -1.464794 -1.354594  0.635196
2019-04-1  0.103813  0.194349 -0.450041
2019-05-1  0.437921  0.073829  1.346550

Is there any random dataframe generator that can generate something like this with each date having a gap of one month?

Answer

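You can either use pandas.util.testing (note that this module was deprecated in pandas 1.0 and removed in later releases, so this option only works on older versions):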

import pandas.util.testing as testing
import numpy as np
np.random.seed(1)
testing.N, testing.K = 5, 3  # Setting the rows and columns of the desired data
print(testing.makeTimeDataFrame(freq='MS'))
                   A         B         C
2000-01-01 -0.488392  0.429949 -0.723245
2000-02-01  1.247192 -0.513568 -0.512677
2000-03-01  0.293828  0.284909  1.190453
2000-04-01 -0.326079 -1.274735 -0.008266
2000-05-01 -0.001980  0.745803  1.519243
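
Because the testing helper above is not part of the public API, a similar frame can be put together from documented calls only. The following is a minimal sketch, not the library's own implementation: the function name make_time_frame and its defaults are hypothetical, the column names are borrowed from the question, and the values are drawn from a standard normal distribution.

```python
import numpy as np
import pandas as pd


def make_time_frame(rows=5, columns=("temp", "depth", "acceleration"),
                    start="2019-01-01", seed=1):
    """Hypothetical helper: normally distributed values on a monthly index."""
    rng = np.random.default_rng(seed)  # seeded generator for reproducibility
    # freq="MS" yields the first day of each consecutive month
    index = pd.date_range(start, periods=rows, freq="MS")
    values = rng.standard_normal((rows, len(columns)))
    return pd.DataFrame(values, index=index, columns=list(columns))


df = make_time_frame()
print(df)
```

Changing rows, columns, or start gives a frame of a different shape or date range without touching any module-level state.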

Or, if you need more control over the random values being generated, you can build the frame yourself:

import numpy as np
import pandas as pd
np.random.seed(1)
rows, cols = 5, 3
data = np.random.rand(rows, cols)  # You can use other random functions to generate values with constraints
# freq='MS' sets a monthly frequency anchored on the first day of each month; use 'min' (older alias 'T') for minutes, and so on
tidx = pd.date_range('2019-01-01', periods=rows, freq='MS')
data_frame = pd.DataFrame(data, columns=['a', 'b', 'c'], index=tidx)
print(data_frame)
                   a         b         c
2019-01-01  0.992856  0.217750  0.538663
2019-02-01  0.189226  0.847022  0.156730
2019-03-01  0.572417  0.722094  0.868219
2019-04-01  0.023791  0.653147  0.857148
2019-05-01  0.729236  0.076817  0.743955
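
Since the question asks for integers, the same approach can be adapted by swapping in an integer-valued random function. This is a sketch under stated assumptions: the value ranges are invented purely for illustration, and the column names are taken from the question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # seeded generator for reproducibility
rows = 5
tidx = pd.date_range("2019-01-01", periods=rows, freq="MS")  # one row per month

# The ranges below are hypothetical; the upper bound of integers() is exclusive.
int_frame = pd.DataFrame(
    {
        "temp": rng.integers(-10, 40, size=rows),
        "depth": rng.integers(0, 500, size=rows),
        "acceleration": rng.integers(-5, 6, size=rows),
    },
    index=tidx,
)
print(int_frame)
```

Giving each column its own range keeps the values plausible for practice plots, and all columns can share one range by generating a single 2-D array instead.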