Question 1

I have a dataset (Product_ID, date_time, Sold) that records the units sold per product on various dates. The dates are not consistent: the data covers 9 months, with 13 or more random days from each month. I have to segregate the data so that, for each product, I get how many units were sold on the 1-3, 4-7, 8-15 and >16 given days. How can I code this in Python using pandas and other packages?

PRODUCT_ID  DATE_LOCATION   Sold
0E4234      01-08-16 0:00   2
0E4234      02-08-16 0:00   7
0E4234      04-08-16 0:00   3
0E4234      08-08-16 0:00   1
0E4234      09-08-16 0:00   2
.
. (same product for 9 months sold data)
.
0G2342      02-08-16 0:00   1
0G2342      03-08-16 0:00   2
0G2342      06-08-16 0:00   1
0G2342      09-08-16 0:00   1
0G2342      11-08-16 0:00   3
0G2342      15-08-16 0:00   3
.
. (goes for 64 products each with 9 months of data)
.

I don't even know how to start coding this in Python. The output needed is:

PRODUCT_ID      Days     Sold
0E4234          1-3      9
                4-7      3
                8-15     16
                >16      (remaining values sum)
0G2342          1-3      3
                4-7      1
                8-15     7
                >16      (remaining values sum)
.
.(for 64 products)
.

Would be happy if at least someone posted a link to where to start

Question 2

You can first convert the dates to datetimes and get the day of the month with dt.day:

df['DATE_LOCATION'] = pd.to_datetime(df['DATE_LOCATION'], dayfirst=True)
days = df['DATE_LOCATION'].dt.day
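
If you would rather not rely on dayfirst=True, a minimal sketch with an explicit format string could look like this. The format "%d-%m-%y %H:%M" is an assumption based on the sample rows in the question (day-month-two-digit-year, then hour:minute), so adjust it if the real file differs:

```python
import pandas as pd

# Three date strings copied from the sample data in the question.
raw = pd.Series(["01-08-16 0:00", "02-08-16 0:00", "15-08-16 0:00"])

# Assumed layout: day-month-2-digit-year, then hour:minute.
parsed = pd.to_datetime(raw, format="%d-%m-%y %H:%M")

# Day of the month for each row.
days = parsed.dt.day
print(days.tolist())
```

With this layout "01-08-16" is read as 1 August 2016 rather than 8 January 2016, which is what the day-based bins depend on.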

Then bin the days with cut:

rng = pd.cut(days, bins=[0,3,7,15,31], labels=['1-3', '4-7','8-15', '>=16'])
print (rng)
0      1-3
1      1-3
2      4-7
3     8-15
4     8-15
5      1-3
6      1-3
7      4-7
8     8-15
9     8-15
10    8-15
Name: DATE_LOCATION, dtype: category
Categories (4, object): [1-3 < 4-7 < 8-15 < >=16]
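
To see where the bin edges fall, here is a small sketch on hand-picked boundary days (the values are illustrative, not from the question's data). By default cut builds right-closed intervals, so bins=[0,3,7,15,31] means (0, 3], (3, 7], (7, 15] and (15, 31]:

```python
import pandas as pd

# Boundary days chosen to show which side of each edge they land on.
days = pd.Series([1, 3, 4, 7, 8, 15, 16, 31])
labels = ["1-3", "4-7", "8-15", ">=16"]

# Right-closed bins: day 3 goes to '1-3', day 16 goes to '>=16'.
rng = pd.cut(days, bins=[0, 3, 7, 15, 31], labels=labels)
print(rng.astype(str).tolist())
# ['1-3', '1-3', '4-7', '4-7', '8-15', '8-15', '>=16', '>=16']
```

Note that the question asks for ">16", which read literally would leave day 16 out of every bin. The last bin here starts at 16, on the assumption that this is what was meant.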

Then aggregate the sum by product and the binned Series:

out = df.groupby(["PRODUCT_ID", rng])['Sold'].sum()
print (out)
PRODUCT_ID  DATE_LOCATION
0E4234      1-3              9
            4-7              3
            8-15             3
0G2342      1-3              3
            4-7              1
            8-15             7
Name: Sold, dtype: int64
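
Putting the steps together, here is a rough end-to-end sketch on the sample rows from the question, reshaped to one row per product. The explicit date format and the helper column name "Days" are my assumptions. The bins are cast to plain strings so the result does not depend on how a given pandas version treats unobserved categories in groupby:

```python
import pandas as pd

# Sample rows copied from the question (first days of August 2016 only).
df = pd.DataFrame({
    "PRODUCT_ID": ["0E4234"] * 5 + ["0G2342"] * 6,
    "DATE_LOCATION": [
        "01-08-16 0:00", "02-08-16 0:00", "04-08-16 0:00",
        "08-08-16 0:00", "09-08-16 0:00",
        "02-08-16 0:00", "03-08-16 0:00", "06-08-16 0:00",
        "09-08-16 0:00", "11-08-16 0:00", "15-08-16 0:00",
    ],
    "Sold": [2, 7, 3, 1, 2, 1, 2, 1, 1, 3, 3],
})

# Assumed layout of the date strings: day-month-year hour:minute.
df["DATE_LOCATION"] = pd.to_datetime(df["DATE_LOCATION"], format="%d-%m-%y %H:%M")

labels = ["1-3", "4-7", "8-15", ">=16"]
# "Days" is a helper column name chosen for this sketch.
df["Days"] = pd.cut(
    df["DATE_LOCATION"].dt.day, bins=[0, 3, 7, 15, 31], labels=labels
).astype(str)

# Sum per product and bin, then pivot the bins into columns.
# reindex restores the label order and adds empty bins as 0.
wide = (
    df.groupby(["PRODUCT_ID", "Days"])["Sold"].sum()
      .unstack(fill_value=0)
      .reindex(columns=labels, fill_value=0)
)
print(wide)
```

The '>=16' column is all zeros here only because the sample has no rows after the 15th. On the full 9 months of data it would hold the remaining sum.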

If you also need the sums per year:

# group the original DataFrame (not the aggregated result above)
out_year = df.groupby([df['DATE_LOCATION'].dt.year.rename('YEAR'), "PRODUCT_ID", rng])['Sold'].sum()
print (out_year)
YEAR  PRODUCT_ID  DATE_LOCATION
2016  0E4234      1-3              9
                  4-7              3
                  8-15             3
      0G2342      1-3              3
                  4-7              1
                  8-15             7
Name: Sold, dtype: int64
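
The sample data only covers 2016, so the YEAR level is not very visible there. Here is a small sketch with made-up rows spanning two years (the 2017 rows are hypothetical, added only for illustration) to show how the extra grouping key separates them. observed=True keeps only the year/product/bin combinations that actually occur:

```python
import pandas as pd

# Hypothetical rows: two from 2016 (as in the question) and two invented for 2017.
df = pd.DataFrame({
    "PRODUCT_ID": ["0E4234"] * 4,
    "DATE_LOCATION": ["01-08-16 0:00", "02-08-16 0:00",
                      "01-08-17 0:00", "05-08-17 0:00"],
    "Sold": [2, 7, 4, 1],
})
# Assumed layout of the date strings: day-month-year hour:minute.
df["DATE_LOCATION"] = pd.to_datetime(df["DATE_LOCATION"], format="%d-%m-%y %H:%M")

rng = pd.cut(df["DATE_LOCATION"].dt.day, bins=[0, 3, 7, 15, 31],
             labels=["1-3", "4-7", "8-15", ">=16"])
year = df["DATE_LOCATION"].dt.year.rename("YEAR")

# Group by year, product and day bin; skip combinations with no rows.
by_year = df.groupby([year, "PRODUCT_ID", rng], observed=True)["Sold"].sum()
print(by_year)
```

The same idea should work with dt.month as an extra key if per-month totals are needed.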

