I am completely new to Python, pandas and programming in general, and I cannot figure out the following:
I have accessed a database with the help of pandas and put the data from the query into a dataframe, df. One of the columns contains birthdays, which can have the following forms:

- 01/25/1980 (string)
- 01/25 (string)
- None (NoneType)
Now, I would like to add a new column to df, which stores the ages of the people in the database. So I have done the following:

    import datetime

    def addAge(df):
        today = datetime.date.today()
        df["age"] = None
        for index, row in df.iterrows():
            if row["birthday"] is not None:
                if len(row["birthday"]) == 10:
                    birthday = row["birthday"]
                    birthdayDate = datetime.date(int(birthday[6:]), int(birthday[:2]), int(birthday[3:5]))
                    row["age"] = today.year - birthdayDate.year - ((today.month, today.day) < (birthdayDate.month, birthdayDate.day))
                    print row["birthday"], row["age"]  # this is just for testing

    addAge(df)
    print df

The line print row["birthday"], row["age"] correctly prints the birthdays and the ages. But when I call print df, the column age always contains "None". Could you guys explain to me what I have been doing wrong? Thanks!
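
A minimal, self-contained sketch of the situation described above, written for Python 3 with made-up sample data. The `id` column, the fixed `today` value, and the helper names `age_on`, `add_age_via_row` and `add_age_via_at` are assumptions for illustration only and are not part of the original code. The first function mirrors the loop from the question. The pandas documentation for `iterrows` notes that the iterator may return a copy rather than a view, so writing to `row` may have no effect on `df`; the second function therefore writes through `df.at` instead, as one possible workaround.

```python
from datetime import date

import pandas as pd


def age_on(birthday, today):
    """Return the age in whole years for an 'MM/DD/YYYY' string, else None."""
    # Anything that is not a 10-character string (missing value, 'MM/DD') has no year.
    if not isinstance(birthday, str) or len(birthday) != 10:
        return None
    born = date(int(birthday[6:]), int(birthday[:2]), int(birthday[3:5]))
    # Subtract one if this year's birthday has not happened yet.
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))


def add_age_via_row(df, today):
    """Mirrors the question: assigns to the row object yielded by iterrows()."""
    df["age"] = None
    for index, row in df.iterrows():
        row["age"] = age_on(row["birthday"], today)  # row may be a copy of the data


def add_age_via_at(df, today):
    """Possible workaround: assigns into the dataframe itself by index label."""
    df["age"] = None
    for index, row in df.iterrows():
        df.at[index, "age"] = age_on(row["birthday"], today)


# Hypothetical sample data covering the three birthday forms from the question.
sample = pd.DataFrame({"id": [1, 2, 3], "birthday": ["01/25/1980", "01/25", None]})
today = date(2020, 6, 1)  # fixed date so the result does not depend on when this runs

first = sample.copy()
add_age_via_row(first, today)
print(first)

second = sample.copy()
add_age_via_at(second, today)
print(second)
```

Comparing the two printed dataframes shows whether the assignment reached `df` in each variant. A loop-free alternative would be `df["age"] = df["birthday"].apply(lambda b: age_on(b, today))`, which is again only a sketch under the same assumptions.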