Pandas: Calculated Column Based On Values In One Column
I have columns like this in a CSV file (I load it using read_csv('fileA.csv', parse_dates=['ProcessA_Timestamp'])):

Item  ProcessA_Timestamp
'A'   2014-06-08 03:32:20
'B'   2014
Solution 1:
You can use the pandas groupby-apply combo. Group the dataframe by "Item" and apply a function that calculates the process time. Something like:
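import pandas as pd

df = pd.read_csv('fileA.csv', parse_dates=['ProcessA_Timestamp'])

def calc_process_time(group):
    # Timestamps for one item, in the order they appear in the file
    ts = group["ProcessA_Timestamp"].values
    if len(ts) == 1:
        return pd.NaT
    else:
        return ts[-1] - ts[0]  # last time - first time

# One process time per item
df.groupby("Item").apply(calc_process_time)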
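The groupby-apply call above gives one value per item. If the goal is a calculated column on the original rows, as the title suggests, one possible approach is groupby-transform, which returns a result aligned with the original index. The following is a minimal sketch under assumptions: the sample data, the helper name add_process_time, and the column name Process_Time are all hypothetical and not from the question. It also takes max minus min instead of last minus first, so it does not depend on the rows being sorted by time.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Item": ["A", "B", "A", "C", "B"],
    "ProcessA_Timestamp": pd.to_datetime([
        "2014-06-08 03:32:20",
        "2014-06-08 04:00:00",
        "2014-06-08 05:02:20",
        "2014-06-09 10:00:00",
        "2014-06-08 04:45:30",
    ]),
})

def add_process_time(frame):
    """Return a copy of frame with a per-item process time on every row."""
    grouped = frame.groupby("Item")["ProcessA_Timestamp"]
    # transform broadcasts each group's result back onto that group's rows
    span = grouped.transform("max") - grouped.transform("min")
    count = grouped.transform("count")
    out = frame.copy()
    # Items with a single timestamp get NaT, matching the apply version
    out["Process_Time"] = span.where(count > 1)
    return out

result = add_process_time(df)
print(result)
```

Every row of an item carries the same Process_Time, so the frame can be filtered or merged afterwards without a separate lookup step.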