Pandas: Calculated Column Based On Values In One Column

I have columns like this in a csv file (I load it using read_csv('fileA.csv', parse_dates=['ProcessA_Timestamp'])) Item ProcessA_Timestamp 'A' 2014-06-08 03:32:20 'B' 2014

Solution 1:

You can use the pandas groupby-apply combo. Group the dataframe by "Item" and apply a function that calculates the process time. Something like:

import pandas as pd

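def calc_process_time(group):
    # groupby().apply() passes all rows of one Item as a DataFrame, not a single row
    ts = group["ProcessA_Timestamp"].values
    if len(ts) == 1:
        # only one timestamp for this Item, so there is no duration to compute
        return pd.NaT
    else:
        return ts[-1] - ts[0]  # last time - first time (assumes rows are in time order)

df.groupby("Item").apply(calc_process_time)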
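For a quick way to try this out, here is a minimal self-contained sketch. The sample rows are made up for illustration (only the first timestamp appears in the question), and the names `grouped`, `process_time` and the `ProcessTime` column are assumptions, not part of the original post. It swaps the custom function for the built-in `max()` and `min()` aggregations, so it does not depend on the rows being sorted by time, and it maps the result back onto the original rows to get a calculated column.

```python
import pandas as pd

# Hypothetical sample data; in the question this comes from read_csv(...)
df = pd.DataFrame(
    {
        "Item": ["A", "B", "A", "C", "B"],
        "ProcessA_Timestamp": pd.to_datetime(
            [
                "2014-06-08 03:32:20",
                "2014-06-08 03:35:00",
                "2014-06-08 03:40:20",
                "2014-06-08 03:41:00",
                "2014-06-08 04:05:00",
            ]
        ),
    }
)

grouped = df.groupby("Item")["ProcessA_Timestamp"]

# Latest minus earliest timestamp per Item
process_time = grouped.max() - grouped.min()

# Items that appear only once get NaT instead of a zero duration
process_time = process_time.where(grouped.count() > 1)

# Broadcast the per-Item result back to every row as a new column
df["ProcessTime"] = df["Item"].map(process_time)

print(df)
```

Whether a single-occurrence Item should show NaT or a zero duration depends on what the column is used for; drop the `where` line to keep zeros.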
