Combine Or Iterate Pandas Rows On Specific Columns

I am struggling to figure this row by row iteration out in pandas. I have a dataset that contains chat conversations between 2 parties. I would like to combine the dataset to row b

Solution 1:

You could group by runs of consecutive rows with the same line_by, then use agg to take the last timestamp in each run and concatenate line_text with ''.join.

In [1918]: (df.groupby((df.line_by != df.line_by.shift()).cumsum(), as_index=False)
      ...:    .agg({'id': 'first', 'timestamp': 'last',
      ...:          'line_by': 'first', 'line_text': ''.join}))
Out[1918]:
  timestamp           line_text    id  line_by
0   02:54.3           TextLine1  1234  Person1
1   03:47.0  TextLine2TextLine3  1234  Person2
2   05:46.2  TextLine4TextLine5  1234  Person1
3   06:44.5           TextLine6  9876  Person2
4   07:27.6           TextLine7  9876  Person1
5   10:20.3  TextLine8TextLine9  9876  Person2
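For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt from the sample data in this answer. The variable names `run_id` and `combined` are my own, and the use of named aggregation with `reset_index(drop=True)` is a substitution for the dict-style `agg` with `as_index=False`, chosen so the result does not depend on how `as_index=False` treats a grouping key that is not a column.

```python
import pandas as pd

# Sample data copied from the answer's `df`.
df = pd.DataFrame({
    "id": [1234, 1234, 1234, 1234, 1234, 9876, 9876, 9876, 9876],
    "timestamp": ["02:54.3", "03:23.8", "03:47.0", "04:46.8", "05:46.2",
                  "06:44.5", "07:27.6", "08:17.5", "10:20.3"],
    "line_by": ["Person1", "Person2", "Person2", "Person1", "Person1",
                "Person2", "Person1", "Person2", "Person2"],
    "line_text": ["TextLine1", "TextLine2", "TextLine3", "TextLine4", "TextLine5",
                  "TextLine6", "TextLine7", "TextLine8", "TextLine9"],
})

# A new run starts whenever the speaker differs from the previous row.
# `run_id` is a hypothetical name; renaming avoids clashing with the line_by column.
run_id = (df["line_by"] != df["line_by"].shift()).cumsum().rename("run_id")

combined = (
    df.groupby(run_id)
      .agg(
          id=("id", "first"),                    # conversation id of the run
          timestamp=("timestamp", "last"),       # timestamp of the run's last row
          line_by=("line_by", "first"),          # speaker of the run
          line_text=("line_text", "".join),      # concatenate the run's text
      )
      .reset_index(drop=True)
)

print(combined)
```

Note that 'last' picks the timestamp of the final row in each run, so it is only the latest timestamp if the rows are already in chronological order.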

Details

In [1919]: (df.line_by != df.line_by.shift()).cumsum()
Out[1919]:
0    1
1    2
2    2
3    3
4    3
5    4
6    5
7    6
8    6
Name: line_by, dtype: int32

In [1920]: df
Out[1920]:
     id timestamp  line_by  line_text
0  1234   02:54.3  Person1  TextLine1
1  1234   03:23.8  Person2  TextLine2
2  1234   03:47.0  Person2  TextLine3
3  1234   04:46.8  Person1  TextLine4
4  1234   05:46.2  Person1  TextLine5
5  9876   06:44.5  Person2  TextLine6
6  9876   07:27.6  Person1  TextLine7
7  9876   08:17.5  Person2  TextLine8
8  9876   10:20.3  Person2  TextLine9
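To see why the shift/cumsum expression labels consecutive runs, it may help to split it into its three steps. This is only an illustrative sketch on the line_by column from the sample data; the intermediate names `previous`, `is_new_run`, `run_id` and `steps` are hypothetical and do not appear in the original answer.

```python
import pandas as pd

# The line_by column from the sample data.
line_by = pd.Series(
    ["Person1", "Person2", "Person2", "Person1", "Person1",
     "Person2", "Person1", "Person2", "Person2"],
    name="line_by",
)

previous = line_by.shift()        # speaker on the row above (NaN for the first row)
is_new_run = line_by != previous  # True wherever the speaker changes
run_id = is_new_run.cumsum()      # counting the changes gives one label per run

# Put the steps side by side for inspection.
steps = pd.DataFrame({
    "line_by": line_by,
    "previous": previous,
    "is_new_run": is_new_run,
    "run_id": run_id,
})
print(steps)
```

One caveat: the key looks only at line_by. If one conversation ended and the next began with the same speaker, those rows would fall into a single run. In that case the id column could be added to the change test as well, for example by OR-ing in `df.id != df.id.shift()`. That situation does not occur in the sample data.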
