Combine Or Iterate Pandas Rows On Specific Columns
I am struggling to figure this row by row iteration out in pandas. I have a dataset that contains chat conversations between 2 parties. I would like to combine the dataset to row b
Solution 1:
You could groupby on runs of consecutive line_by values, then use agg to take the last timestamp of each run and ''.join the line_text:
In [1918]: (df.groupby((df.line_by != df.line_by.shift()).cumsum(), as_index=False)
              .agg({'id': 'first', 'timestamp': 'last', 'line_by': 'first', 'line_text': ''.join}))
Out[1918]:
  timestamp           line_text    id  line_by
0   02:54.3           TextLine1  1234  Person1
1   03:47.0  TextLine2TextLine3  1234  Person2
2   05:46.2  TextLine4TextLine5  1234  Person1
3   06:44.5           TextLine6  9876  Person2
4   07:27.6           TextLine7  9876  Person1
5   10:20.3  TextLine8TextLine9  9876  Person2
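For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same idea. The frame is rebuilt from the sample data in this answer; the names run_id and combined are just illustrative, and reset_index(drop=True) is used in place of as_index=False so the helper key never ends up in the result.

```python
import pandas as pd

# Sample data rebuilt from the frame shown in this answer.
df = pd.DataFrame({
    "id": [1234] * 5 + [9876] * 4,
    "timestamp": ["02:54.3", "03:23.8", "03:47.0", "04:46.8", "05:46.2",
                  "06:44.5", "07:27.6", "08:17.5", "10:20.3"],
    "line_by": ["Person1", "Person2", "Person2", "Person1", "Person1",
                "Person2", "Person1", "Person2", "Person2"],
    "line_text": [f"TextLine{i}" for i in range(1, 10)],
})

# A new run starts wherever line_by differs from the previous row;
# the cumulative sum turns those change points into run numbers.
run_id = (df["line_by"] != df["line_by"].shift()).cumsum().rename("run")

combined = (
    df.groupby(run_id)
      .agg({"id": "first", "timestamp": "last",
            "line_by": "first", "line_text": "".join})
      .reset_index(drop=True)  # drop the helper run number
)
print(combined)
```

Note that 'last' picks the final row of each run in the frame's current order, so it only gives the latest timestamp if the rows are already sorted chronologically.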
Details
In [1919]: (df.line_by != df.line_by.shift()).cumsum()
Out[1919]:
0    1
1    2
2    2
3    3
4    3
5    4
6    5
7    6
8    6
Name: line_by, dtype: int32

In [1920]: df
Out[1920]:
     id timestamp  line_by  line_text
0  1234   02:54.3  Person1  TextLine1
1  1234   03:23.8  Person2  TextLine2
2  1234   03:47.0  Person2  TextLine3
3  1234   04:46.8  Person1  TextLine4
4  1234   05:46.2  Person1  TextLine5
5  9876   06:44.5  Person2  TextLine6
6  9876   07:27.6  Person1  TextLine7
7  9876   08:17.5  Person2  TextLine8
8  9876   10:20.3  Person2  TextLine9
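One thing to watch: the key above only looks at line_by. In the sample data the speaker happens to change at the boundary between id 1234 and id 9876, so the two conversations stay separate. If one conversation could end and the next begin with the same speaker, their lines would be merged. A possible guard, sketched here on a small hypothetical frame (not the data from the question), is to start a new run when either the speaker or the id changes:

```python
import pandas as pd

# Hypothetical frame: conversation 1 ends and conversation 2 begins
# with a line by Person1.
df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "line_by": ["Person2", "Person1", "Person1", "Person2"],
    "line_text": ["a", "b", "c", "d"],
})

speaker_changed = df["line_by"] != df["line_by"].shift()
conversation_changed = df["id"] != df["id"].shift()

# Start a new run when either the speaker or the conversation changes.
run_id = (speaker_changed | conversation_changed).cumsum().rename("run")

combined = (
    df.groupby(run_id)
      .agg({"id": "first", "line_by": "first", "line_text": "".join})
      .reset_index(drop=True)
)
print(combined)
```

With the speaker-only key, rows "b" and "c" would land in the same run; with the combined key they stay in separate rows.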