Form Groups Of Individuals Python (pandas)
I have a data set of the following form:
import pandas as pd
d1 = {'Subject': ['Subject1','Subject1','Subject1','Subject2','Subject2','Subject2','Subject3','Subject3','Subject3','S
Solution 1:
Use:
from itertools import combinations
import pandas as pd

# d1 must be a DataFrame at this point.
# Empty strings become NaN, so groupby skips those rows.
d1['Category'] = d1['Category'].mask(d1['Category'] == '')

# All pairs of subjects within each (Event, Category) group.
L = [(i[0], i[1], y[0], y[1])
     for i, x in d1.groupby(['Event','Category'])['Subject']
     for y in combinations(x, 2)]
df = pd.DataFrame(L, columns=['Event','Category','Match1','Match2'])

# Variables of the first subject in each pair.
df1 = (df.rename(columns={'Match1':'Subject'})
         .merge(d1, on=['Event','Category','Subject'], how='left')
         .iloc[:, 4:]
         .add_suffix('.1'))

# Variables of the second subject in each pair.
df2 = (df.rename(columns={'Match2':'Subject'})
         .merge(d1, on=['Event','Category','Subject'], how='left')
         .iloc[:, 4:]
         .add_suffix('.2'))

fin = pd.concat([df, df1, df2], axis=1)
print(fin)
   Event  Category    Match1    Match2  Variable1.1  Variable2.1  Variable3.1  \
0      1         1  Subject1  Subject4            1           12           -6
1      1         2  Subject2  Subject3            4            9           -3
2      2         1  Subject1  Subject2            2           11           -5
3      2         1  Subject1  Subject4            2           11           -5
4      2         1  Subject2  Subject4            5            8           -4
5      3         2  Subject1  Subject2            3           10           -4
6      3         2  Subject1  Subject3            3           10           -4
7      3         2  Subject2  Subject3            6            7           -3

   Variable1.2  Variable2.2  Variable3.2
0           10            3            1
1            7            6           -2
2            5            8           -4
3           11            2            2
4           11            2            2
5            6            7           -3
6            9            4            0
7            9            4            0
Explanation:
- Replace empty strings with NaN using mask.
- groupby silently drops rows whose group key is NaN, so those rows never form pairs.
- Create a DataFrame from a list comprehension that flattens all combinations of length 2 of column Subject within each group defined by columns Event and Category.
- Join the variable columns twice with merge (left join), once per pair member; filter out the first 4 columns by position with iloc, and use add_suffix or add_prefix to avoid duplicated column names.
- Last, concat all 3 DataFrames together.
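
The steps above can be tried end to end on a small frame. This is a minimal sketch, not the data from the question: the frame toy, its values, the single Variable1 column, and the helper member_values are all hypothetical names made up for illustration. It also selects the variable columns by name rather than by position with iloc, which is a swapped-in variation on the answer's approach.

```python
from itertools import combinations

import pandas as pd

# Hypothetical toy data, NOT the data set from the question.
toy = pd.DataFrame({
    'Subject':   ['A', 'B', 'C', 'A', 'B', 'C'],
    'Event':     [1, 1, 1, 2, 2, 2],
    'Category':  ['1', '1', '', '2', '2', '2'],
    'Variable1': [10, 20, 30, 40, 50, 60],
})
key_cols = ['Event', 'Category', 'Subject']
value_cols = ['Variable1']

# Step 1: empty strings become NaN, so groupby skips those rows.
toy['Category'] = toy['Category'].mask(toy['Category'] == '')

# Step 2: one row per unordered pair of subjects in each (Event, Category) group.
pairs = pd.DataFrame(
    [(event, cat, a, b)
     for (event, cat), subjects in toy.groupby(['Event', 'Category'])['Subject']
     for a, b in combinations(subjects, 2)],
    columns=['Event', 'Category', 'Match1', 'Match2'],
)


def member_values(match_col, suffix):
    """Look up the variables of one pair member (hypothetical helper)."""
    merged = pairs.rename(columns={match_col: 'Subject'}).merge(
        toy, on=key_cols, how='left')
    # Select by name instead of iloc, then suffix to keep names unique.
    return merged[value_cols].add_suffix(suffix)


# Step 3: put the pairs and both members' variables side by side.
fin = pd.concat(
    [pairs, member_values('Match1', '.1'), member_values('Match2', '.2')],
    axis=1,
)
print(fin)
```

Subject C has an empty Category in Event 1, so it forms no pair there; Event 2 has three subjects in one category and yields three pairs.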