
Overwrite Columns In Dataframes Of Different Sizes Pandas

I have the following two DataFrames:

df1 = pd.DataFrame({'ids':[1,2,3,4,5],'cost':[0,0,1,1,0]})
df2 = pd.DataFrame({'ids':[1,5],'cost':[1,4]})

And I want to update the values of df1

Solution 1:

You could do this with a left merge:

merged = pd.merge(df1, df2, on='ids', how='left')
merged['cost'] = merged.cost_x.where(merged.cost_y.isnull(), merged['cost_y'])
result = merged[['ids','cost']]

However, you can avoid the merge (and likely get better performance) if you set ids as the index; pandas can then use it to align the results for you:

df1 = df1.set_index('ids')
df2 = df2.set_index('ids')

df1.cost.where(~df1.index.isin(df2.index), df2.cost)
ids
1    1.0
2    0.0
3    1.0
4    1.0
5    4.0
Name: cost, dtype: float64
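For reference, here is the index-based approach as one self-contained sketch. The names left, right, updated and result are only illustrative, and the cast back to int64 is my own addition: it assumes no NaN remains after the update, which holds for this data.

```python
import pandas as pd

df1 = pd.DataFrame({'ids': [1, 2, 3, 4, 5], 'cost': [0, 0, 1, 1, 0]})
df2 = pd.DataFrame({'ids': [1, 5], 'cost': [1, 4]})

# Index both frames by ids so pandas aligns rows by label, not by position.
left = df1.set_index('ids')
right = df2.set_index('ids')

# Keep df1's cost where the id is absent from df2; otherwise take df2's cost.
updated = left['cost'].where(~left.index.isin(right.index), right['cost'])

# Aligning df2's cost to df1's index introduces NaN for the missing ids,
# which is why the result above is float. Cast back once no NaN remains.
result = updated.astype('int64').reset_index()
print(result)
```

If df2 could itself contain NaN costs, the cast would fail, so in that case it is probably safer to leave the column as float.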

Solution 2:

You can use set_index and combine_first to give precedence to the values in df2:

df_result = df2.set_index('ids').combine_first(df1.set_index('ids'))
df_result.reset_index()

You get

   ids  cost
0    1     1
1    2     0
2    3     1
3    4     1
4    5     4
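One thing to be aware of: combine_first returns the union of both indexes, so an id that exists only in df2 would show up as a new row. A small sketch of that behaviour, where df2_extra and the id 6 are made up for illustration and are not part of the question:

```python
import pandas as pd

df1 = pd.DataFrame({'ids': [1, 2, 3, 4, 5], 'cost': [0, 0, 1, 1, 0]})
# Hypothetical variant of df2 with an extra id (6) that df1 does not contain.
df2_extra = pd.DataFrame({'ids': [1, 5, 6], 'cost': [1, 4, 9]})

# Values from df2_extra win; df1 only fills the ids that df2_extra lacks.
combined = (
    df2_extra.set_index('ids')
    .combine_first(df1.set_index('ids'))
    .reset_index()
)

# If the extra rows are unwanted, keep only the ids that were in df1.
only_df1_ids = combined[combined['ids'].isin(df1['ids'])].reset_index(drop=True)
print(combined)
print(only_df1_ids)
```

With the original df2 from the question every id is already in df1, so the filter makes no difference there.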

Solution 3:

Another way is to use a temporary merged DataFrame, which you can discard after use.

import pandas as pd

df1 = pd.DataFrame({'ids':[1,2,3,4,5],'cost':[0,0,1,1,0]})
df2 = pd.DataFrame({'ids':[1,5],'cost':[1,4]})

dftemp = df1.merge(df2,on='ids',how='left', suffixes=('','_r'))
print(dftemp)

# Overwrite cost only where df2 supplied a value (cost_r is not NaN)
df1.loc[~pd.isnull(dftemp.cost_r), 'cost'] = dftemp.loc[~pd.isnull(dftemp.cost_r), 'cost_r']
del dftemp

df1 = df1[['ids','cost']]
print(df1)


OUTPUT-----:
dftemp:
   cost  ids  cost_r
0     0    1     1.0
1     0    2     NaN
2     1    3     NaN
3     1    4     NaN
4     0    5     4.0

df1:
   ids  cost
0    1   1.0
1    2   0.0
2    3   1.0
3    4   1.0
4    5   4.0
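A more compact variant of the same temporary-merge idea, as a rough sketch. It uses fillna instead of the .loc assignment, which is my substitution and not part of the answer above. It assumes the ids in df2 are unique, so that the left merge returns exactly one row per row of df1, in df1's order.

```python
import pandas as pd

df1 = pd.DataFrame({'ids': [1, 2, 3, 4, 5], 'cost': [0, 0, 1, 1, 0]})
df2 = pd.DataFrame({'ids': [1, 5], 'cost': [1, 4]})

# Left merge keeps every row of df1; cost_r is NaN where df2 has no match.
tmp = df1.merge(df2, on='ids', how='left', suffixes=('', '_r'))

# Prefer df2's value, fall back to df1's, then assign back by position.
# The int64 cast assumes no NaN is left after the fallback.
df1['cost'] = tmp['cost_r'].fillna(tmp['cost']).astype('int64').to_numpy()
print(df1)
```

Assigning through to_numpy() sidesteps index alignment, so this also works when df1 does not have the default 0..n-1 index.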
