
Mapping Values From One Dataframe To Another

I am trying to figure out a fast and clean way to map values from one DataFrame, A, to another. Let's say I have a DataFrame like this one: C1 C2 C3 C4 C5 1 a b c a 2

Solution 1:

Another alternative is map. Although it requires looping over columns, if I didn't mess up the tests, it is still faster than replace:

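import numpy as np
import pandas as pd

# 1000 x 1000 frame of random codes; 'f' has no entry in B, so it must survive unchanged
A = pd.DataFrame(np.random.choice(list("abcdef"), (1000, 1000)))
B = pd.DataFrame({'Code': ['a', 'b', 'c', 'd', 'e'],
                  'Value': ["'House'", "'Bike'", "'Lamp'", "'Window'", "'Car'"]})
# Turn B into a lookup Series indexed by Code
B = B.set_index("Code")["Value"]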

%timeit A.replace(B)
1 loop, best of 3: 970 ms per loop

C = pd.DataFrame()

%%timeit
for col in A:
    C[col] = A[col].map(B).fillna(A[col])
1 loop, best of 3: 586 ms per loop
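
To make the map/fillna step easier to check by hand, here is a minimal sketch of the same idea on a small frame. The helper name map_with_fallback and the variables A_small, codes and lookup are my own, introduced only for illustration. It assumes the lookup is a Series indexed by code, as built above.

```python
import pandas as pd


def map_with_fallback(frame, lookup):
    """Map every column of `frame` through `lookup`, keeping unmatched values."""
    # Series.map yields NaN for values that are not in the lookup index;
    # fillna(frame[col]) puts the original value back in those positions.
    return pd.DataFrame(
        {col: frame[col].map(lookup).fillna(frame[col]) for col in frame}
    )


# Small illustrative inputs (hypothetical names, same codes as in the answer)
A_small = pd.DataFrame({'C1': ['a', 'd', 'a', 'b'],
                        'C2': ['b', 'a', 'c', 'e'],
                        'C3': ['c', 'e', '', 'e']})
codes = pd.DataFrame({'Code': ['a', 'b', 'c', 'd', 'e'],
                      'Value': ["'House'", "'Bike'", "'Lamp'", "'Window'", "'Car'"]})
lookup = codes.set_index('Code')['Value']

result = map_with_fallback(A_small, lookup)
print(result)
```

The empty string in C3 has no matching code, so it passes through untouched. That is the job of the fillna call; without it, unmatched cells would become NaN.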

Solution 2:

You could use replace:

A.replace(B.set_index('Code')['Value'])

import pandas as pd
A = pd.DataFrame(
    {'C1': ['a', 'd', 'a', 'b'],
     'C2': ['b', 'a', 'c', 'e'],
     'C3': ['c', 'e', '', 'e'],
     'C4': ['a', 'b', '', ''],
     'C5': ['', 'a', '', '']})
B = pd.DataFrame({'Code': ['a', 'b', 'c', 'd', 'e'],
                  'Value': ["'House'", "'Bike'", "'Lamp'", "'Window'", "'Car'"]})
print(A.replace(B.set_index('Code')['Value']))

yields

         C1       C2      C3       C4       C5
0   'House'   'Bike'  'Lamp'  'House'
1  'Window'  'House'   'Car'   'Bike'  'House'
2   'House'   'Lamp'
3    'Bike'    'Car'   'Car'
