
Pandas - Create A New Column With Apply For Float Indexed Dataframe

I'm using pandas 13.0 and I'm trying to create a new column using apply() and a function named foo(). My dataframe is as follows: df = pandas.DataFrame({ 'a':[ 0.0, 0.1, 0.

Solution 1:

The problem here is that you are trying to process the data row-wise, but you are passing whole Series as arguments, which is wrong. You could do it this way instead:

In [7]:

df['d'] = df.apply(lambda row: foo(row['b'], row['c']), axis=1)
df
Out[7]:
       a   b  c    d
a                   
0.0  0.0  10  1   10
0.1  0.1  20  2   40
0.2  0.2  30  3   90
0.3  0.3  40  4  160

A better way would be to just call your function directly on the columns:

In [8]:

df['d'] = foo(df['b'], df['c'])
df
Out[8]:
       a   b  c    d
a                   
0.0  0.0  10  1   10
0.1  0.1  20  2   40
0.2  0.2  30  3   90
0.3  0.3  40  4  160

The advantage of the above method is that it is vectorised: it performs the operation on the whole Series at once rather than one row at a time.
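For reference, here is a minimal self-contained sketch of the two approaches side by side. The body of foo() and the full DataFrame are not shown in the question as quoted here, so this assumes foo(x, y) returns x * y (which reproduces the d column in the output above) and that the frame is indexed by column a while keeping a as a column.

```python
import pandas as pd


def foo(x, y):
    # Assumed body: the original foo() is not shown. x * y matches the
    # 'd' values in the output above (10, 40, 90, 160).
    return x * y


# Assumed reconstruction of the frame from the printed output.
df = pd.DataFrame({
    'a': [0.0, 0.1, 0.2, 0.3],
    'b': [10, 20, 30, 40],
    'c': [1, 2, 3, 4],
}).set_index('a', drop=False)  # float index named 'a', column 'a' kept

# Row-wise: foo() is called once per row with scalar values.
row_wise = df.apply(lambda row: foo(row['b'], row['c']), axis=1)

# Vectorised: foo() is called once with two whole Series.
vectorised = foo(df['b'], df['c'])

# The values agree; the dtypes may differ because each row passed to
# apply() is upcast to a common dtype (float here).
print((row_wise == vectorised).all())
```

The vectorised call only works because the assumed foo() uses an operator that Series supports element-wise. A function with Python-level branching on its arguments would need rewriting before it could be called on whole columns.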

In [15]:

%timeit df['d'] = df.apply(lambda row: foo(row['b'], row['c']), axis=1)
%timeit df['d'] = foo(df['b'], df['c'])
1000 loops, best of 3: 270 µs per loop
1000 loops, best of 3: 214 µs per loop

There is not much difference on this 4-row frame. Now compare with a 400,000 row df:

In [18]:

%timeit df['d'] = df.apply(lambda row: foo(row['b'], row['c']), axis=1)
%timeit df['d'] = foo(df['b'], df['c'])
1 loops, best of 3: 5.84 s per loop
100 loops, best of 3: 8.68 ms per loop

So here you see a speed-up of roughly 670x (5.84 s versus 8.68 ms).
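%timeit is an IPython magic, so outside IPython a comparable measurement could be sketched with the standard timeit module. This is only an illustration: foo() is again assumed to be x * y, the helper name time_both and the random test data are hypothetical, and the absolute timings will differ from the ones quoted above depending on machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def foo(x, y):
    # Assumed body; the original foo() is not shown in the question.
    return x * y


def time_both(n_rows, repeats=3):
    """Return best-of-`repeats` seconds for (row-wise apply, vectorised call)."""
    rng = np.random.default_rng(0)
    # Hypothetical test data: two integer columns of length n_rows.
    df = pd.DataFrame({
        'b': rng.integers(1, 100, size=n_rows),
        'c': rng.integers(1, 100, size=n_rows),
    })
    t_apply = min(timeit.repeat(
        lambda: df.apply(lambda row: foo(row['b'], row['c']), axis=1),
        number=1, repeat=repeats))
    t_vec = min(timeit.repeat(
        lambda: foo(df['b'], df['c']),
        number=1, repeat=repeats))
    return t_apply, t_vec


if __name__ == "__main__":
    # A smaller frame than the 400,000 rows above keeps the run short.
    t_apply, t_vec = time_both(10_000)
    print(f"apply: {t_apply:.4f} s, vectorised: {t_vec:.6f} s, "
          f"ratio: {t_apply / t_vec:.0f}x")
```

Taking the minimum over a few repeats mirrors the "best of 3" figure that %timeit reported in the session above.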
