Pandas - Create A New Column With Apply For Float Indexed Dataframe
I'm using pandas 0.13.0 and I'm trying to create a new column using apply() and a function named foo(). My dataframe is as follows: df = pandas.DataFrame({ 'a':[ 0.0, 0.1, 0.
Solution 1:
The problem here is that you are trying to process the data row-wise, but you are passing whole Series as arguments, which is wrong. You could do it this way:
In [7]:
df['d'] = df.apply(lambda row: foo(row['b'], row['c']), axis=1)
df
Out[7]:
a b c d
a
0.0 0.0 10 1 10
0.1 0.1 20 2 40
0.2 0.2 30 3 90
0.3 0.3 40 4 160
A better way would be to just call your function directly:
In [8]:
df['d'] = foo(df['b'], df['c'])
df
Out[8]:
a b c d
a
0.0 0.0 10 1 10
0.1 0.1 20 2 40
0.2 0.2 30 3 90
0.3 0.3 40 4 160
The advantage with the above method is that it is vectorised and will perform the operation on the whole series rather than a row at a time.
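As a self-contained sketch of the two approaches: the definition of foo is not shown here, so the version below assumes it simply multiplies its two arguments, which matches the d column in the output. The b and c values are read off the printed frame, and the way the float index is built with set_index is also an assumption.

```python
import pandas as pd


# Assumption: foo multiplies its arguments (inferred from the printed d column).
def foo(x, y):
    return x * y


# Values read off the output above; the float column 'a' is also used as the index.
df = pd.DataFrame({'a': [0.0, 0.1, 0.2, 0.3],
                   'b': [10, 20, 30, 40],
                   'c': [1, 2, 3, 4]})
df = df.set_index('a', drop=False)

# Row-wise: the lambda is called once per row and receives that row as a Series.
row_wise = df.apply(lambda row: foo(row['b'], row['c']), axis=1)

# Vectorised: foo receives whole columns and multiplies them element-wise.
vectorised = foo(df['b'], df['c'])

df['d'] = vectorised
print(df)
```

Both variants give the same values. The row-wise result may come back as floats, because each row mixes the float column a with the integer columns.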
In [15]:
%timeit df['d'] = df.apply(lambda row: foo(row['b'], row['c']), axis=1)
%timeit df['d'] = foo(df['b'], df['c'])
1000 loops, best of 3: 270 µs per loop
1000 loops, best of 3: 214 µs per loop
There is not much difference here. Now compare with a 400,000-row df:
In [18]:
%timeit df['d'] = df.apply(lambda row: foo(row['b'], row['c']), axis=1)
%timeit df['d'] = foo(df['b'], df['c'])
1 loops, best of 3: 5.84 s per loop
100 loops, best of 3: 8.68 ms per loop
So you see a roughly 670x speed-up here.
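To repeat this comparison outside IPython, something like the following sketch could be used. It relies on the standard timeit module instead of %timeit. The foo definition (a plain multiplication) and the make_frame helper are assumptions, since the answer does not show how the large frame was built. Absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd


# Assumption: foo multiplies its arguments, as the d column in the output suggests.
def foo(x, y):
    return x * y


def make_frame(n_rows):
    # Hypothetical helper: builds a float-indexed frame shaped like the small example.
    a = np.arange(n_rows) / 10.0
    return pd.DataFrame({'a': a,
                         'b': np.arange(n_rows) * 10,
                         'c': np.arange(n_rows)},
                        index=pd.Index(a, name='a'))


def time_both(df, repeats=3):
    # Best-of-N wall-clock time in seconds for each approach.
    row_wise = min(timeit.repeat(
        lambda: df.apply(lambda row: foo(row['b'], row['c']), axis=1),
        number=1, repeat=repeats))
    vectorised = min(timeit.repeat(
        lambda: foo(df['b'], df['c']),
        number=1, repeat=repeats))
    return row_wise, vectorised


big = make_frame(20_000)  # raise towards 400_000 to approach the size used above
row_wise_s, vectorised_s = time_both(big)
print(f"row-wise: {row_wise_s:.4f} s, vectorised: {vectorised_s:.6f} s")
```

The frame is kept small so the sketch finishes quickly. The gap between the two timings should widen as n_rows grows, because the row-wise version pays Python-level overhead for every row.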