Splitting Dataframe Into Two And Using Tilde ~ As Variable
Solution 1:
For your limited use case, there is limited benefit in what you are requesting.
GroupBy
Your real problem, however, is the number of variables you are having to create. You can halve them via GroupBy and a calculated grouper:
import pandas as pd

df = pd.DataFrame({'teste': ['Place', 'Null', 'Something', 'Place'],
                   'value': [1, 2, 3, 4]})
dfs = dict(tuple(df.groupby(df['teste'] == 'Place')))
{False:        teste  value
 1       Null      2
 2  Something      3,
 True:    teste  value
 0  Place      1
 3  Place      4}
Then access your dataframes via dfs[0] and dfs[1], since False == 0 and True == 1. The benefit of this approach is that you no longer need to create new variables unnecessarily, and your dataframes stay organized because they live in the same dictionary.
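To make the dictionary access concrete, here is a minimal sketch built on the same sample data. The names place, other and same_frame are only illustrative and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({'teste': ['Place', 'Null', 'Something', 'Place'],
                   'value': [1, 2, 3, 4]})

# Group on the boolean mask; the keys of the resulting dict are False and True.
dfs = dict(tuple(df.groupby(df['teste'] == 'Place')))

place = dfs[True]    # rows where teste == 'Place'
other = dfs[False]   # all remaining rows

# Integer keys reach the same entries because True == 1 and False == 0.
same_frame = dfs[1].equals(place)
```

One caveat: GroupBy only produces keys for groups that actually occur, so if no row (or every row) matches 'Place', one of the two keys will be missing and a lookup such as dfs[True] raises a KeyError. Using dfs.get(True) is one way to guard against that.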
Function dispatching
Your precise requirement can be met via the operator module and an identity function:
from operator import invert
tilde = [invert, lambda x: x]
mask = df.teste == 'Place' # don't repeat mask calculations unnecessarily
df1 = df[tilde[0](mask)]
df2 = df[tilde[1](mask)]
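If you would rather pick the function with a flag than with a list index, the same dispatch can be wrapped in a small helper. This is only a sketch: select and its negate parameter are hypothetical names, not part of pandas or of the question.

```python
from operator import invert

import pandas as pd

df = pd.DataFrame({'teste': ['Place', 'Null', 'Something', 'Place'],
                   'value': [1, 2, 3, 4]})
mask = df['teste'] == 'Place'

# operator.invert(mask) is the function form of ~mask.
inverted_matches_tilde = invert(mask).equals(~mask)


def select(frame, mask, negate):
    """Hypothetical helper: rows where mask holds, or its complement if negate is True."""
    func = invert if negate else (lambda x: x)
    return frame[func(mask)]


matching = select(df, mask, negate=False)      # rows where teste == 'Place'
non_matching = select(df, mask, negate=True)   # all other rows
```

Either way the mask is computed once and only the choice between invert and the identity function varies.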
Sequence unpacking
If your intention is to use one line, use sequence unpacking:
df1, df2 = (df[func(mask)] for func in tilde)
Note you can replicate the GroupBy result via:
dfs = dict(enumerate(df[func(mask)] for func in tilde))
But this is verbose and convoluted. Stick with the GroupBy solution.
Solution 2:
You could possibly condense your code a little by defining your tests and then iterating over those. Let me illustrate:
tests = ["Place", "Foo", "Bar"]
for t in tests:
    # not sure what you are doing exactly, just copied it;
    # the two results get separate names so the second does not overwrite the first
    df_without = df[~(df.teste.isin([t]))]
    df_with = df[(df.teste.isin([t]))]
That way, you only have two lines doing the actual work, and adding another test to the list saves you from duplicating code. No idea if this is what you want, though.
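If the results of every test need to be kept rather than overwritten on each pass, one option is to collect them in a dictionary keyed by the test value. This is a rough sketch that assumes the sample data from Solution 1; splits, without_place and with_place are made-up names.

```python
import pandas as pd

df = pd.DataFrame({'teste': ['Place', 'Null', 'Something', 'Place'],
                   'value': [1, 2, 3, 4]})
tests = ["Place", "Foo", "Bar"]

splits = {}
for t in tests:
    mask = df['teste'].isin([t])         # compute each mask only once
    splits[t] = (df[~mask], df[mask])    # (rows without t, rows with t)

without_place, with_place = splits["Place"]
```

Tests that match nothing, such as "Foo" here, simply yield an empty dataframe for the matching half.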