
How To Filter Rows Based On Unix Based Regular Expressions Passed As An Input Argument To A Data Frame Column

I have the following data frame:

import numpy as np
import pandas as pd
import os
csvFile = 'csv.csv'
csvDelim = '@@@'
df = pd.read_csv(csvFile, engine='python', index_col=False, d

Solution 1:

I believe you need to delete the 'r' in the second line of the code below:

text = '*CLK*'
findtext = 'r'+text+".*"
colName = 'Signal'

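It looks like you're trying to build a Python raw string, but concatenating 'r' only prepends a literal letter r to the pattern. The r prefix is part of string-literal syntax (r'...') and has no effect on a string assembled at runtime, so it is not needed here.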
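A quick way to check what the concatenation produces is to run it outside pandas. This is a minimal sketch using only the standard `re` module; the sample value `'CLK_IN'` is made up for illustration.

```python
import re

text = 'CLK'

# Plain string concatenation: the letter r becomes part of the pattern.
concatenated = 'r' + text + '.*'
print(concatenated)  # rCLK.*

# The r prefix only affects how a *literal* containing backslashes is parsed.
assert r'\d' == '\\d'
assert r'CLK.*' == 'CLK.*'  # no backslashes, so the prefix changes nothing

# Hypothetical sample value: the stray r stops the pattern from matching.
with_r = re.match(concatenated, 'CLK_IN')
without_r = re.match(text + '.*', 'CLK_IN')
print(with_r)     # None
print(without_r)  # a match object
```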

Also, '*CLK*' is a shell-style wildcard (glob), not a regular expression: in a regex, * is a quantifier that must follow something, so a leading * is invalid. You can experiment at https://pythex.org/ to construct the regex you want. If all you are trying to do is match rows that contain CLK, try:

findtext = '.*CLK'
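The difference between the wildcard and the regex can be checked with the standard library alone. The sketch below assumes a few made-up signal names, and shows one possible way to accept a Unix-style pattern as input: converting it with `fnmatch.translate`. That conversion is a suggestion, not something the question's code does. `Series.str.match` anchors at the start of each value in the same way `re.match` does, so the same reasoning carries over to the data frame column.

```python
import fnmatch
import re

# Hypothetical signal names; the question's real data is not shown.
signals = ['SYS_CLK_IN', 'RESET_N', 'CLK_OUT', 'DATA0']

# The glob is not a valid regex: a leading * has nothing to repeat.
try:
    re.compile('*CLK*')
    glob_is_valid_regex = True
except re.error:
    glob_is_valid_regex = False

# re.match anchors at the start, so '.*CLK' matches any value containing CLK.
regex_hits = [s for s in signals if re.match('.*CLK', s)]

# Alternative: translate the shell-style wildcard into an equivalent regex.
glob_pattern = re.compile(fnmatch.translate('*CLK*'))
glob_hits = [s for s in signals if glob_pattern.match(s)]

print(glob_is_valid_regex)  # False
print(regex_hits)           # ['SYS_CLK_IN', 'CLK_OUT']
print(glob_hits)            # ['SYS_CLK_IN', 'CLK_OUT']
```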

Solution 2:

Remove the asterisks (*) and use the .str.contains method instead of .str.match. Pass case=False to match both upper- and lowercase letters.

See this code:

text = 'CLK'
colName = 'Signal'

# Boolean mask: True where the column contains text, ignoring case
mask = df[colName].str.contains(text, case=False)
# Keep only the matching rows
df[mask]
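To see the filter end to end, here is a minimal sketch on a small made-up frame (the values, including the missing one, are assumptions for illustration). It adds `na=False`, which is not in the answer above, so that missing values count as non-matches and the mask can be used directly for row selection.

```python
import pandas as pd

# Hypothetical data standing in for the question's CSV.
df = pd.DataFrame({'Signal': ['sys_clk_in', 'RESET_N', None, 'CLK_OUT']})

text = 'CLK'
colName = 'Signal'

# case=False ignores letter case; na=False treats missing values as no match.
mask = df[colName].str.contains(text, case=False, na=False)
filtered = df[mask]

print(filtered[colName].tolist())  # ['sys_clk_in', 'CLK_OUT']
```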
