How Can I Keep the Index After Normalizing Pandas DataFrame Columns to Different Ranges

March 08, 2024

I applied normalization to multiple columns of a pandas DataFrame using a for-loop, under the following conditions: normalization for columns A and B to the range [-1, +1]; normalization fo…

Solution 1:

If I understand you correctly, my_file.csv / df2 should look like the lower output from your question. In that case I believe you just have a typo in your construction of df2: you want its index to look the same as in df1, so use

df2 = pd.DataFrame(data, index = id_set[:,0])

instead of

df2 = pd.DataFrame(data, index= id_set[0:])

(notice the contents of the square brackets). This will make your output file my_file.csv look like this:

,A,B,C
0,2.19117130798,-2.5897247305,-342.54488522400004
10,2.19117130798,-4.3811855641,-335.936524309
20,2.19117130798,-2.5897247305,-342.54488522400004
...

While your output file norm.csv looks like this:

,A,B,C
0,-1.0,0.16582420581574775,145.05394742081884
1,-1.0,0.037447604422215175,145.9298596578588
2,-1.0,0.16582420581574775,145.05394742081884
...
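
The difference between the two slices is easy to miss. As a rough sketch, assuming id_set is a 2-D NumPy array whose first column holds the IDs (the question does not show how it is built, so the values below are made up), id_set[0:] keeps every row and stays 2-D, while id_set[:,0] picks out the first column as a 1-D array that can serve as an index:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for id_set: a 2-D array with the IDs in column 0.
id_set = np.array([[0], [10], [20]])
# Made-up values, only there to give the frame some content.
data = {"A": [2.19, 2.19, 2.19], "B": [-2.59, -4.38, -2.59]}

whole = id_set[0:]        # all rows from row 0 on: still 2-D
first_col = id_set[:, 0]  # first column only: 1-D

print(whole.shape, first_col.shape)

# A 1-D array of IDs is what the index argument expects.
df2 = pd.DataFrame(data, index=first_col)
print(df2.index.tolist())
```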

If you want your output file norm.csv to have the same index (0,10,20 instead of 0,1,2...) you need to define norm_data as

norm_data = pd.DataFrame(data, index = id_set[:,0])

instead of

norm_data = pd.DataFrame(data)
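
To illustrate why the index gets lost, here is a minimal sketch under some assumptions: the helper min_max_scale, the sample values, and the use of df.index are all made up for the example and are not the code from the question. Normalizing on plain NumPy arrays drops the index, so it has to be passed back in when the result is wrapped in a DataFrame:

```python
import numpy as np
import pandas as pd

def min_max_scale(values, low, high):
    """Linearly rescale values so that the minimum maps to low and the maximum to high.

    Hypothetical helper; a constant column (max == min) is not handled here.
    """
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    return low + (values - values.min()) * (high - low) / span

# Made-up input with a non-default index (0, 10, 20).
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 40.0]},
    index=[0, 10, 20],
)

# The arrays in `data` carry no index of their own.
data = {col: min_max_scale(df[col].to_numpy(), -1.0, 1.0) for col in ["A", "B"]}

# Passing the original index keeps 0, 10, 20 instead of 0, 1, 2.
norm_data = pd.DataFrame(data, index=df.index)
print(norm_data)
```

Here df.index plays the role that id_set[:,0] plays in the answer; either works as long as it is a 1-D sequence with one label per row.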

Also, I should note that your data contains a couple of NaN/inf entries, which mess up your normalization.

You can replace those using

import numpy as np
df = df.replace([np.inf, -np.inf], np.nan)  # treat +inf and -inf as missing
df = df.fillna(0)  # fill the missing values with 0

(credit to this question/answer), and do the same for df2. You can also replace the NaN/inf entries with other values using the same functions.
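
A small sketch of that cleanup on made-up data (the frame below is hypothetical, not the data from the question). It lists both np.inf and -np.inf, since matching on np.inf alone leaves negative infinities in place:

```python
import numpy as np
import pandas as pd

# Made-up frame containing NaN, +inf and -inf, with a non-default index.
df = pd.DataFrame(
    {"A": [1.0, np.inf, 3.0], "B": [np.nan, -np.inf, 2.0]},
    index=[0, 10, 20],
)

# Turn both infinities into NaN, then fill every NaN with 0.
clean = df.replace([np.inf, -np.inf], np.nan).fillna(0)
print(clean)
```

Neither call touches the index, so it stays 0, 10, 20. Passing another value to fillna, for example df.mean() to fill with column means, follows the same pattern.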
