How Can I Keep the Index After Normalizing Columns in a Pandas DataFrame to Different Ranges
Solution 1:
If I understand you correctly, my_file.csv / df2 should look like the lower output from your question. In that case, I believe you just have a typo in your construction of df2. You want its index to look the same as that of df1, so use:
df2 = pd.DataFrame(data, index = id_set[:,0])
instead of
df2 = pd.DataFrame(data, index= id_set[0:])
(notice the contents of the square brackets).
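To see why the brackets matter, here is a minimal sketch. The arrays are hypothetical stand-ins for id_set and data from the question (the real shapes and values may differ), and the values are rounded for readability:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's arrays (assumed shapes).
id_set = np.array([[0], [10], [20]])  # 2-D, one id per row
data = np.array([[2.19, -2.59, -342.54],
                 [2.19, -4.38, -335.94],
                 [2.19, -2.59, -342.54]])

# id_set[0:] slices rows starting at 0, so it returns the whole 2-D array.
print(id_set[0:].shape)    # (3, 1)
# id_set[:, 0] selects the first column as a 1-D array, which is what an index needs.
print(id_set[:, 0].shape)  # (3,)

df2 = pd.DataFrame(data, index=id_set[:, 0], columns=["A", "B", "C"])
print(df2.index.tolist())  # [0, 10, 20]
```

The only difference between the two expressions is dimensionality: the index argument should be one label per row, which the column selection provides.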
This will make your output file my_file.csv look like this:
,A,B,C
0,2.19117130798,-2.5897247305,-342.54488522400004
10,2.19117130798,-4.3811855641,-335.936524309
20,2.19117130798,-2.5897247305,-342.54488522400004
...
While your output file norm.csv looks like this:
,A,B,C
0,-1.0,0.16582420581574775,145.05394742081884
1,-1.0,0.037447604422215175,145.9298596578588
2,-1.0,0.16582420581574775,145.05394742081884
...
If you want your output file norm.csv to have the same index (0, 10, 20 instead of 0, 1, 2, ...), you need to define norm_data as
norm_data = pd.DataFrame(data, index = id_set[:,0])
instead of
norm_data = pd.DataFrame(data)
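As a rough sketch of the whole round trip, the example below normalizes each column to its own range with NumPy and then rebuilds the frame with the original index. The column names, values, the ranges dict, and the min-max formula are assumptions for illustration and are not taken from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical input frame with a non-default index (0, 10, 20).
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 40.0]},
                  index=[0, 10, 20])

# Hypothetical target range per column: (lower bound, upper bound).
ranges = {"A": (-1.0, 1.0), "B": (0.0, 100.0)}

# Min-max scale each column into its target range.
# Note: a constant column would make the denominator zero.
norm_array = np.empty(df.shape)
for j, col in enumerate(df.columns):
    lo, hi = ranges[col]
    x = df[col].to_numpy()
    norm_array[:, j] = lo + (x - x.min()) * (hi - lo) / (x.max() - x.min())

# A bare NumPy array carries no labels, so pass the index back in explicitly.
norm_data = pd.DataFrame(norm_array, index=df.index, columns=df.columns)
print(norm_data.index.tolist())  # [0, 10, 20]
```

Without the index argument, pandas falls back to the default 0, 1, 2, ... labels, which is the symptom described above.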
Also, I should note that your data contains a couple of NaN/inf entries, which mess up your normalization. You can replace those using
df = df.replace([np.inf, -np.inf], np.nan)
df = df.fillna(0)
(credit to this question/answer). Do the same for df2. You can also replace the NaN/inf entries with other values using the same functions.
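A small sketch of that cleanup, using a made-up frame: it first converts infinities of either sign to NaN, then fills the gaps. Filling with the column mean is shown only as one possible choice of "other values":

```python
import numpy as np
import pandas as pd

# Made-up frame containing NaN and infinities of both signs.
df = pd.DataFrame({"A": [1.0, np.inf, 3.0], "B": [np.nan, -np.inf, 2.0]},
                  index=[0, 10, 20])

# Step 1: turn +inf and -inf into NaN so a single fill handles everything.
cleaned = df.replace([np.inf, -np.inf], np.nan)

# Step 2a: fill every gap with 0, as in the answer.
filled_zero = cleaned.fillna(0)

# Step 2b (alternative): fill each column's gaps with that column's mean.
filled_mean = cleaned.fillna(cleaned.mean())

print(filled_zero)
print(filled_mean)
```

Both replace and fillna return new frames here and keep the original index, so the cleanup does not disturb the 0, 10, 20 labels.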