How To Take Out The Column Index Name In Dataframe
Solution 1:
Try using the reset_index method, which moves the DataFrame's index into a column (which is what you want, I think).
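A minimal sketch of that suggestion, assuming a small made-up frame with an index named Date (the frame and the variable names are illustrative, not from the question):

```python
import datetime
import pandas as pd

# Hypothetical frame with a named index, similar in shape to the one in the question.
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0]},
    index=pd.Index([datetime.date(2012, 6, d) for d in range(1, 4)], name="Date"),
)

# reset_index returns a new frame: the old index becomes a regular column
# and a default integer index (with no name) takes its place.
flat = df.reset_index()

print(list(flat.columns))
print(flat.index.name)
```

Note that reset_index does not modify df itself unless the result is assigned back.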
Solution 2:
Short answer: you can't, and it's not clear why this would ever "cause problems". 'Date' is the name of the DataFrame's Index, which is distinct from any of the columns. It is printed on its own offset line precisely so that you will not confuse it with a column of the frame. You cannot select the dates with dfrm['Date'], as shown below:
>>> import numpy as np; import pandas; import datetime
>>> dfrm = pandas.DataFrame(np.random.rand(10,3),
... columns=['A','B','C'],
... index = pandas.Index(
... [datetime.date(2012,6,elem) for elem in range(1,11)],
... name="Date"))
>>> dfrm
                   A         B         C
Date
2012-06-01  0.283724  0.863012  0.798891
2012-06-02  0.097231  0.277564  0.872306
2012-06-03  0.821461  0.499485  0.126441
2012-06-04  0.887782  0.389486  0.374118
2012-06-05  0.248065  0.032287  0.850939
2012-06-06  0.101917  0.121171  0.577643
2012-06-07  0.225278  0.161301  0.708996
2012-06-08  0.906042  0.828814  0.247564
2012-06-09  0.733363  0.924076  0.393353
2012-06-10  0.273837  0.318013  0.754807
>>> dfrm['Date']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1458, in __getitem__
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 294, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 625, in get
    _, block = self._find_block(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 715, in _find_block
    self._check_have(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 722, in _check_have
    raise KeyError('no item named %s' % str(item))
KeyError: 'no item named Date'
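The dates are still reachable through the index itself. A short sketch, assuming a small made-up frame (values and variable names are illustrative only), of how the index name and the column labels differ:

```python
import datetime
import pandas as pd

dates = [datetime.date(2012, 6, d) for d in range(1, 4)]
df = pd.DataFrame({"A": [0.1, 0.2, 0.3]}, index=pd.Index(dates, name="Date"))

# 'Date' is the name of the index, not one of the column labels.
in_columns = "Date" in df.columns
index_name = df.index.name

# The dates themselves live in df.index, and rows are selected by label with .loc.
first_date = df.index[0]
row = df.loc[datetime.date(2012, 6, 2)]

print(in_columns, index_name, first_date, row["A"])
```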
Longer answer:
You can change your DataFrame by adding the index into its own column if you'd like it to print that way. For example:
>>> dfrm['Date'] = dfrm.index
>>> dfrm
                   A         B         C        Date
Date
2012-06-01  0.283724  0.863012  0.798891  2012-06-01
2012-06-02  0.097231  0.277564  0.872306  2012-06-02
2012-06-03  0.821461  0.499485  0.126441  2012-06-03
2012-06-04  0.887782  0.389486  0.374118  2012-06-04
2012-06-05  0.248065  0.032287  0.850939  2012-06-05
2012-06-06  0.101917  0.121171  0.577643  2012-06-06
2012-06-07  0.225278  0.161301  0.708996  2012-06-07
2012-06-08  0.906042  0.828814  0.247564  2012-06-08
2012-06-09  0.733363  0.924076  0.393353  2012-06-09
2012-06-10  0.273837  0.318013  0.754807  2012-06-10
After this, you could simply change the name of the index so that nothing prints:
>>> dfrm.reindex(pandas.Series(dfrm.index.values, name=''))
                   A         B         C        Date
2012-06-01  0.283724  0.863012  0.798891  2012-06-01
2012-06-02  0.097231  0.277564  0.872306  2012-06-02
2012-06-03  0.821461  0.499485  0.126441  2012-06-03
2012-06-04  0.887782  0.389486  0.374118  2012-06-04
2012-06-05  0.248065  0.032287  0.850939  2012-06-05
2012-06-06  0.101917  0.121171  0.577643  2012-06-06
2012-06-07  0.225278  0.161301  0.708996  2012-06-07
2012-06-08  0.906042  0.828814  0.247564  2012-06-08
2012-06-09  0.733363  0.924076  0.393353  2012-06-09
2012-06-10  0.273837  0.318013  0.754807  2012-06-10
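A lighter-weight way to get the same effect, sketched here under the assumption of a reasonably recent pandas, is to clear the index name directly rather than reindexing. The frame below is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2]}, index=pd.Index(["x", "y"], name="Date"))

# rename_axis(None) returns a new frame whose index has no name;
# the original frame keeps its index name.
renamed = df.rename_axis(None)
name_after_rename = df.index.name

# Assigning to index.name changes the frame in place instead.
df.index.name = None

print(renamed.index.name, name_after_rename, df.index.name)
```

Either form leaves the index values untouched; only the label printed above the index disappears.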
This seems like overkill. Another option is to replace the index with plain integers. reset_index does that and moves the dates into a column in one step (use it on the original frame, before adding Date as a column by hand):
>>> dfrm.reset_index()
or if you already moved the index into a column manually, then just
>>> dfrm.index = range(len(dfrm))
>>> dfrm
          A         B         C        Date
0  0.283724  0.863012  0.798891  2012-06-01
1  0.097231  0.277564  0.872306  2012-06-02
2  0.821461  0.499485  0.126441  2012-06-03
3  0.887782  0.389486  0.374118  2012-06-04
4  0.248065  0.032287  0.850939  2012-06-05
5  0.101917  0.121171  0.577643  2012-06-06
6  0.225278  0.161301  0.708996  2012-06-07
7  0.906042  0.828814  0.247564  2012-06-08
8  0.733363  0.924076  0.393353  2012-06-09
9  0.273837  0.318013  0.754807  2012-06-10
Or the following if you care about the order the columns appear:
>>> dfrm.iloc[:, [-1] + list(range(len(dfrm.columns) - 1))]
         Date         A         B         C
0  2012-06-01  0.283724  0.863012  0.798891
1  2012-06-02  0.097231  0.277564  0.872306
2  2012-06-03  0.821461  0.499485  0.126441
3  2012-06-04  0.887782  0.389486  0.374118
4  2012-06-05  0.248065  0.032287  0.850939
5  2012-06-06  0.101917  0.121171  0.577643
6  2012-06-07  0.225278  0.161301  0.708996
7  2012-06-08  0.906042  0.828814  0.247564
8  2012-06-09  0.733363  0.924076  0.393353
9  2012-06-10  0.273837  0.318013  0.754807
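The positional selection above can also be written by column name, which some readers may find easier to follow. A small sketch with made-up data (the frame and names are illustrative assumptions):

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0.1, 0.2],
    "B": [0.3, 0.4],
    "Date": ["2012-06-01", "2012-06-02"],
})

# Put 'Date' first and keep the remaining columns in their existing order.
ordered_cols = ["Date"] + [c for c in df.columns if c != "Date"]
reordered = df[ordered_cols]

print(list(reordered.columns))
```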
Added
Here are a few helpful functions to include in an IPython configuration script (so that they are loaded at startup), or to put in a module you can easily import when working in Python.
###########
# Imports #
###########
import pandas
import datetime
import numpy as np
from dateutil import relativedelta
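# Note: in later pandas releases this module was moved out into the
# separate pandas-datareader package.
from pandas.io import data as pdata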

############################################
# Functions to retrieve Yahoo finance data #
############################################

# Utility to get generic stock symbol data from Yahoo finance.
# Starts two days prior to present (or most recent business day)
# and goes back a specified number of days.
def getStockSymbolData(sym_list, end_date=datetime.date.today()+relativedelta.relativedelta(days=-1), num_dates = 30):
    dReader = pdata.DataReader
    start_date = end_date + relativedelta.relativedelta(days=-num_dates)
    return dict( (sym, dReader(sym, "yahoo", start=start_date, end=end_date)) for sym in sym_list )
###

# Utility function to get some AAPL data when needed
# for testing.
def getAAPL(end_date=datetime.date.today()+relativedelta.relativedelta(days=-1), num_dates = 30):
    dReader = pdata.DataReader
    return getStockSymbolData(['AAPL'], end_date=end_date, num_dates=num_dates)
###
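One thing to watch for with these helpers: the default end_date is computed in the function signature, which Python evaluates once, when the function is defined. In a long-running IPython session that default can go stale. A possible sketch of the usual workaround, using a None default that is resolved at call time; resolve_date_range is a hypothetical helper name, and datetime.timedelta is swapped in for relativedelta since only whole days are involved:

```python
import datetime

def resolve_date_range(end_date=None, num_dates=30):
    """Return (start_date, end_date) for a lookback window.

    Hypothetical helper, not part of the functions above. The default
    end date is computed when the function is called, not when it is defined.
    """
    if end_date is None:
        end_date = datetime.date.today() - datetime.timedelta(days=1)
    start_date = end_date - datetime.timedelta(days=num_dates)
    return start_date, end_date

# Explicit end date: a 30-day window ending on 2012-06-30.
start, end = resolve_date_range(datetime.date(2012, 6, 30), num_dates=30)
print(start, end)

# No end date: defaults to yesterday, evaluated at call time.
default_start, default_end = resolve_date_range()
```

The same pattern could be applied to the end_date parameters above by defaulting them to None and resolving the date inside the function body.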
I also made a class below to hold some data for common stocks:
#####
# Define a 'Stock' class that can hold simple info
# about a security, like SEDOL and CUSIP info. This
# is mainly for debugging things and quickly getting
# info for a single security.
class MyStock():
    def __init__(self, ticker='None', sedol='None', country='None'):
        self.ticker = ticker
        self.sedol = sedol
        self.country = country
    ###
    def getData(self, end_date=datetime.date.today()+relativedelta.relativedelta(days=-1), num_dates = 30):
        return pandas.DataFrame(getStockSymbolData([self.ticker], end_date=end_date, num_dates=num_dates)[self.ticker])
    ###

#####
# Make some default stock objects for common stocks.
AAPL = MyStock(ticker='AAPL', sedol='03783310', country='US')
SAP = MyStock(ticker='SAP', sedol='484628', country='DE')