Date Axis In Heatmap Seaborn
A little info: I'm very new to programming and this is a small part of my first script. The goal of this particular segment is to display a seaborn heatmap with vertical depth
Solution 1:
You have to apply the strftime function to the dates in your dataframe so that the xtick labels are formatted correctly:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import random
# xrange only exists in Python 2; range works in Python 3
dates = [datetime.today() - timedelta(days=x * random.getrandbits(1)) for x in range(25)]
df = pd.DataFrame({'depth': [0.1,0.05, 0.01, 0.005, 0.001, 0.1, 0.05, 0.01, 0.005, 0.001, 0.1, 0.05, 0.01, 0.005, 0.001, 0.1, 0.05, 0.01, 0.005, 0.001, 0.1, 0.05, 0.01, 0.005, 0.001],\
'date': dates,\
'value': [-4.1808639999999997, -9.1753490000000006, -11.408113999999999, -10.50245, -8.0274750000000008, -0.72260200000000008, -6.9963940000000004, -10.536339999999999, -9.5440649999999998, -7.1964070000000007, -0.39225599999999999, -6.6216390000000001, -9.5518009999999993, -9.2924690000000005, -6.7605589999999998, -0.65214700000000003, -6.8852289999999989, -9.4557760000000002, -8.9364629999999998, -6.4736289999999999, -0.96481800000000006, -6.051482, -9.7846860000000007, -8.5710630000000005, -6.1461209999999999]})
pivot = df.pivot(index='depth', columns='date', values='value')
sns.set()
# take the labels from the pivot columns so they follow the plotted (sorted) column order
ax = sns.heatmap(pivot, xticklabels=list(pivot.columns.strftime('%d-%m-%Y')))
plt.xticks(rotation=-90)
plt.show()
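A minimal sketch of the same idea with the labels formatted on the pivot table itself, so that labels and columns cannot get out of step. The sample values here are made up for illustration and are not the data from the question.

```python
import pandas as pd

# Hypothetical sample data: three dates, two depths.
df = pd.DataFrame({
    'date': pd.to_datetime(['2021-03-01', '2021-03-02', '2021-03-03'] * 2),
    'depth': [0.1, 0.1, 0.1, 0.05, 0.05, 0.05],
    'value': [-4.2, -0.7, -0.4, -9.2, -7.0, -6.6],
})

pivot = df.pivot(index='depth', columns='date', values='value')

# Replace the datetime column labels with formatted strings.
pivot.columns = pivot.columns.strftime('%d-%m-%Y')

print(list(pivot.columns))
```

The relabelled frame can then be passed straight to sns.heatmap(pivot), which uses the column labels as xtick labels.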
Solution 2:
Example with standard heatmap datetime labels
import numpy as np
import pandas as pd
import seaborn as sns
dates = pd.date_range('2019-01-01', '2020-12-01')
df = pd.DataFrame(np.random.randint(0, 100, size=(len(dates), 4)), index=dates)
sns.heatmap(df)
We can create some helper classes/functions to get better-looking labels and placement. AxTransformer enables conversion from data coordinates to tick locations, and set_date_ticks allows custom date ranges to be applied to plots.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from collections.abc import Iterable
from sklearn import linear_model
class AxTransformer:
    def __init__(self, datetime_vals=False):
        self.datetime_vals = datetime_vals
        self.lr = linear_model.LinearRegression()

    def process_tick_vals(self, tick_vals):
        # wrap scalars (and single strings) in a list
        if not isinstance(tick_vals, Iterable) or isinstance(tick_vals, str):
            tick_vals = [tick_vals]
        if self.datetime_vals:
            # datetimes become int64 nanoseconds so they can be regressed on
            tick_vals = pd.to_datetime(tick_vals).astype('int64').values
        tick_vals = np.array(tick_vals)
        return tick_vals

    def fit(self, ax, axis='x'):
        # learn the linear mapping from existing tick label values to tick locations
        axis = getattr(ax, f'get_{axis}axis')()
        tick_locs = axis.get_ticklocs()
        tick_vals = self.process_tick_vals([label.get_text() for label in axis.get_ticklabels()])
        self.lr.fit(tick_vals.reshape(-1, 1), tick_locs)

    def transform(self, tick_vals):
        tick_vals = self.process_tick_vals(tick_vals)
        tick_locs = self.lr.predict(np.array(tick_vals).reshape(-1, 1))
        return tick_locs

def set_date_ticks(ax, start_date, end_date, axis='y', date_format='%Y-%m-%d', **date_range_kwargs):
    dt_rng = pd.date_range(start_date, end_date, **date_range_kwargs)
    ax_transformer = AxTransformer(datetime_vals=True)
    ax_transformer.fit(ax, axis=axis)
    getattr(ax, f'set_{axis}ticks')(ax_transformer.transform(dt_rng))
    getattr(ax, f'set_{axis}ticklabels')(dt_rng.strftime(date_format))
    ax.tick_params(axis=axis, which='both', bottom=True, top=False, labelbottom=True)
    return ax
These provide us a lot of flexibility, e.g.
fig, ax = plt.subplots(dpi=150)
sns.heatmap(df, ax=ax)
set_date_ticks(ax, '2019-01-01', '2020-12-01', freq='3MS')
or if you really want to get weird you can do stuff like
fig, ax = plt.subplots(dpi=150)
sns.heatmap(df, ax=ax)
set_date_ticks(ax, '2019-06-01', '2020-06-01', freq='2MS', date_format='%b `%y')
For your specific example you'll have to pass axis='x' to set_date_ticks.
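If scikit-learn is not available, the date-to-tick-location mapping can be approximated with plain interpolation instead of a fitted regression. This is only a sketch of a swapped-in technique, not the AxTransformer above: the function name date_tick_positions is hypothetical, and it assumes the heatmap rows come from a sorted datetime index with row i centred at position i + 0.5.

```python
import numpy as np
import pandas as pd

def date_tick_positions(index, tick_dates):
    """Map dates to heatmap tick positions, assuming row i is centred at i + 0.5."""
    index_ns = pd.DatetimeIndex(index).asi8       # int64 nanoseconds since the epoch
    ticks_ns = pd.DatetimeIndex(tick_dates).asi8
    row_centres = np.arange(len(index_ns)) + 0.5
    # np.interp needs increasing x values, hence the sorted-index assumption
    return np.interp(ticks_ns, index_ns, row_centres)

dates = pd.date_range('2019-01-01', '2019-01-10')
ticks = ['2019-01-01', '2019-01-04', '2019-01-10']
positions = date_tick_positions(dates, ticks)
print(positions)
```

The resulting positions could then be passed to ax.set_yticks, with the formatted dates going to ax.set_yticklabels.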
Solution 3:
- First, the 'date' column must be converted to a datetime dtype with pandas.to_datetime.
- If the desired result is to only have the dates (without time), then the easiest solution is to use the .dt accessor to extract the .date component. Alternatively, use dt.strftime to set a specific string format.
  - strftime() and strptime() Format Codes
  - df.date.dt.strftime('%H:%M') would extract hours and minutes into a string like '14:29'
  - In the example below, the extracted date is assigned to the same column, but the value can also be assigned as a new column.
- pandas.DataFrame.pivot_table is used to aggregate a function if there are multiple values in a column for each index; pandas.DataFrame.pivot should be used if there is only a single value.
  - This is better than .groupby because the dataframe is correctly shaped to be easily plotted.
- Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2
import pandas as pd
import numpy as np
import seaborn as sns
# create sample data
dates = [f'2016-08-{d:02d}T00:00:00.000000000' for d in range(9, 26, 2)] + ['2016-09-09T00:00:00.000000000']
depths = np.arange(1.25, 5.80, 0.25)
np.random.seed(365)
p1 = np.random.dirichlet(np.ones(10), size=1)[0] # random probabilities for random.choice
p2 = np.random.dirichlet(np.ones(19), size=1)[0] # random probabilities for random.choice
data = {'date': np.random.choice(dates, size=1000, p=p1), 'depth': np.random.choice(depths, size=1000, p=p2), 'capf': np.random.normal(0.3, 0.05, size=1000)}
df = pd.DataFrame(data)
# display(df.head())
                            date  depth      capf
0  2016-08-19T00:00:00.000000000   4.75  0.339233
1  2016-08-19T00:00:00.000000000   3.00  0.370395
2  2016-08-21T00:00:00.000000000   5.75  0.332895
3  2016-08-23T00:00:00.000000000   1.75  0.237543
4  2016-08-23T00:00:00.000000000   5.75  0.272067
# make sure the date column is converted to a datetime dtype
df.date = pd.to_datetime(df.date)
# extract only the date component of the date column
df.date = df.date.dt.date
# reshape the data for heatmap; if there's no need to aggregate a function, then use .pivot(...)
dfp = df.pivot_table(index='depth', columns='date', values='capf', aggfunc='mean')
# display(dfp.head())
date   2016-08-09  2016-08-11  2016-08-13  2016-08-15  2016-08-17  2016-08-19  2016-08-21  2016-08-23  2016-08-25  2016-09-09
depth
1.50     0.334661         NaN         NaN    0.302670    0.314186    0.325257    0.313645    0.263135         NaN         NaN
1.75     0.305488    0.303005    0.410124    0.299095    0.313899    0.280732    0.275758    0.260641         NaN    0.318099
2.00     0.322312    0.274105         NaN    0.319606    0.268984    0.368449    0.311517    0.309923         NaN    0.306162
2.25     0.289959    0.315081         NaN    0.302202    0.306286    0.339809    0.292546    0.314225    0.263875         NaN
2.50     0.314227    0.296968         NaN    0.312705    0.333797    0.299556    0.327187    0.326958         NaN         NaN
# plot
sns.heatmap(dfp, cmap='GnBu')
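To make the pivot_table versus pivot distinction concrete, here is a small sketch with made-up readings (not the random data above) in which one depth/date pair occurs twice. pivot_table averages the duplicates, while pivot raises a ValueError instead of aggregating.

```python
from datetime import date

import pandas as pd

# Hypothetical readings; (1.25, 2016-08-09) appears twice.
df = pd.DataFrame({
    'date': pd.to_datetime(['2016-08-09', '2016-08-09', '2016-08-11', '2016-08-11']),
    'depth': [1.25, 1.25, 1.25, 1.50],
    'capf': [0.30, 0.40, 0.25, 0.35],
})
# Keep only the calendar date, as in the answer above.
df['date'] = df['date'].dt.date

# pivot_table aggregates the duplicated pair into its mean.
table = df.pivot_table(index='depth', columns='date', values='capf', aggfunc='mean')
duplicate_mean = table.loc[1.25, date(2016, 8, 9)]

# pivot cannot reshape when an index/column pair is duplicated.
try:
    df.pivot(index='depth', columns='date', values='capf')
    pivot_failed = False
except ValueError:
    pivot_failed = True

print(table)
print('pivot failed on duplicates:', pivot_failed)
```

Depth/date pairs with no reading show up as NaN in the table, which sns.heatmap leaves blank.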
Solution 4:
I had a similar problem, but the date was the index. I've just converted the date to string (pandas 1.0) before plotting and it worked for me.
heat['date'] = heat.date.astype('string')
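When the dates really are the index rather than a column, the conversion can be applied to the index directly. A minimal sketch, assuming a frame named heat with a DatetimeIndex; the values and column names are placeholders:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2020-01-01', periods=3, freq='D')
heat = pd.DataFrame(np.arange(6).reshape(3, 2), index=dates, columns=['a', 'b'])

# Plain string conversion of the index.
heat_str = heat.copy()
heat_str.index = heat_str.index.astype(str)

# strftime gives explicit control over the label format.
heat_fmt = heat.copy()
heat_fmt.index = heat_fmt.index.strftime('%d-%m-%Y')

print(list(heat_fmt.index))
```

Either frame can then be passed to sns.heatmap, which will use the string index as the tick labels.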