Solution to TIMSER

This is just simple linear regression of the log-transformed values against a time index. (Here I applied the log twice, i.e. a log-log transformation, but a single log gives pretty much the same results.) More importantly, I ignored the training data completely: it is too far in the past and just introduces noise. Most of the training data are from before the 2007 crisis, so it is to be expected that they are not very informative about the post-2007 period. I used only the validation data for the regression.
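To make the idea concrete, here is a minimal sketch of the log-log fit and its inversion. It runs on a synthetic, noise-free series, so the data and the names `a_true`, `b_true`, `t_future` and `y_future` are assumptions for illustration only, not the competition data. The model is a straight line in the doubly transformed space, log(log(y)) = a + b*t, so forecasts are mapped back with exp(exp(.)). Note that log(log(y)) is only defined when every value is greater than 1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic series (an assumption, not the competition data).
# All values are > 1, so log(log(y)) is defined.
t = np.arange(200).reshape(-1, 1)
a_true, b_true = 0.5, 0.002
y = np.exp(np.exp(a_true + b_true * t.ravel()))

# Fit a straight line to the doubly log-transformed values.
z = np.log(np.log(y))
model = LinearRegression().fit(t, z)

# Extrapolate in the transformed space, then undo both logs.
t_future = np.arange(200, 250).reshape(-1, 1)
y_future = np.exp(np.exp(model.predict(t_future)))
```

Because the synthetic series has no noise, the fit should recover `a_true` and `b_true` up to floating-point error. On real data the line is only an approximation.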

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

In[68]:

val_df=pd.read_csv('val.csv')
subm=pd.read_csv('sample_submission_copy.csv')

In[69]:

X=np.arange(len(val_df)).reshape(-1,1)
Y=val_df.iloc[:,1].values.reshape(-1,1)
# log is applied twice on purpose (log-log transformation); requires Y > 1
Y=np.log(Y)
Y=np.log(Y)

In[70]:

# normalize= was removed in scikit-learn 1.2; for plain least squares it does not change the predictions
linear_regressor=LinearRegression()

In[71]:

linear_regressor.fit(X,Y)

In[72]:

test_X=np.arange(len(val_df),len(val_df)+1904).reshape(-1,1)

In[73]:

Y_pred=linear_regressor.predict(test_X)

In[74]:

plt.scatter(X,Y)
plt.plot(test_X, Y_pred, color='red')
plt.show()

In[75]:

Y_pred= np.exp(np.exp(Y_pred))

In[76]:

Y_pred=pd.DataFrame(data=Y_pred, index=np.array(range(1904)), columns=np.array(range(1)))

In[77]:

subm['value']=Y_pred

In[78]:

# to_csv returns None when given a path, so the result is not assigned
subm.to_csv('submission.csv')
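Since the forecast is extrapolated with exp(exp(.)), it may be worth checking the values before submitting. The sketch below uses a hypothetical helper, `check_forecast`, and made-up transformed predictions, none of which come from the original notebook. In float64, exp(exp(x)) overflows to infinity once x exceeds roughly 6.56, and any finite result is always greater than 1.

```python
import numpy as np

def check_forecast(values, expected_len):
    """Hypothetical helper: basic sanity checks on a back-transformed forecast."""
    values = np.asarray(values, dtype=float).ravel()
    return {
        "length_ok": values.shape[0] == expected_len,
        "all_finite": bool(np.isfinite(values).all()),
        "all_above_one": bool((values > 1).all()),
    }

# Made-up predictions in the transformed space (not the competition output).
z_pred = np.linspace(0.5, 1.0, 1904)
report = check_forecast(np.exp(np.exp(z_pred)), expected_len=1904)
```

In the notebook above, the same check could be run on `Y_pred` after the back-transformation, with `expected_len` set to the number of rows in the submission file.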