Fitting silently fails for a series of type object

Bug #1229147 reported by Dov Grobgeld
This bug affects 1 person
Affects      Status  Importance  Assigned to  Milestone
statsmodels  New     Undecided   Unassigned

Bug Description

To run a linear regression on a pandas.Series, I used the following code:

import pandas as pd
import statsmodels.formula.api as sm

x = [0.,1.,2.,3.,4.]
y = [1,2.1,2.9,3.1,3.9]
s = pd.Series(y,index=x)
df = pd.DataFrame({'Y':s,'X':s.index})
fitted_values = sm.ols(formula='Y ~ X', data=df).fit().fittedvalues

I was surprised that fitted_values is identical to y:

In [14]: fitted_values
Out[14]:
0 1.0
1 2.1
2 2.9
3 3.1
4 3.9

This seems to happen because s.index has dtype 'object'. Casting the DataFrame to float64 solves the problem:

df = pd.DataFrame({'Y':s,'X':s.index}).astype('float64')
fitted_values = sm.ols(formula='Y ~ X', data=df).fit().fittedvalues

In [19]: fitted_values
Out[19]:
0 1.24
1 1.92
2 2.60
3 3.28
4 3.96
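
Both outputs can be reproduced by hand, which may help show what went wrong. The sketch below uses only plain Python and rests on an assumption that this report does not confirm: that the formula layer treats an object-dtype column as categorical, so every distinct X value gets its own coefficient and the fit passes through each point. The helper names fit_numeric and fit_categorical are made up for illustration.

```python
x = [0., 1., 2., 3., 4.]
y = [1, 2.1, 2.9, 3.1, 3.9]

def fit_numeric(x, y):
    """Fitted values of simple least squares y = a + b*x (closed form)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * xi for xi in x]

def fit_categorical(x, y):
    """Fitted values when each distinct x is its own level: the per-level mean of y."""
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    return [sum(groups[xi]) / len(groups[xi]) for xi in x]

numeric = fit_numeric(x, y)
categorical = fit_categorical(x, y)

print([round(v, 2) for v in numeric])  # straight-line fit, as in the float64 case
print(categorical)                     # reproduces y, as in the object-dtype case
```

With five observations and five distinct X values, the categorical fit has as many parameters as data points, so a perfect fit is expected rather than a numerical accident.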

In my opinion, sm.ols should either do this type conversion internally or raise an exception about invalid input.
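
Until something like that exists in the library, a caller-side guard can combine both options. The following is only a sketch of one possible approach, not statsmodels behaviour: coerce_object_columns is a hypothetical helper that converts object-dtype columns with pd.to_numeric and raises TypeError when a column is not numeric.

```python
import pandas as pd

def coerce_object_columns(df):
    """Return a copy of df with object-dtype columns converted to numeric.

    Hypothetical helper: raises TypeError if an object column cannot be
    converted, instead of letting it reach the model unchanged.
    """
    out = df.copy()
    for name in out.columns:
        if out[name].dtype == object:
            try:
                out[name] = pd.to_numeric(out[name])
            except (ValueError, TypeError) as exc:
                raise TypeError(
                    f"column {name!r} has dtype object and is not numeric"
                ) from exc
    return out

# Build a frame whose X column is object-dtype, as in the report.
df = pd.DataFrame({
    'Y': [1, 2.1, 2.9, 3.1, 3.9],
    'X': pd.Series([0., 1., 2., 3., 4.], dtype=object),
})
clean = coerce_object_columns(df)
print(clean.dtypes)
```

The cleaned frame can then be passed as data= to sm.ols in place of the original one.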

The statsmodels version is 0.5.0.
