Below is the code I am using:
#Divide data into train and test
p_dt_train, p_dt_test = train_test_split(p_main_df, test_size=0.2)
#Shape #(3485868, 26)
p_dt_train.shape
p_fit_DT = DecisionTreeRegressor(max_depth=2).fit(p_dt_train.iloc[:,0:26], p_dt_train.iloc[:,26])
But when I run the line above (the p_fit_DT line), the following error occurs:
IndexError Traceback (most recent call last)
<ipython-input-59-9bbf02d21cd5> in <module>()
1 # In[1]:
2 #Decision tree for regression
----> 3 p_fit_DT = DecisionTreeRegressor(max_depth=2).fit(p_dt_train.iloc[:,0:26], p_dt_train.iloc[:,26])
4
5 #Apply model on test data
D:\My_Work\anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
1470 except (KeyError, IndexError):
1471 pass
-> 1472 return self._getitem_tuple(key)
1473 else:
1474 # we by definition only have the 0th axis
D:\My_Work\anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_tuple(self, tup)
2011 def _getitem_tuple(self, tup):
2012
-> 2013 self._has_valid_tuple(tup)
2014 try:
2015 return self._getitem_lowerdim(tup)
D:\My_Work\anaconda3\lib\site-packages\pandas\core\indexing.py in _has_valid_tuple(self, key)
220 raise IndexingError('Too many indexers')
221 try:
--> 222 self._validate_key(k, i)
223 except ValueError:
224 raise ValueError("Location based indexing can only have "
D:\My_Work\anaconda3\lib\site-packages\pandas\core\indexing.py in _validate_key(self, key, axis)
1955 return
1956 elif is_integer(key):
-> 1957 self._validate_integer(key, axis)
1958 elif isinstance(key, tuple):
1959 # a tuple should already have been caught by this point
D:\My_Work\anaconda3\lib\site-packages\pandas\core\indexing.py in _validate_integer(self, key, axis)
2007 l = len(ax)
2008 if key >= l or key < -l:
-> 2009 raise IndexError("single positional indexer is out-of-bounds")
2010
2011 def _getitem_tuple(self, tup):
IndexError: single positional indexer is out-of-bounds
Please point out where I went wrong. Thanks in advance.
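If your DataFrame has shape (3485868, 26), then the positions along axis 1 run from 0 to 25 inclusive, so the single indexer iloc[:, 26] is out of bounds. (The slice 0:26 does not raise, because slices are clipped to the available range.) So perhaps you meant to do: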
p_fit_DT = DecisionTreeRegressor(max_depth=2).fit(p_dt_train.iloc[:, 0:25], p_dt_train.iloc[:, 25])
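To see the bounds concretely, here is a small sketch on a made-up 5 x 26 frame. The name df and the random data are assumptions for illustration only, not your data. It also shows negative positions, which avoid hard-coding the column count:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for p_dt_train: 5 rows, 26 columns.
df = pd.DataFrame(np.random.rand(5, 26))

# Valid: column positions run from 0 to 25.
last_col = df.iloc[:, 25]

# Invalid: position 26 does not exist, so a single integer indexer raises.
try:
    df.iloc[:, 26]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# Slices are clipped rather than bounds-checked, so 0:26 returns all 26 columns.
all_cols = df.iloc[:, 0:26]

# Shape-independent alternative: every column but the last, and the last column.
X = df.iloc[:, :-1]
y = df.iloc[:, -1]
```

Using :-1 and -1 assumes the target is the last column, which seems to be the intent in your code.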
It may also be clearer if you refactor the code into more steps, for example:
# Initialise instance for Decision Tree Regression
dtr = DecisionTreeRegressor(max_depth=2)
# Get training inputs and outputs as numpy arrays
X_tr, y_tr = p_dt_train.iloc[:, 0:25].values, p_dt_train.iloc[:, 25].values.reshape((-1, 1))
# Fit model using training data
dtr.fit(X_tr, y_tr)
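For completeness, here is a minimal end-to-end sketch of the same steps on synthetic data, including the prediction on the test split that your traceback comment mentions. Everything named demo_df, x0..x24 and y is made up for illustration, and I am assuming the target sits in the last column:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for p_main_df: 200 rows, 25 feature columns plus 1 target column.
rng = np.random.default_rng(0)
features = rng.random((200, 25))
target = features[:, 0] * 2.0 + rng.normal(scale=0.1, size=200)
column_names = [f"x{i}" for i in range(25)] + ["y"]
demo_df = pd.DataFrame(np.column_stack([features, target]), columns=column_names)

# Divide data into train and test.
train_df, test_df = train_test_split(demo_df, test_size=0.2, random_state=0)

# Split by position relative to the end, so nothing depends on the column count.
X_train, y_train = train_df.iloc[:, :-1], train_df.iloc[:, -1]
X_test, y_test = test_df.iloc[:, :-1], test_df.iloc[:, -1]

# Fit the tree on the training data, then apply it to the test data.
model = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X_train, y_train)
predictions = model.predict(X_test)
```

Checking X_train.shape and y_train.shape before calling fit is a cheap way to catch this kind of off-by-one early.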