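I am trying to build a Poisson regression on some school score data, and statsample-glm seems to be the best gem for this so far.
While working through the hands-on analysis in this article, I ran into the following error: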
irb(main):001:0> require 'daru'
require 'statsample-glm'
=> false
=> false
irb(main):003:0> data_set = Daru::DataFrame.from_csv "logistic_mle.csv"
=> #<Daru::DataFrame(200x4)>
a b c y
0 0.75171213 -3.2683591 1.70092606 0
1 0.55421406 -2.9565972 2.66368360 0
2 -1.8533164 -2.8293733 3.34679611 0
3 -2.8861015 -0.7389824 4.74970154 0
4 -2.6055309 0.56102031 5.48308397 0
5 -4.2735321 1.62383436 5.35813425 0
6 -4.7701259 1.22025583 6.41070111 0
7 -6.9231483 2.86547174 8.73185919 0
8 -7.5641950 4.94028695 8.94193466 0
9 -8.6309366 4.27420502 9.27002100 0
10 -8.9911114 5.10389362 11.7669513 0
11 -9.9905763 7.87484596 12.4794035 0
12 -10.381878 8.84300238 13.7498993 0
13 -11.047682 9.44613324 13.5025027 0
14 -12.434424 9.70515870 15.1221173 0
15 -13.627294 10.4190343 16.3289942 0
16 -15.620222 11.3788332 17.7367653 0
17 -16.292239 13.1516565 18.6939344 0
18 -16.715913 14.9076297 18.0246863 0
19 -17.950125 15.8533651 20.6826094 0
20 -18.989884 15.4331557 20.9101142 0
21 -19.908508 16.8542366 22.0721145 0
22 -21.146652 18.6785324 23.4977598 0
23 -21.367574 18.3208056 23.9121114 0
24 -22.131396 20.7616214 24.1683442 0
25 -23.163631 21.1293492 25.2695476 0
26 -24.136076 21.7035705 27.9161820 0
27 -25.386072 23.3588003 27.8755285 0
28 -27.254627 24.9201403 28.9810564 0
29 -28.845061 25.1681854 29.6749936 0
... ... ... ... ...
irb(main):004:0> glm = Statsample::GLM.compute data_set, :y, :logistic, {constant: 1, algorithm: :mle}
Traceback (most recent call last):
1: from (irb):4
IndexError (Specified vector y does not exist)
Inspecting the error further shows:
Caused by:
IndexError: Specified index :y does not exist
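I have already tried reformatting the headers as "Date" instead of "String", based on a comment in this only loosely related Stack Overflow post, but the error did not change.
Does the SO community have any ideas?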
Sorry, I posted too soon. I found a solution that works:
Instead of
data_set = Daru::DataFrame.from_csv "logistic_mle.csv"
this line works:
data_set = Daru::DataFrame.from_csv("logistic_mle.csv", headers: true, header_converters: :symbol)
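A likely explanation, offered as a hedged sketch rather than a confirmed diagnosis: without `header_converters: :symbol` the column names are read as strings such as `"y"`, so a lookup by the symbol `:y` finds nothing. The snippet below uses only Ruby's standard `csv` library, on the assumption that `Daru::DataFrame.from_csv` forwards these options to it. The inline `csv_text` is a made-up two-row sample, not the contents of `logistic_mle.csv`.

```ruby
require 'csv'

# Hypothetical sample standing in for logistic_mle.csv (same column names).
csv_text = "a,b,c,y\n0.75,-3.27,1.70,0\n0.55,-2.96,2.66,0\n"

# Default parsing: header names stay as strings.
plain = CSV.parse(csv_text, headers: true)
plain_headers = plain.headers

# With the :symbol converter: header names become symbols.
symbolized = CSV.parse(csv_text, headers: true, header_converters: :symbol)
symbol_headers = symbolized.headers

puts "default headers:    #{plain_headers.inspect}"
puts "symbolized headers: #{symbol_headers.inspect}"

# A symbol lookup only matches the symbolized headers.
puts "plain includes :y?      #{plain_headers.include?(:y)}"
puts "symbolized includes :y? #{symbol_headers.include?(:y)}"
```

If this is indeed the cause, inspecting the data frame's vector names before calling `Statsample::GLM.compute` should show whether they are strings or symbols.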