Building a tf.estimator input_fn: feature is not in features dictionary



I have a corpus of records representing matches in a video game. I want to feed it into tf.estimator.DNNClassifier.

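Each record contains text representations of the 5 heroes on team 0 and the 5 heroes on team 1, the map the game was played on, and the winner of the game. I want to represent these three features (team 0, team 1, and the map) as three sparse vectors.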
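To make "sparse vector" concrete, here is a minimal sketch in plain Python, not TensorFlow code. The vocabulary `HEROES` and the helpers `multi_hot_indices` and `to_dense` are hypothetical stand-ins (the real vocabulary would come from `source.heroes_array`). The idea is only that a team becomes the sorted positions of its heroes in the vocabulary, with an implicit value of 1 at each position.

```python
# Hypothetical vocabulary; the real one would come from source.heroes_array.
HEROES = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf"]


def multi_hot_indices(team, vocab):
    """Return the sorted vocabulary positions of the given hero names."""
    position = {name: i for i, name in enumerate(vocab)}
    return sorted(position[name] for name in team)


def to_dense(indices, size):
    """Expand sparse indices into a dense 0/1 vector of length `size`."""
    dense = [0] * size
    for i in indices:
        dense[i] = 1
    return dense


team = ["foxtrot", "charlie", "alpha"]
indices = multi_hot_indices(team, HEROES)
print(indices)                          # [0, 2, 5]
print(to_dense(indices, len(HEROES)))   # [1, 0, 1, 0, 0, 1, 0]
```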

I'm not using pandas or numpy for now. Until I've built up my tf knowledge, I'd rather keep this as simple as possible (but no simpler!).

Perhaps the best way to ask this is to show what I have and ask for help filling in the blanks in make_input_fn:

import tensorflow as tf
import packunpack as source
import tempfile
from collections import namedtuple

GameRecord = namedtuple('GameRecord', 'team_0 team_1 game_map winner')

def parse(line):
    parts = line.rstrip().split("\t")
    return GameRecord(
        game_map = parts[1],
        team_0 = parts[2].split(","),
        team_1 = parts[3].split(","),
        winner = int(parts[4]))

def conjugate(record):
    return GameRecord(
        team_0 = record.team_1,
        team_1 = record.team_0,
        game_map = record.game_map,
        winner = 0 if record.winner == 1 else 1)

def sparse_team(team):
    return tf.SparseTensor(indices=team, values = [1] * len(team), dense_shape=[len(source.heroes_array)])

def sparse_map(i):
    return tf.SparseTensor(indices=[i], values = [1], dense_shape=[len(source.maps_array)])

def make_input_fn(filename, shuffle = True, add_conjugate_games = True):
    def _fn():
        records = []
        with open(filename, "r") as raw:
            i = 0
            for line in raw:
                record = parse(line)
                records.append(record)
                if add_conjugate_games:
                    # the team_0 and team_1 designations are arbitrary, and so the same inference should be drawn from a game and its "conjugate" game
                    records.append(conjugate(record))
        team_0s = map(lambda r: sparse_team(r.team_0), records)
        team_1s = map(lambda r: sparse_team(r.team_1), records)
        maps = map(lambda r: sparse_map(r.game_map), records)
        winners = map(lambda r: tf.constant([r.winner]), records)
        return ({
            team_0: team_0s,
            team_1: team_1s,
            game_map: maps,
        },
        winners)
        #Please help me finish this function?
    return _fn

team_0 = tf.feature_column.embedding_column(
    tf.feature_column.categorical_column_with_vocabulary_list("team_0", source.heroes_array), 1)
team_1 = tf.feature_column.embedding_column(
    tf.feature_column.categorical_column_with_vocabulary_list("team_1", source.heroes_array), 1)
game_map = tf.feature_column.embedding_column(
    tf.feature_column.categorical_column_with_vocabulary_list("game_map", source.maps_array), 1)

model_dir = tempfile.mkdtemp()
m = tf.estimator.DNNClassifier(
    model_dir=model_dir,
    hidden_units = [1024, 512, 256],
    feature_columns=[team_0, team_1, game_map])

def main():
    m.train(input_fn=make_input_fn("validation.txt"))

if __name__ == "__main__":
    main()

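I've gone through all the documentation today, but every code example I can find shows how to feed pandas and numpy data structures into input_fn, and hides the underlying mechanics behind helper functions that don't work for my case.

(e.g., https://www.tensorflow.org/get_started/input_fn and https://www.tensorflow.org/tutorials/wide)

tf version: 1.4.0-dev20171008

When I run it, I get the stack trace below. I think it doesn't like the return value of _fn. But that dictionary does have the feature names I gave the model, AFAICT.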

  File "estimator.py", line 72, in <module>
    main()
  File "estimator.py", line 69, in main
    m.train(input_fn=make_input_fn("validation.txt"))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 302, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 711, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 694, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/canned/dnn.py", line 334, in _model_fn
    config=config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/canned/dnn.py", line 190, in _dnn_model_fn
    logits = logit_fn(features=features, mode=mode)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/canned/dnn.py", line 89, in dnn_logit_fn
    features=features, feature_columns=feature_columns)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 230, in input_layer
    trainable=trainable)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 1834, in _get_dense_tensor
    inputs, weight_collections=weight_collections, trainable=trainable)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 2119, in _get_sparse_tensors
    return _CategoricalColumn.IdWeightPair(inputs.get(self), None)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 1533, in get
    transformed = column._transform_feature(self)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 2087, in _transform_feature
    input_tensor = _to_sparse_input(inputs.get(self.key))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/feature_column/feature_column.py", line 1529, in get
    raise ValueError('Feature {} is not in features dictionary.'.format(key))
ValueError: Feature team_0 is not in features dictionary.

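I think you should check your data and make sure the field that is reported missing (team_0) actually shows up correctly. There are several things that could be wrong, such as malformed data or a field name that is misspelled in the training data source.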
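Beyond the data itself, one possible cause is visible in the posted code: the keys in the dictionary returned by `_fn` are written without quotes (`team_0: team_0s`). By the time `_fn` runs, `team_0` is the module-level embedding column object, so the dictionary would be keyed by column objects rather than by the strings `"team_0"`, `"team_1"` and `"game_map"` that the feature columns look up. The sketch below reproduces that effect in plain Python with no TensorFlow involved; `FakeColumn` and `lookup` are hypothetical stand-ins, not TensorFlow APIs.

```python
class FakeColumn:
    """Hypothetical stand-in for a feature column object (not a TF class)."""

    def __init__(self, key):
        self.key = key


def lookup(features, key):
    """Mimic a lookup by string key that fails loudly when the key is absent."""
    if key not in features:
        raise ValueError("Feature {} is not in features dictionary.".format(key))
    return features[key]


# Module-level name bound to a column object, as in the question.
team_0 = FakeColumn("team_0")

# Unquoted key: the dictionary is keyed by the object, not by the string.
broken = {team_0: ["hero_a", "hero_b"]}
# Quoted key: the dictionary is keyed by the string the column asks for.
fixed = {"team_0": ["hero_a", "hero_b"]}

try:
    lookup(broken, team_0.key)
    broken_error = None
except ValueError as err:
    broken_error = str(err)

print(broken_error)
print(lookup(fixed, team_0.key))
```

If that is the cause, quoting the keys should get past this particular error. Separately, the values in that dictionary are `map` objects rather than tensors, and `sparse_team` passes hero names where `tf.SparseTensor` expects integer indices, so further errors may follow. This is a reading of the posted code and has not been verified against that TensorFlow build.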
