How to fix "TypeError: Caught TypeError in DataLoader worker process 1" in Detectron2



I am trying to train a Detectron2 model with a COCO dataset. My dataset seems to load correctly, but when I try to train the model with DefaultTrainer, I get:

TypeError: Caught TypeError in DataLoader worker process 1.

This is my setup:

import os

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

# TOTAL_NUM_IMAGES = 10531
cfg = get_cfg()
cfg.OUTPUT_DIR = os.path.join('./output')
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")  # Let training initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025  # pick a good LR
# single_iteration = cfg.SOLVER.IMS_PER_BATCH
# iterations_for_one_epoch = TOTAL_NUM_IMAGES / single_iteration
# cfg.SOLVER.MAX_ITER = int(iterations_for_one_epoch) * 20
cfg.SOLVER.STEPS = []        # do not decay learning rate
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only has one class (person). (see https://detectron2.readthedocs.io/tutorials/datasets.html#update-the-config-for-new-datasets)
# NOTE: this config means the number of classes, but a few popular unofficial tutorials incorrectly use num_classes+1 here.
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()

After a few iterations, I get this error:

[01/06 15:14:00 d2.utils.events]:  eta: 11:25:20  iter: 125  total_loss: 0.9023  loss_cls: 0.1827  loss_box_reg: 0.1385  loss_mask: 0.5601  loss_rpn_cls: 0.009945  loss_rpn_loc: 0.0023  time: 0.5232  data_time: 0.3085  lr: 3.1219e-05  max_mem: 3271M
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-8c48e6e17647> in <module>()
26 trainer = DefaultTrainer(cfg)
27 trainer.resume_or_load(resume=False)
---> 28 trainer.train()
8 frames
/usr/local/lib/python3.7/dist-packages/torch/_utils.py in reraise(self)
432             # instantiate since we don't know how to
433             raise RuntimeError(msg) from None
--> 434         raise exception
435 
436 
TypeError: Caught TypeError in DataLoader worker process 1.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 32, in fetch
data.append(next(self.dataset_iter))
File "/usr/local/lib/python3.7/dist-packages/detectron2/data/common.py", line 201, in __iter__
yield self.dataset[idx]
File "/usr/local/lib/python3.7/dist-packages/detectron2/data/common.py", line 90, in __getitem__
data = self._map_func(self._dataset[cur_idx])
File "/usr/local/lib/python3.7/dist-packages/detectron2/utils/serialize.py", line 26, in __call__
return self._obj(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/detectron2/data/dataset_mapper.py", line 189, in __call__
self._transform_annotations(dataset_dict, transforms, image_shape)
File "/usr/local/lib/python3.7/dist-packages/detectron2/data/dataset_mapper.py", line 128, in _transform_annotations
for obj in dataset_dict.pop("annotations")
File "/usr/local/lib/python3.7/dist-packages/detectron2/data/dataset_mapper.py", line 129, in <listcomp>
if obj.get("iscrowd", 0) == 0
File "/usr/local/lib/python3.7/dist-packages/detectron2/data/detection_utils.py", line 297, in transform_instance_annotations
p.reshape(-1) for p in transforms.apply_polygons(polygons)
File "/usr/local/lib/python3.7/dist-packages/fvcore/transforms/transform.py", line 297, in <lambda>
return lambda x: self._apply(x, name)
File "/usr/local/lib/python3.7/dist-packages/fvcore/transforms/transform.py", line 291, in _apply
x = getattr(t, meth)(x)
File "/usr/local/lib/python3.7/dist-packages/fvcore/transforms/transform.py", line 150, in apply_polygons
return [self.apply_coords(p) for p in polygons]
File "/usr/local/lib/python3.7/dist-packages/fvcore/transforms/transform.py", line 150, in <listcomp>
return [self.apply_coords(p) for p in polygons]
File "/usr/local/lib/python3.7/dist-packages/detectron2/data/transforms/transform.py", line 150, in apply_coords
coords[:, 0] = coords[:, 0] * (self.new_w * 1.0 / self.w)
TypeError: can't multiply sequence by non-int of type 'float'

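The problem was in the "annotations": some values were written in scientific notation, which resulted in some ids of type float. Converting them to integers solved the problem.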
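A minimal sketch of that conversion, assuming a COCO-style JSON layout with "images", "annotations" and "categories" lists. The helper name fix_float_ids and the set of keys it checks are illustrative, not part of Detectron2. Python's json module parses a number written with an exponent (such as 1e3) as a float, which is how such ids can end up with the wrong type:

```python
import json

# Assumed COCO-style keys that are expected to hold integer ids.
ID_KEYS = ("id", "image_id", "category_id")


def fix_float_ids(coco: dict) -> int:
    """Convert whole-number float ids to int in place; return how many were changed."""
    changed = 0
    for section in ("images", "annotations", "categories"):
        for record in coco.get(section, []):
            for key in ID_KEYS:
                value = record.get(key)
                if isinstance(value, float):
                    # Refuse to silently truncate something like 12.5.
                    if not value.is_integer():
                        raise ValueError(f"{section}: {key}={value!r} is not a whole number")
                    record[key] = int(value)
                    changed += 1
    return changed


# Small in-memory example; ids written as 1e3 and 1.0 are parsed as floats.
sample = json.loads(
    '{"images": [{"id": 1e3}],'
    ' "annotations": [{"id": 1, "image_id": 1e3, "category_id": 1.0}],'
    ' "categories": [{"id": 1}]}'
)
changed = fix_float_ids(sample)
print(changed, sample["annotations"][0])
```

For a real dataset, the same function could be applied to the dict returned by json.load on the annotation file, and the result written back with json.dump before registering the dataset.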
