将模糊文件类型输入tensorflow



我目前正在尝试在MRI扫描图像上训练神经网络模型。这些图像是NIfTI(.nii(文件格式的,我不认为tensorflow或keras具有固有的读取能力。我有一个python包,它允许我在python中读取这些文件,但是我很难弄清楚如何将这个包与tensorflow接口。我首先创建了一个tf.data.Dataset对象,其中包含我每次MRI扫描的路径,然后我尝试使用Dataset.map((函数读取每个文件,并创建一个图像和标签对的数据集。我的问题是,tf.data.Dataset对象似乎将每个文件名存储在张量中,而不是字符串中,但可以读取.nii文件类型的函数无法读取张量。有没有一种方法可以将文件路径字符串张量转换为可读字符串,以允许我打开文件?如果没有,是否有更好的方法来创建数据集?

为了社区的利益,指定下面的代码,该代码出现在评论部分"agrits"指定的链接中。

# Creates a .tfrecord file from a directory of nifti images.
#   This assumes your niftis are soreted into subdirs by directory, and a regex
#   can be written to match a volume-filenames and label-filenames
#
# USAGE
#  python ./genTFrecord.py <data-dir> <input-vol-regex> <label-vol-regex>
# EXAMPLE:
#  python ./genTFrecord.py ./buckner40 'norm' 'aseg' buckner40.tfrecords
#
# Based off of this: 
#   http://warmspringwinds.github.io/tensorflow/tf-slim/2016/12/21/tfrecords-guide/
# imports
import numpy as np
import tensorflow as tf
import nibabel as nib
import os, sys, re
def _bytes_feature(value):
return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def _int64_feature(value):
return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def select_hipp(x):
x[x != 17] = 0
x[x == 17] = 1
return x
def crop_brain(x):
x = x[90:130,90:130,90:130] #should take volume zoomed in on hippocampus area
return x
def preproc_brain(x):
x = select_hipp(x)
x = crop_brain(x)   
return x
def listfiles(folder):
for root, folders, files in os.walk(folder):
for filename in folders + files:
yield os.path.join(root, filename)
def gen_filename_pairs(data_dir, v_re, l_re):
unfiltered_filelist=list(listfiles(data_dir))
input_list = [item for item in unfiltered_filelist if re.search(v_re,item)]
label_list = [item for item in unfiltered_filelist if re.search(l_re,item)]
print("input_list size:    ", len(input_list))
print("label_list size:    ", len(label_list))
if len(input_list) != len(label_list):
print("input_list size and label_list size don't match")
raise Exception
return zip(input_list, label_list)
# parse args
data_dir = sys.argv[1]
v_regex  = sys.argv[2]
l_regex  = sys.argv[3]
outfile  = sys.argv[4]
print("data_dir:   ", data_dir)
print("v_regex:    ", v_regex )
print("l_regex:    ", l_regex )
print("outfile:    ", outfile )
# Generate a list of (volume_filename, label_filename) tuples
filename_pairs = gen_filename_pairs(data_dir, v_regex, l_regex)
# To compare original to reconstructed images
original_images = []
writer = tf.python_io.TFRecordWriter(outfile)
for v_filename, l_filename in filename_pairs:
print("Processing:")
print("  volume: ", v_filename)
print("  label:  ", l_filename) 
# The volume, in nifti format   
v_nii = nib.load(v_filename)
# The volume, in numpy format
v_np = v_nii.get_data().astype('int16')
# The volume, in raw string format
v_np = crop_brain(v_np)
# The volume, in raw string format
v_raw = v_np.tostring()
# The label, in nifti format
l_nii = nib.load(l_filename)
# The label, in numpy format
l_np = l_nii.get_data().astype('int16')
# Preprocess the volume
l_np = preproc_brain(l_np)
# The label, in raw string format
l_raw = l_np.tostring()
# Dimensions
x_dim = v_np.shape[0]
y_dim = v_np.shape[1]
z_dim = v_np.shape[2]
print("DIMS: " + str(x_dim) + str(y_dim) + str(z_dim))
# Put in the original images into array for future check for correctness
# Uncomment to test (this is a memory hog)
########################################
# original_images.append((v_np, l_np))
data_point = tf.train.Example(features=tf.train.Features(feature={
'image_raw': _bytes_feature(v_raw),
'label_raw': _bytes_feature(l_raw)}))
writer.write(data_point.SerializeToString())
writer.close()

相关内容

最新更新