Apologies if this question is too basic, but I am just getting started with PyTorch (and Python).

I am trying to follow the instructions here step by step: https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html

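However, I am working with DICOM files, which are stored in two directories (CANCER/NOCANCER). I split them with split-folders so that they are structured for use with the ImageFolder dataset (as is done in the tutorial).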

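I know that I only need to load the pixel_arrays extracted from the DICOM files, and I have written some helper functions to:

  1. Collect the paths of all the .dcm files;
  2. Read them and extract the pixel_array;
  3. Do a little pre-processing.

Here is an outline of the helper functions: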
import os
import pydicom
import cv2
import numpy as np 
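def createListFiles(dirName):
    """Recursively collect the paths of all .dcm files under dirName."""
    print("Fetching all the files in the data directory...")
    lstFilesDCM = []
    for root, _dirs, fileList in os.walk(dirName):
        for filename in fileList:
            # match on the file extension only, not anywhere in the name
            if filename.lower().endswith(".dcm"):
                lstFilesDCM.append(os.path.join(root, filename))
    return lstFilesDCM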
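def castHeight(lstFilesDCM):
    """Return the smallest image height across the given DICOM files."""
    lstHeight = []
    for filenameDCM in lstFilesDCM:
        # only the header is needed here, so skip the pixel data
        readfile = pydicom.dcmread(filenameDCM, stop_before_pixels=True)
        # Rows is the image height in the DICOM header
        lstHeight.append(readfile.Rows)
    return np.min(lstHeight)
def castWidth(lstFilesDCM):
    """Return the smallest image width across the given DICOM files."""
    lstWidth = []
    for filenameDCM in lstFilesDCM:
        readfile = pydicom.dcmread(filenameDCM, stop_before_pixels=True)
        # Columns is the image width in the DICOM header
        lstWidth.append(readfile.Columns)
    return np.min(lstWidth)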
def Preproc1(listDCM):
    """Stack all images into one (n_files, height, width) float32 array."""
    new_height, new_width = castHeight(listDCM), castWidth(listDCM)
    ConstPixelDims = (len(listDCM), int(new_height), int(new_width))
    ArrayDCM = np.zeros(ConstPixelDims, dtype=np.float32)
    # loop through all the DICOM files
    for i, filenameDCM in enumerate(listDCM):
        # read the file and take its pixel data as float32
        ds = pydicom.dcmread(filenameDCM)
        imgb = ds.pixel_array.astype('float32')
        # standardisation: zero mean, unit variance per image
        imgb_stand = (imgb - imgb.mean(axis=(0, 1), keepdims=True)) / imgb.std(axis=(0, 1), keepdims=True)
        # normalisation: min-max rescale to [0, 1]
        imgb_norm = cv2.normalize(imgb_stand, None, 0, 1, cv2.NORM_MINMAX)
        # resize to the common size (cv2.resize expects (width, height)) and store it in ArrayDCM
        ArrayDCM[i, :, :] = cv2.resize(imgb_norm, (int(new_width), int(new_height)), interpolation=cv2.INTER_CUBIC)
    return ArrayDCM


# Create training and validation datasets
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val']}
# Create training and validation dataloaders
dataloaders_dict = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True, num_workers=4) for x in ['train', 'val']}


image_datasets = {x: datasets.ImageFolder(Preproc1(os.path.join(data_dir, x)), data_transforms[x]) for x in ['train', 'val']}
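As far as I can tell, ImageFolder takes a root directory (not an array) and its default loader only picks up common image extensions, so the line above cannot work as written. One direction that might work instead is a small map-style dataset that walks the same CANCER/NOCANCER layout and reads each .dcm file on demand. This is only a sketch under assumptions: `DicomFolder` and `read_dicom_pixels` are hypothetical names I made up, not part of torchvision or pydicom, and the reader is passed in as a parameter so that it can be swapped out.

```python
import os
import numpy as np

def read_dicom_pixels(path):
    """Read one DICOM file and return its pixel data as float32."""
    import pydicom  # imported lazily so the rest of the sketch runs without it
    return pydicom.dcmread(path).pixel_array.astype(np.float32)

class DicomFolder:
    """Hypothetical map-style dataset over root/<class_name>/**/*.dcm."""

    def __init__(self, root, reader=read_dicom_pixels, extension=".dcm", transform=None):
        # Class names are the sub-directory names in sorted order,
        # which is also how ImageFolder assigns its label indices.
        self.classes = sorted(
            d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
        )
        self.class_to_idx = {name: i for i, name in enumerate(self.classes)}
        self.reader = reader
        self.transform = transform
        self.samples = []  # list of (path, label) pairs
        for name in self.classes:
            class_dir = os.path.join(root, name)
            for dirpath, _dirs, files in sorted(os.walk(class_dir)):
                for fname in sorted(files):
                    if fname.lower().endswith(extension):
                        path = os.path.join(dirpath, fname)
                        self.samples.append((path, self.class_to_idx[name]))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path, label = self.samples[index]
        image = self.reader(path)
        if self.transform is not None:
            image = self.transform(image)
        return image, label
```

Since it implements `__len__` and `__getitem__`, an instance such as `DicomFolder(os.path.join(data_dir, 'train'))` should be usable with `torch.utils.data.DataLoader` in place of `image_datasets[x]`, provided the transform resizes every image to the same shape so that batches can be stacked.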


Also, another question: given that the tutorial suggests applying transforms.Normalize, is it worth keeping the standardisation step in my pre-processing?
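To make this second question concrete, here is a quick numeric check on synthetic data (not on my DICOM files; the 12-bit intensity range is an assumption). Min-max rescaling is unchanged by any shift and positive scaling applied before it, so the per-image standardisation in Preproc1 appears to be undone by the min-max step that follows it. The last function is my understanding of what transforms.Normalize computes, written in NumPy: a fixed per-channel mean and standard deviation, rather than statistics taken from each image.

```python
import numpy as np

def standardize(img):
    """Per-image standardisation: zero mean, unit variance."""
    return (img - img.mean()) / img.std()

def min_max(img):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    return (img - img.min()) / (img.max() - img.min())

def channel_normalize(chw, mean, std):
    """NumPy equivalent of a fixed per-channel (x - mean) / std on a CxHxW array."""
    mean = np.asarray(mean, dtype=np.float32)[:, None, None]
    std = np.asarray(std, dtype=np.float32)[:, None, None]
    return (chw - mean) / std

rng = np.random.default_rng(0)
# Synthetic stand-in for one pixel_array.
img = rng.integers(0, 4096, size=(64, 48)).astype(np.float32)

with_standardisation = min_max(standardize(img))
without_standardisation = min_max(img)

# True if the two pipelines agree up to floating-point error.
same = np.allclose(with_standardisation, without_standardisation, atol=1e-5)

# Replicate the [0, 1] image to 3 channels and apply the tutorial's ImageNet statistics.
chw = np.repeat(without_standardisation[None, :, :], 3, axis=0)
normalized = channel_normalize(chw, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
```

If that reading is right, the min-max step alone already gives values in [0, 1], which is the range transforms.Normalize is applied to in the tutorial.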


