Keras for creating a CNN: array size is double the number of training images

Posted by n8ghc7c1 on 2023-06-30 in Other


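I am creating a CNN to determine whether a scanned image shows a cat or a dog. I have a folder containing photos of cats and dogs. Cat files have "cat." before the image number and dog files have "dog."; for example, a cat image might be named "cat.403.jpg". There are about 20,000 images in total. However, when I run the code below, my array stores about 40,000 values, almost double that number. I don't understand how this happens. The code is below: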

DATADIR = "C:/Users/me/Jupyter Codes/dogs-vs-cats/train/train"
CATEGORIES = ['dog', 'cat']
IMG_SIZE = 50 
training_data = []
cat_img_array = []
dog_img_array = []
def create_training_data():
    for category in CATEGORIES:
        train_path = os.path.join(DATADIR)   # path to train folder
        class_num = CATEGORIES.index(category)
        for img in os.listdir(train_path):
            animal = img.split('.')[0]
            if animal == 'cat':
                cat_img_array = cv2.imread(os.path.join(train_path, img), cv2.IMREAD_GRAYSCALE)  # convert to grayscale bc RGB > gray; plus not essential
                cv2.resize(cat_img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([cat_img_array, class_num])
                
            elif animal == 'dog':
                dog_img_array = cv2.imread(os.path.join(train_path, img), cv2.IMREAD_GRAYSCALE)
                cv2.resize(dog_img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([dog_img_array, class_num])
create_training_data()
print(len(training_data))

This reports 41900 images.
When I then run this code:

# pack data into variables before fed into CNN
X = []
Y = []
for features, label in training_data:
    X.append(features)
    Y.append(label)
    
X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1) # -1 -> make array number based on data; 1 -> grayscale (3 if RGB)
Y = np.array(Y).reshape(-1, IMG_SIZE, IMG_SIZE, 1)

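I get these error messages:
VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray. X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1) # -1 -> make array number based on data; 1 -> grayscale (3 if RGB)
ValueError: cannot reshape array of size 41900 into shape (50,50,1)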

keras

Source: https://stackoverflow.com/questions/76553128/keras-for-creating-cnn-size-of-array-double-number-of-training-images

1 Answer


jmo0nnb31#

You are iterating over the dataset twice because of this line:

for category in CATEGORIES: ## CATEGORIES = ['dog', 'cat']

As a result, you add everything twice and also mislabel the data: the first pass labels every image as "dog", and the second pass labels every image as "cat".
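
To make the double counting concrete, here is a minimal sketch that mirrors the loop structure from the question without touching any image files. The `filenames` list is a hypothetical stand-in for `os.listdir(train_path)`, and the names `buggy` and `fixed` are made up for illustration.

```python
# Hypothetical file names standing in for os.listdir(train_path).
filenames = ["cat.0.jpg", "cat.1.jpg", "dog.0.jpg", "dog.1.jpg"]
CATEGORIES = ["dog", "cat"]

# Structure from the question: the whole folder is scanned once per category,
# and every file receives the label of the current category.
buggy = []
for category in CATEGORIES:
    class_num = CATEGORIES.index(category)
    for name in filenames:
        animal = name.split(".")[0]
        if animal in ("cat", "dog"):
            buggy.append((name, class_num))

# Single pass: the label is derived from the file name itself.
fixed = []
for name in filenames:
    animal = name.split(".")[0]
    if animal in CATEGORIES:
        fixed.append((name, CATEGORIES.index(animal)))

print(len(buggy), len(fixed))  # twice the file count vs. the file count
```

With the nested loop, each file ends up in the list once with label 0 and once with label 1, which matches the roughly doubled count reported in the question.
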
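Also check this line. It does not do what you want, because it does not modify the array in place; cv2.resize returns the resized image as a new array, so the result has to be assigned: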

cv2.resize(cat_img_array, (IMG_SIZE, IMG_SIZE))

If you do something like this instead, you can also get rid of the if statement:

CATEGORIES = {'cat': 0, 'dog': 1}
for img in os.listdir(train_path):
    animal = img.split('.')[0]
    img_array = cv2.imread(os.path.join(train_path, img), cv2.IMREAD_GRAYSCALE)  # convert to grayscale bc RGB > gray; plus not essential
    resized = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
    training_data.append([resized, CATEGORIES[animal]])
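
The ValueError in the question is consistent with the images never having been resized: images of different shapes would make `np.array(X)` a one-dimensional ragged array with 41900 entries, which cannot be reshaped to 50x50x1. Separately, the labels do not need the image-shaped reshape at all, since there is one label per image. The following is a sketch of the packing step under those assumptions; the zero-filled arrays are synthetic stand-ins for the output of `cv2.resize`, not real data.

```python
import numpy as np

IMG_SIZE = 50

# Synthetic stand-in for training_data: four already-resized grayscale
# images (all zeros) paired with alternating labels 0 and 1.
training_data = [
    [np.zeros((IMG_SIZE, IMG_SIZE), dtype=np.uint8), i % 2] for i in range(4)
]

X = []
Y = []
for features, label in training_data:
    X.append(features)
    Y.append(label)

# Every image has the same shape, so this stacks cleanly and the reshape
# only adds the trailing channel axis (1 for grayscale).
X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1)

# One label per image: keep Y one-dimensional instead of reshaping it
# like an image.
Y = np.array(Y)

print(X.shape, Y.shape)
```

Here `X` has one 50x50x1 entry per image and `Y` has one integer per image, so the two arrays agree on their first dimension.
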


Answered 2023-06-30
