Data preprocessing tips for a Keras CNN

Posted by f0ofjuux on 2023-02-12 in Other

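I'm still new to deep learning and machine learning, so apologies if this is very basic. I'm developing a model that uses drone photography to detect damage to palm trees caused by invasive coconut rhinoceros beetles. The 1080p photos I receive are taken 250 feet above the ground and then cropped into smaller images of equal size; some contain one or more palm trees and some contain none. I use labelStudio to generate XML files that point to the path of the corresponding jpg.
My current problem is getting the XML into a CSV for training and validation in Keras. The crops from every photo share the same file names, for example:
drone_picture1: 11.jpg 12.jpg 13.jpg ... 46.jpg
drone_picture2: 11.jpg 12.jpg 13.jpg ... 46.jpg
drone_picture1000: 11.jpg 12.jpg 13.jpg ... 46.jpg
I'm using a Python script written by a previous student. It is supposed to split the data for training and validation into separate directories and create the csv files and the model. When I run it, though, it seems to have a problem with the cropped images sharing the same naming scheme. My test and validation directories now look like this:
Test and validation directories: 11.jpg 11(1).jpg 11(2).jpg 12.jpg 13.jpg 13(1).jpg 152.jpg ... 999.jpg 999(1).jpg 1000.jpg
Note: the cropped images all follow the same naming scheme but live in different directories. When the script splits them into the test and validation sets, however, it runs into duplicate file names and appends a number in parentheses.
My question: is there a better way to preprocess image data with XML annotations into CSV format without manually renaming 1000 images? Keep in mind that the XML annotations also point to the path of their jpg, so if I change a jpg name I also have to change its XML annotation.
The only thing I can think of is writing a new cropping script that makes sure all names are unique the next time I collect image data, but I would rather not have to go back and redo the current data.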
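
The collision described above comes from flattening many folders that each contain an `11.jpg` into a single directory. A minimal sketch of one possible workaround follows: copy the crops into a flat directory while folding each crop's parent folder into its file name. The function names and the folder names in the usage note are hypothetical, and the sketch assumes the destination directory lies outside the source tree and that the originals stay in place, so the paths stored in the XML files remain valid.

```python
import os
import shutil


def unique_name(src_path, src_root):
    """Fold the path relative to src_root into a single file name."""
    rel = os.path.relpath(src_path, src_root)
    return rel.replace(os.sep, "_")


def copy_with_unique_names(src_root, dst_dir):
    """Copy every .jpg under src_root into dst_dir without name clashes.

    Returns a dict that maps each original path to its new path.
    Assumption: dst_dir is not located inside src_root.
    """
    os.makedirs(dst_dir, exist_ok=True)
    mapping = {}
    for root, dirs, files in os.walk(src_root):
        dirs.sort()  # visit sub-folders in a stable order
        for name in sorted(files):
            if not name.lower().endswith(".jpg"):
                continue
            src = os.path.join(root, name)
            dst = os.path.join(dst_dir, unique_name(src, src_root))
            shutil.copy2(src, dst)  # the original file is left untouched
            mapping[src] = dst
    return mapping
```

With folders such as `drone_picture1` and `drone_picture2`, each holding an `11.jpg`, the copies would be named `drone_picture1_11.jpg` and `drone_picture2_11.jpg`. The returned mapping can be used to look up the new name of each annotated image when building the CSV.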

Edit:

Update: it looks like I need to make sure the path slashes are consistent.
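
As an illustration of what consistent slashes could mean in practice, here is a minimal sketch that rewrites every backslash in a stored path as a forward slash, which Python accepts on Windows as well as on Linux and macOS. The helper name and the example path in the usage note are hypothetical.

```python
def normalize_slashes(path_text):
    """Return path_text with every backslash replaced by a forward slash."""
    return path_text.replace("\\", "/")
```

For example, a mixed path such as `C:\crops/drone_picture1\11.jpg` would become `C:/crops/drone_picture1/11.jpg`. Applying the same helper to the paths written into the XML files and to the paths written into the CSV keeps the two in the same form, so they can be compared as plain strings.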
Here is a picture of the cropped image directories.
Here is a picture of the training and validation sets that were created.
Here is a picture of the csv files that were generated.
The script I created to edit the XML path tags (mostly GPT):

import os
import tkinter as tk
from tkinter import filedialog
from xml.etree import ElementTree as ET

def browse_directory():
    root = tk.Tk()
    root.withdraw()
    xml_directory = filedialog.askdirectory(parent=root, title='Choose the directory of the XML files')
    jpg_directory = filedialog.askdirectory(parent=root, title='Choose the directory of the JPG files')
    batch_edit_xml(xml_directory, jpg_directory)

def headless_mode():
    xml_directory = input("Enter the path of the XML folder: ")
    jpg_directory = input("Enter the path of the JPG folder: ")
    batch_edit_xml(xml_directory, jpg_directory)

def batch_edit_xml(xml_directory, jpg_directory):
    for root, dirs, files in os.walk(xml_directory):
        xml_files = [f for f in files if f.endswith(".xml")] # only the XML files in this folder
        for count, file in enumerate(xml_files, start=1):
            file_path = os.path.join(root, file) # full path of the XML file
            xml_tree = ET.parse(file_path) # parsing the XML file
            xml_root = xml_tree.getroot() # getting the root of the XML file
            filename = os.path.splitext(file)[0] # file name without the extension
            # the JPG is expected in a folder named like the XML file's folder
            jpg_path = os.path.join(jpg_directory, os.path.basename(root), filename + '.jpg')
            path_element = xml_root.find('./path') # the element that stores the image path
            if path_element is None: # skip files without a path element instead of crashing
                print(f"Skipping {file_path}: no path element found")
                continue
            path_element.text = jpg_path # updating the path element with the jpg_path
            xml_tree.write(file_path) # writing the changes back to the XML file
            print(f"{count} of {len(xml_files)}: {file_path}") # progress within the current folder
    print("Edit Complete") # indicating that the edit is complete

mode = input("Enter 1 for headless mode or 2 for desktop mode: ")
if mode == '1':
    headless_mode()
elif mode == '2':
    browse_directory()
else:
    print("Invalid input. Please enter 1 or 2.")

keras

Source: https://stackoverflow.com/questions/75414320/data-preprocessing-for-cnn-tips

1 answer

ee7vknir #1

It is not hard to write another Python script that reads all the images in the test dir and saves their paths to a csv file. Sample Python code:

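import os
import pandas as pd

test_dir = 'test'  # placeholder: set this to the directory that holds all test images
images = []
# walk through test_dir and collect the full path of every file
for path, subdirs, files in os.walk(test_dir):
    for image_name in files:
        images.append(os.path.join(path, image_name))
data = {'image name': images}  # avoid shadowing the built-in dict
df = pd.DataFrame(data)
df.to_csv('your.csv')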
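
The same walk can also read the annotations themselves. What follows is a minimal sketch rather than a definitive converter: it assumes the XML files follow a Pascal VOC-style layout (`object` elements that each hold a `name` and a `bndbox` with `xmin`, `ymin`, `xmax`, `ymax`), which may not match the actual export, and the function names and CSV column names are hypothetical. Each image is identified by its folder plus file name, so crops that share a name in different folders stay distinct without any renaming.

```python
import csv
import os
from xml.etree import ElementTree as ET

FIELDS = ["image", "label", "xmin", "ymin", "xmax", "ymax"]


def annotation_rows(xml_dir):
    """Yield one dict per annotated object found under xml_dir.

    Assumes Pascal VOC-style tags; adjust the tag names if the export differs.
    """
    for root, dirs, files in os.walk(xml_dir):
        dirs.sort()  # visit sub-folders in a stable order
        for name in sorted(files):
            if not name.lower().endswith(".xml"):
                continue
            xml_path = os.path.join(root, name)
            # Identify the image by its path relative to xml_dir, with forward
            # slashes, e.g. "<folder>/11.jpg" (assumes .jpg images).
            rel = os.path.relpath(xml_path, xml_dir)
            image_id = os.path.splitext(rel)[0].replace(os.sep, "/") + ".jpg"
            for obj in ET.parse(xml_path).getroot().iter("object"):
                box = obj.find("bndbox")
                if box is None:
                    continue  # object without a bounding box
                row = {"image": image_id, "label": obj.findtext("name", default="")}
                for key in ("xmin", "ymin", "xmax", "ymax"):
                    # assumes all four coordinates are present and numeric
                    row[key] = int(float(box.findtext(key)))
                yield row


def write_annotations_csv(xml_dir, csv_path):
    """Write all annotation rows to csv_path and return the number of rows."""
    count = 0
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        for row in annotation_rows(xml_dir):
            writer.writerow(row)
            count += 1
    return count
```

Images without any annotated object produce no rows in this sketch, so crops that contain no palm tree would need separate handling if the model should also see them as negatives.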

Answered 2023-02-12
