C++: Reducing memory-transfer overhead for offscreen rendering in OpenGL ES 2.0 / OpenGL 2.1 with EGL 1.4

x6492ojm  posted 12 months ago  in Other

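I am at a very early stage of trying to do some image processing on a headless embedded device running Linux with a Mali-400 GPU. It supports OpenGL ES 2.0 through the official driver, and possibly a mostly complete OpenGL 2.1 through the unofficial LIMA driver.
Specifically, images arrive in DMA-mapped memory from an external system. I load them into a (MONO8/LUMINANCE) texture, run a shader program that renders into another texture, and then read that texture back with glReadPixels. I can post more complete code if it would help anyone, but for now I am only showing the relevant parts of the setup to avoid clutter (all of it fairly standard, I think):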

// Setup code:
display = eglGetDisplay(EGL_DEFAULT_DISPLAY);
eglInitialize(display, &major, &minor);
const EGLint configAttributes[] = {
        EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
        EGL_SURFACE_TYPE, EGL_PBUFFER_BIT,
        EGL_BLUE_SIZE, 8,
        EGL_GREEN_SIZE, 8,
        EGL_RED_SIZE, 8,
        EGL_ALPHA_SIZE, 8,
        EGL_NONE
  };
eglChooseConfig(display, configAttributes, &config, 1, &numConfigs);
const EGLint pbufferAttributes[] = {
        EGL_WIDTH, 1920,
        EGL_HEIGHT, 1200,
        EGL_NONE
};
surface = eglCreatePbufferSurface(display, config, pbufferAttributes);
const EGLint contextAttributes[] = {
        EGL_CONTEXT_CLIENT_VERSION, 2,
        EGL_NONE
};
context = eglCreateContext(display, config, EGL_NO_CONTEXT, contextAttributes);
eglMakeCurrent(display, surface, surface, context);

... Setup shaders, VBOs, etc ...

// Texture used to load image
glGenTextures(1, &textureID);
glBindTexture(GL_TEXTURE_2D, textureID);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
unsigned char* data = (unsigned char*)malloc(1920 * 1200 * sizeof(unsigned char));
glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, 1920, 1200, 0, GL_LUMINANCE, GL_UNSIGNED_BYTE, data); // Bind to dummy data at first, check if we can remove this
glBindBuffer(GL_ARRAY_BUFFER, VBOVertices);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 5 * sizeof(GLfloat), (GLvoid*)0);
glEnableVertexAttribArray(0);
glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 5 * sizeof(GLfloat), (GLvoid*)(3 * sizeof(GLfloat)));
glEnableVertexAttribArray(1);

// Texture used to render into
GLuint framebuffer;
glGenFramebuffers(1, &framebuffer);
glBindFramebuffer(GL_FRAMEBUFFER, framebuffer);
GLuint texture;
glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_2D, texture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 1920, 1200, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texture, 0);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, textureID);
// LATER: In main loop
auto start = std::chrono::steady_clock::now(); // Start timer for image loading
glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, 1920, 1200, 0, GL_LUMINANCE, GL_UNSIGNED_BYTE, image.img); // Load current image into texture from before
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, textureID);
glUniform1i(glGetUniformLocation(shaderProgram, "texture1"), 0);
auto text_loaded = std::chrono::steady_clock::now(); // Texture is loaded, end timer for image loading
glUseProgram(shaderProgram);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
glFinish();
auto gl_finished = std::chrono::steady_clock::now(); // All rendering should be done when glFinish returns?
glReadPixels(0, 0, 1920, 1200, GL_RGBA, GL_UNSIGNED_BYTE, preprocessed_img_buf);
auto end = std::chrono::steady_clock::now(); // End timer for readout
// This is needed to get back into MONO8 format
for (int i = 0; i < 1920 * 1200; i++){
  processed_img_buf[i] = preprocessed_img_buf[i * 4];
}
// Print frame times
auto text_loaded_time = std::chrono::duration_cast<std::chrono::microseconds>(text_loaded - start).count();
auto frame_render_time = std::chrono::duration_cast<std::chrono::microseconds>(gl_finished - text_loaded).count();
auto frame_readout_time = std::chrono::duration_cast<std::chrono::microseconds>(end - gl_finished).count();

Trying to get a rough benchmark with the code above, I am seeing times I can hardly believe:

Image loading time: 7148 us
Render time: 85720 us
Readout time: 158734 us
Total frame time: 251602 us

Image loading time: 4797 us
Render time: 85841 us
Readout time: 152563 us
Total frame time: 243201 us

Image loading time: 6018 us
Render time: 85757 us
Readout time: 158420 us
Total frame time: 250195 us
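
To put these figures in perspective, the readout time can be converted into an effective transfer rate. The sketch below assumes the readback moves exactly 1920 x 1200 x 4 bytes (one RGBA8 frame) and uses the first sample above; the function names are made up for illustration.

```cpp
#include <cstddef>
#include <cstdio>

// Effective transfer rate in MB/s (1 MB = 10^6 bytes) for `bytes` moved in
// `micros` microseconds. Bytes per microsecond is numerically equal to MB/s.
double transferRateMBps(std::size_t bytes, long long micros) {
    return static_cast<double>(bytes) / static_cast<double>(micros);
}

// Frames per second for a total frame time given in microseconds.
double framesPerSecond(long long frameMicros) {
    return 1e6 / static_cast<double>(frameMicros);
}

// Prints both figures for the first measurement listed above.
void printFirstSample() {
    const std::size_t rgbaBytes = 1920u * 1200u * 4u; // one RGBA8 frame from glReadPixels
    std::printf("Readout: %.1f MB/s\n", transferRateMBps(rgbaBytes, 158734));
    std::printf("Frame rate: %.2f FPS\n", framesPerSecond(251602));
}
```

For the first sample this works out to roughly 58 MB/s for the readback and just under 4 FPS overall.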

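I expected glReadPixels to be slow, but not twice the render time, and not slower than 10 FPS (assuming I am benchmarking in a reasonably sane way). That makes me think I am doing something wrong, yet everything I have tried seems to be unsupported in one way or another: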

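  • Tried rendering into a proper GL_LUMINANCE texture, but got errors. It seems my EGL has no config with just a single 8 bpp channel, so as far as I understand I cannot render that way?
  • Tried using GL_LUMINANCE with glReadPixels, but found in the documentation that the function does not support that format; it reportedly accepts only GL_RGB, GL_RGBA, and GL_ALPHA.
  • Tried hacking my shader to use only the alpha channel, so that I would not have to read back 4x redundant data, and calling glReadPixels with GL_ALPHA. Despite the documentation, I got an invalid operation error (though maybe I misconfigured something earlier on?).
  • Tried looking at other kinds of buffers/objects (pixmap, renderbuffer, pixel buffer object, data buffer), but they all seem to have some limitation that makes them unsuitable here (mostly format, use case, or lack of support).
  • I looked at EGL images, but I find them very confusing, and I do not want to break the parts of my code that appear to work the way I expect. From what I can tell they are also more of an input mechanism, so my thinking is that I would rather attack the part that takes more than half of the time.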

So, my title question: in the situation I have described, is there a better way to do this?

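What would that combination of options/settings look like? I assumed this was feasible, but now I doubt whether this GPU is of any use to me without a large time investment. Is that the case, or am I just a few enums away from a more reasonable frame time?
I also have a few further thoughts/questions that I do not want to treat as part of the main question, to avoid distraction: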

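  • Although it is probably very complicated, I might be able to use the DMA controller/memory in the system to do this transfer as fast as frames normally reach a screen. Is that possible, and would it even help, or am I overestimating the power of the Mali-400, and it was never meant to drive 1920x1200?
  • I could also try compiling the LIMA driver to get access to GL 2.1, but I am still learning GLES and EGL, so I am wary of stacking more things to learn on top unless there is some promise that I can achieve what I am trying to do.
  • Would the driver being open source help in itself? I have done very little driver development and none on GPUs, so I am not sure how ambitious that is, although I feel I might be more comfortable with it than with trying to do what I thought was a simple task in GLES.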

Answer #1 (kqhtkvqz)

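As I mentioned in the comments, you can use the GBM library for this task. Here is a step-by-step guide:
1. Create a GBM surface backed by one or more GBM BOs (buffer objects).
2. Create an EGLSurface from the GBM surface and use it as the render target.
3. If your graphics driver supports it, use the DRM PRIME API to obtain a DMA file descriptor (fd) for each BO. This lets you map the buffers and read their contents as needed.
4. When creating the GBM surface, make sure to request a linear format modifier so that the contents of the mapped memory are laid out in a way that makes sense for your use case.
The following example shows how to render into a GBM surface and save the result to a PNG image through the mapped DMA file descriptor: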

meson.build

project(
    'OpenGL DMA Read Example',
    'c',
    version : '0.1.0',
    meson_version: '>= 0.59.0',
    default_options: [
        'warning_level=2',
        'buildtype=debug'
    ]
)

c = meson.get_compiler('c')

include_paths = []

include_paths_sys = [
    '/usr/local/include',
    '/usr/include/drm',
    '/usr/include/libdrm',
    '/usr/include/freetype2'
]

foreach p : include_paths_sys
    if run_command('[', '-d', p, ']', check : false).returncode() == 0
      include_paths += [include_directories(p)]
    endif
endforeach

egl_dep             = c.find_library('EGL')
glesv2_dep          = c.find_library('GLESv2')
drm_dep             = c.find_library('drm')
gbm_dep             = c.find_library('gbm')
freeimage_dep       = c.find_library('freeimage')

executable(
    'dma-read',
    sources : ['main.c'],
    include_directories : include_paths,
    dependencies : [
        egl_dep,
        glesv2_dep,
        drm_dep,
        gbm_dep,
        freeimage_dep
    ])

main.c

#include <EGL/egl.h>
#include <errno.h>
#include <gbm.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <xf86drmMode.h>
#include <sys/ioctl.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <GLES2/gl2.h>
#include <FreeImage.h>
#include <linux/dma-buf.h>
#include <linux/dma-heap.h>

#define PNG_PATH "/tmp/dma_read.png"
#define DRM_DEVICE "/dev/dri/card0"
#define WIDTH 512
#define HEIGHT 512
#define FORMAT GBM_FORMAT_ARGB8888

static int drmFd, dmaFd;
static char *map;
static int offset;
static unsigned int stride;
static struct gbm_device *gbmDevice;
static struct gbm_surface *gbmSurface;
static struct gbm_bo *gbmBO;
static EGLDisplay eglDisplay;
static EGLContext eglContext;
static EGLSurface eglSurface;
static EGLConfig eglConfig;

static const EGLint eglConfigAttribs[] =
{
    EGL_SURFACE_TYPE, EGL_WINDOW_BIT,
    EGL_RED_SIZE, 8,
    EGL_GREEN_SIZE, 8,
    EGL_BLUE_SIZE, 8,
    EGL_ALPHA_SIZE, 8,
    EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
    EGL_NONE
};

static int matchConfigToVisual(EGLDisplay egl_display, EGLint visual_id, EGLConfig *configs, int count)
{
    for (int i = 0; i < count; ++i)
    {
        EGLint id;

        if (!eglGetConfigAttrib(egl_display, configs[i], EGL_NATIVE_VISUAL_ID, &id))
            continue;

        if (id == visual_id)
            return i;
    }

    return -1;
}

static int chooseEGLConfiguration(EGLDisplay egl_display, const EGLint *attribs, EGLint visual_id, EGLConfig *config_out)
{
    EGLint count = 0;
    EGLint matched = 0;
    EGLConfig *configs;
    int config_index = -1;

    if (!eglGetConfigs(egl_display, NULL, 0, &count) || count < 1)
    {
        printf("No EGL configs to choose from.\n");
        return 0;
    }

    configs = (void**)malloc(count * sizeof *configs);

    if (!configs)
        return 0;

    if (!eglChooseConfig(egl_display, attribs, configs, count, &matched) || !matched)
    {
        printf("No EGL configs with appropriate attributes.\n");
        goto out;
    }

    if (!visual_id)
        config_index = 0;

    if (config_index == -1)
        config_index = matchConfigToVisual(egl_display, visual_id, configs, matched);

    if (config_index != -1)
        *config_out = configs[config_index];

out:
    free(configs);
    if (config_index == -1)
        return 0;

    return 1;
}

int getDMAFDFromBO(int drmFd, struct gbm_bo *bo)
{
    struct drm_prime_handle prime_handle;
    memset(&prime_handle, 0, sizeof(prime_handle));
    prime_handle.handle = gbm_bo_get_handle(bo).u32;
    prime_handle.flags = DRM_CLOEXEC | DRM_RDWR;
    prime_handle.fd = -1;

    if (ioctl(drmFd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &prime_handle) != 0)
        goto fail;

    if (prime_handle.fd < 0)
        goto fail;

    // Set read and write permissions on the file descriptor
    if (fcntl(prime_handle.fd, F_SETFL, fcntl(prime_handle.fd, F_GETFL) | O_RDWR) == -1)
    {
        close(prime_handle.fd);
        goto fail;
    }

    printf("Got BO DMA fd using DRM_IOCTL_PRIME_HANDLE_TO_FD.\n");
    return prime_handle.fd;

fail:

    prime_handle.fd = gbm_bo_get_fd(bo);

    if (prime_handle.fd >= 0)
    {
        printf("Got BO DMA fd using gbm_bo_get_fd().\n");
        return prime_handle.fd;
    }

    printf("Failed to get fd for handle %u: %s\n", prime_handle.handle, strerror(errno));
    return -1;
}

int mapDMA()
{
    map = mmap(NULL, HEIGHT * stride, PROT_READ | PROT_WRITE, MAP_SHARED, dmaFd, 0);

    if (map == MAP_FAILED)
    {
        map = mmap(NULL, HEIGHT * stride, PROT_WRITE, MAP_SHARED, dmaFd, 0);

        if (map == MAP_FAILED)
        {
            // gbm_bo_map() needs a valid location to store its per-mapping handle
            static void *mapData = NULL;
            map = gbm_bo_map(gbmBO, 0, 0, WIDTH, HEIGHT, GBM_BO_TRANSFER_READ, &stride, &mapData);

            if (!map)
            {
                printf("Failed to map DMA fd.\n");
                return 0;
            }
        }
    }

    return 1;
}

int init()
{
    drmFd = open(DRM_DEVICE, O_RDWR);

    if (drmFd < 0)
    {
        printf("Failed to open DRM device %s.\n", DRM_DEVICE);
        return 0;
    }

    gbmDevice = gbm_create_device(drmFd);

    if (!gbmDevice)
    {
        printf("Failed to create GBM device.\n");
        return 0;
    }

    eglDisplay = eglGetDisplay((EGLNativeDisplayType)gbmDevice);

    if (eglDisplay == EGL_NO_DISPLAY)
    {
        printf("Failed to get EGL display.\n");
        return 0;
    }

    if (!eglInitialize(eglDisplay, NULL, NULL))
    {
        printf("Failed to initialize EGL display.\n");
        return 0;
    }

    if (!chooseEGLConfiguration(eglDisplay, eglConfigAttribs, FORMAT, &eglConfig))
    {
        printf("Failed to choose EGL configuration.\n");
        return 0;
    }

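    // Request an OpenGL ES 2.0 context; without this attribute EGL defaults to ES 1.x
    static const EGLint eglContextAttribs[] = { EGL_CONTEXT_CLIENT_VERSION, 2, EGL_NONE };
    eglContext = eglCreateContext(eglDisplay, eglConfig, EGL_NO_CONTEXT, eglContextAttribs);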

    if (eglContext == EGL_NO_CONTEXT)
    {
        printf("Failed to create EGL context.\n");
        return 0;
    }

    gbmSurface = gbm_surface_create(
        gbmDevice,
        WIDTH,
        HEIGHT,
        FORMAT,
        GBM_BO_USE_RENDERING | GBM_BO_USE_LINEAR);

    if (!gbmSurface)
    {
        printf("Failed to create GBM surface.\n");
        return 0;
    }

    eglSurface = eglCreateWindowSurface(eglDisplay, eglConfig, (EGLNativeWindowType)gbmSurface, NULL);

    if (eglSurface == EGL_NO_SURFACE)
    {
        printf("Failed to create EGL surface.\n");
        return 0;
    }

    eglMakeCurrent(eglDisplay,
                   eglSurface,
                   eglSurface,
                   eglContext);

    eglSwapBuffers(eglDisplay, eglSurface);

    // Create a single BO (calling gbm_surface_lock_front_buffer() again before gbm_surface_release_buffer() would create another BO)
    gbmBO = gbm_surface_lock_front_buffer(gbmSurface);
    gbm_surface_release_buffer(gbmSurface, gbmBO);

    stride = gbm_bo_get_stride(gbmBO);
    offset = gbm_bo_get_offset(gbmBO, 0);
    dmaFd = getDMAFDFromBO(drmFd, gbmBO);

    if (dmaFd < 0)
        return 0;

    if (!mapDMA())
        return 0;

    return 1;
}

void savePNG()
{
    eglSwapBuffers(eglDisplay, eglSurface);
    gbm_surface_lock_front_buffer(gbmSurface);

    struct dma_buf_sync sync;
    sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ;
    ioctl(dmaFd, DMA_BUF_IOCTL_SYNC, &sync);

    FIBITMAP *image = FreeImage_ConvertFromRawBits((BYTE*)&map[offset],
                                                   WIDTH,
                                                   HEIGHT,
                                                   stride,
                                                   32,
                                                   0xFF0000, 0x00FF00, 0x0000FF,
                                                   FALSE); // FreeImage's BOOL constant; plain C has no 'false' without <stdbool.h>

    if (FreeImage_Save(FIF_PNG, image, PNG_PATH, PNG_DEFAULT))
        printf("PNG image saved: %s.\n", PNG_PATH);
    else
        printf("Failed to save PNG image: %s.\n", PNG_PATH);

    FreeImage_Unload(image);

    sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
    ioctl(dmaFd, DMA_BUF_IOCTL_SYNC, &sync);

    gbm_surface_release_buffer(gbmSurface, gbmBO);
}

void render()
{
    glEnable(GL_SCISSOR_TEST);

    // Red
    glViewport(0, 0, WIDTH/2, HEIGHT/2);
    glScissor(0, 0, WIDTH/2, HEIGHT/2);
    glClearColor(1.f, 0.f, 0.f, 1.f);
    glClear(GL_COLOR_BUFFER_BIT);

    // Green
    glViewport(WIDTH/2, 0, WIDTH/2, HEIGHT/2);
    glScissor(WIDTH/2, 0, WIDTH/2, HEIGHT/2);
    glClearColor(0.f, 1.f, 0.f, 1.f);
    glClear(GL_COLOR_BUFFER_BIT);

    // Blue
    glViewport(0, HEIGHT/2, WIDTH/2, HEIGHT/2);
    glScissor(0, HEIGHT/2, WIDTH/2, HEIGHT/2);
    glClearColor(0.f, 0.f, 1.f, 1.f);
    glClear(GL_COLOR_BUFFER_BIT);

    // Black
    glViewport(WIDTH/2, HEIGHT/2, WIDTH/2, HEIGHT/2);
    glScissor(WIDTH/2, HEIGHT/2, WIDTH/2, HEIGHT/2);
    glClearColor(0.f, 0.f, 0.f, 1.f);
    glClear(GL_COLOR_BUFFER_BIT);
}

int main()
{
    if (!init())
        return 1;
    
    render();
    savePNG();
    return 0;
}

To test it, put both files in the same directory and run the following commands:

$ meson setup build
$ cd build
$ meson compile
$ ./dma-read

If everything goes well, a PNG file with four colored quadrants (red, green, blue, and black) should be saved to /tmp/dma_read.png.
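
Since the question ultimately needs MONO8 output, the mapped buffer could be reduced to one channel while honoring the stride. The following is a minimal C++ sketch, not part of the example above; it assumes the result lives in the red channel and that the buffer is linear GBM_FORMAT_ARGB8888, which is stored in memory as B, G, R, A on little-endian systems. The function name extractRedToMono8 is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies the red channel of a linear ARGB8888 buffer into a tightly packed
// MONO8 image. `base` points at the first byte of the plane (mapping address
// plus plane offset); `strideBytes` is the distance between row starts and
// may be larger than width * 4.
std::vector<std::uint8_t> extractRedToMono8(const std::uint8_t* base,
                                            std::size_t width,
                                            std::size_t height,
                                            std::size_t strideBytes) {
    constexpr std::size_t kBytesPerPixel = 4;
    constexpr std::size_t kRedByte = 2; // assumption: little-endian ARGB8888 is laid out as B, G, R, A
    std::vector<std::uint8_t> mono(width * height);
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* row = base + y * strideBytes; // step by stride, not by width * 4
        for (std::size_t x = 0; x < width; ++x) {
            mono[y * width + x] = row[x * kBytesPerPixel + kRedByte];
        }
    }
    return mono;
}
```

In terms of the example above, this would be called as extractRedToMono8(reinterpret_cast<const std::uint8_t*>(map) + offset, WIDTH, HEIGHT, stride), between the DMA_BUF_SYNC_START and DMA_BUF_SYNC_END ioctls.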
