MySQL assumes a partial `pwrite()` means the disk is full

Posted by b4qexyjb on 2021-06-20 in MySQL

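I am supporting a large customer that runs several MySQL 5.5.59 databases on CentOS 6 or 7 virtual servers under VMware. The MySQL databases are stored on XFS file systems on iSCSI LUNs.
I am seeing intermittently corrupted tables, accompanied by errors like this in the MySQL error log: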

180619 15:08:38  InnoDB: Error: Write to file (merge) failed at offset 0 656408576
InnoDB: 1048576 bytes should have been written, only 921600 were written.
InnoDB: Operating system error number 0.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 0 means 'Success'.

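The disks are never full, there are no quotas, and the files are only a few GB in size (2-6), well within the file sizes XFS supports on 64-bit Linux.
Note the `errno` of 0.
The MySQL code that generates this error is in `storage/innobase/os/os0file.c`. It blindly assumes that any partial write means the disk is full (the complete MySQL source code for 5.5 is available at https://dev.mysql.com/downloads/mysql/5.5.html):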

ret = os_file_pwrite(file, buf, n, offset, offset_high);

if ((ulint)ret == n) {

    return(TRUE);
}

if (!os_has_said_disk_full) {

    ut_print_timestamp(stderr);

    fprintf(stderr,
        "  InnoDB: Error: Write to file %s failed"
        " at offset %lu %lu.\n"
        "InnoDB: %lu bytes should have been written,"
        " only %ld were written.\n"
        "InnoDB: Operating system error number %lu.\n"
        "InnoDB: Check that your OS and file system"
        " support files of this size.\n"
        "InnoDB: Check also that the disk is not full"
        " or a disk quota exceeded.\n",
        name, offset_high, offset, n, (long int)ret,
        (ulint)errno);
    if (strerror(errno) != NULL) {
        fprintf(stderr,
            "InnoDB: Error number %lu means '%s'.\n",
            (ulint)errno, strerror(errno));
    }

    fprintf(stderr,
        "InnoDB: Some operating system error numbers"
        " are described at\n"
        "InnoDB: "
        REFMAN "operating-system-error-codes.html\n");

    os_has_said_disk_full = TRUE;
}

return(FALSE);

The relevant part of MySQL's InnoDB `os_file_pwrite()` function (this is where the actual `pwrite()` call is made):

/*******************************************************************//**
Does a synchronous write operation in Posix.
@return number of bytes written, -1 if error */
static
ssize_t
os_file_pwrite(
/*===========*/
    os_file_t   file,   /*!< in: handle to a file */
    const void* buf,    /*!< in: buffer from where to write */
    ulint       n,  /*!< in: number of bytes to write */
    ulint       offset, /*!< in: least significant 32 bits of file
                offset where to write */
    ulint       offset_high) /*!< in: most significant 32 bits of
                offset */
{
    ssize_t ret;
    off_t   offs;

    ut_a((offset & 0xFFFFFFFFUL) == offset);

    /* If off_t is > 4 bytes in size, then we assume we can pass a
    64-bit address */

    if (sizeof(off_t) > 4) {
        offs = (off_t)offset + (((off_t)offset_high) << 32);
    } else {
        offs = (off_t)offset;

        if (offset_high > 0) {
            fprintf(stderr,
                "InnoDB: Error: file write"
                " at offset > 4 GB\n");
        }
    }

    os_n_file_writes++;

# if defined(HAVE_PWRITE) && !defined(HAVE_BROKEN_PREAD)

    os_mutex_enter(os_file_count_mutex);
    os_file_n_pending_pwrites++;
    os_n_pending_writes++;
    os_mutex_exit(os_file_count_mutex);

    ret = pwrite(file, buf, (ssize_t)n, offs);

    os_mutex_enter(os_file_count_mutex);
    os_file_n_pending_pwrites--;
    os_n_pending_writes--;
    os_mutex_exit(os_file_count_mutex);

# ifdef UNIV_DO_FLUSH

    if (srv_unix_file_flush_method != SRV_UNIX_LITTLESYNC
        && srv_unix_file_flush_method != SRV_UNIX_NOSYNC
        && !os_do_not_call_flush_at_each_write) {

        /* Always do fsync to reduce the probability that when
        the OS crashes, a database page is only partially
        physically written to disk. */

        ut_a(TRUE == os_file_flush(file));
    }

# endif /* UNIV_DO_FLUSH */

    return(ret);

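A cursory inspection of the rest of the MySQL source code suggests that MySQL simply gives up when it gets a partial `pwrite()`, and that this results in corrupted data in the database.
The POSIX standard for `pwrite()` treats a partial write as an acceptable result and does not describe it as an error condition:
The `write()` function shall attempt to write `nbyte` bytes from the buffer pointed to by `buf` to the file associated with the open file descriptor, `fildes`.
...
The `pwrite()` function shall be equivalent to `write()` ...
RETURN VALUE
Upon successful completion, these functions shall return the number of bytes actually written to the file associated with `fildes`. This number shall never be greater than `nbyte`. Otherwise, -1 shall be returned and `errno` set to indicate the error.
According to POSIX, the only `errno` value that indicates a full disk is:
[ENOSPC] There was no free space remaining on the device containing the file.
This looks like a MySQL bug to me: a partial `pwrite()` result is not an error condition, nor does it necessarily mean that the underlying disk is full, especially when `errno` is 0.
I am specifically asking whether my interpretation of POSIX is correct. If it is, I plan to file a MySQL bug report, because I want the database to be reliable.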
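To make the interpretation concrete, this is roughly the handling I would expect from a caller that accepts short writes as legal. It is only a minimal sketch, not MySQL code: the name `pwrite_full` is hypothetical, and treating a zero-byte return as `EIO` is my own assumption to avoid looping forever.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose pwrite() under strict -std=c99/c11 */
#endif

#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of MySQL): write all n bytes of buf at
   file offset offs, resuming after short writes.
   Returns 0 on success, or -1 with errno describing the failure. */
static int
pwrite_full(int fd, const void *buf, size_t n, off_t offs)
{
    const char *p = buf;

    while (n > 0) {
        ssize_t ret = pwrite(fd, p, n, offs);

        if (ret < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted before writing anything: retry */
            }
            return -1;      /* genuine error: errno is meaningful here */
        }

        if (ret == 0) {
            /* Assumption: no progress and no error reported. Fail with EIO
               rather than spin; a real caller might choose differently. */
            errno = EIO;
            return -1;
        }

        /* Short write: not an error. Advance past the bytes that were
           written and try again with the remainder. */
        p += ret;
        n -= (size_t)ret;
        offs += ret;
    }

    return 0;
}
```

With a loop like this, the 921600-of-1048576-byte case from the log would lead to a second `pwrite()` for the remaining 126976 bytes. Only a return of -1 from that call would justify reading `errno`, and only `ENOSPC` would justify a "disk full" message.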
