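I support a large customer running several MySQL 5.5.59 databases on CentOS 6 or 7 virtual servers under VMware. The MySQL databases are stored on XFS filesystems on iSCSI LUNs.
I'm seeing intermittently corrupted tables, associated with errors like this in the MySQL error log: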
180619 15:08:38 InnoDB: Error: Write to file (merge) failed at offset 0 656408576
InnoDB: 1048576 bytes should have been written, only 921600 were written.
InnoDB: Operating system error number 0.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 0 means 'Success'.
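The disks are never full, there are no quotas, and the files are only a few GB (2-6) in size, well within the file sizes XFS supports on 64-bit Linux.
Note that errno is 0.
The MySQL code that generates this error is in storage/innobase/os/os0file.c. It blindly assumes that any partial write means the disk is full (the full MySQL 5.5 source code is at https://dev.mysql.com/downloads/mysql/5.5.html):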
    ret = os_file_pwrite(file, buf, n, offset, offset_high);

    if ((ulint)ret == n) {
        return(TRUE);
    }

    if (!os_has_said_disk_full) {
        ut_print_timestamp(stderr);
        fprintf(stderr,
            " InnoDB: Error: Write to file %s failed"
            " at offset %lu %lu.\n"
            "InnoDB: %lu bytes should have been written,"
            " only %ld were written.\n"
            "InnoDB: Operating system error number %lu.\n"
            "InnoDB: Check that your OS and file system"
            " support files of this size.\n"
            "InnoDB: Check also that the disk is not full"
            " or a disk quota exceeded.\n",
            name, offset_high, offset, n, (long int)ret,
            (ulint)errno);
        if (strerror(errno) != NULL) {
            fprintf(stderr,
                "InnoDB: Error number %lu means '%s'.\n",
                (ulint)errno, strerror(errno));
        }
        fprintf(stderr,
            "InnoDB: Some operating system error numbers"
            " are described at\n"
            "InnoDB: "
            REFMAN "operating-system-error-codes.html\n");
        os_has_said_disk_full = TRUE;
    }

    return(FALSE);
The relevant portion of InnoDB's os_file_pwrite() function (which makes the actual pwrite() call):
    /*******************************************************************//**
    Does a synchronous write operation in Posix.
    @return number of bytes written, -1 if error */
    static
    ssize_t
    os_file_pwrite(
    /*===========*/
        os_file_t   file,        /*!< in: handle to a file */
        const void* buf,         /*!< in: buffer from where to write */
        ulint       n,           /*!< in: number of bytes to write */
        ulint       offset,      /*!< in: least significant 32 bits of file
                                 offset where to write */
        ulint       offset_high) /*!< in: most significant 32 bits of
                                 offset */
    {
        ssize_t ret;
        off_t   offs;

        ut_a((offset & 0xFFFFFFFFUL) == offset);

        /* If off_t is > 4 bytes in size, then we assume we can pass a
        64-bit address */

        if (sizeof(off_t) > 4) {
            offs = (off_t)offset + (((off_t)offset_high) << 32);
        } else {
            offs = (off_t)offset;

            if (offset_high > 0) {
                fprintf(stderr,
                    "InnoDB: Error: file write"
                    " at offset > 4 GB\n");
            }
        }

        os_n_file_writes++;

    # if defined(HAVE_PWRITE) && !defined(HAVE_BROKEN_PREAD)
        os_mutex_enter(os_file_count_mutex);
        os_file_n_pending_pwrites++;
        os_n_pending_writes++;
        os_mutex_exit(os_file_count_mutex);

        ret = pwrite(file, buf, (ssize_t)n, offs);

        os_mutex_enter(os_file_count_mutex);
        os_file_n_pending_pwrites--;
        os_n_pending_writes--;
        os_mutex_exit(os_file_count_mutex);

    # ifdef UNIV_DO_FLUSH
        if (srv_unix_file_flush_method != SRV_UNIX_LITTLESYNC
            && srv_unix_file_flush_method != SRV_UNIX_NOSYNC
            && !os_do_not_call_flush_at_each_write) {

            /* Always do fsync to reduce the probability that when
            the OS crashes, a database page is only partially
            physically written to disk. */

            ut_a(TRUE == os_file_flush(file));
        }
    # endif /* UNIV_DO_FLUSH */

        return(ret);
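A cursory look at the rest of the MySQL source code suggests that MySQL gives up on the write after getting a partial pwrite() instead of retrying it, which results in corrupted data in the database.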
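The POSIX standard for pwrite() says that a partial write is an acceptable result, and it does not describe it as an error condition:
The write() function shall attempt to write nbyte bytes from the buffer pointed to by buf to the file associated with the open file descriptor, fildes.
...
The pwrite() function shall be equivalent to write() ...
RETURN VALUE
Upon successful completion, these functions shall return the number of bytes actually written to the file associated with fildes. This number shall never be greater than nbyte. Otherwise, -1 shall be returned and errno set to indicate the error.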
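To make that reading of the return value concrete, here is a minimal sketch of the three outcomes I believe the POSIX text allows. The enum and the function name classify_write() are my own, purely for illustration, and appear nowhere in MySQL or POSIX.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical names, for illustration only. */
enum write_outcome { WRITE_COMPLETE, WRITE_SHORT, WRITE_FAILED };

/* Classify the return value of pwrite() for a request of `requested` bytes. */
static enum write_outcome
classify_write(ssize_t ret, size_t requested)
{
    if (ret < 0) {
        /* Only in this case does POSIX say errno is set. */
        return WRITE_FAILED;
    }
    if ((size_t)ret < requested) {
        /* Successful but partial: errno is not set by this call. */
        return WRITE_SHORT;
    }
    return WRITE_COMPLETE;
}
```

With the numbers from my log, classify_write(921600, 1048576) yields WRITE_SHORT, not WRITE_FAILED, which would also explain why the errno that gets printed is 0.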
According to POSIX, the only errno value that indicates a full disk is:
[ENOSPC] There was no free space remaining on the device containing the file.
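This looks like a MySQL bug to me: a partial pwrite() result is not an error condition, nor does it necessarily mean that the underlying disk is full, especially when errno is 0.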
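For comparison, this is roughly what I would expect a caller to do with a short write. It is only a sketch under my reading of POSIX: pwrite_all() is a hypothetical helper of my own, not a MySQL function, and treating a zero-byte return as EIO is my own choice to avoid looping forever.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of MySQL): write exactly n bytes at offset
   offs, retrying after short writes and EINTR.
   Returns 0 on success, -1 with errno set on failure. */
static int
pwrite_all(int fd, const void *buf, size_t n, off_t offs)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t ret = pwrite(fd, p, left, offs);

        if (ret < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted before any data was written */
            }
            return -1;      /* real error: errno is meaningful here */
        }
        if (ret == 0) {
            /* No progress at all; my assumption is to report EIO
               rather than spin. */
            errno = EIO;
            return -1;
        }

        /* Short or full write: advance past the bytes that were written. */
        p += ret;
        offs += ret;
        left -= (size_t)ret;
    }
    return 0;
}
```

If the disk really were full, I would expect the retry to come back with -1 and errno set to ENOSPC, which is an unambiguous signal, unlike a short count with errno 0.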
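I'm specifically asking whether my interpretation of POSIX is correct. If so, I plan to file a MySQL bug report, because I expect a database to be reliable.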