gcc: Why is a string constant stored in .rodata while its address lies in the code segment?

Posted by laik7k3q on 2024-01-08 in Other

My simple code is:

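#include <stdio.h>

int main()
{
    const char *str="jigneshparmar";
    /* str is where the literal lives; &str is where the pointer variable lives */
    printf("address of str data:%p , address of str variable:%p\n",(void*)str,(void*)&str );
    getchar();  /* keep the process alive so /proc/PID/maps can be inspected */

    return 0;
}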

Here the string constant "jigneshparmar" is stored in the read-only data segment.
Using the size command, the output is:

gcc datasec.c -o datasec
size -A datasec

datasec  :
section              size    addr
.interp                28     792
.note.gnu.property     32     824
.note.gnu.build-id     36     856
.note.ABI-tag          32     892
.gnu.hash              36     928
.dynsym               168     968
.dynstr               133    1136
.gnu.version           14    1270
.gnu.version_r         32    1288
.rela.dyn             192    1320
.rela.plt              24    1512
.init                  27    4096
.plt                   32    4128
.plt.got               16    4160
.plt.sec               16    4176
.text                 405    4192
.fini                  13    4600

.rodata                18    8192

.eh_frame_hdr          68    8212
.eh_frame             264    8280
.init_array             8   15800
.fini_array             8   15808
.dynamic              496   15816
.got                   72   16312
.data                  16   16384
.bss                    8   16400
.comment               42       0
Total                2236


When I make the string constant longer, the size of .rodata grows as well.
Yet the address I print for the string belongs to the code segment.

./datasec 
address of str data:0x55f5301f3004 ,address of str variable:0x7ffd0a2b1940


The address 0x55f5301f3004 lies in the code segment.

cat /proc/4018/maps
555da289d000-555da289e000 r--p 00000000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da289e000-555da289f000 r-xp 00001000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da289f000-555da28a0000 r--p 00002000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da28a0000-555da28a1000 r--p 00002000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da28a1000-555da28a2000 rw-p 00003000 103:02 13109134                  /root/Desktop/lsp-prac/datasec
555da416c000-555da418d000 rw-p 00000000 00:00 0                          [heap]
7f5485c38000-7f5485c5d000 r--p 00000000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485c5d000-7f5485dd5000 r-xp 00025000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485dd5000-7f5485e1f000 r--p 0019d000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485e1f000-7f5485e20000 ---p 001e7000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485e20000-7f5485e23000 r--p 001e7000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485e23000-7f5485e26000 rw-p 001ea000 103:02 9963657                   /usr/lib/x86_64-linux-gnu/libc-2.31.so
7f5485e26000-7f5485e2c000 rw-p 00000000 00:00 0 
7f5485e41000-7f5485e42000 r--p 00000000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e42000-7f5485e65000 r-xp 00001000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e65000-7f5485e6d000 r--p 00024000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e6e000-7f5485e6f000 r--p 0002c000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e6f000-7f5485e70000 rw-p 0002d000 103:02 9963653                   /usr/lib/x86_64-linux-gnu/ld-2.31.so
7f5485e70000-7f5485e71000 rw-p 00000000 00:00 0 
7ffda6e9c000-7ffda6ebd000 rw-p 00000000 00:00 0                          [stack]
7ffda6ed9000-7ffda6edd000 r--p 00000000 00:00 0                          [vvar]
7ffda6edd000-7ffda6edf000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]


How is this possible?
Thanks in advance.

yb3bgrhw #1

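The Linux kernel "loads" ELF executables by mapping them into memory. This happens at page granularity, and the offset field in /proc/PID/maps (the field right after the permissions) gives the file offset of each mapped region.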
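String literals are stored in the .rodata section of the ELF file, and executable code is stored in the .text section.
The Linux kernel does not use the section headers to decide what to map. The ELF file format has a set of program headers, which you can see with, for example, readelf -l binary. The relevant ones here are the LOAD headers; they specify what the Linux kernel maps into memory.
For example, here are the two LOAD program headers of cat from GNU Coreutils 8.28 on x86-64 (readelf -l /bin/cat):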

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000000079d0 0x00000000000079d0  R E    0x200000
  LOAD           0x0000000000007a70 0x0000000000207a70 0x0000000000207a70
                 0x0000000000000650 0x00000000000007f0  RW     0x200000

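When the Linux kernel executes an ELF file, it maps the LOAD program headers into memory. See how file offsets 0x0 to 0x79d0 are read and execute only, file offsets 0x7a70 to 0x80c0 (memory addresses 0x207a70 to 0x208260) are read and write only, and nothing at all is just "read-only"?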
Using objdump -d -s /bin/cat, we can see (relevant fragments only):

0000000000001ad0 <.text>:
    1ad0:       53                      push   %rbx
    1ad1:       48 8d 35 6c 41 00 00    lea    0x416c(%rip),%rsi        # 5c44 <_IO_stdin_used@@Base+0x4>
    1ad8:       ba 05 00 00 00          mov    $0x5,%edx
    1add:       31 ff                   xor    %edi,%edi
    1adf:       e8 3c fd ff ff          callq  1820 <dcgettext@plt>
    1ae4:       48 89 c3                mov    %rax,%rbx
    1ae7:       e8 a4 fc ff ff          callq  1790 <__errno_location@plt>
[snipped lots of disassembly]
    5c16:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    5c1d:       00 00 00 
    5c20:       31 d2                   xor    %edx,%edx
    5c22:       31 f6                   xor    %esi,%esi
    5c24:       e9 17 be ff ff          jmpq   1a40 <__cxa_atexit@plt>


Contents of section .rodata:
 5c40 01000200 77726974 65206572 726f7200  ....write error.
 5c50 63617400 5b007465 73742069 6e766f63  cat.[.test invoc
 5c60 6174696f 6e004d75 6c74692d 63616c6c  ation.Multi-call
 5c70 20696e76 6f636174 696f6e00 73686132   invocation.sha2
 5c80 32347375 6d007368 61322075 74696c69  24sum.sha2 utili
 5c90 74696573 00736861 32353673 756d0073  ties.sha256sum.s
 5ca0 68613338 3473756d 00736861 35313273  ha384sum.sha512s
 5cb0 756d000a 2573206f 6e6c696e 65206865  um..%s online he
 5cc0 6c703a20 3c25733e 0a00474e 5520636f  lp: <%s>..GNU co
 5cd0 72657574 696c7300 656e5f00 2f757372  reutils.en_./usr
 5ce0 2f736861 72652f6c 6f63616c 65005269  /share/locale.Ri
 5cf0 63686172 64204d2e 20537461 6c6c6d61  chard M. Stallma
 5d00 6e00546f 72626a6f 726e2047 72616e6c  n.Torbjorn Granl
 5d10 756e6400 62656e73 74757641 45540073  und.benstuvAET.s
 5d20 74616e64 61726420 6f757470 75740025  tandard output.%


Contents of section .init_array:
 207a70 10280000 00000000                    .(......        
Contents of section .fini_array:
 207a78 d0270000 00000000                    .'......        
Contents of section .data.rel.ro:
 207a80 7a5d0000 00000000 00000000 00000000  z]..............
 207a90 00000000 00000000 62000000 00000000  ........b.......
 207aa0 8a5d0000 00000000 00000000 00000000  .]..............
[...]
 207c00 e75c0000 00000000 27630000 00000000  .\......'c......
 207c10 00000000 00000000                    ........        
Contents of section .dynamic:
 207c18 01000000 00000000 01000000 00000000  ................
 207c28 0c000000 00000000 20170000 00000000  ........ .......
 207c38 0d000000 00000000 2c5c0000 00000000  ........,\......
[...]
 207de8 00000000 00000000 00000000 00000000  ................
 207df8 00000000 00000000 00000000 00000000  ................
Contents of section .got:
 207e08 187c2000 00000000 00000000 00000000  .| .............
 207e18 00000000 00000000 56170000 00000000  ........V.......
[...]
 207fe8 00000000 00000000 00000000 00000000  ................
 207ff8 00000000 00000000                    ........        
Contents of section .data:
 208000 00000000 00000000 08802000 00000000  .......... .....
 208010 20202020 20202020 20202020 20202020                  
 208020 20300900 00000000 21802000 00000000   0......!. .....
 208030 1c802000 00000000 7b620000 00000000  .. .....{b......


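See how .text and .rodata belong to the same LOAD program header? That is why they are mapped the same way. The .data section belongs to a separate LOAD program header, so it is mapped separately, with different permissions.
As you can see, the linker script used on x86-64 on most Linux systems (including Ubuntu 18.04.5, which the output above comes from) merges the .text and .rodata sections into a single LOAD program header. Since the program headers are what control how the Linux kernel loads (maps) an ELF executable into memory, both sections end up in the same memory region with the same permissions (r-xp).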
Consider the following example program, maps.c:

// SPDX-License-Identifier: CC0-1.0
#define  _POSIX_C_SOURCE  200809L
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

struct map_entry {
    struct map_entry   *next;
    uintptr_t           addr;
    uintptr_t           ends;
    char                line[];
};

struct map_entry *map = NULL;

static struct map_entry *map_find(const void *const ptr)
{
    const uintptr_t   addr = (uintptr_t)ptr;
    struct map_entry *curr = map;

    while (curr)
        if (addr >= curr->addr && addr <= curr->ends)
            return curr;
        else
            curr = curr->next;

    return NULL;
}

static void map_init(void)
{
    char   *line = NULL;
    size_t  size = 0;
    ssize_t len;
    FILE   *in;

    /* Already mapped? */
    if (map)
        return;

    struct map_entry *root = NULL;

    in = fopen("/proc/self/maps", "r");
    if (!in) {
        fprintf(stderr, "Cannot read /proc/self/maps: %s.\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    while (1) {
        len = getline(&line, &size, in);
        if (len < 0)
            break;

        /* Remove newline at end. */
        while (len > 0 && line[len-1] == '\n')
            line[--len] = '\0';

        char               *ptr = line;
        char               *end = line;
        unsigned long long  val;

        /* Parse start address. */
        errno = 0;
        val = strtoull(ptr, &end, 16);
        if (errno) {
            fprintf(stderr, "/proc/self/maps: %s: %s.\n", line, strerror(errno));
            exit(EXIT_FAILURE);
        }
        if (end == ptr || *end != '-') {
            fprintf(stderr, "/proc/self/maps: %s: Error parsing line.\n", line);
            exit(EXIT_FAILURE);
        }
        ptr = ++end;
        const uintptr_t  addr = val;

        /* Parse end address (actually one plus end address). */
        errno = 0;
        val = strtoull(ptr, &end, 16);
        if (errno) {
            fprintf(stderr, "/proc/self/maps: %s: %s.\n", line, strerror(errno));
            exit(EXIT_FAILURE);
        }
        if (end == ptr || *end != ' ') {
            fprintf(stderr, "/proc/self/maps: %s: Error parsing line.\n", line);
            exit(EXIT_FAILURE);
        }
        const uintptr_t  ends = val;

        /* Allocate a new map entry for this one. */
        struct map_entry *ent = malloc(sizeof (struct map_entry) + len + 1);
        if (!ent) {
            fprintf(stderr, "/proc/self/maps: Out of memory.\n");
            exit(EXIT_FAILURE);
        }

        /* Copy line, including the end-of-string '\0'. */
        memcpy(ent->line, line, len + 1);

        ent->addr = addr;
        ent->ends = ends - 1;

        /* Prepend to root list. */
        ent->next = root;
        root      = ent;
    }

    /* Discard line buffer, since it is no longer needed. */
    free(line);  /* Note: free(NULL) is safe, and does nothing. */

    if (ferror(in) || !feof(in)) {
        fprintf(stderr, "/proc/self/maps: Read error.\n");
        exit(EXIT_FAILURE);
    }
    if (fclose(in)) {
        fprintf(stderr, "/proc/self/maps: Error closing file.\n");
        exit(EXIT_FAILURE);
    }

    /* Reverse the list.  Since we prepended each entry, it is in reverse order. */
    while (root) {
        struct map_entry *curr = root;

        root = root->next;

        /* Prepend to map list. */
        curr->next = map;
        map        = curr;
    }
}

const char *const literal1 = "String literal 1";
const char        array1[] = "String array 1";

int main(void)
{
    const char *const literal2 = "String literal 2";
    const char        array2[] = "String array 2";

    struct map_entry *ent;
    map_init();

    ent = map_find(&literal1);
    if (ent)
        printf("Variable 'literal1' has address %p:\n\t%s\n", (void *)&literal1, ent->line);

    ent = map_find(literal1);
    if (ent)
        printf("Variable 'literal1' points to address %p:\n\t%s\n", (void *)literal1, ent->line);

    ent = map_find(&array1);
    if (ent)
        printf("Variable 'array1' has address %p:\n\t%s\n", (void *)&array1, ent->line);

    ent = map_find(array1);
    if (ent)
        printf("Variable 'array1' points to address %p:\n\t%s\n", (void *)array1, ent->line);

    ent = map_find(&literal2);
    if (ent)
        printf("Variable 'literal2' has address %p:\n\t%s\n", (void *)&literal2, ent->line);

    ent = map_find(literal2);
    if (ent)
        printf("Variable 'literal2' points to address %p:\n\t%s\n", (void *)literal2, ent->line);

    ent = map_find(&array2);
    if (ent)
        printf("Variable 'array2' has address %p:\n\t%s\n", (void *)&array2, ent->line);

    ent = map_find(array2);
    if (ent)
        printf("Variable 'array2' points to address %p:\n\t%s\n", (void *)array2, ent->line);

    return EXIT_SUCCESS;
}


Compile it with gcc -Wall -Wextra -O2 maps.c -o maps, then run ./maps:

Variable 'literal1' has address 0x5651567add48:
    5651567ad000-5651567ae000 r--p 00001000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'literal1' points to address 0x5651565ad21f:
    5651565ac000-5651565ae000 r-xp 00000000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'array1' has address 0x5651565ad448:
    5651565ac000-5651565ae000 r-xp 00000000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'array1' points to address 0x5651565ad448:
    5651565ac000-5651565ae000 r-xp 00000000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'literal2' has address 0x7fff34c15dd8:
    7fff34bf7000-7fff34c18000 rw-p 00000000 00:00 0                          [stack]
Variable 'literal2' points to address 0x5651565ad1c4:
    5651565ac000-5651565ae000 r-xp 00000000 fd:03 6953200                    /home/glaerbo/kildekode/maps/maps
Variable 'array2' has address 0x7fff34c15df9:
    7fff34bf7000-7fff34c18000 rw-p 00000000 00:00 0                          [stack]
Variable 'array2' points to address 0x7fff34c15df9:
    7fff34bf7000-7fff34c18000 rw-p 00000000 00:00 0                          [stack]


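This shows how literal1 (const char *const literal1 = "...";) itself resides in a memory region mapped r--p, but points to memory mapped r-xp.
("Hey, I thought you said the kernel only maps the LOAD program headers?" Yes; that particular mapping was not created by the kernel, but by the dynamic linker. I did not say that only the kernel maps ELF executables into memory; I explained how the kernel maps the minimum necessary LOAD program headers into memory and hands execution over to that code, which then maps the rest of the program and any prerequisite dynamic libraries.)
Note, however, that array1, the immutable character array, lies entirely within the memory region mapped r-xp, just like the string literal that literal2 refers to.
Because array2 and literal2 are declared inside the main() function, they reside in the [stack] memory region.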

8hhllhi2 #2

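Two things are going on:
1. You are not printing the address of the string literal correctly;
2. You are not looking at the mappings correctly.
As I mentioned in the comments, %p expects its corresponding argument to have type void *, and a call to printf is one of the few places in C (perhaps the only one) where you must explicitly cast a pointer to void *, so the address of the string literal may not have been formatted correctly.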
Beyond that, you are not looking at the mappings correctly.
I compiled your code on my system. When I run it, I get the output

address of str data:0x400580 , address of str variable:0x7fff22b9e938

You can use the objdump utility to look at the individual sections of the executable. To see the contents of .rodata, run:

objdump -s -j .rodata file


When I do that on the code I built, I get

Contents of section .rodata:
 400570 01000200 00000000 00000000 00000000  ................
 400580 6a69676e 65736870 61726d61 72000000  jigneshparmar...
 400590 61646472 65737320 6f662073 74722064  address of str d
 4005a0 6174613a 2570202c 20616464 72657373  ata:%p , address
 4005b0 206f6620 73747220 76617269 61626c65   of str variable
 4005c0 3a25700a 00                          :%p..


which matches the program's output.

gz5pxeao #3

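You may have gotten the wrong address somehow. I tried your code, and the result is that the address held in str (the string constant) is in the .rodata section, while the address of the local variable str is on the stack.
This is what the assembly code leads us to expect. The address of the string constant in the .rodata section is computed as a relative address (to allow position-independent code): the address of the next instruction plus the offset 0xe5d, placed in the rax register. The next instruction then stores this address, 0x555555556008, on the stack: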

0x00005555555551a4 <+27>:    lea    0xe5d(%rip),%rax        #0x555555556008
   0x00005555555551ab <+34>:    mov    %rax,-0x10(%rbp)

We now have str (the address of the string in .rodata) on the stack. Inspecting the stack with gdb shows 0x555555556008, the value of str, stored there:

(gdb) x/32gx $sp
0x7fffffffdd00: 0x0000000000000000  0x0000000100000000
0x7fffffffdd10: 0x0000000000000000  0x7af4b6fe7bb81800
0x7fffffffdd20: 0x0000000400000003  0x0000555555556008
0x7fffffffdd30: 0x00007fffffffdde0  0x0000555555557db0


We check that this is indeed the address of the string:

(gdb) x/s 0x0000555555556008
0x555555556008: "jigneshparmar"


The original answer showed a screenshot here (not preserved) of the address held in str sitting on the stack.
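So here is the program's real output. As expected, the addresses are not the same as when the program runs under gdb, since gdb disables address space randomization by default: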

drazen@HP-ProBook-640G1:~/proba$ ./main
address of str data:0x564ff8d34008 , address of str variable:0x7ffffac9fc80


The original answer then showed the process map as an image (not preserved).



So we can confirm that str points into .rodata and &str points to the stack.
