How do Erlang atoms work?

8oomwypt, posted 2022-12-08 in Erlang

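Trying to find documentation on the details, I did not find much besides:

  • There is one atom table per Erlang runtime instance.
  • An atom's string literal is stored only once.
  • An atom takes 1 word.

To me, this leaves a lot of things unclear.
1. Is an atom's word value always the same, independent of the order in which modules are loaded into a runtime instance? If modules A and B both define or reference some atoms, will the atoms' values change from session to session depending on whether A or B was loaded first?
2. When matching on an atom inside a module, does some "atom literal to atom value" resolution take place? Do modules have their own module-local atom value lookup table that is filled in when the module is loaded?
3. In a distributed scenario, two Erlang runtime instances communicate with each other. Is there some "sync atom tables" action going on? Or are atoms serialized as string literals instead of as words?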

5gfr0r5j #1

An atom is simply an ID maintained by the VM. The ID is represented as a machine word of the underlying architecture, e.g. 4 bytes on 32-bit systems and 8 bytes on 64-bit systems. See the usage in the LYSE book.
The same atom in the same running VM is always mapped to the same ID (integer). For example the following tuple:

{apple, pear, cherry, apple}

could be stored as the following tuple in the actual Erlang memory:

{1, 2, 3, 1}

All atoms are stored in one big table which is never garbage-collected, i.e. once an atom is created in a running VM it stays in the table until the VM is shut down.
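The growth of the table can be observed from inside the VM. The following is only a minimal sketch: the module name atom_table_demo and the generated atom name are made up for illustration, and it assumes an OTP release recent enough to support erlang:system_info(atom_count) and that nothing else in the VM happens to create the same name.

```erlang
-module(atom_table_demo).
-export([run/0]).

%% Creates one atom at runtime and checks that the atom table grew.
run() ->
    Before = erlang:system_info(atom_count),
    %% Build a name at runtime so the compiler does not put it in the
    %% module's literal atoms. The "demo_atom_" prefix is hypothetical.
    Name = "demo_atom_" ++ integer_to_list(erlang:unique_integer([positive])),
    %% list_to_existing_atom/1 fails with badarg for a name that is
    %% not yet in the atom table.
    Missing = try list_to_existing_atom(Name) of
                  _ -> false
              catch
                  error:badarg -> true
              end,
    %% list_to_atom/1 inserts the name into the table.
    Atom = list_to_atom(Name),
    After = erlang:system_info(atom_count),
    %% Now the lookup succeeds and yields the very same atom.
    Atom = list_to_existing_atom(Name),
    {Missing, After > Before}.
```

Because entries are never removed, converting arbitrary external input with list_to_atom/1 can eventually exhaust the table; erlang:system_info(atom_limit) reports the configured maximum.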
Answering your questions:

1. No. The ID of an atom can change between VM runs. If you shut down the VM and reload the tuple above, the system might end up with the following IDs:

{50, 51, 52, 50}

depending on what other atoms have been created before it was loaded. Atoms only live as long as the VM.

2. No. There is only one table of atoms per VM. All literal atoms in a module are mapped to their IDs when the module is loaded. If a particular atom doesn't yet exist in that table, it is inserted and stays there until the VM restarts.
3. No. Each VM has its own atom table, and the tables are separate. Consider two VMs that are started at the same time but don't know about each other. Atoms created in each VM may get different IDs in that VM's table. If at some point one node learns about the other, the same atom may still have a different ID on each node. The tables can't be easily synchronized or merged. But atoms aren't simply sent as text to the other node every time either. They are "compressed" into a form of cache and sent together in the header. See the distribution header in the description of the communication protocol. Basically, the header contains the atoms used in the terms that follow, with their IDs and textual representation. Each term then references the atom by the ID specified in the header rather than passing the same text each time.
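That atoms leave the VM by name rather than by local table ID can be seen with term_to_binary/1. This is only a rough sketch with a hypothetical module name, and it shows the plain external term format: term_to_binary/1 does not use the distribution header or its atom cache, so here the name is simply written out each time the atom occurs.

```erlang
-module(atom_wire_demo).
-export([run/0]).

%% Encodes a tuple of atoms and inspects the resulting binary.
run() ->
    Term = {apple, pear, apple},
    Bin = term_to_binary(Term),
    %% The atom's text is present in the encoded form; the local
    %% atom table index is not what gets serialized.
    ContainsName = binary:match(Bin, <<"apple">>) =/= nomatch,
    %% Decoding looks the names up (or inserts them) in the local
    %% atom table, so the result equals the original term.
    RoundTrip = binary_to_term(Bin),
    {ContainsName, RoundTrip =:= Term}.
```

Between connected nodes the distribution header described above avoids repeating the text, but the principle is the same: the receiving node resolves names against its own table.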

cclgggtu #2

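Just to get really basic without going into the implementation: an atom is a literal "thing" with a name. Its value is always itself, and it knows its own name. You generally use it when you want a tag, like the atoms ok and error. Atoms are unique in that there is only one atom foo in the system, and every time I refer to foo I am referring to this same unique foo, irrespective of whether the references are in the same module or come from the same process. There is always only one foo.
A bit of implementation. Atoms are stored in a global atom table, and when you create a new atom it is inserted into the table if it is not already there. This makes comparing atoms for equality very fast, as you just check whether the two atoms refer to the same slot in the atom table.
While separate instances of the VM (nodes) have separate atom tables, the communication between nodes in distributed Erlang is optimized for this, so very often you don't need to send the actual atom name between nodes.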
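The uniqueness described above can be checked directly. A small sketch, with a made-up module name: however the atom apple is obtained, the result is the same term.

```erlang
-module(atom_identity_demo).
-export([run/0]).

%% Obtains the atom apple in three different ways and compares them.
run() ->
    A = apple,                               % literal in the module
    B = list_to_atom("apple"),               % created from a string
    C = binary_to_atom(<<"apple">>, utf8),   % created from a binary
    %% All three refer to the one atom apple, and the atom still
    %% knows its own name.
    {A =:= B, B =:= C, atom_to_list(A)}.
```

Calling atom_identity_demo:run() returns {true, true, "apple"}.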
