为什么在Erlang中"Map"的查找属性比"记录"慢?

quhf5bfb  于 2022-12-08  发布在  Erlang
关注(0)|答案(3)|浏览(205)

我正在读编程Erlang,在书的第5章它说:
记录只是伪装的元组,因此它们与元组具有相同的存储和性能特征。Map比元组使用更多的存储,并且具有更慢的查找属性。
在我以前学过的语言中,情况并非如此,Map通常以哈希表的形式实现,因此查找时间复杂度为O(1);记录(带名称的元组)通常被实现为不可变的List,并且查找时间复杂度为O(N)
Erlang中这些数据结构的实现有什么不同?

uurity8g

uurity8g1#

There's no real practical performance difference between record lookup and map lookup for small numbers of fields. For large numbers of fields, though, there is, because record information is known at compile time while map keys need not be, so maps use a different lookup mechanism than records. But records and maps are not intended as interchangeable replacements, and so comparing them for use cases involving anything more than a small number of fields is pointless IMO; if you know the fields you need at compile time, use records, but if you don't, use maps or another similar mechanism. Because of this, the following focuses only on the differences in performance of looking up one record field and one map key.
Let's look at the assembler for two functions, one that accesses a record field and one that accesses a map key. Here are the functions:

-record(foo, {f}).

r(#foo{f=X}) ->
    X.

m(#{f := X}) ->
    X.

Both use pattern matching to extract a value from the given type instance.
Here's the assembly for r/1 :

{function, r, 1, 2}.
  {label,1}.
    {line,[{location,"f2.erl",6}]}.
    {func_info,{atom,f2},{atom,r},1}.
  {label,2}.
    {test,is_tuple,{f,1},[{x,0}]}.
    {test,test_arity,{f,1},[{x,0},2]}.
    {get_tuple_element,{x,0},0,{x,1}}.
    {get_tuple_element,{x,0},1,{x,2}}.
    {test,is_eq_exact,{f,1},[{x,1},{atom,foo}]}.
    {move,{x,2},{x,0}}.
    return.

The interesting part here starts under {label,2} . The code verifies that the argument is a tuple, then verifies the tuple's arity, and extracts two elements from it. After verifying that the first element of the tuple is equal to the atom foo , it returns the value of the second element, which is record field f .
Now let's look at the assembly of the m/1 function:

{function, m, 1, 4}.
  {label,3}.
    {line,[{location,"f2.erl",9}]}.
    {func_info,{atom,f2},{atom,m},1}.
  {label,4}.
    {test,is_map,{f,3},[{x,0}]}.
    {get_map_elements,{f,3},{x,0},{list,[{atom,f},{x,0}]}}.
    return.

This code verifies that the argument is a map, then extracts the value associated with map key f .
The costs of each function come down to the costs of the assembly instructions. The record function has more instructions but it's likely they're less expensive than the instructions in the map function because all the record information is known at compile time. This is especially true as the key count for the map increases, since that means the get_map_elements call has to potentially wade through more map data to find what it's looking for.
We can write functions to call these accessors numerous times, then time the new functions. Here are two sets of recursive functions that call the accessors N times:

call_r(N) ->
    call_r(#foo{f=1},N).
call_r(_,0) ->
    ok;
call_r(F,N) ->
    1 = r(F),
    call_r(F,N-1).

call_m(N) ->
    call_m(#{f => 1},N).
call_m(_,0) ->
    ok;
call_m(M,N) ->
    1 = m(M),
    call_m(M,N-1).

We can call these with timer:tc/3 to check the execution time for each function. Let's call each one ten million times, but do so 50 times and take the average execution time. First, the record function:

1> lists:sum([element(1,timer:tc(f2,call_r,[10000000])) || _ <- lists:seq(1,50)])/50.
237559.02

This means calling the function ten million times took an average of 238ms. Now, the map function:

2> lists:sum([element(1,timer:tc(f2,call_m,[10000000])) || _ <- lists:seq(1,50)])/50.
235871.94

Calling the map function ten million times averaged 236ms per call. Your mileage will vary of course, as did mine; I observed that running each multiple times sometimes resulted in the record function being faster and sometimes the map function being faster, but neither was ever faster by a large margin. I'd encourage you to do your own measurements, but it seems that there's virtually no performance difference between the two, at least for accessing a single field via pattern matching. As the number of fields increases, though, the difference in performance becomes more apparent: for 10 fields, maps are slower by about 0.5%, and for 50 fields, maps are slower by about 50%. But as I stated up front, I see this as being immaterial, since if you're trying to use records and maps interchangeably you're doing it wrong.

  • UPDATE: based on the conversation in the comments I clarified the answer to discuss performance differences as the number of fields/keys increases, and to point out that records and maps are not meant to be interchangeable.*
  • UPDATE: for Erlang/OTP 24 the Erlang Efficiency Guide was augmented with a chapter on maps that's worth reading for detailed answers to this question.*
8xiog9wr

8xiog9wr2#

使用Erlang/OTP 22 [Erts-10.6]重复测试时,我得到了不同的结果。
r/1的反汇编代码不同:
记录查找速度提高了1.5倍以上。

{function, r, 1, 2}.
  {label,1}.
    {line,[{location,"f2.erl",9}]}.
    {func_info,{atom,f2},{atom,r},1}.
  {label,2}.
    {test,is_tagged_tuple,{f,1},[{x,0},2,{atom,foo}]}.
    {get_tuple_element,{x,0},1,{x,0}}.
    return.

{function, m, 1, 4}.
  {label,3}.
    {line,[{location,"f2.erl",12}]}.
    {func_info,{atom,f2},{atom,m},1}.
  {label,4}.
    {test,is_map,{f,3},[{x,0}]}.
    {get_map_elements,{f,3},{x,0},{list,[{atom,f},{x,1}]}}.
    {move,{x,1},{x,0}}.
    return.

9> lists:sum([element(1,timer:tc(f2,call_r,[10000000])) || _ <- lists:seq(1,50)])/50.
234309.04
10> lists:sum([element(1,timer:tc(f2,call_m,[10000000])) || _ <- lists:seq(1,50)])/50.
341411.9

After I declared -compile({inline, [r/1, m/1]}).

13> lists:sum([element(1,timer:tc(f2,call_r,[10000000])) || _ <- lists:seq(1,50)])/50.
199978.9
14> lists:sum([element(1,timer:tc(f2,call_m,[10000000])) || _ <- lists:seq(1,50)])/50.
356002.48
dnph8jn4

dnph8jn43#

我将10个元素的记录与同样大小的Map进行了比较。在这种情况下,记录被证明快了2倍多。

-module(f22).

-compile({inline, [r/1, m/1]}).

-export([call_r/1, call_r/2, call_m/1, call_m/2]).

-define(I, '2').
-define(V,  2 ).

-record(foo, {
    '1',
    '2',
    '3',
    '4',
    '5',
    '6',
    '7',
    '8',
    '9',
    '0'
    }).

r(#foo{?I = X}) ->
    X.

m(#{?I := X}) ->
    X.

call_r(N) ->
    call_r(#foo{
    '1' = 1,
    '2' = 2,
    '3' = 3,
    '4' = 4,
    '5' = 5,
    '6' = 6,
    '7' = 7,
    '8' = 8,
    '9' = 9,
    '0' = 0
    }, N).
call_r(_,0) ->
    ok;
call_r(F,N) ->
    ?V = r(F),
    call_r(F,N-1).

call_m(N) ->
    call_m(#{
    '1' => 1,
    '2' => 2,
    '3' => 3,
    '4' => 4,
    '5' => 5,
    '6' => 6,
    '7' => 7,
    '8' => 8,
    '9' => 9,
    '0' => 0
    }, N).
call_m(_,0) ->
    ok;
call_m(F,N) ->
    ?V = m(F),
    call_m(F,N-1).

% lists:sum([element(1,timer:tc(f22,call_r,[10000000])) || _ <- lists:seq(1,50)])/50.
% 229777.3
% lists:sum([element(1,timer:tc(f22,call_m,[10000000])) || _ <- lists:seq(1,50)])/50.
% 395897.68

% After declaring 
% -compile({inline, [r/1, m/1]}).
% lists:sum([element(1,timer:tc(f22,call_r,[10000000])) || _ <- lists:seq(1,50)])/50.
% 130859.98
% lists:sum([element(1,timer:tc(f22,call_m,[10000000])) || _ <- lists:seq(1,50)])/50.
% 306490.6
% 306490.6 / 130859.98 .
% 2.34212629407401

相关问题