Erlang: Why does my script make the interpreter crash?

oxalkeyp, posted on 2022-12-08 in Erlang

I have a tiny program that reads a csv file (100M). The problem is that my program makes the Erlang interpreter crash:

Crash dump was written to: erl_crash.dump
eheap_alloc: Cannot reallocate 3563526520 bytes of memory (of type "heap").
Aborted

Here is the program:

readlines(FileName) ->
    {ok, Device} = file:open(FileName, [read]),
    try get_all_lines(Device)
      after file:close(Device)
    end.

get_all_lines(Device) ->
    case io:get_line(Device, "") of
        eof -> [];
        Line -> [Line | get_all_lines(Device)]
    end.

And I do:

Path="...csv".
Lines=tut6:readlines(Path).

And this produces a crash.
Can someone please tell me what the problem is? Maybe something is wrong with my program? How can I avoid the crashes?
Thanks in advance

yks3o0rb #1

Did you realize that 3563526520 bytes is about 3.3 GB? How much memory does your system have? The gigantic memory consumption stems from the fact that you have chosen the least optimal algorithm for reading the lines:

  1. You try to read all the lines into memory before acting on them.
  2. You chose to represent the text as a list, which uses 8 bytes for each character read from the file (16 bytes on 64-bit systems).
  3. You don't use tail recursion, which means the compiler can't optimize your code to be more memory efficient.

So, to fix the code:

  1. Read one line at a time, then parse and process it, and store the result as Erlang terms rather than the raw input data.
  2. Read lines as binaries, as suggested by Hynek -Pichi- Vychodil.
  3. Make the function that reads the file tail-recursive.

Learn You Some Erlang has an excellent discussion of tail-recursive functions if you want to know how to implement such functions properly.

If the function were written in a tail-recursive manner, the whole algorithm could look like this:

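%% Entry point: start with an empty accumulator.
get_all_lines(Device) ->
    get_all_lines(Device, []).

%% Tail-recursive loop: the recursive call is the last thing evaluated.
get_all_lines(Device, List) ->
    case io:get_line(Device, "") of
        eof ->
            %% Lines were prepended, so restore the original order.
            lists:reverse(List);
        Line ->
            %% process_line/1 is a placeholder for your own per-line parsing.
            Data = process_line(Line),
            get_all_lines(Device, [Data | List])
    end.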
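
Putting the three fixes together, a minimal sketch might look like the following. The module name `csv_fold`, the function names, and the naive comma split (no quoting support) are assumptions made for illustration, not part of the original program. It also swaps `io:get_line/2` for `file:read_line/1` on a file opened in `raw` mode, which is one possible choice rather than the only one.

```erlang
-module(csv_fold).
-export([fold_lines/3, count_fields/1]).

%% Fold Fun over every line of FileName. Only the current line and the
%% accumulator are kept in memory, never the whole file.
fold_lines(FileName, Fun, Acc0) ->
    {ok, Device} = file:open(FileName, [read, binary, raw, read_ahead]),
    try fold_device(Device, Fun, Acc0)
    after file:close(Device)
    end.

%% Tail-recursive loop: the recursive call is the last expression.
fold_device(Device, Fun, Acc) ->
    case file:read_line(Device) of
        eof -> Acc;
        {ok, Line} -> fold_device(Device, Fun, Fun(Line, Acc));
        {error, Reason} -> error({read_failed, Reason})
    end.

%% Hypothetical per-line processing: count lines and comma-separated fields.
count_fields(FileName) ->
    fold_lines(FileName,
               fun(Line, {Lines, Fields}) ->
                       %% Drop the trailing newline, if any, then split on commas.
                       Row = hd(binary:split(Line, <<"\n">>)),
                       N = length(binary:split(Row, <<",">>, [global])),
                       {Lines + 1, Fields + N}
               end,
               {0, 0}).
```

Calling `csv_fold:count_fields(Path)` returns a `{LineCount, FieldCount}` tuple. Any other per-line work can be passed to `fold_lines/3` as a fun of two arguments.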
huwehgph #2

Try

{ok, Device} = file:open(FileName, [read, binary]),

and then rethink what you are really doing.
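
To see why the `binary` option matters, here is a rough back-of-the-envelope estimate. The module name `mem_estimate` and both function names are made up for illustration. The estimate assumes one character per input byte and ignores the small fixed overhead of each binary.

```erlang
-module(mem_estimate).
-export([list_bytes/1, binary_bytes/1]).

%% Approximate heap bytes for NumChars characters held as an Erlang string
%% (a list): one cons cell of two machine words per character.
list_bytes(NumChars) ->
    NumChars * 2 * erlang:system_info(wordsize).

%% Approximate bytes for the same text held in binaries: one byte per
%% input byte (per-binary overhead is ignored in this estimate).
binary_bytes(NumBytes) ->
    NumBytes.
```

On a 64-bit VM, where the word size is 8, `mem_estimate:list_bytes(100 * 1024 * 1024)` evaluates to 1677721600, roughly 1.6 GB for the list cells alone, assuming the 100M in the question means about 100 MB. The same data kept in binaries needs roughly 100 MB.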
