Elixir/Erlang file_server message backlog and unreliable throughput causing performance problems

dzhpxtsq, posted on 2022-12-08 in Erlang

I'm running a production app that does a lot of I/O. Whenever the system gets flooded with new requests (each of which triggers a lot of I/O), I see the Erlang file_server backing up with messages. The backlog and the resulting slowdown can last for hours, depending on our volume.

My understanding is that many File calls go through the Erlang file_server, which appears to have limited throughput. Furthermore, when its message queue backs up, the entire app is essentially frozen and cannot process new I/O requests at all.
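One way to confirm the backlog is to sample the file server's mailbox. The following is a minimal sketch, assuming the process is registered as :file_server_2 (its registered name in current OTP releases). The FileServerProbe module name is made up for illustration.

```elixir
defmodule FileServerProbe do
  @moduledoc "Hypothetical helper for sampling the Erlang file server's mailbox."

  # Returns the number of messages waiting in the file server's mailbox,
  # or nil if no live process is registered under the given name.
  def queue_len(name \\ :file_server_2) do
    with pid when is_pid(pid) <- Process.whereis(name),
         {:message_queue_len, len} <- Process.info(pid, :message_queue_len) do
      len
    else
      _ -> nil
    end
  end
end

IO.inspect(FileServerProbe.queue_len(), label: "file_server queue length")
```

Sampling this value periodically while the system is under load should show whether the queue grows during the slowdowns.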
All of the I/O calls use the File module. I've specified the [:raw] option everywhere that allows it; my understanding is that passing :raw bypasses the file_server.
This is a really big issue for us, and I imagine others have run into it at some point. I experimented with rewriting the I/O logic in Ruby, which resulted in a large gain in throughput (I don't have exact numbers, but the difference was noticeable).
Does anyone know what else I can look at to increase performance/throughput?
Sample Code:

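defmodule MyModule.Ingestion.Insertion.Folder do
  use MyModule.Poller
  alias MyModule.Helpers
  # Logger.info/1 is a macro, so Logger must be required before use
  # (harmless if MyModule.Poller already requires it).
  require Logger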

  def perform() do
    Logger.info("#{__MODULE__} starting check")

    for path <- paths() do
      files = Helpers.Path.list_files(path, ".json")

      Task.async_stream(
        files,
        fn file ->
          result =
            file
            |> File.read!()
            |> Jason.decode()

          case result do
            {:ok, data} ->
              file_created_at = Helpers.File.created_time(file)
              data = Map.put(data, :file_created_at, file_created_at)
              filename = Path.basename(file, ".json")
              :ok = MyModule.InsertWorker.enqueue(%{data: data, filename: filename})

              destination =
                Application.fetch_env!(:my_application, :backups) <> filename <> ".json"

              File.copy!(file, destination)
              File.rm!(file)

            _err ->
              nil
          end
        end,
        timeout: 60_000,
        max_concurrency: 10
      )
      |> Stream.run()
    end

    Logger.info("#{__MODULE__} check finished")
  end

  def paths() do
    path = Application.fetch_env!(:my_application, :lob_path)

    [
      path <> "postcards/",
      path <> "letters/"
    ]
  end
end
3phpmpom #1

Consider tuning the VM with async_threads.
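The async thread pool is sized with the +A emulator flag, for example `elixir --erl "+A 64" -S mix run` (64 is only an illustrative value, not a recommendation). Note that since OTP 21 most file I/O runs on dirty I/O schedulers rather than the async pool, and those are sized with +SDio, so which flag matters depends on the OTP release in use. A small sketch for checking what the running VM is configured with:

```elixir
# Size of the async thread pool (controlled by the +A flag).
async_threads = :erlang.system_info(:thread_pool_size)

# Number of dirty I/O schedulers (controlled by the +SDio flag);
# file I/O runs on these on OTP 21 and later.
dirty_io = :erlang.system_info(:dirty_io_schedulers)

IO.puts("OTP release: #{System.otp_release()}")
IO.puts("async threads: #{async_threads}, dirty I/O schedulers: #{dirty_io}")
```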

h6my8fg2 #2

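For anyone who finds this question in the future: the root of the problem was using File.copy! with path names. When you do that, the copy goes through the Erlang file_server, which was the cause of a massive bottleneck that was extremely difficult to diagnose. Instead of calling File.copy/2 with path names, pass opened files as the input, like this: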

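source_path = "path/to/some/file"
# The destination must be a file path; opening a directory for writing fails.
destination_path = "path/to/destination/file"

# Opening both files in :raw mode keeps the copy in the calling process
# instead of routing it through the file_server.
with {:ok, source} <- File.open(source_path, [:raw, :read]),
     {:ok, destination} <- File.open(destination_path, [:raw, :write]),
     {:ok, bytes} <- File.copy(source, destination),
     :ok <- File.close(source),
     :ok <- File.close(destination) do
  {:ok, bytes}
end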
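One caveat with the `with` chain above: if a later step fails (for example, the destination cannot be opened), the handles that were already opened are never closed. The following is a minimal sketch of a wrapper that always closes them. The RawCopy module and its copy/2 function are hypothetical names, not part of Elixir or of the original answer.

```elixir
defmodule RawCopy do
  @moduledoc "Hypothetical helper: copy a file through raw handles and always close them."

  # Returns {:ok, bytes_copied} or {:error, reason}.
  def copy(source_path, destination_path) do
    with {:ok, source} <- File.open(source_path, [:raw, :read]) do
      try do
        with {:ok, destination} <- File.open(destination_path, [:raw, :write]) do
          try do
            # Both arguments are raw handles, so no path goes to the file_server here.
            File.copy(source, destination)
          after
            File.close(destination)
          end
        end
      after
        File.close(source)
      end
    end
  end
end
```

In the question's code, the `File.copy!(file, destination)` call could then become `{:ok, _bytes} = RawCopy.copy(file, destination)`.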
