Erlang/OTP framework's error_logger hangs under fairly high load

agyaoht7  posted on 2022-12-08  in  Erlang

My application is basically a content-based router that routes MMS events.
The logger I am using is error_logger, the one that comes with the OTP framework in SASL mode.
The issue is this:
I am using a client to generate MMS events with default values. This client (written in Java) can send a high load of events from multiple threads.
I am sending 100 events from 10 threads (each thread sends 10 MMS events) to my router, which is written in Erlang/OTP.
The problem is that when the router receives such a high load, my logger hangs, i.e. it stops updating my log file. The router itself is still able to route the events.
The possible explanations I have come up with are:

  1. A scheduling problem in Erlang when such a high load of events is received (a separate process is spawned for each event).
  2. A very unlikely deadlock state.
  3. It might be caused by sending events from multiple threads rather than sequentially. But I expect a router to be connected to multiple service provider boxes, which is why I chose to send events from threads.

Can anybody help me demystify the problem?

jei2mxaa1#

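You already have a good answer, but I will add to the discussion.
By default error_logger uses cached disk writes, so one possibility is that you do not notice this under low load, but under high load your writes sit in that cache for a while.
By the way: calling Erlang from multiple threads should not be a problem.
Another way to test this is to add your own logger to error_logger and see what happens. It could print to the shell or to something else that is "fast".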
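As a rough illustration of that test, here is a minimal sketch of a gen_event handler that prints every event straight to the shell. The module name shell_log_handler and the event counter kept as state are assumptions made for this example, not part of the original answer.

```erlang
-module(shell_log_handler).
-behaviour(gen_event).

-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% The handler state is just the number of events seen so far.
init(_Args) ->
    {ok, 0}.

%% Print each event directly to the shell instead of writing to a file.
handle_event(Event, Count) ->
    io:format("event ~p: ~p~n", [Count + 1, Event]),
    {ok, Count + 1}.

%% gen_event:call(error_logger, shell_log_handler, count) returns the counter.
handle_call(count, Count) ->
    {ok, Count, Count}.

handle_info(_Info, Count) ->
    {ok, Count}.

terminate(_Reason, _Count) ->
    ok.

code_change(_OldVsn, Count, _Extra) ->
    {ok, Count}.
```

Once the module is compiled, it can be attached with error_logger:add_report_handler(shell_log_handler). If shell output keeps flowing under load while the log file stalls, the file writes are the likely bottleneck.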


4nkexdtk2#

Which version of Erlang are you using? Prior to R14A (R13B4 maybe?), there was a performance penalty when you invoked a selective receive while the message queue contained a lot of messages. This meant that for a process that receives lots of messages (error_logger being the canonical example), if it was barely keeping up with the load, a small spike in load could push the cost of processing up and keep it there, because the new processing cost was higher than the process could bear. This problem has been solved in R14A.
Secondly - why are you sending a high volume of events/calls/logs to a text logger? Formatting strings for output to a human readable log file is a lot more expensive than using a binary disk_log for instance. Reducing the cost of logging will help, but reducing the volume of logs will help even more. Maybe investigate exactly why you need to log these things and see if you can't record them another (less expensive) way.
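To make the disk_log suggestion concrete, here is a minimal sketch of logging events as raw Erlang terms instead of formatted text. The module name mms_disk_log, the log name mms_events and the choice of a halt log are assumptions for illustration only.

```erlang
-module(mms_disk_log).
-export([open/1, log_event/2, close/1]).

%% Open (or create) a halt log in disk_log's internal binary term format.
%% File must be a string file name; mms_events is a hypothetical log name.
open(File) ->
    disk_log:open([{name, mms_events},
                   {file, File},
                   {type, halt},
                   {format, internal}]).

%% alog/2 is asynchronous: the caller does not wait for the disk write,
%% and no string formatting is done on the term.
log_event(Log, Term) ->
    disk_log:alog(Log, Term).

close(Log) ->
    disk_log:close(Log).
```

A call such as {ok, Log} = mms_disk_log:open("mms_events.LOG") followed by mms_disk_log:log_event(Log, {routed, MsgId}) would append the term as is; the terms can later be read back with disk_log:chunk/2. Here {routed, MsgId} is only a placeholder for whatever the router needs to record.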
Problems with error_logger are often symptoms of some other overload problem. Try looking at the message queue sizes for all your processes when this problem occurs and see if something else is backed up too. The following Erlang shell snippet might help:

%% Skip processes that exit between processes() and process_info/2.
[ { P, Len }
  || P <- erlang:processes(),
     {message_queue_len, Len} <- [process_info(P, message_queue_len)] ]
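On a busy node the full list is long, so it can help to look only at the worst offenders. The following is a small sketch along the same lines; the module name queue_check and the function top_queues/1 are hypothetical names introduced here.

```erlang
-module(queue_check).
-export([top_queues/1]).

%% Return up to N {Pid, QueueLen} pairs, longest message queue first.
top_queues(N) ->
    Queues = [{P, Len}
              || P <- erlang:processes(),
                 %% process_info/2 returns undefined for a dead process;
                 %% the pattern in this generator silently skips it.
                 {message_queue_len, Len} <-
                     [process_info(P, message_queue_len)]],
    Sorted = lists:reverse(lists:keysort(2, Queues)),
    lists:sublist(Sorted, N).
```

Calling queue_check:top_queues(10) while the logger is stalled shows whether error_logger is the only process with a growing queue or whether something else is backed up as well.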
