High availability in Erlang

fzsnzjdm  posted on 2022-12-08  in  Erlang

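What is the typical way to achieve high availability in Erlang?
Suppose a gen_server is registered locally as ?MODULE, and there are N independent Erlang nodes, interconnected through the default Erlang distribution, each running one instance of that gen_server. How do I 1) ensure that no request is lost when a participating node fails (as long as at least one of them is online), and 2) load-balance the nodes so that some are not overloaded while others sit idle waiting for new messages? As far as I can tell there is no built-in load balancer: neither pg2 nor the newer pg is sufficient (though either may still be a good basis for further work in this direction).
I bet this is a common problem and that battle-tested, Erlang-style solutions already exist.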

qgzx9mmu #1

I think that for 1), an exactly-once guarantee requires some kind of distributed transaction algorithm, because connections can fail and you cannot know the state of the request on the remote node: Is the remote node dead? Is it alive and merely disconnected because of a network failure? How far into processing the request did it get before the failure?
You should check out mnesia; it is deeply integrated with Erlang and supports transactions across nodes.
If you can relax the requirements for 1) (for instance, if the requests are idempotent, if at-least-once delivery is enough for you, or if failures are uncommon), it may suffice to monitor the remote gen_server and simply replay the request whenever the connection to the remote server is lost for whatever reason.
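A minimal sketch of that replay idea, assuming the requests are idempotent. The module name `ha_call` and the 5000 ms timeout are made up for illustration. It relies on the fact that `gen_server:call/3` monitors the callee and exits in the caller if the server or its node goes away, so the helper just catches that exit and moves on to the next node:

```erlang
-module(ha_call).
-export([call/3]).

%% Try each node in turn until one of them answers.
%% This gives at-least-once semantics only: a node may have handled the
%% request before the connection dropped, so Request must be idempotent.
call(_Name, _Request, []) ->
    {error, no_nodes_available};
call(Name, Request, [Node | Rest]) ->
    try gen_server:call({Name, Node}, Request, 5000) of
        Reply ->
            {ok, Reply}
    catch
        %% noproc, nodedown, timeout, or a crash of the server mid-request
        exit:_Reason ->
            call(Name, Request, Rest)
    end.
```

A caller could use it as `ha_call:call(my_server, Request, [node() | nodes()])`, where `my_server` stands for whatever name the gen_server is registered under. Note that a timeout is treated like a failure here, so a slow but healthy node may still finish the request after it has been replayed elsewhere.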
For 2), we put haproxy or an nginx web server in front of the nodes with a least-connections policy, although I believe you mean load balancing 'inside' Erlang. In that case I would do the following to keep a local ETS table with the load information:

  1. Give each ?MODULE server a sidekick process that periodically broadcasts the local ?MODULE's mailbox size (or some other metric) to the sidekicks on the other nodes in the cluster.
  2. When a sidekick receives such a broadcast, it writes the origin node and size into an ETS table, or keeps them in its own state and stores only the currently least-busy node in ETS.
  3. When a sidekick notices that a remote node has disconnected, it updates the ETS table.
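The three steps above could be sketched roughly as follows. This is an illustration under assumptions, not a tested production design: the module name `load_sidekick`, the table name `load_info`, the one-second interval, and the choice of `message_queue_len` as the metric are all made up here.

```erlang
-module(load_sidekick).
-behaviour(gen_server).

-export([start_link/1, least_busy/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(TABLE, load_info).   % hypothetical ETS table name
-define(INTERVAL, 1000).     % broadcast period in ms (assumed value)

%% Target is the locally registered name of the gen_server to watch.
start_link(Target) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, Target, []).

%% Node with the smallest reported mailbox, read straight from ETS.
%% The sidekick must already be running, otherwise the table is missing.
least_busy() ->
    case ets:tab2list(?TABLE) of
        [] -> none;
        Rows ->
            {_Len, Best} = lists:min([{Len, Node} || {Node, Len} <- Rows]),
            {ok, Best}
    end.

init(Target) ->
    ets:new(?TABLE, [named_table, protected, set]),
    ok = net_kernel:monitor_nodes(true),       % delivers nodeup/nodedown
    erlang:send_after(0, self(), tick),
    {ok, Target}.

handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

%% Step 2: record what another sidekick (or this one) reported.
handle_cast({load, Node, undefined}, State) ->
    ets:delete(?TABLE, Node),                  % watched server is not running
    {noreply, State};
handle_cast({load, Node, Len}, State) when is_integer(Len) ->
    ets:insert(?TABLE, {Node, Len}),
    {noreply, State}.

%% Step 1: broadcast the local metric to every sidekick, this one included.
handle_info(tick, Target) ->
    gen_server:abcast(?MODULE, {load, node(), mailbox_len(Target)}),
    erlang:send_after(?INTERVAL, self(), tick),
    {noreply, Target};
%% Step 3: forget nodes that have disconnected.
handle_info({nodedown, Node}, State) ->
    ets:delete(?TABLE, Node),
    {noreply, State};
handle_info(_Other, State) ->
    {noreply, State}.

mailbox_len(Name) ->
    case whereis(Name) of
        undefined -> undefined;
        Pid ->
            case erlang:process_info(Pid, message_queue_len) of
                {message_queue_len, Len} -> Len;
                undefined -> undefined
            end
    end.
```

Callers would then ask `load_sidekick:least_busy()` for a node before sending a request. The load figures are at most one interval old, so this only evens out load approximately.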
Regarding OTP 23's pg, don't discard it so easily. According to the docs, process groups implement strong eventual consistency. You can have overloaded servers leave the process group temporarily, and they will eventually stop receiving requests. You can also run several servers per node, each with a low threshold for leaving the group, to get a more uniform distribution.
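As a rough sketch of the leave-when-overloaded idea with pg (OTP 23 or later): the module name `pg_balance`, the group name `workers`, and both thresholds are assumptions chosen for illustration, and the default pg scope has to be running already (for example via `pg:start_link/0`).

```erlang
-module(pg_balance).
-export([adjust/1, pick/0]).

-define(GROUP, workers).     % hypothetical group name
-define(HIGH_WATER, 100).    % leave the group above this mailbox size (assumed)
-define(LOW_WATER, 10).      % rejoin the group below this mailbox size (assumed)

%% Called periodically by a worker on itself. Joined is the worker's
%% current membership flag; the new flag is returned so the worker can
%% keep it in its state. Start with adjust(false) to join initially.
adjust(Joined) ->
    {message_queue_len, Len} = erlang:process_info(self(), message_queue_len),
    if
        Joined, Len > ?HIGH_WATER ->
            ok = pg:leave(?GROUP, self()),
            false;
        not Joined, Len < ?LOW_WATER ->
            ok = pg:join(?GROUP, self()),
            true;
        true ->
            Joined
    end.

%% Pick a random member of the group from any node in the cluster.
pick() ->
    case pg:get_members(?GROUP) of
        [] ->
            {error, no_members};
        Members ->
            {ok, lists:nth(rand:uniform(length(Members)), Members)}
    end.
```

Tracking the `Joined` flag matters because pg lets a process join the same group more than once, in which case it would also have to leave more than once. Since membership is only eventually consistent across nodes, an overloaded worker may still receive a few requests after it has left.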
