Erlang processes vs. Java threads

ycl3bljg  posted on 2022-12-08 in Erlang

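I am reading the book "Elixir in Action" by Saša Jurić, and the first chapter says:
"Erlang processes are completely isolated from each other. They share no memory, and a crash of one process doesn't cause a crash of other processes."
Isn't that also true of Java threads? I mean, when a Java thread crashes, it does not crash other threads either -- especially if we are looking at request-processing threads (let's leave the main thread out of this discussion).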

s3fp2yjn  #1

Repeat after me: "These are different paradigms"

Say that aloud 20 times or so -- it is our mantra for the moment.
If we really must compare apples and oranges, let's at least consider where the common aspects of "being fruit" intersect.
Java "objects" are a Java programmer's basic unit of computation. That is, an object (basically a struct with arms and legs that has encapsulation somewhat more strictly enforced than in C++) is the primary tool with which you model the world. You think "This object knows/has Data {X,Y,Z} and performs Functions {A(),B(),C()} over it, carries the Data everywhere it goes, and can communicate with other objects by calling functions/methods defined as part of their public interface. It is a noun, and that noun does stuff." That is to say, you orient your thought process around these units of computation. The default case is that things that happen amongst the objects occur in sequence, and a crash interrupts that sequence. They are called "objects" and hence (if we disregard Alan Kay's original meaning) we get "object orientation".
Erlang "processes" are an Erlang programmer's basic unit of computation. A process (basically a self-contained sequential program running in its own time and space) is the primary tool with which an Erlanger models the world(1). Similar to how Java objects define a level of encapsulation, Erlang processes also define the level of encapsulation, but in the case of Erlang the units of computation are completely cut off from one another. You cannot call a method or function on another process, nor can you access any data that lives within it, nor does one process even run within the same timing context as any other processes, and there is no guarantee about the ordering of message reception relative to other processes which may be sending messages. They may as well be on different planets entirely (and, come to think of it, this is actually plausible). They can crash independently of one another and the other processes are only impacted if they have deliberately elected to be impacted (and even this involves messaging: essentially registering to receive a suicide note from the dead process which itself is not guaranteed to arrive in any sort of order relative to the system as a whole, to which you may or may not choose to react).
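For contrast, here is a minimal Java sketch of the nearest thing a plain thread offers to "electing to be told" about another thread's death: an uncaught-exception handler. The class and method names (`CrashNotice`, `runCrashingWorker`) are made up for illustration. Note the difference from the Erlang picture above: the handler is a callback that runs on the dying thread inside the same shared heap, not a message delivered to an isolated process.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; the names are illustrative and not from the answer.
final class CrashNotice {

    // Starts a worker that crashes and returns the notice the handler recorded.
    static String runCrashingWorker() {
        AtomicReference<String> notice = new AtomicReference<>("none");

        Thread worker = new Thread(() -> {
            throw new IllegalStateException("worker failed");
        }, "worker");

        // Opt in to hearing about the death of this particular thread.
        worker.setUncaughtExceptionHandler(
            (thread, error) -> notice.set(thread.getName() + ": " + error.getMessage()));

        worker.start();
        try {
            // The handler runs on the dying thread before it terminates,
            // so after join() the recorded notice is visible here.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return notice.get();
    }

    public static void main(String[] args) {
        // The main thread keeps running after the worker has died.
        System.out.println(runCrashingWorker());
    }
}
```

The calling thread survives the worker's crash, which is the narrow sense in which the question's observation holds.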
Java deals with complexity directly in compound algorithms: how objects work together to solve a problem. It is designed to do this within a single execution context, and the default case in Java is sequential execution. Multiple threads in Java indicate multiple running contexts, and this is a very complex topic because of the impact that activity in different timing contexts has on one another (and on the system as a whole: hence defensive programming, exception schemes, etc.). Saying "multi-threaded" in Java means something different than it does in Erlang; in fact this is never even said in Erlang because it is always the base case. Note here that Java threads imply segregation as pertains to time, not memory or visible references -- visibility in Java is controlled manually by choosing what is private and what is public; universally accessible elements of a system must be either designed to be "threadsafe" and reentrant, sequentialized via queueing mechanisms, or employ locking mechanisms. In short: scheduling is a manually managed issue in threaded/concurrent Java programs.
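As a small sketch of the "employ locking mechanisms" option mentioned above, the following hypothetical `SharedCounter` is reachable from two threads and stays correct only because every access goes through `synchronized` methods. The names are invented for illustration.

```java
// Hypothetical example: a counter that two threads update concurrently.
final class SharedCounter {
    private int value = 0;

    // Without 'synchronized', concurrent increments could be lost,
    // because value++ is a read-modify-write sequence.
    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }

    // Runs two threads that each increment the same counter 'perThread' times.
    static int countWithTwoThreads(int perThread) {
        SharedCounter counter = new SharedCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(10_000));
    }
}
```

The point is that the programmer, not the runtime, decides where the coordination goes; nothing stops a third piece of code from touching the same object without the lock.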
Erlang separates each process's running context in terms of execution timing (scheduling), memory access and reference visibility, and in doing so simplifies each component of an algorithm by isolating it completely. This is not just the default case, this is the only case available under this model of computation. This comes at the cost of never knowing exactly the sequence of any given operation once a part of your processing sequence crosses a message barrier -- because messages are all essentially network protocols and there are no method calls that can be guaranteed to execute within a given context. This would be analogous to creating a JVM instance per object, and only permitting them to communicate across sockets -- that would be ridiculously cumbersome in Java, but is the way Erlang is designed to work (incidentally, this is also the basis of the concept of writing "Java microservices" if one ditches the web-oriented baggage the buzzword tends to entail -- Erlang programs are, by default, swarms of microservices). It's all about tradeoffs.
These are different paradigms. The closest commonality we can find is to say that from the programmer's perspective, Erlang processes are analogous to Java objects. If we must find something to compare Java threads to... well, we're simply not going to find something like that in Erlang, because there is no such comparable concept in Erlang. To beat a dead horse: these are different paradigms. If you write a few non-trivial programs in Erlang this will become readily apparent.

Note that I'm saying "these are different paradigms" but have not even touched the topic of OOP vs FP. The difference between "thinking in Java" and "thinking in Erlang" is more fundamental than OOP vs FP. (In fact, one could write an OOP language for the Erlang VM that works like Java -- for example: An implementation of OOP objects in Erlang.)
While it is true that Erlang's "concurrency oriented" or "process oriented" foundation is closer to what Alan Kay had in mind when he coined the term "object oriented"(2), that is not really the point here. What Kay was getting at was that one can reduce the cognitive complexity of a system by cutting your computrons into discrete chunks, and isolation is necessary for that. Java accomplishes this in a way that leaves it still fundamentally procedural in nature, but structures code around a special syntax over higher-order dispatching closures called "class definitions". Erlang does this by splitting the running context up per object. This means Erlang thingies can't call methods on one another, but Java thingies can. This means Erlang thingies can crash in isolation but Java thingies can't. A vast number of implications flow from this basic difference -- hence "different paradigms". Tradeoffs.
Footnotes:

  1. Incidentally, Erlang implements a version of "the actor model", but we don't use this terminology as Erlang predates the popularization of this model. Joe was unaware of it when he designed Erlang and wrote his thesis.
  2. Alan Kay has said quite a bit about what he meant when he coined the term "object oriented", the most interesting being his take on messaging (one-way notification from one independent process with its own timing and memory to another) VS calls (function or method calls within a sequential execution context with shared memory) -- and how the lines blur a bit between programming interface as presented by the programming language and the implementation underneath.

wpcxdonn  #2

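Absolutely not. All threads in Java share the same address space, so one thread can trash things owned by another thread. In the Erlang VM this simply isn't possible, because each process is isolated from all the others. That is the whole point of them. Any time you want one process to do something with data from another, your code has to send a message to the other process. The only things shared between processes are large binaries, and these are immutable.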
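A minimal Java sketch of "one thread can trash things owned by another": the list below is created and conceptually owned by the calling thread, yet a second thread empties it simply because it holds a reference. `SharedHeapDemo` and its method are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: nothing in the JVM ties a heap object to "its" thread.
final class SharedHeapDemo {

    // Returns the size of the caller's list after another thread interfered.
    static int sizeAfterInterference() {
        List<String> ownedByCaller = new ArrayList<>(List.of("a", "b", "c"));

        // The other thread only needs a reference to wipe the data.
        Thread other = new Thread(ownedByCaller::clear);
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ownedByCaller.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterInterference()); // prints 0
    }
}
```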

pengsaosao  #3

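Java threads can in fact share memory. For example, you can pass the same instance to two different threads, and both can manipulate its state, which leads to potential problems such as deadlocks.
Elixir/Erlang, on the other hand, addresses this through the concept of immutability, so when you pass something to a process, it receives a copy of the original value.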
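The copy-on-send behaviour can be imitated in Java, but only by convention. The sketch below hands an immutable snapshot (`List.copyOf`) to a receiver thread through a queue, so a later change by the sender is not seen by the receiver. `CopyOnSend`, the queue used as a "mailbox", and the method name are assumptions made for this illustration; the JVM does not enforce any of it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo of sending a copy instead of sharing a mutable instance.
final class CopyOnSend {

    // Sends a snapshot, mutates the original, and returns what the receiver got.
    static List<String> sendThenMutate() {
        BlockingQueue<List<String>> mailbox = new ArrayBlockingQueue<>(1);
        AtomicReference<List<String>> received = new AtomicReference<>();

        Thread receiver = new Thread(() -> {
            try {
                received.set(mailbox.take()); // blocks until a message arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();

        List<String> original = new ArrayList<>(List.of("x", "y"));
        try {
            mailbox.put(List.copyOf(original)); // immutable snapshot, not the list itself
            original.add("z");                  // the receiver never sees this change
            receiver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received.get();
    }

    public static void main(String[] args) {
        System.out.println(sendThenMutate()); // prints [x, y]
    }
}
```

Had the sender put `original` itself on the queue, both threads would hold the same mutable list, which is the situation this answer warns about.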

ztmd8pv5  #4

"when Java thread dies, it too does not impact other threads"
Let me ask a counterquestion: why do you think Thread.stop() has been deprecated for more than a decade? The reason why is precisely the negation of your statement above.
To give two specific examples: you stop() a thread while it's executing something as innocuous-sounding as System.out.println() or Math.random(). Result: those two features are now broken for the entire JVM. The same pertains to any other synchronized code your application may execute.
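The mechanism behind this can be sketched without calling the deprecated Thread.stop() at all. In the hypothetical `Account` class below, an exception thrown halfway through a synchronized update stands in for an abrupt stop: the monitor is released when the thread dies, and every other thread then sees the half-finished state. All names are invented for illustration.

```java
// Hypothetical demo: a thread dies in the middle of a synchronized update.
final class Account {
    private int checking = 100;
    private int savings = 0;

    // Intended invariant: checking + savings == 100.
    synchronized void transfer(int amount, boolean failMidway) {
        checking -= amount;
        if (failMidway) {
            // Stand-in for the thread being killed between the two writes.
            throw new IllegalStateException("died between the two writes");
        }
        savings += amount;
    }

    synchronized int total() {
        return checking + savings;
    }

    // Returns the total that other threads observe after the crash.
    static int totalAfterCrash() {
        Account account = new Account();
        Thread t = new Thread(() -> account.transfer(40, true));
        t.setUncaughtExceptionHandler((thread, error) -> { }); // keep the demo quiet
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return account.total(); // the lock was released, the damage was not undone
    }

    public static void main(String[] args) {
        System.out.println(totalAfterCrash()); // prints 60, not 100
    }
}
```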
"if we are looking at request-processing threads"
The application may theoretically be coded such that absolutely no shared resource protected by locks is ever used; however that will only help to point out the exact extent to which Java threads are codependent. And the "independence" achieved will only pertain to the request-processing threads, not to all threads in such an application.

cmssoen2  #5

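As an addition to the previous answers: Java threads come in two types, daemon and non-daemon.
To change the type of a thread, you can call .setDaemon(boolean on). The difference is that a daemon thread does not keep the JVM from exiting. As the Javadoc for Thread says:
"The Java Virtual Machine exits when the only threads running are all daemon threads."
This means that user threads (those not specifically set to be daemon) keep the JVM from terminating. Daemon threads, on the other hand, may still be running when all non-daemon threads have finished, in which case the JVM will exit. So, to answer your question: you can start a thread that does not take the JVM down when it ends.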
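A short sketch of the call described above. `DaemonDemo` and `newBackgroundThread` are hypothetical names; `Thread.setDaemon` and `Thread.isDaemon` are the standard API.

```java
// Hypothetical demo of marking a thread as a daemon thread.
final class DaemonDemo {

    // Creates (but does not start) a thread that will not keep the JVM alive.
    static Thread newBackgroundThread(Runnable task) {
        Thread thread = new Thread(task, "background");
        thread.setDaemon(true); // must be called before start()
        return thread;
    }

    public static void main(String[] args) {
        Thread background = newBackgroundThread(() -> {
            try {
                Thread.sleep(60_000); // pretend to do long-running work
            } catch (InterruptedException ignored) {
                // nothing to clean up in this sketch
            }
        });
        background.start();
        System.out.println("main is done, daemon = " + background.isDaemon());
        // The JVM exits here even though the daemon thread is still sleeping.
    }
}
```

With `setDaemon(false)`, which is the default for threads created from the main thread, the same program would stay alive until the sleep finished.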
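As for the comparison with Erlang/Elixir, don't forget: **they are different paradigms, as mentioned before.**
It is not impossible for the JVM to mimic Erlang's behavior, although that is not what it was intended for, and so it comes with a lot of tradeoffs. The following projects attempt to do so: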

flmtquvp  #6

"I mean, when a Java thread crashes, it does not crash other threads either"
Yes and no. Let me explain:

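  • Regarding shared memory: the different threads in a Java process share the whole heap, so threads can interact in a huge number of planned and unplanned ways. **However**, objects on the stack (e.g. a context you pass down to a called method) or a ThreadLocal belong to their own thread (unless they start sharing references).
  • Crashing: if a thread crashes in Java (a Throwable is propagated out of Thread.run(), or something gets stuck in a loop or blocks), that mishap might not affect other threads (e.g. a pool of connections in a server will continue to operate). However, threads interact. Other threads will easily get stuck if one of them ends abnormally (e.g. one thread trying to read from an empty pipe whose other end was never closed by another thread). So unless the developers are very careful, side effects are quite likely.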
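One way the second point can play out, as a hedged sketch: a thread takes a `ReentrantLock` and then dies without releasing it. This is a different scenario from the pipe example above, chosen because it is easy to reproduce deterministically; `AbandonedLock` and its method are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: a crashed thread leaves a lock held forever.
final class AbandonedLock {

    // Returns whether the caller can acquire the lock after its owner died.
    static boolean canLockAfterOwnerDied() {
        ReentrantLock lock = new ReentrantLock();

        Thread owner = new Thread(() -> {
            lock.lock();
            // Deliberate bug: no try/finally, so the exception skips unlock().
            throw new IllegalStateException("crashed while holding the lock");
        });
        owner.setUncaughtExceptionHandler((thread, error) -> { }); // keep the demo quiet
        owner.start();
        try {
            owner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // The owner is gone, but the lock still counts as held by it.
        return lock.tryLock();
    }

    public static void main(String[] args) {
        System.out.println(canLockAfterOwnerDied()); // prints false
    }
}
```

A thread that called `lock.lock()` here instead of `tryLock()` would block indefinitely, which is the "easily get stuck" outcome described in the bullet. The usual guard is to release the lock in a `finally` block.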

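I doubt that any other paradigm intends threads to operate as totally independent islands. They must share information and coordinate somehow, and then there is a chance to mess things up. It is just that they take a more defensive approach that "gives you less rope to hang yourself" (the same idea as with pointers).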

cyej8jka  #7

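At first, it may seem that way. In this limited context of comparison ("one crashing the other"), they may look the same. But the real differences emerge once we get into their details and nature. @zxq9 has given quite a lot of contrasting detail; hopefully it helps in understanding that they really do differ in the details.
-- Erlang is a marvel of distributed systems engineering. Its view of the problem domain is truly remarkable, and the way it handles system resources is truly distinctive and not encountered anywhere else.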
