我目前正在尝试理解Erlang集合的行为,当我计算字符串变位词上的集合时。在我的理解中,两个变位词应该产生两个相同的字符串集合。
Set1 = sets:from_list("orchestra").
Set2 = sets:from_list("carthorse").
Set1 =:= Set2. %% true
然而,使用sets:intersection
,我们得到了一个不同的集合,它不等于前两个集合。
Intersection = sets:intersection(Set1, Set2).
Intersection =:= Set1. %% false
Intersection =:= Set2. %% false
基于Erlang中集合交集的计算方式,这种行为是否有特殊的原因?
3条答案
按热度按时间1cklez4t1#
sets
模块的实现并不保证两个集合可以与=:=
进行比较,即使它们包含相同的元素。内部数据结构可以不同。您可以使用is_subset/2
或subtract/2
之类的运算(相对效率较低),或者可以使用to_list/1
,然后使用lists:sort/1
来获得两个可以直接比较的列表。(字符列表)无论如何,你最好马上使用ordsets
。这些是有序的列表,你可以把它们作为集合来操作,并且可以直接进行比较。对于小的集合,它们通常比sets
更有效。xxhby3vn2#
=:=
运算符比较Set1
和Intersection
的表示,但不能保证表示是什么,也不能保证同一个集合只有一个表示。sets的文档只在描述如何比较集合的元素时讨论了
=:=
,而不是集合本身。对于集合相等,可以定义
ars1skjm3#
Let's take the code and look at what happens along the way.
We are defining 2 sets
and
where the second lines represent the values created with a representation. So based on the above definition, we infer that the representation chosen for storing sets is of a tuple.
We can refer to the Term Comparison section of Erlang Reference manual which states that
...Lists are compared element by element. Tuples are ordered by size, two tuples with the same size are compared element by element...
*Having that as an invariant, now lets look at the set which comes about after an
intersection
Again, this being a set gets the same underlying representation of the tuple, but if we look at the last element, there is a string (essentially a list of characters in Erlang)
"cs"
, which is different in value in the same place when we definedSet1
andSet2
. And hence the inequality as per theTerm Comparison
.Now, let's attempt to change the sub-value
"cs"
insideIntersection
to"sc"
and then as per theTerm Comparison
rules, two sets must satisfy equality.And now let's compare
Improvised_Intersection
withSet1
andSet2
, which givesIt's just an attempt to provide some insight into what already @Richard and others have pointed out.