本文整理了Java中cz.seznam.euphoria.core.client.operator.Union
类的一些代码示例,展示了Union
类的具体用法。这些代码示例主要来源于Github
/Stackoverflow
/Maven
等平台,是从一些精选项目中提取出来的代码,具有较强的参考意义,能在一定程度帮忙到你。Union
类的具体详情如下:
包路径:cz.seznam.euphoria.core.client.operator.Union
类名称:Union
[英]The union of at least two datasets of the same type.
In the context of Euphoria, a union is merely a logical view on two datasets as one. One can think of a union as a plain concatenation of at least two dataset without any guarantees about the order of the datasets' elements. Unlike in set theory, such a union has no notion of uniqueness, i.e. if the two input dataset contain the same elements, these will all appear in the output dataset (as duplicates) untouched.
Example:
Dataset xs = ...;
The "both" dataset from the above example can now be processed with an operator expecting only a single input dataset, e.g. FlatMap, which will then effectively process both "xs" and "ys".
Note: the order of the dataset does not matter. Indeed, the order of the elements themselves in the union is intentionally not specified at all.
Dataset xs = ...;
现在,可以使用只需要一个输入数据集(例如FlatMap)的操作员来处理上述示例中的“两者”数据集,然后该操作员将有效地处理“xs”和“ys”。
注意:数据集的顺序无关紧要。事实上,联盟中元素本身的顺序根本没有被有意指定。
####建筑商:
1.【命名】。。。。。。。。。。。。。。。。。。给操作员起名[可选]
1.关于。。。。。。。。。。。。。。。。。。。。。。。输入数据集
1.产出。。。。。。。。。。。。。。。。。。。构建输出数据集
代码示例来源:origin: seznam/euphoria
/**
* Starts building a nameless Union operator to view at least two datasets as one.
*
* @param <IN> the type of elements in the data sets
*
* @param dataSets at least the two data sets
*
* @return the next builder to complete the setup of the {@link Union} operator
*
* @see #named(String)
* @see OfBuilder#of(List)
*/
@SafeVarargs
public static <IN> OutputBuilder<IN> of(Dataset<IN>... dataSets) {
return of(Arrays.asList(dataSets));
}
代码示例来源:origin: seznam/euphoria
@Override
public Dataset<IN> output(OutputHint... outputHints) {
final Flow flow = dataSets.get(0).getFlow();
final Union<IN> union = new Union<>(name, flow, dataSets, Sets.newHashSet(outputHints));
flow.add(union);
return union.output();
}
}
代码示例来源:origin: seznam/euphoria
@SuppressWarnings("unchecked")
Union(String name, Flow flow, List<Dataset<IN>> dataSets, Set<OutputHint> outputHints) {
super(name, flow);
Preconditions.checkArgument(
dataSets.size() > 1,
"Union needs at least two data sets.");
Preconditions.checkArgument(
dataSets.stream().map(Dataset::getFlow).distinct().count() == 1,
"Only data sets from the same flow can be passed to Union.");
this.dataSets = dataSets;
this.output = createOutput(dataSets.get(0), outputHints);
}
代码示例来源:origin: seznam/euphoria
@Test
public void testBuild() {
final Flow flow = Flow.create("TEST");
final Dataset<String> left = Util.createMockDataset(flow, 2);
final Dataset<String> right = Util.createMockDataset(flow, 3);
final Dataset<String> unioned = Union.named("Union1")
.of(left, right)
.output();
assertEquals(flow, unioned.getFlow());
assertEquals(1, flow.size());
final Union union = (Union) flow.operators().iterator().next();
assertEquals(flow, union.getFlow());
assertEquals("Union1", union.getName());
assertEquals(unioned, union.output());
assertEquals(2, union.listInputs().size());
}
代码示例来源:origin: seznam/euphoria
@Test
public void testBuild_ImplicitName() {
final Flow flow = Flow.create("TEST");
final Dataset<String> left = Util.createMockDataset(flow, 2);
final Dataset<String> right = Util.createMockDataset(flow, 3);
Union.of(left, right).output();
final Union union = (Union) flow.operators().iterator().next();
assertEquals("Union", union.getName());
}
}
代码示例来源:origin: seznam/euphoria
@Override
@SuppressWarnings("unchecked")
public JavaRDD<?> translate(Union operator, SparkExecutorContext context) {
final List<JavaRDD<?>> inputs = context.getInputs(operator);
if (inputs.size() < 2) {
throw new IllegalStateException("Union operator needs at least 2 inputs");
}
return inputs
.stream()
.reduce(
(l, r) ->
((JavaRDD<Object>) l)
.union((JavaRDD<Object>) r)
.setName(operator.getName()))
.orElseThrow(() -> new IllegalArgumentException("Unable to reduce inputs."));
}
}
代码示例来源:origin: seznam/euphoria
@Test
public void testBuild_ThreeDataSet() {
final Flow flow = Flow.create("TEST");
final Dataset<String> first = Util.createMockDataset(flow, 1);
final Dataset<String> second = Util.createMockDataset(flow, 2);
final Dataset<String> third = Util.createMockDataset(flow, 3);
final Dataset<String> unioned = Union.named("Union1")
.of(first, second, third)
.output();
assertEquals(flow, unioned.getFlow());
assertEquals(1, flow.size());
final Union union = (Union) flow.operators().iterator().next();
assertEquals(flow, union.getFlow());
assertEquals("Union1", union.getName());
assertEquals(unioned, union.output());
assertEquals(3, union.listInputs().size());
}
代码示例来源:origin: seznam/euphoria
/**
* Starts building a nameless Union operator to view at least two datasets as one.
*
* @param <IN> the type of elements in the data sets
*
* @param dataSets at least the two data sets
*
* @return the next builder to complete the setup of the {@link Union} operator
*
* @see #named(String)
* @see OfBuilder#of(List)
*/
@SafeVarargs
public static <IN> OutputBuilder<IN> of(Dataset<IN>... dataSets) {
return of(Arrays.asList(dataSets));
}
代码示例来源:origin: seznam/euphoria
@Override
public Dataset<IN> output(OutputHint... outputHints) {
final Flow flow = dataSets.get(0).getFlow();
final Union<IN> union = new Union<>(name, flow, dataSets, Sets.newHashSet(outputHints));
flow.add(union);
return union.output();
}
}
代码示例来源:origin: seznam/euphoria
@SuppressWarnings("unchecked")
Union(String name, Flow flow, List<Dataset<IN>> dataSets, Set<OutputHint> outputHints) {
super(name, flow);
Preconditions.checkArgument(
dataSets.size() > 1,
"Union needs at least two data sets.");
Preconditions.checkArgument(
dataSets.stream().map(Dataset::getFlow).distinct().count() == 1,
"Only data sets from the same flow can be passed to Union.");
this.dataSets = dataSets;
this.output = createOutput(dataSets.get(0), outputHints);
}
代码示例来源:origin: seznam/euphoria
public Dataset<T> union(Dataset<T> other) {
return new Dataset<>(Union.of(this.wrap, other.wrap).output());
}
代码示例来源:origin: seznam/euphoria
new Union<>(getName() + "::Union", flow,
Arrays.asList(leftMap.output(), rightMap.output()));
getName() + "::ReduceStateByKey",
flow,
union.output(),
keyExtractor,
e -> e,
代码示例来源:origin: seznam/euphoria
@Override
public Dataset<Integer> getOutput(Flow flow, boolean bounded) {
final Dataset<Integer> first = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(1, 2, 3),
Arrays.asList(4, 5, 6)));
final Dataset<Integer> second = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(7, 8, 9),
Arrays.asList(10, 11, 12)));
return Union.of(first, second).output();
}
代码示例来源:origin: seznam/euphoria
new Union<>(getName() + "::Union", flow,
Arrays.asList(leftMap.output(), rightMap.output()));
getName() + "::ReduceStateByKey",
flow,
union.output(),
keyExtractor,
e -> e,
代码示例来源:origin: seznam/euphoria
@Override
public Dataset<Integer> getOutput(Flow flow, boolean bounded) {
final Dataset<Integer> first = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(1, 2, 3),
Arrays.asList(4, 5, 6)));
final Dataset<Integer> second = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(7, 8, 9),
Arrays.asList(10, 11, 12)));
return Union.of(first, second).output();
}
代码示例来源:origin: seznam/euphoria
@Override
public Dataset<Integer> getOutput(Flow flow, boolean bounded) {
final Dataset<Integer> first = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(1, 2, 3)));
final Dataset<Integer> second = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(4, 5, 6),
Arrays.asList(7, 8, 9)));
final Dataset<Integer> third = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(10, 11, 12),
Arrays.asList(13, 14, 15),
Arrays.asList(16, 17, 18)));
return Union.of(first, second, third).output();
}
代码示例来源:origin: seznam/euphoria
@Override
public Dataset<Integer> getOutput(Flow flow, boolean bounded) {
final Dataset<Integer> first = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(1, 2, 3),
Arrays.asList(4, 5, 6)));
final Dataset<Integer> second = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(7, 8, 9),
Arrays.asList(10, 11, 12)));
final Dataset<Integer> third = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(13, 14, 15),
Arrays.asList(16, 17, 18)));
return Union.of(first, second, third).output();
}
代码示例来源:origin: seznam/euphoria
@Override
public Dataset<Integer> getOutput(Flow flow, boolean bounded) {
final Dataset<Integer> first = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(1, 2, 3)));
final Dataset<Integer> second = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(4, 5, 6),
Arrays.asList(7, 8, 9)));
final Dataset<Integer> third = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(10, 11, 12),
Arrays.asList(13, 14, 15),
Arrays.asList(16, 17, 18)));
return Union.of(first, second, third).output();
}
代码示例来源:origin: seznam/euphoria
@Override
public Dataset<Integer> getOutput(Flow flow, boolean bounded) {
final Dataset<Integer> first = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(1, 2, 3),
Arrays.asList(4, 5, 6)));
final Dataset<Integer> second = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(7, 8, 9),
Arrays.asList(10, 11, 12)));
final Dataset<Integer> third = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(13, 14, 15),
Arrays.asList(16, 17, 18)));
return Union.of(first, second, third).output();
}
代码示例来源:origin: seznam/euphoria
@Override
public Dataset<Integer> getOutput(Flow flow, boolean bounded) {
final Dataset<Integer> first = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(1, 2, 3)));
final Dataset<Integer> second = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(4, 5, 6)));
final Dataset<Integer> third = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(7, 8, 9)));
final Dataset<Integer> fourth = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(10, 11, 12)));
final Dataset<Integer> fifth = flow.createInput(ListDataSource.of(bounded,
Arrays.asList(13, 14, 15)));
return Union.of(first, second, third, fourth, fifth).output();
}
内容来源于网络,如有侵权,请联系作者删除!