org.apache.gobblin.source.workunit.Extract类的使用及代码示例

x33g5p2x 于2022-01-19 转载在其他

字(10.1k)|赞(0)|评价(0)|浏览(128)

本文整理了Java中org.apache.gobblin.source.workunit.Extract类的一些代码示例，展示了Extract类的具体用法。这些代码示例主要来源于Github/Stackoverflow/Maven等平台，是从一些精选项目中提取出来的代码，具有较强的参考意义，能在一定程度帮忙到你。Extract类的具体详情如下：
包路径：org.apache.gobblin.source.workunit.Extract
类名称：Extract

Extract介绍

[英]A class representing all the base attributes required by all tables types. Subclasses will be expected to validate each table type for their respective required attributes.

The extract ID only needs to be unique for Extracts belonging to the same namespace/table. One or more WorkUnits can share the same extract ID. WorkUnits that do share an extract ID will be considered parts of a single Extract for the purpose of applying publishing policies.
[中]表示所有表类型所需的所有基本属性的类。子类需要验证每个表类型各自所需的属性。
对于属于同一命名空间/表的摘录，摘录ID只需要是唯一的。一个或多个工作单元可以共享相同的提取ID。共享提取ID的工作单元将被视为单个提取的一部分，以便应用发布策略。

代码示例

代码示例来源：origin: apache/incubator-gobblin

/**
 * Get the {@link org.apache.gobblin.source.workunit.Extract} associated with the {@link WorkUnit}.
 *
 * @return {@link org.apache.gobblin.source.workunit.Extract} associated with the {@link WorkUnit}
 */
public Extract getExtract() {
 return new Extract(this.workUnit.getExtract());
}

代码示例来源：origin: apache/incubator-gobblin

@Override
public int hashCode() {
 return (this.getNamespace() + this.getTable() + this.getExtractId()).hashCode();
}

代码示例来源：origin: apache/incubator-gobblin

/**
 * Set a (non-globally) unique ID for this {@link Extract}.
 *
 * @param extractId unique ID for this {@link Extract}
 */
public void setExtractId(String extractId) {
 setProp(ConfigurationKeys.EXTRACT_EXTRACT_ID_KEY, extractId);
}

代码示例来源：origin: apache/incubator-gobblin

/**
 * Get the writer output file path corresponding to this {@link Extract}.
 *
 * @return writer output file path corresponding to this {@link Extract}
 * @deprecated As {@code this.getIsFull} is deprecated.
 */
@Deprecated
public String getOutputFilePath() {
 return this.getNamespace().replaceAll("\\.", "/") + "/" + this.getTable() + "/" + this.getExtractId() + "_"
   + (this.getIsFull() ? "full" : "append");
}

代码示例来源：origin: apache/incubator-gobblin

/**
 * Add more primary keys to the existing set of primary keys.
 *
 * @param primaryKeyFieldName primary key names
 * @deprecated @deprecated It is recommended to add primary keys in {@code WorkUnit} instead of {@code Extract}.
 */
@Deprecated
public void addPrimaryKey(String... primaryKeyFieldName) {
 StringBuilder sb = new StringBuilder(getProp(ConfigurationKeys.EXTRACT_PRIMARY_KEY_FIELDS_KEY, ""));
 Joiner.on(",").appendTo(sb, primaryKeyFieldName);
 setProp(ConfigurationKeys.EXTRACT_PRIMARY_KEY_FIELDS_KEY, sb.toString());
}

代码示例来源：origin: apache/incubator-gobblin

/**
  * Returns a unique {@link Extract} instance.
  * Any two calls of this method from the same {@link ExtractFactory} instance guarantees to
  * return {@link Extract}s with different IDs.
  *
  * @param type {@link TableType}
  * @param namespace dot separated namespace path
  * @param table table name
  * @return a unique {@link Extract} instance
  */
 public synchronized Extract getUniqueExtract(TableType type, String namespace, String table) {
  Extract newExtract = new Extract(type, namespace, table);
  while (this.createdInstances.contains(newExtract)) {
   if (Strings.isNullOrEmpty(newExtract.getExtractId())) {
    newExtract.setExtractId(this.dtf.print(new DateTime()));
   } else {
    DateTime extractDateTime = this.dtf.parseDateTime(newExtract.getExtractId());
    newExtract.setExtractId(this.dtf.print(extractDateTime.plusSeconds(1)));
   }
  }
  this.createdInstances.add(newExtract);
  return newExtract;
 }
}

代码示例来源：origin: apache/incubator-gobblin

if (previousExtract.getNamespace().equals(namespace) && previousExtract.getTable().equals(table)) {
 this.previousTableState.addAll(pre);

代码示例来源：origin: apache/incubator-gobblin

@Test
public void testGetDefaultWriterFilePath() {
 String namespace = "gobblin.test";
 String tableName = "test-table";
 SourceState sourceState = new SourceState();
 WorkUnit state = WorkUnit.create(new Extract(sourceState, TableType.APPEND_ONLY, namespace, tableName));
 Assert.assertEquals(WriterUtils.getWriterFilePath(state, 0, 0), new Path(state.getExtract().getOutputFilePath()));
 Assert.assertEquals(WriterUtils.getWriterFilePath(state, 2, 0), new Path(state.getExtract().getOutputFilePath(),
   ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + "0"));
}

代码示例来源：origin: apache/incubator-gobblin

private Extract getExtractForFile(PartitionAwareFileRetriever.FileInfo file,
  String topicName,
  String namespace,
  Map<Long, Extract> extractMap) {
 Extract extract = extractMap.get(file.getWatermarkMsSinceEpoch());
 if (extract == null) {
  // Create an extract object for the dayPath
  extract = new Extract(this.tableType, namespace, topicName);
  LOG.info("Created extract: " + extract.getExtractId() + " for path " + topicName);
  extractMap.put(file.getWatermarkMsSinceEpoch(), extract);
 }
 return extract;
}

代码示例来源：origin: apache/incubator-gobblin

table.setNamespace(extract.getNamespace());
table.setName(extract.getTable());
if (extract.hasType()) {
 table.setType(TableTypeEnum.valueOf(extract.getType().name()));

代码示例来源：origin: apache/incubator-gobblin

@Test
public void schemaWithRecordOfEnum()
  throws Exception {
 String testName = "schemaWithRecordOfEnum";
 JsonObject schema = getSchemaData(testName).getAsJsonObject();
 JsonObject expected = getExpectedSchema(testName).getAsJsonObject();
 RecordConverter converter = new RecordConverter(new JsonSchema(schema), state,
   buildNamespace(state.getExtract().getNamespace(), "something"));
 Assert.assertEquals(avroSchemaToJsonElement(converter), expected);
}

代码示例来源：origin: apache/incubator-gobblin

@Override
public MessageType convertSchema(JsonArray inputSchema, WorkUnitState workUnit)
  throws SchemaConversionException {
 String fieldName = workUnit.getExtract().getTable();
 JsonSchema jsonSchema = new JsonSchema(inputSchema);
 jsonSchema.setColumnName(fieldName);
 recordConverter = new RecordConverter(jsonSchema, ROOT);
 return (MessageType) recordConverter.schema();
}

代码示例来源：origin: apache/incubator-gobblin

/**
  * Verify that each {@link Extract} created by an {@ExtractFactory} has a unique ID.
  */
 @Test
 public void testGetUniqueExtract() {
  ExtractFactory extractFactory = new ExtractFactory("yyyyMMddHHmmss");
  Set<String> extractIDs = Sets.newHashSet();
  int numOfExtracts = 100;
  for (int i = 0; i < numOfExtracts; i++) {
   extractIDs
     .add(extractFactory.getUniqueExtract(Extract.TableType.APPEND_ONLY, "namespace", "table").getExtractId());
  }
  Assert.assertEquals(extractIDs.size(), numOfExtracts);
 }
}

代码示例来源：origin: apache/incubator-gobblin

/**
 * Create a new properly populated {@link Extract} instance.
 *
 * <p>
 *   This method should always return a new unique {@link Extract} instance.
 * </p>
 *
 * @param type {@link org.apache.gobblin.source.workunit.Extract.TableType}
 * @param namespace namespace of the table this extract belongs to
 * @param table name of the table this extract belongs to
 * @return a new unique {@link Extract} instance
 *
 * @Deprecated Use {@link org.apache.gobblin.source.extractor.extract.AbstractSource#createExtract(
 *org.apache.gobblin.source.workunit.Extract.TableType, String, String)}
 */
@Deprecated
public synchronized Extract createExtract(Extract.TableType type, String namespace, String table) {
 Extract extract = new Extract(this, type, namespace, table);
 while (EXTRACT_SET.contains(extract)) {
  if (Strings.isNullOrEmpty(extract.getExtractId())) {
   extract.setExtractId(DTF.print(new DateTime()));
  } else {
   DateTime extractDateTime = DTF.parseDateTime(extract.getExtractId());
   extract.setExtractId(DTF.print(extractDateTime.plusSeconds(1)));
  }
 }
 EXTRACT_SET.add(extract);
 return extract;
}

代码示例来源：origin: org.apache.gobblin/gobblin-api

/**
 * Get the writer output file path corresponding to this {@link Extract}.
 *
 * @return writer output file path corresponding to this {@link Extract}
 * @deprecated As {@code this.getIsFull} is deprecated.
 */
@Deprecated
public String getOutputFilePath() {
 return this.getNamespace().replaceAll("\\.", "/") + "/" + this.getTable() + "/" + this.getExtractId() + "_"
   + (this.getIsFull() ? "full" : "append");
}

代码示例来源：origin: apache/incubator-gobblin

@Override
public Schema convertSchema(JsonArray schema, WorkUnitState workUnit)
  throws SchemaConversionException {
 try {
  JsonSchema jsonSchema = new JsonSchema(schema);
  jsonSchema.setColumnName(workUnit.getExtract().getTable());
  recordConverter = new RecordConverter(jsonSchema, workUnit, workUnit.getExtract().getNamespace());
 } catch (UnsupportedDateTypeException e) {
  throw new SchemaConversionException(e);
 }
 Schema recordSchema = recordConverter.schema();
 if (workUnit
   .getPropAsBoolean(CONVERTER_AVRO_NULLIFY_FIELDS_ENABLED, DEFAULT_CONVERTER_AVRO_NULLIFY_FIELDS_ENABLED)) {
  return this.generateSchemaWithNullifiedField(workUnit, recordSchema);
 }
 return recordSchema;
}

代码示例来源：origin: apache/incubator-gobblin

@Test
public void testGetDefaultWriterFilePathWithWorkUnitState() {
 String namespace = "gobblin.test";
 String tableName = "test-table";
 SourceState sourceState = new SourceState();
 WorkUnit workUnit = WorkUnit.create(new Extract(sourceState, TableType.APPEND_ONLY, namespace, tableName));
 WorkUnitState workUnitState = new WorkUnitState(workUnit);
 Assert.assertEquals(WriterUtils.getWriterFilePath(workUnitState, 0, 0), new Path(workUnitState.getExtract()
   .getOutputFilePath()));
 Assert.assertEquals(WriterUtils.getWriterFilePath(workUnitState, 2, 0), new Path(workUnitState.getExtract()
   .getOutputFilePath(), ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + "0"));
}

代码示例来源：origin: org.apache.gobblin/gobblin-core

private Extract getExtractForFile(PartitionAwareFileRetriever.FileInfo file,
  String topicName,
  String namespace,
  Map<Long, Extract> extractMap) {
 Extract extract = extractMap.get(file.getWatermarkMsSinceEpoch());
 if (extract == null) {
  // Create an extract object for the dayPath
  extract = new Extract(this.tableType, namespace, topicName);
  LOG.info("Created extract: " + extract.getExtractId() + " for path " + topicName);
  extractMap.put(file.getWatermarkMsSinceEpoch(), extract);
 }
 return extract;
}

代码示例来源：origin: org.apache.gobblin/gobblin-runtime

table.setNamespace(extract.getNamespace());
table.setName(extract.getTable());
if (extract.hasType()) {
 table.setType(TableTypeEnum.valueOf(extract.getType().name()));

代码示例来源：origin: apache/incubator-gobblin

@Test
public void schemaWithRecordOfArray()
  throws Exception {
 String testName = "schemaWithRecordOfArray";
 JsonObject schema = getSchemaData(testName).getAsJsonObject();
 JsonObject expected = getExpectedSchema(testName).getAsJsonObject();
 RecordConverter converter = new RecordConverter(new JsonSchema(schema), state,
   buildNamespace(state.getExtract().getNamespace(), "something"));
 Assert.assertEquals(avroSchemaToJsonElement(converter), expected);
}

内容来源于网络，如有侵权，请联系作者删除！

相关文章

热门标签

Java query python Node 开发语言 request Util 数据库 Table 后端算法 Logger Message Element Parser

最新文章

高级程序员和新手小白程序员区别你是那个等级看解决bug速度
浏览(1707) 发布于 8个月前
还在用双层for循环吗？太慢了
浏览(1274) 发布于 8个月前
我用EasyExcel优化了公司的导出（附踩坑记录）
浏览(1317) 发布于 8个月前
记录因Sharding Jdbc批量操作引发的一次fullGC
浏览(1124) 发布于 8个月前
进大厂必须要会的单元测试
浏览(1138) 发布于 8个月前