org.apache.gobblin.source.workunit.Extract类的使用及代码示例

x33g5p2x  于2022-01-19 转载在 其他  
字(10.1k)|赞(0)|评价(0)|浏览(128)

本文整理了Java中org.apache.gobblin.source.workunit.Extract类的一些代码示例,展示了Extract类的具体用法。这些代码示例主要来源于Github/Stackoverflow/Maven等平台,是从一些精选项目中提取出来的代码,具有较强的参考意义,能在一定程度帮忙到你。Extract类的具体详情如下:
包路径:org.apache.gobblin.source.workunit.Extract
类名称:Extract

Extract介绍

[英]A class representing all the base attributes required by all tables types. Subclasses will be expected to validate each table type for their respective required attributes.

The extract ID only needs to be unique for Extracts belonging to the same namespace/table. One or more WorkUnits can share the same extract ID. WorkUnits that do share an extract ID will be considered parts of a single Extract for the purpose of applying publishing policies.
[中]表示所有表类型所需的所有基本属性的类。子类需要验证每个表类型各自所需的属性。
对于属于同一命名空间/表的摘录,摘录ID只需要是唯一的。一个或多个工作单元可以共享相同的提取ID。共享提取ID的工作单元将被视为单个提取的一部分,以便应用发布策略。

代码示例

代码示例来源:origin: apache/incubator-gobblin

  1. /**
  2. * Get the {@link org.apache.gobblin.source.workunit.Extract} associated with the {@link WorkUnit}.
  3. *
  4. * @return {@link org.apache.gobblin.source.workunit.Extract} associated with the {@link WorkUnit}
  5. */
  6. public Extract getExtract() {
  7. return new Extract(this.workUnit.getExtract());
  8. }

代码示例来源:origin: apache/incubator-gobblin

  1. @Override
  2. public int hashCode() {
  3. return (this.getNamespace() + this.getTable() + this.getExtractId()).hashCode();
  4. }

代码示例来源:origin: apache/incubator-gobblin

  1. /**
  2. * Set a (non-globally) unique ID for this {@link Extract}.
  3. *
  4. * @param extractId unique ID for this {@link Extract}
  5. */
  6. public void setExtractId(String extractId) {
  7. setProp(ConfigurationKeys.EXTRACT_EXTRACT_ID_KEY, extractId);
  8. }

代码示例来源:origin: apache/incubator-gobblin

  1. /**
  2. * Get the writer output file path corresponding to this {@link Extract}.
  3. *
  4. * @return writer output file path corresponding to this {@link Extract}
  5. * @deprecated As {@code this.getIsFull} is deprecated.
  6. */
  7. @Deprecated
  8. public String getOutputFilePath() {
  9. return this.getNamespace().replaceAll("\\.", "/") + "/" + this.getTable() + "/" + this.getExtractId() + "_"
  10. + (this.getIsFull() ? "full" : "append");
  11. }

代码示例来源:origin: apache/incubator-gobblin

  1. /**
  2. * Add more primary keys to the existing set of primary keys.
  3. *
  4. * @param primaryKeyFieldName primary key names
  5. * @deprecated @deprecated It is recommended to add primary keys in {@code WorkUnit} instead of {@code Extract}.
  6. */
  7. @Deprecated
  8. public void addPrimaryKey(String... primaryKeyFieldName) {
  9. StringBuilder sb = new StringBuilder(getProp(ConfigurationKeys.EXTRACT_PRIMARY_KEY_FIELDS_KEY, ""));
  10. Joiner.on(",").appendTo(sb, primaryKeyFieldName);
  11. setProp(ConfigurationKeys.EXTRACT_PRIMARY_KEY_FIELDS_KEY, sb.toString());
  12. }

代码示例来源:origin: apache/incubator-gobblin

  1. /**
  2. * Returns a unique {@link Extract} instance.
  3. * Any two calls of this method from the same {@link ExtractFactory} instance guarantees to
  4. * return {@link Extract}s with different IDs.
  5. *
  6. * @param type {@link TableType}
  7. * @param namespace dot separated namespace path
  8. * @param table table name
  9. * @return a unique {@link Extract} instance
  10. */
  11. public synchronized Extract getUniqueExtract(TableType type, String namespace, String table) {
  12. Extract newExtract = new Extract(type, namespace, table);
  13. while (this.createdInstances.contains(newExtract)) {
  14. if (Strings.isNullOrEmpty(newExtract.getExtractId())) {
  15. newExtract.setExtractId(this.dtf.print(new DateTime()));
  16. } else {
  17. DateTime extractDateTime = this.dtf.parseDateTime(newExtract.getExtractId());
  18. newExtract.setExtractId(this.dtf.print(extractDateTime.plusSeconds(1)));
  19. }
  20. }
  21. this.createdInstances.add(newExtract);
  22. return newExtract;
  23. }
  24. }

代码示例来源:origin: apache/incubator-gobblin

  1. if (previousExtract.getNamespace().equals(namespace) && previousExtract.getTable().equals(table)) {
  2. this.previousTableState.addAll(pre);

代码示例来源:origin: apache/incubator-gobblin

  1. @Test
  2. public void testGetDefaultWriterFilePath() {
  3. String namespace = "gobblin.test";
  4. String tableName = "test-table";
  5. SourceState sourceState = new SourceState();
  6. WorkUnit state = WorkUnit.create(new Extract(sourceState, TableType.APPEND_ONLY, namespace, tableName));
  7. Assert.assertEquals(WriterUtils.getWriterFilePath(state, 0, 0), new Path(state.getExtract().getOutputFilePath()));
  8. Assert.assertEquals(WriterUtils.getWriterFilePath(state, 2, 0), new Path(state.getExtract().getOutputFilePath(),
  9. ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + "0"));
  10. }

代码示例来源:origin: apache/incubator-gobblin

  1. private Extract getExtractForFile(PartitionAwareFileRetriever.FileInfo file,
  2. String topicName,
  3. String namespace,
  4. Map<Long, Extract> extractMap) {
  5. Extract extract = extractMap.get(file.getWatermarkMsSinceEpoch());
  6. if (extract == null) {
  7. // Create an extract object for the dayPath
  8. extract = new Extract(this.tableType, namespace, topicName);
  9. LOG.info("Created extract: " + extract.getExtractId() + " for path " + topicName);
  10. extractMap.put(file.getWatermarkMsSinceEpoch(), extract);
  11. }
  12. return extract;
  13. }

代码示例来源:origin: apache/incubator-gobblin

  1. table.setNamespace(extract.getNamespace());
  2. table.setName(extract.getTable());
  3. if (extract.hasType()) {
  4. table.setType(TableTypeEnum.valueOf(extract.getType().name()));

代码示例来源:origin: apache/incubator-gobblin

  1. @Test
  2. public void schemaWithRecordOfEnum()
  3. throws Exception {
  4. String testName = "schemaWithRecordOfEnum";
  5. JsonObject schema = getSchemaData(testName).getAsJsonObject();
  6. JsonObject expected = getExpectedSchema(testName).getAsJsonObject();
  7. RecordConverter converter = new RecordConverter(new JsonSchema(schema), state,
  8. buildNamespace(state.getExtract().getNamespace(), "something"));
  9. Assert.assertEquals(avroSchemaToJsonElement(converter), expected);
  10. }

代码示例来源:origin: apache/incubator-gobblin

  1. @Override
  2. public MessageType convertSchema(JsonArray inputSchema, WorkUnitState workUnit)
  3. throws SchemaConversionException {
  4. String fieldName = workUnit.getExtract().getTable();
  5. JsonSchema jsonSchema = new JsonSchema(inputSchema);
  6. jsonSchema.setColumnName(fieldName);
  7. recordConverter = new RecordConverter(jsonSchema, ROOT);
  8. return (MessageType) recordConverter.schema();
  9. }

代码示例来源:origin: apache/incubator-gobblin

  1. /**
  2. * Verify that each {@link Extract} created by an {@ExtractFactory} has a unique ID.
  3. */
  4. @Test
  5. public void testGetUniqueExtract() {
  6. ExtractFactory extractFactory = new ExtractFactory("yyyyMMddHHmmss");
  7. Set<String> extractIDs = Sets.newHashSet();
  8. int numOfExtracts = 100;
  9. for (int i = 0; i < numOfExtracts; i++) {
  10. extractIDs
  11. .add(extractFactory.getUniqueExtract(Extract.TableType.APPEND_ONLY, "namespace", "table").getExtractId());
  12. }
  13. Assert.assertEquals(extractIDs.size(), numOfExtracts);
  14. }
  15. }

代码示例来源:origin: apache/incubator-gobblin

  1. /**
  2. * Create a new properly populated {@link Extract} instance.
  3. *
  4. * <p>
  5. * This method should always return a new unique {@link Extract} instance.
  6. * </p>
  7. *
  8. * @param type {@link org.apache.gobblin.source.workunit.Extract.TableType}
  9. * @param namespace namespace of the table this extract belongs to
  10. * @param table name of the table this extract belongs to
  11. * @return a new unique {@link Extract} instance
  12. *
  13. * @Deprecated Use {@link org.apache.gobblin.source.extractor.extract.AbstractSource#createExtract(
  14. *org.apache.gobblin.source.workunit.Extract.TableType, String, String)}
  15. */
  16. @Deprecated
  17. public synchronized Extract createExtract(Extract.TableType type, String namespace, String table) {
  18. Extract extract = new Extract(this, type, namespace, table);
  19. while (EXTRACT_SET.contains(extract)) {
  20. if (Strings.isNullOrEmpty(extract.getExtractId())) {
  21. extract.setExtractId(DTF.print(new DateTime()));
  22. } else {
  23. DateTime extractDateTime = DTF.parseDateTime(extract.getExtractId());
  24. extract.setExtractId(DTF.print(extractDateTime.plusSeconds(1)));
  25. }
  26. }
  27. EXTRACT_SET.add(extract);
  28. return extract;
  29. }

代码示例来源:origin: org.apache.gobblin/gobblin-api

  1. /**
  2. * Get the writer output file path corresponding to this {@link Extract}.
  3. *
  4. * @return writer output file path corresponding to this {@link Extract}
  5. * @deprecated As {@code this.getIsFull} is deprecated.
  6. */
  7. @Deprecated
  8. public String getOutputFilePath() {
  9. return this.getNamespace().replaceAll("\\.", "/") + "/" + this.getTable() + "/" + this.getExtractId() + "_"
  10. + (this.getIsFull() ? "full" : "append");
  11. }

代码示例来源:origin: apache/incubator-gobblin

  1. @Override
  2. public Schema convertSchema(JsonArray schema, WorkUnitState workUnit)
  3. throws SchemaConversionException {
  4. try {
  5. JsonSchema jsonSchema = new JsonSchema(schema);
  6. jsonSchema.setColumnName(workUnit.getExtract().getTable());
  7. recordConverter = new RecordConverter(jsonSchema, workUnit, workUnit.getExtract().getNamespace());
  8. } catch (UnsupportedDateTypeException e) {
  9. throw new SchemaConversionException(e);
  10. }
  11. Schema recordSchema = recordConverter.schema();
  12. if (workUnit
  13. .getPropAsBoolean(CONVERTER_AVRO_NULLIFY_FIELDS_ENABLED, DEFAULT_CONVERTER_AVRO_NULLIFY_FIELDS_ENABLED)) {
  14. return this.generateSchemaWithNullifiedField(workUnit, recordSchema);
  15. }
  16. return recordSchema;
  17. }

代码示例来源:origin: apache/incubator-gobblin

  1. @Test
  2. public void testGetDefaultWriterFilePathWithWorkUnitState() {
  3. String namespace = "gobblin.test";
  4. String tableName = "test-table";
  5. SourceState sourceState = new SourceState();
  6. WorkUnit workUnit = WorkUnit.create(new Extract(sourceState, TableType.APPEND_ONLY, namespace, tableName));
  7. WorkUnitState workUnitState = new WorkUnitState(workUnit);
  8. Assert.assertEquals(WriterUtils.getWriterFilePath(workUnitState, 0, 0), new Path(workUnitState.getExtract()
  9. .getOutputFilePath()));
  10. Assert.assertEquals(WriterUtils.getWriterFilePath(workUnitState, 2, 0), new Path(workUnitState.getExtract()
  11. .getOutputFilePath(), ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + "0"));
  12. }

代码示例来源:origin: org.apache.gobblin/gobblin-core

  1. private Extract getExtractForFile(PartitionAwareFileRetriever.FileInfo file,
  2. String topicName,
  3. String namespace,
  4. Map<Long, Extract> extractMap) {
  5. Extract extract = extractMap.get(file.getWatermarkMsSinceEpoch());
  6. if (extract == null) {
  7. // Create an extract object for the dayPath
  8. extract = new Extract(this.tableType, namespace, topicName);
  9. LOG.info("Created extract: " + extract.getExtractId() + " for path " + topicName);
  10. extractMap.put(file.getWatermarkMsSinceEpoch(), extract);
  11. }
  12. return extract;
  13. }

代码示例来源:origin: org.apache.gobblin/gobblin-runtime

  1. table.setNamespace(extract.getNamespace());
  2. table.setName(extract.getTable());
  3. if (extract.hasType()) {
  4. table.setType(TableTypeEnum.valueOf(extract.getType().name()));

代码示例来源:origin: apache/incubator-gobblin

  1. @Test
  2. public void schemaWithRecordOfArray()
  3. throws Exception {
  4. String testName = "schemaWithRecordOfArray";
  5. JsonObject schema = getSchemaData(testName).getAsJsonObject();
  6. JsonObject expected = getExpectedSchema(testName).getAsJsonObject();
  7. RecordConverter converter = new RecordConverter(new JsonSchema(schema), state,
  8. buildNamespace(state.getExtract().getNamespace(), "something"));
  9. Assert.assertEquals(avroSchemaToJsonElement(converter), expected);
  10. }

相关文章