Usage and code examples for the org.apache.tika.io.IOUtils class

x33g5p2x, reposted on 2022-01-21 under Other

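This article collects code examples for the Java class org.apache.tika.io.IOUtils and shows how the class is used in practice. The examples come mainly from platforms such as GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they should serve as a useful reference. Details of the IOUtils class:
Package path: org.apache.tika.io.IOUtils
Class name: IOUtils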

Introduction to IOUtils

General IO stream manipulation utilities.

This class provides static utility methods for input/output operations.

  • closeQuietly - these methods close a stream ignoring nulls and exceptions
  • toXxx/read - these methods read data from a stream
  • write - these methods write data to a stream
  • copy - these methods copy all the data from one stream to another
  • contentEquals - these methods compare the content of two streams
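
As a rough illustration of two of the method families listed above, the following JDK-only sketch mimics a closeQuietly-style and a contentEquals-style helper. The class name IoFamiliesSketch and both method bodies are assumptions made for this example; they are not copied from Tika's IOUtils.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration class, not part of Tika.
final class IoFamiliesSketch {

    // closeQuietly-style helper: tolerates null and swallows IOException.
    static void closeQuietly(Closeable resource) {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (IOException ignored) {
            // deliberately ignored
        }
    }

    // contentEquals-style helper: true only if both streams yield the same bytes.
    static boolean contentEquals(InputStream a, InputStream b) throws IOException {
        int x;
        while ((x = a.read()) != -1) {
            if (x != b.read()) {
                return false;
            }
        }
        return b.read() == -1; // b must be exhausted as well
    }

    public static void main(String[] args) throws IOException {
        InputStream left = new ByteArrayInputStream(new byte[] {1, 2, 3});
        InputStream right = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(contentEquals(left, right));
        closeQuietly(left);
        closeQuietly(right);
        closeQuietly(null); // no exception is thrown
    }
}
```

Reading one byte at a time keeps the sketch short; on unbuffered streams a real implementation would likely compare buffered chunks instead.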

The byte-to-char methods and char-to-byte methods involve a conversion step. Two methods are provided in each case, one that uses the platform default encoding and the other which allows you to specify an encoding. You are encouraged to always specify an encoding because relying on the platform default can lead to unexpected results, for example when moving from development to production.
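
To make the encoding advice concrete, here is a small JDK-only sketch showing how the same bytes decode differently under two charsets. The class name EncodingSketch and the decode helper are assumptions for this example and do not appear in IOUtils.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Tika.
final class EncodingSketch {

    // Decode bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) occupies two bytes in UTF-8: 0xC3 0xA9.
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(right.length()); // one character, as intended
        System.out.println(wrong.length()); // two characters: the text is corrupted

        // Methods without an encoding argument depend on this machine-specific value.
        System.out.println(Charset.defaultCharset());
    }
}
```

If development and production machines report different default charsets, code that omits the encoding can behave like the `wrong` case on one of them.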

All the methods in this class that read a stream are buffered internally. This means that there is no cause to use a BufferedInputStream or BufferedReader. The default buffer size of 4K has been shown to be efficient in tests.
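
The internal buffering described above usually amounts to a read/write loop over a fixed-size byte array. The sketch below shows that generic pattern with a 4096-byte buffer, matching the 4K figure in the text; the class name BufferedCopySketch is an assumption, and the loop is not taken from Tika's source.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration class, not part of Tika.
final class BufferedCopySketch {

    private static final int BUFFER_SIZE = 4096; // 4K, as mentioned in the description

    // Copies everything from in to out and returns the number of bytes copied.
    // Neither stream is flushed or closed here.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println(copied + " bytes copied");
    }
}
```

Because the loop already moves data in blocks, wrapping the source in a BufferedInputStream first would add a second buffer without reducing the number of underlying reads much.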

Wherever possible, the methods in this class do not flush or close the stream. This is to avoid making non-portable assumptions about the streams' origin and further use. Thus the caller is still responsible for closing streams after use.
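
Since the utility methods leave streams open, the caller typically owns the stream's lifetime, for example through try-with-resources. The following sketch demonstrates this with a small tracking stream; CallerClosesSketch, TrackingStream, and readAll are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration class, not part of Tika.
final class CallerClosesSketch {

    // A stream that records whether close() was called.
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Utility-style method: reads everything but does not close the stream.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        TrackingStream stream = new TrackingStream(new byte[] {1, 2, 3});
        try (TrackingStream in = stream) {
            readAll(in);
            System.out.println("closed inside try: " + in.closed);
        }
        System.out.println("closed after try: " + stream.closed);
    }
}
```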

Origin of code: Excalibur.

Code examples

Code example source: origin: apache/tika

@Override
  public int read() throws IOException {
    int c = streams[currentStreamIndex].read();
    if (c < 0) {
      IOUtils.closeQuietly(streams[currentStreamIndex]);
      while (currentStreamIndex < streams.length-1) {
        currentStreamIndex++;
        int tmpC = streams[currentStreamIndex].read();
        if (tmpC < 0) {
          IOUtils.closeQuietly(streams[currentStreamIndex]);
        } else {
          return tmpC;
        }
      }
      return -1;
    }
    return c;
  }
}
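
The read() method above walks through an array of streams, moving on to the next one when the current one is exhausted. The JDK's java.io.SequenceInputStream offers comparable chaining for simple cases; the sketch below uses it rather than the Tika class, and the names ChainedStreamsSketch and readChained are assumptions for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Tika.
final class ChainedStreamsSketch {

    // Reads two in-memory streams back to back as if they were one.
    static String readChained(String first, String second) throws IOException {
        try (InputStream in = new SequenceInputStream(
                new ByteArrayInputStream(first.getBytes(StandardCharsets.UTF_8)),
                new ByteArrayInputStream(second.getBytes(StandardCharsets.UTF_8)))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int c;
            while ((c = in.read()) != -1) { // -1 only after the last stream ends
                out.write(c);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readChained("ab", "cd"));
    }
}
```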

Code example source: origin: apache/tika

/**
 * Get the contents of an <code>InputStream</code> as a <code>byte[]</code>.
 * <p>
 * This method buffers the input internally, so there is no need to use a
 * <code>BufferedInputStream</code>.
 * 
 * @param input  the <code>InputStream</code> to read from
 * @return the requested byte array
 * @throws NullPointerException if the input is null
 * @throws IOException if an I/O error occurs
 */
public static byte[] toByteArray(InputStream input) throws IOException {
  ByteArrayOutputStream output = new ByteArrayOutputStream();
  copy(input, output);
  return output.toByteArray();
}

Code example source: origin: apache/tika

try (ByteArrayOutputStream byteStream = new ByteArrayOutputStream()) {
  IOUtils.copy(stream, byteStream);
  request.setEntity(new ByteArrayEntity(byteStream.toByteArray()));
  String replyMessage = IOUtils.toString(reply);
  if (response.getStatusLine().getStatusCode() == 200) {
    JSONObject jReply = (JSONObject) new JSONParser().parse(replyMessage);

Code example source: origin: apache/tika

public void run() {
    try {
      IOUtils.copy(stream, new NullOutputStream());
    } catch (IOException e) {
      // ignored: the stream is only being drained
    } finally {
      IOUtils.closeQuietly(stream);
    }
  }
};

Code example source: origin: apache/tika

ByteArrayOutputStream stdErrOutputStream = new ByteArrayOutputStream();
    IOUtils.copy(tempOutputFileInputStream, outputStream);
    IOUtils.closeQuietly(tikaInputStream);
  IOUtils.closeQuietly(outputStream);
  IOUtils.closeQuietly(stdErrOutputStream);
  if (process.exitValue() != 0) {
    throw new TikaException("There was an error executing the command line" +
        "\nExecutable Command:\n\n" + cmd +
        "\nExecutable Error:\n\n" + stdErrOutputStream.toString(UTF_8.name()));

Code example source: origin: apache/tika

public void run() {
  try {
    try {
      IOUtils.copy(input, output);
    } finally {
      output.close();
    }
  } catch (Exception e) {
    exception = e;
  }
}
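
The run() method above copies on a worker thread and stores any failure in a field instead of throwing it. A self-contained version of that pattern might look like the sketch below; CopyTaskSketch and getException are hypothetical names, and the copy loop is plain JDK code standing in for IOUtils.copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration class, not part of Tika.
final class CopyTaskSketch implements Runnable {

    private final InputStream input;
    private final OutputStream output;
    private volatile Exception exception; // written by the worker, read by the caller

    CopyTaskSketch(InputStream input, OutputStream output) {
        this.input = input;
        this.output = output;
    }

    @Override
    public void run() {
        try {
            try {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = input.read(buffer)) != -1) {
                    output.write(buffer, 0, n);
                }
            } finally {
                output.close(); // always signal end-of-data to the consumer
            }
        } catch (Exception e) {
            exception = e; // remember the failure for the caller
        }
    }

    Exception getException() {
        return exception;
    }

    public static void main(String[] args) throws InterruptedException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        CopyTaskSketch task =
                new CopyTaskSketch(new ByteArrayInputStream(new byte[] {1, 2, 3}), sink);
        Thread worker = new Thread(task);
        worker.start();
        worker.join(); // wait before inspecting the result
        System.out.println(task.getException() == null
                ? "copied " + sink.size() + " bytes"
                : "failed: " + task.getException());
    }
}
```

The caller checks getException() only after join(), so the worker has finished writing the field by then.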

Code example source: origin: apache/tika

/**
 * Translate the given text InputStream to and from the given languages.
 * @see org.apache.tika.language.translate.Translator
 * @param text The text to translate.
 * @param sourceLanguage The input text language (for example, "hi").
 * @param targetLanguage The desired output language (for example, "fr").
 * @return The translated text. If translation is unavailable (client keys not set), returns the same text back.
 */
public String translate(InputStream text, String sourceLanguage, String targetLanguage){
  try {
    return translator.translate(IOUtils.toString(text), sourceLanguage, targetLanguage);
  } catch (Exception e){
    throw new IllegalStateException("Error translating data.", e);
  }
}

Code example source: origin: info.magnolia/magnolia-4-5-migration

static String htmlHeader(String title) {
  String header = "<html>\n";
  header += "  <head>\n";
  header += "    <title>Migration report" + (StringUtils.isBlank(title) ? "" : ": " + title) + "</title>\n";
  String style = null;
  try {
    String[] resources = ClasspathResourcesUtil.findResources("/report-generator/migration-report.css");
    if (resources != null && resources.length > 0) {
      InputStream in = DefaultReportGenerator.class.getResourceAsStream(resources[0]);
      try {
        style = IOUtils.toString(in);
      } finally {
        // close the stream even if reading fails
        IOUtils.closeQuietly(in);
      }
    }
  } catch (IOException e) {
    log.error("Cannot load CSS style: "+e.getMessage());
    log.debug("Cannot load CSS style.", e);
  }
  if (style!=null) {
    header += "<style>\n";
    header += style;
    header += "</style>\n";
  }
  header += "  </head>\n";
  header += "  <body>\n";
  header += "  <h1>Migration report</h1>\n\n";
  return header;
}

Code example source: origin: com.google.code.crawler-commons/crawler-commons

byte[] content = IOUtils.toByteArray(fis);
  long stopTime = System.currentTimeMillis();
  long totalReadTime = Math.max(1, stopTime - startTime);
  throw new IOFetchException(url, e);
} finally {
  IOUtils.closeQuietly(fis);

Code example source: origin: apache/tika

/**
 * Get the contents of an <code>InputStream</code> as a list of Strings,
 * one entry per line, decoding the bytes as UTF-8.
 * <p>
 * This method buffers the input internally, so there is no need to use a
 * <code>BufferedInputStream</code>.
 *
 * @param input  the <code>InputStream</code> to read from, not null
 * @return the list of Strings, never null
 * @throws NullPointerException if the input is null
 * @throws IOException if an I/O error occurs
 * @since Commons IO 1.1
 */
public static List<String> readLines(InputStream input) throws IOException {
  InputStreamReader reader = new InputStreamReader(input, UTF_8);
  return readLines(reader);
}
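
The readLines(Reader) overload that the method above delegates to is not shown here. A plausible JDK-only equivalent, assuming it reads line by line through a BufferedReader, is sketched below; ReadLinesSketch is a hypothetical class name and the body is not Tika's source.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of Tika.
final class ReadLinesSketch {

    // Reads all lines from the stream using the given charset.
    // The stream is left open; closing it is the caller's job.
    static List<String> readLines(InputStream input, Charset charset) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(input, charset));
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line); // line terminators are stripped by readLine()
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "first\nsecond\r\nthird".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(text), StandardCharsets.UTF_8));
    }
}
```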

Code example source: origin: apache/tika

private static void benchmark(File file) throws Exception {
  if (file.isHidden()) {
    // ignore
  } else if (file.isFile()) {
    try (InputStream input = new FileInputStream(file)) {
      byte[] content = IOUtils.toByteArray(input);
      String type =
          tika.detect(new ByteArrayInputStream(content));
      long start = System.currentTimeMillis();
      for (int i = 0; i < 1000; i++) {
        tika.detect(new ByteArrayInputStream(content));
      }
      // total milliseconds for 1000 calls equals microseconds per call
      System.out.printf(
          Locale.ROOT,
          "%6dus per Tika.detect(%s) = %s%n",
          System.currentTimeMillis() - start, file, type);
    }
  } else if (file.isDirectory()) {
    File[] children = file.listFiles();
    if (children != null) { // listFiles() returns null on an I/O error
      for (File child : children) {
        benchmark(child);
      }
    }
  }
}

Code example source: origin: apache/tika

/**
 * Convert the specified CharSequence to an input stream, encoded as bytes
 * using the default character encoding of the platform.
 *
 * @param input the CharSequence to convert
 * @return an input stream
 * @since IO 2.0
 */
public static InputStream toInputStream(CharSequence input) {
  return toInputStream(input.toString());
}
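
Converting a CharSequence to an InputStream generally means encoding its characters to bytes and wrapping them in a ByteArrayInputStream. The sketch below shows that idea with an explicit charset; ToInputStreamSketch is a hypothetical class, and the body is an assumption about the general technique rather than Tika's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Tika.
final class ToInputStreamSketch {

    // Encodes the characters with the given charset and exposes them as a stream.
    static InputStream toInputStream(CharSequence input, Charset charset) {
        return new ByteArrayInputStream(input.toString().getBytes(charset));
    }

    public static void main(String[] args) throws IOException {
        CharSequence text = new StringBuilder("hello");
        try (InputStream in = toInputStream(text, StandardCharsets.UTF_8)) {
            int b;
            while ((b = in.read()) != -1) {
                System.out.print((char) b); // safe here because the input is ASCII
            }
            System.out.println();
        }
    }
}
```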

Code example source: origin: apache/tika

public void run() {
    OutputStream stdin = process.getOutputStream();
    try {
      IOUtils.copy(stream, stdin);
    } catch (IOException e) {
      // ignored: a failed write to the process's stdin is not reported
    }
  }
};

Code example source: origin: apache/tika

/**
 * Translate the given text InputStream to the given language, attempting to auto-detect the source language.
 * This does not close the stream, so the caller has the responsibility of closing it.
 * @see org.apache.tika.language.translate.Translator
 * @param text The text to translate.
 * @param targetLanguage The desired output language (for example, "en").
 * @return The translated text. If translation is unavailable (client keys not set), returns the same text back.
 */
public String translate(InputStream text, String targetLanguage){
  try {
    return translator.translate(IOUtils.toString(text), targetLanguage);
  } catch (Exception e){
    throw new IllegalStateException("Error translating data.", e);
  }
}

Code example source: origin: apache/tika

/**
 * Get the contents of an <code>InputStream</code> as a list of Strings,
 * one entry per line, using the specified character encoding.
 * <p>
 * Character encoding names can be found at
 * <a href="http://www.iana.org/assignments/character-sets">IANA</a>.
 * <p>
 * This method buffers the input internally, so there is no need to use a
 * <code>BufferedInputStream</code>.
 *
 * @param input  the <code>InputStream</code> to read from, not null
 * @param encoding  the encoding to use, null delegates to <code>readLines(InputStream)</code>
 * @return the list of Strings, never null
 * @throws NullPointerException if the input is null
 * @throws IOException if an I/O error occurs
 * @since Commons IO 1.1
 */
public static List<String> readLines(InputStream input, String encoding) throws IOException {
  if (encoding == null) {
    return readLines(input);
  } else {
    InputStreamReader reader = new InputStreamReader(input, encoding);
    return readLines(reader);
  }
}

Code example source: origin: apache/tika

@Test
public void testDetectApplicationEnviHdr() throws Exception {
  InputStream iStream = MagicDetectorTest.class.getResourceAsStream(
     "/test-documents/ang20150420t182050_corr_v1e_img.hdr");
  byte[] data = IOUtils.toByteArray(iStream);
  MediaType testMT = new MediaType("application", "envi.hdr");
  Detector detector = new MagicDetector(testMT, data, null, false, 0, 0);
  // Deliberately prevent InputStream.read(...) from reading the entire
  // buffer in one go
  InputStream stream = new RestrictiveInputStream(data);
  assertEquals(testMT, detector.detect(stream, new Metadata()));
}

Code example source: origin: apache/tika

/**
 * Convert the specified CharSequence to an input stream, encoded as bytes
 * using the specified character encoding.
 * <p>
 * Character encoding names can be found at
 * <a href="http://www.iana.org/assignments/character-sets">IANA</a>.
 *
 * @param input the CharSequence to convert
 * @param encoding the encoding to use, null means platform default
 * @throws IOException if the encoding is invalid
 * @return an input stream
 * @since IO 2.0
 */
public static InputStream toInputStream(CharSequence input, String encoding) throws IOException {
  return toInputStream(input.toString(), encoding);
}

Code example source: origin: apache/tika

/**
 * Get the contents of a <code>Reader</code> as a <code>byte[]</code>
 * using the default character encoding of the platform.
 * <p>
 * This method buffers the input internally, so there is no need to use a
 * <code>BufferedReader</code>.
 * 
 * @param input  the <code>Reader</code> to read from
 * @return the requested byte array
 * @throws NullPointerException if the input is null
 * @throws IOException if an I/O error occurs
 */
public static byte[] toByteArray(Reader input) throws IOException {
  ByteArrayOutputStream output = new ByteArrayOutputStream();
  copy(input, output);
  return output.toByteArray();
}
