Usage and code examples of the org.jsoup.parser.Parser.parseInput() method

x33g5p2x, reposted on 2022-01-26 in Other

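This article collects code examples of the Java method org.jsoup.parser.Parser.parseInput() and shows how Parser.parseInput() is used in practice. The examples come mainly from platforms such as GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they should serve as a useful reference. Details of the Parser.parseInput() method:
Package path: org.jsoup.parser.Parser
Class name: Parser
Method name: parseInput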

Introduction to Parser.parseInput

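No description is available. In the examples below, parseInput is called with a String or a Reader plus a base URI, and it returns a Document.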

Code examples

Code example source: org.jsoup/jsoup

    /**
     * Parse HTML into a Document, using the provided Parser. You can provide an alternate parser, such as a simple XML
     * (non-HTML) parser.
     *
     * @param html HTML to parse
     * @param baseUri The URL where the HTML was retrieved from. Used to resolve relative URLs to absolute URLs, that occur
     * before the HTML declares a {@code <base href>} tag.
     * @param parser alternate {@link Parser#xmlParser() parser} to use.
     * @return sane HTML
     */
    public static Document parse(String html, String baseUri, Parser parser) {
        return parser.parseInput(html, baseUri);
    }
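
The method above simply delegates to parser.parseInput(html, baseUri), so the choice of parser decides how the input is interpreted. The following is a minimal sketch of that difference, assuming jsoup is on the classpath; the class name ParseInputSketch, the helper names, and the example.com base URI are illustrative and do not come from the original projects.

```java
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

// Illustrative sketch only: class and method names are hypothetical.
final class ParseInputSketch {

    /** Parses markup with the HTML parser, which builds a full html/head/body tree. */
    static Document parseAsHtml(String markup, String baseUri) {
        return Parser.htmlParser().parseInput(markup, baseUri);
    }

    /** Parses markup with the XML parser, which keeps the input structure as given. */
    static Document parseAsXml(String markup, String baseUri) {
        return Parser.xmlParser().parseInput(markup, baseUri);
    }

    public static void main(String[] args) {
        String markup = "<a href=\"/docs\">docs</a>";
        Document html = parseAsHtml(markup, "https://example.com/");
        Document xml = parseAsXml(markup, "https://example.com/");

        // The base URI passed to parseInput is used to resolve relative links.
        System.out.println(html.select("a").first().absUrl("href"));

        // Count body elements in each tree to compare the two parsers.
        System.out.println("html body count: " + html.select("body").size());
        System.out.println("xml body count: " + xml.select("body").size());
    }
}
```

Several of the examples below pass an empty string as the base URI; that is fine when relative URLs never need to be resolved.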

Code example source: org.jsoup/jsoup

    doc = parser.parseInput(docData, baseUri);
    reader.skip(1);
    try {
        doc = parser.parseInput(reader, baseUri);
    } catch (UncheckedIOException e) {

Code example source: org.kie.workbench/kie-wb-common-cli-forms-migration

    private String readTaskFormName(DataInputAssociation inputAssociation) {
        Optional<FormalExpression> optional = inputAssociation.getAssignment()
                .stream()
                .filter(assignment -> assignment.getFrom() != null && assignment.getFrom() instanceof FormalExpression)
                .map(assignment -> (FormalExpression) assignment.getFrom())
                .findAny();
        if (optional.isPresent()) {
            return Parser.xmlParser().parseInput(optional.get().getBody(), "").toString();
        }
        return "";
    }

Code example source: addthis/hydra

    try {
        Parser parser = Parser.htmlParser().setTrackErrors(0);
        @Nonnull Document doc = parser.parseInput(html, "");
        @Nonnull Elements tags = doc.select(tagName);

Code example source: DigitalPebble/storm-crawler

    /**
     * Attempt to find a META tag in the HTML that hints at the character set
     * used to write the document.
     */
    private static String getCharsetFromMeta(byte[] buffer, int maxlength) {
        // convert to UTF-8 String -- which hopefully will not mess up the
        // characters we're interested in...
        int len = buffer.length;
        if (maxlength > 0 && maxlength < len) {
            len = maxlength;
        }
        String html = new String(buffer, 0, len, DEFAULT_CHARSET);
        Document doc = Parser.htmlParser().parseInput(html, "dummy");
        // look for <meta http-equiv="Content-Type"
        // content="text/html;charset=gb2312"> or HTML5 <meta charset="gb2312">
        Elements metaElements = doc
                .select("meta[http-equiv=content-type], meta[charset]");
        String foundCharset = null;
        for (Element meta : metaElements) {
            if (meta.hasAttr("http-equiv"))
                foundCharset = getCharsetFromContentType(meta.attr("content"));
            if (foundCharset == null && meta.hasAttr("charset"))
                foundCharset = meta.attr("charset");
            if (foundCharset != null)
                return foundCharset;
        }
        return foundCharset;
    }
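
The snippet above calls getCharsetFromContentType, which is not shown in the excerpt. The following is a minimal sketch of what such a helper might look like, assuming it only needs to pull the charset parameter out of a Content-Type value with a regular expression. The class name ContentTypeCharset and the pattern are assumptions for illustration, not the storm-crawler implementation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the helper that the excerpt calls but does not show.
final class ContentTypeCharset {

    // Case-insensitive match on charset=value, allowing optional quotes around the value.
    private static final Pattern CHARSET =
            Pattern.compile("(?i)\\bcharset\\s*=\\s*[\"']?([^\\s;\"']+)");

    /** Returns the charset named in a Content-Type value, or null if none is present. */
    static String getCharsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(getCharsetFromContentType("text/html;charset=gb2312"));
        System.out.println(getCharsetFromContentType("text/html"));
    }
}
```

Returning null when no charset is found matches how the caller above uses the result: it falls through to the charset attribute check only when foundCharset is still null.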

Code example source: samczsun/Skype4J

    @Override
    public void handle(SkypeImpl skype, JsonObject resource) throws ConnectionException, ChatNotFoundException, IOException {
        String content = Utils.getString(resource, "content");
        String chatId = Utils.getString(resource, "conversationLink");
        String author = getAuthor(resource);
        Validate.notNull(content, "Null content");
        Validate.notNull(chatId, "Null chat");
        Validate.notNull(author, "Null author");
        String username = getUsername(author);
        Validate.notNull(username, "Null username");
        Chat chat = getChat(chatId, skype);
        Validate.notNull(chat, "Null chatobj");
        Participant initiator = chat.getParticipant(username);
        Validate.notNull(initiator, "Null initiator");
        Document doc = Parser.xmlParser().parseInput(content, "");
        List<ReceivedFile> receivedFiles = doc
                .getElementsByTag("file")
                .stream()
                .map(fe -> new ReceivedFileImpl(fe.text(), Long.parseLong(fe.attr("size")),
                        Long.parseLong(fe.attr("tid"))))
                .collect(Collectors.toList());
        FileReceivedEvent event = new FileReceivedEvent(chat, initiator, receivedFiles);
        skype.getEventDispatcher().callEvent(event);
    }
    },

Code example source: DigitalPebble/storm-crawler

            .decode(ByteBuffer.wrap(content)).toString();
    jsoupDoc = Parser.htmlParser().parseInput(html, url);

Code example source: org.kie.workbench.forms/kie-wb-common-forms-jbpm-integration-backend

    if (!StringUtils.isEmpty(taskName)) {
        taskName = Parser.xmlParser().parseInput(taskName,
                "").toString();
        formVariables.setTaskName(taskName);

Code example source: samczsun/Skype4J

    Participant u = getUser(from, c);
    String content = resource.get("content").asString();
    Document doc = Parser.xmlParser().parseInput(content, "");
    if (doc.getElementsByTag("meta").size() == 0) {
        throw new IllegalArgumentException("No meta? " + resource);

Code example source: DigitalPebble/storm-crawler

    @Test
    public void testExclusionCase() throws IOException {
        Config conf = new Config();
        conf.put(TextExtractor.EXCLUDE_PARAM_NAME, "style");
        TextExtractor extractor = new TextExtractor(conf);
        String content = "<html>the<STYLE>main</STYLE>content of the page</html>";
        Document jsoupDoc = Parser.htmlParser().parseInput(content,
                "http://stormcrawler.net");
        String text = extractor.text(jsoupDoc.body());
        assertEquals("the content of the page", text);
    }

Code example source: DigitalPebble/storm-crawler

    @Test
    public void testMainContent() throws IOException {
        Config conf = new Config();
        conf.put(TextExtractor.INCLUDE_PARAM_NAME, "DIV[id=\"maincontent\"]");
        TextExtractor extractor = new TextExtractor(conf);
        String content = "<html>the<div id='maincontent'>main<div>content</div></div>of the page</html>";
        Document jsoupDoc = Parser.htmlParser().parseInput(content,
                "http://stormcrawler.net");
        String text = extractor.text(jsoupDoc.body());
        assertEquals("main content", text);
    }

Code example source: DigitalPebble/storm-crawler

    @Test
    public void testExclusion() throws IOException {
        Config conf = new Config();
        conf.put(TextExtractor.EXCLUDE_PARAM_NAME, "STYLE");
        TextExtractor extractor = new TextExtractor(conf);
        String content = "<html>the<style>main</style>content of the page</html>";
        Document jsoupDoc = Parser.htmlParser().parseInput(content,
                "http://stormcrawler.net");
        String text = extractor.text(jsoupDoc.body());
        assertEquals("the content of the page", text);
    }
