Usage and code examples for the org.jsoup.Jsoup.parse() method

x33g5p2x, reposted on 2022-01-21 in "Other"

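This article collects code examples for the Java method org.jsoup.Jsoup.parse() and shows how it is used in practice. The examples are taken from selected projects on platforms such as GitHub, Stack Overflow, and Maven, and are intended as a practical reference. Details of Jsoup.parse():
Fully qualified class: org.jsoup.Jsoup
Class name: Jsoup
Method name: parse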

About Jsoup.parse

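Parse the contents of a file as HTML. The location of the file is used as the base URI to qualify relative URLs.

This description belongs to the overload that takes a File. Most of the examples below use the String overloads, parse(String html) and parse(String html, String baseUri), where the base URI is either empty or passed in explicitly.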
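To illustrate what "base URI" means here, the following minimal sketch parses the same HTML string with and without a base URI and reads the first link back. The class name JsoupParseSketch, its two helper methods, and the example.com URL are assumptions made for illustration; they do not come from any of the projects quoted in this article.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupParseSketch {

    // Parses without a base URI and returns the href exactly as written in the markup.
    public static String firstRawLink(String html) {
        Document doc = Jsoup.parse(html);
        Element link = doc.select("a[href]").first();
        return link == null ? "" : link.attr("href");
    }

    // Parses with a base URI so that absUrl() can resolve a relative href against it.
    public static String firstAbsoluteLink(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        Element link = doc.select("a[href]").first();
        return link == null ? "" : link.absUrl("href");
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"/docs/page.html\">Docs</a></p>";
        // Hypothetical base URI, used only for this illustration.
        String baseUri = "https://example.com/index.html";

        System.out.println(firstRawLink(html));               // /docs/page.html
        System.out.println(firstAbsoluteLink(html, baseUri)); // https://example.com/docs/page.html
    }
}
```

Passing the base URI at parse time matters mainly when relative links are resolved later, as in the crawler-style examples below that call parse(text, url).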

Code examples

Code example source: code4craft/webmagic

    public Html(String text, String url) {
        try {
            // Parse with the page URL as base URI so relative links can be resolved.
            this.document = Jsoup.parse(text, url);
        } catch (Exception e) {
            this.document = null;
            logger.warn("parse document error ", e);
        }
    }

Code example source: code4craft/webmagic

    public Html(String text) {
        try {
            this.document = Jsoup.parse(text);
        } catch (Exception e) {
            this.document = null;
            logger.warn("parse document error ", e);
        }
    }

Code example source: square/retrofit

    @Override public Page convert(ResponseBody responseBody) throws IOException {
        Document document = Jsoup.parse(responseBody.string());
        // Collect the href of every anchor that has one.
        List<String> links = new ArrayList<>();
        for (Element element : document.select("a[href]")) {
            links.add(element.attr("href"));
        }
        return new Page(document.title(), Collections.unmodifiableList(links));
    }

Code example source: code4craft/webmagic

    public List<Element> selectElements(String text) {
        if (text != null) {
            return selectElements(Jsoup.parse(text));
        } else {
            return new ArrayList<Element>();
        }
    }

Code example source: code4craft/webmagic

    @Override
    public String select(String text) {
        if (text != null) {
            return select(Jsoup.parse(text));
        }
        return null;
    }

Code example source: code4craft/webmagic

    @Override
    public List<String> selectList(String text) {
        if (text != null) {
            return selectList(Jsoup.parse(text));
        } else {
            return new ArrayList<String>();
        }
    }

Code example source: code4craft/webmagic

    public Element selectElement(String text) {
        if (text != null) {
            return selectElement(Jsoup.parse(text));
        }
        return null;
    }

Code example source: jphp-group/jphp

    @Signature
    public static Document parseText(String text, String baseUri) {
        return Jsoup.parse(text, baseUri);
    }

Code example source: ChinaSilence/any-video

    /**
     * Text preprocessing:
     * lowercase English -> remove code blocks -> strip HTML tags
     */
    private String preHandle(String content) {
        content = content.toLowerCase();
        content = content.replaceAll(" ", "").replaceAll("<code[\\s\\S]*?</code>", "");
        // Parsing and calling text() drops all remaining markup.
        return Jsoup.parse(content).text();
    }

Code example source: square/okhttp

    Document document = Jsoup.parse(response.body().string(), url.toString());
    for (Element element : document.select("a[href]")) {
        String href = element.attr("href");
        // ... (the rest of the loop body is cut off in the source)
    }

Code example source: k9mail/k-9

    /**
     * Convert an HTML string to a plain text string.
     * @param html HTML string to convert.
     * @return Plain text result.
     */
    public static String htmlToText(final String html) {
        Document document = Jsoup.parse(html);
        return HtmlToPlainText.toPlainText(document.body())
                .replace(PREVIEW_OBJECT_CHARACTER, PREVIEW_OBJECT_REPLACEMENT)
                .replace(NBSP_CHARACTER, NBSP_REPLACEMENT);
    }

Code example source: k9mail/k-9

    public Document sanitize(String html) {
        Document dirtyDocument = Jsoup.parse(html);
        Document cleanedDocument = cleaner.clean(dirtyDocument);
        headCleaner.clean(dirtyDocument, cleanedDocument);
        return cleanedDocument;
    }

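Code example source: javaee-samples/javaee7-samples

    public static String formatHTML(String html) {
        try {
            // parse and xmlParser are presumably static imports
            // (Jsoup.parse and Parser.xmlParser) in the original file.
            return parse(html, "", xmlParser()).toString();
        } catch (Exception e) {
            // Fall back to the unformatted input if parsing fails.
            return html;
        }
    }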

Code example source: JpressProjects/jpress

    public static String getText(String html) {
        if (StrUtils.isBlank(html)) {
            return html;
        }
        return Jsoup.parse(html).text();
    }

Code example source: k9mail/k-9

    private void assertHtmlContainsElement(String html, String cssQuery, int numberOfExpectedOccurrences) {
        Document document = Jsoup.parse(html);
        int numberOfFoundElements = document.select(cssQuery).size();
        assertEquals("Expected to find '" + cssQuery + "' " + numberOfExpectedOccurrences + " time(s) in:\n" + html,
                numberOfExpectedOccurrences, numberOfFoundElements);
    }

Code example source: JpressProjects/jpress

    public String replaceSrcTemplateSrcPath(String content) {
        if (StrUtils.isBlank(content)) {
            return content;
        }
        Document doc = Jsoup.parse(content);
        // Rewrite the resource paths of scripts, images, and stylesheets.
        Elements jsElements = doc.select("script[src]");
        replace(jsElements, "src");
        Elements imgElements = doc.select("img[src]");
        replace(imgElements, "src");
        Elements linkElements = doc.select("link[href]");
        replace(linkElements, "href");
        return doc.toString();
    }

Code example source: jphp-group/jphp

    @Signature
    public static Document parse(Environment env, Memory source, String encoding, String baseUri) throws IOException {
        InputStream is = Stream.getInputStream(env, source);
        try {
            // InputStream overload: charset name and base URI are passed explicitly.
            return Jsoup.parse(is, encoding, baseUri);
        } finally {
            Stream.closeStream(env, is);
        }
    }

Code example source: seven332/EhViewer

    public static String parse(String body) throws ParseException {
        try {
            Document d = Jsoup.parse(body, EhUrl.URL_FORUMS);
            Element userlinks = d.getElementById("userlinks");
            Element child = userlinks.child(0).child(0).child(0);
            return child.attr("href");
        } catch (Throwable e) {
            // Any failure, including a missing element, is reported as a ParseException.
            ExceptionUtils.throwIfFatal(e);
            throw new ParseException("Parse forums error", body);
        }
    }

Code example source: k9mail/k-9

    @Test
    public void wrapMessageContent_putsMessageContentInBody() {
        String content = "Some text";
        String html = HtmlConverter.wrapMessageContent(content);
        assertEquals(content, Jsoup.parse(html).body().text());
    }

Code example source: k9mail/k-9

    private String stripSignatureInternal(String content) {
        Document document = Jsoup.parse(content);
        AdvancedNodeTraversor nodeTraversor = new AdvancedNodeTraversor(new StripSignatureFilter());
        nodeTraversor.filter(document.body());
        return HtmlProcessor.toCompactString(document);
    }
