Usage and code examples of the org.jsoup.nodes.Document.location() method

x33g5p2x, reposted on 2022-01-18 under Other

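This article collects code examples of the org.jsoup.nodes.Document.location() method in Java and shows how Document.location() is used in practice. The examples are drawn mainly from GitHub, Stack Overflow, and Maven, and were extracted from selected projects, so they should serve as a useful reference. Details of the Document.location() method:
Package: org.jsoup.nodes
Class name: Document
Method name: location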

About Document.location

Get the URL this Document was parsed from. If the starting URL is a redirect, this will return the final URL from which the document was served.
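
As a minimal sketch of the behavior described above, the following assumes jsoup is on the classpath and parses an in-memory string rather than fetching over the network. The class name LocationDemo, its helper methods, and the example.com URL are illustrative and not part of jsoup. For a document parsed from a string, location() reports the base URI passed to Jsoup.parse, or an empty string when none was given.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class LocationDemo {
    /** Parses the HTML with the given base URI and returns what location() reports. */
    public static String locationOf(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        return doc.location();
    }

    /** Parses the HTML without a base URI; location() is then an empty string. */
    public static String locationWithoutBase(String html) {
        return Jsoup.parse(html).location();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Demo</title></head><body><p>Hi</p></body></html>";
        // Hypothetical URL used only for illustration.
        System.out.println(locationOf(html, "https://example.com/docs/page.html"));
        System.out.println(locationWithoutBase(html).isEmpty());
    }
}
```

When the document comes from Jsoup.connect(url).get() instead, location() reports the final URL after any redirects, as the description above states.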

Code examples

Code example source: RipMeApp/ripme

private List<String> getURLsFromChap(Document doc) {
  LOGGER.debug("Getting urls from " + doc.location());
  List<String> result = new ArrayList<>();
  for (Element el : doc.select(".vung-doc > img")) {
    result.add(el.attr("src"));
  }
  return result;
}

Code example source: RipMeApp/ripme

@Override
protected List<String> getURLsFromPage(Document page) {
  JSONObject collectionData = getCollectionData(page);
  if (collectionData == null) {
    LOGGER.error("Unable to find JSON data at URL: " + page.location());
    // probably better than returning null, as the ripper will display
    // that nothing was found instead of a NullPointerException
    return new ArrayList<>();
  } else {
    return getImageURLs(collectionData);
  }
}

Code example source: RipMeApp/ripme

@Override
protected void downloadURL(URL url, int index) {
  addURLToDownload(url, getPrefix(++this.index), currAlbum.location,
      currAlbum.currPage.location(), currAlbum.cookies);
}

Code example source: org.jsoup/jsoup

/**
 * Converts a jsoup document into the provided W3C Document. If required, you can set options on the output document
 * before converting.
 * @param in jsoup doc
 * @param out w3c doc
 * @see org.jsoup.helper.W3CDom#fromJsoup(org.jsoup.nodes.Document)
 */
public void convert(org.jsoup.nodes.Document in, Document out) {
  if (!StringUtil.isBlank(in.location()))
    out.setDocumentURI(in.location());
  org.jsoup.nodes.Element rootEl = in.child(0); // skip the #root node
  NodeTraversor.traverse(new W3CBuilder(out), rootEl);
}
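
The conversion above can be exercised through W3CDom.fromJsoup, which the Javadoc references. The sketch below assumes jsoup is on the classpath; the class W3cLocationDemo, its helper method, and the example.com URL are hypothetical names chosen for illustration. It shows that a non-blank location() ends up as the document URI of the W3C document.

```java
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;

public class W3cLocationDemo {
    /** Converts a jsoup document to a W3C document and returns its document URI. */
    public static String documentUriOf(String html, String baseUri) {
        org.jsoup.nodes.Document jsoupDoc = Jsoup.parse(html, baseUri);
        // fromJsoup creates a new W3C document and copies the jsoup tree into it.
        org.w3c.dom.Document w3cDoc = new W3CDom().fromJsoup(jsoupDoc);
        return w3cDoc.getDocumentURI();
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello</p></body></html>";
        // Hypothetical URL used only for illustration.
        System.out.println(documentUriOf(html, "https://example.com/hello.html"));
    }
}
```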

Code example source: RipMeApp/ripme

// Excerpt: these lines are not contiguous in the original method.
throw new IOException("No images found at " + doc.location());
// ...
LOGGER.debug("Fetching description(s) from " + doc.location());
List<String> textURLs = getDescriptionsFromPage(doc);
if (!textURLs.isEmpty()) {
  LOGGER.debug("Found description link(s) from " + doc.location());
  for (String textURL : textURLs) {
    if (isStopped()) {
      // ... (truncated in the source)

Code example source: 4pr0n/ripme

throw new IOException("No images found at " + doc.location());

Code example source: g00glen00b/spring-samples

private Mono<CrawlerResult> getCrawlerResult(Document document, int depth) {
  return Flux.fromIterable(document.getElementsByTag("a"))
    .map(element -> element.absUrl("href"))
    .collectList()
    .map(hyperlinks -> new CrawlerResult(document.location(), document.title(), document.text(), hyperlinks, depth));
}
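
For readers not using Reactor, the same idea as the crawler example above can be sketched with plain collections. This assumes jsoup is on the classpath; LinkCollector, absoluteLinks, and the example.com URL are hypothetical. Unlike the original, this version skips links that absUrl cannot resolve (it returns an empty string in that case).

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LinkCollector {
    /** Returns the absolute URLs of all anchors, resolved against the document's base URI. */
    public static List<String> absoluteLinks(Document document) {
        List<String> links = new ArrayList<>();
        for (Element anchor : document.getElementsByTag("a")) {
            String url = anchor.absUrl("href");
            if (!url.isEmpty()) { // empty when the href is missing or cannot be resolved
                links.add(url);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // Hypothetical page content and URL, used only for illustration.
        Document doc = Jsoup.parse(
            "<a href='/about'>About</a><a href='news/today.html'>News</a><a>No href</a>",
            "https://example.com/docs/index.html");
        System.out.println(doc.location() + " -> " + absoluteLinks(doc));
    }
}
```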
