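This article collects several code examples for the Java method edu.uci.ics.crawler4j.url.WebURL.getPath() and shows how WebURL.getPath() is used in practice. The examples were extracted from selected projects on platforms such as GitHub, Stack Overflow, and Maven, and are intended as reference material. Details of the WebURL.getPath() method are as follows: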
Package path: edu.uci.ics.crawler4j.url.WebURL
Class name: WebURL
Method name: getPath
Description: none available
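Since no description of the method is given, the following minimal sketch illustrates what a URL "path" component is. It uses the standard library's java.net.URI rather than crawler4j, on the assumption that WebURL.getPath() behaves analogously (the part of the URL after the host, without the query string); that assumption is not confirmed by the text above. The class name PathComponentSketch and the example.com URLs are hypothetical.

```java
import java.net.URI;

class PathComponentSketch {
    // Returns the path component of a URL string using java.net.URI.
    // This is a stand-in for illustration, not the crawler4j implementation.
    static String pathOf(String url) {
        return URI.create(url).getPath();
    }

    public static void main(String[] args) {
        // The query string is not part of the path.
        System.out.println(pathOf("https://example.com/embed/abc123?autoplay=1")); // prints "/embed/abc123"
        // A site root has the path "/".
        System.out.println(pathOf("https://example.com/")); // prints "/"
    }
}
```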
Code example source: origin: yasserg/crawler4j
private Set<WebURL> parseOutgoingUrls(WebURL referringPage) throws UnsupportedEncodingException {
    // Raw URL strings extracted from the CSS text of this page.
    Set<String> extractedUrls = extractUrlInCssText(this.getTextContent());

    final String pagePath = referringPage.getPath();
    final String pageUrl = referringPage.getURL();

    Set<WebURL> outgoingUrls = new HashSet<>();
    for (String url : extractedUrls) {
        // Interpret the link relative to the referring page's path,
        // then turn it into an absolute URL based on the canonical page URL.
        String relative = getLinkRelativeTo(pagePath, url);
        String absolute = getAbsoluteUrlFrom(URLCanonicalizer.getCanonicalURL(pageUrl), relative);
        WebURL webURL = new WebURL();
        webURL.setURL(absolute);
        outgoingUrls.add(webURL);
    }
    return outgoingUrls;
}
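The method above turns links found in a stylesheet into absolute URLs relative to the referring page. The helpers getLinkRelativeTo and getAbsoluteUrlFrom are not shown in the excerpt, so the sketch below does not reproduce them; it swaps in java.net.URI.resolve from the standard library to illustrate the same general idea. The class name CssUrlResolutionSketch, the method resolveAll, and the example.com URLs are hypothetical.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CssUrlResolutionSketch {
    // Resolves each extracted link against the URL of the page that contained it.
    // Uses java.net.URI.resolve as a stand-in for the crawler4j helpers.
    static Set<String> resolveAll(String pageUrl, List<String> extractedUrls) {
        URI base = URI.create(pageUrl);
        Set<String> absolute = new LinkedHashSet<>();
        for (String link : extractedUrls) {
            absolute.add(base.resolve(link).toString());
        }
        return absolute;
    }

    public static void main(String[] args) {
        Set<String> urls = resolveAll(
                "https://example.com/css/site.css",
                Arrays.asList("bg.png", "../img/logo.png", "/fonts/a.woff"));
        // Prints, in order:
        // https://example.com/css/bg.png
        // https://example.com/img/logo.png
        // https://example.com/fonts/a.woff
        urls.forEach(System.out::println);
    }
}
```

A LinkedHashSet is used here only so that the output order matches the input order; the original uses a plain HashSet.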
Code example source: origin: tim232385/WebVideoBot
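public String getEmbedKey(WebURL webURL) {
    // Matches paths of the form "/embed/<key>"; group 2 captures the key.
    final Pattern EMBED_PATTERN = Pattern.compile("(\\/embed\\/)(.*)");
    if (!EMBED_PATTERN.matcher(webURL.getPath()).matches()) {
        // Not an embed path: return an empty key.
        return "";
    } else {
        // Replace the whole path with the captured key.
        return EMBED_PATTERN.matcher(webURL.getPath()).replaceAll("$2");
    }
}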
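The embed-key extraction above can also be written with a single Matcher and a pattern compiled once. The following is only a sketch of that variant, not the original project's code: it takes the path as a plain String instead of a WebURL so that it runs without crawler4j, and the class name EmbedKeySketch and the sample paths are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmbedKeySketch {
    // Compiled once; group 1 captures everything after the "/embed/" prefix.
    private static final Pattern EMBED_PATTERN = Pattern.compile("/embed/(.*)");

    // Returns the key from a path such as "/embed/<key>", or "" if the
    // whole path does not have that form.
    static String getEmbedKey(String path) {
        Matcher matcher = EMBED_PATTERN.matcher(path);
        return matcher.matches() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(getEmbedKey("/embed/abc123")); // prints "abc123"
        System.out.println(getEmbedKey("/watch/abc123").isEmpty()); // prints "true"
    }
}
```

Because matches() requires the entire path to fit the pattern, a path that merely contains "/embed/" somewhere in the middle yields an empty key, as in the original.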
Code example source: origin: biezhi/java-library-examples
String url = page.getWebURL().getURL();
String domain = page.getWebURL().getDomain();
String path = page.getWebURL().getPath();
String subDomain = page.getWebURL().getSubDomain();
String parentUrl = page.getWebURL().getParentUrl();
Code example source: origin: tjake/stormscraper
if (pageTracker.getIfPresent(curURL.getURL()) != null && (!curURL.getPath().equals("/") && depth != 0))