How do I send strings from a list on an online HTML page into a Java array?

kkbh8khc, posted on 2021-07-06 in Java

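I am new to Java. Currently I have an array containing many values that I copied by hand from an HTML page. The values are just a list of names, which I look up using a search filter on an HTML page that is updated constantly. I need help finding a way to maintain this so that my application connects to the HTML page with a GET request and fills the array with these values automatically (preferably storing the large array in a separate file), instead of me updating it every time.
Suppose this is the list on the HTML page when I search for Italian food in the search box:
Pizza
Pasta
Ravioli
etc...
My array is String[] foodNames = {"Pizza", "Pasta", "Ravioli" ...}. Not that it is relevant, but I have attached some code for context. It will not work for you, because the cookie value and the website are dummy values. I hope my explanation makes sense. Thanks in advance!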

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Class name is a placeholder; the original post omitted the class declaration.
public class FoodRecipes {
    public static void main(final String[] args) throws Exception {
        // Array holding all food names (the rest of the list is elided in the post)
        String[] foodNames = {"Pizza", "Pasta", "Ravioli" /* ... */};

        // Cookie for a page that requires login (dummy value)
        String cookie = "12345678912345678";

        for (String foodName : foodNames) {
            System.out.println("-------------------" + foodName + "--------------------");
            // Build the URL for the current food name (dummy site)
            URL foodRecipeUrl = new URL("https://horriblefoodrecipeslol.com/italian" + foodName + "/+/ingredients/calories");
            // Send the request
            HttpURLConnection conn = (HttpURLConnection) foodRecipeUrl.openConnection();
            conn.setRequestProperty("Cookie", cookie);
            conn.setRequestMethod("GET");

            try {
                BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));

                String line;
                StringBuilder result = new StringBuilder();
                while ((line = rd.readLine()) != null) {
                    result.append(line);
                }
                rd.close();

                // Strip HTML tags so only the plain text remains
                String plainText = result.toString().replaceAll("(?s)<[^>]*>(\\s*<[^>]*>)*", "");

                if (plainText.contains("pesto")) {
                    System.out.print("This recipe is Italian");
                }
            } catch (FileNotFoundException e) {
                // The server returned "not found" for this food name
                System.out.println(String.format("No food for you: %s", foodName));
                continue;
            }
        }
    }
}
plicqrtu #1

Here is a simple example that uses jsoup to get you started, with the following website: https://www.jamieoliver.com/recipes/category/world/italian/

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class TestJsoup {

    public static void main(String[] args) throws IOException {
        String html = "https://www.jamieoliver.com/recipes/category/world/italian/";
        Document doc = Jsoup.connect(html).get();
        Elements recipeTitles = doc.select("div.recipe-title");
        for(Element e : recipeTitles){
            System.out.println(e.text());
        }
    }
}

Output:

Super-quick fresh pasta
Buddy's Bolognese
Beautiful courgette carbonara
Broccoli & anchovy orecchiette
Spaghetti with anchovies, dried chilli & pangrattato
Epic vegan lasagne
Danny Devito's family pasta
Amazing ravioli
Rolled cassata
Amalfi lemon tart
...
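
The question also asked for the names to end up in an array and, preferably, to be kept in a separate file. The following is a minimal sketch of that part only, using the standard library. It assumes the scraped titles are already available as a List<String> (jsoup's Elements offers eachText() for this, so recipeTitles.eachText() from the example above would be one possible source). The class name FoodNameStore and the file name food-names.txt are hypothetical and not from the original post.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: keeps the food names in a plain text file, one per line.
public class FoodNameStore {

    // Writes one name per line, replacing any existing file content.
    public static void save(Path file, List<String> names) throws IOException {
        Files.write(file, names, StandardCharsets.UTF_8);
    }

    // Reads the file back into an array, trimming lines and skipping blank ones.
    public static String[] load(Path file) throws IOException {
        List<String> names = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names.toArray(new String[0]);
    }

    public static void main(String[] args) throws IOException {
        // Assumed file name; in a real program the list would come from the scraped titles.
        Path file = Paths.get("food-names.txt");
        save(file, Arrays.asList("Pizza", "Pasta", "Ravioli"));

        // The array the question's loop iterates over can now be loaded from the file.
        String[] foodNames = load(file);
        for (String foodName : foodNames) {
            System.out.println(foodName);
        }
    }
}
```

With this split, the scraping step can refresh the file whenever the page changes, while the rest of the program only calls load() and never needs a hard-coded list.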
