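1. Crawler introduction: see my previous article, "Java爬虫_静态页面" (Java crawler: static pages).
2. Tools for crawling dynamic pages:
1) IDEA: the development tool, used to create a Maven project
2) htmlunit: an automated-testing tool (a headless browser) that bundles page download (HttpClient), DOM parsing (NekoHtml), and a JavaScript engine (Rhino)
3) Other JARs: junit, jsoup, jxl
3. Development process and code
3.1. Create a Maven project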

[Screenshot: creating the Maven project]

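3.2. Add the project dependencies to pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>cll</groupId>
    <artifactId>demo</artifactId>
    <version>1.0-SNAPSHOT</version>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <maven.compiler.encoding>UTF-8</maven.compiler.encoding>
        <maven.compiler.source>1.7</maven.compiler.source>
        <maven.compiler.target>1.7</maven.compiler.target>
    </properties>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>net.sourceforge.htmlunit</groupId>
            <artifactId>htmlunit</artifactId>
            <version>2.27</version>
        </dependency>
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.8.3</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>3.10.1</version>
        </dependency>
        <dependency>
            <groupId>com.hynnet</groupId>
            <artifactId>jxl</artifactId>
            <version>2.6.12.1</version>
        </dependency>
    </dependencies>
</project>
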
3.3. Create a Java class, QYEmailHelper.java

import CaililiangTools.ConfigHelper;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.*;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.select.Elements;

import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.HashMap;
import java.util.Properties;

public class QYEmailHelper {
    static WebClient webClient=new WebClient(BrowserVersion.CHROME);
    ArrayList<HashMap<String,String>> returnList = new ArrayList<HashMap<String,String>>();
    static String baseUrl ="";
    static  int num =1;
    ConfigHelper configHelper = new ConfigHelper();
    Properties properties=null;

    //Initialize the simulated browser
    public void WebClientInit(){
        webClient.getCookieManager().setCookiesEnabled(true);//enable cookies
        webClient.getOptions().setActiveXNative(false);
        webClient.getOptions().setRedirectEnabled(true);//follow redirects
        webClient.getOptions().setCssEnabled(false);//disable CSS so the client does not issue extra requests for stylesheets
        webClient.getOptions().setJavaScriptEnabled(true);//enable JavaScript
        webClient.getOptions().setUseInsecureSSL(true);//skip SSL certificate validation
        webClient.getOptions().setThrowExceptionOnScriptError(false);//do not throw when a script fails
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());//re-synchronize Ajax calls so their results are in the page when it is returned
        webClient.getOptions().setMaxInMemory(50000);
        properties = configHelper.getEmailUserInfos();
    }

    public void closeWebClient(){
        webClient.close();
        webClient=new WebClient(BrowserVersion.CHROME);
    }

    //Log in and return the URL of the inbox
    public String UserLogin(String url,String name,String password) throws Exception{
        url = url.replace("param=caill@primeton.com","param="+name);
        final HtmlPage page = webClient.getPage(url);
        System.err.println("Querying, please wait");
        //TimeUnit.SECONDS.sleep(3);  //the request needs time to complete; let the main thread sleep briefly if required
        HtmlForm form=page.getForms().get(0);
        HtmlPasswordInput txtPwd = (HtmlPasswordInput)form.getInputByName("pp");//password field
        txtPwd.setValueAttribute(password);//fill in the password
        HtmlSubmitInput submit=(HtmlSubmitInput) form.getInputByValue("登录");//submit button; its value on the page is "登录" (Log in)
        final HtmlPage page2 = (HtmlPage) submit.click();//submit the login form
        DomElement e =page2.getElementById("folder_1");
        HtmlPage page3 = webClient.getPage("https://mail.primeton.com"+e.getAttribute("href"));
        //TimeUnit.SECONDS.sleep(3);  //the request needs time to complete; let the main thread sleep briefly if required
        HtmlInlineFrame frame1 = (HtmlInlineFrame)page3.getElementById("mainFrame");
        String src = frame1.getAttribute("src");
        baseUrl="https://mail.primeton.com"+src;
        return "https://mail.primeton.com"+src;
    }

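    //Scrape one page of the mail list; returns the timestamp of the last message processed on it
    public long getHtmlPage(String url,long startTime,long endTime) throws Exception{
        long endTime2=0L;
        HtmlPage page = webClient.getPage(url);
        HtmlBody tbody = (HtmlBody) page.getBody();
        DomNodeList<HtmlElement> lists = tbody.getElementsByTagName("table");
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        //System.out.println( page.asXml());
        for(HtmlElement he:lists){
            long time =0L;
            HashMap<String,String> results = new HashMap<String,String>();
            String xml = he.asXml();
            //Only tables that hold a message row are of interest
            if(xml.startsWith("<table cellspacing=\"0\" class=") && xml.contains("<input totime=")){
                Document document = Jsoup.parse(xml);
                Elements es = document.getElementsByClass("cx");
                Elements es2 = document.getElementsByClass("black");
                for(Element e :es){
                    Node node =e.childNode(1);
                    time = Long.parseLong(node.attr("totime"));
                    endTime2 = time;
                    String email = node.attr("fa");
                    //Defaults: sender name taken from the page, no department
                    String name = node.attr("fn");
                    String dept = "";
                    //A configured sender has a value of the form "dept@@name"
                    if(properties.containsKey(email)){
                        String[] vs = properties.getProperty(email).split("@@");
                        if(vs.length==2){
                            dept = vs[0];
                            name = vs[1];
                        }
                    }
                    results.put("totime",fmt.format(new Date(time)));
                    //"已读" = read, "未读" = unread; mapping kept as in the original,
                    //but unread=true yielding "已读" looks inverted, so check it against the page
                    results.put("unread",(node.attr("unread")).equalsIgnoreCase("true")?"已读":"未读");
                    results.put("name",name);
                    results.put("mail",email);
                    results.put("dept",dept);
                }
                for(Element e :es2){
                    results.put("title",e.ownText());
                }
                //Keep only messages inside the requested time window
                if(time<=endTime && startTime<=time){
                    returnList.add(results);
                }
            }
        }
        return endTime2;
    }

    //Fetch page after page until a message older than startTime is reached
    public void grabData(String url,long startTime,long endTime) throws Exception{
        long endTime2 = getHtmlPage(url,startTime,endTime);
        if(endTime2>startTime){
            String nextPageUrl=baseUrl.replace("page=0","page="+num);
            num++;
            grabData(nextPageUrl,startTime,endTime);
        }
    }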

    public int exportData(long startTime,long endTime,String name,String password){
        int returnInt = 0;
        this.WebClientInit();
        String webUrl="https://mail.primeton.com/cgi-bin/loginpage?t=logindomain&s=logout&f=biz&param=caill@primeton.com";
        String url1 = null;
        try {
            url1 = this.UserLogin(webUrl,name,password);
            grabData(url1,startTime,endTime);
            ExcelHelper excelHelper = new ExcelHelper();
            excelHelper.exportExcel(this.returnList);
        } catch (Exception e) {
            //Any failure is reported to the caller as 1
            returnInt = 1;
        } finally {
            //Reset the shared state so the next export starts clean
            num=1;
            baseUrl ="";
            returnList = new ArrayList<HashMap<String,String>>();
            closeWebClient();
        }
        return returnInt;
    }
}
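
The paging logic in grabData can also be written as a loop instead of recursion. The following is only a minimal sketch of that idea, not the author's code: `PagingSketch`, `PageSource`, `collect`, and `maxPages` are hypothetical names, and the page source stands in for the HtmlUnit fetch so the stopping rule can be seen in isolation. It assumes, as the original code appears to, that each page lists messages newest first.

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {

    /** Hypothetical stand-in for fetching one page of the mail list. */
    public interface PageSource {
        // Timestamps (epoch milliseconds) of the messages on the page, newest first.
        // An empty list means the page does not exist.
        List<Long> timestampsOnPage(int pageIndex);
    }

    /** Collects the timestamps inside [startTime, endTime], reading at most maxPages pages. */
    public static List<Long> collect(PageSource source, long startTime, long endTime, int maxPages) {
        List<Long> kept = new ArrayList<Long>();
        for (int page = 0; page < maxPages; page++) {
            List<Long> stamps = source.timestampsOnPage(page);
            if (stamps.isEmpty()) {
                break; // no more pages
            }
            for (long t : stamps) {
                if (startTime <= t && t <= endTime) {
                    kept.add(t); // inside the requested window
                }
            }
            // Same stopping rule as grabData: continue only while the last
            // message on the page is still newer than startTime.
            long oldest = stamps.get(stamps.size() - 1);
            if (oldest <= startTime) {
                break;
            }
        }
        return kept;
    }
}
```

A loop keeps the page counter local rather than in a static field, and the `maxPages` cap bounds the crawl if the stopping condition is never met.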


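4. Summary
The code above retrieves content from dynamically rendered pages. (It is only part of a larger example and will not run on its own; treat it as a reference. The next article will give the complete code.) Also note: create a Maven project for crawler development whenever possible. A plain project needs a long list of JARs added by hand, and errors keep appearing until all of them are in place. I spent half a day on it, still had problems, and gave up.
5. Points to note
5.1. Content inside an embedded iframe cannot be parsed directly from the outer page. First extract the iframe's URL, then request that URL and parse the page it returns.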
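
The first half of that step (finding the iframe's `src` and turning it into an absolute URL) can be sketched with the JDK alone. This is an illustration under stated assumptions, not the article's method: `IframeUrlSketch` and `firstIframeUrl` are hypothetical names, and the regular expression is a simplification that only handles a quoted `src` attribute. In the class above, HtmlUnit's `getElementById("mainFrame")` and `getAttribute("src")` do this job, and the host is prepended by hand.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IframeUrlSketch {

    // Simplified pattern: the first <iframe ...> tag with a quoted src attribute.
    private static final Pattern IFRAME_SRC = Pattern.compile(
            "<iframe\\b[^>]*\\bsrc\\s*=\\s*[\"']([^\"']*)[\"']",
            Pattern.CASE_INSENSITIVE);

    /** Returns the absolute URL of the first iframe in html, or null if none is found. */
    public static String firstIframeUrl(String pageUrl, String html) {
        Matcher m = IFRAME_SRC.matcher(html);
        if (!m.find()) {
            return null;
        }
        // Resolve a relative src (such as "/cgi-bin/...") against the outer page's URL.
        return URI.create(pageUrl).resolve(m.group(1)).toString();
    }
}
```

The resolved URL is then requested like any other page, for example with `webClient.getPage(...)`. Note that `URI.create` throws `IllegalArgumentException` if the `src` contains characters that are illegal in a URI, such as spaces.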


Source: reposted from the internet, "Java爬虫_动态页面" (Java crawler: dynamic pages)
