6 changed files with 101 additions and 0 deletions
Binary file not shown.
@@ -0,0 +1 @@
[2026-05-21 14:54:56] [CRAWLER_004] Unknown site: unknown, valid values: stats, cas, gov, all
@@ -0,0 +1,100 @@
[2026-05-27 14:58:22] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 14:58:22] [INFO] [com.crawler.site.GovNewsCrawler] ========== Start crawling: 中国政府网 ==========
[2026-05-27 14:58:22] [INFO] [com.crawler.site.GovNewsCrawler] Total pages to crawl: 1
[2026-05-27 14:58:22] [DEBUG] [com.crawler.site.GovNewsCrawler] Preparing to crawl page 1: https://www.gov.cn/
[2026-05-27 14:58:22] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1573ms before request
[2026-05-27 14:58:24] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 14_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15
[2026-05-27 14:58:24] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.gov.cn/
[2026-05-27 14:58:24] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.gov.cn/ status: 200 duration: 404ms
[2026-05-27 14:58:24] [INFO] [com.crawler.site.GovNewsCrawler] Page 1 completed, got 3 items
[2026-05-27 14:58:24] [INFO] [com.crawler.site.GovNewsCrawler] Saving 3 items
[2026-05-27 14:58:24] [INFO] [com.crawler.site.GovNewsCrawler] ========== Crawling completed: 中国政府网, duration: 2081ms ==========
[2026-05-27 15:00:00] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:00:00] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:00:00] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:00:00] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:00:00] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1244ms before request
[2026-05-27 15:00:01] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:00:01] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:00:02] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 840ms
[2026-05-27 15:00:02] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:00:02] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:00:02] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 2265ms ==========
[2026-05-27 15:00:02] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:00:02] [INFO] [com.crawler.site.CasResearchCrawler] ========== Start crawling: 中科院-科研动态 ==========
[2026-05-27 15:00:02] [INFO] [com.crawler.site.CasResearchCrawler] Total pages to crawl: 1
[2026-05-27 15:00:02] [DEBUG] [com.crawler.site.CasResearchCrawler] Preparing to crawl page 1: https://www.cas.cn/
[2026-05-27 15:00:02] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 2944ms before request
[2026-05-27 15:00:05] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:00:05] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.cas.cn/
[2026-05-27 15:00:05] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.cas.cn/ status: 200 duration: 242ms
[2026-05-27 15:00:05] [DEBUG] [com.crawler.http.JsoupHttpClient] Cookie updated
[2026-05-27 15:00:06] [INFO] [com.crawler.site.CasResearchCrawler] Page 1 completed, got 14 items
[2026-05-27 15:00:06] [INFO] [com.crawler.site.CasResearchCrawler] Saving 14 items
[2026-05-27 15:00:06] [INFO] [com.crawler.site.CasResearchCrawler] ========== Crawling completed: 中科院-科研动态, duration: 3254ms ==========
[2026-05-27 15:00:06] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:00:06] [INFO] [com.crawler.site.GovNewsCrawler] ========== Start crawling: 中国政府网 ==========
[2026-05-27 15:00:06] [INFO] [com.crawler.site.GovNewsCrawler] Total pages to crawl: 1
[2026-05-27 15:00:06] [DEBUG] [com.crawler.site.GovNewsCrawler] Preparing to crawl page 1: https://www.gov.cn/
[2026-05-27 15:00:06] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1484ms before request
[2026-05-27 15:00:07] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 14_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15
[2026-05-27 15:00:07] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.gov.cn/
[2026-05-27 15:00:07] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.gov.cn/ status: 200 duration: 236ms
[2026-05-27 15:00:07] [INFO] [com.crawler.site.GovNewsCrawler] Page 1 completed, got 3 items
[2026-05-27 15:00:07] [INFO] [com.crawler.site.GovNewsCrawler] Saving 3 items
[2026-05-27 15:00:07] [INFO] [com.crawler.site.GovNewsCrawler] ========== Crawling completed: 中国政府网, duration: 1740ms ==========
[2026-05-27 15:02:06] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:02:06] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:02:06] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:02:06] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:02:06] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 2483ms before request
[2026-05-27 15:02:08] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:02:08] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:02:09] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 642ms
[2026-05-27 15:02:09] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:02:09] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:02:09] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 3231ms ==========
[2026-05-27 15:08:01] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:08:01] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:08:01] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:08:01] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:08:01] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1781ms before request
[2026-05-27 15:08:03] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:08:03] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:08:03] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 507ms
[2026-05-27 15:08:03] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:08:03] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:08:03] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 2507ms ==========
[2026-05-27 15:39:23] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:39:23] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:39:23] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:39:23] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:39:23] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1015ms before request
[2026-05-27 15:39:24] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:39:24] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:39:24] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 647ms
[2026-05-27 15:39:24] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:39:24] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:39:24] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 1795ms ==========
[2026-05-27 15:40:20] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:40:20] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:40:20] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:40:20] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:40:20] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 2219ms before request
[2026-05-27 15:40:22] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:40:22] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:40:23] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 600ms
[2026-05-27 15:40:23] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:40:23] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:40:23] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 2922ms ==========
[2026-05-27 15:42:23] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:42:23] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:42:23] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:42:23] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:42:23] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 2380ms before request
[2026-05-27 15:42:25] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
[2026-05-27 15:42:25] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:42:26] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 644ms
[2026-05-27 15:42:26] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:42:26] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:42:26] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 3130ms ==========
Binary file not shown.
Binary file not shown.
Binary file not shown.