[2026-05-27 14:58:22] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 14:58:22] [INFO] [com.crawler.site.GovNewsCrawler] ========== Start crawling: 中国政府网 ==========
[2026-05-27 14:58:22] [INFO] [com.crawler.site.GovNewsCrawler] Total pages to crawl: 1
[2026-05-27 14:58:22] [DEBUG] [com.crawler.site.GovNewsCrawler] Preparing to crawl page 1: https://www.gov.cn/
[2026-05-27 14:58:22] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1573ms before request
[2026-05-27 14:58:24] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 14_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15
[2026-05-27 14:58:24] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.gov.cn/
[2026-05-27 14:58:24] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.gov.cn/ status: 200 duration: 404ms
[2026-05-27 14:58:24] [INFO] [com.crawler.site.GovNewsCrawler] Page 1 completed, got 3 items
[2026-05-27 14:58:24] [INFO] [com.crawler.site.GovNewsCrawler] Saving 3 items
[2026-05-27 14:58:24] [INFO] [com.crawler.site.GovNewsCrawler] ========== Crawling completed: 中国政府网, duration: 2081ms ==========
[2026-05-27 15:00:00] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:00:00] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:00:00] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:00:00] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:00:00] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1244ms before request
[2026-05-27 15:00:01] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:00:01] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:00:02] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 840ms
[2026-05-27 15:00:02] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:00:02] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:00:02] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 2265ms ==========
[2026-05-27 15:00:02] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:00:02] [INFO] [com.crawler.site.CasResearchCrawler] ========== Start crawling: 中科院-科研动态 ==========
[2026-05-27 15:00:02] [INFO] [com.crawler.site.CasResearchCrawler] Total pages to crawl: 1
[2026-05-27 15:00:02] [DEBUG] [com.crawler.site.CasResearchCrawler] Preparing to crawl page 1: https://www.cas.cn/
[2026-05-27 15:00:02] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 2944ms before request
[2026-05-27 15:00:05] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:00:05] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.cas.cn/
[2026-05-27 15:00:05] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.cas.cn/ status: 200 duration: 242ms
[2026-05-27 15:00:05] [DEBUG] [com.crawler.http.JsoupHttpClient] Cookie updated
[2026-05-27 15:00:06] [INFO] [com.crawler.site.CasResearchCrawler] Page 1 completed, got 14 items
[2026-05-27 15:00:06] [INFO] [com.crawler.site.CasResearchCrawler] Saving 14 items
[2026-05-27 15:00:06] [INFO] [com.crawler.site.CasResearchCrawler] ========== Crawling completed: 中科院-科研动态, duration: 3254ms ==========
[2026-05-27 15:00:06] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:00:06] [INFO] [com.crawler.site.GovNewsCrawler] ========== Start crawling: 中国政府网 ==========
[2026-05-27 15:00:06] [INFO] [com.crawler.site.GovNewsCrawler] Total pages to crawl: 1
[2026-05-27 15:00:06] [DEBUG] [com.crawler.site.GovNewsCrawler] Preparing to crawl page 1: https://www.gov.cn/
[2026-05-27 15:00:06] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1484ms before request
[2026-05-27 15:00:07] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 14_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15
[2026-05-27 15:00:07] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.gov.cn/
[2026-05-27 15:00:07] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.gov.cn/ status: 200 duration: 236ms
[2026-05-27 15:00:07] [INFO] [com.crawler.site.GovNewsCrawler] Page 1 completed, got 3 items
[2026-05-27 15:00:07] [INFO] [com.crawler.site.GovNewsCrawler] Saving 3 items
[2026-05-27 15:00:07] [INFO] [com.crawler.site.GovNewsCrawler] ========== Crawling completed: 中国政府网, duration: 1740ms ==========
[2026-05-27 15:02:06] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:02:06] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:02:06] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:02:06] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:02:06] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 2483ms before request
[2026-05-27 15:02:08] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:02:08] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:02:09] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 642ms
[2026-05-27 15:02:09] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:02:09] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:02:09] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 3231ms ==========
[2026-05-27 15:08:01] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:08:01] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:08:01] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:08:01] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:08:01] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1781ms before request
[2026-05-27 15:08:03] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:08:03] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:08:03] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 507ms
[2026-05-27 15:08:03] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:08:03] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:08:03] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 2507ms ==========
[2026-05-27 15:39:23] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:39:23] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:39:23] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:39:23] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:39:23] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 1015ms before request
[2026-05-27 15:39:24] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:39:24] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:39:24] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 647ms
[2026-05-27 15:39:24] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:39:24] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:39:24] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 1795ms ==========
[2026-05-27 15:40:20] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:40:20] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:40:20] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:40:20] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:40:20] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 2219ms before request
[2026-05-27 15:40:22] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
[2026-05-27 15:40:22] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:40:23] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 600ms
[2026-05-27 15:40:23] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:40:23] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:40:23] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 2922ms ==========
[2026-05-27 15:42:23] [INFO] [com.crawler.http.JsoupHttpClient] JsoupHttpClient initialized, timeout: 15000ms
[2026-05-27 15:42:23] [INFO] [com.crawler.site.StatsGovCrawler] ========== Start crawling: 国家统计局-新闻发布 ==========
[2026-05-27 15:42:23] [INFO] [com.crawler.site.StatsGovCrawler] Total pages to crawl: 1
[2026-05-27 15:42:23] [DEBUG] [com.crawler.site.StatsGovCrawler] Preparing to crawl page 1: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:42:23] [DEBUG] [com.crawler.http.JsoupHttpClient] Waiting 2380ms before request
[2026-05-27 15:42:25] [DEBUG] [com.crawler.http.JsoupHttpClient] Using User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
[2026-05-27 15:42:25] [INFO] [com.crawler.http.JsoupHttpClient] Starting request: https://www.stats.gov.cn/sj/sjjd/
[2026-05-27 15:42:26] [INFO] [com.crawler.http.JsoupHttpClient] Request completed: https://www.stats.gov.cn/sj/sjjd/ status: 200 duration: 644ms
[2026-05-27 15:42:26] [INFO] [com.crawler.site.StatsGovCrawler] Page 1 completed, got 30 items
[2026-05-27 15:42:26] [INFO] [com.crawler.site.StatsGovCrawler] Saving 30 items
[2026-05-27 15:42:26] [INFO] [com.crawler.site.StatsGovCrawler] ========== Crawling completed: 国家统计局-新闻发布, duration: 3130ms ==========