2026-06-01 12:11:34.447 [main] INFO com.spider.service.DoubanBookSpider - Starting to crawl the 50 most popular books on Douban Books...
2026-06-01 12:11:34.450 [main] INFO com.spider.service.DoubanBookSpider - Fetching page 1: https://book.douban.com/chart?sub_type=1
2026-06-01 12:11:37.452 [main] INFO com.spider.service.DoubanBookSpider - Found 40 h2 tags
2026-06-01 12:11:37.466 [main] INFO com.spider.service.DoubanBookSpider - Fetching page 2: https://book.douban.com/chart?sub_type=1&page=2
2026-06-01 12:11:39.733 [main] INFO com.spider.service.DoubanBookSpider - Found 40 h2 tags
2026-06-01 12:11:39.738 [main] INFO com.spider.service.DoubanBookSpider - Douban Books crawl complete, 50 books retrieved
2026-06-01 12:11:39.760 [main] INFO com.spider.service.DataStorageService - Book data saved to: data\books.csv
2026-06-01 12:11:39.761 [main] INFO com.spider.service.DoubanMovieSpider - Starting to crawl the Douban Movie Top 250...
2026-06-01 12:11:39.761 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 1 (1): https://movie.douban.com/top250
2026-06-01 12:11:42.200 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 2 (26): https://movie.douban.com/top250?start=25
2026-06-01 12:11:44.360 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 3 (51): https://movie.douban.com/top250?start=50
2026-06-01 12:11:46.508 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 4 (76): https://movie.douban.com/top250?start=75
2026-06-01 12:11:48.740 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 5 (101): https://movie.douban.com/top250?start=100
2026-06-01 12:11:50.890 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 6 (126): https://movie.douban.com/top250?start=125
2026-06-01 12:11:53.041 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 7 (151): https://movie.douban.com/top250?start=150
2026-06-01 12:11:55.290 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 8 (176): https://movie.douban.com/top250?start=175
2026-06-01 12:11:57.549 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 9 (201): https://movie.douban.com/top250?start=200
2026-06-01 12:11:59.800 [main] INFO com.spider.service.DoubanMovieSpider - Fetching page 10 (226): https://movie.douban.com/top250?start=225
2026-06-01 12:12:02.148 [main] INFO com.spider.service.DoubanMovieSpider - Douban Movie crawl complete, 250 movies retrieved
2026-06-01 12:12:02.224 [main] INFO com.spider.service.DataStorageService - Movie data saved to: data\movies.csv
2026-06-01 12:12:02.225 [main] INFO com.spider.service.BaiduHotSearchSpider - Starting to crawl the top 50 entries of the Baidu real-time hot search list...
2026-06-01 12:12:02.225 [main] INFO com.spider.service.BaiduHotSearchSpider - Fetching: https://top.baidu.com/board?tab=realtime
2026-06-01 12:12:04.523 [main] INFO com.spider.service.BaiduHotSearchSpider - Received HTML length: 195924 bytes
2026-06-01 12:12:04.523 [main] INFO com.spider.service.BaiduHotSearchSpider - First 2000 characters of HTML content:
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta content="always" name="referrer">
<meta name="theme-color" content="#2932e1">
<link rel="shortcut icon" href="//www.baidu.com/favicon.ico" type="image/x-icon" />
<link rel="icon" sizes="any" mask href="//www.baidu.com/img/baidu_85beaf5496f291521eb75ba38eacbd87.svg">
<link rel="dns-prefetch" href="//fyb-pc-static.cdn.bcebos.com"/>
<meta name="keywords" content="百度热搜,百度热搜榜,百度搜索排行榜,搜索排行榜,百度热门搜索,今日热搜,今日热点,排行榜,热搜榜,热词榜,热门话题,网络热点,实时热点,热门事件,热点">
<meta name="description" content="百度热搜以数亿用户海量的真实数据为基础,通过专业的数据挖掘方法,计算关键词的热搜指数,旨在建立权威、全面、热门、时效的各类关键词排行榜,引领热词阅读时代。">
<title>百度热搜</title>
<style data-vue-ssr-id="22cfed39:0">
.c-gap-top-small {
margin-top: 3px;
}
.c-gap-top {
margin-top: 7px;
}
.c-gap-top-large {
margin-top: 11px;
}
.c-gap-top-mini {
margin-top: 2px;
}
.c-gap-top-xsmall {
margin-top: 4px;
}
.c-gap-top-middle {
margin-top: 10px;
}
.c-gap-bottom-small {
margin-bottom: 3px;
}
.c-gap-bottom {
margin-bottom: 7px;
}
.c-gap-bottom-large {
margin-bottom: 11px;
}
.c-gap-bottom-mini {
margin-bottom: 2px;
}
.c-gap-bottom-xsmall {
margin-bottom: 4px;
}
.c-gap-bottom-middle {
margin-bottom: 10px;
}
.c-gap-left {
margin-left: 12px;
}
.c-gap-left-small {
margin-left: 8px;
}
.c-gap-left-xsmall {
margin-left: 4px;
}
.c-gap-left-mini {
margin-left: 2px;
}
.c-gap-left-large {
margin-left: 16px;
}
.c-gap-left-middle {
margin-left: 10px;
}
.c-gap-right {
margin-right: 12px;
}
.c-gap-right-small {
margin-right: 8px;
}
.c-gap-right-xsmall {
margin-right: 4px;
}
.c-gap-right-mini {
margin-right: 2px;
}
.c-gap-right-large {
margin-right: 16px;
}
.c-gap-right-middle {
margin-right: 10
2026-06-01 12:12:04.529 [main] INFO com.spider.service.BaiduHotSearchSpider - Baidu hot search crawl complete, 50 hot searches retrieved
2026-06-01 12:12:04.532 [main] INFO com.spider.service.DataStorageService - Hot search data saved to: data\hotsearch.csv
2026-06-01 12:12:04.533 [main] INFO com.spider.service.DataStorageService - Book data saved to: data\books.csv
2026-06-01 12:12:04.533 [main] INFO com.spider.service.DataStorageService - Movie data saved to: data\movies.csv
2026-06-01 12:12:04.534 [main] INFO com.spider.service.DataStorageService - Hot search data saved to: data\hotsearch.csv