# University News Crawler

Java homework project for crawling:

- `https://news.hnu.edu.cn/`
- `https://news.csu.edu.cn/`
- `https://news.hunnu.edu.cn/`

The code demonstrates the required architecture:

- CLI interactive command line
- MVC: `model`, `view`, `controller`
- Command pattern: `command` package
- Strategy pattern: `strategy` package, one strategy per target website
- Custom exception hierarchy: `exception` package
- File persistence: JSON or CSV output

## Run

```powershell
mvn test
mvn exec:java -Dexec.args="crawl --site all --limit 5 --format json --out data/news.json"
```

Interactive CLI:

```powershell
mvn exec:java
```

Useful commands:

```text
help
sites
crawl --site all --limit 10 --format json --out data/news.json
crawl --site hnu --limit 5 --format csv --out data/hnu.csv
exit
```

## Output Fields

Each crawled news item includes:

- school
- site key
- title
- url
- publish time
- source
- author
- summary
- content preview
- crawled time
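
As a rough illustration of how the output fields listed above might map onto a model class and a CSV row, here is a minimal sketch. The record name `NewsItem`, its component names, the use of `String` for both timestamps, and the `toCsvRow` and `escape` helpers are all assumptions made for this example; the project's actual `model` package may look different.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical model record: one component per output field listed above.
// Names and types are assumptions, not taken from the project's source.
record NewsItem(
        String school,
        String siteKey,
        String title,
        String url,
        String publishTime,
        String source,
        String author,
        String summary,
        String contentPreview,
        String crawledTime) {

    // Quote a value when it contains a comma, a double quote, or a line break,
    // doubling any embedded double quotes. Null becomes an empty cell.
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        String escaped = value.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Join all fields, in the order listed above, into a single CSV row.
    String toCsvRow() {
        return Stream.of(school, siteKey, title, url, publishTime,
                        source, author, summary, contentPreview, crawledTime)
                .map(NewsItem::escape)
                .collect(Collectors.joining(","));
    }
}
```

Keeping the field order in one place (`toCsvRow`) makes it easier to keep a CSV header and its rows aligned; a JSON writer could reuse the same record without changes.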