diff --git a/w11/java1/java-cli/.gitignore b/w11/java1/java-cli/.gitignore new file mode 100644 index 0000000..0ebcf1a --- /dev/null +++ b/w11/java1/java-cli/.gitignore @@ -0,0 +1,4 @@ +*.jar +*.jar +*.class +*.log \ No newline at end of file diff --git a/w11/java1/java-cli/W10 PPT.md b/w11/java1/java-cli/W10 PPT.md new file mode 100644 index 0000000..d4ba310 --- /dev/null +++ b/w11/java1/java-cli/W10 PPT.md @@ -0,0 +1,492 @@ +--- +id: "24" +title: w10-Design Patterns +slug: w10-design-patterns +status: draft +view_count: 0 +created_at: 2026-05-07T12:00:00+08:00 +updated_at: 2026-05-07T14:00:00.000000000+08:00 +--- + +# Advanced Programming · Week 10 + +### Design Patterns: Flexibility and Extensibility + +### Strategy Pattern + Factory + Repository in Practice + +--- + +### 📌 This Week's Roadmap + +- W9 recap: what the skeleton achieved and the risks it left behind +- Strategy pattern: a "plug standard" for parsers +- Parser factory: the magic of automatic matching +- Repository: guarding data access +- Tying the architecture together: the full call chain +- Code walkthrough + hands-on tasks +- Architecture review + W11 preview + +--- + +## 1️⃣ W9 Recap: What the Skeleton Achieved and the Risks It Left Behind + +### We built a handsome house + +- ✅ Clear MVC layering +- ✅ Command pattern: **add a command without touching the Controller** +- ✅ All output goes through `ConsoleView` +- ✅ Standard project package structure + +--- + +### But problems followed + +```java +// Where should the parsing logic in CrawlCommand go? +if (url.contains("blog.example.com")) { + // blog parsing... +} else if (url.contains("news.example.com")) { + // news parsing... +} else { + view.printError("Unsupported website!"); +} +``` + +> 😫 Every newly supported website means one more `else if` + +--- + +### There is also data running around "naked" + +```java +List<Article>
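articles = new ArrayList<>(); +// Any Command can do this: +articles.clear(); +articles.add(null); +articles.remove(0); +``` + +> 🚨 The data has no protection at all, and a verbal agreement is not a safeguard + +--- + +### This Week's Tasks + +1. **Pluggable parsing logic** → Strategy pattern + Factory +2. **Guarded data access** → Repository pattern + +> W9 built the skeleton; W10 adds the armor + +--- + +## 2️⃣ Strategy Pattern: A "Plug Standard" for Parsers + +### Why can any appliance plug into a wall socket? + +- The **three-pin socket** is a standard interface +- TVs, computers, and phone chargers all implement that interface +- The socket does not care which appliance is plugged in + +--- + +### The crawler world works the same way + +- `CrawlStrategy` = the socket interface +- `BlogStrategy`, `NewsStrategy` = concrete appliances +- `CrawlCommand` = the person using the appliance +- `StrategyFactory` = the socket panel + +--- + +### The interface is the contract + +```java +public interface CrawlStrategy { + List<Article>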
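parse(String url, Document doc); + boolean supports(String url); +} +``` + +- `supports()`: can I handle this URL? +- `parse()`: how is the page parsed? +- **Any website that wants to be crawled signs this contract!** + +--- + +### Strategy vs. hard-coding + +| Aspect | if-else spaghetti | Strategy pattern | +|------|-------------|----------| +| Add a website | Edit the Command | Create a new strategy class | +| Change parsing | Hunt through the else-if chain | Edit only the matching class | +| Testing | Start the whole crawler | Test the strategy in isolation | +| Open-closed principle | ❌ Open to modification | ✅ Open to extension, closed to modification | + +--- + +### Concrete strategy example + +```java +public class BlogStrategy implements CrawlStrategy { + public boolean supports(String url) { + return url.contains("blog.example.com"); + } + public List<Article>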
parse(String url, Document doc) { + List<Article>
articles = new ArrayList<>(); + for (Element e : doc.select(".post-title")) { + articles.add(new Article(e.text(), url, "")); + } + return articles; + } +} +``` + +> ✨ One new website, one standalone class; each strategy minds its own business + +--- + +## 3️⃣ Parser Factory: The Magic of Automatic Matching + +### Who picks the strategy? + +- If `CrawlCommand` loops over every strategy itself → the strategy pattern was for nothing +- We need a black box: **drop in a URL, get back the right parser** + +--- + +### Enter the factory + +```java +public class StrategyFactory { + private final List<CrawlStrategy> strategies = new ArrayList<>(); + + public StrategyFactory() { + strategies.add(new BlogStrategy()); + strategies.add(new NewsStrategy()); + } + + public CrawlStrategy getStrategy(String url) { + for (CrawlStrategy s : strategies) { + if (s.supports(url)) return s; + } + return null; + } +} +``` + +> 🔧 Adding a website only takes a new strategy class + one registration line in the factory + +--- + +### A win for the open-closed principle + +- ✅ `CrawlCommand` does not change at all +- ✅ Add `XxxStrategy` and one registration line +- ✅ Every strategy is invoked in exactly the same way + +> This is what **"open for extension, closed for modification"** means + +--- + +### The refactored CrawlCommand + +```java +public void execute(String[] args, ArticleRepository repository) { + String url = args[1]; + CrawlStrategy strategy = strategyFactory.getStrategy(url); + if (strategy == null) { + view.printError("No strategy for: " + url); + return; + } + Document doc = Jsoup.connect(url).get(); + List<Article>
parsed = strategy.parse(url, doc); + for (Article a : parsed) { + repository.add(a); + } + view.printSuccess("Crawled " + parsed.size() + " articles."); +} +``` + +> 🧠 CrawlCommand now only **"dispatches"**; it no longer parses + +--- + +## 4️⃣ Repository: Guarding Data Access + +### The problem with a shared List + +```java +articles.clear(); // wipe everything +articles.add(null); // insert null +articles.remove(0); // delete at will +``` + +> Order that rests on convention alone will eventually break + +--- + +### Put a security door on the data + +```java +public class ArticleRepository { + private final List<Article>
articles = new ArrayList<>(); + + public void add(Article article) { + if (article == null) throw new IllegalArgumentException(...); + articles.add(article); + } + + public List<Article>
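getAll() { + return Collections.unmodifiableList(articles); + } + + public int size() { return articles.size(); } + + public void clear() { articles.clear(); } +} +``` + +--- + +### Three lines of defense + +| Mechanism | Effect | +|------|------| +| **add rejects null** | The rule lives in code, not in a verbal agreement | +| **getAll returns an unmodifiable view** | Any attempt to modify it throws `UnsupportedOperationException` | +| **All access goes through the repository** | Internal structure is encapsulated; only safe methods are exposed | + +--- + +### Every Command signature changes + +```java +// W9 +public void execute(String[] args, List<Article>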
articles); + +// W10 +public void execute(String[] args, ArticleRepository repository); +``` + +> Semantic shift: from "here is the data, do whatever you like" → "here is a safe access channel" + +--- + +## 5️⃣ Tying the Architecture Together + +### The full journey of one `crawl` command + +``` +User types "crawl https://blog.example.com" + ↓ +ConsoleView reads the input + ↓ +Controller routes → CrawlCommand + ↓ +StrategyFactory.getStrategy(url) → BlogStrategy + ↓ +Jsoup fetches → Document + ↓ +BlogStrategy.parse(url, doc) → List<Article>
+ ↓ +Repository.add() stores the articles + ↓ +ConsoleView prints the success message +``` + +--- + +### Architecture Overview + +![mvc-strategy-repo](/api/v1/attachments/8 "width=70% center") + +```mermaid +flowchart TD + User(["👤 User input<br/>
crawl https://blog.example.com"]) --> View + + subgraph View["🎨 View layer (ConsoleView)"] + ReadLine["readLine()"] + Display["display() / printSuccess()"] + end + + ReadLine --> Controller + + subgraph Controller["🧭 Controller layer"] + Router["CrawlerController<br/>
Map routing"] + end + + Router --> Command + + subgraph Command["⚡ Command layer"] + CrawlCmd["CrawlCommand<br/>
(dispatcher)"] + end + + CrawlCmd --> Factory + + subgraph Strategy["🧩 Strategy layer"] + Factory["StrategyFactory<br/>
(automatic matching)"] + StrategyI["<<interface>> CrawlStrategy"] + BlogS["BlogStrategy"] + NewsS["NewsStrategy"] + Factory --> StrategyI --> BlogS + StrategyI --> NewsS + end + + BlogS --> Repository + + subgraph Repository["🔐 Repository layer"] + Repo["ArticleRepository<br/>
(add / getAll)"] + RepoList["List<Article><br/>
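(private)"] + Repo --> RepoList + end + + RepoList --> Model + + subgraph Model["📦 Model layer"] + Article["Article"] + end + + CrawlCmd --> Display + Repository --> Display +``` + +> 🗺️ Every layer has a clear responsibility, and every extension means adding code rather than modifying it + +--- + +## 6️⃣ Code Walkthrough (Step-by-Step Upgrade) + +### Change list for upgrading from W9 to W10 + +1. Create the `strategy/` package → `CrawlStrategy` interface +2. Implement `BlogStrategy` and `NewsStrategy` +3. Implement `StrategyFactory` +4. Create the `repository/` package → `ArticleRepository` +5. Change the `Command` interface signature +6. Rewrite `CrawlCommand` +7. Adjust every other `Command` +8. Adjust the `Controller` and `App.java` + +--- + +### Key code demos + +- How `Collections.unmodifiableList()` is used +- The lookup loop in `StrategyFactory.getStrategy()` +- `CrawlCommand`: from "hard-coded parsing" to "dispatch and assembly" + +```java +// One example change +for (Article a : parsed) { + repository.add(a); // old: articles.add(a); +} +``` + +--- + +### Spot the flaw + +- `StrategyFactory` returns `null` when no strategy matches +- `CrawlCommand` checks for `null` and reports an error +- Is there a more elegant way to avoid the `null` check? + +> 🔍 After class, use AI to explore the "Null Object pattern" as a first step + +--- + +## 7️⃣ Architecture Review + Next Week's Preview + +### Weak points in the current architecture + +- ❌ Exception handling is coarse and undifferentiated +- ❌ No retry mechanism +- ❌ No control over network timeouts +- ❌ Logs go only to the terminal + +--- + +### W11 goal: robustness engineering + +- ✅ **Custom exception hierarchy**: turn "something went wrong" into specific business exceptions +- ✅ **Engineering-grade logging**: record who did what, and when +- ✅ **Defensive programming + retries**: a network hiccup is no longer fatal + +> W9 built the skeleton → W10 adds the armor → W11 makes it able to take a beating + +--- + +## 8️⃣ Hands-On Tasks (In Class) + +### Required + +1. Upgrade the W9 project to W10 +2. Implement at least 2 CrawlStrategy classes (mocks are fine) +3. Implement `StrategyFactory` and `ArticleRepository` +4. Test the full `crawl` → `list` flow + +### Acceptance criteria + +- [ ] Adding a strategy means only a new class + registration, with zero changes to existing code +- [ ] `getAll()` returns an unmodifiable view +- [ ] `CrawlCommand` contains no site-specific parsing +- [ ] Every Command uses the Repository +- [ ] No code directly manipulates `List<Article>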
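` + +--- + +## 9️⃣ Homework + +### Required + +1. Complete `ArticleRepository`: add `addAll` and defend against null +2. **★ AnalyzeCommand**: reuse the strategies to parse without storing, and print statistics +3. **AI architecture audit**: send the class signatures to an AI and have it check strategy decoupling and encapsulation + +### Optional + +- Regex-based strategy matching, a default strategy, strategy priorities +- Question to consider: what happens when two strategies both `supports` the same URL? + +--- + +## 🤖 Leveling Up with AI + +### Architecture auditor (required) + +- Draw the class dependency diagram +- Send it to an AI: "Check how well the open-closed principle is met, how complete the Repository encapsulation is, and whether any circular dependencies exist" + +### Going further + +- What is the difference between skipping the factory and storing the strategies directly in a `Map`, versus using `StrategyFactory`? + +--- + +## 📚 Summary + +- ✅ Strategy pattern: pluggable algorithms, painless new websites +- ✅ Factory: automatic matching, the URL → strategy magic +- ✅ Repository: a data guard; rules move from verbal agreement to code enforcement +- ✅ Architecture: from "split apart" to "elegantly joined", open for extension and closed for modification + +### W11 preview + +Custom exception hierarchy + logging + retry mechanism + +> 🚀 Make the crawler we built stand up to the real world + +--- + +## Thank you! + +**Keep your engineering standards high. See you next week!** + +--- + +# Centered Title + +## Centered Subtitle + +### Centered Content + +--- \ No newline at end of file diff --git a/w11/java1/java-cli/pom.xml b/w11/java1/java-cli/pom.xml new file mode 100644 index 0000000..39e0eb1 --- /dev/null +++ b/w11/java1/java-cli/pom.xml @@ -0,0 +1,57 @@ +<project xmlns="http://maven.apache.org/POM/4.0.0" + xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" + xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> + <modelVersion>4.0.0</modelVersion> + <groupId>com.example</groupId> + <artifactId>datacollect-cli</artifactId> + <version>0.1.0</version> + <properties> + <maven.compiler.source>11</maven.compiler.source> + <maven.compiler.target>11</maven.compiler.target> + </properties> + <dependencies> + <dependency> + <groupId>org.jsoup</groupId> + <artifactId>jsoup</artifactId> + <version>1.17.2</version> + </dependency> + <dependency> + <groupId>ch.qos.logback</groupId> + <artifactId>logback-classic</artifactId> + <version>1.4.11</version> + </dependency> + </dependencies> + <build> + <plugins> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-compiler-plugin</artifactId> + <version>3.8.1</version> + </plugin> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-assembly-plugin</artifactId> + <version>3.3.0</version> + <configuration> + <archive> + <manifest> + <mainClass>com.example.datacollect.Main</mainClass> + </manifest> + </archive> + <descriptorRefs> + <descriptorRef>jar-with-dependencies</descriptorRef> + </descriptorRefs> + </configuration> + <executions> + <execution> + <id>make-assembly</id> + <phase>package</phase> + <goals> + <goal>single</goal> + </goals> + </execution> + </executions> + </plugin> + </plugins> + </build> +</project> diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/Main.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/Main.java new file mode 100644 index 0000000..385911b --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/Main.java @@ -0,0 +1,113 @@ +package com.example.datacollect; + +import com.example.datacollect.controller.CrawlerController; +import com.example.datacollect.exception.ParseException; +import com.example.datacollect.model.Article; +import com.example.datacollect.repository.ArticleRepository; +import com.example.datacollect.strategy.CrawlStrategy; +import com.example.datacollect.strategy.StrategyFactory; +import 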
com.example.datacollect.view.ConsoleView; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class Main { + private static final Logger logger = LoggerFactory.getLogger(Main.class); + + public static void main(String[] args) { + logger.info("Starting CLI Crawler application"); + + ConsoleView view = new ConsoleView(); + ArticleRepository repository = new ArticleRepository(); + StrategyFactory strategyFactory = new StrategyFactory(); + + if (args.length > 0 && "-test".equals(args[0])) { + logger.info("Running in test mode"); + runTest(view, repository, strategyFactory); + return; + } + + CrawlerController controller = new CrawlerController(view, repository, strategyFactory); + + view.printSuccess("Welcome to CLI Crawler (w10_3)! Type help for commands."); + logger.info("Application ready, waiting for input"); + + while (true) { + controller.handle(view.readLine()); + } + } + + private static void runTest(ConsoleView view, ArticleRepository repository, StrategyFactory strategyFactory) { + strategyFactory.register(new MockStrategy()); + + CrawlerController controller = new CrawlerController(view, repository, strategyFactory); + + view.printSuccess("=== Testing the full crawl → list flow ==="); + + view.printInfo("\n1. Testing the empty-list state:"); + controller.handle("list"); + + view.printInfo("\n2. Testing an invalid URL (no matching strategy):"); + controller.handle("crawl https://unknown.example.com"); + + view.printInfo("\n3. Testing a crawl of mock.example.com:"); + controller.handle("crawl https://mock.example.com"); + + view.printInfo("\n4. Testing that list shows the crawled results:"); + controller.handle("list"); + + view.printInfo("\n5. Testing a crawl of blog.example.com:"); + controller.handle("crawl https://blog.example.com"); + + view.printInfo("\n6. Testing that list shows the accumulated results:"); + controller.handle("list"); + + view.printInfo("\n7. Testing that getAll() returns an unmodifiable view:"); + testUnmodifiableView(repository); + + view.printInfo("\n8. 
Testing Repository defensive checks:"); + testRepositoryDefense(repository); + + view.printSuccess("\n=== Tests complete ==="); + } + + private static void testUnmodifiableView(ArticleRepository repository) { + try { + repository.getAll().add(new Article("Test", "http://test.com", "")); + System.out.println("ERROR: expected UnsupportedOperationException to be thrown"); + } catch (UnsupportedOperationException e) { + System.out.println("SUCCESS: getAll() returns an unmodifiable view and threw the expected exception"); + } + } + + private static void testRepositoryDefense(ArticleRepository repository) { + try { + repository.add(null); + System.out.println("ERROR: expected NullPointerException to be thrown"); + } catch (NullPointerException e) { + System.out.println("SUCCESS: adding a null article threw the expected exception"); + } + + try { + repository.add(new Article("", "http://test.com", "")); + System.out.println("ERROR: expected IllegalArgumentException to be thrown"); + } catch (IllegalArgumentException e) { + System.out.println("SUCCESS: adding an article with an empty title threw the expected exception"); + } + } + + public static class MockStrategy implements CrawlStrategy { + @Override + public boolean supports(String url) { + return url != null && url.contains("mock.example.com"); + } + + @Override + public java.util.List<Article>
parse(String url, org.jsoup.nodes.Document doc) throws ParseException { + java.util.List<Article>
articles = new java.util.ArrayList<>(); + articles.add(new Article("Mock article 1", url + "/article1", "Mock content 1")); + articles.add(new Article("Mock article 2", url + "/article2", "Mock content 2")); + articles.add(new Article("Mock article 3", url + "/article3", "Mock content 3")); + return articles; + } + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/command/Command.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/Command.java new file mode 100644 index 0000000..029cadc --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/Command.java @@ -0,0 +1,8 @@ +package com.example.datacollect.command; + +import com.example.datacollect.repository.ArticleRepository; + +public interface Command { + String getName(); + void execute(String[] args, ArticleRepository repository); +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/command/CrawlCommand.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/CrawlCommand.java new file mode 100644 index 0000000..35e86d3 --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/CrawlCommand.java @@ -0,0 +1,96 @@ +package com.example.datacollect.command; + +import com.example.datacollect.exception.ParseException; +import com.example.datacollect.repository.ArticleRepository; +import com.example.datacollect.strategy.CrawlStrategy; +import com.example.datacollect.strategy.StrategyFactory; +import com.example.datacollect.view.ConsoleView; +import org.jsoup.Jsoup; +import org.jsoup.nodes.Document; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class CrawlCommand implements Command { + private static final Logger logger = LoggerFactory.getLogger(CrawlCommand.class); + private static final int MAX_RETRY = 3; + private static final long RETRY_DELAY_MS = 1000; + + private final ConsoleView view; + private final StrategyFactory strategyFactory; + + public CrawlCommand(ConsoleView view, StrategyFactory 
strategyFactory) { + this.view = view; + this.strategyFactory = strategyFactory; + } + + @Override + public String getName() { + return "crawl"; + } + + @Override + public void execute(String[] args, ArticleRepository repository) { + if (args.length < 2) { + String errorMsg = "Usage: crawl <url>"; + logger.warn(errorMsg); + view.printError(errorMsg); + return; + } + String url = args[1]; + logger.info("Crawl started for: {}", url); + + CrawlStrategy strategy = strategyFactory.getStrategy(url); + if (strategy == null) { + String errorMsg = "No strategy found for: " + url; + logger.warn(errorMsg); + view.printError(errorMsg); + return; + } + + logger.info("Starting crawl for URL: {}", url); + view.printInfo("Crawling: " + url); + + Document doc = null; + int attempt = 0; + boolean success = false; + + while (attempt < MAX_RETRY && !success) { + attempt++; + try { + logger.debug("Attempt {} to fetch URL: {}", attempt, url); + doc = Jsoup.connect(url).get(); + success = true; + } catch (Exception e) { + logger.warn("Attempt {} failed for URL {}: {}", attempt, url, e.getMessage()); + if (attempt < MAX_RETRY) { + logger.info("Retrying in {}ms...", RETRY_DELAY_MS); + try { + Thread.sleep(RETRY_DELAY_MS); + } catch (InterruptedException ie) { + Thread.currentThread().interrupt(); + break; + } + } + } + } + + if (!success) { + String errorMsg = "Failed to fetch URL after " + MAX_RETRY + " attempts: " + url; + logger.error(errorMsg); + view.printError(errorMsg); + return; + } + + try { + var articles = strategy.parse(url, doc); + for (var article : articles) { + repository.add(article); + } + logger.info("Successfully crawled {} articles from {}", articles.size(), url); + view.printSuccess("Crawled " + articles.size() + " articles."); + } catch (ParseException e) { + logger.error("Parse error for URL {}: {}", url, e.getMessage(), e); + view.printError("Parse error: " + e.getMessage()); + } + } +} diff --git 
a/w11/java1/java-cli/src/main/java/com/example/datacollect/command/ExitCommand.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/ExitCommand.java new file mode 100644 index 0000000..51ee001 --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/ExitCommand.java @@ -0,0 +1,28 @@ +package com.example.datacollect.command; + +import com.example.datacollect.repository.ArticleRepository; +import com.example.datacollect.view.ConsoleView; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class ExitCommand implements Command { + private static final Logger logger = LoggerFactory.getLogger(ExitCommand.class); + + private final ConsoleView view; + + public ExitCommand(ConsoleView view) { + this.view = view; + } + + @Override + public String getName() { + return "exit"; + } + + @Override + public void execute(String[] args, ArticleRepository repository) { + logger.info("Exiting application"); + view.printSuccess("Bye!"); + System.exit(0); + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/command/HelpCommand.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/HelpCommand.java new file mode 100644 index 0000000..932c1db --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/HelpCommand.java @@ -0,0 +1,27 @@ +package com.example.datacollect.command; + +import com.example.datacollect.repository.ArticleRepository; +import com.example.datacollect.view.ConsoleView; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class HelpCommand implements Command { + private static final Logger logger = LoggerFactory.getLogger(HelpCommand.class); + + private final ConsoleView view; + + public HelpCommand(ConsoleView view) { + this.view = view; + } + + @Override + public String getName() { + return "help"; + } + + @Override + public void execute(String[] args, ArticleRepository repository) { + logger.debug("Displaying help"); + 
view.printInfo("Commands: crawl <url>, list, help, exit"); + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/command/ListCommand.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/ListCommand.java new file mode 100644 index 0000000..2b6998e --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/command/ListCommand.java @@ -0,0 +1,27 @@ +package com.example.datacollect.command; + +import com.example.datacollect.repository.ArticleRepository; +import com.example.datacollect.view.ConsoleView; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class ListCommand implements Command { + private static final Logger logger = LoggerFactory.getLogger(ListCommand.class); + + private final ConsoleView view; + + public ListCommand(ConsoleView view) { + this.view = view; + } + + @Override + public String getName() { + return "list"; + } + + @Override + public void execute(String[] args, ArticleRepository repository) { + logger.debug("Listing articles"); + view.display(repository.getAll()); + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/controller/CrawlerController.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/controller/CrawlerController.java new file mode 100644 index 0000000..a91901d --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/controller/CrawlerController.java @@ -0,0 +1,57 @@ +package com.example.datacollect.controller; + +import com.example.datacollect.command.Command; +import com.example.datacollect.command.CrawlCommand; +import com.example.datacollect.command.ExitCommand; +import com.example.datacollect.command.HelpCommand; +import com.example.datacollect.command.ListCommand; +import com.example.datacollect.repository.ArticleRepository; +import com.example.datacollect.strategy.StrategyFactory; +import com.example.datacollect.view.ConsoleView; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import 
java.util.HashMap; +import java.util.Map; + +public class CrawlerController { + private static final Logger logger = LoggerFactory.getLogger(CrawlerController.class); + + private final Map<String, Command> commands = new HashMap<>(); + private final ConsoleView view; + private final ArticleRepository repository; + + public CrawlerController(ConsoleView view, ArticleRepository repository, StrategyFactory strategyFactory) { + this.view = view; + this.repository = repository; + register(new HelpCommand(view)); + register(new ListCommand(view)); + register(new CrawlCommand(view, strategyFactory)); + register(new ExitCommand(view)); + logger.info("CrawlerController initialized with {} commands", commands.size()); + } + + private void register(Command command) { + commands.put(command.getName(), command); + logger.debug("Registered command: {}", command.getName()); + } + + public void handle(String input) { + String text = input == null ? "" : input.trim(); + if (text.isEmpty()) { + return; + } + + String[] args = text.split("\\s+"); + String cmdName = args[0].toLowerCase(); + Command command = commands.get(cmdName); + if (command == null) { + String errorMsg = "Unknown command: " + cmdName; + logger.warn(errorMsg); + view.printError(errorMsg); + return; + } + logger.debug("Executing command: {}", cmdName); + command.execute(args, repository); + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/exception/CrawlerException.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/exception/CrawlerException.java new file mode 100644 index 0000000..bde38fd --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/exception/CrawlerException.java @@ -0,0 +1,11 @@ +package com.example.datacollect.exception; + +public class CrawlerException extends Exception { + public CrawlerException(String message) { + super(message); + } + + public CrawlerException(String message, Throwable cause) { + super(message, cause); + } +} diff --git 
a/w11/java1/java-cli/src/main/java/com/example/datacollect/exception/NetworkException.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/exception/NetworkException.java new file mode 100644 index 0000000..b80f1bb --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/exception/NetworkException.java @@ -0,0 +1,11 @@ +package com.example.datacollect.exception; + +public class NetworkException extends CrawlerException { + public NetworkException(String message) { + super(message); + } + + public NetworkException(String message, Throwable cause) { + super(message, cause); + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/exception/ParseException.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/exception/ParseException.java new file mode 100644 index 0000000..ef4c5a1 --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/exception/ParseException.java @@ -0,0 +1,11 @@ +package com.example.datacollect.exception; + +public class ParseException extends CrawlerException { + public ParseException(String message) { + super(message); + } + + public ParseException(String message, Throwable cause) { + super(message, cause); + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/model/Article.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/model/Article.java new file mode 100644 index 0000000..147dbe6 --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/model/Article.java @@ -0,0 +1,45 @@ +package com.example.datacollect.model; + +public class Article { + private String title; + private String url; + private String content; + + public Article(String title, String url, String content) { + this.title = title; + this.url = url; + this.content = content; + } + + public String getTitle() { + return title; + } + + public void setTitle(String title) { + this.title = title; + } + + public String getUrl() { + return url; + } + + 
public void setUrl(String url) { + this.url = url; + } + + public String getContent() { + return content; + } + + public void setContent(String content) { + this.content = content; + } + + @Override + public String toString() { + return "Article{" + + "title='" + title + '\'' + + ", url='" + url + '\'' + + '}'; + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/repository/ArticleRepository.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/repository/ArticleRepository.java new file mode 100644 index 0000000..6e3577c --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/repository/ArticleRepository.java @@ -0,0 +1,80 @@ +package com.example.datacollect.repository; + +import com.example.datacollect.model.Article; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.Objects; + +public class ArticleRepository { + private static final Logger logger = LoggerFactory.getLogger(ArticleRepository.class); + + private final List<Article>
articles = new ArrayList<>(); + + public void add(Article article) { + Objects.requireNonNull(article, "Article cannot be null"); + + if (article.getTitle() == null || article.getTitle().trim().isEmpty()) { + logger.warn("Attempted to add article with empty title"); + throw new IllegalArgumentException("Article title cannot be null or empty"); + } + + if (article.getUrl() == null || article.getUrl().trim().isEmpty()) { + logger.warn("Attempted to add article with empty URL"); + throw new IllegalArgumentException("Article URL cannot be null or empty"); + } + + articles.add(article); + logger.debug("Added article: {}", article.getTitle()); + } + + public void addAll(List<Article>
articleList) { + Objects.requireNonNull(articleList, "Article list cannot be null"); + + if (articleList.isEmpty()) { + logger.debug("Empty article list provided, nothing to add"); + return; + } + + for (Article article : articleList) { + add(article); + } + logger.info("Added {} articles", articleList.size()); + } + + public List<Article>
getAll() { + List<Article>
result = Collections.unmodifiableList(articles); + logger.debug("Returning {} articles (unmodifiable)", result.size()); + return result; + } + + public Article get(int index) { + if (index < 0 || index >= articles.size()) { + logger.warn("Index out of bounds: {} (size: {})", index, articles.size()); + throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + articles.size()); + } + return articles.get(index); + } + + public int size() { + return articles.size(); + } + + public boolean isEmpty() { + return articles.isEmpty(); + } + + public void clear() { + int size = articles.size(); + articles.clear(); + logger.info("Cleared {} articles", size); + } + + public boolean contains(Article article) { + Objects.requireNonNull(article, "Article cannot be null"); + return articles.contains(article); + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/BlogStrategy.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/BlogStrategy.java new file mode 100644 index 0000000..40ccabb --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/BlogStrategy.java @@ -0,0 +1,43 @@ +package com.example.datacollect.strategy; + +import com.example.datacollect.exception.ParseException; +import com.example.datacollect.model.Article; +import org.jsoup.nodes.Document; +import org.jsoup.nodes.Element; +import org.jsoup.select.Elements; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +public class BlogStrategy implements CrawlStrategy { + private static final Logger logger = LoggerFactory.getLogger(BlogStrategy.class); + + @Override + public boolean supports(String url) { + boolean supported = url != null && url.contains("blog.example.com"); + logger.debug("BlogStrategy supports URL {}: {}", url, supported); + return supported; + } + + @Override + public List<Article>
parse(String url, Document doc) throws ParseException { + try { + logger.debug("Parsing blog page: {}", url); + List<Article>
articles = new ArrayList<>(); + Elements titles = doc.select(".post-title"); + logger.debug("Found {} titles", titles.size()); + + for (Element e : titles) { + articles.add(new Article(e.text(), url, "")); + } + + logger.info("Parsed {} articles from blog", articles.size()); + return articles; + } catch (Exception e) { + logger.error("Failed to parse blog page: {}", e.getMessage(), e); + throw new ParseException("Failed to parse blog page", e); + } + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/CrawlStrategy.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/CrawlStrategy.java new file mode 100644 index 0000000..71dab50 --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/CrawlStrategy.java @@ -0,0 +1,12 @@ +package com.example.datacollect.strategy; + +import com.example.datacollect.exception.ParseException; +import com.example.datacollect.model.Article; +import org.jsoup.nodes.Document; + +import java.util.List; + +public interface CrawlStrategy { + List<Article>
parse(String url, Document doc) throws ParseException; + boolean supports(String url); +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/HnuNewsStrategy.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/HnuNewsStrategy.java new file mode 100644 index 0000000..74d6bc6 --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/HnuNewsStrategy.java @@ -0,0 +1,65 @@ +package com.example.datacollect.strategy; + +import com.example.datacollect.exception.ParseException; +import com.example.datacollect.model.Article; +import org.jsoup.nodes.Document; +import org.jsoup.nodes.Element; +import org.jsoup.select.Elements; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +public class HnuNewsStrategy implements CrawlStrategy { + private static final Logger logger = LoggerFactory.getLogger(HnuNewsStrategy.class); + + @Override + public boolean supports(String url) { + boolean supported = url != null && url.contains("news.hnu.edu.cn"); + logger.debug("HnuNewsStrategy supports URL {}: {}", url, supported); + return supported; + } + + @Override + public List<Article>
parse(String url, Document doc) throws ParseException { + try { + logger.debug("Parsing Hnu news page: {}", url); + List<Article>
articles = new ArrayList<>(); + Elements listItems = doc.select("ul.list11 li"); + logger.debug("Found {} list items", listItems.size()); + + for (Element li : listItems) { + Element link = li.selectFirst("a"); + if (link == null) continue; + + String articleUrl = link.attr("href"); + if (!articleUrl.startsWith("http")) { + articleUrl = "https://news.hnu.edu.cn" + articleUrl.replace("..", ""); + } + + String title = ""; + Element titleEl = link.selectFirst("h4.l2.h4s2"); + if (titleEl != null) { + title = titleEl.text().trim(); + } + + String content = ""; + Element contentEl = link.selectFirst("p.l3.ps3"); + if (contentEl != null) { + content = contentEl.text().trim(); + } + + if (!title.isEmpty()) { + articles.add(new Article(title, articleUrl, content)); + } + } + + logger.info("Parsed {} articles from Hnu news", articles.size()); + return articles; + } catch (Exception e) { + logger.error("Failed to parse Hnu news page: {}", e.getMessage(), e); + throw new ParseException("Failed to parse Hnu news page", e); + } + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/NewsStrategy.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/NewsStrategy.java new file mode 100644 index 0000000..7117197 --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/NewsStrategy.java @@ -0,0 +1,43 @@ +package com.example.datacollect.strategy; + +import com.example.datacollect.exception.ParseException; +import com.example.datacollect.model.Article; +import org.jsoup.nodes.Document; +import org.jsoup.nodes.Element; +import org.jsoup.select.Elements; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +public class NewsStrategy implements CrawlStrategy { + private static final Logger logger = LoggerFactory.getLogger(NewsStrategy.class); + + @Override + public boolean supports(String url) { + boolean supported = url != null && 
url.contains("news.example.com"); + logger.debug("NewsStrategy supports URL {}: {}", url, supported); + return supported; + } + + @Override + public List
<Article> parse(String url, Document doc) throws ParseException { + try { + logger.debug("Parsing news page: {}", url); + List<Article>
articles = new ArrayList<>(); + Elements items = doc.select(".article-headline"); + logger.debug("Found {} headlines", items.size()); + + for (Element e : items) { + articles.add(new Article(e.text(), url, "")); + } + + logger.info("Parsed {} articles from news", articles.size()); + return articles; + } catch (Exception e) { + logger.error("Failed to parse news page: {}", e.getMessage(), e); + throw new ParseException("Failed to parse news page", e); + } + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/StrategyFactory.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/StrategyFactory.java new file mode 100644 index 0000000..89a4815 --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/strategy/StrategyFactory.java @@ -0,0 +1,55 @@ +package com.example.datacollect.strategy; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; +import java.util.Objects; + +public class StrategyFactory { + private static final Logger logger = LoggerFactory.getLogger(StrategyFactory.class); + + private final List<CrawlStrategy> strategies = new ArrayList<>(); + + public StrategyFactory() { + strategies.add(new HnuNewsStrategy()); + strategies.add(new BlogStrategy()); + strategies.add(new NewsStrategy()); + logger.info("StrategyFactory initialized with {} strategies", strategies.size()); + } + + public CrawlStrategy getStrategy(String url) { + Objects.requireNonNull(url, "URL cannot be null"); + + if (url.trim().isEmpty()) { + logger.warn("Empty URL provided"); + return null; + } + + for (CrawlStrategy s : strategies) { + if (s.supports(url)) { + logger.debug("Found strategy {} for URL: {}", s.getClass().getSimpleName(), url); + return s; + } + } + logger.warn("No strategy found for URL: {}", url); + return null; + } + + public void register(CrawlStrategy strategy) { + Objects.requireNonNull(strategy, "Strategy cannot be null"); + + if 
(strategies.contains(strategy)) { + logger.warn("Strategy {} already registered", strategy.getClass().getSimpleName()); + return; + } + + strategies.add(strategy); + logger.info("Registered strategy: {}", strategy.getClass().getSimpleName()); + } + + public int getStrategyCount() { + return strategies.size(); + } +} diff --git a/w11/java1/java-cli/src/main/java/com/example/datacollect/view/ConsoleView.java b/w11/java1/java-cli/src/main/java/com/example/datacollect/view/ConsoleView.java new file mode 100644 index 0000000..afdc912 --- /dev/null +++ b/w11/java1/java-cli/src/main/java/com/example/datacollect/view/ConsoleView.java @@ -0,0 +1,52 @@ +package com.example.datacollect.view; + +import com.example.datacollect.model.Article; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.List; +import java.util.Scanner; + +public class ConsoleView { + private static final Logger logger = LoggerFactory.getLogger(ConsoleView.class); + + private static final String ANSI_RESET = "\u001B[0m"; + private static final String ANSI_GREEN = "\u001B[32m"; + private static final String ANSI_RED = "\u001B[31m"; + private static final String ANSI_BLUE = "\u001B[34m"; + + private final Scanner scanner = new Scanner(System.in); + + public String readLine() { + logger.debug("Reading input from console"); + System.out.print("> "); + return scanner.nextLine(); + } + + public void printSuccess(String msg) { + logger.info("Success: {}", msg); + System.out.println(ANSI_GREEN + msg + ANSI_RESET); + } + + public void printError(String msg) { + logger.error("Error: {}", msg); + System.out.println(ANSI_RED + msg + ANSI_RESET); + } + + public void printInfo(String msg) { + logger.debug("Info: {}", msg); + System.out.println(ANSI_BLUE + msg + ANSI_RESET); + } + + public void display(List<Article>
articles) { + logger.debug("Displaying {} articles", articles.size()); + if (articles.isEmpty()) { + printInfo("暂无文章,请先执行 crawl。"); + return; + } + for (int i = 0; i < articles.size(); i++) { + Article a = articles.get(i); + System.out.println((i + 1) + ". " + a.getTitle() + " | " + a.getUrl()); + } + } +} diff --git a/w11/java1/java-cli/src/main/resources/logback.xml b/w11/java1/java-cli/src/main/resources/logback.xml new file mode 100644 index 0000000..e31e311 --- /dev/null +++ b/w11/java1/java-cli/src/main/resources/logback.xml @@ -0,0 +1,27 @@ + + + + + + %d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n + + + + + ${LOG_PATH}/crawler.log + + ${LOG_PATH}/crawler.%d{yyyy-MM-dd}.log + 30 + + + %d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n + + + + + + + + + + diff --git a/w11/java1/java-cli/target/classes/logback.xml b/w11/java1/java-cli/target/classes/logback.xml new file mode 100644 index 0000000..e31e311 --- /dev/null +++ b/w11/java1/java-cli/target/classes/logback.xml @@ -0,0 +1,27 @@ + + + + + + %d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n + + + + + ${LOG_PATH}/crawler.log + + ${LOG_PATH}/crawler.%d{yyyy-MM-dd}.log + 30 + + + %d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n + + + + + + + + + + diff --git a/w11/java1/java-cli/target/maven-archiver/pom.properties b/w11/java1/java-cli/target/maven-archiver/pom.properties new file mode 100644 index 0000000..5c1de34 --- /dev/null +++ b/w11/java1/java-cli/target/maven-archiver/pom.properties @@ -0,0 +1,3 @@ +artifactId=datacollect-cli +groupId=com.example +version=0.1.0 diff --git a/w11/java1/java-cli/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst b/w11/java1/java-cli/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst new file mode 100644 index 0000000..5838b5c --- /dev/null +++ 
b/w11/java1/java-cli/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst @@ -0,0 +1,19 @@ +com\example\datacollect\command\ListCommand.class +com\example\datacollect\command\CrawlCommand.class +com\example\datacollect\strategy\BlogStrategy.class +com\example\datacollect\repository\ArticleRepository.class +com\example\datacollect\Main.class +com\example\datacollect\view\ConsoleView.class +com\example\datacollect\command\ExitCommand.class +com\example\datacollect\command\HelpCommand.class +com\example\datacollect\Main$MockStrategy.class +com\example\datacollect\strategy\NewsStrategy.class +com\example\datacollect\command\Command.class +com\example\datacollect\controller\CrawlerController.class +com\example\datacollect\exception\CrawlerException.class +com\example\datacollect\exception\NetworkException.class +com\example\datacollect\strategy\StrategyFactory.class +com\example\datacollect\strategy\HnuNewsStrategy.class +com\example\datacollect\exception\ParseException.class +com\example\datacollect\strategy\CrawlStrategy.class +com\example\datacollect\model\Article.class diff --git a/w11/java1/java-cli/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst b/w11/java1/java-cli/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst new file mode 100644 index 0000000..707f7d9 --- /dev/null +++ b/w11/java1/java-cli/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst @@ -0,0 +1,18 @@ +D:\java1\java-cli\src\main\java\com\example\datacollect\strategy\StrategyFactory.java +D:\java1\java-cli\src\main\java\com\example\datacollect\exception\CrawlerException.java +D:\java1\java-cli\src\main\java\com\example\datacollect\strategy\BlogStrategy.java +D:\java1\java-cli\src\main\java\com\example\datacollect\Main.java +D:\java1\java-cli\src\main\java\com\example\datacollect\repository\ArticleRepository.java 
+D:\java1\java-cli\src\main\java\com\example\datacollect\command\ExitCommand.java +D:\java1\java-cli\src\main\java\com\example\datacollect\strategy\CrawlStrategy.java +D:\java1\java-cli\src\main\java\com\example\datacollect\exception\ParseException.java +D:\java1\java-cli\src\main\java\com\example\datacollect\strategy\HnuNewsStrategy.java +D:\java1\java-cli\src\main\java\com\example\datacollect\command\CrawlCommand.java +D:\java1\java-cli\src\main\java\com\example\datacollect\command\ListCommand.java +D:\java1\java-cli\src\main\java\com\example\datacollect\view\ConsoleView.java +D:\java1\java-cli\src\main\java\com\example\datacollect\exception\NetworkException.java +D:\java1\java-cli\src\main\java\com\example\datacollect\strategy\NewsStrategy.java +D:\java1\java-cli\src\main\java\com\example\datacollect\model\Article.java +D:\java1\java-cli\src\main\java\com\example\datacollect\controller\CrawlerController.java +D:\java1\java-cli\src\main\java\com\example\datacollect\command\HelpCommand.java +D:\java1\java-cli\src\main\java\com\example\datacollect\command\Command.java diff --git a/w11/java1/java-cli/第10周——设计模式:灵活性与可扩展性.md b/w11/java1/java-cli/第10周——设计模式:灵活性与可扩展性.md new file mode 100644 index 0000000..9641102 --- /dev/null +++ b/w11/java1/java-cli/第10周——设计模式:灵活性与可扩展性.md @@ -0,0 +1,705 @@ +# 教案:《高级程序设计》第10周——设计模式:灵活性与可扩展性 + +| 项目 | 内容 | +| -------- | ---------------------------------------------------------------------------- | +| **课程名称** | 高级程序设计 | +| **周次** | 第10周 | +| **主题** | 设计模式——灵活性与可扩展性 | +| **学时** | 2学时(90分钟) | +| **授课对象** | 已完成第9周CLI+MVC架构学习,具备Command模式基础 | +| **教学环境** | JDK 17+、IntelliJ IDEA、Maven | +| **前情提要** | W9搭建了CLI骨架:MVC分层 + Command路由,但留下了两大隐患——解析逻辑耦合在Command中、List\共享引用裸奔 | + +--- + +## 教学调整说明:为什么W10要在“骨架”上装“盔甲”? 
+ +> **W9成果**:一个可扩展的命令行骨架 → **W9痛点**:解析器与数据存储仍在“裸奔” + +| 维度 | W9状态 | W10目标 | +|------|--------|---------| +| **架构** | MVC分层清晰 | MVC + 策略模式 + 仓库层 | +| **命令扩展** | 新增命令不改Controller | 新增解析器不改任何旧代码 | +| **数据安全** | List\全员可写 | Repository封装,只暴露安全接口 | +| **解析逻辑** | 硬编码在CrawlCommand内 | 策略模式,按URL自动匹配 | +| **代码量** | ~8个类 | ~12个类,但每个更小更纯粹 | + +**决策理由**: +1. W9学生已经感受到Command模式的好处——**多态带来的扩展性** +2. 策略模式是多态思想的又一次实战,是**接口抽象的深化** +3. 仓库层是“封装”这一OOP核心原则的落地,补上W9留下的课 +4. 解析器工厂让学生看到**“自动匹配”**的威力——增加网站支持只需新增一个类 + +**更深层的教育价值**: +> W9教会学生“怎么把代码分开”,W10要教会学生“怎么把代码分开后还能优雅地合上”——**接口即合同,工厂即自动匹配,仓库即数据守卫**。这三句话,就是本周的全部精华。 + +--- + +## 一、教学目标 + +| 目标维度 | 具体描述 | +|----------|----------| +| **知识掌握** | 理解策略模式的定义与多态本质;掌握工厂模式的两类变体(工厂方法/简单工厂)及适用场景;理解仓库模式对数据访问的封装原理。 | +| **工程实践** | 能在爬虫项目中用策略模式封装不同网站的解析逻辑;能实现解析器工厂,根据URL自动匹配解析策略;能用Repository模式替代裸List,提供安全的数据访问接口。 | +| **思维转型** | 从“写死逻辑”转向“策略可插拔”;从“直接操作集合”转向“通过仓库存取”;理解“对扩展开放,对修改关闭”的开闭原则。 | +| **工具应用** | 利用AI审查策略模式实现是否真正解耦;让AI扮演“网站结构分析师”辅助编写具体解析策略;用AI生成Repository的安全接口建议。 | + +--- + +## 二、教学重点与难点 + +| 项目 | 内容 | 突破方法 | +|------|------|----------| +| **重点** | 策略模式的多态本质、解析器工厂的自动匹配机制、Repository对数据访问的封装 | 以“新增网站需要改什么”为切入点,展示策略模式的开闭原则达成;通过“攻击”当前List裸奔的问题,引出Repository的必然性 | +| **难点** | 理解“接口即合同”的抽象思维、工厂模式中反射/Map注册的实现、仓库层与Strategy模式的协同 | 用“插座与电器”类比接口标准;现场演示从硬编码→工厂→反射的演进路径;用时序图展示“用户→Command→Strategy→Repository”的完整调用链 | + +--- + +## 三、教学过程设计(90分钟) + +| 环节 | 时间 | 教学内容 | 师生活动 | AI协同点 | +| -------------------------- | --- | ----------------------------------------------------------------- | -------------------------------------- | --------------------------- | +| **1. W9回顾与痛点暴露** | 8' | 回顾W9成果(CLI骨架),暴露两大隐患:①CrawlCommand里解析逻辑硬编码;②List\全员可读可写 | **教师演示**:展示W9代码,用“事故场景”引发思考 | — | +| **2. 策略模式:解析器的“插头标准化”** | 18' | 策略模式定义、接口设计、多态调用、与Command模式的对比 | **类比**:插座与电器;**教师演示**:从if-else到策略模式的演进 | 让AI生成“策略模式vs switch-case”对比 | +| **3. 解析器工厂:自动匹配的魔法** | 14' | 工厂模式的两种形态(简单工厂→Map注册工厂),解析器工厂实现 | **教师演示**:先用if-else判断host,再升级为Map注册工厂 | 让AI解释工厂模式与策略模式如何协同 | +| **4. 
Repository模式:武装数据访问** | 12' | Repository定义、接口设计、替换List\<Article>后的影响 | **教师演示**:在原代码中把List替换为Repository,展示改动点 | 学生用AI审计Repository接口的“最小完备性” | +| **5. 整体架构串联** | 8' | 用一张时序图串联:用户→CLI→Controller→Command→Strategy→Repository→Model | **师生互动**:让学生在白板上画出调用链 | — | +| **6. 代码落地** | 20' | 实现CrawlStrategy接口 + 两个策略 + 解析器工厂 + ArticleRepository | **教师演示**:分步写出代码,刻意埋入“策略匹配失败”的异常处理 | 完成后用AI检查策略模式实现 | +| **7. 架构反思与W11预告** | 5' | 当前架构还有什么隐患?(异常处理不统一、日志缺失)→ 预告W11健壮性工程 | **师生互动**:如果解析器工厂找不到匹配策略,会发生什么? | — | +| **8. 实践任务** | 5' | 实现策略模式和仓库层,完成本周代码升级 | 学生现场编码,教师巡视 | — | + +--- + +## 四、核心教学内容脚本 + +### 4.1 W9回顾与痛点暴露(8分钟) + +**教师口播**: +> "上节课我们搭了一个很漂亮的骨架——CLI+MVC+Command模式。我们先来表扬一下自己:新增一个命令,只要新建一个类,Controller零改动。但请大家想一个问题——" + +**投影展示W9的CrawlCommand存根**: +```java +public class CrawlCommand implements Command { + // ... + public void execute(String[] args, List<Article>
articles) { + if (args.length < 2) { + view.printError("Usage: crawl <url>"); + return; + } + view.printInfo("Stub: Would crawl " + args[1]); + } +} +``` + +**提问引导**: +1. "这个存根下周要填坑了。假设我们现在要真正实现爬取,代码写在哪?" +2. "如果我要支持两个网站——比如一个技术博客和一个新闻网站——它们的HTML结构完全不一样,这个`execute`方法会变成什么样?" + +**展示“噩梦版”CrawlCommand**: +```java +public void execute(String[] args, List<Article>
articles) { + String url = args[1]; + // 五十行if-else地狱... + if (url.contains("blog.example.com")) { + // 解析技术博客的HTML + Document doc = Jsoup.connect(url).get(); + Elements titles = doc.select(".post-title"); + for (Element e : titles) { + articles.add(new Article(e.text(), url, "")); + } + } else if (url.contains("news.example.com")) { + // 解析新闻网站的HTML + Document doc = Jsoup.connect(url).get(); + Elements items = doc.select(".article-headline"); + for (Element e : items) { + articles.add(new Article(e.text(), url, "")); + } + } else { + view.printError("Unsupported website!"); + } +} +``` + +**痛点提炼**: +> "看到了吗?每支持一个新网站,就要在这里加一个`else if`。这就是W1我们痛批的'牵一发而动全身',只不过这次灾难地点从`main`搬到了`CrawlCommand`。" +> +> "更重要的是,我们上节课辛辛苦苦实现了Command模式,难道解析逻辑又要回到if-else地狱吗?**这就是W10要解决的第一个问题:怎么让解析逻辑也可插拔?**" + +**第二个隐患——共享状态的回顾**: +> "还有一件事,我们上节课结束前提到的:`List
<Article> articles`在所有Command之间共享。任何一个Command都可以往里面塞东西、删东西、甚至清空。这是W10要解决的第二个问题:**怎么给数据装上'防盗门'?**" + +--- + +### 4.2 策略模式:解析器的“插头标准化”(18分钟) + +#### 4.2.1 从类比切入 + +**教师口播**: +> "先讲个生活场景。你家里墙上有一个三孔插座,你可以插电视、插电脑、插手机充电器——任何符合这个标准的电器都能用。插座不在乎你是什么电器,它只认接口标准。" + +**类比映射**: + +| 生活场景 | 代码对应 | +|----------|----------| +| 三孔插座 | `CrawlStrategy` 接口 | +| 电视/电脑充电器 | 具体解析策略(BlogStrategy/NewsStrategy) | +| 电流 | 输入:URL + Document;输出:List\<Article> | +| 你(使用者) | CrawlCommand | +| 插座面板 | 解析器工厂 | + +> "策略模式的核心思想就是:**定义一个算法接口,让具体的算法实现可以互相替换,而使用算法的客户端不受影响。**" + +#### 4.2.2 策略模式定义 + +```java +// src/main/java/com/crawler/strategy/CrawlStrategy.java +package com.crawler.strategy; + +import com.crawler.model.Article; +import org.jsoup.nodes.Document; +import java.util.List; + +public interface CrawlStrategy { + /** + * 从已获取的Document中解析文章列表 + * @param url 原始请求URL(用于填充Article) + * @param doc Jsoup解析后的Document + * @return 解析出的文章列表 + */ + List<Article>
parse(String url, Document doc); + + /** + * 判断此策略是否为给定URL服务 + * @param url 待判断的URL + * @return true表示此策略可以处理该URL + */ + boolean supports(String url); +} +``` + +**教师口播**: +> "注意,策略接口里有两个方法。`parse`是干活的那个,`supports`是'我能不能干这个活'——这是什么?**这是合同!** 任何网站想被我们爬虫支持,就必须签署这份合同:告诉我你是不是我的客户(supports),以及怎么解析你(parse)。" + +#### 4.2.3 具体策略实现示例 + +```java +// BlogStrategy.java - 技术博客解析策略 +public class BlogStrategy implements CrawlStrategy { + @Override + public boolean supports(String url) { + return url.contains("blog.example.com"); + } + + @Override + public List<Article>
parse(String url, Document doc) { + List<Article>
articles = new ArrayList<>(); + Elements titles = doc.select(".post-title"); + for (Element e : titles) { + articles.add(new Article(e.text(), url, "")); + } + return articles; + } +} + +// NewsStrategy.java - 新闻网站解析策略 +public class NewsStrategy implements CrawlStrategy { + @Override + public boolean supports(String url) { + return url.contains("news.example.com"); + } + + @Override + public List
<Article> parse(String url, Document doc) { + List<Article>
articles = new ArrayList<>(); + Elements items = doc.select(".article-headline"); + for (Element e : items) { + articles.add(new Article(e.text(), url, "")); + } + return articles; + } +} +``` + +**对比:策略模式 vs 硬编码if-else** + +| 维度 | if-else屎山 | 策略模式 | +|------|-------------|----------| +| 新增网站 | 改CrawlCommand,加else if | 新写一个类,实现CrawlStrategy | +| 修改解析逻辑 | 在CrawlCommand里翻找对应的else if | 只改对应策略类 | +| 测试 | 必须启动整个爬虫 | 单独对Strategy做单元测试 | +| 是否符合开闭原则 | ❌ 对修改开放 | ✅ 对扩展开放,对修改关闭 | + +**与Command模式的对比(加深理解)**: +> "上节课Command模式,我们为每个命令定义一个类;这节课策略模式,我们为每个网站的解析算法定义一个类。**本质上都是同一个OOP思想:用多态替代条件分支。** 只不过Command的接口是`execute()`,Strategy的接口是`parse()`。" +> +> "这张图你们可以记下来:**接口是消除if-else的利器,多态是接口的灵魂。**" + +--- + +### 4.3 解析器工厂:自动匹配的魔法(14分钟) + +#### 4.3.1 问题引出 + +**教师口播**: +> "现在我们有A网站的策略、B网站的策略。问题来了:谁来选策略?谁来遍历所有策略,找到一个supports返回true的?" +> +> "如果把这个逻辑写在CrawlCommand里,那策略模式就白用了——CrawlCommand还是得'知道'有哪些策略。我们要的是一个黑盒子:**把URL丢进去,自动弹出一个合适的解析器。**" + +#### 4.3.2 解析器工厂的实现 + +```java +// src/main/java/com/crawler/strategy/StrategyFactory.java +package com.crawler.strategy; + +import java.util.ArrayList; +import java.util.List; + +public class StrategyFactory { + private final List<CrawlStrategy> strategies = new ArrayList<>(); + + // 注册策略——新的网站只需在这里加一行 + public StrategyFactory() { + strategies.add(new BlogStrategy()); + strategies.add(new NewsStrategy()); + // 未来增加新网站:strategies.add(new XxxStrategy()); + } + + /** + * 根据URL自动匹配解析策略 + * @param url 目标URL + * @return 匹配的策略,如果没有匹配返回null + */ + public CrawlStrategy getStrategy(String url) { + for (CrawlStrategy s : strategies) { + if (s.supports(url)) { + return s; + } + } + return null; // 未找到匹配策略 + } +} +``` + +**教师口播**: +> "这个工厂类足够简单:一个List存所有策略,一个方法遍历找到匹配的。但简单不等于不强大。" +> +> **关键点**:新增网站支持,只需要—— +1. 写一个`XxxStrategy`实现`CrawlStrategy` +2. 
在工厂构造器里加一行`strategies.add(new XxxStrategy())` +> +> "CrawlCommand一行不改。这就是开闭原则的胜利。" + +#### 4.3.3 从简单工厂到更高级的注册机制(拓展思维) + +**教师口播**: +> "有同学可能会问:还要在工厂构造器里加一行,能不能做到完全零改动?当然可以——用反射或者SPI。" + +**演示概念(不要求实现)**: +```java +// 进阶思路:扫描指定包下的所有CrawlStrategy实现类 +// 用反射自动注册,真正做到“新增类即生效” +// 这是Spring框架的核心思想之一 +``` + +> "这个技术我们暂时不要求掌握,但我希望你们知道:你现在写的每一个`new XxxStrategy()`,在未来都可能进化为框架级别的自动装配。**你现在建立的思维习惯,决定了你未来能走多高。**" + +#### 4.3.4 重构后的CrawlCommand + +```java +public class CrawlCommand implements Command { + private ConsoleView view; + private StrategyFactory strategyFactory; + private ArticleRepository repository; // 注意:这里是Repository了! + + public CrawlCommand(ConsoleView v, StrategyFactory f, ArticleRepository r) { + this.view = v; + this.strategyFactory = f; + this.repository = r; + } + + public String getName() { return "crawl"; } + + public void execute(String[] args, List
<Article> articles) { + if (args.length < 2) { + view.printError("Usage: crawl <url>"); + return; + } + String url = args[1]; + + // 1. 工厂自动选策略 + CrawlStrategy strategy = strategyFactory.getStrategy(url); + if (strategy == null) { + view.printError("No strategy found for: " + url); + return; + } + + // 2. 抓取页面 + view.printInfo("Crawling: " + url); + try { + Document doc = Jsoup.connect(url).get(); + List<Article>
parsed = strategy.parse(url, doc); + + // 3. 通过仓库存入(而不是直接操作List) + for (Article a : parsed) { + repository.add(a); + } + view.printSuccess("Crawled " + parsed.size() + " articles."); + } catch (IOException e) { + view.printError("Failed to crawl: " + e.getMessage()); + } + } +} +``` + +**教师口播**: +> "注意这个CrawlCommand现在的职责:拿到URL → 交给工厂选策略 → 执行解析 → 交给仓库存储。**它自己在干什么?在调度!** 这就是上节课我们讲的Controller的'调度思维',现在向Command内部延伸了。" + +--- + +### 4.4 Repository模式:武装数据访问(12分钟) + +#### 4.4.1 问题重提 + +**教师口播**: +> "回到上节课结束时的那个问题:`List<Article>
`在所有Command之间共享。任何一个Command都可以做这些事——" +```java +articles.clear(); // 清空所有文章 +articles.add(null); // 塞入null +articles.remove(0); // 随意删除 +``` + +> "如果一个新同事接手开发,他不知道'不要动这个List'的潜规则,写了一个`articles.clear()`,你的`list`命令就突然什么都不显示了。**靠代码约定维护的秩序,早晚会被打破。我们需要实体的'规则'——代码层面的约束。**" + +#### 4.4.2 ArticleRepository的定义 + +```java +// src/main/java/com/crawler/repository/ArticleRepository.java +package com.crawler.repository; + +import com.crawler.model.Article; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; + +public class ArticleRepository { + private final List
<Article> articles = new ArrayList<>(); + + /** + * 添加一篇文章。注意:不接受null,这是代码层面的规则,不是口头约定。 + */ + public void add(Article article) { + if (article == null) { + throw new IllegalArgumentException("Article cannot be null"); + } + articles.add(article); + } + + /** + * 获取所有文章的只读视图 + * 调用者无法通过此返回值修改内部数据 + */ + public List<Article>
getAll() { + return Collections.unmodifiableList(articles); + } + + /** + * 获取文章数量 + */ + public int size() { + return articles.size(); + } + + /** + * 清空(仅管理员可调——下一篇:权限控制) + */ + public void clear() { + articles.clear(); + } +} +``` + +**教师口播**: +> "三个关键设计点——" +> +> - **add()拒绝null**:规则写在代码里,不是写在邮件里 +> - **getAll()返回不可修改的视图**:`Collections.unmodifiableList()`——调用者如果尝试add/remove,会**直接抛异常**,不是'悄悄的bug' +> - **ClearCommand要清空数据?调`repository.clear()`**,而不是直接操作List +> +> "这就是面向对象的第一课——封装。把数据藏起来,只暴露安全的方法。从'直接操作集合'到'通过仓库存取',是程序员成熟度的分水岭。" + +#### 4.4.3 仓库引入后的架构变化 + +**Command接口的execute方法调整**: + +```java +// 调整前(W9) +public interface Command { + String getName(); + void execute(String[] args, List
<Article> articles); +} + +// 调整后(W10) +public interface Command { + String getName(); + void execute(String[] args, ArticleRepository repository); +} +``` + +**教师口播**: +> "这个改动很小——把`List<Article>
`换成`ArticleRepository`。但语义完全不同:之前是'给你数据随便玩',现在是'给你一个安全的存取通道'。" + +**所有Command同步调整**: + +```java +// ListCommand.java - 调整后 +public class ListCommand implements Command { + private ConsoleView view; + public ListCommand(ConsoleView v) { this.view = v; } + public String getName() { return "list"; } + public void execute(String[] args, ArticleRepository repository) { + view.display(repository.getAll()); // 通过仓库获取数据 + } +} + +// ClearCommand.java(新增示例) +public class ClearCommand implements Command { + private ConsoleView view; + public ClearCommand(ConsoleView v) { this.view = v; } + public String getName() { return "clear"; } + public void execute(String[] args, ArticleRepository repository) { + repository.clear(); + view.printSuccess("All articles cleared."); + } +} +``` + +**Controller和main的调整**: + +```java +// App.java - 调整后 +public class App { + public static void main(String[] args) { + ConsoleView view = new ConsoleView(); + ArticleRepository repository = new ArticleRepository(); // 替代 List
<Article> + StrategyFactory factory = new StrategyFactory(); // 新增 + + CrawlerController controller = new CrawlerController(view, repository, factory); + + view.printSuccess("Welcome to CLI Crawler v2.0!"); + view.printInfo("Type 'help' for commands."); + + while (true) { + controller.handle(view.readLine()); + } + } +} +``` + +--- + +### 4.5 整体架构串联(8分钟) + +**教师口播**: +> "现在我们把所有部件串起来,看看一个`crawl https://blog.example.com`命令走过的完整路径。" + +**时序图(口述配白板绘制)**: +``` +用户输入 "crawl https://blog.example.com" + │ + ▼ +ConsoleView.readLine() + │ + ▼ +CrawlerController.handle("crawl https://blog.example.com") + │ Map查找 "crawl" → CrawlCommand + ▼ +CrawlCommand.execute(args, repository) + │ + ├─► StrategyFactory.getStrategy(url) + │ │ 遍历List<CrawlStrategy> + │ │ BlogStrategy.supports(url) → true! + │ ▼ + │ 返回 BlogStrategy + │ + ├─► Jsoup.connect(url).get() → Document + │ + ├─► BlogStrategy.parse(url, doc) → List<Article>
+ │ + └─► for each article: repository.add(article) + │ + ▼ + ArticleRepository.articles.add(article) + +最终:ConsoleView.printSuccess("Crawled N articles.") +``` + +**教师口播**: +> "七步调用,每一步职责清晰:View负责输入输出,Controller负责路由,Command负责调度,Factory负责匹配,Strategy负责解析,Repository负责存储。**没有哪个类干了两个人的活,也没有哪个类不知道自己的活是什么。**" +> +> "这就是工程化——不是把代码写得快,是把代码写得对。" + +--- + +### 4.6 代码落地(20分钟) + +**教师准备**:课前准备一份“W9升级到W10”的改动清单,现场演示关键改动。 + +**改动清单**: +1. 新建`strategy/`包,创建`CrawlStrategy`接口 +2. 新建`strategy/BlogStrategy.java` +3. 新建`strategy/NewsStrategy.java` +4. 新建`strategy/StrategyFactory.java` +5. 新建`repository/`包,创建`ArticleRepository.java` +6. 修改`Command`接口的`execute`签名 +7. 修改`CrawlCommand`,引入`StrategyFactory`和`ArticleRepository` +8. 修改其余所有`Command`实现类 +9. 修改`CrawlerController`构造器 +10. 修改`App.java` + +**教师演示关键步骤**(重点演示): +- `ArticleRepository`的`Collections.unmodifiableList()` +- `StrategyFactory`的遍历匹配逻辑 +- `CrawlCommand`重写后的调度结构 + +**刻意埋入的“找茬点”**: +> "我在`StrategyFactory.getStrategy()`里,如果没有匹配的策略就返回`null`。然后在`CrawlCommand`里检查null。这其实叫'null object pattern的前奏'——如果我不想让Command检查null,我应该怎么改工厂?大家带着这个问题用AI探究。" + +--- + +### 4.7 架构反思与W11预告(5分钟) + +**教师口播**: +> "现在我们的架构比W9强壮多了:解析逻辑可插拔,数据访问有守卫。但还有一些漏洞——" + +**逐一点破**: +1. **异常处理**:`CrawlCommand`用了一个笼统的`catch (IOException e)`,如果解析过程中抛出其他异常怎么办? +2. **网络超时**:如果目标网站3秒没响应,当前代码会一直等吗? +3. **日志缺失**:所有的成功/失败信息只输出到终端,如果程序半夜跑,第二天想看昨晚抓了多少——看不了。 +4. **重试机制**:如果一次失败就直接报错,要不要给个重试的机会? + +**W11预告**: +> "下周,我们会做三件事:**自定义异常体系**、**工程化日志框架**、**防御式编程与重试机制**。W9搭骨架,W10装盔甲,W11要让这个系统**经得起现实的毒打**。" + +--- + +### 4.8 实践任务(5分钟) + +**任务要求**: +1. 从W9代码出发,完成W10升级 +2. 实现至少两个`CrawlStrategy`(可以是模拟的,不要求真实爬取) +3. 实现`StrategyFactory`和`ArticleRepository` +4. 确保所有Command通过Repository访问数据 +5. 运行并测试完整流程 + +**验收标准**: +- [x] 新增策略类只需新建文件+工厂注册一行,其余代码零改动 +- [x] `ArticleRepository`的`getAll()`返回不可修改视图 +- [x] `CrawlCommand`不包含任何网站特定的解析逻辑 +- [x] `StrategyFactory`能根据URL自动匹配正确的策略 +- [x] 所有Command的`execute`方法签名已更新为`ArticleRepository` +- [x] 无任何地方直接操作`List
<Article>` + +--- + +## 五、课后作业 + +### 5.1 必做任务 + +1. **完善ArticleRepository**:增加`addAll(List<Article>
)`批量添加方法,注意防御null +2. **★ AnalyzeCommand(集大成作业)**: + - 实现`analyze <url>`命令 + - 内部调用`StrategyFactory`匹配策略 + - 调用策略解析文章后,**不存到Repository**,而是分析统计信息: + - 文章总数 + - 标题平均长度 + - 按某种规则排名的Top 5 + - 结果只输出,不存储 + - **提示**:这就是策略的复用——同一个解析策略,既能为`crawl`服务(存入仓库),也能为`analyze`服务(仅分析) + +3. **AI架构审计**:将完整代码的类图(或类名与方法签名列表)发给AI,指令: + > "作为Java架构审计师,请检查:①策略模式的实现是否正确解耦(CrawlCommand是否仍然包含网站特定逻辑);②Repository是否真正封装了数据访问(是否存在绕过Repository直接操作List的地方);③工厂的匹配逻辑是否存在性能隐患。请给出具体的改进建议。" + +### 5.2 选做任务 + +1. **正则策略匹配**:将`supports()`的判断从`url.contains()`改为正则表达式,让一个策略可以匹配一类URL +2. **默认策略(DefaultStrategy)**:当没有策略匹配时,提供一个通用的“标题提取”逻辑 +3. **策略优先级**:给每个策略加一个`priority`字段,工厂按优先级匹配(而不是按注册顺序) +4. **思考并回答(200字)**: + > "策略模式中,策略的`supports()`方法有可能让两个策略都返回true,这时该选哪个?`StrategyFactory`的遍历顺序会如何影响结果?你有什么解决方案?" + +### 5.3 思考题 + +1. **Repository与List的区别是什么?** 如果Repository只是包了一层List,为什么还要用? +2. **策略工厂的演进**:如果网站数量增加到100个,逐个注册的写法还合适吗?你想到什么解决方案? +3. **`Collections.unmodifiableList()`返回的是什么?** 它真的“不可修改”吗?如果原List被修改,这个不可修改视图会怎样? + +--- + +## 六、AI协同升级 + +### 架构审计师任务(必做) + +**学生执行步骤**: +1. 画出当前项目的类依赖图(手绘或工具生成) +2. 将类名和依赖关系发给AI +3. 输入指令: + > "作为Java架构审计师,请检查这个爬虫项目的架构。重点关注:①策略模式是否真正实现了开闭原则(增加新网站是否真的只需新增类);②Repository封装是否完整(是否有绕过Repository的路径);③是否存在循环依赖。请逐一指出问题并给出改进建议。" + +**预期AI输出**: +- 指出是否还存在“改一处影响多处”的耦合 +- 判断Repository的API设计是否完备 +- 评价整体架构的开闭原则达成度 + +### 进阶AI探究(选做) + +> "假设我有一个CrawlStrategy接口和10个实现类。不用工厂模式,直接用一个Map存起来,key是策略名称。这和StrategyFactory设计有什么本质区别?各自的优缺点是什么?" + +--- + +## 七、教学反思与调整记录 + +| 日期 | 事项 | 调整内容 | +|------|------|----------| +| 2026-05-01 | 首次编写 | 基于W9骨架,引入策略模式+工厂+Repository | +| 2026-05-07 | 结构优化 | 调整策略模式与工厂的讲解顺序,先策略后工厂更自然 | + +--- + +## 附录1:W9到W10改动对照表 + +| 改动项 | W9代码 | W10代码 | +|--------|--------|---------| +| 数据存储 | `List<Article>
articles` | `ArticleRepository repository` | +| Command接口 | `execute(String[], List<Article>
)` | `execute(String[], ArticleRepository)` | +| 解析逻辑位置 | `CrawlCommand`内部 | 各`CrawlStrategy`实现类 | +| URL匹配 | 无(硬编码) | `StrategyFactory.getStrategy(url)` | +| 数据添加 | `articles.add(article)` | `repository.add(article)` | +| 数据读取 | 直接遍历`articles` | `repository.getAll()` | + +## 附录2:常见问题速查 + +| 问题 | 解答 | +|------|------| +| 策略模式和Command模式有什么区别? | Command封装“动作”(做什么事),Strategy封装“算法”(怎么做)。在爬虫中:crawl是命令(动作),如何解析是策略(算法)。 | +| 工厂一定要叫Factory吗? | 不必须。但叫Factory意味着“创建对象”的职责,符合模式命名的惯例。 | +| `Collections.unmodifiableList()`有什么用? | 返回一个只读视图,调用add/remove等方法会抛`UnsupportedOperationException`。 | +| Repository和DAO有什么区别? | 在我们的上下文中可以视为同义词。严谨地说,Repository是领域驱动设计的概念,更偏向“集合语义”;DAO更偏数据库操作。 | +| 策略的`supports()`返回true但解析失败怎么办? | 那是策略实现的bug,该策略应修复。Factory不负责验证策略的正确性。 | + +## 附录3:教学逻辑说明 + +| 顺序 | 内容 | 设计理由 | +|------|------|----------| +| 1 | W9回顾+痛点暴露 | 承上启下,从已知问题引出新知识 | +| 2 | 策略模式 | 解决解析逻辑耦合问题,深化多态理解 | +| 3 | 解析器工厂 | 解决策略选择问题,引入工厂模式 | +| 4 | Repository模式 | 解决数据安全问题,实践封装原则 | +| 5 | 架构串联 | 将所有部件统一,形成完整心智模型 | +| 6 | 代码落地 | 实践验证,从“听懂”到“会做” | +| 7 | 架构反思+预告 | 暴露新问题,为W11健壮性工程铺垫 | + +--- + +## 版本说明 + +- **v1(本版)**:基于W9教案模式首次编写,包含策略模式、工厂模式、Repository模式的完整引入 \ No newline at end of file
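A note on the `Collections.unmodifiableList()` question raised in section 5.3 and in Appendix 2: the sketch below is a minimal, self-contained illustration and is not code from this commit. `TitleRepository` is a hypothetical stand-in for `ArticleRepository` that stores plain strings so it runs without the project's `Article` class, and `snapshot()` is an extra method added here only for contrast. It assumes JDK 10 or later for `List.copyOf`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ReadOnlyViewDemo {

    /** Hypothetical stand-in for the lesson's ArticleRepository; stores titles only. */
    public static class TitleRepository {
        private final List<String> titles = new ArrayList<>();

        public void add(String title) {
            if (title == null) {
                throw new IllegalArgumentException("Title cannot be null");
            }
            titles.add(title);
        }

        /** Read-only view backed by the live list (not a copy). */
        public List<String> getAll() {
            return Collections.unmodifiableList(titles);
        }

        /** Independent immutable copy; later repository changes are not reflected. */
        public List<String> snapshot() {
            return List.copyOf(titles);
        }
    }

    public static void main(String[] args) {
        TitleRepository repo = new TitleRepository();
        repo.add("first");

        List<String> view = repo.getAll();
        List<String> copy = repo.snapshot();

        // Writing through the view fails loudly instead of silently corrupting data.
        try {
            view.add("hacked");
        } catch (UnsupportedOperationException e) {
            System.out.println("write through view rejected");
        }

        // The view still reflects changes made through the repository itself.
        repo.add("second");
        System.out.println("view size = " + view.size()); // 2
        System.out.println("copy size = " + copy.size()); // 1
    }
}
```

Running `main` should show that writes through the view are rejected, while the view keeps tracking later `add` calls on the repository and the copy does not. Whether `getAll()` should hand out a live view or a snapshot is a design choice the lesson leaves open.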