# Chat API 返回内容区分设计文档

## 概述

本文档描述如何将 Chat API 的返回内容区分为"上文"、"引用"、"正文"、"深度思考"四个部分。

## 当前实现分析

### 现有返回结构

```typescript
{ type: 'content' | 'sources' | 'historyId'; data: any }
```

### 现有问题

1. **content 类型过于笼统**：混合了搜索状态、调试信息、AI正文等多种内容
2. **缺少上文返回**：历史对话被格式化后传给LLM，但没有返回给客户端
3. **缺少深度思考**：没有支持 reasoning/thinking 过程的返回

## 设计方案

### 新的返回类型定义

```typescript
export type ChatStreamType = 
  | 'history'      // 上文：历史对话内容
  | 'sources'      // 引用：搜索结果来源
  | 'content'      // 正文：AI生成的回答
  | 'thinking'     // 深度思考：推理过程
  | 'status'       // 状态：搜索进度、调试信息
  | 'historyId';   // 历史ID：聊天会话ID

export interface ChatStreamChunk {
  type: ChatStreamType;
  data: any;
}
```

### 各类型详细说明

#### 1. history - 上文

**用途**：返回历史对话内容，让客户端可以显示上下文

**数据结构**：
```typescript
{
  type: 'history',
  data: {
    messages: Array<{
      role: 'user' | 'assistant';
      content: string;
    }>;
  }
}
```

**触发时机**：在聊天开始时返回

#### 2. sources - 引用

**用途**：返回搜索结果来源，包含文件名、内容片段、相似度分数

**数据结构**：
```typescript
{
  type: 'sources',
  data: Array<{
    fileName: string;
    content: string;
    score: number;
    chunkIndex: number;
    fileId: string;
  }>
}
```

**触发时机**：在搜索完成后、生成正文前返回

#### 3. content - 正文

**用途**：返回AI生成的回答内容（纯文本）

**数据结构**：
```typescript
{
  type: 'content',
  data: string  // 流式返回的文本片段
}
```

**触发时机**：LLM生成过程中持续返回

#### 4. thinking - 深度思考

**用途**：返回AI的推理过程（如Chain-of-Thought）

**数据结构**：
```typescript
{
  type: 'thinking',
  data: string  // 流式返回的思考过程
}
```

**触发时机**：在正文之前返回（如果模型支持）

#### 5. status - 状态

**用途**：返回搜索进度、调试信息等

**数据结构**：
```typescript
{
  type: 'status',
  data: {
    stage: 'searching' | 'search_complete' | 'generating' | 'debug';
    message: string;
    details?: any;
  }
}
```

**触发时机**：在整个过程中穿插返回

#### 6. historyId - 历史ID

**用途**：返回聊天会话ID

**数据结构**：
```typescript
{
  type: 'historyId',
  data: string  // 会话ID
}
```

**触发时机**：在聊天开始时返回

## 需要修改的文件

### 后端文件

1. **server/src/chat/chat.service.ts**
   - 修改 `streamChat` 方法的返回类型
   - 区分不同类型的 content
   - 添加 history 类型返回
   - 添加 thinking 类型支持

2. **server/src/api/api-v1.controller.ts**
   - 更新 SSE 响应格式
   - 确保所有类型都被正确处理

3. **server/src/types.ts**
   - 添加新的类型定义

### 前端文件

4. **web/services/chatService.ts**
   - 更新流式响应处理逻辑
   - 区分不同类型的 chunk

5. **web/components/ChatInterface.tsx** 或相关组件
   - 更新UI渲染逻辑
   - 分别显示上文、引用、正文、深度思考

## 实现步骤

### 第一阶段：类型定义

1. 在 `server/src/types.ts` 中添加新的类型定义
2. 更新 `ChatStreamChunk` 接口

### 第二阶段：后端实现

3. 修改 `chat.service.ts` 中的 `streamChat` 方法
   - 添加 history 类型返回
   - 将 content 细分为 status、thinking、content
   - 保持 sources 不变

4. 更新 `api-v1.controller.ts` 确保正确处理新类型

### 第三阶段：前端实现

5. 更新 `chatService.ts` 处理新的流式响应
6. 更新聊天界面组件，分别渲染不同类型的 chunk

## 兼容性考虑

- 保持向后兼容：旧的 `content` 类型仍然可用
- 新增类型是可选的，不影响现有功能
- 前端可以逐步适配新类型

## 深度思考支持（健壮性设计）

### 设计原则

**不预判模型能力，而是动态检测响应内容**

优点：
- 无需维护模型列表
- 自动适配任何新模型
- 对不支持的模型零开销
- 始终保持健壮性

### 动态检测机制

#### 核心思路

在流式响应过程中，实时检测是否出现 thinking 标记：
- 如果检测到 → 解析并返回 `thinking` + `content`
- 如果未检测到 → 按普通 `content` 处理

#### 流式处理状态机

```typescript
enum StreamState {
  WAITING,           // 等待首个 chunk
  IN_THINKING,       // 正在接收 thinking 内容
  IN_CONTENT,        // 正在接收 content 内容
  COMPLETED          // 完成
}

interface StreamProcessor {
  state: StreamState;
  buffer: string;           // 缓冲区，用于检测标签
  thinkingContent: string;  // 累积的 thinking 内容
  contentContent: string;   // 累积的 content 内容
  thinkingDetected: boolean; // 是否检测到 thinking
}
```

#### 处理流程

```typescript
async function* processStream(stream: AsyncIterable<{content: string}>) {
  const processor: StreamProcessor = {
    state: StreamState.WAITING,
    buffer: '',
    thinkingContent: '',
    contentContent: '',
    thinkingDetected: false,
  };
  
  for await (const chunk of stream) {
    processor.buffer += chunk.content;
    
    // 状态机处理
    switch (processor.state) {
      case StreamState.WAITING:
        // 检测是否以 thinking 标签开头
        if (processor.buffer.includes('<thinking>')) {
          processor.state = StreamState.IN_THINKING;
          processor.thinkingDetected = true;
          // 提取 <thinking> 之后的内容
          const parts = processor.buffer.split('<thinking>');
          processor.buffer = parts[1] || '';
        } else if (processor.buffer.length > 100) {
          // 超过 100 字符仍未检测到 thinking，判定为普通内容
          processor.state = StreamState.IN_CONTENT;
          yield { type: 'content', data: processor.buffer };
          processor.buffer = '';
        }
        break;
        
      case StreamState.IN_THINKING:
        // 检测 thinking 结束标签
        if (processor.buffer.includes('</thinking>')) {
          const parts = processor.buffer.split('</thinking>');
          processor.thinkingContent += parts[0];
          processor.buffer = parts[1] || '';
          processor.state = StreamState.IN_CONTENT;
          
          // 返回完整的 thinking 内容
          yield { type: 'thinking', data: processor.thinkingContent };
          
          // 如果 buffer 中有内容，作为 content 开始
          if (processor.buffer) {
            yield { type: 'content', data: processor.buffer };
            processor.buffer = '';
          }
        } else {
          // 继续累积 thinking 内容
          // 为了流式显示，可以分批返回
          yield { type: 'thinking', data: chunk.content };
        }
        break;
        
      case StreamState.IN_CONTENT:
        // 直接返回 content
        yield { type: 'content', data: chunk.content };
        break;
    }
  }
  
  // 处理剩余 buffer
  if (processor.buffer) {
    if (processor.state === StreamState.IN_THINKING) {
      yield { type: 'thinking', data: processor.buffer };
    } else {
      yield { type: 'content', data: processor.buffer };
    }
  }
}
```

### 三种场景自动适配

#### 场景1：普通模型（不支持 thinking）

**检测过程**：
1. 收到首个 chunk
2. 等待 100 字符仍未检测到 `<thinking>`
3. 判定为普通内容

**返回**：
- 仅 `content` 类型
- 无 `thinking` 类型

#### 场景2：Reasoning 模型但未触发

**检测过程**：
1. 收到首个 chunk
2. 等待 100 字符仍未检测到 `<thinking>`
3. 判定为普通内容

**返回**：
- 仅 `content` 类型
- 无 `thinking` 类型

#### 场景3：Reasoning 模型且触发

**检测过程**：
1. 收到首个 chunk
2. 检测到 `<thinking>` 标签
3. 进入 thinking 状态
4. 收集直到 `</thinking>`
5. 之后的内容作为 content

**返回**：
- `thinking` 类型（思考过程）
- `content` 类型（正文）

### 多格式支持

支持多种 thinking 标签格式：

```typescript
const THINKING_PATTERNS = {
  // 格式1: <thinking>...</thinking>
  xml: {
    start: /<thinking>/i,
    end: /<\/thinking>/i,
  },
  // 格式2: ```thinking\n...\n```
  codeBlock: {
    start: /```thinking/i,
    end: /```/i,
  },
  // 格式3: [THINKING]...[/THINKING]
  bracket: {
    start: /\[THINKING\]/i,
    end: /\[\/THINKING\]/i,
  },
};

// 自动检测使用哪种格式
function detectThinkingFormat(buffer: string): keyof typeof THINKING_PATTERNS | null {
  for (const [format, pattern] of Object.entries(THINKING_PATTERNS)) {
    if (pattern.start.test(buffer)) {
      return format as keyof typeof THINKING_PATTERNS;
    }
  }
  return null;
}
```

### Thinking 内容解析

支持多种 thinking 格式：

```typescript
function extractThinking(response: string): { thinking: string; content: string } {
  // 格式1: <thinking>...</thinking>
  const thinkingTagMatch = response.match(/<thinking>([\s\S]*?)<\/thinking>/i);
  if (thinkingTagMatch) {
    const thinking = thinkingTagMatch[1].trim();
    const content = response.replace(/<thinking>[\s\S]*?<\/thinking>/i, '').trim();
    return { thinking, content };
  }
  
  // 格式2: ```thinking\n...\n```
  const codeBlockMatch = response.match(/```thinking\n([\s\S]*?)\n```/i);
  if (codeBlockMatch) {
    const thinking = codeBlockMatch[1].trim();
    const content = response.replace(/```thinking\n[\s\S]*?\n```/i, '').trim();
    return { thinking, content };
  }
  
  // 格式3: [THINKING]...[/THINKING]
  const bracketMatch = response.match(/\[THINKING\]([\s\S]*?)\[\/THINKING\]/i);
  if (bracketMatch) {
    const thinking = bracketMatch[1].trim();
    const content = response.replace(/\[THINKING\][\s\S]*?\[\/THINKING\]/i, '').trim();
    return { thinking, content };
  }
  
  // 未检测到 thinking，返回原内容
  return { thinking: '', content: response };
}
```

### 前端健壮性处理

前端需要能够处理所有情况：

```typescript
// 流式响应处理
function handleStreamChunk(chunk: ChatStreamChunk) {
  switch (chunk.type) {
    case 'history':
      setHistory(chunk.data.messages);
      break;
      
    case 'sources':
      setSources(chunk.data);
      break;
      
    case 'thinking':
      // 只有收到 thinking 类型时才显示深度思考区域
      setThinking(prev => prev + chunk.data);
      setShowThinking(true);
      break;
      
    case 'content':
      setContent(prev => prev + chunk.data);
      break;
      
    case 'status':
      setStatus(chunk.data);
      break;
  }
}

// 渲染时的健壮性检查
function renderChat() {
  return (
    <div>
      {/* 上文：只有有内容时才显示 */}
      {history.length > 0 && <HistorySection messages={history} />}
      
      {/* 引用：只有有内容时才显示 */}
      {sources.length > 0 && <SourcesSection sources={sources} />}
      
      {/* 深度思考：只有有内容时才显示 */}
      {showThinking && thinking.length > 0 && (
        <ThinkingSection thinking={thinking} />
      )}
      
      {/* 正文：始终显示 */}
      <ContentSection content={content} />
    </div>
  );
}
```

### 配置选项

在模型配置中添加 reasoning 支持选项：

```typescript
// 模型配置扩展
interface ModelConfig {
  // ... 现有字段
  
  // 新增：reasoning 配置
  reasoning?: {
    enabled: boolean;           // 是否启用 reasoning 解析
    format?: 'auto' | 'thinking_tag' | 'reasoning_field';  // 格式检测方式
  };
}
```

### 降级策略总结

| 模型类型 | 是否支持 | 是否触发 | 返回类型 | 前端显示 |
|----------|----------|----------|----------|----------|
| 普通模型（如 GPT-4） | ❌ | - | content | 仅正文 |
| Reasoning模型未触发 | ✅ | ❌ | content | 仅正文 |
| Reasoning模型已触发 | ✅ | ✅ | thinking + content | 深度思考 + 正文 |

### 错误处理

```typescript
// 解析 thinking 时的错误处理
try {
  const { thinking, content } = extractThinking(fullResponse);
  
  if (thinking) {
    yield { type: 'thinking', data: thinking };
  }
  
  yield { type: 'content', data: content };
} catch (error) {
  // 解析失败时，降级为普通 content
  logger.warn('Failed to parse thinking, falling back to content', error);
  yield { type: 'content', data: fullResponse };
}
```

## 测试计划

### 基础功能测试

1. 测试 history 类型正确返回历史对话
2. 测试 sources 类型正确返回引用来源
3. 测试 content 类型正确返回正文
4. 测试 status 类型正确返回状态信息

### 深度思考健壮性测试

5. **测试普通模型**：使用 GPT-4 等不支持 reasoning 的模型
   - 验证不返回 thinking 类型
   - 验证只返回 content 类型
   - 验证前端不显示深度思考区域

6. **测试 Reasoning 模型未触发**：使用 DeepSeek-R1 但提问简单问题
   - 验证不返回 thinking 类型
   - 验证只返回 content 类型

7. **测试 Reasoning 模型已触发**：使用 DeepSeek-R1 提问复杂问题
   - 验证返回 thinking 类型
   - 验证返回 content 类型
   - 验证前端正确显示深度思考区域

8. **测试解析容错**：模拟 malformed thinking 标签
   - 验证解析失败时降级为 content
   - 验证不抛出异常

### 前端渲染测试

9. 测试前端正确渲染所有类型
10. 测试前端在没有 thinking 时不显示深度思考区域
11. 测试前端在有 thinking 时正确显示深度思考区域