# AuraK

AuraK is a multi-tenant intelligent AI knowledge base platform. Built with React and NestJS, it is a full-stack RAG (Retrieval-Augmented Generation) system with external API support, RBAC, and tenant isolation.

## ✨ Features

- 🔐 **User System**: Complete user registration, login, and permission management
- 🤖 **Multi-Model Support**: OpenAI-compatible interfaces plus native Google Gemini support
- 📚 **Intelligent Knowledge Base**: Document upload, chunking, vectorization, and hybrid search
- 💬 **Streaming Chat**: Real-time display of processing status and generated content
- 🔍 **Citation Tracking**: Clear display of the source documents and related segments behind each answer
- 🌍 **Multi-Language Support**: Japanese, Chinese, and English for both the interface and AI responses
- 👁️ **Vision Capabilities**: Multimodal model support for image processing
- ⚙️ **Flexible Configuration**: Per-user API keys and customizable inference parameters
- 🎯 **Dual-Mode Processing**: Fast mode (Tika) and high-precision mode (Vision Pipeline)
- 💰 **Cost Management**: Per-user quotas and cost estimation

## 🏗️ Tech Stack

### Frontend

- **Framework**: React 19 + TypeScript + Vite
- **Styling**: Tailwind CSS
- **Icons**: Lucide React
- **State Management**: React Context

### Backend

- **Framework**: NestJS + TypeScript
- **AI Framework**: LangChain
- **Database**: SQLite (metadata) + Elasticsearch (vector storage)
- **File Processing**: Apache Tika + Vision Pipeline
- **Authentication**: JWT
- **Document Conversion**: LibreOffice + ImageMagick

## 🏢 Internal Network Deployment

This system supports deployment in internal networks.
Main modifications include:

- **External Resources**: KaTeX CSS is served from local resources instead of an external CDN
- **AI Models**: Internal AI model services can be configured, with no external API access required
- **Build Configuration**: Dockerfiles can be configured to use internal image registries

See the [Internal Deployment Guide](INTERNAL_DEPLOYMENT_GUIDE.md) for detailed configuration instructions.

## 🚀 Quick Start

### Prerequisites

- Node.js 18+
- Yarn
- Docker & Docker Compose

### 1. Clone the Project

```bash
git clone <repository-url>
cd simple-kb
```

### 2. Install Dependencies

```bash
yarn install
```

### 3. Start Basic Services

```bash
docker-compose up -d elasticsearch tika libreoffice
```

### 4. Configure Environment Variables

```bash
# Backend environment setup
cp server/.env.sample server/.env
# Edit server/.env (set API keys, etc.)

# Frontend environment setup
cp web/.env.example web/.env
# Edit web/.env (adjust frontend settings as needed)
```

See the comments in `server/.env.sample` and `web/.env.example` for detailed configuration options.

### 5. Start the Development Server

```bash
yarn dev
```

Open http://localhost:5173 to get started!

## 📖 User Guide

### 1. User Registration/Login

- Account registration is required on first use.
- Each user has an independent knowledge base and model settings.

### 2. AI Model Configuration

- Add AI models under "Model Management".
- Supports OpenAI, DeepSeek, Claude, and other OpenAI-compatible interfaces.
- Supports the native Google Gemini interface.
- Configure LLM, embedding, and rerank models.

### 3. Document Upload

- Supports many formats: PDF, Word, PPT, Excel, etc.
- Choose between Fast mode (text only) and High-precision mode (mixed image and text).
- Chunk size and overlap are adjustable per document.
- Select the embedding model used for vectorization.

### 4. Start Intelligent Q&A

- Ask questions grounded in your uploaded documents.
- Watch the search and generation process in real time.
- Check answer sources and related document fragments.
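The chunk size and overlap settings mentioned in document upload can be pictured as a sliding window over the text. The sketch below is illustrative only; `chunkText` is a hypothetical helper, and the actual pipeline (LangChain text splitters) may behave differently:

```typescript
// Hypothetical sketch of fixed-size chunking with overlap.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap >= chunkSize) {
    throw new Error("chunkSize must be positive and larger than overlap");
  }
  const chunks: string[] = [];
  // Each chunk starts `chunkSize - overlap` characters after the previous one,
  // so consecutive chunks share `overlap` characters of context.
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}

// Example: 10 characters, chunk size 4, overlap 2 -> chunks start at 0, 2, 4, 6, 8
const chunks = chunkText("abcdefghij", 4, 2);
```

A larger overlap preserves more context across chunk boundaries at the cost of indexing more (redundant) text.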
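During Q&A, retrieved segments are filtered by a similarity threshold and capped at Top K before being passed to the LLM. A minimal sketch of that selection step, assuming normalized scores in [0, 1] (`selectSegments` and `ScoredSegment` are illustrative names, not the project's API; the real retrieval path uses Elasticsearch hybrid search plus an optional rerank model):

```typescript
// Illustrative Top-K selection with a similarity threshold.
interface ScoredSegment {
  text: string;
  score: number; // similarity in [0, 1], higher is more relevant
}

function selectSegments(
  segments: ScoredSegment[],
  topK: number,
  threshold: number,
): ScoredSegment[] {
  return segments
    .filter((s) => s.score >= threshold) // drop low-relevance content
    .sort((a, b) => b.score - a.score)   // most relevant first
    .slice(0, topK);                     // keep at most Top K
}

const picked = selectSegments(
  [
    { text: "a", score: 0.91 },
    { text: "b", score: 0.42 },
    { text: "c", score: 0.77 },
  ],
  2,
  0.5,
);
// picked contains the segments scored 0.91 and 0.77
```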
## 🔧 Configuration Guide

### Model Settings

- **LLM Model**: Used for dialogue generation (e.g., GPT-4, Gemini-1.5-Pro)
- **Embedding Model**: Used for document vectorization (e.g., text-embedding-3-small)
- **Rerank Model**: Used for re-ranking search results (optional)

### Inference Parameters

- **Temperature**: Controls answer randomness (0-1)
- **Max Tokens**: Maximum output length
- **Top K**: Number of document segments to retrieve
- **Similarity Threshold**: Filters out low-relevance content

## 📁 Project Structure

```
simple-kb/
├── web/                    # Frontend application
│   ├── components/         # React components
│   ├── services/           # API services
│   ├── contexts/           # React Context
│   └── utils/              # Utility functions
├── server/                 # Backend application
│   ├── src/
│   │   ├── auth/           # Authentication module
│   │   ├── chat/           # Chat module
│   │   ├── knowledge-base/ # Knowledge base module
│   │   ├── model-config/   # Model configuration module
│   │   └── user/           # User module
│   └── data/               # Data storage
├── docs/                   # Project documentation
└── docker-compose.yml      # Docker configuration
```

## 📚 Documentation

- [System Design Document](docs/DESIGN.md)
- [Current Implementation Status](docs/CURRENT_IMPLEMENTATION.md)
- [API Documentation](docs/API.md)
- [Deployment Guide](docs/DEPLOYMENT.md)
- [RAG Feature Implementation](docs/rag_complete_implementation.md)

## 🐳 Docker Deployment

### Development Environment

```bash
# Start basic services
docker-compose up -d elasticsearch tika

# Local development
yarn dev
```

### Production Environment

```bash
# Build and start all services
docker-compose up -d
```

## 🤝 Contributing

1. Fork the project
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 📄 License

This project is provided under the MIT License. See the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments

- [LangChain](https://langchain.com/) - AI application development framework
- [NestJS](https://nestjs.com/) - Node.js backend framework
- [React](https://react.dev/) - Frontend UI framework
- [Elasticsearch](https://www.elastic.co/) - Search and analytics engine
- [Apache Tika](https://tika.apache.org/) - Document parsing toolkit

## 📞 Support

For questions or suggestions, please submit an [Issue](../../issues) or contact the maintainers.