# AuraK

AuraK is a multi-tenant intelligent AI knowledge base platform. Built with React and NestJS, it is a full-stack RAG (Retrieval-Augmented Generation) system with external API support, RBAC, and tenant isolation.

## ✨ Features

- 🔐 **User System**: Complete user registration, login, and permission management
- 🤖 **Multi-Model Support**: OpenAI-compatible interfaces plus native Google Gemini support
- 📚 **Intelligent Knowledge Base**: Document upload, chunking, vectorization, and hybrid search
- 💬 **Streaming Chat**: Real-time display of processing status and generated content
- 🔍 **Citation Tracking**: Clear display of the source documents and related segments behind each answer
- 🌍 **Multi-Language Support**: Japanese, Chinese, and English for both the interface and AI responses
- 👁️ **Vision Capabilities**: Multimodal model support for image processing
- ⚙️ **Flexible Configuration**: Per-user API keys and customizable inference parameters
- 🎯 **Dual-Mode Processing**: Fast mode (Tika) and high-precision mode (Vision Pipeline)
- 💰 **Cost Management**: Per-user quotas and cost estimation

## 🏗️ Tech Stack

### Frontend

- **Framework**: React 19 + TypeScript + Vite
- **Styling**: Tailwind CSS
- **Icons**: Lucide React
- **State Management**: React Context

### Backend

- **Framework**: NestJS + TypeScript
- **AI Framework**: LangChain
- **Database**: SQLite (metadata) + Elasticsearch (vector storage)
- **File Processing**: Apache Tika + Vision Pipeline
- **Authentication**: JWT
- **Document Conversion**: LibreOffice + ImageMagick

## 🏢 Internal Network Deployment

This system supports deployment in internal networks.
Main modifications include:

- **External Resources**: KaTeX CSS is served from local resources instead of an external CDN
- **AI Models**: Internal AI model services can be configured, with no external API access required
- **Build Configuration**: Dockerfiles can be configured to use internal image registries

See the [Internal Deployment Guide](INTERNAL_DEPLOYMENT_GUIDE.md) for detailed configuration instructions.

## 🚀 Quick Start

### Prerequisites

- Node.js 18+
- Yarn
- Docker & Docker Compose

### 1. Clone the Project

```bash
git clone <repository-url>
cd simple-kb
```

### 2. Install Dependencies

```bash
yarn install
```

### 3. Start Basic Services

```bash
docker-compose up -d elasticsearch tika libreoffice
```

### 4. Configure Environment Variables

```bash
# Backend environment setup
cp server/.env.sample server/.env
# Edit server/.env (set API keys, etc.)

# Frontend environment setup
cp web/.env.example web/.env
# Edit web/.env (adjust frontend settings as needed)
```

See the comments in `server/.env.sample` and `web/.env.example` for detailed configuration options.

### 5. Start the Development Server

```bash
yarn dev
```

Open http://localhost:5173 to get started!

## 📖 User Guide

### 1. User Registration/Login

- Account registration is required on first use.
- Each user has an independent knowledge base and model settings.

### 2. AI Model Configuration

- Add AI models under "Model Management".
- Supports OpenAI, DeepSeek, Claude, and other OpenAI-compatible interfaces.
- Supports the native Google Gemini interface.
- Configure LLM, embedding, and rerank models.

### 3. Document Upload

- Supports many formats: PDF, Word, PPT, Excel, etc.
- Choose between Fast mode (text only) and High-precision mode (mixed image and text).
- Chunk size and overlap are adjustable per document.
- Select the embedding model used for vectorization.

### 4. Start Intelligent Q&A

- Ask questions grounded in your uploaded documents.
- Watch the search and generation process in real time.
- Check answer sources and related document fragments.
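The chunk size and overlap settings mentioned in document upload can be pictured as a sliding window over the text. The sketch below is illustrative only; `chunkText` is a hypothetical helper, and the actual pipeline (LangChain text splitters) may behave differently:

```typescript
// Hypothetical sketch of fixed-size chunking with overlap.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap >= chunkSize) {
    throw new Error("chunkSize must be positive and larger than overlap");
  }
  const chunks: string[] = [];
  // Each chunk starts `chunkSize - overlap` characters after the previous one,
  // so consecutive chunks share `overlap` characters of context.
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}

// Example: 10 characters, chunk size 4, overlap 2 -> chunks start at 0, 2, 4, 6, 8
const chunks = chunkText("abcdefghij", 4, 2);
```

A larger overlap preserves more context across chunk boundaries at the cost of indexing more (redundant) text.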
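During Q&A, retrieved segments are filtered by a similarity threshold and capped at Top K before being passed to the LLM. A minimal sketch of that selection step, assuming normalized scores in [0, 1] (`selectSegments` and `ScoredSegment` are illustrative names, not the project's API; the real retrieval path uses Elasticsearch hybrid search plus an optional rerank model):

```typescript
// Illustrative Top-K selection with a similarity threshold.
interface ScoredSegment {
  text: string;
  score: number; // similarity in [0, 1], higher is more relevant
}

function selectSegments(
  segments: ScoredSegment[],
  topK: number,
  threshold: number,
): ScoredSegment[] {
  return segments
    .filter((s) => s.score >= threshold) // drop low-relevance content
    .sort((a, b) => b.score - a.score)   // most relevant first
    .slice(0, topK);                     // keep at most Top K
}

const picked = selectSegments(
  [
    { text: "a", score: 0.91 },
    { text: "b", score: 0.42 },
    { text: "c", score: 0.77 },
  ],
  2,
  0.5,
);
// picked contains the segments scored 0.91 and 0.77
```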
## 🔧 Configuration Guide

### Model Settings

- **LLM Model**: Used for dialogue generation (e.g., GPT-4, Gemini-1.5-Pro)
- **Embedding Model**: Used for document vectorization (e.g., text-embedding-3-small)
- **Rerank Model**: Used for re-ranking search results (optional)

### Inference Parameters

- **Temperature**: Controls answer randomness (0-1)
- **Max Tokens**: Maximum output length
- **Top K**: Number of document segments to retrieve
- **Similarity Threshold**: Filters out low-relevance content

## 📁 Project Structure

```
simple-kb/
├── web/                    # Frontend application
│   ├── components/         # React components
│   ├── services/           # API services
│   ├── contexts/           # React Context
│   └── utils/              # Utility functions
├── server/                 # Backend application
│   ├── src/
│   │   ├── auth/           # Authentication module
│   │   ├── chat/           # Chat module
│   │   ├── knowledge-base/ # Knowledge base module
│   │   ├── model-config/   # Model configuration module
│   │   └── user/           # User module
│   └── data/               # Data storage
├── docs/                   # Project documentation
└── docker-compose.yml      # Docker configuration
```

## 📚 Documentation

- [System Design Document](docs/DESIGN.md)
- [Current Implementation Status](docs/CURRENT_IMPLEMENTATION.md)
- [API Documentation](docs/API.md)
- [Deployment Guide](docs/DEPLOYMENT.md)
- [RAG Feature Implementation](docs/rag_complete_implementation.md)

## 🐳 Docker Deployment

### Development Environment

```bash
# Start basic services
docker-compose up -d elasticsearch tika

# Local development
yarn dev
```

### Production Environment

```bash
# Build and start all services
docker-compose up -d
```

## 🤝 Contributing

1. Fork the project
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 📄 License

This project is provided under the MIT License. See the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments

- [LangChain](https://langchain.com/) - AI application development framework
- [NestJS](https://nestjs.com/) - Node.js backend framework
- [React](https://react.dev/) - Frontend UI framework
- [Elasticsearch](https://www.elastic.co/) - Search and analytics engine
- [Apache Tika](https://tika.apache.org/) - Document parsing toolkit

## 📞 Support

For questions or suggestions, please submit an [Issue](../../issues) or contact the maintainers.