# .cursorrules
# Custom rules for AI assistance in the Graphrag_test_app project
You are an AI assistant specialized in developing and maintaining a Streamlit-based GraphRAG application. We are using Microsoft GraphRAG to build a knowledge graph and retrieval-augmented generation (RAG) pipeline that functions as a writing aide for academic papers. The application provides a user-friendly interface for querying and visualizing the graph-based data, with global, local, and drift search functionalities.
## Project Overview
- **Objective**: Create a user-friendly interface for querying and visualizing graph-based data, with global, local, and drift search functionalities
- **Tools and Technologies**:
- **Frontend**: Streamlit for web interface
- **Processing**: Python for data processing and graph operations
- **Helper Functions**: Separated into global and local utilities
- **File Processing**: Custom preprocessing for text files with markdown support
## Core Components
1. **Search Functionality**:
- **Global Search** (🌐): Implementation in pages/1_🌐_Global_search.py
- **Local Search** (🔍): Implementation in pages/2_🔍_Local_search.py
- **Drift Search** (🔎): Implementation in pages/3_🔎_Drift_search.py
- **Query Processing**: Graph-based query handling in Graph_query.py
2. **Helper Functions**:
- **Global Helpers**: Utility functions in global_helper_functions.py
- **Local Helpers**: Local-specific utilities in local_helper_functions.py
- **Drift Helpers**: Drift-specific utilities in drift_helper_functions.py
3. **Preprocessing**:
- **File Conversion**: Text file processing with markdown and comments support
- **Data Preparation**: Standardized preprocessing pipeline
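The comment-aware file conversion described above could look like the following minimal sketch. The `%%` comment marker and the `strip_comments` helper name are assumptions for illustration; adjust them to the project's actual convention:

```python
COMMENT_PREFIX = "%%"  # assumed comment marker; not part of markdown syntax

def strip_comments(text: str) -> str:
    """Drop comment lines while preserving markdown content (headings, lists)."""
    kept = [
        line
        for line in text.splitlines()
        if not line.lstrip().startswith(COMMENT_PREFIX)
    ]
    return "\n".join(kept)
```

A standardized pipeline would apply this (and similar normalization steps) to every input file before it is handed to GraphRAG for indexing.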
## Coding Standards
- **Language**: Python 3.10+
- **Style Guide**:
- Follow PEP 8 conventions
- Use termcolor for console output
- Define major constants in UPPERCASE
- Implement clear error handling with try-except blocks
- "pipeline" syntax, specifically using the pipe operator | in Python.
- use logging for debugging purposes
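One way to realize the style points above (UPPERCASE constants, logging, and `|`-chained pipelines) is a small wrapper class that overloads `__or__`. This is a hedged sketch, not the project's actual implementation; the step names are hypothetical:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Major constants in UPPERCASE, per the style guide
MAX_QUERY_LENGTH = 2048


class Step:
    """Wrap a function so processing steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Compose left-to-right: (self | other)(x) == other(self(x))
        return Step(lambda x: other.func(self.func(x)))

    def __call__(self, x):
        return self.func(x)


# Hypothetical steps for illustration only
strip_whitespace = Step(str.strip)
lowercase = Step(str.lower)
truncate = Step(lambda s: s[:MAX_QUERY_LENGTH])

normalize_query = strip_whitespace | lowercase | truncate
print(normalize_query("  What Is GraphRAG?  "))  # -> "what is graphrag?"
```

The same `|` reading applies if the project adopts a pipeline library instead of a hand-rolled class; the rule is about left-to-right composability, not a specific implementation.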
## File Organization
- **Pages**: Store Streamlit pages in the /pages directory
- **Preprocessing**: Keep preprocessing scripts in /preprocessing
- **Helpers**: Maintain separate files for global, local, and drift helper functions
## Best Practices
1. **Code Structure**:
- Implement modular, reusable components
- Maintain clear separation between global and local functionalities
- Use descriptive variable names and comments
2. **Error Handling**:
- Use try-except blocks with specific error messages
- Implement logging for debugging purposes
- Handle file operations with proper encoding (utf-8)
3. **Configuration**:
- Use environment variables for sensitive data
- Maintain an up-to-date requirements.txt
- Avoid hardcoding configuration values
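The error-handling rules in point 2 (specific exceptions, logging, explicit utf-8) can be combined in a single file-reading helper. A minimal sketch; the function name is illustrative:

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def read_source_file(path: str) -> str:
    """Read a text file with explicit utf-8 encoding and specific error handling."""
    try:
        return Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        logger.error("File not found: %s", path)
        raise
    except UnicodeDecodeError:
        logger.error("File is not valid utf-8: %s", path)
        raise
```

Catching the specific exception types (rather than a bare `except`) keeps the error messages actionable while still letting unexpected failures surface.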
## Documentation
- **Code Comments**: Provide clear, concise documentation for functions
- **README**: Keep README.md updated with setup and usage instructions
- **Inline Documentation**: Use descriptive variable names and comments
## Security
- **API Keys**: Store in environment variables
- **File Operations**: Implement safe file handling practices
- **Input Validation**: Validate all user inputs before processing
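The input-validation rule above can be enforced with a small guard that runs before any query reaches the search backend. A sketch with an assumed length limit; adapt the checks to the application's real constraints:

```python
MAX_QUERY_LENGTH = 2048  # assumed limit for illustration


def validate_query(raw: object) -> str:
    """Validate and normalize a user query before it is processed."""
    if not isinstance(raw, str):
        raise TypeError("Query must be a string")
    query = raw.strip()
    if not query:
        raise ValueError("Query must not be empty")
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError(f"Query exceeds {MAX_QUERY_LENGTH} characters")
    return query
```

Raising on bad input (rather than silently truncating or ignoring it) lets the Streamlit layer show the user a precise error message.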
## Testing
- **Manual Testing**: Test the global, local, and drift search functionalities
- **Error Cases**: Verify proper handling of edge cases
- **UI Testing**: Ensure Streamlit interface remains responsive
## Performance
- **Optimization**: Focus on efficient graph operations
- **Memory Management**: Handle large datasets efficiently
- **Response Time**: Optimize search query performance