```
├── .github/
│   ├── FUNDING.yml
│   └── workflows/
│       ├── github-test.yml
│       └── local-test.yml
├── .gitignore
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── all_rag_techniques/
│   ├── HyDe_Hypothetical_Document_Embedding.ipynb
│   ├── HyPE_Hypothetical_Prompt_Embeddings.ipynb
│   ├── Microsoft_GraphRag.ipynb
│   ├── adaptive_retrieval.ipynb
│   ├── choose_chunk_size.ipynb
│   ├── context_enrichment_window_around_chunk.ipynb
│   ├── context_enrichment_window_around_chunk_with_llamaindex.ipynb
│   ├── contextual_chunk_headers.ipynb
│   ├── contextual_compression.ipynb
│   ├── crag.ipynb
│   ├── dartboard.ipynb
```
## /.github/FUNDING.yml
```yml path="/.github/FUNDING.yml"
github: NirDiamant
```
## /.github/workflows/github-test.yml
```yml path="/.github/workflows/github-test.yml"
name: GitHub PR Test

on:
  pull_request:
    types: [opened, synchronize, reopened]
    branches:
      - main
    paths:
      - "requirements.txt"

jobs:
  test:
    runs-on: ubuntu-latest
    env:
      OPENAI_API_KEY: "123"
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.12.6'
          cache: 'pip'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Run tests
        run: |
          pytest
```
## /.github/workflows/local-test.yml
```yml path="/.github/workflows/local-test.yml"
name: Local Test with act

on:
  workflow_dispatch:

jobs:
  test:
    runs-on: ubuntu-latest  # required by the workflow schema; act maps it onto the container image below
    container:
      image: catthehacker/ubuntu:act-latest
    env:
      OPENAI_API_KEY: "123"
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.12.6'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Run tests
        run: |
          pytest
```
## /.gitignore
```gitignore path="/.gitignore"
# Ignore everything by default
*
# Allow only specific file types
!*.py
!*.md
!*.txt
!*.ipynb
!*.jpeg
!*.svg
!*.png
!LICENSE
!*.gif
!all_rag_techniques
!data
!evaluation
!images
!*.yml
!.github
!.github/workflows
# remove specific files
__pycache__/
__init__.py
```
## /CONTRIBUTING.md
# Contributing to RAG Techniques
Welcome to the world's largest and most comprehensive repository of Retrieval-Augmented Generation (RAG) tutorials! 🌟 We're thrilled you're interested in contributing to this ever-growing knowledge base. Your expertise and creativity can help us maintain our position at the forefront of RAG technology.
## Join Our Community
We have a vibrant Discord community where contributors can discuss ideas, ask questions, and collaborate on RAG techniques. Join us at:
[RAG Techniques Discord Server](https://discord.gg/cA6Aa4uyDX)
Don't hesitate to introduce yourself and share your thoughts!
## Ways to Contribute
We welcome contributions of all kinds! Here are some ways you can help:
1. **Add New RAG Techniques:** Create new notebooks showcasing novel RAG methods.
2. **Improve Existing Notebooks:** Enhance, update, or expand our current tutorials.
3. **Fix Bugs:** Help us squash bugs in existing code or explanations.
4. **Enhance Documentation:** Improve clarity, add examples, or fix typos in our docs.
5. **Share Creative Ideas:** Have an innovative idea? We're all ears!
6. **Engage in Discussions:** Participate in our Discord community to help shape the future of RAG.
Remember, no contribution is too small. Every improvement helps make this repository an even better resource for the community.
## Reporting Issues
Found a problem or have a suggestion? Please create an issue on GitHub, providing as much detail as possible. You can also discuss issues in our Discord community.
## Contributing Code or Content
1. **Fork and Branch:** Fork the repository and create your branch from `main`.
2. **Make Your Changes:** Implement your contribution, following our best practices.
3. **Test:** Ensure your changes work as expected.
4. **Follow the Style:** Adhere to the coding and documentation conventions used throughout the project.
5. **Commit:** Make your git commits informative and concise.
6. **Stay Updated:** The main branch is frequently updated. Before opening a pull request, make sure your code is up-to-date with the current main branch and has no conflicts.
7. **Push and Pull Request:** Push to your fork and submit a pull request.
8. **Discuss:** Use the Discord community to discuss your contribution if you need feedback or have questions.
## Adding a New RAG Method
When adding a new RAG method to the repository, please follow these additional steps:
1. Create your notebook in the `all_rag_techniques` folder.
2. Update BOTH the list and table in README.md:
### A. Update the List of Techniques
- Add your new method to the list of techniques in the README
- Place it in the appropriate position based on complexity (methods are sorted from easiest to most complicated)
- Use the following format for the link:
```
### [Number]. [Your Method Name 🏷️](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/your_file_name.ipynb)
```
- Replace `[Number]` with the appropriate number, `[Your Method Name]` with your method's name, and `your_file_name.ipynb` with the actual name of your notebook file
- Choose an appropriate emoji that represents your method
### B. Update the Techniques Table
- Add a new row to the table with your technique
- Include all available implementations (LangChain, LlamaIndex, and/or Runnable Script)
- Use the following format:
```
| [Number] | [Category] | [LangChain](...) / [LlamaIndex](...) / [Runnable Script](...) | [Description] |
```
- Make sure to:
- Update the technique number to maintain sequential order
- Choose the appropriate category with emoji
- Include links to all available implementations
- Write a clear, concise description
### C. Important Note
- After inserting your new method, make sure to update the numbers of all subsequent techniques to maintain the correct order in BOTH the list and the table
- The numbers in the list and table must match exactly
- If you add a new technique as number 5, all techniques after it should be incremented by 1 in both places
For example, if you're adding a new technique between Simple RAG and Next Method:
In the list:
```
### 1. [Simple RAG 🌱](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/simple_rag.ipynb)
### 2. [Your New Method 🆕](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/your_new_method.ipynb)
### 3. [Next Method 🔜](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/next_method.ipynb)
```
And in the table:
```
| 1 | Foundational 🌱 | [LangChain](...) / [LlamaIndex](...) / [Runnable Script](...) | Basic RAG implementation |
| 2 | Your Category 🆕 | [LangChain](...) / [LlamaIndex](...) / [Runnable Script](...) | Your new method description |
| 3 | Next Category 🔜 | [LangChain](...) / [LlamaIndex](...) / [Runnable Script](...) | Next method description |
```
Remember: Always update BOTH the list and table when adding new techniques, and ensure the numbers match exactly between them.
## Notebook Structure
For new notebooks or significant additions to existing ones, please follow this structure:
1. **Title and Overview:** Clear title and brief overview of the technique.
2. **Detailed Explanation:** Cover motivation, key components, method details, and benefits.
3. **Visual Representation:** Include a diagram to visualize the technique. We recommend using Mermaid syntax for creating these diagrams. Here's how to do it:
   - Create a graph using Mermaid's graph TD (top-down) syntax
   - You can use Claude or other AI assistants to help you design the graph if needed
   - Paste your Mermaid code into [Mermaid Live Editor](https://mermaid.live/)
   - In the "Actions" tab of Mermaid Live Editor, download the SVG file of your diagram
   - Store the SVG file in the [images folder](https://github.com/NirDiamant/RAG_Techniques/tree/main/images) of the repository
   - Use an appropriate, descriptive name for the file
   - In your notebook, display the image using Markdown syntax:
```markdown

```
This process ensures consistency in our visual representations and makes it easy for others to understand and potentially modify the diagrams in the future.
4. **Implementation:** Step-by-step Python implementation with clear comments and explanations.
5. **Usage Example:** Demonstrate the technique with a practical example.
6. **Comparison:** Compare with basic RAG, both qualitatively and quantitatively if possible.
7. **Additional Considerations:** Discuss limitations, potential improvements, or specific use cases.
8. **References:** Include relevant citations or resources if you have.
## Notebook Best Practices
To ensure consistency and readability across all notebooks:
1. **Code Cell Descriptions:** Each code cell should be preceded by a markdown cell with a clear, concise title describing the cell's content or purpose.
2. **Clear Unnecessary Outputs:** Before committing your notebook, clear all unnecessary cell outputs. This helps reduce file size and avoids confusion from outdated results.
3. **Consistent Formatting:** Maintain consistent formatting throughout the notebook, including regular use of markdown headers, code comments, and proper indentation.
## Code Quality and Readability
To ensure the highest quality and readability of our code:
1. **Write Clean Code:** Follow best practices for clean, readable code.
2. **Use Comments:** Add clear and concise comments to explain complex logic.
3. **Format Your Code:** Use consistent formatting throughout your contribution.
4. **Language Model Review:** After completing your code, consider passing it through a language model for additional formatting and readability improvements. This extra step can help make your code even more accessible and maintainable.
## Documentation
Clear documentation is crucial. Whether you're improving existing docs or adding new ones, follow the same process: fork, change, test, and submit a pull request.
## Final Notes
We're grateful for all our contributors and excited to see how you'll help expand the world's most comprehensive RAG resource. Don't hesitate to ask questions in our Discord community if you're unsure about anything.
Let's harness our collective knowledge and creativity to push the boundaries of RAG technology together!
Happy contributing! 🚀
## /LICENSE
``` path="/LICENSE"
Custom License Agreement
This License Agreement ("Agreement") is a legal agreement between Nir Diamant ("Licensor") and any individual or entity ("Licensee" or "Contributor") who accesses, uses, or contributes to this repository. By accessing, using, or contributing to the Repository, you agree to be bound by the terms of this Agreement.
1. Grant of License for Non-Commercial Use
1.1 Non-Commercial Use License: The Licensor grants the Licensee a worldwide, royalty-free, non-exclusive, non-transferable license to use, reproduce, modify, and distribute the content of the Repository ("Licensed Material") for non-commercial purposes only, subject to the terms and conditions of this Agreement.
1.2 Attribution Requirement: When using or distributing the Licensed Material, the Licensee must provide appropriate credit to the Licensor by:
- Citing the Licensor's name as specified.
- Including a link to the Repository.
- Indicating if changes were made to the Licensed Material.
1.3 No Commercial Use: Licensees are expressly prohibited from using the Licensed Material, in whole or in part, for any commercial purpose without prior written permission from the Licensor.
2. Reservation of Commercial Rights
2.1 Exclusive Commercial Rights: All commercial rights to the Licensed Material are exclusively reserved by the Licensor. The Licensor retains the sole right to use, reproduce, modify, distribute, and sublicense the Licensed Material for commercial purposes.
2.2 Requesting Commercial Permission: Parties interested in using the Licensed Material for commercial purposes must obtain explicit written consent from the Licensor. Requests should be directed to the contact information provided at the end of this Agreement.
3. Contributions
3.1 Contributor License Grant: By submitting any content ("Contribution") to the Repository, the Contributor grants the Licensor an exclusive, perpetual, irrevocable, worldwide, royalty-free license to use, reproduce, modify, distribute, sublicense, and create derivative works from the Contribution for any purpose, including commercial purposes.
3.2 Non-Commercial Use by Contributor: Contributors retain the right to use their own Contributions for non-commercial purposes under the same terms as this Agreement.
3.3 Warranty of Originality: Contributors represent and warrant that their Contributions are original works and do not infringe upon the intellectual property rights of any third party.
3.4 No Commercial Rights for Contributors: Contributors acknowledge that they have no rights to use the Licensed Material or their Contributions for commercial purposes.
4. Restrictions
4.1 Prohibition of Commercial Exploitation: Licensees and Contributors may not:
- Use the Licensed Material or any Contributions for commercial purposes.
- Distribute the Licensed Material or any Contributions as part of any commercial product or service.
- Sublicense the Licensed Material or any Contributions for commercial use.
4.2 No Endorsement: Licensees and Contributors may not imply endorsement or affiliation with the Licensor without explicit written permission.
5. Term and Termination
5.1 Term: This Agreement is effective upon acceptance and continues unless terminated as provided herein.
5.2 Termination for Breach: The Licensor may terminate this Agreement immediately if the Licensee or Contributor breaches any of its terms.
5.3 Effect of Termination: Upon termination, all rights granted under this Agreement cease, and the Licensee or Contributor must destroy all copies of the Licensed Material in their possession.
5.4 Survival: Sections 2, 3, 4, 6, and 7 survive termination of this Agreement.
6. Disclaimer of Warranties and Limitation of Liability
6.1 As-Is Basis: The Licensed Material and any Contributions are provided "AS IS," without warranties or conditions of any kind, either express or implied.
6.2 Disclaimer: The Licensor expressly disclaims all warranties, including but not limited to warranties of title, non-infringement, merchantability, and fitness for a particular purpose.
6.3 Limitation of Liability: In no event shall the Licensor be liable for any direct, indirect, incidental, special, exemplary, or consequential damages arising in any way out of the use of the Licensed Material or Contributions.
7. General Provisions
7.1 Entire Agreement: This Agreement constitutes the entire agreement between the parties concerning the subject matter hereof and supersedes all prior agreements and understandings.
7.2 Modification: The Licensor reserves the right to modify this Agreement at any time. Continued use of the Repository constitutes acceptance of the modified terms.
7.3 Severability: If any provision of this Agreement is found to be unenforceable, the remainder shall remain in full force and effect.
7.4 Waiver: Failure to enforce any provision of this Agreement shall not constitute a waiver of such provision.
7.5 Governing Law: This Agreement shall be governed by and construed in accordance with the laws of [Your Jurisdiction], without regard to its conflict of law principles.
7.6 Dispute Resolution: Any disputes arising under or in connection with this Agreement shall be subject to the exclusive jurisdiction of the courts located in [Your Jurisdiction].
8. Acceptance
By accessing, using, or contributing to the Repository, you acknowledge that you have read, understood, and agree to be bound by the terms and conditions of this Agreement.
Contact Information
For any questions or requests regarding this Agreement, please contact:
Name: Nir Diamant
Email: nirdiamant21@gmail.com
```
## /README.md
[PRs Welcome](http://makeapullrequest.com)
[LinkedIn](https://www.linkedin.com/in/nir-diamant-759323134/)
[Twitter](https://twitter.com/NirDiamantAI)
[Discord](https://discord.gg/cA6Aa4uyDX)
[Sponsor](https://github.com/sponsors/NirDiamant)
> 🌟 **Support This Project:** Your sponsorship fuels innovation in RAG technologies. **[Become a sponsor](https://github.com/sponsors/NirDiamant)** to help maintain and expand this valuable resource!
## Sponsors ❤️
A big thank you to the wonderful sponsor(s) who support this project!
# Advanced RAG Techniques: Elevating Your Retrieval-Augmented Generation Systems 🚀
Welcome to one of the most comprehensive and dynamic collections of Retrieval-Augmented Generation (RAG) tutorials available today. This repository serves as a hub for cutting-edge techniques aimed at enhancing the accuracy, efficiency, and contextual richness of RAG systems.
## 📫 Stay Updated!
🚀 Cutting-edge Updates | 💡 Expert Insights | 🎯 Top 0.1% Content
*Join over 20,000 AI enthusiasts getting unique cutting-edge insights and free tutorials!* ***Plus, subscribers get exclusive early access and special 33% discounts to my book and the upcoming RAG Techniques course!***

[Subscribe to the DiamantAI newsletter](https://diamantai.substack.com/?r=336pe4&utm_campaign=pub-share-checklist)
## Introduction
Retrieval-Augmented Generation (RAG) is revolutionizing the way we combine information retrieval with generative AI. This repository showcases a curated collection of advanced techniques designed to supercharge your RAG systems, enabling them to deliver more accurate, contextually relevant, and comprehensive responses.
Our goal is to provide a valuable resource for researchers and practitioners looking to push the boundaries of what's possible with RAG. By fostering a collaborative environment, we aim to accelerate innovation in this exciting field.
## Related Projects
🖋️ Check out my **[Prompt Engineering Techniques guide](https://github.com/NirDiamant/Prompt_Engineering)** for a comprehensive collection of prompting strategies, from basic concepts to advanced techniques, enhancing your ability to interact effectively with AI language models.
🤖 Explore my **[GenAI Agents Repository](https://github.com/NirDiamant/GenAI_Agents)** to discover a variety of AI agent implementations and tutorials, showcasing how different AI technologies can be combined to create powerful, interactive systems.
## A Community-Driven Knowledge Hub
**This repository grows stronger with your contributions!** Join our vibrant Discord community — the central hub for shaping and advancing this project together 🤝
**[RAG Techniques Discord Community](https://discord.gg/cA6Aa4uyDX)**
Whether you're an expert or just starting out, your insights can shape the future of RAG. Join us to propose ideas, get feedback, and collaborate on innovative techniques. For contribution guidelines, please refer to our **[CONTRIBUTING.md](https://github.com/NirDiamant/RAG_Techniques/blob/main/CONTRIBUTING.md)** file. Let's advance RAG technology together!
🔗 For discussions on GenAI, RAG, or custom agents, or to explore knowledge-sharing opportunities, feel free to **[connect on LinkedIn](https://www.linkedin.com/in/nir-diamant-759323134/)**.
## Key Features
- 🧠 State-of-the-art RAG enhancements
- 📚 Comprehensive documentation for each technique
- 🛠️ Practical implementation guidelines
- 🌟 Regular updates with the latest advancements
## Advanced Techniques
Explore our extensive list of cutting-edge RAG techniques:
| # | Category | Technique | Description |
|---|----------|-----------|-------------|
| 1 | Foundational 🌱 | [Simple RAG](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/simple_rag.ipynb) | Basic RAG implementation with LangChain |
| 2 | Foundational 🌱 | [Simple RAG using CSV files](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/simple_csv_rag.ipynb) | RAG implementation using CSV files as data source |
| 3 | Foundational 🌱 | [Reliable RAG](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/reliable_rag.ipynb) | Enhanced RAG with validation and refinement |
| 4 | Foundational 🌱 | [Choose Chunk Size](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/choose_chunk_size.ipynb) | Optimizing text chunk sizes for better retrieval |
| 5 | Foundational 🌱 | [Proposition Chunking](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/proposition_chunking.ipynb) | Breaking text into meaningful propositions |
| 6 | Query Enhancement 🔍 | [Query Transformations](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/query_transformations.ipynb) | Enhancing queries through various transformations |
| 7 | Query Enhancement 🔍 | [HyDE (Hypothetical Document Embedding)](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/HyDe_Hypothetical_Document_Embedding.ipynb) | Using hypothetical questions for better retrieval |
| 8 | Query Enhancement 🔍 | [HyPE (Hypothetical Prompt Embeddings)](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/HyPE_Hypothetical_Prompt_Embeddings.ipynb) | Precomputing hypothetical prompts at indexing stage |
| 9 | Context Enrichment 📚 | [Contextual Chunk Headers](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/contextual_chunk_headers.ipynb) | Adding context headers to document chunks |
| 10 | Context Enrichment 📚 | [Relevant Segment Extraction](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/relevant_segment_extraction.ipynb) | Extracting relevant multi-chunk segments |
| 11 | Context Enrichment 📚 | [Context Enrichment Window](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/context_enrichment_window_around_chunk.ipynb) | Enhancing context around retrieved chunks |
| 12 | Context Enrichment 📚 | [Semantic Chunking](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/semantic_chunking.ipynb) | Dividing documents based on semantic coherence |
| 13 | Context Enrichment 📚 | [Contextual Compression](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/contextual_compression.ipynb) | Compressing information while preserving relevance |
| 14 | Context Enrichment 📚 | [Document Augmentation](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/document_augmentation.ipynb) | Enhancing documents through question generation |
| 15 | Advanced Retrieval 🚀 | [Fusion Retrieval](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/fusion_retrieval.ipynb) | Combining different retrieval methods |
| 16 | Advanced Retrieval 🚀 | [Intelligent Reranking](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/reranking.ipynb) | Advanced scoring mechanisms for better ranking |
| 17 | Advanced Retrieval 🚀 | [Multi-faceted Filtering](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/multi_faceted_filtering.ipynb) | Applying various filtering techniques |
| 18 | Advanced Retrieval 🚀 | [Hierarchical Indices](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/hierarchical_indices.ipynb) | Multi-tiered system for efficient retrieval |
| 19 | Advanced Retrieval 🚀 | [Ensemble Retrieval](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/ensemble_retrieval.ipynb) | Combining multiple retrieval models |
| 20 | Advanced Retrieval 🚀 | [Dartboard Retrieval](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/dartboard.ipynb) | Optimizing for relevant information gain |
| 21 | Advanced Retrieval 🚀 | [Multi-modal Retrieval](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/multi_model_rag_with_captioning.ipynb) | Handling diverse data types |
| 22 | Iterative Techniques 🔁 | [Retrieval with Feedback Loops](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/retrieval_with_feedback_loop.ipynb) | Learning from user interactions |
| 23 | Iterative Techniques 🔁 | [Adaptive Retrieval](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/adaptive_retrieval.ipynb) | Dynamic adjustment of retrieval strategies |
| 24 | Iterative Retrieval 🔄 | [Iterative Retrieval](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/iterative_retrieval.ipynb) | Multiple rounds of retrieval refinement |
| 25 | Evaluation 📊 | [DeepEval Evaluation](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/evaluation/evaluation_deep_eval.ipynb) | Comprehensive RAG system evaluation |
| 26 | Evaluation 📊 | [GroUSE Evaluation](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/evaluation/evaluation_grouse.ipynb) | Contextually-grounded LLM evaluation |
| 27 | Explainability 🔬 | [Explainable Retrieval](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/explainable_retrieval.ipynb) | Providing transparency in retrieval process |
| 28 | Advanced Architecture 🏗️ | [Graph RAG](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/graph_rag.ipynb) | Incorporating structured knowledge graphs |
| 29 | Advanced Architecture 🏗️ | [Microsoft GraphRAG](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/Microsoft_GraphRag.ipynb) | Microsoft's advanced RAG with knowledge graphs |
| 30 | Advanced Architecture 🏗️ | [RAPTOR](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/raptor.ipynb) | Tree-organized retrieval with recursive processing |
| 31 | Advanced Architecture 🏗️ | [Self RAG](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/self_rag.ipynb) | Dynamic combination of retrieval and generation |
| 32 | Advanced Architecture 🏗️ | [Corrective RAG (CRAG)](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/crag.ipynb) | Dynamic evaluation and correction of retrieval |
| 33 | Special Technique 🌟 | [Sophisticated Controllable Agent](https://github.com/NirDiamant/Controllable-RAG-Agent) | Advanced RAG solution for complex questions |
### 🌱 Foundational RAG Techniques
1. Simple RAG 🌱
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/simple_rag.ipynb)
- **LlamaIndex**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/simple_rag_with_llamaindex.ipynb)
- **[Runnable Script](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques_runnable_scripts/simple_rag.py)**
#### Overview 🔎
Introducing basic RAG techniques ideal for newcomers.
#### Implementation 🛠️
Start with basic retrieval queries and integrate incremental learning mechanisms.
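The sketch below illustrates the core retrieve-then-generate loop, using a toy bag-of-words "embedding" and a printed prompt as stand-ins for a real embedding model and LLM call (both stand-ins are assumptions, not the notebook's actual code):

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline would call an embedding model.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

documents = [
    "RAG combines retrieval with generation.",
    "Chunking splits documents into smaller pieces.",
    "Vector stores hold embeddings for similarity search.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

query = "How does retrieval work in RAG?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # a real pipeline would send this prompt to an LLM
```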
2. Simple RAG using a CSV file 🧩
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/simple_csv_rag.ipynb)
- **LlamaIndex**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/simple_csv_rag_with_llamaindex.ipynb)
#### Overview 🔎
Introducing basic RAG using CSV files.
#### Implementation 🛠️
This uses CSV files for retrieval and integrates with OpenAI to build a question-answering system.
3. **Reliable RAG 🏷️**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/reliable_rag.ipynb)
#### Overview 🔎
Enhances the Simple RAG by adding validation and refinement to ensure the accuracy and relevance of retrieved information.
#### Implementation 🛠️
Check for retrieved document relevancy and highlight the segment of docs used for answering.
4. Choose Chunk Size 📏
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/choose_chunk_size.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/choose_chunk_size.py)**
#### Overview 🔎
Selecting an appropriate fixed size for text chunks to balance context preservation and retrieval efficiency.
#### Implementation 🛠️
Experiment with different chunk sizes to find the optimal balance between preserving context and maintaining retrieval speed for your specific use case.
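A small sketch of such an experiment; `evaluate_retrieval` is a hypothetical placeholder for whatever metric you care about (hit rate, MRR, answer quality):

```python
def split_into_chunks(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    # Simple character-based splitter; token-based splitting is more common in practice.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def evaluate_retrieval(chunks: list[str]) -> float:
    # Hypothetical placeholder metric: plug in your own retrieval evaluation here.
    return 1.0 / (1 + abs(len(chunks) - 10))

text = "Retrieval-augmented generation grounds answers in retrieved text. " * 200
for chunk_size in (128, 256, 512, 1024):
    chunks = split_into_chunks(text, chunk_size, overlap=chunk_size // 8)
    score = evaluate_retrieval(chunks)
    print(f"chunk_size={chunk_size:5d}  chunks={len(chunks):4d}  score={score:.3f}")
```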
5. **Proposition Chunking ⛓️💥**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/proposition_chunking.ipynb)
#### Overview 🔎
Breaking down the text into concise, complete, meaningful sentences allowing for better control and handling of specific queries (especially extracting knowledge).
#### Implementation 🛠️
- 💪 **Proposition Generation:** The LLM is used in conjunction with a custom prompt to generate factual statements from the document chunks.
- ✅ **Quality Checking:** The generated propositions are passed through a grading system that evaluates accuracy, clarity, completeness, and conciseness.
#### Additional Resources 📚
- **[The Propositions Method: Enhancing Information Retrieval for AI Systems](https://open.substack.com/pub/diamantai/p/the-propositions-method-enhancing?r=336pe4&utm_campaign=post&utm_medium=web)** - A comprehensive blog post exploring the benefits and implementation of proposition chunking in RAG systems.
### 🔍 Query Enhancement
6. Query Transformations 🔄
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/query_transformations.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/query_transformations.py)**
#### Overview 🔎
Modifying and expanding queries to improve retrieval effectiveness.
#### Implementation 🛠️
- ✍️ **Query Rewriting:** Reformulate queries to improve retrieval.
- 🔙 **Step-back Prompting:** Generate broader queries for better context retrieval.
- 🧩 **Sub-query Decomposition:** Break complex queries into simpler sub-queries.
7. Hypothetical Questions (HyDE Approach) ❓
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/HyDe_Hypothetical_Document_Embedding.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/HyDe_Hypothetical_Document_Embedding.py)**
#### Overview 🔎
Generating hypothetical questions to improve alignment between queries and data.
#### Implementation 🛠️
Create hypothetical questions that point to relevant locations in the data, enhancing query-data matching.
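A sketch of the HyDE flow, where `generate_hypothetical_document` (normally an LLM call) and the bag-of-words `embed` are stand-ins rather than the notebook's real components:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z]+", text.lower()))  # placeholder for a real embedding model

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def generate_hypothetical_document(query: str) -> str:
    # Placeholder: a real HyDE system asks an LLM to draft a plausible answer document.
    return f"A document that answers the question: {query}"

chunks = ["Climate change raises sea levels.",
          "Transformers rely on self-attention.",
          "RAG grounds LLM answers in retrieved text."]
chunk_vectors = [(c, embed(c)) for c in chunks]

query = "How do RAG systems reduce hallucinations?"
hyde_vector = embed(generate_hypothetical_document(query))  # embed the hypothetical answer, not the query
best_chunk, _ = max(chunk_vectors, key=lambda cv: cosine(hyde_vector, cv[1]))
print(best_chunk)
```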
#### Additional Resources 📚
- **[HyDE: Exploring Hypothetical Document Embeddings for AI Retrieval](https://open.substack.com/pub/diamantai/p/hyde-exploring-hypothetical-document?r=336pe4&utm_campaign=post&utm_medium=web)** - A short blog post explaining this method clearly.
### 📚 Context and Content Enrichment
8. Hypothetical Prompt Embeddings (HyPE) ❓🚀
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/HyPE_Hypothetical_Prompt_Embeddings.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/HyPE_Hypothetical_Prompt_Embeddings.py)**
#### Overview 🔎
HyPE (Hypothetical Prompt Embeddings) is an enhancement to traditional RAG retrieval that **precomputes hypothetical prompts at the indexing stage** and embeds them in place of the raw chunk, with each embedding still pointing back to its original chunk. This transforms retrieval into a **question-question matching task** and avoids the need for runtime synthetic answer generation, reducing inference-time computational overhead while **improving retrieval alignment**. A minimal sketch of this flow appears after the implementation notes below.
#### Implementation 🛠️
- 📖 **Precomputed Questions:** Instead of embedding document chunks, HyPE **generates multiple hypothetical queries per chunk** at indexing time.
- 🔍 **Question-Question Matching:** User queries are matched against stored hypothetical questions, leading to **better retrieval alignment**.
- ⚡ **No Runtime Overhead:** Unlike HyDE, HyPE does **not require LLM calls at query time**, making retrieval **faster and cheaper**.
- 📈 **Higher Precision & Recall:** Improves retrieval **context precision by up to 42 percentage points** and **claim recall by up to 45 percentage points**.
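A minimal sketch of the indexing and query flow, with `generate_hypothetical_questions` and the bag-of-words `embed` as placeholders for an LLM call and an embedding model:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z]+", text.lower()))  # placeholder for a real embedding model

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def generate_hypothetical_questions(chunk: str) -> list[str]:
    # Placeholder: a real HyPE indexer prompts an LLM for several questions per chunk.
    return [f"What does the text say about {word}?" for word in re.findall(r"[a-z]+", chunk.lower())[:3]]

chunks = ["Dense retrieval embeds passages into vectors.",
          "BM25 ranks documents using term frequency statistics."]

# Index: every hypothetical question is embedded, and each embedding points back to its source chunk.
question_index = [(embed(q), chunk) for chunk in chunks for q in generate_hypothetical_questions(chunk)]

query = "How are passages embedded in dense retrieval?"
q_vec = embed(query)
_, best_chunk = max(question_index, key=lambda item: cosine(q_vec, item[0]))
print(best_chunk)  # the original chunk is returned; no LLM call happens at query time
```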
#### Additional Resources 📚
- **[Preprint: Hypothetical Prompt Embeddings (HyPE)](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335)** - Research paper detailing the method, evaluation, and benchmarks.
9. **Contextual Chunk Headers 🏷️**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/contextual_chunk_headers.ipynb)
#### Overview 🔎
Contextual chunk headers (CCH) is a method of creating document-level and section-level context and prepending it as a header to each chunk prior to embedding.
#### Implementation 🛠️
Create a chunk header that includes context about the document and/or section of the document, and prepend that to each chunk in order to improve the retrieval accuracy.
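A small sketch of the idea, assuming a placeholder `embed` and invented example titles:

```python
def make_header(doc_title: str, section_title: str) -> str:
    return f"Document: {doc_title}\nSection: {section_title}\n\n"

def embed(text: str) -> list[str]:
    return text.lower().split()  # placeholder for a real embedding model

doc_title = "Company Handbook"  # illustrative titles, not from the notebook
sections = {
    "Remote Work Policy": ["Employees may work remotely up to three days a week."],
    "Expenses": ["Travel must be approved before booking."],
}

indexed = []
for section_title, chunks in sections.items():
    header = make_header(doc_title, section_title)
    for chunk in chunks:
        contextualized = header + chunk  # the header travels with the chunk
        indexed.append((embed(contextualized), contextualized))

print(indexed[0][1])
```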
#### Additional Resources 📚
**[dsRAG](https://github.com/D-Star-AI/dsRAG)**: open-source retrieval engine that implements this technique (and a few other advanced RAG techniques)
10. **Relevant Segment Extraction 🧩**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/relevant_segment_extraction.ipynb)
#### Overview 🔎
Relevant segment extraction (RSE) is a method of dynamically constructing multi-chunk segments of text that are relevant to a given query.
#### Implementation 🛠️
Perform a retrieval post-processing step that analyzes the most relevant chunks and identifies longer multi-chunk segments to provide more complete context to the LLM.
11. Context Enrichment Techniques 📝
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/context_enrichment_window_around_chunk.ipynb)
- **LlamaIndex**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/context_enrichment_window_around_chunk_with_llamaindex.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/context_enrichment_window_around_chunk.py)**
#### Overview 🔎
Enhancing retrieval accuracy by embedding individual sentences and extending context to neighboring sentences.
#### Implementation 🛠️
Retrieve the most relevant sentence while also accessing the sentences before and after it in the original text.
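A sketch of sentence-window retrieval; the word-overlap `score` is a toy stand-in for embedding similarity:

```python
import re

text = ("RAG retrieves supporting text. The retrieved text grounds the answer. "
        "Neighboring sentences often carry needed context. Chunking too finely loses that context.")
sentences = re.split(r"(?<=[.!?])\s+", text)

def score(sentence: str, query: str) -> int:
    # Toy relevance score: shared lowercase words; a real system compares embeddings.
    return len(set(sentence.lower().split()) & set(query.lower().split()))

def retrieve_with_window(query: str, window: int = 1) -> str:
    best = max(range(len(sentences)), key=lambda i: score(sentences[i], query))
    lo, hi = max(0, best - window), min(len(sentences), best + window + 1)
    return " ".join(sentences[lo:hi])  # best sentence plus its neighbors

print(retrieve_with_window("Which sentences carry context?"))
```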
12. Semantic Chunking 🧠
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/semantic_chunking.ipynb)
- **[Runnable Script](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques_runnable_scripts/semantic_chunking.py)**
#### Overview 🔎
Dividing documents based on semantic coherence rather than fixed sizes.
#### Implementation 🛠️
Use NLP techniques to identify topic boundaries or coherent sections within documents for more meaningful retrieval units.
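One common realization starts a new chunk wherever the similarity between consecutive sentences drops below a threshold; the sketch below uses Jaccard word overlap as a stand-in for sentence-embedding similarity:

```python
import re

def similarity(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0  # Jaccard stand-in for embedding similarity

def semantic_chunks(text: str, threshold: float = 0.1) -> list[str]:
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], [sentences[0]]
    for prev, sent in zip(sentences, sentences[1:]):
        if similarity(prev, sent) < threshold:  # likely topic shift -> start a new chunk
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks

sample = ("Embeddings map text to vectors. Similar text gets similar vectors. "
          "Pandas is a dataframe library. It is popular for tabular data.")
for chunk in semantic_chunks(sample):
    print("-", chunk)
```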
#### Additional Resources 📚
- **[Semantic Chunking: Improving AI Information Retrieval](https://open.substack.com/pub/diamantai/p/semantic-chunking-improving-ai-information?r=336pe4&utm_campaign=post&utm_medium=web)** - A comprehensive blog post exploring the benefits and implementation of semantic chunking in RAG systems.
13. Contextual Compression 🗜️
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/contextual_compression.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/contextual_compression.py)**
#### Overview 🔎
Compressing retrieved information while preserving query-relevant content.
#### Implementation 🛠️
Use an LLM to compress or summarize retrieved chunks, preserving key information relevant to the query.
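A sketch in which the LLM compressor is replaced by a simple keyword filter; `compress` is a placeholder for the LLM call used in the notebook:

```python
import re

def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def compress(chunk: str, query: str) -> str:
    # Placeholder for an LLM compression step: keep only sentences sharing words with the query.
    kept = [s for s in re.split(r"(?<=\.)\s+", chunk) if words(s) & words(query)]
    return " ".join(kept)

retrieved_chunks = [
    "The warranty lasts two years. Shipping takes five days. Returns are free within 30 days.",
    "Our office is in Berlin. The warranty excludes water damage.",
]
query = "How long does the warranty last?"
for chunk in retrieved_chunks:
    print(compress(chunk, query))
```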
14. Document Augmentation through Question Generation for Enhanced Retrieval
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/document_augmentation.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/document_augmentation.py)**
#### Overview 🔎
This implementation demonstrates a text augmentation technique that leverages additional question generation to improve document retrieval within a vector database. By generating and incorporating various questions related to each text fragment, the system enhances the standard retrieval process, thus increasing the likelihood of finding relevant documents that can be utilized as context for generative question answering.
#### Implementation 🛠️
Use an LLM to augment text dataset with all possible questions that can be asked to each document.
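A sketch of the augmentation step; `generate_questions` is a placeholder for the LLM question generator:

```python
def generate_questions(doc: str) -> list[str]:
    # Placeholder: a real implementation prompts an LLM for questions the document answers.
    return [f"What is said about {word}?" for word in doc.lower().split()[:2]]

documents = ["Solar panels convert sunlight into electricity.",
             "Reranking reorders retrieved chunks by relevance."]

# Index both the original documents and their generated questions,
# with every entry pointing back to the source document.
augmented_index = []
for doc in documents:
    augmented_index.append((doc, doc))
    for question in generate_questions(doc):
        augmented_index.append((question, doc))

for text, source in augmented_index:
    print(f"{text!r} -> {source!r}")
```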
### 🚀 Advanced Retrieval Methods
15. Fusion Retrieval 🔗
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/fusion_retrieval.ipynb)
- **LlamaIndex**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/fusion_retrieval_with_llamaindex.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/fusion_retrieval.py)**
#### Overview 🔎
Optimizing search results by combining different retrieval methods.
#### Implementation 🛠️
Combine keyword-based search with vector-based search for more comprehensive and accurate retrieval.
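One way to merge the two result lists is reciprocal rank fusion; the sketch below hard-codes two example rankings (the document ids are invented) instead of running real keyword and vector searches:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Each ranking lists document ids, best first; RRF rewards documents ranked well in any list.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_ranking = ["doc3", "doc1", "doc7"]   # e.g. from BM25
vector_ranking = ["doc1", "doc4", "doc3"]    # e.g. from a vector store
print(reciprocal_rank_fusion([keyword_ranking, vector_ranking]))
```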
16. Intelligent Reranking 📈
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/reranking.ipynb)
- **LlamaIndex**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/reranking_with_llamaindex.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/reranking.py)**
#### Overview 🔎
Applying advanced scoring mechanisms to improve the relevance ranking of retrieved results.
#### Implementation 🛠️
- 🧠 **LLM-based Scoring:** Use a language model to score the relevance of each retrieved chunk.
- 🔀 **Cross-Encoder Models:** Re-encode both the query and retrieved documents jointly for similarity scoring.
- 🏆 **Metadata-enhanced Ranking:** Incorporate metadata into the scoring process for more nuanced ranking.
#### Additional Resources 📚
- **[Relevance Revolution: How Re-ranking Transforms RAG Systems](https://open.substack.com/pub/diamantai/p/relevance-revolution-how-re-ranking?r=336pe4&utm_campaign=post&utm_medium=web)** - A comprehensive blog post exploring the power of re-ranking in enhancing RAG system performance.
17. Multi-faceted Filtering 🔍
#### Overview 🔎
Applying various filtering techniques to refine and improve the quality of retrieved results.
#### Implementation 🛠️
- 🏷️ **Metadata Filtering:** Apply filters based on attributes like date, source, author, or document type.
- 📊 **Similarity Thresholds:** Set thresholds for relevance scores to keep only the most pertinent results.
- 📄 **Content Filtering:** Remove results that don't match specific content criteria or essential keywords.
- 🌈 **Diversity Filtering:** Ensure result diversity by filtering out near-duplicate entries.
18. Hierarchical Indices 🗂️
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/hierarchical_indices.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/hierarchical_indices.py)**
#### Overview 🔎
Creating a multi-tiered system for efficient information navigation and retrieval.
#### Implementation 🛠️
Implement a two-tiered system for document summaries and detailed chunks, both containing metadata pointing to the same location in the data.
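A sketch of the two-tier lookup: rank document summaries first, then rank only the chunks of the best-matching documents. The word-overlap `score` and the tiny corpus are illustrative stand-ins, not the notebook's implementation:

```python
import re

def score(text: str, query: str) -> int:
    return len(set(re.findall(r"[a-z]+", text.lower())) & set(re.findall(r"[a-z]+", query.lower())))

documents = {
    "doc_a": {"summary": "Guide to vector databases and indexing.",
              "chunks": ["HNSW builds a navigable small-world graph.",
                         "IVF partitions vectors into clusters for faster indexing."]},
    "doc_b": {"summary": "Cooking recipes for pasta dishes.",
              "chunks": ["Boil the pasta for nine minutes.", "Add basil at the end."]},
}

def hierarchical_search(query: str, top_docs: int = 1, top_chunks: int = 1) -> list[str]:
    # Tier 1: rank summaries; Tier 2: rank chunks inside the selected documents only.
    best_docs = sorted(documents, key=lambda d: score(documents[d]["summary"], query), reverse=True)[:top_docs]
    candidates = [c for d in best_docs for c in documents[d]["chunks"]]
    return sorted(candidates, key=lambda c: score(c, query), reverse=True)[:top_chunks]

print(hierarchical_search("How does indexing work in vector databases?"))
```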
#### Additional Resources 📚
- **[Hierarchical Indices: Enhancing RAG Systems](https://open.substack.com/pub/diamantai/p/hierarchical-indices-enhancing-rag?r=336pe4&utm_campaign=post&utm_medium=web)** - A comprehensive blog post exploring the power of hierarchical indices in enhancing RAG system performance.
19. Ensemble Retrieval 🎭
#### Overview 🔎
Combining multiple retrieval models or techniques for more robust and accurate results.
#### Implementation 🛠️
Apply different embedding models or retrieval algorithms and use voting or weighting mechanisms to determine the final set of retrieved documents.
20. Dartboard Retrieval 🎯
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/dartboard.ipynb)
#### Overview 🔎
Optimizing over Relevant Information Gain in Retrieval
#### Implementation 🛠️
- Combine both relevance and diversity into a single scoring function and directly optimize for it.
- A proof of concept shows plain RAG underperforming when the database is dense, while dartboard retrieval outperforms it.
21. Multi-modal Retrieval 📽️
#### Overview 🔎
Extending RAG capabilities to handle diverse data types for richer responses.
#### Implementation 🛠️
- **Multi-model RAG with Multimedia Captioning**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/multi_model_rag_with_captioning.ipynb) - Caption the other multimedia data such as PDFs and PPTs, store the captions alongside the text data in the vector store, and retrieve them together.
- **Multi-model RAG with ColPali**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/multi_model_rag_with_colpali.ipynb) - Instead of captioning, convert all the data into images, then find the most relevant images and pass them to a vision language model.
### 🔁 Iterative and Adaptive Techniques
22. Retrieval with Feedback Loops 🔁
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/retrieval_with_feedback_loop.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/retrieval_with_feedback_loop.py)**
#### Overview 🔎
Implementing mechanisms to learn from user interactions and improve future retrievals.
#### Implementation 🛠️
Collect and utilize user feedback on the relevance and quality of retrieved documents and generated responses to fine-tune retrieval and ranking models.
23. Adaptive Retrieval 🎯
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/adaptive_retrieval.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/adaptive_retrieval.py)**
#### Overview 🔎
Dynamically adjusting retrieval strategies based on query types and user contexts.
#### Implementation 🛠️
Classify queries into different categories and use tailored retrieval strategies for each, considering user context and preferences.
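A sketch in which `classify_query` stands in for a learned or LLM-based classifier and each category maps to its own (illustrative) retrieval parameters:

```python
def classify_query(query: str) -> str:
    # Placeholder classifier: a real system would use an LLM or a trained model.
    if "compare" in query.lower():
        return "analytical"
    if query.strip().endswith("?") and len(query.split()) <= 6:
        return "factual"
    return "open_ended"

# Each query type gets its own retrieval strategy; the parameter values are illustrative.
STRATEGIES = {
    "factual":    {"top_k": 3,  "rerank": False},
    "analytical": {"top_k": 10, "rerank": True},
    "open_ended": {"top_k": 6,  "rerank": True},
}

for q in ["Who proposed RAPTOR?", "Compare HyDE and HyPE for latency.", "Tell me about chunking strategies."]:
    strategy = STRATEGIES[classify_query(q)]
    print(q, "->", strategy)
```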
24. Iterative Retrieval 🔄
#### Overview 🔎
Performing multiple rounds of retrieval to refine and enhance result quality.
#### Implementation 🛠️
Use the LLM to analyze initial results and generate follow-up queries to fill in gaps or clarify information.
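A sketch of a two-round loop where `propose_followup_query` stands in for the LLM that inspects the first results and asks a narrower follow-up question:

```python
import re

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    words = lambda t: set(re.findall(r"[a-z]+", t.lower()))
    return sorted(corpus, key=lambda doc: len(words(doc) & words(query)), reverse=True)[:k]

def propose_followup_query(query: str, results: list[str]) -> str:
    # Placeholder: a real system asks an LLM what information is still missing from the results.
    return query + " pricing details"

corpus = ["The basic plan supports five users.",
          "Pricing details: the basic plan costs 10 dollars per month.",
          "Support is available by email."]

query = "What does the basic plan include?"
results = retrieve(query, corpus)
followup = propose_followup_query(query, results)
results += [r for r in retrieve(followup, corpus) if r not in results]  # merge the second round
print(results)
```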
### 📊 Evaluation
25. **DeepEval Evaluation**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/evaluation/evaluation_deep_eval.ipynb)
#### Overview 🔎
Evaluating Retrieval-Augmented Generation systems by covering several metrics and creating test cases.
#### Implementation 🛠️
Use the `deepeval` library to conduct test cases on correctness, faithfulness and contextual relevancy of RAG systems.
26. **GroUSE Evaluation**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/evaluation/evaluation_grouse.ipynb)
#### Overview 🔎
Evaluate the final stage of Retrieval-Augmented Generation using metrics of the GroUSE framework and meta-evaluate your custom LLM judge on GroUSE unit tests.
#### Implementation 🛠️
Use the `grouse` package to evaluate contextually-grounded LLM generations with GPT-4 on the 6 metrics of the GroUSE framework and use unit tests to evaluate a custom Llama 3.1 405B evaluator.
### 🔬 Explainability and Transparency
27. Explainable Retrieval 🔍
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/explainable_retrieval.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/explainable_retrieval.py)**
#### Overview 🔎
Providing transparency in the retrieval process to enhance user trust and system refinement.
#### Implementation 🛠️
Explain why certain pieces of information were retrieved and how they relate to the query.
### 🏗️ Advanced Architectures
28. Knowledge Graph Integration (Graph RAG) 🕸️
- **LangChain**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/graph_rag.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/graph_rag.py)**
#### Overview 🔎
Incorporating structured data from knowledge graphs to enrich context and improve retrieval.
#### Implementation 🛠️
Retrieve entities and their relationships from a knowledge graph relevant to the query, combining this structured data with unstructured text for more informative responses.
29. GraphRag (Microsoft) 🎯
- **GraphRag**: [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/Microsoft_GraphRag.ipynb)
#### Overview 🔎
Microsoft GraphRAG (open source) is an advanced RAG system that integrates knowledge graphs to improve the performance of LLMs.
#### Implementation 🛠️
• Analyze an input corpus by extracting entities and relationships from text units, then generate summaries of each community and its constituents from the bottom up.
30. RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval 🌳
- **LangChain**: [Notebook](https://github.com/NirDiamant/RAG_TECHNIQUES/blob/main/all_rag_techniques/raptor.ipynb) [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/raptor.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/raptor.py)**
#### Overview 🔎
Implementing a recursive approach to process and organize retrieved information in a tree structure.
#### Implementation 🛠️
Use abstractive summarization to recursively process and summarize retrieved documents, organizing the information in a tree structure for hierarchical context.
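A simplified sketch of the recursion follows; it groups chunks by position purely for brevity, whereas the actual technique clusters chunks by embedding before summarizing each cluster.
```python
# Simplified RAPTOR-style tree building: repeated group-then-summarize passes
def build_raptor_tree(chunks, llm, group_size=5):
    """Recursively summarize groups of chunks until a single root summary remains."""
    levels = [chunks]
    current = chunks
    while len(current) > 1:
        groups = [current[i:i + group_size] for i in range(0, len(current), group_size)]
        summaries = [
            llm.invoke("Summarize the following passages:\n\n" + "\n\n".join(group)).content
            for group in groups
        ]
        levels.append(summaries)
        current = summaries
    return levels  # levels[0] = leaf chunks, levels[-1] = the root summary
```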
31. Self RAG 🔁
- **LangChain**: [Notebook](https://github.com/NirDiamant/RAG_TECHNIQUES/blob/main/all_rag_techniques/self_rag.ipynb) [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/self_rag.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/self_rag.py)**
#### Overview 🔎
A dynamic approach that combines retrieval-based and generation-based methods, adaptively deciding whether to use retrieved information and how to best utilize it in generating responses.
#### Implementation 🛠️
• Implement a multi-step process including retrieval decision, document retrieval, relevance evaluation, response generation, support assessment, and utility evaluation to produce accurate, relevant, and useful outputs.
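A compressed sketch of that decision flow is shown below; the prompts and yes/no parsing are illustrative assumptions, and the support and utility checks are only noted in a comment.
```python
def self_rag_answer(query, retriever, llm):
    # Step 1 - retrieval decision: does this query need external documents at all?
    needs_retrieval = "yes" in llm.invoke(
        f"Does answering '{query}' require looking up documents? Answer yes or no."
    ).content.lower()

    context = ""
    if needs_retrieval:
        docs = retriever.invoke(query)
        # Step 2 - relevance evaluation: keep only documents the LLM judges relevant
        relevant = [
            d for d in docs
            if "yes" in llm.invoke(
                f"Is this passage relevant to '{query}'? Answer yes or no.\n\n{d.page_content}"
            ).content.lower()
        ]
        context = "\n\n".join(d.page_content for d in relevant)

    # Step 3 - generation (the support and utility checks follow the same prompt-and-parse pattern)
    return llm.invoke(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:").content
```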
32. Corrective RAG 🔧
- **LangChain**: [Notebook](https://github.com/NirDiamant/RAG_TECHNIQUES/blob/main/all_rag_techniques/crag.ipynb) [Open in Colab](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/crag.ipynb)
- **[Runnable Script](all_rag_techniques_runnable_scripts/crag.py)**
#### Overview 🔎
A sophisticated RAG approach that dynamically evaluates and corrects the retrieval process, combining vector databases, web search, and language models for highly accurate and context-aware responses.
#### Implementation 🛠️
• Integrate Retrieval Evaluator, Knowledge Refinement, Web Search Query Rewriter, and Response Generator components to create a system that adapts its information sourcing strategy based on relevance scores and combines multiple sources when necessary.
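Below is a hedged sketch of the corrective routing logic; the scoring prompt, the threshold, and the `web_search` callable are placeholders standing in for the notebook's dedicated components.
```python
def corrective_rag(query, retriever, llm, web_search, threshold=0.7):
    docs = retriever.invoke(query)

    # Retrieval evaluator: the LLM scores each document's relevance between 0 and 1
    scored = []
    for doc in docs:
        score = float(llm.invoke(
            f"Rate from 0 to 1 how relevant this passage is to '{query}'. "
            f"Reply with a number only.\n\n{doc.page_content}"
        ).content.strip())
        scored.append((score, doc.page_content))

    if scored and max(score for score, _ in scored) >= threshold:
        # Knowledge refinement: keep only the passages that cleared the threshold
        context = "\n\n".join(text for score, text in scored if score >= threshold)
    else:
        # Low confidence in local knowledge: rewrite the query and fall back to web search
        rewritten = llm.invoke(f"Rewrite '{query}' as a concise web search query.").content
        context = web_search(rewritten)

    return llm.invoke(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:").content
```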
## 🌟 Special Advanced Technique 🌟
33. **[Sophisticated Controllable Agent for Complex RAG Tasks 🤖](https://github.com/NirDiamant/Controllable-RAG-Agent)**
#### Overview 🔎
An advanced RAG solution designed to tackle complex questions that simple semantic similarity-based retrieval cannot solve. This approach uses a sophisticated deterministic graph as the "brain" 🧠 of a highly controllable autonomous agent, capable of answering non-trivial questions from your own data.
#### Implementation 🛠️
• Implement a multi-step process involving question anonymization, high-level planning, task breakdown, adaptive information retrieval and question answering, continuous re-planning, and rigorous answer verification to ensure grounded and accurate responses.
## Getting Started
To begin implementing these advanced RAG techniques in your projects:
1. Clone this repository:
```
git clone https://github.com/NirDiamant/RAG_Techniques.git
```
2. Navigate to the technique you're interested in:
```
cd all_rag_techniques/technique-name
```
3. Follow the detailed implementation guide in each technique's directory.
## Contributing
We welcome contributions from the community! If you have a new technique or improvement to suggest:
1. Fork the repository
2. Create your feature branch: `git checkout -b feature/AmazingFeature`
3. Commit your changes: `git commit -m 'Add some AmazingFeature'`
4. Push to the branch: `git push origin feature/AmazingFeature`
5. Open a pull request
## Contributors
[Contributors](https://github.com/NirDiamant/RAG_Techniques/graphs/contributors)
## License
This project is licensed under a custom non-commercial license - see the [LICENSE](LICENSE) file for details.
---
⭐️ If you find this repository helpful, please consider giving it a star!
Keywords: RAG, Retrieval-Augmented Generation, NLP, AI, Machine Learning, Information Retrieval, Natural Language Processing, LLM, Embeddings, Semantic Search
## /all_rag_techniques/HyDe_Hypothetical_Document_Embedding.ipynb
```ipynb path="/all_rag_techniques/HyDe_Hypothetical_Document_Embedding.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/HyDe_Hypothetical_Document_Embedding.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hypothetical Document Embedding (HyDE) in Document Retrieval\n",
"\n",
"## Overview\n",
"\n",
"This code implements a Hypothetical Document Embedding (HyDE) system for document retrieval. HyDE is an innovative approach that transforms query questions into hypothetical documents containing the answer, aiming to bridge the gap between query and document distributions in vector space.\n",
"\n",
"## Motivation\n",
"\n",
"Traditional retrieval methods often struggle with the semantic gap between short queries and longer, more detailed documents. HyDE addresses this by expanding the query into a full hypothetical document, potentially improving retrieval relevance by making the query representation more similar to the document representations in the vector space.\n",
"\n",
"## Key Components\n",
"\n",
"1. PDF processing and text chunking\n",
"2. Vector store creation using FAISS and OpenAI embeddings\n",
"3. Language model for generating hypothetical documents\n",
"4. Custom HyDERetriever class implementing the HyDE technique\n",
"\n",
"## Method Details\n",
"\n",
"### Document Preprocessing and Vector Store Creation\n",
"\n",
"1. The PDF is processed and split into chunks.\n",
"2. A FAISS vector store is created using OpenAI embeddings for efficient similarity search.\n",
"\n",
"### Hypothetical Document Generation\n",
"\n",
"1. A language model (GPT-4) is used to generate a hypothetical document that answers the given query.\n",
"2. The generation is guided by a prompt template that ensures the hypothetical document is detailed and matches the chunk size used in the vector store.\n",
"\n",
"### Retrieval Process\n",
"\n",
"The `HyDERetriever` class implements the following steps:\n",
"\n",
"1. Generate a hypothetical document from the query using the language model.\n",
"2. Use the hypothetical document as the search query in the vector store.\n",
"3. Retrieve the most similar documents to this hypothetical document.\n",
"\n",
"## Key Features\n",
"\n",
"1. Query Expansion: Transforms short queries into detailed hypothetical documents.\n",
"2. Flexible Configuration: Allows adjustment of chunk size, overlap, and number of retrieved documents.\n",
"3. Integration with OpenAI Models: Uses GPT-4 for hypothetical document generation and OpenAI embeddings for vector representation.\n",
"\n",
"## Benefits of this Approach\n",
"\n",
"1. Improved Relevance: By expanding queries into full documents, HyDE can potentially capture more nuanced and relevant matches.\n",
"2. Handling Complex Queries: Particularly useful for complex or multi-faceted queries that might be difficult to match directly.\n",
"3. Adaptability: The hypothetical document generation can adapt to different types of queries and document domains.\n",
"4. Potential for Better Context Understanding: The expanded query might better capture the context and intent behind the original question.\n",
"\n",
"## Implementation Details\n",
"\n",
"1. Uses OpenAI's ChatGPT model for hypothetical document generation.\n",
"2. Employs FAISS for efficient similarity search in the vector space.\n",
"3. Allows for easy visualization of both the hypothetical document and retrieved results.\n",
"\n",
"## Conclusion\n",
"\n",
"Hypothetical Document Embedding (HyDE) represents an innovative approach to document retrieval, addressing the semantic gap between queries and documents. By leveraging advanced language models to expand queries into hypothetical documents, HyDE has the potential to significantly improve retrieval relevance, especially for complex or nuanced queries. This technique could be particularly valuable in domains where understanding query intent and context is crucial, such as legal research, academic literature review, or advanced information retrieval systems."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Clone the repository to access helper functions and evaluation modules\n",
"!git clone https://github.com/N7/RAG_TECHNIQUES.git\n",
"import sys\n",
"sys.path.append('RAG_TECHNIQUES')\n",
"# If you need to run with the latest data\n",
"# !cp -r RAG_TECHNIQUES/data ."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"from dotenv import load_dotenv\n",
"\n",
"\n",
"# Original path append replaced for Colab compatibility\n",
"from helper_functions import *\n",
"from evaluation.evalute_rag import *\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"\n",
"# Set the OpenAI API key environment variable\n",
"os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define document(s) path"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download required data files\n",
"import os\n",
"os.makedirs('data', exist_ok=True)\n",
"\n",
"# Download the PDF document used in this notebook\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"path = \"data/Understanding_Climate_Change.pdf\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define the HyDe retriever class - creating vector store, generating hypothetical document, and retrieving"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"class HyDERetriever:\n",
" def __init__(self, files_path, chunk_size=500, chunk_overlap=100):\n",
" self.llm = ChatOpenAI(temperature=0, model_name=\"gpt-4o-mini\", max_tokens=4000)\n",
"\n",
" self.embeddings = OpenAIEmbeddings()\n",
" self.chunk_size = chunk_size\n",
" self.chunk_overlap = chunk_overlap\n",
" self.vectorstore = encode_pdf(files_path, chunk_size=self.chunk_size, chunk_overlap=self.chunk_overlap)\n",
" \n",
" \n",
" self.hyde_prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"chunk_size\"],\n",
" template=\"\"\"Given the question '{query}', generate a hypothetical document that directly answers this question. The document should be detailed and in-depth.\n",
" the document size has be exactly {chunk_size} characters.\"\"\",\n",
" )\n",
" self.hyde_chain = self.hyde_prompt | self.llm\n",
"\n",
" def generate_hypothetical_document(self, query):\n",
" input_variables = {\"query\": query, \"chunk_size\": self.chunk_size}\n",
" return self.hyde_chain.invoke(input_variables).content\n",
"\n",
" def retrieve(self, query, k=3):\n",
" hypothetical_doc = self.generate_hypothetical_document(query)\n",
" similar_docs = self.vectorstore.similarity_search(hypothetical_doc, k=k)\n",
" return similar_docs, hypothetical_doc\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a HyDe retriever instance"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"retriever = HyDERetriever(path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Demonstrate on a use case"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"test_query = \"What is the main cause of climate change?\"\n",
"results, hypothetical_doc = retriever.retrieve(test_query)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plot the hypothetical document and the retrieved documnets "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"docs_content = [doc.page_content for doc in results]\n",
"\n",
"print(\"hypothetical_doc:\\n\")\n",
"print(text_wrap(hypothetical_doc)+\"\\n\")\n",
"show_context(docs_content)"
]
}
],
"metadata": {
"colab": {
"name": "",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```
## /all_rag_techniques/HyPE_Hypothetical_Prompt_Embeddings.ipynb
```ipynb path="/all_rag_techniques/HyPE_Hypothetical_Prompt_Embeddings.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/HyPE_Hypothetical_Prompt_Embeddings.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hypothetical Prompt Embeddings (HyPE)\n",
"\n",
"## Overview\n",
"\n",
"This code implements a Retrieval-Augmented Generation (RAG) system enhanced by Hypothetical Prompt Embeddings (HyPE). Unlike traditional RAG pipelines that struggle with query-document style mismatch, HyPE precomputes hypothetical questions during the indexing phase. This transforms retrieval into a question-question matching problem, eliminating the need for expensive runtime query expansion techniques.\n",
"\n",
"## Key Components of notebook\n",
"\n",
"1. PDF processing and text extraction\n",
"2. Text chunking to maintain coherent information units\n",
"3. **Hypothetical Prompt Embedding Generation** using an LLM to create multiple proxy questions per chunk\n",
"4. Vector store creation using [FAISS](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) and OpenAI embeddings\n",
"5. Retriever setup for querying the processed documents\n",
"6. Evaluation of the RAG system\n",
"\n",
"## Method Details\n",
"\n",
"### Document Preprocessing\n",
"\n",
"1. The PDF is loaded using `PyPDFLoader`.\n",
"2. The text is split into chunks using `RecursiveCharacterTextSplitter` with specified chunk size and overlap.\n",
"\n",
"### Hypothetical Question Generation\n",
"\n",
"Instead of embedding raw text chunks, HyPE **generates multiple hypothetical prompts** for each chunk. These **precomputed questions** simulate user queries, improving alignment with real-world searches. This removes the need for runtime synthetic answer generation needed in techniques like HyDE.\n",
"\n",
"### Vector Store Creation\n",
"\n",
"1. Each hypothetical question is embedded using OpenAI embeddings.\n",
"2. A FAISS vector store is built, associating **each question embedding with its original chunk**.\n",
"3. This approach **stores multiple representations per chunk**, increasing retrieval flexibility.\n",
"\n",
"### Retriever Setup\n",
"\n",
"1. The retriever is optimized for **question-question matching** rather than direct document retrieval.\n",
"2. The FAISS index enables **efficient nearest-neighbor** search over the hypothetical prompt embeddings.\n",
"3. Retrieved chunks provide a **richer and more precise context** for downstream LLM generation.\n",
"\n",
"## Key Features\n",
"\n",
"1. **Precomputed Hypothetical Prompts** – Improves query alignment without runtime overhead.\n",
"2. **Multi-Vector Representation**– Each chunk is indexed multiple times for broader semantic coverage.\n",
"3. **Efficient Retrieval** – FAISS ensures fast similarity search over the enhanced embeddings.\n",
"4. **Modular Design** – The pipeline is easy to adapt for different datasets and retrieval settings. Additionally it's compatible with most optimizations like reranking etc.\n",
"\n",
"## Evaluation\n",
"\n",
"HyPE's effectiveness is evaluated across multiple datasets, showing:\n",
"\n",
"- Up to 42 percentage points improvement in retrieval precision\n",
"- Up to 45 percentage points improvement in claim recall\n",
" (See full evaluation results in [preprint](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335))\n",
"\n",
"## Benefits of this Approach\n",
"\n",
"1. **Eliminates Query-Time Overhead** – All hypothetical generation is done offline at indexing.\n",
"2. **Enhanced Retrieval Precision** – Better alignment between queries and stored content.\n",
"3. **Scalable & Efficient** – No addinal per-query computational cost; retrieval is as fast as standard RAG.\n",
"4. **Flexible & Extensible** – Can be combined with advanced RAG techniques like reranking.\n",
"\n",
"## Conclusion\n",
"\n",
"HyPE provides a scalable and efficient alternative to traditional RAG systems, overcoming query-document style mismatch while avoiding the computational cost of runtime query expansion. By moving hypothetical prompt generation to indexing, it significantly enhances retrieval precision and efficiency, making it a practical solution for real-world applications.\n",
"\n",
"For further details, refer to the full paper: [preprint](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335)\n",
"\n",
"\n",
"\n",
"\n",
"

\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install faiss-cpu futures langchain-community python-dotenv tqdm"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Clone the repository to access helper functions and evaluation modules\n",
"!git clone https://github.com/N7/RAG_TECHNIQUES.git\n",
"import sys\n",
"sys.path.append('RAG_TECHNIQUES')\n",
"# If you need to run with the latest data\n",
"# !cp -r RAG_TECHNIQUES/data ."
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"import faiss\n",
"from tqdm import tqdm\n",
"from dotenv import load_dotenv\n",
"from concurrent.futures import ThreadPoolExecutor, as_completed\n",
"from langchain_community.docstore.in_memory import InMemoryDocstore\n",
"\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"\n",
"# Set the OpenAI API key environment variable (comment out if not using OpenAI)\n",
"if not os.getenv('OPENAI_API_KEY'):\n",
" os.environ[\"OPENAI_API_KEY\"] = input(\"Please enter your OpenAI API key: \")\n",
"else:\n",
" os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY')\n",
"\n",
"# Original path append replaced for Colab compatibility\n",
"from helper_functions import *\n",
"from evaluation.evalute_rag import *\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define constants\n",
"\n",
"- `PATH`: path to the data, to be embedded into the RAG pipeline\n",
"\n",
"This tutorial uses OpenAI endpoint ([avalible models](https://platform.openai.com/docs/pricing)). \n",
"- `LANGUAGE_MODEL_NAME`: The name of the language model to be used. \n",
"- `EMBEDDING_MODEL_NAME`: The name of the embedding model to be used.\n",
"\n",
"The tutroial uses a `RecursiveCharacterTextSplitter` chunking approach where the chunking length function used is python `len` function. The chunking varables to be tweaked here are:\n",
"- `CHUNK_SIZE`: The minimum length of one chunk\n",
"- `CHUNK_OVERLAP`: The overlap of two consecutive chunks."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download required data files\n",
"import os\n",
"os.makedirs('data', exist_ok=True)\n",
"\n",
"# Download the PDF document used in this notebook\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [],
"source": [
"PATH = \"data/Understanding_Climate_Change.pdf\"\n",
"LANGUAGE_MODEL_NAME = \"gpt-4o-mini\"\n",
"EMBEDDING_MODEL_NAME = \"text-embedding-3-small\"\n",
"CHUNK_SIZE = 1000\n",
"CHUNK_OVERLAP = 200"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define generation of Hypothetical Prompt Embeddings\n",
"\n",
"The code block below generates hypothetical questions for each text chunk and embeds them for retrieval.\n",
"\n",
"- An LLM extracts key questions from the input chunk.\n",
"- These questions are embedded using OpenAI's model.\n",
"- The function returns the original chunk and its prompt embeddings later used for retrieval.\n",
"\n",
"To ensure clean output, extra newlines are removed, and regex parsing can improve list formatting when needed."
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [],
"source": [
"def generate_hypothetical_prompt_embeddings(chunk_text: str):\n",
" \"\"\"\n",
" Uses the LLM to generate multiple hypothetical questions for a single chunk.\n",
" These questions will be used as 'proxies' for the chunk during retrieval.\n",
"\n",
" Parameters:\n",
" chunk_text (str): Text contents of the chunk\n",
"\n",
" Returns:\n",
" chunk_text (str): Text contents of the chunk. This is done to make the \n",
" multithreading easier\n",
" hypothetical prompt embeddings (List[float]): A list of embedding vectors\n",
" generated from the questions\n",
" \"\"\"\n",
" llm = ChatOpenAI(temperature=0, model_name=LANGUAGE_MODEL_NAME)\n",
" embedding_model = OpenAIEmbeddings(model=EMBEDDING_MODEL_NAME)\n",
"\n",
" question_gen_prompt = PromptTemplate.from_template(\n",
" \"Analyze the input text and generate essential questions that, when answered, \\\n",
" capture the main points of the text. Each question should be one line, \\\n",
" without numbering or prefixes.\\n\\n \\\n",
" Text:\\n{chunk_text}\\n\\nQuestions:\\n\"\n",
" )\n",
" question_chain = question_gen_prompt | llm | StrOutputParser()\n",
"\n",
" # parse questions from response\n",
" # Notes: \n",
" # - gpt4o likes to split questions by \\n\\n so we remove one \\n\n",
" # - for production or if using smaller models from ollama, it's beneficial to use regex to parse \n",
" # things like (un)ordeed lists\n",
" # r\"^\\s*[\\-\\*\\•]|\\s*\\d+\\.\\s*|\\s*[a-zA-Z]\\)\\s*|\\s*\\(\\d+\\)\\s*|\\s*\\([a-zA-Z]\\)\\s*|\\s*\\([ivxlcdm]+\\)\\s*\"\n",
" questions = question_chain.invoke({\"chunk_text\": chunk_text}).replace(\"\\n\\n\", \"\\n\").split(\"\\n\")\n",
" \n",
" return chunk_text, embedding_model.embed_documents(questions)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define creation and population of FAISS Vector Store\n",
"\n",
"The code block below builds a FAISS vector store by embedding text chunks in parallel.\n",
"\n",
"What happens?\n",
"- Parallel processing – Uses threading to generate embeddings faster.\n",
"- FAISS initialization – Sets up an L2 index for efficient similarity search.\n",
"- Chunk embedding – Each chunk is stored multiple times, once for each generated question embedding.\n",
"- In-memory storage – Uses InMemoryDocstore for fast lookup.\n",
"\n",
"This ensures efficient retrieval, improving query alignment with precomputed question embeddings."
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [],
"source": [
"def prepare_vector_store(chunks: List[str]):\n",
" \"\"\"\n",
" Creates and populates a FAISS vector store from a list of text chunks.\n",
"\n",
" This function processes a list of text chunks in parallel, generating \n",
" hypothetical prompt embeddings for each chunk.\n",
" The embeddings are stored in a FAISS index for efficient similarity search.\n",
"\n",
" Parameters:\n",
" chunks (List[str]): A list of text chunks to be embedded and stored.\n",
"\n",
" Returns:\n",
" FAISS: A FAISS vector store containing the embedded text chunks.\n",
" \"\"\"\n",
"\n",
" # Wait with initialization to see vector lengths\n",
" vector_store = None \n",
"\n",
" with ThreadPoolExecutor() as pool: \n",
" # Use threading to speed up generation of prompt embeddings\n",
" futures = [pool.submit(generate_hypothetical_prompt_embeddings, c) for c in chunks]\n",
" \n",
" # Process embeddings as they complete\n",
" for f in tqdm(as_completed(futures), total=len(chunks)): \n",
" \n",
" chunk, vectors = f.result() # Retrieve the processed chunk and its embeddings\n",
" \n",
" # Initialize the FAISS vector store on the first chunk\n",
" if vector_store == None: \n",
" vector_store = FAISS(\n",
" embedding_function=OpenAIEmbeddings(model=EMBEDDING_MODEL_NAME), # Define embedding model\n",
" index=faiss.IndexFlatL2(len(vectors[0])) # Define an L2 index for similarity search\n",
" docstore=InMemoryDocstore(), # Use in-memory document storage\n",
" index_to_docstore_id={} # Maintain index-to-document mapping\n",
" )\n",
" \n",
" # Pair the chunk's content with each generated embedding vector.\n",
" # Each chunk is inserted multiple times, once for each prompt vector\n",
" chunks_with_embedding_vectors = [(chunk.page_content, vec) for vec in vectors]\n",
" \n",
" # Add embeddings to the store\n",
" vector_store.add_embeddings(chunks_with_embedding_vectors) \n",
"\n",
" return vector_store # Return the populated vector store\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Encode PDF into a FAISS Vector Store\n",
"\n",
"The code block below processes a PDF file and stores its content as embeddings for retrieval.\n",
"\n",
"What happens?\n",
"- PDF loading – Extracts text from the document.\n",
"- Chunking – Splits text into overlapping segments for better context retention.\n",
"- Preprocessing – Cleans text to improve embedding quality.\n",
"- Vector store creation – Generates embeddings and stores them in FAISS for retrieval."
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [],
"source": [
"def encode_pdf(path, chunk_size=1000, chunk_overlap=200):\n",
" \"\"\"\n",
" Encodes a PDF book into a vector store using OpenAI embeddings.\n",
"\n",
" Args:\n",
" path: The path to the PDF file.\n",
" chunk_size: The desired size of each text chunk.\n",
" chunk_overlap: The amount of overlap between consecutive chunks.\n",
"\n",
" Returns:\n",
" A FAISS vector store containing the encoded book content.\n",
" \"\"\"\n",
"\n",
" # Load PDF documents\n",
" loader = PyPDFLoader(path)\n",
" documents = loader.load()\n",
"\n",
" # Split documents into chunks\n",
" text_splitter = RecursiveCharacterTextSplitter(\n",
" chunk_size=chunk_size, chunk_overlap=chunk_overlap, length_function=len\n",
" )\n",
" texts = text_splitter.split_documents(documents)\n",
" cleaned_texts = replace_t_with_space(texts)\n",
"\n",
" vectorstore = prepare_vector_store(cleaned_texts)\n",
"\n",
" return vectorstore"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create HyPE vector store\n",
"\n",
"Now we process the PDF and store its embeddings.\n",
"This step initializes the FAISS vector store with the encoded document."
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 97/97 [00:22<00:00, 4.40it/s]\n"
]
}
],
"source": [
"# Chunk size can be quite large with HyPE as we are not loosing percision with more\n",
"# information. For production, test how exhaustive your model is in generating sufficient \n",
"# amount of questions per chunk. This will mostly depend on your information density.\n",
"chunks_vector_store = encode_pdf(PATH, chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create retriever\n",
"\n",
"Now we set up the retriever to fetch relevant chunks from the vector store.\n",
"\n",
"Retrieves the top `k=3` most relevant chunks based on query similarity."
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [],
"source": [
"chunks_query_retriever = chunks_vector_store.as_retriever(search_kwargs={\"k\": 3})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Test retriever\n",
"\n",
"Now we test retrieval using a sample query.\n",
"\n",
"- Queries the vector store to find the most relevant chunks.\n",
"- Deduplicates results to remove potentially repeated chunks.\n",
"- Displays the retrieved context for inspection.\n",
"\n",
"This step verifies that the retriever returns meaningful and diverse information for the given question."
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Context 1:\n",
"Most of these climate changes are attributed to very small variations in Earth's orbit that \n",
"change the amount of solar energy our planet receives. During the Holocene epoch, which \n",
"began at the end of the last ice age, human societies f lourished, but the industrial era has seen \n",
"unprecedented changes. \n",
"Modern Observations \n",
"Modern scientific observations indicate a rapid increase in global temperatures, sea levels, \n",
"and extreme weather events. The Intergovernmental Panel on Climate Change (IPCC) has \n",
"documented these changes extensively. Ice core samples, tree rings, and ocean sediments \n",
"provide a historical record that scientists use to understand past climate conditions and \n",
"predict future trends. The evidence overwhelmingly shows that recent changes are primarily \n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases\n",
"\n",
"\n",
"Context 2:\n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases \n",
"The primary cause of recent climate change is the increase in greenhouse gases in the \n",
"atmosphere. Greenhouse gases, such as carbon dioxide (CO2), methane (CH4), and nitrous \n",
"oxide (N2O), trap heat from the sun, creating a \"greenhouse effect.\" This effect is essential \n",
"for life on Earth, as it keeps the planet warm enough to support life. However, human \n",
"activities have intensified this natural process, leading to a warmer climate. \n",
"Fossil Fuels \n",
"Burning fossil fuels for energy releases large amounts of CO2. This includes coal, oil, and \n",
"natural gas used for electricity, heating, and transportation. The industrial revolution marked \n",
"the beginning of a significant increase in fossil fuel consumption, which continues to rise \n",
"today. \n",
"Coal\n",
"\n",
"\n",
"Context 3:\n",
"Understanding Climate Change \n",
"Chapter 1: Introduction to Climate Change \n",
"Climate change refers to significant, long -term changes in the global climate. The term \n",
"\"global climate\" encompasses the planet's overall weather patterns, including temperature, \n",
"precipitation, and wind patterns, over an extended period. Over the past cent ury, human \n",
"activities, particularly the burning of fossil fuels and deforestation, have significantly \n",
"contributed to climate change. \n",
"Historical Context \n",
"The Earth's climate has changed throughout history. Over the past 650,000 years, there have \n",
"been seven cycles of glacial advance and retreat, with the abrupt end of the last ice age about \n",
"11,700 years ago marking the beginning of the modern climate era and human civilization. \n",
"Most of these climate changes are attributed to very small variations in Earth's orbit that \n",
"change the amount of solar energy our planet receives. During the Holocene epoch, which\n",
"\n",
"\n"
]
}
],
"source": [
"test_query = \"What is the main cause of climate change?\"\n",
"context = retrieve_context_per_question(test_query, chunks_query_retriever)\n",
"context = list(set(context))\n",
"show_context(context)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Evaluate results"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'questions': ['1. **Multiple Choice: Causes of Climate Change**',\n",
" ' - What is the primary cause of the current climate change trend?',\n",
" ' A) Solar radiation variations',\n",
" ' B) Natural cycles of the Earth',\n",
" ' C) Human activities, such as burning fossil fuels',\n",
" ' D) Volcanic eruptions',\n",
" '',\n",
" '2. **True or False: Impact on Biodiversity**',\n",
" ' - True or False: Climate change does not have any significant impact on the migration patterns and extinction rates of various species.',\n",
" '',\n",
" '3. **Short Answer: Mitigation Strategies**',\n",
" ' - What are two effective strategies that can be implemented at a community level to mitigate the effects of climate change?',\n",
" '',\n",
" '4. **Matching: Climate Change Effects**',\n",
" ' - Match the following effects of climate change (numbered) with their likely consequences (lettered).',\n",
" ' 1. Rising sea levels',\n",
" ' 2. Increased frequency of extreme weather events',\n",
" ' 3. Melting polar ice caps',\n",
" ' 4. Ocean acidification',\n",
" ' ',\n",
" ' A) Displacement of coastal communities',\n",
" ' B) Loss of marine biodiversity',\n",
" ' C) Increased global temperatures',\n",
" ' D) More frequent and severe hurricanes and floods',\n",
" '',\n",
" '5. **Essay: International Cooperation**',\n",
" ' - Discuss the importance of international cooperation in combating climate change. Include examples of successful global agreements or initiatives and explain how they have contributed to addressing climate change.'],\n",
" 'results': ['\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 1,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 1,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 2\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 2,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 2\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n\`\`\`',\n",
" '\`\`\`json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 2,\\n \"Conciseness\": 3\\n}\\n\`\`\`'],\n",
" 'average_scores': None}"
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"evaluate_rag(chunks_query_retriever)"
]
}
],
"metadata": {
"colab": {
"name": "",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```
## /all_rag_techniques/Microsoft_GraphRag.ipynb
```ipynb path="/all_rag_techniques/Microsoft_GraphRag.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Microsoft GraphRAG: Enhancing Retrieval-Augmented Generation with Knowledge Graphs\n",
"\n",
" \n",
"## Overview\n",
"\n",
" \n",
"Microsoft GraphRAG is an advanced Retrieval-Augmented Generation (RAG) system that integrates knowledge graphs to improve the performance of large language models (LLMs). Developed by Microsoft Research, GraphRAG addresses limitations in traditional RAG approaches by using LLM-generated knowledge graphs to enhance document analysis and improve response quality.\n",
"\n",
"## Motivation\n",
"\n",
" \n",
"Traditional RAG systems often struggle with complex queries that require synthesizing information from disparate sources. GraphRAG aims to:\n",
"Connect related information across datasets.\n",
"Enhance understanding of semantic concepts.\n",
"Improve performance on global sensemaking tasks.\n",
"\n",
"## Key Components\n",
"\n",
"Knowledge Graph Generation: Constructs graphs with entities as nodes and relationships as edges.\n",
"Community Detection: Identifies clusters of related entities within the graph.\n",
"Summarization: Generates summaries for each community to provide context for LLMs.\n",
"Query Processing: Uses these summaries to enhance the LLM's ability to answer complex questions.\n",
"## Method Details\n",
"\n",
"Indexing Stage\n",
"\n",
" \n",
"Text Chunking: Splits source texts into manageable chunks.\n",
"Element Extraction: Uses LLMs to identify entities and relationships.\n",
"Graph Construction: Builds a graph from the extracted elements.\n",
"Community Detection: Applies algorithms like Leiden to find communities.\n",
"Community Summarization: Creates summaries for each community.\n",
"\n",
"Query Stage\n",
"\n",
" \n",
"Local Answer Generation: Uses community summaries to generate preliminary answers.\n",
"Global Answer Synthesis: Combines local answers to form a comprehensive response.\n",
"\n",
"\n",
"## Benefits of GraphRAG\n",
"GraphRAG is a powerful tool that addresses some of the key limitations of the baseline RAG model. Unlike the standard RAG model, GraphRAG excels at identifying connections between disparate pieces of information and drawing insights from them. This makes it an ideal choice for users who need to extract insights from large data collections or documents that are difficult to summarize. By leveraging its advanced graph-based architecture, GraphRAG is able to provide a holistic understanding of complex semantic concepts, making it an invaluable tool for anyone who needs to find information quickly and accurately. Whether you're a researcher, analyst, or just someone who needs to stay informed, GraphRAG can help you connect the dots and uncover new insights.\n",
"\n",
"## Conclusion\n",
"\n",
"Microsoft GraphRAG represents a significant step forward in retrieval-augmented generation, particularly for tasks requiring a global understanding of datasets. By incorporating knowledge graphs, it offers improved performance, making it ideal for complex information retrieval and analysis.\n",
"\n",
"For those experienced with basic RAG systems, GraphRAG offers an opportunity to explore more sophisticated solutions, although it may not be necessary for all use cases.\n",
"Retrieval Augmented Generation (RAG) is often performed by chunking long texts, creating a text embedding for each chunk, and retrieving chunks for including in the LLM generation context based on a similarity search against the query. This approach works well in many scenarios, and at compelling speed and cost trade-offs, but doesn't always cope well in scenarios where a detailed understanding of the text is required.\n",
"\n",
"GraphRag ( [microsoft.github.io/graphrag](https://microsoft.github.io/graphrag/) )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To run this notebook you can use either OpenAI API key or Azure OpenAI key. \n",
"Create a `.env` file and fill in the credentials for your OpenAI or Azure Open AI deployment. The following code loads these environment variables and sets up our AI client.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"AZURE_OPENAI_API_KEY=\"\"\n",
"AZURE_OPENAI_ENDPOINT=\"\"\n",
"GPT4O_MODEL_NAME=\"gpt-4o\"\n",
"TEXT_EMBEDDING_3_LARGE_DEPLOYMENT_NAME=\"\"\n",
"AZURE_OPENAI_API_VERSION=\"2024-06-01\"\n",
"\n",
"OPENAI_API_KEY=\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install graphrag"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install beautifulsoup4 openai python-dotenv pyyaml"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation\n",
"\n",
"The cell below installs all necessary packages required to run this notebook. If you're running this notebook in a new environment, execute this cell first to ensure all dependencies are installed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install openai python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from dotenv import load_dotenv\n",
"import os\n",
"load_dotenv()\n",
"from openai import AzureOpenAI, OpenAI\n",
"\n",
"AZURE=True #Change to False to use OpenAI\n",
"if AZURE:\n",
" AZURE_OPENAI_API_KEY = os.getenv(\"AZURE_OPENAI_API_KEY\")\n",
" AZURE_OPENAI_ENDPOINT = os.getenv(\"AZURE_OPENAI_ENDPOINT\")\n",
" GPT4O_DEPLOYMENT_NAME = os.getenv(\"GPT4O_MODEL_NAME\")\n",
" TEXT_EMBEDDING_3_LARGE_NAME = os.getenv(\"TEXT_EMBEDDING_3_LARGE_DEPLOYMENT_NAME\")\n",
" AZURE_OPENAI_API_VERSION = os.getenv(\"AZURE_OPENAI_API_VERSION\")\n",
" oai = AzureOpenAI(azure_endpoint=AZURE_OPENAI_ENDPOINT, api_key=AZURE_OPENAI_API_KEY, api_version=AZURE_OPENAI_API_VERSION)\n",
"else:\n",
" OPENAI_API_KEY = os.getenv(\"OPENAI_API_KEY\")\n",
" oai = OpenAI(api_key=OPENAI_API_KEY)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll start by getting a text to work with. The Wikipedia article on Elon Musk"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"from bs4 import BeautifulSoup\n",
"\n",
"url = \"https://en.wikipedia.org/wiki/Elon_Musk\" # Replace with the URL of the web page you want to scrape\n",
"response = requests.get(url)\n",
"soup = BeautifulSoup(response.text, \"html.parser\")\n",
"\n",
"if not os.path.exists('data'): \n",
" os.makedirs('data')\n",
"\n",
"if not os.path.exists('data/elon.md'):\n",
" elon = soup.text.split('\\nSee also')[0]\n",
" with open('data/elon.md', 'w', encoding='utf-8') as f:\n",
" f.write(elon)\n",
"else:\n",
" with open('data/elon.md', 'r') as f:\n",
" elon = f.read()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"GraphRag has a convenient set of CLI commands we can use. We'll start by configuring the system, then run the indexing operation. Indexing with GraphRag is a much lengthier process, and one that costs significantly more, since rather than just calculating embeddings, GraphRag makes many LLM calls to analyse the text, extract entities, and construct the graph. That's a one-time expense, though."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import yaml\n",
"\n",
"if not os.path.exists('data/graphrag'):\n",
" !python -m graphrag.index --init --root data/graphrag\n",
"\n",
"with open('data/graphrag/settings.yaml', 'r') as f:\n",
" settings_yaml = yaml.load(f, Loader=yaml.FullLoader)\n",
"settings_yaml['llm']['model'] = \"gpt-4o\"\n",
"settings_yaml['llm']['api_key'] = AZURE_OPENAI_API_KEY if AZURE else OPENAI_API_KEY\n",
"settings_yaml['llm']['type'] = 'azure_openai_chat' if AZURE else 'openai_chat'\n",
"settings_yaml['embeddings']['llm']['api_key'] = AZURE_OPENAI_API_KEY if AZURE else OPENAI_API_KEY\n",
"settings_yaml['embeddings']['llm']['type'] = 'azure_openai_embedding' if AZURE else 'openai_embedding'\n",
"settings_yaml['embeddings']['llm']['model'] = TEXT_EMBEDDING_3_LARGE_NAME if AZURE else 'text-embedding-3-large'\n",
"if AZURE:\n",
" settings_yaml['llm']['api_version'] = AZURE_OPENAI_API_VERSION\n",
" settings_yaml['llm']['deployment_name'] = GPT4O_DEPLOYMENT_NAME\n",
" settings_yaml['llm']['api_base'] = AZURE_OPENAI_ENDPOINT\n",
" settings_yaml['embeddings']['llm']['api_version'] = AZURE_OPENAI_API_VERSION\n",
" settings_yaml['embeddings']['llm']['deployment_name'] = TEXT_EMBEDDING_3_LARGE_NAME\n",
" settings_yaml['embeddings']['llm']['api_base'] = AZURE_OPENAI_ENDPOINT\n",
"\n",
"with open('data/graphrag/settings.yaml', 'w') as f:\n",
" yaml.dump(settings_yaml, f)\n",
"\n",
"if not os.path.exists('data/graphrag/input'):\n",
" os.makedirs('data/graphrag/input')\n",
" !cp data/elon.md data/graphrag/input/elon.txt\n",
" !python -m graphrag.index --root ./data/graphrag"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You should get an output:\n",
"\ud83d\ude80 \u001bAll workflows completed successfully.\u001b"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To query GraphRag we'll use its CLI again, making sure to configure it with a context length equivalent to what we use in our embeddings search."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"import subprocess\n",
"import re\n",
"DEFAULT_RESPONSE_TYPE = 'Summarize and explain in 1-2 paragraphs with bullet points using at most 300 tokens'\n",
"DEFAULT_MAX_CONTEXT_TOKENS = 10000\n",
"\n",
"def remove_data(text):\n",
" return re.sub(r'\\[Data:.*?\\]', '', text).strip()\n",
"\n",
"\n",
"def ask_graph(query,method):\n",
" env = os.environ.copy() | {\n",
" 'GRAPHRAG_GLOBAL_SEARCH_MAX_TOKENS': str(DEFAULT_MAX_CONTEXT_TOKENS),\n",
" }\n",
" command = [\n",
" 'python', '-m', 'graphrag.query',\n",
" '--root', './data/graphrag',\n",
" '--method', method,\n",
" '--response_type', DEFAULT_RESPONSE_TYPE,\n",
" query,\n",
" ]\n",
" output = subprocess.check_output(command, universal_newlines=True, env=env, stderr=subprocess.DEVNULL)\n",
" return remove_data(output.split('Search Response: ')[1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"GrpahRag offers 2 types of search:\n",
"1. Global Search for reasoning about holistic questions about the corpus by leveraging the community summaries.\n",
"2. Local Search for reasoning about specific entities by fanning-out to their neighbors and associated concepts."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's check the local search:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Elon Musk has founded several companies and subsidiaries across various industries. Here's a summary:\n",
"\n",
"- **SpaceX**: Founded in 2002, SpaceX is a private aerospace manufacturer and space transportation company. Musk serves as the CEO and chief engineer .\n",
"\n",
"- **Tesla, Inc.**: Although not originally founded by Musk, he became an early investor and later the CEO and product architect, significantly shaping its direction .\n",
"\n",
"- **Neuralink**: Co-founded by Musk, this company focuses on developing brain-machine interfaces to enhance human-computer interaction .\n",
"\n",
"- **The Boring Company**: Founded by Musk, it specializes in tunnel construction and innovative transportation solutions .\n",
"\n",
"- **X.com/PayPal**: Musk co-founded X.com, which later became PayPal after merging with Confinity .\n",
"\n",
"- **Zip2**: Co-founded with his brother Kimbal, this was Musk's first venture, later acquired by Compaq .\n",
"\n",
"- **SolarCity**: Co-created by Musk, it was later acquired by Tesla and rebranded as Tesla Energy .\n",
"\n",
"- **xAI**: Founded in 2023, this company focuses on artificial intelligence research .\n",
"\n",
"- **OpenAI**: Co-founded by Musk, this nonprofit organization is dedicated to AI research .\n",
"\n",
"In total, Musk has founded or co-founded at least nine companies and subsidiaries."
],
"text/plain": [
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import Markdown\n",
"local_query=\"What and how many companies and subsidieries founded by Elon Musk\"\n",
"local_result = ask_graph(local_query,'local')\n",
"\n",
"Markdown(local_result)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Elon Musk has achieved significant accomplishments across various industries, demonstrating his influence and innovation:\n",
"\n",
"- **Space Exploration**: Founder, CEO, and chief engineer of SpaceX, Musk has propelled the company to the forefront of space exploration and satellite deployment, establishing it as a leading spaceflight services provider .\n",
"\n",
"- **Automotive Industry**: As CEO of Tesla, Musk has driven the company to the forefront of electric vehicles and sustainable energy, significantly impacting the automotive industry with innovations in electric cars and energy solutions .\n",
"\n",
"- **Online Payments**: Co-founded X.com, which evolved into PayPal, revolutionizing online transactions and becoming a major player in the online payment industry .\n",
"\n",
"- **Neural Technology**: Co-founded Neuralink, focusing on advancing brain-machine interface technology to enhance the connection between the human brain and computers .\n",
"\n",
"- **Infrastructure**: Founded The Boring Company, specializing in tunnel construction to reduce traffic congestion through innovative underground transportation systems ."
],
"text/plain": [
""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"global_query=\"What are the major accomplishments of Elon Musk?\"\n",
"global_result = ask_graph(global_query,'global')\n",
"\n",
"Markdown(global_result)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```
## /all_rag_techniques/adaptive_retrieval.ipynb
```ipynb path="/all_rag_techniques/adaptive_retrieval.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/adaptive_retrieval.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Adaptive Retrieval-Augmented Generation (RAG) System\n",
"\n",
"## Overview\n",
"\n",
"This system implements an advanced Retrieval-Augmented Generation (RAG) approach that adapts its retrieval strategy based on the type of query. By leveraging Language Models (LLMs) at various stages, it aims to provide more accurate, relevant, and context-aware responses to user queries.\n",
"\n",
"## Motivation\n",
"\n",
"Traditional RAG systems often use a one-size-fits-all approach to retrieval, which can be suboptimal for different types of queries. Our adaptive system is motivated by the understanding that different types of questions require different retrieval strategies. For example, a factual query might benefit from precise, focused retrieval, while an analytical query might require a broader, more diverse set of information.\n",
"\n",
"## Key Components\n",
"\n",
"1. **Query Classifier**: Determines the type of query (Factual, Analytical, Opinion, or Contextual).\n",
"\n",
"2. **Adaptive Retrieval Strategies**: Four distinct strategies tailored to different query types:\n",
" - Factual Strategy\n",
" - Analytical Strategy\n",
" - Opinion Strategy\n",
" - Contextual Strategy\n",
"\n",
"3. **LLM Integration**: LLMs are used throughout the process to enhance retrieval and ranking.\n",
"\n",
"4. **OpenAI GPT Model**: Generates the final response using the retrieved documents as context.\n",
"\n",
"## Method Details\n",
"\n",
"### 1. Query Classification\n",
"\n",
"The system begins by classifying the user's query into one of four categories:\n",
"- Factual: Queries seeking specific, verifiable information.\n",
"- Analytical: Queries requiring comprehensive analysis or explanation.\n",
"- Opinion: Queries about subjective matters or seeking diverse viewpoints.\n",
"- Contextual: Queries that depend on user-specific context.\n",
"\n",
"### 2. Adaptive Retrieval Strategies\n",
"\n",
"Each query type triggers a specific retrieval strategy:\n",
"\n",
"#### Factual Strategy\n",
"- Enhances the original query using an LLM for better precision.\n",
"- Retrieves documents based on the enhanced query.\n",
"- Uses an LLM to rank documents by relevance.\n",
"\n",
"#### Analytical Strategy\n",
"- Generates multiple sub-queries using an LLM to cover different aspects of the main query.\n",
"- Retrieves documents for each sub-query.\n",
"- Ensures diversity in the final document selection using an LLM.\n",
"\n",
"#### Opinion Strategy\n",
"- Identifies different viewpoints on the topic using an LLM.\n",
"- Retrieves documents representing each viewpoint.\n",
"- Uses an LLM to select a diverse range of opinions from the retrieved documents.\n",
"\n",
"#### Contextual Strategy\n",
"- Incorporates user-specific context into the query using an LLM.\n",
"- Performs retrieval based on the contextualized query.\n",
"- Ranks documents considering both relevance and user context.\n",
"\n",
"### 3. LLM-Enhanced Ranking\n",
"\n",
"After retrieval, each strategy uses an LLM to perform a final ranking of the documents. This step ensures that the most relevant and appropriate documents are selected for the next stage.\n",
"\n",
"### 4. Response Generation\n",
"\n",
"The final set of retrieved documents is passed to an OpenAI GPT model, which generates a response based on the query and the provided context.\n",
"\n",
"## Benefits of This Approach\n",
"\n",
"1. **Improved Accuracy**: By tailoring the retrieval strategy to the query type, the system can provide more accurate and relevant information.\n",
"\n",
"2. **Flexibility**: The system adapts to different types of queries, handling a wide range of user needs.\n",
"\n",
"3. **Context-Awareness**: Especially for contextual queries, the system can incorporate user-specific information for more personalized responses.\n",
"\n",
"4. **Diverse Perspectives**: For opinion-based queries, the system actively seeks out and presents multiple viewpoints.\n",
"\n",
"5. **Comprehensive Analysis**: The analytical strategy ensures a thorough exploration of complex topics.\n",
"\n",
"## Conclusion\n",
"\n",
"This adaptive RAG system represents a significant advancement over traditional RAG approaches. By dynamically adjusting its retrieval strategy and leveraging LLMs throughout the process, it aims to provide more accurate, relevant, and nuanced responses to a wide variety of user queries."
]
},
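{
"cell_type": "markdown",
"metadata": {},
"source": [
"### A minimal sketch of the adaptive flow\n",
"\n",
"Before diving into the full implementation below, here is a minimal, self-contained sketch of the core control flow: classify the query, dispatch to a per-category strategy, then generate an answer from the retrieved context. The stub functions and hard-coded category below are illustrative placeholders only; the real LLM-backed classifier and strategies are defined in the following sections."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch only: plain-Python stubs stand in for the LLM-backed strategies built below.\n",
"def factual_retrieve(query):\n",
"    return [\"precise facts about: \" + query]\n",
"\n",
"def analytical_retrieve(query):\n",
"    return [\"sub-aspect 1 of: \" + query, \"sub-aspect 2 of: \" + query]\n",
"\n",
"def opinion_retrieve(query):\n",
"    return [\"viewpoint A on: \" + query, \"viewpoint B on: \" + query]\n",
"\n",
"def contextual_retrieve(query):\n",
"    return [\"user-context-aware results for: \" + query]\n",
"\n",
"strategies = {\n",
"    \"Factual\": factual_retrieve,\n",
"    \"Analytical\": analytical_retrieve,\n",
"    \"Opinion\": opinion_retrieve,\n",
"    \"Contextual\": contextual_retrieve,\n",
"}\n",
"\n",
"def adaptive_retrieve_sketch(query, category):\n",
"    # In the real system below, the category comes from an LLM-based QueryClassifier.\n",
"    return strategies[category](query)\n",
"\n",
"print(adaptive_retrieve_sketch(\"What is the distance between the Earth and the Sun?\", \"Factual\"))"
]
},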
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install faiss-cpu langchain langchain-openai python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Clone the repository to access helper functions and evaluation modules\n",
"!git clone https://github.com/N7/RAG_TECHNIQUES.git\n",
"import sys\n",
"sys.path.append('RAG_TECHNIQUES')\n",
"# If you need to run with the latest data\n",
"# !cp -r RAG_TECHNIQUES/data ."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"from dotenv import load_dotenv\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.vectorstores import FAISS\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.prompts import PromptTemplate\n",
"\n",
"from langchain_core.retrievers import BaseRetriever\n",
"from typing import Dict, Any\n",
"from langchain.docstore.document import Document\n",
"from langchain_openai import ChatOpenAI\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"# Original path append replaced for Colab compatibility\n",
"from helper_functions import *\n",
"from evaluation.evalute_rag import *\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"\n",
"# Set the OpenAI API key environment variable\n",
"os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define the query classifer class"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"class categories_options(BaseModel):\n",
" category: str = Field(description=\"The category of the query, the options are: Factual, Analytical, Opinion, or Contextual\", example=\"Factual\")\n",
"\n",
"\n",
"class QueryClassifier:\n",
" def __init__(self):\n",
" self.llm = ChatOpenAI(temperature=0, model_name=\"gpt-4o\", max_tokens=4000)\n",
" self.prompt = PromptTemplate(\n",
" input_variables=[\"query\"],\n",
" template=\"Classify the following query into one of these categories: Factual, Analytical, Opinion, or Contextual.\\nQuery: {query}\\nCategory:\"\n",
" )\n",
" self.chain = self.prompt | self.llm.with_structured_output(categories_options)\n",
"\n",
"\n",
" def classify(self, query):\n",
" print(\"clasiffying query\")\n",
" return self.chain.invoke(query).category"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define the Base Retriever class, such that the complex ones will inherit from it"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"class BaseRetrievalStrategy:\n",
" def __init__(self, texts):\n",
" self.embeddings = OpenAIEmbeddings()\n",
" text_splitter = CharacterTextSplitter(chunk_size=800, chunk_overlap=0)\n",
" self.documents = text_splitter.create_documents(texts)\n",
" self.db = FAISS.from_documents(self.documents, self.embeddings)\n",
" self.llm = ChatOpenAI(temperature=0, model_name=\"gpt-4o\", max_tokens=4000)\n",
"\n",
"\n",
" def retrieve(self, query, k=4):\n",
" return self.db.similarity_search(query, k=k)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define Factual retriever strategy"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"class relevant_score(BaseModel):\n",
" score: float = Field(description=\"The relevance score of the document to the query\", example=8.0)\n",
"\n",
"class FactualRetrievalStrategy(BaseRetrievalStrategy):\n",
" def retrieve(self, query, k=4):\n",
" print(\"retrieving factual\")\n",
" # Use LLM to enhance the query\n",
" enhanced_query_prompt = PromptTemplate(\n",
" input_variables=[\"query\"],\n",
" template=\"Enhance this factual query for better information retrieval: {query}\"\n",
" )\n",
" query_chain = enhanced_query_prompt | self.llm\n",
" enhanced_query = query_chain.invoke(query).content\n",
" print(f'enhande query: {enhanced_query}')\n",
"\n",
" # Retrieve documents using the enhanced query\n",
" docs = self.db.similarity_search(enhanced_query, k=k*2)\n",
"\n",
" # Use LLM to rank the relevance of retrieved documents\n",
" ranking_prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"doc\"],\n",
" template=\"On a scale of 1-10, how relevant is this document to the query: '{query}'?\\nDocument: {doc}\\nRelevance score:\"\n",
" )\n",
" ranking_chain = ranking_prompt | self.llm.with_structured_output(relevant_score)\n",
"\n",
" ranked_docs = []\n",
" print(\"ranking docs\")\n",
" for doc in docs:\n",
" input_data = {\"query\": enhanced_query, \"doc\": doc.page_content}\n",
" score = float(ranking_chain.invoke(input_data).score)\n",
" ranked_docs.append((doc, score))\n",
"\n",
" # Sort by relevance score and return top k\n",
" ranked_docs.sort(key=lambda x: x[1], reverse=True)\n",
" return [doc for doc, _ in ranked_docs[:k]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define Analytical reriever strategy"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
"class SelectedIndices(BaseModel):\n",
" indices: List[int] = Field(description=\"Indices of selected documents\", example=[0, 1, 2, 3])\n",
"\n",
"class SubQueries(BaseModel):\n",
" sub_queries: List[str] = Field(description=\"List of sub-queries for comprehensive analysis\", example=[\"What is the population of New York?\", \"What is the GDP of New York?\"])\n",
"\n",
"class AnalyticalRetrievalStrategy(BaseRetrievalStrategy):\n",
" def retrieve(self, query, k=4):\n",
" print(\"retrieving analytical\")\n",
" # Use LLM to generate sub-queries for comprehensive analysis\n",
" sub_queries_prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"k\"],\n",
" template=\"Generate {k} sub-questions for: {query}\"\n",
" )\n",
"\n",
" llm = ChatOpenAI(temperature=0, model_name=\"gpt-4o\", max_tokens=4000)\n",
" sub_queries_chain = sub_queries_prompt | llm.with_structured_output(SubQueries)\n",
"\n",
" input_data = {\"query\": query, \"k\": k}\n",
" sub_queries = sub_queries_chain.invoke(input_data).sub_queries\n",
" print(f'sub queries for comprehensive analysis: {sub_queries}')\n",
"\n",
" all_docs = []\n",
" for sub_query in sub_queries:\n",
" all_docs.extend(self.db.similarity_search(sub_query, k=2))\n",
"\n",
" # Use LLM to ensure diversity and relevance\n",
" diversity_prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"docs\", \"k\"],\n",
" template=\"\"\"Select the most diverse and relevant set of {k} documents for the query: '{query}'\\nDocuments: {docs}\\n\n",
" Return only the indices of selected documents as a list of integers.\"\"\"\n",
" )\n",
" diversity_chain = diversity_prompt | self.llm.with_structured_output(SelectedIndices)\n",
" docs_text = \"\\n\".join([f\"{i}: {doc.page_content[:50]}...\" for i, doc in enumerate(all_docs)])\n",
" input_data = {\"query\": query, \"docs\": docs_text, \"k\": k}\n",
" selected_indices_result = diversity_chain.invoke(input_data).indices\n",
" print(f'selected diverse and relevant documents')\n",
" \n",
" return [all_docs[i] for i in selected_indices_result if i < len(all_docs)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define Opinion retriever strategy"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"class OpinionRetrievalStrategy(BaseRetrievalStrategy):\n",
" def retrieve(self, query, k=3):\n",
" print(\"retrieving opinion\")\n",
" # Use LLM to identify potential viewpoints\n",
" viewpoints_prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"k\"],\n",
" template=\"Identify {k} distinct viewpoints or perspectives on the topic: {query}\"\n",
" )\n",
" viewpoints_chain = viewpoints_prompt | self.llm\n",
" input_data = {\"query\": query, \"k\": k}\n",
" viewpoints = viewpoints_chain.invoke(input_data).content.split('\\n')\n",
" print(f'viewpoints: {viewpoints}')\n",
"\n",
" all_docs = []\n",
" for viewpoint in viewpoints:\n",
" all_docs.extend(self.db.similarity_search(f\"{query} {viewpoint}\", k=2))\n",
"\n",
" # Use LLM to classify and select diverse opinions\n",
" opinion_prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"docs\", \"k\"],\n",
" template=\"Classify these documents into distinct opinions on '{query}' and select the {k} most representative and diverse viewpoints:\\nDocuments: {docs}\\nSelected indices:\"\n",
" )\n",
" opinion_chain = opinion_prompt | self.llm.with_structured_output(SelectedIndices)\n",
" \n",
" docs_text = \"\\n\".join([f\"{i}: {doc.page_content[:100]}...\" for i, doc in enumerate(all_docs)])\n",
" input_data = {\"query\": query, \"docs\": docs_text, \"k\": k}\n",
" selected_indices = opinion_chain.invoke(input_data).indices\n",
" print(f'selected diverse and relevant documents')\n",
" \n",
" return [all_docs[int(i)] for i in selected_indices.split() if i.isdigit() and int(i) < len(all_docs)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define Contextual retriever strategy"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"class ContextualRetrievalStrategy(BaseRetrievalStrategy):\n",
" def retrieve(self, query, k=4, user_context=None):\n",
" print(\"retrieving contextual\")\n",
" # Use LLM to incorporate user context into the query\n",
" context_prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"context\"],\n",
" template=\"Given the user context: {context}\\nReformulate the query to best address the user's needs: {query}\"\n",
" )\n",
" context_chain = context_prompt | self.llm\n",
" input_data = {\"query\": query, \"context\": user_context or \"No specific context provided\"}\n",
" contextualized_query = context_chain.invoke(input_data).content\n",
" print(f'contextualized query: {contextualized_query}')\n",
"\n",
" # Retrieve documents using the contextualized query\n",
" docs = self.db.similarity_search(contextualized_query, k=k*2)\n",
"\n",
" # Use LLM to rank the relevance of retrieved documents considering the user context\n",
" ranking_prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"context\", \"doc\"],\n",
" template=\"Given the query: '{query}' and user context: '{context}', rate the relevance of this document on a scale of 1-10:\\nDocument: {doc}\\nRelevance score:\"\n",
" )\n",
" ranking_chain = ranking_prompt | self.llm.with_structured_output(relevant_score)\n",
" print(\"ranking docs\")\n",
"\n",
" ranked_docs = []\n",
" for doc in docs:\n",
" input_data = {\"query\": contextualized_query, \"context\": user_context or \"No specific context provided\", \"doc\": doc.page_content}\n",
" score = float(ranking_chain.invoke(input_data).score)\n",
" ranked_docs.append((doc, score))\n",
"\n",
"\n",
" # Sort by relevance score and return top k\n",
" ranked_docs.sort(key=lambda x: x[1], reverse=True)\n",
"\n",
" return [doc for doc, _ in ranked_docs[:k]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define the Adapive retriever class"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [],
"source": [
"class AdaptiveRetriever:\n",
" def __init__(self, texts: List[str]):\n",
" self.classifier = QueryClassifier()\n",
" self.strategies = {\n",
" \"Factual\": FactualRetrievalStrategy(texts),\n",
" \"Analytical\": AnalyticalRetrievalStrategy(texts),\n",
" \"Opinion\": OpinionRetrievalStrategy(texts),\n",
" \"Contextual\": ContextualRetrievalStrategy(texts)\n",
" }\n",
"\n",
" def get_relevant_documents(self, query: str) -> List[Document]:\n",
" category = self.classifier.classify(query)\n",
" strategy = self.strategies[category]\n",
" return strategy.retrieve(query)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define aditional retriever that inherits from langchain BaseRetriever "
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"class PydanticAdaptiveRetriever(BaseRetriever):\n",
" adaptive_retriever: AdaptiveRetriever = Field(exclude=True)\n",
"\n",
" class Config:\n",
" arbitrary_types_allowed = True\n",
"\n",
" def get_relevant_documents(self, query: str) -> List[Document]:\n",
" return self.adaptive_retriever.get_relevant_documents(query)\n",
"\n",
" async def aget_relevant_documents(self, query: str) -> List[Document]:\n",
" return self.get_relevant_documents(query)"
]
},
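{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note: on recent versions of `langchain-core`, custom retrievers are expected to implement the private `_get_relevant_documents` hook and are called via `retriever.invoke(query)`, rather than overriding `get_relevant_documents` directly as above. The cell below is a sketch of that variant; it assumes a newer `langchain-core` where `BaseRetriever` is a Pydantic v2 model (so `Field` comes from `pydantic`). Keep whichever version matches your installed packages."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch for newer langchain-core versions (assumption: your installed version deprecates\n",
"# overriding get_relevant_documents directly on BaseRetriever subclasses).\n",
"from typing import List\n",
"from pydantic import Field\n",
"from langchain_core.callbacks import CallbackManagerForRetrieverRun\n",
"from langchain_core.documents import Document\n",
"from langchain_core.retrievers import BaseRetriever\n",
"\n",
"class PydanticAdaptiveRetrieverV2(BaseRetriever):\n",
"    adaptive_retriever: AdaptiveRetriever = Field(exclude=True)\n",
"\n",
"    class Config:\n",
"        arbitrary_types_allowed = True\n",
"\n",
"    def _get_relevant_documents(\n",
"        self, query: str, *, run_manager: CallbackManagerForRetrieverRun\n",
"    ) -> List[Document]:\n",
"        return self.adaptive_retriever.get_relevant_documents(query)\n",
"\n",
"# Usage (illustrative): PydanticAdaptiveRetrieverV2(adaptive_retriever=AdaptiveRetriever(texts)).invoke(query)"
]
},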
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define the Adaptive RAG class"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"class AdaptiveRAG:\n",
" def __init__(self, texts: List[str]):\n",
" adaptive_retriever = AdaptiveRetriever(texts)\n",
" self.retriever = PydanticAdaptiveRetriever(adaptive_retriever=adaptive_retriever)\n",
" self.llm = ChatOpenAI(temperature=0, model_name=\"gpt-4o\", max_tokens=4000)\n",
" \n",
" # Create a custom prompt\n",
" prompt_template = \"\"\"Use the following pieces of context to answer the question at the end. \n",
" If you don't know the answer, just say that you don't know, don't try to make up an answer.\n",
"\n",
" {context}\n",
"\n",
" Question: {question}\n",
" Answer:\"\"\"\n",
" prompt = PromptTemplate(template=prompt_template, input_variables=[\"context\", \"question\"])\n",
" \n",
" # Create the LLM chain\n",
" self.llm_chain = prompt | self.llm\n",
" \n",
" \n",
"\n",
" def answer(self, query: str) -> str:\n",
" docs = self.retriever.get_relevant_documents(query)\n",
" input_data = {\"context\": \"\\n\".join([doc.page_content for doc in docs]), \"question\": query}\n",
" return self.llm_chain.invoke(input_data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Demonstrate use of this model"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"# Usage\n",
"texts = [\n",
" \"The Earth is the third planet from the Sun and the only astronomical object known to harbor life.\"\n",
" ]\n",
"rag_system = AdaptiveRAG(texts)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Showcase the four different types of queries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"factual_result = rag_system.answer(\"What is the distance between the Earth and the Sun?\").content\n",
"print(f\"Answer: {factual_result}\")\n",
"\n",
"analytical_result = rag_system.answer(\"How does the Earth's distance from the Sun affect its climate?\").content\n",
"print(f\"Answer: {analytical_result}\")\n",
"\n",
"opinion_result = rag_system.answer(\"What are the different theories about the origin of life on Earth?\").content\n",
"print(f\"Answer: {opinion_result}\")\n",
"\n",
"contextual_result = rag_system.answer(\"How does the Earth's position in the Solar System influence its habitability?\").content\n",
"print(f\"Answer: {contextual_result}\")"
]
}
],
"metadata": {
"colab": {
"name": "",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```
## /all_rag_techniques/choose_chunk_size.ipynb
```ipynb path="/all_rag_techniques/choose_chunk_size.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/choose_chunk_size.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install llama-index openai python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"import random\n",
"\n",
"nest_asyncio.apply()\n",
"from dotenv import load_dotenv\n",
"\n",
"from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
"from llama_index.core.prompts import PromptTemplate\n",
"\n",
"from llama_index.core.evaluation import (\n",
" DatasetGenerator,\n",
" FaithfulnessEvaluator,\n",
" RelevancyEvaluator\n",
")\n",
"from llama_index.llms.openai import OpenAI\n",
"from llama_index.core import Settings\n",
"\n",
"import openai\n",
"import time\n",
"import os\n",
"load_dotenv()\n",
"openai.api_key = os.getenv(\"OPENAI_API_KEY\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Read Docs"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"data_dir = \"../data\"\n",
"documents = SimpleDirectoryReader(data_dir).load_data()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create evaluation questions and pick k out of them"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"num_eval_questions = 25\n",
"\n",
"eval_documents = documents[0:20]\n",
"data_generator = DatasetGenerator.from_documents(eval_documents)\n",
"eval_questions = data_generator.generate_questions_from_nodes()\n",
"k_eval_questions = random.sample(eval_questions, num_eval_questions)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define metrics evaluators and modify llama_index faithfullness evaluator prompt to rely on the context "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# We will use GPT-4 for evaluating the responses\n",
"gpt4 = OpenAI(temperature=0, model=\"gpt-4o\")\n",
"\n",
"# Set appropriate settings for the LLM\n",
"Settings.llm = gpt4\n",
"\n",
"# Define Faithfulness Evaluators which are based on GPT-4\n",
"faithfulness_gpt4 = FaithfulnessEvaluator()\n",
"\n",
"faithfulness_new_prompt_template = PromptTemplate(\"\"\" Please tell if a given piece of information is directly supported by the context.\n",
" You need to answer with either YES or NO.\n",
" Answer YES if any part of the context explicitly supports the information, even if most of the context is unrelated. If the context does not explicitly support the information, answer NO. Some examples are provided below.\n",
"\n",
" Information: Apple pie is generally double-crusted.\n",
" Context: An apple pie is a fruit pie in which the principal filling ingredient is apples.\n",
" Apple pie is often served with whipped cream, ice cream ('apple pie à la mode'), custard, or cheddar cheese.\n",
" It is generally double-crusted, with pastry both above and below the filling; the upper crust may be solid or latticed (woven of crosswise strips).\n",
" Answer: YES\n",
"\n",
" Information: Apple pies taste bad.\n",
" Context: An apple pie is a fruit pie in which the principal filling ingredient is apples.\n",
" Apple pie is often served with whipped cream, ice cream ('apple pie à la mode'), custard, or cheddar cheese.\n",
" It is generally double-crusted, with pastry both above and below the filling; the upper crust may be solid or latticed (woven of crosswise strips).\n",
" Answer: NO\n",
"\n",
" Information: Paris is the capital of France.\n",
" Context: This document describes a day trip in Paris. You will visit famous landmarks like the Eiffel Tower, the Louvre Museum, and Notre-Dame Cathedral.\n",
" Answer: NO\n",
"\n",
" Information: {query_str}\n",
" Context: {context_str}\n",
" Answer:\n",
"\n",
" \"\"\")\n",
"\n",
"faithfulness_gpt4.update_prompts({\"your_prompt_key\": faithfulness_new_prompt_template}) # Update the prompts dictionary with the new prompt template\n",
"\n",
"# Define Relevancy Evaluators which are based on GPT-4\n",
"relevancy_gpt4 = RelevancyEvaluator()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Function to evaluate metrics for each chunk size"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"# Define function to calculate average response time, average faithfulness and average relevancy metrics for given chunk size\n",
"# We use GPT-3.5-Turbo to generate response and GPT-4 to evaluate it.\n",
"def evaluate_response_time_and_accuracy(chunk_size, eval_questions):\n",
" \"\"\"\n",
" Evaluate the average response time, faithfulness, and relevancy of responses generated by GPT-3.5-turbo for a given chunk size.\n",
" \n",
" Parameters:\n",
" chunk_size (int): The size of data chunks being processed.\n",
" \n",
" Returns:\n",
" tuple: A tuple containing the average response time, faithfulness, and relevancy metrics.\n",
" \"\"\"\n",
"\n",
" total_response_time = 0\n",
" total_faithfulness = 0\n",
" total_relevancy = 0\n",
"\n",
" # create vector index\n",
" llm = OpenAI(model=\"gpt-3.5-turbo\")\n",
"\n",
" Settings.llm = llm\n",
" Settings.chunk_size = chunk_size\n",
" Settings.chunk_overlap = chunk_size // 5 \n",
"\n",
" vector_index = VectorStoreIndex.from_documents(eval_documents)\n",
" \n",
" # build query engine\n",
" query_engine = vector_index.as_query_engine(similarity_top_k=5)\n",
" num_questions = len(eval_questions)\n",
"\n",
" # Iterate over each question in eval_questions to compute metrics.\n",
" # While BatchEvalRunner can be used for faster evaluations (see: https://docs.llamaindex.ai/en/latest/examples/evaluation/batch_eval.html),\n",
" # we're using a loop here to specifically measure response time for different chunk sizes.\n",
" for question in eval_questions:\n",
" start_time = time.time()\n",
" response_vector = query_engine.query(question)\n",
" elapsed_time = time.time() - start_time\n",
" \n",
" faithfulness_result = faithfulness_gpt4.evaluate_response(\n",
" response=response_vector\n",
" ).passing\n",
" \n",
" relevancy_result = relevancy_gpt4.evaluate_response(\n",
" query=question, response=response_vector\n",
" ).passing\n",
"\n",
" total_response_time += elapsed_time\n",
" total_faithfulness += faithfulness_result\n",
" total_relevancy += relevancy_result\n",
"\n",
" average_response_time = total_response_time / num_questions\n",
" average_faithfulness = total_faithfulness / num_questions\n",
" average_relevancy = total_relevancy / num_questions\n",
"\n",
" return average_response_time, average_faithfulness, average_relevancy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Test different chunk sizes "
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\N7\\AppData\\Local\\Temp\\ipykernel_22672\\1178342312.py:21: DeprecationWarning: Call to deprecated class method from_defaults. (ServiceContext is deprecated, please use `llama_index.settings.Settings` instead.) -- Deprecated since version 0.10.0.\n",
" service_context = ServiceContext.from_defaults(llm=llm, chunk_size=chunk_size, chunk_overlap=chunk_size//5)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Chunk size 128 - Average Response time: 1.35s, Average Faithfulness: 1.00, Average Relevancy: 1.00\n",
"Chunk size 256 - Average Response time: 1.31s, Average Faithfulness: 1.00, Average Relevancy: 1.00\n"
]
}
],
"source": [
"chunk_sizes = [128, 256]\n",
"\n",
"for chunk_size in chunk_sizes:\n",
" avg_response_time, avg_faithfulness, avg_relevancy = evaluate_response_time_and_accuracy(chunk_size, k_eval_questions)\n",
" print(f\"Chunk size {chunk_size} - Average Response time: {avg_response_time:.2f}s, Average Faithfulness: {avg_faithfulness:.2f}, Average Relevancy: {avg_relevancy:.2f}\")"
]
}
],
"metadata": {
"colab": {
"name": "",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```
## /all_rag_techniques/context_enrichment_window_around_chunk.ipynb
```ipynb path="/all_rag_techniques/context_enrichment_window_around_chunk.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/context_enrichment_window_around_chunk.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Context Enrichment Window for Document Retrieval\n",
"\n",
"## Overview\n",
"\n",
"This code implements a context enrichment window technique for document retrieval in a vector database. It enhances the standard retrieval process by adding surrounding context to each retrieved chunk, improving the coherence and completeness of the returned information.\n",
"\n",
"## Motivation\n",
"\n",
"Traditional vector search often returns isolated chunks of text, which may lack necessary context for full understanding. This approach aims to provide a more comprehensive view of the retrieved information by including neighboring text chunks.\n",
"\n",
"## Key Components\n",
"\n",
"1. PDF processing and text chunking\n",
"2. Vector store creation using FAISS and OpenAI embeddings\n",
"3. Custom retrieval function with context window\n",
"4. Comparison between standard and context-enriched retrieval\n",
"\n",
"## Method Details\n",
"\n",
"### Document Preprocessing\n",
"\n",
"1. The PDF is read and converted to a string.\n",
"2. The text is split into chunks with overlap, each chunk tagged with its index.\n",
"\n",
"### Vector Store Creation\n",
"\n",
"1. OpenAI embeddings are used to create vector representations of the chunks.\n",
"2. A FAISS vector store is created from these embeddings.\n",
"\n",
"### Context-Enriched Retrieval\n",
"\n",
"1. The `retrieve_with_context_overlap` function performs the following steps:\n",
" - Retrieves relevant chunks based on the query\n",
" - For each relevant chunk, fetches neighboring chunks\n",
" - Concatenates the chunks, accounting for overlap\n",
" - Returns the expanded context for each relevant chunk\n",
"\n",
"### Retrieval Comparison\n",
"\n",
"The notebook includes a section to compare standard retrieval with the context-enriched approach.\n",
"\n",
"## Benefits of this Approach\n",
"\n",
"1. Provides more coherent and contextually rich results\n",
"2. Maintains the advantages of vector search while mitigating its tendency to return isolated text fragments\n",
"3. Allows for flexible adjustment of the context window size\n",
"\n",
"## Conclusion\n",
"\n",
"This context enrichment window technique offers a promising way to improve the quality of retrieved information in vector-based document search systems. By providing surrounding context, it helps maintain the coherence and completeness of the retrieved information, potentially leading to better understanding and more accurate responses in downstream tasks such as question answering."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install langchain python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Clone the repository to access helper functions and evaluation modules\n",
"!git clone https://github.com/N7/RAG_TECHNIQUES.git\n",
"import sys\n",
"sys.path.append('RAG_TECHNIQUES')\n",
"# If you need to run with the latest data\n",
"# !cp -r RAG_TECHNIQUES/data ."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\Users\\N7\\PycharmProjects\\llm_tasks\\RAG_TECHNIQUES\\.venv\\Lib\\site-packages\\deepeval\\__init__.py:45: UserWarning: You are using deepeval version 0.21.73, however version 0.21.78 is available. You should consider upgrading via the \"pip install --upgrade deepeval\" command.\n",
" warnings.warn(\n"
]
}
],
"source": [
"import os\n",
"import sys\n",
"from dotenv import load_dotenv\n",
"from langchain.docstore.document import Document\n",
"\n",
"\n",
"# Original path append replaced for Colab compatibility\n",
"from helper_functions import *\n",
"from evaluation.evalute_rag import *\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"\n",
"# Set the OpenAI API key environment variable\n",
"os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define path to PDF"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download required data files\n",
"import os\n",
"os.makedirs('data', exist_ok=True)\n",
"\n",
"# Download the PDF document used in this notebook\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"path = \"data/Understanding_Climate_Change.pdf\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Read PDF to string"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"content = read_pdf_to_string(path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Function to split text into chunks with metadata of the chunk chronological index"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"def split_text_to_chunks_with_indices(text: str, chunk_size: int, chunk_overlap: int) -> List[Document]:\n",
" chunks = []\n",
" start = 0\n",
" while start < len(text):\n",
" end = start + chunk_size\n",
" chunk = text[start:end]\n",
" chunks.append(Document(page_content=chunk, metadata={\"index\": len(chunks), \"text\": text}))\n",
" start += chunk_size - chunk_overlap\n",
" return chunks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Split our document accordingly"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"chunks_size = 400\n",
"chunk_overlap = 200\n",
"docs = split_text_to_chunks_with_indices(content, chunks_size, chunk_overlap)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create vector store and retriever"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"embeddings = OpenAIEmbeddings()\n",
"vectorstore = FAISS.from_documents(docs, embeddings)\n",
"chunks_query_retriever = vectorstore.as_retriever(search_kwargs={\"k\": 1})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Function to draw the kth chunk (in the original order) from the vector store \n"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"def get_chunk_by_index(vectorstore, target_index: int) -> Document:\n",
" \"\"\"\n",
" Retrieve a chunk from the vectorstore based on its index in the metadata.\n",
" \n",
" Args:\n",
" vectorstore (VectorStore): The vectorstore containing the chunks.\n",
" target_index (int): The index of the chunk to retrieve.\n",
" \n",
" Returns:\n",
" Optional[Document]: The retrieved chunk as a Document object, or None if not found.\n",
" \"\"\"\n",
" # This is a simplified version. In practice, you might need a more efficient method\n",
" # to retrieve chunks by index, depending on your vectorstore implementation.\n",
" all_docs = vectorstore.similarity_search(\"\", k=vectorstore.index.ntotal)\n",
" for doc in all_docs:\n",
" if doc.metadata.get('index') == target_index:\n",
" return doc\n",
" return None"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check the function"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Understanding Climate Change \n",
"Chapter 1: Introduction to Climate Change \n",
"Climate change refers to significant, long-term changes in the global climate. The term \n",
"\"global climate\" encompasses the planet's overall weather patterns, including temperature, \n",
"precipitation, and wind patterns, over an extended period. Over the past century, human \n",
"activities, particularly the burning of fossil fuels and \n"
]
}
],
"source": [
"chunk = get_chunk_by_index(vectorstore, 0)\n",
"print(chunk.page_content)"
]
},
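{
"cell_type": "markdown",
"metadata": {},
"source": [
"The helper above issues a similarity search with an empty query just to look a chunk up by its metadata index, which is slow and wasteful. Below is a sketch of a faster lookup that walks the docstore directly; it assumes you are using LangChain's FAISS wrapper (as in this notebook), which exposes `index_to_docstore_id` and an in-memory `docstore`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_chunk_by_index_via_docstore(vectorstore, target_index: int) -> Document:\n",
"    \"\"\"\n",
"    Faster variant of get_chunk_by_index: scan the FAISS wrapper's in-memory docstore\n",
"    instead of running a similarity search with an empty query.\n",
"    (Assumes LangChain's FAISS vectorstore, which keeps index_to_docstore_id and a docstore.)\n",
"    \"\"\"\n",
"    for i in range(vectorstore.index.ntotal):\n",
"        doc = vectorstore.docstore.search(vectorstore.index_to_docstore_id[i])\n",
"        if isinstance(doc, Document) and doc.metadata.get('index') == target_index:\n",
"            return doc\n",
"    return None\n",
"\n",
"# Sanity check: should match the output of get_chunk_by_index above\n",
"print(get_chunk_by_index_via_docstore(vectorstore, 0).page_content[:100])"
]
},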
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Function that retrieves from the vector stroe based on semantic similarity and then pads each retrieved chunk with its num_neighbors before and after, taking into account the chunk overlap to construct a meaningful wide window arround it"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [],
"source": [
"def retrieve_with_context_overlap(vectorstore, retriever, query: str, num_neighbors: int = 1, chunk_size: int = 200, chunk_overlap: int = 20) -> List[str]:\n",
" \"\"\"\n",
" Retrieve chunks based on a query, then fetch neighboring chunks and concatenate them, \n",
" accounting for overlap and correct indexing.\n",
"\n",
" Args:\n",
" vectorstore (VectorStore): The vectorstore containing the chunks.\n",
" retriever: The retriever object to get relevant documents.\n",
" query (str): The query to search for relevant chunks.\n",
" num_neighbors (int): The number of chunks to retrieve before and after each relevant chunk.\n",
" chunk_size (int): The size of each chunk when originally split.\n",
" chunk_overlap (int): The overlap between chunks when originally split.\n",
"\n",
" Returns:\n",
" List[str]: List of concatenated chunk sequences, each centered on a relevant chunk.\n",
" \"\"\"\n",
" relevant_chunks = retriever.get_relevant_documents(query)\n",
" result_sequences = []\n",
"\n",
" for chunk in relevant_chunks:\n",
" current_index = chunk.metadata.get('index')\n",
" if current_index is None:\n",
" continue\n",
"\n",
" # Determine the range of chunks to retrieve\n",
" start_index = max(0, current_index - num_neighbors)\n",
" end_index = current_index + num_neighbors + 1 # +1 because range is exclusive at the end\n",
"\n",
" # Retrieve all chunks in the range\n",
" neighbor_chunks = []\n",
" for i in range(start_index, end_index):\n",
" neighbor_chunk = get_chunk_by_index(vectorstore, i)\n",
" if neighbor_chunk:\n",
" neighbor_chunks.append(neighbor_chunk)\n",
"\n",
" # Sort chunks by their index to ensure correct order\n",
" neighbor_chunks.sort(key=lambda x: x.metadata.get('index', 0))\n",
"\n",
" # Concatenate chunks, accounting for overlap\n",
" concatenated_text = neighbor_chunks[0].page_content\n",
" for i in range(1, len(neighbor_chunks)):\n",
" current_chunk = neighbor_chunks[i].page_content\n",
" overlap_start = max(0, len(concatenated_text) - chunk_overlap)\n",
" concatenated_text = concatenated_text[:overlap_start] + current_chunk\n",
"\n",
" result_sequences.append(concatenated_text)\n",
"\n",
" return result_sequences"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Comparing regular retrival and retrival with context window"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Baseline approach\n",
"query = \"Explain the role of deforestation and fossil fuels in climate change.\"\n",
"baseline_chunk = chunks_query_retriever.get_relevant_documents(query\n",
" ,\n",
" k=1\n",
")\n",
"# Focused context enrichment approach\n",
"enriched_chunks = retrieve_with_context_overlap(\n",
" vectorstore,\n",
" chunks_query_retriever,\n",
" query,\n",
" num_neighbors=1,\n",
" chunk_size=400,\n",
" chunk_overlap=200\n",
")\n",
"\n",
"print(\"Baseline Chunk:\")\n",
"print(baseline_chunk[0].page_content)\n",
"print(\"\\nEnriched Chunks:\")\n",
"print(enriched_chunks[0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### An example that showcases the superiority of additional context window"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Regular retrieval:\n",
"\n",
"Context 1:\n",
"\n",
"Deep Learning, a subset of machine learning using neural networks with many layers, began to show promising results in the early 2010s. The breakthrough came in 2012 when a deep neural network significantly outperformed other machine learning method\n",
"\n",
"\n",
"\n",
"Retrieval with context overlap:\n",
"\n",
"Context 1:\n",
"ng multi-layer networks during this time.\n",
"\n",
"The late 1990s and 2000s marked the rise of machine learning approaches. Support Vector Machines (SVMs) and Random Forests became popular for various classification and regression tasks.\n",
"\n",
"Deep Learning, a subset of machine learning using neural networks with many layers, began to show promising results in the early 2010s. The breakthrough came in 2012 when a deep neural network significantly outperformed other machine learning methods in the ImageNet competition.\n",
"\n",
"Since then, deep learning has revolutionized many AI applications, including image and speech recognition, natural language processing, and game playing. In 2016, Google's AlphaGo defeated a world c\n",
"\n",
"\n"
]
}
],
"source": [
"\n",
"document_content = \"\"\"\n",
"Artificial Intelligence (AI) has a rich history dating back to the mid-20th century. The term \"Artificial Intelligence\" was coined in 1956 at the Dartmouth Conference, marking the field's official beginning.\n",
"\n",
"In the 1950s and 1960s, AI research focused on symbolic methods and problem-solving. The Logic Theorist, created in 1955 by Allen Newell and Herbert A. Simon, is often considered the first AI program.\n",
"\n",
"The 1960s saw the development of expert systems, which used predefined rules to solve complex problems. DENDRAL, created in 1965, was one of the first expert systems, designed to analyze chemical compounds.\n",
"\n",
"However, the 1970s brought the first \"AI Winter,\" a period of reduced funding and interest in AI research, largely due to overpromised capabilities and underdelivered results.\n",
"\n",
"The 1980s saw a resurgence with the popularization of expert systems in corporations. The Japanese government's Fifth Generation Computer Project also spurred increased investment in AI research globally.\n",
"\n",
"Neural networks gained prominence in the 1980s and 1990s. The backpropagation algorithm, although discovered earlier, became widely used for training multi-layer networks during this time.\n",
"\n",
"The late 1990s and 2000s marked the rise of machine learning approaches. Support Vector Machines (SVMs) and Random Forests became popular for various classification and regression tasks.\n",
"\n",
"Deep Learning, a subset of machine learning using neural networks with many layers, began to show promising results in the early 2010s. The breakthrough came in 2012 when a deep neural network significantly outperformed other machine learning methods in the ImageNet competition.\n",
"\n",
"Since then, deep learning has revolutionized many AI applications, including image and speech recognition, natural language processing, and game playing. In 2016, Google's AlphaGo defeated a world champion Go player, a landmark achievement in AI.\n",
"\n",
"The current era of AI is characterized by the integration of deep learning with other AI techniques, the development of more efficient and powerful hardware, and the ethical considerations surrounding AI deployment.\n",
"\n",
"Transformers, introduced in 2017, have become a dominant architecture in natural language processing, enabling models like GPT (Generative Pre-trained Transformer) to generate human-like text.\n",
"\n",
"As AI continues to evolve, new challenges and opportunities arise. Explainable AI, robust and fair machine learning, and artificial general intelligence (AGI) are among the key areas of current and future research in the field.\n",
"\"\"\"\n",
"\n",
"chunks_size = 250\n",
"chunk_overlap = 20\n",
"document_chunks = split_text_to_chunks_with_indices(document_content, chunks_size, chunk_overlap)\n",
"document_vectorstore = FAISS.from_documents(document_chunks, embeddings)\n",
"document_retriever = document_vectorstore.as_retriever(search_kwargs={\"k\": 1})\n",
"\n",
"query = \"When did deep learning become prominent in AI?\"\n",
"context = document_retriever.get_relevant_documents(query)\n",
"context_pages_content = [doc.page_content for doc in context]\n",
"\n",
"print(\"Regular retrieval:\\n\")\n",
"show_context(context_pages_content)\n",
"\n",
"sequences = retrieve_with_context_overlap(document_vectorstore, document_retriever, query, num_neighbors=1)\n",
"print(\"\\nRetrieval with context enrichment:\\n\")\n",
"show_context(sequences)"
]
}
],
"metadata": {
"colab": {
"name": "",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```
## /all_rag_techniques/context_enrichment_window_around_chunk_with_llamaindex.ipynb
```ipynb path="/all_rag_techniques/context_enrichment_window_around_chunk_with_llamaindex.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/context_enrichment_window_around_chunk_with_llamaindex.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Context Enrichment Window for Document Retrieval\n",
"\n",
"## Overview\n",
"\n",
"This code implements a context enrichment window technique for document retrieval in a vector database. It enhances the standard retrieval process by adding surrounding context to each retrieved chunk, improving the coherence and completeness of the returned information.\n",
"\n",
"## Motivation\n",
"\n",
"Traditional vector search often returns isolated chunks of text, which may lack necessary context for full understanding. This approach aims to provide a more comprehensive view of the retrieved information by including neighboring text chunks.\n",
"\n",
"## Key Components\n",
"\n",
"1. PDF processing and text chunking\n",
"2. Vector store creation using FAISS and OpenAI embeddings\n",
"3. Custom retrieval function with context window\n",
"4. Comparison between standard and context-enriched retrieval\n",
"\n",
"## Method Details\n",
"\n",
"### Document Preprocessing\n",
"\n",
"1. The PDF is read and converted to a string.\n",
"2. The text is split into chunks with surrounding sentences\n",
"\n",
"### Vector Store Creation\n",
"\n",
"1. OpenAI embeddings are used to create vector representations of the chunks.\n",
"2. A FAISS vector store is created from these embeddings.\n",
"\n",
"### Context-Enriched Retrieval\n",
"\n",
"LlamaIndex has a special parser for such task. [SentenceWindowNodeParser](https://docs.llamaindex.ai/en/stable/module_guides/loading/node_parsers/modules/#sentencewindownodeparser) this parser splits documents into sentences. But the resulting nodes inculde the surronding senteces with a relation structure. Then, on the query [MetadataReplacementPostProcessor](https://docs.llamaindex.ai/en/stable/module_guides/querying/node_postprocessors/node_postprocessors/#metadatareplacementpostprocessor) helps connecting back these related sentences.\n",
"\n",
"### Retrieval Comparison\n",
"\n",
"The notebook includes a section to compare standard retrieval with the context-enriched approach.\n",
"\n",
"## Benefits of this Approach\n",
"\n",
"1. Provides more coherent and contextually rich results\n",
"2. Maintains the advantages of vector search while mitigating its tendency to return isolated text fragments\n",
"3. Allows for flexible adjustment of the context window size\n",
"\n",
"## Conclusion\n",
"\n",
"This context enrichment window technique offers a promising way to improve the quality of retrieved information in vector-based document search systems. By providing surrounding context, it helps maintain the coherence and completeness of the retrieved information, potentially leading to better understanding and more accurate responses in downstream tasks such as question answering."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install faiss-cpu llama-index python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core import Settings\n",
"from llama_index.llms.openai import OpenAI\n",
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
"from llama_index.core.readers import SimpleDirectoryReader\n",
"from llama_index.vector_stores.faiss import FaissVectorStore\n",
"from llama_index.core.ingestion import IngestionPipeline\n",
"from llama_index.core.node_parser import SentenceWindowNodeParser, SentenceSplitter\n",
"from llama_index.core import VectorStoreIndex\n",
"from llama_index.core.postprocessor import MetadataReplacementPostProcessor\n",
"import faiss\n",
"import os\n",
"import sys\n",
"from dotenv import load_dotenv\n",
"from pprint import pprint\n",
"\n",
"# Original path append replaced for Colab compatibility\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"\n",
"# Set the OpenAI API key environment variable\n",
"os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY')\n",
"\n",
"# Llamaindex global settings for llm and embeddings\n",
"EMBED_DIMENSION=512\n",
"Settings.llm = OpenAI(model=\"gpt-3.5-turbo\")\n",
"Settings.embed_model = OpenAIEmbedding(model=\"text-embedding-3-small\", dimensions=EMBED_DIMENSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Read docs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download required data files\n",
"import os\n",
"os.makedirs('data', exist_ok=True)\n",
"\n",
"# Download the PDF document used in this notebook\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"path = \"data/\"\n",
"reader = SimpleDirectoryReader(input_dir=path, required_exts=['.pdf'])\n",
"documents = reader.load_data()\n",
"print(documents[0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create vector store and retriever"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create FaisVectorStore to store embeddings\n",
"fais_index = faiss.IndexFlatL2(EMBED_DIMENSION)\n",
"vector_store = FaissVectorStore(faiss_index=fais_index)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Ingestion Pipelines"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Ingestion Pipeline with Sentence Splitter"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"base_pipeline = IngestionPipeline(\n",
" transformations=[SentenceSplitter()],\n",
" vector_store=vector_store\n",
")\n",
"\n",
"base_nodes = base_pipeline.run(documents=documents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Ingestion Pipeline with Sentence Window"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"node_parser = SentenceWindowNodeParser(\n",
" # How many sentences on both sides to capture. \n",
" # Setting this to 3 results in 7 sentences.\n",
" window_size=3,\n",
" # the metadata key for to be used in MetadataReplacementPostProcessor\n",
" window_metadata_key=\"window\",\n",
" # the metadata key that holds the original sentence\n",
" original_text_metadata_key=\"original_sentence\"\n",
")\n",
"\n",
"# Create a pipeline with defined document transformations and vectorstore\n",
"pipeline = IngestionPipeline(\n",
" transformations=[node_parser],\n",
" vector_store=vector_store,\n",
")\n",
"\n",
"windowed_nodes = pipeline.run(documents=documents)"
]
},
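{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check, we can look at the metadata that `SentenceWindowNodeParser` attached to one of the nodes: the `window` and `original_sentence` keys configured above should both be present."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Inspect the window metadata produced by SentenceWindowNodeParser\n",
"pprint(windowed_nodes[0].metadata)"
]
},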
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Querying"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"Explain the role of deforestation and fossil fuels in climate change\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Querying *without* Metadata Replacement "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create vector index from base nodes\n",
"base_index = VectorStoreIndex(base_nodes)\n",
"\n",
"# Instantiate query engine from vector index\n",
"base_query_engine = base_index.as_query_engine(\n",
" similarity_top_k=1,\n",
")\n",
"\n",
"# Send query to the engine to get related node(s)\n",
"base_response = base_query_engine.query(query)\n",
"\n",
"print(base_response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Print Metadata of the Retrieved Node"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pprint(base_response.source_nodes[0].node.metadata)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Querying with Metadata Replacement\n",
"\"Metadata replacement\" intutively might sound a little off topic since we're working on the base sentences. But LlamaIndex stores these \"before/after sentences\" in the metadata data of the nodes. Therefore to build back up these windows of sentences we need Metadata replacement post processor."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create window index from nodes created from SentenceWindowNodeParser\n",
"windowed_index = VectorStoreIndex(windowed_nodes)\n",
"\n",
"# Instantiate query enine with MetadataReplacementPostProcessor\n",
"windowed_query_engine = windowed_index.as_query_engine(\n",
" similarity_top_k=1,\n",
" node_postprocessors=[\n",
" MetadataReplacementPostProcessor(\n",
" target_metadata_key=\"window\" # `window_metadata_key` key defined in SentenceWindowNodeParser\n",
" )\n",
" ],\n",
")\n",
"\n",
"# Send query to the engine to get related node(s)\n",
"windowed_response = windowed_query_engine.query(query)\n",
"\n",
"print(windowed_response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Print Metadata of the Retrieved Node"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Window and original sentence are added to the metadata\n",
"pprint(windowed_response.source_nodes[0].node.metadata)"
]
}
],
"metadata": {
"colab": {
"name": "",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```
## /all_rag_techniques/contextual_chunk_headers.ipynb
```ipynb path="/all_rag_techniques/contextual_chunk_headers.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/contextual_chunk_headers.ipynb)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Contextual Chunk Headers (CCH)\n",
"\n",
"## Overview\n",
"\n",
"Contextual chunk headers (CCH) is a method of creating chunk headers that contain higher-level context (such as document-level or section-level context), and prepending those chunk headers to the chunks prior to embedding them. This gives the embeddings a much more accurate and complete representation of the content and meaning of the text. In our testing, this feature leads to a substantial improvement in retrieval quality. In addition to increasing the rate at which the correct information is retrieved, CCH also reduces the rate at which irrelevant results show up in the search results. This reduces the rate at which the LLM misinterprets a piece of text in downstream chat and generation applications.\n",
"\n",
"## Motivation\n",
"\n",
"Many of the problems developers face with RAG come down to this: Individual chunks oftentimes do not contain sufficient context to be properly used by the retrieval system or the LLM. This leads to the inability to answer questions and, more worryingly, hallucinations.\n",
"\n",
"Examples of this problem\n",
"- Chunks oftentimes refer to their subject via implicit references and pronouns. This causes them to not be retrieved when they should be, or to not be properly understood by the LLM.\n",
"- Individual chunks oftentimes only make sense in the context of the entire section or document, and can be misleading when read on their own.\n",
"\n",
"## Key Components\n",
"\n",
"#### Contextual chunk headers\n",
"The idea here is to add in higher-level context to the chunk by prepending a chunk header. This chunk header could be as simple as just the document title, or it could use a combination of document title, a concise document summary, and the full hierarchy of section and sub-section titles.\n",
"\n",
"## Method Details\n",
"\n",
"#### Context generation\n",
"In the demonstration below we use an LLM to generate a descriptive title for the document. This is done through a simple prompt where you pass in a truncated version of the document text and ask the LLM to generate a descriptive title for the document. If you already have sufficiently descriptive document titles then you can directly use those instead. We've found that a document title is the simplest and most important kind of higher-level context to include in the chunk header.\n",
"\n",
"Other kinds of context you can include in the chunk header:\n",
"- Concise document summary\n",
"- Section/sub-section title(s)\n",
" - This helps the retrieval system handle queries for larger sections or topics in documents.\n",
"\n",
"#### Embed chunks with chunk headers\n",
"The text you embed for each chunk is simply the concatenation of the chunk header and the chunk text. If you use a reranker during retrieval, you'll want to make sure you use this same concatenation there too.\n",
"\n",
"#### Add chunk headers to search results\n",
"Including the chunk headers when presenting the search results to the LLM is also beneficial as it gives the LLM more context, and makes it less likely that it misunderstands the meaning of a chunk."
]
},
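{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Illustrative sketch: embedding a chunk together with its header\n",
"\n",
"As a minimal sketch of the \"embed chunks with chunk headers\" step (assuming an OpenAI embedding model such as `text-embedding-3-small`; any embedding model works the same way), the text passed to the embedding model is simply the header concatenated with the chunk:\n",
"\n",
"```python\n",
"from openai import OpenAI\n",
"\n",
"client = OpenAI()\n",
"\n",
"def embed_chunk_with_header(chunk_header: str, chunk_text: str) -> list[float]:\n",
"    # The embedded text is the chunk header prepended to the chunk text\n",
"    text_to_embed = f\"{chunk_header}\\n\\n{chunk_text}\"\n",
"    response = client.embeddings.create(model=\"text-embedding-3-small\", input=text_to_embed)\n",
"    return response.data[0].embedding\n",
"```\n",
"\n",
"Queries are embedded as usual; only the document-side text changes."
]
},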
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"You'll need a Cohere API key and an OpenAI API key for this notebook."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install langchain openai python-dotenv tiktoken"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import cohere\n",
"import tiktoken\n",
"from typing import List\n",
"from openai import OpenAI\n",
"import os\n",
"from dotenv import load_dotenv\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"os.environ[\"CO_API_KEY\"] = os.getenv('CO_API_KEY') # Cohere API key\n",
"os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY') # OpenAI API key"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the document and split it into chunks\n",
"We'll use the basic LangChain RecursiveCharacterTextSplitter for this demo, but you can combine CCH with more sophisticated chunking methods for even better performance."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download required data files\n",
"import os\n",
"os.makedirs('data', exist_ok=True)\n",
"\n",
"# Download the PDF document used in this notebook\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n",
"!wget -O data/nike_2023_annual_report.txt https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/nike_2023_annual_report.txt\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"\n",
"def split_into_chunks(text: str, chunk_size: int = 800) -> list[str]:\n",
" \"\"\"\n",
" Split a given text into chunks of specified size using RecursiveCharacterTextSplitter.\n",
"\n",
" Args:\n",
" text (str): The input text to be split into chunks.\n",
" chunk_size (int, optional): The maximum size of each chunk. Defaults to 800.\n",
"\n",
" Returns:\n",
" list[str]: A list of text chunks.\n",
"\n",
" Example:\n",
" >>> text = \"This is a sample text to be split into chunks.\"\n",
" >>> chunks = split_into_chunks(text, chunk_size=10)\n",
" >>> print(chunks)\n",
" ['This is a', 'sample', 'text to', 'be split', 'into', 'chunks.']\n",
" \"\"\"\n",
" text_splitter = RecursiveCharacterTextSplitter(\n",
" chunk_size=chunk_size,\n",
" chunk_overlap=0,\n",
" length_function=len\n",
" )\n",
" documents = text_splitter.create_documents([text])\n",
" return [document.page_content for document in documents]\n",
"\n",
"# File path for the input document\n",
"FILE_PATH = \"data/nike_2023_annual_report.txt\"\n",
"\n",
"# Read the document and split it into chunks\n",
"with open(FILE_PATH, \"r\") as file:\n",
" document_text = file.read()\n",
"\n",
"chunks = split_into_chunks(document_text, chunk_size=800)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate descriptive document title to use in chunk header"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"NIKE, INC. ANNUAL REPORT ON FORM 10-K\n"
]
}
],
"source": [
"# Constants\n",
"DOCUMENT_TITLE_PROMPT = \"\"\"\n",
"INSTRUCTIONS\n",
"What is the title of the following document?\n",
"\n",
"Your response MUST be the title of the document, and nothing else. DO NOT respond with anything else.\n",
"\n",
"{document_title_guidance}\n",
"\n",
"{truncation_message}\n",
"\n",
"DOCUMENT\n",
"{document_text}\n",
"\"\"\".strip()\n",
"\n",
"TRUNCATION_MESSAGE = \"\"\"\n",
"Also note that the document text provided below is just the first ~{num_words} words of the document. That should be plenty for this task. Your response should still pertain to the entire document, not just the text provided below.\n",
"\"\"\".strip()\n",
"\n",
"MAX_CONTENT_TOKENS = 4000\n",
"MODEL_NAME = \"gpt-4o-mini\"\n",
"TOKEN_ENCODER = tiktoken.encoding_for_model('gpt-3.5-turbo')\n",
"\n",
"def make_llm_call(chat_messages: list[dict]) -> str:\n",
" \"\"\"\n",
" Make an API call to the OpenAI language model.\n",
"\n",
" Args:\n",
" chat_messages (list[dict]): A list of message dictionaries for the chat completion.\n",
"\n",
" Returns:\n",
" str: The generated response from the language model.\n",
" \"\"\"\n",
" client = OpenAI(api_key=os.getenv(\"OPENAI_API_KEY\"))\n",
" response = client.chat.completions.create(\n",
" model=MODEL_NAME,\n",
" messages=chat_messages,\n",
" max_tokens=MAX_CONTENT_TOKENS,\n",
" temperature=0.2,\n",
" )\n",
" return response.choices[0].message.content.strip()\n",
"\n",
"def truncate_content(content: str, max_tokens: int) -> tuple[str, int]:\n",
" \"\"\"\n",
" Truncate the content to a specified maximum number of tokens.\n",
"\n",
" Args:\n",
" content (str): The input text to be truncated.\n",
" max_tokens (int): The maximum number of tokens to keep.\n",
"\n",
" Returns:\n",
" tuple[str, int]: A tuple containing the truncated content and the number of tokens.\n",
" \"\"\"\n",
" tokens = TOKEN_ENCODER.encode(content, disallowed_special=())\n",
" truncated_tokens = tokens[:max_tokens]\n",
" return TOKEN_ENCODER.decode(truncated_tokens), min(len(tokens), max_tokens)\n",
"\n",
"def get_document_title(document_text: str, document_title_guidance: str = \"\") -> str:\n",
" \"\"\"\n",
" Extract the title of a document using a language model.\n",
"\n",
" Args:\n",
" document_text (str): The text of the document.\n",
" document_title_guidance (str, optional): Additional guidance for title extraction. Defaults to \"\".\n",
"\n",
" Returns:\n",
" str: The extracted document title.\n",
" \"\"\"\n",
" # Truncate the content if it's too long\n",
" document_text, num_tokens = truncate_content(document_text, MAX_CONTENT_TOKENS)\n",
" truncation_message = TRUNCATION_MESSAGE.format(num_words=3000) if num_tokens >= MAX_CONTENT_TOKENS else \"\"\n",
"\n",
" # Prepare the prompt for title extraction\n",
" prompt = DOCUMENT_TITLE_PROMPT.format(\n",
" document_title_guidance=document_title_guidance,\n",
" document_text=document_text,\n",
" truncation_message=truncation_message\n",
" )\n",
" chat_messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" \n",
" return make_llm_call(chat_messages)\n",
"\n",
"# Example usage\n",
"if __name__ == \"__main__\":\n",
" # Assuming document_text is defined elsewhere\n",
" document_title = get_document_title(document_text)\n",
" print(f\"Document Title: {document_title}\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Add chunk header and measure impact\n",
"Let's look at a specific example to demonstrate the impact of adding a chunk header. We'll use the Cohere reranker to measure relevance to a query with and without a chunk header."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Chunk header:\n",
"Document Title: NIKE, INC. ANNUAL REPORT ON FORM 10-K\n",
"\n",
"Chunk text:\n",
"Given the broad and global scope of our operations, we are particularly vulnerable to the physical risks of climate change, such \n",
"as shifts in weather patterns. Extreme weather conditions in the areas in which our retail stores, suppliers, manufacturers, \n",
"customers, distribution centers, offices, headquarters and vendors are located could adversely affect our operating results and \n",
"financial condition. Moreover, natural disasters such as earthquakes, hurricanes, wildfires, tsunamis, floods or droughts, whether \n",
"occurring in the United States or abroad, and their related consequences and effects, including energy shortages and public \n",
"health issues, have in the past temporarily disrupted, and could in the future disrupt, our operations, the operations of our\n",
"\n",
"Query: Nike climate change impact\n",
"\n",
"Similarity w/o contextual chunk header: 0.10576342\n",
"Similarity with contextual chunk header: 0.92206234\n"
]
}
],
"source": [
"def rerank_documents(query: str, chunks: List[str]) -> List[float]:\n",
" \"\"\"\n",
" Use Cohere Rerank API to rerank the search results.\n",
"\n",
" Args:\n",
" query (str): The search query.\n",
" chunks (List[str]): List of document chunks to be reranked.\n",
"\n",
" Returns:\n",
" List[float]: List of similarity scores for each chunk, in the original order.\n",
" \"\"\"\n",
" MODEL = \"rerank-english-v3.0\"\n",
" client = cohere.Client(api_key=os.environ[\"CO_API_KEY\"])\n",
"\n",
" reranked_results = client.rerank(model=MODEL, query=query, documents=chunks)\n",
" results = reranked_results.results\n",
" reranked_indices = [result.index for result in results]\n",
" reranked_similarity_scores = [result.relevance_score for result in results]\n",
" \n",
" # Convert back to order of original documents\n",
" similarity_scores = [0] * len(chunks)\n",
" for i, index in enumerate(reranked_indices):\n",
" similarity_scores[index] = reranked_similarity_scores[i]\n",
"\n",
" return similarity_scores\n",
"\n",
"def compare_chunk_similarities(chunk_index: int, chunks: List[str], document_title: str, query: str) -> None:\n",
" \"\"\"\n",
" Compare similarity scores for a chunk with and without a contextual header.\n",
"\n",
" Args:\n",
" chunk_index (int): Index of the chunk to inspect.\n",
" chunks (List[str]): List of all document chunks.\n",
" document_title (str): Title of the document.\n",
" query (str): The search query to use for comparison.\n",
"\n",
" Prints:\n",
" Chunk header, chunk text, query, and similarity scores with and without the header.\n",
" \"\"\"\n",
" chunk_text = chunks[chunk_index]\n",
" chunk_wo_header = chunk_text\n",
" chunk_w_header = f\"Document Title: {document_title}\\n\\n{chunk_text}\"\n",
"\n",
" similarity_scores = rerank_documents(query, [chunk_wo_header, chunk_w_header])\n",
"\n",
" print(f\"\\nChunk header:\\nDocument Title: {document_title}\")\n",
" print(f\"\\nChunk text:\\n{chunk_text}\")\n",
" print(f\"\\nQuery: {query}\")\n",
" print(f\"\\nSimilarity without contextual chunk header: {similarity_scores[0]:.4f}\")\n",
" print(f\"Similarity with contextual chunk header: {similarity_scores[1]:.4f}\")\n",
"\n",
"# Notebook cell for execution\n",
"# Assuming chunks and document_title are defined in previous cells\n",
"CHUNK_INDEX_TO_INSPECT = 86\n",
"QUERY = \"Nike climate change impact\"\n",
"\n",
"compare_chunk_similarities(CHUNK_INDEX_TO_INSPECT, chunks, document_title, QUERY)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This chunk is clearly about the impact of climate change on some organization, but it doesn't explicitly say \"Nike\" in it. So the relevance to the query \"Nike climate change impact\" in only about 0.1. By simply adding the document title to the chunk that similarity goes up to 0.92."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Eval results\n",
"\n",
"#### KITE\n",
"\n",
"We evaluated CCH on an end-to-end RAG benchmark we created, called KITE (Knowledge-Intensive Task Evaluation).\n",
"\n",
"KITE currently consists of 4 datasets and a total of 50 questions.\n",
"- **AI Papers** - ~100 academic papers about AI and RAG, downloaded from arXiv in PDF form.\n",
"- **BVP Cloud 10-Ks** - 10-Ks for all companies in the Bessemer Cloud Index (~70 of them), in PDF form.\n",
"- **Sourcegraph Company Handbook** - ~800 markdown files, with their original directory structure, downloaded from Sourcegraph's publicly accessible company handbook GitHub [page](https://github.com/sourcegraph/handbook/tree/main/content).\n",
"- **Supreme Court Opinions** - All Supreme Court opinions from Term Year 2022 (delivered from January '23 to June '23), downloaded from the official Supreme Court [website](https://www.supremecourt.gov/opinions/slipopinion/22) in PDF form.\n",
"\n",
"Ground truth answers are included with each sample. Most samples also include grading rubrics. Grading is done on a scale of 0-10 for each question, with a strong LLM doing the grading.\n",
"\n",
"We compare performance with and without CCH. For the CCH config we use document title and document summary. All other parameters remain the same between the two configurations. We use the Cohere 3 reranker, and we use GPT-4o for response generation.\n",
"\n",
"| | No-CCH | CCH |\n",
"|-------------------------|----------|--------------|\n",
"| AI Papers | 4.5 | 4.7 |\n",
"| BVP Cloud | 2.6 | 6.3 |\n",
"| Sourcegraph | 5.7 | 5.8 |\n",
"| Supreme Court Opinions | 6.1 | 7.4 |\n",
"| **Average** | 4.72 | 6.04 |\n",
"\n",
"We can see that CCH leads to an improvement in performance on each of the four datasets. Some datasets see a large improvement while others see a small improvement. The overall average score increases from 4.72 -> 6.04, a 27.9% increase.\n",
"\n",
"#### FinanceBench\n",
"\n",
"We've also evaluated CCH on FinanceBench, where it contributed to a score of 83%, compared to a baseline score of 19%. For that benchmark, we tested CCH and relevant segment extraction (RSE) jointly, so we can't say exactly how much CCH contributed to that result. But the combination of CCH and RSE clearly leads to substantial accuracy improvements on FinanceBench."
]
}
],
"metadata": {
"colab": {
"name": "",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
},
"vscode": {
"interpreter": {
"hash": "44d0561a9d33f22b2e67e0485c48036e39d1c698628b030a9859974b559ff507"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```
## /all_rag_techniques/contextual_compression.ipynb
```ipynb path="/all_rag_techniques/contextual_compression.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/contextual_compression.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Contextual Compression in Document Retrieval\n",
"\n",
"## Overview\n",
"\n",
"This code demonstrates the implementation of contextual compression in a document retrieval system using LangChain and OpenAI's language models. The technique aims to improve the relevance and conciseness of retrieved information by compressing and extracting the most pertinent parts of documents in the context of a given query.\n",
"\n",
"## Motivation\n",
"\n",
"Traditional document retrieval systems often return entire chunks or documents, which may contain irrelevant information. Contextual compression addresses this by intelligently extracting and compressing only the most relevant parts of retrieved documents, leading to more focused and efficient information retrieval.\n",
"\n",
"## Key Components\n",
"\n",
"1. Vector store creation from a PDF document\n",
"2. Base retriever setup\n",
"3. LLM-based contextual compressor\n",
"4. Contextual compression retriever\n",
"5. Question-answering chain integrating the compressed retriever\n",
"\n",
"## Method Details\n",
"\n",
"### Document Preprocessing and Vector Store Creation\n",
"\n",
"1. The PDF is processed and encoded into a vector store using a custom `encode_pdf` function.\n",
"\n",
"### Retriever and Compressor Setup\n",
"\n",
"1. A base retriever is created from the vector store.\n",
"2. An LLM-based contextual compressor (LLMChainExtractor) is initialized using OpenAI's GPT-4 model.\n",
"\n",
"### Contextual Compression Retriever\n",
"\n",
"1. The base retriever and compressor are combined into a ContextualCompressionRetriever.\n",
"2. This retriever first fetches documents using the base retriever, then applies the compressor to extract the most relevant information.\n",
"\n",
"### Question-Answering Chain\n",
"\n",
"1. A RetrievalQA chain is created, integrating the compression retriever.\n",
"2. This chain uses the compressed and extracted information to generate answers to queries.\n",
"\n",
"## Benefits of this Approach\n",
"\n",
"1. Improved relevance: The system returns only the most pertinent information to the query.\n",
"2. Increased efficiency: By compressing and extracting relevant parts, it reduces the amount of text the LLM needs to process.\n",
"3. Enhanced context understanding: The LLM-based compressor can understand the context of the query and extract information accordingly.\n",
"4. Flexibility: The system can be easily adapted to different types of documents and queries.\n",
"\n",
"## Conclusion\n",
"\n",
"Contextual compression in document retrieval offers a powerful way to enhance the quality and efficiency of information retrieval systems. By intelligently extracting and compressing relevant information, it provides more focused and context-aware responses to queries. This approach has potential applications in various fields requiring efficient and accurate information retrieval from large document collections."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install langchain python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Clone the repository to access helper functions and evaluation modules\n",
"!git clone https://github.com/N7/RAG_TECHNIQUES.git\n",
"import sys\n",
"sys.path.append('RAG_TECHNIQUES')\n",
"# If you need to run with the latest data\n",
"# !cp -r RAG_TECHNIQUES/data ."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"from dotenv import load_dotenv\n",
"from langchain.retrievers.document_compressors import LLMChainExtractor\n",
"from langchain.retrievers import ContextualCompressionRetriever\n",
"from langchain.chains import RetrievalQA\n",
"\n",
"\n",
"# Original path append replaced for Colab compatibility\n",
"from helper_functions import *\n",
"from evaluation.evalute_rag import *\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"\n",
"# Set the OpenAI API key environment variable\n",
"os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define document's path"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download required data files\n",
"import os\n",
"os.makedirs('data', exist_ok=True)\n",
"\n",
"# Download the PDF document used in this notebook\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"path = \"data/Understanding_Climate_Change.pdf\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a vector store"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"vector_store = encode_pdf(path)"
]
},
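{
"cell_type": "markdown",
"metadata": {},
"source": [
"`encode_pdf` comes from the repository's `helper_functions` module. For readers who want a self-contained version, a rough equivalent (assuming the `langchain-community`, `langchain-openai`, `pypdf` and `faiss-cpu` packages; the exact chunking parameters of the helper may differ) looks like this:\n",
"\n",
"```python\n",
"from langchain_community.document_loaders import PyPDFLoader\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_community.vectorstores import FAISS\n",
"\n",
"def encode_pdf_sketch(path: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> FAISS:\n",
"    # Load the PDF, split it into overlapping chunks, and index them with OpenAI embeddings\n",
"    documents = PyPDFLoader(path).load()\n",
"    splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)\n",
"    chunks = splitter.split_documents(documents)\n",
"    return FAISS.from_documents(chunks, OpenAIEmbeddings())\n",
"```"
]
},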
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a retriever + contexual compressor + combine them "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# Create a retriever\n",
"retriever = vector_store.as_retriever()\n",
"\n",
"\n",
"#Create a contextual compressor\n",
"llm = ChatOpenAI(temperature=0, model_name=\"gpt-4o-mini\", max_tokens=4000)\n",
"compressor = LLMChainExtractor.from_llm(llm)\n",
"\n",
"#Combine the retriever with the compressor\n",
"compression_retriever = ContextualCompressionRetriever(\n",
" base_compressor=compressor,\n",
" base_retriever=retriever\n",
")\n",
"\n",
"# Create a QA chain with the compressed retriever\n",
"qa_chain = RetrievalQA.from_chain_type(\n",
" llm=llm,\n",
" retriever=compression_retriever,\n",
" return_source_documents=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example usage"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The main topic of the document is climate change, focusing on international collaboration, national strategies, policy development, and the ethical dimensions of climate justice. It discusses frameworks like the UNFCCC and the Paris Agreement, as well as the importance of sustainable practices for future generations.\n",
"Source documents: [Document(metadata={'source': '../data/Understanding_Climate_Change.pdf', 'page': 9}, page_content='Chapter 6: Global and Local Climate Action \\nInternational Collaboration \\nUnited Nations Framework Convention on Climate Change (UNFCCC) \\nThe UNFCCC is an international treaty aimed at addressing climate change. It provides a \\nframework for negotiating specific protocols and agreements, such as the Kyoto Protocol and \\nthe Paris Agreement. Global cooperation under the UNFCCC is crucial for coordinated \\nclimate action. \\nParis Agreement \\nThe Paris Agreement, adopted in 2015, aims to limit global warming to well below 2 degrees \\nCelsius above pre-industrial levels, with efforts to limit the increase to 1.5 degrees Celsius. \\nCountries submit nationally determined contributions (NDCs) outlining their climate action \\nplans and targets. \\nNational Strategies \\nCarbon Pricing \\nCarbon pricing mechanisms, such as carbon taxes and cap-and-trade systems, incentivize \\nemission reductions by assigning a cost to carbon emissions. These policies encourage'), Document(metadata={'source': '../data/Understanding_Climate_Change.pdf', 'page': 27}, page_content='Legacy for Future Generations \\nOur actions today shape the world for future generations. Ensuring a sustainable and resilient \\nplanet is our responsibility to future generations. By working together, we can create a legacy \\nof environmental stewardship, social equity, and global solidarity. \\nChapter 19: Climate Change and Policy \\nPolicy Development and Implementation \\nNational Climate Policies \\nCountries around the world are developing and implementing national climate policies to \\naddress climate change. These policies set emission reduction targets, promote renewable \\nenergy, and support adaptation measures. Effective policy implementation requires'), Document(metadata={'source': '../data/Understanding_Climate_Change.pdf', 'page': 18}, page_content='This vision includes a healthy planet, thriving ecosystems, and equitable societies. Working together towards this vision creates a sense of purpose and motivation . By embracing these principles and taking concerted action, we can address the urgent challenge of climate change and build a sustainable, resilient, and equitable world for all. The path forward requires courage, commitment, and collaboration, but the rewa rds are immense—a thriving planet and a prosperous future for generations to come. \\nChapter 13: Climate Change and Social Justice \\nClimate Justice \\nUnderstanding Climate Justice \\nClimate justice emphasizes the ethical dimensions of climate change, recognizing that its impacts are not evenly distributed. Vulnerable populations, including low -income communities, indigenous peoples, and marginalized groups, often face the greatest ris ks while contributing the least to greenhouse gas emissions. Climate justice advocates for')]\n"
]
}
],
"source": [
"query = \"What is the main topic of the document?\"\n",
"result = qa_chain.invoke({\"query\": query})\n",
"print(result[\"result\"])\n",
"print(\"Source documents:\", result[\"source_documents\"])"
]
}
],
"metadata": {
"colab": {
"name": "",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```
## /all_rag_techniques/crag.ipynb
```ipynb path="/all_rag_techniques/crag.ipynb"
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/crag.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Corrective RAG Process: Retrieval-Augmented Generation with Dynamic Correction\n",
"\n",
"## Overview\n",
"\n",
"The Corrective RAG (Retrieval-Augmented Generation) process is an advanced information retrieval and response generation system. It extends the standard RAG approach by dynamically evaluating and correcting the retrieval process, combining the power of vector databases, web search, and language models to provide accurate and context-aware responses to user queries.\n",
"\n",
"## Motivation\n",
"\n",
"While traditional RAG systems have improved information retrieval and response generation, they can still fall short when the retrieved information is irrelevant or outdated. The Corrective RAG process addresses these limitations by:\n",
"\n",
"1. Leveraging pre-existing knowledge bases\n",
"2. Evaluating the relevance of retrieved information\n",
"3. Dynamically searching the web when necessary\n",
"4. Refining and combining knowledge from multiple sources\n",
"5. Generating human-like responses based on the most appropriate knowledge\n",
"\n",
"## Key Components\n",
"\n",
"1. **FAISS Index**: A vector database for efficient similarity search of pre-existing knowledge.\n",
"2. **Retrieval Evaluator**: Assesses the relevance of retrieved documents to the query.\n",
"3. **Knowledge Refinement**: Extracts key information from documents when necessary.\n",
"4. **Web Search Query Rewriter**: Optimizes queries for web searches when local knowledge is insufficient.\n",
"5. **Response Generator**: Creates human-like responses based on the accumulated knowledge.\n",
"\n",
"## Method Details\n",
"\n",
"1. **Document Retrieval**: \n",
" - Performs similarity search in the FAISS index to find relevant documents.\n",
" - Retrieves top-k documents (default k=3).\n",
"\n",
"2. **Document Evaluation**:\n",
" - Calculates relevance scores for each retrieved document.\n",
" - Determines the best course of action based on the highest relevance score.\n",
"\n",
"3. **Corrective Knowledge Acquisition**:\n",
" - If high relevance (score > 0.7): Uses the most relevant document as-is.\n",
" - If low relevance (score < 0.3): Corrects by performing a web search with a rewritten query.\n",
" - If ambiguous (0.3 ≤ score ≤ 0.7): Corrects by combining the most relevant document with web search results.\n",
"\n",
"4. **Adaptive Knowledge Processing**:\n",
" - For web search results: Refines the knowledge to extract key points.\n",
" - For ambiguous cases: Combines raw document content with refined web search results.\n",
"\n",
"5. **Response Generation**:\n",
" - Uses a language model to generate a human-like response based on the query and acquired knowledge.\n",
" - Includes source information in the response for transparency.\n",
"\n",
"## Benefits of the Corrective RAG Approach\n",
"\n",
"1. **Dynamic Correction**: Adapts to the quality of retrieved information, ensuring relevance and accuracy.\n",
"2. **Flexibility**: Leverages both pre-existing knowledge and web search as needed.\n",
"3. **Accuracy**: Evaluates the relevance of information before using it, ensuring high-quality responses.\n",
"4. **Transparency**: Provides source information, allowing users to verify the origin of the information.\n",
"5. **Efficiency**: Uses vector search for quick retrieval from large knowledge bases.\n",
"6. **Contextual Understanding**: Combines multiple sources of information when necessary to provide comprehensive responses.\n",
"7. **Up-to-date Information**: Can supplement or replace outdated local knowledge with current web information.\n",
"\n",
"## Conclusion\n",
"\n",
"The Corrective RAG process represents a sophisticated evolution of the standard RAG approach. By intelligently evaluating and correcting the retrieval process, it overcomes common limitations of traditional RAG systems. This dynamic approach ensures that responses are based on the most relevant and up-to-date information available, whether from local knowledge bases or the web. The system's ability to adapt its information sourcing strategy based on relevance scores makes it particularly suited for applications requiring high accuracy and current information, such as research assistance, dynamic knowledge bases, and advanced question-answering systems."
]
},
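{
"cell_type": "markdown",
"metadata": {},
"source": [
"The relevance thresholds described above amount to a small routing rule. As a sketch (the complete logic lives in the `crag_process` function later in this notebook):\n",
"\n",
"```python\n",
"def route_action(max_relevance_score: float) -> str:\n",
"    # Correct: use the best retrieved document as-is.\n",
"    # Incorrect: discard local results and fall back to a web search.\n",
"    # Ambiguous: combine the best retrieved document with web search results.\n",
"    if max_relevance_score > 0.7:\n",
"        return \"correct\"\n",
"    if max_relevance_score < 0.3:\n",
"        return \"incorrect\"\n",
"    return \"ambiguous\"\n",
"```"
]
},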
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Package Installation and Imports\n",
"\n",
"The cell below installs all necessary packages required to run this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"!pip install langchain langchain-openai python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Clone the repository to access helper functions and evaluation modules\n",
"!git clone https://github.com/N7/RAG_TECHNIQUES.git\n",
"import sys\n",
"sys.path.append('RAG_TECHNIQUES')\n",
"# If you need to run with the latest data\n",
"# !cp -r RAG_TECHNIQUES/data ."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"from dotenv import load_dotenv\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"# Original path append replaced for Colab compatibility\n",
"from helper_functions import *\n",
"from evaluation.evalute_rag import *\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"\n",
"# Set the OpenAI API key environment variable\n",
"os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY')\n",
"from langchain.tools import DuckDuckGoSearchResults\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define files path"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download required data files\n",
"import os\n",
"os.makedirs('data', exist_ok=True)\n",
"\n",
"# Download the PDF document used in this notebook\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n",
"!wget -O data/Understanding_Climate_Change.pdf https://raw.githubusercontent.com/N7/RAG_TECHNIQUES/main/data/Understanding_Climate_Change.pdf\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"path = \"data/Understanding_Climate_Change.pdf\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a vector store"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"vectorstore = encode_pdf(path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize OpenAI language model\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"llm = ChatOpenAI(model=\"gpt-4o-mini\", max_tokens=1000, temperature=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize search tool"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"search = DuckDuckGoSearchResults()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define retrieval evaluator, knowledge refinement and query rewriter llm chains"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# Retrieval Evaluator\n",
"class RetrievalEvaluatorInput(BaseModel):\n",
" relevance_score: float = Field(..., description=\"The relevance score of the document to the query. the score should be between 0 and 1.\")\n",
"def retrieval_evaluator(query: str, document: str) -> float:\n",
" prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"document\"],\n",
" template=\"On a scale from 0 to 1, how relevant is the following document to the query? Query: {query}\\nDocument: {document}\\nRelevance score:\"\n",
" )\n",
" chain = prompt | llm.with_structured_output(RetrievalEvaluatorInput)\n",
" input_variables = {\"query\": query, \"document\": document}\n",
" result = chain.invoke(input_variables).relevance_score\n",
" return result\n",
"\n",
"# Knowledge Refinement\n",
"class KnowledgeRefinementInput(BaseModel):\n",
" key_points: str = Field(..., description=\"The document to extract key information from.\")\n",
"def knowledge_refinement(document: str) -> List[str]:\n",
" prompt = PromptTemplate(\n",
" input_variables=[\"document\"],\n",
" template=\"Extract the key information from the following document in bullet points:\\n{document}\\nKey points:\"\n",
" )\n",
" chain = prompt | llm.with_structured_output(KnowledgeRefinementInput)\n",
" input_variables = {\"document\": document}\n",
" result = chain.invoke(input_variables).key_points\n",
" return [point.strip() for point in result.split('\\n') if point.strip()]\n",
"\n",
"# Web Search Query Rewriter\n",
"class QueryRewriterInput(BaseModel):\n",
" query: str = Field(..., description=\"The query to rewrite.\")\n",
"def rewrite_query(query: str) -> str:\n",
" prompt = PromptTemplate(\n",
" input_variables=[\"query\"],\n",
" template=\"Rewrite the following query to make it more suitable for a web search:\\n{query}\\nRewritten query:\"\n",
" )\n",
" chain = prompt | llm.with_structured_output(QueryRewriterInput)\n",
" input_variables = {\"query\": query}\n",
" return chain.invoke(input_variables).query.strip()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Helper function to parse search results\n"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"def parse_search_results(results_string: str) -> List[Tuple[str, str]]:\n",
" \"\"\"\n",
" Parse a JSON string of search results into a list of title-link tuples.\n",
"\n",
" Args:\n",
" results_string (str): A JSON-formatted string containing search results.\n",
"\n",
" Returns:\n",
" List[Tuple[str, str]]: A list of tuples, where each tuple contains the title and link of a search result.\n",
" If parsing fails, an empty list is returned.\n",
" \"\"\"\n",
" try:\n",
" # Attempt to parse the JSON string\n",
" results = json.loads(results_string)\n",
" # Extract and return the title and link from each result\n",
" return [(result.get('title', 'Untitled'), result.get('link', '')) for result in results]\n",
" except json.JSONDecodeError:\n",
" # Handle JSON decoding errors by returning an empty list\n",
" print(\"Error parsing search results. Returning empty list.\")\n",
" return []"
]
},
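{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, given a hand-written JSON string (not actual search-tool output), the parser returns title-link tuples:\n",
"\n",
"```python\n",
"example_results = '[{\"title\": \"IPCC Sixth Assessment Report\", \"link\": \"https://www.ipcc.ch/assessment-report/ar6/\"}]'\n",
"parse_search_results(example_results)\n",
"# -> [('IPCC Sixth Assessment Report', 'https://www.ipcc.ch/assessment-report/ar6/')]\n",
"```"
]
},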
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define sub functions for the CRAG process"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"def retrieve_documents(query: str, faiss_index: FAISS, k: int = 3) -> List[str]:\n",
" \"\"\"\n",
" Retrieve documents based on a query using a FAISS index.\n",
"\n",
" Args:\n",
" query (str): The query string to search for.\n",
" faiss_index (FAISS): The FAISS index used for similarity search.\n",
" k (int): The number of top documents to retrieve. Defaults to 3.\n",
"\n",
" Returns:\n",
" List[str]: A list of the retrieved document contents.\n",
" \"\"\"\n",
" docs = faiss_index.similarity_search(query, k=k)\n",
" return [doc.page_content for doc in docs]\n",
"\n",
"def evaluate_documents(query: str, documents: List[str]) -> List[float]:\n",
" \"\"\"\n",
" Evaluate the relevance of documents based on a query.\n",
"\n",
" Args:\n",
" query (str): The query string.\n",
" documents (List[str]): A list of document contents to evaluate.\n",
"\n",
" Returns:\n",
" List[float]: A list of relevance scores for each document.\n",
" \"\"\"\n",
" return [retrieval_evaluator(query, doc) for doc in documents]\n",
"\n",
"def perform_web_search(query: str) -> Tuple[List[str], List[Tuple[str, str]]]:\n",
" \"\"\"\n",
" Perform a web search based on a query.\n",
"\n",
" Args:\n",
" query (str): The query string to search for.\n",
"\n",
" Returns:\n",
" Tuple[List[str], List[Tuple[str, str]]]: \n",
" - A list of refined knowledge obtained from the web search.\n",
" - A list of tuples containing titles and links of the sources.\n",
" \"\"\"\n",
" rewritten_query = rewrite_query(query)\n",
" web_results = search.run(rewritten_query)\n",
" web_knowledge = knowledge_refinement(web_results)\n",
" sources = parse_search_results(web_results)\n",
" return web_knowledge, sources\n",
"\n",
"def generate_response(query: str, knowledge: str, sources: List[Tuple[str, str]]) -> str:\n",
" \"\"\"\n",
" Generate a response to a query using knowledge and sources.\n",
"\n",
" Args:\n",
" query (str): The query string.\n",
" knowledge (str): The refined knowledge to use in the response.\n",
" sources (List[Tuple[str, str]]): A list of tuples containing titles and links of the sources.\n",
"\n",
" Returns:\n",
" str: The generated response.\n",
" \"\"\"\n",
" response_prompt = PromptTemplate(\n",
" input_variables=[\"query\", \"knowledge\", \"sources\"],\n",
" template=\"Based on the following knowledge, answer the query. Include the sources with their links (if available) at the end of your answer:\\nQuery: {query}\\nKnowledge: {knowledge}\\nSources: {sources}\\nAnswer:\"\n",
" )\n",
" input_variables = {\n",
" \"query\": query,\n",
" \"knowledge\": knowledge,\n",
" \"sources\": \"\\n\".join([f\"{title}: {link}\" if link else title for title, link in sources])\n",
" }\n",
" response_chain = response_prompt | llm\n",
" return response_chain.invoke(input_variables).content\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### CRAG process\n"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"def crag_process(query: str, faiss_index: FAISS) -> str:\n",
" \"\"\"\n",
" Process a query by retrieving, evaluating, and using documents or performing a web search to generate a response.\n",
"\n",
" Args:\n",
" query (str): The query string to process.\n",
" faiss_index (FAISS): The FAISS index used for document retrieval.\n",
"\n",
" Returns:\n",
" str: The generated response based on the query.\n",
" \"\"\"\n",
" print(f\"\\nProcessing query: {query}\")\n",
"\n",
" # Retrieve and evaluate documents\n",
" retrieved_docs = retrieve_documents(query, faiss_index)\n",
" eval_scores = evaluate_documents(query, retrieved_docs)\n",
" \n",
" print(f\"\\nRetrieved {len(retrieved_docs)} documents\")\n",
" print(f\"Evaluation scores: {eval_scores}\")\n",
"\n",
" # Determine action based on evaluation scores\n",
" max_score = max(eval_scores)\n",
" sources = []\n",
" \n",
" if max_score > 0.7:\n",
" print(\"\\nAction: Correct - Using retrieved document\")\n",
" best_doc = retrieved_docs[eval_scores.index(max_score)]\n",
" final_knowledge = best_doc\n",
" sources.append((\"Retrieved document\", \"\"))\n",
" elif max_score < 0.3:\n",
" print(\"\\nAction: Incorrect - Performing web search\")\n",
" final_knowledge, sources = perform_web_search(query)\n",
" else:\n",
" print(\"\\nAction: Ambiguous - Combining retrieved document and web search\")\n",
" best_doc = retrieved_docs[eval_scores.index(max_score)]\n",
" # Refine the retrieved knowledge\n",
" retrieved_knowledge = knowledge_refinement(best_doc)\n",
" web_knowledge, web_sources = perform_web_search(query)\n",
" final_knowledge = \"\\n\".join(retrieved_knowledge + web_knowledge)\n",
" sources = [(\"Retrieved document\", \"\")] + web_sources\n",
"\n",
" print(\"\\nFinal knowledge:\")\n",
" print(final_knowledge)\n",
" \n",
" print(\"\\nSources:\")\n",
" for title, link in sources:\n",
" print(f\"{title}: {link}\" if link else title)\n",
"\n",
" # Generate response\n",
" print(\"\\nGenerating response...\")\n",
" response = generate_response(query, final_knowledge, sources)\n",
"\n",
" print(\"\\nResponse generated\")\n",
" return response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example query with high relevance to the document\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"What are the main causes of climate change?\"\n",
"result = crag_process(query, vectorstore)\n",
"print(f\"Query: {query}\")\n",
"print(f\"Answer: {result}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example query with low relevance to the document\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = \"how did harry beat quirrell?\"\n",
"result = crag_process(query, vectorstore)\n",
"print(f\"Query: {query}\")\n",
"print(f\"Answer: {result}\")"
]
}
],
"metadata": {
"colab": {
"name": "",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
```