```
├── .cursor/
│   └── rules/
│       └── python.mdc
├── .env.example
├── .gitignore
├── .python-version
├── LICENSE
├── README.md
├── cursor-rules-cli/
│   ├── .pypirc
│   ├── LICENSE
│   ├── MANIFEST.in
│   ├── PUBLISHING.md
│   ├── README.md
│   ├── cursor-rules-cli-logo.jpeg
│   ├── cursor-rules-cli.cast
│   ├── cursor-rules-cli.gif
│   ├── cursor-rules-cli.png
│   ├── pyproject.toml
│   ├── rules.json
│   ├── setup.py
│   └── src/
│       ├── __init__.py
│       ├── downloader.py
│       ├── installer.py
│       ├── main.py
│       ├── matcher.py
│       ├── scanner.py
│       └── utils.py
├── pyproject.toml
├── requirements.txt
└── rules-mdc/
    ├── actix-web.mdc
    ├── aiohttp.mdc
    ├── amazon-ec2.mdc
    ├── amazon-s3.mdc
    ├── android-sdk.mdc
```

## /.cursor/rules/python.mdc

```mdc path="/.cursor/rules/python.mdc"
---
description: package and dependency management
globs:
alwaysApply: true
---

Use uv instead of pip.

uv add library
uv run script.py

DO NOT create requirements.txt. Use uv add commands.
```
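The rule above leaves `globs` empty and sets `alwaysApply: true`, so it is attached to every request. For contrast, a rule scoped to matching files only might look like the following — an illustrative sketch, not a file that exists in this repo:

```mdc
---
description: Python style conventions
globs: **/*.py
alwaysApply: false
---

Follow PEP 8 naming conventions.
Prefer pathlib.Path over os.path for filesystem work.
```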
## /.env.example

```example path="/.env.example"
# API Keys for MDC Rules Generator
# Copy this file to .env and fill in your API keys

# Required for Exa semantic search
EXA_API_KEY=your_exa_api_key_here

# Choose one of the following based on your LLM provider:

# For Gemini (default in config.yaml)
GOOGLE_API_KEY=your_google_api_key_here

# For OpenAI models (uncomment if using)
# OPENAI_API_KEY=your_openai_api_key_here

# For Anthropic Claude models (uncomment if using)
# ANTHROPIC_API_KEY=your_anthropic_api_key_here
```

## /.gitignore

```gitignore path="/.gitignore"
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# Virtual environments
.venv

awesome-cursorrules/
.env
exa_results/
logs/
.cache/
.DS_Store
```

## /.python-version

```python-version path="/.python-version"
3.11
```

## /LICENSE

``` path="/LICENSE"
Creative Commons Legal Code

CC0 1.0 Universal

CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER.

Statement of Purpose

The laws of most jurisdictions throughout the world automatically confer exclusive Copyright and Related Rights (defined below) upon the creator and subsequent owner(s) (each and all, an "owner") of an original work of authorship and/or a database (each, a "Work").

Certain owners wish to permanently relinquish those rights to a Work for the purpose of contributing to a commons of creative, cultural and scientific works ("Commons") that the public can reliably and without fear of later claims of infringement build upon, modify, incorporate in other works, reuse and redistribute as freely as possible in any form whatsoever and for any purposes, including without limitation commercial purposes. These owners may contribute to the Commons to promote the ideal of a free culture and the further production of creative, cultural and scientific works, or to gain reputation or greater distribution for their Work in part through the use and efforts of others.

For these and/or other purposes and motivations, and without any expectation of additional consideration or compensation, the person associating CC0 with a Work (the "Affirmer"), to the extent that he or she is an owner of Copyright and Related Rights in the Work, voluntarily elects to apply CC0 to the Work and publicly distribute the Work under its terms, with knowledge of his or her Copyright and Related Rights in the Work and the meaning and intended legal effect of CC0 on those rights.

1. Copyright and Related Rights. A Work made available under CC0 may be protected by copyright and related or neighboring rights ("Copyright and Related Rights"). Copyright and Related Rights include, but are not limited to, the following:

i. the right to reproduce, adapt, distribute, perform, display, communicate, and translate a Work;
ii. moral rights retained by the original author(s) and/or performer(s);
iii. publicity and privacy rights pertaining to a person's image or likeness depicted in a Work;
iv. rights protecting against unfair competition in regards to a Work, subject to the limitations in paragraph 4(a), below;
v. rights protecting the extraction, dissemination, use and reuse of data in a Work;
vi. database rights (such as those arising under Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, and under any national implementation thereof, including any amended or successor version of such directive); and
vii. other similar, equivalent or corresponding rights throughout the world based on applicable law or treaty, and any national implementations thereof.

2. Waiver. To the greatest extent permitted by, but not in contravention of, applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and unconditionally waives, abandons, and surrenders all of Affirmer's Copyright and Related Rights and associated claims and causes of action, whether now known or unknown (including existing as well as future claims and causes of action), in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each member of the public at large and to the detriment of Affirmer's heirs and successors, fully intending that such Waiver shall not be subject to revocation, rescission, cancellation, termination, or any other legal or equitable action to disrupt the quiet enjoyment of the Work by the public as contemplated by Affirmer's express Statement of Purpose.

3. Public License Fallback. Should any part of the Waiver for any reason be judged legally invalid or ineffective under applicable law, then the Waiver shall be preserved to the maximum extent permitted taking into account Affirmer's express Statement of Purpose.
In addition, to the extent the Waiver is so judged Affirmer hereby grants to each affected person a royalty-free, non transferable, non sublicensable, non exclusive, irrevocable and unconditional license to exercise Affirmer's Copyright and Related Rights in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "License"). The License shall be deemed effective as of the date CC0 was applied by Affirmer to the Work. Should any part of the License for any reason be judged legally invalid or ineffective under applicable law, such partial invalidity or ineffectiveness shall not invalidate the remainder of the License, and in such case Affirmer hereby affirms that he or she will not (i) exercise any of his or her remaining Copyright and Related Rights in the Work or (ii) assert any associated claims and causes of action with respect to the Work, in either case contrary to Affirmer's express Statement of Purpose.

4. Limitations and Disclaimers.

a. No trademark or patent rights held by Affirmer are waived, abandoned, surrendered, licensed or otherwise affected by this document.

b. Affirmer offers the Work as-is and makes no representations or warranties of any kind concerning the Work, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non infringement, or the absence of latent or other defects, accuracy, or the present or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law.

c. Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof, including without limitation any person's Copyright and Related Rights in the Work. Further, Affirmer disclaims responsibility for obtaining any necessary consents, permissions or other rights required for any use of the Work.

d. Affirmer understands and acknowledges that Creative Commons is not a party to this document and has no duty or obligation with respect to this CC0 or use of the Work.
```

## /README.md

# MDC Rules Generator

> **Disclaimer:** This project is not officially associated with or endorsed by Cursor. It is a community-driven initiative to enhance the Cursor experience.

Cursor Rules CLI - Auto-install relevant Cursor rules with one simple command | Product Hunt

This project generates Cursor MDC (Markdown Cursor) rule files from a structured JSON file containing library information. It uses Exa for semantic search and an LLM (Gemini by default) for content generation.

## Features

- Generates comprehensive MDC rule files for libraries
- Uses Exa for semantic web search to gather best practices
- Leverages an LLM to create detailed, structured content
- Supports parallel processing for efficiency
- Tracks progress to allow resuming interrupted runs
- Smart retry system that focuses on failed libraries by default (see the sketch below)
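Progress tracking and the retry system work together: each run records per-library status, and the next run skips anything already completed unless `--regenerate-all` is passed. A minimal sketch of that pattern — the file name and keys here are illustrative, not taken from `generate_mdc_files.py`:

```python
import json
from pathlib import Path
from typing import Any, Dict, List

# Hypothetical location -- the real script keeps its own state under logs/.
PROGRESS_FILE = Path("logs/progress.json")

def load_progress() -> Dict[str, str]:
    """Return {library_name: "completed" | "failed"} recorded by a previous run."""
    if PROGRESS_FILE.exists():
        return json.loads(PROGRESS_FILE.read_text())
    return {}

def libraries_to_process(all_libraries: List[Dict[str, Any]], regenerate_all: bool) -> List[Dict[str, Any]]:
    """Default run retries anything not marked completed; --regenerate-all ignores history."""
    progress = load_progress()
    if regenerate_all:
        return all_libraries
    return [lib for lib in all_libraries if progress.get(lib["name"]) != "completed"]
```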
## Prerequisites

- Python 3.8+
- [uv](https://github.com/astral-sh/uv) for dependency management
- API keys for:
  - Exa (for semantic search)
  - An LLM provider (Gemini, OpenAI, or Anthropic)

## Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/sanjeed5/awesome-cursor-rules-mdc.git
   cd awesome-cursor-rules-mdc
   ```

2. Install dependencies using uv:

   ```bash
   uv sync
   ```

3. Set up environment variables:

   Create a `.env` file in the project root with your API keys (see `.env.example`):

   ```
   EXA_API_KEY=your_exa_api_key
   GOOGLE_API_KEY=your_google_api_key  # For Gemini
   # Or use one of these depending on your LLM choice:
   # OPENAI_API_KEY=your_openai_api_key
   # ANTHROPIC_API_KEY=your_anthropic_api_key
   ```

## Usage

Run the generator script with:

```bash
uv run src/generate_mdc_files.py
```

By default, the script will only process libraries that failed in previous runs.

### Command-line Options

- `--test`: Run in test mode (process only one library)
- `--tag TAG`: Process only libraries with a specific tag
- `--library LIBRARY`: Process only a specific library
- `--output OUTPUT_DIR`: Specify output directory for MDC files
- `--verbose`: Enable verbose logging
- `--workers N`: Set number of parallel workers
- `--rate-limit N`: Set API rate limit calls per minute
- `--regenerate-all`: Process all libraries, including previously completed ones

### Examples

```bash
# Process failed libraries (default behavior)
uv run src/generate_mdc_files.py

# Regenerate all libraries
uv run src/generate_mdc_files.py --regenerate-all

# Process only Python libraries
uv run src/generate_mdc_files.py --tag python

# Process a specific library
uv run src/generate_mdc_files.py --library react
```

## Adding New Rules

Adding support for new libraries is simple:

1. **Edit the rules.json file**:
   - Add a new entry to the `libraries` array:

   ```json
   {
     "name": "your-library-name",
     "tags": ["relevant-tag1", "relevant-tag2"]
   }
   ```

2. **Generate the MDC files**:
   - Run the generator script:

   ```bash
   uv run src/generate_mdc_files.py
   ```

   - The script automatically detects and processes new libraries

3. **Contribute back**:
   - Test your new rules with real projects
   - Consider raising a PR to contribute your additions back to the community

## Configuration

The script uses a `config.yaml` file for configuration. You can modify this file to adjust:

- API rate limits
- Output directories
- LLM model selection
- Processing parameters
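The repository's actual `src/config.yaml` is not reproduced in this section, so the snippet below is only an illustrative sketch of what such a file might contain — the key names are assumptions, not the real schema:

```yaml
# Illustrative sketch only -- consult src/config.yaml for the real keys.
llm:
  provider: gemini            # default; OpenAI or Anthropic also work per the README
  model: your-model-name      # placeholder
api:
  rate_limit_calls_per_minute: 60
processing:
  max_workers: 4
output:
  rules_dir: rules-mdc
  exa_results_dir: src/exa_results
```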
## Project Structure

```
.
├── cursor-rules-cli/          # CLI tool for finding and installing rules (deprecated)
│   ├── src/                   # CLI source code
│   ├── docs/                  # CLI documentation
│   └── README.md              # CLI usage instructions
├── src/                       # Main source code directory
│   ├── generate_mdc_files.py  # Main generator script
│   ├── config.yaml            # Configuration file
│   ├── mdc-instructions.txt   # Instructions for MDC generation
│   ├── logs/                  # Log files directory
│   └── exa_results/           # Directory for Exa search results
├── rules-mdc/                 # Output directory for generated MDC files
├── rules.json                 # Input file with library information
├── pyproject.toml             # Project dependencies and metadata
├── .env.example               # Example environment variables
└── LICENSE                    # CC0 1.0 Universal license
```

## License

[CC0 1.0 Universal](LICENSE)

## /cursor-rules-cli/.pypirc

```pypirc path="/cursor-rules-cli/.pypirc"
[distutils]
index-servers =
    pypi
    testpypi

[pypi]
username = __token__
password = your_pypi_token

[testpypi]
repository = https://test.pypi.org/legacy/
username = __token__
password = your_testpypi_token
```

## /cursor-rules-cli/LICENSE

``` path="/cursor-rules-cli/LICENSE"
MIT License

Copyright (c) 2025 Sanjeed

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
```

## /cursor-rules-cli/MANIFEST.in

```in path="/cursor-rules-cli/MANIFEST.in"
include LICENSE
include README.md
include rules.json
recursive-include src *.py
recursive-include src *.json
```

## /cursor-rules-cli/PUBLISHING.md

# Publishing to PyPI

This document provides instructions for publishing the `cursor-rules` package to PyPI.

## Prerequisites

1. Create a PyPI account at https://pypi.org/account/register/
2. Generate an API token at https://pypi.org/manage/account/token/
3. Install required tools:

   ```bash
   pip install build twine
   ```

## Publishing Steps

1. Update the version in `src/__init__.py`
2. Build the package:

   ```bash
   python -m build
   ```

3. Test the package locally:

   ```bash
   pip install --force-reinstall dist/cursor_rules-*.whl
   cursor-rules --help
   ```

4. Upload to TestPyPI (optional):

   ```bash
   python -m twine upload --repository testpypi dist/*
   ```

5. Install from TestPyPI (optional):

   ```bash
   pip install --index-url https://test.pypi.org/simple/ cursor-rules
   ```

6. Upload to PyPI:

   ```bash
   python -m twine upload dist/*
   ```

## Using API Tokens

When using twine, you can either:

1. Create a `.pypirc` file in your home directory:

   ```
   [distutils]
   index-servers =
       pypi
       testpypi

   [pypi]
   username = __token__
   password = your_pypi_token

   [testpypi]
   repository = https://test.pypi.org/legacy/
   username = __token__
   password = your_testpypi_token
   ```
2. Or provide credentials via environment variables:

   ```bash
   export TWINE_USERNAME=__token__
   export TWINE_PASSWORD=your_pypi_token
   ```

3. Or enter them when prompted by twine.

## Updating the Package

1. Make your changes
2. Update the version in `src/__init__.py`
3. Rebuild and upload following the steps above

## /cursor-rules-cli/README.md

# Cursor Rules CLI

> **Disclaimer:** This project is not officially associated with or endorsed by Cursor. It is a community-driven initiative to enhance the Cursor experience.

Cursor Rules CLI - Auto-install relevant Cursor rules with one simple command | Product Hunt

A simple tool that helps you find and install the right Cursor rules for your project. It scans your project to identify libraries and frameworks you're using and suggests matching rules.

![Cursor Rules CLI Demo](cursor-rules-cli.gif)

## Features

- 🔍 Auto-detects libraries in your project
- 📝 Supports direct library specification
- 📥 Downloads and installs rules into Cursor
- 🎨 Provides a colorful, user-friendly interface
- 🔀 Works with custom rule repositories
- 🔒 100% privacy-focused (all scanning happens locally)
- 🔄 GitHub API integration for reliable downloads

## Installation

```bash
pip install cursor-rules
```

## Basic Usage

```bash
# Scan current project and install matching rules
cursor-rules

# Specify libraries directly (skips project scanning)
cursor-rules --libraries "react,tailwind,typescript"

# Scan a specific project directory
cursor-rules -d /path/to/my/project
```

## Common Options

| Option | Description |
|--------|-------------|
| `--dry-run` | Preview without installing anything |
| `--force` | Replace existing rules |
| `-v, --verbose` | Show detailed output |
| `--quick-scan` | Faster scan (checks package files only) |
| `--max-results N` | Show top N results (default: 20) |

## Custom Repositories

```bash
# Use rules from your own GitHub repository
cursor-rules --source https://github.com/your-username/your-repo

# Save repository setting for future use
cursor-rules --source https://github.com/your-username/your-repo --save-config
```

## Repository URL Format

The tool now uses the GitHub API to reliably download rules. You can specify the repository URL in several formats:

```bash
# Standard GitHub repository URL (recommended)
cursor-rules --source https://github.com/username/repo

# With a specific branch
cursor-rules --source https://github.com/username/repo/tree/branch-name

# Legacy raw content URL will also work
cursor-rules --source https://raw.githubusercontent.com/username/repo/branch
```

## Configuration

```bash
# View current settings
cursor-rules --show-config

# Save settings globally
cursor-rules --save-config

# Save settings for current project only
cursor-rules --save-project-config
```

## Full Options Reference

Run `cursor-rules --help` to see all available options.
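A custom rules repository (see "Custom Repositories" above) is expected to mirror this project's layout: the downloader verifies that the target repo contains a `rules-mdc/` directory of `.mdc` files, and matching loads a `rules.json` index (see `downloader.py` later in this section). A sketch of the minimal layout:

```
your-repo/
├── rules.json        # index of rule names and tags (same schema as this repo's rules.json)
└── rules-mdc/
    ├── react.mdc
    ├── tailwind.mdc
    └── ...
```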
## License MIT ## Todo: - [ ] Test the custom repo feature ## /cursor-rules-cli/cursor-rules-cli-logo.jpeg Binary file available at https://raw.githubusercontent.com/sanjeed5/awesome-cursor-rules-mdc/refs/heads/main/cursor-rules-cli/cursor-rules-cli-logo.jpeg ## /cursor-rules-cli/cursor-rules-cli.cast ```cast path="/cursor-rules-cli/cursor-rules-cli.cast" {"version": 2, "width": 93, "height": 21, "timestamp": 1741028347, "env": {"SHELL": "/bin/zsh", "TERM": "xterm-256color"}} [0.984595, "o", "\u001b[1m\u001b[7m%\u001b[27m\u001b[1m\u001b[0m \r \r"] [0.985302, "o", "\u001b]2;stylumia@Sanjeeds-MacBook-Air:~/projects/agents/smolagents-tutorials\u0007\u001b]1;..nts-tutorials\u0007"] [0.987528, "o", "\u001b]7;file://Sanjeeds-MacBook-Air.local/Users/stylumia/projects/agents/smolagents-tutorials\u001b\\"] [1.005076, "o", "\u001b]697;OSCUnlock=7b6883b11bfa4f3a909885153dc9f047\u0007\u001b]697;Dir=/Users/stylumia/projects/agents/smolagents-tutorials\u0007\u001b]697;Shell=zsh\u0007"] [1.005108, "o", "\u001b]697;ShellPath=/bin/zsh\u0007\u001b]697;PID=89174\u0007\u001b]697;ExitCode=0\u0007"] [1.005248, "o", "\u001b]697;TTY=/dev/ttys141\u0007\u001b]697;Log=\u0007\u001b]697;ZshAutosuggestionColor=fg=8\u0007"] [1.00539, "o", "\u001b]697;FigAutosuggestionColor=\u0007\u001b]697;User=stylumia\u0007"] [1.007315, "o", "\r\u001b[0m\u001b[27m\u001b[24m\u001b[J\u001b]697;StartPrompt\u0007\u001b[01;32m➜ \u001b[36msmolagents-tutorials\u001b[00m \u001b]697;EndPrompt\u0007\u001b]697;NewCmd=7b6883b11bfa4f3a909885153dc9f047\u0007"] [1.00733, "o", "\u001b[K\u001b[68C\u001b]697;StartPrompt\u0007\u001b]697;EndPrompt\u0007\u001b[68D"] [1.007538, "o", "\u001b[?1h\u001b=\u001b[?2004h"] [1.059751, "o", "\r\r\u001b[0m\u001b[27m\u001b[24m\u001b[J\u001b]697;StartPrompt\u0007\u001b[01;32m➜ \u001b[36msmolagents-tutorials\u001b[00m \u001b[01;34mgit:(\u001b[31mmain\u001b[34m) \u001b[33m✗\u001b[00m \u001b]697;EndPrompt\u0007\u001b]697;NewCmd=7b6883b11bfa4f3a909885153dc9f047\u0007\u001b[K\u001b[55C\u001b]697;StartPrompt\u0007\u001b]697;EndPrompt\u0007\u001b[55D"] [1.754231, "o", "c"] [1.766068, "o", "\b\u001b[1m\u001b[31mc\u001b[0m\u001b[39m"] [1.766439, "o", "\b\u001b[1m\u001b[31mc\u001b[0m\u001b[39m\u001b[90mlear\u001b[39m\b\b\b\b"] [1.939123, "o", "\b\u001b[1m\u001b[31mc\u001b[1m\u001b[31mu\u001b[0m\u001b[39m\u001b[39m \u001b[39m \u001b[39m \b\b\b"] [1.941497, "o", "\b\b\u001b[0m\u001b[32mc\u001b[0m\u001b[32mu\u001b[39m"] [1.941812, "o", "\u001b[90mrsor-rules\u001b[39m\u001b[10D"] [2.120794, "o", "\b\b\u001b[32mc\u001b[32mu\u001b[32mr\u001b[39m"] [2.126876, "o", "\b\b\b\u001b[1m\u001b[31mc\u001b[1m\u001b[31mu\u001b[1m\u001b[31mr\u001b[0m\u001b[39m"] [2.484178, "o", "\u001b[39ms\u001b[39mo\u001b[39mr\u001b[39m-\u001b[39mr\u001b[39mu\u001b[39ml\u001b[39me\u001b[39ms"] [2.486415, "o", "\u001b[12D\u001b[0m\u001b[32mc\u001b[0m\u001b[32mu\u001b[0m\u001b[32mr\u001b[32ms\u001b[32mo\u001b[32mr\u001b[32m-\u001b[32mr\u001b[32mu\u001b[32ml\u001b[32me\u001b[32ms\u001b[39m"] [2.939399, "o", "\u001b[?1l\u001b>"] [2.939717, "o", "\u001b[?2004l"] [2.941204, "o", "\r\r\n"] [2.942476, "o", "\u001b]697;OSCLock=7b6883b11bfa4f3a909885153dc9f047\u0007"] [2.942588, "o", "\u001b]697;PreExec\u0007"] [2.942643, "o", "\u001b]2;cursor-rules\u0007\u001b]1;cursor-rules\u0007"] [3.164394, "o", "\u001b[32mINFO\u001b[0m: Scanning for libraries and frameworks...\r\n"] [3.164481, "o", "\u001b[32mINFO\u001b[0m: Scanning for libraries and frameworks...\r\n"] [3.166587, "o", "\u001b[32mINFO\u001b[0m: Detected 135 libraries/frameworks.\r\n"] [3.166609, "o", 
"\u001b[32mINFO\u001b[0m: Finding relevant rules...\r\n"] [3.167137, "o", "\u001b[32mINFO\u001b[0m: Successfully loaded rules.json from /Users/stylumia/projects/awesome-cursor-rules-mdc/cursor-rules-cli/rules.json\r\n"] [3.168112, "o", "\u001b[32mINFO\u001b[0m: Found \u001b[32m20\u001b[0m relevant rule files.\r\n"] [3.16821, "o", "\r\n\u001b[1m\u001b[34mAvailable Cursor rules for your project:\u001b[0m\r\n\r\n\u001b[1m\u001b[32mDirect Dependencies:\u001b[0m\r\n\u001b[32m1.\u001b[0m \u001b[36mscikit-learn\u001b[0m [ai, ml, machine-learning, python, data-science] (0.87)\r\n\u001b[32m2.\u001b[0m \u001b[36mpandas\u001b[0m [ai, ml, data-science, python, data-analysis] (0.87)\r\n\u001b[32m3.\u001b[0m \u001b[36mnumpy\u001b[0m [ai, ml, data-science, python, numerical-computing] (0.87)\r\n\u001b[32m4.\u001b[0m \u001b[36mscipy\u001b[0m [ai, ml, data-science, python, scientific-computing] (0.87)\r\n\u001b[32m5.\u001b[0m \u001b[36mtornado\u001b[0m [backend, framework, python, async] (0.83)\r\n\u001b[32m6.\u001b[0m \u001b[36msmolagents\u001b[0m [ai, ml, llm, python, agent-framework, lightweight] (0.82)\r\n\u001b[32m7.\u001b[0m \u001b[36msqlalchemy\u001b[0m [database, orm, python, sql] (0.82)\r\n\u001b[32m8.\u001b[0m \u001b[36msetuptools\u001b[0m [development, build-tool, python, packaging] (0.82)\r\n\u001b[32m9.\u001b[0m \u001b[36mpydantic\u001b[0m [development, python, data-validation, type-checking] (0.82)\r\n\u001b[32m10.\u001b[0m \u001b[36mlangchain\u001b[0m [ai, ml, llm, python] (0.82)\r\n\u001b[32m11.\u001b[0m \u001b[36mhttpx\u001b[0m [web, python, http-client, async] (0.82)\r\n\u001b[32m12.\u001b[0m \u001b[36maiohttp\u001b[0m [web, pyt"] [3.168309, "o", "hon, http-client, async] (0.82)\r\n\u001b[32m13.\u001b[0m \u001b[36mtransformers\u001b[0m [python, nlp, deep-learning, huggingface] (0.82)\r\n\u001b[32m14.\u001b[0m \u001b[36mrich\u001b[0m [python, utilities, terminal, formatting] (0.82)\r\n\u001b[32m15.\u001b[0m \u001b[36mrequests\u001b[0m [web, python, http-client] (0.81)\r\n\u001b[32m16.\u001b[0m \u001b[36mbeautifulsoup4\u001b[0m [python, web-scraping, html-parsing] (0.81)\r\n\u001b[32m17.\u001b[0m \u001b[36manyio\u001b[0m [python, async, compatibility-layer] (0.81)\r\n\u001b[32m18.\u001b[0m \u001b[36mtqdm\u001b[0m [python, utilities, progress-bar] (0.81)\r\n\u001b[32m19.\u001b[0m \u001b[36mclick\u001b[0m [python, utilities, cli] (0.81)\r\n"] [3.168328, "o", "\r\n\u001b[1m\u001b[33mOther Relevant Rules:\u001b[0m\r\n\u001b[32m20.\u001b[0m \u001b[36mpytorch\u001b[0m [ai, ml, machine-learning, python, deep-learning] (0.82)\r\n\r\n\u001b[1mSelect rules to install:\u001b[0m\r\n \u001b[33m* Enter comma-separated numbers (e.g., 1,3,5)\u001b[0m\r\n \u001b[33m* Type 'all' to select all rules\u001b[0m\r\n \u001b[33m* Type 'category:name' to select all rules in a category (e.g., 'category:development')\u001b[0m\r\n \u001b[33m* Type 'none' to cancel\u001b[0m\r\n\u001b[32m> \u001b[0m"] [4.962026, "o", "6"] [5.205157, "o", ","] [5.755048, "o", "9"] [6.427783, "o", ","] [6.737927, "o", "1"] [7.081027, "o", "6"] [7.560083, "o", "\r\n"] [7.633727, "o", "\u001b[32mINFO\u001b[0m: Downloaded smolagents\r\n"] [7.847926, "o", "\u001b[32mINFO\u001b[0m: Downloaded beautifulsoup4\r\n"] [7.942039, "o", "\u001b[32mINFO\u001b[0m: Downloaded pydantic\r\n"] [7.94271, "o", "\u001b[32mINFO\u001b[0m: Successfully downloaded all 3 rules\r\n"] [7.944752, "o", "\u001b[32mINFO\u001b[0m: Backed up existing rules to /Users/stylumia/projects/agents/smolagents-tutorials/.cursor/backups/rules_backup_20250304_002915\r\n"] [7.944899, 
"o", "\u001b[33mWARNING\u001b[0m: \u001b[33mSkipping smolagents: Rule already exists (use --force to overwrite)\u001b[0m\r\n"] [7.946423, "o", "\u001b[32mINFO\u001b[0m: Installed 2/3 rules to /Users/stylumia/projects/agents/smolagents-tutorials/.cursor/rules\r\n"] [7.946532, "o", "\u001b[32mINFO\u001b[0m: \u001b[32m✅ Successfully installed 2 rules!\u001b[0m\r\n\u001b[33mWARNING\u001b[0m: \u001b[33m\u001b[33m⚠️ Failed to install 1 rules:\u001b[0m\u001b[0m\r\n"] [7.946792, "o", "\u001b[0m"] [7.976129, "o", "\u001b[1m\u001b[7m%\u001b[27m\u001b[1m\u001b[0m \r \r"] [7.977311, "o", "\u001b]2;stylumia@Sanjeeds-MacBook-Air:~/projects/agents/smolagents-tutorials\u0007\u001b]1;..nts-tutorials\u0007"] [7.980009, "o", "\u001b]7;file://Sanjeeds-MacBook-Air.local/Users/stylumia/projects/agents/smolagents-tutorials\u001b\\"] [7.992933, "o", "\u001b]697;OSCUnlock=7b6883b11bfa4f3a909885153dc9f047\u0007\u001b]697;Dir=/Users/stylumia/projects/agents/smolagents-tutorials\u0007"] [7.993049, "o", "\u001b]697;Shell=zsh\u0007\u001b]697;ShellPath=/bin/zsh\u0007"] [7.993191, "o", "\u001b]697;PID=89174\u0007\u001b]697;ExitCode=0\u0007\u001b]697;TTY=/dev/ttys141\u0007\u001b]697;Log=\u0007\u001b]697;ZshAutosuggestionColor=fg=8\u0007\u001b]697;FigAutosuggestionColor=\u0007\u001b]697;User=stylumia\u0007"] [7.99525, "o", "\r\u001b[0m\u001b[27m\u001b[24m\u001b[J\u001b]697;StartPrompt\u0007\u001b[01;32m➜ \u001b[36msmolagents-tutorials\u001b[00m \u001b[01;34mgit:(\u001b[31mmain\u001b[34m) \u001b[33m✗\u001b[00m \u001b]697;EndPrompt\u0007\u001b]697;NewCmd=7b6883b11bfa4f3a909885153dc9f047\u0007"] [7.995282, "o", "\u001b[K\u001b[55C\u001b]697;StartPrompt\u0007\u001b]697;EndPrompt\u0007\u001b[55D"] [7.995396, "o", "\u001b[?1h\u001b=\u001b[?2004h"] [9.5534, "o", "e"] [9.558133, "o", "\b\u001b[1m\u001b[31me\u001b[0m\u001b[39m"] [9.558402, "o", "\b\u001b[1m\u001b[31me\u001b[0m\u001b[39m\u001b[90mxit\u001b[39m\b\b\b"] [9.752584, "o", "\b\u001b[1m\u001b[31me\u001b[1m\u001b[31mx\u001b[0m\u001b[39m"] [9.754766, "o", "\b\b\u001b[0m\u001b[32me\u001b[0m\u001b[32mx\u001b[39m"] [9.912208, "o", "\b\b\u001b[32me\u001b[32mx\u001b[32mi\u001b[39m"] [9.916324, "o", "\b\b\b\u001b[1m\u001b[31me\u001b[1m\u001b[31mx\u001b[1m\u001b[31mi\u001b[0m\u001b[39m"] [10.150086, "o", "\b\u001b[1m\u001b[31mi\u001b[1m\u001b[31mt\u001b[0m\u001b[39m"] [10.152603, "o", "\b\b\b\b\u001b[0m\u001b[32me\u001b[0m\u001b[32mx\u001b[0m\u001b[32mi\u001b[0m\u001b[32mt\u001b[39m"] [10.996501, "o", "\u001b[?1l\u001b>"] [10.996806, "o", "\u001b[?2004l"] [10.998773, "o", "\r\r\n"] [11.000081, "o", "\u001b]697;OSCLock=7b6883b11bfa4f3a909885153dc9f047\u0007"] [11.000111, "o", "\u001b]697;PreExec\u0007"] [11.000504, "o", "\u001b]2;exit\u0007\u001b]1;exit\u0007"] ``` ## /cursor-rules-cli/cursor-rules-cli.gif Binary file available at https://raw.githubusercontent.com/sanjeed5/awesome-cursor-rules-mdc/refs/heads/main/cursor-rules-cli/cursor-rules-cli.gif ## /cursor-rules-cli/cursor-rules-cli.png Binary file available at https://raw.githubusercontent.com/sanjeed5/awesome-cursor-rules-mdc/refs/heads/main/cursor-rules-cli/cursor-rules-cli.png ## /cursor-rules-cli/pyproject.toml ```toml path="/cursor-rules-cli/pyproject.toml" [build-system] requires = ["setuptools>=42", "wheel"] build-backend = "setuptools.build_meta" ``` ## /cursor-rules-cli/rules.json ```json path="/cursor-rules-cli/rules.json" { "libraries": [ { "name": "react", "tags": ["frontend", "framework", "javascript"] }, { "name": "react-native", "tags": ["frontend", "framework", "javascript", "mobile", "cross-platform"] }, { 
"name": "react-query", "tags": ["frontend", "javascript", "data-fetching"] }, { "name": "react-redux", "tags": ["frontend", "javascript", "state-management"] }, { "name": "react-mobx", "tags": ["frontend", "javascript", "state-management"] }, { "name": "next-js", "tags": ["frontend", "framework", "javascript", "react", "ssr"] }, { "name": "vue", "tags": ["frontend", "framework", "javascript"] }, { "name": "vue3", "tags": ["frontend", "framework", "javascript"] }, { "name": "nuxt", "tags": ["frontend", "framework", "javascript", "vue", "ssr"] }, { "name": "angular", "tags": ["frontend", "framework", "javascript", "typescript"] }, { "name": "svelte", "tags": ["frontend", "framework", "javascript"] }, { "name": "sveltekit", "tags": ["frontend", "framework", "javascript", "svelte", "ssr"] }, { "name": "solidjs", "tags": ["frontend", "framework", "javascript"] }, { "name": "qwik", "tags": ["frontend", "framework", "javascript"] }, { "name": "express", "tags": ["backend", "framework", "javascript", "nodejs"] }, { "name": "nestjs", "tags": ["backend", "framework", "javascript", "typescript", "nodejs"] }, { "name": "bun", "tags": ["backend", "javascript", "runtime", "nodejs-alternative"] }, { "name": "django", "tags": ["backend", "framework", "python", "orm", "full-stack"] }, { "name": "flask", "tags": ["backend", "framework", "python", "microframework"] }, { "name": "fastapi", "tags": ["backend", "framework", "python", "api", "async"] }, { "name": "pyramid", "tags": ["backend", "framework", "python"] }, { "name": "tornado", "tags": ["backend", "framework", "python", "async"] }, { "name": "sanic", "tags": ["backend", "framework", "python", "async"] }, { "name": "bottle", "tags": ["backend", "framework", "python", "microframework"] }, { "name": "laravel", "tags": ["backend", "framework", "php"] }, { "name": "springboot", "tags": ["backend", "framework", "java"] }, { "name": "fiber", "tags": ["backend", "framework", "go"] }, { "name": "servemux", "tags": ["backend", "framework", "go"] }, { "name": "phoenix", "tags": ["backend", "framework", "elixir"] }, { "name": "actix-web", "tags": ["backend", "framework", "rust"] }, { "name": "rocket", "tags": ["backend", "framework", "rust"] }, { "name": "shadcn", "tags": ["ui", "component-library", "react"] }, { "name": "chakra-ui", "tags": ["ui", "component-library", "react"] }, { "name": "material-ui", "tags": ["ui", "component-library", "react"] }, { "name": "tailwind", "tags": ["ui", "css", "utility-first"] }, { "name": "jetpack-compose", "tags": ["ui", "mobile", "android", "kotlin"] }, { "name": "tkinter", "tags": ["ui", "gui", "python", "desktop"] }, { "name": "pyqt", "tags": ["ui", "gui", "python", "desktop", "qt"] }, { "name": "pyside", "tags": ["ui", "gui", "python", "desktop", "qt"] }, { "name": "kivy", "tags": ["ui", "gui", "python", "cross-platform", "mobile"] }, { "name": "pygame", "tags": ["ui", "gui", "python", "game-development"] }, { "name": "customtkinter", "tags": ["ui", "gui", "python", "desktop", "tkinter"] }, { "name": "redux", "tags": ["state-management", "javascript", "react"] }, { "name": "mobx", "tags": ["state-management", "javascript", "react"] }, { "name": "zustand", "tags": ["state-management", "javascript", "react"] }, { "name": "riverpod", "tags": ["state-management", "flutter", "dart"] }, { "name": "supabase", "tags": ["database", "sql", "postgresql", "backend-as-service"] }, { "name": "postgresql", "tags": ["database", "sql", "relational"] }, { "name": "prisma", "tags": ["database", "orm", "typescript", "javascript"] }, { 
"name": "mongodb", "tags": ["database", "nosql", "document"] }, { "name": "redis", "tags": ["database", "nosql", "key-value", "in-memory"] }, { "name": "duckdb", "tags": ["database", "analytics", "sql", "olap"] }, { "name": "sqlalchemy", "tags": ["database", "orm", "python", "sql"] }, { "name": "peewee", "tags": ["database", "orm", "python", "sql"] }, { "name": "pony", "tags": ["database", "orm", "python", "sql"] }, { "name": "tortoise-orm", "tags": ["database", "orm", "python", "sql", "async"] }, { "name": "django-orm", "tags": ["database", "orm", "python", "sql", "django"] }, { "name": "vite", "tags": ["development", "build-tool", "javascript"] }, { "name": "webpack", "tags": ["development", "build-tool", "javascript"] }, { "name": "turbopack", "tags": ["development", "build-tool", "javascript"] }, { "name": "poetry", "tags": ["development", "build-tool", "python", "dependency-management"] }, { "name": "setuptools", "tags": ["development", "build-tool", "python", "packaging"] }, { "name": "jest", "tags": ["development", "testing", "javascript"] }, { "name": "detox", "tags": ["development", "testing", "javascript", "react-native", "e2e"] }, { "name": "playwright", "tags": ["development", "testing", "javascript", "e2e", "browser"] }, { "name": "vitest", "tags": ["development", "testing", "javascript", "vite"] }, { "name": "python", "tags": ["python"] }, { "name": "pytest", "tags": ["development", "testing", "python"] }, { "name": "unittest", "tags": ["development", "testing", "python", "standard-library"] }, { "name": "nose2", "tags": ["development", "testing", "python"] }, { "name": "hypothesis", "tags": ["development", "testing", "python", "property-based"] }, { "name": "behave", "tags": ["development", "testing", "python", "bdd"] }, { "name": "docker", "tags": ["development", "containerization", "devops"] }, { "name": "kubernetes", "tags": ["development", "containerization", "orchestration", "devops"] }, { "name": "git", "tags": ["development", "version-control"] }, { "name": "mkdocs", "tags": ["development", "documentation", "markdown"] }, { "name": "sphinx", "tags": ["development", "documentation", "python", "rst"] }, { "name": "pdoc", "tags": ["development", "documentation", "python", "auto-generation"] }, { "name": "github-actions", "tags": ["development", "ci-cd", "devops"] }, { "name": "terraform", "tags": ["development", "infrastructure", "iac", "devops"] }, { "name": "black", "tags": ["development", "python", "formatter", "linting"] }, { "name": "flake8", "tags": ["development", "python", "linting"] }, { "name": "pylint", "tags": ["development", "python", "linting", "static-analysis"] }, { "name": "mypy", "tags": ["development", "python", "type-checking", "static-analysis"] }, { "name": "isort", "tags": ["development", "python", "formatter", "imports"] }, { "name": "pydantic", "tags": ["development", "python", "data-validation", "type-checking"] }, { "name": "pyright", "tags": ["development", "python", "type-checking", "static-analysis"] }, { "name": "tauri", "tags": ["cross-platform", "desktop", "rust", "javascript"] }, { "name": "electron", "tags": ["cross-platform", "desktop", "javascript"] }, { "name": "expo", "tags": ["cross-platform", "mobile", "react-native"] }, { "name": "flutter", "tags": ["cross-platform", "mobile", "dart"] }, { "name": "pytorch", "tags": ["ai", "ml", "machine-learning", "python", "deep-learning"] }, { "name": "scikit-learn", "tags": ["ai", "ml", "machine-learning", "python", "data-science"] }, { "name": "pandas", "tags": ["ai", "ml", "data-science", 
"python", "data-analysis"] }, { "name": "tensorflow", "tags": ["ai", "ml", "machine-learning", "python", "deep-learning"] }, { "name": "keras", "tags": ["ai", "ml", "machine-learning", "python", "deep-learning"] }, { "name": "xgboost", "tags": ["ai", "ml", "machine-learning", "python", "gradient-boosting"] }, { "name": "lightgbm", "tags": ["ai", "ml", "machine-learning", "python", "gradient-boosting"] }, { "name": "cuda", "tags": ["ai", "ml", "gpu-computing", "parallel-computing"] }, { "name": "numba", "tags": ["ai", "ml", "gpu-computing", "python", "jit-compiler"] }, { "name": "langchain", "tags": ["ai", "ml", "llm", "python"] }, { "name": "huggingface", "tags": ["ai", "ml", "llm", "python", "transformers"] }, { "name": "vllm", "tags": ["ai", "ml", "llm", "python", "inference"] }, { "name": "llama-index", "tags": ["ai", "ml", "llm", "python", "rag"] }, { "name": "modal", "tags": ["ai", "ml", "cloud-inference", "serverless", "deployment"] }, { "name": "numpy", "tags": ["ai", "ml", "data-science", "python", "numerical-computing"] }, { "name": "scipy", "tags": ["ai", "ml", "data-science", "python", "scientific-computing"] }, { "name": "matplotlib", "tags": ["ai", "ml", "data-science", "python", "data-visualization"] }, { "name": "seaborn", "tags": ["ai", "ml", "data-science", "python", "data-visualization"] }, { "name": "plotly", "tags": ["ai", "ml", "data-science", "python", "interactive-visualization"] }, { "name": "statsmodels", "tags": ["ai", "ml", "data-science", "python", "statistics"] }, { "name": "dask", "tags": ["ai", "ml", "data-science", "python", "parallel-computing", "big-data"] }, { "name": "htmx", "tags": ["web", "javascript", "modern-patterns"] }, { "name": "trpc", "tags": ["web", "typescript", "api", "modern-patterns"] }, { "name": "typescript", "tags": ["web", "javascript", "type-checking", "language"] }, { "name": "zod", "tags": ["web", "typescript", "validation", "type-checking"] }, { "name": "axios", "tags": ["web", "javascript", "http-client"] }, { "name": "guzzle", "tags": ["web", "php", "http-client"] }, { "name": "requests", "tags": ["web", "python", "http-client"] }, { "name": "httpx", "tags": ["web", "python", "http-client", "async"] }, { "name": "aiohttp", "tags": ["web", "python", "http-client", "async"] }, { "name": "graphql", "tags": ["web", "api", "query-language"] }, { "name": "apollo-client", "tags": ["web", "api", "graphql", "javascript"] }, { "name": "flask-restful", "tags": ["web", "api", "python", "flask"] }, { "name": "solidity", "tags": ["blockchain", "ethereum", "smart-contracts", "language"] }, { "name": "hardhat", "tags": ["blockchain", "ethereum", "development", "javascript"] }, { "name": "vercel", "tags": ["cloud", "deployment", "serverless", "frontend"] }, { "name": "cloudflare", "tags": ["cloud", "deployment", "edge-computing", "cdn"] }, { "name": "aws-lambda", "tags": ["cloud", "serverless", "aws"] }, { "name": "aws", "tags": ["cloud", "major-platform"] }, { "name": "gcp", "tags": ["cloud", "major-platform"] }, { "name": "azure", "tags": ["cloud", "major-platform"] }, { "name": "beautifulsoup4", "tags": ["python", "web-scraping", "html-parsing"] }, { "name": "scrapy", "tags": ["python", "web-scraping", "crawler", "framework"] }, { "name": "selenium", "tags": ["python", "web-scraping", "browser-automation", "testing"] }, { "name": "asyncio", "tags": ["python", "async", "standard-library"] }, { "name": "trio", "tags": ["python", "async"] }, { "name": "anyio", "tags": ["python", "async", "compatibility-layer"] }, { "name": "nltk", "tags": 
["python", "nlp", "text-processing"] }, { "name": "spacy", "tags": ["python", "nlp", "text-processing"] }, { "name": "gensim", "tags": ["python", "nlp", "topic-modeling"] }, { "name": "transformers", "tags": ["python", "nlp", "deep-learning", "huggingface"] }, { "name": "pillow", "tags": ["python", "image-processing"] }, { "name": "opencv-python", "tags": ["python", "image-processing", "computer-vision"] }, { "name": "scikit-image", "tags": ["python", "image-processing", "scientific-computing"] }, { "name": "tqdm", "tags": ["python", "utilities", "progress-bar"] }, { "name": "rich", "tags": ["python", "utilities", "terminal", "formatting"] }, { "name": "click", "tags": ["python", "utilities", "cli"] }, { "name": "typer", "tags": ["python", "utilities", "cli"] }, { "name": "streamlit", "tags": ["python", "utilities", "data-apps", "dashboard"] }, { "name": "css", "tags": ["web", "frontend", "styling", "language"] }, { "name": "crewai", "tags": ["ai", "ml", "llm", "python", "agent-framework", "multi-agent"] }, { "name": "smolagents", "tags": ["ai", "ml", "llm", "python", "agent-framework", "lightweight"] }, { "name": "langgraph", "tags": ["ai", "ml", "llm", "python", "agent-framework", "workflow"] }, { "name": "autogen", "tags": ["ai", "ml", "llm", "python", "agent-framework", "multi-agent"] }, { "name": "llamaindex-js", "tags": ["ai", "ml", "llm", "javascript", "rag"] }, { "name": "langchain-js", "tags": ["ai", "ml", "llm", "javascript"] }, { "name": "asp-net", "tags": ["backend", "framework", "csharp", "microsoft"] }, { "name": "aws-amplify", "tags": ["cloud", "frontend", "aws", "full-stack"] }, { "name": "aws-cli", "tags": ["cloud", "devops", "aws", "command-line"] }, { "name": "aws-dynamodb", "tags": ["database", "nosql", "aws", "cloud"] }, { "name": "aws-ecs", "tags": ["cloud", "containerization", "aws", "orchestration"] }, { "name": "aws-rds", "tags": ["database", "sql", "aws", "cloud"] }, { "name": "amazon-ec2", "tags": ["cloud", "infrastructure", "aws", "virtual-machines"] }, { "name": "amazon-s3", "tags": ["cloud", "storage", "aws", "object-storage"] }, { "name": "android-sdk", "tags": ["mobile", "framework", "java", "kotlin"] }, { "name": "ansible", "tags": ["devops", "infrastructure", "automation", "configuration-management"] }, { "name": "ant-design", "tags": ["ui", "component-library", "react", "design-system"] }, { "name": "apollo-graphql", "tags": ["web", "api", "graphql", "javascript"] }, { "name": "astro", "tags": ["frontend", "framework", "javascript", "static-site"] }, { "name": "auth0", "tags": ["authentication", "security", "identity", "saas"] }, { "name": "azure-pipelines", "tags": ["devops", "ci-cd", "microsoft", "cloud"] }, { "name": "bash", "tags": ["shell", "scripting", "unix", "command-line"] }, { "name": "boto3", "tags": ["cloud", "aws", "python", "sdk"] }, { "name": "c-sharp", "tags": ["language", "microsoft", "dotnet", "backend"] }, { "name": "cheerio", "tags": ["web-scraping", "javascript", "html-parsing", "nodejs"] }, { "name": "circleci", "tags": ["devops", "ci-cd", "cloud", "automation"] }, { "name": "clerk", "tags": ["authentication", "security", "identity", "saas"] }, { "name": "codemirror", "tags": ["editor", "javascript", "text-editor", "code-editor"] }, { "name": "cypress", "tags": ["testing", "e2e", "javascript", "browser"] }, { "name": "d3", "tags": ["data-visualization", "javascript", "svg", "charts"] }, { "name": "datadog", "tags": ["monitoring", "observability", "devops", "cloud"] }, { "name": "deno", "tags": ["javascript", "runtime", "typescript", 
"nodejs-alternative"] }, { "name": "digitalocean", "tags": ["cloud", "hosting", "infrastructure", "paas"] }, { "name": "discord-api", "tags": ["api", "messaging", "gaming", "communication"] }, { "name": "django-rest-framework", "tags": ["api", "python", "django", "rest"] }, { "name": "drizzle", "tags": ["database", "orm", "typescript", "sql"] }, { "name": "elk-stack", "tags": ["logging", "monitoring", "search", "analytics"] }, { "name": "esbuild", "tags": ["build-tool", "javascript", "bundler", "performance"] }, { "name": "eslint", "tags": ["linting", "javascript", "static-analysis", "code-quality"] }, { "name": "elasticsearch", "tags": ["search", "database", "full-text", "analytics"] }, { "name": "emacs", "tags": ["editor", "text-editor", "lisp", "extensible"] }, { "name": "ffmpeg", "tags": ["multimedia", "video", "audio", "conversion"] }, { "name": "fabric-js", "tags": ["canvas", "graphics", "javascript", "interactive"] }, { "name": "firebase", "tags": ["backend-as-service", "database", "authentication", "google"] }, { "name": "fontawesome", "tags": ["icons", "ui", "web", "design"] }, { "name": "gcp-cli", "tags": ["cloud", "google", "command-line", "devops"] }, { "name": "gitlab-ci", "tags": ["devops", "ci-cd", "automation", "git"] }, { "name": "go", "tags": ["language", "backend", "performance", "google"] }, { "name": "godot", "tags": ["game-development", "engine", "cross-platform", "open-source"] }, { "name": "google-maps-js", "tags": ["maps", "geolocation", "javascript", "api"] }, { "name": "gradle", "tags": ["build-tool", "java", "android", "automation"] }, { "name": "grafana", "tags": ["monitoring", "visualization", "dashboards", "observability"] }, { "name": "heroku", "tags": ["cloud", "paas", "hosting", "deployment"] }, { "name": "insomnia", "tags": ["api", "testing", "development", "http-client"] }, { "name": "ionic", "tags": ["mobile", "framework", "cross-platform", "javascript"] }, { "name": "jax", "tags": ["ai", "ml", "numerical-computing", "python"] }, { "name": "junit", "tags": ["testing", "java", "unit-testing", "framework"] }, { "name": "java", "tags": ["language", "backend", "enterprise", "jvm"] }, { "name": "jenkins", "tags": ["devops", "ci-cd", "automation", "build"] }, { "name": "jquery", "tags": ["javascript", "dom", "library", "frontend"] }, { "name": "llvm", "tags": ["compiler", "infrastructure", "optimization", "toolchain"] }, { "name": "mlx", "tags": ["ai", "ml", "apple", "deep-learning"] }, { "name": "maven", "tags": ["build-tool", "java", "dependency-management", "project-management"] }, { "name": "microsoft-teams", "tags": ["collaboration", "communication", "microsoft", "enterprise"] }, { "name": "mockito", "tags": ["testing", "java", "mocking", "unit-testing"] }, { "name": "neo4j", "tags": ["database", "graph", "nosql", "relationships"] }, { "name": "netlify", "tags": ["hosting", "deployment", "jamstack", "frontend"] }, { "name": "nginx", "tags": ["web-server", "proxy", "load-balancer", "performance"] }, { "name": "notion-api", "tags": ["api", "productivity", "collaboration", "integration"] }, { "name": "openai", "tags": ["ai", "ml", "llm", "api"] }, { "name": "php", "tags": ["language", "backend", "web", "server-side"] }, { "name": "postman", "tags": ["api", "testing", "development", "http-client"] }, { "name": "puppeteer", "tags": ["web-scraping", "browser-automation", "testing", "javascript"] }, { "name": "ros", "tags": ["robotics", "framework", "middleware", "distributed-systems"] }, { "name": "railway", "tags": ["hosting", "deployment", "paas", "devops"] 
}, { "name": "remix", "tags": ["frontend", "framework", "react", "javascript"] }, { "name": "ruby", "tags": ["language", "backend", "web", "scripting"] }, { "name": "rust", "tags": ["language", "systems", "performance", "safety"] }, { "name": "sqlite", "tags": ["database", "sql", "embedded", "lightweight"] }, { "name": "sentry", "tags": ["error-tracking", "monitoring", "debugging", "observability"] }, { "name": "socket-io", "tags": ["websockets", "real-time", "javascript", "communication"] }, { "name": "spring", "tags": ["backend", "framework", "java", "enterprise"] }, { "name": "stripe", "tags": ["payments", "api", "e-commerce", "financial"] }, { "name": "three-js", "tags": ["3d", "graphics", "webgl", "javascript"] }, { "name": "tinygrad", "tags": ["ai", "ml", "deep-learning", "lightweight"] }, { "name": "unity", "tags": ["game-development", "engine", "cross-platform", "c-sharp"] }, { "name": "unreal-engine", "tags": ["game-development", "engine", "cross-platform", "c++"] }, { "name": "vim", "tags": ["editor", "text-editor", "terminal", "productivity"] }, { "name": "zsh", "tags": ["shell", "command-line", "unix", "terminal"] } ] } ``` ## /cursor-rules-cli/setup.py ```py path="/cursor-rules-cli/setup.py" #!/usr/bin/env python """ Setup script for cursor-rules. """ from setuptools import setup, find_packages import os import shutil from pathlib import Path # Read version from __init__.py with open(os.path.join("src", "__init__.py"), "r") as f: for line in f: if line.startswith("__version__"): version = line.split("=")[1].strip().strip('"').strip("'") break else: version = "0.1.0" # Read long description from README.md long_description = "A CLI tool to scan projects and install relevant Cursor rules (.mdc files)." readme_path = Path("README.md") if readme_path.exists(): with open(readme_path, "r", encoding="utf-8") as f: long_description = f.read() # Always copy the latest rules.json from project root to ensure consistency project_root = Path(__file__).parent.parent root_rules_json = project_root / "rules.json" package_rules_json = Path("rules.json") if root_rules_json.exists(): # Copy to package root only shutil.copy2(root_rules_json, package_rules_json) print(f"Copied rules.json from project root to package root") else: print("Warning: rules.json not found in project root") setup( name="cursor-rules", version=version, description="A CLI tool to scan projects and install relevant Cursor rules", long_description=long_description, long_description_content_type="text/markdown", author="sanjeed5", author_email="hi@sanjeed.in", url="https://github.com/sanjeed5/awesome-cursor-rules-mdc", package_dir={"cursor_rules_cli": "src"}, packages=["cursor_rules_cli"], include_package_data=True, package_data={ "cursor_rules_cli": ["*.json"], }, entry_points={ "console_scripts": [ "cursor-rules=cursor_rules_cli.main:main", ], }, python_requires=">=3.8", keywords="cursor, rules, mdc, cli", classifiers=[ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.9", "Programming Language :: Python :: 3.10", "Programming Language :: Python :: 3.11", ], install_requires=[ "requests>=2.25.0", "colorama>=0.4.4", "tqdm>=4.62.0", "urllib3>=2.0.0", "validators>=0.20.0", ] ) ``` ## /cursor-rules-cli/src/__init__.py ```py path="/cursor-rules-cli/src/__init__.py" """ Cursor Rules CLI - A tool to scan projects and suggest relevant Cursor rules """ 
__version__ = "0.5.2" ``` ## /cursor-rules-cli/src/downloader.py ```py path="/cursor-rules-cli/src/downloader.py" """ Downloader module for downloading MDC rule files. This module handles downloading the selected MDC rule files from the repository. """ import os import time import logging import requests import re import base64 from pathlib import Path from typing import Dict, List, Any, Optional, Tuple from concurrent.futures import ThreadPoolExecutor, as_completed from urllib3.util.retry import Retry from requests.adapters import HTTPAdapter from cursor_rules_cli import utils logger = logging.getLogger(__name__) # Rate limiting settings DEFAULT_RATE_LIMIT = 10 # requests per second DEFAULT_MAX_RETRIES = 3 DEFAULT_RETRY_DELAY = 2 # seconds DEFAULT_TIMEOUT = 10 # seconds class DownloadError(Exception): """Custom exception for download errors.""" pass class ValidationError(Exception): """Custom exception for validation errors.""" pass def extract_repo_info(source_url: str) -> Tuple[str, str, str]: """ Extract owner and repo name from GitHub URL. Args: source_url: GitHub URL Returns: Tuple of (owner, repo, branch) Raises: ValueError: If URL is not a valid GitHub repository URL """ # Handle various GitHub URL formats github_patterns = [ r"https?://github\.com/([^/]+)/([^/]+)(?:/tree/([^/]+))?", # github.com URLs r"https?://raw\.githubusercontent\.com/([^/]+)/([^/]+)/([^/]+)" # raw.githubusercontent.com URLs ] for pattern in github_patterns: match = re.match(pattern, source_url) if match: groups = match.groups() owner = groups[0] repo = groups[1] branch = groups[2] if len(groups) > 2 and groups[2] else "main" return owner, repo, branch raise ValueError(f"Invalid GitHub URL format: {source_url}") def create_session() -> requests.Session: """ Create a requests session with retry configuration. Returns: Configured requests session """ session = requests.Session() # Configure retry strategy retry_strategy = Retry( total=DEFAULT_MAX_RETRIES, backoff_factor=0.5, status_forcelist=[429, 500, 502, 503, 504], ) # Mount the retry adapter adapter = HTTPAdapter(max_retries=retry_strategy) session.mount("https://", adapter) session.mount("http://", adapter) return session def verify_source_url(source_url: str, session: requests.Session = None) -> Tuple[bool, str]: """ Verify that the source URL is a valid GitHub repository. 
Args: source_url: GitHub repository URL session: Optional requests session to use Returns: Tuple of (is_accessible: bool, error_message: str) """ if not session: session = create_session() try: # Extract repo information owner, repo, branch = extract_repo_info(source_url) # Check if the repository exists using GitHub API api_url = f"https://api.github.com/repos/{owner}/{repo}" response = session.get(api_url, timeout=DEFAULT_TIMEOUT) if response.status_code >= 400: return False, f"GitHub repository not found: {owner}/{repo} (Status code: {response.status_code})" # Check if the branch exists branches_url = f"{api_url}/branches/{branch}" response = session.get(branches_url, timeout=DEFAULT_TIMEOUT) if response.status_code >= 400: return False, f"Branch not found: {branch} (Status code: {response.status_code})" # Check if the rules-mdc directory exists contents_url = f"{api_url}/contents/rules-mdc?ref={branch}" response = session.get(contents_url, timeout=DEFAULT_TIMEOUT) if response.status_code >= 400: return False, f"Rules directory not found: rules-mdc (Status code: {response.status_code})" return True, "" except ValueError as e: return False, str(e) except requests.RequestException as e: return False, f"Failed to connect to GitHub: {e}" except Exception as e: return False, f"Unexpected error verifying source URL: {e}" def download_rules( rules: List[Dict[str, Any]], source_url: str, temp_dir: Optional[Path] = None, rate_limit: int = DEFAULT_RATE_LIMIT, max_retries: int = DEFAULT_MAX_RETRIES, max_workers: int = 4, ) -> List[Dict[str, Any]]: """ Download selected MDC rule files from GitHub. Args: rules: List of rule metadata to download source_url: GitHub repository URL temp_dir: Temporary directory to store downloaded files rate_limit: Maximum requests per second max_retries: Maximum number of retries for failed downloads max_workers: Maximum number of concurrent downloads Returns: List of downloaded rule metadata with local file paths Raises: DownloadError: If there are critical download failures """ if not rules: logger.warning("No rules to download") return [] # Create temporary directory if not provided if temp_dir is None: temp_dir = Path.home() / ".cursor-rules-cli" / "temp" temp_dir.mkdir(parents=True, exist_ok=True) logger.debug(f"Using temporary directory: {temp_dir}") # Create rate limiter and session rate_limiter = utils.RateLimiter(rate_limit) session = create_session() # Verify source URL is accessible is_accessible, error_msg = verify_source_url(source_url, session) if not is_accessible: logger.error(f"Source URL verification failed: {error_msg}") logger.error(f"Please check if the source URL is correct: {source_url}") raise DownloadError(f"Source URL is not accessible: {error_msg}") # Get repository information try: owner, repo, branch = extract_repo_info(source_url) logger.info(f"Using GitHub repository: {owner}/{repo}, branch: {branch}") except ValueError as e: logger.error(f"Invalid GitHub URL: {str(e)}") raise DownloadError(f"Invalid GitHub URL: {str(e)}") # Download rules in parallel downloaded_rules = [] failed_downloads = [] with ThreadPoolExecutor(max_workers=max_workers) as executor: # Submit download tasks future_to_rule = { executor.submit( download_rule_from_github, rule, owner, repo, branch, temp_dir, rate_limiter, session, max_retries ): rule for rule in rules } # Process results as they complete for future in as_completed(future_to_rule): rule = future_to_rule[future] try: result = future.result() if result: downloaded_rules.append(result) 
logger.info(f"Downloaded {rule['name']}") else: failed_downloads.append(rule) logger.error(f"Failed to download {rule['name']}") except Exception as e: failed_downloads.append(rule) logger.error(f"Error downloading {rule['name']}: {str(e)}") # Close the session session.close() # Report download statistics total_rules = len(rules) success_count = len(downloaded_rules) failed_count = len(failed_downloads) if failed_count > 0: failed_names = [rule['name'] for rule in failed_downloads] logger.warning( f"Downloaded {success_count}/{total_rules} rules. " f"Failed to download {failed_count} rules: {', '.join(failed_names)}" ) if failed_count == total_rules: logger.error("All downloads failed. Please check your internet connection and the source URL.") logger.error(f"Source URL: {source_url}") raise DownloadError("All downloads failed. Check internet connection and source URL.") else: logger.info(f"Successfully downloaded all {success_count} rules") return downloaded_rules def download_rule_from_github( rule: Dict[str, Any], owner: str, repo: str, branch: str, temp_dir: Path, rate_limiter: utils.RateLimiter, session: requests.Session, max_retries: int, ) -> Optional[Dict[str, Any]]: """ Download a single MDC rule file from GitHub with validation. Args: rule: Rule metadata owner: GitHub repository owner repo: GitHub repository name branch: GitHub repository branch temp_dir: Temporary directory to store downloaded file rate_limiter: Rate limiter instance session: Requests session max_retries: Maximum number of retries Returns: Updated rule metadata with local file path or None if failed Raises: ValidationError: If the downloaded content fails validation """ name = rule["name"] file_path = f"rules-mdc/{name}.mdc" # Create the GitHub API URL for the file api_url = f"https://api.github.com/repos/{owner}/{repo}/contents/{file_path}?ref={branch}" # Create local file path local_path = temp_dir / f"{name}.mdc" # Try to download the file for attempt in range(max_retries + 1): try: # Respect rate limit rate_limiter.wait() # Download the file using GitHub API response = session.get(api_url, timeout=DEFAULT_TIMEOUT) response.raise_for_status() # Extract content from GitHub API response data = response.json() if "content" not in data: raise ValidationError(f"GitHub API response doesn't contain file content for {name}") # Decode base64 content content = base64.b64decode(data["content"].replace("\n", "")).decode("utf-8") # Validate content is_valid, error_msg = utils.validate_mdc_content(content) if not is_valid: logger.error(f"Content validation failed for {name}: {error_msg}") raise ValidationError(f"Content validation failed: {error_msg}") # Calculate content hash before saving content_hash = utils.calculate_content_hash(content) logger.debug(f"Content hash for {name}: {content_hash}") # Save the file with open(local_path, "w", encoding="utf-8") as f: f.write(content) # Verify the saved file saved_hash = utils.calculate_file_hash(local_path) logger.debug(f"Saved file hash for {name}: {saved_hash}") if saved_hash != content_hash: logger.error(f"File integrity check failed for {name}. Content hash: {content_hash}, File hash: {saved_hash}") # Fix: Read the file back and compare the content with open(local_path, "r", encoding="utf-8") as f: saved_content = f.read() if content == saved_content: logger.info(f"Content matches but hashes differ for {name}. This may be due to line ending differences. 
Proceeding anyway.") # Update rule metadata and continue rule["local_path"] = str(local_path) rule["content"] = content rule["hash"] = saved_hash # Use the file hash since that's what we'll verify against later return rule else: logger.error(f"Content mismatch for {name}") raise ValidationError("File integrity check failed") # Update rule metadata rule["local_path"] = str(local_path) rule["content"] = content rule["hash"] = content_hash return rule except requests.RequestException as e: if attempt < max_retries: delay = DEFAULT_RETRY_DELAY * (attempt + 1) logger.warning(f"Attempt {attempt + 1}/{max_retries + 1} failed for {name}: {e}") logger.warning(f"Retrying in {delay} seconds...") time.sleep(delay) else: logger.error(f"Failed to download {name} after {max_retries + 1} attempts: {e}") return None except ValidationError as e: logger.error(f"Validation failed for {name}: {e}") return None except Exception as e: logger.error(f"Unexpected error downloading {name}: {e}") return None return None def preview_rule_content(rule: Dict[str, Any], max_lines: int = 10) -> str: """ Generate a preview of the rule content. Args: rule: Rule metadata with content max_lines: Maximum number of lines to include Returns: Preview of the rule content """ if "content" not in rule: return "Content not available" lines = rule["content"].splitlines() if len(lines) <= max_lines: return rule["content"] # Show first few lines return "\n".join(lines[:max_lines]) + f"\n... (and {len(lines) - max_lines} more lines)" if __name__ == "__main__": # For testing import json logging.basicConfig(level=logging.DEBUG) # Example rule test_rule = { "name": "react", "tags": ["frontend", "framework", "javascript"], "path": "rules-mdc/react.mdc", "url": "https://raw.githubusercontent.com/sanjeed5/awesome-cursor-rules-mdc/main/rules-mdc/react.mdc", "description": "react (frontend, framework, javascript)", } # Test download against the default source repository try: downloaded = download_rules([test_rule], "https://github.com/sanjeed5/awesome-cursor-rules-mdc") except DownloadError as e: print(f"Download failed: {e}") downloaded = [] if downloaded: print(f"Downloaded rule: {downloaded[0]['name']}") print("Preview:") print(preview_rule_content(downloaded[0])) ``` ## /cursor-rules-cli/src/installer.py ```py path="/cursor-rules-cli/src/installer.py" """ Installer module for installing MDC rule files. This module handles installing the downloaded MDC rule files to the project's .cursor/rules directory. """ import os import shutil import logging from pathlib import Path from typing import Dict, List, Any, Optional from datetime import datetime logger = logging.getLogger(__name__) def install_rules( rules: List[Dict[str, Any]], force: bool = False, cursor_dir: Optional[Path] = None, backup: bool = True, ) -> Dict[str, List]: """ Install downloaded MDC rule files to the project's .cursor/rules directory.
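    A minimal sketch of the expected call shape (paths are hypothetical):

        rules = [{"name": "react", "local_path": "/tmp/react.mdc"}]
        result = install_rules(rules, force=True)
        # result == {"installed": [...], "failed": [...]}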
Args: rules: List of rule metadata with local file paths force: Whether to overwrite existing rules cursor_dir: Path to .cursor directory (defaults to ./.cursor in current directory) backup: Whether to backup existing rules Returns: Dictionary with 'installed' and 'failed' lists """ result = { "installed": [], "failed": [] } if not rules: logger.warning("No rules to install") return result # Determine .cursor directory - use project local directory if cursor_dir is None: cursor_dir = Path.cwd() / ".cursor" # Create rules directory if it doesn't exist rules_dir = cursor_dir / "rules" rules_dir.mkdir(parents=True, exist_ok=True) logger.debug(f"Installing rules to {rules_dir}") # Backup existing rules if needed if backup and any(rules_dir.glob("*.mdc")): backup_dir = create_backup(rules_dir) if backup_dir: logger.info(f"Backed up existing rules to {backup_dir}") # Install each rule for rule in rules: if "local_path" not in rule: failure = { "rule": rule, "error": "No local file path" } result["failed"].append(failure) logger.warning(f"Skipping {rule['name']}: No local file path") continue # Determine target path target_path = rules_dir / f"{rule['name']}.mdc" # Check if rule already exists if target_path.exists() and not force: failure = { "rule": rule, "error": "Rule already exists (use --force to overwrite)" } result["failed"].append(failure) logger.warning(f"Skipping {rule['name']}: Rule already exists (use --force to overwrite)") continue # Copy the rule file try: shutil.copy2(rule["local_path"], target_path) logger.debug(f"Installed {rule['name']} to {target_path}") result["installed"].append(rule) except IOError as e: failure = { "rule": rule, "error": str(e) } result["failed"].append(failure) logger.error(f"Failed to install {rule['name']}: {e}") logger.info(f"Installed {len(result['installed'])}/{len(rules)} rules to {rules_dir}") return result def create_backup(rules_dir: Path) -> Optional[Path]: """ Create a backup of existing rules in the project directory. Args: rules_dir: Path to .cursor/rules directory Returns: Path to backup directory or None if failed """ timestamp = datetime.now().strftime("%Y%m%d_%H%M%S") # Keep backups in the project directory under .cursor/backups backup_dir = rules_dir.parent / "backups" / f"rules_backup_{timestamp}" try: # Create backup directory backup_dir.mkdir(parents=True, exist_ok=True) # Copy existing rules for rule_file in rules_dir.glob("*.mdc"): shutil.copy2(rule_file, backup_dir / rule_file.name) return backup_dir except IOError as e: logger.error(f"Failed to create backup: {e}") return None def list_installed_rules(cursor_dir: Optional[Path] = None) -> List[Dict[str, Any]]: """ List installed MDC rule files. 
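    Illustrative return shape (values hypothetical):

        [{"name": "react",
          "path": ".cursor/rules/react.mdc",
          "description": "react (frontend, framework, javascript)"}]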
Args: cursor_dir: Path to .cursor directory (defaults to ./.cursor in current directory) Returns: List of installed rule metadata """ # Determine .cursor directory - use project local directory if cursor_dir is None: cursor_dir = Path.cwd() / ".cursor" rules_dir = cursor_dir / "rules" if not rules_dir.exists(): logger.debug(f"Rules directory not found: {rules_dir}") return [] installed_rules = [] for rule_file in rules_dir.glob("*.mdc"): # Extract rule name from filename name = rule_file.stem # Read first few lines to extract description try: with open(rule_file, "r", encoding="utf-8") as f: content = f.read(1000) # Read first 1000 chars # Try to extract description from frontmatter description = name if "description:" in content: desc_line = [line for line in content.split("\n") if "description:" in line] if desc_line: description = desc_line[0].split("description:")[1].strip() except IOError: description = name installed_rules.append({ "name": name, "path": str(rule_file), "description": description, }) return installed_rules if __name__ == "__main__": # For testing import sys logging.basicConfig(level=logging.DEBUG) # List installed rules rules = list_installed_rules() print(f"Installed rules: {len(rules)}") for rule in rules: print(f" - {rule['name']}: {rule['description']}") ``` ## /cursor-rules-cli/src/main.py ```py path="/cursor-rules-cli/src/main.py" #!/usr/bin/env python """ cursor-rules-cli: A tool to scan projects and suggest relevant Cursor rules This module serves as the entry point for the CLI tool. """ import os import sys import logging import argparse from pathlib import Path from typing import Dict, Any, List import json from colorama import Fore, Style, init as init_colorama # Initialize colorama init_colorama() # Import local modules from cursor_rules_cli.scanner import scan_project, scan_package_files from cursor_rules_cli.matcher import match_libraries from cursor_rules_cli.downloader import download_rules from cursor_rules_cli.installer import install_rules from cursor_rules_cli.utils import ( load_config, save_config, get_config_file, load_project_config, save_project_config, get_project_config_file, merge_configs, validate_github_repo, DEFAULT_RULES_PATH ) # Configure logging with colors class ColoredFormatter(logging.Formatter): """Custom formatter to add colors to log messages.""" COLORS = { 'DEBUG': Fore.CYAN, 'INFO': Fore.GREEN, 'WARNING': Fore.YELLOW, 'ERROR': Fore.RED, 'CRITICAL': Fore.RED + Style.BRIGHT } def format(self, record): levelname = record.levelname if levelname in self.COLORS: record.levelname = f"{self.COLORS[levelname]}{levelname}{Style.RESET_ALL}" if record.levelno >= logging.WARNING: record.msg = f"{self.COLORS[levelname]}{record.msg}{Style.RESET_ALL}" return super().format(record) # Configure logging handler = logging.StreamHandler() handler.setFormatter(ColoredFormatter("%(levelname)s: %(message)s")) logging.basicConfig( level=logging.INFO, handlers=[handler] ) logger = logging.getLogger(__name__) def parse_args(): """Parse command line arguments.""" parser = argparse.ArgumentParser( description="Scan your project and install relevant Cursor rules (.mdc files)." 
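        # Illustrative invocations (console-script name assumed):
        #   cursor-rules-cli --dry-run
        #   cursor-rules-cli --libraries react,vue --min-score 0.6 --force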
) parser.add_argument( "-d", "--directory", default=".", help="Project directory to scan (default: current directory)" ) parser.add_argument( "--dry-run", action="store_true", help="Show what would be done without making changes" ) parser.add_argument( "--force", action="store_true", help="Force overwrite existing rules" ) parser.add_argument( "--source", default="https://github.com/sanjeed5/awesome-cursor-rules-mdc", help="GitHub repository URL for downloading rules" ) parser.add_argument( "--custom-repo", default=None, help="GitHub username/repo for a forked repository (e.g., 'username/repo')" ) parser.add_argument( "--set-repo", action="store_true", help="Set custom repository without running scan" ) parser.add_argument( "--rules-json", default=None, help="Path to custom rules.json file" ) parser.add_argument( "--save-config", action="store_true", help="Save current settings as default configuration" ) parser.add_argument( "--save-project-config", action="store_true", help="Save current settings as project-specific configuration" ) parser.add_argument( "--show-config", action="store_true", help="Show current configuration" ) parser.add_argument( "--quick-scan", action="store_true", help="Perform a quick scan (only check package files, not imports)" ) parser.add_argument( "--max-results", type=int, default=20, help="Maximum number of rules to display (default: 20)" ) parser.add_argument( "--min-score", type=float, default=0.5, help="Minimum relevance score for rules (0-1, default: 0.5)" ) parser.add_argument( "--libraries", type=str, help="Comma-separated list of libraries to match directly (e.g., 'react,vue,django')" ) parser.add_argument( "-v", "--verbose", action="store_true", help="Enable verbose output" ) return parser.parse_args() def display_config(config: Dict[str, Any], global_config: Dict[str, Any], project_config: Dict[str, Any]): """Display the current configuration.""" print(f"\n{Style.BRIGHT}{Fore.BLUE}Current Configuration:{Style.RESET_ALL}") # First display CLI-wide settings cli_wide_settings = ["custom_repo", "source"] if any(setting in config for setting in cli_wide_settings): print(f"\n{Fore.BLUE}CLI-wide settings:{Style.RESET_ALL}") for key in cli_wide_settings: if key in config: source = f" {Fore.GREEN}(global){Style.RESET_ALL}" if key in global_config else f" {Fore.YELLOW}(default){Style.RESET_ALL}" print(f" {Fore.BLUE}{key}{Style.RESET_ALL}: {config[key]}{source}") # Then display project-specific settings project_settings = [k for k in config.keys() if k not in cli_wide_settings] if project_settings: print(f"\n{Fore.BLUE}Project-specific settings:{Style.RESET_ALL}") for key in project_settings: source = "" if key in project_config: source = f" {Fore.CYAN}(project){Style.RESET_ALL}" elif key in global_config: source = f" {Fore.GREEN}(global){Style.RESET_ALL}" else: source = f" {Fore.YELLOW}(default){Style.RESET_ALL}" print(f" {Fore.BLUE}{key}{Style.RESET_ALL}: {config[key]}{source}") print() def main(): """Main entry point for the CLI.""" # Parse command line arguments args = parse_args() # Convert directory to Path project_dir = Path(args.directory).resolve() # Load configurations global_config = load_config() project_config = load_project_config(project_dir) # Merge configurations (project config takes precedence) config = merge_configs(global_config, project_config) # Ensure rules_json is in config if "rules_json" not in config: config["rules_json"] = str(DEFAULT_RULES_PATH) # Ensure source is in config if "source" not in config: config["source"] = 
"https://raw.githubusercontent.com/sanjeed5/awesome-cursor-rules-mdc/main" # Handle direct library input if provided libraries_directly_provided = False if args.libraries: libraries = [lib.strip() for lib in args.libraries.split(",") if lib.strip()] if not libraries: logger.error("No valid libraries provided") return 1 logger.info(f"Using directly provided libraries: {', '.join(libraries)}") detected_libraries = libraries libraries_directly_provided = True else: # Run the scanning phase try: # Use quick scan if requested scan_start_msg = "Quick scanning" if args.quick_scan else "Scanning" logger.info(f"{scan_start_msg} for libraries and frameworks...") # Scan project for libraries logger.info("Scanning for libraries and frameworks...") detected_libraries = scan_project( project_dir=project_dir, quick_scan=args.quick_scan, rules_path=config["rules_json"], use_cache=not args.force ) # Get direct match libraries from package files direct_match_libraries = scan_package_files(Path(project_dir)) logger.info(f"Detected {len(detected_libraries)} libraries/frameworks.") # Match libraries with rules logger.info("Finding relevant rules...") matching_rules = match_libraries( detected_libraries=detected_libraries, source_url=config["source"], direct_match_libraries=direct_match_libraries, custom_json_path=config["rules_json"], max_results=args.max_results, min_score=args.min_score ) if not matching_rules: logger.warning("No matching libraries found for your project.") return 0 logger.info(f"Found {Fore.GREEN}{len(matching_rules)}{Style.RESET_ALL} relevant rule files.") # Display and select rules to download selected_rules = display_matched_rules(matching_rules, args.max_results) if not selected_rules: logger.info("No rules selected. Exiting.") return 0 # Download selected rules if args.dry_run: logger.info(f"{Fore.YELLOW}DRY RUN:{Style.RESET_ALL} Would download the following rules:") for rule in selected_rules: logger.info(f" - {Fore.CYAN}{rule}{Style.RESET_ALL}") else: try: # Normalize source URL if needed (remove trailing slashes) source_url = config["source"].rstrip('/') # Log source information logger.info(f"Using source URL: {source_url}") # Download selected rules downloaded_rules = download_rules(selected_rules, source_url) # Install downloaded rules result = install_rules(downloaded_rules, force=args.force) if result["installed"]: logger.info(f"{Fore.GREEN}✅ Successfully installed {len(result['installed'])} rules!{Style.RESET_ALL}") if result["failed"]: logger.warning(f"{Fore.YELLOW}⚠️ Failed to install {len(result['failed'])} rules:{Style.RESET_ALL}") for rule in result["failed"]: logger.warning(f" - {Fore.CYAN}{rule}{Style.RESET_ALL}") except Exception as e: logger.error(f"An error occurred: {str(e)}") return 1 except KeyboardInterrupt: logger.info(f"\n{Fore.YELLOW}Operation cancelled by user.{Style.RESET_ALL}") return 130 except Exception as e: logger.error(f"An error occurred: {e}") if args.verbose: import traceback traceback.print_exc() return 1 # Override config with command line arguments if args.custom_repo is not None: # Validate custom repo if provided if args.custom_repo and not validate_github_repo(args.custom_repo): logger.error(f"{Fore.RED}Invalid GitHub repository: {args.custom_repo}{Style.RESET_ALL}") logger.error(f"{Fore.RED}Repository must exist and contain a rules.json file.{Style.RESET_ALL}") return 1 config["custom_repo"] = args.custom_repo elif "custom_repo" not in config: config["custom_repo"] = None if args.rules_json is not None: config["rules_json"] = args.rules_json 
elif "rules_json" not in config: config["rules_json"] = str(DEFAULT_RULES_PATH) if args.source != "https://raw.githubusercontent.com/sanjeed5/awesome-cursor-rules-mdc/main": config["source"] = args.source elif "source" not in config: config["source"] = "https://raw.githubusercontent.com/sanjeed5/awesome-cursor-rules-mdc/main" # Set custom repository without running scan if requested if args.set_repo: if args.custom_repo is None: logger.error(f"{Fore.RED}Please specify a custom repository with --custom-repo.{Style.RESET_ALL}") return 1 global_config["custom_repo"] = config["custom_repo"] save_config(global_config) logger.info(f"{Fore.GREEN}Custom repository set to: {config['custom_repo']}{Style.RESET_ALL}") return 0 # Show configuration if requested if args.show_config: display_config(config, global_config, project_config) return 0 # Save configuration if requested if args.save_config: # For custom repo, only save to global config, not project config global_config_to_save = global_config.copy() if "custom_repo" in config: global_config_to_save["custom_repo"] = config["custom_repo"] if "source" in config: global_config_to_save["source"] = config["source"] save_config(global_config_to_save) logger.info(f"{Fore.GREEN}Global configuration saved successfully.{Style.RESET_ALL}") if not args.directory or args.directory == ".": return 0 if args.save_project_config: # Don't include custom_repo in project config project_config_to_save = {k: v for k, v in config.items() if k not in ["custom_repo", "source"]} save_project_config(project_dir, project_config_to_save) logger.info(f"{Fore.GREEN}Project configuration saved to {project_dir / '.cursor-rules-cli.json'}{Style.RESET_ALL}") if not args.directory or args.directory == ".": return 0 # Set log level based on verbosity if args.verbose: logging.getLogger().setLevel(logging.DEBUG) logger.info(f"{Style.BRIGHT}{Fore.BLUE}Cursor Rules CLI{Style.RESET_ALL}") logger.info(f"Scanning project directory: {Fore.CYAN}{os.path.abspath(args.directory)}{Style.RESET_ALL}") # Handle custom repository if specified source_url = config["source"] if config["custom_repo"]: source_url = f"https://raw.githubusercontent.com/{config['custom_repo']}/main" logger.info(f"Using custom repository: {Fore.CYAN}{config['custom_repo']}{Style.RESET_ALL}") # Run the scanning phase only if libraries were not directly provided try: # Skip scanning if libraries were directly provided if not libraries_directly_provided: # Use quick scan if requested scan_start_msg = "Quick scanning" if args.quick_scan else "Scanning" logger.info(f"{scan_start_msg} for libraries and frameworks...") # Scan project for libraries logger.info("Scanning for libraries and frameworks...") detected_libraries = scan_project( project_dir=project_dir, quick_scan=args.quick_scan, rules_path=config["rules_json"], use_cache=not args.force ) # Get direct match libraries from package files direct_match_libraries = scan_package_files(Path(project_dir)) logger.info(f"Detected {len(detected_libraries)} libraries/frameworks.") else: # For directly provided libraries, we don't need to scan package files direct_match_libraries = set(detected_libraries) logger.info(f"{Fore.CYAN}Skipping project scan - using directly provided libraries only{Style.RESET_ALL}") # Match libraries with rules logger.info("Finding relevant rules...") matching_rules = match_libraries( detected_libraries=detected_libraries, source_url=source_url, direct_match_libraries=direct_match_libraries, custom_json_path=config["rules_json"], 
max_results=args.max_results, min_score=args.min_score ) if not matching_rules: logger.warning("No matching libraries found for your project.") return 0 logger.info(f"Found {Fore.GREEN}{len(matching_rules)}{Style.RESET_ALL} relevant rule files.") # Display and select rules to download selected_rules = display_matched_rules(matching_rules, args.max_results) if not selected_rules: logger.info("No rules selected. Exiting.") return 0 # Download selected rules if args.dry_run: logger.info(f"{Fore.YELLOW}DRY RUN:{Style.RESET_ALL} Would download the following rules:") for rule in selected_rules: logger.info(f" - {Fore.CYAN}{rule}{Style.RESET_ALL}") else: downloaded_rules = download_rules(selected_rules, source_url) # Install downloaded rules result = install_rules(downloaded_rules, force=args.force) if result["installed"]: logger.info(f"{Fore.GREEN}✅ Successfully installed {len(result['installed'])} rules!{Style.RESET_ALL}") if result["failed"]: logger.warning(f"{Fore.YELLOW}⚠️ Failed to install {len(result['failed'])} rules:{Style.RESET_ALL}") for rule in result["failed"]: logger.warning(f" - {Fore.CYAN}{rule}{Style.RESET_ALL}") return 0 except KeyboardInterrupt: logger.info(f"\n{Fore.YELLOW}Operation cancelled by user.{Style.RESET_ALL}") return 130 except Exception as e: logger.error(f"An error occurred: {e}") if args.verbose: import traceback traceback.print_exc() return 1 def group_rules_by_category(rules: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]: """ Group rules by their category. Args: rules: List of rule dictionaries with category information Returns: Dictionary mapping categories to lists of rules """ categories = {} for rule in rules: category = rule.get("category", "other") if category not in categories: categories[category] = [] categories[category].append(rule) return categories def get_category_display_name(category: str) -> str: """ Get a display name for a category. Args: category: Category key Returns: Display name for the category """ category_names = { "development": "Development Tools", "frontend": "Frontend Frameworks & Libraries", "backend": "Backend Frameworks & Libraries", "database": "Database & ORM", "ai_ml": "AI & Machine Learning", "devops": "DevOps & Cloud", "utilities": "Utilities & CLI Tools", "other": "Other Libraries", } return category_names.get(category, category.title()) def display_matched_rules(matched_rules: List[Dict[str, Any]], max_results: int = 20) -> List[Dict[str, Any]]: """ Display matched rules and return selected rule objects. 
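    Illustrative session (rule names and scores hypothetical):

        Available Cursor rules for your project:

        Direct Dependencies:
        1. react [frontend, framework] (0.95)

        Other Relevant Rules:
        2. tailwindcss [frontend, css] (0.61)

        > 1,2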
Args: matched_rules: List of matched rules max_results: Maximum number of rules to display Returns: List of selected rule objects """ if not matched_rules: logger.info("No relevant rules found for your project.") return [] # Group rules by category (direct_match vs others) direct_matches = [] other_matches = [] for rule in matched_rules: # Check if this is a direct match from package files if rule.get("is_direct_match", False): direct_matches.append(rule) else: other_matches.append(rule) # Sort each group by relevance score direct_matches.sort(key=lambda x: x["relevance_score"], reverse=True) other_matches.sort(key=lambda x: x["relevance_score"], reverse=True) # Combine the lists with direct matches first sorted_rules = direct_matches + other_matches # Limit to max_results display_rules = sorted_rules[:max_results] # Display rules print(f"\n{Style.BRIGHT}{Fore.BLUE}Available Cursor rules for your project:{Style.RESET_ALL}\n") # Display direct matches first if direct_matches: print(f"{Style.BRIGHT}{Fore.GREEN}Direct Dependencies:{Style.RESET_ALL}") for i, rule in enumerate([r for r in display_rules if r.get("is_direct_match", False)], 1): tags = f"[{', '.join(rule.get('tags', []))}]" if rule.get('tags') else "" score = f"({rule['relevance_score']:.2f})" print(f"{Fore.GREEN}{i}.{Style.RESET_ALL} {Fore.CYAN}{rule['rule']}{Style.RESET_ALL} {tags} {score}") # Display other matches if other_matches and any(not r.get("is_direct_match", False) for r in display_rules): print(f"\n{Style.BRIGHT}{Fore.YELLOW}Other Relevant Rules:{Style.RESET_ALL}") # Continue numbering from where direct matches left off start_idx = len([r for r in display_rules if r.get("is_direct_match", False)]) + 1 for i, rule in enumerate([r for r in display_rules if not r.get("is_direct_match", False)], start_idx): tags = f"[{', '.join(rule.get('tags', []))}]" if rule.get('tags') else "" score = f"({rule['relevance_score']:.2f})" print(f"{Fore.GREEN}{i}.{Style.RESET_ALL} {Fore.CYAN}{rule['rule']}{Style.RESET_ALL} {tags} {score}") # Get user selection print(f"\n{Style.BRIGHT}Select rules to install:{Style.RESET_ALL}") print(f" {Fore.YELLOW}* Enter comma-separated numbers (e.g., 1,3,5){Style.RESET_ALL}") print(f" {Fore.YELLOW}* Type 'all' to select all rules{Style.RESET_ALL}") print(f" {Fore.YELLOW}* Type 'category:name' to select all rules in a category (e.g., 'category:development'){Style.RESET_ALL}") print(f" {Fore.YELLOW}* Type 'none' to cancel{Style.RESET_ALL}") selection = input(f"{Fore.GREEN}> {Style.RESET_ALL}").strip().lower() if selection == "none": logger.info("No rules selected. Exiting.") return [] if selection == "all": return display_rules if selection.startswith("category:"): category = selection.split(":", 1)[1] return [ rule for rule in display_rules if category in rule.get("tags", []) ] try: indices = [int(idx.strip()) for idx in selection.split(",") if idx.strip()] return [display_rules[idx - 1] for idx in indices if 1 <= idx <= len(display_rules)] except (ValueError, IndexError): logger.error(f"{Fore.RED}Invalid selection. Please try again.{Style.RESET_ALL}") return display_matched_rules(matched_rules, max_results) if __name__ == "__main__": sys.exit(main()) ``` ## /cursor-rules-cli/src/matcher.py ```py path="/cursor-rules-cli/src/matcher.py" """ Matcher module for matching detected libraries with MDC rules. This module matches detected libraries with MDC rules based on relevance scores, library relationships, and project context. 
""" import os import json import logging from pathlib import Path from typing import Dict, List, Set, Optional, Any, Tuple from cursor_rules_cli import utils logger = logging.getLogger(__name__) # Minimum relevance score for a rule to be considered MIN_RELEVANCE_SCORE = 0.5 # Maximum number of rules to return MAX_RULES = 10 def match_libraries( detected_libraries: List[str], source_url: str, direct_match_libraries: Optional[Set[str]] = None, custom_json_path: Optional[Path] = None, max_results: int = MAX_RULES, min_score: float = MIN_RELEVANCE_SCORE ) -> List[Dict[str, Any]]: """ Match detected libraries with available rules. Args: detected_libraries: List of detected libraries source_url: Base URL for the repository direct_match_libraries: Set of libraries that are direct matches from package files custom_json_path: Path to custom rules.json file max_results: Maximum number of rules to return min_score: Minimum relevance score for rules Returns: List of matched rules with metadata """ # Create a RuleMatcher instance matcher = RuleMatcher( rules_path=str(custom_json_path) if custom_json_path else None, min_relevance_score=min_score, max_rules=max_results ) # Match rules matched_rules = matcher.match_rules(detected_libraries) # Add URL and other metadata to each rule for rule in matched_rules: rule_name = rule.get("rule") rule["name"] = rule_name # We don't construct a URL here anymore - the downloader will handle this using GitHub API # Mark direct matches if direct_match_libraries and rule_name.lower() in (lib.lower() for lib in direct_match_libraries): rule["is_direct_match"] = True else: rule["is_direct_match"] = False return matched_rules class RuleMatcher: """ Class for matching detected libraries with MDC rules. """ def __init__( self, rules_path: str = None, use_cache: bool = True, min_relevance_score: float = MIN_RELEVANCE_SCORE, max_rules: int = MAX_RULES ): """ Initialize the RuleMatcher. Args: rules_path: Path to rules.json file use_cache: Whether to use caching min_relevance_score: Minimum relevance score for a rule max_rules: Maximum number of rules to return """ self.rules_path = rules_path self.use_cache = use_cache self.min_relevance_score = min_relevance_score self.max_rules = max_rules # Load library data from rules.json self.library_data = utils.load_library_data(rules_path) # Create library mappings self._create_library_mappings() def _create_library_mappings(self): """Create mappings for efficient library lookups.""" self.lib_to_tags = {} self.tag_to_libs = {} self.lib_to_related = {} if not self.library_data or "libraries" not in self.library_data: return for lib in self.library_data["libraries"]: lib_name = lib["name"].lower() # Map library to its tags tags = lib.get("tags", []) self.lib_to_tags[lib_name] = set(tags) # Map tags to libraries for tag in tags: if tag not in self.tag_to_libs: self.tag_to_libs[tag] = set() self.tag_to_libs[tag].add(lib_name) # Map library to related libraries related = lib.get("related", []) self.lib_to_related[lib_name] = set(related) def match_rules( self, detected_libraries: List[str], project_context: Optional[Dict[str, float]] = None ) -> List[Dict[str, Any]]: """ Match detected libraries with MDC rules. 
Args: detected_libraries: List of detected libraries project_context: Optional project context scores Returns: List of matched rules with relevance scores """ if not self.library_data or "libraries" not in self.library_data: logger.warning("No libraries found in rules.json") return [] # Check cache first if self.use_cache: cache_key = utils.create_cache_key( ",".join(sorted(detected_libraries)), str(project_context), self.min_relevance_score, self.max_rules ) cached_data = utils.get_cached_data(cache_key) if cached_data: logger.debug("Using cached rule matches") return cached_data # Normalize library names normalized_libs = { utils.normalize_library_name(lib, self.library_data) for lib in detected_libraries } # Get project context if not provided if project_context is None: project_context = utils.get_project_context(normalized_libs, self.library_data) # Calculate relevance scores for each library in rules.json library_scores = [] for library in self.library_data["libraries"]: score = self._calculate_library_relevance( library, normalized_libs, project_context ) if score >= self.min_relevance_score: library_scores.append((library, score)) # Sort libraries by relevance score library_scores.sort(key=lambda x: x[1], reverse=True) # Format results results = [] for library, score in library_scores[:self.max_rules]: result = { "rule": library["name"], "relevance_score": round(score, 3), "description": f"{library['name']} ({', '.join(library.get('tags', []))})", "tags": library.get("tags", []), "libraries": [library["name"]], "category": self._categorize_library(library, normalized_libs) } results.append(result) # Cache results if self.use_cache: utils.set_cached_data(cache_key, results) return results def _calculate_library_relevance( self, library: Dict[str, Any], detected_libs: Set[str], project_context: Dict[str, float] ) -> float: """ Calculate relevance score for a library. Args: library: Library data detected_libs: Set of detected libraries project_context: Project context scores Returns: Relevance score between 0 and 1 """ # Direct match score lib_name = library["name"].lower() direct_match = 1.0 if lib_name in detected_libs else 0.0 # Tag similarity score tag_score = self._calculate_tag_similarity_score(library, detected_libs) # Context score from project type and tags context_score = self._calculate_context_score(library, project_context) # Combine scores with weights weights = { "direct_match": 0.8, "tag_similarity": 0.15, "context": 0.05 } total_score = ( weights["direct_match"] * direct_match + weights["tag_similarity"] * tag_score + weights["context"] * context_score ) return total_score def _calculate_context_score( self, library: Dict[str, Any], project_context: Dict[str, float] ) -> float: """ Calculate context match score for a library. Args: library: Library data project_context: Project context scores Returns: Score between 0 and 1 """ library_tags = set(library.get("tags", [])) if not library_tags or not project_context: return 0 # Calculate weighted average of context scores for matching tags total_score = 0 total_weight = 0 for tag in library_tags: if tag in project_context: weight = project_context[tag] total_score += weight total_weight += 1 return total_score / total_weight if total_weight > 0 else 0 def _calculate_tag_similarity_score( self, library: Dict[str, Any], detected_libs: Set[str] ) -> float: """ Calculate tag similarity score for a library. 
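    Worked example: if this library is tagged {"frontend", "ui", "react"}
    and the detected libraries' tags are {"frontend", "ui", "testing"},
    the intersection has 2 tags and the union has 4, so the Jaccard
    score is 2 / 4 = 0.5.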
Args: library: Library data detected_libs: Set of detected libraries Returns: Score between 0 and 1 """ library_tags = set(library.get("tags", [])) if not library_tags: return 0 # Get all tags from detected libraries lib_tags = set() for lib in detected_libs: if lib in self.lib_to_tags: lib_tags.update(self.lib_to_tags[lib]) if not lib_tags: return 0 # Calculate Jaccard similarity intersection = library_tags & lib_tags union = library_tags | lib_tags return len(intersection) / len(union) def _categorize_library( self, library: Dict[str, Any], detected_libs: Set[str] ) -> str: """ Categorize a library based on its relationship to detected libraries. Args: library: Library data detected_libs: Set of detected libraries Returns: Category string """ lib_name = library["name"].lower() # Check for direct matches if lib_name in detected_libs: return "direct_match" # Check for tag matches library_tags = set(library.get("tags", [])) lib_tags = set() for lib in detected_libs: if lib in self.lib_to_tags: lib_tags.update(self.lib_to_tags[lib]) if library_tags & lib_tags: return "tag_match" return "suggested" if __name__ == "__main__": # For testing import sys logging.basicConfig(level=logging.DEBUG) if len(sys.argv) > 1: rules_path = sys.argv[1] else: rules_path = None # Example usage matcher = RuleMatcher(rules_path) detected_libs = ["react", "next-js", "tailwindcss"] matched_rules = matcher.match_rules(detected_libs) print("\nDetected libraries:", detected_libs) print("\nMatched rules:") for rule in matched_rules: print(f"\n{rule['rule']} (score: {rule['relevance_score']}):") print(f" Category: {rule['category']}") print(f" Description: {rule['description']}") print(f" Tags: {', '.join(rule['tags'])}") print(f" Libraries: {', '.join(rule['libraries'])}") ``` ## /cursor-rules-cli/src/scanner.py ```py path="/cursor-rules-cli/src/scanner.py" """ Scanner module for detecting libraries and frameworks in a project. This module scans a project directory to identify which libraries and frameworks are being used based on package manager files, import statements, and framework-specific file patterns. 
""" import os import json import logging import re from pathlib import Path from typing import Dict, List, Set, Optional, Tuple, Any from concurrent.futures import ThreadPoolExecutor, as_completed from cursor_rules_cli import utils logger = logging.getLogger(__name__) # File patterns to look for PACKAGE_PATTERNS = { "node": [ "package.json", "yarn.lock", "pnpm-lock.yaml", "package-lock.json" ], "python": [ "requirements.txt", "pyproject.toml", "Pipfile", "setup.py", "uv.lock", "poetry.lock", "conda.yaml", "environment.yml" ], "php": ["composer.json", "composer.lock"], "rust": ["Cargo.toml", "Cargo.lock"], "go": ["go.mod", "go.sum"], "ruby": ["Gemfile", "Gemfile.lock"], "java": ["pom.xml", "build.gradle", "build.gradle.kts"], "dotnet": ["*.csproj", "*.fsproj", "*.vbproj", "packages.config"], } # Framework-specific file patterns FRAMEWORK_PATTERNS = { "react": ["src/App.jsx", "src/App.tsx", "src/App.js", "public/index.html"], "vue": ["src/App.vue", "src/main.js", "public/index.html"], "angular": ["angular.json", "src/app/app.module.ts"], "next-js": ["next.config.js", "pages/_app.js", "pages/_app.tsx"], "nuxt": ["nuxt.config.js", "nuxt.config.ts"], "svelte": ["svelte.config.js", "src/App.svelte"], "django": ["manage.py", "wsgi.py", "asgi.py"], "flask": ["app.py", "wsgi.py", "application.py"], "fastapi": ["main.py"], "express": ["app.js", "server.js"], "nestjs": ["nest-cli.json", "src/main.ts"], "laravel": ["artisan", "composer.json"], "spring-boot": ["src/main/java", "src/main/resources/application.properties"], } # Import patterns for different languages IMPORT_PATTERNS = { "python": { "files": ["*.py"], "regex": [ r"(?:^|\n)\s*(?:import|from)\s+([a-zA-Z0-9_.]+)", r"(?:^|\n)\s*from\s+([a-zA-Z0-9_.]+)\s+import", r"(?:^|\n)\s*__import__\(['\"]([a-zA-Z0-9_.]+)['\"]\)", r"(?:^|\n)\s*importlib\.import_module\(['\"]([a-zA-Z0-9_.]+)['\"]\)" ] }, "javascript": { "files": ["*.js", "*.jsx", "*.ts", "*.tsx"], "regex": [ r"(?:^|\n)\s*import\s+.*?(?:from\s+['\"]([^'\"]+)['\"]|['\"]([^'\"]+)['\"])", r"(?:^|\n)\s*require\(['\"]([^'\"]+)['\"]\)", r"(?:^|\n)\s*import\(['\"]([^'\"]+)['\"]\)" ] }, "php": { "files": ["*.php"], "regex": [ r"(?:^|\n)\s*(?:use|require|include|require_once|include_once)\s+['\"]?([a-zA-Z0-9_\\/.]+)", r"(?:^|\n)\s*namespace\s+([a-zA-Z0-9_\\/.]+)" ] }, "java": { "files": ["*.java"], "regex": [ r"(?:^|\n)\s*import\s+([a-zA-Z0-9_.]+)", r"(?:^|\n)\s*package\s+([a-zA-Z0-9_.]+)" ] }, "rust": { "files": ["*.rs"], "regex": [ r"(?:^|\n)\s*(?:use|extern\s+crate)\s+([a-zA-Z0-9_:]+)", r"(?:^|\n)\s*mod\s+([a-zA-Z0-9_]+)" ] }, } # Directories to exclude from scanning EXCLUDED_DIRS = [ "node_modules", "venv", ".venv", "env", ".env", "__pycache__", ".git", ".github", ".idea", ".vscode", "dist", "build", "target", "out", "bin", "obj", ".next", ".nuxt", ".svelte-kit", ".cache", ".pytest_cache", ".mypy_cache", ".ruff_cache", "site-packages", "lib/python*", ] # Maximum directory depth for import scanning MAX_SCAN_DEPTH = 5 def scan_project( project_dir: str, quick_scan: bool = False, max_depth: int = MAX_SCAN_DEPTH, rules_path: str = None, max_workers: int = None, use_cache: bool = True ) -> List[str]: """ Scan a project directory to detect libraries and frameworks. 
Args: project_dir: Path to the project directory quick_scan: If True, only scan package files, not imports max_depth: Maximum directory depth for scanning rules_path: Path to rules.json file max_workers: Maximum number of worker threads (None for CPU count) use_cache: Whether to use caching Returns: List of detected libraries and frameworks """ project_path = Path(project_dir).resolve() logger.debug(f"Scanning project at {project_path}") # Check cache first if use_cache: cache_key = utils.create_cache_key( str(project_path), quick_scan, max_depth, rules_path ) cached_data = utils.get_cached_data(cache_key) if cached_data: logger.debug("Using cached scan results") return cached_data # Load library data from rules.json library_data = utils.load_library_data(rules_path) # Track both the libraries and their sources detected_libraries = set() direct_match_libraries = set() # Track direct matches separately with ThreadPoolExecutor(max_workers=max_workers) as executor: # Submit package file scanning first to identify direct dependencies package_files_future = executor.submit(scan_package_files, project_path) # Submit other scanning tasks future_to_task = { executor.submit(scan_docker_files, project_path): "docker_files", executor.submit(scan_github_actions, project_path): "github_actions", executor.submit(detect_frameworks, project_path): "frameworks" } # Process package files result first to identify direct dependencies try: direct_matches = package_files_future.result() detected_libraries.update(direct_matches) direct_match_libraries.update(direct_matches) # Mark as direct matches logger.debug(f"Completed package_files scan, found {len(direct_matches)} direct dependencies") except Exception as e: logger.error(f"Error in package_files scan: {e}") # Add import scanning if not quick scan if not quick_scan: future_to_task[executor.submit(scan_imports, project_path, max_depth)] = "imports" # Process results from other scanning tasks for future in as_completed(future_to_task): task_name = future_to_task[future] try: result = future.result() detected_libraries.update(result) logger.debug(f"Completed {task_name} scan") except Exception as e: logger.error(f"Error in {task_name} scan: {e}") # Normalize library names normalized_libraries = { utils.normalize_library_name(lib, library_data) for lib in detected_libraries } normalized_direct_matches = { utils.normalize_library_name(lib, library_data) for lib in direct_match_libraries } # Detect additional frameworks based on rules.json if library_data: framework_libs = detect_frameworks_from_rules(normalized_libraries, library_data) normalized_libraries.update(framework_libs) # Sort libraries, prioritizing direct matches first, then by popularity sorted_libraries = sorted( normalized_libraries, key=lambda x: ( x in normalized_direct_matches, # Direct matches first utils.calculate_library_popularity(x, library_data) # Then by popularity ), reverse=True ) # Cache results if use_cache: utils.set_cached_data(cache_key, sorted_libraries) return sorted_libraries def scan_package_files(project_path: Path) -> Set[str]: """ Scan package manager files to detect libraries. 
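    Illustrative example: a package.json containing
    {"dependencies": {"react": "^18.0.0", "next": "14.0.0"}} yields
    {"react", "next", "next-js"}, since the "next" dependency also maps
    to the "next-js" framework name.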
Args: project_path: Path to the project directory Returns: Set of detected libraries """ detected_libs = set() # Check for Node.js package files for node_file in ["package.json", "yarn.lock", "pnpm-lock.yaml"]: file_path = project_path / node_file if file_path.exists(): logger.debug(f"Found {node_file}") try: if node_file == "package.json": with open(file_path, 'r') as f: data = json.load(f) # Add dependencies deps = data.get("dependencies", {}) dev_deps = data.get("devDependencies", {}) all_deps = {**deps, **dev_deps} # Add detected libraries detected_libs.update(all_deps.keys()) # Detect framework from dependencies framework_deps = { "react": "react", "vue": "vue", "next": "next-js", "nuxt": "nuxt", "svelte": "svelte", "@angular/core": "angular", "express": "express", "@nestjs/core": "nestjs" } for dep, framework in framework_deps.items(): if dep in deps: detected_libs.add(framework) elif node_file == "yarn.lock": with open(file_path, 'r') as f: content = f.read() # Extract package names from yarn.lock packages = re.findall(r'^"?([^@\s"]+)@', content, re.MULTILINE) detected_libs.update(packages) elif node_file == "pnpm-lock.yaml": with open(file_path, 'r') as f: content = f.read() # Extract package names from pnpm-lock.yaml packages = re.findall(r'(?:^|\n)\s*/([^/:]+):', content) detected_libs.update(packages) except (json.JSONDecodeError, IOError) as e: logger.warning(f"Error parsing {node_file}: {e}") # Check for Python package files python_files = { "requirements.txt": r'^([a-zA-Z0-9_.-]+)', "pyproject.toml": None, # Pattern not needed, handled specially "Pipfile": r'(?:^|\n)\s*([a-zA-Z0-9_.-]+)\s*=', "setup.py": r'install_requires=\[([^\]]+)\]', # Keep uv.lock for compatibility with uv (modern Python package manager) # Only check for it if it exists to avoid unnecessary file operations "uv.lock": r'name\s*=\s*"([^"]+)"' if os.path.exists(project_path / "uv.lock") else None } for file_name, pattern in python_files.items(): file_path = project_path / file_name if file_path.exists(): logger.debug(f"Found {file_name}") try: with open(file_path, 'r') as f: content = f.read() if file_name == "setup.py": # Special handling for setup.py matches = re.search(pattern, content) if matches: packages = re.findall(r'[\'"]([^\'\"]+)[\'"]', matches.group(1)) detected_libs.update(p.split('>=')[0].split('==')[0].strip() for p in packages) elif file_name == "pyproject.toml": # Special handling for pyproject.toml # Look for dependencies section in PEP 621 format pep621_deps_match = re.search(r'\[project\].*?dependencies\s*=\s*\[(.*?)\]', content, re.DOTALL) if pep621_deps_match: deps_content = pep621_deps_match.group(1) # Extract package names from dependencies packages = re.findall(r'[\'"]([a-zA-Z0-9_.-]+)(?:>=|==|>|<|~=|!=|@|$)', deps_content) detected_libs.update(packages) # Look for dependencies section in Poetry format poetry_deps_match = re.search(r'\[tool\.poetry\.dependencies\](.*?)(?:\[|\Z)', content, re.DOTALL) if poetry_deps_match: deps_content = poetry_deps_match.group(1) # Extract package names from Poetry dependencies packages = re.findall(r'([a-zA-Z0-9_.-]+)\s*=', deps_content) detected_libs.update(packages) # Also check for dev-dependencies in Poetry format dev_deps_match = re.search(r'\[tool\.poetry\.dev-dependencies\](.*?)(?:\[|\Z)', content, re.DOTALL) if dev_deps_match: dev_deps_content = dev_deps_match.group(1) dev_packages = re.findall(r'([a-zA-Z0-9_.-]+)\s*=', dev_deps_content) detected_libs.update(dev_packages) else: # General pattern matching packages = re.findall(pattern, 
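                        # e.g. a requirements.txt line "requests>=2.31.0"
                        # captures "requests"; the version-specifier split
                        # below is then a no-op for it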
content, re.MULTILINE) detected_libs.update(p.split('>=')[0].split('==')[0].strip() for p in packages) except IOError as e: logger.warning(f"Error reading {file_name}: {e}") return detected_libs def scan_docker_files(project_path: Path) -> Set[str]: """ Scan Dockerfile and docker-compose files for libraries. Args: project_path: Path to the project directory Returns: Set of detected libraries """ detected_libs = set() docker_files = ["Dockerfile", "docker-compose.yml", "docker-compose.yaml"] for file_name in docker_files: file_path = project_path / file_name if file_path.exists(): logger.debug(f"Found {file_name}") try: with open(file_path, 'r') as f: content = f.read() # Look for common package installations pip_packages = re.findall(r'pip\s+install\s+([^\s&|;]+)', content) npm_packages = re.findall(r'npm\s+install\s+([^\s&|;]+)', content) apt_packages = re.findall(r'apt-get\s+install\s+([^\s&|;]+)', content) detected_libs.update(pip_packages) detected_libs.update(npm_packages) detected_libs.update(apt_packages) # Look for base images base_images = re.findall(r'FROM\s+([^\s:]+)', content) detected_libs.update(base_images) except IOError as e: logger.warning(f"Error reading {file_name}: {e}") return detected_libs def scan_github_actions(project_path: Path) -> Set[str]: """ Scan GitHub Actions workflow files for libraries. Args: project_path: Path to the project directory Returns: Set of detected libraries """ detected_libs = set() workflows_dir = project_path / ".github" / "workflows" if not workflows_dir.exists(): return detected_libs # Workflow files may use either the .yml or the .yaml extension for workflow_file in list(workflows_dir.glob("*.yml")) + list(workflows_dir.glob("*.yaml")): logger.debug(f"Found workflow file: {workflow_file}") try: with open(workflow_file, 'r') as f: content = f.read() # Look for common actions and tools actions = re.findall(r'uses:\s+([^\s@]+)', content) detected_libs.update(actions) # Look for package installations pip_packages = re.findall(r'pip\s+install\s+([^\s&|;]+)', content) npm_packages = re.findall(r'npm\s+install\s+([^\s&|;]+)', content) detected_libs.update(pip_packages) detected_libs.update(npm_packages) except IOError as e: logger.warning(f"Error reading workflow file {workflow_file}: {e}") return detected_libs def detect_frameworks(project_path: Path) -> Set[str]: """ Detect frameworks based on specific file patterns. Args: project_path: Path to the project directory Returns: Set of detected frameworks """ detected_frameworks = set() for framework, patterns in FRAMEWORK_PATTERNS.items(): for pattern in patterns: # Check if the pattern is a directory if not pattern.endswith(('/', '\\')) and not os.path.splitext(pattern)[1]: if (project_path / pattern).is_dir(): logger.debug(f"Found framework directory pattern: {pattern}") detected_frameworks.add(framework) break # Check for file patterns matches = list(project_path.glob(pattern)) if matches: logger.debug(f"Found framework file pattern: {pattern}") detected_frameworks.add(framework) break return detected_frameworks def detect_frameworks_from_rules(detected_libs: Set[str], library_data: Dict[str, Any]) -> Set[str]: """ Detect frameworks based on detected libraries and rules.json data.
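    Illustrative example (names depend on rules.json): detecting {"react"}
    can also pull in related frontend entries such as "react-router" when
    rules.json tags them accordingly.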
Args: detected_libs: Set of detected libraries library_data: Library data from rules.json Returns: Set of detected frameworks """ detected_frameworks = set() if not library_data or "libraries" not in library_data: return detected_frameworks # Create a mapping of library names to their data lib_map = {lib["name"].lower(): lib for lib in library_data["libraries"]} # Check detected libraries against rules.json for lib in detected_libs: lib_lower = lib.lower() if lib_lower in lib_map: # Add the library itself detected_frameworks.add(lib_lower) # Check if this library is a framework tags = lib_map[lib_lower].get("tags", []) if "framework" in tags: detected_frameworks.add(lib_lower) # Check for related libraries based on tags # For example, if we detect "react", we might want to check for "react-router" if "react" in lib_lower and "frontend" in tags: for related_lib, related_data in lib_map.items(): if "react" in related_lib and related_lib != lib_lower: if any(tag in related_data.get("tags", []) for tag in ["frontend", "ui"]): detected_frameworks.add(related_lib) return detected_frameworks def scan_imports(project_path: Path, max_depth: int = MAX_SCAN_DEPTH) -> Set[str]: """ Scan source files for import statements to detect libraries. Args: project_path: Path to the project directory max_depth: Maximum directory depth for scanning Returns: Set of detected libraries from imports """ detected_imports = set() for lang, pattern_info in IMPORT_PATTERNS.items(): file_patterns = pattern_info["files"] import_regexes = pattern_info["regex"] # Use a more efficient file traversal with depth limit and exclusions for file_pattern in file_patterns: for file_path in find_files(project_path, file_pattern, max_depth): try: with open(file_path, 'r', encoding='utf-8', errors='ignore') as f: content = f.read() # Find all imports using multiple regex patterns for import_regex in import_regexes: imports = re.findall(import_regex, content) # Process matches for imp in imports: if isinstance(imp, tuple): # Some regex patterns might have multiple capture groups imp = next((i for i in imp if i), "") if imp: # Extract the top-level package name top_level = imp.split('.')[0].split('/')[0] if top_level and not top_level.startswith(('.', '_')): detected_imports.add(top_level.lower()) except (IOError, UnicodeDecodeError) as e: logger.debug(f"Error reading {file_path}: {e}") return detected_imports def find_files(root_dir: Path, pattern: str, max_depth: int, current_depth: int = 0) -> List[Path]: """ Find files matching a pattern with depth limit and directory exclusions. Args: root_dir: Root directory to start searching from pattern: File pattern to match max_depth: Maximum directory depth to search current_depth: Current depth in the directory tree Returns: List of file paths matching the pattern """ if current_depth > max_depth: return [] matching_files = [] try: for item in root_dir.iterdir(): if item.is_file() and item.match(pattern): matching_files.append(item) elif item.is_dir() and not should_exclude_dir(item): matching_files.extend(find_files(item, pattern, max_depth, current_depth + 1)) except (PermissionError, OSError) as e: logger.debug(f"Error accessing {root_dir}: {e}") return matching_files def should_exclude_dir(dir_path: Path) -> bool: """ Check if a directory should be excluded from scanning. 
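    Examples: Path("node_modules") and Path(".git") are excluded;
    Path("src") is not.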
Args: dir_path: Path to the directory Returns: True if the directory should be excluded, False otherwise """ dir_name = dir_path.name return dir_name in EXCLUDED_DIRS or dir_name.startswith('.') if __name__ == "__main__": # For testing import sys logging.basicConfig(level=logging.DEBUG) if len(sys.argv) > 1: project_dir = sys.argv[1] else: project_dir = "." libraries = scan_project(project_dir) print(f"Detected libraries: {libraries}") ``` ## /cursor-rules-cli/src/utils.py ```py path="/cursor-rules-cli/src/utils.py" """ Utils module with helper functions for the Cursor Rules CLI. This module provides utility functions for file operations, security verification, and other common tasks. """ import os import re import hashlib import logging import platform import json import time from threading import Lock from pathlib import Path from typing import Dict, List, Any, Optional, Tuple, Set import requests from functools import lru_cache from urllib.parse import urlparse import validators logger = logging.getLogger(__name__) class RateLimiter: """Rate limiter to avoid overwhelming servers.""" def __init__(self, rate_limit: int): """ Initialize rate limiter. Args: rate_limit: Maximum requests per second """ self.rate_limit = rate_limit self.last_request_time = 0 self._lock = Lock() # Thread-safe locking def wait(self): """ Wait if necessary to respect the rate limit. Thread-safe implementation. """ with self._lock: current_time = time.time() elapsed = current_time - self.last_request_time # If we've made a request recently, wait if elapsed < (1.0 / self.rate_limit): sleep_time = (1.0 / self.rate_limit) - elapsed time.sleep(sleep_time) self.last_request_time = time.time() def calculate_content_hash(content: str) -> str: """ Calculate SHA-256 hash of content string. Args: content: Content to hash Returns: SHA-256 hash as hex string """ # Normalize line endings to '\n' content = content.replace('\r\n', '\n').replace('\r', '\n') return hashlib.sha256(content.encode('utf-8')).hexdigest() def get_cursor_dir() -> Path: """ Get the path to the project's .cursor directory. Returns: Path to the project's .cursor directory """ return Path.cwd() / ".cursor" def get_rules_dir() -> Path: """ Get the path to the project's .cursor/rules directory. Returns: Path to the project's .cursor/rules directory """ return get_cursor_dir() / "rules" def get_config_file() -> Path: """ Get the path to the configuration file. Returns: Path to the configuration file in the project's .cursor directory """ return get_cursor_dir() / "rules-cli-config.json" def load_config() -> Dict[str, Any]: """ Load configuration from the config file. Returns: Configuration dictionary """ config_file = get_config_file() if not config_file.exists(): return {} try: with open(config_file, 'r') as f: return json.load(f) except (IOError, json.JSONDecodeError) as e: logger.error(f"Failed to load config file: {e}") return {} def save_config(config: Dict[str, Any]) -> bool: """ Save configuration to the config file. Args: config: Configuration dictionary Returns: True if successful, False otherwise """ config_file = get_config_file() try: # Ensure the directory exists ensure_dir_exists(config_file.parent) with open(config_file, 'w') as f: json.dump(config, f, indent=2) return True except IOError as e: logger.error(f"Failed to save config file: {e}") return False def ensure_dir_exists(path: Path) -> bool: """ Ensure a directory exists, creating it if necessary. 
Args: path: Path to the directory Returns: True if the directory exists or was created, False otherwise """ try: path.mkdir(parents=True, exist_ok=True) return True except OSError as e: logger.error(f"Failed to create directory {path}: {e}") return False def is_url_trusted(url: str) -> Tuple[bool, str]: """ Check if a URL is from a trusted source using proper URL parsing. Args: url: URL to check Returns: Tuple of (is_trusted: bool, error_message: str) """ # First validate URL format if not validators.url(url): return False, "Invalid URL format" try: parsed_url = urlparse(url) # Check for HTTPS if parsed_url.scheme != "https": return False, "URL must use HTTPS" # List of trusted domains and their subdomains trusted_domains = [ "raw.githubusercontent.com", "github.com", ] # Extract domain from URL domain = parsed_url.netloc.lower() # Check if domain exactly matches or is subdomain of trusted domains is_trusted = any( domain == trusted_domain or domain.endswith(f".{trusted_domain}") for trusted_domain in trusted_domains ) if not is_trusted: return False, f"Domain {domain} is not in trusted list" # Additional security checks for GitHub URLs if "github" in domain: # Validate path format for raw.githubusercontent.com if domain == "raw.githubusercontent.com": path_parts = [p for p in parsed_url.path.split("/") if p] if len(path_parts) < 4: # username/repo/branch/path return False, "Invalid GitHub raw URL format" # Validate path format for github.com elif domain == "github.com": path_parts = [p for p in parsed_url.path.split("/") if p] if len(path_parts) < 2: # username/repo return False, "Invalid GitHub repository URL format" return True, "" except Exception as e: return False, f"URL validation error: {str(e)}" def validate_mdc_content(content: str) -> Tuple[bool, str]: """ Validate MDC file content more thoroughly. Args: content: Content to validate Returns: Tuple of (is_valid: bool, error_message: str) """ if not content: return False, "Empty content" # Check for frontmatter if not content.startswith("---"): return False, "Missing frontmatter start" # Find end of frontmatter frontmatter_end = content.find("---", 3) if frontmatter_end == -1: return False, "Missing frontmatter end" # Extract frontmatter frontmatter = content[3:frontmatter_end].strip() # Required fields in frontmatter required_fields = ["description", "globs"] # Check for required fields for field in required_fields: if f"{field}:" not in frontmatter: return False, f"Missing required field: {field}" # Check for content after frontmatter content_after_frontmatter = content[frontmatter_end + 3:].strip() if not content_after_frontmatter: return False, "No content after frontmatter" # Check for potentially malicious content (representative patterns) suspicious_patterns = [ r"<script", r"javascript:", r"eval\(", ] for pattern in suspicious_patterns: if re.search(pattern, content, re.IGNORECASE): return False, f"Suspicious content detected: {pattern}" return True, "" def calculate_file_hash(file_path: Path) -> Optional[str]: """ Calculate the SHA-256 hash of a file. Args: file_path: Path to the file Returns: SHA-256 hash as a hex string, or None if failed """ try: # Read the file as text and normalize line endings with open(file_path, "r", encoding="utf-8") as f: content = f.read() # Normalize line endings to '\n' content = content.replace('\r\n', '\n').replace('\r', '\n') # Calculate hash from normalized content return hashlib.sha256(content.encode('utf-8')).hexdigest() except IOError as e: logger.error(f"Failed to calculate hash for {file_path}: {e}") return None def is_valid_mdc_file(content: str) -> bool: """ Check if the content is a valid MDC file.
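    A minimal document that passes every check (field values illustrative):

        ---
        description: React best practices
        globs: **/*.tsx
        ---
        Prefer functional components.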
Args: content: File content to check Returns: True if valid, False otherwise """ # Check for frontmatter if not content.startswith("---"): return False # Check for description if "description:" not in content: return False # Check for globs if "globs:" not in content: return False # Check for closing frontmatter frontmatter_end = content.find("---", 3) if frontmatter_end == -1: return False # Check for content after frontmatter if len(content) <= frontmatter_end + 3: return False # Check if there's actual content after the frontmatter content_after_frontmatter = content[frontmatter_end + 3:].strip() if not content_after_frontmatter: return False return True def sanitize_filename(name: str) -> str: """ Sanitize a filename to ensure it's valid. Args: name: Filename to sanitize Returns: Sanitized filename """ # Replace invalid characters with underscores invalid_chars = r'[<>:"/\\|?*]' sanitized = re.sub(invalid_chars, "_", name) # Ensure it's not too long max_length = 255 if platform.system() != "Windows" else 240 if len(sanitized) > max_length: sanitized = sanitized[:max_length] return sanitized def preview_content(content: str, max_lines: int = 10) -> str: """ Generate a preview of content. Args: content: Content to preview max_lines: Maximum number of lines to include Returns: Preview of the content """ lines = content.split("\n") if len(lines) <= max_lines: return content # Show first few lines and indicate there's more preview_lines = lines[:max_lines] preview = "\n".join(preview_lines) preview += f"\n... ({len(lines) - max_lines} more lines)" return preview def validate_github_repo(repo: str) -> bool: """ Validate a GitHub repository string. Args: repo: GitHub repository string (username/repo) Returns: True if valid, False otherwise """ if not repo: return False # Check format (username/repo) if not re.match(r'^[a-zA-Z0-9_-]+/[a-zA-Z0-9_.-]+$', repo): return False # Check if the repository exists and has a rules.json file try: url = f"https://raw.githubusercontent.com/{repo}/main/rules.json" response = requests.head(url, timeout=5) return response.status_code == 200 except requests.RequestException: return False def get_project_config_file(project_dir: Path) -> Path: """ Get the path to the project-specific configuration file. Args: project_dir: Path to the project directory Returns: Path to the project configuration file """ return project_dir / ".cursor-rules-cli.json" def load_project_config(project_dir: Path) -> Dict[str, Any]: """ Load configuration from the project-specific config file. Args: project_dir: Path to the project directory Returns: Project configuration dictionary """ config_file = get_project_config_file(project_dir) if not config_file.exists(): return {} try: with open(config_file, 'r') as f: return json.load(f) except (IOError, json.JSONDecodeError) as e: logger.error(f"Failed to load project config file: {e}") return {} def save_project_config(project_dir: Path, config: Dict[str, Any]) -> bool: """ Save configuration to the project-specific config file. Args: project_dir: Path to the project directory config: Configuration dictionary Returns: True if successful, False otherwise """ config_file = get_project_config_file(project_dir) try: with open(config_file, 'w') as f: json.dump(config, f, indent=2) return True except IOError as e: logger.error(f"Failed to save project config file: {e}") return False def merge_configs(global_config: Dict[str, Any], project_config: Dict[str, Any]) -> Dict[str, Any]: """ Merge global and project-specific configurations. 
Project configuration takes precedence over global configuration for project-specific settings. CLI-wide settings (like custom_repo and source) are always taken from global config. Args: global_config: Global configuration dictionary project_config: Project-specific configuration dictionary Returns: Merged configuration dictionary """ # CLI-wide settings that should only come from global config cli_wide_settings = ["custom_repo", "source"] # Start with a copy of the global config merged = global_config.copy() # Update with project config, but exclude CLI-wide settings project_config_filtered = {k: v for k, v in project_config.items() if k not in cli_wide_settings} merged.update(project_config_filtered) return merged # Default paths DEFAULT_RULES_PATH = Path(__file__).parent.parent / "rules.json" DEFAULT_CACHE_DIR = Path(__file__).parent.parent / ".cache" @lru_cache(maxsize=1) def load_library_data(rules_path: Optional[str] = None) -> Dict[str, Any]: """ Load and cache library data from rules.json. Args: rules_path: Optional path to rules.json Returns: Dictionary of library data """ # If a specific path is provided, use it first if rules_path and Path(rules_path).exists(): logger.debug(f"Using specified rules.json at {rules_path}") path_to_use = Path(rules_path) else: # Search for rules.json in priority order possible_paths = [ Path(__file__).parent.parent / "rules.json", # package root rules.json Path.cwd() / "rules.json" # current directory rules.json ] path_to_use = None for path in possible_paths: if path.exists(): path_to_use = path logger.debug(f"Found rules.json at {path}") break if not path_to_use or not path_to_use.exists(): logger.warning("rules.json not found in any standard location, using default library detection") return {} try: with open(path_to_use, 'r') as f: data = json.load(f) logger.info(f"Successfully loaded rules.json from {path_to_use}") return data except (json.JSONDecodeError, IOError) as e: logger.warning(f"Error loading rules.json from {path_to_use}: {e}") return {} def normalize_library_name(name: str, library_data: Dict[str, Any]) -> str: """ Normalize a library name to match rules.json conventions. Args: name: Library name to normalize library_data: Library data from rules.json Returns: Normalized library name """ if not library_data or "libraries" not in library_data: return name.lower() name_lower = name.lower() lib_map = {lib["name"].lower(): lib["name"] for lib in library_data["libraries"]} # Handle special cases and common aliases special_cases = { "torch": "pytorch", "tf": "tensorflow", "bs4": "beautifulsoup4", "plt": "matplotlib", "np": "numpy", "pd": "pandas" } if name_lower in special_cases and special_cases[name_lower] in lib_map: return lib_map[special_cases[name_lower]] return lib_map.get(name_lower, name_lower) def calculate_library_popularity(lib_name: str, library_data: Dict[str, Any]) -> float: """ Calculate a library's popularity score based on its tags and relationships. 
Args: lib_name: Library name library_data: Library data from rules.json Returns: Popularity score between 0 and 1 """ if not library_data or "libraries" not in library_data: return 0.5 # Default score lib_map = {lib["name"].lower(): lib for lib in library_data["libraries"]} lib_name_lower = lib_name.lower() if lib_name_lower not in lib_map: return 0.5 lib_info = lib_map[lib_name_lower] tags = lib_info.get("tags", []) # Base score from number of tags (more tags = more versatile) tag_score = min(len(tags) / 10, 0.5) # Up to 0.5 from tags # Additional score from important tags important_tags = {"framework", "language", "major-platform"} tag_importance = sum(0.1 for tag in tags if tag in important_tags) # Calculate related libraries score related_count = sum(1 for lib in library_data["libraries"] if any(tag in lib.get("tags", []) for tag in tags)) relationship_score = min(related_count / len(library_data["libraries"]), 0.3) total_score = tag_score + tag_importance + relationship_score return min(total_score, 1.0) def get_project_context(detected_libs: Set[str], library_data: Dict[str, Any]) -> Dict[str, float]: """ Determine project context based on detected libraries. Args: detected_libs: Set of detected library names library_data: Library data from rules.json Returns: Dictionary of context scores (e.g., {'frontend': 0.8, 'backend': 0.3}) """ contexts = { "frontend": 0.0, "backend": 0.0, "data-science": 0.0, "devops": 0.0, "mobile": 0.0 } if not library_data or "libraries" not in library_data: return contexts lib_map = {lib["name"].lower(): lib for lib in library_data["libraries"]} total_libs = len(detected_libs) if total_libs == 0: return contexts # Context indicators in tags context_tags = { "frontend": {"frontend", "ui", "javascript", "css", "html"}, "backend": {"backend", "api", "server", "database"}, "data-science": {"data-science", "machine-learning", "ai", "analytics"}, "devops": {"devops", "ci-cd", "containerization", "cloud"}, "mobile": {"mobile", "ios", "android", "cross-platform"} } # Calculate scores based on detected libraries for lib in detected_libs: lib_lower = lib.lower() if lib_lower in lib_map: lib_tags = set(lib_map[lib_lower].get("tags", [])) for context, indicators in context_tags.items(): if lib_tags & indicators: # If there's any overlap contexts[context] += 1 / total_libs # Normalize scores max_score = max(contexts.values()) if max_score > 0: contexts = {k: v / max_score for k, v in contexts.items()} return contexts def create_cache_key(*args) -> str: """ Create a cache key from arguments. Args: *args: Arguments to create key from Returns: Cache key string """ key = ":".join(str(arg) for arg in args) # If the key is too long, hash it to avoid file name length issues if len(key) > 100: return hashlib.md5(key.encode()).hexdigest() return key def get_cached_data(cache_key: str) -> Optional[Any]: """ Get data from cache. Args: cache_key: Cache key Returns: Cached data or None if not found """ cache_file = DEFAULT_CACHE_DIR / f"{cache_key}.json" if cache_file.exists(): try: with open(cache_file, 'r') as f: return json.load(f) except (json.JSONDecodeError, IOError): return None return None def set_cached_data(cache_key: str, data: Any) -> bool: """ Save data to cache. 
    Args:
        cache_key: Cache key
        data: Data to cache

    Returns:
        True if successful, False otherwise
    """
    try:
        DEFAULT_CACHE_DIR.mkdir(parents=True, exist_ok=True)
        cache_file = DEFAULT_CACHE_DIR / f"{cache_key}.json"
        with open(cache_file, 'w') as f:
            json.dump(data, f)
        return True
    except (IOError, OSError):
        return False


if __name__ == "__main__":
    # For testing
    logging.basicConfig(level=logging.DEBUG)

    cursor_dir = get_cursor_dir()
    rules_dir = get_rules_dir()

    print(f"Cursor directory: {cursor_dir}")
    print(f"Rules directory: {rules_dir}")

    # Test directory creation
    if ensure_dir_exists(rules_dir):
        print(f"Created rules directory: {rules_dir}")

    # Test filename sanitization
    print(f"Sanitized filename: {sanitize_filename('invalid:file*name.txt')}")
```
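
Taken together, these helpers back the CLI's matching flow. The following sketch is illustrative only (import paths depend on how the package is installed, and results depend on the local rules.json):

```python
from utils import (  # assumes imports resolve as in the package layout above (src/utils.py)
    calculate_library_popularity,
    get_project_context,
    load_library_data,
    normalize_library_name,
)

data = load_library_data()                    # cached after the first call (lru_cache)
name = normalize_library_name("torch", data)  # -> "pytorch" when aliased in rules.json
score = calculate_library_popularity(name, data)
contexts = get_project_context({name, "fastapi"}, data)
print(name, score, contexts)
```

## /pyproject.toml

```toml path="/pyproject.toml"
[project]
name = "mdc-rules-generator"
version = "0.1.0"
description = "Generate Cursor MDC rule files from a structured JSON file"
authors = [
    {name = "Sanjeed", email = "hi@sanjeed.in"},
]
requires-python = ">=3.8"
readme = "README.md"
license = {text = "MIT"}
dependencies = [
    "python-dotenv>=1.0.0",
    "litellm>=1.0.0",
    "tenacity>=8.2.3",
    "ratelimit>=2.2.1",
    "pydantic>=2.0.0",
    "exa-py>=1.0.0",
    "pyyaml>=6.0.0",
    "build>=1.2.2.post1",
    "twine>=6.1.0",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src"]
```

## /requirements.txt

``` path="/requirements.txt"
litellm>=1.30.3
python-dotenv>=1.0.1
tenacity>=8.2.3
ratelimit>=2.2.1
pydantic>=2.6.1
```

## /rules-mdc/actix-web.mdc

```mdc path="/rules-mdc/actix-web.mdc"
---
description: Comprehensive best practices for developing robust, efficient, and maintainable applications using the actix-web framework in Rust. This rule covers coding standards, project structure, performance, security, testing, and common pitfalls.
globs: **/*.rs
---

# Actix-web Best Practices: A Comprehensive Guide

This guide provides a comprehensive overview of best practices for developing applications using the actix-web framework in Rust. It covers various aspects of development, including code organization, performance optimization, security, testing, and common pitfalls.

## 1. Code Organization and Structure

A well-organized codebase is crucial for maintainability and scalability. Here's how to structure your actix-web project effectively:

### 1.1. Directory Structure Best Practices

Adopt a modular and layered architecture.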
A common and recommended directory structure is as follows: project_root/ ├── src/ │ ├── main.rs # Entry point of the application │ ├── lib.rs # Library file if extracting common functionality │ ├── modules/ │ │ ├── mod.rs # Module declaration │ │ ├── auth/ # Authentication module │ │ │ ├── mod.rs # Auth module declaration │ │ │ ├── models.rs # Auth models │ │ │ ├── routes.rs # Auth routes │ │ │ ├── handlers.rs # Auth handlers │ │ ├── users/ # User management module │ │ │ ├── mod.rs # User module declaration │ │ │ ├── models.rs # User models │ │ │ ├── routes.rs # User routes │ │ │ ├── handlers.rs # User handlers │ ├── models/ # Data models │ │ ├── mod.rs │ │ ├── user.rs # User model │ │ ├── post.rs # Post model │ ├── routes/ # Route configurations │ │ ├── mod.rs │ │ ├── auth_routes.rs # Authentication routes │ │ ├── user_routes.rs # User routes │ ├── handlers/ # Request handlers (controllers) │ │ ├── mod.rs │ │ ├── auth_handlers.rs # Authentication handlers │ │ ├── user_handlers.rs # User handlers │ ├── middleware/ # Custom middleware components │ │ ├── mod.rs │ │ ├── logger.rs # Logging middleware │ │ ├── auth.rs # Authentication middleware │ ├── utils/ # Utility functions and modules │ │ ├── mod.rs │ │ ├── db.rs # Database connection utility │ ├── errors/ # Custom error definitions │ │ ├── mod.rs │ │ ├── app_error.rs # Application-specific error types ├── tests/ # Integration and unit tests │ ├── mod.rs │ ├── api_tests.rs # Integration tests for API endpoints ├── .env # Environment variables ├── Cargo.toml # Project dependencies and metadata ├── Cargo.lock # Dependency lockfile ### 1.2. File Naming Conventions * **Modules:** Use lowercase, descriptive names (e.g., `auth`, `users`, `posts`). * **Files:** Use lowercase with underscores (e.g., `user_routes.rs`, `auth_handlers.rs`). * **Models:** Use singular nouns (e.g., `user.rs`, `post.rs`). * **Handlers:** Use descriptive names indicating the action performed (e.g., `create_user`, `get_user`). * **Routes:** Use names indicating the resource they handle (e.g., `user_routes`, `auth_routes`). ### 1.3. Module Organization * **Explicit Module Declarations:** Always declare submodules in `mod.rs` files. This ensures proper module resolution and prevents naming conflicts. * **Clear Boundaries:** Each module should have a well-defined responsibility. Avoid mixing unrelated functionalities within the same module. * **Public vs. Private:** Use `pub` keyword judiciously to control visibility. Keep implementation details private to modules to prevent accidental external dependencies. ### 1.4. Component Architecture * **Layered Architecture:** Separate concerns into distinct layers (e.g., data access, business logic, presentation). This improves testability and maintainability. * **Dependency Injection:** Use dependency injection to provide dependencies to handlers. This makes it easier to test and configure your application. * **Services:** Encapsulate business logic into services. Handlers should primarily focus on request/response handling and delegate business logic to services. ### 1.5. Code Splitting Strategies * **Feature-Based Splitting:** Group code based on features (e.g., authentication, user management). This makes it easier to understand and maintain related code. * **Module-Based Splitting:** Split code into modules based on functionality. This improves code organization and reusability. * **Lazy Loading (Future Enhancement):** For very large applications, consider lazy loading modules or features to reduce initial startup time. 
This can be accomplished by dynamically enabling parts of your application based on configuration or runtime conditions.

## 2. Common Patterns and Anti-patterns

### 2.1. Design Patterns Specific to Actix-web

* **Extractor Pattern:** Use extractors to handle different types of incoming data (e.g., `Path`, `Query`, `Json`, `Form`). Extractors simplify handler logic and provide type safety.
* **Middleware Pattern:** Implement custom middleware for tasks like logging, authentication, and request modification. Middleware allows you to intercept and process requests before they reach the handlers.
* **State Management Pattern:** Use `web::Data` to share application state across handlers. This provides a thread-safe way to access shared resources like database connections and configuration settings.
* **Error Handling Pattern:** Define custom error types and implement the `ResponseError` trait for centralized error handling and consistent error responses.

### 2.2. Recommended Approaches for Common Tasks

* **Database Integration:** Use an asynchronous database driver like `tokio-postgres` or `sqlx` for efficient database interactions.
* **Authentication:** Implement authentication using JWT (JSON Web Tokens) or other secure authentication mechanisms.
* **Authorization:** Implement role-based access control (RBAC) or attribute-based access control (ABAC) to restrict access to resources based on user roles or attributes.
* **Logging:** Use a logging framework like `tracing` or `log` for structured logging and monitoring.
* **Configuration Management:** Use a configuration library like `config` or `dotenv` to manage application settings from environment variables and configuration files.

### 2.3. Anti-patterns and Code Smells to Avoid

* **Long Handler Functions:** Keep handler functions short and focused. Delegate complex logic to services or helper functions.
* **Tight Coupling:** Avoid tight coupling between modules. Use interfaces and dependency injection to decouple components.
* **Ignoring Errors:** Always handle errors gracefully and provide informative error messages to the client.
* **Blocking Operations in Handlers:** Avoid performing blocking operations (e.g., synchronous I/O) in handler functions. Use asynchronous operations to prevent blocking the event loop.
* **Overusing Global State:** Minimize the use of global state. Prefer passing state as dependencies to handler functions.

### 2.4. State Management Best Practices

* **Immutable State:** Prefer immutable state whenever possible. This reduces the risk of race conditions and makes it easier to reason about the application.
* **Thread-Safe Data Structures:** Use thread-safe data structures like `Arc<Mutex<T>>` or `RwLock` to share mutable state across threads.
* **Avoid Direct Mutability:** Avoid directly mutating shared state. Instead, use atomic operations or message passing to coordinate state updates.
* **Dependency Injection:** Use dependency injection to provide state to handler functions. This makes it easier to test and configure the application.

### 2.5. Error Handling Patterns

* **Custom Error Types:** Define custom error types to represent different error scenarios in your application.
* **`ResponseError` Trait:** Implement the `ResponseError` trait for custom error types to generate appropriate HTTP responses.
* **Centralized Error Handling:** Use a centralized error handling mechanism (e.g., middleware) to catch and process errors consistently.
* **Informative Error Messages:** Provide informative error messages to the client to help them understand and resolve the issue. * **Logging Errors:** Log errors with sufficient detail to help diagnose and debug issues. * **`Result` Type:** Leverage the `Result` type effectively, propagating errors up the call stack using the `?` operator, and handle them at the appropriate level. ## 3. Performance Considerations Optimizing performance is crucial for building scalable and responsive actix-web applications. ### 3.1. Optimization Techniques * **Asynchronous Operations:** Use asynchronous operations for I/O-bound tasks (e.g., database access, network requests) to prevent blocking the event loop. * **Connection Pooling:** Use connection pooling for database connections to reduce the overhead of establishing new connections. * **Caching:** Implement caching for frequently accessed data to reduce database load and improve response times. * **Compression:** Enable compression (e.g., gzip) for responses to reduce the amount of data transmitted over the network. * **Keep-Alive Connections:** Use keep-alive connections to reuse existing TCP connections and reduce connection establishment overhead. ### 3.2. Memory Management * **Avoid Unnecessary Cloning:** Minimize cloning of data to reduce memory allocations and copying. * **Use References:** Use references instead of copying data whenever possible. * **Smart Pointers:** Use smart pointers (e.g., `Box`, `Arc`, `Rc`) to manage memory efficiently. * **String Handling:** Be mindful of string handling. Use `String` when ownership is needed, and `&str` when a read-only view is sufficient. ### 3.3. Rendering Optimization * **Template Caching:** Cache templates to reduce the overhead of parsing and compiling templates on each request. * **Minimize DOM Updates:** Minimize DOM updates in the client-side JavaScript code to improve rendering performance. * **Efficient Serialization:** Ensure your data serialization is efficient, using appropriate data structures and serialization libraries (e.g., `serde_json`). ### 3.4. Bundle Size Optimization * **Dependency Pruning:** Remove unused dependencies from your `Cargo.toml` file to reduce the bundle size. * **Feature Flags:** Use feature flags to enable or disable optional features at compile time. * **Code Minification:** Use code minification to reduce the size of your JavaScript and CSS files. ### 3.5. Lazy Loading Strategies * **Lazy Initialization:** Use lazy initialization for expensive resources to defer their creation until they are actually needed. * **On-Demand Loading:** Load resources on demand (e.g., images, data) to reduce the initial load time. ## 4. Security Best Practices Security is paramount for building robust and reliable actix-web applications. ### 4.1. Common Vulnerabilities and How to Prevent Them * **SQL Injection:** Use parameterized queries or ORMs to prevent SQL injection attacks. * **Cross-Site Scripting (XSS):** Sanitize user input and escape output to prevent XSS attacks. * **Cross-Site Request Forgery (CSRF):** Implement CSRF protection to prevent unauthorized requests from other websites. * **Authentication and Authorization:** Use strong authentication and authorization mechanisms to protect sensitive resources. * **Denial-of-Service (DoS):** Implement rate limiting and other defense mechanisms to prevent DoS attacks. ### 4.2. Input Validation * **Validate All Input:** Validate all user input to ensure that it conforms to the expected format and range. 
* **Use Type Safety:** Use type safety to prevent invalid data from being processed. * **Regular Expressions:** Use regular expressions to validate complex input patterns. * **Whitelist vs. Blacklist:** Prefer whitelisting valid input over blacklisting invalid input. ### 4.3. Authentication and Authorization Patterns * **JWT (JSON Web Tokens):** Use JWT for stateless authentication and authorization. * **OAuth 2.0:** Use OAuth 2.0 for delegated authorization. * **RBAC (Role-Based Access Control):** Use RBAC to restrict access to resources based on user roles. * **ABAC (Attribute-Based Access Control):** Use ABAC to restrict access to resources based on user attributes. * **Password Hashing:** Always hash passwords using a strong hashing algorithm (e.g., bcrypt, Argon2) and store them securely. ### 4.4. Data Protection Strategies * **Encryption:** Encrypt sensitive data at rest and in transit. * **Data Masking:** Mask sensitive data to prevent unauthorized access. * **Data Anonymization:** Anonymize data to protect user privacy. * **Access Control:** Implement strict access control policies to restrict access to sensitive data. ### 4.5. Secure API Communication * **HTTPS:** Use HTTPS for all API communication to encrypt data in transit. * **TLS Certificates:** Use valid TLS certificates from a trusted certificate authority. * **API Keys:** Use API keys to authenticate API clients. * **Rate Limiting:** Implement rate limiting to prevent abuse and DoS attacks. ## 5. Testing Approaches Thorough testing is essential for ensuring the quality and reliability of actix-web applications. ### 5.1. Unit Testing Strategies * **Test Individual Modules:** Unit test individual modules and functions in isolation. * **Mock Dependencies:** Use mocking to isolate units from external dependencies (e.g., database, API). * **Test Edge Cases:** Test edge cases and boundary conditions to ensure that the code handles them correctly. * **Table-Driven Tests:** Use table-driven tests to test multiple scenarios with different inputs and expected outputs. ### 5.2. Integration Testing * **Test API Endpoints:** Integration test API endpoints to ensure that they function correctly together. * **Test Database Interactions:** Test database interactions to ensure that data is read and written correctly. * **Test Middleware:** Test middleware to ensure that they correctly process requests and responses. ### 5.3. End-to-End Testing * **Simulate User Interactions:** End-to-end tests simulate user interactions to test the entire application flow. * **Use a Testing Framework:** Use a testing framework (e.g., Selenium, Cypress) to automate end-to-end tests. ### 5.4. Test Organization * **Test Directory:** Keep tests in a separate `tests` directory. * **Test Modules:** Organize tests into modules that mirror the application structure. * **Test Naming:** Use descriptive names for test functions to indicate what they are testing. ### 5.5. Mocking and Stubbing * **Mock External Dependencies:** Mock external dependencies (e.g., database, API) to isolate units from external factors. * **Use Mocking Libraries:** Use mocking libraries (e.g., `mockall`) to create mock objects and define their behavior. * **Stub Data:** Use stub data to simulate different scenarios and test edge cases. ## 6. Common Pitfalls and Gotchas Be aware of common pitfalls and gotchas when developing actix-web applications. ### 6.1. 
Frequent Mistakes Developers Make * **Blocking Operations:** Performing blocking operations in handler functions can block the event loop and degrade performance. * **Incorrect Error Handling:** Ignoring errors or not handling them correctly can lead to unexpected behavior and security vulnerabilities. * **Not Validating Input:** Not validating user input can lead to security vulnerabilities and data corruption. * **Overusing Global State:** Overusing global state can make the application difficult to reason about and test. * **Not Using Asynchronous Operations:** Not using asynchronous operations for I/O-bound tasks can degrade performance. ### 6.2. Edge Cases to be Aware Of * **Handling Large Requests:** Be mindful of handling large requests and implement appropriate size limits to prevent DoS attacks. * **Handling Concurrent Requests:** Ensure that the application can handle concurrent requests efficiently and without race conditions. * **Handling Network Errors:** Handle network errors gracefully and provide informative error messages to the client. * **Handling Database Connection Errors:** Handle database connection errors gracefully and implement retry mechanisms. ### 6.3. Version-Specific Issues * **Breaking Changes:** Be aware of breaking changes in actix-web and its dependencies. * **Deprecated Features:** Avoid using deprecated features and migrate to the recommended alternatives. * **Compatibility:** Ensure that the application is compatible with the target Rust version and operating system. ### 6.4. Compatibility Concerns * **Rust Version:** Ensure compatibility with the supported Rust versions. * **Operating System:** Test on different operating systems (Linux, macOS, Windows). * **Browser Compatibility (if applicable):** If the application includes a front-end, test with various browsers. ### 6.5. Debugging Strategies * **Logging:** Use logging to track the application's execution flow and identify potential issues. * **Debugging Tools:** Use debugging tools (e.g., `gdb`, `lldb`) to inspect the application's state and step through the code. * **Unit Tests:** Write unit tests to isolate and debug individual modules and functions. * **Profiling:** Use profiling tools to identify performance bottlenecks. ## 7. Tooling and Environment Using the right tools and environment can significantly improve the development experience and productivity. ### 7.1. Recommended Development Tools * **Rust IDE:** Use a Rust IDE (e.g., Visual Studio Code with the Rust extension, IntelliJ Rust) for code completion, syntax highlighting, and debugging. * **Cargo:** Use Cargo, the Rust package manager, for managing dependencies and building the application. * **Rustup:** Use Rustup for managing Rust toolchains and versions. * **Clippy:** Use Clippy, a Rust linter, for identifying potential code quality issues. * **Formatter:** Use rustfmt to automatically format the code according to the Rust style guide. ### 7.2. Build Configuration * **Release Mode:** Build the application in release mode for optimized performance. * **Link-Time Optimization (LTO):** Enable link-time optimization to improve performance. * **Codegen Units:** Experiment with different codegen unit settings to optimize compilation time and code size. ### 7.3. Linting and Formatting * **Clippy:** Use Clippy to identify potential code quality issues and enforce coding standards. * **Rustfmt:** Use rustfmt to automatically format the code according to the Rust style guide. 
* **Pre-commit Hooks:** Use pre-commit hooks to automatically run Clippy and rustfmt before committing changes. ### 7.4. Deployment Best Practices * **Containerization:** Use containerization (e.g., Docker) to package the application and its dependencies into a portable container. * **Orchestration:** Use container orchestration (e.g., Kubernetes) to manage and scale the application. * **Reverse Proxy:** Use a reverse proxy (e.g., Nginx, Apache) to handle incoming requests and route them to the application. * **Load Balancing:** Use load balancing to distribute traffic across multiple instances of the application. * **Monitoring:** Implement monitoring to track the application's health and performance. ### 7.5. CI/CD Integration * **Continuous Integration (CI):** Use a CI system (e.g., GitHub Actions, GitLab CI, Jenkins) to automatically build, test, and lint the code on every commit. * **Continuous Delivery (CD):** Use a CD system to automatically deploy the application to production after it passes all tests. * **Automated Testing:** Automate unit, integration, and end-to-end tests in the CI/CD pipeline. By following these best practices, you can build robust, efficient, and maintainable actix-web applications that meet the highest standards of quality and security. Remember to stay up-to-date with the latest recommendations and adapt them to your specific project needs. ``` ## /rules-mdc/aiohttp.mdc ```mdc path="/rules-mdc/aiohttp.mdc" --- description: Comprehensive guide for aiohttp development covering code organization, performance, security, testing, and deployment best practices. Provides actionable guidance for developers to build robust and maintainable aiohttp applications. globs: **/*.py --- # Aiohttp Best Practices This document provides a comprehensive guide to aiohttp development, covering code organization, performance, security, testing, and deployment. Library Information: - Name: aiohttp - Tags: web, python, http-client, async ## 1. Code Organization and Structure ### 1.1. Directory Structure Best Practices: * **Project Root:** * `src/`: Contains the main application code. * `main.py`: Entry point of the application. * `app.py`: Application factory and setup. * `routes.py`: Defines application routes. * `handlers/`: Contains request handlers. * `user_handlers.py`: User-related handlers. * `product_handlers.py`: Product-related handlers. * `middlewares/`: Custom middleware components. * `logging_middleware.py`: Logging middleware. * `auth_middleware.py`: Authentication middleware. * `utils/`: Utility modules. * `db.py`: Database connection and utilities. * `config.py`: Configuration management. * `tests/`: Contains unit and integration tests. * `conftest.py`: Pytest configuration file. * `unit/`: Unit tests. * `integration/`: Integration tests. * `static/`: Static files (CSS, JavaScript, images). * `templates/`: Jinja2 or other template files. * `docs/`: Project documentation. * `requirements.txt`: Python dependencies. * `Dockerfile`: Docker configuration file. * `docker-compose.yml`: Docker Compose configuration. * `.env`: Environment variables. * `README.md`: Project description and instructions. * `.gitignore`: Specifies intentionally untracked files that Git should ignore. * `.cursor/rules/`: Project specific Cursor AI rules. ### 1.2. File Naming Conventions: * Python files: `snake_case.py` (e.g., `user_handlers.py`, `database_utils.py`). * Class names: `CamelCase` (e.g., `UserHandler`, `DatabaseConnection`). 
* Function names: `snake_case` (e.g., `get_user`, `create_product`).
* Variables: `snake_case` (e.g., `user_id`, `product_name`).
* Constants: `UPPER_SNAKE_CASE` (e.g., `DEFAULT_PORT`, `MAX_CONNECTIONS`).

### 1.3. Module Organization:

* Group related functionality into modules.
* Use clear and descriptive module names.
* Avoid circular dependencies.
* Keep modules focused and concise.
* Use packages to organize modules into a hierarchical structure.

### 1.4. Component Architecture:

* **Layered Architecture:** Separate the application into distinct layers (e.g., presentation, business logic, data access).
* **Microservices Architecture:** Decompose the application into small, independent services.
* **Hexagonal Architecture (Ports and Adapters):** Decouple the application core from external dependencies.
* **MVC (Model-View-Controller):** Organize the application into models (data), views (presentation), and controllers (logic).

### 1.5. Code Splitting Strategies:

* **Route-based splitting:** Load modules based on the requested route.
* **Feature-based splitting:** Divide the application into feature modules.
* **Component-based splitting:** Split the application into reusable components.
* **On-demand loading:** Load modules only when they are needed.
* **Asynchronous loading:** Use `asyncio.gather` or similar techniques to load modules concurrently.

## 2. Common Patterns and Anti-patterns

### 2.1. Design Patterns:

* **Singleton:** For managing shared resources like database connections or configuration objects.
* **Factory:** For creating instances of classes with complex initialization logic.
* **Strategy:** For implementing different algorithms or behaviors.
* **Observer:** For implementing event-driven systems.
* **Middleware:** For handling cross-cutting concerns like logging, authentication, and error handling.

### 2.2. Recommended Approaches for Common Tasks:

* **Request Handling:** Use request handlers to process incoming requests.
* **Routing:** Use `aiohttp.web.RouteTableDef` for defining routes.
* **Middleware:** Implement middleware for request pre-processing and response post-processing.
* **Data Serialization:** Use `aiohttp.web.json_response` for serializing data to JSON.
* **Error Handling:** Implement custom error handlers to handle exceptions gracefully.
* **Session Management:** Use `aiohttp-session` for managing user sessions.
* **WebSockets:** Utilize `aiohttp.web.WebSocketResponse` for handling WebSocket connections.

### 2.3. Anti-patterns and Code Smells:

* **Creating a new `ClientSession` for each request:** This is a performance bottleneck. Reuse a single `ClientSession`.
* **Blocking operations in asynchronous code:** Avoid using blocking operations (e.g., `time.sleep`) in asynchronous code.
* **Ignoring exceptions:** Always handle exceptions properly to prevent unexpected behavior.
* **Overusing global variables:** Avoid using global variables as much as possible to maintain code clarity and testability.
* **Tight coupling:** Decouple components to improve maintainability and reusability.
* **Hardcoding configuration:** Use environment variables or configuration files to manage configuration settings.
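
Several of the recommended approaches above combine naturally. The sketch below is illustrative only (handler, route, and middleware names are arbitrary) and shows `RouteTableDef`, `json_response`, and a logging middleware together:

    import logging
    from aiohttp import web

    logger = logging.getLogger(__name__)
    routes = web.RouteTableDef()

    @routes.get("/users/{user_id}")
    async def get_user(request: web.Request) -> web.Response:
        # Access the matched URL segment and serialize the reply as JSON
        user_id = request.match_info["user_id"]
        return web.json_response({"id": user_id})

    @web.middleware
    async def logging_middleware(request: web.Request, handler):
        # Pre-process the request, then delegate to the next handler
        logger.info("%s %s", request.method, request.path)
        return await handler(request)

    app = web.Application(middlewares=[logging_middleware])
    app.add_routes(routes)

### 2.4. State Management:

* **Application State:** Store application-level state in the `aiohttp.web.Application` instance.
* **Request State:** Store request-specific state in the `aiohttp.web.Request` instance.
* **Session State:** Use `aiohttp-session` to manage user session data.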
* **Database:** Use a database like PostgreSQL, MySQL, or MongoDB to store persistent state. * **Redis/Memcached:** Use in-memory data stores for caching frequently accessed data. ### 2.5. Error Handling: * **Use `try-except` blocks:** Wrap code that may raise exceptions in `try-except` blocks. * **Handle specific exceptions:** Catch specific exception types instead of using a generic `except Exception`. * **Log exceptions:** Log exceptions with detailed information for debugging. * **Return informative error responses:** Return appropriate HTTP status codes and error messages to the client. * **Implement custom error handlers:** Create custom error handlers to handle specific exception types. * **Use `aiohttp.web.HTTPException`:** Raise `aiohttp.web.HTTPException` to return HTTP error responses. ## 3. Performance Considerations ### 3.1. Optimization Techniques: * **Reuse `ClientSession`:** Always reuse a single `ClientSession` instance for making multiple requests. * **Connection Pooling:** aiohttp automatically uses connection pooling, so reuse your session. * **Keep-Alive Connections:** Keep-alive connections are enabled by default, reducing connection overhead. * **Gzip Compression:** Enable Gzip compression for responses to reduce bandwidth usage. * **Caching:** Implement caching for frequently accessed data to reduce database load. * **Optimize Database Queries:** Optimize database queries to improve response times. * **Use Indexes:** Use indexes in your database tables to speed up queries. * **Limit Payload Size:** Keep request and response payloads as small as possible. * **Background Tasks:** Use `asyncio.create_task` to offload long-running tasks to the background. * **Profiling:** Use profiling tools to identify performance bottlenecks. ### 3.2. Memory Management: * **Avoid Memory Leaks:** Ensure that all resources are properly released to prevent memory leaks. * **Use Generators:** Use generators to process large datasets in chunks. * **Limit Object Creation:** Minimize the creation of objects to reduce memory overhead. * **Use Data Structures Efficiently:** Choose appropriate data structures to optimize memory usage. * **Garbage Collection:** Understand how Python's garbage collection works and optimize your code accordingly. ### 3.3. Rendering Optimization: * **Template Caching:** Cache templates to reduce rendering time. * **Minimize Template Logic:** Keep template logic simple and move complex logic to request handlers. * **Use Asynchronous Templates:** Use asynchronous template engines like `aiohttp-jinja2`. * **Optimize Static Files:** Optimize static files (CSS, JavaScript, images) to reduce page load times. ### 3.4. Bundle Size Optimization: * **Minimize Dependencies:** Reduce the number of dependencies in your project. * **Tree Shaking:** Use tree shaking to remove unused code from your bundles. * **Code Minification:** Minify your code to reduce bundle sizes. * **Code Compression:** Compress your code to further reduce bundle sizes. ### 3.5. Lazy Loading: * **Lazy Load Modules:** Load modules only when they are needed. * **Lazy Load Images:** Load images only when they are visible in the viewport. * **Use Asynchronous Loading:** Use `asyncio.gather` or similar techniques to load resources concurrently. ## 4. Security Best Practices ### 4.1. Common Vulnerabilities: * **SQL Injection:** Prevent SQL injection by using parameterized queries or an ORM. * **Cross-Site Scripting (XSS):** Prevent XSS by escaping user input in templates. 
* **Cross-Site Request Forgery (CSRF):** Prevent CSRF by using CSRF tokens. * **Authentication and Authorization Issues:** Implement secure authentication and authorization mechanisms. * **Denial-of-Service (DoS) Attacks:** Implement rate limiting and other measures to prevent DoS attacks. * **Insecure Dependencies:** Keep your dependencies up to date to prevent vulnerabilities. ### 4.2. Input Validation: * **Validate all user input:** Validate all user input to prevent malicious data from entering your application. * **Use a validation library:** Use a validation library like `marshmallow` or `voluptuous` to simplify input validation. * **Sanitize user input:** Sanitize user input to remove potentially harmful characters. * **Limit input length:** Limit the length of input fields to prevent buffer overflows. * **Use regular expressions:** Use regular expressions to validate input patterns. ### 4.3. Authentication and Authorization: * **Use a strong authentication scheme:** Use a strong authentication scheme like OAuth 2.0 or JWT. * **Store passwords securely:** Store passwords securely using a hashing algorithm like bcrypt. * **Implement role-based access control (RBAC):** Use RBAC to control access to resources based on user roles. * **Use secure cookies:** Use secure cookies to protect session data. * **Implement multi-factor authentication (MFA):** Use MFA to add an extra layer of security. ### 4.4. Data Protection: * **Encrypt sensitive data:** Encrypt sensitive data at rest and in transit. * **Use HTTPS:** Use HTTPS to encrypt communication between the client and the server. * **Store data securely:** Store data in a secure location with appropriate access controls. * **Regularly back up data:** Regularly back up data to prevent data loss. * **Comply with data privacy regulations:** Comply with data privacy regulations like GDPR and CCPA. ### 4.5. Secure API Communication: * **Use HTTPS:** Always use HTTPS for API communication. * **Implement API authentication:** Use API keys or tokens to authenticate API requests. * **Rate limit API requests:** Implement rate limiting to prevent abuse. * **Validate API requests:** Validate API requests to prevent malicious data from entering your application. * **Log API requests:** Log API requests for auditing and debugging. ## 5. Testing Approaches ### 5.1. Unit Testing: * **Test individual components:** Unit tests should test individual components in isolation. * **Use a testing framework:** Use a testing framework like `pytest` or `unittest`. * **Write clear and concise tests:** Write clear and concise tests that are easy to understand. * **Test edge cases:** Test edge cases and boundary conditions. * **Use mocks and stubs:** Use mocks and stubs to isolate components under test. ### 5.2. Integration Testing: * **Test interactions between components:** Integration tests should test interactions between different components. * **Test with real dependencies:** Integration tests should use real dependencies whenever possible. * **Test the entire application flow:** Integration tests should test the entire application flow. * **Use a testing database:** Use a testing database to isolate integration tests from the production database. ### 5.3. End-to-End Testing: * **Test the entire system:** End-to-end tests should test the entire system from end to end. * **Use a testing environment:** Use a testing environment that mimics the production environment. 
* **Automate end-to-end tests:** Automate end-to-end tests to ensure that the system is working correctly. * **Use a browser automation tool:** Use a browser automation tool like Selenium or Puppeteer. ### 5.4. Test Organization: * **Organize tests by module:** Organize tests by module to improve test discovery and maintainability. * **Use descriptive test names:** Use descriptive test names that clearly indicate what the test is verifying. * **Use test fixtures:** Use test fixtures to set up and tear down test environments. * **Use test markers:** Use test markers to categorize tests and run specific test suites. ### 5.5. Mocking and Stubbing: * **Use mocks to simulate dependencies:** Use mocks to simulate the behavior of dependencies. * **Use stubs to provide predefined responses:** Use stubs to provide predefined responses to API calls. * **Use mocking libraries:** Use mocking libraries like `unittest.mock` or `pytest-mock`. * **Avoid over-mocking:** Avoid over-mocking, as it can make tests less reliable. ## 6. Common Pitfalls and Gotchas ### 6.1. Frequent Mistakes: * **Not handling exceptions properly:** Always handle exceptions to prevent unexpected behavior. * **Using blocking operations in asynchronous code:** Avoid using blocking operations in asynchronous code. * **Not closing `ClientSession`:** Always close the `ClientSession` to release resources. * **Not validating user input:** Always validate user input to prevent security vulnerabilities. * **Not using HTTPS:** Always use HTTPS for secure communication. ### 6.2. Edge Cases: * **Handling timeouts:** Implement proper timeout handling to prevent requests from hanging indefinitely. * **Handling connection errors:** Handle connection errors gracefully to prevent application crashes. * **Handling large payloads:** Handle large payloads efficiently to prevent memory issues. * **Handling concurrent requests:** Handle concurrent requests properly to prevent race conditions. * **Handling Unicode encoding:** Be aware of Unicode encoding issues when processing text data. ### 6.3. Version-Specific Issues: * **aiohttp version compatibility:** Be aware of compatibility issues between different aiohttp versions. * **asyncio version compatibility:** Be aware of compatibility issues between aiohttp and different asyncio versions. * **Python version compatibility:** Be aware of compatibility issues between aiohttp and different Python versions. ### 6.4. Compatibility Concerns: * **Compatibility with other libraries:** Be aware of compatibility issues between aiohttp and other libraries. * **Compatibility with different operating systems:** Be aware of compatibility issues between aiohttp and different operating systems. * **Compatibility with different web servers:** Be aware of compatibility issues between aiohttp and different web servers. ### 6.5. Debugging Strategies: * **Use logging:** Use logging to track application behavior and identify issues. * **Use a debugger:** Use a debugger to step through code and examine variables. * **Use a profiler:** Use a profiler to identify performance bottlenecks. * **Use error reporting tools:** Use error reporting tools to track and fix errors in production. * **Use a network analyzer:** Use a network analyzer like Wireshark to capture and analyze network traffic. ## 7. Tooling and Environment ### 7.1. Recommended Development Tools: * **IDE:** Use an IDE like VS Code, PyCharm, or Sublime Text. * **Virtual Environment:** Use a virtual environment to isolate project dependencies. 
* **Package Manager:** Use a package manager like pip or poetry to manage dependencies. * **Testing Framework:** Use a testing framework like pytest or unittest. * **Linting Tool:** Use a linting tool like pylint or flake8 to enforce code style. * **Formatting Tool:** Use a formatting tool like black or autopep8 to format code automatically. ### 7.2. Build Configuration: * **Use a build system:** Use a build system like Make or tox to automate build tasks. * **Define dependencies in `requirements.txt` or `pyproject.toml`:** Specify all project dependencies in a `requirements.txt` or `pyproject.toml` file. * **Use a Dockerfile:** Use a Dockerfile to create a containerized build environment. * **Use Docker Compose:** Use Docker Compose to manage multi-container applications. ### 7.3. Linting and Formatting: * **Use a consistent code style:** Use a consistent code style throughout the project. * **Configure linting tools:** Configure linting tools to enforce code style rules. * **Configure formatting tools:** Configure formatting tools to format code automatically. * **Use pre-commit hooks:** Use pre-commit hooks to run linters and formatters before committing code. ### 7.4. Deployment: * **Use a web server:** Use a web server like Nginx or Apache to serve the application. * **Use a process manager:** Use a process manager like Supervisor or systemd to manage the application process. * **Use a reverse proxy:** Use a reverse proxy to improve security and performance. * **Use a load balancer:** Use a load balancer to distribute traffic across multiple servers. * **Use a monitoring system:** Use a monitoring system to track application health and performance. * **Standalone Server:** aiohttp.web.run_app(), simple but doesn't utilize all CPU cores. * **Nginx + Supervisord:** Nginx prevents attacks, allows utilizing all CPU cores, and serves static files faster. * **Nginx + Gunicorn:** Gunicorn launches the app as worker processes, simplifying deployment compared to bare Nginx. ### 7.5. CI/CD Integration: * **Use a CI/CD pipeline:** Use a CI/CD pipeline to automate the build, test, and deployment process. * **Use a CI/CD tool:** Use a CI/CD tool like Jenkins, GitLab CI, or GitHub Actions. * **Run tests in the CI/CD pipeline:** Run tests in the CI/CD pipeline to ensure that code changes don't break the application. * **Automate deployment:** Automate deployment to reduce manual effort and improve consistency. ## Additional Best Practices: * **Session Management:** Always create a `ClientSession` for making requests and reuse it across multiple requests to benefit from connection pooling. Avoid creating a new session for each request, as this can lead to performance issues. * **Error Handling:** Implement robust error handling in your request handlers. Use try-except blocks to manage exceptions, particularly for network-related errors. For example, handle `ConnectionResetError` to manage client disconnections gracefully. * **Middleware Usage:** Utilize middleware for cross-cutting concerns such as logging, error handling, and modifying requests/responses. Define middleware functions that accept a request and a handler, allowing you to process requests before they reach your main handler. * **Graceful Shutdown:** Implement graceful shutdown procedures for your server to ensure that ongoing requests are completed before the application exits. This can be achieved by registering shutdown signals and cleaning up resources. 
* **Security Practices:** When deploying, consider using a reverse proxy like Nginx for added security and performance. Configure SSL/TLS correctly to secure your application.
* **Character Set Detection:** If a response does not include the charset needed to decode the body, `ClientSession` accepts a `fallback_charset_resolver` parameter that can be used to introduce charset-guessing functionality.
* **Persistent Session:** Use a cleanup context (`Application.cleanup_ctx`) when creating a persistent session, so the session is reliably closed on shutdown.
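* **Example (illustrative):** The session-reuse and cleanup advice above fits in a few lines; names such as `client_session_ctx` and the `"client_session"` app key are arbitrary:

      import aiohttp
      from aiohttp import web

      async def client_session_ctx(app: web.Application):
          # Created once at startup and shared by all handlers (connection pooling).
          app["client_session"] = aiohttp.ClientSession()
          yield
          # Closed exactly once at shutdown.
          await app["client_session"].close()

      async def fetch_example(request: web.Request) -> web.Response:
          session = request.app["client_session"]  # reuse; never create per request
          async with session.get("https://example.com") as resp:
              return web.Response(text=await resp.text())

      app = web.Application()
      app.cleanup_ctx.append(client_session_ctx)
      app.add_routes([web.get("/", fetch_example)])

By adhering to these practices, developers can enhance the reliability, performance, and security of their `aiohttp` applications.
```

## /rules-mdc/amazon-ec2.mdc

```mdc path="/rules-mdc/amazon-ec2.mdc"
---
description: This rule file provides best practices, coding standards, and security guidelines for developing, deploying, and maintaining applications using the amazon-ec2 library within the AWS ecosystem. It focuses on infrastructure as code (IaC), resource management, performance, and security considerations for robust and scalable EC2-based solutions.
globs: **/*.{tf,json,yml,yaml,py,js,ts,sh,java,go,rb,m}
---

- ## General Principles
  - **Infrastructure as Code (IaC):** Treat your infrastructure as code. Define and provision AWS resources (EC2 instances, security groups, networks) using code (e.g., AWS CloudFormation, AWS CDK, Terraform). This ensures consistency, repeatability, and version control.
  - **Security First:** Integrate security best practices into every stage of development, from IaC template creation to instance configuration. Implement the principle of least privilege, regularly patch instances, and utilize security assessment tools.
  - **Modularity and Reusability:** Design your infrastructure and application code in modular components that can be reused across multiple projects or environments.
  - **Automation:** Automate as much of the infrastructure provisioning, deployment, and management processes as possible. Use CI/CD pipelines for automated testing and deployment.
  - **Monitoring and Logging:** Implement comprehensive monitoring and logging to track the health, performance, and security of your EC2 instances and applications.
- ## 1. Code Organization and Structure
  - **Directory Structure Best Practices:**
    - Adopt a logical directory structure that reflects the architecture of your application and infrastructure.
    - Example:

        project-root/
        ├── modules/              # Reusable infrastructure modules (e.g., VPC, security groups)
        │   ├── vpc/              # VPC module
        │   │   ├── main.tf       # Terraform configuration for the VPC
        │   │   ├── variables.tf  # Input variables for the VPC module
        │   │   ├── outputs.tf    # Output values for the VPC module
        │   ├── security_group/   # Security Group module
        │   │   ├── ...
        ├── environments/         # Environment-specific configurations
        │   ├── dev/              # Development environment
        │   │   ├── main.tf       # Terraform configuration for the Dev environment
        │   │   ├── variables.tf  # Environment specific variables
        │   ├── prod/             # Production environment
        │   │   ├── ...
        ├── scripts/              # Utility scripts (e.g., deployment scripts, automation scripts)
        │   ├── deploy.sh         # Deployment script
        │   ├── update_ami.py     # Python script to update AMI
        ├── application/          # Application code
        │   ├── src/              # Source code
        │   ├── tests/            # Unit and integration tests
        ├── README.md
        └── ...
  - **File Naming Conventions:**
    - Use consistent and descriptive file names.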
    - Examples:
      - `main.tf`: Main Terraform configuration file
      - `variables.tf`: Terraform variables file
      - `outputs.tf`: Terraform output values file
      - `deploy.sh`: Deployment script
      - `instance.py`: Python module for instance management
  - **Module Organization:**
    - Encapsulate reusable infrastructure components into modules (e.g., VPC, security groups, load balancers).
    - Each module should have:
      - A clear purpose.
      - Well-defined input variables and output values.
      - Comprehensive documentation.
    - Keep modules small and focused.
  - **Component Architecture:**
    - Design your application as a collection of loosely coupled components.
    - Each component should have:
      - A well-defined interface.
      - Clear responsibilities.
      - Independent deployment lifecycle.
  - **Code Splitting:**
    - Break down large application codebases into smaller, manageable modules.
    - Use lazy loading to load modules on demand, reducing initial load time.
    - Example (Python):

        # main.py
        import importlib

        def load_module(module_name):
            module = importlib.import_module(module_name)
            return module

        # Load the module when needed
        my_module = load_module('my_module')
        my_module.my_function()
- ## 2. Common Patterns and Anti-patterns
  - **Design Patterns:**
    - **Singleton:** Use when exactly one instance of a class is needed (e.g., a configuration manager).
    - **Factory:** Use to create objects without specifying their concrete classes (e.g., creating different types of EC2 instances).
    - **Strategy:** Use to define a family of algorithms, encapsulate each one, and make them interchangeable (e.g., different instance termination strategies).
  - **Common Tasks:**
    - **Creating an EC2 Instance (AWS CLI):**

        aws ec2 run-instances --image-id ami-xxxxxxxxxxxxxxxxx --instance-type t2.micro --key-name MyKeyPair --security-group-ids sg-xxxxxxxxxxxxxxxxx
    - **Creating an EC2 Instance (AWS CDK):**

        import * as ec2 from 'aws-cdk-lib/aws-ec2';

        const vpc = new ec2.Vpc(this, 'TheVPC', { maxAzs: 3 });

        const instance = new ec2.Instance(this, 'EC2Instance', {
          vpc: vpc,
          instanceType: ec2.InstanceType.of(ec2.InstanceClass.T2, ec2.InstanceSize.MICRO),
          machineImage: new ec2.AmazonLinuxImage({ generation: ec2.AmazonLinuxGeneration.AMAZON_LINUX_2 }),
        });
    - **Attaching an EBS Volume:**
      - Ensure the EBS volume is in the same Availability Zone as the EC2 instance.
      - Use the `aws ec2 attach-volume` command or the equivalent SDK call.
  - **Anti-patterns:**
    - **Hardcoding AWS Credentials:** Never hardcode AWS credentials in your code. Use IAM roles for EC2 instances and IAM users with restricted permissions for local development.
    - **Creating Publicly Accessible S3 Buckets:** Avoid creating S3 buckets that are publicly accessible without proper security controls.
    - **Ignoring Error Handling:** Always handle exceptions and errors gracefully. Provide meaningful error messages and logging.
    - **Over-Permissive Security Groups:** Implement the principle of least privilege. Grant only the minimum necessary permissions to your security groups.
  - **State Management:**
    - Use state files (e.g., Terraform state) to track the current state of your infrastructure.
    - Store state files securely (e.g., in an S3 bucket with encryption and versioning).
    - Use locking mechanisms to prevent concurrent modifications to the state file.
  - **Error Handling:**
    - Implement robust error handling to catch exceptions and prevent application crashes.
    - Use try-except blocks to handle potential errors.
    - Log error messages with sufficient detail for debugging.
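    - Example (Python, illustrative; assumes `boto3` as the SDK, with a placeholder AMI ID):

        import logging

        import boto3
        from botocore.exceptions import ClientError

        logger = logging.getLogger(__name__)

        def launch_instance(image_id, instance_type="t2.micro"):
            """Launch a single EC2 instance, returning its ID or None on failure."""
            ec2 = boto3.client("ec2")
            try:
                response = ec2.run_instances(
                    ImageId=image_id,          # e.g., a validated, known-good AMI
                    InstanceType=instance_type,
                    MinCount=1,
                    MaxCount=1,
                )
                return response["Instances"][0]["InstanceId"]
            except ClientError as e:
                # Log the AWS error with enough detail for debugging.
                logger.error("run_instances failed: %s", e.response["Error"]["Message"])
                return None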
## 3. Performance Considerations

- **Optimization Techniques:**
  - **Instance Type Selection:** Choose the appropriate EC2 instance type based on your application's requirements (CPU, memory, network).
  - **EBS Optimization:** Use Provisioned IOPS (PIOPS) EBS volumes for high-performance applications.
  - **Caching:** Implement caching mechanisms to reduce database load and improve response times (e.g., using Amazon ElastiCache).
  - **Load Balancing:** Distribute traffic across multiple EC2 instances using an Elastic Load Balancer (ELB).
  - **Auto Scaling:** Use Auto Scaling groups to automatically scale your EC2 instances based on demand.
- **Memory Management:**
  - Monitor memory usage on your EC2 instances.
  - Optimize application code to reduce memory consumption.
  - Use memory profiling tools to identify memory leaks.
- **Bundle Size Optimization:**
  - Minimize the size of your application's deployment package.
  - Remove unnecessary dependencies.
  - Use code minification and compression.
  - Example (bash):

```bash
# Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install only necessary dependencies
pip install --no-cache-dir -r requirements.txt

# Create deployment package
zip -r deployment_package.zip *
```

- **Lazy Loading:**
  - Load application modules on demand to reduce initial load time.
  - Use code splitting to break down large modules into smaller chunks.
  - Example (JavaScript):

```javascript
// main.js
async function loadModule() {
  const module = await import('./my_module.js');
  module.myFunction();
}

loadModule();
```

## 4. Security Best Practices

- **Common Vulnerabilities:**
  - **SQL Injection:** Prevent SQL injection by using parameterized queries and input validation (see the sketch at the end of this section).
  - **Cross-Site Scripting (XSS):** Prevent XSS by sanitizing user input and encoding output.
  - **Remote Code Execution (RCE):** Prevent RCE by validating user input and using secure coding practices.
  - **Unsecured API Endpoints:** Secure API endpoints using authentication and authorization mechanisms.
- **Input Validation:**
  - Validate all user input to prevent malicious code from being injected into your application.
  - Use regular expressions and data type validation.
- **Authentication and Authorization:**
  - Use strong authentication mechanisms (e.g., multi-factor authentication).
  - Implement role-based access control (RBAC) to restrict access to sensitive resources.
  - Use AWS IAM roles for EC2 instances to grant access to AWS resources.
- **Data Protection:**
  - Encrypt sensitive data at rest and in transit.
  - Use HTTPS for all API communication.
  - Store sensitive data in secure storage (e.g., AWS Secrets Manager).
- **Secure API Communication:**
  - Use HTTPS for all API communication.
  - Validate API requests and responses.
  - Implement rate limiting to prevent abuse.
  - Use AWS API Gateway to manage and secure your APIs.
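The SQL-injection rule above is language-agnostic; a minimal sketch in Python, using the standard-library `sqlite3` driver purely for illustration (the same placeholder-binding style applies to most database drivers):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

hostile_input = "alice'; DROP TABLE users; --"  # stays inert
# The driver binds the value; it is never interpolated into the SQL string.
rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (hostile_input,)
).fetchall()
print(rows)  # [] -- no rows match, and no injection occurred
```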
## 5. Testing Approaches

- **Unit Testing:**
  - Write unit tests for individual components to verify their functionality.
  - Use mocking and stubbing to isolate components from external dependencies.
  - Example (Python):

```python
import unittest
from unittest.mock import Mock

class MyComponent:
    def __init__(self, external_dependency):
        self.dependency = external_dependency

    def my_function(self, input_data):
        result = self.dependency.process_data(input_data)
        return result

class TestMyComponent(unittest.TestCase):
    def test_my_function(self):
        # Create a mock for the external dependency
        mock_dependency = Mock()
        mock_dependency.process_data.return_value = "Mocked Result"

        # Create an instance of MyComponent with the mock dependency
        component = MyComponent(mock_dependency)

        # Call the function to be tested
        result = component.my_function("Test Input")

        # Assert the expected behavior
        self.assertEqual(result, "Mocked Result")
        mock_dependency.process_data.assert_called_once_with("Test Input")

if __name__ == '__main__':
    unittest.main()
```

- **Integration Testing:**
  - Write integration tests to verify the interaction between components.
  - Test the integration of your application with AWS services.
- **End-to-End Testing:**
  - Write end-to-end tests to verify the entire application flow.
  - Simulate real user scenarios.
- **Test Organization:**
  - Organize your tests into a logical directory structure.
  - Use meaningful test names.
  - Keep tests independent of each other.
- **Mocking and Stubbing:**
  - Use mocking and stubbing to isolate components from external dependencies.
  - Example (TypeScript, mocking the AWS SDK in Jest):

```typescript
// Mocking AWS SDK calls in Jest
jest.mock('aws-sdk', () => {
  const mEC2 = {
    describeInstances: jest.fn().mockReturnValue({
      promise: jest.fn().mockResolvedValue({ Reservations: [] }),
    }),
  };
  return { EC2: jest.fn().mockImplementation(() => mEC2) };
});
```

## 6. Common Pitfalls and Gotchas

- **Frequent Mistakes:**
  - **Incorrect Security Group Configuration:** Incorrectly configured security groups can expose your EC2 instances to security risks.
  - **Insufficient Resource Limits:** Exceeding AWS resource limits can cause application failures.
  - **Not Using Auto Scaling:** Skipping Auto Scaling can lead to performance bottlenecks and outages during periods of high demand.
  - **Forgetting to Terminate Unused Instances:** Leaving unused EC2 instances running leads to unnecessary costs.
- **Edge Cases:**
  - **Spot Instance Interruptions:** Spot instances can be interrupted with short notice. Design your application to handle spot instance interruptions gracefully (a metadata-polling sketch follows this section).
  - **Network Connectivity Issues:** Network connectivity issues can prevent your application from accessing AWS services or other resources.
- **Version-Specific Issues:**
  - Be aware of version-specific issues with the amazon-ec2 library or AWS services.
  - Consult the documentation for the specific versions you are using.
- **Compatibility Concerns:**
  - Ensure compatibility between your application and the underlying operating system and libraries.
  - Test your application on different operating systems and browsers.
- **Debugging Strategies:**
  - Use logging and monitoring to track the behavior of your application.
  - Use debugging tools to identify and fix errors.
  - Consult the AWS documentation and community forums for help.
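A minimal sketch of polling for a spot interruption notice via the instance metadata service (only meaningful on an EC2 spot instance; the endpoint paths are AWS's documented IMDSv2 routes, while the polling strategy itself is an illustrative assumption):

```python
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"

def spot_interruption_pending() -> bool:
    """True if AWS has scheduled an interruption for this spot instance."""
    # IMDSv2 requires a short-lived session token.
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    token = urllib.request.urlopen(token_req, timeout=2).read().decode()
    try:
        # Returns 200 with an action payload once an interruption is scheduled.
        urllib.request.urlopen(
            urllib.request.Request(
                f"{IMDS}/meta-data/spot/instance-action",
                headers={"X-aws-ec2-metadata-token": token},
            ),
            timeout=2,
        )
        return True  # Start draining work and checkpointing state.
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return False  # No interruption scheduled.
        raise
```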
## 7. Tooling and Environment

- **Recommended Tools:**
  - **AWS CLI:** Command-line interface for interacting with AWS services.
  - **AWS Management Console:** Web-based interface for managing AWS resources.
  - **AWS CloudFormation:** Infrastructure as code service for provisioning and managing AWS resources.
  - **AWS CDK:** Cloud Development Kit for defining cloud infrastructure in code.
  - **Terraform:** Infrastructure as code tool for provisioning and managing cloud resources.
  - **Packer:** Tool for creating machine images.
  - **Ansible:** Configuration management tool.
- **Build Configuration:**
  - Use a build tool (e.g., Make, Gradle, Maven) to automate the build process.
  - Define dependencies and build steps in a build file.
  - Example (Makefile; note that each recipe line runs in its own shell, so the venv's tools are invoked by path rather than activated):

```makefile
# Makefile
venv:
	python3 -m venv .venv
	.venv/bin/pip install -r requirements.txt

deploy:
	zip -r deployment_package.zip *
	aws s3 cp deployment_package.zip s3://my-bucket/deployment_package.zip
	aws lambda update-function-code --function-name my-function --s3-bucket my-bucket --s3-key deployment_package.zip
```

- **Linting and Formatting:**
  - Use a linter (e.g., pylint, eslint) to enforce code style and identify potential errors.
  - Use a formatter (e.g., black, prettier) to automatically format your code.
- **Deployment Best Practices:**
  - Use a deployment pipeline to automate the deployment process.
  - Deploy to a staging environment before deploying to production.
  - Use blue/green deployments to minimize downtime.
- **CI/CD Integration:**
  - Integrate your application with a CI/CD system (e.g., Jenkins, CircleCI, GitLab CI).
  - Automate testing, building, and deployment.

## Additional Considerations

- **Cost Optimization:** Regularly review your AWS resource usage and identify opportunities for cost savings. Consider using Reserved Instances or Spot Instances to reduce costs.
- **Disaster Recovery:** Implement a disaster recovery plan to ensure business continuity in the event of an outage. Use AWS Backup or other backup solutions to protect your data.
- **Compliance:** Ensure that your application complies with relevant regulations and standards (e.g., PCI DSS, HIPAA).

## References

- [AWS EC2 Best Practices](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-best-practices.html)
- [AWS CDK Best Practices](https://docs.aws.amazon.com/cdk/v2/guide/best-practices.html)
- [Terraform AWS Provider Best Practices](https://docs.aws.amazon.com/prescriptive-guidance/latest/terraform-aws-provider-best-practices/structure.html)
```

## /rules-mdc/amazon-s3.mdc

```mdc path="/rules-mdc/amazon-s3.mdc"
---
description: This rule file provides comprehensive best practices, coding standards, and security guidelines for developing applications using Amazon S3. It aims to ensure secure, performant, and maintainable S3 integrations.
globs: **/*S3*.{js,ts,jsx,tsx,py,java,go,csharp}
---

- Always disable public access to S3 buckets unless explicitly needed. Use AWS Identity and Access Management (IAM) policies and bucket policies for access control instead of Access Control Lists (ACLs), which are now generally deprecated.
- Implement encryption for data at rest using Server-Side Encryption (SSE), preferably with AWS Key Management Service (KMS) for enhanced security (a boto3 sketch of these first two rules follows below).
- Use S3 Transfer Acceleration for faster uploads over long distances and enable versioning to protect against accidental deletions. Monitor performance using Amazon CloudWatch and enable logging for auditing. Additionally, consider using S3 Storage Lens for insights into storage usage and activity trends.
- Leverage S3's lifecycle policies to transition objects to cheaper storage classes based on access patterns, and regularly review your storage usage to optimize costs. Utilize S3 Intelligent-Tiering for automatic cost savings based on changing access patterns.
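A minimal sketch applying the first two rules above with boto3 (an assumption; the bucket name is a placeholder). It blocks all public access and sets SSE-KMS as the bucket default so every new object is encrypted at rest:

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-bucket"  # placeholder

# Rule 1: block every form of public access to the bucket.
s3.put_public_access_block(
    Bucket=bucket,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)

# Rule 2: default every new object to SSE-KMS encryption at rest.
s3.put_bucket_encryption(
    Bucket=bucket,
    ServerSideEncryptionConfiguration={
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                # "KMSMasterKeyID": "alias/my-key",  # optional customer-managed key
            }
        }]
    },
)
```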
## Amazon S3 Best Practices and Coding Standards

This document provides comprehensive best practices, coding standards, and security guidelines for developing applications using Amazon S3. Following these guidelines will help ensure that your S3 integrations are secure, performant, maintainable, and cost-effective.

### 1. Code Organization and Structure

#### 1.1. Directory Structure Best Practices

Organize your code related to Amazon S3 into logical directories based on functionality.

```
project/
├── src/
│   ├── s3/
│   │   ├── utils.js       # Utility functions for S3 operations
│   │   ├── uploader.js    # Handles uploading files to S3
│   │   ├── downloader.js  # Handles downloading files from S3
│   │   ├── config.js      # Configuration for S3 (bucket name, region, etc.)
│   │   ├── errors.js      # Custom error handling for S3 operations
│   │   └── index.js       # Entry point for S3 module
│   ├── ...
│   └── tests/
│       ├── s3/
│       │   ├── uploader.test.js  # Unit tests for uploader.js
│       │   └── ...
│       └── ...
├── ...
```

#### 1.2. File Naming Conventions

Use descriptive and consistent file names.

* `uploader.js`: Module for uploading files to S3.
* `downloader.js`: Module for downloading files from S3.
* `s3_service.py`: (Python example) Defines S3-related services.
* `S3Manager.java`: (Java example) Manages the S3 client and configurations.

#### 1.3. Module Organization

* **Single Responsibility Principle:** Each module should have a clear and specific purpose (e.g., uploading, downloading, managing bucket lifecycle).
* **Abstraction:** Hide complex S3 operations behind simpler interfaces.
* **Configuration:** Store S3 configuration details (bucket name, region, credentials) in a separate configuration file.

Example (JavaScript):

```javascript
// s3/uploader.js
import AWS from 'aws-sdk';
import config from './config';

const s3 = new AWS.S3(config.s3);

export const uploadFile = async (file, key) => {
  const params = { Bucket: config.s3.bucketName, Key: key, Body: file };
  try {
    await s3.upload(params).promise();
    console.log(`File uploaded successfully: ${key}`);
  } catch (error) {
    console.error('Error uploading file:', error);
    throw error; // Re-throw for handling in the caller.
  }
};
```

#### 1.4. Component Architecture

For larger applications, consider a component-based architecture. This can involve creating distinct components for different S3-related tasks. For example:

* **Upload Component:** Handles file uploads, progress tracking, and error handling.
* **Download Component:** Handles file downloads, progress tracking, and caching.
* **Management Component:** Manages bucket creation, deletion, and configuration.

#### 1.5. Code Splitting

If you have a large application using S3, consider using code splitting to reduce the initial load time. This involves breaking your code into smaller chunks that can be loaded on demand, and is especially relevant for front-end applications using S3 for asset storage.

* **Dynamic Imports:** Use dynamic imports to load S3-related modules only when needed.
* **Webpack/Rollup:** Configure your bundler to create separate chunks for S3 code.

### 2. Common Patterns and Anti-patterns

#### 2.1. Design Patterns

* **Strategy Pattern:** Use a strategy pattern to handle different storage classes or encryption methods.
* **Factory Pattern:** Use a factory pattern to create S3 clients with different configurations.
* **Singleton Pattern:** Use a singleton pattern if you want to share one S3 client instance across all S3 interactions (a Python client-factory sketch follows this section).
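A minimal sketch of the singleton idea in Python (an assumption; the JavaScript examples in this file would achieve the same thing with a module-level client). `functools.lru_cache` memoizes the factory, so repeated calls with the same region share one configured client:

```python
from functools import lru_cache

import boto3

@lru_cache(maxsize=None)
def s3_client(region: str = "us-east-1"):
    # One client per region, created on first use and then reused everywhere.
    return boto3.client("s3", region_name=region)

assert s3_client() is s3_client()  # same instance on every call
```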
#### 2.2. Recommended Approaches for Common Tasks

* **Uploading large files:** Use multipart upload for large objects; AWS recommends it once objects approach 100 MB, and each part (except the last) must be at least 5 MB. Multipart upload lets you upload parts in parallel and resume interrupted uploads.
* **Downloading large files:** Use byte-range fetches to download files in chunks (see the byte-range sketch at the end of this section). This is useful for resuming interrupted downloads and for accessing specific portions of a file.
* **Deleting multiple objects:** Use the `deleteObjects` API to delete multiple objects in a single request. This is more efficient than deleting objects one by one.

#### 2.3. Anti-patterns and Code Smells

* **Hardcoding credentials:** Never hardcode AWS credentials in your code. Use IAM roles or environment variables.
* **Insufficient error handling:** Always handle errors from S3 operations gracefully. Provide informative error messages and retry failed operations.
* **Ignoring bucket access control:** Properly configure bucket policies and IAM roles to restrict access to your S3 buckets.
* **Overly permissive bucket policies:** Avoid granting overly broad permissions in your bucket policies. Follow the principle of least privilege.
* **Not using versioning:** Failing to enable versioning can lead to data loss if objects are accidentally deleted or overwritten.
* **Assuming immediate consistency everywhere:** S3 has offered strong read-after-write consistency since December 2020, but replicated copies (e.g., Cross-Region Replication) and caches can still serve stale data. Design your application accordingly.
* **Polling for object existence:** Instead of polling, use S3 events to trigger actions when objects are created or modified.
* **Inefficient data retrieval:** Avoid retrieving entire objects when only a portion of the data is needed. Use byte-range fetches or S3 Select to retrieve only the necessary data.

#### 2.4. State Management

* **Stateless operations:** Design your S3 operations to be stateless whenever possible. This makes your application more scalable and resilient.
* **Caching:** Use caching to reduce the number of requests to S3. Consider using a CDN (Content Delivery Network) to cache frequently accessed objects.
* **Session management:** If you need to maintain state, store session data in a separate database or cache, not in S3.

#### 2.5. Error Handling

* **Retry mechanism:** Implement retry logic with exponential backoff for transient errors.
* **Specific error handling:** Handle different S3 errors differently (e.g., retry 503 errors, log 403 errors).
* **Centralized error logging:** Log all S3 errors to a centralized logging system for monitoring and analysis.

Example (Python):

```python
import time

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

def upload_file(file_name, bucket, object_name=None):
    """Upload a file to an S3 bucket, retrying transient errors."""
    if object_name is None:
        object_name = file_name

    for attempt in range(3):  # Retry up to 3 times
        try:
            s3.upload_file(file_name, bucket, object_name)
            return True
        except ClientError as e:
            if e.response['Error']['Code'] == 'NoSuchBucket':
                print(f"The bucket {bucket} does not exist.")
                return False
            elif e.response['Error']['Code'] == 'AccessDenied':
                print("Access denied. Check your credentials and permissions.")
                return False
            else:
                print(f"An error occurred: {e}")
                if attempt < 2:
                    time.sleep(2 ** attempt)  # Exponential backoff, then retry
                else:
                    return False
        except Exception as e:
            print(f"An unexpected error occurred: {e}")
            return False
    return False  # Reached max retries and failed
```
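A minimal sketch of the byte-range fetch mentioned above, using boto3 (bucket and key names are placeholders). The `Range` parameter uses standard HTTP semantics, so `bytes=0-1023` returns the first KiB of the object:

```python
import boto3

s3 = boto3.client("s3")
resp = s3.get_object(
    Bucket="example-bucket",
    Key="large-file.bin",
    Range="bytes=0-1023",  # fetch only the first 1024 bytes
)
chunk = resp["Body"].read()
print(len(chunk))  # up to 1024 bytes
```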
### 3. Performance Considerations

#### 3.1. Optimization Techniques

* **Use S3 Transfer Acceleration:** If you are uploading or downloading files from a geographically distant location, use S3 Transfer Acceleration to improve performance. It routes traffic through Amazon CloudFront's globally distributed edge locations.
* **Use multipart upload:** For large files, use multipart upload to upload parts in parallel. AWS recommends it for objects of roughly 100 MB and larger; 5 MB is the minimum part size, not the threshold.
* **Enable gzip compression:** Compress objects before uploading them to S3 to reduce storage costs and improve download times. Set the `Content-Encoding` header to `gzip` when uploading compressed objects.
* **Use HTTP/2 where it is available:** S3 endpoints serve requests over HTTP/1.1; to benefit from HTTP/2 (and HTTP/3), serve objects through Amazon CloudFront.
* **Optimize object sizes:** Store related data in a single object to reduce the number of requests to S3.

#### 3.2. Memory Management

* **Stream data:** Avoid loading entire files into memory. Use streams to process data in chunks.
* **Release resources:** Release S3 client objects when they are no longer needed.

#### 3.3. Bundle Size Optimization

* **Tree shaking:** Use a bundler that supports tree shaking to remove unused code from your bundle.
* **Code splitting:** Split your code into smaller chunks that can be loaded on demand.

#### 3.4. Lazy Loading

* **Load images on demand:** Load images from S3 only when they are visible on the screen.
* **Lazy load data:** Load data from S3 only when it is needed.

### 4. Security Best Practices

#### 4.1. Common Vulnerabilities

* **Publicly accessible buckets:** Ensure that your S3 buckets are not publicly accessible.
* **Insufficient access control:** Properly configure bucket policies and IAM roles to restrict access to your S3 buckets.
* **Cross-site scripting (XSS):** Sanitize user input to prevent XSS attacks if you are serving content directly from S3.
* **Data injection:** Validate all data before storing it in S3 to prevent data injection attacks.

#### 4.2. Input Validation

* **Validate file types:** Validate the file types of uploaded objects to prevent malicious files from being stored in S3.
* **Validate file sizes:** Limit the file sizes of uploaded objects to prevent denial-of-service attacks.
* **Sanitize file names:** Sanitize file names to prevent directory traversal attacks.

#### 4.3. Authentication and Authorization

* **Use IAM roles:** Use IAM roles to grant permissions to applications running on EC2 instances or other AWS services.
* **Use temporary credentials:** Use temporary credentials for applications that need to access S3 from outside of AWS. You can use AWS STS (Security Token Service) to generate temporary credentials.
* **Principle of least privilege:** Grant only the minimum permissions required for each user or application.

#### 4.4. Data Protection

* **Encrypt data at rest:** Use server-side encryption (SSE) or client-side encryption to encrypt data at rest in S3 (a per-object SSE-KMS sketch follows this section).
* **Encrypt data in transit:** Use HTTPS to encrypt data in transit between your application and S3.
* **Enable versioning:** Enable versioning to protect against accidental data loss.
* **Enable MFA Delete:** Require multi-factor authentication to delete objects from S3.
* **Object locking:** Use S3 Object Lock to prevent objects from being deleted or overwritten for a specified period of time.
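A minimal sketch of encrypting a single object at rest with SSE-KMS via boto3 (bucket and key names are placeholders). With a bucket-level default in place, as in the earlier sketch, this per-request setting becomes optional:

```python
import boto3

s3 = boto3.client("s3")
s3.put_object(
    Bucket="example-bucket",
    Key="reports/2024-summary.json",
    Body=b'{"status": "ok"}',
    # SSE-KMS; omit SSEKMSKeyId to use the AWS-managed key for S3.
    ServerSideEncryption="aws:kms",
)
```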
#### 4.5. Secure API Communication

* **Use HTTPS:** Always use HTTPS to communicate with the S3 API.
* **Validate certificates:** Validate the SSL/TLS certificates of the S3 endpoints.
* **Restrict access:** Restrict access to the S3 API using IAM policies and bucket policies.

### 5. Testing Approaches

#### 5.1. Unit Testing

* **Mock the S3 client:** Mock the S3 client to isolate your unit tests.
* **Test individual functions:** Test individual functions that interact with S3.
* **Verify error handling:** Verify that your code handles S3 errors correctly.

Example (JavaScript with Jest):

```javascript
// s3/uploader.test.js
import { uploadFile } from './uploader';
import AWS from 'aws-sdk';

jest.mock('aws-sdk', () => {
  const mS3 = {
    upload: jest.fn().mockReturnThis(),
    promise: jest.fn(),
  };
  return { S3: jest.fn(() => mS3) };
});

describe('uploadFile', () => {
  it('should upload file successfully', async () => {
    const mockS3 = new AWS.S3();
    mockS3.promise.mockResolvedValue({});

    const file = 'test file content';
    const key = 'test.txt';

    await uploadFile(file, key);

    expect(mockS3.upload).toHaveBeenCalledWith({
      Bucket: 'your-bucket-name',
      Key: key,
      Body: file,
    });
  });

  it('should handle upload error', async () => {
    const mockS3 = new AWS.S3();
    mockS3.promise.mockRejectedValue(new Error('Upload failed'));

    const file = 'test file content';
    const key = 'test.txt';

    await expect(uploadFile(file, key)).rejects.toThrow('Upload failed');
  });
});
```

#### 5.2. Integration Testing

* **Test with real S3 buckets:** Create a dedicated S3 bucket for integration tests.
* **Test end-to-end flows:** Test complete workflows that involve S3 operations.
* **Verify data integrity:** Verify that data is correctly stored and retrieved from S3.

#### 5.3. End-to-End Testing

* **Simulate user scenarios:** Simulate real user scenarios to test your application's S3 integration.
* **Monitor performance:** Monitor the performance of your S3 integration under load.

#### 5.4. Test Organization

* **Separate test directories:** Create separate test directories for unit tests, integration tests, and end-to-end tests.
* **Descriptive test names:** Use descriptive test names that clearly indicate what is being tested.

#### 5.5. Mocking and Stubbing

* **Mock the S3 client:** Use a mocking library (e.g., Jest, Mockito) to mock the S3 client.
* **Stub S3 responses:** Stub S3 API responses to simulate different scenarios (a Python `Stubber` sketch follows this section).
* **Use dependency injection:** Use dependency injection to inject mocked S3 clients into your components.
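A minimal Python sketch of response stubbing, using botocore's built-in `Stubber` (one option among the mocking approaches named above; the bucket name is a placeholder). `add_response` registers both the canned response and the exact parameters the call must be made with:

```python
import boto3
from botocore.stub import Stubber

s3 = boto3.client("s3", region_name="us-east-1")
stubber = Stubber(s3)
stubber.add_response(
    "list_objects_v2",
    {"KeyCount": 0},               # canned response the client will return
    {"Bucket": "example-bucket"},  # parameters the call is expected to use
)

with stubber:
    resp = s3.list_objects_v2(Bucket="example-bucket")
    assert resp["KeyCount"] == 0  # no network call was made
```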
### 6. Common Pitfalls and Gotchas

#### 6.1. Frequent Mistakes

* **Forgetting to handle errors:** Failing to handle S3 errors can lead to unexpected behavior and data loss.
* **Using the incorrect region:** Using the wrong region can result in connection errors and data transfer costs.
* **Exposing sensitive data:** Storing sensitive data in S3 without proper encryption can lead to security breaches.
* **Not cleaning up temporary files:** Failing to delete temporary files after uploading them to S3 leads to storage waste.
* **Overusing public read access:** Granting public read access to S3 buckets can expose sensitive data to unauthorized users.

#### 6.2. Edge Cases

* **Consistency:** S3 has provided strong read-after-write consistency since December 2020, but replicated or cached copies can still lag. Be aware of this and design your application accordingly.
* **Object size limits:** Be aware of the object size limits for S3.
* **Request rate limits:** Be aware of the request rate limits for S3.
* **Special characters in object keys:** Handle special characters in object keys correctly.

#### 6.3. Version-Specific Issues

* **SDK compatibility:** Ensure that your AWS SDK is compatible with the S3 API version.
* **API changes:** Be aware of any API changes that may affect your application.

#### 6.4. Compatibility Concerns

* **Browser compatibility:** Ensure that your application is compatible with different browsers if you are using S3 directly from the browser.
* **Serverless environments:** Be aware of any limitations when using S3 in serverless environments (e.g., Lambda).

#### 6.5. Debugging Strategies

* **Enable logging:** Enable logging to track S3 API calls and errors.
* **Use S3 monitoring tools:** Use S3 monitoring tools to monitor the performance and health of your S3 buckets.
* **Check S3 access logs:** Analyze S3 access logs to identify potential security issues.
* **Use AWS CloudTrail:** Use AWS CloudTrail to track API calls to S3.
* **Use AWS X-Ray:** Use AWS X-Ray to trace requests through your application and identify performance bottlenecks.

### 7. Tooling and Environment

#### 7.1. Recommended Development Tools

* **AWS CLI:** The AWS Command Line Interface (CLI) is a powerful tool for managing S3 resources.
* **AWS SDK:** The AWS SDK provides libraries for interacting with S3 from various programming languages.
* **S3 Browser:** A Windows client for managing S3 buckets and objects.
* **Cyberduck:** A cross-platform client for managing S3 buckets and objects.
* **Cloudberry Explorer:** A Windows client for managing S3 buckets and objects.

#### 7.2. Build Configuration

* **Use environment variables:** Store S3 configuration details (bucket name, region, credentials) in environment variables.
* **Use a build tool:** Use a build tool (e.g., Maven, Gradle, Webpack) to manage your project dependencies and build your application.

#### 7.3. Linting and Formatting

* **Use a linter:** Use a linter (e.g., ESLint, PyLint) to enforce code style and best practices.
* **Use a formatter:** Use a code formatter (e.g., Prettier, Black) to automatically format your code.

#### 7.4. Deployment

* **Use infrastructure as code:** Use infrastructure as code (e.g., CloudFormation, Terraform) to automate the deployment of your S3 resources.
* **Use a deployment pipeline:** Use a deployment pipeline to automate the deployment of your application.
* **Use blue/green deployments:** Use blue/green deployments to minimize downtime during deployments.

#### 7.5. CI/CD Integration

* **Integrate with CI/CD tools:** Integrate your S3 deployment process with CI/CD tools (e.g., Jenkins, CircleCI, Travis CI).
* **Automate testing:** Automate your unit tests, integration tests, and end-to-end tests as part of your CI/CD pipeline.
* **Automate deployments:** Automate the deployment of your application to S3 as part of your CI/CD pipeline.

By following these best practices and coding standards, you can ensure that your Amazon S3 integrations are secure, performant, maintainable, and cost-effective.
```