```
├── .github/
│   └── ISSUE_TEMPLATE/
│       └── bug_report.md
├── CODE_OF_CONDUCT.md
├── LICENSE
├── README.md
├── environment.yml
├── remwm.py
├── remwmgui.py
├── setup.sh
└── utils.py
```

## /.github/ISSUE_TEMPLATE/bug_report.md

---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
 - OS: [e.g. iOS]
 - Browser [e.g. chrome, safari]
 - Version [e.g. 22]

**Smartphone (please complete the following information):**
 - Device: [e.g. iPhone6]
 - OS: [e.g. iOS8.1]
 - Browser [e.g. stock browser, safari]
 - Version [e.g. 22]

**Additional context**
Add any other context about the problem here.

## /CODE_OF_CONDUCT.md

# Contributor Covenant Code of Conduct

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our community include:

* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall community

Examples of unacceptable behavior include:

* The use of sexualized language or imagery, and sexual attention or advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting

## Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.

Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.

## Scope

This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at https://github.com/D-Ogi/WatermarkRemover-AI/issues. All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the reporter of any incident.

## Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.

### 2. Warning

**Community Impact**: A violation through a single incident or series of actions.

**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.

### 3. Temporary Ban

**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior.

**Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.

### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.

**Consequence**: A permanent ban from any sort of public interaction within the community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/diversity).

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.

## /LICENSE

``` path="/LICENSE"
MIT License

Copyright (c) 2025 D-Ogi

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
```

## /README.md

# WatermarkRemover-AI

**AI-Powered Watermark Removal Tool using Florence-2 and LaMA Models**

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT) [![Python: 3.10+](https://img.shields.io/badge/Python-3.10%2B-blue.svg)](https://www.python.org/downloads/)

![image](https://github.com/user-attachments/assets/8f7fb600-695f-4dd7-958c-0cff516b5c7a)

Example of watermark removal with LaMa inpainting:

![image](https://github.com/user-attachments/assets/e89825fb-3b14-4358-96f3-feb526908ad3)
![image](https://github.com/user-attachments/assets/64e63d5c-4ecc-4fe0-954d-1b72e6a29580)

## Overview

`WatermarkRemover-AI` is an application that leverages AI models for precise watermark detection and seamless removal. It uses Microsoft's Florence-2 for watermark identification and LaMA for inpainting to fill in the removed regions naturally. The software offers both a command-line interface (CLI) and a PyQt6-based graphical user interface (GUI), making it accessible to both casual and advanced users.

## Table of Contents

- [Features](#features)
- [Technical Overview](#technical-overview)
- [Installation](#installation)
- [Usage](#usage)
  - [Preferred Way: Setup Script](#preferred-way-setup-script)
  - [Manual Way](#manual-way)
  - [Using the GUI](#using-the-gui)
  - [Using the CLI](#using-the-cli)
  - [Upgrade Notes](#upgrade-notes)
- [Alpha Masking](#alpha-masking)
- [Contributing](#contributing)
- [License](#license)

---

## Features

- **Dual Modes**: Process individual images or entire directories of images.
- **Advanced Watermark Detection**: Utilizes Florence-2's open-vocabulary detection for accurate watermark identification.
- **Seamless Inpainting**: Employs LaMA for high-quality, context-aware inpainting.
- **Customizable Output**:
  - Configure the maximum bounding box size for watermark detection.
  - Set transparency for watermark regions.
  - Force specific output formats (PNG, WEBP, JPG).
- **Progress Tracking**: Real-time progress updates in both GUI and CLI modes.
- **Dark Mode Support**: GUI automatically adapts to system dark mode settings.
- **Efficient Resource Management**: Optimized for GPU acceleration using CUDA (optional).

---

## Technical Overview

### Florence-2 for Watermark Detection

- Florence-2 detects watermarks using open-vocabulary object detection (a minimal sketch follows this overview).
- Bounding boxes are filtered so that only small regions (configurable by the user) are processed.

### LaMA for Inpainting

- The LaMA model seamlessly fills in watermark regions with context-aware content.
- Supports high-resolution inpainting via cropping and resizing strategies.

### PyQt6 GUI

- User-friendly interface for selecting input/output paths, configuring settings, and tracking progress.
- Dark mode and customization options enhance the user experience.
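To make the detection step concrete, here is a minimal sketch of prompting Florence-2 for watermark boxes. It mirrors the calls made in `remwm.py` further below (same checkpoint and `<OPEN_VOCABULARY_DETECTION>` task token); treat it as an illustration rather than a supported API of this project, and note that `example.jpg` is a hypothetical file name.

```py
# Sketch: open-vocabulary watermark detection with Florence-2,
# mirroring the identify() helper in remwm.py.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True
).to(device).eval()
processor = AutoProcessor.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True
)

image = Image.open("example.jpg").convert("RGB")  # hypothetical input file
prompt = "<OPEN_VOCABULARY_DETECTION>watermark"
inputs = {k: v.to(device) for k, v in
          processor(text=prompt, images=image, return_tensors="pt").items()}

generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3,
)
text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
parsed = processor.post_process_generation(
    text, task="<OPEN_VOCABULARY_DETECTION>", image_size=image.size
)
print(parsed)  # includes "bboxes" for the detected watermark regions
```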
---

## Installation

### Prerequisites

- Conda/Miniconda installed.
- CUDA (optional, for GPU acceleration; the application runs well on CPUs too).

### Steps

1. **Clone the Repository:**

   ```bash
   git clone https://github.com/D-Ogi/WatermarkRemover-AI.git
   cd WatermarkRemover-AI
   ```

2. **Run the Setup Script:**

   ```bash
   bash setup.sh
   ```

   The `setup.sh` script sets up the environment, installs dependencies, and launches the GUI application. It also provides convenient options for CLI usage.

3. **Download the LaMA Model:**

   ```bash
   conda activate py312aiwatermark
   iopaint download --model lama
   ```

   The LaMA inpainting model files aren't included in the repository and must be downloaded separately (approximately 196 MB).

4. **Fast-Track Options:**

   - **To use the CLI immediately**: After running `setup.sh`, you can use the CLI directly without activating the environment manually:

     ```bash
     ./setup.sh input_path output_path [options]
     ```

     Example:

     ```bash
     ./setup.sh ./input_images ./output_images --overwrite --transparent
     ```

   - **To activate the environment without starting the application**, use:

     ```bash
     conda activate py312aiwatermark
     ```

---

## Usage

### Preferred Way: Setup Script

1. **Run the Setup Script**:

   ```bash
   bash setup.sh
   ```

   - The GUI will launch automatically, and the environment will be ready for immediate CLI or GUI use.
   - For CLI use, run:

     ```bash
     ./setup.sh input_path output_path [options]
     ```

     Example:

     ```bash
     ./setup.sh ./input_images ./output_images --overwrite --transparent
     ```

### Manual Way

1. **Activate the Environment**:

   ```bash
   conda activate py312aiwatermark
   ```

2. **Launch GUI or CLI**:
   - **GUI**:

     ```bash
     python remwmgui.py
     ```

   - **CLI**:

     ```bash
     python remwm.py input_path output_path [options]
     ```

### Using the GUI

1. **Launch the GUI**: If not launched automatically, start it with:

   ```bash
   python remwmgui.py
   ```

2. **Configure Settings**:
   - **Mode**: Select "Process Single Image" or "Process Directory".
   - **Paths**: Browse and set the input/output directories.
   - **Options**:
     - Enable overwriting of existing files (directory processing only; single-image processing always overwrites).
     - Enable transparency for watermark regions.
     - Adjust the maximum bounding box size for watermark detection.
   - **Output Format**: Choose between PNG, WEBP, JPG, or retain the original format.

3. **Start Processing**:
   - Click "Start" to begin processing.
   - Monitor progress and logs in the GUI.

### Using the CLI

1. **Basic Command**:

   ```bash
   python remwm.py input_path output_path
   ```

2. **Options**:
   - `--overwrite`: Overwrite existing files.
   - `--transparent`: Make watermark regions transparent instead of removing them.
   - `--max-bbox-percent`: Set the maximum bounding box size for watermark detection (default: 10%).
   - `--force-format`: Force the output format (PNG, WEBP, or JPG).

3. **Example**:

   ```bash
   python remwm.py ./input_images ./output_images --overwrite --max-bbox-percent=15 --force-format=PNG
   ```

---

### Upgrade Notes

If you have previously used an older version of the repository or set up an incorrect Conda environment, follow these steps to upgrade:

1. **Update the Repository**:

   ```bash
   git pull
   ```

2. **Remove the Old Environment**:

   ```bash
   conda deactivate
   conda env remove -n py312
   ```

3. **Run the Setup Script**:

   ```bash
   bash setup.sh
   ```

   This recreates the correct environment (`py312aiwatermark`) and ensures all dependencies are up to date.

---

## Alpha Masking

We implemented alpha masking to allow selective manipulation of watermark regions without altering other parts of the image.

### Why Alpha Masking?

- **Precision**: Enables box-targeted watermark removal by isolating specific regions.
- **Flexibility**: By controlling opacity in alpha layers, we can achieve a range of effects, from complete removal to transparency.
- **Minimal Impact**: This method ensures that areas outside the watermark remain untouched, preserving image quality. A short sketch of the idea follows this list.
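As an illustration only (not the exact code path in `remwm.py`, which sets pixels in a per-pixel loop), here is a minimal vectorized sketch of alpha masking with Pillow and NumPy; the file names are hypothetical:

```py
# Alpha-masking sketch: pixels where the mask is nonzero become fully
# transparent; all other pixels keep their original color and alpha.
import numpy as np
from PIL import Image

image = Image.open("input.png").convert("RGBA")  # hypothetical input
mask = Image.open("mask.png").convert("L")       # white = watermark region

rgba = np.array(image)
mask_arr = np.array(mask)
# Zero the alpha channel wherever the mask marks a watermark.
rgba[..., 3] = np.where(mask_arr > 0, 0, rgba[..., 3]).astype(np.uint8)

Image.fromarray(rgba, mode="RGBA").save("output.png")
```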
---

## Contributing

Contributions are welcome! To contribute:

1. Fork the repository.
2. Create a new branch for your feature.
3. Submit a pull request detailing your changes.

---

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## /environment.yml

```yml path="/environment.yml"
name: py312aiwatermark
channels:
  - nvidia
  - conda-forge
  - defaults
dependencies:
  - python=3.12
  - pip
  - numpy
  - tqdm
  - loguru
  - click
  - pillow
  - opencv
  - pip:
    - --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu126
```
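The `pip:` entry above pulls a nightly CUDA 12.6 PyTorch build. After the environment is created, a quick sanity check (standard PyTorch calls only, nothing project-specific) confirms whether the GPU is actually visible; the tool falls back to CPU otherwise:

```py
# Environment sanity check: report the installed PyTorch build and
# whether CUDA is available to it.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```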
## /remwm.py

```py path="/remwm.py"
import sys
import click
from pathlib import Path
import cv2
import numpy as np
from PIL import Image, ImageDraw
from transformers import AutoProcessor, AutoModelForCausalLM
from iopaint.model_manager import ModelManager
from iopaint.schema import HDStrategy, LDMSampler, InpaintRequest as Config
import torch
from torch.nn import Module
import tqdm
from loguru import logger
from enum import Enum

try:
    from cv2.typing import MatLike
except ImportError:
    MatLike = np.ndarray

class TaskType(str, Enum):
    OPEN_VOCAB_DETECTION = "<OPEN_VOCABULARY_DETECTION>"
    """Detect bounding box for objects and OCR text"""

def identify(task_prompt: TaskType, image: MatLike, text_input: str, model: AutoModelForCausalLM, processor: AutoProcessor, device: str):
    if not isinstance(task_prompt, TaskType):
        raise ValueError(f"task_prompt must be a TaskType, but {task_prompt} is of type {type(task_prompt)}")
    prompt = task_prompt.value if text_input is None else task_prompt.value + text_input
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    inputs = {k: v.to(device) for k, v in inputs.items()}
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        early_stopping=False,
        do_sample=False,
        num_beams=3,
    )
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return processor.post_process_generation(
        generated_text, task=task_prompt.value, image_size=(image.width, image.height)
    )

def get_watermark_mask(image: MatLike, model: AutoModelForCausalLM, processor: AutoProcessor, device: str, max_bbox_percent: float):
    text_input = "watermark"
    task_prompt = TaskType.OPEN_VOCAB_DETECTION
    parsed_answer = identify(task_prompt, image, text_input, model, processor, device)

    mask = Image.new("L", image.size, 0)
    draw = ImageDraw.Draw(mask)

    detection_key = "<OPEN_VOCABULARY_DETECTION>"
    if detection_key in parsed_answer and "bboxes" in parsed_answer[detection_key]:
        image_area = image.width * image.height
        for bbox in parsed_answer[detection_key]["bboxes"]:
            x1, y1, x2, y2 = map(int, bbox)
            bbox_area = (x2 - x1) * (y2 - y1)
            if (bbox_area / image_area) * 100 <= max_bbox_percent:
                draw.rectangle([x1, y1, x2, y2], fill=255)
            else:
                logger.warning(f"Skipping large bounding box: {bbox} covering {bbox_area / image_area:.2%} of the image")

    return mask

def process_image_with_lama(image: MatLike, mask: MatLike, model_manager: ModelManager):
    config = Config(
        ldm_steps=50,
        ldm_sampler=LDMSampler.ddim,
        hd_strategy=HDStrategy.CROP,
        hd_strategy_crop_margin=64,
        hd_strategy_crop_trigger_size=800,
        hd_strategy_resize_limit=1600,
    )
    result = model_manager(image, mask, config)

    if result.dtype in [np.float64, np.float32]:
        result = np.clip(result, 0, 255).astype(np.uint8)

    return result

def make_region_transparent(image: Image.Image, mask: Image.Image):
    image = image.convert("RGBA")
    mask = mask.convert("L")
    transparent_image = Image.new("RGBA", image.size)
    for x in range(image.width):
        for y in range(image.height):
            if mask.getpixel((x, y)) > 0:
                transparent_image.putpixel((x, y), (0, 0, 0, 0))
            else:
                transparent_image.putpixel((x, y), image.getpixel((x, y)))
    return transparent_image

@click.command()
@click.argument("input_path", type=click.Path(exists=True))
@click.argument("output_path", type=click.Path())
@click.option("--overwrite", is_flag=True, help="Overwrite existing files in bulk mode.")
@click.option("--transparent", is_flag=True, help="Make watermark regions transparent instead of removing.")
@click.option("--max-bbox-percent", default=10.0, help="Maximum percentage of the image that a bounding box can cover.")
@click.option("--force-format", type=click.Choice(["PNG", "WEBP", "JPG"], case_sensitive=False), default=None, help="Force output format. Defaults to input format.")
def main(input_path: str, output_path: str, overwrite: bool, transparent: bool, max_bbox_percent: float, force_format: str):
    input_path = Path(input_path)
    output_path = Path(output_path)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}")

    florence_model = AutoModelForCausalLM.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True).to(device).eval()
    florence_processor = AutoProcessor.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True)
    logger.info("Florence-2 Model loaded")

    if not transparent:
        model_manager = ModelManager(name="lama", device=device)
        logger.info("LaMa model loaded")

    def handle_one(image_path: Path, output_path: Path):
        if output_path.exists() and not overwrite:
            logger.info(f"Skipping existing file: {output_path}")
            return

        image = Image.open(image_path).convert("RGB")
        mask_image = get_watermark_mask(image, florence_model, florence_processor, device, max_bbox_percent)

        if transparent:
            result_image = make_region_transparent(image, mask_image)
        else:
            lama_result = process_image_with_lama(np.array(image), np.array(mask_image), model_manager)
            result_image = Image.fromarray(cv2.cvtColor(lama_result, cv2.COLOR_BGR2RGB))

        # Determine output format
        if force_format:
            output_format = force_format.upper()
        elif transparent:
            output_format = "PNG"
        else:
            output_format = image_path.suffix[1:].upper()
            if output_format not in ["PNG", "WEBP", "JPG"]:
                output_format = "PNG"

        # Map JPG to JPEG for PIL compatibility
        if output_format == "JPG":
            output_format = "JPEG"
        if transparent and output_format == "JPG":
            logger.warning("Transparency detected. Defaulting to PNG for transparency support.")
            output_format = "PNG"

        new_output_path = output_path.with_suffix(f".{output_format.lower()}")
        result_image.save(new_output_path, format=output_format)
        logger.info(f"input_path:{image_path}, output_path:{new_output_path}")

    if input_path.is_dir():
        if not output_path.exists():
            output_path.mkdir(parents=True)

        images = list(input_path.glob("*.[jp][pn]g")) + list(input_path.glob("*.webp"))
        total_images = len(images)

        for idx, image_path in enumerate(tqdm.tqdm(images, desc="Processing images")):
            output_file = output_path / image_path.name
            handle_one(image_path, output_file)
            progress = int((idx + 1) / total_images * 100)
            print(f"input_path:{image_path}, output_path:{output_file}, overall_progress:{progress}")
    else:
        output_file = output_path.with_suffix(".webp" if transparent else output_path.suffix)
        handle_one(input_path, output_file)
        print(f"input_path:{input_path}, output_path:{output_file}, overall_progress:100")

if __name__ == "__main__":
    main()
```
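For readers who want to reuse these functions from Python rather than through the CLI, a minimal, hedged sketch follows. Model loading mirrors `main()` above; `photo.jpg` and `photo_clean.png` are hypothetical file names.

```py
# Sketch: programmatic use of remwm.py's helpers on a single image.
import numpy as np
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor
from iopaint.model_manager import ModelManager

from remwm import get_watermark_mask, process_image_with_lama

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True
).to(device).eval()
processor = AutoProcessor.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True
)
lama = ModelManager(name="lama", device=device)

image = Image.open("photo.jpg").convert("RGB")
mask = get_watermark_mask(image, model, processor, device, max_bbox_percent=10.0)
result = process_image_with_lama(np.array(image), np.array(mask), lama)
# The LaMa result is BGR; flip channels to RGB as main() does via cv2.
Image.fromarray(result[..., ::-1]).save("photo_clean.png")
```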
## /remwmgui.py

```py path="/remwmgui.py"
import os
import sys
import subprocess
import psutil
import yaml
import torch
from pathlib import Path
from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QFileDialog, QLabel, QLineEdit, QPushButton,
    QVBoxLayout, QHBoxLayout, QWidget, QTextEdit, QProgressBar, QComboBox,
    QMessageBox, QRadioButton, QButtonGroup, QSlider, QCheckBox, QStatusBar
)
from PyQt6.QtCore import Qt, pyqtSignal, QObject, QThread, QTimer
from PyQt6.QtGui import QPalette, QColor
from loguru import logger

CONFIG_FILE = "ui.yml"

class Worker(QObject):
    log_signal = pyqtSignal(str)
    progress_signal = pyqtSignal(int)
    finished_signal = pyqtSignal()

    def __init__(self, process):
        super().__init__()
        self.process = process

    def run(self):
        try:
            for line in iter(self.process.stdout.readline, ""):
                self.log_signal.emit(line)
                if "overall_progress:" in line:
                    progress = int(line.strip().split("overall_progress:")[1].strip())
                    self.progress_signal.emit(progress)
            self.process.stdout.close()
        finally:
            self.finished_signal.emit()

class WatermarkRemoverGUI(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("Watermark Remover GUI")
        self.setGeometry(100, 100, 800, 600)

        # Initialize UI elements
        self.radio_single = QRadioButton("Process Single Image")
        self.radio_batch = QRadioButton("Process Directory")
        self.radio_single.setChecked(True)
        self.mode_group = QButtonGroup()
        self.mode_group.addButton(self.radio_single)
        self.mode_group.addButton(self.radio_batch)

        self.input_path = QLineEdit(self)
        self.output_path = QLineEdit(self)

        self.overwrite_checkbox = QCheckBox("Overwrite Existing Files", self)
        self.transparent_checkbox = QCheckBox("Make Watermark Transparent", self)

        self.max_bbox_percent_slider = QSlider(Qt.Orientation.Horizontal, self)
        self.max_bbox_percent_slider.setRange(1, 100)
        self.max_bbox_percent_slider.setValue(10)
        self.max_bbox_percent_label = QLabel(f"Max BBox Percent: 10%", self)
        self.max_bbox_percent_slider.valueChanged.connect(self.update_bbox_label)

        self.force_format_png = QRadioButton("PNG")
        self.force_format_webp = QRadioButton("WEBP")
        self.force_format_jpg = QRadioButton("JPG")
        self.force_format_none = QRadioButton("None")
        self.force_format_none.setChecked(True)
        self.force_format_group = QButtonGroup()
        self.force_format_group.addButton(self.force_format_png)
        self.force_format_group.addButton(self.force_format_webp)
        self.force_format_group.addButton(self.force_format_jpg)
        self.force_format_group.addButton(self.force_format_none)

        self.progress_bar = QProgressBar(self)
        self.logs = QTextEdit(self)
        self.logs.setReadOnly(True)
        self.logs.setVisible(False)

        self.start_button = QPushButton("Start", self)
        self.stop_button = QPushButton("Stop", self)
        self.toggle_logs_button = QPushButton("Show Logs", self)
        self.toggle_logs_button.setCheckable(True)
        self.stop_button.setDisabled(True)

        # Status bar for system info
        self.status_bar = QStatusBar()
        self.setStatusBar(self.status_bar)
        self.timer = QTimer()
        self.timer.timeout.connect(self.update_system_info)
        self.timer.start(1000)  # Update every second

        self.process = None
        self.thread = None

        # Layout
        layout = QVBoxLayout()

        # Mode selection
        mode_layout = QHBoxLayout()
        mode_layout.addWidget(self.radio_single)
        mode_layout.addWidget(self.radio_batch)

        # Input and output paths
        path_layout = QVBoxLayout()
        path_layout.addWidget(QLabel("Input Path:"))
        path_layout.addWidget(self.input_path)
        path_layout.addWidget(QPushButton("Browse", clicked=self.browse_input))
        path_layout.addWidget(QLabel("Output Path:"))
        path_layout.addWidget(self.output_path)
        path_layout.addWidget(QPushButton("Browse", clicked=self.browse_output))

        # Options
        options_layout = QVBoxLayout()
        options_layout.addWidget(self.overwrite_checkbox)
        options_layout.addWidget(self.transparent_checkbox)

        bbox_layout = QVBoxLayout()
        bbox_layout.addWidget(self.max_bbox_percent_label)
        bbox_layout.addWidget(self.max_bbox_percent_slider)
        options_layout.addLayout(bbox_layout)

        force_format_layout = QHBoxLayout()
        force_format_layout.addWidget(QLabel("Force Format:"))
        force_format_layout.addWidget(self.force_format_png)
        force_format_layout.addWidget(self.force_format_webp)
        force_format_layout.addWidget(self.force_format_jpg)
        force_format_layout.addWidget(self.force_format_none)
        options_layout.addLayout(force_format_layout)

        # Logs and progress
        progress_layout = QVBoxLayout()
        progress_layout.addWidget(QLabel("Progress:"))
        progress_layout.addWidget(self.progress_bar)
        progress_layout.addWidget(self.toggle_logs_button)
        progress_layout.addWidget(self.logs)

        # Buttons
        button_layout = QHBoxLayout()
        button_layout.addWidget(self.start_button)
        button_layout.addWidget(self.stop_button)

        # Final assembly
        layout.addLayout(mode_layout)
        layout.addLayout(path_layout)
        layout.addLayout(options_layout)
        layout.addLayout(progress_layout)
        layout.addLayout(button_layout)

        container = QWidget()
        container.setLayout(layout)
        self.setCentralWidget(container)

        # Connect buttons
        self.start_button.clicked.connect(self.start_processing)
        self.stop_button.clicked.connect(self.stop_processing)
        self.toggle_logs_button.toggled.connect(self.toggle_logs)

        self.apply_dark_mode_if_needed()

        # Load configuration
        self.load_config()

    def update_bbox_label(self, value):
        self.max_bbox_percent_label.setText(f"Max BBox Percent: {value}%")

    def toggle_logs(self, checked):
        self.logs.setVisible(checked)
        self.toggle_logs_button.setText("Hide Logs" if checked else "Show Logs")

    def apply_dark_mode_if_needed(self):
        if QApplication.instance().styleHints().colorScheme() == Qt.ColorScheme.Dark:
            dark_palette = QPalette()
            dark_palette.setColor(QPalette.ColorRole.Window, QColor(53, 53, 53))
            dark_palette.setColor(QPalette.ColorRole.WindowText, QColor(255, 255, 255))
            dark_palette.setColor(QPalette.ColorRole.Base, QColor(25, 25, 25))
            dark_palette.setColor(QPalette.ColorRole.AlternateBase, QColor(53, 53, 53))
            dark_palette.setColor(QPalette.ColorRole.ToolTipBase, QColor(255, 255, 255))
            dark_palette.setColor(QPalette.ColorRole.ToolTipText, QColor(255, 255, 255))
            dark_palette.setColor(QPalette.ColorRole.Text, QColor(255, 255, 255))
            dark_palette.setColor(QPalette.ColorRole.Button, QColor(53, 53, 53))
            dark_palette.setColor(QPalette.ColorRole.ButtonText, QColor(255, 255, 255))
            dark_palette.setColor(QPalette.ColorRole.BrightText, QColor(255, 0, 0))
            dark_palette.setColor(QPalette.ColorRole.Link, QColor(42, 130, 218))
            dark_palette.setColor(QPalette.ColorRole.Highlight, QColor(42, 130, 218))
            dark_palette.setColor(QPalette.ColorRole.HighlightedText, QColor(0, 0, 0))
            QApplication.instance().setPalette(dark_palette)

    def update_system_info(self):
        cuda_available = "CUDA: Available" if torch.cuda.is_available() else "CUDA: Not Available"
        ram = psutil.virtual_memory()
        ram_usage = ram.used // (1024 ** 2)
        ram_total = ram.total // (1024 ** 2)
        ram_percentage = ram.percent

        vram_status = "Not Available"
        if torch.cuda.is_available():
            gpu_info = torch.cuda.get_device_properties(0)
            vram_total = gpu_info.total_memory // (1024 ** 2)
            vram_used = vram_total - (torch.cuda.memory_reserved(0) // (1024 ** 2))
            vram_percentage = (vram_used / vram_total) * 100
            vram_status = f"VRAM: {vram_used} MB / {vram_total} MB ({vram_percentage:.2f}%)"

        status_text = (
            f"{cuda_available} | RAM: {ram_usage} MB / {ram_total} MB ({ram_percentage}%) | {vram_status} | CPU Load: {psutil.cpu_percent()}%"
        )
        self.status_bar.showMessage(status_text)

    def browse_input(self):
        if self.radio_single.isChecked():
            path, _ = QFileDialog.getOpenFileName(self, "Select Input Image", "", "Images (*.png *.jpg *.jpeg *.webp)")
        else:
            path = QFileDialog.getExistingDirectory(self, "Select Input Directory")
        if path:
            self.input_path.setText(path)

    def browse_output(self):
        path = QFileDialog.getExistingDirectory(self, "Select Output Directory")
        if path:
            self.output_path.setText(path)

    def start_processing(self):
        input_path = self.input_path.text()
        output_path = self.output_path.text()

        if not input_path or not output_path:
            QMessageBox.critical(self, "Error", "Input and Output paths are required.")
            return

        overwrite = "--overwrite" if self.overwrite_checkbox.isChecked() else ""
        transparent = "--transparent" if self.transparent_checkbox.isChecked() else ""
        max_bbox_percent = self.max_bbox_percent_slider.value()

        force_format = "None"
        if self.force_format_png.isChecked():
            force_format = "PNG"
        elif self.force_format_webp.isChecked():
            force_format = "WEBP"
        elif self.force_format_jpg.isChecked():
            force_format = "JPG"

        force_format_option = f"--force-format={force_format}" if force_format != "None" else ""

        command = [
            "python", "remwm.py", input_path, output_path, overwrite, transparent,
            f"--max-bbox-percent={max_bbox_percent}", force_format_option
        ]
        command = [arg for arg in command if arg]  # Remove empty strings

        self.process = subprocess.Popen(
            command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True,
            env={**os.environ, "PYTHONUNBUFFERED": "1"}
        )

        self.worker = Worker(self.process)
        self.worker.log_signal.connect(self.update_logs)
        self.worker.progress_signal.connect(self.update_progress_bar)
        self.worker.finished_signal.connect(self.reset_ui)

        self.thread = QThread()
        self.worker.moveToThread(self.thread)
        self.thread.started.connect(self.worker.run)
        self.thread.start()

        self.stop_button.setDisabled(False)
        self.start_button.setDisabled(True)

    def update_logs(self, line):
        self.logs.append(line)

    def update_progress_bar(self, progress):
        self.progress_bar.setValue(progress)

    def stop_processing(self):
        if self.process:
            self.process.terminate()
            self.process.wait()
            self.reset_ui()

    def reset_ui(self):
        self.stop_button.setDisabled(True)
        self.start_button.setDisabled(False)
        if self.thread and self.thread.isRunning():
            self.thread.quit()
            self.thread.wait()
        self.process = None
        self.thread = None

    def save_config(self):
        config = {
            "input_path": self.input_path.text(),
            "output_path": self.output_path.text(),
            "overwrite": self.overwrite_checkbox.isChecked(),
            "transparent": self.transparent_checkbox.isChecked(),
            "max_bbox_percent": self.max_bbox_percent_slider.value(),
            "force_format": "PNG" if self.force_format_png.isChecked() else
                            "WEBP" if self.force_format_webp.isChecked() else
                            "JPG" if self.force_format_jpg.isChecked() else "None",
            "mode": "single" if self.radio_single.isChecked() else "batch"
        }
        with open(CONFIG_FILE, "w") as f:
            yaml.dump(config, f)

    def load_config(self):
        if os.path.exists(CONFIG_FILE):
            with open(CONFIG_FILE, "r") as f:
                config = yaml.safe_load(f)
            self.input_path.setText(config.get("input_path", ""))
            self.output_path.setText(config.get("output_path", ""))
            self.overwrite_checkbox.setChecked(config.get("overwrite", False))
            self.transparent_checkbox.setChecked(config.get("transparent", False))
            self.max_bbox_percent_slider.setValue(config.get("max_bbox_percent", 10))
            force_format = config.get("force_format", "None")
            if force_format == "PNG":
                self.force_format_png.setChecked(True)
            elif force_format == "WEBP":
                self.force_format_webp.setChecked(True)
            elif force_format == "JPG":
                self.force_format_jpg.setChecked(True)
            else:
                self.force_format_none.setChecked(True)
            mode = config.get("mode", "single")
            if mode == "single":
                self.radio_single.setChecked(True)
            else:
                self.radio_batch.setChecked(True)

    def closeEvent(self, event):
        self.save_config()
        event.accept()

if __name__ == "__main__":
    app = QApplication(sys.argv)
    gui = WatermarkRemoverGUI()
    gui.show()
    sys.exit(app.exec())
```

## /setup.sh

```sh path="/setup.sh"
#!/usr/bin/env bash

# Exit immediately if a command exits with a non-zero status
set -e

# Default values
ENV_NAME="py312aiwatermark"
ENV_FILE="environment.yml"
INSTALL_DIR=""
FORCE_REINSTALL=false
USE_DEFAULT_DIR=true

# Function to detect previously used directory
find_existing_install_dir() {
    if [ -d "$PWD/conda_envs" ] && conda env list | grep -q "^${ENV_NAME}\\s"; then
        echo "$PWD/conda_envs"
    else
        echo ""
    fi
}

# Function to display usage
usage() {
    echo "Usage: $0 [options] -- [script arguments]"
    echo "Options:"
    echo "  --activate      Activate the Conda environment and provide instructions to deactivate it."
    echo "  --reinstall     Reinstall the Conda environment."
    echo "  --current-dir   Use the current directory as the root for the Conda environment."
    echo "  --help          Show this help message."
    echo "Script Arguments:"
    echo "  Any arguments after -- will be passed directly to remwm.py."
    exit 1
}

# Parse arguments
ACTIVATE_ONLY=false
SCRIPT_ARGS=()
while [[ $# -gt 0 ]]; do
    case "$1" in
        --activate)
            ACTIVATE_ONLY=true
            shift
            ;;
        --reinstall)
            FORCE_REINSTALL=true
            shift
            ;;
        --current-dir)
            USE_DEFAULT_DIR=false
            INSTALL_DIR="$PWD/conda_envs"
            shift
            ;;
        --help)
            usage
            ;;
        --)
            shift
            SCRIPT_ARGS=("$@")
            break
            ;;
        *)
            echo "Unknown option: $1"
            usage
            ;;
    esac
done

# Check if Conda is installed
if ! command -v conda &> /dev/null; then
    echo "Conda could not be found. Please install Conda or Miniconda and try again."
    exit 1
fi

# Determine installation directory
if [ "$USE_DEFAULT_DIR" = false ]; then
    echo "Using current directory as the root for Conda environments: $INSTALL_DIR"
    export CONDA_ENVS_PATH="$INSTALL_DIR"
    export CONDA_PKGS_DIRS="$INSTALL_DIR/pkgs"
else
    EXISTING_DIR=$(find_existing_install_dir)
    if [ -n "$EXISTING_DIR" ]; then
        echo "Detected existing installation in: $EXISTING_DIR"
        export CONDA_ENVS_PATH="$EXISTING_DIR"
        export CONDA_PKGS_DIRS="$EXISTING_DIR/pkgs"
    else
        echo "Using default Conda directory."
    fi
fi

# Check if the environment already exists or needs to be reinstalled
if conda env list | grep -q "^${ENV_NAME}\\s"; then
    if [ "$FORCE_REINSTALL" = true ]; then
        echo "Reinstalling environment '${ENV_NAME}'..."
        conda env remove -n "${ENV_NAME}"
    else
        echo "Environment '${ENV_NAME}' already exists. Activating it..."
        eval "$(conda shell.bash hook)"
        conda activate "${ENV_NAME}"
    fi
fi

if ! conda env list | grep -q "^${ENV_NAME}\\s"; then
    # Create the Conda environment
    echo "Creating Conda environment '${ENV_NAME}' from '${ENV_FILE}'..."
    conda env create -f "${ENV_FILE}" || {
        echo "Failed to create the Conda environment."
        exit 1
    }
    eval "$(conda shell.bash hook)"
    conda activate "${ENV_NAME}"
fi

if [ "$ACTIVATE_ONLY" = true ]; then
    echo "Environment '${ENV_NAME}' activated. To deactivate, run 'conda deactivate'."
    exit 0
fi

# Ensure required dependencies are installed
pip list | grep -q PyQt6 || pip install PyQt6
pip list | grep -q transformers || pip install transformers
pip list | grep -q iopaint || pip install iopaint
pip list | grep -q opencv-python-headless || pip install opencv-python-headless

# Run remwm.py with passed arguments
python remwm.py "${SCRIPT_ARGS[@]}"
```
## /utils.py

```py path="/utils.py"
from enum import Enum
import random
import matplotlib.patches as patches
import numpy as np
from PIL import ImageDraw

# Constants
colormap = ['blue', 'orange', 'green', 'purple', 'brown', 'pink', 'gray', 'olive', 'cyan', 'red',
            'lime', 'indigo', 'violet', 'aqua', 'magenta', 'coral', 'gold', 'tan', 'skyblue']

# To be set
model = None
processor = None

def set_model_info(model_, processor_):
    global model, processor
    model = model_
    processor = processor_

class TaskType(str, Enum):
    """The types of tasks supported"""
    CAPTION = '<CAPTION>'
    DETAILED_CAPTION = '<DETAILED_CAPTION>'
    MORE_DETAILED_CAPTION = '<MORE_DETAILED_CAPTION>'

def run_example(task_prompt: TaskType, image, text_input=None):
    """Runs an inference task using the model."""
    if not isinstance(task_prompt, TaskType):
        raise ValueError(f"task_prompt must be a TaskType, but {task_prompt} is of type {type(task_prompt)}")
    prompt = task_prompt.value if text_input is None else task_prompt.value + text_input
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"].cuda(),
        pixel_values=inputs["pixel_values"].cuda(),
        max_new_tokens=1024,
        early_stopping=False,
        do_sample=False,
        num_beams=3,
    )
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    parsed_answer = processor.post_process_generation(
        generated_text, task=task_prompt.value, image_size=(image.width, image.height)
    )
    return parsed_answer

def draw_polygons(image, prediction, fill_mask=False):
    """Draws segmentation masks with polygons on an image."""
    draw = ImageDraw.Draw(image)
    for polygons, label in zip(prediction['polygons'], prediction['labels']):
        color = random.choice(colormap)
        fill_color = random.choice(colormap) if fill_mask else None
        for polygon in polygons:
            polygon = np.array(polygon).reshape(-1, 2)
            if len(polygon) < 3:
                print('Invalid polygon:', polygon)
                continue
            polygon = (polygon * 1).reshape(-1).tolist()  # No scaling
            draw.polygon(polygon, outline=color, fill=fill_color)
            draw.text((polygon[0] + 8, polygon[1] + 2), label, fill=color)
    return image

def draw_ocr_bboxes(image, prediction):
    """Draws OCR bounding boxes on an image."""
    draw = ImageDraw.Draw(image)
    bboxes, labels = prediction['quad_boxes'], prediction['labels']
    for box, label in zip(bboxes, labels):
        color = random.choice(colormap)
        new_box = (np.array(box) * 1).tolist()  # No scaling
        draw.polygon(new_box, width=3, outline=color)
        draw.text((new_box[0] + 8, new_box[1] + 2), "{}".format(label), align="right", fill=color)
    return image

def convert_bbox_to_relative(box, image):
    """Converts bounding box pixel coordinates to relative coordinates in the range 0-999."""
    return [
        (box[0] / image.width) * 999,
        (box[1] / image.height) * 999,
        (box[2] / image.width) * 999,
        (box[3] / image.height) * 999,
    ]

def convert_relative_to_bbox(relative, image):
    """Converts list of relative coordinates to pixel coordinates."""
    return [
        (relative[0] / 999) * image.width,
        (relative[1] / 999) * image.height,
        (relative[2] / 999) * image.width,
        (relative[3] / 999) * image.height,
    ]

def convert_bbox_to_loc(box, image):
    """Converts bounding box pixel coordinates to position tokens."""
    relative_coordinates = convert_bbox_to_relative(box, image)
    return ''.join([f'<loc_{int(i)}>' for i in relative_coordinates])
```
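As a closing illustration, here is a small, hedged usage sketch of the coordinate helpers above (assuming `utils.py`'s own imports, including matplotlib, are installed; the 800x600 size and box values are arbitrary examples). Converting a pixel bounding box to Florence-2's 0-999 relative scale and back should round-trip up to integer truncation in the `<loc_...>` tokens:

```py
# Round-trip sketch for utils.py's coordinate helpers.
from PIL import Image
from utils import convert_bbox_to_relative, convert_relative_to_bbox

image = Image.new("RGB", (800, 600))  # arbitrary example size
box = [80, 60, 400, 300]              # x1, y1, x2, y2 in pixels

relative = convert_bbox_to_relative(box, image)   # scaled to the 0-999 range
restored = convert_relative_to_bbox(relative, image)

print(relative)  # [99.9, 99.9, 499.5, 499.5]
print(restored)  # [80.0, 60.0, 400.0, 300.0] -- round-trips exactly here
```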