Byaidu/PDFMathTranslate/main 88k tokens
```
├── .dockerignore (700 tokens)
├── .github/
   ├── FUNDING.yml (200 tokens)
   ├── ISSUE_TEMPLATE/
      ├── blank.yaml (100 tokens)
      ├── config.yml
      ├── 功能请求_cn.yaml (200 tokens)
      ├── 功能请求_en.yaml (200 tokens)
      ├── 问题反馈_cn.yaml (400 tokens)
      ├── 问题反馈_en.yaml (600 tokens)
   ├── dependabot.yml (100 tokens)
   ├── release-drafter.yml (300 tokens)
   ├── workflows/
      ├── black.format.yml (100 tokens)
      ├── exe-build.yml (1300 tokens)
      ├── fork-build.yml (2.2k tokens)
      ├── fork-test.yml (200 tokens)
      ├── python-publish.yml (3.7k tokens)
      ├── python-test.yml (600 tokens)
├── .gitignore (700 tokens)
├── .gitmodules
├── .pre-commit-config.yaml (100 tokens)
├── Dockerfile (200 tokens)
├── LICENSE (omitted)
├── README.md (3.9k tokens)
├── app.json
├── docker-compose.yml (300 tokens)
├── docs/
   ├── ADVANCED.md (3.5k tokens)
   ├── APIS.md (500 tokens)
   ├── CODE_OF_CONDUCT.md (1000 tokens)
   ├── PROXY_CONFIGURATION.md (1000 tokens)
   ├── README_GUI.md (200 tokens)
   ├── README_ja-JP.md (3.2k tokens)
   ├── README_ko-KR.md (4.6k tokens)
   ├── README_zh-CN.md (2.7k tokens)
   ├── README_zh-TW.md (3k tokens)
   ├── images/
      ├── after.png
      ├── banner.png
      ├── before.png
      ├── cmd.explained.png
      ├── cmd.explained.zh.png
      ├── gui.gif
      ├── preview.gif
├── pdf2zh/
   ├── __init__.py (100 tokens)
   ├── backend.py (600 tokens)
   ├── cache.py (900 tokens)
   ├── config.py (1400 tokens)
   ├── converter.py (4.6k tokens)
   ├── converter_docx.py (300 tokens)
   ├── doclayout.py (1400 tokens)
   ├── gui.py (5.8k tokens)
   ├── high_level.py (2.8k tokens)
   ├── kernel/
      ├── __init__.py (100 tokens)
      ├── legacy.py (500 tokens)
      ├── precise.py (1600 tokens)
      ├── protocol.py (300 tokens)
      ├── registry.py (400 tokens)
      ├── v2_bridge.py (1200 tokens)
      ├── v2_worker.py (900 tokens)
   ├── mcp_server.py (800 tokens)
   ├── pdf2zh.py (3k tokens)
   ├── pdfinterp.py (2.7k tokens)
   ├── translator.py (8.1k tokens)
├── pyproject.toml (400 tokens)
├── script/
   ├── Dockerfile.China (100 tokens)
   ├── Dockerfile.Demo (200 tokens)
   ├── _pystand_static.int (200 tokens)
   ├── build-win64.ps1 (600 tokens)
   ├── setup.bat (200 tokens)
├── setup.cfg
├── test/
   ├── file/
      ├── translate.cli.font.unknown.pdf
      ├── translate.cli.plain.text.pdf
      ├── translate.cli.text.with.figure.pdf
   ├── test_cache.py (1600 tokens)
   ├── test_cli.py (200 tokens)
   ├── test_converter.py (700 tokens)
   ├── test_doclayout.py (700 tokens)
   ├── test_kernel.py (5k tokens)
   ├── test_translator.py (1700 tokens)
```


## /.dockerignore

```dockerignore path="/.dockerignore" 
.github
docs
.git
.pre-commit-config.yaml
uv.lock
pdf2zh_files
gui/pdf2zh_files
gradio_files
tmp
gui/gradio_files
gui/tmp
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# poetry
#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
#   This is especially recommended for binary packages to ensure reproducibility, and is more
#   commonly ignored for libraries.
#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
.vscode
.DS_Store

```

## /.github/FUNDING.yml

```yml path="/.github/FUNDING.yml" 
# These are supported funding model platforms

github: [Byaidu, reycn, Wybxc, hellofinch] # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2]
patreon: # Replace with a single Patreon username
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry
polar: # Replace with a single Polar username
buy_me_a_coffee: # Replace with a single Buy Me a Coffee username
thanks_dev: # Replace with a single thanks.dev username
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']

```

## /.github/ISSUE_TEMPLATE/blank.yaml

```yaml path="/.github/ISSUE_TEMPLATE/blank.yaml" 
name: Blank Issue
description: Create a blank issue for discussion
body:  
  - type: markdown
    attributes:
      value: |
        ## 2.0 is released! The new repo is [HERE](https://github.com/PDFMathTranslate/PDFMathTranslate-next)
  - type: checkboxes
    id: checks
    attributes:
      label: Before you ask
      options:
      - label: I have tried PDFMathTranslate-next and given feedback in the PDFMathTranslate-next repository
        required: true
  - type: checkboxes
    id: scope
    attributes:
      label: Before you continue
      options:
      - label: This issue is neither a question nor a bug report.
        required: true
  - type: textarea
    id: describe
    attributes:
      label: Add a description
```
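
GitHub issue forms expect every element `id` to be unique within a form, so a repeated `id` is worth catching before a template is merged. The following is a minimal sketch of such a check. It runs on an already-parsed `body` list, and the helper name `duplicate_ids` and the sample data are hypothetical, not part of this repository.

```python
from collections import Counter


def duplicate_ids(body):
    """Return the ids that occur more than once in an issue-form body list."""
    counts = Counter(item["id"] for item in body if "id" in item)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical parsed form body with one repeated id.
sample_body = [
    {"type": "markdown"},
    {"type": "checkboxes", "id": "checks"},
    {"type": "checkboxes", "id": "checks"},
    {"type": "textarea", "id": "describe"},
]

print(duplicate_ids(sample_body))
```

An empty result means every `id` in the form is distinct.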

## /.github/ISSUE_TEMPLATE/config.yml

```yml path="/.github/ISSUE_TEMPLATE/config.yml" 
blank_issues_enabled: false

```

## /.github/ISSUE_TEMPLATE/功能请求_cn.yaml

```yaml path="/.github/ISSUE_TEMPLATE/功能请求_cn.yaml" 
name: 功能请求
description: 使用中文进行功能请求
labels: ['enhancement']
body:
  - type: markdown
    attributes:
      value: |
        ## 2.0 is released! The new repo is [HERE](https://github.com/PDFMathTranslate/PDFMathTranslate-next)
  - type: checkboxes
    id: checks
    attributes:
      label: 在提问之前...
      options:
      - label: 我已经尝试了PDFMathTranslate-next,并在PDFMathTranslate-next提交了issue
        required: true
  - type: textarea
    id: describe
    attributes:
      label: 在什么场景下,需要你请求的功能?
      description: 简要描述相关的使用场景
    validations:
      required: false
  - type: textarea
    id: solution
    attributes:
      label: 解决方案
      description: 描述你想要的解决方案
    validations:
      required: false
  - type: textarea
    id: additional
    attributes:
      label: 其他内容
      description: 关于该功能请求的任何其他项目。
    validations:
      required: false
```

## /.github/ISSUE_TEMPLATE/功能请求_en.yaml

```yaml path="/.github/ISSUE_TEMPLATE/功能请求_en.yaml" 
name: Feature request
description: Suggest an idea for this project
labels: ['enhancement']
body:
  - type: markdown
    attributes:
      value: |
        ## 2.0 is released! The new repo is [HERE](https://github.com/PDFMathTranslate/PDFMathTranslate-next)
  - type: checkboxes
    id: checks
    attributes:
      label: Before you ask
      options:
      - label: I have tried PDFMathTranslate-next and given feedback in the PDFMathTranslate-next repository
        required: true
  - type: textarea
    id: describe
    attributes:
      label: Is your feature request related to a problem?
      description: A clear and concise description of what the problem is
      placeholder: Ex. I'm always frustrated when ...
    validations:
      required: false
  - type: textarea
    id: solution
    attributes:
      label: Describe the solution you'd like
      description: A clear and concise description of what you want to happen
    validations:
      required: false
  - type: textarea
    id: additional
    attributes:
      label: Additional context
      description: Add any other context about the feature request here.
    validations:
      required: false
```

## /.github/ISSUE_TEMPLATE/问题反馈_cn.yaml

```yaml path="/.github/ISSUE_TEMPLATE/问题反馈_cn.yaml" 
name: 上报 Bug
description: 使用中文进行 Bug 报告
labels: ['bug']
body:
  - type: markdown
    attributes:
      value: |
        ## 2.0 is released! The new repo is [HERE](https://github.com/PDFMathTranslate/PDFMathTranslate-next)
  - type: checkboxes
    id: checks
    attributes:
      label: 在提问之前...
      options:
      - label: 我已经尝试了PDFMathTranslate-next,并在PDFMathTranslate-next提交了issue
        required: true
      - label: 我已经搜索了现有的 issues
        required: true
      - label: 我在提问题之前至少花费了 5 分钟来思考和准备
        required: true
      - label: 我已经认真且完整的阅读了 wiki
        required: true
      - label: 我已经认真检查了问题和网络环境无关(包括但不限于Google不可用,模型下载失败)
        required: true
  - type: markdown
    attributes:
      value: |
        感谢您使用本项目并反馈!
        请再次确认上述复选框所述的内容已经认真执行!
  - type: textarea
    id: environment
    attributes:
      label: 使用的环境
      placeholder: |
          - **OS**: Ubuntu 24.10  
          - **Python**: 3.12.0  
          - **pdf2zh**: 1.9.0
      render: markdown
    validations:
      required: false
  - type: dropdown
    id: install
    attributes:
      label: 请选择安装方式
      options:
        - pip
        - exe
        - 源码
        - docker
    validations:
      required: true
  - type: textarea
    id: describe
    attributes:
      label: 描述你的问题
      description: 简要描述你的问题
    validations:
      required: true
  - type: textarea
    id: reproduce
    attributes:
      label: 如何复现
      description: 重现该行为的步骤
      value: |
        1. 执行 '...'
        2. 选择 '....'
        3. 出现问题
    validations:
      required: false
  - type: textarea
    id: expected
    attributes:
      label: 预期行为
      description: 简要描述你期望得到的反馈
    validations:
      required: false
  - type: textarea
    id: logs
    attributes:
      label: 相关 Logs
      description: 请复制并粘贴任何相关的日志输出。
      render: Text
    validations:
      required: false
  - type: textarea
    id: PDFfile
    attributes:
      label: 原始PDF文件
      description: |
        如果涉及到排版错误的问题,请一定提供原始的PDF文件,方便复现错误。
    validations:
      required: false
  - type: textarea
    id: others
    attributes:
      label: 还有别的吗?
      description: |
        相关的配置?链接?参考资料?
        任何能让我们对你所遇到的问题有更多了解的东西。
    validations:
      required: false
```

## /.github/ISSUE_TEMPLATE/问题反馈_en.yaml

```yaml path="/.github/ISSUE_TEMPLATE/问题反馈_en.yaml" 
name: Bug Report
description: Create a report to help us improve
labels: ['bug']
body:
  - type: markdown
    attributes:
      value: |
        ## 2.0 is released! The new repo is [HERE](https://github.com/PDFMathTranslate/PDFMathTranslate-next)
  - type: checkboxes
    id: checks
    attributes:
      label: Before you ask
      options:
      - label: I have tried PDFMathTranslate-next and given feedback in the PDFMathTranslate-next repository
        required: true
      - label: I have searched the existing issues
        required: true
      - label: I spent at least 5 minutes thinking and preparing
        required: true
      - label: I have thoroughly and completely read the wiki.
        required: true
      - label: I have carefully checked the issue, and it is unrelated to the network environment.
        required: true
  - type: markdown
    attributes:
      value: |
        Thank you for using this project and providing feedback!
  - type: textarea
    id: environment
    attributes:
      label: Environment
      placeholder: |
          - **OS**: Ubuntu 24.10
          - **Python**: 3.12.0
          - **pdf2zh**: 1.9.0
      render: markdown
    validations:
      required: false
  - type: dropdown
    id: install
    attributes:
      label: How to install pdf2zh
      options:
        - pip
        - exe
        - source
        - docker
    validations:
      required: true
  - type: textarea
    id: describe
    attributes:
      label: Describe the bug
      description: A clear and concise description of what the bug is.
    validations:
      required: true
  - type: textarea
    id: reproduce
    attributes:
      label: To Reproduce
      description: Steps to reproduce the behavior
      value: |
        1. execute '...'
        2. select '....'
        3. see errors
    validations:
      required: false
  - type: textarea
    id: expected
    attributes:
      label: Expected behavior
      description: A clear and concise description of what you expected to happen.
    validations:
      required: false
  - type: textarea
    id: logs
    attributes:
      label: Relevant log output
      description: Please copy and paste any relevant log output. This will be automatically formatted into code, so no need for backticks.
      render: Text
    validations:
      required: false
  - type: textarea
    id: PDFfile
    attributes:
      label: Original PDF file
      description: |
        If the issue involves formatting errors, please provide the original PDF file to facilitate reproduction of the error.
    validations:
      required: false
  - type: textarea
    id: others
    attributes:
      label: Anything else?
      description: |
        Related configs? Links? References?
        Anything that will give us more context about the issue you are encountering!
    validations:
      required: false
```

## /.github/dependabot.yml

```yml path="/.github/dependabot.yml" 
version: 2
updates:
  - package-ecosystem: github-actions
    directory: "/"
    schedule:
      interval: weekly
  # - package-ecosystem: pip
  #   directory: "/.github/workflows"
  #   schedule:
  #     interval: weekly
  # - package-ecosystem: pip
  #   directory: "/docs"
  #   schedule:
  #     interval: weekly
  - package-ecosystem: pip
    directory: "/"
    schedule:
      interval: weekly
    versioning-strategy: lockfile-only
    allow:
      - dependency-type: "all"
```

## /.github/release-drafter.yml

```yml path="/.github/release-drafter.yml" 
name-template: 'v$RESOLVED_VERSION'
tag-template: 'v$RESOLVED_VERSION'
categories:
  - title: '🚀 Features'
    labels:
      - 'feature'
      - 'enhancement'
  - title: '🐛 Bug Fixes'
    labels:
      - 'fix'
      - 'bugfix'
      - 'bug'
  - title: '🧰 Maintenance'
    labels:
      - 'chore'
      - 'maintenance'
      - 'refactor'
  - title: '📝 Documentation'
    labels:
      - 'docs'
      - 'documentation'
change-template: '- $TITLE @$AUTHOR (#$NUMBER)'
change-title-escapes: '\<*_&' # You can add # and @ to disable mentions
version-resolver:
  major:
    labels:
      - 'major'
  minor:
    labels:
      - 'minor'
  patch:
    labels:
      - 'patch'
  default: patch
template: |
  ## Changes

  $CHANGES

  ## Contributors
  
  $CONTRIBUTORS

  ## Windows Specific

  If you cannot open it after downloading, please install https://aka.ms/vs/17/release/vc_redist.x64.exe and try again.

  ## Assets

  - pdf2zh-v$RESOLVED_VERSION-win64.zip: pdf2zh **without** assets (font, model, etc.)
  - pdf2zh-v$RESOLVED_VERSION-with-assets-win64.zip: (**Recommended**) pdf2zh **with** assets (font, model, etc.)

  > [!NOTE]
  >
  > The version without assets will also dynamically download resources when running, but the download may fail due to network issues.
```
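
The `version-resolver` section above maps pull-request labels to a version bump and falls back to `patch`. The following is a rough sketch of that resolution, assuming the highest-ranked label among the merged pull requests wins. It is a simplified model rather than Release Drafter's actual code, and `resolve_bump` and `next_tag` are hypothetical names.

```python
# Bump labels in descending precedence, as listed under version-resolver.
PRECEDENCE = ("major", "minor", "patch")
DEFAULT_BUMP = "patch"  # mirrors `default: patch`


def resolve_bump(labels_per_pr):
    """Pick the strongest bump label found on any merged pull request."""
    seen = {label for labels in labels_per_pr for label in labels}
    for bump in PRECEDENCE:
        if bump in seen:
            return bump
    return DEFAULT_BUMP


def next_tag(current, labels_per_pr):
    """Apply the resolved bump to `current` and format it like `v$RESOLVED_VERSION`."""
    major, minor, patch = (int(part) for part in current.lstrip("v").split("."))
    bump = resolve_bump(labels_per_pr)
    if bump == "major":
        major, minor, patch = major + 1, 0, 0
    elif bump == "minor":
        minor, patch = minor + 1, 0
    else:
        patch += 1
    return f"v{major}.{minor}.{patch}"


print(next_tag("v1.9.0", [["bug"], ["minor", "enhancement"]]))
```

Labels such as `bug` or `docs` only affect which changelog category a pull request lands in; in this model they do not influence the version number.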

## /.github/workflows/black.format.yml

```yml path="/.github/workflows/black.format.yml" 
name: Format Code with Black

on: [push, pull_request]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  lint:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    if: >-
      !startsWith(github.event.head_commit.message, 'doc:') &&
      !startsWith(github.event.head_commit.message, 'doc(') &&
      !startsWith(github.event.pull_request.title, 'doc:') &&
      !startsWith(github.event.pull_request.title, 'doc(')
    steps:
      - uses: actions/checkout@v6
      - uses: psf/black@stable

```
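
The `if:` guard in the workflow above skips the Black job for documentation-only changes. The following is a small sketch of the same predicate, assuming a field absent from the event payload behaves like an empty string. The function name `should_run_lint` is hypothetical.

```python
def should_run_lint(head_commit_message=None, pull_request_title=None):
    """Mirror the `if:` guard: skip when either text starts with a doc prefix."""
    prefixes = ("doc:", "doc(")
    # A field missing from the event payload is modelled as an empty string.
    # GitHub's startsWith() ignores case, hence the lower() calls.
    message = (head_commit_message or "").lower()
    title = (pull_request_title or "").lower()
    return not message.startswith(prefixes) and not title.startswith(prefixes)


print(should_run_lint(head_commit_message="doc: update README"))
print(should_run_lint(head_commit_message="fix: handle empty page"))
```

Note that a `docs:` prefix does not match either pattern, so such commits would still be linted.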

## /.github/workflows/exe-build.yml

```yml path="/.github/workflows/exe-build.yml" 
name: windows exe Release Workflow

on:
  workflow_dispatch:
    inputs:
      release_version:
        description: 'Release Version (e.g., v1.0.0)'
        required: true
        type: string
  # push:
    # debug purpose
env:
  WIN_EXE_PYTHON_VERSION: 3.12.9
jobs:
  build-win64-exe:
    runs-on: windows-latest
    steps:
      - name: checkout babeldoc metadata
        uses: actions/checkout@v6
        with:
          repository: funstory-ai/BabelDOC
          path: babeldoctemp1234567
          token: ${{ secrets.GITHUB_TOKEN }}
          sparse-checkout: babeldoc/assets/embedding_assets_metadata.py
      - name: Cached Assets
        id: cache-assets
        uses: actions/cache@v5
        with:
          path: ~/.cache/babeldoc
          key: test-1-babeldoc-assets-${{ hashFiles('babeldoctemp1234567/babeldoc/assets/embedding_assets_metadata.py') }}
      - name: Check out code
        uses: actions/checkout@v6

      - name: Setup uv with Python ${{ env.WIN_EXE_PYTHON_VERSION }}
        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
        with:
          python-version: ${{ env.WIN_EXE_PYTHON_VERSION }}
          enable-cache: true
          cache-dependency-glob: "pyproject.toml"

      - name: Run all tasks (create directories, download, extract, copy files, install dependencies)
        shell: pwsh
        run: |
          Write-Host "==== Remove the babeldoctemp1234567 folder ===="
          if (Test-Path "./babeldoctemp1234567") {
              Remove-Item -Path "./babeldoctemp1234567" -Recurse -Force
              Write-Host "babeldoctemp1234567 folder removed"
          } else {
              Write-Host "babeldoctemp1234567 folder does not exist; nothing to remove"
          }
          Write-Host "==== Create required directories ===="
          New-Item -Path "./build" -ItemType Directory -Force
          New-Item -Path "./build/runtime" -ItemType Directory -Force
          New-Item -Path "./dep_build" -ItemType Directory -Force

          Write-Host "==== Copy source code to dep_build ===="
          Get-ChildItem -Path "./" -Exclude "dep_build", "build" | Copy-Item -Destination "./dep_build" -Recurse -Force

          Write-Host "==== Download and extract Python ${{ env.WIN_EXE_PYTHON_VERSION }} ===="
          Write-Host "pythonUrl: https://www.python.org/ftp/python/${{ env.WIN_EXE_PYTHON_VERSION }}/python-${{ env.WIN_EXE_PYTHON_VERSION }}-embed-amd64.zip"
          $pythonUrl = "https://www.python.org/ftp/python/${{ env.WIN_EXE_PYTHON_VERSION }}/python-${{ env.WIN_EXE_PYTHON_VERSION }}-embed-amd64.zip"
          $pythonZip = "./dep_build/python.zip"
          Invoke-WebRequest -Uri $pythonUrl -OutFile $pythonZip
          Expand-Archive -Path $pythonZip -DestinationPath "./build/runtime" -Force

          Write-Host "==== Download and extract PyStand ===="
          $pystandUrl = "https://github.com/skywind3000/PyStand/releases/download/1.1.4/PyStand-v1.1.4-exe.zip"
          $pystandZip = "./dep_build/PyStand.zip"
          Invoke-WebRequest -Uri $pystandUrl -OutFile $pystandZip
          Expand-Archive -Path $pystandZip -DestinationPath "./dep_build/PyStand" -Force

          Write-Host "==== Copy PyStand.exe to build and rename it ===="
          $pystandExe = "./dep_build/PyStand/PyStand-x64-CLI/PyStand.exe"
          $destExe = "./build/pdf2zh.exe"
          if (Test-Path $pystandExe) {
              Copy-Item -Path $pystandExe -Destination $destExe -Force
          } else {
              Write-Host "Error: PyStand.exe not found!"
              exit 1
          }
          Write-Host "==== Create Python venv in dep_build ===="
          uv venv ./dep_build/venv

          ./dep_build/venv/Scripts/activate

          Write-Host "==== Install project dependencies into the venv ===="
          uv pip install .

          Write-Host "==== Copy venv/Lib/site-packages to build/ ===="
          Copy-Item -Path "./dep_build/venv/Lib/site-packages" -Destination "./build/site-packages" -Recurse -Force

          Write-Host "==== Copy script/_pystand_static.int to build/ ===="
          $staticFile = "./script/_pystand_static.int"
          $destStatic = "./build/_pystand_static.int"
          if (Test-Path $staticFile) {
              Copy-Item -Path $staticFile -Destination $destStatic -Force
          } else {
              Write-Host "Error: script/_pystand_static.int not found!"
              exit 1
          }

          uv run --active babeldoc --generate-offline-assets ./build

      - name: Upload build artifact
        uses: actions/upload-artifact@v7
        with:
          name: win64-exe
          path: ./build
          if-no-files-found: error
          compression-level: 9
          include-hidden-files: true

  test-win64-exe:
    needs: 
      - build-win64-exe
    runs-on: windows-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v6

      - name: Download build artifact
        uses: actions/download-artifact@v8
        with:
          name: win64-exe
          path: ./build

      - name: Test show version
        run: |
          ./build/pdf2zh.exe --version
      
      - name: Test - Translate a PDF file with plain text only
        run: |
          ./build/pdf2zh.exe ./test/file/translate.cli.plain.text.pdf -o ./test/file

      - name: Test - Translate a PDF file with a figure
        run: |
          ./build/pdf2zh.exe ./test/file/translate.cli.text.with.figure.pdf -o ./test/file

      - name: Delete offline assets and cache
        shell: pwsh
        run: |
          Write-Host "==== Find and delete the offline assets package ===="
          $offlineAssetsPath = Get-ChildItem -Path "./build" -Filter "offline_assets_*.zip" -Recurse | Select-Object -First 1 -ExpandProperty FullName
          if ($offlineAssetsPath) {
            Write-Host "Found offline assets package: $offlineAssetsPath"
            Remove-Item -Path $offlineAssetsPath -Force
            Write-Host "Offline assets package deleted"
          } else {
            Write-Host "Offline assets package not found"
          }

          Write-Host "==== Delete the cache directory ===="
          $cachePath = "$env:USERPROFILE/.cache/babeldoc"
          if (Test-Path $cachePath) {
            Remove-Item -Path $cachePath -Recurse -Force
            Write-Host "Cache directory deleted: $cachePath"
          } else {
            Write-Host "Cache directory does not exist: $cachePath"
          }

      - name: Test - Translate without offline assets
        run: |
          ./build/pdf2zh.exe ./test/file/translate.cli.plain.text.pdf -o ./test/file
          
      - name: Upload test results
        uses: actions/upload-artifact@v7
        with:
          name: test-results
          path: ./test/file/

  
```
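
The build job above assembles a portable bundle: an embedded Python runtime, a renamed PyStand launcher, the installed `site-packages`, and `_pystand_static.int`. The following is a short sketch of the download URL and the resulting layout, using only paths that appear in the workflow. The helper names and dictionary keys are hypothetical.

```python
from pathlib import PurePosixPath


def embed_python_url(version, arch="amd64"):
    """Build the python.org URL for the Windows embeddable package."""
    return (
        f"https://www.python.org/ftp/python/{version}/"
        f"python-{version}-embed-{arch}.zip"
    )


def bundle_layout(build_dir="./build"):
    """Return where each piece of the bundle ends up under the build directory."""
    root = PurePosixPath(build_dir)
    return {
        "launcher": root / "pdf2zh.exe",  # PyStand.exe copied and renamed
        "runtime": root / "runtime",  # extracted embeddable Python
        "site_packages": root / "site-packages",  # copied from the venv
        "pystand_static": root / "_pystand_static.int",  # copied from script/
    }


print(embed_python_url("3.12.9"))
for name, path in bundle_layout().items():
    print(f"{name}: {path}")
```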

## /.github/workflows/fork-build.yml

```yml path="/.github/workflows/fork-build.yml" 
name: fork-build

on:
  workflow_dispatch:
  # debug purpose
  # push:

env:
  REGISTRY: ghcr.io
  REPO_LOWER: ${{ github.repository_owner }}/${{ github.event.repository.name }}
  GHCR_REPO: ghcr.io/${{ github.repository }}
  WIN_EXE_PYTHON_VERSION: 3.12.9
jobs:
  check-repository:
    name: Check if running in main repository
    runs-on: ubuntu-latest
    outputs:
      is_main_repo: ${{ github.repository == 'Byaidu/PDFMathTranslate' }}
    steps:
      - run: echo "Running repository check"

  test:
    uses: ./.github/workflows/python-test.yml
    needs: check-repository
    if: needs.check-repository.outputs.is_main_repo != 'true'

  build:
    strategy:
      fail-fast: false
      matrix:
        include:
          - platform: linux/amd64
            runner: ubuntu-latest
          - platform: linux/arm64
            runner: ubuntu-24.04-arm
    runs-on: ${{ matrix.runner }}
    needs: 
      - check-repository
      - test
    if: needs.check-repository.outputs.is_main_repo != 'true'
    permissions:
      contents: read
      packages: write
      
    steps:
      - name: Convert to lowercase
        run: |
          echo "GHCR_REPO_LOWER=$(echo ${{ env.GHCR_REPO }} | tr '[:upper:]' '[:lower:]')" >> $GITHUB_ENV

      - name: Prepare
        run: |
          platform=${{ matrix.platform }}
          echo "PLATFORM_PAIR=${platform//\//-}" >> $GITHUB_ENV

      - name: Checkout repository
        uses: actions/checkout@v6

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v6
        with:
          images: |
            ${{ env.GHCR_REPO_LOWER }}

      - name: Login to GHCR
        uses: docker/login-action@v4
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}


      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v4

      - name: Build and push by digest
        id: build
        uses: docker/build-push-action@v6
        with:
          platforms: ${{ matrix.platform }}
          labels: ${{ steps.meta.outputs.labels }}
          outputs: type=image,name=${{ env.GHCR_REPO_LOWER }},push-by-digest=true,name-canonical=true,push=true
          cache-from: ${{ matrix.platform == 'linux/amd64' && 'type=gha' || '' }}
          cache-to: ${{ matrix.platform == 'linux/amd64' && 'type=gha,mode=max' || '' }}

      - name: Export digest
        run: |
          mkdir -p ${{ runner.temp }}/digests
          digest="${{ steps.build.outputs.digest }}"
          touch "${{ runner.temp }}/digests/${digest#sha256:}"

      - name: Upload digest
        uses: actions/upload-artifact@v7
        with:
          name: digests-${{ env.PLATFORM_PAIR }}
          path: ${{ runner.temp }}/digests/*
          if-no-files-found: error
          retention-days: 1

  merge:
    runs-on: ubuntu-latest
    needs:
      - check-repository
      - test
      - build
    if: needs.check-repository.outputs.is_main_repo != 'true'
    permissions:
      contents: read
      packages: write
      
    steps:
      - name: Convert to lowercase
        run: |
          echo "GHCR_REPO_LOWER=$(echo ${{ env.GHCR_REPO }} | tr '[:upper:]' '[:lower:]')" >> $GITHUB_ENV

      - name: Download digests
        uses: actions/download-artifact@v8
        with:
          path: ${{ runner.temp }}/digests
          pattern: digests-*
          merge-multiple: true

      - name: Login to GHCR
        uses: docker/login-action@v4
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v4

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v6
        with:
          images: |
            ${{ env.GHCR_REPO_LOWER }}
          tags: |
            type=raw,value=dev
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}

      - name: Create manifest list and push
        working-directory: ${{ runner.temp }}/digests
        run: |
          docker buildx imagetools create $(jq -cr '.tags | map("-t " + .) | join(" ")' <<< "$DOCKER_METADATA_OUTPUT_JSON") \
            $(printf '${{ env.GHCR_REPO_LOWER }}@sha256:%s ' *)

      - name: Inspect image
        run: |
          docker buildx imagetools inspect ${{ env.GHCR_REPO_LOWER }}:${{ steps.meta.outputs.version }}
  
  build-win64-exe:
    runs-on: windows-latest
    needs:
      - check-repository
    if: needs.check-repository.outputs.is_main_repo != 'true'
    steps:
      - name: Check out code
        uses: actions/checkout@v6

      - name: Setup uv with Python ${{ env.WIN_EXE_PYTHON_VERSION }}
        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
        with:
          python-version: ${{ env.WIN_EXE_PYTHON_VERSION }}
          enable-cache: true
          cache-dependency-glob: "pyproject.toml"

      - name: Run all tasks (create directories, download, extract, copy files, install dependencies)
        shell: pwsh
        run: |
          Write-Host "==== Create required directories ===="
          New-Item -Path "./build" -ItemType Directory -Force
          New-Item -Path "./build/runtime" -ItemType Directory -Force
          New-Item -Path "./dep_build" -ItemType Directory -Force

          Write-Host "==== Copy source code to dep_build ===="
          Get-ChildItem -Path "./" -Exclude "dep_build", "build" | Copy-Item -Destination "./dep_build" -Recurse -Force

          Write-Host "==== Download and extract Python ${{ env.WIN_EXE_PYTHON_VERSION }} ===="
          Write-Host "pythonUrl: https://www.python.org/ftp/python/${{ env.WIN_EXE_PYTHON_VERSION }}/python-${{ env.WIN_EXE_PYTHON_VERSION }}-embed-amd64.zip"
          $pythonUrl = "https://www.python.org/ftp/python/${{ env.WIN_EXE_PYTHON_VERSION }}/python-${{ env.WIN_EXE_PYTHON_VERSION }}-embed-amd64.zip"
          $pythonZip = "./dep_build/python.zip"
          Invoke-WebRequest -Uri $pythonUrl -OutFile $pythonZip
          Expand-Archive -Path $pythonZip -DestinationPath "./build/runtime" -Force

          Write-Host "==== Download the Visual C++ Redistributable installer ===="
          $vcRedistUrl = "https://aka.ms/vs/17/release/vc_redist.x64.exe"
          # The file name tells users to install vc_redist.x64.exe if the program fails to run.
          $vcRedistPath = "./build/无法运行请安装vc_redist.x64.exe"
          Invoke-WebRequest -Uri $vcRedistUrl -OutFile $vcRedistPath
          Write-Host "Downloaded the Visual C++ Redistributable installer to: $vcRedistPath"

          Write-Host "==== Download and extract PyStand ===="
          $pystandUrl = "https://github.com/skywind3000/PyStand/releases/download/1.1.4/PyStand-v1.1.4-exe.zip"
          $pystandZip = "./dep_build/PyStand.zip"
          Invoke-WebRequest -Uri $pystandUrl -OutFile $pystandZip
          Expand-Archive -Path $pystandZip -DestinationPath "./dep_build/PyStand" -Force

          Write-Host "==== Copy PyStand.exe to build and rename it ===="
          $pystandExe = "./dep_build/PyStand/PyStand-x64-CLI/PyStand.exe"
          $destExe = "./build/pdf2zh.exe"
          if (Test-Path $pystandExe) {
              Copy-Item -Path $pystandExe -Destination $destExe -Force
          } else {
              Write-Host "Error: PyStand.exe not found!"
              exit 1
          }
          Write-Host "==== Create Python venv in dep_build ===="
          uv venv ./dep_build/venv

          ./dep_build/venv/Scripts/activate

          Write-Host "==== Install project dependencies into the venv ===="
          uv pip install .

          Write-Host "==== Copy venv/Lib/site-packages to build/ ===="
          Copy-Item -Path "./dep_build/venv/Lib/site-packages" -Destination "./build/site-packages" -Recurse -Force

          Write-Host "==== Copy script/_pystand_static.int to build/ ===="
          $staticFile = "./script/_pystand_static.int"
          $destStatic = "./build/_pystand_static.int"
          if (Test-Path $staticFile) {
              Copy-Item -Path $staticFile -Destination $destStatic -Force
          } else {
              Write-Host "Error: script/_pystand_static.int not found!"
              exit 1
          }

      - name: Upload build artifact
        uses: actions/upload-artifact@v7
        with:
          name: win64-exe
          path: ./build
          if-no-files-found: error
          compression-level: 1
          include-hidden-files: true

  test-win64-exe:
    needs: 
      - build-win64-exe
      - check-repository
    if: needs.check-repository.outputs.is_main_repo != 'true'
    runs-on: windows-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v6

      - name: Download build artifact
        uses: actions/download-artifact@v8
        with:
          name: win64-exe
          path: ./build

      - name: Test show version (online mode)
        run: |
          ./build/pdf2zh.exe --version
      
      - name: Test - Translate a PDF file with plain text only (online mode)
        run: |
          ./build/pdf2zh.exe ./test/file/translate.cli.plain.text.pdf -o ./test/file

      - name: Test - Translate a PDF file figure
        run: |
          ./build/pdf2zh.exe ./test/file/translate.cli.text.with.figure.pdf -o ./test/file

      - name: Test - Translate without offline assets (online mode)
        run: |
          ./build/pdf2zh.exe ./test/file/translate.cli.plain.text.pdf -o ./test/file
          
      - name: Upload test results
        uses: actions/upload-artifact@v7
        with:
          name: test-results
          path: ./test/file/
          if-no-files-found: error

      - name: Setup uv with Python ${{ env.WIN_EXE_PYTHON_VERSION }}
        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
        with:
          python-version: ${{ env.WIN_EXE_PYTHON_VERSION }}
          enable-cache: true
          cache-dependency-glob: "pyproject.toml"

      - name: Generate offline assets
        shell: pwsh
        run: |
          Write-Host "==== Generate the offline assets package ===="
          uv run --active babeldoc --generate-offline-assets ./build

      - name: Delete cache
        shell: pwsh
        run: |
          Write-Host "==== Delete the cache directory ===="
          $cachePath = "$env:USERPROFILE/.cache/babeldoc"
          if (Test-Path $cachePath) {
            Remove-Item -Path $cachePath -Recurse -Force
            Write-Host "Deleted cache directory: $cachePath"
          } else {
            Write-Host "Cache directory does not exist: $cachePath"
          }

      - name: Test - Translate with offline assets (offline mode)
        run: |
          Write-Host "==== Test the offline assets package ===="
          New-Item -Path "./test/file/offline_result" -ItemType Directory -Force
          ./build/pdf2zh.exe ./test/file/translate.cli.plain.text.pdf -o ./test/file/offline_result

      - name: Upload offline test results
        uses: actions/upload-artifact@v7
        with:
          name: offline-test-results
          path: ./test/file/offline_result/
          if-no-files-found: error

      - name: Upload build with offline assets artifact
        uses: actions/upload-artifact@v7
        with:
          name: win64-exe-with-assets
          path: ./build
          if-no-files-found: error
          compression-level: 1
          include-hidden-files: true
```
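
The build step in the workflow above downloads the Windows embeddable Python distribution from a URL derived from `WIN_EXE_PYTHON_VERSION`. A minimal Python sketch of that URL scheme follows; the function name `embed_python_url` is hypothetical and exists only for illustration.

```python
def embed_python_url(version: str) -> str:
    """Build the python.org URL for the 64-bit embeddable zip of `version`.

    Mirrors the string interpolation in the workflow's PowerShell step;
    it does not check that the version actually exists on python.org.
    """
    return (
        f"https://www.python.org/ftp/python/{version}/"
        f"python-{version}-embed-amd64.zip"
    )
```

Calling `embed_python_url("3.12.9")` yields the same address the workflow prints before downloading.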

## /.github/workflows/fork-test.yml

```yml path="/.github/workflows/fork-test.yml" 
name: fork-test

on:
  push:
    branches: [ "main", "master" ]

env:
  REGISTRY: ghcr.io
  REPO_LOWER: ${{ github.repository_owner }}/${{ github.event.repository.name }}
  GHCR_REPO: ghcr.io/${{ github.repository }}
  WIN_EXE_PYTHON_VERSION: 3.12.9
jobs:
  check-repository:
    name: Check if running in main repository
    runs-on: ubuntu-latest
    outputs:
      is_main_repo: ${{ github.repository == 'Byaidu/PDFMathTranslate' }}
      is_doc_only: ${{ steps.check-doc.outputs.is_doc_only }}
    steps:
      - run: echo "Running repository check"
      - name: Check if commit is doc-only
        id: check-doc
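        env:
          # Pass the message through the environment so shell metacharacters in it are never evaluated.
          HEAD_COMMIT_MESSAGE: ${{ github.event.head_commit.message }}
        run: |
          MSG=$(printf '%s\n' "$HEAD_COMMIT_MESSAGE" | head -n 1)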
          if [[ "$MSG" == doc:* ]] || [[ "$MSG" == "doc("* ]]; then
            echo "is_doc_only=true" >> $GITHUB_OUTPUT
          else
            echo "is_doc_only=false" >> $GITHUB_OUTPUT
          fi

  test:
    uses: ./.github/workflows/python-test.yml
    needs: check-repository
    if: needs.check-repository.outputs.is_main_repo != 'true' && needs.check-repository.outputs.is_doc_only != 'true'

```
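
The `check-doc` step above marks a push as documentation-only when the first line of the head commit message starts with `doc:` or `doc(`. A minimal Python sketch of that rule, useful for checking a commit message locally, is shown here; `is_doc_only` is a hypothetical helper and not part of the repository.

```python
def is_doc_only(commit_message: str) -> bool:
    """Return True when the first line starts with 'doc:' or 'doc('.

    Equivalent to `head -n 1` followed by the two bash prefix matches
    in the workflow; the comparison is case-sensitive.
    """
    lines = commit_message.splitlines()
    first_line = lines[0] if lines else ""
    return first_line.startswith(("doc:", "doc("))
```

Under this rule `docs: ...` does not count as documentation-only, because only the exact prefixes `doc:` and `doc(` are matched.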

## /.github/workflows/python-publish.yml

```yml path="/.github/workflows/python-publish.yml" 
name: Test and Release

on:
  push:
    branches:
      - main
      - master

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: false

permissions:
  id-token: write
  contents: write
  pull-requests: write

env:
  REGISTRY: ghcr.io
  REPO_LOWER: ${{ github.repository_owner }}/${{ github.event.repository.name }}
  GHCR_REPO: ghcr.io/${{ github.repository }}
  DOCKERHUB_REPO: byaidu/pdf2zh
  WIN_EXE_PYTHON_VERSION: "3.12.9"

jobs:
  check-repository:
    name: Check if running in main repository
    runs-on: ubuntu-latest
    outputs:
      # debug purpose
      is_main_repo: ${{ github.repository == 'Byaidu/PDFMathTranslate' }}
      is_doc_only: ${{ steps.check-doc.outputs.is_doc_only }}
    steps:
      - run: echo "Running repository check"
      - name: Check if commit is doc-only
        id: check-doc
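        env:
          # Pass the message through the environment so shell metacharacters in it are never evaluated.
          HEAD_COMMIT_MESSAGE: ${{ github.event.head_commit.message }}
        run: |
          MSG=$(printf '%s\n' "$HEAD_COMMIT_MESSAGE" | head -n 1)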
          if [[ "$MSG" == doc:* ]] || [[ "$MSG" == "doc("* ]]; then
            echo "is_doc_only=true" >> $GITHUB_OUTPUT
          else
            echo "is_doc_only=false" >> $GITHUB_OUTPUT
          fi

  test:
    needs: check-repository
    uses: ./.github/workflows/python-test.yml
    if: needs.check-repository.outputs.is_main_repo == 'true' && needs.check-repository.outputs.is_doc_only != 'true'

  build:
    name: Build distribution 📦
    needs: [test, check-repository]
    if: needs.check-repository.outputs.is_main_repo == 'true' && needs.check-repository.outputs.is_doc_only != 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 15
    outputs:
      is_release: ${{ steps.check-version.outputs.tag }}
      version: ${{ steps.check-version.outputs.tag && steps.get-release-version.outputs.version || steps.get-dev-version.outputs.version }}
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: true
          fetch-depth: 2
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Setup uv with Python 3.12
        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
        with:
          python-version: "3.12"
          enable-cache: true
          cache-dependency-glob: "pyproject.toml"

      - name: Check if there is a parent commit
        id: check-parent-commit
        run: |
          echo "sha=$(git rev-parse --verify --quiet HEAD^)" >> $GITHUB_OUTPUT

      - name: Detect and tag new version
        id: check-version
        if: steps.check-parent-commit.outputs.sha
        uses: salsify/action-detect-and-tag-new-version@b1778166f13188a9d478e2d1198f993011ba9864 # v2.0.3
        with:
          version-command: |
            cat pyproject.toml | grep "version = " | head -n 1 | awk -F'"' '{print $2}'
          tag-template: 'v{VERSION}'

      - name: Install Dependencies
        run: |
          uv sync

      - name: Bump version for developmental release
        if: "!steps.check-version.outputs.tag"
        id: get-dev-version
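        run: |
          version=$(bumpver update --patch --tag=final --dry 2>&1 | grep "New Version" | awk '{print $NF}')
          # Compute the timestamped version once so the job output and pyproject.toml always agree.
          dev_version="$version.dev$(date +%s)"
          echo "version=$dev_version" >> $GITHUB_OUTPUT
          bumpver update --set-version "$dev_version"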

      - name: Get release version
        if: steps.check-version.outputs.tag
        id: get-release-version
        run: |
          version=$(cat pyproject.toml | grep "version = " | head -n 1 | awk -F'"' '{print $2}')
          echo "version=$version" >> $GITHUB_OUTPUT

      - name: Build package
        run: "uv build"

      - name: Store the distribution packages
        uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
        with:
          name: python-package-distributions
          path: dist/

  publish-to-pypi:
    name: Publish Python 🐍 distribution 📦 to PyPI
    if: needs.build.outputs.is_release != ''
    needs:
      - check-repository
      - build
      - test-win64-exe
    runs-on: ubuntu-latest
    environment:
      name: pypi
      url: https://pypi.org/p/pdf2zh
    permissions:
      id-token: write
    steps:
      - name: Download all the dists
        uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
        with:
          name: python-package-distributions
          path: dist/
      - name: Publish distribution 📦 to PyPI
        uses: pypa/gh-action-pypi-publish@ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e # v1.13.0

  publish-to-testpypi:
    name: Publish Python 🐍 distribution 📦 to TestPyPI
    if: needs.build.outputs.is_release == ''
    needs:
      - check-repository
      - build
      - test-win64-exe
    runs-on: ubuntu-latest
    environment:
      name: testpypi
      url: https://test.pypi.org/p/pdf2zh
    permissions:
      id-token: write
    steps:
      - name: Download all the dists
        uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
        with:
          name: python-package-distributions
          path: dist/
      - name: Publish distribution 📦 to TestPyPI
        uses: pypa/gh-action-pypi-publish@ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e # v1.13.0
        with:
          repository-url: https://test.pypi.org/legacy/

  build-docker-image:
    strategy:
      fail-fast: false
      matrix:
        include:
          - platform: linux/amd64
            runner: ubuntu-latest
          - platform: linux/arm64
            runner: ubuntu-24.04-arm
    runs-on: ${{ matrix.runner }}
    timeout-minutes: 30
    needs:
      - build
      - check-repository
    if: needs.check-repository.outputs.is_main_repo == 'true' && needs.check-repository.outputs.is_doc_only != 'true'
    environment:
      name: ${{ needs.build.outputs.is_release != '' && 'pypi' || 'testpypi' }}
      url: ${{ needs.build.outputs.is_release != '' && 'https://hub.docker.com/r/byaidu/pdf2zh/tags?name=latest' || 'https://hub.docker.com/r/byaidu/pdf2zh/tags?name=dev' }}
    permissions:
      contents: read
      packages: write

    steps:
      - name: Convert to lowercase
        run: |
          echo "GHCR_REPO_LOWER=$(echo ${{ env.GHCR_REPO }} | tr '[:upper:]' '[:lower:]')" >> $GITHUB_ENV

      - name: Prepare
        run: |
          platform=${{ matrix.platform }}
          echo "PLATFORM_PAIR=${platform//\//-}" >> $GITHUB_ENV

      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

      - name: Setup uv with Python 3.12
        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
        with:
          python-version: "3.12"
          enable-cache: true
          cache-dependency-glob: "pyproject.toml"

      - name: Set version from build job
        if: needs.build.outputs.is_release == ''
        run: |
          uv tool install bumpver
          echo "Using version: ${{ needs.build.outputs.version }}"
          bumpver update --set-version ${{ needs.build.outputs.version }}

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@030e881283bb7a6894de51c315a6bfe6a94e05cf # v6
        with:
          images: |
            ${{ env.DOCKERHUB_REPO }}
            ${{ env.GHCR_REPO_LOWER }}
          tags: |
            type=raw,value=dev
            type=raw,value=${{ needs.build.outputs.version }},enable=${{ needs.build.outputs.is_release != '' }}
            type=raw,value=latest,enable=${{ needs.build.outputs.is_release != '' }}

      - name: Login to Docker.io
        uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4
        with:
          registry: docker.io
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - name: Login to GHCR
        uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4

      - name: Build and push by digest
        id: build
        uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
        with:
          platforms: ${{ matrix.platform }}
          labels: ${{ steps.meta.outputs.labels }}
          outputs: type=image,"name=${{ env.DOCKERHUB_REPO }},${{ env.GHCR_REPO_LOWER }}",push-by-digest=true,name-canonical=true,push=true
          cache-from: type=gha,scope=${{ matrix.platform }}
          cache-to: type=gha,mode=max,scope=${{ matrix.platform }}

      - name: Export digest
        run: |
          mkdir -p ${{ runner.temp }}/digests
          digest="${{ steps.build.outputs.digest }}"
          touch "${{ runner.temp }}/digests/${digest#sha256:}"

      - name: Upload digest
        uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
        with:
          name: digests-${{ env.PLATFORM_PAIR }}
          path: ${{ runner.temp }}/digests/*
          if-no-files-found: error
          retention-days: 1

  merge-docker-image:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    needs:
      - build
      - build-docker-image
      - check-repository
      - test-win64-exe
    if: needs.check-repository.outputs.is_main_repo == 'true' && needs.check-repository.outputs.is_doc_only != 'true'
    environment:
      name: ${{ needs.build.outputs.is_release != '' && 'pypi' || 'testpypi' }}
      url: ${{ needs.build.outputs.is_release != '' && 'https://hub.docker.com/r/byaidu/pdf2zh/tags?name=latest' || 'https://hub.docker.com/r/byaidu/pdf2zh/tags?name=dev' }}
    permissions:
      contents: read
      packages: write
    steps:
      - name: Convert to lowercase
        run: |
          echo "GHCR_REPO_LOWER=$(echo ${{ env.GHCR_REPO }} | tr '[:upper:]' '[:lower:]')" >> $GITHUB_ENV

      - name: Download digests
        uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
        with:
          path: ${{ runner.temp }}/digests
          pattern: digests-*
          merge-multiple: true

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@030e881283bb7a6894de51c315a6bfe6a94e05cf # v6
        with:
          images: |
            ${{ env.DOCKERHUB_REPO }}
            ${{ env.GHCR_REPO_LOWER }}
          tags: |
            type=raw,value=dev
            type=raw,value=${{ needs.build.outputs.version }},enable=${{ needs.build.outputs.is_release != '' && 'true' || 'false' }}
            type=raw,value=latest,enable=${{ needs.build.outputs.is_release != '' && 'true' || 'false' }}

      - name: Login to Docker.io
        uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4
        with:
          registry: docker.io
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - name: Login to GHCR
        uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Create manifest list and push
        working-directory: ${{ runner.temp }}/digests
        run: |
          docker buildx imagetools create $(jq -cr '.tags | map("-t " + .) | join(" ")' <<< "$DOCKER_METADATA_OUTPUT_JSON") \
            $(printf '${{ env.DOCKERHUB_REPO }}@sha256:%s ' *)
          docker buildx imagetools create $(jq -cr '.tags | map("-t " + .) | join(" ")' <<< "$DOCKER_METADATA_OUTPUT_JSON") \
            $(printf '${{ env.GHCR_REPO_LOWER }}@sha256:%s ' *)

      - name: Inspect image
        run: |
          docker buildx imagetools inspect ${{ env.DOCKERHUB_REPO }}:${{ steps.meta.outputs.version }}
          docker buildx imagetools inspect ${{ env.GHCR_REPO_LOWER }}:${{ steps.meta.outputs.version }}

  build-win64-exe:
    runs-on: windows-latest
    timeout-minutes: 30
    needs:
      - check-repository
    if: needs.check-repository.outputs.is_main_repo == 'true' && needs.check-repository.outputs.is_doc_only != 'true'
    steps:
      - name: checkout babeldoc metadata
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          repository: funstory-ai/BabelDOC
          path: babeldoctemp1234567
          token: ${{ secrets.GITHUB_TOKEN }}
          sparse-checkout: babeldoc/assets/embedding_assets_metadata.py
      - name: Cached Assets
        id: cache-assets
        uses: actions/cache@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
        with:
          path: ~/.cache/babeldoc
          key: test-1-babeldoc-assets-${{ hashFiles('babeldoctemp1234567/babeldoc/assets/embedding_assets_metadata.py') }}
      - name: Check out code
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

      - name: Setup uv with Python ${{ env.WIN_EXE_PYTHON_VERSION }}
        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
        with:
          python-version: ${{ env.WIN_EXE_PYTHON_VERSION }}
          enable-cache: true
          cache-dependency-glob: "pyproject.toml"

      - name: Build Windows executable
        shell: pwsh
        run: |
          ./script/build-win64.ps1 `
            -PythonVersion "${{ env.WIN_EXE_PYTHON_VERSION }}" `
            -CleanBabelDoc `
            -DownloadVCRedist `
            -GenerateOfflineAssets

      - name: Upload build with offline assets artifact
        uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
        with:
          name: win64-exe-with-assets
          path: ./build
          if-no-files-found: error
          compression-level: 1
          include-hidden-files: true

  test-win64-exe:
    needs:
      - build-win64-exe
      - check-repository
    if: needs.check-repository.outputs.is_main_repo == 'true' && needs.check-repository.outputs.is_doc_only != 'true'
    runs-on: windows-latest
    timeout-minutes: 20
    steps:
      - name: Check out code
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

      - name: Download build artifact
        uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
        with:
          name: win64-exe-with-assets
          path: ./build

      - name: Test show version
        run: |
          ./build/pdf2zh.exe --version

      - name: Test - Translate a PDF file with plain text only
        run: |
          ./build/pdf2zh.exe ./test/file/translate.cli.plain.text.pdf -o ./test/file

      - name: Test - Translate a PDF file figure
        run: |
          ./build/pdf2zh.exe ./test/file/translate.cli.text.with.figure.pdf -o ./test/file

      - name: Delete offline assets and cache
        shell: pwsh
        run: |
          $offlineAssetsPath = Get-ChildItem -Path "./build" -Filter "offline_assets_*.zip" -Recurse | Select-Object -First 1 -ExpandProperty FullName
          if ($offlineAssetsPath) {
            Remove-Item -Path $offlineAssetsPath -Force
          }
          $cachePath = "$env:USERPROFILE/.cache/babeldoc"
          if (Test-Path $cachePath) {
            Remove-Item -Path $cachePath -Recurse -Force
          }

      - name: Test - Translate without offline assets
        run: |
          New-Item -Path "./test/file/offline_result" -ItemType Directory -Force
          ./build/pdf2zh.exe ./test/file/translate.cli.plain.text.pdf -o ./test/file/offline_result

      - name: Upload test results
        uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7
        with:
          name: test-results
          path: ./test/file/
          retention-days: 7

  release-draft:
    name: Release Draft Tasks
    needs:
      - check-repository
      - build
      - publish-to-pypi
      - publish-to-testpypi
      - merge-docker-image
      - test-win64-exe
    if: |
      always() && needs.check-repository.outputs.is_main_repo == 'true' &&
      needs.check-repository.outputs.is_doc_only != 'true' &&
      (needs.publish-to-pypi.result == 'success' || needs.publish-to-testpypi.result == 'success') &&
      needs.merge-docker-image.result == 'success' &&
      needs.test-win64-exe.result == 'success'
    runs-on: ubuntu-latest
    timeout-minutes: 10
    permissions:
      contents: write
      pull-requests: write
    outputs:
      tag_name: ${{ steps.release-drafter.outputs.tag_name }}
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: true
          fetch-depth: 2
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Publish the release notes
        id: release-drafter
        uses: release-drafter/release-drafter@139054aeaa9adc52ab36ddf67437541f039b88e2 # v7.1.1
        with:
          publish: ${{ needs.build.outputs.is_release != '' }}
          tag: ${{ needs.build.outputs.is_release }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  upload-release:
    needs: [release-draft, check-repository]
    runs-on: ubuntu-latest
    timeout-minutes: 10
    if: always() && needs.check-repository.outputs.is_main_repo == 'true' &&
      needs.check-repository.outputs.is_doc_only != 'true' &&
      needs.release-draft.result == 'success'
    steps:
      - name: Check out code
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

      - name: Download build artifact
        uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
        with:
          name: win64-exe-with-assets
          path: ./build

      - name: Create release zip
        run: |
          mv ./build ./pdf2zh
          zip -9qr "pdf2zh-${{ needs.release-draft.outputs.tag_name }}-with-assets-win64.zip" ./pdf2zh/*

          # Find and delete offline asset files
          find ./pdf2zh -name "offline_assets_*.zip" -type f -print -delete
          echo "Remaining offline assets files (should be empty):"
          find ./pdf2zh -name "offline_assets_*.zip" -type f

          zip -9qr "pdf2zh-${{ needs.release-draft.outputs.tag_name }}-win64.zip" ./pdf2zh/*

      - name: Upload to latest release
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          LATEST_RELEASE=${{ needs.release-draft.outputs.tag_name }}
          echo "Latest release tag: $LATEST_RELEASE"
          gh release upload "$LATEST_RELEASE" "pdf2zh-${{ needs.release-draft.outputs.tag_name }}-win64.zip" --clobber

```
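
The `build` job above reads the package version with a `grep | head | awk` pipeline and, for non-release pushes, appends `.dev<unix timestamp>` to the next patch version. The Python sketch below mirrors those two steps under simplifying assumptions: `read_version` and `dev_version` are hypothetical names, and the next patch version is passed in rather than computed by `bumpver`.

```python
import time
from typing import Optional


def read_version(pyproject_text: str) -> str:
    """Mimic: grep "version = " | head -n 1 | awk -F'"' '{print $2}'.

    Takes the first line containing 'version = ' and returns the text
    between its first pair of double quotes, or '' if there is none.
    """
    for line in pyproject_text.splitlines():
        if "version = " in line:
            fields = line.split('"')
            return fields[1] if len(fields) > 1 else ""
    return ""


def dev_version(next_version: str, now: Optional[int] = None) -> str:
    """Append '.dev<unix timestamp>' as the workflow does for dev builds."""
    stamp = int(time.time()) if now is None else now
    return f"{next_version}.dev{stamp}"
```

Passing `now` explicitly keeps the result reproducible; inside the workflow the timestamp comes from `date +%s`.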

## /.github/workflows/python-test.yml

```yml path="/.github/workflows/python-test.yml" 
name: Test and Build Python Package

on:
  pull_request:
  workflow_call:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build-and-test:
    runs-on: ${{ matrix.runner }}
    timeout-minutes: 30
    if: >-
      !startsWith(github.event.pull_request.title, 'doc:') &&
      !startsWith(github.event.pull_request.title, 'doc(')
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.11", "3.12"]
        runner:
          - ubuntu-latest
          - ubuntu-24.04-arm
    steps:
      - name: checkout babeldoc metadata
        uses: actions/checkout@v6
        with:
          repository: funstory-ai/BabelDOC
          path: babeldoctemp1234567
          token: ${{ secrets.GITHUB_TOKEN }}
          sparse-checkout: babeldoc/assets/embedding_assets_metadata.py
      - name: Cached Assets
        id: cache-assets
        uses: actions/cache@v5
        with:
          path: ~/.cache/babeldoc
          key: test-1-babeldoc-assets-${{ hashFiles('babeldoctemp1234567/babeldoc/assets/embedding_assets_metadata.py') }}
      - uses: actions/checkout@v6
      - name: Setup uv with Python ${{ matrix.python-version }}
        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
        with:
          python-version: ${{ matrix.python-version }}
          enable-cache: true
          cache-dependency-glob: "pyproject.toml"
      - name: Cache apt packages
        if: runner.os == 'Linux'
        id: cache-apt
        uses: actions/cache@v5
        with:
          path: ~/apt-cache
          key: apt-libreoffice-${{ matrix.runner }}

      - name: Install system dependencies
        if: runner.os == 'Linux'
        run: |
          if [ -d ~/apt-cache ] && [ "$(ls -A ~/apt-cache 2>/dev/null)" ]; then
            echo "Restoring cached apt packages..."
            sudo cp ~/apt-cache/*.deb /var/cache/apt/archives/ 2>/dev/null || true
          fi
          sudo apt-get update
          sudo apt-get install -y --no-install-recommends libreoffice-core libreoffice-writer
          mkdir -p ~/apt-cache
          cp /var/cache/apt/archives/*.deb ~/apt-cache/ 2>/dev/null || true

      - name: Install dependencies
        run: |
          uv sync

      - name: Test - Unit Test
        run: |
          uv run pytest .

      - name: Test - Translate a PDF file with plain text only
        run: uv run pdf2zh ./test/file/translate.cli.plain.text.pdf -o ./test/file

      - name: Test - Translate a PDF file figure
        run: uv run pdf2zh ./test/file/translate.cli.text.with.figure.pdf -o ./test/file

      # - name: Test - Translate a PDF file with unknown font
      #   run:
      #     pdf2zh ./test/file/translate.cli.font.unknown.pdf

      - name: Test - Start GUI and exit
        run: timeout 10 uv run pdf2zh -i  || code=$?; if [[ $code -ne 124 && $code -ne 0 ]]; then exit $code; fi

      - name: Build as a package
        run: uv build

      - name: Upload test results
        uses: actions/upload-artifact@v7
        with:
          name: test-results-${{ matrix.python-version }}-${{ matrix.runner }}
          path: ./test/file/
          retention-days: 7

```
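
The "Start GUI and exit" step above passes when the command either exits with code 0 or is still running after 10 seconds (in which case `timeout` reports 124). A rough Python equivalent of that smoke test is sketched below; `starts_and_survives` and `SLEEPER` are hypothetical names, and the sketch uses only the standard library `subprocess` module.

```python
import subprocess
import sys
from typing import List


def starts_and_survives(cmd: List[str], seconds: float) -> bool:
    """Return True if `cmd` exits with 0 or is still running at the deadline.

    subprocess.run kills the child and raises TimeoutExpired when the
    timeout elapses, which corresponds to exit code 124 from `timeout`.
    """
    try:
        completed = subprocess.run(cmd, timeout=seconds)
    except subprocess.TimeoutExpired:
        return True
    return completed.returncode == 0


# Stand-in for a long-running GUI process: a child that just sleeps.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]
```

For example, `starts_and_survives(SLEEPER, 1.0)` treats the still-running child as a successful start, while a command that exits with a non-zero code before the deadline is reported as a failure.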

## /.gitignore

```gitignore path="/.gitignore" 
# Experimental kernel files
*.csv
pdf2zh/kernel/PDFMathTranslate-next.git/*.csv
pdf2zh/kernel/PDFMathTranslate-next.git/*.pdf

# PDF Files
pdf2zh_files
gui/pdf2zh_files
gradio_files
tmp
gui/gradio_files
gui/tmp
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# poetry
#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
#   This is especially recommended for binary packages to ensure reproducibility, and is more
#   commonly ignored for libraries.
#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
pdf2zh-dev/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
.vscode
.DS_Store
uv.lock
*.pdf
*.docx

```

## /.gitmodules

```gitmodules path="/.gitmodules" 
[submodule "vendor/PDFMathTranslate-next"]
	path = pdf2zh/kernel/PDFMathTranslate-next.git
	url = https://github.com/PDFMathTranslate/PDFMathTranslate-next.git

```

## /.pre-commit-config.yaml

```yaml path="/.pre-commit-config.yaml" 
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
files: '^.*\.py$'
repos:
-   repo: local
    hooks:
    - id: black
      name: black
      entry: black --check --diff --color
      language: python
    - id: flake8
      name: flake8
      entry: flake8 --ignore E203,E261,E501,W503,E741
      language: python

```
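
pre-commit interprets the top-level `files` value as a Python regular expression that selects which paths the hooks run on. Assuming the intended pattern is `^.*\.py$` (only paths ending in `.py`) and that matching behaves like `re.search`, the following sketch shows which paths would be checked; `hook_applies` is a hypothetical helper, not part of pre-commit.

```python
import re

# Assumed intent of the config's `files` key: only paths ending in ".py".
FILES_PATTERN = re.compile(r"^.*\.py$")


def hook_applies(path: str) -> bool:
    """Return True when `path` would be selected by the assumed pattern."""
    return FILES_PATTERN.search(path) is not None
```

With this pattern `pdf2zh/converter.py` is selected, while `README.md` and `module.pyc` are skipped.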

## /Dockerfile

``` path="/Dockerfile" 
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim

WORKDIR /app


EXPOSE 7860

ENV PYTHONUNBUFFERED=1

# # Download all required fonts
# ADD "https://github.com/satbyy/go-noto-universal/releases/download/v7.0/GoNotoKurrent-Regular.ttf" /app/
# ADD "https://github.com/timelic/source-han-serif/releases/download/main/SourceHanSerifCN-Regular.ttf" /app/
# ADD "https://github.com/timelic/source-han-serif/releases/download/main/SourceHanSerifTW-Regular.ttf" /app/
# ADD "https://github.com/timelic/source-han-serif/releases/download/main/SourceHanSerifJP-Regular.ttf" /app/
# ADD "https://github.com/timelic/source-han-serif/releases/download/main/SourceHanSerifKR-Regular.ttf" /app/

RUN apt-get update && \
     apt-get install --no-install-recommends -y libgl1 libglib2.0-0 libxext6 libsm6 libxrender1 libreoffice-core libreoffice-writer && \
     rm -rf /var/lib/apt/lists/*

COPY pyproject.toml .
RUN uv pip install --system --no-cache -r pyproject.toml && babeldoc --version && babeldoc --warmup

COPY . .

RUN uv pip install --system --no-cache . && uv pip install --system --no-cache -U "babeldoc<0.3.0" "pymupdf<1.25.3" "pdfminer-six==20250416" && babeldoc --version && babeldoc --warmup

CMD ["pdf2zh", "-i"]

```

## /README.md

<div align="center">
	<a href="https://go.warp.dev/PDFMathTranslate" target="_blank">
		<sup>Special thanks to:</sup>
		<br>
		<img alt="Warp sponsorship" width="400" src="https://github.com/warpdotdev/brand-assets/blob/main/Github/Sponsor/Warp-Github-LG-02.png">
		<br>
		<b>Warp, built for coding with multiple AI agents</b>
		<br>
		<sup>Available for macOS, Linux and Windows</sup>
	</a>
</div>

<br>

<div align="center">

English | [简体中文](docs/README_zh-CN.md) | [繁體中文](docs/README_zh-TW.md) | [日本語](docs/README_ja-JP.md) | [한국어](docs/README_ko-KR.md)

<img src="./docs/images/banner.png" width="320px"  alt="PDF2ZH"/>

<h2 id="title">PDFMathTranslate</h2>

<p>
  <!-- PyPI -->
  <a href="https://pypi.org/project/pdf2zh/">
    <img src="https://img.shields.io/pypi/v/pdf2zh"></a>
  <a href="https://pepy.tech/projects/pdf2zh">
    <img src="https://static.pepy.tech/badge/pdf2zh"></a>
  <a href="https://hub.docker.com/r/byaidu/pdf2zh">
    <img src="https://img.shields.io/docker/pulls/byaidu/pdf2zh"></a>
  <a href="https://hellogithub.com/repository/8ec2cfd3ef744762bf531232fa32bc47" target="_blank"><img src="https://api.hellogithub.com/v1/widgets/recommend.svg?rid=8ec2cfd3ef744762bf531232fa32bc47&claim_uid=JQ0yfeBNjaTuqDU&theme=small" alt="Featured|HelloGitHub" /></a>
  <a href="https://gitcode.com/Byaidu/PDFMathTranslate/overview">
    <img src="https://gitcode.com/Byaidu/PDFMathTranslate/star/badge.svg"></a>
  <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"></a>
  <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate">
    <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a>
  <a href="https://github.com/Byaidu/PDFMathTranslate/pulls">
    <img src="https://img.shields.io/badge/contributions-welcome-green"></a>
  <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
    <img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"></a>
  <!-- License -->
  <a href="./LICENSE">
    <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"></a>
</p>

<a href="https://trendshift.io/repositories/19816" target="_blank"><img src="https://trendshift.io/api/badge/repositories/19816" alt="PDFMathTranslate%2FPDFMathTranslate | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

</div>

<h2 id="updates">1. What does this do?</h2>

Scientific PDF document translation preserving layouts.

- 📊 Preserves formulas, charts, tables of contents, and annotations.
- 🌐 Supports [multiple languages](#usage) and diverse [translation services](#usage).
- 🤖 Provides a [command-line tool](#usage), an [interactive user interface](#install), and [Docker](#install) deployment.

<div align="center">
<img src="./docs/images/preview.gif" width="80%"/>
</div>

<h2 id="updates">2. Recent Updates</h2>

- [March 23, 2026] Experimental support for v2.0 translation kernel using isolated environment (`--mode precise`). (by [@reycn](https://github.com/reycn))
- [March 22, 2026] Supporting MiniMax (PR by [@octo-patch](https://github.com/octo-patch))
- [March 22, 2026] Fixing OpenAI-related issues (PR by [@samqin123](https://github.com/samqin123))
- [March 22, 2026] Fixing HTTP-related issues (PR by [@soukouki](https://github.com/soukouki))
- [March 22, 2026] Faster model loading on macOS and ONNX platforms, GUI startup, version printing, and continuous integration. (by [@reycn](https://github.com/reycn))
- [May 9, 2025] pdf2zh 2.0 Preview Version [#586](https://github.com/Byaidu/PDFMathTranslate/issues/586): The Windows ZIP file and Docker image are now available.

  > [!NOTE]
  >
  > Version 2.0 has moved to a new repository under the organization: [PDFMathTranslate/PDFMathTranslate-next](https://github.com/PDFMathTranslate/PDFMathTranslate-next)
  > 
  > The official 2.0 release has been published.

<h2 id="use-section">3. Use 🌟</h2>
<h3 id="demo">3.1 Online Service 🌟</h3>

You can try our application out using either of the following demos:

- [Public free service](https://pdf2zh.com/) online without installation _(recommended)_.
- [Immersive Translate - BabelDOC](https://app.immersivetranslate.com/babel-doc/) Free usage quota is available; please refer to the FAQ section on the page for details. _(recommended)_
- [Demo hosted on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker)
- [Demo hosted on ModelScope](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate) without installation.

Note that the computing resources of the demo are limited, so please avoid abusing them.

<h3 id="install">3.2 Local Installation</h3>

For different use cases, we provide distinct methods to use our program:

<details open>
  <summary>3.2.1 Python: Install using uv</summary>

1. Python installed (3.11 <= version <= 3.12)

2. Install our package:

   ```bash
   pip install uv
   uv tool install --python 3.12 pdf2zh
   ```

3. Run the translation; the output files are generated in the [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444):

   ```bash
   pdf2zh document.pdf
   ```

</details>
<details>
  <summary>3.2.2 Python: Install using pip</summary>

1. Python installed (3.11 <= version <= 3.12)
2. Install our package:

   ```bash
   pip install pdf2zh
   ```

3. Run the translation; the output files are generated in the [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444):

   ```bash
   pdf2zh document.pdf
   ```

</details>
<details>
  <summary>3.2.3 Python: Graphical user interface</summary>

1. Python installed (3.11 <= version <= 3.12)

2. Install our package:

   ```bash
   pip install pdf2zh
   ```

3. Start using in browser:

   ```bash
   pdf2zh -i
   ```

4. If your browser does not open automatically, go to:

   ```
   http://localhost:7860/
   ```

   <img src="./docs/images/gui.gif" width="500"/>

See [documentation for GUI](./docs/README_GUI.md) for more details.

</details>

<details>
  <summary>3.2.4 Application: On Windows</summary>

1. Download `pdf2zh-version-win64.zip` from the [release page](https://github.com/Byaidu/PDFMathTranslate/releases).

2. Unzip and double-click `pdf2zh.exe` to run.


  > [!TIP]
  >
  > - If you're using Windows and cannot open the file after downloading, please install [vc_redist.x64.exe](https://aka.ms/vs/17/release/vc_redist.x64.exe) and try again.
  > 
</details>


<details>

<summary>3.2.5 Reference manager: Zotero Plugin</summary>


See [Zotero PDF2zh](https://github.com/guaguastandup/zotero-pdf2zh) for more details.

</details>


<details>
  <summary>3.2.6 Docker: Containerized Deployment</summary>

1. Pull and run:

   ```bash
   docker pull byaidu/pdf2zh
   docker run -d -p 7860:7860 byaidu/pdf2zh
   ```

2. Open in browser:

   ```
   http://localhost:7860/
   ```

For Docker deployment on cloud services:

<div>
<a href="https://www.heroku.com/deploy?template=https://github.com/Byaidu/PDFMathTranslate">
  <img src="https://www.herokucdn.com/deploy/button.svg" alt="Deploy" height="26"></a>
<a href="https://render.com/deploy">
  <img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render" height="26"></a>
<a href="https://zeabur.com/templates/5FQIGX?referralCode=reycn">
  <img src="https://zeabur.com/button.svg" alt="Deploy on Zeabur" height="26"></a>
<a href="https://template.sealos.io/deploy?templateName=pdf2zh">
  <img src="https://sealos.io/Deploy-on-Sealos.svg" alt="Deploy on Sealos" height="26"></a>
<a href="https://app.koyeb.com/deploy?type=git&builder=buildpack&repository=github.com/Byaidu/PDFMathTranslate&branch=main&name=pdf-math-translate">
  <img src="https://www.koyeb.com/static/images/deploy/button.svg" alt="Deploy to Koyeb" height="26"></a>
</div>

> [!TIP]
>
> - If you cannot access Docker Hub, please try the image on [GitHub Container Registry](https://github.com/Byaidu/PDFMathTranslate/pkgs/container/pdfmathtranslate).
> ```bash
> docker pull ghcr.io/byaidu/pdfmathtranslate
> docker run -d -p 7860:7860 ghcr.io/byaidu/pdfmathtranslate
> ```
</details>

<details>
  <summary>3.2.* Solutions for network issues in installation</summary>

  Users in specific regions may encounter network difficulties when loading the AI model. The current program relies on the AI model (`wybxc/DocLayout-YOLO-DocStructBench-onnx`), and some users are unable to download it due to these network issues.

  To address issues with downloading this model, use the following environment variable as a workaround:

  ```shell
  set HF_ENDPOINT=https://hf-mirror.com
  ```

  For PowerShell users:

  ```shell
  $env:HF_ENDPOINT = "https://hf-mirror.com"
  ```

  If this workaround does not work for you, or you encounter other issues, please refer to the [Frequently Asked Questions](https://github.com/Byaidu/PDFMathTranslate/wiki#-faq--%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98).
</details>
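
If you launch `pdf2zh` from another Python program, the same workaround can be applied per process instead of per shell. The following is a minimal sketch, assuming the `pdf2zh` command is installed and on your `PATH`; `env_with_hf_mirror` and `run_pdf2zh` are hypothetical helper names, not part of the package.

```python
import os
import subprocess


def env_with_hf_mirror(mirror="https://hf-mirror.com", base_env=None):
    """Return a copy of the environment with HF_ENDPOINT pointing at a mirror."""
    env = dict(os.environ if base_env is None else base_env)
    env["HF_ENDPOINT"] = mirror
    return env


def run_pdf2zh(pdf_path):
    """Run the pdf2zh CLI with the mirror set (assumes pdf2zh is on PATH)."""
    return subprocess.run(["pdf2zh", pdf_path], env=env_with_hf_mirror(), check=True)
```

Only the child process sees the modified variable, so your shell configuration stays untouched.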


<h2 id="usage">4. Technical Details</h2>

### 4.1 Advanced options

Run the translation command in the command line to generate the translated document `example-mono.pdf` and the bilingual document `example-dual.pdf` in the current working directory. Google is the default translation service. More supported translation services are listed [here](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services).

<img src="./docs/images/cmd.explained.png" width="580px"  alt="cmd"/>

In the following table, we list all advanced options for reference:

| Option                | Function                                                                                                      | Example                                        |
| --------------------- | ------------------------------------------------------------------------------------------------------------- | ---------------------------------------------- |
| files                 | Local files                                                                                                   | `pdf2zh ~/local.pdf`                           |
| links                 | Online files                                                                                                  | `pdf2zh http://arxiv.org/paper.pdf`            |
| `-i`                  | [Enter GUI](#gui)                                                                                             | `pdf2zh -i`                                    |
| `-p`                  | [Partial document translation](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#partial) | `pdf2zh example.pdf -p 1`                      |
| `-li`                 | [Source language](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#language)             | `pdf2zh example.pdf -li en`                    |
| `-lo`                 | [Target language](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#language)             | `pdf2zh example.pdf -lo zh`                    |
| `-s`                  | [Translation service](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services)         | `pdf2zh example.pdf -s deepl`                  |
| `-t`                  | [Multi-threads](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#threads)                | `pdf2zh example.pdf -t 1`                      |
| `-o`                  | Output dir                                                                                                    | `pdf2zh example.pdf -o output`                 |
| `-f`, `-c`            | [Exceptions](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#exceptions)                | `pdf2zh example.pdf -f "(MS.*)"`               |
| `-cp`                 | Compatibility Mode                                                                                            | `pdf2zh example.pdf --compatible`              |
| `--skip-subset-fonts` | [Skip font subset](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#font-subset)         | `pdf2zh example.pdf --skip-subset-fonts`       |
| `--ignore-cache`      | [Ignore translate cache](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cache)         | `pdf2zh example.pdf --ignore-cache`            |
| `--share`             | Public link                                                                                                   | `pdf2zh -i --share`                            |
| `--authorized`        | [Authorization](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#auth)                   | `pdf2zh -i --authorized users.txt [auth.html]` |
| `--prompt`            | [Custom Prompt](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#prompt)                 | `pdf2zh --prompt [prompt.txt]`                 |
| `--onnx`              | Use a custom DocLayout-YOLO ONNX model                                                                        | `pdf2zh --onnx [onnx/model/path]`              |
| `--serverport`        | Use a custom WebUI port                                                                                       | `pdf2zh --serverport 7860`                     |
| `--dir`               | Batch translation                                                                                             | `pdf2zh --dir /path/to/translate/`             |
| `--config`            | [configuration file](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cofig)             | `pdf2zh --config /path/to/config/config.json`  |
| `--mode`              | Translation mode: `fast` (default, v1) or `precise` (v2, experimental, requires pdf2zh_next submodule)         | `pdf2zh --mode precise example.pdf`            |
| `--babeldoc`          | Use Experimental backend [BabelDOC](https://funstory-ai.github.io/BabelDOC/) to translate                     | `pdf2zh --babeldoc -s openai example.pdf`      |
| `--mcp`               | Enable MCP STDIO mode                                                                                         | `pdf2zh --mcp`                                 |
| `--sse`               | Enable MCP SSE mode                                                                                           | `pdf2zh --mcp --sse`                           |

For detailed explanations of each option, please refer to our [Advanced Usage](./docs/ADVANCED.md) document.
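
When scripting translations, it can help to assemble the command from named parameters rather than by string concatenation. The sketch below only uses flags listed in the table above; `build_pdf2zh_command` is a hypothetical helper, not part of the package.

```python
def build_pdf2zh_command(pdf, pages=None, lang_in=None, lang_out=None,
                         service=None, threads=None, output_dir=None):
    """Build an argument list for the pdf2zh CLI from optional settings."""
    cmd = ["pdf2zh", pdf]
    if pages:
        cmd += ["-p", pages]          # partial translation, e.g. "1-3,5"
    if lang_in:
        cmd += ["-li", lang_in]       # source language
    if lang_out:
        cmd += ["-lo", lang_out]      # target language
    if service:
        cmd += ["-s", service]        # translation service
    if threads is not None:
        cmd += ["-t", str(threads)]   # number of threads
    if output_dir:
        cmd += ["-o", output_dir]     # output directory
    return cmd
```

The resulting list can be passed directly to `subprocess.run`, which avoids shell quoting problems with file names that contain spaces.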

<h3 id="downstream">4.2 Downstream Development</h3>

For downstream applications, please refer to our document about [API Details](./docs/APIS.md) for further information about:

- [Python API](./docs/APIS.md#api-python), how to use the program in other Python programs
- [HTTP API](./docs/APIS.md#api-http), how to communicate with a server with the program installed

<h3 id="downstream">4.3 Differences between two major forks</h3>

- [Byaidu/PDFMathTranslate](https://github.com/Byaidu/PDFMathTranslate): The present and the original project for stable release.

- [PDFMathTranslate/PDFMathTranslate-next](https://github.com/PDFMathTranslate/PDFMathTranslate-next): A fork with a web UI and additional features. It handles a large number of edge cases, improves PDF compatibility, and optimizes cross-column and cross-page semantic consistency, dynamic scaling, and dynamic scaling consistency, among many other translation quality improvements. However, this fork is intended solely for development: it does not address compatibility issues and is not designed for community contributions.

<h2 id="information">5. Project Information</h2>
<h3 id="citation">5.1 Citation</h3>

This work has been accepted by the [*Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations*](https://aclanthology.org/2025.emnlp-demos.71/) (EMNLP 2025). 

Citation:

```
@inproceedings{ouyang-etal-2025-pdfmathtranslate,
	    title = "{PDFM}ath{T}ranslate: Scientific Document Translation Preserving Layouts",
	    author = "Ouyang, Rongxin  and
	      Chu, Chang  and
	      Xin, Zhikuang  and
	      Ma, Xiangyao",
	    editor = {Habernal, Ivan  and
	      Schulam, Peter  and
	      Tiedemann, J{\"o}rg},
	    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
	    month = nov,
	    year = "2025",
	    address = "Suzhou, China",
	    publisher = "Association for Computational Linguistics",
	    url = "https://aclanthology.org/2025.emnlp-demos.71/",
	    pages = "918--924",
	    ISBN = "979-8-89176-334-0",
	    abstract = "Language barriers in scientific documents hinder the diffusion and development of science and technologies. However, prior efforts in translating such documents largely overlooked the information in layouts. To bridge the gap, we introduce PDFMathTranslate, the world{'}s first open-source software for translating scientific documents while preserving layouts. Leveraging the most recent advances in large language models and precise layout detection, we contribute to the community with key improvements in precision, flexibility, and efficiency. The work is open-sourced at https://github.com/byaidu/pdfmathtranslate with more than 222k downloads."
	}
```
<h3 id="acknowledgement">5.2 Acknowledgement</h3>

- [Immersive Translation](https://immersivetranslate.com) sponsors monthly Pro membership redemption codes for active contributors to this project, see details at: [CONTRIBUTOR_REWARD.md](https://github.com/funstory-ai/BabelDOC/blob/main/docs/CONTRIBUTOR_REWARD.md)

- New backend: [BabelDOC](https://github.com/funstory-ai/BabelDOC)

- Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)

- Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)

- Document extraction: [MinerU](https://github.com/opendatalab/MinerU)

- Document Preview: [Gradio PDF](https://github.com/freddyaboulton/gradio-pdf)

- Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)

- Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)

- Document standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)

- Multilingual Font: [Go Noto Universal](https://github.com/satbyy/go-noto-universal)

<h3 id="contrib">5.3 Contributors</h3>

<a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors">
  <img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" />
</a>

![Alt](https://repobeats.axiom.co/api/embed/dfa7583da5332a11468d686fbd29b92320a6a869.svg "Repobeats analytics image")

For details on how to contribute, please consult the [Contribution Guide](https://github.com/Byaidu/PDFMathTranslate/wiki/Contribution-Guide---%E8%B4%A1%E7%8C%AE%E6%8C%87%E5%8D%97).


<h3 id="star_hist">5.4 Star History</h3>

<a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date">
 <picture>
   <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date&theme=dark" />
   <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date" />
   <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date"/>
 </picture>
</a>


## /app.json

```json path="/app.json" 
{
    "name": "PDFMathTranslate",
    "description": "PDF scientific paper translation and bilingual comparison.",
    "repository": "https://github.com/Byaidu/PDFMathTranslate"
}
```

## /docker-compose.yml

```yml path="/docker-compose.yml" 
# This is the final, recommended configuration.
# It builds a single, self-contained image with all dependencies.

services:
  pdf2zh:
    build:
      context: .
      # All the setup steps are now part of a one-time build process.
      dockerfile_inline: |
        FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim

        WORKDIR /app

        # 1. Install system-level dependencies FIRST.
        # This is what solves the "libGL.so.1 not found" error.
        RUN apt-get update && \
            apt-get install --no-install-recommends -y libgl1 libglib2.0-0 libxext6 libsm6 libxrender1 && \
            rm -rf /var/lib/apt/lists/*

        # 2. Copy only the dependency file and install Python packages.
        # This layer is cached and only re-runs if pyproject.toml changes.
        COPY pyproject.toml .
        RUN uv pip install --system --no-cache -r pyproject.toml

        # 3. Copy the rest of your application code.
        COPY . .

        # 4. Install the local package and perform final updates/warmups.
        RUN uv pip install --system --no-cache . && \
            uv pip install --system --no-cache -U "babeldoc<0.3.0" "pymupdf<1.25.3" "pdfminer-six==20250416" && \
            babeldoc --warmup

    # The rest of the configuration is for RUNNING the built image.
    ports:
      - "7860:7860"

    environment:
      - PYTHONUNBUFFERED=1
      # The UV_LINK_MODE warning happens during build, so we can set it there if needed,
      # but it's generally harmless.

    command: ["pdf2zh", "-i"]

    # Optional: Mount a volume for persistent data I/O if needed
    # volumes:
    #   - ./data:/app/data

    stdin_open: true
    tty: true
```

## /docs/ADVANCED.md

[**Documentation**](https://github.com/Byaidu/PDFMathTranslate) > **Advanced Usage** _(current)_

---

<h3 id="toc">Table of Contents</h3>

- [Full / partial translation](#partial)
- [Specify source and target languages](#language)
- [Translate with different services](#services)
- [Translate with exceptions](#exceptions)
- [Multi-threads](#threads)
- [Custom prompt](#prompt)
- [Authorization](#auth)
- [Custom configuration file](#cofig)
- [Font subsetting](#fonts-subset)
- [Translation cache](#cache)

---

<h3 id="partial">Full / partial translation</h3>

- Entire document

  ```bash
  pdf2zh example.pdf
  ```

- Part of the document

  ```bash
  pdf2zh example.pdf -p 1-3,5
  ```
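
To make the page specification concrete, here is a minimal sketch of how a string such as `1-3,5` can be expanded into individual page numbers. It assumes 1-based page numbers and inclusive ranges; `parse_page_spec` is a hypothetical helper, and the tool's own parser may differ in details.

```python
def parse_page_spec(spec):
    """Expand a spec like "1-3,5" into a list of page numbers.

    Assumption: pages are 1-based and ranges are inclusive.
    """
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            pages.extend(range(int(start), int(end) + 1))
        else:
            pages.append(int(part))
    return pages
```

Under these assumptions, `-p 1-3,5` selects pages 1, 2, 3, and 5.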

[⬆️ Back to top](#toc)

---

<h3 id="language">Specify source and target languages</h3>

See [Google Language Codes](https://developers.google.com/admin-sdk/directory/v1/languages) and [DeepL Language Codes](https://developers.deepl.com/docs/resources/supported-languages).

```bash
pdf2zh example.pdf -li en -lo ja
```

[⬆️ Back to top](#toc)

---

<h3 id="services">Translate with different services</h3>

We've provided a detailed table on the required [environment variables](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) for each translation service. Make sure to set them before using the respective service.

| **Translator**       | **Service**    | **Environment Variables**                                             | **Default Values**                                       | **Notes**                                                                                                                                                                                                 |
|----------------------|----------------|-----------------------------------------------------------------------|----------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Google (Default)** | `google`       | None                                                                  | N/A                                                      | None                                                                                                                                                                                                      |
| **Bing**             | `bing`         | None                                                                  | N/A                                                      | None                                                                                                                                                                                                      |
| **302.AI**           | `302ai`       | `X302AI_API_KEY`, `X302AI_MODEL`                                         | `[Your Key]`, `Gemma-7B` | See [302.AI](https://share.302.ai/tqTWfD)                                                                                                                                                   |
| **OpenAI**           | `openai`       | `OPENAI_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_MODEL`, `OPENAI_STOP_TOKENS`, `OPENAI_MAX_TOKENS` | `https://api.openai.com/v1`, `[Your Key]`, `gpt-4o-mini`, ` `, `-1` | See [OpenAI](https://platform.openai.com/docs/overview)                                                                                                                                                   |
| **DeepL**            | `deepl`        | `DEEPL_AUTH_KEY`                                                      | `[Your Key]`                                             | See [DeepL](https://support.deepl.com/hc/en-us/articles/360020695820-API-Key-for-DeepL-s-API)                                                                                                             |
| **DeepLX**           | `deeplx`       | `DEEPLX_ENDPOINT`                                                     | `https://api.deepl.com/translate`                        | See [DeepLX](https://github.com/OwO-Network/DeepLX)                                                                                                                                                       |
| **Ollama**           | `ollama`       | `OLLAMA_HOST`, `OLLAMA_MODEL`                                         | `http://127.0.0.1:11434`, `gemma2`                       | See [Ollama](https://github.com/ollama/ollama)                                                                                                                                                            |
| **Xinference**       | `xinference`   | `XINFERENCE_HOST`, `XINFERENCE_MODEL`                                 | `http://127.0.0.1:9997`, `gemma-2-it`                    | See [Xinference](https://github.com/xorbitsai/inference)                                                                                                                                                                                        |
| **AzureOpenAI**      | `azure-openai` | `AZURE_OPENAI_BASE_URL`, `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_MODEL` | `[Your Endpoint]`, `[Your Key]`, `gpt-4o-mini`           | See [Azure OpenAI](https://learn.microsoft.com/zh-cn/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Cjavascript-keyless%2Ctypescript-keyless%2Cpython&pivots=programming-language-python) |
| **Zhipu**            | `zhipu`        | `ZHIPU_API_KEY`, `ZHIPU_MODEL`                                        | `[Your Key]`, `glm-4-flash`                              | See [Zhipu](https://open.bigmodel.cn/dev/api/thirdparty-frame/openai-sdk)                                                                                                                                 |
| **ModelScope**       | `modelscope`   | `MODELSCOPE_API_KEY`, `MODELSCOPE_MODEL`                              | `[Your Key]`, `Qwen/Qwen2.5-Coder-32B-Instruct`          | See [ModelScope](https://www.modelscope.cn/docs/model-service/API-Inference/intro)                                                                                                                        |
| **Silicon**          | `silicon`      | `SILICON_API_KEY`, `SILICON_MODEL`                                    | `[Your Key]`, `Qwen/Qwen2.5-7B-Instruct`                 | See [SiliconCloud](https://docs.siliconflow.cn/quickstart)                                                                                                                                                |
| **Gemini**           | `gemini`       | `GEMINI_API_KEY`, `GEMINI_MODEL`                                      | `[Your Key]`, `gemini-1.5-flash`                         | See [Gemini](https://ai.google.dev/gemini-api/docs/openai)                                                                                                                                                |
| **Azure**            | `azure`        | `AZURE_ENDPOINT`, `AZURE_API_KEY`                                     | `https://api.translator.azure.cn`, `[Your Key]`          | See [Azure](https://docs.azure.cn/en-us/ai-services/translator/text-translation-overview)                                                                                                                 |
| **Tencent**          | `tencent`      | `TENCENTCLOUD_SECRET_ID`, `TENCENTCLOUD_SECRET_KEY`                   | `[Your ID]`, `[Your Key]`                                | See [Tencent](https://www.tencentcloud.com/products/tmt?from_qcintl=122110104)                                                                                                                            |
| **Dify**             | `dify`         | `DIFY_API_URL`, `DIFY_API_KEY`                                        | `[Your DIFY URL]`, `[Your Key]`                          | See [Dify](https://github.com/langgenius/dify). Three variables (`lang_out`, `lang_in`, and `text`) need to be defined in Dify's workflow input.                                                          |
| **AnythingLLM**      | `anythingllm`  | `AnythingLLM_URL`, `AnythingLLM_APIKEY`                               | `[Your AnythingLLM URL]`, `[Your Key]`                   | See [anything-llm](https://github.com/Mintplex-Labs/anything-llm)                                                                                                                                         |
|**Argos Translate**|`argos`| | |See [argos-translate](https://github.com/argosopentech/argos-translate)|
|**Grok**|`grok`| `GROK_API_KEY`, `GROK_MODEL`, `GROK_BASE_URL` (optional) | `[Your GROK_API_KEY]`, `grok-2-1212`, `https://api.x.ai/v1` |See [Grok](https://docs.x.ai/docs/overview). **Note:** When using custom proxy, ensure `GROK_BASE_URL` ends with `/v1` (e.g., `http://your-proxy:8000/v1`)|
|**Groq**|`groq`| `GROQ_API_KEY`, `GROQ_MODEL` | `[Your GROQ_API_KEY]`, `llama-3-3-70b-versatile` |See [Groq](https://console.groq.com/docs/models)|
|**DeepSeek**|`deepseek`| `DEEPSEEK_API_KEY`, `DEEPSEEK_MODEL` | `[Your DEEPSEEK_API_KEY]`, `deepseek-chat` |See [DeepSeek](https://www.deepseek.com/)|
|**MiniMax**|`minimax`| `MINIMAX_API_KEY`, `MINIMAX_MODEL` | `[Your MINIMAX_API_KEY]`, `MiniMax-M2.7` |See [MiniMax](https://platform.minimaxi.com/)|
|**OpenAI-Liked**|`openailiked`| `OPENAILIKED_BASE_URL`, `OPENAILIKED_API_KEY`, `OPENAILIKED_MODEL`, `OPENAILIKED_STOP_TOKENS`, `OPENAILIKED_MAX_TOKENS` | `url`, `[Your Key]`, `model name`, ` `, `-1` | None |
|**Ali Qwen Translation**|`qwen-mt`| `ALI_MODEL`, `ALI_API_KEY`, `ALI_DOMAINS` | `qwen-mt-turbo`, `[Your Key]`, `scientific paper` | Traditional Chinese is not yet supported; it will be translated into Simplified Chinese. For more, see [Qwen MT](https://bailian.console.aliyun.com/?spm=5176.28197581.0.0.72e329a4HRxe99#/model-market/detail/qwen-mt-turbo) |

For large language models that are compatible with the OpenAI API but not listed in the table above, you can set environment variables using the same method outlined for OpenAI in the table.

Use `-s service` or `-s service:model` to specify the service:

```bash
pdf2zh example.pdf -s openai:gpt-4o-mini
```

Or specify the model with an environment variable:

```bash
set OPENAI_MODEL=gpt-4o-mini
pdf2zh example.pdf -s openai
```

For PowerShell users:

```shell
$env:OPENAI_MODEL = "gpt-4o-mini"
pdf2zh example.pdf -s openai
```
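
The `service:model` form can be read as "everything before the first colon is the service, the rest is the model". The sketch below illustrates that reading only; `split_service` is a hypothetical helper, not a function from `pdf2zh`, and the tool's actual parsing may differ.

```python
from typing import Optional, Tuple


def split_service(spec: str) -> Tuple[str, Optional[str]]:
    """Split a `-s` value such as "openai:gpt-4o-mini" into (service, model).

    Hypothetical helper for illustration; not part of pdf2zh.
    """
    # Split on the first colon only, so model names that contain a colon survive.
    service, sep, model = spec.partition(":")
    return service, (model if sep else None)


print(split_service("openai:gpt-4o-mini"))
print(split_service("openai"))
```

When no model is given, the second element is `None`, which corresponds to falling back to the `*_MODEL` environment variable or the default listed in the table.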

[⬆️ Back to top](#toc)

---

<h3 id="exceptions">Translate with exceptions</h3>

Use regex to specify formula fonts and characters that need to be preserved:

```bash
pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
```

By default, `Latex`, `Mono`, `Code`, `Italic`, `Symbol` and `Math` fonts are preserved, which is equivalent to:

```bash
pdf2zh example.pdf -f "(CM[^R]|MS.M|XY|MT|BL|RM|EU|LA|RS|LINE|LCIRCLE|TeX-|rsfs|txsy|wasy|stmary|.*Mono|.*Code|.*Ital|.*Sym|.*Math)"
```
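
To check what a pattern would preserve before running a translation, the expressions can be tried against sample font names and characters with Python's `re` module. This is a sketch under the assumption that patterns are matched from the start of the font name (as `re.match` does); the helper names and the sample font names are illustrative and not taken from `pdf2zh`.

```python
import re

# Patterns copied from the first command above.
FONT_PATTERN = r"(CM[^RT].*|MS.*|.*Ital)"
CHAR_PATTERN = r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"


def is_formula_font(name: str) -> bool:
    """True if the font name matches the formula-font pattern (assumed re.match semantics)."""
    return re.match(FONT_PATTERN, name) is not None


def is_formula_char(ch: str) -> bool:
    """True if the character matches the preserved-character pattern."""
    return re.match(CHAR_PATTERN, ch) is not None


for font in ["CMMI10", "CMR10", "MSBM10", "Times-Italic", "Helvetica"]:
    print(font, is_formula_font(font))

for ch in ["=", "7", "α", "a"]:
    print(repr(ch), is_formula_char(ch))
```

With these patterns, Computer Modern Roman (`CMR10`) is treated as ordinary text while math italic (`CMMI10`) is preserved, because `[^RT]` excludes names whose third letter is `R` or `T`.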

[⬆️ Back to top](#toc)

---

<h3 id="threads">Multi-threads</h3>

Use `-t` to specify how many threads to use in translation:

```bash
pdf2zh example.pdf -t 1
```

[⬆️ Back to top](#toc)

---

<h3 id="prompt">Custom prompt</h3>

Note: System prompt is currently not supported. See [this change](https://github.com/Byaidu/PDFMathTranslate/pull/637).

Use `--prompt` to specify the prompt file used by the LLM:

```bash
pdf2zh example.pdf --prompt prompt.txt
```

For example:

```txt
You are a professional, authentic machine translation engine. Only Output the translated text, do not include any other text.

Translate the following markdown source text to ${lang_out}. Keep the formula notation {v*} unchanged. Output translation directly without any additional text.

Source Text: ${text}

Translated Text:
```

A custom prompt file can use three variables:

|**Variable**|**Description**|
|-|-|
|`lang_in`|input language|
|`lang_out`|output language|
|`text`|text to be translated|
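
The `${name}` placeholders follow the same syntax as Python's `string.Template`. The sketch below shows how such a prompt could be filled in; it assumes `string.Template`-style substitution, and `render_prompt` is a hypothetical helper rather than the function `pdf2zh` uses.

```python
from string import Template

# Shortened version of the example prompt above.
PROMPT_TEXT = (
    "Translate the following markdown source text to ${lang_out}. "
    "Keep the formula notation {v*} unchanged.\n\n"
    "Source Text: ${text}\n\n"
    "Translated Text:"
)


def render_prompt(template_text: str, lang_in: str, lang_out: str, text: str) -> str:
    """Fill the three documented variables; unknown placeholders are left untouched."""
    return Template(template_text).safe_substitute(
        lang_in=lang_in, lang_out=lang_out, text=text
    )


rendered = render_prompt(PROMPT_TEXT, "en", "zh", "Energy is {v1} in this frame.")
print(rendered)
```

Because only `$`-prefixed names are substituted, formula markers such as `{v1}` pass through unchanged.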

[⬆️ Back to top](#toc)

---

<h3 id="auth">Authorization</h3>

Use `--authorized` to specify which users may access the Web UI and to customize the login page:

```bash
pdf2zh example.pdf --authorized users.txt auth.html
```

Example `users.txt` (each line contains a username and a password, separated by a comma):

```
admin,123456
user1,password1
user2,abc123
guest,guest123
test,test123
```
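
The file format above can be read into a list of `(username, password)` tuples, which is the shape Gradio's `launch(auth=...)` accepts. The following is a minimal sketch; `parse_users` is a hypothetical helper, and whether `pdf2zh` parses the file exactly this way is an assumption.

```python
from typing import List, Tuple


def parse_users(text: str) -> List[Tuple[str, str]]:
    """Parse "username,password" lines into (username, password) tuples."""
    users = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first comma only, so passwords may contain commas.
        username, _, password = line.partition(",")
        users.append((username.strip(), password.strip()))
    return users


sample = "admin,123456\nuser1,password1\n\nguest,guest123\n"
print(parse_users(sample))
```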

Example `auth.html`:

```html
<!DOCTYPE html>
<html>
<head>
    <title>Simple HTML</title>
</head>
<body>
    <h1>Hello, World!</h1>
    <p>Welcome to my simple HTML page.</p>
</body>
</html>
```

[⬆️ Back to top](#toc)

---

<h3 id="cofig">Custom configuration file</h3>

Use `--config` to specify the configuration file for PDFMathTranslate:

```bash
pdf2zh example.pdf --config config.json
```

```bash
pdf2zh -i --config config.json
```

Example `config.json`:

> **⚠️ Important:** When using OpenAI-compatible APIs or custom proxies (like Grok, OpenAI-liked, etc.), ensure the `BASE_URL` ends with `/v1` (e.g., `https://api.openai.com/v1` or `http://your-proxy:8000/v1`). Missing the `/v1` suffix will result in 404 errors.

```json
{
    "USE_MODELSCOPE": "0",
    "PDF2ZH_LANG_FROM": "English",
    "PDF2ZH_LANG_TO": "Simplified Chinese",
    "NOTO_FONT_PATH": "/app/SourceHanSerifCN-Regular.ttf",
    "translators": [
        {
            "name": "deeplx",
            "envs": {
                "DEEPLX_ENDPOINT": "http://localhost:1188/translate/",
                "DEEPLX_ACCESS_TOKEN": null
            }
        },
        {
            "name": "ollama",
            "envs": {
                "OLLAMA_HOST": "http://127.0.0.1:11434",
                "OLLAMA_MODEL": "gemma2"
            }
        },
        {
            "name": "grok",
            "envs": {
                "GROK_BASE_URL": "https://api.x.ai/v1",
                "GROK_API_KEY": "your-api-key",
                "GROK_MODEL": "grok-2-1212"
            }
        }
    ]
}
```

By default, the config file is saved at `~/.config/PDFMathTranslate/config.json`. On startup, the program first reads `config.json` and then the environment variables. When an environment variable is set, its value takes precedence and is written back to the file.
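
The precedence rule can be sketched as follows. This illustrates the described behavior only and is not the project's `config.py`; `resolve_setting` and the in-memory dictionaries are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_setting(
    name: str,
    file_config: Dict[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the effective value of a setting: environment first, then file.

    When the environment provides a value, it is also stored in file_config,
    mirroring the "file is updated" behavior described above.
    """
    environ = os.environ if environ is None else environ
    if name in environ:
        file_config[name] = environ[name]
    return file_config.get(name)


# Stand-ins for config.json contents and the process environment.
config = {"OLLAMA_MODEL": "gemma2"}
env = {"OLLAMA_MODEL": "llama3"}

print(resolve_setting("OLLAMA_MODEL", config, env))
print(config)
```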

[⬆️ Back to top](#toc)

---

<h3 id="font-subset">Font subsetting</h3>

By default, PDFMathTranslate subsets fonts to reduce the size of output files. Use the `--skip-subset-fonts` option to disable font subsetting if you encounter compatibility issues.

```bash
pdf2zh example.pdf --skip-subset-fonts
```

[⬆️ Back to top](#toc)

---

<h3 id="cache">Translation cache</h3>

PDFMathTranslate caches translated text to speed up translation and avoid unnecessary API calls for identical content. Use the `--ignore-cache` option to ignore the translation cache and force retranslation.

```bash
pdf2zh example.pdf --ignore-cache
```

[⬆️ Back to top](#toc)

---

<h3 id="public-services">Deployment as a public service</h3>

PDFMathTranslate supports **enabling only selected services** and **hiding backend information** through the configuration file. Enable these features by setting `ENABLED_SERVICES` and `HIDDEN_GRADIO_DETAILS`:

- `ENABLED_SERVICES` allows you to choose to enable only certain options, limiting the number of available services.
- `HIDDEN_GRADIO_DETAILS` will hide the real API_KEY on the web, preventing users from obtaining server-side keys.

A usable configuration is as follows:

> **⚠️ Important:** The `BASE_URL` must end with `/v1` for OpenAI-compatible APIs.

```json
{
    "USE_MODELSCOPE": "0",
    "translators": [
        {
            "name": "grok",
            "envs": {
                "GROK_BASE_URL": "https://api.x.ai/v1",
                "GROK_API_KEY": "your-api-key",
                "GROK_MODEL": "grok-2-1212"
            }
        },
        {
            "name": "openai",
            "envs": {
                "OPENAI_BASE_URL": "https://api.openai.com/v1",
                "OPENAI_API_KEY": "sk-xxxx",
                "OPENAI_MODEL": "gpt-4o-mini"
            }
        }
    ],
    "ENABLED_SERVICES": [
        "OpenAI",
        "Grok"
    ],
    "HIDDEN_GRADIO_DETAILS": true,
    "PDF2ZH_LANG_FROM": "English",
    "PDF2ZH_LANG_TO": "Simplified Chinese",
    "NOTO_FONT_PATH": "/app/SourceHanSerifCN-Regular.ttf"
}
```

[⬆️ Back to top](#toc)


---

<h3 id="mcp">MCP</h3>

PDFMathTranslate can run as an MCP server. To use it, install the package with `uv pip install pdf2zh` and configure `claude_desktop_config.json`. An example configuration:

``` json
{
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": [
                "-y",
                "@modelcontextprotocol/server-filesystem",
                "/path/to/Document"
            ]
        },
        "translate_pdf": {
            "command": "uv",
            "args": [
                "run",
                "pdf2zh",
                "--mcp"
            ]
        }
    }
}
```

[filesystem](https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem) is a required MCP server used to locate the PDF file, and `translate_pdf` is our MCP server.

To test whether the MCP server works, open Claude Desktop and ask:

```
find the `test.pdf` in my Document folder and translate it to Chinese
```


## /docs/APIS.md

[**Documentation**](https://github.com/Byaidu/PDFMathTranslate) > **API Details** _(current)_

<h2 id="toc">Table of Contents</h2>

This project supports two types of APIs. All methods require Redis.

- [Functional calls in Python](#api-python)
- [HTTP protocols](#api-http)

---

<h2 id="api-python">Python</h2>

Because `pdf2zh` is an installed Python module, it exposes two methods that other programs can call from any Python script.

For example, to translate a document from English to Chinese using Google Translate, you can use the following code:

```python
from pdf2zh import translate, translate_stream

params = {
    'lang_in': 'en',
    'lang_out': 'zh',
    'service': 'google',
    'thread': 4,
}
```
Translate with files:
```python
(file_mono, file_dual) = translate(files=['example.pdf'], **params)[0]
```
Translate with stream:
```python
with open('example.pdf', 'rb') as f:
    (stream_mono, stream_dual) = translate_stream(stream=f.read(), **params)
```

[⬆️ Back to top](#toc)

---

<h2 id="api-http">HTTP</h2>

For more flexibility, you can communicate with the program over HTTP:

1. Install and run the backend:

   ```bash
   pip install pdf2zh[backend]
   pdf2zh --flask
   pdf2zh --celery worker
   ```

2. Call the HTTP API as follows:

   - Submit translate task

     ```bash
     curl http://localhost:11008/v1/translate -F "file=@example.pdf" -F "data={\"lang_in\":\"en\",\"lang_out\":\"zh\",\"service\":\"google\",\"thread\":4}"
     {"id":"d9894125-2f4e-45ea-9d93-1a9068d2045a"}
     ```

   - Check Progress

     ```bash
     curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a
     {"info":{"n":13,"total":506},"state":"PROGRESS"}
     ```

   - Check Progress _(if finished)_

     ```bash
     curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a
     {"state":"SUCCESS"}
     ```

   - Save monolingual file

     ```bash
     curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/mono --output example-mono.pdf
     ```

   - Save bilingual file

     ```bash
     curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/dual --output example-dual.pdf
     ```

   - Interrupt if running and delete the task
     ```bash
     curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a -X DELETE
     ```
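
The polling and download steps above can also be scripted. The sketch below uses only the Python standard library and the endpoints shown in the `curl` examples; the helper names are hypothetical, and the terminal states other than `SUCCESS` are an assumption based on Celery's standard state names, not something this document specifies.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:11008/v1/translate"

# SUCCESS appears in the examples above; FAILURE and REVOKED are assumed
# from Celery's standard task states.
TERMINAL_STATES = {"SUCCESS", "FAILURE", "REVOKED"}


def task_url(task_id: str, suffix: str = "") -> str:
    """Build the URL for a task, optionally with a "mono" or "dual" suffix."""
    url = f"{BASE_URL}/{task_id}"
    return f"{url}/{suffix}" if suffix else url


def wait_for_task(task_id: str, interval: float = 2.0) -> dict:
    """Poll the task status until it reaches a terminal state."""
    while True:
        with urllib.request.urlopen(task_url(task_id)) as resp:
            status = json.load(resp)
        if status.get("state") in TERMINAL_STATES:
            return status
        time.sleep(interval)


def download(task_id: str, kind: str, path: str) -> None:
    """Save the "mono" or "dual" result of a finished task to a file."""
    with urllib.request.urlopen(task_url(task_id, kind)) as resp:
        with open(path, "wb") as out:
            out.write(resp.read())


# Example usage (requires the backend from step 1 to be running):
# status = wait_for_task("d9894125-2f4e-45ea-9d93-1a9068d2045a")
# if status["state"] == "SUCCESS":
#     download("d9894125-2f4e-45ea-9d93-1a9068d2045a", "mono", "example-mono.pdf")
```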

[⬆️ Back to top](#toc)

---


## /docs/CODE_OF_CONDUCT.md

# Contributor Covenant Code of Conduct

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our
community include:

* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
  and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
  overall community

Examples of unacceptable behavior include:

* The use of sexualized language or imagery, and sexual attention or
  advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
  address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
  professional setting

## Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.

Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.

## Scope

This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
 .
All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the
reporter of any incident.

## Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.

### 2. Warning

**Community Impact**: A violation through a single incident or series
of actions.

**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.

### 3. Temporary Ban

**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.

**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.

### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior,  harassment of an
individual, or aggression toward or disparagement of classes of individuals.

**Consequence**: A permanent ban from any sort of public interaction within
the community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.


## /docs/PROXY_CONFIGURATION.md

# Proxy Configuration Guide

This guide explains how to configure PDFMathTranslate to work with custom OpenAI-compatible proxies.

## Quick Start

### Using Grok via Custom Proxy

1. Edit your configuration file:
   ```bash
   nano ~/.config/PDFMathTranslate/config.json
   ```

2. Add the grok translator configuration:
   ```json
   {
       "translators": [
           {
               "name": "grok",
               "envs": {
                   "GROK_BASE_URL": "http://your-proxy:8000/v1",
                   "GROK_API_KEY": "your-api-key",
                   "GROK_MODEL": "grok-4",
                   "GROK_STREAM": "false"
               }
           }
       ]
   }
   ```

3. Run translation:
   ```bash
   pdf2zh input.pdf --service grok -o ./output
   ```

## Configuration Options

### Grok Translator

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `GROK_BASE_URL` | No | `https://api.x.ai/v1` | API endpoint URL |
| `GROK_API_KEY` | Yes | - | API authentication key |
| `GROK_MODEL` | No | `grok-2-1212` | Model name |
| `GROK_STREAM` | No | `true` | Enable streaming mode |

### OpenAIliked Translator

For custom proxies that don't support streaming:

```json
{
    "name": "openailiked",
    "envs": {
        "OPENAILIKED_BASE_URL": "http://your-proxy:8000/v1",
        "OPENAILIKED_API_KEY": "your-api-key",
        "OPENAILIKED_MODEL": "grok-4",
        "OPENAILIKED_STREAM": "false"
    }
}
```

### OpenAI Translator

```json
{
    "name": "openai",
    "envs": {
        "OPENAI_BASE_URL": "https://api.openai.com/v1",
        "OPENAI_API_KEY": "your-api-key",
        "OPENAI_MODEL": "gpt-4o-mini",
        "OPENAI_STREAM": "true"
    }
}
```

## Troubleshooting

### Error: "Model not found"

**Cause**: The proxy doesn't recognize the model name.

**Solution**: Check available models:
```bash
curl http://your-proxy:8000/v1/models \
  -H "Authorization: Bearer your-api-key"
```

Then update `GROK_MODEL` to a valid model name.

### Error: "'str' object has no attribute 'choices'"

**Cause**: The proxy returns streaming format, but the code expects non-streaming.

**Solution**: Set `*_STREAM` to `"false"`:
```json
{
    "GROK_STREAM": "false"
}
```

### Error: "Connection error" or "404 Not Found"

**Cause**: Incorrect base URL.

**Solution**: Verify the URL ends with `/v1`:
```json
{
    "GROK_BASE_URL": "http://your-proxy:8000/v1"
}
```
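
A quick pre-flight check can catch a missing `/v1` suffix before any request is sent. This is a minimal sketch; `check_base_url` is a hypothetical helper and is not part of `pdf2zh`.

```python
def check_base_url(url: str) -> str:
    """Return the URL without a trailing slash, or raise if it lacks the /v1 suffix."""
    normalized = url.rstrip("/")
    if not normalized.endswith("/v1"):
        raise ValueError(f"Base URL should end with /v1, got: {url!r}")
    return normalized


print(check_base_url("http://your-proxy:8000/v1/"))
```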

### Error: "Missing authentication token"

**Cause**: API key not configured or incorrect.

**Solution**: Verify your API key in the configuration.

## Example Configurations

### grok2api Proxy

```json
{
    "name": "grok",
    "envs": {
        "GROK_BASE_URL": "http://104.248.73.236:8000/v1",
        "GROK_API_KEY": "xiaoyibao@1234",
        "GROK_MODEL": "grok-4",
        "GROK_STREAM": "false"
    }
}
```

### Official X.AI (Streaming Enabled)

```json
{
    "name": "grok",
    "envs": {
        "GROK_API_KEY": "your-xai-api-key",
        "GROK_MODEL": "grok-2-1212",
        "GROK_STREAM": "true"
    }
}
```

### Local Ollama (via OpenAI-compatible endpoint)

```json
{
    "name": "openailiked",
    "envs": {
        "OPENAILIKED_BASE_URL": "http://localhost:11434/v1",
        "OPENAILIKED_API_KEY": "ollama",
        "OPENAILIKED_MODEL": "llama3",
        "OPENAILIKED_STREAM": "false"
    }
}
```

## Environment Variables

You can also set configuration via environment variables:

```bash
export GROK_BASE_URL="http://your-proxy:8000/v1"
export GROK_API_KEY="your-api-key"
export GROK_MODEL="grok-4"
export GROK_STREAM="false"

pdf2zh input.pdf --service grok
```

Environment variables take priority over `config.json` settings.

## Complete Example

```json
{
    "USE_MODELSCOPE": "0",
    "PDF2ZH_LANG_FROM": "English",
    "PDF2ZH_LANG_TO": "Simplified Chinese",
    "translators": [
        {
            "name": "grok",
            "envs": {
                "GROK_BASE_URL": "http://your-proxy:8000/v1",
                "GROK_API_KEY": "your-api-key",
                "GROK_MODEL": "grok-4",
                "GROK_STREAM": "false"
            }
        },
        {
            "name": "openailiked",
            "envs": {
                "OPENAILIKED_BASE_URL": "http://your-proxy:8000/v1",
                "OPENAILIKED_API_KEY": "your-api-key",
                "OPENAILIKED_MODEL": "grok-4",
                "OPENAILIKED_STREAM": "false"
            }
        }
    ],
    "ENABLED_SERVICES": ["grok", "openailiked"]
}
```

## Notes

- Streaming mode (`"true"`) feels more responsive because output arrives incrementally, but it may not work with some proxies
- Non-streaming mode (`"false"`) is more compatible but waits for the complete response
- Always include `/v1` at the end of your base URL
- Use the `openailiked` service for maximum compatibility with custom proxies


## /docs/README_GUI.md

# Interact with GUI

This subfolder provides the GUI mode of `pdf2zh`.

## Usage

1. Run `pdf2zh -i`

2. Drop the PDF file into the window and click `Translate`.

### Environment Variables

You can set the source and target languages using environment variables:

- `PDF2ZH_LANG_FROM`: Sets the source language. Defaults to "English".
- `PDF2ZH_LANG_TO`: Sets the target language. Defaults to "Simplified Chinese".

### Supported Languages

The following languages are supported:

- English
- Simplified Chinese
- Traditional Chinese
- French
- German
- Japanese
- Korean
- Russian
- Spanish
- Italian

## Preview

<img src="./images/before.png" width="500"/>
<img src="./images/after.png" width="500"/>

## Maintenance

GUI maintained by [Rongxin](https://github.com/reycn)


## /docs/README_ja-JP.md

<div align="center">

[English](../README.md) | [简体中文](README_zh-CN.md) | [繁體中文](README_zh-TW.md) | 日本語

<img src="./images/banner.png" width="320px"  alt="PDF2ZH"/>  

<h2 id="title">PDFMathTranslate</h2>

<p>
  <!-- PyPI -->
  <a href="https://pypi.org/project/pdf2zh/">
    <img src="https://img.shields.io/pypi/v/pdf2zh"/></a>
  <a href="https://pepy.tech/projects/pdf2zh">
    <img src="https://static.pepy.tech/badge/pdf2zh"></a>
  <a href="https://hub.docker.com/repository/docker/byaidu/pdf2zh">
    <img src="https://img.shields.io/docker/pulls/byaidu/pdf2zh"></a>
  <!-- License -->
  <a href="./LICENSE">
    <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a>
  <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"/></a>
  <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate">
    <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a>
  <a href="https://github.com/Byaidu/PDFMathTranslate/pulls">
    <img src="https://img.shields.io/badge/contributions-welcome-green"/></a>
  <a href="https://gitcode.com/Byaidu/PDFMathTranslate/overview">
    <img src="https://gitcode.com/Byaidu/PDFMathTranslate/star/badge.svg"></a>
  <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
    <img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"/></a>
</p>

<a href="https://trendshift.io/repositories/12424" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12424" alt="Byaidu%2FPDFMathTranslate | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

</div>

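A tool for translating scientific PDF documents and comparing them bilingually

- 📊 Preserves formulas, charts, table of contents, and annotations *([preview](#preview))*
- 🌐 Supports [multiple languages](#language) and [diverse translation services](#services)
- 🤖 Provides a [command-line tool](#usage), an [interactive user interface](#gui), and [Docker](#docker)

Send feedback via [GitHub Issues](https://github.com/Byaidu/PDFMathTranslate/issues) or the [Telegram group](https://t.me/+Z9_SgnxmsmA5NzBl).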

<h2 id="updates">Recent Updates</h2>

- [Nov. 26, 2024] The CLI now supports online files *(by [@reycn](https://github.com/reycn))*  
- [Nov. 24, 2024] Added [ONNX](https://github.com/onnx/onnx) support to reduce dependency size *(by [@Wybxc](https://github.com/Wybxc))*  
- [Nov. 23, 2024] 🌟 The [public service](#demo) is online! *(by [@Byaidu](https://github.com/Byaidu))*  
- [Nov. 23, 2024] Added a firewall to block web bots *(by [@Byaidu](https://github.com/Byaidu))*  
- [Nov. 22, 2024] The GUI now supports Italian and has been improved *(by [@Byaidu](https://github.com/Byaidu), [@reycn](https://github.com/reycn))*  
- [Nov. 22, 2024] You can now share your deployed service with others *(by [@Zxis233](https://github.com/Zxis233))*  
- [Nov. 22, 2024] Added support for Tencent Translation *(by [@hellofinch](https://github.com/hellofinch))*  
- [Nov. 21, 2024] The GUI now supports downloading bilingual documents *(by [@reycn](https://github.com/reycn))*  
- [Nov. 20, 2024] 🌟 The [demo](#demo) is online! *(by [@reycn](https://github.com/reycn))*  

<h2 id="preview">Preview</h2>

<div align="center">
<img src="./images/preview.gif" width="80%"/>
</div>

<h2 id="demo">Public Service 🌟</h2>

### Free service (<https://pdf2zh.com/>)

You can try the [public service](https://pdf2zh.com/) online without installing anything.  

### Demo

You can try the [demo on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) or the [demo on ModelScope](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate) without installing anything.
The demos have limited computing resources, so please do not abuse them.

<h2 id="install">Installation and Usage</h2>

There are four ways to use this project: [command line](#cmd), [portable](#portable), [GUI](#gui), and [Docker](#docker).

Running pdf2zh requires an additional model (`wybxc/DocLayout-YOLO-DocStructBench-onnx`), which is also available on ModelScope. If downloading this model fails at startup, set the following environment variable:

```shell
set HF_ENDPOINT=https://hf-mirror.com
```

For PowerShell users:
```shell
$env:HF_ENDPOINT = "https://hf-mirror.com"
```

<h3 id="cmd">Method 1. Command line</h3>

  1. Python installed (3.11 <= version <= 3.12)
  2. Install the package:

      ```bash
      pip install pdf2zh
      ```

  3. Run the translation; files are generated in the [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444):

      ```bash
      pdf2zh document.pdf
      ```

<h3 id="portable">Method 2. Portable</h3>

No pre-installed Python environment is required.

Download [setup.bat](https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/script/setup.bat) and double-click it to run.

<h3 id="gui">Method 3. GUI</h3>

1. Python installed (3.11 <= version <= 3.12)
2. Install the package:

      ```bash
      pip install pdf2zh
      ```

3. Start using it in the browser:

      ```bash
      pdf2zh -i
      ```

4. If the browser does not open automatically, go to:

    ```bash
    http://localhost:7860/
    ```

    <img src="./images/gui.gif" width="500"/>

For more details, see the [GUI documentation](./README_GUI.md).

<h3 id="docker">Method 4. Docker</h3>

1. Pull and run:

    ```bash
    docker pull byaidu/pdf2zh
    docker run -d -p 7860:7860 byaidu/pdf2zh
    ```

2. Open in the browser:

    ```
    http://localhost:7860/
    ```

For Docker deployment on cloud services:

<div>
<a href="https://www.heroku.com/deploy?template=https://github.com/Byaidu/PDFMathTranslate">
  <img src="https://www.herokucdn.com/deploy/button.svg" alt="Deploy" height="26"></a>
<a href="https://render.com/deploy">
  <img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render" height="26"></a>
<a href="https://zeabur.com/templates/5FQIGX?referralCode=reycn">
  <img src="https://zeabur.com/button.svg" alt="Deploy on Zeabur" height="26"></a>
<a href="https://app.koyeb.com/deploy?type=git&builder=buildpack&repository=github.com/Byaidu/PDFMathTranslate&branch=main&name=pdf-math-translate">
  <img src="https://www.koyeb.com/static/images/deploy/button.svg" alt="Deploy to Koyeb" height="26"></a>
</div>

<h2 id="usage">Advanced Options</h2>

Run the translation command on the command line to generate the translated document `example-mono.pdf` and the bilingual document `example-dual.pdf` in the current working directory. Google Translate is used by default. More supported translation services are listed [HERE](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services).


<img src="./images/cmd.explained.png" width="580px"  alt="cmd"/>  

The table below lists all advanced options for reference:

| Option    | Function | Example |
| -------- | ------- |------- |
| files | Local files |  `pdf2zh ~/local.pdf` |
| links | Online files |  `pdf2zh http://arxiv.org/paper.pdf` |
| `-i`  | [Enter GUI](#gui) |  `pdf2zh -i` |
| `-p`  | [Partial document translation](#partial) |  `pdf2zh example.pdf -p 1` |
| `-li` | [Source language](#language) |  `pdf2zh example.pdf -li en` |
| `-lo` | [Target language](#language) |  `pdf2zh example.pdf -lo zh` |
| `-s`  | [Translation service](#services) |  `pdf2zh example.pdf -s deepl` |
| `-t`  | [Multi-threads](#threads) | `pdf2zh example.pdf -t 1` |
| `-o`  | Output directory | `pdf2zh example.pdf -o output` |
| `-f`, `-c` | [Exceptions](#exceptions) | `pdf2zh example.pdf -f "(MS.*)"` |
| `--share` | Get a Gradio public link | `pdf2zh -i --share` |
| `--authorized` | [Add web authorization and a custom authorization page](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.) | `pdf2zh -i --authorized users.txt [auth.html]` |
| `--prompt` | Use a custom LLM prompt | `pdf2zh --prompt [prompt.txt]` |
| `--onnx` | Use a custom DocLayout-YOLO ONNX model | `pdf2zh --onnx [onnx/model/path]` |
| `--serverport` | Use a custom WebUI port | `pdf2zh --serverport 7860` |
| `--dir` | Batch translate | `pdf2zh --dir /path/to/translate/` |
| `--config` | [Configuration file](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cofig) | `pdf2zh --config /path/to/config/config.json` |

<h3 id="partial">Full or partial document translation</h3>

- **Full translation**

```bash
pdf2zh example.pdf
```

- **Partial translation**

```bash
pdf2zh example.pdf -p 1-3,5
```
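
A page specification such as `1-3,5` combines inclusive ranges and single pages. The sketch below shows one way to expand such a string into page numbers; `parse_pages` is a hypothetical helper, and how `pdf2zh` itself interprets the values internally (for example, zero-based indexing) is not covered here.

```python
from typing import List


def parse_pages(spec: str) -> List[int]:
    """Expand a page spec like "1-3,5" into a list of page numbers."""
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # "a-b" is an inclusive range
            start, end = part.split("-", 1)
            pages.extend(range(int(start), int(end) + 1))
        else:
            pages.append(int(part))
    return pages


print(parse_pages("1-3,5"))
```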

<h3 id="language">Specify source and target languages</h3>

See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v1/languages) and [DeepL Languages Codes](https://developers.deepl.com/docs/resources/supported-languages).

```bash
pdf2zh example.pdf -li en -lo ja
```

<h3 id="services">Translate with different services</h3>

The table below lists the [environment variables](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) required by each translation service. Set them before using the corresponding service.

|**Translator**|**Service**|**Environment Variables**|**Default Values**|**Notes**|
|-|-|-|-|-|
|**Google (Default)**|`google`|None|N/A|None|
|**Bing**|`bing`|None|N/A|None|
|**DeepL**|`deepl`|`DEEPL_AUTH_KEY`|`[Your Key]`|See [DeepL](https://support.deepl.com/hc/en-us/articles/360020695820-API-Key-for-DeepL-s-API)|
|**DeepLX**|`deeplx`|`DEEPLX_ENDPOINT`|`https://api.deepl.com/translate`|See [DeepLX](https://github.com/OwO-Network/DeepLX)|
|**Ollama**|`ollama`|`OLLAMA_HOST`, `OLLAMA_MODEL`|`http://127.0.0.1:11434`, `gemma2`|See [Ollama](https://github.com/ollama/ollama)|
|**OpenAI**|`openai`|`OPENAI_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_MODEL`, `OPENAI_STOP_TOKENS`, `OPENAI_MAX_TOKENS` |`https://api.openai.com/v1`, `[Your Key]`, `gpt-4o-mini`, ` `, `-1`|See [OpenAI](https://platform.openai.com/docs/overview)|
|**AzureOpenAI**|`azure-openai`|`AZURE_OPENAI_BASE_URL`, `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_MODEL`|`[Your Endpoint]`, `[Your Key]`, `gpt-4o-mini`|See [Azure OpenAI](https://learn.microsoft.com/zh-cn/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Cjavascript-keyless%2Ctypescript-keyless%2Cpython&pivots=programming-language-python)|
|**Zhipu**|`zhipu`|`ZHIPU_API_KEY`, `ZHIPU_MODEL`|`[Your Key]`, `glm-4-flash`|See [Zhipu](https://open.bigmodel.cn/dev/api/thirdparty-frame/openai-sdk)|
| **ModelScope**       | `modelscope`   |`MODELSCOPE_API_KEY`, `MODELSCOPE_MODEL`|`[Your Key]`, `Qwen/Qwen2.5-Coder-32B-Instruct`| See [ModelScope](https://www.modelscope.cn/docs/model-service/API-Inference/intro)|
|**Silicon**|`silicon`|`SILICON_API_KEY`, `SILICON_MODEL`|`[Your Key]`, `Qwen/Qwen2.5-7B-Instruct`|See [SiliconCloud](https://docs.siliconflow.cn/quickstart)|
|**Gemini**|`gemini`|`GEMINI_API_KEY`, `GEMINI_MODEL`|`[Your Key]`, `gemini-1.5-flash`|See [Gemini](https://ai.google.dev/gemini-api/docs/openai)|
|**Azure**|`azure`|`AZURE_ENDPOINT`, `AZURE_API_KEY`|`https://api.translator.azure.cn`, `[Your Key]`|See [Azure](https://docs.azure.cn/en-us/ai-services/translator/text-translation-overview)|
|**Tencent**|`tencent`|`TENCENTCLOUD_SECRET_ID`, `TENCENTCLOUD_SECRET_KEY`|`[Your ID]`, `[Your Key]`|See [Tencent](https://www.tencentcloud.com/products/tmt?from_qcintl=122110104)|
|**Dify**|`dify`|`DIFY_API_URL`, `DIFY_API_KEY`|`[Your DIFY URL]`, `[Your Key]`|See [Dify](https://github.com/langgenius/dify). Three variables, `lang_out`, `lang_in`, and `text`, need to be defined in Dify's workflow input.|
|**AnythingLLM**|`anythingllm`|`AnythingLLM_URL`, `AnythingLLM_APIKEY`|`[Your AnythingLLM URL]`, `[Your Key]`|See [anything-llm](https://github.com/Mintplex-Labs/anything-llm)|
|**Argos Translate**|`argos`| | |See [argos-translate](https://github.com/argosopentech/argos-translate)|
|**Grok**|`grok`| `GROK_API_KEY`, `GROK_MODEL` | `[Your GROK_API_KEY]`, `grok-2-1212` |See [Grok](https://docs.x.ai/docs/overview)|
|**DeepSeek**|`deepseek`| `DEEPSEEK_API_KEY`, `DEEPSEEK_MODEL` | `[Your DEEPSEEK_API_KEY]`, `deepseek-chat` |See [DeepSeek](https://www.deepseek.com/)|
|**MiniMax**|`minimax`| `MINIMAX_API_KEY`, `MINIMAX_MODEL` | `[Your MINIMAX_API_KEY]`, `MiniMax-M2.7` |See [MiniMax](https://platform.minimaxi.com/)|
|**OpenAI-Liked**|`openailiked`| `OPENAILIKED_BASE_URL`, `OPENAILIKED_API_KEY`, `OPENAILIKED_MODEL`, `OPENAILIKED_STOP_TOKENS`, `OPENAILIKED_MAX_TOKENS` | `url`, `[Your Key]`, `model name`, ` `, `-1` | None |

For large language models that are compatible with the OpenAI API but not listed in the table above, you can set environment variables using the same method outlined for OpenAI in the table.

Use `-s service` or `-s service:model` to specify the service:

```bash
pdf2zh example.pdf -s openai:gpt-4o-mini
```

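Or specify the model with an environment variable (Windows Command Prompt syntax shown; in a POSIX shell use `export OPENAI_MODEL=gpt-4o-mini` instead):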

```bash
set OPENAI_MODEL=gpt-4o-mini
pdf2zh example.pdf -s openai
```

For PowerShell users:

```shell
$env:OPENAI_MODEL = "gpt-4o-mini"
pdf2zh example.pdf -s openai
```
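
The `service:model` form can be read as a service name followed by an optional model override. A minimal Python sketch of that split, assuming the first colon is the separator; `split_service_spec` is a hypothetical helper for illustration, not a function provided by pdf2zh:

```python
from typing import Optional, Tuple


def split_service_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split "service" or "service:model" into (service, model)."""
    # Hypothetical helper: split only on the first colon so that model
    # names which themselves contain a colon stay intact.
    service, sep, model = spec.partition(":")
    return service, (model if sep and model else None)


print(split_service_spec("openai:gpt-4o-mini"))
print(split_service_spec("google"))
```

When no model is given, the sketch returns `None` for the model, which corresponds to falling back to the default or to the environment variable described above.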

<h3 id="exceptions">Translate with exceptions</h3>

Use regular expressions to specify the formula fonts and characters that must be preserved:

```bash
pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
```

By default, the `Latex`, `Mono`, `Code`, `Italic`, `Symbol`, and `Math` fonts are preserved:

```bash
pdf2zh example.pdf -f "(CM[^R]|MS.M|XY|MT|BL|RM|EU|LA|RS|LINE|LCIRCLE|TeX-|rsfs|txsy|wasy|stmary|.*Mono|.*Code|.*Ital|.*Sym|.*Math)"
```
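
To see which names such patterns would catch before running a translation, the expressions can be tried in Python. This is a sketch under the assumption that the patterns are applied with `re.match` (anchored at the start of the string); the font names are illustrative examples, and `is_formula_font` and `is_formula_char` are hypothetical helpers, not part of pdf2zh:

```python
import re

# Patterns copied from the first -f / -c command above.
FONT_PATTERN = r"(CM[^RT].*|MS.*|.*Ital)"
CHAR_PATTERN = r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"


def is_formula_font(font_name: str) -> bool:
    # Assumption: matching starts at the beginning of the font name.
    return re.match(FONT_PATTERN, font_name) is not None


def is_formula_char(ch: str) -> bool:
    return re.match(CHAR_PATTERN, ch) is not None


for name in ["CMMI10", "CMR10", "Times-Italic"]:
    print(name, is_formula_font(name))
for ch in ["=", "7", "α", "a"]:
    print(repr(ch), is_formula_char(ch))
```

Under these assumptions `CMR10` is not treated as a formula font because `[^RT]` excludes the `R` after `CM`, while `CMMI10` and `Times-Italic` are.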

<h3 id="threads">Specify the number of threads</h3>

Use `-t` to specify the number of threads used for translation:

```bash
pdf2zh example.pdf -t 1
```

<h3 id="prompt">Custom prompt</h3>

Use `--prompt` to specify the prompt used by the LLM:

```bash
pdf2zh example.pdf --prompt prompt.txt
```


Example `prompt.txt`:

```txt
[
    {
        "role": "system",
        "content": "You are a professional,authentic machine translation engine.",
    },
    {
        "role": "user",
        "content": "Translate the following markdown source text to ${lang_out}. Keep the formula notation {{v*}} unchanged. Output translation directly without any additional text.\nSource Text: ${text}\nTranslated Text:",
    },
]
```


The following three variables are available in a custom prompt file:

|**Variable**|**Description**|
|-|-|
|`lang_in`|Source language|
|`lang_out`|Target language|
|`text`|Text to translate|
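
The `${name}` placeholders look like the syntax of Python's `string.Template`. The sketch below shows how the three variables could be filled in under that assumption; the source does not state which substitution mechanism pdf2zh uses, and the sample values are made up for illustration:

```python
from string import Template

# Same user message as in the prompt.txt example above.
user_template = Template(
    "Translate the following markdown source text to ${lang_out}. "
    "Keep the formula notation {{v*}} unchanged. "
    "Output translation directly without any additional text.\n"
    "Source Text: ${text}\nTranslated Text:"
)

# safe_substitute leaves unknown placeholders untouched instead of raising.
message = user_template.safe_substitute(
    lang_in="en", lang_out="ja", text="The gradient {{v1}} vanishes."
)
print(message)
```

The `{{v*}}` markers contain no `$`, so they pass through the substitution unchanged.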

<h2 id="todo">API</h2>

### Python

```python
from pdf2zh import translate, translate_stream

params = {"lang_in": "en", "lang_out": "zh", "service": "google", "thread": 4}
file_mono, file_dual = translate(files=["example.pdf"], **params)[0]
with open("example.pdf", "rb") as f:
    stream_mono, stream_dual = translate_stream(stream=f.read(), **params)
```

### HTTP

```bash
pip install pdf2zh[backend]
pdf2zh --flask
pdf2zh --celery worker
```

```bash
curl http://localhost:11008/v1/translate -F "file=@example.pdf" -F "data={\"lang_in\":\"en\",\"lang_out\":\"zh\",\"service\":\"google\",\"thread\":4}"
{"id":"d9894125-2f4e-45ea-9d93-1a9068d2045a"}

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a
{"info":{"n":13,"total":506},"state":"PROGRESS"}

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a
{"state":"SUCCESS"}

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/mono --output example-mono.pdf

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/dual --output example-dual.pdf

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a -X DELETE
```
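
The status responses above can be reduced to a single progress value on the client side. A small sketch, assuming only the two response shapes shown in the transcript (`PROGRESS` with `info.n` and `info.total`, and `SUCCESS`); `progress_fraction` is a hypothetical helper, and other states the server may return are not covered here:

```python
import json
from typing import Optional


def progress_fraction(body: str) -> Optional[float]:
    """Return the completed fraction for a status response, 1.0 on success."""
    status = json.loads(body)
    if status.get("state") == "SUCCESS":
        return 1.0
    info = status.get("info")
    if status.get("state") == "PROGRESS" and info and info.get("total"):
        return info["n"] / info["total"]
    return None  # state not covered by this sketch


print(progress_fraction('{"info":{"n":13,"total":506},"state":"PROGRESS"}'))
print(progress_fraction('{"state":"SUCCESS"}'))
```

A client could poll the status URL, pass each response body to this function, and download the `mono` and `dual` files once it returns `1.0`.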

<h2 id="acknowledgement">Acknowledgements</h2>

- Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)

- Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)

- Document extraction: [MinerU](https://github.com/opendatalab/MinerU)

- Document preview: [Gradio PDF](https://github.com/freddyaboulton/gradio-pdf)

- Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)

- Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)

- Document standards: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)

- Multilingual font: [Go Noto Universal](https://github.com/satbyy/go-noto-universal)

<h2 id="contrib">Contributors</h2>

<a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors">
  <img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" />
</a>

![Alt](https://repobeats.axiom.co/api/embed/dfa7583da5332a11468d686fbd29b92320a6a869.svg "Repobeats analytics image")

<h2 id="star_hist">Star history</h2>

<a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date">
 <picture>
   <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date&theme=dark" />
   <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date" />
   <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date"/>
 </picture>
</a>


## /docs/README_ko-KR.md

<div align="center">

[English](../README.md) | [简体中文](README_zh-CN.md) | [繁體中文](README_zh-TW.md) | [日本語](README_ja-JP.md) | 한국어

<img src="./images/banner.png" width="320px"  alt="PDF2ZH"/>

<h2 id="title">PDFMathTranslate</h2>

<p>
  <!-- PyPI -->
  <a href="https://pypi.org/project/pdf2zh/">
    <img src="https://img.shields.io/pypi/v/pdf2zh"/></a>
  <a href="https://pepy.tech/projects/pdf2zh">
    <img src="https://static.pepy.tech/badge/pdf2zh"></a>
  <a href="https://hub.docker.com/repository/docker/byaidu/pdf2zh">
    <img src="https://img.shields.io/docker/pulls/byaidu/pdf2zh"></a>
  <!-- License -->
  <a href="./LICENSE">
    <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a>
  <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"/></a>
  <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate">
    <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a>
  <a href="https://github.com/Byaidu/PDFMathTranslate/pulls">
    <img src="https://img.shields.io/badge/contributions-welcome-green"/></a>
  <a href="https://gitcode.com/Byaidu/PDFMathTranslate/overview">
    <img src="https://gitcode.com/Byaidu/PDFMathTranslate/star/badge.svg"></a>
  <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
    <img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"/></a>
</p>

<a href="https://trendshift.io/repositories/12424" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12424" alt="Byaidu%2FPDFMathTranslate | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

</div>

A tool for translating scientific PDF documents with bilingual comparison.

- 📊 Preserves formulas, charts, tables of contents, and annotations _([preview](#preview))_
- 🌐 Supports [multiple languages](#language) and [various translation services](#services)
- 🤖 Provides a [command-line tool](#usage), an [interactive user interface](#gui), and [Docker](#docker)

Please send feedback through [GitHub Issues](https://github.com/Byaidu/PDFMathTranslate/issues) or the [Telegram group](https://t.me/+Z9_SgnxmsmA5NzBl).

<h2 id="updates">Recent updates</h2>

- [Dec. 24, 2024] Added support for local LLMs run by [Xinference](https://github.com/xorbitsai/inference) _(by [@imClumsyPanda](https://github.com/imClumsyPanda))_
- [Nov. 26, 2024] The CLI now supports online files _(by [@reycn](https://github.com/reycn))_
- [Nov. 24, 2024] Added [ONNX](https://github.com/onnx/onnx) support to reduce dependency size _(by [@Wybxc](https://github.com/Wybxc))_
- [Nov. 23, 2024] 🌟 The [free public service](#demo) is online! _(by [@Byaidu](https://github.com/Byaidu))_
- [Nov. 23, 2024] Added a firewall to block web bots _(by [@Byaidu](https://github.com/Byaidu))_
- [Nov. 22, 2024] The GUI now supports Italian and has been improved _(by [@Byaidu](https://github.com/Byaidu), [@reycn](https://github.com/reycn))_
- [Nov. 22, 2024] Deployed services can now be shared with others _(by [@Zxis233](https://github.com/Zxis233))_
- [Nov. 22, 2024] Tencent translation support _(by [@hellofinch](https://github.com/hellofinch))_
- [Nov. 21, 2024] The GUI now supports downloading bilingual documents _(by [@reycn](https://github.com/reycn))_
- [Nov. 20, 2024] 🌟 The [demo](#demo) is online! _(by [@reycn](https://github.com/reycn))_

<h2 id="preview">Preview</h2>

<div align="center">
<img src="./images/preview.gif" width="80%"/>
</div>

<h2 id="demo">Public service 🌟</h2>

### Free service (<https://pdf2zh.com/>)

You can try the [free public service](https://pdf2zh.com/) online without installing anything.

### Demo

You can try the [demo on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) and the [demo on ModelScope](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate) without installing anything.
The demos have limited computing resources, so please do not abuse them.

<h2 id="install">Installation and usage</h2>

The project can be used in four ways: [command-line tool](#cmd), [portable](#portable), [GUI](#gui), and [Docker](#docker).

Running pdf2zh requires an additional model (`wybxc/DocLayout-YOLO-DocStructBench-onnx`). The model is also available on ModelScope. If downloading it fails at startup, set the following environment variable:

```shell
set HF_ENDPOINT=https://hf-mirror.com
```

For PowerShell users:

```shell
$env:HF_ENDPOINT = "https://hf-mirror.com"
```
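
When pdf2zh is driven from a Python script rather than a shell, the same variable can be set in-process. A minimal sketch, assuming the download library reads `HF_ENDPOINT` when it is imported, so the assignment has to come before importing anything that fetches the model:

```python
import os

# Set the mirror before importing anything that downloads the model;
# setdefault keeps a value that was already exported in the shell.
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")
print(os.environ["HF_ENDPOINT"])
```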

<h3 id="cmd">Method 1. Command-line tool</h3>

1. Python must be installed (3.11 <= version <= 3.12)
2. Install the package:

   ```bash
   pip install pdf2zh
   ```

3. Run the translation; the output files are written to the [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444):

   ```bash
   pdf2zh document.pdf
   ```

<h3 id="portable">Method 2. Portable</h3>

No pre-installed Python environment is required.

Download [setup.bat](https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/script/setup.bat) and double-click it to run.

<h3 id="gui">Method 3. GUI</h3>

1. Python must be installed (3.11 <= version <= 3.12)
2. Install the package:

   ```bash
   pip install pdf2zh
   ```

3. Start using it in the browser:

   ```bash
   pdf2zh -i
   ```

4. If the browser does not open automatically, open the following URL:

   ```
   http://localhost:7860/
   ```

   <img src="./images/gui.gif" width="500"/>

See the [GUI documentation](./README_GUI.md) for details.

<h3 id="docker">Method 4. Docker</h3>

1. Pull and run:

   ```bash
   docker pull byaidu/pdf2zh
   docker run -d -p 7860:7860 byaidu/pdf2zh
   ```

2. Open in the browser:

   ```
   http://localhost:7860/
   ```

For Docker deployment on cloud services:

<div>
<a href="https://www.heroku.com/deploy?template=https://github.com/Byaidu/PDFMathTranslate">
  <img src="https://www.herokucdn.com/deploy/button.svg" alt="Deploy" height="26"></a>
<a href="https://render.com/deploy">
  <img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render" height="26"></a>
<a href="https://zeabur.com/templates/5FQIGX?referralCode=reycn">
  <img src="https://zeabur.com/button.svg" alt="Deploy on Zeabur" height="26"></a>
<a href="https://app.koyeb.com/deploy?type=git&builder=buildpack&repository=github.com/Byaidu/PDFMathTranslate&branch=main&name=pdf-math-translate">
  <img src="https://www.koyeb.com/static/images/deploy/button.svg" alt="Deploy to Koyeb" height="26"></a>
</div>

<h2 id="usage">Advanced options</h2>

Run the translation command from the command line to generate the translated document `example-mono.pdf` and the bilingual document `example-dual.pdf` in the current working directory. Google Translate is used by default. More supported translation services are listed [here](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services).

<img src="./images/cmd.explained.png" width="580px"  alt="cmd"/>

The following table lists all advanced options for reference:

| Option         | Function                                                                                                                          | Example                                        |
| -------------- | --------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------- |
| files          | Local files                                                                                                                       | `pdf2zh ~/local.pdf`                           |
| links          | Online files                                                                                                                      | `pdf2zh http://arxiv.org/paper.pdf`            |
| `-i`           | [Enter GUI](#gui)                                                                                                                 | `pdf2zh -i`                                    |
| `-p`           | [Partial document translation](#partial)                                                                                          | `pdf2zh example.pdf -p 1`                      |
| `-li`          | [Source language](#language)                                                                                                      | `pdf2zh example.pdf -li en`                    |
| `-lo`          | [Target language](#language)                                                                                                      | `pdf2zh example.pdf -lo zh`                    |
| `-s`           | [Translation service](#services)                                                                                                  | `pdf2zh example.pdf -s deepl`                  |
| `-t`           | [Multi-threading](#threads)                                                                                                       | `pdf2zh example.pdf -t 1`                      |
| `-o`           | Output directory                                                                                                                  | `pdf2zh example.pdf -o output`                 |
| `-f`, `-c`     | [Exceptions](#exceptions)                                                                                                         | `pdf2zh example.pdf -f "(MS.*)"`               |
| `--share`      | Get a public Gradio link                                                                                                          | `pdf2zh -i --share`                            |
| `--authorized` | [Add web authentication and a custom authentication page](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.)    | `pdf2zh -i --authorized users.txt [auth.html]` |
| `--prompt`     | Use a custom prompt for the large model                                                                                           | `pdf2zh --prompt [prompt.txt]`                 |
| `--onnx`       | Use a custom DocLayout-YOLO ONNX model                                                                                            | `pdf2zh --onnx [onnx/model/path]`              |
| `--serverport` | Use a custom WebUI port                                                                                                           | `pdf2zh --serverport 7860`                     |
| `--dir`        | Batch translation                                                                                                                 | `pdf2zh --dir /path/to/translate/`             |
| `--config`     | [Configuration file](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cofig)                                 | `pdf2zh --config /path/to/config/config.json`  |

<h3 id="partial">Translate the full or partial document</h3>

- **Full translation**

```bash
pdf2zh example.pdf
```

- **Partial translation**

```bash
pdf2zh example.pdf -p 1-3,5
```
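
A page specification such as `1-3,5` combines ranges and single pages. The sketch below expands one into explicit page numbers, assuming ranges are inclusive and pages are numbered from 1; `parse_pages` is a hypothetical helper for illustration, and the forms pdf2zh itself accepts may differ:

```python
from typing import List


def parse_pages(spec: str) -> List[int]:
    """Expand a page spec such as "1-3,5" into a list of page numbers."""
    pages: List[int] = []
    for part in spec.split(","):
        if "-" in part:
            # Assumption: "a-b" means every page from a to b inclusive.
            start, end = part.split("-", 1)
            pages.extend(range(int(start), int(end) + 1))
        else:
            pages.append(int(part))
    return pages


print(parse_pages("1-3,5"))  # [1, 2, 3, 5]
```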

<h3 id="language">Specify source and target languages</h3>

See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v1/languages) and [DeepL Languages Codes](https://developers.deepl.com/docs/resources/supported-languages).

```bash
pdf2zh example.pdf -li en -lo ko
```

<h3 id="services">Translate with different services</h3>

The following table lists the [environment variables](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) required by each translation service. Set these variables before using the corresponding service.

| **Translator** | **Service** | **Environment Variables** | **Default Values** | **Notes** |
| -------------- | ----------- | ------------------------- | ------------------ | --------- |
| **Google (Default)** | `google` | None | N/A | None |
| **Bing** | `bing` | None | N/A | None |
| **DeepL** | `deepl` | `DEEPL_AUTH_KEY` | `[Your Key]` | See [DeepL](https://support.deepl.com/hc/en-us/articles/360020695820-API-Key-for-DeepL-s-API) |
| **DeepLX** | `deeplx` | `DEEPLX_ENDPOINT` | `https://api.deepl.com/translate` | See [DeepLX](https://github.com/OwO-Network/DeepLX) |
| **Ollama** | `ollama` | `OLLAMA_HOST`, `OLLAMA_MODEL` | `http://127.0.0.1:11434`, `gemma2` | See [Ollama](https://github.com/ollama/ollama) |
| **OpenAI** | `openai` | `OPENAI_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_MODEL`, `OPENAI_STOP_TOKENS`, `OPENAI_MAX_TOKENS` | `https://api.openai.com/v1`, `[Your Key]`, `gpt-4o-mini`, ` `, `-1` | See [OpenAI](https://platform.openai.com/docs/overview) |
| **AzureOpenAI** | `azure-openai` | `AZURE_OPENAI_BASE_URL`, `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_MODEL` | `[Your Endpoint]`, `[Your Key]`, `gpt-4o-mini` | See [Azure OpenAI](https://learn.microsoft.com/zh-cn/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Cjavascript-keyless%2Ctypescript-keyless%2Cpython&pivots=programming-language-python) |
| **Zhipu** | `zhipu` | `ZHIPU_API_KEY`, `ZHIPU_MODEL` | `[Your Key]`, `glm-4-flash` | See [Zhipu](https://open.bigmodel.cn/dev/api/thirdparty-frame/openai-sdk) |
| **ModelScope** | `modelscope` | `MODELSCOPE_API_KEY`, `MODELSCOPE_MODEL` | `[Your Key]`, `Qwen/Qwen2.5-Coder-32B-Instruct` | See [ModelScope](https://www.modelscope.cn/docs/model-service/API-Inference/intro) |
| **Silicon** | `silicon` | `SILICON_API_KEY`, `SILICON_MODEL` | `[Your Key]`, `Qwen/Qwen2.5-7B-Instruct` | See [SiliconCloud](https://docs.siliconflow.cn/quickstart) |
| **Gemini** | `gemini` | `GEMINI_API_KEY`, `GEMINI_MODEL` | `[Your Key]`, `gemini-1.5-flash` | See [Gemini](https://ai.google.dev/gemini-api/docs/openai) |
| **Azure** | `azure` | `AZURE_ENDPOINT`, `AZURE_API_KEY` | `https://api.translator.azure.cn`, `[Your Key]` | See [Azure](https://docs.azure.cn/en-us/ai-services/translator/text-translation-overview) |
| **Tencent** | `tencent` | `TENCENTCLOUD_SECRET_ID`, `TENCENTCLOUD_SECRET_KEY` | `[Your ID]`, `[Your Key]` | See [Tencent](https://www.tencentcloud.com/products/tmt?from_qcintl=122110104) |
| **Dify** | `dify` | `DIFY_API_URL`, `DIFY_API_KEY` | `[Your DIFY URL]`, `[Your Key]` | See [Dify](https://github.com/langgenius/dify). Three variables, `lang_out`, `lang_in`, and `text`, must be defined in Dify's workflow input. |
| **AnythingLLM** | `anythingllm` | `AnythingLLM_URL`, `AnythingLLM_APIKEY` | `[Your AnythingLLM URL]`, `[Your Key]` | See [anything-llm](https://github.com/Mintplex-Labs/anything-llm) |
| **Argos Translate** | `argos` | | | See [argos-translate](https://github.com/argosopentech/argos-translate) |
| **Grok** | `grok` | `GORK_API_KEY`, `GORK_MODEL` | `[Your GORK_API_KEY]`, `grok-2-1212` | See [Grok](https://docs.x.ai/docs/overview) |
| **DeepSeek** | `deepseek` | `DEEPSEEK_API_KEY`, `DEEPSEEK_MODEL` | `[Your DEEPSEEK_API_KEY]`, `deepseek-chat` | See [DeepSeek](https://www.deepseek.com/) |
| **MiniMax** | `minimax` | `MINIMAX_API_KEY`, `MINIMAX_MODEL` | `[Your MINIMAX_API_KEY]`, `MiniMax-M2.7` | See [MiniMax](https://platform.minimaxi.com/) |
| **OpenAI-Liked** | `openailiked` | `OPENAILIKED_BASE_URL`, `OPENAILIKED_API_KEY`, `OPENAILIKED_MODEL` | `url`, `[Your Key]`, `model name` | None |
| **OpenAI-Liked** | `openailiked` | `OPENAILIKED_BASE_URL`, `OPENAILIKED_API_KEY`, `OPENAILIKED_MODEL`, `OPENAILIKED_STOP_TOKENS`, `OPENAILIKED_MAX_TOKENS` | `url`, `[Your Key]`, `model name`, ` `, `-1` | None |

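For large language models that are compatible with the OpenAI API but not listed in the table above, set the environment variables in the same way as for OpenAI in the table.

Use `-s service` or `-s service:model` to specify the translation service: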

```bash
pdf2zh example.pdf -s openai:gpt-4o-mini
```

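Or specify the model with an environment variable (Windows Command Prompt syntax shown; in a POSIX shell use `export OPENAI_MODEL=gpt-4o-mini` instead):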

```bash
set OPENAI_MODEL=gpt-4o-mini
pdf2zh example.pdf -s openai
```

For PowerShell users:

```shell
$env:OPENAI_MODEL = "gpt-4o-mini"
pdf2zh example.pdf -s openai
```

<h3 id="exceptions">Specify exceptions</h3>

Use regular expressions to specify the formula fonts and characters that must be preserved:

```bash
pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
```

By default, the `Latex`, `Mono`, `Code`, `Italic`, `Symbol`, and `Math` fonts are preserved:

```bash
pdf2zh example.pdf -f "(CM[^R]|MS.M|XY|MT|BL|RM|EU|LA|RS|LINE|LCIRCLE|TeX-|rsfs|txsy|wasy|stmary|.*Mono|.*Code|.*Ital|.*Sym|.*Math)"
```

<h3 id="threads">Specify the number of threads</h3>

Use `-t` to specify the number of threads used for translation:

```bash
pdf2zh example.pdf -t 1
```

<h3 id="prompt">Custom prompt</h3>

Use `--prompt` to specify the prompt used by the LLM:

```bash
pdf2zh example.pdf --prompt prompt.txt
```

Example `prompt.txt`:

```txt
[
    {
        "role": "system",
        "content": "You are a professional,authentic machine translation engine.",
    },
    {
        "role": "user",
        "content": "Translate the following markdown source text to ${lang_out}. Keep the formula notation {{v*}} unchanged. Output translation directly without any additional text.\nSource Text: ${text}\nTranslated Text:",
    },
]
```

The following three variables are available in a custom prompt file:

| **Variable** | **Description**   |
| ------------ | ----------------- |
| `lang_in`    | Source language   |
| `lang_out`   | Target language   |
| `text`       | Text to translate |

<h2 id="todo">API</h2>

### Python

```python
from pdf2zh import translate, translate_stream

params = {"lang_in": "en", "lang_out": "ko", "service": "google", "thread": 4}
file_mono, file_dual = translate(files=["example.pdf"], **params)[0]
with open("example.pdf", "rb") as f:
    stream_mono, stream_dual = translate_stream(stream=f.read(), **params)
```

### HTTP

```bash
pip install pdf2zh[backend]
pdf2zh --flask
pdf2zh --celery worker
```

```bash
curl http://localhost:11008/v1/translate -F "file=@example.pdf" -F "data={\"lang_in\":\"en\",\"lang_out\":\"ko\",\"service\":\"google\",\"thread\":4}"
{"id":"d9894125-2f4e-45ea-9d93-1a9068d2045a"}

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a
{"info":{"n":13,"total":506},"state":"PROGRESS"}

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a
{"state":"SUCCESS"}

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/mono --output example-mono.pdf

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/dual --output example-dual.pdf

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a -X DELETE
```

<h2 id="acknowledgement">Acknowledgements</h2>

- Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)
- Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)
- Document extraction: [MinerU](https://github.com/opendatalab/MinerU)
- Document preview: [Gradio PDF](https://github.com/freddyaboulton/gradio-pdf)
- Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)
- Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
- Document standards: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)
- Multilingual font: [Go Noto Universal](https://github.com/satbyy/go-noto-universal)

<h2 id="contrib">Contributors</h2>

<a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors">
  <img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" />
</a>

![Alt](https://repobeats.axiom.co/api/embed/dfa7583da5332a11468d686fbd29b92320a6a869.svg "Repobeats analytics image")

<h2 id="star_hist">Star history</h2>

<a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date">
 <picture>
   <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date&theme=dark" />
   <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date" />
   <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date"/>
 </picture>
</a>


## /docs/README_zh-CN.md

<div align="center">

[English](../README.md) | 简体中文 | [繁體中文](README_zh-TW.md) | [日本語](README_ja-JP.md)

<img src="./images/banner.png" width="320px"  alt="PDF2ZH"/>  

<h2 id="title">PDFMathTranslate</h2>

<p>
  <!-- PyPI -->
  <a href="https://pypi.org/project/pdf2zh/">
    <img src="https://img.shields.io/pypi/v/pdf2zh"/></a>
  <a href="https://pepy.tech/projects/pdf2zh">
    <img src="https://static.pepy.tech/badge/pdf2zh"></a>
  <a href="https://hub.docker.com/repository/docker/byaidu/pdf2zh">
    <img src="https://img.shields.io/docker/pulls/byaidu/pdf2zh"></a>
  <!-- License -->
  <a href="./LICENSE">
    <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a>
  <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"/></a>
  <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate">
    <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a>
  <a href="https://github.com/Byaidu/PDFMathTranslate/pulls">
    <img src="https://img.shields.io/badge/contributions-welcome-green"/></a>
  <a href="https://gitcode.com/Byaidu/PDFMathTranslate/overview">
    <img src="https://gitcode.com/Byaidu/PDFMathTranslate/star/badge.svg"></a>
  <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
    <img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"/></a>
</p>

<a href="https://trendshift.io/repositories/12424" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12424" alt="Byaidu%2FPDFMathTranslate | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

</div>

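A tool for translating scientific PDF documents with bilingual comparison.

- 📊 Preserves formulas, charts, tables of contents, and annotations *([preview](#preview))*
- 🌐 Supports [multiple languages](./ADVANCED.md#language) and [many translation services](./ADVANCED.md#services)
- 🤖 Provides a [command-line tool](#usage), a [graphical interface](#gui), and [containerized deployment](#docker)

Feedback is welcome in [GitHub Issues](https://github.com/Byaidu/PDFMathTranslate/issues) or the [Telegram group](https://t.me/+Z9_SgnxmsmA5NzBl).

For details on how to contribute, see the [Contribution Guide](https://github.com/Byaidu/PDFMathTranslate/wiki/Contribution-Guide---%E8%B4%A1%E7%8C%AE%E6%8C%87%E5%8D%97).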

<h2 id="updates">Updates</h2>

- [Mar. 23, 2026] Experimental support for the v2.0 translation kernel, run in an isolated environment (`--mode precise`) _(by [@reycn](https://github.com/reycn))_
- [Mar. 22, 2026] MiniMax support _(PR by [@octo-patch](https://github.com/octo-patch))_
- [Mar. 22, 2026] Fixed OpenAI-related issues _(PR by [@samqin123](https://github.com/samqin123))_
- [Mar. 22, 2026] Fixed HTTP-related issues _(PR by [@soukouki](https://github.com/soukouki))_
- [Mar. 22, 2026] Faster model loading on macOS and ONNX platforms, plus faster GUI startup, version printing, and continuous integration _(by [@reycn](https://github.com/reycn))_
- [Feb. 22, 2025] Better release CI and a well-packaged windows-amd64 exe _(by [@awwaawwa](https://github.com/awwaawwa))_
- [Dec. 24, 2024] The translator now supports local models on [Xinference](https://github.com/xorbitsai/inference) _(by [@imClumsyPanda](https://github.com/imClumsyPanda))_

<h2 id="preview">Preview</h2>
<div align="center">
<img src="./images/preview.gif" width="80%"/>
</div>

<h2 id="demo">Online service 🌟</h2>

You can try the application through any of the following demos:

- [Public free service](https://pdf2zh.com/): online, no installation required _(recommended)_.
- [Immersive Translate - BabelDOC](https://app.immersivetranslate.com/babel-doc/): 1000 free pages per month _(recommended)_.
- [Demo hosted on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker)
- [Demo hosted on ModelScope](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate): no installation required.

The demos have limited computing resources, so please avoid abusing them.

<h2 id="install">Installation and usage</h2>

### Methods

We provide different ways to use the program for different use cases:

<details open>
  <summary>1. UV install</summary>

1. Install Python (3.11 <= version <= 3.12)
2. Install the package:

   ```bash
   pip install uv
   uv tool install --python 3.12 pdf2zh
   ```

3. Run the translation; the files are generated in the [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444):

   ```bash
   pdf2zh document.pdf
   ```

</details>

<details>
  <summary>2. Windows exe</summary>

1. Download pdf2zh-version-win64.zip from the [release page](https://github.com/Byaidu/PDFMathTranslate/releases)

2. Unzip it and double-click `pdf2zh.exe` to run.

</details>

<details>
  <summary id="gui">3. Graphical user interface</summary>

1. Install Python (3.11 <= version <= 3.12)
2. Install the package:

   ```bash
   pip install pdf2zh
   ```

3. Start using it in the browser:

   ```bash
   pdf2zh -i
   ```

4. If your browser does not open automatically, visit:

   ```
   http://localhost:7860/
   ```

   <img src="./images/gui.gif" width="500"/>

See the [GUI documentation](./README_GUI.md) for more details.

</details>

<details>
  <summary id="docker">4. Docker</summary>

1. Pull and run:

   ```bash
   docker pull byaidu/pdf2zh
   docker run -d -p 7860:7860 byaidu/pdf2zh
   ```

2. Open in the browser:

   ```
   http://localhost:7860/
   ```

For Docker deployment on cloud services:

<div>
<a href="https://www.heroku.com/deploy?template=https://github.com/Byaidu/PDFMathTranslate">
  <img src="https://www.herokucdn.com/deploy/button.svg" alt="Deploy" height="26"></a>
<a href="https://render.com/deploy">
  <img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render" height="26"></a>
<a href="https://zeabur.com/templates/5FQIGX?referralCode=reycn">
  <img src="https://zeabur.com/button.svg" alt="Deploy on Zeabur" height="26"></a>
<a href="https://template.sealos.io/deploy?templateName=pdf2zh">
  <img src="https://sealos.io/Deploy-on-Sealos.svg" alt="Deploy on Sealos" height="26"></a>
<a href="https://app.koyeb.com/deploy?type=git&builder=buildpack&repository=github.com/Byaidu/PDFMathTranslate&branch=main&name=pdf-math-translate">
  <img src="https://www.koyeb.com/static/images/deploy/button.svg" alt="Deploy to Koyeb" height="26"></a>
</div>

</details>

<details>
  <summary>5. Zotero plugin</summary>

See [Zotero PDF2zh](https://github.com/guaguastandup/zotero-pdf2zh) for more details.

</details>

<details>
  <summary>6. Command line</summary>

1. Install Python (3.11 <= version <= 3.12)
2. Install our package:

   ```bash
   pip install pdf2zh
   ```

3. Run the translation; the output files are generated in the [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444):

   ```bash
   pdf2zh document.pdf
   ```

</details>

> [!TIP]
>
> - If you are on Windows and cannot open the file after downloading, install [vc_redist.x64.exe](https://aka.ms/vs/17/release/vc_redist.x64.exe) and try again.
>
> - If you cannot access Docker Hub, try the image on the [GitHub Container Registry](https://github.com/Byaidu/PDFMathTranslate/pkgs/container/pdfmathtranslate) instead.
> ```bash
> docker pull ghcr.io/byaidu/pdfmathtranslate
> docker run -d -p 7860:7860 ghcr.io/byaidu/pdfmathtranslate
> ```

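### Unable to install?

The program needs an AI model (`wybxc/DocLayout-YOLO-DocStructBench-onnx`) before it can work, and some users cannot download it because of network issues. If you have trouble downloading this model, the following environment variable provides a workaround (Windows Command Prompt syntax):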

```shell
set HF_ENDPOINT=https://hf-mirror.com
```

For PowerShell users:

```shell
$env:HF_ENDPOINT = "https://hf-mirror.com"
```
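
When pdf2zh is driven from a Python script rather than a shell, the same variable can be set in-process. The sketch below is only an illustration: the helper name `use_hf_mirror` is hypothetical, and it assumes the download code reads `HF_ENDPOINT` from the environment after this call, so it should run before `pdf2zh` is imported.

```python
import os

# Mirror endpoint taken from the shell examples above.
HF_MIRROR = "https://hf-mirror.com"


def use_hf_mirror(endpoint: str = HF_MIRROR) -> str:
    """Point Hugging Face downloads at a mirror for the current process only."""
    os.environ["HF_ENDPOINT"] = endpoint
    return os.environ["HF_ENDPOINT"]


# Assumption: call this before importing pdf2zh so the setting is picked up.
use_hf_mirror()
```

The setting lasts only for the current process and its children, which makes it a low-risk way to test whether the mirror solves the download problem.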

If this workaround does not help, or you run into other problems, see the [FAQ](https://github.com/Byaidu/PDFMathTranslate/wiki#-faq--%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98).


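<h2 id="usage">Advanced Options</h2>

Run the translation command from the command line to generate the translated document `example-mono.pdf` and the bilingual document `example-dual.pdf` in the current working directory. Google Translate is used by default; more supported services are listed [here](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services).

<img src="./images/cmd.explained.png" width="580px"  alt="cmd"/>  

The table below lists all advanced options for reference: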

| Option         | Function                                                                                                             | Example                                        |
| -------------- | -------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------- |
| files          | Local files                                                                                                          | `pdf2zh ~/local.pdf`                           |
| links          | Online files                                                                                                         | `pdf2zh http://arxiv.org/paper.pdf`            |
| `-i`           | [Enter GUI](#gui)                                                                                                    | `pdf2zh -i`                                    |
| `-p`           | [Partial document translation](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#partial)        | `pdf2zh example.pdf -p 1`                      |
| `-li`          | [Source language](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#languages)                   | `pdf2zh example.pdf -li en`                    |
| `-lo`          | [Target language](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#languages)                   | `pdf2zh example.pdf -lo zh`                    |
| `-s`           | [Translation service](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services)                | `pdf2zh example.pdf -s deepl`                  |
| `-t`           | [Multi-threading](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#threads)                     | `pdf2zh example.pdf -t 1`                      |
| `-o`           | Output directory                                                                                                     | `pdf2zh example.pdf -o output`                 |
| `-f`, `-c`     | [Exceptions](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#exceptions)                       | `pdf2zh example.pdf -f "(MS.*)"`               |
| `-cp`          | Compatibility mode                                                                                                   | `pdf2zh example.pdf --compatible`              |
| `--share`      | Public link                                                                                                          | `pdf2zh -i --share`                            |
| `--authorized` | [Authorization](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#auth)                          | `pdf2zh -i --authorized users.txt [auth.html]` |
| `--prompt`     | [Custom prompt](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#prompt)                        | `pdf2zh --prompt [prompt.txt]`                 |
| `--onnx`       | Use a custom DocLayout-YOLO ONNX model                                                                               | `pdf2zh --onnx [onnx/model/path]`              |
| `--serverport` | Use a custom WebUI (Gradio) server port                                                                              | `pdf2zh --serverport 7860`                     |
| `--dir`        | Batch translation                                                                                                    | `pdf2zh --dir /path/to/translate/`             |
| `--config`     | [Configuration file](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cofig)                    | `pdf2zh --config /path/to/config/config.json`  |
| `--mode`       | Translation mode: `fast` (default, v1) or `precise` (v2, experimental, requires the pdf2zh_next submodule)           | `pdf2zh --mode precise example.pdf`            |
| `--babeldoc`   | Translate with the experimental [BabelDOC](https://funstory-ai.github.io/BabelDOC/) backend                          | `pdf2zh --babeldoc -s openai example.pdf`      |

For a detailed description of each option, see [Advanced Usage](./ADVANCED.md).
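
For scripted use, the options in the table can be assembled into an argument list before handing it to a process runner. This is a minimal sketch, not part of the project: the helper name `build_pdf2zh_command` and its parameter names are hypothetical, and it only emits flags that appear in the table (`-p`, `-li`, `-lo`, `-s`, `-t`, `-o`).

```python
import shlex
from typing import List, Optional


def build_pdf2zh_command(
    pdf: str,
    pages: Optional[str] = None,
    lang_in: Optional[str] = None,
    lang_out: Optional[str] = None,
    service: Optional[str] = None,
    threads: Optional[int] = None,
    output_dir: Optional[str] = None,
) -> List[str]:
    """Assemble a pdf2zh argument list from the flags documented in the table."""
    cmd = ["pdf2zh", pdf]
    # Each optional value maps to exactly one documented flag.
    for flag, value in (
        ("-p", pages),
        ("-li", lang_in),
        ("-lo", lang_out),
        ("-s", service),
        ("-t", threads),
        ("-o", output_dir),
    ):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_pdf2zh_command(
    "example.pdf", lang_in="en", lang_out="zh", service="deepl", threads=4
)
print(shlex.join(cmd))  # pdf2zh example.pdf -li en -lo zh -s deepl -t 4
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with paths that contain spaces.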

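<h2 id="downstream">Secondary Development (APIs)</h2>

The current pdf2zh API is temporarily deprecated. It will be offered again once [pdf2zh 2.0](https://github.com/Byaidu/PDFMathTranslate/issues/586) is released. Users who need programmatic access should use the `babeldoc.high_level.async_translate` function from [BabelDOC](https://github.com/funstory-ai/BabelDOC).

Temporarily deprecated means that the related code will not be removed for now, but no technical support will be provided and no bugs will be fixed.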

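<!-- For downstream applications, see our [API details](./APIS.md) documentation for more information:
- [Python API](./APIS.md#api-python): how to use the program from other Python programs
- [HTTP API](./APIS.md#api-http): how to communicate with a server that has the program installed -->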

<h2 id="todo">TODOs</h2>

- [ ] Parse layout with DocLayNet-based models: [PaddleX](https://github.com/PaddlePaddle/PaddleX/blob/17cc27ac3842e7880ca4aad92358d3ef8555429a/paddlex/repo_apis/PaddleDetection_api/object_det/official_categories.py#L81), [PaperMage](https://github.com/allenai/papermage/blob/9cd4bb48cbedab45d0f7a455711438f1632abebe/README.md?plain=1#L102), [SAM2](https://github.com/facebookresearch/sam2)

- [ ] Fix page rotation, table of contents, and list formatting

- [ ] Fix pixel formulas in old papers

- [ ] Retry asynchronously, except on KeyboardInterrupt

- [ ] Knuth–Plass algorithm for Western languages

- [ ] Support non-PDF/A files

- [ ] Plugins for [Zotero](https://github.com/zotero/zotero) and [Obsidian](https://github.com/obsidianmd/obsidian-releases)

<h2 id="acknowledgement">Acknowledgements</h2>

- [Immersive Translation](https://immersivetranslate.com) sponsors monthly Pro membership redemption codes for active contributors to this project; see [CONTRIBUTOR_REWARD.md](https://github.com/funstory-ai/BabelDOC/blob/main/docs/CONTRIBUTOR_REWARD.md) for details

- Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)

- Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)

- Document extraction: [MinerU](https://github.com/opendatalab/MinerU)

- Document preview: [Gradio PDF](https://github.com/freddyaboulton/gradio-pdf)

- Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)

- Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)

- Document standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)

- Multilingual font: [Go Noto Universal](https://github.com/satbyy/go-noto-universal)

<h2 id="contrib">Contributors</h2>

<a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors">
  <img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" />
</a>

![Alt](https://repobeats.axiom.co/api/embed/dfa7583da5332a11468d686fbd29b92320a6a869.svg "Repobeats analytics image")

<h2 id="star_hist">Star History</h2>

<a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date">
 <picture>
   <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date&theme=dark" />
   <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date" />
   <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date"/>
 </picture>
</a>


## /docs/README_zh-TW.md

<div align="center">

[English](../README.md) | [简体中文](README_zh-CN.md) | 繁體中文 | [日本語](README_ja-JP.md)

<img src="./images/banner.png" width="320px"  alt="PDF2ZH"/>  

<h2 id="title">PDFMathTranslate</h2>

<p>
  <!-- PyPI -->
  <a href="https://pypi.org/project/pdf2zh/">
    <img src="https://img.shields.io/pypi/v/pdf2zh"/></a>
  <a href="https://pepy.tech/projects/pdf2zh">
    <img src="https://static.pepy.tech/badge/pdf2zh"></a>
  <a href="https://hub.docker.com/repository/docker/byaidu/pdf2zh">
    <img src="https://img.shields.io/docker/pulls/byaidu/pdf2zh"></a>
  <!-- License -->
  <a href="./LICENSE">
    <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a>
  <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"/></a>
  <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate">
    <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a>
  <a href="https://github.com/Byaidu/PDFMathTranslate/pulls">
    <img src="https://img.shields.io/badge/contributions-welcome-green"/></a>
  <a href="https://gitcode.com/Byaidu/PDFMathTranslate/overview">
    <img src="https://gitcode.com/Byaidu/PDFMathTranslate/star/badge.svg"></a>
  <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
    <img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"/></a>
</p>

<a href="https://trendshift.io/repositories/12424" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12424" alt="Byaidu%2FPDFMathTranslate | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

</div>

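PDF scientific paper translation and bilingual comparison.

- 📊 Preserves formulas, charts, the table of contents, and annotations *([preview](#preview))*
- 🌐 Supports [multiple languages](#language) and [many translation services](#services)
- 🤖 Provides a [command-line tool](#usage), a [graphical user interface](#gui), and [containerized deployment](#docker)

Feedback is welcome in [GitHub Issues](https://github.com/Byaidu/PDFMathTranslate/issues), the [Telegram user group](https://t.me/+Z9_SgnxmsmA5NzBl), or the [QQ group](https://qm.qq.com/q/DixZCxQej0).

For details on how to contribute, see the [Contribution Guide](https://github.com/Byaidu/PDFMathTranslate/wiki/Contribution-Guide---%E8%B4%A1%E7%8C%AE%E6%8C%87%E5%8D%97).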

<h2 id="updates">Recent Updates</h2>

- [Dec. 24 2024] The translator now supports local LLMs served by [Xinference](https://github.com/xorbitsai/inference) _(by [@imClumsyPanda](https://github.com/imClumsyPanda))_
- [Nov. 26 2024] The CLI now supports (multiple) online PDF files *(by [@reycn](https://github.com/reycn))*
- [Nov. 24 2024] [ONNX](https://github.com/onnx/onnx) support to reduce dependency size *(by [@Wybxc](https://github.com/Wybxc))*
- [Nov. 23 2024] 🌟 [Free public service](#demo) is online! *(by [@Byaidu](https://github.com/Byaidu))*
- [Nov. 23 2024] Added a firewall that blocks web crawlers *(by [@Byaidu](https://github.com/Byaidu))*
- [Nov. 22 2024] The GUI now supports Italian, along with some other updates *(by [@Byaidu](https://github.com/Byaidu), [@reycn](https://github.com/reycn))*
- [Nov. 22 2024] You can now share your self-deployed service with friends *(by [@Zxis233](https://github.com/Zxis233))*
- [Nov. 22 2024] Tencent Translation support *(by [@hellofinch](https://github.com/hellofinch))*
- [Nov. 21 2024] The GUI now supports downloading bilingual documents *(by [@reycn](https://github.com/reycn))*
- [Nov. 20 2024] 🌟 [Online demo](#demo) available! *(by [@reycn](https://github.com/reycn))*

<h2 id="preview">Preview</h2>

<div align="center">
<img src="./images/preview.gif" width="80%"/>
</div>

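<h2 id="demo">Online Demo 🌟</h2>

### Free service (<https://pdf2zh.com/>)

You can try the [free public service](https://pdf2zh.com/) right away, with no installation.

### Online demo

You can try the [online demo on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) or the [online demo on ModelScope](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate) directly, with no installation.
Note that the demos run on limited computing resources, so please do not abuse them.

<h2 id="install">Installation and Usage</h2>

We provide four ways to use this project: the [command-line tool](#cmd), the [portable install](#portable), the [graphical user interface](#gui), and [containerized deployment](#docker).

pdf2zh needs to download an additional model (`wybxc/DocLayout-YOLO-DocStructBench-onnx`) at runtime; the model is also available on ModelScope. If you have trouble downloading it at startup, set the following environment variable:
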
```shell
set HF_ENDPOINT=https://hf-mirror.com
```

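<h3 id="cmd">Method 1: Command-line tool</h3>

1. Make sure Python is installed (3.11 <= version <= 3.12)
2. Install the program:

   ```bash
   pip install pdf2zh
   ```

3. Run the translation; the generated files are placed in the [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444):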

   ```bash
   pdf2zh document.pdf
   ```

<h3 id="portable">Method 2: Portable install</h3>

No pre-installed Python environment is required.

Download [setup.bat](https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/script/setup.bat) and double-click it to run.

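<h3 id="gui">Method 3: Graphical user interface</h3>

1. Make sure Python is installed (3.11 <= version <= 3.12)
2. Install the program:

   ```bash
   pip install pdf2zh
   ```

3. Start using it in your browser:

   ```bash
   pdf2zh -i
   ```

4. If your browser does not open and navigate automatically, open this address manually:

   ```
   http://localhost:7860/
   ```

   <img src="./images/gui.gif" width="500"/>

See the [documentation for GUI](/README_GUI.md) for detailed instructions.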

<h3 id="docker">Method 4: Containerized deployment</h3>

1. Pull the Docker image and run it:

   ```bash
   docker pull byaidu/pdf2zh
   docker run -d -p 7860:7860 byaidu/pdf2zh
   ```

2. Open it in your browser:

   ```
   http://localhost:7860/
   ```

To deploy the container image on cloud services:

<div>
<a href="https://www.heroku.com/deploy?template=https://github.com/Byaidu/PDFMathTranslate">
  <img src="https://www.herokucdn.com/deploy/button.svg" alt="Deploy" height="26"></a>
<a href="https://render.com/deploy">
  <img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render" height="26"></a>
<a href="https://zeabur.com/templates/5FQIGX?referralCode=reycn">
  <img src="https://zeabur.com/button.svg" alt="Deploy on Zeabur" height="26"></a>
<a href="https://app.koyeb.com/deploy?type=git&builder=buildpack&repository=github.com/Byaidu/PDFMathTranslate&branch=main&name=pdf-math-translate">
  <img src="https://www.koyeb.com/static/images/deploy/button.svg" alt="Deploy to Koyeb" height="26"></a>
</div>

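<h2 id="usage">Advanced Options</h2>

Run the translation command from the command line to generate the translated file `example-mono.pdf` and the bilingual file `example-dual.pdf` in the current working directory. Google Translate is used by default.

<img src="./images/cmd.explained.png" width="580px"  alt="cmd"/>  

The table below lists all advanced options for reference: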

| Option | Function | Example |
| -------- | ------- |------- |
| files | Local files |  `pdf2zh ~/local.pdf` |
| links | Online files |  `pdf2zh http://arxiv.org/paper.pdf` |
| `-i`  | [Enter GUI](#gui) |  `pdf2zh -i` |
| `-p`  | [Translate only part of the document](#partial) |  `pdf2zh example.pdf -p 1` |
| `-li` | [Source language](#language) |  `pdf2zh example.pdf -li en` |
| `-lo` | [Target language](#language) |  `pdf2zh example.pdf -lo zh` |
| `-s`  | [Translation service](#services) |  `pdf2zh example.pdf -s deepl` |
| `-t`  | [Multi-threading](#threads) | `pdf2zh example.pdf -t 1` |
| `-o`  | Output directory | `pdf2zh example.pdf -o output` |
| `-f`, `-c` | [Exception rules](#exceptions) | `pdf2zh example.pdf -f "(MS.*)"` |
| `--share` | Get a public Gradio link | `pdf2zh -i --share` |
| `--authorized` | [Add web authentication and a custom authentication page](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.) | `pdf2zh -i --authorized users.txt [auth.html]` |
| `--prompt` | Use a custom LLM prompt | `pdf2zh --prompt [prompt.txt]` |
| `--onnx` | Use a custom DocLayout-YOLO ONNX model | `pdf2zh --onnx [onnx/model/path]` |
| `--serverport` | Use a custom WebUI port | `pdf2zh --serverport 7860` |
| `--dir` | Translate a folder | `pdf2zh --dir /path/to/translate/` |

<h3 id="partial">Full or partial document translation</h3>

- **Full translation**

```bash
pdf2zh example.pdf
```

- **Partial translation**

```bash
pdf2zh example.pdf -p 1-3,5
```
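
A page specification such as `1-3,5` combines ranges and single pages. The sketch below shows one way such a string could be expanded; the function name `parse_page_spec` is hypothetical, and how pdf2zh parses `-p` internally (including whether it converts to zero-based indices) is not shown in this README, so treat the 1-based output as an assumption.

```python
from typing import List


def parse_page_spec(spec: str) -> List[int]:
    """Expand a spec such as "1-3,5" into sorted, de-duplicated 1-based page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))  # ranges are inclusive
        else:
            pages.add(int(part))
    return sorted(pages)


print(parse_page_spec("1-3,5"))  # [1, 2, 3, 5]
```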

<h3 id="language">Specify source and target languages</h3>

See the [Google language codes](https://developers.google.com/admin-sdk/directory/v1/languages) and the [DeepL language codes](https://developers.deepl.com/docs/resources/supported-languages) for reference.

```bash
pdf2zh example.pdf -li en -lo ja
```

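<h3 id="services">Use different translation services</h3>

The table below lists the [environment variables](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) required by each translation service. Make sure the corresponding variables are set before use.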

|**Translator**|**Service**|**Environment Variables**|**Default Values**|**Notes**|
|-|-|-|-|-|
|**Google (Default)**|`google`|None|N/A|None|
|**Bing**|`bing`|None|N/A|None|
|**DeepL**|`deepl`|`DEEPL_AUTH_KEY`|`[Your Key]`|See [DeepL](https://support.deepl.com/hc/en-us/articles/360020695820-API-Key-for-DeepL-s-API)|
|**DeepLX**|`deeplx`|`DEEPLX_ENDPOINT`|`https://api.deepl.com/translate`|See [DeepLX](https://github.com/OwO-Network/DeepLX)|
|**Ollama**|`ollama`|`OLLAMA_HOST`, `OLLAMA_MODEL`|`http://127.0.0.1:11434`, `gemma2`|See [Ollama](https://github.com/ollama/ollama)|
|**OpenAI**|`openai`|`OPENAI_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_MODEL`, `OPENAI_STOP_TOKENS`, `OPENAI_MAX_TOKENS`|`https://api.openai.com/v1`, `[Your Key]`, `gpt-4o-mini`, ` `, `-1`|See [OpenAI](https://platform.openai.com/docs/overview)|
|**AzureOpenAI**|`azure-openai`|`AZURE_OPENAI_BASE_URL`, `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_MODEL`|`[Your Endpoint]`, `[Your Key]`, `gpt-4o-mini`|See [Azure OpenAI](https://learn.microsoft.com/zh-cn/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Cjavascript-keyless%2Ctypescript-keyless%2Cpython&pivots=programming-language-python)|
|**Zhipu**|`zhipu`|`ZHIPU_API_KEY`, `ZHIPU_MODEL`|`[Your Key]`, `glm-4-flash`|See [Zhipu](https://open.bigmodel.cn/dev/api/thirdparty-frame/openai-sdk)|
|**ModelScope**|`modelscope`|`MODELSCOPE_API_KEY`, `MODELSCOPE_MODEL`|`[Your Key]`, `Qwen/Qwen2.5-Coder-32B-Instruct`|See [ModelScope](https://www.modelscope.cn/docs/model-service/API-Inference/intro)|
|**Silicon**|`silicon`|`SILICON_API_KEY`, `SILICON_MODEL`|`[Your Key]`, `Qwen/Qwen2.5-7B-Instruct`|See [SiliconCloud](https://docs.siliconflow.cn/quickstart)|
|**Gemini**|`gemini`|`GEMINI_API_KEY`, `GEMINI_MODEL`|`[Your Key]`, `gemini-1.5-flash`|See [Gemini](https://ai.google.dev/gemini-api/docs/openai)|
|**Azure**|`azure`|`AZURE_ENDPOINT`, `AZURE_API_KEY`|`https://api.translator.azure.cn`, `[Your Key]`|See [Azure](https://docs.azure.cn/en-us/ai-services/translator/text-translation-overview)|
|**Tencent**|`tencent`|`TENCENTCLOUD_SECRET_ID`, `TENCENTCLOUD_SECRET_KEY`|`[Your ID]`, `[Your Key]`|See [Tencent](https://www.tencentcloud.com/products/tmt?from_qcintl=122110104)|
|**Dify**|`dify`|`DIFY_API_URL`, `DIFY_API_KEY`|`[Your DIFY URL]`, `[Your Key]`|See [Dify](https://github.com/langgenius/dify). Three variables must be defined in the Dify workflow input: lang_out, lang_in, text.|
|**AnythingLLM**|`anythingllm`|`AnythingLLM_URL`, `AnythingLLM_APIKEY`|`[Your AnythingLLM URL]`, `[Your Key]`|See [anything-llm](https://github.com/Mintplex-Labs/anything-llm)|
|**Argos Translate**|`argos`| | |See [argos-translate](https://github.com/argosopentech/argos-translate)|
|**Grok**|`grok`|`GORK_API_KEY`, `GORK_MODEL`|`[Your GORK_API_KEY]`, `grok-2-1212`|See [Grok](https://docs.x.ai/docs/overview)|
|**DeepSeek**|`deepseek`|`DEEPSEEK_API_KEY`, `DEEPSEEK_MODEL`|`[Your DEEPSEEK_API_KEY]`, `deepseek-chat`|See [DeepSeek](https://www.deepseek.com/)|
|**MiniMax**|`minimax`|`MINIMAX_API_KEY`, `MINIMAX_MODEL`|`[Your MINIMAX_API_KEY]`, `MiniMax-M2.7`|See [MiniMax](https://platform.minimaxi.com/)|
|**OpenAI-Liked**|`openailiked`|`OPENAILIKED_BASE_URL`, `OPENAILIKED_API_KEY`, `OPENAILIKED_MODEL`, `OPENAILIKED_STOP_TOKENS`, `OPENAILIKED_MAX_TOKENS`|`url`, `[Your Key]`, `model name`, ` `, `-1`|None|

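For LLMs that are not listed in the table above but are compatible with the OpenAI API, set the environment variables the same way as for OpenAI.

Use `-s service` or `-s service:model` to specify the translation service: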

```bash
pdf2zh example.pdf -s openai:gpt-4o-mini
```

Or specify the model with an environment variable:

```bash
set OPENAI_MODEL=gpt-4o-mini
pdf2zh example.pdf -s openai
```
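
The `service:model` form packs two values into one argument. As a rough illustration of how such a string can be taken apart (the helper name `split_service_spec` is hypothetical and this is not necessarily how pdf2zh parses it), splitting on the first colon keeps model names that themselves contain colons intact:

```python
from typing import Optional, Tuple


def split_service_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split "service" or "service:model" into (service, model or None)."""
    # partition() splits on the first colon only, so "ollama:gemma2:9b"
    # yields the model name "gemma2:9b".
    service, separator, model = spec.partition(":")
    return service, (model if separator and model else None)


print(split_service_spec("openai:gpt-4o-mini"))  # ('openai', 'gpt-4o-mini')
print(split_service_spec("openai"))  # ('openai', None)
```

When the model part is absent, the environment variable from the services table (for example `OPENAI_MODEL`) is the documented way to choose the model.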

<h3 id="exceptions">Specify exception rules</h3>

Use regular expressions to specify the formula fonts and characters that should be preserved:

```bash
pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
```

By default, the `Latex`, `Mono`, `Code`, `Italic`, `Symbol` and `Math` fonts are preserved:

```bash
pdf2zh example.pdf -f "(CM[^R]|MS.M|XY|MT|BL|RM|EU|LA|RS|LINE|LCIRCLE|TeX-|rsfs|txsy|wasy|stmary|.*Mono|.*Code|.*Ital|.*Sym|.*Math)"
```
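
Before passing custom patterns to `-f` and `-c`, it can help to check them against a few font names and characters. The sketch below uses the patterns from the first example above; the helper names are hypothetical, and matching from the start of the string with `re.match` is an assumption about how the patterns are applied.

```python
import re

# Patterns copied from the `-f` and `-c` example above.
FONT_PATTERN = re.compile(r"(CM[^RT].*|MS.*|.*Ital)")
CHAR_PATTERN = re.compile(r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])")


def is_formula_font(font_name: str) -> bool:
    """Return True if the font name matches the formula-font pattern."""
    return FONT_PATTERN.match(font_name) is not None


def is_formula_char(char: str) -> bool:
    """Return True if the character matches the formula-character pattern."""
    return CHAR_PATTERN.match(char) is not None


for name in ("CMMI10", "CMR10", "MSBM10", "Times-Italic", "Helvetica"):
    print(name, is_formula_font(name))
```

With this pattern `CMMI10` (Computer Modern math italic) is kept as a formula font while `CMR10` (Computer Modern roman) is treated as ordinary text, because `[^RT]` excludes names whose third letter is `R` or `T`.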

<h3 id="threads">Specify the number of threads</h3>

Use the `-t` option to set the number of threads used for translation:

```bash
pdf2zh example.pdf -t 1
```
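
To give a feel for what a thread count controls, here is a generic sketch of translating independent text segments with a bounded thread pool. It is not the project's implementation: `translate_segments` and the `translate_one` callback are hypothetical names, and the only assumption is that each segment can be translated independently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def translate_segments(
    segments: Sequence[str],
    translate_one: Callable[[str], str],
    threads: int = 1,
) -> List[str]:
    """Translate segments concurrently while preserving their original order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in input order, regardless of finish order.
        return list(pool.map(translate_one, segments))


# str.upper stands in for a real translation call in this demo.
print(translate_segments(["alpha", "beta", "gamma"], str.upper, threads=2))
```

Because translation requests are mostly network-bound, more threads usually shorten the total time until the remote service's rate limit becomes the bottleneck.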

<h3 id="prompt">Custom LLM prompt</h3>

Use `--prompt` to specify the prompt file used for LLM translation.

```bash
pdf2zh example.pdf --prompt prompt.txt
```

Example `prompt.txt` contents:

```
[
    {
        "role": "system",
        "content": "You are a professional,authentic machine translation engine.",
    },
    {
        "role": "user",
        "content": "Translate the following markdown source text to ${lang_out}. Keep the formula notation {{v*}} unchanged. Output translation directly without any additional text.\nSource Text: ${text}\nTranslated Text:",
    },
]
```

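The following three built-in variables can be used in a custom prompt file to pass parameters:

|**Variable**|**Description**|
|-|-|
|`lang_in`|Input language|
|`lang_out`|Output language|
|`text`|Text to be translated|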
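
The `${...}` placeholders follow the syntax of Python's `string.Template`, which is also what `pdf2zh/backend.py` wraps the prompt in. The sketch below shows how the three variables are filled in; the prompt wording and the `render_prompt` helper are illustrative only.

```python
from string import Template

# Illustrative prompt text; only the three ${...} names come from the table.
PROMPT = Template(
    "Translate the following text from ${lang_in} to ${lang_out}. "
    "Keep the formula notation {{v*}} unchanged.\n"
    "Source Text: ${text}\n"
    "Translated Text:"
)


def render_prompt(lang_in: str, lang_out: str, text: str) -> str:
    """Fill the three built-in variables into the prompt template."""
    # substitute() raises KeyError if the template uses an unknown variable.
    return PROMPT.substitute(lang_in=lang_in, lang_out=lang_out, text=text)


print(render_prompt("en", "zh", "Energy is {v1}."))
```

Curly braces without a leading `$`, such as `{{v*}}`, are left untouched by `Template`, so formula placeholders survive the substitution.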

<h2 id="todo">API</h2>

### Python

```python
from pdf2zh import translate, translate_stream

params = {"lang_in": "en", "lang_out": "zh", "service": "google", "thread": 4}
file_mono, file_dual = translate(files=["example.pdf"], **params)[0]
with open("example.pdf", "rb") as f:
    stream_mono, stream_dual = translate_stream(stream=f.read(), **params)
```

### HTTP

```bash
pip install pdf2zh[backend]
pdf2zh --flask
pdf2zh --celery worker
```

```bash
curl http://localhost:11008/v1/translate -F "file=@example.pdf" -F "data={\"lang_in\":\"en\",\"lang_out\":\"zh\",\"service\":\"google\",\"thread\":4}"
{"id":"d9894125-2f4e-45ea-9d93-1a9068d2045a"}

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a
{"info":{"n":13,"total":506},"state":"PROGRESS"}

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a
{"state":"SUCCESS"}

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/mono --output example-mono.pdf

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/dual --output example-dual.pdf

curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a -X DELETE
```
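
The curl session above follows a submit, poll, download pattern. The polling step can be sketched in Python as follows; the function names are hypothetical, the base URL is taken from the curl examples, and `PENDING`, `STARTED` and `RETRY` are Celery's built-in names for unfinished task states.

```python
import json
import time
import urllib.request
from typing import Callable, Dict

UNFINISHED_STATES = ("PENDING", "STARTED", "RETRY", "PROGRESS")


def http_status_fetcher(base_url: str, task_id: str) -> Callable[[], Dict]:
    """Return a callable that GETs /v1/translate/<id> and decodes the JSON body."""
    def fetch() -> Dict:
        with urllib.request.urlopen(f"{base_url}/v1/translate/{task_id}") as resp:
            return json.load(resp)
    return fetch


def wait_for_task(
    fetch_status: Callable[[], Dict],
    poll_seconds: float = 1.0,
    max_polls: int = 600,
) -> str:
    """Poll until the task leaves the unfinished states; return the final state."""
    for _ in range(max_polls):
        status = fetch_status()
        state = status.get("state", "")
        if state not in UNFINISHED_STATES:
            return state
        if state == "PROGRESS":
            info = status.get("info", {})
            print(f"translated {info.get('n')} of {info.get('total')}")
        time.sleep(poll_seconds)
    raise TimeoutError("task did not finish in time")
```

With a server running, `wait_for_task(http_status_fetcher("http://localhost:11008", task_id))` would block until the state becomes `SUCCESS` or a failure state, after which the `/mono` and `/dual` endpoints can be downloaded.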

<h2 id="acknowledgement">Acknowledgements</h2>

- Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)
- Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)
- Document extraction: [MinerU](https://github.com/opendatalab/MinerU)
- Document preview: [Gradio PDF](https://github.com/freddyaboulton/gradio-pdf)
- Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)
- Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
- PDF standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)
- Multilingual font: [Go Noto Universal](https://github.com/satbyy/go-noto-universal)

<h2 id="contrib">Contributors</h2>

<a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors">
  <img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" />
</a>

![Alt](https://repobeats.axiom.co/api/embed/dfa7583da5332a11468d686fbd29b92320a6a869.svg "Repobeats analytics image")

<h2 id="star_hist">Star History</h2>

<a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date">
 <picture>
   <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date&theme=dark" />
   <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date" />
   <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date"/>
 </picture>
</a>

## /docs/images/after.png

Binary file available at https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/docs/images/after.png

## /docs/images/banner.png

Binary file available at https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/docs/images/banner.png

## /docs/images/before.png

Binary file available at https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/docs/images/before.png

## /docs/images/cmd.explained.png

Binary file available at https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/docs/images/cmd.explained.png

## /docs/images/cmd.explained.zh.png

Binary file available at https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/docs/images/cmd.explained.zh.png

## /docs/images/gui.gif

Binary file available at https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/docs/images/gui.gif

## /docs/images/preview.gif

Binary file available at https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/docs/images/preview.gif

## /pdf2zh/__init__.py

```py path="/pdf2zh/__init__.py" 
import logging

log = logging.getLogger(__name__)

__version__ = "1.9.11"
__author__ = "Byaidu"
__all__ = ["translate", "translate_stream"]


def __getattr__(name):
    if name in {"translate", "translate_stream"}:
        from pdf2zh.high_level import translate, translate_stream

        return {"translate": translate, "translate_stream": translate_stream}[name]
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

```
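
The module-level `__getattr__` above defers the import of `pdf2zh.high_level` until `translate` or `translate_stream` is first accessed (the mechanism introduced by PEP 562). The standalone sketch below reproduces the idea on a throwaway module; `make_lazy_module` and `demo_pkg` are hypothetical names, and unlike the file above this version also caches the resolved attribute on the module.

```python
import importlib
import types
from typing import Dict, Tuple


def make_lazy_module(name: str, lazy_attrs: Dict[str, Tuple[str, str]]) -> types.ModuleType:
    """Build a module whose listed attributes are resolved on first access.

    lazy_attrs maps an attribute name to (source module name, source attribute).
    """
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        # Only called when normal attribute lookup on the module fails.
        if attr in lazy_attrs:
            source_module, source_attr = lazy_attrs[attr]
            value = getattr(importlib.import_module(source_module), source_attr)
            setattr(module, attr, value)  # cache so later lookups skip this hook
            return value
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    module.__getattr__ = __getattr__
    return module


demo = make_lazy_module("demo_pkg", {"dumps": ("json", "dumps")})
print(demo.dumps({"a": 1}))  # {"a": 1}
```

The benefit in a package like this one is that `import pdf2zh` stays cheap, since heavy dependencies are loaded only when a translation function is actually requested.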

## /pdf2zh/backend.py

```py path="/pdf2zh/backend.py" 
from flask import Flask, request, send_file
from celery import Celery, Task
from celery.result import AsyncResult
from pdf2zh import translate_stream
import tqdm
import json
import io
from string import Template
from pdf2zh.doclayout import ModelInstance
from pdf2zh.config import ConfigManager

flask_app = Flask("pdf2zh")
flask_app.config.from_mapping(
    CELERY=dict(
        broker_url=ConfigManager.get("CELERY_BROKER", "redis://127.0.0.1:6379/0"),
        result_backend=ConfigManager.get("CELERY_RESULT", "redis://127.0.0.1:6379/0"),
    )
)


def celery_init_app(app: Flask) -> Celery:
    class FlaskTask(Task):
        def __call__(self, *args, **kwargs):
            with app.app_context():
                return self.run(*args, **kwargs)

    celery_app = Celery(app.name)
    celery_app.config_from_object(app.config["CELERY"])
    celery_app.Task = FlaskTask
    celery_app.set_default()
    celery_app.autodiscover_tasks()
    app.extensions["celery"] = celery_app
    return celery_app


celery_app = celery_init_app(flask_app)


@celery_app.task(bind=True)
def translate_task(
    self: Task,
    stream: bytes,
    args: dict,
):
    def progress_bar(t: tqdm.tqdm):
        self.update_state(state="PROGRESS", meta={"n": t.n, "total": t.total})  # noqa
        print(f"Translating {t.n} / {t.total} pages")

    if "prompt" in args:
        args["prompt"] = Template(args["prompt"])

    doc_mono, doc_dual = translate_stream(
        stream,
        callback=progress_bar,
        model=ModelInstance.value,
        **args,
    )
    return doc_mono, doc_dual


@flask_app.route("/v1/translate", methods=["POST"])
def create_translate_tasks():
    file = request.files["file"]
    stream = file.stream.read()
    print(request.form.get("data"))
    args = json.loads(request.form.get("data"))
    task = translate_task.delay(stream, args)
    return {"id": task.id}


@flask_app.route("/v1/translate/<id>", methods=["GET"])
def get_translate_task(id: str):
    result: AsyncResult = celery_app.AsyncResult(id)
    if str(result.state) == "PROGRESS":
        return {"state": str(result.state), "info": result.info}
    else:
        return {"state": str(result.state)}


@flask_app.route("/v1/translate/<id>", methods=["DELETE"])
def delete_translate_task(id: str):
    result: AsyncResult = celery_app.AsyncResult(id)
    result.revoke(terminate=True)
    return {"state": str(result.state)}


@flask_app.route("/v1/translate/<id>/<format>")
def get_translate_result(id: str, format: str):
    result = celery_app.AsyncResult(id)
    if not result.ready():
        return {"error": "task not finished"}, 400
    if not result.successful():
        return {"error": "task failed"}, 400
    doc_mono, doc_dual = result.get()
    to_send = doc_mono if format == "mono" else doc_dual
    return send_file(io.BytesIO(to_send), "application/pdf")


if __name__ == "__main__":
    flask_app.run()

```

## /pdf2zh/cache.py

```py path="/pdf2zh/cache.py" 
import logging
import os
import json
from peewee import Model, SqliteDatabase, AutoField, CharField, TextField, SQL
from typing import Optional

# we don't init the database here
db = SqliteDatabase(None)
logger = logging.getLogger(__name__)


class _TranslationCache(Model):
    id = AutoField()
    translate_engine = CharField(max_length=20)
    translate_engine_params = TextField()
    original_text = TextField()
    translation = TextField()

    class Meta:
        database = db
        constraints = [SQL("""
            UNIQUE (
                translate_engine,
                translate_engine_params,
                original_text
                )
            ON CONFLICT REPLACE
            """)]


class TranslationCache:
    @staticmethod
    def _sort_dict_recursively(obj):
        if isinstance(obj, dict):
            return {
                k: TranslationCache._sort_dict_recursively(v)
                for k in sorted(obj.keys())
                for v in [obj[k]]
            }
        elif isinstance(obj, list):
            return [TranslationCache._sort_dict_recursively(item) for item in obj]
        return obj

    def __init__(self, translate_engine: str, translate_engine_params: dict = None):
        assert (
            len(translate_engine) < 20
        ), "current cache require translate engine name less than 20 characters"
        self.translate_engine = translate_engine
        self.replace_params(translate_engine_params)

    # The program typically starts multi-threaded translation
    # only after cache parameters are fully configured,
    # so thread safety doesn't need to be considered here.
    def replace_params(self, params: dict = None):
        if params is None:
            params = {}
        self.params = params
        params = self._sort_dict_recursively(params)
        self.translate_engine_params = json.dumps(params)

    def update_params(self, params: dict = None):
        if params is None:
            params = {}
        self.params.update(params)
        self.replace_params(self.params)

    def add_params(self, k: str, v):
        self.params[k] = v
        self.replace_params(self.params)

    # Since peewee and the underlying sqlite are thread-safe,
    # get and set operations don't need locks.
    def get(self, original_text: str) -> Optional[str]:
        result = _TranslationCache.get_or_none(
            translate_engine=self.translate_engine,
            translate_engine_params=self.translate_engine_params,
            original_text=original_text,
        )
        return result.translation if result else None

    def set(self, original_text: str, translation: str):
        try:
            _TranslationCache.create(
                translate_engine=self.translate_engine,
                translate_engine_params=self.translate_engine_params,
                original_text=original_text,
                translation=translation,
            )
        except Exception as e:
            logger.debug(f"Error setting cache: {e}")


def init_db(remove_exists=False):
    cache_folder = os.path.join(os.path.expanduser("~"), ".cache", "pdf2zh")
    os.makedirs(cache_folder, exist_ok=True)
    # The current version does not support database migration, so add the version number to the file name.
    cache_db_path = os.path.join(cache_folder, "cache.v1.db")
    if remove_exists and os.path.exists(cache_db_path):
        os.remove(cache_db_path)
    db.init(
        cache_db_path,
        pragmas={
            "journal_mode": "wal",
            "busy_timeout": 1000,
        },
    )
    db.create_tables([_TranslationCache], safe=True)


def init_test_db():
    import tempfile

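    # mkstemp creates the file atomically; tempfile.mktemp is deprecated and race-prone
    fd, cache_db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)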
    test_db = SqliteDatabase(
        cache_db_path,
        pragmas={
            "journal_mode": "wal",
            "busy_timeout": 1000,
        },
    )
    test_db.bind([_TranslationCache], bind_refs=False, bind_backrefs=False)
    test_db.connect()
    test_db.create_tables([_TranslationCache], safe=True)
    return test_db


def clean_test_db(test_db):
    test_db.drop_tables([_TranslationCache])
    test_db.close()
    db_path = test_db.database
    if os.path.exists(db_path):
        os.remove(test_db.database)
    wal_path = db_path + "-wal"
    if os.path.exists(wal_path):
        os.remove(wal_path)
    shm_path = db_path + "-shm"
    if os.path.exists(shm_path):
        os.remove(shm_path)


init_db()

```
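
The cache key above depends on `translate_engine_params`, which `replace_params` builds by sorting the parameter dict recursively before calling `json.dumps`. The helper `_sort_dict_recursively` is defined earlier in the file and is not shown here, so the following is only a minimal sketch of the idea, assuming string keys; `sort_dict_recursively` and `params_key` are hypothetical names, not the project's own.

```python
import json


def sort_dict_recursively(obj):
    """Return a copy of obj with dict keys sorted at every nesting level."""
    if isinstance(obj, dict):
        return {k: sort_dict_recursively(obj[k]) for k in sorted(obj)}
    if isinstance(obj, list):
        return [sort_dict_recursively(item) for item in obj]
    return obj


def params_key(params=None):
    """Serialize params so that key order does not change the cache key."""
    return json.dumps(sort_dict_recursively(params or {}))


# Two dicts with the same content but different insertion order
key_a = params_key({"model": "gpt", "opts": {"b": 2, "a": 1}})
key_b = params_key({"opts": {"a": 1, "b": 2}, "model": "gpt"})
```

Because both calls serialize to the same string, a lookup with either parameter ordering would hit the same cached row.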

## /pdf2zh/config.py

```py path="/pdf2zh/config.py" 
import json
from pathlib import Path
from threading import RLock  # reentrant lock, used instead of Lock
import os
import copy


class ConfigManager:
    _instance = None
    _lock = RLock()  # RLock instead of Lock: the same thread may acquire it repeatedly

    @classmethod
    def get_instance(cls):
        """Return the singleton instance."""
        # Check without the lock first; take the lock only when the instance must be created
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance

    def __init__(self):
        # Guard against repeated initialization
        if hasattr(self, "_initialized") and self._initialized:
            return
        self._initialized = True

        self._config_path = Path.home() / ".config" / "PDFMathTranslate" / "config.json"
        self._config_data = {}

        # Do not lock again here: the caller (get_instance) may already hold the lock, though RLock would tolerate it
        self._ensure_config_exists()

    def _ensure_config_exists(self, isInit=True):
        """Ensure the config file exists; create a default config if it does not."""
        # No explicit lock is needed here either, for the same reason: this method
        # calls _load_config(), which takes the lock itself, and RLock is reentrant so it will not block.
        if not self._config_path.exists():
            if isInit:
                self._config_path.parent.mkdir(parents=True, exist_ok=True)
                self._config_data = {}  # default config contents
                self._save_config()
            else:
                raise ValueError(f"config file {self._config_path} not found!")
        else:
            self._load_config()

    def _load_config(self):
        """Load the configuration from config.json."""
        with self._lock:  # lock for thread safety
            with self._config_path.open("r", encoding="utf-8") as f:
                self._config_data = json.load(f)

    def _save_config(self):
        """Save the configuration to config.json."""
        with self._lock:  # lock for thread safety
            # Strip circular references, then write
            cleaned_data = self._remove_circular_references(self._config_data)
            with self._config_path.open("w", encoding="utf-8") as f:
                json.dump(cleaned_data, f, indent=4, ensure_ascii=False)

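    def _remove_circular_references(self, obj, seen=None):
        """Recursively replace circular references with None."""
        # Only containers can form cycles. Scalars such as True or small ints are
        # shared objects, so tracking them by id would wrongly turn repeats into None.
        if not isinstance(obj, (dict, list)):
            return obj
        if seen is None:
            seen = set()
        obj_id = id(obj)
        if obj_id in seen:
            return None  # already on the current path: a circular reference
        seen.add(obj_id)
        try:
            if isinstance(obj, dict):
                return {
                    k: self._remove_circular_references(v, seen) for k, v in obj.items()
                }
            return [self._remove_circular_references(i, seen) for i in obj]
        finally:
            seen.discard(obj_id)  # leaving this container, so shared siblings stay valid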

    @classmethod
    def custome_config(cls, file_path):
        """Load the configuration from a custom file path."""
        custom_path = Path(file_path)
        if not custom_path.exists():
            raise ValueError(f"Config file {custom_path} not found!")
        # Take the lock
        with cls._lock:
            instance = cls()
            instance._config_path = custom_path
            # isInit=False: raise if the file is missing, otherwise load it via _load_config()
            instance._ensure_config_exists(isInit=False)
            cls._instance = instance

    @classmethod
    def get(cls, key, default=None):
        """Get a config value, falling back to the environment and then to default."""
        instance = cls.get_instance()
        # Reads work with or without the lock. For consistency, every modification
        # is saved through _save_config(), which takes the lock.
        if key in instance._config_data:
            return instance._config_data[key]

        # If the key exists as an environment variable, use it and write it back to the config
        if key in os.environ:
            value = os.environ[key]
            instance._config_data[key] = value
            instance._save_config()
            return value

        # If default is not None, store it and save
        if default is not None:
            instance._config_data[key] = default
            instance._save_config()
            return default

        # Not found: raising is disabled, so fall through and return default (None here)
        # raise KeyError(f"{key} is not found in config file or environment variables.")
        return default

    @classmethod
    def set(cls, key, value):
        """Set a config value and save."""
        instance = cls.get_instance()
        with instance._lock:
            instance._config_data[key] = value
            instance._save_config()

    @classmethod
    def get_translator_by_name(cls, name):
        """Return the envs of the translator config matching name, or None."""
        instance = cls.get_instance()
        translators = instance._config_data.get("translators", [])
        for translator in translators:
            if translator.get("name") == name:
                return translator["envs"]
        return None

    @classmethod
    def set_translator_by_name(cls, name, new_translator_envs):
        """Set or update the translator config matching name."""
        instance = cls.get_instance()
        with instance._lock:
            translators = instance._config_data.get("translators", [])
            for translator in translators:
                if translator.get("name") == name:
                    translator["envs"] = copy.deepcopy(new_translator_envs)
                    instance._save_config()
                    return
            translators.append(
                {"name": name, "envs": copy.deepcopy(new_translator_envs)}
            )
            instance._config_data["translators"] = translators
            instance._save_config()

    @classmethod
    def get_env_by_translatername(cls, translater_name, name, default=None):
        """Return one env value of a translator, storing default when the value is empty or the translator is missing."""
        instance = cls.get_instance()
        translators = instance._config_data.get("translators", [])
        for translator in translators:
            if translator.get("name") == translater_name.name:
                if translator["envs"][name]:
                    return translator["envs"][name]
                else:
                    with instance._lock:
                        translator["envs"][name] = default
                        instance._save_config()
                        return default

        with instance._lock:
            translators = instance._config_data.get("translators", [])
            for translator in translators:
                if translator.get("name") == translater_name.name:
                    translator["envs"][name] = default
                    instance._save_config()
                    return default
            translators.append(
                {
                    "name": translater_name.name,
                    "envs": copy.deepcopy(translater_name.envs),
                }
            )
            instance._config_data["translators"] = translators
            instance._save_config()
            return default

    @classmethod
    def delete(cls, key):
        """Delete a config value and save."""
        instance = cls.get_instance()
        with instance._lock:
            if key in instance._config_data:
                del instance._config_data[key]
                instance._save_config()

    @classmethod
    def clear(cls):
        """Clear all config values and save."""
        instance = cls.get_instance()
        with instance._lock:
            instance._config_data = {}
            instance._save_config()

    @classmethod
    def all(cls):
        """Return all config items."""
        instance = cls.get_instance()
        # Read-only, so the lock is usually unnecessary; it could be taken to be safe.
        return instance._config_data

    @classmethod
    def remove(cls):
        instance = cls.get_instance()
        with instance._lock:
            os.remove(instance._config_path)

```
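
`ConfigManager.get` resolves a key in a fixed order: the loaded config first, then the process environment (writing the value back to the config), then the caller's default (also written back, unless it is `None`). The sketch below illustrates only that precedence on a plain dict, with no file I/O or locking; `lookup` and its `environ` parameter are hypothetical and exist only for this example.

```python
import os


def lookup(config, key, default=None, environ=None):
    """Resolve key from config, then the environment, then default."""
    environ = os.environ if environ is None else environ
    if key in config:
        return config[key]
    if key in environ:
        config[key] = environ[key]  # environment value is persisted into the config
        return config[key]
    if default is not None:
        config[key] = default  # a non-None default is persisted as well
    return default


cfg = {"A": "from-file"}
from_file = lookup(cfg, "A", "fallback", {"A": "from-env"})
from_env = lookup(cfg, "B", "fallback", {"B": "from-env"})
from_default = lookup(cfg, "C", "fallback", {})
missing = lookup(cfg, "D", None, {})
```

One consequence of the write-back is that an environment variable only matters the first time a key is read; afterwards the stored config value wins.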

## /pdf2zh/converter.py

```py path="/pdf2zh/converter.py" 
import concurrent.futures
import logging
import re
import unicodedata
from enum import Enum
from string import Template
from typing import Dict

import numpy as np
from pdfminer.converter import PDFConverter
from pdfminer.layout import LTChar, LTFigure, LTLine, LTPage
from pdfminer.pdffont import PDFCIDFont, PDFUnicodeNotDefined
from pdfminer.pdfinterp import PDFGraphicState, PDFResourceManager
from pdfminer.utils import apply_matrix_pt, mult_matrix
from pymupdf import Font
from tenacity import retry, wait_fixed

from pdf2zh.translator import (
    AnythingLLMTranslator,
    ArgosTranslator,
    AzureOpenAITranslator,
    AzureTranslator,
    BaseTranslator,
    BingTranslator,
    DeepLTranslator,
    DeepLXTranslator,
    DeepseekTranslator,
    DifyTranslator,
    GeminiTranslator,
    GoogleTranslator,
    GrokTranslator,
    GroqTranslator,
    MiniMaxTranslator,
    ModelScopeTranslator,
    OllamaTranslator,
    OpenAIlikedTranslator,
    OpenAITranslator,
    QwenMtTranslator,
    SiliconTranslator,
    TencentTranslator,
    XinferenceTranslator,
    ZhipuTranslator,
    X302AITranslator,
)

log = logging.getLogger(__name__)


class PDFConverterEx(PDFConverter):
    def __init__(
        self,
        rsrcmgr: PDFResourceManager,
    ) -> None:
        PDFConverter.__init__(self, rsrcmgr, None, "utf-8", 1, None)

    def begin_page(self, page, ctm) -> None:
        # Override: build the page from the cropbox
        x0, y0, x1, y1 = page.cropbox
        x0, y0 = apply_matrix_pt(ctm, (x0, y0))
        x1, y1 = apply_matrix_pt(ctm, (x1, y1))
        mediabox = (0, 0, abs(x0 - x1), abs(y0 - y1))
        self.cur_item = LTPage(page.pageno, mediabox)

    def end_page(self, page):
        # Override: return the instruction stream
        return self.receive_layout(self.cur_item)

    def begin_figure(self, name, bbox, matrix) -> None:
        # Override: set pageid
        self._stack.append(self.cur_item)
        self.cur_item = LTFigure(name, bbox, mult_matrix(matrix, self.ctm))
        self.cur_item.pageid = self._stack[-1].pageid

    def end_figure(self, _: str) -> None:
        # Override: return the instruction stream
        fig = self.cur_item
        assert isinstance(self.cur_item, LTFigure), str(type(self.cur_item))
        self.cur_item = self._stack.pop()
        self.cur_item.add(fig)
        return self.receive_layout(fig)

    def render_char(
        self,
        matrix,
        font,
        fontsize: float,
        scaling: float,
        rise: float,
        cid: int,
        ncs,
        graphicstate: PDFGraphicState,
    ) -> float:
        # Override: record cid and font on the character
        try:
            text = font.to_unichr(cid)
            assert isinstance(text, str), str(type(text))
        except PDFUnicodeNotDefined:
            text = self.handle_undefined_char(font, cid)
        textwidth = font.char_width(cid)
        textdisp = font.char_disp(cid)
        item = LTChar(
            matrix,
            font,
            fontsize,
            scaling,
            rise,
            text,
            textwidth,
            textdisp,
            ncs,
            graphicstate,
        )
        self.cur_item.add(item)
        item.cid = cid  # hack: attach the original character code
        item.font = font  # hack: attach the original font
        return item.adv


class Paragraph:
    def __init__(self, y, x, x0, x1, y0, y1, size, brk):
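        self.y: float = y  # initial y coordinate
        self.x: float = x  # initial x coordinate
        self.x0: float = x0  # left boundary
        self.x1: float = x1  # right boundary
        self.y0: float = y0  # lower boundary (minimum y)
        self.y1: float = y1  # upper boundary (maximum y)
        self.size: float = size  # font size
        self.brk: bool = brk  # line-break flag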


# fmt: off
class TranslateConverter(PDFConverterEx):
    def __init__(
        self,
        rsrcmgr,
        vfont: str = None,
        vchar: str = None,
        thread: int = 0,
        layout={},
        lang_in: str = "",
        lang_out: str = "",
        service: str = "",
        noto_name: str = "",
        noto: Font = None,
        envs: Dict = None,
        prompt: Template = None,
        ignore_cache: bool = False,
    ) -> None:
        super().__init__(rsrcmgr)
        self.vfont = vfont
        self.vchar = vchar
        self.thread = thread
        self.layout = layout
        self.noto_name = noto_name
        self.noto = noto
        self.translator: BaseTranslator = None
        # e.g. "ollama:gemma2:9b" -> ["ollama", "gemma2:9b"]
        param = service.split(":", 1)
        service_name = param[0]
        service_model = param[1] if len(param) > 1 else None
        if not envs:
            envs = {}
        for translator in [GoogleTranslator, BingTranslator, DeepLTranslator, DeepLXTranslator, OllamaTranslator, XinferenceTranslator, AzureOpenAITranslator,
                           OpenAITranslator, ZhipuTranslator, ModelScopeTranslator, SiliconTranslator, GeminiTranslator, AzureTranslator, TencentTranslator, DifyTranslator, AnythingLLMTranslator, ArgosTranslator, GrokTranslator, GroqTranslator, DeepseekTranslator, MiniMaxTranslator, OpenAIlikedTranslator, QwenMtTranslator, X302AITranslator]:
            if service_name == translator.name:
                self.translator = translator(lang_in, lang_out, service_model, envs=envs, prompt=prompt, ignore_cache=ignore_cache)
        if not self.translator:
            raise ValueError("Unsupported translation service")

    def receive_layout(self, ltpage: LTPage):
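        # Paragraphs
        sstk: list[str] = []            # paragraph text stack
        pstk: list[Paragraph] = []      # paragraph attribute stack
        vbkt: int = 0                   # bracket counter for formulas inside a paragraph
        # Formula group
        vstk: list[LTChar] = []         # formula characters
        vlstk: list[LTLine] = []        # formula lines
        vfix: float = 0                 # formula vertical offset
        # Formula group stacks
        var: list[list[LTChar]] = []    # stack of formula character groups
        varl: list[list[LTLine]] = []   # stack of formula line groups
        varf: list[float] = []          # stack of formula vertical offsets
        vlen: list[float] = []          # stack of formula widths
        # Globals
        lstk: list[LTLine] = []         # global line stack
        xt: LTChar = None               # previous character
        xt_cls: int = -1                # class of the previous character; -1 so the first character starts a new paragraph whatever its class
        vmax: float = ltpage.width / 4  # maximum width of an inline formula
        ops: str = ""                   # rendered result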

        def vflag(font: str, char: str):    # match formula (and sub/superscript) fonts
            if isinstance(font, bytes):     # may not be decodable, so fall back to an empty name
                try:
                    font = font.decode('utf-8')  # try UTF-8 decoding
                except UnicodeDecodeError:
                    font = ""
            font = font.split("+")[-1]      # strip the subset prefix from the font name
            if re.match(r"\(cid:", char):
                return True
            # Rule based on the font name
            if self.vfont:
                if re.match(self.vfont, font):
                    return True
            else:
                if re.match(                                            # LaTeX fonts
                    r"(CM[^R]|MS.M|XY|MT|BL|RM|EU|LA|RS|LINE|LCIRCLE|TeX-|rsfs|txsy|wasy|stmary|.*Mono|.*Code|.*Ital|.*Sym|.*Math)",
                    font,
                ):
                    return True
            # Rule based on the character set
            if self.vchar:
                if re.match(self.vchar, char):
                    return True
            else:
                if (
                    char
                    and char != " "                                     # not a space
                    and (
                        unicodedata.category(char[0])
                        in ["Lm", "Mn", "Sk", "Sm", "Zl", "Zp", "Zs"]   # modifier letters, math symbols, separators
                        or ord(char[0]) in range(0x370, 0x400)          # Greek letters
                    )
                ):
                    return True
            return False

        ############################################################
        # A. Parse the original document
        for child in ltpage:
            if isinstance(child, LTChar):
                cur_v = False
                layout = self.layout[ltpage.pageid]
                # ltpage.height may be the height inside a figure, so always use layout.shape
                h, w = layout.shape
                # Look up the layout class of the current character
                cx, cy = np.clip(int(child.x0), 0, w - 1), np.clip(int(child.y0), 0, h - 1)
                cls = layout[cy, cx]
                # Anchor the position of bullets in the document
                if child.get_text() == "•":
                    cls = 0
                # Decide whether the current character belongs to a formula
                if (
                    cls == 0                                                                                # 1. the class is a reserved region
                    or (cls == xt_cls and len(sstk[-1].strip()) > 1 and child.size < pstk[-1].size * 0.79)  # 2. sub/superscript font: scripts occur at 0.76 and capitals at 0.799, so 0.79 sits between them; also allows for drop caps
                    or vflag(child.fontname, child.get_text())                                              # 3. formula font
                    or (child.matrix[0] == 0 and child.matrix[3] == 0)                                      # 4. vertical text
                ):
                    cur_v = True
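                # Decide whether a bracket group belongs to a formula
                if not cur_v:
                    if vstk and child.get_text() == "(":
                        cur_v = True
                        vbkt += 1
                    if vbkt and child.get_text() == ")":
                        cur_v = True
                        vbkt -= 1
                if (                                                        # Decide whether the current formula ends
                    not cur_v                                               # 1. the current character is not part of a formula
                    or cls != xt_cls                                        # 2. the current and previous characters are in different paragraphs
                    # or (abs(child.x0 - xt.x0) > vmax and cls != 0)        # 3. line break inside a paragraph: either a long italic run or a fraction wrapping within the paragraph; a threshold separates the two
                    # Pure formula (code) paragraphs must not wrap; a text paragraph is reopened only once text starts, which leaves exactly two cases:
                    # A. pure formula (code) paragraph (anchored at an absolute position): sstk[-1]=="" -> sstk[-1]=="{v*}"
                    # B. paragraph starting with text (laid out at relative positions): sstk[-1]!=""
                    or (sstk[-1] != "" and abs(child.x0 - xt.x0) > vmax)    # cls==xt_cls==0 always implies sstk[-1]=="", so cls!=0 need not be checked here
                ):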
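                    if vstk:
                        if (                                                # Correct the formula's vertical offset using the text to its right
                            not cur_v                                       # 1. the current character is not part of a formula
                            and cls == xt_cls                               # 2. the current and previous characters are in the same paragraph
                            and child.x0 > max([vch.x0 for vch in vstk])    # 3. the current character is to the right of the formula
                        ):
                            vfix = vstk[0].y0 - child.y0
                        if sstk[-1] == "":
                            xt_cls = -1 # Stop a pure formula paragraph (sstk[-1]=="{v*}") from being continued; the new character must still join later ones, so the previous character's class is changed instead
                        sstk[-1] += f"{{v{len(var)}}}"
                        var.append(vstk)
                        varl.append(vlstk)
                        varf.append(vfix)
                        vstk = []
                        vlstk = []
                        vfix = 0
                # The current character is not part of a formula, or it is the first character of one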
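                if not vstk:
                    if cls == xt_cls:               # same paragraph as the previous character
                        if child.x0 > xt.x1 + 1:    # add an inline space
                            sstk[-1] += " "
                        elif child.x1 < xt.x0:      # add a line-break space and mark that the source paragraph wraps
                            sstk[-1] += " "
                            pstk[-1].brk = True
                    else:                           # start a new paragraph from the current character
                        sstk.append("")
                        pstk.append(Paragraph(child.y0, child.x0, child.x0, child.x0, child.y0, child.y1, child.size, False))
                if not cur_v:                                               # push text
                    if (                                                    # Update paragraph attributes from the current character
                        child.size > pstk[-1].size                          # 1. the current character is larger than the paragraph font
                        or len(sstk[-1].strip()) == 1                       # 2. the current character is the paragraph's second character (drop-cap case)
                    ) and child.get_text() != " ":                          # 3. the current character is not a space
                        pstk[-1].y -= child.size - pstk[-1].size            # adjust the paragraph's initial y, assuming the tops of the two differently sized characters align
                        pstk[-1].size = child.size
                    sstk[-1] += child.get_text()
                else:                                                       # push formula
                    if (                                                    # Correct the formula's vertical offset using the text to its left
                        not vstk                                            # 1. the current character is the first character of the formula
                        and cls == xt_cls                                   # 2. the current and previous characters are in the same paragraph
                        and child.x0 > xt.x0                                # 3. the previous character is to the left of the formula
                    ):
                        vfix = child.y0 - xt.y0
                    vstk.append(child)
                # Update the paragraph bounds; a formula may start right after a line break inside the paragraph, so this is done out here
                pstk[-1].x0 = min(pstk[-1].x0, child.x0)
                pstk[-1].x1 = max(pstk[-1].x1, child.x1)
                pstk[-1].y0 = min(pstk[-1].y0, child.y0)
                pstk[-1].y1 = max(pstk[-1].y1, child.y1)
                # Update the previous character
                xt = child
                xt_cls = cls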
            elif isinstance(child, LTFigure):   # figures
                pass
            elif isinstance(child, LTLine):     # lines
                layout = self.layout[ltpage.pageid]
                # ltpage.height may be the height inside a figure, so always use layout.shape
                h, w = layout.shape
                # Look up the layout class of the current line
                cx, cy = np.clip(int(child.x0), 0, w - 1), np.clip(int(child.y0), 0, h - 1)
                cls = layout[cy, cx]
                if vstk and cls == xt_cls:      # formula line
                    vlstk.append(child)
                else:                           # global line
                    lstk.append(child)
            else:
                pass
        # Handle the tail
        if vstk:    # pop the pending formula
            sstk[-1] += f"{{v{len(var)}}}"
            var.append(vstk)
            varl.append(vlstk)
            varf.append(vfix)
        log.debug("\n==========[VSTACK]==========\n")
        for id, v in enumerate(var):  # compute formula widths
            l = max([vch.x1 for vch in v]) - v[0].x0
            log.debug(f'< {l:.1f} {v[0].x0:.1f} {v[0].y0:.1f} {v[0].cid} {v[0].fontname} {len(varl[id])} > v{id} = {"".join([ch.get_text() for ch in v])}')
            vlen.append(l)

        ############################################################
        # B. Translate paragraphs
        log.debug("\n==========[SSTACK]==========\n")

        @retry(wait=wait_fixed(1))
        def worker(s: str):  # translation worker, run on the thread pool
            if not s.strip() or re.match(r"^\{v\d+\}$", s):  # skip blank strings and formula-only paragraphs
                return s
            try:
                new = self.translator.translate(s)
                return new
            except BaseException as e:
                if log.isEnabledFor(logging.DEBUG):
                    log.exception(e)
                else:
                    log.exception(e, exc_info=False)
                raise e
        with concurrent.futures.ThreadPoolExecutor(
            max_workers=self.thread
        ) as executor:
            news = list(executor.map(worker, sstk))

        ############################################################
        # C. Lay out the new document
        def raw_string(fcur: str, cstk: str):  # encode a string as hex
            if fcur == self.noto_name:
                return "".join(["%04x" % self.noto.has_glyph(ord(c)) for c in cstk])
            elif isinstance(self.fontmap[fcur], PDFCIDFont):  # code width depends on the font type
                return "".join(["%04x" % ord(c) for c in cstk])
            else:
                return "".join(["%02x" % ord(c) for c in cstk])

        # Default line height for the target language
        LANG_LINEHEIGHT_MAP = {
            "zh-cn": 1.4, "zh-tw": 1.4, "zh-hans": 1.4, "zh-hant": 1.4, "zh": 1.4,
            "ja": 1.1, "ko": 1.2, "en": 1.2, "ar": 1.0, "ru": 0.8, "uk": 0.8, "ta": 0.8
        }
        default_line_height = LANG_LINEHEIGHT_MAP.get(self.translator.lang_out.lower(), 1.1) # other languages default to 1.1
        _x, _y = 0, 0
        ops_list = []

        def gen_op_txt(font, size, x, y, rtxt):
            return f"/{font} {size:f} Tf 1 0 0 1 {x:f} {y:f} Tm [<{rtxt}>] TJ "

        def gen_op_line(x, y, xlen, ylen, linewidth):
            return f"ET q 1 0 0 1 {x:f} {y:f} cm [] 0 d 0 J {linewidth:f} w 0 0 m {xlen:f} {ylen:f} l S Q BT "

        for id, new in enumerate(news):
            x: float = pstk[id].x                       # paragraph initial x
            y: float = pstk[id].y                       # paragraph initial y
            x0: float = pstk[id].x0                     # paragraph left boundary
            x1: float = pstk[id].x1                     # paragraph right boundary
            height: float = pstk[id].y1 - pstk[id].y0   # paragraph height
            size: float = pstk[id].size                 # paragraph font size
            brk: bool = pstk[id].brk                    # paragraph line-break flag
            cstk: str = ""                              # current text buffer
            fcur: str = None                            # current font ID
            lidx = 0                                    # number of line breaks so far
            tx = x
            fcur_ = fcur
            ptr = 0
            log.debug(f"< {y} {x} {x0} {x1} {size} {brk} > {sstk[id]} | {new}")

            ops_vals: list[dict] = []

            while ptr < len(new):
                vy_regex = re.match(
                    r"\{\s*v([\d\s]+)\}", new[ptr:], re.IGNORECASE
                )  # match a {vn} formula placeholder
                mod = 0  # width of a trailing text modifier
                if vy_regex:  # load a formula
                    ptr += len(vy_regex.group(0))
                    try:
                        vid = int(vy_regex.group(1).replace(" ", ""))
                        adv = vlen[vid]
                    except Exception:
                        continue  # the translator may add an out-of-range formula placeholder on its own
                    if var[vid][-1].get_text() and unicodedata.category(var[vid][-1].get_text()[0]) in ["Lm", "Mn", "Sk"]:  # text modifier
                        mod = var[vid][-1].width
                else:  # load text
                    ch = new[ptr]
                    fcur_ = None
                    try:
                        if fcur_ is None and self.fontmap["tiro"].to_unichr(ord(ch)) == ch:
                            fcur_ = "tiro"  # default Latin font
                    except Exception:
                        pass
                    if fcur_ is None:
                        fcur_ = self.noto_name  # default non-Latin font
                    if fcur_ == self.noto_name: # FIXME: change to CONST
                        adv = self.noto.char_lengths(ch, size)[0]
                    else:
                        adv = self.fontmap[fcur_].char_width(ord(ch)) * size
                    ptr += 1
                if (                                # flush the text buffer
                    fcur_ != fcur                   # 1. the font changed
                    or vy_regex                     # 2. a formula is inserted
                    or x + adv > x1 + 0.1 * size    # 3. right boundary reached (a whole line may have been turned into symbols; allow for floating-point error)
                ):
                    if cstk:
                        ops_vals.append({
                            "type": OpType.TEXT,
                            "font": fcur,
                            "size": size,
                            "x": tx,
                            "dy": 0,
                            "rtxt": raw_string(fcur, cstk),
                            "lidx": lidx
                        })
                        cstk = ""
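                if brk and x + adv > x1 + 0.1 * size:  # right boundary reached and the source paragraph wraps
                    x = x0
                    lidx += 1
                if vy_regex:  # insert a formula
                    fix = 0
                    if fcur is not None:  # apply the vertical offset to formulas inside a paragraph
                        fix = varf[vid]
                    for vch in var[vid]:  # lay out formula characters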
                        vc = chr(vch.cid)
                        ops_vals.append({
                            "type": OpType.TEXT,
                            "font": self.fontid[vch.font],
                            "size": vch.size,
                            "x": x + vch.x0 - var[vid][0].x0,
                            "dy": fix + vch.y0 - var[vid][0].y0,
                            "rtxt": raw_string(self.fontid[vch.font], vc),
                            "lidx": lidx
                        })
                        if log.isEnabledFor(logging.DEBUG):
                            lstk.append(LTLine(0.1, (_x, _y), (x + vch.x0 - var[vid][0].x0, fix + y + vch.y0 - var[vid][0].y0)))
                            _x, _y = x + vch.x0 - var[vid][0].x0, fix + y + vch.y0 - var[vid][0].y0
                    for l in varl[vid]:  # lay out formula lines
                        if l.linewidth < 5:  # hack: some documents use thick lines as image backgrounds
                            ops_vals.append({
                                "type": OpType.LINE,
                                "x": l.pts[0][0] + x - var[vid][0].x0,
                                "dy": l.pts[0][1] + fix - var[vid][0].y0,
                                "linewidth": l.linewidth,
                                "xlen": l.pts[1][0] - l.pts[0][0],
                                "ylen": l.pts[1][1] - l.pts[0][1],
                                "lidx": lidx
                            })
                else:  # append to the text buffer
                    if not cstk:  # start of a line
                        tx = x
                        if x == x0 and ch == " ":  # drop the space left by a paragraph line break
                            adv = 0
                        else:
                            cstk += ch
                    else:
                        cstk += ch
                adv -= mod # text modifier
                fcur = fcur_
                x += adv
                if log.isEnabledFor(logging.DEBUG):
                    lstk.append(LTLine(0.1, (_x, _y), (x, y)))
                    _x, _y = x, y
            # Handle the tail
            if cstk:
                ops_vals.append({
                    "type": OpType.TEXT,
                    "font": fcur,
                    "size": size,
                    "x": tx,
                    "dy": 0,
                    "rtxt": raw_string(fcur, cstk),
                    "lidx": lidx
                })

            line_height = default_line_height

            while (lidx + 1) * size * line_height > height and line_height >= 1:
                line_height -= 0.05

            for vals in ops_vals:
                if vals["type"] == OpType.TEXT:
                    ops_list.append(gen_op_txt(vals["font"], vals["size"], vals["x"], vals["dy"] + y - vals["lidx"] * size * line_height, vals["rtxt"]))
                elif vals["type"] == OpType.LINE:
                    ops_list.append(gen_op_line(vals["x"], vals["dy"] + y - vals["lidx"] * size * line_height, vals["xlen"], vals["ylen"], vals["linewidth"]))

        for l in lstk:  # lay out global lines
            if l.linewidth < 5:  # hack: some documents use thick lines as image backgrounds
                ops_list.append(gen_op_line(l.pts[0][0], l.pts[0][1], l.pts[1][0] - l.pts[0][0], l.pts[1][1] - l.pts[0][1], l.linewidth))

        ops = f"BT {''.join(ops_list)}ET "
        return ops


class OpType(Enum):
    TEXT = "text"
    LINE = "line"

```
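
`receive_layout` replaces each formula with a `{vN}` placeholder before translation and, during layout, scans the translated string with the pattern `\{\s*v([\d\s]+)\}` to put the formulas back. The following is a minimal sketch of that scanning step alone, assuming only the regex and the skip-on-bad-index behaviour visible in the code above; `split_placeholders` and its token tuples are hypothetical and do not appear in the project.

```python
import re

# Same pattern as the layout loop: tolerates spaces a translator may add, e.g. "{ v 1 }"
PLACEHOLDER = re.compile(r"\{\s*v([\d\s]+)\}", re.IGNORECASE)


def split_placeholders(text, formula_count):
    """Split translated text into ("text", str) and ("formula", index) tokens."""
    tokens = []
    buf = ""
    ptr = 0
    while ptr < len(text):
        match = PLACEHOLDER.match(text, ptr)
        if match is None:
            buf += text[ptr]  # ordinary character: keep buffering
            ptr += 1
            continue
        ptr = match.end()
        try:
            vid = int(match.group(1).replace(" ", ""))
        except ValueError:
            continue  # malformed index: drop the placeholder
        if vid >= formula_count:
            continue  # index the translator invented: drop the placeholder
        if buf:
            tokens.append(("text", buf))
            buf = ""
        tokens.append(("formula", vid))
    if buf:
        tokens.append(("text", buf))
    return tokens


tokens = split_placeholders("Let {v0} be { v 1 } and {v9}.", formula_count=2)
```

Dropping unknown placeholders silently mirrors the `continue` in the original loop, which treats an out-of-range index as translator noise instead of an error.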

## /pdf2zh/converter_docx.py

```py path="/pdf2zh/converter_docx.py" 
"""Convert doc/docx files to PDF using LibreOffice headless."""

import logging
import shutil
import subprocess
import tempfile
from pathlib import Path

logger = logging.getLogger(__name__)

SUPPORTED_EXTENSIONS = {".doc", ".docx"}


def is_convertible(filename: str) -> bool:
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def convert_to_pdf(input_path: str) -> str:
    """Convert a doc/docx file to PDF using LibreOffice.

    Returns the path to the generated temporary PDF file.
    The caller is responsible for cleaning up the temp file.
    """
    soffice = shutil.which("soffice") or shutil.which("libreoffice")
    if not soffice:
        raise RuntimeError(
            "LibreOffice is required to convert doc/docx files. "
            "Install it with: apt-get install libreoffice-core (Linux) "
            "or brew install --cask libreoffice (macOS)"
        )

    p = Path(input_path)
    tmpdir = tempfile.mkdtemp(prefix="pdf2zh_docx_")

    result = subprocess.run(
        [soffice, "--headless", "--convert-to", "pdf", "--outdir", tmpdir, str(p)],
        capture_output=True,
        text=True,
        timeout=120,
    )

    if result.returncode != 0:
        raise RuntimeError(f"LibreOffice conversion failed: {result.stderr}")

    pdf_path = Path(tmpdir) / f"{p.stem}.pdf"
    if not pdf_path.exists():
        raise RuntimeError(f"Conversion produced no output. Expected: {pdf_path}")

    logger.info(f"Converted {p.name} -> {pdf_path}")
    return str(pdf_path)

```
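The docstring of `convert_to_pdf` leaves cleanup of the temporary PDF to the caller. One possible way to handle that is a small context manager, sketched below. `converted_pdf` and `fake_convert` are hypothetical names that are not part of the repository; the sketch assumes the converter writes into a fresh directory created with the `pdf2zh_docx_` prefix, as the source does with `tempfile.mkdtemp`.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator


@contextmanager
def converted_pdf(input_path: str, convert: Callable[[str], str]) -> Iterator[str]:
    """Yield the converted PDF path, then remove its temp directory.

    Hypothetical wrapper: `convert` is expected to behave like
    `convert_to_pdf`, i.e. write the PDF into a fresh `pdf2zh_docx_*` directory.
    """
    pdf_path = convert(input_path)
    try:
        yield pdf_path
    finally:
        tmpdir = Path(pdf_path).parent
        # Only delete directories that follow the converter's naming scheme.
        if tmpdir.name.startswith("pdf2zh_docx_"):
            shutil.rmtree(tmpdir, ignore_errors=True)


def fake_convert(input_path: str) -> str:
    """Stand-in for `convert_to_pdf` so the sketch runs without LibreOffice."""
    tmpdir = tempfile.mkdtemp(prefix="pdf2zh_docx_")
    out = Path(tmpdir) / f"{Path(input_path).stem}.pdf"
    out.write_bytes(b"%PDF-1.4\n")
    return str(out)


with converted_pdf("report.docx", fake_convert) as pdf:
    existed_inside = Path(pdf).exists()
removed_after = not Path(pdf).parent.exists()
```

In real use, `convert_to_pdf` would be passed in place of `fake_convert`. The prefix check is a deliberate safety guard so that the wrapper never removes a directory it did not expect.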

## /pdf2zh/doclayout.py

```py path="/pdf2zh/doclayout.py" 
import abc
import logging
import os

import cv2
import numpy as np
import ast
from babeldoc.assets.assets import get_doclayout_onnx_model_path

try:
    import onnx
    import onnxruntime
except ImportError as e:
    if "DLL load failed" in str(e):
        raise OSError(
            "Microsoft Visual C++ Redistributable is not installed. "
            "Download it at https://aka.ms/vs/17/release/vc_redist.x64.exe"
        ) from e
    raise

logger = logging.getLogger(__name__)

_BACKEND_PROVIDERS = {
    "cpu": ["CPUExecutionProvider"],
    "cuda": ["CUDAExecutionProvider", "CPUExecutionProvider"],
    "dml": ["DmlExecutionProvider", "CPUExecutionProvider"],
}

_preferred_backend: str | None = None


def set_backend(name: str) -> None:
    """Set the ONNX Runtime execution provider backend.

    Args:
        name: One of 'auto', 'cpu', 'cuda', 'dml'.
    """
    global _preferred_backend
    _preferred_backend = None if name == "auto" else name


class DocLayoutModel(abc.ABC):
    @staticmethod
    def load_onnx():
        model = OnnxModel.from_pretrained()
        return model

    @staticmethod
    def load_available():
        return DocLayoutModel.load_onnx()

    @property
    @abc.abstractmethod
    def stride(self) -> int:
        """Stride of the model input."""
        pass

    @abc.abstractmethod
    def predict(self, image, imgsz=1024, **kwargs) -> list:
        """
        Predict the layout of a document page.

        Args:
            image: The image of the document page.
            imgsz: Resize the image to this size. Must be a multiple of the stride.
            **kwargs: Additional arguments.
        """
        pass


class YoloResult:
    """Helper class to store detection results from ONNX model."""

    def __init__(self, boxes, names):
        self.boxes = [YoloBox(data=d) for d in boxes]
        self.boxes.sort(key=lambda x: x.conf, reverse=True)
        self.names = names


class YoloBox:
    """Helper class to store a single detection box from the ONNX model."""

    def __init__(self, data):
        self.xyxy = data[:4]
        self.conf = data[-2]
        self.cls = data[-1]


class OnnxModel(DocLayoutModel):
    def __init__(self, model_path: str):
        model_path = str(model_path)
        self.model_path = model_path

        # Extract metadata without full model deserialization
        model = onnx.load(model_path, load_external_data=False)
        metadata = {d.key: d.value for d in model.metadata_props}
        self._stride = ast.literal_eval(metadata["stride"])
        self._names = ast.literal_eval(metadata["names"])
        del model  # free memory before creating session

        sess_options = onnxruntime.SessionOptions()
        sess_options.graph_optimization_level = (
            onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL
        )

        if _preferred_backend and _preferred_backend in _BACKEND_PROVIDERS:
            providers = _BACKEND_PROVIDERS[_preferred_backend]
        else:
            providers = onnxruntime.get_available_providers()

        # Providers like CoreML generate compiled nodes that cannot be
        # serialized, so only cache the optimized graph when none of them is in use.
        compiled_providers = {"CoreMLExecutionProvider", "TensorrtExecutionProvider"}
        can_cache = not compiled_providers.intersection(providers)
        if can_cache:
            optimized_path = model_path + ".optimized"
            if os.path.exists(optimized_path):
                model_path = optimized_path
            else:
                sess_options.optimized_model_filepath = optimized_path

        self.model = onnxruntime.InferenceSession(
            model_path, sess_options, providers=providers
        )
        logger.info("ONNX Runtime providers: %s", self.model.get_providers())

    @staticmethod
    def from_pretrained():
        pth = get_doclayout_onnx_model_path()
        return OnnxModel(pth)

    @property
    def stride(self):
        return self._stride

    def resize_and_pad_image(self, image, new_shape):
        """
        Resize and pad the image to the specified size, ensuring dimensions are multiples of stride.

        Parameters:
        - image: Input image
        - new_shape: Target size (integer or (height, width) tuple)

        Returns:
        - Processed image
        """
        if isinstance(new_shape, int):
            new_shape = (new_shape, new_shape)

        h, w = image.shape[:2]
        new_h, new_w = new_shape

        # Calculate scaling ratio
        r = min(new_h / h, new_w / w)
        resized_h, resized_w = int(round(h * r)), int(round(w * r))

        # Resize image
        image = cv2.resize(
            image, (resized_w, resized_h), interpolation=cv2.INTER_LINEAR
        )

        # Calculate padding size and align to stride multiple
        pad_w = (new_w - resized_w) % self.stride
        pad_h = (new_h - resized_h) % self.stride
        top, bottom = pad_h // 2, pad_h - pad_h // 2
        left, right = pad_w // 2, pad_w - pad_w // 2

        # Add padding
        image = cv2.copyMakeBorder(
            image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114)
        )

        return image

    def scale_boxes(self, img1_shape, boxes, img0_shape):
        """
        Rescales bounding boxes (in the format of xyxy by default) from the shape of the image they were originally
        specified in (img1_shape) to the shape of a different image (img0_shape).

        Args:
            img1_shape (tuple): The shape of the image that the bounding boxes are for,
                in the format of (height, width).
            boxes (np.ndarray): the bounding boxes of the objects in the image, in the format of (x1, y1, x2, y2)
            img0_shape (tuple): the shape of the target image, in the format of (height, width).

        Returns:
            boxes (np.ndarray): The scaled bounding boxes, in the format of (x1, y1, x2, y2)
        """

        # Calculate scaling ratio
        gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])

        # Calculate padding size
        pad_x = round((img1_shape[1] - img0_shape[1] * gain) / 2 - 0.1)
        pad_y = round((img1_shape[0] - img0_shape[0] * gain) / 2 - 0.1)

        # Remove padding and scale boxes
        boxes[..., :4] = (boxes[..., :4] - [pad_x, pad_y, pad_x, pad_y]) / gain
        return boxes

    def predict(self, image, imgsz=1024, **kwargs):
        # Preprocess input image
        orig_h, orig_w = image.shape[:2]
        pix = self.resize_and_pad_image(image, new_shape=imgsz)
        pix = np.transpose(pix, (2, 0, 1))  # CHW
        pix = np.expand_dims(pix, axis=0)  # BCHW
        pix = pix.astype(np.float32) / 255.0  # Normalize to [0, 1]
        new_h, new_w = pix.shape[2:]

        # Run inference
        preds = self.model.run(None, {"images": pix})[0]

        # Postprocess predictions
        preds = preds[preds[..., 4] > 0.25]
        preds[..., :4] = self.scale_boxes(
            (new_h, new_w), preds[..., :4], (orig_h, orig_w)
        )
        return [YoloResult(boxes=preds, names=self._names)]


class ModelInstance:
    value: OnnxModel | None = None

```
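The geometry in `resize_and_pad_image` and `scale_boxes` can be checked without OpenCV or a model. The sketch below reproduces only the arithmetic from the source in plain Python. The function names `letterbox_geometry` and `unletterbox_box` are hypothetical, and the page size (792 x 612) and stride (32) are assumed example values, not values read from the model metadata.

```python
def letterbox_geometry(orig_hw, target, stride):
    """Return (resized_hw, padded_hw) using the same arithmetic as the source."""
    h, w = orig_hw
    new_h, new_w = (target, target) if isinstance(target, int) else target
    r = min(new_h / h, new_w / w)  # aspect-preserving scale
    resized_h, resized_w = int(round(h * r)), int(round(w * r))
    # Pad only up to the next stride multiple, not up to the full target size.
    pad_h = (new_h - resized_h) % stride
    pad_w = (new_w - resized_w) % stride
    return (resized_h, resized_w), (resized_h + pad_h, resized_w + pad_w)


def unletterbox_box(box, padded_hw, orig_hw):
    """Map one (x1, y1, x2, y2) box from padded-image to original-image coordinates."""
    gain = min(padded_hw[0] / orig_hw[0], padded_hw[1] / orig_hw[1])
    pad_x = round((padded_hw[1] - orig_hw[1] * gain) / 2 - 0.1)
    pad_y = round((padded_hw[0] - orig_hw[0] * gain) / 2 - 0.1)
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / gain, (y1 - pad_y) / gain,
            (x2 - pad_x) / gain, (y2 - pad_y) / gain)


orig_hw = (792, 612)                      # assumed page pixmap size (height, width)
imgsz = int(orig_hw[0] / 32) * 32         # same rounding as the caller in high_level.py
resized_hw, padded_hw = letterbox_geometry(orig_hw, imgsz, stride=32)
restored = unletterbox_box((7, 0, 600, 768), padded_hw, orig_hw)
```

With these inputs the image is resized to 768 x 593 and padded to 768 x 608, so the width becomes a multiple of 32 without being padded all the way to 768.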

## /pdf2zh/high_level.py

```py path="/pdf2zh/high_level.py" 
"""Functions that can be used for the most common use-cases for pdf2zh"""

import asyncio
import io
import os
import re
import sys
import tempfile
import logging
from asyncio import CancelledError
from pathlib import Path
from string import Template
from typing import Any, BinaryIO, List, Optional, Dict

import numpy as np
import requests
import tqdm

from pdf2zh.converter_docx import convert_to_pdf, is_convertible
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfexceptions import PDFValueError
from pdfminer.pdfinterp import PDFResourceManager
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfparser import PDFParser
from pymupdf import Document, Font

from pdf2zh.converter import TranslateConverter
from pdf2zh.doclayout import OnnxModel
from pdf2zh.pdfinterp import PDFPageInterpreterEx

from pdf2zh.config import ConfigManager
from babeldoc.assets.assets import get_font_and_metadata

NOTO_NAME = "noto"

logger = logging.getLogger(__name__)

noto_list = [
    "am",  # Amharic
    "ar",  # Arabic
    "bn",  # Bengali
    "bg",  # Bulgarian
    "chr",  # Cherokee
    "el",  # Greek
    "gu",  # Gujarati
    "iw",  # Hebrew
    "hi",  # Hindi
    "kn",  # Kannada
    "ml",  # Malayalam
    "mr",  # Marathi
    "ru",  # Russian
    "sr",  # Serbian
    "ta",  # Tamil
    "te",  # Telugu
    "th",  # Thai
    "ur",  # Urdu
    "uk",  # Ukrainian
]


def check_files(files: List[str]) -> List[str]:
    # Exclude online files (http/https); only local paths are checked
    files = [f for f in files if not f.startswith(("http://", "https://"))]
    missing_files = [file for file in files if not os.path.exists(file)]
    return missing_files


def translate_patch(
    inf: BinaryIO,
    pages: Optional[list[int]] = None,
    vfont: str = "",
    vchar: str = "",
    thread: int = 0,
    doc_zh: Document = None,
    lang_in: str = "",
    lang_out: str = "",
    service: str = "",
    noto_name: str = "",
    noto: Font = None,
    callback: object = None,
    cancellation_event: asyncio.Event = None,
    model: OnnxModel = None,
    envs: Dict = None,
    prompt: Template = None,
    ignore_cache: bool = False,
    **kwarg: Any,
) -> dict:
    rsrcmgr = PDFResourceManager()
    layout = {}
    device = TranslateConverter(
        rsrcmgr,
        vfont,
        vchar,
        thread,
        layout,
        lang_in,
        lang_out,
        service,
        noto_name,
        noto,
        envs,
        prompt,
        ignore_cache,
    )

    assert device is not None
    obj_patch = {}
    interpreter = PDFPageInterpreterEx(rsrcmgr, device, obj_patch)
    if pages:
        total_pages = len(pages)
    else:
        total_pages = doc_zh.page_count

    parser = PDFParser(inf)
    doc = PDFDocument(parser)
    with tqdm.tqdm(total=total_pages) as progress:
        for pageno, page in enumerate(PDFPage.create_pages(doc)):
            if cancellation_event and cancellation_event.is_set():
                raise CancelledError("task cancelled")
            if pages and (pageno not in pages):
                continue
            progress.update()
            if callback:
                callback(progress)
            page.pageno = pageno
            pix = doc_zh[page.pageno].get_pixmap()
            image = np.frombuffer(pix.samples, np.uint8).reshape(
                pix.height, pix.width, 3
            )[:, :, ::-1]
            page_layout = model.predict(image, imgsz=int(pix.height / 32) * 32)[0]
            # A k-d tree is not practical here; render the layout into an image instead, trading space for time
            box = np.ones((pix.height, pix.width))
            h, w = box.shape
            vcls = ["abandon", "figure", "table", "isolate_formula", "formula_caption"]
            for i, d in enumerate(page_layout.boxes):
                if page_layout.names[int(d.cls)] not in vcls:
                    x0, y0, x1, y1 = d.xyxy.squeeze()
                    x0, y0, x1, y1 = (
                        np.clip(int(x0 - 1), 0, w - 1),
                        np.clip(int(h - y1 - 1), 0, h - 1),
                        np.clip(int(x1 + 1), 0, w - 1),
                        np.clip(int(h - y0 + 1), 0, h - 1),
                    )
                    box[y0:y1, x0:x1] = i + 2
            for i, d in enumerate(page_layout.boxes):
                if page_layout.names[int(d.cls)] in vcls:
                    x0, y0, x1, y1 = d.xyxy.squeeze()
                    x0, y0, x1, y1 = (
                        np.clip(int(x0 - 1), 0, w - 1),
                        np.clip(int(h - y1 - 1), 0, h - 1),
                        np.clip(int(x1 + 1), 0, w - 1),
                        np.clip(int(h - y0 + 1), 0, h - 1),
                    )
                    box[y0:y1, x0:x1] = 0
            layout[page.pageno] = box
            # Create a new xref to hold the new instruction stream
            page.page_xref = doc_zh.get_new_xref()  # hack: insert a new xref for the page
            doc_zh.update_object(page.page_xref, "<<>>")
            doc_zh.update_stream(page.page_xref, b"")
            doc_zh[page.pageno].set_contents(page.page_xref)
            interpreter.process_page(page)

    device.close()
    return obj_patch


def translate_stream(
    stream: bytes,
    pages: Optional[list[int]] = None,
    lang_in: str = "",
    lang_out: str = "",
    service: str = "",
    thread: int = 0,
    vfont: str = "",
    vchar: str = "",
    callback: object = None,
    cancellation_event: asyncio.Event = None,
    model: OnnxModel = None,
    envs: Dict = None,
    prompt: Template = None,
    skip_subset_fonts: bool = False,
    ignore_cache: bool = False,
    **kwarg: Any,
):
    font_list = [("tiro", None)]

    font_path = download_remote_fonts(lang_out.lower())
    noto_name = NOTO_NAME
    noto = Font(noto_name, font_path)
    font_list.append((noto_name, font_path))

    doc_en = Document(stream=stream)
    stream = io.BytesIO()
    doc_en.save(stream)
    doc_zh = Document(stream=stream)
    page_count = doc_zh.page_count
    # font_list = [("GoNotoKurrent-Regular.ttf", font_path), ("tiro", None)]
    font_id = {}
    for page in doc_zh:
        for font in font_list:
            font_id[font[0]] = page.insert_font(font[0], font[1])
    xreflen = doc_zh.xref_length()
    for xref in range(1, xreflen):
        for label in ["Resources/", ""]:  # resources may be xobj-based
            try:  # xref reads/writes may fail
                font_res = doc_zh.xref_get_key(xref, f"{label}Font")
                target_key_prefix = f"{label}Font/"
                if font_res[0] == "xref":
                    resource_xref_id = re.search("(\\d+) 0 R", font_res[1]).group(1)
                    xref = int(resource_xref_id)
                    font_res = ("dict", doc_zh.xref_object(xref))
                    target_key_prefix = ""

                if font_res[0] == "dict":
                    for font in font_list:
                        target_key = f"{target_key_prefix}{font[0]}"
                        font_exist = doc_zh.xref_get_key(xref, target_key)
                        if font_exist[0] == "null":
                            doc_zh.xref_set_key(
                                xref,
                                target_key,
                                f"{font_id[font[0]]} 0 R",
                            )
            except Exception:
                pass

    fp = io.BytesIO()

    doc_zh.save(fp)
    obj_patch: dict = translate_patch(fp, **locals())

    for obj_id, ops_new in obj_patch.items():
        # ops_old=doc_en.xref_stream(obj_id)
        # print(obj_id)
        # print(ops_old)
        # print(ops_new.encode())
        doc_zh.update_stream(obj_id, ops_new.encode())

    doc_en.insert_file(doc_zh)
    for id in range(page_count):
        doc_en.move_page(page_count + id, id * 2 + 1)
    if not skip_subset_fonts:
        doc_zh.subset_fonts(fallback=True)
        doc_en.subset_fonts(fallback=True)
    return (
        doc_zh.write(deflate=True, garbage=3, use_objstms=1),
        doc_en.write(deflate=True, garbage=3, use_objstms=1),
    )


def convert_to_pdfa(input_path, output_path):
    """
    Convert PDF to PDF/A format

    Args:
        input_path: Path to source PDF file
        output_path: Path to save PDF/A file
    """
    from pikepdf import Dictionary, Name, Pdf

    # Open the PDF file
    pdf = Pdf.open(input_path)

    # Add PDF/A conformance metadata
    metadata = {
        "pdfa_part": "2",
        "pdfa_conformance": "B",
        "title": pdf.docinfo.get("/Title", ""),
        "author": pdf.docinfo.get("/Author", ""),
        "creator": "PDF Math Translate",
    }

    with pdf.open_metadata() as meta:
        meta.load_from_docinfo(pdf.docinfo)
        meta["pdfaid:part"] = metadata["pdfa_part"]
        meta["pdfaid:conformance"] = metadata["pdfa_conformance"]

    # Create OutputIntent dictionary
    output_intent = Dictionary(
        {
            "/Type": Name("/OutputIntent"),
            "/S": Name("/GTS_PDFA1"),
            "/OutputConditionIdentifier": "sRGB IEC61966-2.1",
            "/RegistryName": "http://www.color.org",
            "/Info": "sRGB IEC61966-2.1",
        }
    )

    # Add output intent to PDF root
    if "/OutputIntents" not in pdf.Root:
        pdf.Root.OutputIntents = [output_intent]
    else:
        pdf.Root.OutputIntents.append(output_intent)

    # Save as PDF/A
    pdf.save(output_path, linearize=True)
    pdf.close()


def translate(
    files: list[str],
    output: str = "",
    pages: Optional[list[int]] = None,
    lang_in: str = "",
    lang_out: str = "",
    service: str = "",
    thread: int = 0,
    vfont: str = "",
    vchar: str = "",
    callback: object = None,
    compatible: bool = False,
    cancellation_event: asyncio.Event = None,
    model: OnnxModel = None,
    envs: Dict = None,
    prompt: Template = None,
    skip_subset_fonts: bool = False,
    ignore_cache: bool = False,
    **kwarg: Any,
):
    if not files:
        raise PDFValueError("No files to process.")

    missing_files = check_files(files)

    if missing_files:
        print("The following files do not exist:", file=sys.stderr)
        for file in missing_files:
            print(f"  {file}", file=sys.stderr)
        raise PDFValueError("Some files do not exist.")

    result_files = []

    for file in files:
        if isinstance(file, str) and file.startswith(("http://", "https://")):
            print("Online files detected, downloading...")
            try:
                r = requests.get(file, allow_redirects=True)
                if r.status_code == 200:
                    with tempfile.NamedTemporaryFile(
                        suffix=".pdf", delete=False
                    ) as tmp_file:
                        print(f"Writing the file: {file}...")
                        tmp_file.write(r.content)
                        file = tmp_file.name
                else:
                    r.raise_for_status()
            except Exception as e:
                raise PDFValueError(
                    f"An error occurred while downloading the PDF file. Please check the link(s).\nError:\n{e}"
                ) from e

        # Convert doc/docx to PDF if needed; output names keep the original base name
        filename = os.path.splitext(os.path.basename(file))[0]
        if is_convertible(file):
            file = convert_to_pdf(file)

        # If the commandline has specified converting to PDF/A format
        # --compatible / -cp
        if compatible:
            with tempfile.NamedTemporaryFile(
                suffix="-pdfa.pdf", delete=False
            ) as tmp_pdfa:
                print(f"Converting {file} to PDF/A format...")
                convert_to_pdfa(file, tmp_pdfa.name)
                doc_raw = open(tmp_pdfa.name, "rb")
                os.unlink(tmp_pdfa.name)
        else:
            doc_raw = open(file, "rb")
        s_raw = doc_raw.read()
        doc_raw.close()

        temp_dir = Path(tempfile.gettempdir())
        file_path = Path(file)
        try:
            if file_path.exists() and file_path.resolve().is_relative_to(
                temp_dir.resolve()
            ):
                file_path.unlink(missing_ok=True)
                logger.debug(f"Cleaned temp file: {file_path}")
        except Exception:
            logger.warning(f"Failed to clean temp file {file_path}", exc_info=True)

        s_mono, s_dual = translate_stream(
            s_raw,
            **locals(),
        )
        file_mono = Path(output) / f"{filename}-mono.pdf"
        file_dual = Path(output) / f"{filename}-dual.pdf"
        with open(file_mono, "wb") as doc_mono:
            doc_mono.write(s_mono)
        with open(file_dual, "wb") as doc_dual:
            doc_dual.write(s_dual)
        result_files.append((str(file_mono), str(file_dual)))

    return result_files


def download_remote_fonts(lang: str):
    lang = lang.lower()
    LANG_NAME_MAP = {
        **{la: "GoNotoKurrent-Regular.ttf" for la in noto_list},
        **{
            la: f"SourceHanSerif{region}-Regular.ttf"
            for region, langs in {
                "CN": ["zh-cn", "zh-hans", "zh"],
                "TW": ["zh-tw", "zh-hant"],
                "JP": ["ja"],
                "KR": ["ko"],
            }.items()
            for la in langs
        },
    }
    font_name = LANG_NAME_MAP.get(lang, "GoNotoKurrent-Regular.ttf")

    # docker
    font_path = ConfigManager.get("NOTO_FONT_PATH", Path("/app", font_name).as_posix())
    if not Path(font_path).exists():
        font_path, _ = get_font_and_metadata(font_name)
        font_path = font_path.as_posix()

    logger.info(f"use font: {font_path}")

    return font_path

```
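In `translate_stream`, the dual-language PDF is built by appending the translated pages to the original document and then moving each translated page to sit right after its source page. The sketch below emulates that reordering on a plain list. It assumes that `move_page(pno, to)` places page `pno` in front of index `to`, and `interleave_like_move_page` is a hypothetical name used only for illustration.

```python
def interleave_like_move_page(page_count: int) -> list[str]:
    """Emulate the page reordering loop from translate_stream on a list."""
    # State after insert_file: originals first, translations appended.
    pages = [f"orig{i}" for i in range(page_count)]
    pages += [f"trans{i}" for i in range(page_count)]
    for i in range(page_count):
        # Emulates move_page(page_count + i, i * 2 + 1): take the i-th
        # translated page and put it directly after the i-th original.
        page = pages.pop(page_count + i)
        pages.insert(i * 2 + 1, page)
    return pages


order = interleave_like_move_page(3)
```

The result alternates original and translated pages, which is the order a reader sees in the `-dual.pdf` output.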

## /pdf2zh/kernel/__init__.py

```py path="/pdf2zh/kernel/__init__.py" 
"""Kernel package — hot-pluggable translation kernel registry."""

from pdf2zh.kernel.registry import KernelRegistry
from pdf2zh.kernel.legacy import LegacyKernel
from pdf2zh.kernel.precise import PreciseKernel

# Always register both kernels.
# PreciseKernel.is_available() returns False if submodule/venv not set up.
KernelRegistry.register(LegacyKernel())
KernelRegistry.register(PreciseKernel())

__all__ = ["KernelRegistry"]

```

## /pdf2zh/kernel/legacy.py

```py path="/pdf2zh/kernel/legacy.py" 
"""Fast kernel adapter — wraps existing pdf2zh.high_level.translate()."""

from __future__ import annotations

import asyncio
import logging
from pathlib import Path
from typing import Any, Optional

from pdf2zh.kernel.protocol import TranslateRequest, TranslateResult

logger = logging.getLogger(__name__)


class LegacyKernel:
    """Kernel adapter for the original pdf2zh translation pipeline (fast mode)."""

    @property
    def name(self) -> str:
        return "fast"

    @property
    def version(self) -> str:
        from pdf2zh import __version__

        return __version__

    def is_available(self) -> bool:
        return True

    def translate(
        self,
        request: TranslateRequest,
        callback: Any = None,
        cancellation_event: Optional[asyncio.Event] = None,
    ) -> list[TranslateResult]:
        from pdf2zh.doclayout import ModelInstance, OnnxModel
        from pdf2zh.high_level import translate

        # Ensure model is loaded
        if ModelInstance.value is None:
            ModelInstance.value = OnnxModel.load_available()

        # Build kwargs matching high_level.translate() signature
        kwargs: dict[str, Any] = {
            "files": request.files,
            "output": request.output,
            "lang_in": request.lang_in,
            "lang_out": request.lang_out,
            "service": request.service,
            "thread": request.thread,
            "vfont": request.vfont,
            "vchar": request.vchar,
            "callback": callback,
            "cancellation_event": cancellation_event,
            "model": ModelInstance.value,
            "envs": request.envs or {},
            "skip_subset_fonts": request.skip_subset_fonts,
            "ignore_cache": request.ignore_cache,
            "compatible": request.compatible,
        }

        if request.pages and isinstance(request.pages, list):
            kwargs["pages"] = request.pages

        if request.prompt:
            from string import Template

            kwargs["prompt"] = Template(request.prompt)

        result_files = translate(**kwargs)

        results = []
        for mono_path, dual_path in result_files:
            results.append(
                TranslateResult(
                    mono_pdf=Path(mono_path),
                    dual_pdf=Path(dual_path),
                )
            )
        return results

    async def translate_async(
        self,
        request: TranslateRequest,
        callback: Any = None,
        cancellation_event: Optional[asyncio.Event] = None,
    ) -> list[TranslateResult]:
        return await asyncio.to_thread(
            self.translate, request, callback, cancellation_event
        )

```
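`translate_async` above hands the blocking `translate` call to a worker thread with `asyncio.to_thread`, so the event loop stays responsive while a document is processed. The following is a minimal sketch of that pattern in isolation. `blocking_translate` and `heartbeat` are hypothetical stand-ins and do not call any pdf2zh code.

```python
import asyncio
import time


def blocking_translate(label: str) -> str:
    """Hypothetical stand-in for a blocking, synchronous translate() call."""
    time.sleep(0.05)
    return f"{label}-done"


async def heartbeat(ticks: list) -> None:
    # Runs on the event loop while the worker thread is busy.
    for _ in range(3):
        ticks.append("tick")
        await asyncio.sleep(0.01)


async def main():
    ticks: list = []
    # The blocking call runs in a thread; the heartbeat runs concurrently.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_translate, "paper.pdf"),
        heartbeat(ticks),
    )
    return result, ticks


result, ticks = asyncio.run(main())
```

`asyncio.to_thread` requires Python 3.9 or newer.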

## /pdf2zh/kernel/protocol.py

```py path="/pdf2zh/kernel/protocol.py" 
"""Kernel protocol — unified interface for translation kernels."""

from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Optional, Protocol, runtime_checkable


@dataclass
class TranslateRequest:
    """Unified translation request bridging CLI args to kernel-specific config."""

    files: list[str]
    lang_in: str = "en"
    lang_out: str = "zh"
    service: str = "google"
    pages: Optional[list[int] | str] = None
    output: str = ""
    thread: int = 4
    vfont: str = ""
    vchar: str = ""
    prompt: Optional[str] = None
    envs: Optional[dict] = field(default_factory=dict)
    debug: bool = False
    skip_subset_fonts: bool = False
    ignore_cache: bool = False
    compatible: bool = False


@dataclass
class TranslateResult:
    """Unified result from either kernel."""

    mono_pdf: Optional[Path | bytes] = None
    dual_pdf: Optional[Path | bytes] = None
    time_cost: float = 0.0


@runtime_checkable
class KernelProtocol(Protocol):
    """What every kernel must implement."""

    @property
    def name(self) -> str: ...

    @property
    def version(self) -> str: ...

    def translate(
        self,
        request: TranslateRequest,
        callback: Any = None,
        cancellation_event: Optional[asyncio.Event] = None,
    ) -> list[TranslateResult]: ...

    async def translate_async(
        self,
        request: TranslateRequest,
        callback: Any = None,
        cancellation_event: Optional[asyncio.Event] = None,
    ) -> list[TranslateResult]: ...

    def is_available(self) -> bool: ...

```
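Because `KernelProtocol` is decorated with `@runtime_checkable`, any object can be tested against it with `isinstance`, without inheriting from it. The sketch below shows the mechanism on a reduced protocol. `KernelShape`, `EchoKernel`, and `NotAKernel` are hypothetical names, and the reduced protocol keeps only two of the members listed in the source.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class KernelShape(Protocol):
    """Reduced, hypothetical version of the kernel protocol for illustration."""

    @property
    def name(self) -> str: ...

    def is_available(self) -> bool: ...


class EchoKernel:
    """Satisfies the protocol structurally; no inheritance is involved."""

    @property
    def name(self) -> str:
        return "echo"

    def is_available(self) -> bool:
        return True


class NotAKernel:
    """Has neither `name` nor `is_available`."""


accepted = isinstance(EchoKernel(), KernelShape)
rejected = isinstance(NotAKernel(), KernelShape)
```

A runtime check of this kind only verifies that the members exist; it does not validate signatures or return types, so a static type checker is still useful for catching mismatched method signatures.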

## /script/Dockerfile.China

```dockerfile path="/script/Dockerfile.China" 
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim

WORKDIR /app


EXPOSE 7860

ENV PYTHONUNBUFFERED=1
ADD "https://ghgo.xyz/https://github.com/satbyy/go-noto-universal/releases/download/v7.0/GoNotoKurrent-Regular.ttf" /app
RUN apt-get update && \
     apt-get install --no-install-recommends -y libgl1 && \
     rm -rf /var/lib/apt/lists/* && uv pip install --system --no-cache huggingface-hub && \
     python3 -c "from huggingface_hub import hf_hub_download; hf_hub_download('wybxc/DocLayout-YOLO-DocStructBench-onnx','doclayout_yolo_docstructbench_imgsz1024.onnx');"

COPY . .

RUN uv pip install --system --no-cache .

CMD ["pdf2zh", "-i"]

```

## /setup.cfg

```cfg path="/setup.cfg" 
[flake8]
max-line-length = 120
ignore = E203,E261,E501,W503,E741
exclude = .git,build,dist,docs
```

## /test/file/translate.cli.font.unknown.pdf

Binary file available at https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/test/file/translate.cli.font.unknown.pdf

