Playground

Configure extraction settings, choose output formats, and generate ready-to-use commands for CLI, Docker, and Apify. Open the homepage to start.

Extract from HTML

HTML input is optional. Paste or upload raw HTML to preview extraction results alongside the generated commands. Without HTML, the playground generates commands only.

  • Paste HTML into the input field on the homepage, or upload an HTML file
  • Select output formats and configure extraction settings
  • Press Run to extract content and generate commands

Upload an HTML file

  • Click the Upload button in the toolbar above the input field
  • Select an HTML file from your computer
  • The file content fills the input field automatically

Generate commands without HTML

You can generate CLI, Docker, and Apify commands without providing HTML input:

  • Select the command types you want (CLI, Docker, Apify CLI, Apify JSON)
  • Configure extraction settings as needed
  • Press Run to generate commands targeting a placeholder URL

Install the Python package

Contextractor is available as a Python package on PyPI:

  • Install via pip: pip install contextractor
  • Or via pipx: pipx install contextractor
  • Requires Python 3.9+
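
Before installing, it can help to confirm that the interpreter meets the 3.9 minimum. The following is a minimal sketch. The helper name `meets_minimum` is illustrative and is not part of Contextractor.

```python
import sys

MINIMUM = (3, 9)  # minimum Python version stated in the requirements above


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} meets the 3.9+ requirement")
    else:
        print(f"Python {current} is too old; 3.9 or newer is required")
```

If the check fails, install a newer interpreter before running pip or pipx.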

Output formats

The following output formats are available:

  • Plain text — clean text with no markup
  • Markdown — preserves headings, lists, links, and basic formatting
  • JSON — structured output with metadata (title, author, date, etc.)
  • XML — standard XML output
  • XML-TEI — TEI-compliant XML for academic and archival use
  • JSONL — newline-delimited JSON, one object per page (CLI only, for batch/pipeline use)
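
Because JSONL holds one JSON object per line, batch output can be processed line by line with the Python standard library. The following is a minimal sketch. The sample data and the field names (`url`, `title`, `text`) are assumptions for illustration, not Contextractor's documented schema.

```python
import io
import json


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines rather than failing on them
            yield json.loads(line)


# Hypothetical sample standing in for a real output file; actual fields may differ.
sample = io.StringIO(
    '{"url": "https://example.com/a", "title": "Page A", "text": "First page"}\n'
    '{"url": "https://example.com/b", "title": "Page B", "text": "Second page"}\n'
)

pages = list(read_jsonl(sample))
for page in pages:
    print(page["title"], "-", page["url"])
```

With a real file, pass `open(path, encoding="utf-8")` in place of the `StringIO` sample. Reading lazily, one line at a time, keeps memory use flat for large batches.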

Download or copy

After extraction, you can:

  • Copy the result to your clipboard
  • Download the extracted content as a file

Updated: April 14, 2026