Playground
Configure extraction settings, choose output formats, and generate ready-to-use commands for CLI, Docker, and Apify. Open the homepage to start.
Extract from HTML
HTML input is optional. Paste raw HTML into the input field to preview extraction results alongside generated commands. Without HTML, the playground generates commands only.
- Paste or upload HTML into the input field on the homepage
- Select output formats and configure extraction settings
- Press Run to extract content and generate commands
Upload an HTML file
- Click the Upload button in the toolbar above the input field
- Select an HTML file from your computer
- The file content fills the input field automatically
Generate commands without HTML
You can generate CLI, Docker, and Apify commands without providing HTML input:
- Select the command types you want (CLI, Docker, Apify CLI, Apify JSON)
- Configure extraction settings as needed
- Press Run to generate commands targeting a placeholder URL
Install the Python package
Contextractor is available as a Python package on PyPI:
- Install via pip:
  pip install contextractor
- Or via pipx:
  pipx install contextractor
- Requires Python 3.9+
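To confirm the installation from Python, something like the following may help. It is a minimal sketch that uses only the standard library and assumes nothing about Contextractor's own API. The helper names `python_is_supported` and `installed_version` are hypothetical and not part of the package.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

MIN_PYTHON = (3, 9)  # minimum Python version stated in the install notes


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the interpreter meets the stated 3.9+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package: str = "contextractor") -> Optional[str]:
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("contextractor version:", installed_version() or "not installed")
```

Note that a pipx install lives in its own isolated environment, so this check would likely report "not installed" when run from a different interpreter.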
Output formats
Choose your preferred output format:
- Plain text — clean text with no markup
- Markdown — preserves headings, lists, links, and basic formatting
- JSON — structured output with metadata (title, author, date, etc.)
- XML — standard XML output
- XML-TEI — TEI-compliant XML for academic and archival use
- JSONL — newline-delimited JSON, one object per page (CLI only, for batch/pipeline use)
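Because JSONL stores one JSON object per line, batch output can be consumed line by line without loading the whole file. The sketch below assumes each object carries a `title` key, like the JSON metadata listed above. The sample records and the `read_jsonl` helper are made up for illustration, and the real field names may differ.

```python
import io
import json
from typing import Iterator, TextIO


def read_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Hypothetical sample data; real output fields may differ.
SAMPLE = (
    '{"title": "Page one", "text": "First page body"}\n'
    '{"title": "Page two", "text": "Second page body"}\n'
)

if __name__ == "__main__":
    for page in read_jsonl(io.StringIO(SAMPLE)):
        print(page.get("title"))
```

To read a file produced by a batch run, pass an open file handle instead of the `io.StringIO` wrapper.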
Download or copy
After extraction, you can:
- Copy the result to your clipboard
- Download the extracted content as a file
Updated: April 14, 2026