ArchiveBox
ArchiveBox is a powerful, self-hosted, open-source solution for collecting, saving, and viewing websites offline.
Description
ArchiveBox is an open-source, self-hosted web archiving tool that lets users save and manage web pages offline. It accepts input from various sources, including URLs, browser history, bookmarks, and services like Pocket and Pinboard. ArchiveBox saves web content in multiple redundant formats (HTML, PDF, screenshots, etc.) to support long-term accessibility and preservation. It offers a command-line interface (CLI) for management, a self-hosted web application for visual browsing and administration, and uses standard, easily readable file formats. The project is actively maintained, with comprehensive documentation and a supportive community.
Features
Key features include free and open-source licensing, a powerful command-line interface, and comprehensive documentation. ArchiveBox can extract various content types (media, articles, code, etc.), supports scheduled and real-time importing from various sources, and stores everything in durable, long-term formats (HTML, PDF, JSON, etc.). It is usable as a CLI, web app, Python API, or desktop app. Advanced features include support for archiving content that requires logins, cookies, or paywalls, with careful attention to security considerations. Future development plans include enhanced JS support for ad-blocking, autoscrolling, and more.
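As a rough illustration of the command-line interface mentioned above, the following Python sketch builds an `archivebox add` command and, by default, only returns it as a string without running anything. It assumes ArchiveBox is installed, that the command is run from inside a data directory already set up with `archivebox init`, and that the installed version accepts `--depth` values of 0 or 1; the helper names `build_add_command` and `archive` are hypothetical and not part of ArchiveBox.

```python
import shlex
import shutil
import subprocess
from typing import List


def build_add_command(url: str, depth: int = 0) -> List[str]:
    """Return the argv for `archivebox add` without running it.

    Assumption: the installed ArchiveBox version accepts --depth=0
    (just this URL) or --depth=1 (also the pages it links to).
    """
    if depth not in (0, 1):
        raise ValueError("depth must be 0 or 1")
    return ["archivebox", "add", f"--depth={depth}", url]


def archive(url: str, depth: int = 0, dry_run: bool = True) -> str:
    """Return the shell-quoted command; execute it only if dry_run is False."""
    argv = build_add_command(url, depth)
    if not dry_run:
        # Fail early with a clear message if the CLI is not installed.
        if shutil.which("archivebox") is None:
            raise RuntimeError("archivebox is not on PATH")
        # Assumption: the current working directory is an initialized
        # ArchiveBox data directory.
        subprocess.run(argv, check=True)
    return shlex.join(argv)


if __name__ == "__main__":
    # Dry run: shows the command that would be executed.
    print(archive("https://example.com", depth=1))
```

Keeping the command construction separate from execution makes it easy to inspect what would run before anything is archived.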
Benefits
- Ensures long-term preservation of web content by saving multiple formats.
- Maintains user control over data through self-hosting.
- Supports archiving both public and private web content.
- Offers various input methods: URLs, bookmarks, browser history, RSS feeds, and more.
- Provides a user-friendly CLI and web interface for managing archives.
- Uses standard file formats, making data accessible without specialized software.
- Active development, comprehensive documentation, and a helpful community.
Links
- Home: https://archivebox.io
- Source code: https://github.com/ArchiveBox/ArchiveBox
Details
- Open Source: ✅
- European: ❌