A Rust CLI tool that fetches posts from an XML/Atom feed (e.g. a Blogger blog), extracts the title, cover image, and plain-text body of each post, and persists them into a PostgreSQL database.
- Reads `FEED_URL` and `DATABASE_URL` from a `.env` file.
- Paginates through the feed, fetching posts in parallel (up to 10 concurrent requests).
- For each post, scrapes the full page to extract the `og:title`, `og:image`, and the `div.post-body` content (converted to clean plain text).
- Inserts new posts into the `posts` table; already-present posts are silently skipped.
- Errors are logged to `logs/errors.log`.
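
The pagination and bounded-parallelism steps above can be sketched with the standard library alone. This is an illustration, not the tool's actual code: `page_url`, `fetch_all`, and `MAX_CONCURRENT` are hypothetical names, the page size is an arbitrary choice, and the `fetch` callback stands in for whatever HTTP client the project really uses. The sketch assumes paging goes through Blogger's `start-index` and `max-results` feed query parameters.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical cap mirroring the "up to 10 concurrent requests" limit.
const MAX_CONCURRENT: usize = 10;

/// Builds the URL of one feed page (assumption: Blogger-style paging).
/// `page` is 0-based; Blogger's `start-index` is 1-based.
fn page_url(feed_url: &str, page: usize, page_size: usize) -> String {
    // FEED_URL may already carry a query string such as `?alt=atom`.
    let sep = if feed_url.contains('?') { '&' } else { '?' };
    format!(
        "{}{}start-index={}&max-results={}",
        feed_url,
        sep,
        page * page_size + 1,
        page_size
    )
}

/// Calls `fetch` for every URL using at most `MAX_CONCURRENT` worker threads.
/// Results come back in the same order as `urls`; a failed fetch is kept as
/// an `Err` so the caller can log it and carry on with the other posts.
fn fetch_all<F>(urls: &[String], fetch: F) -> Vec<Result<String, String>>
where
    F: Fn(&str) -> Result<String, String> + Sync,
{
    // Shared cursor: each worker claims the next unprocessed index.
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<Option<Result<String, String>>>> =
        Mutex::new(vec![None; urls.len()]);
    let workers = MAX_CONCURRENT.min(urls.len());

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::SeqCst);
                if i >= urls.len() {
                    break;
                }
                let outcome = fetch(&urls[i]);
                results.lock().unwrap()[i] = Some(outcome);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|slot| slot.expect("every index is claimed by exactly one worker"))
        .collect()
}
```

Calling `page_url` with the example `FEED_URL` from the `.env` file, page `0`, and a page size of `25` appends `&start-index=1&max-results=25` to the existing `?alt=atom` query. A fixed pool of workers pulling from a shared cursor keeps the number of in-flight requests at or below the cap without batching, so one slow post does not hold up the rest.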
Prerequisites:

- Rust (stable)
- Docker (for the PostgreSQL container)
- `sqlx-cli` (installed once with `make install-tools`)

Copy `.env.example` to `.env` (or create `.env`) and fill in your values:

    POSTGRES_DB=feeddb
    POSTGRES_USER=feeduser
    POSTGRES_PASSWORD=feedpass
    DATABASE_URL=postgres://feeduser:feedpass@localhost:5437/feeddb
    FEED_URL=https://yourblog.blogspot.com/feeds/posts/default?alt=atom

Install `sqlx-cli`, start the database, and apply migrations in one step:

    make install-tools   # only needed once
    make init            # starts Docker Postgres + runs migrations

Then fetch the posts:

    make fetch           # fetch all posts from the feed and save them to the DB

| Command | Description |
|---|---|
| `make build` | Optimised release build |
| `make check` | Check for compile errors without building |
| `make fmt` | Format code with rustfmt |
| `make lint` | Lint with clippy |
| `make test` | Run tests (Postgres is started automatically via testcontainers) |
| `make db-up` | Start the PostgreSQL Docker container |
| `make db-down` | Stop and remove the PostgreSQL container |
| `make migrate` | Apply pending migrations |
| `make migrate-down` | Revert the last migration |
| `make db-reset` | Wipe all data and re-initialise from scratch |
| `make sqlx-prepare` | Regenerate `.sqlx/` cache for offline builds |