sync-to-readwise

Pluggable framework for syncing third-party content into Readwise Reader. Daemon-on-Docker, secrets via Doppler.

~940 lines of Python · 2 sources (youtube, github_stars)
Python 3.14 · uv · APScheduler · httpx · pydantic-settings
Repo: allenhutchison/sync-to-readwise

Sources fetch candidate items from external APIs. The Syncer asks Readwise what's already saved (cached in memory), then writes only the new items via Readwise's /save/ endpoint. APScheduler runs each source on its own interval. Doppler provides every secret at process start; nothing is committed.

Architecture diagram (solid arrows: writes · dashed arrows: reads):
  Doppler project: sync-to-readwise
  Docker container (chowda): docker-entrypoint.sh runs doppler run -- "$@"
  cli.run_daemon: APScheduler · per-source interval
  Syncer: dedup + push
  YouTubeLikesSource: YouTube Data API v3
  GitHubStarsSource: GitHub REST · /user/starred
  Readwise Reader v3: /list/, /save/
Legend: core (cli, syncer, readwise client) · source adapters (per integration) · Readwise (target)

Design choices that matter

Dedup is read-from-Readwise, not local state.

Each sync warms an in-memory URL set from /list/?category=<cat>, scoped to the source's category (video for YouTube, article for github_stars). We never re-save an existing URL, so triaging in Reader is preserved.
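The cache warm-up described above can be sketched as follows. This is a minimal illustration, not the project's actual client: fetch_page is a hypothetical callable standing in for one GET on /list/, and the page shape (a results list plus a nextPageCursor field, with each document carrying source_url and url) is an assumption about the Reader v3 response.

```python
from __future__ import annotations

from collections.abc import Callable, Iterator
from typing import Any


def iter_saved_urls(
    fetch_page: Callable[[str, str | None], dict[str, Any]],
    category: str,
) -> Iterator[str]:
    """Yield every saved URL in one category, following cursors to the end.

    fetch_page(category, cursor) is a hypothetical helper that returns one
    decoded JSON page; the first call receives cursor=None.
    """
    cursor: str | None = None
    while True:
        page = fetch_page(category, cursor)
        for doc in page.get("results", []):
            # Assumed fields: prefer the original URL, fall back to the Reader URL.
            url = doc.get("source_url") or doc.get("url")
            if url:
                yield url
        cursor = page.get("nextPageCursor")
        if not cursor:
            break


def warm_known_urls(
    fetch_page: Callable[[str, str | None], dict[str, Any]],
    category: str,
) -> set[str]:
    """Collect the category's saved URLs into a set for O(1) dedup lookups."""
    return set(iter_saved_urls(fetch_page, category))
```

Because the set is rebuilt from Readwise itself, deleting the container or its volume loses no dedup state; the cost is one paginated read per category per daemon start.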

Save throttle honours the documented limit.

A 3.5 s minimum gap between /save/ calls (about 17 per minute) keeps us under Readwise's limit of 20 requests per minute. On a 429 we read Retry-After and sleep for that long, rather than relying on tenacity-style backoff, whose cap is too low for Readwise's ~47 s windows.
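A throttle of this kind might look like the sketch below. The class and function names are hypothetical, the clock and sleep functions are injectable only so the behaviour can be checked without waiting, and the 60 s fallback for a missing or non-numeric Retry-After is an assumption, not a documented value.

```python
from __future__ import annotations

import time
from collections.abc import Callable, Mapping

# 60 s / 3.5 s is roughly 17 saves per minute, under the 20/min limit.
MIN_SAVE_INTERVAL_S = 3.5


class SaveThrottle:
    """Spaces successive calls at least min_interval seconds apart."""

    def __init__(
        self,
        min_interval: float = MIN_SAVE_INTERVAL_S,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: float | None = None  # time of the previous permitted call

    def wait(self) -> float:
        """Block until the next save is allowed; return the seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self._min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept


def retry_after_seconds(headers: Mapping[str, str], default: float = 60.0) -> float:
    """Parse a Retry-After header given in seconds; otherwise use default."""
    raw = headers.get("Retry-After")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Header missing, or given as an HTTP date, which this sketch ignores.
        return default
```

A caller would invoke wait() before each POST to /save/ and, on a 429, sleep for retry_after_seconds(response.headers) before retrying the same item.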

Sources are opt-in by credentials.

Every registered source is probe-built at startup. If its credentials are missing, the daemon logs a source_skipped warning and leaves that source off the schedule. Adding a source to registry.py therefore never breaks an existing deployment.
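The probe-build step could be sketched like this. It is a simplified illustration under stated assumptions: the factories here take no arguments (the project's real factories receive config objects), probe_sources is a hypothetical name, and a missing credential is assumed to surface as a ValueError raised by the source constructor.

```python
from __future__ import annotations

import logging
from collections.abc import Callable, Mapping

log = logging.getLogger("sync_to_readwise.probe")


def probe_sources(registry: Mapping[str, Callable[[], object]]) -> list[str]:
    """Try to build each registered source once; return the names that built.

    A factory that raises ValueError (assumed to mean "credentials missing")
    is logged as source_skipped and left out of the result, so one
    unconfigured source never stops the others from being scheduled.
    """
    enabled: list[str] = []
    for name, factory in registry.items():
        try:
            factory()  # the probe instance is discarded; ticks build fresh ones
        except ValueError as exc:
            log.warning("source_skipped name=%s reason=%s", name, exc)
            continue
        enabled.append(name)
    return enabled
```

Only the names returned would then be handed to the scheduler.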

Doppler is the only secret store.

Secrets consumed by SDKs keep their conventional unprefixed env names (READWISE_TOKEN, GITHUB_TOKEN, YOUTUBE_OAUTH_*). Internal config uses the SYNCRW_ prefix. The container entrypoint wraps CMD in doppler run -- only when DOPPLER_TOKEN is set.


Built-in sources

YouTubeLikesSource (source)
Paginates the authenticated user's liked-videos playlist via YouTube Data API v3.
Auth: OAuth 2.0 installed-app flow, refresh token at /data/youtube_token.json
Defaults: location=later, tag youtube, dedup against category=video
Notes: private/deleted videos are filtered out silently

GitHubStarsSource (source)
Paginates GET /user/starred?sort=created via GitHub REST.
Auth: personal access token (default scope is enough for public stars; repo for private)
Defaults: location=later, tag github, dedup against category=article
Notes: httpx client is scoped to fetch_candidates() so its connection pool is released per cycle

Source contract

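from abc import ABC, abstractmethod
from collections.abc import Iterable

# Item is the project's own candidate record (url, title, …).


class Source(ABC):
    name: str
    default_location: str = "later"
    default_tags: tuple[str, ...] = ()
    readwise_category: str | None = None  # scopes the dedup cache

    @abstractmethod
    def fetch_candidates(self) -> Iterable[Item]: ...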

What happens when the daemon ticks

Each interval, APScheduler fires _run_source(name) for the source whose timer expired. The two sources run on independent timers but share one ReadwiseClient, and thus one in-memory dedup cache.

  1. build_source(name, cfg)
     Looks up the source factory in the registry. Each call constructs a fresh source; the daemon doesn't keep long-lived API clients between ticks.
  2. readwise.warm_cache(category=<cat>)
     First tick: paginates /list/?category=<cat>, populates _known_urls. Subsequent ticks in the same daemon process: no-op (idempotent per category).
  3. source.fetch_candidates()
     Lazy iterator. Pages the upstream API (YouTube / GitHub). Yields Item(url, title, …).
  4. readwise.exists(url)
     O(1) set lookup. Hit → skip. Miss → step 5.
  5. readwise.create_document(item, location, tags)
     Throttles to 3.5s minimum between saves. POSTs /save/. Tags: source defaults ∪ per-source override from YAML. URL added to _known_urls on success.
  6. SyncResult(seen, created, skipped, errors) logged
     Structured log emitted as sync_completed. Errors during a single item are logged with item_create_failed and the loop continues.
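Steps 3 to 6 amount to a short loop, sketched below under stated assumptions. Item and SyncResult are reduced to the fields named above (with errors assumed to be a counter), InMemoryReadwise is a hypothetical stand-in for the real client with no network or throttle, and sync_source is an illustrative name rather than the Syncer's actual method.

```python
from __future__ import annotations

from collections.abc import Iterable
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    url: str
    title: str = ""


@dataclass
class SyncResult:
    seen: int = 0
    created: int = 0
    skipped: int = 0
    errors: int = 0


@dataclass
class InMemoryReadwise:
    """Hypothetical stand-in for the Readwise client; no network, no throttle."""

    known_urls: set[str] = field(default_factory=set)
    fail_urls: set[str] = field(default_factory=set)
    saved: list[tuple[str, str, list[str]]] = field(default_factory=list)

    def exists(self, url: str) -> bool:
        return url in self.known_urls

    def create_document(self, item: Item, location: str, tags: list[str]) -> None:
        if item.url in self.fail_urls:
            raise RuntimeError(f"save failed for {item.url}")
        self.saved.append((item.url, location, tags))
        self.known_urls.add(item.url)  # later duplicates are now cache hits


def sync_source(
    candidates: Iterable[Item],
    readwise: InMemoryReadwise,
    *,
    location: str,
    default_tags: Iterable[str],
    extra_tags: Iterable[str] = (),
) -> SyncResult:
    """Save every candidate not already known; one failure never stops the run."""
    result = SyncResult()
    tags = sorted(set(default_tags) | set(extra_tags))  # defaults ∪ override
    for item in candidates:
        result.seen += 1
        if readwise.exists(item.url):
            result.skipped += 1
            continue
        try:
            readwise.create_document(item, location, tags)
        except Exception:
            result.errors += 1  # the daemon would log item_create_failed here
            continue
        result.created += 1
    return result
```

Since candidates is consumed lazily, upstream pages are only fetched as the loop advances, and a URL saved earlier in the same run is skipped if the source yields it again.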

Adding a new source

  1. Create src/sync_to_readwise/sources/<name>.py implementing Source:
    class MySource(Source):
        name = "my_source"
        default_location = "later"
        default_tags = ("my_source",)
        readwise_category = "article"   # or "video" / etc.
    
        def __init__(self, *, token: str) -> None:
            if not token:
                raise ValueError("MY_TOKEN must be set.")
            self._token = token
    
        def fetch_candidates(self) -> Iterable[Item]:
            with httpx.Client(...) as c:
                for ... in ...:
                    yield Item(url=..., source_name="my_source", title=...)
  2. Add the secret to core/config.py Settings:
    my_token: SecretStr = Field(default=SecretStr(""), validation_alias="MY_TOKEN")
  3. Register a factory in registry.py:
    def _build_my_source(cfg: AppConfig, src_cfg: SourceConfig) -> Source:
        return MySource(token=cfg.settings.my_token.get_secret_value())
    
    REGISTRY = {..., "my_source": _build_my_source}
  4. Forward the env var into the dev container in docker-compose.yml and document it in the README's secrets table.
  5. Add the secret to Doppler. Daemon will probe-build at next start: present → enabled, missing → source_skipped.

No syncer / scheduler / dedup changes needed — the framework picks it up automatically.

Operations

Production host: homelab machine chowda, Docker
Image: allenhutchison/sync-to-readwise:latest (Docker Hub)
Built by: .github/workflows/publish.yml on every push to main and on vX.Y.Z tags
CI: .github/workflows/ci.yml (ruff format + lint + Docker smoke test of CLI --help dispatch)
Dependencies: weekly Dependabot PRs for uv (Python deps), GitHub Actions, Docker base images
Runtime entrypoint: docker-entrypoint.sh wraps CMD with doppler run -- when DOPPLER_TOKEN is set; otherwise plain exec
Persistent volume: /data (holds youtube_token.json for the OAuth refresh token and an optional config.yaml)

Secrets in Doppler

Secret                       Required           Used by
READWISE_TOKEN               yes                core
YOUTUBE_OAUTH_CLIENT_ID      youtube only       youtube
YOUTUBE_OAUTH_CLIENT_SECRET  youtube only       youtube
GITHUB_TOKEN                 github_stars only  github_stars

Useful commands

# Local dev sync (one-shot, useful for testing a source)
doppler run -- docker compose run --rm sync-to-readwise sync-to-readwise sync-once youtube

# Run the daemon locally
doppler run -- docker compose up -d

# YouTube OAuth flow (prints a URL to open; the redirect lands on localhost:8080)
doppler run -- docker compose run --rm --service-ports sync-to-readwise sync-to-readwise setup youtube

# Tail logs on chowda
docker logs --tail 200 -f sync-to-readwise