Module http

Centralized HTTP client wrapper. All Source impls fetch through here.

Security defaults per docs/SECURITY.md:

  • rustls TLS only (no openssl, no native-tls — enforced by deny.toml)
  • HTTPS-only redirect policy (file://, data://, http:// rejected)
  • Per-source redirect host allowlist (docs/REDIRECT_ALLOWLIST.md)
  • Body size cap (crate::PDF_MAX_BYTES = 100 MB)
  • Per-request timeouts (connect 10s, read 60s, total 300s)
  • PDF magic-byte check on the first 5 bytes (%PDF-)
  • User-Agent: doiget/<version> (+https://github.com/sotashimozono/doiget)

See docs/SECURITY.md §1.2-1.3 / §1.10 and docs/REDIRECT_ALLOWLIST.md.
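
The body size cap and the magic-byte check listed above can be sketched without any dependencies. This is only an illustration of the two checks, not the crate's implementation: `read_pdf_capped` and `BodyError` are hypothetical names, and the cap is passed in as a parameter rather than read from `crate::PDF_MAX_BYTES`.

```rust
use std::io::Read;

/// The five magic bytes named in the security defaults.
const PDF_MAGIC: &[u8] = b"%PDF-";

/// Hypothetical error type for this sketch (not the crate's HttpError).
#[derive(Debug, PartialEq)]
enum BodyError {
    /// The body is longer than the configured cap.
    TooLarge,
    /// The body does not start with `%PDF-`.
    NotPdf,
    /// The underlying reader failed.
    Io(String),
}

/// Reads a response body from `reader`, enforcing a size cap and the
/// PDF magic-byte check. `max_bytes` stands in for the crate's cap.
fn read_pdf_capped<R: Read>(reader: R, max_bytes: u64) -> Result<Vec<u8>, BodyError> {
    // Read at most one byte past the cap: enough to detect an oversized
    // body without buffering an unbounded amount of data.
    let mut limited = reader.take(max_bytes.saturating_add(1));
    let mut body = Vec::new();
    limited
        .read_to_end(&mut body)
        .map_err(|e| BodyError::Io(e.to_string()))?;

    if body.len() as u64 > max_bytes {
        return Err(BodyError::TooLarge);
    }
    if !body.starts_with(PDF_MAGIC) {
        return Err(BodyError::NotPdf);
    }
    Ok(body)
}
```

For example, `read_pdf_capped(&b"%PDF-1.7 ..."[..], 1024)` returns the bytes, while an HTML error page served with a PDF content type is rejected as `BodyError::NotPdf`. A streaming client would more likely check the first five bytes before downloading the rest; the sketch buffers first only to stay short.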

§Architectural note: per-source reqwest::Client

reqwest::redirect::Policy::custom receives only an Attempt value, which exposes the next URL and the chain of previously visited URLs but not the original request’s headers. That makes the “tag the request with X-Doiget-Source and inspect it from inside the redirect closure” approach infeasible on reqwest 0.13.x. Instead, HttpClient holds one reqwest::Client per source. Each client’s redirect closure captures that source’s SourceAllowlist, so a redirect issued for one source is never checked against another source’s allowlist: cross-source confusion is impossible by construction.
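
The design can be modelled without reqwest as one closure per source, each owning its allowlist. The sketch below is a dependency-free model under stated assumptions: this `SourceAllowlist` is a simplified stand-in for the crate's struct, `PerSourcePolicies`, `RedirectDecision`, and `https_host` are hypothetical names, and the URL parsing is deliberately minimal (no IPv6 literals, lowercase scheme only). In the real client each closure would presumably be handed to `reqwest::redirect::Policy::custom` and read the already-parsed URL from the `Attempt`.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's SourceAllowlist (fields assumed).
#[derive(Clone, Debug)]
struct SourceAllowlist {
    source: &'static str,
    hosts: Vec<&'static str>,
}

#[derive(Debug, PartialEq)]
enum RedirectDecision {
    Follow,
    Reject,
}

/// Returns the host of an `https://` URL, or None for any other scheme.
/// Minimal parsing for the sketch; production code should use a URL parser.
fn https_host(url: &str) -> Option<&str> {
    let rest = url.strip_prefix("https://")?;
    // The authority ends at the first path, query, or fragment delimiter.
    // A backslash is treated as a delimiter too, as URL parsers do.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#' | '\\'))
        .next()?;
    // Conservatively refuse URLs that carry userinfo.
    if authority.contains('@') {
        return None;
    }
    let host = authority.split(':').next()?; // drop an optional port
    if host.is_empty() { None } else { Some(host) }
}

/// Builds the redirect check for one source. The closure owns that
/// source's allowlist, so it cannot consult any other source's hosts.
fn redirect_policy_for(allow: SourceAllowlist) -> impl Fn(&str) -> RedirectDecision {
    move |next_url: &str| match https_host(next_url) {
        Some(host) if allow.hosts.iter().any(|h| h.eq_ignore_ascii_case(host)) => {
            RedirectDecision::Follow
        }
        _ => RedirectDecision::Reject,
    }
}

/// Model of "one client per source": one boxed policy per source key.
struct PerSourcePolicies {
    by_source: HashMap<&'static str, Box<dyn Fn(&str) -> RedirectDecision>>,
}

impl PerSourcePolicies {
    fn new(allowlists: Vec<SourceAllowlist>) -> Self {
        let by_source = allowlists
            .into_iter()
            .map(|a| {
                let policy: Box<dyn Fn(&str) -> RedirectDecision> =
                    Box::new(redirect_policy_for(a.clone()));
                (a.source, policy)
            })
            .collect();
        Self { by_source }
    }

    /// Unknown sources are rejected outright.
    fn check(&self, source: &str, next_url: &str) -> RedirectDecision {
        match self.by_source.get(source) {
            Some(policy) => policy(next_url),
            None => RedirectDecision::Reject,
        }
    }
}
```

With two hypothetical sources `"alpha"` and `"beta"`, a redirect to a host on beta's list is rejected when it is checked for alpha, because alpha's closure never sees beta's hosts. Nothing has to be carried on the request itself, which is the property the per-client design relies on.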

Structs§

HttpClient
Workspace-wide HTTP client with the security defaults applied.
SourceAllowlist
Per-source allowlist entry. Matches the schema in docs/REDIRECT_ALLOWLIST.md §2.

Enums§

HttpError
Errors that can arise during HTTP fetches.

Functions§

init_tls
Public entry point for callers that build their own reqwest::Client outside of HttpClient and need the process-default TLS provider installed first (ADR-0020 Amendment 1).
oa_publisher_allowlist
Hard-coded Phase 1 allowlist for the synthetic "oa-publisher" source — the publisher / preprint / repository hosts to which Unpaywall’s best_oa_location.url (or url_for_pdf) typically resolves.
tier_1_allowlist
Hard-coded Phase 1 allowlist for Tier 1 sources. Sourced from docs/REDIRECT_ALLOWLIST.md §3.
tier_2_allowlist
Hard-coded Phase 4 allowlist for Tier 2 metadata sources (OpenAlex, Semantic Scholar, DOAJ). Sourced from docs/SOURCES.md §1 (the Tier 2 table) and docs/REDIRECT_ALLOWLIST.md §3 (same redirect-allowlist policy as Tier 1, distinct source keys).