Public API

Public API (doiget-core)

Status: NORMATIVE. This is the semver-locked Rust API surface of the doiget-core crate. Breaking changes to any item here require a major version bump and an ADR. Adding new items is a minor bump.

doiget-cli and doiget-mcp are not bound by this guarantee — they are end-user binaries / servers and may evolve more freely.

1. Re-exports (top of lib.rs)

Note: the semver-locked surface is the public identifier set, not the submodule layout. File splits within doiget-core that preserve the public identifier set are not a major bump.

pub use crate::ref_::{Ref, Doi, ArxivId};
pub use crate::safekey::Safekey;
pub use crate::capability::{
    AlwaysOn, CapabilityProfile, MetadataAccess, RateLimits, TdmGrant,
};
pub use crate::source::{Source, FetchContext, FetchResult, FetchError};
pub use crate::store::{Store, Metadata, EntryInfo, StoreError};
pub use crate::error::{ErrorCode, DenialContext, DenialReason};
pub use crate::provenance::{ProvenanceLog, LogEvent, LogError};
// ADR-0024 — audit-identity surface:
pub use crate::canonical::{CanonicalRef, SourceType};
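The note above about submodule layout can be illustrated with a small sketch. It assumes a hypothetical file split in which Doi moves into a private doi submodule; the constructor from_trusted is invented for the example and is not part of doiget-core. Because each level re-exports the identifier, code that names the root path keeps compiling.

```rust
// Hypothetical layout after a file split: `Doi` now lives in a private
// submodule, but every path callers rely on is preserved by re-exports.
mod ref_ {
    mod doi {
        #[derive(Debug, Clone, PartialEq, Eq)]
        pub struct Doi(pub(crate) String);

        impl Doi {
            /// Hypothetical constructor for this sketch only; the real
            /// crate validates input through `Doi::parse`.
            pub fn from_trusted(s: &str) -> Self {
                Doi(s.to_string())
            }

            pub fn as_str(&self) -> &str {
                &self.0
            }
        }
    }

    // Keeps the `ref_::Doi` path valid after the split.
    pub use self::doi::Doi;
}

// Stand-in for the crate-root re-export in lib.rs.
pub use ref_::Doi;

/// Downstream-style helper that names only the root identifier, so it is
/// unaffected by where `Doi` is defined internally.
pub fn describe(doi: &Doi) -> String {
    format!("doi:{}", doi.as_str())
}
```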

2. Trait surface

pub trait Source: Send + Sync {
    fn name(&self) -> &str;
    fn can_serve(&self, profile: &CapabilityProfile, ref_: &Ref) -> bool;
    async fn fetch(
        &self,
        ref_: &Ref,
        profile: &CapabilityProfile,
        ctx: &FetchContext,
    ) -> Result<FetchResult, FetchError>;
}

pub trait Store: Send + Sync {
    fn read(&self, key: &Safekey) -> Result<Option<Metadata>, StoreError>;
    fn write(
        &self,
        key: &Safekey,
        m: &Metadata,
        pdf: Option<&Path>,
    ) -> Result<(), StoreError>;
    fn list_recent(&self, limit: usize) -> Result<Vec<EntryInfo>, StoreError>;
    fn search(&self, query: &str, limit: usize) -> Result<Vec<EntryInfo>, StoreError>;
}
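As a rough illustration of what implementing Store involves, the sketch below builds an in-memory store such as a test double might use. The trait is copied from the listing above; Safekey, Metadata, EntryInfo and StoreError are simplified stand-ins, and MemoryStore, the overwrite-on-write behaviour, and the title-substring search are all assumptions made for the example rather than documented doiget-core behaviour.

```rust
use std::path::Path;
use std::sync::Mutex;

// Simplified stand-ins for the doiget-core types (assumed shapes).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Safekey(pub String);

#[derive(Debug, Clone, PartialEq)]
pub struct Metadata {
    pub title: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct EntryInfo {
    pub key: Safekey,
    pub title: String,
}

#[derive(Debug)]
pub enum StoreError {
    Poisoned,
}

pub trait Store: Send + Sync {
    fn read(&self, key: &Safekey) -> Result<Option<Metadata>, StoreError>;
    fn write(&self, key: &Safekey, m: &Metadata, pdf: Option<&Path>) -> Result<(), StoreError>;
    fn list_recent(&self, limit: usize) -> Result<Vec<EntryInfo>, StoreError>;
    fn search(&self, query: &str, limit: usize) -> Result<Vec<EntryInfo>, StoreError>;
}

/// Hypothetical in-memory store. Entries are kept in insertion order; the
/// Mutex is needed because every trait method takes `&self`.
#[derive(Default)]
pub struct MemoryStore {
    entries: Mutex<Vec<(Safekey, Metadata)>>,
}

fn to_info(pair: &(Safekey, Metadata)) -> EntryInfo {
    EntryInfo { key: pair.0.clone(), title: pair.1.title.clone() }
}

impl Store for MemoryStore {
    fn read(&self, key: &Safekey) -> Result<Option<Metadata>, StoreError> {
        let entries = self.entries.lock().map_err(|_| StoreError::Poisoned)?;
        Ok(entries.iter().find(|(k, _)| k == key).map(|(_, m)| m.clone()))
    }

    fn write(&self, key: &Safekey, m: &Metadata, _pdf: Option<&Path>) -> Result<(), StoreError> {
        let mut entries = self.entries.lock().map_err(|_| StoreError::Poisoned)?;
        // Assumption: writing an existing key replaces the old entry.
        entries.retain(|(k, _)| k != key);
        entries.push((key.clone(), m.clone()));
        Ok(())
    }

    fn list_recent(&self, limit: usize) -> Result<Vec<EntryInfo>, StoreError> {
        let entries = self.entries.lock().map_err(|_| StoreError::Poisoned)?;
        // Newest first: walk the insertion-ordered list backwards.
        Ok(entries.iter().rev().take(limit).map(to_info).collect())
    }

    fn search(&self, query: &str, limit: usize) -> Result<Vec<EntryInfo>, StoreError> {
        let needle = query.to_lowercase();
        let entries = self.entries.lock().map_err(|_| StoreError::Poisoned)?;
        // Assumption: search is a case-insensitive substring match on the title.
        Ok(entries
            .iter()
            .filter(|(_, m)| m.title.to_lowercase().contains(&needle))
            .take(limit)
            .map(to_info)
            .collect())
    }
}
```

The pdf argument is ignored here; a real implementation would presumably persist the file alongside the metadata.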

3. Core types

#[derive(Debug, Clone, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
pub enum Ref {
    Doi(Doi),
    Arxiv(ArxivId),
}

#[derive(Debug, Clone, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
#[serde(transparent)]
pub struct Doi(pub(crate) String);

#[derive(Debug, Clone, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
#[serde(transparent)]
pub struct ArxivId(pub(crate) String);

#[derive(Debug, Clone, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
#[serde(transparent)]
pub struct Safekey(pub(crate) String);

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct Metadata {
    pub schema_version: String,
    pub title:    String,
    pub authors:  Vec<String>,
    pub year:     Option<i32>,
    pub doi:      Option<Doi>,
    pub arxiv_id: Option<ArxivId>,
    pub abstract_: Option<String>,
    pub venue:    Option<String>,
    pub publisher: Option<String>,
    pub issn:     Option<String>,
    pub isbn:     Option<String>,
    pub type_:    Option<String>,
    pub keywords: Vec<String>,
    pub doiget:   Option<DoigetExtension>,
}

#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct DoigetExtension {
    pub fetched_at: chrono::DateTime<chrono::Utc>,
    pub source:     String,
    pub license:    String,
    pub size_bytes: u64,
    pub mcp_call_id: Option<String>,
}

4. Constructors and validation

impl Doi {
    pub fn parse(s: &str) -> Result<Self, RefParseError>;
    pub fn as_str(&self) -> &str;
}

impl ArxivId {
    pub fn parse(s: &str) -> Result<Self, RefParseError>;
    pub fn as_str(&self) -> &str;
}

impl Ref {
    pub fn parse(s: &str) -> Result<Self, RefParseError>;
    pub fn safekey(&self) -> Safekey;
}

parse returns a [RefParseError] variant naming the specific rejection category: Empty, MissingDoiPrefix, MissingDoiSuffixSeparator, InvalidDoiRegistrant, EmptyDoiSuffix, DoiSuffixTooLong { len, max }, InvalidDoiSuffixChar { ch }, or InvalidArxivShape. The granular variants are kept for tests and future log breadcrumbs. At the public MCP / CLI boundary every variant maps to [ErrorCode::InvalidRef] through impl From<RefParseError> for ErrorCode, so ? propagation collapses to INVALID_REF automatically.

RefParseError is #[non_exhaustive]; adding new categories is a non-breaking change. Pattern-match with a wildcard arm.
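A minimal sketch of both points, the ? funnel to ErrorCode::InvalidRef and the wildcard arm, follows. The enums are cut-down stand-ins: only four of the variants are reproduced, the usize field types are assumed, and parse_toy, resolve and explain are hypothetical helpers whose rules are far looser than the real parser.

```rust
// Cut-down stand-in; the real enum has more variants.
#[derive(Debug, Clone, PartialEq, Eq)]
#[non_exhaustive]
pub enum RefParseError {
    Empty,
    MissingDoiPrefix,
    DoiSuffixTooLong { len: usize, max: usize }, // field types are assumed
    InvalidArxivShape,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorCode {
    InvalidRef,
}

// Every parse failure funnels to the same public error code.
impl From<RefParseError> for ErrorCode {
    fn from(_: RefParseError) -> Self {
        ErrorCode::InvalidRef
    }
}

/// Hypothetical toy parser used only to produce errors for the sketch.
fn parse_toy(s: &str) -> Result<String, RefParseError> {
    if s.is_empty() {
        return Err(RefParseError::Empty);
    }
    if !s.starts_with("10.") {
        return Err(RefParseError::MissingDoiPrefix);
    }
    Ok(s.to_string())
}

/// Boundary-style function: `?` applies the `From` impl, so callers only
/// ever see `ErrorCode`.
pub fn resolve(s: &str) -> Result<String, ErrorCode> {
    let parsed = parse_toy(s)?;
    Ok(parsed)
}

/// Caller-side matching. The wildcard arm keeps this compiling when new
/// categories are added to the `#[non_exhaustive]` enum.
pub fn explain(err: &RefParseError) -> String {
    match err {
        RefParseError::Empty => "empty input".to_string(),
        RefParseError::DoiSuffixTooLong { len, max } => {
            format!("suffix is {} bytes, limit {}", len, max)
        }
        _ => "invalid reference".to_string(),
    }
}
```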

The dedicated RefParseError type was introduced in PR #55; see also CHANGELOG.md.

5. CapabilityProfile

impl CapabilityProfile {
    pub fn from_env() -> Result<Self, CapabilityError>;
}

pub enum CapabilityError {
    AgreedButNoKey { agree_var: String, key_var: String },
    KeyButNotAgreed { agree_var: String },
}

See CAPABILITY.md for the full type definition and resolution rules.
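The two CapabilityError variants suggest a consistency check between an agreement flag and its matching key. The sketch below is one reading of the variant names, not the resolution rules from CAPABILITY.md: check_pair is a hypothetical helper, and it takes already-read values as parameters rather than touching the environment.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CapabilityError {
    AgreedButNoKey { agree_var: String, key_var: String },
    KeyButNotAgreed { agree_var: String },
}

/// Hypothetical helper: checks one (agreement flag, key) pair.
/// Returns Ok(true) when both are present, Ok(false) when neither is,
/// and an error when only one half of the pair was supplied.
pub fn check_pair(
    agree_var: &str,
    agreed: bool,
    key_var: &str,
    key: Option<&str>,
) -> Result<bool, CapabilityError> {
    match (agreed, key) {
        (true, Some(_)) => Ok(true),
        (false, None) => Ok(false),
        // Agreement given, but no key to use it with.
        (true, None) => Err(CapabilityError::AgreedButNoKey {
            agree_var: agree_var.to_string(),
            key_var: key_var.to_string(),
        }),
        // Key supplied without the matching agreement flag.
        (false, Some(_)) => Err(CapabilityError::KeyButNotAgreed {
            agree_var: agree_var.to_string(),
        }),
    }
}
```

Carrying the variable names in the error lets the caller tell the user exactly which setting to add.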

6. Stability guarantees

7. MSRV

doiget-core's declared MSRV (Cargo.toml [workspace.package] rust-version) is 1.86. Active development tracks channel = "stable" in rust-toolchain.toml, so day-to-day builds use the latest stable toolchain; the CI msrv job pins explicitly to 1.86 to verify the declared floor still holds.

Raising the declared MSRV is a minor version bump and requires a CHANGELOG entry. Lowering it requires an ADR (we do not retroactively re-support older toolchains without explicit reason). The 1.0 release may re-evaluate the policy and adopt a stable-channel-tracks-current-stable-minus-N rule.

8. Structured denial context (NORMATIVE; ADR-0023)

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum DenialReason {
    RedirectNotInAllowlist,
    InsecureScheme,
    HostInBlockList,
    SizeCapExceeded,
    SchemaDrift,
    CapabilityNotGranted,
    RateLimitWindow,
    SsrfPrivateAddress,
    ContentTypeMismatch,
}

#[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
#[serde(deny_unknown_fields)]
pub struct DenialContext {
    pub reason:    DenialReason,
    pub source:    Option<String>,
    pub attempted: Option<String>,
    pub expected:  Option<Vec<String>>,
    pub hop_index: Option<u8>,
    pub cap:       Option<u64>,
    pub actual:    Option<u64>,
}

impl From<&crate::http::HttpError> for Option<DenialContext> { /* … */ }
impl From<&crate::source::FetchError> for Option<DenialContext> { /* … */ }

The DenialReason enum is closed: adding a variant is a minor semver bump, renaming or repurposing one is breaking. The DenialContext struct is not #[non_exhaustive] because deny_unknown_fields already prevents forward-compatible field additions on the wire — adding a field is a breaking change. See ERRORS.md §3.1, §5.1 for the runtime surface and MCP_TOOLS.md §5 for the JSON envelope.
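The From impls above convert an error into Option<DenialContext>, which reads as "Some when the error is a policy denial, None otherwise". A sketch of that shape follows. The HttpError variants here are invented for illustration (the real enum is not shown in this document), serde derives are omitted, only two DenialReason variants are reproduced, and the bare helper is hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum DenialReason {
    RedirectNotInAllowlist,
    SizeCapExceeded,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DenialContext {
    pub reason: DenialReason,
    pub source: Option<String>,
    pub attempted: Option<String>,
    pub expected: Option<Vec<String>>,
    pub hop_index: Option<u8>,
    pub cap: Option<u64>,
    pub actual: Option<u64>,
}

impl DenialContext {
    /// Hypothetical helper: a context with only the reason filled in.
    fn bare(reason: DenialReason) -> Self {
        DenialContext {
            reason,
            source: None,
            attempted: None,
            expected: None,
            hop_index: None,
            cap: None,
            actual: None,
        }
    }
}

/// Hypothetical stand-in for crate::http::HttpError.
#[derive(Debug)]
pub enum HttpError {
    RedirectDenied { attempted: String, allowlist: Vec<String>, hop: u8 },
    BodyTooLarge { cap: u64, actual: u64 },
    Timeout,
}

impl From<&HttpError> for Option<DenialContext> {
    fn from(err: &HttpError) -> Self {
        match err {
            HttpError::RedirectDenied { attempted, allowlist, hop } => Some(DenialContext {
                attempted: Some(attempted.clone()),
                expected: Some(allowlist.clone()),
                hop_index: Some(*hop),
                ..DenialContext::bare(DenialReason::RedirectNotInAllowlist)
            }),
            HttpError::BodyTooLarge { cap, actual } => Some(DenialContext {
                cap: Some(*cap),
                actual: Some(*actual),
                ..DenialContext::bare(DenialReason::SizeCapExceeded)
            }),
            // A plain transport failure is not a denial, so there is no context.
            HttpError::Timeout => None,
        }
    }
}
```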

9. Audit-identity: CanonicalRef (NORMATIVE; ADR-0021, ADR-0024)

doiget-core provides the four-tuple audit identity defined by ADR-0021 and implemented per ADR-0024. The re-exports are listed in §1 (CanonicalRef, SourceType).

#[derive(Debug, Clone, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
#[serde(rename_all = "lowercase")]
#[non_exhaustive]
pub enum SourceType {
    Doi,
    Arxiv,
    // future: Pmid, Handle, ...
}

#[derive(Debug, Clone, PartialEq, Eq, Hash, serde::Serialize, serde::Deserialize)]
#[non_exhaustive]
pub struct CanonicalRef {
    pub source_type: SourceType,
    pub source_id: String,
    pub resolver_profile: String, // e.g. "crossref", "unpaywall", "arxiv", "oa-publisher"
    pub version: Option<String>,  // e.g. arXiv "v2"; None encodes the empty trailing input
}

impl CanonicalRef {
    pub fn new(
        source_type: SourceType,
        source_id: impl Into<String>,
        resolver_profile: impl Into<String>,
        version: Option<String>,
    ) -> Self;
    pub fn digest(&self) -> [u8; 32];
    pub fn digest_hex(&self) -> String;
}

impl Ref {
    /// Promote a `Ref` to a `CanonicalRef` with the given resolver
    /// profile and optional version (ADR-0021 §1).
    pub fn promote(&self, resolver_profile: &str, version: Option<&str>) -> CanonicalRef;
}

The digest algorithm is NORMATIVE: SHA256(source_type | 0x00 | source_id | 0x00 | resolver_profile | 0x00 | version_or_empty), where | denotes byte concatenation. version_or_empty is the empty byte sequence when version is None, NOT a sentinel.
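The byte layout fed to SHA-256 can be sketched with the standard library alone. The function below, digest_preimage, is a hypothetical helper that only assembles the input bytes; it assumes source_type is encoded as its lowercase serde name ("doi", "arxiv") and that all fields are UTF-8. The hashing step itself is left out because it needs an external SHA-256 implementation.

```rust
/// Hypothetical helper: builds the byte string that would be hashed.
/// Assumes `source_type` is passed as its lowercase name, e.g. "doi".
pub fn digest_preimage(
    source_type: &str,
    source_id: &str,
    resolver_profile: &str,
    version: Option<&str>,
) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(source_type.as_bytes());
    buf.push(0x00); // separator
    buf.extend_from_slice(source_id.as_bytes());
    buf.push(0x00); // separator
    buf.extend_from_slice(resolver_profile.as_bytes());
    buf.push(0x00); // separator, present even when no version follows
    // None contributes zero bytes; no sentinel value is written.
    buf.extend_from_slice(version.unwrap_or("").as_bytes());
    buf
}
```

One consequence of this layout is that a None version and an empty-string version produce the same bytes, so the preimage for a versionless reference always ends in the third 0x00 separator.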

The companion provenance-log row schema bump (v1 → v2) is documented in PROVENANCE_LOG.md §3 + §3.1. The one-shot migration ships as doiget provenance migrate [--dry-run].


Source: site/content/developer/public-api.md