API Reference
Environment detection
BiblioFetch.detect_environment — Function
detect_environment(; probe = true) -> Runtime

Detect hostname, applicable config profile, effective proxy (env > profile), optionally probe reachability, and classify the operating mode.
BiblioFetch.effective_runtime — Function
effective_runtime(; probe = true) -> Runtime

Alias for detect_environment; kept as the public "what should I use right now?" accessor.
BiblioFetch.load_config — Function
load_config(; path = ENV["BIBLIOFETCH_CONFIG"] or default)
    -> (config::Dict, path_or_nothing)

Read and parse the global BiblioFetch config TOML. Returns (Dict(), nothing) when no file is present at path. The default location is ~/.config/bibliofetch/config.toml; $BIBLIOFETCH_CONFIG overrides it.
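A minimal sketch of the detection and config entry points together (paths shown are the documented defaults):

```julia
using BiblioFetch

config, cfg_path = BiblioFetch.load_config()        # (Dict(), nothing) if no file exists
rt = BiblioFetch.detect_environment(probe = false)  # skip the reachability probe
cfg_path === nothing && @info "no config file; built-in defaults apply"
```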
References — parse / classify
BiblioFetch.normalize_key — Function
normalize_key(s) -> String

Normalize a user-provided reference to a canonical key:

- DOI → lowercase DOI (10.1103/physrevb.xx.yyyy)
- arXiv → arxiv:<id>

Throws ArgumentError if unrecognized.
BiblioFetch.is_doi — Function
is_doi(s) -> Bool

Whether s looks like a DOI (10.xxxx/anything). Strips surrounding whitespace but does not otherwise transform the input.
BiblioFetch.is_arxiv — Function
is_arxiv(s) -> Bool

Whether s looks like an arXiv id — both the new-style (1706.03762, optionally with a version suffix v2 and an arxiv: prefix) and the legacy slash form (cond-mat/0608208).
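A sketch of the classifiers and normalizer side by side (normalized forms follow the rules above):

```julia
using BiblioFetch

BiblioFetch.is_doi("10.1103/PhysRevB.96.085124")        # true
BiblioFetch.is_arxiv("cond-mat/0608208")                # true (legacy slash form)
BiblioFetch.normalize_key("10.1103/PhysRevB.96.085124")
# => "10.1103/physrevb.96.085124" (lowercased DOI)
# BiblioFetch.normalize_key("not-a-reference")          # throws ArgumentError
```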
BiblioFetch.is_arxiv_versions — Function
is_arxiv_versions(s) -> Bool

Whether s is the multi-version pseudo-ref form arxiv:<id>@all or arxiv:<id>@v1,v3 / arxiv:<id>@1,3. These refs can't be fetched as-is — the run loop expands them into one FetchEntry per version before dispatching to fetch_paper!.
BiblioFetch.parse_arxiv_version_spec — Function
parse_arxiv_version_spec(s) -> (base_key, spec)

Parse an arxiv:<id>@… pseudo-ref into its components.

- `base_key` — the canonical `arxiv:<id>` key (lower-cased, no version suffix, with `arxiv:` prefix).
- `spec` — either `:all` (every known version) or a sorted `Vector{Int}` of explicit version numbers.

Throws ArgumentError when s is not a well-formed pseudo-ref.
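For example (return values follow the spec above; treat them as illustrative):

```julia
using BiblioFetch

BiblioFetch.parse_arxiv_version_spec("arxiv:1706.03762@v1,v3")
# => ("arxiv:1706.03762", [1, 3])
BiblioFetch.parse_arxiv_version_spec("arxiv:1706.03762@all")
# => ("arxiv:1706.03762", :all)
# BiblioFetch.parse_arxiv_version_spec("arxiv:1706.03762")  # no @spec: throws ArgumentError
```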
arXiv version discovery
BiblioFetch.arxiv_latest_version — Function
arxiv_latest_version(id; proxy, timeout, base_url = ARXIV_API_URL)
    -> Int or nothing

Return the number of the latest published version of arXiv paper id. arXiv's API answers an id_list=<id> query with the entry's canonical URL in <id>, which always carries the current vN suffix — the integer after v is the latest-version number. Missing or unparseable responses return nothing. Strips an arxiv: prefix if passed.
BiblioFetch.arxiv_list_versions — Function
arxiv_list_versions(id; kwargs...) -> Vector{Int}

Return every version number an arXiv paper has, in ascending order. arXiv numbers versions sequentially from 1, so this is 1:arxiv_latest_version(id), at the cost of a single API call. Returns Int[] on lookup failure.
kwargs are forwarded to arxiv_latest_version.
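A sketch of version discovery feeding a manual expansion (requires network; values are illustrative, and the per-version key form is an assumption):

```julia
using BiblioFetch

latest = BiblioFetch.arxiv_latest_version("1706.03762")   # e.g. 7; nothing on failure
versions = BiblioFetch.arxiv_list_versions("1706.03762")  # e.g. [1, 2, ..., 7]; Int[] on failure
refs = ["arxiv:1706.03762v$(v)" for v in versions]        # assumed per-version key form
```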
Store
BiblioFetch.Store — Type
Store(root)

Handle on a BiblioFetch store directory. Holds the root path; all PDF and metadata paths are derived from it. Construct with open_store — the raw constructor does not create the backing directory layout.
BiblioFetch.open_store — Function
open_store(root) -> Store

Create (if needed) the store directory layout under root:

    <root>/
      <group>/<safekey>.pdf        # PDFs live inside their group subdir
      <safekey>.pdf                # (or at the root for ungrouped entries)
      .metadata/<safekey>.toml     # one TOML per paper (editable, hidden)

BiblioFetch.list_entries — Function
list_entries(store) -> Vector{String}

Return the filesystem-safe keys of every paper currently tracked in the store, sorted alphabetically. These are the stems of files under <root>/.metadata/, not the canonical DOI/arXiv keys (use entry_info to get the key).
BiblioFetch.entry_info — Function
entry_info(store, key) -> NamedTuple | Nothing

Summary record for one entry — key, title, status, source, group, pdf_path, year. Returns nothing when the key has no metadata on disk.
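A sketch that enumerates a store (assumes `entry_info` accepts the filesystem-safe keys `list_entries` returns, per the note above; the store path is illustrative):

```julia
using BiblioFetch

store = BiblioFetch.open_store(expanduser("~/papers"))
for safekey in BiblioFetch.list_entries(store)
    info = BiblioFetch.entry_info(store, safekey)
    info === nothing && continue
    println(info.key, "  [", info.status, "]  ", info.title)
end
```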
Store lock
BiblioFetch.StoreLock — Type
StoreLock(store::Store; stale_after_s = 600)

Best-effort exclusive lock for write operations on a store. Implemented as a pidfile at <store.root>/.metadata/run.pid containing hostname + pid + ISO timestamp. A lock older than stale_after_s (default 10 min) is treated as abandoned and reclaimed.
Use via with_store_lock.
BiblioFetch.with_store_lock — Function
with_store_lock(fn, store::Store; force = false, stale_after_s = 600)

Run fn() while holding an exclusive StoreLock on store. Releases the lock on normal return or exception. Pass force=true to override an existing live lock (useful for known-dead processes).
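Because fn comes first, the do-block form reads naturally; a sketch:

```julia
using BiblioFetch

store = BiblioFetch.open_store(expanduser("~/papers"))  # illustrative path
BiblioFetch.with_store_lock(store) do
    # exclusive write access here; the pidfile lock is released on return or throw
    BiblioFetch.sync!(store; verbose = false)
end
```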
Project skeleton
BiblioFetch.generate — Function
generate(path; force = false) -> String

Create a BiblioFetch project skeleton under path. Copies every file in the package's config/template/ directory (job.toml + README.md at present) into path, creating intermediate directories as needed.

- `path` — absolute or `~`-prefixed; expanded before use. If it's relative, it's resolved against `pwd()`.
- `force` — when `false` (default) and `path` already exists and is non-empty, `generate` refuses with an `ArgumentError`. `true` overwrites any clashing file unconditionally.
Returns the absolute path of the created project, ready to pass to bibliofetch run <path>/job.toml (with relative-target resolution — see load_job).
Fetch
BiblioFetch.fetch_paper! — Function
fetch_paper!(store, key; rt, group = "", force = false,
sources = DEFAULT_SOURCES, source_policy = :lenient,
verbose = true) -> FetchResult

Resolve key (DOI or arxiv:…) and try the configured sources in order:
- `:unpaywall` → OA PDF (requires `rt.email`)
- `:arxiv` → arXiv preprint (always OA)
- `:direct` → `doi.org/<doi>` through proxy (only when proxy is reachable)
source_policy controls which sources are allowed to produce candidates:
- `:lenient` (default) — every source listed in `sources` is eligible.
- `:strict` — only `PUBLISHER_SOURCES` produce candidates; preprint routes (`:arxiv`, `:s2`) are silently dropped, and `:unpaywall` is only kept when its `best_oa_location` has `host_type = "publisher"`.
also_arxiv (default false) — after a successful primary fetch whose source is not already :arxiv, BiblioFetch does a companion download of the arXiv preprint (if an arXiv id was discovered from Crossref relation.has-preprint or the title-search fallback) into preprint_pdf_path(store, key; group). Records preprint_* fields in the entry's metadata TOML. Silently no-ops when no arXiv id exists.
The PDF is stored at pdf_path(store, key; group) — i.e. in store.root/<group>/. Per-attempt diagnostics are recorded in the returned FetchResult.attempts.
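A hedged single-fetch sketch (the DOI and group are illustrative):

```julia
using BiblioFetch

rt = BiblioFetch.detect_environment()
store = BiblioFetch.open_store(expanduser("~/papers"))
res = BiblioFetch.fetch_paper!(store, "10.1103/physrevb.96.085124";
                               rt, group = "correlated", source_policy = :strict)
for a in res.attempts            # one AttemptLog per source tried
    println(a)
end
```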
BiblioFetch.sync! — Function
sync!(store; rt = detect_environment(), force = false, verbose = true)
-> Vector{FetchResult}

Walk the store's metadata directory and (re)fetch entries, preserving each entry's stored group.
- default (`force = false`): skip entries that already have `status = "ok"` and a PDF on disk. Everything else — pending, failed, or status-ok with a missing PDF — is fetched. Useful for resuming a partial run.
- `force = true`: every tracked entry is re-downloaded, even ones already on disk. `force = true` is propagated to `fetch_paper!`, so its cached fast-path is bypassed and the PDF is overwritten.
BiblioFetch.AttemptLog — Type
AttemptLog

One source attempt during a fetch — useful for diagnosing why a key failed. retry_count is the number of retries burned inside this attempt (driven by retry_statuses / exceptions in _http_get_with_retry); retried_statuses is the list of HTTP statuses that triggered each retry. 0 inside retried_statuses stands for a pre-server / exception retry (no response arrived) — the request never reached HTTP. When a source completed on the first try, retry_count == 0 and retried_statuses is empty.
Jobs
BiblioFetch.load_job — Function
load_job(path; runtime = detect_environment()) -> FetchJob

Parse a bibliofetch.toml file. Fills in missing fetch.email from runtime, flattens [doi] groups into FetchEntrys, deduplicates keys (lenient by default), and returns the job without performing any network I/O.
BiblioFetch.run — Function
BiblioFetch.run(path_or_job; verbose = true) -> FetchJobResult

Execute a job. path_or_job may be a path to a bibliofetch.toml or an already-loaded FetchJob. Writes PDFs into job.target/<group>/, metadata into job.target/.metadata/, and a run log into job.log_file.
BiblioFetch.FetchEntry — Type
FetchEntry

One reference pulled from a job file: normalized key, assigned group, and (after running) its fetch status and per-source attempt log.
BiblioFetch.FetchJob — Type
FetchJob

Parsed bibliofetch.toml — the list of references to pull, where to put them, and which sources / concurrency / overwrite policy to use.
BiblioFetch.FetchJobResult — Type
FetchJobResult

Returned by BiblioFetch.run — the job plus post-run entries and elapsed time.
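End-to-end, a job run looks roughly like this (the job path is illustrative):

```julia
using BiblioFetch

job = BiblioFetch.load_job("project/job.toml")   # parse only; no network I/O
result = BiblioFetch.run(job)                    # PDFs -> job.target/<group>/
# `result` is a FetchJobResult: the job plus post-run entries and elapsed time.
```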
BibTeX
BiblioFetch.bibtex_entry — Function
bibtex_entry(md; key = _bibtex_key(md)) -> String

Render one metadata dict as a BibTeX entry string (including trailing newline). Uses @article when md["journal"] is non-empty, @misc otherwise (arXiv preprints, tech reports). Fields are always written in the same order.
BiblioFetch.write_bibtex — Function
write_bibtex(store, path; key_filter = nothing) -> Int

Iterate every status = "ok" entry in the store's .metadata/, assign a FirstAuthorSurnameYear citekey (with letter-suffix disambiguation for collisions), and write the combined BibTeX to path. Returns the number of entries written. When key_filter is a Set{String} of normalized keys, only those entries are written.
BiblioFetch.parse_bibtex — Function
parse_bibtex(text) -> Vector{BibEntry}

Walk a BibTeX source string and collect every @TYPE{key, fields…} entry. Top-level brace balancing is manual (so nested {…} inside field values don't confuse the scanner); individual field extraction uses a regex that tolerates single-level braces, which covers every real doi / eprint / url value.
Entries that fail to parse (malformed headers, unbalanced braces, etc.) are skipped silently — a single broken entry shouldn't abort the whole import.
BiblioFetch.bibentry_to_ref — Function
bibentry_to_ref(entry) -> String | Nothing

Derive the identifier BiblioFetch should queue for a bib entry. Checks, in order:

- `doi` → return the DOI as-is (will be normalized downstream)
- `eprint` → if `archivePrefix` is `arxiv` or absent, return `arxiv:<eprint>`
- `url` → if it's a `doi.org/…` or `arxiv.org/abs/…` URL, return the extracted identifier
Returns nothing when nothing usable is found (e.g. an entry with only a title and unstructured publisher).
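A sketch of turning a .bib file into fetchable refs by hand (import_bib! below wraps this; the path is illustrative):

```julia
using BiblioFetch

entries = BiblioFetch.parse_bibtex(read("refs.bib", String))
refs = [r for r in map(BiblioFetch.bibentry_to_ref, entries) if r !== nothing]
# e.g. ["10.1038/nphys1170", "arxiv:1706.03762"], whatever identifiers the file carries
```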
BiblioFetch.import_bib! — Function
import_bib!(store, path) -> (added, skipped)

Parse path as a BibTeX file and queue every entry that yields a recognizable DOI or arXiv id into store. Returns:

- `added::Vector{NamedTuple{(:citekey, :ref, :key)}}` — entries successfully queued. `citekey` is the BibTeX citekey, `ref` is what we extracted, `key` is the normalized store key.
- `skipped::Vector{NamedTuple{(:citekey, :reason)}}` — entries rejected either because no usable identifier was found or because normalization of the extracted string failed.
Duplicate refs already in the store are treated as success (queued = idempotent).
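For example:

```julia
using BiblioFetch

store = BiblioFetch.open_store(expanduser("~/papers"))
added, skipped = BiblioFetch.import_bib!(store, "refs.bib")   # illustrative path
println(length(added), " queued, ", length(skipped), " skipped")
for s in skipped
    println("  ", s.citekey, ": ", s.reason)
end
```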
BiblioFetch.BibEntry — Type
BibEntry

One entry scanned out of a .bib file — type, citekey, and a flattened field map. Field keys are lowercased; field values are stripped of their {…} / "…" wrapper but not of nested LaTeX braces (which almost never appear in the identifier fields we care about).
CSL JSON
BiblioFetch.csl_entry — Function
csl_entry(md; id = _bibtex_key(md)) -> Dict{String,Any}

Map one internal metadata TOML dict to a CSL JSON record. The returned Dict is ready to be passed straight to JSON3.write as one element of a CSL JSON array. Empty / missing fields are omitted rather than written as empty strings — Pandoc/Quarto are happier with absent keys than with empty ones, and the resulting file stays diffable.
Field mapping:
- `id` → `_bibtex_key(md)` (same citekey as the BibTeX export)
- `type` → `"article-journal"` if `md["journal"]` is non-empty, else `"manuscript"` (CSL's term for unpublished / preprint material)
- `title` → `md["title"]`
- `author` → `[{family, given}, …]` from `md["authors"]`
- `container-title` → `md["journal"]`
- `issued` → `{date-parts: [[year]]}`
- `DOI` → `md["key"]` when it parses as a DOI, else `md["doi"]`
- `URL` → `https://doi.org/<DOI>` whenever a DOI is present
- `abstract` → `md["abstract"]` when non-empty
BiblioFetch.write_csl — Function
write_csl(store, path; key_filter = nothing) -> Int

Iterate every status = "ok" entry in the store's .metadata/, build a CSL JSON array, and write it to path. Returns the number of entries written. When key_filter is a Set{String} of normalized keys, only those entries are included. Citekeys (id) are disambiguated with a trailing letter suffix on collision, mirroring write_bibtex so the two exports stay in sync.
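Both exporters walk the same status = "ok" entries and share citekey disambiguation, so pairing them keeps the .bib and .json in sync (output paths are illustrative):

```julia
using BiblioFetch

store = BiblioFetch.open_store(expanduser("~/papers"))
n_bib = BiblioFetch.write_bibtex(store, "library.bib")
n_csl = BiblioFetch.write_csl(store, "library.json")
println("exported ", n_bib, " BibTeX / ", n_csl, " CSL entries")
```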
Citation graph visualization
BiblioFetch.to_dot — Function
to_dot(store; queued_only = false, include_isolated = false) -> String

Render the store's citation graph as a Graphviz DOT source string. Pipe through dot -Tpng > graph.png (or -Tsvg) to view.

- `queued_only = true` — show only the expansion tree (edges from `referenced_by`), not the full citation fabric.
- `include_isolated = true` — keep entries that aren't part of any edge (default: hide them so the graph stays readable).
Node labels are the same citekeys bibliofetch bib emits; node colour/style reflects status (ok / pending / failed).
BiblioFetch.to_mermaid — Function
to_mermaid(store; queued_only = false, include_isolated = false) -> String

Render the store's citation graph as Mermaid (graph LR) source, ready to paste into a Markdown fence on GitHub / Obsidian / Docusaurus. Same edge-policy and filter flags as to_dot.
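A sketch of both renderers (output file names are illustrative):

```julia
using BiblioFetch

store = BiblioFetch.open_store(expanduser("~/papers"))
write("graph.dot", BiblioFetch.to_dot(store; queued_only = true))
# render with: dot -Tsvg graph.dot > graph.svg
write("graph.mmd", BiblioFetch.to_mermaid(store))  # paste into a mermaid fence
```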
Deduplication
BiblioFetch.find_duplicates — Function
find_duplicates(store) -> Vector{Pair{String,Vector{String}}}

Scan store's metadata directory and return a list of sha256 => keys pairs for every hash held by more than one entry. Each keys vector is sorted lexicographically, so the canonical (kept) key in dedup operations is deterministic.
The SHA-256 comes from the sha256 field written by fetch_paper!. Entries whose metadata lacks that field (very old stores, failed fetches, entries resolved into duplicate_of) are skipped.
BiblioFetch.resolve_duplicates! — Function
resolve_duplicates!(store; apply = false) -> NamedTuple

Walk the duplicate groups reported by find_duplicates. For each group keep the lexicographically first key as canonical; the rest are recorded with duplicate_of = "<canonical>" and their pdf_path is redirected to the canonical entry's file. On-disk duplicate PDFs are removed when apply = true; otherwise the function just reports what would happen.
Returns (; groups, bytes_freed, canonicals) — groups is the output of find_duplicates, bytes_freed is the size that would be (or was) recovered, and canonicals is a duplicate_key => canonical_key map.
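The dry-run-then-apply pattern in practice:

```julia
using BiblioFetch

store = BiblioFetch.open_store(expanduser("~/papers"))
report = BiblioFetch.resolve_duplicates!(store)        # apply = false: report only
println(length(report.groups), " duplicate groups; ",
        report.bytes_freed, " bytes reclaimable")
BiblioFetch.resolve_duplicates!(store; apply = true)   # now delete the extra PDFs
```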
Doctor (store integrity)
BiblioFetch.doctor — Function
doctor(store) -> Vector{StoreIssue}

Inventory the store for operational problems:

- cross-reference metadata `pdf_path` vs on-disk files (missing / orphan)
- flag `.part` leftover files from interrupted downloads
- flag 0-byte PDFs
- when a metadata entry records a `sha256`, verify the on-disk file still hashes to the same value (`:sha_mismatch`)
One pass, no network. Returns a flat list sorted first by kind, then by key / path.
BiblioFetch.fix! — Function
fix!(store, issues; kinds = (:incomplete_part,)) -> Int

Apply safe auto-fixes to a subset of issues. Returns the number of issues acted on. Safe defaults:

- `:incomplete_part` — remove the `.part` file unconditionally
- `:pdf_missing` — clear `pdf_path` from the metadata entry; don't touch the metadata's other fields, so a subsequent `bibliofetch sync --force` can re-fetch
Other kinds (:orphan_pdf, :sha_mismatch, :empty_pdf) are opt-in — pass their symbol in kinds to include them. Orphan removal in particular is destructive and should be reviewed first.
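A review-then-fix sketch, opting in to one extra kind:

```julia
using BiblioFetch

store = BiblioFetch.open_store(expanduser("~/papers"))
issues = BiblioFetch.doctor(store)
foreach(println, issues)                      # eyeball before touching anything
n = BiblioFetch.fix!(store, issues; kinds = (:incomplete_part, :pdf_missing))
println(n, " issues auto-fixed")
```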
BiblioFetch.StoreIssue — Type
StoreIssue

One integrity problem doctor found. kind is one of:

- `:pdf_missing` — metadata lists a `pdf_path` whose file is gone
- `:orphan_pdf` — a PDF on disk isn't referenced by any metadata entry
- `:incomplete_part` — a `.part` leftover from an interrupted download
- `:sha_mismatch` — metadata has a `sha256` that no longer matches the file on disk (PDF was replaced / corrupted)
- `:empty_pdf` — `pdf_path` exists but the file is 0 bytes
key identifies the metadata entry the issue belongs to, if any; orphan disk files have key == "".
Search
BiblioFetch.search_entries — Function
search_entries(store, query; fields, group, status, case_sensitive)
-> Vector{SearchMatch}

Substring-search the store's metadata. By default matches in any of title / authors / abstract / journal / key; override with fields.

- `query` — the text to search for (empty ⇒ every entry, useful with filters).
- `fields` — tuple / vector of `Symbol` field names to match against.
- `group` — optional group-prefix filter (empty ⇒ all groups).
- `status` — optional exact-match filter ("ok"/"failed"/"pending").
- `case_sensitive` — default `false`.
Results are sorted by number of matched fields (desc), key (asc). A paper is returned at most once even if multiple fields hit.
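For example (field names like `h.key` are assumptions based on the SearchMatch description below):

```julia
using BiblioFetch

store = BiblioFetch.open_store(expanduser("~/papers"))
hits = BiblioFetch.search_entries(store, "topological";
                                  fields = (:title, :abstract), status = "ok")
for h in hits
    println(h.key, "  ", h.title)
end
```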
BiblioFetch.SearchMatch — Type
SearchMatch

One row in the result of search_entries — the hit's normalized key, its status/title/year/group for display, which fields contained the query, and a ±40-char snippet around the first match for context.
Statistics
BiblioFetch.stats — Function
stats(store) -> StoreStats

Walk the store's .metadata/ directory once and aggregate:

- per-status / per-source / per-group counts
- PDF file count and total byte size (counts only files that exist)
- `pdf_missing` — entries whose metadata lists a `pdf_path` that's gone
- `duplicate_resolved` — entries linked to a canonical by `resolve_duplicates!`
- `graph_expanded` — entries queued by a citation hop (depth > 0)
- `oldest_fetch` / `newest_fetch` — earliest and latest `fetched_at` timestamps, `nothing` when the store has no successful fetches yet
One pass, no network. Safe to call on huge stores; per-entry cost is dominated by TOML.parsefile on the metadata file.
BiblioFetch.StoreStats — Type
StoreStats

Aggregate counts and sizes for a store, one walk of .metadata/ away. Used by bibliofetch stats for a daily-review dashboard and by any caller who wants to know "what's actually in here?" without enumerating entries by hand.
External metadata sources
BiblioFetch.datacite_lookup — Function
datacite_lookup(doi; proxy = nothing, timeout = 15, base_url = DATACITE_URL,
max_retries, base_delay) -> Dict

Fetch DataCite metadata for a DOI and return it in Crossref's metadata shape (so it slots straight into the existing fetch_paper! extraction). Returns an empty Dict on any failure.
Used as a fallback after Crossref returns nothing — covers dataset DOIs registered through Zenodo, Figshare, institutional DataCite clients, etc.
BiblioFetch.s2_lookup — Function
s2_lookup(ref; api_key = ENV["SEMANTIC_SCHOLAR_API_KEY"], proxy = nothing,
timeout = 15, base_url = S2_URL, max_retries, base_delay)
-> Dict

Look up a paper on Semantic Scholar. ref is a normalized key (10.xxxx/yyy or arxiv:…).
Returns a Dict{String,Any} with the fields BiblioFetch cares about:
"title"—String"authors"—Vector{String}(display names, one per author)"year"—Intornothing"abstract"—String(empty when S2 didn't have one)"journal"—String(empty when S2 didn't record one)"oa_pdf_url"—Stringpointing at the publisher / repository PDF, present only whenopenAccessPdf.urlis non-empty"s2_paper_id"— S2's own stable id, useful for follow-ups
Empty Dict on any failure (unreachable, 404, malformed JSON, etc.).
BiblioFetch.openalex_lookup — Function
openalex_lookup(ref; mailto = nothing, proxy = nothing, timeout = 15,
base_url = OPENALEX_URL, max_retries, base_delay)
-> (oa_pdf_url_or_nothing, metadata_dict)

Look up a work on OpenAlex by DOI or arXiv id. ref is the OpenAlex selector string — either "doi:<DOI>" or "arxiv:<id>".
Returns (pdf_url, metadata):
- `pdf_url` — best OA PDF URL OpenAlex has, or `nothing` when none is registered. Sourced from `open_access.oa_url`, falling back to `best_oa_location.pdf_url`.
- `metadata` — Crossref-shaped dict (title/author/container-title/issued/abstract), so it slots straight into the existing `fetch_paper!` extraction. Empty `Dict` on hard failure.
mailto enables OpenAlex's polite pool — pass rt.email here for nicer rate limits. No auth is needed beyond that.
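A hedged sketch chaining these lookups by hand (the fetch pipeline sequences them for you; the DOI and email are illustrative):

```julia
using BiblioFetch

doi = "10.5281/zenodo.1234567"            # a dataset DOI Crossref may not know
md = BiblioFetch.datacite_lookup(doi)
pdf_url, oa_md = BiblioFetch.openalex_lookup("doi:$doi"; mailto = "you@example.org")
isempty(md) && (md = oa_md)               # prefer DataCite, fall back to OpenAlex
```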
BiblioFetch.doaj_lookup — Function
doaj_lookup(doi; proxy = nothing, timeout = 15, base_url = DOAJ_URL,
max_retries, base_delay, sleep_fn)
-> (pdf_url_or_nothing, metadata_dict)

Look up a DOI in DOAJ's article index. Returns (pdf_url, metadata) where:

- `pdf_url` is the first `link` entry whose `content_type` contains "pdf" or whose `url` ends in `.pdf` — `nothing` when no such link exists.
- `metadata` is a Crossref-shaped dict translated from DOAJ's `bibjson` (title / author / container-title / issued / abstract). Returns `Dict()` on hard failure (unreachable, non-200, no results, malformed JSON).
Used as an opt-in publisher source for vetted gold-OA journals — surfaces PDFs from smaller / non-English titles that Unpaywall doesn't index.
Publisher TDM (authenticated)
BiblioFetch.aps_tdm_url — Function
aps_tdm_url(doi; base_url = APS_TDM_URL) -> String

Build the harvest.aps.org URL that returns a PDF for the given APS DOI. Does no validation beyond the 10.1103 prefix check in is_aps_doi.
BiblioFetch.is_aps_doi — Function
is_aps_doi(doi) -> Bool

Whether doi is published by the American Physical Society — all APS DOIs live under the 10.1103/ prefix. Checked before dispatching to harvest.aps.org to avoid spraying the endpoint with DOIs that would 404 anyway (and to conserve token quota).
BiblioFetch.elsevier_tdm_url — Function
elsevier_tdm_url(doi; base_url = ELSEVIER_TDM_URL) -> String

Build the api.elsevier.com URL that returns the article PDF when combined with Accept: application/pdf and the API-key headers from elsevier_tdm_auth_headers.
BiblioFetch.is_elsevier_doi — Function
is_elsevier_doi(doi) -> Bool

Whether doi is published by Elsevier. 10.1016/* covers ScienceDirect, Cell Press, The Lancet, and the vast majority of Elsevier content.
BiblioFetch.elsevier_tdm_auth_headers — Function
elsevier_tdm_auth_headers(; api_key = ENV["ELSEVIER_API_KEY"],
insttoken = ENV["ELSEVIER_INSTTOKEN"])
-> Vector{Pair{String,String}}

Build the header set for an Elsevier TDM request: X-ELS-APIKey always (when a key is configured), plus X-ELS-Insttoken when a token is set too. Returns an empty Vector when no key is configured — the fetch pipeline uses that to skip :elsevier entirely, matching the APS TDM pattern (don't spray requests that will 401 anyway).
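The gating pattern these helpers support, sketched (the DOI is hypothetical):

```julia
using BiblioFetch

doi = "10.1016/j.example.2024.01.001"     # hypothetical Elsevier DOI
if BiblioFetch.is_elsevier_doi(doi)
    headers = BiblioFetch.elsevier_tdm_auth_headers()
    if !isempty(headers)                  # empty means no key configured: skip source
        url = BiblioFetch.elsevier_tdm_url(doi)
        # request `url` with Accept: application/pdf plus `headers`
    end
end
```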
BiblioFetch.springer_oa_lookup — Function
springer_oa_lookup(doi; api_key = ENV["SPRINGER_API_KEY"],
proxy = nothing, timeout = 15,
base_url = SPRINGER_OA_URL)
-> (pdf_url_or_nothing, metadata_dict)

Ask the Springer Nature OpenAccess API whether doi is registered as an OA article. Returns (pdf_url, metadata):

- `pdf_url` is the canonical `link.springer.com/content/pdf/<DOI>.pdf` URL when the API confirms OA registration, else `nothing`.
- `metadata` is the parsed JSON body (empty `Dict` on hard failure or when the response has no `records`).
Returns (nothing, Dict()) without a network call when no API key is configured — the fetch pipeline uses that to skip :springer entirely, matching the APS/Elsevier "don't spray un-authenticated requests" pattern.
BiblioFetch.is_springer_doi — Function
is_springer_doi(doi) -> Bool

Whether doi is published under a Springer Nature imprint. Covers the four prefixes worth gating on:

- `10.1007/` — Springer (journals + books, the overwhelming majority)
- `10.1038/` — Nature portfolio (mix of OA and paywalled; OA API will tell us which)
- `10.1186/` — BMC / BioMed Central (all OA)
- `10.1140/` — European Physical Journal (EPJ)
Other Springer-distributed prefixes (10.1057 Palgrave, 10.1023 legacy Kluwer, 10.1134 Allerton) are rare in practice and omitted to keep the guard tight.
Network status
BiblioFetch.status — Function
status(; rt = detect_environment(), timeout = 5.0, probes = _STATUS_PROBES)
-> NetworkStatus

Probe every supported metadata / PDF endpoint concurrently (@async + fetch) and report which ones respond from the current network. Total wall time is roughly timeout plus a small fixed cost, not #probes × timeout.
Exposed so user code (and live-network tests) can ask "is Crossref reachable from here?" before queueing work. The probes kwarg is overridable so integration tests can point it at a local mock server.
BiblioFetch.is_reachable — Function
is_reachable(status, source) -> Bool

Quick predicate for test-gating code: is_reachable(status, :crossref) returns true iff the corresponding probe succeeded.
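Typical pre-flight gating:

```julia
using BiblioFetch

st = BiblioFetch.status()
BiblioFetch.is_reachable(st, :crossref) ||
    @warn "Crossref unreachable from this network; lookups will rely on fallbacks"
```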
BiblioFetch.NetworkStatus — Type
NetworkStatus

Aggregate of all probe results + the derived effective_sources — which of (:unpaywall, :arxiv, :direct) can actually do their job right now.
Useful to distinguish "at the university, full access" from "at home, OA only" before kicking off a long job, and to gate live-network tests so they skip cleanly on CI / offline machines.
BiblioFetch.ProbeResult — Type
ProbeResult

One reachability probe record — which endpoint, did it respond, how fast, what HTTP status came back. reachable treats HTTP 2xx-4xx as reachable (the server is up, just may or may not have the thing we probed for); 5xx / connection errors / timeouts are false.
Vault (topic-based collection)
BiblioFetch.VaultTopic — Type
VaultTopic

One topic loaded from a TOML file in the vault directory.
BiblioFetch.VaultIndex — Type
VaultIndex

Parsed vault.toml (or a synthetic one from the directory listing).
BiblioFetch.load_vault_index — Function
load_vault_index(dir) -> VaultIndex

Read vault.toml if present; otherwise treat every *.toml (except vault.toml) in dir as a topic file.
BiblioFetch.list_topics — Function
list_topics(index) -> Vector{VaultTopic}

BiblioFetch.topic_refs — Function

topic_refs(topic) -> Vector{String}

Return the normalized keys for all refs in a topic.
BiblioFetch.vault_add_ref! — Function
vault_add_ref!(topic_name, raw_ref; dir) -> String

Append raw_ref to [doi].list in <dir>/<topic_name>.toml, creating the file with an empty [topic] header if it does not exist. Returns the normalized key.
BiblioFetch.vault_fetch! — Function
vault_fetch!(index; topic_name, runtime, verbose) -> Dict{String,FetchJobResult}

Fetch papers for all topics (or a named subset) into index.store. Returns a Dict mapping topic name → FetchJobResult.
BiblioFetch.vault_bib — Function
vault_bib(index; topic_name, out) -> Int

Write a BibTeX file for all vault papers (or one topic). Returns entry count.
BiblioFetch.vault_search — Function
vault_search(index, query; fields, case_sensitive) -> Vector{SearchMatch}

Search across all papers in the vault store.
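A vault round-trip, sketched (directory and topic name are illustrative):

```julia
using BiblioFetch

dir = expanduser("~/vault")
BiblioFetch.vault_add_ref!("spin-liquids", "10.1103/RevModPhys.89.025003"; dir)
index = BiblioFetch.load_vault_index(dir)
results = BiblioFetch.vault_fetch!(index; topic_name = "spin-liquids")
BiblioFetch.vault_bib(index; topic_name = "spin-liquids", out = "spin-liquids.bib")
```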
CLI
BiblioFetch.cli_main — Function
cli_main(args = ARGS) -> Int

Dispatch a bibliofetch … command line. Returns exit code.
BiblioFetch.julia_main — Function
julia_main() -> Cint

Entry point for PackageCompiler.create_app. Delegates to cli_main(ARGS).
Native app build
BiblioFetch.build — Function
build(; sysimage_dir, bindir, force) -> String

Compile BiblioFetch into a sysimage using PackageCompiler.jl (create_sysimage with incremental=true), then write a thin shell wrapper into bindir.
Using a sysimage (rather than create_app) avoids the isolated-build errors that create_app triggers for packages with binary C extensions (HTTP → MbedTLS). It also produces a much smaller artefact (~40 MB vs ~300 MB) because the Julia runtime is not bundled — the system-installed julia is reused.
After a successful build, bibliofetch starts in under a second.
Arguments
- `sysimage_dir`: directory where `sys.so` (Linux), `sys.dylib` (macOS), or `sys.dll` (Windows) is written. Default: `~/.local/share/bibliofetch`
- `bindir`: where the `bibliofetch` wrapper script is installed. Default: `~/.local/bin`
- `force`: overwrite an existing sysimage. Default: `false`
Example
using Pkg; Pkg.add("PackageCompiler") # once
using BiblioFetch
BiblioFetch.build() # ~2–4 min, run once per Julia version
BiblioFetch.build(force=true) # rebuild after Pkg.update()

BiblioFetch.clean — Function

clean(; sysimage_dir, bindir, verbose) -> Nothing

Remove the sysimage and wrapper script installed by build.
Deletes:
- `sysimage_dir/sys.{so,dylib,dll}` — the compiled sysimage
- `bindir/bibliofetch` (or `bibliofetch.cmd` on Windows) — the wrapper script
The directories themselves are left in place. Silently skips files that do not exist unless verbose=true.
Example
using BiblioFetch
BiblioFetch.clean() # remove default installation
BiblioFetch.clean(verbose=true) # print each file removed