Python API Reference

This page documents the public Python API for EZPI.

Core Functions

ezpi.add_rfc822(repo, content, domain=None, env=None)

Add an RFC822 message to the repository.

This is the main entry point for adding email messages to a public-inbox repository. The message can be provided as raw bytes or as an email.message.Message object.

The function automatically:

  • Adds a Date header if missing

  • Generates a Message-Id if missing

  • Fixes the charset for non-ASCII content

  • Extracts author info from the From header
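The automatic steps listed above can be approximated with the standard library alone. The sketch below is illustrative only and is not EZPI's implementation: the helper name fill_missing_headers is hypothetical, and charset fixing is left out.

```python
from email import message_from_bytes, policy
from email.utils import formatdate, make_msgid, parseaddr


def fill_missing_headers(msg, domain=None):
    """Add Date and Message-Id when absent; return (name, address) from From."""
    if msg["Date"] is None:
        msg["Date"] = formatdate()
    if msg["Message-Id"] is None:
        # make_msgid() falls back to the local hostname when domain is None.
        msg["Message-Id"] = make_msgid(domain=domain)
    # parseaddr() splits "Name <addr>" into its two parts.
    return parseaddr(str(msg["From"] or ""))


raw = b"From: Jane Doe <jane@example.com>\nSubject: Test\n\nHello\n"
msg = message_from_bytes(raw, policy=policy.default)
author = fill_missing_headers(msg, domain="example.com")
print(author)
```

Headers that are already present are left untouched, which matches the "if missing" wording above.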

Parameters:
  • repo (str) – Path to the bare git repository.

  • content (Message | bytes) – The email message as bytes or a Message object.

  • domain (str | None) – Optional domain for generating the Message-Id if missing.

  • env (dict[str, str] | None) – Optional git environment variables for the commit.

Return type:

None

Example:

import email.message
import ezpi

# From bytes (any complete raw RFC822 message)
email_bytes = b'From: sender@example.com\nSubject: Test\n\nHello\n'
ezpi.add_rfc822('/path/to/repo.git', email_bytes)

# From Message object
msg = email.message.EmailMessage()
msg['From'] = 'sender@example.com'
msg['Subject'] = 'Test'
msg.set_content('Hello')
ezpi.add_rfc822('/path/to/repo.git', msg)

ezpi.add_plaintext(repo, content, subject, authorname, authoremail, domain=None)

Add plaintext content to the repository as an RFC822 message.

This is a convenience wrapper that creates a minimal RFC822 message from plaintext content and adds it to the repository.

Parameters:
  • repo (str) – Path to the bare git repository.

  • content (str) – The plaintext message body.

  • subject (str) – The email subject line.

  • authorname (str) – Display name for the From header.

  • authoremail (str) – Email address for the From header.

  • domain (str | None) – Optional domain for generating the Message-Id.

Return type:

None

Example:

ezpi.add_plaintext(
    '/path/to/repo.git',
    content='Hello, world!',
    subject='Greeting',
    authorname='John Doe',
    authoremail='john@example.com',
)
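
To picture what a "minimal RFC822 message" built from plaintext might look like, here is a sketch using only the standard library. It is an assumption about the general shape, not EZPI's code; build_plaintext_message is a hypothetical helper.

```python
from email.message import EmailMessage
from email.utils import formataddr


def build_plaintext_message(content, subject, authorname, authoremail):
    """Wrap a plaintext body in a message with From and Subject headers."""
    msg = EmailMessage()
    msg["From"] = formataddr((authorname, authoremail))
    msg["Subject"] = subject
    msg.set_content(content)  # text/plain body
    return msg


msg = build_plaintext_message(
    "Hello, world!", "Greeting", "John Doe", "john@example.com"
)
raw = msg.as_bytes()
print(raw.decode())
```

Either msg or raw has the form that add_rfc822() accepts as its content argument.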

v2 Format Functions

ezpi.add_rfc822_v2(v2path, content, domain=None, env=None, auto_epoch='size')

Add an RFC822 message to a public-inbox v2 format repository.

This function manages the v2 inbox structure, creating it if necessary and handling epoch rotation based on the specified mode.

Parameters:
  • v2path (str) – Path to the v2 inbox directory.

  • content (Message | bytes) – The email message as bytes or a Message object.

  • domain (str | None) – Optional domain for generating the Message-Id if missing.

  • env (dict[str, str] | None) – Optional git environment variables.

  • auto_epoch (str) – Epoch rotation mode: 'size' (default) or 'annual'.

Raises:
  • ValueError – If the message is missing required headers.

  • RuntimeError – If any git operation fails or lock cannot be acquired.

Return type:

None

Example:

ezpi.add_rfc822_v2('/path/to/inbox', email_bytes)
ezpi.add_rfc822_v2('/path/to/inbox', msg, auto_epoch='annual')

ezpi.init_v2_inbox(v2path)

Initialize a new public-inbox v2 format inbox.

Creates the v2 directory structure, including:

  • inbox.lock file for global locking

  • git/ directory for epoch repositories

  • git/0.git as the first epoch

  • all.git/ as a read-only endpoint with alternates
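
The layout described above can be checked with a few path tests. This is a minimal sketch, assuming only the entry names given in the description; missing_v2_entries is a hypothetical helper, and the demo creates plain directories rather than real git repositories.

```python
import os
import tempfile

# Entries named in the description above, relative to the inbox root.
V2_LAYOUT = ("inbox.lock", "git", os.path.join("git", "0.git"), "all.git")


def missing_v2_entries(v2path):
    """Return the layout entries that do not exist under v2path."""
    return [e for e in V2_LAYOUT if not os.path.exists(os.path.join(v2path, e))]


with tempfile.TemporaryDirectory() as root:
    before = missing_v2_entries(root)
    # Fake the layout with empty directories and an empty lock file.
    os.makedirs(os.path.join(root, "git", "0.git"))
    os.makedirs(os.path.join(root, "all.git"))
    open(os.path.join(root, "inbox.lock"), "w").close()
    after = missing_v2_entries(root)

print(before, after)
```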

Parameters:

v2path (str) – Path where the v2 inbox should be created.

Returns:

Path to the first epoch repository (git/0.git).

Return type:

str

ezpi.init_epoch(v2path, epoch)

Create a new epoch repository in a v2 inbox.

Creates a bare git repository at v2path/git/{epoch}.git and updates the alternates file in all.git to include the new epoch.

Parameters:
  • v2path (str) – Path to the v2 inbox directory.

  • epoch (int) – Epoch number (0-based integer).

Returns:

Path to the newly created epoch repository.

Raises:

RuntimeError – If git init fails.

Return type:

str

ezpi.get_latest_epoch(v2path)

Find the highest numbered epoch in a v2 inbox.

Parameters:

v2path (str) – Path to the v2 inbox directory.

Returns:

Tuple of (epoch_number, epoch_path).

Raises:

FileNotFoundError – If no epochs exist.

Return type:

tuple[int, str]
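
As an illustration of what "highest numbered epoch" means, the sketch below scans git/ for directories named {epoch}.git, as described for init_epoch(), and compares them numerically. It is an assumption about the approach, not the library's code; latest_epoch and EPOCH_RE are hypothetical names.

```python
import os
import re
import tempfile

EPOCH_RE = re.compile(r"^(\d+)\.git$")


def latest_epoch(v2path):
    """Return (epoch_number, epoch_path) for the highest-numbered epoch."""
    gitdir = os.path.join(v2path, "git")
    epochs = []
    for entry in os.listdir(gitdir):
        match = EPOCH_RE.match(entry)
        if match:
            epochs.append((int(match.group(1)), os.path.join(gitdir, entry)))
    if not epochs:
        raise FileNotFoundError(f"no epochs found in {gitdir}")
    # Tuples sort by epoch number first, so max() picks 10 over 9.
    return max(epochs)


with tempfile.TemporaryDirectory() as root:
    for name in ("0.git", "1.git", "10.git", "notes"):
        os.makedirs(os.path.join(root, "git", name))
    latest = latest_epoch(root)

print(latest[0])
```

Comparing integers rather than directory names matters once an inbox passes epoch 9, since "10.git" sorts before "9.git" as a string.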

ezpi.get_epoch_size(epoch_path)

Calculate the total size of a git epoch repository.

Sums the size of pack files and loose objects.

Parameters:

epoch_path (str) – Path to the epoch bare git repository.

Returns:

Total size in bytes.

Return type:

int
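
A rough way to reproduce this kind of measurement is to add up file sizes under the repository's objects/ directory, where git keeps both pack files and loose objects. The sketch below is an approximation under that assumption: it counts every file under objects/, which may differ slightly from what EZPI counts, and objects_size is a hypothetical name.

```python
import os
import tempfile


def objects_size(epoch_path):
    """Sum the sizes of all files under <epoch_path>/objects, in bytes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(os.path.join(epoch_path, "objects")):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


with tempfile.TemporaryDirectory() as repo:
    os.makedirs(os.path.join(repo, "objects", "pack"))
    os.makedirs(os.path.join(repo, "objects", "ab"))
    # One fake pack file (100 bytes) and one fake loose object (20 bytes).
    with open(os.path.join(repo, "objects", "pack", "pack-1.pack"), "wb") as fh:
        fh.write(b"\0" * 100)
    with open(os.path.join(repo, "objects", "ab", "cdef"), "wb") as fh:
        fh.write(b"\0" * 20)
    size = objects_size(repo)

print(size)
```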

ezpi.should_rotate_epoch(epoch_path, mode)

Check if a new epoch should be created.

Parameters:
  • epoch_path (str) – Path to the current epoch repository.

  • mode (str) – Rotation mode: 'size' or 'annual'.

Returns:

True if a new epoch should be created.

Return type:

bool
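
The reference does not spell out the exact rotation rules, so the following is only a sketch of one plausible decision function. The threshold value is the one documented for ezpi.V2_SIZE_THRESHOLD. The ">=" comparison, the year-based reading of 'annual', the ValueError for unknown modes, and the name should_rotate are all assumptions.

```python
SIZE_THRESHOLD = 1073741824  # value documented for ezpi.V2_SIZE_THRESHOLD


def should_rotate(mode, size_bytes=0, epoch_year=None, current_year=None):
    """Decide whether to start a new epoch (illustrative rules only)."""
    if mode == "size":
        # Assumption: rotate once the epoch reaches the threshold.
        return size_bytes >= SIZE_THRESHOLD
    if mode == "annual":
        # Assumption: rotate when the calendar year has moved past the
        # year in which the current epoch was started.
        if epoch_year is None or current_year is None:
            return False
        return current_year > epoch_year
    raise ValueError(f"unknown rotation mode: {mode!r}")


print(should_rotate("size", size_bytes=2 * SIZE_THRESHOLD))
print(should_rotate("annual", epoch_year=2023, current_year=2024))
```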

Reading Functions

Added in version 0.6.

These functions let you consume messages from a v2 inbox that you (or someone else) keep up to date. EZPI itself never fetches from remotes; refresh the local checkout with git pull, lei up, or a similar tool before calling these functions.

ezpi.iter_new_messages(v2path, cursor_name, auto_advance=False, start='head')

Yield messages the named cursor has not seen yet.

On first use (no cursor file exists), behavior depends on start:

  • 'head' (default): advance cursor to the current HEAD of every epoch and yield nothing. The caller will only see messages added after this first call.

  • 'beginning': walk the entire history from the oldest commit.

New epochs appearing after the cursor was saved are walked from their first commit automatically.

If a stored commit hash no longer exists because history was rewritten (for example, by a rebase), this function transparently invokes recover_cursor() and resumes. Recovery always succeeds: in the worst case it falls through to the top commit.

Parameters:
  • v2path (str) – Path to the v2 inbox directory.

  • cursor_name (str) – The cursor identifier.

  • auto_advance (bool) – If True, persist cursor state automatically after each yielded message. If False, the caller must call save_cursor().

  • start (str) – First-use policy, 'head' or 'beginning'.

Yields:

Tuples of (epoch_number, commit_hash, raw_rfc822_bytes).

Raises:

ValueError – If start is neither 'head' nor 'beginning'.

Return type:

Iterator[tuple[int, str, bytes]]
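
With auto_advance=False, a caller would typically save the cursor only after a message has been handled successfully. The sketch below shows that loop using the signatures documented on this page. So that it runs without an inbox on disk, the ezpi module is passed in as the api argument and a small stand-in object is used for the demo; process_new and the stand-in are hypothetical.

```python
from types import SimpleNamespace


def process_new(api, v2path, cursor_name, handle):
    """Handle each unseen message, saving the cursor after handle() returns."""
    count = 0
    for epoch, commit, raw in api.iter_new_messages(
        v2path, cursor_name, auto_advance=False
    ):
        handle(raw)  # if this raises, the cursor stays where it was
        api.save_cursor(v2path, cursor_name, epoch, commit, raw)
        count += 1
    return count


# Stand-in for the ezpi module, for demonstration only.
saved = []


def fake_iter(v2path, cursor_name, auto_advance=False, start="head"):
    yield 0, "c1", b"Subject: a\n\none\n"
    yield 0, "c2", b"Subject: b\n\ntwo\n"


def fake_save(v2path, cursor_name, epoch, commit, msg_bytes, commit_date=None):
    saved.append(commit)


fake = SimpleNamespace(iter_new_messages=fake_iter, save_cursor=fake_save)
seen = []
n = process_new(fake, "/path/to/inbox", "my-reader", seen.append)
print(n, saved)
```

With the real library, the call would be process_new(ezpi, '/path/to/inbox', 'my-reader', handler) after import ezpi.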

ezpi.iter_messages(v2path, since=None)

Yield (epoch, commit, raw_bytes) for messages in chronological order.

Walks every epoch in order, skipping no-op commits (purge/rm commits without an m blob). If since is given (a map from epoch number to commit hash), only commits strictly after that hash are yielded in that epoch. Epochs not present in since are walked from their first commit.

Parameters:
  • v2path (str) – Path to the v2 inbox directory.

  • since (dict[int, str] | None) – Optional {epoch: commit_hash} map marking a cursor position per epoch.

Yields:

Tuples of (epoch_number, commit_hash, raw_rfc822_bytes).

Raises:
  • StaleCommitError – If a since hash is not in the repo. Carries the epoch and hash so the cursor-aware layer can recover.

  • CursorError – On unexpected git failures while reading blobs.

Return type:

Iterator[tuple[int, str, bytes]]
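
The yielded tuples can be parsed with the standard email package, and the last commit seen per epoch can be collected into a map of the shape that since expects. The sketch below assumes only the documented signature; subjects_since is a hypothetical helper and the ezpi module is replaced by a stand-in so the example runs on its own.

```python
from email import message_from_bytes, policy
from types import SimpleNamespace


def subjects_since(api, v2path, since=None):
    """Return (subjects, position); position maps epoch -> last commit seen."""
    position = dict(since or {})
    subjects = []
    for epoch, commit, raw in api.iter_messages(v2path, since=since):
        msg = message_from_bytes(raw, policy=policy.default)
        subjects.append(str(msg.get("Subject", "")))
        position[epoch] = commit
    return subjects, position


# Stand-in for the ezpi module, for demonstration only.
def fake_iter_messages(v2path, since=None):
    yield 0, "c1", b"Subject: first\n\nbody\n"
    yield 1, "c2", b"Subject: second\n\nbody\n"


fake = SimpleNamespace(iter_messages=fake_iter_messages)
subjects, position = subjects_since(fake, "/path/to/inbox")
print(subjects, position)
```

Passing position back as since on a later call would resume after the commits already seen.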

ezpi.get_all_epochs(v2path)

List all epochs in a v2 inbox, sorted ascending by epoch number.

Parameters:

v2path (str) – Path to the v2 inbox directory.

Returns:

A list of (epoch_number, epoch_path) tuples. Empty if no epochs exist.

Raises:

FileNotFoundError – If the inbox has no git/ directory.

Return type:

list[tuple[int, str]]

Cursor State

Added in version 0.6.

Named cursors track per-reader position. EZPI stores one JSON file per cursor alongside the inbox, at {v2path}/ezpi-cursor.{name}.json.
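
The file naming above can be expressed directly. The sketch below builds the path and reads the JSON if the file exists. The helper names are hypothetical, and the key written in the demo is made up: the real cursor schema is not described here.

```python
import json
import os
import tempfile


def cursor_path(v2path, name):
    """Path of the state file for a named cursor, per the pattern above."""
    return os.path.join(v2path, f"ezpi-cursor.{name}.json")


def read_cursor(v2path, name):
    """Return the parsed JSON state, or None if the file does not exist."""
    path = cursor_path(v2path, name)
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


with tempfile.TemporaryDirectory() as inbox:
    first = read_cursor(inbox, "my-reader")
    # Placeholder content; not the real cursor schema.
    with open(cursor_path(inbox, "my-reader"), "w", encoding="utf-8") as fh:
        json.dump({"demo": 1}, fh)
    second = read_cursor(inbox, "my-reader")

print(first, second)
```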

ezpi.load_cursor(v2path, cursor_name)

Load cursor state for cursor_name, or None if no state exists.

Parameters:
  • v2path (str) – Path to the v2 inbox directory.

  • cursor_name (str) – The cursor identifier.

Returns:

A dict with the cursor state, or None if the cursor has never been saved.

Raises:

CursorStateError – If the cursor file exists but cannot be parsed.

Return type:

dict[str, Any] | None

ezpi.save_cursor(v2path, cursor_name, epoch, commit, msg_bytes, commit_date=None)

Persist a cursor position after processing a message.

Stores the Message-Id and subject (extracted from msg_bytes) together with the committer date of commit; all three are required for rebase recovery.

Parameters:
  • v2path (str) – Path to the v2 inbox directory.

  • cursor_name (str) – The cursor identifier.

  • epoch (int) – The epoch containing commit.

  • commit (str) – The commit hash just processed.

  • msg_bytes (bytes) – The raw message bytes yielded for that commit.

  • commit_date (str | None) – Optional pre-computed committer date in git’s %ci format (YYYY-MM-DD HH:MM:SS +ZZZZ). If omitted, a git log -1 --format=%ci subprocess is spawned. Callers iterating with iter_new_messages() and auto_advance receive this value for free and pass it through to avoid the extra fork.

Return type:

None

ezpi.reset_cursor(v2path, cursor_name)

Delete the cursor state file. No-op if it doesn’t exist.

Parameters:
  • v2path (str) – Path to the v2 inbox directory.

  • cursor_name (str) – The cursor identifier.

Return type:

None

ezpi.recover_cursor(v2path, cursor_name, epoch)

Re-locate a cursor’s stored commit after history was rewritten.

Uses the stored commit_date to anchor a --since-as-filter search, then matches on (subject, msgid). Mirrors korgalore’s algorithm (pi_feed.py:248-303) including its fallback choices:

  • If rev-list itself errors, fast-forward to the current top commit.

  • If rev-list yields no candidates, save state at the top commit.

  • If no candidate matches the stored headers, fall back to the first candidate after the date (logged as a warning). This accepts a small risk of re-delivery rather than risking skipped messages.

Parameters:
  • v2path (str) – Path to the v2 inbox directory.

  • cursor_name (str) – The cursor identifier.

  • epoch (int) – The epoch whose stored position needs recovery.

Returns:

The new commit hash, also written back into the cursor file.

Raises:

CursorStateError – If there is no stored state for epoch or required fields are missing.

Return type:

str
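
The order of the fallbacks described above can be written as a small pure function. This is only a sketch of the decision order, not the library's code: pick_recovery_commit is a hypothetical name, candidates are assumed to arrive oldest first as (commit, subject, msgid) tuples, and None stands for "rev-list failed".

```python
def pick_recovery_commit(candidates, stored_subject, stored_msgid, top_commit):
    """Choose the commit a recovered cursor should point at."""
    if candidates is None:
        # rev-list itself errored: fast-forward to the top commit.
        return top_commit
    if not candidates:
        # Nothing after the stored date: settle on the top commit.
        return top_commit
    for commit, subject, msgid in candidates:
        if (subject, msgid) == (stored_subject, stored_msgid):
            return commit
    # No header match: first candidate after the date.
    return candidates[0][0]


cands = [("c5", "Re: x", "<1@a>"), ("c6", "Patch", "<2@a>")]
print(pick_recovery_commit(cands, "Patch", "<2@a>", "c9"))
```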

Reading Exceptions

Added in version 0.6.

exception ezpi.CursorError

Base class for cursor-related errors.

exception ezpi.CursorStateError

Cursor state file is missing a required field or cannot be parsed.

exception ezpi.StaleCommitError(epoch, commit)

A since commit hash given to iter_messages() is not present in the repo (history was rewritten). Carries the epoch and stored commit so the cursor-aware layer can trigger recovery.


Utility Functions

ezpi.run_hook(repo)

Run the post-commit hook if it exists and is executable.

Parameters:

repo (str) – Path to the bare git repository.

Return type:

None

ezpi.clean_header(hdrval)

Decode and clean an email header value.

Handles RFC2047 encoded headers (e.g., =?utf-8?q?…?=) and normalizes whitespace. Invalid encodings are handled gracefully with replacement.

Parameters:

hdrval (str) – The raw header value to decode.

Returns:

The decoded and cleaned header value as a string.

Return type:

str
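
A rough approximation of this behavior with the standard library is shown below: decode any RFC2047 encoded words, replace undecodable bytes, and collapse runs of whitespace. It is a sketch under those assumptions, not EZPI's implementation, and clean_header_sketch is a hypothetical name.

```python
from email.header import decode_header


def clean_header_sketch(hdrval):
    """Decode RFC2047 encoded words and normalize whitespace."""
    parts = []
    for value, charset in decode_header(hdrval):
        if isinstance(value, bytes):
            try:
                value = value.decode(charset or "ascii", errors="replace")
            except LookupError:
                # Unknown charset name: fall back to UTF-8 with replacement.
                value = value.decode("utf-8", errors="replace")
        parts.append(value)
    # split()/join collapses newlines, tabs, and repeated spaces.
    return " ".join("".join(parts).split())


print(clean_header_sketch("=?utf-8?q?Caf=C3=A9?="))
print(clean_header_sketch("Hello \n   world"))
```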

Low-Level Functions

These functions are primarily for internal use but are documented for advanced users.

ezpi.git_write_commit(repo, env, c_msg, body, dest='m')

Create a git commit containing a single file with the given content.

This is a low-level function that creates git objects (blob, tree, commit). Uses pygit2 when available, otherwise falls back to git subprocess commands. The commit is made to refs/heads/master.

Parameters:
  • repo (str) – Path to the bare git repository.

  • env (dict[str, str]) – Environment variables for the git commit (GIT_AUTHOR_*, GIT_COMMITTER_*).

  • c_msg (str) – The commit message (typically the email subject).

  • body (bytes) – The file content to store (typically the serialized email).

  • dest (str) – Filename for the blob in the tree (default: 'm').

Return type:

None

ezpi.git_run_command(gitdir, args, stdin=None, env=None)

Run a git command and return its output.

Parameters:
  • gitdir (str) – Path to the git repository (sets GIT_DIR environment variable).

  • args (list[str]) – List of arguments to pass to git (without 'git' itself).

  • stdin (bytes | None) – Optional bytes to send to the command’s stdin.

  • env (dict[str, str] | None) – Optional environment variables to set for the command.

Returns:

A tuple of (return_code, stdout_bytes, stderr_bytes).

Return type:

tuple[int, bytes, bytes]
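
A wrapper with this shape can be sketched with subprocess. The version below is an illustration, not the library's code: it assumes the caller's env is layered on top of the current process environment and that GIT_DIR is set last. The names build_git_env and run_git are hypothetical.

```python
import os
import subprocess


def build_git_env(gitdir, env=None):
    """Merge os.environ, the caller's env, and GIT_DIR (in that order)."""
    merged = dict(os.environ)
    if env:
        merged.update(env)
    merged["GIT_DIR"] = gitdir
    return merged


def run_git(gitdir, args, stdin=None, env=None):
    """Run `git <args>` and return (return_code, stdout_bytes, stderr_bytes)."""
    proc = subprocess.run(
        ["git", *args],
        input=stdin,
        capture_output=True,
        env=build_git_env(gitdir, env),
    )
    return proc.returncode, proc.stdout, proc.stderr


demo_env = build_git_env("/path/to/repo.git", {"GIT_AUTHOR_NAME": "EZ PI"})
print(demo_env["GIT_DIR"])
```

A call such as run_git('/path/to/repo.git', ['rev-parse', 'HEAD']) would then return the three-element tuple described above, provided git is installed.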

ezpi.check_valid_repo(repo)

Verify that a path is a valid bare git repository.

Parameters:

repo (str) – Path to the repository to check.

Raises:

FileNotFoundError – If the path doesn’t exist or isn’t a valid bare git repo.

Return type:

None

Constants

ezpi.DEFAULT_NAME

Default author name used when none is provided: 'EZ PI'

ezpi.DEFAULT_ADDR

Default author email used when none is provided: 'ezpi@localhost'

ezpi.DEFAULT_SUBJ

Default subject used when none is provided: 'EZPI commit'

ezpi.V2_SIZE_THRESHOLD

Size threshold for epoch rotation in bytes (1 GiB): 1073741824

ezpi.PI_HEAD

Git ref used for commits: 'refs/heads/master'