Artifacts

Artifacts are the shared work products that agents collaborate on. They are the source of truth in a Claw society -- agents never pass state to each other directly. All shared state flows through versioned artifacts.

What Is an Artifact?

An artifact is a named, versioned piece of data. Every time an agent writes to an artifact, a new version is created with:

  • A version ID (e.g. v1, v2, or a git SHA prefix)
  • The author (which agent wrote it)
  • A content hash (SHA-256 of the written content)
  • A timestamp

This gives you a full audit trail of who changed what and when.

from claw import StringArtifact

doc = StringArtifact("design-doc")
v1 = doc.write("# Design\nFirst draft.", agent="writer")
print(v1.id)            # "v1"
print(v1.author)        # "writer"
print(v1.content_hash)  # e.g. "a1b2c3d4e5f6" (illustrative; the value depends on the content)
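To make the version metadata concrete, here is a minimal sketch of how such a record could be built with the standard library. It is not Claw's implementation: the `VersionRecord` dataclass, the `make_version` helper, and the 12-character hash prefix are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class VersionRecord:
    """Hypothetical stand-in for the metadata attached to each write."""
    id: str
    author: str
    content_hash: str
    timestamp: datetime


def make_version(content: str, agent: str, sequence: int) -> VersionRecord:
    # SHA-256 over the UTF-8 bytes of the written content
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return VersionRecord(
        id=f"v{sequence}",
        author=agent,
        content_hash=digest[:12],  # assumption: only a short prefix is shown
        timestamp=datetime.now(timezone.utc),
    )


v1 = make_version("# Design\nFirst draft.", agent="writer", sequence=1)
v2 = make_version("# Design\nSecond draft.", agent="writer", sequence=2)
print(v1.id, v1.author, v1.content_hash)
print(v1.content_hash != v2.content_hash)  # different content, different hash
```

Because the hash depends only on the content, two writes with identical content produce the same hash even when the author or version ID differs.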

Built-in Artifact Types

Claw ships with two in-memory artifact types in the core package and three external artifact types in claw.artifacts.

StringArtifact

The simplest artifact. Stores a string in memory with full version history.

from claw import StringArtifact

doc = StringArtifact("design-doc")
doc.write("# Design\nFirst draft.", agent="writer")
doc.write("# Design\nSecond draft with more detail.", agent="writer")

content = doc.read(budget=4000)  # budget-aware read
print(content)

Each write() replaces the entire content. To see what changed between versions, use diff().

JsonArtifact

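Stores structured JSON data with JSON Merge Patch semantics (RFC 7386, since superseded by RFC 7396). Each write() merges the new JSON into the existing state rather than replacing it. Nested objects are merged key by key; arrays and scalar values are replaced as a whole.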

from claw import JsonArtifact

config = JsonArtifact("api-config")
config.write('{"endpoints": ["/users"], "version": 1}', agent="designer")

# Merge-patch: only the keys you include are updated
config.write('{"endpoints": ["/users", "/posts"]}', agent="designer")

# Read the merged result
print(config.read(budget=4000))
# {"endpoints": ["/users", "/posts"], "version": 1}

To remove a key, set it to null in the patch:

config.write('{"version": null}', agent="designer")
# "version" key is now removed from the state
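The merge behavior described above follows the JSON Merge Patch algorithm. The sketch below is a standalone version of that algorithm using only the standard library; `merge_patch` is a hypothetical helper name, and this is not necessarily how JsonArtifact implements it internally.

```python
import json
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply a JSON Merge Patch to target and return the merged value."""
    if not isinstance(patch, dict):
        # A non-object patch (array, string, number) replaces the target wholesale
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this key"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


state = json.loads('{"endpoints": ["/users"], "version": 1}')
state = merge_patch(state, json.loads('{"endpoints": ["/users", "/posts"]}'))
print(json.dumps(state))  # {"endpoints": ["/users", "/posts"], "version": 1}

state = merge_patch(state, json.loads('{"version": null}'))
print(json.dumps(state))  # {"endpoints": ["/users", "/posts"]}
```

One consequence of these semantics is that a key cannot be set to null through a patch, because null is reserved for deletion.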

Schema Validation

JsonArtifact supports optional JSON Schema validation for required keys and property types:

schema = {
    "required": ["name", "status"],
    "properties": {
        "name": {"type": "string"},
        "status": {"type": "string"},
        "priority": {"type": "integer"},
    },
}

task = JsonArtifact("task", schema=schema)
task.write('{"name": "Fix bug", "status": "open"}', agent="pm")

# A partial patch still validates, because the merge keeps "name":
task.write('{"status": "closed"}', agent="pm")

# Removing a required key would raise ValueError:
# task.write('{"name": null}', agent="pm")
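To illustrate what checking required keys and property types involves, here is a minimal standalone validator. It is a sketch under assumptions: the `validate` function and the `_TYPES` mapping are hypothetical, it covers only a small subset of JSON Schema, and it assumes validation runs against the merged state rather than the raw patch.

```python
from typing import Any

# Map JSON Schema type names to Python types (small subset, for illustration)
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate(state: dict[str, Any], schema: dict[str, Any]) -> None:
    """Raise ValueError if state violates required keys or property types."""
    for key in schema.get("required", []):
        if key not in state:
            raise ValueError(f"missing required key {key!r}")
    for key, spec in schema.get("properties", {}).items():
        expected = _TYPES.get(spec.get("type", ""))
        if key not in state or expected is None:
            continue  # absent optional key, or a type this sketch does not cover
        value = state[key]
        # bool is a subclass of int in Python, so reject it for numeric types
        is_bool_as_number = isinstance(value, bool) and spec["type"] in ("integer", "number")
        if is_bool_as_number or not isinstance(value, expected):
            raise ValueError(f"key {key!r} must be of type {spec['type']}")


schema = {
    "required": ["name", "status"],
    "properties": {
        "name": {"type": "string"},
        "status": {"type": "string"},
        "priority": {"type": "integer"},
    },
}

validate({"name": "Fix bug", "status": "closed"}, schema)  # passes silently

try:
    validate({"status": "closed"}, schema)
except ValueError as exc:
    print(exc)  # missing required key 'name'
```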

CodeFileArtifact

File-backed artifact that lives in a git repository. Each write() writes the file to disk, stages it with git add, and creates a git commit. Versions are tied to commit SHAs.

from pathlib import Path
from claw.artifacts import CodeFileArtifact

code = CodeFileArtifact("src/main.py", workdir=Path("./repo"))
v1 = code.write("print('hello')", agent="coder")
v2 = code.write("print('world')", agent="coder")

# Diff uses git diff between commits
print(code.diff(v1, v2))

Key properties:

  • code.file_path -- absolute path to the file on disk
  • code.workdir -- root directory of the git repository
  • Reads return the current file content from disk
  • Parent directories are created automatically on write
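The write, stage, commit sequence described above can be sketched with plain git commands. This is an illustration rather than CodeFileArtifact's actual code: the `commit_file` helper, the commit message format, and the per-command identity override are assumptions. It expects `workdir` to be an initialized git repository.

```python
import subprocess
from pathlib import Path


def commit_file(workdir: Path, rel_path: str, content: str, agent: str) -> str:
    """Write a file, stage it, commit it, and return the short commit SHA."""
    target = workdir / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)  # create parent directories
    target.write_text(content, encoding="utf-8")

    def git(*args: str) -> str:
        done = subprocess.run(
            ["git", *args], cwd=workdir, check=True, capture_output=True, text=True
        )
        return done.stdout.strip()

    git("add", "--", rel_path)
    git(
        # assumption: attribute the commit to the agent via per-command config
        "-c", f"user.name={agent}",
        "-c", f"user.email={agent}@example.invalid",
        "-c", "commit.gpgsign=false",
        "commit", "-m", f"{agent}: update {rel_path}",
    )
    return git("rev-parse", "--short", "HEAD")
```

Calling `commit_file(Path("./repo"), "src/main.py", "print('hello')\n", "coder")` returns a SHA prefix that can serve as the version ID. Note that `git commit` exits with an error when the content is unchanged, so this sketch raises `subprocess.CalledProcessError` in that case.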

GitHubIssueArtifact

Artifact backed by a real GitHub issue, accessed via the gh CLI.

from claw.artifacts import GitHubIssueArtifact

issue = GitHubIssueArtifact(issue_number=42, repo="owner/repo")

# Read fetches the issue title, body, labels, and comments
content = issue.read(budget=2000)

# Write posts a comment
issue.write('{"comment": "Working on this."}', agent="pm")

# Write can also set labels
issue.write('{"labels": ["bug", "priority-high"]}', agent="pm")

Write accepts a JSON string with these optional keys:

| Key | Effect |
| --- | --- |
| comment | Posts a comment on the issue |
| labels | Adds labels to the issue |
| assignees | Sets assignees on the issue |

If the content is not valid JSON, it is treated as a plain comment.
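The payload handling described above, JSON with optional keys and a plain-comment fallback, can be sketched as follows. `parse_issue_write` is a hypothetical helper, and treating valid JSON that is not an object (for example a bare number) as a plain comment is an assumption the source does not confirm.

```python
import json
from typing import Any


def parse_issue_write(content: str) -> dict[str, Any]:
    """Turn a write() payload into a dict of requested actions."""
    try:
        payload = json.loads(content)
    except json.JSONDecodeError:
        return {"comment": content}  # not JSON: the whole text is the comment
    if not isinstance(payload, dict):
        # assumption: JSON that is not an object is also treated as a comment
        return {"comment": content}
    known = ("comment", "labels", "assignees")
    return {key: payload[key] for key in known if key in payload}


print(parse_issue_write('{"labels": ["bug", "priority-high"]}'))
print(parse_issue_write("Working on this."))
```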

Dry Run Mode

Pass dry_run=True to log commands without executing them. Useful for testing societies that interact with GitHub:

issue = GitHubIssueArtifact(issue_number=42, repo="owner/repo", dry_run=True)
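A dry-run switch of this kind is typically a thin wrapper around command execution. The sketch below shows the general pattern; `run_command` is a hypothetical helper, and the exact `gh` invocation Claw issues is not documented here, so the command shown is only an example.

```python
import logging
import subprocess
from typing import Sequence

logger = logging.getLogger("dry_run_sketch")


def run_command(args: Sequence[str], dry_run: bool = False) -> str:
    """Run a CLI command, or only log it when dry_run is set."""
    if dry_run:
        logger.info("dry run, would execute: %s", " ".join(args))
        return ""
    done = subprocess.run(list(args), check=True, capture_output=True, text=True)
    return done.stdout


# Nothing is executed here; the command is only logged.
output = run_command(
    ["gh", "issue", "comment", "42", "--repo", "owner/repo", "--body", "Working on this."],
    dry_run=True,
)
```

To see the logged commands while testing, configure logging first, for example with `logging.basicConfig(level=logging.INFO)`.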

GitHubPRArtifact

Artifact backed by a real GitHub pull request. Provides both read/write access and lifecycle actions (approve, request changes, merge).

from claw.artifacts import GitHubPRArtifact

pr = GitHubPRArtifact(pr_number=1, repo="owner/repo")

# Read fetches title, body, changed files, reviews, and comments
content = pr.read(budget=5000)

# Lifecycle actions
pr.approve(message="LGTM")
pr.request_changes(message="Fix error handling in auth module")
pr.merge(strategy="squash")  # "squash", "merge", or "rebase"

Write accepts a JSON string with these optional keys:

| Key | Effect |
| --- | --- |
| review | Posts a review comment |
| approve | Approves the PR (value is the approval message) |
| request_changes | Requests changes (value is the message) |
| comment | Posts a general comment |
| merge | Merges the PR (value is the strategy) |

Like GitHubIssueArtifact, this type also supports dry_run=True.

Version History

Every artifact keeps a chronological list of all versions. Access it via the history property:

doc = StringArtifact("spec")
doc.write("Draft 1", agent="alice")
doc.write("Draft 2", agent="bob")
doc.write("Draft 3", agent="alice")

for v in doc.history:
    print(f"{v.id} by {v.author} ({v.content_hash})")
# v1 by alice (...)
# v2 by bob (...)
# v3 by alice (...)

To get just the latest version:

current = doc.version()  # returns Version or None if never written

Budget-Aware Reads

Every artifact's read() method takes a budget parameter -- the maximum number of tokens to return. If the content exceeds the budget, it is truncated.

doc = StringArtifact("big-doc")
doc.write("A" * 100_000, agent="writer")

short = doc.read(budget=100)   # returns ~400 characters
full = doc.read(budget=50_000) # returns the full content

How Tokenization Works

Budget management uses the Tokenizer protocol. The default CharApproxTokenizer estimates 1 token as approximately 4 characters. This is fast and sufficient for most use cases.
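The four-characters-per-token estimate is simple enough to sketch in a few lines. The class below is a hypothetical re-creation for illustration, not the library's CharApproxTokenizer; in particular, rounding the count up is an assumption.

```python
import math


class CharApproxSketch:
    """Hypothetical re-creation of a 4-characters-per-token estimator."""

    CHARS_PER_TOKEN = 4

    def count(self, text: str) -> int:
        # Round up so that a short non-empty string still costs one token
        return math.ceil(len(text) / self.CHARS_PER_TOKEN)

    def truncate(self, text: str, budget: int) -> str:
        # Keep at most budget * 4 characters from the start of the text
        return text[: max(budget, 0) * self.CHARS_PER_TOKEN]


tok = CharApproxSketch()
big = "A" * 100_000
print(tok.count(big))                   # 25000
print(len(tok.truncate(big, 100)))      # 400
print(len(tok.truncate(big, 50_000)))   # 100000
```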

You can inject a custom tokenizer (e.g. one backed by tiktoken) into any artifact:

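import tiktoken
from claw import Tokenizer, StringArtifact

class TiktokenTokenizer:
    """Tokenizer backed by tiktoken for exact token counts."""

    def __init__(self, encoding_name: str = "cl100k_base") -> None:
        # Pick the encoding that matches your model
        self._enc = tiktoken.get_encoding(encoding_name)

    def count(self, text: str) -> int:
        # Exact token count under the chosen encoding
        return len(self._enc.encode(text))

    def truncate(self, text: str, budget: int) -> str:
        # Keep the first `budget` tokens and decode them back to text
        return self._enc.decode(self._enc.encode(text)[:budget])

doc = StringArtifact("precise-doc", tokenizer=TiktokenTokenizer())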

Diffing

Compare any two versions of an artifact:

doc = StringArtifact("spec")
doc.write("Line 1\nLine 2\n", agent="alice")
doc.write("Line 1\nLine 2 (edited)\nLine 3\n", agent="bob")

v1 = doc.history[0]
v2 = doc.history[1]
print(doc.diff(v1, v2))

For StringArtifact, this produces a unified diff. For JsonArtifact, it shows a key-level diff (added, removed, changed keys). For CodeFileArtifact, it uses git diff between commits.
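Both in-memory diff styles can be approximated with the standard library. The sketch below uses `difflib` for the unified diff and set operations for a top-level key diff; `text_diff` and `key_diff` are hypothetical helpers, and the output format of Claw's own diff() may differ.

```python
import difflib


def text_diff(old: str, new: str, old_label: str = "v1", new_label: str = "v2") -> str:
    """Unified diff between two text versions."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
    )
    return "".join(lines)


def key_diff(old: dict, new: dict) -> dict:
    """Top-level added, removed, and changed keys between two JSON objects."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


print(text_diff("Line 1\nLine 2\n", "Line 1\nLine 2 (edited)\nLine 3\n"))
print(key_diff(
    {"endpoints": ["/users"], "version": 1},
    {"endpoints": ["/users", "/posts"], "owner": "designer"},
))
# {'added': ['owner'], 'removed': ['version'], 'changed': ['endpoints']}
```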

Custom Artifacts

You can create your own artifact types by implementing the ArtifactProtocol. Any object that satisfies this protocol can be registered in the artifact store and used by the runtime.

from claw import ArtifactProtocol, Version

class MyArtifact:
    @property
    def name(self) -> str:
        """Unique identifier for this artifact."""
        ...

    @property
    def history(self) -> list[Version]:
        """All versions in chronological order."""
        ...

    def read(self, budget: int) -> str:
        """Load content within a token budget."""
        ...

    def write(self, content: str, agent: str) -> Version:
        """Mutate the artifact and return the new version."""
        ...

    def diff(self, v1: Version, v2: Version) -> str:
        """Compute a human-readable diff between two versions."""
        ...

    def version(self) -> Version | None:
        """Return the current version, or None if never written."""
        ...

The protocol is defined as a typing.Protocol with runtime_checkable, so you can verify your implementation:

assert isinstance(MyArtifact(), ArtifactProtocol)
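For a fuller picture, here is a self-contained sketch of an append-only log artifact with every method filled in. To keep it runnable without Claw installed, `Version` and `ArtifactLike` below are local stand-ins for claw's Version and ArtifactProtocol; their fields, and the `AppendLogArtifact` class itself, are assumptions for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@dataclass(frozen=True)
class Version:  # local stand-in for claw.Version
    id: str
    author: str
    content_hash: str


@runtime_checkable
class ArtifactLike(Protocol):  # local stand-in for ArtifactProtocol
    @property
    def name(self) -> str: ...
    @property
    def history(self) -> list: ...
    def read(self, budget: int) -> str: ...
    def write(self, content: str, agent: str) -> Version: ...
    def diff(self, v1: Version, v2: Version) -> str: ...
    def version(self) -> Optional[Version]: ...


class AppendLogArtifact:
    """Append-only log: each write adds one line instead of replacing content."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._lines: list[str] = []
        self._history: list[Version] = []

    @property
    def name(self) -> str:
        return self._name

    @property
    def history(self) -> list[Version]:
        return list(self._history)  # copy, so callers cannot mutate history

    def read(self, budget: int) -> str:
        # Rough budget of about 4 characters per token, keeping the newest text
        text = "\n".join(self._lines)
        return text[-budget * 4:] if budget > 0 else ""

    def write(self, content: str, agent: str) -> Version:
        self._lines.append(content)
        snapshot = "\n".join(self._lines)
        digest = hashlib.sha256(snapshot.encode("utf-8")).hexdigest()[:12]
        version = Version(f"v{len(self._history) + 1}", agent, digest)
        self._history.append(version)
        return version

    def diff(self, v1: Version, v2: Version) -> str:
        # Version N holds the first N lines, so the diff is the lines in between
        start, end = sorted((int(v1.id[1:]), int(v2.id[1:])))
        return "\n".join(f"+{line}" for line in self._lines[start:end])

    def version(self) -> Optional[Version]:
        return self._history[-1] if self._history else None


log = AppendLogArtifact("build-log")
first = log.write("compile ok", agent="builder")
second = log.write("tests ok", agent="tester")
print(isinstance(log, ArtifactLike))  # True
print(log.diff(first, second))        # +tests ok
```

A runtime-checkable protocol only verifies that the members exist, not that their signatures or return types match, so a passing isinstance check is a necessary but not sufficient test of an implementation.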

Artifacts in Societies

Artifacts are passed to the runtime when you run a society. After execution, the final state of all artifacts is available in the result:

import asyncio

from claw import Society, Agent, Cooperation, LocalRuntime, StringArtifact
# MockLLM is a stand-in LLM client; import it from wherever your setup provides it.

async def main() -> None:
    # Build a society
    society = Society("doc-review")
    writer = Agent("writer", role="author", model="claude-sonnet")
    reviewer = Agent("reviewer", role="critic", model="claude-sonnet")
    society.add(writer, reviewer)
    society.connect(writer, reviewer, Cooperation())

    # Create artifacts
    doc = StringArtifact("design-doc")

    # Run with artifacts
    runtime = LocalRuntime(MockLLM())
    result = await runtime.run(society, "Write a design doc", artifacts=[doc])

    # Inspect final artifact state
    for name, content in result.artifacts.items():
        print(f"{name}: {content[:80]}")

# runtime.run() is a coroutine, so it needs an event loop
asyncio.run(main())

Artifact Declarations vs. Instances

Claw distinguishes between two levels of artifact objects:

  • Artifact (from claw.artifact) -- a lightweight metadata reference used in edge type declarations. It carries only a name and an optional description. Use this when defining the society graph.

  • StringArtifact, JsonArtifact, etc. -- concrete implementations with actual read/write/diff behavior. These are created at runtime and registered in the ArtifactStore.

from claw import Artifact, Cooperation, StringArtifact

# Reuses society, writer, reviewer, and runtime from the previous example;
# the await below must run inside an async function.

# Lightweight reference for graph definition
spec = Artifact(name="task-spec", description="The task specification")
society.connect(writer, reviewer, Cooperation(artifacts=[spec]))

# Concrete instance for runtime (same name as the declaration)
task_spec = StringArtifact("task-spec")
result = await runtime.run(society, "Do the task", artifacts=[task_spec])

Conflict Detection

The ArtifactStore tracks writes within each execution batch. If two different agents attempt to write to the same artifact within a single batch, an ArtifactConflictError is raised:

from claw import ArtifactConflictError

# This is handled internally by the runtime, but you can catch it
# in custom resolution strategies or observers.

This ensures that concurrent writes are detected rather than silently overwriting each other.
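The per-batch check can be pictured as a small bookkeeping structure that remembers which agent first wrote each artifact. The sketch below is not the ArtifactStore's code: `BatchWriteTracker`, `ConflictError`, and the `start_batch` method are hypothetical names used to illustrate the idea.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for ArtifactConflictError."""


class BatchWriteTracker:
    """Remembers the first writer of each artifact within the current batch."""

    def __init__(self) -> None:
        self._writers: dict[str, str] = {}

    def start_batch(self) -> None:
        # A new batch forgets all earlier writers
        self._writers.clear()

    def record_write(self, artifact_name: str, agent: str) -> None:
        previous = self._writers.setdefault(artifact_name, agent)
        if previous != agent:
            raise ConflictError(
                f"{artifact_name!r} written by both {previous!r} and {agent!r} in one batch"
            )


tracker = BatchWriteTracker()
tracker.record_write("design-doc", "writer")
tracker.record_write("design-doc", "writer")  # same agent again: allowed

try:
    tracker.record_write("design-doc", "reviewer")
except ConflictError as exc:
    print(f"conflict: {exc}")

tracker.start_batch()
tracker.record_write("design-doc", "reviewer")  # new batch: allowed
```

Repeated writes by the same agent pass in this sketch because the source describes a conflict only when two different agents write to the same artifact.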

Summary

| Type | Storage | Versioning | Best For |
| --- | --- | --- | --- |
| StringArtifact | In-memory | Sequential (v1, v2, ...) | Documents, specs, drafts |
| JsonArtifact | In-memory | Sequential with merge-patch | Config, structured data |
| CodeFileArtifact | File + git | Git commit SHAs | Source code |
| GitHubIssueArtifact | GitHub API | Sequential | Issue tracking |
| GitHubPRArtifact | GitHub API | Sequential | Code review workflows |

All types support budget-aware reads, version history, and diffing. All types satisfy the ArtifactProtocol, making them interchangeable in the runtime.