Verification

Verification determines whether the available evidence supports a claim. TruthKeeper provides multiple verification strategies that can be combined for higher accuracy.

Verification Strategies

MiniCheck (NLI-based)

MiniCheck uses natural language inference to check if the evidence entails the claim. It's best for claims about documentation, comments, and natural language sources.

from truthkeeper.verification import MiniCheckVerifier

verifier = MiniCheckVerifier(
    model="bespoke-minicheck-7b",  # or "gpt-4" for higher accuracy
    threshold=0.8
)

result = await verifier.verify(
    claim="The API rate limits requests to 100 per minute",
    evidence="Rate limiting: 100 requests/minute per API key"
)

print(f"Entailment score: {result.score}")  # 0.94
print(f"Verdict: {result.verdict}")  # SUPPORTED

When to Use

  • Documentation-based claims
  • Claims about comments or docstrings
  • Claims paraphrasing natural language sources

AST Verifier

The AST verifier parses code and checks claims against the abstract syntax tree. It understands code structure, not just text patterns.

from truthkeeper.verification import ASTVerifier

verifier = ASTVerifier(
    languages=["python", "typescript", "go"]
)

result = await verifier.verify(
    claim="UserService.authenticate() calls PasswordHasher.verify()",
    evidence_file="src/services/user.py"
)

print(f"Call found: {result.call_exists}")  # True
print(f"At lines: {result.locations}")  # [47, 52]

Supported Checks

  • Function/method existence
  • Call relationships
  • Class inheritance
  • Import dependencies
  • Type annotations
  • Decorator usage
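
Under the hood, this kind of check can be implemented with a standard parser. The sketch below is illustrative only (not the library's internals) and uses Python's built-in ast module to find where a method with a given name is called:

import ast

def find_calls(source: str, callee: str) -> list[int]:
    """Return line numbers where a function or method named `callee` is called."""
    locations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Handle attribute calls (hasher.verify()) and bare calls (verify())
            name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
            if name == callee:
                locations.append(node.lineno)
    return locations

with open("src/services/user.py") as f:
    print(find_calls(f.read(), "verify"))  # e.g. [47, 52]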

Multi-Source Corroboration

The corroboration verifier checks whether multiple independent sources support the claim; the more sources that agree, the higher the confidence.

from truthkeeper.verification import CorroborationVerifier

verifier = CorroborationVerifier(
    min_sources=2,
    agreement_threshold=0.7
)

result = await verifier.verify(
    claim="The application uses PostgreSQL as its primary database",
    sources=[
        "file://docker-compose.yml",
        "file://README.md",
        "file://src/config.py"
    ]
)

print(f"Sources agreeing: {result.agreeing_sources}")  # 3
print(f"Agreement score: {result.agreement}")  # 1.0

Custom Verifiers

You can implement custom verification strategies for domain-specific needs:

from truthkeeper.verification import BaseVerifier, Evidence, VerificationResult

class APISchemaVerifier(BaseVerifier):
    """Verifies claims against OpenAPI schemas."""

    async def verify(self, claim: str, evidence: Evidence) -> VerificationResult:
        # Load OpenAPI schema
        schema = await self.load_schema(evidence.source)

        # Check claim against schema
        if self.claim_matches_schema(claim, schema):
            return VerificationResult(
                verdict="SUPPORTED",
                score=0.95,
                details={"matched_path": "/users/{id}"}
            )

        return VerificationResult(
            verdict="UNSUPPORTED",
            score=0.2,
            details={"reason": "No matching endpoint found"}
        )

# Register the custom verifier
client.register_verifier("api_schema", APISchemaVerifier())

Verification Pipeline

When a claim is verified, TruthKeeper runs it through a configurable pipeline:

┌─────────────────────────────────────────────────────────────────┐
│                     Verification Pipeline                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Claim ──► [Evidence Extraction] ──► [Strategy Selection]      │
│                                              │                   │
│              ┌───────────────────────────────┼───────────────┐   │
│              │                               │               │   │
│              ▼                               ▼               ▼   │
│       ┌──────────┐                   ┌──────────┐    ┌──────────┐│
│       │MiniCheck │                   │   AST    │    │  Corr.   ││
│       │ (0.92)   │                   │ (0.88)   │    │ (1.0)    ││
│       └────┬─────┘                   └────┬─────┘    └────┬─────┘│
│            │                              │               │      │
│            └──────────────┬───────────────┴───────────────┘      │
│                           ▼                                      │
│                  [Confidence Calculator]                         │
│                           │                                      │
│                           ▼                                      │
│                  confidence = 0.93                               │
│                  state = SUPPORTED                               │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
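
Conceptually, the fan-out stage behaves like the sketch below: the selected strategies run concurrently and their scores feed the confidence calculator. This is illustrative code under assumed names, not the library's internals:

import asyncio

async def run_pipeline(claim, evidence, verifiers, weights):
    # Fan out: run all selected strategies concurrently
    results = await asyncio.gather(
        *(v.verify(claim=claim, evidence=evidence) for v in verifiers)
    )
    # Fan in: weighted combination of the individual scores
    confidence = sum(w * r.score for w, r in zip(weights, results))
    return confidence, results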

Confidence Calculation

The final confidence score is a weighted combination of verification scores and evidence-quality signals:

confidence = (
    0.4 * minicheck_score +      # Semantic understanding
    0.2 * authority_score +      # Source trustworthiness
    0.3 * corroboration_score +  # Cross-source agreement
    0.1 * recency_score          # Evidence freshness
)
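
Plugging in the scores from the pipeline diagram above (MiniCheck 0.92, corroboration 1.0), with an authority score of 0.9 and a recency score of 0.8 assumed for illustration:

confidence = 0.4 * 0.92 + 0.2 * 0.9 + 0.3 * 1.0 + 0.1 * 0.8
           = 0.368 + 0.180 + 0.300 + 0.080
           = 0.928  # ≈ 0.93, the value shown in the diagram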

# Weights are configurable per-project
client.configure_verification(
    weights={
        "minicheck": 0.5,
        "ast": 0.3,
        "corroboration": 0.2
    }
)

Verification Triggers

Verification runs automatically when:

  • Claim created: Initial verification (HYPOTHESIS → SUPPORTED/OUTDATED)
  • Source changed: Reverification of stale claims
  • Manual request: User triggers reverification
  • Scheduled: Periodic reverification of old claims

# Configure automatic reverification
client.configure_verification(
    auto_verify_on_create=True,
    reverify_stale_claims=True,
    periodic_reverification={
        "enabled": True,
        "interval_days": 7,
        "min_age_days": 30
    }
)

Handling Verification Failures

When verification fails or produces low confidence:

result = await client.verify_claim(claim_id)

if result.verdict == "CONTESTED":
    # Multiple strategies disagree
    print(f"Conflicting results: {result.strategy_results}")

    # Queue for human review
    await client.queue_for_review(
        claim_id,
        reason="Verification strategies disagree",
        priority="HIGH"
    )

elif result.verdict == "UNSUPPORTED":
    # Evidence doesn't support the claim
    print(f"Claim not supported: {result.details}")

    # Mark as outdated
    await client.update_claim_state(claim_id, "OUTDATED")

Performance Considerations

Caching

Verification results are cached to avoid redundant computation:

client.configure_verification(
    cache_ttl=3600,  # Cache results for 1 hour
    cache_key_includes=["evidence_hash", "strategy_version"]
)
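
The cache key components above suggest the key is derived from a digest of the evidence plus the strategy version, so that changing either invalidates the cached result. A hypothetical derivation:

import hashlib

def cache_key(evidence: bytes, strategy_version: str) -> str:
    # Illustrative only: combine an evidence digest with the strategy version
    evidence_hash = hashlib.sha256(evidence).hexdigest()
    return f"{evidence_hash}:{strategy_version}"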

Async Verification

For high-throughput scenarios, verification can run asynchronously:

# Queue verification without waiting
job_id = await client.queue_verification(claim_id)

# Check status later
status = await client.get_verification_status(job_id)
if status.complete:
    print(f"Result: {status.result}")

Next Steps