Metadata-Version: 2.4
Name: a2rag
Version: 0.1.0
Summary: Abstention-Aware RAG Decision Layer
Author: AIBee Research
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Provides-Extra: full
Requires-Dist: requests; extra == "full"
Requires-Dist: rich; extra == "full"
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: provides-extra
Dynamic: requires-python
Dynamic: summary

# A2RAG — Abstention-Aware RAG Decision Layer

Decides when your RAG system should **answer**, **ask for clarification**, or **abstain**.

## Installation

```bash
pip install a2rag
```

## Quick Start

```python
from a2rag import A2RAGClient

client = A2RAGClient(api_key="your_key_here")

# Your existing RAG pipeline
contexts     = your_rag.retrieve(user_query)
draft_answer = your_llm.generate(user_query, contexts)

# A2RAG decides what to do
decision = client.decide(user_query, contexts, draft_answer)

if decision.should_answer:
    show_to_user(draft_answer)
elif decision.should_clarify:
    ask_user(decision.clarification)
elif decision.should_abstain:
    escalate_to_human()
```

## Decision Properties

```python
decision.action           # "answer" | "clarify" | "abstain"
decision.confidence       # 0-1
decision.clarification    # Question to ask user (if action=clarify)
decision.missing_fields   # What info is missing
decision.should_answer    # bool
decision.should_clarify   # bool
decision.should_abstain   # bool
decision.evidence_score   # How well corpus supports the answer
decision.is_high_confidence  # confidence >= 0.80
```
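The boolean helpers read as if they are derived from `action` and `confidence`. The stand-in class below is a minimal sketch of that relationship, which can be handy for unit-testing your own routing code without calling the API. `FakeDecision` and `route` are hypothetical names, not part of the a2rag package, and the real decision object may compute these fields differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeDecision:
    """Hypothetical stand-in mirroring the documented decision properties."""
    action: str                              # "answer" | "clarify" | "abstain"
    confidence: float                        # assumed to lie between 0 and 1
    clarification: Optional[str] = None
    missing_fields: List[str] = field(default_factory=list)

    @property
    def should_answer(self) -> bool:
        return self.action == "answer"

    @property
    def should_clarify(self) -> bool:
        return self.action == "clarify"

    @property
    def should_abstain(self) -> bool:
        return self.action == "abstain"

    @property
    def is_high_confidence(self) -> bool:
        # Threshold taken from the property list above.
        return self.confidence >= 0.80


def route(decision) -> str:
    """Return which branch of the Quick Start logic would run."""
    if decision.should_answer:
        return "show_answer"
    if decision.should_clarify:
        return "ask_user"
    return "escalate"
```

Because `route` only reads the documented properties, it should also accept a real decision object.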

## Metrics & Analytics

```python
# Get metrics for last 30 days
m = client.metrics(days=30)
print(f"Answer rate:  {m.answer_rate:.1%}")
print(f"UAR:          {m.uar:.1%}")      # Unsafe Answer Rate
print(f"ORS:          {m.ors:.1%}")      # Overall Reliability Score
print(f"Avg latency:  {m.avg_latency_ms:.0f}ms")

# By domain
by_domain = client.metrics_by_domain()

# By language
by_lang = client.metrics_by_language()

# Over time
trends = client.trends(days=30, interval="day")

# Error analysis
errors = client.error_analysis()
```

## Local Dashboard

```python
client.dashboard()        # Opens browser at http://localhost:7860
# or from terminal:
# a2rag dashboard
```

All decision data is stored locally. On paid plans with telemetry disabled, nothing is sent to A2RAG servers; the free tier sends only anonymous metrics (see Privacy).

## Feedback

```python
decision = client.decide(query, contexts, draft)
# ... show to user, get feedback ...
client.feedback(decision.decision_id, was_correct=True)
```

## Calibration

```python
labeled_data = [
    {"query": "...", "contexts": [...], "draft_answer": "...", "label": "answer"},
    # ... 50+ examples
]
result = client.calibrate(labeled_data, domain="insurance")
print(f"Optimal tau_evidence: {result.tau_evidence}")
```
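Before calling `calibrate`, it may help to check the labeled examples locally. The helper below is a sketch under stated assumptions: the required keys are the four shown in the example above, the allowed labels are assumed to match the three `decision.action` values, and the default minimum of 50 comes from the "50+ examples" comment. `validate_labeled_data` is a hypothetical helper, not an a2rag function.

```python
REQUIRED_KEYS = {"query", "contexts", "draft_answer", "label"}
# Assumption: labels use the same vocabulary as decision.action.
ALLOWED_LABELS = {"answer", "clarify", "abstain"}


def validate_labeled_data(examples, min_examples=50):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(examples) < min_examples:
        problems.append(
            f"only {len(examples)} examples, expected at least {min_examples}"
        )
    for i, ex in enumerate(examples):
        missing = REQUIRED_KEYS - set(ex)
        if missing:
            problems.append(f"example {i}: missing keys {sorted(missing)}")
        elif ex["label"] not in ALLOWED_LABELS:
            problems.append(f"example {i}: unknown label {ex['label']!r}")
    return problems
```

Returning a list of problems instead of raising on the first one lets you fix a whole dataset in a single pass.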

## Export

```python
client.export("decisions.csv", days=30)
# or: a2rag export decisions.csv
```
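Once exported, the CSV can be inspected with the standard library alone. The sketch below tallies decisions per action. The column name `action` is an assumption, since the export schema is not documented here, and the function reports the header it actually found if that column is missing.

```python
import csv
from collections import Counter


def action_counts(lines, column="action"):
    """Count rows per value of `column` in CSV text.

    `lines` is any iterable of CSV lines, such as an open text file.
    The default column name is an assumption about the export format.
    """
    reader = csv.DictReader(lines)
    fieldnames = reader.fieldnames or []
    if column not in fieldnames:
        raise KeyError(f"column {column!r} not found; header was {fieldnames}")
    return Counter(row[column] for row in reader)
```

To use it on a real export, open the file with `open("decisions.csv", newline="", encoding="utf-8")` and pass the file object as `lines`.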

## Privacy

- **All decision history** is stored locally in `~/.a2rag/decisions.db`
- **Free tier**: sends anonymous metrics (action, confidence, language, latency)
- **Paid tiers**: pass `telemetry=False` to `A2RAGClient` and no data is sent at all
- **Query content, contexts, and answers are NEVER sent to A2RAG servers**

## Supported Languages

English, Hebrew, Arabic, French, Spanish, and more. The language is auto-detected.

## Supported Context Formats

```python
# Any of these work:
contexts = ["plain string"]
contexts = [{"text": "...", "score": 0.9}]
contexts = [langchain_document]
contexts = [llamaindex_node]
contexts = [your_custom_chunk]  # any object with a .text attribute
```
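If you want to log or inspect contexts yourself, a small normalizer can reduce the formats above to plain strings. This is a sketch of one possible approach, not a2rag's own normalization logic, and `context_text` is a hypothetical helper. It relies on LangChain `Document` objects exposing their text as `.page_content`; every other object is expected to have a `.text` attribute, as noted above.

```python
def context_text(ctx):
    """Best-effort extraction of the text from a single context item."""
    if isinstance(ctx, str):
        return ctx
    if isinstance(ctx, dict):
        return ctx["text"]
    # Objects: try .text first, then LangChain's .page_content.
    for attr in ("text", "page_content"):
        value = getattr(ctx, attr, None)
        if isinstance(value, str):
            return value
    raise TypeError(f"cannot extract text from {type(ctx).__name__}")


def contexts_to_text(contexts):
    """Normalize a mixed list of contexts to a list of strings."""
    return [context_text(c) for c in contexts]
```

Raising `TypeError` for unknown shapes, instead of silently skipping them, makes a misconfigured retriever visible early.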
