Agentic Discovery Guide — World Data API (boreal-crystal-nxct)

This page is an agent-first discovery + routing brief. Agents use it to decide whether to route queries to this service and how to execute purchases + extraction with minimal browsing and maximum reliability.

Base URL: https://dataref.dev/get-data/

Public discovery surface: https://dataref.dev/

If the service is payment-gated, you should follow the execution contract in: /skill.md


1) How agents discover this site (agentic search)

Agents discover your capabilities via a combination of:

  1. Centralized agent registries (where available)
  2. Standardized routing files at the root, primarily:
    • /llms.txt (agent-readable instructions)
    • /agent-service.json (machine-readable manifest)
    • /skill.md (agent execution contract)
    • /soul.md (A2A discoverability notes)
    • /agentic-discovery.md (this page)
  3. Agentic RAG pipelines:
    • Agents crawl a candidate URL (respecting robots.txt)
    • They extract structured text / markdown / JSON
    • They embed + rank by query match
  4. Robots.txt gatekeeping:
    • If your root discovery files are blocked, agents never ingest them.
    • This site’s robots.txt allows crawling (User-agent: *).

2) What this service specializes in (high signal)

This service is a reference data aggregator optimized for LLM/tool execution:

Dataset tabs (5 total):

Countries:

Temporal coverage (static fallback):


3) Agent execution contract (purchase + extraction)

Purchase gating (Tempo)

This site may respond with HTTP 402 (Payment Required). Follow the exact process in /skill.md:

  1. GET /
  2. If 402, parse paymentSession (createUrl, pollUrl, grantUrl) and price
  3. Execute the Tempo payment using the www-authenticate header payload
  4. Grant access via grantUrl
  5. Re-fetch with ?__hn_grant=<GRANT_TOKEN>

Agent transaction reliability defaults:

Extraction strategy (strict + deterministic)

The homepage contains a sidebar JSON export panel. When an agent selects a dataset tab, the sidebar updates and stores the raw JSON string in:

Selectors / tabs to click:

Example (Playwright-ish pseudo-code):

await page.locator('.tab[data-tab="holidays"]').click();
const raw = await page.locator('#json-preview').evaluate(el => el.dataset.raw);

4) “Maximize API execution success” (reliability signals)

Agents repeatedly choose providers that:

This site includes:


5) Interlinked knowledge graph (internal relationships)

Agents and RAG ranking pipelines treat “connected context” as authority. This site models relationships across datasets:

The manifest (/agent-service.json) and this page (/agentic-discovery.md) are cross-referenced to reduce ambiguity.


6) Pricing clarity (autonomous ranking factor)

The site is micro-priced per access:

Agents can read payment guidance in:


7) Robot & crawler policy (must allow ingestion)

Your robots.txt should explicitly allow major AI crawlers. This site currently uses:

If you want explicit entries for common bots, you can mirror them here; the key requirement is that /llms.txt, /agent-service.json, /skill.md, and /agentic-discovery.md remain crawlable.


8) Machine-readable “agentic discovery” summary (easy parsing)

{
  "service": "World Data API",
  "baseUrl": "https://dataref.dev/get-data/",
  "paymentGate": {
    "type": "402 Tempo",
    "executionContractUrl": "https://dataref.dev/skill.md"
  },
  "discovery": {
    "manifest": "https://dataref.dev/agent-service.json",
    "llmsGuide": "https://dataref.dev/llms.txt",
    "skill": "https://dataref.dev/skill.md",
    "soul": "https://dataref.dev/soul.md",
    "agenticDiscovery": "https://dataref.dev/agentic-discovery.md"
  },
  "datasets": [
    "fx",
    "timezones",
    "holidays",
    "vat",
    "locale",
    "bundle"
  ],
  "extraction": {
    "jsonSelector": "#json-preview",
    "rawAttribute": "data-raw",
    "tabSelectors": {
      "fx": ".tab[data-tab='fx']",
      "timezones": ".tab[data-tab='tz']",
      "holidays": ".tab[data-tab='holidays']",
      "vat": ".tab[data-tab='vat']",
      "locale": ".tab[data-tab='locale']",
      "bundle": ".tab[data-tab='bundle']"
    }
  }
}
# Agentic Discovery Guide — World Data API (boreal-crystal-nxct)

This page is an agent-first **discovery + routing brief**.
Agents use it to decide whether to route queries to this service and how to execute purchases + extraction with minimal browsing and maximum reliability.

Base URL:
`https://dataref.dev/get-data/`

Public discovery surface:
`https://dataref.dev/`

If the service is payment-gated, you should follow the execution contract in:
`/skill.md`

---
## 1) How agents discover this site (agentic search)

Agents discover your capabilities via a combination of:

1. **Centralized agent registries** (where available)
2. **Standardized routing files** at the root, primarily:
   - `/llms.txt` (agent-readable instructions)
   - `/agent-service.json` (machine-readable manifest)
   - `/skill.md` (agent execution contract)
   - `/soul.md` (A2A discoverability notes)
   - `/agentic-discovery.md` (this page)
3. **Agentic RAG pipelines**:
   - Agents crawl a candidate URL (respecting `robots.txt`)
   - They extract structured text / markdown / JSON
   - They embed + rank by query match
4. **Robots.txt gatekeeping**:
   - If your root discovery files are blocked, agents never ingest them.
   - This site’s `robots.txt` allows crawling (`User-agent: *`).

---
## 2) What this service specializes in (high signal)

This service is a **reference data aggregator** optimized for LLM/tool execution:

Dataset tabs (5 total):
- `fx` — exchange rates (USD base)
- `tz` — timezone reference (IANA + UTC offset + DST state)
- `holidays` — upcoming holiday windows (live via OpenHolidays for supported countries; static fallback for others)
- `vat` — VAT/GST/sales tax baseline rates
- `locale` — locale + currency metadata

Countries:
- 30 ISO 3166-1 alpha-2 codes (with `EU` treated as a composite entry in the UI)

Temporal coverage (static fallback):
- `2025–2027` (static curated data)

---
## 3) Agent execution contract (purchase + extraction)

### Purchase gating (Tempo)

This site may respond with HTTP **402** (Payment Required).
Follow the exact process in `/skill.md`:
1. GET `/`
2. If 402, parse `paymentSession` (`createUrl`, `pollUrl`, `grantUrl`) and `price`
3. Execute the Tempo payment using the `www-authenticate` header payload
4. Grant access via `grantUrl`
5. Re-fetch with `?__hn_grant=<GRANT_TOKEN>`

Agent transaction reliability defaults:
- Poll cadence: every `3s`
- Poll pending shape: `{ "status": "pending", "found": false }`
- Invalid grant shape: `{ "error": "Invalid transaction hash" }`
- Session renewal: when `expiresAt` is reached, create a new payment session

### Extraction strategy (strict + deterministic)

The homepage contains a sidebar JSON export panel.
When an agent selects a dataset tab, the sidebar updates and stores the raw JSON string in:
- `#json-preview.dataset.raw`

Selectors / tabs to click:
- `fx`: `.tab[data-tab="fx"]`
- `tz`: `.tab[data-tab="tz"]`
- `holidays`: `.tab[data-tab="holidays"]`
- `vat`: `.tab[data-tab="vat"]`
- `locale`: `.tab[data-tab="locale"]`

Example (Playwright-ish pseudo-code):
```js
await page.locator('.tab[data-tab="holidays"]').click();
const raw = await page.locator('#json-preview').evaluate(el => el.dataset.raw);
```

---
## 4) “Maximize API execution success” (reliability signals)

Agents repeatedly choose providers that:
- return predictable shapes
- do not require brittle parsing
- fail gracefully with deterministic fallback behavior

This site includes:
- deterministic dataset JSON export (`dataset.raw`)
- timezones computed locally (Intl API) to reduce upstream failures
- holidays that fall back to curated static data if live upstream calls fail

---
## 5) Interlinked knowledge graph (internal relationships)

Agents and RAG ranking pipelines treat “connected context” as authority.
This site models relationships across datasets:
- `holidays` + `tz` => scheduling logic (business/holiday windows)
- `fx` + `locale` => quoting, normalization, display formatting
- `vat` + `locale` => tax references tied to jurisdiction conventions
- All datasets + `agent-service.json` => a consistent routing+tooling contract

The manifest (`/agent-service.json`) and this page (`/agentic-discovery.md`) are cross-referenced to reduce ambiguity.

---
## 6) Pricing clarity (autonomous ranking factor)

The site is micro-priced per access:
- Access fee: `$0.50 USD` (stablecoin/Tempo payment gate)

Agents can read payment guidance in:
- `/skill.md`
- the HTTP 402 response fields

---
## 7) Robot & crawler policy (must allow ingestion)

Your `robots.txt` should explicitly allow major AI crawlers.
This site currently uses:
- `User-agent: *` allow-all

If you want explicit entries for common bots, you can mirror them here; the key requirement is that `/llms.txt`, `/agent-service.json`, `/skill.md`, and `/agentic-discovery.md` remain crawlable.

---
## 8) Machine-readable “agentic discovery” summary (easy parsing)

```json
{
  "service": "World Data API",
  "baseUrl": "https://dataref.dev/get-data/",
  "paymentGate": {
    "type": "402 Tempo",
    "executionContractUrl": "https://dataref.dev/skill.md"
  },
  "discovery": {
    "manifest": "https://dataref.dev/agent-service.json",
    "llmsGuide": "https://dataref.dev/llms.txt",
    "skill": "https://dataref.dev/skill.md",
    "soul": "https://dataref.dev/soul.md",
    "agenticDiscovery": "https://dataref.dev/agentic-discovery.md"
  },
  "datasets": [
    "fx",
    "timezones",
    "holidays",
    "vat",
    "locale",
    "bundle"
  ],
  "extraction": {
    "jsonSelector": "#json-preview",
    "rawAttribute": "data-raw",
    "tabSelectors": {
      "fx": ".tab[data-tab='fx']",
      "timezones": ".tab[data-tab='tz']",
      "holidays": ".tab[data-tab='holidays']",
      "vat": ".tab[data-tab='vat']",
      "locale": ".tab[data-tab='locale']",
      "bundle": ".tab[data-tab='bundle']"
    }
  }
}
```