
list_datasets()

Discover all datasets visible to you: public datasets plus any private datasets you’ve been granted access to.
datasets = await client.list_datasets()

for ds in datasets:
    print(f"{ds.name} (ID: {ds.id})")
    print(f"  Price: {ds.price_per_chunk} {ds.currency}/chunk")
    print(f"  Queryable: {ds.queryable}")
    print(f"  Subscribed: {ds.subscribed}")

Dataset Fields

Field            Type               Description
id               str                Dataset UUID
seller_name      str                Name of the seller
name             str                Dataset name
description      str                What the dataset contains
price_per_chunk  str                Price per result, 8 decimal places (e.g. "0.00100000")
visibility       str                "public" or "private"
metadata_schema  list[dict] | None  Available filter fields; each dict has name (str), type (str), and optional description (str)
queryable        bool               True if ready to query (connector wired)
subscribed       bool               True if you have access
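
Because price_per_chunk is a decimal string, it is safer to parse it with Decimal than with float before doing arithmetic. The sketch below is illustrative only: DatasetInfo is a hypothetical stand-in holding a subset of the fields above, usable and estimate_max_cost are hypothetical helpers that are not part of the SDK, and the estimate assumes each dataset bills price_per_chunk for up to top_k returned results, which the page does not confirm.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class DatasetInfo:
    """Hypothetical stand-in with a subset of the dataset fields."""
    id: str
    name: str
    price_per_chunk: str  # decimal string, e.g. "0.00100000"
    queryable: bool
    subscribed: bool


def usable(datasets):
    """Keep only datasets that are both ready to query and accessible."""
    return [ds for ds in datasets if ds.queryable and ds.subscribed]


def estimate_max_cost(datasets, top_k):
    """Upper bound on cost, ASSUMING up to top_k billed results per dataset."""
    # Decimal avoids float rounding on 8-decimal-place price strings.
    return sum(
        (Decimal(ds.price_per_chunk) * top_k for ds in datasets),
        Decimal("0"),
    )
```

Passing the output of usable(...) into estimate_max_cost(...) gives a rough ceiling to compare against your balance before calling query().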

query()

Search across one or more datasets. Provide either text (natural language) or vector (pre-computed embedding).
response = await client.query(
    text="machine learning transformers",
    dataset_ids=["uuid-1", "uuid-2"],
    top_k=10,
)

print(f"Query ID: {response.query_id}")
for result in response.results:
    print(f"[{result.score:.4f}] {result.metadata}")

Pre-computed vector

response = await client.query(
    vector=[0.012, -0.034, 0.056, ...],
    dataset_ids=["uuid-1"],
    top_k=10,
)
When you pass a pre-computed vector, every dataset in the query must use the same embedding model. The server validates that the vector's dimensions match.
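
A client-side check can catch a malformed vector before the request is sent. The sketch below is illustrative only: check_vector and expected_dim are hypothetical names, not part of the SDK, and the caller is assumed to already know the dimension of the embedding model the datasets use. The server-side validation described above still applies.

```python
import math


def check_vector(vector, expected_dim):
    """Return the vector as a list of floats, or raise ValueError if unusable."""
    values = [float(x) for x in vector]
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(values)}")
    # NaN or infinite components cannot produce a meaningful similarity score.
    if not all(math.isfinite(x) for x in values):
        raise ValueError("vector contains NaN or infinite values")
    return values
```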

With metadata filters

response = await client.query(
    text="privacy regulations",
    dataset_ids=["ds-aaa", "ds-bbb"],
    filters={
        "ds-aaa": {"year": {"$gte": 2023}},
        "ds-bbb": {
            "$and": [
                {"category": {"$eq": "legal"}},
                {"status": {"$in": ["published", "reviewed"]}},
            ]
        },
    },
)
See Metadata Filters for the full filter syntax.
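
Since each dataset advertises its filterable fields in metadata_schema, a filter can be checked against that schema before querying. The sketch below is a minimal illustration under stated assumptions: filter_fields and unknown_fields are hypothetical helpers, not SDK functions, and the walker only understands the shapes shown in the example above (field keys mapping to operator dicts, and "$"-prefixed logical operators such as "$and" holding a list of sub-expressions).

```python
def filter_fields(expr):
    """Collect the metadata field names referenced by a filter expression."""
    fields = set()
    for key, value in expr.items():
        if key.startswith("$"):
            # Logical operator (e.g. "$and"): value is a list of sub-expressions.
            for sub in value:
                fields |= filter_fields(sub)
        else:
            # Plain key: a metadata field name mapped to an operator dict.
            fields.add(key)
    return fields


def unknown_fields(expr, metadata_schema):
    """Return referenced fields that are absent from a dataset's metadata_schema."""
    # metadata_schema may be None, in which case no field is known.
    known = {entry["name"] for entry in (metadata_schema or [])}
    return filter_fields(expr) - known
```

An empty result from unknown_fields(...) means every field in the filter appears in the schema; it says nothing about whether the operators or value types are valid.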

Response Fields

Field     Type               Description
query_id  str                UUID for audit trail
results   list[QueryResult]  Ranked results
warnings  list[str]          Non-fatal issues (e.g., a dataset timed out)
Each QueryResult:
Field            Type        Description
dataset_id       str         Which dataset this came from
id               str         Vector ID in the seller’s DB
score            float       Relevance score (higher = better)
metadata         dict        Key-value pairs from the seller’s vector DB
embedding_model  str | None  Only present in multi-model queries
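
When a query spans several datasets, it can help to regroup the flat results list by dataset_id. The following is a minimal sketch, not SDK code: Result is a hypothetical stand-in that mirrors some of the QueryResult fields, and group_by_dataset is a hypothetical helper.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Result:
    """Hypothetical stand-in mirroring some QueryResult fields."""
    dataset_id: str
    id: str
    score: float
    metadata: dict = field(default_factory=dict)


def group_by_dataset(results):
    """Map dataset_id -> that dataset's results, sorted by descending score."""
    grouped = defaultdict(list)
    for r in results:
        grouped[r.dataset_id].append(r)
    return {
        ds_id: sorted(items, key=lambda r: r.score, reverse=True)
        for ds_id, items in grouped.items()
    }
```

With a real response, the same function could be applied to response.results, since it only reads the dataset_id and score attributes.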

Error Handling

from datagate import (
    DatagateClient,
    DatagateError,
    AuthenticationError,
    InsufficientBalanceError,
    ValidationError,
    NotFoundError,
)

try:
    response = await client.query(text="...", dataset_ids=["..."])
except AuthenticationError:
    print("Invalid API key")
except InsufficientBalanceError as e:
    print(f"Balance: {e.balance}, need: {e.estimated_cost}")
except ValidationError as e:
    print(f"Bad request: {e.message}")
except DatagateError as e:
    print(f"API error ({e.status_code}): {e.message}")

Exception                 Status  When
ValidationError           400     Bad input
AuthenticationError       401     Invalid credential
InsufficientBalanceError  402     Balance too low
ForbiddenError            403     Wrong role
NotFoundError             404     Resource not found
ServerError               500     Server error
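
The try/except example catches DatagateError last, which suggests the specific exceptions derive from it. The sketch below models that relationship with local stand-in classes; it does not import from datagate, and the inheritance, the constructor signature, and the error_for_status helper are all assumptions made for illustration. Only the status codes come from the table.

```python
class DatagateError(Exception):
    """Local stand-in for the SDK base error; NOT the real class."""

    def __init__(self, message, status_code):
        super().__init__(message)
        self.message = message
        self.status_code = status_code


# Assumed subclasses, named after the exceptions in the table.
class ValidationError(DatagateError): pass
class AuthenticationError(DatagateError): pass
class InsufficientBalanceError(DatagateError): pass
class ForbiddenError(DatagateError): pass
class NotFoundError(DatagateError): pass
class ServerError(DatagateError): pass


# Status codes as listed in the table.
ERROR_BY_STATUS = {
    400: ValidationError,
    401: AuthenticationError,
    402: InsufficientBalanceError,
    403: ForbiddenError,
    404: NotFoundError,
    500: ServerError,
}


def error_for_status(status_code, message):
    """Build the exception for a status code, falling back to the base class."""
    cls = ERROR_BY_STATUS.get(status_code, DatagateError)
    return cls(message, status_code)
```

Under this assumption, an except DatagateError clause placed first would shadow every more specific handler, which is why the base class belongs at the end of the chain.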