list_datasets()

Discover all datasets visible to you — public datasets plus any private datasets you’ve been granted access to.
datasets = await client.list_datasets()

for ds in datasets:
    print(f"{ds.name} (ID: {ds.id})")
    print(f"  Price: {ds.price_per_chunk} {ds.currency}/chunk")
    print(f"  Queryable: {ds.queryable}")
    print(f"  Subscribed: {ds.subscribed}")

Dataset Fields

Field             Type                 Description
id                str                  Dataset UUID
seller_name       str                  Name of the seller
name              str                  Dataset name
description       str                  What the dataset contains
price_per_chunk   str                  Price per result, 8 decimal places
visibility        str                  "public" or "private"
metadata_schema   list[dict] | None    Available filter fields
queryable         bool                 True if ready to query (connector wired)
subscribed        bool                 True if you have access
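
The fields above are enough to pick out datasets worth querying and to get a rough idea of spend before calling query(). The sketch below is illustrative only: DatasetStub is a hypothetical stand-in for the objects returned by list_datasets(), and the cost formula assumes each dataset bills price_per_chunk once per returned result with at most top_k results per dataset, which this page does not spell out. Decimal is used because price_per_chunk arrives as a string.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class DatasetStub:
    """Hypothetical stand-in carrying only the fields used below."""
    id: str
    name: str
    price_per_chunk: str  # decimal string, e.g. "0.00250000"
    queryable: bool
    subscribed: bool


def usable_datasets(datasets):
    """Keep datasets that are both ready to query and accessible to you."""
    return [ds for ds in datasets if ds.queryable and ds.subscribed]


def max_cost(datasets, top_k):
    """Upper-bound cost. ASSUMPTION: billing is price_per_chunk per returned
    result, with at most top_k results per dataset."""
    return sum((Decimal(ds.price_per_chunk) * top_k for ds in datasets), Decimal("0"))


catalog = [
    DatasetStub("uuid-1", "papers", "0.00250000", True, True),
    DatasetStub("uuid-2", "patents", "0.01000000", True, False),   # no access
    DatasetStub("uuid-3", "news", "0.00100000", False, True),      # not wired yet
]
ready = usable_datasets(catalog)
print([ds.id for ds in ready])     # ['uuid-1']
print(max_cost(ready, top_k=10))   # 0.02500000
```

With real data, pass the list returned by client.list_datasets() in place of catalog.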

query()

Search across one or more datasets. Provide either text (natural language) or vector (pre-computed embedding).
response = await client.query(
    text="machine learning transformers",
    dataset_ids=["uuid-1", "uuid-2"],
    top_k=10,
)

print(f"Query ID: {response.query_id}")
for result in response.results:
    print(f"[{result.score:.4f}] {result.metadata}")

Pre-computed vector

response = await client.query(
    vector=[0.012, -0.034, 0.056, ...],
    dataset_ids=["uuid-1"],
    top_k=10,
)
When you pass a pre-computed vector, all queried datasets must use the same embedding model. The server validates that the vector's dimensions match.
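
Because query() takes either text or vector, it can help to catch an ambiguous call before it reaches the server. The helper below is a minimal sketch, not part of the SDK: build_query_args is a hypothetical name, and it assumes "either" means exactly one of the two and that an empty dataset_ids list is invalid. It only assembles keyword arguments; dimension checks still happen server-side.

```python
import math


def build_query_args(dataset_ids, text=None, vector=None, top_k=10):
    """Return kwargs for a query call, enforcing the text-or-vector rule locally."""
    # ASSUMPTION: exactly one of text / vector must be given.
    if (text is None) == (vector is None):
        raise ValueError("provide exactly one of text or vector")
    if not dataset_ids:
        raise ValueError("dataset_ids must not be empty")

    args = {"dataset_ids": list(dataset_ids), "top_k": top_k}
    if text is not None:
        args["text"] = text
    else:
        # Reject empty vectors and NaN/inf components early.
        if not vector or not all(math.isfinite(x) for x in vector):
            raise ValueError("vector must be a non-empty list of finite numbers")
        args["vector"] = [float(x) for x in vector]
    return args


print(build_query_args(["uuid-1"], text="machine learning transformers"))
```

The result can then be unpacked into the call, for example `await client.query(**args)`.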

With metadata filters

response = await client.query(
    text="privacy regulations",
    dataset_ids=["ds-aaa", "ds-bbb"],
    filters={
        "ds-aaa": {"year": {"$gte": 2023}},
        "ds-bbb": {
            "$and": [
                {"category": {"$eq": "legal"}},
                {"status": {"$in": ["published", "reviewed"]}},
            ]
        },
    },
)
See Metadata Filters for the full filter syntax.
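
To make the operators in the example above concrete, here is a small local evaluator for $eq, $gte, $in and $and. It is a sketch under the assumption that these operators behave like their MongoDB namesakes; the real filtering happens on the server, and the Metadata Filters page remains the authority on syntax and semantics. The function name matches is hypothetical.

```python
def matches(metadata, flt):
    """Return True if a metadata dict satisfies a filter expression.

    ASSUMPTION: MongoDB-like semantics; only the operators shown on this
    page are covered.
    """
    for key, cond in flt.items():
        if key == "$and":
            # Every sub-filter in the list must hold.
            if not all(matches(metadata, sub) for sub in cond):
                return False
            continue
        value = metadata.get(key)
        for op, operand in cond.items():
            if op == "$eq":
                ok = value == operand
            elif op == "$gte":
                ok = value is not None and value >= operand
            elif op == "$in":
                ok = value in operand
            else:
                raise ValueError(f"operator not covered by this sketch: {op}")
            if not ok:
                return False
    return True


ds_bbb_filter = {
    "$and": [
        {"category": {"$eq": "legal"}},
        {"status": {"$in": ["published", "reviewed"]}},
    ]
}
print(matches({"category": "legal", "status": "reviewed"}, ds_bbb_filter))  # True
print(matches({"category": "legal", "status": "draft"}, ds_bbb_filter))     # False
```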

Response Fields

Field      Type                 Description
query_id   str                  UUID for audit trail
results    list[QueryResult]    Ranked results
warnings   list[str]            Non-fatal issues (e.g., a dataset timed out)

Each QueryResult:

Field             Type          Description
dataset_id        str           Which dataset this came from
id                str           Vector ID in the seller’s DB
score             float         Relevance score (higher = better)
metadata          dict          Key-value pairs from the seller’s vector DB
embedding_model   str | None    Only present in multi-model queries
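
When a query spans several datasets, the dataset_id on each result lets you regroup the ranked list. The sketch below picks the highest-scoring hit per dataset; ResultStub is a hypothetical stand-in for QueryResult with only the fields it needs, and the example scores are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResultStub:
    """Hypothetical stand-in for QueryResult."""
    dataset_id: str
    id: str
    score: float
    metadata: dict = field(default_factory=dict)


def best_per_dataset(results):
    """Map each dataset_id to its highest-scoring result."""
    best = {}
    for r in results:
        current = best.get(r.dataset_id)
        if current is None or r.score > current.score:
            best[r.dataset_id] = r
    return best


results = [
    ResultStub("uuid-1", "v1", 0.91),
    ResultStub("uuid-2", "v7", 0.88),
    ResultStub("uuid-1", "v3", 0.75),
]
for dataset_id, hit in best_per_dataset(results).items():
    print(f"{dataset_id}: {hit.id} [{hit.score:.4f}]")
```

It is also worth checking response.warnings after each call, since a dataset that timed out will be missing from the grouping rather than raising an exception.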

Error Handling

from datagate import (
    DatagateClient,
    DatagateError,
    AuthenticationError,
    InsufficientBalanceError,
    ValidationError,
    NotFoundError,
)

try:
    response = await client.query(text="...", dataset_ids=["..."])
except AuthenticationError:
    print("Invalid API key")
except InsufficientBalanceError as e:
    print(f"Balance: {e.balance}, need: {e.estimated_cost}")
except ValidationError as e:
    print(f"Bad request: {e.message}")
except DatagateError as e:
    print(f"API error ({e.status_code}): {e.message}")

Exception                  Status   When
ValidationError            400      Bad input
AuthenticationError        401      Invalid credential
InsufficientBalanceError   402      Balance too low
ForbiddenError             403      Wrong role
NotFoundError              404      Resource not found
ServerError                500      Server error
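
Of the errors in the table above, only the 500 case is plausibly transient; the 4xx errors will fail again until the request, credential, or balance changes. One possible pattern is to retry server errors with exponential backoff. This is a sketch, not SDK behaviour: with_retries is a hypothetical helper, and the two exception classes are local stand-ins (including the assumption that ServerError derives from DatagateError) so the example runs without the datagate package. In real code, import them from datagate instead.

```python
import asyncio


class DatagateError(Exception):
    """Stand-in for the SDK base error; use the real class from datagate."""
    def __init__(self, message, status_code=None):
        super().__init__(message)
        self.message = message
        self.status_code = status_code


class ServerError(DatagateError):
    """Stand-in for the 500 error. ASSUMPTION: it subclasses DatagateError."""


async def with_retries(call, attempts=3, base_delay=0.5):
    """Await call(); retry only on ServerError, doubling the delay each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await call()
        except ServerError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)


# Demo with a fake call that fails twice, then succeeds.
calls = {"n": 0}

async def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServerError("temporary failure", status_code=500)
    return "ok"

result = asyncio.run(with_retries(flaky, attempts=3, base_delay=0.01))
print(result, calls["n"])
```

With the real client, the callable could be `lambda: client.query(text="...", dataset_ids=["..."])`.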