Query

Cubyc's query function allows you to filter, aggregate, and analyze your experiment runs with SQL. You can query your local or remote repositories by joining tables like config, logs, metadata, and comments.

query

query(statement, path=None, branch=None)

Query a repository of runs using SQL.

PARAMETER DESCRIPTION
statement

A valid SQL statement to query against your repository.

TYPE: str

path

The local path or remote URL of the repository to query. Defaults to the local working directory.

TYPE: str DEFAULT: None

branch

The branch to query. Defaults to all branches.

TYPE: str DEFAULT: None

RETURNS DESCRIPTION
DataFrame

A DataFrame containing the result of the query.

RAISES DESCRIPTION
HTTPError

If the server returns an error.

Notes
  • Use PostgreSQL syntax for the query.
  • You can query the following tables:
    • config
    • metadata
    • logs
    • comments
  • All tables have an id column, which holds the commit SHA of the run and can be used to join them.
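As a rough illustration of joining tables on the shared id column, the sketch below runs a join against an in-memory SQLite database standing in for a repository. SQLite is used only so the snippet runs without Cubyc or a repository; it is not how Cubyc stores runs. The lr column and the sample SHAs are made up for illustration, and the logs columns (var, value) are assumed from the example on this page.

```python
import sqlite3

# In-memory SQLite stand-in for a repository (an assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE config (id TEXT, lr REAL);               -- 'lr' is a hypothetical column
    CREATE TABLE logs   (id TEXT, var TEXT, value REAL);  -- columns assumed from the example
    INSERT INTO config VALUES ('a1b2c3', 0.1), ('d4e5f6', 0.01);
    INSERT INTO logs VALUES
        ('a1b2c3', 'accuracy', 0.81),
        ('a1b2c3', 'loss',     0.52),
        ('d4e5f6', 'accuracy', 0.93);
""")

# Join each run's configuration to its logged accuracy via the shared id.
statement = """
    SELECT config.id, config.lr, logs.value AS accuracy
    FROM config
    INNER JOIN logs ON config.id = logs.id
    WHERE logs.var = 'accuracy'
    ORDER BY config.id
"""
rows = conn.execute(statement).fetchall()
for row in rows:
    print(row)
```

A statement of this shape could then be passed to query in place of the stand-in connection, provided the column names match those in your repository.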
Example

To query the maximum accuracy and the corresponding configuration for a project:

from cubyc import query

statement = '''
            SELECT
                config.*,
                logs.value AS accuracy
            FROM
                config
            INNER JOIN
                logs ON config.id = logs.id
            WHERE
                logs.var = 'accuracy'
            ORDER BY
                logs.value DESC
            LIMIT 1
            '''

query(statement=statement, path="https://github.com/owner/project.git")
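A related question is the best accuracy reached by each run rather than the single best overall. The sketch below groups by id to answer it, again using an in-memory SQLite table as a stand-in so it runs without Cubyc. It assumes logs can hold several accuracy rows per run; the sample SHAs and values are invented for illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the logs table (illustrative data, not real runs).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs (id TEXT, var TEXT, value REAL);
    INSERT INTO logs VALUES
        ('a1b2c3', 'accuracy', 0.70),
        ('a1b2c3', 'accuracy', 0.81),
        ('d4e5f6', 'accuracy', 0.93),
        ('d4e5f6', 'loss',     0.20);
""")

# One row per run: every selected column is either grouped or aggregated.
statement = """
    SELECT logs.id, MAX(logs.value) AS best_accuracy
    FROM logs
    WHERE logs.var = 'accuracy'
    GROUP BY logs.id
    ORDER BY best_accuracy DESC
"""
best_per_run = conn.execute(statement).fetchall()
for run_id, best in best_per_run:
    print(run_id, best)
```

Because every non-aggregated column appears in GROUP BY, the same statement is also valid PostgreSQL.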