Python API

Data Model

The Python API client is primarily a GraphQL client that interacts with our GraphQL API endpoint. The data models of the Python API client and the GraphQL API are identical.

If you prefer to query our API endpoint directly, it’s available at https://graphql.cryoetdataportal.cziscience.com/v1/graphql
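For readers who want to call the endpoint without the Python client, here is a minimal sketch that uses only the standard library. The query text (a datasets root field with id and title, and a limit argument) is an assumption about the schema and should be checked against the live schema; build_request and run_query are hypothetical helper names.

```python
import json
import urllib.request

# Endpoint quoted in this section.
API_URL = "https://graphql.cryoetdataportal.cziscience.com/v1/graphql"

# Assumed query shape; verify field and argument names against the schema.
QUERY = "query { datasets(limit: 5) { id title } }"


def build_request(query: str, url: str = API_URL) -> urllib.request.Request:
    """Package a GraphQL query as an HTTP POST request with a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, url: str = API_URL) -> dict:
    """Send the query and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(query, url)) as response:
        return json.load(response)
```

Calling run_query(QUERY) performs the actual network request; build_request alone can be used to inspect the payload offline.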

A simplified diagram of the graph data model is below:

Simplified data model

Client

class cryoet_data_portal.Client(url: str | None = None)

A GraphQL Client library that can traverse all the metadata in the CryoET Data Portal

Parameters:

url (Optional[str]) – The API URL to connect to. Defaults to “https://graphql.cryoetdataportal.cziscience.com/v1/graphql”.

Returns:

A GraphQL API Client library

Examples

Generate a client that connects to the default GraphQL API:

>>> client = cryoet_data_portal.Client()

Dataset

class cryoet_data_portal.Dataset(client: Client, **kwargs)

Metadata for a dataset in the CryoET Data Portal

id

An identifier for a CryoET dataset, assigned by the Data Portal. It is also used as the dataset’s directory name in the data tree.

Type:

int

authors

An array relationship with DatasetAuthor

Type:

List[DatasetAuthor]

cell_component_id

If the dataset focuses on a specific part of a cell, the subset is included here

Type:

str

cell_component_name

Name of the cellular component

Type:

str

cell_name

Name of the cell from which a biological sample used in a CryoET study is derived.

Type:

str

cell_strain_id

Link to more information about the cell strain

Type:

str

cell_strain_name

Cell line or strain for the sample.

Type:

str

cell_type_id

Cell Ontology identifier for the cell type

Type:

str

dataset_citations

DOIs for publications that cite the dataset. Use a comma to separate multiple DOIs.

Type:

str

dataset_publications

DOIs for publications that describe the dataset. Use a comma to separate multiple DOIs.

Type:

str
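Because dataset_citations and dataset_publications are single comma-separated strings, callers usually need to split them before use. The helper below is a small sketch; split_identifiers is a hypothetical name and the DOI values are made up for illustration.

```python
def split_identifiers(value):
    """Split a comma-separated string into a list of trimmed, non-empty items."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


# Hypothetical value in the comma-separated format described above.
publications = "10.1000/example.1, 10.1000/example.2"
dois = split_identifiers(publications)
```

The same helper should work for any other field documented here as comma separated, such as related_database_entries.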

deposition_date

Date when a dataset is initially received by the Data Portal.

Type:

datetime.date

description

A short description of a CryoET dataset, similar to an abstract for a journal article or dataset.

Type:

str

funding_sources

An array relationship with DatasetFunding

Type:

List[cryoet_data_portal._models.DatasetFunding]

grid_preparation

Describe Cryo-ET grid preparation.

Type:

str

https_prefix

The HTTPS directory path where this dataset is contained

Type:

str

key_photo_thumbnail_url

URL for the thumbnail of preview image.

Type:

str

key_photo_url

URL for the dataset preview image.

Type:

str

last_modified_date

Date when a released dataset is last modified.

Type:

date

organism_name

Name of the organism from which a biological sample used in a CryoET study is derived, e.g. Homo sapiens

Type:

str

organism_taxid

NCBI taxonomy identifier for the organism, e.g. 9606

Type:

str

other_setup

Describe other setup not covered by sample preparation or grid preparation that may make this dataset unique in the same publication

Type:

str

related_database_entries

If a CryoET dataset is also deposited into another database, enter the database identifier here (e.g. EMPIAR-11445). Use a comma to separate multiple identifiers.

Type:

str

related_database_links

If a CryoET dataset is also deposited into another database, e.g. EMPIAR, enter a link to the database entry here (e.g. https://www.ebi.ac.uk/empiar/EMPIAR-12345/). Use a comma to separate multiple links.

Type:

str

release_date

Date when a dataset is made available on the Data Portal.

Type:

date

runs

An array relationship with Run

Type:

List[cryoet_data_portal._models.Run]

s3_prefix

The S3 public bucket path where this dataset is contained

Type:

str
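When passing s3_prefix to an S3 client, the bucket name and key prefix are typically needed separately. The sketch below assumes the value has the usual s3://bucket/key form, which this section does not state explicitly; split_s3_uri is a hypothetical helper and the URI is made up.

```python
from urllib.parse import urlsplit


def split_s3_uri(uri):
    """Return (bucket, key_prefix) for an s3://bucket/key style URI."""
    parts = urlsplit(uri)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parts.netloc, parts.path.lstrip("/")


# Hypothetical prefix; real values come from Dataset.s3_prefix.
bucket, prefix = split_s3_uri("s3://example-bucket/10000/")
```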

sample_preparation

Describe how the sample was prepared.

Type:

str

sample_type

Type of samples used in a CryoET study. (cell, tissue, organism, intact organelle, in-vitro mixture, in-silico synthetic data, other)

Type:

str

tissue_id

UBERON identifier for the tissue

Type:

str

tissue_name

Name of the tissue from which a biological sample used in a CryoET study is derived.

Type:

str

title

Title of a CryoET dataset

Type:

str

download_everything(dest_path: str | None = None)

Download all of the data for this dataset.

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.tomogram_voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.tomogram_voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.tomogram_voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field
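One way to picture how such expressions can be built is operator overloading. The sketch below is a toy illustration, not the library's implementation: Field is a hypothetical class, and the _eq, _gt, _ilike and _in keys assume a Hasura-style GraphQL filter syntax.

```python
class Field:
    """Toy stand-in for a model field; records the attribute path to itself."""

    def __init__(self, *path):
        self.path = path

    def __getattr__(self, name):
        # Only called for attributes not found normally: extend the path.
        if name.startswith("__"):
            raise AttributeError(name)
        return Field(*self.path, name)

    def _expr(self, operator, value):
        # Wrap {operator: value} in one dict per path segment, innermost first.
        clause = {operator: value}
        for segment in reversed(self.path):
            clause = {segment: clause}
        return clause

    def __eq__(self, value):
        return self._expr("_eq", value)

    def __gt__(self, value):
        return self._expr("_gt", value)

    def ilike(self, pattern):
        return self._expr("_ilike", pattern)

    def _in(self, values):
        return self._expr("_in", list(values))


# Hypothetical root object playing the role of a model class such as Run.
Run = Field()
where = Run.dataset.id == 10000
```

Each attribute access extends the path, and the final operator turns the path into a nested filter clause, which is why arbitrarily deep paths can be written as ordinary Python expressions.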

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() -> Dict[str, Any]

Return a dictionary representation of this object’s attributes
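A common use of to_dict() is exporting metadata as JSON. The sketch below assumes the returned dictionary may contain datetime.date values (for example deposition_date), which json.dumps cannot encode by default; to_json is a hypothetical helper and the record is a hand-built stand-in rather than real portal data.

```python
import datetime
import json


def to_json(record):
    """Serialize a dict to JSON, writing date values as ISO 8601 strings."""

    def encode(value):
        if isinstance(value, (datetime.date, datetime.datetime)):
            return value.isoformat()
        raise TypeError(f"cannot serialize {type(value).__name__}")

    return json.dumps(record, default=encode, sort_keys=True)


# Hand-built stand-in for the result of Dataset.to_dict(); values are made up.
record = {"id": 10000, "title": "Example", "deposition_date": datetime.date(2023, 1, 2)}
text = to_json(record)
```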

DatasetAuthor

class cryoet_data_portal.DatasetAuthor(client: Client, **kwargs)

Metadata for authors of a dataset

id

A numeric identifier for this author

Type:

int

affiliation_address

Address of the institution an author is affiliated with.

Type:

str

affiliation_identifier

A unique identifier assigned to the affiliated institution by The Research Organization Registry (ROR).

Type:

str

affiliation_name

Name of the institutions an author is affiliated with, comma separated.

Type:

str

author_list_order

The order in which the author appears in the publication

Type:

int
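Since a relationship list is not documented as being sorted, author_list_order can be used to restore publication order. The sketch below uses a hypothetical AuthorRecord stand-in that carries only the two attributes it needs; the names are made up.

```python
from dataclasses import dataclass


@dataclass
class AuthorRecord:
    """Stand-in with the two DatasetAuthor attributes this sketch needs."""

    name: str
    author_list_order: int


def in_publication_order(authors):
    """Return authors sorted by their position in the author list."""
    return sorted(authors, key=lambda author: author.author_list_order)


authors = [AuthorRecord("Jane Doe", 2), AuthorRecord("John Roe", 1)]
ordered_names = [author.name for author in in_publication_order(authors)]
```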

corresponding_author_status

Indicating whether an author is the corresponding author

Type:

bool

dataset

An object relationship with the dataset this author corresponds to

Type:

Dataset

dataset_id

Numeric identifier for the dataset this author corresponds to

Type:

int

email

Email address for this author

Type:

str

name

Full name of a dataset author (e.g. Jane Doe).

Type:

str

orcid

A unique, persistent identifier for researchers, provided by ORCID.

Type:

str

primary_author_status

Indicating whether an annotator is the main person executing the annotation, especially on manual annotation

Type:

bool

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.tomogram_voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.tomogram_voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.tomogram_voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() -> Dict[str, Any]

Return a dictionary representation of this object’s attributes

DatasetFunding

class cryoet_data_portal.DatasetFunding(client: Client, **kwargs)

Metadata for a dataset’s funding sources

id

A numeric identifier for this funding record

Type:

int

dataset

An object relationship with the dataset this funding source corresponds to

Type:

Dataset

dataset_id

Numeric identifier for the dataset this funding source corresponds to

Type:

int

funding_agency_name

Name of the funding agency.

Type:

str

grant_id

Grant identifier provided by the funding agency.

Type:

str

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.tomogram_voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.tomogram_voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.tomogram_voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() -> Dict[str, Any]

Return a dictionary representation of this object’s attributes

Run

class cryoet_data_portal.Run(client: Client, **kwargs)

Metadata for an experiment run

id

Numeric identifier (May change!)

Type:

int

dataset

An object relationship with the dataset this run is a part of

Type:

Dataset

dataset_id

Reference to the dataset this run is a part of

Type:

int

https_prefix

The HTTPS directory path where this run is contained

Type:

str

name

Short name for the experiment run

Type:

str

s3_prefix

The S3 public bucket path where this run is contained

Type:

str

tiltseries

An array relationship with TiltSeries that correspond to this run

Type:

list[TiltSeries]

tomogram_voxel_spacings

An array relationship with the Tomogram Voxel Spacings created from this run

Type:

list[TomogramVoxelSpacing]

download_everything(dest_path: str | None = None)

Download all of the data for this run.

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.tomogram_voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.tomogram_voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.tomogram_voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() -> Dict[str, Any]

Return a dictionary representation of this object’s attributes

TomogramVoxelSpacing

class cryoet_data_portal.TomogramVoxelSpacing(client: Client, **kwargs)

Metadata for a set of tomograms and annotations of a given voxel spacing

id

Numeric identifier (May change!)

Type:

int

annotations

An array relationship with the annotations associated with this voxel spacing

Type:

list[Annotation]

https_prefix

The HTTPS directory path where this tomogram voxel spacing is contained

Type:

str

run

An object relationship with the run this voxel spacing is a part of

Type:

Run

run_id

Reference to the run this voxel spacing is a part of

Type:

int

s3_prefix

The S3 public bucket path where this tomogram voxel spacing is contained

Type:

str

tomograms

An array relationship with Tomograms of this voxel spacing

Type:

list[Tomogram]

voxel_spacing

The voxel spacing for the tomograms in this set

Type:

float

download_everything(dest_path: str | None = None)

Download all of the data for this tomogram voxel spacing.

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.tomogram_voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.tomogram_voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.tomogram_voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() -> Dict[str, Any]

Return a dictionary representation of this object’s attributes

Tomogram

class cryoet_data_portal.Tomogram(client: Client, **kwargs)

Metadata for a tomogram

id

Numeric identifier for this tomogram (this may change!)

Type:

int

affine_transformation_matrix

The flip or rotation transformation of this author-submitted tomogram is indicated here

Type:

str

ctf_corrected

Whether this tomogram is CTF corrected

Type:

bool

fiducial_alignment_status

Fiducial alignment status: True = aligned with fiducial, False = aligned without fiducial

Type:

str

deposition_id

If the tomogram is part of a deposition, the related deposition’s id

Type:

int

https_mrc_scale0

HTTPS path to this tomogram in MRC format (no scaling)

Type:

str

https_omezarr_dir

HTTPS path to this tomogram in multiscale OME-Zarr format

Type:

str

is_canonical

Is this tomogram considered the canonical tomogram for the run experiment? True=Yes

Type:

bool

key_photo_thumbnail_url

URL for the thumbnail of key photo

Type:

str

key_photo_url

URL for the key photo

Type:

str

name

Short name for this tomogram

Type:

str

neuroglancer_config

The compact JSON of the Neuroglancer config

Type:

str

offset_x

x offset data relative to the canonical tomogram in pixels

Type:

int

offset_y

y offset data relative to the canonical tomogram in pixels

Type:

int

offset_z

z offset data relative to the canonical tomogram in pixels

Type:

int

processing

Describe additional processing used to derive the tomogram

Type:

str

processing_software

Processing software used to derive the tomogram

Type:

str

reconstruction_method

Describe reconstruction method (Weighted back-projection, SART, SIRT)

Type:

str

reconstruction_software

Name of software used for reconstruction

Type:

str

s3_mrc_scale0

S3 path to this tomogram in MRC format (no scaling)

Type:

str

s3_omezarr_dir

S3 path to this tomogram in multiscale OME-Zarr format

Type:

str

scale0_dimensions

Comma-separated x,y,z dimensions of the unscaled tomogram

Type:

str

scale1_dimensions

Comma-separated x,y,z dimensions of the scale1 tomogram

Type:

str

scale2_dimensions

Comma-separated x,y,z dimensions of the scale2 tomogram

Type:

str
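The scale0_dimensions, scale1_dimensions and scale2_dimensions fields are strings, so they need parsing before they can be used as array shapes. The helper below is a sketch; parse_dimensions is a hypothetical name and the sample value is made up.

```python
def parse_dimensions(value):
    """Parse a comma-separated 'x,y,z' string into a tuple of three ints."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated values, got {value!r}")
    x, y, z = (int(part) for part in parts)
    return x, y, z


# Hypothetical value in the documented format.
dims = parse_dimensions("960,928,500")
```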

size_x

Number of pixels in the 3D data fast axis

Type:

int

size_y

Number of pixels in the 3D data medium axis

Type:

int

size_z

Number of pixels in the 3D data slow axis. This is the image projection direction at zero stage tilt

Type:

int

tomogram_version

Version of tomogram using the same software and post-processing. This will be presented as the latest version

Type:

str

tomogram_voxel_spacing

An object relationship with a specific voxel spacing for this experiment run

Type:

TomogramVoxelSpacing

type

Tomogram purpose (e.g. CANONICAL)

Type:

str

voxel_spacing

Voxel spacing equal in all three axes in angstroms

Type:

float
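Because size_x, size_y and size_z are voxel counts and voxel_spacing is the edge length of a voxel in angstroms, the physical extent of a tomogram follows by multiplication. The sketch below uses made-up values; physical_extent is a hypothetical helper.

```python
def physical_extent(size_x, size_y, size_z, voxel_spacing):
    """Return the tomogram extent along x, y, z in angstroms."""
    return (size_x * voxel_spacing, size_y * voxel_spacing, size_z * voxel_spacing)


# Made-up values: a 960 x 928 x 500 volume at 10.0 angstroms per voxel.
extent = physical_extent(960, 928, 500, 10.0)
```

Dividing each component by 10 converts the result to nanometres.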

download_all_annotations(dest_path: str | None = None, format: str | None = None, shape: str | None = None)

Download all annotation files for this tomogram

Parameters:
  • dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

  • shape (Optional[str], optional) – Choose a specific shape type to download (e.g.: OrientedPoint, SegmentationMask)

  • format (Optional[str], optional) – Choose a specific file format to download (e.g.: mrc, ndjson)

download_mrcfile(dest_path: str | None = None)

Download an MRC file of this tomogram

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

download_omezarr(dest_path: str | None = None)

Download the OME-Zarr version of this tomogram

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.tomogram_voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.tomogram_voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.tomogram_voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() -> Dict[str, Any]

Return a dictionary representation of this object’s attributes

TomogramAuthor

class cryoet_data_portal.TomogramAuthor(client: Client, **kwargs)

Metadata for a tomogram’s authors

id

Numeric identifier for this tomogram’s author (this may change!)

Type:

int

affiliation_address

Address of the institution an author is affiliated with.

Type:

str

affiliation_identifier

A unique identifier assigned to the affiliated institution by The Research Organization Registry (ROR).

Type:

str

affiliation_name

Name of the institution an author is affiliated with. Sometimes, one author may have multiple affiliations.

Type:

str

author_list_order

The order in which the author appears in the publication

Type:

int

tomogram

An object relationship with the Tomogram this author is a part of

Type:

Tomogram

tomogram_id

Reference to the tomogram this author contributed to

Type:

int

corresponding_author_status

Indicating whether an author is the corresponding author

Type:

bool

email

Email address for this author

Type:

str

name

Full name of an author (e.g. Jane Doe).

Type:

str

orcid

A unique, persistent identifier for researchers, provided by ORCID.

Type:

str

primary_author_status

Indicating whether an annotator is the main person executing the annotation, especially on manual annotation

Type:

bool

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.tomogram_voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.tomogram_voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.tomogram_voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() -> Dict[str, Any]

Return a dictionary representation of this object’s attributes

Annotation

class cryoet_data_portal.Annotation(client: Client, **kwargs)

Metadata for an annotation

id

Numeric identifier (May change!)

Type:

int

annotation_method

Describe how the annotation is made (e.g. Manual, crYoLO, Positive Unlabeled Learning, template matching)

Type:

str

annotation_publication

DOIs for publications that describe the dataset. Use a comma to separate multiple DOIs.

Type:

str

annotation_software

Software used for generating this annotation

Type:

str

authors

An array relationship with the authors of this annotation

Type:

list[Author]

confidence_precision

Describe the confidence level of the annotation. Precision is defined as the % of annotation objects being true positive

Type:

float

confidence_recall

Describe the confidence level of the annotation. Recall is defined as the % of true positives being annotated correctly

Type:

float
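The two confidence fields follow the standard definitions: precision is the share of annotated objects that are true positives, and recall is the share of true objects that were annotated. The sketch below computes both as percentages from made-up counts; precision_recall is a hypothetical helper, and whether the portal stores these values as percentages or fractions should be checked against real records.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) as percentages."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = 100.0 * true_positives / predicted if predicted else 0.0
    recall = 100.0 * true_positives / actual if actual else 0.0
    return precision, recall


# Made-up counts: 80 correct picks, 20 spurious picks, 20 missed objects.
precision, recall = precision_recall(80, 20, 20)
```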

deposition_date

Date when an annotation set is initially received by the Data Portal.

Type:

date

deposition_id

If the annotation is part of a deposition, the related deposition’s id

Type:

int

files

An array relationship with the files of this annotation

Type:

list[AnnotationFile]

ground_truth_status

Whether an annotation is considered ground truth, as determined by the annotator.

Type:

bool

ground_truth_used

Annotation filename used as ground truth for precision and recall

Type:

str

https_metadata_path

HTTPS path for the metadata JSON file for this annotation

Type:

str

last_modified_date

Date when an annotation was last modified in the Data Portal

Type:

date

method_type

The method type for generating the annotation (e.g. manual, hybrid, automated)

Type:

str

object_count

Number of objects identified

Type:

int

object_description

A textual description of the annotation object. This can be a longer description that includes information not covered by the annotation object name and state.

Type:

str

object_id

Gene Ontology Cellular Component identifier for the annotation object

Type:

str

object_name

Name of the object being annotated (e.g. ribosome, nuclear pore complex, actin filament, membrane)

Type:

str

object_state

Molecule state annotated (e.g. open, closed)

Type:

str

release_date

Date when annotation data is made public by the Data Portal.

Type:

date

tomogram_voxel_spacing

An object relationship with a specific voxel spacing for this annotation

Type:

TomogramVoxelSpacing

tomogram_voxel_spacing_id

Reference to the tomogram voxel spacing group this annotation applies to

Type:

int

s3_metadata_path

S3 path for the metadata JSON file for this annotation

Type:

str

download(dest_path: str | None = None, format: str | None = None, shape: str | None = None)

Download annotation files for a given format and/or shape

Parameters:
  • dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

  • shape (Optional[str], optional) – Choose a specific shape type to download (e.g.: OrientedPoint, SegmentationMask)

  • format (Optional[str], optional) – Choose a specific file format to download (e.g.: mrc, ndjson)

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.tomogram_voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)
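The operator semantics described for find can be illustrated with a small local emulation. This is only a sketch of the documented behavior (% as a wildcard, ilike ignoring case, _in as list membership, filters combined with AND); the real matching happens in the GraphQL API, and the function names here are hypothetical. Only the % wildcard mentioned in the reference is handled.

```python
import re


def _like_regex(pattern, ignore_case=False):
    # Split on the % wildcard and treat everything else as literal text.
    literal_parts = [re.escape(part) for part in pattern.split("%")]
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.compile(".*".join(literal_parts), flags)


def like(value, pattern):
    """Case-sensitive match where % stands for any run of characters."""
    return _like_regex(pattern).fullmatch(value) is not None


def ilike(value, pattern):
    """Same as like(), but ignoring case."""
    return _like_regex(pattern, ignore_case=True).fullmatch(value) is not None


def in_(value, accepted):
    """True when value is one of the accepted values."""
    return value in accepted


def matches_all(record, predicates):
    """Filters are combined with AND: every predicate must hold."""
    return all(predicate(record) for predicate in predicates)
```

For example, like("my_RUN1_x", "%RUN1%") holds, while like("my_run1_x", "%RUN1%") does not; ilike accepts both.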

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() → Dict[str, Any]

Return a dictionary representation of this object’s attributes
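One possible use of to_dict is exporting metadata as JSON. The sketch below assumes the dictionary may contain date values (release_date is documented as a date), which the json module cannot serialize on its own; whether to_dict also includes related objects is not stated in this reference, so anything else unexpected falls back to repr(). The helper name model_to_json is hypothetical.

```python
import datetime
import json


def model_to_json(model, indent=2):
    """Serialize the output of model.to_dict() to a JSON string."""

    def _fallback(value):
        # Dates are not JSON-serializable by default; emit ISO 8601 strings.
        if isinstance(value, (datetime.date, datetime.datetime)):
            return value.isoformat()
        # Assumption: any other non-serializable value is rendered with repr().
        return repr(value)

    return json.dumps(model.to_dict(), default=_fallback, indent=indent, sort_keys=True)
```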

AnnotationFile

class cryoet_data_portal.AnnotationFile(client: Client, **kwargs)

Metadata for an annotation file

id

Numeric identifier for this annotation file (this may change!)

Type:

int

format

File format for this file

Type:

str

https_path

HTTPS url for the annotation file

Type:

str

is_visualization_default

Whether this annotation shape is displayed by default in visualization tools

Type:

bool

s3_path

S3 path for the annotation file

Type:

str

shape_type

Describes whether this is a Point, OrientedPoint, or SegmentationMask file

Type:

str

annotation_id

Reference to the annotation this file applies to

Type:

int

annotation

The annotation this file is a part of

Type:

Annotation
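The shape_type, format, and is_visualization_default attributes can be used to pick files client-side once they have been fetched. The following is a sketch with hypothetical helper names; it works on any iterable of objects exposing those three attributes, for instance the results of AnnotationFile.find(client, ...).

```python
def select_files(files, shape_type=None, fmt=None):
    """Return the files matching the given shape_type and/or format."""
    selected = []
    for f in files:
        if shape_type is not None and f.shape_type != shape_type:
            continue
        if fmt is not None and f.format != fmt:
            continue
        selected.append(f)
    return selected


def preferred_file(files):
    """Prefer the file flagged as visualization default, else the first one."""
    files = list(files)
    for f in files:
        if f.is_visualization_default:
            return f
    return files[0] if files else None
```

Filtering on the server with query_filters is usually preferable when only a subset is needed; the client-side version is handy when the same result set is inspected several ways.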

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all of the filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)
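The reference says that find yields its matches, which suggests the result is an iterator that can be consumed only once. Assuming that is the case, the hypothetical helpers below show two common ways to consume it: taking the first match (or None) and taking at most n matches.

```python
from itertools import islice


def first_or_none(results):
    """Return the first item of an iterable, or None if it is empty."""
    return next(iter(results), None)


def take(results, n):
    """Return at most n items from an iterable as a list."""
    return list(islice(results, n))
```

If the results need to be traversed more than once, wrapping the call in list(...) materializes them up front.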

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() → Dict[str, Any]

Return a dictionary representation of this object’s attributes

AnnotationAuthor

class cryoet_data_portal.AnnotationAuthor(client: Client, **kwargs)

Metadata for an annotation’s authors

id

Numeric identifier for this annotation author (this may change!)

Type:

int

affiliation_address

Address of the institution an annotator is affiliated with.

Type:

str

affiliation_identifier

A unique identifier assigned to the affiliated institution by The Research Organization Registry (ROR).

Type:

str

affiliation_name

Name of the institution an annotator is affiliated with. Sometimes, one annotator may have multiple affiliations.

Type:

str

annotation

An object relationship with the annotation this author is a part of

Type:

Annotation

annotation_id

Reference to the annotation this author contributed to

Type:

int

corresponding_author_status

Indicates whether an annotator is the corresponding author

Type:

bool

email

Email address for this author

Type:

str

name

Full name of an annotation author (e.g. Jane Doe).

Type:

str

orcid

A unique, persistent identifier for researchers, provided by ORCID.

Type:

str

primary_annotator_status

Indicates whether an annotator is the main person carrying out the annotation, especially for manual annotation

Type:

bool
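As an illustration of how the two status flags might be used together, the sketch below builds a display string from a collection of author objects. The ordering and the asterisk marker are a hypothetical presentation convention, not something the Data Portal prescribes; only the attribute names come from the reference above.

```python
def format_author_list(authors):
    """List primary annotators first and mark corresponding authors with '*'."""
    # sorted() is stable, so authors keep their incoming order within each group.
    ordered = sorted(authors, key=lambda a: not a.primary_annotator_status)
    names = []
    for author in ordered:
        marker = "*" if author.corresponding_author_status else ""
        names.append(f"{author.name}{marker}")
    return ", ".join(names)
```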

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all of the filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() → Dict[str, Any]

Return a dictionary representation of this object’s attributes
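Because get_by_id returns None when nothing matches, code that immediately accesses an attribute (as in print(run.name)) can fail with an AttributeError. A small wrapper, sketched here under the hypothetical name get_or_raise, turns the missing case into an explicit error; it relies only on the documented get_by_id(client, id) signature.

```python
def get_or_raise(model_cls, client, object_id):
    """Fetch an object by primary key, raising LookupError if it does not exist."""
    obj = model_cls.get_by_id(client, object_id)
    if obj is None:
        raise LookupError(f"No {model_cls.__name__} found with id {object_id}")
    return obj
```

With the real library this could be called as get_or_raise(AnnotationAuthor, client, some_id), where some_id is a placeholder for an identifier obtained elsewhere.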

TiltSeries

class cryoet_data_portal.TiltSeries(client: Client, **kwargs)

Metadata about how a tilt series was generated, and locations of output files

id

Numeric identifier for this tilt series (this may change!)

Type:

int

run

An object relationship with the run this tiltseries is a part of

Type:

Run

run_id

Reference to the run this tiltseries is a part of

Type:

int

acceleration_voltage

Accelerating voltage of the electron microscope, in volts

Type:

int

aligned_tiltseries_binning

Binning factor of the aligned tilt series

Type:

int

binning_from_frames

Describes the binning factor from frames to tilt series file

Type:

float

camera_manufacturer

Name of the camera manufacturer

Type:

str

camera_model

Camera model name

Type:

str

data_acquisition_software

Software used to collect data

Type:

str

frames_count

Number of frames associated with this tiltseries

Type:

int

https_alignment_file

HTTPS path to the alignment file for this tiltseries

Type:

str

https_angle_list

HTTPS path to the angle list file for this tiltseries

Type:

str

https_collection_metadata

HTTPS path to the collection metadata file for this tiltseries

Type:

str

https_mrc_bin1

HTTPS path to this tiltseries in MRC format (no scaling)

Type:

str

https_omezarr_dir

HTTPS path to this tiltseries in multiscale OME-Zarr format

Type:

str

microscope_additional_info

Other microscope optical setup information, in addition to energy filter, phase plate and image corrector

Type:

str

microscope_energy_filter

Energy filter setup used

Type:

str

microscope_image_corrector

Image corrector setup

Type:

str

microscope_manufacturer

Name of the microscope manufacturer

Type:

str

microscope_model

Microscope model name

Type:

str

microscope_phase_plate

Phase plate configuration

Type:

str

pixel_spacing

Pixel spacing for the tilt series

Type:

float

related_empiar_entry

EMPIAR dataset identifier, if the tilt series has been deposited into EMPIAR

Type:

str

s3_alignment_file

S3 path to the alignment file for this tiltseries

Type:

str

s3_angle_list

S3 path to the angle list file for this tiltseries

Type:

str

s3_collection_metadata

S3 path to the collection metadata file for this tiltseries

Type:

str

s3_mrc_bin1

S3 path to this tiltseries in MRC format (no scaling)

Type:

str

s3_omezarr_dir

S3 path to this tiltseries in multiscale OME-Zarr format

Type:

str

spherical_aberration_constant

Spherical Aberration Constant of the objective lens in millimeters

Type:

float

tilt_axis

Rotation angle in degrees

Type:

float

tilt_max

Maximal tilt angle in degrees

Type:

float

tilt_min

Minimal tilt angle in degrees

Type:

float

tilt_range

Total tilt range in degrees

Type:

float

tilt_series_quality

Author assessment of tilt series quality within the dataset (1-5, 5 is best)

Type:

int

tilt_step

Tilt step in degrees

Type:

float

tilting_scheme

The order of stage tilting during acquisition of the data

Type:

str

total_flux

Number of electrons reaching the specimen per square angstrom over the entire tilt series

Type:

float
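The tilt and flux attributes lend themselves to simple sanity checks. The sketch below assumes a constant tilt step between tilt_min and tilt_max and an even distribution of total_flux across tilt images; neither assumption is guaranteed by this reference (dose-symmetric or variable-exposure schemes may differ), and both function names are hypothetical.

```python
def expected_tilt_count(tilt_min, tilt_max, tilt_step):
    """Number of tilt images implied by a constant step from tilt_min to tilt_max."""
    if tilt_step <= 0:
        raise ValueError("tilt_step must be positive")
    # Both end points are included, hence the + 1.
    return round((tilt_max - tilt_min) / tilt_step) + 1


def flux_per_tilt(total_flux, n_tilts):
    """Average flux per tilt image, in electrons per square angstrom."""
    if n_tilts <= 0:
        raise ValueError("n_tilts must be positive")
    return total_flux / n_tilts
```

For a series from -60 to 60 degrees in 3 degree steps, expected_tilt_count gives 41 images; comparing that with frames_count or the angle list can flag incomplete acquisitions.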

download_alignment_file(dest_path: str | None = None)

Download the alignment file for this tiltseries

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

download_angle_list(dest_path: str | None = None)

Download the angle list for this tiltseries

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

download_collection_metadata(dest_path: str | None = None)

Download the collection metadata for this tiltseries

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

download_mrcfile(dest_path: str | None = None)

Download an MRC file for this tiltseries

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.

download_omezarr(dest_path: str | None = None)

Download the OME-Zarr version of this tiltseries

Parameters:

dest_path (Optional[str], optional) – Choose a destination directory. Defaults to $CWD.
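The smaller per-tiltseries files can be fetched together by calling the corresponding download methods in turn. In this sketch the helper name and the grouping of these three methods as "sidecar" files are hypothetical; the method names and the dest_path parameter come from the reference above. How the methods behave when a file is absent for a given tilt series is not documented here, so no error handling is attempted.

```python
from pathlib import Path

# Method names taken from the TiltSeries reference; the grouping is an assumption.
SIDECAR_DOWNLOADS = (
    "download_alignment_file",
    "download_angle_list",
    "download_collection_metadata",
)


def download_sidecar_files(tiltseries, dest_dir):
    """Download alignment, angle list, and collection metadata into dest_dir."""
    target = Path(dest_dir)
    # Precaution: the reference does not state whether the directory is created.
    target.mkdir(parents=True, exist_ok=True)
    completed = []
    for method_name in SIDECAR_DOWNLOADS:
        getattr(tiltseries, method_name)(dest_path=str(target))
        completed.append(method_name)
    return completed
```

The larger image data can be fetched separately with download_mrcfile or download_omezarr when it is actually needed.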

classmethod find(client: Client, query_filters: Iterable[GQLExpression] | None = None)

Find objects based on a set of search filters.

Search filters are combined with AND, so every result matches all of the filters.

Expressions with python-native operators (==, !=, >, >=, <, <=) must be in the format:

ModelSubclass.field {operator} {value}

Example

  • Tomogram.voxel_spacing.run.name == "RUN1"

Expressions with method operators (like, ilike, _in) must be in the format:

ModelSubclass.field.{operator}({value})

Examples

  • Tomogram.voxel_spacing.run.name.like("%RUN1%")

  • Tomogram.voxel_spacing.run.name._in(["RUN1", "RUN2"])

Supported operators are: ==, !=, >, >=, <, <=, like, ilike, _in

  • like is a partial match, with the % character being a wildcard

  • ilike is similar to like but case-insensitive

  • _in accepts a list of values that are acceptable matches.

Values may be strings or numbers depending on the type of the field being matched, and _in supports a list of values of the field’s corresponding type.

ModelSubclass.field may be an arbitrarily nested path to any field on any related model, such as:

ModelSubclass.related_class_field.related_field.second_related_class_field.second_field

Parameters:
  • client – A CryoET Portal API Client

  • query_filters – A set of expressions that narrow down the search results

Yields:

Matching Model objects.

Examples

Filter runs by attributes, including attributes in related models:

>>> runs = Run.find(client, query_filters=[Run.name == "TS_026", Run.dataset.id == 10000])
>>> runs = Run.find(client, query_filters=[Run.name._in(['TS_026', 'TS_027']), Run.tomogram_voxel_spacings.annotations.object_name.ilike('%membrane%')])

Get all results for this type:

>>> runs = Run.find(client)

classmethod get_by_id(client: Client, id: int)

Find objects by primary key

Parameters:
  • client – A CryoET Portal API Client

  • id – Unique identifier for the object

Returns:

A matching Model object if found, None otherwise.

Examples

Get a Run by ID:

>>> run = Run.get_by_id(client, 1)
>>> print(run.name)

to_dict() → Dict[str, Any]

Return a dictionary representation of this object’s attributes