MolViewSpec selectors
Selectors are used in MVS to define substructures (components) and apply colors, labels, or tooltips to them. MVS nodes that take a selector parameter are component (creates a component from the parent structure node) and color (applies coloring to a part of the parent representation node).
There are three kinds of selectors: static selector, component expression, and union component expression.
Selectors
Static selector
A Static selector is a string that selects a part of the structure based on entity type. The supported static selectors are: "all", "polymer", "protein", "nucleic", "branched", "ligand", "ion", "water".
Component expression
A Component expression is an object that selects a set of atoms based on their properties like chain identifier, residue number, or type symbol. The type of a component expression object is:
{
label_entity_id?: str, // Entity identifier
label_asym_id?: str, // Chain identifier in label_* numbering
auth_asym_id?: str, // Chain identifier in auth_* numbering
label_seq_id?: int, // Residue number in label_* numbering
auth_seq_id?: int, // Residue number in auth_* numbering
pdbx_PDB_ins_code?: str, // PDB insertion code
beg_label_seq_id?: int, // Minimum label_seq_id (inclusive), leave blank to start from the beginning of the chain
end_label_seq_id?: int, // Maximum label_seq_id (inclusive), leave blank to go to the end of the chain
beg_auth_seq_id?: int, // Minimum auth_seq_id (inclusive), leave blank to start from the beginning of the chain
end_auth_seq_id?: int, // Maximum auth_seq_id (inclusive), leave blank to go to the end of the chain
label_atom_id?: str, // Atom name like 'CA', 'N', 'O', in label_* numbering
auth_atom_id?: str, // Atom name like 'CA', 'N', 'O', in auth_* numbering
type_symbol?: str, // Element symbol like 'H', 'HE', 'LI', 'BE'
atom_id?: int, // Unique atom identifier (_atom_site.id)
atom_index?: int, // 0-based index of the atom in the source data
instance_id?: str // Instance identifier to distinguish instances of the same chain created by applying different symmetry operators, like 'ASM-X0-1' for assemblies or '1_555' for crystals
}
A component expression can include any combination of the fields. An expression with multiple fields selects only the atoms that satisfy all of those fields at the same time. Examples:
// Select whole chain A
selector: { label_asym_id: 'A' }
// Select residues 100 to 200 (inclusive) in chain B
selector: { label_asym_id: 'B', beg_label_seq_id: 100, end_label_seq_id: 200 }
// Select C-alpha atoms in residue 100 (using auth_* numbering) of any chain
selector: { auth_seq_id: 100, type_symbol: 'C', auth_atom_id: 'CA' }
Union component expression
A Union component expression is an array of simple component expressions. It is interpreted as a set union, i.e. it selects all atoms that satisfy at least one of the expressions in the array. Examples:
// Select chains A, B, and C
selector: [{ label_asym_id: 'A' }, { label_asym_id: 'B' }, { label_asym_id: 'C' }];
// Select residues up to 100 (inclusive) in chain A plus all magnesium atoms
selector: [{ label_asym_id: 'A', end_label_seq_id: 100 }, { type_symbol: 'MG' }];
instance_id
The instance_id field in component expressions does not correspond to any column in the mmCIF atom_site category, but it can be used to distinguish instances of the same chain created by applying different symmetry operators. Instance IDs follow these rules:
Crystals
Instances created by crystal (spacegroup) symmetry use IDs in the form n_klm, in accordance with the mmCIF dictionary recommendation. However, to avoid ambiguity, any translation index (k, l, m) smaller than 0 or greater than 9 is always enclosed in parentheses. Indices 0–9 never use parentheses.
- e.g. 1_555, 2_454
- e.g. 1_(11)15, 1_1(11)5, 1_11(15) (instead of ambiguous 1_1115)
- e.g. 1_(-1)1(-1)
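The parenthesization rule for crystal instance IDs can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not code from MolViewSpec; the function names `format_translation_index` and `crystal_instance_id` are hypothetical, and the indices are assumed to be the values as written in the ID (so 5 means no translation along that axis, following the n_klm convention).

```python
def format_translation_index(index: int) -> str:
    """Format one translation index: bare for 0-9, parenthesized otherwise."""
    return str(index) if 0 <= index <= 9 else f"({index})"


def crystal_instance_id(operator: int, k: int, l: int, m: int) -> str:
    """Build an ID of the form n_klm from a symmetry operator number n
    and three translation indices (hypothetical helper, for illustration)."""
    indices = "".join(format_translation_index(i) for i in (k, l, m))
    return f"{operator}_{indices}"


print(crystal_instance_id(1, 5, 5, 5))    # 1_555
print(crystal_instance_id(1, 11, 1, 5))   # 1_(11)15
print(crystal_instance_id(1, 1, 11, 5))   # 1_1(11)5
print(crystal_instance_id(1, -1, 1, -1))  # 1_(-1)1(-1)
```

Because only out-of-range indices get parentheses, the three IDs that would all collapse to 1_1115 stay distinct.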
Assemblies
Instances in assemblies use IDs inspired by the wwPDB recommendation for naming chains in assemblies. Instance IDs are based on data from the pdbx_struct_assembly_gen and pdbx_struct_oper_list categories in the mmCIF dictionary.
Where only one operator is applied to create the instance, the instance ID is ASM- plus the operator identifier (pdbx_struct_oper_list.id).
- e.g. ASM-1, ASM-2, ASM-3, ASM-4 from generator expression 1,2,3,4
- e.g. ASM-1, ASM-2, ASM-3, ASM-4, ASM-5 from generator expression (1-5)
Where multiple operators are applied to create the instance, the instance ID is ASM- plus a dash-separated list of operator identifiers. The order of the operators is the same as in the generator expression (i.e. rightmost operator is applied first).
- e.g. ASM-X0-1, ASM-X0-2 … ASM-X0-20 from generator expression (X0)(1-20)
- e.g. ASM-1-61, ASM-1-62 … ASM-2-61, ASM-2-62 … ASM-60-88 from generator expression (1-60)(61-88)
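As a rough illustration of the assembly naming rules above, the following Python sketch expands a generator expression into the instance IDs it would produce. It is not the MolViewSpec implementation: the helpers `expand_group` and `assembly_instance_ids` are hypothetical, and the parser assumes only the expression shapes shown in the examples (comma-separated lists, numeric ranges, and parenthesized groups).

```python
import re
from itertools import product


def expand_group(group: str) -> list[str]:
    """Expand one group such as '1,2,3,4', '1-5' or 'X0' into operator IDs."""
    ids: list[str] = []
    for part in group.split(","):
        part = part.strip()
        range_match = re.fullmatch(r"(\d+)-(\d+)", part)
        if range_match:
            start, end = int(range_match.group(1)), int(range_match.group(2))
            ids.extend(str(i) for i in range(start, end + 1))  # inclusive range
        elif part:
            ids.append(part)
    return ids


def assembly_instance_ids(oper_expression: str) -> list[str]:
    """List instance IDs for a generator expression (illustrative only).

    Operators are joined in the order they appear in the expression.
    """
    groups = re.findall(r"\(([^()]*)\)", oper_expression)
    if not groups:  # no parentheses: the whole expression is a single group
        groups = [oper_expression]
    expanded = [expand_group(group) for group in groups]
    return ["ASM-" + "-".join(combo) for combo in product(*expanded)]


print(assembly_instance_ids("1,2,3,4"))
print(assembly_instance_ids("(X0)(1-20)")[:2])
print(len(assembly_instance_ids("(1-60)(61-88)")))
```

An ID obtained this way can then be used as the instance_id field of a component expression, for example together with label_asym_id to pick one copy of a chain within an assembly.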
ref
Component expressions can be applied to primitives as well. In addition, a ref can be provided to direct a selection at a specific node, e.g. when working with multiple structures. Any MVS node accepts a ref, which serves as an anchor:
builder.download(url=url).parse(format="mmcif").model_structure(ref="X")
This ref can then be referenced in the context of a selection:
PrimitiveComponentExpressions(structure_ref="X", expressions=[ComponentExpression(auth_seq_id=508)])
An alternative to using selectors is using MVS annotations. This means defining the selections in a separate file and referencing them from the MVS file.