Skip to content

Search sequences

Mutations

Nucleotide mutations and insertions

A nucleotide mutation has the format <position><base> or <base_ref><position><base>. A <base> can be one of the four nucleotides A, T, C, and G. It can also be - for deletion and N for unknown. For example if the reference sequence is A at position 23 both: 23T and A23T will yield the same results.

If your organism is multi-segmented you must append the name of the segment to the start of the mutation, e.g. S:23T and S:A23T for a mutation in segment S.

Insertions can be searched for in the same manner, they just need to have ins_ appended to the start of the mutation. Example ins_10462:A or if the organism is multi-segmented ins_S:10462:A.

Amino acid mutations and insertions

An amino acid mutation has the format <gene>:<position><base> of <gene>:<base_ref><position><base>. A <base> can be one of the 20 amino acid codes. It can also be - for deletion and X for unknown. Example: E:57Q.

Insertions can be searched for in the same manner, they just need to have ins_ appended to the start of the mutation. Example ins_NS4B:31:N.

Insertion wildcards

Loculus supports insertion queries that contain wildcards ?. For example ins_S:214:?EP? will match all cases where segment S has an insertion of EP between the positions 214 and 215 but also an insertion of other AAs which include the EP, e.g. the insertion EPE will be matched.

You can also use wildcards to match any insertion at a given position. For example ins_S:214:?: will match any (but at least one) insertion between the positions 214 and 215.

Multiple mutations

Multiple mutation filters can be provided by adding one mutation after the other.

Any mutation

To filter for any mutation at a given position you can omit the <base>.