Helm chart config

The configuration for the Helm chart is provided as a YAML file. It has the following fields:

General Settings

Field	Type	Default	Description
`name`	String	"Loculus"	The name of the Loculus instance
`logo`	Object		Configuration for the logo
`logo.url`	String		URL path to the logo image file
`logo.width`	Integer	100	Width of the logo in pixels
`logo.height`	Integer	100	Height of the logo in pixels
`accessionPrefix`	String	"LOC_"	Prefix used for accession numbers
`environment`	local, server	"server"	Deployment environment. local for development, server for production
`localHost`	String	"localhost"	Host string used for local development
`host`	String		Hostname where Loculus will be accessible
`bannerMessage`	String	"This is a demonstration environment. It may contain non-accurate test data and should not be used for real-world applications. Data will be deleted regularly."	Banner message (as HTML) to display at the very top of the page
`bannerMessageURL`	String		URL to fetch banner message HTML from. If set, it is polled regularly for banner text (e.g. a notice of temporary downtime) and overrides the `bannerMessage` setting
`welcomeMessageHTML`	String, Null		A custom welcome message to be shown on the landing page
`additionalHeadHTML`	String		Additional HTML to inject into the `<head>` of pages
`createTestAccounts`	Boolean		If true, creates the users testuser and superuser
`robotsNoindexHeader`	Boolean		If true, adds a noindex header to prevent search engine indexing
`seqSets`	Object		Configuration for the SeqSets feature
`seqSets.enabled`	Boolean		Enable/disable SeqSets. If false, `seqSets.crossRef` can be omitted.
`seqSets.crossRef`	Object		Configuration for CrossRef integration. Set to `null` to disable CrossRef integration (you can still use SeqSets without CrossRef DOIs).
`seqSets.crossRef.DOIPrefix`	String		The DOI prefix for SeqSets
`seqSets.crossRef.endpoint`	String		The API endpoint for CrossRef
`seqSets.crossRef.databaseName`	String		The database name for CrossRef
`seqSets.crossRef.email`	String		The email address associated with the CrossRef account
`seqSets.crossRef.organization`	String		The organization name for CrossRef
`seqSets.crossRef.hostUrl`	String		The host URL for CrossRef callbacks
`seqSets.fieldsToDisplay`	Array		Fields to display in the SeqSets table
`gitHubMainUrl`	String	"https://github.com/loculus-project/loculus"	The link that the GitHub icon in the footer points to
`sequenceFlagging.github`	Object		Settings to enable sequence reporting via GitHub. If set, a button will be enabled on sequence details views to report issues with a sequence.
`dataUseTerms.enabled`	Boolean		Whether this Loculus instance handles data use terms.
`dataUseTerms.urls.open`	String		A URL describing the open data use terms.
`dataUseTerms.urls.restricted`	String		A URL describing the restricted data use terms.
`fileSharing`	Object		Settings related to the file sharing feature.
`fileSharing.outputFileUrlType`	website, backend, s3		Whether the backend should supply file URLs as links to the website, the backend itself, or directly to the S3 server objects.
`images`	Object		Which Docker images to use. You can specify an image spec (type) for `lapis`, `lapisSilo`, `website` and `backend`.
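
As a rough illustration, a values file that sets a few of the general settings above might look like the following sketch. All values (instance name, hostname, logo path, URLs) are placeholders chosen for this example, not recommended settings.

```yaml
# Hypothetical values; adjust to your deployment.
name: "My Loculus"
logo:
  url: "/images/logo.svg"      # placeholder path
  width: 100
  height: 100
accessionPrefix: "LOC_"
environment: server
host: "loculus.example.org"    # placeholder hostname
robotsNoindexHeader: true
seqSets:
  enabled: false               # crossRef can be omitted when SeqSets are disabled
dataUseTerms:
  enabled: true
  urls:
    open: "https://example.org/terms/open"
    restricted: "https://example.org/terms/restricted"
```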

Website Settings

Field	Type	Default	Description
`website`	Object		Website-specific settings
`website.websiteConfig`	Object		Settings for the `website_config.json`
`website.websiteConfig.enableLoginNavigationItem`	Boolean	true	Whether the website should show the login button.
`website.websiteConfig.enableSubmissionNavigationItem`	Boolean	true	Whether the website should show the "Submit" link in the top navigation bar.
`website.websiteConfig.enableSubmissionPages`	Boolean	true	Whether submission-related pages are enabled. Setting this to false disables them completely, which is useful when hosting Loculus for analysis-only purposes.
`website.runtimeConfig.public`	Object		Settings for the `public` section of the `runtime_config.json`
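
For example, an analysis-only instance could switch off the login and submission entry points. This is a minimal sketch that uses only the `websiteConfig` fields listed above; whether you also want to hide the login button is a deployment decision.

```yaml
website:
  websiteConfig:
    enableLoginNavigationItem: false
    enableSubmissionNavigationItem: false
    enableSubmissionPages: false   # removes submission-related pages entirely
```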

User registration and authentication

Field	Type	Default	Description
`auth`	Object		User authentication (Keycloak settings)
`auth.smtp`	Object		Configuration for email sending
`auth.smtp.host`	String		SMTP server hostname
`auth.smtp.port`	Integer		SMTP server port
`auth.smtp.user`	String		SMTP username for authentication
`auth.smtp.replyTo`	String		Reply-to email address for sent emails
`auth.smtp.from`	String		From email address for sent emails
`auth.smtp.envelopeFrom`	String		Envelope from address for sent emails
`auth.verifyEmail`	Boolean	true	If true, requires email verification for new accounts
`auth.resetPasswordAllowed`	Boolean	true	If true, allows users to reset their passwords
`auth.registrationAllowed`	Boolean	true	If true, allows users to register new accounts in Keycloak.
`auth.identityProviders.orcid.clientId`	String		The client ID to use for ORCiD integration.
`insecureCookies`	Boolean	false	If true, allows insecure cookies.
`registrationMessage`	String	"You must agree to the <a href='http://main.loculus.org/terms'>terms of use</a>."	Message displayed during user registration, typically including terms of service.
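
The nesting of the `auth` fields can be sketched as follows. Hostnames, addresses, the username and the ORCiD client ID are placeholders. Only fields from the table above are shown, so anything else your SMTP server needs (such as credentials) is not covered by this sketch.

```yaml
auth:
  smtp:
    host: "smtp.example.org"            # placeholder
    port: 587
    user: "loculus-mailer"              # placeholder
    from: "noreply@example.org"
    replyTo: "support@example.org"
    envelopeFrom: "bounces@example.org"
  verifyEmail: true
  resetPasswordAllowed: true
  registrationAllowed: true
  identityProviders:
    orcid:
      clientId: "APP-XXXXXXXXXXXXXXXX"  # placeholder client ID
```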

Database deployments

Field	Type	Default	Description
`runDevelopmentMainDatabase`	Boolean	true	If true, runs a development database within the cluster.
`runDevelopmentKeycloakDatabase`	Boolean	true	If true, runs a development Keycloak database within the cluster.
`developmentDatabasePersistence`	Boolean	true	If true, makes the database on the argocd preview persistent.
`runDevelopmentS3`	Boolean	true	If true, runs a development MinIO S3 instance within the cluster.

For production environments, these should always be set to false. Instead, external managed databases should be used.
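
A production values file would therefore typically contain something like the sketch below. Connection details for the external databases are not part of this table and are not shown.

```yaml
runDevelopmentMainDatabase: false
runDevelopmentKeycloakDatabase: false
developmentDatabasePersistence: false
runDevelopmentS3: false
```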

S3 deployments

Field	Type	Default	Description
`s3.enabled`	Boolean	false	Whether to enable S3. S3 is needed for the file sharing feature.
`s3.bucket.endpoint`	String	"s3-<host value>"	The endpoint where S3 is running. Do not include a protocol prefix.
`s3.bucket.region`	String		The region where the bucket is located.
`s3.bucket.bucket`	String		The name of the bucket.
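
A sketch of an S3 configuration using the fields above, with placeholder endpoint, region and bucket name. The `fileSharing.outputFileUrlType` setting from the general settings is included only to show how the two sections relate.

```yaml
s3:
  enabled: true
  bucket:
    endpoint: "s3.example.org"   # placeholder; no protocol prefix
    region: "us-east-1"          # placeholder
    bucket: "loculus-files"      # placeholder
fileSharing:
  outputFileUrlType: s3          # one of: website, backend, s3
```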

Services

Field	Type	Default	Description
`disableWebsite`	Boolean	false	If true, disables the frontend website deployment.
`disableBackend`	Boolean	false	If true, disables the backend API deployment.
`disablePreprocessing`	Boolean	false	If true, disables preprocessing pipelines.
`disableIngest`	Boolean	false	If true, disables ingestion services.
`disableEnaSubmission`	Boolean	false	If true, disables ENA (European Nucleotide Archive) submission service.
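
For instance, an instance that neither ingests data from external sources nor submits to ENA might set the following. Which services to disable depends on your deployment.

```yaml
disableIngest: true
disableEnaSubmission: true
```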

Organism Configuration

Field	Type	Default	Description
`organisms`	Object		An object where the keys are the organism IDs and values are an Organism (type)
`lineageSystemDefinitions`	Object		An object where the keys are the lineage system names and values are links to lineage system definition files per pipeline version (See Lineage system definitions)

Lineage system definitions

Here’s an example of a `lineageSystemDefinitions` section:

lineageSystemDefinitions:
  pangoLineage: # Lineage name to use in metadata fields
    1: https://example.org/lineage_defintions_v1.yaml # Definition per pipeline version
    2: https://example.org/lineage_defintions_v2.yaml
  myLineage:
    1: ...

Field	Type	Default	Description
`lineageSystemDefinitions.<name>`	Object		A map from pipeline versions to file URLs.
`lineageSystemDefinitions.<name>.<pipelineVersion>`	String		The URL to the lineage definition file for that lineage system and that pipeline version.

Organism (type)

Each organism object has the following fields:

Field	Type	Default	Description
`enabled`	Boolean	true	Whether this organism is enabled.
`schema`	Object		Object of type organism schema
`preprocessing`	Array		Array of Preprocessing (type).
`ingest`	Object		Object of type Ingest
`referenceGenomes.nucleotideSequences`	Array		Array of Nucleotide sequence (type)
`referenceGenomes.genes`	Array		Array of Gene (type)
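
The following sketch shows how organisms are keyed by their ID under `organisms`. The organism IDs and display names are made up for illustration, and the remaining fields (`preprocessing`, `ingest`, `referenceGenomes`) are left out for brevity.

```yaml
organisms:
  example-virus:            # hypothetical organism ID
    enabled: true
    schema:
      organismName: "Example virus"
  another-virus:            # hypothetical organism ID
    enabled: false          # configured but not enabled
    schema:
      organismName: "Another virus"
```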

Organism schema (type)

Field	Type	Default	Description
`organismName`	String		Display name for the organism
`image`	String		URL to an image that will be shown on the landing page
`loadSequencesAutomatically`	Boolean	true	Whether sequences should be loaded automatically on the sequence details page, rather than users having to press a button to do so. (For small genomes, this should probably be true.)
`submissionDataTypes.consensusSequences`	Boolean	true	If false, the submission form will not allow submission of consensus sequences (i.e. the sequences file must be omitted). All consensus sequence related parts on the website will be hidden.
`submissionDataTypes.files.enabled`	Boolean	false	If true, enable support for submitting additional files. S3 needs to be configured and categories need to be configured under `categories`.
`submissionDataTypes.files.categories`	Array		The file categories that can be submitted for this organism.
`submissionDataTypes.files.categories.<fileCategory>.name`	String		The name of the file category, e.g. 'raw_reads'.
`<file>`	Array		An array of output files.
`<file>.[].name`	String		TODO
`<file>.[].multipleFiles`	Boolean		TODO
`metadata`	Array		Array of Metadata fields (type) associated with the organism.
`metadataAdd`	Array		Array of Metadata fields (type) associated with the organism, in addition to the fields in `metadata`.
`metadataTemplate`	Array		Array of strings. Which input fields to add to the downloadable metadata template on the submission and revision page.
`nucleotideSequences`	Array		Array of strings of nucleotide sequence names. Defaults to a list with just 'main'.
`richFastaHeaderFields`	Array		Which metadata fields to include in the fasta header when downloading sequences from the website when using the 'Display name' FASTA header style option.
`linkOuts`	Array		An array of external tools that can be linked to from the search page. Each linkOut has a name and URL. The URL can contain placeholders in the format [dataType] or [dataType:segment] or [dataType+rich] or [dataType:segment+rich\|format] which will be replaced with the corresponding data URLs. `rich` means display name FASTA headers. You can use [metadata+col1,col2] to specify columns. Valid dataTypes are: unalignedNucleotideSequences, metadata, and alignedNucleotideSequences. You can also use {{this}} notation to URL-encode the contents of the double brackets, and you can {{nest {{these}} }}.
`earliestReleaseDate`	Object		Configuration object for enabling and configuring the `earliestReleaseDate` metadata field. For each version of an accession, the `earliestReleaseDate` is calculated as the earliest date of the internal release date, the dates in the configured `externalFields` and the value from the previous version of the accession (if there is one). This can be used when having a mix of sequences imported from other databases, as well as sequences released first in this Loculus instance, to have a field that shows the earliest release date regardless of where the sequence was first released.
`earliestReleaseDate.enabled`	Boolean		Whether to enable the `earliestReleaseDate` metadata field.
`earliestReleaseDate.externalFields`	Array		Field names to use when calculating the earliest release date. The fields need to be nullable strings formatted as `yyyy-mm-dd`.
`website`	Object		Configuration for how the organism data is displayed on the website
`website.tableColumns`	Array		Columns to display in the browse table
`website.defaultOrderBy`	String		Default column to sort the browse table
`website.defaultOrder`	ascending, descending		Default order direction.
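
A sketch of an organism schema that combines several of the fields above. The metadata field names (`collectionDate`, `country`, `ncbiReleaseDate`) and the image path are hypothetical and would need to match fields defined under `metadata`.

```yaml
schema:
  organismName: "Example virus"
  image: "/images/example-virus.jpg"   # placeholder path
  loadSequencesAutomatically: true
  submissionDataTypes:
    consensusSequences: true
  nucleotideSequences:
    - main
  metadataTemplate:
    - collectionDate
    - country
  earliestReleaseDate:
    enabled: true
    externalFields:
      - ncbiReleaseDate                # nullable string, yyyy-mm-dd
  website:
    tableColumns:
      - collectionDate
      - country
    defaultOrderBy: collectionDate
    defaultOrder: descending
```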

Metadata field (type)

Definition of metadata fields for sequence entries of an organism, for example the collection date and location of a sample.

Field	Type	Default	Description
`name`	String		Key used across app to refer to this field.
`displayName`	String		Name displayed to users.
`type`	string, int, float, number, date, boolean, authors	"string"
`header`	String		Grouping of fields in sequence details UI.
`required`	Boolean		Whether the field is required by backend.
`desired`	Boolean		Whether the field is a desired input field for submitters.
`definition`	String		Definition of input field for submitters.
`guidance`	String		Guidance for submitters on filling in input field.
`noInput`	Boolean		If true, no input is expected for this field, so it is not included in the metadata template or as a form field. If false or unset, the field is treated as a possible input and is included in both.
`notSearchable`	Boolean		If true, disable search for this field.
`generateIndex`	Boolean		Whether the field should be indexed for search. This is only allowed for string fields and facilitates faster filters. It is recommended if the number of different values is rather small.
`columnWidth`	Number		The minimum column width for this field on the search table.
`order`	Number		Order for the column in the search table, lower first.
`autocomplete`	Boolean		Whether autocomplete should be offered for the field. This is only allowed for string fields, and `generateIndex` should probably also be true.
`<option>`	Array		An array of options for the value of this field, when it is an input field.
`<option>.[].name`	String		The name of the option.
`enableSubstringSearch`	Boolean		If true, searches return entries whose value contains the search term as a substring. If false (the default), only exact matches are returned. This only works for string fields and cannot be combined with `autocomplete`.
`rangeSearch`	Boolean		If true, enables range search for numeric fields.
`rangeOverlapSearch`	Object		Config settings for enabling range overlap search.
`rangeOverlapSearch.rangeName`	String		The range that this field belongs to. Two fields (the upper and lower bound) need to be defined with the same `rangeName`.
`rangeOverlapSearch.rangeDisplayName`	String		The display name of the range.
`rangeOverlapSearch.bound`	lower, upper		Whether this field is the lower or upper bound of the range.
`initiallyVisible`	Boolean		If true, the field is initially visible in the UI.
`hideOnSequenceDetailsPage`	Boolean		If true, hides the field on the sequence details page.
`hideInSearchResultsTable`	Boolean		If true, hides the field in the search results table (and makes it impossible to show).
`includeInDownloadsByDefault`	Boolean		Whether this field should be included in metadata downloads by default.
`perSegment`	Boolean		Whether this is a metadata field that should exist for each segment. If so, fields will be created as fieldName_A, fieldName_B in the case of an organism with segments A and B.
`oneHeader`	Boolean		For segmented fields, whether this field should be grouped by segment or not.
`customDisplay`	Object		Custom display settings for the field on the sequence details page (see Custom display)
`customDisplay.type`	String
`customDisplay.url`	String
`ontology_id`	String
`example`	String, Number		Example value for the field.
`ingest`	String		Which NCBI field to map to this field.
`preprocessing`	Object		The values of this field will be added to the preprocessing pipeline config file and the available values depend on the chosen pipeline. For the Nextclade pipeline, please see here.
`lineageSystem`	String		Use this on string fields that contain lineages, if you want to enable searches that can include sublineages. The value needs to be a lineage system that is defined under the `lineageSystemDefinitions` key.
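
To make the table more concrete, here is a sketch of a `metadata` array with an indexed string field, a lineage field and a pair of range-overlap fields. All field names and display names are invented for this example. `pangoLineage` refers to the lineage system from the `lineageSystemDefinitions` example.

```yaml
metadata:
  - name: country
    displayName: "Country"
    type: string
    header: "Sample details"
    desired: true
    generateIndex: true        # few distinct values, so indexing helps filtering
    autocomplete: true
    initiallyVisible: true
  - name: lineage
    displayName: "Lineage"
    type: string
    lineageSystem: pangoLineage
  # Lower and upper bound of one range share the same rangeName.
  - name: collectionDateRangeLower
    type: date
    rangeOverlapSearch:
      rangeName: collectionDateRange
      rangeDisplayName: "Collection date range"
      bound: lower
  - name: collectionDateRangeUpper
    type: date
    rangeOverlapSearch:
      rangeName: collectionDateRange
      rangeDisplayName: "Collection date range"
      bound: upper
```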

Custom display

Use the optional customDisplay object on a metadata field to change how the value appears on the sequence details page. The object can contain type, url, html, value and displayGroup properties. The type property controls the behaviour:

link – open the value as a hyperlink using the url where __value__ is replaced by the actual value.
badge – show mutation lists as coloured badges (requires value).
htmlTemplate – replace __value__ in the provided html string and render the resulting HTML snippet.
percentage – display numeric values as percentages.
dataUseTerms – show the value together with a data use terms history modal.
submittingGroup – interpret the value as JSON with group details and link to the group page.
fileList – interpret the value as a JSON list of files and show them as download links.

When displayGroup is provided, all entries with the same group are collapsed into a single row.
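
As an illustration, the sketch below configures one field as a link and another as a percentage. The field names are hypothetical, and the URL pattern is only an example of where `__value__` is substituted.

```yaml
metadata:
  - name: insdcAccessionFull
    displayName: "INSDC accession"
    customDisplay:
      type: link
      # __value__ is replaced by the field's actual value
      url: "https://www.ncbi.nlm.nih.gov/nuccore/__value__"
  - name: completeness           # hypothetical numeric field
    type: float
    customDisplay:
      type: percentage
```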

Preprocessing (type)

Field	Type	Description
`version`	Integer	Version of the preprocessing pipeline.
`image`	String	Docker image for the preprocessing pipeline.
`dockerTag`	String	Docker tag for the preprocessing pipeline.
`replicas`	Number	How many replicas of the preprocessing pipeline to run.
`args`	Array	Array of Strings. Arguments passed to the preprocessing pipeline.
`configFile`	Object	Object of type ConfigFile. Fields that should be added to the preprocessing pipeline config file.

The values for args and configFile depend on the used preprocessing pipeline.

Nextclade Preprocessing Pipeline ConfigFile (type)

Field	Type	Description
`alignment_requirement`	ALL, ANY	For multi-segmented viruses, whether ALL segments must align or whether it is sufficient that ANY segment aligns
`nextclade_dataset_server`	String
`nextclade_dataset_name`	String	Required if sequences should be aligned
`require_nextclade_sort_match`	Boolean	If true, run `nextclade sort` and require that the highest-scoring match is listed in `accepted_dataset_matches`
`minimizer_url`	String	Minimizer used for `nextclade sort` (if `require_nextclade_sort_match` is true). If not specified, the default Nextclade server minimizer is used
`accepted_dataset_matches`	Array	Array of strings. The dataset names that are accepted as matches when using `nextclade sort`. If not specified, `nextclade_dataset_name` is used.

For more details on the Nextclade preprocessing pipeline, please see here.
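
A preprocessing entry with a Nextclade-style `configFile` might be sketched as follows. The image name, dataset server URL and dataset names are placeholders, and `args` is omitted because its values depend on the pipeline.

```yaml
preprocessing:
  - version: 1
    image: "ghcr.io/example/preprocessing-nextclade"   # hypothetical image
    dockerTag: "latest"
    replicas: 1
    configFile:
      alignment_requirement: ANY
      nextclade_dataset_server: "https://example.org/nextclade-datasets"   # placeholder
      nextclade_dataset_name: "example/dataset"                            # placeholder
      require_nextclade_sort_match: true
      accepted_dataset_matches:
        - "example/dataset"
        - "example/dataset-b"
```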

Ingest (type)

Field	Type	Default	Description
`image`	String	"ghcr.io/loculus-project/ingest"	Docker image for the ingest pipeline
`configFile`	Object		Object of type ConfigFile. Fields that should be added to the ingest pipeline config file
`taxon_id`	Integer		NCBI taxon ID for the organism
`segment_identification`	Object		If multi-segmented organism, how to identify segments
`grouping_override`	String		If multi-segmented organism, segment grouping overrides
`metadata_filter`	Object		Filter ingested sequences based on value in metadata. Filter should be a list of metadata field and value pairs.

The values for configFile depend on the used preprocessing pipeline.

Ingest ConfigFile (type)

For our ingest pipeline we require the following fields:

Field	Type	Description
`method`	align, minimizer	Method to identify segments, uses either nextclade align or nextclade sort
`nextclade_dataset_server`	String
`nextclade_dataset_name`	String	Required if method is align
`minimizer_parser`	Array	Required if method is minimizer. List of the names of the '_'-separated metadata fields in the minimizer index
`minimizer_index`	String	Required if method is minimizer

NucleotideSequence (type)

Field	Type	Description
`name`	String	Name of the sequence
`sequence`	String
`insdcAccessionFull`	String	INSDC accession of the sequence

Gene (type)

Field	Type	Default	Description
`name`	String		Name of the gene.
`sequence`	String
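
Put together, the `referenceGenomes` section of an organism could look like the sketch below. The names, the accession and the (heavily shortened) sequences are placeholders.

```yaml
referenceGenomes:
  nucleotideSequences:
    - name: "main"
      sequence: "ATGC"                  # placeholder, real sequences are much longer
      insdcAccessionFull: "XX000000.1"  # placeholder accession
  genes:
    - name: "geneA"                     # hypothetical gene name
      sequence: "ATGC"                  # placeholder
```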

Image spec (type)

Field	Type	Description
`repository`	String	The repository to pull the image from. Example: ghcr.io/loculus-project/website
`tag`	String	The tag to pull. Examples: latest, 0.5.7
`pullPolicy`	String	The pull policy to use. Examples: IfNotPresent, Always.
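
For example, pinning the website image with the example values from the table above could be sketched like this, assuming the image spec sits under the component's key within `images`:

```yaml
images:
  website:
    repository: "ghcr.io/loculus-project/website"
    tag: "0.5.7"
    pullPolicy: IfNotPresent
```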