Overview

SIRIUS 6 comes equipped with a comprehensive Graphical User Interface. Here’s a breakdown of the key components and their functions.

SIRIUS main application window.
  • The toolbar [1-5] at the top of the screen provides quick access to the essential tools: The first button group [1] on the left is for managing your project spaces, including creating, opening and saving them. The second button group [2] facilitates importing single features or data containing multiple features into your project space. The third button group [3] handles exporting data, e.g. for GNPS FBMN analysis or generating project space summaries. The fourth button group [4] is dedicated to computations, including starting the computations, getting the job view and importing custom databases.
    The last button group on the right [5] provides access to the log, settings, webservice information, and account details. Here you can also find a link to the online documentation (Help) as well as information on software licence and related publications (About).

  • The feature list [6] on the left side contains all imported (aligned) features. A feature contains all MS and MS/MS spectra corresponding to a measured aligned feature. For each feature, adduct type, precursor mass, retention time and confidence score (if computed) are shown in this panel.

  • The result view [8] is displayed to the right of the feature list and allows users to examine different result views. The tab selector [7] at the top of this panel lets you switch between these views, offering various perspectives on your data.

  • The bottom information bar [9] provides details about your license status for the webservice-based structure elucidation tools, the number of computed features and feature limits.

For a quick start, you can also watch our tutorial video introducing basic elements and functionality of the SIRIUS 4 GUI.

Account creation

Open Account in the top-right toolbar [5], and click Create Account to get to our user portal and create an account. After logging into the user portal, you can request a license. In the SIRIUS GUI, click Log in and enter your account credentials to log in.

Data import

You can import .ms, .mgf, Agilent’s .cef, .mzml and .mzxml files using the Import button or by drag-and-drop. SIRIUS will automatically extract all relevant attributes (such as MS level, ionization, and precursor mass) from the file. When importing multiple .mzml (or .mzxml) files at once, SIRIUS will ask you whether it should align them. When importing a peak list file containing a molecular formula, select Ignore Formulas if you would like to do the molecular formula annotation using SIRIUS.

For more information on supported file formats, refer to the Input section.

Sort and filter

To sort the feature list, use right-click to open the [sort] dialog below.

You can sort aligned features by retention time (RT), mass, name, ID or confidence score (if available). At the bottom of this dialog, you can select the confidence mode (approximate or exact) to be displayed and used for sorting. For more details, see the confidence modes section.

The displayed feature list is already filtered by quality, i.e., features containing only MS1 spectra, features with bad quality and multimeric features are hidden. Use the switches on top of the feature list to unhide.

The feature list can further be refined by clicking the filter button (three dots to the right of the search field) to open the filter dialog:

  • In the Input tab (left), aligned features can be filtered by mass range and retention time range, as well as by specific detected adducts.
  • In the Data Quality tab (middle), you can refine the preset quality filter.
  • In the Results tab (right), you can filter for confidence score range, specific element constraints in either the neutral molecular formula or precursor formula, as well as by detected lipid classes. If structure database results are available, you can filter for hits in specific structure databases. Candidates to check allows you to specify the number of top candidates to consider.

For all filters, you can also choose to invert the filter, and whether you want to delete all non-matching compounds.
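
As an illustration of how these filter options combine, the following minimal Python sketch applies a mass range and a retention time range to a list of features and optionally inverts the result. The Feature class, its field names and the filter_features function are hypothetical and do not correspond to any SIRIUS API. Treating the range bounds as inclusive is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Hypothetical stand-in for an aligned feature in the feature list."""
    name: str
    mz: float        # precursor m/z
    rt_min: float    # retention time in minutes


def filter_features(features, mz_range, rt_range, invert=False):
    """Keep features inside both ranges; with invert=True keep all others.

    Range bounds are treated as inclusive (an assumption of this sketch).
    """
    def matches(feature):
        in_mz = mz_range[0] <= feature.mz <= mz_range[1]
        in_rt = rt_range[0] <= feature.rt_min <= rt_range[1]
        return in_mz and in_rt

    # A feature is kept when its match status differs from the invert flag.
    return [f for f in features if matches(f) != invert]


if __name__ == "__main__":
    demo = [
        Feature("a", 150.0, 1.0),
        Feature("b", 450.0, 5.0),
        Feature("c", 900.0, 9.0),
    ]
    print([f.name for f in filter_features(demo, (100, 500), (0, 6))])
    print([f.name for f in filter_features(demo, (100, 500), (0, 6), invert=True)])
```

The sketch only returns the matching features. Deleting the non-matching ones, as offered in the dialog, would be a separate step.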

Import of custom structure and spectra databases

Custom structure databases and spectral libraries can be added via the Databases dialog, accessible via the top center of the GUI ribbon. Starting with version 6.0, SIRIUS supports the import of spectral libraries. The supported import formats for spectral data are .ms, .mgf, .msp, .mat, .txt (MassBank), .mb, .json (GNPS, MoNA). Spectra must be annotated with a structure and must be centroided. Custom structures can be used in structure database search. All imported spectra will automatically be used for spectral library matching during molecular formula annotation. A spectral library is also a molecular structure database and thus can be used for CSI:FingerID structure database search.

Custom database dialog.

Custom databases are stored as files with the .siriusdb extension. If you already have a database with this format on your local machine, you can add it to SIRIUS by clicking the Add existing Database button on the bottom right [1]. To create a new custom database, click the Create custom Database button. Imported databases can be deleted or modified using the - or pencil icon button, respectively.

To create a custom database, enter a name for the database [2] (maximum length: 15 characters), which will be displayed in the structure database search dialog. Specify the file name for the database, ensuring it ends with .siriusdb [3]. Choose any valid, writable path on your local machine to store the database [4]. Adjust the buffer size [5] to control how many structures or spectra should be kept in memory. This can be increased when importing large files on a faster machine. Drag and drop files or directories containing structure/spectra files to the input area [6], or use the + button to browse your file system.

For more details on custom database import and supported file formats, please see the Custom database tool section.

Please note that you have to be logged in to your SIRIUS account to import custom databases.

Computing results

SIRIUS offers two computation modes for processing data: Single Computation and Batch Computation. Both computation dialogs are similar in layout and functionality.

  • The Single Computation mode allows you to set different parameters for each individual feature. You can initiate a single computation by right-clicking on a single feature and selecting Compute in the context menu or by double-clicking on the feature. When opening the dialog, element prediction is performed to preset the chemical alphabet. However, you can change this alphabet in the Element Filter if you like.

  • For Batch Computation you can right-click on multiple selected features and choose Compute to process them collectively. Alternatively, you can use the Compute All button in the toolbar to compute all features in the project space. Here, you can use the Element Filter to choose the elements that should automatically be detected for each feature. Additionally, you can choose whether to recompute and override results for features that have already been analyzed.

The following section provides a detailed explanation of the compute dialog, using the Batch Computation dialog as example.

Compute dialog

Starting with SIRIUS 6, the compute dialog has been streamlined to improve clarity by displaying only those settings that are essential for any type of analysis. Use the Show advanced settings button at the bottom [7] to include additional settings that are relevant for specific use cases or to set limits for computation times.

Batch compute dialog.

The compute dialog is divided into five subtools: SIRIUS molecular formula annotation [1], ZODIAC [2], CSI:FingerID fingerprint prediction with CANOPUS [3], CSI:FingerID structure database search [4] and MSNovelist [5]. As of SIRIUS 6, CANOPUS is automatically executed together with the fingerprint prediction. Subtools can be selected individually or combined, but note that the selection must align with a valid SIRIUS workflow. For example, you cannot search structure databases without predicting fingerprints first. Subtools are automatically enabled/disabled to match the workflow principles.
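
The workflow rule described above can be pictured as a small dependency check. The following Python sketch is only an illustration: the subtool keys and the REQUIRES table are hypothetical names chosen for this example, the dependencies are taken from the descriptions in this section, and the sketch ignores results that may already exist in the project space.

```python
# Direct prerequisites of each subtool, as described in this section.
# The keys are hypothetical labels, not SIRIUS identifiers.
REQUIRES = {
    "formula": set(),                    # SIRIUS molecular formula annotation
    "zodiac": {"formula"},               # ZODIAC re-ranks formula candidates
    "fingerprint": {"formula"},          # CSI:FingerID prediction with CANOPUS
    "structure_search": {"fingerprint"},
    "msnovelist": {"fingerprint"},
}


def missing_prerequisites(selected):
    """Return {subtool: missing direct prerequisites} for a selection."""
    selected = set(selected)
    unknown = selected - REQUIRES.keys()
    if unknown:
        raise ValueError(f"unknown subtools: {sorted(unknown)}")
    problems = {}
    for tool in selected:
        missing = REQUIRES[tool] - selected
        if missing:
            problems[tool] = missing
    return problems


if __name__ == "__main__":
    print(missing_prerequisites({"formula", "structure_search"}))
    print(missing_prerequisites({"formula", "fingerprint", "structure_search"}))
```

An empty result means that every selected subtool has its direct prerequisites selected as well.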

If the Recompute already computed tasks? checkbox [6] is checked, all previously existing results for the selected features in the current project space will be invalidated and overwritten to execute the newly selected workflow. Additional parameters for specific subtools can be displayed using the Show advanced settings button [7]. To easily convert the current workflow selections into a CLI command, use the Show Command button [8] at the bottom right. You can save your computation setup as a preset to reload for the next computation [9].

Spectral library matching

If imported spectral libraries are available, SIRIUS will automatically perform spectral matching. This process runs in the background without the need for any additional parameters. For more information, refer to the Spectral library matching section.

Since structure database results depend on the selected molecular formula, SIRIUS ensures that molecular structures with a formula corresponding to a good spectral library hit are considered - even if this molecular formula receives a low score, i.e. molecular structures of well-matching reference spectra are automatically included in the structure database search.

Identifying molecular formulas with SIRIUS

To identify molecular formulas using SIRIUS, you can set general parameters [A], specify fallback adducts [B], and choose the appropriate molecular formula annotation strategy [C].

Molecular formula annotation compute dialog.
  • General settings [A]: In the Instrument field, you can choose Q-TOF, Orbitrap or FT-ICR. The choice of instrument affects only a few parameters, primarily the allowed mass deviation. If your instrument is not among these options, select Q-TOF as default.

    You can change the maximum allowed mass deviation, ensuring that SIRIUS only considers molecular formulas with mass deviations below the specified ppm threshold. For masses below 200 Da, the allowed mass deviation is $(200 \cdot \frac{ppm_{max}}{10^6})$.

    If SIRIUS predicts that the query spectrum may be a lipid, the molecular formula according to that prediction can be added and enforced (default).

  • Fallback Adducts [B]: You can specify fallback adducts that will be used for all features for which no adducts were detected during SIRIUS import or prior external annotation. With the enforce option, the selected adducts are considered for all features, in addition to the detected adducts. The “base ionization” of the detected adduct will be considered by default.
    Possible Adducts: Be aware that in the compute dialog for a single compound, the adduct selection is different. The detected adducts (or SIRIUS default adducts) are pre-selected. Select the adducts you want to be considered for computation and deselect the adducts you do not want to be considered.
    Molecular formula annotation compute dialog for a single compound

  • Molecular Formula Generation [C]: Selecting an appropriate molecular formula annotation strategy is crucial for a successful SIRIUS analysis, as it will significantly impact subsequent steps. Parameters for the different strategies are explained in the following. Before selecting a strategy, it is important to understand the differences between the molecular formula annotation strategies to choose the most appropriate one for your analysis.
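
The mass deviation rule from the general settings [A] can be written out as a short calculation. The Python sketch below assumes that, for masses of 200 Da and above, the allowed absolute deviation scales with the mass itself, which is the usual meaning of a ppm threshold. The function name is made up for this example.

```python
def allowed_mass_deviation(mass_da: float, ppm_max: float) -> float:
    """Allowed absolute mass deviation in Da for a given ppm threshold.

    Below 200 Da the deviation is computed as if the mass were 200 Da,
    so the absolute window does not shrink further for small masses.
    """
    reference_mass = max(mass_da, 200.0)
    return reference_mass * ppm_max / 1e6


if __name__ == "__main__":
    for mass in (100.0, 200.0, 500.0):
        print(mass, allowed_mass_deviation(mass, ppm_max=10.0))
```

With a threshold of 10 ppm, a mass of 100 Da is therefore given the same absolute window as a mass of 200 Da.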

De novo + bottom up search is recommended for generic applications. Learn more.

Settings for bottom up + de novo search.

You can configure the m/z threshold [1] below which de novo molecular formula annotation will be performed alongside the bottom up search.

The element filter [2] can be applied either solely to the de novo annotations or to the bottom-up search as well. Allowed elements specifies the elements in the element set and their upper and lower limits. Autodetect specifies the elements for which SIRIUS will automatically detect the presence, absence and quantity from the input data (requires MS1 spectra). The element set can be adjusted using the ... button [a]. Before making significant changes to the element set, please consider the potential impact on running time and result quality.

De novo only is recommended for discovering “unknown unknowns”. Learn more.

Settings for de novo only annotation.

Here, the expected element set needs to be well-defined using the Element filter settings [2]. Allowed elements and Autodetect elements can again be adjusted using the ... button [a], keeping in mind the impact on running time and result quality.

Database search is recommended for known compounds and extremely fast computation times. Learn more.

Settings for formula database search.

Choose the databases [1] to be used for molecular formula annotation. Per default, the biomolecule structure databases are selected. You can restore this default selection using the bio button.

Applying an element filter [2] is not mandatory for formula database search, but it can be used to narrow down molecular formula candidates. Allowed elements and Autodetect elements can again be adjusted using the ... button [a].

Bottom up search only can be used for a slight speed increase compared to the recommended combined approach. Learn more.

Settings for bottom up search only.

Also for bottom up search, applying an element filter [2] is not mandatory, but can be used to narrow down molecular formula candidates. Allowed elements and Autodetect elements can again be adjusted using the ... button [a].

Specifying a molecular formula (or list of molecular formulas) to run formula annotation works in single computation mode only.

Specify a (list of) molecular formula(s) to run formula annotation.

If you have imported a peak list file containing a molecular formula, you cannot override this formula using the compute dialog. To avoid this, select Ignore Formulas during import.

Advanced settings for molecular formula annotation

Advanced parameters for molecular formula annotation.
  • By default, molecular formula candidates whose theoretical isotope patterns deviate significantly from the measured isotope pattern are discarded. You can disable this setting [1] if you suspect poor quality isotope patterns in the input data.
  • If isotopic peaks are present in the input MS2 spectrum, they can either be used for scoring (SCORE) or be ignored (IGNORE) [2].
  • You can select the number of molecular formula candidates that will be saved [3].
  • Specify the minimum number of molecular formula candidates to store for each ionization state, even if they are not among the top n candidates [4].
  • Set a time limit (in seconds) for computing the fragmentation tree for a single molecular formula candidate [5]. Set to 0 to disable the limit.
  • Set a total time limit (in seconds) for computing the fragmentation trees for all molecular formula candidates of a feature [6]. Set to 0 to disable the limit.
  • For higher mass compounds, SIRIUS can compute fragmentation trees heuristically instead of exactly. This heuristic method can be used to pre-rank molecular formula candidates, with exact trees computed only for the top candidates. Set the m/z value above which this approach will be applied [7].
  • For very high masses, exact solutions may be impractical, and only heuristic trees should be computed. Set the m/z value above which trees will exclusively be computed using the heuristic [8].
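
The interplay of the two m/z thresholds [7] and [8] can be summarized as a simple decision rule. The following Python sketch is an illustration only: the function name, the parameter names and the returned labels are hypothetical, and the threshold values in the usage example are arbitrary numbers, not SIRIUS defaults.

```python
def tree_computation_mode(mz: float,
                          heuristic_ranking_above: float,
                          heuristic_only_above: float) -> str:
    """Pick how fragmentation trees are computed for a precursor m/z.

    heuristic_ranking_above corresponds to setting [7],
    heuristic_only_above corresponds to setting [8].
    """
    if heuristic_ranking_above > heuristic_only_above:
        raise ValueError("threshold [7] should not exceed threshold [8]")
    if mz > heuristic_only_above:
        return "heuristic_only"
    if mz > heuristic_ranking_above:
        # Heuristic trees pre-rank the candidates; exact trees are then
        # computed only for the top candidates.
        return "heuristic_preranking_then_exact"
    return "exact"


if __name__ == "__main__":
    for mz in (250.0, 500.0, 950.0):
        print(mz, tree_computation_mode(mz, 400.0, 800.0))
```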

Improve molecular formula ranking with ZODIAC

ZODIAC enhances de novo molecular formula annotation for complete biological datasets, that is, high-resolution, high-mass-accuracy LC-MS/MS runs. It refines the ranking of molecular formula candidates by analyzing similarities among features in the dataset, using fragmentation trees as input.

To use ZODIAC, select both SIRIUS and ZODIAC in the compute panel. This is only possible for batch computation. To increase the chance that the correct molecular formula candidate is in the result list, increase the number of reported candidates for SIRIUS.

!! ZODIAC should not be used for non-biological samples (like standard mixtures) !!

For more details, visit the ZODIAC release page.

Advanced settings for ZODIAC

These parameters are very advanced and require a thorough understanding of ZODIAC and its underlying Gibbs sampler.

Advanced parameters for ZODIAC molecular formula annotation.
  • Specify the maximum number of candidate molecular formulas considered for features with m/z lower than 300. [1]
  • Specify the maximum number of candidate molecular formulas considered for features with m/z higher than 800. [2]
  • Enable or disable the 2-step approach, where higher quality features are processed first, followed by lower quality features. [3]
  • Set the threshold for the ratio of edges of the complete network to be ignored. [4]
  • Specify the minimum number of connections required for each candidate. [5]

Predicting molecular fingerprints with CSI:FingerID and compound classes with CANOPUS

After computing the fragmentation trees, you can predict molecular fingerprints and CANOPUS compound classes. These predictions can then be used to search structure databases and/or to predict novel structures with MSNovelist. If Score threshold is selected, fingerprints are only predicted for the top-scoring fragmentation trees (molecular formulas). This is recommended and should only be disabled if you need to examine the fingerprint of a lower-scoring molecular formula.

CANOPUS predicts ClassyFire compound classes from the molecular fingerprint, without using any structure database. Classes are predicted for all features whose fragmentation tree contains at least three fragments, including features with no matching structure candidates in the database. There are no parameters to set. Compound classes are predicted separately for each molecular formula.

For more details, visit the CANOPUS release page.

Identifying the molecular structure with CSI:FingerID

CSI:FingerID facilitates the identification of molecular structures by matching predicted molecular fingerprints against database structures. SIRIUS ships with a wide range of built-in databases. Additionally, users can enhance the search capabilities by adding their own structures as a “custom database” (see Import of custom structure and spectra databases), which can then be searched alongside the existing databases. If you have imported your own spectral library that should be considered for structure database search, select these libraries (databases) in the structure database search step [2].

Parameters for structure database search.

Structure database search is conducted within the user-selected databases [2]. You can choose to use PubChem as a fallback database [1] in case it contains a hit with higher confidence than those found in the selected databases. For a detailed explanation, please refer to the Methods section. The Confidence mode controls whether the approximate or exact confidence mode is used to assess if a hit in PubChem is more reliable than the hits in the selected databases.

Per default, the structure databases constituting the “bio” database are selected [2]. De-select PubChem as fallback to fully search in PubChem. If database search was used for molecular formula identification, the same databases are selected here. You can revert to the default selection by clicking bio. If custom databases were loaded, they can be selected here as well.

COSMIC - confidence values for CSI:FingerID searches

COSMIC confidence scores are calculated automatically and without requiring any parameters every time a CSI:FingerID search is performed. COSMIC scores are displayed in the feature list on the left. Starting with SIRIUS 6, confidence scores are computed in both exact and approximate mode.

For more details, see the COSMIC section or visit the COSMIC release page.

Generating de novo structure candidates with MSNovelist

In some cases, it is necessary to go beyond the limits of structure database search. To address this, SIRIUS 6 introduces de novo generation of candidate structures through MSNovelist, in addition to predicting molecular fingerprints and compound classes, as well as searching in custom databases.

Be aware that the likelihood of any de novo generation method performing well for truly novel compounds is very low. Therefore, the results from MSNovelist should rather be considered a suggestion or starting point for semi-manual analysis of compounds that cannot be elucidated otherwise.

MSNovelist will significantly slow down your SIRIUS workflow; use with caution.

Visualization of the results

The feature list not only provides information about the input and compute state but also displays the COSMIC confidence score for the top CSI:FingerID hit.

For each feature, different result views can be displayed by switching between the tabs in the result panel:

  • The LC-MS view displays the chromatographic feature alignment as well as a quality assessment of the spectra. This tab is only in use for mzML and mzXML inputs.
  • The Formulas view displays the results from the molecular formula identification.
  • The Predicted Fingerprints view contains information about the molecular properties of the molecular fingerprint predicted by CSI:FingerID.
  • The Compound Classes view shows the ClassyFire classes predicted by CANOPUS.
  • The Structures view displays results from the CSI:FingerID structure search.
  • The De Novo Structures view displays MSNovelist-generated structure candidates for the current query.
  • The Substructure Annotations view shows possible substructures connected to the peaks of the MS/MS spectrum for each candidate.
  • The Library Matches view shows matches to reference spectra if a spectral library was imported.

In all views, the top CSI:FingerID hit (as well as “highly similar” compounds in approximate confidence mode) is highlighted in green.

LC-MS view

The LC-MS tab is hidden when no LC-MS data (.mzML or .mzXML) was imported, i.e. if the data is imported as .mgf, .ms or similar file formats, no LC-MS information is available. This is also the case when LC-MS data has been processed with OpenMS or MZMine and then imported to SIRIUS.

LC-MS view displaying the feature alignment and quality assessment.

The LC-MS view displays the ion chromatogram of a feature. You can choose whether to show the feature alignment or the adduct/isotope assignment [1]. Retention times are always given in minutes.

For the feature alignment, the mass traces of all runs aligned for this feature are displayed in different colors [2]. The thick black line is the merged mass trace [3]. The bold grey box highlights the selected feature [4]. Other nearby features are marked with thin grey boxes and features with low quality are marked with dashed boxes [5]. The circles indicate where the MS2 spectrum was measured. You can click on a box to zoom into the feature. A grey dashed line marks the noise level; its exact computation may vary from version to version, but it is related to the median intensity of all peaks in the MS scan.

Zoom into feature alignment.

In the adduct/isotope assignment view, you can find the merged mass trace of the feature (bold black line) as well as its isotopes (dashed black lines) and correlated adducts (grey lines) with their isotopes (dashed grey lines).

On the right, there is a basic quality assessment panel [6]. It can be used to get a first impression of the overall quality of the MS and MS/MS data of the feature.

Formulas view

The Formulas tab displays the molecular formula candidate list [1], the mass spectra [2] and the fragmentation tree [3] of the selected feature.

Formulas view.
  • Molecular formula candidate list [1]: The candidates are ranked by the SIRIUS score. The molecular formula of the best candidate structure found by CSI:FingerID is highlighted in green (and does not necessarily have to be the candidate with the best SIRIUS score). Per default, the candidates are ordered by SIRIUS score but can be sorted by any other column.

    The Sirius Score is a combination of the score from the isotope pattern analysis (Isotope Score) of the MS1 data and the fragmentation tree score (Tree Score) from the MS2 data. It is calculated by summing both scores and then converting them into probabilities using the softmax function. These probabilities sum to one. While a higher posterior probability for the top hit might suggest that this molecular formula is more likely to be correct, it is important to note that a posterior probability of 90% does not mean there is a 90% chance that the molecular formula identification is correct! The displayed probabilities are neither q-values nor Posterior Error Probabilities. The bars visually represent this value, with a full bar indicating the highest score in the column. The SIRIUS score is the primary score that users should focus on.

    The Isotope Score and Tree Score themselves are log-transformed posterior probabilities. The bars range from the lowest score in the column (empty) to the highest score in the column (full).

    The Zodiac Score is also a probability, and here too, the bars directly represent this value.

  • Mass spectra [2]: In this panel you can switch between the MS1 spectrum, the MS1 isotope pattern mirror plot or the MS2 spectra [a]. For MS2 spectra, per default a merged spectrum is displayed, but you can also choose to view individual spectra [b]. To zoom, hold the right mouse button and drag to select an area, or scroll while hovering over an axis. In the MS2 view, peaks annotated by the fragmentation tree are highlighted in green, while those identified as noise are colored black. Hovering over a peak will display its detailed annotation [c], and clicking on a green peak will highlight the corresponding node in the fragmentation tree. Spectra views can be exported using the top right export button [d].

  • Fragmentation tree [3]: In the computed fragmentation tree, each node assigns a molecular formula to a peak in the (merged) MS2 spectrum. Each edge is a hypothetical fragmentation reaction. You can customize the tree’s appearance by selecting different node styles and color schemes [e].

    The tree can be exported [f] as SVG or PDF vector graphics. Alternatively, the DOT file format provides a text-based description of the tree, which can be rendered externally using tools like Graphviz to convert DOT files into image formats such as PDF, SVG, or PNG. The JSON format offers a machine-readable representation of the tree. For instructions on exporting fragmentation trees from the command line, see the Fragmentation tree export tool.
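
To illustrate how the SIRIUS score described for the candidate list [1] relates to the Isotope Score and Tree Score, the following Python sketch sums both scores per candidate and applies the softmax function across all candidates of a feature. The function name, the input layout and the example values are made up for this illustration. The actual SIRIUS implementation may differ in detail.

```python
import math


def sirius_score_probabilities(candidates):
    """Convert per-candidate scores into probabilities that sum to one.

    candidates maps a molecular formula to a tuple
    (isotope_score, tree_score), both on a logarithmic scale.
    """
    combined = {formula: iso + tree
                for formula, (iso, tree) in candidates.items()}
    # Subtracting the maximum keeps exp() numerically stable and does
    # not change the softmax result.
    top = max(combined.values())
    weights = {formula: math.exp(score - top)
               for formula, score in combined.items()}
    total = sum(weights.values())
    return {formula: weight / total for formula, weight in weights.items()}


if __name__ == "__main__":
    example = {
        "C6H12O6": (5.0, 20.0),   # made-up scores
        "C7H16O5": (3.0, 19.0),
        "C5H8N4O3": (1.0, 15.0),
    }
    for formula, p in sirius_score_probabilities(example).items():
        print(f"{formula}: {p:.3f}")
```

Because the softmax is taken over the candidates of a single feature, the resulting values depend on which candidates are in the list, which is one reason they should not be read as the probability that an identification is correct.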

Predicted Fingerprints view

Even if CSI:FingerID does not identify the correct structure — in particular if the correct structure is not present in any structure database — you can still get valuable information about the structure by examining the predicted fingerprint. The Predicted Fingerprints tab displays a list of all molecular properties that make up the predicted fingerprint. When you select a molecular property, examples related to that property are shown below the list.

Predicted fingerprints view.

Compound Classes view

Compound classes view displaying the CANOPUS results.

The Compound Classes tab visualizes the CANOPUS compound class predictions in a table format [4]. Each row in the table represents one compound class. The Posterior Probability (number and bar) indicates the likelihood that the measured spectrum, given the chosen molecular formula, belongs to that class. Additional columns provide related information from the ClassyFire ontology, e.g. the respective parent class.

Above the table are three lists: Main Classes, Alternative Classes and Natural Product Classes.

  • Main Classes [1]: The main class of a feature is the most specific compound class with the highest priority from all compound classes with posterior probability above 50% (in green), along with its ancestor classes (in blue) in the ClassyFire ontology.
  • Alternative classes [2]: This list contains all other classes with posterior probability above 50%. In the ClassyFire chemontology, each compound is assigned to multiple classes.
  • Natural Product Classes [3]: Starting from version 5, SIRIUS also predicts Natural Product classes.
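
The 50% cutoff used for the Main Classes and Alternative Classes lists can be illustrated with a short Python sketch. It only shows the thresholding step: selecting the main class additionally depends on class specificity and priority in the ClassyFire ontology, which is not modeled here. The function name, class names and probabilities are made up for this example.

```python
def classes_above_threshold(class_probabilities, threshold=0.5):
    """Return (class name, probability) pairs with probability above the
    threshold, sorted by decreasing probability."""
    confident = [(name, p) for name, p in class_probabilities.items()
                 if p > threshold]
    return sorted(confident, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    predictions = {   # made-up posterior probabilities
        "Flavonoids": 0.93,
        "Phenols": 0.61,
        "Lipids": 0.12,
    }
    for name, p in classes_above_threshold(predictions):
        print(f"{name}: {p:.2f}")
```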

Structures view

Structures view.

The Structures tab displays candidate structures for the selected molecular formula, ranked by the CSI:FingerID search score. The highest-scoring candidate is highlighted in green [1]. If you have enabled approximate confidence mode, all candidates within an MCES distance of 2 will also be highlighted (see Expansive search). To filter the candidate list by a specific database (e.g., only compounds from KEGG and BioCyc), click the filter button [2] in the top left corner. A menu [3] will open, displaying all available databases. Only candidates from the selected databases will be shown. The databases that have been used for the structure database search are highlighted in blue.

The green and pink squares are a visualization of the CSI:FingerID predictions and scoring [4].

  • Green squares represent molecular substructures present in the candidate structure that are predicted by CSI:FingerID to be present in the measured feature. The intensity of the color indicates the predicted probability, and the size of the square reflects the reliability of the predictor.
  • Pink squares represent substructures that are predicted to be absent but are, nevertheless, found in the candidate structure. The intensity of the color reflects the probability that the structure should be absent.

Fingerprint box representation

Overall, a correct prediction is typically characterized by many large, intense green squares and as few large, intense pink squares as possible.
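
The color logic of the squares can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: a molecular property counts as "predicted present" when its predicted probability is at least 0.5, only properties found in the candidate structure produce a square, and the size of the square (predictor reliability) is not modeled. The function name and the threshold are hypothetical.

```python
def square_for_property(predicted_probability, present_in_candidate,
                        threshold=0.5):
    """Return (color, intensity) for one molecular property, or None.

    The threshold of 0.5 is an assumption of this sketch.
    """
    if not present_in_candidate:
        # Only substructures found in the candidate are shown as squares.
        return None
    if predicted_probability >= threshold:
        # Predicted present and found in the candidate: agreement.
        return ("green", predicted_probability)
    # Predicted absent but found in the candidate: disagreement. The
    # intensity is the probability that the property should be absent.
    return ("pink", 1.0 - predicted_probability)


if __name__ == "__main__":
    print(square_for_property(0.95, True))
    print(square_for_property(0.10, True))
    print(square_for_property(0.95, False))
```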

Hovering over a square displays the description of the molecular substructure (usually a SMARTS expression) [5]. Clicking on a square highlights the corresponding atoms in the molecule [6]. If the substructure appears multiple times, the first appearance is highlighted in dark blue, while the other matches are highlighted in translucent blue.

To filter the candidate list for structures that contain the selected substructure, use the other filter button in the top left corner [7]. You can also filter by SMARTS pattern [8] or using the full-text search [9].

Right-clicking on a proposed structure opens a context menu [10], allowing you to:

  • Copy the InChI or InChI Key to your clipboard
  • Open the compound in PubChem
  • Open the compound in all databases
  • Highlight matching substructures
  • Show the annotated spectrum in the Substructure Annotations tab

Highlight matching substructures: When you choose Highlight matching substructures from the context menu, substructures in all structure candidates will be color-coded as follows:

  • green: substructures that are supported by fingerprint evidence.
  • pink: substructures that contradict fingerprint evidence.
  • yellow: substructures with mixed support, where both agreeing and disagreeing fingerprint evidence is present.
  • no color: substructures that are not clearly covered by fingerprint evidence.

Highlighting of matching substructures.

If the structure is contained in any database, a label with the name of this database is displayed below the structure [11]. Database labels have different colors:

  • black on light blue: This database is linked. Clicking on the label opens the database entry in your browser [a].
  • grey on light blue: This database is not linked.
  • white on dark blue: This is a lipid database [b] or “El Gordo” lipid class annotation (see below) [d].
  • black on yellow: This is a custom database loaded by the user [c].
  • grey on yellow: This is a result from a custom database that has not been loaded. This can happen, for example, when loading results that were computed with a custom database that is not available on the current system.
  • pink: This structure has been generated by MSNovelist. This label is only used in the De Novo Structures view.

Database labels

This view also includes the “El Gordo” lipid class annotation [d]. If the feature has been identified as a lipid by “El Gordo”, a blue notification displaying the classification will appear below the top ribbon. If the candidate structure is also a lipid, the classification will be added to the database labels. Be aware that lipid structures are often highly similar, differing only in the position of double bonds. These subtle differences can be indistinguishable using mass spectrometry alone, which is why the overarching lipid class is displayed.

Lipids

If the PubChem fallback was activated as part of Expansive search, a pink notification will be displayed below the top ribbon [e].

Finally, if a structure candidate has a reference spectrum imported via a custom database, the spectral match will be displayed. Clicking on this match will take you to the Library Matches tab.

Reference spectrum

De Novo Structures view

De novo structure annotation view.

The De Novo Structures tab displays the de novo structure generation results produced by MSNovelist, with each generated structure tagged with a de novo label. If a generated structure is also found in the structure databases, the corresponding labels will be added alongside the de novo label. For instance, in the example screenshot above, structures 1-4 were generated by MSNovelist but also exist in the structure databases, whereas structure 5 was only generated de novo by MSNovelist.

By default, structure candidates from the structure databases that have NOT been generated by MSNovelist are also displayed. You can hide those structures by toggling the left button in the top right corner.

Substructure Annotation view

The Substructure Annotations tab visualizes the direct connection between the input MS/MS spectrum and the CSI:FingerID structure candidates. The table at the top [1] displays all structure candidates for a given query (as listed in the Structures tab), and all structure candidates generated by MSNovelist. You can hide/unhide MSNovelist-generated structures by toggling the left button in the top right corner [2]. When you select a structure from the list, the lower part of the view shows the fragmentation spectrum on the left [3] and the selected structure candidate on the right [4].

Substructure annotation view.

Peaks in the fragmentation spectrum are color-coded as follows:

  • Black peaks: These peaks are not used to explain the molecular formula of the candidate and are not part of the fragmentation tree (similar to the spectrum in the Formulas view). Typically, these peaks are considered noise or are not explainable by the precursor ion’s molecular formula.
  • Blue peaks: These peaks are part of the fragmentation tree and thus explain the molecular formula of the candidate. However, they do not have a substructure associated with them.
  • Purple peaks: These peaks explain the molecular formula of the candidate AND are associated with a substructure of the candidate structure. Substructures are generated combinatorially and then scored against the peaks. Clicking on a peak highlights the highest scoring substructure for this peak within the structure. Pink bonds indicate the fragmentation that would have occurred to generate this fragment.

You can navigate through the peaks using left-click or the arrow keys.
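
The peak color coding described above depends on two properties of a peak: whether it is part of the fragmentation tree, and whether a substructure has been assigned to it. A minimal sketch of this rule follows; the `Peak` class and its field names are hypothetical and do not correspond to any SIRIUS data structure.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    mz: float
    in_fragmentation_tree: bool     # peak explains the candidate's molecular formula
    has_substructure: bool = False  # a substructure was assigned to this peak


def peak_color(peak: Peak) -> str:
    """Return the display color for a peak following the rule described in the text."""
    if not peak.in_fragmentation_tree:
        return "black"   # noise or not explainable by the precursor formula
    if peak.has_substructure:
        return "purple"  # explained and linked to a substructure
    return "blue"        # explained, but no substructure assigned
```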

Library Matches view

Activate spectral library results tab

SIRIUS automatically searches your spectral libraries as part of the molecular formula annotation step. Library hits are integrated into the Structures tab so that structure database hits and spectral library hits can be compared seamlessly.

To additionally activate the Library Matches tab, go to Settings and check Show "Library Matches" tab. You will get a warning dialog explaining the spectral library search settings:

  • A spectral library is also a molecular structure database. ANY hit in the spectral library can also be found via CSI:FingerID structure database search.
  • Since structure database results depend on the selected molecular formula, SIRIUS ensures that molecular structures with a formula corresponding to a good spectral library hit are considered - even if this molecular formula receives a low score, i.e. molecular structures of well-matching reference spectra are automatically included in the structure database search.
  • Structure database search is only performed on databases selected by the user. To ensure that all your spectral libraries are considered by CSI:FingerID, select these libraries (databases) in the structure database search step.

Spectral library matches tab.

The Library Matches tab displays the spectral library matches for the measured query spectrum against a reference library. If you have multiple MS2 spectra (with different collision energies) for a feature, the best matching spectrum is shown by default. The Similarity Score and the number of Shared Peaks in the list are given for this spectrum [1]. You can switch to other MS2 spectra to examine their mirror plots as well [2]. Additional metadata for the spectral hit is given on the right [3].
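
To give an intuition for what a similarity score and a shared peak count measure, the following sketch computes a plain cosine similarity between two peak lists. This is an illustration under stated assumptions, not the scoring used by SIRIUS: the greedy peak matching, the absolute tolerance `tol` (in Da), and all function names are chosen for this example only.

```python
import math

Peak = tuple[float, float]  # (m/z, intensity)


def match_peaks(query: list[Peak], reference: list[Peak], tol: float) -> list[tuple[int, int]]:
    """Greedily pair each query peak with the closest unused reference peak within tol."""
    pairs = []
    used = set()
    for qi, (q_mz, _) in enumerate(query):
        best = None  # (reference index, m/z difference)
        for ri, (r_mz, _) in enumerate(reference):
            if ri in used:
                continue
            diff = abs(q_mz - r_mz)
            if diff <= tol and (best is None or diff < best[1]):
                best = (ri, diff)
        if best is not None:
            used.add(best[0])
            pairs.append((qi, best[0]))
    return pairs


def cosine_and_shared(query: list[Peak], reference: list[Peak], tol: float = 0.01) -> tuple[float, int]:
    """Return (cosine similarity, number of shared peaks) for two peak lists."""
    pairs = match_peaks(query, reference, tol)
    dot = sum(query[qi][1] * reference[ri][1] for qi, ri in pairs)
    q_norm = math.sqrt(sum(i * i for _, i in query))
    r_norm = math.sqrt(sum(i * i for _, i in reference))
    if q_norm == 0 or r_norm == 0:
        return 0.0, 0
    return dot / (q_norm * r_norm), len(pairs)
```

Two identical spectra give a score close to 1 with all peaks shared, while spectra without any peaks within the tolerance give a score of 0 and no shared peaks.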

To zoom into the spectrum, hold the right mouse button and drag to select an area, or scroll while hovering over an axis.

For more information on spectral library searches in SIRIUS, please refer to the sections on spectral library matching and Import of Custom Structure and Spectra Databases.

Data export (Summaries and FBMN export)

Analysis results can be exported using the Summaries button in the top left toolbar.

Summary files include five types of data:

By default, only the top hit is exported for each feature (Top Hits (recommended)). You can use the drop down menu to export All Hits, Top k Hits, or Top Hits with Adducts instead. Learn more about the different export options here.

You can export the files in TSV, CSV, ZIP or XLSX format. You might want to use Quote strings to quote all string values.
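
The effect of quoting string values can be illustrated with Python's standard `csv` module. This sketch only demonstrates the general idea of string quoting in delimited text files; it does not reproduce the SIRIUS exporter, and the function name and example values are hypothetical.

```python
import csv
import io


def write_table(rows, delimiter=",", quote_strings=False):
    """Serialize rows to delimited text, optionally quoting every non-numeric value."""
    buffer = io.StringIO()
    # QUOTE_NONNUMERIC quotes all fields that are not int/float;
    # QUOTE_MINIMAL quotes only fields that would otherwise be ambiguous.
    quoting = csv.QUOTE_NONNUMERIC if quote_strings else csv.QUOTE_MINIMAL
    writer = csv.writer(buffer, delimiter=delimiter, quoting=quoting, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical example row: feature index, molecular formula, adduct, score
row = [1, "C6H12O6", "[M+H]+", 0.95]
print(write_table([row], delimiter="\t", quote_strings=False), end="")
print(write_table([row], delimiter="\t", quote_strings=True), end="")
```

With quoting enabled, string values are wrapped in double quotes while numbers are written as-is, which can make it easier for downstream tools to tell text columns from numeric ones.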

In addition, you can export a Feature quality summary, with feature quality values of different categories for all features, as well as a ChemVista summary file, which can be imported directly into the Agilent ChemVista software.

Feature based molecular networking (FBMN) export

TODO

Settings

You can access the settings dialog by clicking the Settings button at the top right of the user interface.

General settings:

  • UI Theme: Choose your preferred display mode to reduce eye strain (requires restart).
  • Scaling Factor: Adjust the size of the GUI by the selected factor (requires restart).
  • Confidence score display mode: Sets the mode for displaying the confidence score (either approximate or exact).
  • Allowed solvers: Select the ILP solver for SIRIUS to use in fragmentation tree computation. GLPK is free, while Gurobi is commercial but offers a free academic license.
  • Database cache: Specifies the location of the cache directory. CSI:FingerID downloads candidate structures from our server and caches them for faster retrieval.
  • REST API: Use the button to open the API in your browser. The REST API provides the full functionality of SIRIUS and its web services as a background service. It is intended as an entry point for scripting languages and software integration SDKs.

Adduct settings: Add or remove custom adducts for positive and negative ion modes.

Network settings: SIRIUS supports using a proxy server to access our webservices. To enable it, change Use Proxy Server from NONE to SIRIUS and enter all required information. Your configuration will be tested when you click the save button.

Webservice

The webservice connection check dialog, accessible via the Webservice button in the top right, helps diagnose any connection issues.

Green checkmarks or red crosses indicate whether you are successfully connected to the internet, login server, license server, and web service. It also provides information about your account’s subscription status, showing whether a valid subscription is linked to the account you’re currently logged into. If there are any connection or licensing problems, detailed descriptions and potential solutions will be provided in the description box.