Welcome to the official online documentation for SIRIUS – Java-based software for the analysis of small molecules from tandem mass spectrometry data.

Quick Start: To get started quickly, see the quick-start guide.

Licenses: As of SIRIUS 5, a user account and license are required to use the webservice-based features of SIRIUS. Find out more about academic and non-academic user accounts.

Tutorials: Check our YouTube Playlist to find tutorials and learning resources. Whether you are new to SIRIUS or looking to expand your knowledge, this playlist has you covered.

Community Forum: For community support and discussions, join our SIRIUS Community space on Gitter. Here, you can connect with other users and get help from the community. Once you have entered the Community, you can join different rooms, e.g. Troubleshooting, to get assistance from other SIRIUS users.

Bug Reports and Feature Requests: If you encounter any problems or have suggestions for new features, please submit them via GitHub. Please make sure to include all required information.

SIRIUS Integration: SIRIUS can be easily integrated into existing workflows and provides interfaces for both manual and fully automated analysis. If you’re looking for help with integrating SIRIUS into your workflow or want to share tips and code snippets, visit our SIRIUS Community space on Gitter.

Help us improve the SIRIUS Documentation!

SIRIUS introduction

The primary focus of SIRIUS is the structure elucidation of novel molecules (drug leads, leachables or contaminants in synthesis or food, new psychoactive substances, PFAS, transformation products, …), but it is also well equipped to handle more standard tasks such as dereplication of known structures.

It combines the analysis of isotope patterns in MS spectra with the analysis of fragmentation patterns in MS/MS spectra, and uses CSI:FingerID as a web service for searching molecular structure databases. It also integrates CANOPUS for de novo compound class prediction and MSNovelist for de novo structure generation.

SIRIUS is built for the analysis of small molecules, usually below 1000 Da, from all compound classes except polymers (e.g. peptides).

SIRIUS requires high mass accuracy data. The mass deviation of your MS and MS/MS spectra should be within 20 ppm. Mass spectrometry instruments such as TOF, Orbitrap and FT-ICR typically provide high mass accuracy data, as do coupled instruments such as Q-TOF, IT-TOF or IT-Orbitrap. Spectra measured with a quadrupole or linear trap do not provide the high mass accuracy that is required for our method. See Mass deviations for a detailed explanation of what “mass accuracy” means in SIRIUS.
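
To make the 20 ppm requirement concrete, the following minimal sketch computes the relative mass deviation in ppm between a measured and a theoretical m/z value. The function names and the example values are illustrative assumptions and not part of SIRIUS; see Mass deviations for how SIRIUS itself treats mass accuracy.

```python
def ppm_deviation(measured_mz: float, theoretical_mz: float) -> float:
    """Signed relative mass deviation in parts per million (ppm)."""
    if theoretical_mz <= 0:
        raise ValueError("theoretical_mz must be positive")
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     max_ppm: float = 20.0) -> bool:
    """True if the absolute deviation does not exceed max_ppm.

    The default of 20 ppm mirrors the limit named in the text above.
    """
    return abs(ppm_deviation(measured_mz, theoretical_mz)) <= max_ppm


# Hypothetical example values: a 2 mDa error at m/z 200 is about 10 ppm,
# whereas a 10 mDa error at the same m/z is about 50 ppm.
print(within_tolerance(200.0020, 200.0))
print(within_tolerance(200.0100, 200.0))
```

Because ppm is a relative measure, the same absolute error in Da corresponds to a larger ppm deviation at lower m/z.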

SIRIUS expects MS and MS/MS spectra as input. It is possible to omit the MS data, but this will make the analysis more time-consuming and may give you poorer results. In this case, you should consider restricting the candidate molecular formulas to those found in PubChem.

SIRIUS is designed to work off-the-shelf only for Data Dependent Acquisition (DDA) data with no or very few chimeric spectra. SIRIUS has also been applied to Data Independent Acquisition (DIA) data, but our methods were not specifically developed for DIA and may perform less effectively in that context. For DIA data, spectra must first be deconvoluted using other software (e.g., vendor-specific tools or MS-DIAL) before being imported into SIRIUS for analysis.

SIRIUS expects processed peak lists (centroided spectra). It does not provide routines for peak picking from profile spectra. This is a deliberate design choice: We want you to use the best peak picking software available, or alternatively your favorite software. There are several tools that specialise in this task, such as OpenMS, MZmine or XCMS. See our video tutorials on how to preprocess your data for SIRIUS with OpenMS or MZmine.

However, SIRIUS also provides a zero-parameter preprocessing tool that imports LC-MS runs directly from .mzML (or .mzXML) format to help you get started quickly. Most modern MS vendor instruments are able to export measured data from their native format to .mzML. Alternatively, watch this video tutorial on how to use MSConvert/ProteoWizard to convert your vendor formats to mzML for SIRIUS.

SIRIUS identifies the molecular formula of the measured precursor ion, and annotates the spectrum by providing a molecular formula for each fragment peak. Peaks are assumed to be noise peaks if they are not annotated. Furthermore, a fragmentation tree is predicted that contains the predicted fragmentation reactions leading to the fragment peaks.
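
Assigning a molecular formula to a peak boils down to finding element combinations whose monoisotopic mass matches the measured mass within the allowed deviation. The sketch below is a deliberately naive brute-force enumeration over C, H, N and O for a neutral mass. It is not the algorithm used by SIRIUS, it ignores ion adducts and chemical plausibility filters, and the function name, element bounds and tolerance are assumptions chosen for illustration.

```python
from itertools import product

# Monoisotopic masses in Da of the most abundant isotopes.
MASS_C = 12.0
MASS_H = 1.007825032
MASS_N = 14.003074004
MASS_O = 15.994914620


def decompose(mass, max_ppm=5.0, max_c=20, max_h=40, max_n=5, max_o=10):
    """Return all (C, H, N, O) count tuples whose monoisotopic mass lies
    within max_ppm of the given neutral mass.

    All bounds and the default tolerance are hypothetical illustration values.
    """
    tolerance = mass * max_ppm / 1e6
    hits = []
    for c, n, o in product(range(max_c + 1), range(max_n + 1), range(max_o + 1)):
        heavy = c * MASS_C + n * MASS_N + o * MASS_O
        # For fixed C, N and O counts, only the nearest integer hydrogen count
        # can match, because the tolerance is far smaller than one hydrogen mass.
        h = round((mass - heavy) / MASS_H)
        if 0 <= h <= max_h and abs(heavy + h * MASS_H - mass) <= tolerance:
            hits.append((c, h, n, o))
    return hits


# Neutral monoisotopic mass of glucose (C6H12O6), rounded to five decimals.
candidates = decompose(180.06339)
print((6, 12, 0, 6) in candidates)
```

Even this toy version shows why high mass accuracy matters: the wider the tolerance, the more candidate formulas match a single mass.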

ZODIAC improves the ranking of the formula candidates provided by SIRIUS. It re-ranks the candidates by taking into account joint fragments and losses between fragmentation trees of different compounds in a data set.

CSI:FingerID identifies the structure of a compound by searching in a molecular structure database. Here and in the following, “structure” refers to the identity and connectivity (with bond multiplicities) of the atoms, but not to stereochemistry information. Elucidation of stereochemistry is currently beyond the power of automated search engines.

The COSMIC confidence score assigns a confidence to CSI:FingerID structure identifications. The idea is similar to False Discovery Rates: It allows you to run CSI:FingerID in high throughput on thousands of compounds and select the most confident identifications. The workflow of generating a structure database, searching with CSI:FingerID and ranking hits by confidence score is called the COSMIC workflow. Simplify your data interpretation workflow by first identifying the most confident compounds in your sample and then using them to generate knowledge or hypotheses.
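
The selection step described above can be pictured as a simple filter-and-sort over per-compound confidence values. The sketch below assumes that the top structure hits have already been exported into plain Python records. The record layout, the assumption that confidences lie between 0 and 1, the threshold and the example values are all hypothetical placeholders, not SIRIUS output or an official recommendation.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for the top structure hit of one compound."""
    compound_id: str
    confidence: float  # assumed to lie between 0 and 1, higher is better


def most_confident(hits: List[Hit], min_confidence: float) -> List[Hit]:
    """Keep hits at or above the threshold, most confident first."""
    kept = [hit for hit in hits if hit.confidence >= min_confidence]
    return sorted(kept, key=lambda hit: hit.confidence, reverse=True)


# Made-up example values for illustration only.
hits = [Hit("feature_1", 0.12), Hit("feature_2", 0.91), Hit("feature_3", 0.67)]
selected = most_confident(hits, min_confidence=0.5)
print([hit.compound_id for hit in selected])  # ['feature_2', 'feature_3']
```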

CANOPUS predicts compound classes from the molecular fingerprint predicted by CSI:FingerID without any database searching. It therefore provides structural information for compounds for which neither spectral nor structural reference data are available.

MSNovelist generates de novo structure candidates to overcome the limitations of structure database searching. Structures are generated based on molecular formula and fingerprint.

SIRIUS comes with a Graphical User Interface (GUI), a Command Line Interface (CLI) and an Application Programming Interface (API) that comes with a client in Python. All these interfaces share the same persistence layer, allowing for high-throughput computation using e.g. the CLI on a compute cluster and then manual inspection of selected results using the GUI.

Literature

The scientific development behind SIRIUS, ZODIAC, CSI:FingerID, CANOPUS, and MSNovelist started in 2005 and has required over 50 person-years (and counting) of PhD students, post-docs and principal investigators. And we’re not even talking about the development of the shiny graphical user interface introduced in version 3.1. But it is not the GUI or software development that does the work here; it is our scientific research that has made SIRIUS, ZODIAC, CSI:FingerID, CANOPUS, and MSNovelist possible. It goes without saying that 20 years of work cannot be described in a single paper.

Please cite all papers that you feel are relevant to your work. Please do not cite this manual or the SIRIUS or CSI:FingerID website, but our scientific papers.

SIRIUS 4

CSI:FingerID – searching in molecular structure databases

COSMIC confidence score

CANOPUS – compound class prediction

MSNovelist – de novo structure generation

ZODIAC – molecular formula annotation

Fragmentation tree computation

Isotope pattern analysis

Passatutto – Fragmentation tree based decoy spectra

Auto-detection of elements

Mass decomposition