Spectrum Library Search & Prediction of Functional Groups

This page lets you search the GMD for metabolites using user-submitted GC-MS spectra consisting of a retention index (based on n-alkanes, if available) and mass intensity ratios. In addition, a functional group prediction helps to characterise metabolites for which the GMD does not yet include a reference mass spectrum: the unknown metabolite is characterised by the predicted presence or absence of functional groups. For power users, the functionality presented here is also exposed as SOAP-based web services.
We kindly ask users to cite the following paper when publishing results derived from this service:
Hummel, J., Strehmel, N., Selbig, J., Walther, D. and Kopka, J. (2010) Decision tree supported substructure prediction of metabolites from GC-MS profiles, Metabolomics. http://dx.doi.org/10.1007/s11306-010-0198-7


Thresholds influencing the library search

Retention index window. This value is used for the library search only. A larger window increases the number of matches, but the identification becomes less reliable because spectra may match falsely without RI consensus. For performance reasons, the maximal number of hits returned from the database is limited.
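The effect of the window size can be illustrated with a minimal Python sketch. It assumes that the window is the maximal absolute RI deviation between query and library entry, and that the hit limit is applied after sorting by RI deviation; neither detail is stated on this page. The names `ri_candidates`, `window` and `max_hits` are hypothetical and not part of the GMD interface.

```python
def ri_candidates(query_ri, library, window, max_hits=None):
    """Return library entries whose RI lies within +/- window of the query RI.

    Assumption: `window` is the maximal absolute deviation; the GMD may
    define the window differently.
    """
    hits = [entry for entry in library if abs(entry["ri"] - query_ri) <= window]
    # Closest retention index first, so that a hit limit keeps the best candidates.
    hits.sort(key=lambda entry: abs(entry["ri"] - query_ri))
    return hits if max_hits is None else hits[:max_hits]


# Hypothetical library entries for illustration only.
library = [
    {"name": "A", "ri": 1300.0},
    {"name": "B", "ri": 1304.5},
    {"name": "C", "ri": 1350.0},
]

narrow = ri_candidates(1302.0, library, window=2.0)
wide = ri_candidates(1302.0, library, window=50.0)
capped = ri_candidates(1302.0, library, window=50.0, max_hits=2)
```

With the narrow window only entry A survives; the wide window also admits B and C, whose spectra may then match without RI support.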

The five thresholds given below filter the library search hits with regard to the respective distance measure. All returned matches are filtered with regard to the most restrictive threshold.

1-DotProduct distance threshold. This value ranges from 0 - perfect match to 1 - mismatch.

Euclidean distance threshold. This value ranges from 0 - perfect match to 1 - mismatch.

Hamming distance threshold. This value ranges from 0 - perfect match to higher values indicating a mismatch.

Jaccard distance threshold. This value ranges from 0 - perfect match to 1 - mismatch.

S12GowLeg distance threshold. This value is derived from the S12 coefficient of Gower & Legendre and ranges from 0 - perfect match to 1 - mismatch.
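To make the value ranges above concrete, the following Python sketch computes four of the five distances for spectra given as dictionaries mapping m/z to a positive intensity. It rests on several assumptions that this page does not confirm: the dot product and Euclidean distance are taken on unit-length intensity vectors, the Euclidean distance is divided by sqrt(2) to keep it between 0 and 1, and the Hamming and Jaccard distances are taken on peak presence or absence. The S12GowLeg distance is omitted because its formula is not given here. All function names are hypothetical.

```python
import math


def _unit(spectrum):
    """Scale a non-empty {m/z: intensity} spectrum to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in spectrum.values()))
    return {mz: v / norm for mz, v in spectrum.items()}


def dot_product_distance(a, b):
    """1 - normalised dot product: 0 for identical, 1 for disjoint spectra."""
    ua, ub = _unit(a), _unit(b)
    return 1.0 - sum(ua[mz] * ub[mz] for mz in ua.keys() & ub.keys())


def euclidean_distance(a, b):
    """Euclidean distance of unit vectors, divided by sqrt(2) (assumed scaling)."""
    ua, ub = _unit(a), _unit(b)
    squared = sum((ua.get(mz, 0.0) - ub.get(mz, 0.0)) ** 2
                  for mz in ua.keys() | ub.keys())
    return math.sqrt(squared) / math.sqrt(2.0)


def hamming_distance(a, b):
    """Number of m/z positions present in exactly one of the two spectra."""
    return len(a.keys() ^ b.keys())


def jaccard_distance(a, b):
    """1 - shared peaks / all peaks, based on peak presence only."""
    return 1.0 - len(a.keys() & b.keys()) / len(a.keys() | b.keys())


# Hypothetical spectra for illustration only.
query = {73: 100.0, 147: 50.0, 217: 25.0}
partial = {73: 100.0, 147: 50.0}
disjoint = {74: 10.0, 148: 5.0}
```

Comparing `query` with itself gives 0 for every measure, while `query` against `disjoint` gives 1 for the bounded measures and a Hamming distance of 5, which shows why the Hamming threshold has no fixed upper bound.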

Filter out library search hits per analyte. Checking this box removes repeated library hits with a lower dot-product match score among replicate spectra of one and the same analyte.
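The two filtering steps can be sketched in Python as follows. The sketch assumes that "most restrictive threshold" means a hit must satisfy every threshold that is set, and that the best replicate hit per analyte is the one with the smallest 1-DotProduct distance. The record fields (`analyte`, `spectrum`, `dot`, `jaccard`) and function names are hypothetical.

```python
def filter_hits(hits, thresholds):
    """Keep hits whose distance is at or below the threshold for every measure."""
    return [hit for hit in hits
            if all(hit[measure] <= limit for measure, limit in thresholds.items())]


def best_hit_per_analyte(hits):
    """Keep, for each analyte, only the hit with the smallest dot-product distance."""
    best = {}
    for hit in hits:
        current = best.get(hit["analyte"])
        if current is None or hit["dot"] < current["dot"]:
            best[hit["analyte"]] = hit
    return list(best.values())


# Hypothetical hits: two replicate spectra of analyte A and one of analyte B.
hits = [
    {"analyte": "A", "spectrum": "s1", "dot": 0.05, "jaccard": 0.20},
    {"analyte": "A", "spectrum": "s2", "dot": 0.12, "jaccard": 0.10},
    {"analyte": "B", "spectrum": "s3", "dot": 0.30, "jaccard": 0.60},
]

passed = filter_hits(hits, {"dot": 0.25, "jaccard": 0.50})
unique = best_hit_per_analyte(passed)
```

Here spectrum s3 fails both thresholds, and of the two remaining replicate hits for analyte A only s1, the closer match, is kept.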

Threshold influencing the sub-group prediction

Prediction probability threshold. This value can be used to filter for predictions exceeding a given value. The probability ranges from 0 - low probability to 100 - high probability [%].

Filter out repeating predictions. Checking this box removes repeated predictions with lower probabilities for one and the same sub-group. Our goals are to remove residual deconvolution errors, to improve the spectral quality, to extend the number of high-quality replicate spectra for existing MSTs and, more importantly, to add new metabolites to the GMD compendium. These efforts necessitate an updating scheme for the DT substructure predictions. As a result, there may be multiple decision trees for the prediction of one and the same sub-group.
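A minimal Python sketch of both prediction filters is given below. It assumes that each decision tree yields one record with a sub-group name and a probability in percent, and that "exceeding" means strictly greater than the threshold. The field names (`group`, `tree`, `probability`) and the function name are hypothetical.

```python
def filter_predictions(predictions, min_probability, drop_repeats=True):
    """Keep predictions above the threshold; optionally one per sub-group."""
    kept = [p for p in predictions if p["probability"] > min_probability]
    if not drop_repeats:
        return kept
    best = {}
    for p in kept:
        current = best.get(p["group"])
        # Several decision trees may predict the same sub-group; keep the
        # prediction with the highest probability.
        if current is None or p["probability"] > current["probability"]:
            best[p["group"]] = p
    return list(best.values())


# Hypothetical predictions: two decision trees for sub-group X, one for Y.
predictions = [
    {"group": "X", "tree": "dt1", "probability": 71.0},
    {"group": "X", "tree": "dt2", "probability": 92.0},
    {"group": "Y", "tree": "dt1", "probability": 40.0},
]

with_repeats = filter_predictions(predictions, 50.0, drop_repeats=False)
without_repeats = filter_predictions(predictions, 50.0)
```

With a threshold of 50 %, the prediction for sub-group Y is dropped, and with the box checked only the higher-probability prediction for sub-group X (from tree dt2) remains.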

You can directly hyperlink your spectra for an automatic search in the GMD!

