Below are our latest developments and news announcements on The Golm Metabolome Blog
in reverse chronological order!
published Thursday, June 19, 2014 by jahu I used to sign my applications (several GMD tools, the GoBioSpace Search Application) with a certificate that was trusted up the chain by the MPG, the DFN and Deutsche Telekom.
This certificate was issued on 24.04.2012 and worked fine until recently. Out of the blue, several users reported that they could no longer install any application I had signed. It turned out that something changed in the Windows certificate store, and you now need to accept the root certificate for code signing.
This post will guide you through.
Normally, after clicking to install the application from Internet Explorer, you will see a dialog like this at the bottom of the page, asking for permission to run the installer.
Now you will see something similar to this error message: 'The signature of setup.exe is corrupt or invalid.'
or, if you click to see the details
To solve this, press the Windows Key + R, type "mmc" and click OK.
You will see the Microsoft Management Console.
Click File and "Add/Remove Snap-in" to add the Certificates snap-in.
Activate "My user account" and click Finish.
You will see the Certificates snap-in on the right. Now click OK.
In the left panel, open Trusted Root Certification Authorities and then Certificates. Search for the 'Deutsche Telekom Root CA 2' certificate in the right-hand panel. Right-click this certificate and select Properties. Activate the purpose 'code signing' as shown below and press OK.
Problem solved. If you retry the installation, my certificate will be accepted for code signing and, hence, the software will be installed.
published Wednesday, June 11, 2014 by jahu Our service is unavailable this morning due to network maintenance. We apologize for any inconvenience and we are working to bring the site back online as soon as possible.
published Monday, August 26, 2013 by jahu Last week I was asked this question:
"... I might as well ask if you know of a source of
peak identification for typical Arabidopsis rosette polar compounds derivatized
to typical MeOX/TMS forms (sugars, organic acids, amino acids, etc. obtained
from a typical MeOH/H20 extraction)? I am trying to identify the most abundant
polar compounds (top 100 or so) in Arabidopsis leaf tissue but I have limited
access to standards and I am sure this has already been done. Here is what I
would ideally need: a list of the most abundant compounds in their elution
order on a ms5 column or similar with spectra. Any idea if such a list can be
found on the GMD site or elsewhere?"
I took the identified analytes from the experiment
published Friday, June 28, 2013 by jahu If you are going to the Metabolomics Society 2013 Conference in Glasgow next week then I hope to see you there. Stop by the poster P9-12 during poster session II and say hi! There is a lot going on in Metabolomics and we will be reporting on some of our latest developments towards Big Data science. I also post the link to our last year's poster from the Metabolomics Society 2012 Conference in Washington, DC.
During this year's Metabolomics Society Award Ceremony, our work on Decision tree supported substructure prediction of metabolites from GC-MS profiles will be awarded the 2013 Best Paper Runner-up for the second highest total number of citations during the previous three years. Wow! What a pleasant surprise.
The cross experiment control on the metabolite detail page has been updated to convey more information at a glance. Instead of a big box plot comparing all experiments to one another, a list of experiments has been added. Each row consists of a heat map and a tiny box plot, as well as the experiment name, the organism, the variance, the ANOVA F score and the main experimental conditions. When hovering over a row, a bigger version of the box plot with more detailed experimental conditions appears.
published Friday, May 11, 2012 by jahu I was asked this question today and thought it was worth documenting. I only have version 2.0 of the NIST MS Search software at hand, but I assume that the way to import the reference library is pretty similar in later versions.
Start the program.
Click Librarian on the tab control at the bottom.
Create a library by clicking the rightmost button in the toolbar.
Type an appropriate library name - "GMD".
Close the dialog by clicking OK. Then click the Import button on the left of the toolbar.
published Thursday, March 22, 2012 by jahu I got these three questions and think I should answer them here because the topic might be interesting to other people as well...
1.- Could I consider the following examples (VAR5-Alk-NA 170001 (Classified unknown); VAR5-Alk-NA and VAR5-Alk-unknown) as unknown or do they mean something different?
[JH]
These terms all refer to an unknown compound. They just point to a lack of annotation. As we also ask other laboratories for their spectral libraries, it may happen that users use different terms. However, in your example with NA170001 (classified unknown), Joachim Kopka tried to highlight that there is a chance of identification either as "[C5H12O5 (5TMS)|C20H52O5Si5]" or "[Pentitol (5TMS)|C20H52O5Si5]".
2.- Sometimes the compounds are labelled True-VAR5-Alk, False-VAR5-Alk and Pred-VAR5-Alk. Can I understand that the database was not curated and that this is why there are some False? And for the ones that are PRED (=predicted?), can I trust this prediction, or do I need to pay attention only to the compounds that say true?
[JH]
This "True", "False" and "pred" refer only to the retention index. The retention index is specific to the chromatographic setup. As the GMD is a collection of reference spectra from different labs utilising different chromatographic setups, we differentiate the quality of the retention index values.
"true" is the best and means that this RI was actually observed experimentally on this chromatographic setup.
"pred" means predicted and is the next lower quality level. When importing a library from another laboratory, we can correlate the retention index values for all compounds that were measured in both labs. Next we use this correlation (a polynomial fit or whatever) for a regression of the RIs of new compounds in the other laboratory's library into our own chromatographic system. The quality of the retention index values from such a regression depends on the similarity of the chromatographic variants in terms of column polarity, temperature programming and so on. This retention index prediction looks perfect; however, as can be seen from the plot, the estimated error of such a predicted retention index is already too large for automated spectral identification processing. Nevertheless, it is valuable information in the identification process of unknown spectra. A retention index prediction with a quality close to experimentally observed retention indexes is shown in the plot below. If we don't have any experimentally measured retention index available, but can predict one from different chromatographic setups, we choose the one that makes the most sense from a chromatographic similarity point of view. As far as I understand this field, a final identification can only be proven by using authentic reference substances.
"False" means that we don't have a retention index available for the selected chromatographic setup.
Retention index regression between very similar chromatographic variants but different retention index markers.
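The regression step described in this answer can be sketched in a few lines of Python. This is only a minimal illustration, not the GMD implementation: it swaps the polynomial fit for a plain straight-line least-squares fit, and the function names fit_line and predict_ri as well as all RI values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired RI values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_ri(foreign_ri, slope, intercept):
    """Project an RI from the other lab's setup into our own setup."""
    return slope * foreign_ri + intercept


# Hypothetical RIs of compounds that were measured in both laboratories.
shared_foreign = [1100.0, 1350.0, 1620.0, 1900.0, 2250.0]
shared_own = [1112.0, 1368.0, 1641.0, 1930.0, 2287.0]

slope, intercept = fit_line(shared_foreign, shared_own)

# Residuals on the shared compounds give a rough idea of the error
# to expect for a "pred" retention index.
residuals = [own - predict_ri(foreign, slope, intercept)
             for foreign, own in zip(shared_foreign, shared_own)]
max_abs_error = max(abs(r) for r in residuals)

# A compound that exists only in the other lab's library.
predicted = predict_ri(1500.0, slope, intercept)
```

The residuals on the shared compounds are what decides whether a predicted RI is good enough: the larger max_abs_error, the less useful the "pred" value is for automated matching.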
3.- PRED-Var5-Alk-Similar: does this mean there is a high probability that this compound will be true?
[JH]
Again, predicted refers only to the retention index, and "similar" comes from a user input. In this case it is an unknown compound measured in another lab with a different chromatographic setup. That's why we only have a predicted RI available.
I am XXX in YYY lab at ZZZ university. I have just downloaded the library GMD_20111121_VAR5_ALK_MSL.txt from the Golm Metabolome DB because I would like to use it with the AMDIS software. However, the file extension of the library is TXT and I need to convert it to MSL. I would appreciate it a lot if you could tell me how to do it.
XXX
PS. I tried to rename the extension but it did not work in AMDIS.
Dear XXX,
Thank you very much for using the Golm Metabolome Database (GMD).
Please use the AMDIS software to convert the downloaded file into a library. I will try to list all necessary steps below:
Open AMDIS :)
Click Library ==> Build One Library (this option is only available if a data file is open)
Click Files
Click Load Library
Select the file downloaded from the GMD; you might need to change the file type to "all files *.*" to see your file with the file extension .txt
The import now starts, and as a result you should see a list of 2,594 imported spectra
Click Files
Click "Save Library As"
Give an appropriate file location and name and use the file extension msl
The file is now exported and a ".cid" file (compound identification library) is generated; this is a crucial step
Click Exit to close the Library Window
Click Analyse ==> Analyse GC/MS Data...
Click Target Library
Select Page "Libr."
Select "Target Compounds Library"
Click "Select New"
Select the newly generated file, not the file downloaded from the GMD
Click Save
Click Run
If you have any problem with the library, please don't hesitate to drop me a line.
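Before importing, it can help to check that the downloaded text file really contains the expected number of spectra. The following Python sketch counts the records in a file; it assumes that every record starts with a line beginning with "NAME:", which is the usual layout of AMDIS MSL text libraries but is not confirmed here for the GMD export. The function name count_library_entries and the two sample records (including their peak values) are made up for illustration.

```python
import io


def count_library_entries(lines):
    """Count records, assuming each one starts with a 'NAME:' line."""
    return sum(1 for line in lines
               if line.strip().upper().startswith("NAME:"))


# Two made-up records, only to demonstrate the counting.
sample = io.StringIO(
    "NAME: Alanine (2TMS)\n"
    "NUM PEAKS: 2\n"
    "(73 1000) (116 850)\n"
    "\n"
    "NAME: Glycine (3TMS)\n"
    "NUM PEAKS: 1\n"
    "(174 1000)\n"
)
entry_count = count_library_entries(sample)
print(entry_count)  # 2
```

For the real file, pass an open file handle instead of the sample, for example open("GMD_20111121_VAR5_ALK_MSL.txt", encoding="latin-1"); the encoding is a guess. If the assumption about the record layout holds, the count should match the number of spectra AMDIS reports after the import.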
published Friday, February 3, 2012 by jahu Ricardo Silva pointed me to a problem in the GMD spectrum export for the TargetSearch software:
He wrote:
I've started to work with GC-MS analysis on R, and the
TargetSearch recomends the golm database http://gmd.mpimp-golm.mpg.de/download/,
but the librarys don't have Retention Index, is this correct? How do a get a
library with Retention Indexes?
Indeed, I found a format error caused by the globalisation settings, which made TargetSearch fail to load the text file.
Thanks Ricardo!
PS: If you find any problem, please drop me a line...
published Thursday, February 2, 2012 by jahu Patrik Rydberg posted some code to automatically scale a molecule in the ChemDoodle canvas. I had been looking for something like this for quite some time. I could now improve on it for my setting, which has molfiles from many different sources, by first scaling the molecule with the scaleToAverageBondLength(Number length) function.
published Tuesday, January 17, 2012 by jahu We regret that the GoBioSpace service is likely to be unavailable today, 17 Jan. 2012, on account of maintenance work and the import of the current PubChem Compound and Substance databases. More than 2.5 million structures from the IBM BAO (Business Analytics and Optimization) strategic IP insight platform (SIIP) are now available in PubChem, and we think this is very valuable for matching potentially unknown mass peaks.
Your GoBioSpace-Team
[update 2012/01/18]
We released a new data version of GoBioSpace, now including the latest version (yesterday, 2012/01/17) of the PubChem Compound and Substance databases and adding 119,958 new unique formulas to the GoBioSpace repository. However, approx. 190,000 formulas are no longer referenced and were subsequently purged from GoBioSpace.
published Tuesday, November 15, 2011 by jahu GoBioSpace is a tool to turn measured masses into source-tagged sum formulas, and with this blog entry I want to focus on the data sources.
In contrast to combinatorial sum formula prediction tools, which come up with a bare formula, the GoBioSpace user gets a formula tagged with much more information such as names, InChIs, hyperlinks and so on. This makes clear that GoBioSpace heavily depends on its data sources, of which PubChem (Compound and Substance) and ChemSpider are the biggest. First, I want to thank those data sources for making their data available to the community, and I want to thank the people behind them for the time and effort they spend developing these databases. Second, I want to give some short statistics on the data sources. The table given below lists all databases sourced into GoBioSpace, showing the total number of sum formulas coming from each database and the number of sum formulas that are available only from this depositor (no other depositor published this formula).
depositor name | total formula | formula just here
PubChem | 1,968,759 | 32,049
PubChem (ChemSpider) | 1,582,073 | 251,834
ChemSpider 2011.06.01 | 1,268,051 | 105,062
ChemSpider 2008.09.28 | 1,178,614 | 124,046
PubChem (DiscoveryGate) | 1,011,345 | 8,049
PubChem (NextBio) | 958,696 | 1,417
PubChem (Thomson Pharma) | 761,176 | 42,357
PubChem (MolPort) | 428,117 | 10,980
PubChem (ChemDB) | 421,222 | 198
PubChem (ZINC) | 362,513 | 61,484
PubChem (Ambinter) | 297,530 | 9,077
PubChem (ChEMBL) | 196,496 | 4,182
PubChem (ChemBank) | 195,104 | 16,284
PubChem (Vitas-M Laboratory) | 117,951 | 35
PubChem (ChemIDplus) | 117,865 | 2,357
PubChem (BindingDB) | 111,452 | 2,458
PubChem (DTP/NCI) | 96,996 | 5,118
PubChem (NIAID) | 87,586 | 1,331
PubChem (ChemBridge) | 81,525 | 0
PubChem (ASINEX) | 75,916 | 0
PubChem (MLSMR) | 75,731 | 659
PubChem (Specs) | 67,722 | 3
PubChem (LeadScope) | 65,416 | 257
PubChem (ICCB-Longwood/NSRB Screening Facility, Harvard Medical School) | 64,003 | 493
PubChem (ChemExper Chemical Directory) | 61,331 | 0
PubChem (NIST) | 53,532 | 6
PubChem (AAA Chemistry) | 50,033 | 47
PubChem (ChemBlock) | 48,910 | 0
PubChem (NovoSeek) | 43,807 | 47
PubChem (Emory University Molecular Libraries Screening Center) | 42,481 | 6
PubChem (Southern Research Institute) | 39,224 | 1
PubChem (MTDP) | 36,849 | 2
PubChem (NCGC) | 35,575 | 291
Metabolome.JP | 25,396 | 806
PubChem (Burnham Center for Chemical Genomics) | 23,964 | 19
PubChem (Abbott Labs) | 22,196 | 287
PubChem (Broad Institute) | 20,506 | 187
PubChem (Sigma-Aldrich) | 18,920 | 2
PubChem (NIST Chemistry WebBook) | 18,501 | 0
PubChem (NMRShiftDB) | 16,896 | 20
PubChem (UPCMLD) | 15,836 | 6
PubChem (IS Chemical Technology) | 15,583 | 267
PubChem (GLIDA, GPCR-Ligand Database) | 14,497 | 318
KNApSAcK 2011 | 13,869 | 222
PubChem (The Scripps Research Institute Molecular Screening Center) | 12,546 | 14
PubChem (MMDB) | 12,495 | 862
PubChem (Kingston Chemistry) | 12,301 | 0
PubChem (KEGG) | 11,732 | 79
PubChem (MP Biomedicals) | 10,719 | 263
PubChem (ChemSynthesis) | 10,570 | 3
PubChem (ChEBI) | 9,901 | 428
PubChem (GlaxoSmithKline (GSK)) | 9,728 | 96
PubChem (Aronis) | 9,699 | 1
PubChem (HDH Pharma) | 9,643 | 3
PubChem (TCI (Tokyo Chemical Industry)) | 9,469 | 110
PubChem (Hangzhou APIChem Technology) | 6,893 | 0
KNApSAcK v1.200.03 | 6,772 | 1
KNApSAcK v1.200.02 | 6,724 | 4
KNApSAcK | 6,037 | 0
PubChem (SMID) | 5,438 | 0
PubChem (EPA DSSTox) | 5,423 | 10
PubChem (DrugBank) | 5,398 | 15
PubChem (BioCyc) | 5,237 | 810
PubChem (Hangzhou Trylead Chemical Technology) | 4,998 | 3
PubChem (Prous Science Drugs of the Future) | 4,819 | 9
PubChem (LipidMAPS) | 4,780 | 76
PubChem (Tractus) | 4,515 | 33
PubChem (Alinda Chemical) | 4,421 | 0
PubChem (NMMLSC) | 4,357 | 2
PubChem (R&D Chemicals) | 4,202 | 0
PubChem (Nature Chemical Biology) | 3,918 | 125
PubChem (Jamson Pharmachem Technology) | 3,334 | 21
PubChem (Comparative Toxicogenomics Database) | 3,272 | 13
PubChem (KUMGM) | 3,215 | 6
PubChem (Shanghai Institute of Organic Chemistry) | 3,140 | 253
PubChem (Tyger Scientific) | 2,645 | 1
PubChem (PDSP) | 2,628 | 0
PubChem (xPharm) | 1,959 | 1
PubChem (Ennopharm) | 1,933 | 3
Human Metabolome Database | 1,815 | 6
PubChem (ORST SMALL MOLECULE SCREENING CENTER) | 1,653 | 0
Target Lipids | 1,574 | 615
PubChem (Nature Chemistry) | 1,514 | 142
PubChem (Calbiochem) | 1,456 | 13
PubChem (University of Pittsburgh Molecular Library Screening Center) | |
PubChem (Chemical Biology Department, Max Planck Institute of Molecular Physiology) | 82 | 0
PubChem (PCMD) | 79 | 1
PubChem (Amatye) | 68 | 0
PubChem (PANACHE) | 65 | 0
PubChem (iThemba Pharmaceuticals) | 65 | 0
PubChem (Vanderbilt University Medical Center) | 55 | 12
PubChem (Excenen Pharmatech) | 43 | 0
PubChem (PENN-ABS) | 39 | 0
PubChem (Ambit Biosciences) | 38 | 0
PubChem (Nature Communications) | 36 | 0
PubChem (Vanderbilt Screening Center for GPCRs, Ion Channels and Transporters) | 30 | 0
PubChem (Johns Hopkins Ion Channel Center) | 20 | 0
PubChem (SGCStoCompounds) | 17 | 0
PubChem (Web of Science) | 16 | 0
PubChem (Zancheng Functional Chemicals) | 14 | 1
PubChem (Southern Research Specialized Biocontainment Screening Center) | 14 | 0
PubChem (Laboratory of Environmental Genomics, Carolina Center for Computational Toxicology, University of North Carolina at Chapel Hill) | 14 | 0
PubChem (PennChem-GAM) | 12 | 1
PubChem (Annker Organics) | 12 | 0
PubChem (Nitric Oxide Research, National Cancer Institute (NCI)) | 8 | 0
PubChem (Paul Baures) | 6 | 0
PubChem (Isoprenoids) | 4 | 0
PubChem (Ganolix LifeScience) | 2 | 0
PubChem (CLRI (CSIR)) | 2 | 0
PubChem (Finley and King Labs, Harvard Medical School) | 1 | 0
PubChem (Bioprocess Technology Lab, Department of Microbiology, Bharathidasan University) | 1 | 0
PubChem (VIT University) | 1 | 0
PubChem was imported on 15 February 2011. "PubChem" refers to PubChem Compound, whereas "PubChem (...)" refers to PubChem Substance with the specific sub-database.
One striking observation is that the later ChemSpider import from 2011 contains a large proportion of unique formulas that are not included in the import from 2008. In fact, many of those formulas are tagged as "This record is deprecated and may be removed soon." on the ChemSpider website. What does this mean for the chemical formula? Is the formula valid?
Another observation, from the import of the Yeast Metabolome Database, is that smaller databases also contribute formulas that are not yet included in the larger databases PubChem and ChemSpider.
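The two numbers reported per depositor in the table above (total formulas, and formulas that no other depositor published) can be computed with plain set counting. The Python sketch below is only an illustration of that idea and not the code behind GoBioSpace; the function name depositor_statistics and the three tiny example depositors are hypothetical.

```python
from collections import Counter


def depositor_statistics(formulas_by_depositor):
    """Return {depositor: (total formulas, formulas only found here)}."""
    # Count for every formula in how many depositors it occurs.
    occurrences = Counter()
    for formulas in formulas_by_depositor.values():
        occurrences.update(set(formulas))

    stats = {}
    for depositor, formulas in formulas_by_depositor.items():
        unique_formulas = set(formulas)
        only_here = sum(1 for formula in unique_formulas
                        if occurrences[formula] == 1)
        stats[depositor] = (len(unique_formulas), only_here)
    return stats


# Hypothetical miniature data set.
example = {
    "Depositor A": {"C6H12O6", "C3H7NO2", "C2H6O"},
    "Depositor B": {"C6H12O6", "C5H12O5"},
    "Depositor C": {"C6H12O6", "C3H7NO2"},
}
stats = depositor_statistics(example)
print(stats["Depositor A"])  # (3, 1)
```

In this toy data, Depositor C contributes nothing that is not already covered elsewhere, which corresponds to the rows with a 0 in the last column of the table.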
published Wednesday, November 2, 2011 by jahu Today I exported the current mass spectral library and linked the files on the GMD download web page.
We now permit the download of the GMD mass spectral reference library under the Creative Commons Attribution-ShareAlike 3.0 License (many thanks to Steffen Neumann for helping us to choose the right license). This overdue export was postponed again and again because I wanted to clean up some issues in the export application. Also, considering the seven export formats, two retention index markers and two GC columns, giving rise to 7*2*2=28 different files, I wanted to implement a fully automated export first.
In comparison to the deprecated library from June 2010, we added the library (~660 spectra) from Prof. Schomburg's department at Technische Universität Carolo-Wilhelmina in Braunschweig.
If you also want to see your spectra integrated into the GMD, I strongly encourage you to submit them by email in any library format you prefer.
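The 28 export files mentioned in this post are simply every combination of format, retention index marker and GC column. The following Python sketch enumerates them; it is not the actual export application. Only "MSL", "VAR5", "ALK" and the file name pattern are taken from the file name GMD_20111121_VAR5_ALK_MSL.txt quoted elsewhere on this blog. The remaining format names are placeholders, and "FAME" and "MDN35" as file name parts are assumptions.

```python
from itertools import product

# Placeholder names: the post only states the counts (7 formats,
# 2 retention index markers, 2 GC columns).
EXPORT_FORMATS = ["MSL", "FORMAT2", "FORMAT3", "FORMAT4",
                  "FORMAT5", "FORMAT6", "FORMAT7"]
RI_MARKERS = ["ALK", "FAME"]
GC_COLUMNS = ["VAR5", "MDN35"]


def export_file_names(date_stamp):
    """Build one file name per format / marker / column combination."""
    return [
        f"GMD_{date_stamp}_{column}_{marker}_{fmt}.txt"
        for fmt, marker, column in product(EXPORT_FORMATS, RI_MARKERS, GC_COLUMNS)
    ]


file_names = export_file_names("20111121")
print(len(file_names))  # 28
```

Driving the export from such a list is what makes it fully automated: adding an eighth format means adding one entry, not four more manual exports.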
published Thursday, October 27, 2011 by jahu Jan Lisec pointed me to a severe bug in the functional group prediction feature implemented in the GMD. It turned out that I normalised spectra for decision tree training differently from spectra for spectral classification. Jan pointed me to a spectrum where the Phosphoric Acid Deriv group was predicted to be present based on m/z 299, although this particular mass had only a minimal intensity in this GC-MS spectrum.
The only good news is that the validation for the publication was not affected by this bug, because the cross validation is performed without this web interface. I have fixed the bug and apologise for the inconvenience.
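A common way to rule out this kind of bug is to route training and classification through one shared preprocessing function. The Python sketch below illustrates the idea; it is not the GMD code. Base peak normalisation to 1000 is an assumption (the post does not say which normalisation is used), and the names normalise_to_base_peak and prepare_features as well as the example intensities are hypothetical.

```python
def normalise_to_base_peak(spectrum, scale=1000.0):
    """Scale intensities so that the most intense peak equals `scale`."""
    if not spectrum:
        return {}
    base = max(spectrum.values())
    if base <= 0:
        raise ValueError("spectrum has no positive intensity")
    return {mz: intensity / base * scale
            for mz, intensity in spectrum.items()}


def prepare_features(spectrum, masses):
    """Single entry point used for BOTH training and classification."""
    normalised = normalise_to_base_peak(spectrum)
    return [normalised.get(mz, 0.0) for mz in masses]


# Hypothetical spectrum with a tiny peak at m/z 299.
spectrum = {73: 5000.0, 299: 25.0}
features = prepare_features(spectrum, [73, 299, 147])
```

With a single entry point, a minimal m/z 299 peak stays minimal (about 5 on a scale of 1000 here) in both the training data and the spectrum being classified, so a threshold learned by the tree means the same thing in both places.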
published Tuesday, October 25, 2011 by jahu I just finished some work on the GMD website to improve its usability. I replaced images of mass spectra with interactive JavaScript controls from Highcharts, in which the user can zoom in without any effort. An example mass spectrum of Alanine (3TMS) can be seen here. I also used this control to visualise the north-south plot in the mass spectral matching result view. It took me some time to figure out all the settings, and it is still not perfect, because the data labels of the mass peaks are rendered across the mass peaks.
In the same breath, I replaced all images of chemical structures with interactive JavaScript controls from ChemDoodle Web Components. The old images were rendered using the Hyleos .Net ChemLib control, which failed to render some of the more complex structures.
I appreciate these free resources from ChemDoodle and Highcharts very much!
The next statement creates a session for searching the Depositors "Human Metabolome Database" (id=3), "KNApSAcK v1.200.03" (id=6) and "Metabolome.JP" (id=8), using the Adducts "Protonation [M + H+]+" (id=2) and "KaliumAdduct [M + K+]+" (id=3). Here you can find the full list of Depositors and Adducts!
published Tuesday, August 23, 2011 by jahu I just released a new version 2.1.2.0 of the GoBioSpace search application, which is now able to connect single masses in lists of retention time, accurate mass and intensity based on the 12C and 34S isotopic patterns.
In addition, we now use adducts (complete list of supported adducts) to identify the compound mass from two different masses that fit together considering the mass shift of two different adducts.
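The adduct idea can be illustrated with a short Python sketch: two measured masses belong together if, after subtracting the mass of two different adduct ions, they point to the same neutral compound mass. This is not the GoBioSpace implementation. The function name pair_by_adducts, the tolerance and the peak list are hypothetical; only the two adduct labels are taken from this blog, and the ion masses are the standard values for H+ and K+.

```python
from itertools import combinations, permutations

# Mass added to the neutral molecule M by each adduct ion, in u.
ADDUCT_SHIFTS = {
    "[M + H+]+": 1.007276,   # proton
    "[M + K+]+": 38.963158,  # potassium ion (39K minus one electron)
}


def pair_by_adducts(mz_values, tolerance=0.005):
    """Find pairs of m/z values that agree on one neutral mass."""
    hits = []
    for mz_a, mz_b in combinations(mz_values, 2):
        for (name_a, shift_a), (name_b, shift_b) in permutations(
                ADDUCT_SHIFTS.items(), 2):
            neutral_a = mz_a - shift_a
            neutral_b = mz_b - shift_b
            if abs(neutral_a - neutral_b) <= tolerance:
                hits.append((mz_a, name_a, mz_b, name_b,
                             (neutral_a + neutral_b) / 2))
    return hits


# Hypothetical peak list; the first two values mimic a hexose
# (C6H12O6, monoisotopic mass about 180.0634) seen with both adducts.
peaks = [181.0707, 219.0265, 250.1234]
hits = pair_by_adducts(peaks)
for mz_a, name_a, mz_b, name_b, neutral in hits:
    print(f"{mz_a} {name_a} and {mz_b} {name_b} -> M = {neutral:.4f}")
```

A pair that agrees on the neutral mass is much stronger evidence than a single peak, because a single m/z value leaves open which adduct produced it.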
Gas chromatography (GC) coupled to mass spectrometry (MS) relies on two cornerstones to identify metabolites in complex samples, typically derived from plants. First, the retention index (RI, i.e. Kovats retention index), which is a number derived from the retention time that describes the chromatographic behaviour of a substance in a particular chromatographic setup. Second, the mass spectrum, as shown here for Alanine (2TMS). For convenience, we combine these features into a mass spectral tag (MST) and link MSTs to the appropriate analytes.
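As a sketch of how a retention index is derived from a retention time, the following Python function interpolates between the two retention index markers that bracket an analyte. Note the assumptions: the classical Kovats index is defined for isothermal runs and uses logarithms of retention times, whereas this sketch uses the linear variant (van den Dool and Kratz) commonly applied to temperature-programmed GC; which variant the GMD uses is not stated here. The function name and the marker retention times are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, marker_rts):
    """Linear RI from n-alkane markers.

    marker_rts maps the carbon number of each n-alkane marker to its
    retention time (same unit as rt).
    """
    markers = sorted(marker_rts.items(), key=lambda item: item[1])
    if len(markers) < 2:
        raise ValueError("need at least two markers")
    times = [time for _, time in markers]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time outside the marker range")
    # Index of the marker eluting at or just before the analyte.
    pos = min(bisect_right(times, rt) - 1, len(markers) - 2)
    (n_low, t_low), (n_high, t_high) = markers[pos], markers[pos + 1]
    return 100.0 * (n_low + (n_high - n_low) * (rt - t_low) / (t_high - t_low))


# Hypothetical marker retention times in minutes for C10, C12 and C15.
alkanes = {10: 5.0, 12: 7.0, 15: 10.0}
print(linear_retention_index(6.0, alkanes))  # 1100.0
```

Because the index is anchored to the markers rather than to absolute time, it is far more stable across runs than the raw retention time, but it still depends on the column and on the marker series used.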
While differences in the detector technologies used, namely quadrupole, ion trap and time of flight, can be deemed irrelevant for mass spectral matching, chromatography settings that vary between laboratories, such as temperature programming, type of capillary column and choice of column manufacturer, heavily affect the empirically determined RI properties. Procedures for the transfer of RI properties between chromatography variants are, therefore, highly relevant for shared library use. We assessed (Strehmel, N., Hummel, J., Erban, A., Strassburg, K. and Kopka, J. (2008) Retention index thresholds for compound matching in GC-MS metabolite profiling, Journal of Chromatography B, 871, 182-190. http://dx.doi.org/10.1016/j.jchromb.2008.04.042) the accuracy of RI transfer between chromatography variants and found regressions transferring empirically determined RI properties favourable compared to retention indexes estimated from physicochemical properties, as described by S. E. Stein and coworkers in "Estimation of Kováts Retention Indices Using Group Contributions".
Here I report on GmdRiTransfer_01, which transfers RIs from a 5%-phenyl-95%-dimethylpolysiloxane (VAR5) column to a 35%-phenyl-65%-dimethylpolysiloxane (MDN35) capillary column. To make things more complicated, the retention indexes on the VAR5 variant are based on the n-alkane homologues, while the retention indexes on the MDN35 variant are based on fatty acid methyl esters (FAME).
The figure above depicts the training data: 899 blue dots represent analytes in the GMD with a VAR5-ALKANE RI (typically applied in the J. Kopka group) and an MDN35 FAME RI (typically utilised in the Willmitzer department). The analyte and spectrum GUIDs (globally unique identifiers) in the data files can be resolved using the GMD's text search facility, e.g. http://gmd.mpimp-golm.mpg.de/search.aspx?query=F3EF65C6-A321-4F89-A22A-0D587219B60E. The training data are approximated using a logarithmic regression [y = 807112.206965427 * ln(x) - 5440118.048605500] (yellow graph), resulting in an average relative error of 5% (S.D. 7%) and exhibiting an R2 = 0.950764659. Finally, 2,340 analytes in the GMD lacking an empirically determined MDN35 FAME RI were projected from the VAR5 ALKANE variant into the MDN35 FAME variant (greenish data points). I will make new libraries available for download soon; however, please feel free to request an up-to-date GMD download.
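The logarithmic regression quoted above can be applied directly. The Python sketch below uses the two coefficients exactly as given in this post; the function names and the example RI of 1500 are hypothetical, and the relative error helper is only one plausible way to compute the error measure mentioned in the text.

```python
import math

# Coefficients as quoted in the post: y = SLOPE * ln(x) + INTERCEPT
SLOPE = 807112.206965427
INTERCEPT = -5440118.048605500


def var5_alkane_to_mdn35_fame(ri_var5_alkane):
    """Project a VAR5 n-alkane RI onto the MDN35 FAME scale."""
    if ri_var5_alkane <= 0:
        raise ValueError("retention index must be positive")
    return SLOPE * math.log(ri_var5_alkane) + INTERCEPT


def relative_error(predicted, observed):
    """Absolute deviation relative to the observed value."""
    return abs(predicted - observed) / abs(observed)


# Hypothetical analyte with a VAR5 n-alkane RI of 1500.
projected = var5_alkane_to_mdn35_fame(1500.0)
print(round(projected))
```

Given the reported average relative error of about 5%, a value projected this way should be treated as a "pred" quality RI, useful for narrowing down candidates but not a replacement for an experimentally observed one.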
published Tuesday, May 31, 2011 by jahu Hello,
I created this blog to provide updates and other useful information around the Golm Metabolome Database.