GOLM METABOLOME DATABASE

Below are our latest developments and news announcements on the Golm Metabolome Blog, in reverse chronological order!

code signing with 'Deutsche Telekom Root CA 2' root certificate

published Thursday, June 19, 2014 by jahu
I used to sign my applications (several GMD tools, GoBioSpace Search Application) with a certificate which was trusted up the chain by the MPG, the DFN and the Deutsche Telekom.

This certificate was issued on 24.04.2012 and worked fine until recently. Out of the blue, several users reported that they could no longer install any application I had signed. It turned out that something changed in the Windows certificate store, and you now need to accept the root certificate for code signing explicitly.

This post will guide you through the steps.

Normally, after clicking to install the application from Internet Explorer, you will see a dialog like this at the bottom of the page, asking for permission to run the installer.

Now you will experience something similar to this error message 'The signature of setup.exe is corrupt or invalid.'

or, if you click to see the details

To solve this, press the Windows key + R, type "mmc" and click OK.

You will see the Microsoft Management Console.

Click File and then "Add/Remove Snap-in" to add the Certificates snap-in.

Select "My user account" and click Finish.

You will see the Certificates snap-in on the right. Now click OK.

In the left panel, expand Trusted Root Certification Authorities and then select Certificates. Search for the 'Deutsche Telekom Root CA 2' certificate in the right-hand panel. Right-click this certificate and select Properties. Activate the purpose 'code signing' as shown below and press OK.

Problem solved. If you now retry installing an application, my certificate will be accepted for code signing and, hence, the software will get installed.

Enjoy.

GMD service unavailable

published Wednesday, June 11, 2014 by jahu
Our service is unavailable this morning due to network maintenance. We apologize for any inconvenience and we are working to bring the site back online as soon as possible.

Most Abundant Compounds in their Elution Order for Arabidopsis Sample

published Monday, August 26, 2013 by jahu
Last week I was asked this question:

"... I might as well ask if you know of a source of peak identification for typical Arabidopsis rosette polar compounds derivatized to typical MeOX/TMS forms (sugars, organic acids, amino acids, etc. obtained from a typical MeOH/H20 extraction)? I am trying to identify the most abundant polar compounds (top 100 or so) in Arabidopsis leaf tissue but I have limited access to standards and I am sure this has already been done. Here is what I would ideally need: a list of the most abundant compounds in their elution order on a ms5 column or similar with spectra. Any idea if such a list can be found on the GMD site or elsewhere?"

I took the identified analytes from the experiment

Mining for metabolic responses to long-term salt stress: a case study on Arabidopsis thaliana Col-0 (A)

(see metabolite profile) and added the retention indexes. No problem at all!

cheers
Jan

Analyte	Metabolite	RI relative to alkane homologues on 5%-phenyl-95%-dimethylpolysiloxane capillary column
Boric-acid_3TMS	Boric-acid	971.63
Propane-1,2-diol (2TMS)	Propane-1,2-diol	988.00
Decane, n-	Decane, n-	1000.00
Siloxane	Siloxane	1021.01
Pyridine, 2-hydroxy- (1TMS)	2-Hydroxypyridine	1031.31
Lactic acid (2TMS)	Lactic acid	1044.47
Glycolic acid (2TMS)	Glycolic acid	1062.89
Hydroxylamine (3TMS)	Hydroxylamine	1104.93
similar to Cyclopentasiloxane, decamethyl	Siloxane	1117.07
NA114002 (classified unknown)		1131.09
Furan-2-carboxylic acid (1TMS)	Furan-2-carboxylic acid	1133.08
Pyridine, 3-hydroxy- (1TMS)	3-Hydroxypyridine	1136.97
Benzylalcohol (1TMS)	Benzyl alcohol	1152.05
D116201		1161.89
Proline (1TMS)	Proline	1176.03
NA		1189.63
Dodecane	Dodecane	1200.00
Valine (2TMS)	Valine	1207.10
similar to Pentasiloxane, dodecamethyl	Siloxane	1208.14
Diethylenglycol (2TMS)	Diethyleneglycol	1236.16
NA		1238.20
Benzoic acid, (1TMS)	Benzoic acid	1250.39
Ethanolamine (3TMS)	Ethanolamine	1259.94
Phosphoric acid (3TMS)	Phosphoric acid	1261.98
Glycerol (3TMS)	Glycerol	1262.29
Siloxane	Siloxane	1285.28
Isoleucine (2TMS)	Isoleucine	1286.54
Threonine (2TMS)	Threonine	1290.45
Nicotinic acid (1TMS)	Nicotinic acid	1301.80
Glycine (3TMS)	Glycine	1302.68
Succinic acid (2TMS)	Succinic acid	1310.65
Glyceric acid (3TMS)	Glyceric acid	1319.94
NA133011		1332.25
Fumaric acid (2TMS)	Fumaric acid	1346.94
NA		1386.00
NA		1410.42
NA		1420.25
Aspartic acid (2TMS)	Aspartic acid	1422.39
NA		1436.41
NA145016 (classified unknown)		1454.42
NA145015		1455.48
Cysteamine (3TMS)	Cysteamine	1458.99
NA		1467.28
Malic acid (3TMS)	Malic acid	1479.34
Threitol (4TMS)	Threitol	1485.23
Pentadecane, n-	Pentadecane, n-	1500.00
Pyroglutamic acid (2TMS)	Pyroglutamic acid	1521.73
Butanoic acid, 4-amino- (3TMS)	Butyric acid, 4-amino-	1527.46
Glutamic acid (2TMS)	Glutamic acid	1528.20
Erythronic acid (4TMS)	Erythronic acid	1528.59
Threonic acid (4TMS)	Threonic acid	1545.94
D155405		1554.13
Phenylalanine (1TMS)	Phenylalanine	1556.05
Benzoic acid, 4-hydroxy- (2TMS)	Benzoic acid, 4-hydroxy-	1633.29
D164543		1645.42
Asparagine (3TMS)	Asparagine	1666.44
Siloxane	Siloxane	1685.56
Glucose, 1,6-anhydro-, beta- (3TMS)	Glucose, 1,6-anhydro, beta-D-	1701.20
Ribitol (5TMS)	Ribitol	1712.74
NA		1723.17
NA174001		1732.08
Putrescine (4TMS)	Putrescine	1736.70
Aconitic acid, cis- (3TMS)	cis-Aconitic acid	1741.10
NA176001 (classified unknown)		1742.97
NA		1755.80
Siloxane	Siloxane	1771.64
NA180004		1776.31
NA181001		1795.75
Octadecane, n-	Octadecane, n-	1800.00
Citric acid (4TMS)	Citric acid	1803.92
NA		1817.56
Siloxane	Siloxane	1828.56
Dehydroascorbic acid dimer (2MEOX) MP	Dehydroascorbic acid	1838.71
Psicose (1MEOX) (5TMS) MP	Psicose	1849.82
Fructose (1MEOX) (5TMS) MP	Fructose	1853.93
NA		1860.68
Mannose (1MEOX) (5TMS) MP	Mannose	1868.46
Galactose (1MEOX) (5TMS) MP	Galactose	1876.07
Glucose (1MEOX) (5TMS) MP	Glucose	1880.50
NA		1895.60
Nonadecane	Nonadecane	1900.00
NA192001 (classified unknown)		1915.94
Sorbitol (6TMS)	Sorbitol	1919.74
NA		1956.57
Siloxane	Siloxane	1975.67
Galactonic acid (6TMS)	Galactonic acid	1980.49
Gluconic acid (6TMS)	Gluconic acid	1984.52
NA		2019.51
Hexadecanoic acid (1TMS)	Palmitic acid	2045.44
Sinapic acid, cis- (2TMS)	Sinapic acid, cis-	2060.66
Inositol, myo- (6TMS)	myo-Inositol	2080.20
NA211001		2098.30
Siloxane	Siloxane	2114.20
Siloxane	Siloxane	2188.76
Docosane, n-	Docosane, n-	2200.00
Octadecanoic acid (1TMS)	Stearic acid	2243.49
Spermidine (5TMS)	Spermidine	2251.32
NA		2265.18
similar to Glycerolaldopyranosid (6TMS)		2298.80
Siloxane	Siloxane	2332.66
NA		2353.10
NA		2360.28
Siloxane	Siloxane	2399.69
Siloxane	Siloxane	2468.06
NA		2484.98
NA		2499.64
NA		2530.38
Siloxane	Siloxane	2535.24
NA		2542.64
NA		2583.33
Siloxane	Siloxane	2596.74
D260482		2604.01
Sucrose (8TMS)	Sucrose	2623.04
Siloxane	Siloxane	2716.61
Maltose (1MEOX) (8TMS) MP	Maltose	2719.80
Trehalose, alpha,alpha'-, D- (8TMS)	Trehalose, alpha,alpha'-	2726.30
NA		2748.95
NA		2785.30
D278931		2788.69
Octacosane, n-	Octacosane, n-	2800.02
D288804		2888.07
NA		2911.69
Galactinol (9TMS)	Galactinol	2966.29
NA		2987.50
NA		2996.53
NA		3098.96
Dotriacontane, n-	Dotriacontane, n-	3200.00
Raffinose (11TMS)	Raffinose	3350.64
Hexatriacontane, n-	Hexatriacontane, n-	3600.00
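
One way to use such a list is to look up which analytes elute close to an observed retention index. The snippet below is only a minimal sketch: it copies a handful of rows from the table above, and both the function name `candidates` and the window of +/- 2 RI units are illustrative assumptions, not a GMD recommendation. A retention index match alone is only a hint; the mass spectrum still has to agree.

```python
from bisect import bisect_left, bisect_right

# A few (RI, analyte) rows copied from the table above, sorted by RI.
RI_TABLE = [
    (1044.47, "Lactic acid (2TMS)"),
    (1176.03, "Proline (1TMS)"),
    (1207.10, "Valine (2TMS)"),
    (1261.98, "Phosphoric acid (3TMS)"),
    (1262.29, "Glycerol (3TMS)"),
    (1479.34, "Malic acid (3TMS)"),
    (1803.92, "Citric acid (4TMS)"),
    (2623.04, "Sucrose (8TMS)"),
]


def candidates(observed_ri, window=2.0, table=RI_TABLE):
    """Return analytes whose RI lies within +/- window of observed_ri.

    The default window is a hypothetical value chosen for illustration.
    """
    ris = [ri for ri, _ in table]
    lo = bisect_left(ris, observed_ri - window)
    hi = bisect_right(ris, observed_ri + window)
    return [name for _, name in table[lo:hi]]


# Two analytes of the excerpt elute very close together around RI 1262.
print(candidates(1262.0))
```

Because phosphoric acid and glycerol are only about 0.3 RI units apart in this list, a lookup like this returns both, which illustrates why the spectrum is needed to decide between them.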

Metabolomics Society 2013 Conference

published Friday, June 28, 2013 by jahu
If you are going to the Metabolomics Society 2013 Conference in Glasgow next week, then I hope to see you there. Stop by poster P9-12 during poster session II and say hi! There is a lot going on in metabolomics, and we will be reporting on some of our latest developments towards Big Data science. I am also posting the link to last year's poster from the Metabolomics Society 2012 Conference in Washington, DC.
During this year's Metabolomics Society Award Ceremony, our work on Decision tree supported substructure prediction of metabolites from GC-MS profiles will be awarded the 2013 Best Paper Runner-up for the second highest total number of citations during the previous three years. Wow! What a pleasant surprise.

Cross Experiment Comparison of One Metabolite

published Wednesday, June 12, 2013 by Unknown

The cross experiment control on the metabolite detail page has been updated to convey more information at a glance. Instead of one big box plot comparing all experiments to one another, there is now a list of experiments. Each row consists of a heat map and a tiny box plot, as well as the experiment name, organism, the variance, the ANOVA F score and the main experimental conditions. When hovering over a row, a bigger version of the box plot with more detailed experimental conditions appears.

Example: http://gmd.mpimp-golm.mpg.de/Metabolites/37e8fffb-70da-4399-b724-476bd8715ef0.aspx#QuantitativeProfileData

How to import the GMD reference library into NIST MS Search

published Friday, May 11, 2012 by jahu
I was asked this question today and thought it was worth documenting. I only have version 2.0 of the NIST MS Search software at hand, but I assume that importing the reference library works much the same way in later versions.

Start the program.
Click Librarian on the tab control at the bottom.
Create a library by clicking the rightmost button in the toolbar.
Type an appropriate library name, e.g. "GMD".
Close the dialog by clicking OK.
Then click the Import button on the left of the toolbar.
Select the msp file downloaded from the GMD. Make sure to first select "All files" in the file type menu.
In the options, check "Include Synonyms".
Click "Import All".
The import starts.
You can cancel the library matching process.
The GMD reference library is now imported and ready for use.
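
If you want to sanity-check the downloaded msp file before importing it, a few lines of Python are enough. The following is a rough sketch and not part of any GMD or NIST tool: it assumes the common MSP text layout, in which every record starts with a "Name:" line, carries a "Num Peaks:" field and lists the peaks as "m/z intensity" pairs. The function names and the sample text are made up for illustration, and repeated header fields (such as several synonyms) simply overwrite each other here.

```python
def read_msp(text):
    """Split MSP text into records, each with header fields and a peak list."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" in line and not line[0].isdigit():
            # Header line such as "Name: ..." or "Num Peaks: 3"
            key, value = line.split(":", 1)
            key = key.strip().lower()
            if key == "name":  # a Name field opens a new record
                current = {"fields": {}, "peaks": []}
                records.append(current)
            if current is not None:
                current["fields"][key] = value.strip()
        elif current is not None:
            # Peak line: "mz intensity" pairs separated by ';' or whitespace
            tokens = line.replace(";", " ").split()
            for mz, intensity in zip(tokens[0::2], tokens[1::2]):
                current["peaks"].append((float(mz), float(intensity)))
    return records


def peak_count_ok(record):
    """Check that the declared number of peaks matches the parsed peaks."""
    return int(record["fields"].get("num peaks", -1)) == len(record["peaks"])


# Made-up sample text, only to show the expected layout.
SAMPLE = """Name: Example analyte A (2TMS)
Num Peaks: 3
73 999; 116 540; 147 210;

Name: Example analyte B (3TMS)
Num Peaks: 2
73 999
299 120
"""

records = read_msp(SAMPLE)
print(len(records), all(peak_count_ok(r) for r in records))
```

Reading the real file with `open(path).read()` and passing the text to `read_msp` gives a quick record count to compare with what NIST MS Search reports after the import.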

cheers

Jan

3 questions about the GMD mass spectral reference library

published Thursday, March 22, 2012 by jahu
I got these three questions and think I should answer them here because the topic might be interesting to other people as well...

1.- Could I consider the following examples (VAR5-Alk-NA 170001 (Classified unknown); VAR5-Alk-NA and VAR5-Alk-unknown) as unknown or do they mean something different?

[JH]

These terms all refer to an unknown compound; they just point to a lack of annotation. As we also ask other laboratories for their spectral libraries, it may happen that users use different terms. However, in your example with NA170001 (classified unknown), Joachim Kopka tried to highlight that there is a chance of identification either as "[C5H12O5 (5TMS)|C20H52O5Si5]" or "[Pentitol (5TMS)|C20H52O5Si5]".

2.- Sometimes the compounds are label as True-VAR5-Alk, False-VAR5-Alk and Pred-VAR5-Alk. can I understand that the database was not curate? therefore there are some False. However the ones that are PRED (=predicted?) Can i trust in this prediction? or do i need only to pay attention to the compounds that says true?

[JH]

The labels "True", "False" and "Pred" refer only to the retention index (RI). The retention index is specific to the chromatographic setup. As the GMD is a collection of reference spectra from different labs utilising different chromatographic setups, we differentiate the quality of the retention index values.

"True" is the best level and means that this RI was actually observed experimentally on this chromatographic setup.
"Pred" means predicted and is the next lower quality level. When importing a library from another laboratory, we can correlate the retention index values of all compounds that were measured in both labs. Next, we use this correlation (a polynomial fit or whatever) as a regression to transfer the RIs of new compounds in the other laboratory's library into our own chromatographic system.
The quality of the retention index values from such a regression depends on the similarity of the chromatographic variants in terms of column polarity, temperature programming and so on. This retention index prediction looks perfect; however, as can be seen from the plot, the estimated error of such a predicted retention index is already too large for automated spectral identification. Nevertheless, it is valuable information when identifying unknown spectra.
A retention index prediction with a quality close to experimentally observed retention indexes is shown in the plot below.
If we don't have an experimentally measured retention index available, but can predict one from different chromatographic setups, we choose the one that makes the most sense from a chromatographic similarity point of view. As far as I understand this field, a final identification can only be proven by using authentic reference substances.
"False" means that we don't have a retention index available for the selected chromatographic setup.

retention index regression between very similar chromatographic variants but different retention index markers.
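
When post-processing library hits in your own scripts, the prefix can be turned into a simple ranking. This is only a sketch under the assumption that the RI quality is the first hyphen-separated part of the name, as in the examples from the questions ("True-VAR5-Alk", "Pred-VAR5-Alk", "False-VAR5-Alk"). The helper names `ri_quality` and `best_first` are hypothetical and not part of the GMD.

```python
# Lower rank means better retention index quality, as described above.
RI_QUALITY = {"true": 0, "pred": 1, "false": 2}


def ri_quality(name):
    """Return 'true', 'pred' or 'false' if the name starts with such a tag, else None."""
    prefix = name.split("-", 1)[0].strip().lower()
    return prefix if prefix in RI_QUALITY else None


def best_first(names):
    """Sort names so that experimentally observed RIs come first."""
    return sorted(names, key=lambda n: RI_QUALITY.get(ri_quality(n), len(RI_QUALITY)))


hits = ["False-VAR5-Alk", "Pred-VAR5-Alk", "True-VAR5-Alk"]
print(best_first(hits))
```

Names without one of the three tags are sorted to the end rather than rejected, since the tag says nothing about the spectrum itself.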

3.-PRED-Var5-Alk-Similar Does mean there is a high probability that this compound will be true?

[JH]

Again, "predicted" refers only to the retention index, and "similar" comes from user input. In this case it is an unknown compound measured in another lab with a different chromatographic setup. That's why we only have a predicted RI available.

cheers,
Jan

How to integrate mass spectral reference libraries into AMDIS

published Thursday, March 22, 2012 by jahu

Recently, I got this question on how to import GMD mass spectral reference libraries into the Automated Mass Spectral Deconvolution and Identification System (AMDIS). As I think this might be interesting for other people as well, I copy the question and my answer below:

Hi,
I am XXX in YYY lab at ZZZ university. I have just downloaded the following library GMD_20111121_VAR5_ALK_MSL.txt from Golm Metabolome DB. Because I would like to use it AMDIS software.

However, the extension file of the library is in TXT and I need to converte it to MSL. I will appreciate a lot if you could tel me how to do it.

XXX

PS. I tried to rename the extention but it did not work in AMDIS

Dear XXX,

Thank you very much for using the Golm Metabolome Database (GMD).

Please use the AMDIS software to convert the downloaded file into a library. I try to list all necessary steps below:

Open AMDIS :)
Click Library ==> Build One Library (this option is only available if a data file is open)
Click Files
Click Load Library
Select the file downloaded from the GMD; you might need to change the file type to "all files *.*" to see your file with the file extension .txt
The import now starts, and as a result you should see a list of 2,594 imported spectra
Click Files
Click "Save Library As"
Give an appropriate file location and name and use the file extension msl
The file is now exported and a ".cid" file (compound identification library) is generated; this is a crucial step
Click Exit to close the Library window
Click Analyse ==> Analyse GC/MS Data...
Click Target Library
Select page "Libr."
Select "Target Compounds Library"
Click "Select New"
Select the newly generated file, not the file downloaded from the GMD
Click Save
Click Run

If you have any problem with the library, please don't hesitate to drop me a line.

Your feedback is highly appreciated.

Best regards

Jan

update spectral library for TargetSearch

published Friday, February 3, 2012 by jahu
Ricardo Silva pointed me to a problem in the GMD spectrum export for the TargetSearch software:

He wrote:

I've started to work with GC-MS analysis on R, and the TargetSearch recomends the golm database http://gmd.mpimp-golm.mpg.de/download/, but the librarys don't have Retention Index, is this correct? How do a get a library with Retention Indexes?

Indeed, I found a format error caused by globalisation settings, which made TargetSearch fail to load the text file.

Thanks Ricardo!

PS: If you find any problem, please drop me a line...

Tweaking ChemDoodle

published Thursday, February 2, 2012 by jahu
Patrik Rydberg posted some code to automatically scale a molecule in the ChemDoodle canvas. I had been looking for something like this for quite some time. I could improve on it for my setting, where the molFile comes from many different sources, by first scaling the molecule with the scaleToAverageBondLength(Number length) function.

See an example here:
http://gmd.mpimp-golm.mpg.de/Analytes/0a2b3536-2245-4c0e-bdbc-495766eeec67.aspx

My code (taken from Patrik) is below:

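// Parse the MOL file and normalise the bond lengths, so molecules from different sources are comparable
var structure = ChemDoodle.readMOL(molFile);
structure.scaleToAverageBondLength(10);
// Fit the molecule into the canvas and leave a 10% margin
var size = structure.getDimension();
var scale = Math.min(canvas.width / size.x, canvas.height / size.y);
canvas.loadMolecule(structure);
canvas.specs.scale = scale * 0.9;
canvas.repaint();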

cheers

PubChem update

published Tuesday, January 17, 2012 by jahu
We regret that the GoBioSpace service is likely to be unavailable today, 17 January 2012, on account of maintenance work and the import of the current PubChem Compound and Substance databases. More than 2.5 million structures from the IBM BAO (Business Analytics and Optimization) strategic IP insight platform (SIIP) are now available in PubChem, and we think this is very valuable for matching potentially unknown mass peaks.

Your GoBioSpace-Team

[update 2012/01/18]
We released a new data version of GoBioSpace, which now includes the latest version (yesterday, 2012/01/17) of the PubChem Compound and Substance databases and adds 119,958 new unique formulas to the GoBioSpace repository. However, approx. 190,000 formulas are no longer referenced and were subsequently purged from GoBioSpace.

GoBioSpace's depositors

published Tuesday, November 15, 2011 by jahu
GoBioSpace is a tool to turn measured masses into source-tagged sum formulas, and with this blog entry I want to focus on the data sources.

In contrast to combinatorial sum formula prediction tools, which come up with a bare formula, the GoBioSpace user gets a formula tagged with much more information such as names, InChIs, hyperlinks and so on. To be clear, GoBioSpace heavily depends on its data sources, and PubChem (Compound and Substance) and ChemSpider are the biggest ones. First, I want to thank those data sources for making the data available to the community, and I want to thank the people behind them for the time and effort they spend developing these databases. Second, I want to give some short statistics on the data sources. The table below lists all databases sourced into GoBioSpace, showing the total number of sum formulas coming from each database and the number of sum formulas that are available only from this depositor (no other depositor published this formula).

depositor name	total formulas	formulas unique to this depositor
PubChem	1,968,759	32,049
PubChem (ChemSpider)	1,582,073	251,834
ChemSpider 2011.06.01	1,268,051	105,062
ChemSpider 2008.09.28	1,178,614	124,046
PubChem (DiscoveryGate)	1,011,345	8,049
PubChem (NextBio)	958,696	1,417
PubChem (Thomson Pharma)	761,176	42,357
PubChem (MolPort)	428,117	10,980
PubChem (ChemDB)	421,222	198
PubChem (ZINC)	362,513	61,484
PubChem (Ambinter)	297,530	9,077
PubChem (ChEMBL)	196,496	4,182
PubChem (ChemBank)	195,104	16,284
PubChem (Vitas-M Laboratory)	117,951	35
PubChem (ChemIDplus)	117,865	2,357
PubChem (BindingDB)	111,452	2,458
PubChem (DTP/NCI)	96,996	5,118
PubChem (NIAID)	87,586	1,331
PubChem (ChemBridge)	81,525	0
PubChem (ASINEX)	75,916	0
PubChem (MLSMR)	75,731	659
PubChem (Specs)	67,722	3
PubChem (LeadScope)	65,416	257
PubChem (ICCB-Longwood/NSRB Screening Facility, Harvard Medical School)	64,003	493
PubChem (ChemExper Chemical Directory)	61,331	0
PubChem (NIST)	53,532	6
PubChem (AAA Chemistry)	50,033	47
PubChem (ChemBlock)	48,910	0
PubChem (NovoSeek)	43,807	47
PubChem (Emory University Molecular Libraries Screening Center)	42,481	6
PubChem (Southern Research Institute)	39,224	1
PubChem (MTDP)	36,849	2
PubChem (NCGC)	35,575	291
Metabolome.JP	25,396	806
PubChem (Burnham Center for Chemical Genomics)	23,964	19
PubChem (Abbott Labs)	22,196	287
PubChem (Broad Institute)	20,506	187
PubChem (Sigma-Aldrich)	18,920	2
PubChem (NIST Chemistry WebBook)	18,501	0
PubChem (NMRShiftDB)	16,896	20
PubChem (UPCMLD)	15,836	6
PubChem (IS Chemical Technology)	15,583	267
PubChem (GLIDA, GPCR-Ligand Database)	14,497	318
KNApSAcK 2011	13,869	222
PubChem (The Scripps Research Institute Molecular Screening Center)	12,546	14
PubChem (MMDB)	12,495	862
PubChem (Kingston Chemistry)	12,301	0
PubChem (KEGG)	11,732	79
PubChem (MP Biomedicals)	10,719	263
PubChem (ChemSynthesis)	10,570	3
PubChem (ChEBI)	9,901	428
PubChem (GlaxoSmithKline (GSK))	9,728	96
PubChem (Aronis)	9,699	1
PubChem (HDH Pharma)	9,643	3
PubChem (TCI (Tokyo Chemical Industry))	9,469	110
PubChem (Hangzhou APIChem Technology)	6,893	0
KNApSAcK v1.200.03	6,772	1
KNApSAcK v1.200.02	6,724	4
KNApSAcK	6,037	0
PubChem (SMID)	5,438	0
PubChem (EPA DSSTox)	5,423	10
PubChem (DrugBank)	5,398	15
PubChem (BioCyc)	5,237	810
PubChem (Hangzhou Trylead Chemical Technology)	4,998	3
PubChem (Prous Science Drugs of the Future)	4,819	9
PubChem (LipidMAPS)	4,780	76
PubChem (Tractus)	4,515	33
PubChem (Alinda Chemical)	4,421	0
PubChem (NMMLSC)	4,357	2
PubChem (R&D Chemicals)	4,202	0
PubChem (Nature Chemical Biology)	3,918	125
PubChem (Jamson Pharmachem Technology)	3,334	21
PubChem (Comparative Toxicogenomics Database)	3,272	13
PubChem (KUMGM)	3,215	6
PubChem (Shanghai Institute of Organic Chemistry)	3,140	253
PubChem (Tyger Scientific)	2,645	1
PubChem (PDSP)	2,628	0
PubChem (xPharm)	1,959	1
PubChem (Ennopharm)	1,933	3
Human Metabolome Database	1,815	6
PubChem (ORST SMALL MOLECULE SCREENING CENTER)	1,653	0
Target Lipids	1,574	615
PubChem (Nature Chemistry)	1,514	142
PubChem (Calbiochem)	1,456	13
PubChem (University of Pittsburgh Molecular Library Screening Center)	1,400	0
PubChem (Vanderbilt Specialized Chemistry Center)	1,384	71
PubChem (Biosynth)	1,326	37
PubChem (BIDD)	1,320	0
PubChem (Exchemistry)	1,290	2
PubChem (CMLD-BU)	1,269	0
PubChem (UCLA Molecular Screening Shared Resource)	1,238	1
PubChem (MOLI)	1,219	1
Yeast Metabolome Database (2011)	1,183	35
PubChem (Circadian Research, Kay Laboratory, University of California at San Diego (UCSD))	1,179	0
Maximum Recommended Therapeutic Dose (MRTD) Database	1,101	0
PubChem (BIND)	982	0
PubChem (NINDS Approved Drug Screening Program)	964	0
PubChem (UM-BBD)	897	8
PubChem (Total TOSLab Building-Blocks)	782	0
PubChem (InFarmatik)	726	0
PubChem (Golm Metabolome Database (GMD), Max Planck Institute of Molecular Plant Physiology)	723	2
PubChem (NIH Clinical Collection)	705	2
YEASTNET Vers. 4, (2011)	521	0
PubChem (Alsachim)	517	8
PubChem (MIC Scientific)	508	0
PubChem (Molecular Libraries Program, Specialized Chemistry Center, University of Kansas)	431	5
PubChem (Selleck Chemicals)	407	2
PubChem (Biological Magnetic Resonance Data Bank (BMRB))	394	0
PubChem (CC_PMLSC)	358	0
PubChem (Shanghai Sinofluoro Scientific Company)	349	0
PubChem (MICAD)	339	36
PubChem (True PharmaChem)	316	0
PubChem (Columbia University Molecular Screening Center)	312	1
PubChem (SGCOxCompounds)	277	0
PubChem (EMD Biosciences)	267	7
PubChem (SRMLSC)	249	0
PubChem (Avanti Polar Lipids)	232	109
PubChem (Nantong Baihua Bio-Pharmaceutical Co., Ltd)	174	0
PubChem (Creasyn Finechem)	170	1
PubChem (Structural Genomics Consortium)	89	0
PubChem (IUPHAR-DB)	88	0
PubChem (Chemical Biology Department, Max Planck Institute of Molecular Physiology)	82	0
PubChem (PCMD)	79	1
PubChem (Amatye)	68	0
PubChem (PANACHE)	65	0
PubChem (iThemba Pharmaceuticals)	65	0
PubChem (Vanderbilt University Medical Center)	55	12
PubChem (Excenen Pharmatech)	43	0
PubChem (PENN-ABS)	39	0
PubChem (Ambit Biosciences)	38	0
PubChem (Nature Communications)	36	0
PubChem (Vanderbilt Screening Center for GPCRs, Ion Channels and Transporters)	30	0
PubChem (Johns Hopkins Ion Channel Center)	20	0
PubChem (SGCStoCompounds)	17	0
PubChem (Web of Science)	16	0
PubChem (Zancheng Functional Chemicals)	14	1
PubChem (Southern Research Specialized Biocontainment Screening Center)	14	0
PubChem (Laboratory of Environmental Genomics, Carolina Center for Computational Toxicology, University of North Carolina at Chapel Hill)	14	0
PubChem (PennChem-GAM)	12	1
PubChem (Annker Organics)	12	0
PubChem (Nitric Oxide Research, National Cancer Institute (NCI))	8	0
PubChem (Paul Baures)	6	0
PubChem (Isoprenoids)	4	0
PubChem (Ganolix LifeScience)	2	0
PubChem (CLRI (CSIR))	2	0
PubChem (Finley and King Labs, Harvard Medical School)	1	0
PubChem (Bioprocess Technology Lab, Department of Microbiology, Bharathidasan University)	1	0
PubChem (VIT University)	1	0

PubChem was imported on 15 February 2011. "PubChem" refers to PubChem Compound, whereas "PubChem (...)" refers to PubChem Substance with the specific sub-database.
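
The two numeric columns of the table can be understood as simple set statistics. The sketch below shows the idea on a made-up toy data set; the depositor names, the formulas and the function name `depositor_stats` are purely illustrative, and the real GoBioSpace import is certainly implemented differently.

```python
from collections import Counter


def depositor_stats(formulas_by_depositor):
    """Return {depositor: (total formulas, formulas found only at this depositor)}."""
    # Count in how many depositors each formula occurs.
    counts = Counter()
    for formulas in formulas_by_depositor.values():
        counts.update(set(formulas))

    stats = {}
    for depositor, formulas in formulas_by_depositor.items():
        formulas = set(formulas)
        unique = sum(1 for f in formulas if counts[f] == 1)
        stats[depositor] = (len(formulas), unique)
    return stats


# Toy data, not taken from GoBioSpace.
toy = {
    "Depositor A": {"C6H12O6", "C3H6O3", "C2H6O"},
    "Depositor B": {"C6H12O6", "C5H12O5"},
    "Depositor C": {"C6H12O6", "C3H6O3"},
}

for name, (total, unique) in depositor_stats(toy).items():
    print(name, total, unique)
```

Note that a depositor can be large and still contribute nothing on its own, which matches the many zeros in the right-hand column of the table.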

One striking observation is that the later ChemSpider import from 2011 has a large proportion of unique formulas which are not included in the import from 2008. In fact, many of those formulas are tagged as "This record is deprecated and may be removed soon." on the ChemSpider website. What does this mean for the chemical formula? Is the formula valid?

Another observation, from the import of the Yeast Metabolome Database, is that smaller databases also contribute formulas which are not yet included in the larger databases PubChem and ChemSpider.

Any thoughts?

new mass spectral library made available

published Wednesday, November 2, 2011 by jahu
Today I exported the current mass spectral library and linked the files on the GMD download web page.

We now permit the download of the GMD mass spectral reference library under the Creative Commons Attribution-ShareAlike 3.0 License (many thanks to Steffen Neumann for helping us to choose the right license). This overdue export was postponed again and again because I wanted to clean up some issues in the export application. Also, considering the seven export formats, two retention index markers and two GC columns, giving rise to 7*2*2=28 different files, I wanted to implement a fully automated export first.
In comparison to the deprecated library from June 2010, we added the library (~660 spectra) from Prof. Schomburg's department at Technische Universität Carolo-Wilhelmina in Braunschweig.
If you would also like to see your spectra integrated into the GMD, I strongly encourage you to submit them by email in any library format you prefer.
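
To make the 7*2*2=28 combinations concrete, here is a small sketch that enumerates them. It is not the export application itself: the file name pattern is guessed from a download name like GMD_20111121_VAR5_ALK_MSL.txt, only MSL and MSP are format names I am sure of here (the other five are placeholders), and the date is just a parameter.

```python
from itertools import product

# Seven export formats: two known names plus five placeholders (assumption).
FORMATS = ["MSL", "MSP"] + ["FORMAT%d" % i for i in range(3, 8)]
MARKERS = ["ALK", "FAME"]    # two retention index marker systems
COLUMNS = ["VAR5", "MDN35"]  # two GC columns


def export_file_names(date="20111121"):
    """List one hypothetical file name per format, marker and column combination."""
    return [
        "GMD_%s_%s_%s_%s.txt" % (date, column, marker, fmt)
        for fmt, marker, column in product(FORMATS, MARKERS, COLUMNS)
    ]


names = export_file_names()
print(len(names))  # 7 * 2 * 2 combinations
```

Generating the list from the three dimensions is what makes an automated export attractive: adding one more format extends the list by four files without any manual bookkeeping.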

cheers,
Jan

severe bug in functional group prediction

published Thursday, October 27, 2011 by jahu
Jan Lisec pointed me to a severe bug in the functional group prediction feature implemented in the GMD. It turned out that I normalised spectra for decision tree training differently from spectra for spectral classification. Jan pointed me to a spectrum where the Phosphoric Acid Deriv group was predicted as present based on m/z 299, although this particular mass had only a minimal intensity in this GC-MS spectrum.
The only good news is that the validation for the publication was not affected by this bug, because the cross validation is performed without this web interface. However, I removed this bug and want to apologise for this inconvenience.
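
To illustrate why such a mismatch matters, here is a minimal sketch. It assumes, purely for illustration, that one code path scales a spectrum to its total intensity and the other to its base peak; the actual normalisations, the threshold and the spectrum used in the GMD are not shown here and all values below are made up.

```python
def normalise_to_total(spectrum, scale=1000.0):
    """Scale intensities so that they sum up to `scale`."""
    total = sum(spectrum.values())
    return {mz: scale * i / total for mz, i in spectrum.items()}


def normalise_to_base_peak(spectrum, scale=1000.0):
    """Scale intensities so that the largest peak equals `scale`."""
    top = max(spectrum.values())
    return {mz: scale * i / top for mz, i in spectrum.items()}


def rule_fires(spectrum, mz=299, threshold=50.0):
    """A stand-in for one decision tree split: is the intensity at mz above the threshold?"""
    return spectrum.get(mz, 0.0) > threshold


# Made-up spectrum with only a minor peak at m/z 299.
raw = {73: 500.0, 147: 300.0, 299: 40.0, 300: 160.0}

print(rule_fires(normalise_to_total(raw)))      # threshold learned on this scale
print(rule_fires(normalise_to_base_peak(raw)))  # same rule, different normalisation
```

With the same raw spectrum, the split is not taken on the scale the threshold was learned on, but it is taken after the other normalisation. That is the kind of false positive a minor m/z 299 peak can cause.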

Thanks Jan!

cheers,
the other Jan ;-)

GMD web site polished with JavaScript controls from HighCharts and ChemDoodle

published Tuesday, October 25, 2011 by jahu
I just finished some work on the GMD website to improve its usability. I replaced the images of mass spectra with interactive JavaScript controls from HighCharts, in which the user can zoom in without any effort. An example mass spectrum of Alanine (3TMS) can be seen here. I also used this control to visualise the north-south plot in the mass spectral matching result view. It took me some time to figure out all the settings, and it is still not perfect, because the data labels of the mass peaks are rendered across the mass peaks.
At the same time I replaced all images of chemical structures with interactive JavaScript controls from ChemDoodle Web Components. The old images were rendered using the Hyleos .Net ChemLib control, which failed to render some of the more complex structures.

I very much appreciate these free resources from ChemDoodle and HighCharts!

cheers
Jan

GoBioSpace goes R

published Tuesday, August 23, 2011 by jahu

With the kind help of Duncan Temple Lang (SSOAP package on omegahat, http://www.omegahat.org/SSOAP/ ) I could integrate GoBioSpace with R.

Here is some R example code:

library(SSOAP)

gobi.url = "http://gmd.mpimp-golm.mpg.de/webservices/wsGoBioSpace.asmx?WSDL"

gobi.wsdl = processWSDL(gobi.url, port = 1)

gobi.iface = genSOAPClientInterface(, gobi.wsdl)

names(gobi.iface@functions)

The next statement creates a session for searching the Depositors "Human Metabolome Database" (id=3), "KNApSAcK v1.200.03" (id=6) and "Metabolome.JP" (id=8), using the Adducts "Protonation [M + H+]+" (id=2) and "KaliumAdduct [M + K+]+" (id=3). Here you can find the full list of Depositors and Adducts!

session = gobi.iface@functions$CreateSession(DepositorIds = c(3, 6, 8), AdductIds = c(2L, 3L))

sm = gobi.iface@functions$SearchMass12C(SessionID = session, mass = 579.1705, tolerance = 0.001)

names(sm)

sapply(sm, slot, "fID")

synonyms = gobi.iface@functions$GetSynonyms(SessionID = session, FormulaID = 21169)

Your feedback is highly appreciated!

cheers,
Jan

GoBioSpace Features Isotope and Adduct Identification

published Tuesday, August 23, 2011 by jahu
I just released a new version 2.1.2.0 of the GoBioSpace search application, which can now connect single masses in lists of retention time, accurate mass and intensity based on the ¹²C and ³⁴S isotopic patterns.

In addition, we now use adducts (complete list of supported adducts) to identify the compound mass from two different masses which fit together considering the mass shift of two different adducts.
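
The adduct idea can be sketched in a few lines. This is not the GoBioSpace implementation, just a minimal illustration: two measured masses are paired when their difference equals the difference of two adduct mass shifts within a tolerance. The shifts below are the usual values for singly charged adducts; the sodium adduct, the tolerance and the example masses are my additions for illustration.

```python
from itertools import combinations

# Mass shift in Da added to the neutral monoisotopic mass M.
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,  # not mentioned in the post, added for illustration
    "[M+K]+": 38.963158,
}


def adduct_pairs(masses, tolerance=0.001):
    """Find pairs of masses that fit two different adducts of the same compound."""
    adducts = sorted(ADDUCT_SHIFT.items(), key=lambda kv: kv[1])
    hits = []
    for m1, m2 in combinations(sorted(masses), 2):
        for (a1, s1), (a2, s2) in combinations(adducts, 2):
            if abs((m2 - m1) - (s2 - s1)) <= tolerance:
                # Average the two estimates of the neutral mass.
                neutral = ((m1 - s1) + (m2 - s2)) / 2
                hits.append((m1, a1, m2, a2, round(neutral, 4)))
    return hits


# Hypothetical peak list: two of the masses belong to the same compound.
for hit in adduct_pairs([181.0707, 219.0265, 250.0]):
    print(hit)
```

The benefit of such a pair is that the neutral compound mass no longer depends on guessing a single adduct, which narrows down the formula search.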

Your feedback is highly appreciated!

cheers,
Jan

Transfering Retention Indexes

published Tuesday, May 31, 2011 by jahu

Gas chromatography (GC) coupled to mass spectrometry (MS) relies on two cornerstones to identify metabolites in complex samples typically derived from plants. First, the retention index (RI, i.e. the Kováts retention index), which is a number derived from the retention time that describes the chromatographic behaviour of a substance in a particular chromatographic setup. Second, the mass spectrum, as shown here for Alanine (2TMS). For convenience we combine these features into a mass spectral tag (MST) and link MSTs to the appropriate analytes.

While differences in detector technology, namely quadrupole, ion trap and time of flight, can be deemed irrelevant for mass spectral matching, chromatography settings that vary between laboratories, such as temperature programming, type of capillary column and choice of column manufacturer, heavily affect the empirically determined RI properties. Procedures for the transfer of RI properties between chromatography variants are, therefore, highly relevant for shared library use. We assessed the accuracy of RI transfer between chromatography variants (Strehmel, N., Hummel, J., Erban, A., Strassburg, K. and Kopka, J. (2008) Retention index thresholds for compound matching in GC-MS metabolite profiling, Journal of Chromatography B, 871, 182-190. http://dx.doi.org/10.1016/j.jchromb.2008.04.042) and found regressions that transfer empirically determined RI properties favourable compared to retention indexes estimated from physico-chemical properties, as described by S. E. Stein and coworkers in Estimation of Kováts Retention Indices Using Group Contributions.

Here I report on GmdRiTransfer_01, which transfers RIs from a 5%-phenyl-95%-dimethylpolysiloxane (VAR5) column to a 35%-phenyl-65%-dimethylpolysiloxane (MDN35) capillary column. To make things more complicated, the retention indexes on the VAR5 variant are based on the n-alkane homologues, while the retention indexes on the MDN35 variant are based on fatty acid methyl esters (FAME).

The figure above depicts the training data: 899 blue dots represent analytes in the GMD with a VAR5 ALKANE RI (typically applied in the J. Kopka group) and an MDN35 FAME RI (typically utilised in the Willmitzer department). The analyte and spectrum GUIDs (globally unique identifiers) in the data files can be resolved using GMD's text search facility, e.g. http://gmd.mpimp-golm.mpg.de/search.aspx?query=F3EF65C6-A321-4F89-A22A-0D587219B60E. The training data are approximated using a logarithmic regression [y = 807112.206965427 * ln(x) - 5440118.048605500] (yellow graph), resulting in an averaged relative error of 5% and 7% (S.D.) and exhibiting an R² = 0.950764659. Finally, 2,340 analytes in the GMD lacking an empirically determined MDN35 FAME RI were projected from the VAR5 ALKANE variant into the MDN35 FAME variant (greenish data points).
I will make new libraries available for download soon; however, please feel free to request an up-to-date GMD download.
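
For those who want to play with the numbers, the logarithmic regression can be sketched as follows. The two coefficients are the ones quoted above. Everything else is an assumption for illustration: I read x as the VAR5 ALKANE RI and y as the MDN35 FAME RI, the function names are made up, and the fit is a plain least-squares fit on ln(x), which is not necessarily how the original regression was computed.

```python
import math

# Coefficients quoted in the post for GmdRiTransfer_01.
A = 807112.206965427
B = -5440118.048605500


def var5_alkane_to_mdn35_fame(ri):
    """Project a VAR5 ALKANE RI onto the MDN35 FAME scale: y = A * ln(x) + B."""
    return A * math.log(ri) + B


def fit_log(xs, ys):
    """Least-squares fit of y = a * ln(x) + b; returns (a, b)."""
    ls = [math.log(x) for x in xs]
    n = len(ls)
    mean_l = sum(ls) / n
    mean_y = sum(ys) / n
    sxx = sum((l - mean_l) ** 2 for l in ls)
    sxy = sum((l - mean_l) * (y - mean_y) for l, y in zip(ls, ys))
    a = sxy / sxx
    b = mean_y - a * mean_l
    return a, b


# Synthetic check: points generated from the quoted curve give back its coefficients.
xs = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]
ys = [var5_alkane_to_mdn35_fame(x) for x in xs]
a, b = fit_log(xs, ys)
print(a, b)
```

With real training data, `xs` and `ys` would be the paired RIs of the analytes measured on both chromatographic variants, and the residuals of the fit would give the error estimate discussed above.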

Welcome

published Tuesday, May 31, 2011 by jahu
Hello,
I created this blog to provide updates and other useful information about the Golm Metabolome Database.

greetings,
Jan