The CEU, particularly the Butterworth group, have conducted several studies using multiplex proteomic assays from SomaLogic and Olink, both in the INTERVAL study and as part of global consortia, such as the SCALLOP Consortium. These projects have primarily involved identifying the genomic determinants of circulating protein levels and using them either to gain aetiological insights or to train models that impute protein levels from genetic information.
As part of our commitment to open science, we routinely make our GWAS summary statistics publicly available for download and reuse in other projects. Below is information on where to find GWAS summary statistics for several of our completed projects.
SCALLOP Consortium meta-analysis GWAS summary statistics for the Olink Inflammation panel – Zhao et al, Nature Immunology 2023
In 2023, an international team led by Jing Hua Zhao, James Peters and Adam Butterworth published a paper in Nature Immunology describing a GWAS meta-analysis of studies from the SCALLOP Consortium involving >15,000 participants. The studies had all used Olink’s 92-protein Inflammation panel to assay circulating protein levels and then run association tests with imputed genome-wide array data.
The full genetic association summary statistics for these proteins are available on our Box.com site: https://app.box.com/s/m3y9n651w2qii61ag20dxzo75aqwmiql
Users can download files directly from the website or connect to the site using rclone or LFTP. For instructions on how to use rclone or LFTP, please contact Adam Butterworth (asb38@medschl.cam.ac.uk).
For each of the 91 proteins analysed in the paper, there is a gzipped .tbl file containing the METAL output, a .info file and a .log file.
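As a rough illustration of working with one of these gzipped .tbl files after download, the Python sketch below streams a file and keeps rows below a p-value threshold. It assumes the file is tab-delimited with a header row; the column name passed as `p_column` and the file name in the usage note are hypothetical, so inspect the real header with `read_header` first.

```python
import csv
import gzip


def read_header(path):
    """Return the column names from the first line of a gzipped, tab-delimited file."""
    with gzip.open(path, "rt", newline="") as handle:
        return next(csv.reader(handle, delimiter="\t"))


def filter_rows(path, p_column, threshold=5e-8):
    """Yield rows (as dicts) whose value in `p_column` is below `threshold`.

    Assumption: `p_column` holds plain p-values. If the file stores
    log-transformed p-values instead, adjust the comparison accordingly.
    """
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if p_column not in (reader.fieldnames or []):
            raise KeyError(f"column {p_column!r} not found in {path}")
        for row in reader:
            try:
                p_value = float(row[p_column])
            except (TypeError, ValueError):
                continue  # skip rows with a missing or non-numeric p-value
            if p_value < threshold:
                yield row
```

For example, `read_header("protein.tbl.gz")` lists the columns of a (hypothetically named) downloaded file, and `list(filter_rows("protein.tbl.gz", "P-value"))` would then collect the rows passing the conventional genome-wide significance threshold, provided a column with that name exists. Streaming row by row avoids loading a whole genome-wide file into memory.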
INTERVAL study SomaLogic plasma protein GWAS summary statistics – Sun et al, Nature 2018
In 2018, a team at CEU led by Ben Sun and Adam Butterworth published a paper in Nature describing a GWAS in 3,301 participants from the INTERVAL study in whom a SomaLogic aptamer-based plasma protein assay had been run, measuring ~3,600 proteins (see Sun et al., Nature, 2018). In total, we identified 1,927 associations (“pQTLs”) with 1,478 proteins, greatly enhancing our understanding of the genetic determinants of human plasma protein levels.
In addition to making the individual-level genetic and proteomic data available on request via the European Genome-Phenome Archive (https://ega-archive.org/studies/EGAS00001002555), we are also making publicly available the full genetic association summary statistics.
Files can be downloaded from Box.com: https://app.box.com/s/u3flbp13zjydegrxjb2uepagp1vb6bj2
Users can download files directly from the website or connect to the site using rclone or LFTP. For instructions on how to use rclone or LFTP, please contact Adam Butterworth (asb38@medschl.cam.ac.uk).
There are 3,283 folders on the site, one for each of the 3,283 SOMAmers used to assay the 2,995 proteins that passed quality control in the analyses. Each folder, which is named according to the ID of the SOMAmer, contains 22 gzipped text files, one for each autosomal chromosome analysed. (The same set of variants was analysed for each of the 3,283 SOMAmers.) The site also contains a .csv file that lists the SOMAmer ID, target protein, full name of the target protein, and UniProt ID for the 3,283 SOMAmers.
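The folder layout described above (one folder per SOMAmer, 22 gzipped text files inside) lends itself to a simple loop. The Python sketch below is one possible way to read a downloaded folder; the `.gz` file suffix, the tab delimiter and the presence of a header row are assumptions that should be checked against the actual files.

```python
import csv
import gzip
from pathlib import Path


def chromosome_files(folder):
    """Return the gzipped files in one SOMAmer folder, sorted by file name.

    The sort is lexicographic, so it is not guaranteed to follow numeric
    chromosome order.
    """
    files = sorted(Path(folder).glob("*.gz"))
    if len(files) != 22:
        raise ValueError(f"expected 22 chromosome files, found {len(files)}")
    return files


def iter_rows(folder, delimiter="\t"):
    """Yield (file name, row dict) pairs across all 22 chromosome files.

    Assumption: each file is delimited text with a header row.
    """
    for path in chromosome_files(folder):
        with gzip.open(path, "rt", newline="") as handle:
            for row in csv.DictReader(handle, delimiter=delimiter):
                yield path.name, row
```

The count check is there to catch incomplete downloads early, since each folder is expected to hold exactly one file per autosome.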