Version: 1.0.1 | Published: 30 Mar 2026 | Updated: 20 days ago

Genomics England - Bioinformatics

Dataset

Summary

Description:
Contains tables of genomic data and outputs from the Genomics England (GEL) interpretation pipeline for participants in both the cancer and rare disease programmes. These tables do not directly include primary or secondary sources of clinical data.
Identifier:
20
Access Tier:
  • Safeguarded
  • Controlled

Documentation

Documentation:
To identify and enrol participants for the 100,000 Genomes Project, we have created NHS Genomic Medicine Centres (GMCs). Each centre includes several NHS Trusts and hospitals. GMCs recruit and consent patients, then provide DNA samples and clinical information for analysis. Illumina, a biotechnology company, has been commissioned to sequence the DNA of participants and returns the whole genome sequences to Genomics England. We have created a secure, monitored infrastructure to store the genome sequences and clinical data. The data is analysed within this infrastructure, and any important findings, such as a diagnosis, are passed back to the patient’s doctor.
To help make sure that the project brings benefits for people who take part, we have created the Genomics England Clinical Interpretation Partnership (GeCIP). GeCIP brings together funders, researchers, NHS teams and trainees. They will analyse the data to help ensure benefits for patients and an increased understanding of genomics. The data will also be used for medical and scientific research, such as research into diagnosing, understanding or treating disease.
To learn more about how we work, you can read the 100,000 Genomes Project protocol. It has details of the development, delivery and operation of the project. It also sets out the patient and clinical benefit, the scientific and transformational objectives, the implementation strategy, and the ethical and governance frameworks.

Coverage

Spatial

Spatial Coverage:
UK

Temporal

Start Date:
01 January 2014
End Date:
01 January 2019
Date of Latest Release:
30 March 2023
Date of First Release:
30 March 2023
Temporal Aggregation:
  • Monthly
  • Point-in-Time

Provenance

Origin

Purpose:
General research use
Source:
The 100,000 Genomes Project Protocol v3, Genomics England. doi:10.6084/m9.figshare.4530893.v3. 2017. Publications that use the Genomics England Database should list the Genomics England Research Consortium as an author. Please see the publication policy.

Access and Governance

Access

Jurisdiction:
Great Britain
Data Controller:
GENOMICS ENGLAND
Data Processor:
GENOMICS ENGLAND
Licence:
Fees depend on the type of access required. Raw data is not eligible for export. Summary-level data may be exported provided that it is approved through the Genomics England Airlock Process.
Delivery Lead Time:
2-6 months

Format and Standards

Language:
English
Format:
Multiple Formats Available
Vocabulary Encoding Scheme:
OTHER
Conforms To:
OTHER

Observations

| Name | Population Type | Value | Description | Variable Measured | Unit Code | Observation Date |
| | Findings | 73517 | Rare Disease - Number of genomes | Count | | 30 March 2023 |
| | Findings | 17003 | Cancer Tumour - Number of genomes | Count | | 30 March 2023 |
| | Findings | 32753 | Cancer Germline - Number of genomes | Count | | 30 March 2023 |
| | Persons | 15624 | Cancer Participants | Count | | 30 March 2023 |
| | Persons | 72874 | Rare Disease Participants | Count | | 30 March 2023 |