
Researchers from the Henry and Marilyn Taub Faculty of Computer Science have developed an AI-based method that accelerates DNA-based data retrieval by three orders of magnitude while significantly improving accuracy. The research team included Ph.D. student Omer Sabary, Dr. Daniella Bar-Lev, Dr. Itai Orr, Prof. Eitan Yaakobi, and Prof. Tuvi Etzion.
The research is published in the journal Nature Machine Intelligence.
DNA data storage is an emerging field that uses DNA as a platform for storing data. DNA offers significant advantages as a storage medium, including:
- Long-term preservation: In 2013, researchers in Denmark successfully extracted DNA from a horse bone dating back 700,000 years. In 2021, an international team recovered DNA from mammoths that lived over 1,000,000 years ago. In contrast, magnetic disks used in data centers have lifespans measured in years or, at best, a few decades. This highlights DNA’s potential for long-term storage.
- Energy and cost efficiency: The “cloud” that powers most of today’s computing services relies on data centers that consume roughly 3% of global electricity and emit around 2% of total carbon emissions. With the exponential growth of data, the environmental impact of existing technologies is expected to increase significantly.
- Unmatched data density: DNA storage offers data density up to 100 million times greater than conventional digital storage. This means a volume currently holding one megabyte could theoretically store up to 100 terabytes using DNA.
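The density figure can be checked with a few lines of arithmetic. This is only a sanity check of the numbers quoted above, assuming decimal units (1 megabyte = 10^6 bytes, 1 terabyte = 10^12 bytes); the variable names are illustrative.

```python
# Decimal (SI) units are an assumption; the article does not specify them.
MEGABYTE = 10**6
TERABYTE = 10**12

# "Up to 100 million times" greater density, as quoted in the article.
DENSITY_FACTOR = 100_000_000

# A volume that holds one megabyte today, scaled by the density factor.
dna_capacity_bytes = 1 * MEGABYTE * DENSITY_FACTOR
dna_capacity_terabytes = dna_capacity_bytes / TERABYTE

print(dna_capacity_terabytes)  # 100.0
```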
DNA is a molecule composed of a sequence of organic compounds called nucleotides. These nucleotides are classified into four types, represented by the letters A, C, G, and T. Unlike traditional computing, where information is encoded using only two digits (0 and 1), DNA storage is based on sequences of four letters, dramatically increasing the number of possible combinations.
To write (store) data with this technology, DNA synthesis is required: creating DNA molecules based on the sequences that encode the information. To read the stored data, DNA sequencing is necessary.
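To make the four-letter alphabet concrete, here is a minimal sketch of how binary data could be mapped onto nucleotides, two bits per letter. The specific bit-to-base mapping and the function names are assumptions for illustration; the article does not describe the encoding the researchers use, and practical schemes add constraints and error-correction codes on top of a mapping like this.

```python
# Hypothetical mapping: the article does not say which bits map to which base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as a nucleotide string, two bits per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Decode a nucleotide string produced by bytes_to_dna back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = bytes_to_dna(b"Hi")
print(strand)                # CAGACGGC
print(dna_to_bytes(strand))  # b'Hi'
```

Each byte becomes four nucleotides, so a strand of n letters carries 2n bits in this simplified, error-free setting.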

Challenges in DNA data storage
Developing DNA-based storage technology presents several technological challenges:
- Both synthesis and sequencing are lengthy and error-prone processes, introducing deletion, insertion, and substitution errors
- Due to limitations of the synthesis process, multiple copies of each DNA molecule encoding the data are produced. These copies are stored together, unordered, in a storage container
- During sequencing, many copies of these molecules are retrieved, most of them containing errors, while some molecules disappear entirely
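The challenges listed above can be pictured with a toy simulation. This is a sketch under simple assumptions, not the simulator developed at Technion: every error probability, the number of copies, and the function names are hypothetical, and real synthesis and sequencing error profiles are more complicated than independent per-position errors.

```python
import random

BASES = "ACGT"


def noisy_copy(strand, rng, p_sub=0.01, p_ins=0.01, p_del=0.01):
    """Return one copy of strand with random substitutions, insertions, deletions."""
    out = []
    for base in strand:
        r = rng.random()
        if r < p_del:
            continue  # deletion: the base is dropped
        if r < p_del + p_sub:
            # substitution: replace with a different base
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
        if rng.random() < p_ins:
            out.append(rng.choice(BASES))  # insertion: extra base after this one
    return "".join(out)


def simulate_reads(strands, rng, copies=5, p_lost=0.1, **error_rates):
    """Produce an unordered pool of noisy reads; some copies are lost entirely."""
    reads = []
    for strand in strands:
        for _ in range(copies):
            if rng.random() < p_lost:
                continue  # this copy never shows up in the sequencing output
            reads.append(noisy_copy(strand, rng, **error_rates))
    rng.shuffle(reads)  # copies are stored together with no ordering
    return reads


rng = random.Random(0)
pool = simulate_reads(["ACGTACGTACGT", "TTGACCAGGTCA"], rng, copies=4)
print(len(pool), "reads retrieved")
```

The retrieval problem is then to recover the original strands from a pool like this one, without knowing which read came from which strand.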

DNAformer: AI-powered data retrieval
The current research presents a comprehensive computational solution for retrieving data and correcting errors in complex DNA-based storage systems. Using advanced algorithms and coding techniques, the researchers have demonstrated that their solution reduces data retrieval and reading time from several days to just 10 minutes.
The Technion-developed method, DNAformer, is based on a transformer model trained on simulated data (generated using a simulator that was also developed at Technion) to reconstruct accurate DNA sequences from erroneous copies. The method also includes a custom error-correction code tailored for DNA, ensuring robust data integrity.
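To illustrate what reconstructing a sequence from erroneous copies means, the sketch below uses a plain per-position majority vote. This is a deliberately simple baseline swapped in for illustration, not DNAformer and not the authors' method: it assumes the reads have already been grouped by source strand and contain substitution errors only, so that all reads have the same length. Insertions and deletions shift positions and break this approach, which is part of why the problem calls for stronger tools. The function name is hypothetical.

```python
from collections import Counter


def majority_consensus(reads):
    """Estimate the original strand by majority vote at each position.

    Assumes equal-length reads of the same strand (substitution errors only).
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("this baseline only handles equal-length reads")
    # zip(*reads) yields one tuple of bases per position across all reads.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


reads = ["ACGT", "ACGA", "TCGT"]  # each read differs from ACGT in at most one place
print(majority_consensus(reads))  # ACGT
```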
Additionally, an extra safety-margin mechanism detects particularly noisy DNA sequences (noise here means unwanted signals or errors that occur during the sequencing process and can interfere with the accurate interpretation of the data) and applies powerful algorithmic tools to handle them efficiently. At the end of the process, the data is converted back into digital information.
The new method enables the reading of 100 megabytes of data at a speed 3,200 times faster than the most accurate existing method, without any loss of accuracy. Compared to previously known fast methods, DNAformer also improves accuracy by up to 40% while significantly reducing processing time. This was demonstrated on a 3.1-megabyte dataset, which included:
- A color still image
- A 24-second audio clip of astronaut Neil Armstrong’s words on the moon
- A written text discussing DNA’s advantages as a promising data storage method
- Random data to illustrate the applicability to encrypted or compressed data
The researchers plan to develop customized versions of DNAformer tailored to different needs. They emphasize that their technology is scalable and adaptable, meaning it can be optimized for large-scale data storage applications, meeting market demands and future developments in DNA synthesis and sequencing.
More information:
Daniella Bar-Lev et al, Scalable and robust DNA-based storage via coding theory and deep learning, Nature Machine Intelligence (2025). DOI: 10.1038/s42256-025-01003-z
Technion – Israel Institute of Technology
Quotation:
DNA data storage: AI method speeds up data retrieval by 3,200 times (2025, March 21)
retrieved 22 March 2025
from https://techxplore.com/information/2025-03-dna-storage-ai-method.html