Science News

An illustration of a mammoth standing on snowy land with a giant tusk and ribcage on the ground. In the background, the sun sets on a cloudy sky.

The last woolly mammoths offer new clues to why the species went extinct

The last population of woolly mammoths did not go extinct 4,000 years ago from inbreeding, a new analysis shows.

Child sacrifices at famed Maya site were all boys, many closely related

Horses may have been domesticated twice. Only one attempt stuck

More Stories in Genetics

Art of a police officer questioning a woman in a red dress. In the back, there are two crime scene technicians analyzing evidence. A splash of blood appears behind the woman.

Scientists are fixing flawed forensics that can lead to wrongful convictions

People have been wrongly jailed for forensic failures. Scientists are working to improve police lineups, fingerprinting and even DNA analysis.

An image of RNA

Thomas Cech’s ‘The Catalyst’ spotlights RNA and its superpowers

Nobel Prize-winning biochemist Thomas Cech’s new book is part ode to RNA and part detailed history of the scientists who’ve studied it.

A chimera pig embryo

50 years ago, chimeras gave a glimpse of gene editing’s future

Advances in gene editing technology have led to the first successful transplant of a pig kidney into a human.

Several ferns with forest in the background

The largest known genome belongs to a tiny fern

Though 'Tmesipteris oblanceolata' is just 15 centimeters long, its genome dwarfs humans’ by more than 50 times.

Here’s why some pigeons do backflips

Meet the scientist homing in on the genes involved in making parlor roller pigeons do backward somersaults.

Two chimpanzees hang from a rope with two hands above a grassy field. Both are facing away from the camera.

A genetic parasite may explain why humans and other apes lack tails

Around 25 million years ago, a stretch of DNA inserted itself into an ancestral ape’s genome, an event that might have taken our tails away.

Stacks of long tubes of various lengths are seen. Inside the tubes is a bright purple "filling." This is the long part of a nerve fiber called an axon. Around those fibers are thick tubes colored brownish-gray that form an insulating sheath around the nerve. Some wispy strands of connective tissue lie over some of the tubes. Connective tissue is colored hot pink.

Ancient viruses helped speedy nerves evolve

A retrovirus embedded in the DNA of some vertebrates helps turn on production of a protein needed to insulate nerve cells, aiding speedy thoughts.

A young female-presenting person with allergies sneezes into a white handkerchief. They have brown skin and black hair pulled back into a ponytail. They are wearing a light yellow shirt and a backpack with black straps with a neon green camping roll strapped across their shoulders. Trees in various shades of green are blurred in the background.

Newfound immune cells are responsible for long-lasting allergies

A specialized type of immune cell appears primed to make the type of antibodies that lead to allergies, two research groups report.

A photograph of Krystal Tsosie smiling in her white lab coat, which has an embroidered tortoise on it.

Geneticist Krystal Tsosie advocates for Indigenous data sovereignty

A member of the Navajo Nation, she believes Indigenous geneticists have a big role to play in protecting and studying their own data.

Single-step generation of homozygous knockout/knock-in individuals in an extremotolerant parthenogenetic tardigrade using DIPA-CRISPR

June 13, 2024

Image credit: pgen.1011298

Research article

A catalogue of recombination coldspots in interspecific tomato hybrids

Pooled-pollen sequencing to map regions where genetic recombination does and doesn’t occur when breeding tomatoes.

Image credit: pgen.1011336

Recently Published Articles

  • Across two continents: The genomic basis of environmental adaptation in house mice (Mus musculus domesticus) from the Americas
  • Exome sequencing identifies novel genetic variants associated with varicose veins
  • Regulators of rDNA array morphology in fission yeast

Current Issue June 2024

research article

Golgi associated RAB2 interactor protein family contributes to murine male fertility to various extents by assuring correct morphogenesis of sperm heads

GARINs are conserved in humans, and expanding the understanding of GARINs potentially contributes to the elucidation of human male infertility.

Image credit: pgen.1011337

Filamin protects myofibrils from contractile damage through changes in its mechanosensory region

Understanding how our muscles sustain contractile damage.

Image credit: pgen.1011101

Wild Patagonian yeast improve the evolutionary potential of novel interspecific hybrid strains for lager brewing

Using wild yeast strains to develop new brewing applications to expand the repertoire of de novo lager yeasts.

Image credit: pgen.1011154

An endothelial regulatory module links blood pressure regulation with elite athletic performance

Advancing our understanding of the molecular genetics of athletic performance and vascular traits in both horses and humans.

Image credit: pgen.1011285

History of tuberculosis disease is associated with genetic regulatory variation in Peruvians

Genetic variation explains risk of TB, by regulating the expression of genes involved in the control of Mtb infection.

Image credit: pgen.1011313

Conserved signalling functions for Mps1, Mad1 and Mad2 in the Cryptococcus neoformans spindle checkpoint

Mps1-dependent phosphorylation of C-terminal Mad1 residues is a critical step in Cryptococcus spindle checkpoint signalling. 

Image credit: pgen.1011302

Adaptations to nitrogen availability drive ecological divergence of chemosynthetic symbionts

The importance of nitrogen availability in driving the ecological diversification of chemosynthetic symbiont species.

Image credit: pgen.1011295

Paramutation at the maize pl1 locus is associated with RdDM activity at distal tandem repeats

pl1 paramutation depends on trans-chromosomal RNA-directed DNA methylation operating at a discrete cis-linked and copy-number-dependent...

Image credit: pgen.1011296

New PLOS journals accepting submissions

Five new journals unified in addressing global health and environmental challenges are now ready to receive submissions: PLOS Climate, PLOS Sustainability and Transformation, PLOS Water, PLOS Digital Health, and PLOS Global Public Health.

COVID-19 Collection

The COVID-19 Collection highlights all content published across the PLOS journals relating to the COVID-19 pandemic.

Submit your Lab and Study Protocols to PLOS ONE!

PLOS ONE is now accepting submissions of Lab Protocols, a peer-reviewed article collaboration with protocols.io, and Study Protocols, an article that credits the work done prior to producing and publishing results.

PLOS Reviewer Center

A collection of free training and resources for peer reviewers of PLOS journals—and for the peer review community more broadly—drawn from research and interviews with staff editors, editorial board members, and experienced reviewers.

Ten Simple Rules

PLOS Computational Biology's "Ten Simple Rules" articles provide quick, concentrated guides for mastering some of the professional challenges research scientists face in their careers.

Welcome New Associate Editors!

PLOS Genetics welcomes several new Associate Editors to our board: Nicolas Bierne, Julie Simpson, Yun Li, Hongbin Ji, Hongbing Zhang, Bertrand Servin, and Benjamin Schwessinger.

Expanding human variation at PLOS Genetics

The former Natural Variation section at PLOS Genetics relaunches as Human Genetic Variation and Disease. Read the editors' reasoning behind this change.

PLOS Genetics welcomes new Section Editors

Quanjiang Ji (ShanghaiTech University), who joined the editorial board, and Xiaofeng Zhu (Case Western Reserve University), who was promoted, are the new Section Editors for the PLOS Genetics Methods section.

PLOS Genetics editors elected to National Academy of Sciences

Congratulations to Associate Editor Michael Lichten and Consulting Editor Nicole King, who are newly elected members of the National Academy of Sciences.

Harmit Malik receives Novitski Prize

Congratulations to Associate Editor Harmit Malik, who was awarded the Edward Novitski Prize by the Genetics Society of America for his work on genetic conflict. Harmit has also been elected as a new member of the American Academy of Arts & Sciences.

Mushroom vector seamless repeat grey on black.

Out of Sight, ‘Dark Fungi’ Run the World from the Shadows

The land, water and air around us are chock-full of DNA from fungi that scientists can’t identify

Cody Cottier

Green fern on forest floor with brown leaves.

Tiny Fern Has World’s Largest Genome

A small South Pacific fern boasts more than 50 times as many base pairs as the human genome

Max Kozlov, Nature magazine

Illustration of active RNA molecules behind machines

Revolutionary Genetics Research Shows RNA May Rule Our Genome

Scientists have recently discovered thousands of active RNA molecules that can control the human body

Philip Ball

Two whiteflies against a green background

Stolen Bacterial Genes Helped Whiteflies to Become the Ultimate Pests

Rather than relying on bacteria, whiteflies cut out the middleman and acquired their own genes to process nitrogen

Rohini Subrahmanyam

Sugar glider, mid-air on black background

How Sugar Gliders Got Their Wings

Several marsupial species, including sugar gliders, independently evolved a way to make membranes that allow them to glide through the air

Viviane Callier

Top view of beetle.

Unraveling the Secrets of This Weird Beetle’s 48-Hour Clock

New research examines the molecular machinery behind a beetle’s strange biological cycle

Andrew Chapman

A seated woman in a courtroom holds a photo of family members

Forensic Genealogy Offers Families the Gift of Closure

The forensic scientist’s toolbox is growing thanks to creative methods that generate reliable leads, analyze evidence, identify suspects and solve cold cases

Nancy La Vigne

Plasmodium falciparum microscopic image.

Ancient Malaria Genome from Roman Skeleton Hints at Disease’s History

Genetic information from ancient Roman remains is helping to reveal how malaria has moved and evolved alongside people

Tosin Thompson, Nature magazine

Brown giant panda approaching on leafy ground.

Rare Brown Panda Mystery Solved after 40 Years

Chinese researchers have found the gene responsible for the brown-and-white fur of a handful of giant pandas

Xiaoying You, Nature magazine

Colorful DNA helix

What Do You Mean, Bisexual People Are ‘Risk-Taking’? Why Genetic Studies about Sexuality Can Be Fraught

A recent study on risk-taking and bisexuality made assumptions that some experts don’t agree with.

Tulika Bose, Lauren Leffer, Timmy Broderick

Coccyx, computer illustration

How Humans Lost Their Tails

A newly discovered genetic mechanism helped eliminate the tails of human ancestors

This Genetically Engineered Petunia Glows in the Dark and Could Be Yours for $29

The engineered “firefly petunia” emits a continuous green glow thanks to genes from a light-up mushroom

Katherine Bourzac, Nature magazine

Scientists Finish the Human Genome at Last

The complete genome uncovered more than 100 new genes that are probably functional, and many new variants that may be linked to diseases.

By Carl Zimmer

Two decades after the draft sequence of the human genome was unveiled to great fanfare, a team of 99 scientists has finally deciphered the entire thing. They have filled in vast gaps and corrected a long list of errors in previous versions, giving us a new view of our DNA.

The consortium has posted six papers online in recent weeks in which they describe the full genome. These hard-sought data, now under review by scientific journals, will give scientists a deeper understanding of how DNA influences risks of disease, the scientists say, and how cells keep it in neatly organized chromosomes instead of molecular tangles.

For example, the researchers have uncovered more than 100 new genes that may be functional, and have identified millions of genetic variations between people. Some of those differences probably play a role in diseases.

For Nicolas Altemose, a postdoctoral researcher at the University of California, Berkeley, who worked on the team, the view of the complete human genome feels something like the close-up pictures of Pluto from the New Horizons space probe.

“You could see every crater, you could see every color, from something that we only had the blurriest understanding of before,” he said. “This has just been an absolute dream come true.”

Experts who were not involved in the project said it will enable scientists to explore the human genome in much greater detail. Large chunks of the genome that had been simply blank are now deciphered so clearly that scientists can start studying them in earnest.

DNA and Genes

Genes are the blueprints of life. Genes control everything from hair color to blood sugar by telling cells which proteins to make, how much, when, and where. Genes exist in most cells. Inside a cell is a long strand of the chemical DNA (deoxyribonucleic acid). A DNA sequence is a specific lineup of chemical base pairs along its strand. The part of DNA that determines what protein to produce, and when, is called a gene.

First established in 1985 by Sir Alec Jeffreys, DNA testing has become an increasingly popular method of identification and research. The application of DNA testing, or DNA fingerprinting, within forensic science is often what most people think of when they hear the phrase. Popularized by television and cinema, using DNA to match blood, hair or saliva to criminals is one purpose of testing DNA. It is also frequently used for other purposes, such as wildlife studies, paternity testing, body identification, and studies of human dispersion.

While most aspects of DNA are identical in samples from all human beings, concentrating on patterns called microsatellites reveals qualities specific and unique to the individual. During the early stages of this science, a DNA test was performed using an analysis called restriction fragment length polymorphism. Because this process was extremely time consuming and required a great deal of DNA, newer methods such as polymerase chain reaction and amplified fragment length polymorphism have since been employed.

The benefits of DNA testing are ample. In 1987, Colin Pitchfork became the first criminal to be caught as a result of DNA testing. The information provided by DNA tests has also helped wrongfully incarcerated people like Gary Dotson and Dennis Halstead reclaim their freedom.

Latest about Genetics

A conceptual 3D illustration showing a strand of DNA being cut with large scissors

How does CRISPR work?

By Kamal Nahas last updated 1 July 24

CRISPR is a versatile tool for editing genomes and has recently been approved as a gene therapy treatment for certain blood disorders.

An artist's rendering of two strands of DNA, one blue and one pink, with tiny X and Y chromosomes in the background

Why genetic testing can't always reveal the sex of a baby

By Maggie Ruderman, Kimberly Zayhowski published 30 June 24

Gender and sex are more complicated than X and Y chromosomes.

Slightly blurred photo shows a person in blue pajamas holding a white comforter and moving their legs around, as if restless

Restless legs syndrome tied to 140 'hotspots' in the genome

By Emily Cooke published 6 June 24

A new study has identified more than 140 novel genetic risk factors associated with the development of restless legs syndrome.

Colorful illustration of three illuminated DNA molecules against a black background

'Fossil viruses' embedded in the human genome linked to psychiatric disorders

By Sahana Sitaraman published 3 June 24

Certain stretches of ancient viral DNA in the human genome may increase the chances of developing three neuropsychiatric disorders.

Medical illustration of a single strand of messenger RNA in pink. The molecule is slightly twisted and extends across the width of the image. The background is blurred but is a mixture of blue, pink, purple and green colors.

New genetic cause of intellectual disability potentially uncovered in 'junk DNA'

By Emily Cooke published 31 May 24

Mutations in "junk DNA" could be responsible for rare genetic cases of intellectual disability, new research hints.

Photo of a silverback gorilla walking on all fours in a field in front of trees, looking into the camera.

The same genetic mutations behind gorillas' small penises may hinder fertility in men

By Nicola Williams published 21 May 24

Scientists have used the gorilla genome to probe for previously unknown genes that may contribute to infertility in men.

Illustration of an early modern man embracing a Neanderthal woman. They appear to be in a forest at night. The moonlight is shining through the trees just behind them

'More Neanderthal than human': How your health may depend on DNA from our long-lost ancestors

By Emily Cooke published 17 May 24

Neanderthals and humans mated millennia ago, and their legacy lives on in us today. Here's how.

Illustration of a DNA double helix against a blue background. Two other helices can be seen blurred in the background

10 unexpected ways Neanderthal DNA affects our health

Around 2% of the genomes of modern Eurasians contains Neanderthal DNA. Here's how it affects our health.

Neanderthal man at the human evolution exhibit at the Natural History Museum.

The mystery of the disappearing Neanderthal Y chromosome

Non-Africans carry around 2% Neanderthal DNA in their genomes — yet there's one chromosome where DNA from our ancient cousins is nowhere to be found.

Image of a toddler girl sat with her mother on her left and her father on her right. They are all smiling at the camera. The mother is wearing a black-and-white polka dot top, the toddler is wearing a bright yellow top and the father is wearing a grey shirt.

Deaf baby can hear after 'mind-blowing' gene therapy treatment

By Emily Cooke published 16 May 24

Seven months after her treatment, the baby girl can now respond to her parents' voices without the aid of a cochlear implant.

Genes (Basel)

Breast Cancer Genetics: Diagnostics and Treatment

Carmen Criscitiello

1 Division of New Drugs and Early Drug Development for Innovative Therapies, European Institute of Oncology, IRCCS, 20141 Milan, Italy

2 Department of Oncology and Haematology (DIPO), University of Milan, 20122 Milan, Italy

Chiara Corti

Breast cancer (BC) genetics has become a fundamental aspect of BC management.

It influences screening, follow-up, prophylactic and therapeutic recommendations in women harboring a germline variant in a BC susceptibility gene. In addition, it helps to identify patient subgroups with either a different prognosis or a different response to treatment.

This Special Issue consists of one case report, two original research articles and five reviews, covering both diagnostic aspects and therapeutic implications of genetics in BC.

Pathogenic variants in BC susceptibility genes represent the strongest hereditary risk factor for disease development, particularly in the context of early onset breast cancer (EOBC). Indeed, around 10–20% of EOBC cases are hereditary [1]. Consequently, individuals with a personal or family history of breast, ovarian, prostate or pancreatic cancer may benefit from hereditary risk evaluation to determine their own risk and their family members’ risk for these and associated cancers. In this regard, Szczerba and colleagues examined 75 tumor samples from a cohort of Polish BC patients who had tested negative for targeted breast cancer susceptibility gene 1 (BRCA1) mutations (c.5266dupC, c.181T>G, c.4035delA, c.68_69delAG, c.3700_3704delGTAAA). All coding regions of the BRCA1/2 genes were sequenced with Next Generation Sequencing (NGS), which detected nine pathogenic variants and six variants of unknown significance (VUS). The authors also focused on methodological aspects of NGS, highlighting differences among the variant call format (VCF) files obtained from the same FASTQ file, depending on the variant calling algorithm used. The authors conclude that this observation could potentially affect the identification and interpretation of variants [2].
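To make the point about caller-dependent VCF output concrete, the following is a minimal Python sketch of how two call sets derived from the same FASTQ file might be compared. It is not the pipeline used by Szczerba and colleagues. The function names (`parse_vcf`, `compare_calls`) and the two toy VCF snippets are hypothetical, and the coordinates are invented for illustration only. The sketch matches variants by exact chromosome, position, reference and alternate allele, so it assumes both files have already been normalized in the same way; real comparisons usually need indel left-alignment and allele trimming first.

```python
from typing import Dict, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def parse_vcf(text: str) -> Set[Variant]:
    """Collect variant keys from VCF text, skipping header lines."""
    variants: Set[Variant] = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank lines, meta lines (##) and the column header (#CHROM)
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):  # multi-allelic records list ALTs comma-separated
            if alt != ".":  # "." means no alternate allele was called
                variants.add((chrom, int(pos), ref, alt))
    return variants


def compare_calls(vcf_a: str, vcf_b: str) -> Dict[str, Set[Variant]]:
    """Split two call sets into shared calls and calls unique to each caller."""
    a, b = parse_vcf(vcf_a), parse_vcf(vcf_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Toy, made-up records standing in for the output of two different callers.
HEADER = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
CALLER_A_VCF = HEADER + "chr17\t100\t.\tA\tG\t50\tPASS\t.\nchr17\t250\t.\tC\tT\t40\tPASS\t.\n"
CALLER_B_VCF = HEADER + "chr17\t100\t.\tA\tG\t48\tPASS\t.\nchr17\t300\t.\tG\tGA\t35\tPASS\t.\n"

if __name__ == "__main__":
    for label, calls in compare_calls(CALLER_A_VCF, CALLER_B_VCF).items():
        print(label, sorted(calls))
```

Calls that land in `only_a` or `only_b` are the discordant ones that would need manual review before interpretation.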

Moreover, recent studies have shown germline BRCA1/2 status to be clinically relevant in the selection of therapy for patients already diagnosed with BC. Indeed, BRCA status predicts responsiveness to platinum-based chemotherapy as well as to inhibitors of poly(ADP-ribose) polymerase (PARP), highlighting the ability of these interventions to inhibit DNA repair pathways. From a surgical standpoint, risk-reducing surgery remains a powerful tool in the therapeutic armamentarium for many women with genetic predisposing variants, as comprehensively highlighted by Berger and Golshan [3]. However, initial BC and contralateral BC risks should be clearly identified (i.e., highly penetrant genes compared to moderately penetrant genes) in order to fine-tune risk reduction strategies and their timing, in accordance with the patient’s personal preferences [3]. While the survival benefit related to prophylactic bilateral mastectomy has been established, a growing body of evidence supports the oncological safety of nipple-sparing mastectomy as a risk-reducing procedure in BRCA-mutated patients, with low rates of new BCs, low rates of postoperative complications and high levels of satisfaction and postoperative quality of life, as reported by Rocco et al. [4]. However, larger multi-institutional studies with longer follow-up are needed to establish this procedure as the best surgical option in this setting.

Besides BRCA1/2, pathogenic variants in other high- to moderate-risk genes such as tumor protein p53 (TP53), partner and localizer of BRCA2 (PALB2), phosphatase and tensin homolog (PTEN), checkpoint kinase 2 (CHEK2) and ataxia-telangiectasia mutated (ATM) account for a smaller percentage of BC, and, in some cases, ovarian, prostate or pancreatic cancers [5,6,7,8].

In particular, ATM is involved in cell cycle control, apoptosis, oxidative stress and telomere maintenance, and its role as a risk factor for cancer development is well established [9]. Recent studies confirmed that some variants of ATM are associated with intermediate- and high-grade disease, a higher rate of lymph node metastatic involvement, HER2 positivity and the development of a contralateral breast tumor, as described by Stucci and colleagues [9]. Clinicopathologic characteristics of BCs arising in ATM and CHEK2 mutation carriers were also explored by Toss and colleagues, who reviewed the archive of the local Family Cancer Clinic. Since 2018, 1185 multi-gene panel tests had been performed. In total, 19 ATM and 17 CHEK2 mutation carriers affected by 46 different BCs were identified. A high rate of bilateral tumors was observed in ATM (26.3%) and CHEK2 mutation carriers (41.2%). While 64.3% of CHEK2-mutant tumors were luminal A-like, 56.2% of ATM-mutant tumors were luminal B-like/HER2-negative. Moreover, 21.4% of CHEK2-related invasive tumors showed a lobular histotype. About a quarter of all ATM-related BCs and a third of CHEK2-related BCs were in situ carcinomas, and more than half of ATM- and CHEK2-related BCs were diagnosed at stage I-II. The biological and clinical characteristics of ATM- and CHEK2-related tumors may help improve diagnosis, prognostication and targeted therapeutic approaches. Importantly, the authors advise the consideration and discussion of contralateral mastectomy for ATM and CHEK2 mutation carriers at the first diagnosis of BC.

This growing body of data regarding the identification of new ATM aberrations, as well as their association with ancestry, prognosis and treatment outcomes, could support clinicians in personalizing both treatment and follow-up in these patients [10]. Moreover, since ATM is involved in DNA repair mechanisms, ATM aberrations may sensitize cancer cells to platinum-derived drugs and PARP inhibitors (PARPi), as BRCA1/2 mutations do. Some evidence suggests that ATM mutations could also be involved in the resistance to cyclin-dependent kinase 4 and 6 inhibitors (CDK4/6i) in luminal BC [10].

In this context, publicly available archives and case reports highlighting relationships among human gene variants and phenotypes are of particular importance. For example, Parenti et al. identified a new ATM deletion in a BRCA-negative patient who developed BC at the age of 34 [10]. Her mother had unilateral receptor-positive BC at the age of 45 with axillary lymph node involvement. The authors used the SOPHiA Genetics Hereditary Cancer Solutions gene panel to detect a copy number variant (CNV), which was first validated by Multiplex Ligation-dependent Probe Amplification (MLPA). Afterward, long-range Polymerase Chain Reaction (PCR) and Sanger sequencing were used to characterize the breakpoint at the DNA level (c.2838+2162_4110-292del) in the proband and to study segregation in the patient’s mother and sister. Further characterization at the RNA level of the proband’s mother and sister identified the presence of both the wild-type and the mutant allele in the mother’s sample. This abnormal ATM protein lacks the domain required for c-Abl protein interaction and mediation of cell cycle arrest in G1 phase. In addition, at least three other important domains are deleted from the ATM protein, namely the FAT (FRAP-ATM-TRRAP), PIKK (phosphatidylinositol 3-kinase-related kinase) and FATC (FAT C-terminal) domains, which mediate most ATM functions.

Siddig et al. focused more broadly on the genetic landscape of EOBC, since 10–20% of these cancers are related to germline BC susceptibility genes. The authors provide an overview of somatic mutations, chromosome CNVs, single-nucleotide polymorphisms (SNPs), differential gene expression, microRNAs and gene methylation profiles, as well as of the altered pathways resulting from those aberrations. Interestingly, the E-Cadherin/β-Catenin complex and the overall determinants of epithelial barrier integrity have been implicated in EOBC, with cell–cell adhesion genes such as CDH1, GATA3, CTNNB1, MUC17 and FLG involved [1,11]. Eight stromal genes are differentially expressed in breast tumors from very young patients (≤35 years) compared to tumors from older patients (≥50 years), with UQCRQ, ALDH1A3, EGLN1 and IGF1 being overexpressed and FUT9, IDI2, PDHX and CCL18 being underexpressed. The TP53 gene typically shows a high mutational load in EOBC and plays an important pathogenic role by affecting cell cycle arrest mechanisms and the transcription of other genes, such as GAS7b, which regulates cell structure and cell migration [12]. The aggressive characteristics of EOBC also appear to be linked to DNA methylation events [13]. EOBC displays several CNVs implicated in tumorigenesis (6q27, 6p32 and 7p21.1), advanced-stage tumor progression (22q12.3 and 22q13.31), disease progression (19q13.32) and prognosis (CNV in the BIRC5 gene). However, further studies that correlate the CNV profile with the gene and protein expression profile are needed. Finally, different SNPs may be linked to EOBC tumorigenesis, progression, resistance to chemotherapy and poor prognosis. Additionally, it is possible to discriminate BC arising in young women from that in older women using a microRNA profile [1].

In terms of future perspectives, even though several disease-causing mutations have been identified, therapy is often aimed at interfering with an aberrantly activated pathway, rather than rectifying the mutation in the DNA sequence. The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas9 is a groundbreaking tool that is being utilized for the identification and validation of genomic targets bearing tumorigenic potential. CRISPR/Cas9 supersedes its gene editing predecessors through its unparalleled simplicity, efficiency and affordability. Ahmed and colleagues provide an overview of the CRISPR/Cas9 mechanism and discuss genes that were edited using this system for the treatment of BC. In addition, the authors shed light on the delivery methods, both viral and non-viral, that may be used to deliver the system, as well as on the main challenges associated with each method. However, despite great expectations, remarkable limitations related to ethics, off-target effects, mutagenesis and delivery necessitate further studies. For the conventional use of this system in the near future, both precise knowledge of pathogenic variants as well as the optimization of the system itself are essential.

In conclusion, the papers in this Special Issue cover various aspects of genetics in BC. Overall, they provide a summary of hereditary BC syndromes, personalized BC risk assessments, as well as historical and novel risk reduction approaches. They also offer a comprehensive overview regarding major advances in understanding the most frequent genetic aberrations, with potential implications for present and future treatment approaches.

Conflicts of Interest

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.


5 takeaways from the Human Genome Project investigation

By Ashley Smart — Undark July 9, 2024

A digital representation of the human genome.

The Human Genome Project was among the most ambitious scientific efforts in modern history, with the aim of deciphering the chemical makeup of the entire human genetic code.

The sequence of some 3 billion DNA base pairs that comprise our genome was supposed to be a mosaic, assembled from multiple anonymous people to protect the identity of the volunteer donors. But that’s not how it turned out: One individual’s DNA accounted for the vast majority of the genome when a first draft was released in 2001.


The story of how that happened, and its ethical implications, has largely gone untold. To piece together this history, Undark examined more than 100 emails, letters, and other documents, and interviewed many of the project’s central figures.

Here are five key takeaways:

  • The sourcing of human DNA was more ethically fraught than the project’s landmark 2001 Nature paper portrayed it to be. The full genetic sequence was constructed not only from anonymously sourced donors, but also from non-anonymous donors — including one of the project’s own scientists — and from tissue harvested from a cadaver.
  • The project derived more than 70% of its published sequence from a single anonymous male donor, known as RP11 and thought to be of African American descent. This happened despite consent form language suggesting to donors that their DNA would constitute no more than 10% of the sequence, as a protective measure. Project leaders did not inform RP11 of the change.
  • Scientists and administrators involved in the project gave competing justifications for the pivot from a mosaic approach to one focused on a single donor. Some attributed it to technical difficulties of combining genetic sequences of multiple people, while others cited the pressure to beat a private-sector rival, Celera Genomics, in a race to complete the genome.
  • Legal and ethics experts found the project’s handling of donor sourcing and consent concerning. One expert described them as exemplifying a long history of deceptions that have contributed to a lack of trust in the research enterprise, especially in minoritized communities.
  • The genetic sequence that emerged from the Human Genome Project is still widely used today as a reference genome to support clinical practice and research. Although the sequence has been revised over the years, RP11 remains its centerpiece, still constituting more than 70% of the most commonly used versions.


Ashley Smart is the associate director of the Knight Science Journalism Program at MIT and a senior editor at Undark.



About Genes

Genes (ISSN 2073-4425) is an international, peer-reviewed open access journal which provides an advanced forum for studies related to genes, genetics and genomics. It publishes reviews, research articles, communications and technical notes. There is no restriction on the maximum length of the papers and we encourage scientists to publish their results in as much detail as possible.

This journal covers all topics related to genes, genetics and genomics, including, but not limited to:

  • DNA and RNA;
  • Genetic code, gene structure, and gene expression;
  • Chromosomes, recombination and linkage, and genetic mapping;
  • Transcriptional regulation, noncoding and other RNAs;
  • Cloning, genetically modified organisms;
  • Human genetics, medical genetics, gene therapy, and precision medicine;
  • Population genetics, conservation genetics, phylogenomics, and phylogenetics;
  • Functional genomics;
  • Sequencing technologies and bioinformatics;
  • Breeding and genetic selection;
  • Genetic toxicology, pharmacogenetics, and pharmacogenomics;
  • Genome editing;
  • Epigenetic therapy;
  • Environmental mutagenesis;
  • Developmental genetics;
  • Cytogenomics.



A telomere-to-telomere gap-free reference genome assembly of avocado provides useful resources for identifying genes related to fatty acid biosynthesis and disease resistance

Tianyu Yang, Yifan Cai, Tianping Huang, Danni Yang, Xingyu Yang, Xin Yin, Chengjun Zhang, Yunqiang Yang, Yongping Yang, A telomere-to-telomere gap-free reference genome assembly of avocado provides useful resources for identifying genes related to fatty acid biosynthesis and disease resistance, Horticulture Research , Volume 11, Issue 7, July 2024, uhae119, https://doi.org/10.1093/hr/uhae119

Abstract

Avocado ( Persea americana Mill.) is an economically valuable plant because of the high fatty acid content and unique flavor of its fruits. Its fatty acid content, especially the relatively high unsaturated fatty acid content, provides significant health benefits. We herein present a telomere-to-telomere gapless genome assembly (841.6 Mb) of West Indian avocado. The genome contains 40 629 predicted protein-coding genes. Repeat sequences account for 57.9% of the genome. Notably, all telomeres, centromeres, and a nucleolar organizing region are included in this genome. Fragments from these three regions were observed via fluorescence in situ hybridization. We identified 376 potential disease resistance-related nucleotide-binding leucine-rich repeat genes. These genes, which are typically clustered on chromosomes, may be derived from gene duplication events. Five NLR genes ( Pa11g0262 , Pa02g4855 , Pa07g3139 , Pa07g0383 , and Pa02g3196 ) were highly expressed in leaves, stems, and fruits, indicating they may be involved in avocado disease responses in multiple tissues. We also identified 128 genes associated with fatty acid biosynthesis and analyzed their expression patterns in leaves, stems, and fruits. Pa02g0113 , which encodes one of 11 stearoyl-acyl carrier protein desaturases mediating C18 unsaturated fatty acid synthesis, was more highly expressed in the leaves than in the stems and fruits. These findings provide valuable insights that enhance our understanding of fatty acid biosynthesis in avocado.

Introduction

Avocado ( Persea americana Mill.) is a tropical evergreen woody plant species originating from Central America. Its fruits are rich in nutritious, health-promoting, disease-preventing metabolites and have a creamy texture and a unique aroma because of a high fatty acid content, especially unsaturated fatty acids [ 1 ]. Avocado has been consumed for over 5000 years and is an economically valuable crop worldwide [ 2 , 3 ]. Over 8 million metric tons of avocado were produced worldwide in 2021 [ 4 ]. Several tropical countries, such as Mexico, Colombia, Peru, Indonesia, Dominican Republic, and Kenya, are major avocado producers, with an output exceeding 6.8 million metric tons in 2021 [ 4 ]. However, avocado production is beset by challenges. In Kenya, a major avocado exporter, more than 60% of avocado fruits do not meet international market standards because of low quality and damage due to anthracnose disease [ 5 ]. Wilt disease caused by Phytophthora cinnamomi has resulted in yield losses of ~40%–90% in Colombia and 20%–25% in California, USA, where 5% of the avocado-planting area is affected [ 6 ]. In addition, a disease induced by nectriaceous fungi has been detected in various regions (e.g. Australia, Chile, Colombia, and Italy), leading to considerable economic losses in the avocado industry [ 7–11 ].

There has been interest in the potential utility of genes encoding nucleotide-binding leucine-rich repeat receptor (NLR) proteins, which reportedly contribute to disease resistance. On the basis of a transcriptome analysis, Pérez-Torres et al. [ 12 ] determined that the expression levels of four unigenes ( UN003976 , UN001791 , UN003288 , and UN003220 ) encoding coiled-coil-type NLR proteins increased during the early stages of an infection by Fusarium kuroshium . Furthermore, there may be tissue-specific NLR network responses to specific pathogens in plants [ 12 ]. Therefore, identifying and functionally annotating NLR genes in avocado is critical for exploring their roles in immune responses to diseases.

As an economic crop with a rich cultivation history, avocado has been studied in terms of its substantial fatty acid content. The initiation of fatty acid biosynthesis involves acetyl coenzyme A (acetyl-CoA) and biochemical reactions in plastids that produce the 16:0-acyl carrier protein (ACP). Subsequently, 16:0-ACP is modified by numerous enzymatic reactions, with the resulting long-chain acyl-CoA recatalyzed and stored in the acyl-CoA pool within the endoplasmic reticulum. Concurrently, C18:0 and C18:1 are bound to malonyl-CoA through sequential reactions, similar to fatty acid synthesis in plastids, yielding desaturated long-chain fatty acids, which are stored in the phosphatidylcholine (PC) pool for subsequent processes. The products stored in the acyl-CoA pool and PC pool are channeled into the Kennedy pathway, leading to the formation of triacylglycerols (TAGs) [ 13 ]. In most plants, the enzymatic reactions associated with C18 unsaturated fatty acid biosynthesis have been well established. The major unsaturated fatty acids in plants are oleic (18:1), linoleic (18:2), and α-linolenic (18:3) acids (i.e. C18 species) [ 14 ]. The formation of unsaturated fatty acids is mainly regulated by three specialized fatty acid desaturases (FADs) (i.e. acyl-lipid, acyl-ACP, and acyl-CoA desaturases) [ 15 ]. There has been relatively little research on the expression patterns and functions of the genes encoding enzymes involved in unsaturated fatty acid formation in avocado.

With the development of third-generation sequencing technologies, which produce much longer and more accurate reads than previous sequencing technologies, several plant telomere-to-telomere (T2T) genome assemblies have been generated for various species, including Arabidopsis thaliana [ 16 , 17 ], Oryza sativa [ 18 , 19 ], Brassica rapa [ 20 ], Actinidia chinensis [ 21 , 22 ], Rhododendron molle [ 23 ], and Rhodomyrtus tomentosa [ 24 ]. Gapless T2T genomes are useful for studying centromeres [ 17 , 18 , 20 ], which are dynamic, rapidly evolving chromosomal regions critical for maintaining chromosomal integrity and genetic information fidelity during cell division [ 25 , 26 ]. Earlier research showed that centromeric regions are usually highly methylated and contain repetitive satellite DNA sequences (satellites) and transposable elements (TEs), including long terminal repeats (LTRs) [ 17 , 18 , 20 ]. These highly repetitive, complex sequences can make it difficult to analyze plant centromere structures and functions [ 26–28 ]. Research on plant centromeres has been limited to model plants and crop species, such as A. thaliana with a 178-bp CEN178 (formerly known as CEN180) [ 29 ], rice with a 155-bp CentO [ 30 ], Zea mays with a 156-bp CentC [ 31 ], and Triticum aestivum with a 566-bp CentT566 [ 32 ]. There has been a lack of research on centromeres in avocado.

Several studies have generated valuable genomic data relevant to avocado research [ 12 , 33–38 ], including two Hass avocado genome assemblies with 12 chromosomes [ 33 , 34 ]. However, considering avocado is an economically valuable tropical plant species, its genome must be more comprehensively characterized. In this study, we generated a gapless T2T genome assembly for West Indian avocado by integrating multiple sequencing technologies, which includes all the telomeres, centromeres, and a nucleolar organizing region (NOR). These regions were validated via fluorescence in situ hybridization (FISH). Additionally, we analyzed the expression of NLR genes and genes associated with fatty acid biosynthesis in various avocado tissues. The T2T genome assembly described herein may form the basis of future research on disease resistance and fatty acid biosynthesis in avocado.

Gap-free avocado genome assembly

Multiple sequencing technologies were used to sequence the genome of a West Indian avocado plant collected from Xishuangbanna Tropical Botanical Garden, China. A preliminary genome survey, which was performed using 51.9 Gb of paired-end reads generated by whole-genome next-generation sequencing (NGS), revealed the genome size (864 Mb) and heterozygosity rate (0.637%) ( Fig. S1 ). On the basis of this genome size, several sequencing platforms were used to obtain the following data: 70.9 Gb (82.1×) of PacBio HiFi reads with an N50 of 17.7 kb, 39.3 Gb (45.5×) of ONT ultra-long reads with an N50 of 100.3 kb, and 89.8 Gb (104.0×) of Pore-C reads ( Table S1 ). The HiFi reads and ONT ultra-long reads were used along with hifiasm [ 39 ] to construct a highly accurate preliminary assembly with an N50 of 63.6 Mb ( Table S2 ). After discarding organelle fragments and redundant sequences, contigs were clustered, ordered, and oriented using wf-pore-c [ 40 ] and juicebox [ 41 ] pipelines with manual validation ( Fig. S2 ). A total of 18 contigs with significant contact signals were anchored onto 12 chromosomes, seven of which were gap-free, and six gaps were added to scaffold 11 contigs into five chromosomes ( Table S3 ). Chromosome identification numbers and orientations were refined according to a published avocado genome [ 34 ] (Pa01–Pa12) ( Table S3 ). Assemblies generated by several assemblers were used to fill gaps ( Table S2 ). Contigs that could bridge any gap were used as input data of quarTeT [ 42 ] for automated gap filling, and then the filled gaps were manually validated. Thus, a gap-free genome assembly was obtained. Several genomic regions were found to have low HiFi and ONT read coverage depth ( Fig. 1A ). To ensure a correct assembly, sequences in these regions were inspected and compared among assemblies generated by different assemblers ( Table S2 ). Each of these regions was either corrected using the gap-filling method or verified to be assemblable by hifiasm without additional gaps. The low coverage depth may be related to sequence repetitiveness and complexity. Telomeres were repaired by aligning and joining candidate ONT ultra-long reads to the chromosome ends lacking telomeres. After completing all correction and polishing procedures, the final avocado genome assembly comprised 841.6 Mb and consisted of 12 gap-free chromosomes with an N50 of 78.8 Mb and 24 telomeres ( Table S4 ).
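The depth and N50 figures quoted above follow from two simple calculations, sketched here for illustration. The function names are hypothetical and the toy chromosome lengths are invented; only the read yields and the 864 Mb survey estimate come from the text. Because the survey size is itself rounded, the computed depths can differ from the reported values in the last decimal place.

```python
def coverage_depth(total_bases_gb: float, genome_size_mb: float) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return total_bases_gb * 1000.0 / genome_size_mb


def n50(lengths: list[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Read yields reported in the text; genome size from the survey (864 Mb).
for name, gigabases in [("HiFi", 70.9), ("ONT ultra-long", 39.3), ("Pore-C", 89.8)]:
    print(f"{name}: {coverage_depth(gigabases, 864.0):.1f}x")

# Invented chromosome lengths (Mb), for illustration only.
toy_lengths = [90, 80, 70, 60, 50]
print("toy N50:", n50(toy_lengths))
```

In the toy example the three longest sequences (90 + 80 + 70) are the first to cover half of the 350 Mb total, so the N50 is 70.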

Figure 1. Landscape of the telomere-to-telomere gap-free avocado genome assembly. (A) Genome landscape Circos plot. (a) Chromosomes with gap-filling locations in black, estimated centromeres in gold, and heterozygous site density in orange red bars; (b) gene density; (c) repeat density; (d) rRNA location; (e) HiFi read depth; (f) ONT ultra-long read depth; (g) NGS read depth; (h) GC content; (i) intra-genomic collinearity. Densities and depths were calculated in 500-kb windows with 250-kb steps along chromosomes. (B) 45S rDNA array on Pa12. (C) Fluorescence signals of FISH probes pITS1&2 (red) and pTEL (green) indicate the locations of NOR and telomeres on avocado mitotic metaphase chromosomes.
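The caption states that densities were computed in 500-kb windows with 250-kb steps. A minimal sketch of such overlapping-window counting is given below. It is not the authors' plotting code; the function name, the 0-based half-open coordinate convention, and the toy positions are assumptions.

```python
from bisect import bisect_left


def window_density(positions, chrom_length, window=500_000, step=250_000):
    """Count features in overlapping windows.

    Returns (start, end, count) tuples with 0-based, half-open coordinates.
    """
    sorted_pos = sorted(positions)
    results = []
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        # Features with start <= position < end fall inside this window.
        count = bisect_left(sorted_pos, end) - bisect_left(sorted_pos, start)
        results.append((start, end, count))
        start += step
    return results


# Invented feature positions on a hypothetical 1-Mb chromosome.
for row in window_density([100_000, 300_000, 600_000], 1_000_000):
    print(row)
```

Because the step is half the window size, each position away from the chromosome ends is counted in two consecutive windows, which smooths the plotted track.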


Genome annotation

Repeat sequences (repeats) in the avocado assembly were identified using the Extensive de novo TE Annotator (EDTA) pipeline [ 43 ]. Additionally, a repeat library was obtained after TEs were classified. According to the EDTA analysis, repeats accounted for 57.9% of the assembly ( Table S5 ). The most common repeats were LTR/Copia (7.3%) and LTR/Gypsy (22.1%) retrotransposons ( Table S5 ). The repeat library was used to softmask the assembly. Gene models were predicted using the softmasked assembly and BRAKER3 [ 44 ], which combined the results of transcriptome-based, homologous protein-based, and ab initio predictions. We obtained 40 629 protein-coding gene models. The genes were distributed on both chromosomal arms in a symmetrical pattern, whereas the repeats were concentrated in relatively central regions ( Fig. 1A ). The proteins encoded by these genes included homologs of 32 645 and 23 485 proteins in the non-redundant (NR) and Swiss-Prot databases, respectively. InterProScan [ 45 , 46 ] and eggNOG-mapper [ 47 , 48 ] assigned Pfam, Gene Ontology (GO), and KEGG Orthology (KO) terms to 24 877, 13 977, and 13 786 proteins, respectively ( Table S6 ). Furthermore, we identified heterozygous sequences (4 118 925 bp) at 3 158 398 sites by remapping HiFi reads to the genome ( Fig. 1A ). Most of these sequences were in intergenic and intronic regions. There were 98 128 heterozygous sites in exonic regions, including 53 628 nonsynonymous single nucleotide variants that altered 1850 transcription start or termination sites.
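Softmasking, as used above before gene prediction, conventionally means writing repeat bases in lowercase while leaving the sequence otherwise unchanged. The sketch below illustrates that idea only; it is not how EDTA or BRAKER3 perform masking, and the function names, interval convention, and toy sequence are assumptions.

```python
def softmask(sequence: str, repeat_intervals: list[tuple[int, int]]) -> str:
    """Lowercase the bases inside 0-based, half-open repeat intervals."""
    chars = list(sequence.upper())
    for start, end in repeat_intervals:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].lower()
    return "".join(chars)


def masked_fraction(masked_sequence: str) -> float:
    """Fraction of bases that are lowercase (i.e. annotated as repeats)."""
    if not masked_sequence:
        return 0.0
    return sum(c.islower() for c in masked_sequence) / len(masked_sequence)


# Invented sequence with two overlapping repeat annotations.
masked = softmask("ACGTACGTAC", [(2, 5), (4, 7)])
print(masked, masked_fraction(masked))
```

Applied to a whole assembly, the masked fraction computed this way corresponds to the kind of repeat proportion reported in the text (57.9%), although the published value comes from the EDTA annotation itself.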

Noncoding RNAs were predicted using Infernal cmscan [ 49 ] with the Rfam [ 50 ] database. The prediction resulted in 458 transfer RNAs, 398 small nucleolar RNAs, 177 microRNAs, and 3576 5S ribosomal RNAs ( Table S7 ). The NOR detected on Pa12 contained dozens of 45S rDNA units, each comprising a small subunit rRNA, internal transcribed spacer 1 (ITS1), 5.8S rRNA, ITS2, and large subunit rRNA arranged head to tail ( Fig. 1B ). The NOR is important for ribosome and nucleolus formation during interphase [ 51 ]. It may also be responsible for the high GC content at the end of Pa12 ( Fig. 1A ). The A. thaliana -type telomeric repeats (TTTAGGG/CCCTAAA) were used to identify telomeres in this avocado assembly. The ends of all chromosomes contained a telomeric region ranging from 4683 bp to 27 191 bp in length ( Table S8 ). To validate the authenticity of the NOR and telomeres revealed by the assembly, we designed FISH probes (pITS1, pITS2, and pTEL) on the basis of the NOR and telomere sequences ( Table S9 ). According to the red and green fluorescence signals, there was a pair of NORs among 12 pairs of chromosomes. Moreover, all chromosomes had telomeric regions at each end ( Fig. 1C ), which was in accordance with the results of the bioinformatics analysis.
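Telomere detection with the TTTAGGG/CCCTAAA motif can be illustrated with a simple motif count at each sequence end. This is a simplified sketch, not the procedure used in the study: the function names, the 1000-bp window, and the minimum copy number are hypothetical choices, and the toy chromosome is invented.

```python
FORWARD_MOTIF = "CCCTAAA"  # C-rich strand, expected at the start of a sequence
REVERSE_MOTIF = "TTTAGGG"  # G-rich strand, expected at the end of a sequence


def telomere_motif_counts(sequence: str, window: int = 1000) -> tuple[int, int]:
    """Count motif copies in the first and last `window` bases."""
    seq = sequence.upper()
    left = seq[:window].count(FORWARD_MOTIF)
    right = seq[-window:].count(REVERSE_MOTIF)
    return left, right


def has_both_telomeres(sequence: str, window: int = 1000, min_copies: int = 10) -> bool:
    """True if both ends carry at least `min_copies` motif copies (arbitrary cutoff)."""
    left, right = telomere_motif_counts(sequence, window)
    return left >= min_copies and right >= min_copies


# Invented toy chromosome: 20 copies at the start, 15 copies at the end.
toy = "CCCTAAA" * 20 + "ACGT" * 100 + "TTTAGGG" * 15
print(telomere_motif_counts(toy, window=200))
print(has_both_telomeres(toy, window=200))
```

A production workflow would also tolerate sequencing errors and degenerate repeat copies, which an exact string count does not.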

Quality assessment and validation

We used multiple methods to evaluate assembly quality. The overall mapping rates of HiFi reads, ONT ultra-long reads, and NGS reads were 99.55%, 99.91%, and 97.86%, respectively. Coverage breadths of all chromosomes exceeded 99.9%, and coverage depth was generally uniform among chromosomes ( Fig. 1A ; Table S10 ). Moreover, the overall alignment rates of RNA-seq reads generated from leaves, stems, and fruits were greater than 99.1% ( Table S11 ). By elucidating the correct order and orientation of sequences, the Pore-C contact heatmap verified the continuity of the assembly ( Fig. S2 ). Merqury [ 52 ] was used to calculate the base-level quality values of the genome on the basis of HiFi reads (overall value of 56.23) ( Table S12 ). The LTR Assembly Index (LAI) [ 53 ] score calculated using intact LTR-RTs was 15.99, which reaches the reference standard. Finally, a Benchmarking Universal Single-Copy Orthologs (BUSCO) [ 54 ] analysis (in protein mode) captured 1604 of 1614 conserved genes (99.4%) in embryophyta_odb10 ( Table S13 ). These results reflect the high continuity, accuracy, and integrity of this avocado genome assembly.
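Two of the quality metrics above reduce to short formulas. A Phred-scaled quality value (QV) relates to a per-base error probability p by QV = -10 log10(p), and the BUSCO percentage is the share of conserved genes recovered. The sketch below applies both to the reported numbers; the function names are hypothetical, and this is not how Merqury or BUSCO derive their inputs.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Per-base error probability implied by a Phred-scaled quality value."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Phred-scaled quality value for a per-base error probability."""
    return -10 * math.log10(error_rate)


def percent_recovered(found: int, total: int) -> float:
    """Percentage of benchmark genes recovered."""
    return 100.0 * found / total


print(f"QV 56.23 implies an error rate of {qv_to_error_rate(56.23):.2e} per base")
print(f"BUSCO: {percent_recovered(1604, 1614):.1f}%")
```

For reference, a QV of 50 corresponds to one expected error per 100,000 bases and a QV of 60 to one per million, so the reported value lies between the two.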

Avocado centromere characterization

Iterative identification and clustering methods were used to estimate centromere locations on chromosomes ( Fig. 1A ). A total of 12 chromosome-specific centromeric repeats (CSCR) in the corresponding chromosome centromeres were identified and designated as CSCR01 to CSCR12 ( Table S14 ). Most CSCRs were longer than 1000 bp, which exceeds the length of published centromeric monomers. Seven CSCRs (CSCR01, CSCR02, CSCR03, CSCR05, CSCR06, CSCR07, and CSCR08) had similar sequences, with identity and coverage exceeding 83.0% and 98.7%, respectively ( Table S15 ; Fig. S3 ), and always appeared on the corresponding centromeres in a head-to-tail orientation ( Fig. 2A ). These CSCRs formed the Seven CSCRs Group (SCG) ( Fig. 2B ). CSCR04, CSCR11, and CSCR12 on non-SCG chromosomes were arranged in intervals, whereas CSCR09 and CSCR10 were relatively rare on the corresponding chromosomes. These CSCRs were somewhat similar to SCG according to the LASTZ and MAFFT alignments ( Table S15 ; Fig. S3 ). The Vsearch [ 55 ] clustering results indicated that CSCR01 (alternatively called PaCEN1016) can serve as a representative avocado centromeric monomer. The Pore-C signal near-absent regions and CSCR locations were used for the determination of centromere borders on each chromosome ( Table S16 ). Multiple locations in these complex regions had low HiFi and ONT read coverages, especially the long centromeric regions of Pa03 and Pa07 ( Fig. 1A ). To validate the authenticity of these regions and CSCRs, we designed a FISH probe (pCEN) on the basis of the consensus CSCR sequences ( Table S9 , Table S14 , and Fig. S3 ). Red fluorescence signals confirmed the existence of these CSCRs ( Fig. 2C ).

Figure 2. Centromeric architecture in avocado. (A) Tracks showing Pore-C contact signal near-absent regions, putative satellite locations, CSCR locations, LTR/Gypsy locations, LTR/Copia locations, and TIR locations. CSCR, chromosome-specific centromeric repeat; LTR, long terminal repeat; and TIR, terminal inverted repeat. Pore-C contact signals were calculated in 15-kb bins. Putative satellite and TE locations were determined using RepeatMasker, whereas CSCR locations were determined according to LASTZ alignments. (B) Neighbor-joining tree showing phylogenetic relationships of CSCRs. The gray clade comprises seven highly homologous CSCRs, which were designated as the Seven CSCRs Group (SCG). CSCRs were aligned using the MAFFT einsi algorithm. The neighbor-joining tree was constructed using TreeBeST. (C) Fluorescence signals of FISH probes pCEN (red) and pTEL (green) indicate the locations of centromeres and telomeres on avocado mitotic metaphase chromosomes.


The 1 Mb regions flanking centromeres included CSCRs together with satellites and TEs. There was considerable overlap between LTR/Gypsy and SCG-rich regions, whereas non-SCG centromeres included multiple types of TEs ( Fig. 2A ). The alignments of these CSCRs to the sequences in the repeat library generated by the repeat annotation pipeline revealed the substantial similarity between these CSCRs and a number of TEs ( Table S15 ). Notably, CSCR01 contained the sequences of three TEs ( Fig. S4 ; Table S15 ). Thus, these CSCRs may have been derived from TEs. These findings indicate that TE insertions may have largely shaped the centromere structure in avocado.

Structural variation analysis

To screen for differences between the previously assembled Hass avocado genome and our West Indian avocado genome, we analyzed their structural variations ( Fig. 3 ; Fig. S5 ). Large-scale structural rearrangements were mainly detected near complex centromeric regions. Examples include the translocation on Pa02 and inversion on Pa12 ( Fig. S6 ). A total of 582 485 insertions/deletions (InDels) were identified, of which 7668 insertions and 7685 deletions were longer than 50 bp ( Table S17 ). Gene-based annotation detected 5700 InDels in the exonic regions of 4373 genes, many of which encode protein kinases, disease resistance-related proteins, transcription factors, and cytochrome P450, in the West Indian avocado genome ( Table S18 ).
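The split of InDels at 50 bp reported above can be illustrated by classifying a variant from the lengths of its reference and alternate alleles. This is a hedged sketch: the function name, the VCF-style allele strings, and the labels are assumptions, and the study's variant-calling pipeline is not described here.

```python
def classify_indel(ref: str, alt: str, long_threshold: int = 50):
    """Return (kind, size, is_long) for a REF/ALT allele pair.

    `is_long` is True when the length change exceeds `long_threshold` bp.
    """
    diff = len(alt) - len(ref)
    if diff == 0:
        return ("substitution", 0, False)
    kind = "insertion" if diff > 0 else "deletion"
    size = abs(diff)
    return (kind, size, size > long_threshold)


# Invented alleles for illustration.
print(classify_indel("A", "ACGT"))          # short insertion
print(classify_indel("A" + "C" * 60, "A"))  # long deletion

# Counts reported in the text.
long_indels = 7668 + 7685
print(long_indels, f"{100 * long_indels / 582485:.2f}% of all InDels")
```

With the reported counts, InDels longer than 50 bp make up only a small share of the 582 485 total, which is why they are listed separately.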

Figure 3. Structural variations among avocado assemblies. Collinear syntenic blocks, inversions, translocations, and duplications are shown between homologous chromosomes. Chromosome-specific centromeric repeats and gaps on chromosomes are marked in yellow and white, respectively. Black triangles indicate telomeres.


Exploring NLR genes in avocado

To analyze the potential disease resistance-related NLR genes in avocado, we identified 376 and 230 NLR genes in the West Indian and Hass assemblies, respectively, using NLR-Annotator and InterProScan. These NLR genes were rarely located in Hass assembly gap regions ( Fig. S7 ). The difference in the number of NLR genes may be due to varietal differences. On the basis of domain architectures, the 376 NLR genes in the West Indian assembly could be classified into three subfamilies: Coiled-Coil NB-ARC Leucine-rich-repeat (CNL), Toll/interleukin-1 receptor NB-ARC Leucine-rich-repeat (TNL), and Resistance to Powdery Mildew Locus 8 NB-ARC Leucine-rich-repeat (RNL). The CNL subfamily contains 363 members, accounting for 96.54% of the total ( Fig. 4 ; Table S19 ). GO term and KEGG pathway enrichment analyses were conducted to functionally annotate the NLR genes. Notably, 80 genes were annotated with the ‘response to biotic stimulus’ (GO:0009607) GO term ( Table S20 ), whereas 154 genes were associated with the ‘plant-pathogen interaction’ (KEGG: ko04626) pathway ( Table S21 ).

Phylogenetic and transcriptome analyses of NLR genes in avocado. (A) Clustered distribution of NLR genes on chromosomes, with gene locations marked in red. (B) Neighbor-joining tree of avocado NLR proteins. The heatmap behind the tree shows the relative expression levels of the corresponding genes in leaves, stems, and fruits. The branches from identical chromosomes are marked by the same color. The different colors of gene ID represent CNL, TNL, and RNL subfamilies of NLR genes. (C) Jitter plot showing TPM values of NLR genes in leaves, stems, and fruits. The significance of any differences was assessed by an analysis of variance followed by Tukey’s HSD test.

Analysis of genes involved in the fatty acid biosynthesis pathway. Heatmaps present relative expression levels in leaves, stems, and fruits. TPM values were calculated as the mean value of three replicates. Gene expression levels are normalized and represented as log2(TPM + 1). Blue and red represent low and high expression levels, respectively. Abbreviations: PDH, pyruvate dehydrogenase; ACCase, acetyl-CoA carboxylase; MCMT, malonyl-CoA:ACP malonyltransferase; KAS III, ketoacyl-ACP synthase III; KAR, ketoacyl-ACP reductase; HAD, hydroxyacyl-ACP dehydrase; ER, enoyl-ACP reductase; KAS I, ketoacyl-ACP synthase I; KAS II, ketoacyl-ACP synthase II; SAD, stearoyl-ACP desaturase; FAD6, fatty acid desaturase 6; FATA, acyl-ACP thioesterase A; FATB, acyl-ACP thioesterase B; LACS, long-chain acyl-CoA synthetase; GPDH, glycerol-phosphate dehydrogenase; GPAT, glycerol-3-phosphate acyltransferase; LPAAT, 2-lysophosphatidic acid acyltransferase; PP, phosphatidate phosphatase; PLD, phospholipase D; PDCT, phosphatidylcholine diacylglycerol cholinephosphotransferase; DAG-CPT, diacylglycerol choline phosphotransferase; PLC, phospholipase C; DGAT, acyl-CoA:diacylglycerol acyltransferase; MAGAT, monoacylglycerol acyltransferase; PDAT, phospholipid:diacylglycerol acyltransferase; FAD2, fatty acid desaturase 2; FAD3, fatty acid desaturase 3; PLA2, phospholipase A2; LPCAT, 2-lysophosphatidylcholine acyltransferase; ACP, acyl carrier protein; G3P, glycerol 3-phosphate; LPA, lysophosphatidic acid; PA, phosphatidic acid; DAG, diacylglycerol; TAG, triacylglycerol; PC, phosphatidylcholine; LPC, 2-lysophosphatidylcholine.

The NLR genes were generally distributed in clusters throughout the genome ( Fig. 4A ). A neighbor-joining phylogenetic tree was constructed using the protein sequences encoded by these NLR genes ( Fig. 4B ). Numerous NLR genes in close physical proximity on a chromosome were also clustered together in the tree, reflecting their close phylogenetic relationships ( Fig. 4A, B ). DupGen_finder results indicated that these genes may have originated from gene duplication events (e.g. whole genome duplication, tandem duplication, proximal duplication, transposed duplication, and dispersed duplication) ( Table S22 ). Most of these NLR genes were derived from dispersed or proximal duplication events ( Table S23 ). In some duplicated gene pairs, one gene lacked NLR domains, possibly because of functional differentiation or loss during evolution; these genes were not considered as NLR genes ( Table S24 ).

We also analyzed NLR gene expression profiles in avocado leaves, stems, and fruits. Interestingly, the overall relative expression levels of these NLR genes were higher in the stems than in the leaves and fruits ( Fig. 4C ; Table S25 ), but some genes were highly expressed in all three tissues (e.g. Pa11g0262 , Pa02g4855 , Pa07g3139 , Pa07g0383 , and Pa02g3196 ). Accordingly, these genes may be involved in disease responses in all avocado plant tissues. Some genes, especially Pa02g2791 and Pa09g1054 , were expressed specifically in the stems and leaves. Additionally, our analysis of the expression profiles of NLR paralogous gene pairs revealed differences in their expression patterns among tissues. For example, Pa02g4855 was expressed at very high levels, whereas its paralog Pa02g4837 was expressed at almost undetectable levels in all three tissues. These results underscore the potential functional diversity among NLR genes and reflect the functional divergence between paralogous gene pairs.

Expression analysis of fatty acid biosynthesis pathway genes

The fatty acid content is a key trait influencing the nutrient composition and quality of avocado fruits. Fatty acid biosynthesis involves biochemical processes that occur in two distinct stages: de novo fatty acid synthesis within plastids and TAG formation in the endoplasmic reticulum. By sequence alignments and functional annotation, we identified 128 genes associated with fatty acid biosynthesis ( Fig. 5 ; Table S26 ), of which 48 and 80 genes were associated with de novo synthesis in plastids and TAG formation in the endoplasmic reticulum, respectively. Genes encoding three classes of enzymes, pyruvate dehydrogenase (PDH), acetyl-CoA carboxylase (ACCase), and malonyl-CoA:ACP malonyltransferase (MCMT), which are important for malonyl-ACP synthesis within plastids, were most highly expressed in fruits ( Fig. 5 ; Table S26 ). In addition, the expression levels of the fatty acid synthesis-related genes Pa08g1910 , which belongs to the ketoacyl-ACP synthase (KAS) III family, as well as Pa02g0257 (ketoacyl-ACP reductase, KAR), Pa02g4279 (hydroxyacyl-ACP dehydrase, HAD), Pa02g3056 (enoyl-ACP reductase, ER), and Pa05g3853 (KAS I) were approximately 10 times higher in the fruits than in the leaves or stems. Furthermore, Pa02g0113 , which encodes one of the 11 stearoyl-ACP desaturases (SADs) that primarily catalyze C18 unsaturated fatty acid synthesis, was more highly expressed in the leaves than in the stems and fruits. During the TAG formation stage, FAD2 plays a crucial role in unsaturated fatty acid synthesis, with Pa07g1095 , Pa07g1091 , and Pa12g0002 expressed specifically in fruits. Our results suggest that the genes expressed at high levels or exclusively in the fruits may influence the fatty acid composition and content in avocado.

Avocado is an economically valuable plant because its fruits are a rich source of nutrients and have a unique flavor [ 56 ]. Previously published avocado genome assemblies were incomplete because of technology-related limitations [ 33 , 34 ]. The generation of a high-quality genome assembly is necessary for avocado research. In this study, we used a combination of sequencing technologies to obtain a T2T gap-free genome assembly of avocado ( Fig. 1A ) and newly detected an NOR on Pa12 ( Fig. 1A, B and Fig. 3 ). A total of 40 629 protein-coding genes and 4879 noncoding RNAs were predicted ( Fig. 1A ,   Table S6 , and Table S7 ). Using various methods, we verified the high quality of the genome assembly and protein set.

The T2T genome resources necessary for in silico centromeric research are currently limited to model plants and crops [ 17 , 18 , 20 , 57 ], with relatively little available information regarding avocado centromeres. In this study, we clarified the structural characteristics of avocado centromeres. Although CSCR sequences in the same chromosome are generally conserved and CSCR01 (i.e. PaCEN1016) may be a representative avocado centromere repeat ( Fig. 2A ), we also detected considerable variations among centromeres ( Fig. 2B ). This is in accordance with the results of earlier research on the centromeres of other species, including CEN178 in A. thaliana and CEN137 in the Saccharum complex [ 17 , 29 , 58 , 59 ]. These CENs have another feature in common with CSCRs in SCG: they are arranged in a head-to-tail manner on chromosomes [ 17 , 58 , 59 ], whereas centromeric monomers in kiwifruit are arranged in regular intervals [ 21 ]. Compared with previously identified centromeric repeats in model plants and crops (up to several hundred base pairs in length) [ 21 , 29–32 ], avocado CSCRs are much longer (>1000 bp) and their sequences differ considerably from the sequences of published centromeric repeats. In addition, centromeres on Pa04, Pa09, and Pa10 contain many TEs, especially LTR/Gypsy retrotransposons ( Fig. 2A ). Similar results were also reported for other plant species, including B. rapa and the Saccharum complex [ 20 , 58 ], indicating that LTRs may have substantially modulated the centromeric architecture during evolution.

Many functionally validated disease resistance-related genes belong to the NLR gene family, which includes several subfamilies that differ regarding their structural domains [ 60 , 61 ]. We identified 376 NLR genes in this avocado genome assembly, which were distributed in clusters that may be coordinately regulated ( Fig. 4A ), thereby enabling avocado to rapidly perceive and respond to pathogen attacks [ 62 , 63 ]. Our data indicated that Pa11g0262 , Pa07g3139 , and Pa07g0383 were most abundantly transcribed in the leaves, stems, and fruits ( Table S25 ). Pa11g0262 is partially homologous to AT5G46510 , which is a disease resistance-related gene expressed during different developmental stages of A. thaliana [ 64 ]. Pa07g3139 and Pa07g0383 are homologous to AT3G50950 and AT3G07040 , respectively ( Fig. S8 ), both of which encode a canonical NLR protein required for recognizing the phytopathogenic bacterium Pseudomonas syringae [ 25 ]. Another study determined that the NLR unigene UN001791 is responsive to an infection by F. kuroshium [ 12 ]; this unigene is highly similar to Pa02g4855 . These results suggest that these NLR genes may be relevant to future research on the molecular mechanisms underlying responses to diseases in avocado.

The fatty acids in avocado fruits contain a high proportion of unsaturated fatty acids [ 65 ], which influence avocado quality. During fatty acid biosynthesis, ACCase catalyzes the committed and rate-limiting step of de novo fatty acid synthesis in plastids. In Brassica napus , the inhibition of ACCase activity leads to decreased fatty acid synthesis [ 66 ]. In the examined avocado fruits, three ACCase genes, Pa06g1401 , Pa09g2145 , and Pa10g1932 , were expressed at high levels, suggesting they may affect fatty acid synthesis. Earlier research showed fatty acid compositions influence the physicochemical properties, nutritive value, and industrial uses of plant oils [ 67 ]. The formation of unsaturated fatty acids is mainly controlled by specialized fatty acid-modifying enzymes [ 68 ]. By inserting the first double bond into 18:0, SAD is a major determinant of the homeostasis between unsaturated and saturated fatty acids. In the A. thaliana ssi2/fab2 mutant, the loss of SAD leads to the considerable accumulation of stearic acid (C18:0) and low C18:1 level [ 69 ]. Notably, in avocado fruits, one SAD gene ( Pa02g0113 ) was expressed at significantly higher levels than the other SAD genes. The expression patterns of these genes provide valuable insights regarding fatty acid biosynthesis in avocado.

Plant materials

Leaves, stems, and fruits were collected from a young and healthy West Indian avocado tree in the Xishuangbanna Tropical Botanical Garden, Yunnan province, China (101.2768 E, 21.9201 N). Samples were immediately frozen in liquid nitrogen and stored at −80°C for the subsequent whole-genome sequencing analysis and construction of the Pore-C library.

Library construction and sequencing

Genomic DNA (gDNA) was extracted from leaves using the CTAB method. After determining gDNA quality and quantity, ~8 μg size-selected (>50 kb) gDNA fragments were used for ONT ultra-long sequencing, which was completed on an Oxford Nanopore PromethION instrument. For HiFi sequencing, Pacific Biosciences SMRTbell target-size libraries were constructed according to the manufacturer’s standard protocol. Approximately 8 μg gDNA was used to construct libraries, which were size-selected and then sequenced on a PacBio Sequel II instrument. For paired-end sequencing, libraries were constructed according to the MGIEasy Universal DNA Library Prep Kit v1.0 protocol. For Pore-C sequencing, fresh leaves were immersed in 2% (v/v) fresh formaldehyde for DNA cross-linking, after which the Pore-C library was prepared by digesting the DNA using DpnII. For transcriptome sequencing, total RNA was extracted from leaves, stems, and fruits using TRIzol reagent. The RNA fragments with a poly-A tail were enriched and used as the template for cDNA synthesis, after which the cDNA ends were repaired, an A-tail was added, and an adapter was ligated according to the library construction protocol. High-quality NGS and transcriptome libraries were sequenced on the DNBSEQ-T7RS platform, whereas the high-quality Pore-C library was sequenced on the Oxford Nanopore PromethION instrument.

Genome assembly and gap filling

Paired-end sequencing reads were filtered and cleaned using fastp v0.23.2 [ 70 ] (-l 140 -n 0). For the genome survey, a k-mer (k = 21) analysis of the clean NGS reads was performed using Meryl v1.4 ( https://github.com/marbl/meryl ) and GenomeScope2 [ 71 ] (-p 2 -k 21). Hifiasm v0.19.5-r587 [ 39 ], Verkko v1.4 [ 72 ], NextDenovo v2.5.0 [ 73 ], and HiCanu v2.2 [ 74 ] were used to assemble the preliminary genome ( Table S2 ). Organelle fragments were identified by aligning the assembly with TAIR10 A. thaliana chloroplast and mitochondrial sequences using LASTZ v1.04.22 ( https://github.com/lastz/lastz ). Purge_dups v1.2.5 [ 75 ] was used to remove redundant contigs. Wf-pore-c [ 40 ] was used to detect valid Pore-C signals. The valid Pore-C contact pairs file was converted to the hic format using juicebox_scripts ( https://github.com/phasegenomics ) and then imported into juicebox v1.11.08 [ 41 ] for clustering, ordering, and orienting. 3D-DNA v210623 [ 76 ] was used to generate the draft assembly on the basis of the review.assembly file from juicebox. NextDenovo v2.5.0 [ 73 ] (read_cutoff = 1k; seed_cutoff = 76246; genome_size = 864m) was used to assemble the ONT ultra-long reads. NextPolish v1.4.1 [ 77 ] (task = 661212) was used to polish the NextDenovo contigs according to both HiFi and NGS reads. ONT ultra-long reads and HiFi reads were assembled by Verkko v1.4 [ 72 ]. The NextDenovo and Verkko contigs were aligned to the draft assembly using minimap2 v2.24-r1122 [ 78 ] (-x asm5) to extract gap-bridging contigs, which were then used by quarTeT [ 42 ] to fill gaps.
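As an illustration of the k-mer analysis underlying the genome survey, the following minimal Python sketch counts canonical 21-mers and builds the multiplicity histogram (the k-mer spectrum) that tools such as GenomeScope2 take as input. It is not the Meryl implementation; the function names `canonical` and `kmer_spectrum` are hypothetical, and a production tool would use far more memory-efficient data structures.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_VALID = set("ACGT")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_spectrum(reads, k=21):
    """Count canonical k-mers in the reads and histogram their multiplicities.

    The result maps a multiplicity m to the number of distinct k-mers
    observed exactly m times.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing N or other ambiguity codes.
            if set(kmer) <= _VALID:
                counts[canonical(kmer)] += 1
    return Counter(counts.values())
```

Counting canonical k-mers merges the two strands, so a read and its reverse complement contribute to the same entries of the spectrum.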

Correction and polishing procedures

To detect potentially misassembled regions, we mapped ONT ultra-long reads and HiFi reads to the assembly using minimap2 and clean NGS reads using Bowtie2 (--very-sensitive) to obtain coverage depth statistics. Read depths were calculated using SAMtools v1.18 [ 79 ] bedcov in 200-kb windows (-Q 10). Contigs generated by NextDenovo and Verkko were used to correct low-depth regions via the gap-filling method. NextPolish2 v0.2.0 [ 80 ] was used to polish the assembly with HiFi reads and NGS reads according to the author-suggested procedure. A Perl script was used to detect A. thaliana -type telomeric repeats (5′-TTTAGGG-3′ and 5′-CCCTAAA-3′) on chromosomes and in ONT ultra-long reads to screen for chromosome ends lacking telomeres and reads useful for fixing telomeres, respectively. The candidate reads were aligned to the chromosomes lacking telomeres using minimap2. The telomeric sequence on the longest mapped read was connected to the chromosome end.
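The telomere screen can be sketched in a few lines of Python. This is not the Perl script used in this study; it is a minimal illustration that looks for tandem runs of the two repeat units named above near each end of a sequence. The window size and the minimum copy number are hypothetical thresholds chosen only for the example.

```python
import re

# A. thaliana-type telomeric repeat unit and its reverse complement (from the text).
FORWARD_UNIT = "TTTAGGG"
REVERSE_UNIT = "CCCTAAA"


def longest_tandem_run(seq: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `seq`."""
    runs = re.findall(f"(?:{unit})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def has_telomere(chrom_seq: str, window: int = 10_000, min_copies: int = 10):
    """Return (left_ok, right_ok) for the two ends of a chromosome sequence.

    An end is accepted when the terminal `window` bases contain at least
    `min_copies` consecutive copies of either repeat unit. Both thresholds
    are illustrative assumptions, not values from the study.
    """
    def end_ok(region: str) -> bool:
        best = max(longest_tandem_run(region, FORWARD_UNIT),
                   longest_tandem_run(region, REVERSE_UNIT))
        return best >= min_copies

    return end_ok(chrom_seq[:window]), end_ok(chrom_seq[-window:])
```

Chromosome ends reported as `False` would be the ones to repair with telomere-bearing ONT ultra-long reads, and the same `longest_tandem_run` check could be applied to individual reads to select candidates.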

EDTA v2.1.0 [ 43 ] and RepeatModeler v2.0.2 ( http://www.repeatmasker.org ) were used for the de novo identification of repeats and the construction of the repeat library. TEsorter v1.4.6 [ 81 ] (-db rexdb-plant) was used to further classify the TEs. Satellites were predicted using TAREAN v0.3.8.1-466 [ 82 ] and random 15× NGS reads as well as the galaxy online server ( https://repeatexplorer-elixir.cerit-sc.cz/galaxy ). The repeat library was used by RepeatMasker v4.1.2-p1 ( http://www.repeatmasker.org ) (-s) to softmask the assembly before predicting gene models. BRAKER v3.0.3 [ 44 , 83 ] was used for the transcriptome-based, homologous protein-based, and ab initio predictions, which were filtered using TSEBRA v1.1.1 [ 84 ]. The annotation results were further filtered and converted to Generic Feature Format version 3 (GFF3) using MAKER v3.01.04 [ 85 ], gffread v0.12.7 [ 86 ], and GenomeTools v1.6.2 [ 87 ]. Proteins were aligned to the sequences in the NR and Swiss-Prot databases using diamond v2.0.15.153 blastp (-e 1e-5 --top 1) [ 88 ]. InterProScan v5.64-96.0 [ 45 , 46 ] and the eggNOG [ 47 , 48 ] online server ( http://eggnog-mapper.embl.de/ ) were used to functionally annotate proteins and assign Pfam, GO, and KO accessions to proteins. Heterozygous sequences were identified using Clair3 v0.1-r12 [ 89 ] and bcftools v1.18 [ 79 ] (filter -i 'GT="het"'), with HiFi reads as the input, and annotated by Annovar v2020-06-07 [ 90 ] according to a gene-based method. Cmscan in infernal v1.1.4 [ 49 ] and the Rfam database [ 50 ] were used to predict non-coding RNAs. RectChr v1.36 ( https://github.com/hewm2008/RectChr ) was used to visualize the 45S rDNAs in NOR.

Probe and chromosome preparation for fluorescence in situ hybridization

The oligo-probes representing 45S rDNA and centromere sequences ( Table S9 ) were designed on the basis of ITS1, ITS2, and CSCR sequences in the T2T avocado genome assembly generated in this study. The A. thaliana -type telomeric sequence was used to detect telomeres in avocado ( Table S9 ). These probes were synthesized by Sangon Biotech Co., Ltd. (Shanghai, China).

The newly grown root tips of avocado seedlings exhibiting developmental consistency were carefully removed and promptly immersed in a solution containing 0.002 mol/L 8-hydroxyquinoline for 3–4 h. The subsequent preparation of mitotic metaphase chromosomes and the FISH analysis were conducted according to a slightly modified established procedure [ 91 ]. The chromosomes were counterstained with 4′,6-diamidino-2-phenylindole (Vector Laboratories, Inc., Burlingame, USA) and examined using an Olympus BX-53 microscope equipped with a Photometric SenSys Olympus DP80 CCD camera (Olympus Corporation, Japan). The captured images were processed using Olympus cellSens Standard 4.1.1 software (Olympus Corporation).

Quality assessments

Raw HiFi reads and ONT ultra-long reads were aligned to the final assembly using minimap2, whereas clean NGS reads were aligned using Bowtie2 [ 92 ]. Transcriptome reads were aligned to the assembly using HISAT2 v2.2.1 [ 93 ] (--very-sensitive). The format was converted and the overall mapping rates, coverage breadth, and coverage depth statistics were calculated using the SAMtools [ 79 ] commands sort, flagstat, coverage, and bedcov, respectively. Gene and repeat densities were calculated using bedtools v2.30.0 [ 94 ] makewindows and intersect, whereas the GC content was calculated using bedtools nuc. A genome landscape Circos plot was produced with TBtools v2.008 [ 95 ]. Base quality values of the raw assembly were calculated using Merqury v1.3 [ 52 ] and raw HiFi reads. LAI in LTR_retriever v2.9.0 [ 96 ] was used to calculate LAI scores. BUSCO v5.4.7 [ 54 ] was used to evaluate the completeness of the assembly and protein set according to the embryophyta_odb10 dataset.
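The windowed GC content calculation can be illustrated with a short Python sketch. It mimics the idea of tiling a chromosome into fixed windows and reporting the GC fraction per window; it does not reproduce bedtools itself. The window size is a hypothetical default (none is stated here), and excluding ambiguous bases such as N from the denominator is an assumption of this sketch.

```python
def gc_by_window(seq: str, window: int = 100_000):
    """Yield (start, end, gc_fraction) for consecutive non-overlapping windows.

    Coordinates are 0-based and half-open. Bases other than A, C, G, and T
    are ignored when computing the fraction (an assumption of this sketch).
    """
    seq = seq.upper()
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        acgt = gc + chunk.count("A") + chunk.count("T")
        fraction = gc / acgt if acgt else 0.0
        yield start, start + len(chunk), fraction
```

Gene and repeat densities follow the same pattern, with the per-window value replaced by the number of annotated bases or features overlapping each window.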

Centromere and structural variation characterization

A strategy involving iterative identification and clustering was used to detect centromeric repeats in the assembly. RepeatMasker (-s) was used to locate satellites. Satellite locations and positions where the Pore-C signal was nearly absent were considered together to estimate candidate centromere locations on each chromosome. High-frequency tandem repeat sequences identified in candidate centromere regions by TRF v4.09.1 [ 97 ] (2 7 7 80 10 50 2000 -h) were retained for the genome-wide LASTZ alignment. Vsearch v2.22.1 [ 55 ] (--clusterout_sort --clusterout_id --fasta_width 0 --id 0.6 --cluster_size) was used to cluster the LASTZ hits on each chromosome, which resulted in 12 CSCRs. A LASTZ alignment hit was considered accurate if its coverage and identity percentages were both over 80%. PyGenomeTracks v3.8 [ 98 ] was used to visualize the features on chromosome tracks. Structural variations were identified with minimap2 and Syri v1.6.3 [ 99 ] and annotated by Annovar v2020-06-07 [ 90 ] using a gene-based method. Plotsr v1.1.3 [ 100 ] was used to visualize structural variations.
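The acceptance rule for LASTZ hits (coverage and identity both over 80%) is simple enough to state as code. The sketch below assumes the hits have already been parsed into records; the `Hit` class, its field names, and the coordinates in the usage example are hypothetical and do not correspond to a specific LASTZ output format.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    chrom: str        # chromosome carrying the hit, e.g. "Pa01"
    start: int        # 0-based start of the hit on the chromosome
    end: int          # exclusive end of the hit
    coverage: float   # percentage of the query repeat covered by the alignment
    identity: float   # percentage identity of the alignment


def accurate_hits(hits, min_coverage=80.0, min_identity=80.0):
    """Keep hits whose coverage and identity are both strictly above the cutoffs."""
    return [h for h in hits
            if h.coverage > min_coverage and h.identity > min_identity]
```

For example, of the three illustrative hits `Hit("Pa01", 0, 1016, 95.0, 90.0)`, `Hit("Pa01", 2000, 3000, 80.0, 99.0)`, and `Hit("Pa02", 0, 500, 99.0, 70.0)`, only the first passes, because the thresholds are exclusive.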

Gene identification and transcriptome analysis

The predicted full-length coding sequences in avocado were used by NLR-Annotator v2.1b [ 101 ] to identify NLR domains. In accordance with the accepted definition [ 102 ], genes containing at least one NB-ARC (Pfam accession PF00931), TIR (PF01582), or RPW8 (PF05659) domain were considered as NLR genes. We combined the NLR-Annotator and InterProScan Pfam annotations to obtain NLR genes. GO enrichment and KEGG pathway enrichment analyses were performed using TBtools [ 95 ]. Relative expression levels were recorded as transcripts per million (TPM) values, which were calculated using RSEM v1.3.3 [ 103 ]. The clean RNA-seq reads were mapped to the assembly using STAR v2.7.10a [ 104 ] for TPM calculation. Relative NLR expression level differences among tissues were evaluated by an analysis of variance followed by Tukey’s HSD test. NLR proteins were aligned using the MAFFT v7.520 [ 105 ] einsi algorithm. A neighbor-joining tree was constructed using TreeBeST v1.9.2 [ 106 ], with 1000 bootstrap iterations (nj -b 1000 -W). The sequences of A. thaliana proteins involved in the fatty acid biosynthesis pathway were obtained from ARALIP ( http://aralip.plantbiology.msu.edu/pathways/pathways ) [ 13 ] to serve as queries for the BLASTP search (-evalue 1e-5) of the protein set generated in this study. The Pfam and SMART ( http://smart.emblheidelberg.de/ ) databases were screened to detect candidate proteins with conserved domains. Finally, all candidates were used to search the GenBank NR database.
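The domain-based NLR definition given above can be expressed as a small filter over Pfam annotations. The following Python sketch assumes the InterProScan output has already been reduced to a mapping from gene ID to Pfam accessions; the function name `nlr_candidates` and the gene IDs in the usage example are hypothetical, and the sketch does not attempt the CNL/TNL/RNL subfamily assignment.

```python
# Pfam accessions that define NLR genes, as listed in the text.
NLR_PFAM = {
    "PF00931": "NB-ARC",
    "PF01582": "TIR",
    "PF05659": "RPW8",
}


def nlr_candidates(gene_to_pfam):
    """Return {gene_id: sorted NLR-type domain names} for genes with >= 1 such domain.

    `gene_to_pfam` maps a gene ID to an iterable of Pfam accessions.
    Genes without any NB-ARC, TIR, or RPW8 domain are omitted.
    """
    result = {}
    for gene, accessions in gene_to_pfam.items():
        domains = sorted({NLR_PFAM[a] for a in accessions if a in NLR_PFAM})
        if domains:
            result[gene] = domains
    return result
```

For instance, `nlr_candidates({"geneA": ["PF00931", "PF00560"], "geneB": ["PF00069"]})` keeps only `geneA`, annotated with `["NB-ARC"]`. The resulting set would then be merged with the NLR-Annotator predictions.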

We thank all the members of the laboratory for their technical and analysis assistance. We thank Liwen Bianji (Edanz) ( www.liwenbianji.cn/ac ) for editing the English text of a draft of this manuscript. This research was supported by Yunling Scholar Project (to Yongping Yang), the Major Science and Technology Projects (202202AE090016), Yunnan Revitalization Talents Support Plan (to Yunqiang Yang), the Digitalization, development and application of biotic resource (202002AA100007), the Postdoctoral Research Funding Projects of Yunnan Province (to Xin Yin), the National Natural Science Foundation of China (32100315, 31601999, 41771123, 31590820, and 31590823), the West Light Foundation of the Chinese Academy of Sciences (to Yunqiang Yang), and the 13th Five-year Informatization Plan of Chinese Academy of Sciences, Grant No. XXH13506. The funders had no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

YQY and YPY designed the research. TYY, YFC, XYY, DNY, and XY analyzed the data. TPH, CJZ, YWD, YQY, and YPY contributed reagents/materials/analysis tools. TYY, YFC, and YQY wrote and reviewed the paper.

The raw sequencing data, including ONT Ultra-long reads, PacBio HiFi reads, NGS reads, Pore-C reads, and RNA-seq reads, assembly, and annotation data are accessible in Science Data Bank ( https://doi.org/10.57760/sciencedb.07602 ).

The authors declare no conflict of interest.

Supplementary data is available at Horticulture Research online.

Kilaru   A , Cao   X , Dabbs   PB . et al.    Oil biosynthesis in a basal angiosperm: transcriptome analysis of Persea americana mesocarp . BMC Plant Biol . 2015 ; 15 : 203

Cowan   AK , Wolstenholme   BN . Avocados. In: Caballero   B , ed. Encyclopedia of Food Sciences and Nutrition . 2nd ed. Oxford : Academic Press , 2003 , 348 – 53

Mahmassani   HA , Avendano   EE , Raman   G . et al.    Avocado consumption and risk factors for heart disease: a systematic review and meta-analysis . Am J Clin Nutr . 2018 ; 107 : 523 – 36

Food and Agriculture Organization of the United Nations . FAOSTAT Statistical Database . Rome , 2021

Kimaru   KS , Muchemi   KP , Mwangi   JW . et al.    Effects of anthracnose disease on avocado production in Kenya . Cogent Food Agric . 2020 ; 6 : 6

Ramírez-Gil   JG , Gilchrist Ramelli   E , Morales Osorio   JG . Economic impact of the avocado (cv. Hass) wilt disease complex in Antioquia, Colombia, crops under different technological management levels . Crop Prot . 2017 ; 101 : 103 – 15

Gil   GR , Osorio   JM .   First report of Cylindrocarpon destructans (Zinss) Scholten affecting avocado ( Persea americana Mill) seedling in Colombia . Rev Protección Veg . 2013 ; 28 : 27 – 35

Dann   EK , Cooke   AW , Forsberg   LI . et al.    Pathogenicity studies in avocado with three nectriaceous fungi, Calonectria ilicicola , Gliocladiopsis sp. and Ilyonectria liriodendri . Plant Pathol . 2012 ; 61 : 896 – 902

Vitale   A , Aiello   D , Guarnaccia   V . et al.    First report of root rot caused by Ilyonectria (= Neonectria ) macrodidyma on avocado ( Persea americana ) in Italy . J Phytopathol . 2011 ; 160 : 156 – 9

Zilberstein   M , Elkind   G , Zeidan   M . et al.    Wilting disease of young avocado trees caused by Neonectria radicicola in Israel . Proceedings VI World Avocado Congress . 2007 : 12 – 16

Besoain   X , Piontelli   E .   Black root rot in avocado plants ( Persea americana Mill.) by Cylindrocarpon destructans : Pathogenicity and epidemiological aspects . Bol Micol . 1999 ; 14 : 41 – 7

Perez-Torres   CA , Ibarra-Laclette   E , Hernandez-Dominguez   EE . et al.    Molecular evidence of the avocado defense response to Fusarium kuroshium infection: a deep transcriptome analysis using RNA-Seq . PeerJ . 2021 ; 9 : e11215

Li-Beisson   Y , Shorrosh   B , Beisson   F . et al.    Acyl-lipid metabolism . The Arabidopsis Book . 2010 ; 8 : e0133

Harwood   JL . Recent advances in the biosynthesis of plant fatty acids . Biochim Biophys Acta . 1996 ; 1301 : 7 – 56

Cerone   M , Smith   TK . Desaturases: structural and mechanistic insights into the biosynthesis of unsaturated fatty acids . IUBMB Life . 2022 ; 74 : 1036 – 51

Hou   X , Wang   D , Cheng   Z . et al.    A near-complete assembly of an Arabidopsis thaliana genome . Mol Plant . 2022 ; 15 : 1247 – 50

Naish   M , Alonge   M , Wlodzimierz   P . et al.    The genetic and epigenetic landscape of the Arabidopsis centromeres . Science . 2021 ; 374 : eabi7489

Song   JM , Xie   WZ , Wang   S . et al.    Two gap-free reference genomes and a global view of the centromere architecture in rice . Mol Plant . 2021 ; 14 : 1757 – 67

Li   K , Jiang   W , Hui   Y . et al.    Gapless indica rice genome reveals synergistic contributions of active transposable elements and segmental duplications to rice genome evolution . Mol Plant . 2021 ; 14 : 1745 – 56

Zhang   L , Liang   J , Chen   H . et al.    A near-complete genome assembly of Brassica rapa provides new insights into the evolution of centromeres . Plant Biotechnol J . 2023 ; 21 : 1022 – 32

Yue   J , Chen   Q , Wang   Y . et al.    Telomere-to-telomere and gap-free reference genome assembly of the kiwifruit Actinidia chinensis . Hortic Res . 2023 ; 10 : uhac264

Han   X , Zhang   Y , Zhang   Q . et al.    Two haplotype-resolved, gap-free genome assemblies of Actinidia latifolia and Actinidia chinensis shed light on regulation mechanisms of vitamin C and sucrose metabolism in kiwifruit . Mol Plant . 2022 ; 16 : 452 – 70

Nie   S , Zhao   SW , Shi   TL . et al.    Gapless genome assembly of azalea and multi-omics investigation into divergence between two species with distinct flower color . Hortic Res . 2023 ; 10 : uhac241

Li   F , Xu   S , Xiao   Z . et al.    Gap-free genome assembly and comparative analysis reveal the evolution and anthocyanin accumulation mechanism of Rhodomyrtus tomentosa . Hortic Res . 2023 ; 10 : uhad005

Zhong   CX , Marshall   JB , Topp   C . et al.    Centromeric retroelements and satellites interact with maize kinetochore protein CENH3 . Plant Cell . 2002 ; 14 : 2825 – 36

Comai   L , Maheshwari   S , Marimuthu   MPA . Plant centromeres . Curr Opin Plant Biol . 2017 ; 36 : 158 – 67

Talbert   PB , Henikoff   S . The genetics and epigenetics of satellite centromeres . Genome Res . 2022 ; 32 : 608 – 15

Walkowiak   S , Gao   L , Monat   C . et al.    Multiple wheat genomes reveal global variation in modern breeding . Nature . 2020 ; 588 : 277 – 83

Copenhaver   GP , Nickel   K , Kuromori   T . et al.    Genetic definition and sequence analysis of Arabidopsis centromeres . Science . 1999 ; 286 : 2468 – 74

Cheng   Z , Dong   F , Langdon   T . et al.    Functional rice centromeres are marked by a satellite repeat and a centromere-specific retrotransposon . Plant Cell . 2002 ; 14 : 1691 – 704

Ananiev   EV , Phillips   RL , Rines   HW . Chromosome-specific molecular organization of maize ( Zea mays L.) centromeric regions . Proc Natl Acad Sci USA . 1998 ; 95 : 13073 – 8

Su   H , Liu   Y , Liu   C . et al.    Centromere satellite repeats have undergone rapid changes in Polyploid wheat subgenomes . Plant Cell . 2019 ; 31 : 2035 – 51

Rendon-Anaya   M , Ibarra-Laclette   E , Mendez-Bravo   A . et al.    The avocado genome informs deep angiosperm phylogeny, highlights introgressive hybridization, and reveals pathogen-influenced gene space adaptation . Proc Natl Acad Sci USA . 2019 ; 116 : 17081 – 9

Nath   O , Fletcher   SJ , Hayward   A . et al.    A haplotype resolved chromosomal level avocado genome allows analysis of novel avocado genes . Hortic Res . 2022 ; 9 : uhac157

Rubinstein   M , Eshed   R , Rozen   A . et al.    Genetic diversity of avocado ( Persea americana mill.) germplasm using pooled sequencing . BMC Genomics . 2019 ; 20 : 379

Talavera   A , Soorni   A , Bombarely   A . et al.    Genome-wide SNP discovery and genomic characterization in avocado ( Persea americana mill.) . Sci Rep . 2019 ; 9 : 20137

Castillo-Argaez   R , Konkol   JL , Vargas   AI . et al.    Disease severity and ecophysiology of rootstock/scion combinations of different avocado ( Persea americana Mill.) genotypes in response to laurel wilt . Sci Hortic . 2021 ; 287 : 110250

Solares   E , Morales-Cruz   A , Balderas   RF . et al.    Insights into the domestication of avocado and potential genetic contributors to heterodichogamy . G3 (Bethesda) . 2023 ; 13 :jkac323

Cheng   H , Jarvis   ED , Fedrigo   O . et al.    Haplotype-resolved assembly of diploid genomes without parental data . Nat Biotechnol . 2022 ; 40 : 1332 – 5

Deshpande   AS , Ulahannan   N , Pendleton   M . et al.    Identifying synergistic high-order 3D chromatin conformations from genome-scale nanopore concatemer sequencing . Nat Biotechnol . 2022 ; 40 : 1488 – 99

Durand   NC , Robinson   JT , Shamim   MS . et al.    Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom . Cell Syst . 2016 ; 3 : 99 – 101

Lin   Y , Ye   C , Li   X . et al.    quarTeT: a telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification . Hortic Res . 2023 ; 10 : uhad127

Ou   S , Su   W , Liao   Y . et al.    Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline . Genome Biol . 2019 ; 20 : 275

Bruna   T , Lomsadze   A , Borodovsky   MA .   A new gene finding tool GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes . bioRxiv . 2024

Blum   M , Chang   H-Y , Chuguransky   S . et al.    The InterPro protein families and domains database: 20 years on . Nucleic Acids Res . 2021 ; 49 : D344 – 54

Jones   P , Binns   D , Chang   HY . et al.    InterProScan 5: genome-scale protein function classification . Bioinformatics . 2014 ; 30 : 1236 – 40

Huerta-Cepas   J , Szklarczyk   D , Heller   D . et al.    eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses . Nucleic Acids Res . 2019 ; 47 : D309 – 14

Cantalapiedra   CP , Hernandez-Plaza   A , Letunic   I . et al.    eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale . Mol Biol Evol . 2021 ; 38 : 5825 – 9

Nawrocki   EP , Eddy   SR . Infernal 1.1: 100-fold faster RNA homology searches . Bioinformatics . 2013 ; 29 : 2933 – 5

Griffiths-Jones   S , Moxon   S , Marshall   M . et al.    Rfam: annotating non-coding RNAs in complete genomes . Nucleic Acids Res . 2005 ; 33 : D121 – 4

Chandrasekhara   C , Mohannath   G , Blevins   T . et al.    Chromosome-specific NOR inactivation explains selective rRNA gene silencing and dosage control in Arabidopsis . Genes Dev . 2016 ; 30 : 177 – 90

Rhie   A , Walenz   BP , Koren   S . et al.    Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies . Genome Biol . 2020 ; 21 : 245

Ou   S , Chen   J , Jiang   N . Assessing genome assembly quality using the LTR assembly index (LAI) . Nucleic Acids Res . 2018 ; 46 :e126

Manni   M , Berkeley   MR , Seppey   M . et al.    BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes . Mol Biol Evol . 2021 ; 38 : 4647 – 54

Rognes   T , Flouri   T , Nichols   B . et al.    VSEARCH: a versatile open source tool for metagenomics . PeerJ . 2016 ; 4 :e2584

Araújo   RG , Rodriguez-Jasso   RM , Ruiz   HA . et al.    Avocado by-products: nutritional and functional properties . Trends Food Sci Technol . 2018 ; 80 : 51 – 60

Navratilova   P , Toegelova   H , Tulpova   Z . et al.    Prospects of telomere-to-telomere assembly in barley: analysis of sequence gaps in the MorexV3 reference genome . Plant Biotechnol J . 2022 ; 20 : 1373 – 86

Wang   T , Wang   B , Hua   X . et al.    A complete gap-free diploid genome in Saccharum complex and the genomic footprints of evolution in the highly polyploid Saccharum genus . Nat Plants . 2023 ; 9 : 554 – 71

Huang   Y , Ding   W , Zhang   M . et al.    The formation and evolution of centromeric satellite repeats in Saccharum species . Plant J . 2021 ; 106 : 616 – 29

Kapos   P , Devendrakumar   KT , Li   X . Plant NLRs: from discovery to application . Plant Sci . 2019 ; 279 : 3 – 18

Barragan   AC , Weigel   D . Plant NLR diversity: the known unknowns of pan-NLRomes . Plant Cell . 2021 ; 33 : 814 – 31

Okada   A , Okada   K , Miyamoto   K . et al.    OsTGAP1, a bZIP transcription factor, coordinately regulates the inductive production of diterpenoid phytoalexins in rice . J Biol Chem . 2009 ; 284 : 26510 – 8

Zhan   C , Shen   S , Yang   C . et al.    Plant metabolic gene clusters in the multi-omics era . Trends Plant Sci . 2022 ; 27 : 981 – 1001

Kim   T-H , Kunz   H-H , Bhattacharjee   S . et al.    Natural variation in small molecule-induced TIR-NB-LRR signaling induces root growth arrest via EDS1- and PAD4-complexed R protein VICTR inArabidopsis . Plant Cell . 2012 ; 24 : 5177 – 92

Moreno   AO , Dorantes   L , Galindez   J . et al.    Effect of different extraction methods on fatty acids, volatile compounds, and physical and chemical properties of avocado ( Persea americana mill.) oil . J Agric Food Chem . 2003 ; 51 : 2216 – 21

Andre   C , Haslam   RP , Shanklin   J . Feedback regulation of plastidic acetyl-CoA carboxylase by 18:1-acyl carrier protein in Brassica napus . Proc Natl Acad Sci USA . 2012 ; 109 : 10107 – 12

Snapp   AR , Lu   C . Engineering industrial fatty acids in oilseeds . Front Biol . 2012 ; 8 : 323 – 32

Damude   HG , Kinney   AJ . Engineering oilseeds to produce nutritional fatty acids . Physiol Plant . 2007 ; 132 : 1 – 10

Kachroo   A , Shanklin   J , Whittle   E . et al.    The Arabidopsis stearoyl-acyl carrier protein-desaturase family and the contribution of leaf isoforms to oleic acid synthesis . Plant Mol Biol . 2006 ; 63 : 257 – 71

Chen   S , Zhou   Y , Chen   Y . et al.    Fastp: an ultra-fast all-in-one FASTQ preprocessor . Bioinformatics . 2018 ; 34 : i884 – 90

Ranallo-Benavidez   TR , Jaron   KS , Schatz   MC . GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes . Nat Commun . 2020 ; 11 : 1432

Rautiainen   M , Nurk   S , Walenz   BP . et al.    Telomere-to-telomere assembly of diploid chromosomes with Verkko . Nat Biotechnol . 2023 ; 41 : 1474 – 82

Hu   J , Wang   Z , Sun   Z . et al.    An efficient error correction and accurate assembly tool for noisy long reads . bioRxiv . 2023

Nurk   S , Walenz   BP , Rhie   A . et al.    HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads . Genome Res . 2020 ; 30 : 1291 – 305

Guan   D , Mccarthy   SA , Wood   J . et al.    Identifying and removing haplotypic duplication in primary genome assemblies . Bioinformatics . 2020 ; 36 : 2896 – 8

Dudchenko   O , Batra   SS , Omer   AD . et al.    De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds . Science . 2017 ; 356 : 92 – 5

Hu   J , Fan   J , Sun   Z . et al.    NextPolish: a fast and efficient genome polishing tool for long-read assembly . Bioinformatics . 2020 ; 36 : 2253 – 5

Li   H . New strategies to improve minimap2 alignment accuracy . Bioinformatics . 2021 ; 37 : 4572 – 4

Danecek   P , Bonfield   JK , Liddle   J . et al.    Twelve years of SAMtools and BCFtools . GigaScience . 2021 ; 10 : giab008

Hu   J , Wang   Z , Liang   F . et al.    NextPolish2: A repeat-aware polishing tool for genomes assembled using HiFi long reads . Genom Proteom Bioinform . 2024 ; qzad009

Zhang   RG , Li   GY , Wang   XL . et al.    TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes . Hortic Res . 2022 ; 9 :uhac017

Novak   P , Avila Robledillo   L , Koblizkova   A . et al.    TAREAN: a computational tool for identification and characterization of satellite DNA from unassembled short reads . Nucleic Acids Res . 2017 ; 45 :e111

Bruna   T , Hoff   KJ , Lomsadze   A . et al.    BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database . NAR Genom Bioinform . 2021 ; 3 :lqaa108

Gabriel   L , Hoff   KJ , Bruna   T . et al.    TSEBRA: transcript selector for BRAKER . BMC Bioinformatics . 2021 ; 22 : 566

Campbell   MS , Holt   C , Moore   B . et al.    Genome annotation and curation using MAKER and MAKER-P . Curr Protoc Bioinform . 2014 ; 48 : 4.11.11 – 14.11.39

Pertea   G , Pertea   M . GFF utilities: GffRead and GffCompare . F1000Res . 2020 ; 9 : 9

Gremme   G , Steinbiss   S , Kurtz   S . GenomeTools: a comprehensive software library for efficient processing of structured genome annotations . IEEE/ACM Trans Comput Biol Bioinform . 2013 ; 10 : 645 – 56

Buchfink   B , Reuter   K , Drost   HG . Sensitive protein alignments at tree-of-life scale using DIAMOND . Nat Methods . 2021 ; 18 : 366 – 8

Zheng   Z , Li   S , Su   J . et al.    Symphonizing pileup and full-alignment for deep learning-based long-read variant calling . Nat Comput Sci . 2022 ; 2 : 797 – 803

Wang   K , Li   M , Hakonarson   H . ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data . Nucleic Acids Res . 2010 ; 38 : e164 – 4

Komuro   S , Endo   R , Shikata   K . et al.    Genomic and chromosomal distribution patterns of various repeated DNA sequences in wheat revealed by a fluorescence in situ hybridization procedure . Genome . 2013 ; 56 : 131 – 7

Langmead   B , Salzberg   SL . Fast gapped-read alignment with bowtie 2 . Nat Methods . 2012 ; 9 : 357 – 9

Kim   D , Paggi   JM , Park   C . et al.    Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype . Nat Biotechnol . 2019 ; 37 : 907 – 15

Quinlan   AR , Hall   IM . BEDTools: a flexible suite of utilities for comparing genomic features . Bioinformatics . 2010 ; 26 : 841 – 2

Chen   C , Chen   H , Zhang   Y . et al.    TBtools: an integrative toolkit developed for interactive analyses of big biological data . Mol Plant . 2020 ; 13 : 1194 – 202

Ou   S , Jiang   N . LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons . Plant Physiol . 2018 ; 176 : 1410 – 22

Benson   G . Tandem repeats finder: a program to analyze DNA sequences . Nucleic Acids Res . 1999 ; 27 : 573 – 80

Ramirez   F , Bhardwaj   V , Arrigoni   L . et al.    High-resolution TADs reveal DNA sequences underlying genome organization in flies . Nat Commun . 2018 ; 9 : 189

Goel   M , Sun   H , Jiao   W-B . et al.    SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies . Genome Biol . 2019 ; 20 : 277

Goel   M , Schneeberger   K , Robinson   P . Plotsr: visualizing structural similarities and rearrangements between multiple genomes . Bioinformatics . 2022 ; 38 : 2922 – 6

Steuernagel   B , Witek   K , Krattinger   SG . et al.    The NLR-annotator tool enables annotation of the intracellular immune receptor repertoire . Plant Physiol . 2020 ; 183 : 468 – 82

Van De Weyer   AL , Monteiro   F , Furzer   OJ . et al.    A species-wide inventory of NLR genes and alleles in Arabidopsis thaliana . Cell . 2019 ; 178 : 1260 – 1272.e14

Li   B , Dewey   CN . RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome . BMC Bioinformatics . 2011 ; 12 : 323

Dobin   A , Davis   CA , Schlesinger   F . et al.    STAR: ultrafast universal RNA-seq aligner . Bioinformatics . 2013 ; 29 : 15 – 21

Rozewicki   J , Li   S , Amada   KM . et al.    MAFFT-DASH: integrated protein sequence and structural alignment . Nucleic Acids Res . 2019 ; 47 : W5 – 10

Vilella   AJ , Severin   J , Ureta-Vidal   A . et al.    EnsemblCompara GeneTrees: complete, duplication-aware phylogenetic trees in vertebrates . Genome Res . 2009 ; 19 : 327 – 35

  • Online ISSN 2052-7276
  • Print ISSN 2662-6810
  • Copyright © 2024 Nanjing Agricultural University

Lynch Syndrome: Genetic Disorder Raises Risk of Many Cancers

July 02, 2024

Scientist pipetting sample into tray for DNA testing in laboratory

Lynch syndrome is a genetic disorder that increases the risk of developing some types of cancer at a younger age. Most often associated with colorectal cancer and uterine cancer, the disorder affects roughly 1 in 300 people.

“Colon and uterine cancer are the two big ones, but there is a long list of cancers that can be caused by Lynch syndrome,” says University Hospitals oncologist/hematologist Sakti Chakrabarti, MD. He shares more.

Common Cancer Types

Lynch syndrome is caused by mutations in one of five genes: MLH1, MSH2, MSH6, PMS2 or EPCAM. A person’s cancer risk varies depending on which gene mutation is present. Not everyone with Lynch syndrome will develop cancer. But the risk is substantial because of a gene mutation that prevents cells from repairing damage, Dr. Chakrabarti says.

Cancers associated with Lynch syndrome include:

  • Stomach (gastric)
  • Small intestine
  • Urinary tract (kidney, ureter, bladder)
  • Biliary tract (liver, gall bladder, bile ducts)
  • Certain skin cancers

The Importance of Genetic Testing

A family history of Lynch syndrome-associated cancers is an important reason to get genetic testing for the disorder. If one parent carries the gene mutation, there’s a 50 percent chance that a child will inherit Lynch syndrome. “Children should be screened for this condition because the risk is very high,” Dr. Chakrabarti says.

Genetic testing also may be recommended for anyone who has had colorectal or uterine cancer before age 50, or who has had more than one Lynch syndrome-associated cancer.

Cancer Prevention & Screening

People with Lynch syndrome are urged to undergo preventive cancer screenings at an earlier age than usual. For example, colonoscopy screenings typically begin at age 45. For people with Lynch syndrome, screenings should begin around age 20 to 25, Dr. Chakrabarti says. The screening recommendations vary according to the results of genetic testing and family history:

  • A person with Lynch syndrome and a family history of colorectal cancer should be screened every three to five years starting in their 30s.
  • Women are advised to consider transvaginal ultrasound exam and uterine biopsy every year or two.
  • Preventive surgeries, such as hysterectomy or oophorectomy, can also help reduce a woman’s chances of developing uterine or ovarian cancers.

Dr. Chakrabarti says the development of immunotherapies to treat cancers associated with Lynch syndrome has been promising. Immunotherapy uses a patient’s immune system to attack cancer cells.

“As we’ve increasingly used immunotherapy over the last decade, we’ve found that patients with Lynch syndrome respond very well to it,” he says. “Lynch syndrome was not talked about 10 or 15 years ago, but the discovery of immunotherapy has made it an area of focus for the prevention and treatment of cancer.”

Related Links

At University Hospitals Seidman Cancer Center, our care team provides the most advanced forms of cancer care, from prevention, screening, diagnosis and treatment through survivorship.

  • 26 June 2024
  • Clarification 27 June 2024

Estonians gave their DNA to science — now they’re learning their genetic secrets

  • Ewen Callaway

A large crowd of people stand close together, all looking in a similar direction.

One-fifth of Estonians now have access to information about genetic variants that could increase their chances of certain illnesses. Credit: Ints Vikmanis/Shutterstock

While much of Europe is obsessing over this year’s European Football Championships, many Estonians — whose team didn’t qualify — are absorbed in their own genomes.

This month, the 210,000 Estonians who have contributed samples to the country’s biobank — around 20% of the adult population — were given the opportunity to learn about some of their genetic traits, including disease risk, ancestry markers and how they handle caffeine.

So many people flocked to the online portal that parts of it crashed soon after it launched. “Genetic literacy in the Estonian population is maybe higher than elsewhere,” says Lili Milani, head of the Estonian Biobank and a pharmacogenomicist at the University of Tartu. “The interest is really high.”

The project is one of the world’s biggest efforts to return genetic results to research participants — most biobanks do not provide such information. One reason for sharing the results, say scientists, is to recognize the value that participants contribute. “People have donated their data for this research, and they want something back,” says Andrea Ganna, a statistical geneticist at the University of Helsinki. “It’s a no-brainer. We need to do it and participants want it.”

Returning results

The Estonian Biobank was created by a law passed in 2000, which required that participants be able to access their genetic data. But informing so many people about their genomes is easier said than done.

At first, specialists individually counselled participants with a high genetic risk of certain conditions, including breast cancer and cardiovascular disease, or with rare gene variants that affect how they metabolize drugs. But these ‘recall studies’ reached only 5,000 participants, says Milani. “We cannot do face-to-face consultations for 200,000 people.”

The biobank’s online portal provides more limited insights, but the emphasis is still on data that participants can use to improve their health. As well as information about cardiovascular disease and type 2 diabetes based on factors including hundreds of thousands of DNA variants, Estonians receive advice about how losing weight and making other lifestyle changes can cut disease risk. “We know genetic risk alone doesn’t tell you much. You need to put this in the context of your lifestyle,” says Milani.

The portal also informs participants about genetic influences on how their body handles medicines, such as certain blood thinners, and other substances. Milani’s own results show she carries a gene variant that slows the breakdown of caffeine, amplifying its effects. “I had one coffee yesterday and couldn’t sleep. It’s been a bit hectic with the launch of the portal,” she says.

More than 75,000 biobank participants have already visited the website, showing that interest is high, says Milani. (A measure of Neanderthal ancestry that the biobank provides has been trending on Estonian social media.) To measure the effects of receiving health-related information, Milani and her colleagues plan to compare the future health of participants who log into the portal with that of those who don’t.

“The hope, anticipation and expectation is that this should improve people’s health care,” says Dan Roden, a cardiologist and clinical pharmacologist working on personalized medicine at Vanderbilt University in Nashville, Tennessee.

Genetic counselling

The Estonian Biobank data release is part of a growing trend among population health studies. The US-government-funded All of Us study, which aims to collect genome and health data from more than one million people from diverse backgrounds, has communicated genetic results to more than 100,000 participants, with the goal of eventually giving all participants the opportunity to receive this information.

The study examines a set of 59 genes for genetic variations linked to diseases that can be treated or prevented. The 3% of participants who carry any of these mutations receive genetic counselling to learn about the results, says Heidi Rehm, a clinical genomicist at Massachusetts General Hospital in Boston who is part of All of Us.

“If we’ve got their genomes and there’s really critical information in there, particularly that our researchers may study, it seems unfair not to also let them have it,” she adds. Participants who don’t have any of these mutations can see the results online and can still request a meeting with a genetic counsellor.

To be able to show results to participants, All of Us had to jump through several hoops — including operating under a regulatory protocol with the US Food and Drug Administration, which regulates genetic testing. The Estonian programme, Milani says, went through two years of back-and-forth communication with an ethics review board.

Another challenge is funding, says Ganna. Tasks such as contacting participants by post could cost hundreds of thousands of euros.

Returning genetic results is an ongoing process, says Roden. As scientists’ understanding of the links between genetics and health changes, so should the information participants receive. “You’re never at the end of this voyage,” Roden says. “I admire the Estonians, and I think it is a wonderful experiment, a wonderful way of moving genome science forward.”

Nature 631 , 17 (2024)

doi: https://doi.org/10.1038/d41586-024-02108-y

Updates & Corrections

Clarification 27 June 2024 : An earlier version of this article stated that the All of Us study got a ‘stamp of approval’ from the US Food and Drug Administration. This has been amended to clarify that its discussions with the FDA are still ongoing.

COMMENTS

  1. Genetics

    Genetics is the branch of science concerned with genes, heredity, and variation in living organisms. It seeks to understand the process of trait inheritance from parents to offspring, including ...

  2. Human Molecular Genetics and Genomics

    Genomic research has evolved from seeking to understand the fundamentals of the human genetic code to examining the ways in which this code varies among people, and then applying this knowledge to ...

  3. Genetics research

    Genetics research is the scientific discipline concerned with the study of the role of genes in traits such as the development of disease.

  4. Genetics

    A specialized type of immune cell appears primed to make the type of antibodies that lead to allergies, two research groups report.

  5. Research articles

    Genetic variants of interferon-response factor 5 are associated with the incidence of chronic kidney disease: the D.E.S.I.R. study. Frédéric Fumeron. Gilberto Velho. Nicolas Venteclef. Article ...

  6. PLOS Genetics

    History of tuberculosis disease is associated with genetic regulatory variation in Peruvians. Genetic variation explains risk of TB by regulating the expression of genes involved in the control of Mtb infection.

  7. Genetics

    Genetics coverage from Scientific American, featuring news and articles about advances in the field.

  8. Genetics

    Genetic diagnosis of rare diseases is made through a variety of methods. This study gauged the diagnostic yield of genome sequencing after negative results from exome sequencing and other methods.

  9. Scientists Finish the Human Genome at Last

    The complete genome uncovered more than 100 new genes that are probably functional, and many new variants that may be linked to diseases.

  10. Genetics in the 21st Century: Implications for patients, consumers and

    We would stress that the purpose of this article is to highlight, in a broad way, the myriad implications of genetics (and hence the need for continuing sociological research) without getting drawn into any specific ethical debates in detail.

  11. Genetics Research

    Genetics Research is an open access journal providing a key forum for original research on all aspects of human and animal genetics, reporting key findings on genomes, genes, mutations, developmental, evolutionary, and population genetics as well as ethical, legal and social aspects.

  12. Advance articles

    Genetics of Bacteria: a call for papers. Genetics, iyae096, https://doi.org/10.1093/genetics/iyae096. Published: 21 June 2024.

  13. The genetic basis of disease

    This review explores the genetic basis of human disease, including single gene disorders, chromosomal imbalances, epigenetics, cancer and complex disorders, and considers how our understanding and technological advances can be applied to provision of appropriate diagnosis, management and therapy for patients.

  14. Research articles

    Research into the genetics of immune and inflammatory disease has experienced major recent advances owing to the availability of a custom single-nucleotide polymorphism (SNP) genotyping array ...

  15. Genes: Function, makeup, Human Genome Project, and research

    Genes contain instructions for life and survival. New genetic discoveries offer insights into how life works, and hope for preventing and curing diseases.

  16. Genes

    Genes is an international, peer-reviewed open access journal focusing on genetics and molecular biology research.

  17. Research shows how RNA 'junk' controls our genes

    Researchers have made a significant advance in understanding how genes are controlled in living organisms. The new study focuses on critical snippets of RNA in the tiny, transparent roundworm ...

  18. DNA and Genes

    News and articles about advances in genetic sequencing, genetics and how DNA (deoxyribonucleic acid) might be used in the future.

  19. Genetics Research

    Genetics Research is a fully open access journal providing a key forum for original research on all aspects of human and animal genetics, reporting key findings on genomes, genes, mutations and molecular interactions, extending out to developmental, evolutionary, and population genetics as well as ethical, legal and social aspects.

  20. Breast Cancer Genetics: Diagnostics and Treatment

    Breast cancer (BC) genetics has become a fundamental aspect of BC management. It influences screening, follow-up, prophylactic and therapeutic recommendations in women harboring a germinal BC susceptibility gene. In addition, it helps to identify patient subgroups with either a different prognosis or different response to treatment.

  21. Research articles

    Non-stem cell lineages as an alternative origin of intestinal tumorigenesis in the context of inflammation. Upon inflammation and targeted gene mutation, some fully differentiated secretory and ...

  22. 5 takeaways from the Human Genome Project investigation

    To piece together this history, Undark examined more than 100 emails, letters, and other documents, and interviewed many of the Human Genome Project's central figures.

  23. About Genes

    Aims Genes (ISSN 2073-4425) is an international, peer-reviewed open access journal which provides an advanced forum for studies related to genes, genetics and genomics. It publishes reviews, research articles, communications and technical notes.

  24. A telomere-to-telomere gap-free reference genome assembly of avocado

    A telomere-to-telomere gap-free reference genome assembly of avocado provides useful resources for identifying genes related to fatty acid biosynthesis and disease resistance. Tianyu Yang, ... Horticulture Research, ...

  25. Do genetics determine whether coffee is good or bad for you?

    A new study explored links between the genetics of coffee consumption and outcomes including obesity, substance use and mental health.

  26. The road ahead in genetics and genomics

    His research focuses on the microbiome, nutrition and genetics, and their effect on health and disease and aims to develop personalized medicine based on big data from human cohorts.

  27. NC State and Novartis Partner to Innovate Gene and Cell Therapy

    NC State University and Novartis Gene Therapies have launched a groundbreaking partnership to enhance the manufacturing of gene and cell therapies, focusing on Lentivirus—a vital gene delivery vector used in treating aggressive cancers. This collaboration not only advances critical research but also supports the Ph.D. journey of an incoming graduate student.

  28. Browse Articles

    Mitochondrial genetics through the lens of single-cell multi-omics. This Review discusses emerging technologies and insights from mitochondrial DNA variant profiling obtained by single-cell multi ...

  29. Lynch Syndrome: Genetic Disorder Raises Risk of Many Cancers

    A family history of Lynch syndrome-associated cancers is an important reason to get genetic testing for the disorder.

  30. Estonians gave their DNA to science

    Project covering one-fifth of the country's population is one of the largest-ever efforts to share results on genetic health risks with research participants.