Pet Ownership and Quality of Life: A Systematic Review of the Literature

Kristel J. Scoresby

1 College of Social Work, University of Tennessee, Knoxville, TN 37996, USA; kscoresb@vols.utk.edu

Elizabeth B. Strand

2 Veterinary Social Work, Colleges of Veterinary Medicine and Social Work, University of Tennessee, Knoxville, TN 37996, USA; estrand@utk.edu

Zenithson Ng

3 College of Veterinary Medicine, University of Tennessee, Knoxville, TN 37996, USA; zng@utk.edu (Z.N.); cstilz@vols.utk.edu (C.R.S.); kstrobel@vols.utk.edu (K.S.)

Kathleen C. Brown

4 Department of Public Health, University of Tennessee, Knoxville, TN 37996, USA; kcbrown@utk.edu (K.C.B.); cbarroso@utk.edu (C.S.B.)

Charles Robert Stilz

Kristen Strobel

Cristina S. Barroso

Marcy Souza

Associated Data

Data was not generated in this study.

Pet ownership is the most common form of human–animal interaction, and anecdotally, pet ownership can lead to improved physical and mental health for owners. However, scant research is available validating these claims. This study aimed to review the recent peer-reviewed literature to better describe the body of knowledge surrounding the relationship between pet ownership and mental health. A literature search was conducted in May 2020 using two databases to identify articles that met inclusion/exclusion criteria. After title review, abstract review, and then full article review, 54 articles were included in the final analysis. Of the 54 studies, 18 were conducted in the general population, 15 were conducted in an older adult population, eight were conducted in children and adolescents, nine focused on people with chronic disease, and four examined a specific unique population. Forty-one of the studies were cross-sectional, 11 were prospective longitudinal cohorts, and two were other study designs. For each of the articles, the impact of pet ownership on the mental health of owners was divided into four categories: positive impact ( n = 17), mixed impact ( n = 19), no impact ( n = 13), and negative impact ( n = 5). Among the reviewed articles, there was much variation in population studied and study design, and these differences make direct comparison challenging. However, when focusing on the impact of pet ownership on mental health, the results were variable and not wholly supportive of the benefit of pets on mental health. Future research should use more consistent methods across broader populations, and the development of a pet-ownership survey module for use in broad population surveys would afford a better description of the true relationship between pet ownership and mental health.

1. Introduction

Throughout history, animals have played a significant role in society, including in agriculture and pet ownership. A recent survey conducted in the United States estimated that approximately 67% of homes had at least one pet, equaling about 63 million homes with at least one dog and 42 million homes with at least one cat [ 1 ]. Pets can constitute a connection to nature, function in recreational and work activities, and provide companionship in our homes [ 2 , 3 , 4 ]. The importance of animals in our lives is founded on the human–animal bond concept, which is the “mutually beneficial and dynamic relationship that exists between people and other animals that is influenced by behaviors that are essential to the health and well-being of both” [ 5 ]. This concept has championed animals as companions and family members, making them an essential part of everyday life for many. The human–animal bond has additionally driven the common belief that pets are good for human health, both physical and mental [ 6 , 7 , 8 ].

While there are some qualitative studies [ 9 , 10 ] that claim that pet ownership benefits people, particularly in regard to improved mental health, there are few studies with substantial evidence from large, diverse population samples to support this theory. The studies that have been published are often not substantiated with regard to study populations or methods, making broad conclusions difficult. Furthermore, some studies that have investigated the correlation between pet ownership and mental health have revealed no effect, or even worse, negative effects of pet ownership [ 11 , 12 , 13 , 14 , 15 ]. The inconsistencies in the literature and limitations of these studies warrant a thorough exploration of the effect of pet ownership on mental health outcomes among large, diverse population samples.

Two previous systematic reviews of the literature did examine the relationship between pet ownership and mental health/well-being [ 16 , 17 ]. Islam and Towel [ 16 ] did not find a clear relationship between pet ownership and well-being in the 11 studies included in their review. Similarly, Brooks et al. [ 17 ] examined the role of pets in owners with diagnosed mental health problems and found mixed results across the 17 studies included in the review. The purpose of this study was to perform a systematic review of the peer-reviewed published literature containing original research that examined the relationship between pet ownership and mental health for people in any population. Previous reviews included a smaller sample of research articles, often limited to a specific population of pet owners. By describing the relationship between pet ownership and mental health across all examined populations, this study will better inform whether pets could be recommended to help with mental health and whether promotion of the human–animal bond is generally beneficial.

2. Materials and Methods

The systematic review process involved a literature search, screening, extraction, and an assessment of the remaining articles by four researchers and three graduate students. For the purpose of this study, pet ownership was limited to dogs and cats. Our research team sought to answer, “How does ownership of a dog or cat influence the mental health or quality of life of pet owners?”

In May 2020, two databases, PubMed and Web of Science, were searched for peer-reviewed articles on pet ownership and mental health. Using Boolean operators, the literature search combined three groups of terms: anxiety OR depressi* OR bipolar OR (mental* AND (health OR disease* OR disorder* OR condition* OR ill*)) for the problem; (dog OR dogs OR cat OR cats OR canine* OR feline*) AND (pet OR pets) AND (owner* OR companion* OR interact* OR bond* OR “human animal bond” OR “animal human bond” OR “animal assisted”) for the intervention; and health* AND (impact* OR outcome* OR status OR effect* OR affect* OR consequen* OR result*) for the outcome.
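The three groups of search terms can be kept as separate strings and checked before being pasted into a database search box. The following is a minimal sketch only: the term lists are taken from the search description (with parentheses balanced so the strings parse), while the helper names `parens_balanced` and `build_query`, and the choice to join the three groups with AND, are assumptions for illustration and not the authors' documented procedure.

```python
# Illustrative sketch; the helper names and the AND-join between groups are
# assumptions, not the authors' documented search procedure.

PROBLEM = (
    "anxiety OR depressi* OR bipolar OR "
    "(mental* AND (health OR disease* OR disorder* OR condition* OR ill*))"
)
INTERVENTION = (
    "(dog OR dogs OR cat OR cats OR canine* OR feline*) AND (pet OR pets) AND "
    '(owner* OR companion* OR interact* OR bond* OR "human animal bond" '
    'OR "animal human bond" OR "animal assisted")'
)
OUTCOME = (
    "health* AND (impact* OR outcome* OR status OR effect* OR affect* "
    "OR consequen* OR result*)"
)


def parens_balanced(query: str) -> bool:
    """Return True if every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def build_query(*blocks: str) -> str:
    """Wrap each term group in parentheses and join the groups with AND."""
    for block in blocks:
        if not parens_balanced(block):
            raise ValueError(f"unbalanced parentheses in: {block}")
    return " AND ".join(f"({block})" for block in blocks)


if __name__ == "__main__":
    print(build_query(PROBLEM, INTERVENTION, OUTCOME))
```

Checking parenthesis balance before running a search is a cheap guard, because a single missing bracket changes how the OR and AND operators group.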

Although there was not an approved PRISMA protocol, the research team used Covidence (Melbourne, Australia), a software program that tracks the systematic review screening process. Identified articles were imported into Covidence, duplicates were removed, and the remaining articles were screened by the research team. Through random assignment, each article was independently reviewed by one faculty member and one graduate student. Each reviewer indicated in Covidence if the article should be included or excluded according to established criteria ( Table 1 ). When there was a conflict between reviewers, a third reviewer (non-student) resolved the conflict. The full review process is shown in Figure 1 . At the final review stage, two researchers independently extracted specific information ( Table 2 ) from each article. The type of impact on mental health was determined based on the results reported in each article.
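The dual-review rule described above (two independent votes, with a third non-student reviewer deciding only when they disagree) can be written as a small decision function. This is a hedged sketch of the logic as stated in the text, not a representation of how Covidence implements it; the function and parameter names are hypothetical.

```python
from typing import Optional


def screening_decision(
    faculty_vote: bool,
    student_vote: bool,
    third_vote: Optional[bool] = None,
) -> bool:
    """Return True to include an article, False to exclude it.

    The two independent votes stand when they agree. When they conflict,
    the third (non-student) reviewer's vote decides.
    """
    if faculty_vote == student_vote:
        return faculty_vote
    if third_vote is None:
        raise ValueError("conflict: a third (non-student) reviewer must decide")
    return third_vote
```

For example, `screening_decision(True, False, third_vote=False)` excludes the article, while `screening_decision(True, True)` includes it without consulting a third reviewer.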

Figure 1. Following a literature search, articles were reviewed for adherence to inclusion and exclusion criteria. A total of 54 articles were identified to meet all criteria.

Table 1. Inclusion and exclusion criteria used for evaluation of research articles that examined the relationship between pet ownership and mental health.

Table 2. At the extraction stage, the following information was used for evaluation of research articles that examined the relationship between pet ownership and mental health.

In addition to extracting the information outlined in Table 2 , an index ( Appendix A ) was created to assess article quality. The index was based on two previous systematic reviews of mental health in veterinary science [ 17 , 18 ]. Each dichotomous index question was assigned a 0 if the article did not meet the criterion and a 1 if it did. The higher the score an article received (0–9 points), the higher the quality of the article.
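The scoring rule amounts to counting the criteria an article meets. The sketch below assumes nine dichotomous questions, which is inferred from the 0–9 score range; the actual questions are listed in Appendix A and are not reproduced here, so the answers in the usage note are hypothetical.

```python
from typing import Sequence

# Inferred from the 0-9 score range; the questions themselves are in Appendix A.
N_QUESTIONS = 9


def quality_score(answers: Sequence[bool]) -> int:
    """Sum one point for each dichotomous index question the article meets."""
    if len(answers) != N_QUESTIONS:
        raise ValueError(f"expected {N_QUESTIONS} answers, got {len(answers)}")
    return sum(1 for met in answers if met)
```

A hypothetical article that meets six of the nine criteria would score `quality_score([True] * 6 + [False] * 3)`, which is 6.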

3. Results

The article review process and number of articles in each step are shown in Figure 1 . A total of 54 articles met the inclusion and exclusion criteria ( Table 1 ) and were systematically extracted ( Table 2 ). These articles were then divided into four categories based on the type of overall impact pets had on the mental health of owners: (1) positive impact ( n = 17); (2) mixed impact ( n = 19); (3) no impact ( n = 13); and (4) negative impact ( n = 5). Factors that influenced mental health include (a) age (middle-aged female caregivers had more psychological stress than young female and male caregivers), (b) obedience and aggressiveness of the pet, (c) marital status (single women who owned a dog were less lonely and socially isolated than women without pets), and (d) attachment to the pet (owners with a high level of bonding had lower anxiety and depression scores than those with a lower level of bonding) [ 19 , 20 , 21 , 22 , 23 , 24 ]. A few representative studies with mixed results include one examining the general population, which found that unmarried men who live with a pet had the most depressive symptoms and unmarried women who live with a pet had the fewest [ 19 ]. Another study examining the impact of companion animals on cancer patients found that mental health was associated with the status of cancer treatment, with those receiving intense treatment having poorer mental health [ 20 ]. In addition to overall impact, the study population, study type, population size, year of publication and article quality are reported ( Appendix B ).

Of the 54 articles, 19 (35%) were studies conducted in the general population, 15 (28%) were studies in older adult individuals, eight (15%) were in children and adolescents, six (11%) focused on people with some type of chronic physical illness/disease, three (6%) were studies in people with severe mental illness, and three (6%) studies examined unique populations. Of the 15 studies that had only older adult participants, none of them reported a positive impact. Seven of the articles reported mixed impact based on type of pet, gender, companionship, or another demographic. Six of the studies had no impact and two had a negative impact. Of the eight studies that involved children and adolescents, six of them indicated a clear positive impact, one indicated mixed impact, and one indicated no impact. Of the three studies that involved those with severe mental illness, two indicated clear positive impact and one indicated mixed impact.
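The population breakdown can be checked with simple arithmetic. The sketch below uses only the counts reported in the paragraph above; the dictionary keys and the function name `shares` are illustrative labels, not terms from the review.

```python
# Counts as reported in the text; labels are shortened for illustration.
POPULATION_COUNTS = {
    "general population": 19,
    "older adults": 15,
    "children and adolescents": 8,
    "chronic physical illness/disease": 6,
    "severe mental illness": 3,
    "unique populations": 3,
}

# Outcome counts reported for three of the populations.
OUTCOMES_BY_POPULATION = {
    "older adults": {"positive": 0, "mixed": 7, "none": 6, "negative": 2},
    "children and adolescents": {"positive": 6, "mixed": 1, "none": 1, "negative": 0},
    "severe mental illness": {"positive": 2, "mixed": 1, "none": 0, "negative": 0},
}


def shares(counts: dict) -> dict:
    """Return each count as a whole-number percentage of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


if __name__ == "__main__":
    print(sum(POPULATION_COUNTS.values()))
    print(shares(POPULATION_COUNTS))
    # Each subgroup's outcome counts should add up to its population count.
    for label, outcomes in OUTCOMES_BY_POPULATION.items():
        print(label, sum(outcomes.values()) == POPULATION_COUNTS[label])
```

Because each share is rounded independently, the rounded percentages need not sum to exactly 100.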

Research studies either compared mental health outcomes in pet owners versus non-pet owners ( n = 41) or examined outcomes with regard to owner attachment to the pet ( n = 13). Similar to the overall distribution, the outcomes within these two different types of studies were distributed across all four categories ( Table 3 and Table 4 ). In five of the 13 studies (38%), attachment to a cat or dog was associated with a positive impact on mental health. Four of the 13 studies (31%) indicated mixed results, meaning that human–animal attachment sometimes was associated with better mental health and sometimes it was not. One example of higher attachment leading to worse mental health was for those amid cancer treatment [ 20 ]. There was no clear trend towards attachment and better mental health.

Table 3. Outcomes of 41 studies that examined mental health outcomes in pet owners compared to non-pet owners.

Table 4. Outcomes of nine studies that examined mental health outcomes in relationship to the pet owner’s attachment bond with their pet.

The study types included 41 (76%) cross-sectional studies, 11 (20%) prospective cohort longitudinal studies, and two (4%) other study designs. Of the cross-sectional studies, 27 (66%) found that companion animals had no or negative impact on mental health and 14 (34%) found mixed or positive impact on mental health. Of the 11 articles that reported on a longitudinal study design, five (45%) demonstrated no or negative impact and six (55%) demonstrated mixed or positive impact. Among the 54 studies, sample size ranged from 30 to 68,362.

To measure mental health constructs, 75 different validated scales were used ( Table 5 ). Eight scales were used to measure human attachment to pets. The most common scales used across studies were the CES-D (13 studies) to measure depression and the ULS (10 studies) to measure loneliness. Two scales were used by four studies each (DASS and any variation of GHQ). Three scales were used by three studies each (GDS, CABS, and any variation of PHQ). The remaining scales were used only once or twice across the studies assessed.

Table 5. The scales used across studies to measure mental health.

Regarding the study quality scores ( Appendix A ), no articles received a quality score of 9, six (11%) received a score of 8, 11 (20%) received a score of 7, 20 (37%) received a score of 6, and 17 (31%) received a score of 5 or below. Of the articles with a quality scale score of 5 or lower, 18% ( n = 3) had no or negative impact and 82% ( n = 14) had mixed or positive impact on owner mental health. Of the articles with a quality scale score of 6 or higher, 43% ( n = 16) showed no or negative impact and 57% ( n = 21) showed mixed or positive impact.
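The comparison of quality score against outcome is a two-by-two table with row percentages. A minimal sketch using only the counts reported above follows; the row and column labels and the function name `row_percentages` are illustrative.

```python
# Counts as reported in the text: rows are quality bands, columns are outcomes.
QUALITY_BY_OUTCOME = {
    "score 5 or lower": {"no or negative": 3, "mixed or positive": 14},
    "score 6 or higher": {"no or negative": 16, "mixed or positive": 21},
}


def row_percentages(table: dict) -> dict:
    """Convert each row of counts into whole-number percentages of that row."""
    result = {}
    for row_label, row in table.items():
        row_total = sum(row.values())
        result[row_label] = {
            outcome: round(100 * n / row_total) for outcome, n in row.items()
        }
    return result


if __name__ == "__main__":
    for row_label, row in row_percentages(QUALITY_BY_OUTCOME).items():
        print(row_label, row)
```

Percentages are taken within each quality band (17 and 37 articles, respectively), which is why the two rows are read separately and not as shares of all 54 articles.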

4. Discussion

Understanding the nature of the relationship between mental health and pet ownership is important for both human and animal welfare and to better determine the impact of human–animal interactions. Over the years, the perspective that “pets are good for you” has become an assumption [ 25 ], and when negative implications are recognized, they often relate to zoonotic diseases rather than human–animal interactions [ 26 ]. This belief in the positive aspects of the human–animal bond is strengthened by marketing tools used by the pet industry [ 27 ]. While there certainly is evidence that supports the benefits of the human–animal bond to people’s mental health [ 28 , 29 ], there is also clear and consistent evidence that the relationship is complex and sometimes negative [ 30 , 31 ]. The question of whether pets should be prescribed by health professionals is an especially important one. Recent qualitative research supports that attending to a pet can help a person manage mental health crises [ 32 ]; however, doing so can also cause a person to rely on the pet instead of other evidence-based methods of seeking mental health support. The recommendation of obtaining a pet in the presence of mental illness ought to be coupled with other evidence-based strategies for mental health recovery, such as increasing social support and engaging in third-wave behaviorally based interventions such as Acceptance and Commitment Therapy or Dialectical Behavior Therapy.

The broad perspectives that pets are good for mental health may cause people to place false expectations on the role a dog or cat must play in their lives [ 33 ]. The anthropomorphism of pets (people placing human cognitive motivations on pets’ behavior and treating pets as people) can in fact have a negative impact on the animal’s welfare [ 34 ]. The untreated stress of people who turn to their pets instead of their human social supports and health professionals may in fact be causing pets to be more stressed [ 35 ]. Although initial data suggest relinquishment rates were not higher after COVID-19 lockdowns were lifted [ 36 ], some still have concerns that the recent increase in pet adoptions from shelters may result in pet relinquishment once the pandemic is more managed and people return to their daily work environments [ 37 ] (J. Schumacher personal communication, 5 May 2021). Developing clear guidelines about the benefits and liabilities of pet ownership and mental health is important to mitigate the public halo effect that suggests that simply acquiring a pet will improve your mental health.

Previous systematic reviews of the literature have found mixed results regarding the relationship between mental health and pet ownership [ 16 , 17 ]. Our search and review methodology was similar to Islam and Towel [ 16 ], which yielded 11 studies compared to the 54 studies compiled in this review. Although the Brooks et al. [ 17 ] review yielded 17 studies, they limited their search to studies only including people diagnosed with mental health conditions. While the current study did examine a larger body of research that covered broader populations and more recent publications than previous reviews, the findings were similar in that results varied across outcomes including positive, negative, mixed, and negligible. Unlike previous studies, this review also differentiated studies that compared pet owners to non-pet owners and studies that examined the level of attachment with a pet as a predictor of the mental health of the owner. Islam and Towel [ 16 ] argued that pet ownership needs to be defined consistently across all studies, including aspects of length of ownership, time spent with the animal, and perceived quality of the interaction. Within these two categories of study types, the outcomes still varied and showed no consistent evidence that pet ownership is a positive contributor to mental health. The lack of consensus from these studies was not surprising. While popular literature and media consistently highlight the positive, they rarely highlight the negative aspects of pet ownership. In fact, studies with negative or non-significant findings are often subject to the “file drawer” effect, in which authors ultimately decide not to publish their studies [ 15 ]. In this review, we did find and include studies that reported negative or mixed findings.

The authors made the decision a priori to divide the results into categories based on the type of impact each study had on mental health. Among the 17 studies that were determined to have positive results, most of the studies were with children and adolescents ( n = 6) and the general adult population ( n = 6). There were some challenges to identifying these studies as clearly positive. Because a variety of different variables and a variety of different methodologies were used based on the specific purpose of each study, they could not be directly or easily compared to one another. Many of the positive impact studies investigated additional variables that could be better predictors of positive mental health than dog/cat ownership. For example, several studies indicated that children or adolescents with a dog had less depression and/or less anxiety than peers without a dog. However, family dynamics such as single parent or two parent households, time parents spend at work, presence of siblings, and family dysfunction [ 2 , 8 ] may be more significant contributors to child mental health than dog ownership.

The 19 mixed impact studies were easier to categorize because of conflicting outcomes, particularly for studies with an older adult or general adult population. In each of these studies, the direction of the outcome was influenced by demographic variables (such as gender) or the type of pet (cat or dog). For example, one general population study determined that women with pets had lower levels of depression whereas men with pets had higher levels of depression [ 19 ]. Another example is that pet-owning individuals with severe mental illness had fewer psychiatric hospitalizations than non-pet-owning peers; however, they also had higher levels of substance use [ 38 ]. Another reason why a study would be categorized as mixed impact is if mental health was assessed using multiple instruments and yielded conflicting results. For instance, one study indicated that when compared to people without pets, those with pets had no difference in anxiety or stress scores yet had higher depression scores [ 22 ].

For the 13 studies that had no impact, most were with the older adult ( n = 6) and general adult ( n = 4) population. These studies concluded that when comparing pet ownership to non-pet ownership or when comparing attachment levels, the pet had no correlation with positive or negative mental health. Many of these studies controlled for demographic variables such as age, gender, and socioeconomic status in their statistical models. One challenge to categorizing the studies was that study participants subjectively believed their pets were helpful to their mental health despite what validated measures showed. The inclusion of these biased observations in an attempt to still put a positive spin on the study may reflect the conflict a researcher has in publishing negative results. An additional challenge is that studies that included non-mental health measures (such as physical health) showed that those with pets did better than those without. Expert reviews of pet ownership on cardiovascular health have demonstrated a significant challenge to reach a definitive conclusion of the impact of pet ownership on health based on the current evidence [ 39 ].

Five studies demonstrated a clear negative association between pet ownership and mental health. The sample populations were general ( n = 2), older adults ( n = 2), and single adults living alone ( n = 1). In these studies, pet ownership was associated with higher levels of depression, loneliness, and other psychological symptoms across all demographic variables and type of pet (dog or cat). Again, the challenge is that classifying these studies as negative impact suggests that pet ownership causes increased levels of mental illness, when in reality the studies are about correlation, not causation. There may be other factors that cause the samples in these studies to have worse mental health. As indicated by Mullersdorf et al. [ 40 ], the presence of a psychological condition could predispose individuals to become pet owners, making it difficult to truly know if pet ownership causes a negative impact on mental health. These studies, regardless of type of outcome, only indicate association of pet ownership and mental health.

Another challenge in comparing the 54 studies was the difference in methodology and quality of each study. Due to this, our methods did not evaluate the individual and overall power and effect sizes of study results. Quantitative methodologies are warranted in this field, particularly prospective, randomized, double-blind, placebo-controlled intervention trials that are longitudinal in design to provide evidence of the impact of animal ownership over time while eliminating as many extraneous and confounding variables as possible [ 41 ]. Ideally, this truly experimental model of pet ownership would include random assignment of companion animals in a closed system to eliminate as many sources of error variance as possible [ 42 ]. However, due to the nature of pet ownership being integrated as a part of daily life on a voluntary basis, this experimental model would be difficult to achieve. Perhaps the most compelling of all studies that comes closest to this design was a prospective interventional study in which 71 previous non-pet owners were given a cat or dog; results demonstrated mild benefits in mental health and behavior after 10 months of pet ownership compared to the 26 non-pet owners [ 43 ]. While noteworthy, there was lack of randomization, so the pet ownership group consisted of a relatively small number of subjects who were searching for a pet to adopt rather than receiving it on random chance. Regardless, this study still reports an improvement in mental health in this specific population. Future studies should strive to achieve this prospective, controlled, experimental methodology to more compellingly connect pet ownership with mental health.

A quality index attempted to rate the rigor of each study, but the index was subjective and based on questions that could be asked without statistical analysis (e.g., does this study include a comparison population?). The higher the score on the quality index, the more likely the study was scientifically rigorous. The lower the score, the more likely the study was to demonstrate a positive or mixed impact on the pet owner’s mental health. While both previous literature reviews critiqued the rigor of the studies reviewed and remarked upon the consistent methodological flaws, Islam and Towel did not assign objective scores to the 11 studies reviewed. Brooks et al. [ 17 ] did assign quality scores to each of the 17 studies reviewed but did not evaluate the impact of the quality of the study on its results. The quality scores in the current review varied across all four outcome categories and did not give any indication of quality impacting the overall outcome. Still, it is important that researchers strive for higher quality research that carries more weight in the question of whether pet ownership truly impacts mental health. Additionally, we recommend that studies be replicated in an attempt to corroborate previous findings, which contribute to the overall understanding of the phenomenon.

Lastly, this study also examined how mental health was evaluated across the studies. For the 54 studies included in this review, 75 different scales ( Table 5 ) were used with many research studies implementing more than one scale ( Appendix B ). While most of the scales used have been previously validated, the inconsistent use of scales makes comparison of results across studies challenging. While it is common to utilize an instrument that is a validated self-report of depression, it is likely that researchers often utilize other scales because they are investigating other aspects of mental health such as loneliness, stress, and anxiety. Many scales also rely on self-reporting of mental health indicators, which can be affected by inherent bias, especially when completing a survey regarding mental health and pet ownership. To allow for better comparison of future studies, researchers should attempt to use consistent measures of mental health across studies, such as the CES-D [ 44 ], which was the most commonly used scale in 13 of the 54 examined studies.

In addition to consistent use of mental health scales across studies, the development of a module for use in wide-scale population surveys with a focus on pet ownership would benefit future research examining the relationship between pet ownership and health. The Behavioral Risk Factor Surveillance System (BRFSS) [ 45 ] is an annual questionnaire administered by the US Centers for Disease Control and Prevention. There are 14 core sections that are administered to all participants and 31 optional modules [ 45 ]. None of these modules focuses on pet ownership, and the addition of such a module would allow for a more in-depth evaluation of the relationship between pet ownership and health, both mental and physical, across large populations. While pets can play a significant role in the owner’s health, it can be difficult to differentiate the effects of pet ownership from the many other factors that contribute to one’s mental and physical health. The addition of a pet ownership module to the BRFSS would allow researchers to examine the role of pet ownership in tandem with other factors that contribute to health. On a smaller scale (approximately 3000 participants), the General Social Survey (GSS) is a representative survey that monitors trends in opinions, behaviors, and demographics among Americans [ 46 ]. Though not a main focus, the GSS does include pet ownership and mental health variables. Including pet ownership allows researchers who study the relationship of ownership with humans to have a large, representative dataset to analyze correlations. For example, a recent study used the GSS 2018 to examine demographics of pet ownership [ 46 ]. In their conclusion, the authors of this study indicated that the strengths of using the GSS to study pet ownership characteristics are high quality data, multiple covariates, sound methodology, and easy access [ 47 ]. Including pet ownership questions in multi-wave, representative studies would further the work of human–animal relationship research.

This systematic review was limited due to only searching two databases and only evaluating research published in English. The majority of studies focused on pet-owners in Western cultures. The human–animal bond may differ across cultures and future studies should include pet-owners in non-Western cultures. However, a large number of articles were identified, and the total number of articles included in final extraction was greater than similar previous systematic reviews. More consistent methods across research that evaluates the relationship between pet ownership and mental health might allow for more extensive comparison of studies.

5. Conclusions

Previous research examining the impact of pet ownership on mental health has shown mixed results, and the results of this study were the same. While more studies demonstrated a positive impact ( n = 17) than a negative impact ( n = 5) on mental health, the overall results indicate a much more complicated picture. While 17 of the 54 studies had a clear association of pet ownership and positive mental health, the remaining 37 articles show a mixed association, no association, or a negative association. Comparing these studies is quite challenging due to the number of measures used to assess mental health, the differences in study quality, and the variety of variables that were controlled for. While research studies can be improved by addressing limitations as described, a more comprehensive evaluation of behavior and its association with health outcomes is warranted. We also cannot ignore that mental health is multifactorial. Pet ownership and the resulting human–animal interaction is a single factor; other factors that also contribute to mental health should be examined in large populations of pet-owners and non-pet-owners. The addition of a pet-ownership specific module to the BRFSS, as previously described, would allow for prospective research that can be replicated, and eventually retrospective research, that will also allow for inclusion of other factors that contribute to health.

Following a literature review and data extraction of research articles that examined the relationship between pet ownership and mental health, the following articles were found to meet inclusion and exclusion criteria as outlined in Table 1 .

Author Contributions

Conceptualization, K.J.S., E.B.S. and M.S.; methodology, E.B.S. and M.S.; validation, all; formal analysis, all; data; writing—original draft preparation, K.J.S., E.B.S., M.S., Z.N.; writing—review and editing, K.J.S., E.B.S., Z.N., K.S., K.C.B., C.R.S., C.S.B. and M.S.; supervision, E.B.S., Z.N., K.C.B., C.S.B. and M.S.; project administration, E.B.S. and M.S.; funding acquisition, E.B.S. and M.S. All authors have read and agreed to the published version of the manuscript.

This research was funded by Maddie’s Fund.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Pet ownership and human health: a brief review of evidence and issues

  • June McNicholas, psychologist (june{at}cullach.fsnet.co.uk) 1,
  • Andrew Gilbey, lecturer 2,
  • Ann Rennie, general practitioner 3,
  • Sam Ahmedzai, professor of palliative medicine 4,
  • Jo-Ann Dono, director 3,
  • Elizabeth Ormerod, veterinary surgeon 3
  • 1 Croit Cullach, Durnamuck, Dundonnell, Ross-shire
  • 2 Massey University, New Zealand
  • 3 Society for Companion Animal Studies, Blue Cross, Burford, Oxon
  • 4 Royal Hallamshire Hospital, University of Sheffield
  • Correspondence to: J McNicholas
  • Accepted 4 November 2005

Research into the association between pet ownership and human health has produced intriguing, although frequently contradictory, results, often raising uncertainty as to whether pet ownership is advisable on health grounds.

Introduction

The question of whether someone should own a pet is never as simple as whether that pet has a measurably beneficial or detrimental effect on the owner's physical health. The emotional bond between owner and pet can be as intense as that in many human relationships and may confer similar psychological benefits. Death of a pet can cause grief similar to that in human bereavement, whereas threat of loss of a pet may be met with blunt refusal and non-compliance with advice on health.

We examine the current evidence for a link between pet ownership and human health and discuss the importance of understanding the role of pets in people's lives.

Is pet ownership associated with human health?

Research dating from the 1980s popularised the view that pet ownership could have positive benefits on human health. Benefits ranged from higher survival rates from myocardial infarction 1 ; a significantly lower use of general practitioner services (prompting some researchers to speculate on considerable potential savings to health expenditure) 2 ; a reduced risk of asthma and allergic rhinitis in children exposed to pet allergens during the first year of life 3 4 ; a reduced risk of cardiovascular disease 5 ; and better physical and psychological wellbeing in community dwelling older people. 6 No studies have found significant social or economic differences between people who do or do not have pets that would adequately explain differences in health outcome, leading to the belief that pet ownership itself is the primary cause of the reported benefits.

Although the research did much to raise awareness of the importance that people attach to their pets, recent studies have failed to replicate the benefits. A review of the association between pets and allergic sensitisation found inconsistent results for cat ownership between studies of similar design, whereas dog ownership seemed to have no effect or even protected against specific sensitisation to dog allergens and allergic sensitisation in general. 7 Other studies on the subject suggest that exposure to pets may be beneficial provided that exposure is sufficient, as lower levels may enhance sensitisation whereas higher levels may protect against sensitisation. 8 Yet others suggest that the effects may heavily depend on age at exposure and type of pet. 9

Similarly, recent research has failed to support earlier findings that pet ownership is associated with a reduced risk of cardiovascular disease, 10 a reduced use of general practitioner services, 11 or any psychological or physical benefits on health for community dwelling older people. 12 Research has, however, pointed to significantly less absenteeism from school through sickness among children who live with pets. w1

Do we need a broader definition of health?

The main issue may not be whether pet ownership per se confers measurable physical benefits but the role that pets have in individual people's lives—namely, the contributions of the pet to quality of life or the costs to wellbeing through a pet's death. This issue embraces a broader definition of health that encompasses the dimensions of wellbeing (physical and mental) and a sense of social integration.

Three potential mechanisms have been proposed to explain the association between pet ownership and benefits to human health ( fig 1 ). 13 The first is that there is no real association between the two, rather that cofactors such as personality traits, age, and economic or health status impact on the decision to own a pet and thus produce an apparent link between pets and health. So far, however, evidence is lacking that any of these cofactors account for both health promoting attributes and propensity to own pets, suggesting that health benefits, when reported, may be attributable to some aspect of pet ownership.

Fig 1 Three proposed mechanisms for association between pet ownership and health benefits for humans


A Munduruku boy carries his pet, a domesticated wild boar, for a daily cleansing swim in the Rio Canuma

Credit: GERD LUDWIG/PANOS

The second proposal is that pets may enhance social interactions with other people, thus providing an indirect effect on wellbeing. Social contact has been long recognised as beneficial in that it alleviates feelings of loneliness and social isolation. Pets undoubtedly act as “social catalysts,” leading to greater social contact between people. 14 These factors may be particularly important for those at risk of social isolation, such as elderly people or people with physical disabilities, who lack many of the opportunities for social interactions of their more able bodied peers. 15

The third proposal focuses on ways in which pet ownership may exert a direct effect on human health and wellbeing through the nature of the relationship. Close human relationships have a powerful influence on wellbeing by providing emotional support. They may reduce perceptions of stressful events thus protecting against anxiety related illness, may give confidence that successful coping strategies can be found to deal with stress, and may enhance recovery from serious illness such as stroke, myocardial infarction, and cancer. These aspects of a relationship are collectively referred to as social support. Social relationships, or the lack of, seem to constitute a major risk factor for health, rivalling the effects of well established risk factors such as cigarette smoking, blood pressure, blood lipid concentrations, obesity, and lack of physical activity. 16

The value of companionship

Companionship—a commonly stated reason for pet ownership—is regarded as theoretically distinct from social support in that it does not offer extrinsic support but provides intrinsic satisfactions, such as shared pleasure in recreation, relaxation, and uncensored spontaneity, all of which add to quality of life. Thus companionship may be important in fostering positive mental health on a day to day basis, whereas social support may be of particular value in buffering threats to mental health and wellbeing from real or perceived stressors. Figure 2 illustrates the inter-relationship between functions served by pet ownership and human health outcomes. 15

Fig 2 Correlations between questionnaire items measuring social facilitation, affectionate relationship, social support, and recipients' self perceived health in study on non-task related benefits of a trained assistance dog to people with physical disabilities. Correlations, derived from carrying out Pearson's correlation, are significant at P<0.05
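For readers unfamiliar with the statistic named in this caption, the following is a minimal sketch of how a Pearson correlation coefficient, and the t statistic commonly used to test it, can be computed. It is not the authors' analysis: the function names and the questionnaire scores are hypothetical, invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom.

    Undefined when |r| == 1 (perfect correlation).
    """
    return r * math.sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Hypothetical questionnaire scores, not data from the study.
    social_support = [3, 4, 2, 5, 4, 3, 5, 1]
    perceived_health = [2, 4, 3, 5, 3, 3, 4, 2]
    r = pearson_r(social_support, perceived_health)
    print(round(r, 3), round(t_statistic(r, len(social_support)), 3))
```

The t statistic would then be compared with the critical value of the t distribution for n − 2 degrees of freedom to decide whether the correlation is significant at the chosen level, such as the P<0.05 threshold cited in the caption.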

Although research has primarily focused on human relationships as providing support and companionship, it is a short step to extrapolating these to pets. Studies have shown that the support from pets may mirror some of the elements of human relationships known to contribute to health. 17 Although support from pets should not be regarded as a replacement for help from people, the fact that pets are not human confers certain advantages; the relationships are less subject to provider burnout or to fluctuations, and they do not impose a strain or cause concern about continuing stability. Relationships with pets seem to be of value in the early stages of bereavement w2 and after treatment for breast cancer. w3

Most pets are valued family members

Credit: BARRY LEWIS/NETWORK PHOTOGRAPHERS

Why pet ownership should be taken seriously

The question of whether a person should acquire a pet or continue to own a pet requires careful consideration of the balance between benefits and potential problems. About half of households in the United Kingdom own pets. w4 Most are valued as family members. Conflict between health interests and pet ownership can cause non-compliance with advice on health. Some sources estimate that up to 70% of pet owners would disregard advice to get rid of a pet owing to allergies, w5 whereas reports abound of older people avoiding medical care through fear of being admitted to hospital or residential care as this often means giving up a pet. w6

Summary points

Over 90% of pet owners regard their pet as a valued family member

Reluctance to part with a pet may lead to non-compliance with health advice

Pets may be of particular value to older people and patients recovering from major illness

The death of a pet may cause great distress to owners, especially when the pet has associations with a deceased spouse or former lifestyle

Many people would welcome advice and support to enable them to reconcile or manage pet ownership and health problems whenever possible

The loss of a pet may be particularly distressing for owners if it was linked with a deceased spouse or if it offered companionship or social contact with people. 18 For these reasons many people may appreciate help and advice on how to manage a pet in the event of a health problem in the family.

Animal welfare organisations cite allergies and the fear of zoonoses as common reasons for people giving up their pets. Yet in some cases this may not be necessary. Research from the University of West Virginia shows that simple, day to day hygiene and pet care can reduce allergic reactions by up to 95%. 3 A recent review of pets in nursing homes provides a comprehensive list of potential health problems and steps that can be taken to avoid these. 19

People do not own pets specifically to enhance their health, rather they value the relationship and the contribution their pet makes to their quality of life. 20 Greater understanding among health professionals is needed to assure people that they do not need to choose between pet ownership and compliance with health advice.

Contributors and sources: JMcN has special research interests in the influence of pet ownership on health and lifestyle. She was formerly based at the University of Warwick. Her current work is with Dogs for the Disabled, the Society for the Protection of Animals Abroad, and Cats Protection, UK. She is a member of the Society of Companion Animal Studies. AG gained his doctorate from the University of Warwick, researching the role of pets in the alleviation of loneliness. AR and SA are members of the Society of Companion Animal Studies. J-AD has a degree in psychology and is director of the Society of Companion Animal Studies. EO is chairwoman of the Society of Companion Animal Studies. References refer to primary sources located through MIMAS web of knowledge service/web of science records. JMcN wrote the article, with contributions from the other authors, and is guarantor.

Competing interests: JMcN received a research award, 2000-2, from Masterfoods UK to investigate the role of pets in children's health. AG was employed as a research assistant at University of Warwick, 1999-2003, funded by Waltham Centre for Animal Nutrition.

  • Friedmann E ,
  • Katcher AH ,
  • Johnson CC ,
  • Peterson EL
  • Nafsted P ,
  • Gaader PI ,
  • Jaakola JJK
  • Anderson WP ,
  • Jennings GL
  • Waltner-Toews D ,
  • Bonnett B ,
  • Woodward C ,
  • Abernathy T
  • Simpson A ,
  • Behrens T ,
  • Weiland SK ,
  • Siebert E ,
  • Parslow RA ,
  • Christensen H ,
  • Rodgers B ,
  • McNicholas J ,
  • Landis KR ,
  • Collis GM ,
  • McNicholas J
  • Podbercek AL ,

Pet Ownership and Quality of Life: A Systematic Review of the Literature

Affiliations.

  • 1 College of Social Work, University of Tennessee, Knoxville, TN 37996, USA.
  • 2 Veterinary Social Work, Colleges of Veterinary Medicine and Social Work, University of Tennessee, Knoxville, TN 37996, USA.
  • 3 College of Veterinary Medicine, University of Tennessee, Knoxville, TN 37996, USA.
  • 4 Department of Public Health, University of Tennessee, Knoxville, TN 37996, USA.
  • PMID: 34941859
  • PMCID: PMC8705563
  • DOI: 10.3390/vetsci8120332

Pet ownership is the most common form of human-animal interaction, and anecdotally, pet ownership can lead to improved physical and mental health for owners. However, scant research is available validating these claims. This study aimed to review the recent peer reviewed literature to better describe the body of knowledge surrounding the relationship between pet ownership and mental health. A literature search was conducted in May 2020 using two databases to identify articles that met inclusion/exclusion criteria. After title review, abstract review, and then full article review, 54 articles were included in the final analysis. Of the 54 studies, 18 were conducted in the general population, 15 were conducted in an older adult population, eight were conducted in children and adolescents, nine focused on people with chronic disease, and four examined a specific unique population. Forty-one of the studies were cross-sectional, 11 were prospective longitudinal cohorts, and two were other study designs. For each of the articles, the impact of pet ownership on the mental health of owners was divided into four categories: positive impact ( n = 17), mixed impact ( n = 19), no impact ( n = 13), and negative impact ( n = 5). Among the reviewed articles, there was much variation in population studied and study design, and these differences make direct comparison challenging. However, when focusing on the impact of pet ownership on mental health, the results were variable and not wholly supportive of the benefit of pets on mental health. Future research should use more consistent methods across broader populations and the development of a pet-ownership survey module for use in broad, population surveys would afford a better description of the true relationship of pet ownership and mental health.

Keywords: human-animal bond; human-animal interactions; pet ownership mental health.

Grants and funding.

  • 12345/Maddie's Fund

Perspective article

The New Era of Canine Science: Reshaping Our Relationships With Dogs

  • 1 School of Anthropology, University of Arizona, Tucson, AZ, United States
  • 2 College of Veterinary Medicine, University of Arizona, Tucson, AZ, United States
  • 3 Cognitive Science, University of Arizona, Tucson, AZ, United States
  • 4 California State Polytechnic University, Pomona, CA, United States
  • 5 Department of Psychology, Western Carolina University, Cullowhee, NC, United States
  • 6 Center for Urban Resilience, Loyola Marymount University, Los Angeles, CA, United States
  • 7 Animal Welfare Science Centre, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Melbourne, VIC, Australia

Canine science is rapidly maturing into an interdisciplinary and highly impactful field with great potential for both basic and translational research. The articles in this Frontiers Research Topic, Our Canine Connection: The History, Benefits and Future of Human-Dog Interactions , arise from two meetings sponsored by the Wallis Annenberg PetSpace Leadership Institute, which convened experts from diverse areas of canine science to assess the state of the field and challenges and opportunities for its future. In this final Perspective paper, we identify a set of overarching themes that will be critical for a productive and sustainable future in canine science. We explore the roles of dog welfare, science communication, and research funding, with an emphasis on developing approaches that benefit people and dogs, alike.

Dogs have played important roles in the lives of humans for millennia ( 1 , 2 ). However, throughout much of scientific history they have been dismissed as an artificial species with little to contribute to our understanding of the natural world, or our place within it. During the last two decades, this sentiment has changed dramatically; canine science is rapidly maturing into an established, impactful, and highly interdisciplinary field ( Figure 1 ). Canine scientists, who previously occupied relatively marginalized roles in academic research, are increasingly being hired at major research universities, and centers devoted to the study of dogs and their interactions with humans are proliferating around the world. The factors underlying dogs' newfound popularity in science are diverse and include (1) increased interest in understanding dog origins, behavior, and cognition; (2) diversification in our approaches to research with non-human animals; (3) recognition of dogs' value as a unique biological model with relevance for humans; and (4) growth in research on the nature and consequences of dog-human interactions, in their myriad forms, from working dog performance to displaced canines living in shelters.

Figure 1 . Canine science is an interdisciplinary field with connections to other traditional and emerging areas of research. The specific fields shown overlap in ways not depicted here and are not an exhaustive list of disciplines contributing to canine science. Rather, they are included as examples of the diversity of scholarship in canine science.

This Perspective represents the final article in a collection of manuscripts arising from two workshops sponsored by the Wallis Annenberg PetSpace Leadership Institute. Leadership Fellows from around the world gathered in 2017 and 2020 to discuss the state of research and future directions in canine science. The individual articles in this collection provide a detailed treatment of key topics discussed at these events. In this final article, we identify a set of overarching challenges that emerge from this work and identify priorities and opportunities for the future of canine science.

The rise of canine science has benefited substantially from public interest and participation in the research process. Unlike many research studies, which unfold quietly in the ivory towers of research universities, the new era of canine science is intentionally public facing. The dogs being studied are not laboratory animals, bred and housed for research purposes, but rather are companions living in private homes, or assisting humans in capacities ranging from assistance for people with disabilities, to medical and explosives detection. Campus-based research laboratories have opened their doors to members of the public who bring their dogs to participate in problem-solving tasks, social interactions, and sometimes even non-invasive neuroimaging studies. Increasingly, dog owners themselves play a significant role in the scientific process, serving as community scientists who contribute to the systematic gathering of data from the convenience of their homes.

This new research model, in conjunction with emerging technologies, makes canine science a highly visible field that engages public stakeholders in unprecedented ways. From a scientific perspective, society has become the new laboratory and, in doing so, has facilitated research with dogs of a scope and scale that was heretofore unthinkable. As tens of thousands of dogs contribute to research on topics ranging from cognition and genetics ( 3 , 4 ) to aging and human loneliness ( 5 ), canine science is entering the realm of “big data” and eclipsing many traditional research approaches. Importantly, these advances are occurring simultaneously across diverse fields of science, creating powerful new opportunities for consilience that will make canine science even more valuable in the years ahead. However, maturing this model toward a sustainable future that serves its diverse stakeholders—who include scientists, research funders, members of the public, and dogs themselves—will require careful navigation of key challenges related to dog welfare, science communication, and financial support ( Figure 2 ).

Figure 2 . Visual summary of the key issues identified in this Perspective . A sustainable future in canine science will require (1) research approaches that prioritize and monitor the welfare of dogs, (2) improved science communication to avoid incorrect reporting of study results, and to translate research findings to meaningful change in practices relating to dogs, and (3) availability of research funding that is not tied exclusively to studying the possible benefits of dogs for humans.

Dog Welfare

Globally, animal welfare has been linked to the public acceptability that underpins sustainable animal interactions and partnerships ( 6 ). Where human-animal interactions have failed to meet community expectations, practices, and in some cases entire industries, have been disrupted or have ceased. Recent examples include whaling for profit and greyhound racing ( 6 , 7 ). Science is not exempt from this necessity to meet public expectations, and the new era of canine science must place canine welfare at the forefront. Considering dogs as individuals and co-workers, rather than tools for work or subjects, reflects a community moral and ethical paradigm shift that is currently underway. Reimagining our relationship with domestic dogs in research will also help inform our treatment of other animals. In this way, studies of dogs and our interactions with them can serve as a pioneering new model for many areas of science.

As scientists advocate for the revision of community and industry practices with dogs in light of new evidence, we must apply the same criteria to the conduct of our research. This includes adjusting canine research and training methods to acknowledge the sentience of dogs, and the importance of the affective experience for dogs in both research and community settings ( 8 – 11 ). The discipline of animal welfare science has progressed rapidly over the last two decades, and we have many animal-based, welfare-outcome measures available to us ( 6 , 11 ). Ensuring the well-being of the dogs we study will be as critical to ongoing social license to operate (i.e., community approval) for canine science as it is for working dog interests ( 12 ). Being transparent about the issues of animal consent and vulnerability, as well as offering animals agency with regard to their participation in science are valuable suggestions offered within this special issue. We encourage our colleagues to not just consider this paradigm shift, but to effect it through prioritizing and representing the dog's perspective and welfare in their research.

Although researchers increasingly include a single or limited set of canine stress measures in studies exploring dogs' potential benefits to humans, this approach alone does not fill the need for studies that prioritize an understanding of canine welfare as their central focus. Canine welfare should be considered not just as an emergent population-level measure ( 13 ) but rather with respect to the way in which it is experienced: from the perspectives of individual dogs. Commonly used statistical methods from human research, such as group-based trajectory analysis ( 14 ), may offer proven techniques that allow meaningful reporting on populations while reflecting the nuance of shared, sub-group patterns. Such approaches will better reflect individual differences, for example, variations in canine personality, social support and relationship styles, as well as other significant factors. One impediment to robust measurement of animal welfare in canine science has been limited funding.

We believe that all granting bodies who fund exploration of the possible benefits to people from dogs should also fund and require the canine perspective to be robustly monitored and reported. Impediments to this work arise not from lack of researcher interest or access to dogs, but rather from challenges to securing funding that is independent from a focus on human health outcomes, or other tangible outcomes of work that dogs perform. To be able to optimize canine welfare, there is an urgent need for increased funding specifically to study the welfare of dogs, in all their diversity. The new era of canine science will identify what dogs need to thrive, propelling us toward a mutually sustainable partnership between people and dogs.

Communication

One area that has not received much attention in relation to canine science is the way in which research findings are communicated outside the empirical literature. Fueled by media reports, interest in canine science and the impact of dogs on human health and well-being has grown substantially in the last 10 years. A survey by the Human-Animal Bond Research Institute found that 71% of pet owners were aware of studies demonstrating that pets improve mental and physical health. Some of these claims are justified. For example, many studies have found that interacting with therapy dogs reduces stress and anxiety and increases positive emotional states in a variety of settings including hospitals, schools and nursing homes ( 15 , 16 ). In other cases, high public expectations about the healing power of pets are not matched by the results of empirical studies. For instance, while the Human-Animal Bond Research Institute survey found that 86% of pet owners believe pets relieve depression, the majority of studies on pet-ownership and depression do not support these conclusions ( 17 ).

Because so many people have extensive personal experiences with dogs, investigators face unique challenges in sharing research results with the public. In their hearts, dog owners believe that their canine companions make them feel less depressed, or that dogs feel guilty when they've eliminated indoors or explored the kitchen garbage—even though research might suggest otherwise. In addition, when it comes to animal companions, people much prefer to read a news article in which visits with a therapy dog improved the well-being of a child undergoing chemotherapy than an article about a randomized clinical trial which found no differences between the well-being of children in a therapy dog group and a control group ( 18 ). Nor is there likely to be much press coverage devoted to methodological issues such as small effect sizes and inappropriate attributions of causality to the results of correlational studies.

Canine scientists and scholars of human-animal interactions (anthrozoologists) are fortunate that the public is intrinsically interested in our research. We feel that it is critical for investigators to make efforts to communicate the findings of important studies to the public. We caution, however, that researchers should not overstate the implications of their findings in press releases and conversations with journalists, despite frequent pressure to do so. Such overstatement could mislead the public and misrepresent the actual findings, a problem that is particularly acute in canine science, where well-intentioned pet owners may eagerly adopt practices based on media coverage of scientific studies. The now-established discipline of science communication offers guidance for how best to engage with community and research stakeholders in meaningful ways.

Traditionally, science communication has relied on the knowledge deficit model of communication ( 19 ). Directionally one-way, the deficit model operates on the assumption that ignorance is the reason for a lack of community support and application of scientific evidence. Examples where practices have not been updated in response to research findings include dog training methodology ( 9 ) and breeding selection for extreme body types, such as brachycephaly in pugs and bulldogs, even though the health and welfare impacts are scientifically well understood ( 20 ). Scientists who share their research results thinking that knowledge disseminated—to “educate” the public—is enough to result in different dog care decisions, industry practices or legislation, will generally find this to be ineffective ( 21 ). This is because the deficit model overlooks the underlying beliefs, existing attitudes and motivations for current practices. We now recognize that the deficit model is not the most effective way to communicate, engage stakeholders and effect change ( 22 , 23 ).

Further exploration of the effect of targeted and intentional science communication, informed by human behavior change research, will improve the translation of canine science to meaningful outcomes for dogs and people alike ( 12 ). This is important, as many studies in canine science have applied aims designed to inform global policies and the creation of best practices ( 24 , 25 ). Applied research from the livestock and farming sector suggests that coordinating human behavior change strategies from social and psychological sciences can influence beliefs and attitudes to motivate changes in the ways people behave toward animals, resulting in improved animal welfare ( 26 – 28 ). In the era of attention economics, where scientists are competing for public attention alongside other diverse media, it is vital that the communication of our work is honest, relevant, and effective, to ensure that our field stays on the radar of key stakeholders, funding bodies and change agents.

Funding

A third key challenge in the future of canine science concerns research funding and a careful balancing of the priorities of scientists and funding agencies. In the last decade, canine science has received considerable support from the pet care sector, as well as human health and defense agencies [e.g., ( 29 )]. Fine and Andersen ( 30 ) stress that although funding is still a challenge in human-animal interaction research, there are now more options to be found. In 2008, the Waltham Petcare Science Institute initiated a public-private partnership with the Eunice Kennedy Shriver National Institute of Child Health and Human Development. Over the past decade, this partnership has provided funding for research aimed at measuring the impact of specific animal-assisted interventions. Since 2014, the Human Animal Bond Research Institute has funded a total of 35 academic research grants investigating the health outcomes of pet ownership and/or human-animal interaction, both for the people and non-human animals involved. Despite clear benefits for enabling research, there remains a limited group of agencies responsible for funding this work. This has the potential to constrain the range of topics being studied. In addition, scientists may feel compelled to support the agendas of industry groups, such as those in the pet sector, who often encourage research that will demonstrate the benefits of pets and human-animal interactions.

These constraints were recognized by Wallis Annenberg PetSpace in 2017 when they envisioned their Leadership Institute Program with a mission to promote interdisciplinary scholarship and convene meetings to accelerate research and policy development ( https://www.annenbergpetspace.org/about/leadership ). This model for engagement inspired the organization to offer two invited retreats (2017, 2020) for a total of 33 experts in the field that provided opportunities for open ended and frank discussion about the nature of human-animal interaction research, and the maturing field of canine science. By providing the space and financial support, plus the opportunity to work together and publish, Annenberg PetSpace provided a way to both illuminate current limitations, and to identify priorities for the future, free of constraints from outside interest groups. These intellectual salons have no specific agenda other than to consider the future of the field and what kinds of questions need to be asked based on what we already know. The results of these two retreats include 14 published refereed papers, plus a suite of collaborations that might otherwise not have happened. We hope that these fellowships and retreats continue and inspire others to support similar initiatives so that scholars across multiple disciplines have the opportunity to experience the transformational exchanges that occur during these programs. The new era of canine science will require diverse funding that is not limited to how dogs can benefit humans, from health, safety and economic perspectives. This change will enable researchers the freedom to further our understanding of dogs and their needs for optimized welfare. In turn, this will allow us to identify how dogs and people can thrive together.

Looking Ahead

We hope that the publications emerging from these retreats will reach a diverse community of stakeholders, including students, early career researchers, animal welfare and advocacy groups, legislators and policy makers, philanthropies, and traditional agency funders. The goal of these papers is to spark imagination for projects not yet engaged and to help set the agenda for future research that can enhance our understanding of human-dog interactions and identify paths to ensure a future of symbiotic relationships between these species.

The vision of this collective group of scholars includes the goal of establishing studies with dogs as a sustainable and broad-reaching research focus. Although dogs provide many advantages as a “model species” —including their phenotypic diversity, and shared environments and evolutionary history with humans—a research model centered around dogs has many additional benefits. Dogs provide a rich, interactive and sentient model with deep implications for the way scientists approach animal research, and animal welfare. Dogs also increase the accessibility of research, both literally, due to their ubiquity and opportunities for large-scale public participation in research ( 31 , 32 ), and figuratively, through a body of work with appeal to the broader public.

The field of canine science has much in common with another emerging science, urban ecology. Humans are historically at the core of the subject material, but non-human elements are often the focus of the study. As such, the work is always culturally embedded, relevant to a variety of stakeholders, and ultimately expected to improve quality of life. Urban ecologists coined the term Use-Inspired Research ( 33 ) by modifying the existing idea of Pasteur's Quadrant, which organizes research questions along the axes of fundamental understanding and considerations of use ( 34 ). Both canine research and urban ecology seek fundamental understanding, but also expect to directly apply the knowledge gained to improve outcomes for their subjects and stakeholders.

By including the public in canine science we not only increase the quantity of the data that we can gather, we serve as ambassadors for a new model of responsible animal research. The result increases the value of human-animal interaction research and creates opportunities for the next generation of interdisciplinary scientists. The goal of this collection has been both to highlight specific recent advances in canine science as well as to identify emerging and overarching issues that will shape the future of this field. The multidisciplinary nature of our work with dogs allows scientists to contribute to a robust research agenda, enhancing our understanding of canines and their impact on society. Ultimately, the nexus of our discoveries should have profound effects on reshaping and enriching our relationships with dogs.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

Author Contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

We thank Wallis Annenberg PetSpace for supporting the open-access publishing fees associated with this manuscript.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1. Serpell JA. Commensalism or cross-species adoption? A critical review of theories of wolf domestication. Front Vet Sci. (2021) 8:662370. doi: 10.3389/fvets.2021.662370

2. Wynne C. The indispensable dog. Front Vet Sci. (2021).

3. Chen F, Zimmermann M, Hekman JP, Lord KA, Logan B, Russenberger J, et al. Advancing genetic selection and behavioral genomics of working dogs through collaborative science. Front Vet Sci. (2021).

4. Gnanadesikan GE, Hare B, Snyder-Mackler N, MacLean EL. Estimating the heritability of cognitive traits across dog breeds reveals highly heritable inhibitory control and communication factors. Anim Cogn. (2020) 23:953–64. doi: 10.1007/s10071-020-01400-4

5. McCune S, Promislow D. Healthy, active aging for people and dogs. Front Vet Sci. (2021). doi: 10.3389/fvets.2021.655191

6. Broom DM. International Animal Welfare Perspectives, Including Whaling and Inhumane Seal Killing as a WTO Public Morality Issue. In: Animal Law and Welfare-International Perspectives. New York, NY: Springer (2016). p. 45–61.

7. Markwell K, Firth T, Hing N. Blood on the race track: an analysis of ethical concerns regarding animal-based gambling. Ann Leisure Res. (2017) 20:594–609. doi: 10.1080/11745398.2016.1251326

8. Cobb M, Lill A, Bennett P. Not all dogs are equal: Perception of canine welfare varies with context. Anim Welfare. (2020) 29:27–35. doi: 10.7120/09627286.29.1.027

9. Hall NJ, Johnston AM, Bray EE, Otto CM, MacLean EL, Udell MA. Working dog training for the 21st century. Front Vet Sci. (2021).

10. Horowitz A. Considering the “dog” in dog-human interaction. Front Vet Sci. (2021). doi: 10.3389/fvets.2021.642821

11. Mellor DJ, Beausoleil NJ, Littlewood KE, McLean AN, McGreevy PD, Jones B, et al. The 2020 five domains model: including human–animal interactions in assessments of animal welfare. Animals. (2020) 10:1870. doi: 10.3390/ani10101870

12. Cobb ML, Otto CM, Fine AH. The animal welfare science of working dogs: current perspectives on recent advances and future directions. Front Vet Sci. (2021).

13. Richter SH, Hintze S. From the individual to the population–and back again? Emphasising the role of the individual in animal welfare science. Appl Anim Behav Sci. (2019) 212:1–8. doi: 10.1016/j.applanim.2018.12.012

14. Nagin DS, Odgers CL. Group-based trajectory modeling in clinical research. Ann Rev Clin Psychol. (2010) 6:109–38. doi: 10.1146/annurev.clinpsy.121208.131413

15. Barker SB, Gee NR. Canine-assisted interventions in hospitals: Best practices for maximizing human and canine safety. Front Vet Sci. (2021) 8:615730. doi: 10.3389/fvets.2021.615730

16. Gee NR, Rodriguez KE, Fine AH, Trammell JP. Dogs supporting human health and wellbeing: a biopsychosocial approach. Front Vet Sci. (2021) 8:630465. doi: 10.3389/fvets.2021.630465

17. Rodriguez KE, Herzog H, Gee NR. Variability in human-animal interaction research. Front Vet Sci. (2021) 7:619600. doi: 10.3389/fvets.2020.619600

18. McCullough A, Ruehrdanz A, Jenkins MA, Gilmer MJ, Olson J, Pawar A, et al. Measuring the effects of an animal-assisted intervention for pediatric oncology patients and their parents: a multisite randomized controlled trial. J Pediatr Oncol Nurs. (2018) 35:159–77. doi: 10.1177/1043454217748586

19. Simis MJ, Madden H, Cacciatore MA, Yeo SK. The lure of rationality: why does the deficit model persist in science communication? Public Understand Sci. (2016) 25:400–14. doi: 10.1177/0963662516629749

20. Packer RM, O'Neill DG, Fletcher F, Farnworth MJ. Great expectations, inconvenient truths, and the paradoxes of the dog-owner relationship for owners of brachycephalic dogs. PLoS ONE. (2019) 14:e0219918. doi: 10.1371/journal.pone.0219918

21. Seethaler S, Evans JH, Gere C, Rajagopalan RM. Science, values, and science communication: competencies for pushing beyond the deficit model. Sci Commun. (2019) 41:378–88. doi: 10.1177/1075547019847484

22. Philpotts I, Dillon J, Rooney N. Improving the welfare of companion dogs—is owner education the solution? Animals. (2019) 9:662. doi: 10.3390/ani9090662

23. Westgarth C, Christley RM, Marvin G, Perkins E. The responsible dog owner: the construction of responsibility. Anthrozoös. (2019) 32:631–46. doi: 10.1080/08927936.2019.1645506

24. Bray EE, Otto CM, Udell MA, Hall NJ, Johnston AM, MacLean EL. Enhancing the selection and performance of working dogs. Front Vet Sci. (2021) 8:644431. doi: 10.3389/fvets.2021.644431

25. Feldman S, Fine AH, Melfi L. Research, practice, science public policy: How they fit together in the context of AAI. In: Fine AH, editor. Handbook on Animal Assisted Therapy. 5th ed. San Diego, CA: Elsevier/Academic Press (2019). p. 417–24.

26. Coleman G, Hemsworth PH. Training to improve stockperson beliefs and behaviour towards livestock enhances welfare and productivity. Rev Sci Tech. (2014) 33:131–7. doi: 10.20506/rst.33.1.2257

27. Fernandes J, Blache D, Maloney SK, Martin GB, Venus B, Walker FR, et al. Addressing animal welfare through collaborative stakeholder networks. Agriculture. (2019) 9:132. doi: 10.3390/agriculture9060132

28. Vigors B. Reducing the consumer attitude–behaviour gap in animal welfare: the potential role of ‘nudges’. Animals. (2018) 8:232. doi: 10.3390/ani8120232

29. McCune S, McCardle P, Griffin JA, Esposito L, Hurley K, Bures R, et al. Human-animal interaction (HAI) research: a decade of progress. Front Vet Sci. (2020) 7:44. doi: 10.3389/fvets.2020.00044

30. Fine AH, Andersen SJ. A commentary on the contemporary issues confronting animal assisted and equine assisted interactions. J Equine Vet Sci. (2021) 103436. doi: 10.1016/j.jevs.2021.103436

31. Kaeberlein M, Creevy KE, Promislow DE. The dog aging project: Translational geroscience in companion animals. Mamm Genome. (2016) 27:279–88. doi: 10.1007/s00335-016-9638-7

32. Stewart L, MacLean EL, Ivy D, Woods V, Cohen E, Rodriguez K, et al. Citizen science as a new tool in dog cognition research. PLoS ONE. (2015) 10:e0135176. doi: 10.1371/journal.pone.0135176

33. Grove JM, Childers DL, Galvin M, Hines S, Muñoz-Erickson T, Svendsen ES. Linking science and decision making to promote an ecology for the city: practices and opportunities. Ecosyst Health Sustain. (2016) 2:e01239. doi: 10.1002/ehs2.1239

34. Stokes DE. Pasteur's Quadrant: Basic Science and Technological Innovation. Washington, DC: Brookings Institution Press (2011).

Keywords: canine science, dog, animal welfare, human-animal interaction, science communication, funding, sustainability

Citation: MacLean EL, Fine A, Herzog H, Strauss E and Cobb ML (2021) The New Era of Canine Science: Reshaping Our Relationships With Dogs. Front. Vet. Sci. 8:675782. doi: 10.3389/fvets.2021.675782

Received: 03 March 2021; Accepted: 11 June 2021; Published: 15 July 2021.

Copyright © 2021 MacLean, Fine, Herzog, Strauss and Cobb. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Evan L. MacLean, evanmaclean@arizona.edu

This article is part of the Research Topic

Our Canine Connection: The History, Benefits and Future of Human-Dog Interactions

  • Research article
  • Open access
  • Published: 05 February 2021

Increasing adoption rates at animal shelters: a two-phase approach to predict length of stay and optimal shelter allocation

  • Janae Bradley 1 &
  • Suchithra Rajendran   ORCID: orcid.org/0000-0002-0817-6292 2 , 3  

BMC Veterinary Research volume 17, Article number: 70 (2021)

Among the 6–8 million animals that enter the rescue shelters every year, nearly 3–4 million (i.e., 50% of the incoming animals) are euthanized, and 10–25% of them are put to death specifically because of shelter overcrowding each year. The overall goal of this study is to increase the adoption rates at animal shelters. This involves predicting the length of stay of each animal at shelters considering key features such as animal type (dog, cat, etc.), age, gender, breed, animal size, and shelter location.

Logistic regression, artificial neural network, gradient boosting, and random forest algorithms were used to develop models to predict the length of stay. The performance of these models was assessed using three metrics: precision, recall, and F1 score. The results demonstrated that the gradient boosting algorithm performed the best overall, with the highest precision, recall, and F1 score. Upon further observation of the results, it was found that age for dogs (puppy, super senior), multicolor, and large and small size were important predictor variables.

The findings from this study can be utilized to predict and minimize the animal length of stay in a shelter and euthanization. Future studies involve determining which shelter location will most likely lead to the adoption of that animal. The proposed two-phased tool can be used by rescue shelters to achieve the best compromise solution by making a tradeoff between the adoption speed and relocation cost.

As the problem of overpopulation of domestic animals continues to rise, animal shelters across the nation are faced with the challenge of finding solutions to increase the adoption rates. In the United States, about 6–8 million dogs and cats enter animal shelters every year, and 3–4 million of those animals are euthanized [ 1 ]. In other words, about 50% of the total canines and felines that enter animal shelters are put to death annually. Moreover, 10–25% of the total euthanized population in the United States is explicitly euthanized because of shelter overcrowding each year [ 2 ]. Though animal shelters provide incentives such as reduced adoption fees and sterilizing animals before adoption, only a quarter of total animals living in the shelter are adopted.

Animal adoption from shelters and rescues

There are various places to adopt an animal, and each potential owner must complete the adoption process and paperwork to take their new animal home [ 3 ]. Public and private animal shelters include animal control, city and county animal shelters, and police and health departments. Staff and volunteers run these facilities. Animals may also be adopted from a rescue organization, where pets are fostered in a home or a private boarding facility. These organizations are usually run by volunteers, and animals are viewed during local adoption events that are held at different locations, such as a pet store [ 3 ].

There could be several reasons for the euthanization of animals in a shelter, such as overcrowding, medical issues (e.g., sick, disabled), or behavioral issues (e.g., too aggressive). The causes of the overpopulation of animals include failure to spay or neuter animals, leading to reckless breeding habits and abandonment or surrender of offspring; abandonment by owners who are no longer able to take care of or do not want the animal; and individuals still buying from pet stores [ 4 ]. With the finite room capacity for animals that are abandoned or surrendered, overpopulation becomes a key challenge [ 5 ]. Though medical and behavioral issues are harder to solve, the overpopulation of healthy adoptable animals in shelters is a problem that can be addressed through machine learning and predictive analytics.

Literature review

In this section, we describe the research conducted on animal shelters evaluating euthanasia and factors associated with animal adoption. The articles provide insights into factors that influence the length of stay and what characteristics influence adoption.

Studies have been conducted investigating the positive influence of pre-adoption neutering of animals on the probability of pet adoption [ 2 ]. The author investigated the impact of the cooperation of veterinary medical schools in increasing pet adoption by offering free sterilization. Results demonstrated that the collaboration between veterinary hospitals and local animal shelters decreased the euthanization of adoptable pets.

Hennessy et al. [ 6 ] conducted a study to determine the relationship between the behavior and cortisol levels of dogs in animal shelters and examined its effect in predicting behavioral issues after adoption. Shore et al. [ 7 ] analyzed the reasons for returning adopted animals by owners and obtained insights for these failed adoptions to attain more successful future approvals. The researchers found that prior failed adoption had led to longer-lasting future acceptances. They hypothesized that the failed adoptions might lead owners to discover their dog preferences by assessing their living situation and the type of animal that would meet that requirement.

Morris et al. [ 8 ] evaluated the trends in intake and outcome data for shelters from 1989 to 2010 in a large U.S. metropolitan area. The results showed a decrease in euthanasia, adoption, and intake for dogs. For cats, a reduction in intake was observed until 1998, a decrease in euthanasia was observed until 2000, and the adoption of cats remained the same. Fantuzzi et al. [ 9 ] explored the factors that are significant for the adoption of cats in the animal shelter. The study investigated the effects of toy allocation, cage location, and cat characteristics (such as age, gender, color, and activity level). Results demonstrated that the more active cats that possessed toys and were viewed at eye level were more likely to impress the potential adopter and be adopted. Brown et al. [ 10 ] conducted a study evaluating the influence of age, breed, color, and coat pattern on the length of stay for cats in a no-kill shelter. The authors concluded that color did not influence the length of stay for kittens, whereas gender, coat patterning, and breed were significant predictors for both cats and kittens.

Machine learning

Machine learning is one possible tool that can be used to identify risk factors for animal adoption and predict the length of stay for animals in shelters. Machine learning is the ability to program computers to learn and improve by themselves using training experience [ 11 ]. The goal of machine learning is to develop a system that can analyze big data, quickly deliver accurate and repeatable results, and adapt to new data independently. A system can be trained to make accurate predictions by learning from examples of desired input-output data. More specifically, machine learning algorithms are utilized to detect classification and prediction patterns in large datasets and to develop models to predict future outcomes [ 12 ]. These patterns show the relationship between the attribute variables (input) and target variables (output) [ 13 ].

Widely used data mining tasks include supervised learning, unsupervised learning, and reinforcement learning [ 14 ]. Unsupervised learning involves the use of unlabeled datasets to train a system for finding hidden patterns within the data [ 15 ]. Clustering is an example of unsupervised learning. Reinforcement learning is where a system is trained through direct interaction with the environment by trial and error [ 15 ]. Supervised learning encompasses classification and prediction using labeled datasets [ 15 ]. These classification and regression algorithms are used to classify the output variable with a discrete label or predict the outcome as a continuous or numerical value. Traditional algorithms such as neural networks, decision trees, and logistic regression typically use supervised learning. Figure  1 provides a pictorial of the steps for developing and testing a predictive model.

Figure 1. Pictorial Representation of Developing a Predictive Model
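The develop-then-test workflow described above can be sketched in a few lines. This is a minimal illustration only: the toy records, the `train_test_split`, `predict_1nn`, and `accuracy` helpers, and the choice of a 1-nearest-neighbor classifier are all assumptions made for the example and are not the models or data used in the study.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle labeled rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def predict_1nn(train_rows, features):
    """Return the label of the closest training example (squared Euclidean distance)."""
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    return min(train_rows, key=dist)[1]

def accuracy(train_rows, test_rows):
    """Fraction of held-out rows whose predicted label matches the true label."""
    hits = sum(predict_1nn(train_rows, x) == y for x, y in test_rows)
    return hits / len(test_rows)

# Hypothetical labeled data: (age in years, weight in kg) -> length-of-stay class.
rows = [
    ((0.5, 4.0), "short"), ((1.0, 6.0), "short"),
    ((0.8, 5.0), "short"), ((1.5, 7.0), "short"),
    ((9.0, 30.0), "long"), ((10.0, 28.0), "long"),
    ((11.0, 35.0), "long"), ((8.5, 32.0), "long"),
]

train_rows, test_rows = train_test_split(rows)
print(f"held-out accuracy: {accuracy(train_rows, test_rows):.2f}")
```

The key design point is that the model only ever sees `train_rows`; the held-out `test_rows` stand in for new, unseen animals.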

Contributions to the literature

Although prior studies have investigated the impact of several factors, such as age and gender, on the length of stay, they focus on a single shelter, whereas this study considers multiple organizations. This study investigates the length of stay of animals at shelters and the factors influencing the rate of animal adoption. The overall goal is to increase adoption rates of pets in animal shelters by utilizing several factors to predict the length of stay. Machine learning algorithms are used to predict the length of stay of each animal based on numerous factors (such as breed, size, and color). We address several objectives in this study, which are listed below.

  • Identify risk factors associated with adoption rate and length of stay
  • Utilize the identified risk factors from collected data to develop predictive models
  • Compare statistical models to determine the best model for length of stay prediction
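Before any model can be trained for these objectives, the raw records have to be turned into inputs and a target. The sketch below shows one common way to do that: binning days in shelter into ordered classes and one-hot encoding categorical factors. The cut points in `LOS_CUTS` and the levels in `VOCAB` are hypothetical placeholders, since the thresholds and category levels actually used in the study are not given in this section.

```python
from bisect import bisect_right

# Hypothetical cut points in days: below 7 is "low", 7-29 "medium",
# 30-89 "high", 90 and above "very high". Not taken from the study.
LOS_CUTS = [7, 30, 90]
LOS_LABELS = ["low", "medium", "high", "very high"]

def los_class(days):
    """Map days in shelter to one of four ordered length-of-stay classes."""
    return LOS_LABELS[bisect_right(LOS_CUTS, days)]

def one_hot(record, vocab):
    """Encode categorical fields as 0/1 indicators, one per (field, level) pair."""
    return [1 if record.get(field) == level else 0 for field, level in vocab]

# Hypothetical category levels, loosely following the factors named in the text.
VOCAB = [
    ("size", "small"), ("size", "medium"), ("size", "large"),
    ("color", "multicolor"), ("color", "black"),
    ("age_group", "puppy"), ("age_group", "senior"),
]

animal = {"size": "large", "color": "multicolor", "age_group": "puppy"}
print(one_hot(animal, VOCAB), los_class(45))
```

Encoding each level as its own indicator is what allows a model to report a single level, such as a particular size or age group, as an important predictor.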

Exploratory Data results

From Fig. 2, it is evident that return is the most common outcome type for dogs at 43.3%, while Fig. 3 shows that adoption is the most common outcome type for cats at 46.1%. Both figures illustrate that the euthanization of both cats and dogs is still prevalent (~20%). The results from Table 1 demonstrate that the longest time spent in the shelter is 355 days, recorded both for a male cat that was adopted and for a female dog that was euthanized. Across all animal types, adoption has the lowest variance compared to the other outcome types. Adopted male cats have the lowest variance for days spent in the shelter, followed by female dogs. Female cats that are returned have the highest variance for days spent in the shelter.

Figure 2. Distribution of Outcome Types for Dogs

Figure 3. Distribution of Outcome Types for Cats
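Summary statistics of the kind reported in Table 1 (maximum and variance of days in shelter per animal type, sex, and outcome) can be computed with a simple group-by. The following is only a sketch: the `summarize` helper and the five records are invented for illustration and do not reproduce the study's data.

```python
from collections import defaultdict
from statistics import mean, variance

def summarize(records):
    """Group days in shelter by (animal type, sex, outcome) and report
    the maximum, mean, and sample variance of each group."""
    groups = defaultdict(list)
    for animal_type, sex, outcome, days in records:
        groups[(animal_type, sex, outcome)].append(days)
    return {
        key: {
            "max": max(values),
            "mean": mean(values),
            # Sample variance needs at least two observations.
            "variance": variance(values) if len(values) > 1 else 0.0,
        }
        for key, values in groups.items()
    }

# Hypothetical records: (animal type, sex, outcome, days in shelter).
records = [
    ("cat", "male", "adopted", 10),
    ("cat", "male", "adopted", 12),
    ("cat", "male", "adopted", 14),
    ("dog", "female", "euthanized", 5),
    ("dog", "female", "euthanized", 65),
]

for key, stats in summarize(records).items():
    print(key, stats)
```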

Figure 4 shows a comparison of cats and dogs for the three different outcome types. The data show that more dogs than cats are returned. From Fig. 5, it is observed that the number of days a dog stays in the shelter decreases as age increases. This is unexpected, as younger dogs and puppies were expected to spend fewer days in a shelter. This observation could be due to having more data points for younger dogs.

Figure 4. Comparison of Outcome Types for Cats and Dogs

Figure 5. Age vs. Days in Shelter for Cats and Dogs

Machine learning results

Examining Table 2, it is clear that the most proficient predictive model for this dataset is developed by the gradient boosting algorithm, followed by the random forest algorithm. The logistic regression algorithm appears to perform the worst, with low precision, recall, and F1 score for all categories of length of stay. For the prediction of low length of stay in a shelter, the random forest algorithm is the best performing model, at around 64–70% for precision, recall, and F1 score. The artificial neural network (ANN) algorithm is found to be the best when evaluating the precision and F1 score for medium length of stay, while the random forest algorithm is better for recall. However, the performance of these models in predicting the medium length of stay for the given dataset is low for all three performance metrics. The gradient boosting algorithm performs the best when predicting the high length of stay. Finally, the gradient boosting and random forest algorithms perform well when predicting the very high length of stay, at around 70–80%.
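For readers less familiar with the three metrics, the sketch below computes precision, recall, and F1 score separately for each length-of-stay class from true and predicted labels, using the standard definitions. The `per_class_metrics` helper and the six toy labels are illustrative assumptions, not the study's evaluation code or results.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} using one-vs-rest counts."""
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = (precision, recall, f1)
    return metrics

# Hypothetical true and predicted length-of-stay classes for six animals.
y_true = ["low", "low", "medium", "high", "very high", "very high"]
y_pred = ["low", "medium", "medium", "high", "very high", "high"]

for label, (p, r, f1) in per_class_metrics(y_true, y_pred).items():
    print(f"{label:>9}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting the metrics per class, as Table 2 does, makes it visible when a model does well on one category (for example, very high length of stay) and poorly on another.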

Results from Table 2 also demonstrate that the model developed from the gradient boosting algorithm has a higher performance when predicting the high length of stay that leads to adoption, and when the outcome is euthanization. Evaluating the average of all three performance metrics for all algorithms, gradient boosting is the most proficient model at almost 60%, while logistic regression appears to be the worst. Table 2 also provides the computational time for each machine learning algorithm. For the given dataset, logistic regression runs the fastest at 9.41 s, followed by gradient boosting, artificial neural network, and finally random forest, which runs the longest. The gap in the performance measure ( pm ) is calculated by \( \frac{pm_{best} - pm_{worst}}{pm_{best}} \), and is nearly 34%, 39%, and 32% for precision, recall, and F1 score, respectively.
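The gap formula is a relative difference between the best and worst score on one metric. A small worked sketch follows; the four scores are made-up values chosen only to show the arithmetic and are not the entries of Table 2.

```python
def relative_gap(scores):
    """Gap between best and worst score, relative to the best:
    (pm_best - pm_worst) / pm_best."""
    best, worst = max(scores), min(scores)
    return (best - worst) / best

# Hypothetical average scores for four algorithms on a single metric.
scores = {
    "gradient boosting": 0.60,
    "random forest": 0.57,
    "artificial neural network": 0.50,
    "logistic regression": 0.40,
}

gap = relative_gap(scores.values())
print(f"relative gap: {gap:.1%}")  # (0.60 - 0.40) / 0.60
```

Dividing by the best score expresses how much performance would be lost by choosing the worst algorithm instead of the best one.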

Table 3 provides information on the top features or factors from each machine learning algorithm. Observing the table, we find that age (senior, super senior, and puppy), size (large and small), and color (multicolor) have a significant impact or influence on the length of stay. Specifically, we observe that older-aged animals (senior and/or super senior) appear as a significant factor for every algorithm. For the artificial neural network, older age is the #2 and #3 predictor, and super senior is the #2 predictor for the gradient boosting algorithm. Large and small-sized animals are also observed to be important features, as both are shown as the #1 predictor in the gradient boosting and ANN algorithms. The results also demonstrate that gender, animal type, other colors besides multicolor, middle age, and medium-sized animals did not significantly impact the length of stay.

Results from our study provided information on what factors are significant in influencing length of stay. Brown et al. [ 10 ] conducted research that found that age, breed designation, coat color, and coat pattern influenced the length of stay for cats in animal shelters. Similar to that study, observations from our study also suggest that age and color have a significant impact or influence on the length of stay.

Determining which algorithm will develop the best model for the given set of data is critical to predict the length of stay and minimize the chances of euthanization. The goal of predictive analytics is to develop a model that best approximates the true mapping function for the relationship between the input and output variables. To approximate this function, parametric or non-parametric algorithms can be used. Parametric algorithms simplify the unknown function to a known form. Non-parametric algorithms do not make assumptions about the structure of the mapping function, allowing free learning of any functional form. In this study, we utilize both parametric (logistic regression and artificial neural network) and non-parametric (random forest and gradient boosting) algorithms on the given data. Observing the results from Table 2, the gradient boosting and random forest (non-parametric) algorithms perform the best on the dataset. It is observed from the results that using a non-parametric approach leads to a better approximation of the true mapping function for the given records. These results also support prior studies on parametric versus non-parametric methods. Neely et al. [ 16 ] detailed the theoretical superiority of non-parametric algorithms for detecting pharmacokinetic and pharmacodynamic subgroups in a study population. The authors suggest this superiority comes from the lack of assumptions made about the distribution of parameter values in a dataset. Bissantz et al. [ 17 ] discussed a resampling algorithm that evaluates whether the deviations between parametric and non-parametric methods are noise or systematic by comparing parametric models to a non-parametric “supermodel”. Results demonstrate the non-parametric model to be significantly better. The use of algorithms that do not assume a fixed form for the true function relating input and output provides better performance results for this application as well.
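The distinction between the two families can be illustrated with a deliberately simple sketch. Here a straight-line fit (which assumes a fixed functional form) and a k-nearest-neighbor regressor (which does not) are applied to data generated from a curved relationship. Neither model is one of the four algorithms used in the study, and the synthetic data are an assumption chosen so that the fixed form is wrong; the example shows the mechanism, not a general ranking of methods.

```python
def fit_line(xs, ys):
    """Parametric: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_knn(xs, ys, k=2):
    """Non-parametric: predict the mean target of the k nearest training points."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict

def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic curved relationship y = x^2 (an assumption made for illustration).
train_x = [i / 2 for i in range(-8, 9)]
train_y = [x * x for x in train_x]
test_x = [-3.25, -1.25, 0.25, 2.25, 3.75]
test_y = [x * x for x in test_x]

line, knn = fit_line(train_x, train_y), fit_knn(train_x, train_y)
print(f"line MSE: {mse(line, test_x, test_y):.3f}")
print(f"k-NN MSE: {mse(knn, test_x, test_y):.3f}")
```

When the assumed form matches the data, a parametric model can be just as accurate and needs fewer examples, so the comparison depends on the dataset at hand.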

Current literature also supports the use of ensemble methods to increase prediction accuracy and performance. Dietterich [ 18 ] discussed the ongoing research into developing good ensemble methods as well as the discovery that ensemble algorithms are often more accurate than the individual algorithms that are used to create them. Pandey and Taruna [ 19 ] conducted a study to compare the accuracy of ensemble methodology in predicting student academic performance, as research has demonstrated better results for composite models over a single model. This study applied ensemble techniques to learning algorithms (AdaBoost, Random Forest, Rotation Forest, and Bagging). For our study with the given records, the results support this claim. Both the gradient boosting and random forest algorithms are ensemble algorithms and performed the best on the animal shelter data.
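One standard intuition for why an ensemble can beat its members is the majority-vote argument: if several classifiers err independently, the chance that most of them are wrong at once is smaller than the chance that any one is wrong. The sketch below computes this for a binary task under an idealized independence assumption; the function name and the 0.7 accuracy are illustrative, and real ensemble members are correlated, so actual gains are smaller.

```python
from math import comb

def majority_vote_accuracy(n, p):
    """Probability that more than half of n independent binary classifiers,
    each correct with probability p, give the correct answer (n odd)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

# Hypothetical base accuracy of 0.7 for every member of the ensemble.
for n in (1, 3, 11):
    print(n, round(majority_vote_accuracy(n, 0.7), 3))
```

Random forests and gradient boosting combine trees in more elaborate ways than a plain vote, but both rely on aggregating many imperfect models.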

Results from Table 2 demonstrate the best performance of the gradient boosting and random forest algorithm when the length of stay was classified as very high or the animal was euthanized. This is beneficial as the models can predict long stays where the outcome is euthanasia. This can lead to shelters identifying at-risk animals and implementing methods and solutions to ensure their adoption. These potential methods are the second phase of this research study, which will involve relocating animals to shelters where they will more likely be adopted. This phase is discussed in the future directions section.

Studies have been conducted evaluating euthanasia-related stress on workers (e.g., [ 1 ]). In other words, overpopulation not only leads to euthanasia but can, in turn, cause mental and emotional problems for the workers. For instance, Reeve et al. [ 20 ] evaluated the strain related to euthanasia among animal workers. Results demonstrated that euthanasia-related strain was prevalent, and an increase in substance abuse, job stress, work causing family conflict, complaints, and low job satisfaction was observed. Predicting the length of stay for animals will help make them more likely to be adopted and will lead to fewer animals being euthanized, which benefits not only the animals that find a home but also the workers, who experience less stress.

The approach developed in this paper could be beneficial not only to reduce euthanasia but also to reduce overcrowding in shelters operated in countries where euthanasia of healthy animals is illegal, and all animals must be housed in shelters until adoption (or natural death). It is essential to develop an information system for a collaborative animal shelter network in which the entities can coordinate with each other, exchanging information about the animal inventory. Another benefit of this study is that it investigates applying machine learning to the animal care domain. Previous studies have looked into what factors influence the length of stay; however, this study utilizes these factors in addition to classification algorithms to predict how long an animal will stay in the shelter. Moreover, the use of a prescriptive analytics approach is discussed in this paper, where the predictions made by the machine learning algorithms will be used along with a goal programming model to decide in which shelter an animal is most likely to be adopted.
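To make the idea of trading off adoption speed against relocation cost concrete, here is a minimal sketch of weighted goal programming over a small set of candidate shelters. The goal programming model itself is not specified in this section, so everything below (the `goal_deviation` and `best_shelter` helpers, the shelter names, the goals, and the weights) is a hypothetical illustration rather than the authors' formulation.

```python
def goal_deviation(option, goals, weights):
    """Weighted sum of relative overshoots above each goal.
    Staying at or below a goal incurs no penalty."""
    return sum(weights[k] * max(0.0, option[k] - goals[k]) / goals[k]
               for k in goals)

def best_shelter(options, goals, weights):
    """Pick the candidate with the smallest weighted deviation from the goals."""
    return min(options, key=lambda name: goal_deviation(options[name], goals, weights))

# Hypothetical candidates: predicted length of stay (days) and relocation cost.
options = {
    "Shelter A": {"predicted_los_days": 40, "relocation_cost": 0},
    "Shelter B": {"predicted_los_days": 12, "relocation_cost": 120},
    "Shelter C": {"predicted_los_days": 9, "relocation_cost": 400},
}
# Hypothetical targets and priorities set by the decision maker.
goals = {"predicted_los_days": 14, "relocation_cost": 100}
weights = {"predicted_los_days": 0.6, "relocation_cost": 0.4}

print(best_shelter(options, goals, weights))
```

In this toy setup, keeping the animal in place misses the length-of-stay goal badly, and the fastest option overshoots the cost goal, so a compromise candidate is selected. Changing the weights shifts that compromise.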

Limitations of this study include the lack of behavioral data, the limited sample size, and the use of simple algorithms. Regarding the first limitation, behavioral data on the animal at intake and outcome would help develop a more comprehensive model. Though behavioral problems are harder to solve, having such data would provide insight into how long animals with behavioral issues stay in shelters and what their outcomes are. Studies have shown that behavioral problems play a significant role in preventing bonding between owners and their animals and are one of the most common reasons cited for animal surrender [ 21 , 22 ]. These behavioral problems can include poor manners, too much energy, aggression, and destruction of the household. Dogs surrendered to shelters because of behavioral issues have also been shown to be less likely to be adopted or rehomed, and the ones that are adopted are more likely to be returned [ 21 ]. Studies have also evaluated the effect of the length of time spent in rescue shelters on the behavior of dogs [ 23 , 24 , 25 ]. Most of them concluded that environmental factors led to changes in the behavior of dogs and that a prolonged period in a shelter may lead to behavior that is unattractive to potential owners. Information on behavioral problems gives the algorithm more to learn from when developing the predictive model, allowing more in-depth predictions of how long an animal will stay in a shelter, which could also aid adoption. This approach can be used to shorten the length of stay, which helps ensure that healthy animals do not develop behavioral problems in the shelters. It is crucial not only that the animal is adopted but also that the adoption is a good fit between owner and pet. Shortening the length of stay would also lessen the chance that the animal will be returned by the adopter because of behavior. Having this information will also allow shelters to find nearby shelters where animals with behavioral issues are more likely to be adopted. To overcome this limitation, behavioral issues will be used as a factor and will be specifically requested when acquiring data from shelters.

The second limitation is the limited sample size. Collecting more data from animal shelters across the United States would allow more representative data to be fed into these algorithms. However, this presents a challenge because most shelters are underfunded and short on staff. Though we reached out to shelters, most replied that they lacked the resources and staff to provide the information needed. Future work would include applying for funding to provide a stipend to staff for their assistance in gathering the data from their respective shelters. With more data, the algorithm has more information to learn from, which could improve the performance metrics of the predictive models developed. Other factors may also prove to be significant as more data is collected.

Finally, the last limitation is the use of simpler algorithms. This study considers basic ML algorithms. Nevertheless, in recent years, there has been development in the ML field of more complex networks. For instance, Zhong et al. [ 26 ] proposed a novel reinforcement learning method to select neural blocks and develop deep learning networks. Results demonstrated high efficiency in comparison to most of the previous deep network search approaches. Though only four algorithms were considered, future work would investigate deep learning networks, as well as bagging algorithms. Using more complex algorithms could ensure that if intricate patterns in the data are present, the algorithm can learn them.

Future direction

Phase 2: goal programming approach for making relocation decisions.

Using the information gathered in this study, we can predict the type of animals that are being adopted the most in each region and during each season of the year. To accomplish this, we utilize a two-phase approach. The first phase leverages the machine learning algorithms to predict the length of stay of each animal based on numerous factors (such as breed, size, and color). Phase-2 involves determining the best shelter to which adoptable animals should be transported to increase adoption rates, based on several conflicting criteria. These criteria include the predicted length of stay from phase-1, the distance between where the animal is currently housed and the potential animal shelters, transportation costs, and transportation time. Therefore, our goal is to increase adoption rates of pets in animal shelters by utilizing several factors to predict the length of stay and by determining the optimal shelter location, where the animal will have the shortest stay and will most likely be adopted.

After predicting the length of stay of an incoming animal that is currently housed in shelter $l'$ using the machine learning algorithms, the next phase is to evaluate the potential relocation options for that animal. This strategic decision is especially important if the length of stay of the animal at its current location is high/very high. Nevertheless, while making this relocation decision, it is also necessary to consider the cost of transporting the animal between the shelters. For instance, suppose a dog is brought into a shelter in Houston, Texas, and is estimated to have a high/very high length of stay there. If the dog is predicted to have a low length of stay in New York City and a medium length of stay in Oklahoma City, then a tradeoff has to be made between the relocation cost and the adoption speed. The two objectives, length of stay and relocation cost, are conflicting, and both have to be minimized. Phase-2 attempts to yield a compromise solution that establishes a trade-off between these two criteria.

Goal programming (GP) is a widely used approach to solve problems involving multiple conflicting criteria. Under this method, each objective function is treated as a goal, and a target value is specified for each criterion [ 27 ]. The model may meet these targets with certain deviations, and the objective of the GP model is to minimize these deviations. In this study, the desired values for the length of stay and relocation cost are pre-specified in the model and can be met with deviations. Thus, this technique attempts to produce a solution that is as close as possible to the targets, and prior studies refer to the model solution as the “most preferred solution” (e.g., [ 28 , 29 ]).

As mentioned earlier, the primary task to be completed using this phase-2 goal programming approach is the relocation decisions considering the adoption speed and the cost of transporting the animal from the current location.

Model notations

Goal programming model formulation

Goal constraints

Objective 1: Minimize the overall length of stay of the animal under consideration (Eq. 1 ).

Goal constraint for objective 1: The corresponding goal constraint of objective 1 is given by Equation (2).

Objective 2: Minimize the overall relocation cost for transporting the animal under consideration (Eq. 3 ).

Goal constraint for objective 2: The corresponding goal constraint of objective 2 is given by Equation (4).

Hard constraints

Equation (5) ensures that the animal can be assigned to only one shelter.

The animal can be accommodated in shelter $l$ only if that shelter has capacity for, and accepts, the animal's type and size category; this is guaranteed by constraint (6). It is important to note that both y and s are input parameters, whereas l is the set of shelters.

Equation (7) sets an upper limit on the length of stay category if shelter $l$ is assigned as the destination location. This prevents relocating animals to a shelter where they might have a high or very high length of stay.

Similarly, Equation (8) sets an upper limit on the relocation cost if shelter $l$ is assigned as the destination location. This prevents relocating animals to a very distant location. The current shelter location, $l'$, that is hosting the animal is an input parameter.

Objective function

Since the current problem focuses on minimizing the expected length of stay and relocation cost, the objective function of the goal programming approach is to minimize the sum of the weighted positive deviations given in Equations (2) and (4), as shown in Equation (9).

where $w_g$ is the weight assigned to goal $g$.

It is necessary to scale the deviations (since the objectives have different magnitudes as well as units) to avoid a biased solution.

If the scaling factors are represented by $f_g$ for goal $g$, then the scaled objective function is given in Equation (10).

Using this goal programming approach, the potential relocation options are evaluated considering the length of stay from phase-1. This phase-2 goal programming approach is useful, especially if the length of stay of the animal at its current location is high/very high, and a trade-off has to be made between relocation cost and length of stay. Phase-2 acts as a recommendation tool for assisting administrators with relocation decisions.
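The relocation decision described above can be illustrated with a minimal Python sketch. Because the animal is assigned to exactly one shelter, the single-animal model can be solved by enumerating the feasible shelters and choosing the one with the smallest weighted, scaled sum of positive deviations. The ordinal coding of the length of stay categories, the relocation costs, the targets, the weights, and the scaling factors below are all hypothetical values chosen for illustration; they are not taken from this study.

```python
from dataclasses import dataclass

# Assumed ordinal coding of the phase-1 length of stay categories.
LOS_SCORE = {"low": 1, "medium": 2, "high": 3, "very high": 4}


@dataclass
class Shelter:
    name: str
    predicted_los: str      # phase-1 prediction for this animal at this shelter
    relocation_cost: float  # hypothetical cost of moving the animal here
    has_capacity: bool      # capacity and type/size compatibility


def choose_shelter(shelters, los_target, cost_target, max_los, max_cost,
                   weights=(0.5, 0.5), scales=(1.0, 1.0)):
    """Return the feasible shelter with the smallest weighted scaled deviation."""
    best, best_value = None, float("inf")
    for s in shelters:
        los = LOS_SCORE[s.predicted_los]
        # Hard constraints: capacity, upper limit on LOS, upper limit on cost.
        if not s.has_capacity or los > max_los or s.relocation_cost > max_cost:
            continue
        # Positive deviations from the two goal targets.
        d_los = max(0.0, los - los_target)
        d_cost = max(0.0, s.relocation_cost - cost_target)
        value = weights[0] * d_los / scales[0] + weights[1] * d_cost / scales[1]
        if value < best_value:
            best, best_value = s, value
    return best, best_value


# Hypothetical instance mirroring the Houston example in the text.
options = [
    Shelter("Houston", "high", 0.0, True),
    Shelter("New York City", "low", 900.0, True),
    Shelter("Oklahoma City", "medium", 300.0, True),
]
common = dict(los_target=1, cost_target=0.0, max_los=LOS_SCORE["medium"],
              max_cost=1000.0, scales=(3.0, 1000.0))
balanced, _ = choose_shelter(options, weights=(0.5, 0.5), **common)
speed_first, _ = choose_shelter(options, weights=(0.9, 0.1), **common)
```

With equal weights this toy instance favors the cheaper move to Oklahoma City, while weighting the length of stay goal more heavily favors New York City, which shows how the weights encode the administrator's trade-off.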

Nearly 3–4 million animals are euthanized out of the 6–8 million animals that enter shelters annually. The overall objective of this study is to increase the adoption rates of animals entering shelters by using key factors found in the literature to predict the length of stay. The second phase determines the best shelter location to transport animals using the goal programming approach to make relocation decisions. To accomplish this objective, first, the data is acquired from online sources as well as from numerous shelters across the United States. Once the data is acquired and cleaned, predictive models are developed using logistic regression, artificial neural network, gradient boosting, and random forest. The performance of these models is determined using three performance metrics: precision, recall, and F1 score.

The results demonstrate that the gradient boosting algorithm performed the best overall, with the highest precision, recall, and F1 score. The random forest algorithm followed closely in second place, then the artificial neural network, and finally the logistic regression algorithm, which was the worst performer. We also observed that gradient boosting performed better when predicting a high or very high length of stay. Further examination of the results shows that age for dogs (e.g., puppy, super senior), multicolor, and large and small size are important predictor variables.

The findings from this study can be utilized to predict how long an animal will stay in a shelter, as well as minimize their length of stay and chance of euthanization by determining which shelter location will most likely lead to the adoption of that animal. For future studies, we will implement phase 2, which will determine the best shelter location to transport animals using the goal programming approach to make relocation decisions.

Data description

A literature review is conducted to determine the factors that might potentially influence the length of stay for animals in shelters. These factors include gender, breed, age, and several other variables that are listed in Table 4 . These features will be treated as input variables for the machine learning algorithms. Overall, there are eight input or predictor variables and one output variable, which is the length of stay.

Animal shelter intake and outcome data are made publicly available by several state and city governments on their websites (e.g., [ 33 , 34 ]), particularly in several southern and south-western states. These online sources provide datasets for animal shelters from Kentucky (150,843 data rows), California (334,016), Texas (155,115), and Indiana (4132). Since there is no nationwide database for animal shelters, information is also collected directly from individual animal shelters that euthanize animals. We contacted over 100 animal shelters across the United States and requested data on the factors listed in Table 4 . We received responses from 20 of them, most of which stated that there was not enough staff or resources to provide this information. Only four shelters were able to provide any information, and only two of those datasets contained the factors and information needed: Colorado (8488 data rows) and Arizona (4667 data rows).

The data collected from the online databases and animal shelters include information such as animal type, intake and outcome dates, gender, color, breed, and intake and outcome status (behavior of the animal on entering the shelter and at outcome). The records also cover several types of animals, such as dogs, cats, birds, rabbits, and lizards. For this study, the focus is on dogs and cats. After filtering these records, we found that only California, Kentucky, Colorado, Arizona, and Indiana had all of the factors needed for the study. The acquired data then underwent data integration, data transformation, and data cleaning (as detailed in Fig.  1 ). After data pre-processing, there are over 113,000 animal records.

Data cleaning methods

Next, data cleaning methods are utilized to detect discrepancies in the data, such as missing values, erroneous data, and inconsistencies. Data cleaning is an essential step for obtaining unbiased results [ 35 , 36 ]. In other words, identifying and cleaning erroneous data must be performed before inputting the data into the algorithm as it can significantly impact the output results.

The following is a list of commonly used data cleaning techniques in the literature [ 11 ]:

Substitution with Median: Missing or incorrect data are replaced with the median value for that predictor variable.

Substitution with a Unique Value: Erroneous data are replaced with a value that does not fall within the range that the input variables can accept (e.g., a negative number).

Discard Variable and Substitute with a Median: When an input variable has a significant number of missing values, that variable is removed from the dataset, and missing or erroneous values in the remaining features are replaced with the median.

Discard Variable and Substitute with a Unique Value: Input variables with a significant number of missing values are removed from the dataset, and missing or erroneous values in the remaining features are coded as −1.

Remove Incomplete Rows Entirely: Incomplete rows are removed from the dataset.
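Three of the techniques listed above can be sketched in a few lines of Python. This is an illustrative sketch only: missing or erroneous entries are assumed to have already been marked as None, and the function names are hypothetical rather than part of the study's pipeline.

```python
from statistics import median


def substitute_with_median(values):
    """Replace missing entries (None) with the median of the known values."""
    known = [v for v in values if v is not None]
    m = median(known)
    return [m if v is None else v for v in values]


def substitute_with_unique_value(values, sentinel=-1):
    """Replace missing entries with a value outside the valid range."""
    return [sentinel if v is None else v for v in values]


def remove_incomplete_rows(rows):
    """Keep only the records (dicts) in which every field is present."""
    return [r for r in rows if all(v is not None for v in r.values())]


ages = [2, None, 4, 9]
filled_median = substitute_with_median(ages)        # median of 2, 4, 9 is 4
filled_sentinel = substitute_with_unique_value(ages)
complete = remove_incomplete_rows([
    {"breed": "Shih Tzu", "age": 3},
    {"breed": None, "age": 5},
])
```

The two "discard variable" strategies combine these helpers with a prior step that drops any column whose share of missing values exceeds a chosen threshold.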

Data preprocessing

Some animal breeds are listed in multiple formats, which are standardized to maintain uniformity. For example, the Russian Blue cat appears as “Russian”, “Russian Blue”, and “RUSSIAN BLUE”. Animals with multiple breeds, such as “Shih Tzu/mix” or “Shih Tzu/Yorkshire Terr”, are classified as the first breed listed. Other uncommon breeds are classified as “other” for simplicity. Finally, all animal breeds are summarized into three size categories (small, medium, or large) using the American Kennel Club’s breed size classification [ 37 ]. Part of the data cleansing process also includes grouping the many colors found throughout the dataset into five distinct color categories (brown, black, blue, white, and multicolor). We classified age into five categories for dogs and cats (puppy or kitten, adolescent, adult, senior, super senior). The puppy or kitten category includes animals 0–1 year old, adolescents are 2–3 years old, adults are 4–7 years old, and senior animals are 8–10 years old. Any animal older than ten years is categorized as a super senior, based on the recommendations provided by Wapiti Labs [ 38 ].
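The breed and age rules above translate directly into small helper functions. The sketch below is illustrative: the alias table is a hypothetical stand-in for whatever mapping was actually used, and ages are assumed to be recorded in whole years.

```python
# Hypothetical alias table mapping variant spellings to one canonical name.
BREED_ALIASES = {"russian": "Russian Blue", "russian blue": "Russian Blue"}


def normalize_breed(raw):
    """Keep the first breed listed and standardize spacing and capitalization."""
    first = raw.split("/")[0]
    key = " ".join(first.split()).lower()
    return BREED_ALIASES.get(key, key.title())


def age_category(age_years, species):
    """Map an age in whole years to the five age categories in the text."""
    if age_years < 0:
        raise ValueError("age cannot be negative")
    if age_years <= 1:
        return "puppy" if species == "dog" else "kitten"
    if age_years <= 3:
        return "adolescent"
    if age_years <= 7:
        return "adult"
    if age_years <= 10:
        return "senior"
    return "super senior"
```

For example, `normalize_breed("Shih Tzu/Yorkshire Terr")` keeps only the first breed, and `age_category(11, "dog")` falls into the super senior category.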

As mentioned previously, the output variable is the length of stay and is classified as low, medium, high, and very high/euthanization. The length of stay is calculated by taking the difference between the intake date and outcome date. To remove erroneous data entries and special cases, the number of days in the animal shelter is also capped at a year. The “low” category represents animals that are returned (in which case, they are assigned the days in the shelter as 0) or spent less than 8 days before getting adopted. It is important to keep these animals at the shelter so that the owner may find them or they are transferred to their new homes. Animals that stayed in a shelter for 9–42 days and are adopted are categorized as “medium” length of stay. The “high” category is given to animals that stayed in the shelter for 43–365 days. Finally, animals that are euthanized are categorized as “very high”.
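The labeling rule for the output variable can be sketched as follows. The outcome labels ("returned", "euthanized", "adopted") are hypothetical names for the outcome types, and the sketch assumes that a stay of exactly 8 days belongs to the "low" category, since the text does not assign that boundary value explicitly.

```python
from datetime import date


def length_of_stay_days(intake, outcome, cap=365):
    """Days between intake and outcome, capped at one year."""
    days = (outcome - intake).days
    if days < 0:
        raise ValueError("outcome date precedes intake date")
    return min(days, cap)


def los_category(days, outcome_type):
    """Assign the length of stay class used as the output variable."""
    if outcome_type == "euthanized":
        return "very high"
    if outcome_type == "returned":
        return "low"          # returned animals are assigned 0 days
    if days <= 8:             # assumption: day 8 is treated as "low"
        return "low"
    if days <= 42:
        return "medium"
    return "high"


stay = length_of_stay_days(date(2019, 1, 1), date(2019, 1, 11))
label = los_category(stay, "adopted")
```

Records with a negative length of stay raise an error here so that they can be handled by the data cleaning step rather than silently labeled.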

After integrating all data points from each animal shelter, the sample size includes 119,691 records. After the evaluation of these data points, 5436 samples are found to have erroneous values (such as a negative length of stay) or missing values. After applying data cleaning techniques, the final cleaned dataset includes 114,256 data points, with 50,466 cat records and 63,790 dog records.

Machine learning algorithms to predict the length of stay

The preprocessed records are then separated into training and testing datasets based on the type of classification algorithm used. Studies have demonstrated the need for testing and comparing machine learning algorithms, as the performance of the models depends on the application. While an algorithm may develop a predictive model that performs well in one application, it may not be the best performing model for another. A comparison between the statistical models is conducted to determine the overall best performing model. In this section, we provide a description as well as the advantages of each classification algorithm that is utilized in this study.

Logistic regression

Logistic regression (LR) is a machine learning algorithm that is used to understand the probability of the occurrence of an event [ 39 ]. It is typically used when the model output variable is binary or categorical (see Fig.  6 ), unlike linear regression, where the dependent variable is numeric [ 40 ]. Logistic regression involves the use of a logistic function, referred to as a “sigmoid function” that takes a real-valued number and maps it into a value between 0 and 1 [ 41 ]. The probability that the length of stay of the animal at a specific location will be low, medium, high, or very high, is computed using the input features discussed in Table 4 .

Fig. 6 Pictorial representation of the logistic regression algorithm

The linear predictor functions used to predict the probability that the animal in record $i$ has a low, medium, high, or very high length of stay are given by Equations (11)–(14), respectively.

where $\beta_{v,l}$ is the multinomial logistic regression coefficient for variable $v$ and length of stay category $l$, and $x_{v,i}$ is the input feature $v$ corresponding to data observation $i$.
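As an illustration of how per-category linear predictors become class probabilities, the sketch below applies a softmax to one linear predictor per length of stay category. This assumes the standard multinomial logistic formulation with an intercept term; the coefficient values are invented for the example and are not estimates from this study.

```python
import math


def linear_predictor(beta, x):
    """beta[0] is the intercept; beta[1:] are the coefficients for features x."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


def category_probabilities(betas, x):
    """Softmax over the linear predictors of all length of stay categories."""
    scores = {c: linear_predictor(b, x) for c, b in betas.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


# Made-up coefficients for two encoded features (e.g., age group and size).
betas = {
    "low": [0.5, -0.2, 0.1],
    "medium": [0.0, 0.1, 0.0],
    "high": [-0.3, 0.3, -0.1],
    "very high": [-0.8, 0.4, 0.2],
}
probs = category_probabilities(betas, x=[1.0, 2.0])
predicted = max(probs, key=probs.get)
```

The predicted class is simply the category with the highest probability, and the probabilities sum to one by construction.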

Artificial neural network

Artificial Neural Network (ANN) algorithms were inspired by the brain’s neurons, which transmit signals to other nerve cells [ 40 , 42 ]. ANNs were designed to replicate the way humans learn and were developed to imitate the operational sequence in which the body sends signals in the nervous system [ 43 ]. An ANN has a network structure with directional links connecting multiple nodes or “artificial neurons”. These neurons are information-processing units, and the links that connect them represent the relationships between the connected neurons. Each ANN consists of three layers: the input layer, the hidden layer, and the output layer [ 32 , 44 ]. The input layer is where each of the input variables is fed into the artificial neuron. The neuron first calculates the weighted sum of its inputs from the independent variables. Each of the connecting links (synapses) from these inputs has a characteristic weight or strength that has a negative or positive value [ 32 ]. When new data is received, the synaptic weights change, and learning occurs. The hidden layer learns the relationship between the input and output variables, and a threshold value determines whether the artificial neuron will fire or pass the learned information to the output layer, as shown in Fig.  7 . Finally, the output layer is where labels are given to the output value, and backpropagation is used to correct any errors.

Fig. 7 Pictorial representation of the artificial neural network
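The forward pass through such a three-layer network can be sketched in plain Python. The sigmoid activation and the weight layout are assumptions made for illustration; the text does not specify which activation function was used, and training by backpropagation is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Each neuron computes a weighted sum of its inputs plus a bias."""
    return [sigmoid(sum(w * v for w, v in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]


def forward(x, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> one hidden layer -> output layer."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Toy network: 2 inputs, 2 hidden neurons, 4 outputs (one per LOS category).
hidden_w = [[0.0, 0.0], [0.0, 0.0]]
hidden_b = [0.0, 0.0]
output_w = [[0.0, 0.0]] * 4
output_b = [0.0] * 4
outputs = forward([1.0, 2.0], hidden_w, hidden_b, output_w, output_b)
```

With all weights set to zero every neuron outputs 0.5; training adjusts the weights so that the output for the correct category moves toward 1.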

Random Forest

The Random Forest (RF) algorithm is a type of ensemble methodology that combines the results of multiple decision trees to create a new predictive model that is less likely to misclassify new data [ 30 , 45 ]. Decision trees have a root node at the top of the tree that consists of the attribute that best classifies the training data. The attribute with the highest information gain (given in Eq. 16 ) is chosen as the best attribute at each level/node. The root node is split into subnodes, each of which is either a decision node or a leaf node. A decision node can be divided into further subnodes, while a leaf node cannot be split further and provides the final classification or discrete label. The RF algorithm uses ntree and mtry as the two main parameters in developing the multiple parallel decision trees. Ntree specifies how many trees to train in parallel, while mtry defines the number of independent variables or attributes considered when splitting each node [ 30 ]. The majority vote across all parallel trees gives the final prediction, as shown in Fig.  8 .

Fig. 8 Pictorial representation of the random forest algorithm
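Two building blocks of the random forest described above, choosing a split by information gain and combining trees by majority vote, can be sketched as follows. The sketch assumes the common entropy-based definition of information gain, which may differ in detail from the formula used in this study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, attribute_values):
    """Entropy reduction obtained by splitting the labels on one attribute."""
    n = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def majority_vote(tree_predictions):
    """Final forest prediction: the class predicted by the most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


labels = ["low", "low", "high", "high"]
gain_size = information_gain(labels, ["small", "small", "large", "large"])
gain_color = information_gain(labels, ["black", "white", "black", "white"])
forest_label = majority_vote(["low", "high", "low"])
```

In this toy example, size separates the two classes perfectly (a gain of one bit) while color carries no information, so size would be chosen for the split.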

Gradient boosting

Boosting is another type of ensemble method that combines the results from multiple predictive algorithms to develop a new model. While the RF approach is built solely on decision trees, boosting algorithms can use various base algorithms such as decision trees, logistic regression, and neural networks. The primary goal of boosting algorithms is to convert weak learners into stronger ones by leveraging weighted averages to identify “weak classifiers” [ 31 ]. Samples are assigned an initial uniform weight, and a sample that is incorrectly labeled by the algorithm is penalized with an increase in weight [ 46 ]. Conversely, samples that are correctly classified by the algorithm decrease in weight. This re-weighting is repeated until a weighted vote of the weak classifiers is combined into a robust classifier that determines the final labels or classification [ 46 ]. For our study, gradient boosting (GB) is applied to decision trees for the given dataset, as illustrated in Fig.  9 .

Fig. 9 Pictorial representation of the boosting algorithm
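The re-weighting scheme described above corresponds to AdaBoost-style boosting, and one round of it can be sketched as shown below. This is an illustration of the weighting idea only: gradient boosting implementations typically fit each new tree to the gradient of a loss function rather than re-weighting samples explicitly.

```python
import math


def reweight(weights, correct):
    """One AdaBoost-style round: raise weights of misclassified samples.

    weights: current sample weights (summing to 1)
    correct: booleans, True where the weak classifier labeled the sample right
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    error = min(max(error, 1e-10), 1 - 1e-10)    # guard against log(0)
    alpha = 0.5 * math.log((1 - error) / error)  # vote of this weak classifier
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], alpha


initial = [0.25, 0.25, 0.25, 0.25]            # uniform starting weights
new_weights, alpha = reweight(initial, [True, True, True, False])
```

After the update the single misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right.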

Machine learning model parameters

The clean animal shelter data is split into two datasets: training and testing data. The records are randomly assigned to the two groups, which are used to train the algorithms and to test the resulting models, respectively. 80% of the data is used to train the algorithm, while the other 20% is used to test the predictive model. To avoid overfitting, a tenfold cross-validation procedure is used on the training data. No tuning parameters are associated with the logistic regression algorithm. However, a grid search method is used to tune the parameters of the random forest, gradient boosting, and artificial neural network algorithms. Grid search evaluates every combination in a user-specified set of parameter values during training and selects the best-performing one.
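The 80/20 split and the tenfold cross-validation on the training data can be sketched with the Python standard library alone. The function names and the fixed random seed are illustrative choices, not details reported in this study.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=0):
    """Randomly assign records to a training set and a testing set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n, k=10):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, validation


records = list(range(100))                 # stand-in for cleaned records
train, test = train_test_split(records)
folds = list(k_fold_indices(len(train)))   # tenfold CV on the training data
```

Each training record serves as validation data exactly once across the ten folds, while the held-out 20% is touched only for the final evaluation.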

The number of trees in the random forest and gradient boosting algorithms is varied from 100 to 1000 in increments of 100. Learning rates of 0.01, 0.05, and 0.10 are used based on the recommendations of previous studies [ 47 ]. The minimum number of observations in the trees’ terminal nodes varies from 2 to 10 in increments of one, while the splitting of trees varies from 2 to 10 in increments of two. A feed-forward method is used to develop the predictive model using the artificial neural network algorithm. The feed-forward network consists of three layers (input, hidden, output) and is trained with backpropagation. The independent and dependent variables represent the input and output layers. Since the sizes of the input and output layers are fixed by the data, only the number of hidden nodes must be chosen, and the optimal number is taken to lie between 1 and the number of predictors. This means that for our study, the number of nodes in the hidden layer varies from 1 to 8. The learning rate values used to train the ANN are 0.01, 0.05, and 0.10.

To find the optimal setting for each machine learning algorithm, a thorough search of their corresponding parameter space is performed.
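A grid search over the tree-based parameter ranges given above can be sketched as follows. The parameter names and the scoring function are hypothetical: in practice the score would be a cross-validated performance measure of a model trained with those parameters, whereas here a toy function stands in for it so that the example is self-contained.

```python
from itertools import product

# Parameter ranges as described in the text (names are illustrative).
grid = {
    "n_trees": list(range(100, 1001, 100)),
    "learning_rate": [0.01, 0.05, 0.10],
    "min_node_size": list(range(2, 11)),
    "n_splits": list(range(2, 11, 2)),
}


def grid_search(grid, score):
    """Evaluate every parameter combination and keep the best-scoring one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        value = score(params)
        if value > best_score:
            best_params, best_score = params, value
    return best_params, best_score


def toy_score(p):
    """Placeholder for a cross-validated score; peaks at one combination."""
    return -(abs(p["n_trees"] - 300) + 1000 * abs(p["learning_rate"] - 0.05)
             + abs(p["min_node_size"] - 4) + abs(p["n_splits"] - 6))


n_combinations = len(list(product(*grid.values())))
best_params, best_score = grid_search(grid, toy_score)
```

The exhaustive search is simple but costly: this grid alone contains 1350 combinations, each of which would require a full cross-validated training run.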

Performance measures

In this study, we use three performance measures to evaluate the ability of machine learning algorithms in developing the best predictive model for the intended application. The measures considered are precision, F1 score, and sensitivity/recall to determine the best model given the inputted data samples. Table 5 provides a confusion matrix to define the terms used for all possible outcomes.

Precision is the proportion of positive predictions that are truly positive, so it penalizes cases that are predicted positive when they should have been negative (Eq. 17 ). High precision means a low rate of false positives (type I errors). Sensitivity, or recall, is the proportion of actual positives that the algorithm predicts correctly, so it penalizes cases that are predicted negative when they should have been positive (Eq. 18 ). Recall is a good measure to use when the focus is on minimizing false negatives (type II errors). The F1 score (shown in Eq. 19 ) accounts for both type I and type II errors and assesses the ability of the model to resist false positives and false negatives. This performance metric evaluates the robustness of the model (a low number of misclassifications) as well as the number of data points that it classifies correctly.
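These three measures follow directly from the confusion matrix counts, as the short sketch below shows using the standard definitions. For a multi-class problem such as this one, the counts are taken one class at a time (that class versus the rest); the example labels are invented.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def counts_for_class(y_true, y_pred, positive):
    """One-vs-rest counts of TP, FP, and FN for a single class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fp, fn


y_true = ["high", "low", "high", "medium", "high"]
y_pred = ["high", "high", "low", "medium", "high"]
tp, fp, fn = counts_for_class(y_true, y_pred, positive="high")
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

Because the F1 score is the harmonic mean of precision and recall, it is high only when both are high.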

Availability of data and materials

Most of the datasets used and/or analyzed during the current study are publicly available online as open-source data at the websites listed below:

https://data.bloomington.in.gov/dataset

https://data.louisvilleky.gov/dataset

https://data.sonomacounty.ca.gov/Government

We also obtained data from Sun Cities 4 Paws Rescue, Inc., and the Rifle Animal Shelter. No administrative permission was required to access the raw data from these shelters.

Abbreviations

LR: Logistic Regression

ANN: Artificial Neural Network

GB: Gradient Boosting

GP: Goal Programming

Coefficient of Variation

References

Anderson KA, Brandt JC, Lord LK, Miles EA. Euthanasia in animal shelters: Management's perspective on staff reactions and support programs. Anthrozoös. 2013;26(4):569–78. https://doi.org/10.2752/175303713X13795775536057 .

Clevenger J, Kass PH. Determinants of adoption and euthanasia of shelter dogs spayed or neutered in the University of California veterinary student surgery program compared to other shelter dogs. J Vet Med Educ. 2003;30(4):372–8.

Animal Humane Society. (n.d.). Retrieved November 2019, from https://www.animalhumanesociety.org/ .

Home. (2016, July 15). Retrieved November 2019, from http://www.americanhumane.org/ .

Rogelberg SG, DiGiacomo N, Reeve CL, Spitzmüller C, Clark OL, Teeter L, et al. What shelters can do about euthanasia-related stress: an examination of recommendations from those on the front line. J Appl Anim Welf Sci. 2007;10(4):331–47. https://doi.org/10.1080/10888700701353865 .

Hennessy MB, Voith VL, Mazzei SJ, Buttram J, Miller DD, Linden F. Behavior and cortisol levels of dogs in a public animal shelter, and an exploration of the ability of these measures to predict problem behavior after adoption. Appl Anim Behav Sci. 2001;73(3):217–33.

Shore ER. Returning a recently adopted companion animal: Adopters' reasons for and reactions to the failed adoption experience. J Appl Anim Welf Sci. 2005;8(3):187–98.

Morris KN, Gies DL. Trends in intake and outcome Data for animal shelters in a large U.S. metropolitan area, 1989 to 2010. J Appl Anim Welf Sci. 2014;17(1):59–72. https://doi.org/10.1080/10888705.2014.856250 .

Fantuzzi JM, Miller KA, Weiss E. Factors relevant to adoption of cats in an animal shelter. J Appl Anim Welf Sci. 2010;13(2):174–9.

Brown WP, Morgan KT. Age, breed designation, coat color, and coat pattern influenced the length of stay of cats at a no-kill shelter. J Appl Anim Welf Sci. 2015;18(2):169–80.

Srinivas, S., & Rajendran, S. (2017). A Data-driven approach for multiobjective loan portfolio optimization using machine-learning algorithms and mathematical programming. In big Data analytics using multiple criteria decision-making models (pp. 175-210): CRC press.

Waller MA, Fawcett SE. Data science, predictive analytics, and big Data: a revolution that will transform supply chain design and management. J Bus Logist. 2013;34(2):77–84.

Kantardzic M. DATA MINING: concepts, models, methods, and algorithms. 2nd ed: IEEE: Wiley; 2019.

Jordan MI, Mitchell TM. Machine learning: trends, perspectives, and prospects. Science. 2015;349(6245):255–60.

Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I. Machine learning and Data mining methods in diabetes research. Computational Structural Biotechnol J. 2017;15:104–16. https://doi.org/10.1016/j.csbj.2016.12.005 .

Neely MN, van Guilder MG, Yamada WM, Schumitzky A, Jelliffe RW. Accurate detection of outliers and subpopulations with Pmetrics, a nonparametric and parametric pharmacometric modeling and simulation package for R. Ther Drug Monit. 2012;34(4):467–76. https://doi.org/10.1097/FTD.0b013e31825c4ba6 .

Bissantz N, Munk A, Scholz A. Parametric versus non-parametric modelling? Statistical evidence based on P-value curves. Mon Not R Astron Soc. 2003;340(4):1190–8. https://doi.org/10.1046/j.1365-8711.2003.06377.x .

Dietterich TG. Ensemble methods in machine learning. Berlin: Heidelberg; 2000.

Pandey M, Taruna S. A comparative study of ensemble methods for students' performance modeling. Int J Comput Appl. 2014;103:26–32. https://doi.org/10.5120/18095-9151 .

Reeve CL, Rogelberg SG, Spitzmüller C, Digiacomo N. The caring-killing paradox: euthanasia-related strain among animal-shelter workers. J Appl Soc Psychol. 2005;35(1):119–43. https://doi.org/10.1111/j.1559-1816.2005.tb02096.x .

Gates MC, Zito S, Thomas J, Dale A. Post-adoption problem Behaviours in adolescent and adult dogs rehomed through a New Zealand animal shelter. Animals : an open access journal from MDPI. 2018;8(6):93. https://doi.org/10.3390/ani8060093 .

Weiss E, Gramann S, Drain N, Dolan E, Slater M. Modification of the feline-Ality™ assessment and the ability to predict adopted Cats' behaviors in their new homes. Animals : an open access journal from MDPI. 2015;5(1):71–88. https://doi.org/10.3390/ani5010071 .

Normando S, Stefanini C, Meers L, Adamelli S, Coultis D, Bono G. Some factors influencing adoption of sheltered dogs. Anthrozoös. 2006;19(3):211–24.

Protopopova A, Mehrkam LR, Boggess MM, Wynne CDL. In-kennel behavior predicts length of stay in shelter dogs. PLoS One. 2014;9(12):e114319.

Wells DL, Graham L, Hepper PG. The influence of length of time in a rescue shelter on the behaviour of Kennelled dogs. Anim Welf. 2002;11(3):317–25.

Zhong G, Jiao W, Gao W, Huang K. Automatic design of deep networks with neural blocks. Cogn Comput. 2020;12(1):1–12.

Rajendran S, Ravindran AR. Multi-criteria approach for platelet inventory management in hospitals. Int J Operational ResS. 2020;38(1):49–69.

Bastian ND, McMurry P, Fulton LV, Griffin PM, Cui S, Hanson T, Srinivas S. The AMEDD uses goal programming to optimize workforce planning decisions. Interfaces. 2015;45(4):305–24.

Rajendran S, Ansaripour A, Kris Srinivasan M, Chandra MJ. Stochastic goal programming approach to determine the side effects to be labeled on pharmaceutical drugs. IISE Transactions on Healthcare Systems Engineering. 2019;9(1):83–94.

Cutler DR, Edwards TC Jr, Beard KH, Cutler A, Hess KT, Gibson J, Lawler JJ. Random forests for classification in ECOLOGY. Ecology. 2007;88(11):2783–92.

Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Ann Stat. 2000;28(2):337–407.

Ge Z, Song Z, Ding SX, Huang B. Data mining and analytics in the process industry: the role of machine learning. IEEE Access. 2017;5:20590–616.

Open Data: City of Austin Texas: Open Data: City of Austin Texas. (n.d.). Retrieved March 2019, from https://data.austintexas.gov//Health-and-Community-Services/Austin-Animal-Center-Outcomes/9t4d-g238 .

County of Sonoma: Open Data: Open Data. (n.d.). Retrieved March 2019, from https://data.sonomacounty.ca.gov/Government/Animal-Shelter-Intake-and-Outcome/924a-vesw .

Kambli A, Sinha AA, Srinivas S. Improving campus dining operations using capacity and queue management: a simulation-based case study. J Hosp Tour Manag. 2020;43:62–70.

Rajendran S, Zack J. Insights on strategic air taxi network infrastructure locations using an iterative constrained clustering approach. Transport Res Part E: Logistics and Transportation Review. 2019;128:470–505.

American Kennel Club. (n.d.). Retrieved November 2019, from http://www.akc.org/ .

Elk Antler Supplements & Chews: Wapiti Labs, Inc. (n.d.). Retrieved November 2019, from https://www.wapitilabsinc.com/ .

Bursac Z, Gauss CH, Williams DK, Hosmer DW. Purposeful selection of variables in logistic regression. Source Code for Biol Med. 2008;3(1):17.

Delen D, Walker G, Kadam A. Predicting breast cancer survivability: a comparison of three data mining methods. Artif Intell Med. 2005;34(2):113–27.

Kim A, Song Y, Kim M, Lee K, Cheon JH. Logistic regression model training based on the approximate homomorphic encryption. BMC Med Genet. 2018;11(4):83.

Google Scholar  

Srinivas S, Ravindran AR. Optimizing outpatient appointment system using machine learning algorithms and scheduling rules: a prescriptive analytics framework. Expert Syst Appl. 2018;102:245–61. https://doi.org/10.1016/j.eswa.2018.02.022 .

LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436.

Shih H, Rajendran S. Comparison of time series methods and machine learning algorithms for forecasting Taiwan blood Services Foundation’s blood supply. Journal of healthcare engineering. 2019;2019.

Srinivas S, Salah H. Consultation length and no-show prediction for improving appointment scheduling efficiency at a cardiology clinic: a data analytics approach. Int J Med Inform. 2020;145:104290.

Rokach L. Ensemble-based classifiers. Artif Intell Rev. 2010;33(1):1–39.

Srinivas S. A machine learning-based approach for predicting patient punctuality in ambulatory care centers. Int J Environ Res Public Health. 2020;17(10):3703.

Download references

Acknowledgments

We would like to thank the Sun Cities 4 Paws Rescue, Inc., and the Rifle Animal Shelter for providing the length of stay reports in order to complete this study.

This research was not funded by any agency/grant.

Author information

Authors and affiliations.

Department of Bioengineering, University of Missouri Columbia, Columbia, MO, 65211, USA

Janae Bradley

Department of Industrial and Manufacturing Systems Engineering, University of Missouri Columbia, Columbia, MO, 65211, USA

Suchithra Rajendran

Department of Marketing, University of Missouri Columbia, Columbia, MO, 65211, USA


Contributions

JB performed data mining, data cleaning and analyses of the animal shelter data and machine learning algorithms. JB was also a major contributor in writing the manuscript. SR performed data mining, cleaning, and analyses of the machine learning algorithms, as well as the goal programming. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Suchithra Rajendran.

Ethics declarations

Ethics approval and consent to participate.

Most of the datasets used in this study are open source and are publicly available. The remaining data was collected from animal shelters with their consent to use the data for research purposes.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Bradley, J., Rajendran, S. Increasing adoption rates at animal shelters: a two-phase approach to predict length of stay and optimal shelter allocation. BMC Vet Res 17, 70 (2021). https://doi.org/10.1186/s12917-020-02728-2

Received: 07 January 2020

Accepted: 22 December 2020

Published: 05 February 2021

DOI: https://doi.org/10.1186/s12917-020-02728-2


  • Animal shelter
  • High euthanization rates
  • Machine learning algorithms
  • Prediction models
  • Goal programming approach
  • Decision support tool

BMC Veterinary Research

ISSN: 1746-6148


Science explores the origins of the friendship between dogs and humans

Recent studies confirm dogs’ ability to understand us, their natural talent for empathizing with other species and the pleasure we get from sharing our lives with them.


It is often said that dogs are man’s best friend, but less often that they are the oldest. Dogs were the first domesticated animal in history. The two species stitched their evolutionary destinies together some 15,000 years ago, establishing a symbiotic relationship with few analogues in the animal world. It’s a rarity. Decades ago, archaeologists and zoologists posited that this relationship was born of utility but that, over the years, a fondness and understanding emerged between the two species that science is now trying to measure. In recent years, several studies have analyzed how this joint evolution has affected dogs and humans. In the past 20 years, the scientific literature on this subject has proliferated. And so has the conventional one. It is estimated that there are over 70,000 books about dogs on Amazon: a further sign that this prehistoric friendship is coming to the present in top form.

Onyoo Yoo has a beautiful four-year-old poodle named Aroma; they call her Aro at home. There were others before: Yoo has spent her whole life surrounded by dogs and knows from personal experience that these animals can bring joy and comfort, although she doesn’t quite understand the mechanisms that make this possible. Last year, Yoo took her dog to work to find out. She asked 30 volunteers to pet Aro, give her treats, walk her and play with her. Meanwhile, Yoo, who is a researcher at Konkuk University in South Korea, analyzed their brain activity.

“Our research found that participants’ alpha band brain waves [related to relaxation] increased while playing and walking with my dog. While beta band brain waves [associated with concentration] did so while grooming, massaging or playing with her,” Yoo explains. Recently published in the scientific journal PLOS One, the study confirms something many people feel: spending time with dogs is enormously pleasurable. But it does so in detail, “providing valuable information to elucidate the therapeutic effects and underlying mechanisms of animal-assisted interventions,” Yoo explains.

Pet ownership is known to help reduce stress levels, promote positive emotions and reduce the risk of cardiovascular disease. “However, research on the brain activity produced by human-animal interaction is incipient and insufficient,” says Yoo. This may be because, in order to understand it, one needs not only neurology and psychology but also paleobiology, in addition to taking a look back.

Initiating a friendship is not always easy, and the one forged between people and dogs did not start by petting a wolf or throwing a ball at it. Domestication was multifactorial and happened in fits and starts. An ambitious study published in Science in 2020 attempted to trace that process by sequencing 27 ancient dog genomes. In analyzing them, the authors discovered that dogs probably arose from a now-extinct population of wolves. They also distinguished at least five different dog populations, drawing a complex ancestral history. Different types of dogs expanded alongside different human groups, linking their destinies (and eventual disappearance) to the survival of the clan with which they were associated.

Aritza Villaluenga, a researcher at the University of the Basque Country UPV/EHU (Spain) and a co-author of the study, points out that the first (albeit disputed) evidence of coexistence between humans and wolves dates back to 25,000 years ago: “It was probably not a conscious domestication, they did not know what they were doing, they did not conceive of what the result was going to be. They simply had animals to help them hunt.” We have to jump forward 10,000 years to when dogs first appeared in a sustained way. “And here, yes, we can talk about dogs because genetically they’re different from the wolves living in the same area at the same time. There [were] physical and genetic changes,” Villaluenga explains.

Hunting allies

At that time, their coexistence was symbiotic. “The partnership suited [both] dogs and humans. The dogs pushed herds of animals to where the human hunters were hiding,” the expert notes. The former were much better at running and the latter at strategizing. They formed a good team when hunting and, once the game was caught, they both shared the spoils. Thus, from the beginning of the relationship, it was very important for both species to understand each other, to be able to read each other.

Researchers at Stockholm University (Sweden) conducted an experiment with wolf cubs and found that some specimens can understand human cues and comprehend their playful intentionality. They proved this with something as trivial as throwing them a ball and asking them to bring it back. In everyday life, this action seems simple because many people do it with their pets daily. But that task involves great cognitive complexity, and in a few seconds, it demonstrates the ability of two species to understand each other that has been forged over millennia. The study suggested that this ability, present in some very sociable specimens, may have led to their domestication. Associating with humans was an evolutionary success in all respects. Today, it is estimated that for every wolf, there are 3,000 dogs.

Some 5,000 generations after that prehistoric union, today’s dogs can understand many more human commands, gestures and words. Mariana Boros, a neuroethologist at Eötvös Loránd University in Budapest (Hungary), knows that well. She just published a study that analyzes how dogs can understand words. “The most important ability this animal has is the ability to understand human communication. They are exceptional,” the expert observes in a video call.

Boros and her team wanted to test whether this understanding was due to vocalization rather than context. So, they locked a dog in a room, announced that they were going to give him one object, say a ball, and then offered him another, for example, a stick. “We thought that if the dog understood what the word meant, it would have an expectation of what it would see next. And the violation of that expectation would be visible on the EEG,” Boros explains. And, indeed, it was. With this data, the team can be sure that dogs understand the word’s meaning. “In fact, the understanding mechanisms are very similar to what we see in humans,” Boros adds.

A man walks his dog in Barcelona in December 2023.

Love beyond understanding

Most scientific literature concludes that dogs have a special bond with humans for this reason. They understand us and communicate with us better than any other animal. But psychologist Clive Wynne of Arizona State University disagrees. In his book Dog is Love, he argues that dogs have a unique capacity for interspecies love. If you raise a dog with sheep, goats or cats (or even tigers or lions), it will end up joining them and getting attached to them, he explains, using examples. Something similar would have happened with humans. Wynne’s hypothesis is backed up by science. In 2015, Japanese scientists showed that the more people looked into their dogs’ eyes, the more they both increased their production of oxytocin, the key chemical ingredient of affection. It’s not that they understand the humans with whom they live; it is that they love them.

In any case, understanding is not the only aspect in which dogs have evolved to adapt to our tastes. According to several studies, they have also become more adorable and expressive. Charles Darwin was the first to realize that domestic animals — such as cats, dogs and rabbits — share certain physical traits. They tend to have droopier ears and curlier tails than their wild ancestors. Their teeth are smaller, and they have white patches on their fur. This phenomenon is known as domestication syndrome.

The most eloquent example of this process occurred at a Soviet fox farm in the 1950s. Geneticist Dmitri K. Belyaev wanted to create a domestic fox population by selecting and crossing the tamest specimens. The results were analyzed in a scientific study in 2009. By the fourth generation, the foxes were licking scientists and greeting them with tail wags. Their offspring were even more domesticated and could understand human signals and respond to gestures or looks. “They not only developed internal traits such as acceptance of human closeness. Physically, they became more pup-like, cuter. They changed to be more adorable to the human eye and presumably that same thing happened to dogs,” Boros explains.

The difference is that, with the foxes, this happened artificially and forcibly in just 50 years, while the domestication of the wolf into a dog was natural and presumably took much longer. This process was not born of man’s whim, as Villaluenga explained. Some Stone Age wolves showed a natural inclination to befriend the strange apes that spread throughout the world. They understood each other not only when hunting, but also when playing and being affectionate with each other. When they looked at each other, they both felt strangely good. These wolves got closer and closer to the humans and mingled with other wolves that were also hanging around human settlements. They decided to stay close, and it turned out to be for the best. According to this interpretation, shared by many specialists, the dog was not domesticated, but rather some wolves became self-domesticated and ended up becoming dogs. They chose us, at least as much as we chose them.


Co-sleeping with pet dogs — but not cats — linked to poorer sleep in study

A survey-based study finds that people who sleep in the same room as their dogs show worse sleep quality. Cats weren't linked to the same effect.


Sleeping with your dog in the same room could be negatively affecting your sleep quality, according to my team's recently published research in Scientific Reports.

We recruited a nationally representative sample of more than 1,500 American adults who completed questionnaires assessing their sleep habits. Overall, about half of the participants reported co-sleeping with pets — defined in our study as sleeping in the same room with your pet for at least part of the night.

Next, our research team compared the sleep habits of people who did and didn't co-sleep with pets. Our analyses revealed that participants who co-slept with pets had poorer sleep quality and more insomnia symptoms than those who did not. These findings persisted even after accounting for demographic differences between these groups. When considering pet type, we found evidence for a negative effect on sleep when co-sleeping with dogs but no evidence for a negative effect on sleep when co-sleeping with cats.

Surprisingly, 93% of people in our study who co-slept with their pets believed that their pets had either a positive or neutral overall effect on their sleep. Although more research is needed, these findings could suggest that most people are unaware of the potential negative effects their pets may have on their sleep.

Why it matters


Most pet owners report that their pets have a generally positive effect on their mental health. Pets can improve their owners' health in numerous ways during the day, such as by encouraging physical activity, promoting a daily routine and providing love and companionship.

However, our study fills an important knowledge gap by indicating that co-sleeping with pets can affect sleep quality. Good sleep is a pillar of health and wellness. Even though pets may have an overall positive effect on mental health, it is possible that some of this benefit may be undermined if they are also causing you to lose sleep at night.

Although some people report that co-sleeping with their pets can provide them with a sense of comfort or intimacy, it is important for people sharing a bedroom with their pets to be aware that pets can be a source of nighttime noise, heat or movement that disrupts the ability to fall or stay asleep.


What still isn't known


Survey-based studies like ours are unable to prove that co-sleeping with pets causes disrupted sleep, although there is some evidence suggesting that this could be the case.

One important factor that our study did not assess was whether participants were also co-sleeping with other people like a spouse or child. Previous research suggests that sharing a bed with other people can also affect our sleep and that the mental health benefits of pet ownership could be stronger for people with a romantic partner.

What's next

It probably isn’t realistic for most people to just stop co-sleeping with their pets. So, what should someone do to improve their sleep if they already share a bed with their pets?


Some expert tips include choosing a mattress that is large enough for you and your pets, washing and changing your bedding regularly, and establishing and maintaining a consistent bedtime routine with your pets. Further research is needed to identify more specific habits and routines that pet owners can adopt to ensure a good night's sleep when sharing the bedroom with their pets.

This edited article is republished from The Conversation under a Creative Commons license. Read the original article.

Brian N. Chin

I have been a tenure-track assistant professor of social/health psychology at Trinity College since 2022. Before starting this position, I earned my Ph.D. in Social and Health Psychology from Carnegie Mellon University in 2020 and then completed NIH-funded T32 postdoctoral training fellowships in cardiovascular behavioral medicine and translational sleep medicine at the University of Pittsburgh from 2020-2022.


  • Open access
  • Published: 25 March 2024

The evolutionary drivers and correlates of viral host jumps

  • Cedric C. S. Tan (ORCID: 0000-0003-3536-8465) 1,2
  • Lucy van Dorp (ORCID: 0000-0002-6211-2310) 1
  • Francois Balloux (ORCID: 0000-0003-1978-7715) 1

Nature Ecology & Evolution (2024)


  • Molecular evolution
  • Viral evolution

Most emerging and re-emerging infectious diseases stem from viruses that naturally circulate in non-human vertebrates. When these viruses cross over into humans, they can cause disease outbreaks, epidemics and pandemics. While zoonotic host jumps have been extensively studied from an ecological perspective, little attention has gone into characterizing the evolutionary drivers and correlates underlying these events. To address this gap, we harnessed the entirety of publicly available viral genomic data, employing a comprehensive suite of network and phylogenetic analyses to investigate the evolutionary mechanisms underpinning recent viral host jumps. Surprisingly, we find that humans are as much a source as a sink for viral spillover events, insofar as we infer more viral host jumps from humans to other animals than from animals to humans. Moreover, we demonstrate heightened evolution in viral lineages that involve putative host jumps. We further observe that the extent of adaptation associated with a host jump is lower for viruses with broader host ranges. Finally, we show that the genomic targets of natural selection associated with host jumps vary across different viral families, with either structural or auxiliary genes being the prime targets of selection. Collectively, our results illuminate some of the evolutionary drivers underlying viral host jumps that may contribute to mitigating viral threats across species boundaries.


The majority of emerging and re-emerging infectious diseases in humans are caused by viruses that have jumped from wild and domestic animal populations into humans (that is, zoonoses) [1]. Zoonotic viruses have caused countless disease outbreaks ranging from isolated cases to pandemics and have taken a major toll on human health throughout history. There is a pressing need to develop better approaches to pre-empt the emergence of viral infectious diseases and mitigate their effects. As such, there is an immense interest in understanding the correlates and mechanisms of zoonotic host jumps [1–10].

Most studies thus far have primarily investigated the ecological and phenotypic risk factors contributing to viral host range through the use of host–virus association databases constructed mainly on the basis of systematic literature reviews and online compendiums, including VIRION [11] and CLOVER [12]. For example, ‘generalist’ viruses that can infect a broader range of hosts have typically been shown to be associated with greater zoonotic potential [2,3,5]. In addition, factors such as increasing human population density [1], alterations in human-related land use [4], ability to replicate in the cytoplasm or being vector-borne [3] are positively associated with zoonotic risk. However, despite global efforts to understand how viral infectious diseases emerge as a result of host jumps, our current understanding remains insufficient to effectively predict, prevent and manage imminent and future infectious disease threats. This may partly stem from the lack of integration of genomics into these ecological and phenotypic analyses.

One challenge for predicting viral disease emergence is that only a small fraction of the viral diversity circulating in wild and domestic vertebrates has been characterized so far. Due to resource and logistical constraints, surveillance studies of novel pathogens in animals often have sparse geographical and/or temporal coverage [13,14] and focus on selected host and pathogen taxa. Further, many of these studies do not perform downstream characterization of the novel viruses recovered and may lack sensitivity due to the use of PCR pre-screening to prioritize samples for sequencing [15]. As such, our knowledge of which viruses can, or are likely to emerge and in which settings, is poor. In addition, while genomic analyses are important for investigating the drivers of viral host jumps [16], most studies do not incorporate genomic data into their analyses. Those that did have mostly focused on measures of host [2] or viral [3] diversity as predictors of zoonotic risk. As such, despite the limited characterization of global viral diversity thus far, existing genomic databases remain a rich, largely untapped resource to better understand the evolutionary processes surrounding viral host jumps.

Further, humans are just one node in a large and complex network of hosts in which viruses are endlessly exchanged, with viral zoonoses representing probably only rare outcomes of this wider ecological network. While research efforts have rightfully focused on zoonoses, viral host jumps between non-human animals remain relatively understudied. Another important process that has received less attention is human-to-animal (that is, anthroponotic) spillover, which may impede biodiversity conservation efforts and could also negatively impact food security. For example, human-sourced metapneumovirus has caused fatal respiratory outbreaks in captive chimpanzees [17]. Anthroponotic events may also lead to the establishment of wild animal reservoirs that may reseed infections in the human population, potentially following the acquisition of animal-specific adaptations that could increase the transmissibility or pathogenicity of a virus in humans [13]. Uncovering the broader evolutionary processes surrounding host jumps across vertebrate species may therefore enhance our ability to pre-empt and mitigate the effects of infectious diseases on both human and animal health.

A major challenge for understanding macroevolutionary processes through large-scale genomic analyses is the traditional reliance on physical and biological properties of viruses to define viral taxa, which is largely a vestige of the pre-genomic era [18]. As a result, taxon names may not always accurately reflect the evolutionary relatedness of viruses, precluding robust comparative analyses involving diverse viral taxa. Notably, the International Committee on Taxonomy of Viruses (ICTV) has been strongly advocating for taxon names to also reflect the evolutionary history of viruses [18,19]. However, the increasing use of metagenomic sequencing technologies has resulted in a large influx of newly discovered viruses that have not yet been incorporated into the ICTV taxonomy. Furthermore, it remains challenging to formally assess genetic relatedness through multiple sequence alignments of thousands of sequences comprising diverse viral taxa, particularly for those that experience a high frequency of recombination or reassortment.

In this study, we leverage the ~12 million viral sequences and associated host metadata hosted on NCBI to assess the current state of global viral genomic surveillance. We additionally analyse ~59,000 viral sequences isolated from various vertebrate hosts using a bespoke approach that is agnostic to viral taxonomy to understand the evolutionary processes surrounding host jumps. We ascertain overall trends in the directionality of viral host jumps between human and non-human vertebrates and quantify the amount of detectable adaptation associated with putative host jumps. Finally, we examine, for a subset of viruses, signatures of adaptive evolution detected in specific categories of viral proteins associated with facilitating or sustaining host jumps. Together, we provide a comprehensive assessment of potential genomic correlates underpinning host jumps in viruses across humans and other non-human vertebrates.

An incomplete picture of global vertebrate viral diversity

Global genomic surveillance of viruses from different hosts is key to preparing for emerging and re-emerging infectious diseases in humans and animals 13 , 16 . To identify the scope of viral genomic data collected thus far, we downloaded the metadata of all viral sequences hosted on NCBI Virus ( n  = 11,645,803; accessed 22 July 2023; Supplementary Data 1 ). Most (68%) of these sequences were associated with SARS-CoV-2, reflecting the intense sequencing efforts during the COVID-19 pandemic. In addition, of these sequences, 93.6%, 3.3%, 1.5%, 1.1% and 0.6% were of viruses with single-stranded (ss)RNA, double-stranded (ds)DNA, dsRNA, ssDNA and unspecified genome compositions, respectively. The dominance of ssRNA viruses is not entirely explained by the high number of SARS-CoV-2 genomes, as ssRNA viruses still represent 80% of all viral genomes if SARS-CoV-2 is discounted.

Vertebrate-associated viral sequences represent 93% of this dataset, of which 93% were human associated. The next four most-sequenced viruses are associated with domestic animals ( Sus , Gallus , Bos and Anas ) and, after excluding SARS-CoV-2, represent 15% of vertebrate viral sequences, while viruses isolated from the remaining vertebrate genera occupy a mere 9% (Fig. 1a and Extended Data Fig. 1a ), highlighting the human-centric nature of viral genomic surveillance. Further, only a limited number of non-human vertebrate families have at least ten associated viral genome sequences deposited (Fig. 1b ), reinforcing the fact that a substantial proportion of viral diversity in vertebrates remains uncharacterized. Viral sequences obtained from non-human vertebrates thus far also display a strong geographic bias, with most samples collected from the United States of America and China, whereas countries in Africa, Central Asia, South America and Eastern Europe are highly underrepresented (Fig. 1c ). This geographical bias varies among the four most-sequenced non-human host genera Sus , Gallus , Anas and Bos (Extended Data Fig. 1b ). Finally, the user-submitted host metadata associated with viral sequences, which is key to understanding global trends in the evolution and spread of viruses in wildlife, remains poor, with 45% and 37% of non-human viral sequences having no associated host information provided at the genus level, or sample collection year, respectively. The proportion of missing metadata also varies extensively between viral families and between countries (Extended Data Fig. 2 ). Overall, these results highlight the massive gaps in the genomic surveillance of viruses in wildlife globally and the need for more conscientious reporting of sample metadata.

figure 1

a , Proportion of non-SARS-CoV-2, vertebrate-associated viral sequences deposited in public sequence databases ( n  = 2,874,732), stratified by host. Viral sequences associated with humans and the next four most-sampled vertebrate hosts are shown. Sequences with no host metadata resolved at the genus level are denoted as ‘missing’. b , Proportion of host families represented by at least 10 associated viral sequences for the five major vertebrate host groups. c , Global heat map of sequencing effort, generated from all viral sequences deposited in public sequence databases that are not associated with human hosts ( n  = 1,599,672). d , Number of vertebrate viral species on NCBI Virus used for the genomic analyses in this study, stratified by viral family. The 32 vertebrate-associated viral families considered in this study are shown and the remaining 21 families that were not considered are denoted as ‘others’.

Humans give more viruses to animals than they do to us

To investigate the relative frequency of anthroponotic and zoonotic host jumps, we retrieved 58,657 quality-controlled viral genomes spanning 32 viral families, associated with 62 vertebrate host orders and representing 24% of all vertebrate viral species on NCBI Virus ( https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/ ) (Fig. 1d ). We found that the user-submitted species identifiers of these viral genomes are poorly ascribed, with only 37% of species names consistent with those in the ICTV viral taxonomy 20 . In addition, the genetic diversity represented by different viral species is highly variable since they are conventionally defined on the basis of the genetic, phenotypic and ecological attributes of viruses 18 . Thus, we implemented a species-agnostic approach based on network theory to define ‘viral cliques’ that represent discrete taxonomic units with similar degrees of genetic diversity, similar to the concept of operational taxonomic units 21 (Fig. 2a and Methods ). A similar approach was previously shown to effectively partition the genomic diversity of plasmids in a biologically relevant manner 22 . Using this approach, we identified 5,128 viral cliques across the 32 viral families that were highly concordant with ICTV-defined species (median adjusted Rand index, ARI = 83%; adjusted mutual information, AMI = 75%) and of which 95% were monophyletic (Fig. 2a ). Some clique assignments aggregated multiple viral species identifiers, while others disaggregated species into multiple cliques (Fig. 2b ; clique assignments for Coronaviridae illustrated in Extended Data Fig. 3 ). Despite the human-centric nature of genomic surveillance, viral cliques involving only animals represent 62% of all cliques, highlighting the extensive diversity of animal viruses in the global viral-sharing network (Extended Data Fig. 4a ).
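As an illustration of how concordance between clique assignments and species labels can be scored, the following is a minimal pure-Python sketch of the adjusted Rand index. It is not the authors' code (the Methods state that the 'ARI' function of the R package Aricode was used); the function name and the toy labels are assumptions made purely for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    # Contingency counts: how many items share each (label_a, label_b) pair.
    pair_counts = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    total_pairs = comb(n, 2)
    expected = sum_a * sum_b / total_pairs if total_pairs else 0.0
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings place everything in one group.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy example (invented): six genomes, species names versus clique IDs.
species = ["sp1", "sp1", "sp1", "sp2", "sp2", "sp3"]
cliques = [0, 0, 0, 1, 1, 1]  # sp2 and sp3 aggregated into one clique
ari = adjusted_rand_index(species, cliques)
```

In this toy case the score falls below 1 because one clique aggregates two species, which mirrors the aggregation and disaggregation behaviour described above.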

figure 2

a , Workflow for taxonomy-agnostic clique assignments. Briefly, the alignment-free Mash 53 distances between complete viral genomes in each viral family are computed and dense networks, in which nodes and edges represent viral genomes and the pairwise Mash distances, respectively, are constructed. From these networks, edges representing Mash distances >0.15 are removed to produce sparse networks, on which the community-detection algorithm, Infomap 54 , is applied to identify viral cliques. Concordance with the ICTV taxonomy was assessed using ARI and AMI. b , Sparse networks of representative viral cliques identified within the Coronaviridae (ssRNA), Picobirnaviridae (dsRNA), Genomoviridae (ssDNA) and Adenoviridae (dsDNA). Some viral clique assignments aggregated multiple viral species, while others disaggregated species into multiple cliques. Nodes, node shapes and edges represent individual genomes, their associated host and their pairwise Mash distances, respectively. The list of viral families considered in our analysis is shown on the bottom-left corner of each panel. Silhouettes were sourced from Flaticon.com and Adobe Stock Images ( https://stock.adobe.com ) with a standard licence.

We then identified putative host jumps within these viral cliques by producing curated whole-genome alignments to which we applied maximum-likelihood phylogenetic reconstruction. For segmented viruses, we instead used single-gene alignments as the high frequency of reassortment 23 precludes robust phylogenetic reconstruction using whole genomes. Phylogenetic trees were rooted with suitable outgroups identified using metrics of alignment-free distances (see Methods ). We subsequently reconstructed the host states of all ancestral nodes in each tree, allowing us to determine the most probable direction of a host jump for each viral sequence (approach illustrated in Fig. 3a ). To minimize the uncertainty in the ancestral reconstructions, we considered only host jumps where the likelihood of the ancestral host state was twofold higher than alternative host states (Fig. 3a and Supplementary Methods ). Varying the stringency of this likelihood threshold yielded highly consistent results (Extended Data Fig. 5a ), indicating that the inferred host jumps are robust to our choice of threshold. In total, we identified 12,676 viral lineages comprising 2,904 putative vertebrate host jumps across 174 of these viral cliques.
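The tip-to-root traversal and the twofold likelihood criterion described above can be sketched as follows. This is a simplified illustration and not the authors' implementation: the tree representation (a dictionary of nodes with 'parent', 'length' and 'hosts' fields), the function names and the toy likelihoods are all assumptions, and the handling of ambiguous ancestral nodes (returning no call) is one possible choice among several.

```python
def confident_state(likelihoods, fold=2.0):
    """Return the best-supported host if it is at least `fold` times more
    likely than every alternative state, otherwise None."""
    ranked = sorted(likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    best_host, best_lik = ranked[0]
    if len(ranked) > 1 and best_lik < fold * ranked[1][1]:
        return None
    return best_host


def find_host_jump(tip, tree, fold=2.0):
    """Walk from a tip towards the root and report the first host transition.

    `tree` maps node -> dict(parent=..., length=..., hosts={host: likelihood}).
    Returns (ancestral_host, tip_host, distance), where distance is the sum of
    branch lengths between the tip and the ancestral node, or None if no
    confident transition is found.
    """
    tip_host = confident_state(tree[tip]["hosts"], fold)
    distance = 0.0
    node = tip
    while tree[node]["parent"] is not None:
        distance += tree[node]["length"]
        node = tree[node]["parent"]
        hosts = tree[node]["hosts"]
        best = max(hosts, key=hosts.get)
        if best != tip_host:
            # Transition found: accept it only if the ancestral state is confident.
            if confident_state(hosts, fold) is None:
                return None
            return best, tip_host, distance
    return None


# Toy tree (invented values): root -> n1 -> tipA, and root -> tipB.
toy_tree = {
    "root": {"parent": None, "length": 0.0, "hosts": {"Homo": 0.9, "Sus": 0.1}},
    "n1": {"parent": "root", "length": 0.02, "hosts": {"Homo": 0.3, "Sus": 0.7}},
    "tipA": {"parent": "n1", "length": 0.01, "hosts": {"Sus": 1.0}},
    "tipB": {"parent": "root", "length": 0.03, "hosts": {"Homo": 1.0}},
}
jump_a = find_host_jump("tipA", toy_tree)
jump_b = find_host_jump("tipB", toy_tree)
```

Here tipA is assigned a putative human-to-Sus jump at the root, with a mutational distance equal to the two branch lengths on its path, whereas tipB shows no host transition.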

figure 3

a , Illustration of ancestral host reconstruction approach used to infer the directionality of putative host jumps. Putative host jumps are identified if the ancestral host state has a twofold higher likelihood than alternative host states. The mutational distance (substitutions per site) represents the sum of the branch lengths between the tip sequence and the ancestral node for which the first host state transition occurred in a tip-to-root traverse. b , Number of distinct putative host jumps involving humans across all viral families considered ( n  = 32). Black dots represent the observed point estimates for each type of host jump. The violin plots show the bootstrap distributions of these estimates, where the host jumps within each viral clique were resampled with replacement for 1,000 iterations. Black lines show the 95% confidence intervals associated with these bootstrap distributions. Silhouettes were sourced from Flaticon.com and Adobe Stock Images ( https://stock.adobe.com ) with a standard licence. A two-tailed paired t -test was performed to test for a difference in the zoonotic and anthroponotic bootstrap distributions.

Among the putative host jumps inferred to involve human hosts (599/2,904; 21%), we found a much higher frequency of anthroponotic compared with zoonotic host jumps (64% vs 36%, respectively; Fig. 3b ). This finding was statistically significant as assessed via a bootstrap paired t -test ( t  = 227, d.f. = 999, P  < 0.0001) and a permutation test ( P  = 0.035; see Methods ). In addition, this result was robust to our choice of likelihood thresholds used during ancestral reconstruction (Extended Data Fig. 5b ), the tree depth at which the host jump was identified (Extended Data Fig. 5c ), and to sampling bias ( Supplementary Notes and Fig. 1 ). The highest number of anthroponotic jumps was contributed by the cliques representing SARS-CoV-2 (132/383; 34%), MERS-CoV (39/383; 10%) and influenza A (37/383; 10%). This is concordant with the repeated independent anthroponotic spillovers into farmed, captive and wild animals described for SARS-CoV-2 (refs. 13 , 24 , 25 , 26 , 27 ) and influenza A 28 , 29 . Meanwhile, there has only been circumstantial evidence for human-to-camel transmission of MERS-CoV 30 , 31 , 32 . Noting the disproportionate number of anthroponotic jumps contributed by these viral cliques, we reperformed the analysis without them and found a significantly higher frequency of anthroponotic than zoonotic jumps (53.5% vs 46.5%; bootstrap paired t -test, t  = 40, d.f. = 999, P  < 0.0001), suggesting that our results are not driven solely by these cliques. Further, 16/21 of the viral families were involved in more anthroponotic than zoonotic jumps (Extended Data Fig. 5d ), indicating that this finding is generalizable across most viruses. Overall, our results highlight the high but largely underappreciated frequency of anthroponotic jumps among vertebrate viruses.
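The within-clique bootstrap used to compare anthroponotic and zoonotic counts can be sketched as below. This is a minimal illustration under stated assumptions: the input layout, the function names, the fixed random seed and the nearest-rank percentile interval are hypothetical choices, and the toy data are invented rather than taken from the study.

```python
import random


def bootstrap_jump_counts(jumps_by_clique, n_iter=1000, seed=0):
    """Resample host jumps with replacement within each clique.

    `jumps_by_clique` maps clique ID -> non-empty list of labels, each either
    "anthroponotic" or "zoonotic". Returns one (n_anthroponotic, n_zoonotic)
    tuple per bootstrap iteration.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_iter):
        anthro = zoo = 0
        for jumps in jumps_by_clique.values():
            resampled = rng.choices(jumps, k=len(jumps))
            anthro += resampled.count("anthroponotic")
            zoo += resampled.count("zoonotic")
        replicates.append((anthro, zoo))
    return replicates


def percentile_interval(values, lower=2.5, upper=97.5):
    """Nearest-rank percentile interval for a list of numbers."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[round(lower / 100 * last)], ordered[round(upper / 100 * last)]


# Toy input (invented, not the study's data).
toy_jumps = {
    "clique_1": ["anthroponotic"] * 6 + ["zoonotic"] * 2,
    "clique_2": ["anthroponotic"] * 3 + ["zoonotic"] * 4,
    "clique_3": ["zoonotic"],
}
reps = bootstrap_jump_counts(toy_jumps)
anthro_ci = percentile_interval([a for a, _ in reps])
```

Resampling within cliques keeps each clique's contribution to the total fixed, so the bootstrap distributions reflect uncertainty in the direction of jumps rather than in how many jumps each clique contributes.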

Host jumps of multihost viruses require fewer adaptations

Before jumping to a new host, a virus in its natural reservoir may fortuitously acquire pre-adaptive mutations that facilitate its transition to a new host. This may be followed by the further acquisition of adaptive mutations as the virus adapts to its new host environment 16 .

For each host jump inferred, we estimated the extent of both pre-jump and post-jump adaptations through the sum of branch lengths from the observed tip to the ancestral node where the host transition occurred (Fig. 3a ). However, in practice, the degree of adaptation inferred may vary on the basis of different factors, including sampling intensity and the time interval between when the host jump occurred and when the virus was isolated from its new host. As such, for each viral clique, we considered only the minimum mutational distance associated with a host jump.

We first examined whether the minimum mutational distance associated with a host jump for each viral clique was higher than the minimum for a random selection of viral lineages not involved in host jumps (Fig. 3a and Methods ). Indeed, the minimum mutational distance for a putative host jump within each clique was significantly higher than that for non-host jumps (Fig. 4a ; two-tailed Mann–Whitney U -test, U  = 6,767, P  < 0.0001). Noting that both sampling intensity and the different mutation rates of viral families may confound these results, we corrected for these confounders using a logistic regression model but found a similar effect (odds ratio, OR host jump  = 1.31; two-tailed Z -test for slope = 0, Z  = 6.58, d.f. = 289, P  < 0.0001).
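The per-clique summary and the rank-based comparison described above can be sketched as follows. This is an illustrative sketch only: the tuple layout, the function names and the toy distances are assumptions, the controls here are simply all non-jump lineages supplied rather than a random selection, and only the U statistic (not its P value) is computed.

```python
def min_distance_per_clique(lineages):
    """Return ({clique: min jump distance}, {clique: min control distance}).

    `lineages` is an iterable of (clique_id, is_host_jump, distance) tuples.
    """
    jump_min, control_min = {}, {}
    for clique, is_jump, dist in lineages:
        target = jump_min if is_jump else control_min
        target[clique] = min(dist, target.get(clique, dist))
    return jump_min, control_min


def mann_whitney_u(xs, ys):
    """U statistic for xs versus ys: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


# Toy lineages (invented values, in substitutions per site).
toy_lineages = [
    ("c1", True, 0.05), ("c1", True, 0.09), ("c1", False, 0.01),
    ("c2", True, 0.08), ("c2", False, 0.03), ("c2", False, 0.04),
    ("c3", True, 0.02), ("c3", False, 0.02),
]
jump_min, control_min = min_distance_per_clique(toy_lineages)
u_stat = mann_whitney_u(list(jump_min.values()), list(control_min.values()))
```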

figure 4

a , b , Distributions (Gaussian kernel densities and boxplots) of ( a ) minimum mutational distance and ( b ) minimum dN/dS for inferred host jump events and non-host jump controls on the logarithmic scale. Differences in distributions were assessed using two-sided Mann–Whitney U -tests. c , d , Scatterplots of the ( c ) minimum mutational distance and ( d ) minimum dN/dS for host jump and non-host jumps. Lines represent univariate linear regression smooths fitted on the data. We corrected for the effects of sequencing effort and viral family membership using Poisson regression models. The parameter estimates in these Poisson models and their statistical significance, as assessed using two-tailed Z -tests, after performing these corrections are annotated. For all panels, each data point represents the minimum distance or minimum dN/dS across all host jump or randomly selected non-host jump lineages in a single clique. Boxplot elements are defined as follows: centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range.

We then considered the commonly used measure of directional selection acting on genomes, the ratio of the number of non-synonymous mutations per non-synonymous site (dN) to the number of synonymous mutations per synonymous site (dS). Comparing the minimum dN/dS for host jumps within each clique, we observed that minimum dN/dS was also significantly higher for host jumps compared with non-host jumps (Fig. 4b ; OR host jump = 2.39; Z = 4.84, d.f. = 263, P < 0.0001). Finally, after correcting for viral clique membership, there were no significant differences in log-transformed mutational distance ( F (1,528) = 2.23, P = 0.136) or dN/dS estimates ( F (1,338) = 1.66, P = 0.198) between zoonotic and anthroponotic jumps, or between forward and reverse cross-species jumps (mutational distance: F (1,1588) = 0.538, P = 0.463; dN/dS: F (1,1168) = 0.0311, P = 0.860), indicating that there are no direction-specific biases in these measures of adaptation. Overall, these results are consistent with the hypothesized heightened selection following a change in host environment and additionally provide confidence in our ancestral-state reconstruction method for assigning host jump status.
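To make the dN/dS ratio concrete, the sketch below computes it from counts of differences and sites, applying a Jukes-Cantor correction to each proportion in the style of the Nei-Gojobori method. This section does not state how dN and dS were estimated in the study, so the estimator, the function names and the toy counts are all assumptions for illustration.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor correction of a proportion of differences `p`."""
    if p >= 0.75:
        raise ValueError("proportion too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """dN/dS from counts of differences and sites; None if dS is zero."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        return None  # ratio undefined without synonymous change
    return dn / ds


# Toy counts (invented): equal numbers of differences, but far more
# non-synonymous than synonymous sites, giving a ratio below 1.
ratio = dn_ds(nonsyn_diffs=10, nonsyn_sites=700, syn_diffs=10, syn_sites=300)
```

A ratio below 1 is conventionally read as purifying selection and a ratio above 1 as positive selection, which is why elevated minimum dN/dS in host jump lineages is interpreted as a signal of adaptation.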

However, the extent of adaptive change required for a viral host jump may vary. For instance, some zoonotic viruses may require minimal adaptation to infect new hosts while in other cases, more substantial genetic changes might be necessary for the virus to overcome barriers that prevent efficient infection or transmission in the new host. We therefore tested the hypothesis that the strength of selection associated with a host jump decreases for viruses that tend to have broader host ranges. To do so, we compared the minimum mutational distance between ancestral and observed host states to the number of host genera sampled for each viral clique. We found that the observed host range for each viral clique is positively associated with greater sequencing intensity (that is, the number of viral genomes in each clique; Pearson’s r = 0.486; two-tailed t -test for r = 0, t = 34.9, d.f. = 3,932, P < 0.0001), in line with the strong positive correlation between per-host viral diversity and surveillance effort reported in previous studies 2 , 3 , 8 . After correcting for both sequencing effort and viral family membership, we found that the mutational distance for host jumps tends to decrease with broader host ranges (Poisson regression, slope = −0.113; two-tailed Z -test for slope = 0, Z = −9.40, d.f. = 129, P < 0.0001). In contrast, the relationship between mutational distance and host range for viral lineages that have not experienced host jumps is only weakly positive (slope = 0.0843; Z = 7.16, d.f. = 127, P < 0.0001) (Fig. 4c ). Similarly, the minimum dN/dS for a host jump decreases more substantially for viral cliques with broader host ranges (slope = −0.427; Z = −9.18, d.f. = 116, P < 0.0001) than for non-host jump controls (slope = 0.143; Z = 3.08, d.f. = 116, P < 0.01) (Fig. 4d ). These trends in mutational distance and dN/dS were consistent when the same analysis was performed for ssDNA, dsDNA, +ssRNA and −ssRNA viruses separately (Extended Data Fig. 6 ). These results indicate that, on average, ‘generalist’ multihost viruses experience lower degrees of adaptation when jumping into new vertebrate hosts.

Host jump adaptations are gene and family specific

We next examined whether genes with different established functions displayed distinctive patterns of adaptive evolution linked to host jump events. Since gene function remains poorly characterized in the large and complex genomes of dsDNA viruses, we focused on the shorter ssRNA and ssDNA viral families. We selected for analysis the four non-segmented viral families with the greatest number of host jump lineages in our dataset: Coronaviridae (+ssRNA; n = 2,537), Rhabdoviridae (−ssRNA; n = 1,097), Paramyxoviridae (−ssRNA; n = 787) and Circoviridae (ssDNA; n = 695). For these viral families, we extracted all annotated protein-coding regions from their genomes and categorized them as being associated with cell entry (termed ‘entry’), viral replication (‘replication-associated’) or virion formation (‘structural’), and classified the remaining genes as ‘auxiliary’ genes.

For the Coronaviridae , Paramyxoviridae and Rhabdoviridae , the entry genes encode surface glycoproteins that could also be considered structural but were not categorized as such given their important role in mediating cell entry. The capsid gene of circoviruses, however, encodes the sole structural protein that is also the key mediator of cell entry and was therefore categorized as structural. To estimate putative signatures of adaptation in relation to lineages that have experienced host jumps for the different gene categories, we modelled the change in log10(dN/dS) in host jumps versus non-host jumps using a linear model, while correcting for the effects of clique membership (see Methods ). Contrary to our expectation that entry genes would generally be under the strongest adaptive pressures during a host jump, we found that the strength of adaptation signals for each gene category varied by family. Indeed, the strongest signals were observed for structural proteins in coronaviruses (effect = 0.375, two-tailed t -test for difference in parameter estimates, t = 4.35, d.f. = 10,121, P < 0.0001) and auxiliary proteins in paramyxoviruses (effect = 0.439, t = 2.15, d.f. = 4,225, P = 0.02) (Fig. 5 ). Meanwhile, no significant adaptive signals were observed in the entry genes of any family (minimum P = 0.3), except for the capsid gene in circoviruses (effect = 0.325, t = 2.68, d.f. = 1,367, P = 0.004) (Fig. 5 ). These findings suggest that selective pressures acting on viral genomes in relation to host jumps are likely to differ by gene function and viral family.

figure 5

The strength of adaptation signals in genes associated with host jump and non-host jump lineages were estimated using linear models for Coronaviridae ( n  = 10,129), Paramyxoviridae ( n  = 4,233), Rhabdoviridae ( n  = 3,321), and Circoviridae ( n  = 1,373). We modelled the effects of gene type and host jump status on log(dN/dS) while correcting for viral clique membership and, for each gene type, inferred the strength of adaptive signal (denoted ‘effect’) as the difference in parameter estimates for host jumps versus non-host jumps. Points and lines represent the parameter estimates and their standard errors, respectively. Differences in parameter estimates were tested against zero using a one-tailed t -test. Subpanels for each gene type were ordered from left to right with increasing effect estimates.

Given the lack of adaptive signals in the entry proteins, we further hypothesized that within each gene, adaptive changes are likely to be localized to regions of functional importance and/or that are under relatively stronger selective pressures exerted by host immunity. To test this, we focused on the spike gene (entry) of viral cliques within the Coronaviridae since the key region involved in viral entry is well characterized (that is, the receptor-binding domain (RBD)) 33 . We found that dN/dS estimates consistent with adaptive evolution were indeed localized to the RBDs, but also to the N-terminal domains (NTD), of SARS-CoV-2 (genus Betacoronavirus ), avian infectious bronchitis virus (IBV; Gammacoronavirus ) and MERS-CoV (genus Betacoronavirus ) (Extended Data Fig. 7 ). This is consistent with the strong immune pressures exerted on these regions of the spike protein 34 , 35 and the central role of the RBD in host-cell recognition and entry 36 , 37 , 38 . Overall, our results indicate that the extent of adaptation associated with a host jump likely varies by gene function, gene region and viral family.

The post-genomic era has opened opportunities to advance our understanding of the diversity of viruses in circulation and the macroevolutionary principles of viral host range. Leveraging ~59,000 publicly available viral sequences isolated from vertebrate hosts, we inferred that humans give more viruses to other vertebrates than they give to us across the 32 viral families we considered. We further demonstrated that host jumps are associated with heightened signals of adaptive evolution that tend to decrease in viruses with broader host ranges. This indicates that there may be a minimum mutational threshold necessary for viruses to expand their host range. Finally, we showed that adaptive evolution linked to host jumps may vary by gene function and may be localized to specific gene regions of functional importance.

To bypass the limitations of existing viral taxonomies, we used a taxonomy-agnostic approach to define roughly equivalent units of viral diversity, which formed the basis for most of the analyses presented in this study. The use of operational taxonomic units rather than traditional taxonomic species names further allowed us to perform like-for-like analyses across the entire diversity of viruses. Our approach identified cliques that were largely concordant with traditional viral species nomenclature but also highlighted inconsistencies, where in some cases, single viral species appear to form distinct taxonomic groups while other groups of species seem to form a single group solely based on their genetic relatedness (Fig. 2 and Extended Data Fig. 3 ). However, we do not claim that our approach should supersede existing taxonomic classification systems, especially since a robust and meaningful species definition requires the integration of viral properties with finer-scale evolutionary analyses that was not necessary for our purposes. Nevertheless, we anticipate that the development and use of similar network-based approaches may pave the way for the development of efficient classification frameworks that can rapidly incorporate novel, metagenomically derived viruses into existing taxonomies.

Harnessing cliques as a mechanism of identifying clusters of related viruses for phylogenetic inspection allowed us to quantify the number and sources of recent host jump events. One important caveat to this approach is that the viral cliques involved in putative host jumps represent only a fraction of the viral diversity sequenced thus far (Extended Data Fig. 4b ) and the patterns we observed may change as more viruses are discovered. However, we consistently found higher frequencies of anthroponotic than zoonotic jumps across 16 of the 21 viral families (Extended Data Fig. 5d ). Since each of these families is associated with varying viral discovery effort, the consistency of this pattern makes it highly unlikely that surveillance biases are driving the excess of anthroponotic jumps we inferred. Another caveat is that our clique assignment approach clusters viruses within ~15% sequence divergence, which limits our analyses to relatively recent host jump events. However, the limited divergence of the sequences within each clique also allowed us to produce more robust alignments and hence evolutionary inferences.

Of the 599 recent host jumps identified, 64% were inferred as anthroponotic (Fig. 3b ). While the relative importance of anthroponotic versus zoonotic events has been the subject of speculation 13 , 29 , 39 , 40 , we provide a formal evaluation of the zoonotic-to-anthroponotic ratio in vertebrates, showing that anthroponoses are as critical to consider as zoonoses, if not more so, when assessing viral spillover dynamics. It stands to reason that the substantial global human population size and ubiquitous spatial distribution position us as a major source for viral exchange. However, it is also likely that behavioural factors might amplify the risk of anthroponotic transmission, for example, through changes in land use, agricultural methods or heightened interactions between humans and wildlife 4 . Overall, our results highlight the importance of surveying and monitoring human-to-animal transmission of viruses, and its impacts on human and animal health.

We observed heightened evolution and adaptive signals in association with host jumps (Fig. 4 ). This result is largely intuitive, since a virus jumping into a new host is likely to be under different selective pressures exerted directly by the novel host environment and indirectly by changes in host-to-host transmission dynamics. The evolutionary signals we captured may include pre-requisite adaptations that enable a virus to infect the new host. In addition, they probably also represent the burst of adaptive mutations which may be acquired following a host jump, which has been demonstrated for multiple viral systems 24 , 41 , 42 , 43 . Further, these signals could potentially reflect a relaxation of previous selective pressures no longer present in the novel host. We note that these signals of heightened evolution could also, in principle, be inflated by sampling bias, where two viruses circulating in the same host are more often drawn from the same population. However, this was largely controlled for in our analysis through comparisons to representative non-host jump lineages that are expected to be affected by the same sampling bias.

We observed lower mutational and adaptive signals associated with host jumps for viruses that infect a broader range of hosts (Fig. 4c,d ). The most likely explanation for this pattern is that some viruses are intrinsically more capable of infecting a diverse range of hosts, possibly by exploiting host-cell machinery that is conserved across different hosts. For example, sarbecoviruses (the subgenus comprising SARS-CoV-2) target the ACE2 host-cell receptor, which is conserved across vertebrates 44 , 45 , and the high structural conservation of the sarbecovirus spike protein 15 may explain the observation that single mutations can enable sarbecoviruses to expand their host tropism 46 . In other words, multihost viruses may have evolved to target more conserved host machinery that reduces the mutational barrier for them to productively infect new hosts. This may provide a mechanistic explanation for previous observations that viruses with broad host range have a higher risk of emerging as zoonotic diseases 2 , 3 , 5 .

Our approach to identifying putative host jumps hinges on ancestral-state reconstruction (Fig. 3a ), which has been shown to be affected by sampling biases 47 , 48 . However, we accounted for this, at least in part, by including sequencing effort as a measure of sampling bias in our statistical models, allowing us to draw inferences that were robust to disproportionate sampling of viruses in different hosts. Our approach also does not consider the epidemiology or ecology of viral transmission, as this is largely dependent on host features such as population size, social structure and behaviour for which comprehensive datasets at this scale are not currently available. We anticipate that future datasets that integrate ecology, epidemiology and genomics may allow more granular investigations of these patterns in specific host and viral systems. In addition, the patterns we described are broad and do not capture the idiosyncrasies of individual host–pathogen associations. These include a variety of biological features: intrinsic ones, such as the molecular adaptations required for receptor binding, as well as more complex ones including cross-immunity and interference with other viral pathogens circulating in a host population.

Overall, our work highlights the large scope of genomic data in the public domain and its utility in exploring the evolutionary mechanisms of viral host jumps. However, the large gaps in the genomic surveillance of viruses thus far suggest that we have only just scratched the surface of the true viral diversity in nature. In addition, despite the strong anthropocentric bias in viral surveillance, 81% of the putative host jumps identified in this study do not involve humans, emphasizing the large underappreciated scale of the global viral-sharing network (Extended Data Fig. 8 ). Widening our field of view beyond zoonoses and investigating the flow of viruses within this larger network could yield valuable insights that may help us better prepare for and manage infectious disease emergence at the human–animal interface.

Data acquisition, curation and quality control

The metadata of all partial and complete viral genomes were downloaded from NCBI Virus ( https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/ ) on 22 July 2023, with filters excluding sequences isolated from environmental sources, lab hosts, or associated with vaccine strains or proviruses ( n  = 11,645,803). Where possible, host taxa names in the metadata were resolved in accordance with the NCBI taxonomy 49 using the ‘taxizedb’ v.0.3.1 package in R. User-submitted viral species names were compared to the ICTV master species list version ‘MSL38.V2’ dated 6 July 2023.

To generate a candidate list of viral sequences for further genomic analysis, the metadata were filtered to include 53 viral families known to infect vertebrate hosts on the basis of information provided in the 2022 release of the ICTV taxonomy ( https://ictv.global/taxonomy ) 50 and with reference to that provided by ViralZone ( https://viralzone.expasy.org/ ) 51 . We then retained only sequences from viral families comprising at least 100 sequences of greater than 1,000 nt in length. Since the sequences of segmented viral families are rarely deposited as whole genomes and since the high frequency of reassortment 23 precludes robust phylogenetic reconstruction, we identified sequences for single genes conserved within each of these families for further analysis ( Arenaviridae : L segment; Birnaviridae : ORF1/RdRP/VP1/Segment B; Peribunyaviridae : L segment; Orthomyxoviridae : PB1; Picobirnaviridae : RdRP; Sedoreoviridae : VP1/Segment 1/RdRP; Spinareoviridae : Segment 1/RdRP/Lambda 3). These sequences were retrieved by applying text-based pattern matching (that is, ‘grepl’ in R) to query the GenBank sequence titles. For non-segmented genomes, we retained all non-human-associated sequences and subsampled the human-associated sequences as follows: we selected a random subsample of 1,000 SARS-CoV-2 genomes of greater than 28,000 nt from distinct countries, isolation sources and with distinct collection dates. For influenza B, we retained only human sequences with distinct country of origins, sample types and collection dates, and hosts of isolation. For other human-associated sequences, we retained viruses with distinct species, country, isolation source and collection date information. We then downloaded the final candidate list of viral sequences ( n  = 92,973) using ‘ncbi-acc-download’ v.0.2.8 ( https://github.com/kblin/ncbi-acc-download ). 
Further quality control of the genomes downloaded was performed using ‘CheckV’ (v.1.0.1) 52 , retaining sequences with more than 95% completeness (for non-segmented viruses) and less than 5% contamination (for all sequences). This resulted in a final genomic dataset comprising 58,657 observations (Supplementary Table 1 ) composed of gene sequences for segmented viruses and complete genomes for non-segmented viruses. For simplicity, we will henceforth refer to the gene sequences and complete genomes as ‘genomes’.

Taxonomy-agnostic identification of viral cliques

To identify viral cliques, we calculated the pairwise alignment-free Mash distances of genomes within each viral family via ‘Mash’ (v.1.1) 53 with a k-mer size of 13. This k-mer size ensures that the probability of observing a k-mer by chance, given the median genome length for each clique, is less than 0.01. Given a genome length, l, alphabet, Σ = {A, T, G, C}, and the desired probability of observing a k-mer by chance, q = 0.01, this was computed using the formula described previously 53:

$$k = \left\lceil \log_{|\Sigma|}\left(\frac{l(1-q)}{q}\right) \right\rceil$$
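
As a rough check on the choice of k, the following Python sketch evaluates the k-mer size formula from the Mash paper (ref. 53) for a given genome length. The function names and the example genome length of 30,000 nt are illustrative assumptions, not values or code from this study.

```python
import math


def min_kmer_size(genome_length: int, q: float = 0.01, alphabet_size: int = 4) -> int:
    """Smallest k from the Mash formula k = ceil(log_|Sigma|(l * (1 - q) / q))."""
    return math.ceil(math.log(genome_length * (1 - q) / q, alphabet_size))


def chance_probability(genome_length: int, k: int, alphabet_size: int = 4) -> float:
    """Probability that a given k-mer appears by chance in a random genome
    of this length: 1 - (1 - |Sigma|^-k)^l."""
    return 1 - (1 - alphabet_size ** -k) ** genome_length


# Hypothetical example: a 30,000-nt genome, roughly coronavirus-sized.
length = 30_000
print(min_kmer_size(length))           # minimum k for q = 0.01
print(chance_probability(length, 13))  # chance probability at the k used here
```

Under these assumptions, any k at or above the returned minimum keeps the chance probability below q, so a single k of 13 can cover genomes of quite different lengths.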

We then constructed undirected graphs for each viral family with nodes and edges representing genomes and Mash distances, respectively. From these networks, we removed edges with Mash distance values greater than a certain threshold, t, before we applied the community-detection algorithm, Infomap 54. This community-detection algorithm performs well in both large (>1,000 nodes) and small (≤1,000 nodes) undirected graphs 55 and seeks to identify subgraphs within these undirected graphs that minimize the information required to constrain the movement of a random walker 54. We refer to the subgraphs identified through this algorithm as ‘viral cliques’. Removing edges with Mash distance values greater than t forces the algorithm to identify taxonomically relevant cliques, because it yields sparser graphs in which closely related genomes (for example, from the same species) are more densely connected than more distantly related genomes (for example, from different species). The value of t was selected by maximizing the proportion of monophyletic cliques identified and the concordance of the viral cliques identified with the viral species names from the NCBI taxonomy, based on two commonly used clustering performance metrics, the adjusted mutual information (AMI) and the adjusted Rand index (ARI) (Supplementary Fig. 2). These metrics were computed using the ‘AMI’ and ‘ARI’ functions in ‘Aricode’ v.1.0.2. To assess whether the viral cliques identified fulfil the species definition criterion of being monophyletic 18, we reconstructed the phylogenies of each viral family by applying the neighbour-joining algorithm 56 implemented in the ‘Ape’ v.5.7.1 R package on their pairwise Mash distance matrices. We then computed the proportion of monophyletic viral cliques using the ‘is.monophyletic’ function in Ape v.5.7.1 across the various values of t. Given the discordance between the NCBI and ICTV taxonomies, we also applied the above optimization protocol to t using the viral species names in the ICTV taxonomy.
Using the NCBI viral species names, t = 0.15 maximized both the median AMI and ARI across all families (Supplementary Fig. 2a), with 94.3% of the cliques identified being monophyletic (Supplementary Fig. 2b). Using the ICTV viral species names, t = 0.2 and t = 0.25 maximized the median AMI and median ARI across families, respectively (Supplementary Fig. 2c), with 93.7% and 87.8% of the cliques being monophyletic (Supplementary Fig. 2b). Since t = 0.15 produced the highest proportion of monophyletic cliques that were approximately concordant with existing viral taxonomies, we used this threshold to generate the final viral clique assignments for downstream analyses (Supplementary Table 1).
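
The effect of the distance threshold t on the graph can be illustrated with a small Python sketch. Note the substitution: Infomap is not implemented here. Connected components of the thresholded graph serve as a much simpler stand-in, so this only shows how removing edges with distance greater than t separates groups of genomes. The genome labels and distances are invented for illustration.

```python
def threshold_clusters(distances: dict[tuple[str, str], float], t: float) -> list[set[str]]:
    """Drop edges with distance > t, then return the connected components.

    Connected components are a stand-in for Infomap community detection.
    """
    nodes = sorted({name for pair in distances for name in pair})
    parent = {name: name for name in nodes}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in distances.items():
        if dist <= t:  # keep only edges at or below the threshold
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in nodes:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda group: sorted(group))


# Hypothetical pairwise Mash distances between four genomes.
example = {
    ("g1", "g2"): 0.05, ("g1", "g3"): 0.12, ("g2", "g3"): 0.10,
    ("g1", "g4"): 0.35, ("g2", "g4"): 0.33, ("g3", "g4"): 0.30,
}
print(threshold_clusters(example, t=0.15))
```

A lower t yields more and smaller groups, and a higher t merges them, which is why t was tuned against monophyly and taxonomic concordance.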

Identification of putative host jumps

We retrieved all viral cliques that were associated with at least two distinct host genera and comprised at least 10 genomes ( n  = 215). We then generated clique-level genome alignments using the ‘FFT-NS-2’ algorithm in ‘MAFFT’ (v.7.490) 57 , 58 . We masked regions of the alignments that were poorly aligned or prone to sequencing error by replacing alignment sites that had more than 10% of gaps or ambiguous nucleotides with Ns. Clique-level genome alignments that had more than 20% of the median genome length masked were considered to be poorly aligned and thus removed from further analysis ( n  = 6; Supplementary Fig. 3 ). Following this procedure, we reconstructed maximum-likelihood phylogenies for each viral clique with ‘IQ-Tree’ (v.2.1.4-beta) 59 , using 1,000 ultrafast bootstrap (UFBoot) 60 replicates. The optimal substitution model for each tree was automatically determined using the ‘ModelFinder’ 61 utility native to IQ-Tree. To estimate the root position for each clique tree, we reconstructed neighbour-joining Mash trees for each viral clique, including 10 additional genomes whose minimum pairwise Mash distance to the genomes in each tree was 0.3–0.5, as potential outgroups. The most basal tips in these neighbour-joining Mash trees were identified and used to root the maximum-likelihood clique trees. This approach, as opposed to using maximum-likelihood phylogenetic reconstruction involving the outgroups, was used as it is difficult to reliably align clique sequences with highly divergent outgroups.
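
The masking step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: sequences are equal-length aligned strings, only A, C, G and T count as unambiguous, and the function names are hypothetical rather than taken from the study's code.

```python
def mask_alignment(seqs: list[str], max_bad_frac: float = 0.10) -> list[str]:
    """Replace every alignment column in which more than max_bad_frac of
    the sequences carry a gap or ambiguous nucleotide with 'N'."""
    unambiguous = set("ACGT")
    n_seqs = len(seqs)
    bad_cols = {
        i for i in range(len(seqs[0]))
        if sum(s[i].upper() not in unambiguous for s in seqs) / n_seqs > max_bad_frac
    }
    return ["".join("N" if i in bad_cols else c for i, c in enumerate(s)) for s in seqs]


def masked_fraction(masked: list[str], median_genome_length: float) -> float:
    """Number of fully masked columns relative to the median genome length."""
    n_masked = sum(all(s[i] == "N" for s in masked) for i in range(len(masked[0])))
    return n_masked / median_genome_length


# Toy alignment: column 2 has an ambiguous base and column 4 has a gap.
alignment = ["ACGT-", "ACGTA", "ACNTA", "ACGTA"]
masked = mask_alignment(alignment)
print(masked)
print(masked_fraction(masked, median_genome_length=5))
```

An alignment would then be dropped when the masked fraction exceeds 0.2, matching the 20% criterion stated above.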

To identify putative host jumps, we performed ancestral-state reconstruction on the resultant rooted maximum-likelihood phylogenies with host as a discrete trait using the ‘ace’ function in Ape v.5.7.1. Traversing from a tip to the root node, a putative host jump is identified if the reconstructed host state of an ancestral node differs from the observed host state of the sampled tip and has a twofold greater likelihood compared with alternative states. Where the tip and ancestral host states were of different taxonomic ranks, we excluded putative host jumps where the ancestral host state is nested within the tip host state, or vice versa (for example, ‘Homo’ and ‘Hominidae’). Missing host metadata were encoded as ‘unknown’ and included in the ancestral-state reconstruction analysis. Host jumps involving unknown or non-vertebrate host states were excluded from further analysis. Separately, we extracted non-host jump lineages to control for any biases in our analysis approach. To do so, for each viral genome not involved in any putative host jump, we randomly selected an ancestral node whose reconstructed host state is the same as the observed tip state and has a twofold greater likelihood than alternative host states. For the mutational distance and dN/dS analyses, we retained only viral cliques where non-host jump lineages could be identified. An analysis exploring the robustness of this host jump inference approach to sampling biases (Supplementary Fig. 1) and a more detailed description of the inference algorithm (Supplementary Fig. 4) are provided in Supplementary Information.
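
The tip-to-root traversal can be sketched in Python as below. This is an illustration under assumptions, not the authors' implementation. The tree is given as a hypothetical child-to-parent mapping, reconstructed state likelihoods are supplied per internal node, the search stops at the first qualifying node, and nodes without a confidently reconstructed state are skipped. The taxonomic nesting filter and the handling of ‘unknown’ hosts are not covered.

```python
def find_host_jump(tip, tip_host, parent, node_likelihoods, fold=2.0):
    """Walk from a tip towards the root and return the first ancestral node
    whose best-supported host differs from the tip host and is at least
    `fold` times more likely than every alternative state.

    Returns (node, ancestral_host, tip_host), or None if no jump is found.
    """
    node = parent.get(tip)
    while node is not None:
        liks = node_likelihoods[node]
        best = max(liks, key=liks.get)
        alternatives = [v for host, v in liks.items() if host != best]
        confident = all(liks[best] >= fold * v for v in alternatives)
        if confident and best != tip_host:
            return (node, best, tip_host)
        node = parent.get(node)  # the root has no parent, which ends the walk
    return None


# Toy tree: n1 is the root, n2 its child; tipA and tipB descend from n2.
parent = {"tipA": "n2", "tipB": "n2", "n2": "n1", "tipC": "n1"}
node_likelihoods = {
    "n2": {"Homo": 0.90, "Sus": 0.10},
    "n1": {"Homo": 0.55, "Sus": 0.45},
}
print(find_host_jump("tipB", "Sus", parent, node_likelihoods))
print(find_host_jump("tipA", "Homo", parent, node_likelihoods))
```

In this toy example, tipB (host Sus) descends from a node confidently reconstructed as Homo, so a Homo-to-Sus jump is reported, whereas tipA yields no jump.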

Implementation of this algorithm yielded a list of all viral lineages involving a host jump (Supplementary Table 2 ). Since multiple lineages may involve a host transition at the same ancestral node, we calculated the number of unique host jump events as the number of distinct nodes for each unique host pair. For example, the three lineages Node1 (host A)→Tip1 (host B), Node1 (host A)→Tip2 (host B) and Node1 (host A)→Tip3 (host C) would be considered as two distinct host jump events, one between hosts A and B and the other between hosts A and C. This counting approach was used for Fig. 3a and Extended Data Fig. 5 . The list of all 2,904 distinct host jumps is provided in Supplementary Table 3 .
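
The counting rule in the example above can be written as a short Python sketch. The tuple layout and function name are assumptions made for illustration.

```python
def count_host_jump_events(lineages):
    """Count distinct host jump events as unique combinations of
    (ancestral node, ancestral host, tip host) across lineages.

    Each lineage is a tuple (ancestral_node, ancestral_host, tip, tip_host).
    """
    return len({(node, anc_host, tip_host) for node, anc_host, _tip, tip_host in lineages})


# The three lineages from the worked example in the text.
lineages = [
    ("Node1", "A", "Tip1", "B"),
    ("Node1", "A", "Tip2", "B"),
    ("Node1", "A", "Tip3", "C"),
]
print(count_host_jump_events(lineages))
```

The two lineages ending in host B collapse into a single event because they share the same ancestral node and host pair.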

Calculating mutational distances and dN/dS

Mutational distance and dN/dS estimates may be lineage specific and may depend on sampling intensity. In addition, there is a nonlinear relationship between dN/dS and branch length, that is, the estimated dN/dS decreases with increasing evolutionary distance 62 . Therefore, we opted to compare the minimum adaptive signal (that is, minimum dN/dS) associated with a host jump for each clique. For host jump lineages, mutational distances were calculated as the sum of the branch lengths between the tip sequence and the ancestral node for which the first host state transition occurred (in substitutions per site) using the ‘get_pairwise_distances’ function in the ‘Castor’ (v.1.7.10) 63 R package; this was then multiplied by the alignment length to obtain the estimated number of substitutions (Fig. 3a ). To calculate the dN/dS estimates, we reconstructed the ancestral sequences of ancestral nodes using the ‘-asr’ flag in IQ-Tree, which is based on an empirical Bayesian algorithm ( http://www.iqtree.org/doc/Command-Reference ). We then extracted coding regions from the clique-level masked alignments based on the user-submitted gene annotations on NCBI GenBank (in ‘gff’ format) of each viral genome. We then computed the dN/dS estimates using the method of ref. 64 implemented in the ‘dnastring2kaks’ function of the ‘MSA2dist’ v.1.4.0 R package ( https://github.com/kullrich/MSA2dist ). We calculated the minimum mutational distance and dN/dS across all host jump events in each clique for our downstream statistical analyses, which, in principle, represents the minimum evolutionary signal associated with a host jump in each viral clique. For non-host jump lineages, we similarly computed the minimum mutational distance and dN/dS across the randomly selected lineages. Estimates where dN = 0 or dS = 0 were removed. The list of all minimum mutational distance and minimum dN/dS estimates is provided in Supplementary Tables 4 and 5 , respectively. 
The dN/dS estimates for the analysis shown in Fig. 5 are provided in Supplementary Table 6 .
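
The mutational distance calculation reduces to summing branch lengths along the path from a tip to the relevant ancestral node and scaling by the alignment length. The Python sketch below shows that arithmetic with a hypothetical child-to-parent tree representation. It does not reproduce the ‘get_pairwise_distances’ function in Castor, and all names and values are illustrative.

```python
def mutational_distance(tip, ancestor, parent, branch_length, alignment_length):
    """Sum branch lengths (substitutions per site) from `tip` up to `ancestor`
    and multiply by the alignment length to estimate the number of substitutions.

    branch_length[node] is the length of the branch leading to `node`.
    """
    total, node = 0.0, tip
    while node != ancestor:
        total += branch_length[node]
        node = parent[node]
    return total * alignment_length


def min_per_clique(values_by_clique):
    """Minimum value across lineages in each clique."""
    return {clique: min(values) for clique, values in values_by_clique.items()}


# Toy lineage: tip -> n2 -> n1, with a 10,000-site alignment.
parent = {"tip": "n2", "n2": "n1"}
branch_length = {"tip": 0.001, "n2": 0.002}
print(mutational_distance("tip", "n1", parent, branch_length, 10_000))
print(min_per_clique({"clique_1": [30.0, 12.5], "clique_2": [4.0]}))
```

Taking the minimum per clique mirrors the choice described above to compare the minimum signal associated with a host jump in each clique.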

For the coronavirus spike gene analysis (Extended Data Fig. 7 ), spike sequences were extracted from the clique-level multiple sequence alignments, with gaps trimmed to the reference sequences (avian infectious bronchitis virus, EU714028.1; SARS-CoV-2, MN908947.3; MERS, JX869059.2). The genomic coordinates for the functional domains of the spike proteins were derived from previous studies 33 , 37 , 65 . Estimates where dN = 0 or dS = 0 were removed. The dN/dS estimates are provided in Supplementary Table 7 .

Statistical analyses

All statistical analyses were performed using the ‘stats’ package native to R v.4.3.1. To generate the bootstrapped distributions shown in Fig. 3b , we randomly resampled the host jumps within each clique with replacement (1,000 iterations) and performed two-tailed paired t -tests using the ‘t.test’ function. Mann–Whitney U -tests, analysis of variance (ANOVA), linear regressions, and Poisson and logistic regressions were implemented using ‘wilcox.test’, ‘anova’, ‘lm’ and ‘glm’ functions, respectively.

A permutation test was performed to assess whether the higher proportion of anthroponotic versus zoonotic jumps was statistically significant. We randomly permuted the host states in each clique for 500 iterations while preserving the number of host-jump and non-host-jump lineages (illustrated in Supplementary Fig. 5). The P value was calculated as the proportion of iterations in which the permuted anthroponotic/zoonotic ratio was greater than or equal to the observed ratio.

To assess the relationship between host range and adaptive signals (Fig. 4), we used Poisson regressions to model the expected number of host genera observed in each viral clique, λ_host range. We corrected for the number of genomes in each clique, g, as a measure of sampling effort, and viral family membership, v, by including them as fixed effects in these models. These models can be formalized for mutational distance or dN/dS, d, with some p number of viral families and residual error, ε, as:
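
A form consistent with the definitions above is sketched below. It is a hedged reconstruction from the stated variables rather than the authors' exact notation: the coefficient symbols β and the indexing of the family terms are assumptions.

```latex
\log\left(\lambda_{\text{host range}}\right)
  = \beta_0 + \beta_1 d + \beta_2 g + \sum_{i=1}^{p} \beta_{i+2}\, v_i + \varepsilon
```

Here v_i would be an indicator for membership of viral family i. In a fitted model with an intercept, one family typically serves as the reference level, so the sum would effectively run over p − 1 indicators.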

We tested whether the parameter estimates were non-zero by performing two-tailed Z -tests implemented within the ‘summary’ function in R.

To estimate the strength of adaptive signals for coronaviruses, paramyxoviruses, rhabdoviruses and circoviruses (Fig. 5 ) by gene type, we implemented two linear regression models for each viral family. Since the overall adaptive signal may differ for each viral clique, we corrected for this effect by using an initial linear model where the number of viral cliques, viral clique membership and residual are given by q , c and ε , respectively, as follows:

Subsequently, we used the corrected log(dN/dS) estimates represented by the residuals of model 1, ε model 1 , in a second linear model partitioning the effects of gene type by host jump status, j . Given r number of gene types, this model can be formalized as follows:

The estimated effects shown in Fig. 5 , representative of the difference in adaptive signals associated with jump and non-host jump lineages for each gene type, were then computed as:

To test whether this effect is statistically significant, we used a one-tailed t -test, with the t statistic computed using the standard error of the parameter estimates in model 2:

The residuals of model 2 were confirmed to be approximately normal by visual inspection (Supplementary Fig. 6 ).

Data analysis and visualization

All data analyses were performed using R v.4.3.1. All visualizations were performed using ggplot (v.3.4.2) 66 or ggtree (v.3.8.2) 67 . UpSet plots were created using the R package, UpSetR (v.1.4.0) 68 .

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The full list of accessions considered in this study is provided in Supplementary Data 1. The data used for the main analyses are provided in Supplementary Tables 2–7. All reconstructed maximum-likelihood trees and ancestral sequences used for the analyses are hosted on Zenodo (https://doi.org/10.5281/zenodo.10214868) 69.

Code availability

All custom code used to perform the analyses reported here is hosted on GitHub (https://github.com/cednotsed/vertebrate_host_jumps).

References

1. Jones, K. E. et al. Global trends in emerging infectious diseases. Nature 451, 990–993 (2008).

2. Shaw, L. P. et al. The phylogenetic range of bacterial and viral pathogens of vertebrates. Mol. Ecol. 29, 3361–3379 (2020).

3. Olival, K. J. et al. Host and viral traits predict zoonotic spillover from mammals. Nature 546, 646–650 (2017).

4. Gibb, R. et al. Zoonotic host diversity increases in human-dominated ecosystems. Nature 584, 398–402 (2020).

5. Woolhouse, M. E. J. & Gowtage-Sequeria, S. Host range and emerging and reemerging pathogens. Emerg. Infect. Dis. 11, 1842–1847 (2005).

6. Albery, G. F., Eskew, E. A., Ross, N. & Olival, K. J. Predicting the global mammalian viral sharing network using phylogeography. Nat. Commun. 11, 2260 (2020).

7. Taylor, L. H., Latham, S. M. & Woolhouse, M. E. J. Risk factors for human disease emergence. Phil. Trans. R. Soc. Lond. B 356, 983–989 (2001).

8. Albery, G. F. et al. Urban-adapted mammal species have more known pathogens. Nat. Ecol. Evol. 6, 794–801 (2022).

9. Karesh, W. B. et al. Ecology of zoonoses: natural and unnatural histories. Lancet 380, 1936–1945 (2012).

10. Cleaveland, S., Laurenson, M. K. & Taylor, L. H. Diseases of humans and their domestic mammals: pathogen characteristics, host range and the risk of emergence. Phil. Trans. R. Soc. Lond. B 356, 991–999 (2001).

11. Carlson, C. J. et al. The Global Virome in One Network (VIRION): an atlas of vertebrate–virus associations. mBio 13, e02985-21 (2022).

12. Gibb, R. et al. Data proliferation, reconciliation, and synthesis in viral ecology. BioScience 71, 1148–1156 (2021).

13. Kuchipudi, S. V. et al. Coordinated surveillance is essential to monitor and mitigate the evolutionary impacts of SARS-CoV-2 spillover and circulation in animal hosts. Nat. Ecol. Evol. https://doi.org/10.1038/s41559-023-02082-0 (2023).

14. Watsa, M. & Wildlife Disease Surveillance Focus Group. Rigorous wildlife disease surveillance. Science 369, 145–147 (2020).

15. Tan, C. C. et al. Genomic screening of 16 UK native bat species through conservationist networks uncovers coronaviruses with zoonotic potential. Nat. Commun. 14, 3322 (2023).

16. Pepin, K. M., Lass, S., Pulliam, J. R. C., Read, A. F. & Lloyd-Smith, J. O. Identifying genetic markers of adaptation for surveillance of viral host jumps. Nat. Rev. Microbiol. 8, 802–813 (2010).

17. Kaur, T. et al. Descriptive epidemiology of fatal respiratory outbreaks and detection of a human-related metapneumovirus in wild chimpanzees (Pan troglodytes) at Mahale Mountains National Park, Western Tanzania. Am. J. Primatol. 70, 755–765 (2008).

18. Simmonds, P. et al. Four principles to establish a universal virus taxonomy. PLoS Biol. 21, e3001922 (2023).

19. Adams, M. J. et al. 50 years of the International Committee on Taxonomy of Viruses: progress and prospects. Arch. Virol. 162, 1441–1446 (2017).

20. Walker, P. J. et al. Recent changes to virus taxonomy ratified by the International Committee on Taxonomy of Viruses (2022). Arch. Virol. 167, 2429–2440 (2022).

21. Blaxter, M. et al. Defining operational taxonomic units using DNA barcode data. Phil. Trans. R. Soc. B 360, 1935–1943 (2005).

22. Acman, M., van Dorp, L., Santini, J. M. & Balloux, F. Large-scale network analysis captures biological features of bacterial plasmids. Nat. Commun. 11, 2452 (2020).

23. McDonald, S. M., Nelson, M. I., Turner, P. E. & Patton, J. T. Reassortment in segmented RNA viruses: mechanisms and outcomes. Nat. Rev. Microbiol. 14, 448–460 (2016).

24. Tan, C. C. S. et al. Transmission of SARS-CoV-2 from humans to animals and potential host adaptation. Nat. Commun. 13, 2988 (2022).

25. Kuchipudi, S. V. et al. Multiple spillovers from humans and onward transmission of SARS-CoV-2 in white-tailed deer. Proc. Natl Acad. Sci. USA 119, e2121644119 (2022).

26. Munnink, B. B. O. et al. Transmission of SARS-CoV-2 on mink farms between humans and mink and back to humans. Science 371, 172–177 (2021).

27. McAloose, D. et al. From people to Panthera: natural SARS-CoV-2 infection in tigers and lions at the Bronx Zoo. mBio 11, e02220-20 (2020).

28. Short, K. R. et al. One health, multiple challenges: the inter-species transmission of influenza A virus. One Health 1, 1–13 (2015).

29. Nelson, M. I. & Vincent, A. L. Reverse zoonosis of influenza to swine: new perspectives on the human–animal interface. Trends Microbiol. 23, 142–153 (2015).

30. Samara, E. M. & Abdoun, K. A. Concerns about misinterpretation of recent scientific data implicating dromedary camels in epidemiology of Middle East respiratory syndrome (MERS). mBio 5, e01430-14 (2014).

31. Du, L. & Han, G.-Z. Deciphering MERS-CoV evolution in dromedary camels. Trends Microbiol. 24, 87–89 (2016).

32. Zhang, Z., Shen, L. & Gu, X. Evolutionary dynamics of MERS-CoV: potential recombination, positive selection and transmission. Sci. Rep. 6, 25049 (2016).

33. Huang, Y., Yang, C., Xu, X., Xu, W. & Liu, S. Structural and functional properties of SARS-CoV-2 spike protein: potential antivirus drug development for COVID-19. Acta Pharmacol. Sin. 41, 1141–1149 (2020).

34. Carabelli, A. M. et al. SARS-CoV-2 variant biology: immune escape, transmission and fitness. Nat. Rev. Microbiol. 21, 162–177 (2023).

35. Wang, N. et al. Structural definition of a neutralization-sensitive epitope on the MERS-CoV S1-NTD. Cell Rep. 28, 3395–3405 (2019).

36. Shang, J. et al. Structural basis of receptor recognition by SARS-CoV-2. Nature 581, 221–224 (2020).

37. Shang, J. et al. Cryo-EM structure of infectious bronchitis coronavirus spike protein reveals structural and functional evolution of coronavirus spike proteins. PLoS Pathog. 14, e1007009 (2018).

38. Yuan, Y. et al. Cryo-EM structures of MERS-CoV and SARS-CoV spike glycoproteins reveal the dynamic receptor binding domains. Nat. Commun. 8, 15092 (2017).

39. Messenger, A. M., Barnes, A. N. & Gray, G. C. Reverse zoonotic disease transmission (zooanthroponosis): a systematic review of seldom-documented human biological threats to animals. PLoS ONE 9, e89055 (2014).

40. Edwards, S. J., Chatterjee, H. J. & Santini, J. M. Anthroponosis and risk management: a time for ethical vaccination of wildlife? Lancet Microbe 2, e230–e231 (2021).

41. Schrauwen, E. J. & Fouchier, R. A. Host adaptation and transmission of influenza A viruses in mammals. Emerg. Microbes Infect. 3, e9 (2014).

42. Villordo, S. M., Carballeda, J. M., Filomatori, C. V. & Gamarnik, A. V. RNA structure duplications and flavivirus host adaptation. Trends Microbiol. 24, 270–283 (2016).

43. Urbanowicz, R. A. et al. Human adaptation of Ebola virus during the West African outbreak. Cell 167, 1079–1087 (2016).

44. Damas, J. et al. Broad host range of SARS-CoV-2 predicted by comparative and structural analysis of ACE2 in vertebrates. Proc. Natl Acad. Sci. USA 117, 22311–22322 (2020).

45. Lam, S. D. et al. SARS-CoV-2 spike protein predicted to form complexes with host receptor protein orthologues from a broad range of mammals. Sci. Rep. 10, 16471 (2020).

46. Starr, T. N. et al. ACE2 binding is an ancestral and evolvable trait of sarbecoviruses. Nature 603, 913–918 (2022).

47. Wright, A. M., Lyons, K. M., Brandley, M. C. & Hillis, D. M. Which came first: the lizard or the egg? Robustness in phylogenetic reconstruction of ancestral states. J. Exp. Zool. B 324, 504–516 (2015).

48. Liu, P., Song, Y., Colijn, C. & MacPherson, A. The impact of sampling bias on viral phylogeographic reconstruction. PLoS Glob. Public Health 2, e0000577 (2022).

49. Schoch, C. L. et al. NCBI Taxonomy: a comprehensive update on curation, resources and tools. Database 2020, baaa062 (2020).

50. Lefkowitz, E. J. et al. Virus taxonomy: the database of the International Committee on Taxonomy of Viruses (ICTV). Nucleic Acids Res. 46, D708–D717 (2018).

51. Hulo, C. et al. ViralZone: a knowledge resource to understand virus diversity. Nucleic Acids Res. 39, D576–D582 (2011).

52. Nayfach, S. et al. CheckV assesses the quality and completeness of metagenome-assembled viral genomes. Nat. Biotechnol. 39, 578–585 (2021).

53. Ondov, B. D. et al. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol. 17, 132 (2016).

54. Rosvall, M. & Bergstrom, C. T. Maps of random walks on complex networks reveal community structure. Proc. Natl Acad. Sci. USA 105, 1118–1123 (2008).

55. Yang, Z., Algesheimer, R. & Tessone, C. J. A comparative analysis of community detection algorithms on artificial networks. Sci. Rep. 6, 30750 (2016).

56. Saitou, N. & Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987).

57. Katoh, K., Misawa, K., Kuma, K. & Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).

58. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

59. Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).

60. Minh, B. Q., Nguyen, M. A. T. & von Haeseler, A. Ultrafast approximation for phylogenetic bootstrap. Mol. Biol. Evol. 30, 1188–1195 (2013).

61. Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K., von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).

62. Wolf, J. B., Künstner, A., Nam, K., Jakobsson, M. & Ellegren, H. Nonlinear dynamics of nonsynonymous (dN) and synonymous (dS) substitution rates affects inference of selection. Genome Biol. Evol. 1, 308–319 (2009).

63. Louca, S. & Doebeli, M. Efficient comparative phylogenetics on large trees. Bioinformatics 34, 1053–1055 (2018).

64. Li, W.-H. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36, 96–99 (1993).

65. Lu, G. et al. Molecular basis of binding between novel human coronavirus MERS-CoV and its receptor CD26. Nature 500, 227–231 (2013).

66. Wickham, H. ggplot2. Wiley Interdiscip. Rev. Comput. Stat. 3, 180–185 (2011).

67. Yu, G., Smith, D. K., Zhu, H., Guan, Y. & Lam, T. T. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol. Evol. 8, 28–36 (2017).

68. Conway, J. R., Lex, A. & Gehlenborg, N. UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics 33, 2938–2940 (2017).

69. Tan, C. C. S. Supplementary data for ‘Crossing host boundaries: the evolutionary drivers and correlates of viral host jumps’ [Data set]. Zenodo https://doi.org/10.5281/zenodo.10497734 (2023).

Acknowledgements

We thank R. J. Gibbs, G. Murray and L. P. Shaw for helpful feedback and discussions. C.C.S.T. was funded by the National Science Scholarship from the Agency for Science, Technology and Research (A*STAR), Singapore. F.B. and L.v.D. were funded by the European Commission (Horizon 2021–2024, END-VOC Project). L.v.D. was also funded by the UCL Excellence Fellowship. Views and opinions expressed are, however, those of the authors only and do not necessarily reflect those of the European Union or the European Health and Digital Executive Agency. For the purpose of open access, the corresponding author has applied a ‘Creative Commons Attribution’ (CC BY) licence to any author-accepted version of the manuscript. The authors acknowledge the use of the UCL Myriad High Performance Computing Facility (Myriad@UCL), the UCL Department of Computer Science High Performance Computing Cluster and associated support services in the completion of this work.

Author information

These authors contributed equally: Lucy van Dorp, Francois Balloux.

Authors and Affiliations

UCL Genetics Institute, University College London, London, UK

Cedric C. S. Tan, Lucy van Dorp & Francois Balloux

The Francis Crick Institute, London, UK

Cedric C. S. Tan

Contributions

C.C.S.T. performed all analyses. L.v.D. and F.B. jointly supervised the study. C.C.S.T., L.v.D. and F.B. wrote the manuscript.

Corresponding author

Correspondence to Cedric C. S. Tan.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Ecology & Evolution thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Host and geographical distribution of viral sequences.

( a ) Number of viral sequences, excluding SARS-CoV-2, associated with the top 50 vertebrate hosts observed in the ‘others’ category as shown in main text Fig. 1a . ( b ) Number of viral sequences stratified by the four most-sequenced non-human animals, excluding SARS-CoV-2. The number of viral sequences for the top 10 countries are shown as bar plots. The percentage of viral sequences for the top three most sequenced viral species for each host are annotated.

Extended Data Fig. 2 Distribution of missing metadata for viral sequences.

(Top) Proportion of all viral sequences associated to non-human vertebrates ( n  = 1,599,672) with missing genus information or (bottom) sample collection year, stratified by viral family or country of origin. Countries with no associated sequences are denoted ‘NA’.

Extended Data Fig. 3 Viral cliques for Coronaviridae .

Sparse networks of viral cliques identified (see Methods) and their corresponding user-submitted species names for the Coronaviridae , similar to main text Fig. 2 . Nodes, node shapes, and edges represent individual genomes, their associated host and their pairwise Mash (alignment-free) distances, respectively.

Extended Data Fig. 4 Summary of viral cliques identified.

( a ) Number of viral cliques identified stratified by viral family. Cliques with only animal-associated sequences, human-associated sequences, or both are annotated. ( b ) Percentage of viral cliques involving at least one of the 2,904 putative host jumps inferred, stratified by viral family.

Extended Data Fig. 5 Robustness of host jump inference.

(a) UpSet plot providing the intersecting host jumps identified via ancestral reconstruction when using a two-fold, five-fold or ten-fold likelihood threshold. (b) Bar plot showing the number of anthroponotic and zoonotic events inferred using various likelihood thresholds, (c) at different ancestral node depths, and (d) stratified by viral family. For (c), the numbers of anthroponotic and zoonotic host jumps were stratified by the depth of the ancestral node in the tip-to-node traversal. Since multiple host jump lineages can involve the same ancestral node, the tip-to-node depths may vary depending on which lineage is selected. As such, we randomly selected a viral lineage for each distinct host jump event for this analysis.

Extended Data Fig. 6 Adaptation analysis for viral groups.

Analysis of relationships between host range and estimated adaptive signals, similar to Fig. 3 , but only considering ssDNA, dsDNA, +ssRNA or -ssRNA viruses. Distributions of minimum ( a ) mutational distance and ( b ) dN/dS for host jump and non-host jumps on the logarithmic scale. We corrected for the effects of sequencing effort and viral family membership using Poisson regression models. The estimated effects of patristic distance on host range after these corrections are annotated. We tested whether the estimated effects were non-zero using two-tailed Z-tests. For all panels, each data point represents the minimum distance or dN/dS across all host jump or randomly selected non-host jump lineages in a single clique. Line segments represent linear regression smooths without correction.

Extended Data Fig. 7 Adaptive signals in the Coronaviridae spike gene.

Analysis of the log10(dN/dS) estimates associated to different functional domains encoded by the coronavirus spike gene: N-terminal domain (NTD), receptor-binding domain (RBD), fusion peptide (FP), heptad repeats 1 and 2 (HR1 and HR2), central helix (CH), transmembrane (TM), C-terminal domains (CT). Estimates with dN=0 or dS=0 were removed and the remaining number of sequences for each domain and viral clique are annotated. Differences in distributions were tested for using two-sided Mann-Whitney U tests and the corresponding p-values are annotated. Boxplot elements are defined as follows: centre line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range.

Extended Data Fig. 8 The global viral host jump network.

Directed network of the vertebrate viral-sharing network, where nodes and edges represent host genera and the number of viral cliques shared. Edge widths and colour are indicative of the number of viral cliques shared.

Supplementary information

Supplementary information.

Supplementary Note, Methods and Figs. 1–6.

Reporting Summary

Peer Review File

Supplementary Data 1

Supplementary Tables 1–9

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Tan, C.C.S., van Dorp, L. & Balloux, F. The evolutionary drivers and correlates of viral host jumps. Nat Ecol Evol (2024). https://doi.org/10.1038/s41559-024-02353-4

Received: 11 January 2024

Accepted: 29 January 2024

Published: 25 March 2024

DOI: https://doi.org/10.1038/s41559-024-02353-4

New Research

Microplastics Are Contaminating Ancient Archaeological Sites

New research suggests plastic particles may pose a threat to the preservation of historic remains

Aaron Boorstein

Staff Contributor

Today, microplastics are found almost everywhere: oceans, food, the atmosphere and even human lungs, blood and placentas. But while they’re thought of as a modern problem, plastic particles are now appearing where one might least expect: ancient archaeological sites.

Researchers found microplastics in soil deposits 7.35 meters (24.11 feet) below the ground, according to a study published this month in the journal Science of the Total Environment. The soil samples date to the first or early second century C.E. and were sourced from two archaeological sites in York, England. Some were excavated in the late 1980s, while others were contemporary samples.

The scientists then used an imaging technique called μFTIR, which can detect microplastics’ quantities, size and composition. Across all samples, they found 66 particles consisting of 16 polymer types.

“This feels like an important moment, confirming what we should have expected: that what were previously thought to be pristine archaeological deposits, ripe for investigation, are in fact contaminated with plastics,” says John Schofield, an archaeologist at the University of York, in a statement.

Microplastics are fragments of plastic that are smaller than five millimeters long, the diameter of a standard pencil eraser. They come from a variety of sources, including laundry, landfills, beauty products and sewage sludge.

“In the last not even 100 years—mostly since the 1950s—we as humans have produced eight billion tons of plastic, and the estimate is only about 10 percent of that has been recycled,” Leigh Shemitz, president of the climate education group SoundWaters, told Yale Sustainability in 2020.

Microplastics have been found in soil samples before. In fact, almost one-third of all plastic waste ends up in soil or freshwater, according to the United Nations Convention to Combat Desertification.

But the new study provides “the first evidence of [microplastic] contamination in archaeological sediment (or soil) samples,” write the researchers. These findings could change how archaeologists protect historic sites.

“While preserving archaeological remains in situ has been the favored approach in recent years, the new findings could trigger a change in approach, as microplastic contamination could compromise the remains’ scientific value,” writes CNN’s Jack Guy.

In situ, Latin for “in the place,” is the term used to describe archaeological objects that have not been moved from their original locations. Leaving remains in situ helps prevent site and artifact damage, preserves contextual setting and allows future researchers to gather information.

“The presence of microplastics can and will change the chemistry of the soil, potentially introducing elements which will cause the organic remains to decay,” says David Jennings, chief executive of York Archaeology, in the statement. “If that is the case, preserving archaeology in situ may no longer be appropriate.”

Now, the researchers will shift their attention toward better understanding the implications of their findings. They know microplastics could threaten the integrity of archaeological samples, but what exactly does that harm look like?

“To what extent this contamination compromises the evidential value of these deposits and their national importance is what we'll try to find out next,” says Schofield.

Aaron Boorstein is an intern with Smithsonian magazine.

COMMENTS

  1. Pet Ownership and Quality of Life: A Systematic Review of the Literature

    1. Introduction. Throughout history, animals have played a significant role in society including in agriculture and pet ownership. A recent survey conducted in the United States estimated that approximately 67% of homes had at least one pet, equaling about 63 million homes with at least one dog and 42 million homes with at least one cat. Pets can constitute a connection to nature, function ...

  2. Pet ownership and human health: a brief review of evidence and issues

    Research into the association between pet ownership and human health has produced intriguing, although frequently contradictory, results, often raising uncertainty as to whether pet ownership is advisable on health grounds. The question of whether someone should own a pet is never as simple as whether that pet has a measurably beneficial or detrimental effect on the owner's physical health.

  3. Benefits of pets' ownership, a review based on health perspectives

    In recent years there exist many research publications that emphasize the medical value and therapeutic nature of the human-pet bond (Hussein et al., 2021; Lass-Hennemann et al., 2020;Matchock ...

  4. (PDF) Pet Ownership and Quality of Life: A Systematic ...

    Abstract: Pet ownership is the most common form of human-animal interaction, and anecdotally, pet ownership can lead to improved physical and mental health for owners. However, scant research ...

  5. (PDF) The Impact of Pets on Human Health and ...

    Abstract. Because of extensive media coverage, it is now widely believed that pets enhance their owners' health, sense of psychological well-being, and longevity. But while some researchers have ...

  6. The Pet Exposure Effect: Exploring the Differential Impact of Dogs

    Pets are prevalent and play important roles in consumers' daily lives (Amiot and Bastian 2015; Cavanaugh, Leonard, and Scammon 2008; Hirschman 1994; Holbrook and Woodside 2008; Serpell and Paul 2011). According to the survey of the American Pet Products Association, 68% of U.S. households, or 84.6 million homes, own a pet. Dogs and cats are the most popular pets, with 48% of U.S. households ...

  7. A Systematic Review of Research on Pet Ownership and Animal

    189 Anthrozoös A Systematic Review of Research on Pet Ownership and Animal Interactions among Older Adults. Thomas, Son, Chapa, & McCune, 2013). In an analysis of 460 older individuals who had experienced a myocardial infarction, PO was the only factor significantly predicting survival (Friedmann, Thomas, & Son, 2011).

  8. Pet Ownership and Quality of Life: A Systematic Review of the

    Abstract. Pet ownership is the most common form of human-animal interaction, and anecdotally, pet ownership can lead to improved physical and mental health for owners. However, scant research is available validating these claims. This study aimed to review the recent peer reviewed literature to better describe the body of knowledge surrounding ...

  9. PDF Friends With Benefits: On the Positive Consequences of Pet Ownership

    Overall, 167 reported owning a pet (50 had no pets), with pet owners having an average of 1.98 pets (SD 1.25) in their household. Owners and nonowners did not differ with respect to sex, age, or family income. Procedure. Participants completed the measures online during a 2-week window on a secure computer server.

  10. Frontiers

    Attitudes Toward Pets and Youth Outcomes. Table 2 shows the results from the mixed level regression models used to test whether attitudes toward pets was associated with youth socioemotional outcomes. As above, analyses were run in two steps, without and with demographic covariates. In addition to gender, age, race/ethnicity, and family SES, pet ownership was also included in the second step.

  11. The impact of returning a pet to the shelter on future animal adoptions

    Post-hoc analyses using standardized residuals showed dogs were returned more frequently than cats for behavior (36.1%) and housing issues (11.3%), and cats were returned more due to the health of ...

  12. Frontiers

    Canine science is rapidly maturing into an interdisciplinary and highly impactful field with great potential for both basic and translational research. The articles in this Frontiers Research Topic, Our Canine Connection: The History, Benefits and Future of Human-Dog Interactions, arise from two meetings sponsored by the Wallis Annenberg PetSpace Leadership Institute, which convened experts ...

  13. Increasing adoption rates at animal shelters: a two-phase approach to

    Background Among the 6-8 million animals that enter the rescue shelters every year, nearly 3-4 million (i.e., 50% of the incoming animals) are euthanized, and 10-25% of them are put to death specifically because of shelter overcrowding each year. The overall goal of this study is to increase the adoption rates at animal shelters. This involves predicting the length of stay of each animal ...

  14. Animals

    Feature papers represent the most advanced research with significant potential for high impact in the field. A Feature Paper should be a substantial original Article that involves several techniques or approaches, provides an outlook for future research directions and describes possible research applications. ... Animals is an international ...

  15. (PDF) Pets and mental health

    Abstract. Mental un-wellness is a major health issue internationally. It is important to identify policy avenues that improve or support mental health. The animals we keep as pets may offer ...

  16. Science explores the origins of the friendship between dogs and humans

    Pet ownership is known to help reduce stress levels, promote positive emotions and reduce the risk of cardiovascular disease. "However, research on the brain activity produced by human-animal interaction is incipient and insufficient," says Yoo. This may be because, in order to understand it, one needs not only neurology and psychology but ...

  17. Co-sleeping with pet dogs

    Co-sleeping with pet dogs — but not cats — linked to poorer sleep in study. A survey-based study finds that people who sleep in the same room as their dogs show worse sleep quality. Cats weren ...

  18. The evolutionary drivers and correlates of viral host jumps

    Humans give more viruses to animals than they do to us. To investigate the relative frequency of anthroponotic and zoonotic host jumps, we retrieved 58,657 quality-controlled viral genomes ...

  19. (PDF) EVOLVING OPPORTUNITIES AND TRENDS IN THE PET ...

    This paper aims to put forward constructive suggestions for the future development of the pet industry under the epidemic environment through market analysis, data research, prospect judgment and ...

  20. Microplastics Are Contaminating Ancient Archaeological Sites

    But while they're thought of as a modern problem, plastic particles are now appearing where one might least expect: ancient archaeological sites. Researchers found microplastics in soil deposits ...