How to Justify Your Methods in a Thesis or Dissertation

4-minute read

1st May 2023

Writing a thesis or dissertation is hard work. You’ve devoted countless hours to your research, and you want your results to be taken seriously. But how does your professor or evaluating committee know that they can trust your results? You convince them by justifying your research methods.

What Does Justifying Your Methods Mean?

In simple terms, your methods are the tools you use to obtain your data, and the justification (also called the methodology) is your analysis of those tools and of why they were the right choice. In your justification, your goal is to demonstrate that your research is both rigorously conducted and replicable so your audience recognizes that your results are legitimate.

The formatting and structure of your justification will depend on your field of study and your institution’s requirements, but below, we’ve provided questions to ask yourself as you outline your justification.

Why Did You Choose Your Method of Gathering Data?

Does your study rely on quantitative data, qualitative data, or both? Certain types of data work better for certain studies. How did you choose to gather that data? Evaluate your approach to collecting data in light of your research question. Did you consider any alternative approaches? If so, why did you decide not to use them? Highlight the pros and cons of various possible methods if necessary. Research results aren’t valid unless the data are valid, so you have to convince your reader that they are.

How Did You Evaluate Your Data?

Collecting your data was only the first part of your study. Once you had them, how did you use them? Do your results involve cross-referencing? If so, how was this accomplished? Which statistical analyses did you run, and why did you choose them? Are they common in your field? How did you make sure your data were statistically significant? Is your effect size small, medium, or large? Numbers don’t always lend themselves to an obvious outcome. Here, you want to provide a clear link between the Methods and Results sections of your paper.
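For instance, if your study compared a treatment group with a control group, your justification might explain why you paired a significance test with a standardized effect size. The short sketch below (in Python, with made-up numbers purely for illustration) shows one common pairing, a two-sample t-test with Cohen’s d; your own field may call for different tests.

    # Illustrative only: a two-sample t-test plus Cohen's d on made-up data.
    import numpy as np
    from scipy import stats

    treatment = np.array([23.1, 25.4, 27.8, 24.6, 26.9, 28.3, 25.0])
    control = np.array([21.0, 22.5, 24.1, 20.8, 23.3, 22.9, 21.7])

    # Significance: is the difference in means larger than chance would explain?
    t_stat, p_value = stats.ttest_ind(treatment, control)

    # Effect size: Cohen's d using the pooled standard deviation
    n1, n2 = len(treatment), len(control)
    pooled_sd = np.sqrt(((n1 - 1) * treatment.std(ddof=1) ** 2 +
                         (n2 - 1) * control.std(ddof=1) ** 2) / (n1 + n2 - 2))
    cohens_d = (treatment.mean() - control.mean()) / pooled_sd

    # Rough convention: d ~ 0.2 small, 0.5 medium, 0.8 large
    print(f"p = {p_value:.3f}, Cohen's d = {cohens_d:.2f}")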

Did You Use Any Unconventional Approaches in Your Study?

Most fields have standard approaches to the research they use, but these approaches don’t work for every project. Did you use methods that other fields normally use, or did you need to come up with a different way of obtaining your data? Your reader will look at unconventional approaches with a more critical eye. Acknowledge the limitations of your method, but explain why the strengths of the method outweigh those limitations.

What Relevant Sources Can You Cite?

You can strengthen your justification by referencing existing research in your field. Citing these references can demonstrate that you’ve followed established practices for your type of research. Or you can discuss how you decided on your approach by evaluating other studies. Highlight the use of established techniques, tools, and measurements in your study. If you used an unconventional approach, justify it by providing evidence of a gap in the existing literature.

Two Final Tips:

●  When you’re writing your justification, write for your audience. Your purpose here is to provide more than a technical list of details and procedures. This section should focus more on the why and less on the how.

●  Consider your methodology as you’re conducting your research. Take thorough notes as you work to make sure you capture all the necessary details correctly. Eliminating any possible confusion or ambiguity will go a long way toward helping your justification.

In Conclusion:

Your goal in writing your justification is to explain not only the decisions you made but also the reasoning behind those decisions. It should be overwhelmingly clear to your audience that your study used the best possible methods to answer your research question. Properly justifying your methods will let your audience know that your research was effective and its results are valid.

Want more writing tips? Check out Proofed’s Writing Tips and Academic Writing Tips blogs. And once you’ve written your thesis or dissertation, consider sending it to us. Our editors will be happy to check your grammar, spelling, and punctuation to make sure your document is the best it can be. Check out our services for free.

Justification of research using systematic reviews continues to be inconsistent in clinical health science—A systematic review and meta-analysis of meta-research studies

Jane Andreasen

1 Department of Physiotherapy and Occupational Therapy, Aalborg University Hospital, Denmark and Public Health and Epidemiology Group, Department of Health, Science and Technology, Aalborg University, Aalborg, Denmark

Birgitte Nørgaard

2 Department of Public Health, University of Southern Denmark Odense, Denmark

Eva Draborg

Carsten Bogh Juhl

3 Department of Sports Science and Clinical Biomechanics, University of Southern Denmark and Department of Physiotherapy and Occupational Therapy, Copenhagen University Hospital, Herlev and Gentofte, Herlev, Denmark

Jennifer Yost

4 M. Louise Fitzpatrick College of Nursing, Villanova University, Villanova, PA, United States of America

Klara Brunnhuber

5 Digital Content Services, Elsevier, London, United Kingdom

Karen A. Robinson

6 Johns Hopkins University School of Medicine, Baltimore, MD, United States of America

7 Department of Evidence-Based Practice, Western Norway University of Applied Sciences, Bergen, Norway

Associated Data

All relevant data are within the paper and its Supporting Information files.

Redundancy is an unethical, unscientific, and costly challenge in clinical health research. There is a high risk of redundancy when existing evidence is not used to justify the research question when a new study is initiated. Therefore, the aim of this study was to synthesize meta-research studies evaluating if and how authors of clinical health research studies use systematic reviews when initiating a new study.

Seven electronic bibliographic databases were searched (final search June 2021). Meta-research studies assessing the use of systematic reviews when justifying new clinical health studies were included. Screening and data extraction were performed by two reviewers independently. The primary outcome was defined as the percentage of original studies within the included meta-research studies using systematic reviews of previous studies to justify a new study. Results were synthesized narratively and quantitatively using a random-effects meta-analysis. The protocol has been registered in Open Science Framework ( https://osf.io/nw7ch/ ).

Twenty-one meta-research studies were included, representing 3,621 original studies or protocols. Nineteen of the 21 studies were included in the meta-analysis. The included studies represented different disciplines and exhibited wide variability both in how the use of previous systematic reviews was assessed, and in how this was reported. The use of systematic reviews to justify new studies varied from 16% to 87%. The mean percentage of original studies using systematic reviews to justify their study was 42% (95% CI: 36% to 48%).

Justification of new studies in clinical health research using systematic reviews is highly variable, and fewer than half of new clinical studies in health science were justified using a systematic review. Research redundancy is a challenge for clinical health researchers, as well as for funders, ethics committees, and journals.

Introduction

Research redundancy in clinical health research is an unethical, unscientific, and costly challenge that can be minimized by using an evidence-based research approach. First introduced in 2009 and since endorsed and promoted by organizations and researchers worldwide [ 1 – 6 ], evidence-based research is an approach whereby researchers systematically and transparently take into account the existing evidence on a topic before embarking on a new study. The researcher thus strives to enter the project unbiased, or at least aware of the risk of knowledge redundancy bias. The key is an evidence synthesis using formal, explicit, and rigorous methods to bring together the findings of pre-existing research to synthesize the totality of what is known [ 7 ]. Evidence syntheses provide the basis for an unbiased justification of the proposed research study, ensuring that participant enrollment, resource allocation, and healthcare systems support only relevant and justified research. Enormous numbers of research studies are conducted, funded, and published globally every year [ 8 ]. Thus, if earlier relevant research is not considered in a systematic and transparent way when justifying research, the foundation for a research question is not properly established, thereby increasing the risk of redundant studies being conducted, funded, and published, resulting in a waste of resources such as time and funding [ 1 , 4 ]. Most importantly, when redundant research is initiated, participants unethically and unnecessarily receive placebos or suboptimal treatment.

Previous meta-research, defined as the study of research itself, including the methods, reporting, reproducibility, evaluation, and incentives of research [ 9 ], has shown that there is considerable variation and bias in the use of evidence syntheses to justify research studies [ 10 – 12 ]. To the best of our knowledge, a systematic review of previous meta-research studies assessing the use of systematic reviews to justify studies in clinical health research has not previously been conducted. Evaluating how evidence-based research is implemented in research practices across disciplines and specialties when justifying new studies will provide an indication of the integration of evidence-based research in research practices [ 9 ]. The present systematic review aimed to identify and synthesize results from meta-research studies, regardless of study type, evaluating if and how authors of clinical health research studies use systematic reviews to justify a new study.

Prior to commencing the review, we registered the protocol in the Open Science Framework ( https://osf.io/nw7ch/ ). The protocol remained unchanged, but in this paper we have made adjustments to the risk-of-bias assessment, reducing the tool to 10 items and removing the assessment of reporting quality. The review is presented in accordance with the Preferred Reporting Items for Systematic review and Meta-Analysis (PRISMA) guidelines [ 13 ].

Eligibility criteria

Studies were eligible for inclusion if they were original meta-research studies, regardless of study type, that evaluated if and how authors of clinical health research studies used systematic reviews to justify new clinical health studies. No limitations on language, publication status, or publication year were applied. Only meta-research studies of studies on human subjects in clinical health sciences were eligible for inclusion. The primary outcome was defined as the percentage of original studies within the included meta-research studies using systematic reviews of previous studies to justify a new study. The secondary outcome was how the systematic reviews of previous research were used (e.g., within the text to justify the study) by the original studies.

Information sources and search strategy

This study is one of six ongoing evidence syntheses (four systematic reviews and two scoping reviews) planned to assess the global state of evidence-based research in clinical health research. These are: a scoping review mapping the area broadly to describe current practice and identify knowledge gaps; a systematic review on the use of prior research in reports of randomized controlled trials specifically; three systematic reviews assessing the use of systematic reviews when justifying, designing [ 14 ], or putting the results of a new study in context; and, finally, a scoping review uncovering the breadth and characteristics of the available empirical evidence on the topic of citation bias. Further, the research group is working with colleagues on a Handbook for Evidence-based Research in health sciences. Due to the common aim across the six evidence syntheses, a broad overall search strategy was designed to identify meta-research studies that assessed whether researchers used earlier similar studies and/or systematic reviews of earlier similar studies to inform the justification and/or design of a new study, whether researchers used systematic reviews to inform the interpretation of new results, and whether redundant studies had been published within a specific area.

The first search was performed in June 2015. Databases included MEDLINE via both PubMed and Ovid, EMBASE via Ovid, CINAHL via EBSCO, Web of Science (Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), and Arts & Humanities Citation Index (A&HCI)), and the Cochrane Methodology Register (CMR, Methods Studies), all from inception (Appendix 1 in S1 File ). In addition, reference lists of included studies were screened for relevant articles, as were the authors’ relevant publications and abstracts from the Cochrane Methodology Reviews.

Based on the experience gained from the baseline search in June 2015, an updated and revised search was conducted in MEDLINE and Embase via Ovid covering January 2015 to June 2021 (Appendix 1 in S1 File ). Once again, the reference lists of newly included studies were screened for relevant references, as were abstracts from January 2015 to June 2021 in the Cochrane Methodology Reviews. Experts in the field were contacted to identify any additional published and/or grey literature. No restrictions were applied on publication year or language. See Appendix 1 and Appendix 2 in S1 File for the full search strategy.

Screening and study selection

Following deduplication, the search results were uploaded to Rayyan ( https://rayyan.qcri.org/welcome ). The search results from the first search (June 2015) were independently screened by pairs of reviewers. Twenty screeners were paired, with each pair including an author very experienced in systematic reviews and a less experienced author. To increase consistency among reviewers, both reviewers initially screened the same 50 publications and discussed the results before beginning screening for this review. Disagreements on study selection were resolved by consensus and, if needed, discussion with a third reviewer. The full-text screening was also performed by two reviewers independently, with disagreements on study selection resolved by consensus and discussion. Following the last search, two independent reviewers again screened the results, using the same procedure as for the first search for full-text screening and the resolution of disagreements. The screening procedures resulted in a full list of studies potentially relevant for one or more of the six above-mentioned evidence syntheses.

A second title and abstract screening and full-text screening of the full list was then performed independently by two reviewers using screening criteria specific to this systematic review. Reasons for excluding trials were recorded, and disagreements between the reviewers were resolved through discussion. If consensus was not reached, a third reviewer was involved.

Data extraction

We developed and pilot tested a data extraction form to extract data regarding study characteristics and outcomes of interest. Two reviewers independently extracted data, with other reviewers available to resolve disagreements. The following study characteristics were extracted from each of the included studies: bibliographic information, study aim, study design, setting, country, inclusion period, area of interest, results, and conclusion. Further, data for this study’s primary and secondary outcomes were extracted; these included the percentage of original studies using systematic reviews to justify their study and how the systematic reviews of previous research were used (e.g., within the text to justify the study) by the original studies.

Risk-of-bias assessment

No standard tool was identified to assess the risk of bias in empirical meta-research studies. The Editorial Group of the Evidence-Based Research Network therefore prepared a risk-of-bias tool for the planned five systematic reviews, with a list of items important for evaluating the risk of bias in meta-research studies. For each item, the study under examination could be classified as exhibiting a “low risk of bias”, “unclear risk of bias”, or “high risk of bias”. We independently tested the list of items on a sample of included studies. Following a discussion of the differing answers, we adjusted the number and content of the items to ten and defined the criteria used to evaluate the risk of bias in the included studies ( Table 1 ). Each of the included meta-research studies was appraised independently by two reviewers using the customized checklist to determine the risk of bias. Disagreements regarding the risk of bias were resolved through discussion. No study was excluded on the grounds of low quality.

Data synthesis and interpretation

In addition to narratively summarizing the characteristics of the included meta-research studies and their risk-of-bias assessments, the percentage of original studies using systematic reviews of previous similar studies to justify a new study (primary outcome) was calculated as the number of studies using at least one systematic review divided by the total number of original studies within each of the included meta-research studies. A random-effects meta-analysis (DerSimonian and Laird) was used to calculate the pooled estimate and produce the forest plot, as this model is the default in the metaprop command. Heterogeneity was evaluated by estimating the I² statistic (the percentage of variance attributable to heterogeneity, i.e., inconsistency) and the between-study variance tau². When investigating reasons for heterogeneity, a restricted maximum likelihood (REML) model was used, and covariates with the ability to reduce tau² were deemed relevant [ 15 ].

All analyses were conducted in Stata, version 17.0 (StataCorp. 2019. Stata Statistical Software: Release 17. College Station, TX: StataCorp LLC).
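To make the pooling step concrete, the sketch below reproduces the general logic of such an analysis in Python rather than Stata: each meta-research study contributes a proportion (original studies citing at least one systematic review out of all original studies assessed), the proportions are pooled with a DerSimonian and Laird random-effects model, and heterogeneity is summarized with tau² and I². The logit transform, the helper name, and the example counts are illustrative assumptions; they are not the authors' exact metaprop settings or data.

    # Minimal sketch of DerSimonian-Laird random-effects pooling of proportions,
    # analogous in spirit to Stata's metaprop. The logit transform and the counts
    # below are illustrative assumptions, not the paper's data or exact settings.
    import numpy as np
    from scipy import stats

    # (events, total) = studies citing >= 1 systematic review, studies assessed
    example_counts = [(12, 75), (40, 60), (55, 120), (30, 180)]  # hypothetical

    def dl_pool_proportions(counts):
        events = np.array([c[0] for c in counts], dtype=float)
        totals = np.array([c[1] for c in counts], dtype=float)
        p = events / totals
        y = np.log(p / (1 - p))              # logit-transformed proportions
        v = 1.0 / (totals * p * (1 - p))     # approximate within-study variances

        w = 1.0 / v                          # fixed-effect weights
        y_fixed = np.sum(w * y) / np.sum(w)
        q = np.sum(w * (y - y_fixed) ** 2)   # Cochran's Q
        k = len(y)

        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (q - (k - 1)) / c)   # DL between-study variance
        i2 = 100 * max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

        w_re = 1.0 / (v + tau2)              # random-effects weights
        mu = np.sum(w_re * y) / np.sum(w_re)
        se = np.sqrt(1.0 / np.sum(w_re))
        z = stats.norm.ppf(0.975)
        back = lambda x: 1.0 / (1.0 + np.exp(-x))  # back-transform to a proportion
        return {"pooled": back(mu),
                "ci": (back(mu - z * se), back(mu + z * se)),
                "tau2": tau2, "i2": i2}

    print(dl_pool_proportions(example_counts))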

Study selection

In total, 30,592 publications were identified through the searches. Of these, 69 publications were determined eligible for one of the six evidence syntheses. A total of 21 meta-research studies fulfilled the inclusion criteria for this systematic review [ 10 , 11 , 16 – 34 ]; see Fig 1 .

[Fig 1: flow diagram of study selection (pone.0276955.g001).]

Study characteristics

The 21 included meta-research studies were published from 2007 to 2021, representing 3,621 original studies or protocols and one survey with 106 participants; only three of these studies were published before 2013 [ 10 , 18 , 26 ]. The samples of original studies within the included meta-research studies varied. One meta-research study surveyed congress delegates [ 29 ], one study examined first-submission protocols for randomized controlled trials submitted to four hospital ethics committees [ 17 ], and 18 studies examined randomized or quasi-randomized primary studies published during a specific time period, either in a range of journals [ 10 , 11 , 18 , 21 – 28 , 31 , 32 , 34 ] or in specific databases [ 16 , 19 , 20 , 30 ]. Finally, one study examined the use of previously published systematic reviews when publishing a new systematic review [ 33 ]. Further, the number of original studies within each included meta-research study varied considerably, ranging from 18 [ 10 ] to 637 original studies [ 27 ]. The characteristics of the included meta-research studies are presented in Table 2 .

SR: systematic review; MA: meta–analysis; RCT: randomized controlled trial.

Risk of bias assessment

Overall, most studies were determined to exhibit a low risk of bias in the majority of items, and all of the included meta-research studies reported an unambiguous aim and a match between aim and methods. However, only a few studies provided argumentation for their choice of data source [ 17 , 20 , 24 , 30 ], and only two of the 21 studies referred to an available a-priori protocol [ 16 , 21 ]. Finally, seven studies provided poor or no discussion of the limitations of their study [ 10 , 19 , 22 , 26 – 28 , 34 ]. The risk-of-bias assessments are shown in Table 3 .

Synthesis of results

Of the included 21 studies, a total of 18 studies were included in the meta-analysis. Two studies included two cohorts each, and both cohorts in each of these studies were included in our meta-analysis [ 21 , 30 ]. The survey by Clayton and colleagues, with a response rate of 17%, was not included in the meta-analysis as the survey did not provide data to identify the use of systematic reviews to justify specific studies. However, their results showed that 42 of 84 respondents (50%) reported using a systematic review for justification [ 29 ]. The study by Chow, which was also not included in the meta-analysis, showed that justification varied largely within and between specialties. However, only relative numbers were provided, and, therefore, no overall percentage could be extracted [ 11 ]. The study by Seehra et al. counted the SR citations in RCTs and not the number of RCTs citing SRs and is therefore not included in the meta-analysis either [ 23 ].

The percentage of original studies that justified a new study with a systematic review within each meta-research study ranged from 16% to 87%. The pooled percentage of original studies using systematic reviews to justify their research question was 42% (95% CI: 36% to 48%), as shown in Fig 2 . Whereas the confidence interval shows the precision of the pooled estimate in a meta-analysis, the prediction interval shows the distribution of the individual studies. The heterogeneity in the meta-analysis, assessed by I², was 94%. The clinical interpretation of this large heterogeneity is seen in the very broad prediction interval, ranging from 16% to 71%, meaning that, based on these studies, there is a 95% chance that the result of the next comparable study will show a prevalence between 16% and 71%.
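As a rough illustration of how a prediction interval of this kind follows from the pooled estimate and the between-study variance (using the Higgins-style approximation on the logit scale), consider the sketch below; the numerical inputs are placeholders chosen only to show the mechanics, not the values estimated in this review.

    # Approximate 95% prediction interval for a random-effects pooled proportion
    # (Higgins et al. formulation on the logit scale). The inputs are placeholders
    # for illustration only, not the estimates reported in this review.
    import numpy as np
    from scipy import stats

    mu, se, tau2, k = -0.32, 0.12, 0.40, 20   # pooled logit, its SE, tau^2, number of studies

    t = stats.t.ppf(0.975, df=k - 2)
    half_width = t * np.sqrt(tau2 + se ** 2)
    back = lambda x: 1.0 / (1.0 + np.exp(-x))

    lo, hi = back(mu - half_width), back(mu + half_width)
    print(f"pooled ~ {back(mu):.0%}, 95% prediction interval ~ {lo:.0%} to {hi:.0%}")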

[Fig 2 (pone.0276955.g002): Forest plot of prevalence and 95% confidence intervals for the percentage of studies using an SR to justify the study.]

Further, we conducted an explorative subgroup analysis of the studies by Helfer et al. and Joseph et al., as these two studies were on meta-analyses and protocols and therefore differ from the other included studies. This analysis only marginally changed the pooled percentage, to 39% (95% CI: 33% to 46%), and reduced the between-study variance (tau²) by 23%.

The 21 included studies varied greatly in their approach and in their description of how systematic reviews were used, i.e., whether the original studies referred to systematic reviews at all and whether the systematic reviews they used were relevant and/or of high quality. Nine studies assessed, to varying degrees, whether the systematic reviews used were relevant for the justification of the research [ 16 – 20 , 25 , 30 , 32 , 34 ]. Overall, the information reported by the meta-research studies was not sufficient to report the percentage of primary studies referring to relevant systematic reviews. No details were provided regarding the methodological quality of the systematic reviews used to justify the research question, or whether they were recently published, except by Hoderlein et al., who reported that the mean number of years between publication of the cited systematic review and the trial report was four [ 30 ].

Discussion

We identified 21 meta-research studies, spanning 15 publication years and 12 medical disciplines. The findings showed substantial variability in the use of systematic reviews when justifying new clinical studies, with the incidence of use ranging from 16% to 87%. However, fewer than half of the original studies within the 19 meta-analysis-eligible meta-research studies used a systematic review to justify their new study. There was wide variability, and a general lack of information, about how systematic reviews were used within many of the original studies. Our systematic review found that the proportion of original studies justifying their new research using evidence syntheses is sub-optimal and, thus, that the potential for research redundancy continues to be a challenge. This study corroborates the serious possible consequences of research redundancy previously problematized by Chalmers et al. and Glasziou et al. [ 35 , 36 ].

Systematic reviews are considered crucial when justifying a new study, as is emphasized in reporting guidelines such as the CONSORT statement [ 37 ]. However, there are challenges involved in implementing an evidence-based research approach. The authors of the included meta-research study reporting the highest use of systematic reviews to justify a new systematic review point out that even though the authors of the original studies refer to some of the published systematic reviews, they neglect others on the same topic, which may be problematic and result in a biased approach [ 33 ]. Other issues that have been identified are the risk of research waste when a systematic review is not methodologically sound [ 12 , 38 ] and the redundancy in the conduct of systematic reviews themselves, with many overlapping systematic reviews existing on the same topic [ 39 – 41 ]. In the original studies within the meta-research studies, the use of systematic reviews was not consistent, and it was not made explicit whether the systematic reviews used were the most recent and/or of high methodological quality. These issues speak to the need for refinement in the area of systematic review development, such as mandatory registration in prospective registries. Only two of the 21 studies included in this review referred to an available a priori protocol [ 16 , 21 ]. General recommendations on the use of systematic reviews to justify a new study are difficult to make, as these will be topic specific; however, researchers should aim to use the most robust and methodologically sound recently published reviews, preferably those with a priori published protocols.

Efforts must continue in promoting the use of evidence-based research approaches among clinical health researchers and other important stakeholders, such as funders. Collaborations such as the Ensuring Value in Research Funders Forum, and changes in funding review criteria mandating reference to previously published systematic reviews when justifying the research question within funding proposals, are examples of how stakeholders can promote research that is evidence-based [ 8 , 41 ].

Strengths and limitations

We conducted a comprehensive and systematic search. The lack of standard terminology for meta-research studies resulted in search strategies that retrieved thousands of citations. We also relied on snowballing efforts to identify relevant studies, such as by contacting experts and scanning the reference lists of relevant studies.

There is also a lack of tools to assess the risk of bias in meta-research studies, so a specific risk-of-bias tool for the five conducted reviews was created. The tool was discussed and revised continuously throughout the research process; however, we acknowledge that the checklist is not yet optimal and that a validated risk-of-bias tool for meta-research studies is needed.

Many of the included meta-research studies did not provide details as to whether the systematic reviews used to justify the included studies were relevant, high quality, and/or recently published. This may raise questions as to the validity of our findings, as the majority of the meta-research studies only provide an indication of the citation of systematic reviews to justify new studies, not whether the systematic review cited was relevant, recent, and of high quality, or even how the systematic review was used. We did not assess this further either. Nonetheless, even if we assumed that these conditions were met for every original study included in the included meta-research studies (i.e., taking a conservative approach), fewer than half used systematic reviews to justify their research questions. The conservative approach used in this study therefore does not underestimate, and perhaps rather overestimates, the actual use of relevant systematic reviews to justify studies in clinical health science across disciplines.

Different study designs were included in the meta-analysis, which may have contributed to the high degree of heterogeneity observed. Therefore, the presented results should be interpreted with caution due to the high heterogeneity. Not only were there differences in the methods of the included meta-research studies, but there was also heterogeneity in the medical specialties evaluated [ 42 , 43 ].

In conclusion, justification of research questions in clinical health research with systematic reviews continues to be inconsistent; fewer than half of the primary studies within the included meta-research studies in this systematic review were found to have used a systematic review to justify their research question. This indicates that the risk of redundant research is still high when new studies across disciplines and professions in clinical health are initiated, thereby indicating that evidence-based research has not yet been successfully implemented in the clinical health sciences. Efforts to raise awareness and to ensure an evidence-based research approach continue to be necessary, and such efforts should involve clinical health researchers themselves as well as important stakeholders such as funders.

Supporting information

S1 Checklist

S1 Protocol

Acknowledgments

This work has been prepared as part of the Evidence-Based Research Network ( ebrnetwork.org ). The Evidence-Based Research Network is an international network that promotes the use of systematic reviews when justifying, designing, and interpreting research. The authors thank the Section for Evidence-Based Practice, Department for Health and Function, Western Norway University of Applied Sciences for their generous support of the EBRNetwork. Further, thanks to COST Association for supporting the COST Action “EVBRES” (CA 17117, evbres.eu) and thereby the preparation of this study. Thanks to Gunhild Austrheim, Head of Unit, Library at Western Norway University of Applied Sciences, Norway, for helping with the second search. Thanks to those helping with the screening: Durita Gunnarsson, Gorm Høj Jensen, Line Sjodsholm, Signe Versterre, Linda Baumbach, Karina Johansen, Rune Martens Andersen, and Thomas Aagaard.

We gratefully acknowledge the contribution from the EVBRES (COST ACTION CA 17117) Core Group, including Anne Gjerland (AG) and her specific contribution to the search and screening process.

Funding Statement

The authors received no specific funding for this work.

Data Availability

  • PLoS One. 2022; 17(10): e0276955.

Decision Letter 0

Transfer alert.

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

PONE-D-22-02383

Justification of research using systematic reviews continues to be inconsistent in clinical health science - a systematic review and meta-analysis of meta-research studies

PLOS ONE

Dear Dr. Andreasen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by May 17 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Andrzej Grzybowski

Academic Editor

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We noticed you have some minor occurrence of overlapping text with the following previous publication(s), which needs to be addressed:

- https://www.jclinepi.com/article/S0895-4356(22)00016-6/fulltext

In your revision ensure you cite all your sources (including your own works), and quote or rephrase any duplicated text outside the methods section. Further consideration is dependent on these concerns being addressed.

3. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions .

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories .

We will update your Data Availability statement on your behalf to reflect the information you provide.

4. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

2. Has the statistical analysis been performed appropriately and rigorously?

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Thank you for the opportunity to review this interesting meta-research paper, which is part of a series of papers.

Basing new research on systematic reviews is clearly important and has been the subject of a number of reviews. This paper essentially reviews the meta-research in this area, to give a global assessment of the issue taking into account all of the evidence

The content of the rest of the series was not made clear, but a decision has been made to publish them singly. I think the short description of the rest of the programme could be expanded a little to put the work in context and help the reader understand how the work fits together. How do the different studies relate, and are other papers needed to put the current work in context?

The introduction defines meta-research in broad terms, but it is not until the results that the reader is given a sense of the actual designs included and of relevance to the research question. Were these defined a priori, or were these study designs that fit the broad definition which happened to be found in the search? Are there meta-research designs of relevance to the research question which were not found in the searches?

Personally, I would bring a description of the range of study design forward into the introduction, as getting a sense of the sorts of approaches to meta-research of relevance will help non-specialists in this area. I was not clear of the likely designs until quite late in the paper

The review methods seemed very rigorous, and I had no major comments on those beyond one clarification. When they said, ‘No study was excluded on the grounds of low quality’, did they mean that no studies were considered so bad, or that as a rule no studies were ever going to be excluded on that basis?

As noted above, there were a number of study designs included, and all were assessed using the generic risk-of-bias tool. Presumably some designs are just stronger than others? The survey must be considered a weaker design than the others. Again, this links to the earlier comment about the need for more detail on the design of the meta-research, which I felt was lost in the use of a generic risk-of-bias assessment.

I did not understand the statement ‘The clinical interpretation of the large heterogeneity is seen in a broad prediction interval with a range from 16 to 71%’ and that needs clarification

The discussion is balanced, but there are a few significant issues that are given a fairly cursory consideration and would benefit from greater detail

I was interested in the issue of the ‘quality’ of the reviews used. I accept that the data here was not enough for analysis, but felt that the authors (as experts in this area) could be pushed to provide a stronger statement about what criteria should be used by further studies (for example, how do we judge if a review used as the basis for research is a strong basis. How long before a quoted review is too ‘old’?)

They acknowledge that ‘the checklist is not yet optimal and a validated risk-of-bias tool for meta-research studies is needed’. Given their experience and expertise, what would that look like, and how would it be best developed and tested? How would it take into account the role of different designs noted above, given variation in the approaches to meta-research they found?

I appreciate the simple and elegant assessment of the main findings, but they present only vague statement on the role of design and medical specialities. Is it not possible for them to say more on this, or explore the data more fully? What about change over time, which seems very relevant. I did feel the authors could be pushed a little more here, given that they have a programme of work and must be in a position to present more substantive statements. I think that would add to the contribution of the paper

Reviewer #2: The article is on an interesting topic, but several points need emphasis:

the inclusion criteria should be defined more clearly in the text

Systematic reviews and meta-analyses are relatively new; the first papers date back to the late seventies of the previous century.

This should be considered when reviewing papers.

The risk of redundancy cannot be well defined from the meta-research papers; rather, it should be assessed from the original articles. This would not be possible unless a focused issue is chosen as an example.

The different disciplines have different research outputs as the basis for systematic reviews, which makes comparison difficult.

I realize some studies are based on the authors' disclosure of whether they have used previous systematic reviews or not. This should be confirmed by evidence.

These should be mentioned as limitations of this work.

6. PLOS authors have the option to publish the peer review history of their article ( what does this mean? ). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy .

Reviewer #1:  Yes:  Peter Bower

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Author response to Decision Letter 0

25 Apr 2022

Response letter to the editor and reviewers,

Thank you for the opportunity to revise the manuscript. Thank you to the reviewers for the positive and constructive comments concerning the manuscript. We have now revised the manuscript in accordance with these comments by addressing all issues from the editor and from the reviewers below.

Answer: we have addressed the requirements, see our answers below.

Answer: We believe we meet the style requirements, including correct file naming.

Answer: We agree that there is overlap in parts of the methods section with the mentioned publication. The paper was published while this manuscript was in review; we have therefore now referred to the publication in this manuscript. This manuscript and the publication are both part of a series of papers assessing the global status of evidence-based research in clinical health research, and therefore the overlap in the methods section was expected. We have thoroughly scrutinized the full manuscript and found no overlapping full sentences outside the methods section. To be sure of this, we further conducted a legal comparison in MS Word with the mentioned publication and again found no overlapping full sentences except in the methods section. To our sincere knowledge, the overlap is only in the methods section; please let us know if we are mistaken.

Answer: We have uploaded the data set necessary to replicate our study findings in a supplementary file and described the changes to the “Data Availability statement” in the cover letter.

Reviewer comments Reviewer #1:

1. Thank you for the opportunity to review this interesting meta-research paper, which is part of a series of papers.

Response: Thank you for this response and that is exactly the purpose.

2. The content of the rest of the series was not made clear, but a decision has been made to publish them singly. I think the short description of the rest of the program could be expanded a little to put the work in context and help the reader understand how the work fits together. How do the different studies relate, and are other papers needed to put the current work in context?

Response: We have expanded the text, especially regarding how the work fits together, to show our purpose of making a global assessment of evidence-based research in the following six papers:

1. Meta-research evaluating redundancy and use of systematic reviews when planning new studies in health research – a scoping review

2. A Systematic Review on the Use of Prior Research in Reports of Randomized Clinical Trials

3. Justification

6. The problem of citation bias – a scoping review

We do not have other papers in pipeline at the moment, but we are currently working on a Handbook for Evidence-Based Research to provide tools and models to make it easier for researchers to work evidence- based in their research.

Changes to text: This study is one of six ongoing meta-syntheses (four systematic reviews and two scoping reviews) planned to assess the global state of evidence-based research in clinical health research. These are: a scoping review mapping the area broadly to describe current practice and identify knowledge gaps; a systematic review on the use of prior research in reports of randomized controlled trials specifically; three systematic reviews assessing the use of systematic reviews when justifying, designing [14] or putting results of a new study in context; and finally a scoping review uncovering the breadth and characteristics of the available, empirical evidence on the topic of citation bias. Further, the research group is working with colleagues on a Handbook for Evidence-based Research in health sciences.

3. The introduction defines meta-research in broad terms, but it is not until the results that the reader is given a sense of the actual designs included and of relevance to the research question. Were these defined a priori, or were these study designs that fit the broad definition which happened to be found in the search? Are there meta-research designs of relevance to the research question which were not found in the searches?

Response: We get your point. A very broad and inclusive definition was defined a priori in the published protocol: “Types of study to be included: We will include meta-research studies (or studies performing research on research)” in order not to miss out on relevant studies, because the research field was quite new and further, we did not identify other meta-research studies to guide our process. Due to our very broad and sensitive search strategy we believe we identified all relevant meta-research studies.

Only data regarding justification from original papers were included in our meta-analysis, as the study design of a survey of delegates' use of systematic reviews to justify their studies was assessed as seriously subject to social desirability bias.

Changes to text:

Introduction: The present systematic review aimed to identify and synthesize results from meta-research studies, regardless of study type, evaluating if and how authors of clinical health research studies use systematic reviews to justify a new study.

Methods section, eligibility criteria: Studies were eligible for inclusion if they were original meta-research studies, regardless of study type, that evaluated if and how authors of clinical health studies used systematic reviews to justify new clinical health studies.

4. Personally, I would bring a description of the range of study design forward into the introduction, as getting a sense of the sorts of approaches to meta-research of relevance will help non-specialists in this area. I was not clear of the likely designs until quite late in the paper

Response: We agree and have made it clear that all meta-research studies, regardless of design, were included.

Changes to text: see above.

5. The review methods seemed very rigorous, and I had no major comments on those beyond one clarification. When they said, ‘No study was excluded on the grounds of low quality’, did they mean that no studies were considered so bad, or that as a rule no studies were ever going to be excluded on that basis?

Response: The latter: as a rule, no studies were excluded, as our intention was not to guide clinical practice. This is stated in the manuscript as the last sentence in the Risk-of-Bias Assessment section. No changes have therefore been made.

6. As noted above, there were a number of study designs included, and all were assessed using the generic risk-of-bias tool. Presumably some designs are just stronger than others? The survey must be considered a weaker design than the others. Again, this links to the earlier comment about the need for more detail on the design of the meta-research, which I felt was lost in the use of a generic risk-of-bias assessment.

Response: We agree on this point, but we took a very open approach to monitor the field of justification. And we did not rank the study designs in a hierarchical order in our “premature” risk-of-bias tool, as we aimed to assess the area and not to provide any clinical recommendations. However, the author group and colleagues are currently working on an improved checklist tool.

No further changes to text.

7. I did not understand the statement ‘The clinical interpretation of the large heterogeneity is seen in a broad prediction interval with a range from 16 to 71%’ and that needs clarification

Response: We agree that an explanation is appropriate.

Changes to text: The clinical interpretation of the large heterogeneity is seen in a broad prediction interval with a range from 16% to 71%, meaning that there is 95% confidence that the result of the next study will show a prevalence between 16% and 71%.

8. The discussion is balanced, but there are a few significant issues that are given a fairly cursory consideration and would benefit from greater detail

Response: We have addressed the issues mentioned below and provided more detail

9. I was interested in the issue of the ‘quality’ of the reviews used. I accept that the data here was not enough for analysis, but felt that the authors (as experts in this area) could be pushed to provide a stronger statement about what criteria should be used by further studies (for example, how do we judge if a review used as the basis for research is a strong basis. How long before a quoted review is too ‘old’?)

Response: This is a very interesting topic to address further, which we have continuously discussed in the author group, but it is both complex and context dependent for specific topics. Therefore, we have chosen not to elaborate further on the topic in the manuscript; to give it appropriate consideration, more space is needed.

Instead, we have mentioned these considerations as important to address further in future publications as to guide researchers when using systematic reviews to justify. As mentioned earlier, the research group is working with colleagues on a Handbook for Evidence-based Research in health sciences, which will elaborate on the topics in detail.

Changes to text in Discussion section:

General recommendations on the use of systematic reviews to justify a new study are difficult to make, as these will be topic specific; however, researchers should aim to use the most robust and methodologically sound recently published reviews, preferably those with a priori published protocols.

10. They acknowledge that ‘the checklist is not yet optimal and a validated risk-of-bias tool for meta-research studies is needed’. Given their experience and expertise, what would that look like, and how would it be best developed and tested? How would it take into account the role of different designs noted above, given variation in the approaches to meta-research they found?

Response: We fully agree with you on this topic, and the author group and colleagues are currently working on an improved checklist tool. Your suggestion about ranking the study designs is very relevant and will be considered by the author group in this thorough work, which we expect to publish in the near future. We find that the work requires space and thorough analysis, and we have therefore decided that it should be published in an independent paper.

11. I appreciate the simple and elegant assessment of the main findings, but they present only a vague statement on the role of design and medical specialities. Is it not possible for them to say more on this, or explore the data more fully? What about change over time, which seems very relevant? I did feel the authors could be pushed a little more here, given that they have a programme of work and must be in a position to present more substantive statements. I think that would add to the contribution of the paper.

Response: The role of design is considered only in relation to whether the studies have done meta-research on the topic of justification. We did not find it appropriate to say more about the role of medical specialties, as the approaches in the different studies were very diverse, ranging from participants in a survey, to specific specialties, to particular journals (mostly high ranking), or to more broadly aimed journals and databases.

Change over time is an important and relevant question. We did not address the issue for two reasons. Firstly, most of the papers were published after 2012, so the timeline to assess would be short. But most importantly, as most of the included studies in our meta-research study were cross-sectional, we would not be able to validly assess change over time with the data at hand.

Reviewer #2 comments:

1. The article is on an interesting topic, but several points need emphasis.

Response: Thank you. We have answered each point below.

2. The inclusion criteria should be defined more clearly in the text

Response: We have clarified the inclusion criteria in the Methods section.

3. Systematic reviews and meta-analyses are relatively new; the first papers date to the late seventies of the previous century. This should be considered when reviewing papers.

Response: Yes, it is a fairly new discipline; however, making research evidence based through the use of systematic reviews and meta-analyses has been recommended for many years. Our aim was therefore to look at meta-research in a broad sense, using previously published studies that investigate how large a percentage of new health science studies use systematic reviews as justification.

4. The risk of redundancy cannot be well defined from the meta-research papers; rather, it should be assessed from the original articles. This would not be possible unless a focused issue is chosen as an example.

Response: Risk of redundancy can, from our perspective, be thoroughly assessed by the use of systematic reviews that include meta-analyses, and cumulative meta-analyses in particular can pinpoint this within a specific research topic. We therefore agree that we cannot tie it to a specific field, but we have taken this meta-research perspective to provide a more global status of the topic.

We hope you can follow our reasoning.

5. The different disciplines have different research outputs as the basis for systematic reviews, which makes the comparison difficult.

Response: In this paper, we did not look at the output but at the "input", so to speak, as we assessed whether the authors used systematic reviews as justification when initiating a new study in health science. We agree that it is important to define the aim, approach and outcomes more specifically when looking into a specific topic.

No changes to text.

6. I realize some studies are based on the authors' disclosure of whether they have used previous systematic reviews or not. This should be confirmed by evidence.

These should be mentioned as the limitations of this work.

Response: We agree on this point and have clarified in the limitations that we have taken at face value what was reported by the authors of the included studies.

Changes to text: Discussion, Strengths and Limitations section:

This may raise questions as to the validity of our findings, as the majority of the meta-research studies only provide an indication of the citation of systematic reviews to justify new studies, not whether the systematic review was relevant, recent or of high quality, or even how the systematic review was used. We did not assess this further either.

Submitted filename: Response letter_25042022.docx

Decision Letter 1

19 Sep 2022

PONE-D-22-02383R1: Justification of research using systematic reviews continues to be inconsistent in clinical health science - a systematic review and meta-analysis of meta-research studies (PLOS ONE)

Please submit your revised manuscript by Nov 03 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

2. Is the manuscript technically sound, and do the data support the conclusions?

3. Has the statistical analysis been performed appropriately and rigorously?

4. Have the authors made all data underlying the findings in their manuscript fully available?

5. Is the manuscript presented in an intelligible fashion and written in standard English?

6. Review Comments to the Author

Reviewer #1: I am happy with the responses and thank the authors for their detailed replies, but just had 2 minor issues

This probably reflects my ignorance so apologies to the authors, but I still do not understand the relationship between the 95% CI around the pooled percentage, and the 'broad prediction interval' which follows it. Could they add a line to explain?

There are some typos remaining. The phrase 'regardless study type' should read 'regardless of study type'. There are some rogue apostrophes in the tables (SR's, RCT's) which need to be edited

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

Reviewer #1: No

Author response to Decision Letter 1

21 Sep 2022

Response letter

Thank you for the opportunity to revise the manuscript, and thank you to the reviewer for the relevant comments. We have revised the manuscript in accordance with these comments, addressing all issues from the editor and the reviewers below.

Reviewer #1: I am happy with the responses and thank the authors for their detailed replies, but just had 2 minor issues

Response: Thank you very much.

Response: We have revised the text and explained the relationship between the confidence interval and the prediction interval in more detail, and we hope the revised text makes this clearer.

Where the confidence interval shows the precision of the pooled estimate in a meta-analysis, the prediction interval shows the distribution of the individual studies. The heterogeneity in the meta-analysis, assessed by I², was 94%. The clinical interpretation of this large heterogeneity is seen in the very broad prediction interval ranging from 16% to 71%, meaning that, based on these studies, there is a 95% chance that the result of the next study will show a prevalence between 16% and 71%.
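
For readers who want the arithmetic behind this distinction, the approximate 95% prediction interval in a random-effects meta-analysis is commonly computed as sketched below. This is a standard textbook formulation rather than a quotation from the manuscript, and the symbols are generic:

% Approximate 95% prediction interval for the result of a new study (random-effects model)
% \hat{\mu}             : pooled estimate (often on a transformed scale for prevalences)
% \mathrm{SE}(\hat{\mu}) : standard error of the pooled estimate (this alone gives the confidence interval)
% \hat{\tau}^2           : estimated between-study variance;  k : number of included studies
\[
  \hat{\mu} \;\pm\; t_{k-2}^{0.975}\,\sqrt{\hat{\tau}^{2} + \mathrm{SE}(\hat{\mu})^{2}}
\]
% Because \hat{\tau}^2 is added under the square root, high heterogeneity (here I^2 = 94%)
% widens the prediction interval far beyond the confidence interval.

If the pooled analysis is carried out on a transformed scale, the two limits are back-transformed to percentages, which is how an interval such as 16% to 71% can be reported.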

There are some typos remaining. The phrase 'regardless study type' should read 'regardless of study type'.

Response: Thank you, we have revised as suggested.

There are some rogue apostrophes in the tables (SR's, RCT's) which need to be edited

Response: Thank you for pointing this out. We have edited this now.

On behalf of the author group,

Submitted filename: Response letter 20092022.docx

Decision Letter 2

18 Oct 2022

Justification of research using systematic reviews continues to be inconsistent in clinical health science - a systematic review and meta-analysis of meta-research studies

PONE-D-22-02383R2

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Additional Editor Comments (optional):

Reviewer #1: (No Response)

Acceptance letter

21 Oct 2022

Dear Dr. Andreasen:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

PLOS ONE Editorial Office Staff

on behalf of

Dr. Andrzej Grzybowski

learnonline

Research proposal, thesis, exegesis, and journal article writing for business, social science and humanities (BSSH) research degree candidates

Topic outline, introduction and research justification.

Introduction and research justification, business, social sciences, humanities

Introduction.

  • Signalling the topic in the first sentence
  • The research justification or 'problem' statement 
  • The 'field' of literature
  • Summary of contrasting areas of research
  • Summary of the 'gap' in the literature
  • Research aims and objectives

Summary of the research design

Example research proposal introductions.

This topic outlines the steps in the introduction of the research proposal. As discussed in the first topic in this series of web resources, there are three key elements or conceptual steps within the main body of the research proposal. In this resource, these elements are referred to as the research justification, the literature review and the research design. These three steps also structure the proposal introduction (typically, though not always, in this order), which contains an outline of the proposed research.

These steps pertain to the key questions of reviewers:

  • What problem or issue does the research address? (research justification)
  • How will the research contribute to existing knowledge? (the 'gap' in the literature, sometimes referred to as the research 'significance')
  • How will the research achieve its stated objectives? (the research design)

Reviewers look to find a summary of the case for the research in the introduction, which, in essence, involves providing summary answers to each of the questions above.

The introduction of the research proposal usually includes the following content:

  • a research justification or statement of a problem (which also serves to introduce the topic)
  • a summary of the key points of the literature review (a summary of what is known and how the research aims to contribute to what is known)
  • the research aim or objective
  • a summary of the research design
  • concise definitions of any contested or specialised terms that will be used throughout the proposal (provided the first time the term is used).

This topic will consider how to write about each of these in turn.

Signalling the topic in the first sentence

The first task of the research proposal is to signal the area of the research or 'topic' so the reader knows what subject will be discussed in the proposal. This step is ideally accomplished in the opening sentence or the opening paragraph of the research proposal. It is also indicated in the title of the research proposal. It is important not to provide tangential information in the opening sentence or title because this may mislead the reader about the core subject of the proposal.

A ‘topic’ includes:

  • the context or properties of the subject (the particular aspect or properties of the subject that are of interest).

Questions to consider in helping to clarify the topic:

  • What is the focus of my research?
  • What do I want to understand?
  • What domain/s of activity does it pertain to?
  • What will I investigate in order to shed light on my focus?

The research justification or the ‘problem’ statement

The goal of the first step of the research proposal is to get your audience's attention: to show them why your research matters and to make them want to know more about it. The first step within the research proposal is sometimes referred to as the research justification or the statement of the 'problem'. This step involves providing the reader with critical background or contextual information that introduces the topic area and indicates why the research is important. Research proposals often open by outlining a central concern, issue, question or conundrum to which the research relates.

The research justification should be provided in an accessible and direct manner in the introductory section of the research proposal. The number of words required to complete this first conceptual step will vary widely depending on the project.

Writing about the research justification, like writing about the literature and your research design, is a creative process involving careful decision making on your part. The research justification should lead up to the topic of your research and frame your research, and, when you write your thesis, exegesis or journal article conclusion, you will again return to the research justification to wrap up the implications of your research. That is to say, your conclusions will refer back to the problem and reflect on what the findings suggest about how we should treat the problem. For this reason, you may find the need to go back and reframe your research justification as your research and writing progresses.

The most common way of establishing the importance of the research is to refer to a real world problem. Research may aim to produce knowledge that will ultimately be used to:

  • advance national and organisational goals (health, clean environment, quality education),
  • improve policies and regulations,
  • manage risk,
  • contribute to economic development,
  • promote peace and prosperity,
  • promote democracy,
  • test assumptions (theoretical, popular, policy) about human behaviour, the economy, society,
  • understand human behaviour, the economy and social experience,
  • understand or critique social processes and values.

Examples of 'research problems' in opening sentences and paragraphs of research writing

Management The concept of meritocracy is one replicated and sustained in much discourse around organisational recruitment, retention and promotion. Women have a firm belief in the concept of merit, believing that hard work, education and talent will in the end be rewarded (McNamee and Miller, 2004). This belief in workplace meritocracy could in part be due to the advertising efforts of employers themselves, who, since the early 1990s, attempt to attract employees through intensive branding programs and aggressive advertising which emphasise equality of opportunity. The statistics, however, are less than convincing, with 2008 data from the Equal Employment for Women in the Workplace agency signalling that women are disproportionately represented in senior management levels compared to men, and that the numbers of women at Chief Executive Officer level in corporate Australia have actually decreased (Equal Opportunity for Women Agency, 2008). Women, it seems, are still unable to shatter the glass ceiling and are consistently overlooked at executive level.

Psychology Tension-type headache is extremely prevalent and is associated with significant personal and social costs.

Education One of the major challenges of higher education health programs is developing the cognitive abilities that will assist undergraduate students' clinical decision making. This is achieved by stimulating enquiry analysis, creating independent judgement and developing cognitive skills that are in line with graduate practice (Hollingworth and McLoughlin 2001; Bedard, 1996).

Visual arts In the East, the traditional idea of the body was not as something separate from the mind. In the West, however, the body is still perceived as separate, as a counterpart of the mind. The body is increasingly at the centre of the changing cultural environment, particularly the increasingly visual culture exemplified by the ubiquity of the image, the emergence of virtual reality, voyeurism and surveillance culture. Within the contemporary visual environment, the body's segregation from the mind has become more intense than ever, conferring upon the body a 'being watched' or 'manufacturable' status, further undermining the sense of the body as an integral part of our being.

Literature review summary

The next step following the research justification in the introduction is the literature review summary statement. This part of the introduction summarises the literature review section of the research proposal, providing a concise statement that signals the field of research and the rationale for the research question or aim.

It can be helpful to think about the literature review element as comprised of four parts. The first is a reference to the field or discipline the research will contribute to. The second is a summary of the main questions, approaches or accepted conclusions in your topic area in the field or discipline at present ('what is known'). This summary of existing research acts as a contrast to highlight the significance of the third part, your statement of a 'gap'. The fourth part rephrases this 'gap' in the form of a research question, aim, objective or hypothesis.

For example

Scholars writing about ... (the problem area) in the field of ... (discipline or sub-discipline, part one) have observed that ... ('what is known', part two). Others describe ... ('what is known', part two). A more recent perspective chronicles changes that, in broad outline, parallel those that have occurred in ... ('what is known', part two). This study differs from these approaches in that it considers ... ('gap', research focus, part three). This research draws on ... to consider ... (research objective, part four).  

More information about writing these four parts of the literature review summary is provided below.

1. The 'field' of literature

The field of research is the academic discipline within which your research is situated, and to which it will contribute. Some fields grow out of a single discipline; others are multidisciplinary. The field or discipline is linked to university courses and research, academic journals, conferences and other academic associations, and some book publishers. It also describes the expertise of thesis supervisors and examiners.

The discipline defines the kinds of approaches, theories, methods and styles of writing adopted by scholars and researchers working within it.

For a list of academic disciplines have a look at the wikipedia site at: https://en.wikipedia.org/wiki/List_of_academic_disciplines

The field or discipline is not the same as the topic of the research. The topic is the subject matter or foci of your research. Disciplines or 'fields' refer to globally recognised areas of research and scholarship.

The field or discipline the research aims to contribute to can be signalled in a few key words within the literature review summary, or possibly earlier within the research justification.

Sentence stems to signal the field of research 

  • Within the field of ... there is now agreement that ... .
  • The field of ... is marked by ongoing debate about ... .
  • Following analysis of ... the field of ... turned to an exploration of ... .

2. A summary of contrasting areas of research or what is 'known'

The newness or significance of what you are doing is typically established in a contrast or dialogue with other research and scholarship. The 'gap' (or hole in the donut) only becomes apparent against the surrounding literature (the donut itself). Sometimes a contrast is provided to show that you are working in a different area to what has been done before, or to show that you are building on previous work, or perhaps working on an unresolved issue within a discipline. It might also be that the approaches of other disciplines to the same problem area or focus are introduced to highlight a new angle on the topic.

3. The summary of the 'gap' in the literature

The 'gap' in the field typically refers to the explanation provided to support the research question. Questions or objectives grow out of areas of uncertainty, or gaps, in the field of research. In most cases, you will not know what the gap in knowledge is until you have reviewed the literature and written up a good part of the literature review section of the proposal. It is often not possible therefore to confidently write the 'gap' statement until you have done considerable work on the literature review. Once your literature review section is sufficiently developed, you can summarise the missing piece of knowledge in a brief statement in the introduction.

Sentence stems for summarising a 'gap' in the literature

Indicate a gap in the previous research by raising a question about it, or extending previous knowledge in some way:

  • However, there is little information/attention/work/data/research on … .
  • However, few studies/investigations/researchers/attempt to … .

Often steps two and three blend together in the same sentence, as in the sentence stems below.

Sentence stems which both introduce research in the field (what is 'known') and summarise a 'gap'

  • The research has tended to focus on … (introduce existing field foci), rather than on … ('gap').
  • These studies have emphasised that … (introduce what is known), but it remains unclear whether … ('gap').
  • Although considerable research has been devoted to … (introduce field areas), rather less attention has been paid to … ('gap').

The 'significance' of the research

When writing the research proposal, it is useful to think about the research justification and the 'gap in the literature' as two distinct conceptual elements, each of which must be established separately. Stating a real world problem or outlining a conceptual or other conundrum or concern is typically not, in itself, enough to justify the research. Similarly, establishing that there is a gap in the literature is often not enough on its own to persuade the reader that the research is important. In the first case, reviewers may still wonder 'perhaps the problem or concern has already been addressed in the literature', or, in the second, 'little has been done on this focus, but perhaps that is because it is not important'. The proposal will ideally establish both that the research is important and that it will provide something new to the field of knowledge.

In effect, the research justification and the literature review work together to establish the benefit, contribution or 'significance' of the research. The 'significance' of the research is established not in a statement to be incorporated into the proposal, but as something the first two sections of the proposal work to establish. Research is significant when it pertains to something important, and when it provides new knowledge or insights within a field of knowledge.

4. The research aim or objective

The research aim is usually expressed as a concise statement at the close of the literature review. It may be referred to as an objective, a question or an aim; these terms are often used interchangeably to refer to the focus of the investigation. The research focus is the question at the heart of the research, designed to produce new knowledge. To avoid confusing the reader about the purpose of the research, it is best to express it consistently as either an aim, an objective or a question. It is also important to frame the aims of the research succinctly, in no more than three dot points, say. The aim/objective/question should be framed in more or less the same way wherever it appears in the proposal. This ensures the research focus is clear.

Language use

Research generally aims to produce knowledge, as opposed to, say, recommendations, policy or social change. Research may support policy or social change, and eventually produce it in some of its applications, but it does not typically produce it (with the possible exception of action research). For this reason, aims and objectives are framed in terms of knowledge production, using phrases like:

  • to increase understanding, insight, clarity;
  • to evaluate and critique;
  • to test models, theory, or strategies.

These are all knowledge outcomes that can be achieved within the research process.

Reflecting your social philosophy in the research aim

A well written research aim typically carries within it information about the philosophical approach the research will take, even if the researcher is not themselves aware of it, or if the proposal does not discuss philosophy or social theory at any length. If you are interested in social theory, you might consider framing your aim such that it reflects your philosophical or theoretical approach. Since your philosophical approach reflects your beliefs about how 'valid' knowledge can be gained, and therefore the types of questions you ask, it follows that it will be evident within your statement of the research aim. Researchers, variously, hold that knowledge of the world arises through:

  • observations of phenomena (measurements of what we can see, hear, taste, touch);
  • the interactions between interpreting human subjects and objective phenomena in the world;
  • ideology shaped by power, which we may be unconscious of, and which must be interrogated and replaced with knowledge that reflects people's true interests; 
  • the structure of language and of the unconscious;
  • the play of historical relations between human actions, institutional practices and prevailing discourses;
  • metaphoric and other linguistic relations established within language and text.

The philosophical perspective underpinning your research is then reflected in the research aim. For example, depending upon your philosophical perspective, you may aim to find out about:

  • observable phenomena or facts;
  • shared cultural meanings of practices, rituals, events that determine how objective phenomena are interpreted and experienced;
  • social structures and political ideologies that shape experience and distort authentic or empowered experience;
  • the structure of language;
  • the historical evolution of networks of discursive and extra-discursive practices;
  • emerging or actual phenomena untainted by existing representation.

You might check your aim statement to ensure it reflects the philosophical perspective you claim to adopt in your proposal. Check that there are no contradictions in your philosophical claims and that you are consistent in your approach. For assistance with this you may find the Social philosophy of research resources helpful.

Sentence stems for aims and objectives

  • The purpose of this research project is to … .
  • The purpose of this investigation is to … .
  • The aim of this research project is to … .
  • This study is designed to … .

The next step or key element in the research proposal is the research design. The research design explains how the research aims will be achieved. Within the introduction a summary of the overall research design can make the project more accessible to the reader.

The summary statement of the research design within the introduction might include:

  • the method/s that will be used (interviews, surveys, video observation, diary recording);
  • if the research will be phased, how many phases, and what methods will be used in each phase;
  • brief reference to how the data will be analysed.

The statement of the research design is often the last thing discussed in the research proposal introduction.

NB. It is not necessary to explain that a literature review and a detailed outline of the methods and methodology will follow, because academic readers will assume this.

Title: Aboriginal cultural values and economic sustainability: A case study of agro-forestry in a remote Aboriginal community

Further examples can be found at the end of this topic.

In summary, the introduction contains a problem statement, or explanation of why the research is important to the world, a summary of the literature review, and a summary of the research design. The introduction enables the reviewer, as well as yourself and your supervisory team, to assess the logical connections between the research justification, the 'gap' in the literature, the research aim and the research design without getting lost in the detail of the project. In this sense, the introduction serves as a kind of map or abstract of the proposed research as well as of the main body of the research proposal.

The following questions may be useful in assessing your research proposal introduction.

  • Have I clearly signalled the research topic in the key words and phrases used in the first sentence and title of the research proposal?
  • Have I explained why my research matters, and the problem or issue that underlies the research, in the opening sentences, paragraphs and page/s?
  • Have I used literature, examples or other evidence to substantiate my understanding of the key issues?
  • Have I explained the problem in a way that grabs the reader’s attention and concern?
  • Have I indicated the field/s within which my research is situated using key words that are recognised by other scholars?
  • Have I provided a summary of previous research and outlined a 'gap' in the literature?
  • Have I provided a succinct statement of the objectives or aims of my research?
  • Have I provided a summary of the research phases and methods?

This resource was developed by Wendy Bastalich.

Research Paper Format – Types, Examples and Templates

Research Paper Formats

Research paper format is an essential aspect of academic writing that plays a crucial role in the communication of research findings. The format of a research paper depends on factors such as the discipline, the style guide, and the purpose of the research. It covers guidelines for the structure, citation style, referencing, and other elements of the paper that contribute to its overall presentation and coherence. Adhering to the appropriate format is vital for ensuring that the research is communicated accurately and effectively to the intended audience. This post provides an overview of some of the common research paper formats used in academic writing.

Research Paper Formats are as follows:

  • APA (American Psychological Association) format
  • MLA (Modern Language Association) format
  • Chicago/Turabian style
  • IEEE (Institute of Electrical and Electronics Engineers) format
  • AMA (American Medical Association) style
  • Harvard style
  • Vancouver style
  • ACS (American Chemical Society) style
  • ASA (American Sociological Association) style
  • APSA (American Political Science Association) style

APA (American Psychological Association) Format

Here is a general APA format for a research paper:

  • Title Page: The title page should include the title of your paper, your name, and your institutional affiliation. It should also include a running head, which is a shortened version of the title, and a page number in the upper right-hand corner.
  • Abstract: The abstract is a brief summary of your paper, typically 150-250 words. It should include the purpose of your research, the main findings, and any implications or conclusions that can be drawn.
  • Introduction: The introduction should provide background information on your topic, state the purpose of your research, and present your research question or hypothesis. It should also include a brief literature review that discusses previous research on your topic.
  • Methods: The methods section should describe the procedures you used to collect and analyze your data. It should include information on the participants, the materials and instruments used, and the statistical analyses performed.
  • Results: The results section should present the findings of your research in a clear and concise manner. Use tables and figures to help illustrate your results.
  • Discussion: The discussion section should interpret your results and relate them back to your research question or hypothesis. It should also discuss the implications of your findings and any limitations of your study.
  • References: The references section should include a list of all sources cited in your paper. Follow APA formatting guidelines for your citations and references.

Some additional tips for formatting your APA research paper (a LaTeX sketch approximating these settings follows this list):

  • Use 12-point Times New Roman font throughout the paper.
  • Double-space all text, including the references.
  • Use 1-inch margins on all sides of the page.
  • Indent the first line of each paragraph by 0.5 inches.
  • Use a hanging indent for the references (the first line should be flush with the left margin, and all subsequent lines should be indented).
  • Number all pages, including the title page and references page, in the upper right-hand corner.
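
If you prepare the manuscript in LaTeX rather than a word processor, the settings above can be approximated with a short preamble. The sketch below is illustrative only; it relies on the widely used geometry, setspace, mathptmx and fancyhdr packages and is not an official APA template (dedicated classes such as apa7 handle full compliance, including the running head and the hanging indent for references):

\documentclass[12pt]{article}
\usepackage[margin=1in]{geometry}   % 1-inch margins on all sides
\usepackage{mathptmx}               % Times-like 12-point serif font
\usepackage{setspace}               % provides \doublespacing
\usepackage{fancyhdr}               % custom header for the page number
\doublespacing                      % double-space all text
\setlength{\parindent}{0.5in}       % indent the first line of each paragraph
\pagestyle{fancy}
\fancyhf{}                          % clear the default header and footer
\fancyhead[R]{\thepage}             % page number in the upper right-hand corner
\renewcommand{\headrulewidth}{0pt}  % no rule under the header
\begin{document}
Body text goes here, double-spaced, with half-inch paragraph indents.
\end{document}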

APA Research Paper Format Template

APA Research Paper Format Template is as follows:

Title Page:

  • Title of the paper
  • Author’s name
  • Institutional affiliation
Abstract:

  • A brief summary of the main points of the paper, including the research question, methods, findings, and conclusions. The abstract should be no more than 250 words.

Introduction:

  • Background information on the topic of the research paper
  • Research question or hypothesis
  • Significance of the study
  • Overview of the research methods and design
  • Brief summary of the main findings

Methods:

  • Participants: description of the sample population, including the number of participants and their characteristics (age, gender, ethnicity, etc.)
  • Materials: description of any materials used in the study (e.g., survey questions, experimental apparatus)
  • Procedure: detailed description of the steps taken to conduct the study

Results:

  • Presentation of the findings of the study, including statistical analyses if applicable
  • Tables and figures may be included to illustrate the results

Discussion:

  • Interpretation of the results in light of the research question and hypothesis
  • Implications of the study for the field
  • Limitations of the study
  • Suggestions for future research

References:

  • A list of all sources cited in the paper, in APA format

Formatting guidelines:

  • Double-spaced
  • 12-point font (Times New Roman or Arial)
  • 1-inch margins on all sides
  • Page numbers in the top right corner
  • Headings and subheadings should be used to organize the paper
  • The first line of each paragraph should be indented
  • Quotations of 40 or more words should be set off in a block quote with no quotation marks
  • In-text citations should include the author’s last name and year of publication (e.g., Smith, 2019)

APA Research Paper Format Example

APA Research Paper Format Example is as follows:

The Effects of Social Media on Mental Health

University of XYZ

This study examines the relationship between social media use and mental health among college students. Data was collected through a survey of 500 students at the University of XYZ. Results suggest that social media use is significantly related to symptoms of depression and anxiety, and that the negative effects of social media are greater among frequent users.

Social media has become an increasingly important aspect of modern life, especially among young adults. While social media can have many positive effects, such as connecting people across distances and sharing information, there is growing concern about its impact on mental health. This study aims to examine the relationship between social media use and mental health among college students.

Participants: Participants were 500 college students at the University of XYZ, recruited through online advertisements and flyers posted on campus. Participants ranged in age from 18 to 25, with a mean age of 20.5 years. The sample was 60% female, 40% male, and 5% identified as non-binary or gender non-conforming.

Data was collected through an online survey administered through Qualtrics. The survey consisted of several measures, including the Patient Health Questionnaire-9 (PHQ-9) for depression symptoms, the Generalized Anxiety Disorder-7 (GAD-7) for anxiety symptoms, and questions about social media use.

Procedure:

Participants were asked to complete the online survey at their convenience. The survey took approximately 20-30 minutes to complete. Data was analyzed using descriptive statistics, correlations, and multiple regression analysis.

Results indicated that social media use was significantly related to symptoms of depression (r = .32, p < .001) and anxiety (r = .29, p < .001). Regression analysis indicated that frequency of social media use was a significant predictor of both depression symptoms (β = .24, p < .001) and anxiety symptoms (β = .20, p < .001), even when controlling for age, gender, and other relevant factors.

The results of this study suggest that social media use is associated with symptoms of depression and anxiety among college students. The negative effects of social media are greater among frequent users. These findings have important implications for mental health professionals and educators, who should consider addressing the potential negative effects of social media use in their work with young adults.

References :

References should be listed in alphabetical order according to the author’s last name. For example:

  • Chou, H. T. G., & Edge, N. (2012). “They are happier and having better lives than I am”: The impact of using Facebook on perceptions of others’ lives. Cyberpsychology, Behavior, and Social Networking, 15(2), 117-121.
  • Twenge, J. M., Joiner, T. E., Rogers, M. L., & Martin, G. N. (2018). Increases in depressive symptoms, suicide-related outcomes, and suicide rates among U.S. adolescents after 2010 and links to increased new media screen time. Clinical Psychological Science, 6(1), 3-17.

Note: This is just a sample example; do not use it in your assignment.

MLA (Modern Language Association) Format

MLA (Modern Language Association) Format is as follows:

  • Page Layout: Use 8.5 x 11-inch white paper, with 1-inch margins on all sides. The font should be 12-point Times New Roman or a similar serif font.
  • Heading and Title: The first page of your research paper should include a heading and a title. The heading should include your name, your instructor’s name, the course title, and the date. The title should be centered and in title case (capitalizing the first letter of each important word).
  • In-Text Citations: Use parenthetical citations to indicate the source of your information. The citation should include the author’s last name and the page number(s) of the source. For example: (Smith 23).
  • Works Cited Page: At the end of your paper, include a Works Cited page that lists all the sources you used in your research. Each entry should include the author’s name, the title of the work, the publication information, and the medium of publication.
  • Formatting Quotations: Use double quotation marks for short quotations and block quotations for longer quotations. Indent the entire quotation five spaces from the left margin.
  • Formatting the Body: Use a clear and readable font and double-space your text throughout. The first line of each paragraph should be indented one-half inch from the left margin.

MLA Research Paper Template

MLA Research Paper Format Template is as follows:

  • Use 8.5 x 11 inch white paper.
  • Use a 12-point font, such as Times New Roman.
  • Use double-spacing throughout the entire paper, including the title page and works cited page.
  • Set the margins to 1 inch on all sides.
  • Use page numbers in the upper right corner, beginning with the first page of text.
  • Include a centered title for the research paper, using title case (capitalizing the first letter of each important word).
  • Include your name, instructor’s name, course name, and date in the upper left corner, double-spaced.

In-Text Citations

  • When quoting or paraphrasing information from sources, include an in-text citation within the text of your paper.
  • Use the author’s last name and the page number in parentheses at the end of the sentence, before the punctuation mark.
  • If the author’s name is mentioned in the sentence, only include the page number in parentheses.

Works Cited Page

  • List all sources cited in alphabetical order by the author’s last name.
  • Each entry should include the author’s name, title of the work, publication information, and medium of publication.
  • Use italics for book and journal titles, and quotation marks for article and chapter titles.
  • For online sources, include the date of access and the URL.

Headings and Subheadings

  • Use headings and subheadings to organize your paper and make it easier to read.
  • Use numerals to number your headings and subheadings (e.g. 1, 2, 3), and capitalize the first letter of each word.
  • The main heading should be centered and in boldface type, while subheadings should be left-aligned and in italics.
  • Use only one space after each period or punctuation mark.
  • Use quotation marks to indicate direct quotes from a source.
  • If the quote is more than four lines, format it as a block quote, indented one inch from the left margin and without quotation marks.
  • Use ellipses (…) to indicate omitted words from a quote, and brackets ([…]) to indicate added words.

Works Cited Examples

  • Book: Last Name, First Name. Title of Book. Publisher, Publication Year.
  • Journal Article: Last Name, First Name. “Title of Article.” Title of Journal, volume number, issue number, publication date, page numbers.
  • Website: Last Name, First Name. “Title of Webpage.” Title of Website, publication date, URL. Accessed date.

Here is an example of how a works cited entry for a book should look:

Smith, John. The Art of Writing Research Papers. Penguin, 2021.

MLA Research Paper Example

MLA Research Paper Format Example is as follows:

Your Name

Your Professor’s Name

Course Name and Number

Date (in Day Month Year format)

Word Count (not including title page or Works Cited)

Title: The Impact of Video Games on Aggression Levels

Video games have become a popular form of entertainment among people of all ages. However, the impact of video games on aggression levels has been a subject of debate among scholars and researchers. While some argue that video games promote aggression and violent behavior, others argue that there is no clear link between video games and aggression levels. This research paper aims to explore the impact of video games on aggression levels among young adults.

Background:

The debate on the impact of video games on aggression levels has been ongoing for several years. According to the American Psychological Association, exposure to violent media, including video games, can increase aggression levels in children and adolescents. However, some researchers argue that there is no clear evidence to support this claim. Several studies have been conducted to examine the impact of video games on aggression levels, but the results have been mixed.

Methodology:

This research paper used a quantitative research approach to examine the impact of video games on aggression levels among young adults. A sample of 100 young adults between the ages of 18 and 25 was selected for the study. The participants were asked to complete a questionnaire that measured their aggression levels and their video game habits.

Results:

The results of the study showed that there was a significant correlation between video game habits and aggression levels among young adults. The participants who reported playing violent video games for more than 5 hours per week had higher aggression levels than those who played less than 5 hours per week. The study also found that male participants were more likely to play violent video games and had higher aggression levels than female participants.

Discussion:

The findings of this study support the claim that video games can increase aggression levels among young adults. However, it is important to note that the study only examined the impact of video games on aggression levels and did not take into account other factors that may contribute to aggressive behavior. It is also important to note that not all video games promote violence and aggression, and some games may have a positive impact on cognitive and social skills.

Conclusion:

In conclusion, this research paper provides evidence to support the claim that video games can increase aggression levels among young adults. However, it is important to conduct further research to examine the impact of video games on other aspects of behavior and to explore the potential benefits of video games. Parents and educators should be aware of the potential impact of video games on aggression levels and should encourage young adults to engage in a variety of activities that promote cognitive and social skills.

Works Cited:

  • American Psychological Association. (2017). Violent Video Games: Myths, Facts, and Unanswered Questions. Retrieved from https://www.apa.org/news/press/releases/2017/08/violent-video-games
  • Ferguson, C. J. (2015). Do Angry Birds make for angry children? A meta-analysis of video game influences on children’s and adolescents’ aggression, mental health, prosocial behavior, and academic performance. Perspectives on Psychological Science, 10(5), 646-666.
  • Gentile, D. A., Swing, E. L., Lim, C. G., & Khoo, A. (2012). Video game playing, attention problems, and impulsiveness: Evidence of bidirectional causality. Psychology of Popular Media Culture, 1(1), 62-70.
  • Greitemeyer, T. (2014). Effects of prosocial video games on prosocial behavior. Journal of Personality and Social Psychology, 106(4), 530-548.

Chicago/Turabian Style

Chicago/Turabian Format is as follows:

  • Margins: Use 1-inch margins on all sides of the paper.
  • Font: Use a readable font such as Times New Roman or Arial, and use a 12-point font size.
  • Page numbering: Number all pages in the upper right-hand corner, beginning with the first page of text. Use Arabic numerals.
  • Title page: Include a title page with the title of the paper, your name, course title and number, instructor’s name, and the date. The title should be centered on the page and in title case (capitalize the first letter of each word).
  • Headings: Use headings to organize your paper. The first level of headings should be centered and in boldface or italics. The second level of headings should be left-aligned and in boldface or italics. Use as many levels of headings as necessary to organize your paper.
  • In-text citations: Use footnotes or endnotes to cite sources within the text of your paper. The first citation for each source should be a full citation, and subsequent citations can be shortened. Use superscript numbers to indicate footnotes or endnotes (see the illustrative LaTeX sketch after this list).
  • Bibliography: Include a bibliography at the end of your paper, listing all sources cited in your paper. The bibliography should be in alphabetical order by the author’s last name, and each entry should include the author’s name, title of the work, publication information, and date of publication.
  • Formatting of quotations: Use block quotations for quotations that are longer than four lines. Indent the entire quotation one inch from the left margin, and do not use quotation marks. Single-space the quotation, and double-space between paragraphs.
  • Tables and figures: Use tables and figures to present data and illustrations. Number each table and figure sequentially, and provide a brief title for each. Place tables and figures as close as possible to the text that refers to them.
  • Spelling and grammar: Use correct spelling and grammar throughout your paper. Proofread carefully for errors.
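
If you typeset in LaTeX, the note-style citations described above can be sketched with plain footnotes, as shown below. The wording and page numbers are invented for illustration (the book is the fictitious example used earlier on this page), and packages such as biblatex-chicago automate the full notes-and-bibliography system:

\documentclass[12pt]{article}
\begin{document}
Recent scholarship has questioned this view.\footnote{John Smith,
\emph{The Art of Writing Research Papers} (Penguin, 2021), 45.} % full note at first citation
Later citations of the same work can be shortened.\footnote{Smith,
\emph{Art of Writing Research Papers}, 47.} % shortened note for subsequent citations
\end{document}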

Chicago/Turabian Research Paper Template

Chicago/Turabian Research Paper Template is as follows:

Title of Paper

Name of Student

Professor’s Name

I. Introduction

A. Background Information

B. Research Question

C. Thesis Statement

II. Literature Review

A. Overview of Existing Literature

B. Analysis of Key Literature

C. Identification of Gaps in Literature

III. Methodology

A. Research Design

B. Data Collection

C. Data Analysis

IV. Results

A. Presentation of Findings

B. Analysis of Findings

C. Discussion of Implications

V. Conclusion

A. Summary of Findings

B. Implications for Future Research

C. Conclusion

VI. References

A. Bibliography

B. In-Text Citations

VII. Appendices (if necessary)

A. Data Tables

C. Additional Supporting Materials

Chicago/Turabian Research Paper Example

Title: The Impact of Social Media on Political Engagement

Name: John Smith

Class: POLS 101

Professor: Dr. Jane Doe

Date: April 8, 2023

I. Introduction:

Social media has become an integral part of our daily lives. People use social media platforms like Facebook, Twitter, and Instagram to connect with friends and family, share their opinions, and stay informed about current events. With the rise of social media, there has been a growing interest in understanding its impact on various aspects of society, including political engagement. In this paper, I will examine the relationship between social media use and political engagement, specifically focusing on how social media influences political participation and political attitudes.

II. Literature Review:

There is a growing body of literature on the impact of social media on political engagement. Some scholars argue that social media has a positive effect on political participation by providing new channels for political communication and mobilization (Delli Carpini & Keeter, 1996; Putnam, 2000). Others, however, suggest that social media can have a negative impact on political engagement by creating filter bubbles that reinforce existing beliefs and discourage political dialogue (Pariser, 2011; Sunstein, 2001).

III. Methodology:

To examine the relationship between social media use and political engagement, I conducted a survey of 500 college students. The survey included questions about social media use, political participation, and political attitudes. The data was analyzed using descriptive statistics and regression analysis.

IV. Results:

The results of the survey indicate that social media use is positively associated with political participation. Specifically, respondents who reported using social media to discuss politics were more likely to have participated in a political campaign, attended a political rally, or contacted a political representative. Additionally, social media use was found to be associated with more positive attitudes towards political engagement, such as increased trust in government and belief in the effectiveness of political action.

V. Conclusion:

The findings of this study suggest that social media has a positive impact on political engagement, by providing new opportunities for political communication and mobilization. However, there is also a need for caution, as social media can also create filter bubbles that reinforce existing beliefs and discourage political dialogue. Future research should continue to explore the complex relationship between social media and political engagement, and develop strategies to harness the potential benefits of social media while mitigating its potential negative effects.

VI. References:

  • Delli Carpini, M. X., & Keeter, S. (1996). What Americans know about politics and why it matters. Yale University Press.
  • Pariser, E. (2011). The filter bubble: What the Internet is hiding from you. Penguin.
  • Putnam, R. D. (2000). Bowling alone: The collapse and revival of American community. Simon & Schuster.
  • Sunstein, C. R. (2001). Republic.com. Princeton University Press.

IEEE (Institute of Electrical and Electronics Engineers) Format

IEEE (Institute of Electrical and Electronics Engineers) Research Paper Format is as follows:

  • Title: A concise and informative title that accurately reflects the content of the paper.
  • Abstract: A brief summary of the paper, typically no more than 250 words, that includes the purpose of the study, the methods used, the key findings, and the main conclusions.
  • Introduction: An overview of the background, context, and motivation for the research, including a clear statement of the problem being addressed and the objectives of the study.
  • Literature review: A critical analysis of the relevant research and scholarship on the topic, including a discussion of any gaps or limitations in the existing literature.
  • Methodology: A detailed description of the methods used to collect and analyze data, including any experiments or simulations, data collection instruments or procedures, and statistical analyses.
  • Results: A clear and concise presentation of the findings, including any relevant tables, graphs, or figures.
  • Discussion: A detailed interpretation of the results, including a comparison of the findings with previous research, a discussion of the implications of the results, and any recommendations for future research.
  • Conclusion: A summary of the key findings and main conclusions of the study.
  • References: A list of all sources cited in the paper, formatted according to IEEE guidelines.

In addition to these elements, an IEEE research paper should also follow certain formatting guidelines, including using 12-point font, double-spaced text, and numbered headings and subheadings. Additionally, any tables, figures, or equations should be clearly labeled and referenced in the text.

AMA (American Medical Association) Style

AMA (American Medical Association) Style Research Paper Format:

  • Title Page: This page includes the title of the paper, the author’s name, institutional affiliation, and any acknowledgments or disclaimers.
  • Abstract: The abstract is a brief summary of the paper that outlines the purpose, methods, results, and conclusions of the study. It is typically limited to 250 words or less.
  • Introduction: The introduction provides a background of the research problem, defines the research question, and outlines the objectives and hypotheses of the study.
  • Methods: The methods section describes the research design, participants, procedures, and instruments used to collect and analyze data.
  • Results: The results section presents the findings of the study in a clear and concise manner, using graphs, tables, and charts where appropriate.
  • Discussion: The discussion section interprets the results, explains their significance, and relates them to previous research in the field.
  • Conclusion: The conclusion summarizes the main points of the paper, discusses the implications of the findings, and suggests future research directions.
  • References: The reference list includes all sources cited in the paper, listed in alphabetical order by author’s last name.

In addition to these sections, the AMA format requires that authors follow specific guidelines for citing sources in the text and formatting their references. The AMA style uses a superscript number system for in-text citations and provides specific formats for different types of sources, such as books, journal articles, and websites.

Harvard Style

Harvard Style Research Paper format is as follows:

  • Title page: This should include the title of your paper, your name, the name of your institution, and the date of submission.
  • Abstract : This is a brief summary of your paper, usually no more than 250 words. It should outline the main points of your research and highlight your findings.
  • Introduction : This section should introduce your research topic, provide background information, and outline your research question or thesis statement.
  • Literature review: This section should review the relevant literature on your topic, including previous research studies, academic articles, and other sources.
  • Methodology : This section should describe the methods you used to conduct your research, including any data collection methods, research instruments, and sampling techniques.
  • Results : This section should present your findings in a clear and concise manner, using tables, graphs, and other visual aids if necessary.
  • Discussion : This section should interpret your findings and relate them to the broader research question or thesis statement. You should also discuss the implications of your research and suggest areas for future study.
  • Conclusion : This section should summarize your main findings and provide a final statement on the significance of your research.
  • References : This is a list of all the sources you cited in your paper, presented in alphabetical order by author name. Each citation should include the author’s name, the title of the source, the publication date, and other relevant information.

In addition to these sections, a Harvard Style research paper may also include a table of contents, appendices, and other supplementary materials as needed. It is important to follow the specific formatting guidelines provided by your instructor or academic institution when preparing your research paper in Harvard Style.

Vancouver Style

Vancouver Style Research Paper format is as follows:

The Vancouver citation style is commonly used in the biomedical sciences and is known for its use of numbered references. Here is a basic format for a research paper using the Vancouver citation style:

  • Title page: Include the title of your paper, your name, the name of your institution, and the date.
  • Abstract : This is a brief summary of your research paper, usually no more than 250 words.
  • Introduction : Provide some background information on your topic and state the purpose of your research.
  • Methods : Describe the methods you used to conduct your research, including the study design, data collection, and statistical analysis.
  • Results : Present your findings in a clear and concise manner, using tables and figures as needed.
  • Discussion : Interpret your results and explain their significance. Also, discuss any limitations of your study and suggest directions for future research.
  • References : List all of the sources you cited in your paper in numerical order. Each reference should include the author’s name, the title of the article or book, the name of the journal or publisher, the year of publication, and the page numbers.

ACS (American Chemical Society) Style

ACS (American Chemical Society) Style Research Paper format is as follows:

The American Chemical Society (ACS) Style is a citation style commonly used in chemistry and related fields. When formatting a research paper in ACS Style, here are some guidelines to follow:

  • Paper Size and Margins : Use standard 8.5″ x 11″ paper with 1-inch margins on all sides.
  • Font: Use a 12-point serif font (such as Times New Roman) for the main text. The title should be in bold and a larger font size.
  • Title Page : The title page should include the title of the paper, the authors’ names and affiliations, and the date of submission. The title should be centered on the page and written in bold font. The authors’ names should be centered below the title, followed by their affiliations and the date.
  • Abstract : The abstract should be a brief summary of the paper, no more than 250 words. It should be on a separate page and include the title of the paper, the authors’ names and affiliations, and the text of the abstract.
  • Main Text : The main text should be organized into sections with headings that clearly indicate the content of each section. The introduction should provide background information and state the research question or hypothesis. The methods section should describe the procedures used in the study. The results section should present the findings of the study, and the discussion section should interpret the results and provide conclusions.
  • References: Use the ACS Style guide to format the references cited in the paper. In-text citations should be numbered sequentially throughout the text and listed in numerical order at the end of the paper.
  • Figures and Tables: Figures and tables should be numbered sequentially and referenced in the text. Each should have a descriptive caption that explains its content. Figures should be submitted in a high-quality electronic format.
  • Supporting Information: Additional information such as data, graphs, and videos may be included as supporting information. This should be included in a separate file and referenced in the main text.
  • Acknowledgments : Acknowledge any funding sources or individuals who contributed to the research.

ASA (American Sociological Association) Style

ASA (American Sociological Association) Style Research Paper format is as follows:

  • Title Page: The title page of an ASA style research paper should include the title of the paper, the author’s name, and the institutional affiliation. The title should be centered and should be in title case (the first letter of each major word should be capitalized).
  • Abstract: An abstract is a brief summary of the paper that should appear on a separate page immediately following the title page. The abstract should be no more than 200 words in length and should summarize the main points of the paper.
  • Main Body: The main body of the paper should begin on a new page following the abstract page. The paper should be double-spaced, with 1-inch margins on all sides, and should be written in 12-point Times New Roman font. The main body of the paper should include an introduction, a literature review, a methodology section, results, and a discussion.
  • References : The reference section should appear on a separate page at the end of the paper. All sources cited in the paper should be listed in alphabetical order by the author’s last name. Each reference should include the author’s name, the title of the work, the publication information, and the date of publication.
  • Appendices : Appendices are optional and should only be included if they contain information that is relevant to the study but too lengthy to be included in the main body of the paper. If you include appendices, each one should be labeled with a letter (e.g., Appendix A, Appendix B, etc.) and should be referenced in the main body of the paper.

APSA (American Political Science Association) Style

APSA (American Political Science Association) Style Research Paper format is as follows:

  • Title Page: The title page should include the title of the paper, the author’s name, the name of the course or instructor, and the date.
  • Abstract : An abstract is typically not required in APSA style papers, but if one is included, it should be brief and summarize the main points of the paper.
  • Introduction : The introduction should provide an overview of the research topic, the research question, and the main argument or thesis of the paper.
  • Literature Review : The literature review should summarize the existing research on the topic and provide a context for the research question.
  • Methods : The methods section should describe the research methods used in the paper, including data collection and analysis.
  • Results : The results section should present the findings of the research.
  • Discussion : The discussion section should interpret the results and connect them back to the research question and argument.
  • Conclusion : The conclusion should summarize the main findings and implications of the research.
  • References : The reference list should include all sources cited in the paper, formatted according to APSA style guidelines.

In-text citations in APSA style use parenthetical citation, which includes the author’s last name, publication year, and page number(s) if applicable. For example, (Smith 2010, 25).


Justification of research using systematic reviews continues to be inconsistent in clinical health science—A systematic review and meta-analysis of meta-research studies


Redundancy is an unethical, unscientific, and costly challenge in clinical health research. There is a high risk of redundancy when existing evidence is not used to justify the research question when a new study is initiated. Therefore, the aim of this study was to synthesize meta-research studies evaluating if and how authors of clinical health research studies use systematic reviews when initiating a new study.

Seven electronic bibliographic databases were searched (final search June 2021). Meta-research studies assessing the use of systematic reviews when justifying new clinical health studies were included. Screening and data extraction were performed by two reviewers independently. The primary outcome was defined as the percentage of original studies within the included meta-research studies using systematic reviews of previous studies to justify a new study. Results were synthesized narratively and quantitatively using a random-effects meta-analysis. The protocol has been registered in Open Science Framework ( https://osf.io/nw7ch/ ).

Twenty-one meta-research studies were included, representing 3,621 original studies or protocols. Nineteen of the 21 studies were included in the meta-analysis. The included studies represented different disciplines and exhibited wide variability both in how the use of previous systematic reviews was assessed, and in how this was reported. The use of systematic reviews to justify new studies varied from 16% to 87%. The mean percentage of original studies using systematic reviews to justify their study was 42% (95% CI: 36% to 48%).

Justification of new studies in clinical health research using systematic reviews is highly variable, and fewer than half of new clinical studies in health science were justified using a systematic review. Research redundancy is a challenge for clinical health researchers, as well as for funders, ethics committees, and journals.

Citation: Andreasen J, Nørgaard B, Draborg E, Juhl CB, Yost J, Brunnhuber K, et al. (2022) Justification of research using systematic reviews continues to be inconsistent in clinical health science—A systematic review and meta-analysis of meta-research studies. PLoS ONE 17(10): e0276955. https://doi.org/10.1371/journal.pone.0276955

Editor: Andrzej Grzybowski, University of Warmia, POLAND

Received: January 24, 2022; Accepted: October 18, 2022; Published: October 31, 2022

Copyright: © 2022 Andreasen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: The authors received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Research redundancy in clinical health research is an unethical, unscientific, and costly challenge that can be minimized by using an evidence-based research approach. First introduced in 2009 and since endorsed and promoted by organizations and researchers worldwide [ 1 – 6 ], evidence-based research is an approach whereby researchers systematically and transparently take into account the existing evidence on a topic before embarking on a new study. The researcher thus strives to enter the project unbiased, or at least aware of the risk of knowledge redundancy bias. The key is an evidence synthesis using formal, explicit, and rigorous methods to bring together the findings of pre-existing research and synthesize the totality of what is known [ 7 ]. Evidence syntheses provide the basis for an unbiased justification of the proposed research study, ensuring that participant enrollment, resource allocation, and healthcare systems support only relevant and justified research. Enormous numbers of research studies are conducted, funded, and published globally every year [ 8 ]. Thus, if earlier relevant research is not considered in a systematic and transparent way when justifying research, the foundation for a research question is not properly established, increasing the risk that redundant studies will be conducted, funded, and published, and that resources such as time and funding will be wasted [ 1 , 4 ]. Most importantly, when redundant research is initiated, participants may unnecessarily and unethically receive placebos or suboptimal treatment.

Previous meta-research, defined as the study of research itself, including its methods, reporting, reproducibility, evaluation, and incentives [ 9 ], has shown that there is considerable variation and bias in the use of evidence syntheses to justify research studies [ 10 – 12 ]. To the best of our knowledge, a systematic review of previous meta-research studies assessing the use of systematic reviews to justify studies in clinical health research has not previously been conducted. Evaluating how evidence-based research is implemented in research practices across disciplines and specialties when justifying new studies will provide an indication of the integration of evidence-based research in research practices [ 9 ]. The present systematic review aimed to identify and synthesize results from meta-research studies, regardless of study type, evaluating if and how authors of clinical health research studies use systematic reviews to justify a new study.

Prior to commencing the review, we registered the protocol in the Open Science Framework ( https://osf.io/nw7ch/ ). The protocol remained unchanged, but in this paper we have made adjustments to the risk-of-bias assessment, reducing the tool to 10 items and removing the assessment of reporting quality. The review is presented in accordance with the Preferred Reporting Items for Systematic review and Meta-Analysis (PRISMA) guidelines [ 13 ].

Eligibility criteria

Studies were eligible for inclusion if they were original meta-research studies, regardless of study type, that evaluated if and how authors of clinical health research studies used systematic reviews to justify new clinical health studies. No limitations on language, publication status, or publication year were applied. Only meta-research studies of studies on human subjects in clinical health sciences were eligible for inclusion. The primary outcome was defined as the percentage of original studies within the included meta-research studies using systematic reviews of previous studies to justify a new study. The secondary outcome was how the systematic reviews of previous research were used (e.g., within the text to justify the study) by the original studies.

Information sources and search strategy

This study is one of six ongoing evidence syntheses (four systematic reviews and two scoping reviews) planned to assess the global state of evidence-based research in clinical health research. These are: a scoping review mapping the area broadly to describe current practice and identify knowledge gaps; a systematic review on the use of prior research in reports of randomized controlled trials specifically; three systematic reviews assessing the use of systematic reviews when justifying, designing [ 14 ], or putting the results of a new study in context; and, finally, a scoping review uncovering the breadth and characteristics of the available empirical evidence on citation bias. Further, the research group is working with colleagues on a Handbook for Evidence-based Research in health sciences. Because of the common aim across the six evidence syntheses, a broad overall search strategy was designed to identify meta-research studies that assessed whether researchers used earlier similar studies and/or systematic reviews of earlier similar studies to inform the justification and/or design of a new study, whether researchers used systematic reviews to inform the interpretation of new results, and whether redundant studies had been published within a specific area.

The first search was performed in June 2015. The databases included MEDLINE via both PubMed and Ovid, EMBASE via Ovid, CINAHL via EBSCO, Web of Science (Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), and Arts & Humanities Citation Index (A&HCI)), and the Cochrane Methodology Register (CMR, Methods Studies), all searched from inception (Appendix 1 in S1 File ). In addition, the reference lists of included studies were screened for relevant articles, as were the authors’ relevant publications and abstracts from the Cochrane Methodology Reviews.

Based on the results of the baseline search in June 2015, an updated and revised search was conducted in MEDLINE and Embase via Ovid covering January 2015 to June 2021 (Appendix 1 in S1 File ). Once again, the reference lists of newly included studies were screened for relevant references, as were abstracts from January 2015 to June 2021 in the Cochrane Methodology Reviews. Experts in the field were contacted to identify any additional published and/or grey literature. No restrictions were placed on publication year or language. See Appendix 1 and Appendix 2 in S1 File for the full search strategy.

Screening and study selection

Following deduplication, the search results were uploaded to Rayyan ( https://rayyan.qcri.org/welcome ). The results of the first search (June 2015) were independently screened by pairs of reviewers; twenty screeners were paired, with each pair including an author very experienced in systematic reviews and a less experienced author. To increase consistency among reviewers, both reviewers in a pair initially screened the same 50 publications and discussed the results before beginning screening for this review. Disagreements on study selection were resolved by consensus and, if needed, discussion with a third reviewer. Full-text screening was also performed by two reviewers independently, with disagreements resolved by consensus and discussion. Following the last search, two independent reviewers screened the results using the same procedures for full-text screening and resolving disagreements as for the first search. The screening procedures resulted in a full list of studies potentially relevant for one or more of the six above-mentioned evidence syntheses.

A second title and abstract screening and full-text screening of the full list was then performed independently by two reviewers using screening criteria specific to this systematic review. Reasons for excluding trials were recorded, and disagreements between the reviewers were resolved through discussion. If consensus was not reached, a third reviewer was involved.

Data extraction

We developed and pilot tested a data extraction form to extract data regarding study characteristics and outcomes of interest. Two reviewers independently extracted data, with other reviewers available to resolve disagreements. The following study characteristics were extracted from each of the included studies: bibliographic information, study aim, study design, setting, country, inclusion period, area of interest, results, and conclusion. Further, data for this study’s primary and secondary outcomes were extracted; these included the percentage of original studies using systematic reviews to justify their study and how the systematic reviews of previous research were used (e.g., within the text to justify the study) by the original studies.

Risk-of-bias assessment

No standard tool was identified to assess the risk of bias in empirical meta-research studies. The Editorial Group of the Evidence-Based Research Network therefore prepared a risk-of-bias tool for the planned five systematic reviews, with a list of items important for evaluating the risk of bias in meta-research studies. For each item, the study under examination could be classified as exhibiting a “low risk of bias”, “unclear risk of bias”, or “high risk of bias”. We independently tested the list of items on a sample of included studies. Following a discussion of the different answers, we reduced the list to ten items and defined the criteria used to evaluate the risk of bias in the included studies ( Table 1 ). Each of the included meta-research studies was appraised independently by two reviewers using the customized checklist to determine the risk of bias. Disagreements regarding the risk of bias were resolved through discussion. No study was excluded on the grounds of low quality.

Table 1: https://doi.org/10.1371/journal.pone.0276955.t001

Data synthesis and interpretation

In addition to narratively summarizing the characteristics of the included meta-research studies and their risk-of-bias assessments, the percentage of original studies using a systematic review of previous similar studies to justify a new study (primary outcome) was calculated as the number of studies citing at least one systematic review divided by the total number of original studies within each included meta-research study. A random-effects meta-analysis (DerSimonian and Laird) was used to compute the pooled estimate and produce the forest plot, as this model is the default of the metaprop command. Heterogeneity was evaluated by estimating the I² statistic (the percentage of variance attributable to heterogeneity, i.e., inconsistency) and the between-study variance tau². When investigating reasons for heterogeneity, a restricted maximum likelihood (REML) model was used, and covariates with the ability to reduce tau² were deemed relevant [ 15 ].

All analyses were conducted in Stata, version 17.0 (StataCorp. 2019. Stata Statistical Software : Release 17 . College Station, TX: StataCorp LLC).
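For readers unfamiliar with this kind of pooling, the sketch below shows, in Python, how a DerSimonian-Laird random-effects meta-analysis of proportions can be computed on the logit scale. It is a minimal illustration of the type of calculation Stata's metaprop performs, not the authors' actual code, and the event counts are invented for demonstration.

```python
# Hedged, self-contained sketch (not the authors' code): a DerSimonian-Laird
# random-effects meta-analysis of proportions on the logit scale.
# All counts below are invented examples.
import numpy as np
from scipy import stats

# events = original studies citing at least one systematic review (hypothetical)
# totals = original studies examined in each meta-research study (hypothetical)
events = np.array([12, 40, 25, 60, 9])
totals = np.array([75, 90, 55, 70, 50])

p = events / totals
y = np.log(p / (1 - p))                    # logit-transformed proportions
v = 1 / events + 1 / (totals - events)     # approximate variance of each logit

# Fixed-effect quantities feeding the DerSimonian-Laird tau^2 estimator
w = 1 / v
y_fe = np.sum(w * y) / np.sum(w)
Q = np.sum(w * (y - y_fe) ** 2)            # Cochran's Q
df = len(y) - 1
tau2 = max(0.0, (Q - df) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
i2 = max(0.0, (Q - df) / Q) * 100          # I^2, percentage of inconsistency

# Random-effects pooling and 95% confidence interval
w_re = 1 / (v + tau2)
mu = np.sum(w_re * y) / np.sum(w_re)
se = np.sqrt(1 / np.sum(w_re))
ci = mu + np.array([-1, 1]) * stats.norm.ppf(0.975) * se

# 95% prediction interval for the result of a comparable future study
t_crit = stats.t.ppf(0.975, df=len(y) - 2)
pi = mu + np.array([-1, 1]) * t_crit * np.sqrt(tau2 + se**2)

def inv_logit(x):
    return np.exp(x) / (1 + np.exp(x))     # back-transform logit -> proportion

print(f"Pooled proportion: {inv_logit(mu):.2f}")
print(f"95% CI: {inv_logit(ci[0]):.2f} to {inv_logit(ci[1]):.2f}")
print(f"95% prediction interval: {inv_logit(pi[0]):.2f} to {inv_logit(pi[1]):.2f}")
print(f"I^2 = {i2:.0f}%, tau^2 = {tau2:.3f}")
```

Run on real extracted counts, a calculation of this kind yields the sort of pooled percentage, confidence interval, prediction interval, and I² reported in the results below.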

Study selection

In total, 30,592 publications were identified through the searches. Of these, 69 publications were determined eligible for one of the six evidence syntheses. A total of 21 meta-research studies fulfilled the inclusion criteria for this systematic review [ 10 , 11 , 16 – 34 ]; see Fig 1 .

Fig 1: https://doi.org/10.1371/journal.pone.0276955.g001

Study characteristics

The 21 included meta-research studies were published from 2007 to 2021, representing 3,621 original studies or protocols and one survey with 106 participants; only three of these studies were published before 2013 [ 10 , 18 , 26 ]. The samples of the original studies within the included meta-research studies varied. One meta-research study surveyed congress delegates [ 29 ], one study examined first-submission protocols for randomized controlled trials submitted to four hospital ethics committees [ 17 ], and 14 studies examined randomized or quasi-randomized primary studies published during a specific time period in a range of journals [ 10 , 11 , 18 , 21 – 28 , 31 , 32 , 34 ] or in specific databases [ 16 , 19 , 20 , 30 ]. Finally, one study examined the use of previously published systematic reviews when publishing a new systematic review [ 33 ]. Further, the number of original studies within each included meta-research study varied considerably, ranging from 18 [ 10 ] to 637 original studies [ 27 ]. The characteristics of the included meta-research studies are presented in Table 2.

Table 2: https://doi.org/10.1371/journal.pone.0276955.t002

Risk of bias assessment

Overall, most studies were determined to exhibit a low risk of bias in the majority of items, and all of the included meta-research studies reported an unambiguous aim and a match between aim and methods. However, only a few studies provided argumentation for their choice of data source [ 17 , 20 , 24 , 30 ], and only two of the 21 studies referred to an available a-priori protocol [ 16 , 21 ]. Finally, seven studies provided poor or no discussion of the limitations of their study [ 10 , 19 , 22 , 26 – 28 , 34 ]. The risk-of-bias assessments are shown in Table 3 .

Table 3: https://doi.org/10.1371/journal.pone.0276955.t003

Synthesis of results

Of the 21 included studies, 18 were included in the meta-analysis. Two studies included two cohorts each, and both cohorts in each of these studies were included in our meta-analysis [ 21 , 30 ]. The survey by Clayton and colleagues, with a response rate of 17%, was not included in the meta-analysis, as the survey did not provide data to identify the use of systematic reviews to justify specific studies; however, their results showed that 42 of 84 respondents (50%) reported using a systematic review for justification [ 29 ]. The study by Chow, which was also not included in the meta-analysis, showed that justification varied widely within and between specialties; however, only relative numbers were provided, and therefore no overall percentage could be extracted [ 11 ]. The study by Seehra et al. counted the SR citations in RCTs rather than the number of RCTs citing SRs and was therefore not included in the meta-analysis either [ 23 ].

The percentage of original studies that justified a new study with a systematic review within each meta-research study ranged from 16% to 87%. The pooled percentage of original studies using systematic reviews to justify their research question was 42% (95% CI: 36% to 48%), as shown in Fig 2. Whereas the confidence interval shows the precision of the pooled estimate in a meta-analysis, the prediction interval reflects the distribution of the individual studies. The heterogeneity in the meta-analysis, assessed by I², was 94%. The clinical interpretation of this large heterogeneity is seen in the very broad prediction interval, ranging from 16% to 71%, meaning that, based on these studies, there is a 95% chance that the next study will show a prevalence between 16% and 71%.
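For reference, the heterogeneity statistic and the prediction interval referred to here are conventionally defined as below. This is the standard formulation rather than notation taken from the article itself, with Q denoting Cochran's heterogeneity statistic, k the number of studies, tau² the between-study variance, and SE(μ̂) the standard error of the pooled estimate μ̂:

```latex
I^2 = \max\!\left(0,\ \frac{Q-(k-1)}{Q}\right)\times 100\%,
\qquad
\text{95\% PI} \approx \hat{\mu}\ \pm\ t_{k-2}\,\sqrt{\hat{\tau}^2+\mathrm{SE}(\hat{\mu})^2}
```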

Fig 2: Forest plot of prevalence and 95% confidence intervals for the percentage of studies using an SR to justify the study. https://doi.org/10.1371/journal.pone.0276955.g002

Further, we conducted an explorative subgroup analysis of the studies by Helfer et al. and Joseph et al., as these two studies were on meta-analyses and protocols and therefore differ from the other included studies. This analysis only marginally changed the pooled percentage, to 39% (95% CI: 33% to 46%), and reduced the between-study variance (tau²) by 23%.

The 21 included studies varied greatly in their approach and in their description of how systematic reviews were used, i.e., whether the original studies merely cited systematic reviews and whether the systematic reviews they used were relevant and/or of high quality. Nine studies assessed, to varying degrees, whether the systematic reviews used were relevant to the justification of the research [ 16 – 20 , 25 , 30 , 32 , 34 ]. Overall, the information reported by the meta-research studies was not sufficient to report the percentage of primary studies referring to relevant systematic reviews. No details were provided regarding the methodological quality of the systematic reviews used to justify the research question, or whether they were recently published, except for Hoderlein et al., who reported that the mean time between publication of the cited systematic review and the trial report was four years [ 30 ].

We identified 21 meta-research studies, spanning 15 publication years and 12 medical disciplines. The findings showed substantial variability in the use of systematic reviews when justifying new clinical studies, with the incidence of use ranging from 16% to 87%. However, across the 19 meta-analysis-eligible studies, fewer than half of the original studies used a systematic review to justify their new study. There was wide variability, and a general lack of information, about how systematic reviews were used within many of the original studies. Our systematic review found that the proportion of original studies justifying their new research using evidence syntheses is sub-optimal and, thus, the potential for research redundancy continues to be a challenge. This study corroborates the serious possible consequences of research redundancy previously problematized by Chalmers et al. and Glasziou et al. [ 35 , 36 ].

Systematic reviews are considered crucial when justifying a new study, as is emphasized in reporting guidelines such as the CONSORT statement [ 37 ]. However, there are challenges involved in implementing an evidence-based research approach. The authors of the included meta-research study reporting the highest use of systematic reviews to justify a new systematic review point out that even though the authors of the original studies refer to some of the published systematic reviews, they neglect others on the same topic, which may be problematic and result in a biased approach [ 33 ]. Other issues that have been identified are the risk of research waste when a systematic review is not methodologically sound [ 12 , 38 ] and the redundancy in the conduct of systematic reviews themselves, with many overlapping systematic reviews existing on the same topic [ 39 – 41 ]. In the original studies within the meta-research studies, the use of systematic reviews was not consistent, and it was not made explicit whether the systematic reviews used were the most recent and/or of high methodological quality. These issues speak to the need for refinement in the area of systematic review development, such as mandatory registration in prospective registries. Only two of the 21 included studies referred to an available a-priori protocol [ 16 , 21 ]. General recommendations on the use of systematic reviews as justification for a new study are difficult, as these will be topic specific; however, researchers should take care to use the most robust, methodologically sound, and recently published reviews, preferably those with a priori published protocols.

Efforts must continue in promoting the use of evidence-based research approaches among clinical health researchers and other important stakeholders, such as funders. Collaborations such as the Ensuring Value in Research Funders Forum, and changes in funding review criteria mandating reference to previously published systematic reviews when justifying the research question within funding proposals, are examples of how stakeholders can promote research that is evidence-based [ 8 , 41 ].

Strengths and limitations

We conducted a comprehensive and systematic search. The lack of standard terminology for meta-research studies resulted in search strategies that retrieved thousands of citations. We also relied on snowballing efforts to identify relevant studies, such as by contacting experts and scanning the reference lists of relevant studies.

There is also a lack of tools to assess risk of bias for meta-research studies, so a specific risk-of-bias tool was created for the five conducted reviews. The tool was discussed and revised continuously throughout the research process; however, we acknowledge that the checklist is not yet optimal and that a validated risk-of-bias tool for meta-research studies is needed.

Many of the included meta-research studies did not provide details as to whether the systematic reviews used to justify the included studies were relevant, high-quality, and/or recently published. This may raise questions about the validity of our findings, as the majority of the meta-research studies only indicate that systematic reviews were cited to justify new studies, not whether the systematic review cited was relevant, recent, and of high quality, or even how the systematic review was used; we did not assess this further either. Nonetheless, even if we assume that these elements were satisfied for every original study within the included meta-research studies (i.e., taking a conservative approach), fewer than half used systematic reviews to justify their research questions. The conservative approach used in this study therefore does not underestimate, and may well overestimate, the actual use of relevant systematic reviews to justify studies in clinical health science across disciplines.

Different study designs were included in the meta-analysis, which may have contributed to the high degree of heterogeneity observed. Therefore, the presented results should be interpreted with caution due to the high heterogeneity. Not only were there differences in the methods of the included meta-research studies, but there was also heterogeneity in the medical specialties evaluated [ 42 , 43 ].

In conclusion, justification of research questions in clinical health research with systematic reviews continues to be inconsistent; fewer than half of the primary studies within the included meta-research studies in this systematic review were found to have used a systematic review to justify their research question. This indicates that the risk of redundant research is still high when new studies across disciplines and professions in clinical health are initiated, thereby indicating that evidence-based research has not yet been successfully implemented in the clinical health sciences. Efforts to raise awareness and to ensure an evidence-based research approach continue to be necessary, and such efforts should involve clinical health researchers themselves as well as important stakeholders such as funders.

Supporting information

S1 Checklist.

https://doi.org/10.1371/journal.pone.0276955.s001

S1 Protocol.

https://doi.org/10.1371/journal.pone.0276955.s002

https://doi.org/10.1371/journal.pone.0276955.s003

https://doi.org/10.1371/journal.pone.0276955.s004

Acknowledgments

This work has been prepared as part of the Evidence-Based Research Network ( ebrnetwork.org ). The Evidence-Based Research Network is an international network that promotes the use of systematic reviews when justifying, designing, and interpreting research. The authors thank the Section for Evidence-Based Practice, Department for Health and Function, Western Norway University of Applied Sciences for their generous support of the EBRNetwork. Further, thanks to COST Association for supporting the COST Action “EVBRES” (CA 17117, evbres.eu) and thereby the preparation of this study. Thanks to Gunhild Austrheim, Head of Unit, Library at Western Norway University of Applied Sciences, Norway, for helping with the second search. Thanks to those helping with the screening: Durita Gunnarsson, Gorm Høj Jensen, Line Sjodsholm, Signe Versterre, Linda Baumbach, Karina Johansen, Rune Martens Andersen, and Thomas Aagaard.

We gratefully acknowledge the contribution from the EVBRES (COST ACTION CA 17117) Core Group, including Anne Gjerland (AG) and her specific contribution to the search and screening process.

  • 7. Evidence Synthesis International. https://evidencesynthesis.org/
  • 15. Cochrane Handbook for Systematic Reviews of Interventions. https://training.cochrane.org/handbook/current/chapter-i
  • 36. Glasziou P, Chalmers I. Is 85% of health research really wasted? The BMJ Opinion, 14 January 2016. https://blogs.bmj.com/bmj/2016/01/14/paul-glasziou-and-iain-chalmers-is-85-of-health-research-really-wasted/


13.1 Formatting a Research Paper

Learning Objectives

  • Identify the major components of a research paper written using American Psychological Association (APA) style.
  • Apply general APA style and formatting conventions in a research paper.

In this chapter, you will learn how to use APA style, the documentation and formatting style followed by the American Psychological Association, as well as MLA style, from the Modern Language Association. There are a few major formatting styles used in academic texts, including AMA, Chicago, and Turabian:

  • AMA (American Medical Association) for medicine, health, and biological sciences
  • APA (American Psychological Association) for education, psychology, and the social sciences
  • Chicago—a common style used in everyday publications like magazines, newspapers, and books
  • MLA (Modern Language Association) for English, literature, arts, and humanities
  • Turabian—another common style designed for its universal application across all subjects and disciplines

While all the formatting and citation styles have their own use and applications, in this chapter we focus our attention on the two styles you are most likely to use in your academic studies: APA and MLA.

If you find that the rules of proper source documentation are difficult to keep straight, you are not alone. Writing a good research paper is, in and of itself, a major intellectual challenge. Having to follow detailed citation and formatting guidelines as well may seem like just one more task to add to an already-too-long list of requirements.

Following these guidelines, however, serves several important purposes. First, it signals to your readers that your paper should be taken seriously as a student’s contribution to a given academic or professional field; it is the literary equivalent of wearing a tailored suit to a job interview. Second, it shows that you respect other people’s work enough to give them proper credit for it. Finally, it helps your reader find additional materials if he or she wishes to learn more about your topic.

Furthermore, producing a letter-perfect APA-style paper need not be burdensome. Yes, it requires careful attention to detail. However, you can simplify the process if you keep these broad guidelines in mind:

  • Work ahead whenever you can. Chapter 11 “Writing from Research: What Will I Learn?” includes tips for keeping track of your sources early in the research process, which will save time later on.
  • Get it right the first time. Apply APA guidelines as you write, so you will not have much to correct during the editing stage. Again, putting in a little extra time early on can save time later.
  • Use the resources available to you. In addition to the guidelines provided in this chapter, you may wish to consult the APA website at http://www.apa.org or the Purdue University Online Writing lab at http://owl.english.purdue.edu , which regularly updates its online style guidelines.

General Formatting Guidelines

This chapter provides detailed guidelines for using the citation and formatting conventions developed by the American Psychological Association, or APA. Writers in disciplines as diverse as astrophysics, biology, psychology, and education follow APA style. The major components of a paper written in APA style are listed in the following box.

These are the major components of an APA-style paper:

  • Title page
  • Abstract
  • Body, which includes headings (and, if necessary, subheadings) to organize the content and in-text citations of research sources
  • References page

All these components must be saved in one document, not as separate documents.

The title page of your paper includes the following information:

  • Title of the paper
  • Author’s name
  • Name of the institution with which the author is affiliated
  • Header at the top of the page with the paper title (in capital letters) and the page number (If the title is lengthy, you may use a shortened form of it in the header.)

List the first three elements in the order given in the previous list, centered about one third of the way down from the top of the page. Use the headers and footers tool of your word-processing program to add the header, with the title text at the left and the page number in the upper-right corner. Your title page should look like the following example.

Beyond the Hype: Evaluating Low-Carb Diets cover page

The next page of your paper provides an abstract, or brief summary of your findings. An abstract does not need to be provided in every paper, but an abstract should be used in papers that include a hypothesis. A good abstract is concise—about one hundred fifty to two hundred fifty words—and is written in an objective, impersonal style. Your writing voice will not be as apparent here as in the body of your paper. When writing the abstract, take a just-the-facts approach, and summarize your research question and your findings in a few sentences.

In Chapter 12 “Writing a Research Paper” , you read a paper written by a student named Jorge, who researched the effectiveness of low-carbohydrate diets. Read Jorge’s abstract. Note how it sums up the major ideas in his paper without going into excessive detail.

Beyond the Hype: Abstract

Write an abstract summarizing your paper. Briefly introduce the topic, state your findings, and sum up what conclusions you can draw from your research. Use the word count feature of your word-processing program to make sure your abstract does not exceed one hundred fifty words.

Depending on your field of study, you may sometimes write research papers that present extensive primary research, such as your own experiment or survey. In your abstract, summarize your research question and your findings, and briefly indicate how your study relates to prior research in the field.

Margins, Pagination, and Headings

APA style requirements also address specific formatting concerns, such as margins, pagination, and heading styles, within the body of the paper. Review the following APA guidelines.

Use these general guidelines to format the paper:

  • Set the top, bottom, and side margins of your paper at 1 inch.
  • Use double-spaced text throughout your paper.
  • Use a standard font, such as Times New Roman or Arial, in a legible size (10- to 12-point).
  • Use continuous pagination throughout the paper, including the title page and the references section. Page numbers appear flush right within your header.
  • Section headings and subsection headings within the body of your paper use different types of formatting depending on the level of information you are presenting. Additional details from Jorge’s paper are provided.

Cover Page

Begin formatting the final draft of your paper according to APA guidelines. You may work with an existing document or set up a new document if you choose. Include the following:

  • Your title page
  • The abstract you created in Note 13.8 “Exercise 1”
  • Correct headers and page numbers for your title page and abstract

APA style uses section headings to organize information, making it easy for the reader to follow the writer’s train of thought and to know immediately what major topics are covered. Depending on the length and complexity of the paper, its major sections may also be divided into subsections, sub-subsections, and so on. These smaller sections, in turn, use different heading styles to indicate different levels of information. In essence, you are using headings to create a hierarchy of information.

The following heading styles used in APA formatting are listed in order of greatest to least importance:

  • Section headings use centered, boldface type. Headings use title case, with important words in the heading capitalized.
  • Subsection headings use left-aligned, boldface type. Headings use title case.
  • The third level uses left-aligned, indented, boldface type. Headings use a capital letter only for the first word, and they end in a period.
  • The fourth level follows the same style used for the previous level, but the headings are boldfaced and italicized.
  • The fifth level follows the same style used for the previous level, but the headings are italicized and not boldfaced.

Visually, the hierarchy of information is organized as indicated in Table 13.1 “Section Headings” .

Table 13.1 Section Headings

A college research paper may not use all the heading levels shown in Table 13.1 “Section Headings” , but you are likely to encounter them in academic journal articles that use APA style. For a brief paper, you may find that level 1 headings suffice. Longer or more complex papers may need level 2 headings or other lower-level headings to organize information clearly. Use your outline to craft your major section headings and determine whether any subtopics are substantial enough to require additional levels of headings.

Working with the document you developed in Note 13.11 “Exercise 2” , begin setting up the heading structure of the final draft of your research paper according to APA guidelines. Include your title and at least two to three major section headings, and follow the formatting guidelines provided above. If your major sections should be broken into subsections, add those headings as well. Use your outline to help you.

Because Jorge used only level 1 headings, his Exercise 3 would look like the following:

Citation Guidelines

In-Text Citations

Throughout the body of your paper, include a citation whenever you quote or paraphrase material from your research sources. As you learned in Chapter 11 “Writing from Research: What Will I Learn?” , the purpose of citations is twofold: to give credit to others for their ideas and to allow your reader to follow up and learn more about the topic if desired. Your in-text citations provide basic information about your source; each source you cite will have a longer entry in the references section that provides more detailed information.

In-text citations must provide the name of the author or authors and the year the source was published. (When a given source does not list an individual author, you may provide the source title or the name of the organization that published the material instead.) When directly quoting a source, it is also required that you include the page number where the quote appears in your citation.

This information may be included within the sentence or in a parenthetical reference at the end of the sentence, as in these examples.

Epstein (2010) points out that “junk food cannot be considered addictive in the same way that we think of psychoactive drugs as addictive” (p. 137).

Here, the writer names the source author when introducing the quote and provides the publication date in parentheses after the author’s name. The page number appears in parentheses after the closing quotation marks and before the period that ends the sentence.

Addiction researchers caution that “junk food cannot be considered addictive in the same way that we think of psychoactive drugs as addictive” (Epstein, 2010, p. 137).

Here, the writer provides a parenthetical citation at the end of the sentence that includes the author’s name, the year of publication, and the page number separated by commas. Again, the parenthetical citation is placed after the closing quotation marks and before the period at the end of the sentence.

As noted in the book Junk Food, Junk Science (Epstein, 2010, p. 137), “junk food cannot be considered addictive in the same way that we think of psychoactive drugs as addictive.”

Here, the writer chose to mention the source title in the sentence (an optional piece of information to include) and followed the title with a parenthetical citation. Note that the parenthetical citation is placed before the comma that signals the end of the introductory phrase.

David Epstein’s book Junk Food, Junk Science (2010) pointed out that “junk food cannot be considered addictive in the same way that we think of psychoactive drugs as addictive” (p. 137).

Another variation is to introduce the author and the source title in your sentence and include the publication date and page number in parentheses within the sentence or at the end of the sentence. As long as you have included the essential information, you can choose the option that works best for that particular sentence and source.

Citing a book with a single author is usually a straightforward task. Of course, your research may require that you cite many other types of sources, such as books or articles with more than one author or sources with no individual author listed. You may also need to cite sources available in both print and online and nonprint sources, such as websites and personal interviews. Chapter 13 “APA and MLA Documentation and Formatting” , Section 13.2 “Citing and Referencing Techniques” and Section 13.3 “Creating a References Section” provide extensive guidelines for citing a variety of source types.

Writing at Work

APA is just one of several different styles with its own guidelines for documentation, formatting, and language usage. Depending on your field of interest, you may be exposed to additional styles, such as the following:

  • MLA style. Determined by the Modern Language Association and used for papers in literature, languages, and other disciplines in the humanities.
  • Chicago style. Outlined in the Chicago Manual of Style and sometimes used for papers in the humanities and the sciences; many professional organizations use this style for publications as well.
  • Associated Press (AP) style. Used by professional journalists.

References List

The brief citations included in the body of your paper correspond to the more detailed citations provided at the end of the paper in the references section. In-text citations provide basic information—the author’s name, the publication date, and the page number if necessary—while the references section provides more extensive bibliographical information. Again, this information allows your reader to follow up on the sources you cited and do additional reading about the topic if desired.

The specific format of entries in the list of references varies slightly for different source types, but the entries generally include the following information:

  • The name(s) of the author(s) or institution that wrote the source
  • The year of publication and, where applicable, the exact date of publication
  • The full title of the source
  • For books, the city of publication
  • For articles or essays, the name of the periodical or book in which the article or essay appears
  • For magazine and journal articles, the volume number, issue number, and pages where the article appears
  • For sources on the web, the URL where the source is located

The references page is double spaced and lists entries in alphabetical order by the author’s last name. If an entry continues for more than one line, the second line and each subsequent line are indented five spaces. Review the following example. ( Chapter 13 “APA and MLA Documentation and Formatting” , Section 13.3 “Creating a References Section” provides extensive guidelines for formatting reference entries for different types of sources.)

References Section

In APA style, book and article titles are formatted in sentence case, not title case. Sentence case means that only the first word is capitalized, along with any proper nouns.

Key Takeaways

  • Following proper citation and formatting guidelines helps writers ensure that their work will be taken seriously, give proper credit to other authors for their work, and provide valuable information to readers.
  • Working ahead and taking care to cite sources correctly the first time are ways writers can save time during the editing stage of writing a research paper.
  • APA papers usually include an abstract that concisely summarizes the paper.
  • APA papers use a specific headings structure to provide a clear hierarchy of information.
  • In APA papers, in-text citations usually include the name(s) of the author(s) and the year of publication.
  • In-text citations correspond to entries in the references section, which provide detailed bibliographical information about a source.

Writing for Success Copyright © 2015 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Deciding whether the conclusions of studies are justified: a review

  • PMID: 7052412
  • DOI: 10.1177/0272989X8100100306

Critical review of original studies is a major source of medical information for the physician; it necessitates a clear understanding of the research objective of the study. The design of the study, whether experimental or observational, should be appropriate to answer the research question and, further, the study should employ accurate and precise measurements on a suitable group of subjects. The critical reader, aware of the principle of study design and analysis, can assess the validity of the author's conclusions through careful review of the article. A guide to the analysis of an original article is presented here. The reader is asked to formulate questions about a study from the abstract. The answers to these questions, which form the basis for acceptance or rejection of the author's conclusions, are in general available in the body of an article. This technique is demonstrated with two fictional studies about patients with angina.


Six Approaches to Justify Sample Sizes


Daniël Lakens; Sample Size Justification. Collabra: Psychology 5 January 2022; 8 (1): 33267. doi: https://doi.org/10.1525/collabra.33267

An important step when designing an empirical study is to justify the sample size that will be collected. The key aim of a sample size justification for such studies is to explain how the collected data is expected to provide valuable information given the inferential goals of the researcher. In this overview article six approaches are discussed to justify the sample size in a quantitative empirical study: 1) collecting data from (almost) the entire population, 2) choosing a sample size based on resource constraints, 3) performing an a-priori power analysis, 4) planning for a desired accuracy, 5) using heuristics, or 6) explicitly acknowledging the absence of a justification. An important question to consider when justifying sample sizes is which effect sizes are deemed interesting, and the extent to which the data that is collected informs inferences about these effect sizes. Depending on the sample size justification chosen, researchers could consider 1) what the smallest effect size of interest is, 2) which minimal effect size will be statistically significant, 3) which effect sizes they expect (and what they base these expectations on), 4) which effect sizes would be rejected based on a confidence interval around the effect size, 5) which ranges of effects a study has sufficient power to detect based on a sensitivity power analysis, and 6) which effect sizes are expected in a specific research area. Researchers can use the guidelines presented in this article, for example by using the interactive form in the accompanying online Shiny app, to improve their sample size justification, and hopefully, align the informational value of a study with their inferential goals.

Scientists perform empirical studies to collect data that helps to answer a research question. The more data that is collected, the more informative the study will be with respect to its inferential goals. A sample size justification should consider how informative the data will be given an inferential goal, such as estimating an effect size, or testing a hypothesis. Even though a sample size justification is sometimes requested in manuscript submission guidelines, when submitting a grant to a funder, or submitting a proposal to an ethical review board, the number of observations is often simply stated, but not justified. This makes it difficult to evaluate how informative a study will be. To prevent such concerns from emerging when it is too late (e.g., after a non-significant hypothesis test has been observed), researchers should carefully justify their sample size before data is collected.

Researchers often find it difficult to justify their sample size (i.e., a number of participants, observations, or any combination thereof). In this review article six possible approaches are discussed that can be used to justify the sample size in a quantitative study (see Table 1). This is not an exhaustive overview, but it includes the most common and applicable approaches for single studies. The first justification is that data from (almost) the entire population has been collected. The second justification centers on resource constraints, which are almost always present, but rarely explicitly evaluated. The third and fourth justifications are based on a desired statistical power or a desired accuracy. The fifth justification relies on heuristics, and finally, researchers can choose a sample size without any justification. Each of these justifications can be stronger or weaker depending on which conclusions researchers want to draw from the data they plan to collect.

All of these approaches to the justification of sample sizes, even the ‘no justification’ approach, give others insight into the reasons that led to the decision for a sample size in a study. It should not be surprising that the ‘heuristics’ and ‘no justification’ approaches are often unlikely to impress peers. However, it is important to note that the value of the information that is collected depends on the extent to which the final sample size allows a researcher to achieve their inferential goals, and not on the sample size justification that is chosen.

The extent to which these approaches make other researchers judge the data that is collected as informative depends on the details of the question a researcher aimed to answer and the parameters they chose when determining the sample size for their study. For example, a badly performed a-priori power analysis can quickly lead to a study with very low informational value. These six justifications are not mutually exclusive, and multiple approaches can be considered when designing a study.

The informativeness of the data that is collected depends on the inferential goals a researcher has, or in some cases, the inferential goals scientific peers will have. A shared feature of the different inferential goals considered in this review article is the question of which effect sizes a researcher considers meaningful to distinguish. This implies that researchers need to evaluate which effect sizes they consider interesting. These evaluations rely on a combination of statistical properties and domain knowledge. In Table 2 six possibly useful considerations are provided. This is not intended to be an exhaustive overview, but it presents common and useful approaches that can be applied in practice. Not all evaluations are equally relevant for all types of sample size justifications. The online Shiny app accompanying this manuscript provides researchers with an interactive form that guides them through the considerations for a sample size justification. These considerations often rely on the same information (e.g., effect sizes, the number of observations, the standard deviation, etc.), so these six considerations should be seen as a set of complementary approaches that can be used to evaluate which effect sizes are of interest.

To start, researchers should consider what their smallest effect size of interest is. Second, although only relevant when performing a hypothesis test, researchers should consider which effect sizes could be statistically significant given a choice of an alpha level and sample size. Third, it is important to consider the (range of) effect sizes that are expected. This requires a careful consideration of the source of this expectation and the presence of possible biases in these expectations. Fourth, it is useful to consider the width of the confidence interval around possible values of the effect size in the population, and whether we can expect this confidence interval to reject effects we considered a-priori plausible. Fifth, it is worth evaluating the power of the test across a wide range of possible effect sizes in a sensitivity power analysis. Sixth, a researcher can consider the effect size distribution of related studies in the literature.

Since all scientists are faced with resource limitations, they need to balance the cost of collecting each additional datapoint against the increase in information that datapoint provides. This is referred to as the value of information (Eckermann et al., 2010). Calculating the value of information is notoriously difficult (Detsky, 1990). Researchers need to specify the cost of collecting data, and weigh the costs of data collection against the increase in utility that having access to the data provides. From a value of information perspective not every data point that can be collected is equally valuable (J. Halpern et al., 2001; Wilson, 2015). Whenever additional observations do not change inferences in a meaningful way, the costs of data collection can outweigh the benefits.

The value of additional information will in most cases be a non-monotonic function, especially when it depends on multiple inferential goals. A researcher might be interested in comparing an effect against a previously observed large effect in the literature, a theoretically predicted medium effect, and the smallest effect that would be practically relevant. In such a situation the expected value of sampling information will lead to different optimal sample sizes for each inferential goal. It could be valuable to collect informative data about a large effect, with additional data having less (or even a negative) marginal utility, up to a point where the data becomes increasingly informative about a medium effect size, with the value of sampling additional information decreasing once more until the study becomes increasingly informative about the presence or absence of a smallest effect of interest.

Because of the difficulty of quantifying the value of information, scientists typically use less formal approaches to justify the amount of data they set out to collect in a study. Even though the cost-benefit analysis is not always made explicit in reported sample size justifications, the value of information perspective is almost always implicitly the underlying framework that sample size justifications are based on. Throughout the subsequent discussion of sample size justifications, the importance of considering the value of information given inferential goals will repeatedly be highlighted.

Measuring (Almost) the Entire Population

In some instances it might be possible to collect data from (almost) the entire population under investigation. For example, researchers might use census data, collect data from all employees at a firm, or study a small population of top athletes. Whenever it is possible to measure the entire population, the sample size justification becomes straightforward: the researcher used all the data that is available.

Resource Constraints

A common reason for the number of observations in a study is that resource constraints limit the amount of data that can be collected at a reasonable cost (Lenth, 2001) . In practice, sample sizes are always limited by the resources that are available. Researchers practically always have resource limitations, and therefore even when resource constraints are not the primary justification for the sample size in a study, it is always a secondary justification.

Despite the omnipresence of resource limitations, the topic often receives little attention in texts on experimental design (for an example of an exception, see Bulus and Dong (2021) ). This might make it feel like acknowledging resource constraints is not appropriate, but the opposite is true: Because resource limitations always play a role, a responsible scientist carefully evaluates resource constraints when designing a study. Resource constraint justifications are based on a trade-off between the costs of data collection, and the value of having access to the information the data provides. Even if researchers do not explicitly quantify this trade-off, it is revealed in their actions. For example, researchers rarely spend all the resources they have on a single study. Given resource constraints, researchers are confronted with an optimization problem of how to spend resources across multiple research questions.

Time and money are two resource limitations all scientists face. A PhD student has a certain amount of time to complete a PhD thesis, and is typically expected to complete multiple research lines in this time. In addition to time limitations, researchers have limited financial resources that often directly influence how much data can be collected. A third limitation in some research lines is that there might simply be a very small number of individuals from whom data can be collected, such as when studying patients with a rare disease. A resource constraint justification puts limited resources at the center of the justification for the sample size that will be collected, and starts with the resources a scientist has available. These resources are translated into an expected number of observations (N) that a researcher expects they will be able to collect with a given amount of money in a given amount of time. The challenge is to evaluate whether collecting N observations is worthwhile. How do we decide if a study will be informative, and when should we conclude that data collection is not worthwhile?

When evaluating whether resource constraints make data collection uninformative, researchers need to explicitly consider which inferential goals they have when collecting data (Parker & Berman, 2003) . Having data always provides more knowledge about the research question than not having data, so in an absolute sense, all data that is collected has value. However, it is possible that the benefits of collecting the data are outweighed by the costs of data collection.

It is most straightforward to evaluate whether data collection has value when we know for certain that someone will make a decision, with or without data. In such situations any additional data will reduce the error rates of a well-calibrated decision process, even if only ever so slightly. For example, without data we will not perform better than a coin flip if we guess which of two conditions has a higher true mean score on a measure. With some data, we can perform better than a coin flip by picking the condition that has the highest mean. With a small amount of data we would still very likely make a mistake, but the error rate is smaller than without any data. In these cases, the value of information might be positive, as long as the reduction in error rates is more beneficial than the cost of data collection.
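
To make this concrete, here is a minimal simulation sketch in R (the true difference of d = 0.3 and the sample sizes are hypothetical illustrations, not values from the article): it estimates how often we correctly pick the condition with the higher true mean as the number of observations grows.

# Probability of correctly guessing which condition has the higher true mean,
# based on comparing sample means (a coin flip would be correct 50% of the time).
pick_correct <- function(n, d = 0.3, n_sims = 10000) {
  mean(replicate(n_sims, mean(rnorm(n, mean = d)) > mean(rnorm(n, mean = 0))))
}
set.seed(1)
sapply(c(2, 5, 20, 100), pick_correct) # roughly 0.62, 0.68, 0.83, 0.98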

Another way in which a small dataset can be valuable is if its existence eventually makes it possible to perform a meta-analysis (Maxwell & Kelley, 2011) . This argument in favor of collecting a small dataset requires 1) that researchers share the data in a way that a future meta-analyst can find it, and 2) that there is a decent probability that someone will perform a high-quality meta-analysis that will include this data in the future (S. D. Halpern et al., 2002) . The uncertainty about whether there will ever be such a meta-analysis should be weighed against the costs of data collection.

One way to increase the probability of a future meta-analysis is if researchers commit to performing this meta-analysis themselves, by combining several studies they have performed into a small-scale meta-analysis (Cumming, 2014) . For example, a researcher might plan to repeat a study for the next 12 years in a class they teach, with the expectation that after 12 years a meta-analysis of 12 studies would be sufficient to draw informative inferences (but see ter Schure and Grünwald (2019) ). If it is not plausible that a researcher will collect all the required data by themselves, they can attempt to set up a collaboration where fellow researchers in their field commit to collecting similar data with identical measures. If it is not likely that sufficient data will emerge over time to reach the inferential goals, there might be no value in collecting the data.

Even if a researcher believes it is worth collecting data because a future meta-analysis will be performed, they will most likely perform a statistical test on the data. To make sure their expectations about the results of such a test are well-calibrated, it is important to consider which effect sizes are of interest, and to perform a sensitivity power analysis to evaluate the probability of a Type II error for effects of interest. From the six ways to evaluate which effect sizes are interesting that will be discussed in the second part of this review, it is useful to consider the smallest effect size that can be statistically significant, the expected width of the confidence interval around the effect size, and effects that can be expected in a specific research area, and to evaluate the power for these effect sizes in a sensitivity power analysis. If a decision or claim is made, a compromise power analysis is worthwhile to consider when deciding upon the error rates while planning the study. When reporting a resource constraints sample size justification it is recommended to address the five considerations in Table 3 . Addressing these points explicitly facilitates evaluating if the data is worthwhile to collect. To make it easier to address all relevant points explicitly, an interactive form to implement the recommendations in this manuscript can be found at https://shiny.ieis.tue.nl/sample_size_justification/ .

A-priori Power Analysis

When designing a study where the goal is to test whether a statistically significant effect is present, researchers often want to make sure their sample size is large enough to prevent erroneous conclusions for a range of effect sizes they care about. In this approach to justifying a sample size, the value of information is to collect observations up to the point that the probability of an erroneous inference is, in the long run, not larger than a desired value. If a researcher performs a hypothesis test, there are four possible outcomes:

A false positive (or Type I error), determined by the α level. A test yields a significant result, even though the null hypothesis is true.

A false negative (or Type II error), determined by β, or 1 − power. A test yields a non-significant result, even though the alternative hypothesis is true.

A true negative, determined by 1 − α. A test yields a non-significant result when the null hypothesis is true.

A true positive, determined by 1 − β. A test yields a significant result when the alternative hypothesis is true.

Given a specified effect size, alpha level, and power, an a-priori power analysis can be used to calculate the number of observations required to achieve the desired error rates. Figure 1 illustrates how statistical power increases as the number of observations (per group) increases in an independent t test with a two-sided alpha level of 0.05. If we are interested in detecting an effect of d = 0.5, a sample size of 90 per condition would give us more than 90% power. A power analysis can be performed to determine the number of participants or the number of items (Westfall et al., 2014), but also for single case studies (Ferron & Onghena, 1996; McIntosh & Rittmo, 2020).

[Figure 1: Statistical power for an independent t test (two-sided α = 0.05, d = 0.5) as a function of the number of observations per group.]
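
A minimal sketch of such a calculation in R, assuming the pwr package (the original article illustrates these analyses with tools such as G*Power; the package choice here is mine), reproduces the numbers mentioned above:

library(pwr)
# Sample size per group needed for 90% power to detect d = 0.5, two-sided alpha = 0.05
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.90, type = "two.sample")
# returns n of roughly 85 per group, so 86 after rounding up
# Power actually achieved with 90 observations per condition
pwr.t.test(n = 90, d = 0.5, sig.level = 0.05, type = "two.sample")$power # about 0.92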

Although it is common to set the Type I error rate to 5% and aim for 80% power, error rates should be justified (Lakens, Adolfi, et al., 2018) . As explained in the section on compromise power analysis, the default recommendation to aim for 80% power lacks a solid justification. In general, the lower the error rates (and thus the higher the power), the more informative a study will be, but the more resources are required. Researchers should carefully weigh the costs of increasing the sample size against the benefits of lower error rates, which would probably make studies designed to achieve a power of 90% or 95% more common for articles reporting a single study. An additional consideration is whether the researcher plans to publish an article consisting of a set of replication and extension studies, in which case the probability of observing multiple Type I errors will be very low, but the probability of observing mixed results even when there is a true effect increases (Lakens & Etz, 2017) , which would also be a reason to aim for studies with low Type II error rates, perhaps even by slightly increasing the alpha level for each individual study.

Figure 2 visualizes two distributions. The left distribution (dashed line) is centered at 0. This is a model for the null hypothesis. If the null hypothesis is true a statistically significant result will be observed if the effect size is extreme enough (in a two-sided test either in the positive or negative direction), but any significant result would be a Type I error (the dark grey areas under the curve). If there is no true effect, statistical power for a null hypothesis significance test is formally undefined. Any significant effects observed if the null hypothesis is true are Type I errors, or false positives, which occur at the chosen alpha level. The right distribution (solid line) is centered on an effect of d = 0.5. This is the specified model for the alternative hypothesis in this study, illustrating the expectation of an effect of d = 0.5 if the alternative hypothesis is true. Even though there is a true effect, studies will not always find a statistically significant result. This happens when, due to random variation, the observed effect size is too close to 0 to be statistically significant. Such results are false negatives (the light grey area under the curve on the right). To increase power, we can collect a larger sample size. As the sample size increases, the distributions become narrower, reducing the probability of a Type II error.

[Figure 2: Distributions of the observed effect size under the null hypothesis (d = 0, dashed line) and the alternative hypothesis (d = 0.5, solid line), with Type I errors (dark grey) and Type II errors (light grey) indicated.]

It is important to highlight that the goal of an a-priori power analysis is not to achieve sufficient power for the true effect size. The true effect size is unknown. The goal of an a-priori power analysis is to achieve sufficient power, given a specific assumption of the effect size a researcher wants to detect. Just like a Type I error rate is the maximum probability of making a Type I error conditional on the assumption that the null hypothesis is true, an a-priori power analysis is computed under the assumption of a specific effect size. It is unknown if this assumption is correct. All a researcher can do is to make sure their assumptions are well justified. Statistical inferences based on a test where the Type II error rate is controlled are conditional on the assumption of a specific effect size. They allow the inference that, assuming the true effect size is at least as large as that used in the a-priori power analysis, the maximum Type II error rate in a study is not larger than a desired value.

This point is perhaps best illustrated if we consider a study where an a-priori power analysis is performed both for a test of the presence of an effect and for a test of the absence of an effect. When designing a study, it is essential to consider the possibility that there is no effect (e.g., a mean difference of zero). An a-priori power analysis can be performed both for a null hypothesis significance test and for a test of the absence of a meaningful effect, such as an equivalence test that can statistically provide support for the null hypothesis by rejecting the presence of effects that are large enough to matter (Lakens, 2017; Meyners, 2012; Rogers et al., 1993). When multiple primary tests will be performed based on the same sample, each analysis requires a dedicated sample size justification. If possible, a sample size is collected that guarantees that all tests are informative, which means that the collected sample size is based on the largest sample size returned by any of the a-priori power analyses.

For example, if the goal of a study is to detect or reject an effect size of d = 0.4 with 90% power, and the alpha level is set to 0.05 for a two-sided independent t test, a researcher would need to collect 133 participants in each condition for an informative null hypothesis test, and 136 participants in each condition for an informative equivalence test. Therefore, the researcher should aim to collect 272 participants in total for an informative result for both tests that are planned. This does not guarantee a study has sufficient power for the true effect size (which can never be known), but it guarantees the study has sufficient power given an assumption of the effect a researcher is interested in detecting or rejecting. Therefore, an a-priori power analysis is useful, as long as a researcher can justify the effect sizes they are interested in.
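
For the null hypothesis test in this example, a sketch assuming the pwr package reproduces the first of the two numbers; the equivalence-test sample size of 136 per group quoted above would come from dedicated software (for example, the TOSTER package), and the larger of the two values determines the final sample size:

library(pwr)
# Detect d = 0.4 with 90% power in a two-sided independent t test, alpha = 0.05
pwr.t.test(d = 0.4, sig.level = 0.05, power = 0.90, type = "two.sample")
# returns n of roughly 132.3 per group, i.e. 133 after rounding up;
# the equivalence test requires 136 per group, so 136 per group (272 in total) is collected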

If researchers correct the alpha level when testing multiple hypotheses, the a-priori power analysis should be based on this corrected alpha level. For example, if four tests are performed, an overall Type I error rate of 5% is desired, and a Bonferroni correction is used, the a-priori power analysis should be based on a corrected alpha level of .0125.
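
Continuing the same sketch (pwr package assumed; the effect size of d = 0.4 is only an illustrative value), the Bonferroni-corrected alpha level simply replaces the nominal one in the power analysis:

library(pwr)
# Four tests, 5% familywise Type I error rate, Bonferroni correction: alpha = 0.05 / 4
pwr.t.test(d = 0.4, sig.level = 0.05 / 4, power = 0.90, type = "two.sample")
# the required n per group is larger than with the uncorrected alpha of 0.05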

An a-priori power analysis can be performed analytically, or by performing computer simulations. Analytic solutions are faster but less flexible. A common challenge researchers face when attempting to perform power analyses for more complex or uncommon tests is that available software does not offer analytic solutions. In these cases simulations can provide a flexible solution to perform power analyses for any test (Morris et al., 2019) . The following code is an example of a power analysis in R based on 10000 simulations for a one-sample t test against zero for a sample size of 20, assuming a true effect of d = 0.5. All simulations consist of first randomly generating data based on assumptions of the data generating mechanism (e.g., a normal distribution with a mean of 0.5 and a standard deviation of 1), followed by a test performed on the data. By computing the percentage of significant results, power can be computed for any design.

p <- numeric(10000) # to store p-values
for (i in 1:10000) { # simulate 10k tests
  x <- rnorm(n = 20, mean = 0.5, sd = 1)
  p[i] <- t.test(x)$p.value # store p-value
}
sum(p < 0.05) / 10000 # Compute power

There is a wide range of tools available to perform power analyses. Whichever tool a researcher decides to use, it will take time to learn how to use the software correctly to perform a meaningful a-priori power analysis. Resources to educate psychologists about power analysis consist of book-length treatments (Aberson, 2019; Cohen, 1988; Julious, 2004; Murphy et al., 2014) , general introductions (Baguley, 2004; Brysbaert, 2019; Faul et al., 2007; Maxwell et al., 2008; Perugini et al., 2018) , and an increasing number of applied tutorials for specific tests (Brysbaert & Stevens, 2018; DeBruine & Barr, 2019; P. Green & MacLeod, 2016; Kruschke, 2013; Lakens & Caldwell, 2021; Schoemann et al., 2017; Westfall et al., 2014) . It is important to be trained in the basics of power analysis, and it can be extremely beneficial to learn how to perform simulation-based power analyses. At the same time, it is often recommended to enlist the help of an expert, especially when a researcher lacks experience with a power analysis for a specific test.

When reporting an a-priori power analysis, make sure that the power analysis is completely reproducible. If power analyses are performed in R it is possible to share the analysis script and information about the version of the package. In many software packages it is possible to export the power analysis that is performed as a PDF file. For example, in G*Power analyses can be exported under the ‘protocol of power analysis’ tab. If the software package provides no way to export the analysis, add a screenshot of the power analysis to the supplementary files.

[Figure: Example of a power analysis protocol exported from G*Power.]

The reproducible report needs to be accompanied by justifications for the choices that were made with respect to the values used in the power analysis. If the effect size used in the power analysis is based on previous research the factors presented in Table 5 (if the effect size is based on a meta-analysis) or Table 6 (if the effect size is based on a single study) should be discussed. If an effect size estimate is based on the existing literature, provide a full citation, and preferably a direct quote from the article where the effect size estimate is reported. If the effect size is based on a smallest effect size of interest, this value should not just be stated, but justified (e.g., based on theoretical predictions or practical implications, see Lakens, Scheel, and Isager (2018) ). For an overview of all aspects that should be reported when describing an a-priori power analysis, see Table 4 .

Planning for Precision

Some researchers have suggested justifying sample sizes based on a desired level of precision of the estimate (Cumming & Calin-Jageman, 2016; Kruschke, 2018; Maxwell et al., 2008). The goal when justifying a sample size based on precision is to collect data to achieve a desired width of the confidence interval around a parameter estimate. The width of the confidence interval around the parameter estimate depends on the standard deviation and the number of observations. The only aspect a researcher needs to justify for a sample size justification based on accuracy is the desired width of the confidence interval with respect to their inferential goal, and their assumption about the population standard deviation of the measure.

If a researcher has determined the desired accuracy, and has a good estimate of the true standard deviation of the measure, it is straightforward to calculate the sample size needed for a desired level of accuracy. For example, when measuring the IQ of a group of individuals a researcher might desire to estimate the IQ score within an error range of 2 IQ points for 95% of the observed means, in the long run. The required sample size to achieve this desired level of accuracy (assuming normally distributed data) can be computed by:

N = ((z × sd) / error)^2

where N is the number of observations, z is the critical value related to the desired confidence interval, sd is the standard deviation of IQ scores in the population, and error is the width of the confidence interval within which the mean should fall, with the desired error rate. In this example, (1.96 × 15 / 2)^2 = 216.1 observations. If a researcher desires 95% of the means to fall within a 2 IQ point range around the true population mean, 217 observations should be collected. If a desired accuracy for a non-zero mean difference is computed, accuracy is based on a non-central t-distribution. For these calculations an expected effect size estimate needs to be provided, but it has relatively little influence on the required sample size (Maxwell et al., 2008). It is also possible to incorporate uncertainty about the observed effect size in the sample size calculation, known as assurance (Kelley & Rausch, 2006). The MBESS package in R provides functions to compute sample sizes for a wide range of tests (Kelley, 2007).
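
As a quick check, the calculation above can be reproduced directly in base R (a sketch of the worked example, not code from the article):

z <- qnorm(0.975) # critical value for a 95% confidence interval
sd_iq <- 15       # assumed population standard deviation of IQ scores
error <- 2        # desired error range of 2 IQ points
n <- (z * sd_iq / error)^2
ceiling(n)        # 217 observations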

What is less straightforward is to justify how a desired level of accuracy is related to inferential goals. There is no literature that helps researchers to choose a desired width of the confidence interval. Morey (2020) convincingly argues that most practical use-cases of planning for precision involve an inferential goal of distinguishing an observed effect from other effect sizes (for a Bayesian perspective, see Kruschke (2018) ). For example, a researcher might expect an effect size of r = 0.4 and would treat observed correlations that differ more than 0.2 (i.e., 0.2 < r < 0.6) differently, in that effects of r = 0.6 or larger are considered too large to be caused by the assumed underlying mechanism (Hilgard, 2021) , while effects smaller than r = 0.2 are considered too small to support the theoretical prediction. If the goal is indeed to get an effect size estimate that is precise enough so that two effects can be differentiated with high probability, the inferential goal is actually a hypothesis test, which requires designing a study with sufficient power to reject effects (e.g., testing a range prediction of correlations between 0.2 and 0.6).

If researchers do not want to test a hypothesis, for example because they prefer an estimation approach over a testing approach, then in the absence of clear guidelines that help researchers to justify a desired level of precision, one solution might be to rely on a generally accepted norm of precision to aim for. This norm could be based on ideas about a certain resolution below which measurements in a research area no longer lead to noticeably different inferences. Just as researchers normatively use an alpha level of 0.05, they could plan studies to achieve a desired confidence interval width around the observed effect that is determined by a norm. Future work is needed to help researchers choose a confidence interval width when planning for accuracy.

Heuristics

When a researcher uses a heuristic, they are not able to justify their sample size themselves, but they trust in a sample size recommended by some authority. When I started as a PhD student in 2005 it was common to collect 15 participants in each between-subjects condition. When asked why this was a common practice, no one was really sure, but people trusted there was a justification somewhere in the literature. Now, I realize there was no justification for the heuristics we used. As Berkeley (1735) already observed: “Men learn the elements of science from others: And every learner hath a deference more or less to authority, especially the young learners, few of that kind caring to dwell long upon principles, but inclining rather to take them upon trust: And things early admitted by repetition become familiar: And this familiarity at length passeth for evidence.”

Some papers provide researchers with simple rules of thumb about the sample size that should be collected. Such papers clearly fill a need, and are cited a lot, even when the advice in these articles is flawed. For example, Wilson VanVoorhis and Morgan (2007) translate an absolute minimum of 50+8 observations for regression analyses suggested by a rule of thumb examined in S. B. Green (1991) into the recommendation to collect ~50 observations. Green actually concludes in his article that “In summary, no specific minimum number of subjects or minimum ratio of subjects-to-predictors was supported”. He does discuss how a general rule of thumb of N = 50 + 8 provided an accurate minimum number of observations for the ‘typical’ study in the social sciences because these have a ‘medium’ effect size, as Green claims by citing Cohen (1988) . Cohen actually didn’t claim that the typical study in the social sciences has a ‘medium’ effect size, and instead said (1988, p. 13) : “Many effects sought in personality, social, and clinical-psychological research are likely to be small effects as here defined”. We see how a string of mis-citations eventually leads to a misleading rule of thumb.

Rules of thumb seem to primarily emerge due to mis-citations and/or overly simplistic recommendations. Simonsohn, Nelson, and Simmons (2011) recommended that “Authors must collect at least 20 observations per cell”. A later recommendation by the same authors presented at a conference suggested to use n > 50, unless you study large effects (Simmons et al., 2013) . Regrettably, this advice is now often mis-cited as a justification to collect no more than 50 observations per condition without considering the expected effect size. If authors justify a specific sample size (e.g., n = 50) based on a general recommendation in another paper, either they are mis-citing the paper, or the paper they are citing is flawed.

Another common heuristic is to collect the same number of observations as were collected in a previous study. This strategy is not recommended in scientific disciplines with widespread publication bias, and/or where novel and surprising findings from largely exploratory single studies are published. Using the same sample size as a previous study is only a valid approach if the sample size justification in the previous study also applies to the current study. Instead of stating that you intend to collect the same sample size as an earlier study, repeat the sample size justification, and update it in light of any new information (such as the effect size in the earlier study, see Table 6 ).

Peer reviewers and editors should carefully scrutinize rule-of-thumb sample size justifications, because they can make it seem like a study has high informational value for an inferential goal even when the study will yield uninformative results. Whenever you encounter a sample size justification based on a heuristic, ask yourself: ‘Why is this heuristic used?’ It is important to know what the logic behind a heuristic is to determine whether the heuristic is valid for a specific situation. In most cases, heuristics are based on weak logic, and not widely applicable. It might be possible that fields develop valid heuristics for sample size justifications. For example, it is possible that a research area reaches widespread agreement that effects smaller than d = 0.3 are too small to be of interest, and all studies in a field use sequential designs (see below) that have 90% power to detect a d = 0.3. Alternatively, it is possible that a field agrees that data should be collected with a desired level of accuracy, irrespective of the true effect size. In these cases, valid heuristics would exist based on generally agreed goals of data collection. For example, Simonsohn (2015) suggests designing replication studies with sample sizes 2.5 times as large as the original study, as this provides 80% power for an equivalence test against an equivalence bound set to the effect the original study had 33% power to detect, assuming the true effect size is 0. As original authors typically do not specify which effect size would falsify their hypothesis, the heuristic underlying this ‘small telescopes’ approach is a good starting point for a replication study with the inferential goal of rejecting the presence of an effect as large as was described in an earlier publication. It is the responsibility of researchers to gain the knowledge to distinguish valid heuristics from mindless heuristics, and to be able to evaluate whether a heuristic will yield an informative result given the inferential goal of the researchers in a specific study.

No Justification

It might sound like a contradictio in terminis , but it is useful to distinguish a final category where researchers explicitly state they do not have a justification for their sample size. Perhaps the resources were available to collect more data, but they were not used. A researcher could have performed a power analysis, or planned for precision, but they did not. In those cases, instead of pretending there was a justification for the sample size, honesty requires you to state there is no sample size justification. This is not necessarily bad. It is still possible to discuss the smallest effect size of interest, the minimal statistically detectable effect, the width of the confidence interval around the effect size, and to plot a sensitivity power analysis, in relation to the sample size that was collected. If a researcher truly had no specific inferential goals when collecting the data, such an evaluation can perhaps be performed based on reasonable inferential goals peers would have when they learn about the existence of the collected data.

Do not try to spin a story where it looks like a study was highly informative when it was not. Instead, transparently evaluate how informative the study was given effect sizes that were of interest, and make sure that the conclusions follow from the data. The lack of a sample size justification might not be problematic, but it might mean that a study was not informative for most effect sizes of interest, which makes it especially difficult to interpret non-significant effects, or estimates with large uncertainty.

The inferential goal of data collection is often in some way related to the size of an effect. Therefore, to design an informative study, researchers will want to think about which effect sizes are interesting. First, it is useful to consider three effect sizes when determining the sample size. The first is the smallest effect size a researcher is interested in, the second is the smallest effect size that can be statistically significant (only in studies where a significance test will be performed), and the third is the effect size that is expected. Beyond considering these three effect sizes, it can be useful to evaluate ranges of effect sizes. This can be done by computing the width of the expected confidence interval around an effect size of interest (for example, an effect size of zero), and examining which effects could be rejected. Similarly, it can be useful to plot a sensitivity curve and evaluate the range of effect sizes the design has decent power to detect, as well as to consider the range of effects for which the design has low power. Finally, there are situations where it is useful to consider the range of effects that is likely to be observed in a specific research area.

What is the Smallest Effect Size of Interest?

The strongest possible sample size justification is based on an explicit statement of the smallest effect size that is considered interesting. A smallest effect size of interest can be based on theoretical predictions or practical considerations. For a review of approaches that can be used to determine a smallest effect size of interest in randomized controlled trials, see Cook et al.  (2014) and Keefe et al.  (2013) , for reviews of different methods to determine a smallest effect size of interest, see King (2011) and Copay, Subach, Glassman, Polly, and Schuler (2007) , and for a discussion focused on psychological research, see Lakens, Scheel, et al.  (2018) .

It can be challenging to determine the smallest effect size of interest whenever theories are not very developed, or when the research question is far removed from practical applications, but it is still worth thinking about which effects would be too small to matter. A first step forward is to discuss which effect sizes are considered meaningful in a specific research line with your peers. Researchers will differ in the effect sizes they consider large enough to be worthwhile (Murphy et al., 2014) . Just as not every scientist will find every research question interesting enough to study, not every scientist will consider the same effect sizes interesting enough to study, and different stakeholders will differ in which effect sizes are considered meaningful (Kelley & Preacher, 2012) .

Even though it might be challenging, there are important benefits of being able to specify a smallest effect size of interest. The population effect size is always uncertain (indeed, estimating this is typically one of the goals of the study), and therefore whenever a study is powered for an expected effect size, there is considerable uncertainty about whether the statistical power is high enough to detect the true effect in the population. However, if the smallest effect size of interest can be specified and agreed upon after careful deliberation, it becomes possible to design a study that has sufficient power (given the inferential goal to detect or reject the smallest effect size of interest with a certain error rate). A smallest effect of interest may be subjective (one researcher might find effect sizes smaller than d = 0.3 meaningless, while another researcher might still be interested in effects larger than d = 0.1), and there might be uncertainty about the parameters required to specify the smallest effect size of interest (e.g., when performing a cost-benefit analysis), but after a smallest effect size of interest has been determined, a study can be designed with a known Type II error rate to detect or reject this value. For this reason an a-priori power analysis based on a smallest effect size of interest is generally preferred, whenever researchers are able to specify one (Aberson, 2019; Albers & Lakens, 2018; Brown, 1983; Cascio & Zedeck, 1983; Dienes, 2014; Lenth, 2001).

The Minimal Statistically Detectable Effect

The minimal statistically detectable effect, or the critical effect size, provides information about the smallest effect size that, if observed, would be statistically significant given a specified alpha level and sample size (Cook et al., 2014). For any critical t value (e.g., t = 1.96 for α = 0.05, for large sample sizes) we can compute a critical mean difference (Phillips et al., 2001), or a critical standardized effect size. For a two-sided independent t test the critical mean difference is:

M_crit = t_crit × sqrt(sd1^2/n1 + sd2^2/n2)

and the critical standardized mean difference is:

d_crit = t_crit × sqrt(1/n1 + 1/n2)

In Figure 4 the distribution of Cohen’s d is plotted for 15 participants per group when the true effect size is either d = 0 or d = 0.5. This figure is similar to Figure 2 , with the addition that the critical d is indicated. We see that with such a small number of observations in each group only observed effects larger than d = 0.75 will be statistically significant. Whether such effect sizes are interesting, and can realistically be expected, should be carefully considered and justified.

[Figure 4: Distribution of Cohen's d with 15 participants per group when the true effect size is d = 0 or d = 0.5, with the critical d indicated.]
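
The critical d reported above follows directly from the formula for the critical standardized mean difference; a minimal check in base R (my own sketch):

n1 <- 15; n2 <- 15
t_crit <- qt(0.975, df = n1 + n2 - 2) # critical t for a two-sided test, alpha = 0.05
d_crit <- t_crit * sqrt(1 / n1 + 1 / n2)
d_crit # approximately 0.75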

G*Power provides the critical test statistic (such as the critical t value) when performing a power analysis. For example, Figure 5 shows that for a correlation based on a two-sided test, with α = 0.05, and N = 30, only effects larger than r = 0.361 or smaller than r = -0.361 can be statistically significant. This reveals that when the sample size is relatively small, the observed effect needs to be quite substantial to be statistically significant.

[Figure 5: G*Power output showing the critical correlation (r = 0.361) for a two-sided test with α = 0.05 and N = 30.]
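
The same critical value can be recovered without G*Power; a base-R sketch for a two-sided test of a Pearson correlation (assuming the usual t-based test of r):

N <- 30
t_crit <- qt(0.975, df = N - 2)             # critical t for alpha = 0.05, two-sided
r_crit <- t_crit / sqrt(t_crit^2 + (N - 2)) # convert the critical t back to a correlation
r_crit                                      # approximately 0.361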

It is important to realize that due to random variation each study has some probability of yielding effects larger than the critical effect size, even if the true effect size is small (or even when the true effect size is 0, in which case each significant effect is a Type I error). Computing a minimal statistically detectable effect is useful for a study where no a-priori power analysis is performed, both for studies in the published literature that do not report a sample size justification (Lakens, Scheel, et al., 2018) and for researchers who rely on heuristics for their sample size justification.

It can be informative to ask yourself whether the critical effect size for a study design is within the range of effect sizes that can realistically be expected. If not, then whenever a significant effect is observed in a published study, either the effect size is surprisingly larger than expected, or more likely, it is an upwardly biased effect size estimate. In the latter case, given publication bias, published studies will lead to biased effect size estimates. If it is still possible to increase the sample size, for example by ignoring rules of thumb and instead performing an a-priori power analysis, then do so. If it is not possible to increase the sample size, for example due to resource constraints, then reflecting on the minimal statistically detectable effect should make it clear that an analysis of the data should not focus on p values, but on the effect size and the confidence interval (see Table 3 ).

It is also useful to compute the minimal statistically detectable effect if an ‘optimistic’ power analysis is performed. For example, if you believe a best case scenario for the true effect size is d = 0.57 and use this optimistic expectation in an a-priori power analysis, effects smaller than d = 0.4 will not be statistically significant when you collect 50 observations in a two independent group design. If your worst case scenario for the alternative hypothesis is a true effect size of d = 0.35 your design would not allow you to declare a significant effect if effect size estimates close to the worst case scenario are observed. Taking into account the minimal statistically detectable effect size should make you reflect on whether a hypothesis test will yield an informative answer, and whether your current approach to sample size justification (e.g., the use of rules of thumb, or letting resource constraints determine the sample size) leads to an informative study, or not.

What is the Expected Effect Size?

Although the true population effect size is always unknown, there are situations where researchers have a reasonable expectation of the effect size in a study, and want to use this expected effect size in an a-priori power analysis. Even if expectations for the observed effect size are largely a guess, it is always useful to explicitly consider which effect sizes are expected. A researcher can justify a sample size based on the effect size they expect, even if such a study would not be very informative with respect to the smallest effect size of interest. In such cases a study is informative for one inferential goal (testing whether the expected effect size is present or absent), but not highly informative for the second goal (testing whether the smallest effect size of interest is present or absent).

There are typically three sources for expectations about the population effect size: a meta-analysis, a previous study, or a theoretical model. It is tempting for researchers to be overly optimistic about the expected effect size in an a-priori power analysis, as higher effect size estimates yield lower sample sizes, but being too optimistic increases the probability of observing a false negative result. When reviewing a sample size justification based on an a-priori power analysis, it is important to critically evaluate the justification for the expected effect size used in power analyses.

Using an Estimate from a Meta-Analysis

In a perfect world effect size estimates from a meta-analysis would provide researchers with the most accurate information about which effect size they could expect. Due to widespread publication bias in science, effect size estimates from meta-analyses are regrettably not always accurate. They can be biased, sometimes substantially so. Furthermore, meta-analyses typically have considerable heterogeneity, which means that the meta-analytic effect size estimate differs for subsets of studies that make up the meta-analysis. So, although it might seem useful to use a meta-analytic effect size estimate of the effect you are studying in your power analysis, you need to take great care before doing so.

If a researcher wants to enter a meta-analytic effect size estimate in an a-priori power analysis, they need to consider three things (see Table 5 ). First, the studies included in the meta-analysis should be similar enough to the study they are performing that it is reasonable to expect a similar effect size. In essence, this requires evaluating the generalizability of the effect size estimate to the new study. It is important to carefully consider differences between the meta-analyzed studies and the planned study, with respect to the manipulation, the measure, the population, and any other relevant variables.

Second, researchers should check whether the effect sizes reported in the meta-analysis are homogeneous. If not, and there is considerable heterogeneity in the meta-analysis, it means that not all included studies can be expected to have the same true effect size. A meta-analytic estimate should be used based on the subset of studies that most closely represent the planned study. Note that heterogeneity remains a possibility (even direct replication studies can show heterogeneity when unmeasured variables moderate the effect size in each sample (Kenny & Judd, 2019; Olsson-Collentine et al., 2020)), so the main goal of selecting similar studies is to use existing data to increase the probability that your expectation is accurate, without guaranteeing it will be.

Third, the meta-analytic effect size estimate should not be biased. Check if the bias detection tests that are reported in the meta-analysis are state-of-the-art, or perform multiple bias detection tests yourself (Carter et al., 2019) , and consider bias corrected effect size estimates (even though these estimates might still be biased, and do not necessarily reflect the true population effect size).

Using an Estimate from a Previous Study

If a meta-analysis is not available, researchers often rely on an effect size from a previous study in an a-priori power analysis. The first issue that requires careful attention is whether the two studies are sufficiently similar. Just as when using an effect size estimate from a meta-analysis, researchers should consider if there are differences between the studies in terms of the population, the design, the manipulations, the measures, or other factors that should lead one to expect a different effect size. For example, intra-individual reaction time variability increases with age, and therefore a study performed on older participants should expect a smaller standardized effect size than a study performed on younger participants. If an earlier study used a very strong manipulation, and you plan to use a more subtle manipulation, a smaller effect size should be expected. Finally, effect sizes do not generalize to studies with different designs. For example, the effect size for a comparison between two groups is most often not similar to the effect size for an interaction in a follow-up study where a second factor is added to the original design (Lakens & Caldwell, 2021) .

Even if a study is sufficiently similar, statisticians have warned against using effect size estimates from small pilot studies in power analyses. Leon, Davis, and Kraemer (2011) write:

Contrary to tradition, a pilot study does not provide a meaningful effect size estimate for planning subsequent studies due to the imprecision inherent in data from small samples.

The two main reasons researchers should be careful when using effect sizes from studies in the published literature in power analyses are that effect size estimates from studies can differ from the true population effect size due to random variation, and that publication bias inflates effect sizes. Figure 6 shows the distribution of η_p² for a study with three conditions with 25 participants in each condition when the null hypothesis is true and when there is a ‘medium’ true effect of η_p² = 0.0588 (Richardson, 2011). As in Figure 4 the critical effect size is indicated, which shows that observed effects smaller than η_p² = 0.08 will not be significant with the given sample size. If the null hypothesis is true, effects larger than η_p² = 0.08 will be a Type I error (the dark grey area), and when the alternative hypothesis is true, effects smaller than η_p² = 0.08 will be a Type II error (light grey area). It is clear that all significant effects are larger than the true effect size (η_p² = 0.0588), so power analyses based on a significant finding (e.g., because only significant results are published in the literature) will be based on an overestimate of the true effect size, introducing bias.

[Figure 6: Distribution of partial eta squared (η_p²) for a One-Way ANOVA with three conditions and 25 participants per condition when the true effect is 0 or 0.0588, with the critical effect size indicated.]

But even if we had access to all effect sizes (e.g., from pilot studies you have performed yourself), due to random variation the observed effect size will sometimes be quite small. Figure 6 shows it is quite likely to observe an effect of η_p² = 0.01 in a small pilot study, even when the true effect size is 0.0588. Entering an effect size estimate of η_p² = 0.01 in an a-priori power analysis would suggest a total sample size of 957 observations to achieve 80% power in a follow-up study. If researchers only follow up on pilot studies when they observe an effect size in the pilot study that, when entered into a power analysis, yields a sample size that is feasible to collect for the follow-up study, these effect size estimates will be upwardly biased, and power in the follow-up study will be systematically lower than desired (Albers & Lakens, 2018).
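
The total of 957 observations can be reproduced with a sketch assuming the pwr package (the conversion from partial eta squared to Cohen's f, f = sqrt(η_p² / (1 − η_p²)), is the standard one; the package choice is mine):

library(pwr)
f <- sqrt(0.01 / (1 - 0.01)) # Cohen's f corresponding to eta_p^2 = 0.01
pwr.anova.test(k = 3, f = f, sig.level = 0.05, power = 0.80)
# roughly 319 participants per condition, 957 observations in total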

In essence, the problem with using small studies to estimate the effect size that will be entered into an a-priori power analysis is that due to publication bias or follow-up bias the effect sizes researchers end up using for their power analysis do not come from a full F distribution, but from what is known as a truncated F distribution (Taylor & Muller, 1996). For example, imagine there is extreme publication bias in the situation illustrated in Figure 6. The only studies that would be accessible to researchers would come from the part of the distribution where η_p² > 0.08, and the test result would be statistically significant. It is possible to compute an effect size estimate that, based on certain assumptions, corrects for bias. For example, imagine we observe a result in the literature for a One-Way ANOVA with 3 conditions, reported as F(2, 42) = 4.5, p = 0.017, η_p² = 0.176. If we take this effect size at face value and enter it as our effect size estimate in an a-priori power analysis, the analysis would suggest we need to collect 17 observations in each condition to achieve 80% power.

However, if we assume bias is present, we can use the BUCSS R package (S. F. Anderson et al., 2017) to perform a power analysis that attempts to correct for bias. A power analysis that takes bias into account (under a specific model of publication bias, based on a truncated F distribution where only significant results are published) suggests collecting 73 participants in each condition. It is possible that the bias-corrected estimate of the non-centrality parameter used to compute power is zero, in which case it is not possible to correct for bias using this method. As an alternative to formally modeling a correction for publication bias whenever researchers assume an effect size estimate is biased, researchers can simply use a more conservative effect size estimate, for example by computing power based on the lower limit of a 60% two-sided confidence interval around the effect size estimate, which Perugini, Gallucci, and Costantini (2014) refer to as safeguard power. Both of these approaches lead to a more conservative power analysis, but not necessarily a more accurate one. It is simply not possible to perform an accurate power analysis on the basis of an effect size estimate from a study that might be biased and/or had a small sample size (Teare et al., 2014). If it is not possible to specify a smallest effect size of interest, and there is great uncertainty about which effect size to expect, it might be more efficient to perform a study with a sequential design (discussed below).
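As a minimal illustration of the safeguard power idea, the sketch below uses a simpler two-group design with assumed numbers (d = 0.5 observed with 20 participants per group) and a normal approximation to the standard error of d; it is not the exact procedure of Perugini and colleagues or of the BUCSS package, only a demonstration of how basing the power analysis on the lower limit of a 60% confidence interval increases the planned sample size.

```r
# Sketch of 'safeguard power': plan for the lower limit of a 60% two-sided CI
# around a (possibly biased) published effect size, rather than the point estimate.
# Assumed numbers for illustration: d = 0.5 observed with n = 20 per group.
library(pwr)

d <- 0.5; n1 <- 20; n2 <- 20
se_d <- sqrt((n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2)))  # approximate SE of d
d_safeguard <- d - qnorm(0.80) * se_d  # lower limit of a 60% two-sided CI

pwr.t.test(d = d, power = 0.80, sig.level = 0.05)$n           # ~64 per group
pwr.t.test(d = d_safeguard, power = 0.80, sig.level = 0.05)$n # several hundred per group
```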

To summarize, an effect size from a previous study can be used in an a-priori power analysis if three conditions are met (see Table 6). First, the previous study is sufficiently similar to the planned study. Second, there was a low risk of bias (e.g., the effect size estimate comes from a Registered Report, or from an analysis for which the results would not have affected the likelihood of publication). Third, the sample size is large enough to yield a relatively accurate effect size estimate, based on the width of the 95% CI around the observed effect size estimate. There is always uncertainty around the effect size estimate, and entering the upper and lower limits of the 95% CI into the power analysis can be informative about the consequences of this uncertainty for the planned sample size.

Using an Estimate from a Theoretical Model

When your theoretical model is sufficiently specific such that you can build a computational model, and you have knowledge about key parameters in your model that are relevant for the data you plan to collect, it is possible to estimate an effect size based on the effect size estimate derived from a computational model. For example, if one had strong ideas about the weights for each feature stimuli share and differ on, it could be possible to compute predicted similarity judgments for pairs of stimuli based on Tversky’s contrast model (Tversky, 1977) , and estimate the predicted effect size for differences between experimental conditions. Although computational models that make point predictions are relatively rare, whenever they are available, they provide a strong justification of the effect size a researcher expects.

Compute the Width of the Confidence Interval around the Effect Size

If a researcher can estimate the standard deviation of the observations that will be collected, it is possible to compute an a-priori estimate of the width of the 95% confidence interval around an effect size (Kelley, 2007). Confidence intervals represent a range around an estimate that is wide enough so that, in the long run, the true population parameter will fall inside the confidence interval (1 − α) × 100 percent of the time (e.g., 95% of the time when α = 0.05). In any single study the true population effect either falls in the confidence interval or it doesn't, but in the long run one can act as if the confidence interval includes the true population effect size (while keeping the error rate in mind). Cumming (2013) calls the difference between the observed effect size and the upper bound of the 95% confidence interval (or the lower bound of the 95% confidence interval) the margin of error.

If we compute the 95% CI for an effect size of d = 0 based on the t statistic and sample size (Smithson, 2003), we see that with 15 observations in each condition of an independent t test the 95% CI ranges from d = -0.72 to d = 0.72. 5 The margin of error is half the width of the 95% CI, 0.72. A Bayesian estimator who uses an uninformative prior would compute a credible interval with the same (or a very similar) upper and lower bound (Albers et al., 2018; Kruschke, 2011), and might conclude that after collecting the data they would be left with a range of plausible values for the population effect that is too large to be informative. Regardless of the statistical philosophy you plan to rely on when analyzing the data, the width of this interval tells us that with 15 observations per group we will not learn a lot.
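A minimal sketch of this calculation in base R follows. It relies on the fact that for an observed d of exactly zero the noncentral-t-based confidence interval is symmetric, so the limits follow directly from the normal quantile; the sample size of 15 per group is taken from the example above.

```r
# 95% CI around d = 0 with 15 observations per group (independent t test).
# For t_obs = 0, P(T <= 0 | ncp) = pnorm(-ncp), so the limiting noncentrality
# parameter is simply the standard normal quantile.
n <- 15
ncp_limit <- qnorm(0.975)               # ~1.96
d_limit <- ncp_limit * sqrt(1/n + 1/n)  # convert the ncp to Cohen's d
round(c(-d_limit, d_limit), 2)          # approximately -0.72 to 0.72
```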

One useful way of interpreting the width of the confidence interval is based on the effects you would be able to reject if the true effect size were 0: which effect sizes could be rejected given the collected data, and which could not? Effects in the range of d = 0.7 are findings such as “People become aggressive when they are provoked”, “People prefer their own group to other groups”, and “Romantic partners resemble one another in physical attractiveness” (Richard et al., 2003). The width of the confidence interval tells you that you can only reject the presence of effects that are so large that, if they existed, you would probably already have noticed them. If the effects you study are realistically much smaller than d = 0.7, there is a good chance that a study with n = 15 will not teach us anything we didn't already know. Even without data, in most research lines we would not consider certain large effects plausible (although the effect sizes that are plausible differ between fields, as discussed below). On the other hand, in large samples where researchers can, for example, reject the presence of effects larger than d = 0.2 if the null hypothesis were true, this analysis of the width of the confidence interval would suggest that peers in many research lines would consider the study informative.

We see that the margin of error is almost, but not exactly, the same as the minimal statistically detectable effect (d = 0.75). The small difference arises because the 95% confidence interval is calculated based on the t distribution. If the true effect size is not zero, the confidence interval is calculated based on the non-central t distribution, and the 95% CI is asymmetric. Figure 7 visualizes three t distributions: one symmetric around 0, and two asymmetric distributions with a noncentrality parameter (the normalized difference between the means) of 2 and 3. The asymmetry is most clearly visible in very small samples (the distributions in the plot have 5 degrees of freedom) but remains noticeable in larger samples when calculating confidence intervals and statistical power. For example, a true effect size of d = 0.5 observed with 15 observations per group would yield dₛ = 0.50, 95% CI [-0.23, 1.22]. If we compute the 95% CI around the critical effect size, we get dₛ = 0.75, 95% CI [0.00, 1.48]. The 95% CI ranges from exactly 0.00 to 1.48, in line with the relation between a confidence interval and a p value, where the 95% CI excludes zero if the test is statistically significant. As noted before, the different approaches recommended here to evaluate how informative a study is are often based on the same information.
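The sketch below, a base R illustration under the same assumptions (two-sided test, 15 observations per group, alpha = .05), computes the minimal statistically detectable effect and inverts the noncentral t distribution to obtain the confidence interval around it.

```r
# Minimal statistically detectable effect and the 95% CI around it.
n <- 15; df <- 2 * n - 2
t_crit <- qt(0.975, df)
d_crit <- t_crit * sqrt(2 / n)  # ~0.75, the smallest effect that reaches p < .05

# Upper CI limit: the noncentrality parameter for which the observed t value
# lies at the 2.5th percentile of the noncentral t distribution.
ncp_upper <- uniroot(function(ncp) pt(t_crit, df, ncp) - 0.025, c(0, 20))$root
round(c(lower = 0, upper = ncp_upper * sqrt(2 / n)), 2)  # approximately [0.00, 1.48]
```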

[Figure 7: A central t distribution (noncentrality parameter 0) and two noncentral t distributions (noncentrality parameters 2 and 3), each with 5 degrees of freedom.]

Plot a Sensitivity Power Analysis

A sensitivity power analysis fixes the sample size, the desired power, and the alpha level, and answers the question of which effect size a study could detect with the desired power. A sensitivity power analysis is therefore performed when the sample size is already known. Sometimes data has already been collected to answer a different research question, or the data is retrieved from an existing database, and you want to perform a sensitivity power analysis for a new statistical analysis. Other times, you might not have carefully considered the sample size when you initially collected the data, and want to reflect on the statistical power of the study for (ranges of) effect sizes of interest when analyzing the results. Finally, it is possible that the sample size will be collected in the future, but you know that due to resource constraints the maximum sample size you can collect is limited, and you want to reflect on whether the study has sufficient power for effects that you consider plausible and interesting (such as the smallest effect size of interest, or the effect size that is expected).

Assume a researcher plans to perform a study where 30 observations will be collected in total, 15 in each between participant condition. Figure 8 shows how to perform a sensitivity power analysis in G*Power for a study where we have decided to use an alpha level of 5%, and desire 90% power. The sensitivity power analysis reveals the designed study has 90% power to detect effects of at least d = 1.23. Perhaps a researcher believes that a desired power of 90% is quite high, and is of the opinion that it would still be interesting to perform a study if the statistical power was lower. It can then be useful to plot a sensitivity curve across a range of smaller effect sizes.

[Figure 8: Screenshot of a sensitivity power analysis in G*Power for an independent t test with 15 participants per group, a 5% alpha level, and 90% desired power.]
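The same sensitivity analysis can be reproduced outside G*Power; the base R sketch below solves for the detectable effect size given the fixed sample size, alpha level, and desired power from the example (the result may differ from G*Power by rounding).

```r
# Sensitivity power analysis: which effect size can be detected with 90% power,
# given 15 observations per group and a 5% alpha level?
power.t.test(n = 15, power = 0.90, sig.level = 0.05,
             type = "two.sample", alternative = "two.sided")$delta
# With sd = 1 (the default), delta is Cohen's d; the result is approximately 1.23.
```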

The two dimensions of interest in a sensitivity power analysis are the effect sizes, and the power to observe a significant effect assuming a specific effect size. These two dimensions can be plotted against each other to create a sensitivity curve. For example, a sensitivity curve can be plotted in G*Power by clicking the ‘X-Y plot for a range of values’ button, as illustrated in Figure 9 . Researchers can examine which power they would have for an a-priori plausible range of effect sizes, or they can examine which effect sizes would provide reasonable levels of power. In simulation-based approaches to power analysis, sensitivity curves can be created by performing the power analysis for a range of possible effect sizes. Even if 50% power is deemed acceptable (in which case deciding to act as if the null hypothesis is true after a non-significant result is a relatively noisy decision procedure), Figure 9 shows a study design where power is extremely low for a large range of effect sizes that are reasonable to expect in most fields. Thus, a sensitivity power analysis provides an additional approach to evaluate how informative the planned study is, and can inform researchers that a specific design is unlikely to yield a significant effect for a range of effects that one might realistically expect.

[Figure 9: Sensitivity curve plotting statistical power against effect size for the design with 15 participants per group.]
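A sensitivity curve like the one in Figure 9 can be drawn with a few lines of base R; the range of effect sizes below is an arbitrary choice for illustration.

```r
# Plot power across a range of effect sizes for n = 15 per group, alpha = .05.
d_range <- seq(0.1, 2, by = 0.01)
power_curve <- sapply(d_range, function(d)
  power.t.test(n = 15, delta = d, sd = 1, sig.level = 0.05)$power)

plot(d_range, power_curve, type = "l",
     xlab = "Effect size (Cohen's d)", ylab = "Statistical power")
abline(h = c(0.5, 0.9), lty = 2)  # reference lines at 50% and 90% power
```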

If the number of observations per group had been larger, the evaluation might have been more positive. We might not have had any specific effect size in mind, but if we had collected 150 observations per group, a sensitivity analysis could have shown that power was sufficient for the range of effects we believe is most interesting to examine, and that we would still have approximately 50% power for quite small effects. For a sensitivity analysis to be meaningful, the sensitivity curve should be compared against a smallest effect size of interest, or a range of effect sizes that are expected. A sensitivity power analysis has no clear cut-offs to examine (Bacchetti, 2010). Instead, the idea is to make a holistic trade-off between the different effect sizes one might observe or care about and their associated statistical power.

The Distribution of Effect Sizes in a Research Area

In my personal experience the most commonly entered effect size estimate in an a-priori power analysis for an independent t test is Cohen’s benchmark for a ‘medium’ effect size, because of what is known as the default effect . When you open G*Power, a ‘medium’ effect is the default option for an a-priori power analysis. Cohen’s benchmarks for small, medium, and large effects should not be used in an a-priori power analysis (Cook et al., 2014; Correll et al., 2020) , and Cohen regretted having proposed these benchmarks (Funder & Ozer, 2019) . The large variety in research topics means that any ‘default’ or ‘heuristic’ that is used to compute statistical power is not just unlikely to correspond to your actual situation, but it is also likely to lead to a sample size that is substantially misaligned with the question you are trying to answer with the collected data.

Some researchers have wondered what a better default would be, if researchers have no other basis for deciding on an effect size for an a-priori power analysis. Brysbaert (2019) recommends d = 0.4 as a default in psychology, which is the average observed in replication projects and several meta-analyses. It is impossible to know whether this average effect size is realistic, but it is clear there is huge heterogeneity across fields and research questions. Any average effect size will often deviate substantially from the effect size that should be expected in a planned study. Some researchers have suggested adjusting Cohen's benchmarks based on the distribution of effect sizes in a specific field (Bosco et al., 2015; Funder & Ozer, 2019; Hill et al., 2008; Kraft, 2020; Lovakov & Agadullina, 2017). As always, when effect size estimates are based on the published literature, one needs to evaluate the possibility that the estimates are inflated due to publication bias. Due to the large variation in effect sizes within a specific research area, there is little use in choosing a large, medium, or small effect size benchmark based on the empirical distribution of effect sizes in a field to perform a power analysis.

Having some knowledge about the distribution of effect sizes in the literature can be useful when interpreting the confidence interval around an effect size. If in a specific research area almost no effects are larger than the value you could reject in an equivalence test (e.g., if the observed effect size is 0, the design would only allow you to reject effects larger than d = 0.7), then it is a-priori unlikely that collecting the data would tell you something you didn't already know.

It is more difficult to defend the use of a specific effect size derived from an empirical distribution of effect sizes as a justification for the effect size used in an a-priori power analysis. One might argue that the use of an effect size benchmark based on the distribution of effects in the literature will outperform a wild guess, but this is not a strong enough argument to form the basis of a sample size justification. There is a point where researchers need to admit they are not ready to perform an a-priori power analysis due to a lack of clear expectations (Scheel et al., 2020) . Alternative sample size justifications, such as a justification of the sample size based on resource constraints, perhaps in combination with a sequential study design, might be more in line with the actual inferential goals of a study.

So far, the focus has been on justifying the sample size for quantitative studies. There are a number of related topics that can be useful when designing an informative study. First, in addition to a-priori or prospective power analysis and sensitivity power analysis, it is important to discuss compromise power analysis (which is useful) and post-hoc or retrospective power analysis (which is not useful, e.g., Zumbo and Hubley (1998), Lenth (2007)). When sample sizes are justified based on an a-priori power analysis, it can be very efficient to collect data in sequential designs where data collection is continued or terminated based on interim analyses of the data. Furthermore, it is worthwhile to consider ways to increase the power of a test without increasing the sample size. An additional point of attention is to have a good understanding of your dependent variable, especially its standard deviation. Finally, sample size justification is just as important in qualitative studies, and although there has been much less work on sample size justification in this domain, some proposals exist that researchers can use to design an informative study. Each of these topics is discussed in turn.

Compromise Power Analysis

In a compromise power analysis the sample size and the effect size are fixed, and the error rates of the test are calculated based on a desired ratio between the Type I and Type II error rate. A compromise power analysis is useful both when a very large number of observations will be collected and when only a small number of observations can be collected.

In the first situation a researcher might be fortunate enough to be able to collect so many observations that the statistical power for a test is very high for all effect sizes that are deemed interesting. For example, imagine a researcher has access to 2000 employees (1000 in each of two conditions) who are all required to answer questions during a yearly evaluation in a company that is testing an intervention that should reduce subjectively reported stress levels. The researcher is quite confident that an effect smaller than d = 0.2 is not large enough to be subjectively noticeable for individuals (Jaeschke et al., 1989). With an alpha level of 0.05 the researcher would have a statistical power of 0.994, or a Type II error rate of 0.006. This means that for a smallest effect size of interest of d = 0.2 the researcher is 8.30 times more likely to make a Type I error than a Type II error.
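These numbers can be checked with a short base R calculation; the group size of 1000 per condition is the assumption stated above.

```r
# Power and the resulting Type I / Type II error ratio for n = 1000 per group,
# smallest effect size of interest d = 0.2, alpha = 0.05.
pow <- power.t.test(n = 1000, delta = 0.2, sd = 1, sig.level = 0.05)$power
pow               # approximately 0.994
0.05 / (1 - pow)  # a Type I error is roughly 8.3 times as likely as a Type II error
```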

Although the original idea of designing studies that control Type I and Type II error rates was that researchers would need to justify their error rates (Neyman & Pearson, 1933), a common heuristic is to set the Type I error rate to 0.05 and the Type II error rate to 0.20, which makes a Type I error four times less likely than a Type II error. The default use of 80% power (or a 20% Type II or β error) is based on a personal preference of Cohen (1988), who writes:

It is proposed here as a convention that, when the investigator has no other basis for setting the desired power value, the value .80 be used. This means that β is set at .20. This arbitrary but reasonable value is offered for several reasons (Cohen, 1965, pp. 98-99). The chief among them takes into consideration the implicit convention for α of .05. The β of .20 is chosen with the idea that the general relative seriousness of these two kinds of errors is of the order of .20/.05, i.e., that Type I errors are of the order of four times as serious as Type II errors. This .80 desired power convention is offered with the hope that it will be ignored whenever an investigator can find a basis in his substantive concerns in his specific research investigation to choose a value ad hoc.

We see that conventions are built on conventions: the norm to aim for 80% power is built on the norm to set the alpha level at 5%. What we should take away from Cohen is not that we should aim for 80% power, but that we should justify our error rates based on the relative seriousness of each error. This is where compromise power analysis comes in. If you share Cohen’s belief that a Type I error is 4 times as serious as a Type II error, and building on our earlier study on 2000 employees, it makes sense to adjust the Type I error rate when the Type II error rate is low for all effect sizes of interest (Cascio & Zedeck, 1983) . Indeed, Erdfelder, Faul, and Buchner (1996) created the G*Power software in part to give researchers a tool to perform compromise power analysis.

Figure 10 illustrates how a compromise power analysis is performed in G*Power when a Type I error is deemed to be equally costly as a Type II error, which for a study with 1000 observations per condition would lead to a Type I and a Type II error rate of 0.0179. As Faul, Erdfelder, Lang, and Buchner (2007) write:

Of course, compromise power analyses can easily result in unconventional significance levels greater than α = .05 (in the case of small samples or effect sizes) or less than α = .001 (in the case of large samples or effect sizes). However, we believe that the benefit of balanced Type I and Type II error risks often offsets the costs of violating significance level conventions.

[Figure 10: Screenshot of a compromise power analysis in G*Power with Type I and Type II errors weighted equally, for 1000 observations per condition.]

This brings us to the second situation where a compromise power analysis can be useful: when we know the statistical power in our study is low. Although it is highly undesirable to make decisions when error rates are high, one sometimes finds oneself in a situation where a decision must be made based on little information. For such cases, Winer (1962) writes:

The frequent use of the .05 and .01 levels of significance is a matter of convention having little scientific or logical basis. When the power of tests is likely to be low under these levels of significance, and when Type I and Type II errors are of approximately equal importance, the .30 and .20 levels of significance may be more appropriate than the .05 and .01 levels.

For example, if we plan to perform a two-sided t test, can feasibly collect at most 50 observations in each independent group, and expect a population effect size of 0.5, we would have 70% power if we set our alpha level to 0.05. We can choose to weigh both types of error equally, and set the alpha level to 0.149, to end up with a statistical power for an effect of d = 0.5 of 0.851 (given a 0.149 Type II error rate). The choice of α and β in a compromise power analysis can be extended to take prior probabilities of the null and alternative hypothesis into account (Maier & Lakens, 2022; Miller & Ulrich, 2019; Murphy et al., 2014) .
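A compromise power analysis of this kind can be sketched in base R by searching for the alpha level at which the Type I and Type II error rates are equal; the numbers below are the ones from the example (50 per group, d = 0.5), the lower tail of the two-sided test is ignored as negligible, and small differences from G*Power are to be expected.

```r
# Compromise power analysis: choose alpha such that alpha == beta,
# for n = 50 per group and an expected effect of d = 0.5.
n <- 50; d <- 0.5
df <- 2 * n - 2
ncp <- d * sqrt(n / 2)  # noncentrality parameter of the t test

# beta(alpha) - alpha; the lower rejection region is ignored (negligible here).
f <- function(alpha) pt(qt(1 - alpha / 2, df), df, ncp) - alpha

alpha_balanced <- uniroot(f, c(0.001, 0.5))$root
alpha_balanced                                   # approximately 0.15, close to the reported 0.149
1 - pt(qt(1 - alpha_balanced / 2, df), df, ncp)  # power, approximately 0.85
```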

A compromise power analysis requires a researcher to specify the sample size. This sample size itself requires a justification, so a compromise power analysis will typically be performed together with a resource constraint justification for the sample size. It is especially important to perform a compromise power analysis if your resource constraint justification is strongly based on the need to make a decision, in which case a researcher should think carefully about the Type I and Type II error rates stakeholders are willing to accept. However, a compromise power analysis also makes sense if the sample size is very large, but a researcher did not have the freedom to set the sample size. This might happen if, for example, data collection is part of a larger international study and the sample size is based on other research questions. In designs where the Type II error rate is very small (and power is very high), some statisticians have also recommended lowering the alpha level to prevent Lindley's paradox, a situation where a significant effect (p < α) is evidence for the null hypothesis (Good, 1992; Jeffreys, 1939). Lowering the alpha level as a function of the statistical power of the test can prevent this paradox, providing another argument for a compromise power analysis when sample sizes are large (Maier & Lakens, 2022). Finally, a compromise power analysis needs a justification for the effect size, either based on a smallest effect size of interest or an effect size that is expected. Table 7 lists three aspects that should be discussed alongside a reported compromise power analysis.

What to Do if Your Editor Asks for Post-hoc Power?

Post-hoc, retrospective, or observed power refers to the statistical power of a test computed under the assumption that the effect size estimated from the collected data is the true effect size (Lenth, 2007; Zumbo & Hubley, 1998). A post-hoc power analysis is therefore not performed before looking at the data, based on effect sizes that are deemed interesting, as in an a-priori power analysis, and it is unlike a sensitivity power analysis, where a range of interesting effect sizes is evaluated. Because a post-hoc or retrospective power analysis is based on the effect size observed in the data that has been collected, it does not add any information beyond the reported p value; it merely presents the same information in a different way. Despite this fact, editors and reviewers often ask authors to perform a post-hoc power analysis to interpret non-significant results. This is not a sensible request, and whenever it is made, you should not comply with it. Instead, you should perform a sensitivity power analysis, and discuss the power for the smallest effect size of interest and for a realistic range of expected effect sizes.

Post-hoc power is directly related to the p value of the statistical test (Hoenig & Heisey, 2001). For a z test where the p value is exactly 0.05, post-hoc power is always 50%. The reason for this relationship is that when the observed p value equals the alpha level of the test (e.g., 0.05), the observed z score is exactly equal to the critical value of the test (e.g., z = 1.96 in a two-sided test with a 5% alpha level). Whenever the alternative hypothesis is centered on the critical value, half the values we expect to observe if this alternative hypothesis is true fall below the critical value, and half fall above it. Therefore, a test where we observed a p value identical to the alpha level will have exactly 50% power in a post-hoc power analysis, as the analysis assumes the observed effect size is true.
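A two-line base R check of this point for the z test (ignoring the negligible lower rejection region of the two-sided test):

```r
# Observed power for a z test when the observed p value equals alpha = .05.
z_obs  <- qnorm(1 - 0.05 / 2)  # observed z implied by p = .05 (two-sided)
z_crit <- qnorm(1 - 0.05 / 2)  # critical z at alpha = .05
1 - pnorm(z_crit, mean = z_obs)  # exactly 0.5
```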

For other statistical tests, where the alternative distribution is not symmetric (such as for the t test, where the alternative hypothesis follows a non-central t distribution, see Figure 7 ), a p = 0.05 does not directly translate to an observed power of 50%, but by plotting post-hoc power against the observed p value we see that the two statistics are always directly related. As Figure 11 shows, if the p value is non-significant (i.e., larger than 0.05) the observed power will be less than approximately 50% in a t test. Lenth (2007) explains how observed power is also completely determined by the observed p value for F tests, although the statement that a non-significant p value implies a power less than 50% no longer holds.

[Figure 11: Observed ('post-hoc') power plotted against the observed p value for a t test.]
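The relation shown in Figure 11 can be approximated with the base R sketch below for an independent t test with 15 participants per group; these numbers are illustrative, and the lower rejection region of the two-sided test is ignored as negligible.

```r
# Observed ('post-hoc') power as a function of the observed p value,
# two-sided independent t test with 15 participants per group.
n <- 15; df <- 2 * n - 2
t_crit <- qt(0.975, df)

p_values <- seq(0.001, 0.999, by = 0.001)
observed_power <- sapply(p_values, function(p) {
  t_obs <- qt(1 - p / 2, df)       # |t| implied by the observed p value
  1 - pt(t_crit, df, ncp = t_obs)  # power assuming the observed effect is true
})

plot(p_values, observed_power, type = "l",
     xlab = "Observed p value", ylab = "Observed power")
abline(v = 0.05, h = 0.5, lty = 2)
```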

When editors or reviewers ask researchers to report post-hoc power analyses, they would like to be able to distinguish between true negatives (concluding there is no effect when there is no effect) and false negatives (a Type II error: concluding there is no effect when there actually is an effect). Since post-hoc power is just a different way of presenting the p value, reporting it will not answer the question editors are asking (Hoenig & Heisey, 2001; Lenth, 2007; Schulz & Grimes, 2005; Yuan & Maxwell, 2005). To be able to draw conclusions about the absence of a meaningful effect, one should perform an equivalence test, and design a study with high power to reject the smallest effect size of interest (Lakens, Scheel, et al., 2018). Alternatively, if no smallest effect size of interest was specified when designing the study, researchers can report a sensitivity power analysis.

Sequential Analyses

Whenever the sample size is justified based on an a-priori power analysis it can be very efficient to collect data in a sequential design. Sequential designs control error rates across multiple looks at the data (e.g., after 50, 100, and 150 observations have been collected) and can reduce the average expected sample size that is collected compared to a fixed design where data is only analyzed after the maximum sample size is collected (Proschan et al., 2006; Wassmer & Brannath, 2016) . Sequential designs have a long history (Dodge & Romig, 1929) , and exist in many variations, such as the Sequential Probability Ratio Test (Wald, 1945) , combining independent statistical tests (Westberg, 1985) , group sequential designs (Jennison & Turnbull, 2000) , sequential Bayes factors (Schönbrodt et al., 2017) , and safe testing (Grünwald et al., 2019) . Of these approaches, the Sequential Probability Ratio Test is most efficient if data can be analyzed after every observation (Schnuerch & Erdfelder, 2020) . Group sequential designs, where data is collected in batches, provide more flexibility in data collection, error control, and corrections for effect size estimates (Wassmer & Brannath, 2016) . Safe tests provide optimal flexibility if there are dependencies between observations (ter Schure & Grünwald, 2019) .

Sequential designs are especially useful when there is considerable uncertainty about the effect size, or when it is plausible that the true effect size is larger than the smallest effect size of interest the study is designed to detect (Lakens, 2014). In such situations data collection can be terminated early if the effect size is larger than the smallest effect size of interest, but it can continue to the maximum sample size if needed. Sequential designs can prevent waste when testing hypotheses, both by stopping early when the null hypothesis can be rejected and by stopping early when the presence of a smallest effect size of interest can be rejected (i.e., stopping for futility). Group sequential designs are currently the most widely used approach to sequential analyses, and can be planned and analyzed using rpact (Wassmer & Pahlke, 2019) or gsDesign (K. M. Anderson, 2014). 6

Increasing Power Without Increasing the Sample Size

The most straightforward approach to increasing the informational value of a study is to increase the sample size. Because resources are often limited, it is also worthwhile to explore ways to increase the power of a test without increasing the sample size. The first option is to use directional tests where relevant. Researchers often make directional predictions, such as ‘we predict X is larger than Y’. The statistical test that logically follows from this prediction is a directional (or one-sided) t test. A directional test moves the Type I error rate to one tail of the distribution, which lowers the critical value, and therefore requires fewer observations to achieve the same statistical power.
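The gain can be made concrete with the pwr package; the effect size and power below are illustrative assumptions.

```r
# Required sample size per group for a two-sided versus a directional t test,
# assuming d = 0.5, alpha = .05, and 80% power.
library(pwr)
pwr.t.test(d = 0.5, power = 0.80, sig.level = 0.05, alternative = "two.sided")$n  # ~64 per group
pwr.t.test(d = 0.5, power = 0.80, sig.level = 0.05, alternative = "greater")$n    # ~50 per group (round up to 51)
```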

Although there is some discussion about when directional tests are appropriate, they are perfectly defensible from a Neyman-Pearson perspective on hypothesis testing (Cho & Abe, 2013), which makes a (preregistered) directional test a straightforward approach to increasing both the power of a test and the riskiness of the prediction. However, there might be situations where you do not want to ask a directional question. Sometimes, especially in research with applied consequences, it might be important to examine whether a null effect can be rejected, even if the effect is in the opposite direction from the one predicted. For example, when you are evaluating a recently introduced educational intervention, and you predict the intervention will increase the performance of students, you might want to explore the possibility that students perform worse, to be able to recommend abandoning the new intervention. In such cases it is also possible to distribute the error rate in a ‘lop-sided’ manner, for example assigning a stricter error rate to effects in the negative than in the positive direction (Rice & Gaines, 1994).

Another approach to increasing power without increasing the sample size is to increase the alpha level of the test, as explained in the section on compromise power analysis. Obviously, this comes at the cost of an increased probability of making a Type I error. The risk of making either type of error should be carefully weighed, which typically requires taking into account the prior probability that the null hypothesis is true (Cascio & Zedeck, 1983; Miller & Ulrich, 2019; Mudge et al., 2012; Murphy et al., 2014). If you have to make a decision, or want to make a claim, and the data you can feasibly collect are limited, increasing the alpha level can be justified, either based on a compromise power analysis or on a cost-benefit analysis (Baguley, 2004; Field et al., 2004).

Another widely recommended approach to increasing the power of a study is to use a within-participants design where possible. In almost all cases where a researcher is interested in detecting a difference between groups, a within-participants design will require fewer participants than a between-participants design. The reason for this decrease in sample size is captured by the equation below from Maxwell, Delaney, and Kelley (2017). The number of participants needed in a two-group within-participants design (N_W), relative to the number of participants needed in a two-group between-participants design (N_B), assuming normal distributions, is:

$$N_W = \frac{N_B (1 - \rho)}{2}$$

The required number of participants is divided by two because in a within-participants design with two conditions every participant provides two data points. The extent to which this reduces the sample size compared to a between-participants design also depends on the correlation between the dependent variables (e.g., the correlation between the measure collected in a control task and in an experimental task), as indicated by the (1 − ρ) part of the equation. If the correlation is 0, a within-participants design simply needs half as many participants as a between-participants design (e.g., 64 instead of 128 participants). The higher the correlation, the larger the relative benefit of within-participants designs, and when the correlation is negative the relative benefit diminishes, disappearing entirely at ρ = -1. Especially when dependent variables in within-participants designs are positively correlated, within-participants designs will greatly increase the power you can achieve given the sample size you have available. Use within-participants designs when possible, but weigh the benefits of higher power against the downsides of order effects or carryover effects that might be problematic in a within-participants design (Maxwell et al., 2017). 7 For designs with multiple factors with multiple levels it can be difficult to specify the full correlation matrix that specifies the expected population correlation for each pair of measurements (Lakens & Caldwell, 2021). In these cases sequential analyses might provide a solution.
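A short base R illustration of this relation follows; the effect size, power, and correlations are illustrative assumptions, and the between-participants sample size is computed with the pwr package.

```r
# Total sample size for a within-participants design, N_W = N_B * (1 - rho) / 2,
# relative to a between-participants design, for an assumed d = 0.5 and 80% power.
library(pwr)
n_between <- ceiling(pwr.t.test(d = 0.5, power = 0.80, sig.level = 0.05)$n)  # ~64 per group
N_B <- 2 * n_between                                                         # ~128 in total

for (rho in c(0, 0.5, 0.8)) {
  N_W <- ceiling(N_B * (1 - rho) / 2)  # total participants needed within-participants
  cat("rho =", rho, "-> within-participants N =", N_W, "\n")
}
```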

In general, the smaller the variation, the larger the standardized effect size (because we are dividing the raw effect by a smaller standard deviation) and thus the higher the power given the same number of observations. Some additional recommendations are provided in the literature (Allison et al., 1997; Bausell & Li, 2002; Hallahan & Rosenthal, 1996) , such as:

Use more effective screening procedures in studies where participants need to be screened before participation.

Assign participants unequally to conditions (if data in the control condition is much cheaper to collect than data in the experimental condition, for example).

Use reliable measures that have low error variance (Williams et al., 1995) .

Make smart use of preregistered covariates (Meyvis & Van Osselaer, 2018).

It is important to consider whether these ways of reducing the variation in the data come at too large a cost for external validity. For example, in an intention-to-treat analysis in randomized controlled trials, participants who do not comply with the protocol are retained in the analysis, so that the effect size from the study accurately represents the effect of implementing the intervention in the population, and not the effect of the intervention only on those people who perfectly follow the protocol (Gupta, 2011). Similar trade-offs between reducing the variance and external validity exist in other research areas.

Know Your Measure

Although it is convenient to talk about standardized effect sizes, it is generally preferable if researchers can interpret effects in the raw (unstandardized) scores, and have knowledge about the standard deviation of their measures (Baguley, 2009; Lenth, 2001) . To make it possible for a research community to have realistic expectations about the standard deviation of measures they collect, it is beneficial if researchers within a research area use the same validated measures. This provides a reliable knowledge base that makes it easier to plan for a desired accuracy, and to use a smallest effect size of interest on the unstandardized scale in an a-priori power analysis.

In addition to knowledge about the standard deviation, it is important to have knowledge about the correlations between dependent variables (for example, because Cohen's dz for a dependent t test depends on the correlation between the two measurements). The more complex the model, the more aspects of the data-generating process need to be known to make predictions. For example, in hierarchical models researchers need knowledge about variance components to be able to perform a power analysis (DeBruine & Barr, 2019; Westfall et al., 2014). Finally, it is important to know the reliability of your measure (Parsons et al., 2019), especially when relying on an effect size from a published study that used a measure with a different reliability, or when the same measure is used in different populations, in which case it is possible that measurement reliability differs between populations. With the increasing availability of open data, it will hopefully become easier to estimate these parameters using data from earlier studies.

If we calculate a standard deviation from a sample, this value is an estimate of the true value in the population. In small samples our estimate can be quite far off, while, due to the law of large numbers, the standard deviation is estimated more accurately as the sample size increases. Since the sample standard deviation is an estimate with uncertainty, we can calculate a confidence interval around the estimate (Smithson, 2003), and design pilot studies that will yield a sufficiently reliable estimate of the standard deviation. The confidence interval for the variance $\sigma^2$ is given by the formula below, and the confidence interval for the standard deviation is the square root of these limits:

$$\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}} \leq \sigma^2 \leq \frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}$$
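In base R this interval can be computed with the chi-square quantile function; the sample size and standard deviation below are made-up numbers for illustration.

```r
# 95% CI around a sample standard deviation, based on the chi-square distribution
# of (n - 1) * s^2 / sigma^2. Assumed numbers: s = 1.2 estimated from n = 30.
n <- 30; s <- 1.2; alpha <- 0.05
ci_variance <- c((n - 1) * s^2 / qchisq(1 - alpha / 2, df = n - 1),
                 (n - 1) * s^2 / qchisq(alpha / 2, df = n - 1))
sqrt(ci_variance)  # approximately 0.96 to 1.61
```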

Whenever there is uncertainty about parameters, researchers can use sequential designs to perform an internal pilot study   (Wittes & Brittain, 1990) . The idea behind an internal pilot study is that researchers specify a tentative sample size for the study, perform an interim analysis, use the data from the internal pilot study to update parameters such as the variance of the measure, and finally update the final sample size that will be collected. As long as interim looks at the data are blinded (e.g., information about the conditions is not taken into account) the sample size can be adjusted based on an updated estimate of the variance without any practical consequences for the Type I error rate (Friede & Kieser, 2006; Proschan, 2005) . Therefore, if researchers are interested in designing an informative study where the Type I and Type II error rates are controlled, but they lack information about the standard deviation, an internal pilot study might be an attractive approach to consider (Chang, 2016) .

Conventions as Meta-Heuristics

Even when a researcher does not use a heuristic to directly determine the sample size in a study, there is an indirect way in which heuristics play a role in sample size justifications. Sample size justifications based on inferential goals such as a power analysis, accuracy, or a decision all require researchers to choose values for a desired Type I and Type II error rate, a desired accuracy, or a smallest effect size of interest. Although it is sometimes possible to justify these values as described above (e.g., based on a cost-benefit analysis), a solid justification of these values might require dedicated research lines. Performing such research will not always be possible, and these studies might themselves not be worth the costs (e.g., it might require fewer resources to perform a study with an alpha level that most peers would consider conservatively low than to collect all the data that would be required to determine the alpha level based on a cost-benefit analysis). In these situations, researchers might use values based on a convention.

When it comes to a desired width of a confidence interval, a desired power, or any other input values required to perform a sample size computation, it is important to transparently report the use of a heuristic or convention (for example by using the accompanying online Shiny app). A convention such as the use of a 5% Type I error rate and 80% power practically functions as a lower threshold of the minimum informational value peers are expected to accept without any justification (whereas with a justification, higher error rates can also be deemed acceptable by peers). It is important to realize that none of these values are set in stone. Journals are free to specify that they desire a higher informational value in their author guidelines (e.g., Nature Human Behaviour requires registered reports to be designed to achieve 95% statistical power, and my own department has required staff to submit ERB proposals where, whenever possible, the study was designed to achieve 90% power). Researchers who choose to design studies with a higher informational value than a conventional minimum should receive credit for doing so.

In the past some fields have changed conventions, such as the 5 sigma threshold now used in physics to declare a discovery instead of a 5% Type I error rate. In other fields such attempts have been unsuccessful (e.g., Johnson (2013) ). Improved conventions should be context dependent, and it seems sensible to establish them through consensus meetings (Mullan & Jacoby, 1985) . Consensus meetings are common in medical research, and have been used to decide upon a smallest effect size of interest (for an example, see Fried, Boers, and Baker (1993) ). In many research areas current conventions can be improved. For example, it seems peculiar to have a default alpha level of 5% both for single studies and for meta-analyses, and one could imagine a future where the default alpha level in meta-analyses is much lower than 5%. Hopefully, making the lack of an adequate justification for certain input values in specific situations more transparent will motivate fields to start a discussion about how to improve current conventions. The online Shiny app links to good examples of justifications where possible, and will continue to be updated as better justifications are developed in the future.

Sample Size Justification in Qualitative Research

A value-of-information perspective on sample size justification also applies to qualitative research. A sample size justification in qualitative research should be based on the consideration that the cost of collecting data from additional participants does not yield new information that is valuable enough given the inferential goals. One widely used application of this idea is known as saturation, which is indicated by the observation that new data replicate earlier observations without adding new information (Morse, 1995). For example, imagine we ask people why they have a pet. Interviews might reveal reasons that are grouped into categories, but after interviewing 20 people, no new categories emerge, at which point saturation has been reached. Alternative philosophies to qualitative research exist, and not all value planning for saturation. Regrettably, principled approaches to justify sample sizes have not been developed for these alternative philosophies (Marshall et al., 2013).

When sampling, the goal is often not to pick a representative sample, but a sample that contains a sufficiently diverse set of subjects such that saturation is reached efficiently. Fugard and Potts (2015) show how to move towards a more informed justification for the sample size in qualitative research based on 1) the number of codes that exist in the population (e.g., the number of reasons people have pets), 2) the probability that a code can be observed in a single information source (e.g., the probability that someone you interview will mention each possible reason for having a pet), and 3) the number of times you want to observe each code. They provide an R formula based on binomial probabilities to compute the required sample size to reach a desired probability of observing the codes.
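A sketch in the spirit of this approach is shown below; it is not the exact formula from Fugard and Potts (2015), only a simple binomial calculation under assumed numbers (a code prevalence of 0.2, the wish to observe each code at least twice, and a 95% target probability).

```r
# How many information sources are needed so that a code with prevalence p
# is observed at least k times with probability >= target?
p <- 0.2        # assumed probability that a single source mentions the code
k <- 2          # want to observe the code at least twice
target <- 0.95  # desired probability of reaching that threshold

n <- 1
while (1 - pbinom(k - 1, size = n, prob = p) < target) n <- n + 1
n  # smallest number of sources meeting the target (about 22 with these numbers)
```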

A more advanced approach is used by Rijnsoever (2017), who also explores the importance of different sampling strategies. In general, purposefully sampling information from sources you expect to yield novel information is much more efficient than random sampling, but this also requires a good overview of the expected codes and of the sub-populations in which each code can be observed. Sometimes it is possible to identify information sources that, when interviewed, would yield at least one new code (e.g., based on informal communication before an interview). A good sample size justification in qualitative research is based on 1) an identification of the populations, including any sub-populations, 2) an estimate of the number of codes in the (sub-)population, 3) the probability that a code is encountered in an information source, and 4) the sampling strategy that is used.

Providing a coherent sample size justification is an essential step in designing an informative study. There are multiple approaches to justifying the sample size in a study, depending on the goal of the data collection, the resources that are available, and the statistical approach that is used to analyze the data. An overarching principle in all these approaches is that researchers consider the value of the information they collect in relation to their inferential goals.

The process of justifying a sample size when designing a study should sometimes lead to the conclusion that it is not worthwhile to collect the data, because the study does not have sufficient informational value to justify the costs. There will be cases where it is unlikely there will ever be enough data to perform a meta-analysis (for example because of a lack of general interest in the topic), the information will not be used to make a decision or claim, and the statistical tests do not allow you to test a hypothesis with reasonable error rates or to estimate an effect size with sufficient accuracy. If there is no good justification to collect the maximum number of observations that one can feasibly collect, performing the study anyway is a waste of time and/or money (Brown, 1983; Button et al., 2013; S. D. Halpern et al., 2002) .

The awareness that sample sizes in past studies were often too small to meet any realistic inferential goals is growing among psychologists (Button et al., 2013; Fraley & Vazire, 2014; Lindsay, 2015; Sedlmeier & Gigerenzer, 1989) . As an increasing number of journals start to require sample size justifications, some researchers will realize they need to collect larger samples than they were used to. This means researchers will need to request more money for participant payment in grant proposals, or that researchers will need to increasingly collaborate (Moshontz et al., 2018) . If you believe your research question is important enough to be answered, but you are not able to answer the question with your current resources, one approach to consider is to organize a research collaboration with peers, and pursue an answer to this question collectively.

A sample size justification should not be seen as a hurdle that researchers need to pass before they can submit a grant, ethical review board proposal, or manuscript for publication. When a sample size is simply stated, instead of carefully justified, it can be difficult to evaluate whether the value of the information a researcher aims to collect outweighs the costs of data collection. Being able to report a solid sample size justification means a researcher knows what they want to learn from a study, and makes it possible to design a study that can provide an informative answer to a scientific question.

This work was funded by VIDI Grant 452-17-013 from the Netherlands Organisation for Scientific Research. I would like to thank Shilaan Alzahawi, José Biurrun, Aaron Caldwell, Gordon Feld, Yoav Kessler, Robin Kok, Maximilian Maier, Matan Mazor, Toni Saari, Andy Siddall, and Jesper Wulff for feedback on an earlier draft. A computationally reproducible version of this manuscript is available at https://github.com/Lakens/sample_size_justification. An interactive online form to complete a sample size justification implementing the recommendations in this manuscript can be found at https://shiny.ieis.tue.nl/sample_size_justification/.

I have no competing interests to declare.


1. The topic of power analysis for meta-analyses is outside the scope of this manuscript, but see Hedges and Pigott (2001) and Valentine, Pigott, and Rothstein (2010).

2. It is possible to argue we are still making an inference, even when the entire population is observed, because we have observed a metaphorical population from one of many possible worlds; see Spiegelhalter (2019).

3. Power analyses can be performed based on standardized effect sizes or effect sizes expressed on the original scale. It is important to know the standard deviation of the effect (see the ‘Know Your Measure’ section), but I find it slightly more convenient to talk about standardized effects in the context of sample size justifications.

4. These figures can be reproduced and adapted in an online Shiny app: http://shiny.ieis.tue.nl/d_p_power/.

5. Confidence intervals around effect sizes can be computed using the MOTE Shiny app: https://www.aggieerin.com/shiny-server/

6. Shiny apps are available for both rpact (https://rpact.shinyapps.io/public/) and gsDesign (https://gsdesign.shinyapps.io/prod/).

7. You can compare within- and between-participants designs in this Shiny app: http://shiny.ieis.tue.nl/within_between.


Knowledge as Justified True Belief

  • Original Research
  • Open access
  • Published: 19 February 2021
  • Volume 88, pages 531–549 (2023)

Job de Grefte (ORCID: orcid.org/0000-0003-2433-952X)


What is knowledge? In this paper I defend the claim that knowledge is justified true belief by arguing that, contrary to common belief, Gettier cases do not refute it. My defence will be of the anti-luck kind: I will argue (1) that Gettier cases necessarily involve veritic luck, and (2) that a plausible version of reliabilism excludes veritic luck. There is thus a prominent and plausible account of justification according to which Gettier cases do not feature justified beliefs, and therefore do not present counterexamples to the tripartite analysis. I defend the account of justification against objections, and contrast my defence of the tripartite analysis with similar ones from the literature. I close by considering some implications of this way of thinking about justification and knowledge.


1 Introduction

What is knowledge? In this paper I defend the claim that knowledge is justified true belief. This account is well-known as the ‘classical’ or ‘tripartite’ analysis of knowledge. Many epistemologists, however, regard the claim to be plainly false. Footnote 1 In this paper I aim to show that the tripartite analysis of knowledge should be given more credit than the current state of the debate affords it.

My defence will be indirect: I will argue that, on a plausible interpretation of the justification condition, Gettier cases do not present counterexamples to the tripartite analysis of knowledge. If successful, my argument shows that the tripartite analysis is more plausible than commonly supposed, not that it is beyond question.

The paper is structured as follows. In Sect.  2 , I show that Gettier cases necessarily involve a kind of luck known as veritic luck . In Sect.  3 , I provide a plausible interpretation of reliabilist justification that excludes veritic luck. In Sect.  4 , I defend this interpretation against objections. In Sect.  5 , I compare my defence of the tripartite analysis against alternatives from the literature. In Sect.  6 , I consider some implications of the proposed way of thinking about justification and knowledge.

2 Gettier Cases Involve Veritic Luck

In this section I shall argue that Gettier cases necessarily involve a particular kind of luck: veritic luck. Footnote 2

I will be working with a modal account of luck (MAL) (Pritchard 2005, 2014). According to this account, luck depends on the modal profile of an event: the distribution of possible worlds where the event does and does not occur. An event is a case of luck only if it occurs in the actual world but fails to occur in (enough) nearby possible worlds, where a world is ‘closer’ to the actual world the more similar it is to it (Pritchard 2005, p. 128; Sainsbury 1997, p. 913). Nearby worlds represent ‘easy’ possibilities, since not much would need to change to the actual world for the event occurring in a nearby world to occur. On this interpretation, nothing is more easily possible than what happens in the actual world, since no world is closer (more similar) to the actual world than the actual world itself. The account correctly classifies paradigm cases of luck, like winning the lottery and finding a treasure, because such events could have easily failed to occur, in the sense that not much would need to change to the actual world for one to fail to win the lottery or fail to find the treasure.

Luck is relative both to a set of agents and to a set of ‘initial conditions’. It is relative to agents because the same event may be lucky for one agent but not for another. The reason for this is that events need to be of some positive significance to some agent in order to be lucky: an avalanche at the South Pole, no matter how easily it could have failed to occur, is not a case of luck if no one cares. Footnote 3

Whether an event is a case of luck also depends on what we take to be its relevant initial conditions. For example, keeping fixed the complete state of the universe just prior to buying the ticket, my lottery win may well be fully determined, and therefore not a case of luck. However, fix only that I bought a random ticket, and under these conditions it will be easily possible for me to fail to win the lottery.

We arrive at the following definition of luck:

LUCK Event E is lucky for agent S under conditions I iff:

E is significant to S (or would be significant, were S to be availed of the relevant facts), and

E actually occurs, but could have easily failed to occur under conditions I.

Veritic luck is a special kind of luck. It attaches to beliefs that are true but produced in a way that could have easily produced a false belief instead.

VERITIC LUCK : A belief is veritically lucky if and only if it is a matter of luck that the method one used to form one’s belief produced a true belief. Footnote 4

Suppose that I form the belief that the number of stars is even on the basis of a simple guess. My belief may be true. If it is, then it is veritically lucky, because it is produced in a way that could easily have resulted in my forming the false belief that the number of stars is odd. Footnote 5 As is common in the literature, I will assume that the formation of a true belief is of at least some significance to the relevant agents involved, and that the relevant initial conditions for veritic luck include the agent’s method of belief formation. Footnote 6
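
On the same illustrative notation (again my own gloss, not a formula from the literature), VERITIC LUCK for a belief B formed by method M can be sketched as:

\[
\mathrm{VL}(B) \;\iff\; B \text{ is true at } w_{@} \;\wedge\; \frac{\lvert \{\, w \in N(w_{@}) : M \text{ yields a false belief at } w \,\} \rvert}{\lvert \{\, w \in N(w_{@}) : M \text{ is used at } w \,\} \rvert} \;\geq\; \theta
\]

where $N(w_{@})$ is the set of nearby worlds in which the relevant initial conditions (including the use of $M$) are held fixed. The guess about the number of stars satisfies the right-hand clause trivially: roughly half of the nearby guess-worlds yield the false belief.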

Why think that Gettier cases necessarily involve veritic luck? Consider one of Gettier’s own cases (somewhat abbreviated for ease of use):

DISJUNCTION : Smith has excellent evidence for the proposition that Jones owns a Ford, and forms the corresponding belief. From this proposition, Smith competently deduces the further proposition that either Jones owns a Ford, or Brown is in Barcelona, and again forms the corresponding belief. Smith has no evidence whatsoever that indicates that Brown is in fact in Barcelona, and so formulates the second disjunct quite at random. Now suppose that through some elaborate deception, all Smith’s evidence for believing that Jones owns a Ford is misleading, and Jones in fact does not own a Ford at all. Suppose further, however, that Brown is in Barcelona at the moment Smith forms his belief in the disjunction. His belief thus ends up being true.

It is widely accepted that in the above case Smith does not know that either Jones owns a Ford or Brown is in Barcelona. Note, however, that Smith’s belief-forming method could have easily produced a false belief. For example, Smith could have easily formed the false belief that either Jones owns a Ford or Brown is in London. Footnote 7 This Gettier case thus clearly involves veritic luck.

The above is just one example. Linda Zagzebski provides a general formula for generating Gettier cases (Zagzebski 1994 ). If we can show that, following this formula, one will be guaranteed to end up with a belief that is veritically lucky, this will suffice to show that all Gettier cases (at least of the standard sort covered by Zagzebski’s formula) involve veritic luck.

Zagzebski’s recipe is the following: take any non-factive epistemic condition you like and construct a case in which a given subject’s true belief satisfies it. Footnote 8 Then modify the case so that, by some accident, satisfying the epistemic condition does not lead the subject to form a true belief. Finally, make it so that, through a second stroke of luck (unconnected to the subject’s cognitive activity), the subject ends up with a true belief nonetheless (Zagzebski 1994, p. 66). In such cases, the subject will, according to Zagzebski, end up with a belief that satisfies the preferred conditions for knowledge but that still fails to qualify as knowledge. In short, the subject will end up with a Gettiered belief.

A belief is veritically lucky if one’s belief-forming method actually produced a true belief but could easily have produced a false belief instead. In Gettier cases, according to Zagzebski, “an accident of bad luck is cancelled out by an accident of good luck. The right goal is reached, but only by chance” (Zagzebski 1994). Here the right goal is the formation of a true belief. It is reached by chance because one’s method of belief formation is such that, in the case at hand, it does nothing to lead one to form a true belief. In combination with the second stroke of luck, this means that not much about the actual world would need to change for the method to produce a false belief instead. It follows that all Gettier cases, or at least those that can be constructed using Zagzebski’s recipe, feature veritic luck. Footnote 9

3 A Modal Interpretation of Reliabilism

In the previous section I argued that Gettier cases involve veritic luck. In this section and the next, I will defend the following claim:

JUSTIFICATION A belief is justified only if it is not veritically lucky.

The account is original in that anti-luck conditions are usually formulated as conditions on knowledge, rather than on justification (Littlejohn 2014 ; Pritchard 2005 ; Williamson 2009 ). Under the assumption that knowledge requires justification, our account will explain why there is such an anti-luck condition on knowledge. But depending on how these authors flesh out their notion of justification, our account may or may not be compatible with theirs. In any case, in this section I will argue specifically for an anti-luck condition on justification.

I will do so by first presenting a modal interpretation of Goldman’s famous reliabilist theory of justification, an interpretation on which no justified belief is veritically lucky. While I believe many of Goldman’s writings are compatible with such a reading of reliabilism, this is rarely noted, and the modal interpretation of reliabilism is not widely endorsed in the literature. I will therefore provide further support for this interpretation in the next section, by considering and defusing the main objections to it.

First, some preliminaries. The relevant kind of justification at issue in JUSTIFICATION is doxastic justification, a property of beliefs, rather than propositional justification, which is a property of propositions. Footnote 10 Further, the claim specifies a necessary condition for doxastic justification, not a sufficient one. It may very well be that there are other necessary conditions on doxastic justification besides the one proposed in this paper. It should finally be noted that whether a belief is veritically lucky depends on factors other than the believing agent’s mental states or reflectively accessible information, so that the concept of justification we are working with in this paper is externalist. Footnote 11

JUSTIFICATION is supported by one of the most prominent accounts of doxastic justification in the literature: Goldman’s process reliabilism ( 1979 , 1994 ). Footnote 12 While Goldman does not explicitly endorse the claim that justification excludes veritic luck in his writings, in this section I will argue that there is a plausible interpretation of his account that does.

Consider first Goldman’s reliabilist account of justification:

RELIABILISM S’s belief in p is justified iff it is caused (or causally sustained) by a reliable cognitive process, or a history of reliable processes. (Goldman 1994)

The general idea behind reliabilism is that a belief is justified if and only if it is caused by a process that reliably produces true beliefs. Thus, beliefs formed on the basis of perception under normal circumstances will come out as justified (as they should), because under normal circumstances perception reliably causes true beliefs. Conversely, beliefs formed on the basis of tea-leaf reading will not come out as justified (as they should not), because this process does not produce a high ratio of true to false beliefs.

There are different ways to understand the relevant truth/falsity ratio. First, we can understand it to concern only the actual operations of the process, or counterfactual operations as well. This gives us the difference between frequency and modal interpretations of reliabilism. On a frequency account, what matters is whether the process, in its actual operation, produces enough truth over falsity; on the modal interpretation, what matters is whether the process would produce enough truth over falsity, even if it never actually operates, or actually fails to produce enough truth over falsity.

We may further distinguish global from local reliability. A process is globally reliable if and only if it produces enough truth over falsity across all its actual or possible applications, whereas it is locally reliable if and only if it produces (or would produce) enough truth over falsity in situations similar enough to the actual case. Thus, ‘going by eyesight’ may be a globally reliable method, but it will not be a locally reliable one if one is currently in barn-façade county and forming beliefs about the presence of barns. In general, we presume, trusting one’s eyes produces a high ratio of true beliefs over false ones, but in barn-façade county looks are deceiving, and so in similar circumstances one would form many false beliefs in the same way.
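
The contrast can be put schematically as follows; the ratio notation and the neighbourhood $N(w_{@})$ are again my own shorthand for the informal definitions above.

\[
\mathrm{Rel}_{\mathrm{global}}(M) \;=\; \frac{\lvert \{\text{actual or possible applications of } M \text{ that yield a true belief}\} \rvert}{\lvert \{\text{actual or possible applications of } M\} \rvert},
\]
\[
\mathrm{Rel}_{\mathrm{local}}(M, w_{@}) \;=\; \frac{\lvert \{\, w \in N(w_{@}) : M \text{ yields a true belief at } w \,\} \rvert}{\lvert \{\, w \in N(w_{@}) : M \text{ is used at } w \,\} \rvert}.
\]

In barn-façade county, ‘going by eyesight’ may score high on the first measure while scoring low on the second, which is exactly the asymmetry the example trades on.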

Which of these notions is relevant for justification? According to Timothy Williamson, reliability should be understood in modal rather than frequency terms:

Reliability and unreliability, stability and instability, safety and danger, robustness and fragility are modal states. They concern what could easily have happened. They depend on what happens under small variations in the initial conditions. (Williamson 2000 )

In the epistemic context, there are good reasons for doing so. In particular, we would not want to say that a belief-forming method that is used only once is thereby either completely reliable or completely unreliable. Relatedly, if I consistently follow a version of the gambler’s fallacy and believe that the next number on a roulette wheel will be the one that has not come up for the longest run of spins, this method will not produce justified beliefs, even if, in the actual circumstances in which I apply it, it happens to produce mostly true beliefs. What matters for justification seems to be whether the method could easily have produced false beliefs, not whether it has actually done so.

We can find a similar modal interpretation of reliability in the work of Goldman, specifically a local modal account, when he speaks about the reliability required for knowledge:

… a cognitive mechanism or process is reliable if it not only produces true beliefs in actual situations, but would produce true beliefs, or at least inhibit false beliefs, in relevant counterfactual situations. (Goldman 1976)

The reliability theories [of knowledge] presented above focus on modal reliability, on getting truth and avoiding error in possible worlds with specified relations to the actual one. They also focus on local reliability, that is, truth-acquisition in scenarios linked to the specific scenario in question as opposed to truth-getting by a process or method over a wide range of cases. (Goldman and Beddor 2016)

At first sight, it is not clear whether the kind of reliability Goldman requires for knowledge is the same as the kind he requires for justification. For example, in Epistemology and Cognition, when he speaks explicitly about the reliability required for justification, Goldman again opts for a modal condition, but one that is more difficult to place on the global–local axis, since it makes the required reliability dependent on what happens in so-called ‘normal’ worlds—worlds that conform to our current beliefs about the world (1986, p. 107). Such ‘normic’ reliability conditions on justification receive support from recent defences by Jarrett Leplin and Martin Smith (Leplin 2009; Smith 2016).

Normic reliability resembles local reliability in that both depend on what happens in a restricted class of worlds rather than in all possible worlds. But it differs from local accounts of reliability in that it anchors the relevant set of worlds not to the actual world but to a class of ‘normal worlds’, where normal worlds are worlds compatible with our current beliefs about the world. Thus, if we are envatted brains, we may continue to believe as we do, and our beliefs would still be justified according to the normic reliability criterion (for the methods that produce them are reliable in worlds compatible with our current beliefs about the world). This is how normic reliabilists accommodate the intuition that the beliefs of BIVs are justified.

In this paper, I opt for a local rather than a normic conception of the kind of reliability required for justification. Normic accounts unduly prioritize the epistemic relevance of (our beliefs about) our current world. It is a guiding thought behind the present paper that methods that produce justified beliefs do so because they ensure a proper fit between our beliefs and the world. If the notion of reliability has any relevance in epistemology, it is to mark the fact that our methods are guides to truth. That some method is reliable in contexts in which it will never be used seems of little epistemic relevance. Normic reliability accounts predict that BIVs are justified in using our empirical belief-forming methods even if the relevant subject is envatted from the moment they are born to the moment they die, and these empirical methods never produce a single true belief. Ideally, we want a general analysis that has sensible conditions on knowledge and justification not just for us, but for creatures cognizing in vastly different epistemic contexts as well. It is hard to imagine why such creatures would accord any epistemic relevance to methods that are reliable at our world only. What their epistemologists would care about is reliability in their context, and so I think it is local reliability that ultimately matters for a general theory of justification.

In any case, Goldman abandoned his normic account in favor of a distinction between strong and weak justification (Goldman 1988 ). A belief is said to be strongly justified just in case it is produced by an (epistemically) adequate method, whereas it is said to be weakly justified just in case the believer is (epistemically) blameless in so believing. Since no method for which one is epistemically to blame is epistemically adequate, strong justification implies weak justification, but not the other way around, for adequate methods may require more than just blameless believing.

What more is required? Here Goldman is very explicit. For any belief-forming process, we should assess its “rightness [strong justification] in [world] W not simply by its performance in W, but by its performance in a set of worlds very close to W” (Goldman 1988 , p. 63). This clearly indicates that the reliability that Goldman thinks is required for strong justification is local modal reliability.

The same kind of reliability is not required for weak justification, however, as becomes clear from Goldman’s treatment of the Cartesian demon case (a variant of the envatted brain case discussed above): “The present version of reliabilism accommodates the intuition that demon-world believers have justified beliefs by granting that they have weakly justified beliefs” (Goldman 1988 , pp. 62, 63). Obviously, demon-world victims do not have beliefs that are produced by processes that perform well in their actual world as well as in a set of worlds close to the demon-world. This does not stop their beliefs from being weakly justified according to Goldman, so weak justification does not require local modal reliability.

Weak justification thus does not eliminate veritic luck. But with our definitions of veritic luck and local modal reliability in hand, it is easy to see that strong justification, as well as any account that requires local modal reliability, does entail the absence of veritic luck.

First, a locally modally reliable method is one that produces a high ratio of truth over falsity in situations similar to the actual case. Second, a belief is veritically lucky if and only if the method that produced it produced a true belief in the actual world but produces false beliefs in close possible worlds.

Now, it is natural to interpret the notion of ‘similar circumstances’ occurring in our definition of local modal reliability in terms of close possible worlds. After all, close possible worlds are defined as worlds that differ little from the actual world. Such an interpretation of reliabilist justification fits well with Goldman’s claims regarding the modal profile of strong justification provided above, as well as with his treatment of BIVs. Envatted subjects lack reliably formed beliefs because, in worlds close to their actual world, their methods produce false beliefs too often. We will thus continue under the assumption that the notions of ‘close possible worlds’ and ‘similar situations’, as they occur in the definitions of veritic luck and local modal reliability, share their extension.

Admittedly, it is unclear how ‘wide’ the class of worlds where the agent forms a false belief in the same way as she formed her true belief in the actual world must be for a belief to count as veritically lucky. Footnote 13 But similarly, it is unclear what counts as a similar situation, on a local modal reliabilist conception of justification.

To circumvent this worry, I will assume that reliability and veritic luck are both graded notions. By this I mean that our beliefs can be more or less reliably formed than other beliefs, without there being a sharp cut-off point between reliable and unreliable beliefs. The same holds for veritic luck: it is intuitively plausible that there is a continuum of veritic luck, on which beliefs can be more or less veritically lucky without there being a precise cut-off point where a veritically lucky belief becomes a non-veritically lucky one.

If this is true, then it follows that the higher the local modal reliability of a method is, the lower the degree of veritic luck will be that attaches to the beliefs produced by this method. In this sense, a local modal reliability condition behaves as an anti-veritic luck condition on justification. The more (locally modally) reliable your method, the less subject your beliefs are to veritic luck. In the extreme case, complete local modal reliability entails complete absence of veritic luck (in this case, there are no nearby possible worlds where one’s method produces a false belief).
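
The inverse relationship just described can be summarized as follows, on the assumption (mine, for illustration) that the degree of veritic luck attaching to a true belief is measured by the proportion of nearby worlds in which the method errs:

\[
\mathrm{VL\text{-}degree}(B) \;\approx\; 1 - \mathrm{Rel}_{\mathrm{local}}(M, w_{@}),
\]

so that $\mathrm{Rel}_{\mathrm{local}}(M, w_{@}) = 1$ entails $\mathrm{VL\text{-}degree}(B) = 0$: if the method yields no false belief in any nearby world, no veritic luck attaches to the belief it actually produces.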

A final point worth emphasizing in this section is that while RELIABILISM takes reliability to be both necessary and sufficient for justification, I will commit myself only to its necessity (that is why JUSTIFICATION does not feature a biconditional). There are several reasons for this; some will be outlined in the next section, and some in Sect. 6. But perhaps the most important point presently is that I want to show as clearly as possible what is required to evade Gettier cases, and an anti-veritic-luck condition on justification suffices for this purpose. Perhaps other conditions on justification are necessary, perhaps not. We will leave this question for another time.

Let us briefly recap. I have presented in this section a local modal interpretation of RELIABILISM supported by the writings of Alvin Goldman, and argued that it excludes veritic luck. This means that there is at least one prominent and plausible account of justification in the literature that satisfies JUSTIFICATION. I do not claim that the interpretation presented in this section is the only possible interpretation of RELIABILISM, nor that it is Goldman’s own interpretation, nor that RELIABILISM is the only plausible account that satisfies JUSTIFICATION. My aim in this paper is only to establish that there is a plausible interpretation of justification that allows for an anti-luck defence of the tripartite analysis of knowledge, not that this defence is possible for all accounts of justification. In the next section I will provide further support for JUSTIFICATION by defending it against objections.

4 Lotteries, Evil Demons and Other Objections

While I have argued in the previous section that there is at least one prominent and plausible account that satisfies JUSTIFICATION, such a condition on justification is not widely endorsed in the literature. In this section I will review and respond to some possible objections.

First, I suspect some will find an anti-veritic-luck condition too strong on the basis of how the account handles lottery cases, and will prefer a probabilistic interpretation of RELIABILISM instead. In lottery cases, purely on the basis of the long odds involved, you form the (true) belief that the lottery ticket you just bought will lose. Forming your belief in this way will result in error in some nearby worlds, since any of the tickets, including yours, could easily be drawn. If this much is admitted, then your belief is subject to at least some veritic luck, and our account seems to rule it unjustified. This may strike some as counterintuitive. After all, the probability that your ticket is drawn is extremely small, given a large enough lottery. On this basis, one may prefer a probabilistic conception of RELIABILISM, on which your belief is produced reliably just in case the probability of forming a false belief is small enough. In lottery cases, such a condition would be satisfied, which would allow proponents of such a reliabilism to say that lottery beliefs are justified.

In response, I would like to say the following. First, it is not as clear-cut as it may initially seem that lottery beliefs are justified. For example, some authors have argued for a knowledge norm on justification, and since it is universally accepted that we cannot know that our ticket will lose on the basis of the odds alone, these views entail that lottery beliefs are not justified (Sutton 2005). Others have adduced further externalist conditions on justification that seem to rule out lottery beliefs from counting as justified (Littlejohn 2014; Smith forthcoming, 2016). In denying justification to lottery beliefs, I would not be alone.

Of course, it is better to provide a principled reason for denying justification in lottery cases. Our account provides such a reason: lottery beliefs are produced by a method that could have easily produced a false belief, and such methods fail to provide justification. This requires that the notion of easy possibility is given a modal characterization, but such interpretations have been fruitfully applied in philosophy at least since Lewis’ analysis of counterfactuals (Lewis 1973 ).

If one is not convinced, our verdict can be made more palatable by noting again that justification is a matter of degree. While JUSTIFICATION entails that lottery beliefs are not completely justified, since there are some nearby possibilities of error, our account is compatible with the idea that such beliefs still enjoy a relatively high degree of justification, since there are only a few such nearby error-possibilities. All that JUSTIFICATION entails is that lottery beliefs are not completely justified, and the probabilistic reliabilist must reach the same verdict, since the probability of error, however small, is not zero. I conclude that lottery cases do not pose a serious threat to our account. Footnote 14
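
A rough calculation makes the point vivid. Suppose, for illustration only, that the degree of justification tracks the proportion of nearby non-error worlds, and that in a fair n-ticket lottery the nearby worlds differ merely in which ticket is drawn. Then:

\[
J(\text{`my ticket will lose'}) \;\approx\; 1 - \frac{1}{n}, \qquad \text{e.g. } n = 10^{6} \;\Rightarrow\; J \approx 0.999999 < 1.
\]

The belief is very well justified, but not completely justified, which is precisely the graded verdict defended in the text.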

Let us move on to the next set of objections, both derived from Chris Kelp ( 2017 ). Kelp provides an alternative competence-based version of RELIABILISM, where (roughly) a belief is justified if and only if it is formed by an ability to form true beliefs. By providing both necessary and sufficient conditions on justification, Kelp’s account of justification is more ambitious than the present view, which commits itself to a necessary condition only.

In his paper, Kelp discusses two problems that may threaten JUSTIFICATION. Footnote 15 The first concerns new evil demon cases. Accounts like ours deny justification in such cases, since they feature beliefs that, if true at all, are subject to substantial degrees of veritic luck. Kelp maintains that our verdict in these cases is implausible. In response: we have already argued, against normic accounts of reliabilism, that there are reasons to suppose that victims in such deception cases lack justified beliefs, so I am prepared to bite this bullet. Moreover, Kelp’s own account of justification falls prey to a generalized version of the new evil demon case. Let us see why. Kelp evades standard new evil demon cases because, according to him, such cases involve conditions C “highly unsuitable for your ability to form true beliefs about the time in the sense that using W [your way of forming beliefs] in C does not dispose you to form true beliefs” (Kelp 2017, p. 19). In standard new evil demon cases, however, W is grounded in normal conditions C’ (say, regular conditions as we take them to be on earth), in which exercising W does lead to true belief. So even in the demon case, you still form your beliefs by exercising an ability to form true beliefs, and it seems that Kelp can accommodate the intuition that victims of radical deception are justified in their beliefs.

The case may be generalized, however, such that you have been radically deceived since birth. In that case, your ability cannot be grounded in circumstances where you are disposed to form true beliefs (because you have never been able to form true beliefs about your environment), and Kelp would have to agree that the relevant beliefs are unjustified. This puts our accounts in the same boat in this respect: to the extent that BIVs and evil demon scenarios count against our account, the generalized scenario counts against Kelp’s account as well. As I have said, however, I think the best way to respond to such scenarios is to bite the bullet. Whether our belief-forming methods provide us with justified beliefs depends on whether they are reliable guides to truth, and our present ways of forming beliefs fail this criterion in radical deception cases.

The last objection I want to discuss concerns the kind of reliabilism I used in the previous section to support JUSTIFICATION. Kelp objects to standard process-reliabilist theories of justification on the grounds that their measure of reliability depends on truth–falsity ratios at worlds. An account based on competences is better, according to Kelp, because competences are defined relative to conditions (you may competently play the piano, but not underwater). Kelp provides an example in which people are generally unable to tell chanterelles from jack-o’-lanterns, a very similar-looking mushroom. Since chanterelles are edible but jack-o’-lanterns are not, people cannot reliably tell whether a mushroom with the relevant appearance is edible, so their beliefs about this will not be justified on standard reliabilist accounts. Now imagine a secluded island where only chanterelles grow. In this case, Kelp argues, we want to say that the islanders’ beliefs about the edibility of these mushrooms are justified. His account predicts that they are, since in so believing they exercise an ability to believe truly. Standard reliabilism does not seem to deliver this verdict, because even if in their local context the beliefs are reliably formed, the kind of reliability adduced by standard process-reliabilism is defined over the whole world, which means, given the above, that their method is unreliable.

I want to grant Kelp the point against the standard kind of process-reliabilism that he discusses. But as we have made clear above, the general reliabilist framework is flexible enough to accommodate measures of reliability other than the one discussed by Kelp. In particular, we have been working with a local modal conception of reliability, which concerns reliability in cases similar to the actual case. Take any belief about the edibility of one of the chanterelles formed by someone living on the island described above. Does this way of forming beliefs produce error in close possible worlds? Assuming that the person never leaves the island, it seems hard to deny that their method is locally modally reliable; quite a lot would have to change for this way of forming beliefs to produce a false belief here. Of course, that will change if one introduces jack-o’-lanterns into the case, but the more of those we stipulate there to be on the island, the weaker our inclination to attribute justification, just as local modal reliabilism predicts.

All in all, our defence of JUSTIFICATION stands up to the challenges discussed above. Lottery beliefs may be justified to a high degree but are not completely justified. Beliefs of radically deceived agents do not seem to be justified at all. And the final objection, once answered, underscores the plausibility of the local modal reliabilism used in the previous section to support JUSTIFICATION. The point of all this is to support the tripartite analysis of knowledge. In the next section, we compare our strategy to some recent alternatives.

5 Similar Strategies

I have argued that we can save the traditional analysis of knowledge against Gettier’s famous counterexamples if we properly understand what is required for justification. In different ways, Adrian Haddock and Mark Schroeder have argued for similar points (Haddock 2010; Schroeder 2015b). Footnote 16 In this section, I will compare my account to theirs and provide some reasons for preferring the present one.

Let us address these accounts in alphabetical order. According to Adrian Haddock, knowledge is justified true belief where the justification condition is factive (one cannot justifiably believe that p when p is false) and requires moreover that the fact that provides justification is known by the subject. Haddock restricts his discussion to the case of visual knowledge, in which case, he argues, the fact that provides justification is ‘that one sees that p ’. Since both knowledge and seeing are factive states, it is impossible to be justified in this sense without it being the case that p. Footnote 17

We need not delve into the details of Haddock’s account to note two main differences between it and the account presented in this paper. First, I do not consider justification to be a factive state in general. While complete justification may require the absence of false belief in nearby worlds, including the actual one, lesser degrees of justification do not, and are compatible with some false beliefs in nearby worlds, including the actual world. So, on the present account, we can believe with a high degree of justification without it being the case that our belief is true. Secondly, I do not require the kind of second-order knowledge that Haddock requires for a belief to be justified. I suspect such a requirement is too strong, since it would imply that children, animals, and probably even most adult humans lack much of the knowledge we think they have. Rarely do we form beliefs about what justifies our beliefs, and when we do, such beliefs may simply be wrong, as the literature on cognitive bias makes painfully clear. Yet, as long as the methods we use to form the relevant beliefs are reliable enough, in our specific sense of local modal reliability, the resulting beliefs may on our account amount to knowledge nonetheless. On our account, knowing things about the world is a matter of having proper epistemic access to that world, not of having proper second-order beliefs about the kind of access we enjoy.

Schroeder defines knowledge as belief for sufficient subjective and objective reason (Schroeder 2015b ). We will again focus on the case of visual knowledge. In cases of normal perception, you look at an object (say a dog) and form the belief that there is a dog over yonder. In such cases, according to Schroeder, your evidence is that you see that there is a dog over yonder . If you properly base your belief on this reason, it will count as subjectively sufficient (the fact that seeing is a factive state rationalizes your belief that there is a dog over yonder). If it is also true that you see that there is a dog over yonder (which for Schroeder means that there is no deception going on), then you also believe for objectively sufficient reason, and your belief may then amount to knowledge.

Schroeder is explicit in saying that doxastic justification for him requires believing for sufficient subjective reason only. This entails, among other things, that subjects’ beliefs in fake barn cases are doxastically justified on Schroeder’s account. Regarding a subject located in Fake Barn country, Schroeder says:

[S]he knows just in case the reasons for which she believes are both objectively and subjectively sufficient. And according to my take on the Kantian account [Schroeder’s account], her belief is doxastically rational just in case her reasons are subjectively sufficient. So given that the agent in the fake barn case believes rationally, the Kantian Account can deny that she knows only if it turns out that her reasons are not objectively sufficient. (Schroeder 2015a , p. 377)

Schroeder’s analysis of such cases thus seems to be one of doxastic justification but failure of knowledge, because the subject’s subjectively sufficient reason is not also objectively sufficient. As an attempt to save the tripartite analysis, this strategy fails: we have here a case of doxastically justified true belief that nevertheless fails to amount to knowledge. The account also clearly conflicts with our present proposal, since on our account subjects in fake barn country do not know because they are not justified. In such cases, one’s method may all too easily produce a false belief, such as when one is looking at a fake barn, and so one is not justified in believing that there is a barn over yonder even if one happens to be looking at the one real barn.

Fake barn cases are controversial, but I think intuitions to the effect that fake-barn beliefs are justified can be explained away by noting that the same way of forming our beliefs about barns in the distance is presumably, in most of the contexts where we find ourselves, a locally modally reliable method. We don’t usually go wrong by using this method, we assume. And so, we conclude that using the method provides justification tout court . However, the point of requiring local modal reliability for justification is that it is plausible that whether a belief-forming method provides justification may differ across contexts. A method fit for forming true beliefs in one environment may not be so helpful in others. And in fake barn cases, going by eyesight errs too easily for that method to provide justification.

Thus, the approaches of Haddock and Schroeder are substantially different from the present one, which, I have argued, is to be preferred.

6 Further Implications

So far, I have argued that Gettier cases necessarily involve veritic luck, and that a local modal interpretation of RELIABILISM entails that justified beliefs cannot be veritically lucky. Together, these claims entail that no Gettier case can involve justified beliefs, and thus, that they do not provide counterexamples to the tripartite account of knowledge. Footnote 18 I have defended the account against objections and alternative analyses of knowledge. In this section, I discuss some implications of the present anti-luck approach to justification.

First, Linda Zagzebski has famously argued that Gettier cases are inescapable, in the sense that no non-factive account of justification immune to Gettier cases can be formulated (Zagzebski 1994). Since she explicitly discusses reliabilist conditions on justification, her findings may seem to conflict with our claim that local modal reliabilism evades such cases. This impression would be incorrect, however, as Zagzebski is working with a probabilistic version of reliabilism, not a local modal one. Since I acknowledge that probabilistic versions of reliabilism will not suffice to rule out veritic luck, Zagzebski’s claim that reliabilism does not rule out veritic luck is, properly understood, compatible with what is argued here.

Zagzebski, however, further argues that no non-factive account of justification (where a factive account of justification is an account that entails that justified beliefs are true) is going to be immune to Gettier cases. Footnote 19 As I have argued above, our general account of justification is non-factive. Justification comes in degrees. One’s belief is completely justified if there are no nearby possible worlds where one forms a false belief. Since the actual world counts as near to itself, it is not possible to be completely justified when one believes falsely. So, complete justification is factive on the present picture. But one may be justified to a lesser degree. In that case, the higher the proportion of nearby possible worlds where one forms a false belief, the lower one’s degree of justification. Since lower degrees of justification allow for false belief in nearby worlds, including the actual world, our general account of justification is non-factive.
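
Using the same illustrative notation as before, the graded account and its limiting factive case can be put as follows (the formula is my gloss on the text, not a definition the paper officially adopts):

\[
J(B) \;=\; 1 - \frac{\lvert \{\, w \in N(w_{@}) : M \text{ yields a false belief at } w \,\} \rvert}{\lvert \{\, w \in N(w_{@}) : M \text{ is used at } w \,\} \rvert}, \qquad w_{@} \in N(w_{@}).
\]

If $B$ is false at $w_{@}$, the numerator is at least 1, so $J(B) < 1$: complete justification ($J(B) = 1$) is factive, while any lesser degree of justification is compatible with falsity and hence non-factive.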

It is a further question whether degrees of justification lower than complete justification suffice to rule out knowledge-undermining luck, and thus, in effect, whether knowledge requires complete justification on our account. I am inclined to think that knowledge may be compatible with minute amounts of veritic luck, given our frequent knowledge attributions. Officially, however, I will leave this as an open question. Knowledge may require complete justification (in which case the truth condition in the tripartite analysis is superfluous), or it may only require a lesser degree of justification (in which case the truth condition is required, and in which case our account provides a counterexample to Zagzebski’s claim). Footnote 20 Perhaps, as some have argued, the standards for knowledge depend on context, such that in some contexts, stronger justification is required than in others (Stanley 2005 ). Full discussion of this point will have to wait for another time; our account is flexible enough to handle any of the potential outcomes.

Let us return to our main line of argument. In this paper, I argued that there is a plausible account of justification where Gettier cases do not undermine the claim that knowledge is justified true belief. Why then, do so many epistemologists consider the tripartite analysis refuted?

The answer seems to be that, before Gettier, justification was generally given an internalist gloss. Footnote 21 Because such accounts tend to be compatible with justified beliefs that are produced by methods that could very easily have produced false beliefs (BIV beliefs, demon beliefs, etc.), they will not eliminate veritic luck. Footnote 22 It is for this reason that Ted Poston, for example, writes: “[s]tandard Gettier cases show that one can have internally adequate justification without knowledge” (2016, my emphasis). It is because our anti-luck condition is an externalist condition that it evades Gettier cases.

Interestingly, internalist justification is incompatible with a different kind of luck:

REFLECTIVE LUCK S’s belief that p is reflectively lucky if and only if, given the information reflectively accessible to S , it is a matter of luck that the method S used to form her belief that p produced a true belief.

Note the parallels between veritic and reflective luck. Whereas veritic luck requires the belief to be true but produced by a method that could easily have produced a false belief instead, reflective luck requires the same thing, but judged from the agent’s reflective perspective. Footnote 23 Examples of reflectively lucky beliefs include beliefs formed on the basis of simple guessing, the beliefs of Brandom’s famous chicken-sexers, and those of BonJour’s equally famous clairvoyant (BonJour 1980; Brandom 1998). It is important to note that beliefs can be reflectively lucky without being veritically lucky (as in the latter two examples). Even if the chicken-sexer cannot, from her own perspective, explain why she reliably forms true beliefs, her method is still locally modally reliable. Similarly, beliefs can be veritically lucky without being reflectively lucky, as when things look as if one’s method is (locally modally) reliable when in fact it is not.
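
The structural parallel can be brought out by modifying the earlier sketch of veritic luck: instead of the worlds near the actual world, reflective luck quantifies over the worlds left open by the agent’s reflectively accessible information. As before, the notation is my own illustrative shorthand.

\[
\mathrm{RL}(B) \;\iff\; B \text{ is true at } w_{@} \;\wedge\; \frac{\lvert \{\, w \in W_{R_{S}} : M \text{ yields a false belief at } w \,\} \rvert}{\lvert \{\, w \in W_{R_{S}} : M \text{ is used at } w \,\} \rvert} \;\geq\; \theta,
\]

where $W_{R_{S}}$ is the set of worlds compatible with the information reflectively accessible to $S$. The chicken-sexer’s belief is reflectively but not veritically lucky because $W_{R_{S}}$ contains many error worlds while $N(w_{@})$ does not; the reverse holds when a method merely looks reliable from the inside.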

With the distinction between reflective and veritic luck in hand, it is possible to draw an alternative lesson from Gettier. The primary target of internalist concepts of justification is the elimination of reflective, not veritic, luck. What Gettier showed us is that there is another kind of luck that prevents knowledge: veritic luck. Since Gettier’s paper predates the careful distinctions of anti-luck epistemology, this point remains entirely implicit in his paper. Footnote 24

Still, I submit this is the main lesson from Gettier. It explains why Gettier cases are seen to refute the tripartite analysis of knowledge: because traditional accounts of justification aim to eliminate reflective but not veritic luck, the conditions laid down by these accounts can be satisfied even in the presence of veritic luck, which opens the door to Gettier cases.

The main lesson from Gettier is not that knowledge is incompatible with luck simpliciter, but specifically that knowledge is incompatible with veritic luck. A next question is then what this means for the analysis of knowledge. The overwhelming majority opinion is that Gettier refuted the classical, tripartite account of knowledge. But our present findings open up the possibility for a different interpretation. As I have been arguing in this paper, a plausible reading of reliabilist justification requires the elimination of veritic luck. On this account, Gettier cases lose their teeth. Footnote 25

7 Conclusion

Let us conclude. In this paper I have argued that, by focussing on the relation between epistemic justification and luck, we can defend the traditional analysis of knowledge as justified true belief. Gettier cases are usually taken to refute any such attempt, but we have seen that all Gettier cases involve veritic luck, and that a plausible version of reliabilism about epistemic justification eliminates veritic luck. If this is so, then no belief in Gettier cases is epistemically justified, properly understood. That means that Gettier cases lose their teeth, and we can consistently maintain the claim that knowledge is justified true belief even in the light of the failure to know in Gettier cases.

Indeed, the claim that knowledge is not justified true belief is one of the few philosophical claims David Lewis took to be established conclusively: “Philosophical theories are never refuted conclusively. (Or hardly ever. Gödel and Gettier may have done it)” (Lewis 1983 , p. x).

Notes

Footnote 2: In other work, I investigate in some detail the nature of veritic luck (de Grefte 2018). Here, I will rest content with providing a brief overview of the main conclusions of that investigation.

Footnote 3: In recent work, Pritchard drops such a significance condition on luck (Pritchard 2014). See (de Grefte 2019) for discussion.

Footnote 4: The definition of veritic luck that I am working with in this paper is different from those proposed by Pritchard (Pritchard 2005, 2014) and Engel (1992). Reasons for my alternative formulation are given in full in (de Grefte 2018). Briefly put, the difference is that for Pritchard, in order to be veritically lucky, a belief must be produced by a method that could easily have produced that very same belief in circumstances where it is false. The reason we opt for the present requirement is that Pritchard’s formulation renders beliefs in necessary truths necessarily non-lucky. For an objection along these lines see (Hales 2016). For a similar modification of Pritchard’s account, see (Goldberg 2015, p. 274).

Footnote 5: Here I assume that a guess whether p or not-p can easily result in either the belief that p or the belief that not-p.

Footnote 6: The first of these assumptions is defended in (Whittington 2016). The second assumption is defended in (Pritchard 2005, Chapter 6).

Footnote 7: As is well known, it is difficult to specify adequate criteria for the individuation of methods of belief formation (e.g. Conee and Feldman 1998). I believe this problem, known as the ‘generality problem’, is an issue for any adequate theory of justification, and I will not attempt to solve it in this paper. Instead, I rely on an intuitive understanding of the methods involved in my examples.

Footnote 8: A non-factive condition on belief is a condition such that satisfying the condition does not entail the truth of the belief. Accounts of justification that feature only non-factive conditions on justification are called fallibilist accounts of justification.

Footnote 9: The claim that Gettier cases necessarily involve veritic luck is relatively uncontroversial (e.g. Engel 1992, p. 70; Pritchard 2005, p. 150). For some recent objections, see (Bernecker 2011; Hetherington 2011, Chapter 3).

Footnote 10: For more on the distinction, see (Kornblith 2017; Silva and Oliveira forthcoming; Turri 2010).

Footnote 11: We will come back to the relation between luck and the internalism/externalism debate in Sect. 6.

Footnote 12: I provide a briefer argument for this claim in my (de Grefte 2018).

Footnote 13: Pritchard does go into this issue when he talks about the related notion of safety (Pritchard 2005, Sects. 6.2–6.4).

Footnote 14: It is worthwhile to pause on the distinction between partial and complete justification. In this paper, I argue that the tripartite account of knowledge can be saved from Gettier-style counterexamples by positing an anti-luck condition on justification. As I have shown above, Gettier cases necessarily involve veritic luck. But luck comes in degrees, so our beliefs may be subject to more or less veritic luck. The degree of veritic luck present in Gettier cases is assumed to be high enough to destroy knowledge. But it is a further question, one not often explicitly dealt with in the literature, whether any degree of veritic luck is incompatible with knowledge. Lottery cases may be marshalled in support of the view that knowledge requires the absence of even low degrees of veritic luck: while the nearest possible world where one forms a false belief on the basis of the same method is close (just a few different numbers have to come up), the proportion of nearby worlds where one forms a false belief by employing the same method is arbitrarily small. Some veritic luck is involved, but not very much, it seems. The widespread intuition that lottery propositions are not known thus provides some evidence that knowledge is incompatible with even very small degrees of veritic luck. For the larger project of this paper this issue can be left undecided. To save the tripartite analysis, we only need to assume that knowledge is incompatible with the degree of veritic luck present in Gettier cases, and then argue that justification is incompatible with that same degree of veritic luck. This is what I have aimed to do in Sect. 3. It should be noted, however, that while our account is flexible enough to accommodate the thought that knowledge requires the complete absence of veritic luck, I am not committed to an account of justification that eliminates veritic luck completely. This issue will come up again in Sect. 6.

Footnote 15: Kelp discusses two other problems: clairvoyant cases and the generality problem. I will set the generality problem aside here, since it is not specific to the present account (Bishop 2010) and would in any case require much more discussion. Clairvoyant cases are irrelevant to the present discussion because they seem to contradict only the sufficiency of reliability for justification, a claim not endorsed in this paper.

Footnote 16: Schroeder does not intend to save the tripartite analysis, since his account of knowledge features a fourth condition (that the relevant belief is supported by sufficient objective reason). However, since he clearly aims to provide an analysis of knowledge, it is still worthwhile to compare his account to ours.

Footnote 17: Haddock’s account of justification is not the only factive account. Views that equate justification with knowledge, such as Sutton’s (2005), entail that justification is factive. Disjunctivist accounts of perceptual knowledge such as McDowell’s (2009) may also entail that justification (at least conceived of as the warrant required for knowledge) is factive. It is not possible here to compare my account to all alternatives. The decision to focus on the accounts of Haddock and Schroeder is motivated by the fact that both seem to be concerned explicitly with the analysis of knowledge. Since this is also my project here, the comparison between these strategies is especially relevant.

Footnote 18: This defence of the tripartite account is indirect because it concerns the removal of one of the main arguments against the tripartite account.

Footnote 19: Zagzebski’s claim is relatively easily refuted. Suppose one posits that the absence of veritic luck is both necessary and sufficient for justification. Such an account is non-factive and able to evade Gettier cases. The arguments for the latter part of this claim have been provided above. Further, such an account is not factive because a belief is veritically lucky only if it is both true and produced by a method that could easily have produced a false belief. False beliefs fail the first conjunct and so, on this account, cannot be veritically lucky; they therefore satisfy the proposed condition and count as justified despite being false. So, an account that posits that the absence of veritic luck is both necessary and sufficient for justification is non-factive and immune to Gettier cases.

Footnote 20: More recently, Zagzebski seems to admit as much when she discusses the lesson to be drawn from Gettier (Zagzebski 2017). While she thinks anti-luck approaches like the one from Howard-Snyder, Howard-Snyder and Feit (2003) are immune to Gettier cases, she thinks such accounts are ‘uninteresting’ and ‘ad-hoc’. Since these issues are not our central concern here, we will set them aside.

Footnote 21: For some examples, see the theories explicitly targeted by Gettier: those of Chisholm (1957, p. 16) and Ayer (1956, p. 34).

See my (de Grefte 2018 ).

Footnote 23: The notion of reflective luck is derived from Duncan Pritchard’s seminal work on epistemic luck (Pritchard 2005). Note that our account differs slightly from Pritchard’s, in the same way that our account of veritic luck differs from his version: the modification avoids the result that beliefs in necessarily true propositions are automatically immune from reflective luck.

Footnote 24: Note that I am not saying here that Gettier intended his cases to be read in this way. I am merely speculating that this is the best way to make sense of the cases, and the lesson to be drawn from them.

Footnote 25: One note in closing, however. In modifying their justification conditions, externalists usually propose conditions that do not require the elimination of reflective luck. But it is perfectly consistent to require that justification excludes both veritic and reflective luck. Prima facie, such an account of justification would seem to satisfy important externalist as well as internalist intuitions about justification. This theoretical possibility is often overlooked in the debate between internalists and externalists, perhaps because externalism is often formulated as the explicit denial of internalism. Ernest Sosa is among the few epistemologists who have long stressed the importance of both externalist and internalist justification, at least where the higher grades of knowledge are concerned (2009, 2010). Duncan Pritchard notes the compatibility but remains uncommitted toward such a hybrid account of justification (2005, Chapters 6, 7, 8). A hybrid approach also seems compatible with Goldman’s distinction between strong and weak justification (Goldman 1988).

References

Ayer, A. J. (1956). The problem of knowledge (Vol. 8). Harmondsworth: Penguin Books.


Bernecker, S. (2011). Keeping track of the Gettier problem. Pacific Philosophical Quarterly, 92 (2), 127–152.


Bishop, M. A. (2010). Why the generality problem is everybody’s problem. Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition, 151 (2), 285–298.

BonJour, L. (1980). Externalist theories of empirical knowledge. Midwest Studies in Philosophy, 5 (1), 53–73.

Brandom, R. (1998). Insights and blindspots of reliabilism. The Monist, 81 (3), 371–392.

Chisholm, R. M. (1957). Perceiving: A philosophical study (Vol. 9). Ithaca: Cornell University Press.

Conee, E., & Feldman, R. (1998). The generality problem for reliabilism. Philosophical Studies, 89 (1), 1–29.

de Grefte, J. (2018). Epistemic justification and epistemic luck. Synthese, 195 (9), 3821–3836.

de Grefte, J. (2019). Pritchard Versus Pritchard on Luck. Metaphilosophy, 50 (1–2), 3–15.

Engel, M. (1992). Is epistemic luck compatible with knowledge? The Southern Journal of Philosophy, 30 (2), 59–75.

Goldberg, S. C. (2015). Epistemic entitlement and luck. Philosophy and Phenomenological Research, 91 (2), 273–302.

Goldman, A. I. (1976). Discrimination and perceptual knowledge. The Journal of Philosophy, 73 (20), 771–791.

Goldman, A. I. (1979). What is justified belief? In G. S. Pappas (Ed.), Justification and knowledge (pp. 1–23). Dordrecht: D. Reidel Publishing Company.

Goldman, A. I. (1986). Epistemology and cognition . Cambridge, MA: Harvard University Press.

Goldman, A. I. (1988). Strong and weak justification. Philosophical Perspectives, 2, 51–69.

Goldman, A. I. (1994). Naturalistic epistemology and reliabilism. Midwest Studies in Philosophy, 19, 301–320.

Goldman, A. I., & Beddor, B. (2016). Reliabilist epistemology. In E. N. Zalta (Ed.), Stanford encyclopedia of philosophy (winter 201) . Stanford: Metaphysics Research Lab, Stanford University.

Haddock, A. (2010). Knowledge and justification. In The nature and value of knowledge. Oxford: Oxford University Press.

Hales, S. D. (2016). Why every theory of luck is wrong. Noûs, 50 (3), 490–508.

Hetherington, S. C. (2011). How to know: A practicalist conception of knowledge . Hoboken: Wiley.


Howard-Snyder, D., Howard-Snyder, F., & Feit, N. (2003). Infallibilism and Gettier’s legacy. Philosophy and Phenomenological Research, 66 (2), 304–327.

Kelp, C. (2017). How to be a reliabilist. Philosophy and Phenomenological Research, 98, 1–29.

Kornblith, H. (2017). Doxastic justification is fundamental. Philosophical Topics, 45 (1), 63–80.

Leplin, J. (2009). A theory of epistemic justification . Berlin: Springer.

Lewis, D. K. (1973). Counterfactuals . Cambridge, MA: Harvard University Press.

Lewis, D. K. (1983). Philosophical papers (Vol. 1). New York: Oxford University Press.

Littlejohn, C. (2014). Fake barns and false dilemmas. Episteme, 11 (4), 369–389.

McDowell, J. (2009). Selections from criteria, defeasibility, and knowledge. In A. Byrne & H. Logue (Eds.), Disjunctivism: Contemporary readings (pp. 75–91). Cambridge: MIT Press.

Poston, T. (2016). Internalism and externalism in epistemology. Philosophical Topics, 14 (1), 179–221.

Pritchard, D. (2005). Epistemic luck . New York, NY: Oxford University Press.

Pritchard, D. (2014). The modal account of luck. Metaphilosophy, 45 (4–5), 594–619.

Sainsbury, M. R. (1997). Easy possibilities. Philosophy and Phenomenological Research, 57 (4), 907–919.

Schroeder, M. (2015a). In defense of the Kantian account of knowledge: Reply to Whiting. Logos & Episteme, 6 (3), 371–382.

Schroeder, M. (2015b). Knowledge is belief for sufficient (objective and subjective) reason. In T. S. Gendler & J. Hawthorne (Eds.), Oxford studies in epistemology (Vol. 5, pp. 226–252). Oxford: Oxford University Press.


Silva, P., & Oliveira, L. R. G. (forthcoming). Propositional justification and doxastic justification. In M. Lasonen-Aarnio & C. M. Littlejohn (Eds.), Routledge handbook of the philosophy evidence. Abingdon: Routledge.

Smith, M. (2016). Between probability and certainty: What justifies belief . Oxford: Oxford University Press.

Smith, M. (forthcoming). Four arguments for denying that lottery beliefs are justified. In I. Douven (ed.), Lotteries, knowledge and rational belief: Essays on the lottery paradox. Cambridge: Cambridge University Press.

Sosa, E. (2009). Reflective knowledge: Apt belief and reflective knowledge (Vol. II). Oxford: Oxford University Press.

Sosa, E. (2010). Knowing full well. Princeton, NJ: Princeton University Press.

Stanley, J. (2005). Knowledge and practical interests . Oxford: Oxford University Press.

Sutton, J. (2005). Stick to what you know. Noûs, 39 (3), 359–396.

Turri, J. (2010). On the relationship between propositional and doxastic justification. Philosophy and Phenomenological Research, 80 (2), 312–326.

Whittington, L. J. (2016). Luck, knowledge and value. Synthese, 193 (6), 1615–1633.

Williamson, T. (2000). Knowledge and its limits . Oxford, NY: Oxford University Press.

Williamson, T. (2009). Probability and danger. The Amherst Lecture in Philosophy, 4, 1–35.

Zagzebski, L. (1994). The inescapability of Gettier problems. Philosophical Quarterly, 44 (174), 65–73.

Zagzebski, L. (2017). The lesson of Gettier. In Explaining knowledge. Oxford: Oxford University Press.


Acknowledgements

I would like to thank Alvin Goldman for helpful discussion of this material, as well as the audience at the OZSW Conference 2019 in Amsterdam, and two anonymous referees for this journal.

Author information

Authors and Affiliations

Faculty of Philosophy, University of Groningen, Oude Boteringestraat 52, 9712 GL, Groningen, The Netherlands

Job de Grefte

Corresponding author

Correspondence to Job de Grefte .

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

de Grefte, J. Knowledge as Justified True Belief. Erkenn 88 , 531–549 (2023). https://doi.org/10.1007/s10670-020-00365-7

Received : 02 June 2020

Accepted : 21 December 2020

Published : 19 February 2021

Issue Date : February 2023

DOI : https://doi.org/10.1007/s10670-020-00365-7

Prestigious cancer research institute has retracted 7 studies amid controversy over errors

Seven studies from researchers at the prestigious Dana-Farber Cancer Institute have been retracted over the last two months after a scientist blogger alleged that images used in them had been manipulated or duplicated.

The retractions are the latest development in a monthslong controversy around research at the Boston-based institute, which is a teaching affiliate of Harvard Medical School. 

The issue came to light after Sholto David, a microbiologist and volunteer science sleuth based in Wales, published a scathing post on his blog in January, alleging errors and manipulations of images across dozens of papers produced primarily by Dana-Farber researchers. The institute acknowledged errors and subsequently announced that it had requested the retraction of six studies and corrections to 31 more papers. Dana-Farber also said, however, that a review process for errors had been underway before David’s post.

Now, at least one more study has been retracted than Dana-Farber initially indicated, and David said he has discovered an additional 30 studies from authors affiliated with the institute that he believes contain errors or image manipulations and therefore deserve scrutiny.

The episode has imperiled the reputation of a major cancer research institute and raised questions about one high-profile researcher there, Kenneth Anderson, who is a senior author on six of the seven retracted studies. 

Anderson is a professor of medicine at Harvard Medical School and the director of the Jerome Lipper Multiple Myeloma Center at Dana-Farber. He did not respond to multiple emails or voicemails requesting comment. 

The retractions and new allegations add to a larger, ongoing debate in science about how to protect scientific integrity and reduce the incentives that could lead to misconduct or unintentional mistakes in research. 

The Dana-Farber Cancer Institute has moved relatively swiftly to seek retractions and corrections. 

“Dana-Farber is deeply committed to a culture of accountability and integrity, and as an academic research and clinical care organization we also prioritize transparency,” Dr. Barrett Rollins, the institute’s research integrity officer, said in a statement. “However, we are bound by federal regulations that apply to all academic medical centers funded by the National Institutes of Health among other federal agencies. Therefore, we cannot share details of internal review processes and will not comment on personnel issues.”

The retracted studies were originally published in two journals: one in the Journal of Immunology and six in Cancer Research. Six of the seven focused on multiple myeloma, a form of cancer that develops in plasma cells. Retraction notices indicate that Anderson agreed to the retractions of the papers he authored.

Elisabeth Bik, a microbiologist and longtime image sleuth, reviewed several of the papers’ retraction statements and scientific images for NBC News and said the errors were serious. 

“The ones I’m looking at all have duplicated elements in the photos, where the photo itself has been manipulated,” she said, adding that these elements were “signs of misconduct.” 

Dr. John Chute, who directs the division of hematology and cellular therapy at Cedars-Sinai Medical Center and has contributed to studies about multiple myeloma, said the papers were produced by pioneers in the field, including Anderson.

“These are people I admire and respect,” he said. “Those were all high-impact papers, meaning they’re highly read and highly cited. By definition, they have had a broad impact on the field.” 

Chute said he did not know the authors personally but had followed their work for a long time.

“Those investigators are some of the leading people in the field of myeloma research and they have paved the way in terms of understanding our biology of the disease,” he said. “The papers they publish lead to all kinds of additional work in that direction. People follow those leads and industry pays attention to that stuff and drug development follows.”

The retractions offer additional evidence for what some science sleuths have been saying for years: The more you look for errors or image manipulation, the more you might find, even at the top levels of science. 

Scientific images in papers are typically used to present evidence of an experiment’s results. Commonly, they show cells or mice; other types of images show key findings like western blots — a laboratory method that identifies proteins — or bands of separated DNA molecules in gels. 

Science sleuths sometimes examine these images for irregular patterns that could indicate errors, duplications or manipulations. Some artificial intelligence companies are training computers to spot these kinds of problems, as well. 

Duplicated images could be a sign of sloppy lab work or data practices. Manipulated images — in which a researcher has modified an image heavily with photo editing tools — could indicate that images have been exaggerated, enhanced or altered in an unethical way that could change how other scientists interpret a study’s findings or scientific meaning. 
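As a rough illustration of how automated screening for duplicated figure panels can work, here is a hypothetical sketch; it is not the tooling used by Bik, David, or any company mentioned above. It shrinks two images into tiny grayscale "average hashes" and flags pairs whose hashes are nearly identical, which is a prompt for human review rather than evidence of misconduct. The file names and the distance threshold are placeholders.

```python
# Hypothetical sketch: flag near-duplicate figure panels with a simple
# perceptual "average hash". Real image-integrity tools are far more
# sophisticated; this only illustrates the comparison idea.
import numpy as np
from PIL import Image

def average_hash(path, hash_size=8):
    """Downscale to hash_size x hash_size grayscale and threshold at the mean."""
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = np.asarray(img, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()

def hamming_distance(h1, h2):
    return int(np.count_nonzero(h1 != h2))

# Placeholder file names; a small Hamming distance between panels that should
# show independent experiments is only a cue for closer manual inspection.
if __name__ == "__main__":
    a = average_hash("figure2_panel_a.png")
    b = average_hash("figure5_panel_c.png")
    if hamming_distance(a, b) <= 5:
        print("Panels look nearly identical; flag for manual review.")
```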

Top scientists at big research institutions often run sprawling laboratories with lots of junior scientists. Critics of science research and publishing systems allege that a lack of opportunities for young scientists, limited oversight and pressure to publish splashy papers that can advance careers could incentivize misconduct. 

These critics, along with many science sleuths, allege that errors or sloppiness are too common, that research organizations and authors often ignore concerns when they’re identified, and that the path from complaint to correction is sluggish.

“When you look at the amount of retractions and poor peer review in research today, the question is, what has happened to the quality standards we used to think existed in research?” said Nick Steneck, an emeritus professor at the University of Michigan and an expert on science integrity.

David told NBC News that he had shared some, but not all, of his concerns about additional image issues with Dana-Farber. He added that he had not identified any problems in four of the seven studies that have been retracted. 

“It’s good they’ve picked up stuff that wasn’t in the list,” he said. 

NBC News requested an updated tally of retractions and corrections, but Ellen Berlin, a spokeswoman for Dana-Farber, declined to provide a new list. She said that the numbers could shift and that the institute did not have control over the form, format or timing of corrections. 

“Any tally we give you today might be different tomorrow and will likely be different a week from now or a month from now,” Berlin said. “The point of sharing numbers with the public weeks ago was to make clear to the public that Dana-Farber had taken swift and decisive action with regard to the articles for which a Dana-Farber faculty member was primary author.” 

She added that Dana-Farber was encouraging journals to correct the scientific record as promptly as possible. 

Bik said it was unusual to see a highly regarded U.S. institution have multiple papers retracted. 

“I don’t think I’ve seen many of those,” she said. “In this case, there was a lot of public attention to it and it seems like they’re responding very quickly. It’s unusual, but how it should be.”

Evan Bush is a science reporter for NBC News. He can be reached at [email protected].

Researchers Find That Higher Intelligence Is Correlated With Left-Wing Beliefs

"our results imply that being genetically predisposed to be smarter causes left-wing beliefs.".

A provocative new study has found a link between left-wing beliefs and both higher intelligence quotient (IQ) scores and genetic markers believed to be associated with higher intelligence.

As psychology researchers at the University of Minnesota Twin Cities report in their new paper, published in the journal Intelligence , numerous intelligence tests found that being more clever "is correlated with a range of left-wing and liberal political beliefs."

"Our results," the paper's authors wrote, "imply that being genetically predisposed to be smarter causes left-wing beliefs."

As with all research on human intelligence, the work is fraught. There are different types of intelligence, and it's hard to draw the line between nature and nurture, since education clearly causes IQ test scores to rise.

Still, the paper's methodology is compelling. The UM authors gleaned results from a study of more than 200 families, some of which included only biological children, others only adopted children, and a smaller portion of which had both adopted and biological kids.

"We find both IQ and genetic indicators of intelligence, known as polygenic scores, can help predict which of two siblings tends to be more liberal," study author Tobias Edwards told PsyPost in a fascinating interview about the research. "These are siblings with the same upbringing, who are raised under the same roof.

"This implies that intelligence is associated with political beliefs, not solely because of environment or upbringing, but rather that the genetic variation for intelligence may play a part in influencing our political differences," he added. "Why is this the case? I do not know."

"Using both measured IQ and polygenic scores" — the latter are genetic profiles that determine all kinds of things, from how one looks and to their risk of acquiring disease or expressing mental illness — the research measured "for cognitive performance and educational attainment" and then determined whether there was a correlation between intelligence, genetics, and political affiliation.

On the politics side, they tested for five variables: "political orientation, authoritarianism, egalitarianism, social liberalism, and fiscal conservatism."

"Polygenic scores predicted social liberalism and lower authoritarianism, within-families," the paper continues. "Intelligence was able to significantly predict social liberalism and lower authoritarianism, within families, even after controlling for socioeconomic variables."

Still, Edwards cautioned, political beliefs are complex constructs of a particular historical moment that will never be totally reducible to any one variable.

"This surprise highlights an important point; there is no law saying that intelligent people must always be supportive of particular beliefs or ideologies," he told PsyPost. "The way our intelligence affects our beliefs is likely dependent upon our environment and culture. Looking back across history, we can see intelligent individuals have been attracted to all sorts of different and often contradictory ideas."

"Intellectuals have flirted with and been seduced by dangerous ideologies and tyrannical regimes," he added. "Many smart people have believed ideas that are downright stupid. Because of this, George Orwell doubted that the intelligence of partisans could be any guide to the quality of their beliefs, declaring that ‘one has to belong to the intelligentsia to believe things like that: no ordinary man could be such a fool.'"

Computer Science > Computation and Language

Title: Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Abstract: This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed approach is a new attention technique dubbed Infini-attention. The Infini-attention incorporates a compressive memory into the vanilla attention mechanism and builds in both masked local attention and long-term linear attention mechanisms in a single Transformer block. We demonstrate the effectiveness of our approach on long-context language modeling benchmarks, 1M sequence length passkey context block retrieval and 500K length book summarization tasks with 1B and 8B LLMs. Our approach introduces minimal bounded memory parameters and enables fast streaming inference for LLMs.
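Read on its own, the abstract already suggests a workable sketch of the mechanism. The following single-head NumPy toy is an assumption-laden illustration rather than the paper's reference implementation: it runs ordinary masked (causal) attention inside a segment, reads a compressive memory (a d-by-d matrix plus a normalization vector) with a linear-attention lookup, blends the two outputs with a gate, and then folds the segment's keys and values into the memory for later segments. The ELU+1 feature map, the fixed gate value, and the exact update rule are illustrative guesses, not details taken from the paper.

```python
# Toy sketch of the idea described in the abstract (single head, NumPy only).
# Not the authors' implementation; feature map, gate, and update rule are
# illustrative assumptions.
import numpy as np

def elu_plus_one(x):
    # A common non-negative feature map used in linear attention.
    return np.where(x > 0, x + 1.0, np.exp(x))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def infini_attention_segment(q, k, v, memory, norm, gate=0.5):
    """Process one segment. q, k, v: (seg_len, d). memory: (d, d). norm: (d,)."""
    seg_len, d = q.shape

    # 1) Masked local attention within the current segment (vanilla causal attention).
    scores = (q @ k.T) / np.sqrt(d)
    causal = np.tril(np.ones((seg_len, seg_len), dtype=bool))
    local_out = softmax(np.where(causal, scores, -np.inf)) @ v

    # 2) Long-term read: linear-attention lookup into the compressive memory
    #    that summarizes all previous segments in fixed-size state.
    sq = elu_plus_one(q)
    mem_out = (sq @ memory) / (sq @ norm + 1e-6)[:, None]

    # 3) Blend local and long-term context (the paper learns this gate; here it is fixed).
    out = gate * mem_out + (1.0 - gate) * local_out

    # 4) Fold this segment's keys/values into the memory for later segments,
    #    keeping memory size constant regardless of total input length.
    sk = elu_plus_one(k)
    memory = memory + sk.T @ v
    norm = norm + sk.sum(axis=0)
    return out, memory, norm

# Streaming over an arbitrarily long input then just iterates this per segment,
# carrying (memory, norm) forward, which is the bounded-memory behavior the
# abstract describes.
d, seg_len = 16, 8
memory, norm = np.zeros((d, d)), np.zeros(d)
for _ in range(3):  # three segments of a longer stream
    q = np.random.randn(seg_len, d)
    k = np.random.randn(seg_len, d)
    v = np.random.randn(seg_len, d)
    out, memory, norm = infini_attention_segment(q, k, v, memory, norm)
```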

COMMENTS

  1. What is the justification of a research?

    Answer: Research is conducted to add something new, either knowledge or solutions, to a field. Therefore, when undertaking new research, it is important to know and state why the research is being conducted, in other words, justify the research. The justification of a research is also known as the rationale.

  2. How to Justify Your Methods in a Thesis or Dissertation

    Two Final Tips: When you're writing your justification, write for your audience. Your purpose here is to provide more than a technical list of details and procedures. This section should focus more on the why and less on the how. Consider your methodology as you're conducting your research.

  3. Ten simple rules for typographically appealing scientific texts

    Paragraphs can be typeset left-aligned, centered, right-aligned, or fully justified; cf. Fig 4A. Justifying text requires aligning both the left and right ends of lines, and this is commonly achieved by stretching the spacing between words. Paragraphs in continuous text are usually typeset justified. ... Simple Rules for Writing Research Papers ...

  4. Scientific conclusions need not be accurate, justified, or believed by

    Abstract. We argue that the main results of scientific papers may appropriately be published even if they are false, unjustified, and not believed to be true or justified by their author. To defend this claim we draw upon the literature studying the norms of assertion, and consider how they would apply if one attempted to hold claims made in ...

  5. Justification of research using systematic reviews continues to be

    1. Thank you for the opportunity to review this interesting meta-research paper, which is part of a series of papers. Basing new research on systematic reviews is clearly important and has been the subject of a number of reviews. This paper essentially reviews the meta-research in this area, to give a global assessment of the issue taking into ...

  6. Research Problems and Hypotheses in Empirical Research

    The paper illustrates the central role played by the study's general aim and its relation to existing knowledge in the research domain. KEYWORDS: ... Is a research problem justified if a research hypothesis has been formulated? The account is limited to individual, substantive, empirical, and quantitative research studies.

  7. Scientific Statements: Their Justification and Acceptance

    In order to provide such knowledge, science must have means and methods for justifying its statements—its facts, hypotheses, theories and laws. Scientific knowledge is distinguished from belief, no matter how true or strongly felt that belief may be, by its processes of justification and by its being accepted as such by the scientific ...

  8. Q: How can I write about the justification of my research

    The justification is also known as the rationale and is written in the Introduction. You may thus refer to these resources for writing the justification of your research: How to write the rationale for research? Can you give an example of the "rationale of a study"? 4 Step approach to writing the Introduction section of a research paper.

  9. Topic: Introduction and research justification

    Research proposals often open by outlining a central concern, issue, question or conundrum to which the research relates. The research justification should be provided in an accessible and direct manner in the introductory section of the research proposal.

  10. Research Paper Format

    Research paper format is an essential aspect of academic writing that plays a crucial role in the communication of research findings. The format of a research paper depends on various factors such as the discipline, style guide, and purpose of the research. It includes guidelines for the structure, citation style, referencing, and other elements of the paper that contribute to its overall ...

  11. Justification for Research

    Justification for Research. What makes a good research question is often in the eye of the beholder, but there are several general best-practices criteria that can be used to assess the justification for research. Is the question scientifically well-posed, i.e. is it stated in a hypothetical form that leads to a ...

  12. How is Information Systems Research Justified? An Analysis of

    Abstract. This study analyses how Information Systems (IS) research is justified by authors. We assess how authors justify their research endeavors based on published IS research papers. We use ...

  13. Justification of research using systematic reviews continues to be

    Background Redundancy is an unethical, unscientific, and costly challenge in clinical health research. There is a high risk of redundancy when existing evidence is not used to justify the research question when a new study is initiated. Therefore, the aim of this study was to synthesize meta-research studies evaluating if and how authors of clinical health research studies use systematic ...

  14. writing

    Does this "Why" discussion belong in the methodology section of a research paper or should it be placed somewhere else? If so, where? Edit: As an example: I can either survey people or perform an analysis of existing discussions on the topic. There are pros/cons to each approach (sample size, recruitment, etc.).

  15. 13.1 Formatting a Research Paper

    Set the top, bottom, and side margins of your paper at 1 inch. Use double-spaced text throughout your paper. Use a standard font, such as Times New Roman or Arial, in a legible size (10- to 12-point). Use continuous pagination throughout the paper, including the title page and the references section.

  16. Deciding whether the conclusions of studies are justified: a review

    The critical reader, aware of the principle of study design and analysis, can assess the validity of the author's conclusions through careful review of the article. A guide to the analysis of an original article is presented here. The reader is asked to formulate questions about a study from the abstract. The answers to these questions, which ...

  17. Rationale: the necessary ingredient for contributions to theory and

    Often, these papers neglect a critical component of the paper: rationale. The rationale is necessary to justify the need for the research and the approach taken. We argue that contributions flow naturally if the rationale for the study is provided and justified within existing literature, assuming that the study is well executed.

  18. Sample Size Justification

    An important step when designing an empirical study is to justify the sample size that will be collected. The key aim of a sample size justification for such studies is to explain how the collected data is expected to provide valuable information given the inferential goals of the researcher. In this overview article six approaches are discussed to justify the sample size in a quantitative ...

  19. Q: How can I write the Justification of my research paper?

    Answer: Welcome to the Editage Insights Q&A Forum, and thanks for your question. We have quite a few resources on writing the justification or rationale of a study. We have linked a few of these below. For more, you can search the forum/site and the platform using the relevant keywords.

  20. Effective Research Paper Paraphrasing: A Quick Guide

    Research papers rely on other people's writing as a foundation to create new ideas, but you can't just use someone else's words. That's why paraphrasing is an essential writing technique for academic writing.. Paraphrasing rewrites another person's ideas, evidence, or opinions in your own words.With proper attribution, paraphrasing helps you expand on another's work and back up ...

  21. Scientific conclusions need not be accurate, justified, or believed by

    Abstract. We argue that the main results of scientific papers may appropriately be published even if they are false, unjustified, and not believed to be true or justified by their author. To defend this claim we draw upon the literature studying the norms of assertion, and consider how they would apply if one attempted to hold claims made in ...

  22. Knowledge as Justified True Belief

    It is a guiding thought behind the present paper that methods that produce justified beliefs do so because they ensure a proper fit between our beliefs and the world. ... Philosophy and Phenomenological Research, 91(2), 273-302. Goldman, A. I. (1976). Discrimination and perceptual knowledge.

  23. Cancer research institute retracts studies amid controversy over errors

    Prestigious cancer research institute has retracted 7 studies amid controversy over errors. ... Scientific images in papers are typically used to present evidence of an experiment's results ...

  24. [2403.20329] ReALM: Reference Resolution As Language Modeling

    ReALM: Reference Resolution As Language Modeling. Reference resolution is an important problem, one that is essential to understand and successfully handle context of different kinds. This context includes both previous turns and context that pertains to non-conversational entities, such as entities on the user's screen or those running in the ...

  25. Researchers Find That Higher Intelligence Is Correlated With Left-Wing

    "Our results," the paper's authors wrote, "imply that being genetically predisposed to be smarter causes left-wing beliefs." As with all research on human intelligence, the work is fraught.

  26. Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

    Recent advancements in multimodal large language models (MLLMs) have been noteworthy, yet, these general-domain MLLMs often fall short in their ability to comprehend and interact effectively with user interface (UI) screens. In this paper, we present Ferret-UI, a new MLLM tailored for enhanced understanding of mobile UI screens, equipped with referring, grounding, and reasoning capabilities ...

  27. MSU's 700 acres of natural areas bring the classroom outdoors

    There are more than 700 acres in 25 distinct sites across campus, providing important examples of MSU's rich natural heritage and offering significant resources for teaching, research, demonstration and nature appreciation. Thanks to these spaces, research and teaching at MSU extend beyond the classroom. David Rothstein, a professor in the ...

  28. Should the manuscript be unified as "left justification" or ...

    Most of the journals do mention under their formatting guidelines, whether they prefer manuscript to be left aligned or justified. In case it is not mentioned, it is advisable to maintain uniformity throughout the text. ... Using a Venn diagram, state the similarities and differences of a research paper to a research report.

  29. [2404.07143] Leave No Context Behind: Efficient Infinite Context

    Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention. This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed approach is a new attention technique dubbed Infini-attention.