Total quality management: three case studies from around the world

With organisations to run and big orders to fill, it’s easy to see how some CEOs inadvertently sacrifice quality for quantity. By integrating a system of total quality management, it’s possible to have both.


There are few boardrooms in the world whose inhabitants don’t salivate at the thought of engaging in a little aggressive expansion. After all, there’s little room in a contemporary, fast-paced business environment for any firm whose leaders don’t subscribe to ambitions of bigger factories, healthier accounts and stronger turnarounds. Yet too often such tales of excess go hand-in-hand with complaints of a severe drop in quality.

Food and entertainment markets are riddled with cautionary tales, but service sectors such as health and education aren’t immune to the disappointing by-products of unsustainable growth either. As always, the first step in avoiding a catastrophic forsaking of quality is good management.

There are plenty of methods and models geared towards managing the quality of a particular company’s goods or services. Yet very few of those models take into consideration the widely held belief that any company is only as strong as its weakest link. With that in mind, management consultant William Deming developed an entirely new set of methods with which to address quality.

Deming, whose managerial work revolutionised the titanic Japanese manufacturing industry, perceived quality management to be more of a philosophy than anything else. Top-to-bottom improvement, he reckoned, required uninterrupted participation of all key employees and stakeholders. Thus, the total quality management (TQM) approach was born.

All in

Similar to the Six Sigma improvement process, TQM ensures long-term success by enforcing all-encompassing internal guidelines and process standards to reduce errors. By way of serious, in-depth auditing – as well as some well-orchestrated soul-searching – TQM ensures firms meet stakeholder needs and expectations efficiently and effectively, without forsaking ethical values.

By opting to reframe the way employees think about the company’s goals and processes, TQM allows CEOs to make sure certain things are done right from day one. According to Teresa Whitacre, of international consulting firm ASQ, proper quality management also boosts a company’s profitability.

“Total quality management allows the company to look at their management system as a whole entity — not just an output of the quality department,” she says. “Total quality means the organisation looks at all inputs, human resources, engineering, production, service, distribution, sales, finance, all functions, and their impact on the quality of all products or services of the organisation. TQM can improve a company’s processes and bottom line.”

Embracing the entire process sees companies strive to improve in several core areas, including customer focus, total employee involvement, process-centred thinking, systematic approaches, good communication, leadership and integrated systems. Yet Whitacre is quick to point out that companies stand to gain very little from TQM unless they’re willing to go all-in.

“Companies need to consider the inputs of each department and determine which inputs relate to its governance system. Then, the company needs to look at the same inputs and determine if those inputs are yielding the desired results,” she says. “For example, ISO 9001 requires that management reviews occur at least annually. Aside from minimum standard requirements, the company is free to review what they feel is best for them. While implementing TQM, they can add to their management review the most critical metrics for their business, such as customer complaints, returns, cost of products, and more.”

The customer knows best: AtlantiCare

TQM isn’t an easy management strategy to introduce into a business; in fact, many attempts tend to fall flat. More often than not, it’s because firms maintain natural barriers to full involvement. Middle managers, for example, tend to complain their authority is being challenged when boots on the ground are encouraged to speak up in the early stages of TQM. Yet in a culture of constant quality enhancement, the views of any given workforce are invaluable.

AtlantiCare in numbers

5,000 Employees

$280m Revenue before quality improvement strategy was implemented

$650m Revenue after quality improvement strategy

One firm that’s proven the merit of TQM is New Jersey-based healthcare provider AtlantiCare. Managing 5,000 employees at 25 locations, AtlantiCare is a serious business that’s boasted a respectable turnover for nearly two decades. Yet in order to increase that margin further still, managers wanted to implement improvements across the board. Because patient satisfaction is the single most important aspect of the healthcare industry, engaging in a renewed campaign of TQM proved a natural fit. The firm chose to adopt a ‘plan-do-check-act’ cycle, which revealed gaps in staff communication – gaps that meant longer patient waiting times and more complaints. To tackle this, managers explored a sideways method of internal communications: instead of information trickling down from top to bottom, all of the company’s employees were given the freedom to provide vital feedback at each and every level.
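The ‘plan-do-check-act’ cycle AtlantiCare adopted can be sketched as a simple feedback loop. The sketch below is purely illustrative – the waiting-time figures and the fixed per-cycle improvement are invented, not AtlantiCare’s actual data or system:

```python
# Minimal sketch of a plan-do-check-act (PDCA) cycle, using hypothetical
# patient-waiting-time figures; not AtlantiCare's actual system.

def pdca(baseline_wait, target_wait, improvement_per_cycle, max_cycles=10):
    """Iterate PDCA until the average wait time meets the target.

    Returns (cycles_run, final_wait)."""
    wait = baseline_wait
    for cycle in range(1, max_cycles + 1):
        # Plan: measure the gap between current performance and the target.
        gap = wait - target_wait
        if gap <= 0:
            return cycle - 1, wait  # Act: target met, standardise the process.
        # Do: run the improvement (here, a fixed reduction per cycle).
        wait -= improvement_per_cycle
        # Check: the next iteration re-measures against the target.
    return max_cycles, wait

cycles, final_wait = pdca(baseline_wait=45, target_wait=30, improvement_per_cycle=5)
print(cycles, final_wait)  # 3 30 — three cycles to go from 45 to 30 minutes
```

The point of the loop structure is that ‘check’ feeds back into the next ‘plan’: the cycle only ends when measurement, not opinion, says the gap is closed.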

AtlantiCare decided to ensure all new employees understood this quality culture from the outset. At orientation, staff now receive a crash course in the company’s performance excellence framework – a management system that organises the firm’s processes into five key areas: quality; customer service; people and workplace; growth; and financial performance. As employees rise through the ranks, this emphasis on improvement follows, so managers can operate within the company’s tight-loose-tight process management style.

After creating benchmark goals for employees to achieve at all levels – including better engagement at the point of delivery, increasing clinical communication and identifying and prioritising service opportunities – AtlantiCare was able to thrive. The number of repeat customers at the firm tripled, and its market share hit a six-year high. Profits unsurprisingly followed. The firm’s revenues shot up from $280m to $650m after implementing the quality improvement strategies, and the number of patients being serviced dwarfed state numbers.

Hitting the right notes: Santa Cruz Guitar Co

For companies further removed from the long-term satisfaction of customers, it’s easier to let quality control slide. Yet there are plenty of ways in which growing manufacturers can pursue both quality and sales volumes simultaneously. Artisan instrument makers the Santa Cruz Guitar Co (SCGC) prove a salient example. Although the California-based company is still a small-scale manufacturing operation, SCGC has grown in recent years from a basement operation to a serious business.

SCGC in numbers

14 Craftsmen employed by SCGC

800 Custom guitars produced each year

Owner Dan Roberts now employs 14 expert craftsmen, who create over 800 custom guitars each year. In order to ensure the continued quality of his instruments, Roberts has created an environment that improves with each sale. To keep things efficient (as TQM must), the shop floor is divided into six workstations in which guitars are partially assembled and then moved to the next station. Each bench is manned by a senior craftsman, and no guitar leaves that builder’s station until he is 100 percent happy with its quality. This process is akin to a traditional assembly line; however, unlike a traditional, top-to-bottom factory, Roberts is intimately involved in all phases of instrument construction.
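The approve-before-release flow described above can be modelled as a gated pipeline. In this hypothetical sketch the station names and the idea of a rework loop are invented for illustration; SCGC’s actual bench layout is not public:

```python
# Hypothetical sketch of SCGC's gated workstation flow: a guitar advances
# to the next station only once the craftsman at that bench signs off.

STATIONS = ["body", "neck", "assembly", "finish", "setup", "final"]

def build_guitar(inspections):
    """inspections maps each station name to a callable that returns True
    when the craftsman is 100 percent happy; a guitar cannot skip a
    failing station -- it is reworked at that bench instead."""
    completed = []
    for station in STATIONS:
        while not inspections[station]():
            pass  # rework at this bench until quality is met
        completed.append(station)
    return completed

# Example: every station passes on the first inspection.
done = build_guitar({s: (lambda: True) for s in STATIONS})
print(done)  # ['body', 'neck', 'assembly', 'finish', 'setup', 'final']
```

The gate at each station is what distinguishes this from a conventional assembly line: quality is enforced at every hand-off, not inspected in at the end.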

Utilising this doting method of quality management, it’s difficult to see how customers wouldn’t be satisfied with the artists’ work. Yet even if there were issues, Roberts and other senior management also spend much of their days personally answering web queries about the instruments. According to the managers, customers tend to be pleasantly surprised to find the company’s senior leaders are the ones answering their technical questions and concerns. While Roberts has no intentions of taking his manufacturing company to industrial heights, the quality of his instruments and high levels of customer satisfaction speak for themselves; the company currently boasts a lengthy backlog of orders.

A quality education: Ramaiah Institute of Management Studies

Although it may appear easier to find success with TQM at a boutique-sized endeavour, the philosophy’s principles hold true in virtually every sector. Educational institutions, for example, have utilised quality management in much the same way – albeit to tackle decidedly different problems.

The global financial crisis hit higher education harder than many might have expected, and nowhere have the odds stacked higher than in India. The nation is home to one of the world’s fastest-growing markets for business education. Yet over recent years, the relevance of business education in India has come into question. A report by one recruiter recently asserted that just one in four Indian MBAs were adequately prepared for the business world.

RIMS in numbers

9% Increase in test scores post total quality management strategy

22% Increase in number of recruiters hiring from the school

$20,000 Increase in the average salary offered to graduates

$50,000 Rise in placement revenue

At the Ramaiah Institute of Management Studies (RIMS) in Bangalore, recruiters and accreditation bodies specifically called into question the quality of students’ educations. Although the relatively small school has always struggled to compete with India’s renowned Xavier Labour Relations Institute, the faculty finally began to notice clear hindrances to the success of graduates. The RIMS board decided it was time for a serious reassessment of quality management.

The school nominated Chief Academic Advisor Dr Krishnamurthy to head a volunteer team that would audit, analyse and implement process changes that would improve quality throughout (all in a particularly academic fashion). The team was tasked with looking at three key dimensions: assurance of learning, research and productivity, and quality of placements. Each member underwent extensive training to learn about action plans, quality auditing skills and continuous improvement tools – such as the ‘plan-do-study-act’ cycle.

Once faculty members were trained, the team’s first task was to identify the school’s key stakeholders, processes and their importance at the institute. Unsurprisingly, the most vital processes were identified as student intake, research, knowledge dissemination, outcomes evaluation and recruiter acceptance. From there, Krishnamurthy’s team used a fishbone diagram to help identify potential root causes of the issues plaguing these vital processes. To illustrate just how bad things were at the school, the team selected control groups and administered domain-based knowledge tests.
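A fishbone (Ishikawa) diagram like the one Krishnamurthy’s team used is essentially a mapping from an effect back to candidate root causes, grouped by branch. The sketch below uses the vital processes the team identified as branches, but the specific causes listed under each are invented for illustration:

```python
# Hypothetical sketch of a fishbone (Ishikawa) diagram: candidate root
# causes grouped under the vital processes RIMS identified. The causes
# themselves are invented for illustration.

fishbone = {
    "student intake": ["entry standards", "selection process"],
    "research": ["faculty research time", "funding access"],
    "knowledge dissemination": ["curriculum relevance", "teaching methods"],
    "outcomes evaluation": ["assessment design", "feedback loops"],
    "recruiter acceptance": ["placement support", "industry engagement"],
}

def candidate_causes(diagram):
    """Flatten the diagram into (process, cause) pairs -- the list of
    hypotheses to test, e.g. with control groups and knowledge tests."""
    return [(process, cause)
            for process, causes in diagram.items()
            for cause in causes]

print(len(candidate_causes(fishbone)))  # 10 candidate causes across 5 branches
```

The value of the structure is that each branch becomes a testable hypothesis, which is exactly why the team followed the diagram with control groups and domain-based knowledge tests.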

The deficits were disappointing. RIMS students’ knowledge base was rated at just 36 percent, while students at Harvard rated 95 percent. Likewise, students’ critical thinking abilities rated nine percent, versus 93 percent at MIT. Worse yet, the mean salaries of graduating students averaged $36,000, versus $150,000 for students from Kellogg. Krishnamurthy’s team had their work cut out.

To tackle these issues, Krishnamurthy created an employability team, developed strategic architecture and designed pilot studies to improve the school’s curriculum and make it more competitive. In order to do so, he needed absolutely every employee and student on board – and there was some resistance at the outset. Yet the educator asserted it didn’t actually take long to convince the school’s stakeholders that the changes were extremely beneficial.

“Once students started seeing the results, buy-in became complete and unconditional,” he says. Acceptance was also achieved by maintaining clearer levels of communication with stakeholders: the school started to provide them with detailed plans and projections. Then, it proceeded with a variety of new methods, such as incorporating case studies into the curriculum, which increased general test scores by almost 10 percent. Administrators also introduced a mandate that students must be certified in English by the British Council – increasing scores from 42 percent to 51 percent.

By improving those test scores, the perceived quality of RIMS skyrocketed. The number of top 100 businesses recruiting from the school shot up by 22 percent, while the average salary offers graduates were receiving increased by $20,000. Placement revenue rose by an impressive $50,000, and RIMS has since skyrocketed up domestic and international education tables.

No matter the business, total quality management can and will work. Yet this philosophical take on quality control will only impact firms that are in it for the long haul. Every employee must be in tune with the company’s ideologies and desires to improve, and customer satisfaction must reign supreme.

Quality Improvement Case Study Repository

The ACS Quality Improvement Case Study Repository is a collection of QI projects from hospitals participating in ACS Quality Programs.

The ACS Quality Improvement Case Study Repository is a centralized platform of quality improvement projects implemented by participants of the ACS Quality Programs. Each of the curated projects in the repository has been formatted to follow the new ACS Quality Framework, allowing readers to easily understand the details of each project from planning through execution, data analysis, and lessons learned.

All projects were developed by surgical clinical reviewers, cancer registrars, surgeon champions, program directors, or other quality improvement professionals. They focus on a local problem, utilize local data, and were implemented within their own facilities. They describe the team’s experience, explain project challenges, and how these challenges were addressed. 

The ACS is providing these case studies to educate and inspire surgical teams, their hospitals, and other healthcare entities to engage in quality improvement activities. Quality improvement is not an exact science, and it is important that your quality improvement project is based on a local problem at your institution.

The case studies offered represent the experiences of the authors and may not be generalizable to other institutions. These examples may serve as a starting point to assist you in developing your own quality improvement initiative. Adapting the projects outlined as examples here does not guarantee compliance with an ACS accreditation or verification standard.

If you have a quality improvement project you would like to add to the case study repository or would like to provide feedback on this new resource, contact us at [email protected] .


Case Study: Quality Management System at Coca Cola Company

Coca Cola’s history can be traced back to a man called Asa Candler, who bought a specific formula from a pharmacist named John Stith Pemberton. Two years later, Asa founded his business and started production of soft drinks based on the formula he had bought. From then, the company grew to become the biggest producer of soft drinks, with more than five hundred brands sold and consumed in more than two hundred nations worldwide.

Although the company is said to be the biggest bottler of soft drinks, it does not bottle much itself. Instead, Coca Cola Company manufactures a syrup concentrate, which is bought by bottlers all over the world. This distribution system ensures the soft drink is bottled by these smaller firms according to the company’s standards and guidelines. Although this franchised method of distribution is the primary one, the parent company has a key bottler in America, Coca Cola Refreshments.

In addition to soft drinks, which are Coca Cola’s main products, the company also produces diet soft drinks. These are variations of the original soft drinks with improvements in nutritional value and reductions in sugar content. Saccharin replaced sugar in its first diet drink in 1963 so that the product could appeal to health-conscious consumers. A major cause for concern was inter-product competition, which saw sales of some products dwindle in favour of others.

Coca Cola started diversifying its products during the Second World War, when ‘Fanta’ was introduced. During the war, the head of Coca Cola’s German operation decided to establish a new soft drink in the market. With American branding unacceptable in wartime Germany, he decided to use a new name, and ‘Fanta’ was born. The creation was successful and production continued even after the war. ‘Sprite’ followed in the years that followed.

In the 1990s, health concerns among consumers of soft drinks forced their manufacturers to consider altering the energy content of these products. ‘Minute Maid’ juices, ‘PowerAde’ sports drinks, and a few flavoured tea variants were Coca Cola’s initial reactions to this new interest. Although most of these new products were well received, some did not perform as well. An example of such was the reduced-sugar variant Coca-Cola C2.

Coca Cola Company has been a successful company for more than a century. This can be attributed partly to the nature of its products, since soft drinks will always appeal to people. In addition to this, Coca Cola has one of the best commercial and public relations programs in the world. The company’s products can be found in adverts in virtually every corner of the globe. This success has led to its support for a wide range of sporting activities: soccer, baseball, ice hockey, athletics and basketball are some of the sports in which Coca Cola is involved.

The Quality Management System at Coca Cola

It is very important that each product Coca Cola produces meets a high quality standard, to ensure that each product is exactly the same. This matters because the company wants to meet customer requirements and expectations. With the brand having such a global presence, it is vital that these checks are consistently applied. A standard bottle of Coca Cola has elements that need to be checked on the production line to make sure a high quality is being met. The most common checks cover ingredients, packaging and distribution. Much of the testing takes place during the production process, as machines and a small team of employees monitor progress. It is the responsibility of all of Coca-Cola’s staff to check quality, from hygiene operators to product and packaging inspectors. These constant checks require staff to be on the lookout for problems and to take responsibility for them, ensuring quality is maintained.

Coca-Cola uses inspection throughout its production process, especially in the testing of the Coca-Cola formula, to ensure that each product meets specific requirements. Inspection normally refers to the sampling of a product after production in order to take corrective action and maintain the quality of products. Coca-Cola has incorporated this method into its organisational structure as it helps eliminate mistakes and maintain high quality standards, reducing the chance of a product recall. It is also easy to implement and cost-effective.

Coca-Cola uses both Quality Control (QC) and Quality Assurance (QA) throughout its production process. QC focuses mainly on the production line itself, whereas QA covers the entire operations process and related functions, addressing potential problems very quickly. In both, state-of-the-art computers check all aspects of the production process: the consistency of the formula, the creation (blowing) of the bottle, the fill level of each bottle and the labelling of each bottle. This increases the speed of both production and quality checking, ensuring that product demand can be met. QC and QA reduce the risk of defective products reaching a customer, because problems are found and resolved within the production process; bottles considered defective, for example, are placed in a waiting area for inspection. QA also covers the quality of goods supplied to Coca-Cola, such as the sugar supplied by Tate and Lyle; Coca-Cola reports that it has never had a problem with its suppliers. QA can further involve staff training, ensuring that employees understand how to operate machinery; Coca-Cola ensures that all members of staff receive training prior to their employment so that they can operate machinery efficiently. The machinery itself is under constant maintenance by highly skilled engineers, which helps Coca-Cola maintain high output.

Every bottle is also checked to confirm that it has the correct fill level and the correct label. This is done by a computer through which every bottle passes during the production process; any faulty products are taken off the main production line. Should the quality control measures find any errors, the production line is frozen back to the last good check that was made. The Coca-Cola bottling plant also tracks the utilization level of each production line using a scorecard system, which shows the percentage of the line’s capacity being used and allows managers to increase the production level of a line if necessary.
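The scorecard idea reduces to a simple percentage calculation. Here is a minimal sketch; the function name and all figures are hypothetical, since the plant’s actual scorecard system is not public:

```python
def line_utilization(bottles_produced: int, rated_capacity: int) -> float:
    """Return a production line's utilization as a percentage.

    rated_capacity is the number of bottles the line could have
    produced in the same period at full speed (hypothetical figure).
    """
    if rated_capacity <= 0:
        raise ValueError("rated capacity must be positive")
    return 100.0 * bottles_produced / rated_capacity

# A line that filled 45,000 bottles in a shift rated for 60,000:
print(line_utilization(45_000, 60_000))  # 75.0
```

A manager seeing a low percentage on such a scorecard could then schedule more production onto that line, as described above.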

Coca-Cola also uses Total Quality Management (TQM), which involves managing quality at every level of the organisation, including suppliers, production and customers. This allows Coca-Cola to retain (or regain) competitiveness and achieve greater customer satisfaction, and the company uses the method to continuously improve the quality of its products. Teamwork is very important: Coca-Cola ensures that every member of staff is involved in the production process and understands their role, which improves morale and motivation and so increases productivity. TQM practices can also increase customer involvement, as many organisations, including Coca-Cola, relish the opportunity to receive feedback and information from their consumers. Overall, TQM reduces waste and costs and provides Coca-Cola with a competitive advantage.

The Production Process

Before production starts on a line, cleaning tasks are performed to rinse internal pipelines, machines and equipment. This is often done during a changeover of lines, for example from Coke to Diet Coke, to ensure that the taste is unaffected. These checks serve both hygiene and product quality; once they are complete, the production process can begin.

Coca-Cola uses a database system called Questar which enables checks to be performed on the line. For example, all materials are coded, and each line is issued with a bill of materials before the process starts; this ensures that the correct materials are put on the line. The check is designed to eliminate problems on the production line and is audited regularly; without this system, product quality could not be assessed at this level. Other quality checks on the line include packaging and carbonation, which are monitored by an operator who records the values to ensure they meet the standards.
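The bill-of-materials check can be pictured as a set comparison between what was issued for the run and what is actually on the line. This is a hypothetical sketch only; the real Questar system is proprietary, and the material codes below are invented:

```python
def check_materials(issued_bom: set[str], loaded: set[str]) -> list[str]:
    """Compare the materials loaded on a line against the bill of
    materials issued for the run, returning any discrepancies."""
    problems = []
    for code in sorted(issued_bom - loaded):
        problems.append(f"missing material: {code}")
    for code in sorted(loaded - issued_bom):
        problems.append(f"unexpected material: {code}")
    return problems

# Invented material codes: the wrong label has been loaded.
bom = {"CAP-RED-01", "LABEL-COKE-330", "PREFORM-330"}
on_line = {"CAP-RED-01", "LABEL-DIET-330", "PREFORM-330"}
print(check_materials(bom, on_line))
```

An empty result would mean the line is cleared to start; any entry would be flagged and resolved before production begins.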

To test product quality further, lab technicians carry out over 2,000 spot checks a day to ensure quality and consistency. This can happen prior to or during production, and may involve taking a sample of bottles off the production line. Quality tests include the CO2 and sugar values, micro testing, packaging quality and cap tightness. These tests are designed so that total quality management ideas can be put forward. For example, one way in which Coca-Cola improved its production process was at the wrapping stage at the end of the line: the machine performed revolutions around the products, wrapping them in plastic until the contents were secure. One initiative meant that one fewer revolution was needed, without affecting the quality of the packaging or the product, thereby saving large amounts of money on packaging costs. Continuous improvement can also be used to adhere to the environmental and social principles the company has a responsibility to abide by. Continuous improvement opportunities are sometimes easy to identify but can lead to big changes within the organisation; the idea is to reveal opportunities to change the way something is performed, and any source of waste, scrap or rework is a potential improvement project.
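A lab spot check of values such as CO2 and sugar content is, in essence, a tolerance test against a specification. The sketch below is illustrative only; every measure name, target value and tolerance here is invented:

```python
# Hypothetical specification: measure -> (target, allowed deviation).
SPEC = {
    "co2_volumes": (3.8, 0.1),
    "sugar_brix": (10.6, 0.2),
    "fill_ml": (330.0, 2.0),
}

def spot_check(sample: dict[str, float]) -> list[str]:
    """Return the measures in a sample that fall outside tolerance
    (a missing measurement also counts as a failure)."""
    failures = []
    for measure, (target, tolerance) in SPEC.items():
        value = sample.get(measure)
        if value is None or abs(value - target) > tolerance:
            failures.append(measure)
    return failures

# Sugar is 0.4 below target, outside the 0.2 tolerance:
print(spot_check({"co2_volumes": 3.85, "sugar_brix": 10.2, "fill_ml": 330.5}))
```

Bottles sampled off the line whose checks return a non-empty list would be the ones pulled aside for further inspection.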

The success of this system can be measured by assessing the consistency of product quality. Coca-Cola says: ‘Our Company’s Global Product Quality Index rating has consistently reached averages near 94 since 2007, with a 94.3 in 2010, while our Company Global Package Quality Index has steadily increased since 2007 to a 92.6 rating in 2010, our highest value to date’. This is a clear indication that the quality system is working well throughout the organisation, and the rising index suggests that the consistency of the products is being recognised by consumers.

Related posts:

  • Case Study: The Coca-Cola Company Struggles with Ethical Crisis
  • Case Study: Analysis of the Ethical Behavior of Coca Cola
  • Case Study of Burger King: Achieving Competitive Advantage through Quality Management
  • SWOT Analysis of Coca Cola
  • Case Study: Marketing Strategy of Walt Disney Company
  • Case Study of Papa John’s: Quality as a Core Business Strategy
  • Case Study: Johnson & Johnson Company Analysis
  • Case Study: Inventory Management Practices at Walmart
  • Case Study: Analysis of Performance Management at British Petroleum
  • Total Quality Management And Continuous Quality Improvement

The contribution of case study research to knowledge of how to improve quality of care

Volume 20, Issue Suppl 1

G Ross Baker

Correspondence to Professor G Ross Baker, Department of Health Policy, Management and Evaluation, University of Toronto, 155 College Street, Toronto, Ontario, Canada M5T 3M6; ross.baker{at}utoronto.ca

Background Efforts to improve the implementation of effective practice and to speed up improvements in quality and patient safety continue to pose challenges for researchers and policy makers. Organisational research, and, in particular, case studies of quality improvement, offer methods to improve understanding of the role of organisational and microsystem contexts for improving care and the development of theories which might guide improvement strategies.

Methods This paper reviews examples of such research and details the methodological issues in constructing and analysing case studies. Case study research typically collects a wide array of data from interviews, documents and other sources.

Conclusion Advances in methods for coding and analysing these data are improving the quality of reports from these studies.

  • Quality improvement
  • qualitative research
  • healthcare quality improvement

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode .

https://doi.org/10.1136/bmjqs.2010.046490


The gap between the knowledge of what works and the widespread adoption of those practices has become a major preoccupation of researchers and a challenge for funders and policy makers. 1–3 Recognition of this ‘quality chasm’ (the term that the US Institute of Medicine used to describe the distance ‘between the healthcare we have and the care we could have’ 4 ) has led to an increased focus on quality improvement and implementation science to advance understanding of how to promote evidence-based practice. In turn, the focus on implementation has led to the development of multiple theories and frameworks to guide implementation, 5–7 but no framework has demonstrated widespread results in practice.

There seems to be no immutable formula for successful implementation of innovations. While rational decision-makers would like the effectiveness of new technologies (including new work routines, devices and medications) to be the primary determinant of their adoption, research suggests otherwise. Healthcare systems are complex and variable. While some teams or organisations provide a ‘receptive context’ for innovation, 8 others resist, having limited interest or abilities to implement new ideas. Decades of research in organisational and social sciences suggest that the nature of the innovation and the organisational, professional and health system contexts into which they are introduced influence their adoption. 7 9–11 Thus, creating more effective, evidence-based care relies not just on developing and disseminating the evidence, but also on building knowledge of the ways in which innovations can be embedded into ongoing practice. Understanding the structures and processes of change is as critical as the knowledge of what works. In this paper, we outline how case study research can contribute a more detailed understanding of how to improve care. Case study methods are underutilised in quality improvement research, and given the growing calls to understand how innovation works in different contexts 12–14 these methods could be a valuable addition to current approaches. We begin by illustrating the insights from case study research, and then examine the contribution of case study research to theory. Next we discuss strategies for analysing case study data and the scientific soundness of such information, ending with a discussion of the need for case studies to enhance the scientific understanding of quality improvement.

Insights from case study research

Three examples of how qualitative organisational research informs our understanding of the adoption of healthcare innovations illustrate the value of this research. Denis and colleagues 15 studied the adoption of four innovations in several Quebec hospitals. They found that the strength of evidence of the innovation was not the only factor influencing adoption. Organisational arrangements, clinical skills and other more ambiguous elements that were open to interpretation and negotiation were also critical. In another study examining innovations in acute care and primary care settings in the UK, Ferlie 16 identified the critical role of boundaries between professional groups. Unlike some prior studies where high levels of professionalisation facilitated adoption of innovations, Ferlie's research found that the varying roles, social boundaries and distinctive cognitive styles of different professional groups can limit the adoption of new technologies. For example, the introduction of an anticoagulation service was slowed by disagreements between cardiologists, primary care physicians, nurses and IT system designers about the appropriate indications for treatment.

The adoption of minimally invasive cardiac surgery for coronary artery bypass graft or valve replacement surgery in 16 US hospitals provides a third example. Edmondson and colleagues 17 found that successful implementation depended on team learning processes rather than resources, academic status or innovation history. Innovative procedures like minimally invasive cardiac surgery disrupt established work routines. Establishing the necessary new routines for minimally invasive cardiac surgery depended on staff perceptions of psychological safety (the sense that ‘well-intentioned interpersonal risks will not be punished’), team stability and a collective learning process supported by leaders.

Each of these research projects used case study methods to identify the novel aspects of the process of implementing innovation. The research teams collected and analysed data from interviews, clinical data and documents. These research projects examined individuals or teams in context; they were embedded multiple case designs. 18 Although the researchers had detailed knowledge of potentially relevant factors, these were primarily exploratory studies, examining which aspects of the innovation, the individuals and teams and the larger organisations influenced the adoption of the innovation.

The case study methods used in these three studies offer valuable tools in exploring the effectiveness of quality improvement more broadly. While case study research is a well-established method in organisational research, it appears to be less common in organisational health services research. Case study research designs involve the collection of qualitative (and often quantitative) data from various sources to explore one or more organisations or parts of organisations and the characteristics of these contexts. 19 Some criticise case study research because they believe that the small sample size and lack of controls undermine the ability to generalise, 20 while others worry that the analysis of case study data is often unsystematic. 21 Yet case studies, because they detail specific experiences in particular contexts, offer the opportunity to learn more about the relationship of organisational processes and context to the success or failure of quality improvement efforts.

Contributions of case studies to theory

Case studies can inform the development of more robust theory that identifies the links between problem, intervention and outcome. Robert Yin, in his classic book, 22 notes that case study research is particularly helpful when researchers want to answer questions of how or why things work in real life contexts. Theory generated from cases may help to make sense of the complex relationships that underline healthcare practice and elucidate why efforts to improve care succeed in some circumstances, but not in others.

Christensen and Carlile 23 note that theory building (the creation of a ‘body of knowledge’ or understanding) occurs in two ways or stages; first there is a descriptive or inductive stage where researchers observe phenomena and describe and measure what they see (see figure 1 ). Based on these observations, researchers develop constructs that abstract the essence of what has been observed, classify or categorise these observations, and identify relationships between them. Through these activities, researchers develop theories or models which organise the aspects of the world they study. Second, in a deductive process, researchers test and improve these theories by exploring whether the same correlations exist in different data sets. This hypothesis testing allows the theory to be confirmed or rejected, and it also permits further specification of the theory to define the phenomena more precisely or specify the circumstances under which correlations hold. Where the goal of research is discovery or new explanations, case studies may offer a more powerful research design than experimental methods. 24 25

Figure 1: Process of building theory.

Edmondson and McManus 26 add to Christensen and Carlile's outline of the process of theory building and testing by identifying the importance of ‘methodological fit’ between theory building and different research methods. They suggest the appropriateness of different types of data varies depending on the research questions posed, the current state of the literature and the contribution envisaged from the research. Qualitative data, including interviews, observation and document analysis, are most appropriate for research where theory is nascent, and the research questions are exploratory. On the other hand, where theory is mature, survey methods and statistical testing focused on confirmation of hypotheses are more appropriate.

Organisational case studies have been an effective way to build theory in organisational research. 18 Eisenhardt and Graebner 27 note that ‘[a] major reason for the popularity and relevance of theory building from case studies is that it is one of the best (if not the best) of the bridges from rich qualitative evidence to mainstream deductive research. Its emphasis on developing constructs, measures and testable theoretical propositions makes inductive case research consistent with the emphasis on testable theory within mainstream deductive research.’ Some authors 28 argue that single case studies provide more detail and offer ‘better stories’ which are helpful in describing phenomena. But others assert that multiple case studies provide a stronger base for theory building. 22 27 Multiple case studies are powerful, since they permit replication and extension among individual cases. Replication enables a researcher to perceive the patterns in the cases more easily and to separate out patterns from change occurrences. Different cases can emphasise varying aspects of a phenomenon and enable researchers to develop a fuller theory. Fitzgerald and Dopson 19 identify four common types of multiple case study designs, each based on a different logic. These include (1) matching or replication designs intended to explore or verify ideas; (2) comparison of differences, including cases selected for their different characteristics; (3) outliers, comparison of extremes to delineate key factors and the shape of a field; and (4) embedded case study designs where multiple units are examined to identify similarities and differences.

Despite growing numbers of studies on quality improvement in healthcare, there is limited growth in a more general theory about improvement. For example, there is a growing view that improvement interventions should be tailored to potential barriers. Yet, as Bosch notes, 29 in many cases it is difficult to assess whether such tailoring was done based on a priori barrier identification, and explicit use of theory to match the intervention to the identified barriers. Bosch adds that ‘the translation of identified barriers into tailor-made [quality improvement] interventions [and their] implementation is still a black box for both educational and organisational interventions’ (p. 161). Case studies might contribute useful information to develop relevant theory. More broadly, case study research provides methods to examine organisational processes over time, examining the interplay of interventions with team dynamics or leadership strategy. For example, studies by Baker 30 and Bate 31 of high-performing healthcare organisations illustrate the challenges of creating, spreading and sustaining effective practice in organisations. Some case study research has followed organisations over extended time periods repeating interviews with key informants (eg, Denis' work on strategic change 40 41 ). Unlike survey research and RCTs, case study research can analyse the process of implementation and unpack the dynamics of change.

Data collection and analysis

Organisational case studies can include a wide array of data, including interviews, documents, ethnography, survey data and observations. Although the case study is generally viewed as a qualitative method, it may include quantitative data. For example, Greenhalgh's study of the impact of ‘modernisation initiatives’ on the delivery of care in London 42 used a wide range of methods and data, including interviews, document analysis and ethnography. Other organisational case study research 17 32 40 43 has adopted a similar mix of data sources.

Case study research typically generates large quantities of data, which makes analysis critical, but complex. Moreover, the methods for aggregating data across projects are not well developed. Coffey and Atkinson note that the use of coding and sorting, and the identification of themes are ‘an important, even an indispensable, part of the qualitative research process.’ 44 Yet, there are challenges to such methods, since coding individual experiences can lead to ‘decontextualisation,’ fragmenting such meanings and making them difficult to identify. 45 These problems are accentuated in multiple cases where results may reflect differences between the methods used, or the interests and orientation of various researchers. Even within the same research project, different investigators may take the lead in different cases. Dopson adds several other considerations about chronology: ‘Were the studies synchronous? Were they prospective or retrospective? Were they longitudinal or cross-sectional? How variable were the political and organisational contexts?’ (p. 6). 32 Multiple case studies are difficult to report, given the space constraints for journal publication, 27 and the use of extensive tables risks mimicking the presentation of quantitative data, stripping the illustrative detail from the case presentations. 19

Synthesis across studies can help to build a more generalisable understanding of organisational strategies to support improvement. Yet views vary on whether we can synthesise research from multiple case studies undertaken independently. In their review of studies examining efforts to integrate evidence into clinical decision-making in UK healthcare, Dopson and colleagues 32 compared and synthesised their findings reanalysing the original studies to identify themes, recoding their reports and then assessing the outputs generated by the five researchers involved (see table 1 ). Such tables offer a bird's-eye view of the extent to which common themes inform different case studies, but such summaries are divorced from understanding how these issues are inter-related within each case.

Table 1: Identifying research themes across studies of innovation diffusion 32

Methodological rigour

Efforts to create such syntheses raise issues about methodological rigour. For those researchers who adopt a positivist framework, the test of good case studies builds on four criteria used to assess the rigour of field research: internal validity, construct validity, external validity and reliability. 22 These criteria might be applied to case studies in the following ways (see table 2 ).

Table 2: Framework for an investigation of the methodological rigour of case studies 40

Gibbert and colleagues 46 reviewed case studies published in the organisation/management literature between 1995 and 2000. They found research procedures enhancing external validity in 82 of 159 papers, and procedures supporting reliability in 27 of these papers. Few papers provided evidence of internal or construct validity. Yin proposes pattern matching; explanation building; addressing rival explanations and using logic models as strategies to address internal validity. 22 Eisenhardt offers a series of questions that reflect on the match between method and results: ‘Have the investigators followed a careful analytical procedure? Does the evidence support the theory? Have the investigators ruled out rival explanations?’ 18 (p. 548). Non-positivist researchers employ other methods to ensure the soundness of their findings; for example, see Lincoln and Guba. 47

An alternative measure of the rigour of case study research focuses on how good the theory is that emerges from this research. Pfeffer 48 suggests that good theory is parsimonious, testable and logically coherent. Good theory should also address critical issues of interest to organisations and interested parties. Insights from other disciplines and attempts to seek out anomalies in other authors' work that might inform research in different areas are other strategies that may enrich the quality of case study research, improving the theory that results. 48

Despite the need for more robust theory, why are there so few organisational case studies of quality improvement? Some candidate explanations might include: (1) the limited number of organisational scholars working in this area; (2) the dominance of alternative research paradigms that dismiss case study research; (3) difficulties in securing funding; (4) the lack of publication outlets; and (5) the absence of a clear understanding of the relationship of case study research to the development of theory, and the testing of theory using randomised control trials and other methods. Still, the emergence of several strong research groups in the UK, Canada and the USA, and growing numbers of high-quality publications offer hope. What is missing in quality improvement research is a clear understanding of how case study research could contribute to the broader research enterprise, enriching the qualitative understanding of the complex processes of improving healthcare delivery.

Conclusions

Comparative case study research provides useful methods for identifying the factors facilitating and impeding improvement. Although valuable in their own right, such methods also offer the opportunity to enrich more traditional approaches to assessing interventions, helping to explain why some interventions are unsuccessful, or why they seem to work effectively in some contexts but not in others. Efforts to improve patient safety and quality of care need to take into account the complexities of the systems in which these improvements are being introduced. Case study methods provide a robust means to guide implementation of effective practices.

References

  • Institute of Medicine. Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academy of Sciences, 2001.
  • Clinical Standards Advisory Group. Clinical effectiveness. London, 1998.

Competing interests None.

Provenance and peer review Not commissioned; externally peer reviewed.


Study Quality Assessment Tools

In 2013, NHLBI developed a set of tailored quality assessment tools to assist reviewers in focusing on concepts that are key to a study’s internal validity. The tools were specific to certain study designs and tested for potential flaws in study methods or implementation. Experts used the tools during the systematic evidence review process to update existing clinical guidelines, such as those on cholesterol, blood pressure, and obesity. Their findings are outlined in the following reports:

  • Assessing Cardiovascular Risk: Systematic Evidence Review from the Risk Assessment Work Group
  • Management of Blood Cholesterol in Adults: Systematic Evidence Review from the Cholesterol Expert Panel
  • Management of Blood Pressure in Adults: Systematic Evidence Review from the Blood Pressure Expert Panel
  • Managing Overweight and Obesity in Adults: Systematic Evidence Review from the Obesity Expert Panel

While these tools have not been independently published and would not be considered standardized, they may be useful to the research community. These reports describe how experts used the tools for the project. Researchers may want to use the tools for their own projects; however, they would need to determine their own parameters for making judgements. Details about the design and application of the tools are included in Appendix A of the reports.

Quality Assessment of Controlled Intervention Studies

Criteria (answer each as Yes, No, or Other: CD, NR, NA*)
1. Was the study described as randomized, a randomized trial, a randomized clinical trial, or an RCT?      
2. Was the method of randomization adequate (i.e., use of randomly generated assignment)?      
3. Was the treatment allocation concealed (so that assignments could not be predicted)?      
4. Were study participants and providers blinded to treatment group assignment?      
5. Were the people assessing the outcomes blinded to the participants' group assignments?      
6. Were the groups similar at baseline on important characteristics that could affect outcomes (e.g., demographics, risk factors, co-morbid conditions)?      
7. Was the overall drop-out rate from the study at endpoint 20% or lower of the number allocated to treatment?      
8. Was the differential drop-out rate (between treatment groups) at endpoint 15 percentage points or lower?      
9. Was there high adherence to the intervention protocols for each treatment group?      
10. Were other interventions avoided or similar in the groups (e.g., similar background treatments)?      
11. Were outcomes assessed using valid and reliable measures, implemented consistently across all study participants?      
12. Did the authors report that the sample size was sufficiently large to be able to detect a difference in the main outcome between groups with at least 80% power?      
13. Were outcomes reported or subgroups analyzed prespecified (i.e., identified before analyses were conducted)?      
14. Were all randomized participants analyzed in the group to which they were originally assigned, i.e., did they use an intention-to-treat analysis?      
Quality Rating (Good, Fair, or Poor)
Rater #1 initials:
Rater #2 initials:
Additional Comments (If POOR, please state why):

*CD, cannot determine; NA, not applicable; NR, not reported

Guidance for Assessing the Quality of Controlled Intervention Studies

The guidance document below is organized by question number from the tool for quality assessment of controlled intervention studies.

Question 1. Described as randomized

Was the study described as randomized? A study does not satisfy quality criteria as randomized simply because the authors call it randomized; however, it is a first step in determining if a study is randomized.

Questions 2 and 3. Treatment allocation–two interrelated pieces

Adequate randomization: Randomization is adequate if it occurred according to the play of chance (e.g., computer generated sequence in more recent studies, or random number table in older studies). Inadequate randomization: Randomization is inadequate if there is a preset plan (e.g., alternation where every other subject is assigned to treatment arm or another method of allocation is used, such as time or day of hospital admission or clinic visit, ZIP Code, phone number, etc.). In fact, this is not randomization at all–it is another method of assignment to groups. If assignment is not by the play of chance, then the answer to this question is no. There may be some tricky scenarios that will need to be read carefully and considered for the role of chance in assignment. For example, randomization may occur at the site level, where all individuals at a particular site are assigned to receive treatment or no treatment. This scenario is used for group-randomized trials, which can be truly randomized, but often are "quasi-experimental" studies with comparison groups rather than true control groups. (Few, if any, group-randomized trials are anticipated for this evidence review.)
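
To make the "play of chance" criterion concrete, here is a minimal sketch (hypothetical, not part of the NHLBI tool) of a computer-generated permuted-block allocation sequence, the kind of method that satisfies question 2, in contrast to alternation or assignment by clinic day:

```python
import random

def permuted_block_sequence(n_participants, block_size=4,
                            arms=("treatment", "control"), seed=None):
    """Computer-generated allocation by the play of chance.

    Each block holds an equal number of assignments per arm, shuffled at
    random, so the next assignment cannot be predicted the way it can be
    under alternation or date-of-admission schemes.
    """
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_participants:
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]

seq = permuted_block_sequence(12, seed=42)  # 3 balanced blocks of 4
```

Because each block is balanced, the sequence guarantees equal arm sizes at every block boundary while remaining unpredictable within a block.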

Allocation concealment: This means that one does not know in advance, or cannot guess accurately, to what group the next person eligible for randomization will be assigned. Methods include sequentially numbered opaque sealed envelopes, numbered or coded containers, central randomization by a coordinating center, computer-generated randomization that is not revealed ahead of time, etc.

Questions 4 and 5. Blinding

Blinding means that one does not know to which group–intervention or control–the participant is assigned. It is also sometimes called "masking." The reviewer assessed whether each of the following was blinded to knowledge of treatment assignment: (1) the person assessing the primary outcome(s) for the study (e.g., taking the measurements such as blood pressure, examining health records for events such as myocardial infarction, reviewing and interpreting test results such as x ray or cardiac catheterization findings); (2) the person receiving the intervention (e.g., the patient or other study participant); and (3) the person providing the intervention (e.g., the physician, nurse, pharmacist, dietitian, or behavioral interventionist).

Generally placebo-controlled medication studies are blinded to patient, provider, and outcome assessors; behavioral, lifestyle, and surgical studies are examples of studies that are frequently blinded only to the outcome assessors because blinding of the persons providing and receiving the interventions is difficult in these situations. Sometimes the individual providing the intervention is the same person performing the outcome assessment. This was noted when it occurred.

Question 6. Similarity of groups at baseline

This question relates to whether the intervention and control groups have similar baseline characteristics on average, especially those characteristics that may affect the intervention or outcomes. The point of randomized trials is to create groups that are as similar as possible except for the intervention(s) being studied, in order to compare the effects of the interventions between groups. When reviewers abstracted baseline characteristics, they noted when there was a significant difference between groups. Baseline characteristics for intervention groups are usually presented in a table in the article (often Table 1).

Groups can differ at baseline without raising red flags if: (1) the differences would not be expected to have any bearing on the interventions and outcomes; or (2) the differences are not statistically significant. When concerned about baseline difference in groups, reviewers recorded them in the comments section and considered them in their overall determination of the study quality.
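
One informal, quantitative way to gauge baseline similarity is the standardized mean difference for a characteristic. The sketch below is an illustrative convention, not something prescribed by the NHLBI tool; a rule of thumb sometimes used is that |SMD| < 0.1 suggests the arms are well balanced:

```python
from math import sqrt
from statistics import mean, stdev

def standardized_mean_difference(group_a, group_b):
    """Standardized mean difference for one baseline characteristic.

    The mean difference between arms is divided by the pooled standard
    deviation; |SMD| < 0.1 is a common informal threshold for balance.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_sd = sqrt(
        ((n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2)
        / (n_a + n_b - 2)
    )
    return (mean(group_a) - mean(group_b)) / pooled_sd

# Hypothetical baseline systolic BP values in two small arms
smd = standardized_mean_difference([138, 142, 145, 151], [139, 141, 146, 150])
```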

Questions 7 and 8. Dropout

"Dropouts" in a clinical trial are individuals for whom there are no end point measurements, often because they dropped out of the study and were lost to followup.

Generally, an acceptable overall dropout rate is considered 20 percent or less of participants who were randomized or allocated into each group. An acceptable differential dropout rate is an absolute difference between groups of 15 percentage points at most (calculated as the absolute difference between the two groups' dropout rates). However, these are general guidelines. Lower overall dropout rates are expected in shorter studies, whereas higher overall dropout rates may be acceptable for studies of longer duration. For example, a 6-month study of weight loss interventions should be expected to have nearly 100 percent followup (almost no dropouts–nearly everybody gets their weight measured regardless of whether or not they actually received the intervention), whereas a 10-year study testing the effects of intensive blood pressure lowering on heart attacks may be acceptable if there is a 20-25 percent dropout rate, especially if the dropout rate between groups was similar. The panels for the NHLBI systematic reviews may set different levels of dropout caps.

Conversely, differential dropout rates are not flexible; the 15 percentage point cap holds. If the differential dropout rate between arms exceeds 15 percentage points, then there is a serious potential for bias. This constitutes a fatal flaw, resulting in a poor quality rating for the study.
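
The dropout thresholds from questions 7 and 8 reduce to simple arithmetic. A sketch, with hypothetical trial counts:

```python
def dropout_rates(randomized, completed):
    """Overall and differential dropout rates (as percentages).

    `randomized` and `completed` map each arm name to participant counts:
    those allocated to the arm and those with endpoint measurements.
    """
    total_r = sum(randomized.values())
    total_c = sum(completed.values())
    overall = 100 * (total_r - total_c) / total_r
    per_arm = [100 * (randomized[a] - completed[a]) / randomized[a]
               for a in randomized]
    differential = abs(per_arm[0] - per_arm[1])
    return overall, differential

overall, differential = dropout_rates(
    {"treatment": 100, "control": 100},   # allocated per arm
    {"treatment": 85, "control": 90},     # with endpoint data
)
# overall = 12.5%, differential = 5 percentage points
acceptable = overall <= 20 and differential <= 15  # questions 7 and 8
```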

Question 9. Adherence

Did participants in each treatment group adhere to the protocols for assigned interventions? For example, if Group 1 was assigned to 10 mg/day of Drug A, did most of them take 10 mg/day of Drug A? Another example is a study evaluating the difference between a 30-pound weight loss and a 10-pound weight loss on specific clinical outcomes (e.g., heart attacks), but the 30-pound weight loss group did not achieve its intended weight loss target (e.g., the group only lost 14 pounds on average). A third example is whether a large percentage of participants assigned to one group "crossed over" and got the intervention provided to the other group. A final example is when one group that was assigned to receive a particular drug at a particular dose had a large percentage of participants who did not end up taking the drug or the dose as designed in the protocol.

Question 10. Avoid other interventions

Changes that occur in the study outcomes being assessed should be attributable to the interventions being compared in the study. If study participants receive interventions that are not part of the study protocol and could affect the outcomes being assessed, and they receive these interventions differentially, then there is cause for concern because these interventions could bias results. The following scenario is another example of how bias can occur. In a study comparing two different dietary interventions on serum cholesterol, one group had a significantly higher percentage of participants taking statin drugs than the other group. In this situation, it would be impossible to know if a difference in outcome was due to the dietary intervention or the drugs.

Question 11. Outcome measures assessment

What tools or methods were used to measure the outcomes in the study? Were the tools and methods accurate and reliable–for example, have they been validated, or are they objective? This is important as it indicates the confidence you can have in the reported outcomes. Perhaps even more important is ascertaining that outcomes were assessed in the same manner within and between groups. One example of differing methods is self-report of dietary salt intake versus urine testing for sodium content (a more reliable and valid assessment method). Another example is using BP measurements taken by practitioners who use their usual methods versus using BP measurements done by individuals trained in a standard approach. Such an approach may include using the same instrument each time and taking an individual's BP multiple times. In each of these cases, the answer to this assessment question would be "no" for the former scenario and "yes" for the latter. In addition, a study in which an intervention group was seen more frequently than the control group, enabling more opportunities to report clinical events, would not be considered reliable and valid.

Question 12. Power calculation

Generally, a study's methods section will address the sample size needed to detect differences in primary outcomes. The current standard is at least 80 percent power to detect a clinically relevant difference in an outcome using a two-sided alpha of 0.05. Often, however, older studies will not report on power.
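
As a worked illustration of the 80 percent power standard, the usual normal-approximation formula for sample size per group when comparing two means is n = 2 * ((z_{1-alpha/2} + z_power) * SD / delta)^2. The sketch below uses only the Python standard library; the blood pressure numbers are hypothetical:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants per arm for a two-sided test comparing
    two means, via n = 2 * ((z_{1-alpha/2} + z_power) * sd / delta) ** 2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = z.inv_cdf(power)          # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)

# Hypothetical: detect a 5 mmHg difference in systolic BP, SD 15 per arm
n = sample_size_per_group(delta=5, sd=15)  # about 142 participants per group
```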

Question 13. Prespecified outcomes

Investigators should prespecify outcomes reported in a study for hypothesis testing–which is the reason for conducting an RCT. Without prespecified outcomes, the study may be reporting ad hoc analyses, simply looking for differences supporting desired findings. Investigators also should prespecify subgroups being examined. Most RCTs conduct numerous post hoc analyses as a way of exploring findings and generating additional hypotheses. The intent of this question is to give more weight to reports that are not simply exploratory in nature.

Question 14. Intention-to-treat analysis

Intention-to-treat (ITT) means everybody who was randomized is analyzed according to the original group to which they are assigned. This is an extremely important concept because conducting an ITT analysis preserves the whole reason for doing a randomized trial; that is, to compare groups that differ only in the intervention being tested. When the ITT philosophy is not followed, groups being compared may no longer be the same. In this situation, the study would likely be rated poor. However, if an investigator used another type of analysis that could be viewed as valid, this would be explained in the "other" box on the quality assessment form. Some researchers use a completers analysis (an analysis of only the participants who completed the intervention and the study), which introduces significant potential for bias. Characteristics of participants who do not complete the study are unlikely to be the same as those who do. The likely impact of participants withdrawing from a study treatment must be considered carefully. ITT analysis provides a more conservative (potentially less biased) estimate of effectiveness.
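
The difference between ITT and a completers-only analysis can be shown on toy data. In the hypothetical trial below, the two participants who dropped out of the treatment arm both had events, so the completers analysis understates the treatment arm's event rate:

```python
def event_rates(rows):
    """Event rate per arm, grouping by randomized assignment."""
    rates = {}
    for arm in ("treatment", "control"):
        arm_rows = [r for r in rows if r["assigned"] == arm]
        rates[arm] = sum(r["event"] for r in arm_rows) / len(arm_rows)
    return rates

# Hypothetical 20-person trial: both treatment-arm dropouts had events
participants = (
    [{"assigned": "treatment", "completed": True, "event": False}] * 8
    + [{"assigned": "treatment", "completed": False, "event": True}] * 2
    + [{"assigned": "control", "completed": True, "event": True}] * 3
    + [{"assigned": "control", "completed": True, "event": False}] * 7
)

itt = event_rates(participants)  # everyone, as randomized
completers = event_rates([r for r in participants if r["completed"]])
# ITT treatment event rate: 0.20; completers-only: 0.0, which makes the
# treatment look better than the randomized comparison actually showed.
```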

General Guidance for Determining the Overall Quality Rating of Controlled Intervention Studies

The questions on the assessment tool were designed to help reviewers focus on the key concepts for evaluating a study's internal validity. They are not intended to create a list that is simply tallied up to arrive at a summary judgment of quality.

Internal validity is the extent to which the results (effects) reported in a study can truly be attributed to the intervention being evaluated and not to flaws in the design or conduct of the study–in other words, the ability of the study to support causal conclusions about the effects of the intervention being tested. Such flaws can increase the risk of bias. Critical appraisal involves considering the potential for allocation bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed in the questions above. High risk of bias translates to a rating of poor quality. Low risk of bias translates to a rating of good quality.

Fatal flaws: If a study has a "fatal flaw," then risk of bias is significant, and the study is of poor quality. Examples of fatal flaws in RCTs include high dropout rates, high differential dropout rates, no ITT analysis or other unsuitable statistical analysis (e.g., completers-only analysis).

Generally, when evaluating a study, one will not see a "fatal flaw;" however, one will find some risk of bias. During training, reviewers were instructed to look for the potential for bias in studies by focusing on the concepts underlying the questions in the tool. For any box checked "no," reviewers were told to ask: "What is the potential risk of bias that may be introduced by this flaw?" That is, does this factor cause one to doubt the results that were reported in the study?

NHLBI staff provided reviewers with background reading on critical appraisal, while emphasizing that the best approach to use is to think about the questions in the tool in determining the potential for bias in a study. The staff also emphasized that each study has specific nuances; therefore, reviewers should familiarize themselves with the key concepts.

Quality Assessment of Systematic Reviews and Meta-Analyses

Criteria (answer each question: Yes, No, or Other [CD, NR, NA]*)
1. Is the review based on a focused question that is adequately formulated and described?      
2. Were eligibility criteria for included and excluded studies predefined and specified?      
3. Did the literature search strategy use a comprehensive, systematic approach?      
4. Were titles, abstracts, and full-text articles dually and independently reviewed for inclusion and exclusion to minimize bias?      
5. Was the quality of each included study rated independently by two or more reviewers using a standard method to appraise its internal validity?      
6. Were the included studies listed along with important characteristics and results of each study?      
7. Was publication bias assessed?      
8. Was heterogeneity assessed? (This question applies only to meta-analyses.)      

Guidance for Quality Assessment Tool for Systematic Reviews and Meta-Analyses

A systematic review is a study that attempts to answer a question by synthesizing the results of primary studies while using strategies to limit bias and random error.424 These strategies include a comprehensive search of all potentially relevant articles and the use of explicit, reproducible criteria in the selection of articles included in the review. Research designs and study characteristics are appraised, data are synthesized, and results are interpreted using a predefined systematic approach that adheres to evidence-based methodological principles.

Systematic reviews can be qualitative or quantitative. A qualitative systematic review summarizes the results of the primary studies but does not combine the results statistically. A quantitative systematic review, or meta-analysis, is a type of systematic review that employs statistical techniques to combine the results of the different studies into a single pooled estimate of effect, often given as an odds ratio. The guidance document below is organized by question number from the tool for quality assessment of systematic reviews and meta-analyses.
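
As an illustration of the pooling step in a meta-analysis, the sketch below applies fixed-effect inverse-variance weighting to hypothetical study odds ratios. This is one common approach; real meta-analyses may instead use random-effects models when heterogeneity is present:

```python
from math import exp, log, sqrt

def pooled_odds_ratio(studies):
    """Fixed-effect inverse-variance pooling.

    `studies`: (odds_ratio, standard_error_of_log_odds_ratio) pairs.
    Each log odds ratio is weighted by 1/SE^2; the weighted mean is
    back-transformed to the odds-ratio scale with a 95% CI.
    """
    weights = [1 / se ** 2 for _, se in studies]
    pooled_log = (sum(w * log(or_) for w, (or_, _) in zip(weights, studies))
                  / sum(weights))
    pooled_se = sqrt(1 / sum(weights))
    lower = exp(pooled_log - 1.96 * pooled_se)
    upper = exp(pooled_log + 1.96 * pooled_se)
    return exp(pooled_log), lower, upper

# Three hypothetical studies: (OR, SE of log OR)
or_pooled, lo, hi = pooled_odds_ratio([(0.80, 0.10), (0.75, 0.15), (0.90, 0.20)])
```

The most precise study (smallest SE) dominates the pooled estimate, which is the defining property of inverse-variance weighting.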

Question 1. Focused question

The review should be based on a question that is clearly stated and well-formulated. An example would be a question that uses the PICO (population, intervention, comparator, outcome) format, with all components clearly described.

Question 2. Eligibility criteria

The eligibility criteria used to determine whether studies were included or excluded should be clearly specified and predefined. It should be clear to the reader why studies were included or excluded.

Question 3. Literature search

The search strategy should employ a comprehensive, systematic approach in order to capture all of the evidence possible that pertains to the question of interest. At a minimum, a comprehensive review has the following attributes:

  • Electronic searches were conducted using multiple scientific literature databases, such as MEDLINE, EMBASE, Cochrane Central Register of Controlled Trials, PsychLit, and others as appropriate for the subject matter.
  • Manual searches of references found in articles and textbooks should supplement the electronic searches.

Additional search strategies that may be used to improve the yield include the following:

  • Studies published in other countries
  • Studies published in languages other than English
  • Identification by experts in the field of studies and articles that may have been missed
  • Search of grey literature, including technical reports and other papers from government agencies or scientific groups or committees; presentations and posters from scientific meetings, conference proceedings, unpublished manuscripts; and others. Searching the grey literature is important (whenever feasible) because sometimes only positive studies with significant findings are published in the peer-reviewed literature, which can bias the results of a review.

In their reviews, researchers described the literature search strategy clearly, and ascertained that it could be reproduced by others with similar results.

Question 4. Dual review for determining which studies to include and exclude

Titles, abstracts, and full-text articles (when indicated) should be reviewed by two independent reviewers to determine which studies to include and exclude in the review. Reviewers resolved disagreements through discussion and consensus or with third parties. They clearly stated the review process, including methods for settling disagreements.

Question 5. Quality appraisal for internal validity

Each included study should be appraised for internal validity (study quality assessment) using a standardized approach for rating the quality of the individual studies. Ideally, at least two independent reviewers should appraise each study for internal validity. However, there is not one commonly accepted, standardized tool for rating the quality of studies. So, in the research papers, reviewers looked for an assessment of the quality of each study and a clear description of the process used.

Question 6. List and describe included studies

All included studies were listed in the review, along with descriptions of their key characteristics. This was presented either in narrative or table format.

Question 7. Publication bias

Publication bias is a term used when studies with positive results have a higher likelihood of being published, being published rapidly, being published in higher impact journals, being published in English, being published more than once, or being cited by others.425,426 Publication bias can be linked to favorable or unfavorable treatment of research findings due to investigators, editors, industry, commercial interests, or peer reviewers. To minimize the potential for publication bias, researchers can conduct a comprehensive literature search that includes the strategies discussed in Question 3.

A funnel plot–a scatter plot of component studies in a meta-analysis–is a commonly used graphical method for detecting publication bias. If there is no significant publication bias, the graph looks like a symmetrical inverted funnel.
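
A common numeric companion to the funnel plot is Egger's regression test, in which standardized effects are regressed on precision; an intercept far from zero suggests asymmetry. The sketch below is illustrative only (the NHLBI tool itself asks just whether publication bias was assessed) and omits the significance test on the intercept:

```python
def egger_intercept(effects, standard_errors):
    """Intercept from Egger's regression for funnel-plot asymmetry.

    Standardized effects (effect/SE) are regressed on precision (1/SE)
    by ordinary least squares; an intercept far from zero is one
    possible sign of publication bias.
    """
    xs = [1 / se for se in standard_errors]
    ys = [e / se for e, se in zip(effects, standard_errors)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx

# Perfectly symmetric hypothetical data: same effect at every precision
intercept = egger_intercept([0.5, 0.5, 0.5], [0.1, 0.2, 0.3])  # essentially 0
```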

Reviewers assessed and clearly described the likelihood of publication bias.

Question 8. Heterogeneity

Heterogeneity is used to describe important differences in studies included in a meta-analysis that may make it inappropriate to combine the studies.427 Heterogeneity can be clinical (e.g., important differences between study participants, baseline disease severity, and interventions); methodological (e.g., important differences in the design and conduct of the study); or statistical (e.g., important differences in the quantitative results or reported effects).

Researchers usually assess clinical or methodological heterogeneity qualitatively by determining whether it makes sense to combine studies. For example:

  • Should a study evaluating the effects of an intervention on CVD risk that involves elderly male smokers with hypertension be combined with a study that involves healthy adults ages 18 to 40? (Clinical Heterogeneity)
  • Should a study that uses a randomized controlled trial (RCT) design be combined with a study that uses a case-control study design? (Methodological Heterogeneity)

Statistical heterogeneity describes the degree of variation in the effect estimates from a set of studies; it is assessed quantitatively. The two most common methods used to assess statistical heterogeneity are Cochran's Q test (a chi-square test) and the I2 statistic.
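
The Q statistic and I2 can be computed directly from study effect estimates and their standard errors. A minimal sketch on hypothetical data:

```python
def q_and_i2(effects, standard_errors):
    """Cochran's Q statistic and I^2 for statistical heterogeneity.

    Q sums inverse-variance-weighted squared deviations from the pooled
    fixed-effect estimate; I^2 = max(0, (Q - df) / Q) expresses the share
    of variability attributable to heterogeneity rather than chance.
    """
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2

# Hypothetical log odds ratios with equal standard errors
q, i2 = q_and_i2([0.2, 0.5, 0.8], [0.1, 0.1, 0.1])  # Q = 18, I^2 about 89%
```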

Reviewers examined studies to determine if an assessment for heterogeneity was conducted and clearly described. If the studies are found to be heterogeneous, the investigators should explore and explain the causes of the heterogeneity, and determine what influence, if any, the study differences had on overall study results.

Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies

Criteria (answer each question: Yes, No, or Other [CD, NR, NA]*)
1. Was the research question or objective in this paper clearly stated?      
2. Was the study population clearly specified and defined?      
3. Was the participation rate of eligible persons at least 50%?      
4. Were all the subjects selected or recruited from the same or similar populations (including the same time period)? Were inclusion and exclusion criteria for being in the study prespecified and applied uniformly to all participants?      
5. Was a sample size justification, power description, or variance and effect estimates provided?      
6. For the analyses in this paper, were the exposure(s) of interest measured prior to the outcome(s) being measured?      
7. Was the timeframe sufficient so that one could reasonably expect to see an association between exposure and outcome if it existed?      
8. For exposures that can vary in amount or level, did the study examine different levels of the exposure as related to the outcome (e.g., categories of exposure, or exposure measured as continuous variable)?      
9. Were the exposure measures (independent variables) clearly defined, valid, reliable, and implemented consistently across all study participants?      
10. Was the exposure(s) assessed more than once over time?      
11. Were the outcome measures (dependent variables) clearly defined, valid, reliable, and implemented consistently across all study participants?      
12. Were the outcome assessors blinded to the exposure status of participants?      
13. Was loss to follow-up after baseline 20% or less?      
14. Were key potential confounding variables measured and adjusted statistically for their impact on the relationship between exposure(s) and outcome(s)?      

Guidance for Assessing the Quality of Observational Cohort and Cross-Sectional Studies

The guidance document below is organized by question number from the tool for quality assessment of observational cohort and cross-sectional studies.

Question 1. Research question

Did the authors describe their goal in conducting this research? Is it easy to understand what they were looking to find? This issue is important for any scientific paper of any type. Higher quality scientific research explicitly defines a research question.

Questions 2 and 3. Study population

Did the authors describe the group of people from which the study participants were selected or recruited, using demographics, location, and time period? If you were to conduct this study again, would you know who to recruit, from where, and from what time period? Is the cohort population free of the outcomes of interest at the time they were recruited?

An example would be men over 40 years old with type 2 diabetes who began seeking medical care at Phoenix Good Samaritan Hospital between January 1, 1990 and December 31, 1994. In this example, the population is clearly described as: (1) who (men over 40 years old with type 2 diabetes); (2) where (Phoenix Good Samaritan Hospital); and (3) when (between January 1, 1990 and December 31, 1994). Another example is women ages 34 to 59 in 1980 who were in the nursing profession and had no known coronary disease, stroke, cancer, hypercholesterolemia, or diabetes, and were recruited from the 11 most populous States, with contact information obtained from State nursing boards.

In cohort studies, it is crucial that the population at baseline is free of the outcome of interest. For example, the nurses' population above would be an appropriate group in which to study incident coronary disease. This information is usually found either in descriptions of population recruitment, definitions of variables, or inclusion/exclusion criteria.

You may need to look at prior papers on methods in order to make the assessment for this question. Those papers are usually in the reference list.

If fewer than 50% of eligible persons participated in the study, then there is concern that the study population does not adequately represent the target population. This increases the risk of bias.

Question 4. Groups recruited from the same population and uniform eligibility criteria

Were the inclusion and exclusion criteria developed prior to recruitment or selection of the study population? Were the same underlying criteria used for all of the subjects involved? This issue is related to the description of the study population, above, and you may find the information for both of these questions in the same section of the paper.

Most cohort studies begin with the selection of the cohort; participants in this cohort are then measured or evaluated to determine their exposure status. However, some cohort studies may recruit or select exposed participants in a different time or place than unexposed participants, especially retrospective cohort studies–which is when data are obtained from the past (retrospectively), but the analysis examines exposures prior to outcomes. For example, one research question could be whether diabetic men with clinical depression are at higher risk for cardiovascular disease than those without clinical depression. So, diabetic men with depression might be selected from a mental health clinic, while diabetic men without depression might be selected from an internal medicine or endocrinology clinic. This study recruits groups from different clinic populations, so this example would get a "no."

However, the women nurses described in the question above were selected based on the same inclusion/exclusion criteria, so that example would get a "yes."

Question 5. Sample size justification

Did the authors present their reasons for selecting or recruiting the number of people included or analyzed? Do they note or discuss the statistical power of the study? This question is about whether or not the study had enough participants to detect an association if one truly existed.

A paragraph in the methods section of the article may explain the sample size needed to detect a hypothesized difference in outcomes. You may also find a discussion of power in the discussion section (such as the study had 85 percent power to detect a 20 percent increase in the rate of an outcome of interest, with a 2-sided alpha of 0.05). Sometimes estimates of variance and/or estimates of effect size are given, instead of sample size calculations. In any of these cases, the answer would be "yes."

However, observational cohort studies often do not report anything about power or sample sizes because the analyses are exploratory in nature. In this case, the answer would be "no." This is not a "fatal flaw." It just may indicate that attention was not paid to whether the study was sufficiently sized to answer a prespecified question–i.e., it may have been an exploratory, hypothesis-generating study.

Question 6. Exposure assessed prior to outcome measurement

This question is important because, in order to determine whether an exposure causes an outcome, the exposure must come before the outcome.

For some prospective cohort studies, the investigator enrolls the cohort and then determines the exposure status of various members of the cohort (large epidemiological studies like Framingham used this approach). However, for other cohort studies, the cohort is selected based on its exposure status, as in the example above of depressed diabetic men (the exposure being depression). Other examples include a cohort identified by its exposure to fluoridated drinking water and then compared to a cohort living in an area without fluoridated water, or a cohort of military personnel exposed to combat in the Gulf War compared to a cohort of military personnel not deployed in a combat zone.

With either of these types of cohort studies, the cohort is followed forward in time (i.e., prospectively) to assess the outcomes that occurred in the exposed members compared to nonexposed members of the cohort. Therefore, you begin the study in the present by looking at groups that were exposed (or not) to some biological or behavioral factor, intervention, etc., and then you follow them forward in time to examine outcomes. If a cohort study is conducted properly, the answer to this question should be "yes," since the exposure status of members of the cohort was determined at the beginning of the study before the outcomes occurred.

For retrospective cohort studies, the same principle applies. The difference is that, rather than identifying a cohort in the present and following them forward in time, the investigators go back in time (i.e., retrospectively) and select a cohort based on their exposure status in the past and then follow them forward to assess the outcomes that occurred in the exposed and nonexposed cohort members. Because in retrospective cohort studies the exposure and outcomes may have already occurred (it depends on how long they follow the cohort), it is important to make sure that the exposure preceded the outcome.

Sometimes cross-sectional studies are conducted (or cross-sectional analyses of cohort-study data), where the exposures and outcomes are measured during the same timeframe. As a result, cross-sectional analyses provide weaker evidence than regular cohort studies regarding a potential causal relationship between exposures and outcomes. For cross-sectional analyses, the answer to Question 6 should be "no."

Question 7. Sufficient timeframe to see an effect

Did the study allow enough time for a sufficient number of outcomes to occur or be observed, or enough time for an exposure to have a biological effect on an outcome? In the examples given above, if clinical depression has a biological effect on increasing risk for CVD, such an effect may take years. In the other example, if higher dietary sodium increases BP, a short timeframe may be sufficient to assess its association with BP, but a longer timeframe would be needed to examine its association with heart attacks.

The issue of timeframe is important to enable meaningful analysis of the relationships between exposures and outcomes to be conducted. This often requires at least several years, especially when looking at health outcomes, but it depends on the research question and outcomes being examined.

Cross-sectional analyses allow no time to see an effect, since the exposures and outcomes are assessed at the same time, so those would get a "no" response.

Question 8. Different levels of the exposure of interest

If the exposure can be defined as a range (examples: drug dosage, amount of physical activity, amount of sodium consumed), were multiple categories of that exposure assessed? (For example, for drugs: not on the medication, low dose, medium dose, high dose; for dietary sodium: higher than average U.S. consumption, lower than recommended consumption, and between the two.) Sometimes discrete categories of exposure are not used; instead, exposures are measured as continuous variables (for example, mg/day of dietary sodium or BP values).

In any case, studying different levels of exposure (where possible) enables investigators to assess trends or dose-response relationships between exposures and outcomes–e.g., the higher the exposure, the greater the rate of the health outcome. The presence of trends or dose-response relationships lends credibility to the hypothesis of causality between exposure and outcome.

For some exposures, however, this question may not be applicable (e.g., the exposure may be a dichotomous variable like living in a rural setting versus an urban setting, or vaccinated/not vaccinated with a one-time vaccine). If there are only two possible exposures (yes/no), then this question should be given an "NA," and it should not count negatively towards the quality rating.
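The dose-response idea described above can be sketched with a simple check; the exposure categories, event counts, and denominators below are hypothetical and purely illustrative.

```python
# Hypothetical cohort data: event counts by increasing sodium-intake category.
# A monotonic rise in the outcome rate across exposure levels suggests a
# dose-response relationship, which lends credibility to (but does not prove)
# a causal hypothesis.
categories = ["low", "medium", "high"]                # illustrative levels
events = {"low": 8, "medium": 15, "high": 27}         # hypothetical event counts
at_risk = {"low": 400, "medium": 410, "high": 390}    # hypothetical denominators

rates = {c: events[c] / at_risk[c] for c in categories}
monotonic = all(rates[a] < rates[b] for a, b in zip(categories, categories[1:]))

for c in categories:
    print(f"{c}: outcome rate = {rates[c]:.3f}")
print("dose-response trend present:", monotonic)
```

A formal analysis would use a trend test or model the exposure as a continuous variable; this sketch only checks whether the rates rise monotonically across categories.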

Question 9. Exposure measures and assessment

Were the exposure measures defined in detail? Were the tools or methods used to measure exposure accurate and reliable–for example, have they been validated or are they objective? This issue is important as it influences confidence in the reported exposures. When exposures are measured with less accuracy or validity, it is harder to see an association between exposure and outcome even if one exists. Equally important is whether the exposures were assessed in the same manner within groups and between groups; if not, bias may result.

For example, retrospective self-report of dietary salt intake is not as valid and reliable as prospectively using a standardized dietary log plus testing participants' urine for sodium content. Another example is measurement of BP, where there may be quite a difference between usual care, where clinicians measure BP however it is done in their practice setting (which can vary considerably), and use of trained BP assessors using standardized equipment (e.g., the same BP device which has been tested and calibrated) and a standardized protocol (e.g., patient is seated for 5 minutes with feet flat on the floor, BP is taken twice in each arm, and all four measurements are averaged). In each of these cases, the former would get a "no" and the latter a "yes."
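The standardized protocol described above (two readings per arm, all four averaged) can be expressed as a small helper; the numeric readings below are hypothetical.

```python
# Minimal sketch of the standardized BP protocol described above: after the
# patient is seated for 5 minutes, BP is taken twice in each arm, and all
# four measurements are averaged into a single value.
def protocol_bp(left_arm: list[float], right_arm: list[float]) -> float:
    readings = left_arm + right_arm
    if len(readings) != 4:
        raise ValueError("protocol expects exactly two readings per arm")
    return sum(readings) / len(readings)

print(protocol_bp([122, 120], [126, 124]))  # averages the four readings to 123.0
```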

Here is a final example that illustrates why it is important to assess exposures consistently across all groups: if people with higher BP (the exposed cohort) are seen by their providers more frequently than those without elevated BP (the nonexposed group), the more frequent contact increases the chances of detecting and documenting changes in health outcomes, including CVD-related events. This could lead to the conclusion that higher BP leads to more CVD events. That conclusion may be true, but it could also reflect the fact that the subjects with higher BP were seen more often; thus, more CVD-related events were detected and documented simply because those subjects had more encounters with the health care system. Such differential assessment can bias the results and lead to an erroneous conclusion.

Question 10. Repeated exposure assessment

Was the exposure for each person measured more than once during the course of the study period? Multiple measurements with the same result increase our confidence that the exposure status was correctly classified. Also, multiple measurements enable investigators to look at changes in exposure over time, for example, people who ate high dietary sodium throughout the followup period, compared to those who started out high then reduced their intake, compared to those who ate low sodium throughout. Once again, this may not be applicable in all cases. In many older studies, exposure was measured only at baseline. However, multiple exposure measurements do result in a stronger study design.

Question 11. Outcome measures

Were the outcomes defined in detail? Were the tools or methods for measuring outcomes accurate and reliable–for example, have they been validated or are they objective? This issue is important because it influences confidence in the validity of study results. Also important is whether the outcomes were assessed in the same manner within groups and between groups.

An example of an outcome measure that is objective, accurate, and reliable is death–the outcome measured with more accuracy than any other. But even with a measure as objective as death, there can be differences in the accuracy and reliability of how death was assessed by the investigators. Did they base it on an autopsy report, death certificate, death registry, or report from a family member? Another example is a study of whether dietary fat intake is related to blood cholesterol level (cholesterol level being the outcome), and the cholesterol level is measured from fasting blood samples that are all sent to the same laboratory. These examples would get a "yes." An example of a "no" would be self-report by subjects that they had a heart attack, or self-report of how much they weigh (if body weight is the outcome of interest).

Similar to the example in Question 9, results may be biased if one group (e.g., people with high BP) is seen more frequently than another group (people with normal BP) because more frequent encounters with the health care system increases the chances of outcomes being detected and documented.

Question 12. Blinding of outcome assessors

Blinding means that outcome assessors did not know whether the participant was exposed or unexposed. It is also sometimes called "masking." The objective is to look for evidence in the article that the person(s) assessing the outcome(s) for the study (for example, examining medical records to determine the outcomes that occurred in the exposed and comparison groups) is masked to the exposure status of the participant. Sometimes the person measuring the exposure is the same person conducting the outcome assessment. In this case, the outcome assessor would most likely not be blinded to exposure status because they also took measurements of exposures. If so, make a note of that in the comments section.

As you assess this criterion, think about whether it is likely that the person(s) doing the outcome assessment would know (or be able to figure out) the exposure status of the study participants. If the answer is no, then blinding is adequate. An example of adequate blinding of the outcome assessors is to create a separate committee, whose members were not involved in the care of the patient and had no information about the study participants' exposure status. The committee would then be provided with copies of participants' medical records, which had been stripped of any potential exposure information or personally identifiable information. The committee would then review the records for prespecified outcomes according to the study protocol. If blinding was not possible, which is sometimes the case, mark "NA" and explain the potential for bias.

Question 13. Followup rate

Higher overall followup rates are always better than lower followup rates, even though higher rates are expected in shorter studies, whereas lower overall followup rates are often seen in studies of longer duration. Usually, an acceptable overall followup rate is considered 80 percent or more of participants whose exposures were measured at baseline. However, this is just a general guideline. For example, a 6-month cohort study examining the relationship between dietary sodium intake and BP level may have over 90 percent followup, but a 20-year cohort study examining effects of sodium intake on stroke may have only a 65 percent followup rate.
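The 80 percent rule of thumb above can be sketched as a simple check; the enrollment and completion counts below are hypothetical.

```python
# Sketch of the followup-rate guideline described above. The 80 percent
# threshold is a general guideline, not a fixed standard; longer studies
# often fall below it without being fatally flawed.
def followup_rate(with_followup: int, measured_at_baseline: int) -> float:
    """Fraction of baseline participants with outcome data at followup."""
    return with_followup / measured_at_baseline

short_study = followup_rate(912, 1000)  # hypothetical 6-month BP study
long_study = followup_rate(650, 1000)   # hypothetical 20-year stroke study

for rate in (short_study, long_study):
    status = "meets" if rate >= 0.80 else "falls below"
    print(f"{rate:.0%} {status} the 80% guideline")
```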

Question 14. Statistical analyses

Were key potential confounding variables measured and adjusted for, such as by statistical adjustment for baseline differences? Logistic regression or other regression methods are often used to account for the influence of variables not of interest.

This is a key issue in cohort studies, because statistical analyses need to control for potential confounders, in contrast to an RCT, where the randomization process controls for potential confounders. All key factors that may be associated both with the exposure of interest and the outcome–that are not of interest to the research question–should be controlled for in the analyses.

For example, in a study of the relationship between cardiorespiratory fitness and CVD events (heart attacks and strokes), the study should control for age, BP, blood cholesterol, and body weight, because all of these factors are associated both with low fitness and with CVD events. Well-done cohort studies control for multiple potential confounders.
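One simple way to see what "controlling for a confounder" does is to compare crude and stratum-specific risks; the counts below are hypothetical, and a real analysis would typically use regression methods as described above.

```python
# Sketch of confounder control by stratification, using hypothetical counts.
# Here age is associated with both low fitness and CVD events, so the crude
# (unadjusted) comparison overstates the association seen within age strata.
# Each stratum: (events_low_fitness, n_low_fitness, events_high_fitness, n_high_fitness)
strata = {
    "younger": (4, 200, 6, 400),
    "older": (30, 300, 9, 100),
}

for name, (e_lo, n_lo, e_hi, n_hi) in strata.items():
    rr = (e_lo / n_lo) / (e_hi / n_hi)
    print(f"{name}: risk ratio = {rr:.2f}")

# Crude risk ratio pooled over strata (ignores age entirely):
e_lo = sum(s[0] for s in strata.values())
n_lo = sum(s[1] for s in strata.values())
e_hi = sum(s[2] for s in strata.values())
n_hi = sum(s[3] for s in strata.values())
print(f"crude risk ratio = {(e_lo / n_lo) / (e_hi / n_hi):.2f}")
```

With these numbers the crude risk ratio (about 2.3) is much larger than either stratum-specific ratio (about 1.1 to 1.3), which is the signature of confounding by age.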

Some general guidance for determining the overall quality rating of observational cohort and cross-sectional studies

The questions on the form are designed to help you focus on the key concepts for evaluating the internal validity of a study. They are not intended to create a list that you simply tally up to arrive at a summary judgment of quality.

Internal validity for cohort studies is the extent to which the results reported in the study can truly be attributed to the exposure being evaluated and not to flaws in the design or conduct of the study–in other words, the ability of the study to draw associative conclusions about the effects of the exposures being studied on outcomes. Any such flaws can increase the risk of bias.

Critical appraisal involves considering the risk of selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues noted throughout the questions above. High risk of bias translates to a rating of poor quality; low risk of bias translates to a rating of good quality. (Thus, the greater the risk of bias, the lower the quality rating of the study.)

In addition, the more attention in the study design to issues that can help determine whether there is a causal relationship between the exposure and outcome, the higher quality the study. These include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, sufficient timeframe to see an effect, and appropriate control for confounding–all concepts reflected in the tool.

Generally, when you evaluate a study, you will not see a "fatal flaw," but you will find some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, you should ask yourself about the potential for bias in the study you are critically appraising. For any box where you check "no" you should ask, "What is the potential risk of bias resulting from this flaw in study design or execution?" That is, does this factor cause you to doubt the results that are reported in the study or doubt the ability of the study to accurately assess an association between exposure and outcome?

The best approach is to think about the questions in the tool and how each one tells you something about the potential for bias in a study. The more you familiarize yourself with the key concepts, the more comfortable you will be with critical appraisal. Examples of studies rated good, fair, and poor are useful, but each study must be assessed on its own based on the details that are reported and consideration of the concepts for minimizing bias.

Quality Assessment of Case-Control Studies - Study Quality Assessment Tools

Criteria Yes No Other
(CD, NR, NA)*
1. Was the research question or objective in this paper clearly stated and appropriate?      
2. Was the study population clearly specified and defined?      
3. Did the authors include a sample size justification?      
4. Were controls selected or recruited from the same or similar population that gave rise to the cases (including the same timeframe)?      
5. Were the definitions, inclusion and exclusion criteria, algorithms or processes used to identify or select cases and controls valid, reliable, and implemented consistently across all study participants?      
6. Were the cases clearly defined and differentiated from controls?      
7. If less than 100 percent of eligible cases and/or controls were selected for the study, were the cases and/or controls randomly selected from those eligible?      
8. Was there use of concurrent controls?      
9. Were the investigators able to confirm that the exposure/risk occurred prior to the development of the condition or event that defined a participant as a case?      
10. Were the measures of exposure/risk clearly defined, valid, reliable, and implemented consistently (including the same time period) across all study participants?      
11. Were the assessors of exposure/risk blinded to the case or control status of participants?      
12. Were key potential confounding variables measured and adjusted statistically in the analyses? If matching was used, did the investigators account for matching during study analysis?      
Quality Rating (Good, Fair, or Poor)
Rater #1 Initials:
Rater #2 Initials:
Additional Comments (If POOR, please state why):

Guidance for Assessing the Quality of Case-Control Studies

The guidance document below is organized by question number from the tool for quality assessment of case-control studies.

Question 1. Research question

Did the authors describe their goal in conducting this research? Is it easy to understand what they were looking to find? This issue is important for any scientific paper of any type. High-quality scientific research explicitly defines a research question.

Question 2. Study population

Did the authors describe the group of individuals from which the cases and controls were selected or recruited, in terms of demographics, location, and time period? If the investigators conducted this study again, would they know exactly whom to recruit, from where, and from what time period?

Investigators identify case-control study populations by location, time period, and inclusion criteria for cases (individuals with the disease, condition, or problem) and controls (individuals without the disease, condition, or problem). For example, the population for a study of lung cancer and chemical exposure would be all incident cases of lung cancer diagnosed in patients ages 35 to 79, from January 1, 2003 to December 31, 2008, living in Texas during that entire time period, as well as controls without lung cancer recruited from the same population during the same time period. The population is clearly described as: (1) who (men and women ages 35 to 79 with (cases) and without (controls) incident lung cancer); (2) where (living in Texas); and (3) when (between January 1, 2003 and December 31, 2008).

Other studies may use disease registries or data from cohort studies to identify cases. In such studies, the population consists of individuals who live in the area covered by the disease registry or who were included in the cohort study (i.e., a nested case-control or case-cohort study). For example, a study of the relationship between vitamin D intake and myocardial infarction might use patients identified via the GRACE registry, a database of heart attack patients.

NHLBI staff encouraged reviewers to examine prior papers on methods (listed in the reference list) to make this assessment, if necessary.

Question 3. Target population and case representation

In order for a study to truly address the research question, the target population–the population from which the study population is drawn and to which study results are believed to apply–should be carefully defined. Some authors may compare characteristics of the study cases to characteristics of cases in the target population, either in text or in a table. When study cases are shown to be representative of cases in the appropriate target population, it increases the likelihood that the study was well-designed per the research question.

However, because these statistics are frequently difficult or impossible to measure, publications should not be penalized if case representation is not shown. For most papers, the response to question 3 will be "NR." Those subquestions are combined because the answer to the second subquestion–case representation–determines the response to this item. However, it cannot be determined without considering the response to the first subquestion. For example, if the answer to the first subquestion is "yes," and the second, "CD," then the response for item 3 is "CD."

Question 4. Sample size justification

Did the authors discuss their reasons for selecting or recruiting the number of individuals included? Did they discuss the statistical power of the study and provide a sample size calculation to ensure that the study is adequately powered to detect an association (if one exists)? This question does not refer to a description of the manner in which different groups were included or excluded using the inclusion/exclusion criteria (e.g., "Final study size was 1,378 participants after exclusion of 461 patients with missing data" is not considered a sample size justification for the purposes of this question).

An article's methods section usually contains information on the sample size, the size needed to detect differences in exposures, and statistical power.

Question 5. Groups recruited from the same population

To determine whether cases and controls were recruited from the same population, one can ask hypothetically, "If a control was to develop the outcome of interest (the condition that was used to select cases), would that person have been eligible to become a case?" Case-control studies begin with the selection of the cases (those with the outcome of interest, e.g., lung cancer) and controls (those in whom the outcome is absent). Cases and controls are then evaluated and categorized by their exposure status. For the lung cancer example, cases and controls were recruited from hospitals in a given region. One may reasonably assume that controls in the catchment area for the hospitals, or those already in the hospitals for a different reason, would attend those hospitals if they became a case; therefore, the controls are drawn from the same population as the cases. If the controls were recruited or selected from a different region (e.g., a State other than Texas) or time period (e.g., 1991-2000), then the cases and controls were recruited from different populations, and the answer to this question would be "no."

The following example further explores selection of controls. In a study, eligible cases were men and women, ages 18 to 39, who were diagnosed with atherosclerosis at hospitals in Perth, Australia, between July 1, 2000 and December 31, 2007. Appropriate controls for these cases might be sampled using voter registration information for men and women ages 18 to 39, living in Perth (population-based controls); they also could be sampled from patients without atherosclerosis at the same hospitals (hospital-based controls). As long as the controls are individuals who would have been eligible to be included in the study as cases (if they had been diagnosed with atherosclerosis), then the controls were selected appropriately from the same source population as cases.

In a prospective case-control study, investigators may enroll individuals as cases at the time they are found to have the outcome of interest; the number of cases usually increases as time progresses. At this same time, they may recruit or select controls from the population without the outcome of interest. One way to identify or recruit cases is through a surveillance system. In turn, investigators can select controls from the population covered by that system. This is an example of population-based controls. Investigators also may identify and select cases from a cohort study population and identify controls from outcome-free individuals in the same cohort study. This is known as a nested case-control study.

Question 6. Inclusion and exclusion criteria prespecified and applied uniformly

Were the inclusion and exclusion criteria developed prior to recruitment or selection of the study population? Were the same underlying criteria used for all of the groups involved? The investigators should have used the same selection criteria for cases and controls, except for the disease or condition of interest, which by definition differs between the two groups. Therefore, the investigators should use the same age (or age range), gender, race, and other characteristics to select cases and controls. Information on this topic is usually found in a paper's description of the study population.

Question 7. Case and control definitions

For this question, reviewers looked for a specific description of "case" and "control" and a discussion of the validity of those definitions and of the processes or tools used to identify study participants as such. Reviewers determined whether the tools or methods were accurate, reliable, and objective. For example, cases might be identified as "adult patients admitted to a VA hospital from January 1, 2000 to December 31, 2009, with an ICD-9 discharge diagnosis code of acute myocardial infarction and at least one of two confirmatory findings in their medical records: at least 2 mm of ST-elevation changes in two or more ECG leads and an elevated troponin level." Investigators might also use ICD-9 or CPT codes to identify patients. All cases should be identified using the same methods. Unless the distinction between cases and controls is accurate and reliable, investigators cannot use study results to draw valid conclusions.

Question 8. Random selection of study participants

If a case-control study did not use 100 percent of eligible cases and/or controls (e.g., not all disease-free participants were included as controls), did the authors indicate that random sampling was used to select controls? When it is possible to identify the source population fairly explicitly (e.g., in a nested case-control study, or in a registry-based study), then random sampling of controls is preferred. When investigators used consecutive sampling, which is frequently done for cases in prospective studies, then study participants are not considered randomly selected. In this case, the reviewers would answer "no" to Question 8. However, this would not be considered a fatal flaw.

If investigators included all eligible cases and controls as study participants, then reviewers marked "NA" in the tool. If 100 percent of cases were included (e.g., NA for cases) but only 50 percent of eligible controls, then the response would be "yes" if the controls were randomly selected, and "no" if they were not. If this cannot be determined, the appropriate response is "CD."

Question 9. Concurrent controls

A concurrent control is a control selected at the time another person became a case, usually on the same day. This means that one or more controls are recruited or selected from the population without the outcome of interest at the time a case is diagnosed. Investigators can use this method in both prospective case-control studies and retrospective case-control studies. For example, in a retrospective study of adenocarcinoma of the colon using data from hospital records, if hospital records indicate that Person A was diagnosed with adenocarcinoma of the colon on June 22, 2002, then investigators would select one or more controls from the population of patients without adenocarcinoma of the colon on that same day. This assumes they conducted the study retrospectively, using data from hospital records. The investigators could have also conducted this study using patient records from a cohort study, in which case it would be a nested case-control study.

Investigators can use concurrent controls in the presence or absence of matching and vice versa. A study that uses matching does not necessarily mean that concurrent controls were used.

Question 10. Exposure assessed prior to outcome measurement

Investigators first determine case or control status (based on presence or absence of outcome of interest), and then assess exposure history of the case or control; therefore, reviewers ascertained that the exposure preceded the outcome. For example, if the investigators used tissue samples to determine exposure, did they collect them from patients prior to their diagnosis? If hospital records were used, did investigators verify that the date a patient was exposed (e.g., received medication for atherosclerosis) occurred prior to the date they became a case (e.g., was diagnosed with type 2 diabetes)? For an association between an exposure and an outcome to be considered causal, the exposure must have occurred prior to the outcome.

Question 11. Exposure measures and assessment

Were the exposure measures defined in detail? Were the tools or methods used to measure exposure accurate and reliable–for example, have they been validated or are they objective? This is important, as it influences confidence in the reported exposures. Equally important is whether the exposures were assessed in the same manner within groups and between groups. This question pertains to bias resulting from exposure misclassification (i.e., exposure ascertainment).

For example, a retrospective self-report of dietary salt intake is not as valid and reliable as prospectively using a standardized dietary log plus testing participants' urine for sodium content because participants' retrospective recall of dietary salt intake may be inaccurate and result in misclassification of exposure status. Similarly, BP results from practices that use an established protocol for measuring BP would be considered more valid and reliable than results from practices that did not use standard protocols. A protocol may include using trained BP assessors, standardized equipment (e.g., the same BP device which has been tested and calibrated), and a standardized procedure (e.g., patient is seated for 5 minutes with feet flat on the floor, BP is taken twice in each arm, and all four measurements are averaged).

Question 12. Blinding of exposure assessors

Blinding or masking means that the exposure assessors did not know whether participants were cases or controls. To answer this question, reviewers examined articles for evidence that the person(s) assessing exposure was masked to the case or control status of the research participants. An exposure assessor, for example, may examine medical records to determine the exposures that occurred among cases and controls. Sometimes the person determining case or control status is the same person conducting the exposure assessment. In this situation, the exposure assessor would most likely not be blinded. A reviewer would note such a finding in the comments section of the assessment tool.

One way to ensure good blinding of exposure assessment is to have a separate committee, whose members have no information about the study participants' status as cases or controls, review research participants' records. To help answer the question above, reviewers determined whether it was likely that the exposure assessor knew whether the study participant was a case or control. If it was unlikely, then blinding was adequate and the reviewers marked "yes" to Question 12. Exposure assessors who used medical records should not have been directly involved in the study participants' care, since they probably would have known about their patients' conditions. If the medical records contained information on the patient's condition that identified him/her as a case (which is likely), that information would have had to be removed before the exposure assessors reviewed the records.

If blinding was not possible, which sometimes happens, the reviewers marked "NA" in the assessment tool and explained the potential for bias.

Question 13. Statistical analysis

Were key potential confounding variables measured and adjusted for, such as by statistical adjustment for baseline differences? Investigators often use logistic regression or other regression methods to account for the influence of variables not of interest.

This is a key issue in case-control studies; statistical analyses need to control for potential confounders, in contrast to RCTs, in which the randomization process controls for potential confounders. In the analysis, investigators need to control for all key factors that may be associated with both the exposure of interest and the outcome and are not of interest to the research question.

A study of the relationship between smoking and CVD events illustrates this point. Such a study needs to control for age, gender, and body weight; all are associated with smoking and CVD events. Well-done case-control studies control for multiple potential confounders.

Matching is a technique used to improve study efficiency and control for known confounders. For example, in the study of smoking and CVD events, an investigator might identify cases that have had a heart attack or stroke and then select controls of similar age, gender, and body weight to the cases. For case-control studies, it is important that if matching was performed during the selection or recruitment process, the variables used as matching criteria (e.g., age, gender, race) should be controlled for in the analysis.
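In a case-control analysis the association is typically expressed as an odds ratio, and a stratified (Mantel-Haenszel) estimate is one classical way to account for a variable such as age group, whether or not it was used as a matching criterion. All counts below are hypothetical.

```python
# Sketch of a stratified case-control analysis using the Mantel-Haenszel
# odds ratio to adjust for a single confounder (here, age group).
# Each stratum is a 2x2 table (a, b, c, d):
#   a = exposed cases,    b = unexposed cases,
#   c = exposed controls, d = unexposed controls
strata = [
    (20, 10, 15, 25),  # younger age group (hypothetical counts)
    (40, 20, 18, 30),  # older age group (hypothetical counts)
]

numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
or_mh = numerator / denominator
print(f"Mantel-Haenszel odds ratio = {or_mh:.2f}")
```

Individually matched designs are often analyzed instead with conditional logistic regression; the point here is simply that the stratifying or matching variable must be accounted for in the analysis rather than ignored.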

General Guidance for Determining the Overall Quality Rating of Case-Control Studies

NHLBI designed the questions in the assessment tool to help reviewers focus on the key concepts for evaluating a study's internal validity, not to use as a list from which to add up items to judge a study's quality.

Internal validity for case-control studies is the extent to which the associations between disease and exposure reported in the study can truly be attributed to the exposure being evaluated rather than to flaws in the design or conduct of the study. In other words, what is the ability of the study to draw associative conclusions about the effects of the exposures on outcomes? Any such flaws can increase the risk of bias.

In critically appraising a study, the following factors need to be considered: risk of selection bias, information bias, measurement bias, and confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues addressed in the questions above. High risk of bias translates to a poor quality rating; low risk of bias translates to a good quality rating. Again, the greater the risk of bias, the lower the quality rating of the study.

In addition, the more attention in the study design to issues that can help determine whether there is a causal relationship between the outcome and the exposure, the higher the quality of the study. These include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, sufficient timeframe to see an effect, and appropriate control for confounding–all concepts reflected in the tool.

If a study has a "fatal flaw," then risk of bias is significant; therefore, the study is deemed to be of poor quality. An example of a fatal flaw in case-control studies is a lack of a consistent standard process used to identify cases and controls.

Generally, when reviewers evaluated a study, they did not see a "fatal flaw," but instead found some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, reviewers examined the potential for bias in the study. For any box checked "no," reviewers asked, "What is the potential risk of bias resulting from this flaw in study design or execution?" That is, did this factor lead to doubt about the results reported in the study or the ability of the study to accurately assess an association between exposure and outcome?

By examining questions in the assessment tool, reviewers were best able to assess the potential for bias in a study. Specific rules were not useful, as each study had specific nuances. In addition, being familiar with the key concepts helped reviewers assess the studies. Examples of studies rated good, fair, and poor were useful, yet each study had to be assessed on its own.

Quality Assessment Tool for Before-After (Pre-Post) Studies With No Control Group - Study Quality Assessment Tools

Criteria: Yes / No / Other (CD = cannot determine; NR = not reported; NA = not applicable)

1. Was the study question or objective clearly stated?      
2. Were eligibility/selection criteria for the study population prespecified and clearly described?      
3. Were the participants in the study representative of those who would be eligible for the test/service/intervention in the general or clinical population of interest?      
4. Were all eligible participants that met the prespecified entry criteria enrolled?      
5. Was the sample size sufficiently large to provide confidence in the findings?      
6. Was the test/service/intervention clearly described and delivered consistently across the study population?      
7. Were the outcome measures prespecified, clearly defined, valid, reliable, and assessed consistently across all study participants?      
8. Were the people assessing the outcomes blinded to the participants' exposures/interventions?      
9. Was the loss to follow-up after baseline 20% or less? Were those lost to follow-up accounted for in the analysis?      
10. Did the statistical methods examine changes in outcome measures from before to after the intervention? Were statistical tests done that provided p values for the pre-to-post changes?      
11. Were outcome measures of interest taken multiple times before the intervention and multiple times after the intervention (i.e., did they use an interrupted time-series design)?      
12. If the intervention was conducted at a group level (e.g., a whole hospital, a community, etc.) did the statistical analysis take into account the use of individual-level data to determine effects at the group level?      

Guidance for Assessing the Quality of Before-After (Pre-Post) Studies With No Control Group

Question 1. Study question

Question 2. Eligibility criteria and study population

Did the authors describe the eligibility criteria applied to the individuals from whom the study participants were selected or recruited? In other words, if the investigators were to conduct this study again, would they know whom to recruit, from where, and from what time period?

Here is a sample description of a study population: men over age 40 with type 2 diabetes, who began seeking medical care at Phoenix Good Samaritan Hospital, between January 1, 2005 and December 31, 2007. The population is clearly described as: (1) who (men over age 40 with type 2 diabetes); (2) where (Phoenix Good Samaritan Hospital); and (3) when (between January 1, 2005 and December 31, 2007). Another sample description is women who were in the nursing profession, who were ages 34 to 59 in 1995, had no known CHD, stroke, cancer, hypercholesterolemia, or diabetes, and were recruited from the 11 most populous States, with contact information obtained from State nursing boards.

To assess this question, reviewers examined prior papers on study methods (listed in reference list) when necessary.

Question 3. Study participants representative of clinical populations of interest

The participants in the study should be generally representative of the population in which the intervention will be broadly applied. Studies on small demographic subgroups may raise concerns about how the intervention will affect broader populations of interest. For example, interventions that focus on very young or very old individuals may affect middle-aged adults differently. Similarly, researchers may not be able to extrapolate study results from patients with severe chronic diseases to healthy populations.

Question 4. All eligible participants enrolled

To further explore this question, reviewers may need to ask: Did the investigators develop the I/E criteria prior to recruiting or selecting study participants? Were the same underlying I/E criteria used for all research participants? Were all subjects who met the I/E criteria enrolled in the study?

Question 5. Sample size

Did the authors present their reasons for selecting or recruiting the number of individuals included or analyzed? Did they note or discuss the statistical power of the study? This question addresses whether there was a sufficient sample size to detect an association, if one did exist.

An article's methods section may provide information on the sample size needed to detect a hypothesized difference in outcomes and a discussion on statistical power (for example, the study had 85 percent power to detect a 20 percent increase in the rate of an outcome of interest, with a two-sided alpha of 0.05). Sometimes estimates of variance and/or estimates of effect size are given instead of sample size calculations. In any case, if the reviewers determined that the power was sufficient to detect the effects of interest, then they would answer "yes" to Question 5.
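To illustrate the arithmetic behind such a power statement, the per-group sample size for comparing two proportions can be sketched with the standard normal-approximation formula (a minimal sketch; the 10 percent baseline rate, the function name, and all inputs are assumptions for illustration, not taken from any particular study):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.85):
    """Approximate sample size per group to detect a difference between
    two proportions (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of binomial variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# hypothetical example: 85% power to detect a 20% relative increase
# over a 10% baseline event rate (0.10 -> 0.12)
n = n_per_group(0.10, 0.12)
```

With these hypothetical inputs the formula requires several thousand participants per group, which is why a power discussion matters when judging whether a small study could plausibly detect its effect of interest.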

Question 6. Intervention clearly described

Another pertinent question regarding interventions is: Was the intervention clearly defined in detail in the study? Did the authors indicate that the intervention was consistently applied to the subjects? Did the research participants have a high level of adherence to the requirements of the intervention? For example, if the investigators assigned a group to 10 mg/day of Drug A, did most participants in this group take the specific dosage of Drug A? Or did a large percentage of participants end up not taking the specific dose of Drug A indicated in the study protocol?

Reviewers ascertained that changes in study outcomes could be attributed to study interventions. If participants received interventions that were not part of the study protocol and could affect the outcomes being assessed, the results could be biased.

Question 7. Outcome measures clearly described, valid, and reliable

Were the outcomes defined in detail? Were the tools or methods for measuring outcomes accurate and reliable–for example, have they been validated or are they objective? This question is important because the answer influences confidence in the validity of study results.

An example of an outcome measure that is objective, accurate, and reliable is death–the outcome measured with more accuracy than any other. But even with a measure as objective as death, differences can exist in the accuracy and reliability of how investigators assessed death. For example, did they base it on an autopsy report, death certificate, death registry, or report from a family member? Another example of a valid study is one whose objective is to determine if dietary fat intake affects blood cholesterol level (cholesterol level being the outcome) and in which the cholesterol level is measured from fasting blood samples that are all sent to the same laboratory. These examples would get a "yes."

An example of a "no" would be self-report by subjects that they had a heart attack, or self-report of how much they weigh (if body weight is the outcome of interest).

Question 8. Blinding of outcome assessors

Blinding or masking means that the outcome assessors did not know whether the participants received the intervention or were exposed to the factor under study. To answer the question above, the reviewers examined articles for evidence that the person(s) assessing the outcome(s) was masked to the participants' intervention or exposure status. An outcome assessor, for example, may examine medical records to determine the outcomes that occurred in the exposed and comparison groups. Sometimes the person applying the intervention or measuring the exposure is the same person conducting the outcome assessment. In this case, the outcome assessor would not likely be blinded to the intervention or exposure status. A reviewer would note such a finding in the comments section of the assessment tool.

In assessing this criterion, the reviewers determined whether it was likely that the person(s) conducting the outcome assessment knew the exposure status of the study participants. If not, then blinding was adequate. An example of adequate blinding of the outcome assessors is to create a separate committee whose members were not involved in the care of the patient and had no information about the study participants' exposure status. Using a study protocol, committee members would review copies of participants' medical records, which would be stripped of any potential exposure information or personally identifiable information, for prespecified outcomes.

Question 9. Followup rate

Higher overall followup rates are always preferable to lower followup rates, although higher rates are expected in shorter studies, and lower overall followup rates are often seen in longer studies. Usually an acceptable overall followup rate is considered 80 percent or more of participants whose interventions or exposures were measured at baseline. However, this is a general guideline.

In accounting for those lost to followup, in the analysis, investigators may have imputed values of the outcome for those lost to followup or used other methods. For example, they may carry forward the baseline value or the last observed value of the outcome measure and use these as imputed values for the final outcome measure for research participants lost to followup.
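The carry-forward imputation described above can be sketched in a few lines (a simplified illustration only; real analyses may instead use multiple imputation or other methods, and the visit values below are hypothetical):

```python
def locf(values):
    """Impute missing followup measurements by carrying the last
    observed value forward (None marks a missed visit)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v          # update the most recent observed value
        filled.append(last)   # missed visits reuse the last observation
    return filled

# hypothetical cholesterol measurements across four study visits
series = [210, 195, None, None]
# locf(series) -> [210, 195, 195, 195]
```

Note that carrying values forward assumes the outcome stayed constant after dropout, which can bias results if participants who left the study were systematically different.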

Question 10. Statistical analysis

Were formal statistical tests used to assess the significance of the changes in the outcome measures between the before and after time periods? The reported study results should present values for statistical tests, such as p values, to document the statistical significance (or lack thereof) for the changes in the outcome measures found in the study.
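As an illustration of such a test, the paired t statistic for the pre-to-post change can be sketched as follows (a minimal sketch with hypothetical data; in practice the p-value is obtained from a t distribution with n - 1 degrees of freedom via a statistics package):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic for the pre-to-post change in an outcome measure."""
    diffs = [a - b for a, b in zip(after, before)]  # per-participant change
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))      # mean change / its standard error
    return t, n - 1                                 # t statistic and degrees of freedom
```

A reviewer answering Question 10 "yes" would expect the study to report both the test statistic and the resulting p value for each pre-to-post comparison.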

Question 11. Multiple outcome measures

Were the outcome measures for each person measured more than once during the course of the before and after study periods? Multiple measurements with the same result increase confidence that the outcomes were accurately measured.

Question 12. Group-level interventions and individual-level outcome efforts

Group-level interventions are usually not relevant for clinical interventions such as bariatric surgery, in which the interventions are applied at the individual patient level. In those cases, the questions were coded as "NA" in the assessment tool.

General Guidance for Determining the Overall Quality Rating of Before-After Studies

The questions in the quality assessment tool were designed to help reviewers focus on the key concepts for evaluating the internal validity of a study. They are not intended to create a list from which to add up items to judge a study's quality.

Internal validity is the extent to which the outcome results reported in the study can truly be attributed to the intervention or exposure being evaluated, and not to biases, measurement errors, or other confounding factors that may result from flaws in the design or conduct of the study. In other words, what is the ability of the study to draw associative conclusions about the effects of the interventions or exposures on outcomes?

Critical appraisal of a study involves considering the risk of potential for selection bias, information bias, measurement bias, or confounding (the mixture of exposures that one cannot tease out from each other). Examples of confounding include co-interventions, differences at baseline in patient characteristics, and other issues throughout the questions above. High risk of bias translates to a rating of poor quality; low risk of bias translates to a rating of good quality. Again, the greater the risk of bias, the lower the quality rating of the study.

In addition, the more attention in the study design to issues that can help determine if there is a causal relationship between the exposure and outcome, the higher the quality of the study. These issues include exposures occurring prior to outcomes, evaluation of a dose-response gradient, accuracy of measurement of both exposure and outcome, and a sufficient timeframe to see an effect.

Generally, when reviewers evaluate a study, they will not see a "fatal flaw," but instead will find some risk of bias. By focusing on the concepts underlying the questions in the quality assessment tool, reviewers should ask themselves about the potential for bias in the study they are critically appraising. For any box checked "no" reviewers should ask, "What is the potential risk of bias resulting from this flaw in study design or execution?" That is, does this factor lead to doubt about the results reported in the study or doubt about the ability of the study to accurately assess an association between the intervention or exposure and the outcome?

The best approach is to think about the questions in the assessment tool and how each one reveals something about the potential for bias in a study. Specific rules are not useful, as each study has specific nuances. In addition, being familiar with the key concepts will help reviewers be more comfortable with critical appraisal. Examples of studies rated good, fair, and poor are useful, but each study must be assessed on its own.

Quality Assessment Tool for Case Series Studies

Criteria: Yes / No / Other (CD = cannot determine; NR = not reported; NA = not applicable)

1. Was the study question or objective clearly stated?       
2. Was the study population clearly and fully described, including a case definition?      
3. Were the cases consecutive?      
4. Were the subjects comparable?      
5. Was the intervention clearly described?      
6. Were the outcome measures clearly defined, valid, reliable, and implemented consistently across all study participants?      
7. Was the length of follow-up adequate?      
8. Were the statistical methods well-described?      
9. Were the results well-described?      

Background: Development and Use

Last updated: July 2021

A Case Study on Improvement of Outgoing Quality Control Works for Manufacturing Products

  • January 2015
  • Journal of Applied Sciences Research 4(1):12-21

A.B. Elmi, Universiti Sains Malaysia

Shahrul Kamaruddin, Universiti Teknologi PETRONAS



Continuing to enhance the quality of case study methodology in health services research

Shannon L. Sibbald

1 Faculty of Health Sciences, Western University, London, Ontario, Canada.

2 Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.

3 The Schulich Interfaculty Program in Public Health, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.

Stefan Paciocco

Meghan Fournie, Rachelle van Asseldonk, Tiffany Scurr

Case study methodology has grown in popularity within Health Services Research (HSR). However, its use and merit as a methodology are frequently criticized due to its flexible approach and inconsistent application. Nevertheless, case study methodology is well suited to HSR because it can track and examine complex relationships, contexts, and systems as they evolve. Applied appropriately, it can help generate information on how multiple forms of knowledge come together to inform decision-making within healthcare contexts. In this article, we aim to demystify case study methodology by outlining its philosophical underpinnings and three foundational approaches. We provide literature-based guidance to decision-makers, policy-makers, and health leaders on how to engage in and critically appraise case study design. We advocate that researchers work in collaboration with health leaders to detail their research process with an aim of strengthening the validity and integrity of case study for its continued and advanced use in HSR.

Introduction

The popularity of case study research methodology in Health Services Research (HSR) has grown over the past 40 years. 1 This may be attributed to a shift towards the use of implementation research and a newfound appreciation of contextual factors affecting the uptake of evidence-based interventions within diverse settings. 2 Incorporating context-specific information on the delivery and implementation of programs can increase the likelihood of success. 3 , 4 Case study methodology is particularly well suited for implementation research in health services because it can provide insight into the nuances of diverse contexts. 5 , 6 In 1999, Yin 7 published a paper on how to enhance the quality of case study in HSR, which was foundational for the emergence of case study in this field. Yin 7 maintains case study is an appropriate methodology in HSR because health systems are constantly evolving, and the multiple affiliations and diverse motivations are difficult to track and understand with traditional linear methodologies.

Despite its increased popularity, there is debate whether a case study is a methodology (ie, a principle or process that guides research) or a method (ie, a tool to answer research questions). Some criticize case study for its high level of flexibility, perceiving it as less rigorous, and maintain that it generates inadequate results. 8 Others have noted issues with quality and consistency in how case studies are conducted and reported. 9 Reporting is often varied and inconsistent, using a mix of approaches such as case reports, case findings, and/or case study. Authors sometimes use incongruent methods of data collection and analysis or use the case study as a default when other methodologies do not fit. 9 , 10 Despite these criticisms, case study methodology is becoming more common as a viable approach for HSR. 11 An abundance of articles and textbooks are available to guide researchers through case study research, including field-specific resources for business, 12 , 13 nursing, 14 and family medicine. 15 However, there remains confusion and a lack of clarity on the key tenets of case study methodology.

Several common philosophical underpinnings have contributed to the development of case study research 1 which has led to different approaches to planning, data collection, and analysis. This presents challenges in assessing quality and rigour for researchers conducting case studies and stakeholders reading results.

This article discusses the various approaches and philosophical underpinnings to case study methodology. Our goal is to explain it in a way that provides guidance for decision-makers, policy-makers, and health leaders on how to understand, critically appraise, and engage in case study research and design, as such guidance is largely absent in the literature. This article is by no means exhaustive or authoritative. Instead, we aim to provide guidance and encourage dialogue around case study methodology, facilitating critical thinking around the variety of approaches and ways quality and rigour can be bolstered for its use within HSR.

Purpose of case study methodology

Case study methodology is often used to develop an in-depth, holistic understanding of a specific phenomenon within a specified context. 11 It focuses on studying one or multiple cases over time and uses an in-depth analysis of multiple information sources. 16 , 17 It is ideal for situations including, but not limited to, exploring under-researched and real-life phenomena, 18 especially when the contexts are complex and the researcher has little control over the phenomena. 19 , 20 Case studies can be useful when researchers want to understand how interventions are implemented in different contexts, and how context shapes the phenomenon of interest.

In addition to demonstrating coherency with the type of questions case study is suited to answer, there are four key tenets to case study methodologies: (1) be transparent in the paradigmatic and theoretical perspectives influencing study design; (2) clearly define the case and phenomenon of interest; (3) clearly define and justify the type of case study design; and (4) use multiple data collection sources and analysis methods to present the findings in ways that are consistent with the methodology and the study’s paradigmatic base. 9 , 16 The goal is to appropriately match the methods to empirical questions and issues and not to universally advocate any single approach for all problems. 21

Approaches to case study methodology

Three authors propose distinct foundational approaches to case study methodology positioned within different paradigms: Yin, 19 , 22 Stake, 5 , 23 and Merriam 24 , 25 ( Table 1 ). Yin is strongly post-positivist whereas Stake and Merriam are grounded in a constructivist paradigm. Researchers should locate their research within a paradigm that explains the philosophies guiding their research 26 and adhere to the underlying paradigmatic assumptions and key tenets of the appropriate author’s methodology. This will enhance the consistency and coherency of the methods and findings. However, researchers often do not report their paradigmatic position, nor do they adhere to one approach. 9 Although deliberately blending methodologies may be defensible and methodologically appropriate, more often it is done in an ad hoc and haphazard way, without consideration for limitations.

Table 1. Cross-analysis of three case study approaches, adapted from Yazan 2015

Case study design:
- Yin: logical sequence = connecting empirical data to the initial research question; four types: single holistic, single embedded, multiple holistic, multiple embedded
- Stake: flexible design = allow major changes to take place while the study is proceeding
- Merriam: theoretical framework = literature review to mold the research question and emphasis points

Case study paradigm:
- Yin: positivism
- Stake: constructivism and existentialism
- Merriam: constructivism

Components of study:
- "Progressive focusing" = "the course of the study cannot be charted in advance" (1998, p 22)
- Must have 2-3 research questions to structure the study

Collecting data:
- Yin: quantitative and qualitative evidence
- Stake: qualitative data
- Merriam: qualitative data; the researcher must have the necessary skills and follow certain procedures

Data analysis:
- Yin: use both quantitative and qualitative techniques to answer the research question
- Stake: use the researcher's intuition and impression as a guiding factor for analysis
- Merriam: "it is the process of making meaning" (1998, p 178)

Validating data:
- Use triangulation
- Increase internal validity
- Ensure reliability and increase external validity

The post-positivist paradigm postulates there is one reality that can be objectively described and understood by “bracketing” oneself from the research to remove prejudice or bias. 27 Yin focuses on general explanation and prediction, emphasizing the formulation of propositions, akin to hypothesis testing. This approach is best suited for structured and objective data collection 9 , 11 and is often used for mixed-method studies.

Constructivism assumes that the phenomenon of interest is constructed and influenced by local contexts, including the interaction between researchers, individuals, and their environment. 27 It acknowledges multiple interpretations of reality 24 constructed within the context by the researcher and participants which are unlikely to be replicated, should either change. 5 , 20 Stake and Merriam’s constructivist approaches emphasize a story-like rendering of a problem and an iterative process of constructing the case study. 7 This stance values researcher reflexivity and transparency, 28 acknowledging how researchers’ experiences and disciplinary lenses influence their assumptions and beliefs about the nature of the phenomenon and development of the findings.

Defining a case

A key tenet of case study methodology often underemphasized in the literature is the importance of defining the case and phenomenon. Researchers should clearly describe the case with sufficient detail to allow readers to fully understand the setting and context and determine applicability. Trying to answer a question that is too broad often leads to an unclear definition of the case and phenomenon. 20 Cases should therefore be bound by time and place to ensure rigour and feasibility. 6

Yin 22 defines a case as “a contemporary phenomenon within its real-life context,” (p13) which may contain a single unit of analysis, including individuals, programs, corporations, or clinics 29 (holistic), or be broken into sub-units of analysis, such as projects, meetings, roles, or locations within the case (embedded). 30 Merriam 24 and Stake 5 similarly define a case as a single unit studied within a bounded system. Stake 5 , 23 suggests bounding cases by contexts and experiences where the phenomenon of interest can be a program, process, or experience. However, the line between the case and phenomenon can become muddy. For guidance, Stake 5 , 23 describes the case as the noun or entity and the phenomenon of interest as the verb, functioning, or activity of the case.

Designing the case study approach

Yin’s approach to a case study is rooted in a formal proposition or theory which guides the case and is used to test the outcome. 1 Stake 5 advocates for a flexible design and explicitly states that data collection and analysis may commence at any point. Merriam’s 24 approach blends both Yin and Stake’s, allowing the necessary flexibility in data collection and analysis to meet the needs.

Yin 30 proposed three types of case study approaches—descriptive, explanatory, and exploratory. Each can be designed around single or multiple cases, creating six basic case study methodologies. Descriptive studies provide a rich description of the phenomenon within its context, which can be helpful in developing theories. To test a theory or determine cause and effect relationships, researchers can use an explanatory design. An exploratory model is typically used in the pilot-test phase to develop propositions (eg, Sibbald et al. 31 used this approach to explore interprofessional network complexity). Despite having distinct characteristics, the boundaries between case study types are flexible with significant overlap. 30 Each has five key components: (1) research question; (2) proposition; (3) unit of analysis; (4) logical linking that connects the theory with proposition; and (5) criteria for analyzing findings.

Contrary to Yin, Stake 5 believes the research process cannot be planned in its entirety because research evolves as it is performed. Consequently, researchers can adjust the design of their methods even after data collection has begun. Stake 5 classifies case studies into three categories: intrinsic, instrumental, and collective/multiple. Intrinsic case studies focus on gaining a better understanding of the case. These are often undertaken when the researcher has an interest in a specific case. Instrumental case study is used when the case itself is not of the utmost importance, and the issue or phenomenon (ie, the research question) being explored becomes the focus instead (eg, Paciocco 32 used an instrumental case study to evaluate the implementation of a chronic disease management program). 5 Collective designs are rooted in an instrumental case study and include multiple cases to gain an in-depth understanding of the complexity and particularity of a phenomenon across diverse contexts. 5 , 23 In collective designs, studying similarities and differences between the cases allows the phenomenon to be understood more intimately; for examples of this in the field, see van Zelm et al. 33 and Burrows et al. 34 In addition, Sibbald et al. 35 present an example where a cross-case analysis method is used to compare instrumental cases.

Merriam’s approach is flexible (similar to Stake) as well as stepwise and linear (similar to Yin). She advocates for conducting a literature review before designing the study to better understand the theoretical underpinnings. 24 , 25 Unlike Stake or Yin, Merriam proposes a step-by-step guide for researchers to design a case study. These steps include performing a literature review, creating a theoretical framework, identifying the problem, creating and refining the research question(s), and selecting a study sample that fits the question(s). 24 , 25 , 36

Data collection and analysis

Using multiple data collection methods is a key characteristic of all case study methodology; it enhances the credibility of the findings by allowing different facets and views of the phenomenon to be explored. 23 Common methods include interviews, focus groups, observation, and document analysis. 5 , 37 By seeking patterns within and across data sources, a thick description of the case can be generated to support a greater understanding and interpretation of the whole phenomenon. 5 , 17 , 20 , 23 This technique is called triangulation and is used to explore cases with greater accuracy. 5 Although Stake 5 maintains case study is most often used in qualitative research, Yin 17 supports a mix of both quantitative and qualitative methods to triangulate data. This deliberate convergence of data sources (or mixed methods) allows researchers to find greater depth in their analysis and develop converging lines of inquiry. For example, case studies evaluating interventions commonly use qualitative interviews to describe the implementation process, barriers, and facilitators paired with a quantitative survey of comparative outcomes and effectiveness. 33 , 38 , 39

Yin 30 describes analysis as dependent on the chosen approach, whether it be (1) deductive, relying on theoretical propositions; (2) inductive, analyzing data from the “ground up”; (3) organized to create a case description; or (4) used to examine plausible rival explanations. According to Yin’s 40 approach to descriptive case studies, carefully considering theory development is an important part of study design. “Theory” refers to field-relevant propositions, commonly agreed upon assumptions, or fully developed theories. 40 Stake 5 advocates for using the researcher’s intuition and impression to guide analysis through categorical aggregation and direct interpretation. Merriam 24 uses six different methods to guide the “process of making meaning” (p178): (1) ethnographic analysis; (2) narrative analysis; (3) phenomenological analysis; (4) constant comparative method; (5) content analysis; and (6) analytic induction.

Drawing upon a theoretical or conceptual framework to inform analysis improves the quality of case study and avoids the risk of description without meaning. 18 Using Stake’s 5 approach, researchers rely on protocols and previous knowledge to help make sense of new ideas; theory can guide the research and assist researchers in understanding how new information fits into existing knowledge.

Practical applications of case study research

Columbia University has recently demonstrated how case studies can help train future health leaders. 41 Case studies encompass components of systems thinking—considering connections and interactions between components of a system, alongside the implications and consequences of those relationships—to equip health leaders with tools to tackle global health issues. 41 Greenwood 42 evaluated Indigenous peoples’ relationship with the healthcare system in British Columbia and used a case study to challenge and educate health leaders across the country to enhance culturally sensitive health service environments.

An important but often omitted step in case study research is an assessment of quality and rigour. We recommend using a framework or set of criteria to assess the rigour of the qualitative research. Suitable resources include Caelli et al., 43 Houghton et al., 44 Ravenek and Rudman, 45 and Tracy. 46

New directions in case study

Although “pragmatic” case studies (ie, utilizing practical and applicable methods) have existed within psychotherapy for some time, 47 , 48 only recently has the applicability of pragmatism as an underlying paradigmatic perspective been considered in HSR. 49 This shift is marked by the uptake of pragmatism in randomized controlled trials, recognizing that “gold standard” testing conditions do not reflect the reality of clinical settings 50 , 51 and that a handful of epistemologically guided methodologies cannot suit every research inquiry.

Pragmatism positions the research question as the basis for methodological choices, rather than a theory or epistemology, allowing researchers to pursue the most practical approach to understanding a problem or discovering an actionable solution. 52 Mixed methods are commonly used to create a deeper understanding of the case through converging qualitative and quantitative data. 52 Pragmatic case study is suited to HSR because its flexibility throughout the research process accommodates complexity, ever-changing systems, and disruptions to research plans. 49 , 50 Much like case study, pragmatism has been criticized for its flexibility and for being used when other approaches seem ill-suited. 53 , 54 Authors argue, however, that such criticism stems from a lack of investigation and proper application rather than any inherent lack of validity, underscoring the need for more exploration and conversation among researchers and practitioners. 55

Although occasionally misunderstood as a less rigorous research methodology, 8 case study research is highly flexible and allows for contextual nuances. 5 , 6 Its use is valuable when the researcher desires a thorough understanding of a phenomenon or case bound by context. 11 If needed, multiple similar cases can be studied simultaneously, or one case within another. 16 , 17 There are currently three main approaches to case study, 5 , 17 , 24 each with its own definition of a case, ontological and epistemological paradigms, methodologies, and data collection and analysis procedures. 37

Individuals’ experiences within health systems are influenced heavily by contextual factors, participant experience, and intricate relationships between different organizations and actors. 55 Case study research is well suited for HSR because it can track and examine these complex relationships and systems as they evolve over time. 6 , 7 It is important that researchers and health leaders using this methodology understand its key tenets and how to conduct a proper case study. Although there are many examples of case study in action, they are often under-reported and, when reported, not rigorously conducted. 9 Thus, decision-makers and health leaders should use these examples with caution. The proper reporting of case studies is necessary to bolster their credibility in HSR literature and provide readers sufficient information to critically assess the methodology. We also call on health leaders who frequently use case studies 56 – 58 to report them in the primary research literature.

The purpose of this article is to advocate for the continued and advanced use of case study in HSR and to provide literature-based guidance for decision-makers, policy-makers, and health leaders on how to engage in, read, and interpret findings from case study research. As health systems progress and evolve, the application of case study research will continue to increase as researchers and health leaders aim to capture the inherent complexities, nuances, and contextual factors. 7


The Case for Case Studies
Strategies for Social Inquiry
1 - Using Case Studies to Enhance the Quality of Explanation and Implementation

Integrating Scholarship and Development Practice

Published online by Cambridge University Press:  05 May 2022

The opening chapter provides a brief outline of the conventional division of labor between qualitative and quantitative methods in the social sciences. It sketches the main standards that govern case study research. It then offers an overview of subsequent chapters, which challenge some of these distinctions or deepen our understanding of what makes qualitative case studies useful for both causal inference and policy practice.

1.1 Introduction

In recent years the development policy community has turned to case studies as an analytical and diagnostic tool. Practitioners are using case studies to discern the mechanisms underpinning variations in the quality of service delivery and institutional reform, to identify how specific challenges are addressed during implementation, and to explore the conditions under which given instances of programmatic success might be replicated or scaled up. Footnote 1 These issues are of prime concern to organizations such as Princeton University’s Innovations for Successful Societies (ISS) Footnote 2 program and the Global Delivery Initiative (GDI), Footnote 3 housed in the World Bank Group (from 2015 to 2021), both of which explicitly prepare case studies exploring the dynamics underpinning effective implementation in fields ranging from water, energy, sanitation, and health to cabinet office performance and national development strategies.

In this sense, the use of case studies by development researchers and practitioners mirrors their deployment in other professional fields. Case studies have long enjoyed high status as a pedagogical tool and research method in business, law, medicine, and public policy, and indeed across the full span of human knowledge. According to Google Scholar data reported by Van Noorden, Maher, and Nuzzo (2014), Robert Yin’s Case Study Research (1984) is, remarkably, the sixth most cited article or book in any field, of all time. Footnote 4 Even so, skepticism lingers in certain quarters regarding the veracity of the case study method – for example, how confident can one be about claims drawn from single cases selected on a nonrandom or nonrepresentative basis? – and many legitimate questions remain (Morgan 2012). In order for insights from case studies to be valid and reliable, development professionals need to think carefully about how to ensure that data used in preparing the case study is accurate, that causal inferences drawn from it are made on a defensible basis (Mahoney 2000; Rohlfing 2012), and that broader generalizations are carefully delimited (Ruzzene 2012; Woolcock 2013). Footnote 5

How best to ensure this happens? Given the recent rise in prominence and influence of the case study method within the development community and elsewhere, scholars have a vital quality control and knowledge dissemination role to play in ensuring that the use of case studies both accurately reflects and contributes to leading research. To provide a forum for this purpose, the World Bank’s Development Research Group and its leading operational unit deploying case studies (the GDI) partnered with the leading academic institution that develops policy-focused case studies of development (Princeton’s ISS) and asked scholars and practitioners to engage with several key questions regarding the foundations, strategies, and applications of case studies as they pertain to development processes and outcomes: Footnote 6

What are the distinctive virtues and limits of case studies, in their own right and vis-à-vis other research methods? How can their respective strengths be harnessed and their weaknesses overcome (or complemented by other approaches) in policy deliberations?

Are there criteria for case study selection, research design, and analysis that can help ensure accuracy and comparability in data collection, reliability in causal inference within a single case, integrity in statements about uncertainty or scope, and something akin to the replicability standard in quantitative methods?

Under what conditions can we generalize from a small number of cases? When can comparable cases be generalized or not (across time, contexts, units of analysis, scales of operation, implementing agents)?

How can case studies most effectively complement the insights drawn from household surveys and other quantitative assessment tools in development research, policy, and practice?

How can lessons from case studies be used for pedagogical, diagnostic, and policy-advising purposes as improvements in the quality of implementation of a given intervention are sought?

How can the proliferation of case studies currently being prepared on development processes and outcomes be used to inform the scholarship on the theory and practice of case studies?

The remainder of this chapter provides an overview of the distinctive features (and limits) of case study research, drawing on “classic” and recent contributions in the scholarly literature. It provides a broad outline of the key claims and issues in the field, as well as a summary of the book’s chapters.

1.2 The Case for Case Studies: A Brief Overview

We can all point to great social science books and articles that derive from qualitative case study research. Herbert Kaufman’s (1960) classic, The Forest Ranger, profiles the principal–agent problems that arise in management of the US Forest Service as well as the design and implementation of several solutions. Robert Ellickson’s (1991) Order Without Law portrays how ranchers settle disputes among themselves without recourse to police or courts. Judith Tendler’s (1997) Good Government in the Tropics uses four case studies of Ceará, Brazil’s poorest state, to identify instances of positive deviance in public sector reform. Daniel Carpenter’s (2001) The Forging of Bureaucratic Autonomy, based on three historical cases, seeks to explain why reformers in some US federal agencies were able to carve out space free from partisan legislative interference while others were unable to do so. In “The Market for Public Office,” Robert Wade (1985) elicits the strategic structure of a particular kind of spoiler problem from a case study conducted in India. In economics, a longitudinal study of poverty dynamics in a single village in India (Palanpur) Footnote 7 has usefully informed understandings of these processes across the subcontinent (and beyond).

What makes these contributions stand out compared to the vast numbers of case studies that few find insightful? What standards should govern the choice and design of case studies, generally? And what specific insights do case studies yield that other research methods might be less well placed to provide?

The broad ambition of the social sciences is to forge general insights that help us quickly understand the world around us and make informed policy decisions. While each social science discipline has its own distinctive approach, there is broad agreement upon a methodological division of labor in the work we do. This conventional wisdom holds that quantitative analysis of large numbers of discrete cases is usually more effective for testing the veracity of causal propositions, for estimating the strength of the association between readily measurable causes and outcomes, and for evaluating the sensitivity of correlations to changes in the underlying model specifying the relationship between causal variables (and their measurement). By contrast, qualitative methods generally, and case studies in particular, fulfill other distinct epistemological functions and are the predominant method for:

1. Developing a theory and/or identifying causal mechanisms (e.g., working inductively from evidence to propositions and exploring the contents of the “black box” processes connecting causes and effects)

2. Eliciting strategic structure (e.g., documenting how interaction effects of one kind or another influence options, processes, and outcomes)

3. Showing how antecedent conditions elicit a prevailing structure which thereby shapes/constrains the decisions of actors within that structure

4. Testing a theory in novel circumstances

5. Understanding outliers or deviant cases

The conventional wisdom also holds that in an ideal world we would have the ability to use both quantitative and qualitative analysis and employ “nested” research designs (Bamberger, Rao, and Woolcock 2010; Goertz and Mahoney 2012; Lieberman 2015). However, the appropriate choice of method depends on the character of the subject matter, the kinds of data available, and the array of constraints (resources, politics, time) under which the study is being conducted. The central task is to deploy those combinations of research methods that yield the most fruitful insights in response to a specific problem, given the prevailing constraints (Rueschemeyer 2009). We now consider each of these five domains in greater detail.

1.3 Developing a Theory and/or Identifying Causal Mechanisms

Identifying a causal mechanism and inferring an explanation or theory are important parts of the research process, especially in the early stages of knowledge development. The causal mechanism links an independent variable to an outcome, and over time may become more precise: to cite an oft-used example, an initial awareness that citrus fruits reduced scurvy became more refined when the underlying causal mechanism was discovered to be vitamin C. For policy purposes, mechanisms provide the basis for a compelling storyline, which can greatly influence the tone and terms of debate – or the space of what is “thinkable,” “say-able,” and “do-able” – which in turn can affect the design, implementation, and support for interventions. This can be particularly relevant for development practitioners if the storyline – and the mechanisms it highlights – provides important insights into how and where implementation processes unravel, and what factors enabled a particular intervention to succeed or fail during the delivery process.

In this way, qualitative research can provide clarity on the factors that influence critical processes and help us identify the mechanisms that affect particular outcomes. For example, there is a fairly robust association, globally, between higher incomes and smaller family sizes. But what is it about income that would lead families to have fewer children – or does income mask other changes that influence child-bearing decisions? To figure out the mechanism, one could conduct interviews and focus groups with a few families to understand decision-making about family planning. Hypotheses based on these family case studies could then inform the design of survey-based quantitative research to test alternative mechanisms and the extent to which one or another predominates in different settings. Population researchers have done just that (see Knodel 1997).

Case studies carried out for the purpose of inductive generalization or identifying causal mechanisms are rarely pure “soak and poke” exercises uninformed by any preconceptions. Indeed, approaching a case with a provisional set of hypotheses is vitally important. The fact that we want to use a case to infer a general statement about cause and effect does not obviate the need for this vital intellectual tool; it just means we need to listen hard for alternative explanations we did not initially perceive and be highly attentive to actions, events, attitudes, etc., that are at odds with the reasoned intuition brought to the project.

An example where having an initial set of hypotheses was important comes from a GDI case on scaling up rural sanitation. In this case, the authors wanted to further understand how the government of Indonesia had been able to substantially diminish open defecation – a major cause of several diseases – in thousands of villages across the country. Footnote 8 The key policy change was a dramatic move from years of subsidizing latrines that ended up not being used to trying to change people’s behavior toward open defecation, a socially accepted norm. The authors had a set of hypotheses with respect to what triggered this important policy shift: a change in cabinet members, the presence of international organizations, adjustments in budgets, etc. However, the precise mechanism that triggered the change only became clear after interviewing several actors involved in the process. It turns out that a study tour taken by several Indonesian officials to Bangladesh was decisive since, for the first time, they could see the results of a different policy “with their own eyes” instead of just reading about it. Footnote 9

There are some situations, however, in which we may know so little that hypothesis development must essentially begin from scratch. For example, consider an ISS case study series on cabinet office performance. A key question was why so many heads of government allow administrative decisions to swamp cabinet meetings, causing the meetings to last a long time and reducing the chance that the government will reach actual policy decisions or priorities. One might have a variety of hypotheses to explain this predicament, but without direct access to the meetings themselves it is hard to know which of these hypotheses is most likely to be true (March, Sproull, and Tamuz 1991). In the initial phases, ISS researchers deliberately left a lot of space for the people interviewed to offer their own explanations. They anticipated that not all heads of state might want their cabinets to work as forums for decision-making and coordination, because ministers who had a lot of political and military clout might capture the stage or threaten vital interests of weaker members – or because the head of state benefited from the dysfunction. But as the first couple of cases unfolded, the research team realized that part of the problem arose from severe under-staffing, simple lack of know-how, inadequate capacity at the ministry level, or rapid turnover in personnel. In such situations, as March, Sproull, and Tamuz (1991: 8) aptly put it,

[t]he pursuit of rich experience … requires a method for absorbing detail without molding it. Great organizational histories, like great novels, are written, not by first constructing interpretations of events and then filling in the details, but by first identifying the details and allowing the interpretations to emerge from them. As a result, openness to a variety of (possibly irrelevant) dimensions of experience and preference is often more valuable than a clear prior model and unambiguous objectives.

In another ISS case study on the factors shaping the implementation and sustainability of “rapid results” management practices (e.g., setting 100-day goals, coupled with coaching on project management), a subquestion was when and why setting a 100-day goal improved service delivery. In interviews, qualitative insight into causal mechanisms surfaced: some managers said they thought employees understood expectations more clearly and therefore performed better as a result of setting a 100-day goal, while in other instances a competitive spirit or “game sense” increased motivation or cooperation with other employees, making work more enjoyable. Still others expected that an audit might follow, so a sense of heightened scrutiny also made a difference. The project in question did not try to arbitrate among these causal mechanisms or theories, but using the insight from the qualitative research, a researcher might well have proceeded to decipher which of these explanations carried most weight.

In many instances it is possible and preferable to approach the task of inductive generalization with more intellectual structure up front, however. As researchers we always have a few “priors” – hunches or hypotheses – that guide investigation. The extent to which we want these to structure initial inquiry may depend on the purpose of our research, but also on the likely causal complexity of the outcome we want to study, the rapidity of change in contexts, and the stock of information already available.

1.4 Eliciting Strategic Structure

A second important feature of the case study method, one that is intimately related to developing a theory or identifying causal mechanisms, is its ability to elicit the strategic structure of an event – that is, to capture the interactions that produce an important outcome. Some kinds of outcomes are “conditioned”: they vary with underlying contextual features like income levels or geography. Others are “crafted” or choice-based: the outcome is the product of bargaining, negotiating, deal-cutting, brinkmanship, and other types of interaction among a set of specified actors. Policy choice and implementation fall into this second category. Context may shape the feasible set of outcomes or the types of bargaining challenges, but the only way to explain outcomes is to trace the process or steps and choices as they unfold in the interaction (see Bennett and Checkel 2015).

In process tracing we want to identify the key actors, their preferences, and the alternatives or options they faced; evaluate the information available to these people and the expectations they formed; assess the resources available to each to persuade others or to alter the incentives others face and the expectations they form (especially with regard to the strategies they deploy); and indicate the formal and informal rules that govern the negotiation, as well as the personal aptitudes that influence effectiveness and constrain choice. The researcher often approaches the case with a specific type of strategic structure in mind – a bargaining story that plausibly accounts for the outcome – along with a sense of other frames that might explain the same set of facts.

In the 1980s and 1990s, the extensive literature on the politics of structural adjustment yielded many case studies designed to give us a better understanding of the kinds of difficulties ministers of finance faced in winning agreement to devalue a currency, sell assets, or liberalize trade or commodity markets, as well as the challenges they encountered in making these changes happen (e.g., Haggard 1992). Although the case studies yielded insights that could be used to create models testable with large-N data, in any individual case the specific parameters – context or circumstance – remained important for explaining particular outcomes. Sensitivity to the kinds of strategic challenges that emerged in other settings helped decision-makers assess the ways their situations might be similar or different, identify workarounds or coalitions essential for winning support, and increase the probability that their own efforts would succeed. It is important to know what empirical relationships seem to hold across a wide (ideally full) array of cases, but the most useful policy advice is that which is given in response to specific people in a specific place responding to a specific problem under specific constraints; as such, deep knowledge of the contextual contingencies characterizing each case is vital. Footnote 10

For example, consider the challenge of improving rural livelihoods during an economic crisis in Indonesia. In “Services for the People, By the People,” ISS researchers profiled how Indonesian policy-makers tried to address the problem of “capture” in a rural development program. Officials and local leaders often diverted resources designed to benefit the poor. The question was how to make compliance incentive compatible. That is, what did program leaders do to alter the cost–benefit calculus of the potential spoiler? How did they make their commitment to bargains, deals, pacts, or other devices credible? In most cases, the interaction is “dynamic” and equilibria (basis for compliance) are not stable. Learning inevitably takes place, and reform leaders often have to take new steps as circumstances change. Over time, what steps did a reformer take to preserve the fragile equilibrium first created or to forge a new equilibrium? Which tactics proved most effective, given the context?

In this instance, leaders used a combination of tactics to address the potential spoiler problem. They vested responsibility for defining priorities in communities, not in the capital or the district. They required that at least two of three proposals the communities could submit came from women’s groups. They set up subdistrict competitions to choose the best proposals, with elected members of each community involved in selection. They transferred money to community bank accounts that could only be tapped when the people villagers elected to monitor the projects all countersigned. They created teams of facilitators to provide support and monitor results. When funds disappeared, communities lost the ability to compete. Careful case analysis helped reveal not only the incentive design, but also the interaction between design and context – and the ways in which the system occasionally failed, although the program was quite successful overall.

A related series of ISS cases focused on how leaders overcame the opposition of people or groups who benefited from dysfunction and whose institutional positions enabled them to block changes that would improve service delivery. The ambition in these cases was to tease out the strategies reform leaders could use to reach an agreement on a new set of rules or practices, and how they did so when they succeeded. The case studies focused on institutions where spoiler traps often appear: anticorruption initiatives, port reform (ports, like banks, being “where the money is”), and infrastructure. The strategies and tactics these studies examined included the use of external agencies of restraint (e.g., the Governance and Economic Management Assistance Program [GEMAP] in Liberia); “coalitions with the public” to make interference more costly in social or political terms; persuading opponents to surrender rents in one activity for rewards in another; pitting strong spoilers against each other; and altering the cost calculus by exposing the spoiler to new risks. The cases allowed researchers both to identify the strategies used and to weigh the sensitivity of these to variations in context or shifts in the rules of the game or the actors involved. The hope was that the analysis the cases embodied would help practitioners avoid adopting strategies that are doomed to fail in the specific contexts they face. It also enabled policy-makers to see how they might alter rules or practices in ways that make a reformer’s job (at least to a degree) easier.

A couple of GDI cases provide further illustration of how to elicit strategic structure. In a case on how to shape an enabling environment for water service delivery in Nigeria, Footnote 11 the authors were able to identify the political incentives that undermine long-term commitments, reward short-run returns, and generate a low-level equilibrium trap. This has led to improvements in investments in rehabilitation and even an expansion of water services, yet it has not allowed the institutional reforms needed to ensure sustainability to move forward. In the case of Mexico, where the government had been struggling to improve service delivery to Indigenous communities, a World Bank loan provided a window of opportunity to change things. A number of reformers within the government believed that delivering services to these populations in their own languages would help decrease the number of dropouts from its flagship social program, Oportunidades. Footnote 12 However, previous efforts had not moved forward. The World Bank loan triggered a safeguards policy on Indigenous populations, and it became essential for officials to develop a program to certify bilingual personnel who could serve these communities. Interviews with key officials and stakeholders showed how the safeguards policy kick-started a set of meetings and decisions within the government that eventually led to this program, changing the strategic structures within government.

1.5 Showing How an Antecedent Condition Limits Decision-Makers’ Options

Some types of phenomena require case study analysis to disentangle complex causal relationships. We generally assume the cause of an outcome is exogenous, but sometimes there are feedback effects: an outcome intensifies one of its causes or limits the range of values the outcome can later assume. In such situations, case studies can be helpful in parsing the structure of these causal relationships and identifying which conditions are prior. Some of the case studies that inform Why Nations Fail (Acemoglu and Robinson 2012), for example, perform this function. More detailed case studies of this type appear in political science and sociological writing in the “historical institutionalism” tradition (see Thelen and Mahoney 2009; Mahoney and Thelen 2015).

Case studies are also useful when both the design of a policy intervention and the way in which it is implemented affect the outcome. They help distinguish the effects of policy from the effects of process, two things that most quantitative studies conflate. To illustrate, take another ISS case study series on rapid turnarounds observed in some types of public sector agencies: the quick development of pockets of effectiveness. The agencies at the center of this project provided business licenses or identity documents – actions that required relatively little exercise of judgment on the part of the person dispensing the service and where the number of distribution points is fairly limited. Businesses and citizens felt the effects of delay and corruption in these services keenly, but not all governments put reformers at the helm and not all reformers improved performance. The ISS team was partly interested in the interventions that produced turnarounds in this type of activity: was there a secret recipe – a practice that altered incentives or outlooks and generated positive results? The literature on principal–agent problems offered hypotheses about ways to better align the interests of leaders and the front-line staff who deliver a service, but many of these were inapplicable in low-resource environments or where removing personnel and modifying terms of service were hard to do. But ISS was also interested in how the mode of implementation affected outcomes, because solving the principal–agent problem often created clear losers who could block the new policies. How did the successful reformers win support?

The team refined and expanded its initial set of hypotheses through a detailed case study of South Africa’s Ministry of Home Affairs, and traced both the influence of the incentive design and the process used to put the new practices into effect. Without the second part, the case study team might have reasoned that the results stemmed purely from changed practices and tried to copy the same approach somewhere else, but in this instance, as in many cases, the mode of implementation was critical to success. The project leader could not easily draw from the standard toolkit for solving principal–agent problems because he could not easily remove poorly performing employees. He had to find ways to win union acceptance of the new policies and get people excited about the effort. This case study was an example of using qualitative methods to identify a causal mechanism and to develop explanations we can evaluate more broadly by conducting other case studies.

An example from the GDI is a case on addressing maternal and child mortality in Argentina in the early 2000s. Footnote 13 As a result of the 2001 economic crisis, thousands of people lost their jobs and hence were unable to pay for private healthcare; consequently, the public health system suddenly received a vast and unexpected influx of patients. Given that the Argentine public health system had been decentralized over the preceding decades and therefore the central government’s role in the provinces was minor, policy-makers had to work around a set of conditions and do it fast, given the context. The case disentangled how the central government was able to design one of the first results-based finance programs in the health sector and how this design was critical in explaining the maternal and child mortality outcomes. Policy-makers had to react immediately to the pressure on the health system and were able to make use of a provincial coordination mechanism that had become mostly irrelevant. By reviving this mechanism and having access to international funds, the central government was able to reinstate its role in provincial health care and engage key local decision-makers. Through the case study, the authors were able to assess the relevance of the policy-making process and how it defined the stakeholders’ choices, as well as the effect of the process in the Argentine healthcare system.

1.6 Testing a Theory in Novel Circumstances

Case study analysis is a relatively weak method for testing explanations derived from large sample sizes but it is often the only method available if the event is relatively uncommon or if sample sizes are small. Testing a theory against a small number of instrumentally chosen cases carries some peril. If we have only a few cases to study, the number of causal variables that potentially influence the outcome could overwhelm the number of observations, making it impossible to infer anything about the relationship between two variables, except through intensive tracing of processes.

Usually theory testing with case studies begins with a “truth table” or matrix, with the key independent variable(s) arrayed on one axis and the outcome variable on the other. The researcher collects data on the same variables in each case, places the name of each case in the appropriate cell of the table, and then compares the expected pattern with the actual one. If the theory is supported, the proportion of cases in each cell will track expectations.
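To make the mechanics concrete, the tabulation step can be sketched in a few lines of Python. The agency names and values below are invented for illustration; they are not drawn from any of the studies discussed here.

```python
# Hypothetical case data: each case records one key independent variable
# (has_merit_system) and the outcome (successful). All names are invented.
cases = [
    ("Agency A", True, True),
    ("Agency B", True, True),
    ("Agency C", False, False),
    ("Agency D", False, True),
    ("Agency E", True, True),
]

# Build the 2x2 truth table: each cell holds the names of the cases in it.
table = {(iv, out): [] for iv in (True, False) for out in (True, False)}
for name, iv, out in cases:
    table[(iv, out)].append(name)

# If the theory holds, cases cluster on the diagonal cells
# (True, True) and (False, False); off-diagonal cases are anomalies.
for (iv, out), names in sorted(table.items(), reverse=True):
    print(f"merit={iv!s:5} success={out!s:5}: {', '.join(names) or '-'}")
```

In this toy table, Agency D sits off the diagonal: it succeeded without a merit system, and so would be a candidate for the kind of deviant-case analysis discussed later in the chapter.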

An example of this kind of use of case studies appears in Alejandro Portes’s collaborative project on institutional development in Latin America (Portes and Smith 2008). In each country, the project studied the same five agencies. The research team listed several organizational characteristics that prior theories suggested might be important. In the truth table, the characteristic on which the successful agencies clustered was having a merit system for making personnel decisions. Having a merit system distinguished the successful agencies from the unsuccessful agencies in each of the five country settings in which the research took place. (A slightly different design would have allowed the researchers to determine whether an antecedent condition shaped the adoption of merit systems in the successful cases and also exercised an independent effect on the outcome.)

In the ISS project about single-agency turnarounds, the aim was to make some tentative general statements about the robustness of a set of practices to differences in context. Specifically, the claim was that delays would diminish and productivity would rise by introducing a fairly standard set of management practices designed to streamline a process, increase transparency, and invite friendly group competition. In this kind of observational study, the authors had a before-and-after or longitudinal design in each individual case, which was married with a cross-sectional design. Footnote 14 The elements of the intervention were arrayed in a truth table and examined to see which of them were present or absent in parallel interventions in a number of other cases. The team added cases with nearly identical interventions but different underlying country contexts. ISS then explored each case in greater detail to see whether implementation strategy or something else having to do with context explained which reforms were successful and which were not.

Small-scale observational studies (the only type of study possible in many subject areas) suffer from a variety of threats, including inability to control for large numbers of differences in setting. However, the interview data and close process tracing helped increase confidence in two respects. First, they helped reveal the connection between the outcomes observed and the practices under study. For example, it was relevant that people in work groups could describe their reactions when a poster showed that the number of identity documents they had issued had increased or decreased compared to the month before. Second, the information the interviews delivered about obstacles encountered and workarounds developed fueled hypotheses about robustness to changes in setting. In short, the deep dive that the case study permitted helped alleviate some of the inferential challenges that inevitably arise when there are only small numbers of observations and a randomized controlled trial is not feasible.

Rare events pose special problems for theory testing. Organizations must often learn from single cases – for example, from the outcome of a rare event such as a natural disaster or a major restructuring. In this circumstance it may be possible to evaluate impact across several units within the organization, or influences across policy areas. However, where this approach is impossible, few organizations decline to learn from experience; instead, they look closely at the history of the event to assess the sequence of steps by which the prevailing outcome came about and how it might have been different had alternative courses of action been pursued.

1.7 Understanding Outliers or Deviant Cases

A common and important use of case studies is to explore the case that does not conform to expectations. An analysis comparing a large number of cases on a few variables may find that most units (countries, agencies, etc.) cluster closely around a regression line whose slope shows the relationship between the causal variables and the outcome. However, one or two cases may lie far from the line. We usually want to know what’s different about those cases, and especially how and why they differ. For example, there is generally a quite robust relationship between a country’s level of spending on education and the quality of outcomes that country’s education system generates. Why is Vietnam in the bottom third globally in terms of its spending on education, yet in the upper third globally in terms of outcomes (as measured by student performance on standardized examinations)? Conversely, why is Malaysia in the upper third on spending and bottom third on outcomes?

In the study of development, outliers such as these hold particular fascination. For example, several scholars whose contributions are ordinarily associated with use of quantitative methods have employed schematic case studies to ponder why Botswana seems to have stronger institutions than most other African countries (Acemoglu, Johnson, and Robinson 2003). Costa Rica and Singapore attract attention for the same reason. Footnote 15 This same approach can be used to explore and explain subnational variation as a basis for deriving policy lessons. Brixi, Lust, and Woolcock (2015), for example, deploy data collected from household surveys to map the wide range of outcomes in public service delivery across countries in the Middle East and North Africa – countries that otherwise have highly centralized line ministries, meaning roughly the same policies regarding (say) health and education apply across any given country. The wide variation in outcomes is thus largely a matter of factors shaping policy implementation, which are often highly contextual and thus much harder to assess via standard quantitative instruments. On the basis of the subnational variation maps, however, granular case studies could be prepared on the particular locations where unusually high (and low) outcomes were being obtained; the lessons from these cases, in turn, became inputs for a conversation with domestic policy-makers about where and how improvements might be sought.
Here, the goal was not to seek policy reform by importing what researchers deemed “best practices” (as verified by “rigorous evidence”) from abroad but rather to use both household surveys and case studies to endogenize research tools into the ways in which local practitioners make difficult decisions about strategy, trade-offs, and feedback, doing so in ways regarded as legitimate and useful by providers and users of public services.

1.8 Ensuring Rigor in Case Studies: Foundations, Strategies, and Applications

There is general agreement on some of the standards that should govern qualitative case studies. Such studies should: Footnote 16

respond to a clear question that links to an important intellectual debate or policy problem

specify and define core concepts, terms, and metrics associated with the explanations

identify plausible explanations, articulating a main hypothesis and logical alternatives

offer data that allow us to evaluate the main ideas or discriminate between different possible causal mechanisms, including any that emerge as important in the course of the research

be selected according to clear and transparent criteria appropriate to the research objective

be amenable to replication – that is, other researchers ought to be able to check the results

Together, this book’s three parts – on Internal and External Validity Issues, Ensuring High-Quality Case Studies, and Applications to Development Practice – explore how the content and realization of these standards can be applied by those conducting case studies in development research and practice, and how, in turn, the fruits of their endeavors can contribute to a refinement and expansion of the “ecologies of evidence” on which inherently complex decisions in development are made.

We proceed as follows. Part I focuses on the relative strengths and weaknesses of qualitative cases versus frequentist observational studies (surveys, aggregate data analysis) and randomized controlled trials (RCTs). Its constituent chapters explore the logic of causal inference and the logic of generalization, often framed as problems of internal and external validity.

In Chapter 2 , philosopher of science Nancy Cartwright walks us through the logic behind RCTs on the one hand, and qualitative case studies on the other. RCTs have gained considerable prominence as a ‘gold standard’ for establishing whether a given policy intervention has a causal effect, but what do these experiments actually tell us and how useful is this information for policy-makers? Cartwright draws attention to two problems. First, an RCT only establishes a claim about average effects for the population enrolled in an experiment; it tells us little about what lies behind the average. The policy intervention studied might have changed nothing in some instances, while in others it triggered large shifts in behavior or health or whatever is under study. But, second, an RCT also tells us nothing about when we might expect to see the same effect size in a different population. To assess how a different population might respond requires other information of the sort that qualitative case studies often uncover. RCTs may help identify a cause, but identifying a cause is not the same as identifying something that is generally true, Cartwright notes. She then considers what information a policy-maker would need to predict whether a causal relationship will hold in a particular instance, which is often what we really want to know.
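Cartwright’s first point can be illustrated with toy numbers (entirely hypothetical, not drawn from her chapter): two trial populations can report the same average effect even though the distribution of individual effects – and hence the right policy conclusion – differs radically.

```python
# Hypothetical individual treatment effects for each enrolled subject.
pop_uniform = [2, 2, 2, 2]   # everyone shifts a little
pop_skewed  = [0, 0, 0, 8]   # most people are unaffected; one shifts a lot

ate_uniform = sum(pop_uniform) / len(pop_uniform)
ate_skewed = sum(pop_skewed) / len(pop_skewed)

# Both trials report an average treatment effect of 2.0, yet the second
# hides the fact that the intervention did nothing for most subjects.
print(ate_uniform, ate_skewed)
```

The average alone cannot distinguish the two cases; the qualitative information Cartwright points to is what tells us which situation we are in.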

The singular qualitative case study has a role to play in addressing this need. Cartwright begins by asking what support factors enable an intervention to work, and whether they are present in a particular situation. She suggests we should use various types of evidence, both indirect and direct. In the “direct” category are many of the elements that case studies can (and should) document: 1) Does O occur at the time, in the manner, and at the size to be expected if T caused it? 2) Are there symptoms of the cause – by-products of the causal relationship? 3) Were requisite support factors present (i.e., was everything in place that needed to be in order for T to produce O)? And 4) were the expected intermediate steps (mediator variables) in place? Often these are the key elements we need to know in order to decide whether the effects observed in an experiment will scale.

Political scientist Christopher Achen also weighs the value of RCTs versus qualitative case studies with the aim of correcting what he perceives as an imbalance in favor of the former within contemporary social science. In Chapter 3 he shows that “the argument for experiments depends critically on emphasizing the central challenge of observational work – accounting for unobserved confounders – while ignoring entirely the central challenge of experimentation – achieving external validity.” Using the mathematics behind randomized controlled trials to make his point, he shows that once this imbalance is corrected, we are closer to Cartwright’s view than to the current belief that RCTs constitute the gold standard for good policy research.

As a pivot, Achen takes a 2014 essay, a classic statement about the failure of observational studies to generate learning and about the strengths of RCTs. The authors of that essay argued that

[t]he external validity of an experiment hinges on four factors: 1) whether the subjects in the study are as strongly influenced by the treatment as the population to which a generalization is made, 2) whether the treatment in the experiment corresponds to the treatment in the population of interest, 3) whether the response measure used in the experiment corresponds to the variable of interest in the population, and 4) how the effect estimates were derived statistically.

But Achen finds this list a little too short: “The difficulty is that those assumptions combine jaundiced cynicism about observational studies with gullible innocence about experiments,” he writes. “What is missing from this list are the two critical factors emphasized in the work of recent critics of RCTs: heterogeneity of treatment effects and the importance of context.” For example, in an experiment conducted with Michigan voters, there were no Louisianans, no Democrats, and no general election voters; “[h]ence, no within-sample statistical adjustments are available to accomplish the inferential leap” required for generalizing the result.

Achen concludes: “Causal inference of any kind is just plain hard. If the evidence is observational, patient consideration of plausible counterarguments, followed by the assembling of relevant evidence, can be, and often is, a painstaking process.” Well-structured qualitative case studies are one important tool; experiments, another.

In Chapter 4, Andrew Bennett helps us think about what steps are necessary to use case studies to identify causal relationships and draw contingent generalizations. He suggests that case study research employs Bayesian logic rather than frequentist logic: “Bayesian logic treats probabilities as degrees of belief in alternative explanations, and it updates initial degrees of belief (called ‘priors’) by using assessments of the probative value of new evidence vis-à-vis alternative explanations (the updated degree of belief is known as the ‘posterior’).”
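The updating step Bennett describes is just Bayes’ rule, and a small hypothetical calculation makes it concrete. The numbers below are invented for illustration: we start evenly split between the main hypothesis and its alternatives, and then observe a piece of evidence four times likelier under the main hypothesis.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: posterior belief in hypothesis H after seeing evidence E."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))

# Hypothetical probative value: P(E|H) = 0.8, P(E|not-H) = 0.2, so the
# evidence favors H by a likelihood ratio of 4 to 1.
prior = 0.5
posterior = update(prior, p_e_given_h=0.8, p_e_given_not_h=0.2)
print(round(posterior, 2))  # 0.8
```

The same arithmetic explains why “smoking gun” evidence (very unlikely under the alternatives) moves beliefs sharply, while evidence that is equally likely under all explanations leaves the prior unchanged.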

Bennett’s chapter sketches four approaches: generalization from ‘typical’ cases, generalization from most- or least-likely cases, mechanism-based generalization, and typological theorizing, with special attention to the last two. Improved understanding of causal mechanisms permits generalizing to individuals, cases, or contexts outside the initial sample studied. In this regard, the study of deviant, or outlier, cases and cases that have high values on the independent variable of interest (theory of change) may prove helpful, Bennett suggests, aiding the identification of scope conditions, new explanations, and omitted variables.

In “Will it Work Here?” ( Chapter 5 ), Michael Woolcock focuses on the utility of qualitative case studies for addressing the decision-maker’s perennial external validity concern: What works there may not work here. He asks how to generate the facts that are important in determining whether an intervention can be scaled and replicated in a given setting. He focuses our attention on three categories. The first he terms causal density, or whether 1) there are numerous causal pathways and feedback loops that affect inputs, actions, and outcomes, and 2) there is greater or lesser openness to exogenous influence. Experiments are often helpful when causal density is low – deworming, use of malaria nets, classroom size – but they fail when causal density is high, as in parenting. To assess causal density, Woolcock suggests we pay special attention to how many person-to-person transactions are required; how much discretion is required of front-line implementing agents; how much pressure implementing agents face to do something other than respond constructively to the problem; and the extent to which implementing agents are required to deploy solutions from a known menu or to innovate in situ.

Woolcock’s two other categories of relevant fact are implementation capability and reasoned expectations about what can be achieved by when. With respect to the first, he urges us not to assume that implementation capacity is equally available in each setting. Who has the authority to act? Is there adequate management capacity? Are there adequately trained front-line personnel? Is there a clear point of delivery? A functional supply chain? His third category, reasoned expectations, focuses on having a grounded theory about what can be achieved by when. Should we anticipate that the elements of an intervention will all show results at the same time, as we usually assume, or will some kinds of results materialize before others? Will some increase over time, while others dissipate? Deliberation about these matters on the basis of analytic case studies, Woolcock argues, is the main method available for assessing the generalizability of any given intervention. Woolcock supplements his discussion with examples and a series of useful summary charts.

Part II of the book builds upon these methodological concerns to examine practical strategies by which case studies in international development (and elsewhere) can be prepared to the highest standards. Although not exhaustive, these strategies, presented by three political scientists, can help elevate the quality and utility of case studies by focusing on useful analytical tools that can enhance the rigor of their methodological foundations.

In Chapter 6 , Jennifer Widner, who directs Princeton University’s Innovations for Successful Societies program, reflects on what she and others have learned about gathering reliable information from interviews. Case study researchers usually draw on many types of evidence, some qualitative and some quantitative. For understanding motivation/interest, anticipated challenges, strategic choices, steps taken, unexpected obstacles encountered, and other elements of implementation, interviews with people who were “in the room where it happens” are usually essential. There may be diary entries or meeting minutes to help verify personal recall, but often the documentary evidence is limited or screened from view by thirty-year rules. Subject matter, proximity to elections or other sensitive events, interviewer self-presentation, question sequence, probes, and ethics safeguards are among the factors that shape the reliability of information offered in an interview. Widner sketches ways to improve the accuracy of recall and the level of detail, and to guard against “spin,” drawing on her program’s experience as well as the work of survey researchers and anthropologists.

Political scientist Tommaso Pavone analyzes how our evolving understanding of case-based causal inference via process tracing should alter how we select cases for comparative inquiry ( Chapter 7 ). The chapter explicates perhaps the most influential and widely used means to conduct qualitative research involving two or more cases: Mill’s methods of agreement and difference. It then argues that the traditional use of Millian methods of case selection can lead us to treat cases as static units to be synchronically compared rather than as social processes unfolding over time. As a result, Millian methods risk prematurely rejecting or otherwise overlooking (1) ordered causal processes, (2) paced causal processes, and (3) equifinality, or the presence of multiple pathways that produce the same outcome. To address these issues, the chapter develops a set of recommendations to ensure the alignment of Millian methods of case selection with within-case sequential analysis. First, it outlines how the use of processualist theories can help reformulate Millian case selection designs to accommodate ordered and paced processes (but not equifinal processes). Second, it proposes a new, alternative approach to comparative case study research: the method of inductive case selection. By selecting cases for comparison after a causal process has been identified within a particular case, the method of inductive case selection enables researchers to assess (1) the generalizability of the causal sequences, (2) the scope conditions on the causal argument, and (3) the presence of equifinal pathways to the same outcome. A number of concrete examples from development practice illustrate how the method of inductive case selection can be used by scholars and policy practitioners alike.

One of the common criticisms of qualitative research is that a case is hard to replicate. Whereas quantitative researchers often share their research designs and their data and encourage one another to rerun their analyses, qualitative researchers cannot as easily do so. However, they can enhance reliability in other ways. In Chapter 8 , Andrew Moravcsik introduces new practices designed to enhance three dimensions of research transparency: data transparency , which stipulates that researchers should publicize the data and evidence on which their research rests; analytic transparency , which stipulates that researchers should publicize how they interpret and analyze evidence in order to generate descriptive and causal inferences; and production transparency , which stipulates that social scientists should publicize the broader set of design choices that underlie the research. To respond to these needs, Moravcsik couples technology with the practice of discursive footnotes common in law journals. He discusses the rationale for creating a digitally enabled appendix with annotated source materials, called Active Citation or the Annotation for Transparency Initiative.

Part III – this volume’s concluding section – explores the ways in which case studies are being used today to learn from and enhance effectiveness in different development agencies.

In Chapter 9 , Andrew Bennett explores how process tracing can be used in program evaluation. “Process tracing and program evaluation, or contribution analysis, have much in common, as they both involve causal inference on alternative explanations for the outcome of a single case,” Bennett says:

Evaluators are often interested in whether one particular explanation – the implicit or explicit theory of change behind a program – accounts for the outcome. Yet they still need to consider whether exogenous nonprogram factors … account for the outcome, whether the program generated the outcome through some process other than the theory of change, and whether the program had additional or unintended consequences, either good or bad.

Bennett discusses how to develop a process-tracing case study to meet these demands and walks the reader through several key elements of this enterprise, including types of confounding explanations and the basics of Bayesian analysis.

In Chapter 10 , with a focus on social services in the Middle East, political scientist Melani Cammett takes up the use of positive deviant cases – examples of sustained high performance in a context in which good results are uncommon – to identify and disentangle causal complexity and understand the role of context. Although the consensus view on the role of deviant cases is that they are most useful for exploratory purposes or discovery and theory building, Cammett suggests they can also generate insights into the identification and operation of causal mechanisms. She writes that “analyses of positive deviant cases among a field of otherwise similar cases that operate in the same context … can be a valuable way to identify potential explanatory variables for exceptional performance.” The hypothesized explanatory variables can then be incorporated in subsequent quantitative or qualitative studies in order to evaluate their effects across a broader range of observations. The chapter discusses how to approach selection of positive deviant cases systematically and then works through a real example.

In Chapter 11, on “Analytical Narratives and Case Studies,” Margaret Levi and Barry Weingast focus on a particular type of case in which the focus is on an outcome that results from strategic interaction, when one person’s decision depends on what another does. “A weakness of case studies per se is that there typically exist multiple ways to interpret a given case,” they begin. “How are we to know which interpretation makes most sense? What gives us confidence in the particular interpretation offered?” An analytic narrative first elucidates the principal players, their preferences, key decision points and possible choices, and the rules of the game. It then builds a model of the sequence of interaction, including predicted outcomes, and evaluates the model through comparative statics and the testable implications the model generates. An analytic narrative also models situations as an extensive-form game. “The advantage of the game is that it reveals the logic of why, in equilibrium, it is in the interest of the players to fulfill their threats or promises against those who leave the equilibrium path,” the authors explain. Although game theory is useful, there is no hard rule that requires us to formalize. The particular findings do not generalize to other contexts, but an analytic narrative points to the characteristics of situations to which a similar strategic logic applies.

The book’s final chapters focus on the use of case studies for refining development policy and practice – in short, for learning. In Chapter 12 , Sarah Glavery and her coauthors draw a distinction between explicit knowledge, which is easily identified and shared through databases and reports, and tacit knowledge – the less easily shared “know how” that comes with having carried out a task. The chapter explores ways to use case study preparation, as well as a case itself, as a vehicle for sharing “know how,” specifically with respect to program implementation. It considers the experiences of four different types of organizations that have used case studies as part of their decision-making as it pertains to development issues: a multilateral agency (the World Bank), a major bilateral agency (Germany’s GIZ), a leading think tank (Brookings), and a ministry of a large country (China’s Ministry of Finance), which are all linked through their involvement in the GDI.

Finally, in Chapter 13 , Maria Gonzalez and Jennifer Widner reflect more broadly on the intellectual history of a science of delivery and adaptive management, two interlinked approaches to improving public services, and the use of case studies to move these endeavors forward. They emphasize the ways in which case studies have become salient tools for front-line staff whose everyday work is trying to solve complex development challenges, especially those pertaining to the implementation of policies and projects, and how, in turn, case studies are informing a broader turn to explaining outcome variation and identifying strategies for responding to complex challenges and ultimately seeking to enhance development effectiveness. The chapter discusses seven qualities that make a case useful to practitioners, and then offers reflections on how to use cases in a group context to elucidate core ideas and spark innovation.

1.9 Conclusion

In both development research and practice, case studies provide unique insights into implementation successes and failures, and help to identify why and how a particular outcome occurred. The data collected through case studies are often richer and of greater depth than data obtained through other research designs, which allows for potentially richer discussion of generalizability beyond the defined context of the case being studied. The case study method facilitates the identification of patterns and provides practical insights on how to navigate complex delivery challenges. Case studies can also capture the contextual conditions surrounding the delivery case, trace the detailed dynamics of the implementation process, provide key lessons learned, and inform broader approaches to service delivery (e.g., by focusing attention on citizen outcomes, generating multidimensional responses, providing usable evidence to enhance real-time implementation, and supporting leadership for change).

The core idea behind recent initiatives seeking to expand, formalize, and catalogue case studies of development practice is that capturing implementation processes and building a cumulative body of operational knowledge and know-how can play a key role in helping development practitioners deliver better results. Systematically investigating delivery in its own right offers an opportunity to distill common delivery challenges, and to engage constructively with the nontechnical problems that often hinder development interventions and prevent countries and practitioners from translating technical solutions into results on the ground.

Doing this well, however, requires drawing on the full array of established and leading approaches to conducting case study research. As this volume seeks to show, the last twenty years have seen considerable refinement and extension of prevailing practice, and renewed confidence among scholars of case study methods that they have not merely addressed (or at least identified defensible responses to) long-standing concerns regarding the veracity of case studies, but have actively advanced those domains of inquiry in which case studies enjoy a distinctive epistemological ‘comparative advantage’. In turn, the veritable explosion of case studies of development processes now being prepared by academic groups, domestic governments, and international agencies around the world offers unprecedented opportunities for researchers to refine still further the underlying techniques, methodological principles, and theory on which the case study itself ultimately rests. As such, the time is ripe for a mutually beneficial dialogue between scholars and practitioners of development – a dialogue we hope this volume can inspire.

The views expressed in this chapter are those of the authors alone, and should not be attributed to the organizations with which they are affiliated.

1 For example, see Barma, Huybens, and Viñuela (2014); Brixi, Lust, and Woolcock (2015); and Woolcock (2013).

2 See https://successfulsocieties.princeton.edu/ .

3 GDI’s case studies are available (by clicking on “Case studies” under the search category “Resource type”) at www.effectivecooperation.org/search/resources .

4 Van Noorden, Maher, and Nuzzo (2014) also provide a direct link to the dataset on which this empirical claim rests. As of this writing, according to Google Scholar, Yin’s book (across all six editions) has been cited over 220,000 times; see also Robert Stake’s The Art of Case Study Research (Stake 1995), which has been cited more than 51,000 times.

5 In addition to those already listed, other key texts on the theory and practice of case studies include Feagin, Orum, and Sjoberg (1991), Ragin and Becker (1992), Bates et al. (1998), Byrne and Ragin (2009), and Gerring (2017). See also Flyvbjerg (2006).

6 As such, this volume continues earlier dialogues between scholars and development practitioners in the fields of history (Bayly et al. 2011), law (Tamanaha et al. 2012), and multilateralism (Singh and Woolcock, forthcoming).

7 The initial study in what has become a sequence is Bliss and Stern (1982); for subsequent rounds, see Lanjouw and Stern (1998) and Lanjouw, Murgai, and Stern (2013). This study remains ongoing, and is now in its seventh decade.

8 Glavey and Haas (2015).

9 Glavey and Haas (2015).

10 For example, if it can be shown empirically that, in general, countries that exit from bilateral trade agreements show a subsequent improvement in their “rule of law” scores, does this provide warrant for advising (say) Senegal that if it wants to improve its “rule of law” then it should exit from all its bilateral trade agreements? We think not.

11 Hima and Santibanez (2015).

12 Estabridis and Nieto (2015).

13 Ortega Nieto and Parida (2015).

14 In the best of all possible worlds, we would want to draw the cases systematically from a known universe or population, but the absence of such a dataset meant we had to satisfice, matching organizations on function while varying context. Conclusions reached thus need to be qualified by the recognition that there could be more cases “out there,” which, if included in the analysis, might alter the initial results.

15 The ISS program began with a similar aim. The questions at the heart of the program were “What makes the countries that pull off institutional transformation different from others? What have they done that others could do to increase government capacity? What can be learned from the positive deviants, in particular?” For a variety of reasons having to do with the nature of the subject matter, the program disaggregated the subject and focused on responses to particular kinds of strategic challenges within countries and why some had negotiated these successfully in some periods and places but not in others.

16 These general standards, importantly, are consistent with a recent interdisciplinary effort to define rigor in case study research, which took place under the auspices of the US National Science Foundation. See Report on the Workshop on Interdisciplinary Standards for Systematic Qualitative Research. Available at: https://oconnell.fas.harvard.edu/files/lamont/files/issqr_workshop_rpt.pdf .


  • Using Case Studies to Enhance the Quality of Explanation and Implementation
  • By Jennifer Widner , Michael Woolcock , Daniel Ortega Nieto
  • Edited by Jennifer Widner , Princeton University, New Jersey , Michael Woolcock , Daniel Ortega Nieto
  • Book: The Case for Case Studies
  • Online publication: 05 May 2022
  • Chapter DOI: https://doi.org/10.1017/9781108688253.002


Estimation of Urban High-Quality Development Level Using a Three-Stage Slacks-Based Measure Model: A Case Study of Urban Agglomerations in the Yellow River Basin


1. Introduction

2. Literature Review and Research Gap

2.1. Literature Review

2.2. Research Gaps and Potential Contributions

  • Existing studies on assessing regional high-quality development levels often utilize indicator systems based on the five development concepts. While this approach can comprehensively reflect the essence of high-quality development, the focus of the indicator systems varies among different scholars due to differences in research subjects and objectives, which can lead to a lack of comparability and generalizability in the evaluation results [ 7 ].
  • When using efficiency indicators such as GDE to assess high-quality development levels, studies often overlook the heterogeneity of the research regions, which may deviate from assumptions of homogeneity in DEA models. Additionally, the impacts of external environmental factors and random errors are often not considered, although they may introduce biases into the measurement results, leading to inaccurate conclusions and potentially misleading policy recommendations.
  • Existing research predominantly focuses on the provincial or city level, with limited attention given to the perspective of urban agglomerations, which are crucial carriers of modern economic development [ 42 ]. The report of the 20th National Congress of the Communist Party of China emphasizes the construction of a coordinated development pattern relying on urban agglomerations and metropolitan areas; it is therefore essential to examine regional high-quality development from the perspective of urban agglomerations. Moreover, compared to the provincial or individual-city perspective, urban agglomerations place greater emphasis on high-level regional integration. Beyond enhancing overall regional development, reducing intra-regional development disparities is also crucial; however, existing studies often lack in-depth analysis of convergence in high-level regional development.
  • First, using GDE as the metric for assessing the high-quality development level in the YRB avoids the subjectivity associated with indicator selection and weight assignment, while also comprehensively considering changes in economic efficiency and resource utilization efficiency, making it more suitable for evaluating and analyzing ecologically sensitive areas, as well as promoting ecological protection and high-quality development;
  • Second, combining the super-efficient SBM model with the three-stage DEA model allows for the minimization of the impacts of the external environment and random factors in a region with significant internal disparities, such as the YRB, thus enhancing the accuracy of the assessment results; additionally, analyzing external environments by differentiating input factors clarifies the pathways of influence;
  • Third, by focusing on cities within the YRB and grouping them into urban agglomerations, this study fully considers the economic connections and interactions between cities; further investigation into the regional differences and convergence of GDE provides a better understanding of the overall green and coordinated development in the YRB, offering valuable data support and policy recommendations for enhancing regional high-quality development.
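The convergence analysis mentioned in the third contribution is typically an absolute β-convergence test: regressing the average growth rate of GDE on its initial (log) level, where a significantly negative slope implies that lagging cities catch up. A minimal sketch on synthetic data; the function name and numbers are illustrative, not taken from the study:

```python
import numpy as np

def beta_convergence(y0, yT, T):
    """OLS fit of (1/T) * ln(yT/y0) = a + b * ln(y0) + e.
    b < 0 indicates absolute beta-convergence; the implied
    convergence speed is lam = -ln(1 + b*T) / T."""
    y0, yT = np.asarray(y0, float), np.asarray(yT, float)
    growth = np.log(yT / y0) / T                 # average annual growth
    X = np.column_stack([np.ones_like(y0), np.log(y0)])
    (a, b), *_ = np.linalg.lstsq(X, growth, rcond=None)
    speed = -np.log(1.0 + b * T) / T
    return b, speed

# Synthetic panel constructed so that b = -0.05 by design.
T = 10
y0 = np.array([1.0, 2.0, 4.0, 8.0])
growth = 0.10 - 0.05 * np.log(y0)
yT = y0 * np.exp(growth * T)
b, speed = beta_convergence(y0, yT, T)
print(round(b, 4))   # -0.05
```

In practice the regression would add the conditioning variables (and spatial terms) used for conditional β-convergence, but the mechanics are the same.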

3. Methods and Materials

3.1. Methods

3.1.1. Three-Stage SBM Dynamic Analysis Model

  • Stage I: SBM model
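The study's Stage I uses a super-efficient SBM (slacks-based) model; as a simpler stand-in that conveys the general DEA mechanics, the sketch below solves the radial input-oriented CCR envelopment linear program with scipy. All data and the toy interpretation are made up:

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, k):
    """Input-oriented CCR efficiency of decision-making unit k
    (a simpler radial stand-in for the SBM model used in the paper).
    X: (n, m) input matrix, Y: (n, s) output matrix."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.zeros(1 + n)                          # variables: [theta, lam_1..lam_n]
    c[0] = 1.0                                   # minimise theta
    A_ub, b_ub = [], []
    for i in range(m):                           # sum_j lam_j * x_ji <= theta * x_ki
        A_ub.append(np.concatenate(([-X[k, i]], X[:, i])))
        b_ub.append(0.0)
    for r in range(s):                           # sum_j lam_j * y_jr >= y_kr
        A_ub.append(np.concatenate(([0.0], -Y[:, r])))
        b_ub.append(-Y[k, r])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    return res.fun

# Toy data: three cities, one input (e.g. energy), one output (e.g. GDP).
X = np.array([[2.0], [4.0], [8.0]])
Y = np.array([[1.0], [2.0], [2.0]])
scores = [round(ccr_efficiency(X, Y, k), 3) for k in range(3)]
print(scores)   # [1.0, 1.0, 0.5] -- the third city wastes half its input
```

Unlike this radial model, the SBM of Tone (2002) measures inefficiency via input and output slacks directly, and its super-efficiency variant allows efficient units to score above 1 so they can be ranked.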

3.1.2. Decomposition Method of Dagum’s Gini Coefficient
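Dagum's (1997) decomposition splits the overall Gini coefficient of GDE into a within-group component and a between-group component (the between component further divides into net inequality and transvariation, omitted here for brevity). A minimal sketch with made-up scores:

```python
import numpy as np

def gini(y):
    """Overall Gini: mean absolute difference over twice the mean."""
    y = np.asarray(y, dtype=float)
    n, mu = len(y), y.mean()
    return np.abs(y[:, None] - y[None, :]).sum() / (2 * n * n * mu)

def dagum(groups):
    """Split the overall Gini into within-group and gross between-group parts.
    groups: list of 1-D arrays, e.g. GDE scores per urban agglomeration."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    y = np.concatenate(groups)
    n, mu = len(y), y.mean()
    p = [len(g) / n for g in groups]                     # population shares
    s = [len(g) * g.mean() / (n * mu) for g in groups]   # score shares
    g_within = sum(p[j] * s[j] * gini(groups[j]) for j in range(len(groups)))
    g_between = 0.0
    for j in range(len(groups)):
        for h in range(j):
            # pairwise Gini between groups j and h
            d = np.abs(groups[j][:, None] - groups[h][None, :]).sum()
            g_jh = d / (len(groups[j]) * len(groups[h])
                        * (groups[j].mean() + groups[h].mean()))
            g_between += g_jh * (p[j] * s[h] + p[h] * s[j])
    return gini(y), g_within, g_between

# Hypothetical GDE scores for upstream, midstream, downstream agglomerations.
up, mid_, down = [0.4, 0.5], [0.6, 0.7], [0.8, 1.0]
total, within, between = dagum([up, mid_, down])
# Dagum identity: total == within + between (gross, incl. transvariation)
```

The decomposition is useful here precisely because it attributes overall inequality to disparities within each urban agglomeration versus disparities between agglomerations.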

3.1.3. β Convergence Model

3.2. Study Area

3.3. Variables and Data Source

4.1. Results and Analysis of the Three-Stage SBM Dynamic Analysis Model

4.1.1. Stage I: Results of Original GDE

4.1.2. Stage II: Impact of External Environment

4.1.3. Stage III: Results of Actual GDE

4.2. Decomposition of Dagum’s Gini Coefficient for Actual GDE

4.3. Analysis of Convergence in Actual GDE

5. Discussion

5.1. Discussion on External Environmental Variables

5.2. Discussion on the Spatiotemporal Evolution Trends of GDE in the YRB

5.3. Prospects and Limitations

6. Conclusions and Policy Recommendations

  • Promoting balanced development across the basin: Promoting integrated development within the region and fostering collaboration among urban agglomerations are crucial to addressing the issue of unbalanced high-quality development in the YRB. On the one hand, in order to fully harness the enthusiasm of various social actors, it is essential to establish diversified cooperation mechanisms, including ecological environment co-construction, industrial chain coordination, and cross-regional result- and benefit-sharing mechanisms. For instance, Shandong and Henan have signed a cooperation framework agreement to deepen collaboration in areas such as ecological environmental protection, infrastructure construction, the industrial division of labor, and cultural heritage preservation, jointly advancing the construction of a high-quality development demonstration area in the YRB. On the other hand, it is important to further enhance investment and trade exchanges and technology transfers within the basin and between regions, thereby promoting collaboration both within the YRB urban agglomerations and with domestic and international markets, to facilitate sharing through openness. In 2022, the nine provinces and regions along the Yellow River established the YRB Free Trade Zone Alliance. By promoting mutually beneficial industrial cooperation and business logistics collaboration, this initiative aims to elevate the overall openness level and economic competitiveness of the YRB, as well as enhance the efficient flow of resources within the free trade zones.
  • Industrial structure upgrading and optimization: Although urbanization levels in the YRB have been continuously improving, the industrial structures across provinces and cities remain highly homogeneous, which not only fails to fully exploit regional advantages but also increases pressure on regional resources. Therefore, during the process of industrial upgrading and transformation, urban agglomerations should focus on developing unique, advantageous industries tailored to local conditions and on promoting emerging industries. For upstream urban agglomerations, efforts should be made to enhance the green production levels of traditional industries, while accelerating energy structure adjustments to highlight the advantages of clean energy and achieve large-scale utilization of renewable energy. Midstream urban agglomerations should leverage the high-end industrial agglomeration and leadership functions of national central cities to build an innovation-driven modern industrial system, moving away from long-term dependence on outdated industries. Additionally, midstream urban agglomerations should further develop their role as comprehensive regional transportation hubs, creating a star-shaped integrated economic development axis that drives surrounding areas, radiates nationwide, and connects internationally. Downstream urban agglomerations, serving as the “gateway” and open ports of the YRB, should continue to deepen comprehensive reforms, establish high-standard market systems, reduce institutional transaction costs, and improve the international business environment. While enhancing their ability to participate in global competition and cooperation, they should also deepen collaboration between coastal ports and inland ports, and establish and improve strategic cooperation mechanisms among urban agglomerations along the Yellow River.
  • Enhancing quality and efficiency in infrastructure development: For the YRB, the imbalanced development of the infrastructure system has become a significant constraint on the advancement of new urbanization. Therefore, as a crucial lever for achieving high-quality development, strengthening the construction of transportation, water conservancy, and municipal infrastructure remains a key focus of regional economic efforts. In addition, in light of the significant opportunities presented by the development of new infrastructure, provinces and regions are actively advancing the deployment and application of industries such as 5G base stations, intercity high-speed railways, urban rail transit, and artificial intelligence. Particularly in upstream areas that leverage the national “Eastern Data, Western Computing” project, efforts are being made to build and retrofit medium and large data centers in a green manner, in order to achieve deep integration of informatization and green construction. However, it is essential to shift the goals of industrial development from speed and scale to quality and efficiency in the process of promoting infrastructure construction. In recent years, China has begun to strengthen the approval process for subways, high-speed railways, and certain major engineering projects, and has halted infrastructure projects beyond basic livelihood projects in some provinces and municipalities with heavy debt burdens, including Inner Mongolia, Gansu, Qinghai, and Ningxia. In the future, when planning infrastructure construction in the Yellow River Basin’s urban agglomerations, it is essential to adopt a demand-oriented approach, which involves controlling the overdevelopment of traditional infrastructure projects and systematically planning new infrastructure projects. Care must be taken to avoid redundant construction and overcapacity, which could lead to high local debt and trigger systemic financial risks.

Author Contributions

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

  • OECD. Towards Green Growth: Monitoring Progress: OECD Indicators; OECD: Paris, France, 2011.
  • UNEP. Towards a Green Economy: Pathways to Sustainable Development and Poverty Eradication; UNEP: Nairobi, Kenya, 2011. Available online: https://sustainabledevelopment.un.org/content/documents/126GER_synthesis_en.pdf (accessed on 23 November 2023).
  • Asian Development Bank. Green Growth, Resources and Resilience: Environmental Sustainability in Asia and the Pacific; Asian Development Bank: Mandaluyong, Philippines, 2012.
  • World Bank. Inclusive Green Growth: The Pathway to Sustainable Development; World Bank Publications: Washington, DC, USA, 2012.
  • UNSDG. Global Indicator Framework for the Sustainable Development Goals and Targets of the 2030 Agenda for Sustainable Development; UNSDG: New York, NY, USA, 2017. Available online: https://unstats.un.org/sdgs/indicators/indicators-list/ (accessed on 23 November 2023).
  • Wang, M.; Zhao, H.; Cui, J.; Fan, D.; Lv, B.; Wang, G.; Li, Z.; Zhou, G. Evaluating Green Development Level of Nine Cities within the Pearl River Delta, China. J. Clean. Prod. 2018, 174, 315–323.
  • Zhang, F.; Tan, H.; Zhao, P.; Gao, L.; Ma, D.; Xiao, Y. What Was the Spatiotemporal Evolution Characteristics of High-Quality Development in China? A Case Study of the Yangtze River Economic Belt Based on the ICGOS-SBM Model. Ecol. Indic. 2022, 145, 109593.
  • Yang, Y.; Su, X.; Yao, S. Nexus between Green Finance, Fintech, and High-Quality Economic Development: Empirical Evidence from China. Resour. Policy 2021, 74, 102445.
  • Pan, W.; Wang, J.; Lu, Z.; Liu, Y.; Li, Y. High-Quality Development in China: Measurement System, Spatial Pattern, and Improvement Paths. Habitat Int. 2021, 118, 102458.
  • Liu, N.; Wang, Y. Urban Agglomeration Ecological Welfare Performance and Spatial Convergence Research in the Yellow River Basin. Land 2022, 11, 2073.
  • Wang, B.; Huang, R. Regional Green Development Efficiency and Green Total Productivity Growth in China: 2000–2010: Based on Parametric Metafrontier Analysis. Ind. Econ. Rev. 2014, 5, 16–35. (In Chinese)
  • Hailu, A.; Veeman, T.S. Environmentally Sensitive Productivity Analysis of the Canadian Pulp and Paper Industry, 1959–1994: An Input Distance Function Approach. J. Environ. Econ. Manag. 2000, 40, 251–274.
  • Jiang, L.; Zuo, Q.; Ma, J.; Zhang, Z. Evaluation and Prediction of the Level of High-Quality Development: A Case Study of the Yellow River Basin, China. Ecol. Indic. 2021, 129, 107994.
  • Zheng, W.; Zhang, L.; Hu, J. Green Credit, Carbon Emission and High Quality Development of Green Economy in China. Energy Rep. 2022, 8, 12215–12226.
  • Chen, L.; Wang, X.; Wang, Y.; Gao, P. Improved Entropy Weight Methods and Their Comparisons in Evaluating the High-Quality Development of Qinghai, China. Open Geosci. 2023, 15, 20220570.
  • Hua, X.; Lv, H.; Jin, X. Research on High-Quality Development Efficiency and Total Factor Productivity of Regional Economies in China. Sustainability 2021, 13, 8287.
  • Sun, H.; Zhao, Z.; Han, D. Growth Models and Influencing Mechanisms of Total Factor Productivity in China’s National High-Tech Zones. Sustainability 2024, 16, 3245.
  • Sun, X.; Sui, D.; Chang, K.; Wang, G. Research on the Influence of TFP on High-Quality Development in China. J. Appl. Math. Comput. 2021, 5, 154–164.
  • Zeng, S.; Shu, X.; Ye, W. Total Factor Productivity and High-Quality Economic Development: A Theoretical and Empirical Analysis of the Yangtze River Economic Belt, China. Int. J. Environ. Res. Public Health 2022, 19, 2783.
  • Li, W.; Cai, Z.; Jin, L. A Spatial-Temporal Analysis on Green Development in China’s Yellow River Basin: Model-Based Efficiency Evaluation and Influencing Factors Identification. Stoch. Environ. Res. Risk Assess. 2023, 37, 4431–4444.
  • Zhou, F.; Si, D.; Hai, P.; Ma, P.; Pratap, S. Spatial-Temporal Evolution and Driving Factors of Regional Green Development: An Empirical Study in Yellow River Basin. Systems 2023, 11, 109.
  • Liu, L.; Yang, Y.; Liu, S.; Gong, X.; Zhao, Y.; Jin, R.; Duan, H.; Jiang, P. A Comparative Study of Green Growth Efficiency in Yangtze River Economic Belt and Yellow River Basin between 2010 and 2020. Ecol. Indic. 2023, 150, 110214.
  • Wang, H.; Cui, H.; Zhao, Q. Effect of Green Technology Innovation on Green Total Factor Productivity in China: Evidence from Spatial Durbin Model Analysis. J. Clean. Prod. 2021, 288, 125624.
  • Wang, K.; Pang, S.; Ding, L.; Miao, Z. Combining the Biennial Malmquist–Luenberger Index and Panel Quantile Regression to Analyze the Green Total Factor Productivity of the Industrial Sector in China. Sci. Total Environ. 2020, 739, 140280.
  • Tone, K. A Slacks-Based Measure of Super-Efficiency in Data Envelopment Analysis. Eur. J. Oper. Res. 2002, 143, 32–41.
  • Li, B.; Wu, S. Effects of Local and Civil Environmental Regulation on Green Total Factor Productivity in China: A Spatial Durbin Econometric Analysis. J. Clean. Prod. 2017, 153, 342–353.
  • Song, M.; Du, J.; Tan, K. Impact of Fiscal Decentralization on Green Total Factor Productivity. Int. J. Prod. Econ. 2018, 205, 359–367.
  • Liu, D.; Zhu, X.; Wang, Y. China’s Agricultural Green Total Factor Productivity Based on Carbon Emission: An Analysis of Evolution Trend and Influencing Factors. J. Clean. Prod. 2021, 278, 123692.
  • Li, Y.; Chen, Y. Development of an SBM-ML Model for the Measurement of Green Total Factor Productivity: The Case of Pearl River Delta Urban Agglomeration. Renew. Sustain. Energy Rev. 2021, 145, 111131.
  • Lee, C.; Lee, C. How Does Green Finance Affect Green Total Factor Productivity? Evidence from China. Energy Econ. 2022, 107, 105863.
  • Zhou, Y.; Kong, Y.; Zhang, T. The Spatial and Temporal Evolution of Provincial Eco-Efficiency in China Based on SBM Modified Three-Stage Data Envelopment Analysis. Environ. Sci. Pollut. Res. 2020, 27, 8557–8569.
  • Fried, H.O.; Lovell, C.A.K.; Schmidt, S.S.; Yaisawarng, S. Accounting for Environmental Effects and Statistical Noise in Data Envelopment Analysis. J. Product. Anal. 2002, 17, 157–174.
  • Fan, Y.; Chen, C. The Performance Analysis of Anti-Terrorism Intelligence from Taiwan’s Investigation Bureau of the Ministry of Justice. In Proceedings of the 2008 IEEE International Conference on Intelligence and Security Informatics, Taipei, Taiwan, 17–20 June 2008; pp. 259–260.
  • Shyu, J.; Hung, S. The True Managerial Efficiency of International Tourist Hotels in Taiwan: Three-Stage Data Envelopment Analysis. Serv. Ind. J. 2012, 32, 1991–2004.
  • Iparraguirre, J.L.; Ma, R. Efficiency in the Provision of Social Care for Older People. A Three-Stage Data Envelopment Analysis Using Self-Reported Quality of Life. Socio-Econ. Plan. Sci. 2015, 49, 33–46.
  • Zhao, H.; Guo, S.; Zhao, H. Provincial Energy Efficiency of China Quantified by Three-Stage Data Envelopment Analysis. Energy 2019, 166, 96–107.
  • Su, W.; Hou, Y.; Huang, M.; Xu, J.; Du, Q.; Wang, P. Evaluating the Efficiency of Primary Health Care Institutions in China: An Improved Three-Stage Data Envelopment Analysis Approach. BMC Health Serv. Res. 2023, 23, 995.
  • Feng, C.; Wang, M.; Liu, G.; Huang, J. Green Development Performance and Its Influencing Factors: A Global Perspective. J. Clean. Prod. 2017, 144, 323–333.
  • Chen, L.; Zhang, X.; He, F.; Yuan, R. Regional Green Development Level and Its Spatial Relationship under the Constraints of Haze in China. J. Clean. Prod. 2019, 210, 376–387.
  • Shuai, S.; Fan, Z. Modeling the Role of Environmental Regulations in Regional Green Economy Efficiency of China: Empirical Evidence from Super Efficiency DEA-Tobit Model. J. Environ. Manag. 2020, 261, 110227.
  • Zhang, J.; Yang, Z.; Zhang, X.; Sun, J.; He, B. Institutional Configuration Study of Urban Green Economic Efficiency—Analysis Based on fsQCA and NCA. Pol. J. Environ. Stud. 2024, 33, 1–11.
  • Fang, C.; Yu, D. Urban Agglomeration: An Evolving Concept of an Emerging Phenomenon. Landsc. Urban Plan. 2017, 162, 126–136.
  • Jondrow, J.; Knox Lovell, C.A.; Materov, I.S.; Schmidt, P. On the Estimation of Technical Inefficiency in the Stochastic Frontier Production Function Model. J. Econom. 1982, 19, 233–238.
  • Kumbhakar, S.C.; Knox Lovell, C.A. Stochastic Frontier Analysis; Cambridge University Press: Cambridge, UK, 2003.
  • Dagum, C. A New Approach to the Decomposition of the Gini Income Inequality Ratio. Empir. Econ. 1997, 22, 515–531.
  • Chen, Y.; Fu, B.; Zhao, Y.; Wang, K.; Zhao, M.; Ma, J.; Wu, J.; Xu, C.; Liu, W.; Wang, H. Sustainable Development in the Yellow River Basin: Issues and Strategies. J. Clean. Prod. 2020, 263, 121223.
  • Outline of the Yellow River Basin’s Ecological Protection and High-Quality Development Plan. People’s Daily, 9 October 2021; 001. (In Chinese)
  • Liu, P.; Zhu, B. Temporal-Spatial Evolution of Green Total Factor Productivity in China’s Coastal Cities under Carbon Emission Constraints. Sustain. Cities Soc. 2022, 87, 104231.
  • Lin, P.; Meng, N. Spatio-temporal Differentiation and Dynamic Convergence of Green Total Factor Productivity Growth. J. Quant. Technol. Econ. 2021, 38, 104–124. (In Chinese)
  • Zhang, J.; Wu, G.; Zhang, J. The Estimation of China’s Provincial Capital Stock: 1952–2000. Econ. Res. J. 2004, 10, 35–44. (In Chinese)
  • Shan, H. Reestimating the Capital Stock of China: 1952–2006. J. Quant. Technol. Econ. 2008, 25, 17–31. (In Chinese)
  • Wu, Y. The Role of Productivity in China’s Growth: New Estimates. China Econ. Q. 2008, 7, 827–842. (In Chinese)
  • Sun, Y.; Yang, M. Research on Club Convergence and the Sources of Regional Gaps of Green Total Factor Productivity in China. J. Quant. Technol. Econ. 2020, 37, 47–69. (In Chinese)
  • Guo, X.; Deng, M.; Wang, X.; Yang, X. Population Agglomeration in Chinese Cities: Is It Benefit or Damage for the Quality of Economic Development? Environ. Sci. Pollut. Res. 2024, 31, 10106–10118.
  • Gan, C.; Zheng, R.; Yu, D. An Empirical Study on the Effects of Industrial Structure on Economic Growth and Fluctuations in China. Econ. Res. J. 2011, 46, 4–16+31. (In Chinese)
  • Levine, R. Financial Development and Economic Growth: Views and Agenda. J. Econ. Lit. 1997, 35, 688–726.
  • Beck, T.; Levine, R. Industry Growth and Capital Allocation: Does Having a Market- or Bank-Based System Matter? J. Financ. Econ. 2002, 64, 147–180.
  • Chen, Y.; Su, X.; Zhou, Q. Spatial Differentiation and Influencing Factors of the Green Development of Cities along the Yellow River Basin. Discret. Dyn. Nat. Soc. 2022, 2022, 1–20.
  • Zhang, J.; Liu, Y.; Liu, C.; Guo, S.; Cui, J. Study on the Spatial and Temporal Evolution of High-Quality Development in Nine Provinces of the Yellow River Basin. Sustainability 2023, 15, 6975.
  • Zhang, S.; Lv, Y.; Zhang, B. Spatio-Temporal Evolution and Influencing Factors of Green Development in the Yellow River Basin of China. Sustainability 2022, 14, 12407.
  • Zhang, C.; Chen, P. Applying the Three-Stage SBM-DEA Model to Evaluate Energy Efficiency and Impact Factors in RCEP Countries. Energy 2022, 241, 122917.
  • Tonooka, Y.; Liu, J.; Kondou, Y.; Ning, Y.; Fukasawa, O. A Survey on Energy Consumption in Rural Households in the Fringes of Xian City. Energy Build. 2006, 38, 1335–1342.
  • Sun, C.; Tong, Y.; Zou, W. The Evolution and a Temporal-Spatial Difference Analysis of Green Development in China. Sustain. Cities Soc. 2018 , 41 , 52–61. [ Google Scholar ] [ CrossRef ]
  • Huang, S.; Xie, D. North—South Urban Functional Difference and North—South Economic Disparity in China. South China J. Econ. 2022 , 41 , 40–63+76. (In Chinese) [ Google Scholar ] [ CrossRef ]
  • Davis, J.C.; Henderson, J.V. Evidence on the Political Economy of the Urbanization Process. J. Urban Econ. 2003 , 53 , 98–125. [ Google Scholar ] [ CrossRef ]


| Urban Agglomeration/City | GDP (CNY 100 Million) | GDP per Capita (CNY 10,000) | Population (10,000 People) | General Public Budget Revenue (CNY 100 Million) |
| --- | --- | --- | --- | --- |
| SPUA | 57,509 (26.19%) | 11.75 | 4895.00 (16.35%) | 4569.99 (27.00%) |
| Jinan | 12,027 (20.91%) | 12.83 | 937.51 (19.15%) | 1001.14 (21.91%) |
| Qingdao | 14,921 (25.95%) | 14.49 | 1029.96 (21.04%) | 1273.31 (27.86%) |
| CHUA | 87,996 (40.08%) | 5.64 | 15,591.21 (52.07%) | 6385.58 (37.73%) |
| Zhengzhou | 12,935 (14.70%) | 10.12 | 1278.55 (8.20%) | 1130.79 (17.71%) |
| YBUA | 39,223 (17.87%) | 10.64 | 3685.33 (12.31%) | 3859.84 (22.80%) |
| Hohhot | 3329 (8.49%) | 9.44 | 352.49 (9.56%) | 230.87 (5.98%) |
| Ordos | 5613 (14.31%) | 25.69 | 218.48 (5.93%) | 842.84 (21.84%) |
| Baotou | 3750 (9.56%) | 13.74 | 273.01 (7.41%) | 173.61 (4.50%) |
| Taiyuan | 5571 (14.20%) | 10.29 | 541.28 (14.69%) | 437.48 (11.33%) |
| Yinchuan | 2536 (6.47%) | 8.78 | 288.98 (7.84%) | 168.86 (4.37%) |
| Yulin | 6844 (17.45%) | 18.08 | 378.51 (10.27%) | 926.81 (24.01%) |
| GPUA | 27,665 (12.60%) | 6.39 | 4332.92 (14.47%) | 1652.74 (9.76%) |
| Xi'an | 11,487 (41.52%) | 8.88 | 1293.49 (29.85%) | 834.08 (50.47%) |
| LXUA | 7153.6 (3.26%) | 4.98 | 1436.57 (4.80%) | 457.32 (2.70%) |
| Lanzhou | 3344 (46.75%) | 7.60 | 440.05 (30.63%) | 220.98 (48.32%) |
| Xining | 1644 (22.98%) | 6.64 | 247.73 (17.24%) | 131.72 (28.80%) |
| Function Layer | Dimension Layer | Index Layer |
| --- | --- | --- |
| Input Variables | Capital Input (K) | capital stock |
| | Labor Input (L) | end-of-year employment |
| | Energy Consumption (E) | total electricity consumption for the entire society |
| Output Variables | Desired Output | GDP |
| | Undesired Output | environmental pollution composite index (PE) |
| Input Variable | Environmental Variable | Indicator Description |
| --- | --- | --- |
| Capital Input (K) | Industrial Structure (IS) | Level of rationalization in industrial structure |
| | Financial Development Level (FD) | Proportion of total loans to GDP |
| | Economic Openness Level (EO) | Total value of imports and exports (in billions of CNY) |
| Labor Input (L) | Financial Development Level (FD) | Proportion of total loans to GDP |
| | Population Density (PD) | Persons per square kilometer |
| | Urbanization Rate (UR) | Proportion of urban population to total population |
| Energy Consumption (E) | Industrial Structure (IS) | Level of rationalization in industrial structure |
| | Population Density (PD) | Persons per square kilometer |
| | Infrastructure (IC) | Per capita urban road area (square meters) |
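In a three-stage SBM/DEA design like the one these tables describe, the environmental variables above are used in a second stage to regress each input's slack, and inputs are then adjusted so every city is evaluated on a common environmental footing. A minimal sketch of that adjustment step, assuming slack predictions and noise terms have already been estimated (the function and array names are illustrative, not taken from the paper):

```python
import numpy as np

def adjust_inputs(x, slack_pred, noise):
    """Third-stage input adjustment common in three-stage DEA/SBM studies.

    x          : (cities, inputs) observed inputs
    slack_pred : (cities, inputs) slack predicted from environmental variables
    noise      : (cities, inputs) estimated statistical noise

    Cities in favourable environments (small predicted slack) have their
    inputs revised upward, putting all cities on a common footing before
    efficiency is re-estimated.
    """
    env_adj = slack_pred.max(axis=0) - slack_pred      # environment adjustment
    noise_adj = noise.max(axis=0) - noise              # luck/noise adjustment
    return x + env_adj + noise_adj
```

The adjusted inputs are never smaller than the observed ones, so third-stage efficiency scores isolate managerial performance from environmental advantage.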
| | K-slack (SFA) | L-slack (SFA) | E-slack (SFA) |
| --- | --- | --- | --- |
| Constant | 54.087 *** (6.350) | 1.825 (0.322) | 8.741 (0.629) |
| IS / FD / IS | −24.178 *** (−3.503) | −3.555 *** (−3.256) | −26.015 ** (−2.081) |
| FD / PD / PD | −9.952 *** (−5.607) | −0.097 ** (2.329) | −0.247 ** (−2.236) |
| EO / UR / IC | 0.021 *** (4.670) | 0.224 ** (2.511) | 0.833 * (1.745) |
| σ² | 9544.400 *** (5.172) | 595.624 *** (7.084) | 19,507.000 *** (15,392.136) |
| γ | 0.944 *** (81.071) | 0.542 *** (8.154) | 0.913 *** (176.455) |
| LR test | 691.893 *** | 188.291 *** | 822.935 *** |

Environmental-variable rows list the regressor for the K, L, and E equations in turn (e.g. IS / FD / IS); t-statistics in parentheses. *, **, and *** denote significance at the 10%, 5%, and 1% levels.
| | Overall | CHUA | GPUA | LXUA | SPUA | YBUA |
| --- | --- | --- | --- | --- | --- | --- |
| Model | SDM | SDM | SAR | OLS | SAR | SAR |
| β | −0.205 *** (−5.325) | −0.667 *** (−9.200) | −0.140 ** (−2.063) | −0.012 * (−1.950) | −0.149 ** (−2.256) | −0.129 ** (−2.388) |
| θ | −0.733 * (−2.141) | −3.957 *** (−7.7834) | | | | |
| ρ/λ | −1.633 *** (−7.243) | −1.461 *** (−5.117) | −0.467 * (−1.798) | | −0.319 * (−1.654) | −1.413 *** (−8.905) |
| Ind FE | YES | YES | YES | YES | YES | YES |
| Time FE | YES | YES | YES | YES | YES | YES |
| R-LM (SAR) | 0.001 *** | 0.001 *** | 0.006 *** | 0.384 | 0.046 ** | 0.079 * |
| R-LM (SEM) | 0.002 *** | 0.001 *** | 0.034 ** | 0.440 | 0.217 | 0.219 |
| R² | 0.318 | 0.322 | 0.412 | 0.269 | 0.412 | 0.525 |

t-statistics in parentheses; *, **, and *** denote significance at the 10%, 5%, and 1% levels.

Share and Cite

Liu, S.; Yang, S.; Liu, N. Estimation of Urban High-Quality Development Level Using a Three-Stage Stacks-Based Measure Model: A Case Study of Urban Agglomerations in the Yellow River Basin. Sustainability 2024, 16, 8130. https://doi.org/10.3390/su16188130



Environmental Science: Water Research & Technology

Investigating water quality and preservation strategies in Abuja's distribution system: a Nigerian case study


* Corresponding authors

a Department of Civil Engineering, EPOKA University, Autostrada Tirana-Rinas, km. 12, Tirana, Albania E-mail: [email protected] Tel: +355 4 2232 086 (ext: 1552)

b Department of Civil Engineering, Nile University of Nigeria, Plot 681, Cadastral Zone C-OO, Research & Institution Area, Jabi Airport Bypass, Abuja FCT, Nigeria

Abuja, the capital city of Nigeria, primarily sources its drinking water from the Lower Usuma Dam Water Treatment Plant (LUD-WTP). This study aims to investigate the preservation of the physicochemical and biological properties of the treated water as it traverses the distribution network to reach the end consumers. Laboratory analyses indicate that the physicochemical parameters of the water samples comply with the guidelines set by the World Health Organization (WHO) and the Nigerian Standard for Drinking Water Quality (NSDWQ). However, bacteriological examination of samples from areas serviced by the LUD-WTP revealed the presence of E. coli , Enterobacter aerogenes , and Klebsiella bacteria, alongside a lack of residual chlorine. The study subsequently focuses on identifying vulnerabilities in the water distribution system and proposing preventive measures. The findings of this research have significant implications for managing drinking water quality in urban distribution networks, particularly in developing countries.
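The missing residual chlorine the abstract reports is the kind of finding that is easy to screen for across sampling sites. A minimal illustration, assuming a simple sample record format; the function name, record fields, and the 0.2 mg/L default (the commonly cited WHO minimum for free residual chlorine at the point of delivery) are this sketch's assumptions, not details from the paper:

```python
def flag_low_chlorine(samples, min_free_chlorine=0.2):
    """Return sites whose free residual chlorine falls below the threshold.

    samples: iterable of dicts like {"site": str, "free_chlorine_mg_l": float}
    The 0.2 mg/L default reflects the commonly cited WHO minimum for free
    residual chlorine at the point of delivery (hypothetical field names).
    """
    return [s["site"] for s in samples
            if s["free_chlorine_mg_l"] < min_free_chlorine]
```

Run against routine sampling data, a check like this flags the network segments where disinfectant decay or intrusion deserves investigation.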


B. Kulmedov, L. A. Akaiku and O. N. Mogbo, Environ. Sci.: Water Res. Technol., 2024, Advance Article, DOI: 10.1039/D4EW00613E



Using Cost of Quality to Improve Business Results


  • Manufacturing

CRC Industries uses cost of quality as a key measure for improving business results. Since centering improvement efforts on cost of quality, the company has reduced failure dollars as a percentage of sales and saved hundreds of thousands of dollars. Cost of quality can also be linked to other improvements at CRC Industries, including shipping error reductions, customer service order entry error reductions, productivity increases, hazardous waste reduction, and profitability.
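CRC's headline metric, failure dollars as a percentage of sales, is a simple ratio of failure costs to revenue. A minimal sketch of the calculation; the function name and figures are illustrative, not CRC's actual numbers:

```python
def failure_cost_pct(internal_failure, external_failure, net_sales):
    """Cost-of-quality failure metric: failure dollars as a % of sales.

    internal_failure: scrap, rework, and re-inspection costs
    external_failure: returns, warranty claims, shipping-error costs
    net_sales: revenue for the same period
    """
    return 100 * (internal_failure + external_failure) / net_sales
```

Tracking this ratio period over period shows whether spending on prevention and appraisal is actually driving failure costs down relative to the business's size.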

  • Case study,
  • Cost of quality,
  • Failure dollars,
  • Business results,
  • Measurement,
  • Quality tools,
  • Root cause analysis



COMMENTS

  1. Total quality management: three case studies from around the world

    According to Teresa Whitacre, of international consulting firm ASQ, proper quality management also boosts a company's profitability. "Total quality management allows the company to look at their management system as a whole entity — not just an output of the quality department," she says. "Total quality means the organisation looks at ...

  2. Case Studies

    Using DMAIC to Improve Nursing Shift-Change Assignments. In this case study involving an anonymous hospital, nursing department leaders sought to improve efficiency of their staff's shift change assignments. Upon value stream mapping the process, team members identified the shift nursing report took 43 minutes on average to complete.

  3. Quality management

    Manage Your Human Sigma. Organizational Development Magazine Article. John H. Fleming. Curt Coffman. James K. Harter. If sales and service organizations are to improve, they must learn to measure ...

  4. Quality: Articles, Research, & Case Studies on Quality

by Jim Heskett. A new book by Gregory Clark identifies "labor quality" as the major enticement for capital flows that lead to economic prosperity. By defining labor quality in terms of discipline and attitudes toward work, this argument minimizes the long-term threat of outsourcing to developed economies.

  5. Journey to Perfect: Mayo Clinic and the Path to Quality

    Abstract. Long regarded as one of the best healthcare organizations in the world, the Mayo Clinic has not been exempt from the challenges facing the industry. While the Mayo Clinic had employed quality approaches to an extent throughout its history, at the start of the 21st century the organization's leaders drove a system-wide transformation ...

  6. ISO 9001 Case Studies

    Quality Management Systems (QMS) & ISO 9001 - Case Studies. Implementing ISO 9001:2015 can help ensure that customers get consistent, high-quality products and services, which in turn can benefit your organization. The following ISO 9001:2015 case studies offer a look at the difference ISO 9001 can make for organizations in terms of process ...

  7. PDF Johns Hopkins Case Study

    AHRQ Quality Indicators Case Study: Johns Hopkins Health System Key Findings • The Johns Hopkins Hospital worked diligently to improve its performance for Postoperative Respiratory Failure (PSI 11). The effort started back in 2012 when only 30 percent were able to be removed from a ventilator within the desired timeframe.

  8. Reducing the Costs of Poor Quality: A Manufacturing Case Study

Manufacturing firms can incur losses of up to 100% due to costs of poor quality (COPQ) in the form of internal and external product failures, rework, and scrap. The purpose of this single case study was to explore what quality improvement strategies senior ...

  9. Quality Improvement Case Study Repository

    The ACS Quality Improvement Case Study Repository is a centralized platform of quality improvement projects implemented by participants of the ACS Quality Programs. Each of the curated projects in the repository has been formatted to follow the new ACS Quality Framework, allowing readers to easily understand the details of each project from ...

  10. Improving Patient Experience

    In a series of recorded interviews, various quality improvement experts offered advice on what it takes to design and implement programs that lead to better patient experiences. The Case for Improving Patient Experience (11:59). Larry Morrissey, MD, Medical Director of Quality Improvement at Stillwater Medical Group, discusses the value of ...

  11. Case Study: Quality Management System at Coca Cola Company

    The successfulness of this system can be measured by assessing the consistency of the product quality. Coca Cola say that 'Our Company's Global Product Quality Index rating has consistently reached averages near 94 since 2007, with a 94.3 in 2010, while our Company Global Package Quality Index has steadily increased since 2007 to a 92.6 rating in 2010, our highest value to date'.

  12. The contribution of case study research to ...

    Quality improvement; case study; qualitative research; healthcare quality improvement; research; The gap between the knowledge of what works and the widespread adoption of those practices has become a major preoccupation of researchers and a challenge for funders and policy makers.1-3 Recognition of this 'quality chasm' (the term that the US Institute of Medicine used to describe the ...

  13. Quality 2030: quality management for the future

    The study was designed in two steps: (1) a collaborative brainstorming workshop with 22 researchers and practitioners (spring 2019) and (2) an appreciative inquiry summit with 20 researchers and practitioners (autumn 2019). The studies resulted in an agenda for quality management for the future - Quality 2030.

  14. Healthcare Quality Management

    Healthcare Quality Management: A Case Study Approach is the first comprehensive case-based text combining essential quality management knowledge with real-world scenarios. With in-depth healthcare quality management case studies, tools, activities, and discussion questions, the text helps build the competencies needed to succeed in quality management.

  15. Study Quality Assessment Tools

    For case-control studies, it is important that if matching was performed during the selection or recruitment process, the variables used as matching criteria (e.g., age, gender, race) should be controlled for in the analysis. General Guidance for Determining the Overall Quality Rating of Case-Controlled Studies

  16. Case Study: Integrating Strategic Planning and Quality improvement

    This case study demonstrates how the combination of strategic planning, project management, and Lean Six Sigma tools can be used to assess and improve an organization's ability to fulfill its mission and goals. ... Collectively, we are the voice of quality, and we increase the use and impact of quality in response to the diverse needs in the ...

  17. A Case Study on Improvement of Outgoing Quality Control Works for

outgoing quality control works for manufacturing products. Two types of parts were selected for this case study: huge and symmetrical parts. 85.06 seconds total inspection time ...

  18. Continuing to enhance the quality of case study methodology in health

    Some criticize case study for its high level of flexibility, perceiving it as less rigorous, and maintain that it generates inadequate results. 8 Others have noted issues with quality and consistency in how case studies are conducted and reported. 9 Reporting is often varied and inconsistent, using a mix of approaches such as case reports, case ...

  19. Using Case Studies to Enhance the Quality of Explanation and

    1.1 Introduction . In recent years the development policy community has turned to case studies as an analytical and diagnostic tool. Practitioners are using case studies to discern the mechanisms underpinning variations in the quality of service delivery and institutional reform, to identify how specific challenges are addressed during implementation, and to explore the conditions under which ...

  20. Case Study Method: A Step-by-Step Guide for Business Researchers

    The quality of a case study does not only depend on the empirical material collection and analysis but also on its reporting (Denzin & Lincoln, 1998). A sound report structure, along with "story-like" writing is crucial to case study reporting. The following points should be taken into consideration while reporting a case study.

  21. Perspective: State‐of‐the‐Art: The Quality of Case Study Research in

    Assessments of Case Study Quality. Arising from the debates in other disciplines, various tools and approaches have been developed to assess the quality of case study research. Table 1 summarizes the characteristics of a selection of these, which give useful ideas on how case study quality can be evaluated.

  22. Estimation of Urban High-Quality Development Level Using a Three ...

    The high-quality development paradigm, which emphasizes the organic unity of efficiency, equity, and sustainability, has gained increasing global recognition as an extension of the concept of sustainable green development. In this study, we use green development efficiency as a metric of high-quality development and employ a three-stage Stacks-based Measure Model (SBM) in order to assess the ...

  23. Case Studies Show Positive Youth Development Empowers Young Workers

    As a result, they developed a series of three case studies to explore how employers can use positive youth development practices to better support young workers. These case studies highlight discussions from focus groups at Generation Work sites in Chicago and Birmingham and interviews with workforce development practitioners.

  24. Adolescent athletes' sleep problems and overtraining: A case study

    Introduction: Sleep is crucial for athletes' recovery and performance while overtraining can negatively affect sleep quantity and sleep quality. We present a case of a 16-year-old female athlete exploring the reciprocal negative effects of overtraining and sleep problems on each other. Methods: A flyer of a high school cheerleading team with a history of injuries, irregular menses, chronic ...

  25. Investigating water quality and preservation strategies in Abuja's

    The study subsequently focuses on identifying vulnerabilities in the water distribution system and proposing preventive measures. The findings of this research have significant implications for managing drinking water quality in urban distribution networks, particularly in developing countries.

  26. Search Case Studies

    Search Case Studies. With members and customers in over 130 countries, ASQ brings together the people, ideas and tools that make our world work better. ASQ celebrates the unique perspectives of our community of members, staff and those served by our society. Collectively, we are the voice of quality, and we increase the use and impact of ...

  27. Solution-Focused Brief Approach for Caregiver of a Person ...

The researcher used a single-case AB design with pre- and post-assessment methods. The researcher administered Pai and Kapoor's Family Burden Interview Schedule, Brief Cope by Carver et al., and WHO Quality of Life-BREF (Group, 1998). The measurements used in the study are well renowned and have been used in many scientific studies.

  28. Using Cost of Quality to Improve Business Results

    Abstract. CRC Industries uses cost of quality as a key measure for improving business results. Since centering improvement efforts on cost of quality, the company has reduced failure dollars as a percentage of sales and saved hundreds of thousands of dollars. Cost of quality can also be linked to other improvements at CRC Industries, including ...
