Getting Started with Health Statistics
Data vs. Statistics
"Data" and "statistics" are two words that we tend to use interchangeably, yet they refer to two very different things. Data is the raw information from which statistics are created. Statistics, in turn, provide a summary of data. For more about data vs. statistics, check out this guide.
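The distinction can be illustrated with a small sketch in Python. The readings below are hypothetical values invented for illustration, and the cutoff of 130 is an arbitrary choice, not a clinical definition. The list is the data; the dictionary returned by the (hypothetical) summarize function holds the statistics.

```python
from statistics import mean, median

# Hypothetical raw data: one systolic blood pressure reading (mmHg)
# for each of six survey respondents. These values are invented.
systolic_mmhg = [118, 124, 131, 109, 142, 126]


def summarize(values):
    """Reduce raw observations (data) to a few summary figures (statistics)."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        # Share of readings at or above an arbitrary cutoff of 130.
        "share_130_or_higher": sum(v >= 130 for v in values) / len(values),
    }


summary = summarize(systolic_mmhg)
print(summary)
```

Published health statistics are usually of the second kind: the summary is released, while the individual records behind it may be restricted or unavailable.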
Types of Health Data
Health data are gathered from a number of different types of sources. The source, collection methodology, purpose of collection, and limitations should be considered when evaluating and using data and statistics.
Population or household surveys are a main source of health data. One advantage is that, unlike some of the other types of sources listed below, they are not limited to users of health services. The most important household surveys in the United States are described under "Important Household Surveys of Health" below.
Surveys of providers
Surveys of physicians, hospitals and nursing homes can be an important source of information on medical transactions and patients.
Vital statistics are drawn from the records of births, deaths, marriages and divorces and can facilitate detailed analyses of particular conditions, given that cause of death and circumstances of birth are also recorded.
Registers of diseases
These show the incidence, prevalence and outcomes of diseases like cancer and HIV/AIDS.
Other sources include records compiled during a hospital stay or at outpatient clinics or physicians' offices.
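Disease registers report incidence and prevalence, two measures that are easy to confuse. The sketch below uses the standard simplified definitions (new cases over the population at risk in a period, and existing cases over the total population at a point in time). The function names and all counts are hypothetical, chosen only to make the arithmetic visible.

```python
def incidence_rate(new_cases, population_at_risk, per=100_000):
    """New cases arising during a period, per `per` people at risk."""
    return new_cases * per / population_at_risk


def prevalence(existing_cases, population, per=100_000):
    """All existing cases at a point in time, per `per` people."""
    return existing_cases * per / population


# Hypothetical register counts for one region in one year.
new_cases_this_year = 50
cases_alive_at_year_end = 400
region_population = 200_000

incidence = incidence_rate(new_cases_this_year, region_population)
point_prevalence = prevalence(cases_alive_at_year_end, region_population)

print(f"Incidence:  {incidence} per 100,000 per year")   # 25.0
print(f"Prevalence: {point_prevalence} per 100,000")     # 200.0
```

A long-lasting condition can show low incidence but high prevalence, as in this example, which is why it matters which of the two a source is reporting.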
About this Guide
This guide provides links to sources of data and statistics collected and provided by numerous organizations and agencies. You can find potential sources of data and statistics by topic area. Health data and statistics can be difficult to find. Some of the things that make this so challenging are:
- Health data collection is decentralized, and carried out by many different government, non-governmental and private agencies and organizations. This will include organizations working at the national, state and local levels. Data quality, collection methodology and accessibility will vary considerably.
- Data collection and dissemination take time and resources. There is often a lag between collection and availability, and thus real-time data can be difficult to come by.
- Collection of health data in the United States is a fairly recent phenomenon. Thus, finding reliable data prior to 1956, when the National Health Survey was established, will take time and may involve consulting primary resources. You can also search the scholarly literature to find any studies that may have already done this groundwork for you.
Reference: Health Statistics Guide, The University of Chicago Library.
Important Sources of Health Data and Statistics
These are some of the most important players in the collection of health data in the United States and worldwide. You'll find links to more specific topic-oriented resources provided by these organizations amongst the various topic pages in this guide.
U.S. Department of Health & Human Services (DHHS)
DHHS is the umbrella agency under which most national health data and statistics programs operate. These include the CDC, the Agency for Healthcare Research and Quality (AHRQ), and the Substance Abuse and Mental Health Services Administration (SAMHSA), among others.
DataFinder: Topical access to health and human services related data from DHHS, other federal agencies, states and local governments.
Centers for Disease Control and Prevention (CDC)
The CDC is a part of the U.S. Department of Health & Human Services and is the primary federal agency for public health.
- Data and Statistics: This is the starting point for health statistics from the CDC. Browse by topic, view publications, and find links to interactive tools, surveys and more.
- Health Data Interactive: Customizable tables for national health statistics covering topics in health status, health care, conditions, insurance, mortality, life expectancy, birth, pregnancy, risk factors and disease prevention.
- CDC WONDER: A portal to several CDC databases of public health information and numerical data sets, covering topics such as AIDS/STDs, risk behaviors (the Behavioral Risk Factor Surveillance System), and mortality and natality statistics.
National Center for Health Statistics (NCHS)
The NCHS is the nation's principal health statistics agency. It is a unit of the CDC. The NCHS homepage is also a central point for health statistics browseable by topic, links to surveys, publications, and other online tools.
World Health Organization (WHO)
The World Health Organization is an agency of the United Nations and is an international coordinating agency for public health.
Useful WHO databases for data and statistics include the Global Health Observatory (providing national statistics for health indicators), WHO Global Infobase Online (chronic diseases and risk factors), and the Global Health Atlas. Data is also accessible by topical categories.
Important Household Surveys of Health
As mentioned above, population or household surveys collect data from people living in households, regardless of their use of health care services. Household surveys do not generally survey institutionalized populations such as incarcerated people or patients in long-term care facilities.
The most significant household health surveys are:
National Health Interview Survey (NHIS)
The NHIS has been conducted since 1957 by the National Center for Health Statistics [NCHS] and its predecessor agency. The survey collects data from a large representative sample of households in the United States. NHIS is “the principal source of information on the health of the civilian non-institutionalized population of the United States.” It includes data on health status, care, demography and behaviors.
Behavioral Risk Factor Surveillance System (BRFSS)
The BRFSS consists of a series of state-based household surveys conducted by state health departments with technical assistance and support from the Centers for Disease Control and Prevention.
National Survey on Drug Use and Health (NSDUH)
The NSDUH is conducted by the Substance Abuse and Mental Health Services Administration [SAMHSA]. The survey tracks substance abuse and mental health of the non-institutionalized population of the United States.
The Cornell Institute for Social and Economic Research (CISER) is an excellent resource for help with data files, statistical analysis software, and social science and economic data, both public access and restricted. CISER offers numerous workshops at Mann Library on the use of statistical packages and computing. CISER also offers drop-in office hours both at their off-campus location on Pine Tree Road and at Mann Library. CISER is also the source of a large data archive available to Cornell researchers in the social sciences.