Two Years of NCLEX-RN Outcomes at Penobscot College of Nursing

A SQL case study on two years of NCLEX-RN outcomes across nineteen campuses, walked through phase by phase.

At a Glance

This case study takes a 7,635-attempt slice of NCLEX-RN data (every test attempt by every student across 19 campuses of a multi-campus nursing college, over eight quarters from 2024SPQ through 2025WIQ) and walks through how to reason about it. It has four phases: source, schema, exploration, and findings. Each is short enough to read on its own; together they document the full process, from a flat CSV to the SQL patterns that surface the interesting answers. The SQLite database produced at the end of phase 02 is queryable directly in the browser via Datasette Lite, so any reader can re-run every query in the case study.

The institution is anonymized and the outcome values are synthetic. The institution name, region names, and one program-code suffix are replaced for privacy, and the pass and fail outcomes are perturbed so that no published rate matches a real reported figure. The campuses, cohorts, programs, attempt counts, and full retake structure are all real, the analytical patterns are preserved, and every number traces to the published database. The full reasoning, including what the anonymization and perturbation do and do not change, lives in phase 01.

The methodology is the case study’s contribution. The point of the SQL-primary approach is to rebut the assumption that statistical work needs a procedural language: confidence intervals, group comparisons, and counterfactual aggregations are all expressible directly in SQL. R appears only in phase 04, where logistic regression hits the SQL ceiling.
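To make the "expressible directly in SQL" claim concrete, here is a minimal sketch of a per-campus pass rate with a 95% Wilson score interval computed inside a single SQLite query. The `attempts` table, its `campus` and `passed` columns, and the toy rows are assumptions made up for illustration; they are not the case study's schema or data, which phase 02 defines. The Wilson interval is one reasonable choice of interval, not necessarily the one the phases use. Python appears only as a harness to run the query.

```python
import math
import sqlite3

# Hypothetical table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
# Recent SQLite builds ship sqrt() natively; registering it here
# just keeps the sketch portable to builds without the math functions.
conn.create_function("sqrt", 1, math.sqrt)

conn.execute("CREATE TABLE attempts (campus TEXT, passed INTEGER)")
# Made-up toy rows: campus A has 80 passes in 100 attempts, B has 45 in 50.
rows = [("A", 1)] * 80 + [("A", 0)] * 20 + [("B", 1)] * 45 + [("B", 0)] * 5
conn.executemany("INSERT INTO attempts VALUES (?, ?)", rows)

WILSON_SQL = """
WITH counts AS (
    SELECT campus,
           COUNT(*)    AS n,
           AVG(passed) AS p      -- passed is 0/1, so AVG is the pass rate
    FROM attempts
    GROUP BY campus
)
SELECT campus, n, p,
       -- Wilson score interval: (center -/+ half-width) / (1 + z^2/n)
       (p + z*z/(2.0*n) - z * sqrt(p*(1-p)/n + z*z/(4.0*n*n)))
           / (1 + z*z/n) AS lo,
       (p + z*z/(2.0*n) + z * sqrt(p*(1-p)/n + z*z/(4.0*n*n)))
           / (1 + z*z/n) AS hi
FROM counts, (SELECT 1.96 AS z) AS consts   -- z for a 95% interval
ORDER BY campus
"""

results = conn.execute(WILSON_SQL).fetchall()
for campus, n, p, lo, hi in results:
    print(f"{campus}: n={n} rate={p:.3f} 95% CI=({lo:.3f}, {hi:.3f})")
```

The same query, minus the Python wrapper, can be pasted into any SQLite front end whose build includes `sqrt()`, provided the table and column names are adapted to the published schema.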

The Phases

Source

12 mins
Synthetic outcomes, real structure, and the first thread
CSV · Data Quality · Public Data · Nursing Education · Case Study

Schema

10 mins
Three tables, twelve derived columns, one lookup
Python · SQL · SQLite · ETL · Schema Design · Case Study

Exploration

17 mins
Six queries that get the shape of the data
SQL · Datasette · SQLite · Exploratory Analysis · Case Study

Findings

19 mins
Three findings, one R supplement, an honest accounting of what the data can and cannot say
R · SQL · Logistic Regression · Predictive Modeling · Nursing Education · Case Study