Data mining at USED: Updating old systems to inform new policy

January 30, 2024

Kevin Stange approached PhD student Nathan Sotherland in the summer of 2022 with an offer to join him in a unique assignment at the U.S. Department of Education (USED). As part of the first cohort of economists to join the Office of the Under Secretary under the leadership of Jordan Matsudaira, the Department’s first Chief Economist (see An economic eye on equity in higher ed), they would have a rare opportunity to help the Department leverage its vast data to inform better decision-making.

Sotherland jumped at the chance. Beginning that fall, he started work at the Department of Education, splitting his time between Washington and Ann Arbor. He worked alongside members of the cohort (including CJ Libassi, MPP ’15) and with USED staff to assess the breadth of the data and understand the architecture of the systems that had been built.

“During the first half-year, a lot of what we were doing was discovering what data was available, working with USED people who had built it. What did they have? Where could we find it? One of the tricky things about the data was that it wasn’t structured in a way typically used by academics. It lived in tables, and database systems that we don’t use,” he says.

“The structures of the various types of data were different, so we had to understand how we could fit it all together to be able to paint a complete picture,” he says. He had to learn SQL, the data structures, and how to use them to answer the questions the team wanted to address, he adds, noting that each time a new policy question arose, they had to find new ways of accessing the data.

In one particular project, they were looking at college transfer as part of Secretary of Education Miguel Cardona’s “Raise the Bar” initiative. Sotherland’s first output for that project was a memo on individual colleges’ transfer performance. How well do different institutions perform at getting students to transfer from two-year institutions to four-year institutions, and eventually to graduate with a Bachelor’s degree?

They looked at many data points: FAFSA applications (tens of millions of observations), aid disbursements, Pell Grants, student enrollment patterns (such as how long students stayed in school), graduation records, and loan balances – the whole process from application to graduation.

“It was interesting for us and for them. It’s the first public piece, and more will come – hopefully including academic papers,” he says, noting that he is still affiliated with USED so he can have access to the data.

“Jordan is surrounding himself with good people who are great at what they do. The process of working there was wonderful. We have lots of interaction at our weekly meetings and it has been very fruitful for all of us,” he says.

Sotherland is now using the data for his own dissertation, which focuses on postsecondary access and policies that affect college affordability.

“My work to understand the data was a part of my dissertation process; I just didn’t know what the question would be! I spent a lot of time mucking around, and then things started falling into place!”

Now that they have made their first forays into organizing the data, their hope is that subsequent cohorts can accelerate the research process and continue to inform policy decisions.

See also: Dr. Stange Goes to Washington
