Welcome back to Better Know A Dataset! In today’s post, I investigate the career placements of Yale students (“Yalies”) using approximately 10 years of data from Yale Career Services. The dataset I use includes information on each graduating student, their college major, initial employer, industry, and job function. To protect the anonymity of the students, I won’t display the contents of the dataset; however, a typical entry (corresponding to yours truly) looks something like the table below:
As usual, the goal of this blog post is not only to document some stylized facts on Ivy League graduates — including career choice and job — but also to illustrate some new methods of data analysis and LLMs.
Where Have All the Yalies Gone?
To stay loyal to the title[1] of the blog post, let’s start by investigating quite literally (i.e. geographically) where Yalies go. Despite the university’s best attempts to educate a worldly and diverse student body (requiring, for example, at least a year’s worth of foreign language study), the geographic distribution of graduates is remarkably concentrated.
Of the graduates who stay in the US, some 31% end up in New York, followed by Connecticut (13.8%), California (13.4%), Massachusetts (9.6%), and Washington DC (6.1%). Collectively, the bottom 40 states account for a mere 13.7% of graduates.
If anything, analysis at the state level underestimates geographic concentration since students tend to cluster in cities within states. For example, of the 2,562 students who moved to the state of New York post-graduation, 2,498 (97.5%!) moved to New York City. At a global scale, the picture isn’t much different. The top international destinations for Yale graduates are the UK, China, France, Japan, and Germany, but foreign destinations account for only 11% of all destinations.
To summarize succinctly: if you randomly pick a Yale graduate from the past decade, there is
a 1-in-4 chance they moved to New York City
a 1-in-2 chance they moved to the states of New York, California, or Connecticut
a 9-in-10 chance they stayed in the US.
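For readers who want to reproduce this kind of tabulation on their own data, here is a minimal sketch. The function names and the toy records are made up for illustration; this is not the Yale dataset, nor necessarily how the numbers above were computed.

```python
from collections import Counter

def location_shares(locations):
    """Return {location: share of graduates}, ordered from largest to smallest share."""
    counts = Counter(locations)
    total = sum(counts.values())
    return {loc: n / total for loc, n in counts.most_common()}

def top_k_share(shares, k):
    """Combined share of the k most common locations."""
    return sum(sorted(shares.values(), reverse=True)[:k])

# Made-up toy records (NOT the Yale data): one state code per graduate.
toy_states = ["NY"] * 5 + ["CT"] * 2 + ["CA"] * 2 + ["TX"]
shares = location_shares(toy_states)
print(shares)
print(top_k_share(shares, 3))
```

The same two functions work at any level of geography (state, city, country), as long as each graduate contributes exactly one label.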
Gone to McKinsey…
In Liar’s Poker, Michael Lewis notes that forty percent(!) of Yale’s graduating class of 1986 applied to one investment bank, First Boston, alone. While I don’t have data on applications (and First Boston no longer exists), we can ask about the common employers of Yale graduating classes of 2014-2024:
Perhaps somewhat surprisingly, the top employer of Yale graduates is…Yale itself. What does it mean that so many students graduate from Yale only to work there? While a select few list job functions of “project management” and “administrative services”, it is not the case that most of these students are contributing to the growing bureaucracy of the US academy. Instead, the majority (56%) of these 194 students are staying as research assistants — at labs, for professors, etc. Indeed, most of the students who have stayed at Yale have majored in the sciences or psychology.
Employers 2 through 30 more closely reflect the overarching narrative. The biggest category among these top employers (after Yale itself) is management consultants (McKinsey, Bain, and BCG), followed by investment banks (Goldman, Morgan Stanley, etc.) and then technology companies (Google, Microsoft, Amazon, etc.). Lower down, there’s a smattering of additional financial services ranging from accountants (Deloitte) to economic consultants (Analysis Group, Cornerstone) to buy-side investors (Blackstone, Jane Street, etc.). One should be careful about concluding, based on top employers alone, that more students go into consulting than finance. Unlike finance, consulting is highly concentrated in a small number of large firms. If we plot the share of students in different job functions over time, the primacy of finance over consulting is a little clearer.
Is it possible that the percent of Yale seniors who applied to McKinsey in the past decade is anywhere close to the percent that applied to First Boston back in 1986? Let’s do a quick back-of-the-envelope calculation. Suppose that McKinsey accepts p% of Yale applicants. Since 1.6% of Yale graduates end up at McKinsey, we would need[2]
1.6% = P(Offer | Applied) × P(Applied) = p × P(Applied), i.e. P(Applied) = 1.6% / p.
If we want P(Applied)>40% then we need p<4%. In other words, if the acceptance rate of Yale applicants to McKinsey is below 4% (which seems plausible to me?), then perhaps McKinsey is the new First Boston.[3]
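The back-of-the-envelope calculation is easy to play with in code. In the sketch below, the 1.6% placement share comes from the data above; the acceptance rates are purely hypothetical inputs, and the function name is mine.

```python
def implied_application_rate(placed_share, acceptance_rate):
    """P(Applied) = P(Placed) / P(Offer | Applied), assuming every offer is accepted."""
    if not 0 < acceptance_rate <= 1:
        raise ValueError("acceptance_rate must be in (0, 1]")
    return placed_share / acceptance_rate

PLACED_SHARE = 0.016  # 1.6% of graduates end up at McKinsey (from the post)

# Hypothetical acceptance rates; the true rate is not in the dataset.
for p in (0.02, 0.04, 0.08):
    rate = implied_application_rate(PLACED_SHARE, p)
    print(f"acceptance rate {p:.0%}: implied share applying {rate:.0%}")
```

Note that any assumed acceptance rate below 1.6% would imply that more than 100% of the class applied, so the calculation only makes sense for acceptance rates at or above the placement share.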
…Every One?
Coupled with the previous section, the portrait of Yalies we have so far isn’t too interesting. We send ≈25% to a single city; ≈15% to a single industry; and potentially have ≈40% apply to a single company. This uniformity of experience made me wonder whether there are certain types of Yalies that have a bigger diversity of experience than others.
To develop a working definition of “diversity of experience”, consider trying to predict the industry a graduate ends up in, knowing only some small detail about her — say, her college major. Some majors may send a lot of students to one industry, in which case such a guess will be fairly easy, while majors that send graduates to a more dispersed set of industries offer a less uniform and predictable experience. This is the notion of “diversity of experience” that I will try to measure.
Formally, let’s define a new measure that looks at how concentrated job functions are within each college major. For each major, I compute the share of students in that major who go into each job function, and then I add up the squared shares. For example, suppose there are two industries, “finance” and “other”. If 90% of economics majors go into “finance”, and 10% go into “other”, then industry concentration among economics majors is 0.9^2+0.1^2=0.82. If, say, Spanish majors send 50% to “finance”, and 50% to “other”, then the concentration among Spanish majors is 0.5^2+0.5^2=0.5. Since 0.82>0.5, economics is “more concentrated” than Spanish, and indeed it is easier to predict where an economics major ends up. In economics, this formula is known as the Herfindahl-Hirschman Index or HHI, and is used to measure market concentration.
I construct this measure for all majors with at least 40 students, and across all job functions. The table below shows the results for the least “diverse” Yale majors:
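The two-industry example can be checked with a few lines of Python. This is a minimal sketch of the sum-of-squared-shares formula; the function name and the ten-student toy lists are illustrative assumptions.

```python
from collections import Counter

def hhi(outcomes):
    """Herfindahl-Hirschman Index: the sum of squared shares of each distinct outcome."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one outcome")
    return sum((n / total) ** 2 for n in counts.values())

# Toy versions of the example above: 10 students per major.
economics = ["finance"] * 9 + ["other"] * 1   # 90% / 10%
spanish = ["finance"] * 5 + ["other"] * 5     # 50% / 50%

print(round(hhi(economics), 2), round(hhi(spanish), 2))
```

With n possible outcomes the index ranges from 1/n (students spread evenly) to 1 (everyone in one outcome), so higher values mean a more predictable major.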
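A sketch of how the per-major ranking might be built, assuming the data is available as (major, job function) pairs. The function name, the `min_students` parameter, and the toy records are assumptions for illustration, not the code behind the table.

```python
from collections import Counter, defaultdict

def hhi_by_major(records, min_students=40):
    """records: iterable of (major, job_function) pairs.

    Returns [(major, hhi)] sorted from most to least concentrated,
    keeping only majors with at least `min_students` graduates.
    """
    by_major = defaultdict(list)
    for major, job_function in records:
        by_major[major].append(job_function)

    results = []
    for major, jobs in by_major.items():
        if len(jobs) < min_students:
            continue  # too few students for a stable estimate
        total = len(jobs)
        score = sum((n / total) ** 2 for n in Counter(jobs).values())
        results.append((major, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Made-up toy records (NOT the Yale data).
toy = (
    [("Major A", "finance")] * 30 + [("Major A", "consulting")] * 10
    + [("Major B", f"function {i % 4}") for i in range(40)]
    + [("Major C", "law")] * 5  # dropped: fewer than 40 students
)
ranking = hhi_by_major(toy)
print(ranking)
```

The size cutoff matters because a tiny major can look perfectly concentrated by chance: five students who all go to law school would score an HHI of 1.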
Among not-so-diverse majors, the economics-to-finance pipeline leads the pack. But engineering-to-engineering and theater-to-theater are close behind. I’m also highly amused at the economics “double majors” (economics and history; economics and global affairs) to consulting pipeline.
By contrast, the Yale majors with the highest diversity of experience are those for which the job function is hardest to predict given the major. Here psychology and environmental studies lead, but many humanities feature as well (humanities, history, philosophy). Also of note: the most common post-graduate job function for philosophy majors is Law!
Young Girls Have Picked Them, Every One
The analysis so far has remained silent on the demographics of students, and gender in particular. But a large literature in economics studies the extent to which the wage gap can be explained by such differences as college major and initial career. Following that line of inquiry, we might ask: what [majors, careers, industries] have girls picked, and how do they differ from what male graduates have picked?
I don’t have data on gender or salaries. But what I do have is access to a name-gender probability dictionary, and a subscription to a ChatGPT API. I first join the dataset on career outcomes to a name-gender dictionary to obtain a prediction of each student’s gender based on his or her first name. Then I deliver the following prompt to the gpt-3.5-turbo LLM API:
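The name-to-gender join can be sketched as a simple dictionary lookup. Everything here is an assumption for illustration: the dictionary entries, the 0.5 threshold, and the choice to leave unmatched names as `None` are mine, and the real dictionary may be structured differently.

```python
def predict_gender(first_name, name_to_p_female, threshold=0.5):
    """Return 'F', 'M', or None when the name is missing from the dictionary."""
    p_female = name_to_p_female.get(first_name.strip().lower())
    if p_female is None:
        return None
    return "F" if p_female >= threshold else "M"

# Hypothetical dictionary entries: P(female | first name).
name_to_p_female = {"emily": 0.99, "james": 0.01, "jordan": 0.35}

# Made-up students (NOT the Yale data).
students = [{"first_name": "Emily"}, {"first_name": " James "}, {"first_name": "Xanthe"}]
for student in students:
    student["predicted_gender"] = predict_gender(student["first_name"], name_to_p_female)

print(students)
```

Ambiguous names (like the made-up "jordan" entry) get a hard label here; an alternative would be to carry the probability forward and weight each student by it.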
What is the expected salary of a recent Yale graduate working for the following employer, industry, and job function:
employer: {employer}
industry: {industry}
job function: {job_function}
please return your answer as:
low: [low]
high: [high]
I iterate through all 9,778 students in the dataset (skipping over students who directly attend graduate school) and collect estimated salaries as the midpoint of the low and high estimates.[4] For the couple of jobs I hand-checked (including my own) ChatGPT seemed relatively accurate.
We can now do exercises like breaking down each college major by (ChatGPT-imputed) average salary and gender proportion. The size of the dots corresponds to the number of students in the major.
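Here is a minimal sketch of the non-API part of that loop: filling in the prompt and turning a reply into a midpoint. The prompt text is the one above; the parsing rules (dollar signs, commas, a trailing "k") are my assumptions about what replies might look like, and the call to the model itself is deliberately left out.

```python
import re

PROMPT_TEMPLATE = (
    "What is the expected salary of a recent Yale graduate working for the "
    "following employer, industry, and job function:\n"
    "employer: {employer}\n"
    "industry: {industry}\n"
    "job function: {job_function}\n"
    "please return your answer as:\n"
    "low: [low]\n"
    "high: [high]"
)

# A number such as 60000, $60,000, 62.5k or 65K.
_AMOUNT = r"\$?\s*(\d[\d,]*(?:\.\d+)?)[ \t]*(k\b)?"

def _parse_amount(reply, label):
    """Extract the dollar amount that follows 'label:' in the reply, or None."""
    match = re.search(label + r"\s*:\s*" + _AMOUNT, reply, flags=re.IGNORECASE)
    if match is None:
        return None
    value = float(match.group(1).replace(",", ""))
    return value * 1000 if match.group(2) else value

def salary_midpoint(reply):
    """Midpoint of the low and high estimates, or None if either is missing."""
    low = _parse_amount(reply, "low")
    high = _parse_amount(reply, "high")
    if low is None or high is None:
        return None
    return (low + high) / 2

# Hypothetical job and reply, for illustration only.
prompt = PROMPT_TEMPLATE.format(
    employer="Example Corp", industry="Consulting", job_function="Analyst"
)
print(prompt)
print(salary_midpoint("low: $60,000\nhigh: $80,000"))
```

Returning `None` for unparseable replies makes it easy to count how often the model ignored the requested format, rather than silently treating those students as zeros.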
Immediately, we notice a fairly strict positive relationship between the proportion male and the estimated average starting salary, i.e. males sort into higher salary majors. Some majors on the higher paying side that are not predominantly male include global affairs and biomedical engineering.
Using this data, we can also take an incredibly-back-of-envelope stab at analyzing the extent to which gender wage differentials are associated with sorting into college majors vs sorting into jobs conditional on major. Let Y_k^F denote the average salary (Y) for females (F) in major k, and let w_k^F denote the percent of females that major in subject k (and likewise Y_k^M and w_k^M for males). Writing Y^M and Y^F for the overall average salaries, we can express the unconditional wage differential as:
Y^M − Y^F = Σ_k w_k^M Y_k^M − Σ_k w_k^F Y_k^F
In our dataset, the average imputed male salary is $73,834 and the average imputed female salary is $67,942, for a wage differential of $5,892. Doing a little bit of algebra, we can rewrite the formula as:[5]
The first term holds the average salary constant, and asks how much of the difference is due to the different percent of each gender that selects a given major (the “across major” difference). The second term holds the percent in each major constant, and asks how much of the gap is due to different salaries within a given major (the “within major” difference).
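To make the two terms concrete, here is a sketch on made-up numbers. One caveat: the split can be written with either gender as the reference group, and the version below (male salaries in the across-major term, female major shares in the within-major term) is an assumption that may not match the weighting used for the numbers in this post. All names and values are hypothetical.

```python
def decompose_gap(w_m, w_f, y_m, y_f):
    """Split the male-female gap in average salary into across- and within-major terms.

    w_m[k], w_f[k]: share of male / female graduates in major k (each sums to 1)
    y_m[k], y_f[k]: average salary of male / female graduates in major k

    Weighting choice (an assumption): the across-major term uses male salaries,
    the within-major term uses female major shares.
    """
    majors = list(w_m)
    across = sum((w_m[k] - w_f[k]) * y_m[k] for k in majors)
    within = sum(w_f[k] * (y_m[k] - y_f[k]) for k in majors)
    total = sum(w_m[k] * y_m[k] for k in majors) - sum(w_f[k] * y_f[k] for k in majors)
    return total, across, within

# Made-up toy numbers (NOT the Yale data).
w_m = {"economics": 0.6, "english": 0.4}
w_f = {"economics": 0.4, "english": 0.6}
y_m = {"economics": 90_000, "english": 60_000}
y_f = {"economics": 88_000, "english": 59_000}

total, across, within = decompose_gap(w_m, w_f, y_m, y_f)
print(f"total gap {total:,.0f} = across {across:,.0f} + within {within:,.0f}")
```

The two terms always add up to the total gap, whichever reference group is chosen; what changes with the choice is how the total is divided between them.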
Plugging in numbers, we get:
Or, about 6.9% ($405/$5,892) of the starting wage gap is due to differences in job conditional on major, while the remaining 93% is due to sorting across majors.
Oh, When Will They Ever Learn?
The last variable I will analyze is graduate school matriculation. Here the numbers are quite high: a full 18% of students go directly from Yale to graduate school. Since many students attend graduate school only after working for several years, this 18% is a vast underestimate of the fraction of Yalies that ever pursue further education.
Not surprisingly, Yale University is the top destination for those pursuing graduate school directly after graduation. University of Cambridge, Stanford, Harvard, Oxford, and Columbia follow. 103/1758 (about 6%) go directly to law school; virtually no one goes directly to business school; and 211/1758 (about 12%) go directly to medical school.
Finally (and at the risk of jinxing my own future) I peeked at the dataset containing the career outcomes of Yale PhD students. Of the 42 economics PhDs, 6 went to industry (all economic consulting), 4 to government and the remaining 32 to academia. But this sample omits Yale School of Management graduates, who are more likely to study finance and have broader industry options.
And the top employer of Yale PhDs, after Yale and Harvard University? You guessed it — McKinsey.
Conclusion
Takeaways
Most Yalies end up remarkably geographically concentrated (in New York City, San Francisco, etc.). Only about 11% move internationally, and living outside major cities is rare.
The top employer of Yale graduates is in fact Yale itself, but consulting (McKinsey) and banking (Goldman) lead private sector placements. Some 18% of graduates head straight to graduate school, primarily medical school.
College majors at Yale differ in the homogeneity of experience they offer. The economics major is the most homogeneous (driven by the economics-to-finance pipeline), while many of the humanities (philosophy, history, etc.) are the least homogeneous, in the sense of having the largest diversity of job outcomes.
There is a strong positive correlation between the average estimated starting salary of a major, and the percent of that major that are male. Back-of-envelope decompositions suggest that selection across majors is a more prominent driver of gender wage differentials than within-major differences in starting job functions.
Further Research Questions
Time Trends: The analysis above pooled all the years together. But we might be curious about time trends, e.g. how did major choice change around Covid (2020-2022) or around major events (e.g. the Russian invasion of Ukraine)?
Current vs. Future Jobs: Since the dataset only contained information on initial placements, it was not possible to look at career outcomes several years out. One avenue of exploration is to scrape LinkedIn data and analyze which professions are “sticky”.
Summer Internships: Another dataset I have (but didn’t analyze) contains outcomes on summer internships. Can we predict which students get return offers? How does choice of summer internships affect choice of full time job and location?
[1] If it wasn’t clear by now, the title/subheadings are a reference to the Peter, Paul and Mary classic, “Where Have All the Flowers Gone?” I appreciate the reader’s patience as I force the blog post sections to correspond to the lyrics.
[2] To be rigorous, we should also have P(Accept Offer | Offer ∩ Applied) on the right-hand side (since the 1.6% is people who have accepted), but I’m assuming that ≈100% of those who get offers go.
[3] Strictly speaking, the new First Boston is Credit Suisse, which acquired the troubled bank in 1988 after First Boston’s misplaced investment in Sealy mattresses.
[4] For those wondering, this took about 1 hour to run and cost me about 47 cents in OpenAI credits.
[5] A more formal version of this test in the economics literature is known as the Blinder-Oaxaca group decomposition.
Hi Marrow, thank you for the excellent post! The decomposition approach is widely used and highly effective for breaking down and understanding the components of the total difference.
I have 2 questions out of curiosity:
1. For the 10% of Yale graduates who went directly to graduate school, how do their starting salaries compare to those of their peers who went straight into the industry? Do they tend to earn more after completing their graduate programs?
2. I came across an interesting pattern in Singapore where students of Chinese ethnicity (though they changed their nationality to Singaporean citizens from Chinese) seem more likely to pursue graduate studies than native-born Singaporean students. This trend could come from my selection bias because of my limited social connections. But, do you see a similar trend at Yale?
Wow maybe I am just like all the other girls