Harvard Data Science Review • Issue 3.1, Winter 2021
The People Machine: The Earliest Machine Learning?
Francine Berman¹,²,³
Jill Lepore⁴
¹ Manning College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, Massachusetts, United States of America
² Berkman Klein Center for Internet and Society, Harvard University, Cambridge, Massachusetts, United States of America
³ Department of Computer Science, School of Science, Rensselaer Polytechnic Institute, Troy, New York, United States of America
⁴ Department of History, The Faculty of Arts and Sciences, Harvard College, Cambridge, Massachusetts, United States of America
The MIT Press
Published on: Jan 29, 2021
DOI: https://doi.org/10.1162/99608f92.87b0ec26
License: Creative Commons Attribution 4.0 International License (CC-BY 4.0)
ABSTRACT
In September 2020, the Harvard Data Science Initiative (HDSI) invited Jill Lepore and Fran Berman to a
special event to talk about data science and the Simulmatics Corporation, the focus of Jill's new book If Then:
How the Simulmatics Corporation Invented the Future. Jill’s book tells the compelling story of one of the first
corporations to use the power of digital data to both understand and illuminate the world around us, as well as
to manipulate the population and skew behavioral outcomes, a kind of power hyped as “the People Machine”
by Simulmatics’ PR team. The conversants, data scientist Fran Berman and writer and historian Jill Lepore,
had met as Radcliffe Fellows in 2019 and are both passionate about the societal impacts of technology. They
were delighted to carry on their conversation, started at Radcliffe, about data, society, information, truth and
trust at the HDSI event. This piece is an edited and streamlined version of their HDSI discussion, whose live
video recording can be found below.
Keywords: data analytics, behavioral science, ethics, simulation, elections, democracy
Fran Berman (FB): The Simulmatics Corporation, born in 1959 and dead by 1970, was a window to today, a
tech-powered world in which data is everywhere and drives virtually everything. Jill's book, If Then: How the Simulmatics Corporation Invented the Future, describes the extraordinary history of the company, one of the first
companies to automate, simulate, and predict human behavior for commercial purposes. The book explores the
company's successes and ultimate failure in the 1960s and its lessons for us now... History and tech, tech and
culture—there's a lot to explore here. Jill and I were Fellows in 2019-2020 at the Radcliffe Institute for
Advanced Study. She was finishing the book and working on other projects and my work explored the social
and environmental impacts of the Internet of Things. It was a match made in heaven, and at Radcliffe. It's
wonderful to continue our wide-ranging conversations today, courtesy of the Harvard Data Science Initiative.
I’ll start by asking Jill to talk a little bit about the book and how she came to write it.
Jill Lepore (JL): Sure. First, I want to thank the Harvard Data Science Initiative for the invitation to this event
and everyone who's come out on Friday afternoon for this conversation, but especially Fran. It is such an honor
and a real treat to be in conversation with you. It is one of the remarkable things about a place like Radcliffe—
this is exactly the kind of conversation such an institute is hoping to cultivate. I'm really grateful, and I know
the book also benefited from the conversations that you and I had over lunches before everybody had to go
home last year. Thank you for doing this.
Jill Lepore in conversation with Fran Berman
The book tells the story of the Simulmatics Corporation. I came across the story in 2015. I just needed a
paragraph for an essay I was writing on the history of the polling industry, because it had become clear to me
that polling was being replaced by data analytics and political prediction companies, and I needed to know,
when did that happen? In a journal article, I came across a fleeting mention of the Simulmatics Corporation
and its role in the 1960 election, working for the John F. Kennedy campaign. Historians look for evidence in
the archives, so I decided to look for the archives of this company. Simulmatics, founded in 1959 and
bankrupted by 1970, was a very small company but it had a significant history and was quite self-conscious
about its own importance. I couldn't find the papers anywhere; its archives had vanished. But I did find a
wealth of material at MIT, in the papers of the head of the company’s Research Board. He had been a political
scientist at MIT. And so, I went diving through those, and I found that not only had Simulmatics pioneered the
work of election simulation in 1960, but they'd gone on to undertake a series of projects—most of which, at
some level, failed—but which all share the same ambition, which was that you could use computer technology
to predict human behavior. And then you could sell that as a product to other entities. They formed a
corporation, went public in 1961, raised a fair amount of money, and had some really interesting clients over
the course of the 1960s. Until I found Simulmatics, I hadn’t understood how we got from Cold War behavioral
science to Facebook and social media. This company is a missing link.
FB: I found that really fascinating about the book, and, truth be told, I hadn’t known about Simulmatics before
I read the book. In many ways, Simulmatics was truly a company ahead of its time: Its mass cultural model
predated Amazon. Its relationship and its partnership with the New York Times launched data journalism and
data-driven election predictions. Its work to assess and predict voter behavior in the 1960s predated Cambridge
Analytica’s work to predict and manipulate voter behavior in 2016. Its work with the Defense Department
during the Vietnam War promoted the notion of war simulation to optimize war reality. All of this, from our
perspective, was way ahead of its time. Yet Simulmatics’ technologies could not meet its ambitions. Its
leadership was myopic, its science was sloppy, and it ultimately failed to deliver. Its problems were both of its
time and of our time.
As a historian, what do you see as the lessons learned from the Simulmatics experience? Do we have greater or
fewer controls at this point with which to rein in companies than we did then?
JL: That's a really interesting question. I think we have fewer rules, and among the reasons is Simulmatics
itself. Simulmatics scientists were quite brilliant, and they were extremely well-intentioned. These weren’t
nefarious people; they weren't trying to destroy institutions: they were just trying to figure out how to use these
new tools. After Kennedy won in 1960, the Simulmatics Corporation claimed credit for his victory. They said
that he had done everything that they had advised him to do, and then he won, and, therefore, he won because
of their advice. They took credit, without demonstrating that they deserved credit, and this really annoyed the
Kennedy campaign and incoming administration. American newspaper editorials condemned Kennedy for
having used this tool.
This was partly because—in another kind of resonance with our day—there was tremendous anxiety about the
future of work in the late 1950s and early 1960s, because of automation. Kennedy—because Democrats are the party of labor—had campaigned on it, had made it a plank in his platform, and was promising all these job retraining programs and job subsidies for people displaced by automation. So, not only was there a general fear
of computers controlling our minds—which was just a cultural anxiety in the 1950s and early 1960s—but there
was a specific fear that this guy who had run against automation was being controlled by a giant robot. I was
fascinated to see how clear-cut it was to people at the time that this was unethical. There had been some
consideration of whether or not it was in fact illegal. Today, we no longer question that sort of thing.
FB: Is it the workforce issue that's relevant here, i.e., who is doing the predicting? Why would they be any
more annoyed that the predictions came from behavioral science and machines than if they came from a really
spot-on ad agency that gave them the exact same advice?
JL: Sure, but remember the first big ad agency-driven presidential campaign took place in 1952, when the
Eisenhower campaign hired Rosser Reeves to write television spots. Eisenhower was the first presidential
candidate to appear in his own TV ads. That was extraordinarily controversial, so controversial that the
campaign of his rival Adlai Stevenson, the Democrat, dubbed Eisenhower’s campaign ‘the Cornflakes
campaign.’ Stevenson’s campaign went to the FCC (Federal Communications Commission) and said, ‘This has
got to be illegal. You cannot have a presidential candidate hiring an ad campaign to write TV spots that are like
those for toothpaste and laundry detergent. Like, that's just got to be illegal. That's got to be a violation of
either the FCC or the FEC (Federal Election Commission) in some way. This can't be allowed.' And they didn't
get anywhere with that, but in any case, Stevenson said, ‘This is the problem with American politics,
this kind of crap. This will destroy the country.’ By 1960, only eight years later, there’d been a kind of tacit
acceptance of using advertising campaigns, and, of course, also the advice of pollsters, which presidents had
been using since the 1930s. But there was, nevertheless, a lot of anxiety about it and, on top of that, just a
tremendous cultural anxiety about computers.
Consider, for example, the 1957 film Desk Set with Spencer Tracy and Katharine Hepburn. Tracy plays an MIT
systems engineer and Hepburn runs a fact-checking department, and Tracy is going to be installing this giant—
it’s supposed to be like a UNIVAC—it’s called EMERAC, a giant room-sized computer in her department. It’s
a screwball comedy romance between the two of them, but the whole movie is about the anxiety of the
displacement of people by machines. If you walk that to its logical conclusion, if John F. Kennedy could hire a company that could simulate the election, predict the outcome, and then tailor his message in order to achieve the desired outcome, then why, at some point, even bother with voting? That's the anxiety, that you could displace the voters themselves. I think we still need to be asking that question.
FB: Yes! We still have exactly the same anxiety.
JL: Right. If voters are just tools, if we're just pulling the lever while being marched around to do the bidding of an algorithm, then I think to many people, that's how things feel.
FB: I want to get back to that. Your comments remind me of many of the great discussions on technology and
society we have at the Berkman Klein Center for Internet and Society at Harvard. For all my techie friends and
colleagues, I wanted to ask you about the kind of tech the Simulmatics Corporation actually did. The popular
press called the data, software, and hardware that Simulmatics used ‘the People Machine.’ You write in the
book: “The machine, crammed with microscopic data about voters and issues, could act as a macroscope. You
could ask it any question about the kind of move that a candidate might make, and it would be able to tell you
how voters, down to the tiniest segment of the electorate, would respond.”
Let's talk about how this actually worked. How do we characterize their computers, data, and models in today's
terms? Were the programs data-driven simulations? Did the models utilize machine learning as we would
recognize it or anything that would resemble today's AI? How were their models vetted? Where did the data
come from?
JL: This is a really interesting question, and I spent some time with this because calling what they did ‘the
People Machine’ was the work of their PR guy, the Director of Public Relations for the company—who in fact
wrote an essay for Harper's Magazine about the company without revealing that he was the PR guy for the
company. So, a lot of it is just boosterism and flim-flam. What even is a “People Machine”? It’s just a program
written in FORTRAN, and a bunch of data. The “People Machine”—that's just boosterism. And, to be fair to the scientists who worked on the project, a lot of them were deeply troubled by the way the work was
characterized by their PR. One of them was Robert Abelson, a quite distinguished Yale scholar, who had been
one of the founders of the company. After Ed Greenfield, the president of the company, appeared on CBS
Radio and took credit for Kennedy’s victory and said, ‘Our People Machine can simulate anything,’ Abelson
wrote basically a letter of resignation. He said, ‘If you do that again, I'm out. Like, I can’t be answerable to this
nonsense. We did not do this, we did not do that. And we can't do this. We can't do that. We ran this one
program, man, calm down.’
It’s useful to remember just what a small operation this really was. Simulmatics never owned its own
computers, which is sometimes hard for people to believe, right, but there just were not even a lot of computers
around. It’s 1959. There was at that point one computer at MIT—I think it was an IBM 704—for all of the
New England schools that were in a consortium to use it. You could never get time on it. They didn't have time-sharing—this was before time-sharing—so you had to get a slot to do anything. The New York office used IBM
machines at the IBM Service Center in New York. IBM world headquarters was in downtown New York, but
also had a service center where you could rent time on—I think probably a 704. Most of what they were doing
was collecting and preparing data. I tried at some point in the book—I worked really hard to find a fact that
could communicate to the reader the speed of an IBM 704 against, like, an iPhone 6. It’s just laughable. I
mean, I could have done their entire project while I spoke this sentence, but it took them years. So you might
wonder: why elections?
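For scale, here is a back-of-the-envelope version of that comparison in Python. Both throughput figures are order-of-magnitude assumptions for illustration (the 704 is commonly credited with roughly 10^4 operations per second; a 2014-era phone sustains 10^9 or more); they are not numbers from the book or the talk.

# Rough, order-of-magnitude comparison of an IBM 704 with a modern phone.
ibm_704_ops_per_sec = 1e4   # assumed throughput of the IBM 704
phone_ops_per_sec = 1e9     # assumed, deliberately conservative phone figure

speedup = phone_ops_per_sec / ibm_704_ops_per_sec
print(f"rough speedup: {speedup:,.0f}x")  # ~100,000x

# A job that occupied the 704 for a full year would take, on these figures:
seconds_per_year = 365 * 24 * 3600
print(f"one 704-year is about {seconds_per_year / speedup:.0f} phone-seconds")  # ~315 s

On these assumptions, a year of 704 time collapses to a few minutes of phone time, which is roughly the order of magnitude behind 'while I spoke this sentence.'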
After the Second World War, social scientists who were interested in quantification wanted to figure out how to
do predictive work because that's what the government wanted. They wanted predictive social science in order
to wage a war of ideas against communism. They needed to think about how the ideas get into people's heads
and how to change their minds. That was what the Cold War was about: don't believe in communism, come
with us! Defect, be capitalist! So, it was all about how to figure out messages and how to send them to people.
Now, the Ford Foundation, the funders of behavioral science, wanted to be able to predict how messages would affect people. Well, what's the best single way to begin studying that problem? It's voting, because voting
generates its own data. You have election returns every two years. And then you could match that up with
census data. And then we also have pretty extensive public opinion data. So, these guys figured out, all right,
here's a way to test whether we can make a predictive model of human behavior. We will predict how people
will vote based on how they voted in the past and what public opinion polls tell us about them, but you don't
have information down to the named individual level. You have aggregate data about a county or people who
are registered as Republicans in a particular precinct, say.
They called it “massive data”: their original project was the biggest social science research project up to that
point. What they did was very clever. They wrote to Gallup and to Elmo Roper, the pollster. And they said,
‘Can we have all your old punch cards if you haven't thrown them away? Can we have your punch cards from
all of your previous surveys?' And so, Gallup and Roper said okay, and Simulmatics got all these punch cards
from public opinion surveys, beginning in 1952, and then they had all the election returns and the census data.
They compiled it all and aggregated it in such a way so that, if one poll asked a question like ‘Do you support Eisenhower's urging us to be more involved in Korea?’, and another poll asked, ‘Do you think Eisenhower should take a stronger stance against communism?’—they would somehow treat those as one poll about a stronger stand on communism, because the different pollsters asked somewhat different questions.
They had to work to standardize the data. They took all the voters and came up with 480 possible voter types,
like ‘New England Catholic white woman who voted for Kennedy’—that’d be a type. And then they took all
the issues on which people had been questioned and reduced them to 52 issue clusters. And then from this, they
constructed an imaginary population of 3,000 possible individuals on which they could test how, if you emphasized one issue, people would change their opinions about other issues.
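To make that pipeline concrete, here is a minimal Python sketch of the kind of aggregation just described: punch-card responses reduced to voter types and issue clusters, and an imaginary population used to read off predicted support. Only the figures 480, 52, and 3,000 come from the conversation; the synthetic data, the question-to-cluster mapping, and the readout rule are hypothetical stand-ins, not a reconstruction of Simulmatics' actual FORTRAN program, and the sketch leaves out the cross-issue dynamics mentioned above.

import random
from collections import defaultdict

random.seed(1960)

N_TYPES, N_CLUSTERS = 480, 52  # voter types and issue clusters, as in the talk

# Each poll respondent is reduced to (voter_type, issue_cluster, response).
# In reality, differently worded Gallup and Roper questions would first be
# mapped by hand onto a shared cluster (e.g., two Korea/communism questions
# treated as one 'stronger stand on communism' cluster).
def simulated_punch_cards(n_cards=10_000):
    for _ in range(n_cards):
        yield (random.randrange(N_TYPES),     # e.g., 'New England Catholic woman...'
               random.randrange(N_CLUSTERS),  # e.g., 'stronger stand on communism'
               random.random() < 0.5)         # agree / disagree (synthetic)

# Aggregate: for each (type, cluster) cell, tally agreement.
counts = defaultdict(lambda: [0, 0])          # cell -> [agree, total]
for vtype, cluster, agrees in simulated_punch_cards():
    counts[(vtype, cluster)][0] += agrees
    counts[(vtype, cluster)][1] += 1

def support(vtype, cluster):
    agree, total = counts[(vtype, cluster)]
    return agree / total if total else 0.5    # no data for a cell -> coin flip

# The 'imaginary population': 3,000 individuals drawn from the 480 types.
population = [random.randrange(N_TYPES) for _ in range(3000)]

def predicted_support(cluster):
    """Predicted share of the imaginary electorate agreeing on one cluster."""
    return sum(support(v, cluster) for v in population) / len(population)

print(f"predicted support on cluster 7: {predicted_support(7):.1%}")

The point of the 480-by-52 structure is that a candidate's question ('what happens if I emphasize this issue?') becomes a lookup and a weighted sum over voter types, which a 1960 machine could manage once the raw punch cards had been boiled down.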
FB: So your mileage kind of varied, then.
JL: Your mileage varied! There is actually a kind of hilarious parallel—if anyone has ever seen a 1947 film called Magic Town, starring Jimmy Stewart, in which he plays a pollster who's not making any money until he comes
across this town of 3000 people that happens to be a perfectly mathematically accurate representation of the
entire American electorate, so he just moves there. And instead of doing polls, he just walks around town and
asks people about issues, and then he reports that in his weekly column, and he's always right because he's
found that magical 3000 people. That's what Simulmatics is trying to do, but mathematically.
FB: You might have to give us a movie list to go with the book! Let’s talk about data science. You don’t have a
lot of good things to say about data scientists and data science, both in the book and in a great article you just
published in Nature, where another one of our Radcliffe Fellows, the amazing Jo Baker, is a senior editor.
Your summary of data science is thoughtful and it's worth us discussing. Here's what you say in the book: “In
the 2010s, a flood of money into universities attempted to remake the study of data with data science initiatives, data science programs, data science degrees, data science centers. Much academic research that fell under the label ‘data science’ produced excellent and invaluable work across many fields of inquiry, findings that would not have been possible without computational discovery. And no field should be judged by its worst
practitioners.
Still, the shadiest data science, like the shadiest behavioral science, grew in influence by way of self-
mystification, exaggerated claims, and all-around chicanery, including fast-changing, razzle-dazzle buzzwords
from ‘big data’ to ‘data analytics.’ Calling something ‘AI,’ ‘data science,’ and ‘predictive’ became a way to
raise huge amounts of venture capital funding. A credulous press aided in hyping those claims, and a broken
federal government failed to exercise even the least oversight.”
I remember a number of our discussions about these kinds of issues—always fun over great Vietnamese food—
and, not surprisingly, my own view is a little more charitable than yours. In my view, data science, behavioral
science, and other emerging fields thrive in the academic sector—in the wild, if you will—where exploration is
primary. But when they're employed in the private sector, they can be weaponized, and they're often
weaponized to exploit. So in some sense, isn't the kind of hyperbole and the other ills you mentioned, and I
agree with many of them, more of a characteristic of data science for hire than data science in the wild? Isn't
data science really the victim in this process rather than the perpetrator?
JL: I think that's fairly stated. And I should just say here, that when I criticize data science, I am offering a
fairly specific sort of criticism. I treasure my colleagues at the university and the college who do this work, and
I'm not picking a fight. But it tends to be the case across realms of knowledge that when a flood of money
comes into a field, it can often attract a lot of scalawags who just want the money, and I also think sometimes
some of that money is coming from the data for hire companies that want to launder their money. They want to
borrow the prestige of a university that really is interested in the pursuit of truth and making interesting
discoveries, for the sake of knowledge. They want to affiliate themselves with the university because what
they're doing feels dirty. And I think that's ethically concerning. There was a lot of that going on in the 1960s
with behavioral science in the Vietnam War. And with chemical weapons. All the student protests and anti-war
movement that happened at MIT and also at Harvard, and many other places, including Stanford, were students
saying, ‘Well, we thought our professors were studying chemistry, you know, they're out there, working on
napalm, what, that can't be okay.’ I worry about the lessons of the Vietnam War having been forgotten. I think,
today, that higher education is really deeply implicated, especially with fossil fuels.
I also think—you can persuade me that I'm wrong about this—but sometimes when I read descriptions of what
data science is, on some campuses, it really sounds like market research. And I think if you want to have an
undergraduate concentration in market research, you should have an undergraduate major in market research, which is fine, but then don't call it data science. It's also, on its face, a very weird name to give a
discipline. We could have a Department of Facts, I guess, but aren’t we all in the Department of Facts?
Research and analysis done with large bodies of data is invaluable; I’m less convinced that it fits within the
concept of “data science”; I think the name actually demeans that work. When I think about all the incredibly meaningful kinds of insights that we can gain about the natural world, say, that are unavailable to us without machine-learning-driven computational processing of data, that stuff is incredible.
FB: There’s definitely a lot to unpack, and I share some of your concerns from a really different point of view.
I think that we have to remember that data science has become an actual academic discipline. Data has always
been a tool for research, but as an academic discipline, it’s really kind of nascent. In creating data science
programs and departments, there are a lot of experiments out there, all adding to the evolution of core curricula
and research vehicles. One question that institutions are grappling with is where data science should live—in a
statistics department, a computer science department, as a multi-disciplinary program? What should the
curriculum be? Machine learning and statistics, to be sure. What about a class in ethics? What about a class in
data preservation and data stewardship? What about classes on databases or data visualization? What about
training and practice applying data science to other disciplines? It seems to me that over the last decade or two,
the data science community has really been trying to figure out how data science as a discipline best fits into various university academic environments.
Evolving the discipline is about data science in academia. There is also the whole issue of how data
science is or should be used in the private sector, where data science techniques and analysis currently serve as
a tremendous competitive advantage. If you're a company, it's hard to avoid collecting and using data because
that’s what your competitors are doing. A few years ago, Harvard Business Review said that data scientist was
the sexiest job of the twenty-first century. So there is also a lot of hype around it.
It’s also good to remember that data science is a tremendously valuable tool for exploration, but not the only
one. And it is critically important to think about data in context. What can it tell us and what are its limitations?
To me that argues for ‘humans in the loop’ to ensure our collection, analysis, and inferences based on data are
useful and appropriate.
So here's another question for you. Today, and perhaps in Simulmatics' day in the 1960s, there is an over-trust
in technology. We tend to trust the results of data analysis and computational models without asking where the
data came from, whether the models are representative, and what the context is in which the results are
meaningful. The outcomes of our analyses may have powerful consequences and we don’t ask often enough
whether there are humans in the loop. If you think about medicine, we don't just rely on the results of machines
and disease models. We always have doctors and medical professionals interpreting those results and putting
them in context, trying to figure out all the things that were not captured by the model and the data. Shouldn’t
other kinds of data-driven analyses be doing the same thing?
You see this as a historian: new technology is always racing ahead of the curve, and once we understand its
implications, societal controls promoting the public interest play catch up. We saw this during the Industrial
Revolution. We see this throughout history. What are the right ways to align technological innovation and
societal controls? What can we do today to really take the wild west of data science and civilize it so that it promotes the public good?
JL: One thing that I've tried to bring into public discourse is a way of thinking about data as a kind of
knowledge, not as the best kind of knowledge. We have a whole cult of data now where, you know, whatever
you do, if you say it's data-driven, somehow you can get money for it.
I've tried to take some time to think through why people use the term ‘data’ in that way. I gave a talk a few
years ago called “How Data Killed Facts.” The larger analytical framework that I give historically is to think
about the history of the elemental unit of knowledge over the last centuries—maybe we could begin with the
mystery: mysteries are things that God knows and we cannot know, like the mystery of conception. The
mystery of resurrection. The mystery of the afterlife. A lot of what the Reformation is about is replacing the
mystery with the fact. The fact comes from the law; it comes from a commitment to the idea that humans can actually know things if we observe them and submit our observations to rules of evidence involving corroboration and fairness, the kinds of rules that are then put to a jury.
Then that diffuses into the whole culture. The fact becomes central to the Scientific Revolution: the idea that you could establish facts through empirical observation and corroboration by other scientists, which historians call ‘the cult of the fact.’ It moves into journalism. By the time you get to the Enlightenment and the years before the Industrial Revolution, what historians call the great age of quantification, the fact is really challenged by the number. People are doing research across great distances. The best way for them to share their observations is
by counting things. Then we have the rise of statistics: people begin to count populations, and demography is born; we can count votes, and we can therefore consent to be governed. The rise of the number is incredibly important, in the US especially, given the nature of the census and its centrality to our political order.
It's not really until the 20th century that something like data comes to be called ‘data’ in our sense, which is ‘a set of numbers that human beings can't calculate; you need a machine to calculate them for you.’ You could even
begin with the 1890 census, calculated by tabulating machines. But by the time you get to 1950 and the
UNIVAC is running the census, that really is the beginning of the age of data, which happens also to be a time
when all of American culture orients itself around the idea that scientists are gods. In 1960, Time magazine's Man of the Year was ‘men of science,’ as if men of science are like gods. This marks an incredible elevation of data as evidence that you can't understand, that only machines or ‘men of science’ can understand. There's a whole
weird sexual and racial politics to that elevation, including the self-mystification of these early guys who
started Artificial Intelligence in 1956. The rise of the age of data is in a way a return to the age of mystery. The
machines are the gods and the computer scientists are the priests and the rest of us, we just have to look up and
hope that they get it right.
FB: I’ll pivot a bit and ask a question that frankly blows my mind. In an unexpected way, the founders of
Simulmatics really did predict the future! You write that Ithiel de Sola Pool, co-founder of Simulmatics, wrote “Toward the Year 2018” in 1968, 50+ years ago. He said, “By 2018, it will be cheaper to store information in computer banks than on paper…tax returns, Social Security records, census forms, military records, perhaps criminal records, security clearance files, school transcripts…bank statements, credit ratings, job records—would in 2018 be stored on computers that could communicate with one another over a vast international network.” You say, ‘People living in 2018 would be able to find out anything about anyone without ever leaving their desk. We will have the technological capability to do this. Will [we] have the legal right?’
Pool’s and your questions are relevant today. The question for all of us is what are the legal, moral, and societal
constraints that we need to create to make sure that technology is a tool for the public good, rather than the
public being a tool for the advancement of technology? Does anything give you hope today about any of this?
JL: Pool was just brilliant. And he foresaw that, but actually the takeaway of the essay is, ‘we'll have to see
what they do in 2018.’ He just kicks the can down the road, and the lesson of the 1960s—in particular Vietnam,
but certainly of this behavioral science research—is, no, you don't kick the can down the road, you don't just
invent Facebook and see if it destroys democracy. I think that if you worship people as disruptive innovators
for long enough and just throw money at them and indulge them, well, they can really screw things up. There’s
a set of ethical guidelines around research in almost every other area. We see that in the comeuppance that biologists had after Nuremberg about the medical research done by Nazis, and the incredible shuddering of physicists after the Manhattan Project and the bombing of Hiroshima. People said, ‘Okay. Clearly, we shouldn't
have done this, let's come up with some rules for how we proceed from here.’ I had the expectation, as I think
many people did, that 2016 would be that moment for people who deal with personal data, and it wasn't. It
hasn't been. It's worse now than it was then. We can say that's the federal government's problem, and it is the
federal government's problem, but it is also everyone's problem.
FB: All the techniques Cambridge Analytica used can be used by other organizations. They didn't own that
particular kind of approach. So yes, it is everyone’s problem.
OK, some questions from our remote audience. Here's one: what lessons can organizers of political and social
movements learn from the mistakes of Simulmatics and be aware of when trying to mobilize people? What's
the difference between leadership and manipulation?
JL: One of the long-term legacies—and I wouldn't put this by any means on the shoulders of Simulmatics,
which is, again, a small, failed company—but of the turn toward doing micro-targeted political messaging,
which starts in the 1950s—is that its critics said the lesson to be learned was ‘this will destroy the sense of a
common good.’ If you only ever talk to voters as if the way you expect them to vote is based on who they are
and what would be good for them, you have destroyed the fabric of our political institutions and our
constitutional system, which requires voters to think about what would be good for everybody. Republicanism
with a lower-case ‘r’ means that we are supposed to go to our polling place or fill out our form and put it in the
mail, with an eye toward—as a virtuous citizen in a classic sense of what ‘virtue’ is—who is going to best
represent the interests of all of the people. Not ‘who's going to lower my tax rate or do this or that for me.’ Our
entire political discourse has changed around this kind of messaging, the micro-targeting that divides voters' interests against other voters' interests and even makes political coalitions within a single political party hard to hold onto, because we have all kind of swallowed the idea that there are 480 voter types, or 10,000 voter types.
Whatever number it is. You’re not a citizen who's asked to think more broadly about what is good for all of us.
FB: In a way, I think that all of the customization that the data allows us is really a double-edged sword. I can
be in a cohort of people who have a particularly light, seaworthy kayak, and you can advertise kayaking gear to
me. I can also be fed news that is particularly tailored to my personal beliefs. In the first case, customization is
really convenient and does no harm. In the second case, customization means my information is highly tailored
and I may not get opposing perspectives; I may become more vulnerable to manipulation about critical societal
issues. With this kind of targeting, how are we supposed to come together as a country?
JL: Pool foresaw that in another essay he wrote, also in 1968, in a special issue of a magazine with J.C.R.
Licklider. He said, among the things I would predict: there will be the personal newspaper, and it'll be a
problem because we won't be able to have people belong to interest groups any longer because they'll just have
their personal interests. Then you have your highly atomized, profoundly alienated polity.
FB: It’s basically Spotify for news...Here’s another question: let’s stipulate that we’re able to predict human
behavior in some number of years, whether 5, 10, or 50. What are the lessons from history? What should we be
doing about this now? What can go well and what can go badly?
JL: I don't think there are laws of human behavior the way there are laws of gravity. Scientists should study
human behavior. There are reasons to do, obviously, quantitative work in that realm. But, for me, I actually
think that literature is more meaningful and poetry is more meaningful as ways to study human behavior. I'm a
humanist! History, I should emphasize, is not a predictive social science. There are lessons to be drawn from
the study of history, but they're not predictions.
FB: In some sense, it’s a kind of economics of the market, which is often patently unfair, don’t you think?
JL: Exactly. It's not idle to say we value some kinds of knowledge more than others when, even in a specific monetary sense, we value them differently.
FB: Here's a timely question from our audience. It's January 23rd of 2021, and the President calls you from the Oval Office and asks you to design future regulations in education. Where would you start? What would you tell the President?
JL: I don't really have an educational agenda. I think that there are some changes we've talked about, and
you've been really involved in thinking about data science initiatives at colleges and universities.
I was at Berkeley right before the world ended in February, and its data science program has a required history
course that I think is really interesting. Some other programs I've looked into have this stapled-on, ethics-like
unit, or just a kind of tacked-on history week that's taught by a computer scientist and not a historian. Now, I do think, given the magnitude of this turn in higher education, that it would do well to be better integrated with other parts of the university. This is why I'm here having this conversation with you! I think we need to be talking
across these differences and finding the best in one another's approaches.
FB: A last question from me. Cyberspace and technology have become critical infrastructure, especially during
the pandemic. If they continue to be largely unregulated, it's hard to curb their capacity for exploitation. This is
a problem for all of us. We’re in school online. We're ordering groceries online. We're having this conversation
online. And as each one of us logged in today, including you and me, our data was collected and carefully
categorized by the services we are using. We’re basically now living in a data wild west where we can be
exploited anytime, anywhere, by anybody.
Once our participation becomes non-optional, once our technologies become critical infrastructure, I believe
we must regulate them to promote the public interest and reduce risk. I know that in 2020 we don't have a lot of
confidence in government. I know it takes practice—the General Data Protection Regulation (GDPR) in
Europe is a really good example of that. But how do we get from here to a place where technology really is
working for us, rather than us working for technology? What do you think should happen with the federal
government, with societal norms and practices, and with the other tools that we have at our disposal, to really
improve things?
JL: I think the members of this audience probably have more clear ideas about what the best steps are to
proceed there. One thing that I fear is that the very wealthy will opt out of all these technologies. Then it'll be
the poor that are fully monitored, who can only engage in transactions in this form, and the wealthy will have
ways of avoiding those things. I think that change needs to happen pretty soon before those people find a way
to opt out and have even less incentive to turn their resources toward supporting the kind of really very radical
reforms that are required. I’d say my position on most of this stuff pretty closely tracks, say, the kind of more
hard-nosed, Elizabeth Warren position on regulation.
FB: Thanks so much Jill! What a pleasure to talk with you about all of these issues. What are you working on
next? Any more technology books in the works?
JL: Definitely not. I am working on getting through the Zoom semester. If we didn't already understand how
diminished we are by our reliance on certain kinds of technology and how helped we are by other kinds, I hope
this is a clarifying time. So I'm trying to just do the teaching and learn the lessons that can be learned from it.
Disclosure Statement
Francine Berman and Jill Lepore have no financial or non-financial disclosures to share for this interview.
©2021 Francine Berman and Jill Lepore. This interview is licensed under a Creative Commons Attribution
(CC BY 4.0) International license, except where otherwise indicated with respect to particular material
included in the interview.