Harvard Data Science Review • Issue 3.1, Winter 2021 The People Machine: The Earliest Machine Learning?
asks people about issues, and then he reports that in his weekly column, and he's always right because he's
found that magical 3000 people. That's what Simulmatics is trying to do, but mathematically.
FB: You might have to give us a movie list to go with the book! Let’s talk about data science. You don’t have a
lot of good things to say about data scientists and data science, both in the book and in a great article you just
published in Nature, where another one of our Radcliffe Fellows, the amazing Jo Baker, is a senior editor.
Your summary of data science is thoughtful and it's worth us discussing. Here's what you say in the book: “In
the 2020s, a flood of money into universities attempted to make the study of data a science, with data science
initiatives, data science programs, data science degrees, data science centers. Much academic research that fell
under the label ‘data science’ produced excellent and invaluable work across many fields of inquiry, findings
that would not have been possible without computational discovery. And no field should be judged by its worst
practitioners.
Still, the shadiest data science, like the shadiest behavioral science, grew in influence by way of
self-mystification, exaggerated claims, and all-around chicanery, including fast-changing, razzle-dazzle buzzwords
from ‘big data’ to ‘data analytics.’ Calling something ‘AI,’ ‘data science,’ and ‘predictive’ became a way to
raise huge amounts of venture capital funding. A credulous press aided in hyping those claims, and a broken
federal government failed to exercise even the least oversight.”
I remember a number of our discussions about these kinds of issues—always fun over great Vietnamese food—
and, not surprisingly, my own view is a little more charitable than yours. In my view, data science, behavioral
science, and other emerging fields thrive in the academic sector—in the wild, if you will—where exploration is
primary. But when they're employed in the private sector, they can be weaponized, and they're often
weaponized to exploit. So in some sense, isn't the kind of hyperbole and the other ills you mentioned, and I
agree with many of them, more of a characteristic of data science for hire than data science in the wild? Isn't
data science really the victim in this process rather than the perpetrator?
JL: I think that's fairly stated. And I should just say here that when I criticize data science, I am offering a
fairly specific sort of criticism. I treasure my colleagues at the university and the college who do this work, and
I'm not picking a fight. But it tends to be the case across realms of knowledge that when a flood of money
comes into a field, it can often attract a lot of scalawags who just want the money. I also think that sometimes
some of that money is coming from the data-for-hire companies that want to launder their money. They want to
borrow the prestige of a university that really is interested in the pursuit of truth and making interesting
discoveries, for the sake of knowledge. They want to affiliate themselves with the university because what
they're doing feels dirty. And I think that's ethically concerning. There was a lot of that going on in the 1960s
with behavioral science in the Vietnam War. And with chemical weapons. All the student protests and anti-war
movement that happened at MIT and also at Harvard, and many other places, including Stanford, were students
saying, ‘Well, we thought our professors were studying chemistry, you know, they're out there, working on