Data For Social Good: People, Processes & Technology - Episode 3
This podcast features Frank Romo, founder of Detroit-based RomoGIS Enterprises: a data, design and research collaborative aimed at promoting the public good through innovative technical solutions. Frank has a long history of being a community advocate, planner and activist for public health and safety, and social justice. As the CEO of RomoGIS, Frank provides technical solutions that empower residents to effectively impact their local communities. In his work with the University of Michigan, Frank engages in community-based research and develops geospatial applications that advance equity and social justice in cities.
This is the third episode of our miniseries on data, maps, and social movements with Frank Romo, hosted by CoLab Radio Producers Emmett McKinney and Allison Lee. This interview was recorded on Oct 30, 2020 and has been lightly edited for clarity and length. A transcription of the conversation is below.
Emmett McKinney: Welcome to CoLab Radio. This is a publication of the Community Innovators Lab at the MIT Department of Urban Studies and Planning. My name is Emmett McKinney, here with Allison Lee and Frank Romo, and this is the third episode in our series on the use of big data and how it helps us understand the current movements for social justice and those that have come before.
We have with us here today Frank Romo, the founder of RomoGIS and a former labor activist, who has now turned to data science as a tool for continuing his activism. As I mentioned at the top, this is the third installment of a series. So if you haven't already, I really suggest that you listen to the first two episodes. The first one thought about data visualization. And the second one thought about data curation and collection.
Today we turn to data analysis and interpretation and the process of making sense of it. It's often said that data never speak for themselves. And so today we ask the question, Who does get to speak for the data? What are the power dynamics implicit in that? And how do those power dynamics change as there are more and more machine learning and algorithmic approaches to making sense of the terabytes of data that are being collected every day? So Frank, welcome back. We are so happy to have you.
Frank Romo: Thank you again for having me. It's been a pleasure to be on these past few episodes and I'm excited to get into today's discussion.
Emmett McKinney: Us too. So as we are taping this, the nation and Philadelphia, again, are embroiled in a massive uprising about the murder of a Black man at the hands of police. Walter Wallace Jr. was experiencing a mental health crisis in Philadelphia, when the police were called. But the police weren't the first people called, it had actually been an ambulance that had been called first. Mr. Wallace was seeking medical help; his mom was there seeking medical help; every intent was to de-escalate. But the police arrived first. And while Mr. Wallace was armed with a knife, the police ended up unloading 10 rounds, and killing Mr. Wallace, even as he and his mother tried to de-escalate the situation. So we are heartbroken and angered by this, again. And I think it really puts in context the urgency of this discussion about data and power, and how it influences the way we make decisions.
When the police arrived in that neighborhood, it wasn't with a totally blank slate. They arrived in that neighborhood with an idea about where they were going and who they were dealing with. They had in their minds a model. And that model was informed by the decisions and information that had been collected by their police institutions and fed to them by the rest of society. And so we want to interrogate that today. What should we be saying about the data that is collected, and how it influences police decisions? And how might we make better use of data to get to more just outcomes? So Frank, what does this moment mean to you?
Frank Romo: This moment, like so many others, unfortunately leaves us with a heavy heart. And for me, with the research that I do regarding this data, it's very important that, as Emmett said, it provides us with another moment of urgency, where we're trying to work on data that is trying to provide information to activists, to organizations who advocate against this type of behavior in the streets. And as Emmett also spoke of, there are things that point folks in this direction - the first responders in this direction, the police - to have certain biases. It's no secret we all have our own implicit biases and then we bring that to the table. And unfortunately, when folks come to a scene with those biases and are armed, we can have this kind of interaction.
So one of the first things I want to touch upon is the data. We've talked a little bit about predictive policing, or the tools that police departments use to monitor and police certain areas. So I'm not super familiar with what is actually being used in the Philadelphia precincts right now. But overall, on the nationwide level, we're seeing a lot of organizations moving towards this idea of predictive policing, which is focused on finding where crime has occurred, using that data, feeding it to an algorithm and allowing it to suggest certain types of units, certain types of response. And on top of that, we have additional crime statistics that are fed into those algorithms. But as with any other software, if we put poor data in, we're going to get bad outcomes. And when you have bad data trying to support some of these real life interactions, and you multiply that with community conditions, the relationships between the neighborhood and the police department, as well as police and the communities' implicit biases, we can have incidents like this. And unfortunately, we're seeing them happen more and more.
When we are using software like predictive policing tools, or datasets that try to be more "efficient in policing," I really would advise that we take a look under the hood of what's going on in that software and question how it is being used.
And then again, I think it comes to a piece about training. I understand folks are heightened during these altercations. Walter Wallace, he did have a knife on him, but that was no reason for him to be shot in front of his mother like that. There are lots of ways that those things could be de-escalated. And we're here and our organization and so many others around the country are going to continue fighting, and continue to build datasets, continue to build platforms, continue to build solidarity, so that we can stop these occurrences from happening.
Emmett McKinney: So let's go under the hood. How does predictive policing work? What is the data that's been collected, and how does it get ingested to ultimately inform policy?
Frank Romo: So predictive policing takes a variety of different forms. We talked in the past episode about this idea of open data and what data is available to the public. Well, police agencies have a lot of data that is available to the public but at the same time, there are datasets, or fields, that are not revealed. And information like that could include information about the race, the age, the "model" suspect, if you will. And we have these predictors of, "This is predominantly a community of color. It is highly likely that a perpetrator might be X, Y, or Z, and they might fall within these ages." And you see this a lot, even in common culture, in police shows and things like that, where you see, "We have a perpetrator who is of this race and is within this age range and is of this height and this build." And you see how problematic that can be. And that leads to things like stop-and-frisk, when it was really prevalent in New York City, and how many times people were stopped and frisked when there was no crime being committed, yet they were stopped because they matched a certain description that was spit out by an algorithm based upon previous instances at that location, based upon previous crime at that location or in that vicinity. And what it does lead to is sometimes harassment and over-policing.
But at the end of the day, what it does is, it takes a geographic area, it crunches all those numbers - crime, race, age, victim age, race - it brings in all of this demographic information, and it assigns values to it. It says, "This is 90% probable that this might happen again, because based upon the thousands of data points that we have, in the 80th percentile, it was a person who met this description." And so what happens is that data gets fed into the algorithm, it gets spit out into something that is readable by a human, and it allows folks to make those inferences. And while folks are in the streets looking for the perpetrators - while the first responders, while the police are out there looking for that - they are looking for someone with a red hat, who might be between the age of 22 and 24. And who might be a certain race.
And unfortunately, that does lead to more - I would say spotty, potentially - policing because it could be very hit-or-miss. And if you are in a neighborhood and you come across 10 people who are wearing a red hat who meet that description, are you going to stop every one of them if all of them are wearing red hats, but they're of different races? Which person are you going to stop?
And so that's where, again, those implicit biases come into play. And also reminding folks that when these tools are built, those biases are also within the building of the tools from the development side. So as you get biases or decisions that need to be made about - what kind of weight should we put on the race of the perpetrator, and how much should that be predicted in our algorithm - now there's biases inherent there in the building of the tool. There's biases in the algorithm computation. And then it is put out on the street over a radio call saying, "Okay Unit, we're looking for somebody who is wearing a red hat, who is between this age and of this race." Now, those officers also have to make decisions in terms of how they're going to stop people, and they unfortunately have their own biases. So when we take a step back and look at the entirety of this situation, you just see these biases, and this guesswork happening and being compounded.
A lot of folks will stand very much behind the data and say, "Well, it's not guesswork. It's informed by previous datasets and these data points have validity," and that is true, but we also need to think about what the context of the neighborhood looks like, what the urban fabric is looking like. Is this already a neighborhood that is highly impoverished and highly over-policed? And what does that mean for folks who actually live there, and then their relationship with the police? Because all of these things come into play, and unfortunately, when they all come together and make the perfect storm, we have incidents like what happened in Philadelphia.
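The loop described here, where past records steer patrols and patrols produce new records, can be made concrete with a small sketch. This is a toy illustration only, not a description of any vendor's product or any police department's system. The function name, the two areas "A" and "B", and every number are hypothetical, and the sketch assumes that patrols are assigned in proportion to each area's share of past records and that every patrol records incidents at the same rate wherever it is sent.

```python
def simulate_patrol_feedback(recorded, observe_rate, patrols_per_round, rounds):
    """Return (final recorded counts, list of per-round patrol shares).

    recorded: dict mapping area name -> incidents already on record
    observe_rate: incidents one patrol records per round; deliberately the
        SAME for every area, so the areas do not actually differ
    """
    recorded = dict(recorded)  # copy so the caller's data is untouched
    share_history = []
    for _ in range(rounds):
        total = sum(recorded.values())
        # The "prediction": each area's share of all past records.
        shares = {area: count / total for area, count in recorded.items()}
        share_history.append(shares)
        # Patrols follow the prediction, and patrols generate new records.
        for area, share in shares.items():
            recorded[area] += patrols_per_round * share * observe_rate
    return recorded, share_history


# Made-up starting point: area A simply has more past records than area B.
start = {"A": 60, "B": 40}
final, history = simulate_patrol_feedback(
    start, observe_rate=0.5, patrols_per_round=100, rounds=5
)
print("patrol share, first round:", history[0])
print("patrol share, last round: ", history[-1])
print("records after five rounds:", final)
```

Under these assumptions the two areas behave identically, yet area A keeps receiving the larger share of patrols in every round, and the gap in recorded incidents between A and B widens. Nothing in the records ever corrects the uneven starting point, which is one way to read "poor data in, bad outcomes."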
Emmett McKinney: So there's a lot to respond to there. What I'm hearing is that these models are built on initially data about individuals. We have a lot of different accounts of like, “This person was observed doing this type of thing.” And then these individuals get aggregated into these general models of a neighborhood, or people wearing a certain type of hat, or of a certain height and build or complexion. And then those models are again, then re-projected onto individuals. So there's a circle.
And the trouble with this is that there is statistical noise. Any model is at best, a really good guess. And so as you mentioned, when it actually comes time for an officer to take this model into account, and then go have a one-on-one interaction with an individual, the judgement about that area, or that neighborhood, or people that look like that person, ultimately influences their interaction with a complex individual.
So there's an element of human discretion that's really needed here. And then, as you mentioned, with discretion comes bias. And there's an interesting question here, because a lot of folks will suggest that an algorithm being used to carry out the criminal justice system will be less racist than humans doing so. This precise question is on the ballot in California right now, with respect to the ending of cash bail and replacing it with an algorithm. So do you see any opportunities in this to remove some of the bias? Is there potential there in data?
Frank Romo: I'm a very data heavy person. But I think when we're talking about the stakes of people's lives, of going to jail, or potentially having a fatal interaction in the street, I think we really need to get back to the basics. And I think it goes back to community police relations. And we don't just see that in Philadelphia, we see that all across the United States. Community and police relations are very haphazard right now, at best. And I think that has a lot to do with it. We sometimes believe that technology is going to solve a lot of our problems, when in fact, I believe in this case, it is adding to the noise, as you said; it is adding to some of the problems that are already inherent. And it really does come back to getting to know folks and building those relationships. That sounds very pie-in-the-sky. But I believe that the only way we're going to be able to get to a better position is if we improve those interactions between police and community residents.
Because as you mentioned, when officers go into a certain neighborhood, not only is the data telling them one thing, but just like anybody else, an officer is a human being. And they have a mental map. We've talked so much about geospatial information. And we all have mental maps in our brain.
Whenever I work with students, the first thing I have them do is draw a mental map of their neighborhood. And that's a great way for me to gain an insight on how they see themselves in their neighborhood. If they show me a bunch of power lines and railroad tracks and things like that, different landmarks, or smoke coming out of a building nearby, we can probably infer that they're in a neighborhood that might have potential toxins nearby and things like that.
So when you ask folks to draw a mental map or understand and investigate how they are viewing the world mentally, I think that really makes a difference. Because when police go into a neighborhood, they have previous experience in there, and they have ideas in their head of, "Well, that didn't go well last time I came over here on this side of the tracks or in this neighborhood." That's not to diminish their experience, but at the same time, we need to recognize that those previous experiences present a bias. And then folks are riding into that neighborhood with a mental map that says, "Oh, well, there's a lot of crime in this neighborhood. And I've had bad interactions in this neighborhood." Yes, the police officer needs to be aware and be able to protect him or herself and be diligent in approaching the situation. But at the same time, they should also have more of a nuanced understanding of how the residents experience police activity in that area as well, and recognize what those folks' experience of the neighborhood is.
Because if the police officers come into the neighborhood already with a mental map or a bias of what the people are like in that neighborhood, or what that neighborhood is like when they interact with the community, and then on top of that the data that they're getting fed is also somewhat biased, then sometimes it can just confirm what the police officers already believe. And again, it compounds. And I think what's really happening is the compounding of an error here, or a mistake there, or a misclassification of a record here or there. And we think that algorithms are going to be neutral. But when you have error upon error, and those errors continue to get compounded, they can make a big difference. And then what that does is it impacts people's lives and wellbeing.
Frank Romo: Absolutely. So you're right, predictive policing is not new at all. And just like with any other agency, city agency, as the times change, the police department wanted to keep up and said, "What can we do better? How can we police better?" And that's fine, we want folks to have all the tools at their disposal to make good decisions, and to bring safety to communities.
However, what we do see is really a flooding of this market by new vendors and new software, all claiming to be able to solve the problem of crime in your neighborhood. One of the big issues is, what are the values that are held by some of these companies who are pushing this data? Is it just to be more efficient, dispatch more efficiently, and get folks there quicker? That's one aspect. Or is it to say, "We're going to reduce crime, and you're going to be able to reduce crime in this area, by X amount." And that's what their main pitch is. When they speak like that to executives, executives like to hear that, because they're saying, "Of course I want crime to go down. That would be perfect. That's what we want for our city and for our residents." But how that is approached is really important.
A lot of times when you see these new vendors or private companies trying to enter the space, like in so many other tech realms, they'll sell you the moon and say, "We can do everything, we can do everything great. Our software has no mistakes or errors in it. It's one of the best out there." And that's their right to be able to say that. But how many times do folks just look at a flashy presentation and say, "This is good, let's go with it." And they may not have some of the resources - meaning the police department - may not have some of the resources to actually look under the hood and see, "What's actually going on here? What does this algorithm actually do? How is it computing this?" And without folks to actually examine and inspect that data, sometimes the police departments can be working with organizations who have a proven track record of not doing so well in certain areas.
So one of the places where that's been happening is in LA County. I think Emmett mentioned that there are things on the ballot right now, and folks who have been advocating for a certain set of technologies to get out of the police department, and to not be used anymore, because they had a trial run and it didn't go well for the community. And the community wanted to speak out and say, "Why are you using this on us? This obviously isn't helping us. Maybe crime has been reduced. But more of our people are being injured." And that is really the conversation. The technology isn't going to solve everything, but without doing a deep dive into what's actually going on with that technology, to some degree, sometimes the residents are kind of the guinea pig. And that's not the way that activists and organizers and people who are on the ground want to be. We're not here to be quantified in zeros and ones and be able to be told by some algorithm or policing product that our lives are worth less or we might be a threat when we are just living in our neighborhood.
Emmett McKinney: I think an important question in all of this is where the data comes from. The records that police use to inform their algorithms are not just coming from police stops. They're also coming from the way people move and use, for example, private mobility devices, the way that people spend money, the things that they're subscribed to. We live in an age where data is being collected about us constantly. But even before this was the case, communities of color, and especially Black communities, were already being surveilled by law enforcement at local, state and federal levels. So I'm wondering, how does this new data regime, how does the analysis of big data, fit into a longer history of, as Ruha Benjamin has put it, "being watched, but not seen"?
Frank Romo: You can't separate surveillance from control. It likes to be passed off a lot of times as safety. But surveillance is control. If you know the movements of people, you know where they're going, how they interact, what kind of behaviors they engage in, because you have machine learning, and you have all these other terabytes of video content that you're running through some kind of algorithm or dataset to say, "Let's watch every move that's happening on this street." When you do that, what is the end goal there again? I think the end goal is, surveillance is about control.
Community members recognize that. And people who are surveilled recognize that. It's not even something that has to be said, it's more of a feeling honestly, sometimes. These communities, communities of color a lot of the times, have lived with this for generations. And what is happening is, the technology is getting smarter. And just because the technology is "getting smarter," doesn't mean it's getting safer for those communities. In fact, it's quite the opposite in some cases, where as the technology gets smarter, communities feel more at risk, because of those biases that are baked into it, because of those decisions that are then made on the executive level, on the ground level of patrol officers. As the surveillance continues to increase and increase and it's not going anywhere, the idea is, "Efficiency, and we're getting smarter at surveilling, and we're using new technology and higher quality cameras, and all the cameras are fed into all of these different services." But what does that mean for the folks who are being watched?
And you're right, they are watched consistently, but they're actually not seen as humans, I think is what it comes down to. Because when you are just surveilling or watching something, you are in this voyeuristic kind of space. And it is an "othering". There is an "othering" that is happening. We already know that there is an "othering" that happens in the racialization of policing. And when you add that level of surveillance, there is an even further detachment from, "This is a human being who has a story, who has a job, who has a wife, who has a child, who has a husband," and we need to get back to the basics of seeing people as full human beings. And not just, like I said, zeros and ones, or somebody doing something on the screen.
Emmett McKinney: You say that technology is getting smarter. And this is something we hear a lot, like “smart mobility”, “smart cities”, “smart this-or-that”. And it strikes me that technology is not itself getting smarter; what that's a shorthand for is computational power is increasing. It can ingest more data, process more of it faster, and package more of it more neatly, and deliver it further at lower cost. This is a technical improvement.
And in the case of machine learning approaches, and especially say deep neural networks, technology is getting better at recognizing patterns. But technology has not developed a conscience. It has not developed the ability to think creatively and morally about what society ought to be doing and the role that technology ought to be playing in that. So to some degree, it's becoming faster and more powerful, but it remains a tool.
And even as we have smart technologies, we still have humans who are designing those, who are deciding what data to feed in, who are deciding which data to look at, and deciding what to do based on the analyses that they do. And in some ways that's really disheartening, because it means that the same biases that have always been present in our society are going to filter through into these new technologies.
There's also reason for hope, because it means that there are humans who can be spoken to, like there's some individual who's capable of making a different decision about how this technology gets used. And you've made this point really powerfully in a talk a few years back at the Tyranny of the Algorithm Conference (2016), where you suggested a different approach where communities would be in charge of using predictive analytics to identify places where they may be in danger. So taking that algorithmic approach and turning it on its head as something for community autonomy and self determination. Can you talk a little bit more about that project?
Frank Romo: The program, or the conference, was the Tyranny of the Algorithm. I sat on a panel with folks from some of these vendors or software companies who were saying, "This is what our software does, this is what it doesn't do. It actually is not problematic because X, Y, and Z." Then you had folks on the community side, on my side, the kind of the activist side, saying, "Well inherently,” as you pointed out Emmett, “these things are biased, and they are getting computationally faster and stronger and quicker, and you want to sell them to more police departments." But with twenty engineers, you're right Emmett, there is some hope because there are people who can make a difference while we're building the product. And there are folks who can make a difference while they're trying to make decisions for the community. And there are folks who can make a difference while they are engaging with the community in the streets.
However, if there are twenty engineers, and only two of them have that consciousness of trying to remove those biases from the software, then 18 x 100 million is still a lot more than 2 x 100 million in terms of the computational power, and how those biases get exponentially larger.
But in the Tyranny of the Algorithm talk, what I spoke about, related to what we talked about the past few episodes, is that whoever has the data has the power. We've seen this happen a lot with the rise of social media. We talk about surveillance as being control. Surveillance is a way of having control. Almost all the time, when we see these very disturbing videos online of people being shot or being harassed or killed in the streets - these altercations, these fatal police altercations - what that is, is surveillance. Somebody is taking their phone, the small computational device that they have in their pocket, and is able to record and see what is actually going on. In that surveillance, you would hope that there is some change of behavior. But as we've seen in a lot of the recent protests this past year, even while the video cameras are on, there is still some of that very hostile and very aggressive behavior.
What we were trying to talk about in the Tyranny of the Algorithm was that we recognize that this could be very problematic. It is a very slippery slope to feed these algorithms, and make them make the choices for us, or provide us with a recommendation list of choices. And what we have suggested was, How do we provide communities with tools to do the same? To surveil what is going on in their community? In that same way, the community is saying, "Let's look at our own data. And let's use our own experiences as data and figure out how we can harness that to make ourselves safer." Because as I've said before, whoever has the data has the power, and if we as the community can harness that by surveilling these interactions more - which you've seen a great increase of - which sometimes is helpful and sometimes I'm screaming at the screen saying, "Why isn't somebody speaking out and trying to intervene instead of just recording?" So it's a slippery slope there as well. But what we're trying to do is try to figure out how we as the I.T., as the technologists, as the activists, can equip community members with some tools. And that's where my research comes in, is trying to identify neighborhoods and communities that are at most risk for these altercations based on historical data. What we're doing is not really that different. We're looking at historical trends. We're trying to interpret it.
Emmett McKinney: And to clarify, those are altercations with police. You're interested in building tools to help community members identify places where they might themselves be at risk from interacting with law enforcement.
Speaking of slippery slopes, with any new technology, or any new approach to using it, you can't be sure what hands it may fall into. And this community surveillance approach that you outline has some parallels with the type of neighborhood watch that George Zimmerman was on. There have already been privileged communities who have adopted this defensive posture with the intention of keeping people out. And I wonder, How do you make sure that this sort of community-based or community-oriented technology gets out and serves the people who it's intended to serve, and doesn't end up reinforcing some of the same dynamics that we've seen?
Frank Romo: I think you're absolutely right. And when I mentioned surveilling on the community side, let me be clear, I'm talking about surveilling the perpetrators, which in a lot of cases are the police. And I think that's a little bit different. I think you're right in the George Zimmerman case, that is right. You see that across the nation. We had a lot of bills passed, local ordinances. With the advent of the Ring camera and all of these cameras, there are more eyes on the street now, and what some police departments and local ordinances have done is say, "You can opt in and you can feed us your Ring camera, you can feed us your video content." And again, that poses some problems because then it is hyper-surveilling the street, and when somebody "out of place" is there, then all of a sudden an alarm gets triggered or 911 gets called, and we see this happen a lot, where folks call the police on somebody who might "seem out of place" or "might not belong in the neighborhood". You've got to imagine what that feels like from a community perspective.
So I think the community surveilling the actions that are taken against them by police, I think that's one piece of it. But it is just a piece of it. Because you're right, the technology can be used in a negative way. So I use that term very lightly, to say that it is one piece.
Another piece is being able to have that on record, so that they can then organize and go to the local board, city council, and say, "Look, here are data points of what has happened in our community," and it can be as simple as a few points on a map of where people have been harassed, or felt like they were mistreated or had their rights violated. And put that on the map and say, "This precinct or this area, we've had a lot of complaints. Here it is on a map, here are some videos. We need you to do something about it."
So it's not just about the technology. It's about bringing it to action and organizing, and allowing the community to have that voice. And at the end of the day, the technology is just a tool. It is something that can help them make their argument to say, "What you all are practicing right now and what we're seeing in our community, it may be 'reducing crime,' but it is not making us feel safer."
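The "few points on a map" idea can be sketched in a few lines. The example below is a minimal illustration, assuming community members log each report with a precinct label, a rough location and a short description. The field names, the precinct labels and the sample records are all invented placeholders, not real data and not RomoGIS's actual tooling.

```python
from collections import Counter


def complaints_by_precinct(reports):
    """Count community reports per precinct, most-reported precinct first."""
    counts = Counter(report["precinct"] for report in reports)
    return counts.most_common()


# Placeholder records, invented purely for illustration.
sample_reports = [
    {"precinct": "Precinct X", "lat": 0.001, "lon": 0.002, "kind": "harassment"},
    {"precinct": "Precinct X", "lat": 0.003, "lon": 0.001, "kind": "rights violation"},
    {"precinct": "Precinct Y", "lat": 0.010, "lon": 0.012, "kind": "mistreatment"},
    {"precinct": "Precinct X", "lat": 0.002, "lon": 0.004, "kind": "harassment"},
]

for precinct, count in complaints_by_precinct(sample_reports):
    print(f"{precinct}: {count} reports")
```

A tally like this is only the starting point Frank describes: the same records, with their coordinates, could be plotted as points on a map and brought to a city council alongside videos and testimony.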
“These communities, communities of color a lot of the times, have lived with this for generations. And what is happening is, the technology is getting smarter. And just because the technology is ‘getting smarter,’ doesn’t mean it’s getting safer for those communities.”
Allison Lee: You mentioned before that surveillance is control. And I think by extension then surveillance - but you can also talk about predictive policing, facial recognition technology, all of these technologies - they are a form of control. But in most of the time, they are posited as forms of convenience. And that's a very fine line, the convenience versus the control aspect. And I think it really comes down to, convenience for whom? And developed by whom? And for what purpose? So it just makes me think about all of these technologies that are claiming to be convenient for daily lives. But really, what are the implications of those? And what are we willingly participating in that perhaps we aren't aware of? And so is it that the public, that we need to have a better understanding of this technology? Is there something that we can do? Is learning almost a form of reaction against the technology, because so much is being done in our name that perhaps we aren't familiar with?
Frank Romo: I think that's a great point. It is about whose convenience, whose safety, who are the owners of this? Who are we actually catering to, as a city government, as a local police department? Whose concerns are prioritized? And you could even see that, there are many communities who have anecdotal information about back in the 90s in LA when there was a lot of gang violence, you had many communities who said, "The cops won't even come around here. They won't respond to us, even if we do have an actual emergency." A lot of anecdotal stuff.
But it does come down to whose convenience and whose ownership of that data. And a lot of times, just like with any other technology that we participate in, "I really want to use this application." "Well, sign on the dotted line and click your consent button and you can use it." And again, without somebody actually looking into that on the legal side... So it's one thing if an individual uses it. But when an agency is putting hundreds of thousands of dollars behind a vendor or a piece of software, of course they have their legal proceedings, they have their legal side look through that. But again, when police departments are looking through those legal precedents with regards to how they're interacting with the vendor, they're just making sure that everything's in order in the contract and they're not violating laws.
However, what does that mean for the local community member who is being surveilled? That's not what the legal aspect is for. And in the same way, when we're talking about how these algorithms are built, we're just going to trust that this vendor who has 20 other contracts in our region is going to be somebody who's going to do the best job for us as the police department, or as the city agency. At that macro scale, you lose the human being, the person on the street, who then could potentially have that negative interaction, and unfortunately becomes another part of... It's tough to say it, but with these data processes, they want to see folks as zeros and ones and unfortunately, when things go awry, people in our community do become another statistic or another number because they get added into a different column and that is a column of, "They had a fatal interaction with the police and now they are a hashtag, or they are another name that we are thinking of, where we must say, 'Say her name, say his name.'" And we think about all the people, even just this year, who have died. And so it's really important that we really do bring back this human aspect of things.
Emmett McKinney: Technology doesn't just appear on its own; it doesn't pop out of a wormhole. There is somebody who creates any given piece of technology. It takes a lot of funding and time to do that. And developing technology usually takes some institutional support, be it from a city or a university or a private company. Each of those actors exerts some influence on its development. It results in a product that is designed for a particular set of people, for a particular purpose. But it has no conscience of its own. Part of the real peril here is that terms like "smart" and "technological" and "data driven" project this air of neutrality, when they're anything but. This is also driven by a blind faith that we have, that technological innovation will continue, and that it will be a steady march toward progress, when it's not at all clear that that's the case. Frank, the point you make about remembering the humans involved - saying her name, saying his name - is really vital to our understanding of what data is, and how it might be used better in the future.
Allison Lee: There's an inherent risk that comes with data and map making and analytics. And there's this scientific security that comes when a person - a member of the public, an authority figure - reads this kind of data. And that data is based on evidence. So it is accurate and truthful in that sense. But it also grossly hides the manipulation that happened along the way to produce that data and to analyze that data and to make those visualizations.
And I know, last episode, we spoke a bit about that curation process and what goes into that middle zone that the end consumer - being perhaps the public or the person receiving that information - the end consumer receives that, but they don't see that middle ground and what gets selectively chosen to be included or omitted. That's where the magic, or dark magic, happens. Where all of those decisions are made and all of those biases play into it. Maybe that's where people need to start putting a lot of their focus, not necessarily the data collection part or the end result, although those two obviously are very important. But it's that middle ground that makes those decisions. And like you were talking about before, the person behind who is making those decisions is really important. Whether it's coming from an institution or an authority force or the community, you might get very different results.
So going back to the Tyranny of the Algorithm Conference and the work that you did with that, taking the same software and turning it on its head is really interesting. It may not in the end be addressing the bigger issue, which is that the software is still running on certain biases, but the end outcome definitely might differ in that sense.
Is there anything going on with the evolution of this? That conference happened in 2016. So four or five years on, are we still having the same conversation about the same type of technology and worries? Because the technology certainly hasn't slowed down. That is picking up speed like wildfire. So is the pushback and the investigation into these technologies moving at the same speed?
Frank Romo: I think you made a fantastic point about, the data collection piece is something that a lot of folks can understand. The output is something that a lot of folks can understand. From a community organizing standpoint, the piece that's in the middle, as you said, is the most important part. That's how it's made. And that is always the most inaccessible part to the local person. To the average person, that is always the most inaccessible.
And I, in my thinking, always will believe that it is inaccessible for a reason. And that reason is because - as you called it "black magic," or whatever science is going on in there, or whatever manipulation is going on in there - whoever is in power and is making those decisions doesn't want people to have an input. They recognize that, if we keep it very high level or very nuanced, very algorithmic, "Let's talk tech, and let's talk ones and zeros," people's eyes gloss over and that's where unfortunately, us in the community we lose the people who we need to be advocating for us - our law makers - because their eyes gloss over too. Because it isn't accessible to them. They're not tech-trained.
On a nationwide perspective, we see that our laws are not keeping up with the tech industry. Our laws are not keeping up with regulating the tech industry. We don't even have to get into it, but we can just talk about voting. Voting is coming up. And how technology has played a role in misinformation, and in creating different narratives around voting. You can imagine the stakes are very high when it comes to voting. Well, in my opinion, I would argue the stakes are even higher when we're talking about people's lives in the streets. When that information is not accessible to the lawmakers, to the community, then it really does become a black box and people do just become the victims of these softwares. And this algorithm has the ability to be tyrannical over communities and over neighborhoods.
Allison Lee: And I think a lot of that “middle-omission” is done in the name of convenience for the consumer. “We’re going to make it easy for you, we’re going to skip this section. You’re not interested anyway. Let’s just cut to the chase.” But again, from data collection to output, you’re going numbers to numbers, and that middle is really the human element that they’re cutting out and not showing, or intentionally hiding.
Emmett McKinney: It’s the curation of knowledge. This is the same thing that historians do, or the same thing that journalists do. They help everybody “cut to the chase,” and they introduce a framing for it. This is something that, to some degree, I don’t think we’ll be able to remove. You can’t ingest all of the information and all of the detail about the world at once. A map as big and as detailed as the world itself wouldn’t be useful as a map. That middle ground is necessary for us to make progress on things and make decisions because we’re limited in our ability to analyze. But we should also recognize that it exists. And there’s a whole ecosystem, like Frank observes, of organizations that develop software products and clickable graphic user interfaces to respond to this need. Thinking in more detail about the political economy of those organizations is really vital.
Frank Romo: So I will say, even though as I said some of the laws and local ordinances aren’t necessarily keeping up, or keeping some of these technology or software companies in check, what you’re seeing a lot of is the rise of organizations that are jumping into that middle section and saying, “We need to be responsible with people’s data. We need lawmakers to understand that this is not neutral, and the way that they’re using the data against the citizens is not okay.” And what you are seeing arise a lot are nonprofits, institutional organizations, different centers or labs that are studying just that: how this data is affecting people’s lives. So I think there is some hope there, that there are organizations that do that.
We, at our organization, really try to use the data for social good and justice. A lot of that comes with education. When we educate nonprofit organizations and run trainings and things like that, we don’t need folks to understand the full algorithm or how things work, but I think folks understand experientially how that affects their lives. What we as technologists, as advocates for social justice, need to do is equip those people in the streets who have that experiential knowledge with the language so that they can talk to the local government, so that they can talk with the police in a different way that says, “This isn’t fair. Let me match my anecdotal information with what I do know about the software that you do have. I don’t need to know all the nuances but I know that I’m getting over-surveilled, over-policed, and that you’re using my tax money to pay for a $500,000 product that is doing me a disservice.” So that’s what I think the role is, of technologists, to do.
Allison Lee: This is really - it’s about tech, and looking into the tech, and advancing the tech - but it’s also really about having a conversation about what this technology means and how do people relate to it, and how do they understand it. That is very empowering for the community, to take that tech into their own hands and even if they are themselves not doing any of the work, computationally, for that technology, but just to understand what goes into it, how it impacts their lives, and how it impacts their future from a spatial element, from a legal perspective. So that is hopeful to hear.
Emmett McKinney: I couldn’t agree more. This topic of data aggregation, it happens in many different corners of this debate about data and social justice. There are organizations in the private sector who create these convenient graphic user interfaces for cities. There are think tanks and research organizations that compute numbers and put out reports and organize. And also, federal law enforcement. The entire intelligence infrastructure exists to ingest data in fusion centers, and I think that’s where a lot of the danger happens too. That’s how data that’s collected from a seemingly innocuous source, like where somebody moved on a scooter or where somebody went to the movies, that’s how that information gets filtered through the complex network of actors and ends up being part of some report or some model that gets used to guide policing.
And we should be clear about the fact that there are many different places where this data is getting fused and managed. So maybe just an accounting of all the people who handle the data is really important.
Frank Romo: I definitely think it is about accountability. At the end of the day, that’s what it comes down to. You can’t have accountability if people are ignorant about how the technology is being used and affecting their lives. And again, I don’t mean that in a negative way. I just mean, we don’t know. A lot of community members don’t know what’s actually being used to surveil them and to predict what’s happening in their community. Education is really one of the key ways to try to make this issue less of an issue. It will still be a big problem because, as Allison said, the technology is not slowing down. It’s coming out and there are more programs and more software and more agencies that are starting to use it. But the more we can educate folks at every level - at the community level, the police officer on the street, the people who are making the decision to purchase that software, the chiefs who are saying, “Yes, that’s the software I want to work with.” - the more those folks are educated about these softwares, the more likely it is that we can not replicate those same mistakes at scale.
I’m not saying that those same mistakes - those interactions or altercations, these fatal encounters - I’m not saying they will 100% go away, but the scale at which we’re seeing them across the United States for the past 5-10 years that have been documented is really remarkable and we need to not replicate these mistakes at scale. Because we’re talking about thousands of people’s lives.
Allison Lee: Well, this is a really timely conversation, especially as people are rethinking what is happening with our law enforcement in the U.S., with surveillance technology in this country but also globally. This is something that we need to start talking about more and more. And maybe the technology aspect is not in all of the conversations that we have about rethinking policing, but the technology is moving and people are adopting it with lightning speed and maybe not asking the right questions or asking all of the right questions.
Once it’s implemented, it’s also hard to remove and it’s hard to go backwards. And so if those questions are going to be asked, I think now is the time to start asking them.
Emmett McKinney: I will certainly meditate on some additional questions. Frank, we really appreciate you taking some time to answer a few of ours. To our listeners, thank you so much for joining us for this journey. As with any good research project, this has turned up a lot more questions than answers and we hope that these conversations will continue. So thank you and please send us your comments and reactions, as well as your ideas for what we should talk about in the future. You will be able to find this episode, as well as some related resources on CoLab Radio’s website.
Frank, thank you so much for taking your time. Thank you all.
For more of Frank’s work, or to get in touch:
About the Interviewers:
Emmett McKinney is a transportation planner at the nexus of tech, equity, and decarbonization. He holds a Master in City Planning from MIT, where his research focused on the equity implications of emergent mobility technologies. He has worked in urban design and environmental policy — but these days, he manages mobility data for Superpedestrian, a mobility technology company. Find him on Twitter at @EmmettMcKinney and GitHub at @ezmckinn.
Allison Lee is a Producer for CoLab Radio and a Masters student in the MIT Dept of Urban Studies and Planning. She is interested in balancing conservation and development, and places community and culture at the heart of her work.