One of our most prolific bloggers, National Coalition for Child Protection Reform leader Richard Wexler, articulated the thesis this week that the failures of “big data” predictive analysis in the 2016 election indicate that using the same approach in child welfare is a mistake. From Wexler:
If all those number crunchers can’t figure out one presidential election, why should anyone trust predictive analytics to tell us which parents are going to harm their children? Yet that, of course, is what some in child welfare seriously want us to do.
Youth Services Insider was also interested in the data implications of the election, but made a different connection to child welfare: it seems to us that the big failure was in the interpretation of the analytics, not the process itself.
FiveThirtyEight’s model arrives at a probability score for each candidate using data from polls the site has assessed and weighted by quality. By the end of the race, the site gave Donald Trump about a 30 percent probability of winning the election based on the numbers coming into the model.
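To make that concrete, here is a minimal sketch in Python of how quality-weighted polls might be turned into a win probability. The poll leads, the quality weights and the assumed 3-point polling error are all illustrative; this is not FiveThirtyEight’s actual method or inputs.

```python
import random

# Hypothetical national polls: (Clinton lead in points, quality weight).
# These numbers are made up for illustration.
polls = [(3.0, 1.0), (5.0, 0.6), (1.0, 0.9), (4.0, 0.4), (2.0, 0.8)]

# Quality-weighted average of the polled lead.
weighted_lead = sum(lead * w for lead, w in polls) / sum(w for _, w in polls)

# Assume polling error is roughly normal with a 3-point standard deviation,
# then simulate many "elections" to turn the average into a win probability.
TRIALS = 100_000
wins = sum(1 for _ in range(TRIALS)
           if random.gauss(weighted_lead, 3.0) > 0)

print(f"Weighted lead: {weighted_lead:.1f} points")
print(f"Simulated win probability: {wins / TRIALS:.0%}")
```

Change the assumed polling error and the probability swings considerably, which is part of why the interpretation of the output matters as much as the math behind it.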
It is easy to say now, but those who took a 30 percent score lightly are unlikely to do so again. Any baseball fan should have had some respect for that number; it is about the same as the probability that your team’s best batter gets a hit.
FiveThirtyEight boss Nate Silver was pretty clear in the late weeks of the campaign about how real that chance was, and he also argued that the numbers suggested real potential for Clinton to win the popular vote and lose the Electoral College.
How does this relate to child welfare? At best, tangentially. The predictive voting model at FiveThirtyEight is fueled by two central inputs: polls and demographics. Predictive analytics in a child welfare setting can include data related to many issues: health, income, demographics, criminal history, drug use, to name a few.
In both cases, the process is only as good as the data you feed into the model. That is why, in the child welfare arena, there should be a demand that predictive analytics products be clear about what is in the model and demonstrate that it stands up to rigorous testing.
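What such testing might look like, in its simplest form, is a calibration check on held-out cases: within each band of predicted risk, does the observed rate of harm roughly match what the model predicted? The scores and outcomes below are made up purely to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical holdout cases: (model's predicted risk score, actual outcome).
# A real validation would use a large test set the model never trained on.
holdout = [(0.05, 0), (0.12, 0), (0.28, 1), (0.31, 0), (0.33, 0),
           (0.07, 0), (0.45, 1), (0.52, 0), (0.61, 1), (0.90, 1)]

# Bucket predictions into 10-percentage-point bands.
bands = defaultdict(list)
for score, outcome in holdout:
    bands[round(score, 1)].append(outcome)

# Compare the predicted rate in each band to the observed rate.
for band in sorted(bands):
    outcomes = bands[band]
    print(f"predicted ~{band:.0%}: observed {sum(outcomes) / len(outcomes):.0%} "
          f"({len(outcomes)} cases)")
```

The point is not this particular check, but that some check like it should be reproducible by someone outside the vendor.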
Of equal importance is how the information generated by big data is interpreted: the real-world meaning given to it. YSI suspects that the next campaign that sees an opponent’s probability rise from 15 percent to 30 percent in a short span, with correspondingly narrower leads in swing states, will react with more urgency than pundits or the Clinton campaign did. And that is because we now know how real 30 percent is.
In child welfare, would you take a kid away from his mother based solely on the fact that a predictive measure said there is a 30 percent chance the child would be severely maltreated? The answer should be, flatly, “No.” Other factors might steer toward a decision to remove, but a predictive score alone should not be enough.
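One way to make that principle concrete in software: the score routes a case toward a level of human attention, and no code path leads from a score straight to removal. The tiers and thresholds below are entirely hypothetical.

```python
def triage_response(risk_score: float) -> str:
    """Map a predictive risk score to a hypothetical systemic response.

    No score, on its own, triggers removal; the score only sets how
    quickly a human assessment happens.
    """
    if risk_score >= 0.30:
        return "expedited in-person assessment by a senior caseworker"
    if risk_score >= 0.15:
        return "priority follow-up visit and an offer of family services"
    return "standard intake review"

# A 30 percent score earns faster human review, not automatic removal.
print(triage_response(0.30))
```

The thresholds here are arbitrary; the design choice that matters is that even the highest tier ends in a human decision, not a removal.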
But that hardly means such a score warrants no systemic response at all. Surely, there is an appropriate way to interpret it. Wexler often argues that the child welfare system has not shown itself capable of incorporating risk measurement into decision-making without tearing families apart.
Perhaps that is the conversation the field needs to have as predictive analytics takes hold: How do systems responsibly control the interpretation of valuable, but limited, predictive analysis?