Earlier this month, the US Federal Trade Commission issued recommendations for businesses seeking to implement big data solutions. The report summarizes multiple sources: a public workshop held last September, with four panels tackling different aspects; 65 public comments from citizens, industry representatives, think tanks, consumer groups, privacy advocates, and academics; and an earlier seminar on big data held in 2014.
The main caveat emerging from the report is that companies can inadvertently stumble into discrimination if their big data inferences disadvantage a protected class, especially if a more direct approach could have avoided the problem. For example, a big data decision not to market a good price to a particular postal zone could also be perceived as depriving a protected class (defined by race, religion, or gender) of that price.
Potential problems with careless big data inferences include inferring that certain geographic or behavioral traits make an individual ineligible for credit, favorable rates and discounts, or beneficial treatment (such as solicitations from top universities). Because individuals sharing certain protected demographic features (race, religion) can share a common postal zone or set of shops, inferring an individual's traits from population correlations could put a business at risk of being sued for discriminatory inferences, so to speak.
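To make the postal-zone concern concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the customer records are made up, and `gets_offer` and `offer_rate_by_group` are illustrative names, not anything taken from the FTC report. It shows how a rule that only looks at postal zone can still produce very different outcomes for two groups when zone and group membership are correlated.

```python
from collections import defaultdict

# Hypothetical, made-up records: (postal_zone, group). "group" stands in for a
# protected attribute that the marketing rule itself never looks at.
CUSTOMERS = [
    ("zone_a", "group_x"), ("zone_a", "group_x"),
    ("zone_a", "group_x"), ("zone_a", "group_y"),
    ("zone_b", "group_y"), ("zone_b", "group_y"),
    ("zone_b", "group_y"), ("zone_b", "group_x"),
]


def gets_offer(postal_zone):
    """Illustrative rule: the discount is marketed everywhere except zone_b."""
    return postal_zone != "zone_b"


def offer_rate_by_group(customers):
    """Return the share of each group that ends up receiving the offer."""
    offered = defaultdict(int)
    total = defaultdict(int)
    for zone, group in customers:
        total[group] += 1
        if gets_offer(zone):
            offered[group] += 1
    return {group: offered[group] / total[group] for group in total}


rates = offer_rate_by_group(CUSTOMERS)
print(rates)  # {'group_x': 0.75, 'group_y': 0.25}
```

The rule never mentions the group, yet three quarters of one group see the offer and only one quarter of the other. Checking outcome rates by group in this way is one simple audit a business could run before acting on a geographic inference; it is not a legal test.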
The potential for bias seems considerable, and the subtitle of the report captures the main concern: "A Tool for Inclusion or Exclusion?" For instance, if data from wearable devices are used to determine whether certain civic funding occurs (parks, repaving projects), affluent areas could benefit disproportionately because wealthy people are more likely than poor people to own Fitbits.
The authors are careful to note that many benefits can accrue from big data used properly, including finding clever ways to pick promising people out of larger pools. For example, some people in impoverished areas are on the right track, and by finding them, new lending, educational, and work opportunities can be created. Some companies are using their big data tools in just this manner, and creating win-win scenarios.
The report has plenty of interesting examples worth contemplating:
- A credit card company that rated consumers' creditworthiness based on whether they had paid for marriage counseling, therapy, or tire repair services, drawing on inferences from its big data set.
- The ability of companies to identify "suffering seniors" with early Alzheimer's and target them with exploitative offers.
- A preference for job applicants who used a browser they had installed themselves (Firefox, Chrome) rather than one that came with their computer, because the data showed that employees with these traits "perform better and change jobs less often."
The references point to a site worth a visit, if only to remind you that correlation does not equal causation: Spurious Correlations. On the day I visited, the featured chart correlated US spending on science, space, and technology with suicides by hanging, suffocation, and strangulation. Maybe the recent increases in the NIH, NOAA, NASA, and NSF budgets aren't the unvarnished good news we initially thought . . .
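A small Python sketch shows how easily such charts arise. The two series below are synthetic and generated independently; the only thing they share is an upward trend over time. The names `pearson`, `series_a`, and `series_b` are illustrative, and the numbers are not drawn from any real budget or mortality data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def differences(values):
    """Year-over-year changes, which strip out a steady trend."""
    return [b - a for a, b in zip(values, values[1:])]


rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(20)

# Two unrelated quantities that both happen to grow over time.
series_a = [1.0 * t + rng.gauss(0, 1) for t in years]
series_b = [2.0 * t + rng.gauss(0, 2) for t in years]

r_levels = pearson(series_a, series_b)
r_changes = pearson(differences(series_a), differences(series_b))

print(f"correlation of raw series:           {r_levels:.2f}")
print(f"correlation of year-to-year changes: {r_changes:.2f}")
```

The raw series correlate strongly simply because both rise over time. Comparing year-to-year changes removes the shared trend, and what remains is just the correlation between two independent noise terms, which can land anywhere but reflects no real relationship.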
The FTC report is worth a look, if only to remind ourselves of the limitations of big data, which the authors capture succinctly:
"Companies should remember that while big data is very good at detecting correlations, it does not explain which correlations are meaningful."
Or which correlations are risk-free.
As businesses become more digital at their core, data will become more central to success. And not just analytics, but rich customer data. Managing these data, using them judiciously and efficiently, and ensuring compliance with various laws and expectations will be vital to long-term strategic change and growth. Companies that begin early will learn first and best if they are diligent.