Thursday, July 22, 2010

The anguish of the data collectors

Jim Manzi views the dispute between poll analyst Nate Silver and pollster John Zogby in economic terms:
Silver intelligently combines multiple polls to make more accurate predictions than are usually achieved by any one individual pollster. On one hand, the math of this is irresistible – in the real world, voting models often work. On the other hand, it would be pretty uncomfortable for a pollster to combine his own results with various competitive poll results to achieve equivalent accuracy (or at least to do so transparently). So, the pollsters do all the tedious work to collect and analyze the data, and then Nate Silver comes along and creates all this value with it in a way that is hard for the pollsters to duplicate. You can see why this situation might upset the pollsters.

In every industry that combines data collection with analysis, there is an endless battle between the data collectors and the analysts. The data collectors bear the hard costs – people, office space, telecommunications, travel budgets, etc. – that are required for interviewing people, visiting stores, and so forth. Their nightmare world is to become commodity data collectors paid for their costs plus a small margin set by competitive bidding. Their typical defenses are to attempt: (1) to build proprietary methods for collecting superior data, or equivalent data at much lower cost, and (2) to integrate the analysis and the data into a single product, and forbid by contract the paying client from using this for other purposes. The analysts, on the other hand, want to have an open market in commoditized data and compete on analytical capability.
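
The "irresistible" math in Manzi's first paragraph can be illustrated with a toy simulation. This is only a rough sketch under assumptions of my own, not Silver's method: every number in it (the true vote share, the sample size, the count of pollsters) is made up, and it treats the polls as independent and unbiased, which real polls are not. It simply compares the typical error of one simulated poll with the typical error of a plain average of several.

```python
import random
import statistics


def simulate_poll(true_share, sample_size, rng):
    """Simulate one poll: the fraction of respondents backing the candidate."""
    hits = sum(1 for _ in range(sample_size) if rng.random() < true_share)
    return hits / sample_size


def rmse(estimates, truth):
    """Root-mean-square error of a list of estimates against the true value."""
    return (sum((e - truth) ** 2 for e in estimates) / len(estimates)) ** 0.5


def compare(true_share=0.52, sample_size=500, n_pollsters=5,
            n_trials=300, seed=0):
    """Return (single-poll RMSE, averaged-polls RMSE) over many simulated races.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    single, pooled = [], []
    for _ in range(n_trials):
        polls = [simulate_poll(true_share, sample_size, rng)
                 for _ in range(n_pollsters)]
        single.append(polls[0])                # one pollster on its own
        pooled.append(statistics.mean(polls))  # naive unweighted average
    return rmse(single, true_share), rmse(pooled, true_share)


single_rmse, pooled_rmse = compare()
print(f"single poll RMSE:    {single_rmse:.4f}")
print(f"averaged polls RMSE: {pooled_rmse:.4f}")
```

Under these idealized assumptions the average should miss by noticeably less than any single poll, because independent sampling errors partly cancel. Shared biases among pollsters would not cancel this way, so the real-world gain is smaller than the toy version suggests.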

The pattern Manzi outlines is at work in the investment world as well. On one level, conventional equity research, like polling data, is now subject to aggregation and analysis; services like Investars and Reuters' Starmine offer average ratings for equities and rank analysts by the performance of their past ratings on given stocks and sectors. Traditional fundamental research has become commoditized to a degree, as the research itself, like the material company information of which it's composed, can no longer be provided selectively to favored clients. More broadly, active fund management itself is giving way to indexing, in large part through ETFs. Suzanne Duncan of the IBM Institute for Business Value has forecast, on the basis of a study polling financial executives and their clients, that within twenty years, 85-90% of assets under management will be invested in passive instruments such as index funds. (Not coincidentally, the study also found that only 10% of hedge funds actually produce any alpha, i.e., earn their exorbitant fees. The same is doubtless true of managed mutual funds.)

Perhaps it can also be said that analyst "bloat" often occurs in markets where incentives are skewed. That would include media in the last ten years, where the data collectors (journalists) have hollowed out their revenue base by giving their content away, and the "analysts" (the substantial segment of bloggers who don't generate new information, like yours truly) proliferate, thanks to minimal overhead and in many cases no need to make money at all. It's a common reading weakness, I suspect (since I share it), to prefer commentary to information, or information spiced with commentary; hence, the market skews toward "analysis."

Another market that seems top-heavy with analysis, per the Washington Post's massive exposé, is the national intelligence apparatus -- in which "data collectors" (field agents and their informants) seem in chronically short supply, while analysts proliferate:
Among the most important people inside the SCIFs are the low-paid employees carrying their lunches to work to save money. They are the analysts, the 20- and 30-year-olds making $41,000 to $65,000 a year, whose job is at the core of everything Top Secret America tries to do.

At its best, analysis melds cultural understanding with snippets of conversations, coded dialogue, anonymous tips, even scraps of trash, turning them into clues that lead to individuals and groups trying to harm the United States.

Their work is greatly enhanced by computers that sort through and categorize data. But in the end, analysis requires human judgment, and half the analysts are relatively inexperienced, having been hired in the past several years, said a senior ODNI official. Contract analysts are often straight out of college and trained at corporate headquarters.

When hired, a typical analyst knows very little about the priority countries - Iraq, Iran, Afghanistan and Pakistan - and is not fluent in their languages. Still, the number of intelligence reports they produce on these key countries is overwhelming, say current and former intelligence officials who try to cull them every day. The ODNI doesn't know exactly how many reports are issued each year, but in the process of trying to find out, the chief of analysis discovered 60 classified analytic Web sites still in operation that were supposed to have been closed down for lack of usefulness. "Like a zombie, it keeps on living" is how one official describes the sites.

The problem with many intelligence reports, say officers who read them, is that they simply re-slice the same facts already in circulation. "It's the soccer ball syndrome. Something happens, and they want to rush to cover it," said Richard H. Immerman, who was the ODNI's assistant deputy director of national intelligence for analytic integrity and standards until early 2009. "I saw tremendous overlap."

Even the analysts at the National Counterterrorism Center (NCTC), which is supposed to be where the most sensitive, most difficult-to-obtain nuggets of information are fused together, get low marks from intelligence officials for not producing reports that are original, or at least better than the reports already written by the CIA, FBI, National Security Agency or Defense Intelligence Agency.
In part, the proliferation of analysts is probably due to the flood of electronic data that the intelligence agencies collect. In part, I imagine it's because there are a lot more born analysts than born agents. And in part it's doubtless due to the vast amounts of money thrown at the unaccountable and in some sense even uncounted agencies that have proliferated over the last sixty years, and especially since 9/11.

Silver is something of a special case, since he adds unquestionable value in original ways that struck a very interested and enormous audience with the force of revelation in the heat of the 2008 campaign.
