yalealumnimagazine.com  
  findings  

 

The Yale Alumni Magazine is owned and operated by Yale Alumni Publications, Inc., a nonprofit corporation independent of Yale University.

The content of the magazine and its website is the responsibility of the editors and does not necessarily reflect the views of Yale or its officers.

 


Managing the Information Overload

Alvin Toffler coined the term "information overload" to describe how individuals can be overwhelmed by masses of information in postindustrial society. Now, many businesses and scientific researchers find themselves facing something similar. "The amount of data that needs to be stored and processed is exploding," says Yale computer scientist Daniel Abadi.

 


Two approaches to handling such data have become popular: parallel database management systems (DBMSs), developed to efficiently manage structured data—the sort of data that can be represented on a grid—and MapReduce, created by Google to allow flexible searches of the more free-form content of the Web.
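For readers curious about what the MapReduce style looks like in practice, here is a minimal sketch in plain Python. It is not Google's or Hadoop's actual API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration, and everything runs in memory on one machine rather than across a cluster. It counts words in free-form text, the kind of job that does not fit neatly on a grid.

```python
from collections import defaultdict


def map_phase(document):
    # Hypothetical "map" step: emit a (word, 1) pair for every word
    # in one free-form document.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Group the emitted values by key. A real framework does this
    # between the map and reduce steps, across many machines.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Hypothetical "reduce" step: combine all values seen for one key.
    return key, sum(values)


def word_count(documents):
    # Run map over every document, group by word, then reduce each group.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big databases handle data"])
print(counts)
```

In a parallel DBMS, the same question would typically be asked as a declarative query over a table with a fixed set of columns; the sketch above instead applies ordinary functions to unstructured text, which is where the flexibility described in the article comes from.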

Abadi and his students have developed a new open-source system called HadoopDB, which combines the efficiency of a DBMS with the adaptability and scalability of MapReduce. Abadi likens it to the old Mac versus PC trope: "Windows is closed and proprietary. Macs are a little more open; they at least come with lots of open-source-based Linuxy goodness. DBMSs tend to be closed and proprietary, but MapReduce is known for the open sourcing."

Currently, DBMSs are used by everyone from retailers mining their purchase records to scientists doing high-throughput analysis of biochemical compounds. The systems they use may be adequate today, but if HadoopDB succeeds as its creators hope, it will allow a wide range of users to handle increasingly large data sets. Says Abadi, "The problem we're trying to solve is tomorrow's data workloads."

 
 
 
 

©1992–2012, Yale Alumni Publications, Inc. All rights reserved.

Yale Alumni Magazine, P.O. Box 1905, New Haven, CT 06509-1905, USA. yam@yale.edu