MapReduce

MapReduce notes

MapReduce is a programming model for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.
Programs written in this style are highly parallelizable and can be executed on a large cluster of commodity machines, while the runtime system hides the details of parallelization, fault tolerance, data distribution, and load balancing.
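
To make the map/reduce contract concrete, here is a minimal single-process sketch of word counting. It is an illustration only: the names map_words, reduce_counts, and run_mapreduce are hypothetical, the grouping step stands in for the shuffle that a real cluster runtime performs across machines, and Python is used for brevity.

```python
from collections import defaultdict
from typing import Iterator


def map_words(doc_name: str, text: str) -> Iterator[tuple[str, int]]:
    """Map: emit an intermediate (word, 1) pair for every word in the text."""
    for word in text.split():
        yield word.lower(), 1


def reduce_counts(word: str, counts: list[int]) -> int:
    """Reduce: merge all intermediate values that share the same key."""
    return sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical in-memory driver; a real runtime distributes these phases."""
    # Map phase: apply map_fn to each input pair and group values by key.
    groups = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in map_fn(key, value):
            groups[ikey].append(ivalue)
    # Reduce phase: one reduce_fn call per distinct intermediate key.
    return {ikey: reduce_fn(ikey, ivalues) for ikey, ivalues in groups.items()}


documents = [("a.txt", "the quick fox"), ("b.txt", "the lazy dog the end")]
result = run_mapreduce(documents, map_words, reduce_counts)
print(result["the"])  # 3
```

Because each map call depends only on its own input pair, and each reduce call only on the values for one key, the calls within a phase could run on different machines without coordination.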

-No partial computation
-Map and reduce functions (C++)
-MapReduce works by taking an initial set of data, processing it, and producing a result. This batch model makes it unsuitable for real-time or near-real-time applications. MapReduce also cannot efficiently process incremental changes to its inputs, which is a further obstacle to processing real-time data.
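
The cost of incremental changes can be sketched with a toy batch job. This assumes a job that keeps no state between runs; count_words_batch and map_calls are hypothetical names used only for this illustration, and the counter simply tracks how many input records the map step touches.

```python
from collections import Counter

map_calls = 0  # how many input records the map step has touched so far


def count_words_batch(records: list[str]) -> Counter:
    """Hypothetical batch job: map every record, then reduce by summing."""
    global map_calls
    totals = Counter()
    for record in records:
        map_calls += 1                 # one map invocation per input record
        totals.update(record.split())  # reduce: sum the counts per word
    return totals


records = ["a b", "b c", "c d"]
first = count_words_batch(records)
calls_first_run = map_calls

# One new record arrives; the batch job has to start again from scratch.
records.append("d e")
second = count_words_batch(records)
calls_second_run = map_calls - calls_first_run

print(calls_first_run, calls_second_run)  # 3 4
```

In this sketch, adding a single record triggers four map invocations rather than one, because nothing from the first run is reused.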