MapReduce Function in C++

Summary

MapReduce is a programming model for processing large datasets in a distributed computing environment. A job is divided into smaller parts that are distributed across multiple computers: a map step processes each part independently, and a reduce step aggregates the intermediate results into a single output. MapReduce is a key component of the Apache Hadoop big data platform.
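To make the divide, distribute, and aggregate steps concrete, here is a minimal single-process sketch of a word count. It is not how Hadoop implements MapReduce: the chunks only stand in for the data that separate machines would receive, and every name in it (`split_into_chunks`, `map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`, `KeyValue`) is hypothetical and chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical alias for one emitted (key, value) pair.
using KeyValue = std::pair<std::string, int>;

// Split the input into fixed-size chunks; each chunk stands in for the
// share of the data that one worker machine would receive.
std::vector<std::vector<std::string>> split_into_chunks(
    const std::vector<std::string>& input, std::size_t chunk_size) {
  assert(chunk_size > 0);
  std::vector<std::vector<std::string>> chunks;
  for (std::size_t start = 0; start < input.size(); start += chunk_size) {
    std::size_t end = std::min(start + chunk_size, input.size());
    chunks.emplace_back(input.begin() + start, input.begin() + end);
  }
  return chunks;
}

// Map phase: emit a (word, 1) pair for every word in one chunk.
std::vector<KeyValue> map_phase(const std::vector<std::string>& chunk) {
  std::vector<KeyValue> emitted;
  for (const std::string& word : chunk) {
    emitted.emplace_back(word, 1);
  }
  return emitted;
}

// Shuffle phase: group the emitted values by key across all chunks.
std::map<std::string, std::vector<int>> shuffle_phase(
    const std::vector<std::vector<KeyValue>>& mapped) {
  std::map<std::string, std::vector<int>> grouped;
  for (const auto& part : mapped) {
    for (const auto& kv : part) {
      grouped[kv.first].push_back(kv.second);
    }
  }
  return grouped;
}

// Reduce phase: collapse each key's list of values into a single total.
std::vector<KeyValue> reduce_phase(
    const std::map<std::string, std::vector<int>>& grouped) {
  std::vector<KeyValue> result;
  for (const auto& entry : grouped) {
    int total = 0;
    for (int value : entry.second) {
      total += value;
    }
    result.emplace_back(entry.first, total);
  }
  return result;
}

// Run the phases in order; here everything happens sequentially in one process.
std::vector<KeyValue> word_count(const std::vector<std::string>& input,
                                 std::size_t chunk_size) {
  std::vector<std::vector<KeyValue>> mapped;
  for (const auto& chunk : split_into_chunks(input, chunk_size)) {
    mapped.push_back(map_phase(chunk));
  }
  return reduce_phase(shuffle_phase(mapped));
}
```

Calling `word_count(words, 2)` on a vector of strings returns one pair per distinct string, ordered by key because the shuffle step uses `std::map`. In a real deployment the map and reduce calls would run on different machines and the shuffle would move data over the network; the sketch only mirrors the data flow.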

MapReduce in C++

#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>
using namespace std;

// MapReduce-style word count: returns each distinct string in the input
// together with the number of times it occurs
vector<pair<string, int>> MapReduce(const vector<string>& input) {
  // Map from key to running count (std::map keeps its keys sorted)
  map<string, int> mp;

  // operator[] value-initializes a missing key to 0, so a single
  // increment covers both the first and the repeated occurrences
  for (const string& key : input) {
    mp[key]++;
  }

  // Copy the (key, count) pairs into the output vector in key order
  vector<pair<string, int>> output(mp.begin(), mp.end());

  return output;
}

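int main() {
  // Assumed sample input for illustration; the original listing is cut off here
  vector<string> words = {"apple", "banana", "apple"};

  // Print each distinct string with its count
  for (const auto& kv : MapReduce(words)) {
    cout << kv.first << ": " << kv.second << endl;
  }

  return 0;
}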
