A distributed word counting example.

A hasher shards <word,count> pairs across multiple receivers by hash(word),
so every count for a given word is sent to the same receiver.
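
The sharding step can be sketched as follows. This is a minimal illustration,
not the code of wordcount_hasher: the function names are hypothetical, and
CRC32 is an assumed stand-in for hash(word), chosen only because it gives the
same value in every process, which all hashers need in order to agree on a
receiver.

```python
import zlib
from collections import Counter


def count_words(text):
    """Count whitespace-separated words in a piece of text."""
    return Counter(text.split())


def receiver_index(word, num_receivers):
    """Pick the receiver for a word (assumption: CRC32 stands in for hash(word))."""
    return zlib.crc32(word.encode("utf-8")) % num_receivers


def shard_counts(counts, num_receivers):
    """Split a word -> count mapping into one bucket per receiver."""
    buckets = [{} for _ in range(num_receivers)]
    for word, count in counts.items():
        buckets[receiver_index(word, num_receivers)][word] = count
    return buckets
```

Because the receiver index depends only on the word and the number of
receivers, two hashers reading different inputs still route the same word to
the same receiver.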

A receiver collects <word,count> from multiple hashers
and writes the result to disk.
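
The receiver side can be sketched in the same spirit. The function names and
the tab-separated, sorted output format are assumptions made for illustration;
the README does not specify how wordcount_receiver lays out its result file.

```python
from collections import Counter


def merge_counts(batches):
    """Sum <word,count> batches arriving from several hashers."""
    total = Counter()
    for batch in batches:
        total.update(batch)  # adds counts for words already seen
    return total


def format_counts(total):
    """Render one 'word<TAB>count' line per word (assumed format)."""
    return "".join(f"{word}\t{count}\n" for word, count in sorted(total.items()))


def write_counts(total, path):
    """Write the merged result to disk."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(format_counts(total))
```

Since each word is routed to exactly one receiver, the per-receiver output
files are disjoint and can simply be concatenated to obtain the full result.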

Example run with 3 hashers and 4 receivers:
1. run 4 receivers on 4 machines, listening on ip1:port1, ip2:port2, ip3:port3, ip4:port4.
   The second argument (3) is the number of hashers each receiver expects.
   a. on ip1, bin/wordcount_receiver port1 3
   b. on ip2, bin/wordcount_receiver port2 3
   c. on ip3, bin/wordcount_receiver port3 3
   d. on ip4, bin/wordcount_receiver port4 3
2. run 3 hashers on 3 machines. Each hasher takes the full receiver list
   followed by one or more input files.
   a. on ip1, bin/wordcount_hasher 'ip1:port1,ip2:port2,ip3:port3,ip4:port4' input1
   b. on ip2, bin/wordcount_hasher 'ip1:port1,ip2:port2,ip3:port3,ip4:port4' input2
   c. on ip3, bin/wordcount_hasher 'ip1:port1,ip2:port2,ip3:port3,ip4:port4' input3 input4
3. wait for all hashers and receivers to exit.
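
For larger clusters the command lines above can be generated rather than typed
by hand. The helper below is a hypothetical sketch that only assembles the
strings shown in the example run; it does not start any process, and the
addresses and file names are the placeholders used above.

```python
def build_commands(receivers, inputs_per_hasher):
    """Return (receiver_cmds, hasher_cmds) for the example run.

    receivers: list of "ip:port" strings, one per receiver.
    inputs_per_hasher: list of input-file lists, one per hasher.
    """
    num_hashers = len(inputs_per_hasher)
    receiver_list = ",".join(receivers)

    # Each receiver gets its own port and the number of hashers to expect.
    receiver_cmds = [
        f"bin/wordcount_receiver {addr.rsplit(':', 1)[1]} {num_hashers}"
        for addr in receivers
    ]
    # Each hasher gets the full receiver list and its own input files.
    hasher_cmds = [
        f"bin/wordcount_hasher '{receiver_list}' {' '.join(inputs)}"
        for inputs in inputs_per_hasher
    ]
    return receiver_cmds, hasher_cmds


if __name__ == "__main__":
    recv, hashers = build_commands(
        ["ip1:port1", "ip2:port2", "ip3:port3", "ip4:port4"],
        [["input1"], ["input2"], ["input3", "input4"]],
    )
    print("\n".join(recv + hashers))
```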