How to implement a map-reduce algorithm to find all paths of length 2 in a graph

Problem description

The input is every source/destination pair in the graph, i.e. an edge list, such as:

url1,url2

url1,url3

url2,url3

url4,url5

url2,url4

from mrjob.job import MRJob

class MRGraph(MRJob):
    def mapper(self, line_no, line):
        line = line.split(',')
        yield line[0], line[1]

    def reducer(self, source, destinations):
        for destination in destinations:
            # Stuck here: I want to check whether `destination` is itself the
            # source of other edges, but I don't know how to look that up.
            pass

MRGraph.run()

I can't work out the map-reduce logic. The required output is:

url2,url4,url5

url1,url2,url3

url1,url2,url4
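
To make the expected output concrete, here is a minimal, non-distributed sketch of what "all paths of length 2" means for the sample edge list. The function name `paths_of_length_two` is hypothetical and is not part of mrjob; the sketch only serves as a reference result to compare a map-reduce job against.

```python
def paths_of_length_two(edge_list):
    """Return every (start, middle, end) such that start->middle and middle->end are edges."""
    # Adjacency list: node -> list of nodes it points to.
    successors = {}
    for source, destination in edge_list:
        successors.setdefault(source, []).append(destination)

    paths = []
    for start, middles in successors.items():
        for middle in middles:
            # A node with no outgoing edges simply contributes nothing.
            for end in successors.get(middle, []):
                paths.append((start, middle, end))
    return paths


edges = [
    ("url1", "url2"),
    ("url1", "url3"),
    ("url2", "url3"),
    ("url4", "url5"),
    ("url2", "url4"),
]

for path in sorted(paths_of_length_two(edges)):
    print(",".join(path))
```

This version holds the whole graph in memory, which is exactly what a distributed job is supposed to avoid, so it is only useful for checking results on small inputs.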

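I also tried defining custom steps in which the first reducer emits every adjacency list under the same key, so that the next reducer receives them all together. That feels like cheating, though, and I don't think it counts as distributed computing any more.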

from mrjob.job import MRJob
from mrjob.step import MRStep


class MRGraph(MRJob):

    def steps(self):
        return [
            MRStep(mapper=self.mapper_edges, reducer=self.reducer_destinations),
            MRStep(reducer=self.reducer_next_step)
        ]

    def mapper_edges(self, line_no, line):
        line = line.split(',')
        yield line[0], line[1]

    def reducer_destinations(self, source, destinations):
        yield None, (source, list(destinations))

    def reducer_next_step(self, _, adjacency_lists):
        for source, destinations in adjacency_lists:
            # Stuck again: I don't know what to do from here.
            pass

if __name__ == '__main__':
    MRGraph.run()

Any ideas?

Tags: dictionary, graph, path, reduce, mrjob

Solution
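
One possible approach, offered as a sketch rather than a definitive answer, is to treat the problem as a self-join on the middle node: the mapper emits every edge twice, once keyed by its source and once keyed by its destination, so that a single reducer call sees both the predecessors and the successors of one node. The code below simulates this with the standard library only. `run_locally` is a hypothetical stand-in for the shuffle phase, and the `"in"`/`"out"` tags are illustrative names chosen here. In mrjob the same two functions would presumably become the `mapper` and `reducer` methods of a single-step `MRJob` subclass.

```python
from collections import defaultdict
from itertools import product


def mapper(line):
    """Emit each edge twice, keyed by the node that could sit in the middle of a path."""
    source, destination = [part.strip() for part in line.split(",")]
    # Keyed by destination: `source` is a predecessor of that node.
    yield destination, ("in", source)
    # Keyed by source: `destination` is a successor of that node.
    yield source, ("out", destination)


def reducer(middle, tagged_neighbours):
    """Pair every predecessor of `middle` with every successor of `middle`."""
    predecessors, successors = [], []
    for tag, node in tagged_neighbours:
        if tag == "in":
            predecessors.append(node)
        else:
            successors.append(node)
    for start, end in product(predecessors, successors):
        yield start, middle, end


def run_locally(lines):
    """Hypothetical local shuffle: group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    paths = []
    for key in sorted(groups):
        paths.extend(reducer(key, groups[key]))
    return paths


edges = ["url1,url2", "url1,url3", "url2,url3", "url4,url5", "url2,url4"]
for path in run_locally(edges):
    print(",".join(path))
```

Because each reducer call only needs the neighbours of one node, no single key has to carry the whole graph, which avoids the "everything under the same key" workaround from the question. Two caveats under these assumptions: a node with very many neighbours still has its lists buffered in memory inside one reducer call, and if the graph contains both a,b and b,a the sketch also reports a,b,a, so add a `start != end` check if such walks are unwanted.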

