Apache Pig ORDER BY with LIMIT returns null

Problem description

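I have a problem with Pig. I am trying to count how many times an item appears at each location by grouping the records and counting each group. I then order the counts and limit the result to the top ten. Dumping the ordered relation works fine, but dumping the limited relation fails every time. I have searched for this problem and found nothing. Can I get some help? The code is below.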

    lines = LOAD '/share/smallspoilers' USING PigStorage(':') AS (location:chararray,lNum:chararray,item:chararray,iNum:chararray);
    newLines = FOREACH lines GENERATE(item),REPLACE(location, '"', '') AS location;
    newerLines = FOREACH newLines GENERATE(item),REPLACE(location, ' ', '') AS location;
    newestLines = FOREACH newerLines GENERATE(location),REPLACE(item, '"', '') AS item;
    finalLines = FOREACH newestLines GENERATE(location),REPLACE(item, ' ', '') AS item;
    filteredLines = FILTER finalLines BY (item matches 'Lamp');
    grouped = GROUP filteredLines BY location;
    counted = FOREACH grouped GENERATE group, COUNT(filteredLines) AS total;
    ordered = ORDER counted BY total DESC;
    prac = LIMIT ordered 10;
    dump prac;
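
To pin down what the script is expected to return, independent of Pig, the same steps (clean the fields, filter on the item, group by location, count, order descending, keep ten) can be modeled in a few lines of Python. This is only a minimal sketch and not part of the original question: the function name `top_locations` and the sample records are hypothetical, and skipping malformed lines is a simplification (PigStorage would fill missing fields with nulls instead).

```python
from collections import Counter


def top_locations(lines, item_name="Lamp", limit=10):
    """Reference model of the Pig script: count matching items per location."""
    counts = Counter()
    for line in lines:
        fields = line.split(":")
        if len(fields) != 4:
            # Simplification: skip malformed records instead of producing nulls.
            continue
        location, _l_num, item, _i_num = fields
        # Mirror the four REPLACE steps: strip double quotes and spaces.
        location = location.replace('"', "").replace(" ", "")
        item = item.replace('"', "").replace(" ", "")
        # `item matches 'Lamp'` is a full-string match for a literal pattern.
        if item == item_name:
            counts[location] += 1
    # ORDER counted BY total DESC, then LIMIT.
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return ordered[:limit]


# Hypothetical sample records in the location:lNum:item:iNum layout.
sample = [
    '"Kitchen":1:"Lamp":2',
    '"Kitchen":1:" Lamp":5',
    '"Hall":3:"Lamp":1',
    '"Hall":3:"Chair":4',
]
print(top_locations(sample))
```

Comparing this output against `dump ordered;` on the same small input is one way to check whether the data reaching `LIMIT` is what the script expects.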

Tags: apache-pig

Solution

