What is an efficient way to sort an array based on an index array?

Problem description

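In some machine learning algorithms, the columns of a matrix are rotated and sorted according to the relevance of each column. New incoming data must then be transformed into the same order. So if my initial sort gave me [0,2,1,3] as the index array, new data should be ordered the same way: first, third, second, fourth element. That is why I want to create a sorted index array that can later be used as the source for reordering new data. I have already done this in the implementation below.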
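To make the [0,2,1,3] example concrete, here is a minimal sketch of applying a precomputed index array to a new row of data. The class name `ReorderDemo` and the method `reorder` are hypothetical names chosen for this illustration; they are not part of the implementation discussed in this post.

```java
import java.util.Arrays;

class ReorderDemo {

    // Returns a new array whose i-th element is data[indices[i]].
    // Assumes indices is a permutation of 0..data.length-1.
    static <T> T[] reorder(T[] data, int[] indices) {
        if (data.length != indices.length) {
            throw new IllegalArgumentException("Data length does not match index array length");
        }
        // copyOf only serves to obtain an array of the right runtime type and size
        T[] result = Arrays.copyOf(data, data.length);
        for (int i = 0; i < indices.length; i++) {
            result[i] = data[indices[i]];
        }
        return result;
    }

    public static void main(String[] args) {
        String[] row = {"a", "b", "c", "d"};
        int[] indices = {0, 2, 1, 3};
        // first, third, second, fourth element
        System.out.println(Arrays.toString(reorder(row, indices))); // prints [a, c, b, d]
    }
}
```

The input array is left untouched, so the same index array can be applied to any number of later rows.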

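My question is about reordering new data using the index array. In my implementation I first create a clone of the new data array, then copy the elements from the source array to the correct index in the target array. Is this the most efficient way, or is there a more efficient approach, such as sorting the data in place?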

import java.util.stream.*;
import java.util.*;

public class IndexSorter<T> {

   private final int[] indices;
   private final int[] reverted;

   public IndexSorter(T[] data, Comparator<T> comparator){
     
     // generate index array based on initial data and a comparator:
     indices = IntStream.range(0, data.length)
                        .boxed()
                        .sorted( (a, b) -> comparator.compare(data[a],data[b]))
                        .mapToInt(a -> a)
                        .toArray();

     // also create an index array to be able to revert the sort
     reverted = new int[indices.length];
     for(int i=0;i<indices.length;i++){
       reverted[indices[i]] = i;
     }
   }

   // sort new data based on initial array
   public T[] sort(T[] data){
     return sortUsing(data, indices);
   }
   
   // revert sorted data 
   public T[] revert(T[] data){
     return sortUsing(data, reverted);
   }

   private T[] sortUsing(T[] data, int[] ind){
     if(data.length != indices.length){
       throw new IllegalArgumentException(
         String.format("Data length does not match: (%s, should be: %s) "
         ,  data.length, indices.length));
     }
     // create a copy of the data (effectively this just creates a new array of the same type and length)
     T[] sorted = data.clone();
     // fill the copy with the sorted data
     IntStream.range(0, ind.length)
              .forEach(i -> sorted[i]=data[ind[i]]);
     return sorted;
   }
}

class App {
  public static void main(String args[]){
      IndexSorter<String> sorter = new IndexSorter<>(args, String::compareTo);
      String[] data = sorter.sort(args);
      System.out.println(Arrays.toString(data));
      data = sorter.revert(data);
      System.out.println(Arrays.toString(data));
      data = IntStream.range(0, data.length)
                                .mapToObj(Integer::toString)
                                .toArray(String[]::new);
      data = sorter.sort(data);
      System.out.println(Arrays.toString(data));
      data = sorter.revert(data);
      System.out.println(Arrays.toString(data));
  }
}

Tags: java, arrays, sorting, indices

Solution


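I found a way to do the reordering in place, using a BitSet to keep track of which indices already hold the correct element. It is in the method sortUsing. I hope someone can make use of this algorithm.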

You can test it like this:

java App this is just some random test to show the result

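The output first shows the sorted result, then the reverted result. The same index array is then used to sort an array holding the index values themselves, followed by its reverted version: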

[is, just, random, result, show, some, test, the, this, to]
[this, is, just, some, random, test, to, show, the, result]
[1, 2, 4, 9, 7, 3, 5, 8, 0, 6]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
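The last two output lines illustrate why reverting works: the revert array is the inverse permutation of the index array, so applying one after the other yields the identity. The following is a small sketch of that relationship using primitive ints; `InversePermutationDemo`, `invert`, and `apply` are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

class InversePermutationDemo {

    // Builds the inverse permutation: inverse[indices[i]] == i.
    // Assumes indices is a permutation of 0..indices.length-1.
    static int[] invert(int[] indices) {
        int[] inverse = new int[indices.length];
        for (int i = 0; i < indices.length; i++) {
            inverse[indices[i]] = i;
        }
        return inverse;
    }

    // Returns a new array whose i-th element is data[indices[i]].
    static int[] apply(int[] data, int[] indices) {
        int[] out = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[indices[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        // index array taken from the third output line above
        int[] indices = {1, 2, 4, 9, 7, 3, 5, 8, 0, 6};
        int[] inverse = invert(indices);
        System.out.println(Arrays.toString(inverse));                 // prints [8, 0, 1, 5, 2, 6, 9, 4, 7, 3]
        System.out.println(Arrays.toString(apply(indices, inverse))); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```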

Here is the code:

import java.util.stream.*;
import java.util.*;

public class IndexSorter<T> {

  private final int[] indices;
  private final int[] reverted;
  private final BitSet done;

  public IndexSorter(T[] data, Comparator<T> comparator){

    // generate index array based on initial data and a comparator:
    indices = IntStream.range(0, data.length)
                       .boxed()
                       .sorted( (a, b) -> comparator.compare(data[a],data[b]))
                       .mapToInt(a -> a)
                       .toArray();

    // also create an index array to be able to revert the sort
    reverted = new int[indices.length];
    for(int i=0;i<indices.length;i++){
      reverted[indices[i]] = i;
    }
    done = new BitSet(data.length);
  }

  // sort new data based on initial array
  public void sort(T[] data){
    sortUsing(data, indices);
  }

  // revert sorted data 
  public void revert(T[] data){
    sortUsing(data, reverted);
  }

  private void sortUsing(T[] data, int[] ind){
    if(data.length != indices.length){
      throw new IllegalArgumentException(
          String.format("Data length does not match: (%s, should be: %s) "
            ,  data.length, indices.length));
    }
    int ia=0, ib=0, x = 0;
    T a = null, b = null;
    for (int i=0; i< data.length && done.cardinality()<data.length; i++){
      ia = i;
      ib = ind[ia];
      if(done.get(ia)){ // index is already done
        continue;
      } 
      if(ia==ib){       // element is at the right place
        done.set(ia);
        continue;
      }
      x = ia;           // start a loop at x = ia 
                        // some next index will be x again eventually
      a = data[ia];     // keep element a as the last value after the loop 
      while(ib!=x && !done.get(ia) ){
        b = data[ib];   // element from index b must go to index a 
        data[ia]=b;
        done.set(ia);
        ia = ib;
        ib = ind[ia];   // get next index
      }
      data[ia]=a;       // set value a to last index
      done.set(ia);
    }
    done.clear();
  }
}

class App {
  public static void main(String args[]){
    IndexSorter<String> sorter = new IndexSorter<>(args, String::compareTo);
    sorter.sort(args);
    System.out.println(Arrays.toString(args));
    sorter.revert(args);
    System.out.println(Arrays.toString(args));
    String[] data = IntStream.range(0, args.length)
      .mapToObj(Integer::toString)
      .toArray(String[]::new);
    sorter.sort(data);
    System.out.println(Arrays.toString(data));
    sorter.revert(data);
    System.out.println(Arrays.toString(data));
  }
}
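As a possible variation on the code above, and not the author's version, the same cycle-following idea can be written as a static method that allocates its BitSet locally. A local bit set avoids shared mutable state between calls, which matters if one sorter instance is used from several threads, and it lets the loop drop the `done.cardinality()` check, which has to scan the bit set on every outer iteration. The names `InPlacePermuter` and `permuteInPlace` are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

class InPlacePermuter {

    // Rearranges data in place so that afterwards data[i] holds the old data[ind[i]].
    // Assumes ind is a permutation of 0..data.length-1.
    static <T> void permuteInPlace(T[] data, int[] ind) {
        if (data.length != ind.length) {
            throw new IllegalArgumentException("Data length does not match index array length");
        }
        BitSet done = new BitSet(data.length); // local, so calls do not share state
        for (int start = 0; start < data.length; start++) {
            if (done.get(start)) {
                continue; // already placed as part of an earlier cycle
            }
            T first = data[start]; // saved, because slot 'start' is overwritten first
            int current = start;
            int next = ind[current];
            // Follow the cycle until it returns to 'start'. The done check only
            // guarantees termination if ind is not a valid permutation.
            while (next != start && !done.get(current)) {
                data[current] = data[next];
                done.set(current);
                current = next;
                next = ind[current];
            }
            data[current] = first; // close the cycle
            done.set(current);
        }
    }

    public static void main(String[] args) {
        String[] words = {"this", "is", "just", "some", "random", "test", "to", "show", "the", "result"};
        int[] indices = {1, 2, 4, 9, 7, 3, 5, 8, 0, 6};
        permuteInPlace(words, indices);
        System.out.println(Arrays.toString(words));
        // prints [is, just, random, result, show, some, test, the, this, to]
    }
}
```

Each element is moved at most once, so the work is linear in the array length, and the extra memory is one bit per element plus a single temporary.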
