Runtime - AppCDS with pre-initialization

kelthuzadx 2020-09-11 14:25

CDS Class Pre-initialization

Introduction to CDS

Traditional CDS [0] is divided into two major phases, Dump and Use:

-Xshare:off -XX:DumpLoadedClassList=test.log

-Xshare:dump -XX:SharedClassListFile=test.log -XX:SharedArchiveFile=test.jsa

-Xshare:on -XX:SharedArchiveFile=test.jsa

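In practice, using CDS usually takes three steps. The first step produces test.log, a list of class names. The second step generates the CDS archive from that list. The third step runs with the CDS archive. Search the HotSpot source for the two flags DumpSharedSpaces and UseSharedSpaces to find the code and logic behind the last two steps.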
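The three steps can be sketched as a small Java helper that assembles each command line. This is only an illustration: the class name CdsWorkflow, the jar app.jar, and the main class Main are hypothetical placeholders, while the -Xshare and -XX flags are the ones listed above.

```java
import java.util.List;

/** Builds the three JVM command lines of the classic CDS workflow. */
final class CdsWorkflow {
    // Hypothetical application coordinates, used only for illustration.
    static final String CLASSPATH = "app.jar";
    static final String MAIN_CLASS = "Main";

    /** Step 1: run the app with sharing off and record the loaded class names. */
    static List<String> recordClassList(String classList) {
        return List.of("java", "-Xshare:off",
                "-XX:DumpLoadedClassList=" + classList,
                "-cp", CLASSPATH, MAIN_CLASS);
    }

    /** Step 2: dump the archive from the recorded class list. */
    static List<String> dumpArchive(String classList, String archive) {
        return List.of("java", "-Xshare:dump",
                "-XX:SharedClassListFile=" + classList,
                "-XX:SharedArchiveFile=" + archive,
                "-cp", CLASSPATH);
    }

    /** Step 3: run the app with the archive in use. */
    static List<String> runWithArchive(String archive) {
        return List.of("java", "-Xshare:on",
                "-XX:SharedArchiveFile=" + archive,
                "-cp", CLASSPATH, MAIN_CLASS);
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", recordClassList("test.log")));
        System.out.println(String.join(" ", dumpArchive("test.log", "test.jsa")));
        System.out.println(String.join(" ", runWithArchive("test.jsa")));
    }
}
```

Each returned list could be handed to a ProcessBuilder to actually launch the step; the sketch stops at building the arguments so that it stays independent of any particular application.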

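The complete CDS dump flow lives in MetaspaceShared::preload_and_dump. It first does some initialization work, such as reading the classes named in the class list and preloading them, and then uses VMThread::execute to run a VM_PopulateDumpSharedSpace operation, which does the actual work of dumping the archive: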

void MetaspaceShared::preload_and_dump(TRAPS) {
  { TraceTime timer("Dump Shared Spaces", TRACETIME_LOG(Info, startuptime));
    ResourceMark rm;
    char class_list_path_str[JVM_MAXPATHLEN];
    // Preload classes to be shared.
    // Should use some os:: method rather than fopen() here. aB.
    const char* class_list_path;
    if (SharedClassListFile == NULL) {
      // Construct the path to the class list (in jre/lib)
      // Walk up two directories from the location of the VM and
      // optionally tack on "lib" (depending on platform)
      os::jvm_path(class_list_path_str, sizeof(class_list_path_str));
      for (int i = 0; i < 3; i++) {
        char *end = strrchr(class_list_path_str, *os::file_separator());
        if (end != NULL) *end = '\0';
      }
      int class_list_path_len = (int)strlen(class_list_path_str);
      if (class_list_path_len >= 3) {
        if (strcmp(class_list_path_str + class_list_path_len - 3, "lib") != 0) {
          if (class_list_path_len < JVM_MAXPATHLEN - 4) {
            jio_snprintf(class_list_path_str + class_list_path_len,
                         sizeof(class_list_path_str) - class_list_path_len,
                         "%slib", os::file_separator());
            class_list_path_len += 4;
          }
        }
      }
      if (class_list_path_len < JVM_MAXPATHLEN - 10) {
        jio_snprintf(class_list_path_str + class_list_path_len,
                     sizeof(class_list_path_str) - class_list_path_len,
                     "%sclasslist", os::file_separator());
      }
      class_list_path = class_list_path_str;
    } else {
      class_list_path = SharedClassListFile;
    }

    tty->print_cr("Loading classes to share ...");
    _has_error_classes = false;
    int class_count = preload_classes(class_list_path, THREAD);
    if (ExtraSharedClassListFile) {
      class_count += preload_classes(ExtraSharedClassListFile, THREAD);
    }
    tty->print_cr("Loading classes to share: done.");

    log_info(cds)("Shared spaces: preloaded %d classes", class_count);

    // Rewrite and link classes
    tty->print_cr("Rewriting and linking classes ...");

    // Link any classes which got missed. This would happen if we have loaded classes that
    // were not explicitly specified in the classlist. E.g., if an interface implemented by class K
    // fails verification, all other interfaces that were not specified in the classlist but
    // are implemented by K are not verified.
    link_and_cleanup_shared_classes(CATCH);
    tty->print_cr("Rewriting and linking classes: done");

    SystemDictionary::clear_invoke_method_table();
    HeapShared::init_archivable_static_fields(THREAD);

    VM_PopulateDumpSharedSpace op;
    VMThread::execute(&op);
  }
}

That is the overall flow of a CDS dump.
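As a rough illustration of the default class list lookup at the top of preload_and_dump (the branch taken when SharedClassListFile is NULL), the sketch below redoes the path arithmetic in Java. It is a simplification, not HotSpot code: the class DefaultClassListPath and the sample JVM path are hypothetical, and the buffer-length checks against JVM_MAXPATHLEN are left out.

```java
/** Simplified Java rendering of how the dump step locates the default class list. */
final class DefaultClassListPath {
    /**
     * @param jvmPath   full path of the JVM library (the value os::jvm_path fills in)
     * @param separator the platform file separator
     */
    static String resolve(String jvmPath, char separator) {
        String path = jvmPath;
        // Strip the last three path components, like the strrchr loop.
        for (int i = 0; i < 3; i++) {
            int end = path.lastIndexOf(separator);
            if (end >= 0) {
                path = path.substring(0, end);
            }
        }
        // Append "lib" unless the remaining path already ends with it.
        if (path.length() >= 3 && !path.endsWith("lib")) {
            path = path + separator + "lib";
        }
        // The class list file itself is always named "classlist".
        return path + separator + "classlist";
    }

    public static void main(String[] args) {
        // Hypothetical install location, for illustration only.
        System.out.println(resolve("/opt/jdk/lib/server/libjvm.so", '/'));
    }
}
```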

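CDS Field Pre-initialization

Turning to CDS field pre-initialization specifically: this is an experimental attempt. Its major prerequisite is JDK 12's Caching Java Heap Objects (https://wiki.openjdk.java.net/display/HotSpot/Caching+Java+Heap+Objects); everything below builds on that technique.

The existing JDK already does some of this work: during the CDS dump phase it dumps a hard-coded set of static class fields into the CDS archive. This is the HeapShared::init_archivable_static_fields call shown above: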

struct ArchivableStaticFieldInfo {
  const char* klass_name;
  const char* field_name;
  InstanceKlass* klass;
  int offset;
  BasicType type;
};

// If you add new entries to this table, you should know what you're doing!
static ArchivableStaticFieldInfo archivable_static_fields[] = {
  {"jdk/internal/module/ArchivedModuleGraph",  "archivedSystemModules"},
  {"jdk/internal/module/ArchivedModuleGraph",  "archivedModuleFinder"},
  {"jdk/internal/module/ArchivedModuleGraph",  "archivedMainModule"},
  {"jdk/internal/module/ArchivedModuleGraph",  "archivedConfiguration"},
  {"java/util/ImmutableCollections$ListN",     "EMPTY_LIST"},
  {"java/util/ImmutableCollections$MapN",      "EMPTY_MAP"},
  {"java/util/ImmutableCollections$SetN",      "EMPTY_SET"},
  {"java/lang/Integer$IntegerCache",           "archivedCache"},
  {"java/lang/module/Configuration",           "EMPTY_CONFIGURATION"},
};

const static int num_archivable_static_fields =
  sizeof(archivable_static_fields) / sizeof(ArchivableStaticFieldInfo);

class ArchivableStaticFieldFinder: public FieldClosure {
  InstanceKlass* _ik;
  Symbol* _field_name;
  bool _found;
  int _offset;
public:
  ArchivableStaticFieldFinder(InstanceKlass* ik, Symbol* field_name) :
    _ik(ik), _field_name(field_name), _found(false), _offset(-1) {}

  virtual void do_field(fieldDescriptor* fd) {
    if (fd->name() == _field_name) {
      assert(!_found, "fields cannot be overloaded");
      assert(fd->field_type() == T_OBJECT || fd->field_type() == T_ARRAY, "can archive only obj or array fields");
      _found = true;
      _offset = fd->offset();
    }
  }
  bool found()     { return _found;  }
  int offset()     { return _offset; }
};

void HeapShared::init_archivable_static_fields(Thread* THREAD) {
  for (int i = 0; i < num_archivable_static_fields; i++) {
    ArchivableStaticFieldInfo* info = &archivable_static_fields[i];
    TempNewSymbol klass_name =  SymbolTable::new_symbol(info->klass_name, THREAD);
    TempNewSymbol field_name =  SymbolTable::new_symbol(info->field_name, THREAD);

    Klass* k = SystemDictionary::resolve_or_null(klass_name, THREAD);
    assert(k != NULL && !HAS_PENDING_EXCEPTION, "class must exist");
    InstanceKlass* ik = InstanceKlass::cast(k);

    ArchivableStaticFieldFinder finder(ik, field_name);
    ik->do_local_static_fields(&finder);
    assert(finder.found(), "field must exist");

    info->klass = ik;
    info->offset = finder.offset();
  }
}

void HeapShared::archive_static_fields(Thread* THREAD) {
  // For each class X that has one or more archived fields:
  // [1] Dump the subgraph of each archived field
  // [2] Create a list of all the class of the objects that can be reached
  //     by any of these static fields.
  //     At runtime, these classes are initialized before X's archived fields
  //     are restored by HeapShared::initialize_from_archived_subgraph().
  int i;
  for (i = 0; i < num_archivable_static_fields; ) {
    ArchivableStaticFieldInfo* info = &archivable_static_fields[i];
    const char* klass_name = info->klass_name;
    start_recording_subgraph(info->klass, klass_name);

    // If you have specified consecutive fields of the same klass in
    // archivable_static_fields[], these will be archived in the same
    // {start_recording_subgraph ... done_recording_subgraph} pass to
    // save time.
    for (; i < num_archivable_static_fields; i++) {
      ArchivableStaticFieldInfo* f = &archivable_static_fields[i];
      if (f->klass_name != klass_name) {
        break;
      }
      archive_reachable_objects_from_static_field(f->klass, f->klass_name,
                                                  f->offset, f->field_name, CHECK);
    }
    done_recording_subgraph(info->klass, klass_name);
  }

  log_info(cds, heap)("Performed subgraph records = %d times", _num_total_subgraph_recordings);
  log_info(cds, heap)("Walked %d objects", _num_total_walked_objs);
  log_info(cds, heap)("Archived %d objects", _num_total_archived_objs);
  log_info(cds, heap)("Recorded %d klasses", _num_total_recorded_klasses);


#ifndef PRODUCT
  for (int i = 0; i < num_archivable_static_fields; i++) {
    ArchivableStaticFieldInfo* f = &archivable_static_fields[i];
    verify_subgraph_from_static_field(f->klass, f->offset);
  }
  log_info(cds, heap)("Verified %d references", _num_total_verifications);
#endif
}

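HeapShared::init_archivable_static_fields runs during the initialization phase, and HeapShared::archive_static_fields runs while the VMThread executes VM_PopulateDumpSharedSpace. Once both have completed, the hard-coded fields are dumped into the CDS archive together. At run time, when execution reaches jdk.internal.misc.VM.initializeFromArchive, it calls HeapShared::initialize_from_archived_subgraph, which loads the field data directly from the CDS archive and so avoids having the interpreter execute the code that initializes those fields.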
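The grouping loop in archive_static_fields is easy to misread, so here is a minimal Java sketch of the same control flow. The names SubgraphGrouping, Entry, and recordingPasses are hypothetical, and the sketch only collects table entries per pass; it does not archive anything. One difference is deliberate: the C++ loop compares klass_name pointers, whereas the sketch compares string contents.

```java
import java.util.ArrayList;
import java.util.List;

/** Groups consecutive table entries of the same class into one recording pass. */
final class SubgraphGrouping {
    /** One table entry: a class name and one of its static fields. */
    static final class Entry {
        final String klassName;
        final String fieldName;
        Entry(String klassName, String fieldName) {
            this.klassName = klassName;
            this.fieldName = fieldName;
        }
    }

    static List<List<Entry>> recordingPasses(List<Entry> table) {
        List<List<Entry>> passes = new ArrayList<>();
        int i = 0;
        while (i < table.size()) {
            String klassName = table.get(i).klassName;
            List<Entry> pass = new ArrayList<>();   // corresponds to start_recording_subgraph
            // Consume every consecutive entry that belongs to the same class.
            for (; i < table.size(); i++) {
                Entry f = table.get(i);
                if (!f.klassName.equals(klassName)) {
                    break;
                }
                pass.add(f);                        // one field archived within this pass
            }
            passes.add(pass);                       // corresponds to done_recording_subgraph
        }
        return passes;
    }

    public static void main(String[] args) {
        List<Entry> table = List.of(
                new Entry("jdk/internal/module/ArchivedModuleGraph", "archivedSystemModules"),
                new Entry("jdk/internal/module/ArchivedModuleGraph", "archivedModuleFinder"),
                new Entry("java/util/ImmutableCollections$ListN", "EMPTY_LIST"),
                new Entry("java/lang/Integer$IntegerCache", "archivedCache"));
        for (List<Entry> pass : recordingPasses(table)) {
            System.out.println(pass.get(0).klassName + ": " + pass.size() + " field(s)");
        }
    }
}
```

Because grouping only looks at neighbours, entries of the same class that are not adjacent in the table end up in separate passes, which is why the table above keeps the four ArchivedModuleGraph fields together.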

For example, the hard-coded table above dumps the archivedCache field of java/lang/Integer$IntegerCache into the CDS archive. The IntegerCache code looks like this:

 private static class IntegerCache {
        static final int low = -128;
        static final int high;
        static final Integer[] cache;
        static Integer[] archivedCache;

        static {
            // high value may be configured by property
            int h = 127;
            String integerCacheHighPropValue =
                VM.getSavedProperty("java.lang.Integer.IntegerCache.high");
            if (integerCacheHighPropValue != null) {
                try {
                    int i = parseInt(integerCacheHighPropValue);
                    i = Math.max(i, 127);
                    // Maximum array size is Integer.MAX_VALUE
                    h = Math.min(i, Integer.MAX_VALUE - (-low) -1);
                } catch( NumberFormatException nfe) {
                    // If the property cannot be parsed into an int, ignore it.
                }
            }
            high = h;

            // Load IntegerCache.archivedCache from archive, if possible
            VM.initializeFromArchive(IntegerCache.class);
            int size = (high - low) + 1;

            // Use the archived cache if it exists and is large enough
            if (archivedCache == null || size > archivedCache.length) {
                Integer[] c = new Integer[size];
                int j = low;
                for(int k = 0; k < c.length; k++)
                    c[k] = new Integer(j++);
                archivedCache = c;
            }
            cache = archivedCache;
            // range [-128, 127] must be interned (JLS7 5.1.7)
            assert IntegerCache.high >= 127;
        }

        private IntegerCache() {}
    }

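The static block calls VM.initializeFromArchive(IntegerCache.class) to load this field from the CDS archive, which saves the cost of interpreting the code below it that builds the archivedCache array.

The limitation is just as obvious: the fields that can go into the CDS archive must be hard-coded in the HotSpot VM implementation, and the JDK code must call a dedicated function. The mechanism is inflexible, and application code cannot use it.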
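The pattern in the static block (reuse the archived value when it exists and is large enough, otherwise build it the slow way) can be shown in isolation. The sketch below is a stand-in built on assumptions: application code cannot call jdk.internal.misc.VM.initializeFromArchive, so the restored array is passed in as a plain parameter, and the names ArchivedCacheSketch and resolveCache are hypothetical.

```java
/** Isolates the "use the archived array if it fits, else rebuild" decision. */
final class ArchivedCacheSketch {
    static final int LOW = -128;

    /**
     * @param archived array restored from the archive, or null if nothing was restored
     *                 (stand-in for the effect of VM.initializeFromArchive)
     * @param high     upper bound of the cached range
     */
    static Integer[] resolveCache(Integer[] archived, int high) {
        int size = (high - LOW) + 1;
        if (archived != null && size <= archived.length) {
            return archived;                 // fast path: no allocation, no loop
        }
        // Slow path: the work that the archive is meant to save.
        Integer[] c = new Integer[size];
        int j = LOW;
        for (int k = 0; k < c.length; k++) {
            c[k] = Integer.valueOf(j++);
        }
        return c;
    }

    public static void main(String[] args) {
        Integer[] built = resolveCache(null, 127);      // nothing archived: build it
        Integer[] reused = resolveCache(built, 127);    // archived and large enough
        Integer[] rebuilt = resolveCache(built, 1000);  // archived but too small
        System.out.println("reused same array: " + (reused == built));
        System.out.println("rebuilt new array: " + (rebuilt != built));
    }
}
```

The third call mirrors the case where java.lang.Integer.IntegerCache.high is set larger at run time than it was at dump time: the archived array is then too small and is discarded.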

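CDS Class Pre-initialization

Pre-initializing only a handful of hard-coded fields would be too narrow. The technique can be extended further, which is the subject of this section: pre-initialization of whole classes. zhoujiangli's proposal documents [1], [2] cover this in more detail.

The proposal introduces a new annotation, @Preserve. A class annotated with @Preserve is declared eligible for class pre-initialization.

[Figure: effect of @Preserve]

The existing scheme needs full cooperation between VM.initializeFromArchive and the HotSpot VM source, but with @Preserve the VM can recognize the annotation itself, and no call to VM.initializeFromArchive is needed.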
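To make the contrast concrete, here is a sketch of what a @Preserve-annotated class might look like. Everything about the annotation here is an assumption: it is declared locally as a hypothetical stand-in, and its real package, retention, and target are whatever the proposal [1], [2] defines. On a stock JVM the annotation has no effect and the static initializer simply runs as usual.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class PreserveSketch {
    /** Hypothetical stand-in for the proposed annotation; not a real JDK type. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Preserve {}

    /**
     * A class whose static state is a pure function of constants, which makes it
     * a plausible candidate for pre-initialization. Note that the initializer
     * contains no call to VM.initializeFromArchive.
     */
    @Preserve
    static final class SmallSquares {
        static final int[] TABLE;
        static {
            int[] t = new int[16];
            for (int i = 0; i < t.length; i++) {
                t[i] = i * i;
            }
            TABLE = t;
        }
    }

    /** Returns true if the class carries the (hypothetical) marker annotation. */
    static boolean isPreserved(Class<?> c) {
        return c.isAnnotationPresent(Preserve.class);
    }

    public static void main(String[] args) {
        System.out.println(isPreserved(SmallSquares.class));
        System.out.println(SmallSquares.TABLE[3]);
    }
}
```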

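With the @Preserve scheme, the whole CDS and run-time flow is as follows:

  1. Class initialization phase
    At CDS dump time, the classes to be dumped are already linked; this is done when MetaspaceShared::preload_and_dump calls preload_classes. Some of them have already been initialized and need no further handling. Any class that is annotated with @Preserve but has not yet been initialized is initialized explicitly.
  2. Subgraph check phase
    Once Java heap object archiving has finished, all objects reachable directly or indirectly from static fields form a subgraph. This phase walks the subgraph, and the subgraph cannot be archived if it contains any of the following types:
    • non-mirror java.lang.Class objects
    • ClassLoader objects
    • java.security.ProtectionDomain objects
    • java.lang.Thread objects
    • Runnable objects
    • java.io.File objects
    • TBD
  3. Static field value preservation phase
    After the check passes, the static fields are archived into the archive mirror object, so no run-time call to VM.initializeFromArchive is needed.
  4. Run-time handling of @Preserve-annotated classes
    Several cases arise at run time, but in none of them does clinit need to be called again.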
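The subgraph check in step 2 can be approximated in plain Java with a reflective graph walk. This is a sketch under several assumptions, not the VM's implementation: SubgraphCheckSketch, isArchivable, and Holder are hypothetical names, the non-mirror java.lang.Class rule is omitted because the sketch does not model the mirror distinction, fields that the module system will not open are skipped, and Java 9 or later is assumed for Field.trySetAccessible.

```java
import java.io.File;
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.security.ProtectionDomain;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

final class SubgraphCheckSketch {
    // Types from the list above, minus the non-mirror Class rule.
    static final Class<?>[] FORBIDDEN = {
        ClassLoader.class, ProtectionDomain.class, Thread.class,
        Runnable.class, File.class
    };

    /** Simple container used to build test graphs. */
    static final class Holder {
        final Object ref;
        Holder(Object ref) { this.ref = ref; }
    }

    /** Returns true if no forbidden object is reachable from root. */
    static boolean isArchivable(Object root) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> work = new ArrayDeque<>();
        if (root != null) work.push(root);
        while (!work.isEmpty()) {
            Object o = work.pop();
            if (!seen.add(o)) continue;                  // already visited (handles cycles)
            for (Class<?> bad : FORBIDDEN) {
                if (bad.isInstance(o)) return false;
            }
            Class<?> c = o.getClass();
            if (c.isArray()) {
                if (!c.getComponentType().isPrimitive()) {
                    int n = Array.getLength(o);
                    for (int i = 0; i < n; i++) {
                        Object e = Array.get(o, i);
                        if (e != null) work.push(e);
                    }
                }
                continue;
            }
            // Follow every non-static reference field, including inherited ones.
            for (Class<?> k = c; k != null; k = k.getSuperclass()) {
                for (Field f : k.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                    if (!f.trySetAccessible()) continue; // encapsulated field: skipped
                    try {
                        Object v = f.get(o);
                        if (v != null) work.push(v);
                    } catch (IllegalAccessException ignored) {
                        // Unreadable field: skipped, like the inaccessible ones.
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isArchivable(new Holder(new int[] {1, 2, 3})));
        System.out.println(isArchivable(new Holder(Thread.currentThread())));
    }
}
```

Skipping inaccessible fields makes the sketch optimistic: a forbidden object hidden behind an encapsulated JDK field would go unnoticed, whereas a check inside the VM would see the whole graph.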

References

[0] https://openjdk.java.net/jeps/310
[1] http://cr.openjdk.java.net/~jiangli/Leyden/Java Class Pre-resolution and Pre-initialization (OpenJDK).pdf
[2] http://cr.openjdk.java.net/~jiangli/Leyden/Selectively Pre-initializing and Preserving Java Classes (OpenJDK).pdf
