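This series is based on OpenJDK 9 and focuses on the x86 platform.

Using the invokevirtual instruction as an example, this article analyzes how the HotSpot JVM interpreter derives symbolic reference information from an instruction's operand.

Background

For bytecode instructions such as invokevirtual, the operand is 2 bytes wide. Before resolution it holds a constant pool (ConstantPool) index, from which the symbolic reference information can be obtained step by step. Once the resolution phase has turned the symbolic reference into a direct reference, the operand holds the index of a constant pool cache entry (ConstantPoolCacheEntry).

As mentioned in the previous article, the low 2 bytes of a cache entry's _indices field hold the original constant pool index.

Main text

Consider the following source (openjdk\hotspot\src\share\vm\interpreter\interpreterRuntime.cpp):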

LinkResolver::resolve_invoke(info, receiver, pool,
                             get_index_u2_cpcache(thread, bytecode), bytecode,
                             CHECK);

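This calls LinkResolver::resolve_invoke to resolve the bytecode instruction bytecode.

Getting the constant pool index

The pool argument is the constant pool. The get_index_u2_cpcache method returns the constant pool index (before resolution) or the constant pool cache index (after resolution).

It is implemented as follows (openjdk\hotspot\src\share\vm\interpreter\interpreterRuntime.hpp):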

get_index_u2_cpcache(JavaThread *thread, Bytecodes::Code bc)
{ return bytecode(thread).get_index_u2_cpcache(bc); }

In the Bytecode class (openjdk\hotspot\src\share\vm\interpreter\bytecode.hpp):

int get_index_u2_cpcache(Bytecodes::Code bc) const {
  assert_same_format_as(bc); assert_index_size(2, bc); assert_native_index(bc);
  return Bytes::get_native_u2(addr_at(1)) + ConstantPool::CPCACHE_INDEX_TAG;
}

addr_at(1) returns the address of the operand:

 // Address computation
address addr_at (int offset) const { return (address)_bcp + offset; }

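_bcp is the bytecode pointer, which points at the current bytecode instruction. Because the opcode occupies 1 byte, adding 1 gives the address of the operand.

Bytes::get_native_u2 reads 2 bytes in the byte order of the CPU architecture.

ConstantPool::CPCACHE_INDEX_TAG is only used for debugging and can be ignored here.

After these calls, we have the constant pool index that immediately follows the opcode.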
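The operand read described above can be sketched in a few lines of standalone C++. This is only an illustration, not HotSpot code: read_native_u2, write_native_u2, operand_index_with_tag and kSketchIndexTag are hypothetical names, and the tag value is an assumed placeholder rather than the value HotSpot uses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for ConstantPool::CPCACHE_INDEX_TAG (assumed value).
constexpr int kSketchIndexTag = 0x10000;

// Read 2 bytes in the host's byte order; memcpy tolerates unaligned addresses.
inline std::uint16_t read_native_u2(const std::uint8_t* p) {
    std::uint16_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Write 2 bytes in the host's byte order (used to build sample instructions).
inline void write_native_u2(std::uint8_t* p, std::uint16_t v) {
    std::memcpy(p, &v, sizeof v);
}

// Mirrors addr_at(1) followed by the 2-byte read: skip the 1-byte opcode,
// read the operand, then add the debugging tag.
inline int operand_index_with_tag(const std::uint8_t* bcp) {
    return read_native_u2(bcp + 1) + kSketchIndexTag;
}
```

For example, given a 3-byte buffer whose first byte is 0xb6 (the invokevirtual opcode) and whose operand was written with write_native_u2(code + 1, 5), subtracting kSketchIndexTag from operand_index_with_tag(code) gives back 5.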

Getting the symbolic reference information

Resolution proper starts here (openjdk\hotspot\src\share\vm\interpreter\linkResolver.cpp):

void LinkResolver::resolve_invoke(CallInfo& result, Handle recv, const constantPoolHandle& pool, int index, Bytecodes::Code byte, TRAPS) {
  switch (byte) {
    case Bytecodes::_invokestatic   : resolve_invokestatic   (result,       pool, index, CHECK); break;
    case Bytecodes::_invokespecial  : resolve_invokespecial  (result, recv, pool, index, CHECK); break;
    case Bytecodes::_invokevirtual  : resolve_invokevirtual  (result, recv, pool, index, CHECK); break;
    case Bytecodes::_invokehandle   : resolve_invokehandle   (result,       pool, index, CHECK); break;
    case Bytecodes::_invokedynamic  : resolve_invokedynamic  (result,       pool, index, CHECK); break;
    case Bytecodes::_invokeinterface: resolve_invokeinterface(result, recv, pool, index, CHECK); break;
  }
  return;
}

Focus on the _invokevirtual case:

void LinkResolver::resolve_invokevirtual(CallInfo& result, Handle recv,
                                         const constantPoolHandle& pool, int index,
                                         TRAPS) {

  LinkInfo link_info(pool, index, CHECK);
  KlassHandle recvrKlass (THREAD, recv.is_null() ? (Klass*)NULL : recv->klass());
  resolve_virtual_call(result, recv, recvrKlass, link_info, /*check_null_or_abstract*/true, CHECK);
}

This article only looks at this line:

LinkInfo link_info(pool, index, CHECK);

As its name suggests, LinkInfo holds link information, that is, the symbolic reference information. Its structure is as follows (openjdk\hotspot\src\share\vm\interpreter\linkResolver.hpp):

// Condensed information from constant pool to use to resolve the method or field.
//   resolved_klass = specified class (i.e., static receiver class)
//   current_klass  = sending method holder (i.e., class containing the method
//                    containing the call being resolved)
//   current_method = sending method (relevant for field resolution)
class LinkInfo : public StackObj {
  Symbol*      _name;            // extracted from JVM_CONSTANT_NameAndType
  Symbol*      _signature;
  KlassHandle  _resolved_klass;  // class that the constant pool entry points to
  KlassHandle  _current_klass;   // class that owns the constant pool
  methodHandle _current_method;  // sending method
  bool         _check_access;
  constantTag  _tag;

Its constructor does a series of things (openjdk\hotspot\src\share\vm\interpreter\linkResolver.cpp):

LinkInfo::LinkInfo(const constantPoolHandle& pool, int index, TRAPS) {
  // resolve klass
  Klass* result = pool->klass_ref_at(index, CHECK);
  _resolved_klass = KlassHandle(THREAD, result);

  // Get name, signature, and static klass
  _name          = pool->name_ref_at(index);
  _signature     = pool->signature_ref_at(index);
  _tag           = pool->tag_ref_at(index);
  _current_klass = KlassHandle(THREAD, pool->pool_holder());
  _current_method = methodHandle();

  // Coming from the constant pool always checks access
  _check_access  = true;
}

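The invokevirtual instruction invokes a method. To resolve that method to a direct reference, its details must be known first: the class that declares it, its name, and its signature (parameter and return types). The constructor collects each of these, and together they make up the method's symbolic reference information.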
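To make the three pieces of a method's symbolic reference concrete, here is a minimal sketch that follows the class-file constant pool layout (a Methodref entry points at a Class entry and a NameAndType entry, which in turn point at Utf8 entries). It is a toy model, not HotSpot's ConstantPool: Entry, SymbolicMethodRef, symbolic_ref_at and sample_pool are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

enum class Tag { Unused, Utf8, Class, NameAndType, Methodref };

// One toy constant pool entry. The meaning of a and b depends on the tag:
//   Class:       a = index of the Utf8 class name
//   NameAndType: a = index of the Utf8 name, b = index of the Utf8 descriptor
//   Methodref:   a = index of the Class entry, b = index of the NameAndType entry
struct Entry {
    Tag tag;
    std::string utf8;  // only used by Utf8 entries
    std::uint16_t a;
    std::uint16_t b;
};

// The three pieces of information that identify a method symbolically.
struct SymbolicMethodRef {
    std::string klass_name;
    std::string name;
    std::string signature;
};

// Follow the indices from a Methodref entry down to the Utf8 strings.
inline SymbolicMethodRef symbolic_ref_at(const std::vector<Entry>& pool,
                                         std::uint16_t index) {
    const Entry& method = pool.at(index);
    assert(method.tag == Tag::Methodref);
    const Entry& klass = pool.at(method.a);
    const Entry& name_and_type = pool.at(method.b);
    assert(klass.tag == Tag::Class && name_and_type.tag == Tag::NameAndType);
    return SymbolicMethodRef{pool.at(klass.a).utf8,
                             pool.at(name_and_type.a).utf8,
                             pool.at(name_and_type.b).utf8};
}

// A small pool describing java/lang/Object.toString()Ljava/lang/String;
inline std::vector<Entry> sample_pool() {
    std::vector<Entry> pool(7);  // index 0 is unused, as in a class file
    pool[1] = {Tag::Methodref, "", 2, 4};
    pool[2] = {Tag::Class, "", 3, 0};
    pool[3] = {Tag::Utf8, "java/lang/Object", 0, 0};
    pool[4] = {Tag::NameAndType, "", 5, 6};
    pool[5] = {Tag::Utf8, "toString", 0, 0};
    pool[6] = {Tag::Utf8, "()Ljava/lang/String;", 0, 0};
    return pool;
}
```

Calling symbolic_ref_at(sample_pool(), 1) yields the class name, method name and signature as three strings. The sketch keeps the class as a name only, whereas the constructor quoted above goes one step further and resolves the class to a Klass*.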

Take the class information as an example (openjdk\hotspot\src\share\vm\oops\constantPool.cpp):

Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
  return klass_at(klass_ref_index_at(which), THREAD);
}

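There are two steps: first get the class reference index, then use that index to get the class information.

Getting the class reference index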

int klass_ref_index_at(int which)               { return impl_klass_ref_index_at(which, false); }

This calls the implementation function.

int ConstantPool::impl_klass_ref_index_at(int which, bool uncached) {
  guarantee(!ConstantPool::is_invokedynamic_index(which),
            "an invokedynamic instruction does not have a klass");
  int i = which;
  if (!uncached && cache() != NULL) {
    // change byte-ordering and go via cache
    i = remap_instruction_operand_from_cache(which);
  }
  assert(tag_at(i).is_field_or_method(), "Corrupted constant pool");
  jint ref_index = *int_at_addr(i);
  return extract_low_short_from_int(ref_index);
}

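There are two cases. If the entry has not been cached, the which argument is already a constant pool index. Otherwise, as noted earlier, the constant pool index lives in the constant pool cache, in the low 2 bytes of the cache entry's _indices field.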

int ConstantPool::remap_instruction_operand_from_cache(int operand) {
  int cpc_index = operand;
  DEBUG_ONLY(cpc_index -= CPCACHE_INDEX_TAG);
  assert((int)(u2)cpc_index == cpc_index, "clean u2");
  int member_index = cache()->entry_at(cpc_index)->constant_pool_index();
  return member_index;
}

This fetches the target data from the constant pool cache.
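The index remapping and the low-short extraction can be sketched together as follows. This is a simplified model under stated assumptions, not HotSpot code: SketchCacheEntry, SketchPool, pack_method_ref, low_short and sample_sketch_pool are hypothetical names, the upper half of the indices word is filled with arbitrary bits, the other half of the packed slot is assumed to hold the name-and-type index, and the debug tag handling is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified cache entry: only the low 2 bytes of the indices word matter here.
struct SketchCacheEntry {
    std::uint32_t indices;
    int constant_pool_index() const { return static_cast<int>(indices & 0xFFFFu); }
};

// Pack a method reference slot: class index in the low 16 bits and,
// by assumption, the name-and-type index in the high 16 bits.
inline std::uint32_t pack_method_ref(std::uint16_t class_index,
                                     std::uint16_t name_and_type_index) {
    return (static_cast<std::uint32_t>(name_and_type_index) << 16) | class_index;
}

// Counterpart of extract_low_short_from_int in the quoted code.
inline int low_short(std::uint32_t v) { return static_cast<int>(v & 0xFFFFu); }

struct SketchPool {
    std::vector<std::uint32_t> slots;     // packed method reference slots
    std::vector<SketchCacheEntry> cache;  // empty means "no cache yet"

    // Same two cases as the quoted function: either `which` is already a
    // constant pool index, or it is a cache index that must be remapped first.
    int klass_ref_index_at(int which, bool uncached) const {
        int i = which;
        if (!uncached && !cache.empty()) {
            i = cache.at(static_cast<std::size_t>(which)).constant_pool_index();
        }
        return low_short(slots.at(static_cast<std::size_t>(i)));
    }
};

// Pool with one method reference at index 7 and one cache entry mapping to it.
inline SketchPool sample_sketch_pool() {
    SketchPool pool;
    pool.slots.assign(8, 0u);
    pool.slots[7] = pack_method_ref(2, 4);
    pool.cache.push_back(SketchCacheEntry{0xABCD0007u});  // upper bits are arbitrary
    return pool;
}
```

With sample_sketch_pool(), both klass_ref_index_at(0, false), which goes through the cache, and klass_ref_index_at(7, true), which uses the constant pool index directly, arrive at class reference index 2.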

Getting the class information from the index

Klass* klass_at(int which, TRAPS) {
  constantPoolHandle h_this(THREAD, this);
  return klass_at_impl(h_this, which, true, THREAD);
}

This calls the implementation function.

Klass* ConstantPool::klass_at_impl(const constantPoolHandle& this_cp, int which,
                                   bool save_resolution_error, TRAPS) {
  assert(THREAD->is_Java_thread(), "must be a Java thread");

  // A resolved constantPool entry will contain a Klass*, otherwise a Symbol*.
  // It is not safe to rely on the tag bit's here, since we don't have a lock, and
  // the entry and tag is not updated atomicly.
  CPSlot entry = this_cp->slot_at(which);
  if (entry.is_resolved()) {
    assert(entry.get_klass()->is_klass(), "must be");
    // Already resolved - return entry.
    return entry.get_klass();
  }

  // This tag doesn't change back to unresolved class unless at a safepoint.
  if (this_cp->tag_at(which).is_unresolved_klass_in_error()) {
    // The original attempt to resolve this constant pool entry failed so find the
    // class of the original error and throw another error of the same class
    // (JVMS 5.4.3).
    // If there is a detail message, pass that detail message to the error.
    // The JVMS does not strictly require us to duplicate the same detail message,
    // or any internal exception fields such as cause or stacktrace.  But since the
    // detail message is often a class name or other literal string, we will repeat it
    // if we can find it in the symbol table.
    throw_resolution_error(this_cp, which, CHECK_0);
    ShouldNotReachHere();
  }

  Handle mirror_handle;
  Symbol* name = entry.get_symbol();
  Handle loader (THREAD, this_cp->pool_holder()->class_loader());
  Handle protection_domain (THREAD, this_cp->pool_holder()->protection_domain());
  Klass* kk = SystemDictionary::resolve_or_fail(name, loader, protection_domain, true, THREAD);
  KlassHandle k (THREAD, kk);
  if (!HAS_PENDING_EXCEPTION) {
    // preserve the resolved klass from unloading
    mirror_handle = Handle(THREAD, kk->java_mirror());
    // Do access check for klasses
    verify_constant_pool_resolve(this_cp, k, THREAD);
  }

  // Failed to resolve class. We must record the errors so that subsequent attempts
  // to resolve this constant pool entry fail with the same error (JVMS 5.4.3).
  if (HAS_PENDING_EXCEPTION) {
    if (save_resolution_error) {
      save_and_throw_exception(this_cp, which, constantTag(JVM_CONSTANT_UnresolvedClass), CHECK_NULL);
      // If CHECK_NULL above doesn't return the exception, that means that
      // some other thread has beaten us and has resolved the class.
      // To preserve old behavior, we return the resolved class.
      entry = this_cp->resolved_klass_at(which);
      assert(entry.is_resolved(), "must be resolved if exception was cleared");
      assert(entry.get_klass()->is_klass(), "must be resolved to a klass");
      return entry.get_klass();
    } else {
      return NULL;  // return the pending exception
    }
  }

  // Make this class loader depend upon the class loader owning the class reference
  ClassLoaderData* this_key = this_cp->pool_holder()->class_loader_data();
  this_key->record_dependency(k(), CHECK_NULL); // Can throw OOM

  // logging for class+resolve.
  if (log_is_enabled(Debug, class, resolve)){
    trace_class_resolution(this_cp, k);
  }
  this_cp->klass_at_put(which, k());
  entry = this_cp->resolved_klass_at(which);
  assert(entry.is_resolved() && entry.get_klass()->is_klass(), "must be resolved at this point");
  return entry.get_klass();
}

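The code is long, but the gist is simple: if the class has already been resolved, return it directly; otherwise try to resolve the class, store the result in the constant pool, and return it.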
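The resolve-once-then-reuse pattern summarized above can be reduced to the following sketch. It is a deliberately small model, not HotSpot code: SketchKlass, SketchSlot, SketchResolvingPool and sample_resolving_pool are hypothetical names, a std::map stands in for class loading, and the error recording, access checks and thread-safety concerns of the real function are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct SketchKlass {
    std::string name;
};

// A slot starts out holding only a class name and gains a pointer once resolved.
struct SketchSlot {
    std::string name;
    const SketchKlass* resolved;  // nullptr until the first successful resolution
};

struct SketchResolvingPool {
    std::vector<SketchSlot> slots;
    std::map<std::string, SketchKlass> dictionary;  // stand-in for class loading
    int lookups;                                    // counts slow-path resolutions

    const SketchKlass* klass_at(std::size_t which) {
        SketchSlot& slot = slots.at(which);
        if (slot.resolved != nullptr) {
            return slot.resolved;  // already resolved: return the cached class
        }
        ++lookups;
        std::map<std::string, SketchKlass>::iterator it = dictionary.find(slot.name);
        if (it == dictionary.end()) {
            return nullptr;  // resolution failed: the slot stays unresolved
        }
        slot.resolved = &it->second;  // write the result back into the pool
        return slot.resolved;
    }
};

// Pool with one resolvable class name and one name that cannot be resolved.
inline SketchResolvingPool sample_resolving_pool() {
    SketchResolvingPool pool;
    pool.lookups = 0;
    pool.dictionary["java/lang/Object"] = SketchKlass{"java/lang/Object"};
    pool.slots.push_back(SketchSlot{"java/lang/Object", nullptr});
    pool.slots.push_back(SketchSlot{"does/not/Exist", nullptr});
    return pool;
}
```

Calling klass_at(0) twice on the sample pool performs the dictionary lookup only once; the second call takes the fast path and returns the same pointer. Unlike this sketch, which simply returns nullptr on failure, the quoted code records the failure so that later attempts fail with the same error.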

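That yields the class that the method's symbolic reference points to. The method name and the other items are obtained in much the same way.

Aside

With the method's symbolic reference information in hand, it can be resolved to a direct reference. Later articles will cover that part.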