1. A bit of rambling
- At work, a Java service went down with a fatal error (which I call a JVM crash). The service's runtime log contained the summary of the fatal error:
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x000000010a7d52e8, pid=47989, tid=11011
    #
    # JRE version: OpenJDK Runtime Environment Temurin-17.0.6+10 (17.0.6+10) (build 17.0.6+10)
    # Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.6+10 (17.0.6+10, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
    # Problematic frame:
    # V [libjvm.dylib+0xada2e8] Unsafe_GetByte(JNIEnv_*, _jobject*, _jobject*, long)+0xd8
    #
    # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
    #
    # An error report file with more information is saved as:
    # /Users/xxx/IdeaProjects/study/hs_err_pid47989.log
    #
    # If you would like to submit a bug report, please visit:
    # https://github.com/adoptium/adoptium-support/issues
    #
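- The service runs in k8s. Because no path for the fatal error log had been configured in advance (one mounted from a host directory), the log was lost when the container restarted, and the cause could not be investigated any further.
- So I needed to look up the relevant JVM options and write the fatal error log to a designated directory, so that it is persisted on the host's disk.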
2. Configuring JVM options to persist the log
2.1 Setting the fatal error log path with -XX:ErrorFile
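- After some reading, I learned that the path of the hs_err log can be set with -XX:ErrorFile=filename.
- In the example below, the fatal error log is written to a specified directory, and %p in the file name is replaced with the PID (process id) of the Java program:
    -XX:ErrorFile=/var/log/java/java_error%p.log
- By default, the fatal error log is written to the Java program's working directory under the name hs_err_pid<pid>.log. If it cannot be written there (insufficient space, missing permissions, and so on), it is written to the system's temporary directory instead.
- See the official JDK documentation for details:
- JDK 8: A Fatal Error Log
- JDK 17: A Fatal Error Log, Command-Line Options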
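To confirm inside a running service that the options described above were actually picked up, the flag values can be read back at runtime. The following is a minimal sketch, assuming a HotSpot-based JDK where `com.sun.management.HotSpotDiagnosticMXBean` is available; the class name `FatalErrorFlags` and the `describe` helper are made up for illustration and are not part of the original setup.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads back HotSpot flags such as ErrorFile and OnError.
class FatalErrorFlags {

    /** Returns "name=value (origin)" for the given HotSpot flag. */
    static String describe(String flagName) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // getVMOption throws IllegalArgumentException if the flag does not exist
        VMOption option = bean.getVMOption(flagName);
        return option.getName() + "=" + option.getValue() + " (" + option.getOrigin() + ")";
    }

    public static void main(String[] args) {
        // An empty value means the flag was not set and the default location applies
        System.out.println(describe("ErrorFile"));
        System.out.println(describe("OnError"));
    }
}
```

The origin part of the output indicates whether the value is a default or was supplied when the VM was created, which helps spot a flag that never reached the JVM.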
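2.2 My misconfiguration
- The service gets essentially the same pid after every restart, so if a fatal error occurs more than once, a log named only by pid is overwritten.
- Drawing on my earlier experience configuring heap dumps, I added %t, expecting it to produce a timestamp such as 2023-08-16_23-33-08:
    -XX:ErrorFile=/data_path/var/log/hs_err_pid%p_%t.log
- When a fatal error occurred again, the log file turned out to be named hs_err_pid6_%t.log; that is, %t was not expanded as expected.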
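The behavior observed here can be modeled in a few lines. This is only a sketch of what the resulting file name suggests (%p is substituted, %t is copied through untouched); it is not HotSpot's actual implementation, and the class and method names are hypothetical.

```java
// Hypothetical model of the file-name expansion observed in this article.
class ErrorFilePattern {

    /**
     * Replaces "%p" with the pid and leaves everything else,
     * including "%t", exactly as written.
     */
    static String expand(String pattern, long pid) {
        return pattern.replace("%p", Long.toString(pid));
    }

    public static void main(String[] args) {
        // Use the pid of the current JVM for a realistic result
        long pid = ProcessHandle.current().pid();
        System.out.println(expand("/data_path/var/log/hs_err_pid%p_%t.log", pid));
    }
}
```

With pid 6, as in the container, the pattern `hs_err_pid%p_%t.log` becomes `hs_err_pid6_%t.log`, which matches the file name that was actually produced.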
2.3 Renaming the file with -XX:OnError
- Inspired by the question "How to specify a unique name for the JVM crash log files?", I configured -XX:OnError so that, once the log has been generated, a shell command adds a timestamp to its name:
    -XX:ErrorFile=/data_path/var/log/hs_err.log
    -XX:OnError="time=`date +%Y%m%d_%H%M%S` && mv /data_path/var/log/hs_err.log /data_path/var/log/hs_err_\${time}.log"
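For readers less used to the `date` format string, the target file name that the shell command is meant to produce can be spelled out in Java. This is an illustrative sketch only: the rename itself still has to be done by the OnError shell command, since the JVM is crashing at that point. The class name `TimestampedName` is hypothetical, and the sample time is the one from the local run reported in section 4.1.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper that mirrors the naming scheme of the OnError command.
class TimestampedName {

    // Same layout as the shell's `date +%Y%m%d_%H%M%S`
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss");

    /** Builds the name the hs_err.log file should have after the rename. */
    static String renamed(LocalDateTime when) {
        return "hs_err_" + STAMP.format(when) + ".log";
    }

    public static void main(String[] args) {
        // prints hs_err_20230827_202458.log
        System.out.println(renamed(LocalDateTime.of(2023, 8, 27, 20, 24, 58)));
    }
}
```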
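3. How to trigger a fatal error?
- Whether the goal is to verify the relevant JVM options or to learn how to read a fatal error log, knowing how to trigger a fatal error on demand is essential.
- Following "Write Java code to crash the java virtual machine", the code below successfully triggers a fatal error locally:
    import sun.misc.Unsafe;

    import java.lang.reflect.Field;

    public class CrashTest {

        public static void main(String... args) throws Exception {
            // Reading from address 0 through Unsafe makes the JVM crash with SIGSEGV
            getUnsafe().getByte(0);
        }

        // Obtains the Unsafe singleton reflectively from its private static field
        private static Unsafe getUnsafe() throws NoSuchFieldException, IllegalAccessException {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            return (Unsafe) theUnsafe.get(null);
        }
    }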
4. Open questions
4.1 Local verification: OK
- Following the description above, I configured the following JVM options for CrashTest:
    -XX:ErrorFile=/data_path/hs_err.log
    -XX:OnError="time=`date +%Y%m%d_%H%M%S` && echo $time && mv /data_path/hs_err.log /data_path/hs_err_${time}.log"
- When the program runs, it prints the following:
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x000000010a49e2e8, pid=56245, tid=11011
    #
    # JRE version: OpenJDK Runtime Environment Temurin-17.0.6+10 (17.0.6+10) (build 17.0.6+10)
    # Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.6+10 (17.0.6+10, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
    # Problematic frame:
    # V [libjvm.dylib+0xada2e8] Unsafe_GetByte(JNIEnv_*, _jobject*, _jobject*, long)+0xd8
    #
    # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
    #
    # An error report file with more information is saved as:
    # /data_path/hs_err.log
    #
    # If you would like to submit a bug report, please visit:
    # https://github.com/adoptium/adoptium-support/issues
    #
    #
    # -XX:OnError="time=`date +%Y%m%d_%H%M%S` && mv /data_path/hs_err.log /data_path/hs_err_${time}.log"
    # Executing /bin/sh -c "time=`date +%Y%m%d_%H%M%S` && mv /data_path/hs_err.log /data_path/hs_err_${time}.log" ...
- In the end, the fatal error log was named hs_err_20230827_202458.log, as expected!
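4.2 Verification in the test environment failed
- After moving this configuration to the deployed service, the fatal error log turned out to be named hs_err_.log, which is not what I expected.
- Suspicion: ${time} was not expanded correctly.
- A Q&A, "How to add the timestamp of the fatal error occurrence to Java fatal error log filename", describes a problem similar to mine. Based on it, the configuration was updated to:
    -XX:ErrorFile={{ .Values.server.data_dir }}/var/log/hs_err.log
    -XX:OnError="mv {{ .Values.server.data_dir }}/var/log/hs_err.log {{ .Values.server.data_dir }}/var/log/hs_err_\$(date +%Y%m%d_%H%M%S).log"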
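- Although the configuration has been updated, the bug that caused the fatal error has already been fixed, so the effect of the new configuration cannot be verified for now.
- The options are either to wait until a fatal error occurs again, or to roll back the image version to trigger the fatal error.
- If I get a chance to verify this configuration later, I will update the results here. For now, this records a possible solution.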
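Until the deployed configuration can be checked, one way to try the `$(date ...)` variant locally without any intermediate shell quoting is to start the child JVM from Java, where every argument reaches the JVM exactly as written. The sketch below rests on several assumptions: `java` is on the PATH, the compiled `CrashTest` class from section 3 is in the current directory, and the log directory is passed as the first program argument. `CrashLauncher` is a hypothetical name, and this does not reproduce how the k8s template is rendered.

```java
import java.util.List;

// Hypothetical launcher: passes the options to a child JVM with no shell in between.
class CrashLauncher {

    /** Builds the child JVM command line; each list element is one argument. */
    static List<String> command(String logDir, String mainClass) {
        String errorFile = logDir + "/hs_err.log";
        // $(date ...) stays literal here, so /bin/sh expands it only at crash time
        String onError = "mv " + errorFile + " " + logDir
                + "/hs_err_$(date +%Y%m%d_%H%M%S).log";
        return List.of("java",
                "-XX:ErrorFile=" + errorFile,
                "-XX:OnError=" + onError,
                mainClass);
    }

    public static void main(String[] args) throws Exception {
        // args[0]: log directory, args[1]: main class to run, e.g. CrashTest
        Process child = new ProcessBuilder(command(args[0], args[1]))
                .inheritIO()
                .start();
        System.out.println("child exit code: " + child.waitFor());
    }
}
```

If the renamed file carries a timestamp in this setup but not in the cluster, that would point at the way the option string is quoted or rendered before it reaches the JVM, rather than at -XX:OnError itself.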