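The service went down
A scheduled-task service went down for no apparent reason, and the service's own logs contained no error entries.
Report to the team lead first
Restore the business
Investigate the problem
Monitor the service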
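Because the service had no monitoring, nobody noticed when it went down. The outage was only discovered through anomalies in the statistics, at which point I went straight to the logs.
Strangely, the application log contained no error entries at all, which made the investigation much harder. The next place to look was the JVM fatal error log, hs_err_pid*****.log, which records the JVM crash information. Analyzing this file makes it possible to locate the cause of the JVM crash and fix it so the system stays stable.
Log header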
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (os_linux.cpp:2640), pid=114181, tid=0x00007f9340e91700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_171-b11) (build 1.8.0_171-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
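Two lines in the header stand out:
Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.
Out of Memory Error (os_linux.cpp:2640), pid=114181, tid=0x00007f9340e91700
Reduce the thread stack size. This needs attention once the thread count reaches roughly 3000 to 5000. The default JVM thread stack size (-Xss) here is 1024K, so with many threads the native virtual memory gets used up. In practice a thread stack of 128K or 256K is usually enough, unless the code recurses deeply, so it is sufficient to set the stack size explicitly to 128K or 256K with -Xss.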
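To put rough numbers on the thread stack advice above, here is a small back-of-the-envelope sketch. It multiplies the thread counts and -Xss values quoted in the text. The class and method names are made up for illustration, and the result is only an upper bound on reserved address space: Linux commits stack pages as they are touched, so actual resident memory is normally lower.

```java
// Rough estimate of the address space reserved for thread stacks.
// Thread counts and stack sizes are the figures quoted in the text;
// StackReservation and reservedMiB are hypothetical names for this sketch.
class StackReservation {
    /** Reserved stack space in MiB for a thread count and an -Xss value given in KiB. */
    static long reservedMiB(int threads, int stackKiB) {
        return (long) threads * stackKiB / 1024;
    }

    public static void main(String[] args) {
        int[] threadCounts = {3000, 5000};
        int[] stackSizesKiB = {1024, 256, 128};
        for (int threads : threadCounts) {
            for (int stackKiB : stackSizesKiB) {
                System.out.printf("%d threads x %dK stack = %d MiB reserved%n",
                        threads, stackKiB, reservedMiB(threads, stackKiB));
            }
        }
    }
}
```

With 5000 threads, the default 1024K stack reserves about 5000 MiB, while 256K reserves about 1250 MiB.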
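The log header clearly shows an Out of Memory Error: the process ran out of memory. Note that the failed allocation is native memory requested from the operating system (mmap), not the Java heap. Options for 64-bit Linux:
Reduce the Java heap size (-Xmx / -Xms)
Reduce the number of Java threads (starting from what the business logic actually needs)
Reduce the Java thread stack size (-Xss)
Set a larger code cache with -XX:ReservedCodeCacheSize=
Thread stack information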
--------------- P R O C E S S ---------------
Java Threads: ( => current thread )
=>0x00007f9b9447d800 JavaThread "pool-32458-thread-1" [_thread_new, id=2337, stack(0x00007f9340d91000,0x00007f9340e92000)]
  0x00007f9b8c471000 JavaThread "pool-32456-thread-1" [_thread_blocked, id=2336, stack(0x00007f932a62b000,0x00007f932a72c000)]
  0x00007f9ba44a4000 JavaThread "pool-32455-thread-1" [_thread_blocked, id=2330, stack(0x00007f932a72c000,0x00007f932a82d000)]
  0x00007f9b745ed800 JavaThread "pool-32454-thread-1" [_thread_blocked, id=2319, stack(0x00007f932a82d000,0x00007f932a92e000)]
  0x00007f9b7862a000 JavaThread "pool-32453-thread-1" [_thread_blocked, id=2318, stack(0x00007f932a92e000,0x00007f932aa2f000)]
  0x00007f9b6c5cd800 JavaThread "pool-32452-thread-1" [_thread_blocked, id=2302, stack(0x00007f932aa2f000,0x00007f932ab30000)]
  0x00007f9b98bf0000 JavaThread "pool-32451-thread-1" [_thread_blocked, id=2297, stack(0x00007f932ab30000,0x00007f932ac31000)]
  0x00007f9b44633000 JavaThread "Keep-Alive-Timer" daemon [_thread_blocked, id=2285, stack(0x00007f9330e93000,0x00007f9330f94000)]
  0x00007f9b6450b000 JavaThread "pool-32450-thread-1" [_thread_blocked, id=2187, stack(0x00007f932ac31000,0x00007f932ad32000)]
  0x00007f9b9447b000 JavaThread "pool-32449-thread-1" [_thread_blocked, id=2159, stack(0x00007f932ad32000,0x00007f932ae33000)]
  0x00007f9b8c46f000 JavaThread "pool-32448-thread-1" [_thread_blocked, id=2100, stack(0x00007f932ae33000,0x00007f932af34000)]
  0x00007f9b8059b800 JavaThread "pool-32447-thread-1" [_thread_blocked, id=2068, stack(0x00007f932af34000,0x00007f932b035000)]
  0x00007f9ba44a2000 JavaThread "pool-32446-thread-1" [_thread_blocked, id=1895, stack(0x00007f932b035000,0x00007f932b136000)]
  0x00007f9b745eb000 JavaThread "pool-32445-thread-1" [_thread_blocked, id=1865, stack(0x00007f932b136000,0x00007f932b237000)]
  0x00007f9b78628000 JavaThread "pool-32444-thread-1" [_thread_blocked, id=1864, stack(0x00007f932b237000,0x00007f932b338000)]
  0x00007f9b6c5cb800 JavaThread "pool-32443-thread-1" [_thread_blocked, id=1854, stack(0x00007f932b338000,0x00007f932b439000)]
  0x00007f9b98bed800 JavaThread "pool-32442-thread-1" [_thread_blocked, id=1850, stack(0x00007f932b439000,0x00007f932b53a000)]
  0x00007f9b64508800 JavaThread "pool-32441-thread-1" [_thread_blocked, id=1849, stack(0x00007f932b53a000,0x00007f932b63b000)]
  0x00007f9b94479000 JavaThread "pool-32440-thread-1" [_thread_blocked, id=1835, stack(0x00007f932b63b000,0x00007f932b73c000)]
  0x00007f9b8c46d000 JavaThread "pool-32439-thread-1" [_thread_blocked, id=1832, stack(0x00007f932b73c000,0x00007f932b83d000)]
  0x00007f9b80599000 JavaThread "pool-32438-thread-1" [_thread_blocked, id=1729, stack(0x00007f932b83d000,0x00007f932b93e000)]
  0x00007f9ba449f800 JavaThread "pool-32437-thread-1" [_thread_blocked, id=1657, stack(0x00007f932b93e000,0x00007f932ba3f000)]
  0x00007f9b78625800 JavaThread "pool-32436-thread-1" [_thread_blocked, id=1412, stack(0x00007f932ba3f000,0x00007f932bb40000)]
  0x00007f9b54782000 JavaThread "pool-32435-thread-1" [_thread_blocked, id=1183, stack(0x00007f932bb40000,0x00007f932bc41000)]
  0x00007f9b486df800 JavaThread "pool-32434-thread-1" [_thread_blocked, id=1182, stack(0x00007f932bc41000,0x00007f932bd42000)]
  0x00007f9b44631000 JavaThread "pool-2-thread-16487" [_thread_blocked, id=1180, stack(0x00007f932bd42000,0x00007f932be43000)]
  0x00007f9b4462f000 JavaThread "pool-2-thread-16486" [_thread_blocked, id=1177, stack(0x00007f932be43000,0x00007f932bf44000)]
  0x0000000001d29800 JavaThread "pool-32433-thread-1" [_thread_blocked, id=1176, stack(0x00007f932bf44000,0x00007f932c045000)]
  0x00007f9c5458a800 JavaThread "pool-32432-thread-1" [_thread_blocked, id=1175, stack(0x00007f932c045000,0x00007f932c146000)]
  0x00007f9b4462d000 JavaThread "pool-2-thread-16485" [_thread_blocked, id=1174, stack(0x00007f932c146000,0x00007f932c247000)]
  0x00007f9c4465c800 JavaThread "pool-32431-thread-1" [_thread_blocked, id=1173, stack(0x00007f932c247000,0x00007f932c348000)]
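The Java thread list is full of thread-pool threads named pool-N-thread-1, nearly all of them blocked, and the pool number has climbed to 32458. This is the root cause: a new pool was created for every execution. The Java thread pool was misused, with each use creating a fresh pool through newSingleThreadScheduledExecutor.
Each new pool does consume off-heap memory. I did not trace this down to the lowest level, but a thread pool manages threads, and the JVM has to request thread resources from the OS for every thread. On Linux a thread is a lightweight process, so each one created takes up OS resources, which from the point of view of the Java heap is off-heap memory. Inside the JVM, because the pools are never released, the old generation also grows slowly, though far less dramatically than the off-heap memory; once the old generation reaches 100%, the JVM itself reports an out-of-memory error. At the operating system level, so much VIRT is occupied that even a simple top sometimes cannot run because it cannot allocate memory.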
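The misuse described above can be reproduced in miniature. The following is a minimal sketch, not the service's actual code: the class, the method names and the counting thread factory are all invented for illustration, and the no-op jobs stand in for the real scheduled work. It counts how many threads get created when a new newSingleThreadScheduledExecutor is built per job versus when one pool is shared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch; class and method names are hypothetical.
class PoolPerCallDemo {
    // Wraps the default factory (thread names pool-N-thread-M) and counts creations.
    static ThreadFactory counting(AtomicInteger created) {
        ThreadFactory delegate = Executors.defaultThreadFactory();
        return runnable -> {
            created.incrementAndGet();
            return delegate.newThread(runnable);
        };
    }

    static void shutdown(ScheduledExecutorService pool) throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }

    /** Anti-pattern: one new pool per job. Returns the number of threads created. */
    static int poolPerCall(int jobs) throws Exception {
        AtomicInteger created = new AtomicInteger();
        List<ScheduledExecutorService> leaked = new ArrayList<>();
        for (int i = 0; i < jobs; i++) {
            ScheduledExecutorService pool =
                    Executors.newSingleThreadScheduledExecutor(counting(created));
            pool.schedule(() -> { }, 0, TimeUnit.MILLISECONDS).get(); // run one job
            leaked.add(pool); // the buggy pattern would simply drop this reference
        }
        for (ScheduledExecutorService pool : leaked) {
            shutdown(pool); // cleanup only so that this demo can exit
        }
        return created.get();
    }

    /** Fix: one long-lived pool shared by all jobs. Returns the number of threads created. */
    static int sharedPool(int jobs) throws Exception {
        AtomicInteger created = new AtomicInteger();
        ScheduledExecutorService pool =
                Executors.newSingleThreadScheduledExecutor(counting(created));
        try {
            for (int i = 0; i < jobs; i++) {
                pool.schedule(() -> { }, 0, TimeUnit.MILLISECONDS).get();
            }
        } finally {
            shutdown(pool);
        }
        return created.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool per call: " + poolPerCall(20) + " threads created");
        System.out.println("shared pool:   " + sharedPool(20) + " thread(s) created");
    }
}
```

For 20 jobs the per-call variant creates 20 threads and the shared pool creates 1. Without the cleanup loop, each of those 20 worker threads would stay parked with its own stack, which is the pattern visible in the thread list.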
[hs_err_pid file]
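Optimization
A thread pool must be shut down with shutdown() once it is no longer needed, and the code must stop creating a new pool on every call. The server has 16G of memory in total. This service was originally started with 2G; the maximum memory was raised to 3G and the thread stack size was set to 256K.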
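One possible way to make the shutdown() rule hard to forget is a small helper called from a finally block. The sketch below assumes a short-lived pool and a 5-second timeout. The class name, the helper name and the timeout are illustrative choices, not part of the original fix.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch; ShutdownSketch and shutdownAndAwait are hypothetical names.
class ShutdownSketch {
    /** Orderly shutdown that escalates to shutdownNow() if the pool does not finish in time. */
    static boolean shutdownAndAwait(ExecutorService pool, long timeoutSeconds) {
        pool.shutdown(); // stop accepting new tasks
        try {
            if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // interrupt workers that are still running
                return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        try {
            pool.schedule(() -> System.out.println("job ran"), 10, TimeUnit.MILLISECONDS).get();
        } finally {
            boolean clean = shutdownAndAwait(pool, 5);
            System.out.println("terminated=" + pool.isTerminated() + " clean=" + clean);
        }
    }
}
```

For a job that runs repeatedly, keeping one long-lived pool and shutting it down only when the service stops avoids the repeated creation altogether.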