Collecting thread pool metrics


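Knowing the state of a thread pool, such as the number of active threads and how many tasks are waiting in the buffer queue, is very valuable. It shows us how the application is behaving and gives us solid evidence for deciding whether the pool parameters need tuning.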
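As a minimal sketch of the state mentioned above, the plain `java.util.concurrent.ThreadPoolExecutor` already exposes these numbers through getters. The class name `PoolSnapshot` and the pool sizes below are illustrative assumptions, not part of the original setup.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSnapshot {

    /** Formats the pool's current counters into a single line. */
    static String describe(ThreadPoolExecutor pool) {
        return String.format("active=%d pool=%d core=%d max=%d queued=%d completed=%d",
                pool.getActiveCount(), pool.getPoolSize(), pool.getCorePoolSize(),
                pool.getMaximumPoolSize(), pool.getQueue().size(), pool.getCompletedTaskCount());
    }

    public static void main(String[] args) throws InterruptedException {
        // Illustrative sizes: 2 core threads, 4 max threads, queue of 10.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(10));

        CountDownLatch started = new CountDownLatch(2); // the two tasks that get a thread
        CountDownLatch release = new CountDownLatch(1); // keeps every task blocked

        for (int i = 0; i < 5; i++) {
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        started.await();
        // Two tasks are running on the core threads, three are waiting in the queue.
        System.out.println(describe(pool));

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        // All five tasks have finished and the worker threads are gone.
        System.out.println(describe(pool));
    }
}
```

A queue that keeps growing while `active` stays at `core` is the kind of signal that suggests the pool parameters deserve a second look.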

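Our company's microservices already use metrics to collect runtime indicators, with Prometheus and Grafana handling collection and visualization. So I wanted to use the same stack to collect thread pool runtime information and display it in Grafana. The figure shows the data for one thread pool deployed on two servers: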

[Image: image-20210211155758983]

Let's start collecting the data in the application.

1. build.gradle

plugins {
    id 'org.springframework.boot' version '2.4.1'
    id 'io.spring.dependency-management' version '1.0.10.RELEASE'
    id 'java'
}

group = 'com.example'
version = '0.0.1-SNAPSHOT'
sourceCompatibility = '1.8'

configurations {
    compileOnly {
        extendsFrom annotationProcessor
    }
}

repositories {
    mavenLocal()
    maven { url 'https://repo.spring.io/milestone' }
    maven { url "http://maven.aliyun.com/nexus/content/groups/public/" }
    mavenCentral()
    jcenter()
}

ext {
    set('springCloudVersion', "2020.0.0-M6")
    set('mybatisPlusVersion', "3.4.0")
    set('beetlVersion', "3.3.0.RELEASE")
}

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.boot:spring-boot-starter-validation'
    implementation 'org.springframework.cloud:spring-cloud-starter-sleuth'
    compileOnly 'org.projectlombok:lombok'
    testCompileOnly 'org.projectlombok:lombok'
    annotationProcessor 'org.springframework.boot:spring-boot-configuration-processor'
    annotationProcessor 'org.projectlombok:lombok'
    testAnnotationProcessor 'org.projectlombok:lombok'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
    implementation 'org.springframework.boot:spring-boot-starter-actuator'
    runtimeOnly 'io.micrometer:micrometer-registry-prometheus'
}

dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:${springCloudVersion}"
    }
}

test {
    useJUnitPlatform()
}

2. application.yml

server:
  port: 8080
  shutdown: graceful
spring:
  application:
    name: demo
  lifecycle:
    timeout-per-shutdown-phase: 30s
  jackson:
    time-zone: GMT+8
management:
  endpoints:
    web:
      base-path: /actuator
      exposure:
        include: "*"

3. Thread pool configuration

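import java.util.concurrent.ThreadPoolExecutor;

import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Slf4j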
@Configuration
public class ExecutorConfig {

    public static final String SERVICE_EXECUTOR = "serviceExecutor";

    @Autowired
    private BossExecutorProperties executorProperties;
   
    @Bean(SERVICE_EXECUTOR)
    public ThreadPoolTaskExecutor serviceExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(executorProperties.getService().getCorePoolSize());
        executor.setMaxPoolSize(executorProperties.getService().getMaxPoolSize());
        executor.setQueueCapacity(executorProperties.getService().getQueueCapacity());
        executor.setKeepAliveSeconds((int) executorProperties.getService().getKeepAlive().getSeconds());
        executor.setThreadNamePrefix("executor-service-");
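        // Rejection policy: CallerRunsPolicy. When the pool has no capacity left,
        // the rejected task runs directly on the thread that called execute().
        // If the executor has already been shut down, the task is discarded.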
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.setWaitForTasksToCompleteOnShutdown(true);
        executor.setAwaitTerminationSeconds(executorProperties.getService().getAwaitTerminationSeconds());
        return executor;
    }

}
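The `CallerRunsPolicy` behaviour described in the comment above can be observed with plain `java.util.concurrent`, without Spring. This is only a sketch: the class `CallerRunsDemo` and the deliberately tiny pool (one thread, one queue slot) are assumptions chosen so the pool saturates immediately.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

public class CallerRunsDemo {

    /** One worker, one queue slot, CallerRunsPolicy: the smallest pool that can saturate. */
    static ThreadPoolExecutor tinyPool() {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1), new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Saturates the pool, submits one more task, and returns the name of the thread that ran it. */
    static String threadOfRejectedTask() {
        ThreadPoolExecutor pool = tinyPool();
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        pool.execute(blocker); // occupies the only worker thread
        pool.execute(blocker); // fills the only queue slot

        AtomicReference<String> ranOn = new AtomicReference<>();
        // Worker and queue are both full, so this task is rejected
        // and CallerRunsPolicy runs it synchronously on the calling thread.
        pool.execute(() -> ranOn.set(Thread.currentThread().getName()));

        release.countDown();
        pool.shutdown();
        return ranOn.get();
    }

    /** Returns whether a task submitted after shutdown() was executed. */
    static boolean runsAfterShutdown() {
        ThreadPoolExecutor pool = tinyPool();
        pool.shutdown();
        AtomicBoolean ran = new AtomicBoolean(false);
        pool.execute(() -> ran.set(true)); // rejected, and the policy silently drops it
        return ran.get();
    }

    public static void main(String[] args) {
        System.out.println("caller thread:      " + Thread.currentThread().getName());
        System.out.println("rejected task on:   " + threadOfRejectedTask());
        System.out.println("ran after shutdown: " + runsAfterShutdown());
    }
}
```

Because the rejected task blocks the submitting thread while it runs, this policy acts as a simple form of back-pressure. That is worth keeping in mind when the caller is a request-handling thread.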

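import java.time.Duration;

import javax.validation.Valid;
import javax.validation.constraints.NotNull;

import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;
import org.springframework.validation.annotation.Validated;

@Data
@Validated
@Configuration
@ConfigurationProperties(prefix = "boss.config.executor")
public class BossExecutorProperties {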
    @Valid
    @NotNull
    private ExecutorProperties async;

    @Valid
    @NotNull
    private ExecutorProperties service;

    @Data
    public static class ExecutorProperties {
        @NotNull
        private Integer corePoolSize;
        @NotNull
        private Integer maxPoolSize;
        @NotNull
        private Integer queueCapacity;
        @NotNull
        private Duration keepAlive;
        // Primitive with a default value, so @NotNull would have no effect here.
        private int awaitTerminationSeconds = 30;
    }

}
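The properties class above binds everything under `boss.config.executor`, and both `async` and `service` are marked `@NotNull`, so the application needs matching entries in its configuration. The application.yml shown in step 2 does not include them. A fragment along the following lines should bind through Spring Boot's relaxed binding (kebab-case keys map to the camelCase fields, and `60s` is parsed as a `Duration`). All numbers are placeholder values chosen for illustration, not the author's settings.

```yaml
boss:
  config:
    executor:
      service:
        core-pool-size: 8          # placeholder value
        max-pool-size: 16          # placeholder value
        queue-capacity: 200        # placeholder value
        keep-alive: 60s
        await-termination-seconds: 30
      async:
        core-pool-size: 4          # placeholder value
        max-pool-size: 8           # placeholder value
        queue-capacity: 100        # placeholder value
        keep-alive: 60s
        await-termination-seconds: 30
```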

4. Collecting thread pool runtime data

Here we use Micrometer to collect the data.

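import javax.annotation.PostConstruct;

import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.stereotype.Component;

/**
 * The key code for collecting the metrics lives here.
 */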
@Component
public class MetricsSupport {

    @Autowired
    private MeterRegistry meterRegistry;
   
    @Autowired
    @Qualifier(ExecutorConfig.SERVICE_EXECUTOR)
    private ThreadPoolTaskExecutor serviceExecutor;

    @PostConstruct
    public void init() {
        initServiceExecutorMetrics(serviceExecutor, "executor.service");
    }

    /**
     * Registers the monitoring gauges for a thread pool.
     * @param serviceExecutor the thread pool to observe
     * @param namePrefix prefix for the metric names
     */
    private void initServiceExecutorMetrics(ThreadPoolTaskExecutor serviceExecutor, String namePrefix) {
        Gauge.builder(namePrefix + ".active", serviceExecutor, ThreadPoolTaskExecutor::getActiveCount)
                .register(meterRegistry);
        Gauge.builder(namePrefix + ".core", serviceExecutor, ThreadPoolTaskExecutor::getCorePoolSize)
                .register(meterRegistry);
        Gauge.builder(namePrefix + ".max", serviceExecutor, ThreadPoolTaskExecutor::getMaxPoolSize)
                .register(meterRegistry);
        Gauge.builder(namePrefix + ".pool", serviceExecutor, ThreadPoolTaskExecutor::getPoolSize)
                .register(meterRegistry);
        Gauge.builder(namePrefix + ".queue", serviceExecutor, executor -> executor.getThreadPoolExecutor().getQueue().size())
                .register(meterRegistry);
        Gauge.builder(namePrefix + ".task", serviceExecutor, executor -> executor.getThreadPoolExecutor().getTaskCount())
                .register(meterRegistry);
        Gauge.builder(namePrefix + ".complete", serviceExecutor, executor -> executor.getThreadPoolExecutor().getCompletedTaskCount())
                .register(meterRegistry);
    }
}
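It helps to remember that a gauge registered this way does not store a number. It stores the object and a function, and the function is evaluated each time the value is read, for example when Prometheus scrapes the endpoint. The sketch below imitates that idea with the standard library only. `LazyGaugeSketch` is a hypothetical stand-in written for illustration and is not how Micrometer is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.DoubleSupplier;
import java.util.function.ToDoubleFunction;

/** Hypothetical miniature registry: it remembers how to read a value, not the value itself. */
public class LazyGaugeSketch {

    private final Map<String, DoubleSupplier> gauges = new LinkedHashMap<>();

    /** Registers a named gauge backed by a function of the target object. */
    <T> void gauge(String name, T target, ToDoubleFunction<T> fn) {
        gauges.put(name, () -> fn.applyAsDouble(target));
    }

    /** Evaluates every registered function now, the way a scrape would. */
    Map<String, Double> scrape() {
        Map<String, Double> sample = new LinkedHashMap<>();
        gauges.forEach((name, supplier) -> sample.put(name, supplier.getAsDouble()));
        return sample;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

        LazyGaugeSketch registry = new LazyGaugeSketch();
        registry.gauge("executor.service.complete", pool, ThreadPoolExecutor::getCompletedTaskCount);

        System.out.println(registry.scrape()); // sampled before any task has run

        pool.execute(() -> { });
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);

        System.out.println(registry.scrape()); // same gauge, new value, no re-registration
    }
}
```

Two practical consequences follow. The gauges only need to be registered once at startup, as the `@PostConstruct` method does. And since `task` and `complete` only ever increase, a rate over them is usually more informative on the Grafana side than the raw value.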

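The real work is in step 4. Once it is done, start the application and the corresponding metrics are visible at the /actuator/prometheus endpoint. The last step is to configure the matching panels in Grafana.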


Author: shiv
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit shiv when reposting!