无论是哪一款注册中心,优雅发布要解决的问题都是优雅上线和优雅下线。本文结合 Nacos 的原理讲解了 Nacos 的优雅发布,希望对你有所帮助。
大家好,我是君哥。
(资料图)
今天来聊一聊使用 Nacos 做注册中心怎么做优雅发布。
跟其他的注册中心一样,Nacos 作为注册中心的使用如下图:
Service Provider 启动后注册到 Nacos Server,Service Consumer 则从 Nacos Server 拉取服务列表,根据一定算法选择一个 Service Provider 来发送请求。
对于优雅发布,要求是 Service Provider 上线(注册到 Nacos)后,服务能够正常地接收和处理请求,而 Service Provider 停服后,则不会再收到请求。这就有两个要求:
优雅上线:Service Provider 发布完成之前,Service Consumer 不应该从服务列表中拉取到这个服务地址;优雅下线:Service Provider 下线后,Service Consumer 不会从服务列表中拉取到这个服务地址。解决了这两个问题,优雅发布就可以做到了。
搭建环境是为了看 Nacos 日志,通过日志找到对应的源代码。本文搭建的环境如下图:
启动 springboot-provider 的应用,注册到 Nacos,启动日志如下:
2023-06-11 18:58:10,120 [main] [INFO] com.alibaba.nacos.client.naming - [BEAT] adding beat: BeatInfo{port=8083, ip="192.168.31.94", weight=1.0, serviceName="DEFAULT_GROUP@@springboot-provider", cluster="DEFAULT", metadata={management.endpoints.web.base-path=/actuator, management.port=18082, preserved.register.source=SPRING_CLOUD, management.address=127.0.0.1}, scheduled=false, period=5000, stopped=false} to beat map.2023-06-11 18:58:10,121 [main] [INFO] com.alibaba.nacos.client.naming - [REGISTER-SERVICE] public registering service DEFAULT_GROUP@@springboot-provider with instance: Instance{instanceId="null", ip="192.168.31.94", port=8083, weight=1.0, healthy=true, enabled=true, ephemeral=true, clusterName="DEFAULT", serviceName="null", metadata={management.endpoints.web.base-path=/actuator, management.port=18082, preserved.register.source=SPRING_CLOUD, management.address=127.0.0.1}}2023-06-11 18:58:10,133 [main] [INFO] com.alibaba.cloud.nacos.registry.NacosServiceRegistry - nacos registry, DEFAULT_GROUP springboot-provider 192.168.31.94:8083 register finished2023-06-11 18:58:10,221 [main] [INFO] org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat initialized with port(s): 18082 (http)2023-06-11 18:58:10,222 [main] [INFO] org.apache.coyote.http11.Http11NioProtocol - Initializing ProtocolHandler ["http-nio-127.0.0.1-18082"]2023-06-11 18:58:10,223 [main] [INFO] org.apache.catalina.core.StandardService - Starting service [Tomcat]2023-06-11 18:58:10,223 [main] [INFO] org.apache.catalina.core.StandardEngine - Starting Servlet engine: [Apache Tomcat/9.0.21]2023-06-11 18:58:10,239 [main] [INFO] org.apache.catalina.core.ContainerBase.[Tomcat-1].[localhost].[/] - Initializing Spring embedded WebApplicationContext2023-06-11 18:58:10,239 [main] [INFO] org.springframework.web.context.ContextLoader - Root WebApplicationContext: initialization completed in 99 ms2023-06-11 18:58:10,268 [main] [INFO] org.springframework.boot.actuate.endpoint.web.EndpointLinksResolver - Exposing 22 endpoint(s) beneath base path "/actuator"2023-06-11 18:58:10,336 [main] [INFO] org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-127.0.0.1-18082"]2023-06-11 18:58:10,340 [main] [INFO] org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 18082 (http) with context path ""2023-06-11 18:58:10,342 [main] [INFO] boot.Application - Started Application in 7.051 seconds (JVM running for 7.874)2023-06-11 18:58:10,358 [main] [INFO] com.alibaba.nacos.client.config.impl.ClientWorker - [fixed-39.105.183.91_8848] [subscribe] springboot-provider.properties+DEFAULT_GROUP2023-06-11 18:58:10,359 [main] [INFO] com.alibaba.nacos.client.config.impl.CacheData - [fixed-39.105.183.91_8848] [add-listener] ok, tenant=, dataId=springboot-provider.properties, group=DEFAULT_GROUP, cnt=12023-06-11 18:58:10,359 [main] [INFO] com.alibaba.nacos.client.config.impl.ClientWorker - [fixed-39.105.183.91_8848] [subscribe] springboot-provider-dev.properties+DEFAULT_GROUP2023-06-11 18:58:10,359 [main] [INFO] com.alibaba.nacos.client.config.impl.CacheData - [fixed-39.105.183.91_8848] [add-listener] ok, tenant=, dataId=springboot-provider-dev.properties, group=DEFAULT_GROUP, cnt=12023-06-11 18:58:10,360 [main] [INFO] com.alibaba.nacos.client.config.impl.ClientWorker - [fixed-39.105.183.91_8848] [subscribe] springboot-provider+DEFAULT_GROUP2023-06-11 18:58:10,360 [main] [INFO] com.alibaba.nacos.client.config.impl.CacheData - [fixed-39.105.183.91_8848] [add-listener] ok, tenant=, dataId=springboot-provider, group=DEFAULT_GROUP, cnt=12023-06-11 18:58:10,639 [RMI TCP Connection(1)-192.168.31.94] [INFO] org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring DispatcherServlet "dispatcherServlet"2023-06-11 18:58:10,839 [com.alibaba.nacos.client.naming.updater] [INFO] com.alibaba.nacos.client.naming - [BEAT] adding beat: BeatInfo{port=8083, ip="192.168.31.94", weight=1.0, serviceName="DEFAULT_GROUP@@springboot-provider", cluster="DEFAULT", metadata={management.endpoints.web.base-path=/actuator, management.port=18082, preserved.register.source=SPRING_CLOUD, management.address=127.0.0.1}, scheduled=false, period=5000, stopped=false} to beat map.2023-06-11 18:58:10,840 [com.alibaba.nacos.client.naming.updater] [INFO] com.alibaba.nacos.client.naming - modified ips(1) service: DEFAULT_GROUP@@springboot-provider@@DEFAULT -> [{"instanceId":"192.168.31.94#8083#DEFAULT#DEFAULT_GROUP@@springboot-provider","ip":"192.168.31.94","port":8083,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@springboot-provider","metadata":{"management.endpoints.web.base-path":"/actuator","management.port":"18082","preserved.register.source":"SPRING_CLOUD","management.address":"127.0.0.1"},"ipDeleteTimeout":30000,"instanceHeartBeatInterval":5000,"instanceHeartBeatTimeOut":15000}]2023-06-11 18:58:10,841 [com.alibaba.nacos.client.naming.updater] [INFO] com.alibaba.nacos.client.naming - current ips:(1) service: DEFAULT_GROUP@@springboot-provider@@DEFAULT -> [{"instanceId":"192.168.31.94#8083#DEFAULT#DEFAULT_GROUP@@springboot-provider","ip":"192.168.31.94","port":8083,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@springboot-provider","metadata":{"management.endpoints.web.base-path":"/actuator","management.port":"18082","preserved.register.source":"SPRING_CLOUD","management.address":"127.0.0.1"},"ipDeleteTimeout":30000,"instanceHeartBeatInterval":5000,"instanceHeartBeatTimeOut":15000}]
我们再看下 Nacos 的日志,这里看的文件 naming-server.log,日志如下:
2023-06-11 18:58:09,723 INFO Client connection 192.168.31.94:51885#true connect2023-06-11 18:58:10,105 INFO Client change for service Service{namespace="public", group="DEFAULT_GROUP", name="springboot-provider", ephemeral=true, revisinotallow=1}, 192.168.31.94:8083#true2023-06-11 18:58:18,204 INFO Client connection 192.168.31.94:60850#true disconnect, remove instances and subscribers
springboot-provider 启动成功后,从Nacos 管理后台可以看到下图:
服务下线后,Nacos 日志如下:
2023-06-11 19:01:03,375 INFO Client connection 192.168.31.94:51885#true disconnect, remove instances and subscribers2023-06-11 19:01:05,048 INFO [AUTO-DELETE-IP] service: Service{namespace="public", group="DEFAULT_GROUP", name="springboot-provider", ephemeral=true, revisinotallow=2}, ip: {"ip":"192.168.31.94","port":8083,"healthy":false,"cluster":"DEFAULT","extendDatum":{"management.endpoints.web.base-path":"/actuator","management.port":"18082","preserved.register.source":"SPRING_CLOUD","management.address":"127.0.0.1","customInstanceId":"192.168.31.94#8083#DEFAULT#DEFAULT_GROUP@@springboot-provider"},"lastHeartBeatTime":1686481231604,"metadataId":"192.168.31.94:8083:DEFAULT"}2023-06-11 19:01:05,048 INFO Client remove for service Service{namespace="public", group="DEFAULT_GROUP", name="springboot-provider", ephemeral=true, revisinotallow=2}, 192.168.31.94:8083#true2023-06-11 19:01:08,379 INFO Client connection 192.168.31.94:8083#true disconnect, remove instances and subscribers
在 springboot-consumer 上跑一个单元测试的用例,用 FeignClient 调用下面的方法:
@FeignClient(value = "springboot-provider", configuration = FeignMultipartSupportConfig.class)public interface FeignAsEurekaClient { @PostMapping("/employee/save") String saveEmployeebyName(@RequestBody Employee employee);}
日志如下:
2023-06-11 19:15:47,694 [main] [INFO] org.springframework.test.context.transaction.TransactionContext - Began transaction (1) for test context [DefaultTestContext@5bf0d49 testClass = TestFeignAsEurekaClient, testInstance = boot.service.TestFeignAsEurekaClient@10683d9d, testMethod = testPostEmployByFeign@TestFeignAsEurekaClient, testException = [null], mergedContextConfiguration = [WebMergedContextConfiguration@5b7a5baa testClass = TestFeignAsEurekaClient, locations = "{}", classes = "{class boot.Application, class boot.Application}", contextInitializerClasses = "[]", activeProfiles = "{}", propertySourceLocations = "{}", propertySourceProperties = "{org.springframework.boot.test.context.SpringBootTestCnotallow=true, server.port=0}", contextCustomizers = set[org.springframework.boot.test.context.filter.ExcludeFilterContextCustomizer@166fa74d, org.springframework.boot.test.json.DuplicateJsonObjectContextCustomizerFactory$DuplicateJsonObjectContextCustomizer@588df31b, org.springframework.boot.test.mock.mockito.MockitoContextCustomizer@0, org.springframework.boot.test.web.client.TestRestTemplateContextCustomizer@7fad8c79, org.springframework.boot.test.autoconfigure.properties.PropertyMappingContextCustomizer@0, org.springframework.boot.test.autoconfigure.web.servlet.WebDriverContextCustomizerFactory$Customizer@10b48321], resourceBasePath = "src/main/webapp", contextLoader = "org.springframework.boot.test.context.SpringBootContextLoader", parent = [null]], attributes = map["org.springframework.test.context.web.ServletTestExecutionListener.activateListener" -> false]]; transaction manager [org.springframework.jdbc.datasource.DataSourceTransactionManager@693676d]; rollback [true]2023-06-11 19:15:47,941 [main] [INFO] com.netflix.config.ChainedDynamicProperty - Flipping property: springboot-provider.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 21474836472023-06-11 19:15:47,962 [main] [INFO] com.netflix.loadbalancer.BaseLoadBalancer - Client: springboot-provider instantiated a LoadBalancer: DynamicServerListLoadBalancer:{NFLoadBalancer:name=springboot-provider,current list of Servers=[],Load balancer stats=Zone stats: {},Server stats: []}ServerList:null2023-06-11 19:15:47,969 [main] [INFO] com.netflix.loadbalancer.DynamicServerListLoadBalancer - Using serverListUpdater PollingServerListUpdater2023-06-11 19:15:48,064 [main] [INFO] com.netflix.config.ChainedDynamicProperty - Flipping property: springboot-provider.ribbon.ActiveConnectionsLimit to use NEXT property: niws.loadbalancer.availabilityFilteringRule.activeConnectionsLimit = 21474836472023-06-11 19:15:48,064 [main] [INFO] com.netflix.loadbalancer.DynamicServerListLoadBalancer - DynamicServerListLoadBalancer for client springboot-provider initialized: DynamicServerListLoadBalancer:{NFLoadBalancer:name=springboot-provider,current list of Servers=[192.168.31.94:8083],Load balancer stats=Zone stats: {unknown=[Zone:unknown; Instance count:1; Active connections count: 0; Circuit breaker tripped count: 0; Active connections per server: 0.0;]},Server stats: [[Server:192.168.31.94:8083; Zone:UNKNOWN; Total Requests:0; Successive connection failure:0; Total blackout seconds:0; Last connection made:Thu Jan 01 08:00:00 CST 1970; First connection made: Thu Jan 01 08:00:00 CST 1970; Active Connections:0; total failure count in last (1000) msecs:0; average resp time:0.0; 90 percentile resp time:0.0; 95 percentile resp time:0.0; min resp time:0.0; max resp time:0.0; stddev resp time:0.0]]}ServerList:com.alibaba.cloud.nacos.ribbon.NacosServerList@24d998ba
注意,这里使用了 OpenFeign,其中用到了 Ribbon 做负载均衡,那就需要考虑到 Ribbon 的刷新本地服务列表的时间,从源代码中看,刷新周期是 30s。如下图:
Ribbon 刷新缓存的逻辑参考下面代码:
public synchronized void start(final UpdateAction updateAction) { if (isActive.compareAndSet(false, true)) { final Runnable wrapperRunnable = new Runnable() { @Override public void run() { //... } }; scheduledFuture = getRefreshExecutor().scheduleWithFixedDelay( wrapperRunnable, initialDelayMs, refreshIntervalMs,//这里定义的是30s TimeUnit.MILLISECONDS ); }//...}
前面第一节提到过,优雅发布有两个要求:优雅上线和优雅下线。
Nacos 客户端和服务端的交互采用长轮询的方式,服务端收到客户端的请求后,首先会判断服务端本地的服务列表是否跟客户端的相比是否发生变化(比较 MD5),如果发生变化则立即通知客户端,否则放入长轮询队列挂起,如果这段时间内服务列表发生变化,则立刻通知客户端,否则等到超时后再通知客户端。代码如下:
//LongPollingService.javapublic void addLongPollingClient(HttpServletRequest req, HttpServletResponse rsp, Map clientMd5Map, int probeRequestSize) { String str = req.getHeader(LongPollingService.LONG_POLLING_HEADER); int delayTime = SwitchService.getSwitchInteger(SwitchService.FIXED_DELAY_TIME, 500); // Add delay time for LoadBalance, and one response is returned 500 ms in advance to avoid client timeout. long timeout = -1L; if (isFixedPolling()) { //... } else { timeout = Math.max(10000, Long.parseLong(str) - delayTime);//29.5s long start = System.currentTimeMillis(); List changedGroups = MD5Util.compareMd5(req, rsp, clientMd5Map); if (changedGroups.size() > 0) { //服务列表发生变化,直接返回给客户端 generateResponse(req, rsp, changedGroups); return; } //... } String ip = RequestUtil.getRemoteIp(req); //.. // Must be called by http thread, or send response. final AsyncContext asyncContext = req.startAsync(); // AsyncContext.setTimeout() is incorrect, Control by oneself asyncContext.setTimeout(0L); String appName = req.getHeader(RequestUtil.CLIENT_APPNAME_HEADER); String tag = req.getHeader("Vipserver-Tag"); //服务列表没有发生变化,放入长轮询队列等待调度 ConfigExecutor.executeLongPolling( new ClientLongPolling(asyncContext, clientMd5Map, ip, probeRequestSize, timeout, appName, tag));}
从上面服务端源代码可以看到,这里超时时间是 30s,其中 29.5s 用于挂起等待,0.5s 检查服务列表是否发生变化。这里使用了长轮询,如果服务端列表发生变化,会立刻通知客户端,所以对优雅发布影响非常小。
服务列表发生变化后,客户端用单独的线程通知监听的 listener,代码如下:
public void startInternal() { executor.schedule(() -> { while (!executor.isShutdown() && !executor.isTerminated()) { try { listenExecutebell.poll(5L, TimeUnit.SECONDS); //... executeConfigListen(); } catch (Throwable e) { //... } } }, 0L, TimeUnit.MILLISECONDS); }
优雅上线存在的问题主要在于 Service Provider 注册到 Nacos 后,服务还没有完成初始化,请求已经到来。这种情况主要原因是 Service Provider 启动后立刻注册 Naocs,而本身提供的接口可能还没有初始化完成。
这种情况的解决方法是关闭自动注册:
spring.cloud.nacos.discovery.registerEnabled=false
在服务初始化后使用代码手动注册,代码如下:
Properties setting8 = new Properties();String serverIp8 = "127.0.0.1:8848";setting8.put(PropertyKeyConst.SERVER_ADDR, serverIp8);setting8.put(PropertyKeyConst.USERNAME, "nacos");setting8.put(PropertyKeyConst.PASSWORD, "nacos");NamingService inaming8 = NacosFactory.createNamingService(setting7);inaming8.registerInstance("springboot-provider", "192.168.31.94", 8083);
服务下线分两种情况,一个是正常停服,一个是服务故障。
对于正常停服,Nacos 采用心跳检测来实现服务在线。心跳周期是 5s,Nacos Server 如果 15s 没收到心跳就会将实例设置为不健康,在 30s 没收到心跳才会讲这个服务删除。当然这个时间可以设置:
spring.cloud.nacos.discovery.metadata.preserved.heart.beat.interval=1000 #心跳间隔5s->1sspring.cloud.nacos.discovery.metadata.preserved.heart.beat.timeout=3000 #超时时间15s->3sspring.cloud.nacos.discovery.metadata.preserved.ip.delete.timeout=5000 #删除时间30s->5s
但这样并不能保证服务停止后能够立刻从 Nacos Server 下线,很有可能服务停止后还能再收到请求,最好的方式是手动下线,比如增加一个 API 接口,服务下线之前增加 preStopHook 函数调用这个 API 接口来实现下线。API 接口示例代码如下:
@GetMapping(value = "/nacos/deregisterInstance")public String deregisterInstance() { Properties prop = new Properties(); prop.setProperty("serverAddr", "localhost"); prop.put(PropertyKeyConst.NAMESPACE, "test"); NacosNamingService client = new NacosNamingService(prop); client.deregisterInstance("springboot-provider", "192.168.31.94", 8083); return "success";}
在使用 Ribbon 的场景,也需要考虑 Ribbon 更新本地缓存服务列表的机制,手动下线后,可以再等待 30s 再关闭服务。
第二种情况是服务故障,但是并没有停服,这种情况是很难避免外部请求再发送过来的。处理方式是对这个服务本身的健康检查结果进行处理,比如连续三次健康检查失败,可以调用上面的 API 接口让服务下线。
无论是哪一款注册中心,优雅发布要解决的问题都是优雅上线和优雅下线。本文结合 Nacos 的原理讲解了 Nacos 的优雅发布,希望对你有所帮助。
标签:
仓储物流“成渝圈”如何乘势而上? 12月3日,连接昆明和万象的中老铁路全线开通运营,被惠及的显...
两件西周青铜簋时隔三千年成功配对 考古工作者介绍,这个铜簋的盖、身分别时隔40余年出土,纹饰...
“医保砍价”不是一个人在战斗 晁星 “我眼泪都快掉下来了”“每一个小群体都不该被放弃”…...
“购物成瘾”真的是一种病 刘艳 牛雅娟 本周日即将迎来“双十二”促销季,很多人又开始摩拳...
因迷恋山间风景,一男子在甘孜州稻城县海拔4000多米的无人区迷失方向,随后与同伴失联。12月的稻城...
嫌疑人DNA信息比中后,成都市公安局刑侦支队技术处DNA实验室民警白小刚一下坐在凳子上,恍惚迟疑间...
一批反映南京大屠杀历史的新书发布 新华社南京12月7日电(记者邱冰清、蒋芳)“以史为鉴,开创未来...
我在现场·照片背后的故事|电影《亲爱的》里面没有的结局,在我眼前“上映” 12月6日,在深圳市...
冥想?泡脚?不如听听助眠音乐 晚上睡不着,白天睡不醒,成为最贴合都市人群的“睡眠画像”。随...
养老话题 老年教育面临缺口 “终身教育”潜力无限 【现实挑战】“新老年”群体愿意在培养兴...
孙海洋被拐14年儿子如何找到的? 警方侦办另一宗拐骗儿童案时发现线索,通过人像比对、DNA确认找...
北京天文馆、圆明园将对未成年人免费开放 12月6日,北京天文馆发布通知称,12月8日起试行对未成...
今年全国粮食总产量再创新高 连续7年保持在1 3万亿斤以上 根据对全国31个省(区、市)的抽样调...
斑块软的很危险 硬的就无碍? 血管里的“垃圾”分类 赶快学起来! 一项最新研究显示:中国...
诺西那生钠注射液大幅降价 聚焦医保谈判背后脊髓性肌萎缩症家庭 医保目录公布那天 好多家长都...
抖音“窗花剪剪”遭抄袭 被判获赔20万元 法院认为“窗花剪剪”的这种表达方式理应受到《著作权...
公安机关近日侦破3起拐卖儿童案件 失散十几年 3组家庭终于团圆了 北京青年报记者12月6日从公...
2021年度十大网络用语发布 本报讯(记者 路艳霞)作为年度“汉语盘点”活动最具网络特色的组成部...
北京天文馆向未成年人免费开放 本报讯(记者 牛伟坤)北京天文馆对票价免费及优惠政策作出调整:1...
2021北京百个网红打卡地发布 本报讯(记者 李洋)2021北京网红打卡地推荐榜单昨晚正式发布。自然...