It is good practice to monitor your service and check whether it is available and performing as expected. To do this, we first need to define what service health means. In this article I will present two different definitions, but keep in mind that you can have your own, project-specific one. All examples are prepared in Mule ESB 4.1. If you are familiar with Spring Boot Actuator, you should see some similarities in the interface. I decided to follow the Spring approach because it is clear and easy to read.

Service’s health

To monitor your services efficiently, you should choose a set of service health conditions. It may be a universal list, or it may be tailored to your project’s specification. Here are a couple of ideas that you can use:

  • Has the service started?
  • Does the service have a reachable endpoint?
  • Has the runtime created an anchor file?
  • Has the service successfully established a connection with another system via HTTP?
  • Has the service established a connection with another system within a given threshold?
  • etc.

As you can see, this is just the beginning, and the list can be extended as needed. Keep in mind that a health check should be quick and simple; an overly complex check becomes difficult to maintain. I decided to take two approaches. The first one focuses entirely on the question of whether my service has started. The second one is more sophisticated: I want to see whether my service has established a connection within an acceptable threshold.

Anchor file

Health check condition: has our service been deployed?

There are a couple of ways to check whether a service is running in Mule ESB. First of all, we may look in mule-ee.log. After the Mule runtime starts, the log file should contain a table with the start-up status of each application, like in the screenshot below. We can tell that the health-check application from the default domain has been DEPLOYED. Mule sets the status to FAILED in case of any error.

Deployment status in mule-ee.log

The Mule runtime creates the file [application name]-anchor.txt when the service deploys correctly. Note that the txt extension is used on both Windows and Linux systems. In this scenario we need to check that the file exists within the apps directory. Using the previous example, I would look for health-check-anchor.txt. If my monitoring tool does not find this file, I should receive an alert that something went wrong.
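A monitoring script could perform this check along the following lines. This is only a sketch in Python, not part of the Mule application: the `is_deployed` helper and the `/opt/mule/apps` path are my assumptions for illustration, while the file name follows the [application name]-anchor.txt convention described above.

```python
from pathlib import Path


def is_deployed(apps_dir, app_name):
    """Return True when the anchor file for app_name exists in apps_dir."""
    anchor = Path(apps_dir) / f"{app_name}-anchor.txt"
    return anchor.is_file()


if __name__ == "__main__":
    # Hypothetical location; point it at the apps directory of your runtime.
    apps_dir = "/opt/mule/apps"
    if not is_deployed(apps_dir, "health-check"):
        print("ALERT: health-check-anchor.txt not found")
```

A real monitoring tool would raise an alert through its own channel instead of printing a message.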

Health endpoint

Spring Boot Actuator

While implementing microservices with Spring Boot, I encountered the Spring Boot Actuator library. This library enables a couple of simple endpoints. The most important for me were /health and /info. The first one, shown below, allowed me to check my application’s status easily. As you can see, although configService and hystrix are marked as UP, the overall status is DOWN. This means that some other condition did not evaluate correctly.

Spring Boot Actuator service’s health

Simple health check

Health check condition: Has our service been deployed? Is the service running?

How can we achieve that scenario? Mule does not have anything like a health endpoint that allows us to check whether a service is running or not. I think that the easiest way is to enable an HTTP listener on a specific URI like /health. Under this address we should receive clear status information. As in the diagram below, this can be as simple as always returning status UP with a 200 HTTP status code.

Service’s simple status

If I am not able to reach the /health endpoint, I know immediately that something is wrong with my service. On the other hand, if I receive any response, I am happy to mark my service as running and working as expected. Let’s see something more complex.

Complex health check

Health check condition: Has our service been deployed? Is the service running? Has the service established a connection with an external system within the defined timeout threshold?

Compared with the previous simple health check, here we have higher expectations of our service. We expect that the service can connect to an external system over HTTP, or query a DB using a simple select statement. What is more, we may require some timeout threshold to be met. The diagram below depicts a simple process.

Service’s complex health check

In the presented example we perform three different checks in parallel: two external HTTP calls and one DB call. For each call we perform a custom status verification. For an HTTP call this could be a check whether a 200 or 201 HTTP status code has been returned. After all steps have been performed, we compute the overall service status. Usually, if one of the calls is marked as DOWN, the service status is also reported as DOWN. The most complex parts here are “Verify status” and “Compute status”. In these two actions you can put as much custom logic as you need.
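To make the “Compute status” rule concrete, here is a minimal sketch in Python rather than in Mule. The function names `compute_status` and `run_health_checks` are hypothetical, and each check is assumed to be a callable that returns a dictionary with at least a `status` key. The thread pool only stands in for the parallel execution shown in the diagram.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_status(details):
    """Overall status is DOWN as soon as any single check reports DOWN."""
    return "DOWN" if any(d["status"] == "DOWN" for d in details) else "UP"


def run_health_checks(checks):
    """Run all checks concurrently and aggregate their results."""
    with ThreadPoolExecutor(max_workers=max(len(checks), 1)) as pool:
        # map() yields results in the order the checks were passed in.
        details = list(pool.map(lambda check: check(), checks))
    return {"status": compute_status(details), "details": details}
```

The “Verify status” logic lives inside each individual check, so the aggregation step stays the same no matter how many checks you add.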

HTTP status code

If you decide to expose the service’s status using a REST endpoint, you should also consider changing the returned HTTP status. It is good practice to return code 200 for status UP and 503 for status DOWN. Why? 200 means OK, and I reckon that a DOWN status is definitely not OK. Most importantly, client code will notice that a 5xx code occurred, which signals an exceptional situation that requires some action.

Service unavailable (source http.cat)
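The mapping between health status and HTTP status code is small enough to show in a few lines. The sketch below is in Python and uses hypothetical helper names (`http_status_for`, `requires_action`); it only illustrates the 200 versus 503 convention described above, not the Mule flow itself.

```python
from http import HTTPStatus


def http_status_for(status):
    """Pick the HTTP status code for the /health response."""
    return HTTPStatus.OK if status == "UP" else HTTPStatus.SERVICE_UNAVAILABLE


def requires_action(status_code):
    """Client side: any 5xx code signals an exceptional situation."""
    return 500 <= status_code <= 599
```

With this convention a client does not even need to parse the JSON body to learn that the service is unhealthy.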

Implementation

After this brief introduction to service health status, it is time to see an implementation in Mule ESB. I have prepared one application that exposes a /health endpoint. This endpoint only accepts GET requests and returns content in JSON.

Simple scenario

The first and easiest option is to always return the UP status. As you can see, we perform this in three steps. We could do it in only one step, but I decided to have a more generic flow. As a consequence, only the first message processor will change. More about this in the next section.

Default, always UP, status

What this flow actually does is set the status to successful. After calling GET /health we should always receive:

{
 "status": "UP"
}

This solution is fairly simple, but it may meet your needs. If you have more sophisticated requirements, like checking whether a connection has been established or whether a response arrives within a specified time limit, go to the next section.

Verifying connection

The flow health-status-flow is far more complex. First of all, we have a Scatter-Gather that calls two private flows concurrently. The next two steps are similar to what you have already seen: computing the status and preparing the final response.

Complex health status flow

I am expecting a structure like in the example below:

{
  "status": "DOWN",
  "details": [
    {
      "serviceType": "http",
      "status": "DOWN",
      "errorCode": "THRESHOLD BREACHED",
      "statusCode": 200
    },
    {
      "serviceType": "db",
      "status": "DOWN",
      "errorCode": "CONNECTIVITY"
    }
  ]
}

In comparison to the previous example, now I have a details array. Each item is a specific health check. In this particular example:

  • getting the response took longer than expected
  • the connection to the database did not work due to connectivity issues.

As a result overall status is DOWN.

Connecting to HTTP endpoint

 

The flow that checks health performs a request and then computes the status. The logic is fairly simple: if the HTTP response status code is 200, then the service’s status is UP. By default, Mule ESB throws an exception for status codes greater than or equal to 400. We need to suppress this behavior. To treat any status code as a success, we need to configure the HTTP Request’s response validator as below:

<http:response-validator>
  <http:success-status-code-validator values="100..599" />
</http:response-validator>

Why did I decide on the range from 100 to 599? Because this is the standard range of HTTP status codes, and I should not receive anything outside it.

If you are not up to date with the newer match and if syntax in DataWeave, you may find the article DataWeave – Tip #1 useful. To keep it short, the following transformation sets the status variable. The DataWeave engine adds the errorCode and statusCode properties when the status equals “DOWN”.


%dw 2.0
output application/java
---
{
  serviceHealth: {
    serviceType: "http",
    (using (
      status = if (vars.service.statusCode == 200) "UP" else "DOWN"
    ) {
      status: status,
      // errorCode and statusCode are added only when the status is DOWN
      (status match {
        case "DOWN" -> {
          errorCode: vars.service.reasonPhrase,
          statusCode: vars.service.statusCode
        }
        else -> {}
      })
    })
  }
}

Timeout threshold

We may also extend the conditions and expect to receive a response within a specified time. Both conditions must be fulfilled to consider the service as running:

  • HTTP response status code is 200
  • Connection time is less than the defined threshold (if a threshold is specified)

If the threshold is breached, I would like to provide an error code. Here is an excerpt from the transformation:

...

errorCode: vars.service.reasonPhrase match {
 case met if thresholdMet -> $
 else -> "THRESHOLD BREACHED"
 },

...
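Since the excerpt above shows only a fragment, the combined rule may be easier to follow when written out in full. The following is a sketch in Python, not the DataWeave transformation from the application. The function name `http_service_health` and the parameters `elapsed_ms` and `threshold_ms` are assumptions made for illustration.

```python
def http_service_health(status_code, reason_phrase, elapsed_ms, threshold_ms=None):
    """Build the health entry for one HTTP dependency."""
    status_ok = status_code == 200
    # The threshold condition only applies when a threshold is configured.
    threshold_met = threshold_ms is None or elapsed_ms < threshold_ms

    health = {"serviceType": "http"}
    if status_ok and threshold_met:
        health["status"] = "UP"
        return health

    health["status"] = "DOWN"
    # Same rule as the excerpt: reason phrase unless the threshold was breached.
    health["errorCode"] = reason_phrase if threshold_met else "THRESHOLD BREACHED"
    health["statusCode"] = status_code
    return health
```

Note that a response with status code 200 that arrives too late is still reported as DOWN, which matches the first item of the details array shown earlier.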

Connecting to DB

How can we check database health? In Mule ESB we need to use a Try scope to handle all exceptions that can occur during the call to the database. We can use On Error Continue to continue our flow. Then, in Transform Message, we check whether we received any error during the call and set the status appropriately.

Database health check
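The same try, continue on error, then set status pattern can be sketched outside Mule. The Python example below is an illustration only: `db_health` is a hypothetical helper, `connect` is assumed to be any callable that returns a DB-API connection, and sqlite3 is used merely so that the snippet runs without an external database. The CONNECTIVITY error code is likewise a placeholder value.

```python
import sqlite3


def db_health(connect):
    """Run a trivial query; any exception marks the database as DOWN."""
    try:
        connection = connect()
        try:
            cursor = connection.cursor()
            cursor.execute("SELECT 1")
            cursor.fetchone()
        finally:
            connection.close()
    except Exception:
        # Counterpart of On Error Continue: swallow the error and report it.
        return {"serviceType": "db", "status": "DOWN", "errorCode": "CONNECTIVITY"}
    return {"serviceType": "db", "status": "UP"}


if __name__ == "__main__":
    print(db_health(lambda: sqlite3.connect(":memory:")))
```

Catching every exception is deliberate here, because a health check should report a failure rather than fail itself.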

Source Code

The source code is available on GitHub.

Summary

To check whether a Mule ESB service has been deployed correctly, we can use anchor files. In advanced scenarios, where the conditions are much more complex, it is worth exposing a /health endpoint that reports the service’s status. We can define thresholds, perform simple calls to a DB, and so on. It is totally up to you and your requirements. Bear in mind that the checks should not be too complex, as they may become cumbersome to maintain.

If you find this article interesting please share it :). 

Is my service healthy?