Saturday, 14 March 2015

Cloud messaging with Camel and IronMQ

About two years ago I was looking for a way to do messaging between our internal integration infrastructure and our Drupal CMS, which runs on AWS. Preferably the messaging solution should be SaaS.

Since we are using Camel for integration, an obvious choice would be AWS SQS with the Camel SQS component. However, message size limits and other missing QoS features that you are used to from message brokers made me google around for other options.

I found IronMQ to be a good match since it had some nice features compared to SQS, especially FIFO and push queues.

Since Camel and the community around it are awesome, and it's fairly easy to create and share your own components, I created a camel-ironmq component so IronMQ could be part of the Camel route DSL, like this simple file copy route shows.

About half a year ago another use case came up. We had to post time-critical events to business partners so they could start and stop recording live video broadcast streams. It should be easy to add or remove authenticated partners as the need arises.
The IronMQ push queue feature seemed a good fit, since it lets you fan out the messages you send to an IronMQ queue to a configurable number of HTTP/webhook URLs.
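
The fan-out idea can be sketched in plain Java. This is only an in-memory illustration, not the IronMQ client API: the class name, the method names and the partner URLs are all hypothetical, and a real push queue would issue an HTTP POST where the sketch appends to a log.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// In-memory stand-in for a push queue; not the IronMQ client API.
final class PushQueueSketch {
    private final Set<String> subscribers = new LinkedHashSet<>();
    private final List<String> deliveryLog = new ArrayList<>();

    void addSubscriber(String url) {
        subscribers.add(url);
    }

    void removeSubscriber(String url) {
        subscribers.remove(url);
    }

    // Fan out: each current subscriber receives its own copy of the message.
    // A real push queue would issue an HTTP POST per subscriber here.
    int publish(String message) {
        for (String url : subscribers) {
            deliveryLog.add(url + " <- " + message);
        }
        return subscribers.size();
    }

    List<String> deliveryLog() {
        return Collections.unmodifiableList(deliveryLog);
    }

    public static void main(String[] args) {
        PushQueueSketch queue = new PushQueueSketch();
        queue.addSubscriber("https://partner-a.example/events"); // hypothetical URL
        queue.addSubscriber("https://partner-b.example/events"); // hypothetical URL
        queue.publish("START_RECORDING");
        queue.removeSubscriber("https://partner-b.example/events");
        queue.publish("STOP_RECORDING");
        for (String line : queue.deliveryLog()) {
            System.out.println(line);
        }
    }
}
```

The point of the design is that the sender publishes once and never needs to know who is listening; adding or removing a partner only changes the subscriber list.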

In the meantime IronMQ had moved on somewhat, and they are working on a new V3 release with new features and better performance.
Some API changes had been introduced, so I had to brush up the component and took the opportunity to add some missing features.
That's done on the v3 branch, which also adds longPolling, batchDelete and concurrentConsumer support.
Christian Posta was kind enough to make the concurrentConsumer support easy in the latest Camel 2.15.0 release, as he blogged about here.

Example of v3 route:
The tests I have done show that you are able to consume 2500 msg/sec with 20 concurrent consumers and batchDelete turned on.
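
For a sense of scale, 2500 msg/sec spread over 20 consumers is about 125 msg/sec per consumer on average. The sketch below illustrates the general idea of combining concurrent consumers with batched deletes using only the JDK. It is not the camel-ironmq implementation: the in-memory queue, the class name and the counters are assumptions made for illustration, and a "delete" is just a counter increment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory illustration only; a real consumer would call the queue service.
final class BatchDeleteSketch {

    // Returns {messages deleted, delete calls made}.
    static int[] consume(int messageCount, int consumers, int batchSize) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < messageCount; i++) {
            queue.add(i);
        }
        AtomicInteger deletedMessages = new AtomicInteger();
        AtomicInteger deleteCalls = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(consumers);
        for (int c = 0; c < consumers; c++) {
            pool.submit(() -> {
                List<Integer> batch = new ArrayList<>();
                // Each poll takes up to batchSize messages in one go.
                while (queue.drainTo(batch, batchSize) > 0) {
                    // ... process the messages in the batch here ...
                    // One delete call acknowledges the whole batch.
                    deleteCalls.incrementAndGet();
                    deletedMessages.addAndGet(batch.size());
                    batch.clear();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("consumers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new int[] {deletedMessages.get(), deleteCalls.get()};
    }

    public static void main(String[] args) {
        int[] result = consume(1000, 20, 100);
        System.out.println(result[0] + " messages deleted in " + result[1] + " delete calls");
    }
}
```

Deleting per batch instead of per message cuts the number of round trips to the queue service, which is presumably where much of the throughput gain comes from.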

If you want to try out the camel-ironmq v3 component, you have to compile IronMQ 3.0.2-SNAPSHOT yourself, since not all fixes have been released yet, and then upgrade to that version in the camel-ironmq pom.

Currently I think only IronMQ v2 is publicly available for developer testing. For v3 you need a paid account.

Tuesday, 17 December 2013

Track and trace with Camel and Splunk

We have been using Camel for system integration for some years now, and are very pleased with the flexibility it has provided us.
Some time ago I built a Camel component for Splunk, which is now on Camel master and will be released with the coming 2.13.0 release, so I think a blog post about how it came about is in order.

The big picture

We are running a JMS hub-and-spoke architecture with a central message hub consisting of topics and queues.

On each side of the hub we have Camel routes that act as integration points to other systems.
Some routes act as consumers that collect data from provider systems. These integrations are typically pull based, e.g. databases of various flavors, file, FTP, S3 or JMS.
The data collected is transformed to a common format (XML) and published to a queue or topic on the hub.

On the other side of the hub we have Camel routes that act as event providers. These routes consume the messages from the hub, transform them to a target-system-specific format, and send them on to the destination system using a variety of protocols such as SOAP, HTTP, database tables, stored procedures, JMS, file, FTP and raw socket.

All in all we are very pleased with this architecture since it has provided us with the desired flexibility and robustness.

Tracing messages

Early in the process we discovered that integration can often be something of a black box, and that you have to think about insight and traceability from the start. We needed insight into what was going on when we routed messages between systems, and we also needed to keep a history of the message flow.
Therefore every integration adapter publishes audits (a copy of the original payload received or published), with some additional metadata about the integration in the header, to an audit queue on the hub.

Example of a route with a custom Camel component that creates an audit trail.

Audit trail adapter

The audit trail adapter consumes the messages from the audit queue (AUDIT_HUB) and stores the message payloads in a database, where they are kept for short-term storage. This is usually enough to answer questions like "what happened yesterday between 9 and 10 on integration x, what did the message contain, and did the target system receive the message".
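
To make that kind of question concrete, here is a minimal Java sketch of the lookup. It is an assumption-laden illustration rather than our actual adapter: the records live in an in-memory list instead of a database, and the AuditRecord fields (integration, at, delivered, payload) are hypothetical names for the payload copy and its metadata.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AuditQuerySketch {

    // Hypothetical shape of one stored audit: payload copy plus metadata.
    static final class AuditRecord {
        final String integration;
        final Instant at;
        final boolean delivered;
        final String payload;

        AuditRecord(String integration, Instant at, boolean delivered, String payload) {
            this.integration = integration;
            this.at = at;
            this.delivered = delivered;
            this.payload = payload;
        }
    }

    // Returns the audits for one integration inside the half-open window [from, to).
    static List<AuditRecord> find(List<AuditRecord> records, String integration,
                                  Instant from, Instant to) {
        List<AuditRecord> hits = new ArrayList<>();
        for (AuditRecord r : records) {
            boolean inWindow = !r.at.isBefore(from) && r.at.isBefore(to);
            if (r.integration.equals(integration) && inWindow) {
                hits.add(r);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<AuditRecord> records = Arrays.asList(
            new AuditRecord("integration-x", Instant.parse("2013-12-16T09:15:00Z"), true, "<event id=\"1\"/>"),
            new AuditRecord("integration-x", Instant.parse("2013-12-16T11:00:00Z"), false, "<event id=\"2\"/>"),
            new AuditRecord("integration-y", Instant.parse("2013-12-16T09:30:00Z"), true, "<event id=\"3\"/>"));

        List<AuditRecord> hits = find(records, "integration-x",
            Instant.parse("2013-12-16T09:00:00Z"), Instant.parse("2013-12-16T10:00:00Z"));
        for (AuditRecord r : hits) {
            System.out.println(r.at + " delivered=" + r.delivered + " payload=" + r.payload);
        }
    }
}
```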

There is also an Angular app that makes it possible for users to search and view events passing through the integration platform.

This has made it possible to gain fine-grained insight over a short period of time, but for a more holistic and proactive approach something else was needed.


That was when I stumbled upon Splunk. It has a lot of features to ingest data of any kind, awesome search features on big data, alerting, and a really easy way to build dashboards with real time data if needed.
To get data into Splunk I created the Splunk component, and with that in place it's kind of easy to get data into Splunk as this example illustrates.

With data in a Splunk index the fun begins!
Now we can use Splunk Web to search the data and to build a dashboard with panels for a display in our office, both for insight and for proactive alerting.
First up is the data that we have ingested into the audit-trail index. We want to display a real-time graph of events flowing through the platform for the most active adapters.
This is done in Splunk Web by creating a search query and, when happy with the result, choosing a way to visualise it. The end result is a panel in XML format:

Panels can be combined to build a dashboard page like this one.

Splunk lets you install different apps from a Splunk "App Store". The middle two panels are built using the Splunk JMX app. With this app you can connect to running JVMs and ingest JMX data into a Splunk index.
The app has a configuration file where you can configure which MBeans and attributes should be ingested.
Since Camel exposes a lot of JMX stats, you can even ingest those into Splunk, as this sample config snippet illustrates.

The Camel stats can then be used in Splunk for ad hoc reporting, dashboards and alerting as you need it.

Search and view Camel JMX attributes in Splunk

We had cases where event processing took too long, which mattered because we were dealing with the recording of live streams. With the Camel stats we could build a report that showed the integrations involved (routes) and the times at which processing was slow. With this information at hand it's easier to drill down and decide where to fix a problem.
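
The logic behind such a report is simple enough to sketch in Java. In practice this is done with a Splunk search over the ingested JMX data; the code below only shows the grouping idea, and the Sample fields, the route names and the threshold are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SlowRouteReportSketch {

    // Hypothetical processing-time sample for one route.
    static final class Sample {
        final String routeId;
        final int hourOfDay;
        final long processingMillis;

        Sample(String routeId, int hourOfDay, long processingMillis) {
            this.routeId = routeId;
            this.hourOfDay = hourOfDay;
            this.processingMillis = processingMillis;
        }
    }

    // Maps "routeId@hour" to the worst processing time seen above the threshold.
    static Map<String, Long> slowest(List<Sample> samples, long thresholdMillis) {
        Map<String, Long> worst = new TreeMap<>();
        for (Sample s : samples) {
            if (s.processingMillis > thresholdMillis) {
                String key = s.routeId + "@" + s.hourOfDay + "h";
                worst.merge(key, s.processingMillis, Math::max);
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        List<Sample> samples = Arrays.asList(
            new Sample("recording-start", 20, 3500),
            new Sample("recording-start", 20, 4200),
            new Sample("recording-stop", 21, 150),
            new Sample("recording-stop", 22, 1800));
        for (Map.Entry<String, Long> e : slowest(samples, 1000).entrySet()) {
            System.out.println(e.getKey() + " worst=" + e.getValue() + " ms");
        }
    }
}
```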

The final piece of the dashboard should be a platform health indicator (Ok, Warn and Error).
Since our integration platform already has a REST endpoint from which our monitoring system collects status information, we can use that to ingest status data into Splunk.

For that we installed another Splunk app, rest_ta. The app calls the REST endpoint and ingests status information into a surveillance-status index.
The dashboard panel uses a range component from Splunk to indicate the status:

Final dashboard on our office wall, with the health status at the top.
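
The range idea behind the health indicator can be sketched in a few lines of Java. This is not how Splunk implements it: the sketch assumes the status arrives as a count of failing checks, and the thresholds are made up for illustration.

```java
final class HealthRangeSketch {

    enum Status { OK, WARN, ERROR }

    // Maps a numeric value onto a status by range.
    // Assumed ranges: 0 -> OK, 1..2 -> WARN, 3 or more -> ERROR.
    static Status classify(int failingChecks) {
        if (failingChecks <= 0) {
            return Status.OK;
        }
        if (failingChecks <= 2) {
            return Status.WARN;
        }
        return Status.ERROR;
    }

    public static void main(String[] args) {
        for (int failing = 0; failing <= 3; failing++) {
            System.out.println(failing + " failing checks: " + classify(failing));
        }
    }
}
```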

I nearly forgot to mention that we also created alerts for when certain events happen, e.g. when no data is ingested on a given integration.
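
A "no data" alert boils down to a staleness check. The Java sketch below shows the idea under stated assumptions: the last-event timestamps are held in a plain map, and the integration names and the 30-minute limit are hypothetical. In Splunk the same check would be expressed as a scheduled search with an alert condition.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NoDataAlertSketch {

    // Returns the integrations whose last event is older than maxSilence, sorted by name.
    static List<String> silentIntegrations(Map<String, Instant> lastEventByIntegration,
                                           Instant now, Duration maxSilence) {
        List<String> silent = new ArrayList<>();
        for (Map.Entry<String, Instant> e : lastEventByIntegration.entrySet()) {
            Duration sinceLastEvent = Duration.between(e.getValue(), now);
            if (sinceLastEvent.compareTo(maxSilence) > 0) {
                silent.add(e.getKey());
            }
        }
        Collections.sort(silent);
        return silent;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2013-12-17T12:00:00Z");
        Map<String, Instant> lastEvents = new HashMap<>();
        lastEvents.put("integration-x", Instant.parse("2013-12-17T11:55:00Z")); // hypothetical
        lastEvents.put("integration-y", Instant.parse("2013-12-17T10:00:00Z")); // hypothetical
        System.out.println("Silent: " + silentIntegrations(lastEvents, now, Duration.ofMinutes(30)));
    }
}
```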

My final words on Splunk would be that it's a Swiss Army knife for analyzing, understanding and using data, and that I'm only starting to get a grip on the possibilities because there are so many.

If you want to try out the Camel and Splunk combo, there is a small Twitter sample hosted at my GitHub repo.