ZeroMQ & Node.js Tutorial - Cracking JWT Tokens (Part 1.)

This article teaches you how to build a distributed application with ZeroMQ and Node.js by developing an exciting sample project: a brute-force cracker for JWT tokens.

This is a two-part story - this first post will focus on theory, and the second one is about coding. You’ll get to know ZeroMQ, how JWT tokens work and how our application can crack some of them! Be aware, that the application will be intentionally simple. I only want to demonstrate how we can leverage some specific patterns.

What is ZeroMQ

ZeroMQ (also known as ØMQ, 0MQ, or zmq) is an open source embeddable networking library and a concurrency framework built in C++. It is available for many platforms and programming languages (including Node.js). The goal of ZeroMQ is providing developers with a foundation of network utilities that can be easily used across systems with heterogeneous architectures. ZeroMQ provides sockets that can carry atomic messages across various transport layers like in-process, inter-process, TCP, and multicast.

And in case you are wondering why it is called "Zero"...

The Ø in ZeroMQ is all about tradeoffs. On the one hand this strange name lowers ZeroMQ's visibility on Google and Twitter. On the other hand it annoys the heck out of some Danish folk who write us things like "ØMG røtfl", and "Ø is not a funny looking zero!" and "Rødgrød med fløde!", which is apparently an insult that means "may your neighbours be the direct descendants of Grendel!" Seems like a fair trade.

For more info, you can read the The ZeroMQ official guide.

Building a JWT token cracker with ZeroMQ and Node.js

In the course of this article, we are going to build a functional distributed application: a JWT token cracker.

If you know what JWT tokens are and how they work feel free to skip this section, otherwise you are going to need a bit of theory here...

JSON Web Token (JWT) is an open standard (RFC 7519) that defines a compact and self-contained way for securely transmitting information between parties as a JSON object. This information can be verified and trusted because it is digitally signed. JWTs can be signed using a secret (with the HMAC algorithm) or a public/private key pair using RSA.

If you need more, read the introduction to JWT page.

JWT is often used as a mechanism to enforce authentication and authorization in websites and APIs, so being able to "crack" one of these tokens might mean gaining access to sensitive information or being able to impersonate a particular user on a given system.

But what do we really mean with "cracking" a JWT token?

In order to really understand this question we need to know how a JWT token is composed.

A typical JWT token is a string composed by 3 parts (separated by a "."): the header, the payload and the signature.

To have a visual cue about how it looks like, take the following token as an example:


The header

The header, also called JOSE header (JSON Object Signing and Encryption header), contains some metadata that describes which algorithm is used for signature and/or encryption. If we use base64 to decode the header in our example token we will get the following JSON string (properly beautified for your convenience):

  "alg": "HS256",
  "typ": "JWT"

The most common algorithms available are HS256 (HMAC signature) and RS256 (RSA public/private key pair signature).

In our application we will focus on cracking only HS256-based tokens.

The payload

The payload is the most important part of the token, because it is the one that actually contains the information exchanged between the parties.

In our example, the decoded payload (using base64) is the following JSON string:

  "sub": "1234567890",
  "name": "John Doe",
  "admin": true

The payload can contain virtually any kind of data that can be serialized to a JSON string. In this case it's fairly obvious that the token is used to exchange the information about the user that is currently logged in.

This should ring a bell (a malicious one). What if we could alter the payload of this token at our convenience? Yes, in this particular use case we might be able to impersonate another user or to obtain access to resources that might be restricted to our regular user.

The signature

Of course JWT has a mechanism to avoid people to easily forge their own tokens: the signature.

The signature, which is the third and last part of the token, can be (in theory) generated only by the token issuer authority, for example by an authentication server.

Every time the issuer needs to verify the authenticity of a previously generated JWT token, it simply calculates again the signature for the given header and payload. If it matches the original signature contained in the token, it can safely assume that the token is authentic and not maliciously forged.

As we said, we can have different signature algorithms. In case of HS256 the algorithm to calculate the signature is the following:

  base64UrlEncode(header) + "." + base64UrlEncode(payload),

As you can see the function HMACSHA256 is used to generate a hash-based signature. This function accepts two arguments: a string consisting of the encoded header and payload separated by a dot and a password (also known as secret).

So the password is what actually protects tokens from being forged, and it must be accessible only to the issuer authority. If the password is disclosed, a malicious attacker will be able to forge a token with an authentic signature and the issuer authority will not be able to distinguish forged tokens from authentic ones anymore.

Our application will use a brute force approach to try to find out the password. Given a specific token, it will be able to try any possible combination of characters over a specific alphabet and check if the resulting string is the valid secret for the token signature. If we are successful we can then use the discovered password to sign tokens that contains information that we can alter at our own will.

Are JWT tokens safe to use?

That's probably what you are asking yourself right now...

My personal answer to this question is "definitely YES"!

The weakness that we are trying to exploit here is the same that every password-based system has: passwords can be guessed or be subjected to brute force attacks!

So it's your responsibility to choose strong passwords in order to protect the signature of your JWT tokens from common attacks like brute force (the one we are going to use here) or dictionary attacks.

Also, if you need an increased level of security and having longer tokens is not an issue, you can switch to the RS256 signature algorithm.

There are also other techniques that you can adopt:

  • Store all the generated tokens in a database so that if a token signature is verified you can also check if it was really generated by the issuer.
  • Add a level of encryption over the full token string (which will even hide the fact that the original token is in JWT format).

These techniques are not really necessary though, and even if they might increase the security of your application they will add extra layers of complexity. In most of the cases, choosing a long random password over a big alphabet (e.g. including lowercase, uppercase, digits and symbols) should be enough to make your tokens virtually "uncrackable".

Finally, we need to take into consideration that a brute force attack is the least performant attack that we can make, and it might take years, even centuries to disclose a very strong password, even using a large cluster of performant machines working in parallel.

The approach to the problem

Our JWT token cracker application will consist of two parts: a server and a client.

The goal of the server is to collect the information needed to perform the computation and then distribute and coordinate the workload between the clients.

The server will be initialized with two parameters:
- A sample well formatted JWT token from a given issuer authority, - An alphabet of characters to use to generate all the possible variations of strings.

The space of the possible solutions is the infinite space of all the strings (of any length) that can be generated within the given alphabet. In short, the role of the server is to divide this space into chunks and assign them to the clients, making sure that every client gets a different chunk.

The server doesn't know how to crack the token (which is the goal of the client), it just knows how to distribute the chunks. To understand how the chunks are managed we need to clarify how the space of solutions can be represented.

Let's do this with an example.

If we take an alphabet containing the characters a, b, c and 1 we can generate the following strings:

(empty string), a, b, c, 1, aa, ab, ac, a1, ba, bb, bc, b1, ca, cb, cc, c1, 1a,
1b, 1c, 11, aaa, aab, aac, aa1, aba, ...  

As you might have noticed, there is an implicit order in the way we listed these strings on the given alphabet.

If we keep progressing with the iteration, it will be endless but we can be sure we are not going to miss any possible string over the chosen alphabet. In other words we can enumerate the possible solutions. If we start from 0 our enumeration will look like:

 0 ''
 1 'a'
 2 'b'
 3 'c'
 4 '1'
 5 'aa'
 6 'ab'
 7 'ac'
 8 'a1'
 9 'ba'
10 'bb'  

The enumeration associates univocally a non-negative integer to one and only one possible solution over the alphabet.
With this approach we can form a one-to-one relationship between the space of the non-negative integers to the space of strings built over the given alphabet.

This approach makes tracking the distributed workload relatively simple for the server, because a chunk of the solutions space can be represented simply with two integers (from and to) that define the boundaries of the sub-space.

If all the chunks have a fixed size, then the server only needs to maintain an integer in memory that identifies the starting point of the next chunk (a cursor over the space of solutions) and a list of the chunks currently being processed by every connected client.

When a new client joins the cluster, it will get the next chunk available (as pointed by the cursor) and the cursor gets moved forward.
The same happens when a client in the cluster finishes to analyze its chunk and requests a new one.

To make this clear, let's see an example where the size of our chunks is 3.

At first no client is connected, so the state of our distributed application can be represented as follows.

  "cursor": 0,
  "clients": {}

Then a client connects, so the server gives it the next available chunk ([0,2]) and moves
the cursor forward:

  "cursor": 3,
  "clients": {
    "client1": [0,2]

Then after some time, 2 new clients connect, client 2 arrives slightly earlier than client 3, so it gets the second chunk ([3,5]) while client 3 gets the third chunk ([6,8]):

  "cursor": 9,
  "clients": {
    "client1": [0,2],
    "client2": [3,5],
    "client3": [6,8]

Client 2 is "super" fast and after few milliseconds it already finished his job and requested a new batch, so it gets the next available chunk ([9,11]):

  "cursor": 12,
  "clients": {
    "client1": [0,2],
    "client2": [9,11],
    "client3": [6,8]

I think you got the idea...

This process continues until one of the clients find the solution in a chunk. New clients can join the cluster at any time.

When the solution is found, the server is notified which then notifies all the connected clients, so that they can stop and exit from the cluster.

To make it work, we will need an efficient algorithm to calculate the string associated to a specific integer over the given alphabet. For this purpose, we will use the library indexed-string-variations, which was built exactly for this use case. If you are curious to know how it works have a look at the official repository.

Networking patterns

Let's analyze what type of messages will flow on the network in order to choose the ideal networking patterns for our specific use cases.

From the clients point of view, we have 4 different type of networking messages:

  1. Start: a client joins the cluster and receives the current token, the current alphabet and a first batch to process.
  2. Batch: a client finishes to process a batch without finding the password and requests a new batch.
  3. Success: a client finds the password and communicates it to the server.
  4. Exit: a client receives an exit message because some other client in the cluster found the password.

To support these messages we can leverage two different networking patterns offered by ZeroMQ: the router/dealer pattern and the pub/sub pattern.

The router/dealer pattern is used to exchange messages between the server and the clients, and it supports complex multi-layer network structures. It allows managing multiple request-response cycles maintaining the relationship between every request and the associated response.

In our case, the server will act as a router dispatching tasks to the clients (the dealers) and expecting them to respond with a success (the password was found in the given batch) or a failure (the password was not found and a new batch can be processed). Every client gets a different batch, so every client has an exclusive router-dealer connection with the server. With this pattern we can manage Start, Batch and Success messages.

The pub-sub pattern connects a publisher to a set of subscribers, allowing a specific message to be distributed to all the subscribers that are interested in it. This is the perfect pattern to propagate (broadcast) the exit messages to all the clients. In the ZeroMQ implementation of this pattern, every message needs to have a topic and the subscriber needs to tell the server which topics they are interested in. In our case, we will only have the exit topic and every client will be subscribing it to receive the exit message.

To have a visual understanding of these patterns and see how they are composed in our architecture you can have a look at the following image:

ZeroMQ Rotuer/Dealer and Publish/Subscribe sample configuration

As you can see in the picture the server has two sockets. One to act as a router (to distributes the batches) and one to act as a publisher (to publish the exit signal). Every client has two sockets as well, one to act as dealer (to process the batches) and one to act as a subscriber (to listen for the exit signal).

Notice that the router/dealer connections are exclusive (not shared across clients), while every client subscribes the same channel on the server for the pub/sub connection.

Next Up

This was the first part of the article, where my aim was to get you up to speed on theory and to outline how the application will work. In the next part, we are going to actually build our password cracker application!

If you have any questions about this topic, find me in the comments section!

Meanwhile if you feel like you want to strengthen your knowledge of Node.js and design patterns to get ready for the second part, I recommend you to have a look at Node.js Design Patterns Second Edition.

A little spoiler: in the second part of the article, we are going to have a nice challenge with a prize, so be sure you won't miss it :)


This article was peer reviewed with great care by Arthur Thevenet, Valerio De Carolis, Mario Casciaro, Padraig O'Brien, Joe Minichino and Andrea Mangano. Thank you guys for the amazing support!