Scaling markets in real time
As part of its company-wide digital transformation, Societe Generale is structuring IT developments as reusable building blocks exposed through APIs as services. Among these, a financial market dataset platform, code-named Geysir, provides quotes describing financial instruments to several internal clients. Geysir can be thought of as a market provider, where a market is a coherent set of financial data, such as the price of gold, the dollar-to-euro exchange rate, or the London interbank interest rate. The custom markets that Geysir provides can differ in their currencies, instruments, cut-offs, and other mathematical model parameters.
Several teams are using Geysir for different purposes. Traders use it to provide real-time market data to price financial products. The middle office needs snapshots of a market to value a portfolio or compute its risks and profit and loss. Financial engineers use it to retrieve past values to back-test strategies.
Geysir’s predecessor was designed to distribute a single, real-time market in a push workflow. As more internal tools began to request more quotes from this platform, the team was faced with a growing need to generate customizable markets. Although the platform was stable, it wasn’t designed to keep pace with multiple custom market quotes. That’s when the team began gathering requirements for Geysir, as a new solution with greater capacity, flexibility, and scale.
Today, a single snapshot market builds approximately 30,000 instruments, but the forecast calls for at least 30 different snapshot markets building more than 40,000 instruments each. In a real-time market, 218,000 instruments are built continuously, and this number is expected to reach 1 million by the end of the year.
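To put these figures side by side, the short calculation below works out the growth they imply. It is only a back-of-the-envelope sketch in Python: the input numbers come from the paragraph above, the variable names are illustrative, and because the forecast is stated as "at least" and "more than", the snapshot total is a lower bound.

```python
# Figures quoted in the text above.
SNAPSHOT_INSTRUMENTS_TODAY = 30_000        # one snapshot market today
SNAPSHOT_MARKETS_FORECAST = 30             # "at least 30" markets
INSTRUMENTS_PER_MARKET_FORECAST = 40_000   # "more than 40,000" each
REALTIME_INSTRUMENTS_TODAY = 218_000
REALTIME_INSTRUMENTS_TARGET = 1_000_000

# Lower bound on instruments built across all forecast snapshot markets.
snapshot_total = SNAPSHOT_MARKETS_FORECAST * INSTRUMENTS_PER_MARKET_FORECAST

# Growth factors relative to today's volumes.
snapshot_growth = snapshot_total / SNAPSHOT_INSTRUMENTS_TODAY
realtime_growth = REALTIME_INSTRUMENTS_TARGET / REALTIME_INSTRUMENTS_TODAY

print(f"Snapshot instruments (lower bound): {snapshot_total:,}")  # 1,200,000
print(f"Snapshot growth factor: {snapshot_growth:.0f}x")          # 40x
print(f"Real-time growth factor: {realtime_growth:.1f}x")         # 4.6x
```

In other words, the snapshot workload alone is forecast to grow by a factor of at least 40, while the real-time workload grows by a factor of roughly 4.6.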
Markets as a service
Written in C#, the predecessor of Geysir was a set of microservices designed to offer client applications an API for composing a market from different data sources with customized refresh rates. Some services were dedicated to retrieving live quotes, transforming them to the company’s internal model, and pushing the transformed model to a cache (MongoDB). Other services picked up the new data from the cache at a given interval, transformed it to the clients’ format, and then published the result to the clients.
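The two-stage flow described above (ingest into a cache, then publish on an interval) can be sketched as follows. This is a minimal illustration, not the team's implementation: the original services were written in C# and used MongoDB as the cache, whereas this sketch uses Python and an in-memory dictionary for brevity. All names (`Quote`, `QuoteCache`, `ingest`, `Publisher`) and the raw and client message shapes are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Quote:
    """Hypothetical internal model for one quote."""
    instrument: str
    value: float
    version: int  # increases with every write to the cache


class QuoteCache:
    """In-memory stand-in for the cache (MongoDB in the original system)."""

    def __init__(self) -> None:
        self._latest: Dict[str, Quote] = {}
        self._version = 0

    def push(self, instrument: str, value: float) -> None:
        # Keep only the latest quote per instrument.
        self._version += 1
        self._latest[instrument] = Quote(instrument, value, self._version)

    def changed_since(self, version: int) -> List[Quote]:
        fresh = [q for q in self._latest.values() if q.version > version]
        return sorted(fresh, key=lambda q: q.version)


def ingest(raw: dict, cache: QuoteCache) -> None:
    """Feed side: transform a raw tick to the internal model and cache it."""
    cache.push(raw["symbol"].upper(), float(raw["px"]))


class Publisher:
    """Client side: on each interval, publish only what changed."""

    def __init__(self, cache: QuoteCache, send: Callable[[List[dict]], None]) -> None:
        self._cache = cache
        self._send = send
        self._last_seen = 0

    def tick(self) -> int:
        """Run one publishing interval; return the number of quotes sent."""
        fresh = self._cache.changed_since(self._last_seen)
        if fresh:
            self._last_seen = fresh[-1].version
            # Transform to the (assumed) client format before publishing.
            self._send([{"id": q.instrument, "price": q.value} for q in fresh])
        return len(fresh)


# Usage: three raw ticks arrive, then two publishing intervals elapse.
cache = QuoteCache()
sent: List[List[dict]] = []
publisher = Publisher(cache, sent.append)

ingest({"symbol": "xau", "px": "1890.5"}, cache)
ingest({"symbol": "eurusd", "px": "1.08"}, cache)
ingest({"symbol": "xau", "px": "1891.0"}, cache)

first = publisher.tick()   # publishes the latest XAU and EURUSD quotes
second = publisher.tick()  # nothing changed, so nothing is published
```

Decoupling the two sides through the cache is what lets each client get its own refresh rate: the feed side writes as fast as quotes arrive, and each publisher reads on its own schedule and sees only the most recent value per instrument.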
The team had to design a new solution to address the shortcomings of the former platform:
- Fault isolation. The old architecture was not truly isolated. Some connectors were in the same services, and a failure of one affected them all.
- Reliability. Only one instance of the services was deployed on a single virtual machine, without any orchestration mechanism.
- Scalability. The single-service microservices architecture had no way of managing scalability other than scaling up the machines the services ran on.
The team also wanted to add a set of new features, such as:
- Segregation of business capabilities and support for multiple custom markets. The new implementation needed features for market data retrieval, transformation processing, and a filtering and querying engine.
- Zero downtime upgrades to ensure a strong service-level agreement (SLA).
- Platform consistency. The goal was to build a consistent microservices ecosystem and to create a single platform capable of deploying, running, and scaling multiple versions of the application, side by side.
Service Fabric as a platform for microservices
To rearchitect the system, the Geysir team needed a new approach, but the developers didn’t have the time or bandwidth to gain expertise in distributed caching and all the other necessary technologies.
They consulted with another development team at Societe Generale that was responsible for a financial simulation platform. This platform was the main Geysir client, relying on Geysir data for its computations, and was built using Service Fabric. The simulation platform team showed the Geysir team what Service Fabric could do, and it looked like the natural solution to meet the design requirements. Service Fabric was also compatible with the large, existing C# code base.
Even though there was no requirement for Geysir to use the same technology as its main client, the platforms would continue to work closely together. Both teams could benefit from sharing knowledge and components.
Without Service Fabric, the team would have to learn and deal with:
- Service orchestration (using, for example, Swarm or Kubernetes).
- Microservice communication (such as through Fabio and Consul).
- Microservice monitoring and health (using Consul).
- Distributed cache (using Redis).
- Key-value store (in MongoDB or Azure Table storage).
- Queuing (through RabbitMQ or Azure Service Bus, for example).
With Service Fabric, the team can create Service Fabric clusters on any virtual machines or computers running Windows Server. For a team of three who were supporting a legacy application and developing its successor, the fact that Service Fabric manages most of the complexity involved in developing and managing distributed applications was a big bonus.
For security reasons and because the application is for internal use only, the Geysir team set up an on-premises cluster of virtual machines to host the applications. The production cluster has eight large nodes in three locations, with 10 applications, 60 services, and more than 700 replicas.