How we lowered operating costs, built a modern IT architecture, and enhanced automation by migrating processing workloads to the public cloud.
The Near platform derives privacy-safe intelligence on people and places across 44 countries using real-world signals. We handle 10 billion requests and 12 terabytes of data per day, processing data from 1.6 billion users and over 70 million places worldwide. When your platform helps businesses make quick and intelligent decisions, you need to be both quicker and more accurate yourself. And as a fast-growing startup that raised $100 million in growth capital last year, speed became the most important thing for us. To keep pace with this growth, we decided to migrate our entire infrastructure from on-premises to the public cloud.
In the data industry, especially when data is primarily used to derive consumer context, it is critical to understand how consumers behave not just in the digital world but also in the offline world. This is precisely one of the problems Near solves for its customers. In our everyday work, handling and processing this data rests with the tech team, which stores and manages terabytes of heterogeneous data and ensures it is used effectively.
When you deal with critical data at this stage of a company, migrating processing workloads to the public cloud is an important and difficult decision. It is no news that moving to the cloud helps lower operating costs, provides a modern IT environment, and enables highly automated development and operations. But as organizations get larger, the risk of migrating data grows, and many have struggled with the process. Consider this: around 60 percent of companies surveyed by McKinsey last year had migrated less than 10 percent of their workloads to the public cloud, having run into complications along the way.
A cloud migration strategy is not a matter of lift and shift. It requires upfront investment in time, money, and the team skills needed to operate in the cloud. As a company dealing with large-scale data from the real world, moving to the cloud was a natural step for us. Here is a look at how we started, what the drivers for this migration were, and the processes we followed.
Why migrate to the cloud?
The data world is more competitive than ever, and a platform handling real-time data processing requests needs to complete complex calculations and transactions at very high speed. Heterogeneous data collected in different types and formats across devices can have problems such as missing values and high redundancy. At large scale, data storage and processing become the backbone and foundation of efficient results.
For storage, the focus should not only be on the currently acquired data but also on the historical data within a certain time frame. Every data acquisition device is placed at a specific geographic location and every piece of data has a time factor associated with it. The time and space correlation is an important property of data from IoT.
While dealing with these sensitivities, the most fundamental step is to consider the use case for migrating your data to the public cloud. What are the benefits of migrating? How will it help your business? Is it worth the risk? Is the team ready to migrate? What is the timeline, and how ready is the organization? Should we first consider a hybrid approach? These questions were our starting point.
What are the main drivers for undertaking migration?
When you begin the process of migration, the major challenge is to ensure that the operations running at present are not disrupted, data is not lost, and downtime does not affect the business. But the step before that is to evaluate the factors and drivers of migration. At Near, these were the clear indicators for us to take the steps towards cloud migration.
- Maintaining high elasticity and low latency: High network latency regularly causes martech companies to exceed the RTB time limit (~100 ms) because of slow computation. This lag translates into lost bids and ad space, and can ultimately mean getting pushed out of sales channels because of consistently slow bids. With scale across the APAC, EMEA, and US regions, we realized that our real-time bidding infrastructure needed to be close to our partner locations so we could respond faster. With more than 10 billion requests per day, you need elasticity to integrate with more partners and grow unique data propositions. With on-premises infrastructure, moving into a new country or adding a new data partner is a time-consuming, high-cost effort: you need servers of different capacities, because data arrives at various speeds and intervals, and you must scale up and down as required. Migrating to the cloud solves the problem of elasticity and scaling on demand.
- Business agility for scaling infrastructure: With an on-premises solution, the agility to grow fast is constrained, and a lot of infrastructure capacity is wasted up front to ensure there is always room for unexpected growth in scale. One of our driving factors for moving to the cloud was to keep the cost of scaling infrastructure low. Scaling up also speeds data processing, which is essential to business agility and competitive advantage.
- Efficiency in data science teams: If data is the company's oil, data science teams are the engines that drive innovation. For these teams to function efficiently and work with structured data, the infrastructure supporting them must offer elasticity and scale matched to the size of the data they use for training and testing. With the flexibility of the cloud, data science models reach production much faster. A hybrid cloud setup, by contrast, essentially means data resides on one platform while processing happens on another, so you pay for the transfer. This brought us to the decision that it is better to be in one place, avoiding the transfer risk and the hidden costs a hybrid setup can carry.
- Quality of Service (QoS): Whether in architecture discussions or in production issues tied to our Service Level Agreement (SLA) commitments, our previous infrastructure provider struggled to live up to our expectations for processing large amounts of data. Had we waited longer and continued on the previous infrastructure, our quality of service and customer relationships could have suffered.
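The latency constraint in the first driver can be made concrete: a bidder must answer inside a fixed time budget or forfeit the bid. A minimal sketch (the function and field names are illustrative, not Near's actual bidder):

```python
import time

# Hypothetical RTB time budget, matching the ~100 ms ceiling mentioned above.
BID_BUDGET_MS = 100

def respond_to_bid(request, compute_bid, budget_ms=BID_BUDGET_MS):
    """Return a bid only if it was computed inside the time budget;
    otherwise return an explicit no-bid, since a late bid is a lost bid."""
    start = time.monotonic()
    bid = compute_bid(request)
    elapsed_ms = (time.monotonic() - start) * 1000
    if elapsed_ms > budget_ms:
        # Too slow: drop the bid rather than miss the exchange's deadline.
        return {"no_bid": True, "elapsed_ms": elapsed_ms}
    return {"price": bid, "elapsed_ms": elapsed_ms}

# A fast model stays inside the budget...
fast = respond_to_bid({"geo": "SG"}, lambda req: 1.25)
# ...while a slow one (simulated with a 150 ms sleep) becomes a no-bid.
slow = respond_to_bid({"geo": "SG"}, lambda req: (time.sleep(0.15), 1.25)[1])
```

In practice the computation itself would be cancelled or bounded (e.g. via a deadline passed down the call chain) rather than merely measured after the fact, but the budget-or-no-bid contract is the same.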
How to start the process of migration to the cloud?
Change is never easy, especially when it involves a certain degree of risk in moving on from infrastructure that had been running for six to seven years. The risks had to be documented in detail up front for the team driving the migration effort. After documentation, the team mapped the journey between the “current state” and the “target state” of the migration. Once we understood and documented the risk, we created a plan for mitigating it. Technology executions are easier when all stakeholders are aligned on the decision and see the value relative to the risk. We set a time frame of six weeks for our migration, which included a pre-verification checklist, the migration itself, and a feedback loop to ensure all subsystems were verified and a full cutover to AWS happened without any downtime.
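The pre-verification checklist mentioned above works as a gate: every check must pass before cutover proceeds, otherwise the team stays on the old infrastructure. A sketch of that gating logic, with made-up check names and results rather than Near's actual checklist:

```python
# Each check is a (name, callable) pair; the callable returns True on pass.
def run_preflight(checks):
    """Run all named checks; return (ok, names of failed checks)."""
    failures = [name for name, check in checks if not check()]
    return (len(failures) == 0, failures)

checks = [
    ("dns_cutover_ready",    lambda: True),
    ("replica_lag_under_1s", lambda: True),
    ("rollback_plan_tested", lambda: False),  # simulated failing check
]

ok, failed = run_preflight(checks)
# Cutover stays blocked while `failed` is non-empty; the failure list
# feeds the feedback loop described above.
```

The value of encoding the checklist this way is that the same gate can run repeatedly during the six-week window, turning "are we ready?" into an automated yes/no answer.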
The four-dimensional impact of migration
Migration to the cloud is not just the responsibility of the engineering team. It may be driven by engineering, but it impacts the entire business, as data is at the core of Near’s business model:
- Reputation impact: The first and foremost factor for any fast-growing startup is the reputation and relationships it maintains with the customers who trust it. This is the aspect that lays the foundation for the other factors to be considered during migration.
- Business impact: For us, downtime during migration means a direct impact on revenue. For example, what does product downtime mean in terms of revenue loss in dollars? Keeping this in mind, we worked with all stakeholders to identify the right time slot for migration so that the revenue impact would be minimal.
- Customer impact: This is directly related to reputation. Even if product downtime during migration is low, product functionality can be seriously affected if migration causes unexpected damage or data loss, in turn hurting our reputation with customers. In case something went wrong during migration, we were ready to roll back to our on-premises environment.
- Cost impact: Last but not least is the cost-versus-quality trade-off. From a business perspective, the focus is on cost efficiency; from an engineering perspective, it is on high quality of service. The key is to maintain the balance between the two, and to that end we continue to optimize our platform.
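The revenue question raised under business impact lends itself to a simple back-of-the-envelope model; the figures here are invented purely for illustration:

```python
def downtime_cost(hourly_revenue_usd, downtime_hours, affected_fraction):
    """Estimated revenue lost while a fraction of traffic is down."""
    return hourly_revenue_usd * downtime_hours * affected_fraction

# Illustrative numbers only: a 2-hour migration window affecting 30% of
# traffic on a platform earning $5,000/hour.
loss = downtime_cost(5000, 2, 0.30)  # → 3000.0
```

Running this model for each candidate time slot is one way to pick the window where the dollar impact is smallest, which is the exercise described above.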
Migrated to the cloud? What next?
Once the entire infrastructure has been successfully migrated to the cloud and the biggest challenge, scaling elastically, has been met, there is more work to be done. This is when the engineering team must ensure the platform is continuously optimized to keep costs in check, and kept up to date with services and the tech stack to get the maximum benefit out of the cloud. This also helps us monitor real-time changes to critical infrastructure and predict workload contention.
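Keeping cost under check usually starts with watching for anomalies in daily spend. A minimal sketch of that idea, with an illustrative threshold and made-up numbers (real setups would pull the series from the cloud provider's billing API):

```python
def spend_anomalies(daily_spend, window=7, threshold=1.5):
    """Return indices of days whose spend exceeds `threshold` times the
    average of the preceding `window` days."""
    flagged = []
    for i in range(window, len(daily_spend)):
        baseline = sum(daily_spend[i - window:i]) / window
        if daily_spend[i] > threshold * baseline:
            flagged.append(i)
    return flagged

# A steady ~$100/day series with one sudden spike on day index 7.
spend = [100, 102, 98, 101, 99, 100, 103, 250, 101]
flagged = spend_anomalies(spend)
```

A flagged day is a prompt to look at what changed, such as a runaway job, an unoptimized query, or an instance left running, which is exactly the continuous-optimization loop described above.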
At Near, privacy is not an afterthought but something built in by design. So, beyond improving real-time monitoring, our priority was to assess the security of data at rest and in transit. Regulations such as GDPR were an integral part of our planning as we moved numerous heterogeneous datasets meticulously from on-premises infrastructure to the public cloud.
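The at-rest and in-transit requirement can be enforced mechanically: audit every dataset descriptor and refuse any that lacks either protection. A toy version of such an audit, with hypothetical field names rather than Near's actual metadata schema:

```python
def insecure_datasets(datasets):
    """Return names of datasets missing encryption at rest or TLS in transit."""
    return [d["name"] for d in datasets
            if not (d.get("encrypted_at_rest") and d.get("tls_in_transit"))]

# Illustrative dataset descriptors; one is deliberately non-compliant.
datasets = [
    {"name": "events",  "encrypted_at_rest": True,  "tls_in_transit": True},
    {"name": "exports", "encrypted_at_rest": False, "tls_in_transit": True},
]
bad = insecure_datasets(datasets)
```

Run as part of the migration pipeline, a check like this stops a non-compliant dataset before it ever leaves the old environment.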
While the volume of data we collect and process continues to increase, this milestone of successfully migrating all the present and future data to the cloud at the right time undoubtedly helps us be a frontrunner in the evolving data economy.