Customer: The Client is a US$19.2B American multinational personal care corporation
that produces sanitary paper products and surgical and medical instruments.
Problem:
• The Client’s data lake contained hundreds of poorly structured databases. The Client
wanted to migrate from SQL Server on Azure to Snowflake on Azure.
• The Client ran on Azure using React, Node.js, Databricks and Snowflake. The Client
wanted daily deployments, which the existing architecture did not support.
• The existing pipeline ran: output to data lake -> Excel -> Snowflake -> app API.
It was manually triggered and not fully automated.
• Continuous deployment (CD) frequently broke, and the Client did not have the Linux,
Java or Tomcat skills to diagnose or fix these problems quickly.
Solution:
• We created a fully automated version of the pipeline. Java and Spring Boot were
used to create the APIs.
• The model ran in Databricks; its output was written to Databricks and then moved
into Snowflake.
• We started with dirty, disorganized data and incompatible schemas. CI/CD was
originally on Atlassian but was moved to Azure; it is based on Git and uses Java and
Tomcat, with the web app running inside Docker.
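
The automated flow described above (model output in Databricks, load into Snowflake, publish through the API) can be pictured as a chain of stages that run in order without a manual trigger. The following is a minimal Java sketch of that idea only, not the Client's actual code: the class name PipelineSketch, the Stage record, the stage names and the string payloads are all hypothetical placeholders, and no real Databricks, Snowflake or Spring Boot calls are made.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical illustration: names and payloads are assumptions, not the project's code.
final class PipelineSketch {

    // A named stage that takes a payload and returns the transformed payload.
    record Stage(String name, UnaryOperator<String> step) {}

    // Runs every stage in order with no manual hand-off between them.
    // Stops at the first failure and records which stage broke.
    static List<String> run(List<Stage> stages, String input) {
        List<String> log = new ArrayList<>();
        String payload = input;
        for (Stage stage : stages) {
            try {
                payload = stage.step().apply(payload);
                log.add(stage.name() + ": ok");
            } catch (RuntimeException e) {
                log.add(stage.name() + ": failed (" + e.getMessage() + ")");
                break;
            }
        }
        return log;
    }

    // Placeholder stages mirroring the flow described in the text.
    static List<Stage> defaultStages() {
        return List.of(
            new Stage("databricks-model", in -> in + " -> model output"),
            new Stage("snowflake-load", in -> in + " -> snowflake"),
            new Stage("api-publish", in -> in + " -> api"));
    }
}
```

Calling PipelineSketch.run(PipelineSketch.defaultStages(), "raw") returns one log entry per stage. Recording the name of the failing stage is a design choice made here to show how a broken run could be diagnosed quickly; the source does not say how failures were reported.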