A heavily siloed analytical ecosystem caused issues with data integrity, consistency, and trust. As a company aiming to follow modern architectures and best practices, the bank wanted to limit the number of errors, reduce operational risk, lower OPEX, and improve data quality. The new data architecture was expected to solve the current issues and meet the demands of processing both structured and unstructured data. Such a transformation is a challenge from the organizational perspective, so the work was carried out in line with the agile methodology.
To address these challenges, our team proposed a Data Lake-based Lambda Architecture as the TO-BE state. The solution’s key architectural assumption is the separation of the replication and serving layers.
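To illustrate the Lambda Architecture idea, here is a minimal, self-contained sketch: a batch layer recomputes views over the full master dataset, a speed layer holds only recently arrived records, and the serving layer merges the two at query time. All names (`BatchLayer`, `SpeedLayer`, `serve`) and the per-account aggregation are illustrative assumptions, not the bank's actual components.

```python
from collections import defaultdict

class BatchLayer:
    """Recomputes views over the full, append-only master dataset."""
    def __init__(self):
        self.master = []  # immutable record store: (account, amount)

    def ingest(self, records):
        self.master.extend(records)

    def compute_view(self):
        # Example precomputed view: total amount per account.
        view = defaultdict(float)
        for account, amount in self.master:
            view[account] += amount
        return dict(view)

class SpeedLayer:
    """Holds only records that arrived since the last batch run."""
    def __init__(self):
        self.recent = defaultdict(float)

    def ingest(self, account, amount):
        self.recent[account] += amount

def serve(batch_view, speed_view, account):
    """Serving layer: merge the precomputed batch view with the
    incremental speed view to answer a query."""
    return batch_view.get(account, 0.0) + speed_view.get(account, 0.0)

batch = BatchLayer()
batch.ingest([("acc-1", 100.0), ("acc-2", 50.0)])
view = batch.compute_view()

speed = SpeedLayer()
speed.ingest("acc-1", 25.0)  # event arriving after the batch run

print(serve(view, speed.recent, "acc-1"))  # 125.0
```

The point of the split is that queries never wait for a full batch recomputation: the speed layer covers the gap between batch runs, and its contents are discarded once the next batch view absorbs them.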
BlueSoft delivered an MVP that features a fully automated replication layer aggregating data from various data sources (legacy systems). Data ingestion is based on a metadata-driven development approach with automatic ETL code generation.
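A hedged sketch of what metadata-driven ETL code generation can look like: table definitions live in metadata, and the replication SQL is rendered from them rather than hand-written per source system. The metadata schema, table names, and the `generate_merge_sql` helper below are hypothetical, introduced only to illustrate the approach.

```python
# Illustrative metadata record for one replicated table.
TABLE_METADATA = {
    "source": "legacy_core.customers",
    "target": "lake.raw_customers",
    "key_columns": ["customer_id"],
    "columns": ["customer_id", "name", "segment", "updated_at"],
}

def generate_merge_sql(meta):
    """Render an upsert (MERGE) statement from table metadata."""
    cols = ", ".join(meta["columns"])
    on = " AND ".join(f"t.{k} = s.{k}" for k in meta["key_columns"])
    updates = ", ".join(
        f"t.{c} = s.{c}"
        for c in meta["columns"] if c not in meta["key_columns"]
    )
    values = ", ".join(f"s.{c}" for c in meta["columns"])
    return (
        f"MERGE INTO {meta['target']} t\n"
        f"USING {meta['source']} s\n"
        f"ON {on}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT ({cols}) VALUES ({values});"
    )

print(generate_merge_sql(TABLE_METADATA))
```

With this pattern, onboarding a new source table means adding one metadata record, not writing a new ETL job, which is what makes a replication layer of this kind fully automatable.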
A joint agile team is currently developing the remaining layers, including batch and stream processing, together with a serving layer that democratizes data across the organization.
The Data Transformation Architecture is intended to serve as the new single source of truth for data stored in the bank’s ecosystem. Thanks to automation, the data flows are resilient, secure, and fast, and the use of proven design patterns such as Data Lake and Data Hub allows data consumers to access data easily.
Working side by side with the bank’s team, we formed a truly interdisciplinary product team of architects, analysts, developers, and ops specialists working within the Scrum framework.