From data lake to data swamp


“Our data lake turned into a data swamp!” We are hearing this observation more often. Why is this happening, and what can be done to mitigate it?

 

Schema-less databases are embarking on the next phase in their classic technology-hype cycle.

[Figure: technology hype cycle]

Many customers tell us that they are currently on the second downward slope towards disillusionment with their data lakes. Data lakes can grow fast, which is a sign of their success: without the constraints of a rigid schema, new data can be added very easily. Unfortunately, this growth is often accompanied by entropy, because the overlapping data sources in the lake carry multiple, subtly different meanings that reflect the messiness of the real world.

In the early years of the NoSQL hype cycle (Hadoop, MongoDB, Cassandra and many others), it was common to hear predictions that these exciting new database systems would displace vast numbers of legacy relational databases. Although data lakes are simplifying some warehousing problems, this displacement has generally not happened in practice. Oracle and SQL Server still dominate, and data management circles are coming to realise that data lakes and relational databases need to co-exist rather than rip and replace each other.

So how can we prevent this entropy caused by rapid addition of data sources in our lakes?

1. Ownership and stewardship

The data management system should allocate owners to all the data sources in the lake.  Just because a data lake is schema-less does not mean it should be owner-less.  A person who understands the source of the data needs to steward it in the lake.
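As a rough illustration of what "no owner-less sources" could look like in practice, the sketch below keeps a small catalogue of lake sources and flags any that lack a steward. The `DataSource` class, the `unowned_sources` helper and the source names are hypothetical and invented for this example; they do not describe any particular product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataSource:
    """Hypothetical catalogue entry for one data source in the lake."""
    name: str
    owner: Optional[str]  # the steward who understands where the data comes from


def unowned_sources(sources: List[DataSource]) -> List[str]:
    """Return the names of sources that have no steward assigned."""
    return [s.name for s in sources if not s.owner]


# Illustrative catalogue; the names are made up for this sketch.
catalog = [
    DataSource("trades_raw", owner="alice"),
    DataSource("counterparty_dump", owner=None),
]

print(unowned_sources(catalog))
```

A check like this could run whenever a new source is added, so that an owner-less source is caught before it spreads through the lake.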

2. Apply semantics to the lake

A common complaint from business users is that they do not know the meaning of the lake data. Attributes can have multiple meanings from the perspective of different consumers. The data management system needs to manage a semantic layer on top of the physical store.
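One way to picture such a semantic layer is as a lookup that sits beside the physical store and records what each attribute means to each consumer. The sketch below is a minimal illustration under that assumption; the `SemanticLayer` class, its methods and the example attribute are hypothetical, not an existing API.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class SemanticLayer:
    """Hypothetical semantic layer: (source, attribute) -> {consumer: meaning}."""
    definitions: Dict[Tuple[str, str], Dict[str, str]] = field(default_factory=dict)

    def define(self, source: str, attribute: str, consumer: str, meaning: str) -> None:
        """Record what an attribute means from one consumer's perspective."""
        self.definitions.setdefault((source, attribute), {})[consumer] = meaning

    def meaning(self, source: str, attribute: str, consumer: str) -> str:
        """Look up a consumer-specific meaning, or report that none is defined."""
        per_consumer = self.definitions.get((source, attribute), {})
        return per_consumer.get(consumer, "undefined")


layer = SemanticLayer()
# The same physical attribute carries different meanings for different consumers.
layer.define("trades_raw", "value", "risk", "mark-to-market value in USD")
layer.define("trades_raw", "value", "finance", "booked notional in trade currency")

print(layer.meaning("trades_raw", "value", "risk"))
print(layer.meaning("trades_raw", "value", "compliance"))
```

Keeping the meanings separate from the stored data leaves the lake schema-less while still giving business users somewhere to find out what an attribute means to them.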

About Stream Financial

We enjoy those intractable business problems you hate. We employ our experience and leading-edge software solutions to help your organisation realise its full potential. Our approach is widely effective for business processes across all organisations and business sectors.

