Back in the day when data infrastructure was just picking up, businesses were heavily reliant on batch data processing. They used to leverage at-rest data, stored across systems over a period of time, to drive insights for improving business outcomes. Now, data volumes have exploded at all levels, driving the need to take action on data as it flows through systems – known as streaming data processing.
In a bid to power use cases such as real-time product recommendations and fraud detection, enterprises around the world are using streaming storage platforms such as Amazon Kinesis and Apache Kafka. They use these platforms to continuously capture gigabytes of data per second from hundreds of thousands of sources, on top of which developers build real-time streaming applications capable of processing and reacting to events with sub-second latency.
While the process sounds simple, implementing it has long been a challenging endeavor. First, a company needs developers highly skilled in distributed systems and data management to build real-time applications. Then, these engineers have to provision servers or clusters and work around the clock to ensure not only delivery guarantees, fault tolerance, elasticity and security in the product, but also smooth, reliable 24/7 operation at scale.
DeltaStream’s serverless database
To simplify this process, DeltaStream offers a serverless database that manages, secures and processes all the streams – connected via streaming storage platforms – for various use cases. The company, founded by Hojjat Jafarpour, announced it has emerged from stealth with $10 million in seed funding.
“DeltaStream sits above the streaming storage services such as Apache Kafka and enables users to build real-time streaming applications and pipelines in familiar SQL language,” Jafarpour told VentureBeat. “The solution is serverless: meaning users just need to focus on building their applications and pipelines, and DeltaStream takes care of running them, complete with scaling up and down, fault tolerance and isolation.”
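To make this concrete, a pipeline of the kind Jafarpour describes might be expressed as a continuous SQL query. The snippet below is a hypothetical sketch modeled on common streaming-SQL dialects (ksqlDB-style); the stream names (`payments`, `rapid_card_use`), column names and windowing syntax are illustrative assumptions, not DeltaStream's documented syntax:

```sql
-- Hypothetical streaming-SQL sketch: a simplified fraud-detection
-- pipeline that flags cards with an unusual burst of transactions.
-- The query runs continuously, updating as new events arrive.
CREATE TABLE rapid_card_use AS
  SELECT card_id, COUNT(*) AS txn_count
  FROM payments
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY card_id
  HAVING COUNT(*) > 10;
```

The appeal of this model is that a one-time SQL statement defines a long-running computation: the platform, not the developer, is responsible for keeping it running, scaling it and recovering from failures.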
With the serverless model, users no longer provision resources for their applications; capacity is allocated automatically and they pay only for what is used. Meanwhile, SQL's simplicity gives users a familiar way to manage, secure and query their data-in-motion. DeltaStream also organizes the data into schemas and databases, and provides role-based access controls for restricting who can access the flowing information and what they can do with it.
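The relational-style organization and access control described above could, in a similarly hypothetical SQL dialect, look like ordinary DDL and GRANT statements (all database, schema, stream and role names here are invented for illustration):

```sql
-- Hypothetical sketch: grouping streams into databases and schemas,
-- then restricting who may read a sensitive stream.
CREATE DATABASE payments_db;
CREATE SCHEMA payments_db.prod;

-- Only members of the fraud_analyst role may query the payments stream.
GRANT SELECT ON payments_db.prod.payments TO ROLE fraud_analyst;
```

This mirrors how permissions work in a conventional relational database, which is the familiarity the company is banking on.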
“DeltaStream’s model of providing the compute layer on top of users’ streaming storage systems … eliminates the need for data duplication and doesn’t add unnecessary latency to real-time applications and pipelines,” the company said in its blog post.
Other offerings that tackle the same challenge include Confluent's ksqlDB, Amazon Kinesis Data Analytics, Azure Stream Analytics and Google Cloud Dataflow. However, according to Jafarpour, these are all restricted to certain streaming-storage platforms. In contrast, DeltaStream is platform-agnostic and can work with major streaming data stores such as Apache Kafka, Amazon Kinesis and Apache Pulsar.
“Also, in addition to processing, DeltaStream enables users to organize and secure their streaming data similar to the relational databases, which the other systems don’t,” the CEO added.
Currently, DeltaStream is available in private beta to a limited set of customers on AWS. The company plans to use the seed round, led by New Enterprise Associates (NEA), to build on the core offering before heading toward general availability. Jafarpour did not share an exact timeline but confirmed plans to expand the product to GCP and Azure soon.
Once generally available, DeltaStream will be accessible through a REST API, a CLI application or a web app.