There’s never been a better time to be a data engineer, in large part due to the rapid innovation and rising popularity of modern warehouses and data processing tools. When working with customer data, though, collecting all of your relevant information in one place and making it usable around the organization is a non-trivial challenge.

Customer Data Platforms (CDPs) have tried to solve for data collection and activation, but unfortunately most of them make the problem worse by creating additional data silos and integration gaps. …


As customer data projects become more important, engineering teams are increasingly leading these efforts, collaborating with marketing, product and data science teams to reveal insights and drive the customer journey with data.

Unfortunately, they face a poor choice of tools to help them build the required infrastructure. That choice is often between closed solutions designed for marketers, which limit flexibility, and building everything from scratch, which forces engineers to spend time on low-level plumbing problems.

RudderStack’s goal is to provide developers with the open, flexible tools they need to build customer data infrastructure. As the leading open source Customer Data…


RudderStack is an engineer-focused, developer-first alternative in a world of tools that cater to marketers. We believe deeply in this approach and have written about it before. With our engineering-heavy influence, it probably comes as no surprise that design is one area that’s ripe for attention.

While it’s natural to focus on building more, we know thoughtful design must work in harmony with robust development to unlock our product’s full potential. …


Customer data pipelines play a critical role in the privacy of your customer data. They are one of the primary and most expansive collectors of your customers’ personally identifiable information (PII). They are also one of the most expansive sharers of customer data, since one of their primary use cases is streaming events to what are often large libraries of destination integrations.

Due to their specialized role of collecting and sharing customer data, customer data pipelines can either help ensure your data privacy or wreak havoc on it.

This post will explain how your customer data pipeline can help improve your…


Relational Data and Beyond

In part one, we talked about the importance of taking a holistic view of both data and infrastructure when building a data stack. We highlighted the essential role that categories play in this holistic view, and we detailed the first of two major sources of data, event data.

In this post, we’ll cover the other major source of data, relational data. We’ll outline how to collect relational data from both cloud applications and databases, and we’ll note two other lesser but still important sources of data.

Relational Data

This is another big category of data that we almost always have to work…


Astasia from RedPoint Ventures wrote a great post on new technologies supporting “reverse ETL” functionality in the customer data stack.

We’re excited to be innovating in the area of reverse ETL tech (via our Warehouse Actions feature), and our product and engineering teams discuss these topics and industry trends often, so we thought it would be helpful to provide a bit more technical depth on a few of Astasia’s points.

1) Data Movement Differs Between Event Streams and Tabular Data, Which Is an Important Consideration for Reverse ETL

Differences in Moving Data

ETL/ELT solutions all accomplish a similar function, moving data, but there are several foundational differences to keep in mind when it comes to the data. …


The Importance of Categories

Even the best possible data stack is completely useless without data. For this reason, the first problem we always face when building data infrastructure is deciding what data to collect, where to collect it from, and how to collect it.

Of course, there are other things to keep in mind while figuring out the data we will be working with, such as what kind of delivery semantics we need and how we will process the data later on.

In the end, the data we will be working with and the infrastructure we will be…


“Before RudderStack, I tried to build customer data pipelines inside a large enterprise using homegrown and vendor solutions. This article summarizes what I learned both building and buying customer data pipelines over the last ten years and how I would approach the challenge today.”

- Soumyadeb Mitra

A major initiative for all modern companies is aligning every team around driving a better experience for their users or customers across every touchpoint. This makes sense: happy, loyal users increase usage, business growth, and ultimately revenue.

Creating powerful experiences for each user, especially when it comes to use case personalization, is easier…


Thank you Yakko Majuri from PostHog for coming up with the idea for this article and for your feedback and contributions to it.

A data analytics stack enables all of the teams across your organization to look at important metrics and make data-driven decisions. It integrates different technologies needed to efficiently collect, store, transform, and analyze your data and to derive critical insights from it.

When it comes to using an analytics stack, businesses are often faced with two choices — buy one or more proprietary tools or build an open-source analytics stack. While proprietary tools often offer best-in-class analytics…


Your data warehouse is the platform your customer data stack is built on. It acts as a robust central system for driving critical business decisions while maintaining a unified source of truth for your entire organization. Businesses are constantly looking to improve their product or marketing strategies to get an extra edge over their competitors. Data warehouses help you get that competitive edge by serving analytical data to make fact-based decisions that drive innovation and business growth. …

RudderStack

RudderStack is the CDP for developers. - https://rudderstack.com
