There’s never been a better time to be a data engineer, in large part due to the rapid innovation and rising popularity of modern warehouses and data processing tools. When working with customer data, though, collecting all of your relevant information in one place and making it usable around the organization is a non-trivial challenge.

Customer Data Platforms (CDPs) have tried to solve for data collection and activation, but unfortunately most of them make the problem worse by creating additional data silos and integration gaps. …


RudderStack is an engineer-focused, developer-first alternative in a world of tools that cater to marketers. We believe deeply in this approach and have written about it before. With our engineering-heavy influence, it probably comes as no surprise that design is one area that’s ripe for attention.

While it’s natural to focus on building more, we know thoughtful design must work in harmony with robust development to unlock our product’s full potential. …


Customer data pipelines play a critical role in the privacy of your customer data. They are one of the primary and most expansive collectors of your customers’ personally identifiable information (PII). They are also among the most expansive sharers of customer data, since one of their primary use cases is streaming events to what are often large libraries of destination integrations.

Due to their specialized role of collecting and sharing customer data, customer data pipelines can either help ensure your data privacy or wreak havoc on it.

This post will explain how your customer data pipeline can help improve your…


Relational Data and Beyond

In part one, we talked about the importance of taking a holistic view of both data and infrastructure when building a data stack. We highlighted the essential role that categories play in this holistic view, and we detailed the first of two major sources of data, event data.

In this post, we’ll cover the other major source of data, relational data. We’ll outline how to collect relational data from both cloud applications and databases, and we’ll note two other lesser but still important sources of data.

Relational Data

This is another big category of data that we almost always have to work…


Astasia from RedPoint Ventures wrote a great post on new technologies supporting “reverse ETL” functionality in the customer data stack.

We’re excited to be innovating in the area of reverse ETL tech (via our Warehouse Actions feature), and our product and engineering teams discuss these topics and industry trends often, so we thought it would be helpful to provide a bit more technical depth on a few of Astasia’s points.

1) Data Movement Differs Between Event Streams and Tabular Data, Which is an Important Consideration for Reverse ETL

Differences in Moving Data

ETL/ELT solutions all accomplish a similar function, moving data, but there are several foundational differences to keep in mind when it comes to the data. …


The Importance of Categories

Even the best possible data stack is completely useless without data. For this reason, the first problem we always face when building data infrastructure is what data we are going to be collecting, from where, and how we should do it.

Of course, there are also other things we should keep in mind while trying to figure out the data we will be working with. For example, we should consider what kind of delivery semantics we need and how we will be processing the data later on.

In the end, the data we will be working with and the infrastructure we will be…


“Before RudderStack, I tried to build customer data pipelines inside a large enterprise using homegrown and vendor solutions. This article summarizes what I learned both building and buying customer data pipelines over the last ten years and how I would approach the challenge today.”

- Soumyadeb Mitra

A major initiative for all modern companies is aligning every team around driving a better experience for their users or customers across every touchpoint. This makes sense: happy, loyal users increase usage, business growth, and ultimately revenue.

Creating powerful experiences for each user, especially when it comes to use case personalization, is easier…


Thank you Yakko Majuri from PostHog for coming up with the idea for this article and for your feedback and contributions to it.

A data analytics stack enables all of the teams across your organization to look at important metrics and make data-driven decisions. It integrates different technologies needed to efficiently collect, store, transform, and analyze your data and to derive critical insights from it.

When it comes to using an analytics stack, businesses are often faced with two choices — buy one or more proprietary tools or build an open-source analytics stack. While proprietary tools often offer best-in-class analytics…


Your data warehouse is the platform your customer data stack is built on. It acts as a robust central system for driving critical business decisions while maintaining a unified source of truth for your entire organization. Businesses are constantly looking to improve their product or marketing strategies to get an extra edge over their competitors. Data warehouses help you get that competitive edge by serving analytical data to make fact-based decisions that drive innovation and business growth. …


Overview

This post breaks down 1mg’s data stack, which allows them to harness unlimited, real-time data securely. We will also look at the tools they use to activate this data for their downstream analytics and personalization use cases.

Who is 1mg?

1mg is an online platform that provides medical diagnostics, consultation, lab tests, and general healthcare. Every day, millions of users visit the 1mg website or use their apps to buy medicine, schedule time with doctors, or find helpful medical information.

1mg’s Data Stack

  • Data Collection and Synchronization: RudderStack SDKs, custom scripts
  • Data Transformation and Enrichment: SQL, RudderStack Transformations, AWS Athena
  • Data Lake and Downstream Databases: Amazon S3…

RudderStack

RudderStack is the CDP for developers. - https://rudderstack.com
