Sampling matters for any brand that wants to explore large volumes of data effectively. Whether you're trying to understand product usage or gauge engagement in your target audience, sampled data is key. And in behavioral analytics and customer journey analytics, finding the right question is often even harder than finding the right answer.
Sampling your data with a customer intelligence platform supercharges the process by reducing the query response time to a few seconds even across billions of events. Brands can explore questions faster, avoid dead-end answers, and dive deeper into data. A typical exploratory session starts with a vague idea about a gem that might be lurking in the data, turns into a series of queries that zero in on the right question, and then saves the final query by pinning it to a dashboard.
Other solutions require sampling just to be able to work at scale. With customer intelligence and real-time analytics, sampling is a useful tool, but it is not essential for fast results at massive scale. You can obtain statistically accurate sampled results in real time, refine your queries, and, when ready, get fast unsampled results for hundreds of billions of events. Customer intelligence platforms, like Scuba, can ingest and store all raw events and optionally sample during the query.
Let’s review some key concepts:
Many new applications are connected applications with millions of users and thousands of events per user session. Many brands see hundreds of millions, or even billions, of events per hour. Ingesting, storing, and analyzing all that data in terms of behavior takes a dedicated approach. General-purpose solutions for big data analytics might keep up at smaller scales, but at high volumes they're forced to make compromises: using much more expensive clusters, taking longer to return answers, and depending on busy data scientists to translate basic questions into code.
The discovery process is exploratory by nature, and exploration is best done interactively. There’s something compelling about diving deep into the data and seeing it in new ways and from different angles. Time to discovery is a critical measure of an analytics solution. The good news is that there’s usually a balance between how accurate an answer needs to be, how much it costs to get a more accurate answer, and how much value additional accuracy brings to the organization.
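To make the accuracy-versus-cost balance concrete, here is a minimal sketch using the standard margin-of-error formula for a proportion. It assumes a simple random sample of independent actors and a 95% confidence level (z = 1.96); the function name and the sample sizes are illustrative and do not come from any particular platform.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion p estimated from n sampled actors.

    Assumes a simple random sample; z = 1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

if __name__ == "__main__":
    # Illustrative sample sizes: each step quadruples the amount of data scanned.
    for n in (10_000, 40_000, 160_000):
        print(n, margin_of_error(0.5, n))
```

Under these assumptions, every fourfold increase in sample size only halves the margin of error, which is why a modest sample is often accurate enough while a question is still being refined.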
Believe it or not, sampling isn’t always appropriate. Certain data isn’t going to be evenly distributed among the shards. Some events are very rare and unlikely to show up in a sampled result. Sometimes you’re looking for a tiny set of events but aren’t sure when they occurred. Sometimes the selection filters leave too few events to sample accurately.
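The rare-event problem can be shown with a short calculation. This is a sketch that assumes each actor is included in the sample independently with the same probability; the function name and the numbers are hypothetical.

```python
def prob_event_missed(actors_with_event: int, rate: float) -> float:
    """Probability that a sample contains none of the actors who produced a rare event.

    Assumes each actor is sampled independently with probability `rate`.
    """
    return (1.0 - rate) ** actors_with_event

if __name__ == "__main__":
    # Five actors triggered the rare event; the query samples 1% of actors.
    print(f"{prob_event_missed(5, 0.01):.3f}")  # prints 0.951
```

With only five affected actors and a 1% sample, the event is absent from the sampled result about 95% of the time, so a query like this is better run unsampled.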
For behavioral analytics of event data, there are right and wrong ways to sample. It's tempting to sample at data collection points. There are potential upsides: the data shrinks and gets easier to ingest, less data needs to be stored, and it can be processed as-is without further reduction. But for behavioral analytics, this approach is tricky and limited: dropping individual events at collection breaks up each actor's sequence of events, and events that were never stored can't be recovered later.
For behavioral analytics of event data, the correct approach is to record all the events and make them part of the dataset. Sampling needs to be based on all the events for a representative set of actors from the population, and it needs to happen at query time, not during ingest. This approach moves the burden of correct sampling from the end user onto the analytics platform. If the answer is so clear-cut, why isn't everybody doing it the same way?
The answer is implementation: A solution focused on event data can organize and manage data in ways that don’t make sense for a general-purpose analytics solution. That organization brings the power to store and query huge volumes of event data efficiently.
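As an illustration of sampling by actor at query time, here is a minimal sketch. It is not Scuba's implementation: the hash-based selection, the function names, and the event layout (dictionaries with "actor" and "action" keys) are all assumptions made for the example.

```python
import hashlib

def actor_in_sample(actor_id: str, rate: float) -> bool:
    """Return True for roughly `rate` of all actor IDs, and always for the same ones."""
    digest = hashlib.sha256(actor_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < rate * 2**64

def sample_events(events, rate):
    """Query-time sampling: keep every stored event of each sampled actor."""
    return [e for e in events if actor_in_sample(e["actor"], rate)]

def estimate_actor_count(events, rate, predicate):
    """Estimate how many actors in the full dataset have at least one matching event."""
    matched = {e["actor"] for e in sample_events(events, rate) if predicate(e)}
    return len(matched) / rate  # scale the sampled count back up

if __name__ == "__main__":
    # Hypothetical raw events: 10,000 actors, every fourth one also makes a purchase.
    events = []
    for i in range(10_000):
        actor = f"user-{i}"
        events.append({"actor": actor, "action": "view"})
        if i % 4 == 0:
            events.append({"actor": actor, "action": "purchase"})
    estimate = estimate_actor_count(events, 0.1, lambda e: e["action"] == "purchase")
    print(f"estimated purchasers: {estimate:.0f} (true value: 2500)")
```

Because selection depends only on the actor ID, a sampled actor's events are kept together, so sequences and funnels stay intact. And because all raw events remain stored, the same query can be rerun with a rate of 1.0 for an unsampled result.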
As a real-time customer intelligence platform, Scuba Analytics is a purpose-built solution for behavioral analytics of event data at massive scale. With Scuba's no-code querying and real-time analytics, teams across any company can not only sample effectively but also glean essential insights.
Explore our demo today or schedule a call with a Scuba expert.