During our hangout on marketing data lakes, we received a lot of questions about agencies’ use of data lakes. We asked Dustin Dewberry, one of our panelists and VP of data science & advanced analytics at Digitas, for his take on how agencies can leverage data lakes to better serve their clients.
RampUp: Are a lot of brands asking agencies to create data lakes for them or manage them in some way? What sort of data lake opportunities are coming agencies’ way?
Dustin: I think the role of a good agency is to listen to the client’s need, challenge, and/or business problem in order to help them find the right solution. Data collection, storage, accessibility, and democratization are long-standing problems where a data lake could be the right solution. As marketing continues to evolve, demand for data lakes, whether as a standalone solution or within a CDP or larger tech stack, is only going to grow.
RampUp: How do agencies activate data lakes?
Dustin: Assuming this question is about how agencies activate media through a data lake, the methods available are the same as the ones brands use:
- Through a proprietary ID service
- Through an onboarding partner
- Through a direct client one-to-one match with a DSP, DMP, or other equivalent platform
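The direct one-to-one match in the last option typically works by joining hashed identifiers so that raw PII never changes hands. As a minimal sketch only, assuming SHA-256-hashed, normalized emails as the match key (the records and field choices here are hypothetical, not any specific DSP's or DMP's spec):

```python
import hashlib

def sha256_email(email: str) -> str:
    """Normalize and hash an email so raw PII never leaves either party."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

# Hypothetical client CRM records and a platform's hashed ID list.
client_emails = ["Ada@example.com", "grace@example.com", "alan@example.com"]
platform_hashes = {sha256_email(e) for e in ["ada@example.com", "linus@example.com"]}

# The one-to-one match: intersect the client's hashed IDs with the platform's.
matched = [e for e in client_emails if sha256_email(e) in platform_hashes]
```

Only records present on both sides match; normalization (trim, lowercase) before hashing is what makes `Ada@example.com` and `ada@example.com` line up.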
RampUp: Does that data cross clients?
Dustin: NEVER. Client data never crosses between clients. If your agency has a data product whose assets are stored in a data lake, each client who subscribes to that data asset can access it independently.
RampUp: How is the data mined?
Dustin: This is largely dependent on the source of the data and the data itself, in addition to the primary skill set of the data mining administrator and the specific task at hand.
For example, a client may have first-party data from owned properties. This data can stream into the data lake through a tag placed on the owned property. Depending on the team’s skill set and the primary reason the data is in the lake, it could be accessed through a SQL-like interface or read into memory (or landed), where languages like R and Python can be used. The goal of the task should determine whether a single tool or approach, or a combination of several, is required. The data lake should facilitate any of these approaches and tools.
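The two access patterns above can be sketched side by side. This is a minimal illustration only: sqlite3 stands in for the lake’s SQL-like interface, and the clickstream events, table name, and fields are assumptions, not a real tag’s schema.

```python
import sqlite3

# Hypothetical first-party events collected by a tag on an owned property.
events = [
    ("u1", "page_view"), ("u1", "add_to_cart"),
    ("u2", "page_view"), ("u3", "page_view"),
]

# Pattern 1 -- SQL-like access: query the data where it sits.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, action TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", events)
visitors = conn.execute(
    "SELECT COUNT(DISTINCT user_id) FROM events WHERE action = 'page_view'"
).fetchone()[0]

# Pattern 2 -- in-memory access: land the same data into native
# structures where R or Python modeling code can take over.
cart_users = {user for user, action in events if action == "add_to_cart"}
```

A reporting task might stop at the SQL query, while a modeling task would land the data in memory; the point is that the lake should serve both without forcing one approach.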
RampUp: If the data used for marketing is uncovered and segmented during a campaign, does the agency ultimately own that contact or does the brand?
Dustin: Ownership can be determined a number of ways. Here are five options:
- The brand mines the data into the lake itself and owns the contact.
- The agency handles the contact and mines the data directly into the lake, and the brand retains ownership.
- The brand owns and/or builds their data lake and grants access to applicable parties like an agency, meaning the brand retains ownership of the data.
- The brand owns their data lake and hires an outside party to build it, so the brand takes over ownership and administration once the data asset is complete.
- The brand hires an outside party to build and administer the data lake on their behalf. In this scenario, there would be an ongoing fee to the outside party for the ownership, management, and administration of the data asset.
RampUp: How will GDPR impact how agencies manage and work with data lakes?
Dustin: That largely depends on the data source itself—the data lake has no bearing on that determination. Ultimately the agility and flexibility of the data lake will help you meet compliance requirements. Having a sound data strategy implemented both at the structured and unstructured data layer will help you pivot more efficiently to meet data privacy imperatives.
RampUp: What advice do you have for agencies looking to build data lakes, or for brands asking this of their agencies?
Dustin: A data lake can be the right tool for the right job, and a valuable asset to an agency or brand, either as a standalone solution or within a larger stack. Do your due diligence to determine whether a data lake is the right solution for you. If you move to the build phase, ensure that the principles behind the data needs are carried throughout the implementation and that the result is scalable.