Utilizing Adobe Experience Platform’s (AEP) Query Service to Analyze & Extract Data Externally
Adobe’s Experience Platform (AEP) is a powerful solution set for enabling cross-channel identity graphing, personalized channel activations, and customer insights. By integrating data from various sources, AEP provides a comprehensive view of customer journeys while also enabling personalized and targeted marketing efforts based on real-time profiles, audiences, and triggers.
For those needing to get more hands on with the data stored and created in AEP, there is the Query Service feature, which allows users to view and extract data for deeper analysis through structured query language (SQL) but up to some limits. AEP offers Platform Intelligence add-ons that enables more robust tools, like Jupyter Notebook and Data Science Workspace, directly within AEP’s UI but not all users have this add on and would like to perform some deeper level of analysis beyond what they can do in the Query Service UI.
This article explores using AEP’s Query Service API through the python AEPP library to access data externally within your local Jupyter Notebook instance, enabling you to analyze and visualize your data efficiently and well as another provide an easy path to export data from AEP.
A Quick Review of Adobe Experience Platform (AEP)
Adobe Experience Platform (AEP) takes a philosophical approach to managing unknown and unknown customer data in a centralized and structured approach to be able to combine customer profiles and interactions from various channels. It enables businesses to create a unified customer view by integrating data from web interactions, mobile apps, CRM systems, retail, loyalty programs, call centers, and sales channels. With AEP, organizations can perform advanced cross-channel analytics, develop personalized experiences that span customer engagement channels, and drive meaningful customer engagements. The platform’s ability to process large volumes of data in real-time makes it a valuable tool for marketers, data scientists, and analysts.
AEP itself is an integrated platform that provides shared services to platform-native applications like Real-Time CDP, Customer Journey Analytics, and Journey Optimization as well as provides intelligence services to each of these to assist with identity stitching, attribution, segmentation, audience building, and activations to Adobe and non-Adobe destinations.
About AEP Query Service
AEP’s Query Service is a feature within Adobe Experience Platform that allows users to run SQL queries against their integrated customer, channel, and event data stored within AEP datasets. This service facilitates data exploration and analysis, making it easier to gain insights leveraging SQL’s familiarity and flexibility to filter, join, and aggregate data, enabling complex analyses without needing to move data out of the platform. The Query Service is essential for creating custom reports, dashboards, and advanced data visualizations as well as monitoring data quality.
Query Service is available to all AEP customers to run and schedule complex SQL queries within in the platform. The query service UI will limits results to 100 rows unless you run (or schedule) the query in a mode called “CTAS” and have it write to a dataset within AEP. Of course, you then next wonder, how can I view this result set from the dataset if I’m limited to 100 rows in the UI.
Enter Data Science Workspace & Jupyter Notebooks
Further unlocking the power of Query Services is AEP’s Platform Intelligence add-on which includes the ability to utilize Jupyter Notebooks directly within AEP. This provides even more robust capabilities to perform analysis using libraries and languages that are favorites of data analysts and scientists, like python, R, pandas, Spark, and so on.
But for those AEP users without the these add ons, you still have the power to leverage Jupyter Notebook and Interactive Queries externally through AEP’s APIs.
How to Unlock Interactive Querying Externally to Analyze, Visualize, and Export Data
Ok, so you don’t have the Platform Intelligence add-on but still want to unleash the potential of data exploration using Jupyter Notebook? Meet the AEPP Python library that when combined with PostgreSQL libraries, can open a new world of data access for you.
To get started with using AEP’s Interactive Query Service in Jupyter Notebook, you’ll need to set up the AEPP Python library, which provides a simple interface for interacting with Adobe Experience Platform APIs. Follow these steps to set up your local environment:
(note: while we’re illustrating the process to use AEPP with Jupyter externally in this example, you can also just work with the results directly in Python without Jupyter, if preferred. Jupyter does provide a very nice interface for experimenting with queries quickly and manipulating result sets within dataframes).
Step 1: Install Required Libraries
First, ensure you have Jupyter Notebook and the necessary Python dependencies installed. You can install Jupyter Notebook using pip if you haven’t already:
pip install jupyterlab
Next, install the AEPP Python library:
pip install aepp
(More information about installing AEPP and it’s dependencies are available here).
Step 2: Launch Jupyter & Create A Notebook
From your terminal or command line run the following to launch a new Jupyter session in a browser:
jupyter lab
Then open a new notebook with the Python 3 (ipykernel) option
Step 3: Authenticate with Adobe Experience Platform
To interact with AEP, you’ll need to authenticate using your Adobe Developer API credentials. This will need to be setup by an Admin to enable access to the API with Oauth credentials. Once this is available, create a configuration file (config.json) with your credentials in a directory where you plan to store your Jyputer Notebook files.
{
"org_id": "your_org_id",
“client_id”: “your client_id”,
"secret": "your API secret key",
“Sandbox”: {whichever sandbox you’re working in}
"scopes": "{all or needed scopes enabled for the API}"
}
(These credentials are found in Adobe Developer Console > Projects > {your API enabled project} > Credentials > Credentials details)
In your Jupyter Notebook, load the configuration and authenticate:
import aepp
from aepp import queryservice
aepp.importConfigFile('config.json')
qs = queryservice.QueryService()
conn = qs.connection()
Verify your connection is established
conn
Next setup for Interactive Querying
intQuery = queryservice.InteractiveQuery(conn)
Step 4: Query Data Using AEP Query Service
With authentication in place, you can now run queries against your AEP data. Here’s an example of how to execute a SQL query and load the results into a Pandas DataFrame:
import pandas as pd
query = "SELECT {some column} FROM {some_dataset_id} LIMIT 100"
result = intQuery.query(query)
# Load results into a DataFrame
data = pd.DataFrame(result)
Step 5: View Your Result Data
data
Step 6: Export Result Data to CSV
data.to_csv('{file_name}.csv')
Step 7 (Optional): Analyze and Visualize Data
Once you have your data in a DataFrame, you can leverage various Python libraries to analyze and visualize it. For example, you can use Matplotlib to create a simple plot:
import matplotlib.pyplot as plt
# Plot a simple histogram of a column
plt.hist(data['your_column_name'])
plt.title('Distribution of Your Column')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()
Congrats! You’re now working with endless (well, up to 50,000 result rows via the API) of data from AEP.
Wrap Up/Final Thoughts
Sometimes you just need to get your hands on data directly to analyze it or verify that it’s structured correctly. Other times, you may need a quick way to export data from AEP. AEP’s built-in Query Service UI is great for spot checking up to 100 results but when you need to see and manipulate larger sets of results and don’t have access to Jupyter Notebooks with AEP, the AEPP library can be a game changer in conjunction with the Query Services API.