International Marketing Company – Data Transformation
Transforming foreign data for integration into a larger universe.
Service: Data Engineering
Industry: Marketing
Tech Stack: AWS S3, Apache Spark, Apache Hive, AWS Lambda, PySpark, Jupyter
Profile & Challenge
Our client, a major marketing company, sought to incorporate Japan’s census data into its larger data universe but faced multiple integration challenges. The data set used non-Latin characters (in this case, Kanji) and required a contextual understanding of Japan’s administrative operations in order to match it with the client’s existing data tables. The client turned to GAP for help with this big data transformation project.
SOLUTION & OUTCOME
Existing translation programs could not translate the data set effectively, so GAP found a one-to-one translation service to perform the translation. After translation, GAP enhanced and standardized the data for use in modeling efforts for audience creation. The client received a data set that could be readily used and integrated with its larger data universe. Moving forward, the client can also easily convert future census data from Japan, as well as data in other languages written in non-Latin scripts, based on GAP’s solution. As a bonus, GAP’s data engineering team created a tool that updates all of the client’s data models faster, improving the regular data refresh process.
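To illustrate what a one-to-one translation step might look like, here is a minimal sketch in plain Python. It is not GAP’s actual implementation, which the case study does not describe. The lookup table, the column name `prefecture`, and the helper names `normalize`, `translate_region`, and `translate_rows` are all assumptions made for this example. The idea is to normalize each Kanji value, look it up in a fixed mapping, and set aside anything that does not match exactly so that it can be reviewed by someone with the necessary context.

```python
import unicodedata

# Hypothetical lookup table for illustration only: four real prefecture names
# mapped to their common English names. A real project would load a full,
# verified mapping from a reference source.
KANJI_TO_ENGLISH = {
    "東京都": "Tokyo",
    "大阪府": "Osaka",
    "京都府": "Kyoto",
    "北海道": "Hokkaido",
}


def normalize(text):
    """Apply NFKC normalization and trim whitespace so variant forms compare equal."""
    return unicodedata.normalize("NFKC", text).strip()


def translate_region(name, table=KANJI_TO_ENGLISH):
    """Return the mapped English name, or None when there is no exact match."""
    return table.get(normalize(name))


def translate_rows(rows, column="prefecture"):
    """Split rows into translated rows and rows that need manual review.

    Each translated row keeps its original fields and gains a
    '<column>_en' field holding the English name.
    """
    translated, unmatched = [], []
    for row in rows:
        english = translate_region(row[column])
        if english is None:
            unmatched.append(row)
        else:
            translated.append({**row, column + "_en": english})
    return translated, unmatched
```

Because the mapping is exact rather than fuzzy, a value either translates to a single known name or is flagged for review. In a Spark-based pipeline the same logic would more likely be expressed as a join against a reference table.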