Leading Advertising Agency
jidabyte helps implement multimodal website classification

Overview
A leading advertising agency sought to develop a tool to classify websites as MFA (Made for Advertising) or Non-MFA. The goal was to use website features such as screenshots, text, HTML code, and metadata to determine whether a website was designed primarily to serve ads or had another primary purpose.
Challenge
The agency faced several key challenges while implementing this classification system:
- Multimodal Data Integration – Needed to combine diverse data types such as text, images, and HTML to make an accurate classification.
- Effective Classification – Required a method that could reliably classify websites based on various features, with high accuracy in distinguishing MFA from Non-MFA websites.
- Scalability and Performance – Needed to process a large number of websites quickly without sacrificing accuracy.
Solution
The following solutions were proposed and implemented to address the challenges:
- Multimodal Retrieval-Augmented Generation (RAG) – Combined multiple data sources, including website screenshots, text, and metadata, in a RAG pipeline to generate predictions.
- Traditional Machine Learning Model – Used XGBoost alongside the RAG pipeline to analyze features and predict whether a website was MFA or Non-MFA.
- AI Integration with Claude 3.5 Sonnet – Leveraged advanced AI models such as Claude 3.5 Sonnet to improve feature extraction and classification from various website components.
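To make the feature-analysis step more concrete, here is a minimal sketch of how a page's HTML might be turned into numeric features for a gradient-boosted model such as XGBoost. It is an illustration under stated assumptions, not the agency's actual pipeline: the case study does not list the features used, so the ad hints, the feature names, and the `extract_features` and `to_vector` helpers are all hypothetical. Only the Python standard library is used.

```python
from html.parser import HTMLParser

# Hypothetical substrings that suggest an ad container; not from the case study.
AD_HINTS = ("adsbygoogle", "ad-slot", "advert", "sponsored")

# Fixed feature order so every page maps to a vector with the same layout.
FEATURE_ORDER = ("iframes", "scripts", "ad_elements", "text_chars", "ad_density")


class _FeatureParser(HTMLParser):
    """Counts a few structural signals while streaming through the HTML."""

    def __init__(self):
        super().__init__()
        self.iframes = 0
        self.scripts = 0
        self.ad_elements = 0
        self.text_chars = 0
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.iframes += 1
        if tag == "script":
            self.scripts += 1
        if tag in ("script", "style"):
            self._skip_depth += 1
        # Look for ad hints in the class and id attributes only.
        labels = " ".join(v or "" for k, v in attrs if k in ("class", "id")).lower()
        if any(hint in labels for hint in AD_HINTS):
            self.ad_elements += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count visible text only; script and style bodies are ignored.
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def extract_features(html: str) -> dict:
    """Return a small dictionary of structural features for one page."""
    parser = _FeatureParser()
    parser.feed(html)
    parser.close()
    ad_signals = parser.iframes + parser.ad_elements
    return {
        "iframes": parser.iframes,
        "scripts": parser.scripts,
        "ad_elements": parser.ad_elements,
        "text_chars": parser.text_chars,
        # Ad signals per 1,000 characters of visible text.
        "ad_density": 1000.0 * ad_signals / max(parser.text_chars, 1),
    }


def to_vector(features: dict) -> list:
    """Flatten the feature dictionary into a list in FEATURE_ORDER."""
    return [float(features[name]) for name in FEATURE_ORDER]


SAMPLE_HTML = """
<html><body>
<div class="adsbygoogle"></div>
<iframe src="https://example.com/banner"></iframe>
<script>var x = 1;</script>
<p>Hello world</p>
<div id="sponsored-links"></div>
</body></html>
"""

sample_features = extract_features(SAMPLE_HTML)
sample_vector = to_vector(sample_features)
```

Rows built this way could be stacked into a feature matrix and passed to a classifier such as `xgboost.XGBClassifier`, possibly alongside signals derived from screenshots or model-generated descriptions. Which features the production system relied on is not stated in the case study.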
Outcome
The proposed solution resulted in the following key outcomes for the agency:
- Accurate Website Classification – The system was able to reliably classify websites as MFA or Non-MFA with high accuracy, improving ad targeting.
- Improved Data Analysis – By combining multimodal features and machine learning, the agency was able to better understand website structures and marketing strategies.
- Scalable and Efficient System – The AI-powered tool processed large volumes of websites quickly, enabling fast decision-making and data-driven strategies for ad placement.