XGBoost
Feature Store
Streaming
CRM
Real-time Lead Scoring for CRM
Gradient-boosted models score leads in real time from CRM events and marketing touchpoints, delivering an 18% uplift in sales-qualified leads (SQLs).
Overview
Built an end-to-end, real-time lead scoring system that processes CRM events and marketing touchpoints to predict lead quality. The system uses XGBoost models fed by feature engineering pipelines that capture behavioral patterns, engagement metrics, and demographic signals. It is deployed on a streaming architecture with sub-100ms inference latency.
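The behavioral and engagement signals described above can be thought of as rolling-window counts over a lead's event history. The following is a minimal sketch of that idea, not the project's actual implementation: the event schema (dicts with `type` and `timestamp` keys) and the function name `count_events` are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def count_events(events, event_type, window_days, now=None):
    """Count events of one type inside a trailing time window.

    `events` is assumed to be a list of dicts with hypothetical keys
    'type' (str) and 'timestamp' (timezone-aware datetime).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=window_days)
    return sum(
        1
        for e in events
        if e["type"] == event_type and cutoff <= e["timestamp"] <= now
    )
```

A feature such as `page_views_last_7d` would then be `count_events(events, "page_view", 7)`, where `"page_view"` is a hypothetical event type name.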
Code Highlight
Real-time Feature Engineering Pipeline
import pandas as pd
from kafka import KafkaConsumer
import xgboost as xgb
from redis import Redis


class LeadScoringPipeline:
    # Helper methods (get_pageviews, get_email_engagement,
    # company_size_mapping, calculate_behavior_score) are not shown here.

    def __init__(self):
        # Placeholder: a fitted model must be loaded (e.g. with load_model)
        # before predict_proba can be called.
        self.model = xgb.XGBClassifier()
        self.redis_client = Redis(host='localhost', port=6379)
        self.consumer = KafkaConsumer('lead_events')

    def engineer_features(self, lead_data):
        """Extract behavioral and engagement features"""
        features = {
            'page_views_last_7d': self.get_pageviews(lead_data['email']),
            'email_opens_last_30d': self.get_email_engagement(lead_data['email']),
            'company_size_score': self.company_size_mapping(lead_data['company_size']),
            'behavioral_score': self.calculate_behavior_score(lead_data),
        }
        return pd.DataFrame([features])

    def predict_lead_score(self, lead_data):
        features = self.engineer_features(lead_data)
        # predict_proba returns a NumPy float; convert it to a plain Python
        # float so that Redis can serialize the value.
        score = float(self.model.predict_proba(features)[0][1])
        # Cache result for 1 hour
        self.redis_client.setex(
            f"lead_score:{lead_data['email']}",
            3600,
            score,
        )
        return score
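The pipeline above writes each score to Redis with a one-hour TTL but does not show the read side. One plausible way to use that cache is a cache-aside lookup: return the cached score if one exists, otherwise compute and store it. The sketch below is an assumption about how that could look; `get_or_compute_score` and `InMemoryCache` are hypothetical names, and the in-memory class only mimics the `get` and `setex` methods of a Redis client so the example runs without a server.

```python
def get_or_compute_score(cache, email, compute_score, ttl_seconds=3600):
    """Return the cached score for a lead, computing and caching on a miss.

    `cache` is any object exposing Redis-style get(key) and
    setex(key, ttl, value); `compute_score` is a zero-argument callable.
    """
    key = f"lead_score:{email}"
    cached = cache.get(key)
    if cached is not None:
        # Redis returns bytes such as b"0.75"; float() parses them directly.
        return float(cached)
    score = float(compute_score())
    cache.setex(key, ttl_seconds, score)
    return score


class InMemoryCache:
    """Hypothetical stand-in for a Redis client; ignores the TTL."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def setex(self, key, ttl, value):
        # Store bytes, as a real Redis client would return on read.
        self._data[key] = str(value).encode()
```

With the pipeline above, the call might look like `get_or_compute_score(pipeline.redis_client, lead["email"], lambda: pipeline.predict_lead_score(lead))`. In that arrangement the score is written twice on a miss, so one of the two `setex` calls could be dropped.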
Key Results
18% increase in SQL (sales-qualified lead) conversion rate
Sub-100ms inference latency
99.9% uptime in production
Reduced manual lead qualification by 60%
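The source does not say how the latency figure was measured or which percentile it refers to. As an illustration only, a tail-latency check for any scoring callable could be sketched as follows; the function name `latency_percentile_ms` and the choice of the 99th percentile are assumptions.

```python
import statistics
import time


def latency_percentile_ms(fn, n_calls=200, percentile=99):
    """Time repeated calls to fn and return a latency percentile in ms.

    `percentile` must be an integer between 1 and 99.
    """
    samples = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=100) returns 99 cut points; index p-1 is the p-th percentile.
    return statistics.quantiles(samples, n=100)[percentile - 1]
```

A check against the stated target might then read `latency_percentile_ms(lambda: pipeline.predict_lead_score(lead)) < 100`, where `pipeline` and `lead` stand for a scoring pipeline instance and a sample lead record.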
Technologies Used
Python
XGBoost
Apache Kafka
Redis
PostgreSQL
FastAPI
Project Category
AI Automation