    XGBoost
    Feature Store
    Streaming
    CRM

    Real-time Lead Scoring for CRM

    Gradient-boosted models that score leads in real time from CRM events and marketing touchpoints, lifting sales-qualified leads (SQLs) by 18%.

    Overview

    Built an end-to-end, real-time lead scoring system that processes CRM events and marketing touchpoints to predict lead quality. The system uses XGBoost models fed by feature engineering pipelines that capture behavioral patterns, engagement metrics, and demographic signals. It is deployed on a streaming architecture and serves predictions with sub-100ms inference latency.
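As a rough illustration of how behavioral features can be derived from timestamped events, the sketch below counts events inside a trailing time window. The event schema (`type`, `timestamp`) and the helper names `count_recent_events` and `build_feature_row` are assumptions made for this example; only the feature names come from the project code.

```python
from datetime import datetime, timedelta


def count_recent_events(events, event_type, now, window_days):
    """Count events of one type whose timestamp falls in the trailing window."""
    cutoff = now - timedelta(days=window_days)
    return sum(
        1
        for e in events
        if e["type"] == event_type and cutoff <= e["timestamp"] <= now
    )


def build_feature_row(events, now):
    """Turn one lead's raw event history into a flat feature dict."""
    return {
        "page_views_last_7d": count_recent_events(events, "page_view", now, 7),
        "email_opens_last_30d": count_recent_events(events, "email_open", now, 30),
    }


# Hypothetical event history for a single lead (illustrative values only).
now = datetime(2024, 1, 31, 12, 0)
events = [
    {"type": "page_view", "timestamp": now - timedelta(days=1)},
    {"type": "page_view", "timestamp": now - timedelta(days=10)},
    {"type": "email_open", "timestamp": now - timedelta(days=20)},
    {"type": "email_open", "timestamp": now - timedelta(days=45)},
]

features = build_feature_row(events, now)
print(features)
```

In a streaming setup these counts would more likely be maintained incrementally in a feature store than recomputed from full history on every request; the full scan here just keeps the windowing logic easy to read.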

    Code Highlight

    Real-time Feature Engineering Pipeline
    import pandas as pd
    import xgboost as xgb
    from kafka import KafkaConsumer
    from redis import Redis


    class LeadScoringPipeline:
        def __init__(self):
            # A trained model must be loaded into self.model before scoring.
            self.model = xgb.XGBClassifier()
            self.redis_client = Redis(host='localhost', port=6379)
            self.consumer = KafkaConsumer('lead_events')

        def engineer_features(self, lead_data):
            """Extract behavioral and engagement features."""
            # The helper methods called below are not shown in this excerpt.
            features = {
                'page_views_last_7d': self.get_pageviews(lead_data['email']),
                'email_opens_last_30d': self.get_email_engagement(lead_data['email']),
                'company_size_score': self.company_size_mapping(lead_data['company_size']),
                'behavioral_score': self.calculate_behavior_score(lead_data),
            }
            return pd.DataFrame([features])

        def predict_lead_score(self, lead_data):
            features = self.engineer_features(lead_data)
            # Probability of the positive class, cast from a NumPy scalar to a
            # plain float so that Redis accepts it as a value.
            score = float(self.model.predict_proba(features)[0][1])
            # Cache result for 1 hour
            self.redis_client.setex(
                f"lead_score:{lead_data['email']}",
                3600,
                score,
            )
            return score
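The excerpt above writes each score to Redis with a one-hour expiry but does not show the read path. A minimal sketch of a cache-aside lookup might look like the following. `InMemoryTTLCache` is a hypothetical stand-in that mimics only the `get` and `setex` calls of a Redis client so the example runs without a server, and `get_or_compute_score` and `fake_model_score` are illustrative names, not part of the project.

```python
import time


class InMemoryTTLCache:
    """Hypothetical stand-in for a Redis client, exposing only get/setex."""

    def __init__(self):
        self._store = {}

    def setex(self, name, ttl_seconds, value):
        # Store the value together with its absolute expiry time.
        self._store[name] = (time.monotonic() + ttl_seconds, value)

    def get(self, name):
        entry = self._store.get(name)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[name]
            return None
        return value


def get_or_compute_score(cache, email, compute_score, ttl_seconds=3600):
    """Return a cached score when present; otherwise compute and cache it."""
    key = f"lead_score:{email}"
    cached = cache.get(key)
    if cached is not None:
        # A real Redis client returns bytes; float() accepts those as well.
        return float(cached)
    score = float(compute_score(email))
    cache.setex(key, ttl_seconds, score)
    return score


calls = []


def fake_model_score(email):
    # Placeholder for the model call; records each invocation.
    calls.append(email)
    return 0.5


cache = InMemoryTTLCache()
first = get_or_compute_score(cache, "lead@example.com", fake_model_score)
second = get_or_compute_score(cache, "lead@example.com", fake_model_score)
print(first, second, len(calls))
```

The second lookup is served from the cache, so the model is called only once. The trade-off is staleness: a lead whose behavior changes within the hour keeps the old score until the key expires.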

    Key Results

    18% increase in SQL conversion rate
    Sub-100ms inference latency
    99.9% uptime in production
    Reduced manual lead qualification by 60%
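A latency figure such as the one above is usually reported as a percentile over many requests rather than as an average. The sketch below shows one way to measure it, assuming a nearest-rank percentile and a placeholder scoring function (`fake_score`); it does not reproduce the project's actual benchmark, and the payloads are synthetic.

```python
import math
import time


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list, for pct in (0, 100]."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def measure_latency_ms(fn, payloads):
    """Call fn on each payload and return the sorted latencies in milliseconds."""
    latencies = []
    for payload in payloads:
        start = time.perf_counter()
        fn(payload)
        latencies.append((time.perf_counter() - start) * 1000)
    return sorted(latencies)


def fake_score(payload):
    # Placeholder workload standing in for feature engineering plus inference.
    return sum(payload.values()) / 100


# Synthetic payloads, for illustration only.
payloads = [
    {"page_views_last_7d": i, "email_opens_last_30d": i % 5} for i in range(200)
]
latencies = measure_latency_ms(fake_score, payloads)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
print(f"p50={p50:.4f} ms, p99={p99:.4f} ms")
```

Checking p99 against the 100 ms budget, rather than the mean, shows whether the slowest requests also meet the target.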

    Technologies Used

    Python
    XGBoost
    Apache Kafka
    Redis
    PostgreSQL
    FastAPI

    Project Category

    AI Automation

    Repository

    View on GitHub