Querying Billions of Rows of LinkedIn Data in Seconds
Over $600K in revenue in 6 months
The Problem
Blueprint is an outbound agency that sends hundreds of thousands of emails per month on behalf of its clients. The agency's entire value proposition rests on finding the right people at the right companies. To do that well, they need to efficiently query massive datasets of professional profiles and company information to identify ideal prospects for each campaign.
The challenge was scale. Blueprint had access to billions of rows of LinkedIn data, but no efficient way to search it. Finding companies and individuals matching specific ideal customer profiles meant writing complex SQL queries by hand, a process that was slow, error-prone, and required technical expertise that most of the team did not have. Campaign setup took far longer than it should have, and the team was leaving money on the table because it could not execute engagements fast enough.
The CEO knew that whoever solved the data access problem would unlock a step change in the agency's capacity. The agency needed a way for its team to describe what they were looking for in plain English and get accurate, queryable results back in seconds, not hours.
The Solution
BetterBrain built ML models that translate natural language questions into executable SQL queries against Blueprint's massive LinkedIn dataset. When a team member describes the ideal customer profile for a campaign, the system writes the SQL, executes it against billions of rows, and returns the matching prospects. The models learn over time from previous queries, getting better at understanding the agency's specific terminology and search patterns.
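The loop described above (question in, SQL generated, SQL executed, successful query remembered) can be sketched in a few lines. This is only an illustration under stated assumptions, not BetterBrain's implementation: `generate_sql` is a hypothetical stand-in that returns canned SQL where a real system would call a model, the `companies` table and its rows are made up, and an in-memory SQLite database stands in for the actual data warehouse.

```python
import sqlite3


def generate_sql(question, examples):
    """Hypothetical stand-in for the model call.

    A real system would send the question and prior examples to a model;
    here we return canned SQL so the sketch runs offline.
    """
    return (
        "SELECT name FROM companies "
        "WHERE industry = 'SaaS' AND employees BETWEEN 50 AND 200 "
        "ORDER BY name"
    )


def answer(conn, question, learned):
    """Generate SQL for a question, run it, and remember it if it succeeds."""
    sql = generate_sql(question, learned)
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return sql, None  # failed queries are not stored
    learned.append((question, sql))  # capture the successful query for reuse
    return sql, rows


# Made-up sample data standing in for the real dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT, industry TEXT, employees INTEGER)")
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?)",
    [("Acme Cloud", "SaaS", 120), ("Bolt Pay", "Fintech", 80), ("Tiny Apps", "SaaS", 12)],
)

learned = []
sql, rows = answer(conn, "SaaS companies with 50 to 200 employees", learned)
print(rows)  # [('Acme Cloud',)]
```

Storing only queries that executed without error is one simple reading of "learning from previous queries"; a production system would likely also want a signal that the results were actually useful.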
The system uses retrieval-augmented generation (RAG), combining BM25 keyword search with vector embeddings to retrieve relevant past queries and schema context before generating new SQL. OpenAI's gpt-4o handles query generation because it can reason about complex filter combinations and nested conditions. A self-learning database captures every successful query, so the system builds up institutional knowledge about what makes a good prospect search for different industries, company sizes, and use cases.
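To make the retrieval step concrete, here is a minimal sketch of hybrid retrieval over past queries. Several parts are assumptions made for illustration: the stored question and SQL pairs are invented, `embed` is a toy character-trigram hashing function standing in for a real embedding model, and the two rankings are merged with reciprocal rank fusion, which is one common way to combine keyword and vector results and not necessarily the method used here. The retrieved examples are then placed in the prompt that would be sent to the model.

```python
import math
import zlib
from collections import Counter

# Hypothetical store of past successful (question, SQL) pairs.
PAST_QUERIES = [
    ("saas companies in the us with 50 to 200 employees",
     "SELECT name FROM companies WHERE industry = 'SaaS' AND employees BETWEEN 50 AND 200"),
    ("vp of sales at fintech companies",
     "SELECT p.full_name FROM people p JOIN companies c ON c.id = p.company_id "
     "WHERE p.title LIKE '%VP%Sales%' AND c.industry = 'Fintech'"),
    ("marketing directors in germany",
     "SELECT full_name FROM people WHERE title LIKE '%Marketing Director%' AND country = 'DE'"),
]


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term in tf:
                idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
                norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
                score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def embed(text, dim=512):
    """Toy embedding: hashed character trigrams (stand-in for a real model)."""
    vec = [0.0] * dim
    t = text.lower()
    for i in range(len(t) - 2):
        vec[zlib.crc32(t[i:i + 3].encode()) % dim] += 1.0
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def ranks(scores):
    """Map document index to its 1-based rank (best score first)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {doc: r for r, doc in enumerate(order, start=1)}


def retrieve(question, top_k=2, rrf_k=60):
    """Fuse BM25 and vector rankings with reciprocal rank fusion."""
    docs = [q for q, _ in PAST_QUERIES]
    kw = ranks(bm25_scores(question, docs))
    qv = embed(question)
    vec = ranks([cosine(qv, embed(d)) for d in docs])
    fused = {i: 1 / (rrf_k + kw[i]) + 1 / (rrf_k + vec[i]) for i in range(len(docs))}
    best = sorted(fused, key=fused.get, reverse=True)[:top_k]
    return [PAST_QUERIES[i] for i in best]


def build_prompt(question, schema):
    """Assemble schema context and retrieved examples into a generation prompt."""
    examples = "\n".join(f"-- {q}\n{sql}" for q, sql in retrieve(question))
    return f"Schema:\n{schema}\n\nSimilar past queries:\n{examples}\n\nQuestion: {question}\nSQL:"


print(build_prompt("fintech vp sales", "people(full_name, title, company_id, country)"))
```

Keyword search catches exact terms such as job titles and industry names, while embeddings help when the wording differs from earlier requests, which is why the two are often used together.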
What used to require a technically skilled team member spending hours writing and debugging SQL now takes anyone on the team a few seconds. The system handles the complexity of querying billions of records while presenting a simple conversational interface to the user.
The Numbers
- Over $600K in revenue generated in 6 months
- Campaign setup and prospect research reduced from hours to seconds
- The CEO called BetterBrain "the core asset of the agency"
- Enabled the team to complete engagements significantly faster
- Natural language access to billions of LinkedIn records for the entire team