Data-driven sourcing/screening is creating meaningful value for VCs as competition becomes increasingly cutthroat

As the funds available for investment grow and the number of new startups remains stagnant, VCs can use unique alternative data sets to identify early-stage startups and get first dibs

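In this article we will discuss:

  • The struggling venture capitalist
  • How alternative data can help
  • Where data-driven VC is headed
  • The bottom line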

The struggling venture capitalist

The VC predicament can be summarized in one sentence: 

A lot of available investment capital vs. a stagnant pool of startups. 

Total value of venture capital funds raised in Europe from 2007 to 2019 (in billion euros)

Source: Statista

As the Statista figures show, the pool of VC funds available for investment has grown almost four-fold since 2009. On the other hand, a paper in the Kauffman Foundation's research series on ‘Firm Formation and Economic Growth’ shows that:

“Firm formation is remarkably constant over time, with the number of new companies varying little from year to year. This remains true despite sharp changes in economic conditions and markets, and longer-cycle changes in population and education.” 

This means that VCs are facing a classic capitalist’s predicament – demand outpacing supply. 

Additionally, identifying companies is usually a manual and time-consuming process. This means that VCs need to become more creative in how they source and screen companies if they want to beat the competition and get in on the ground floor. This is especially true as 60% of VC-generated profits are determined in the sourcing/screening stages, according to Morten Sorensen in his publication ‘How Smart Is Smart Money? A Two‐Sided Matching Model of Venture Capital’.

How alternative data can help

It is for this reason that venture capital firms are turning to alternative data-driven solutions, including Artificial Intelligence and Machine Learning, that enrich:

  • Stage 1: Sourcing
  • Stage 2: Screening

Stage 1: Sourcing

Sourcing is crucial to finding early-stage startups before other VCs scoop them up behind your back. It is this search for new enterprises that can be led with a data-first approach. Here are some alternative data sets your firm might want to consider incorporating into its algorithms:

  • Scanning social/business networks – LinkedIn crawling, for example, is a great way to identify new or growing firms. You can collect data on companies with under 5 employees that are now hiring, i.e. companies that are still small but growing. You may also want to collect posts based on keywords, for example, ‘Beta Stage’ or ‘Investment Round’.
  • New product platforms – Another way to identify new products/companies is to crawl websites such as Product Hunt. This platform, like others, allows users to rate products and leave comments. VCs can collect these and set trigger levels that indicate meaningful public/professional interest.
  • Company databases – Directories such as Crunchbase have tons of information about private and public companies. This includes investment/funding data, information about founders and corporate leadership, industry news/trends, as well as Mergers and Acquisitions (M&A). Scanning this type of platform for data and feeding it into algorithms with predefined triggers can be very useful in bringing obscure companies to the forefront of your quarterly portfolio acquisitions.
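The ‘predefined triggers’ idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CompanySignal` record, its field names, and the `MIN_UPVOTES` trigger level are hypothetical, and the code assumes the data has already been collected into that shape.

```python
from dataclasses import dataclass

# Hypothetical keywords and thresholds; tune these to your own strategy.
KEYWORDS = ("beta stage", "investment round")
MAX_EMPLOYEES = 5    # "under 5 employees", as described in the text
MIN_UPVOTES = 200    # assumed trigger level for public interest


@dataclass
class CompanySignal:
    """One already-collected record per company (hypothetical shape)."""
    name: str
    employees: int
    open_roles: int
    recent_posts: list[str]
    upvotes: int


def sourcing_triggers(c: CompanySignal) -> list[str]:
    """Return the names of the sourcing triggers this company fires."""
    fired = []
    # Still small, but growing: few employees and at least one open role.
    if c.employees < MAX_EMPLOYEES and c.open_roles > 0:
        fired.append("small_but_hiring")
    # Case-insensitive keyword search across all collected posts.
    posts = " ".join(c.recent_posts).lower()
    if any(keyword in posts for keyword in KEYWORDS):
        fired.append("keyword_match")
    # Rating/comment activity above the chosen trigger level.
    if c.upvotes >= MIN_UPVOTES:
        fired.append("public_interest")
    return fired
```

A company that fires one or more triggers would then move on to the screening stage; how many triggers to require is a judgment call for each fund.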

Stage 2: Screening 

Not all companies that trigger interest in the ‘sourcing’ stage will make it past the ‘screening’ stage. A company that became interesting because one of the founders tweeted about a new investment round may be eliminated during ‘screening’ because its engagement/growth metrics looked weak. Here are some alt data points that enable VCs to weed out the weaker candidates from the pile:

  • Social media/funding – Where a company has a presence, how many followers it has, and how many engagements (shares, likes, etc.) it receives are all important in determining social sentiment and excitement about a given product. A similar principle can be, and is being, applied on Kickstarter, where companies can identify product-market fit before investing their first dollar.
  • Product reviews – For products that already have some version released/live, reviews are important for seeing what other industry professionals such as developers have to say, identifying Unique Selling Proposition (USP) gaps, and being made aware of bugs and anomalies.
  • App/website info – Collecting data points on app-store performance, downloads, and star ratings, as well as web traffic and/or search result rankings, can also serve as a strong indication of potentially explosive future growth.
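As a rough sketch of how such screening could be automated, the Python below checks a company's metrics against minimum levels and reports which ones fall short. The metric names and the values in `SCREENING_MINIMUMS` are assumptions made up for illustration, not recommended benchmarks.

```python
# Hypothetical minimum levels for each screening metric.
SCREENING_MINIMUMS = {
    "followers": 1_000,
    "engagement_rate": 0.02,     # engagements per follower
    "avg_star_rating": 4.0,
    "monthly_downloads": 5_000,
}


def screen(metrics: dict[str, float]) -> tuple[bool, list[str]]:
    """Return (passed, failed_metric_names).

    A metric that was not collected is treated as 0, so it counts as a failure.
    """
    failed = [
        name
        for name, minimum in SCREENING_MINIMUMS.items()
        if metrics.get(name, 0) < minimum
    ]
    return (not failed, failed)
```

Returning the list of failed metrics, rather than a bare yes/no, keeps a record of why a company was eliminated, which makes it easier to revisit the thresholds later.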

These data collection strategies will enable early-adopting venture capitalists to benefit by automating what has historically been a manual process full of human error, inaccuracies, and a lack of properly cross-referenced information. 

But as adoption becomes more mainstream, the competitive gap will close, and generating creative/unique data sets will become crucial to maintaining the current level of successful acquisitions.

Where data-driven VC is headed 

Successful data-driven venture capitalism will necessarily have to evolve in one or all of the following ways in order for firms to assume ‘species dominance’:

Real-time data 

Many investment algorithms are developed and trained using historical data sets. On the surface this seems logical, as Machine Learning should be able to learn and improve independently based on historical patterns. This would be true if it were not for human nuance and innovative disruptors. For example, very few VCs would have thought to look at China in the context of cutting-edge electric vehicles. But in 2020, BYD, one of China’s leaders in electric vehicle R&D, showed off a model of its Han EV series at the Beijing auto show. These kinds of trends can only be picked up by data sets that are being collected in the present.

Unique data sets

Beyond real-time data sets, VCs will also need to become more creative in the types of data signals they choose to use. When a majority of VCs are already aggregating LinkedIn, Crunchbase, and Product Hunt data, portfolio managers will need to look further afield. For example, funds may start collecting information from:

  • Startup accelerators – To try and get in early on the action 
  • Universities and high schools – Where tech giants like Facebook have their roots
  • Forums – Where coders, and/or entrepreneurs meet, and chat
  • Patent offices – To see what tech is being developed, and protected
  • Local news items – Scanning stories for early-action / geo-specific ingenuity
  • Geospatial/satellite data – Could indicate progress on large projects (e.g. Elon Musk’s SpaceX). 

Complex cross-pollination 

Lastly, VCs who are really looking for a competitive edge will cross-pollinate, i.e. identify multiple real-time, unique data sets which they can cross-reference and correlate using open-source web data and dedicated algorithms. This may, for example, consist of data indicative of growth, combined with positive social sentiment/engagement, correlating with healthy web traffic and app downloads.
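One simple way to picture this cross-referencing is to shortlist only the companies where several independent signals agree. The Python below is a minimal sketch, assuming each signal has already been normalized to a score between 0 and 1; the function name, the signal names, and the default `threshold` and `min_agreeing` values are all hypothetical.

```python
def cross_reference(
    signals: dict[str, dict[str, float]],
    threshold: float = 0.6,
    min_agreeing: int = 3,
) -> list[str]:
    """Shortlist companies where several signals agree.

    `signals` maps company name -> {signal name: score in the range 0..1}.
    A company is kept when at least `min_agreeing` of its signals score at
    or above `threshold`. Results are ordered by mean score, strongest first.
    """
    shortlisted = []
    for company, scores in signals.items():
        agreeing = [s for s in scores.values() if s >= threshold]
        if scores and len(agreeing) >= min_agreeing:
            mean_score = sum(scores.values()) / len(scores)
            shortlisted.append((mean_score, company))
    return [company for _, company in sorted(shortlisted, reverse=True)]
```

Requiring agreement across signals, rather than relying on any single strong signal, is what filters out a company with, say, rapid headcount growth but little traffic or engagement to back it up.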

The bottom line 

Whichever route your firm decides to take, those with a unique yet aggressive alt-data strategy will be able to identify and scoop up the winners, leaving their competition with nothing but crumbs to peck at.
