In our latest team interview, we sit down with Dr. Elvijs Sarkans, Senior Data Engineer at Crowdsurfer.
Elvijs loves all things mathematical. His obsessive attention to detail and commitment to the idempotency of data pipelines are eclipsed only by his love of ’80s music and glitter.
Over a cup of coffee at Crowdsurfer HQ, Elvijs explained Crowdsurfer’s approach to data.
Elvijs, how long have you been with Crowdsurfer?
I’ve been with the company for a little under two and a half years.
What’s your background before that?
Lots of study! I did my undergraduate degree in mathematics just down the road at Cambridge University before completing my MSc and Ph.D. at the University of Bath, where I worked on control-theoretic stability aspects of differential equations, culminating in three publications. In practical terms, this relates to understanding how physical systems behave under directed perturbations. For example, a related summer project involved working on the very hot area of autonomous vehicles. I was then offered a role at Crowdsurfer starting in September 2014.
And what’s the problem that you are trying to solve at Crowdsurfer?
It’s simple in principle: we are gathering all the world's data on crowd finance in one place. In practice, it's a huge data engineering challenge because the industry's data is very messy and inconsistent. We've spent the last few years building an understanding of the crowd finance markets and have implemented a data acquisition engine with a management interface to squeeze out every ounce of efficiency in structuring huge quantities of data.
How much of the market do you cover, Elvijs?
Nobody knows exactly how big the crowd finance market is because it's never been mapped before. But we can say with confidence that we offer more coverage than anyone else, and that this coverage is growing all the time. We've mapped around $30B in detail, campaign by campaign, across all types of crowd finance, covering about 900 platforms globally.
We have a comprehensive global platform directory, which our data research team maintains. The number of platforms is increasing by around one a day. We cover more platforms than other providers and offer more historical data as well because we've been around for a while now. To give you an idea of the size of this task, we update more than 70,000 campaigns on an average day. Apart from the sheer volume, another interesting challenge is the geographical diversity of crowd finance, as evidenced by more than 200,000 unique locations and more than 700,000 unique translations in our database.
That’s a lot of data to structure! How do you ensure data integrity, so customers can trust the data?
Yes - loads of data! Data integrity is an essential aspect of what we do and we take it really seriously. Apart from a proactive data quality monitoring suite, we have built a comprehensive data error detection suite, which notifies us even when a single field is missing on one campaign on one platform. We also source feedback from users, from our customer base and sometimes from the platforms themselves. The team then checks this feedback manually to ensure data quality.
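To make the idea of per-field error detection concrete, here is a minimal sketch in Python. It is an illustration only, not Crowdsurfer's actual suite: the record layout, the `REQUIRED_FIELDS` list and the `find_missing_fields` helper are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical list of fields every campaign record is expected to carry.
REQUIRED_FIELDS = ("title", "platform", "amount_raised", "currency")


@dataclass(frozen=True)
class MissingField:
    """One missing field on one campaign on one platform."""
    platform: str
    campaign_id: str
    field: str


def find_missing_fields(campaigns, required=REQUIRED_FIELDS):
    """Return one MissingField per required field that is absent or empty.

    `campaigns` is an iterable of dicts (an assumed record format).
    A value of 0 counts as present; only None and "" count as missing.
    """
    issues = []
    for campaign in campaigns:
        platform = campaign.get("platform") or "<unknown>"
        campaign_id = str(campaign.get("id", "<unknown>"))
        for name in required:
            value = campaign.get(name)
            if value is None or value == "":
                issues.append(MissingField(platform, campaign_id, name))
    return issues
```

In a sketch like this, each returned `MissingField` could be turned into a notification, so that a single gap on a single campaign is reported rather than lost in an aggregate statistic.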
How is Crowdsurfer different to other data providers in this space?
We've built a very tech-rich company. We've spent a lot of time and resources building some clever data tools so that we can gather, clean and structure huge amounts of messy data, and make it easy to access, analyse and turn into actionable insights.
You bring order to the chaos of data?
Yes! Our aim is to let those wanting to invest in, raise from and track the crowd economy uncover key trends and opportunities in the data that take some number crunching and clever search to find.
For example, we've enabled companies exploring the area to understand the big picture, as well as identify specific business opportunities. From the macro to micro we make the analysis available to our customers via Crowdsurfer Pro.
So, in short, you’re bringing transparency to the world of crowd finance? Or aiming to at least!
Yes. Exactly. By bringing the industry’s data analysis into one easily accessible space, we're creating a next-generation data analytics service for a new, transparent financial industry. It's an ongoing task and we have to evolve in response to the changing landscape, but that's one of the things that makes this job so interesting!
Take advantage of Crowdsurfer to research your own fundraise or to answer any other questions you have about crowd finance. Register here to access the most current and interactive data intelligence on the industry.
What is Crowdsurfer Pro?
Crowdsurfer Pro is the world's most trusted data intelligence service for the global crowd finance phenomenon. Using big data engineering and machine learning, Crowdsurfer Pro distills billions of data points on individual raises into usable intelligence for fundraisers, investors and regulators.