Alternative Data Newsletter #109 – March 14th, 2019

Important Developments

  • Proprietary Datasets: Eagle Alpha offers three exclusive proprietary datasets to the buyside. This number will increase substantially over the coming months and quarters. Please contact enquiries@eaglealpha.com to learn more.
  • New Weekly Newsletter: we launched a new weekly newsletter exclusively focused on the needs of vendors of alternative data. It includes active data requests of buyside firms. Please register for the newsletter by emailing

Events You Should Attend

  • Eagle Alpha’s Event Calendar:
  • AI & Data Science in Trading: We are excited to be sponsors at AI & Data Science in Trading in NYC in March, where we will be amongst industry peers and colleagues talking about alternative data, big data, AI, and machine learning. Details can be found here –

What Datasets Are Getting Traction?

  • Quant: this vendor is a leading provider of alternative data to hedge funds, consulting firms and retail companies in China. It specializes in deploying sophisticated web scraping algorithms to collect accurate data from websites on a daily basis. Our Data Sourcing clients can view the full profile here.
  • Discretionary: this vendor uses accurate, privacy-compliant GPS data to track visits to the physical locations of 5,000 brands across 15 million stores in the United States. The dataset provides a complete analysis of visits to each physical location for each brand. Our Data Sourcing clients can view the full profile here.
  • New dataset: this vendor identifies product trends in the market and provides precise and timely location data on store openings and closings. Using Wi-Fi, Bluetooth, and BLE signals originating from thousands of device types (Tesla, Fitbit, Roku, Oculus VR, etc.) and places (Starbucks, Warby Parker, etc.), the vendor can determine when and where new things appear in the world. Our Data Sourcing clients can view the full profile here.

Data Science Lab

  • Open Sourcing this week: PyTorch Geometric – This extension to the popular PyTorch library introduces methods for deep learning on graphs and other irregular structures such as 3D meshes and point clouds. The library utilizes dedicated CUDA kernels to handle the high-dimensional and sparse data that are characteristic of these irregular data structures.
  • What we’re reading this week: How I created over 100,000 labeled LEGO training images – this whimsical use case for image classification provides some very practical suggestions for building a tagged training set from scratch. The article focuses on ways to make the tagging process more accurate and efficient, including using knowledge of the sampling process to tag multiple samples of something known to be the same, streamlining the tagging process through web interfaces, and using your infant classifier to order your options for tags more intelligently.
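
Two of the tagging ideas mentioned above, ordering tag options by an early classifier's scores and applying one tag to several samples known to show the same object, can be sketched in a few lines of Python. This is only an illustrative sketch and is not taken from the article: the function names, label names, image IDs and scores are all hypothetical.

```python
from typing import Dict, List


def rank_tag_options(scores: Dict[str, float], top_k: int = 5) -> List[str]:
    """Order candidate labels from most to least likely and keep the top_k."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:top_k]]


def propagate_label(sample_ids: List[str], label: str) -> Dict[str, str]:
    """Assign one label to every sample known to show the same object."""
    return {sample_id: label for sample_id in sample_ids}


# Hypothetical scores from an early ("infant") classifier for one image.
scores = {
    "brick_2x4": 0.61,
    "brick_2x2": 0.22,
    "plate_1x2": 0.09,
    "tile_1x1": 0.08,
}

# The human tagger sees the most likely labels first.
options = rank_tag_options(scores, top_k=3)

# Once the tagger confirms a label, it is applied to all images
# known (from the sampling process) to show the same object.
batch = propagate_label(["img_001", "img_002", "img_003"], options[0])
```

Showing the likeliest labels first means the tagger usually confirms a suggestion rather than searching a full list, and the batch step turns one confirmation into several labeled images.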

Inside Track

  • Alpha Focus Session – B2B Data (online): our second alpha focus session will be held on March 29th, and we will discuss how to work with various types of B2B datasets. Three vendors will attend the session, explain their data procurement and discuss the use cases. Contact us to learn more.
  • Dataset Costs/Pricing: a client asked us to launch a think tank comprised of buyside firms to discuss dataset costs. The output will be a white paper. To learn more email
  • Standardized DDQ: the first draft of the document was prepared by Lowenstein Sandler, and the document is currently being reviewed by Simmons & Simmons. Contact us to learn more.

Notable News in the Alternative Data Space