Data Methodology

Transparency is core to our mission. This page explains how we collect, process, and present private equity data. All our data comes from publicly available sources.

Our Data Principles

Public Sources Only

All data comes from publicly available documents. We do not use proprietary or leaked data.

Full Attribution

Every data point traces back to a source document with a timestamp.

Regular Updates

Data is refreshed as new disclosures become available, typically quarterly.

15,000+ Funds Tracked (+2,000 YoY)
4,000+ Managers (+500 YoY)
160+ LP Sources (+20 YoY)
2M+ Data Points (growing daily)

Data Sources
Where our data comes from and what it contains

Public Pension Funds

Quarterly and annual disclosures from state and local pension funds

Example Sources

CalPERS, CalSTRS, Washington State Investment Board, Texas Teachers

Data Types Extracted

  • LP-Fund relationships
  • Performance metrics (IRR, TVPI, DPI)
  • Commitment amounts
  • NAV/Fair Value
Update Frequency: Quarterly with 3-6 month lag
Coverage: 160+ pension funds tracked

SEC EDGAR Filings

Business Development Company (BDC) portfolio holdings and financial reports

Example Sources

Form 10-K, Form 10-Q, Form N-CSR

Data Types Extracted

  • Portfolio company holdings
  • Investment valuations
  • Industry classifications
  • Interest rates
Update Frequency: Quarterly
Coverage: All publicly traded BDCs

Public Records Requests

FOIA and state public records requests for pension fund data

Example Sources

Investment reports, Board meeting materials, Consultant presentations

Data Types Extracted

  • Supplementary performance data
  • Investment committee decisions
  • Allocation targets
Update Frequency: As available
Coverage: On a per-request basis
Data Processing Pipeline
How raw documents become structured data
1. Document Collection

Automated monitoring of pension fund websites and SEC EDGAR for new filings. Documents are downloaded and archived with timestamps.
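
As an illustration only, archiving a downloaded document with a timestamp might look like the Python sketch below. The function name `archive_document`, the record fields, and the content-hash file naming are assumptions made for this example, not a description of the production system.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def archive_document(content: bytes, source_url: str, archive_dir: Path) -> dict:
    """Store a downloaded document and return a provenance record.

    Hypothetical helper: the file is named after its SHA-256 hash, so
    re-downloading identical bytes does not create a duplicate copy.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(content).hexdigest()
    retrieved_at = datetime.now(timezone.utc).isoformat(timespec="seconds")

    path = archive_dir / f"{digest}.bin"
    is_new = not path.exists()
    if is_new:
        path.write_bytes(content)

    return {
        "source_url": source_url,      # where the document was found
        "sha256": digest,              # identifies the exact bytes archived
        "retrieved_at": retrieved_at,  # UTC download timestamp
        "path": str(path),
        "is_new": is_new,              # False if these bytes were already archived
    }
```

Comparing the hash of a freshly downloaded file against the archive is one simple way to detect that a source has published a new or revised document.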

2. Data Extraction

Structured data is extracted from PDFs, Excel files, and HTML tables using a combination of OCR, table parsing, and manual review for complex documents.
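
The simplest of these cases, an HTML table, can be sketched with Python's standard library alone. The class and function names below are hypothetical, and the sketch assumes a single, non-nested table whose first row is the header. OCR and PDF parsing are out of scope here.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> row in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # row currently being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left unclosed


def parse_table(html: str) -> list[dict]:
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]
```

Cell values are returned as strings (for example "12.5%"); converting them to numbers and units is a separate step, and is where manual review tends to matter for irregular documents.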

3. Entity Resolution

Fund and manager names are normalized and matched across sources using fuzzy matching algorithms. "Blackstone Capital Partners VII" and "BCP VII" are linked to the same fund entity.
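
A minimal sketch of this kind of matching, using only Python's standard library, is shown below. It is not the production algorithm: the noise-token list, the acronym rule, the 0.9 threshold, and all function names are assumptions chosen for illustration. Plain string similarity alone would not link "BCP VII" to its full name, so the sketch adds an explicit acronym check.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of tokens that carry no identifying information.
NOISE_TOKENS = {"lp", "llc", "fund", "the"}


def normalize(name: str) -> list[str]:
    """Lowercase, drop periods, split on punctuation, remove noise tokens."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", name.lower().replace(".", ""))
    return [t for t in cleaned.split() if t not in NOISE_TOKENS]


def acronym_match(short: list[str], long: list[str]) -> bool:
    """True if short[0] spells the initials of the leading words of `long`
    and the remaining tokens (e.g. the fund number) are identical."""
    if not short or len(short) >= len(long):
        return False
    head, tail = short[0], short[1:]
    n = len(head)
    if n < 2:
        return False
    initials = "".join(t[0] for t in long[:n])
    return initials == head and long[n:] == tail


def same_fund(a: str, b: str, threshold: float = 0.9) -> bool:
    ta, tb = normalize(a), normalize(b)
    if not ta or not tb:
        return False
    if ta == tb:
        return True
    if acronym_match(ta, tb) or acronym_match(tb, ta):
        return True
    # Require the last token (usually the fund number) to match exactly, so
    # "... VII" and "... VIII" are never merged on string similarity alone.
    if ta[-1] != tb[-1]:
        return False
    ratio = SequenceMatcher(None, " ".join(ta[:-1]), " ".join(tb[:-1])).ratio()
    return ratio >= threshold
```

With these rules, `same_fund("Blackstone Capital Partners VII", "BCP VII")` returns `True`, while "Blackstone Capital Partners VII" and "Blackstone Capital Partners VIII" are kept apart. Acronyms can collide across managers, so matches of this kind are best treated as candidates for review rather than as confirmed links.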

4. Validation & Quality Checks

Automated checks flag outliers and inconsistencies. Performance metrics are validated against expected ranges (e.g., IRR between -100% and +200%). Where multiple sources report the same data point, the values are compared across sources.
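
The range check can be sketched as follows. The IRR bounds come from the text above. The `Metrics` class, the flag names, and the additional checks (non-negative multiples, DPI not exceeding TVPI) are assumptions added for illustration. They rely on TVPI being the sum of DPI and the residual-value multiple, which is normally non-negative.

```python
from dataclasses import dataclass
from typing import Optional

# IRR bounds as decimals: -100% to +200%.
IRR_MIN, IRR_MAX = -1.0, 2.0


@dataclass
class Metrics:
    irr: Optional[float]   # e.g. 0.15 for 15%
    tvpi: Optional[float]  # total value to paid-in multiple
    dpi: Optional[float]   # distributions to paid-in multiple


def validate(m: Metrics) -> list[str]:
    """Return a list of flags; an empty list means no check failed."""
    flags = []
    if m.irr is not None and not (IRR_MIN <= m.irr <= IRR_MAX):
        flags.append("irr_out_of_range")
    if m.tvpi is not None and m.tvpi < 0:
        flags.append("tvpi_negative")
    if m.dpi is not None and m.dpi < 0:
        flags.append("dpi_negative")
    # Assumed extra check: distributions alone should not exceed total value.
    if m.tvpi is not None and m.dpi is not None and m.dpi > m.tvpi + 1e-9:
        flags.append("dpi_exceeds_tvpi")
    return flags
```

A flagged record is not necessarily wrong. A common cause is a percentage stored as 15 instead of 0.15, which a reviewer can confirm against the source document.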

5. Database Integration

Validated data is merged into our database, preserving historical values and creating time series where multiple observations exist.
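
One way to picture this step is an in-memory store keyed by LP, fund, and metric, with one observation per as-of date. The sketch below is an assumption about structure, not the actual database schema. `MetricStore` and its methods are hypothetical names, and "preserving historical values" is interpreted here as keeping an append-only log of every value ever loaded, including restatements.

```python
from collections import defaultdict
from datetime import date


class MetricStore:
    def __init__(self):
        # (lp, fund, metric) -> {as_of_date: (value, source)}
        self._series = defaultdict(dict)
        # Append-only log, so restated values are never lost.
        self.history = []

    def upsert(self, lp, fund, metric, as_of: date, value: float, source: str):
        """Record an observation; return the (value, source) it replaced, if any."""
        key = (lp, fund, metric)
        previous = self._series[key].get(as_of)
        self.history.append((key, as_of, value, source))
        self._series[key][as_of] = (value, source)
        return previous

    def series(self, lp, fund, metric):
        """Return the current time series as (as_of_date, value), oldest first."""
        observations = self._series.get((lp, fund, metric), {})
        return [(d, v) for d, (v, _) in sorted(observations.items())]
```

Loading the same LP, fund, and metric for two different quarters yields a two-point time series. Loading a corrected value for an existing quarter replaces the current value while the original remains in `history`.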

Data Limitations
Important considerations when using our data

Time Lag

Pension fund disclosures typically lag the reporting period by 3-6 months. For example, data as of Q2 2025 (quarter ending June 30) may not be published until September 2025 or later.
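
The lag arithmetic can be made explicit with a short sketch. The function names are hypothetical, and the sketch assumes calendar quarters and a lag measured in whole months from the quarter-end date.

```python
import calendar
from datetime import date


def quarter_end(year: int, quarter: int) -> date:
    """Last calendar day of the given quarter (1-4)."""
    month = quarter * 3
    return date(year, month, calendar.monthrange(year, month)[1])


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the end of shorter months."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def expected_publication_window(year: int, quarter: int,
                                min_lag: int = 3, max_lag: int = 6):
    """Earliest and latest expected publication dates for a quarter's data."""
    end = quarter_end(year, quarter)
    return add_months(end, min_lag), add_months(end, max_lag)
```

For Q2 2025 this gives a window from September 30, 2025 to December 30, 2025, so a dataset that appears to be missing the latest quarter may simply be waiting on the source.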

Partial Coverage

Not all LPs publicly disclose their PE investments. Our database represents a subset of the total market, biased toward US public pensions.

Single LP Perspective

Performance data reflects what a specific LP experienced. Different LPs in the same fund may have slightly different returns due to timing, fees, or side letter provisions.

Naming Inconsistencies

Despite our entity resolution efforts, some funds may appear as separate entities due to significant naming variations across sources.

Valuation Methodology

NAV-based metrics (TVPI, IRR) rely on GP valuations, which may differ from realizable values and can lag market movements.
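
To show why the valuation matters, the sketch below computes the standard multiples from paid-in capital, cumulative distributions, and reported NAV. The function name and the example figures are illustrative assumptions, not values from our database.

```python
def multiples(paid_in: float, distributions: float, nav: float) -> dict:
    """Compute DPI, RVPI, and TVPI for one LP's position in one fund.

    DPI  = distributions / paid-in capital  (realized, independent of NAV)
    RVPI = NAV / paid-in capital            (unrealized, depends on the GP's valuation)
    TVPI = DPI + RVPI
    """
    if paid_in <= 0:
        raise ValueError("paid_in must be positive")
    dpi = distributions / paid_in
    rvpi = nav / paid_in
    return {"dpi": dpi, "rvpi": rvpi, "tvpi": dpi + rvpi}


# Illustrative figures: 100 paid in, 50 distributed, NAV reported at 75.
reported = multiples(paid_in=100, distributions=50, nav=75)

# Same position if the remaining holdings were worth 20% less than reported.
marked_down = multiples(paid_in=100, distributions=50, nav=60)
```

In this example DPI stays at 0.5x in both cases, while TVPI falls from 1.25x to about 1.1x. Only the NAV-dependent part of the multiple moves with the GP's valuation.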

Update Schedule

Quarterly Updates

Major data refresh following pension fund disclosure cycles:

  • February-March: Q4 data from the previous year
  • May-June: Q1 data
  • August-September: Q2 data
  • November-December: Q3 data

Continuous Updates

Ongoing additions and corrections:

  • SEC filings processed within 48 hours
  • New pension sources added monthly
  • Entity resolution improvements weekly
  • User-reported corrections as received
Questions or Corrections?

If you notice data discrepancies or have questions about our methodology, we'd love to hear from you.