Open Source · Built on SEC EDGAR

Open financial data infrastructure

The transparent financial database built from SEC filings.

OpenFinDB provides clean, traceable financial datasets built directly from SEC EDGAR filings — designed for developers, quant researchers, and fintech builders who need data they can trust.

  • SEC EDGAR: Primary source
  • Point-in-Time: No look-ahead bias
  • Open Source: Transparent pipeline
  • XBRL Native: Full traceability

Developer First

Query financial data in one line

Access point-in-time financial data derived directly from SEC EDGAR filings. Simple APIs designed for quant research and production fintech applications.

openfindb v0.1
from openfindb import fundamentals

# Fetch point-in-time financials — no look-ahead bias
df = fundamentals.get(
    ticker="AAPL",
    fields=["revenue", "net_income", "operating_cash_flow"],
    start="2015",
    asof="2024-01-01"  # Only data available as of this date
)

print(df.head())  # first five rows, matching the output shown
Output
   report_date    revenue          net_income       operating_cash_flow
0  2023-09-30     383,285,000,000  96,995,000,000   110,543,000,000
1  2022-09-24     394,328,000,000  99,803,000,000   122,151,000,000
2  2021-09-25     365,817,000,000  94,680,000,000   104,038,000,000
3  2020-09-26     274,515,000,000  57,411,000,000    80,674,000,000
4  2019-09-28     260,174,000,000  55,256,000,000    69,391,000,000

# All values sourced directly from SEC 10-K filings via EDGAR
Point-in-Time Correctness

All data is filtered by filing_acceptance_dt, the timestamp at which SEC EDGAR accepted the filing. Filtering on acceptance time rather than report date removes a common source of look-ahead bias in financial backtests. Your models reflect what was actually known at each point in time, not what was known later.
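
A minimal sketch of that filter, using only the standard library. The `Fact` record, the `as_of` helper, and the company "XYZ" with its values are hypothetical illustrations, not part of the OpenFinDB API; only the `filing_acceptance_dt` field name comes from the text above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fact:
    ticker: str
    period_end: str                 # fiscal period the value describes
    field: str
    value: float
    filing_acceptance_dt: datetime  # when EDGAR accepted the filing


def as_of(facts, asof):
    """Return the latest value per (ticker, period_end, field) known at `asof`."""
    known = {}
    # Walk filings in acceptance order so later filings overwrite earlier ones.
    for f in sorted(facts, key=lambda f: f.filing_acceptance_dt):
        if f.filing_acceptance_dt <= asof:
            known[(f.ticker, f.period_end, f.field)] = f.value
    return known


# Hypothetical company "XYZ": an original value, later restated in an amendment.
facts = [
    Fact("XYZ", "2020-12-31", "revenue", 100.0, datetime(2021, 2, 15, 16, 30)),
    Fact("XYZ", "2020-12-31", "revenue", 92.0, datetime(2021, 9, 1, 8, 5)),
]

key = ("XYZ", "2020-12-31", "revenue")
print(as_of(facts, datetime(2021, 6, 30))[key])   # 100.0 (restatement not yet filed)
print(as_of(facts, datetime(2021, 12, 31))[key])  # 92.0 (restated value now known)
```

A backtest that reads the restated 92.0 in June 2021 would be using information that did not exist yet; filtering on acceptance time avoids that.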

Built directly from SEC filings. Fully traceable to the source document.

The Problem

Why financial data is broken

Traditional financial data platforms were built for institutions, not developers. The result is an industry that's expensive, opaque, and hostile to independent builders.

Enterprise pricing

Bloomberg, FactSet, and Refinitiv charge tens of thousands of dollars per year, pricing out solo researchers, indie developers, and early-stage startups.

Inconsistent cheap APIs

Low-cost alternatives often have inconsistent data, undocumented normalization choices, and no clear sourcing — making it impossible to trust the numbers.

No traceability to filings

When a number looks wrong, you can't check it. Most vendors offer no way to verify a value against the original SEC filing it came from.

Look-ahead bias in datasets

Most datasets are not truly point-in-time. They silently include restated or revised data, corrupting backtest results with information that wasn't available at decision time.

OpenFinDB solves this

By building directly from SEC EDGAR filings with full transparency, OpenFinDB provides datasets where every number traces to its source document, every timestamp reflects actual filing acceptance time, and the entire pipeline is open source and reproducible.

Core Features

Built for serious research

Every design decision is optimized for correctness, traceability, and developer productivity.

Point-in-Time Financial Data

Financial values are tied to the exact SEC filing acceptance timestamp, preventing look-ahead bias in research and backtesting. Know exactly what data was available at any moment in history.

filing_acceptance_dt

Full Filing Traceability

Every financial number traces back to the original SEC filing — including accession number, CIK, and XBRL concept tag. No black boxes. Complete transparency from raw filing to final dataset.

accession_number
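
As a rough illustration of what such a traceable record might look like, the sketch below carries the three identifiers named above and builds a link to the filing's index page on EDGAR. The `TracedValue` class and `filing_index_url` helper are assumptions for illustration, not OpenFinDB's schema, and the CIK and accession number are placeholders rather than a real filing. The URL layout follows EDGAR's public archive paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracedValue:
    value: float
    cik: int               # SEC Central Index Key of the filer
    accession_number: str  # unique ID of the filing, format NNNNNNNNNN-YY-NNNNNN
    xbrl_concept: str      # XBRL concept tag the value was reported under


def filing_index_url(v: TracedValue) -> str:
    """Build the EDGAR filing index URL for the filing a value came from."""
    folder = v.accession_number.replace("-", "")  # archive folder drops the dashes
    return (
        "https://www.sec.gov/Archives/edgar/data/"
        f"{v.cik}/{folder}/{v.accession_number}-index.htm"
    )


# Placeholder identifiers: not a real filer or filing.
revenue = TracedValue(
    value=1_000_000.0,
    cik=1234567,
    accession_number="0001234567-24-000001",
    xbrl_concept="us-gaap:Revenues",
)
print(filing_index_url(revenue))
```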

Developer-First API

Query financial data through simple Python APIs or SQL, both designed for quant research and fintech development. Datasets also ship as DuckDB-compatible Parquet files for fast offline analysis.

fundamentals.get()
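
For the SQL side, a point-in-time query might look like the sketch below. It uses Python's built-in sqlite3 as a stand-in so it runs without extra dependencies; the same query shape should carry over to DuckDB, where the table could instead be a `read_parquet(...)` call over a Parquet file. The `fundamentals` table, its columns, and the sample rows are assumptions for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    """CREATE TABLE fundamentals (
           ticker TEXT, period_end TEXT, field TEXT, value REAL,
           filing_acceptance_dt TEXT, accession_number TEXT)"""
)
# Hypothetical rows: an original filing and a later restatement.
con.executemany(
    "INSERT INTO fundamentals VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("XYZ", "2020-12-31", "revenue", 100.0, "2021-02-15 16:30:00", "acc-original"),
        ("XYZ", "2020-12-31", "revenue", 92.0, "2021-09-01 08:05:00", "acc-restated"),
    ],
)

# For each (ticker, period, field), keep the row from the latest filing
# accepted on or before the as-of timestamp.
AS_OF_SQL = """
SELECT f.ticker, f.period_end, f.field, f.value, f.accession_number
FROM fundamentals AS f
WHERE f.filing_acceptance_dt = (
    SELECT MAX(g.filing_acceptance_dt)
    FROM fundamentals AS g
    WHERE g.ticker = f.ticker
      AND g.period_end = f.period_end
      AND g.field = f.field
      AND g.filing_acceptance_dt <= :asof
)
"""

rows = con.execute(AS_OF_SQL, {"asof": "2021-06-30 00:00:00"}).fetchall()
print(rows)  # [('XYZ', '2020-12-31', 'revenue', 100.0, 'acc-original')]
```

Returning the accession number alongside the value keeps each result row verifiable against its filing.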
Architecture

Layered data pipeline

OpenFinDB builds financial datasets directly from SEC filings using a transparent, reproducible pipeline. Each layer adds structure without losing traceability to the source.

SEC EDGAR (Source)
Raw 10-K, 10-Q, and 8-K filings in XBRL format: the primary source of truth
  ↓ XBRL parsing + metadata extraction
EdgarDB (Open Source)
Raw SEC filings archive: complete ingestion and storage of EDGAR submissions
  ↓ financial concept normalization
FundamentalsDB
Normalized financial data: income statement, balance sheet, and cash flow across all filers
  ↓ signal computation + dataset packaging
QuantDB
Research-ready datasets: point-in-time fundamentals, derived signals, Parquet files

Every layer is fully traceable. A signal in QuantDB can always be traced back through FundamentalsDB → EdgarDB → the original SEC EDGAR filing.
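
One way to picture that chain is to have each layer keep a reference to the record it was derived from. The sketch below is an assumption about how such lineage could be modeled, not the project's actual schema: the class names, the `CONCEPT_TO_FIELD` mapping, the `net_margin` signal, and the `acc-0001` accession placeholder are all hypothetical. The `us-gaap` concept names are real XBRL taxonomy tags.

```python
from dataclasses import dataclass

# Assumed mapping from XBRL concept tags to normalized field names.
CONCEPT_TO_FIELD = {
    "us-gaap:Revenues": "revenue",
    "us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax": "revenue",
    "us-gaap:NetIncomeLoss": "net_income",
    "us-gaap:NetCashProvidedByUsedInOperatingActivities": "operating_cash_flow",
}


@dataclass(frozen=True)
class RawFact:          # EdgarDB layer: the value as tagged in the filing
    accession_number: str
    concept: str
    value: float


@dataclass(frozen=True)
class Fundamental:      # FundamentalsDB layer: normalized field, source kept
    field: str
    value: float
    source: RawFact


@dataclass(frozen=True)
class Signal:           # QuantDB layer: derived value, inputs kept
    name: str
    value: float
    inputs: tuple


def normalize(raw):
    """Map a raw XBRL fact to a normalized field without dropping its origin."""
    return Fundamental(CONCEPT_TO_FIELD[raw.concept], raw.value, raw)


def net_margin(net_income, revenue):
    return Signal("net_margin", net_income.value / revenue.value, (net_income, revenue))


def trace(signal):
    """List (field, concept, accession_number) for every input of a signal."""
    return [(f.field, f.source.concept, f.source.accession_number) for f in signal.inputs]


# Hypothetical values and accession placeholder.
raw_rev = RawFact("acc-0001", "us-gaap:Revenues", 200.0)
raw_ni = RawFact("acc-0001", "us-gaap:NetIncomeLoss", 50.0)
signal = net_margin(normalize(raw_ni), normalize(raw_rev))
print(signal.value)   # 0.25
print(trace(signal))
```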

Engineering

Built for data engineers

OpenFinDB is designed as transparent financial data infrastructure — every component is documented, reproducible, and built on proven open-source tools.

Point-in-time datasets using SEC acceptance timestamps — not report dates

Full XBRL traceability to concepts, filings, and accession numbers

Reproducible pipelines built from raw filings with deterministic processing

Developer-first APIs and Python libraries designed for quant workflows
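
Deterministic processing can be checked mechanically. The sketch below shows one possible approach, assumed here for illustration rather than taken from the OpenFinDB pipeline: serialize each record canonically, sort, and hash, so two runs over the same filings should yield the same fingerprint regardless of processing order. The `dataset_fingerprint` name and sample records are hypothetical.

```python
import hashlib
import json


def dataset_fingerprint(records):
    """Hash records in a canonical order and encoding, independent of input order."""
    canonical = sorted(
        json.dumps(r, sort_keys=True, separators=(",", ":")) for r in records
    )
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


run_a = [
    {"ticker": "XYZ", "field": "revenue", "value": 100.0},
    {"ticker": "XYZ", "field": "net_income", "value": 12.5},
]
run_b = list(reversed(run_a))  # same content, different processing order

print(dataset_fingerprint(run_a) == dataset_fingerprint(run_b))  # True
```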

  • Python: Core processing
  • XBRL: Filing parsing
  • DuckDB: Analytics queries
  • Parquet: Storage format
  • Cloudflare Pages: Edge hosting & global distribution

Use Cases

Who uses OpenFinDB

Designed for builders and researchers who need financial data that is correct, transparent, and accessible.

Quant Developers

Build factor models and backtests with point-in-time fundamentals. Eliminate look-ahead bias at the data layer, not as an afterthought.

  • Factor model development
  • Historical backtesting
  • Fundamental signal research

Fintech Startups

Power financial applications with transparent, traceable datasets. Know exactly where every number comes from — and verify it independently.

  • Company financial data
  • Due diligence tooling
  • Financial dashboard apps

Academic Researchers

Access reproducible financial data linked directly to SEC filings. Full methodology transparency for peer review and replication.

  • Accounting & finance research
  • Reproducible studies
  • Market anomaly analysis
Open Source

Built with an open ecosystem

OpenFinDB includes open-source components that let the community inspect, extend, and contribute to the financial data pipeline. Transparency isn't just a feature — it's the architecture.

View EdgarDB on GitHub
EdgarDB (Open Source)

EDGAR ingestion system — archives raw SEC filings with full metadata and submission history.

XBRL Parser (Coming Soon)

Parsing framework for extracting structured financial data from XBRL-tagged SEC filings.

Data Schemas (Coming Soon)

Documented schemas for FundamentalsDB and QuantDB, enabling community contributions and integrations.

Roadmap

Where we're headed

OpenFinDB is being built incrementally — one transparent layer at a time.

Phase 1 (In Progress)

EdgarDB — Filing Archive

Complete EDGAR ingestion pipeline. Archive all SEC filings with metadata, acceptance timestamps, and structured submission data. Foundation for all subsequent layers.

Filing ingestion · Metadata extraction · XBRL parsing

Phase 2 (Planned)

FundamentalsDB — Financial Data

Normalized financial dataset extraction from SEC filings. Income statement, balance sheet, and cash flow data with full XBRL concept mapping and point-in-time accuracy.

Income statement · Balance sheet · Cash flow

Phase 3 (Future)

QuantDB — Research Datasets

Research-ready datasets and derived signals. Factor datasets, anomaly signals, and Parquet files for offline analysis. Python and SQL APIs for quant research workflows.

Factor datasets · Python API · Parquet downloads

Follow the project

OpenFinDB is being built in the open. Star the repository to follow along as the data pipeline progresses.