OpenFinDB — Transparent Financial Data from SEC Filings

Developer First

Query financial data in one line

Access point-in-time financial data derived directly from SEC EDGAR filings. Simple APIs designed for quant research and production fintech applications.

openfindb v0.1

from openfindb import fundamentals

# Fetch point-in-time financials — no look-ahead bias
df = fundamentals.get(
    ticker="AAPL",
    fields=["revenue", "net_income", "operating_cash_flow"],
    start="2015",
    asof="2024-01-01"  # Only data available as of this date
)

print(df.head())

Output

   report_date  revenue          net_income      operating_cash_flow
0  2023-09-30   383,285,000,000  96,995,000,000  110,543,000,000
1  2022-09-24   394,328,000,000  99,803,000,000  122,151,000,000
2  2021-09-25   365,817,000,000  94,680,000,000  104,038,000,000
3  2020-09-26   274,515,000,000  57,411,000,000  80,674,000,000
4  2019-09-28   260,174,000,000  55,256,000,000  69,391,000,000

# All values sourced directly from SEC 10-K filings via EDGAR

-- Point-in-time query: only returns filings accepted on or before the snapshot date
SELECT
    ticker,
    report_date,
    filing_acceptance_dt,
    revenue,
    net_income,
    operating_cash_flow,
    accession_number  -- full traceability to source filing
FROM fundamentals_annual
WHERE
    ticker = 'AAPL'
    AND filing_acceptance_dt <= '2024-01-01'  -- point-in-time filter
ORDER BY report_date DESC
LIMIT 5;

Result (filing_acceptance_dt and operating_cash_flow columns omitted for width)

ticker  report_date  revenue        net_income     accession_number
AAPL    2023-09-30   383285000000   96995000000    0000320193-23-000106
AAPL    2022-09-24   394328000000   99803000000    0000320193-22-000108
AAPL    2021-09-25   365817000000   94680000000    0000320193-21-000105
AAPL    2020-09-26   274515000000   57411000000    0000320193-20-000096
AAPL    2019-09-28   260174000000   55256000000    0000320193-19-000119
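
As a minimal sketch of how the point-in-time filter behaves, the snippet below loads the one AAPL FY2023 record whose acceptance timestamp appears on this page into an in-memory SQLite table and applies the same WHERE clause at two snapshot dates. SQLite stands in for whichever engine actually serves fundamentals_annual; the reduced table layout and the rows_as_of helper are assumptions made for illustration, not the project's schema or API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fundamentals_annual (
        ticker TEXT,
        report_date TEXT,
        filing_acceptance_dt TEXT,  -- ISO 8601 text sorts chronologically
        revenue INTEGER,
        accession_number TEXT
    )
    """
)
# Values taken from the AAPL FY2023 record shown on this page.
conn.execute(
    "INSERT INTO fundamentals_annual VALUES (?, ?, ?, ?, ?)",
    ("AAPL", "2023-09-30", "2023-11-03T08:01:05Z",
     383285000000, "0000320193-23-000106"),
)


def rows_as_of(snapshot: str) -> list:
    """Hypothetical helper: rows whose filing was accepted by `snapshot`."""
    return conn.execute(
        """
        SELECT report_date, revenue, accession_number
        FROM fundamentals_annual
        WHERE ticker = 'AAPL'
          AND filing_acceptance_dt <= ?
        ORDER BY report_date DESC
        """,
        (snapshot,),
    ).fetchall()


# The fiscal year ended 2023-09-30, but the 10-K was not yet accepted here.
before_filing = rows_as_of("2023-10-15")
# By this snapshot the filing is public, so the row becomes visible.
after_filing = rows_as_of("2024-01-01")
```

One design point worth noting: with plain text comparison, a date-only snapshot such as '2024-01-01' sorts before any timestamp on that same day, so filings accepted during the snapshot day are excluded. Passing a full timestamp makes the cutoff explicit.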

{
  "ticker": "AAPL",
  "report_date": "2023-09-30",
  "filing_type": "10-K",
  "acceptance_datetime": "2023-11-03T08:01:05Z",
  "revenue": 383285000000,
  "net_income": 96995000000,
  "operating_cash_flow": 110543000000,
  "shares_outstanding": 15550061000,
  "source": {
    "accession_number": "0000320193-23-000106",
    "cik": "0000320193",
    "xbrl_tags": {
      "revenue": "us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax",
      "net_income": "us-gaap:NetIncomeLoss",
      "operating_cash_flow": "us-gaap:NetCashProvidedByUsedInOperatingActivities"
    },
    "edgar_url": "https://www.sec.gov/Archives/edgar/data/320193/000032019323000106/"
  }
}

Every field traces to a specific XBRL concept in the original SEC filing.
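
To illustrate what that traceability allows, here is a small sketch that takes the provenance block from the record above and rebuilds the EDGAR folder URL from the CIK and accession number. The function names edgar_folder_url and provenance are hypothetical helpers, not part of openfindb; the URL pattern is the one visible in the record's own edgar_url field.

```python
import json

# Subset of the AAPL FY2023 record shown above.
record = json.loads("""
{
  "revenue": 383285000000,
  "source": {
    "accession_number": "0000320193-23-000106",
    "cik": "0000320193",
    "xbrl_tags": {
      "revenue": "us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax"
    }
  }
}
""")


def edgar_folder_url(cik: str, accession_number: str) -> str:
    """Folder path uses the CIK without leading zeros and the accession without dashes."""
    return (
        "https://www.sec.gov/Archives/edgar/data/"
        f"{int(cik)}/{accession_number.replace('-', '')}/"
    )


def provenance(rec: dict, field: str) -> dict:
    """Collect what is needed to check one value against its source filing."""
    src = rec["source"]
    return {
        "value": rec[field],
        "xbrl_tag": src["xbrl_tags"][field],
        "accession_number": src["accession_number"],
        "filing_url": edgar_folder_url(src["cik"], src["accession_number"]),
    }


revenue_source = provenance(record, "revenue")
```

The resulting filing_url matches the edgar_url in the record, so a reader can open the folder and compare the tagged value by hand.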

Point-in-Time Correctness

All data is filtered by filing_acceptance_dt, the timestamp at which SEC EDGAR accepted the filing. Filtering on acceptance time rather than fiscal period end removes the look-ahead bias that arises when a backtest uses figures before they were public. Your models reflect what was actually known at each point in time, not what was known later.

Built directly from SEC filings. Fully traceable to the source document.

The Problem

Why financial data is broken

Traditional financial data platforms were built for institutions, not developers. The result is an industry that's expensive, opaque, and hostile to independent builders.

Enterprise pricing

Bloomberg, FactSet, and Refinitiv charge tens of thousands of dollars per year, pricing out solo researchers, indie developers, and early-stage startups entirely.

Inconsistent cheap APIs

Low-cost alternatives often have inconsistent data, undocumented normalization choices, and no clear sourcing — making it impossible to trust the numbers.

No traceability to filings

When a number looks wrong, you can't trace it. Most vendors offer no way to verify a value against the SEC filing it came from.

Look-ahead bias in datasets

Many datasets are not truly point-in-time. They silently overwrite originally reported values with restated or revised ones, corrupting backtest results with information that wasn't available at decision time.
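
A minimal sketch of the restatement problem, using made-up numbers: one fiscal period is reported twice, first in an original filing and later in an amended one. A point-in-time lookup returns whichever version had been accepted by the as-of date. The Fact class, the revenue_as_of function, and all values and accession labels here are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    period_end: str
    accepted_at: datetime  # when the filing carrying this value was accepted
    revenue: int
    accession_number: str


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical data: the same period reported, then restated six months later.
FACTS = [
    Fact("2020-12-31", utc(2021, 2, 15), 100, "HYPOTHETICAL-ORIGINAL"),
    Fact("2020-12-31", utc(2021, 8, 1), 90, "HYPOTHETICAL-AMENDED"),
]


def revenue_as_of(facts: List[Fact], period_end: str, asof: datetime) -> Optional[Fact]:
    """Return the most recently accepted version that was public at `asof`."""
    known = [
        f for f in facts
        if f.period_end == period_end and f.accepted_at <= asof
    ]
    return max(known, key=lambda f: f.accepted_at, default=None)


not_yet_filed = revenue_as_of(FACTS, "2020-12-31", utc(2021, 1, 1))
as_first_reported = revenue_as_of(FACTS, "2020-12-31", utc(2021, 3, 1))
after_restatement = revenue_as_of(FACTS, "2020-12-31", utc(2021, 9, 1))
```

A dataset that keeps only the latest value would report 90 for every as-of date, which is the look-ahead leak described above.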

OpenFinDB solves this

By building directly from SEC EDGAR filings with full transparency, OpenFinDB provides datasets where every number traces to its source document, every timestamp reflects actual filing acceptance time, and the entire pipeline is open source and reproducible.

Core Features

Built for serious research

Every design decision is optimized for correctness, traceability, and developer productivity.

Point-in-Time Financial Data

Financial values are tied to the exact SEC filing acceptance timestamp, preventing look-ahead bias in research and backtesting. Know exactly what data was available at any moment in history.

filing_acceptance_dt

Full Filing Traceability

Every financial number traces back to the original SEC filing — including accession number, CIK, and XBRL concept tag. No black boxes. Complete transparency from raw filing to final dataset.

accession_number

Developer-First API

Query financial data using simple Python APIs or SQL designed for quant research and fintech development. DuckDB-compatible Parquet files for fast offline analysis.

fundamentals.get()

Architecture

Layered data pipeline

OpenFinDB builds financial datasets directly from SEC filings using a transparent, reproducible pipeline. Each layer adds structure without losing traceability to the source.

SEC EDGAR (Source)
Raw 10-K, 10-Q, and 8-K filings in XBRL format; the primary source of truth

↓ XBRL parsing + metadata extraction

EdgarDB (Open Source)
Raw SEC filings archive: complete ingestion and storage of EDGAR submissions

↓ financial concept normalization

FundamentalsDB
Normalized financial data: income statement, balance sheet, and cash flow across all filers

↓ signal computation + dataset packaging

QuantDB
Research-ready datasets: point-in-time fundamentals, derived signals, Parquet files

Every layer is fully traceable. A signal in QuantDB can always be traced back through FundamentalsDB → EdgarDB → the original SEC EDGAR filing.
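
One way to picture that lineage is to have each layer keep a reference to the layer beneath it. The sketch below is an assumption about how such links could be modeled, not the project's actual schema. The class names, the net_margin signal, and source_accessions are all hypothetical, and the figures are the AAPL FY2023 values quoted earlier on this page.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Filing:  # EdgarDB layer: one archived submission
    accession_number: str
    cik: str


@dataclass(frozen=True)
class Fundamental:  # FundamentalsDB layer: normalized values plus their filing
    ticker: str
    report_date: str
    revenue: int
    net_income: int
    filing: Filing


@dataclass(frozen=True)
class Signal:  # QuantDB layer: derived value plus the rows it was computed from
    name: str
    value: float
    inputs: Tuple[Fundamental, ...]


def net_margin(row: Fundamental) -> Signal:
    """Example derived signal: net income divided by revenue."""
    return Signal("net_margin", row.net_income / row.revenue, (row,))


def source_accessions(signal: Signal) -> list:
    """Walk signal -> fundamentals -> filings and return the accession numbers."""
    return [row.filing.accession_number for row in signal.inputs]


aapl_fy2023 = Fundamental(
    ticker="AAPL",
    report_date="2023-09-30",
    revenue=383285000000,
    net_income=96995000000,
    filing=Filing("0000320193-23-000106", "0000320193"),
)
signal = net_margin(aapl_fy2023)
accessions = source_accessions(signal)
```

Because every Signal carries its inputs, no separate lookup table is needed to answer "which filing did this number come from?".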

Engineering

Built for data engineers

OpenFinDB is designed as transparent financial data infrastructure — every component is documented, reproducible, and built on proven open-source tools.

Point-in-time datasets using SEC acceptance timestamps — not report dates

Full XBRL traceability to concepts, filings, and accession numbers

Reproducible pipelines built from raw filings with deterministic processing

Developer-first APIs and Python libraries designed for quant workflows
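
As a rough illustration of what deterministic processing makes possible, the sketch below fingerprints a dataset so that two pipeline runs can be compared byte for byte. The dataset_digest function is a hypothetical helper, and hashing canonical JSON is an assumed approach rather than something the project documents. The rows reuse revenue figures quoted earlier on this page.

```python
import hashlib
import json


def dataset_digest(rows: list) -> str:
    """SHA-256 over a canonical JSON form: sorted rows, sorted keys, fixed separators."""
    ordered = sorted(rows, key=lambda r: (r["ticker"], r["report_date"]))
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, produced in a different row order and key order.
run_a = [
    {"ticker": "AAPL", "report_date": "2023-09-30", "revenue": 383285000000},
    {"ticker": "AAPL", "report_date": "2022-09-24", "revenue": 394328000000},
]
run_b = [
    {"revenue": 394328000000, "report_date": "2022-09-24", "ticker": "AAPL"},
    {"report_date": "2023-09-30", "ticker": "AAPL", "revenue": 383285000000},
]

same_output = dataset_digest(run_a) == dataset_digest(run_b)
```

Canonicalizing before hashing means incidental differences such as row order do not register as changes, while any altered value does.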

Python: Core processing

XBRL: Filing parsing

DuckDB: Analytics queries

Parquet: Storage format

Cloudflare Pages: Edge hosting & global distribution

Use Cases

Who uses OpenFinDB

Designed for builders and researchers who need financial data that is correct, transparent, and accessible.

Quant Developers

Build factor models and backtests with point-in-time fundamentals. Eliminate look-ahead bias at the data layer, not as an afterthought.

Factor model development
Historical backtesting
Fundamental signal research

Fintech Startups

Power financial applications with transparent, traceable datasets. Know exactly where every number comes from — and verify it independently.

Company financial data
Due diligence tooling
Financial dashboard apps

Academic Researchers

Access reproducible financial data linked directly to SEC filings. Full methodology transparency for peer review and replication.

Accounting & finance research
Reproducible studies
Market anomaly analysis

Open Source

Built with an open ecosystem

OpenFinDB includes open-source components that let the community inspect, extend, and contribute to the financial data pipeline. Transparency isn't just a feature — it's the architecture.

View EdgarDB on GitHub

EdgarDB (Open Source)

EDGAR ingestion system — archives raw SEC filings with full metadata and submission history.

XBRL Parser (Coming Soon)

Parsing framework for extracting structured financial data from XBRL-tagged SEC filings.

Data Schemas (Coming Soon)

Documented schemas for FundamentalsDB and QuantDB, enabling community contributions and integrations.

Roadmap

Where we're headed

OpenFinDB is being built incrementally — one transparent layer at a time.

Phase 1 (In Progress)

EdgarDB — Filing Archive

Complete EDGAR ingestion pipeline. Archive all SEC filings with metadata, acceptance timestamps, and structured submission data. Foundation for all subsequent layers.

Filing ingestion, Metadata extraction, XBRL parsing

Phase 2 (Planned)

FundamentalsDB — Financial Data

Normalized financial dataset extraction from SEC filings. Income statement, balance sheet, and cash flow data with full XBRL concept mapping and point-in-time accuracy.

Income statement, Balance sheet, Cash flow

Phase 3 (Future)

QuantDB — Research Datasets

Research-ready datasets and derived signals. Factor datasets, anomaly signals, and Parquet files for offline analysis. Python and SQL APIs for quant research workflows.

Factor datasets, Python API, Parquet downloads

Open financial data infrastructure

Follow the project
