Add ISO 4217 currency column to XBRL facts DataFrame (#850) #851

Open
gaoflow wants to merge 1 commit into dgunning:main from gaoflow:fix-850-currency-iso4217

@gaoflow gaoflow commented Jun 11, 2026

Contributor

Summary

Resolves #850. xbrl().facts.to_dataframe() exposed the raw XBRL unit_ref id for every fact. For non-USD filers that id is an opaque token such as UNIT_STANDARD_HKD_MNUSOXGRF0O9R60JINVDUQ instead of a usable currency, which made currency-based filtering and display unreliable for foreign companies (reported by @warzoo).

What changed

The facts DataFrame now carries a currency column with each fact's ISO 4217 code (e.g. USD, HKD), resolved from the unit's iso4217: measure that the parser already records in xbrl.units. The opaque unit_ref is preserved unchanged; currency is purely additive:

df = filing.xbrl().facts.to_dataframe()
df["unit_ref"]   # "UNIT_STANDARD_HKD_..."  (unchanged)
df["currency"]   # "HKD"                    (new)
  • Per-share monetary units (e.g. iso4217:USD / xbrli:shares) report their numerator currency (USD).
  • Non-monetary units (shares, pure, company-specific) resolve to None rather than a misleading value.
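
For readers who want the resolution rules above in concrete form, here is a minimal standalone sketch. It is not the code in this PR: the helper name currency_from_measure, the plain-dict stand-in for xbrl.units, the unit ids, and the measure string formats are all assumptions made for illustration.

```python
from typing import Optional

ISO4217_PREFIX = "iso4217:"


def currency_from_measure(measure: Optional[str]) -> Optional[str]:
    """Return the ISO 4217 code for a unit measure string, or None.

    Hypothetical helper. Assumes measures are written like "iso4217:HKD",
    "xbrli:shares", or "iso4217:USD / xbrli:shares" (numerator / denominator).
    """
    if not measure:
        return None
    # Per-share units report the numerator currency, so ignore the denominator.
    numerator = measure.split("/", 1)[0].strip()
    if not numerator.startswith(ISO4217_PREFIX):
        # shares, pure, company-specific units: no currency, so None.
        return None
    code = numerator[len(ISO4217_PREFIX):].strip().upper()
    # Accept only three-letter alphabetic codes; anything else resolves to None.
    return code if len(code) == 3 and code.isalpha() else None


# Hypothetical stand-in for xbrl.units: unit id -> measure string.
units = {
    "UNIT_HKD_EXAMPLE": "iso4217:HKD",
    "usdPerShare": "iso4217:USD / xbrli:shares",
    "shares": "xbrli:shares",
}

# Hypothetical fact records; the real ones are rows of the facts DataFrame.
facts = [
    {"concept": "Revenue", "unit_ref": "UNIT_HKD_EXAMPLE"},
    {"concept": "EarningsPerShare", "unit_ref": "usdPerShare"},
    {"concept": "SharesOutstanding", "unit_ref": "shares"},
]

# Additive step: unit_ref is left untouched and a currency key is added.
for fact in facts:
    fact["currency"] = currency_from_measure(units.get(fact["unit_ref"]))
```

Returning None for anything that is not an iso4217: measure mirrors the choice described above: a missing currency is easier to filter out than a fabricated one.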

Verification

tests/issues/regression/test_issue_850.py:

  • Issue repro — a minimal instance using the opaque UNIT_STANDARD_HKD_... id resolves to currency == "HKD" while unit_ref stays opaque.
  • Ground truth — the AAPL 10-K fixture resolves USD-denominated facts (and usdPerShare) to "USD".
  • Silence check — share-count facts have currency of None, not a fabricated code.

All three are red without the change. The existing facts/unit suites (test_xbrl_unit_pointintime.py, test_xbrl_facts.py, …) show no new failures (the remaining failures are pre-existing network-dependent tests). ruff check/format clean on the changed code.

xbrl().facts.to_dataframe() exposed the raw unit_ref id for every fact. For
non-USD filers that id is an opaque token like UNIT_STANDARD_HKD_MNUS... instead
of a usable currency, making currency-based filtering and display unreliable for
foreign companies.

Resolve each fact's unit to its ISO 4217 code (via the parsed iso4217: measure
already available in xbrl.units) and expose it in a new 'currency' column. The
raw unit_ref is preserved; per-share monetary units report their numerator
currency; non-monetary units (shares, pure, custom) resolve to None rather than
a misleading value.

Development

Successfully merging this pull request may close these issues.

Feature Request: Normalize Currency Unit Identifiers in XBRL Facts to ISO 4217 Codes
