Statistical Data and Metadata eXchange (SDMX) for the Python data ecosystem
pandaSDMX is an Apache 2.0-licensed Python library that implements SDMX 2.1
(ISO 17369:2013), a format for the exchange of statistical data and metadata
used by national statistical agencies, central banks, and international
organisations.
pandaSDMX can be used to:
explore the data available from over 20 data providers, such as the World Bank, BIS, ILO, ECB, Eurostat, OECD, UNICEF and the United Nations;
parse data and metadata in SDMX-ML (XML) or SDMX-JSON formats, either:
from local and remote files, or
retrieved from SDMX web services, with query validation and caching;
convert data and metadata into pandas objects, for use with the analysis, plotting, and other tools in the Python data science ecosystem (see also the companion project `intake_sdmx <https://intake_sdmx.readthedocs.io>`__, a plugin for the intake data acquisition and distribution framework);
apply the SDMX Information Model to your own data;
validate SDMX files against the official XML schemas;
…and much more.
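As a rough illustration of the "retrieve, then convert to pandas" workflow listed above, here is a minimal sketch. It assumes that the ECB source is reachable, that its exchange-rate data flow is named `EXR`, and that `CURRENCY` is a valid dimension of that flow; the function name `fetch_ecb_exchange_rates` and its `start_period` argument are hypothetical and exist only for this example.

```python
def fetch_ecb_exchange_rates(start_period="2016"):
    """Query the ECB web service and return the result as a pandas object.

    Illustrative sketch only: the data flow ID, key, and period are
    assumptions, and the call needs network access to succeed.
    """
    # Imported inside the function so the sketch can be loaded and read
    # even on a machine where pandaSDMX is not installed yet.
    import pandasdmx as sdmx

    # Create a client for one of the supported data sources.
    ecb = sdmx.Request("ECB")

    # Request a slice of the (assumed) 'EXR' data flow: two currencies,
    # observations from start_period onwards.
    message = ecb.data(
        "EXR",
        key={"CURRENCY": ["USD", "JPY"]},
        params={"startPeriod": start_period},
    )

    # Convert the SDMX data message into pandas objects.
    return sdmx.to_pandas(message)
```

Calling `fetch_ecb_exchange_rates()` from an interactive session should return a pandas object indexed by the dimensions of the data set, provided the service is reachable and the assumptions above hold.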
Assuming Python 3.9 or higher is installed on your system,
you can install pandaSDMX either with pip from the command prompt:
$ pip install pandasdmx
or from a conda environment:
$ conda install pandasdmx -c conda-forge
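To confirm that the installation worked, one option is to ask Python's standard library for the installed version. The helper below is a hypothetical convenience written for this page, not part of pandaSDMX; it assumes the distribution is registered under the name used in the install commands above.

```python
from importlib import metadata


def installed_version(distribution="pandasdmx"):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


version = installed_version()
if version is None:
    print("pandaSDMX does not appear to be installed in this environment.")
else:
    print(f"pandaSDMX {version} is installed.")
```

If the helper reports that the package is missing, check that the interpreter you are running belongs to the environment where you ran pip or conda.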
Next, look at a usage example in only 10 lines of code. Then dive into the longer, narrative walkthrough and finally peruse the more advanced chapters as needed.
Bear in mind that SDMX was designed to be flexible enough to accommodate almost any data. This also means it is complex, with many abstract concepts for describing data, metadata, and their relationships. These are called the “SDMX Information Model” (IM).
This documentation gently explains the functionality provided by pandaSDMX
itself, enabling you to make the best use of all supported data sources.
A working understanding of the IM is conveyed along the way.
If you want to go deeper, follow the list of resources and references,
or read the API documentation and implementation notes for the modules
that implement the IM, including pandasdmx.message.
pandaSDMX user guide
- Data sources
  - Data source limitations
  - ABS: Australian Bureau of Statistics
  - ABS_XML: Australian Bureau of Statistics - XML-based API
  - BBK: Bundesbank (German Central Bank)
  - BIS: Bank for International Settlements
  - EC_COMP: European Commission - DG Competition
  - EC_EMPL: European Commission - DG Employment
  - EC_GROW: European Commission - DG Growth
  - ECB: European Central Bank
  - ILO: International Labour Organization
  - IMF: International Monetary Fund’s “SDMX Central” source
  - INEGI: National Institute of Statistics and Geography (Mexico)
  - INSEE: National Institute of Statistics and Economic Studies (France)
  - ISTAT: National Institute of Statistics (Italy)
  - LSD: National Institute of Statistics (Lithuania)
  - NB: Norges Bank (Norway)
  - NBB: National Bank of Belgium
  - OECD: Organisation for Economic Cooperation and Development
  - SGR: SDMX Global Registry
  - SPC: Pacific Data Hub
  - STAT_EE: Statistics Estonia
  - UNSD: United Nations Statistics Division
  - UNICEF: UN International Children’s Emergency Fund
  - CD2030: Countdown 2030
  - WB: World Bank Group’s “World Integrated Trade Solution”
  - WB_WDI: World Bank Group’s “World Development Indicators”
- API reference
- Implementation notes
- How to…
- What’s new?
- Development roadmap
Contributing to pandaSDMX and getting help
Report bugs, suggest features or view the source code on GitHub.
The sdmx-python Google Group and mailing list may have answers for some questions.