pandaSDMX: Statistical Data and Metadata eXchange in Python
=============================================================

pandaSDMX is an Apache 2.0-licensed Python client to retrieve statistical
data and metadata disseminated in SDMX 2.1, an ISO standard widely used by
institutions such as statistics offices, central banks, and international
organisations. pandaSDMX exposes datasets and related structural metadata,
including dataflows, code lists, and datastructure definitions, as pandas
Series or multi-indexed DataFrames. Many other output formats and storage
backends are available thanks to Odo.

Supported data providers
----------------------------

pandaSDMX ships with built-in support for the following agencies (others may
be configured by the user):

* Australian Bureau of Statistics (ABS)
* European Central Bank (ECB)
* Eurostat
* French National Institute for Statistics (INSEE)
* Instituto Nacional de Estadística y Geografía - INEGI (Mexico)
* International Monetary Fund (IMF) - SDMX Central only
* International Labour Organization (ILO)
* Italian Statistics Office (ISTAT)
* Norges Bank (Norway)
* Organisation for Economic Cooperation and Development (OECD)
* United Nations Statistics Division (UNSD)
* UNESCO (free registration required)
* World Bank - World Integrated Trade Solution (WITS)

Main features
---------------------

* support for many SDMX 2.1 features
* SDMXML and SDMXJSON formats
* pythonic representation of the SDMX information model
* validation of column selections against code lists and content
  constraints, if available, when requesting datasets
* export of data and structural metadata such as code lists as multi-indexed
  pandas DataFrames or Series, and to many other formats as well as database
  backends via Odo
* reading and writing of SDMX messages from and to files
* configurable HTTP connections
* support for requests-cache to cache SDMX messages in memory, MongoDB,
  Redis or SQLite
* extensible through custom readers and writers for alternative input and
  output formats
* growing test suite

Example
---------

Suppose we want to analyze annual unemployment data for some European
countries. All we need to know in advance is the data provider, Eurostat.
pandaSDMX makes it easy to search the directory of dataflows and to analyze
the complete structural metadata about the datasets available through the
selected dataflow. We will skip this step here. The impatient reader may
directly jump to :ref:`basic-usage`.

The dataflow with the ID 'une_rt_a' contains the data we want. The dataflow
definition references the datastructure definition, which contains or
references all the metadata describing the data sets available through this
dataflow: the dimensions, concept schemes, and corresponding code lists.

.. ipython:: python

    from pandasdmx import Request
    estat = Request('ESTAT')
    # Download the metadata and expose it as a dict
    # mapping resource names to pandas DataFrames
    flow_response = estat.dataflow('une_rt_a')
    structure_response = flow_response.dataflow.une_rt_a.structure(request=True, target_only=False)
    # Show some code lists.
    structure_response.write().codelist.loc['GEO'].head()

Next we download a dataset. We use codes from the code list 'GEO' to obtain
data on Greece, Ireland and Spain only.

.. ipython:: python

    resp = estat.data('une_rt_a', key={'GEO': 'EL+ES+IE'}, params={'startPeriod': '2007'})
    # We use a generator expression to select some columns
    # and write them to a pandas DataFrame
    data = resp.write(s for s in resp.data.series if s.key.AGE == 'TOTAL')
    # Explore the data set. First, show dimension names
    data.columns.names
    # and corresponding dimension values
    data.columns.levels
    # Show aggregate unemployment rates across ages and sexes as
    # percentage of active population
    data.loc[:, ('PC_ACT', 'TOTAL', 'T')]

Quick install
-----------------

* ``pip install pandasdmx``

pandaSDMX Links
-------------------------------

* PyPI
* GitHub
* Google group
* Official SDMX website

Table of contents
---------------------

.. toctree::
    :maxdepth: 2
    :numbered:

    whatsnew
    faq
    intro
    sdmx_tour
    usage
    agencies
    advanced
    API documentation
    contributing
    license

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`