3. Getting started

3.1. Installation

3.1.1. Prerequisites

pandaSDMX is a pure Python package and should therefore run on any platform. It requires Python 2.7, or Python 3.4 or higher.

It is recommended to use one of the common Python distributions for scientific data analysis.

Along with a current Python interpreter, these distributions include many useful packages for data analysis. For other Python distributions (not only scientific) see here.

pandaSDMX has the following dependencies:

  • the data analysis library pandas, which itself depends on a number of packages
  • the HTTP library requests
  • lxml for XML processing
  • jsonpath-rw for JSON processing
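
To see at a glance whether the required dependencies are present, a small check along the following lines may help. It is only a sketch: it assumes a Python 3 interpreter and assumes that the four packages are importable under the module names pandas, requests, lxml and jsonpath_rw. The helper name missing_modules is made up for this illustration and is not part of pandaSDMX.

```python
import importlib.util

# Import names assumed for the four required dependencies listed above.
REQUIRED_MODULES = ("pandas", "requests", "lxml", "jsonpath_rw")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names from `names` that cannot be located."""
    # find_spec() returns None for a top-level module that is not installed;
    # it locates the module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All required dependencies were found.")
```

Normally pip resolves these dependencies on its own, so the check is mainly useful when diagnosing a broken environment.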

3.1.2. Optional dependencies

  • requests-cache, which allows caching SDMX messages in memory, MongoDB, Redis and more
  • odo for fancy data conversion and database export
  • IPython, which is required to build the Sphinx documentation. To do this, check out the pandaSDMX repository on GitHub.
  • py.test to run the test suite

3.1.3. Download

From the command line of your OS, issue

pip install pandasdmx

Installation with conda is currently not supported.

Of course, you can also download the tarball from PyPI and run python setup.py install from the package directory.
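
After installing, it can be reassuring to confirm that the package is visible to the interpreter. The following is a minimal sketch that assumes Python 3.8 or later (for importlib.metadata) and assumes the distribution is registered under the name used in the pip command above. The helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(distribution="pandasdmx"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not known to this interpreter.
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("pandasdmx does not appear to be installed.")
    else:
        print("pandasdmx", version, "is installed.")
```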

3.2. Running the test suite

The test suite is contained in the source distribution. It is recommended to run the tests with py.test.
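
One possible way to start the tests from a script is sketched below. It assumes that py.test is installed and can be invoked as python -m pytest, and that source_dir points at an unpacked source distribution. Both helper names and the directory name in the usage lines are placeholders, not something pandaSDMX prescribes.

```python
import subprocess
import sys


def build_test_command(source_dir):
    """Return the command line that runs py.test on `source_dir`."""
    # sys.executable makes sure the tests run under the same interpreter,
    # and therefore against the same installed dependencies, as this script.
    return [sys.executable, "-m", "pytest", str(source_dir)]


def run_test_suite(source_dir):
    """Run the tests and return py.test's exit code (0 means success)."""
    return subprocess.call(build_test_command(source_dir))


if __name__ == "__main__":
    # "pandaSDMX-src" is a placeholder for wherever the tarball was unpacked.
    sys.exit(run_test_suite("pandaSDMX-src"))
```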

3.3. Package overview

Modules

api
module containing the API to make queries to SDMX web services, validate keys (filters) etc. See pandasdmx.api.Request, in particular its get method. pandasdmx.api.Request.get() returns pandasdmx.api.Response instances.
model
implements the SDMX information model.
remote
contains a wrapper class around requests for HTTP. It is called by pandasdmx.api.Request.get() to make HTTP requests to SDMX services. It can also read SDMXML files instead of retrieving them over the web.
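
To illustrate how these modules fit together, here is a rough sketch of a query for the data-flow definitions of one data provider. Several details are assumptions rather than facts taken from this chapter: that Request can be imported from the top-level package, that it accepts an agency ID such as "ECB", that get() takes a resource_type keyword, and that the returned Response offers a write() method. The function name list_dataflows is invented for the example, and running it requires network access.

```python
def list_dataflows(agency_id="ECB"):
    """Fetch the data-flow definitions of one agency (sketch only)."""
    # Imported inside the function so that this file can still be loaded
    # on a machine where pandaSDMX is not installed.
    from pandasdmx import Request

    # Assumption: the constructor accepts the ID of a supported agency.
    request = Request(agency_id)

    # Assumption: get() accepts resource_type; per the module description
    # above it returns a pandasdmx.api.Response instance.
    response = request.get(resource_type="dataflow")

    # Assumption: write() hands the message to the default writer.
    return response.write()


if __name__ == "__main__":
    print(list_dataflows())
```

The Basic usage chapter remains the authoritative description of the query interface; the sketch only shows the overall shape of a request.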

Subpackages

reader
reads SDMX files and instantiates the appropriate classes from pandasdmx.model. There are currently two readers: one for XML-based SDMXML 2.1 and one for SDMX-JSON 2.1.
writer
contains writer classes transforming SDMX artefacts into other formats or writing them to arbitrary destinations such as databases. As of v0.6.0, two writers are available:

  • ‘data2pandas’ exports a dataset or portions thereof to a pandas DataFrame or Series.
  • ‘structure2pd’ exports structural metadata, such as lists of data-flow definitions, code-lists and concept-schemes contained in a structural SDMX message, as a dict mapping resource names (e.g. ‘dataflow’, ‘codelist’) to pandas DataFrames.
utils
utility functions and classes. Contains a wrapper around dict allowing attribute access to dict items.
tests
unit tests and sample files
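
The dict wrapper mentioned under utils can be pictured roughly as follows. This is a generic sketch of the technique, not the class shipped with pandaSDMX: the name AttrDict and the sample values are made up for illustration.

```python
class AttrDict(dict):
    """A dict whose items can also be read and set as attributes."""

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so regular dict methods such as keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Store attribute assignments as ordinary dict items.
        self[name] = value


# Placeholder values stand in for real SDMX artefacts.
resources = AttrDict(dataflow="flows", codelist="codes")
resources.conceptscheme = "concepts"
```

With such a wrapper, resources.dataflow and resources["dataflow"] refer to the same item, which keeps interactive exploration short.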

3.4. What next?

The following chapters explain the key characteristics of SDMX, demonstrate the basic usage of pandaSDMX and provide additional information on some advanced topics. Users who are new to SDMX are likely to benefit a lot from reading the next chapter on SDMX, but normal use of pandaSDMX does not strictly require it. The Basic usage chapter should enable you to retrieve datasets and write them to pandas DataFrames. If you want to exploit the full richness of the information model, or simply feel more comfortable knowing what happens behind the scenes, the SDMX introduction is for you. It also contains links to reference materials on SDMX.