API reference

See also the Implementation notes.

Top-level methods and classes

Statistical Data and Metadata eXchange (SDMX) for the Python data ecosystem

class pandasdmx.Request(source=None, log_level=None, session=None, **session_opts)[source]

Client for an SDMX REST web service.

Parameters
  • source (str or pandasdmx.source.Source) – Identifier of a data source. If a string, must be one of the known sources in list_sources().

  • log_level (int) – Override the package-wide logger with one of the standard logging levels.

  • session (requests.Session, optional) – Instance of requests.Session, typically a subclass. If given, it is used for HTTP requests, and any session_opts passed will raise TypeError. One use case is the injection of alternative caching libraries such as Cache Control.

  • **session_opts – Additional keyword arguments are passed to pandasdmx.remote.Session.

get(resource_type=None, resource_id=None, tofile=None, use_cache=False, dry_run=False, dsd=None, **kwargs)[source]

Retrieve SDMX data or metadata.

(Meta)data is retrieved from the source of the current Request. The item(s) to retrieve can be specified in one of two ways:

  1. resource_type, resource_id: These give the type (see Resource) and, optionally, ID of the item(s). If the resource_id is not given, all items of the given type are retrieved.

  2. a resource object, i.e. a MaintainableArtefact: resource_type and resource_id are determined by the object’s class and id attribute, respectively.

Data is retrieved with resource_type=’data’. In this case, the optional keyword argument key can be used to constrain the data that is retrieved. Examples of the formats for key:

  1. {'GEO': ['EL', 'ES', 'IE']}: dict with dimension name(s) mapped to an iterable of allowable values.

  2. {'GEO': 'EL+ES+IE'}: dict with dimension name(s) mapped to strings joining allowable values with ‘+’, the logical ‘or’ operator for SDMX web services.

  3. '....EL+ES+IE': str in which ordered dimension values (some empty, '') are joined with '.'. Using this form requires knowledge of the dimension order in the target data resource_id; in the example, dimension ‘GEO’ is the fifth of five dimensions: '.'.join(['', '', '', '', 'EL+ES+IE']). CubeRegion.to_query_string can also be used to create properly formatted strings.

For formats 1 and 2, but not 3, the key argument is validated against the relevant DataStructureDefinition, either given with the dsd keyword argument, or retrieved from the web service before the main query.
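The relationship between the dict formats (1 and 2) and the string format (3) can be sketched as follows. This is only an illustration of the string layout described above: the helper key_to_string and the dimension names other than 'GEO' are hypothetical and not part of pandaSDMX, and no validation against a real DataStructureDefinition takes place.

```python
def key_to_string(key, dimension_order):
    """Join a dict key into the '.'-separated form (format 3).

    Hypothetical helper for illustration; not part of pandaSDMX.
    """
    unknown = set(key) - set(dimension_order)
    if unknown:
        raise ValueError(f"unknown dimension(s): {sorted(unknown)}")

    parts = []
    for dim in dimension_order:
        values = key.get(dim, "")
        if not isinstance(values, str):
            # Format 1: an iterable of allowable values
            values = "+".join(values)
        parts.append(values)
    return ".".join(parts)


# Assumed dimension order; only 'GEO' comes from the example above
order = ["FREQ", "UNIT", "SEX", "AGE", "GEO"]
as_list = key_to_string({"GEO": ["EL", "ES", "IE"]}, order)
as_str = key_to_string({"GEO": "EL+ES+IE"}, order)
print(as_list)
```

Both dict forms yield the same string, with one empty slot for every dimension that is left unconstrained.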

For the optional params keyword argument, some useful parameters are:

  • ‘startperiod’, ‘endperiod’: restrict the time range of data to retrieve.

  • ‘references’: control which item(s) related to a metadata resource are retrieved, e.g. references=’parentsandsiblings’.
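As a rough picture of how such parameters could end up in a query URL, the sketch below URL-encodes a mapping with the standard library. The period values are placeholders, and the parameter names simply follow the spelling used in the list above; the names and values a given service accepts are defined by the SDMX REST guidelines, not by this example.

```python
from urllib.parse import urlencode

# Placeholder values; names follow the spelling used in the list above
params = {"startperiod": "2015", "endperiod": "2018"}

# Encode the mapping as a URL query string
query = urlencode(params)
print(query)
```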

Parameters
  • resource_type (str or Resource, optional) – Type of resource to retrieve.

  • resource_id (str, optional) – ID of the resource to retrieve.

  • tofile (str or PathLike or file-like object, optional) – File path or file-like to write SDMX data as it is received.

  • use_cache (bool, optional) – If True, return a previously retrieved Message from cache, or update the cache with a newly-retrieved Message.

  • dry_run (bool, optional) – If True, prepare and return a requests.Request object, but do not execute the query. The prepared URL and headers can be examined by inspecting the returned object.

  • dsd (DataStructureDefinition) – Existing object used to validate the key argument. If not provided, an additional query is executed to retrieve a DSD in order to validate the key.

  • **kwargs – Other, optional parameters (below).

Other Parameters
  • force (bool) – If True, execute the query even if the source does not support queries for the given resource_type. Default: False.

  • headers (dict) – HTTP headers. Given headers will overwrite instance-wide headers passed to the constructor. Default: None to use the default headers of the source.

  • key (str or dict) – For queries with resource_type=’data’. str values are not validated; dict values are validated using make_constraint().

  • params (dict) – Query parameters. The SDMX REST web service guidelines describe parameters and allowable values for different queries. params is not validated before the query is executed.

  • provider (str) – ID of the agency providing the data or metadata. Default: ID of the source agency.

    An SDMX web service is a ‘data source’ operated by a specific, ‘source’ agency. A web service may host data or metadata originally published by one or more ‘provider’ agencies. Many sources are also providers. Other agencies (e.g. the SDMX Global Registry) simply aggregate (meta)data from other providers, but do not provide any (meta)data themselves.

  • resource (MaintainableArtefact subclass) – Object to retrieve. If given, resource_type and resource_id are ignored.

  • version (str) – Version of a resource to retrieve. Default: the keyword ‘latest’.

Returns

The requested SDMX message or, if dry_run is True, the prepared request object.

Return type

Message or Request

Raises

NotImplementedError – If the source does not support the given resource_type and force is not True.

preview_data(flow_id, key={})[source]

Return a preview of data.

For the Dataflow flow_id, return all series keys matching key. preview_data() uses a feature supported by some data providers that returns SeriesKeys without the corresponding Observations.

To count the number of series:

import pandasdmx as sdmx

keys = sdmx.Request('PROVIDER').preview_data('flow')
len(keys)

To get a pandas object containing the key values:

keys_df = sdmx.to_pandas(keys)
Parameters
  • flow_id (str) – Dataflow to preview.

  • key (dict, optional) – Mapping of dimension to values, where values may be a ‘+’-delimited list of values. If given, only SeriesKeys that match key are returned. If not given, preview_data is equivalent to list(req.series_keys(flow_id)).

Returns

Return type

list of SeriesKey
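The way preview_data() narrows series keys by key can be pictured with a small stand-alone sketch. Plain dicts stand in for SeriesKey objects, the data are made up, and the helper matches is hypothetical; this shows the matching rule described above, not the library's implementation.

```python
def matches(series_key, key):
    """Return True if series_key satisfies every constraint in key.

    series_key: dict of dimension ID -> single value (stand-in for SeriesKey).
    key: dict of dimension ID -> '+'-delimited string of allowed values.
    """
    return all(
        series_key.get(dim) in allowed.split("+")
        for dim, allowed in key.items()
    )


# Made-up series keys for illustration
all_keys = [
    {"FREQ": "A", "GEO": "EL"},
    {"FREQ": "A", "GEO": "DE"},
    {"FREQ": "M", "GEO": "ES"},
]

selected = [sk for sk in all_keys if matches(sk, {"GEO": "EL+ES"})]
print(len(selected))
```

With an empty key, every series key matches, which mirrors the statement that omitting key returns all series keys.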

series_keys(flow_id, use_cache=True)[source]

Return all pandasdmx.model.SeriesKey for flow_id.

Returns

Return type

list

session = None

Session for queries sent from the instance.

source = None

source.Source for requests sent from the instance.

class pandasdmx.Resource[source]

Enumeration of SDMX REST API endpoints.

Enum member          pandasdmx.model class
categoryscheme       CategoryScheme
codelist             Codelist
conceptscheme        ConceptScheme
data                 DataSet
dataflow             DataflowDefinition
datastructure        DataStructureDefinition
provisionagreement   ProvisionAgreement

pandasdmx.list_sources()[source]

Return a sorted list of valid source IDs.

These can be used to create Request instances.

pandasdmx.read_sdmx(filename_or_obj, format=None, **kwargs)[source]

Load an SDMX-ML or SDMX-JSON message from a file or file-like object.

Parameters
  • filename_or_obj (str or PathLike or file) –

  • format ('XML' or 'JSON', optional) –

Other Parameters

dsd (DataStructureDefinition) – For “structure-specific” format='XML' messages only.

pandasdmx.read_url(url, **kwargs)[source]

Request a URL directly.

pandasdmx.logger = <Logger pandasdmx (ERROR)>

Top-level logger for pandaSDMX. By default, messages at the log level ERROR or greater are printed to sys.stderr. These include the web service query details (URL and headers) used by Request.

To debug requests to web services, set to a more permissive level:

import logging

import pandasdmx as sdmx

sdmx.logger.setLevel(logging.DEBUG)

New in version 0.4.

message: SDMX messages

Classes for SDMX messages.

Message and related classes are not defined in the SDMX information model, but in the SDMX-ML standard.

pandaSDMX also uses DataMessage to encapsulate SDMX-JSON data returned by data sources.

class pandasdmx.message.DataMessage[source]

Bases: pandasdmx.message.Message

Data Message.

Note

A DataMessage may contain zero or more DataSet, so data is a list. To retrieve the first (and possibly only) data set in the message, access the first element of the list: msg.data[0].

data: List[DataSet] = None

list of DataSet.

dataflow: DataflowDefinition = None

DataflowDefinition that contains the data.

observation_dimension: Union[_AllDimensions, DimensionComponent, List[DimensionComponent]] = None

The “dimension at observation level”.

property structure

DataStructureDefinition used in the dataflow.

class pandasdmx.message.ErrorMessage[source]

Bases: pandasdmx.message.Message

footer = None
header = None
response = None
class pandasdmx.message.Footer[source]

Bases: pandasdmx.util.BaseModel

Footer of an SDMX-ML message.

SDMX-JSON messages do not have footers.

code: int = None
severity: Text = None
text: List[InternationalString] = None

The body text of the Footer contains zero or more blocks of text.

class pandasdmx.message.Header[source]

Bases: pandasdmx.util.BaseModel

Header of an SDMX-ML message.

SDMX-JSON messages do not have headers.

error: Text = None

(optional) Error code for the message.

id: Text = None

Identifier for the message.

prepared: Text = None

Date and time at which the message was generated.

receiver: Text = None

Intended recipient of the message, e.g. the user’s name for an authenticated service.

sender: Union[Item, Text] = None

The Agency associated with the data Source.

class pandasdmx.message.Message[source]

Bases: pandasdmx.util.BaseModel

class Config[source]

Bases: object

arbitrary_types_allowed = True
footer: Footer = None

(optional) Footer instance.

header: Header = None

Header instance.

response: Response = None

requests.Response instance for the response to the HTTP request that returned the Message. This is not part of the SDMX standard.

to_pandas(*args, **kwargs)[source]

Convert a Message instance to pandas object(s).

pandasdmx.writer.write() is called and passed the Message instance as first argument, followed by any args and kwargs.

See also

write()

write(*args, **kwargs)[source]

Alias for to_pandas improving backwards compatibility.

Deprecated since version 1.0: Use to_pandas() instead.

class pandasdmx.message.StructureMessage[source]

Bases: pandasdmx.message.Message

category_scheme: DictLike[str, CategoryScheme] = None

Collection of CategoryScheme.

codelist: DictLike[str, Codelist] = None

Collection of Codelist.

concept_scheme: DictLike[str, ConceptScheme] = None

Collection of ConceptScheme.

constraint: DictLike[str, ContentConstraint] = None

Collection of ContentConstraint.

dataflow: DictLike[str, DataflowDefinition] = None

Collection of DataflowDefinition.

organisation_scheme: DictLike[str, AgencyScheme] = None

Collection of AgencyScheme.

provisionagreement: DictLike[str, ProvisionAgreement] = None

Collection of ProvisionAgreement.

structure: DictLike[str, DataStructureDefinition] = None

Collection of DataStructureDefinition.

model: SDMX Information Model

SDMX Information Model (SDMX-IM).

This module implements many of the classes described in the SDMX-IM specification (‘spec’), which is available from:

Details of the implementation:

  • Python typing and pydantic are used to enforce the types of attributes that reference instances of other classes.

  • Some classes have convenience attributes not mentioned in the spec, to ease navigation between related objects. These are marked “pandaSDMX extension not in the IM.”

  • Class definitions are grouped by section of the spec, but these sections appear out of order so that dependent classes are defined first.

class pandasdmx.model.ActionType

Bases: str, enum.Enum

An enumeration.

append = '3'
delete = '1'
information = '4'
replace = '2'
pandasdmx.model.Agency

alias of pandasdmx.model.Organisation

class pandasdmx.model.AgencyScheme[source]

Bases: pandasdmx.model.ItemScheme

items: Dict[str, _item_type] = None
class pandasdmx.model.AnnotableArtefact[source]

Bases: pandasdmx.util.BaseModel

annotations: List[Annotation] = None

Annotations of the object.

pandaSDMX implementation: The IM does not specify the name of this feature.

class pandasdmx.model.Annotation[source]

Bases: pandasdmx.util.BaseModel

id: str = None

Can be used to disambiguate multiple annotations for one AnnotableArtefact.

text: InternationalString = None

Content of the annotation.

title: str = None

Title, used to identify an annotation.

type: str = None

Specifies how the annotation is processed.

url: str = None

A link to external descriptive text.

class pandasdmx.model.AttachmentConstraint[source]

Bases: pandasdmx.model.Constraint

attachment: Set[ConstrainableArtefact] = None
class pandasdmx.model.AttributeDescriptor[source]

Bases: pandasdmx.model.ComponentList

components: List[DataAttribute] = None
class pandasdmx.model.AttributeRelationship[source]

Bases: pandasdmx.util.BaseModel

dimensions: List[Dimension] = None
group_key: 'GroupDimensionDescriptor' = None
class pandasdmx.model.AttributeValue(*args, **kwargs)[source]

Bases: pandasdmx.util.BaseModel

SDMX-IM AttributeValue.

In the spec, AttributeValue is an abstract class. Here, it serves as both the concrete subclasses CodedAttributeValue and UncodedAttributeValue.

start_date: Optional[date] = None
value: Union[str, Code] = None
value_for: DataAttribute = None
class pandasdmx.model.Categorisation[source]

Bases: pandasdmx.model.MaintainableArtefact

artefact: IdentifiableArtefact = None
category: Category = None
class pandasdmx.model.Category(*args, **kwargs)[source]

Bases: pandasdmx.model.Item

SDMX-IM Category.

child = None
parent = None
class pandasdmx.model.CategoryScheme[source]

Bases: pandasdmx.model.ItemScheme

items: Dict[str, _item_type] = None
class pandasdmx.model.Code(*args, **kwargs)[source]

Bases: pandasdmx.model.Item

SDMX-IM Code.

child = None
parent = None
class pandasdmx.model.Codelist[source]

Bases: pandasdmx.model.ItemScheme

items: Dict[str, _item_type] = None
class pandasdmx.model.Component[source]

Bases: pandasdmx.model.IdentifiableArtefact

concept_identity: Concept = None
local_representation: Representation = None
class pandasdmx.model.ComponentList[source]

Bases: pandasdmx.model.IdentifiableArtefact

append(value)[source]

Append value to components.

components: List[Component] = None
get(id, cls=None, **kwargs)[source]

Return or create the component with the given id.

Parameters
  • id (str) – Component ID.

  • cls (type, optional) – Hint for the class of a new object.

  • kwargs – Passed to the constructor of Component, or a Component subclass if components is overridden in a subclass of ComponentList.

class pandasdmx.model.ComponentValue[source]

Bases: pandasdmx.util.BaseModel

value: str = None
value_for: Component = None
class pandasdmx.model.Concept(*args, **kwargs)[source]

Bases: pandasdmx.model.Item

core_representation: Representation = None
iso_concept: ISOConceptReference = None
class pandasdmx.model.ConceptScheme[source]

Bases: pandasdmx.model.ItemScheme

items: Dict[str, _item_type] = None
class pandasdmx.model.ConstrainableArtefact[source]

Bases: pandasdmx.util.BaseModel

SDMX-IM ConstrainableArtefact.

class pandasdmx.model.Constraint[source]

Bases: pandasdmx.model.MaintainableArtefact

class Config[source]

Bases: object

validate_assignment = False
data_content_keys: Optional[DataKeySet] = None

DataKeySet included in the Constraint.

role: ConstraintRole = None
class pandasdmx.model.ConstraintRole[source]

Bases: pandasdmx.util.BaseModel

role: ConstraintRoleType = None
class pandasdmx.model.ConstraintRoleType

Bases: enum.Enum

An enumeration.

actual = 2
allowable = 1
class pandasdmx.model.Contact[source]

Bases: pandasdmx.util.BaseModel

Organization contact information.

IMF is the only data provider that returns messages with Contact information. These differ from the IM in several ways. This class reflects these differences:

  • ‘name’ and ‘org_unit’ are InternationalString, instead of strings.

  • ‘email’ may be a list of e-mail addresses, rather than a single address.

  • ‘uri’ may be a list of URIs, rather than a single URI.

email: List[str] = None
name: InternationalString = None
org_unit: InternationalString = None
responsibility: InternationalString = None
telephone: str = None
uri: List[str] = None
class pandasdmx.model.ContentConstraint[source]

Bases: pandasdmx.model.Constraint

class Config[source]

Bases: object

validate_assignment_exclude = 'data_content_region'
content: Set[ConstrainableArtefact] = None
data_content_region: List[CubeRegion] = None

CubeRegions included in the ContentConstraint.

to_query_string(structure)[source]
class pandasdmx.model.CubeRegion[source]

Bases: pandasdmx.util.BaseModel

included: bool = None
member: Dict['Dimension', MemberSelection] = None
to_query_string(structure)[source]
class pandasdmx.model.DataAttribute[source]

Bases: pandasdmx.model.Component

related_to: AttributeRelationship = None
usage_status: UsageStatus = None
class pandasdmx.model.DataKey[source]

Bases: pandasdmx.util.BaseModel

included: bool = None

True if the keys are included in the Constraint; False if they are excluded.

key_value: Dict[Component, ComponentValue] = None

Mapping from Component to ComponentValue comprising the key.

class pandasdmx.model.DataKeySet[source]

Bases: pandasdmx.util.BaseModel

included: bool = None

True if the keys are included in the Constraint; False if they are excluded.

keys: List[DataKey] = None

DataKeys appearing in the set.

class pandasdmx.model.DataProvider(*args, **kwargs)[source]

Bases: pandasdmx.model.Organisation

SDMX-IM DataProvider.

contact = None
class pandasdmx.model.DataProviderScheme[source]

Bases: pandasdmx.model.ItemScheme

items: Dict[str, _item_type] = None
class pandasdmx.model.DataSet[source]

Bases: pandasdmx.model.AnnotableArtefact

action: ActionType = None
add_obs(observations, series_key=None)[source]

Add observations to a series with series_key.

Checks consistency and adds group associations.

attrib: DictLike[str, AttributeValue] = None
group: DictLike[GroupKey, List[Observation]] = None

Map of group key → list of observations. pandaSDMX extension not in the IM.

obs: List[Observation] = None

All observations in the DataSet.

series: DictLike[SeriesKey, List[Observation]] = None

Map of series key → list of observations. pandaSDMX extension not in the IM.
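The shape of such a mapping can be illustrated with the standard library alone. In this sketch, tuples stand in for SeriesKey objects and the observations are made up; it shows the general idea of grouping observations under a series key, not how DataSet populates series.

```python
from collections import defaultdict

# Made-up observations as (series key, time period, value) tuples
observations = [
    (("A", "EL"), "2019", 1.0),
    (("A", "EL"), "2020", 1.5),
    (("A", "ES"), "2019", 2.0),
]

# Group observations under their series key, preserving input order
series = defaultdict(list)
for series_key, period, value in observations:
    series[series_key].append((period, value))

print(len(series[("A", "EL")]))
```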

structured_by: DataStructureDefinition = None
valid_from: str = None
class pandasdmx.model.DataStructureDefinition[source]

Bases: pandasdmx.model.Structure, pandasdmx.model.ConstrainableArtefact

Defines a data structure. Referred to as “DSD”.

attribute(id, **kwargs)[source]

Call ComponentList.get() on attributes.

attributes: AttributeDescriptor = None

An AttributeDescriptor that describes the attributes of the data structure.

dimension(id, **kwargs)[source]

Call ComponentList.get() on dimensions.

dimensions: DimensionDescriptor = None

A DimensionDescriptor that describes the dimensions of the data structure.

classmethod from_keys(keys)[source]

Return a new DSD given some keys.

The DSD’s dimensions attribute refers to a set of new Concepts and Codelists, created to represent all the values observed across keys for each dimension.

Parameters

keys (iterable of Key) – or of subclasses such as SeriesKey or GroupKey.

group_dimensions: DictLike[str, GroupDimensionDescriptor] = None

A GroupDimensionDescriptor.

make_constraint(key)[source]

Return a constraint for key.

key is a dict wherein:

  • keys are str ids of Dimensions appearing in this DSD’s dimensions, and

  • values are ‘+’-delimited str containing allowable values, or iterables of str, each an allowable value.

For example:

cc2 = dsd.make_constraint({'foo': 'bar+baz', 'qux': 'q1+q2+q3'})

cc2 includes any key where the ‘foo’ dimension is ‘bar’ or ‘baz’, and the ‘qux’ dimension is one of ‘q1’, ‘q2’, or ‘q3’.

Returns

A constraint with one CubeRegion in its data_content_region, including only the values appearing in key.

Return type

ContentConstraint

Raises

ValueError – if key contains a dimension ID not appearing in dimensions.
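The two accepted value forms and the ValueError condition for make_constraint() can be mimicked in a few lines. The function parse_key below is a hypothetical stand-in that returns plain sets rather than a ContentConstraint; it is a sketch of the input handling described above, not the pandaSDMX implementation.

```python
def parse_key(key, dimension_ids):
    """Normalise a key dict to {dimension ID: set of allowed values}.

    Hypothetical stand-in for illustration; not the pandaSDMX implementation.
    """
    unknown = [dim for dim in key if dim not in dimension_ids]
    if unknown:
        raise ValueError(f"dimension(s) not in DSD: {unknown}")

    result = {}
    for dim, values in key.items():
        if isinstance(values, str):
            # '+'-delimited string of allowable values
            values = values.split("+")
        result[dim] = set(values)
    return result


allowed = parse_key({"foo": "bar+baz", "qux": ["q1", "q2", "q3"]}, ["foo", "qux"])
print(allowed)
```

Passing a dimension ID that is not in the given list raises ValueError, in line with the Raises entry above.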

measures: MeasureDescriptor = None

A MeasureDescriptor.

class pandasdmx.model.DataflowDefinition[source]

Bases: pandasdmx.model.StructureUsage, pandasdmx.model.ConstrainableArtefact

structure: DataStructureDefinition = None
class pandasdmx.model.Datasource[source]

Bases: pandasdmx.util.BaseModel

url: str = None
class pandasdmx.model.Dimension[source]

Bases: pandasdmx.model.DimensionComponent

SDMX-IM Dimension.

order = None
class pandasdmx.model.DimensionComponent[source]

Bases: pandasdmx.model.Component

order: Optional[int] = None
class pandasdmx.model.DimensionDescriptor[source]

Bases: pandasdmx.model.ComponentList

Describes a set of dimensions.

IM: “An ordered set of metadata concepts that, combined, classify a statistical series, and whose values, when combined (the key) in an instance such as a data set, uniquely identify a specific observation.”

assign_order()[source]

Assign the DimensionComponent.order attribute.

The Dimensions in components are numbered, starting from 1.
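A minimal sketch of this numbering, assuming nothing beyond what is stated above: the Dim dataclass is a made-up stand-in for DimensionComponent, and the stand-alone assign_order function only illustrates 1-based numbering in list order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dim:
    """Minimal stand-in for DimensionComponent (illustration only)."""
    id: str
    order: Optional[int] = None


def assign_order(components):
    # Number the components in list order, starting from 1
    for position, component in enumerate(components, start=1):
        component.order = position


dims = [Dim("FREQ"), Dim("GEO"), Dim("TIME_PERIOD")]
assign_order(dims)
print([(d.id, d.order) for d in dims])
```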

components: List[DimensionComponent] = None

list (ordered) of Dimension, MeasureDimension, and/or TimeDimension.

classmethod from_key(key)[source]

Create a new DimensionDescriptor from a key.

For each KeyValue in the key:

Parameters

key (Key or GroupKey or SeriesKey) –

order_key(key)[source]

Return a key ordered according to the DSD.
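What "ordered according to the DSD" means can be illustrated with plain dicts. Here a dict stands in for a Key and a list of dimension IDs stands in for the descriptor's components; the stand-alone function and its signature are hypothetical, and the real method may treat unknown dimensions differently.

```python
def order_key(key, dimension_order):
    """Return a new dict with entries arranged in dimension_order.

    Hypothetical helper; a plain dict stands in for Key.
    """
    position = {dim: i for i, dim in enumerate(dimension_order)}
    return dict(sorted(key.items(), key=lambda item: position[item[0]]))


ordered = order_key({"GEO": "EL", "FREQ": "A"}, ["FREQ", "GEO"])
print(list(ordered))
```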

pandasdmx.model.DimensionRelationship

alias of pandasdmx.model.AttributeRelationship

class pandasdmx.model.Facet[source]

Bases: pandasdmx.util.BaseModel

type: FacetType = None
value: str = None
value_type: Optional[FacetValueType] = None
class pandasdmx.model.FacetType[source]

Bases: pandasdmx.util.BaseModel

decimals: Optional[int] = None
end_time: Optional[datetime] = None
end_value: Optional[str] = None
interval: Optional[float] = None
is_sequence: Optional[bool] = None
max_length: Optional[int] = None
max_value: Optional[float] = None
min_length: Optional[int] = None
min_value: Optional[float] = None
pattern: Optional[str] = None
start_time: Optional[datetime] = None
start_value: Optional[float] = None
time_interval: Optional[timedelta] = None
class pandasdmx.model.FacetValueType

Bases: enum.Enum

An enumeration.

alpha = 13
alphaNumeric = 14
basicTimePeriod = 20
bigInteger = 2
boolean = 9
count = 11
dataSetReference = 43
dateTime = 34
day = 38
decimal = 6
double = 8
duration = 40
exclusiveValueRange = 16
float = 7
gregorianDay = 25
gregorianMonth = 23
gregorianTimePeriod = 21
gregorianYear = 22
gregorianYearMonth = 24
identifiableReference = 42
inclusiveValueRange = 12
incremental = 17
integer = 3
keyValues = 41
long = 4
month = 36
monthDay = 37
numeric = 15
observationalTimePeriod = 18
reportingDay = 33
reportingMonth = 31
reportingQuarter = 30
reportingSemester = 28
reportingTimePeriod = 26
reportingTrimester = 29
reportingWeek = 32
reportingYear = 27
short = 5
standardTimePeriod = 19
string = 1
time = 39
timesRange = 35
uri = 10
class pandasdmx.model.GenericDataSet[source]

Bases: pandasdmx.model.DataSet

SDMX-IM GenericDataSet.

action = None
attrib = None
group = None
obs = None
series = None
structured_by = None
valid_from = None
class pandasdmx.model.GenericTimeSeriesDataSet[source]

Bases: pandasdmx.model.DataSet

SDMX-IM GenericTimeSeriesDataSet.

action = None
attrib = None
group = None
obs = None
series = None
structured_by = None
valid_from = None
class pandasdmx.model.GroupDimensionDescriptor[source]

Bases: pandasdmx.model.DimensionDescriptor

assign_order()[source]

assign_order() has no effect for GroupDimensionDescriptor.

attachment_constraint: bool = None
constraint: AttachmentConstraint = None
class pandasdmx.model.GroupKey(arg=None, **kwargs)[source]

Bases: pandasdmx.model.Key

described_by: GroupDimensionDescriptor = None
id: str = None
class pandasdmx.model.ISOConceptReference[source]

Bases: pandasdmx.util.BaseModel

agency: str = None
id: str = None
scheme_id: str = None
class pandasdmx.model.IdentifiableArtefact[source]

Bases: pandasdmx.model.AnnotableArtefact

id: str = None

Unique identifier of the object.

uri: str = None

Universal resource identifier that may or may not be resolvable.

urn: str = None

Universal resource name. For use in SDMX registries; all registered objects have a URN.

class pandasdmx.model.InternationalString(value=None, **kwargs)[source]

Bases: object

SDMX-IM InternationalString.

SDMX-IM LocalisedString is not implemented. Instead, ‘localizations’ is a mapping where:

  • keys correspond to the ‘locale’ property of LocalisedString.

  • values correspond to the ‘label’ property of LocalisedString.

When used as a type hint with pydantic, InternationalString fields can be assigned to in one of four ways:

class Foo(BaseModel):
     name: InternationalString = InternationalString()

# Equivalent: no localizations
f = Foo()
f = Foo(name={})

# Using an explicit locale
f.name['en'] = "Foo's name in English"

# Using a (locale, label) tuple
f.name = ('fr', "Foo's name in French")

# Using a dict
f.name = {'en': "Replacement English name",
          'fr': "Replacement French name"}

# Using a bare string, implicitly for the DEFAULT_LOCALE
f.name = "Name in DEFAULT_LOCALE language"

Only the first method preserves existing localizations; the latter three replace them.

localizations: Dict[str, str] = {}
localized_default(locale)[source]

Return the string in locale, or else the first defined.
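The fallback rule can be sketched on a plain dict of localizations. This stand-alone function is illustrative only; in particular, returning an empty string when no localizations exist is an assumption, since the description above does not cover that case.

```python
def localized_default(localizations, locale):
    """Return the label for locale, else the first one defined.

    Sketch only; the behaviour with no localizations at all is an
    assumption (an empty string is returned here).
    """
    if locale in localizations:
        return localizations[locale]
    # Fall back to the first label in insertion order, if any
    return next(iter(localizations.values()), "")


names = {"en": "Foo's name in English", "fr": "Foo's name in French"}
print(localized_default(names, "fr"))
print(localized_default(names, "de"))
```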

class pandasdmx.model.Item(*args, **kwargs)[source]

Bases: pandasdmx.model.NameableArtefact

class Config[source]

Bases: object

validate_assignment_exclude = 'parent'
child: List['Item'] = None
get_child(id)[source]

Return the child with the given id.

parent: 'Item' = None
class pandasdmx.model.ItemScheme[source]

Bases: pandasdmx.model.MaintainableArtefact

SDMX-IM Item Scheme.

The IM states that ItemScheme “defines a set of Items…” To simplify indexing/retrieval, this implementation uses a dict for the items attribute, in which the keys are the id of the Item.

Because this may change in future versions of pandaSDMX, user code should not access items directly. Instead, use the getattr() and indexing features of ItemScheme, or the public methods, to access and manipulate Items:

>>> foo = ItemScheme(id='foo')
>>> bar = Item(id='bar')
>>> foo.append(bar)
>>> foo
<ItemScheme: 'foo', 1 items>
>>> (foo.bar is bar) and (foo['bar'] is bar) and (bar in foo)
True
append(item: pandasdmx.model.Item)[source]

Add item to the ItemScheme.

Parameters

item (same class as items) – Item to add.

classmethod convert_to_dict(v)[source]
extend(items: Iterable[pandasdmx.model.Item])[source]

Extend the ItemScheme with members of items.

Parameters

items (iterable of Item) – Elements must be of the same class as items.

is_partial: Optional[bool] = None
items: Dict[str, _item_type] = None

Members of the ItemScheme. Both ItemScheme and Item are abstract classes. Concrete classes are paired: for example, a Codelist contains Codes.

setdefault(obj=None, **kwargs)[source]

Retrieve the item name, or add it with kwargs and return it.

The returned object is a reference to an object in the ItemScheme, and is of the appropriate class.

class pandasdmx.model.Key(arg=None, **kwargs)[source]

Bases: pandasdmx.util.BaseModel

SDMX Key class.

The constructor takes an optional list of keyword arguments; the keywords are used as Dimension or Attribute IDs, and the values as KeyValues.

For convenience, the values of the key may be accessed directly:

>>> k = Key(foo=1, bar=2)
>>> k.values['foo']
1
>>> k['foo']
1
Parameters
attrib: DictLike[str, AttributeValue] = None
copy(arg=None, **kwargs)[source]

Duplicate a model, optionally choose which fields to include, exclude and change.

Parameters
  • include – fields to include in new model

  • exclude – fields to exclude from new model, as with values this takes precedence over include

  • update – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data

  • deep – set to True to make a deep copy of the model

Returns

new model instance

described_by: DimensionDescriptor = None
get_values()[source]
order(value=None)[source]
values: DictLike[str, KeyValue] = None

Individual KeyValues that describe the key.

class pandasdmx.model.KeyValue(*args, **kwargs)[source]

Bases: pandasdmx.util.BaseModel

One value in a multi-dimensional Key.

id: str = None
value: Any = None

The actual value.

value_for: Dimension = None
class pandasdmx.model.MaintainableArtefact[source]

Bases: pandasdmx.model.VersionableArtefact

is_external_reference: Optional[bool] = None

True if the content of the object is held externally; i.e., not the current Message.

is_final: Optional[bool] = None

True if the object is final; otherwise it is in a draft state.

maintainer: Optional['Agency'] = None

Association to the Agency responsible for maintaining the object.

service_url: Optional[str] = None

URL of an SDMX-compliant web service from which the object can be retrieved.

structure_url: Optional[str] = None

URL of an SDMX-ML document containing the object.

class pandasdmx.model.MeasureDescriptor[source]

Bases: pandasdmx.model.ComponentList

components: List[PrimaryMeasure] = None
class pandasdmx.model.MeasureDimension[source]

Bases: pandasdmx.model.DimensionComponent

SDMX-IM MeasureDimension.

order = None
class pandasdmx.model.MemberSelection[source]

Bases: pandasdmx.util.BaseModel

included: bool = None
values: Set[MemberValue] = None

NB the spec does not say what this feature should be named

values_for: Component = None
class pandasdmx.model.MemberValue[source]

Bases: pandasdmx.model.SelectionValue

cascade_values: bool = None
value: str = None
class pandasdmx.model.NameableArtefact[source]

Bases: pandasdmx.model.IdentifiableArtefact

description: InternationalString = None

Multi-lingual description of the object.

name: InternationalString = None

Multi-lingual name of the object.

pandasdmx.model.NoSpecifiedRelationship

alias of pandasdmx.model.AttributeRelationship

class pandasdmx.model.Observation[source]

Bases: pandasdmx.util.BaseModel

SDMX-IM Observation.

This class also implements the spec classes ObservationValue, UncodedObservationValue, and CodedObservation.

attached_attribute: DictLike[str, AttributeValue] = None
property attrib

Return a view of combined observation, series & group attributes.

property dim
dimension: Key = None

Key for dimension(s) varying at the observation level.

group_keys: Set[GroupKey] = None

pandaSDMX extension not in the IM.

property key

Return the entire key, including KeyValues at the series level.

series_key: SeriesKey = None
value: Union[Any, Code] = None

Data value.

value_for: PrimaryMeasure = None
class pandasdmx.model.Organisation(*args, **kwargs)[source]

Bases: pandasdmx.model.Item

contact: List[Contact] = None
class pandasdmx.model.PrimaryMeasure[source]

Bases: pandasdmx.model.Component

SDMX-IM PrimaryMeasure.

concept_identity = None
local_representation = None
pandasdmx.model.PrimaryMeasureRelationship

alias of pandasdmx.model.AttributeRelationship

class pandasdmx.model.ProvisionAgreement[source]

Bases: pandasdmx.model.MaintainableArtefact, pandasdmx.model.ConstrainableArtefact

data_provider: DataProvider = None
structure_usage: StructureUsage = None
class pandasdmx.model.QueryDatasource[source]

Bases: pandasdmx.model.Datasource

url = None
class pandasdmx.model.RESTDatasource[source]

Bases: pandasdmx.model.QueryDatasource

url = None
class pandasdmx.model.ReportingYearStartDay[source]

Bases: pandasdmx.model.DataAttribute

related_to = None
usage_status = None
class pandasdmx.model.Representation[source]

Bases: pandasdmx.util.BaseModel

enumerated: ItemScheme = None
non_enumerated: List[Facet] = None
class pandasdmx.model.SelectionValue[source]

Bases: pandasdmx.util.BaseModel

SDMX-IM SelectionValue.

class pandasdmx.model.SeriesKey(arg=None, **kwargs)[source]

Bases: pandasdmx.model.Key

property group_attrib

Return a view of combined group attributes.

group_keys: Set[GroupKey] = None

pandaSDMX extension not in the IM.

class pandasdmx.model.SimpleDatasource[source]

Bases: pandasdmx.model.Datasource

url = None
class pandasdmx.model.Structure[source]

Bases: pandasdmx.model.MaintainableArtefact

grouping: ComponentList = None
class pandasdmx.model.StructureSpecificDataSet[source]

Bases: pandasdmx.model.DataSet

SDMX-IM StructureSpecificDataSet.

action = None
attrib = None
group = None
obs = None
series = None
structured_by = None
valid_from = None
class pandasdmx.model.StructureSpecificTimeSeriesDataSet[source]

Bases: pandasdmx.model.DataSet

SDMX-IM StructureSpecificTimeSeriesDataSet.

action = None
attrib = None
group = None
obs = None
series = None
structured_by = None
valid_from = None
class pandasdmx.model.StructureUsage[source]

Bases: pandasdmx.model.MaintainableArtefact

structure: Structure = None
class pandasdmx.model.TimeDimension[source]

Bases: pandasdmx.model.DimensionComponent

SDMX-IM TimeDimension.

order = None
pandasdmx.model.TimeKeyValue

alias of pandasdmx.model.KeyValue

class pandasdmx.model.UsageStatus

Bases: enum.Enum

An enumeration.

conditional = 2
mandatory = 1
class pandasdmx.model.VersionableArtefact[source]

Bases: pandasdmx.model.NameableArtefact

valid_from: str = None

Date from which the version is valid.

valid_to: str = None

Date from which the version is superseded.

version: str = None

A version string following an agreed convention.

pandasdmx.model.cls

alias of pandasdmx.model.DataProvider

pandasdmx.model.value_for_dsd_ref(kind, args, kwargs)[source]

Maybe replace a string ‘value_for’ in kwargs with a DSD reference.

reader: Parsers for SDMX file formats

SDMX-ML

pandaSDMX supports several types of SDMX-ML messages.

class pandasdmx.reader.sdmxml.Reader[source]

Read SDMX-ML 2.1 and expose it as instances from pandasdmx.model.

The implementation is recursive, and depends on:

  • _parse(), _named() and _maintained().

  • State variables _current, _stack, and _index.

Parameters

dsd (DataStructureDefinition) – For “structure-specific” format=XML messages only.
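
The recursive, per-element style described above can be sketched with the standard library alone. This is a toy under stated assumptions, not the Reader's actual code: TinyReader, parse_default and the returned dict layout are hypothetical, and only the idea of a _parse() method dispatching to parse_<tag>() handlers while tracking a _stack is taken from the description.

```python
import xml.etree.ElementTree as ET


class TinyReader:
    """Toy recursive parser: each element is handled by parse_<tag>()."""

    def __init__(self):
        self._stack = []  # tags currently being parsed, outermost first

    def _parse(self, elem):
        # Look up a handler named after the lower-cased tag
        name = "parse_" + elem.tag.lower()
        handler = getattr(self, name, self.parse_default)
        self._stack.append(elem.tag)
        try:
            return handler(elem)
        finally:
            self._stack.pop()

    def parse_default(self, elem):
        # Unknown elements: recurse into the children and collect results
        return [self._parse(child) for child in elem]

    def parse_codelist(self, elem):
        codes = [self._parse(child) for child in elem]
        return {"id": elem.get("id"), "codes": codes}

    def parse_code(self, elem):
        # The stack records the path down to the current element
        return {"id": elem.get("id"), "depth": len(self._stack)}


doc = '<Codelist id="CL_GEO"><Code id="EL"/><Code id="ES"/></Codelist>'
result = TinyReader()._parse(ET.fromstring(doc))
```

Keeping one small handler per element type makes it straightforward to add support for a new tag without touching the recursion itself.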

parse_annotation(elem)[source]
parse_attribute(elem)[source]
parse_attributerelationship(elem)[source]
parse_attributes(elem)[source]
parse_categorisation(elem)[source]
parse_category(elem)[source]
parse_categoryscheme(elem)[source]
parse_code(elem)[source]
parse_codelist(elem)[source]
parse_componentlist(elem)[source]
parse_concept(elem)[source]
parse_conceptidentity(elem)[source]
parse_conceptscheme(elem)[source]
parse_constraintattachment(elem)[source]
parse_contact(elem)[source]
parse_contentconstraint(elem)[source]
parse_cuberegion(elem)[source]
parse_dataflow(elem)[source]
parse_datakeyset(elem)[source]
parse_dataset(elem)[source]
parse_datastructure(elem)[source]
parse_dimension(elem)[source]
parse_errormessage(elem)[source]
parse_facet(elem)[source]
parse_group(elem)[source]

<generic:Group>, <structure:Group>, or <Group>.

parse_groupdimension(elem)[source]
parse_header(elem)[source]
parse_international_string(elem)[source]
parse_key(elem)[source]

SeriesKey, GroupKey, observation dimensions.

parse_memberselection(elem)[source]

<com:KeyValue> (not inside <com:Key>); or <com:Attribute>.

parse_message(elem)[source]
parse_obs(elem)[source]
parse_obsdimension(elem)[source]
parse_obsvalue(elem)[source]
parse_organisation(elem)[source]
parse_orgscheme(elem)[source]
parse_primarymeasure(elem)[source]
parse_provisionagreement(elem)[source]
parse_ref(elem, parent=None)[source]

References to Identifiable- and MaintainableArtefacts.

parent is the tag containing the reference.

parse_representation(elem)[source]
parse_series(elem)[source]
parse_structures(elem)[source]
parse_text(elem)[source]
read_message(source, dsd=None)[source]

Read message from source.

Must return an instance of a model.Message subclass.

SDMX-JSON

class pandasdmx.reader.sdmxjson.Reader[source]

Read SDMX-JSON 2.1 and expose it as instances from pandasdmx.model.

read_dataset(root, ds_key)[source]
read_message(source)[source]

Read message from source.

Must return an instance of a model.Message subclass.

read_obs(root, series_key=None, base_key=None)[source]

writer: Convert SDMX to pandas objects

Changed in version 1.0: pandasdmx.to_pandas() (via write) handles all types of objects, replacing the earlier, separate data2pandas and structure2pd writers.

writer.write(*args, **kwargs)

Convert an SDMX obj to pandas object(s).

Implements a dispatch pattern according to the type of obj. For instance, a DataSet object is converted using write_dataset(). See the individual write_* methods for more information on their behaviour, including accepted args and kwargs.

write_component(obj)

Convert Component.

write_datamessage(obj, *args[, rtype])

Convert DataMessage.

write_dataset(obj[, attributes, dtype, …])

Convert DataSet.

write_dict(obj, *args, **kwargs)

Convert mappings.

write_dimensiondescriptor(obj)

Convert DimensionDescriptor.

write_itemscheme(obj[, locale])

Convert ItemScheme.

write_list(obj, *args, **kwargs)

Convert a list of SDMX objects.

write_membervalue(obj)

Convert MemberValue.

write_nameableartefact(obj)

Convert NameableArtefact.

write_serieskeys(obj)

Convert a list of SeriesKey.

write_structuremessage(obj[, include])

Convert StructureMessage.
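
The type-based dispatch described for write() can be sketched with functools.singledispatch from the standard library. This is an illustration only: pandaSDMX may implement its dispatch differently, and the Component class below is a minimal stand-in rather than pandasdmx.model.Component.

```python
from functools import singledispatch


class Component:
    """Minimal stand-in for an SDMX Component with an ID."""

    def __init__(self, id):
        self.id = id


@singledispatch
def write(obj, *args, **kwargs):
    # Fallback when no converter is registered for this type
    raise NotImplementedError(f"no writer for {type(obj).__name__}")


@write.register(Component)
def write_component(obj, *args, **kwargs):
    return obj.id


@write.register(list)
def write_list(obj, *args, **kwargs):
    # Convert each element with whichever writer matches its type
    return [write(item, *args, **kwargs) for item in obj]


converted = write([Component("GEO"), Component("FREQ")])
```

A caller only ever invokes write(); the function registered for the argument's type does the actual conversion.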

pandasdmx.writer.DEFAULT_RTYPE = 'compat'

Default return type for write_dataset() and similar methods. Either ‘compat’ or ‘rows’. See the HOWTO.

pandasdmx.writer.write_component(obj)[source]

Convert Component.

The id attribute of the concept_identity is returned.

pandasdmx.writer.write_contentconstraint(obj, **kwargs)[source]

Convert ContentConstraint.

pandasdmx.writer.write_cuberegion(obj, **kwargs)[source]

Convert CubeRegion.

pandasdmx.writer.write_datamessage(obj, *args, rtype=None, **kwargs)[source]

Convert DataMessage.

The data set(s) within the message are converted to pandas objects.


pandasdmx.writer.write_dataset(obj, attributes='', dtype=<class 'numpy.float64'>, constraint=None, datetime=False, **kwargs)[source]

Convert DataSet.

See the walkthrough for examples of using the datetime argument.

Parameters
  • obj (DataSet or iterable of Observation) –

  • attributes (str) –

    Types of attributes to return with the data. A string containing zero or more of:

    • 'o': attributes attached to each Observation.

    • 's': attributes attached to any (0 or 1) SeriesKey associated with each Observation.

    • 'g': attributes attached to any (0 or more) GroupKey associated with each Observation.

    • 'd': attributes attached to the DataSet containing the Observations.

  • dtype (str or numpy.dtype or None) – Datatype for values. If None, do not return the values of a series. In this case, attributes must not be an empty string so that some attribute is returned.

  • constraint (ContentConstraint, optional) – If given, only Observations included by the constraint are returned.

  • datetime (bool or str or Dimension or dict, optional) –

    If given, return a DataFrame with a DatetimeIndex or PeriodIndex as the index and all other dimensions as columns. Valid datetime values include:

    • bool: if True, determine the time dimension automatically by detecting a TimeDimension.

    • str: ID of the time dimension.

    • Dimension: the matching Dimension is the time dimension.

    • dict: advanced behaviour. Keys may include:

      • dim (Dimension or str): the time dimension or its ID.

      • axis ({0 or ‘index’, 1 or ‘columns’}): axis on which to place the time dimension (default: 0).

      • freq (True or str or Dimension): produce pandas.PeriodIndex. If str, the ID of a Dimension containing a frequency specification. If a Dimension, the specified dimension is used for the frequency specification.

        Any Dimension used for the frequency specification does not appear in the returned DataFrame.

Returns

  • pandas.DataFrame

    • if attributes is not '', a data frame with one row per Observation, value as the first column, and additional columns for each attribute;

    • if datetime is given, various layouts as described above; or

    • if _rtype (passed from write_datamessage()) is ‘compat’, various layouts as described in the HOWTO.

  • pandas.Series with pandas.MultiIndex – Otherwise.
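
To make the attributes codes concrete, here is a small stdlib-only sketch that interprets such a string. The helper parse_attributes and the level names it returns are hypothetical; the only behaviour taken from the description above is the meaning of 'o', 's', 'g' and 'd', and the rule that attributes must not be empty when dtype is None.

```python
ATTRIBUTE_LEVELS = {
    "o": "observation",
    "s": "series",
    "g": "group",
    "d": "dataset",
}


def parse_attributes(attributes, dtype="float64"):
    """Translate a string such as 'os' into a list of attachment levels."""
    unknown = set(attributes) - set(ATTRIBUTE_LEVELS)
    if unknown:
        raise ValueError(f"unknown attribute codes: {sorted(unknown)}")
    if dtype is None and not attributes:
        # With no values requested, at least one attribute level is needed
        raise ValueError("with dtype=None, attributes must not be empty")
    # Keep the order in which codes were given, dropping repeats
    return [ATTRIBUTE_LEVELS[code] for code in dict.fromkeys(attributes)]


levels = parse_attributes("os")
```

An empty string selects no attributes, which corresponds to the default attributes=''.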

pandasdmx.writer.write_dict(obj, *args, **kwargs)[source]

Convert mappings.

The values of the mapping are write()’d individually. If the resulting values are str or pd.Series with indexes that share the same name, then they are converted to a pd.Series, possibly with a pd.MultiIndex. Otherwise, a DictLike is returned.

pandasdmx.writer.write_dimensiondescriptor(obj)[source]

Convert DimensionDescriptor.

The components of the DimensionDescriptor are written.

pandasdmx.writer.write_itemscheme(obj, locale='en')[source]

Convert ItemScheme.

Names from locale are serialized.

Returns

Return type

pandas.Series

pandasdmx.writer.write_list(obj, *args, **kwargs)[source]

Convert a list of SDMX objects.

For the following obj, write_list() returns pandas.Series instead of a list:

  • a list of Observation: the Observations are written using write_dataset().

  • a list with only 1 DataSet (e.g. the data attribute of DataMessage): the Series for the single element is returned.

  • a list of SeriesKey: the key values (but no data) are returned.

pandasdmx.writer.write_membervalue(obj)[source]

Convert MemberValue.

pandasdmx.writer.write_nameableartefact(obj)[source]

Convert NameableArtefact.

The name attribute of obj is returned.

pandasdmx.writer.write_serieskeys(obj)[source]

Convert a list of SeriesKey.

pandasdmx.writer.write_set(obj, *args, **kwargs)[source]

Convert set.

pandasdmx.writer.write_structuremessage(obj, include=None, **kwargs)[source]

Convert StructureMessage.

Parameters
  • obj (StructureMessage) –

  • include (iterable of str or str, optional) – One or more of the attributes of the StructureMessage ( ‘category_scheme’, ‘codelist’, etc.) to transform.

  • kwargs – Passed to write() for each attribute.

Returns

Keys are StructureMessage attributes; values are pandas objects.

Return type

DictLike

Todo

Support selection of language for conversion of InternationalString.

remote: Access SDMX REST web services

class pandasdmx.remote.Session(timeout=30.1, proxies=None, stream=False, **kwargs)[source]

requests.Session subclass with optional caching.

If requests_cache is installed, this class caches responses.

class pandasdmx.remote.ResponseIO(response, tee=None)[source]

Read data from a requests.Response object, into an in-memory or on-disk file and expose it as a file-like object.

ResponseIO wraps a requests.Response object’s ‘content’ attribute, providing a file-like object from which bytes can be read() incrementally.

Parameters
  • response (requests.Response) – HTTP response to wrap.

  • tee (binary, writable io.BufferedIOBase, defaults to io.BytesIO()) – tee is exposed as self.tee and not closed explicitly.
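
The "tee" idea can be sketched without any network access. The classes below are stand-ins: FakeResponse mimics only the content attribute of requests.Response, and TeeIO is a hypothetical simplification, not the library's ResponseIO.

```python
import io


class FakeResponse:
    """Stand-in for requests.Response, exposing only .content."""

    def __init__(self, content):
        self.content = content


class TeeIO:
    """Copy response bytes into a buffer, then read back from that buffer."""

    def __init__(self, response, tee=None):
        # Default to an in-memory buffer; a writable binary file also works
        self.tee = io.BytesIO() if tee is None else tee
        self.tee.write(response.content)
        self.tee.seek(0)  # rewind so reads start at the first byte

    def read(self, size=-1):
        return self.tee.read(size)


reader = TeeIO(FakeResponse(b"<xml/>data"))
first = reader.read(6)
rest = reader.read()
```

Because the buffer is kept as reader.tee and not closed, the full payload remains available after parsing, for example to save it to disk.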

source: Features of SDMX data sources

This module defines Source and some utility functions. For built-in subclasses of Source used to provide pandaSDMX’s built-in support for certain data sources, see Data sources.

class pandasdmx.source.Source(**kwargs)[source]

SDMX-IM RESTDatasource.

This class describes the location and features supported by an SDMX data source. Subclasses may override the hooks in order to handle specific features of different REST web services:

handle_response(response, content)

Handle response content of unknown type.

finish_message(message, request, **kwargs)

Postprocess retrieved message.

modify_request_args(kwargs)

Modify arguments used to build query URL.

data_content_type: DataContentType = None

pandasdmx.util.DataContentType indicating the type of data returned by the source.

finish_message(message, request, **kwargs)[source]

Postprocess retrieved message.

This hook is called by pandasdmx.Request.get() after a pandasdmx.message.Message object has been successfully parsed from the query response.

See ESTAT for an example implementation.

handle_response(response, content)[source]

Handle response content of unknown type.

This hook is called by pandasdmx.Request.get() only when the content cannot be parsed as XML or JSON.

See ESTAT and SGR for example implementations.

id: str = None

ID of the data source

modify_request_args(kwargs)[source]

Modify arguments used to build query URL.

This hook is called by pandasdmx.Request.get() to modify the keyword arguments before the query URL is built.

The default implementation handles requests for ‘structure-specific data’ by adding an HTTP ‘Accept:’ header when a ‘dsd’ is supplied as one of the kwargs.

See SGR for an example override.

Returns

Return type

None
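
A subclass override of this hook might look like the following sketch. BaseSource is a stand-in for pandasdmx.source.Source so that the example runs on its own; ExampleSource, the 'params' key and the default 'detail' value are assumptions made for illustration, not documented behaviour.

```python
class BaseSource:
    """Stand-in for pandasdmx.source.Source; only one hook is modelled."""

    def modify_request_args(self, kwargs):
        # Default: leave the arguments untouched
        return None


class ExampleSource(BaseSource):
    """Hypothetical source whose service wants an explicit 'detail' value."""

    def modify_request_args(self, kwargs):
        super().modify_request_args(kwargs)
        # Assumption: query parameters travel in kwargs["params"]
        params = kwargs.setdefault("params", {})
        # Only fill in a value if the caller did not choose one
        params.setdefault("detail", "full")
        # kwargs is mutated in place; nothing is returned


request_args = {"resource_type": "data", "resource_id": "EXAMPLE_FLOW"}
result = ExampleSource().modify_request_args(request_args)
```

Since the hook returns None, any change has to be made by mutating kwargs in place, as done here.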

name: str = None

Human-readable name of the data source

supports: Dict[Union[str, Resource], bool] = None

Mapping from Resource to bool indicating support for SDMX REST API features. Two additional keys are valid:

  • 'preview': True if the source supports ?detail=serieskeysonly. See preview_data.

  • 'structure-specific data': True if the source can return structure-specific data messages.

url: str = None

Base URL for queries

pandasdmx.source.add_source(info, id=None, override=False, **kwargs)[source]

Add a new data source.

The info expected is in JSON format:

{
  "id": "ESTAT",
  "documentation": "http://data.un.org/Host.aspx?Content=API",
  "url": "http://ec.europa.eu/eurostat/SDMX/diss-web/rest",
  "name": "Eurostat",
  "supported": {"codelist": false, "preview": true}
}

…with unspecified values using the defaults; see Source.

Parameters
  • info (dict-like) – String containing JSON information about a data source.

  • id (str) – Identifier for the new datasource. If None (default), then info[‘id’] is used.

  • override (bool) – If True, replace any existing data source with id. Otherwise, raise ValueError.

  • **kwargs – Optional callbacks for handle_response and finish_message hooks.

pandasdmx.source.list_sources()[source]

Return a sorted list of valid source IDs.

These can be used to create Request instances.

pandasdmx.source.load_package_sources()[source]

Discover all sources listed in sources.json.

util: Utilities

class pandasdmx.util.BaseModel[source]

Bases: pydantic.main.BaseModel

Shim for pydantic.BaseModel.

This class changes two behaviours in pydantic. The methods are direct copies from pydantic’s code, with marked changes.

  1. https://github.com/samuelcolvin/pydantic/issues/524

    • “Multiple RecursionErrors with self-referencing models”

    • In e.g. pandasdmx.model.Item, having both .parent and .child references leads to infinite recursion during validation.

    • Fix: override BaseModel.__setattr__.

    • New value ‘limited’ for Config.validate_assignment: no sibling field values are passed to Field.validate().

    • New key Config.validate_assignment_exclude: list of field names that are not validated per se and not passed to Field.validate() when validating a sibling field.

  2. https://github.com/samuelcolvin/pydantic/issues/521

    • “Assignment to attribute changes id() but not referenced object,” marked as wontfix by pydantic maintainer.

    • When cls.attr is typed as BaseModel (or a subclass), then a.attr is b.attr is always False, even when set to the same reference.

    • Fix: override BaseModel.validate() without copy().

class Config[source]

Bases: object

validate_assignment = 'limited'
validate_assignment_exclude = []
classmethod validate(value: Any) → Model[source]
class pandasdmx.util.DictLike[source]

Bases: collections.OrderedDict, typing.Generic

Container with features of a dict & list, plus attribute access.

validate(value, field)[source]
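
What "features of a dict & list, plus attribute access" can mean in practice is sketched below. DictLikeSketch is a hypothetical simplification written for this example; the actual DictLike may differ in details such as validation and error handling.

```python
from collections import OrderedDict


class DictLikeSketch(OrderedDict):
    """Ordered mapping that also allows positional and attribute access."""

    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            # List-like access: an integer that is not a key is an index
            if isinstance(key, int):
                return list(self.values())[key]
            raise

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        try:
            return super().__getitem__(name)
        except KeyError:
            raise AttributeError(name) from None


d = DictLikeSketch(FREQ="A", GEO="EL")
by_key = d["GEO"]
by_index = d[0]
by_attr = d.GEO
```

All three access styles reach the same stored values, which is convenient when exploring SDMX structures interactively.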
class pandasdmx.util.Resource[source]

Bases: str, enum.Enum

Enumeration of SDMX REST API endpoints.

Enum member         pandasdmx.model class
categoryscheme      CategoryScheme
codelist            Codelist
conceptscheme       ConceptScheme
data                DataSet
dataflow            DataflowDefinition
datastructure       DataStructureDefinition
provisionagreement  ProvisionAgreement

categoryscheme = 'categoryscheme'
codelist = 'codelist'
conceptscheme = 'conceptscheme'
data = 'data'
dataflow = 'dataflow'
datastructure = 'datastructure'
classmethod describe()[source]
classmethod from_obj(obj)[source]

Return an enumeration value based on the class of obj.

provisionagreement = 'provisionagreement'
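
The mapping from model classes to enum members can be sketched as follows. ResourceSketch, the two empty model classes and the lookup table are stand-ins created for this example; the real from_obj() may resolve class names differently.

```python
from enum import Enum


class Codelist:
    """Empty stand-in for pandasdmx.model.Codelist."""


class DataflowDefinition:
    """Empty stand-in for pandasdmx.model.DataflowDefinition."""


# Class names that do not match a member name after lower-casing
_SPECIAL_NAMES = {"DataflowDefinition": "dataflow"}


class ResourceSketch(str, Enum):
    codelist = "codelist"
    dataflow = "dataflow"

    @classmethod
    def from_obj(cls, obj):
        """Return the member corresponding to the class of obj."""
        clsname = type(obj).__name__
        return cls[_SPECIAL_NAMES.get(clsname, clsname.lower())]


member = ResourceSketch.from_obj(Codelist())
flow_member = ResourceSketch.from_obj(DataflowDefinition())
```

Because the enumeration also derives from str, a member compares equal to its plain string value, so it can be used directly when building a query URL.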
pandasdmx.util.get_class_hint(obj, attr)[source]

Return the type hint for attribute attr on obj.

pandasdmx.util.summarize_dictlike(dl, maxwidth=72)[source]

Return a string summary of the DictLike contents.

pandasdmx.util.validate_dictlike(*fields)[source]