How to get provider metadata?

How would I get a list of all datasets for a given data provider?

I would like to ask for:

  1. All providers
  2. All datasets for a given provider
  3. Attributes of dataset

Thank you.

For example, in R I can do:
dim = rdbnomics::rdb_dimensions(provider_code = "IMF", dataset_code = dataset)
countries = dim$IMF$`WEO:2021-10`$`weo-country`
series = dim$IMF$`WEO:2021-10`$`weo-subject`

But I cannot find a way to do this in Python.

Hi @toast,

The DBnomics Python client can fetch series only. It may be enhanced, and we’ll keep your needs in mind for future updates of the package.

In the meantime, you can request the DBnomics API directly and parse the JSON response. I’ll walk you through it below.

See also the docs of the API.

To list all providers, load this URL: https://api.db.nomics.world/v22/providers and read the providers/docs key path of the JSON response.

To list all datasets for a given provider, there are two ways:

You can request the category tree of the provider (the same one shown on the provider page of the DBnomics website) by calling https://api.db.nomics.world/v22/providers/{provider_code} and reading the category_tree key of the JSON response.

Please note that the category tree can be hierarchical (though it is not always), so you’ll have to flatten it yourself, for example with a recursive function. The leaves of the tree are the datasets.
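
A minimal sketch of such a recursive flattening, under assumptions: it supposes that each category node carries its sub-nodes under a "children" key and that leaf nodes (the datasets) have a "code" key. Those key names and the sample tree are illustrative, so check them against an actual category_tree response before relying on this.

```python
from typing import Any, Iterator


def iter_dataset_leaves(nodes: list[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield the leaf nodes (datasets) of a category tree, depth-first."""
    for node in nodes:
        if "children" in node:
            # A category node (assumed key name): descend into its sub-nodes.
            yield from iter_dataset_leaves(node["children"] or [])
        else:
            # No children: treat the node as a dataset.
            yield node


# Hand-written sample tree, not real API output.
sample_tree = [
    {"code": "A", "name": "Dataset A"},
    {
        "name": "Category 1",
        "children": [
            {"code": "B", "name": "Dataset B"},
            {"name": "Category 1.1", "children": [{"code": "C", "name": "Dataset C"}]},
        ],
    },
]

dataset_codes = [leaf["code"] for leaf in iter_dataset_leaves(sample_tree)]
print(dataset_codes)  # ['A', 'B', 'C']
```

With a real response you would pass the value of the category_tree key instead of sample_tree.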

Alternatively, you can use the URL https://api.db.nomics.world/v22/datasets/{provider_code} and read the datasets/docs key path in the JSON response. Each item of the list contains the dataset code along with other metadata that you probably don’t need and can ignore.
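
For providers with many datasets, the list may come back in pages. Here is a hedged sketch of collecting every page. It assumes (this is not confirmed in this thread) that the datasets object carries a num_found count next to docs and that the endpoint accepts an offset query parameter. The iter_docs helper and the fake fetcher are hypothetical names used for illustration only.

```python
from typing import Any, Callable, Iterator

Page = dict[str, Any]


def iter_docs(fetch_page: Callable[[int], Page], key: str = "datasets") -> Iterator[dict[str, Any]]:
    """Yield every item of <key>/docs, asking fetch_page for successive offsets."""
    offset = 0
    while True:
        block = fetch_page(offset)[key]
        docs = block["docs"]
        yield from docs
        offset += len(docs)
        # Stop on an empty page or once the assumed "num_found" total is reached.
        # If "num_found" is absent, this stops after the first page.
        if not docs or offset >= block.get("num_found", 0):
            break


def fake_fetch(offset: int) -> Page:
    """Stand-in for an HTTP call: serves 5 made-up datasets, 2 per page."""
    all_docs = [{"code": f"DS{i}"} for i in range(5)]
    return {"datasets": {"docs": all_docs[offset:offset + 2], "num_found": len(all_docs)}}


codes = [doc["code"] for doc in iter_docs(fake_fetch)]
print(codes)  # ['DS0', 'DS1', 'DS2', 'DS3', 'DS4']
```

Against the real API, fetch_page could be a small function that calls requests.get(url, params={"offset": offset}).json(), provided the offset parameter behaves as assumed here.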

Dataset attributes are not returned by the API; see #265 and a draft merge request about it.

Could you give an example? I’m not sure we are talking about the same “attributes”.

Your R example suggests that what you actually need are the dataset dimensions, which are not available from the Python client. This should definitely be added.
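
Until then, the dimensions can be read from the dataset JSON itself. The sketch below is a rough Python counterpart of the R snippet, under assumptions: it supposes each dataset document has a dimensions_values_labels key mapping a dimension code to either a {value: label} mapping or a list of [value, label] pairs, plus an optional dimensions_codes_order key. The helper name and the sample values are illustrative, not real API output.

```python
from typing import Any


def dataset_dimensions(dataset: dict[str, Any]) -> dict[str, dict[str, str]]:
    """Return {dimension_code: {value_code: label}} for one dataset document."""
    raw = dataset.get("dimensions_values_labels", {})
    # Prefer the declared dimension order when present (assumed key name).
    order = dataset.get("dimensions_codes_order") or list(raw)
    # dict() accepts both a mapping and a list of [code, label] pairs.
    return {dim: dict(raw[dim]) for dim in order if dim in raw}


# Hand-written sample document with made-up values.
sample_dataset = {
    "code": "WEO:2021-10",
    "dimensions_codes_order": ["weo-country", "weo-subject"],
    "dimensions_values_labels": {
        "weo-country": {"FRA": "France", "DEU": "Germany"},
        "weo-subject": [["NGDP", "Gross domestic product, current prices"]],
    },
}

dims = dataset_dimensions(sample_dataset)
countries = dims["weo-country"]
subjects = dims["weo-subject"]
print(list(dims))  # ['weo-country', 'weo-subject']
```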


I hope this helps. If you need more information, please ask.

OK, I think it looks something like this: get all providers, then get their respective datasets and the metadata required to query each dataset. This is exactly what I was after. Thank you!

#%%
import requests

#%%
# All provider codes

all_providers_url = 'https://api.db.nomics.world/v22/providers'

# Fetch the provider list and parse the JSON response
providers_json = requests.get(all_providers_url).json()

providers = [provider['code'] for provider in providers_json['providers']['docs']]

providers

#%%
# All datasets for a provider

provider_code = 'IMF'

datasets_url = f'https://api.db.nomics.world/v22/datasets/{provider_code}'

datasets_json = requests.get(datasets_url).json()

datasets_provider = [dataset['code'] for dataset in datasets_json['datasets']['docs']]

datasets_provider

#%%
# Values of the INDICATOR dimension of the first dataset, usable to select series
# (not every dataset has a dimension named INDICATOR)

series = datasets_json['datasets']['docs'][0]['dimensions_values_labels']['INDICATOR']