How to get provider metadata?

toast · December 30, 2021, 7:43pm

How would I get a list of all datasets for a given data provider?

I would like to ask for:

All providers
All datasets for a given provider
Attributes of dataset

Thank you.

toast · December 31, 2021, 4:49am

For example, in R I can do:
dim = rdbnomics::rdb_dimensions(provider_code = "IMF", dataset_code = dataset)
countries = dim$IMF$`WEO:2021-10`$`weo-country`
series = dim$IMF$`WEO:2021-10`$`weo-subject`

But I cannot do this in Python?

cbenz · January 3, 2022, 9:19am

Hi @toast,

The DBnomics Python client can fetch series only. It may be enhanced, and we’ll keep your needs in mind for future updates of the package.

In the meantime, however, you can request the DBnomics API directly and parse the JSON response. I'll walk you through it below.

See also the docs of the API.

To list all providers, you can load this URL: https://api.db.nomics.world/v22/providers and read the list at the providers/docs key path of the JSON response.

To list all datasets for a given provider, there are 2 ways:

You can request the category tree of the provider (the same one you see on the provider page of the DBnomics website) by calling https://api.db.nomics.world/v22/providers/{provider_code}, and reading the category_tree key of the JSON response.

Please note that the category tree can be hierarchical (but not always) and you’ll have to flatten it by writing a recursive function yourself, for example. The leaves of the tree are the datasets.

Or you can use this URL: https://api.db.nomics.world/v22/datasets/{provider_code} and read the datasets/docs key path in the JSON response. Each item of the list contains the dataset code, along with a lot of other metadata you probably don't need and can simply ignore.
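For the first approach, the recursive flattening of the category tree could look roughly like the sketch below. It assumes that category nodes carry their sub-nodes under a "children" key and that leaves (the datasets) carry a "code" key; those key names are assumptions, so check them against a real response. The sample tree and its codes are made up for illustration.

```python
from __future__ import annotations

from typing import Any, Iterator


def iter_dataset_codes(nodes: list[dict[str, Any]]) -> Iterator[str]:
    """Yield the code of every leaf (dataset) found in a category tree."""
    for node in nodes:
        children = node.get("children")
        if children is not None:
            # Assumption: a node with a "children" key is a category, so recurse.
            yield from iter_dataset_codes(children)
        else:
            # Assumption: a node without "children" is a dataset with a "code".
            yield node["code"]


# Made-up tree for illustration only, not real DBnomics data.
sample_tree = [
    {
        "code": "cat-a",
        "children": [
            {"code": "ds-1"},
            {"code": "cat-b", "children": [{"code": "ds-2"}]},
        ],
    },
    {"code": "ds-3"},
]

dataset_codes = list(iter_dataset_codes(sample_tree))
print(dataset_codes)
```

With a live response, the argument would be the value found under the category_tree key of the JSON returned for the provider.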

Dataset attributes are not returned by the API; see #265 and a draft merge request about it.

Could you give an example? I’m not sure we talk about the same “attributes”.

Your R example suggests that you actually need the dataset dimensions, which are not available from the Python client. This should definitely be added.

I hope this helps. If you need more info, please ask.

toast · August 1, 2022, 1:06pm

OK, I think it looks something like this to get all providers, then their respective datasets and the metadata required to query each dataset. This is exactly what I was after. Thank you!

#%%
import requests

#%%
######## All provider codes
all_providers_url = 'https://api.db.nomics.world/v22/providers'
providers_json = requests.get(all_providers_url).json()
providers = [provider['code'] for provider in providers_json['providers']['docs']]
providers

#%%
######## All datasets for a provider
provider_code = 'IMF'
provider_datasets_url = f'https://api.db.nomics.world/v22/datasets/{provider_code}'
datasets_json = requests.get(provider_datasets_url).json()
datasets_provider = [dataset['code'] for dataset in datasets_json['datasets']['docs']]
datasets_provider

#%%
######## Values of the INDICATOR dimension for the first dataset in the list
# (dimension codes differ between datasets, so 'INDICATOR' will not always exist)
indicators = datasets_json['datasets']['docs'][0]['dimensions_values_labels']['INDICATOR']
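Since dimension codes vary from one dataset to another (the R example earlier in the thread uses weo-country and weo-subject), a small helper can make the lookup less brittle. This is only a sketch: the function name is hypothetical, and it assumes the values under dimensions_values_labels come either as a mapping of code to label or as a list of [code, label] pairs. The sample document is made up for illustration.

```python
from typing import Any, Dict


def dimension_values(dataset_doc: Dict[str, Any], dimension_code: str) -> Dict[str, str]:
    """Return {value_code: label} for one dimension, or {} if it is absent."""
    all_dimensions = dataset_doc.get("dimensions_values_labels", {})
    raw = all_dimensions.get(dimension_code, {})
    # dict() accepts both a mapping and a list of [code, label] pairs,
    # which covers the two layouts assumed here.
    return dict(raw)


# Made-up dataset document for illustration only.
sample_doc = {
    "code": "WEO:2021-10",
    "dimensions_values_labels": {
        "weo-country": {"FRA": "France", "DEU": "Germany"},
        "weo-subject": [["subject-1", "First subject"]],
    },
}

countries = dimension_values(sample_doc, "weo-country")
subjects = dimension_values(sample_doc, "weo-subject")
print(countries)
print(subjects)
```

With a live response, the first argument would be one item of the datasets/docs list, and the available dimension codes are the keys of its dimensions_values_labels entry.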
