Exploring and Downloading Datasets and Models

Let's start by exploring the repository of datasets and models.

You can do that at the different accessibility layers of EOTDL: the user interface, the API, the command line interface (CLI) and the Python library.

The User Interface

The easiest way to get started with EOTDL is by exploring the user interface. Through the UI you can:

  • Explore the datasets and models available in the repository (filtering by name, tags and likes).
  • Edit the information of your own datasets and models.
  • Read the tutorials on the blog.
  • Read the documentation.
  • Find useful links to other resources (GitHub, Discord, ...)

Quality levels

Datasets and models in EOTDL are categorized into quality levels. The quality levels are:

  • Q0: datasets in the form of an archive with arbitrary files without curation. This level is ideal for easy and fast upload/download of small datasets.
  • Q1: datasets with STAC metadata but no QA. These datasets can leverage a limited set of EOTDL features.
  • Q2: datasets with STAC metadata with the EOTDL custom extensions and automated QA. These datasets can leverage the full potential of the EOTDL.
  • Q3: Q2 datasets that are manually curated. These datasets are the most reliable and can be used as benchmark datasets.
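
The levels above form an ordered scale, which a small enumeration can capture. This is only an illustrative sketch: `QualityLevel` and `has_stac_metadata` are hypothetical names, not part of the EOTDL library.

```python
from enum import IntEnum


class QualityLevel(IntEnum):
    """Hypothetical enumeration of the EOTDL quality levels described above."""
    Q0 = 0  # archive of arbitrary files, no curation
    Q1 = 1  # STAC metadata, no QA
    Q2 = 2  # STAC metadata with EOTDL extensions and automated QA
    Q3 = 3  # Q2 plus manual curation


def has_stac_metadata(level: QualityLevel) -> bool:
    # According to the list above, every level from Q1 upwards carries STAC metadata.
    return level >= QualityLevel.Q1


print([lvl.name for lvl in QualityLevel if has_stac_metadata(lvl)])
```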

The Command Line Interface

Even though the UI is the easiest way to get started, it is not the most convenient for actually working with the datasets and models. For that we recommend installing the CLI.

If you are running this notebook locally, consider creating a virtual environment before installing the CLI to avoid conflicts with other packages.

With conda:

conda create -n eotdl python=3.8
conda activate eotdl

With Python's built-in venv module:

python -m venv eotdl
source eotdl/bin/activate

You may also have to install Jupyter in the new environment and restart the notebook.
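
To confirm that the notebook kernel is actually running inside the new environment, you can inspect the interpreter it uses. This sketch relies only on the standard library; `in_virtualenv` is a hypothetical helper name. The `sys.prefix` comparison detects environments created with `venv`, while an activated conda environment is usually visible through the `CONDA_DEFAULT_ENV` variable.

```python
import os
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base interpreter.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("venv active:", in_virtualenv())
# conda sets CONDA_DEFAULT_ENV when an environment is activated
print("conda env:", os.environ.get("CONDA_DEFAULT_ENV"))
```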

Then, you can install the CLI with pip:

# uncomment to install

# !pip install eotdl

Once installed, you can execute the CLI with different commands.

!eotdl --help
 Usage: eotdl [OPTIONS] COMMAND [ARGS]...                                       
 Welcome to EOTDL. Learn more at                         
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current shell.      │
│ --show-completion             Show completion for the current shell, to copy │
│                               it or customize the installation.              │
│ --help                        Show this message and exit.                    │
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ auth         Login to EOTDL.                                                 │
│ datasets     Explore, ingest and download training datasets.                 │
│ models       Explore, ingest and download ML models.                         │
│ version      Get EOTDL version.                                              │
!eotdl version
EOTDL Version: 2023.11.02-5
!eotdl datasets --help
 Usage: eotdl datasets [OPTIONS] COMMAND [ARGS]...                              
 Explore, ingest and download training datasets.                                
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                  │
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ get        Download a dataset from the EOTDL.                                │
│ ingest     Ingest a dataset to the EOTDL.                                    │
│ list       Retrieve a list with all the datasets in the EOTDL.               │

You can explore datasets with the following command:

!eotdl datasets list 
['EuroSAT-RGB', 'UCMerced', 'EuroSAT', 'SeCo100k', 'SeCo', 'AirbusAircraftDetection', 'AirbusWindTurbinesPatches', 'RoadNet', 'SloveniaLandCover', 'ISPRS-Potsdam2D', 'SEN12-FLOOD', 'Urban3dChallenge', 'tropical-cyclone-dataset', 'Vessel-detection', 'Airplanes-detection', 'S2-SHIPS', 'SpaceNet-7', 'Sentinel-2-Cloud-Mask', 'PASTIS', 'FlodNet', 'EuroCrops', 'open-cities-test', 'PASTIS-R', 'open-cities-tt1-source', 'open-cities-tt2-source', 'LandcoverAI', 'xview2', 'BigEarthNet', 'EuroSAT-RGB-STAC', 'EuroSAT-STAC', 'COWC', 'Stanford-Drone-dataset', 'eurosat-rgb', 'eurosat-rgb-q2', 'EuroSAT-small', 'test-q0', 'Boadella-BiDS23']
!eotdl datasets list --help
 Usage: eotdl datasets list [OPTIONS]                                           
 Retrieve a list with all the datasets in the EOTDL.                            
 If using --name, it will filter the results by name. If no name is provided,   
 it will return all the datasets.                                               
 If using --limit, it will limit the number of results. If no limit is          
 provided, it will return all the datasets.                                     
 $ eotdl datasets list                                                          
 $ eotdl datasets list --name YourModel --limit 5                               
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --name   -n      TEXT     Filter the returned datasets by name               │
│                           [default: None]                                    │
│ --limit  -l      INTEGER  Limit the number of returned results               │
│                           [default: None]                                    │
│ --help                    Show this message and exit.                        │
!eotdl datasets list -n eurosat
['EuroSAT-RGB', 'EuroSAT', 'EuroSAT-RGB-STAC', 'EuroSAT-STAC', 'eurosat-rgb', 'eurosat-rgb-q2', 'EuroSAT-small']
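
The CLI prints the list in what looks like Python literal syntax, so the output can be captured and parsed from a script. A minimal sketch, assuming the output format shown above stays stable; `parse_dataset_list` is a hypothetical helper, not part of EOTDL.

```python
import ast


def parse_dataset_list(stdout: str) -> list:
    # The CLI output above looks like a Python list literal,
    # so ast.literal_eval can turn it into a real list safely.
    names = ast.literal_eval(stdout.strip())
    if not isinstance(names, list):
        raise ValueError("unexpected CLI output")
    return names


# Sample shortened from the output of `eotdl datasets list -n eurosat` above
sample = "['EuroSAT-RGB', 'EuroSAT', 'EuroSAT-small']\n"
print(parse_dataset_list(sample))
```

In a real script, the string could come from `subprocess.run(["eotdl", "datasets", "list", "-n", "eurosat"], capture_output=True, text=True).stdout`.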

As you may have guessed, you can download a dataset with the following command:

!eotdl datasets get EuroSAT-small
Dataset `EuroSAT-small v10` already exists at /home/juan/.cache/eotdl/datasets/EuroSAT-small/v10. To force download, use force=True or -f in the CLI.

The first time you run the command, you will be asked to log in (which requires creating an account if you haven't already). You can also log in with the command:

!eotdl auth login
You are logged in as
!eotdl auth --help
 Usage: eotdl auth [OPTIONS] COMMAND [ARGS]...                                  
 Login to EOTDL.                                                                
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                  │
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ login              Login to the EOTDL.                                       │
│ logout             Logout from the EOTDL.                                    │
!eotdl datasets get --help
 Usage: eotdl datasets get [OPTIONS] [DATASET]                                  
 Download a dataset from the EOTDL.                                             
 If using --path, it will download the dataset to the specified path. If no     
 path is provided, it will download to ~/.eotdl/datasets.                       
 If using --file, it will download the specified file. If no file is provided,  
 it will download the entire dataset.                                           
 If using --version, it will download the specified version. If no version is   
 provided, it will download the latest version.                                 
 If using --assets when the dataset is STAC, it will also download the STAC     
 assets of the dataset. If not provided, it will only download the STAC         
 If using --force, it will download the dataset even if the file already        
 If using --verbose, it will print the progress of the download.                
 $ eotdl dataset get YourDataset                                                
 $ eotdl dataset get YourDataset --path /path/to/download --file    
 --version 1 --assets True --force True --verbose True                          
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│   dataset      [DATASET]  Name of the dataset to download [default: None]    │
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --path     -p      TEXT     Download the dataset to a specific output path   │
│                             [default: None]                                  │
│ --file     -f      TEXT     Download a specific file from the dataset        │
│                             [default: None]                                  │
│ --version  -v      INTEGER  Dataset version [default: None]                  │
│ --assets   -a               Download STAC assets from the dataset            │
│ --force    -f               Force download even if file exists               │
│ --verbose                   Verbose output. This will print the progress of  │
│                             the download                                     │
│ --help                      Show this message and exit.                      │

By default, datasets will be downloaded to your $HOME/.cache/eotdl/datasets folder or the path in the EOTDL_DOWNLOAD_PATH environment variable. You can change this with the --path argument.
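
The lookup order described above can be sketched as follows. This illustrates the stated behaviour and is not the library's actual code: `resolve_download_dir` is a hypothetical helper, and the precedence (explicit path, then `EOTDL_DOWNLOAD_PATH`, then the cache folder) is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_download_dir(path: Optional[str] = None) -> Path:
    # 1. An explicit path (the --path argument) wins.
    if path:
        return Path(path)
    # 2. Otherwise fall back to the environment variable, if set.
    env_path = os.environ.get("EOTDL_DOWNLOAD_PATH")
    if env_path:
        return Path(env_path)
    # 3. Finally use the default cache folder in the home directory.
    return Path.home() / ".cache" / "eotdl" / "datasets"


print(resolve_download_dir("data"))
print(resolve_download_dir())
```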

!eotdl datasets get EuroSAT-small -p data
Dataset `EuroSAT-small v10` already exists at data/EuroSAT-small/v10. To force download, use force=True or -f in the CLI.

You can choose a particular version to download with the --version argument. If you don't specify a version, the latest version will be downloaded.

!eotdl datasets get EuroSAT-small -p data -v 1
100%|███████████████████████████████████████████| 6/6 [00:02<00:00,  2.35file/s]
Data available at data/EuroSAT-small/v1

The version number will be used to create a folder with the same name inside the path you specified. Inside this folder you will find the dataset files.
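
Because each version lives in its own `v<N>` folder, you can find the newest local copy by comparing the numeric part of the folder names (a plain string sort would rank `v10` before `v2`). `latest_local_version` is a hypothetical helper that assumes the `data/EuroSAT-small/v1` layout shown above.

```python
import re
from pathlib import Path
from typing import Optional


def latest_local_version(dataset_dir: str) -> Optional[int]:
    root = Path(dataset_dir)
    if not root.is_dir():
        return None
    # Collect the numeric part of every folder named v<N>.
    versions = []
    for entry in root.iterdir():
        match = re.fullmatch(r"v(\d+)", entry.name)
        if entry.is_dir() and match:
            versions.append(int(match.group(1)))
    return max(versions) if versions else None


print(latest_local_version("data/EuroSAT-small"))
```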

If you try to re-download a dataset that already exists locally, the CLI will refuse and print a message. You can force a re-download with the --force argument.

!eotdl datasets get EuroSAT-small -p data -v 1
Dataset `EuroSAT-small v1` already exists at data/EuroSAT-small/v1. To force download, use force=True or -f in the CLI.
!eotdl datasets get EuroSAT-small -p data -v 1 -f
100%|███████████████████████████████████████████| 6/6 [00:02<00:00,  2.39file/s]
Data available at data/EuroSAT-small/v1

For Q1+ datasets, the get command will only download the STAC metadata of the dataset.

!eotdl datasets get eurosat-rgb -p data 
Downloading a STAC dataset is not implemented

Inside the metadata you will find the links to all the assets, so you can download them individually (maybe after some filtering or processing using only the metadata). However, you can download all assets with the command:

!eotdl datasets get eurosat-rgb -p data -a
Downloading a STAC dataset is not implemented
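
In STAC, each item lists its files under an `assets` mapping in which every entry has an `href`. The sketch below collects those links from an item so they can be filtered before downloading. The sample item and the `asset_links` helper are made up for illustration; real EOTDL metadata may contain additional fields.

```python
from typing import Dict


def asset_links(item: dict) -> Dict[str, str]:
    # Map each asset key to its href, skipping entries without one.
    return {
        key: asset["href"]
        for key, asset in item.get("assets", {}).items()
        if "href" in asset
    }


# Hypothetical STAC item, reduced to the fields used here
item = {
    "type": "Feature",
    "id": "example-item",
    "assets": {
        "image": {"href": "https://example.com/example-item.tif"},
        "thumbnail": {"title": "no link here"},
    },
}

# Keep only the GeoTIFF assets
tif_links = {k: v for k, v in asset_links(item).items() if v.endswith(".tif")}
print(tif_links)
```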

Working with models is very much the same at this point.

!eotdl models --help
 Usage: eotdl models [OPTIONS] COMMAND [ARGS]...                                
 Explore, ingest and download ML models.                                        
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                  │
╭─ Commands ───────────────────────────────────────────────────────────────────╮
│ get        Download a model from the EOTDL.                                  │
│ ingest     Ingest a model to the EOTDL.                                      │
│ list       Retrieve a list with all the models in the EOTDL.                 │
!eotdl models list
!eotdl models list --help
 Usage: eotdl models list [OPTIONS]                                             
 Retrieve a list with all the models in the EOTDL.                              
 If using --name, it will filter the results by name. If no name is provided,   
 it will return all the models.                                                 
 If using --limit, it will limit the number of results. If no limit is          
 provided, it will return all the models.                                       
 $ eotdl models list                                                            
 $ eotdl models list --name YourModel --limit 5                                 
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --name   -n      TEXT     Filter the returned models by name [default: None] │
│ --limit  -l      INTEGER  Limit the number of returned results               │
│                           [default: None]                                    │
│ --help                    Show this message and exit.                        │
!eotdl models get EuroSAT-RGB-BiDS23
100%|███████████████████████████████████████████| 2/2 [00:05<00:00,  2.54s/file]
Data available at /home/juan/.cache/eotdl/models/EuroSAT-RGB-BiDS23/v3
!eotdl models get --help
 Usage: eotdl models get [OPTIONS] [MODEL]                                      
 Download a model from the EOTDL.                                               
 If using --path, it will download the model to the specified path. If no path  
 is provided, it will download to ~/.eotdl/models.                              
 If using --file, it will download the specified file. If no file is provided,  
 it will download the entire model.                                             
 If using --version, it will download the specified version. If no version is   
 provided, it will download the latest version.                                 
 If using --assets when the model is STAC, it will also download the STAC       
 assets of the model. If not provided, it will only download the STAC metadata. 
 If using --force, it will download the model even if the file already exists.  
 If using --verbose, it will print the progress of the download.                
 $ eotdl models get YourModel                                                   
 $ eotdl models get YourModel --path /path/to/download --file         
 --version 1 --assets True --force True --verbose True                          
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│   model      [MODEL]  Name of the model to download [default: None]          │
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --path     -p      TEXT     Download the model to a specific output path     │
│                             [default: None]                                  │
│ --file     -f      TEXT     Download a specific file from the model          │
│                             [default: None]                                  │
│ --version  -v      INTEGER  Model version [default: None]                    │
│ --assets   -a               Download STAC assets from the model              │
│ --force    -f               Force download even if file exists               │
│ --verbose                   Verbose output. This will print the progress of  │
│                             the download                                     │
│ --help                      Show this message and exit.                      │

We will explore how to ingest datasets and models in the next tutorials.

The Library

Everything that we have done so far with the CLI is also available through the Python library. When installing the CLI, the library is automatically installed as well.

import eotdl

from eotdl.datasets import retrieve_datasets

datasets = retrieve_datasets()

With the library, you have full control over the datasets and models.

[d for d in datasets if "eurosat" in d.lower()]

You can download datasets as well, but now you will have to manage potential errors.

from eotdl.datasets import download_dataset

download_dataset("EuroSAT-small")

Exception                                 Traceback (most recent call last)

/home/juan/Desktop/eotdl/tutorials/notebooks/01_exploring.ipynb Cell 50 line 3
      1 from eotdl.datasets import download_dataset
----> 3 download_dataset("EuroSAT-small")

File ~/miniconda3/envs/eotdl/lib/python3.8/site-packages/eotdl/auth/, in with_auth.<locals>.wrapper(*args, **kwargs)
     45 def wrapper(*args, **kwargs):
     46     user = auth()
---> 47     return func(*args, **kwargs, user=user)

File ~/miniconda3/envs/eotdl/lib/python3.8/site-packages/eotdl/datasets/, in download_dataset(dataset_name, version, path, logger, assets, force, verbose, user, file)
     40 if os.path.exists(download_path) and not force:
     41     os.makedirs(download_path, exist_ok=True)
---> 42     raise Exception(
     43         f"Dataset `{dataset['name']} v{str(version)}` already exists at {download_path}. To force download, use force=True or -f in the CLI."
     44     )
     45 if dataset["quality"] == 0:
     46     if file:

Exception: Dataset `EuroSAT-small v10` already exists at /home/juan/.cache/eotdl/datasets/EuroSAT-small/v10. To force download, use force=True or -f in the CLI.
download_dataset("EuroSAT-small", force=True)
100%|██████████| 7/7 [00:02<00:00,  2.57file/s]

download_dataset("EuroSAT-small", force=True, path="data")
100%|██████████| 7/7 [00:02<00:00,  2.76file/s]
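
As the traceback shows, the library raises a plain `Exception` when the dataset is already on disk, so the caller has to inspect the message to tell this case apart from real failures. A minimal sketch: `download_or_reuse` is a hypothetical wrapper, and matching on the text `already exists` is an assumption based on the message above. `fake_download` is a stand-in so the example runs without network access.

```python
def download_or_reuse(download_fn, name, **kwargs):
    """Call download_fn(name, **kwargs), tolerating 'already exists' errors."""
    try:
        return download_fn(name, **kwargs)
    except Exception as err:
        # The message text is the only signal visible in the traceback above.
        if "already exists" in str(err):
            print(f"Reusing local copy of {name}")
            return None
        raise  # anything else is a real failure


def fake_download(name, **kwargs):
    # Stand-in for eotdl.datasets.download_dataset, for demonstration only
    raise Exception(f"Dataset `{name} v10` already exists at /tmp/{name}/v10.")


download_or_reuse(fake_download, "EuroSAT-small")
```

With the library installed, you would pass `download_dataset` as `download_fn` instead of the stand-in.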


In fact, the CLI is built on top of the library.

The same applies to models:

from eotdl.models import retrieve_models

from eotdl.models import download_model 

path = download_model("EuroSAT-RGB-BiDS23", force=True)
100%|██████████| 2/2 [00:04<00:00,  2.26s/file]

import os

os.listdir(path)
['metadata.yml', 'model.onnx']

The Application Programming Interface

The last way to interact with EOTDL is using the API. You can explore the interactive documentation at

You can get the full list of datasets hosted in the EOTDL with the following API call:

import requests

datasets = requests.get("").json()
As you can see, here you get all the information about each dataset (author, license, versions, etc.), not only the name. This is why the API is ideal for building third-party applications on top of EOTDL.

datasets = requests.get("").json()
[(d['name'], d['id'], d['files'], len(d['versions'])) for d in datasets]	
[('EuroSAT-small', '6526accffd974011abc2413a', '6526accffd974011abc2413b', 9)]

In fact, the library (and the CLI) are built on top of the API, so you can achieve the same functionality (or even better!) in your own applications.

files = requests.get("").json()
{'filename': 'Forest/Forest_3.tif',
 'version': 1,
 'checksum': '3e7bb982f9db5f7dabc556016c3d081dfb1fb73d'}
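
The `checksum` field lets you verify a file after downloading it. The value above is 40 hexadecimal characters long, which matches the length of a SHA-1 digest; the sketch below assumes SHA-1 on that basis, which the source does not confirm. `sha1_of_file` and `matches_checksum` are hypothetical helpers.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 8192) -> str:
    # Hash the file in chunks so large files do not need to fit in memory.
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    # Compare case-insensitively, since hex digests may be upper or lower case.
    return sha1_of_file(path) == expected.lower()
```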

Some API calls require you to be authenticated. You can do that as follows:

  • Use the auth/login endpoint to get a login URL and a code
  • Navigate to the login URL to login
  • Use the auth/token endpoint to get a token with the provided code
  • Use the token to authenticate your requests
import os

token = '...'

file = files[0]
filename = file["filename"]
filepath = f'data/{filename}'

os.makedirs(os.path.dirname(filepath), exist_ok=True)
response = requests.get(
    f'{filename}?version=1',
    headers={'Authorization': f'Bearer {token}'},
    stream=True
)
response.raise_for_status()

with open(filepath, 'wb') as file:
    for chunk in response.iter_content(chunk_size=8192):
        file.write(chunk)

HTTPError                                 Traceback (most recent call last)

/home/juan/Desktop/eotdl/tutorials/workshops/bids23/01_exploring_datasets_and_models..ipynb Cell 53 line 1
      9 os.makedirs(os.path.dirname(filepath), exist_ok=True)
     10 response = requests.get(
     11     f'{filename}?version=1', 
     12     headers={'Authorization': f'Bearer {token}'},
     13     stream=True
     14 )
---> 15 response.raise_for_status()
     17 with open(filepath, 'wb') as file:
     18     for chunk in response.iter_content(chunk_size=8192):

File ~/miniconda3/envs/eotdl/lib/python3.8/site-packages/requests/, in Response.raise_for_status(self)
   1016     http_error_msg = (
   1017         f"{self.status_code} Server Error: {reason} for url: {self.url}"
   1018     )
   1020 if http_error_msg:
-> 1021     raise HTTPError(http_error_msg, response=self)

HTTPError: 401 Client Error: Unauthorized for url:
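
A 401 response means the token was missing, invalid or expired. One way to handle it is to catch the error and check the status code before deciding whether to log in again. In this sketch `is_unauthorized` is a hypothetical helper; it relies only on the `response.status_code` attribute that a `requests.HTTPError` carries when raised by `raise_for_status()`.

```python
def is_unauthorized(error: Exception) -> bool:
    # requests.HTTPError raised by raise_for_status() stores the response
    # on the exception; other exceptions may not have one.
    response = getattr(error, "response", None)
    return getattr(response, "status_code", None) == 401


# Typical use (requires the `requests` package and a real URL):
#
# try:
#     response.raise_for_status()
# except requests.HTTPError as err:
#     if is_unauthorized(err):
#         print("Token rejected: log in again to obtain a new one.")
#     else:
#         raise
```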
