1. Getting the Spanish Open Human Mobility Data in a Reproducible Way using {spanishoddata}
and aggregating it with {duckdb}
July 21, 2025
Logos are the property of their respective owners.
| 13:30 – | Getting the Open Human Mobility Data in a Reproducible Way using spanishoddata R package and aggregating the data using duckdb |
| | Visualization of Mobility using flowmapper and flowmapblue |
| 15:00 – 15:30 | Official Coffee Break |
| – 17:00 | Accessibility and its relation to actual mobility |
- What did we know about human mobility?
- Spanish Open Mobility Big Data (Ministerio de Transportes y Movilidad Sostenible (MITMS) 2024)
- {spanishoddata} R package to get the data (Kotov, Lovelace, and Vidal-Tortosa 2024)
- Big data analysis with DuckDB (Raasveldt and Muehleisen 2018) and the duckdb R package (Mühleisen and Raasveldt 2024)
Image source: Martínez-Durive et al. (2023)
Mobile phone data for humanitarian and development efforts in low- and middle-income countries
Data by Ministerio de Transportes y Movilidad Sostenible (MITMS) (2024)
Based on 13 million customers of Orange Spain, expanded to full population of Spain
View the dashboard at https://data.transportes.gob.es/public/mov-diaria-mensual
View the dashboard at https://data.transportes.gob.es/public/mov-provincial
View the dashboard at https://mapas-movilidad.transportes.gob.es/?date=2025-04-16T22
spanishoddata R package
spanishoddata - access open human mobility data
spanishoddata use cases
The share of people who have not spent the night at home.
Use the {spanishoddata} package
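A minimal sketch of how a table like `od_data` might be requested with the package. It assumes the package's `spod_set_data_dir()` and `spod_get()` functions, district-level zones, and the single date that appears in the output (2022-01-04). The helper name `get_od_for_day` and the temporary cache directory are illustrative assumptions, not taken from the slides.

```r
# Hypothetical helper (name and defaults are assumptions for illustration):
# wraps the two spanishoddata calls that return a lazy, DuckDB-backed
# table of origin-destination flows for the requested date(s).
get_od_for_day <- function(day, data_dir = tempdir()) {
  # Tell the package where to cache the downloaded files
  spanishoddata::spod_set_data_dir(data_dir)

  # Download the files if they are not cached yet and connect to them
  spanishoddata::spod_get(
    type = "od",         # origin-destination flows
    zones = "districts", # district-level zoning
    dates = day
  )
}

# Not run here: needs internet access and downloads data.
# od_data <- get_od_for_day("2022-01-04")
```

In real use, a persistent folder is a better choice for `data_dir` than `tempdir()`, so that the files are downloaded only once.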
library(dplyr)
glimpse(od_data)
Rows: ??
Columns: 20
Database: DuckDB v1.2.1 [root@Darwin 24.4.0:R 4.5.0/:memory:]
$ date <date> 2022-01-04, 2022-01-04, 2…
$ hour <int> 0, 0, 0, 1, 1, 3, 4, 4, 5,…
$ id_origin <fct> 01001, 01001, 01001, 01001…
$ id_destination <fct> 01009_AM, 01009_AM, 01009_…
$ distance <fct> 2-10, 2-10, 2-10, 2-10, 2-…
$ activity_origin <fct> home, frequent_activity, w…
$ activity_destination <fct> frequent_activity, home, h…
$ study_possible_origin <lgl> FALSE, FALSE, FALSE, FALSE…
$ study_possible_destination <lgl> FALSE, FALSE, FALSE, FALSE…
$ residence_province_ine_code <fct> 01, 01, 01, 01, 01, 01, 01…
$ residence_province_name <fct> "Araba/Álava", "Araba/Álav…
$ income <fct> 10-15, >15, >15, >15, >15,…
$ age <fct> NA, NA, NA, NA, NA, NA, NA…
$ sex <fct> NA, NA, NA, NA, NA, NA, NA…
$ n_trips <dbl> 4.894, 1.779, 1.094, 1.094…
$ trips_total_length_km <dbl> 27.966, 5.997, 4.081, 4.16…
$ year <int> 2022, 2022, 2022, 2022, 20…
$ month <int> 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ day <int> 4, 4, 4, 4, 4, 4, 4, 4, 4,…
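The `Rows: ??` line in the output indicates a lazy table: dplyr verbs are translated to SQL and run by DuckDB, and only the result is brought into R by `collect()`. A minimal sketch of such an aggregation follows. It uses a small made-up table (`od_toy`, hypothetical values) with a few of the same column names instead of the real data, so it runs without any download, and it assumes the dbplyr package is installed.

```r
library(DBI)
library(duckdb)
library(dplyr)

# In-memory DuckDB database holding a toy stand-in for od_data.
# The values are invented for illustration only.
con <- dbConnect(duckdb::duckdb(), dbdir = ":memory:")
toy <- data.frame(
  date = as.Date("2022-01-04"),
  hour = c(0L, 0L, 1L, 1L),
  id_origin = c("01001", "01001", "01001", "01002"),
  id_destination = c("01009_AM", "01002", "01009_AM", "01001"),
  n_trips = c(4.5, 1.5, 1.0, 2.0),
  trips_total_length_km = c(27.0, 6.0, 4.0, 8.0)
)
dbWriteTable(con, "od_toy", toy)

# Lazy reference to the table: nothing is loaded into R yet
od_toy <- tbl(con, "od_toy")

# Total trips and kilometres per hour, computed inside DuckDB;
# collect() brings only the small aggregated result into R
hourly <- od_toy |>
  group_by(date, hour) |>
  summarise(
    n_trips = sum(n_trips, na.rm = TRUE),
    km = sum(trips_total_length_km, na.rm = TRUE),
    .groups = "drop"
  ) |>
  arrange(hour) |>
  collect()

dbDisconnect(con, shutdown = TRUE)
```

The same `group_by()` / `summarise()` / `collect()` pattern would apply to `od_data` itself, since it is also a DuckDB-backed lazy table.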
Plain text citations:
---------------------
To cite the spanishoddata package:
Kotov E, Lovelace R, Vidal-Tortosa E (2024). spanishoddata.
doi:10.32614/CRAN.package.spanishoddata
https://doi.org/10.32614/CRAN.package.spanishoddata,
https://github.com/rOpenSpain/spanishoddata.
To cite the Ministry's mobility study website:
Ministerio de Transportes y Movilidad Sostenible (MITMS)
(2024). “Estudio de la movilidad con Big Data (Study of
mobility with Big Data).”
https://www.transportes.gob.es/ministerio/proyectos-singulares/estudio-de-movilidad-con-big-data.
To cite the methodology for 2020-2021 data:
Ministerio de Transportes, Movilidad y Agenda Urbana (MITMA)
(2021). Análisis de la movilidad en España con tecnología Big
Data durante el estado de alarma para la gestión de la crisis
del COVID-19 (Analysis of mobility in Spain with Big Data
technology during the state of alarm for COVID-19 crisis
management).
https://cdn.mitma.gob.es/portal-web-drupal/covid-19/bigdata/mitma_-_estudio_movilidad_covid-19_informe_metodologico_v3.pdf.
To cite the methodology for 2022 and onwards data:
Ministerio de Transportes y Movilidad Sostenible (MITMS)
(2024). Estudio de movilidad de viajeros de ámbito nacional
aplicando la tecnología Big Data. Informe metodológico (Study
of National Traveler mobility Using Big Data Technology.
Methodological Report).
https://www.transportes.gob.es/recursos_mfom/paginabasica/recursos/a3_informe_metodologico_estudio_movilidad_mitms_v8.pdf.
Note: A more up-to-date methodology document may be available at https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/metodologia-del-estudio-de-movilidad-con-bigdata
Markdown citations:
-------------------
**To cite the spanishoddata package:**
Kotov E, Lovelace R, Vidal-Tortosa E (2024). _spanishoddata_.
doi:10.32614/CRAN.package.spanishoddata
https://doi.org/10.32614/CRAN.package.spanishoddata,
https://github.com/rOpenSpain/spanishoddata.
**To cite the Ministry's mobility study website:**
Ministerio de Transportes y Movilidad Sostenible (MITMS)
(2024). “Estudio de la movilidad con Big Data (Study of
mobility with Big Data).”
https://www.transportes.gob.es/ministerio/proyectos-singulares/estudio-de-movilidad-con-big-data.
**To cite the methodology for 2020-2021 data:**
Ministerio de Transportes, Movilidad y Agenda Urbana (MITMA)
(2021). _Análisis de la movilidad en España con tecnología Big
Data durante el estado de alarma para la gestión de la crisis
del COVID-19 (Analysis of mobility in Spain with Big Data
technology during the state of alarm for COVID-19 crisis
management)_.
https://cdn.mitma.gob.es/portal-web-drupal/covid-19/bigdata/mitma_-_estudio_movilidad_covid-19_informe_metodologico_v3.pdf.
**To cite the methodology for 2022 and onwards data:**
Ministerio de Transportes y Movilidad Sostenible (MITMS)
(2024). _Estudio de movilidad de viajeros de ámbito nacional
aplicando la tecnología Big Data. Informe metodológico (Study
of National Traveler mobility Using Big Data Technology.
Methodological Report)_.
https://www.transportes.gob.es/recursos_mfom/paginabasica/recursos/a3_informe_metodologico_estudio_movilidad_mitms_v8.pdf.
> **Note:** A more up-to-date methodology document may be available at https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/metodologia-del-estudio-de-movilidad-con-bigdata
BibTeX citations:
-----------------
%% To cite the spanishoddata package
@Manual{r-spanishoddata,
title = {spanishoddata},
author = {Egor Kotov and Robin Lovelace and Eugeni Vidal-Tortosa},
year = {2024},
url = {https://github.com/rOpenSpain/spanishoddata},
doi = {10.32614/CRAN.package.spanishoddata},
}
%% To cite the Ministry's mobility study website
@Misc{mitms_mobility_web,
title = {Estudio de la movilidad con Big Data (Study of mobility with Big Data)},
author = {{Ministerio de Transportes y Movilidad Sostenible (MITMS)}},
year = {2024},
url = {https://www.transportes.gob.es/ministerio/proyectos-singulares/estudio-de-movilidad-con-big-data},
}
%% To cite the methodology for 2020-2021 data
@Manual{mitma_methodology_2020_v3,
title = {Análisis de la movilidad en España con tecnología Big Data durante el estado de alarma para la gestión de la crisis del COVID-19 (Analysis of mobility in Spain with Big Data technology during the state of alarm for COVID-19 crisis management)},
author = {{Ministerio de Transportes, Movilidad y Agenda Urbana (MITMA)}},
year = {2021},
url = {https://cdn.mitma.gob.es/portal-web-drupal/covid-19/bigdata/mitma_-_estudio_movilidad_covid-19_informe_metodologico_v3.pdf},
}
%% To cite the methodology for 2022 and onwards data
@Manual{mitms_methodology_2022_v8,
title = {Estudio de movilidad de viajeros de ámbito nacional aplicando la tecnología Big Data. Informe metodológico (Study of National Traveler mobility Using Big Data Technology. Methodological Report)},
author = {{Ministerio de Transportes y Movilidad Sostenible (MITMS)}},
year = {2024},
url = {https://www.transportes.gob.es/recursos_mfom/paginabasica/recursos/a3_informe_metodologico_estudio_movilidad_mitms_v8.pdf},
}
%% Note: A more up-to-date methodology document may be available at https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/metodologia-del-estudio-de-movilidad-con-bigdata
OpenAI. (2025). A digital image of stacked hard drives on a MacBook representing big data [AI-generated image]. Created using DALL·E. https://openai.com/dall-e
Only for illustration purposes. May not accurately represent actual GB usage and DuckDB operation.