opendatascot.Rmd
Use opendatascot to download data from statistics.gov.scot with a single line of R code. opendatascot removes the need to write SPARQL code; you simply need the URI of a dataset. opendatascot can be used interactively, or as part of a reproducible analytical pipeline.
You can download an entire dataset, or filter by date and/or geography. We recommend filtering large datasets. If you require a full download of a large dataset, you may need to contact statistics.gov.scot.
You will always need the last part of the URI for your dataset. Find this on the statistics.gov.scot web page for your dataset (in the API tab). For example, the full URI for Average Household Size is:
http://statistics.gov.scot/data/average-household-size
You just need the last part:
average-household-size
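If you would rather not copy the last part by hand, it can be pulled out of the full URI in R. This is a minimal sketch that relies only on base R's basename(); the variable names are illustrative and not part of opendatascot.

```r
# Full dataset URI, as shown on the dataset's API tab.
dataset_uri <- "http://statistics.gov.scot/data/average-household-size"

# basename() drops everything up to and including the final "/".
dataset_id <- basename(dataset_uri)

print(dataset_id)
#> [1] "average-household-size"
```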
Once you have the last part of the URI, set this as the dataset parameter for ods_dataset():
household_size <- ods_dataset(dataset = "average-household-size")
#> Parsed with column specification:
#> cols(
#>   FeatureCode = col_character(),
#>   DateCode = col_integer(),
#>   Measurement = col_character(),
#>   Units = col_character(),
#>   Value = col_double()
#> )
head(household_size)
#> # A tibble: 6 x 5
#>   FeatureCode DateCode Measurement Units                Value
#>   <chr>          <int> <chr>       <chr>                <dbl>
#> 1 S12000039       2015 Ratio       People Per Household  2.09
#> 2 S12000039       2010 Ratio       People Per Household  2.14
#> 3 S12000039       2005 Ratio       People Per Household  2.21
#> 4 S12000039       2012 Ratio       People Per Household  2.13
#> 5 S12000039       2007 Ratio       People Per Household  2.18
#> 6 S12000039       2006 Ratio       People Per Household  2.19
Filtering is useful for large datasets. The filtering parameters for ods_dataset() are:
start_date
end_date
geography
Use start_date on its own to keep data points from a certain date onwards, or end_date on its own to keep data points up to a certain date:
household_size_2010_onwards <- ods_dataset(dataset = "average-household-size",
                                           start_date = 2010)
head(household_size_2010_onwards)
#>                 refArea refPeriod measureType value
#> 1 Dumfries and Galloway      2003       Ratio  2.25
#> 2         East Ayrshire      2003       Ratio  2.32
#> 3          East Lothian      2003       Ratio  2.32
#> 4     North Lanarkshire      2003       Ratio  2.35
#> 5                 Angus      2003       Ratio  2.24
#> 6           Dundee City      2003       Ratio  2.09
Use both start_date and end_date to filter data points within a certain timeframe.
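As an illustration of supplying both date parameters, the sketch below requests 2010 to 2015. The years are arbitrary choices for the example. Whether each bound is inclusive is not stated here, so check the years in the result before relying on either edge. The call is wrapped in requireNamespace() only so that the snippet does nothing if opendatascot is not installed.

```r
# Illustrative bounds; any pair with first_year <= last_year would do.
first_year <- 2010
last_year <- 2015

if (requireNamespace("opendatascot", quietly = TRUE)) {
  # Supply both parameters to keep only data points within the timeframe.
  household_size_2010_2015 <- opendatascot::ods_dataset(
    dataset = "average-household-size",
    start_date = first_year,
    end_date = last_year
  )
  print(head(household_size_2010_2015))
}
```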
Specify a single geography using its S code (the standard area code, such as S12000039):
household_size_S12000039 <- ods_dataset(dataset = "average-household-size",
                                        geography = "S12000039")
head(household_size_S12000039)
#>                 refArea refPeriod measureType value
#> 1 Dumfries and Galloway      2003       Ratio  2.25
#> 2         East Ayrshire      2003       Ratio  2.32
#> 3          East Lothian      2003       Ratio  2.32
#> 4     North Lanarkshire      2003       Ratio  2.35
#> 5                 Angus      2003       Ratio  2.24
#> 6           Dundee City      2003       Ratio  2.09
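The geography parameter is described above as taking a single S code, so comparing several areas would mean one call per area. The helper below is a hypothetical sketch, not part of opendatascot: fetch_geographies() and its fetch argument are names invented for this example, the second S code is picked purely for illustration, and it assumes every call returns the same columns so the results can be stacked with rbind(). Extra arguments such as start_date are passed through unchanged.

```r
# Hypothetical helper: download one dataset for several areas and stack the rows.
# `fetch` defaults to ods_dataset() but can be swapped for a stub when testing.
fetch_geographies <- function(dataset, geographies, ...,
                              fetch = opendatascot::ods_dataset) {
  pieces <- lapply(geographies, function(code) {
    fetch(dataset = dataset, geography = code, ...)
  })
  # Assumes every piece has identical columns.
  do.call(rbind, pieces)
}

if (requireNamespace("opendatascot", quietly = TRUE)) {
  two_areas <- fetch_geographies(
    dataset = "average-household-size",
    geographies = c("S12000039", "S12000036"),
    start_date = 2010
  )
  print(head(two_areas))
}
```

Keeping the download function as an argument means the looping logic can be checked offline with a stand-in function before any request is sent to statistics.gov.scot.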
sessionInfo()
#> R version 3.5.2 (2018-12-20)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.2 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#>
#> locale:
#> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
#> [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
#> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] opendatascot_0.0.0.9000 SPARQL_1.16 RCurl_1.95-4.11
#> [4] bitops_1.0-6 XML_3.98-1.16
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_0.12.18 rstudioapi_0.8 knitr_1.20 xml2_1.2.0
#> [5] magrittr_1.5 roxygen2_6.1.1 hms_0.4.2 MASS_7.3-51.1
#> [9] R6_2.3.0 rlang_0.2.2 fansi_0.3.0 stringr_1.3.1
#> [13] tools_3.5.2 utf8_1.1.4 cli_1.0.0 htmltools_0.3.6
#> [17] commonmark_1.5 yaml_2.2.0 rprojroot_1.3-2 digest_0.6.17
#> [21] assertthat_0.2.0 tibble_1.4.2 pkgdown_1.1.0 crayon_1.3.4
#> [25] readr_1.1.1 fs_1.2.6 curl_3.2 memoise_1.1.0
#> [29] evaluate_0.12 rmarkdown_1.10 stringi_1.2.4 pillar_1.3.0
#> [33] compiler_3.5.2 desc_1.2.0 backports_1.1.2 pkgconfig_2.0.2