Introduction

Use opendatascot to download data from statistics.gov.scot with a single line of R code. opendatascot removes the need to write SPARQL code; you simply need the URI of a dataset. opendatascot can be used interactively, or as part of a reproducible analytical pipeline.

Usage

You can download an entire dataset, or filter by date and/or geography. We recommend filtering large datasets. If you require a full download of a large dataset, you may need to contact statistics.gov.scot.

You will always need the last part of the URI for your dataset. Find this on the statistics.gov.scot web page for your dataset (in the API tab). For example, the full URI for Average Household Size is:

http://statistics.gov.scot/data/average-household-size

You only need the last part:

average-household-size
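
If you have copied the full URI, one way to get the last part in R is base R's basename(), which returns everything after the final slash. This is a minimal sketch; the variable names dataset_uri and dataset_id are illustrative and not part of opendatascot:

```r
# Full dataset URI, as copied from the dataset's page on statistics.gov.scot
dataset_uri <- "http://statistics.gov.scot/data/average-household-size"

# basename() drops everything up to and including the final "/"
dataset_id <- basename(dataset_uri)
print(dataset_id)
#> [1] "average-household-size"
```

The resulting string can then be passed as the dataset parameter of ods_dataset().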

Without filtering

Once you have the last part of the URI, pass it as the dataset parameter of ods_dataset():

household_size <- ods_dataset(dataset = "average-household-size")
#> Parsed with column specification:
#> cols(
#>   FeatureCode = col_character(),
#>   DateCode = col_integer(),
#>   Measurement = col_character(),
#>   Units = col_character(),
#>   Value = col_double()
#> )
head(household_size)
#> # A tibble: 6 x 5
#>   FeatureCode DateCode Measurement Units                Value
#>   <chr>          <int> <chr>       <chr>                <dbl>
#> 1 S12000039       2015 Ratio       People Per Household  2.09
#> 2 S12000039       2010 Ratio       People Per Household  2.14
#> 3 S12000039       2005 Ratio       People Per Household  2.21
#> 4 S12000039       2012 Ratio       People Per Household  2.13
#> 5 S12000039       2007 Ratio       People Per Household  2.18
#> 6 S12000039       2006 Ratio       People Per Household  2.19

With filtering

Filtering is useful for large datasets. The filtering parameters for ods_dataset() are:

  • start_date
  • end_date
  • geography

Filter by date

Use start_date on its own to keep data points from a given date onwards, or end_date on its own to keep data points up to a given date:

household_size_2010_onwards <- ods_dataset(dataset = "average-household-size",
                                           start_date = 2010)
head(household_size_2010_onwards)
#>                 refArea refPeriod measureType value
#> 1 Dumfries and Galloway      2003       Ratio  2.25
#> 2         East Ayrshire      2003       Ratio  2.32
#> 3          East Lothian      2003       Ratio  2.32
#> 4     North Lanarkshire      2003       Ratio  2.35
#> 5                 Angus      2003       Ratio  2.24
#> 6           Dundee City      2003       Ratio  2.09

Use start_date and end_date together to keep only the data points within a certain time frame.
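
A date-window request might be wrapped as in the sketch below. The helper download_window() and its from/to arguments are hypothetical names introduced for illustration; only ods_dataset() and its dataset, start_date and end_date parameters come from the package. Whether the boundary years are included in the result is not stated here, so check the output before relying on it.

```r
# Hypothetical helper: download a dataset restricted to a range of years.
# Assumes opendatascot is installed and statistics.gov.scot is reachable.
download_window <- function(dataset, from, to) {
  # Catch reversed ranges before any request is sent
  stopifnot(is.numeric(from), is.numeric(to), from <= to)
  opendatascot::ods_dataset(dataset = dataset,
                            start_date = from,
                            end_date = to)
}

# Example call (not run here because it needs network access):
# household_size_2010_2015 <- download_window("average-household-size", 2010, 2015)
# head(household_size_2010_2015)
```

Checking the range up front means a mistyped year fails immediately with a clear error rather than after a round trip to the server.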

Filter by geography

Specify a single geography using an S code:

household_size_S12000039 <- ods_dataset(dataset = "average-household-size",
                                        geography = "S12000039")
head(household_size_S12000039)
#>                 refArea refPeriod measureType value
#> 1 Dumfries and Galloway      2003       Ratio  2.25
#> 2         East Ayrshire      2003       Ratio  2.32
#> 3          East Lothian      2003       Ratio  2.32
#> 4     North Lanarkshire      2003       Ratio  2.35
#> 5                 Angus      2003       Ratio  2.24
#> 6           Dundee City      2003       Ratio  2.09
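
Because date and geography filters can be used together, a combined request could look like the sketch below. The helper is_s_code() and the query list are hypothetical and not part of opendatascot; the check assumes the standard nine-character form of Scottish geography codes ("S" followed by eight digits), and the year 2010 is illustrative only.

```r
# Hypothetical check: "S" followed by exactly eight digits
is_s_code <- function(x) grepl("^S[0-9]{8}$", x)

# Arguments for a request filtered by both date and geography
query <- list(dataset    = "average-household-size",
              start_date = 2010,
              geography  = "S12000039")
stopifnot(is_s_code(query$geography))

# Only attempt the download when opendatascot is installed;
# a failed request is reported as a message instead of stopping the script
if (requireNamespace("opendatascot", quietly = TRUE)) {
  household_size_filtered <- tryCatch(
    do.call(opendatascot::ods_dataset, query),
    error = function(e) {
      message("Download failed: ", conditionMessage(e))
      NULL
    }
  )
  if (!is.null(household_size_filtered)) print(head(household_size_filtered))
}
```

Validating the code locally catches typos such as a missing digit before a request is made.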

sessionInfo()
#> R version 3.5.2 (2018-12-20)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.2 LTS
#> 
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#> 
#> locale:
#>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
#>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
#>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] opendatascot_0.0.0.9000 SPARQL_1.16             RCurl_1.95-4.11        
#> [4] bitops_1.0-6            XML_3.98-1.16          
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_0.12.18     rstudioapi_0.8   knitr_1.20       xml2_1.2.0      
#>  [5] magrittr_1.5     roxygen2_6.1.1   hms_0.4.2        MASS_7.3-51.1   
#>  [9] R6_2.3.0         rlang_0.2.2      fansi_0.3.0      stringr_1.3.1   
#> [13] tools_3.5.2      utf8_1.1.4       cli_1.0.0        htmltools_0.3.6 
#> [17] commonmark_1.5   yaml_2.2.0       rprojroot_1.3-2  digest_0.6.17   
#> [21] assertthat_0.2.0 tibble_1.4.2     pkgdown_1.1.0    crayon_1.3.4    
#> [25] readr_1.1.1      fs_1.2.6         curl_3.2         memoise_1.1.0   
#> [29] evaluate_0.12    rmarkdown_1.10   stringi_1.2.4    pillar_1.3.0    
#> [33] compiler_3.5.2   desc_1.2.0       backports_1.1.2  pkgconfig_2.0.2