Skip to contents

The khisr package seamlessly integrates with the Kenya Health Information System (KHIS) API from R, empowering researchers and public health professionals to easily access and analyze valuable health data. Built on the DHIS2 platform, khisr automates data retrieval, saving you time and effort compared to manual methods.

Authentication

To ensure a secure and seamless experience, khisr operates by default in authenticated mode, enabling you to interact with KHIS as a recognized user. This means that before diving into KHIS data exploration, you’ll need to establish your credentials with the system.

Setting Your Credentials:

  1. Obtain your KHIS credentials: These typically consist of a username and password, which you can acquire through appropriate channels within the KHIS organization.

  2. Store your credentials securely: khisr provides a convenient mechanism for storing your credentials within your R environment. To learn how to set and manage your credentials effectively, please refer to the comprehensive guide at Set Your Credentials.

# Set the credentials using username and password
khis_cred(username = 'KHIS username', password = 'KHIS password')

# Set credentials using configuration path
khis_cred(config_path = 'path/to/secret.json')

Metadata

KHIS uses metadata to define the structure and meaning of the data stored in the system. For more information see data dimensions

Metadata helpers

khisr provides a number of high level metadata helpers for obtaining details of the different types of metadata. There is a helper for each of the main metadata categories in DHIS2, in fact auto complete in the users R IDE will often make typing these very fast. This is the list of the numerous high level metadata helpers in khisr.

khisr function KHIS API Endpoint
get_categories() categories
get_category_combos() categoryCombos
get_category_option_combos() categoryOptionCombos
get_category_option_group_sets() categoryOptionGroupSets
get_category_option_groups() categoryOptionGroups
get_category_options() categoryOptions
get_data_element_group_sets() dataElementGroupSets
get_data_element_groups() dataElementGroups
get_data_elements() dataElements
get_data_sets() dataSets
get_indicator_group_sets() indicatorGroupSets
get_indicator_groups() indicatorGroups
get_indicators() indicators
get_option_group_sets() optionGroupSets
get_option_groups() optionGroups
get_option_sets() optionSets
get_options() options
get_organisation_unit_groupsets() organisationUnitGroupSets
get_organisation_unit_groups() organisationUnitGroups
get_organisation_units() organisationUnits
get_dimensions() dimensions
get_user_groups() userGroups
get_period_types() periodTypes

Metadata object filter

To filter the metadata there are several filter operations that can be applied to the returned list of metadata. The format of the filter itself is straight-forward and follows the pattern property:operator:value, where property is the property on the metadata you want to filter on, operator is the comparison operator you want to perform and value is the value to check against (not all operators require value).

KHIS Operator Infix Operator Description
eq %.eq% Equality
!eq %.~eq% Inequality
ieq %.ieq% Case insensitive string, match exact
ne %.ne% Inequality
like %.Like% Case sensitive string, match anywhere
!like %.~Like% Case sensitive string, not match anywhere
$like %.^Like% Case sensitive string, match start
!$like %.~^Like% Case sensitive string, not match start
like$ %.Like$% Case sensitive string, match end
!like$ %.~Like$% Case sensitive string, not match end
ilike %.like% Case insensitive string, match anywhere
!ilike %.~like% Case insensitive string, not match anywhere
$ilike %.^like% Case insensitive string, match start
!$ilike %.~^like% Case insensitive string, not match start
ilike$ %.like$% Case insensitive string, match end
!ilike$ %.~like$% Case insensitive string, not match end
gt %.gt% Greater than
ge %.ge% Greater than or equal
lt %.lt% Less than
le %.le% Less than or equal
token %.token% Match on multiple tokens in search property
!token %.~token% Not match on multiple tokens in search property
in %.in% Find objects matching 1 or more values
!in %.~in% Find objects not matching 1 or more values

Working with metadata filters

Basic usage of the metadata filter

# Retrieve organisation units by county (level 2)
county <- get_organisation_units(level %.eq% '2')
county
#> # A tibble: 47 × 2
#>    name                   id         
#>    <chr>                  <chr>      
#>  1 Baringo County         vvOK1BxTbet
#>  2 Bomet County           HMNARUV2CW4
#>  3 Bungoma County         KGHhQ5GLd4k
#>  4 Busia County           Tvf1zgVZ0K4
#>  5 Elgeyo Marakwet County MqnLxQBigG0
#>  6 Embu County            PFu8alU2KWG
#>  7 Garissa County         uyOrcHZBpW0
#>  8 Homa Bay County        nK0A12Q7MvS
#>  9 Isiolo County          bzOfj0iwfDH
#> 10 Kajiado County         Hsk1YV8kHkT
#> # ℹ 37 more rows

# Retrieve county by name (Mombasa)
county <- get_organisation_units(level %.eq% '2',
                                 name %.like% 'mombasa')
county
#> # A tibble: 1 × 2
#>   name           id         
#>   <chr>          <chr>      
#> 1 Mombasa County wsBsC6gjHvn

data_element_id <- c('cXe64Yk0QMY', 'XEX93uLsAm2')

# Retrieve data elements by ID using operator in
data_elements <- get_data_elements(id %.in% data_element_id)
data_elements
#> # A tibble: 2 × 2
#>   name         id         
#>   <chr>        <chr>      
#> 1 CBE-Abnormal XEX93uLsAm2
#> 2 CBE-Normal   cXe64Yk0QMY

# Retrieve data elements by filtering using dataElementGroups
data_elements <- get_data_elements(dataElementGroups.name %.like% 'moh 705')
data_elements
#> # A tibble: 96 × 2
#>    name                         id         
#>    <chr>                        <chr>      
#>  1 Abortion                     IrWSgk9GsUm
#>  2 All other diseases           KxT47tbKHsd
#>  3 Anaemia cases                kkUHOwGMawD
#>  4 Arthritis, Joint pains etc.  waNhWrS3HL6
#>  5 Asthma                       L82lvvxVaqt
#>  6 Autism                       L529r3Wvtcf
#>  7 Bilharzia  (Schistosomiasis) ojFSHMwbkHK
#>  8 Brucellosis                  nb9cfWgxYFc
#>  9 Burns                        dkEYL9Sous9
#> 10 Cardiovascular conditions    sZETzNe1To8
#> # ℹ 86 more rows

Data analytics

The analytics resource in DHIS2 empowers you to access and analyze aggregated data across various dimensions. To effectively leverage this resource, let’s explore the key functions and parameters involved:

Key Functions

  • get_analytics(): Retrieves aggregated data based on specified dimensions and filters.
  • analytics_dimension(): Constructs dimensions for queries, ensuring accurate data retrieval.
  • %.d% (infix operator): Convenient shorthand for creating dimension filters.
  • %.f% (infix operator): Convenient shorthand for creating filter dimensions.

Dimension (dx)

The dimension query parameter defines which dimensions should be included in the analytics query. Any number of dimensions can be specified. The dimension parameter should be repeated for each dimension to include in the query response. The query response can potentially contain aggregated values for all combinations of the specified dimension items. The fixed dimensions are the data element (dx) period (time) (pe) and organisation unit (ou) dimension. You can dynamically add dimensions through categories, data element group sets and organisation unit group sets.

Dimension ID Dimensions
dx Data elements, indicators, data set reporting rate metrics,
data element operands, program indicators, program data elements,
program attributes, validation rules
pe ISO periods and relative periods (see “date and period format”)
ou Organisation unit hierarchy
Organisation unit identifiers, keywords USER_ORGUNIT,
USER_ORGUNIT_CHILDREN, USER_ORGUNIT_GRANDCHILDREN, LEVEL-,
and OU_GROUP-
co Category option combo identifiers (use all to get all items)
ao Category option combo identifiers (use all to get all items)

Filter (filter)

The filter parameter defines which dimensions should be used as filters for the data retrieved in the analytics query. Any number of filters can be specified. The filter parameter should be repeated for each filter to use in the query. A filter differs from a dimension in that the filter dimensions will not be part of the query response content, and that the aggregated values in the response will be collapsed on the filter dimensions. In other words, the data in the response will be aggregated on the filter dimensions, but the filters will not be included as dimensions in the actual response.

Constructing Queries

  • Specify Dimensions: Use the dimensions parameter to define the dimensions you want to include in the query response. Leverage the infix operators %.d% for concise and readable code.
# To include a list dimensions for data elements id, dataset ids
dx %.d% c('dimension-id-1', 'dimension-id-2')
#> <spliced>
#> $dimension
#> [1] "dx:dimension-id-1;dimension-id-2"

pe %.d% 'LAST_YEAR'
#> <spliced>
#> $dimension
#> [1] "pe:LAST_YEAR"

ou %.d% 'USER_ORGUNIT'
#> <spliced>
#> $dimension
#> [1] "ou:USER_ORGUNIT"

# showing in the analytics
get_analytics(
    dx %.d% c('siOyOiOJpI8', 'Lt0FqtnHraW', 'OoakJhWiyZp'),
    pe %.d% 'LAST_YEAR',
    ou %.d% c('qKzosKQPl6G')
)
#> # A tibble: 3 × 4
#>   dx          pe    ou          value
#>   <chr>       <chr> <chr>       <dbl>
#> 1 OoakJhWiyZp 2023  qKzosKQPl6G  5092
#> 2 Lt0FqtnHraW 2023  qKzosKQPl6G 31101
#> 3 siOyOiOJpI8 2023  qKzosKQPl6G 20554

# Using the startDate and endDate with organisation unit keyword 'USER_ORGUNIT'
get_analytics(
    dx %.d% c('siOyOiOJpI8', 'Lt0FqtnHraW', 'OoakJhWiyZp'),
    ou %.d% 'USER_ORGUNIT',
    pe %.d% 'all',
    startDate = '2023-07-01',
    endDate = '2023-12-31'
)
#> # A tibble: 18 × 4
#>    dx          ou          pe       value
#>    <chr>       <chr>       <chr>    <dbl>
#>  1 siOyOiOJpI8 HfVjCurKxh2 202310  750516
#>  2 siOyOiOJpI8 HfVjCurKxh2 202312  648714
#>  3 Lt0FqtnHraW HfVjCurKxh2 202307 1122397
#>  4 OoakJhWiyZp HfVjCurKxh2 202308  367187
#>  5 siOyOiOJpI8 HfVjCurKxh2 202311  670316
#>  6 Lt0FqtnHraW HfVjCurKxh2 202308  958455
#>  7 OoakJhWiyZp HfVjCurKxh2 202307  432283
#>  8 Lt0FqtnHraW HfVjCurKxh2 202312  766901
#>  9 OoakJhWiyZp HfVjCurKxh2 202312  284189
#> 10 Lt0FqtnHraW HfVjCurKxh2 202310  839725
#> 11 Lt0FqtnHraW HfVjCurKxh2 202311  808704
#> 12 OoakJhWiyZp HfVjCurKxh2 202309  329826
#> 13 siOyOiOJpI8 HfVjCurKxh2 202307  995644
#> 14 siOyOiOJpI8 HfVjCurKxh2 202309  843735
#> 15 OoakJhWiyZp HfVjCurKxh2 202311  261248
#> 16 siOyOiOJpI8 HfVjCurKxh2 202308  856746
#> 17 OoakJhWiyZp HfVjCurKxh2 202310  257664
#> 18 Lt0FqtnHraW HfVjCurKxh2 202309  942416
  • Apply Filters: Use the filters parameter to specify dimensions for filtering data without including them in the response.
# Filter by period
pe %.f% 'LAST_YEAR'
#> <spliced>
#> $filter
#> [1] "pe:LAST_YEAR"

# Filter by organisation unit
ou %.f% 'USER_ORGUNIT'
#> <spliced>
#> $filter
#> [1] "ou:USER_ORGUNIT"

# showing in the analytics. filter by organisation unit with id 'qKzosKQPl6G'
# and period 'LAST_YEAR'
get_analytics(
    dx %.d% c('siOyOiOJpI8', 'Lt0FqtnHraW', 'OoakJhWiyZp'),
    pe %.f% 'LAST_YEAR',
    ou %.f% 'qKzosKQPl6G'
)
#> # A tibble: 3 × 2
#>   dx          value
#>   <chr>       <dbl>
#> 1 OoakJhWiyZp  5092
#> 2 siOyOiOJpI8 20554
#> 3 Lt0FqtnHraW 31101