
Takes a space-separated HLA typing string and splits it into its constituent alleles, each named by locus and position (e.g. "A_1", "A_2", "DRB1_1").

extract_alleles_str() takes a single string and returns a named character vector of alleles.

extract_alleles_df() takes a data frame in which one column contains the typing string, and returns the same data frame with a new column added for each allele.
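The naming scheme can be pictured with a small base R sketch. It is only an illustration, assuming each locus gets exactly two slots and that missing alleles are filled with NA (as the extract_alleles_str() examples on this page suggest). name_alleles_sketch() is a hypothetical helper and not part of the package.

```r
# Hypothetical helper (not the package's implementation): pad the alleles of
# one locus to two slots and name them "<locus>_1" and "<locus>_2".
name_alleles_sketch <- function(locus, alleles) {
  padded <- alleles[1:2]                    # indexing past the end yields NA
  names(padded) <- paste0(locus, "_", 1:2)  # e.g. "C_1", "C_2"
  padded
}

name_alleles_sketch("A", c("1", "2"))
name_alleles_sketch("C", "3")
```

The second call returns "3" for C_1 and NA for C_2, mirroring how a locus with a single typed allele appears in the output.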

Usage

extract_alleles_str(
  string,
  loci = c("A", "B", "C", "DPA1", "DPB1", "DQA1", "DQB1", "DRB1", "DRB."),
  strip_locus = TRUE
)

extract_alleles_df(
  df,
  col_typing,
  loci = c("A", "B", "C", "DPA1", "DPB1", "DQA1", "DQB1", "DRB1", "DRB."),
  strip_locus = TRUE
)

Arguments

string

String, space-separated HLA typing.

loci

A string or character vector of the loci you are interested in. Only alleles at these loci are returned. Defaults to all supported loci. The "DRB." entry covers DRB3, DRB4, and DRB5.

strip_locus

Whether to remove the locus from the extracted alleles or keep it.

  • If TRUE (the default), the locus is removed from the extracted alleles.

  • If FALSE, the locus is retained as it appears in the original typing.

df

A data frame.

col_typing

The column in df that contains a space-separated HLA typing string for each row.
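To make the effect of strip_locus = TRUE more concrete, here is a minimal base R sketch of removing the locus prefix. It assumes molecular nomenclature, where the locus is everything up to and including the "*"; serological typings such as "A1" or "Cw3" would need different handling. strip_locus_sketch() is a hypothetical helper and does not reflect how the package does this internally.

```r
# Hypothetical helper (not the package's implementation): drop a molecular
# locus prefix such as "DRB1*" from each allele.
strip_locus_sketch <- function(alleles) {
  # "^[A-Z]+[0-9]*\\*" matches letters, optional digits, then a literal "*"
  sub("^[A-Z]+[0-9]*\\*", "", alleles)
}

alleles <- c("DQB1*03:01", "DQB1*05:01", "DRB1*04:AMR")
strip_locus_sketch(alleles)
```

For the input above this returns "03:01", "05:01", and "04:AMR". Leaving the vector untouched corresponds to strip_locus = FALSE.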

Value

extract_alleles_str() returns a named character vector of alleles. extract_alleles_df() returns the input data frame with a new column for each allele. A warning is shown if any locus in the input has more than two alleles.
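The kind of check behind that warning could look roughly like the base R sketch below. It is an assumption-laden illustration: it only handles molecular nomenclature (locus before the "*"), and both check_allele_counts() and the warning text are hypothetical rather than taken from the package.

```r
# Hypothetical helper (not the package's implementation): count alleles per
# locus in a space-separated typing and warn when a locus has more than two.
check_allele_counts <- function(typing) {
  alleles <- strsplit(typing, " ", fixed = TRUE)[[1]]
  loci <- sub("\\*.*$", "", alleles)  # text before "*" is taken as the locus
  counts <- table(loci)
  too_many <- names(counts)[counts > 2]
  if (length(too_many) > 0) {
    warning(
      "More than two alleles found for: ",
      paste(too_many, collapse = ", ")
    )
  }
  invisible(counts)
}

# Two alleles at A: no warning
check_allele_counts("A*01:01 A*02:01 B*07:02")

# Three alleles at A: triggers the warning (caught here so the script continues)
tryCatch(
  check_allele_counts("A*01:01 A*02:01 A*03:01"),
  warning = function(w) message("Caught: ", conditionMessage(w))
)
```

A diploid individual carries at most two alleles per locus, so a third allele usually points to a data entry problem in the typing string.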

Examples

typing <- "A1 A2 B7 B8 Cw3 DQ5 DQ8 DR4 DR11 DR52 DR53"
extract_alleles_str(typing, loci = "A")
#> A_1 A_2 
#> "1" "2" 
extract_alleles_str(typing)
#>    A_1    A_2    B_1    B_2    C_1    C_2 DPA1_1 DPA1_2 DPB1_1 DPB1_2 DQA1_1 
#>    "1"    "2"    "7"    "8"    "3"     NA     NA     NA     NA     NA     NA 
#> DQA1_2 DQB1_1 DQB1_2 DRB1_1 DRB1_2 DRB._1 DRB._2 
#>     NA    "5"    "8"    "4"   "11"   "52"   "53" 

df <- tidyr::tibble(typing = typing)
extract_alleles_df(df, typing, loci = c("A", "B", "C"))
#> Joining with `by = join_by(typing)`
#> Joining with `by = join_by(typing)`
#> # A tibble: 1 × 7
#>   typing                                     A_1   A_2   B_1   B_2   C_1   C_2  
#>   <chr>                                      <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 A1 A2 B7 B8 Cw3 DQ5 DQ8 DR4 DR11 DR52 DR53 1     2     7     8     3     ""   

# Can also handle newer nomenclature
extract_alleles_str("DQB1*03:01 DQB1*05:01 DRB1*04:AMR",
  loci = c("DRB1", "DQB1")
)
#>   DRB1_1   DRB1_2   DQB1_1   DQB1_2 
#> "04:AMR"       NA  "03:01"  "05:01"