validate_allele() takes a character vector (or a single string) of HLA alleles and returns TRUE for each allele that is well-formed, and FALSE for each one that isn't.
Value
A Boolean (logical) vector with the same length as allele, with TRUE or FALSE for each element.
Details
N.B. This function does not test whether an allele actually exists (e.g. whether it occurs in the most recent version of the IPD-IMGT/HLA database), but only whether its string representation conforms to certain standards. An allele can be well-formed but not exist (e.g. "A*99:01:01"), or can exist but not be well-formed (e.g. "HLA-A**02;01").
The following are explicitly considered valid HLAs:
alleles belonging to loci A, B, C, DRB, DQA/DQB, DPA/DPB (which is not an exhaustive list of class I or class II HLAs, but simply the ones that are often typed in the context of transplantation research/matching)
serological/antigen notation, such as "A2", "DP-0201"
XX codes, e.g. "A*02:XX"
prefixing an allele with "HLA-" is allowed
ambiguous alleles, such as "C*01:02/C*01:03/C*01:04/C*01:05/C*01:06"
Multiple Allele Codes, e.g. "DRB1*07:GC" (v3) or "DPB1*04BDVU" (v2)
P groups, G groups, and expression-related suffixes (N/L/S/C/A/Q)
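The difference between "well-formed" and "existing" can be illustrated with a minimal base R sketch. This is not how validate_allele() is implemented. The pattern toy_pattern, the helper is_well_formed_toy(), and the lookup vector toy_known_alleles are all hypothetical names introduced here for illustration. The pattern is an assumption that covers only plain molecular notation (optional "HLA-" prefix, XX codes, two to four numeric fields, optional expression suffix) and deliberately ignores serological notation, ambiguities, Multiple Allele Codes, and P/G groups.

```r
# Toy pattern (an assumption, NOT the package's actual rule): optional "HLA-"
# prefix, a locus, then either an XX code or two to four numeric fields with
# an optional expression suffix.
toy_pattern <- paste0(
  "^(HLA-)?(A|B|C|DRB[1-9]|DQA1|DQB1|DPA1|DPB1)\\*",
  "[0-9]{2,3}(:XX|(:[0-9]{2,3}){1,3}[NLSCAQ]?)$"
)

# Hypothetical helper: TRUE where the string matches the toy pattern
is_well_formed_toy <- function(allele) grepl(toy_pattern, allele)

# Hypothetical stand-in for a lookup against a reference database;
# a real check would query an actual allele list instead
toy_known_alleles <- c("A*01:01:01", "A*02:01:01")

candidates <- c("A*02:01:01", "A*99:01:01", "HLA-A**02;01")
toy_result <- data.frame(
  allele      = candidates,
  well_formed = is_well_formed_toy(candidates), # format check only
  listed      = candidates %in% toy_known_alleles # existence check only
)
print(toy_result)
```

In this sketch "A*99:01:01" matches the pattern but is absent from the toy lookup, while "HLA-A**02;01" fails the pattern outright. The format check and the existence check are independent questions, and validate_allele() only answers the first.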
Examples
validate_allele("A2")
#> [1] TRUE
validate_allele("A*99:01:01") # well-formed but non-existing
#> [1] TRUE
validate_allele("HLA-A**02;01") # existing but not well-formed
#> [1] FALSE
# also works with character vectors, or in a data frame
allele_vec <- c("A2", "A*01:AABJE", "A*24:02:01:02L", "not-an-HLA")
validate_allele(allele_vec)
#> [1]  TRUE  TRUE  TRUE FALSE
df <- tidyr::tibble(alleles = allele_vec)
dplyr::mutate(df, alleles_check = validate_allele(alleles))
#> # A tibble: 4 × 2
#>   alleles        alleles_check
#>   <chr>          <lgl>
#> 1 A2             TRUE
#> 2 A*01:AABJE     TRUE
#> 3 A*24:02:01:02L TRUE
#> 4 not-an-HLA     FALSE