Extract SNPs from abstracts in a data frame.

extract_snp(
  df,
  pattern = snp_pattern,
  col.abstract = Abstract,
  indicate = FALSE,
  discard = FALSE
)

Arguments

df

Data frame containing abstracts.

pattern

String. Regex pattern to identify SNPs.

col.abstract

Symbol. Column containing abstracts.

indicate

Boolean. If indicate = TRUE, add a column named "SNP_present" that states whether each abstract contains a SNP.

discard

Boolean. If discard = TRUE, only abstracts containing a SNP are kept.
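
As a rough illustration of what a regex in the pattern argument does, the sketch below finds SNP identifiers in a few made-up abstracts using base R only. The regex "rs[0-9]+" (dbSNP-style rs identifiers) is an assumption chosen for illustration; the actual value of snp_pattern is not shown here and may differ. The example abstracts are invented.

```r
# Hypothetical stand-in for snp_pattern: matches dbSNP-style ids such as "rs2910164".
# This is an assumption; the package's real default pattern may differ.
snp_regex <- "rs[0-9]+"

# Invented example abstracts.
abstracts <- c(
  "The variant rs1234 in the miR-146a gene was associated with risk.",
  "No polymorphism was investigated in this study.",
  "Both rs2910164 and rs11614913 were genotyped."
)

# gregexpr() locates every match per string; regmatches() pulls out the
# matched text and returns one character vector per abstract.
matches <- regmatches(abstracts, gregexpr(snp_regex, abstracts))

# Abstracts without a match yield an empty character vector.
has_snp <- lengths(matches) > 0
```

A pattern this simple is case-sensitive and will also match "rs" followed by digits inside longer words, so a stricter regex may be preferable for real abstracts.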

Value

Data frame. If discard = FALSE, return the data frame with an additional column for SNPs. If discard = TRUE, return only abstracts containing SNPs.

Details

Extract SNPs from abstracts in a data frame. The extracted SNPs are added to the data frame in a separate column. An optional column can additionally indicate whether an abstract contains any SNP.
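
The behaviour described above can be approximated in base R. The following is a minimal sketch, not the package's implementation: the function name sketch_extract_snp, the output column name "SNPs", the "Yes"/"No" values in "SNP_present", the comma-separated format, and the default regex are all assumptions made for illustration.

```r
# Sketch only: mimics the documented behaviour of indicate and discard.
# Names and output format are assumptions, not the package's actual code.
sketch_extract_snp <- function(df,
                               pattern = "rs[0-9]+",  # hypothetical default
                               indicate = FALSE,
                               discard = FALSE) {
  # Find all pattern matches in each abstract.
  hits <- regmatches(df$Abstract, gregexpr(pattern, df$Abstract))

  # Collapse the unique matches per abstract into one string (assumed format).
  df$SNPs <- vapply(
    hits,
    function(x) paste(unique(x), collapse = ", "),
    character(1)
  )
  # Abstracts without a match get NA instead of an empty string.
  df$SNPs[df$SNPs == ""] <- NA_character_

  # Optional column stating whether a SNP was found ("Yes"/"No" is assumed).
  if (indicate) {
    df$SNP_present <- ifelse(is.na(df$SNPs), "No", "Yes")
  }

  # Optionally keep only abstracts that contain a SNP.
  if (discard) {
    df <- df[!is.na(df$SNPs), , drop = FALSE]
  }
  df
}

# Invented example data.
abstracts_df <- data.frame(
  PMID = c(1, 2, 3),
  Abstract = c(
    "The variant rs1234 was associated with risk.",
    "No polymorphism was investigated in this study.",
    "Both rs2910164 and rs11614913 were genotyped."
  ),
  stringsAsFactors = FALSE
)

kept <- sketch_extract_snp(abstracts_df, indicate = TRUE, discard = TRUE)
```

With discard = TRUE the second abstract is dropped because it contains no match; with discard = FALSE it would be kept with NA in the SNP column.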

See also