extract_mir_df.Rd
Extract miRNA names from abstracts in a data frame.
extract_mir_df(df, threshold = 1, col.abstract = Abstract, extract_letters = FALSE)
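Argument | Description |
---|---|
df | Data frame containing abstracts. |
threshold | Integer. Minimum number of times a miRNA must be mentioned in an abstract to be extracted. |
col.abstract | Symbol. Column containing abstracts. |
extract_letters | Boolean. If FALSE, only the miRNA stem is extracted, e.g. miR-23. If TRUE, the trailing letter is extracted as well, e.g. miR-23a. |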
Data frame with miRNA names extracted from abstracts.
Extract miRNA names from abstracts in a data frame. miRNA names can
either be extracted with their stem only, e.g. miR-23, or with their trailing
letter, e.g. miR-23a. miRNA names are adapted to the most recent miRBase
version (e.g. miR-97, miR-102, miR-180(a/b) become miR-30a, miR-29a,
and miR-172(a/b), respectively). The threshold argument sets how often a
miRNA must be mentioned in an abstract to be extracted. Abstracts that do
not contain any miRNA names are silently dropped.
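The effect of the threshold can be sketched in base R. This is an illustration only, not the package's implementation: the pattern `miR-[0-9]+`, the helper names `count_mirnas()` and `extract_with_threshold()`, the example data, and the `PMID` and `miRNA` columns are all assumptions made for this sketch. The sketch also assumes that an abstract is dropped when none of its miRNAs reaches the threshold.

```r
# Illustration only: a simplified stand-in, not how extract_mir_df() works internally.
abstracts <- data.frame(
  PMID = c(1, 2, 3),
  Abstract = c(
    "miR-21 is upregulated; miR-21 targets PTEN, while miR-155 is unchanged.",
    "miR-155 was detected once.",
    "No microRNA is named in this abstract."
  ),
  stringsAsFactors = FALSE
)

# Assumed, simplified pattern for a miRNA stem such as "miR-21".
stem_pattern <- "miR-[0-9]+"

# Count how often each miRNA stem occurs in a single abstract.
count_mirnas <- function(text, pattern = stem_pattern) {
  hits <- regmatches(text, gregexpr(pattern, text))[[1]]
  table(hits)
}

# Keep only miRNAs mentioned at least `threshold` times per abstract.
extract_with_threshold <- function(df, threshold = 1) {
  rows <- lapply(seq_len(nrow(df)), function(i) {
    counts <- count_mirnas(df$Abstract[i])
    kept <- names(counts)[counts >= threshold]
    if (length(kept) == 0) return(NULL)  # nothing kept: abstract is dropped
    data.frame(PMID = df$PMID[i], miRNA = kept, stringsAsFactors = FALSE)
  })
  rows <- Filter(Negate(is.null), rows)
  do.call(rbind, rows)
}

res1 <- extract_with_threshold(abstracts, threshold = 1)
res2 <- extract_with_threshold(abstracts, threshold = 2)
```

With `threshold = 1`, the sketch keeps both miRNAs from the first abstract and the single miRNA from the second, while the third abstract disappears from the result. With `threshold = 2`, only miR-21 from the first abstract remains.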
As many abstracts do not adhere to the miRNA nomenclature,
it is recommended to extract only the miRNA stem with
extract_letters = FALSE.
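Why the stem is the safer choice can be shown with a small base R sketch. The two regular expressions and the example sentence are assumptions chosen for illustration; they are not the patterns the package uses.

```r
# Illustration only: assumed patterns, not the package's own.
text <- "miR-23a and miR-23b were elevated, whereas miR-23 was mentioned without a letter."

# Stem only: every mention collapses to the same name.
stem_only <- regmatches(text, gregexpr("miR-[0-9]+", text))[[1]]

# Stem plus optional trailing letter: inconsistent naming yields separate names.
with_letter <- regmatches(text, gregexpr("miR-[0-9]+[a-z]?", text))[[1]]

unique(stem_only)
unique(with_letter)
```

In this sketch the stem-only pattern yields a single name for all three mentions, whereas the pattern with the trailing letter yields three distinct names, so counts for the same miRNA family are split when authors name it inconsistently.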
Other extract functions: extract_mir_string(), extract_snp()