Get miRNA names from a data frame. These miRNA names can either be the most frequent ones, or the ones exceeding a threshold.

get_mir(
  df,
  top = NULL,
  threshold = NULL,
  topic = NULL,
  col.mir = miRNA,
  col.pmid = PMID,
  col.topic = Topic
)

Arguments

df

Data frame containing miRNA names. If threshold is set, df must also contain PubMed-IDs. If topic is set, df must also contain topic names.

top

Integer. Optional. Specifies number of most frequent miRNA names to return. If neither top nor threshold is set, top is automatically set to 5.

threshold

Integer or float. Optional. If threshold >= 1, return miRNA names mentioned in at least threshold abstracts. If threshold is between 0 and 1, return miRNA names mentioned in at least threshold abstracts of all abstracts in df.

topic

String. Optional. Character vector specifying which topics to obtain miRNA names from.

col.mir

Symbol. Column containing miRNA names.

col.pmid

Symbol. Column containing PubMed-IDs.

col.topic

Symbol. Column containing topic names.

Value

Character vector containing miRNA names.

Details

Get miRNA names from a data frame. These miRNA names can either be the most frequent ones, or the ones exceeding a threshold. Furthermore, if the data frame contains abstracts of different topics, only the miRNA names of specific topics can be obtained by setting the topic argument.

  • To get the most frequent miRNA names, set the top argument. top determines how many most frequent miRNA names are returned, according to their rank. Ties among the most frequently mentioned miRNAs are treated as the same rank, e.g. if miR-126, miR-34, and miR-29 were all mentioned the most often with the same frequency, they would all be returned by specifying top = 1, top = 2, and top = 3.

  • To get the miRNA names exceeding a threshold, set the threshold argument. threshold can either be an absolute value, e.g. 3, or a float between 0 and 1, e.g. 0.2. If threshold is an absolute value, get_mir() returns only the miRNA names mentioned in at least threshold abstracts. If threshold is a float between 0 and 1, get_mir() returns only miRNA names mentioned in at least threshold abstracts of all abstracts. threshold requires the data frame to have a column with PubMed IDs.

If neither top nor threshold is set, top is automatically set to 5.

See also