Get Gene Info
get_gene_info.Rd
Convenience function that returns information about a gene or a set of genes. It is called internally by [BioMaesteR::get_gene_region].
Arguments
- these_genes
Required argument. The gene or genes of interest, given as Hugo symbols or Ensembl gene IDs.
- projection
The desired projection. Default is hg38; grch37 is also accepted (see Example 3).
- raw
Default is FALSE. Set to TRUE to keep all columns.
Details
Give the function a gene or a set of genes (as a character vector) and, optionally, a projection (hg38 is the default). It returns gene information based on the bundled data. By default the function runs with `raw = FALSE`, which returns a subset of columns. To get everything back (i.e., all available columns), set `raw` to `TRUE`.
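The effect of the `raw` toggle can be illustrated with a small, self-contained sketch. This is not the package's implementation: `toy_genes` and `toy_gene_info()` are hypothetical stand-ins, and the table holds only a few columns copied from the example output on this page. The sketch assumes that the default subset drops coordinate columns such as `chrom`, `start` and `end`, which matches what the examples show.

```r
# Hypothetical stand-in for the bundled gene table (hg38 values taken from Example 4).
toy_genes <- data.frame(
  chrom = c("chr18", "chr8"),
  start = c(63123346, 127735434),
  end = c(63320128, 127742951),
  ensembl_gene_id = c("ENSG00000171791", "ENSG00000136997"),
  hugo_symbol = c("BCL2", "MYC"),
  gene_biotype = c("protein_coding", "protein_coding"),
  stringsAsFactors = FALSE
)

# Hypothetical helper that mimics the documented behaviour of `raw`.
toy_gene_info <- function(these_genes, raw = FALSE, gene_table = toy_genes) {
  # Match on either identifier column so Hugo symbols and Ensembl IDs both work.
  keep <- gene_table$hugo_symbol %in% these_genes |
    gene_table$ensembl_gene_id %in% these_genes
  hits <- gene_table[keep, , drop = FALSE]
  if (raw) {
    # raw = TRUE: return every available column.
    return(hits)
  }
  # raw = FALSE: return only the descriptive subset (assumed column choice).
  hits[, c("hugo_symbol", "ensembl_gene_id", "gene_biotype"), drop = FALSE]
}

subset_cols <- names(toy_gene_info("MYC"))
all_cols <- names(toy_gene_info("ENSG00000136997", raw = TRUE))

# Columns that only appear when raw = TRUE.
setdiff(all_cols, subset_cols)
```

With the real function, the same `setdiff()` comparison on the column names of a `raw = FALSE` and a `raw = TRUE` result should show which columns the default call leaves out.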
Examples
#Example 1 - Query one gene (in Hugo format) and with default parameters.
get_gene_info(these_genes = "MYC")
#> hugo_symbol ensembl_gene_id type gene_biotype source gene_version
#> 1 MYC ENSG00000136997 gene protein_coding ensembl_havana 22
#> gene_source tag ccds_id score transcript_id transcript_version
#> 1 ensembl_havana <NA> <NA> NA <NA> <NA>
#> transcript_name transcript_source transcript_biotype exon_number exon_id
#> 1 <NA> <NA> <NA> <NA> <NA>
#> protein_id protein_version
#> 1 <NA> <NA>
#Example 2 - Same as example 1 but MYC is here specified as Ensembl ID.
get_gene_info(these_genes = "ENSG00000136997")
#> ensembl_gene_id hugo_symbol type gene_biotype source gene_version
#> 1 ENSG00000136997 MYC gene protein_coding ensembl_havana 22
#> gene_source tag ccds_id score transcript_id transcript_version
#> 1 ensembl_havana <NA> <NA> NA <NA> <NA>
#> transcript_name transcript_source transcript_biotype exon_number exon_id
#> 1 <NA> <NA> <NA> <NA> <NA>
#> protein_id protein_version
#> 1 <NA> <NA>
#Example 3 - Request multiple genes with non-default parameters
get_gene_info(these_genes = c("MYC", "BCL2"),
              projection = "grch37")
#> hugo_symbol ensembl_gene_id type gene_biotype source gene_version
#> 1 MYC ENSG00000136997 gene protein_coding ensembl_havana 10
#> 2 BCL2 ENSG00000171791 gene protein_coding ensembl_havana 10
#> gene_source tag ccds_id score transcript_id transcript_version
#> 1 ensembl_havana <NA> <NA> NA <NA> <NA>
#> 2 ensembl_havana <NA> <NA> NA <NA> <NA>
#> transcript_name transcript_source transcript_biotype exon_number exon_id
#> 1 <NA> <NA> <NA> <NA> <NA>
#> 2 <NA> <NA> <NA> <NA> <NA>
#> protein_id protein_version
#> 1 <NA> <NA>
#> 2 <NA> <NA>
#Example 4 - Request multiple Ensembl IDs and return all columns.
get_gene_info(these_genes = c("ENSG00000136997", "ENSG00000171791"),
              raw = TRUE)
#> chrom start end width strand type tag ccds_id ensembl_gene_id
#> 1 chr18 63123346 63320128 196783 - gene <NA> <NA> ENSG00000171791
#> 2 chr8 127735434 127742951 7518 + gene <NA> <NA> ENSG00000136997
#> hugo_symbol source score gene_version gene_source gene_biotype
#> 1 BCL2 ensembl_havana NA 14 ensembl_havana protein_coding
#> 2 MYC ensembl_havana NA 22 ensembl_havana protein_coding
#> transcript_id transcript_version transcript_name transcript_source
#> 1 <NA> <NA> <NA> <NA>
#> 2 <NA> <NA> <NA> <NA>
#> transcript_biotype exon_number exon_id exon_version protein_id
#> 1 <NA> <NA> <NA> <NA> <NA>
#> 2 <NA> <NA> <NA> <NA> <NA>
#> protein_version input_format
#> 1 <NA> Ensembl
#> 2 <NA> Ensembl