
Return genes residing in defined region(s).

Usage

region_ranger(
  these_regions = NULL,
  qchrom = NULL,
  qstart = NULL,
  qend = NULL,
  projection = "hg38",
  raw = FALSE
)

Arguments

these_regions

The region(s) to be queried. Either a data frame with the columns chrom, start, and end, or a string (or character vector of strings) in the format chr:start-end.

qchrom

Query chromosome (prefixed or un-prefixed). Required if `these_regions` is not provided.

qstart

Query start position. Required if `these_regions` is not provided.

qend

Query end position. Required if `these_regions` is not provided.

projection

The projection to return coordinates for. Available projections are hg38 and grch37. Default is hg38.

raw

Set to TRUE to return all columns. Default is FALSE.

Value

A data frame of genes residing in the specified region(s).

Details

Query one or more regions and return all genes residing inside them. Regions can be supplied in two ways. First, `these_regions` accepts either a data frame containing the columns chrom, start, and end, or a string in "region" format (i.e. chr:start-end); a character vector of such strings can be used to query multiple regions. Second, a region can be specified with the individual parameters `qchrom` (string), `qstart` (string or integer), and `qend` (string or integer); each of these also accepts a vector to specify multiple regions. The function also handles chromosome prefixes in the returned object based on the selected `projection`.
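
The data frame form of `these_regions` is described above but not covered in the Examples. The following is a minimal sketch, in base R only, of building such a data frame from "region" strings. `parse_regions()` is a hypothetical helper written for illustration and is not part of the package; the final call to `region_ranger()` is left commented out because it assumes the package is loaded.

```r
# Hypothetical helper (not part of the package): split "chr:start-end"
# strings into the chrom/start/end columns that `these_regions` expects.
parse_regions <- function(regions) {
  # Split each string on ":" or "-", giving three pieces per region
  parts <- strsplit(regions, "[:-]")
  stopifnot(all(lengths(parts) == 3))
  data.frame(
    chrom = vapply(parts, `[`, character(1), 1),
    start = as.numeric(vapply(parts, `[`, character(1), 2)),
    end   = as.numeric(vapply(parts, `[`, character(1), 3)),
    stringsAsFactors = FALSE
  )
}

my_regions <- parse_regions(c("chr8:128747680-128753674",
                              "chr18:60790579-60987361"))
print(my_regions)

# Assuming the package is loaded, the data frame could then be passed on:
# region_ranger(these_regions = my_regions, projection = "grch37")
```

A data frame built by hand with `data.frame(chrom = ..., start = ..., end = ...)` would serve equally well; the helper only saves typing when the regions already exist as strings.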

Examples

#Example 1 - Give the function one region as a string
region_ranger(these_regions = "chr8:127735434-127742951")
#>   region_start region_end type   gene_biotype hugo_symbol ensembl_gene_id chrom
#> 1    127735434  127742951 gene         lncRNA      CASC11 ENSG00000249375  chr8
#> 2    127735434  127742951 gene protein_coding         MYC ENSG00000136997  chr8
#>       start       end width strand
#> 1 127686343 127738987 52645      -
#> 2 127735434 127742951  7518      +

#Example 2 - Give the function multiple regions as a vector of strings
region_ranger(these_regions = c("chr8:128747680-128753674",
              "chr18:60790579-60987361"),
              projection = "grch37")
#>   region_start region_end type   gene_biotype  hugo_symbol ensembl_gene_id
#> 1     60790579   60987361 gene protein_coding         BCL2 ENSG00000171791
#> 2     60790579   60987361 gene sense_intronic RP11-299P2.1 ENSG00000267766
#> 3     60790579   60987361 gene         snoRNA       snoU13 ENSG00000238988
#> 4     60790579   60987361 gene sense_intronic  RP11-28F1.2 ENSG00000267701
#> 5    128747680  128753674 gene protein_coding          MYC ENSG00000136997
#>   chrom     start       end  width strand
#> 1    18  60790579  60987361 196783      -
#> 2    18  60818347  60818553    207      -
#> 3    18  60861822  60861898     77      -
#> 4    18  60981035  60981315    281      -
#> 5     8 128747680 128753674   5995      +

#Example 3 - Individually specify the chromosome, start and end coordinates
region_ranger(qchrom = "chr8",
              qstart = 127735434,
              qend = 127742951)
#>   region_start region_end type   gene_biotype hugo_symbol ensembl_gene_id chrom
#> 1    127735434  127742951 gene         lncRNA      CASC11 ENSG00000249375  chr8
#> 2    127735434  127742951 gene protein_coding         MYC ENSG00000136997  chr8
#>       start       end width strand
#> 1 127686343 127738987 52645      -
#> 2 127735434 127742951  7518      +

#Example 4 - Individually specify multiple regions with the query parameters
region_ranger(qchrom = c("chr8", "chr18"),
              qstart = c(128747680, 60790579),
              qend = c(128753674, 60987361),
              projection = "grch37")
#>   region_start region_end type   gene_biotype  hugo_symbol ensembl_gene_id
#> 1     60790579   60987361 gene protein_coding         BCL2 ENSG00000171791
#> 2     60790579   60987361 gene sense_intronic RP11-299P2.1 ENSG00000267766
#> 3     60790579   60987361 gene         snoRNA       snoU13 ENSG00000238988
#> 4     60790579   60987361 gene sense_intronic  RP11-28F1.2 ENSG00000267701
#> 5    128747680  128753674 gene protein_coding          MYC ENSG00000136997
#>   chrom     start       end  width strand
#> 1    18  60790579  60987361 196783      -
#> 2    18  60818347  60818553    207      -
#> 3    18  60861822  60861898     77      -
#> 4    18  60981035  60981315    281      -
#> 5     8 128747680 128753674   5995      +
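
In the outputs above, the `chrom` column is chr-prefixed for hg38 and un-prefixed for grch37. The sketch below illustrates that convention in base R. `format_chrom()` is a hypothetical name chosen for illustration, and the logic is an assumption inferred from the example outputs, not the package's actual implementation.

```r
# Hypothetical helper (not the package's implementation): format chromosome
# names to match the convention seen in the example outputs.
format_chrom <- function(chrom, projection = "hg38") {
  stopifnot(projection %in% c("hg38", "grch37"))
  bare <- sub("^chr", "", chrom)  # drop any existing "chr" prefix
  # hg38 results are shown chr-prefixed, grch37 results un-prefixed
  if (projection == "hg38") paste0("chr", bare) else bare
}

hg38_names   <- format_chrom(c("chr8", "18"), projection = "hg38")
grch37_names <- format_chrom(c("chr8", "18"), projection = "grch37")
print(hg38_names)    # "chr8" "chr18"
print(grch37_names)  # "8" "18"
```

Stripping the prefix first and then re-adding it only where needed means the input can be prefixed, un-prefixed, or mixed, which matches the `qchrom` description (prefixed or un-prefixed).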