Instructions for Variant Database

The program vdb allows one to search the GISAID dataset of SARS-CoV-2 (hCoV-19) viruses for spike mutation patterns using a natural syntax. The two main types of objects that can be manipulated are groups of isolates (“clusters”) and groups of mutations (“patterns”). Clusters can be obtained by searching for patterns, and patterns can be obtained by examining clusters.


The default cluster to search is the collection of all isolates (“world”). To search for all isolates from the United States, enter “from US” or just “us”. A cluster or pattern can be assigned to a variable for later use. For example, this is a command that defines a variable with all viruses collected in the United States containing both spike mutations L452R and E484K:

a = us w/ L452R E484K


To search instead for viruses containing at least one of the mutations L452R and E484K, put a minimum match count of 1 before the pattern:

b = us w/ 1 L452R E484K
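
The difference between the two searches above can be sketched in Python. This is only an illustrative model, not how vdb is implemented: the function name containing, the dictionary layout, and the isolate names are all hypothetical, and the data is invented.

```python
def containing(cluster, pattern, n=None):
    """Return the isolates of a cluster that match a mutation pattern.

    cluster: dict mapping isolate name -> set of spike mutations (hypothetical model)
    pattern: iterable of mutation strings
    n: if None, every mutation in the pattern must be present;
       otherwise at least n of them must be present.
    """
    pattern = set(pattern)
    required = len(pattern) if n is None else n
    return {
        name: muts
        for name, muts in cluster.items()
        if len(pattern & muts) >= required
    }


# Toy data, invented for illustration only
us = {
    "isolate1": {"L452R", "E484K", "D614G"},
    "isolate2": {"L452R", "D614G"},
    "isolate3": {"D614G"},
}

a = containing(us, ["L452R", "E484K"])     # like: a = us w/ L452R E484K
b = containing(us, ["L452R", "E484K"], 1)  # like: b = us w/ 1 L452R E484K
print(sorted(a))  # ['isolate1']
print(sorted(b))  # ['isolate1', 'isolate2']
```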


Clusters can be filtered by Pango lineage, date, number of mutations, country or US state to define new clusters. Patterns can be obtained either by calculating the consensus pattern of a cluster (using the consensus command) or by listing the most frequent distinct patterns in a cluster (using the patterns command).
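
As a rough Python illustration of these two ways to obtain patterns, the sketch below models a cluster as a list of mutation sets. The majority threshold used for the consensus is an assumption (the exact rule vdb applies is not stated here), and the function names and data are hypothetical.

```python
from collections import Counter


def consensus(cluster):
    """Mutations present in more than half of the isolates (assumed threshold)."""
    counts = Counter(m for muts in cluster for m in muts)
    return {m for m, c in counts.items() if c > len(cluster) / 2}


def patterns(cluster, n=3):
    """The n most frequent distinct mutation patterns, with their counts."""
    counts = Counter(frozenset(muts) for muts in cluster)
    return counts.most_common(n)


# Toy cluster, invented for illustration only
cluster = [
    {"D614G", "L452R"},
    {"D614G", "L452R"},
    {"D614G", "E484K"},
    {"D614G"},
]

print(sorted(consensus(cluster)))  # ['D614G']
top_pattern, top_count = patterns(cluster, 1)[0]
print(sorted(top_pattern), top_count)  # ['D614G', 'L452R'] 2
```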


Below is a quick reference to all of the commands available. Full documentation is here.


Adjusting the display

The font size and number of rows in the terminal display of vdb.live can be quickly adjusted. Enter font followed by a number between 6 and 60 to change the font size. Enter rows followed by a number between 6 and 60 to change the number of rows in the terminal. Changing the font size automatically changes the number of rows.


Notation

cluster = group of viruses

pattern = group of mutations

"world" = all viruses in database

< > = user input

[ ] = optional

n = an integer

→ = result of the command

If no cluster is entered, all viruses will be used (this is the built-in "world" cluster).


Variables

To define a variable for a cluster or pattern: <name> = <cluster or pattern>

To check whether two clusters or patterns are equal: <item1> == <item2>

To count a cluster or pattern in a variable: count <variable name>

The set operations +, -, and * (intersection) can be applied to clusters or patterns.

The result, if any, of the previous command is available in the variable last.
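
The set operations and the equality check behave much like operations on Python sets. The sketch below is only an analogy: reading + as union and - as difference is an assumption (only * is described above as intersection), and the isolate names are invented.

```python
# Toy clusters as sets of isolate names (invented for illustration)
a = {"isolate1", "isolate2", "isolate3"}
b = {"isolate2", "isolate3", "isolate4"}

union = a | b         # assumed meaning of: a + b
difference = a - b    # assumed meaning of: a - b
intersection = a & b  # a * b (intersection)

print(sorted(union))         # ['isolate1', 'isolate2', 'isolate3', 'isolate4']
print(sorted(difference))    # ['isolate1']
print(sorted(intersection))  # ['isolate2', 'isolate3']
print(a == b)                # False, like: a == b
print(len(intersection))     # 2, like: count <variable name>
```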


Filtering commands

<cluster> from <country or state> → cluster

<cluster> containing [<n>] <pattern> → cluster (alias with, w/; matches for ≥ n mutations)

<cluster> not containing [<n>] <pattern> → cluster (alias without, w/o; full pattern)

<cluster> before <date> → cluster

<cluster> after <date> → cluster

<cluster> > or < or # <n> → cluster (filter by # of mutations)

<cluster> named <state_id or EPI_ISL#> → cluster

<cluster> lineage <Pango lineage> → cluster
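
Because every filtering command returns a cluster, filters can be chained. The Python sketch below models that idea with a hypothetical Isolate record and helper functions; it is not vdb's implementation. Treating # as "exactly n mutations" and excluding the boundary date in before/after are assumptions, and the data is invented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Isolate:
    # Hypothetical record layout, for illustration only
    name: str
    country: str
    collected: date
    mutations: frozenset


def from_country(cluster, country):
    return [v for v in cluster if v.country == country]


def before(cluster, day):
    # Boundary date excluded here; vdb's handling of the boundary is not stated
    return [v for v in cluster if v.collected < day]


def after(cluster, day):
    return [v for v in cluster if v.collected > day]


def mutation_count(cluster, op, n):
    """op: '>' more than n, '<' fewer than n, '#' exactly n (assumed meaning)."""
    tests = {">": lambda k: k > n, "<": lambda k: k < n, "#": lambda k: k == n}
    return [v for v in cluster if tests[op](len(v.mutations))]


world = [
    Isolate("isolate1", "US", date(2021, 1, 15), frozenset({"D614G", "L452R"})),
    Isolate("isolate2", "US", date(2021, 3, 2), frozenset({"D614G"})),
    Isolate("isolate3", "Japan", date(2021, 2, 10),
            frozenset({"D614G", "E484K", "N501Y"})),
]

# Each filter returns a new cluster, so filters compose
recent_us = after(from_country(world, "US"), date(2021, 2, 1))
print([v.name for v in recent_us])                      # ['isolate2']
print([v.name for v in mutation_count(world, ">", 1)])  # ['isolate1', 'isolate3']
```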


Commands to find mutation patterns

consensus [for] <cluster or country or state> → pattern

patterns [in] [<n>] <cluster> → pattern (lists n patterns)


Listing commands

list [<n>] <cluster>

[list] countries [for] <cluster>

[list] states [for] <cluster>

[list] lineages [for] <cluster>

[list] trends [for] <cluster>

[list] frequencies [for] <cluster> (alias freq; frequency of individual mutations)

[list] monthly [for] <cluster> [<cluster2>] (number of isolates per month, or as a fraction of the # in cluster2)

[list] weekly [for] <cluster> [<cluster2>] (number of isolates per week, or as a fraction of the # in cluster2)

[list] patterns (lists built-in and user-defined patterns)

[list] clusters (lists built-in and user-defined clusters)

[list] proteins

[list] variants
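
To make the frequencies and monthly listings concrete, here is a small Python sketch of the kind of tally they describe. It is an illustration under assumptions, not vdb's code: a cluster is modeled as a list of (collection date, mutation set) pairs, the function names are hypothetical, and the fraction is assumed to be computed month by month against cluster2.

```python
from collections import Counter
from datetime import date


def frequencies(cluster):
    """Fraction of isolates in the cluster carrying each mutation."""
    counts = Counter(m for _, muts in cluster for m in muts)
    return {m: c / len(cluster) for m, c in counts.items()}


def monthly(cluster, cluster2=None):
    """Isolates per month; if cluster2 is given, as a fraction of its monthly count."""
    per_month = Counter(d.strftime("%Y-%m") for d, _ in cluster)
    if cluster2 is None:
        return dict(per_month)
    base = Counter(d.strftime("%Y-%m") for d, _ in cluster2)
    # Months with no isolates in cluster2 are skipped to avoid dividing by zero
    return {m: per_month[m] / base[m] for m in per_month if base[m]}


# Toy data, invented for illustration only
world = [
    (date(2021, 1, 5), {"D614G"}),
    (date(2021, 1, 20), {"D614G", "L452R"}),
    (date(2021, 2, 3), {"D614G", "L452R"}),
    (date(2021, 2, 17), {"E484K"}),
]
variant = [iso for iso in world if "L452R" in iso[1]]

print(frequencies(world)["D614G"])  # 0.75
print(monthly(variant))             # {'2021-01': 1, '2021-02': 1}
print(monthly(variant, world))      # {'2021-01': 0.5, '2021-02': 0.5}
```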


Other commands

sort <cluster> (by date)

help [<command>] (alias ?)

license

history

char <Pango lineage> (prints characteristics of lineage)

testvdb (runs built-in tests of vdb)

group lineages <lineage names> (defines a lineage group; alias group lineage, lineage group)

lineage groups (lists defined lineage groups)

clear <cluster name> or <lineage group> (clears the definition)

reset

settings

mode

count <cluster name or pattern name>

// [<comment>]

quit


Program settings

debug/debug off

listAccession/listAccession off

listAverageMutations/listAverageMutations off

includeSublineages/includeSublineages off/excludeSublineages

sixel/sixel off

trendGraphs/trendGraphs off

stackGraphs/stackGraphs off

completions/completions off

displayTextWithColor/displayTextWithColor off

minimumPatternsCount = <n>

trendsLineageCount = <n>