PhyloD is a suite of statistical tools for the analysis of viral escape data. The underlying motivation of these tools is that the natural genetic variation in viral sequences provides the evolutionary substrate for rapid adaptation. As viruses pass through different environments (e.g., different hosts), they are exposed to different selection pressures. By parameterizing evolutionary models with features that capture distinct environments, we can test different hypotheses about what is attacking viruses in these environments, and how the virus genetically adapts to these attacks.
Under this umbrella, we at Microsoft Research have developed a suite of tools, which we provide as a service to the scientific community at this website.
These tools have been used in numerous publications to generate and test various scientific hypotheses (see our publications page).
The input data varies, but two common types of data are found in these tools:
Virus sequence alignments
Human Leukocyte Antigen class I (HLA-I) types, which are a major source of selection pressure on human viruses.
Both of these data types contain potentially sensitive information. Please see the data retention and security details below.
We assume that you, the user, have the appropriate permissions from the subjects to use their data with these tools.
While we make every effort to safeguard your data, please use common sense and don’t include subject names or other sensitive information in the subject IDs.
For most tools, it is critical that your viral sequences be aligned to the reference alignments used to train our models. For this reason, we will guide you through the process of uploading and validating your alignments. The workflow is as follows:
1) Create a new sequence data set, which consists of one or more alignments, each of which corresponds to a different protein.
2) For each protein, upload your sequences (DNA only). If they are already aligned, we will compare the amino acid consensus of your alignment to that of our training alignment, highlighting differences. If you are satisfied, click “Accept”. Otherwise, we can tweak your alignment by adding or removing columns so as to minimize the edit distance between your consensus and ours, or we can realign your sequences using the HMMer tool, trained on our reference alignments. If you upload unaligned sequences, you will need to run the HMMer tool to align them.
3) Once you save the sequence data set, we will keep it associated with your account and retain it according to our data retention policy (see below). This data set is now available for you (and you alone) to use with any of our tools.
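To make the consensus comparison in the workflow more concrete, here is a minimal Python sketch of the idea. It is not the site's actual implementation, which is not described here. The function names (translate, consensus, edit_distance) are hypothetical, the standard genetic code is assumed for translation, and "edit distance" is assumed to mean the Levenshtein distance.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an aligned DNA string codon by codon.

    A fully gapped codon ('---') becomes '-'; any other codon that is
    not in the table (partial gaps, ambiguity codes) becomes 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        protein.append("-" if codon == "---" else CODON_TABLE.get(codon, "X"))
    return "".join(protein)


def consensus(aligned_proteins):
    """Return the most common residue in each alignment column."""
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*aligned_proteins)
    )


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]


# Hypothetical toy data: three aligned DNA sequences and a reference consensus.
user_dna = ["ATGAAACCC", "ATGAAACCC", "ATGAGACCC"]
user_consensus = consensus([translate(s) for s in user_dna])
reference_consensus = "MKTP"
distance = edit_distance(user_consensus, reference_consensus)
```

In this toy example the user consensus is one column short of the reference, so the distance is 1. A small distance suggests that adding or removing a few columns may be enough, while a large one suggests that realignment is the better option.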
All data are encrypted in transit and at rest, using industry standard encryption.
All data from queries and results are stored for up to 7 days for anonymous accounts, and up to one year for registered users.
We will only use your data to run the query you specify. If an error occurs when running your query, we may rerun it to reproduce the bug and aid in fixing the problem.
As noted above, it is critical that your sequence alignments match our training alignments as closely as possible. As such, you must upload, possibly fix, and validate each sequence alignment. User accounts allow us to store these alignments so that you can reuse them later. User accounts also allow you to access prior results.
Nope. You can use an anonymous account, which is associated with a cookie placed in your browser. That cookie will last for 7 days after your last activity on our site, after which the anonymous account will be deleted. All data associated with the account will be kept for 7 days, then deleted.
Yes, we store a thumbprint linking your browser to your account for 7 days. This is the only way you can access an anonymous account. For a registered account, we will keep you logged in for 7 days unless you explicitly log out.
This allows you to access your sequence alignments from any computer. It also means we’ll keep your data and results for up to a year (unless you delete them). In the future, we may use your associated email address to send you notifications when long-running jobs finish (if you so request).