Draft:Staged tree
Draft article not currently submitted for review.
This is a draft Articles for creation (AfC) submission. It is not currently pending review. While there are no deadlines, abandoned drafts may be deleted after six months. To edit the draft click on the "Edit" tab at the top of the window. To be accepted, a draft should:
It is strongly discouraged to write about yourself, your business or employer. If you do so, you must declare it. Where to get help
How to improve a draft
You can also browse Wikipedia:Featured articles and Wikipedia:Good articles to find examples of Wikipedia's best writing on topics similar to your proposed article. Improving your odds of a speedy review To improve your odds of a faster review, tag your draft with relevant WikiProject tags using the button below. This will let reviewers know a new draft has been submitted in their area of interest. For instance, if you wrote about a female astronomer, you would want to add the Biography, Astronomy, and Women scientists tags. Editor resources
Last edited by JackStorrorCarter (talk | contribs) 0 seconds ago. (Update) |
A staged tree is a probabilistic model for a process consisting of a sequence of discrete valued events described by a tree diagram (also known as an event tree or probability tree). A staged tree places equality relationships on the conditional probability distributions of an event tree. These equality relationships are represented visually by the colour of the vertices. A staged tree also has a more compact visual representation called the chain event graph.
Event tree
[edit]Construction of a staged tree begins with drawing the event tree that describes all possible evolutions of the process. An event tree is an arborescence - a directed rooted tree graph with all edges pointing away from the root vertex. The root represents the beginning of the process while the leaves of the tree represent all possible end points of the process. All non-leaf vertices, including the root, are called situations and represent intermediate points in the process evolution. The edges in the tree are labelled with an identifier for the event it corresponds to. Each edge is also associated to a transition probability - the probability of this event occurring, conditional on the process reaching that situation. The transition probabilities are enough to uniquely determine the full probability distribution of the process.
An event tree is called stratified if it can be related to a sequence of discrete random variables (one for each layer of the tree) and all possible combinations of these variables have non-zero probability. Non-stratified trees instead allow structural zeros - when certain combinations of the variables occur with probability zero - or for the possible values of future events to depend on the past events.
Example
Consider a study comparing a new treatment for an illness to the standard treatment. Patients are first assessed to have either mild or severe symptoms. They are then randomly assigned either the new treatment or the standard treatment. After one week the patients symptoms are assessed to have either improved or worsened. This process can be represented by the stratified event tree shown in...
Staged tree
[edit]A staged tree is a probabilistic model that restricts the space of probability distributions in an event tree by specifying equality relationships between the conditional probability distributions of the situations. Specifically, if two situations are in the same stage then they share a common transition probability distribution. Most generally, any two situations with the same number of outgoing edges can be in the same stage. However, the definition of a stage is often restricted so that only situations with the same outgoing edge labels can be in the same stage, with the equality relationship of the transition probabilities matching the edge labels. A staged tree is defined by the event tree and the sets of situations that are in the same stage - the staging.
A staged tree is represented graphically by colouring the vertices of the event tree according to their stage membership. That is, all stages are assigned a colour and all situations in a stage are given this colour in the tree. Usually singleton stages - stages with only one situation - remain uncoloured.
Example
Continuing the above example, if the treatment has been properly randomised, then the probability of being assigned the new treatment should not depend on the symptom severity. In other words, the treatment and symptoms are independent. This is represented in a staged tree by the stage . Suppose also that the treatment choice has no effect on the outcome when symptoms are mild. This is represented by the stage . This staged tree is represented visually in...
Chain event graph
[edit]A chain event graph is a more compact visual representation of a staged tree model. Two situations are in the same position if they are in the same stage and the future unfolding of the process is identical probabilistically from both situations. In other words, the subtrees beginning at the two situations are identical both in terms of topology and stage membership/colouring. A chain event graph is constructed by merging all vertices that are in the same position, as well as merging all leaf vertices into a single sink vertex.
Example
Continuing the above example, situations and are in the same position. However, and are not in the same position because the subsequent situations do not share stage membership. The chain event graph can be seen in...
Probability estimation
[edit]When estimating transition probabilities in an event tree, only samples that reach the relevant situation can be used. However, staged trees can enjoy a slight advantage because samples can be shared across any situations in the same stage since they are all used to estimate the same probability distribution. As long as the assumed staging is true, this results in larger sample sizes and therefore improved estimation of transition probabilities.
Model selection
[edit]When a staged tree is not known a priori, it can instead be learned from data. This is commonly done by combining a scoring function with a model search algorithm. The model search algorithm searches the set of possible models for the one with highest score. Standard choices for scoring function are the posterior model probability and the Bayesian information criterion, while choices for model search algorithm include agglomerative hierarchical clustering and hill climbing.
Relation to Bayesian networks
[edit]Stratified staged trees can be considered an extension of discrete Bayesian networks - every discrete Bayesian network can be represented by a stratified staged tree. However, staged trees are also able to specify context specific properties which cannot be represented by a standard Bayesian network. In this way, the class of staged tree models is broader than that of the standard Bayesian network. Additionally, non-stratified staged trees can be used for processes that cannot be modelled by standard Bayesian networks.
Software
[edit]Software for model selection and probability estimation using staged trees is available in R with the package stagedtrees and in Python with the package cegpy.