University of Johannesburg (UJ) researchers Professor Tshilidzi Marwala and Dr Pramod Kumar Parida have developed a multivariate additive noise model (MANM), which they say serves as a robust model for general causality that identifies multiple causal connections without time-sequence data.
The model creates significant opportunities to analyse complex phenomena in areas such as economics, disease outbreaks, climate change and conservation, according to an article published in the journal Neural Networks earlier this month.
Marwala, who is a professor of artificial intelligence, and lead author Parida also tested the model on real-world international datasets, including forest fire, fish growth and housing data.
Causality is determined by how information flows from one event to another. It is the information flow that shows there is a causal link – for example, that event A caused event B. However, if the time-sequenced information flow from event A to event B is missing, general causality is required.
“The model can identify multiple, hierarchical causal factors, even if data with time sequencing is not available,” explains Marwala.
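To give a sense of how a causal direction can be read from data that carries no time ordering, here is a minimal sketch of the simpler two-variable additive noise idea. It is not the researchers' MANM. It assumes a linear relationship with uniform (non-Gaussian) noise, and the function names and the crude dependence score (correlation of squared values) are illustrative assumptions standing in for a proper statistical independence test.

```python
import random

def mean(values):
    return sum(values) / len(values)

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

def residuals(cause, effect):
    """Residuals of a least-squares line predicting `effect` from `cause`."""
    mc, me = mean(cause), mean(effect)
    slope = (sum((c - mc) * (e - me) for c, e in zip(cause, effect))
             / sum((c - mc) ** 2 for c in cause))
    return [(e - me) - slope * (c - mc) for c, e in zip(cause, effect)]

def dependence_score(cause, effect):
    """Crude dependence measure (an assumption of this sketch):
    how strongly the squared residuals track the squared, centred cause.
    Near zero suggests the leftover noise is independent of the cause."""
    res = residuals(cause, effect)
    mc = mean(cause)
    return abs(correlation([(c - mc) ** 2 for c in cause],
                           [r * r for r in res]))

def infer_direction(x, y):
    """Prefer the direction whose residuals look more independent of the input."""
    if dependence_score(x, y) < dependence_score(y, x):
        return "x -> y"
    return "y -> x"

# Synthetic data in which x really does drive y: y = x + noise.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(2000)]
y = [xi + rng.uniform(-1, 1) for xi in x]

# Shuffle the observations so their order carries no time information.
pairs = list(zip(x, y))
rng.shuffle(pairs)
x, y = (list(column) for column in zip(*pairs))

print(infer_direction(x, y))
```

In the true direction, the leftover noise is unrelated to the cause, whereas fitting the relationship backwards leaves residuals that still depend on the input. That asymmetry is what allows a direction to be inferred without timestamps.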
Simple causality theory is sufficient to find out why a household is struggling, as it is possible to determine the causal connections between income, spending, savings, investments and debt.
However, determining causality at societal level is more challenging. Simple causality theory does not work at this scale, because the financial transaction data for households in a city or region will be incomplete, and date and time information will be missing from some of the data.
“With the MANM, one can identify multiple major driving factors causing the household debt. In the model, we call these factors the independent parent causal connections. One can also determine which causal connections are more dominant than the others.”
A second analysis of the data can reveal minor driving factors, which the researchers call the independent child causal connections. In this way, it is possible to identify a possible hierarchy of causal connections, says Parida.
Research coauthor Professor Snehashish Chakraverty, a member of the Applied Mathematics Group in the Department of Mathematics at the National Institute of Technology Rourkela, in India, says that the MANM provides significantly better causal analysis of real-world datasets than industry-standard models currently in use.
“Previous models developed by researchers worked with a maximum of two causal factors, that is, they were bivariate models, which simply could not find multiple-feature dependency criteria.”
Where an existing dataset is available, the MANM makes it possible to identify multiple, multinodal causal structures within the set. As an example, the model can identify the multiple causes of persistent household debt for low-, middle- and high-income households in a region.
The MANM is based on directed acyclic graphs. The model can estimate causal directions in complex feature sets, with no missing or wrong directions. Directed acyclic graphs are a key reason the MANM outperforms models based on independent component analysis, which includes ‘greedy’ directed acyclic graph searches, he explains.
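For readers unfamiliar with directed acyclic graphs, the following sketch shows what such a structure looks like in practice. It uses Python's standard `graphlib` module, and the household variables and the links between them are hypothetical, chosen only for illustration. They are not results from the study, and the code does not implement the MANM itself.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical causal structure: each variable maps to its direct causes.
parents = {
    "income": set(),
    "spending": {"income"},
    "savings": {"income", "spending"},
    "debt": {"income", "spending"},
}

def causal_order(parents):
    """List the variables so that every cause comes before its effects.
    Raises CycleError if the links do not form a directed acyclic graph."""
    return list(TopologicalSorter(parents).static_order())

def is_acyclic(parents):
    """True if the causal links contain no loops."""
    try:
        causal_order(parents)
        return True
    except CycleError:
        return False

order = causal_order(parents)
print(order)                 # causes always appear before their effects
print(is_acyclic(parents))   # the structure above has no loops

# Two variables that "cause" each other form a loop, so this is not a DAG.
print(is_acyclic({"a": {"b"}, "b": {"a"}}))
```

Because the graph has no loops, the variables can always be arranged from root causes down to final effects, which is what makes a hierarchy of causal connections well defined.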
“Another key feature is the proposed causal influence factor for the successful discovery of causal directions in the multivariate system. The causal influence factor score provides a reliable indicator of the quality of the causal inference, which allows for most of the missing or wrong directions in the resulting causal structure to be avoided,” concludes Chakraverty.