The topics will be further introduced and the students can choose between them at an introductory seminar on Tuesday, October 6 at 14:15 in room 1019.
* - the number of supervised students is full
Segmentation with pairwise Markov models, supervisor Jüri Lember.
A hidden Markov model (HMM) [8,9,10,11] is a stochastic model widely used in many areas of theoretical and applied research (signal processing, bioinformatics, financial mathematics etc.). An HMM is a bivariate random process (X,Y), where Y, the regime, is an underlying unobserved (hidden) Markov chain with a finite state space and, given Y, the values of X, the observations, are conditionally independent. Thus a path of the Markov chain is observed with noise, and often the goal of HMM inference is to denoise it or, equivalently, to predict a realisation of the hidden process based on the observations. This task is referred to as the segmentation problem. Two standard approaches to solving it are to apply the Viterbi algorithm to obtain the so-called Viterbi path, or to apply the forward-backward algorithms to obtain the so-called PMAP path, which minimizes the expected number of classification errors. Any hybrid path introduced in [1] is also a possible option. Despite its popularity, an HMM is often too limited a model. Therefore, a much larger class of models, pairwise Markov models (PMMs), has been introduced [2,3,4]. A PMM is just a two-dimensional Markov chain (X,Y). Typically Y (regime) is still finite-valued, but the state space of X (observations) can be arbitrary, for instance d-dimensional Euclidean space. Neither X nor Y is necessarily a Markov chain, but in special cases they can be. PMMs allow (conditional) correlations between observations, and an HMM is just a narrow subclass of PMMs. PMMs (possibly under different names) are often used in financial mathematics and econometrics. For example, a commonly used model is the so-called linear Markov switching model [5,6,7], also known as the switching linear autoregression model of order 1, which generalizes both the HMM and the autoregression model.
It turns out that the main segmentation algorithms (Viterbi, PMAP) can be applied to PMMs, because they rely solely on the Markov property of (X,Y). Therefore, from a segmentation point of view, nothing prevents the use of PMMs instead of HMMs.
The student:
• Gets familiar with HMMs and PMMs.
• Gives a classification of PMMs and sufficient conditions for Y being a Markov chain.
• Studies the Viterbi, PMAP and hybrid algorithms for HMMs and generalizes them to PMMs.
• Implements these algorithms for linear Markov switching models.
• Investigates the effect of PMMs (correlated noise) on segmentation: does an increase in dependence between the observations decrease the difference between the PMAP and Viterbi paths?
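The following is a minimal sketch of how the Viterbi and PMAP paths could be computed for a linear Markov switching model, not a reference implementation. It assumes the parametrisation X_t = a(Y_t) X_{t-1} + b(Y_t) + s(Y_t) e_t with standard normal noise e_t and a regime chain Y with transition matrix P; all parameter values and function names are illustrative assumptions. Both decoders only use the pairwise transition densities p(y_t, x_t | y_{t-1}, x_{t-1}), so the same code would serve any PMM with a finite regime space.

```python
import numpy as np

def simulate(n, P, a, b, s, rng):
    """Draw (y, x) with X_t = a[Y_t]*X_{t-1} + b[Y_t] + s[Y_t]*eps_t, x_0 = 0."""
    K = P.shape[0]
    y = np.empty(n, dtype=int)
    x = np.empty(n)
    y[0], x[0] = rng.integers(K), 0.0          # uniform initial regime (assumption)
    for t in range(1, n):
        y[t] = rng.choice(K, p=P[y[t - 1]])
        x[t] = a[y[t]] * x[t - 1] + b[y[t]] + s[y[t]] * rng.standard_normal()
    return y, x

def log_kernels(x, P, a, b, s):
    """logq[t-1, i, j] = log p(Y_t=j, x_t | Y_{t-1}=i, x_{t-1}), t = 1..n-1."""
    mean = a * x[:-1, None] + b                # shape (n-1, K)
    log_dens = -0.5 * ((x[1:, None] - mean) / s) ** 2 - np.log(s * np.sqrt(2 * np.pi))
    return np.log(P)[None, :, :] + log_dens[:, None, :]

def path_logprob(path, log_init, logq):
    """Joint log-probability of a regime path and the observations."""
    return log_init[path[0]] + sum(logq[t, path[t], path[t + 1]]
                                   for t in range(len(path) - 1))

def viterbi(log_init, logq):
    """Path maximizing the joint probability (dynamic programming)."""
    n, K = logq.shape[0] + 1, log_init.size
    delta = log_init.copy()
    back = np.zeros((n - 1, K), dtype=int)
    for t in range(n - 1):
        scores = delta[:, None] + logq[t]      # scores[i, j]: best path ending i -> j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0)
    path = np.empty(n, dtype=int)
    path[-1] = delta.argmax()
    for t in range(n - 2, -1, -1):
        path[t] = back[t, path[t + 1]]
    return path

def logsumexp(m, axis):
    top = m.max(axis=axis, keepdims=True)
    return np.squeeze(top, axis=axis) + np.log(np.exp(m - top).sum(axis=axis))

def pmap(log_init, logq):
    """Pointwise maximum a posteriori path via forward-backward recursions."""
    n, K = logq.shape[0] + 1, log_init.size
    alpha = np.empty((n, K))
    beta = np.zeros((n, K))
    alpha[0] = log_init
    for t in range(n - 1):
        alpha[t + 1] = logsumexp(alpha[t][:, None] + logq[t], axis=0)
    for t in range(n - 2, -1, -1):
        beta[t] = logsumexp(logq[t] + beta[t + 1][None, :], axis=1)
    return (alpha + beta).argmax(axis=1)

# Illustrative parameters (assumptions, not taken from the references).
P = np.array([[0.95, 0.05], [0.10, 0.90]])
a, b, s = np.array([0.2, 0.8]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
rng = np.random.default_rng(0)
y, x = simulate(200, P, a, b, s, rng)
log_init = np.log(np.full(2, 0.5))
logq = log_kernels(x, P, a, b, s)
v_path, p_path = viterbi(log_init, logq), pmap(log_init, logq)
print("Viterbi errors:", int((v_path != y).sum()), "PMAP errors:", int((p_path != y).sum()))
print("positions where the two paths differ:", int((v_path != p_path).sum()))
```

Setting all entries of `a` to zero removes the dependence of X_t on X_{t-1} and reduces the model to an ordinary HMM, so varying `a` is one simple way to study how the dependence between observations affects the difference between the two paths.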
References
1. A. Koloydenko, J. Lember, Bridging Viterbi and posterior decoding: A generalized risk approach to hidden path inference based on hidden Markov models, Journal of Machine Learning Research, 15, (2014), 1–58.
2. S. Derrode, W. Pieczynski, Signal and image segmentation using pairwise Markov chains, IEEE Trans. Signal Process. 52 (9) (2004) 2477–2489.
3. W. Pieczynski, Pairwise Markov chains, IEEE Trans. Pattern Anal. Mach. Intell. 25 (5) (2003) 634–639.
4. I. Gorynin, H. Gangloff, E. Monfrini, W. Pieczynski, Assessing the segmentation performance of pairwise and triplet Markov models, Signal Process. 145 (2018) 183–192.
5. J. Hamilton, A new approach to the economic analysis of nonstationary time series and the business cycle, Econometrica (1989) 357–384.
6. J. Hamilton, Analysis of time series subject to changes in regime, J. Econometrics 45 (1–2) (1990) 39–70.
7. J. Hamilton, Regime switching models, in: Macroeconometrics and Time Series Analysis, Springer, (2010), 202–209.
8. J. Lember, J. Sova, Existence of infinite Viterbi path for pairwise Markov model, Stochastic Processes and their Applications (to appear)
9. J. Lember, Introduction to statistical learning theory (lecture notes), 2012
10. A. Koloydenko, K. Kuljus, J. Lember, Theory of segmentation, In: Hidden Markov Models, INTECH (2011)
11. L. Rabiner, A tutorial on Hidden Markov Models and selected applications in speech recognition, Proc. IEEE 77 (2) (1989), 257–286
On Neural Network models in integral form with uncertainty, supervisor Stefania Tomasiello
Neural networks (NNs) are well-known computing schemes mainly used for forecasting and classification problems, but also employed to solve ordinary or partial differential equations. They are usually modeled as a system of differential equations [1], or of integro-differential equations for the recurrent schemes [2, 3]. Neural network models in integral form (as a system of integral equations) have not been extensively investigated, though they exhibit some interesting properties [4]. The aim of this thesis is to investigate the approximation ability of a new NN scheme with uncertainty in integral form, both formally and numerically through some benchmark problems. In particular, credit risk assessment will be considered as an application example [5], in order to predict whether a customer will be solvent or not.
References
[1] S. Haykin, Neural Networks and Learning Machines, Pearson College, 3rd ed. 2008
[2] B. de Vries, J.C. Principe, A theory for neural networks with time delays, in: Proceedings: Conference on Advances in Neural Information Processing Systems (NIPS-3), 1990, pp. 162–168.
[3] F. Colace, V. Loia, S. Tomasiello, Revising Recurrent Neural Networks from a Granular perspective, Applied Soft Computing, 2019, 82, 105535
[4] A. Nordbo, J. Wyller, G.T. Einevoll, Neural network firing-rate models on integral form: Effects of temporal coupling kernels on equilibrium-state stability, Biological Cybernetics, 2007, 97 (3), 195–209.
[5] M. Corazza et al. Design of adaptive Elman networks for credit risk assessment, Quantitative Finance, 2020, in press
Simulating financial transactions within a network of banks, supervisor Kaur Lumiste
Financial transaction data is a very sensitive subject within the financial industry, and therefore the data is well guarded. Financial data is typically used in-house to improve risk management (e.g. credit scoring), improve internal processes, carry out market analysis etc., but the data stays within the financial institution. If someone wants to incorporate data from another financial institution, that data is very hard to acquire. If someone were to develop new technology that is based on synthetic data, then this is not possible, since sources do not share their data.
Because of this, more and more attention is being paid to simulating transactional data. For example, a synthetic dataset is being built in the UK. It would be freely distributable, would try to model real life and would be flexible enough to simulate drastic changes within society (e.g. the aftermath of the COVID-19 pandemic).
The master's thesis comprises a review of simulating transactional data and builds a framework for simulating a network of financial institutions (based on the literature review). The topic requires knowledge of mathematical modelling and stochastic distributions, and good programming skills.
Detecting money laundering using hidden Markov models, supervisor Kaur Lumiste
The fight against money laundering has become a hot topic within the financial sector, as more and more flaws found in the money laundering risk management systems of large corporate banks are brought to the public. In June 2020 Kseniia Kassianova defended her master's thesis, in which she tested hidden Markov models for detecting suspicious transactions.
Money laundering is a process that takes illegally obtained finances and puts them through a cycle of transactions in a bank (or a network of banks) so that they appear to come from a legitimate source. Banks usually do not know whether their client X wishes to commit an act of money laundering (i.e. hidden information). But the bank will see the manifestation of this hidden information: transactions and their characteristics (sum, counterparty, time etc.). The bank has to analyse the transactional data and give an estimate: is the new transaction of an illicit nature or not?
The current thesis aims to improve and elaborate on the results of the aforementioned master's thesis: to take a look at the theoretical background and to extend the empirical study.
The topic requires basic knowledge of probability theory; knowledge of hidden Markov chains and of the financial sector will come in handy.
Transforming health data to business process mining (BPM) format, supervisors Raivo Kolde, Joonas Puura
Business process mining (BPM) models sequences of events, such as a customer's journey to making a deal or user actions on a website, in a machine-readable process format. A number of data mining algorithms have been implemented for such process-centric datasets. In a sense, a patient's journey through the healthcare system resembles a business process, and if the data were converted to the BPM format, many of the BPM methods could be applied. This topic would explore the best ways to translate medical data into the BPM format. To validate the translation strategy, the thesis should include a case study where BPM algorithms are used to extract useful information from medical data.
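As a rough illustration of what such a translation could look like, the sketch below turns a handful of made-up patient records into an event log with the three attributes that process mining tools typically expect (case identifier, activity, timestamp) and counts directly-follows pairs, a basic ingredient of many process discovery algorithms. The record layout, field names and data are hypothetical assumptions; real medical data would need a carefully designed mapping.

```python
from collections import Counter
from datetime import date

# Hypothetical, made-up records: (patient_id, event, date).
records = [
    ("p1", "Prescription", date(2020, 1, 6)),
    ("p1", "GP visit", date(2020, 1, 3)),
    ("p1", "Blood test", date(2020, 1, 5)),
    ("p2", "GP visit", date(2020, 2, 1)),
    ("p2", "Blood test", date(2020, 1, 20)),
    ("p2", "Prescription", date(2020, 2, 3)),
]

def to_event_log(records):
    """One row per event; the patient plays the role of the process case."""
    log = [{"case_id": pid, "activity": event, "timestamp": day}
           for pid, event, day in records]
    return sorted(log, key=lambda e: (e["case_id"], e["timestamp"]))

def traces(event_log):
    """Ordered list of activities for each case."""
    out = {}
    for e in event_log:
        out.setdefault(e["case_id"], []).append(e["activity"])
    return out

def directly_follows(event_log):
    """Count how often activity A is immediately followed by activity B."""
    counts = Counter()
    for activities in traces(event_log).values():
        counts.update(zip(activities, activities[1:]))
    return counts

event_log = to_event_log(records)
dfg = directly_follows(event_log)
for (first, second), n in sorted(dfg.items()):
    print(f"{first} -> {second}: {n}")
```

Choosing the patient as the case is only one option; a treatment episode or a hospital stay could equally serve as the case notion, and comparing such choices is part of the translation problem.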
Representation learning on free text medical data, supervisors Raivo Kolde, Mark Fishel
Using free text medical data in research and machine learning algorithms traditionally requires expensive and slow feature engineering and information retrieval steps. Recently, neural network based models have consistently shown the ability to learn useful features from free text with relatively little supervision. In this project, we want to learn representations for the free text fields of discharge reports in "Digilugu" and study whether such representations are useful in various prediction tasks.
Generating example medical texts for Health Informatics course, supervisors Raivo Kolde, Mark Fishel
Information extraction from free text written by doctors is one of the important tasks in health data research. However, real data cannot be used for teaching purposes, as it is almost impossible to fully anonymize such entries. Therefore, the goal here is to train a generative language model that can produce medical text to be used in the teaching process. The work will rely on the recent advances in generative text models, on their instances already trained for the Estonian language, and on real Estonian-language medical texts.
Visualising health trajectories, supervisors Raivo Kolde, Sulev Reisberg
We have developed an algorithm to identify common sequences of events in medical data. Such trajectories of diagnoses, drug prescriptions and procedures can get rather complicated and good visualisation methods are needed for finding relevant patterns in the data. The goal of the thesis is to produce an interactive tool to browse the results and do further characterization of trajectories of interest.
Valuation of American options using the Monte-Carlo method, supervisor Toomas Raus
The Monte-Carlo method is a common numerical method for pricing different types of European options, alongside lattice methods and finite difference methods. Its advantage over the other approaches is that it allows the option price to be found even for complex payoff functions. Using the Monte-Carlo method for American options is more complicated, because the option can be exercised before its expiry. However, several variants of the Monte-Carlo method for American options have been proposed in the literature. The master's thesis is planned to give an overview of these approaches and to compare them numerically.
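To give a flavour of one such variant, here is a minimal sketch of the least squares approach of Longstaff and Schwartz for an American put. It assumes geometric Brownian motion under the risk-neutral measure, a quadratic polynomial basis for the regression, and illustrative parameter values; these are simplifying assumptions, and the thesis may well use other models, basis functions or variance reduction techniques. The Black-Scholes price of the corresponding European put is computed for comparison.

```python
import math
import numpy as np

def european_put_bs(s0, strike, r, sigma, maturity):
    """Black-Scholes price of a European put (closed form)."""
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma**2) * maturity) / (sigma * math.sqrt(maturity))
    d2 = d1 - sigma * math.sqrt(maturity)
    return strike * math.exp(-r * maturity) * cdf(-d2) - s0 * cdf(-d1)

def american_put_lsm(s0, strike, r, sigma, maturity, n_steps, n_paths, seed=0):
    """Least squares Monte-Carlo estimate of an American put price."""
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    disc = math.exp(-r * dt)
    z = rng.standard_normal((n_paths, n_steps))
    log_incr = (r - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z
    s = s0 * np.exp(np.cumsum(log_incr, axis=1))    # prices at t_1, ..., t_n
    cash = np.maximum(strike - s[:, -1], 0.0)       # payoff at maturity
    for t in range(n_steps - 2, -1, -1):            # backward induction
        cash *= disc                                # discount one step back
        exercise = np.maximum(strike - s[:, t], 0.0)
        itm = exercise > 0                          # regress on in-the-money paths only
        if itm.sum() > 3:
            x = s[itm, t] / strike
            coef = np.polyfit(x, cash[itm], 2)      # quadratic basis (assumption)
            continuation = np.polyval(coef, x)
            idx = np.flatnonzero(itm)[exercise[itm] > continuation]
            cash[idx] = exercise[idx]               # exercise now on these paths
    return disc * cash.mean()

# Illustrative parameters (assumptions).
params = dict(s0=36.0, strike=40.0, r=0.06, sigma=0.2, maturity=1.0)
euro = european_put_bs(**params)
amer = american_put_lsm(**params, n_steps=50, n_paths=20000)
print(f"European put (Black-Scholes): {euro:.3f}")
print(f"American put (least squares Monte-Carlo): {amer:.3f}")
```

The difference between the two prices is the early exercise premium; the regression step is what replaces the backward induction that lattice methods perform exactly on a tree.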
Literature:
Longstaff, F.A., Schwartz, E.S., “Valuing American options by simulation: A simple least squares approach,” (2001)
Tilley, J. A., “Valuing American options in a path simulation model,” (1993)
Yue-Kuen Kwok. Mathematical Models of Financial Derivatives, p.352-369. (2008)
Broadie, M., Glasserman, P., “Pricing American-style securities using simulation,” (1997)
Glasserman, P., Monte Carlo methods in financial engineering, Springer, New York (2004).
Interpretable approaches for financial time series forecasting, supervisors Stefania Tomasiello, Toomas Raus
Over the last years, different deep learning approaches (which are mostly black-box systems) have been discussed for financial time series forecasting, e.g. [1, 2], but the interpretability (or explainability) issue was not addressed. Fuzzy inference systems are based on a set of rules and they are usually interpretable, but when the system becomes more complex it may no longer be completely interpretable [3].
A consensus on a definition of interpretability has not been reached yet, even though a position paper recently appeared [4]. Broadly speaking, interpretability can be understood as human readability. For instance, an interpretable model makes it easy to detect the reasons why its predictions go wrong. Lately, interpretability has become a requirement for complying with government regulations in sensitive applications, such as finance, public health, and transportation. In fact, this issue has received attention from the European Parliament, whose General Data Protection Regulation recognizes the right to receive an explanation for algorithmic decisions [5]. The aim of this thesis is to investigate and revise some existing approaches from the perspective of interpretability, with application to financial time series forecasting. This also implies a clear mathematical formulation of the methods.
References
[1] Bao W, Yue J, Rao Y (2017) A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PLoS ONE 12(7): e0180944.
[2] A. Preeti et al. Financial Time Series Forecasting Using Deep Learning Network, in G. C. Deka et al. (Eds.): ICACCT 2018, CCIS 899, pp. 23–33, 2018.
[3] A. Vlasenko et al. A Novel Ensemble Neuro-Fuzzy Model for Financial Time Series Forecasting, Data, 2019, 4, 126
[4] Lipton, Z. C. (2018) The mythos of model interpretability. ACM Queue 16(3), 1–27.
[5] Regulation (EU) 2016/679 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) [2016] OJ L119/1.
Machine learning for fractional partial differential equations, supervisors Stefania Tomasiello, Urve Kangro
There has been growing interest recently in the application of neural network-like approaches to the numerical solution of partial differential equations (PDEs), e.g. [1, 2], even with fractional derivatives [3]. Such approaches seem to overcome the typical shortcomings of classical grid-based techniques, e.g. the need for a suitable discretization, especially in complex domains.
The aim of this thesis is to investigate a neural network-like approach for the numerical solution of a class of fractional PDEs, and in particular the space fractional Black–Scholes equation for pricing European options, which has been recently considered in [4] by using a finite difference scheme.
References
[1] V. Dwivedi, B. Srinivasan, Physics Informed Extreme Learning Machine (PIELM)–A rapid method for the numerical solution of partial differential equations, Neurocomputing 391 (2020) 96–118
[2] Q. Wei, Y. Jiang, J. Z. Y. Chen, Machine-learning solver for modified diffusion equations, PHYSICAL REVIEW E 98 (2018) 053304
[3] H. Qu, Z. She, X. Liu, Neural network method for solving fractional diffusion equations, Applied Mathematics and Computation 391 (2021) 125635
[4] K. S. Patel, M. Mehra, Fourth order compact scheme for space fractional advection–diffusion reaction equations with variable coefficients, Journal of Computational and Applied Mathematics 380 (2020) 112963
[5] M. Raissi, G. E. Karniadakis, Hidden physics models: Machine learning of nonlinear partial differential equations, J. Comput. Phys. 357 (2018) 125–14
Cramér-Lundberg approximation in risk theory, supervisor Jaan Lellep
Exponential bounds for the ruin probability are established and an extension of the renewal theorem in risk theory is presented.
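As a small numerical illustration of such an exponential bound, the sketch below treats the classical compound Poisson risk model with exponentially distributed claims, a case in which the ruin probability is known in closed form. The adjustment coefficient R is found by bisection as the positive root of lam*(M(r) - 1) - c*r = 0, where M is the moment generating function of the claim size; the parameter values, the function names and the choice of claim distribution are illustrative assumptions.

```python
import math

def adjustment_coefficient(lam, c, mgf, r_max, iters=200):
    """Positive root of lam*(mgf(r) - 1) - c*r on (0, r_max), by bisection.

    Assumes the net profit condition (c exceeds lam times the mean claim),
    so the function is negative near 0 and positive near r_max.
    """
    def h(r):
        return lam * (mgf(r) - 1.0) - c * r
    lo, hi = 1e-6 * r_max, (1.0 - 1e-9) * r_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative parameters (assumptions): claim intensity, premium rate, mean claim.
lam, c, mu = 1.0, 1.5, 1.0
mgf = lambda r: 1.0 / (1.0 - mu * r)            # exponential claims, valid for r < 1/mu
mgf_prime = lambda r: mu / (1.0 - mu * r) ** 2

R = adjustment_coefficient(lam, c, mgf, r_max=1.0 / mu)
# Constant of the Cramer-Lundberg approximation psi(u) ~ C * exp(-R*u).
C = (c - lam * mu) / (lam * mgf_prime(R) - c)

def psi_exact(u):
    """Closed-form ruin probability for exponential claims."""
    return (lam * mu / c) * math.exp(-(1.0 / mu - lam / c) * u)

def lundberg_bound(u):
    """Lundberg's exponential upper bound exp(-R*u)."""
    return math.exp(-R * u)

for u in (0.0, 1.0, 5.0, 20.0):
    print(f"u={u:5.1f}  exact={psi_exact(u):.6f}  "
          f"approximation={C * math.exp(-R * u):.6f}  bound={lundberg_bound(u):.6f}")
```

For exponential claims the approximation coincides with the exact ruin probability, which makes this case a convenient sanity check before moving to claim distributions without a closed form.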
References
1. H. Schmidli, Risk Theory, Springer, 2017.
2. H. Schmidli, An extension to the renewal theorem and an application to risk theory, The Annals of Applied Probability, 1997, 7, 121–133.
3. T. Rolski, H. Schmidli, V. Schmidt, J. Teugels, Stochastic Processes for Insurance and Finance, Vol. 505, Wiley, 2009.
Tree-based methods in supervised learning with Estonian Health Insurance Fund data