# Month: September 2020

## Monte Carlo Simulation

One day two years ago, when I was working in the financial field, my boss sent our team an email asking us to propose some machine learning techniques to predict stock prices. So, after accepting the assignment from my manager, our team began to research and apply some approaches …

## Hiring- Data Scientist (Algorithm Theory)

Job Title: Data Scientist (Algorithm Theory) · Location: Ho Chi Minh · Contact: recruitment　@　mti-tech.vn · Employment: Full-time · Level: Middle/Senior · Report to: Line Manager. If you want to join exciting and challenging projects, MTI Tech could be the next destination for your career. MTI Technology specializes in creating smart mobile contents and services that transform …

## Bayesian estimator of the Bernoulli parameter

In this post, I will explain how to calculate a Bayesian estimator. The taken example is very simple: estimate the parameter θ of a Bernoulli distribution. A random variable X which has the Bernoulli distribution is defined as P(X = 1) = θ and P(X = 0) = 1 − θ. In this case, we can write P(X = x) = θ^x (1 − θ)^(1−x). In reality, the simplest way …
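As a small taste of the full post, the Bayes estimator of θ under a conjugate Beta prior can be sketched as follows; the prior parameters `a`, `b` and the helper name are illustrative, not taken from the post:

```python
# Hypothetical sketch: Bayes estimator of the Bernoulli parameter theta
# under a conjugate Beta(a, b) prior, given k successes in n trials.
def bayes_estimate(k, n, a=1.0, b=1.0):
    # The Beta prior is conjugate to the Bernoulli likelihood, so the
    # posterior is Beta(a + k, b + n - k); its mean is the Bayes estimator.
    return (a + k) / (a + b + n)

# Example: 7 successes in 10 trials with a uniform Beta(1, 1) prior
print(bayes_estimate(7, 10))  # 8/12 ≈ 0.667
```

With the uniform prior (a = b = 1), this shrinks the raw frequency k/n slightly toward 1/2, which is the usual motivation for the Bayesian estimate on small samples.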

## N-gram language models – Part 2

Background In part 1 of my project, I built a unigram language model: it estimates the probability of each word in a text simply based on the fraction of times the word appears in that text. The text used to train the unigram model is the book “A Game of Thrones” by George R. R. Martin (called train). …
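The estimation rule described above (probability = fraction of occurrences) can be sketched in a few lines; the whitespace tokenization and the toy sentence are simplified stand-ins for the post's actual setup:

```python
from collections import Counter

# Unigram sketch: P(word) = count(word) / total number of tokens.
def unigram_probs(text):
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return {w: c / total for w, c in counts.items()}

probs = unigram_probs("the king in the north")
print(probs["the"])  # "the" is 2 of 5 tokens → 0.4
```

By construction the probabilities sum to 1 over the words seen in training, which is why unseen words need smoothing, a point the series returns to.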

## N-gram language models – Part 1

Background Language modeling — that is, predicting the probability of a word in a sentence — is a fundamental task in natural language processing. It is used in many NLP applications such as autocomplete, spelling correction, or text generation.   Currently, language models based on neural networks, especially transformers, are the state of the art: they predict very accurately a …

## Gaussian samples – Part (3)

Background The goal of this project is to generate Gaussian samples in 2-D from uniform samples, the latter of which can be readily generated using built-in random number generators in most computer languages. In part 1 of the project, the inverse transform sampling was used to convert each uniform sample into respective x and y coordinates of …
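The recap above (each uniform sample mapped to an x or y coordinate) can be sketched as two independent inverse-CDF transforms; `NormalDist` from Python's standard library stands in for whatever inverse-CDF implementation the post uses:

```python
import random
from statistics import NormalDist

# Sketch: each 2-D Gaussian point is built from two independent uniform
# samples, one mapped through the inverse Gaussian CDF for x, one for y.
def gaussian_2d(n, seed=0):
    rng = random.Random(seed)
    inv_cdf = NormalDist().inv_cdf  # standard normal inverse CDF
    return [(inv_cdf(rng.random()), inv_cdf(rng.random())) for _ in range(n)]

points = gaussian_2d(1_000)
```

Because the two coordinates come from independent draws, the resulting cloud is an isotropic 2-D Gaussian centered at the origin.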

## Gaussian samples – Part (2)

Background In part 1 of this project, I’ve shown how to generate Gaussian samples using the common technique of inversion sampling: First, we sample from the uniform distribution between 0 and 1 — green points in the below animation. These uniform samples represent the cumulative probabilities of a Gaussian distribution i.e. the area under the distribution to …
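The two steps above can be sketched with Python's standard library, where `NormalDist.inv_cdf` plays the role of the inverse Gaussian CDF; the function name and parameters here are illustrative, not the post's code:

```python
import random
from statistics import NormalDist

# Inversion sampling sketch: draw u ~ Uniform(0, 1), treat it as a cumulative
# probability, and map it back through the inverse Gaussian CDF.
def gaussian_samples(n, mu=0.0, sigma=1.0, seed=42):
    rng = random.Random(seed)
    dist = NormalDist(mu, sigma)
    # each uniform draw is the area under the Gaussian to the left of the sample
    return [dist.inv_cdf(rng.random()) for _ in range(n)]

samples = gaussian_samples(10_000)
```

The sample mean and spread should match the requested mu and sigma, which is a quick sanity check on the inversion.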

## Gaussian samples – Part (1)

Background Gaussian sampling — that is, generating samples from a Gaussian distribution — plays an important role in many cutting-edge fields of data science, such as Gaussian processes, variational autoencoders, or generative adversarial networks. As a result, you often see functions like tf.random.normal in their tutorials. But, deep down, how does a computer know how to generate Gaussian samples? This series …

## Practice Design for Try/Fail Fast

At the moment, AI/ML/DL are hot keywords in software development. The world has more and more successful projects based on AI technologies, such as Google Translate, AWS Alexa, … AI makes machines smarter. So, the path from idea to success has many challenges if we want to build great solutions. I have spent some time working …

## N-gram language models – Part 3

Background In previous parts of my project, I built different n-gram models to predict the probability of each word in a given text. This probability is estimated using an n-gram — a sequence of words of length n — which contains the word. The below formula shows how the probability of the word “dream” is estimated …
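The count-based estimate described above can be sketched for a bigram model (n = 2); the toy sentence and the helper name are illustrative, not the post's “A Game of Thrones” data:

```python
from collections import Counter

# Bigram sketch: P(word | previous) = count(previous, word) / count(previous).
def bigram_prob(tokens, prev, word):
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])  # counts of words in context position
    return pair_counts[(prev, word)] / context_counts[prev]

tokens = "i have a dream a dream i keep".split()
print(bigram_prob(tokens, "dream", "a"))  # "dream" is followed by "a" 1 time out of 2 → 0.5
```

Longer n-grams follow the same pattern with longer context tuples, at the cost of sparser counts — the trade-off the series explores.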