Thursday, June 12, 2014

Introduction to Data


Data, Random Variable and Parameter
In most of our experiments, we represent data as a random variable. A random variable is a mapping from the outcome space to the set of real numbers. But is the data really random?

Let's take the tossing of a die. Since the die is fair, it can randomly produce values 1 to 6, each with equal probability. With the frequentist approach, given a sufficient number of trials, we should see all the numbers 1-6 appear with roughly equal frequency. You can check this in Matlab with the following code:
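The original snippet did not survive, so here is a minimal sketch of what it might look like (the number of trials is my own choice):

```matlab
% Simulate N fair die rolls and look at the relative frequencies
N = 100000;                      % number of trials
rolls = randi(6, N, 1);          % uniform random integers in 1..6
freq = histc(rolls, 1:6) / N;    % relative frequency of each face
disp(freq')                      % each entry should be close to 1/6
```

As N grows, every entry of freq approaches 1/6, which is the frequentist reading of "equal probability".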



A distribution is defined as the way in which something is shared among a group. Here, the total probability is shared among all the outcomes in the sample space. Since the sharing is equal, this is called the uniform distribution.


What's with data? Does it have a parameter? What is a parameter?
Can we represent it as a random variable?

What is a distribution? How can you characterize a distribution?
Can you generalize this?

Example:
(1) Rolling a die many times
(2) Rolling many dice many times and taking the count
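To make example (2) concrete, here is a sketch where the count is the number of sixes among 20 dice per trial (the 20 dice and the "count sixes" reading are my own illustrative choices):

```matlab
% Roll 20 dice per trial and count how many show a six (a binomial count)
trials = 100000;
dice = 20;
rolls = randi(6, dice, trials);         % one column of 20 faces per trial
counts = sum(rolls == 6, 1);            % number of sixes in each trial
freq = histc(counts, 0:dice) / trials;  % relative frequency of each count
bar(0:dice, freq)                       % no longer uniform: peaks near dice/6
```

Unlike the single die, the count is not uniformly distributed; it piles up around dice/6, which is the binomial shape.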

Tuesday, October 01, 2013

Fifth Week


Nothing much to add...
Deriving the MLE for Maximum entropy
Verifying with a small example
Writing it all down

Research is a validation of the first noble truth: Suffering Exists.

Sunday, September 22, 2013

Fourth Week Updates

Ok, updates for week number 4:

The highlights during this week were:

First: wrote Matlab code for a Naive Bayes classifier for text categorization based on the 20 Newsgroups dataset.
Checked and compared the performance against Truyen's code for a maxent classifier. Maxent outperforms NB for a lower l2-penalty.

Second: Tried to derive the MLE for NB. Thanks to the online lectures, I was able to make some sense of it.
Still working on the maxent derivation. I am kinda gonna give up on understanding the theory behind maxent.

That's all there is for now!

Thursday, September 12, 2013

Third Week....


Did not do any work during the weekend. Tried to read, but wasn't able to.
I haven't got the key to my room yet, so I am not coming to the University on weekends.

Monday 9th Sept

Thought I finally had an idea on the first problem mentioned by Truyen in his handout.
This was about Naive Bayes. Had a discussion with him. It turns out that what I thought was not actually correct.
He asked me to brush up my background and do the Coursera courses on ML and PGM.

Tuesday and Wednesday, 10th and 11th of Sept

Continued with ML lectures. Feeling sleepy all the time.
Found a new house to stay in near the uni. They allow couples. Gayathri likes it. That's all that matters to me.

Thursday 12th of Sept

Truyen sends me another mail with a plan of action.  Truyen, Tu and I have a small chat in the tea room about research stuff. Seems I will be working with Tu in the future.

I have no idea how to start with the work mentioned in the plan of action.
I think I have been given 1 week to implement the 1st task.
Kavilamma, give me strength....

Attended writing workshop by Margret Kumar. Very informative.
Understood that I did not know anything about writing a paper.

Friday, September 06, 2013

The Initial Days

3/Sept/13
"Leveraging Aggregated Statistics to Improve Predictive Models"
I have been given 3 months to get my background on this.
Not much time if you ask me.

4/Sept/13
Truyen gave me another book. It's called "An Introduction to Medical Statistics" by Martin Bland.
He also told me to master the "Elements of Statistical Learning" book.

5/Sept/13
Started reading with basic probability and statistics.
I can now understand one concept in the handout - "Marginals and Consistency"
Also read up on Bayes rule.
Some very basic stuff, I agree. But I am feeling slightly happy.

6/Sept/13
Continuing with Bayes rule. Cannot figure out this chain rule in probability.
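A note for future me (my own working, not from the handout), with the chain rule spelled out for three events:

P(A, B, C) = P(A) * P(B | A) * P(C | A, B)

Each factor conditions on everything that came before it, so multiplying them back together rebuilds the joint distribution. It is just Bayes-style conditioning applied repeatedly.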




Thursday, August 29, 2013

Melbourne Calling: Change of Research & Thesis Advisor


So yes, it had to happen.

Finally applied for a new PhD in Deakin University, Melbourne for Machine learning.

Day 1

Introduced myself to the team. Team looks very interesting. About 8-10 PhD students, most of them Vietnamese, 1 Iranian and 1 Indian girl.

Got a computer and a nice room the next day.

Now waiting to find out what I will be working on.



Tuesday, August 21, 2012



Started with the bachelor thesis “Grammatical Evolution” by Adam Nohejl, Charles University in Prague.

I suddenly realize that I now have to adjust to a whole new set of jargon. I am a noob when it comes to evolutionary computing, let alone grammatical evolution.

So here goes the first word for today:

metaheuristic: a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality.
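A minimal sketch of one such metaheuristic, random-restart hill climbing on a toy one-dimensional function (the function, step size, and restart counts are my own illustrative choices, not from the thesis):

```matlab
% Random-restart hill climbing: repeatedly perturb a candidate solution
% and keep the perturbation only if it improves the quality measure.
f = @(x) -(x - 2).^2;                    % toy quality measure, maximum at x = 2
best = -Inf; bestx = NaN;
for restart = 1:5
    x = 10 * rand() - 5;                 % random starting point in [-5, 5]
    for step = 1:1000
        cand = x + 0.1 * randn();        % small random perturbation
        if f(cand) > f(x), x = cand; end % keep only improvements
    end
    if f(x) > best, best = f(x); bestx = x; end
end
fprintf('best x ~ %.2f\n', bestx)        % should land near 2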