DATA.ML.100 Introduction to Pattern Recognition and Machine Learning - 13.10.2023 (Online Test)

DATA.ML.100 Introduction to PR and ML Final online test, October 13th 2023
J.-K. Kämäräinen / TAU Computing Sciences page 1/3

Rules:

You NEED access to a computer with Internet access and Python installed. You NEED all Python packages
used in the course exercises (NumPy, Matplotlib).

You are allowed to use the Internet and all course materials. You are NOT allowed to communicate with anyone in
any form during the exam.

You MUST join the course Slack channel during the exam and you can send private messages to the course
instructor.

All code and text MUST be written by yourself. Copying substantial parts from any source is PLAGIARISM.

The online test has multiple deadlines. If you MISS one, the rest of them are not evaluated. DO NOT re-submit
files for the previous deadlines, as that resets their time tags and your test is over.

1. Basics (0pts) [Return before 10am25]

Take two screenshots of your full desktop:

* An empty desktop where all applications are minimized (see Fig. 1(a))

* Open your grading page in Moodle DATA.ML.100 (your name must be visible) and the course
Slack (latest message must be visible) (see Fig. 1(b))

Submit the following items before the deadline:

* Screenshot 1 as PNG file: surname_desktop1.png
* Screenshot 2 as PNG file: surname_desktop2.png

(a) Empty desktop (b) Moodle and Slack open

Figure 1: Examples of the screenshots that should be returned

2. Data histograms (2pts) [Return before 10am45]

Download the training samples file
* X_train.txt
and the target values file
* y_train.txt

You can use the numpy.loadtxt() function to load the data files.

Your training data contains measurements of the male and female height (1st column) and weight (2nd
column). The target (ground-truth) vector contains the class labels (0: male, 1: female).
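
The loading step described above can be sketched as follows. This is only an illustration: the in-memory text buffers are hypothetical stand-ins for the real data files, their values are made up, and whitespace-separated columns (the numpy.loadtxt() default) are an assumption about the file format.

```python
import io

import numpy as np

# Hypothetical stand-ins for the training samples and target files.
# The numbers are invented; the real files are not reproduced here.
x_file = io.StringIO("180.0 80.0\n165.0 60.0\n175.0 72.0\n")
y_file = io.StringIO("0\n1\n0\n")

# loadtxt accepts a file name or a file-like object.
X_train = np.loadtxt(x_file)              # shape (n_samples, 2)
y_train = np.loadtxt(y_file).astype(int)  # shape (n_samples,)

height = X_train[:, 0]  # 1st column: height
weight = X_train[:, 1]  # 2nd column: weight

# Boolean masks split the samples by class label (0: male, 1: female).
male_height = height[y_train == 0]
female_height = height[y_train == 1]
print(male_height, female_height)
```

With the real files, the two StringIO objects would be replaced by the file names.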

Plot the height histograms of both classes into the same histogram plot. Make sure the histograms are
clearly distinguishable from each other.

Show the class labels in the histogram plot (legend) and label the x- and y-axes correctly.
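
One possible way to draw such a plot is sketched below. The function name plot_height_histograms, the bin count, the colours, and the axis label texts are illustrative choices, not requirements from the test, and the demo data is synthetic. The height unit is not stated in the task, so none is shown on the axis.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_height_histograms(height, labels, bins=20):
    """Overlay the height histograms of class 0 (male) and class 1 (female)."""
    height = np.asarray(height, dtype=float)
    labels = np.asarray(labels).astype(int)
    # Shared bin edges keep the two histograms directly comparable.
    edges = np.linspace(height.min(), height.max(), bins + 1)
    fig, ax = plt.subplots()
    # Semi-transparent bars in different colours keep both classes visible.
    ax.hist(height[labels == 0], bins=edges, alpha=0.5,
            color="tab:blue", label="male (0)")
    ax.hist(height[labels == 1], bins=edges, alpha=0.5,
            color="tab:red", label="female (1)")
    ax.set_xlabel("Height")
    ax.set_ylabel("Number of samples")
    ax.legend()
    return ax


# Synthetic demo data (invented means and spreads, not the course data).
rng = np.random.default_rng(0)
height = np.concatenate([rng.normal(180, 7, 200), rng.normal(166, 6, 200)])
labels = np.concatenate([np.zeros(200), np.ones(200)])
ax = plot_height_histograms(height, labels)
# In an interactive session, plt.show() would open the plot window.
```

Transparency is used here because overlapping opaque bars would hide part of one class.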

All code should be in a single Python program which can be run as:

(dataml100) Joni$ python kamarainen_histogram.py
(dataml100) Joni$

Submitted items:

* Full desktop screenshot of running your code and printing/plotting outputs (+Moodle grading view
+histogram plot windows):
surname_histogram_screenshot.png

* Python source code: surname_histogram.py

3. Bayes classifier [Multiple deadlines]

In addition to the training data, download also the test files

* X_test.txt
* y_test.txt

We use only the first column that is the person height. The class labels are 0 for male and 1 for female.

You write a single Python file that is extended for each new deadline.

 

(a) Baseline classifier (3pts) [Return before 11am00]

Assign all test samples either the male (0) or female (1) label and print the correct classification
percentage for both cases.
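
A minimal sketch of such a constant-label baseline, assuming the test labels are available as a NumPy array. The helper name baseline_accuracy and the four demo labels are made up for illustration.

```python
import numpy as np


def baseline_accuracy(y_true, constant_label):
    """Percentage of samples classified correctly when every sample
    is assigned the same label."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = np.full_like(y_true, constant_label)
    return 100.0 * np.mean(y_pred == y_true)


# Invented labels standing in for the real test targets.
y_test = np.array([0, 0, 1, 0])
for label, name in [(0, "male"), (1, "female")]:
    acc = baseline_accuracy(y_test, label)
    print(f"All samples labelled {name} ({label}): {acc:.1f} % correct")
```

For two classes the two percentages always sum to 100, so the better of the two gives the score that any real classifier should exceed.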

(b) Gaussian pdf (10pts) [Return before 11am20]

Write a Python function for the Gaussian probability density function:

$$\mathcal{N}(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \qquad (1)$$

Use the function to plot the class-specific likelihoods for all training male and female samples (black
circles for male and red circles for female). Use male distribution for the male samples and female
for the female samples. X-axis is height and y-axis is the computed likelihood.

Note: NumPy functions mean() and std() can be used.
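
The Gaussian density with mean mu and standard deviation sigma could be written roughly as below. The function name gaussian_pdf and the sample heights are illustrative assumptions. Note that std() uses the population estimate (ddof=0) by default.

```python
import numpy as np


def gaussian_pdf(x, mu, sigma):
    """Gaussian density exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * sqrt(2*pi))."""
    x = np.asarray(x, dtype=float)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


# Invented heights standing in for the male and female training samples.
male_height = np.array([178.0, 182.0, 175.0, 185.0])
female_height = np.array([162.0, 168.0, 165.0, 170.0])

# Each class is evaluated under its own distribution, with parameters
# estimated from that class only.
male_lik = gaussian_pdf(male_height, male_height.mean(), male_height.std())
female_lik = gaussian_pdf(female_height, female_height.mean(), female_height.std())
print(male_lik, female_lik)
```

For the requested plot, Matplotlib's format strings "ko" and "ro" give black and red circle markers, for example plt.plot(male_height, male_lik, "ko").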

(c) Bayes classifier (10pts) [Return before 11am45]

Using the Gaussian pdf, compute the Bayes posterior probabilities for all test samples, classify them
according to the probabilities and print the correct classification percentage.
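
A sketch of one way to do this: by Bayes' rule the posterior of each class is proportional to its prior times the Gaussian likelihood. Estimating the priors from the class frequencies in the training data is an assumption, as the task does not say how to choose them. The names gaussian_pdf and bayes_classify and all data values are invented for illustration.

```python
import numpy as np


def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    x = np.asarray(x, dtype=float)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def bayes_classify(x_train, y_train, x_test):
    """Return predicted labels and posteriors of shape (2, n_test)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train).astype(int)
    x_test = np.asarray(x_test, dtype=float)
    scores = []
    for c in (0, 1):
        x_c = x_train[y_train == c]
        prior = x_c.size / x_train.size  # assumption: prior = class frequency
        scores.append(prior * gaussian_pdf(x_test, x_c.mean(), x_c.std()))
    scores = np.stack(scores)
    # Dividing by the evidence (sum over classes) makes each column sum to 1.
    posteriors = scores / scores.sum(axis=0)
    return posteriors.argmax(axis=0), posteriors


# Invented heights standing in for the training and test files.
x_train = np.array([178.0, 182.0, 175.0, 185.0, 162.0, 168.0, 165.0, 170.0])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])
x_test = np.array([181.0, 163.0, 176.0])
y_test = np.array([0, 1, 0])

y_pred, posteriors = bayes_classify(x_train, y_train, x_test)
accuracy = 100.0 * np.mean(y_pred == y_test)
print(f"Correct classification: {accuracy:.1f} %")
```

Normalising by the evidence does not change which class wins, but it turns the scores into proper probabilities.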

Submitted items:

* Full desktop screenshot of running your code and printing/plotting outputs (+Moodle grading view
+results output):
surname_baseline_screenshot.png

* Full desktop screenshot of running your code and printing/plotting outputs (+Moodle grading view
+plot window):
surname_gaussian_screenshot.png

* Full desktop screenshot of running your code and printing/plotting outputs (+Moodle grading view
+results output):
surname_bayes_screenshot.png

* Python source code: surname_all.py

