Image Processing
Image grey scale values stored in matrix
Hi everyone, and welcome to a new video dedicated to building statistical applications in R. This time we will take a look at image processing. We will work with a few images shot under different lighting conditions, and we will construct statistical models in R to distinguish between the different lighting states. We will also check how these models perform on unlabeled images, and how accurately they classify new examples based on known training data.
Let's first look at how images are represented in R. For the purposes of this video we will only work with grayscale images, so we will not worry about different color channels. An image is then simply a grid of pixels whose grayscale values can be stored as integers in a regular matrix.
Let's look at the following 10 × 10 matrix, which represents a 10 × 10 pixel image: an all-zero matrix except for two rows and two columns that are non-zero. We can plot this matrix in R via the built-in image() function, which treats each number as the light intensity of that pixel, where higher values represent more light. This is how we can think of an image: simply as a numeric matrix that can be used in statistical analyses. To simplify our work further, we can convert the matrix into a vector.
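A minimal sketch of this matrix representation in base R (the positions of the non-zero rows and columns are chosen for illustration, not taken from the video):

```r
# A 10x10 pixel image stored as a plain numeric matrix:
# all zeros except two rows and two columns
m <- matrix(0, nrow = 10, ncol = 10)
m[c(3, 8), ] <- 255   # two non-zero rows (positions are illustrative)
m[, c(3, 8)] <- 255   # two non-zero columns

# image() maps each value to a light intensity (higher = brighter)
image(m, col = grey.colors(256), axes = FALSE)

# Flattening the matrix into a vector simplifies later analysis
v <- as.vector(m)
length(v)   # one entry per pixel: 10 * 10 = 100
```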
Let's look at a slightly more complex image. To load it in R and extract the underlying data matrix, we will use the magick library. We can now load the image, and R tells us that this file is 454 pixels in width and 322 pixels in height. We can then extract the vector of numbers that encodes this image; the length of this vector is exactly the total number of pixels in the image. Since we now have access to this data vector, we have full control over the image and can make any changes we wish to the pixels.
The magick library also has some built-in image processing functions; for example, we can create a negative.
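A sketch of these steps with the magick package (the filename "face.png" is a placeholder for whichever image you load):

```r
library(magick)   # install.packages("magick") if needed

# Load an image; image_info() reports its width and height in pixels
img <- image_read("face.png")
image_info(img)

# Extract the raw pixel data and flatten it into a numeric vector;
# for a grayscale image its length equals width * height
bm <- image_data(img, channels = "gray")
v  <- as.numeric(bm)
length(v)

# magick also ships built-in processing functions, e.g. a negative
image_negate(img)
```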
Now, for our main application, we will analyze a collection of pictures from the Yale Face Database, which contains photos of individual faces under different lighting conditions. You can find a link to the database in the video description below. Our photos will have either central, left, or right lighting.
But first, how do we load the numerical data that encodes these faces into R? Well, we can build a loop that goes through all images, which in my case are stored in a folder called "lighting". We then load each image into R using the image_read function, extract the corresponding data vector, and finally store this vector in the face data matrix object. Each row in this matrix corresponds to one image, and the row names record the exact identity of the picture, including the lighting state. We thus have 43 images, with over 77,000 pixels for each image.
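The loading loop described above could look roughly like this (the folder path and the matrix name `faceData` are assumptions for illustration):

```r
library(magick)

# Folder containing the face images; path is an assumption
files <- list.files("lighting", full.names = TRUE)

faceData <- NULL
for (f in files) {
  img <- image_read(f)
  # Flatten each grayscale image into one numeric vector
  v <- as.numeric(image_data(img, channels = "gray"))
  faceData <- rbind(faceData, v)   # one row per image
}
# File names identify each picture, including its lighting state
rownames(faceData) <- basename(files)

dim(faceData)   # e.g. 43 images x ~77,000 pixels
```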
To identify patterns in these data, we can first run a principal component analysis, also called PCA. We can think of our images as points in a space where each pixel represents a different dimension of the data. This is useful because PCA is a dimensionality reduction method that projects each data point into a lower-dimensional space while preserving as much of the data's variation as possible, and working with fewer dimensions will help us detect patterns more easily.
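A self-contained sketch of the PCA step, using simulated pixel data and simulated lighting labels in place of the real face matrix:

```r
set.seed(1)
# Stand-in for the face data: 43 "images" with 100 pixels each
faceData <- matrix(runif(43 * 100), nrow = 43)
# Simulated lighting labels: 0 = center, 1 = left, 2 = right
labels <- sample(0:2, 43, replace = TRUE)

# PCA: each image is one point in pixel space
pca <- prcomp(faceData, center = TRUE)

# Plot the first two principal components, color-coded by lighting
cols <- c("blue", "green", "orange")[labels + 1]
plot(pca$x[, 1], pca$x[, 2], col = cols, pch = 19,
     xlab = "PC1", ylab = "PC2")
```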
Each dot here represents one image, and we find that PCA recovered some broad structure in our data. But how does it correspond to the lighting conditions? We can first recall which images are lit centrally, or from the left or right, and we can then color-code the points in our PCA space accordingly: orange shows the right-lit images, blue the center lighting, and green the left lighting. We see that the images cluster fairly well according to their lighting. So even though we have not told the algorithm anything about this feature, it was able to pick up relevant patterns based solely on a numerical representation of our images. Now let's go a step further. Imagine you see this pattern, but you only know the lighting-state labels for some of the images. Based on this partial information, how well can you predict the labels for all the other images?
We will answer this question by hiding the labels of individual images and then using a machine learning model to try to re-identify these labels. This approach is called leave-one-out cross-validation. By comparing the predicted with the true labels, we can get some insight into the performance of our model. Let's start by assembling all the relevant data: our data matrix contains the principal component values of all 43 images, as well as a label of zero, one, or two that records the true lighting state.
Let's use the caret package for the statistical work, which allows us to easily apply many different algorithms to our data and also test their predictive performance. We'll start with a random forest model. Random forests are used for classification or regression, and rely on constructing multiple decision trees to obtain some useful theoretical properties.
We see that a random forest model was run, and that leave-one-out cross-validation was used to test the accuracy of the model: the lighting state of each picture was in turn hidden from the model, then re-predicted based on all other observations, and the accuracy of these predictions was recorded. This explains why R reports sample sizes of 42 images despite having 43 total images to work with. For predictors we have our 43 principal components, and the three classes are zero for central light, one for left light, and two for right light. This method has one parameter, which corresponds to the number of randomly sampled variables used internally to fit the random forest model. The caret package runs through a few values of this parameter to find a good choice.
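A self-contained sketch of this training step with caret, using simulated principal component scores and lighting labels in place of the real data:

```r
library(caret)   # install.packages("caret") if needed
set.seed(1)

# Simulated stand-ins for the PCA scores and the 0/1/2 lighting labels
scores <- matrix(rnorm(43 * 5), nrow = 43)
df <- data.frame(scores, label = factor(sample(0:2, 43, replace = TRUE)))

# Leave-one-out cross-validation: each image is held out in turn
ctrl <- trainControl(method = "LOOCV")

# "rf" is a random forest; caret tries a few values of the tuning
# parameter mtry (number of variables sampled at each tree split)
fit <- train(label ~ ., data = df, method = "rf", trControl = ctrl)
fit   # prints the cross-validated accuracy for each tried mtry
```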
Finally, we get an accuracy metric for the different values of the parameter. We can also look at the confusion matrix, which compares the true labels with the ones predicted by the best-performing random forest model. This looks good: most predicted labels match the true ones, and the fraction of errors is fairly low for all lighting states.
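The confusion matrix comparison can be produced with caret's confusionMatrix(); here is a toy example with made-up predicted and true lighting labels:

```r
library(caret)

# Made-up true and predicted lighting labels (0 = center, 1 = left, 2 = right)
truth <- factor(c(0, 0, 0, 1, 1, 1, 2, 2, 2))
pred  <- factor(c(0, 0, 1, 1, 1, 1, 2, 2, 2), levels = levels(truth))

# Cross-tabulates predictions against reference labels and
# reports accuracy plus per-class statistics
confusionMatrix(pred, truth)
```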
And that's about it for today. We have used both an unsupervised approach (PCA) and a supervised model (the random forest), and found that lighting states can be detected fairly well. This was a quick introduction to image processing and basic analysis in R, and it is a very broad topic: with larger image datasets we can begin to detect finer patterns and train more powerful models. See you all in the next video for more machine learning applications in R!