PCA of RGB Image

2024/12/9 21:02:57

I'm trying to figure out how to use PCA to decorrelate an RGB image in Python. I'm using the code from the O'Reilly computer vision book:

from PIL import Image
from numpy import *

def pca(X):
    # Principal Component Analysis
    # input: X, matrix with training data as flattened arrays in rows
    # return: projection matrix (with important dimensions first),
    # variance and mean

    # get dimensions
    num_data, dim = X.shape

    # center data
    mean_X = X.mean(axis=0)
    for i in range(num_data):
        X[i] -= mean_X

    if dim > 100:
        print('PCA - compact trick used')
        M = dot(X, X.T)  # covariance matrix
        e, EV = linalg.eigh(M)  # eigenvalues and eigenvectors
        tmp = dot(X.T, EV).T  # this is the compact trick
        V = tmp[::-1]  # reverse since last eigenvectors are the ones we want
        S = sqrt(e)[::-1]  # reverse since eigenvalues are in increasing order
    else:
        print('PCA - SVD used')
        U, S, V = linalg.svd(X)
        V = V[:num_data]  # only makes sense to return the first num_data

    # return the projection matrix, the variance and the mean
    return V, S, mean_X

I know I need to flatten my image, but the shape is 512x512x3. Will the dimension of 3 throw off my result? How do I truncate this? How do I quantify how much information is retained?

Answer

If there are three bands (which is the case for an RGB image), you need to reshape your image like this:

X = X.reshape(-1, 3)

In your case of a 512x512 image, the new X will have shape (262144, 3). The dimension of 3 will not throw off your result; that dimension represents the features in the image data space. Each row of X is a sample/observation and each column represents a variable/feature.
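As a rough illustration of the reshape and the decorrelation step, here is a minimal sketch. It does not use the book's pca function; it assumes a plain eigendecomposition of the 3x3 band covariance matrix instead, and it uses a synthetic array in place of a real image (with PIL you would typically get the same shape from np.asarray(Image.open(...))). The names img, Xc and decorrelated are made up for the example.

```python
import numpy as np

# Synthetic stand-in for a 512x512 RGB image (assumption: a real image
# loaded as an array would have the same (512, 512, 3) shape).
rng = np.random.default_rng(0)
base = rng.random((512, 512, 1))
img = np.concatenate(
    [base + 0.1 * rng.random((512, 512, 1)) for _ in range(3)], axis=2
)  # three strongly correlated channels

# One row per pixel, one column per colour band.
X = img.reshape(-1, 3).astype(float)   # shape (262144, 3)
mean_X = X.mean(axis=0)
Xc = X - mean_X                        # centred copy; img is left untouched

# 3x3 covariance of the bands and its eigendecomposition.
C = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]      # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project every pixel onto the principal axes, then restore the image shape.
Y = Xc @ eigvecs
decorrelated = Y.reshape(img.shape)

# Covariance of the projected bands: off-diagonal terms are numerically zero.
C_Y = np.cov(Y, rowvar=False)
```

After the projection, the three "bands" of decorrelated are the principal components, and the diagonal of C_Y holds their variances.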

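The S returned by this pca function holds the singular values of the centered data, not variances: the variance along each principal component is proportional to S**2 (the eigenvalues of the covariance matrix are S**2 / (num_data - 1)). The total variance in the image is therefore proportional to np.sum(S**2), and the amount of variance you retain depends on which eigenvalues/eigenvectors you keep. If you keep only the first eigenvalue/eigenvector, the fraction of image variance you retain is

f = S[0]**2 / np.sum(S**2)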
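
To make the truncation and the retained-variance figure concrete, here is a small sketch. It assumes S holds singular values of the centered pixel matrix (as the SVD branch of the code in the question returns), so the variance share of each component is computed from S**2. The helper explained_variance_ratio, the random test matrix, and the choice k = 2 are all illustrative, not part of the original code.

```python
import numpy as np

def explained_variance_ratio(S):
    """Share of total variance per component, given singular values S."""
    var = np.asarray(S, dtype=float) ** 2
    return var / var.sum()

# Hypothetical pixel matrix standing in for the reshaped (n_pixels, 3) image.
rng = np.random.default_rng(0)
X = rng.random((1000, 3)) @ np.array([[1.0, 0.8, 0.6],
                                      [0.0, 0.5, 0.4],
                                      [0.0, 0.0, 0.1]])
mean_X = X.mean(axis=0)
Xc = X - mean_X

# full_matrices=False avoids building an n_pixels x n_pixels U matrix,
# which would not fit in memory for a 262144-pixel image.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

ratio = explained_variance_ratio(S)
cumulative = np.cumsum(ratio)   # cumulative[k - 1] = share kept by k components

# Truncate: keep the first k components, then map back to RGB space.
k = 2
X_k = (Xc @ Vt[:k].T) @ Vt[:k] + mean_X

# Fraction of variance lost by the truncation, measured directly.
lost = np.sum((X - X_k) ** 2) / np.sum(Xc ** 2)
```

Here cumulative[k - 1] is the fraction of variance retained with k components, and lost should match 1 - cumulative[k - 1] up to rounding, which gives a quick sanity check on the truncation.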