A major aspect of observational astronomy is the exploration of correlations
between observed or measured quantities. These correlations hint at the underlying
physics that explains the structure and formation of the objects under study, and
they can also serve as a method for measuring various quantities. In this exercise, we
will explore a well-known relation obeyed by elliptical galaxies, namely the
Fundamental Plane. The fundamental plane is a linear relationship connecting
the effective radius of the galaxy (r_e), the average surface brightness within the
effective radius (<μ_e>) and the velocity dispersion of the galaxy (σ_e).
Since a galaxy is an extended object with no sharp edge, it is not possible to measure
its effective radius directly; the effective radius is the radius within which half the
total light of the galaxy is contained. This quantity is therefore usually determined
by fitting a suitable analytical function, such as the Sersic function, to the galaxy’s
light distribution. The velocity dispersion is a measure of the random motions
of the stars within the galaxy about its centre, and is determined
spectroscopically. These three quantities obey a relation of the following form:
log(r_e) = A <μ_e> + B log(σ_e) + C
Physically, this relationship reflects the fact that the system (the elliptical
galaxy) is virialized. A practical use of the relationship is to determine the
distance to a galaxy. One can use the velocity dispersion and the average
surface brightness of the galaxy to obtain its effective radius from the above relation.
The effective radius of the galaxy and its apparent angular size as observed
from Earth are related by the distance of the galaxy from the Earth. Thus, the
distance can be determined using the fundamental plane.
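
As a rough illustration of the distance argument above, the following Python sketch evaluates the relation for a single galaxy and converts the resulting physical radius into a distance using the small-angle approximation (distance = r_e / angular radius in radians). The coefficients A, B, C, the input values and the function names are hypothetical placeholders, not fitted results, and cosmological corrections are ignored.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi  # about 206265


def log_re_from_fp(mu_e, log_sigma, A, B, C):
    """Evaluate log(r_e) = A*<mu_e> + B*log(sigma_e) + C."""
    return A * mu_e + B * log_sigma + C


def distance_from_fp(mu_e, log_sigma, theta_arcsec, A, B, C):
    """Distance in the length unit of r_e (kpc if the plane was fitted with r_e in kpc).

    theta_arcsec is the observed angular effective radius in arcseconds.
    """
    r_e = 10.0 ** log_re_from_fp(mu_e, log_sigma, A, B, C)
    theta_rad = theta_arcsec / ARCSEC_PER_RADIAN
    return r_e / theta_rad  # small-angle approximation


# Hypothetical coefficients and measurements, for illustration only.
A, B, C = 0.3, 1.2, -8.5
d_kpc = distance_from_fp(mu_e=20.0, log_sigma=2.3, theta_arcsec=10.0, A=A, B=B, C=C)
print(f"distance ~ {d_kpc / 1000.0:.1f} Mpc")
```

Note that a galaxy with the same physical radius but twice the angular size comes out at half the distance, which is the whole basis of the method.
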
We use AstroStat, a Virtual Observatory tool for statistical analysis, to perform
statistical tests that reveal correlations between the properties
of galaxies. By the end of the exercise, the reader will be familiar
with the interface of AstroStat and will have employed a few statistical techniques and
interpreted their results. The exercise makes use of r-band data from Jorgensen et al. (1996).
In this section, we will learn how to use AstroStat to determine the equation of
the fundamental plane, assuming that we know it exists. The following step-by-step
instructions show how to load a file, use multiple linear regression
to determine the equation of the plane, and plot an edge-on view, which also involves
the use of AstroStat's column creation feature.
 On the right-hand side of the application window, click on “Browse”. A
dialog box will appear in which we can navigate through the folders and
select the file jor_r.csv.
 A window appears with a preview of the file contents and
various input boxes (Figure 2). AstroStat needs this information
to understand how the data are stored in the file. In “Header Line
Number”, enter “1”, since our first line contains the names of the columns.
DATATYPE is kept at 0 since this file has no line describing the data (e.g. which
column is text, which is a number, etc.). The same applies to the UCD
and UNIT line numbers. Enter “2” for “Data Starting Line Number”,
since our data begin on the second line.
 Next, the user is shown a window with a set of checkboxes at the top
with labels such as Tab, Pipe, Comma, etc. Check the
box (or boxes) corresponding to the delimiter used in the file. In this case,
the column values are separated by commas, so we check the box to the left of
“Comma”. Click “OK”. A preview of the data as understood by AstroStat,
based on our inputs in the previous two windows, is then shown. Look through it and
click “OK”. If there is any problem, click “Back” to alter the inputs in the
previous steps.
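
For readers who want to see what these import settings amount to, here is a minimal Python sketch of the same idea: a header on line 1, data starting on line 2, and a comma as the delimiter. The function name `load_table`, the column order and the two sample rows are assumptions made for illustration; they are not taken from jor_r.csv.

```python
import csv
import io


def load_table(text_stream, header_line=1, data_start_line=2, delimiter=","):
    """Read a delimited text table into a dict: column name -> list of floats.

    Line numbers are 1-based, mirroring the import dialog described above.
    """
    lines = text_stream.read().splitlines()
    header = next(csv.reader([lines[header_line - 1]], delimiter=delimiter))
    names = [name.strip() for name in header]
    columns = {name: [] for name in names}
    for row in csv.reader(lines[data_start_line - 1:], delimiter=delimiter):
        if not row:  # skip blank lines
            continue
        for name, value in zip(names, row):
            columns[name].append(float(value))
    return columns


# Made-up sample with the column names used in this exercise (values are hypothetical).
sample = io.StringIO(
    "lgrekpc,lgsig,lgIe\n"
    "0.41,2.25,2.90\n"
    "0.78,2.38,2.55\n"
)
table = load_table(sample)
print(list(table), table["lgsig"])
```

To read a real file, one would pass an open file object instead of the `io.StringIO` sample.
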

Select “Advanced” under Test Category. Then select “Multiple Linear
Regression”. To the right, under y-column select lgrekpc, and under x-columns
select lgsig and lgIe. Click “Run Test” to obtain a relation of the form
lgrekpc = A lgsig + B lgIe + C.
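
The regression performed in this step can be sketched in Python as an ordinary least-squares fit of a plane. This is only an illustration of the technique, not AstroStat's own implementation; the data are synthetic, and the "true" coefficients used to generate them are arbitrary assumptions.

```python
import numpy as np


def fit_plane(y, x1, x2):
    """Least-squares fit of y = A*x1 + B*x2 + C; returns (A, B, C)."""
    design = np.column_stack([x1, x2, np.ones(len(y))])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return tuple(coeffs)


# Synthetic stand-in for the galaxy table (not the Jorgensen et al. data).
rng = np.random.default_rng(0)
lgsig = rng.normal(2.3, 0.15, 200)
lgIe = rng.normal(2.8, 0.30, 200)
true_A, true_B, true_C = 1.2, -0.8, 0.1  # arbitrary values used to build the sample
lgrekpc = true_A * lgsig + true_B * lgIe + true_C + rng.normal(0.0, 0.05, 200)

A, B, C = fit_plane(lgrekpc, lgsig, lgIe)
print(f"lgrekpc = {A:.3f} lgsig {B:+.3f} lgIe {C:+.3f}")
```

The fitted coefficients should land close to the values used to generate the sample, which is a quick sanity check that the design matrix is set up correctly.
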

Next, we would like to visualize the fundamental plane. For this, we will plot an edge-on view
of the plane. Select the "Add Column" feature from the top bar. In Column name, enter "FP RHS", which
stands for "Fundamental Plane Right Hand Side". Now enter the expression for the right-hand side of the fitted relation,
A lgsig + B lgIe + C with the coefficients obtained above, by clicking on the column names and operators
and typing in the numbers by hand where needed.
Then click Add. A new column is added to the table.

Next, to plot the edge-on view, i.e. a plot of the left-hand side against the right-hand side, select XY plot
under Exploratory tests. Then, in the panel that appears below, select "lgrekpc" as the Y-variable and
"FP RHS", i.e. the new column you just created, as the X-variable. Select any format and click "Run Test".
This will produce a plot showing the edge-on view of the plane.
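
The new column and the edge-on comparison can be mimicked in a few lines of Python. In this sketch the coefficients and the three galaxies are made up for illustration, and `fp_rhs` and `rms_scatter` are hypothetical helper names; plotting `lgrekpc` against the returned column with any plotting library would give the edge-on view.

```python
import numpy as np


def fp_rhs(lgsig, lgIe, A, B, C):
    """Right-hand side of lgrekpc = A*lgsig + B*lgIe + C for each galaxy."""
    return A * np.asarray(lgsig, dtype=float) + B * np.asarray(lgIe, dtype=float) + C


def rms_scatter(lhs, rhs):
    """Root-mean-square vertical distance of the points from the line lhs = rhs."""
    diff = np.asarray(lhs, dtype=float) - np.asarray(rhs, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Hypothetical coefficients and three made-up galaxies, for illustration only.
A, B, C = 1.2, -0.8, 0.1
lgsig = [2.20, 2.35, 2.45]
lgIe = [3.00, 2.70, 2.40]
lgrekpc = [0.36, 0.74, 1.15]

rhs = fp_rhs(lgsig, lgIe, A, B, C)
scatter = rms_scatter(lgrekpc, rhs)
print(rhs, scatter)
```

In an edge-on view, points lying exactly on the plane fall on the one-to-one line, so the scatter computed here measures the thickness of the plane.
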
We saw how easy it is to determine the fundamental plane for an appropriate data set. Here, however,
we assumed prior knowledge of its existence. How would we establish from first principles
that the fundamental plane exists? Read on to the next sections for details.
Let us now see how one can establish the existence of the fundamental plane using simple
but effective statistical tests available within AstroStat. IMPORTANT: Please reload the file
afresh and then follow the instructions below. If you choose not to reload the file, please remember
that the instructions below assume that no fourth column exists in the file. You must then
ensure that the added column is not selected in any of the tests in order to get the results described below.
 We will start by making a pairs' plot. Making sure the file is loaded and selected, click on
Pairs' plot under Exploratory tests. In the bottom panel, click on Xval to select all the
columns at once. Choose a suitable format from the right and click "Run Test". The pairs' plot
should appear in a new tab. Notice the strong correlation between lgIe and lgrekpc. This is the Kormendy
relation. One can easily see that it is possible to fit a straight line to these two quantities.
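
A numerical companion to the pairs' plot is the matrix of pairwise correlation coefficients. The sketch below computes it with NumPy for a synthetic sample built to lie near an arbitrary plane; the sample and its parameters are assumptions for illustration and do not reproduce the values in jor_r.csv.

```python
import numpy as np

# Synthetic sample scattered about an arbitrary plane (not the real data).
rng = np.random.default_rng(42)
n = 300
lgrekpc = rng.normal(0.6, 0.30, n)
lgsig = rng.normal(2.3, 0.12, n)
lgIe = (1.2 * lgsig + 0.1 - lgrekpc) / 0.8 + rng.normal(0.0, 0.03, n)

names = ["lgrekpc", "lgsig", "lgIe"]
# np.corrcoef treats each row as one variable.
corr = np.corrcoef(np.vstack([lgrekpc, lgsig, lgIe]))

for name, row in zip(names, corr):
    print(f"{name:8s}", " ".join(f"{value:+.2f}" for value in row))
```

A coefficient close to +1 or -1 in an off-diagonal cell corresponds to a panel of the pairs' plot in which the points fall close to a straight line.
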

Since we know that a strong linear relation exists between lgIe and lgrekpc,
let us fit a straight-line model to the two variables. Click “Exploratory”
in Test Category and then “Simple Linear Regression Analysis”. Notice
how the lower panel changes with each test. For this test, we require a
“y-column”, which is the dependent variable, and an “x-column”, which is
the independent variable, so that we are fitting the straight line y = a + bx.
Select lgIe as the y-column and lgrekpc as the x-column. Click “Run Test”.
A new window appears showing the slope and the intercept. Let’s make
a note of these: slope (b) = 0.9084, intercept (a) = 18.874.
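
The straight-line fit y = a + bx used in this step has a simple closed form, sketched below in plain Python together with the rms scatter of the points about the line. The four data points are hypothetical and are not meant to reproduce the slope and intercept quoted above.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def scatter_about_line(x, y, a, b):
    """Root-mean-square residual of the points about y = a + b*x."""
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical values, for illustration only.
lgrekpc = [0.2, 0.5, 0.8, 1.1]
lgIe = [3.1, 2.8, 2.4, 2.1]

a, b = fit_line(lgrekpc, lgIe)
scatter = scatter_about_line(lgrekpc, lgIe, a, b)
print(f"intercept a = {a:.4f}, slope b = {b:.4f}, scatter = {scatter:.4f}")
```
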

From the previous step, we saw that there is a scatter of 0.2 in the correlation between lgIe and
lgrekpc. Is this scatter random? If we knew the magnitude of the error bars, we could simply compare them with the scatter
to answer this question, but since this information is not available, we will use another method. We will
compute the deviations of the points from the fitted line, and these will constitute a fourth column in our
file. Click on the "Add New Column" button on the toolbar above, create a new column titled "Delta_lgIe" and
type in the following expression: “0.9084*$A1 + 18.874 - $A3”, i.e. the fitted value of lgIe minus its observed value. Click “Add”.

Rerun the pairs' plot test (see above for detailed instructions if you need them) on all four columns.
From the pairs' plot, it is clear that a strong correlation exists between "Delta_lgIe" and the third parameter,
lgsig, which is the log of the central velocity dispersion. This means that the scatter is not random but is caused by
a systematic variation in a third parameter. In this way, we establish that a higher-dimensional relationship
must exist, which is the fundamental plane already derived in the previous section.
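
The residual test described above can be reproduced in a short Python sketch: fit lgIe against lgrekpc, form the deviation column, and correlate it with lgsig. The sample here is synthetic and is generated from an arbitrary plane, so a strong correlation is expected by construction; with real data the same steps would show whether the scatter depends on the third parameter.

```python
import numpy as np

# Synthetic galaxies scattered about an arbitrary plane (not the real data).
rng = np.random.default_rng(1)
n = 300
lgrekpc = rng.normal(0.6, 0.30, n)
lgsig = rng.normal(2.3, 0.12, n)
lgIe = (1.2 * lgsig + 0.1 - lgrekpc) / 0.8 + rng.normal(0.0, 0.03, n)

# Straight-line fit lgIe = a + b*lgrekpc; polyfit returns the slope first.
b, a = np.polyfit(lgrekpc, lgIe, 1)

# Deviation column: fitted value minus observed value, as in the expression above.
delta_lgIe = b * lgrekpc + a - lgIe

r = np.corrcoef(delta_lgIe, lgsig)[0, 1]
print(f"correlation between Delta_lgIe and lgsig: {r:+.2f}")
```

If the scatter about the line were purely random, this coefficient would be close to zero.
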
In the previous section, we saw how the existence of the fundamental plane could be established from first principles.
In this section, we will use another statistical test called Principal Component Analysis to achieve the same result.

Select Principal Component Analysis under the Advanced category. In the bottom panel, select the three variables
lgIe, lgrekpc and lgsig. Choose covariance matrix and click Run Test.

The output comprises two parts: one table shows the three principal components and the other gives information
about how much variance is accounted for by each component. Can you read off the first principal component? It is PC1 =
-0.651 lgrekpc - 0.027 lgsig + 0.759 lgIe. This component accounts for the maximum variance in the data. Notice how lgsig has
a far smaller coefficient than the other two? This hints at a strong correlation between lgrekpc and lgIe, which is consistent with
what we saw in the previous section.
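
For readers curious about what the test computes, principal component analysis on the covariance matrix can be sketched with NumPy as an eigen-decomposition. This is an illustration of the technique under stated assumptions, not AstroStat's implementation: the function name `pca_covariance` is hypothetical and the sample is synthetic, generated close to an arbitrary plane.

```python
import numpy as np


def pca_covariance(data):
    """PCA on the covariance matrix of data with shape (n_samples, n_features).

    Returns (components, explained): components[i] is the i-th principal
    component (unit vector), ordered by decreasing variance, and explained[i]
    is the fraction of the total variance it accounts for.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order].T
    explained = eigvals[order] / eigvals.sum()
    return components, explained


# Synthetic sample near the arbitrary plane lgrekpc = 1.2*lgsig - 0.8*lgIe + 0.1.
rng = np.random.default_rng(7)
n = 300
lgsig = rng.normal(2.3, 0.12, n)
lgIe = rng.normal(2.8, 0.30, n)
lgrekpc = 1.2 * lgsig - 0.8 * lgIe + 0.1 + rng.normal(0.0, 0.03, n)

components, explained = pca_covariance(np.column_stack([lgrekpc, lgsig, lgIe]))
for i, (vec, frac) in enumerate(zip(components, explained), start=1):
    print(f"PC{i}: {np.round(vec, 3)}  variance fraction = {frac:.4f}")
```

The overall sign of each component is arbitrary; only the relative signs and sizes of its coefficients carry information. For data lying near a plane, the last component is close to the normal of that plane.
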

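Now, look at the third principal component, PC3 = 0.563 lgrekpc - 0.687 lgsig + 0.459 lgIe. Since this component accounts for
the least variance, it is nearly constant across the sample, so setting it equal to a constant gives the equation of a plane. Can you rearrange this equation
to make lgrekpc the subject? Can you then compare it with the equation of the fundamental plane already derived?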
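
As a check on the rearrangement, the short Python sketch below solves the third component for lgrekpc. It assumes the PC3 coefficients are 0.563, -0.687 and 0.459 for lgrekpc, lgsig and lgIe respectively (the minus sign on the lgsig term is an assumption about the printed output), and that PC3 is held at a constant value, which is absorbed into the intercept. The function name is hypothetical.

```python
def plane_from_component(c_re, c_sig, c_Ie):
    """Solve c_re*lgrekpc + c_sig*lgsig + c_Ie*lgIe = const for lgrekpc.

    Returns the coefficients (A, B) in lgrekpc = A*lgsig + B*lgIe + const'.
    """
    return -c_sig / c_re, -c_Ie / c_re


A, B = plane_from_component(0.563, -0.687, 0.459)
print(f"lgrekpc = {A:.3f} lgsig {B:+.3f} lgIe + const")
```

The two numbers printed here are the ones to compare with the coefficients A and B obtained from the multiple linear regression.
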