BoxWhisker/BoxPlot issue

Posted: Fri Mar 09, 2007 1:39 pm
by 9642647
Hi there,


I seem to have a problem using TChart to create BoxWhisker charts, using BoxPlot series types.

I'm currently sending to TChart an array for a boxplot with the following values:

0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0.01656
0.01656
0.01656
0.03489

This should create a box plot with the following:

Min - 0
5% percentile - 0
25% percentile - 0
50% percentile (median) - 0
75% percentile - 0
95% percentile - 0.01656
Max - 0.03489

But what I'm getting is a flat box at position 0, with only the "star" symbols for min and max, at 0.01656 and 0.03489 respectively.
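
For reference, the expected figures listed above can be checked with a short Python sketch. Python is used only for illustration and is not part of TChart. The sketch assumes the array is exactly 36 zeros followed by the four non-zero values, and it uses the standard library's `statistics.quantiles` with its default interpolation method, which may differ from the percentile rule a given charting tool applies.

```python
import statistics

# The 40 values from the post: 36 zeros, three 0.01656 and one 0.03489.
data = [0.0] * 36 + [0.01656] * 3 + [0.03489]

# n=20 yields 19 cut points at 5%, 10%, ..., 95% (default "exclusive" method).
cuts = statistics.quantiles(data, n=20)

summary = {
    "min": min(data),
    "p5": cuts[0],
    "p25": cuts[4],
    "median": cuts[9],
    "p75": cuts[14],
    "p95": cuts[18],
    "max": max(data),
}

for name, value in summary.items():
    print(f"{name:>6}: {value:.5f}")
```

For this array every cut point from 5% to 75% is 0, the 95% cut point is approximately 0.01656, and the maximum is 0.03489.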

Am I doing anything wrong, or is this an issue with TChart?

Thank you in advance,

Pedro Reis.

Posted: Fri Mar 09, 2007 3:19 pm
by narcis
Hi Pedro,

Given your data, we have checked that the plotted BoxPlot is correct, as the median and IQR are both zero. We've also checked that the same data in Matlab and SPSS produces a chart identical to the one you get with TeeChart for .NET v2.

Posted: Fri Mar 09, 2007 3:29 pm
by 9642647
Maybe I didn't explain correctly :)

For the data array above,

this was what I got (with TChart):

Min - 0.01656
5% percentile - 0
25% percentile - 0
50% percentile (median) - 0
75% percentile - 0
95% percentile - 0
Max - 0.03489


And this is what I was expecting:

Min - 0
5% percentile - 0
25% percentile - 0
50% percentile (median) - 0
75% percentile - 0
95% percentile - 0.01656
Max - 0.03489


How can the min value be higher than the median or the IQR??
Can't the min and IQR be 0?

Posted: Fri Mar 09, 2007 3:44 pm
by narcis
Hi ReisP,

Using the latest TeeChart for .NET v2 maintenance release, available in the client area, we obtain zero for the minimum using box1.MinYValue(). This method's implementation was enhanced in August 2006. Are you using an older version? Could you please check whether the latest version works fine at your end?

Thanks in advance.

Posted: Fri Mar 09, 2007 4:08 pm
by 9642647
The version I have installed is 2.0.2546.16098

While debugging at runtime, the method box.MinYValue() did return 0.
But the saved .ten file is wrong.


For this array:

0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0.01656
0.01656
0.01656
0.03489


I got this box plot:
[Image: box plot produced by TChart]

As you can see, the box plot is at Y = 0, but the * for min and max are both above!?!?! How can the min value be above the median?

Posted: Fri Mar 09, 2007 5:49 pm
by Marjan
Hi.

I think the two stars (*) represent two outliers (0.01656 and 0.03489) and not the minimum and maximum values. The minimum value is still 0.0 and, because it's not an outlier, it's not drawn. Only the (mild and extreme) outliers are drawn.

Posted: Thu Mar 15, 2007 3:10 pm
by 9642647
Marjan wrote: (...) Only the (mild and extreme) outliers are drawn.
How can the mild value be so high?
The average is 0.00211!!

I think the plot is not being correctly drawn, because the 95% mark of the plot should be value = 0.01656.

I did manage to work around it by disabling the mild and extreme outliers and creating 3 independent series, calculated by me, to represent average, min and max.

But the box plot still seems to be wrong...

Is there any test I can run to be sure I'm not doing anything wrong?
Thank you!

Posted: Fri Mar 16, 2007 8:40 am
by narcis
Hi ReisP,

For box plot construction, the mean value is NOT important; only the median value is.

By definition, mild outliers are all points that fall into the intervals [OuterFence1,InnerFence1] or [InnerFence3, OuterFence3], where inner and outer fences are defined as:

InnerFence1 = Q1 - WhiskerLength*IQR;
OuterFence1 = Q1 - 2*WhiskerLength*IQR;

InnerFence3 = Q3 + WhiskerLength*IQR;
OuterFence3 = Q3 + 2*WhiskerLength*IQR;

where (by default) the whisker length (multiplication factor) is 1.5. In the same manner, extreme outliers are by definition all points greater than OuterFence3 or smaller than OuterFence1.

In your case, first quartile, third quartile and IQR values are:

Q1 (first quartile) : 0.0
Q3 (third quartile) : 0.0
IQR (interquartile range) : 0.0

and

InnerFence1 = Q1 - WhiskerLength*IQR = 0.0 - 1.5*0.0 = 0.0
OuterFence1 = Q1 - 2*WhiskerLength*IQR = 0.0 - 2*1.5*0.0 = 0.0

InnerFence3 = Q3 + WhiskerLength*IQR = 0.0 + 1.5*0.0 = 0.0
OuterFence3 = Q3 + 2*WhiskerLength*IQR = 0.0 + 2*1.5*0.0 = 0.0

i.e. the points can be classed as mild or extreme outliers at the same time, so you have to choose whether you want the points to be declared "mild" or "extreme" outliers. The point is, the points ARE outliers; you only have to decide whether they are "mild" or "extreme". Looking at your data, 0.01656 and 0.03489 fall in the outlier category. So, our opinion is that the box plot is drawn correctly.
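
The fence arithmetic above can be reproduced with a minimal Python sketch. This is an illustration only, not TeeChart code. It assumes the data is 36 zeros plus the four non-zero values, takes quartiles from the standard library's `statistics.quantiles`, and, as an assumption for the degenerate case where all fences coincide, treats fence boundaries as strict and lets "extreme" win over "mild". The names `WHISKER_LENGTH` and `classify` are hypothetical.

```python
import statistics

# The 40 values from the thread: 36 zeros, three 0.01656 and one 0.03489.
data = [0.0] * 36 + [0.01656] * 3 + [0.03489]
WHISKER_LENGTH = 1.5  # default multiplier quoted above

q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

inner_low = q1 - WHISKER_LENGTH * iqr
outer_low = q1 - 2 * WHISKER_LENGTH * iqr
inner_high = q3 + WHISKER_LENGTH * iqr
outer_high = q3 + 2 * WHISKER_LENGTH * iqr


def classify(y):
    """Label a point; assumption: strict boundaries, 'extreme' checked first."""
    if y < outer_low or y > outer_high:
        return "extreme"
    if y < inner_low or y > inner_high:
        return "mild"
    return "inside"


labels = {y: classify(y) for y in sorted(set(data))}
print("Q1, Q3, IQR:", q1, q3, iqr)
print(labels)
```

With every fence at 0.0, the zeros stay inside the fences while 0.01656 and 0.03489 land beyond them, which matches the two stars drawn above the flat box.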

If this doesn't satisfy you, please tell us what is, by your interpretation, the correct result.

Posted: Fri Mar 16, 2007 12:51 pm
by 9642647
Apparently we've been talking about different things.

There are two ways (more, even) to build box-plot/box-whisker charts:
- the one you described
- the one I was expecting (one in which the whiskers represent the 5% and 95% percentiles; check last paragraph of example)

TChart draws box-plot charts as you described, and I was expecting the other. I couldn't distinguish them until I tested with this value array.
I'll have to deal with it.
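
To make the difference concrete, here is a rough Python sketch of the whisker ends under both conventions for this value array. It is an illustration under assumptions, not TChart's implementation: the function names `percentile_whiskers` and `fence_whiskers` are hypothetical, and percentiles come from the standard library's `statistics.quantiles` with its default method.

```python
import statistics

# The 40 values from the thread: 36 zeros, three 0.01656 and one 0.03489.
data = [0.0] * 36 + [0.01656] * 3 + [0.03489]


def percentile_whiskers(values):
    """Whiskers placed at the 5th and 95th percentiles."""
    cuts = statistics.quantiles(values, n=20)  # cut points at 5%, 10%, ..., 95%
    return cuts[0], cuts[18]


def fence_whiskers(values, whisker_length=1.5):
    """Whiskers placed at the furthest points still within the inner fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - whisker_length * iqr
    high_fence = q3 + whisker_length * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return min(inside), max(inside)


print("percentile whiskers:", percentile_whiskers(data))
print("fence whiskers:     ", fence_whiskers(data))
```

The percentile convention puts the upper whisker at roughly 0.01656, while the fence convention collapses both whiskers to 0 and leaves the non-zero values as outliers.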


Thank you for your time.

Posted: Fri Mar 16, 2007 3:42 pm
by narcis
Hi ReisP,

Thanks for the information.

We have added your suggestion to our wish-list to be considered for inclusion in future releases.