histogram binning - Steema vs MATLAB

Posted: Mon Jul 26, 2010 8:59 pm
by 10545706
I'd appreciate some insight into why there is such a discrepancy between data binned by THistogramFunction and MATLAB's binning. The attached code uses my own method for binning (a standard method), which reproduces MATLAB output exactly. The differences are even more remarkable and disconcerting when using much larger data sets (I regularly use 70000+ raw data points), with significant consequences for statistics applied to the binned data. The differences between these methods are not confined to the beginning of the distributions; they appear throughout.

Clearly I'd like to discover that I'm doing something fundamentally wrong in my use of THistogramFunction which, if rectified, would match manual/MATLAB binning, but I'm not confident that this is where the problem lies. At the moment this is a major problem and cause for concern; any help would be very much appreciated indeed.

Attachment is for C++ Builder 5.1 and TChart 8.0.7; I get identical results in RAD C++ 2009.

Re: histogram binning - Steema vs MATLAB

Posted: Tue Jul 27, 2010 1:32 pm
by 10545706
The source of the discrepancy is in the binning algorithm of TeeHistogram.pas. In

Code:

procedure Histogram(Data: TChartValues; var bins,counts: TChartValues; Min,Max: Double; nbins: Integer);
the use of the Round() function seems to me to be inappropriate. Round() probably uses banker's rounding, which means odd and even numbers are treated differently. Replacing Round() with Trunc() has the desired effect, making the values compatible with C/C++, MATLAB etc. Thus:

Code:

    j := Trunc((data[i]-min)*invbinwidth);
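
To make the difference concrete, here is a minimal C++ sketch of the two index calculations. The function names binTrunc and binRound are hypothetical, and std::nearbyint (round to nearest, ties to even in the default floating-point rounding mode) is only assumed to stand in for Delphi's Round(); this is not the code from TeeHistogram.pas.

```cpp
#include <cmath>

// Bin index by truncation: bin j covers [min + j*w, min + (j+1)*w),
// where w = 1 / invBinWidth. This mirrors the Trunc() line above.
int binTrunc(double x, double min, double invBinWidth) {
    return static_cast<int>((x - min) * invBinWidth);
}

// Bin index by rounding to nearest (ties to even in the default rounding
// mode). Assumption: this approximates what Delphi's Round() does.
// Bin j now effectively covers values within half a bin width of
// min + j*w, so every bin edge is shifted down by w/2.
int binRound(double x, double min, double invBinWidth) {
    return static_cast<int>(std::nearbyint((x - min) * invBinWidth));
}
```

With min = 0 and a bin width of 1, the value 0.7 lands in bin 0 with truncation but bin 1 with rounding, and 3.9 lands in bin 3 with truncation but bin 4 with rounding, which is one past the last of four bins. The tie cases 0.5 and 2.5 round to the even indices 0 and 2, while 1.5 rounds up to 2. So the shift affects the whole distribution, not only the exact halfway values.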
I'm also not convinced of the use of 0.5 in setting bin centerpoints a few lines above in the same file. Using

Code:

    bins[i] := min + i*binwidth;
gives more intuitive results (others may disagree, and my impression may just be contextual).
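
For comparison, a small C++ sketch of the two labelling conventions. The helper names are hypothetical, and the centre formula min + (i + 0.5)*binwidth is an assumption about what the 0.5 in the original file does, since that line is not quoted here.

```cpp
#include <vector>

// Label each bin by its left edge: min + i*binwidth (the form suggested above).
std::vector<double> binLeftEdges(double min, double max, int nbins) {
    std::vector<double> bins;
    if (nbins <= 0) return bins;
    const double binwidth = (max - min) / nbins;
    for (int i = 0; i < nbins; ++i) bins.push_back(min + i * binwidth);
    return bins;
}

// Label each bin by its centre: min + (i + 0.5)*binwidth
// (assumed to be what the 0.5 offset in the original code is for).
std::vector<double> binCenters(double min, double max, int nbins) {
    std::vector<double> bins;
    if (nbins <= 0) return bins;
    const double binwidth = (max - min) / nbins;
    for (int i = 0; i < nbins; ++i) bins.push_back(min + (i + 0.5) * binwidth);
    return bins;
}
```

For the range 0 to 4 with four bins, the left-edge labels are 0, 1, 2, 3 and the centre labels are 0.5, 1.5, 2.5, 3.5. Either convention describes the same counts; which one reads better depends on whether the axis is meant to show bin boundaries or representative values.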

One could edit TeeHistogram.pas in the source folder, or create a modified version of the file and function (and edit up some of the other source/Make/dpk files to make them aware of the new function), and run TeeRecompile.exe to rebuild and install the mod.

Re: histogram binning - Steema vs MATLAB

Posted: Wed Jul 28, 2010 2:52 pm
by yeray
Hi Philip,

We appreciate your effort and detailed study.
I've added it to the wish list (TV52015054) to be revised asap and included in an upcoming maintenance release.

Re: histogram binning - Steema vs MATLAB

Posted: Thu Jul 29, 2010 9:08 am
by narcis
Hi Philip,

Thanks for your feedback.
"the use of the Round() function seems to me to be inappropriate. Round() probably uses banker's rounding, which means odd and even numbers are treated differently."
Yes, that's correct, see Delphi's Round method documentation.
"Replacing Round() with Trunc() has the desired effect, making the values compatible with C/C++, MATLAB etc."
It's not necessary, as our current v8 and v2010 (aka v9) sources already produce the results you'd expect. I bet this is due to the fix for a bug (TV52012772) which was discussed here. Actually, TV52012772 was fixed for v8.07, as can be seen in the release notes :shock:. Can you please confirm you are using v8.07? Anyway, I will also send you an e-mail with our current version of TeeHistogram.pas so that you can check whether it fixes the issue at your end.

I attach a Delphi example, similar to yours, which I created to be able to easily debug sources and which produces this chart:
[Attachment: histogram.jpg]

Re: histogram binning - Steema vs MATLAB

Posted: Thu Jul 29, 2010 9:38 am
by 10545706
I'm using

Release Notes 13th April 2010
TeeChart VCL version 8
Build 8.07.70413

In that source package, TeeHistogram.pas is definitely using Round().

I was aware of TV52012772.

I have downloaded

TeeChart8.07SourceCode.exe
April 13, 2010
Build 8.07.70413
File size - 6,61 MB

again just now. The Round() function still appears in TeeHistogram.pas.

Re: histogram binning - Steema vs MATLAB

Posted: Thu Jul 29, 2010 9:47 am
by narcis
Hi Philip,

Yes, I know Round is still in Histogram method. However, there have been some recent changes in TeeHistogram.pas. Have you received the file I sent you? Does this work as expected?

Re: histogram binning - Steema vs MATLAB

Posted: Thu Jul 29, 2010 1:54 pm
by 10545706
Narcís wrote: Hi Philip,

Yes, I know Round is still in Histogram method. However, there have been some recent changes in TeeHistogram.pas. Have you received the file I sent you? Does this work as expected?
Yes, I received the TeeHistogram.pas file you sent as an attachment. It is identical to the one in the VCL 8.07 source package. (In case I had missed something, I recompiled the attached file into the library, and still get the same binning error.)

Re: histogram binning - Steema vs MATLAB

Posted: Thu Jul 29, 2010 2:06 pm
by narcis
Hi Philip,

Do you have any Delphi version for trying the project I attached and check if it works fine for you? At the URL below you can download the exe I generated with my sample project. Can you please check if it works as expected at your end?

http://www.teechart.net/files/public/su ... amBins.zip

Thanks in advance.

Re: histogram binning - Steema vs MATLAB

Posted: Thu Jul 29, 2010 2:34 pm
by 10545706
Narcís wrote: Do you have any Delphi version for trying the project I attached and check if it works fine for you? At the URL below you can download the exe I generated with my sample project. Can you please check if it works as expected at your end?
Yes (RAD 2009). I was looking at it just now. It compiles and runs fine. The THistogramFunction chart (left) and the manual binning chart (middle) are identical, with no difference between them (right chart).

However, the manual binning routine in

Code:

procedure TForm1.Edit1Change(Sender: TObject);
uses Round(), and is bound to generate the same result as the left graph. [Changing this to Trunc() produces results in the middle graph which are compatible with using int() in C/C++ and MATLAB's hist() function.]

Re: histogram binning - Steema vs MATLAB

Posted: Fri Jul 30, 2010 10:42 am
by narcis
Hi philip,

Oh, I see, thanks. We are a little bit worried because replacing Round with Trunc could unexpectedly change many customers' charts. We will do some research on the file and consider calculating the histogram function from truncated data while keeping the option of rounding it too, for example by adding a RoundedData property set to false by default.

Re: histogram binning - Steema vs MATLAB

Posted: Fri Jul 30, 2010 11:39 am
by narcis
Hi philip,

Continuing with what I said above, we decided to add a new property to THistogramFunction called DataStyle, of type TDataStyle, which is an enum with these possible values: hdsTruncate and hdsRound, the first one being the default. So, from now on, by default, you'll get histograms calculated as in the code imitating MATLAB that you sent. To get the histograms produced by previous versions, you can set DataStyle to hdsRound, for example:

Code:

  TeeFunction1.DataStyle:=hdsRound;
I'll send you TeeHistogram.pas so that you can test this new feature at your end. This property has been added both in v8 and v2010.
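
As a rough illustration of how such a switch can work, here is a minimal C++ sketch. It is not the TeeChart implementation: the names DataStyle and histogramCounts are hypothetical, std::nearbyint is assumed to approximate Delphi's Round(), and clamping out-of-range indices into the first or last bin is a choice made for this sketch only.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical counterpart of hdsTruncate / hdsRound.
enum class DataStyle { Truncate, Round };

// Count how many values fall into each of nbins equal-width bins
// spanning [minValue, maxValue].
std::vector<int> histogramCounts(const std::vector<double>& data,
                                 double minValue, double maxValue, int nbins,
                                 DataStyle style = DataStyle::Truncate) {
    if (nbins <= 0 || !(maxValue > minValue)) return {};
    std::vector<int> counts(static_cast<std::size_t>(nbins), 0);
    const double invBinWidth = nbins / (maxValue - minValue);
    for (double x : data) {
        const double scaled = (x - minValue) * invBinWidth;
        // Truncate: drop the fractional part. Round: nearest integer,
        // ties to even in the default rounding mode.
        const double idx = (style == DataStyle::Truncate)
                               ? std::trunc(scaled)
                               : std::nearbyint(scaled);
        // Sketch-only assumption: clamp so that x == maxValue (or a rounded
        // index of nbins) is counted in the last bin instead of overflowing.
        const double clamped =
            std::min(std::max(idx, 0.0), static_cast<double>(nbins - 1));
        ++counts[static_cast<std::size_t>(clamped)];
    }
    return counts;
}
```

For the data 0.2, 0.5, 0.7, 1.5, 2.5, 3.9 over the range 0 to 4 with four bins, the Truncate style gives counts 3, 1, 1, 1 and the Round style gives 2, 1, 2, 1. Keeping truncation as the default while leaving rounding available as an option matches the compatibility concern raised earlier in the thread.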

Re: histogram binning - Steema vs MATLAB

Posted: Fri Jul 30, 2010 2:45 pm
by 10545706
Narcís wrote: Hi philip,

Continuing with what I said above, we decided to add a new property to THistogramFunction called DataStyle, of type TDataStyle, which is an enum with these possible values: hdsTruncate and hdsRound, the first one being the default. So, from now on, by default, you'll get histograms calculated as in the code imitating MATLAB that you sent. To get the histograms produced by previous versions, you can set DataStyle to hdsRound, for example:

Code:

  TeeFunction1.DataStyle:=hdsRound;
I'll send you TeeHistogram.pas so that you can test this new feature at your end. This property has been added both in v8 and v2010.
Received. The DataStyle property isn't available after I recompile 8.07 with the new TeeHistogram.pas file, either in C or Delphi. The compiled headers show the variable. Something's awry but I can't figure out what it is.

Re: histogram binning - Steema vs MATLAB

Posted: Fri Jul 30, 2010 3:18 pm
by narcis
Hi philip,

Really? That's strange, it works fine for me here in v8 directly referencing the sources from Delphi. You could try adding the source code path at Tools -> Options -> Environment Options -> Delphi Options -> Library - Win32 -> Library path. Does this work for you?

Re: histogram binning - Steema vs MATLAB

Posted: Fri Jul 30, 2010 3:52 pm
by 10545706
Narcís wrote: Hi philip,

Really? That's strange, it works fine for me here in v8 directly referencing the sources from Delphi. You could try adding the source code path at Tools -> Options -> Environment Options -> Delphi Options -> Library - Win32 -> Library path. Does this work for you?
Paths were ok. I purged the compiler of add-in components, and checked .bpr for vestigial v2010. Whatever non-standard thing it was I did worked. You can safely assume it was a local issue.

Anyway, it [your .pas modification] works fine (ChartEditor still to do) - as it should.


I want to say something about Steema - and that simply is that your customer support is exceptionally good - it differentiates you from the others; professional, courteous, thorough and timely. As well as having a damn fine product in your hands, you care about it and it's a pleasure to use it.

Re: histogram binning - Steema vs MATLAB

Posted: Mon Aug 02, 2010 5:40 pm
by yeray
Hi philip,

We are very pleased to hear positive opinions like yours. Thank you.