Tuesday, May 10, 2011

Unbiasing Bias with Animation

I was going to do more with this post but I am bored with it. Most people will probably not understand what I am doing anyway. The point of this is that since statistics can lie at least as well as the average weather person, you need ways to look at things that are not biased. Time series analysis is just about as boring as math can get. Because of noise from a variety of sources, it is difficult to determine the real significance of your work. There are plenty of techniques used to reduce the noise and bend the data to one's will. What is often lacking is a simple way to find out if the data is lying to you or if you are lying to yourself. So I invented, or probably reinvented, a graphical method to indicate the most likely outcomes of a time series. This method is far from perfect. If time series analysis did not bore the crap out of me I could make improvements and possibly even outline a proof. I suck at doing mathematical proofs, so that is not going to happen.

The charts I have created are almost legible. If you enlarge the page and do not suffer from too much eye strain you can make out the data. I added a few of the regression equations for those who like that stuff, but I may have made a few mistakes.

I call this a stacked analysis because I am not that creative. All it is is a set of plots of linear regressions over some number of years, 15 years in most cases, of a temperature time series, with the regression window sliding back one year at a time. How densely the ends of the individual regression lines cluster at some point in time is a rough visual indication of the distribution of the possible future paths of the series. No rocket science here.
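For anyone who would rather script this than fight a spreadsheet, here is a minimal Python sketch of the sliding regressions. It is not the spreadsheet procedure itself. The function names, the made-up linear series, and the reading of "15 years" as a first-to-last span (1995 to 2010, so 16 annual values) are all assumptions for illustration.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def stacked_regressions(years, temps, window=15, stack=25):
    """Fit up to `stack` regressions, each slid back one year from the previous.

    Assumption: `window` is the first-to-last span in years, so a window of
    15 ending in 2010 covers 1995 through 2010 inclusive.
    Returns a list of (start_year, end_year, slope, intercept).
    """
    fits = []
    last = len(years) - 1
    for shift in range(stack):
        end = last - shift
        start = end - window
        if start < 0:
            break  # ran out of data before filling the stack
        slope, intercept = linear_fit(years[start:end + 1], temps[start:end + 1])
        fits.append((years[start], years[end], slope, intercept))
    return fits


# Made-up series for illustration only (a steady 0.01 C/year rise), not GISS data.
years = list(range(1960, 2011))
temps = [0.01 * (y - 1960) for y in years]
fits = stacked_regressions(years, temps)
```

Each entry in `fits` is one line on the stacked chart. With real data the slopes differ from window to window, and that spread is what the chart is meant to show.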

The end point is the year 2100 and the regressions are stacked over the 1995 to 2010 period, today in other words. In order to see if the trend is accelerating, you have to compare the clustering with a previous time period. Still no rocket science.
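As a rough numeric stand-in for eyeballing the clustering at 2100, the sketch below slides each fitted line so its window starts at 1995 and reads off where it lands in 2100. The way the lines are shifted (keeping each line's own level at the start of its window) is my reading of the stacking, so treat it as an assumption. The three fits are invented numbers, not results from any data set.

```python
def stacked_endpoints(fits, stack_start=1995, target=2100):
    """Value of each regression line at `target` after sliding its window
    so that it begins at `stack_start`.

    Each fit is (window_start_year, slope, intercept). The line keeps its
    own level at the start of its window (no common zero), which is an
    assumption about how the stacking is done.
    """
    horizon = target - stack_start
    ends = []
    for start_year, slope, intercept in fits:
        level_at_start = slope * start_year + intercept
        ends.append(level_at_start + slope * horizon)
    return ends


def spread(values):
    """Return (minimum, median, maximum) of the end points."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    return ordered[0], median, ordered[-1]


# Hypothetical fits in degrees C and C/year, for illustration only.
example_fits = [(1995, 0.02, -39.5), (1994, 0.015, -29.6), (1993, 0.01, -19.7)]
ends = stacked_endpoints(example_fits)
low, middle, high = spread(ends)
```

Running the same thing on a stack from an earlier period and comparing the two spreads is the numeric version of comparing the clustering between charts.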

This method is a serious pain in the ass the way I try to force Open Office to do it. A clever programmer could do a much better job in a fraction of the time it takes me.

I tried to show how the stacked regression method can eliminate the bias many people run into doing simple regressions. Using 15 year end to end regressions, the direction of the analysis can produce much different results if you are trying to define cycles in a time series based on eyeballing patterns. I am using 25 year stacks (I screwed up and some are only 24 years, not much difference though) because that is where I got tired. Depending on the period of the cycle you are looking for, you can change the length of the stack and the length of the regression. So if you are looking for a 60 year cycle, the stack length and regression length should be harmonics of the expected length of the cycle. Since there seems to be a 60 year cycle, I should have spent the extra time to make the stack 30 years with the regression 15 years. If we have a big hurricane season this year, I may find the time to do that.

Anyway, time series analysis, probably because it is so damn boring, is rarely tackled by the cream of the mathematical crop. This pretty basic method can give you an unbiased visual representation of potential paths. With a little more tinkering it can do more, but it is not a bad way to check the skill of predictions made from time series analysis. I will leave it to the pros to shoot this down.

I modified the GISS global temp chart from an earlier post with 15 year trends. This one has beginning to middle and end to middle 15 year trends. I added the linear regression equations for each trend, with them oriented roughly to show the slope changes. Pretty busy chart. I don't have a clue if any of that really means anything to anyone save me.

This chart starts at the beginning with the 15 year periods so the last purple period is an orphan with only 10 years.

This one works back from the end. I adjusted the trend lines again. In this one there is only one period with a downward slope. Huge difference! So back to front may emphasize natural variability, front to back may emphasize enhanced global warming. Most likely all of them are crap. Is there anything worth looking at?
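To see why the direction matters, it helps to look at where the period boundaries fall in each case. This is a small sketch, assuming the series runs from 1880 to 2010 (the 1880 start is my assumption, chosen because it leaves a 10 year orphan) and that adjacent periods share an endpoint year. The function name is made up.

```python
def segment_periods(first_year, last_year, length=15, from_end=False):
    """Split [first_year, last_year] into consecutive spans of `length` years.

    Adjacent spans share an endpoint year. The leftover short span lands at
    the end when counting from the start, and at the start when counting
    back from the end.
    """
    periods = []
    if from_end:
        hi = last_year
        while hi > first_year:
            lo = max(hi - length, first_year)
            periods.append((lo, hi))
            hi = lo
        periods.reverse()  # report in chronological order
    else:
        lo = first_year
        while lo < last_year:
            hi = min(lo + length, last_year)
            periods.append((lo, hi))
            lo = hi
    return periods


front_to_back = segment_periods(1880, 2010)
back_to_front = segment_periods(1880, 2010, from_end=True)
```

Every boundary moves by five years between the two lists, so each regression is fitted to a different slice of the same data. That alone can flip the sign of a short trend.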

Other than the 1995 to 2010 period showing some acceleration of warming, not much stands out. The maximum range of both front to back and back to front hangs in about the same range of 2.3 C (230 on the charts). Since I am working toward an animation, I will start with the back to front and change the period, which is similar to a sliding window. First I will try an eight year window. 7.5 years would be better if I were looking for a 15/30/60 year natural cycle, but let me try 8 first.

This is the 8 year front to back stacked on top of each other. The purple is the last 8 years, which shows the downturn of the trend. Eight years is not long enough to say much about a trend; it only shows that the peak trend of the previous chart may be turning down. Not much you can say for sure. I stacked the 8 year trends at the 2003 to 2010 point just to keep the end points on the same chart scale. That doesn't prove anything. It may indicate something, don't know.

This may be an interesting look at the data. Each line is a 15 year regression. Moving back from 2010, I slid back one year at a time with the same 15 year window. All the periods are stacked but not adjusted for the same zero. The equations for the maximum and minimum slopes are on the chart. There are 25 lines for trends starting in 1970. The data used was GISS global. Remember, the maximum trend is for 1995 to 2010, which appears to be leveling off or at least reducing. The range where the lines intersect the year 2100 is pretty similar to the IPCC projected range. It took a while to force Open Office to accept 25 trends, but 25 may be enough to compare a few of the data sets.

The X-axis label is pretty much meaningless except for the end, 2100, and the start of the data at 1995. I still have the y-axis in hundredths of a degree C, so divide the values by 100.

This chart is the stacked trends ending at 1940. Comparing this 1915 to 1940 data to the 1975 to 2010 data, I believe it is pretty clear that both periods warmed, that the warming of the 1975 to 2010 period was greater, and that roughly half of the 1975 to 2010 warming could be natural if the 1915 to 1940 warming was natural. There is a big difference in that the 1975 to 2010 period could be indicating accelerated warming while the 1915 to 1940 period does not indicate acceleration, or at least not as much.

This chart is the stacked 15 year trends for the Southern Hemisphere using the GISS data starting in 1975 and ending in 2010. We know the SH warming is not as much as the NH and that the NH has more land area. So how did the SH 1915 to 1940 period compare?

The chart above for the SH using the stacked 15 year trends from 1915 to 1940 looks pretty much like the 1975 to 2010. There is a lot more potential error in the SH for this period, blah, blah, blah... I don't care, I am just using the data I have. Since I have the RSS data, let me try that.

This is the RSS global stacked 15 year trend starting in 1979. I had to work on this a bit by making the yearly averages and converting to hundredths of a degree. Everything looks right, but I had several opportunities to mess up.
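The yearly averaging and unit conversion is the kind of step where a script is less error prone than doing it by hand. Here is a minimal sketch, assuming the monthly values come as (year, month, anomaly in degrees C) tuples. That input layout, the function name, and the choice to drop years with fewer than 12 months are all assumptions, and the sample numbers are made up rather than taken from RSS.

```python
from collections import defaultdict


def annual_means_hundredths(monthly):
    """Average monthly anomalies by year and convert to hundredths of a degree C.

    `monthly` is an iterable of (year, month, anomaly_in_deg_C).
    Years with fewer than 12 monthly values are dropped (a design choice
    made for this sketch, not something the post specifies).
    """
    by_year = defaultdict(list)
    for year, _month, anomaly in monthly:
        by_year[year].append(anomaly)
    return {
        year: round(100 * sum(values) / len(values))
        for year, values in sorted(by_year.items())
        if len(values) == 12
    }


# Made-up monthly values for illustration only: a full 1979 and half of 1980.
sample = [(1979, m, 0.10) for m in range(1, 13)]
sample += [(1980, m, 0.25) for m in range(1, 7)]
annual = annual_means_hundredths(sample)
```

The resulting values are on the same scale as the charts, where 230 means 2.3 C.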

PS I forgot to change the title. I was going to animate this and have it slide through the whole time series, but that is a major pain in the ass with my computer and my computer literacy.
