Tuesday, December 27, 2016

Unambiguous regions in color space for the basic chromatic colors

I am going to start this blog post with the punchline. The image below shows the range in color space of the eight basic chromatic colors. I assert that any color that is within a given set of limits will be unambiguously identified by the corresponding color name by everyone except for people who are either Color Vision Deficient (CVD) or Color Naming Pretentious (CNP). 

If a color is in one of these regions, then it has an unambiguous name!

Note that this is an a*b* plot. Each color also has a viable range for L*. Stick around for the end of this post, and I will provide a simple mathematical description of these regions -- but that's a treat reserved for those people who read through this entire blog post!

Why is this important?

Before I go any further, I have a confession to make. I write this blog post (and put all that time into the data analysis) in hopes that I will someday win this running argument that I have with my wife. I know. Good luck, John. I can dream, can't I?

Here is how the argument typically plays out...

Math Guy: Did you see that woman with the gorgeous brown hair? She just winked at me and smiled. I'm gonna go ask her for her number.

Honey, she smiled at me!

Bride of Math Guy: Don't be such a dufus! Her hair is auburn, not brown! 

Math Guy: But, Honey! I know color. I am a color scientist!

Bride of Math Guy: That may be, but you're still wrong.

If only I could walk over to the brunette (who is obviously attracted to immensely intelligent guys like me), ask to measure her hair with the spectrophotometer that I always carry in my pocket, and then bring up a ColorNamer app to give me the unambiguous name for the color of her hair. If nothing else, you gotta admit that this is a novel pick-up line. And just maybe, it could be used to avoid marital strife.

If you have doubts about the importance of the question of color naming, witness the following. There is a prominent, well-respected, and humble color scientist/blogger who has devoted no less than six blog posts to the topic of assigning names to colors.

I dunno, maybe the name of a color is important for other reasons. I mean, there are a few odd cases where words are used to convey information, and the color associated might be important to somebody.

My data

I recently ran into a pile of papers written by Dimitris Mylonas. Unlike me, he has been doing a lot of real research. His research topic for his doctorate at University College London has been how people assign names to colors. He has been running an online experiment where he displays colors right on your own computer and then asks you to name the color. He has made the results available through an online color naming app where you can select from 30 color names and it will display the most common color associated with that name. Or, the other way around, you can click on a color from a palette, and you will get a word cloud with the most common names that your RGB combination has been given. Great entertainment for a rainy day. I gave up my subscription to Netflix when I found this.

Screen capture from Mylonas' site

There was a similar color naming experiment that was conducted by Nathan Moroney of HP. You can get a free copy of his color thesaurus online.

Snippet from Moroney's book

For both of these sources (the online app from Mylonas and the book from Moroney), I harvested RGB values for each of the basic chromatic color names: red, orange, yellow, green, blue, purple, pink, and brown.

Why these colors? 

Why not beige, turquoise, plum, coral, lilac, etc?

I do have some logical foundation for the colors I chose. It is based on a seminal paper in the study of chromolinguistics by Berlin and Kay in 1969. They did linguistic studies of color names in eleventy-seven zillion different languages and came to the following conclusion:

"... a total basic inventory of eleven basic color categories exists from which the eleven or fewer basic color terms of any given language are always drawn. The eleven basic color categories are white, black, red, green, yellow, blue, brown, purple, pink, orange, and grey."

The eleven basic color categories

I think that's pretty amazing. There are many independent roots of languages, and for some reason, they eventually all settle on eleven words for basic color names. The words are different, of course, but they all kinda translate. You don't run into a basic color word in Swahili that translates into "a sorta brownish shade of red, but not so dark". There must be something fundamental to the human eye or the neural pathway to the human brain that segregates color into these eleven groups.

I should mention here that the bulk of the Berlin and Kay paper dealt with a recurring pattern in the development of languages. They posited that nascent languages include white and black in their vocabulary, later adopt a name for red, then either yellow or green, followed by green or yellow, etc. The sequence up to these eleven colors is largely predictable.

There has been much research based on the work of Berlin and Kay, and it mostly supports the eleven-ness of color categories. Perhaps there are a few colors (such as beige, turquoise, plum, and coral) that belong on the next tier, but these are clearly the Magnificent Eleven.

In the old west, life was lived in black and white,
and there were only seven magnificent basic colors

I did a little tiny bit of research on this topic. Years ago, I taught algebra to people who hated math for University of Wisconsin Milwaukee. One semester I had about 50 students, half male and half female. I asked them to write down all of the single-word color names they could think of, and gave them two minutes. 

There were perhaps three or four lists that had only ten of the Magnificent Eleven, but almost every student had included the eleven basic color terms. The next two color names, in terms of frequency, were silver and gold, which each appeared on about half of the lists. That in itself I found interesting, since as a color scientist, I know that silver and gold are gonio-apparent effects, and not actually colors.

My paltry little experiment demonstrated once again that there is something magical about these eleven colors.

So, for this experiment today, I decided to go with that set of eleven. But I left off the achromatic colors (white, black, and gray) due to some technical problems beyond my control. I didn't think that neutral colors were pretty enough.

A caveat

There is always a caveat, isn't there? These two online experiments are absolutely fabulous work. Incredible amounts of data. In one of his papers, Mylonas states that there had been over 1,400 participants in his experiment; Moroney claims over 5,000.

Here's the caveat, though. You can bet that most of the computer displays were uncalibrated. There were certainly 6,400 different viewing conditions, if you consider intensity and white point of the display, ambient light, and background. So when Sidney from Sidney looked at a shade of lime green that was created by the RGB values 30, 230, 40, and Charlotte from Charlotte viewed that same combination on her monitor, there is apt to be a difference in what they are actually seeing. Anyone who has used a laptop with a second monitor can appreciate this issue. 

So, we identify that this is a source of experimental variation. We have literally scads of data, but are unsure about the quality of the data. But, I used it for my experiment anyway. I'm not proud.

Both researchers provided us only with RGB values associated with the color names. Before I go on, I should explain that "RGB" is not a standard. It could refer to any of the particular flavors of RGB values associated with whatever monitor or cellphone or camera you are using. But adding an s to the front of RGB wildly changes the connotation of the whole thing in much the same way that adding a little s does to your ex. sRGB is a unique standard that can be converted into the standard L*a*b* values. That would be a handy trick right now.

I asked my good buddy Dimitris Mylonas if it would be reasonable to assume that his RGB values were sRGB. I could almost hear him shrugging his shoulders through the email: "sRGB is safer than any other assumption." So, I used the standard computation to convert from sRGB to L*a*b*. Here is a website to do the conversion from sRGB to L*a*b*.
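For the curious, the standard computation looks roughly like the sketch below. It assumes 8-bit sRGB input and a D65 white point for L*a*b* (the post does not say which white point was used, so that part is an assumption, and the function name is made up for illustration).

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIELAB. Assumes a D65 white point (an assumption)."""

    def linearize(c8):
        # Undo the sRGB transfer curve to get linear light in 0..1
        c = c8 / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ (D65), the usual sRGB matrix
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl

    # Reference white: XYZ of sRGB white (row sums of the matrix above)
    xn, yn, zn = 0.9505, 1.0, 1.0890

    def f(t):
        # CIELAB compression function with its linear toe near black
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


# The lime green from the Sidney and Charlotte example
print(srgb_to_lab(30, 230, 40))
```

Graphic arts folks often want D50-based L*a*b* instead, which would add a chromatic adaptation step that this sketch leaves out.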

An obvious question here

I can see one of you bouncing up and down with your hand in the air... yes? You want to know whether the two data sets agree with each other? Good question! I wish I'da thought to do that! 

In the graph below, the circles represent the Mylonas data, and the squares with an outline represent the Moroney data.

Comparison of two versions of the basic colors

So, the answer is no. They don't match. The color difference ranges from 12.5 DEab to 46.4 DEab. Not good matches by any stretch of the imagination. There is a consistent pattern, however. The Moroney data is always more saturated. And the two data sets have very similar hue angles. With the exception of yellow and blue, the hue angles are all within  of each other. 
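In case it helps, here is a minimal sketch of the two quantities used in that comparison: DEab (the straight-line distance in L*a*b*) and the hue angle. The function names are mine, and no values from either data set are reproduced here.

```python
import math


def delta_e_ab(lab1, lab2):
    """DEab: Euclidean distance between two (L*, a*, b*) triples."""
    return math.dist(lab1, lab2)


def chroma(a, b):
    """C*: distance from the neutral axis in the a*b* plane."""
    return math.hypot(a, b)


def hue_angle(a, b):
    """h in degrees, in the range -180..180 (the convention used in this post's table)."""
    return math.degrees(math.atan2(b, a))


def hue_difference(h1, h2):
    """Smallest absolute difference between two hue angles, in degrees."""
    d = (h1 - h2 + 180.0) % 360.0 - 180.0
    return abs(d)
```

The wrap-around in `hue_difference` matters for colors near the negative a* axis, where 170 degrees and -170 degrees are only 20 degrees apart.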

I don't have a ready explanation for why the two experiments differed so much. Given the size of the data sets, the difference is due to some sort of bias between the two experiments, and is not a statistical anomaly. If nothing else, this is a caution for this endeavor: If we try to precisely define colors, there be dragons.

The eight color map problem

So here I am: I have L*a*b* values for the eight basic chromatic colors, but, truth be told, these numbers have somewhat of a checkered past.

Enter Sturges and Whitfield

If only I had some data that was taken under standardized conditions. Even if it were to be done with less than a cast of thousands, if it corroborated the online studies, then the online studies would be corroborated.  The good news is that such a study was done in 1995 by Sturges and Whitfield. They elicited the help of twenty students at their university in England. Half were male and half female. None had any specialized training in color, err, excuse me, in colour.

The experimenters selected 446 color chips from the Munsell Book, and asked each subject to give a monolexic name for the color of the chip. (From mono, meaning "the kissing disease", and lexic meaning "someone who can't remember the first three letters of their learning disorder", monolexic means "single word". Hence pink, aubergine, and sploofrinde are all monolexic. Burnt sienna and reddish-green are not monolexic. Owyell, by the way, is dyslexic, and there are no English words that rhyme with orange. Except for sporange, which means "word that rhymes with orange".)

Did I mention? Twenty subjects named 446 randomly ordered patches, and I forgot to mention that they were given each patch twice.

S&W thus had a lot of data to distill down -- about 18,000 words. Among other things, they tried using consensus to decide when a given name was proper for a given patch. If all twenty trials on a given patch yielded the same name, then there was consensus. I was a bit surprised, but even with this stringent test, there were 102 patches where there was a consensus as to the name. For about one-quarter of the patches, twenty people independently came to the same conclusion about the monolexic name.

So, my third data set was this set of 102 colors and their associated names. Since the colors were reported in Munsell notation, I used the Munsell renotation data to convert to L*a*b*.

Am I done yet?

Just to be safe, I wanted to throw in a few more data sets. I happened to have measurements from a Macbeth Color Checker chart. (This shows my age. People who don't immediately recognize the names "John, Paul, George, and Ringo" will know the Color Checker by the name X-Rite Color Checker.) This chart unfortunately does not include pink or brown.

And I did my own color naming experiment. I tossed a Pantone book at my wife and asked her to find the best representation of each of the basic color names in the book. Her brother, who is also color-savvy, was given the same test, and I recorded my answers as well. (In case you were wondering, we disagreed on pretty much all of them.)

Here are the results for the color purple. Each dot represents a "sample". Thus, the 5,000 people who took the Moroney test get one circle. The twenty college students who spent the better part of a weekend staring cross-eyed at Munsell chips instead of going out to a proper pub got a total of fourteen circles -- one for each Munsell patch that they all agreed was purple. And I got one circle all to myself. And, begrudgingly, I gave one circle to my wife and brother-in-law as well. Life isn't always fair. If one of the people who took the Mylonas test wants more than 1/1,400th of a circle, they can write their own darn blog post.

The X in the middle is the average of all the data.

The range of the color purple

Looking at the scatter of points of purple and of other colors, I saw a shape that was bounded on two sides by two hue angles, on two other sides by two chroma values, and (not shown) bounded on top and bottom by two L* values. Since I had some clear outliers, I let Excel tell me the 10th percentile and 90th percentile of L*, C*, and h.
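For anyone who would rather not open Excel, the same idea can be sketched in a few lines. This is not the spreadsheet I used; it assumes the samples are (L*, a*, b*) triples and uses a linear-interpolation percentile, which I believe is the same convention as Excel's PERCENTILE function. The function names are hypothetical.

```python
import math


def percentile(values, p):
    """Percentile with linear interpolation between the sorted samples."""
    xs = sorted(values)
    rank = (len(xs) - 1) * p / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return xs[lo] + (xs[hi] - xs[lo]) * (rank - lo)


def region_limits(lab_samples, low=10, high=90):
    """Return (low, high) percentile limits for L*, C*, and h of one color name."""
    lightness = [s[0] for s in lab_samples]
    chroma = [math.hypot(s[1], s[2]) for s in lab_samples]
    hue = [math.degrees(math.atan2(s[2], s[1])) for s in lab_samples]
    return {
        "L": (percentile(lightness, low), percentile(lightness, high)),
        "C": (percentile(chroma, low), percentile(chroma, high)),
        "h": (percentile(hue, low), percentile(hue, high)),
    }
```

One caution: taking percentiles of hue angles directly only works when the samples for a color do not straddle the jump between 180 and -180 degrees.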

Note: I had originally called this last one H*. Thank you Tammo for the correction!

Results, in graphical form and numeric form

The graph below may look familiar to those who bothered to read the first part of this blog, and who were also paying attention. It is the same graph as above, meticulously duplicated for the benefit of my dear readers.

Partitioning of color space into base color names

Color     Low L*  High L*  Low C*  High C*  Low h  High h
Red           41       49      59       86     27      37
Orange        62       72      67       96     57      67
Yellow        81       90      68      109     86      97
Green         31       72      29       80    122     168
Blue          31       71      24       58   -112     -71
Purple        25       52      26       81    -56     -35
Pink          62       81      25       54    -23      21
Brown         29       41      26       43     55      76
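If you want to play with these limits, here is a minimal sketch of the ColorNamer idea: convert a*b* to chroma and hue angle, then check each box in turn. The numbers are copied from the table above; the function and dictionary names are made up, and the hue angle is taken in degrees from -180 to 180 to match the negative values in the table.

```python
import math

# (low L*, high L*), (low C*, high C*), (low h, high h) for each basic color
REGIONS = {
    "red":    ((41, 49), (59, 86), (27, 37)),
    "orange": ((62, 72), (67, 96), (57, 67)),
    "yellow": ((81, 90), (68, 109), (86, 97)),
    "green":  ((31, 72), (29, 80), (122, 168)),
    "blue":   ((31, 71), (24, 58), (-112, -71)),
    "purple": ((25, 52), (26, 81), (-56, -35)),
    "pink":   ((62, 81), (25, 54), (-23, 21)),
    "brown":  ((29, 41), (26, 43), (55, 76)),
}


def name_color(L, a, b):
    """Return the unambiguous basic color name, or None if the color is outside every region."""
    C = math.hypot(a, b)
    h = math.degrees(math.atan2(b, a))  # -180..180 degrees
    for name, ((L_lo, L_hi), (C_lo, C_hi), (h_lo, h_hi)) in REGIONS.items():
        if L_lo <= L <= L_hi and C_lo <= C <= C_hi and h_lo <= h <= h_hi:
            return name
    return None


print(name_color(70, 40, 0))   # falls inside the pink limits
print(name_color(50, 0, 0))    # a neutral gray, so no chromatic name
```

Anything that returns None is fair game for a marital argument.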

Let me know if you find some use for this. I found some use... I wrote a blog post, and looking at this graph gives me a bunch of ideas for future blog posts.

Monday, November 21, 2016

What's with the new version of the X-Rite eXact???

I got a question the other day from my good buddy Steve (not to be confused with my mediocre friend Steve, or my sworn enemy Steve).  Steve was asking about the kerfuffle surrounding a new version of the X-Rite eXact. Something or other to do with polarization and clear films. What's up with that, John?

So, I did some research. And I have an answer to the question. I should mention that a lot of what I have to say applies to the Techkon SpectroDENS as well.

First case, the easy case. If you are planning on using the M3 (AKA polarized mode) for measuring films, then I think your time will be more productively spent looking for Civil War medals with a metal detector at Malibu Beach. I hear that they are in desperate need of a good professional beachcomber.

Now, I am not a big fan of M3 in general. M3 only makes some limited sense when you are measuring ink that is still wet. I think if you are printing on clear films, you are probably not measuring wet ink, so why do you think you need M3? But more importantly, the use of polarized light to measure clear films will result in unexpected results that are unexpected. Just don't do it, ok? If you are skeptical, then read the rest of this blog post.

Now for the more complicated case. If you have an X-Rite eXact or a Techkon SpectroDENS, and are using it to measure some sort of films, then I suggest you read this whole blog post.

Here is the short answer. Measurement of films can be unreliable with the eXact in M0, M2, and M3 modes. There is no problem with the M1 mode. The Techkon SpectroDENS can be unreliable on films in any of the modes.

Both manufacturers offer a modification to their instruments to make them more reliable on films. Neither of the instruments will be able to measure M3 after they get back from surgery.

If you aren't sure about all this M0, M1, etc business, I know a guy who wrote an excellent blog post just for you: What measurement condition is your spectro wearing?  Fascinating post, really. And the guy who wrote it is so extinguished looking!

Films and polarized light

Measurement of films with polarized light is problematic. I put together a video showing the really awesome effects that you can get when you mix polarized light and clear films. I then blogged about this effect, giving my own explanation for what was happening. I am happy to say that the comments on the blog cleared up my wrong-headedness about the underlying physics.

I made some mistake about the physics, but the effect is real. As awesomely cool as the effects are to watch, the effects are ginormously terrifying for anyone who is trying to measure clear films with some kind of polarized light. Ok... maybe the superlative is a bit too ginormous?

The new JMG SuperSpec, available once I get FDA clearance

One of the important conclusions from the video had to do with which of the Roscolux filters exhibited this mind-boggling color shifting property. I quote myself from the video: "I dunno why that is. A lot of the filters I have don't do anything cool."

In writing this blog, I chatted with a number of folks who have seen this issue. In particular, I was trying to get a handle on when this Muenster can be expected to rear its ugly head. Here are some quotes:

"The main culprits are films with some oriented grain structure, or anisotropy, a product of the extruding process. You might also say it occurs on all 'crinkly', cellophane-like films."

"I have seen the problem on matte clear films, so surface texture does not appear to reduce the effect, but the effect is not seen when measuring opaque white film."

"We have seen issues on matte, gloss, clear and 'opaque' films. What makes it all the more interesting is it isn't true on all of any of these (sort of demonstrated in your rosco experiment)!"

So, I think this just proves my point in the video: "I dunno why that is." I should say that a little less jocularly, since this is an important point. It is hard to predict when this really exciting property of films will come to poop on your parade.

I might add, measurement of polarized light sources, like computer displays, is also problematic. My good buddy Robin Myers has blogged about this.

M3 measurement condition (polarized)

Most spectros that are used in the graphic arts have a polarized or M3 mode. Why?  Read on...

When light bounces off the surface of something (specular reflection) the polarization isn't spoiled. When light enters something, and interacts, then it bounces around and quickly forgets the polarization it came in with. I'm sure you have been to parties like that as well.

This fact has been mercilessly exploited by manufacturers of spectrophotometers and (especially) by the users of those spectros. I have blogged about polarized spectrophotometers before. If you are reading this blog post because you are bored silly pretending to be paying attention to the opera with your spouse, you might want to go read that blog post. It will tell you why someone might want to use the polarized mode.

But for our purposes here, all we need to know is the concept of cross-polarized filters. (Get ready for the cool part...) If you illuminate the sample with light of one polarization, and then collect the light through a filter that only passes light of the other polarization, the only light you will measure is the light that has interacted with the sample. The specular light is extinguished. 

This has been implemented in spectros with a piece of glass that has one orientation of polarization on the outer ring, and a different orientation in the center, as shown in the really excellent artist's conceptual drawing below. 

The first spectrophotometer that I got to play with was the Gretag SPM 50, which had a thumbwheel that you had to rotate to put the unit into polarized mode. The polarized filter (no doubt) looked absolutely exactly like my drawing above.

I currently own a Spectrolino. This has a cap that you have to snap into place with the polarizing filter. I wanted to check if it was anything like my drawing. The three pictures below are taken of the cap with light polarized in one direction (gibbous moon on the left), in the perpendicular direction (crescent moon on the right), and somewhere in between (in the middle). Yup. Pretty much what I thought.

One key point here: moving parts. I have described two possible designs for a polarized spectro. Both require a part to be physically moved into place. It would be advantageous for the user to not have to make this change. Enter a new design...

A new design

X-Rite filed for a patent for a new design in May of 2000: "Color measurement instrument capable of obtaining simultaneous polarized and nonpolarized data". It was a clever idea that obviated the need to slap a funky dual-polarizing filter into place. The idea was simple. A polarized spectro requires cross polarization; one polarizer on the incoming light and another one on the detector. How about just leaving one of them in place all the time? In that way, you don't need a funky two-element filter, you just need to swap a polarizing filter in and out of the light path going to the detector. Fewer moving parts, more reliability.

I should explain that I still own a lot of tie-dyed shirts, so my concept of "new design" might be different from yours.

If you happen to have a design with a spinning filter wheel (like most of the spectros designed by X-Rite in Michigan), then the polarizing filters can "easily" be incorporated into the filters already present on the filter wheel.

Below is a diagram from US patent 6,597,454. Element 14 is the polarizing filter on the light source. Element 16 is the wheel with a zillion narrow bandpass filters. One filter might pass light between 400 nm and 420 nm. The next one might pass between 420 nm and 440 nm. That accounts for 15 of the filters. The other half have the same bandpass, but are polarized. Thus, half of the measurements are cross polarized and the other half are only once polarized.

A new ride at Six Flags - the Polarizing Spectrophotometer

I don't know if I mentioned this before, but it's actually a pretty clever idea. If I was on the design team that came up with the idea, I would have immediately tried to take credit for it.

We all know that having exactly one polarizing filter in a spectro will always give exactly the same readings as a spectro without any polarizing filters. What could go wrong?

Oh. Wait. I think all the talk about the awesome video and the previous blog post about the funny things that polarized light can do to extruded thin films is enough foreshadowing to suggest that there might be an issue.

So, what about the eXact?

Does the eXact utilize this ultra-spiffy concept? 

I don't have an eXact, so I contacted my good buddy Mike Strickler. I can say that he is my "good buddy" because we once had dinner together. Neither one of us threw anything at the other, or stormed out of the restaurant, so it went better than most of the blind dates that I went on when I was single.

I had him put the spectro in each of the modes (see image below), and ask the instrument to take measurements. Unbeknownst to the instrument, he held it up in the air so that he could see the illumination on the table. (If the instrument knew that he did not have a sample at the aperture, it would have just folded its cute little arms and said "Foo on you! Ain't nothing there to measure!")

Stolen from the eXact manual, page 12

To make this interesting, I asked him to place a polarizing filter between the instrument and the table. He asked the instrument to take measurements as he rotated the polarizer. Lo and behold... in the M0, M2, and M3 modes, the illumination from the spectro was polarized. In the M1 mode, it was not.

This suggests to me that the eXact makes use of the clever patent when it is in the M0, M2, and M3 modes, but not when it is in M1. The switch in the above picture slides a polarizing filter over the illuminant. But, that is just my guess. I'm still waiting for my mole at X-Rite to slip the mechanical design docs into an unmarked manila folder. I told him there was $10 in it for him. Dunno why he hasn't gotten back to me.

I don't know for sure what they have in there for the M1 mode. If any of you see me in your pressroom and have an eXact sitting out, I suggest that you make sure that there are no screwdrivers or hammers nearby. When I was five, my Dad learned a hard lesson about me that had to do with clocks.

I don't know why they decided not to incorporate the polarizing filter in the M1 mode. Polarizing filters do have a pesky tendency to mask UV light, and M1 requires a fair amount of UV. Maybe that's why?

Anyway, the Xp version of the eXact removes this polarizing filter completely. As a result, M3 measurements (polarized) are not available with the eXact Xp.

What if you already own the normal version of the eXact and are measuring thin films?  X-Rite does have a path to change your device to eliminate this problem.

Here is more information from X-Rite. They also recommend testing on clear films by rotating the instrument and seeing if the measurements change. They note changes as large as 2.5 DE.
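The rotation test is easy to score once you have the numbers. Here is a rough sketch, assuming you have typed in L*a*b* readings of the same spot taken at several instrument orientations. I am using plain DEab here, which is my assumption; X-Rite's note does not say which color difference formula is behind their 2.5 DE figure.

```python
import math
from itertools import combinations


def max_rotation_delta_e(readings):
    """Largest pairwise DEab among (L*, a*, b*) readings taken at different rotations."""
    return max(math.dist(p, q) for p, q in combinations(readings, 2))
```

If the result is well above the normal repeatability of your instrument on an opaque sample, then the film is probably playing games with the polarization.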

What about other instruments?

Since X-Rite's patent was filed in 2000, it is likely that other X-Rite instruments have this issue. I don't know which ones.

Currently, I know of four manufacturers who make M-condition instruments for the graphic arts: Barbieri, Konica-Minolta, Techkon, and X-Rite. 

Barbieri (LFP) and Konica-Minolta FD-5 and FD-7 both have caps that snap on when you make polarized measurements. One can assume that these are cross-polarized filters, so it is unlikely that either has this issue. I have heard directly from Barbieri that the LFP does not have this problem. I have not heard back from Konica-Minolta.

The Techkon SpectroDENS has the same issue with films. As with the X-Rite, Techkon has a modification for folks who measure films. They call it the SpectroDENS Flex. As with the eXact, the modification means that the M3 mode is no longer available.

As for inline (on-press) instruments, I have been told that the spectrophotometers from QuadTech, AVT, and BST do not utilize polarized light.

Tuesday, November 15, 2016

Explanation of "Cool tricks with polarized light" video

I saw a cool YouTube video yesterday. Let me warn you, if you don't like being barefoot, then I suggest you invest in a pair of sock garters before you watch this next video, cuz without them, this video will knock your socks off.

Wow. The guy in the video repeatedly says he doesn't know what's going on. I suspect he knows a bit more than what he let on. But just in case he didn't know the physics behind this, I decided to write this blog. 

Polarized light

Let's start with a little bit on polarized light. Light can be thought of as a rope that wiggles to and fro. It might be wiggling up and down, or right and left, or maybe at an angle. The orientation becomes important if the light goes through a picket fence, as shown in the diagram below. If the wiggle direction aligns with the direction of the pickets (as shown on the left), the wiggle is not obstructed by the pickets. If the wiggle runs perpendicular to the pickets, the energy doesn't go through. If the wiggle angle is somewhere in between, then some energy passes through.

Dreaming of a place with a white picket fence and a wiggly rope

That's all we need to know about picket fences and polarized light to explain the video. Well, maybe something else will come up. I dunno.
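Okay, something else came up. The "some energy passes through" part of the picket fence story has a name: Malus's law. For an ideal polarizer, the fraction of linearly polarized light that gets through is the cosine squared of the angle between the wiggle and the pass direction of the fence. A minimal sketch (the function name is mine, and real filters are never quite ideal):

```python
import math


def transmitted_fraction(angle_deg):
    """Malus's law: fraction of linearly polarized light passed by an ideal polarizer.

    angle_deg is the angle between the light's polarization and the
    polarizer's pass direction.
    """
    return math.cos(math.radians(angle_deg)) ** 2


for angle in (0, 30, 45, 60, 90):
    print(angle, round(transmitted_fraction(angle), 3))
```

At 0 degrees everything passes, at 90 degrees (crossed) nothing does, and at 45 degrees half the energy makes it through.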

Extruded films

A lot of people I know would say that their favorite line from The Graduate is "Mrs. Robinson, you are trying to seduce me... Aren't you?" That is a great line, but for me, it's when Mr. McGuire whispers the key of the future into Dustin Hoffman's ear: "Plastics." This line, by the way, is listed as number 42 in the American Film Institute's 100 best movie quotes. "Plastics" became a movie symbol of the era, a symbol of fake eyelashes, fake relationships at cocktail parties, and aspirations to the fake nirvana promised by consumerism. At least that's what my seventh grade teacher (Mr. Mattke) told us about The Graduate.

Plastics are made up of molecules (monomers) that have chained themselves up into a conga line known as a polymer. These very long lines of connected molecules are what give plastics their unique combination of strength and flexibility.

Monomers polymerizing into a conga line

Here is where I come up with an extremely clever analogy that brings these two thoughts together. Picket fences are a social symbol of the aspirations of one era, and plastics are a social symbol of that of another. But they are connected in another way... a way that involves polarization of light, especially when it comes to thin films of extruded plastics.

Imagine if you will, a dance floor with multiple conga lines kinda winding around at random. Imagine further that the head person in each conga line suddenly gets mesmerized by the evil villain The Extruder. The Extruder induces these charismatic people to suddenly start running due north. What happens to the conga lines? It is easy to visualize that those conga-regants who hang on will one by one start heading north. In the end, each conga line will be running basically north to south, and the lines will be close to parallel. 

I haven't spent much time actually in an extruder nozzle, but this is what happens when molecules in the chamber of an extruder accelerate as they head out the nozzle. The homogeneity of orientation depends (I would guess) on the amount of acceleration that the material experiences when it exits the nozzle, the viscosity of the polymer in its liquid phase, and the amount of time it takes for it to harden. But I'm just guessing. I am not a polymeric chemist. Nor do I play one on TV.

Here comes the important part where I draw this all together. For a wave of light, the two symbols of the aspirations of the generations (picket fences and plastics) look kinda the same. So, any extruded polymer film is inherently polarized, at least to some extent.

Example - cellophane

Cellophane is a polymer of good ol' glucose. The polymer is formed into sheets by extruding it through a slit. For whatever reason, cellophane seems to have a lot of this polarizing effect that I just predicted. I investigated a bunch of thin films for this blog post, and cellophane seems to be the most pronounced. 

(Interesting trivia: If you remove the letter a from cellophane, and move the o to where the a was, you get the word "cellphone". If that's not proof of a conspiracy involving Reagan and ray guns, then I don't know what is!)

For the next three pictures, I used my Kindle Fire as a backlight. The Kindle Fire, like many displays, emits polarized light. The first picture is just a piece of cellophane that I tore from a box of Lipton tea. Nothing exciting here. Move on to the next image, please.

Boring image of cellophane on my tablet

For the next picture, I put a polarizing filter between the camera and the cellophane. Wow. There are two interesting things happening here. First, the black area shows that the light emitted from the Kindle display is indeed polarized, just like I said in the last paragraph. (All this time, the news articles you have been reading on your Kindle were polarized, and you didn't even realize it! Yet another conspiracy? I dunno. You be the judge.)

A polarizing element has interjected itself into the picture

But the odd part is what happens to the light that was emitted from the Kindle through the cellophane and then through the polarizing filter. That light does not seem to be polarized. Did the cellophane scatter the light so that it was now randomly polarized??? (Note the use of multiple question marks. This could be a sign that this "answer" to the befuddlement is tentative, and may not actually be correct.)

Just based on the picture above, I am thinking that perhaps the cellophane is not truly scattering the polarization of the light. You see, polarizing filters will attenuate randomly polarized light. Some of the light doesn't pass through. But the gosh darn cellophane appears almost as bright as the Kindle light that does not pass through the cellophane or the polarizing filter.

The following picture proves that the cellophane does not randomize the polarization. For this picture, the polarizing filter was rotated by around 70 degrees. Heavens to Betsy! The cellophane has rotated the polarization by about 70 degrees! So, the conga lines in the extruded glucose polymer (the cellophane) are not acting like a typical bouncer at a nightclub, only allowing photons with certain orientations to pass. The bouncer at the Cellophane nightclub is actually twisting the orientations of the photons as they come in the door! I will refrain from political commentary about whether this is appropriate behavior for a bouncer.

A simple twist of the filter, and Holy Buckets!

Also note that the light that passes through both the cellophane and the polarizing filter is not neutral gray or black; it's brown. This shows that the reorienting of the polarization is not the same at all wavelengths. Here I am telling you just a bit more than what I know, but it seems to me that the width of the conga lines must be on the order of the size of the wavelength of light, so that (for example) larger photons (longer wavelength photons, like red) tend to be too big to experience this effect. Smaller photons are a bit more likely to be intimidated by the bouncer.

The video demonstrates that we can get some bizarre effects when we start mixing things up a bit.

But I see there are some questions in the audience... 

Why did the color change as we rotated the Roscolux filters?

The distance between the conga lines depends on the orientation of the polarized light, as shown below. Since we have decided that the distance between polymer strands is in the neighborhood of the wavelength of visible light, we would expect to see different wavelengths affected differently as this distance changes.

Effective distance between conga lines depends on the angle you hit them at

Why does an individual filter show changes in color along its length?

Here is my conjecture: the thicker the film, the more rotation of the orientation of polarization. In the image at the left (below), there is a small difference in the effective thickness of the film, probably due to either the film not being perfectly flat, or due to manufacturing tolerances in the thickness. At the right, I have pinched the ends of the film together so that the effective thickness changes dramatically along the filter. As can be seen, the rainbow has been squashed together.

When films turn into inchworms

Why does stacking the filters cause such cool stuff to happen?

I'm gonna say "read the previous explanation". If you put one filter overneath another, you have something like twice the thickness, so there is more opportunity to rotate.

Why do some of the filters not show this effect?

I dunno, but I have an educated guess. According to the Roscolux website: the filters are "comprised of two types of body-colored plastic filters; extruded polycarbonate and deep dyed polyester". The fact that they use two different materials for the filters might explain why some exhibit this bizarre behavior, and some do not? 

But I dunno, there is also (perhaps) an effect caused by the colorant. It seemed to me that higher amounts of colorant tend to suppress anything fun and interesting.

An introspective comment on the nature of science

When my kids would ask me science questions, I would usually have a pat answer. "Why is the sky blue?" "It's not blue, it's cyan, and it's because of Rayleigh scattering." "What is electricity?" "It is the flow of electrons through a wire."

Sometimes, but not always, they would ask a follow up question like "what's Rayleigh scattering?" or "What are electrons, and why do they flow in wire but not in wood?" Eventually, they just stopped asking questions. Maybe they lost interest, or maybe they realized that I was just kicking the can further down the road.

I have given what I think is a reasonable explanation of why polymeric chains have a tendency to align during the extrusion process. I dunno if this is truly the case. Maybe if I finally get that electron microscope I have been begging Santa for, I will be able to find out?

I have given my supposition that these aligned polymeric chains can cause polarization effects and that they can further affect the orientation of the polarized light. I realize that I neglected to explain just why that might happen.

I have noted that some films show this effect, and some do not. I didn't follow up on this or suggest anything substantive on why cellophane and some of the Roscolux filters do this.

Another mystery to me... I know that most display devices emit polarized light because they use liquid crystals to modulate the light. A weak electric field causes the liquid crystals to align one way or the other, so that the polarization changes. What I can't figger is why my KindleFire emits polarized light. I have been told that it has a layer of quantum dots that selectively absorb light from below, and then re-emit it (fluoresce) in a fairly narrow wavelength range. In this way, the light emitted can be closer to monochromatic. This gives the device a wider color gamut without the normal loss of power from a purely absorptive filter. But... I would think that the re-emission would be randomly polarized.

I was gonna pat myself on the back for a great explanation of a really cool effect, but all I have done is kick the can a bit further down the road.

Wednesday, November 9, 2016

Statistical process control of color difference data, part 4

Warning: If you were considering whether to jump off the diving board into this blog post because this series is getting too deep mathematically, then I would suggest that you get off at this stop and wait for the next bus. On the other hand, if you were one of the hardcore color geeks who has been chafing at the bit for me to get to the meat of the matter, then read on, read on. If you get all the way to the end of this blog post and find yourself saying "yeah... that all makes sense", or even "John is completely wrong on this!" then I commend you. I hope to get a chance to have a beer with you.

There have been a number of comments on the previous blog posts (primer on SPC, deltaE is not normally distributed, and anomalies with standard deviation) from hardcore color geeks. All of these comments are from smart color scientists. Please note that smart color scientists either have beards, or are named Dave. Unless, of course, they are female, in which case they like Thai food.

David MacAdam, Albert Munsell, and Deane Judd
All sporting beards!

Dave Wyble suggested looking at the variance on the individual components, ΔL*, Δa*, and Δb*. "Conventional wisdom says [component differences] will be normally distributed..." Come on, Dave. When was wisdom considered "conventional?" 

(Color scientist of the world today trivia fact number seventeen: Dave does not have a beard at this time. He has, however, had one at various times in his life.)

Max Derhak anticipated this blog post in one of his comments: "Isn't this a Chi-squared distribution?" 

Steve Viggiano commented: "Having advocated Hotelling's t-squared for this application for decades, I am interested in where this is going." He also commented on the distinction between deltaE and deltaE squared, and the unrealistic assumptions baked into the chi-squared distribution. I will address these directly.

These folks are all hardcore color geeks. Max and Steve both have beards, by the way. Come to think of it, I also have a beard. Therefore I must be a smart color scientist... a smart color scientist who is not so good with syllogisms.

The question on the chalkboard today is this:

Do color difference values, as measured by the square of the ΔEab values, have a chi-squared distribution? If not, then what is it?

Literature review 

Several sources have suggested that ΔE has a chi-squared distribution. The earliest I have found is from Fred Dolezalek [1]. He looked at measurements from 19 print runs and came to the following conclusion:

"[The ΔE values] followed a chi-squared statistic, characterisable by a single parameter, which could be linked to the standard deviations in L*, a*, and b* space."

In the Appendix of his paper, he referred to a previous paper, by H. G. Volz, which demonstrated this. I took a cursory look for the paper, and didn't find it. I admit to not trying real hard. It's in German. Unlike most color scientists, I don't read German.

Dolezalek's paper included the following graph, supporting his claim. Note that this is technically known as a funky-graph because of the nonlinear scaling on the y axis. This crazy scaling is designed so that data from a chi-squared distribution will plot out as a straight line.

One set of data, from Dolezalek's paper

All in all, this is not a bad fit to the data. But note that it can be seen to fall off at the right end. The four points at the end (the four points with highest ΔE) are all below the line. This is a sign that perhaps the curve fit is not ideal.  The fact that the curve fit starts falling off at the 80th percentile is bad juju for SPC. For SPC, the upper control limit is conventionally set to the 99.75 percentile. 

I will amend Dolezalek's statement accordingly: "[The ΔE values] followed a chi-squared statistic, but not in the area where SPC needs!"

The next reference I found is from an unpublished document by a highly respected friend of mine, Dave McDowell [2].  Kodak produced a test target which may be familiar to my readers - around 700,000 of these were manufactured. Kodak wanted these to be produced to tight tolerances, so they were very rigorous about their process control. They found the chi-squared statistic suitable for their needs.

Kodak Q60 target

Here is what Dave said:

"The quantity 'deltaE/2 - avg' when squared follows the chi-squared distribution...

"Evaluation of a large number of samples of the Kodak Q60 transmission and reflection targets showed that the deltaE characteristics of individual samples compared to the batch mean followed this same statistic." 

Before I go on, I want to highlight the "when squared" part that Dolezalek had inadvertently missed. The metric that theoretically has chi-squared distribution is not ΔE, but rather ΔE squared. I am sure that Dolezalek was aware of the fact, since his results were at least reasonable. It was probably an unintentional omission on his part. Steve Viggiano reiterated this in a comment to part 2 of this series.

Unfortunately, McDowell did not provide any data or plots by which we can assess the strength of his statement. He has provided me with the data, however. Winter is coming in Wisconsin. Plenty of time for me to curl up with my laptop and savor the 150 files. (It goes without saying that Dave does not have a beard.)

(The text that was quoted above later appeared in a white paper from Kodak [3]. This makes sense, since McDowell worked for Kodak at the time.)

Steve Viggiano is another highly respected friend of mine who has weighed in on this topic, again in an unpublished document [4]. Viggiano was the first (to my knowledge) to articulate the precise conditions that must be met in order for the chi-squared function to be applicable for ΔE. I don't mean to say that Viggiano invented these criteria - they go with whoever invented the chi-squared function. I mean to say that Viggiano was the first to assert these criteria for the distribution of color difference values. Following Stigler's law of eponymy, I will refer to these as Viggiano's criteria.

More on these criteria in the next section. A lot more. I might add that this blog post would not have been possible without Steve's prolific pursuit of pedantic pleasures. He knows this stuff better than I ever will.

Moving along to additional chi-squared sightings, the ASTM standard E 2214-02 [6] has a brief mention of the distribution. It states the following: "As observed in Fig. 2, the mode, median, and mean of a set of color difference (ΔE) determinations do not follow a bell curve but a curve related to the Chi-squared or F statistical distributions..." There is unfortunately no further explanation. What is the relationship? When is F applicable?

I know of one additional reference that discusses the statistical distribution of ΔE values, a paper by Maria Nadal et al. [5]. In their paper, they compared various methods for determining the 95th percentile of color difference data. 

Why the 95th percentile? Two of the authors of this paper are from NIST. For a fee, NIST makes official color measurements, usually of colorimetrically stable objects such as Lucideon tiles (previously known as Ceram tiles, which were enshrined in the literature under the name BCRA tiles). As part of their service, they assign confidence intervals to the measurements. They are required to report the 95% confidence interval.

This is a very useful analysis and a well thought out paper, since there is a dearth of technical information on real-world evaluation of the distribution of color difference data. However, the analysis in Nadal et al. is not necessarily directly applicable to our goal. For SPC, one needs the 99th, or preferably, the 99.75th percentile. Our problem is more difficult, since we are interested in the shape of the distribution way out in the tail.

Putting our quest in perspective

What is the chi-squared distribution?

Let's say that you take a bunch of random data - values sampled from a distribution - and then add them in quadrature. (This is a fancy phrase meaning that you combine them with the Pythagorean theorem, which is to say, you square each one, add up the squares, and then take the square root of the sum.) The square of the result (which is to say, the sum of the squares) has a chi-squared distribution, provided the Viggiano criteria have been met. (Yes, I will get to those criteria. Don't rush me!)

The chi-squared distribution is not just a single distribution; it is a family of distributions. The members of the family are distinguished by the number of random variables that were added together to get that distribution. We refer to any of the family as a chi-squared distribution with n degrees of freedom, where n is the number of random variables that were summed in quadrature.

Once the number of degrees of freedom has been decided upon, there is only one parameter left. This parameter accounts for a scaling of the distribution along the ΔE axis, and is dependent on the standard deviation of the distribution from which the random variables have been taken.

Note that the ΔEab formula is equal to the sum in quadrature of the differences in each of the colorimetric components: ΔL*, Δa*, and Δb*. So, it is at least a potential candidate for the chi-squared function with 3 degrees of freedom.

For the distribution of ΔEab squared to be chi-squared, the distributions of the three components must satisfy all four of the following criteria (as recited by Viggiano):

    1. They must all have zero mean. 

    2. They must all be normally distributed.

    3. They must all have the same standard deviation. (ΔL* can't dominate, for example.)

    4. They must be independent. 

Important plot point: If we can ascertain that ΔE squared follows the chi-squared distribution then finding the 99.75 percentile would be a simple matter of arithmetic applied to the mean.
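To make that arithmetic concrete, here is a minimal Python sketch. It assumes all four criteria hold, so that ΔE squared is a scaled chi-squared variable with three degrees of freedom (whose mean is 3). The function names are made up for this illustration, I have taken "the mean" to be the mean of ΔE squared, and the percentile is found by bisection on the closed-form cumulative function for three degrees of freedom rather than by calling a statistics package.

```python
import math

def chi2_3dof_cdf(x):
    """Cumulative function of the chi-squared distribution, 3 degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return (math.erf(math.sqrt(x / 2.0))
            - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

def chi2_3dof_quantile(p, lo=0.0, hi=100.0):
    """Invert the cumulative function by bisection (hypothetical helper)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_3dof_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def delta_e_upper_limit(mean_de_squared, p=0.9975):
    """Upper limit on ΔE from the mean of ΔE squared, IF the criteria hold."""
    # ΔE² = s² * chi2(3), and the mean of chi2(3) is 3, so s² = mean(ΔE²) / 3.
    scale = mean_de_squared / 3.0
    return math.sqrt(scale * chi2_3dof_quantile(p))

print(delta_e_upper_limit(3.0))
```

The only measured quantity that goes in is the mean; everything else comes from the shape of the assumed distribution. That is exactly why the assumption needs checking.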

Does the TR 002 data fit the chi-squared distribution?

I tested the TR 002 data to see if the chi-squared distribution thing worked. The data is measurements of 928 different CMYK patches as printed on 102 newspaper presses throughout the USA. (The data set is further explained in a previous post in this series.)

I first computed the average L*a*b* values for each of the 928 patches. Each of these averages was thus representative of what would be printed on the 102 different newspaper presses. For each of the 928 patches, I then computed the color difference, in ΔEab, between this average and each of the 102 measurements. This gave me a collection of 102 X 928 color difference values. More than enough to wallpaper my kitchen.

A comment here: My dog asked me why I didn't use ΔE 2000, rather than ΔEab. He felt that it would be more useful to use the latest and greatest color difference formula. Presumably this is the formula that will get the most air time in the future. I agreed with him (he is pretty smart, as dogs go), but explained that, for purposes of the initial investigation in this blog post, I prefer to use ΔEab. If all of the Viggiano criteria hold true, then the square of ΔEab would precisely fit the chi-squared distribution with three degrees of freedom. If this part holds true, then the next step would be to see if ΔE 2000 was reasonably close. (A bit of foreshadowing: My conclusion, as we shall see, is that even ΔEab did not fit this model.)

I generated cumulative probability density functions (CPDFs) for each of the 928 patches. I scaled each of the CPDFs by their mean. In this way, they all had a mean of 1.0 ΔEab. Bear in mind that if the variations truly follow a chi-squared distribution, then all of these would have the same shape, but just different scaling in the x axis. So assuming that all four criteria are met, these color difference values should all be from the same distribution. Therefore I can combine them into one CPDF. This gives me the advantage of having 928 X 102 = 94,656 data points, so the resulting CPDF will be relatively smooth.

To check the assumption of chi-squaredness, I used a Monte Carlo method to generate the CPDF of the hypothetical distribution if all the Viggiano criteria hold. I generated 94,656 sets of hypothetical variations (in ΔL*, Δa*, and Δb*), each drawn from a normal distribution with mean of 0 and standard deviation of 1.0. I then went through the ΔEab formula to generate hypothetical color differences. Finally, this collection of values was normalized to have a mean of 1.0 so as to best match the distribution computed from the real TR 002 data.
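For those who want to play along at home, the Monte Carlo step can be sketched in a few lines of Python. This is a minimal sketch, not the code behind the plots in this post; the function name and the seed are arbitrary choices made for the illustration.

```python
import math
import random

def simulate_normalized_delta_e(n_samples=94656, seed=1):
    """Hypothetical ΔEab values with mean 1.0, assuming all four criteria hold."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        d_l = rng.gauss(0.0, 1.0)   # ΔL*
        d_a = rng.gauss(0.0, 1.0)   # Δa*
        d_b = rng.gauss(0.0, 1.0)   # Δb*
        values.append(math.sqrt(d_l ** 2 + d_a ** 2 + d_b ** 2))
    # Normalize so the collection has a mean of exactly 1.0.
    mean = sum(values) / len(values)
    return [v / mean for v in values]

delta_e = simulate_normalized_delta_e()
print(sum(delta_e) / len(delta_e))
```

Because the values are normalized by their own mean, the choice of 1.0 for the standard deviation does not matter. Any common value would give the same normalized collection.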

Generating a plot of a CPDF from a collection of values is a rather easy task, by the way. (Thank you for asking how it is done.) First, the data is sorted from smallest to largest. Next, a second array of the same length is generated with values incrementing in fractional steps from 0 to 1. This incremental array is then plotted as a function of the color difference array. Note that this is kinda the reverse of the way we would normally plot.
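The recipe in the previous paragraph might look like this in Python. It is a sketch with a made-up function name and made-up data; the plotting itself is left to whatever package you favor.

```python
def empirical_cdf(values):
    """Return (sorted values, cumulative fractions) ready for plotting."""
    xs = sorted(values)
    n = len(xs)
    # Fractions step evenly from 0 to 1, one per data point.
    fractions = [i / (n - 1) for i in range(n)] if n > 1 else [1.0]
    return xs, fractions

xs, fractions = empirical_cdf([2.1, 0.4, 1.3, 0.9, 3.0])
# Plot fractions (y axis) against xs (x axis) to draw the curve.
```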

The plot below shows a comparison of the two CPDFs. The blue line is the actual data, and the red line is the chi-squared distribution with three degrees of freedom. Both have a mean of 1.0. Gosh. They look different.

Do real color difference values follow a chi-squared distribution?

The above plots can be differentiated to create a probability density function, as shown below. As expected, they are not as smooth as the CPDF curves, but it is still clear that the two distributions are dissimilar. Real color difference data is bunched further to the left and has a longer tail to the right than the distribution based on the chi-squared function with three degrees of freedom. I put that in italics, since it kinda sounded like an important conclusion.

Clearly the distribution of real color difference values does not follow a chi-squared function with three degrees of freedom. What went wrong? One or more of the Viggiano criteria must have been missed. Let's look at the criteria one at a time.

Zero mean

Do the differences ΔL*, Δa*, and Δb* all have zero mean?  In general, this depends on how a given data set is compiled. I see three general cases:

In the first case, the data itself is used to define the target color. In a Mean Color Difference from Mean (MCDM) scenario, the target L* value is the average of the L* values of the data set, and similarly for a* and b*. In this case, the ΔL*, Δa*, and Δb* have zero mean by design. This is the official formula for determining the repeatability of a spectrophotometer. This was also the case in the analysis of the TR 002 data set that was presented in the second post of this series, and in the previous section. 

A second case is described in Nadal, et al. They talk about looking at all pairwise color differences in the data set. For example, in a set with ten color values, the color difference would be determined between the first sample and each of the nine other samples, between the second sample and each of the eight others, and so on. For ten color values, there are thus 45 different unique combinations.

It is not immediately obvious, but this case also has zero mean for ΔL*, Δa*, and Δb*. Consider the case where all pairings are considered, not just unique pairings. That is to say, one computes ΔE (sample n, sample m) as well as ΔE (sample m, sample n). Noting that ΔL* (sample n, sample m) = - ΔL* (sample m, sample n), it is easily seen that for every pairing, there is the reverse pairing which balances it out. Thus, if one considers all non-unique pairings, the three colorimetric components all have zero mean.

Since ΔEab is one of the commutative color difference formulas, the color difference for a pairing is the same as that of the reverse pairing. Therefore, the distribution of ΔEab is the same for the case of unique pairings only and the case of all non-unique pairings. I would argue then that the first criterion is met with the case presented by Nadal, et al.
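The pairing argument is easy to check numerically. The sketch below uses ten made-up L* readings (standing in for full L*a*b* values, to keep it short) and confirms that the signed differences cancel over all ordered pairings, while the magnitudes are the same whether or not the reverse pairings are included.

```python
import itertools
import random

rng = random.Random(7)
l_values = [rng.gauss(50.0, 2.0) for _ in range(10)]   # made-up L* readings

unique_pairs = list(itertools.combinations(range(10), 2))   # 45 pairings
all_pairs = list(itertools.permutations(range(10), 2))      # 90 pairings

# Signed ΔL* over every ordered pairing: each pair is cancelled by its reverse.
mean_dl = sum(l_values[n] - l_values[m] for n, m in all_pairs) / len(all_pairs)

# The magnitudes are the same set of values, just counted twice.
mags_unique = sorted(abs(l_values[n] - l_values[m]) for n, m in unique_pairs)
mags_all = sorted(abs(l_values[n] - l_values[m]) for n, m in all_pairs)

print(len(unique_pairs), len(all_pairs), mean_dl)
```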

Both of these are (somewhat) unnatural cases, which is to say, not generally encountered in the SPC of color. Generally speaking, the target color has been externally specified, typically by the customer of the product. Although the process has been adjusted to come close to this color on average, this will never be exactly the case. It is not atypical for the average color of a print run to be a few ΔEab from the target color. Therefore, at least in the SPC of color in printing, Viggiano's first criterion is rarely met. Remember that the whole gist of this series is SPC?

With SPC, we would therefore expect that the violation of the "zero mean" criterion will have a major effect on the distribution of color difference values. Viggiano anticipated that I would write that in my blog one day, so he mentioned that if the zero mean criterion is violated, then color difference will follow a non-central chi-squared distribution.

Thus, when SPC is performed on color difference data, the non-central chi-squared distribution is the appropriate choice (assuming any of the chi-squared distributions are appropriate). Once again, I used italics cuz this sounded important.
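Here is a small simulation of what an off-target process does to ΔE squared. It is a sketch under stated assumptions: unit standard deviation on all three components, independence, and an offset in ΔL* only. For a non-central chi-squared distribution with k degrees of freedom and non-centrality λ, the mean is k + λ, so an offset of 2 sigma should move the mean of ΔE squared from 3 to 3 + 4 = 7.

```python
import random

def mean_delta_e_squared(offset, n_samples=50000, seed=3):
    """Mean of ΔE² when ΔL* has a mean of `offset` (in units of sigma)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        d_l = rng.gauss(offset, 1.0)   # lightness is off target by `offset`
        d_a = rng.gauss(0.0, 1.0)
        d_b = rng.gauss(0.0, 1.0)
        total += d_l ** 2 + d_a ** 2 + d_b ** 2
    return total / n_samples

centered = mean_delta_e_squared(0.0)   # theory: 3
shifted = mean_delta_e_squared(2.0)    # theory: 3 + 2**2 = 7
print(centered, shifted)
```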

Normal distribution of components

Are the distributions of ΔL*, Δa*, and Δb* normal? Gosh. This is not a simple question. On the one hand, it seems like a reasonable assumption, at least as a starting point.

On the other hand, the values L*, a*, and b* are computed through a nonlinear transform of X, Y, and Z, so strictly speaking, either XYZ or L*a*b* could be normal, but never both. But then again, the nonlinearity is small when compared to the typical variation, so the effect is probably not of practical importance. (This statement relies on a simple rule - every smooth function looks like a line if you look at a small enough piece of it.)

But, on the third hand, there are certainly conditions where the variation of L*a*b* is distinctly not normal. For example, if the process has a pronounced drift, the variation could resemble a uniform distribution. If the process has a sinusoidal fluctuation then the distribution will be a U shaped distribution with asymptotes corresponding to the two extremes of travel. I once figgered out what the formula for that distribution is. I fergit just now what my answer was.
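For the record, the textbook result for a pure sinusoid sampled at a random phase is the arcsine distribution, with density 1 / (π sqrt(A² - x²)) for an amplitude of A. I can't promise that this is the formula I once worked out, but the sketch below (with an assumed amplitude of 1 and an arbitrary seed) shows the U shape: only a third of the readings fall in the middle half of the travel.

```python
import math
import random

def sinusoid_density(x, amplitude=1.0):
    """Density of amplitude*sin(phase) when the phase is uniformly random."""
    return 1.0 / (math.pi * math.sqrt(amplitude ** 2 - x ** 2))

rng = random.Random(11)
samples = [math.sin(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(100000)]

# Fraction of readings in the middle half of the travel, |x| < 0.5.
middle_fraction = sum(1 for x in samples if abs(x) < 0.5) / len(samples)
# Theory: (2 / pi) * asin(0.5) = 1/3, so two thirds of the readings
# land in the outer half of the range, near the extremes.
print(middle_fraction)
```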

I will argue that, with good SPC, care is taken so that the process is in known, good working condition before initial characterization. This, of course, is not always the case. The idealist side of my brain would put my foot down and say "WHAT!!?!?!? Gosh darn it to heck! If the process has an oscillation or drift, then in the name of peanut butter and jelly sandwiches you better fix it before you do any characterizing!!" But the realist side of my brain is willing to admit the possibility that some (if not all) processes have drift and/or oscillation that cannot be eliminated.

I generally avoid having the idealist and the realist sides of my brain in the same room together. It avoids a lot of arguments. But in this particular case, the tiny tiny portion of my brain that has some modicum of mediation skills was able to come to a compromise that was suitable for both of the other sides of my brain. If the process is capable (in the SPC sense of the word) of providing product that is within the customer tolerance, then does it matter if there is oscillation or drift? If it costs money in the long run to get rid of that anomaly, then one has to weigh this against another customer requirement - price.

Getting back to the matter at hand, I am going to start with the bold assertion that when the process is in good working condition, the variation will likely be close to normal. But I will follow up with a little investigation to back this up.

I tested the TR 002 data to see if the variations in the colorimetric components had a normal distribution. My testing amounted to analysis of skewness and kurtosis.

-- Skewness

The skewness was computed for each of the 928 patches and each of the 3 colorimetric components. This gave 2,784 values for skew. Each value of skew represented the distribution of a different set of 102 samples.

The graph below shows all the values of skew.

Skew of ΔEab values

What to make of this? There are various rules of thumb about how much skew is of practical importance. Perhaps a skewness with absolute value greater than 1 or 2 is considered to be significant? By that rule of thumb, skew is not a problem.

A more precise test comes from a table in my favorite statistics book [7]. According to the table, if I were pulling 100 samples from a perfectly normal distribution, I would expect that 2% of the time, I would have a skewness above 0.567 or below -0.567. In this data set, I saw that 16.6% of the skewness values were above this number.
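If you don't have the table handy, the 2% figure can be sanity-checked by simulation. The sketch below assumes the moment-based definition of skewness (third central moment divided by the variance to the 1.5 power); the table in [7] may use a slightly different definition, and the number of trials and the seed are arbitrary.

```python
import random

def skewness(data):
    """Moment-based skewness: third central moment over variance**1.5."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

rng = random.Random(5)
n_trials = 2000
exceed = 0
for _ in range(n_trials):
    # 100 samples from a perfectly normal distribution
    sample = [rng.gauss(0.0, 1.0) for _ in range(100)]
    if abs(skewness(sample)) > 0.567:
        exceed += 1
fraction_flagged = exceed / n_trials
print(fraction_flagged)
```

The fraction flagged should land in the neighborhood of 2%, which is what makes 16.6% look so out of place.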

So, I can say with statistical certainty that an inordinately large number of the data sets are skewed. Oddly, about two thirds of these are skewed positive and one third is skewed negative. I don't know why that is! Why would the variation of some colorimetric components of some of the patches go positive and others negative? Without careful examination of the data, I would guess that there are some outliers? I might even go so far as to guess (rather oxymoronically) that there are an appreciable number of outliers. Someday when I run out of beer in the fridge, and run out of reruns of Get Smart to watch, I will look at that.

Then the realist side of my brain kicks in. "Really? Did you happen to notice that there are only seven cases where the skew is greater than 1.5??!? What are you smokin'? That's not a lot of skew!!!"

-- Kurtosis

Kurtosis is the other popular measure of a data set that can be compared to that of the normal distribution to test for normality.

I compared the kurtosis values against the 1% ranges for kurtosis (also from [7]), and found a similar situation. 6% of the data sets showed a statistically significant deviation from the normal distribution, with some being leptokurtic and some being platykurtic. (Man! I love those words!!)

Without a whole lot of guidance, I am going to jump to the wild conclusion that the kurtosis is not of practical significance.

--Conclusion of normality analysis

So, my conclusion is that there is a statistically significant number of color difference measurements that have skew and/or kurtosis. Some go one way, and some the other. But from a practical standpoint, the skew and kurtosis are not appreciable. It is unlikely that this is the Viggiano criterion that has been violated.

Equal standard deviation and independent

In the beginning of this blog post, I stated that I would take the Viggiano criteria one at a time. A lot has happened since then, and I have changed my mind. I have lumped these two criteria together, since they are symptoms of the same illness. If the illness is present, it will manifest in one or the other of the criteria, or perhaps both. I believe this analysis is unique to the color science world, but perhaps not among chi-squared-ologists.

To help understand this, consider the two plots below. For the plot on the left, I generated pairs of x and y values from a random number generator that gave mean of 3.0. For the x values, the standard deviation was 0.1, and for the y values, the standard deviation was 0.7. The plot on the right is that same data, but rotated about {3, 3} by 45 degrees.  

The data set on the left violates Viggiano criterion #3, since the standard deviation in y is 7 times that in x. On the other hand, there is no correlation between the two axes. In other words, it passes criterion #4, but fails criterion #3.

In the data set on the right, the standard deviations of the x and of the y values are practically the same, so criterion #3 has passed. On the other hand, the correlation coefficient between the x and y values is 0.958. With such a strong correlation, the data on the right clearly fails at criterion #4.
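The two data sets are easy to reproduce. This is a minimal sketch, not the code behind the plots: the sample count and seed are arbitrary, and I have assumed a clockwise rotation so that the correlation comes out positive. In theory the rotated data has a standard deviation of 0.5 on both axes and a correlation of 0.24 / 0.25 = 0.96.

```python
import math
import random

def stdev(values):
    """Sample standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs)
                           * sum((y - my) ** 2 for y in ys))

rng = random.Random(2)
xs = [rng.gauss(3.0, 0.1) for _ in range(5000)]
ys = [rng.gauss(3.0, 0.7) for _ in range(5000)]

# Rotate every point by 45 degrees (clockwise) about (3, 3).
c = s = math.sqrt(0.5)
xr = [3.0 + c * (x - 3.0) + s * (y - 3.0) for x, y in zip(xs, ys)]
yr = [3.0 - s * (x - 3.0) + c * (y - 3.0) for x, y in zip(xs, ys)]

left = (stdev(xs), stdev(ys), correlation(xs, ys))
right = (stdev(xr), stdev(yr), correlation(xr, yr))
print(left)
print(right)
```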

From the standpoint of the problem at hand, the two data sets are identical, so the conclusions should be identical. In the one case a reasonable test is to look at the ratios of the standard deviations. In the other case, the correlation coefficient seems like a likely test. This begs the question about how to equivalently evaluate the two criteria, and how to catch the situation where the "illness" has been apportioned to both of the criteria. 

-- Interlude into ellipsoidification

I have developed a technique that is appropriate here. I mentioned the technique in a blog post about the color red back in January of 2013. I dubbed it ellipsoidification.  I have not seen this described elsewhere. Certainly the word is novel, if not the technique.

Ellipsoidification is an extension of the standard deviation to multiple dimensions. 

[Note to those who have committed ASTM E 2214-02 [6] to memory. Section 6.1.1 describes an extension to standard deviation to multiple dimensions. While the method in 2214 is similar - perhaps even very similar - it just misses a truly beautiful result.]

Ellipsoidification produces a vector with one standard deviation value for each dimension of the data. In the case of uncorrelated coordinates (as in the example above on the left), the multi-dimensional standard deviations are equivalent to the standard deviations of each of the individual dimensions (i.e. stdev (x), stdev (y), etc.). 

For data that is correlated in one or more dimensions, the results are the same as if you first rotated the coordinate system so as to make all the axes independent, and then took the one dimensional standard deviations of each component individually. In the example above, the plot on the right would first be rotated by 45 degrees, and then the standard deviation would be computed on each of the axes. Thus, the two plots would have identical multi-dimensional standard deviations, as we would hope.
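One way to get numbers that behave as just described is through the eigenvalues of the covariance matrix. The sketch below works under that assumption, in two dimensions only, and is not necessarily the exact recipe used for the plots in this post. The function name is made up, and the test data is the tilted example from above (regenerated with an arbitrary seed).

```python
import math
import random

def multi_dim_stdev_2d(xs, ys):
    """Axis lengths and tilt of the fitted ellipse, from the 2x2 covariance matrix."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # The eigenvalues of [[sxx, sxy], [sxy, syy]] are the variances along
    # the rotated (uncorrelated) axes.
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    minor = math.sqrt(max(half_trace - root, 0.0))
    major = math.sqrt(half_trace + root)
    # Angle of the major axis, measured from the x axis.
    tilt_degrees = math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))
    return minor, major, tilt_degrees

# Tilted test data: stdev 0.1 and 0.7, rotated 45 degrees clockwise about (3, 3).
rng = random.Random(2)
c = s = math.sqrt(0.5)
pts = [(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.7)) for _ in range(5000)]
xr = [3.0 + c * dx + s * dy for dx, dy in pts]
yr = [3.0 - s * dx + c * dy for dx, dy in pts]

minor, major, tilt = multi_dim_stdev_2d(xr, yr)
print(minor, major, tilt)
```

The tilted data gives back something close to 0.1 and 0.7, with a tilt near 45 degrees, even though the plain old standard deviations of x and y are both about 0.5.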

That was one explanation of the technique.

Here is another explanation of the multi-dimensional standard deviation. Say we have a set of one-dimensional data. It is a set of points scattered along a line. The standard deviation is a line whose length is somehow representative of the amount of spread of the data points. In two dimensions, we do that same trick, but in this case, we are finding the ellipse which best describes the spread of data in two dimensions. In a sense, we "fit" an ellipse to the data. 

In the case of the example above at the left, the ellipse has the major axis straight up and down and the minor axis is right to left. The minor axis has a length of 0.1 and the major axis has a length of 0.7. In the example at the right, the ellipse has the same major and minor axes, but it has been tilted by 45 degrees.

Below I have an actual example of an ellipsoidification caught smiling for the camera. This is a two dimensional one, since my three dimensional display is at the cleaners getting its nails done. All those little black points that look like data points are data points, generated by a random number generator with normal distribution.

The red ellipses are the one, two, and three sigma ellipses generated from that random data set. I should point out that, although we are dealing with a normal distribution, the one, two, and three sigma probabilities that we memorized for the third grade stats class (68%, 95.4%, 99.73%) no longer apply. In the case below, we are looking at a two-dimensional normal distribution.

I think this is a pretty cool plot
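For the curious, here is a small Python sketch of what the sigma probabilities turn into in two and three dimensions. It assumes the data really is multivariate normal, so that the squared distance from the center (measured in sigmas) follows a chi-squared distribution with one degree of freedom per dimension. The function name coverage_within_k_sigma is my own invention for this illustration.

```python
import math

def coverage_within_k_sigma(k, dims):
    """Probability that a normally distributed point falls inside the
    k-sigma interval (1-D), ellipse (2-D), or ellipsoid (3-D).

    Assumes a multivariate normal distribution; uses the closed-form
    chi-squared CDF for 1, 2, and 3 degrees of freedom evaluated at k**2.
    """
    if dims == 1:
        return math.erf(k / math.sqrt(2))
    if dims == 2:
        return 1.0 - math.exp(-k * k / 2)
    if dims == 3:
        return (math.erf(k / math.sqrt(2))
                - math.sqrt(2 / math.pi) * k * math.exp(-k * k / 2))
    raise ValueError("this sketch only handles 1, 2, or 3 dimensions")

for dims in (1, 2, 3):
    percents = [round(100 * coverage_within_k_sigma(k, dims), 1) for k in (1, 2, 3)]
    print(dims, percents)
# 1 [68.3, 95.4, 99.7]
# 2 [39.3, 86.5, 98.9]
# 3 [19.9, 73.9, 97.1]
```

So under these assumptions, a one sigma ellipse in two dimensions holds only about 39% of the points, and a one sigma ellipsoid in three dimensions holds about 20%.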

If we take that to three dimensions, as for L*a*b* data, we are fitting an ellipsoid to the scattering of the data points. Hence the term "ellipsoidification". The multi-dimensional standard deviations are the lengths of the three axes of the ellipsoid. Note that a by-product of ellipsoidification is the orientation of the ellipsoid, which could be useful. I used this to properly tilt the ellipses above. On production data, the orientation of the ellipsoid can help point to the major cause of the variation.

That was the second explanation.

The first explanation was for the logophiles, that is, mainly in terms of words. The second explanation was for the pictophiles, that is, mainly in terms of a visualization in one awesome graph. The third explanation is for the folks who might actually have to do something with this mess. It is the most useful, because it is algorithmic. It is also, perhaps, the most difficult to understand. At least for me. I dunno, Viggiano probably stared at that plot above for a few seconds and figgered out the algorithm. 

(Color scientist of the world today trivia fact number twelve: Steve Viggiano and John the Math Guy went to different high schools together.)

Here's the third explanation, very terse and incredibly dense: Ellipsoidification is done by computing (egad) the square root of the covariance matrix. This is not done by taking the square roots of the individual elements, as suggested in ASTM E2214. Instead, we find a matrix that, when multiplied by itself, yields the covariance matrix.

This is done by computing the singular value decomposition of the covariance matrix, and taking the square roots of the entries of the diagonal matrix. The lengths of the axes of the ellipsoid (multidimensional standard deviations) are the square roots of the entries of the diagonal matrix. The orientation of the ellipsoid can be determined from the square root of the covariance matrix: its axes point along the columns of the orthogonal matrix from the decomposition.
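For the folks who would rather read code than prose, here is a minimal sketch of that recipe in Python with NumPy. This is my illustration under stated assumptions, not production code: the function name ellipsoidify and the test data (standard deviations of 0.1 and 0.7, rotated by 45 degrees, loosely following the two-plot example above) are made up for the demonstration.

```python
import numpy as np

def ellipsoidify(points):
    """Return (axis_lengths, axis_directions, sqrt_cov) for an (n, d) array.

    axis_lengths    : multi-dimensional standard deviations, largest first
    axis_directions : d x d matrix whose columns point along the ellipsoid axes
    sqrt_cov        : matrix square root, so sqrt_cov @ sqrt_cov equals the covariance
    """
    points = np.asarray(points, dtype=float)
    cov = np.cov(points, rowvar=False)       # d x d covariance matrix
    u, s, _ = np.linalg.svd(cov)             # cov is symmetric, so cov = u @ diag(s) @ u.T
    axis_lengths = np.sqrt(s)                # square roots of the diagonal entries
    sqrt_cov = u @ np.diag(axis_lengths) @ u.T
    return axis_lengths, u, sqrt_cov

# Hypothetical demo data: uncorrelated cloud, then the same cloud tilted 45 degrees.
rng = np.random.default_rng(0)
upright = rng.normal(size=(2000, 2)) * np.array([0.1, 0.7])
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
rotation = np.array([[c, -s], [s, c]])
tilted = upright @ rotation.T

lengths_upright, _, _ = ellipsoidify(upright)
lengths_tilted, directions_tilted, root_tilted = ellipsoidify(tilted)

print(lengths_upright)   # roughly [0.7, 0.1]
print(lengths_tilted)    # same values, despite the strong x-y correlation
```

The point of the demo is that the tilted cloud, which would fail a correlation test, and the upright cloud, which would fail a ratio-of-standard-deviations test, come back with the same axis lengths.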

I think this is a novel finding. 

-- Now back to our data analysis, which is already in progress...

I applied ellipsoidification to the data in the TR 002 data set. For each of the 928 patches I got three values -- the lengths of the three axes, or in different words, the standard deviations in each of three directions.

The real question for us (remember back to criteria 3 and 4?) is not the magnitude of the axes of the ellipsoid, but rather the relative magnitudes of the three axes. If they are all close to the same size, then the appropriate number of degrees of freedom for the chi-squared function is three. If two are the same size and one of them is zero, then there are two degrees of freedom. But how big a difference in magnitude could be considered large enough to lose a degree of freedom?

As an example, let's say that the largest axis of the ellipsoid has a length of 1, and another axis has a magnitude of 0.333. By using the Pythagorean theorem on this example, the effect of that second dimension is only a 5% increase in ΔEab. Let's make a rule that this is considered negligible. By this somewhat arbitrary rule, if the shortest axis is less than a third of the length of the longest, then a three dimensional variation is effectively a two dimensional variation. If two of the axes are less than a third of the length of the longest, then the variation is effectively one dimensional, that is, along a line.
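The one-third rule is simple enough to sketch in a few lines of Python. The function name effective_dimensionality and the sample axis lengths are hypothetical, chosen only to illustrate the rule as stated above.

```python
import math

def effective_dimensionality(axis_lengths, threshold=1 / 3):
    """Count the axes that are at least `threshold` times the longest axis.

    Axes shorter than a third of the longest are treated as negligible,
    following the (admittedly arbitrary) rule described in the text.
    """
    longest = max(axis_lengths)
    if longest == 0:
        return 0  # no variation at all
    return sum(1 for length in axis_lengths if length >= threshold * longest)

# The Pythagorean check: a second axis of 0.333 adds only about 5%.
print(round(math.hypot(1.0, 0.333), 3))               # 1.054

# Made-up axis lengths for three patches.
print(effective_dimensionality([1.0, 0.9, 0.5]))      # 3
print(effective_dimensionality([1.0, 0.5, 0.2]))      # 2
print(effective_dimensionality([1.0, 0.333, 0.1]))    # 1
```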

This gives us a method by which to quantify the dimensionality of the variation in the TR 002 data set. I determined by this rule that 

     46.6% of the patches have three dimensional variation,
     40.2% of the patches have two dimensional variation, and 
       6.4% of the patches have one dimensional variation.

Wow. I think this makes sense. I would guess that the variation in the color of a patch is largely due to variation in the amount of ink transferred to the substrate, or variation in the dot gain of the ink. If a patch has predominantly only one ink, then I would expect that the variation is largely in the direction of the trajectory of that ink, that is to say, the variation is largely one dimensional. Similarly, two inks would have two dimensional variation, and three or more would have three dimensional variation. It would be an interesting research project to look at individual patches and see if the dimensionality of the variation correlates with the number of inks present, but that's a different question.

Today we are pondering what the statistical distribution of color difference values looks like. From this analysis, it would appear that the variation from the TR 002 data set should be chi-squared with something less than 3 degrees of freedom. The exact number of degrees of freedom depends on the number of inks, and their relative proportion. Note that, while the chi-squared distribution makes intuitive sense only when the degrees of freedom parameter is an integer, it is still defined for non-integer values. Thus, a C40, M40, Y10 patch may exhibit 2.3 degrees of freedom. 
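To see that a fractional number of degrees of freedom is perfectly legal, here is a sketch of the chi-squared probability density written directly from its textbook formula, using only the Python standard library. The function name chi2_pdf is mine, and the value 2.3 is simply the illustrative figure from the paragraph above, not a measured result.

```python
import math

def chi2_pdf(x, dof):
    """Chi-squared probability density at x for any positive dof.

    The gamma function is defined for non-integer arguments, so dof
    does not need to be an integer. For simplicity this sketch returns
    0 for x <= 0, which glosses over the behavior at x = 0 when dof < 2.
    """
    if x <= 0:
        return 0.0
    half = dof / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))

for dof in (2, 2.3, 3):
    print(dof, round(chi2_pdf(1.0, dof), 4))
# 2 0.3033
# 2.3 0.2929
# 3 0.242
```

The density for 2.3 degrees of freedom slots in neatly between the curves for 2 and 3, which is what you would hope for.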

This is a similar conclusion to that of Nadal et al. They conjectured that the deviation from chi-squared distribution with three degrees of freedom was due to a violation of criterion #4. They used this to justify the use of the number of degrees of freedom as a regression parameter. They report determining a value of 2.4 for the degrees of freedom in one of their data sets. They did not consider the possibility of a violation of criterion #3, but I showed that this would have led them to the same course of action.

So, failures of criteria #3 and/or #4 can be seen in the data. This means that a chi-squared distribution of degree less than or equal to 3 is appropriate.


There are a number of significant findings in this. 

1. I have demonstrated that, at least for one data set, the distribution of the square of color difference (ΔEab) values does not follow a chi-square distribution with three degrees of freedom. I have tracked the cause of the failure down to the fact that the variation in L*a*b* is somewhat less than three dimensional. 

2. I have conjectured that the dimensionality of the variation (in printing) is connected to the number of inks that are present in the overprint. The data is there, but I have not done the analysis to verify this. Ideally such an analysis would allow one to use the halftone percentages of a given color to predict the value for the degrees of freedom in the chi-square distribution. Alternatively, it could be determined from the data when characterizing the process.

3. For SPC, it is normally the case that the target color is dictated from above. This means that ΔL*, Δa*, and Δb* will usually not have zero mean.

4. I have proposed that for SPC, color variation can be modeled with ΔL*, Δa*, and Δb* being normally distributed.

5. If you combine points 2, 3, and 4, you have the distribution of ΔEab, or more precisely, of its square. It is a non-central chi-squared distribution with three or fewer degrees of freedom.

6. I have described a multi-dimensional extension of the standard deviation, which I call ellipsoidification. This can be used to determine the shape of the cloud of variance in any dimensional space. While the immediate application to color is apparent, the method could apply equally well to 30-some dimensional spectra or to the analysis of the variation in the price of a collection of ten stocks.

7. This analysis was carried out with the ΔEab color difference formula. I am going to conjecture that all this will work with ΔE 2000, or any other color difference formula that you choose. After all, the other color difference formulas are all about ellipses. What could go wrong? Well, except for that pesky ΔE 2000 formula.

This is a work in progress. <Cue Man of La Mancha.> In my vision of SPC for ΔE, the appropriate distribution will be used - perhaps the non-central chi-squared distribution with something less than 3 degrees of freedom. (To dream, the impossible dream...) The amount of non-centrality and number of degrees of freedom will be discerned from the characterization of a process to arrive at the 99.75th percentile ΔE 2000. (To fight the unbeatable foe...) During production, this will be used as the upper control limit, but also - once enough production data has been received - the same techniques will be used to provide additional diagnostics. (This is my quest, to follow that star...)

But I have a lot of work to do before all these pieces fall into place. (No matter how hopeless, how near or how far...)

To Dream the Impossible Dream!

I am I, John the Math Guy,
the guy with the slide rule...

Ok... maybe I got a little melodramatic there. Sorry.


[1] Dolezalek, Friedrich, Appraisal of production run fluctuations from color measurements in the image, TAGA 1994

[2] McDowell, David, Statistical distribution of DeltaE, pdf dated Feb 20, 1997

[3] Anonymous, Kodak Q-60 color input targets, Kodak tech paper, June 2003

[4] Viggiano, J A Stephen, Statistical distribution of CIELAB color difference, June 1999

[5] Nadal, Maria, C. Cameron Miller, and Hugh Fairman, Statistical methods for analyzing color difference distributions

[6] ASTM E2214-02, Standard practice for specifying and verifying the performance of color-measuring instruments

[7] Snedecor, George, and William Cochran, Statistical Methods, 7th Ed., Iowa State Press, 1980, p. 492