Non-Square Pixel Aspect Ratio Derivation

By Chris Pirazzi. Thanks to Charles Poynton and Andy Walls for some insights on standard def square luma sampling frequencies.

Lurker's Guide Trivia Contest! If you know a way to derive the 12 3/11 MHz and 14.75 MHz industry-standard luma sampling rates, or you know some historical clues as to where they came from, then please send me some mail! The prize is your name in lights on the top of this page. Well, ok, your name on the top of this page. And maybe some other pages too!

On this page you can find a set of points and ideas from various sources that may one day help us reach a final conclusion on the origin of 12 3/11 MHz and 14.75 MHz.


We begin with an email thread with Charles Poynton, the author of a couple of excellent books about digital video. In it, Chris Pirazzi demonstrates that even with death-defying amounts of handwaving in interpreting the standard-def analog video specs, it is still not possible to derive the de facto standard square-pixel luma sampling frequencies used in the industry for analog video or, equivalently, the pixel aspect ratio of non-square pixels.

|From cpirazzi Fri Jan 17 14:55:21 1997
|From: cpirazzi@cp (Chris Pirazzi)
|Date: Fri, 17 Jan 1997 14:55:21 -0800
|To: poynton@poynton.com
|Subject: square pixel sampling frequencies
|
|Hi,
|
|I'm on a quest to find the "deep, dark truth" behind the square pixel
|sampling frequencies used for sampling 525/59.94 and 625/50 video.
|
|In your book you mention "No SMPTE standard addresses square pixel
|sampling of 525/59.94 [60M] video.  I recommend using a sampling rate
|of 780FH, that is, 12 3/11 MHz...." (p.216).
|
|I too have seen hardware use 12 3/11 MHz.  Can you show me how this
|number is derived?  Can it be derived from values in specifications?
|
|Your book does not have a comparable suggestion for 625/50 video,
|but the devices I've seen use 14.75 million pixels per second.
|
|Same question: Can you show me how this number is derived?  Can it be
|derived from values in specifications?
|
|--
|
|In theory, we should be able to derive the 525/59.94 [60M] number from
|ANSI/SMPTE 170M:
|
|1. Section 3.3: "The aspect ratio of the active picture area
|   shall be four units horizontally to three units vertically"
|
|2. Section 11.1: "fsc = 5 MHz * 63/88" (subcarrier frequency)
|
|3. Section 11.2: "fh = 2/455 * fsc" (line frequency)
|
|4. from #2 and #3, derive line period LP as 572/9 us exactly
|
|5. guess at this definition for "active picture area" from #1:
|  
|     vertically: Figure 7: The active lines:
|                 F1: 243 lines from line 21 to line 263 inclusive
|                 F2: 243 lines from line 283 to line 525 inclusive
|                 So the picture consists of 486 lines.
|
|     horizontally: Figure 7 and Table 2: The active line period ALP
|                   The period from "Blanking End" to "Blanking Start," or:
|                      ALP = (LP-(1.5 + 9.20)) us = 4757/90 us
|
|6. since the picture has 486 lines vertically and its active region
|   has a 4:3 aspect ratio, then there should be 486*(4/3) = 648 samples
|   within the horizontal active region.
|
|7. so since the horizontal active region has 648 samples and lasts
|   ALP us, then there must be 648/ALP samples per us, which is:
|
|   648 / (4757/90) = 58320 / 4757 MHz = 12 1236/4757 MHz
|
|which is not equal to 12 3/11 MHz.
|
|obviously the weakness of this derivation is step #5: ANSI/SMPTE 170M
|is fuzzy about _what_ region of the signal has a 4:3 aspect ratio!
|
|where does 3/11 come from ?!
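
The arithmetic in steps 2 through 7 of that email can be checked with exact rational numbers. The following is only a sketch: the variable names are made up for illustration, and the blanking figures (1.5 us and 9.20 us) are simply the ones quoted in step 5, which the email itself flags as a guess at what "active picture area" means.

```python
from fractions import Fraction

# Frequencies are in MHz and times in microseconds throughout.
fsc = 5 * Fraction(63, 88)            # step 2: subcarrier frequency
fh = Fraction(2, 455) * fsc           # step 3: line frequency
line_period = 1 / fh                  # step 4: total line period

# Step 5 (a guess, per the email): active line = total line minus blanking.
blanking = Fraction(3, 2) + Fraction(92, 10)
active_period = line_period - blanking

active_lines = 2 * 243                # two fields of 243 active lines
samples_per_active_line = active_lines * Fraction(4, 3)    # step 6

fs_derived = samples_per_active_line / active_period       # step 7
fs_industry = Fraction(135, 11)                            # 12 3/11 MHz

print(line_period)                    # 572/9
print(active_period)                  # 4757/90
print(fs_derived)                     # 58320/4757
print(float(fs_derived), float(fs_industry))
```

Using `Fraction` rather than floats keeps the comparison exact, so the mismatch with 135/11 cannot be blamed on rounding.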

----------------------------------------------------------------------

|From Poynton@Poynton.com  Sat Jan 18 10:50:52 1997
|
|> I'm on a quest to find the "deep, dark truth" behind the square pixel
|> sampling frequencies used for sampling 525/59.94 [60M] 
|> and 625/50 video.
|
...
|
|Get a copy of a SMPTE RP [187], "Picture centering and aspect ratio" ...
...
|
|> 5. guess at this definition for "active picture area" from #1: ...
|
| This guessing is what the new RP is supposed to address, but its numbers
| preclude an exact integer relationship. 12 3/11 MHz is very close; I can't
| remember whether 779 or 781 is closer to the RP's nominal, but 780 fh is
| sensible - divisible by 4, even! - so I recommend it.

----------------------------------------------------------------------

(further nagging Charles)

|From: cpirazzi@cp (Chris Pirazzi)
|-
|
|now we've both read SMPTE RP 187 and we've seen that it specifies yet
|another 525/59.94 [60M], 4:3 aspect, 2:1 interlace pixel aspect ratio,
|160/177.  this implies a square-pixel horizontal sampling frequency of
|
|  13.5 MHz * (160/177) = 4320 / 354 = 12 12/59 MHz
|
|and it implies
|
|  (12 12/59 MHz) * (572 / 9 us) = 2471040 / 3186 = 775 35/59
|
|square pixels per total line.
|
|--
|
|the industry, however, seems to be using a horizontal sampling frequency
|of 12 3/11 MHz, which implies:
|
|  (12 3/11 MHz) / (13.5 MHz) = 10/11 (x/y)
|
|pixel aspect ratio, and which implies:
|
|  (12 3/11 MHz) * (572 / 9 us) = 77220 / 99 = 780
|
|square pixels per total line.
|
|--
|
|[it seems that the RP will be ignored] and will serve only to confuse matters
|
|and what's worse, I still don't know the answer to my original question :)
|which was:
|
|Q1: where does 12 3/11 MHz come from ?!
|
|- clearly, RP 187 cannot be used to derive it.
|
|- possibly, one may be able to derive it with a suitable definition of
|  "active picture area" in step 5 above.
|
|- presumably, whoever chose 12 3/11 wanted some value that would yield
|  an integral number of total samples per line.
|
|- probably, some hardware engineer at Philips chose 12 3/11, as opposed
|  to some other value that would also yield an integral number of total
|  samples per line, because it happened one day that the 12 3/11 MHz
|  crystal oscillator that he needed for his prototype board was a few 
|  cents cheaper than the other possible oscillators.
|
|it's this last step which I have never gotten any solid data on.
|
|who was it who first chose 12 3/11 ?  why did he/she choose that?
|
|--
|
|SMPTE RP 187 introduces two new questions:
|
|Q2: where does the SMPTE RP's 160/177 come from?
|
|Q3: after going through all the effort to specify 160/177, why does RP
|187 informative Annex A part A.4 suggest resampling 525/59.94 [60M]
|square pixels to nonsquare using a ratio of 11:10?
|
|--
|
|Then, we switch over to 625/50, 4:3 aspect, 2:1 interlace video.
|
|  LP = (1000000 us/second) / ( (25 frames/second) * (625 lines/frame) ) 
|     = 64 us
|
|--
|
|In this case, SMPTE RP 187 specifies a pixel aspect ratio of
|1132/1035, which is just a little bit silly.  This implies a
|square-pixel horizontal sampling frequency of:
|
|  13.5 MHz * (1132/1035) = 30564 / 2070 = 14 88/115 MHz
|
|and it implies
|
|  (14 88/115 MHz) * (64 us) = 108672 / 115 = 944 112/115
|  
|square pixels per total line.
|
|--
|
|Instead, the industry seems to be using a horizontal square sampling
|frequency of 14.75 MHz, which implies:
|
|  (14.75 MHz) / (13.5 MHz) = 59/54 (x/y)
|
|pixel aspect ratio, and which implies:
|
|  (14.75 MHz) * (64 us) = 3776 / 4 = 944
|
|square pixels per total line.
|
|--
|
|We are left with the same questions as 525:
|
|Q4: where does the industry 14.75 MHz come from?
|
|Q5: where does the SMPTE RP's 1132/1035 come from?
|
|	- Chris Pirazzi

From this thread, we can at least conclude that the industry-standard ratios were designed to yield an integral number of luma sampling instants per line. But as to where they actually came from: who knows?
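
The numbers in that second email can be reproduced the same way. This sketch assumes, as the email does, that a pixel aspect ratio maps to a square-pixel sampling rate by multiplying it by 13.5 MHz; the helper name `square_pixel_stats` and the dictionary labels are hypothetical.

```python
from fractions import Fraction

REF_MHZ = Fraction(27, 2)             # 13.5 MHz non-square reference rate

def square_pixel_stats(pixel_aspect, line_period_us):
    """Return (square-pixel sampling rate in MHz, samples per total line)."""
    fs = REF_MHZ * pixel_aspect
    return fs, fs * line_period_us

LP_525 = Fraction(572, 9)             # us, from the first email
LP_625 = Fraction(10**6, 25 * 625)    # us, 25 frames/s * 625 lines/frame

cases = {
    "525 RP 187": (Fraction(160, 177), LP_525),
    "525 industry": (Fraction(10, 11), LP_525),
    "625 RP 187": (Fraction(1132, 1035), LP_625),
    "625 industry": (Fraction(59, 54), LP_625),
}

results = {name: square_pixel_stats(par, lp)
           for name, (par, lp) in cases.items()}

for name, (fs, per_line) in results.items():
    print(f"{name}: fs = {fs} MHz, {per_line} samples per total line")
```

Only the two industry ratios come out as whole numbers of samples per total line (780 and 944); the RP 187 ratios leave fractional remainders.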


Now we jump to a 27 May 2008 contribution by reader Andy Walls.

Andy wishes to emphasize that almost all of the following is pure speculation and train-of-thought. But there seem to be some helpful new observations in here:

|Q1: where does 12 3/11 MHz come from ?!
|

|- presumably, whoever chose 12 3/11 wanted some value that would yield
|  an integral number of total samples per line.

Well, yeah.  The idea was probably to keep the sampling frequency
phase locked or aligned with the line frequency as that makes life
easy. The result is, as you note, an integral number of samples per
line.  See below.


|- probably, some hardware engineer at Philips chose 12 3/11, as opposed
|  to some other value that would also yield an integral number of total
|  samples per line, because it happened one day that the 12 3/11 MHz
|  crystal oscillator that he needed for his prototype board was a few
|  cents cheaper than the other possible oscillators.

This is very unlikely.  12 3/11 MHz looks very deliberate, especially
when one observes that 12 3/11 = 1080/88 and figures out its
connection to the FCC's 63/88 * 5 MHz chroma subcarrier specification.
See below.


|it's this last step which I have never gotten any solid data on.
|
|who was it who first chose 12 3/11 ?

I have no clue.  I couldn't even find a copy of a standard that
documents it.  Although one 'net source says SMPTE 244M does.


| why did he/she choose that?

That part I think I have mostly figured out.  Here are design
constraints that I backed out of the number based on properties I
observed and some research of some old IRE Proceedings.

To summarize, I think the design criteria were something like these:

1. The sampling frequency should be such that it is phase aligned/locked
with the NTSC line frequency of Fh = 4.5 MHz/286 ~= 15.73426 kHz

2. The sampling frequency should be such that the number of luma samples
for the active part of an NTSC line (52.65556 us out of 1/Fh = 63.5556
us) is as close as possible to 640 samples without going below 640
samples.

3. The sampling frequency should be such that it is a small integer
multiple of a frequency that is no lower than the chroma subcarrier
frequency fc = Fh * 455/2 = 63/88 * 5 MHz = 3.579 MHz that encompasses
most of the luma signal bandwidth.

4. The sampling frequency should be such that it is simply related to
the chroma subcarrier and line frequency, and perhaps maintains the
properties used for selection of the original chroma subcarrier (closer
analysis needed here).



So as for explanations:

1. You understand this one.  Basically it's nice to have an integral
number of luma samples per line.  In this case:

1/Fh * 12 3/11 MHz = 286/4.5 MHz * 12 3/11 MHz = 780

4.5 MHz is the sound carrier of the old B&W television standard which
could not be changed to have the old B&W sets receive sound properly
with the new color standard.  But the color subcarrier needed to be in
particular relation to the sound carrier and the line rate to reduce the
visibility of beats between these frequencies, so the line frequency was
changed [1].  Since the B&W line freq of 15.75 kHz had a 285th harmonic
at 4.489 MHz and a 286th harmonic 4.5045 MHz, and the 286th being closer
to 4.5 MHz, 286 was chosen as the scale factor to derive the new color
line rate from the 4.5 MHz sound carrier.


2. Given that the square pixel device of the period was a VGA monitor of
640x480 pixels, as VGA was introduced by IBM in 1987, it seems
reasonable to assume that 640 was the target pixel width of the active
part of a line.

The active part of the NTSC line at a sample rate of 12 3/11 MHz is

(1/Fh - 10.9 us) * 12 3/11 MHz = (286/4.5 MHz - 10.9 us) * 12 3/11 MHz =
646.22727

Pretty close to 640, with ~3 pixels of active video lost on each of the
left and right edges.


3. & 4. The chroma subcarrier freq is

fc = 4.5 MHz/286 * 455/2 ~= 3.579 MHz

by design for reasons cited in [1] and [2].

It is important to note that when factored, this can be written as

fc = 5 MHz * (3*3)/(2*5) * (1)/(2*11*13) * (5*7*13)/2

And note that 455 was chosen because it was made up of small odd factors
for various benefits and reasons listed in [1] and [2].

Canceling all the terms one gets:

fc = 5 MHz * 63/88

and this is the form the FCC used in its rules.


Now we can make the observation that:

12 3/11 MHz = fc * (8/7) * 3 = fc * 24/7

Expanding

12 3/11 = 5 MHz * (3*3)/(2*5) * (1)/(2*11*13) * (5*7*13)/2 * (2*2*2)/7 *3

Note that the 7 cancels out the last odd factor introduced to produce
"frequency interleaving" mentioned in [1] and [2].  This may not be a
good thing - I'm not sure.

Note that 8/7 gives a "basic" highest frequency of 4.0909 MHz, which is
close to the maximum frequency of 4.2 MHz of the luminance signal, and
slightly above fc, but exactly cancels out factors in both the numerator
and denominator of fc, to keep the relationship between 12 3/11 and fc
and the line frequency simple.

Note that the 3 gives a multiple of the basic frequency that is greater
than the Nyquist rate for sampling the basic frequency and thus for
sampling the chroma and probably luma.


Does that sound close to a reasonable, original rationale, or is there
too much hand waving?

Regards,
Andy

References:

[1] Abrahams, I. C., "Choice of Chrominance Subcarrier Frequency in the
NTSC Standards", Proceedings of the I-R-E, January 1954, pp 79-80

[2] Abrahams, I. C., "The 'Frequency Interleaving' Principle in the NTSC
Standards", Proceedings of the I-R-E, January 1954, pp 81-83

[3] Blinn, James F., "Jim Blinn's Corner: The World of Digital Video",
IEEE Computer Graphics and Applications, September 1992, pp 106-112

Note that some of the relevant information from the 1954 Abrahams papers that Andy referenced ([1] and [2]) is also contained in a short section of the Wikipedia page on NTSC.
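
The line-rate arithmetic in Andy's explanations 1 and 2 is easy to check with exact fractions. This is a sketch, not anything Andy supplied: the names are invented here, the search range 280 to 291 is an arbitrary window around the two harmonics he mentions, and 10.9 us is his nominal horizontal blanking figure.

```python
from fractions import Fraction

SOUND_CARRIER_MHZ = Fraction(9, 2)        # 4.5 MHz sound carrier
BW_LINE_MHZ = Fraction(15750, 10**6)      # 15.75 kHz monochrome line rate
FS_MHZ = Fraction(135, 11)                # 12 3/11 MHz

# Which harmonic of the monochrome line rate lands closest to 4.5 MHz?
harmonic = min(range(280, 292),
               key=lambda n: abs(n * BW_LINE_MHZ - SOUND_CARRIER_MHZ))

fh = SOUND_CARRIER_MHZ / harmonic         # color line rate in MHz
line_us = 1 / fh                          # total line period in us
samples_per_line = FS_MHZ * line_us

def active_samples(hbi_us):
    """Samples falling outside a horizontal blanking interval of hbi_us."""
    return (line_us - hbi_us) * FS_MHZ

nominal = active_samples(Fraction(109, 10))
lost_per_edge = (nominal - 640) / 2       # if 640 pixels are kept, centered

print(harmonic)                           # 286
print(samples_per_line)                   # 780
print(float(nominal), float(lost_per_edge))
```

The last line prints the roughly 646 active samples and the roughly 3 samples lost per edge that Andy describes.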

Andy followed this up on 28 May 2008 with:


The more I think about it the more I want to refine the details of the
rationale for 3 & 4.

Ultimately I think some motivation was to be able to pick off a
frequency easily from the frequency dividers that already had to
generate the chroma subcarrier freq anyway.  Also the intent had to be
sampling the highest luma freq (4.2 MHz) by a factor >= 2.  The chroma
freq probably wasn't in the constraint; just a convenient source.
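
Andy's factorization, and the "sample the highest luma frequency by a factor >= 2" idea, can be checked in a few lines. The sketch below only restates numbers already given in his messages (the factored form of fc, the 8/7 and 3 multipliers, and the 4.2 MHz luma limit); the variable names are illustrative.

```python
from fractions import Fraction

# fc in MHz, written in the factored form from Andy's message.
fc = 5 * Fraction(3 * 3, 2 * 5) * Fraction(1, 2 * 11 * 13) * Fraction(5 * 7 * 13, 2)

basic = fc * Fraction(8, 7)           # the "basic" frequency, about 4.09 MHz
fs = basic * 3                        # proposed luma sampling rate
fh = Fraction(9, 2) / 286             # line frequency, 4.5 MHz / 286
LUMA_MAX_MHZ = Fraction(42, 10)       # 4.2 MHz luminance bandwidth

print(fc)                             # 315/88, i.e. 5 MHz * 63/88
print(fs)                             # 135/11, i.e. 12 3/11 MHz
print(fs / fc)                        # 24/7
print(fs / fh)                        # 780
print(float(basic))
print(fs >= 2 * LUMA_MAX_MHZ)         # True
```

The ratio 24/7 shows directly how the factor of 7 in 455 cancels, which is the property Andy was unsure about.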


Putting together all the ideas above:

To really continue this line of reasoning to see if it pans out, we'd have to write a little program to try every possible value of the luma sampling frequency that satisfies (A), (B), and (D) (where we allow a wide range of interpretations of "active" region according to the ambiguities of the NTSC spec), and show that the value 12 3/11 MHz gives us a luma sampling frequency with the simplest possible relationship to the chroma frequency.

Anyone up for that?

Or it's possible that some other reason we have yet to guess is what led the designer to choose 12 3/11. Any ideas?


Wouldn't you know it...just as I was writing the text above, Andy Walls made just such a spreadsheet:

So if the goals for picking a pixel sampling frequency, fs, include

1. approximately 640 active pixels per line
2. an integer multiple, N, of the horizontal line rate fh
3. most easily derived from the chroma freq, fc

and since fc = fh * 455/2, it seems reasonable (but not obvious to me
why) to try and pick N with as many factors in common with 455 as
possible, while getting reasonably close to 640 pixels in the active
portion of a line.

The factors of 455 are 5, 7, and 13.

The values of N, that yield close to 640 pixels, and have more than one
factor in common with 455 are

735 = 3*5*7*7    => 11.5647 MHz => ~609 pixels  => -4.85% diff from 640
770 = 2*5*7*11   => 12.1154 MHz => ~638 pixels  => -0.32% diff from 640
780 = 2*2*3*5*13 => 12.2727 MHz => ~646 pixels  => +0.97% diff from 640
805 = 5*7*23     => 12.6661 MHz => ~667 pixels  => +4.21% diff from 640

So if those are the candidates, 12.1154 MHz and 12.2727 MHz are the two
best.  Since "not the horizontal blanking interval" isn't the best
definition of the active part of a line, I suspect 12.2727 MHz was
preferable, knowing that some of the 646 pixel times wouldn't actually
be visible.

I suppose one can quibble about what is the active portion of an NTSC
line, but the Horizontal Blanking interval (HBI) is 10.9 usec +/- 0.2
usec in every book I have.  So I had the spreadsheet compute the active
region of a line as "not HBI", for HBI of 10.7, 10.9, and 11.1 usec.

Download square-pixel-clock-options.ods (OpenDocument spreadsheet for OpenOffice/StarOffice/...)

Download square-pixel-clock-options.xls (Excel spreadsheet)

Download square-pixel-clock-options.pdf (PDF format: read-only)
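
For readers who would rather not open a spreadsheet, the same kind of search can be sketched in code. This is not a transcription of Andy's spreadsheet. It assumes a search range of N from 700 to 849 and a cutoff of 5% away from 640 active pixels, both chosen here so that the nominal 10.9 us case reproduces the four candidates listed above; the function names are hypothetical.

```python
from fractions import Fraction

FH_MHZ = Fraction(9, 2) / 286         # NTSC line frequency, 4.5 MHz / 286
LINE_US = 1 / FH_MHZ                  # total line period, 572/9 us
FACTORS_OF_455 = (5, 7, 13)
TARGET = 640

def shared_factors(n):
    """Prime factors of 455 that also divide n."""
    return [p for p in FACTORS_OF_455 if n % p == 0]

def candidates(hbi_us, tolerance=Fraction(5, 100)):
    """Rows (N, fs in MHz, active pixels, fractional diff from TARGET)."""
    active_us = LINE_US - hbi_us
    rows = []
    for n in range(700, 850):         # assumed search range
        if len(shared_factors(n)) < 2:
            continue
        fs = n * FH_MHZ
        active_pixels = fs * active_us
        diff = (active_pixels - TARGET) / TARGET
        if abs(diff) <= tolerance:
            rows.append((n, fs, active_pixels, diff))
    return rows

for hbi in (Fraction(107, 10), Fraction(109, 10), Fraction(111, 10)):
    print(f"HBI = {float(hbi)} us")
    for n, fs, pixels, diff in candidates(hbi):
        print(f"  N={n}  fs={float(fs):.4f} MHz  "
              f"active={float(pixels):.1f}  diff={float(diff) * 100:+.2f}%")
```

Changing the tolerance or the blanking interval changes which values of N survive, which is one way to see how much the result depends on the definition of the active region.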


Another comment from Andy Walls about the precedent of VGA computer graphics cards led me (Chris Pirazzi) to go down the following path. Unfortunately this path didn't lead me to 12 3/11 MHz, but perhaps another reader can find some inspiration here.

Andy said:

I thought I'd mention some observations about VGA: These reasons together lead me to say that 640 pixels was the target count for active pixels per line.
Yup that's all true. I think the huge precedent of the already-existing 1987 VGA display card standard being 640x480 is already enough to explain why the computer video input card designer was choosing 640. The reasons (i) and (ii) you gave are perhaps what had motivated IBM to choose 640x480 for CGA/EGA/VGA in the first place. The divisibility (iii) is certainly a plus on the software side too (it became even more relevant there when JPEG and MPEG came along).

Your observation raises an important point I'm surprised I hadn't thought of before. Early personal computers like the Radio Shack TRS-80s, Commodore/Amiga, Atari 2600, Apple II, and early IBM PCs with CGA/EGA adapters were designed to hook up to consumer TVs so they very much cared about the details of NTSC scanning.

It wasn't until later graphics standards that it became widely accepted that we needed to buy separate "computer" monitors for computers rather than TVs. That is, it wasn't until later that the hardware details of the graphics card became divorced from the details of NTSC scanning. For some transitional cards like CGA, software developers had to assume that their customers might have either an NTSC composite monitor/TV or a digital RGBI monitor:

http://en.wikipedia.org/wiki/640%C3%97480#Evolution_of_standards

http://en.wikipedia.org/wiki/Computer_display_standard#Standards

I wonder whether, if we look back into the history of these early computer display devices, all of which greatly predated video input cards, we will find 12 3/11 MHz somewhere.

Perhaps the designer of the first video digitizer card was simply copying 12 3/11 from somewhere else.

Let's see if this pans out...

--

CGA (1981) was 640x200 and was compatible with regular TVs:

http://en.wikipedia.org/wiki/Color_Graphics_Adapter

You could plug in either a composite (NTSC) monitor or a specially-built digital RGBI monitor. (CGA chose 200 in part because they knew the NTSC TV would be interlaced and they wanted to avoid or simplify dealing with field flicker!)

So far so good, but in this case we don't get any 12 3/11 MHz joy, because:

Hmm.

--

EGA (1984) was 640x350:

http://en.wikipedia.org/wiki/Enhanced_Graphics_Adapter

I believe this was the first IBM family card that could not be hooked up to an NTSC monitor: this was the first time you were forced to buy a specially-built computer monitor.

So this wouldn't help us link 640 to square-pixel scanning of an NTSC signal.

--

VGA (1987) was 640x480

http://en.wikipedia.org/wiki/Video_Graphics_Array

Same deal: VGA required a special monitor.

So this wouldn't help us link 640 to square-pixel scanning of an NTSC signal.

--

Amiga 2000 (released 1986) and Video Toaster first gen (released 1990)

http://en.wikipedia.org/wiki/640%C3%97480#Evolution_of_standards

The 640×400i resolution (720×480i with borders disabled) was first introduced by home computers such as the Commodore Amiga and (later) Atari Falcon. These computers used interlace to boost the maximum vertical resolution....The advantage of a 720×480i overscanned computer was an easy interface with interlaced TV production, leading to the development of Newtek's Video Toaster. This device allowed Amigas to be used for CGI creation in various news departments (example: weather overlays), drama programs such as NBC's seaQuest, WB's Babylon 5, and early computer-generated animation by Disney for the Little Mermaid, Beauty and the Beast, and Aladdin.

http://en.wikipedia.org/wiki/Video_Toaster

The Toaster was released as a commercial product in December 1990 for the Commodore Amiga 2000 computer system, taking advantage of the video-friendly aspects of that system's hardware to deliver the product at an unusually low cost of 2399 USD. The Amiga was unique among personal computers in that its system clock at 7.16 MHz was precisely double that of the NTSC color carrier frequency, 3.579 MHz, allowing for simple synchronization of the video signal.

http://en.wikipedia.org/wiki/Amiga#Custom_chipset

The Amiga [graphics] chipset can genlock — adjust its own screen refresh timing to match an NTSC or PAL video signal. When combined with setting transparency, this allows an Amiga to overlay an external video source with graphics. This ability made the Amiga popular for many applications, and provides the ability to do character generation and CGI effects far more cheaply than earlier systems

http://en.wikipedia.org/wiki/Video_Toaster

Aside from simple fades and cuts, it had a large variety of character generation, overlays, and complex animated switching effects. These effects were in large part performed with the help of the native Amiga graphics chipset which were synchronized to the NTSC video signals; the result being that while the Toaster was rendering a switching animation the computer desktop display would not be visible

Ok this is all promising, especially since the Video Toaster itself had video inputs, though at its core the Video Toaster was really an expansion card for the Amiga 2000 with BNC video in/out connectors and it acted like a video switcher---the actual video pixels did not go into the computer in any software sense or ever cross the bus of the expansion card.

So again, I don't see how this could link back to 12 3/11 MHz square sampling of NTSC. I could never find much information about this mysterious 720 pixel wide Amiga overscanned mode, but combining all the tidbits above, one can conclude that the computer-generated characters and CGI effects were probably done at a resolution of 720 wide. It's clear that 720 pixels scanned across the whole NTSC active region with 480 lines are not going to be anywhere near square. Yes, the software designers needed to know just how non-square 720 pixels were so they could render circles that look circular, and they also probably knew exactly how those 720 pixels would map onto the NTSC timing of the input and output video signals (because they were tightly bound to one specific piece of Amiga hardware and so could know its behavior precisely), so the Video Toaster overall would have to have made the same kind of definitive judgment about the proper luma sampling frequency to get square pixels. But because they were working in 720, it's unlikely that we will find any precedents for 12 3/11 MHz in there.

--

Hmm, it seemed so promising. But maybe 12 3/11 MHz came from some other historical source.


Got another idea? Let us know!

All text and images copyright 1999-2015 Chris Pirazzi unless otherwise indicated.