How high is up? Seriously, in order to measure an error rate you need to know both the value of what you are measuring and the accuracy of your measurement procedure. Since you rarely know the exact value, you typically make multiple measurements (each of which has an error interval), take the average and compute the statistical confidence interval around the average of those measurements. This allows you to report that the value is X +/- Y with a 95% CI. That is frequently (erroneously) interpreted as saying the actual value is between X-Y and X+Y 95% of the time (and the other 5% of the time the true value is outside the X +/- Y interval). That’s actually bunk. /1/ This assumes a bell curve (normal distribution), which is not always true. /2/ The error interval, Y, is actually related to the measurement PROCEDURE and is established a priori (before the measurements are taken). To figure out the error interval, you need to understand your tools - for example, let's measure a meter stick in inches. I know the actual value, it's 39.37". If I make my measurements using a yardstick (36") with typical 1/16th marks, the correct answer is 39 5.92/16ths. But I have to admit that my measurement procedure sucks (I have to move the yardstick and that introduces uncertainty of at least a 1/16th). And then there is the 1/16th accuracy of both the 36" and the 3.37" measurement. So my error interval is 3/16ths (these add as that's the worst case). If I take 10 measurements I would EXPECT that they should average to just under 39 6/16ths (but +/- 3/16 as averaging doesn’t reduce the error). This looks horrible (so a meter is somewhere between 39 3/16 and 39 9/16? That's useless - which is why most people use statistics to define a 'better' looking result). Now if I make my measurements using a 4 foot stick, I don't have the reposition error and I only have a single error related to the marked intervals on the stick.
With that 1/16" stick some measurements will be 39 5/16 and some 39 6/16. My error interval is +/- 1/16th, so my result of 39 6/16 +/- 1/16" looks a lot better, like I did a better job. Don't fool yourself, it's just better, more appropriate tools. With good tools, e.g. an electronic multimeter, the error interval is actually part of the specifications, either +/- 0.0005 or +/- one count in the last digit, etc. In situations where you don't know the actual value, sometimes you fall back on repeatability. The math also works the other way: if I know the exact value and have a number of measurements, I can calculate the error interval (but this is typically where people throw it into a stats program, calculate the statistical confidence interval and present THAT as the measurement accuracy). So to bring this back to the coin grading realm, you would need to establish the actual grade of a particular coin and then conduct multiple measurements (submissions to a TPG). You would need to do this with enough different coins (series, date, strike, etc.) to be a truly representative sample and against several TPGs. This doesn’t happen because /a/ nobody agrees on the actual grade, /b/ there isn't enough money to perform the tests and /c/ the more time that elapses in conducting the tests, the more likely something changes (e.g. new grader) that makes the results invalid anyway… At best, for time & money reasons, people submit a coin to a couple of TPGs and say “X” sucks because they were different from “Y”, whose answer I like.
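For anyone who wants to play with the meter stick numbers, here is a rough Python sketch of the arithmetic in the post above. It only illustrates the contrast between a worst-case error interval (set by the tools) and a statistical confidence interval (computed from the readings). The ten readings are made up, the function names are my own, and the 1.96 multiplier assumes the normal distribution that the post warns is not always appropriate.

```python
from fractions import Fraction
from statistics import fmean, stdev

SIXTEENTH = Fraction(1, 16)  # smallest mark on the stick, in inches


def worst_case_error(*sources):
    """Worst-case error interval: every error source adds in full."""
    return sum(sources, Fraction(0))


# Yardstick: moving the stick, reading the 36" part, reading the 3.37" part.
yardstick_error = worst_case_error(SIXTEENTH, SIXTEENTH, SIXTEENTH)
# 4 foot stick: a single reading, no repositioning.
four_foot_error = worst_case_error(SIXTEENTH)


def mean_and_ci95(readings):
    """Mean and 95% CI half-width of the mean (normal approximation, z = 1.96)."""
    half_width = 1.96 * stdev(readings) / len(readings) ** 0.5
    return fmean(readings), half_width


# Hypothetical readings with the 4 foot stick (invented for illustration):
# eight at 39 6/16 and two at 39 5/16.
readings = [39 + 6 / 16] * 8 + [39 + 5 / 16] * 2
mean, ci_half_width = mean_and_ci95(readings)

print(f"yardstick worst case:    +/- {yardstick_error} in")
print(f"4 foot stick worst case: +/- {four_foot_error} in")
print(f"mean {mean:.4f} in, 95% CI +/- {ci_half_width:.4f} in")
```

With these invented readings the confidence interval comes out narrower than the 1/16" the stick can actually resolve, which is the trap described above: the statistics look better than the measurement procedure allows.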
I presume most if not all of the replies here are for grading of US coins so I'll answer for world coins, particularly those I collect. NGC and PCGS grades are US centric and don't necessarily align with standards used in the home market. They probably actually never do. Since US collectors usually (but not always) buy coins using the TPG standards or their own which are more reflective of US practices, the grade assigned in the US versus elsewhere frequently differs. The series where I have noticed the greatest difference is with Spanish and Spanish colonial coins. I occasionally buy coins from auction house Calico and I'd say theirs are stricter. Additionally, US collectors incorporate "market acceptability" which doesn't exist elsewhere (at least in the same way) because without assigning numerical grades to a slab, there aren't any "details" grades either. Calico "net grades" their listings instead. For the pillar coinage in particular, I find both NGC and PCGS inconsistent. I can't say they are "wrong" but I can tell you that I find it very common to prefer a lower graded coin over a supposedly "better" one with a higher grade. Sometimes this is because the coin with the higher grade isn't "original" but other times I have no idea how they assigned the grade they did.
This is a very interesting topic, and I appreciate all the opinions being offered. It seems to me that grading is both art and science. Science with regard to technical aspects, and art with regard to the overall appearance of the coin and how much you "like" it. I would like to suggest another question about the accuracy of grading: does having a "PQ" or "CAC" sticker on a coin increase the probability that the stated grade is accurate?
As numerous folks have pointed out, grading is subjective, so accuracy can't be measured. The only possibilities are statistical measures of consistency within services, among services, and over time. Statistical measures can be used because the results are expressed numerically (1 to 70) and sometimes with a secondary categorical classification (for example, BN, RB, RD). Good luck on getting data about variation among graders within a service. Curiously, back in the days of ANACS photo-certificates, the opinions of four graders for obverse and reverse were listed on the certificate. The closest we can do today is to send the same coins to a service several times in a short span and measure the variation in grades. However, for all we'll know, the same grader may have graded them each time. So we couldn't know if we were measuring variation among graders or variation within a single grader. Comparison among services would be straightforward. Send the same group of coins to each service in turn and compare the results. All should be done within a short time span. Establishing variation over time can be done in several ways. One way would be comparing one time period with another; the first ten years of grading versus the last ten years of grading, for example. The other way is trend analysis, which evaluates whether there is a significant trend over a time span. There would have to be separate studies of each coin type and well-thought-out statistical plans. The major grading services have probably performed some of these studies, but aren't likely to release the results. They could hardly be called unbiased investigators in any case. Perhaps a collector or organization with sufficient resources may fund the studies. The results and conclusions have to be used correctly and cautiously. For a particular coin type, one service may grade higher than another on average, but variation has to be considered.
Typically, variation is expressed in terms of whether the averages differ significantly at some probability level; in most scientific studies this is 95%. But, for a collector, knowing that there is a 51% probability that one service grades higher than another might be good enough. The results wouldn't necessarily provide a clear guide as to whether one service is better than another for a collector submitting a single coin. For example, suppose service A grades Morgan dollars significantly higher than service B, but the variation within service A is greater than within service B. You have one dollar to send in. If you were sending 100 Morgans, service A would be the choice, but for your one dollar, service B, which is more consistent, might be the way to go. Finally, for many folks, market value is really important, and a statistical study of market valuation of grades of different services is completely different from one that looks only at grading variation. Cal
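The service A versus service B example above can be put into numbers with a small Python sketch. Everything here is an assumption for illustration only: the average grades and spreads are invented, grades are treated as continuous and normally distributed, and the function name is my own. It uses `statistics.NormalDist` from the standard library (Python 3.8 or later).

```python
from statistics import NormalDist

# Invented figures: service A grades higher on average but is less consistent.
service_a = NormalDist(mu=64.5, sigma=1.0)
service_b = NormalDist(mu=64.2, sigma=0.4)


def prob_a_below_b(a, b, n_coins=1):
    """Chance that A's average grade over n_coins comes out lower than B's."""
    # Averaging n independent grades shrinks the spread by sqrt(n).
    avg_a = NormalDist(a.mean, a.stdev / n_coins ** 0.5)
    avg_b = NormalDist(b.mean, b.stdev / n_coins ** 0.5)
    # The difference of two independent normals is itself normal.
    return (avg_a - avg_b).cdf(0.0)


p_one_coin = prob_a_below_b(service_a, service_b, n_coins=1)
p_hundred_coins = prob_a_below_b(service_a, service_b, n_coins=100)

# Chance of a disappointing result (below 63.5) on a single submission.
floor_grade = 63.5
p_a_disappoints = service_a.cdf(floor_grade)
p_b_disappoints = service_b.cdf(floor_grade)

print(f"one coin:  P(A lower than B) = {p_one_coin:.2f}")
print(f"100 coins: P(A lower than B) = {p_hundred_coins:.4f}")
print(f"P(below {floor_grade}): A = {p_a_disappoints:.2f}, B = {p_b_disappoints:.2f}")
```

Under these made-up numbers, A almost always wins on a 100-coin submission, yet on a single coin A still comes in under B a good share of the time and is more likely to produce a low outlier. That is the sense in which the more consistent service might be the way to go for one dollar.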
I'm saying that TPG grading standards have deliberately been loosened, so yes, that is exactly what I am saying. My comments were about both US coins and world coins.
...and just another thought. Before we can consider the opinion of ANYONE who questions the grade on a TPGS slab, we need to consider their grading expertise. So I have a percentage for the OP to consider: In my experience, more than 80% of the dealers/collectors in the US should change hobbies/professions because they cannot grade. Edit: I forgot to say that since grading is subjective 100% of those in the more than 80% can grade.
I would agree if we were speaking about private individuals. But what about when it's the TPGs themselves? In other words, a coin was graded 10-15 years ago. Then it was sent in to the very same TPG and they upgraded it 1, 2, or even 3 grades. I agree completely.
This situation DOES NOT COUNT. At the time (10-15 years ago), IMO they were grading consistently to a different standard. As we and others have posted many times before, the grading (commercial/market) standards used at the TPGS have CHANGED through the years. If pinned down, long time TPGS graders will tell you the standards have EVOLVED as they personally learned more. Thus, grading standards practiced at a TPGS have loosened and I predict that will continue until the entire system is changed (1-100? Decimals?).
If the scale is ever changed to 100 as used by CGS of the UK now, it will be a complete scam. The grade on the holder has little to do with collecting anyway and everything to do with money.
IMO, grading may be a CRUTCH but it is not a scam. As for money... Do you think? At least that's what the TPGS in the US and most of the collectors will tell you... A GRADE PUTS A VALUE ON A COIN. Now, I will be the first one here to argue that it is not that simple!
As stated in the OP, the question has no sensible answer. It isn't measuring consistency; it is asking "If I sent in 100 coins graded MS-65 to my theoretical standard, how many would the TPG agree with?" If you want to measure consistency, you send them 100 MS-65 coins that they have previously graded (from the same generation holder so there won't be possible deviations from different grade standards assigned at different times) and see how many of the 100 come back at the same grade. Judging from the professional graders' results in past PCGS World Series of Grading contests, it would probably be in the 80% range or so.
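To show how much a 100-coin resubmission test like the one described above could actually pin down, here is a short Python sketch. The count of 80 matches out of 100 is hypothetical (no such test result is reported in this thread), and the Wilson score interval is just one common choice for a proportion, not something the posters proposed.

```python
from math import sqrt


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%)."""
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


# Hypothetical outcome: 80 of 100 resubmitted MS-65 coins come back MS-65.
same_grade = 80
resubmitted = 100
low, high = wilson_interval(same_grade, resubmitted)

print(f"observed consistency: {same_grade / resubmitted:.0%}")
print(f"plausible range: {low:.1%} to {high:.1%}")
```

Even with 100 coins the plausible range is several points wide on either side of 80%, so a test of that size could not distinguish a service that repeats itself 75% of the time from one that does so 85% of the time.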
I feel they get it right about 80% of the time. Usually with a second submission on the ones I don't agree on they'll get it right. Lately I feel the shift has been to extremely conservative grading again, at least on the coins I've sent. They're usually where I thought or possibly lower. And Doug, I'm seeing a lot of coins that would often previously get called MS come back at AU 58, as they're correctly graded that way with a hint of rub. I've seen very few coins getting a bump to a higher grade; if anything they're a grade low. This is just my experience.
I didn't mean that grading is a scam. I meant that a change to the grading system to a 100 point scale will be one. This is exactly what I believe the TPGs (especially PCGS as a public company) want because they are mostly left with NCLT, resubmissions and low value coins that are unlikely to ever be graded. The only reason for a change is to generate another cycle of grading fees because it certainly doesn't do anything for collectors or even coin "investors". The one thing I know with virtual certainty is that the supposed world coin bonanza they are hoping for isn't going to be remotely as big as they presumably think. The supply of classic coins worth submitting is not that large in most countries and even where it is, most collectors don't want it.
Of course it counts. And yes, they were definitely grading using a much tighter set of standards - the same one they used for the first 20 years they were in business. Oh, I agree they've evolved alright. They've loosened them, and loosened them, and then loosened them some more - all so they could keep on getting people to send in coins for upgrades - because they ran out of coins to grade. Other than moderns and new issues of course. As for learning more about grading - HORSE PUCKEY! Have they, or could they, learn more about varieties and stuff like that? Yeah, they could, maybe even about how to identify counterfeits. But that doesn't have anything to do with grading. Knowing how to recognize wear on a coin, knowing how to quantify the quality of the luster or quality of strike, knowing how to see hairlines, contact marks, and defects/flaws - people in numismatics have known how to do all of that since before the TPGs existed. So no, they have not learned any more about grading. What they have learned is that people will continue to believe their nonsense as long as they keep upgrading coins - and pay them to do it to boot! That's one possibility, but both NGC and PCGS have said no to 100 point grading. Personally, I believe they will do a complete about-face and return to the same strict standards they used for the first 20 years they were in business. It will in effect be history repeating itself, just like it was in 1986/87 when all 65s became 63s literally overnight. And by doing that, going back to the strict grading standards - all coins currently in slabs will of course need to be re-graded and put in new slabs. They don't have much choice if they want to continue to get submissions, for they have all but run out of coins to upgrade once again.
I've always thought the 1-70 Sheldon grading scale is more complicated than needed. For one thing, there aren't really 70 different grade levels. The levels skip in seemingly random intervals at times: 45, 50, 53, 58, 60, 61, 62, etc. As @Mainebill said in his post, if a coin has light rub then it's AU (as in not uncirculated). I think the term "About Uncirculated" is also confusing but that's a separate argument we can save for another day. Then if the rub isn't light it should drop from AU to something more general like Average Circulated. Why do we need Extra Fine, Fine, Very Good and Good? Finally, anything with a good amount of circulation should be Well Circulated (or Below Average).
I don't think anyone can call it a straight out scam, but it definitely has some element of money over integrity at times. I don't think grading standards will become stricter since values are so tied into grades. Let's not forget that big dealers and TPGs are in each other's pockets.
...and may actually have an "interest" in some of the TPGS; although that may not be the case anymore.