Grading is BS

Discussion in 'Coin Chat' started by davidh, Jan 14, 2017.

  1. baseball21

    baseball21 Well-Known Member

    As mentioned in other threads, it's petty and desperate. Essentially, they can't market their product on its own merits, so they have to attack others. I used to use them for some things, but when I get my submission back it will be my last - they're just that petty.
     
  3. dd27

    dd27 New Member

    No answer yet from PCGS, but let me explain why a formal inter-rater reliability study is important.

    First, one needs to determine what one means by "accurate". Is accuracy measured against some sort of pre-determined 'gold standard'?[1] Or is it the "percent agreement" between independent raters?

    Of course, there is no definitive gold standard when it comes to grading coins. As others have pointed out, it is subjective to some extent, even with guidelines or standards; it is more art than science.

    Along those lines, one should specify which guidelines one uses, e.g., ANA Grading Standards [2], or the PCGS grading standards.

    One could develop an acceptable gold standard, by, for example, soliciting nominations and votes from numismatic organizations, coin clubs, coin dealers, the TPGs, and coin collectors, regarding the best coin graders (who are not affiliated with any of the major TPGs). A group of 15 such experts could independently grade coins in groups of 3, i.e., three graders per coin, and then either average the grades, or the three graders could discuss their grades and agree on a consensus grade. This grade would then become the gold standard for that particular coin, which would then be submitted to each of the TPGs under study.
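
    To make that design concrete, here is a minimal sketch in Python (with made-up grades; "panel_grades" is a hypothetical array, not real data) of turning three independent expert grades per coin into a gold-standard grade by averaging:

        # A minimal sketch, assuming hypothetical data: three independent
        # expert grades per coin, averaged and rounded to the nearest whole
        # grade to form a gold-standard grade for each coin.
        import numpy as np

        # Rows are coins; columns are the three experts assigned to that coin.
        panel_grades = np.array([
            [64, 65, 64],
            [66, 66, 67],
            [63, 64, 64],
        ])

        # Average the three grades and round to the nearest integer grade.
        gold_standard = np.rint(panel_grades.mean(axis=1)).astype(int)
        print(gold_standard)  # -> [64 66 64]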

    Such a research project would then require a statistical analysis of accuracy by comparing TPGs' grades vis-a-vis the gold standard grade.
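
    A sketch of what that comparison might look like (again, hypothetical numbers; "tpg_grades" stands in for the grades one TPG under study assigned to the same coins):

        # A sketch of an accuracy analysis against the gold standard, using
        # hypothetical grades for eight coins.
        import numpy as np

        gold_standard = np.array([63, 64, 65, 65, 66, 67, 68, 70])
        tpg_grades    = np.array([64, 64, 66, 65, 67, 68, 68, 70])

        # Signed bias: positive means the TPG grades looser than the panel.
        bias = np.mean(tpg_grades - gold_standard)

        # Mean absolute deviation from the gold standard, in grade points.
        mad = np.mean(np.abs(tpg_grades - gold_standard))

        # Share of coins graded within one point of the gold standard.
        within_one = np.mean(np.abs(tpg_grades - gold_standard) <= 1)

        print(f"bias {bias:+.2f}, MAD {mad:.2f}, within +/-1: {within_one:.0%}")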

    Or one could evaluate the degree of congruence (agreement) between two (or more) independent raters.

    Either way, determining how to measure accuracy is not easy.

    For example, one would first have to determine the appropriate type of statistical analysis: interrater reliability (IRR) or interrater agreement (IRA).

    As LeBreton & Senter (2008, pp. 816-817) explain:

    "IRR [interrater reliability] refers to the relative consistency in ratings provided by multiple judges of multiple targets. Estimates of IRR are used to address whether judges rank order targets in a manner that is relatively consistent with other judges. The concern here is not with the equivalence of scores but rather with the equivalence of relative rankings. In contrast, IRA [interrater agreement] refers to the absolute consensus in scores furnished by multiple judges for one or more targets. Estimates of IRA are used to address whether scores furnished by judges are interchangeable or equivalent in terms of their absolute value.The concepts of IRR and IRA both address questions concerning whether or not ratings furnished by one judge are ‘‘similar’’ to ratings furnished by one or more other judges.

    These concepts simply differ in how they go about defining inter-rater similarity. Agreement emphasizes the interchangeability or the absolute consensus between judges and is typically indexed via some estimate of within-group rating dispersion. Reliability emphasizes the relative consistency or the rank order similarity between judges and is typically indexed via some form of a correlation coefficient. Both IRR and IRA are perfectly reasonable approaches to estimating rater similarity; however, they are designed to answer different research questions. Consequently, researchers need to make sure their estimates match their research questions." [3]
    Or, as Gisev, Bell, & Chen (2013, p. 330) note:

    "Interrater agreement indices assess the extent to which the responses of 2 or more independent raters are concordant. Interrater reliability indices assess the extent to which raters consistently distinguish between different responses. A number of indices exist, and some common examples include Kappa, the Kendall coefficient of concordance, Bland-Altman plots, and the intraclass correlation coefficient. Guidance on the selection of an appropriate index is provided. In conclusion, selection of an appropriate index to evaluate interrater agreement or interrater reliability is dependent on a number of factors including the context in which the study is being undertaken, the type of variable under consideration, and the number of raters making assessments." [4]​

    To complicate matters, in some cases even if one simply wants to calculate the extent of agreement between independent raters, one might nonetheless use an IRR analysis because the 70-point grading scale would be considered analogous to a continuous variable, even though it is technically an ordinal variable. (See Categorical and Continuous Variables, near the bottom of the page, at Types of Variables.)
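
    For instance, here is a sketch (made-up grades) of the two analytic routes for the same pair of raters - a linearly weighted kappa treats the scale as ordinal, while a correlation coefficient treats it as (approximately) continuous:

        # A sketch of both routes on hypothetical grades: weighted kappa
        # (ordinal) versus a correlation coefficient (continuous-style).
        import numpy as np
        from scipy.stats import pearsonr
        from sklearn.metrics import cohen_kappa_score

        rater_1 = np.array([62, 63, 64, 64, 65, 66, 67, 68])
        rater_2 = np.array([62, 64, 64, 65, 65, 66, 68, 68])

        # Ordinal route: weighted kappa penalizes disagreements by distance.
        kappa_w = cohen_kappa_score(rater_1, rater_2, weights="linear")

        # Continuous route: correlation indexes relative consistency.
        r, _ = pearsonr(rater_1, rater_2)

        print(f"weighted kappa: {kappa_w:.2f}, Pearson r: {r:.2f}")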

    My point in bringing in this academic stuff is to highlight the complexities involved in establishing a reliable accuracy statistic. Even if a company wants to conduct an internal evaluation of grader consistency and accuracy, it requires careful planning, knowledge of research methodology (or program evaluation methodology, which is similar), and statistical analysis.

    The best guide to the statistical analysis is the Handbook of Inter-Rater Reliability: The Definitive Guide to Measuring the Extent of Agreement Among Raters (4th Ed.). [5]

    Footnotes
    1. gold standard - "...a well-established and widely accepted model or paradigm of excellence by which similar things are judged or measured." Farlex Dictionary of Idioms

    2. Bressett, K. E. & Bowers, Q. D. (2006). The official American Numismatic Association grading standards for United States coins (6th Ed.). Atlanta, GA: Whitman Publishing. [ISBN-13: 978-0794819934]

    3. LeBreton, J. M., & Senter, J. L. (2008). Answers to 20 questions about interrater reliability and interrater agreement. Organizational Research Methods, 11(4), 815-852.

    4. Gisev, N., Bell, J. S., & Chen, T. F. (2013). Interrater agreement and interrater reliability: key concepts, approaches, and applications. Research in Social and Administrative Pharmacy, 9(3), 330-338.

    5. Gwet, K. L. (2014). Handbook of Inter-Rater Reliability: The Definitive Guide to Measuring the Extent of Agreement Among Raters (4th Ed.). Gaithersburg, MD: Advanced Analytics. [ISBN-13: 978-0970806284]
     
    Last edited: Jan 18, 2017
  4. Dynoking

    Dynoking Well-Known Member

    Great read! This helps me understand gradeflation, its causes and effects. Thanks for posting!
     
    Stevearino and V. Kurt Bellman like this.
  5. V. Kurt Bellman

    V. Kurt Bellman Yes, I'm blunt! Get over your "feeeeelings".

    This is something I have believed for a VERY long time about PCGS, and as is usually the case when you read something from someone else who agrees with what you've been thinking all along, I LOVE IT TO DEATH!

    PCGS gets away with proclaiming superiority with virtually no evidence to back it up, and they have for years. Don't give me that auction results tripe, because all that amounts to is a self-fulfilling prophecy based on a market bias - the bias of the market of people PCGS fawns over.
     
  6. V. Kurt Bellman

    V. Kurt Bellman Yes, I'm blunt! Get over your "feeeeelings".


    This much we know for sure - the grading "test" given at Anaheim that Charles took part in was absolutely an IRR system. Does the applicant agree with PCGS' graders? Pure and simple. If a prospective grader agrees with their grades, I guess they need to make him a job offer. Huh?
     
    dd27 likes this.
  7. SuperDave

    SuperDave Free the Cartwheels!

    Understandable. These days we appear to live in a sociopolitical environment where that sort of self-contained, self-reinforcing reality absent external structure (or fact) is the preferred method for obtaining "truth."
     
  8. GDJMSP

    GDJMSP Numismatist Moderator

    I first started posting about all of this, right here on this forum, over 10 years ago. And I have been posting about it ever since. All of the TPGs, not just PCGS, all of them, changed (loosened) their grading standards in 2004. And in the years since then they have loosened them a couple more times, until now, today, TPG grading across the board is a joke !

    Of course, for the first 7 or 8 of the last ten years pretty much everybody else on this forum told me I was crazy and didn't know what I was talking about. Some of the most heated and longest-lasting debates there have ever been on this forum were on that very subject.

    In the past few years however things have been changing. More and more people have realized that they can no longer deny that the TPGs have repeatedly loosened their grading standards. And today, well hardly anybody denies it anymore. Today we even have the TPGs themselves accusing each other of loosening their grading standards.

    So what's gonna happen ? Probably the same thing I predicted would happen quite a few years ago - the TPGs are gonna revert, they are going to do the same thing they did in 1986 when they started. They are going to tighten grading standards back to what they were back then. In 1986 a coin that was graded MS65 became an MS63 literally overnight. I believe that's going to happen again just like I said it would.

    And of course the coin market will go along with it, and the TPGs will sit and rub their little hands together with joy. For they will then be able to collect all those grading fees for everybody having to have all their coins graded all over again, based on the "new" and stricter grading standards.

    If a guy didn't know better he could swear they had it planned all along. But nahhhhhh - they'd never do that, would they ? :rolleyes:
     
    micbraun, dd27, Blissskr and 2 others like this.
  9. V. Kurt Bellman

    V. Kurt Bellman Yes, I'm blunt! Get over your "feeeeelings".

    Wow, that's deep.
     
  10. SuperDave

    SuperDave Free the Cartwheels!

    They may be incompetent but they're not stupid. :)
     
    Insider likes this.
  11. mikenoodle

    mikenoodle The Village Idiot Supporter

    A better potential next step to your process, Doug, is this:
    All graded coins are re-submitted. Those that are overgraded are re-graded for free. Those accurately graded are re-holdered for free.
    Otherwise, no one will be able to tell the overgraded old slabs from the correctly graded new slabs... although I'd say they're having that problem already...
     
  12. GDJMSP

    GDJMSP Numismatist Moderator

    Mike, why do you think they loosened the grading standards to begin with ? They did it because they wanted all those grading fees that people would pay to get the upgrades.

    They are not going to re-grade coins for free - they'd go bankrupt. The TPGs are a business, and the very purpose of any business is to make money. And changing their grading standards is how they make money.
     
  13. GDJMSP

    GDJMSP Numismatist Moderator

    On the contrary, they are not incompetent at all, they know exactly what they are doing.
     
  14. mikenoodle

    mikenoodle The Village Idiot Supporter


    What market force do you suggest will encourage people to re-submit their coins?
     
  15. GDJMSP

    GDJMSP Numismatist Moderator

    Simple - because they won't be able to sell their coins in the over-graded slabs. Nobody will want them; they will only want coins in the new, stricter slabs.
     
  16. Insider

    Insider Talent on loan from...

    Doug asks: "So what's gonna happen ? Probably the same thing I predicted would happen quite a few years ago - the TPGs are gonna revert, they are going to do the same thing they did in 1986 when they started. They are going to tighten grading standards back to what they were back then. In 1986 a coin that was graded MS65 became an MS63 literally overnight. I believe that's going to happen again just like I said it would."

    I disagree. Most coins in slabs are maxed out (FOR NOW, until standards change again); it is a market-policed system. I believe we'll see a decimal system or a 1-100 scale first, to generate more money.
     
  17. V. Kurt Bellman

    V. Kurt Bellman Yes, I'm blunt! Get over your "feeeeelings".

    A simple, subtle label change will accomplish this. "Oh, this was before the changeover."

    Question: what about old rattler slabs?
     
    Insider likes this.
  18. mikenoodle

    mikenoodle The Village Idiot Supporter

    Then you are suggesting a complete slab re-design and an almost immeasurable marketing OOPS??? How do they wipe the egg from their faces when their whole reason for existence and their integrity have been damaged by the old slabs?
    More likely is this: a "NEW" TPG will emerge - one that proclaims it is better, with a better guarantee and a pristine reputation.
     
  19. GDJMSP

    GDJMSP Numismatist Moderator

    Time will tell, it always does.
     
  20. V. Kurt Bellman

    V. Kurt Bellman Yes, I'm blunt! Get over your "feeeeelings".

    There you go. A business opportunity for you and Matt. Prospective investors may reply to...
     
  21. V. Kurt Bellman

    V. Kurt Bellman Yes, I'm blunt! Get over your "feeeeelings".

    Are you suggesting that "time wounds all heels"?

    The thing that amazes me is how fast we went from "there's no such thing as a perfect coin" to some series and coins having 50% or higher "70" grades.
     
    Paul M. and dd27 like this.