Featured Predicting hammer price: playing a bit with Sixbid data & looking for inspiration

Roerbakmix · Nov 5, 2019

Hi all, so it has been a rather slow day at work which allowed me to spend a bit of time on a project: trying to predict a hammer price based on freely available data. This is probably impossible to do, but I tried it nonetheless and had some fun doing it.

METHODS:
Data:
Freely available data on SixBid (https://www.sixbid-coin-archive.com/#/en/search?currency=eur) with currency automatically recalculated to EUR to make life a bit easier. For the case studies, two search strings were used: 1) "caesar denarius elephant" and 2) "augustus denarius comet ivlivs"
Data is presented on SixBid in a more or less structured manner. First, the entire webpage was copy-pasted to Google Sheets. Using various (not exactly state-of-the-art) methods, data on hammer price, estimated price, auction house, date, grade (0=missing, 1=good, 2=fine, 3=very fine, 4=extremely fine, 5=mint state; or synonyms), NGC certificate (y/n) and provenance information (y/n) were extracted.

Statistical methods:
Linearity between the dependent (hammer price) and independent (estimated price, grade, NGC certificate y/n and provenance) variables was assumed (there is room for improvement here because for some variables, this was obviously not the case). Missing data was coded as 0 (missing data on grade should probably be imputed). The model was a multivariate linear regression with backwards selection (p in 0.05, p out 0.10). Predicted prices were compared against actual hammer prices.
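
For readers who want to try the backwards-selection step themselves, here is a minimal sketch in plain Python. It is not the spreadsheet workflow used above, and several things are assumptions made for illustration: the function names, the toy data, the normal approximation for the p-values (a t distribution would be the usual choice), and the fact that only the "p out" threshold of 0.10 is applied, with no re-entry at "p in".

```python
from statistics import NormalDist


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(X, y):
    """Ordinary least squares. Each row of X starts with 1.0 (the constant).

    Returns (betas, p_values); p-values use a normal approximation.
    """
    n_obs, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    betas = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(betas, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n_obs - k)
    p_values = []
    for i in range(k):
        se = (sigma2 * inv[i][i]) ** 0.5
        z = abs(betas[i]) / se
        p_values.append(2 * (1 - NormalDist().cdf(z)))
    return betas, p_values


def backward_select(rows, y, names, p_out=0.10):
    """Drop the least significant predictor until all have p <= p_out."""
    keep = list(names)
    while keep:
        X = [[1.0] + [float(row[name]) for name in keep] for row in rows]
        betas, p_values = ols(X, y)
        # Index 0 is the constant, so predictor i sits at position i + 1.
        worst = max(range(len(keep)), key=lambda i: p_values[i + 1])
        if p_values[worst + 1] <= p_out:
            return keep, betas
        del keep[worst]
    return [], [sum(y) / len(y)]


# Toy data (made up): hammer depends on the estimate, "junk" is unrelated.
ESTIMATES = [100, 100, 200, 200, 300, 300, 400, 400]
JUNK = [0, 1, 0, 1, 0, 1, 0, 1]
NOISE = [10, -10, -10, 10, 10, -10, -10, 10]
rows = [{"estimate": e, "junk": j} for e, j in zip(ESTIMATES, JUNK)]
hammer = [100 + 1.2 * e + n for e, n in zip(ESTIMATES, NOISE)]

kept, betas = backward_select(rows, hammer, ["estimate", "junk"])
print(kept, [round(b, 3) for b in betas])
```

On the toy data the unrelated predictor is dropped and only the estimate survives, which mirrors what happened in both case studies.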

RESULTS
Case study 1: JC denarius elephant
Data
The SixBid search yielded 836 results, of which 55 were unsold. The median hammer price was €560, the max hammer price €9,828.00, the min hammer price €1.00. A grade could be determined in 669 cases: good 5, fine 61, very fine 327, extremely fine 268, uncirculated 8. Provenance was mentioned in 93 cases (11.1%); NGC certificates in 152 (18.2%).

Results
Backwards selection of covariates resulted (after 3 steps) in estimated price and provenance as best predictors (beta resp. 1.148 and 131.0) with a constant of 119.3 (meaning that the hammer price would be 119.3 + 1.148*estimated price + 131*provenance).
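
As a worked example of that formula, here is a small Python sketch. The function name and the €500 example estimate are made up for illustration; the coefficients are the ones reported above. A coin estimated at €500 comes out at roughly €693 without provenance and €824 with it.

```python
def predict_hammer_case1(estimate_eur, has_provenance):
    """Case study 1 model: 119.3 + 1.148 * estimate + 131.0 * provenance."""
    return 119.3 + 1.148 * estimate_eur + 131.0 * int(has_provenance)


# Hypothetical coin with a EUR 500 estimate, without and with provenance.
without_prov = predict_hammer_case1(500, False)
with_prov = predict_hammer_case1(500, True)
print(round(without_prov, 1), round(with_prov, 1))
```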

The relation between hammer price (y-axis) and predicted price (x-axis) is fairly linear (scatter plot left). The difference between hammer price and predicted price (negative meaning overestimation; positive meaning underestimation) was more or less normally distributed with the model overestimating in 55.5% of the cases (meaning that, if this algorithm is used, you will win the coin in 55.5% of the cases). There was a steep linear relation between the hammer price and the predicted price (not shown).
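
The overestimation share mentioned above is simple to compute once you have both columns. A minimal sketch, with a made-up function name and made-up toy prices (the real data is not reproduced here):

```python
def share_overestimated(hammer_prices, predicted_prices):
    """Fraction of lots where hammer - predicted is negative."""
    diffs = [h - p for h, p in zip(hammer_prices, predicted_prices)]
    overestimated = sum(1 for d in diffs if d < 0)
    return overestimated / len(diffs)


# Toy example: the model overshoots on three of the four lots.
toy_hammer = [500, 900, 300, 1200]
toy_predicted = [600, 800, 350, 1300]
print(share_overestimated(toy_hammer, toy_predicted))
```

A bid placed at the predicted price would only have been high enough on the lots counted here, which is the sense in which the share is read as a "win rate".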

Case study 2: Augustus Divus JC denarius
Data
The SixBid search yielded 87 results, of which 80 were sold (n unsold = 7). The median hammer price was €932.00; the max hammer €7,337.00; the min hammer €120.00. A grade could be determined in 84 cases: fine 5, very fine 43, extremely fine 35, uncirculated 1. Provenance was mentioned in 24 cases (27%); NGC certificates in 8 (9.2%).

Results
Backwards selection of covariates resulted (after 4 steps) in the estimated price as best predictor (beta 0.814, SE 0.079, p 0.000) with a constant of 374.1 (meaning that the hammer price would be 374.1 + 0.814*estimate).
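
One thing that falls out of this formula is a break-even point: because the slope is below 1 and the constant is positive, the model predicts a hammer above the estimate for low estimates and below it for high ones. The sketch below is just arithmetic on the reported coefficients (the function names are made up), and it says nothing about why the line looks this way.

```python
INTERCEPT = 374.1  # constant reported for case study 2
SLOPE = 0.814      # beta for the estimated price


def predict_hammer_case2(estimate_eur):
    """Case study 2 model: 374.1 + 0.814 * estimate."""
    return INTERCEPT + SLOPE * estimate_eur


def break_even_estimate():
    """Estimate at which the predicted hammer equals the estimate itself."""
    return INTERCEPT / (1 - SLOPE)


print(round(break_even_estimate(), 2))
print(predict_hammer_case2(1000) > 1000)  # below break-even: predicted above estimate
print(predict_hammer_case2(3000) < 3000)  # above break-even: predicted below estimate
```

The break-even lands at an estimate of about €2,011.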

The relation between hammer price (y-axis) and predicted price (x-axis) is fairly linear (scatter plot left). The difference between hammer price and predicted price (negative meaning overestimation; positive meaning underestimation) was more or less normally distributed with the model overestimating in 61% of the cases (meaning that, if this algorithm is used, you will win the coin in 61% of the cases). There was a steep linear relation between the hammer price and the predicted price (not shown).

Conclusion
It is difficult to predict hammer prices given freely available data (although the methods used were sub-optimal and there is definitely space for improvement). This is irrespective of data size. It should be noted that, given these methods, it is not possible to draw etiological conclusions (e.g. "grade is not related to hammer price").

What next?
I will probably (depending on spare time) improve modelling techniques, as these methods are somewhat rudimentary.

I need a bit of guidance
If you think this is a fun project (I do), please help me with a bit of advice. First, regarding grades, I search for these terms, with the following hierarchy:
good 1
fine 2
VF 3
very fine 3
AEF 3
EF 4
XF 4
UNC 4
extremely fine 4
mint state 5
uncirculated 5
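
A minimal Python sketch of how such a term list could be applied to a lot description. The actual extraction was done in Google Sheets, so this is an assumption about one possible approach, not the method used. The term-to-score mapping is copied from the list above as is; the function name, the whole-word matching and the "longest term wins" rule are choices made for this sketch.

```python
import re

# Term -> score, copied from the hierarchy above without changes.
GRADE_TERMS = {
    "good": 1, "fine": 2, "VF": 3, "very fine": 3, "AEF": 3,
    "EF": 4, "XF": 4, "UNC": 4, "extremely fine": 4,
    "mint state": 5, "uncirculated": 5,
}


def extract_grade(description):
    """Return the score of the longest grade term found, or 0 if none."""
    # Longest terms first, so "very fine" is not shadowed by plain "fine".
    for term in sorted(GRADE_TERMS, key=len, reverse=True):
        pattern = rf"\b{re.escape(term)}\b"
        if re.search(pattern, description, flags=re.IGNORECASE):
            return GRADE_TERMS[term]
    return 0  # 0 = missing, as in the coding described in the methods


print(extract_grade("Denarius, 3.9 g. Extremely fine, toned"))
print(extract_grade("Good very fine"))
print(extract_grade("Lot of 5 bronzes"))
```

Whole-word matching keeps "EF" from firing inside "AEF", but words like "good" or "fine" will still match phrases such as "good style", so some false positives are to be expected.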

>> Is this hierarchy correct?
>> Do you have alternative synonyms that may yield more grades (my data mining algorithm now finds ca. 80%).

For detecting provenance yes/no, I use the following terms:
Collection
Pedigree
Pedigreed
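
The same idea as a yes/no flag, sketched in Python with only the three terms listed above. The function name is made up, and matching on whole words regardless of case is an assumption.

```python
import re

# "pedigreed?" covers both Pedigree and Pedigreed.
PROVENANCE_PATTERN = re.compile(r"\b(collection|pedigreed?)\b", re.IGNORECASE)


def has_provenance(description):
    """True if the description mentions one of the provenance terms."""
    return PROVENANCE_PATTERN.search(description) is not None


print(has_provenance("Ex Hunt Collection"))
print(has_provenance("Toned, well centered"))
```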

>> any other synonyms you can think of?

I considered putting 'auction house' as a predictor, however, this is only feasible with very large datasets, as the number of auction houses is quite large (e.g. 42 auction houses for the JC denarius). This will likely result in serious overfitting issues.

What other predictors can you think of?

Thanks for reading


Orfew · Nov 5, 2019

Interesting work!

With a large enough sample size I think time of year the coin was sold might prove to be an interesting variable.

pprp · Nov 5, 2019

You basically cannot reach any safe conclusion because apart from the grade there are other important parameters as the quality of the strike, the surface and the style. Not all provenances have the same impact; some can triple the price even for worn coins, some coming from infamous auctioneers can scare bidders away. Better test your method with moderns slabbed by NGC

Ken Dorney · Nov 5, 2019

Roerbakmix said: ↑

First, regarding grades,

Sadly this won't give you any good data as grading is enormously subjective.

Roerbakmix said: ↑

For detecting provenance yes/no, I use the following terms:

Other terms might include, but may not give you what you want, 'slabbed', 'NGC', 'PCGS', 'ex', 'collection', etc. I'm not sure this field will capture provenance, as it depends on context as to what is in the description.

Roerbakmix said: ↑

I considered putting 'auction house' as a predictor

Well, actually this is a huge factor in some respects. Lesser houses or small dealers like me will not get the high prices that the big houses would. I might be able to get $40 for a coin, but CNG might get $500. Yes, that is an extreme example, but it does matter and happens. The smaller the dealer, typically the smaller the price, but not always. It's so subjective....

Orfew said: ↑

With a large enough sample size I think time of year the coin was sold might prove to be an interesting variable.

This is very accurate. Prices can be entirely dependent on time of year. October to March typically sees much higher prices than the rest of the year. It is a very big factor.

I know math is very fun for some, but ancient coins probably have too many variables to make it work.

Roerbakmix · Nov 5, 2019

Thanks @Orfew, @Ken Dorney and @pprp. You are probably completely right: it will be almost impossible to reach safe conclusions. Generally, it is impossible to create a model that explains all variance in the data. However, both created models (example 1 and 2) predict better than pure chance alone, which is, in my opinion, interesting. Exploring the data, there are weak, moderate and even strong trends (e.g. respectively NGC = yes, grading and estimated price). So while, for example, grading is highly subjective, there is a positive and moderate correlation between grade and hammer price.

I'll look into time of the year and the hammer price.

Thanks for the other synonyms and tips, will add those as well.

The auction house variable is methodologically difficult (it is one predictor, however with a large number of degrees of freedom), but should therefore not be discarded immediately.

Ken Dorney · Nov 5, 2019

Roerbakmix said: ↑

So while, for example, grading is highly subjective, there is a positive and moderate correlation between grade and hammer price.

Absolutely true. And if one disregards randomly assigned grades by sellers, you can rely on your own opinion. I'm not sure what your goal is with this project (aside from just having fun), but generalized data with personal interpretation would likely lead you to finding what you are looking for.

Roerbakmix said: ↑

I'll look into time of the year and the hammer price.

I hate math, so I am only relying on my experience here, but it would be very interesting to see someone graph sales data by month.

Roerbakmix · Nov 5, 2019

Here is the relation between the hammer price and the month of the year for test case 1 (JC denarius with the elephant). The linear assumption probably does not hold. (And it's a very ugly Google Sheets graph).

@Ken Dorney : is this fast, or is this fast!

Ken Dorney · Nov 5, 2019

Roerbakmix said: ↑

@Ken Dorney : is this fast, or is this fast!

Yep! Interesting as well. I think what this graph is telling us is that popular types will sell well in any month. I do think however that time offered does affect the majority of coins though.

Ed Snible · Nov 5, 2019

Roerbakmix said: ↑

Do you have alternative synonyms that may yield more grades (my data mining algorithm now finds ca. 80%).

See https://www.fleur-de-coin.com/articles/coin-grades and https://germancoins.com/german-coin-grading/ which give the details you need and also the synonyms in the major coin cataloging languages.

Nvb · Nov 5, 2019

You obviously know a thing or two about statistics.
It'd be interesting if you model several approaches and compare the outcomes.
Even some very simple approaches may yield good results.

Lets say for instance you take a particular auction house and find their sale price/ estimate ratio over a number of auctions. It seems to me some auction houses estimate close to retail while others are laughably low. If those auction houses are very consistent with their estimating techniques, you could have a very useable predictor.
You'd do well to know which animal you are dealing with on a case by case..
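
The per-house ratio idea could be sketched like this in Python. The house names and prices are invented for illustration, and using the median (rather than the mean) of hammer/estimate is an assumption, chosen because a few runaway lots would otherwise dominate.

```python
from collections import defaultdict
from statistics import median


def median_ratio_by_house(sales):
    """sales: iterable of (house, estimate, hammer). Returns house -> median ratio."""
    ratios = defaultdict(list)
    for house, estimate, hammer in sales:
        if estimate > 0:  # skip lots without a usable estimate
            ratios[house].append(hammer / estimate)
    return {house: median(values) for house, values in ratios.items()}


# Invented toy data: "House A" estimates low, "House B" estimates near the result.
toy_sales = [
    ("House A", 100, 150), ("House A", 200, 260), ("House A", 300, 420),
    ("House B", 100, 105), ("House B", 200, 190), ("House B", 400, 440),
]
print(median_ratio_by_house(toy_sales))
```

Multiplying a new lot's estimate by its house's ratio then gives a quick one-number prediction. How stable that ratio is from sale to sale would decide whether it is usable.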

Personally I take the traditional approach.. I'll pull up as many comparables, based entirely on appeal, as I can on acsearch. I disregard all other factors.
I'll get a feel for what is a fair price, then decide whether I'm willing to overpay or hope for a steal.

Roerbakmix · Nov 6, 2019

Thanks @Ed Snible for other synonyms.

For case study one (JC denarius elephant), I've added "Toning" (mentioned yes/no), "luster" (mentioned yes/no) and center (not mentioned=0, off-center=1, well-centered=2) to the model. Furthermore, I will use R now for the next calculations.

This did not really do anything: backwards selection (ie data driven predictor selection) still selects the estimate as the only (highly) significant predictor (R-squared: 0.417). Still, the prediction is not completely rubbish (the red line is a modeled relation between the actual and predicted price, using LOESS. Ideally, the line would be 45 degrees, i.e. the dotted grey line, indicating perfect prediction)
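
For those wondering what the R-squared of 0.417 means: it is the share of the variance in hammer prices that the model accounts for, so here roughly 42%. A minimal Python sketch of the calculation, with a made-up function name and toy numbers (the real analysis was done in R):

```python
def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


toy_actual = [400, 600, 800]
print(r_squared(toy_actual, [400, 600, 800]))  # perfect prediction
print(r_squared(toy_actual, [600, 600, 600]))  # always predicting the mean
```

A perfect prediction scores 1, and always guessing the average price scores 0.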

Suppose we create a multivariate model including all parameters, but excluding estimated price, this graph would look like this:

Gavin Richardson · Nov 6, 2019

I’m not gonna pretend I understand the statistical math. But I appreciate the rigor and the tentative conclusions. I found it all interesting. Here’s to more slow days at work.

dougsmit · Nov 6, 2019

Ken Dorney said: ↑

I know math is very fun for some, but ancient coins probably have too many variables to make it work.

I agree with the 'too many' problem but there is another issue at play here. I am not a mathematician but sometime in elementary school I was taught something that strikes me as applicable here. Then we called it 'significant figures' but now it is the difference between analog and digital. Assigning digital numbers 1 through 5 to grade, for example, ignores the fact that there are a thousand variations included in the term VF and another thousand opinions on which variations are more or less important. Even if we assume that all EF coins will outperform all VF coins in terms of price (wrong) we cannot equate 1.5 to 2.499999 without peril. It might help to toss out results that don't play well with what we are trying to prove (never a good thing) but it is hard to digitize factors like two wealthy and pig-headed people who hate each other and get into a pushing match. I have a friend whose cat jumped on his keyboard and placed a last-second bid on a coin he did not intend to buy. Factor in that! He honored the cat bid but online statistics do not always record returns, refusals, deaths of buyers etc.

This exercise strikes me as fun for math people and useless for those of us who do not believe life can be reduced to a provable formula.

Roerbakmix · Nov 6, 2019

Thanks all for the reactions. Interesting points have been raised here:

dougsmit said: ↑

This exercise strikes me as fun for math people and useless for those of us who do not believe life can be reduced to a provable formula.

This is exactly what I'm not doing. Basically, there is available data, and there is a relation between several parameters (such as estimate, grade, etc.) and the hammer price. What these (rather crude) modelling methods show is that a) there is indeed a relation and b) the variance can partly be explained. It is probably impossible to predict the outcome (i.e. hammer price) with 100% accuracy. But this is not what I am trying to do.

Nvb said: ↑

Personally I take the traditional approach.. I'll pull up as many comparables, based entirely on appeal, as I can on acsearch. I disregard all other factors.
I'll get a feel for what is a fair price, then decide whether I'm willing to overpay or hope for a steal.

This is the same as doing modelling, except that it is done in an unstandardized fashion. Which is not wrong. But conceptually: you have previous data (your experience) and new data (the new auction) and with your experience, you're trying to predict the ideal outcome. So, for example, suppose you've got a coin which is well centered, high grade, but has a hole in it, you will add a certain weight to each predictor (centering, grade, hole) and thus reach your optimal estimate for the actual value of the coin. A model does exactly this: it determines the relative weight of each parameter to the outcome (hammer price).

Regarding the unexplained variance (e.g. the cat jumping on the keyboard resulting in a high bid, or the computer crashing resulting in a low bid). This could be regarded as random noise, and given large enough sample size, should level out each other.
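
The "levels out" argument can be illustrated with a small simulation in Python. This is only a sketch under an assumption that may not hold for real auctions: the noise is drawn from a symmetric distribution around zero. The function name, the standard deviation of €100 and the seed are arbitrary.

```python
import random


def mean_noise(n, sd=100.0, seed=1):
    """Average of n random price shocks drawn from a normal(0, sd)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(rng.gauss(0.0, sd) for _ in range(n)) / n


for n in (10, 1_000, 100_000):
    print(n, round(mean_noise(n), 2))
```

Note that this averaging helps the estimated coefficients, not any single prediction: an individual lot can still land far from the model's price.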

Regarding differences in grading, or subjective grading: the concept above is probably applicable to that as well, except that there will probably be a tendency to overgrade. However, if we assume this to be consistent (so not a thing of the past, i.e. if we assume dealers do not suddenly change this behaviour), this is not a problem either.

Now a final note regarding the parameter 'estimated price'. In a private conversation, we reached the conclusion that a) the other parameters are probably included in this estimate (and that this may partially explain the little added explained variance of these predictors), and b) the estimate is usually the starting point of bidding, or a percentage of the estimate. So, a hammer price lower than the estimate is relatively uncommon, and thus a strong positive relation between estimate and hammer price should be expected.

In conclusion: the auction results are probably not completely predictable, or perhaps not even moderately. This does not mean that auction results are something 'magic'. It merely means that there is not enough explainable data. (But this is more a philosophical standpoint than a practical one).

Terence Cheesman · Nov 6, 2019

Over the last three years I have placed close to 2000 coins into auctions. Most of the coins have been sold through CNG or Triskeles Auctions. One would assume that I might have learned something from all of this. Your assumption would be wrong. I probably know not much more now than when I started. Many coins that I assumed would do well don't. Some die after the first bid. Whereas others do extremely well. Only once was I able to predict that a coin was going to do very badly. It suffered from "really ugly picture" disease. It was so bad I could not believe it was my coin. Oh well, someone got a rather pleasant looking coin for a reasonable price. (I think I got $20. Ouch.)
The same is true when I am buying. Last year I tried to get a number of coins only to be outbid by rather serious margins. Yet others which were very similar, I got for what I would consider to be a really good price. Like many others who have commented on this thread I think there are just too many variables to factor in to come up with what might be described at best as an educated guess.
Coin I got at auction this year Alexander I Balas Ar drachm SC 1785.13 var. Antioch 147-146 B.C. 4.25 grms 19 mm Purchased from CNG Auction 111 Lot 340 May 29 2019 EX MNL Collection

frank008 · Nov 11, 2019

Ken Dorney said: ↑

This is very accurate. Prices can be entirely dependent on time of year. October to March typically sees much higher prices than the rest of the year. It is a very big factor.

Interesting point and I'm curious about what might be the underlying/driving factor. Do you maybe think this is because [ancient] coin collecting is an "indoor hobby" (aka winter sport)? Or maybe holiday/gift purchases? Or even tax refund splurges?

Ed Snible · Nov 11, 2019

There is an "auction season" with the more expensive items typically sold at particular times of the year, @frank008 . I would have guessed the "auction season" distorts the price ... the best items are held back to be placed in a few key auctions. Do you see otherwise, that regular shoppers are also making bigger purchases in the winter?

@Roerbakmix , can you cook up a chart of average/median prices throughout the year?

Roerbakmix · Nov 12, 2019

Sure, @Ed Snible, here is an overview of the hammer price of the JC denarius

For those interested, it's a scatterplot with a LOESS curve fitted and a 95%CI around the estimate.
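
For anyone who prefers a table over a plot, the median hammer price per calendar month could be computed along these lines in Python. The function name and the toy sales are made up; the real chart was produced from the SixBid data with a LOESS fit, which this sketch does not attempt.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def median_hammer_by_month(sales):
    """sales: iterable of (sale_date, hammer). Returns month number -> median."""
    by_month = defaultdict(list)
    for sale_date, hammer in sales:
        by_month[sale_date.month].append(hammer)
    return {month: median(by_month[month]) for month in sorted(by_month)}


# Invented toy sales pooled across two years.
toy_sales_by_date = [
    (date(2018, 1, 15), 600), (date(2019, 1, 20), 800),
    (date(2018, 7, 3), 400), (date(2019, 7, 10), 500), (date(2019, 7, 22), 450),
]
print(median_hammer_by_month(toy_sales_by_date))
```

Pooling years like this hides any long-term price trend, so a month that happens to hold mostly recent sales can look stronger than it is.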

Ken Dorney · Nov 12, 2019

frank008 said: ↑

Interesting point and I'm curious about what might be the underlying/driving factor. Do you maybe think this is because [ancient] coin collecting is an "indoor hobby" (aka winter sport)? Or maybe holiday/gift purchases? Or even tax refund splurges?

I really don't know why there is indeed a season for numismatics but I assume the factors are many and varied. I don't think gift purchases have much impact, though I do believe that vacations/holiday is a big factor. I did a quick calculation for three years past and charted it below. These are retail sales for me, no coin shows or auctions factored in, and this year has skewed things as I had an unusually busy summer. The numbers don't mean much to you (but do represent dollars) and are meant to be visual:

rrdenarius · Nov 12, 2019

I am not sure how many of your data points list number of bids, but coins that sell for over start price / estimate have two or more bidders. I suspect that number of bids would correlate well.
Did you compare start price vs estimate?
