Predicting hammer price: playing a bit with Sixbid data & looking for inspiration
<p>[QUOTE="Roerbakmix, post: 3839187, member: 100731"]Hi all, it has been a rather slow day at work, which allowed me to spend a bit of time on a project: trying to predict hammer prices from freely available data. This is probably impossible to do, but I tried it nonetheless and had some fun doing it.</p><p><br /></p><p><b>METHODS:</b></p><p><b>Data:</b></p><p>Freely available data on SixBid (<a href="https://www.sixbid-coin-archive.com/#/en/search?currency=eur" target="_blank" class="externalLink ProxyLink" data-proxy-href="https://www.sixbid-coin-archive.com/#/en/search?currency=eur" rel="nofollow">https://www.sixbid-coin-archive.com/#/en/search?currency=eur</a>), with currency automatically converted to EUR to make life a bit easier. For the case studies, two search strings were used: 1) "caesar denarius elephant" and 2) "augustus denarius comet ivlivs".</p><p>Data is presented on SixBid in a more or less structured manner. First, the entire webpage was copy-pasted into Google Sheets. Using various (not super state-of-the-art) methods, data on hammer price, estimated price, auction house, date, grade (0=missing, 1=good, 2=fine, 3=very fine, 4=extremely fine, 5=mint state; or synonyms), NGC certificate (y/n) and provenance information (y/n) were extracted.</p><p><br /></p><p><b>Statistical methods:</b></p><p>Linearity between the dependent variable (hammer price) and the independent variables (estimated price, grade, NGC certificate y/n and provenance) was assumed (there is room for improvement here because, for some variables, this was obviously not the case). Missing data was coded as 0 (missing data on grade should probably be imputed). Multivariable linear regression with backward selection was used (p-in 0.05, p-out 0.10). Predicted prices were compared against actual hammer prices.</p><p><br /></p><p><b>RESULTS</b></p><p><b>Case study 1: JC denarius elephant</b></p><p>Data</p><p>The SixBid search yielded 836 results, of which 55 were unsold. 
The median hammer price was €560, the maximum €9,828.00 and the minimum €1.00. A grade could be determined in 669 cases: good 5, fine 61, very fine 327, extremely fine 268, uncirculated 8. Provenance was mentioned in 93 cases (11.1%); NGC certificates in 152 (18.2%).</p><p><br /></p><p>Results</p><p>Backward selection of covariates resulted (after 3 steps) in estimated price and provenance as the best predictors (beta 1.148 and 131.0, respectively) with a constant of 119.3 (meaning that the predicted hammer price is 119.3 + 1.148*estimated price + 131.0*provenance, where provenance is 1 if mentioned and 0 otherwise).</p><p><br /></p><p>The relation between hammer price (y-axis) and predicted price (x-axis) is fairly linear (scatter plot left). The difference between hammer price and predicted price (negative meaning overestimation; positive meaning underestimation) was more or less normally distributed, with the model overestimating in 55.5% of the cases <u>(meaning that, if you bid the predicted price, you would win the coin in 55.5% of the cases)</u>. There was a steep linear relation between the hammer price and the predicted price (not shown).</p><p>[ATTACH=full]1019246[/ATTACH]</p><p><br /></p><p><b>Case study 2: Augustus Divus JC denarius</b></p><p>Data</p><p>The SixBid search yielded 87 results, of which 80 were sold (n unsold = 7). The median hammer price was €932.00, the maximum €7,337.00 and the minimum €120.00. A grade could be determined in 84 cases: fine 5, very fine 43, extremely fine 35, uncirculated 1. Provenance was mentioned in 24 cases (27.6%); NGC certificates in 8 (9.2%).</p><p><br /></p><p>Results</p><p>Backward selection of covariates resulted (after 4 steps) in the estimated price as the only remaining predictor (beta 0.814, SE 0.079, p &lt; 0.001) with a constant of 374.1 (meaning that the predicted hammer price is 374.1 + 0.814*estimated price).</p><p><br /></p><p>The relation between hammer price (y-axis) and predicted price (x-axis) is fairly linear (scatter plot left). 
The difference between hammer price and predicted price (negative meaning overestimation; positive meaning underestimation) was more or less normally distributed, with the model overestimating in 61% of the cases <u>(meaning that, if you bid the predicted price, you would win the coin in 61% of the cases)</u>. There was a steep linear relation between the hammer price and the predicted price (not shown).</p><p>[ATTACH=full]1019234[/ATTACH]</p><p><br /></p><p><br /></p><p><b>Conclusion</b></p><p>It is difficult to predict hammer prices from freely available data (although the methods used were sub-optimal and there is definitely room for improvement). This held for both the large (n = 836) and the small (n = 87) dataset. It should be noted that, given these methods, it is <b>not possible</b> to draw etiological conclusions (e.g. "grade is not related to hammer price").</p><p><br /></p><p><b>What next?</b></p><p>I will probably (depending on spare time) improve the modelling techniques, as these methods are somewhat rudimentary.</p><p><br /></p><p><b>I need a bit of guidance</b></p><p>If you think this is a fun project (I do), please help me with a bit of advice. First, regarding grades, I search for these terms, with the following hierarchy:</p><p>good 1</p><p>fine 2</p><p>VF 3</p><p>very fine 3</p><p>AEF 3</p><p>EF 4</p><p>XF 4</p><p>UNC 4</p><p>extremely fine 4</p><p>mint state 5</p><p>uncirculated 5</p><p><br /></p><p>>> Is this hierarchy correct?</p><p>>> Do you have alternative synonyms that may yield more grades (my data mining algorithm now finds ca. 80%)?</p><p><br /></p><p>For detecting provenance yes/no, I use the following terms:</p><p>Collection</p><p>Pedigree</p><p>Pedigreed</p><p><br /></p><p>>> Any other synonyms you can think of?</p><p><br /></p><p>I considered including 'auction house' as a predictor; however, this is only feasible with very large datasets, as the number of auction houses is quite large (e.g. 42 auction houses for the JC denarius). 
This will likely result in serious overfitting issues.</p><p><br /></p><p><b>What other predictors can you think of?</b></p><p><br /></p><p>Thanks for reading :)[/QUOTE]</p><p><br /></p>
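A minimal sketch of how the keyword lookup described in the post could be done in Python. The terms and scores are copied from the post (including UNC 4, whose rank the post itself asks about). The names `extract_grade` and `has_provenance`, the rule of taking the highest-scoring term when several match, and the sample descriptions in the usage note are assumptions for illustration, not the method actually used in the spreadsheet.

```python
import re

# Grade terms and scores as listed in the post; 0 is reserved for "missing".
GRADE_TERMS = {
    "good": 1,
    "fine": 2,
    "VF": 3,
    "very fine": 3,
    "AEF": 3,
    "EF": 4,
    "XF": 4,
    "UNC": 4,
    "extremely fine": 4,
    "mint state": 5,
    "uncirculated": 5,
}

# Provenance terms as listed in the post.
PROVENANCE_TERMS = ["collection", "pedigree", "pedigreed"]


def _word_pattern(term):
    # Case-insensitive whole-word match, so "EF" does not fire inside "AEF".
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)


GRADE_PATTERNS = [(_word_pattern(t), score) for t, score in GRADE_TERMS.items()]
PROVENANCE_PATTERNS = [_word_pattern(t) for t in PROVENANCE_TERMS]


def extract_grade(description):
    """Return the highest score among matching grade terms, or 0 if none match.

    Taking the maximum is an assumption: it makes "very fine" score 3
    even though the shorter term "fine" (2) also matches.
    """
    scores = [score for pattern, score in GRADE_PATTERNS if pattern.search(description)]
    return max(scores) if scores else 0


def has_provenance(description):
    """Return True if any provenance term appears in the description."""
    return any(pattern.search(description) for pattern in PROVENANCE_PATTERNS)
```

For example, `extract_grade("Toned, EF")` gives 4 and `extract_grade("Denarius, 3.85 g")` gives 0, matching the coding of missing grades as 0. Descriptions such as "about extremely fine" would still be scored as 4 here, so abbreviations and qualifiers remain the obvious place to extend the term list.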
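The two fitted equations reported in the post can be applied directly. The sketch below, in Python, uses the coefficients from the post; the function names, the 0/1 coding of provenance, and the example numbers in the usage note are assumptions for illustration only. `share_overestimated` follows the post's sign convention (hammer price minus predicted price, negative meaning overestimation).

```python
def predict_case1(estimate, provenance):
    """Case study 1 (JC denarius elephant): 119.3 + 1.148*estimate + 131.0*provenance.

    `provenance` is treated as 1 if mentioned and 0 otherwise (assumed coding).
    """
    return 119.3 + 1.148 * estimate + 131.0 * (1 if provenance else 0)


def predict_case2(estimate):
    """Case study 2 (Augustus Divus JC denarius): 374.1 + 0.814*estimate."""
    return 374.1 + 0.814 * estimate


def share_overestimated(pairs):
    """Fraction of (hammer, predicted) pairs where hammer - predicted < 0.

    Expects a non-empty list of pairs.
    """
    overestimated = sum(1 for hammer, predicted in pairs if hammer - predicted < 0)
    return overestimated / len(pairs)
```

With a hypothetical estimate of €500 and provenance mentioned, `predict_case1(500, True)` comes to about €824.30; with a hypothetical estimate of €1,000, `predict_case2(1000)` comes to about €1,188.10. Feeding the real (hammer, predicted) pairs from each dataset into `share_overestimated` should reproduce the 55.5% and 61% figures from the post.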