Tag Archives: Coding

Speedy data 2

My Speedy data post generated a few comments and some discussion. I really appreciate people taking the time to get involved and share their knowledge and views.

The first comments came via Twitter from TraderBot (here and here) with a link to Stack Overflow. This is a site I’ve found really useful for getting help with programming issues in many languages (that reminds me, I keep meaning to do a list of what I use – apps and sites). The Q&As linked to, although relating to Java, are an interesting read, and their answer to the speed question is, in summary, “it depends on what exactly you want to do/measure”.

LiamPauling commented on the post, asking where I’m hosted and whether I stream. I’m cloud hosted and not streaming. He continues that he thinks bottlenecks are more likely elsewhere, which, given the later comments, seems to be a good point.

Betfair Pro Trader asked why I wanted to use an array. It’s not that I want to use an array more than any other data structure; I was looking for the best solution, if such a thing exists (which is becoming less clear).

Tony, via Twitter, suggested running test code with the different structures. This could be useful, but I was initially put off by the confusion I was getting from reading differing opinions based on various implementations of arrays, collections and dictionaries (and later, lists). At this point I was thinking that the optimum structure is dependent on the specific use and there isn’t an exact answer to my speed question.

Next, a comment from Ken. He points to lists, as they’re something he uses regularly, and he talks of some of their benefits. Again, I’d previously come across articles saying lists were slow but maybe I was too quick to dismiss them. Betfair Pro Trader has also suggested using lists and dictionaries combined. Ken adds that he codes in C# (C sharp) but I think for the purpose of data structures and speed the two languages are similar (C# and VB.net compile to the same intermediate language and run against the same runtime libraries).

n00bmind added a detailed comment. He makes the point that the advantages of one structure over another don’t always hold, as mentioned above. He also agrees with previous comments that my speed question may be missing the main issues – those being the program/algorithm itself and network latency. Further advice is given about profiling (something I haven’t come across before as a specific process) and maybe using a different language, such as Python (I have only a basic understanding of Python from messing with it on my Raspberry Pi).

Finally, Jptrader commented, agreeing mostly with n00bmind, and others, about looking at “handling network latency properly and doing performance profiling”.
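The profiling advice from n00bmind and Jptrader can be tried out in a few lines. This is only a minimal sketch, written in Python rather than VB.net to keep it short; `fetch_prices` and `pick_selection` are made-up stand-ins for a bot’s network call and its decision logic, and the `time.sleep` just pretends to be network latency.

```python
import cProfile
import io
import pstats
import time


def fetch_prices():
    """Hypothetical stand-in for a network request; the sleep fakes latency."""
    time.sleep(0.05)
    return {runner: 2.0 + runner for runner in range(8)}


def pick_selection(prices):
    """Hypothetical stand-in for the bot's logic: pick the shortest price."""
    return min(prices, key=prices.get)


def one_refresh():
    """One cycle of the imaginary bot: get data, then decide."""
    return pick_selection(fetch_prices())


# Record where the time goes during a single refresh.
profiler = cProfile.Profile()
profiler.enable()
selection = one_refresh()
profiler.disable()

# Sort by cumulative time so the slowest call chains are listed first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In a report like this the faked network call should dwarf the selection logic, which is the commenters’ point: measure first, then decide whether the data structure is really where the time goes.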

Although a simple answer hasn’t been found (because there isn’t one), I’m guided by these comments to focus more on my code: handling serialization and latency, making the algorithm efficient and using the data structures that work for now, whether that’s arrays, collections, dictionaries, lists or a combination. Moving to another language just isn’t feasible for me at the moment; it’s taken me over a year, with limited hobby time, to get a running bot in VB. I am happy to accept that another language may have its advantages, so would advise others to look at this for optimising their bot’s performance (for me the advantage will be seen moving from VBA to VB.net).

The testing I’ve done hasn’t shown any particular advantage for any of the structures. From my searches on the web I think this could be due to the relatively small amount of data I’m handling (many articles talk of data lines in the 10s to 100s of thousands when comparing structures). An error on my part also had my bot making double calls for data, which added to my difficulties and questions initially.

I have plenty to be getting on with for now and will continue looking to improve my bots. Thanks again for all the comments.

A chart, I see

I’ll return to the speed question soon (I have an idea for a testing bot, it just needs writing). The results of the little testing I’ve done are not really good, as in they don’t move me forward. Development of the VB bot is still progressing though, or at least my programming ability is: the more I learn, the more I add to or change what I want to do, but it’s all good. I’ve been playing with what data to collect/monitor and how to handle it (to avoid unnecessary bloating of the bot). As I like to visualise things, I’ve been presenting data in different ways. Below is one of the charts I created, just with Excel, to show matched volumes. It covers the second favourite for the final two minutes before the off in a middle-of-the-road greyhound race. The data is at 1 second intervals with a 4% decay added to project but not obscure the matching. The price can be seen to rise as it approaches off time (front of image) with last-price-traded at 4.0. It isn’t much use on its own but I like how it looks. Adding another parameter, or two, gives more meaning.
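For anyone wondering what a “4% decay” might look like in code, here is one possible reading, sketched in Python. It is an assumption rather than the exact spreadsheet formula: each second the value carried over from the previous second is reduced by 4% and that second’s newly matched volume is added on top, so a burst of matching stays visible for a while and then fades. The function name and the sample figures are invented for illustration.

```python
def decayed_series(volumes, decay=0.04):
    """Carry each second's matched volume forward, shrinking it by `decay` per step.

    `volumes` is the newly matched amount in each 1-second interval.
    Assumed interpretation of the chart's decay, not taken from the original sheet.
    """
    shown = []
    level = 0.0
    for matched in volumes:
        level = level * (1.0 - decay) + matched
        shown.append(level)
    return shown


# Invented sample: one burst of matching followed by quiet seconds.
sample = decayed_series([100.0, 0.0, 0.0, 50.0])
print([round(value, 2) for value in sample])
```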

[Chart: matched volumes with decay (match_vol_dec_chart)]

Speedy data

Speed is an important part of my bot development. When it comes to storing data, the options (data structures) I’ve been working with are array, collection and dictionary. When I Google for articles on speed, I get lots of information pointing to dictionaries as being the fastest. But the main point of interest in the articles is the speed of looking up data. To find something in an array when you don’t know where it is, the elements have to be looped through until it’s found (or isn’t). The use of keys in collections and dictionaries makes lookup faster as items can be targeted without looping, as long as you know the key(s). There are other advantages that make dictionaries preferred to collections, however they seem less important when looking at speed. A disadvantage of arrays comes when changing their size. Collections and dictionaries can have elements added without problems. To do the same with an array, we have to change its size to accommodate more data, which takes more time as a new, re-sized array is created and the existing data is copied into it.

This leaves me thinking that the two main speed disadvantages of arrays are searching for data and resizing. Here’s the question I’m looking to answer – if I know the size I want the array to be and I know the location of all the data held in it (so I can refer to each element directly), is there a speed benefit to using collections or dictionaries? Any help appreciated.
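One way to get a feel for the question is to time it. The sketch below is in Python, not VB.net, so the absolute numbers say nothing about .NET arrays or dictionaries; it only shows the shape of such a test. A Python list stands in for a fixed-size array, a dict for a keyed dictionary, and `SIZE` is an assumed figure, not one from my bot.

```python
import timeit

SIZE = 1000  # assumed number of stored values, for illustration only

# A list stands in for a fixed-size array; a dict for a keyed dictionary.
prices_list = [1.01 + i * 0.01 for i in range(SIZE)]
prices_dict = {i: price for i, price in enumerate(prices_list)}


def sum_by_index():
    """Read every element directly by its known position."""
    total = 0.0
    for i in range(SIZE):
        total += prices_list[i]
    return total


def sum_by_key():
    """Read every element by its known key."""
    total = 0.0
    for i in range(SIZE):
        total += prices_dict[i]
    return total


def find_by_scan(target):
    """Linear search: what an array needs when the position is NOT known."""
    for i, price in enumerate(prices_list):
        if price == target:
            return i
    return -1


for name, fn in [("index", sum_by_index), ("key", sum_by_key)]:
    seconds = timeit.timeit(fn, number=200)
    print(f"{name:>5}: {seconds:.4f}s for 200 passes")
```

If both timings come out close, that would fit the suspicion that with known positions and small data the choice of structure matters little; the gap only opens up when `find_by_scan` style searching is needed.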

How’s it going?

Automated_trader commented –

How’s coding your own bot going? I got a copy of Programming for Betfair today so gonna try and code my own too. I’ve no coding experience other than a little VBA for my current bots that run on Gruss so hoping it won’t be too big of a leap into the unknown. Might even start a blog to document my progress too.

The coding is going really well with a bot now in testing. I doubt I’d have started without Programming for Betfair as I thought the amount to learn wasn’t worth it. The book helps by giving you everything you need to start auto trading, as in it gives you the code to request data, place bets and handle the responses from Betfair. It also gives you the basic tools to trade with such as profit take, fill/kill and stoploss. Once you’ve worked through the book, it’s just a case of adapting what you’ve learnt to get what you want.

My currently running and previous bots are written in VBA and running through Gruss. Before I started them I hadn’t written in VBA but picked it up from forums and searches. I hadn’t done any VB.net prior to this but understanding it isn’t that much of a leap. It’s the same style of object programming and the terms and layout are familiar after using VBA.

It’s true that any errors (after the corrections on the associated website are completed) are down to typos. A couple of times I thought it wasn’t me and something must have changed making the code not work. But it was me, with one mistake being a missing “A” in a URL string. That took hours to find and I’d looked at the offending line of code more than once. Another biggy was missing a whole line of code, again taking forever to find. Sometimes the error message isn’t that helpful, not to me anyway. This one was an “overload” error which, as far as I’ve since found out, means the compiler can’t find a version of a method that accepts the arguments you’re giving it. That’s what the missing line was, a variable declaration.

Some things to note. Firstly, Visual Studio (VS) uses IntelliSense which highlights errors as you go. To begin with, this is quite annoying as you’ll finish a module and there’s a list of errors, so you spend time looking for what you’ve done wrong – which is nothing. The errors disappear when you complete the next couple of pages in the book. I found it’s better to ignore most of them until you get to a point where the book tells you to try running the code. The ones to correct as you go are missing or expected characters, such as “(” expected.

Second – VS includes an autocomplete function. This will really mash your head at some points as you try to add a new declaration and it changes it to something else. If this happens, a quick tap on backspace should swap it back to what you typed.

Third – I find that the first run of the code after starting VS can sometimes fail. Just stop and run again. This problem goes once you publish your project.

If you enjoy coding you shouldn’t have any difficulty with it, you just need patience. Let me know how you get on.

Interesting test results

I’ve been recoding my horse racing form bot (Bot 3), trying to get it to work without errors. It scrapes a lot of data for each race – runner form, jockey form, track data, going, weather. Having got it to a point where it runs okay with me sat in front of it, I decided to run it live on the VPS but without placing bets, allowing performance and data collection checks. To keep things separate I’m running this on Betdaq with a slow refresh and, as I’ve not used it for a while, I’ve added a short section of code to place some pointless bets based on where the market odds are. These have nothing to do with the bot’s purpose but hopefully keep the exchange happy.

The results so far are good and the bot is gathering most of the data without error. Any missing data isn’t causing the bot to stop; however, I’d like it to make multiple attempts to gather it, so some adjustments are needed. The algorithm for choosing a selection to back or lay is operational but requires refining. I’m basically calculating my own ‘betting forecast’ and backing or laying a selection depending on how far away from my suggested odds it is trading.
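The back-or-lay idea can be written down in a few lines. This is a sketch in Python rather than the bot’s own code, and the details are assumptions: the function name, the 10% margin and the rule itself (back when the market price is well above my forecast, lay when it is well below) are illustrative only.

```python
def decide(forecast_odds, market_odds, margin=0.10):
    """Return 'back', 'lay' or None for one selection.

    forecast_odds: my own suggested decimal odds (the 'betting forecast').
    market_odds:   the decimal odds currently trading.
    margin:        assumed threshold; how far apart the two must be to act.
    """
    if forecast_odds <= 1.0 or market_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")

    # Relative distance of the market price from the forecast price.
    gap = (market_odds - forecast_odds) / forecast_odds

    if gap >= margin:
        return "back"  # market pays more than the forecast says it should
    if gap <= -margin:
        return "lay"   # market pays less than the forecast says it should
    return None        # too close to the forecast to bother


print(decide(3.0, 3.6), decide(3.0, 2.5), decide(3.0, 3.1))
```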

The interesting part comes from the short bet placement code I added. Of the 12 bets it placed over the five days of testing, only 1 of the selections won. Obviously I’d backed them all.

My theory for this code was – if I place bets in efficient markets (which I thought close-to-off horse racing markets were), I will likely lose at a rate around my commission (5%) plus half the spread. Therefore stakes of £0.10 are a small price for testing.

I know the small sample is hardly proof but with average odds of 3, none greater than 3.5 and results of 1 win from 12, I’m going to keep this little code running to see how it pans out and update the blog on results. It will be interesting to find out if I just happened to drop into the strategy during a 10-long losing streak.
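To put a rough number on how unlucky 1 win from 12 is, a quick binomial calculation helps. The Python sketch below assumes every bet was at odds of exactly 3 in a perfectly fair market (so each had a 1 in 3 chance of winning) and ignores commission and the spread; the real bets varied, so treat it as a ballpark only.

```python
from math import comb


def prob_at_most(wins, bets, win_prob):
    """Probability of `wins` or fewer winners in `bets` independent bets."""
    return sum(
        comb(bets, k) * win_prob**k * (1 - win_prob) ** (bets - k)
        for k in range(wins + 1)
    )


fair_win_prob = 1 / 3.0  # assumed: decimal odds of 3 in a fair market
chance = prob_at_most(1, 12, fair_win_prob)
print(f"Chance of 1 or fewer winners from 12: {chance:.3f}")
```

Under those assumptions the chance is a little over 5%, so unusual but not outlandish, which is why more bets are needed before reading anything into it.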

Oscar has returned

After a short holiday caused by this event, Oscar is now back at work, emotionless but committed.

Mike suggested using the Take-SP option to avoid missing trading out.

BPT commented he’d previously used Back at 1.01 and Lay at 1000 to catch the best available odds.

As a way of getting the bot going again, it was quite easy to replace the 2nd-next-best-price with 1.01 and 1000, so this is what I’ve done for now. This is only a temporary fix as the greening is still calculating one refresh behind.

As I improve the greening code, I will include the Take-SP as a last resort. I think this is better as, although the odds may be off, it is taken straight away without entering the market directly. With the code improved, it should be needed less often anyway.

Back off hols

Oscar needs a holiday

A large loss occurred today as a result of a bet placed in error. Although I’ve seen this before, the size of the loss made me investigate the problem.

Nearing the off time my bot stops trading and then there’s a break of 10 seconds to allow any bets placed to be matched. Then a greening algorithm runs 3 times, first to green at the available odds, then at 1 tick worse than available odds, finally at 2 ticks worse than available odds. The idea is to reduce any risk of leaving a bad position open.
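The three greening attempts described above can be sketched as a small price calculation. This is Python, not the VBA the bot actually runs, and it assumes the standard Betfair price increments (0.01 up to 2.0, 0.02 up to 3.0, and so on to 1000); the function names are invented. “Worse” is taken to mean lower odds when the hedge is a back and higher odds when it is a lay.

```python
# Betfair-style price ladder as (upper bound of band, tick size), in hundredths.
BANDS = [(200, 1), (300, 2), (400, 5), (600, 10), (1000, 20),
         (2000, 50), (3000, 100), (5000, 200), (10000, 500), (100000, 1000)]


def build_ladder():
    """List every valid price from 1.01 to 1000, stored as integer hundredths."""
    ladder, price = [], 101
    for upper, step in BANDS:
        while price < upper:
            ladder.append(price)
            price += step
    ladder.append(100000)  # the top price, 1000
    return ladder


LADDER = build_ladder()


def move_ticks(odds, ticks):
    """Move `ticks` steps along the ladder from `odds`, clamped at both ends.

    Raises ValueError if `odds` is not a valid ladder price.
    """
    index = LADDER.index(round(odds * 100))
    index = max(0, min(len(LADDER) - 1, index + ticks))
    return LADDER[index] / 100


def greening_prices(available_odds, side):
    """Prices for the three attempts: available, 1 tick worse, 2 ticks worse.

    `side` is the hedging bet to be placed, 'back' or 'lay'.
    """
    direction = -1 if side == "back" else 1
    return [move_ticks(available_odds, direction * n) for n in range(3)]


print(greening_prices(4.0, "lay"))
print(greening_prices(2.0, "back"))
```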

On today’s event the first attempt placed a lay in error. This left 2 attempts to hedge out. Unfortunately both failed. Looking over my code, the greening part is probably the oldest. A quick look on the Gruss forum shows the example code has been updated a few times, without me keeping up. The code I use is greening on old data. The most recent code greens on current data.

As the possibility of a similar error exists, I’m not prepared to carry on trading until I’ve fixed this code. Due to the amount of change required and testing I need to do, I’ve stopped Oscar and his American brother for now. Hopefully the coming Bank Holiday weekend will provide some opportunity to make the fix.

Hols bot