Speedy data 2

My Speedy data post generated a few comments and some discussion. I really appreciate people taking the time to get involved and share their knowledge and views.

The first comments came via Twitter from TraderBot (here and here) with a link to Stack Overflow. This is a site I’ve found to be really useful for getting help with many programming issues in multiple languages (that reminds me, I keep meaning to do a list of what I use – apps and sites). The Q&As linked to, although relating to Java, are an interesting read; the answer to the speed question is, in summary, “it depends on what exactly you want to do/measure”.

LiamPauling commented on the post asking where I’m hosted and whether I stream. I’m cloud hosted and not streaming. He went on to say that he thinks the bottlenecks are more likely elsewhere, which, given the later comments, seems to be a good point.

Betfair Pro Trader asked why I wanted to use an array. It’s not that I want to use an array more than any other data structure; I was looking for the best solution, if such a thing exists (which is becoming less clear).

Tony, via Twitter, suggested running test code with the different structures. This could be useful, but I was initially put off by the confusion I was getting from reading differing opinions based on various implementations of arrays, collections and dictionaries (and later, lists). At this point I was thinking that the optimum structure is dependent on the specific use and there isn’t an exact answer to my speed question.
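
As a rough illustration of the kind of test Tony suggests, below is a minimal VB.net sketch (the data size and the fill-then-read workload are my own assumptions, not anything from the comments) that times an array, a List and a Dictionary with a Stopwatch:

    Imports System
    Imports System.Collections.Generic
    Imports System.Diagnostics

    Module StructureTimingTest
        Sub Main()
            Const n As Integer = 100000   ' assumed size; my real bot data is far smaller

            ' Array: fill by index, then read it all back (the totals just force the reads)
            Dim sw As Stopwatch = Stopwatch.StartNew()
            Dim arr(n - 1) As Double
            For i As Integer = 0 To n - 1
                arr(i) = i * 1.01
            Next
            Dim arrTotal As Double = 0
            For i As Integer = 0 To n - 1
                arrTotal += arr(i)
            Next
            sw.Stop()
            Console.WriteLine("Array:      " & sw.Elapsed.TotalMilliseconds.ToString("F2") & " ms")

            ' List(Of Double): same fill-then-read workload
            sw.Restart()
            Dim lst As New List(Of Double)(n)
            For i As Integer = 0 To n - 1
                lst.Add(i * 1.01)
            Next
            Dim lstTotal As Double = 0
            For i As Integer = 0 To n - 1
                lstTotal += lst(i)
            Next
            sw.Stop()
            Console.WriteLine("List:       " & sw.Elapsed.TotalMilliseconds.ToString("F2") & " ms")

            ' Dictionary(Of Integer, Double): same workload, keyed by index
            sw.Restart()
            Dim dict As New Dictionary(Of Integer, Double)(n)
            For i As Integer = 0 To n - 1
                dict(i) = i * 1.01
            Next
            Dim dictTotal As Double = 0
            For i As Integer = 0 To n - 1
                dictTotal += dict(i)
            Next
            sw.Stop()
            Console.WriteLine("Dictionary: " & sw.Elapsed.TotalMilliseconds.ToString("F2") & " ms")
        End Sub
    End Module

Numbers from a test like this only show how the structures behave for this exact workload, which is really the point of the “it depends” answers above.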

Next, a comment from Ken. He points to Lists as they’re something he uses regularly, and he talks of some of their benefits. Again, I’d previously come across articles saying lists were slow, but maybe I was too quick to dismiss them. Betfair Pro Trader has also suggested using lists and dictionaries combined. Ken adds that he codes in C# (C sharp), but I think for the purposes of data structures and speed the languages are similar (C# and VB.net compile to the same intermediate language and run against the same runtime libraries).
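
To show the sort of lists-and-dictionaries combination Ken and Betfair Pro Trader mention, here is a minimal VB.net sketch; keying recent prices by a selection ID is purely my assumed example of how the two might fit together, not code from either of them:

    Imports System
    Imports System.Collections.Generic

    Module PriceStore
        ' The Dictionary gives fast lookup by key; the List per key keeps prices in arrival order.
        ' Keying by selection ID is only an assumed example of how a bot might use the combination.
        Private ReadOnly pricesBySelection As New Dictionary(Of Long, List(Of Double))()

        Sub AddPrice(selectionId As Long, price As Double)
            Dim prices As List(Of Double) = Nothing
            If Not pricesBySelection.TryGetValue(selectionId, prices) Then
                prices = New List(Of Double)()
                pricesBySelection(selectionId) = prices
            End If
            prices.Add(price)
        End Sub

        Function LatestPrice(selectionId As Long) As Double?
            Dim prices As List(Of Double) = Nothing
            If pricesBySelection.TryGetValue(selectionId, prices) AndAlso prices.Count > 0 Then
                Return prices(prices.Count - 1)
            End If
            Return Nothing
        End Function

        Sub Main()
            AddPrice(12345L, 2.5)
            AddPrice(12345L, 2.54)
            Console.WriteLine("Latest price: " & LatestPrice(12345L).ToString())
        End Sub
    End Module

The dictionary handles the fast lookup by key and the list keeps the values in order, which is roughly the benefit of combining the two.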

n00bmind added a detailed comment. He makes the point that the advantages of one structure over another are not always clear-cut, as mentioned above. He also goes on to agree with previous comments that my speed question may be missing the main issues – those being the program/algorithm itself and network latency. Further advice is given about profiling (something I haven’t come across before as a specific process) and maybe using a different language, such as Python (I have only a basic understanding of Python from messing with it on my Raspberry Pi).
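
Proper profiling would use a tool, but even a rough Stopwatch split between the data request and the processing gives a first idea of where the time actually goes. A minimal sketch follows; GetMarketData and ProcessPrices are made-up stand-ins for whatever the bot really calls:

    Imports System
    Imports System.Diagnostics

    Module RoughProfile
        Sub Main()
            Dim swNetwork As New Stopwatch()
            Dim swProcessing As New Stopwatch()

            ' Time 100 cycles, splitting the data request from the processing of it.
            For i As Integer = 1 To 100
                swNetwork.Start()
                Dim rawData As String = GetMarketData()   ' stand-in for the real API/network call
                swNetwork.Stop()

                swProcessing.Start()
                ProcessPrices(rawData)                    ' stand-in for the bot's own algorithm
                swProcessing.Stop()
            Next

            Console.WriteLine("Network/API time: " & swNetwork.Elapsed.TotalMilliseconds.ToString("F1") & " ms")
            Console.WriteLine("Processing time:  " & swProcessing.Elapsed.TotalMilliseconds.ToString("F1") & " ms")
        End Sub

        ' Hypothetical stand-ins so the sketch compiles on its own.
        Function GetMarketData() As String
            System.Threading.Thread.Sleep(50)   ' pretend network latency
            Return "{}"
        End Function

        Sub ProcessPrices(rawData As String)
            ' parsing and algorithm work would go here
        End Sub
    End Module

If most of the time sits in the network/API half, that backs up the point that the bottleneck is unlikely to be the choice of data structure.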

Finally, Jptrader commented, agreeing mostly with n00bmind, and others, about looking at “handling network latency properly and doing performance profiling”.

Although a simple answer hasn’t been found (because there isn’t one), I’m guided by these comments to focus more on my code: handling serialization and latency, making the algorithm efficient, and using the data structures that work for now, whether that’s arrays, collections, dictionaries, lists or a combination of them. Moving to another language just isn’t feasible for me at the moment; it’s taken me over a year to get a running bot in VB, with limited hobby time. I am happy to accept that another language may have its advantages, so would advise others to look at this when optimising their bots’ performance (for me the advantage will be seen moving from VBA to VB.net).

The testing I’ve done hasn’t shown any particular advantage for the different structures. From my searches on the web, I think this could be due to the relatively small amount of data I’m handling (many articles talk of data in the tens to hundreds of thousands of rows when comparing structures). An error on my part also had me making double calls for data with my bot, which added to my difficulties and questions initially.

I have plenty to be getting on with for now and will continue looking to improve my bots. Thanks again for all the comments.

One thought on “Speedy data 2”

  1. Very interesting articles about bot speeds. I have been looking into the same. I have looked at my algorithms and have improved them. They now return values within 3-4 ms. However, the main bottleneck is the price refreshes. Without streaming, prices currently refresh every 200ms, so my prices can be up to 180ms out of date. If I was able to implement streaming I could improve my robot’s speed by a huge amount (probably 100ms), dwarfing any gains that could be made from optimising my robots. So I would suggest that the bottleneck is in your price refreshing, and you could see a large improvement with your bots if you were able to stream prices.