Saturday, 6 April 2013

3D's of computing (Data, Devil, Detail)

This week has seen a flurry of activity under the banner of Big Data!

The finale was Friday evening, watching a recording of this week's Horizon programme on 'Big Data' with 3 of my advisors - it sounded like geek heaven to us. Anyway, the programme unfolded: blah, blah, big data, lots of 1s and 0s flashing over the screen to show you where the big data was coming from and going to. As it went on, though, I personally was having trouble keeping my face straight - to the annoyance of one of my advisors, who kept telling me to shut up. Having slept on it, and having been immersed for the past month or so in a real live project dealing directly with Very Big Data (maybe that will catch on - VBD ;), the things that were bothering me boil down to the following:

  1. There's a 'smoke and mirrors' feel about a lot of this big data talk. Certainly there is vast potential for mining data but, from what I've seen, 'ordinary' companies are miles away from being in a position to exploit it fully. Enter the big data repository suppliers, who will solve all your big data consolidation and mining problems for you. Off you go....
  2. Enter the mythical 'algorithm' - is having this central repository going to work? As in the Horizon programme, when you need to access the data all you do is create the algorithm to do what you need - simples! You have your data and you can access it from anywhere at any time (oh yes you can), so what are you going to do with it? (In my world you should have thought of that beforehand, but that's another story.) You have your bucket of data and want to fish out some 'benefit'. What do you do? You write an algorithm - and most of this algorithm is just searching, filtering and displaying; not much algorithm about that. However, there could be an analysis element in this algorithm too - sounds like you need to dust off the old Fortran compiler to me! What's the problem? The problem is spreadsheets: everyone wants to run their own personal 'algorithm' dealing with their own specific needs - and quite rightly too! They take an extract of the big data, do some work on it, write the report and off they go. Well, probably a bit more than that, but you get the idea! All this leads to fragmentation (again) of the data set, as it is difficult to re-upload your work back into the mother ship.
  3. What's needed, of course, is a managed way of allowing access to the big data and development of local 'algorithms' - sounds like app development to me! These apps can use and refresh the big data appropriately. Sorry, I seem to have entered the smoke and mirrors zone again. A great aspiration, but do 'ordinary' companies really have the quality of data to allow meaningful apps to be developed?
The thoughts continue, keep smiling......
