A free, open-source tool has been developed at the John Innes Centre, a Norwich Research Park partner, to help researchers cope with big data and data-management overload in their projects.

The tool, called ‘dtool’, was developed by Dr Tjelvar Olsson, computing lab manager, and Dr Matthew Hartley, head of informatics, at the John Innes Centre and has been two years in the making. Their vision was to give researchers confidence that the massive amounts of data produced during research projects are safe and secure, with data management processes running constantly and unobtrusively in the background of everyday activities. They also wanted researchers to be able to make full use of cloud technology by providing a seamless storage solution that makes data as easy to access as it would be if stored on their own computers. By succeeding in this, they are allowing researchers to analyse data as quickly as possible, something that can be very important in projects with recurring deadlines.

Dr Olsson said: ‘We want more people to use dtool to manage their data. We have designed it in a way that slots into their way of working, a lightweight solution used in a minimal kind of way that sits on top of what they are already doing.’ Dr Hartley adds that the impact of the tool is already being felt: ‘dtool has made storing data cheaper, giving peace of mind and speeding up research.’

How does it work?

dtool packages data and metadata together in ‘boxes’ with easily identifiable labels that record what the data is, where it comes from, when it was recorded and by whom, and which project it belongs to. It works with traditional file systems as well as cloud options, allowing data to be shared across platforms without having to reconfigure it.
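
For readers who want a feel for the idea, the ‘box’ concept can be sketched in a few lines of Python. This is only an illustration and is not dtool's own code or interface: the function names `pack_box` and `verify_box`, the file names `labels.json` and `manifest.json`, and the label values are all assumptions made up for this example. The sketch copies data files into a folder, stores the descriptive labels alongside them, and records a checksum for each file so the contents can be checked later.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def pack_box(files, box_dir, labels):
    """Copy files into box_dir/data and write labels and a manifest beside them.

    Hypothetical helper for illustration; not part of dtool.
    """
    box_dir = Path(box_dir)
    data_dir = box_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    manifest = {}
    for src in map(Path, files):
        content = src.read_bytes()
        (data_dir / src.name).write_bytes(content)
        # Record size and checksum so the file can be verified later.
        manifest[src.name] = {
            "size_in_bytes": len(content),
            "sha256": hashlib.sha256(content).hexdigest(),
        }

    # The labels travel with the data, inside the same box.
    (box_dir / "labels.json").write_text(json.dumps(labels, indent=2))
    (box_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return box_dir


def verify_box(box_dir):
    """Return True if every file in the box still matches its recorded checksum."""
    box_dir = Path(box_dir)
    manifest = json.loads((box_dir / "manifest.json").read_text())
    for name, entry in manifest.items():
        content = (box_dir / "data" / name).read_bytes()
        if hashlib.sha256(content).hexdigest() != entry["sha256"]:
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "reads.txt"
        sample.write_text("ACGT\n")
        # Placeholder label values, invented for this example.
        labels = {
            "description": "example sequencing reads",
            "source": "example instrument",
            "recorded_on": "2018-01-01",
            "recorded_by": "A. Researcher",
            "project": "example-project",
        }
        box = pack_box([sample], Path(tmp) / "box", labels)
        print("box intact:", verify_box(box))
```

Because the labels and checksums live inside the box itself, the whole folder can be copied to another file system or to cloud storage and still be understood and checked on arrival, which is the property the article describes.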

What makes it useful?

When a project produces a huge amount of data, it can be difficult to store it sensibly and to find particular, interesting data quickly. dtool stores this data more effectively, using cloud storage and compressing data into smaller chunks that take up less space. The labels attached to ‘boxes’ of datasets allow researchers to find data more quickly and to understand what the data means. dtool also makes it easy to share data across platforms, helping to create an open-access mindset among researchers, something that is becoming increasingly important.