Get All The Data (GATD)

  • Problem: Backend data management is necessary but a burden when the project focus is elsewhere.
  • Hypothesis: Almost all sensor data collection platforms share common building blocks, which developers repeatedly reimplement at their own expense. A flexible data management system can share the same functional blocks among diverse applications and allow straightforward user configuration.
  • Solution: Build a framework that provides flexible configurable blocks and default client interfaces.
  • Timeline: Finish by the end of the semester, give or take a few weeks.


  • Provide the framework that is common to most sensing applications.
  • Support an arbitrarily large number of sensors and an arbitrarily large load.
  • Allow for maximum flexibility in accepting data from sensors by letting users specify code for receivers and processors.
  • Create endpoints for data access that allow for rapid application development.
  • Provide graphing and basic analysis out of the box so users can get started quickly.

System Overview

The system will consist of several independent processes, which can be run in a distributed fashion, with multiple instances running on many computers. These processes will communicate through message queues and a shared database.
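As a rough illustration of stages communicating through a queue, the sketch below runs a receiver stage and a parser stage as two threads in one Python process. This is only a stand-in: the in-process `queue.Queue` takes the place of a real message queue, the `store` list takes the place of the database, and the names `receiver`, `parser`, and `run_pipeline` are hypothetical, not part of GATD.

```python
import queue
import threading


def receiver(out_q, packets):
    """Stand-in receiver: forwards each raw packet to the next stage."""
    for packet in packets:
        out_q.put(packet)
    out_q.put(None)  # sentinel telling the parser there is no more input


def parser(in_q, store):
    """Stand-in parser: turns each raw packet into a row and stores it."""
    while True:
        raw = in_q.get()
        if raw is None:
            break
        store.append({"raw": raw, "length": len(raw)})


def run_pipeline(packets):
    """Wire the two stages together and return the stored rows."""
    q = queue.Queue()  # assumption: replaces a networked message queue
    store = []         # assumption: replaces the database
    stages = [
        threading.Thread(target=receiver, args=(q, packets)),
        threading.Thread(target=parser, args=(q, store)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
    return store
```

Because the stages only share the queue, each one could in principle be moved to its own process or machine by swapping the queue for a networked one.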

From the User's Perspective

Walk-throughs of common use cases. Users of this system will have to specify a couple of key algorithms and data formats:


  • Receivers/Listeners - These are snippets of code supplied by the end-user that run within a sandboxed environment. They can both periodically poll data sources using TCP sockets and listen on arbitrary, system-determined ports for incoming traffic. The code will be written in Python and support all of the basic language features. Code will be resource limited, with a reporting/auditing mechanism in place to ensure users do not unfairly consume resources.
  • Parsers & Processors - These snippets of code are also supplied by the user and execute in a similar sandboxed environment. Parsers should have very low overhead and are essentially pure functions: they are presented with the framed input and produce database rows as output for insertion into the database. Processors are able to perform more complex actions, including reduce-like operations on multiple rows.
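To make the parser/processor distinction concrete, here is a minimal sketch of what user-supplied code might look like. Everything specific in it is an assumption made for illustration: the JSON frame layout, the row fields (`sensor_id`, `timestamp`, `value`), and the function names `parse_frame` and `mean_per_sensor` are hypothetical, not a defined GATD interface.

```python
import json
from collections import defaultdict


def parse_frame(frame):
    """Parser: a pure function from one framed input to database rows.

    Assumes (hypothetically) that each frame is a UTF-8 JSON object
    with "id", "ts", and "value" keys.
    """
    record = json.loads(frame.decode("utf-8"))
    return [{
        "sensor_id": record["id"],
        "timestamp": record["ts"],
        "value": float(record["value"]),
    }]


def mean_per_sensor(rows):
    """Processor: a reduce-like operation over multiple rows.

    Returns the mean value for each sensor_id.
    """
    totals = defaultdict(lambda: [0.0, 0])  # sensor_id -> [sum, count]
    for row in rows:
        acc = totals[row["sensor_id"]]
        acc[0] += row["value"]
        acc[1] += 1
    return {sensor: total / count for sensor, (total, count) in totals.items()}
```

The parser touches no state outside its argument, which keeps its overhead low and makes it easy to sandbox; the processor is the place for work that spans many rows.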
proj/gatd/start.txt · Last modified: 2014/07/08 02:00 by mclarkk