BoxLab Summary Script

Background

BoxLab is a platform for collecting sensor data on everyday activities in the homes of volunteer participants. We are making the data collected by BoxLab “kiosks” freely available for download by researchers. In addition to sensor values for different types of sensing, we provide audio/visual records of home activity and annotations of what a third-party observer believes is happening in the video recordings. More information about this project is available at http://boxlab.wikispaces.com

Exercise

To help users identify the portions of the datasets that are interesting and relevant to their research questions, we wish to provide summaries of the sensor types and annotations available in each dataset on our website, http://datasets.mit.edu

Step 1:
Read all the information on the datasets website. Download the BoxLab Visualizer and the sample dataset. View the data. If you have any comments on the process, in particular how to make it easier, send a note to Jason (nawyn@mit.edu).

Step 2:
Using a language of your choice, write a script that searches the BoxLab pre-defined directory structure and outputs an HTML table, or another easily interpretable HTML form, summarizing the available content of the dataset. The design of the final output is open to your interpretation. Refer to the following link for information about the directory structure: http://boxlab.wikispaces.com/BoxLab+Directory+Format

For purposes of this exercise, visit http://datasets.mit.edu, navigate to the data for BLP02B, and download everything that is available for 2009-12-10. Your summary should be based on this data. At minimum, the HTML output should consist of a page with the following information in a table, but adding additional information about missing files, extra files, etc. would be helpful. You may enhance the output in ways you think will help users rapidly scan for rich portions of the dataset.
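As one possible starting point for the HTML output, the sketch below renders a single day as a table with one row per data type and one column per hour. The layout, the function name `render_summary_table`, and the row label "Audio" in the test data are assumptions made for illustration; they are not prescribed by the exercise, and the design remains open to your interpretation.

```python
from html import escape


def render_summary_table(date, marks, hours=range(24)):
    """Render one day of marks as an HTML table.

    `marks` maps a row label (a data type; the labels are hypothetical)
    to a dict of {hour: mark}, where a mark is 'X', '?' or ''.
    Hours missing from the dict are rendered as empty cells.
    """
    hours = list(hours)
    # Header row: the date in the corner, then one column per hour.
    header = "".join(f"<th>{h:02d}</th>" for h in hours)
    rows = [f"<tr><th>{escape(date)}</th>{header}</tr>"]
    # One row per data type, one cell per hour.
    for label, by_hour in marks.items():
        cells = "".join(
            f"<td>{escape(by_hour.get(h, ''))}</td>" for h in hours
        )
        rows.append(f"<tr><th>{escape(label)}</th>{cells}</tr>")
    return '<table border="1">\n' + "\n".join(rows) + "\n</table>"
```

Calling the function once per date and concatenating the results is one simple way to cover several days on the same page.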
Your solution should work for multiple days of data as well as for single days.

Audio/Video Data
To determine if an hour of data exists for Audio/Video files, iterate through each hour folder and determine if a subdirectory by that name exists. If the subdirectory contains at least one file of type ‘.zip’ or ‘.wav’, mark the table with an ‘X’. If the subdirectory exists but contains only other types of files (e.g., ‘.txt’, ‘.csv’), mark it with a ‘?’. Leave the table cell empty if the subdirectory does not exist or contains no files.

Annotations
To determine if annotations exist for an hour, you will need to load and scan any files with the extension ‘.annotation.xml’ provided with your template. If one or more files contain a timestamp occurring during a given hour, mark the table with an ‘X’.

Sensor Data
To determine if sensor data exists for a given hour, you will need to load and scan any .csv files you find nested in SensorFolder. Check each file to see if there are any timestamps matching each of the hours in the table; if so, mark the cell corresponding to the sensor type and hour with an ‘X’.

Step 3:
Email Jason Nawyn (nawyn@mit.edu) your code and output files. If you wish to pursue further work on this data, we will provide suggestions about how you might extend the summary you created to visually indicate the quality and quantity of data (rather than its mere existence) available in the selected dataset.

Questions?
Contact Jason Nawyn (nawyn@mit.edu).
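The marking rules in the Audio/Video Data section can be sketched as a small function. This is a minimal illustration, not a reference solution: the function name `av_cell_mark` is hypothetical, only files directly inside the subdirectory are counted, and matching extensions case-insensitively is an assumption that the exercise does not state.

```python
from pathlib import Path

# Extensions that count as real audio/video data, per the rules above.
DATA_EXTENSIONS = {".zip", ".wav"}


def av_cell_mark(subdir: Path) -> str:
    """Return the table mark for one audio/video subdirectory.

    'X' : at least one .zip or .wav file is present
    '?' : the subdirectory has files, but none of those types
    ''  : the subdirectory is missing or contains no files
    """
    if not subdir.is_dir():
        return ""
    # Only direct children are inspected (an assumption, see lead-in).
    files = [p for p in subdir.iterdir() if p.is_file()]
    if not files:
        return ""
    if any(p.suffix.lower() in DATA_EXTENSIONS for p in files):
        return "X"
    return "?"
```

A caller would invoke this once per hour folder and per audio/video subdirectory name, then place the returned mark in the corresponding table cell.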